The LOFAR Central Processor
Kjeld van der Schaaf (schaaf@astron.nl) & Marco de Vos (devos@astron.nl)
ASTRON, P.O. Box 2, 7990 AA Dwingeloo, The Netherlands
This poster describes the basic functions and architecture of the LOFAR Central Processor.
At the frequencies and bandwidths where LOFAR operates, it is entirely feasible to design a correlator and postprocessing platform based on off-the-shelf components, with only a limited set of functions implemented in dedicated hardware. This gives the high flexibility in configurations that is required to accommodate the range of science applications, from wide-field imaging to pulsar detection. It also allows us to take optimal advantage of developments in processing platforms.
This poster highlights only some aspects of the Central Processor. For more information, please contact the authors. Part of this work has been published in:
De Vos, van der Schaaf & Bregman, “Cluster Computers and Grid Processing in the First Radio-Telescope of a New Generation,” IEEE Conference CCGrid2001, Brisbane, May 2001.
Position of the Central Processor in the LOFAR system



At the central processing site, signals from the remote stations are received, unpacked if necessary, and routed to pipelines of processing cluster nodes, with high-bandwidth connections along each pipeline. Within a pipeline, the available hardware is matched to the needs of the data flow processing steps. For example, the first processing steps in an imaging application are beamforming and correlation (within the dashed oval), which can be executed very efficiently on programmable logic hardware. The first nodes in the pipeline will therefore be equipped with such hardware resources. The last stages of the imaging processing chain use self-calibration algorithms (within the dotted oval). These algorithms are rather complex, which suggests using microprocessors with large instruction sets, branch prediction, etc. for these processing steps.
Central Processor Hardware Architecture

The figure above gives an overview of the central processing platform hardware architecture. The data flow processing steps are executed from left to right on nodes of the hybrid cluster computer. The routing sections reorganize the incoming data stream from “all channels for each station” (as it appears at the end of the data transport network) to “all stations for each channel” (as required for correlation). After the routing sections, the hardware is organised in pipelines, each offering a dedicated chain of hardware resources to the application.
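The reordering performed by the routing sections amounts to a transpose of the station and channel axes. The following is a minimal sketch of that idea only, not the actual routing implementation; the function name `corner_turn` and the string placeholders for sample blocks are assumptions made for illustration.

```python
def corner_turn(per_station):
    """Reorder blocks[station][channel] into blocks[channel][station].

    Hypothetical helper: it illustrates the "all channels for each station"
    to "all stations for each channel" reordering as a plain transpose.
    """
    if not per_station:
        return []
    n_channels = len(per_station[0])
    if any(len(row) != n_channels for row in per_station):
        raise ValueError("every station must deliver the same number of channels")
    # zip(*rows) yields one tuple per channel, holding that channel for all stations.
    return [list(column) for column in zip(*per_station)]


# Three made-up stations with two channels each; strings stand in for sample blocks.
per_station = [
    ["s0c0", "s0c1"],
    ["s1c0", "s1c1"],
    ["s2c0", "s2c1"],
]
per_channel = corner_turn(per_station)
for channel_index, blocks in enumerate(per_channel):
    print(channel_index, blocks)
```

After the transpose, each inner list holds everything a correlator needs for one channel, so channels can be handed to separate pipelines independently.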
Functional blocks in the imaging application data flow processing chain

In the figure above, the functional blocks for the imaging application type are shown. Following the data flow, we start with the Virtual Core (VC) beamforming from the VC station data. These beams are added to the station beams that are input to the correlation and phase/gain correction. The phase/gain correction parameters may include ionospheric corrections, corrections originating from the station-core self-calibration, or RFI mitigation. The integrated data are stored in the temporary data store, from which the self-calibration process may fetch data and return further calibrated (and integrated) data. After the self-calibration process, the calibrated UV data in the temporary data store are used to produce the final data products.
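The first two steps of this chain can be sketched in a few lines. This is an illustrative sketch under simplifying assumptions, not the processing code of the Central Processor: the function names, the equal beamforming weights, and the sample values are made up, and the phase/gain correction and self-calibration steps are omitted.

```python
def beamform(signals, weights):
    """Form one beam as a weighted sum of complex input signals.

    Each input costs one complex multiply-accumulate (MAC).
    """
    return sum(w * s for w, s in zip(weights, signals))


def correlate_and_integrate(samples_a, samples_b):
    """Multiply a by conj(b) per time sample and average over the samples."""
    if len(samples_a) != len(samples_b) or not samples_a:
        raise ValueError("need two non-empty sample series of equal length")
    acc = 0j
    for a, b in zip(samples_a, samples_b):
        acc += a * b.conjugate()  # one complex product plus one complex addition
    return acc / len(samples_a)


# Made-up data: two VC station signals combined with equal weights.
vc_beam = beamform([1 + 1j, 1 - 1j], [0.5, 0.5])

# Made-up time series for two beams; identical inputs give a real visibility.
beam_a = [1 + 0j, 0 + 1j]
beam_b = [1 + 0j, 0 + 1j]
visibility = correlate_and_integrate(beam_a, beam_b)
print(vc_beam, visibility)
```

The integrated product is what would be written to the temporary data store for later calibration.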
Multi-Beamforming Correlation on a Digital Signal Processing board

Example partitioning of data flows and processing tasks for the MBFC. Four processing lines are shown, each of which may be mapped onto 1-2 digital signal processors (e.g. an FPGA and a DSP). Both the input and output data flows of the board are of order 1 Gbps, to fit on the node’s backplane.
Processing requirements and technology
The figure above shows a pre-design for the Multi-Beamforming Correlator (MBFC). In this design, packets of data for one beam and 1/10th of the principal frequency band are sent to a digital signal processing board. On the board, 4 processing lines are implemented on 1-2 digital signal processors each. The exact mapping onto hardware depends on the state of technology a few years from now. The trend in both FPGA and DSP processing capacity indicates that an implementation with 2 processors per processing line (1 FPGA and 1 DSP) can be based on mainstream chips in 2004. Implementing all lines in a single DSP/XPP chip may also be achievable around 2004.
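The text does not state how the work on a board is divided over its 4 processing lines. As one possible illustration, the sketch below splits the channels of a board into contiguous ranges, one per line. The contiguous split, the function name `split_channels`, and the use of 100 channels per board (the channel count that appears in Table 1) are all assumptions.

```python
def split_channels(n_channels, n_lines=4):
    """Assign contiguous channel ranges to processing lines as evenly as possible.

    Hypothetical partitioning scheme, for illustration only.
    """
    if n_lines <= 0:
        raise ValueError("n_lines must be positive")
    base, extra = divmod(n_channels, n_lines)
    ranges, start = [], 0
    for line in range(n_lines):
        # The first `extra` lines take one additional channel each.
        size = base + (1 if line < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


lines = split_channels(100)
for line_index, channels in enumerate(lines):
    print(line_index, channels.start, channels.stop - 1)
```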
Table 1. Processing power estimates for the MBFC processing steps.

| Function | Processing description | Processing power [complex MACs/ms] | Processing power [FLOPS] |
|---|---|---|---|
| VCBF | Linear combinations of all 40 VCS signals; 40 complex MACs each | 20 VC beams × 2 pol × 100 ch × 40 = 160 × 10^3 | 160 × 10^3 × 1 kHz × 8 FLOP ≈ 1 GFLOPS |
| Correlator | Complex multiplication of all input beam-polarisation combinations | 16k products × 4 pol × 100 ch = 6.4 × 10^6 | 6.4 × 10^6 × 1 kHz × 7 FLOP ≈ 45 GFLOPS |
| Integrator | 1 complex addition per input product | 16k products × 4 pol × 100 ch = 6.4 × 10^6 | 6.4 × 10^6 × 1 kHz × 2 FLOP ≈ 12 GFLOPS |
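The arithmetic behind Table 1 can be reproduced directly from the factors it lists. The sketch below uses only numbers from the table; it assumes that "16k" means 16,000 (which matches the stated 6.4 × 10^6) and that the 1 kHz factor converts per-millisecond counts to per-second rates. The unrounded results are 1.28, 44.8 and 12.8 GFLOPS, which the table quotes as 1, 45 and 12 GFLOPS.

```python
RATE_HZ = 1_000  # the table's "1 kHz": operation counts are given per millisecond


def gflops(ops_per_ms, flop_per_op):
    """Convert an operation count per millisecond into GFLOPS."""
    return ops_per_ms * RATE_HZ * flop_per_op / 1e9


vcbf_macs = 20 * 2 * 100 * 40   # VC beams x polarisations x channels x MACs per beam
corr_macs = 16_000 * 4 * 100    # products x polarisations x channels ("16k" taken as 16,000)
integ_adds = 16_000 * 4 * 100   # one complex addition per correlator product

estimates = {
    "VCBF": gflops(vcbf_macs, 8),          # 8 FLOP per complex MAC
    "Correlator": gflops(corr_macs, 7),    # 7 FLOP per product, as stated in the table
    "Integrator": gflops(integ_adds, 2),   # 2 FLOP per complex addition
}

for name, value in estimates.items():
    print(f"{name}: {value:.2f} GFLOPS")
```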
Reconfiguring the Central Processor for different science applications

Schematic representation of reconfiguration for three application types. The first two lines give specifications of the available platform, where “DSP” stands for generic digital signal processing processors such as DSP, FPGA or XPP. The next three lines show the mapping of application processes onto these resources.