The LOFAR Central Processor

Kjeld van der Schaaf (schaaf@astron.nl) & Marco de Vos (devos@astron.nl)

ASTRON, P.O. Box 2, 7990 AA  Dwingeloo, The Netherlands

 

 

This poster describes the basic functions and architecture of the LOFAR Central Processor.

At the frequencies and bandwidths at which LOFAR operates, it is feasible to design a correlator and post-processing platform based on off-the-shelf components, with only a limited set of functions implemented in dedicated hardware. This provides the flexibility in configuration that is required to accommodate science applications ranging from wide-field imaging to pulsar detection. It also allows us to take full advantage of developments in processing platforms.

 

This poster highlights only some aspects of the Central Processor.

For more information please contact the authors. Part of this work has been published in

“Cluster Computers and Grid Processing in the First Radio-Telescope of a New Generation,”

De Vos, van der Schaaf & Bregman, IEEE Conference CCGrid2001, Brisbane, May 2001.

Position of the Central Processor in the LOFAR system

 

 

 

At the central processing site, signals from the remote stations are received, unpacked where necessary, and routed to pipelines of processing cluster nodes, with high-bandwidth connections along each pipeline. Within a pipeline, the hardware of each node is matched to the needs of the data flow processing step it executes. For example, the first processing steps in an imaging application involve beamforming and correlation (within the dashed oval), which can be executed very efficiently on programmable logic hardware. The first nodes in the pipeline will therefore be equipped with such hardware resources. In the last stages of the imaging processing chain, self-calibration algorithms are used (within the dotted oval). These algorithms are rather complex in nature, which suggests the use of microprocessors with large instruction sets, branch prediction, etc. for these processing steps.


 

Central Processor Hardware Architecture

 

 

The figure above gives an overview of the hardware architecture of the central processing platform. The data flow processing steps are executed from left to right on nodes of the hybrid cluster computer. The routing sections reorganise the incoming data stream from “all channels for each station” (as it appears at the end of the data transport network) to “all stations for each channel” (as is required for correlation). After the routing sections, the hardware is organised in pipelines, each offering a dedicated chain of hardware resources to the application.
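The reordering performed by the routing sections is, in essence, a transpose of the station and channel axes. The following is a minimal sketch in Python, assuming the data arrive as one list of channel samples per station; the function name `corner_turn` and the toy data are illustrative and are not taken from the LOFAR design.

```python
def corner_turn(per_station):
    """Reorder data[station][channel] into data[channel][station].

    Hypothetical helper: a plain in-memory transpose, standing in for
    the routing that the real system performs in network hardware.
    """
    if not per_station:
        return []
    n_channels = len(per_station[0])
    if any(len(row) != n_channels for row in per_station):
        raise ValueError("every station must deliver the same number of channels")
    # zip(*rows) groups the i-th element of every row, i.e. one channel
    # as seen by all stations.
    return [list(column) for column in zip(*per_station)]


# Toy input: 3 stations, 4 channels; each sample is labelled (station, channel).
per_station = [[(s, c) for c in range(4)] for s in range(3)]
per_channel = corner_turn(per_station)

# Channel 2 as seen by all three stations.
print(per_channel[2])  # [(0, 2), (1, 2), (2, 2)]
```

After the reordering, each entry of `per_channel` holds everything a correlator node needs for one channel, which is why the pipelines can start only after the routing sections.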


Functional blocks in the imaging application data flow processing chain

 

 

 

 

The figure above shows the functional blocks for the imaging application type. Following the data flow, we start with the Virtual Core (VC) beamforming from the VC station data. The resulting beams are added to the station beams that form the input to the correlation and phase/gain correction. The phase/gain correction parameters may include ionospheric corrections, corrections originating from the station-core self-calibration, or RFI mitigation. The integrated data are stored in the temporary data store, from which the self-calibration process may fetch data and to which it may return further calibrated (and integrated) data. After the self-calibration process, the calibrated UV data in the temporary data store are used to produce the final data products.
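The first three blocks of this chain (beamforming, correlation, integration) can be illustrated with a small Python sketch. It rests on several assumptions that the poster does not state: the beamformer is a plain weighted sum, the correlator multiplies each input by the complex conjugate of every other input, and the VC beam is simply appended to the list of station beams. All names, weights and sample values are hypothetical, and the phase/gain correction is left out.

```python
def beamform(signals, weights):
    """One beam sample: a weighted sum, one complex MAC per input signal."""
    return sum(w * x for w, x in zip(weights, signals))


def correlate(inputs):
    """All pairwise products a * conj(b), autocorrelations included."""
    return [[a * b.conjugate() for b in inputs] for a in inputs]


def integrate(accumulator, products):
    """Add one time step of products to the running sums."""
    return [[acc + p for acc, p in zip(acc_row, p_row)]
            for acc_row, p_row in zip(accumulator, products)]


# Toy data (assumed): two time steps, two VC station signals, one station beam.
vc_weights = [0.5, 0.5]
time_steps = [
    {"vc_signals": [1 + 0j, 0 + 1j], "station_beams": [2 + 0j]},
    {"vc_signals": [0 + 1j, 1 + 0j], "station_beams": [0 + 2j]},
]

accumulator = [[0j, 0j], [0j, 0j]]
for step in time_steps:
    vc_beam = beamform(step["vc_signals"], vc_weights)
    inputs = [vc_beam] + step["station_beams"]  # VC beam joins the station beams
    accumulator = integrate(accumulator, correlate(inputs))

print(accumulator)  # integrated visibilities for the 2 x 2 input pairs
```

In the real chain, the integrated products would then be written to the temporary data store for the self-calibration process to pick up.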

 

 

 

 

Multi-Beamforming Correlation on a Digital Signal Processing board

 

 

 

Example partitioning of data flows and processing tasks for the MBFC. Four processing lines are shown, each of which may be mapped onto one or two digital signal processors (e.g. an FPGA and a DSP). Both the input and the output data flows of the board are of the order of 1 Gbps, so that they fit on the node’s backplane.

 

Processing requirements and technology

 

 

The figure above shows a pre-design for the Multi-Beamforming Correlator (MBFC). In this design, packets of data for one beam and 1/10th of the principal frequency band are sent to a digital signal processing board. On the board, four processing lines are implemented on one or two digital signal processors each. The exact mapping onto hardware depends on the state of technology a few years from now. The trends in both FPGA and DSP processing capacity indicate that an implementation with two processors per processing line (one FPGA and one DSP) can be based on mainstream chips in 2004. An implementation of all lines in a single DSP/XPP chip may be achievable around 2004.

 

Table 1. Processing power estimates for the MBFC processing steps.

Function   | Processing description                                              | Processing power [complex MACs/ms]              | Processing power in FLOPS
-----------+---------------------------------------------------------------------+-------------------------------------------------+----------------------------------------------
VCBF       | Linear combinations of all 40 VCS signals; each 40 complex MACs     | 20 VC beams × 2 pol × 100 ch × 40 = 160 × 10^3  | 160 × 10^3 × 1 kHz × 8 FLOP ≈ 1 GFLOPS
Correlator | Complex multiplication of all input beam-polarisation combinations  | 16k products × 4 pol × 100 ch = 6.4 × 10^6      | 6.4 × 10^6 × 1 kHz × 7 FLOP ≈ 45 GFLOPS
Integrator | 1 complex addition per input product                                | 16k products × 4 pol × 100 ch = 6.4 × 10^6      | 6.4 × 10^6 × 1 kHz × 2 FLOP ≈ 12 GFLOPS
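The arithmetic behind Table 1 can be reproduced with a few lines of Python. This is only a sketch of the estimate: it assumes that “16k” means 16,000 (which is consistent with the 6.4 × 10^6 total in the table) and that the 1 kHz factor converts operations per millisecond into operations per second. The function and variable names are illustrative.

```python
UPDATE_RATE_HZ = 1_000  # 1 kHz: converts "per millisecond" into "per second"


def gflops(complex_ops_per_ms, flop_per_op):
    """Processing power in GFLOPS for a per-millisecond operation count."""
    return complex_ops_per_ms * UPDATE_RATE_HZ * flop_per_op / 1e9


# VC beamformer: 20 VC beams x 2 polarisations x 100 channels x 40 MACs each.
vcbf_macs_per_ms = 20 * 2 * 100 * 40          # 160 000

# Correlator and integrator: 16k products x 4 polarisations x 100 channels.
# Assumption: "16k" is taken as 16 000.
products_per_ms = 16_000 * 4 * 100            # 6 400 000

print(gflops(vcbf_macs_per_ms, 8))  # 1.28  (Table 1: 1 GFLOPS)
print(gflops(products_per_ms, 7))   # 44.8  (Table 1: 45 GFLOPS)
print(gflops(products_per_ms, 2))   # 12.8  (Table 1: 12 GFLOPS)
```

The unrounded values show that the entries in Table 1 are order-of-magnitude estimates; the correlator dominates the load by a wide margin.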


 

Reconfiguring the Central Processor for different science applications

 

 

 

Schematic representation of reconfiguration for three application types. The first two lines give the specifications of the available platform, where “DSP” stands for generic digital signal processing devices such as DSP, FPGA or XPP chips. The next three lines show the mapping of application processes onto these resources.