Software and Computing
Processing the vast quantities of data produced by the SKA will require two very high-performance central supercomputers, one in South Africa and one in Australia, capable of in excess of 100 petaflops (one hundred thousand million million floating point operations per second) of raw processing power. This is equivalent to the fastest supercomputer on Earth at the time of writing (Source: Top500; November 2018).
In total, these two supercomputers will archive 600 petabytes of data per year. To store this data on average 500 GB laptops, you would need more than a million of them (about 1.2 million) every year.
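The figures above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes decimal (SI) prefixes, so that one petabyte is 10^15 bytes and one gigabyte is 10^9 bytes, and the variable names are purely illustrative.

```python
# Assumption: SI (decimal) prefixes, not binary ones.
PETA = 10**15
GIGA = 10**9

peak_flops = 100 * PETA              # 100 petaflops of raw processing power
archive_bytes_per_year = 600 * PETA  # 600 petabytes archived per year
laptop_bytes = 500 * GIGA            # one "average" 500 GB laptop

laptops_per_year = archive_bytes_per_year // laptop_bytes

print(f"{peak_flops:.0e} floating point operations per second")  # 1e+17
print(f"{laptops_per_year:,} laptops per year")                  # 1,200,000
```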
But why does the SKA need such immense computing power?
Scientific image and signal processing for radio astronomy consists of several fundamental steps, all of which must be completed as quickly as possible across thousands of telescopes connected by thousands of miles of fibre optic cable. The computers must be able to make decisions on objects of interest and remove data that is of no scientific benefit, such as radio interference from mobile phones and similar devices, which can occur even at the remote locations that will host the SKA.
The key steps include:
- Removing data that has been corrupted by interference or faults in the system. The corruption can come from mobile phones or any other Earth-based radio signal, from errors in the signal transport, or from problems with the hardware.
- Calibrating each antenna’s signal to remove the effects of instrumental variation and variations in the line-of-sight propagation of the radio signal. With so many radio telescopes, this requires huge amounts of processing power to manage.
- Transforming the data onto a rectangular grid in what radio astronomers call the “u-v plane” – this is akin to interpolating a few randomly scattered altitude measurements onto a regular map grid to estimate the altitudes at all grid intersections.
- Applying a mathematical operation called a Fourier transform to convert the gridded data into a representation of the object’s image in the sky.
- Applying a further calculation called “deconvolution of the point-response function of the array” to remove the radio equivalent of the spikes around bright stars in an optical image.
These steps must be repeated for thousands of separate frequencies. Although the SKA radio telescopes work over low-, mid- and high-frequency ranges, each of these is a range, and thousands of individual frequencies must be analysed within it. The SKA’s computers are required to do all of this in real time.
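The key steps can be sketched for a single frequency on a toy scale. The Python below is an illustration only, not the SKA's actual pipeline. It assumes already-calibrated data, uses a simple median-based outlier test for flagging, nearest-cell gridding, and a basic Högbom CLEAN loop for the deconvolution. All function names and parameters (`flag_outliers`, `dirty_image`, `hogbom_clean`, `threshold`, `gain`) are hypothetical.

```python
import numpy as np

def flag_outliers(vis, threshold=5.0):
    """Return a boolean mask that is True where a sample looks usable."""
    amp = np.abs(vis)
    med = np.median(amp)
    # Median absolute deviation, scaled to approximate a standard deviation.
    sigma = max(1.4826 * np.median(np.abs(amp - med)), 1e-9)
    return np.abs(amp - med) <= threshold * sigma

def dirty_image(u, v, vis, n, du=1.0):
    """Grid samples onto an n x n u-v grid and Fourier-transform to an image."""
    grid = np.zeros((n, n), dtype=complex)
    iu = np.rint(u / du).astype(int)
    iv = np.rint(v / du).astype(int)
    c = n // 2
    # Each sample also fixes its conjugate at (-u, -v), so the image is real.
    np.add.at(grid, (c + iv, c + iu), vis)
    np.add.at(grid, (c - iv, c - iu), np.conj(vis))
    image = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(grid)))
    # Normalise so that a unit point source gives a peak of 1.
    return image.real * n * n / (2 * len(vis))

def hogbom_clean(dirty, psf, gain=0.2, n_iter=100):
    """Repeatedly subtract a scaled, shifted point response from the peak."""
    c = dirty.shape[0] // 2
    residual = dirty.copy()
    model = np.zeros_like(dirty)
    for _ in range(n_iter):
        y, x = np.unravel_index(np.argmax(np.abs(residual)), residual.shape)
        flux = gain * residual[y, x]
        model[y, x] += flux
        # FFT images are periodic, so a circular shift re-centres the response.
        residual -= flux * np.roll(psf, (y - c, x - c), axis=(0, 1))
    return model, residual

# Toy data: one source of flux 2.0, offset (dy, dx) pixels from the centre.
rng = np.random.default_rng(0)
n, n_samples = 64, 200
u = rng.integers(-20, 21, n_samples).astype(float)
v = rng.integers(-20, 21, n_samples).astype(float)
dy, dx = 5, -3
vis = 2.0 * np.exp(-2j * np.pi * (u * dx + v * dy) / n)
vis[7] = 1000.0  # simulated interference spike

keep = flag_outliers(vis)                         # step 1: flagging
u, v, vis = u[keep], v[keep], vis[keep]
psf = dirty_image(u, v, np.ones_like(vis), n)     # point response of the array
dirty = dirty_image(u, v, vis, n)                 # steps 3-4: grid and transform
model, residual = hogbom_clean(dirty, psf)        # step 5: deconvolution
peak = tuple(int(i) for i in np.unravel_index(np.argmax(model), model.shape))
print(peak, model.sum())
```

In this toy case the cleaned model should place the source at the expected pixel with close to its original flux. Real data would also need calibration, noise handling and far more careful gridding than the nearest-cell approach used here.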
A complex task
The steps are combined in iterative loops that typically involve refining estimates of array parameters – such as complex gains – while concurrently creating images that converge towards the transformed observed data. Buffer memory is required to store interim processing results while the processing loops are being executed. The science processing facility will also have large data storage sub-systems. The end results of the converged image processing form the basis of the final astronomical images that are distributed to astronomers and physicists around the world.
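One half of such a loop, refining the complex gains against a fixed sky model, can be sketched as follows. This is a minimal illustration using an alternating least-squares update in the style of the published StEFCal algorithm, chosen here only as an example; the source does not say which solver the SKA will use. The function name `solve_gains` and the toy setup (eight antennas, a single point source at the centre of the field) are assumptions.

```python
import numpy as np

def solve_gains(vis, model, n_iter=200):
    """Estimate gains g so that vis[p, q] ~ g[p] * model[p, q] * conj(g[q])."""
    n_ant = vis.shape[0]
    off_diag = ~np.eye(n_ant, dtype=bool)  # ignore each antenna paired with itself
    vis = np.where(off_diag, vis, 0)
    model = np.where(off_diag, model, 0)
    g = np.ones(n_ant, dtype=complex)      # initial estimate: perfect antennas
    for _ in range(n_iter):
        # With the other gains held fixed, each g[p] has a linear least-squares fit.
        z = model * np.conj(g)[np.newaxis, :]  # z[p, q] = model[p, q] * conj(g[q])
        g_new = np.sum(vis * np.conj(z), axis=1) / np.sum(np.abs(z) ** 2, axis=1)
        g = 0.5 * (g + g_new)              # averaging damps oscillation
    return g

# Toy data: a unit point source at the field centre, seen through unknown gains.
rng = np.random.default_rng(1)
n_ant = 8
true_g = (1 + 0.2 * rng.uniform(-1, 1, n_ant)) * np.exp(1j * rng.uniform(-0.5, 0.5, n_ant))
model = np.ones((n_ant, n_ant), dtype=complex)
vis = np.outer(true_g, np.conj(true_g)) * model

g = solve_gains(vis, model)
corrected = vis / np.outer(g, np.conj(g))  # apply the solved gains
mask = ~np.eye(n_ant, dtype=bool)
error = float(np.max(np.abs(corrected[mask] - model[mask])))
print(error)
```

The gains are only determined up to a common phase, so the check compares the corrected data with the model instead of comparing `g` with `true_g` directly. In a full loop, the corrected data would be imaged again, the sky model updated, and the gains re-solved until both converge.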
Multiple telescopes require multiple computers
The supercomputers that the SKA will be using will rely on millions of processors operating in parallel. To speed up some types of processing, the processors are likely to be assisted by specialised hardware. One of the software challenges for the SKA will be to adapt algorithms to operate on these new types of computing architectures.
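The idea of splitting the work into independent pieces can be shown on a very small scale. The sketch below assumes each frequency channel can be imaged independently, and uses a thread pool from Python's standard library as a stand-in for many processors. The function names and the choice of four workers are illustrative only.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def image_channel(grid):
    """Fourier-transform one channel's gridded data into an image."""
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(grid))).real

def image_all_channels(grids, max_workers=4):
    """Image every channel concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return np.stack(list(pool.map(image_channel, grids)))

# Toy data: 16 frequency channels, each a 32 x 32 grid of complex samples.
rng = np.random.default_rng(2)
grids = rng.normal(size=(16, 32, 32)) + 1j * rng.normal(size=(16, 32, 32))

cube = image_all_channels(grids)
serial = np.stack([image_channel(grid) for grid in grids])
print(cube.shape, np.allclose(cube, serial))
```

Because no channel depends on another, the same pattern scales in principle to many more workers, although moving the data to and from each processor then becomes a significant part of the problem.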
Several other types of software developments will be required. These include:
- Observation preparation and scheduling, which are independent of the data pipeline and data reduction processing. Data reduction is the conversion of raw data into useful scientific data: it is how the SKA telescopes will turn the vast amounts of raw data into useful science for astronomers.
- Real-time monitoring and control, albeit with relatively long time constants.
- Other time-critical online data reduction and archiving, in addition to that described above.
The project will also require middleware and tools for ongoing software development from now until the telescopes are operational. This software will continue to be refined even thereafter.