GPU Accelerated Image Quality Assessment-Based Software for Transient Detection

X. Li University of Oxford, Oxford, the United Kingdom; wes.armour@oerc.ox.ac.uk K. Adámek University of Oxford, Oxford, the United Kingdom; wes.armour@oerc.ox.ac.uk and W. Armour University of Oxford, Oxford, the United Kingdom; wes.armour@oerc.ox.ac.uk

Abstract

Fast imaging localises celestial transients using source finders in the image domain. The need for high computational throughput in this process is driven by next-generation telescopes such as the Square Kilometre Array (SKA), which, upon completion, will be the world’s largest aperture synthesis radio telescope and will collect data at unprecedented velocity and volume. Given the vast amounts of data the SKA will produce, current source finders based on source extraction may be inefficient in a wide-field search. In this paper, we focus on the software development of GPU-accelerated transient finders based on two Image Quality Assessment (IQA) methods: the Low-Information Similarity Index (LISI) and augmented LISI (augLISI). We accelerate the algorithms using GPUs, achieving kernel execution times of approximately 0.1 milliseconds for transient finding in $2048\times 2048$ images.

1 Introduction

Fast imaging localises celestial transients by using source detection algorithms in radio astronomical images. In wide-field observations, these transients are usually concentrated in a small section of the image, while the rest of the image remains largely stable with slight variations over time. Since it can be challenging for the human visual system to identify transients within these images, we develop transient finders based on two intensity-sensitive Image Quality Assessment (IQA) methods: the Low-Information Similarity Index (LISI; Li and Armour, 2022) and augmented LISI (augLISI; Li et al., 2024), as expressed by

\mathrm{LISI}\left({\mathbf{x},\mathbf{y}}\right)=D\frac{\sum\limits_{i=1}^{N}\frac{\left|{x_{i}+y_{i}}\right|}{\left|{x_{i}-y_{i}}\right|+C_{1}}}{\max\left({\sum\limits_{i=1}^{N}x_{i}},{\sum\limits_{i=1}^{N}y_{i}}\right)+C_{2}},

(1)

\mathrm{augLISI}\left({\mathbf{x},\mathbf{y}}\right)=1-\frac{\sum\limits_{i=1}^{N}\left|{x_{i}+y_{i}}\right|\left|{x_{i}-y_{i}}\right|}{\sum\limits_{i=1}^{N}x_{i}+\sum\limits_{i=1}^{N}y_{i}+C},

(2)

where $\mathbf{x}$ and $\mathbf{y}$ are input images, with $C_{1}\ll 1$, $C_{2}\ll 1$, and $C\ll 1$. Both IQA methods yield values from 0 to 1, with 1 indicating identical input images. LISI works best for images with only point sources and minimal noise, whereas augLISI is better suited for images containing extended sources or higher levels of noise.
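As an illustration, Equations (1) and (2) can be sketched in plain Python as below. This is a CPU reference sketch, not the CUDA implementation. The constant values `1e-6` and the function names are assumptions made for the example. The text does not define $D$, so the default `d = c1 / 2` is also an assumption: it is the value for which Equation (1) returns approximately 1 for identical inputs.

```python
def lisi(x, y, c1=1e-6, c2=1e-6, d=None):
    """Equation (1) for two equal-length sequences of non-negative pixels."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same number of pixels")
    if d is None:
        # Assumption: D is not defined in the text; c1 / 2 makes identical
        # inputs score approximately 1.
        d = c1 / 2.0
    numerator = sum(abs(xi + yi) / (abs(xi - yi) + c1) for xi, yi in zip(x, y))
    denominator = max(sum(x), sum(y)) + c2
    return d * numerator / denominator


def auglisi(x, y, c=1e-6):
    """Equation (2) for two equal-length sequences of non-negative pixels."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same number of pixels")
    numerator = sum(abs(xi + yi) * abs(xi - yi) for xi, yi in zip(x, y))
    denominator = sum(x) + sum(y) + c
    return 1.0 - numerator / denominator


# Two flattened toy "images": the third pixel switches off in y.
x = [0.0, 0.5, 1.0, 0.25]
y = [0.0, 0.5, 0.0, 0.25]
print(lisi(x, x), auglisi(x, x))  # identical inputs
print(lisi(x, y), auglisi(x, y))  # one pixel changed
```

For this toy pair, `auglisi(x, y)` evaluates to approximately 0.6, since the single changed pixel contributes 1 to the numerator and the pixel sums total 2.5.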

2 Methodology

To improve efficiency, we use CUDA to parallelise the algorithms on GPUs and implement GPU-accelerated transient finders based on LISI (https://github.com/egbdfX/gpuLISI) and augLISI (https://github.com/egbdfX/gpuAugLISI). A C/C++ interface ensures portability: it reads the input FITS images, passes the arrays to the CUDA functions, and outputs the corresponding IQA matrix. During pre-processing, negative pixel values in the inputs are set to zero. The input images are divided into tiles for transient localisation. The key CUDA kernels, split_lisi and split_auglisi, compute LISI and augLISI values for each tile in parallel, where image and tile sizes are $N\times N$ and $N_{T}\times N_{T}$ pixels, respectively.
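The pre-processing and tiling steps described above can be sketched on the CPU as follows. This is only an illustration in plain Python. The helper names `clip_negative` and `split_into_tiles` are hypothetical and do not correspond to functions in the repositories, and images are represented as nested lists rather than FITS arrays.

```python
def clip_negative(image):
    """Set negative pixel values to zero (the pre-processing step)."""
    return [[max(p, 0.0) for p in row] for row in image]


def split_into_tiles(image, tile_size):
    """Map (tile_row, tile_col) to the flattened pixels of that tile."""
    n = len(image)
    if any(len(row) != n for row in image):
        raise ValueError("image must be square (N x N)")
    if n % tile_size != 0:
        raise ValueError("tile size N_T must divide the image size N")
    tiles = {}
    for tr in range(n // tile_size):
        for tc in range(n // tile_size):
            tiles[(tr, tc)] = [
                image[r][c]
                for r in range(tr * tile_size, (tr + 1) * tile_size)
                for c in range(tc * tile_size, (tc + 1) * tile_size)
            ]
    return tiles


# Toy 4 x 4 image with some negative pixels, split into 2 x 2 tiles.
image = [[r * 4 + c - 2 for c in range(4)] for r in range(4)]
tiles = split_into_tiles(clip_negative(image), tile_size=2)
print(len(tiles))  # number of tiles, (N / N_T) squared
```

Each tile is independent of the others, which is what allows one tile to be assigned to one unit of parallel work.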

According to Equations (1) and (2), the pixel-to-pixel summation is the most computationally expensive operation. Each tile is processed by a thread block, with computations remaining independent at both the tile and pixel levels. We use a reduction algorithm for efficient aggregation, leveraging shared memory for intermediate results. Threads process batches of pixels to compute partial sums, which are further reduced to obtain final LISI or augLISI values.
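The reduction pattern described above can be emulated sequentially to make the data flow explicit. The sketch below is plain Python, not CUDA, and is not taken from the kernels themselves: the list `shared` stands in for shared memory, the loop over `tid` stands in for the threads of one block, and the strided batching and power-of-two thread count are assumptions typical of block-level reductions.

```python
def block_reduce_sum(values, num_threads=8):
    """Sum `values` as one thread block might: partial sums, then a tree."""
    if num_threads < 1 or num_threads & (num_threads - 1):
        raise ValueError("num_threads must be a power of two")
    # Step 1: each "thread" accumulates a strided batch of elements.
    shared = [0.0] * num_threads  # stands in for shared memory
    for tid in range(num_threads):
        for i in range(tid, len(values), num_threads):
            shared[tid] += values[i]
    # Step 2: tree reduction, halving the number of active threads each pass.
    stride = num_threads // 2
    while stride > 0:
        for tid in range(stride):
            shared[tid] += shared[tid + stride]
        stride //= 2
    return shared[0]


def auglisi_tile(x, y, c=1e-6, num_threads=8):
    """Equation (2) for one tile, built from two block-level reductions."""
    numerator = block_reduce_sum(
        [abs(a + b) * abs(a - b) for a, b in zip(x, y)], num_threads
    )
    denominator = block_reduce_sum(
        [a + b for a, b in zip(x, y)], num_threads
    ) + c
    return 1.0 - numerator / denominator


print(block_reduce_sum([float(v) for v in range(1, 101)]))
```

On a GPU the passes of the tree reduction would be separated by block-level synchronisation; in this sequential emulation each pass simply completes before the next begins.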

On an NVIDIA H100 GPU (https://resources.nvidia.com/en-us-tensor-core), the kernels achieve median execution times of 0.101 ms (LISI) and 0.091 ms (augLISI) for $N=2048$ and $N_{T}=64$ over 10 runs. Kernel execution times for different configurations are shown in Fig. 1.

Figure 1: Kernel execution time for GPU-accelerated transient finders based on *Left:* LISI and *Right:* augLISI.

3 Results

To evaluate our transient finders, we simulate Measurement Sets (MS) using source information from the GaLactic and Extragalactic All-sky MWA (GLEAM) catalogue (Hurley-Walker et al., 2017) with the SKA1-LOW telescope configuration, employing Oxford’s Square Kilometre Array Radio-telescope simulator (OSKAR; Dulwich, 2020). Images are restored using Hogbom CLEAN (Hogbom, 1974), as shown in Fig. 2. The first dataset (MS1) contains 33 sources, while the second (MS2) excludes 5 low-intensity sources from MS1 to mimic transient disappearance. Transients typically exhibit on-and-off emission behaviour, causing intensity variations between observations of the same section of the sky, while fixed sources remain unchanged. To identify and characterise these changes, the images are divided into tiles for position-based similarity analysis. Here, each image contains $2048\times 2048$ pixels and each tile contains $64\times 64$ pixels.

By applying our transient finders to the images of MS1 and MS2, we compute tile-wise similarities, yielding an IQA (LISI or augLISI) matrix. Each matrix element represents the similarity of one pair of corresponding tiles. These per-tile values can be aggregated into a single value equal to the LISI or augLISI of the entire images, as expressed by

\mathrm{LISI}\left({\mathbf{x},\mathbf{y}}\right)=\frac{\sum\limits_{t=1}^{n}\left\{\mathrm{lisi}_{t}\times\left[\mathrm{max}\left(\sum\limits_{k=1}^{N_{T}\times N_{T}}x_{k},\sum\limits_{k=1}^{N_{T}\times N_{T}}y_{k}\right)+C_{2}\right]\right\}}{\mathrm{max}\left(\sum\limits_{i=1}^{N\times N}x_{i},\sum\limits_{i=1}^{N\times N}y_{i}\right)+C_{2}},

(3)

\mathrm{augLISI}\left({\mathbf{x},\mathbf{y}}\right)=\frac{\sum\limits_{t=1}^{n}\left[\mathrm{auglisi}_{t}\times\left(\sum\limits_{k=1}^{N_{T}\times N_{T}}x_{k}+\sum\limits_{k=1}^{N_{T}\times N_{T}}y_{k}+C\right)\right]-\left(n-1\right)C}{\sum\limits_{i=1}^{N\times N}x_{i}+\sum\limits_{i=1}^{N\times N}y_{i}+C}.

(4)

Here, $k$ ($k=1,2,\ldots,N_{T}\times N_{T}$) denotes a pixel index within a tile, and $i$ ($i=1,2,\ldots,N\times N$) denotes a pixel index in the entire image. The image contains $n$ tiles, with:

n=\frac{N}{N_{T}}\times\frac{N}{N_{T}}.

In Equation (3), $\mathrm{LISI}$ indicates the LISI value for the entire image, and $\mathrm{lisi}_{t}$ indicates the LISI value for the $t$-th tile ($t=1,2,\ldots,n$). Similarly, in Equation (4), $\mathrm{augLISI}$ refers to the augLISI value for the entire image, and $\mathrm{auglisi}_{t}$ refers to the augLISI value for the $t$-th tile. Notably, the “entire image” may also refer to a larger tile formed by merging smaller ones.
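For the configuration used here, $N=2048$ and $N_{T}=64$ give $n=32\times 32=1024$ tiles. The aggregation in Equations (3) and (4) can be checked numerically with a short plain-Python sketch that compares the aggregated per-tile values against the whole-image values from Equations (1) and (2). The constant values, the choice `D = C1 / 2` (the text does not define $D$), and all function names are assumptions made for this illustration.

```python
C1 = C2 = C = 1e-6  # assumed small constants
D = C1 / 2.0        # assumption: D is not defined in the text


def lisi(x, y):
    """Equation (1) on flattened pixel lists."""
    num = sum(abs(a + b) / (abs(a - b) + C1) for a, b in zip(x, y))
    return D * num / (max(sum(x), sum(y)) + C2)


def auglisi(x, y):
    """Equation (2) on flattened pixel lists."""
    num = sum(abs(a + b) * abs(a - b) for a, b in zip(x, y))
    return 1.0 - num / (sum(x) + sum(y) + C)


def aggregate_lisi(tiles_x, tiles_y):
    """Equation (3): weight each tile value by its own denominator."""
    weighted = sum(
        lisi(tx, ty) * (max(sum(tx), sum(ty)) + C2)
        for tx, ty in zip(tiles_x, tiles_y)
    )
    total_x = sum(sum(t) for t in tiles_x)
    total_y = sum(sum(t) for t in tiles_y)
    return weighted / (max(total_x, total_y) + C2)


def aggregate_auglisi(tiles_x, tiles_y):
    """Equation (4): the (n - 1) * C term removes the repeated constant."""
    n = len(tiles_x)
    weighted = sum(
        auglisi(tx, ty) * (sum(tx) + sum(ty) + C)
        for tx, ty in zip(tiles_x, tiles_y)
    )
    total = sum(sum(t) for t in tiles_x) + sum(sum(t) for t in tiles_y)
    return (weighted - (n - 1) * C) / (total + C)


# Three toy tiles of two pixels each; a source vanishes in the second tile.
tiles_x = [[0.2, 0.4], [1.0, 0.0], [0.3, 0.3]]
tiles_y = [[0.2, 0.4], [0.0, 0.0], [0.3, 0.5]]
flat_x = [p for t in tiles_x for p in t]
flat_y = [p for t in tiles_y for p in t]
print(aggregate_lisi(tiles_x, tiles_y), lisi(flat_x, flat_y))
print(aggregate_auglisi(tiles_x, tiles_y), auglisi(flat_x, flat_y))
```

The two numbers on each printed line agree up to floating-point rounding, because multiplying each tile value by its denominator recovers the tile's numerator sum, and these sums add up to the whole-image numerator.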

The LISI and augLISI results comparing corresponding tiles of the MS1 and MS2 images are illustrated in Fig. 3, with colour bars indicating the IQA values.

Figure 3 highlights the strengths and limitations of each approach. LISI is highly sensitive to changes across all intensity levels, whether high or low, even when the variations are subtle. This sensitivity is reflected in the wide range of LISI values across the image, including low values in the corners caused by the CLEAN algorithm and extended structures that arise when the model image is convolved with the CLEAN beam. On the other hand, augLISI focuses on significant changes, with reduced sensitivity to minor variations, as shown by the narrower range of augLISI values. It effectively identifies source differences while disregarding noise variations, localising transients to tiles with fewer false positives.
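One simple way to turn an IQA matrix into candidate locations is to list the tiles whose similarity falls below a cut-off, lowest first. The sketch below illustrates this in plain Python. The thresholding step, the function name `flag_tiles`, and the example values are assumptions made for illustration and are not part of the method described in this paper.

```python
def flag_tiles(iqa_matrix, threshold):
    """Return (row, col, value) for tiles below `threshold`, lowest first."""
    flagged = [
        (r, c, value)
        for r, row in enumerate(iqa_matrix)
        for c, value in enumerate(row)
        if value < threshold
    ]
    return sorted(flagged, key=lambda item: item[2])


# Hypothetical 2 x 2 IQA matrix and a hypothetical cut-off of 0.99.
iqa_matrix = [[1.0, 0.98], [0.6, 0.999]]
for row, col, value in flag_tiles(iqa_matrix, threshold=0.99):
    print(row, col, value)
```

A suitable cut-off would depend on the IQA method and the noise level, since LISI spans a wider range of values than augLISI on the same images.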

4 Conclusion

GPU-accelerated transient finders based on LISI and augLISI are designed for transient detection in radio astronomical images. These transient finders exhibit both high sensitivity and efficiency in identifying and localising sources whose intensity varies between observations. While LISI achieves greater sensitivity than augLISI, it is also more prone to false positives. In contrast, augLISI delivers higher accuracy for transient localisation.

Acknowledgements

The authors are grateful to Fred Dulwich and Ben Mort for their valuable guidance on OSKAR.

References

Dulwich [2020] F. Dulwich. OSKAR 2.7.6. https://zenodo.org/records/3758491, 2020.
Hogbom [1974] J. Hogbom. Aperture synthesis with a non-regular distribution of interferometer baselines. Astron. Astrophys. Suppl., 15:417–426, 1974.
Hurley-Walker et al. [2017] N. Hurley-Walker, J. Callingham, P. Hancock, et al. GaLactic and Extragalactic All-sky Murchison Widefield Array (GLEAM) survey - I. A low-frequency extragalactic catalogue. MNRAS, 464(1):1146–1167, 2017.
Li and Armour [2022] X. Li and W. Armour. Intensity-sensitive similarity indexes for image quality assessment. In 2022 26th International Conference on Pattern Recognition (ICPR), pages 1975–1981, 2022.
Li et al. [2024] X. Li, K. Adámek, and W. Armour. Intensity-sensitive quality assessment of extended sources in astronomical images. ApJS, 274(2):37, 2024.