2 School of Software Engineering, Anyang Normal University, Anyang 455000, China
3 Shanghai Astronomical Observatory, Chinese Academy of Sciences, Shanghai 200030, China
4 International Center for Radio Astronomy Research (ICRAR), The University of Western Australia, Crawley, Perth, WA, Australia
5 Kunming University of Science and Technology, Kunming 650500, China
Received 2012 June 12; accepted 2012 July 27
Enhanced Remote Astronomical Archive System Based on the File-Level Unlimited Sliding-Window Technique
Abstract
Data archiving is one of the most critical issues for modern astronomical observations. With the development of a new generation of radio telescopes, the transfer and archiving of massive remote data have become urgent problems to be solved. Herein, we present a practical and robust file-level flow-control approach, called the Unlimited Sliding-Window (USW), modeled on the classic flow-control method of the TCP protocol. Based on the USW and the Next Generation Archive System (NGAS) developed for the Murchison Widefield Array telescope, we further implemented an enhanced archive system (ENGAS) using ZeroMQ middleware. The ENGAS substantially improves the transfer performance and ensures the integrity of transferred files. In the tests, the ENGAS is approximately three to twelve times faster than the NGAS and can fully utilize the bandwidth of network links. Thus, for archiving radio observation data, the ENGAS reduces the communication time, improves the bandwidth utilization, and solves the remote synchronous archiving of data from observatories such as the Mingantu Spectral Radioheliograph. It also provides a better reference for the future construction of the Square Kilometer Array (SKA) Science Regional Center.
keywords:
Remote Data Archive, NGAS, Sliding Window
1 Introduction
Modern astronomical observatories or stations are generally located in sparsely populated areas with specific environmental conditions, which necessitates the remote transfer of large amounts of observational data to dedicated data processing centers. Therefore, building a reliable system for transferring and archiving data is essential for modern astronomical telescope systems (Dewdney et al. 2013). For example, the Square Kilometer Array (SKA) telescope (Schilizzi 2004; Dewdney et al. 2009) will be built in radio-quiet zones in two host countries (the Karoo site in South Africa and the Boolardy site in Western Australia). During its first phase (SKA1), the SKA will generate 50–300 PB of archival data per year; during the second phase (SKA2), the volume of newly added archival data will increase by approximately 100 times (Chrysostomou et al. 2018). To tackle the data deluge expected from the SKA observatory and enable the community to exploit SKA data for high-impact science communication, the massive volumes of data generated by SKA will have to be transferred to SKA Regional Centres (SRCs) worldwide (Barbosa et al. 2020; An et al. 2019) via a 100 Gbit/s network link. The Five-hundred-meter Aperture Spherical radio Telescope (FAST) (Nan et al. 2011) also plans to transfer massive filterbank data in real time from the FAST observatory to two early science centers near Guiyang city, Guizhou, via a 100 Gbit/s fiber link (Li et al. 2018).
The Next Generation Archive System (NGAS) is a robust remote HTTP-based data archive system that has been widely deployed. It was initially developed by the European Southern Observatory (ESO) (Wicenec et al. 2001). The SKA precursor facility, the Murchison Widefield Array (MWA), uses the NGAS to synchronize the mirrored archive data from MWA to the data centers at the Massachusetts Institute of Technology (MIT), United States, and the Victoria University of Wellington (VUW), New Zealand (Wu et al. 2013). In addition, the NGAS is used by the Atacama Large Millimeter/submillimeter Array (ALMA) to synchronize massive amounts of archive data from ALMA to the data centers located in North America, Europe, and East Asia (Wicenec et al. 2010; Stoehr et al. 2014). The NGAS can deliver high file-transfer performance, and deploying multiple NGAS Subscribers and NGAS Providers is an easy and effective way to improve it further. The maximum throughput that the NGAS achieved when running 24 clients (Subscribers) against four servers (Providers) on a total of 28 machines was 1100 MB/s, which significantly exceeds the required rate (Wicenec et al. 2012).
In developing the data archiving system for the Mingantu Spectral Radioheliograph (MUSER) and investigating the SRCs’ data exchange solutions, we analyzed the source code of the NGAS and tested it experimentally. We found that the NGAS is an excellent candidate for future remote data archiving systems, but there is still potential for improvement and optimization.
The rest of the paper is organized as follows. In section 2, we analyze the existing deficiencies of the NGAS. Then, we introduce our enhanced high-performance NGAS (ENGAS) in section 3, test its performance in section 4, present and discuss our results in section 5, and finally conclude our paper in section 6.
2 NGAS Framework Analysis and Transfer Performance Evaluation
2.1 NGAS Framework Analysis
The NGAS was developed in pure Python and has a high degree of portability. Moreover, it is a highly modular application system. A schematic module diagram of the NGAS is shown in Figure 1. The NGAS contains eight modules: Fundamental library (ngamsLib), NGAS C-Client (ngamsCClient), NGAS Java-Client (ngamsJClient), NGAS Python-Client (ngamsPClient), Plug-In support (ngamsPlugIns), Server Module (ngamsServer), UDP-based data transfer module (ngamsUDT), and Utils module (ngamsUtils).

The NGAS Server, which is a multi-threaded HTTP server based on the HTTP POST, GET, and PUT methods, is the core of the NGAS. The NGAS server supports at least 20 application-layer commands, which are mainly used to archive, retrieve, query, automatically mirror, and synchronize data and to check data integrity. The NGAS server can be run in three modes: cache server, data-mover (or read-only) server, or regular server.
The Data Subscription Service of the NGAS enables the synchronization of a full or partial set of data files to remote Data Subscribers. According to data-archiving requirements, the NGAS can be deployed as a Data Provider, which sends data files, or as a Data Subscriber, which receives them.
2.2 Transfer Performance Evaluation
The performance of the NGAS is closely related to network bandwidth and storage system performance. However, network transmission and flow control have more significant impacts on the overall performance.
We tested the performance of the NGAS in the regular server mode according to the construction requirements of the SRC, focusing on its file-transfer performance. Figure 2 presents the experimental results of file-transfer performance with different numbers of threads. Owing to the HTTP protocol, the NGAS client (Subscriber) could only obtain an average rate of nearly 160 MB/s with one thread in a 10 Gb Ethernet environment. However, when we increased the number of threads to boost the transfer performance, the transfer performance decreased instead.


After a static code analysis of the NGAS, we deduced that the transfer performance declines for two main reasons, as described next.
1) Flow-control mechanism of NGAS
Figure 3 illustrates the delivery procedure on the Provider side. The delivery procedure mainly comprises the send–receive process and the re-transmitting sub-procedure. The send–receive process consists of three sub-procedures: sending a message, receiving a response, and processing the response. The re-transmitting sub-procedure re-transmits any message for which the corresponding response has not been received. The flow-control method used by the data subscription service of the NGAS tightly couples the three sub-procedures of the send–receive process, which must be executed in strict sequence.
This flow-control method in the data subscription service of the NGAS is similar to the stop-and-wait Automatic Repeat reQuest (ARQ) protocol (Bada 2017) with a sliding-window size of 1, as shown in Figure 4. Objectively, such a flow-control method is robust. However, it does not allow the Provider to send the next data file to the peer before receiving and processing the peer’s response to the current one. As a result, the NGAS Provider has to spend considerable time waiting for the response, which the NGAS Subscriber generates only after it has successfully received and archived the data file.
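The cost of this stop-and-wait behavior can be illustrated with a rough timing model. The sketch below is not taken from the NGAS code; the function names and the per-file send and wait times are hypothetical values chosen only to show why waiting for each response dominates when there are many files.

```python
def stop_and_wait_time(n_files, send_s, wait_s):
    """Total time if every file must be acknowledged before the next is sent."""
    return n_files * (send_s + wait_s)


def pipelined_time(n_files, send_s, wait_s):
    """Total time if files are sent back to back and only the final
    response is waited for (an idealized, loss-free case)."""
    return n_files * send_s + wait_s


if __name__ == "__main__":
    # Hypothetical figures: 1 ms to send a small file, 4 ms for the peer
    # to archive it and return the response.
    n, send_s, wait_s = 2_000_000, 0.001, 0.004
    print(f"stop-and-wait: {stop_and_wait_time(n, send_s, wait_s):.0f} s")
    print(f"pipelined:     {pipelined_time(n, send_s, wait_s):.0f} s")
```

In this model the waiting term is multiplied by the number of files under stop-and-wait, whereas it is paid only once when sends are pipelined.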
2) Limitation of the Python threading module
Another problem is that the file-delivery process of the Data Subscription Service of the NGAS is implemented using the Python threading module. In the CPython implementation, owing to the Global Interpreter Lock (GIL), only one thread can execute Python code at once (even though specific performance-oriented libraries might overcome this limitation; see https://docs.python.org/3.5/library/multiprocessing.html). Consequently, the file-delivery process of the Data Subscription Service of the NGAS forgoes much of the parallelism afforded by multi-processor machines.
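As a minimal sketch of the alternative, CPU-bound Python work can be distributed over worker processes, each with its own interpreter and its own GIL. The example below is illustrative only and is not the NGAS or ENGAS code: `slow_checksum` and `checksum_all` are hypothetical helpers, and the pure-Python byte loop merely stands in for any per-file work that holds the GIL.

```python
from concurrent.futures import ProcessPoolExecutor


def slow_checksum(payload: bytes) -> int:
    """Pure-Python additive checksum; the loop holds the GIL while it runs."""
    total = 0
    for byte in payload:
        total = (total + byte) % 65521
    return total


def checksum_all(payloads, workers=2):
    """Compute the checksums in separate processes instead of threads."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(slow_checksum, payloads))


if __name__ == "__main__":
    # Four in-memory "files" of 200,000 bytes each (made-up test data).
    files = [bytes([i]) * 200_000 for i in range(4)]
    print(checksum_all(files))
```

With threads, the four loops would take turns holding the GIL; with processes, they can run on different cores at the cost of copying each payload to a worker.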

3 Enhanced High-Performance NGAS
We studied the problem of the relatively low transfer performance of the current NGAS. We first introduced Python’s multiprocessing package (Singh et al. 2013) to replace the original Python threading module. We then proposed a file-level Unlimited Sliding-Window Technique (USW) based on the Pub/Sub model of ZeroMQ and defined the message format for high-performance USW communication. Finally, we redeveloped the NGAS transfer module and implemented an enhanced high-performance archive system (ENGAS).
3.1 File-level Unlimited Sliding-Window Technique
Referring to the concept of ARQ, which is widely used for error control in data-communication systems (Lin et al. 1984), we propose a file-level USW method for asynchronous data delivery built on the ZeroMQ communication middleware (refer to Figure 5).

3.1.1 Message Format Definitions in USW
Two types of messages are designed to satisfy the requirements of the asynchronous communication (Pub/Sub) mode of ZeroMQ. The Provider is mainly responsible for sending data files, and the Subscriber is mainly responsible for receiving data files and returning response messages (refer to Figure 5).
The P-message (refer to Figure 6) is the formatted message sent by the Provider. The P-message format consists of six parts: IP address of the Subscriber (SI), TCP Port of the Subscriber (SP), IP address of the Provider (PI), TCP Port of the Provider (PP), metadata information of the file (FM), and data of the file (FD). In addition, the FM contains file identification, file version, file size, format, checksum, checksum method, compression flag, and compression method.
The S-message (response message; refer to Figure 6) is generated by the Subscriber and consists of five parts: IP address of the Subscriber (SI), TCP Port of the Subscriber (SP), IP address of the Provider (PI), TCP Port of the Provider (PP), and metadata information related to the received file (FM).
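The two message layouts can be sketched as plain data structures. This is a hedged illustration rather than the ENGAS wire format: the class and field names, the CRC32 checksum, and the helper functions `build_p_message` and `build_s_message` are assumptions introduced here, and the text does not specify how the parts are serialized.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileMeta:
    """FM part: the metadata items listed in the text (names are illustrative)."""
    file_id: str
    file_version: int
    file_size: int
    file_format: str
    checksum: str
    checksum_method: str
    compressed: bool
    compression_method: str


@dataclass(frozen=True)
class PMessage:
    si: str        # IP address of the Subscriber
    sp: int        # TCP port of the Subscriber
    pi: str        # IP address of the Provider
    pp: int        # TCP port of the Provider
    fm: FileMeta   # metadata of the file
    fd: bytes      # data of the file


@dataclass(frozen=True)
class SMessage:
    si: str
    sp: int
    pi: str
    pp: int
    fm: FileMeta   # metadata of the file that was received


def crc32_hex(data: bytes) -> str:
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")


def build_p_message(si, sp, pi, pp, file_id, data) -> PMessage:
    """Assemble a P-message; version, format, and checksum method are placeholders."""
    fm = FileMeta(file_id, 1, len(data), "fits", crc32_hex(data), "crc32", False, "")
    return PMessage(si, sp, pi, pp, fm, data)


def build_s_message(p: PMessage) -> Optional[SMessage]:
    """Return a response only if size and checksum match the metadata."""
    if len(p.fd) != p.fm.file_size or crc32_hex(p.fd) != p.fm.checksum:
        return None
    return SMessage(p.si, p.sp, p.pi, p.pp, p.fm)
```

Because the S-message echoes the four address fields and the FM part, the Provider can match a response to the P-message it recorded without the file data being sent back.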

3.1.2 Principle of USW
The Provider complies with the following: 1. It records the basic information of the message (SI-SP-PI-PP) for each P-message sent by the Provider. 2. It sends P-messages continuously without waiting for the response message and without considering the storage capacity of the Subscriber. 3. It receives the response message sent by the Subscriber and confirms whether the file has been sent successfully according to the SI-SP-PI-PP information. 4. It re-sends the corresponding file after a timeout error according to the recorded information.
Likewise, the Subscriber complies with the following: 1. It receives the P-message sent by the Provider. 2. It sends an S-message once the file is confirmed to have been received successfully.
The USW cannot guarantee that the files sent by the Provider arrive at the Subscriber in the order in which they were sent. However, it can guarantee that the files sent by the Provider eventually arrive at the Subscriber.
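The Provider-side rules can be sketched as a small bookkeeping class. This is a simplified in-memory model, not the ENGAS implementation: the `UswProvider` class, its method names, the use of a file identifier as the record key, and the explicit `now` argument (used instead of a real clock) are all assumptions made for illustration, and the ZeroMQ transport is replaced by a plain callable.

```python
class UswProvider:
    """Tracks unacknowledged files and re-sends them after a timeout."""

    def __init__(self, send, timeout_s):
        self._send = send            # callable(file_id) that publishes one file
        self._timeout_s = timeout_s
        self._pending = {}           # file_id -> time of the most recent send

    def publish(self, file_id, now):
        # Rules 1 and 2: record the message, then send without waiting.
        self._send(file_id)
        self._pending[file_id] = now

    def on_response(self, file_id):
        # Rule 3: a response confirms delivery, so the record is dropped.
        self._pending.pop(file_id, None)

    def resend_expired(self, now):
        # Rule 4: re-send every file whose response is overdue.
        expired = [f for f, t in self._pending.items()
                   if now - t >= self._timeout_s]
        for file_id in expired:
            self.publish(file_id, now)
        return expired

    @property
    def outstanding(self):
        return len(self._pending)


if __name__ == "__main__":
    sent = []
    provider = UswProvider(sent.append, timeout_s=5.0)
    for name in ("a", "b", "c"):
        provider.publish(name, now=0.0)   # sent back to back
    provider.on_response("c")             # responses may arrive out of order
    provider.on_response("a")
    print(provider.resend_expired(now=5.0), sent)
```

The window is "unlimited" in the sense that `publish` never blocks on `outstanding`; only the timeout check revisits files that were not confirmed.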
3.2 System Implementation
Based on the USW and the NGAS, we developed two modules, namely data_provider and data_subscriber, to replace the original code in ngamsServer (refer to Figure 1).
The data_provider module implements sending messages, receiving response messages, processing response messages, re-transmitting messages, and the other functions of the Provider. It is deployed as a daemon process, which creates four main processes that listen on the specified ports (refer to Figure 7).
The data_subscriber module implements receiving messages, processing messages, sending response messages, and the other functions of the Subscriber. It is also deployed as a daemon process on the Subscriber machines, where it creates three main background processes (refer to Figure 8).
After the data_provider and data_subscriber modules are initialized, the data_subscriber module automatically connects to the pre-defined data_provider module. Two pairs of processes are then created based on ZeroMQ’s Pub/Sub pattern (Hintjens 2013): a message-sending process paired with a message-receiving process, and a response-sending process paired with a response-receiving process.


4 Performance Test
Two types of tests were designed to compare the performances of ENGAS and NGAS based on MySQL using the InnoDB engine with a default key buffer size of 8 MB, default key cache block size of 1 KB, and default key cache division limit of 300. The first test compared the performances of ENGAS and NGAS with different file sizes. The second test compared the performances of ENGAS and NGAS with different concurrencies.
The experimental platform consisted of two Inspur NF5280M5 servers. Each server has two Intel(R) Xeon(R) Silver 4114 CPUs @ 2.20 GHz (10 cores each), 256 GB DDR4-2666ER memory, two 10 Gb Intel Corporation Ethernet adapters, one 512 GB Samsung 860 SSD drive, and four 2 TB Seagate Enterprise hard drives. In addition, each server ran the CentOS 7.4.1708 operating system with MySQL database version 5.6.38 and Python version 3.8.3. During the experiment, one server was deployed as the Provider for sending files, and the other server was deployed as a Subscriber in charge of receiving data files.
Name | Number of files | Single file size (bytes) | Total size (GB) |
---|---|---|---|
Dataset1 | 2,000,000 | 66,240 | 123.38 |
Dataset2 | 200,000 | 662,400 | 123.38 |
Dataset3 | 20,000 | 6,624,000 | 123.38 |
Dataset4 | 2,000 | 66,240,000 | 123.38 |
Dataset5 | 200 | 662,400,000 | 123.38 |
Dataset6 | 20 | 6,624,000,000 | 123.38 |
Dataset7 | 333,344 | 397,427 | 123.38 |
The experimental data sets consist of simulated data generated from several MWA FITS files, which were downloaded with the Python-based application programming interface Manta-ray-client (https://github.com/MWATelescope/manta-ray-client); they are presented in Table 1. The total size of each data set was approximately 123.38 GB.
Note that: 1) all experiments were performed in memory; and 2) the average transmission/archiving rate is the total size of each data set divided by the elapsed time, which is defined as the time between the receipt of the first file and the receipt of the last file.
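As a quick consistency check on Table 1 and on the rate definition above, the following sketch recomputes the total size of each data set and shows the rate formula. Two points are assumptions rather than statements from the text: the 123.38 figure is reproduced when 1 GB is taken as 2^30 bytes, and the rate helper reports decimal megabytes per second. The elapsed time in the example is a made-up value, not a measurement.

```python
# (number of files, single file size in bytes), copied from Table 1
DATASETS = {
    "Dataset1": (2_000_000, 66_240),
    "Dataset2": (200_000, 662_400),
    "Dataset3": (20_000, 6_624_000),
    "Dataset4": (2_000, 66_240_000),
    "Dataset5": (200, 662_400_000),
    "Dataset6": (20, 6_624_000_000),
    "Dataset7": (333_344, 397_427),
}


def total_bytes(n_files, file_size):
    return n_files * file_size


def total_size_gb(n_files, file_size):
    """Total size with 1 GB taken as 2**30 bytes (assumption)."""
    return total_bytes(n_files, file_size) / 2**30


def average_rate_mb_s(size_bytes, elapsed_s):
    """Average rate = total size / elapsed time, in decimal MB/s (assumption)."""
    return size_bytes / 1e6 / elapsed_s


if __name__ == "__main__":
    for name, (n, size) in DATASETS.items():
        print(f"{name}: {total_size_gb(n, size):.2f} GB")
    # Hypothetical elapsed time of 120 s for Dataset6.
    print(f"{average_rate_mb_s(total_bytes(*DATASETS['Dataset6']), 120.0):.1f} MB/s")
```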
4.1 Performance Comparison under Different File Sizes
To test the impacts of different file sizes on the average transmission performances of the ENGAS and NGAS under the same testing conditions, seven experimental data sets (presented in Table 1) were used as experimental data. The concurrent numbers of ENGAS and NGAS were both set as one in the test.
The file sizes used in the performance test spanned five orders of magnitude; thus, to show the relationship between the file size and the average speed clearly in a relatively small figure, we plot the results by data-set name rather than by file size (as shown in Figure 9).

As is evident from Figure 9, the average transmission/archiving rate of the ENGAS is higher than that of the NGAS. As the file size increased from tens of KB to thousands of MB, the speed-up of the ENGAS over the NGAS fell from nearly 17.45 times to 2.68 times. A likely explanation is that increasing the file size from tens of KB to thousands of MB decreased the number of files from 2,000,000 to 20, which reduced the overall time that the NGAS spent waiting for responses. The proposed file-level USW method for asynchronous data delivery therefore improved the average archiving rate relative to the NGAS, most strongly for data sets made up of many small files. Furthermore, for the ENGAS, the file size of Dataset3 was more suitable for remote archiving than those of the other six data sets.
4.2 Performance Variation under Different Concurrencies
To verify whether the concurrent number affects the average transmission performance, we tested concurrent numbers ranging from 1 to 8 for both the ENGAS and the NGAS. Dataset6 was used as experimental data.
The results are presented in Figure 10. Clearly, the average ingestion rate achieved by the ENGAS was significantly higher than that of the NGAS. As the concurrent number increased, the speed-up of the ENGAS over the NGAS in average transmission/archiving rate rose from almost 3.05 times to approximately 12.33 times, and the ratio between the two rates eventually stabilized at approximately 12.33.
Furthermore, for the NGAS, as the number of concurrent threads increased, the average ingestion rate decreased rapidly from its maximum (216.57 MB/s) at a concurrent number of 1 to a steady value of approximately 84.40 MB/s. This may have occurred because, for a given I/O capacity, the multiple threads of the NGAS compete for the resources of a processor, which lowers the average ingestion rate.
Moreover, as the number of concurrent processes increased, the average archiving rate obtained by the ENGAS initially increased rapidly to its maximum (1083.56 MB/s) at a concurrent number of 3. It then decreased gradually and finally settled at approximately 1040.00 MB/s. Although the maximum and steady average archiving rates were still lower than the theoretical upper limit (1250 MB/s) at a bandwidth of 10 Gb/s, they were close to the average rate of 1079.00 MB/s measured with iperf3 (Udayakumar et al. 2018).
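The arithmetic behind the 1250 MB/s limit and the resulting link utilization can be written out as follows. This is a small illustrative calculation using the figures quoted above; the function names are hypothetical, and decimal units (1 Gb = 1000 Mb, 8 bits per byte) are assumed.

```python
def link_limit_mb_s(gbit_per_s):
    """Theoretical payload-free upper limit of a link in MB/s (decimal units)."""
    return gbit_per_s * 1000 / 8


def utilization(rate_mb_s, limit_mb_s):
    """Fraction of the limit that a measured rate achieves."""
    return rate_mb_s / limit_mb_s


if __name__ == "__main__":
    limit = link_limit_mb_s(10)                 # 10 Gb/s Ethernet
    for label, rate in (("ENGAS peak", 1083.56),
                        ("ENGAS steady", 1040.00),
                        ("iperf3", 1079.00)):
        print(f"{label}: {utilization(rate, limit):.1%} of {limit:.0f} MB/s")
```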

5 Discussion
Although the ARQ mechanism can guarantee that all the files sent by the Provider eventually arrive at the Subscriber, the order of file transfer is not guaranteed. Fortunately, for massive scientific data synchronization between different data centers, the sent data is not required to arrive at the receiving end in strict accordance with the sending order. Thus, the USW can be used for massive data synchronization in next-generation telescope systems.
At present, the flow-control mechanism used by the ENGAS is not sufficiently mature. The ENGAS can only achieve file-level re-transmission. Therefore, when the network-link quality is poor and frequent packet loss occurs, the efficiency of the ENGAS becomes very low.
6 Conclusions
Herein, we proposed a file-level Unlimited Sliding-Window technique (USW) to improve the flow-control performance. Experimental results indicate that the USW-based ENGAS delivered acceptable performance in terms of the data transmission rate; the average transmission rate achieved by the ENGAS was higher than that of the NGAS (approximately 3.05–12.33 times faster under different experimental conditions; refer to Figures 9 and 10). The ENGAS is open-source software that can be freely downloaded from https://github.com/astronomical-data-processing/ENGAS and deployed.
According to the experimental results, the carefully optimized code of the USW-based ENGAS delivered acceptable archiving/transmission performance. In this regard, the USW can be considered an effective technique for transmitting/synchronizing massive amounts of data generated by next-generation telescopes, and the ENGAS can be considered an effective data transmitting/synchronizing plugin for the NGAS.
Acknowledgements.
This work is supported by the National Key Research and Development Program of China (2020SKA0110300); the Joint Research Fund in Astronomy (U1831204, U1931141) under cooperative agreement between the National Natural Science Foundation of China (NSFC) and the Chinese Academy of Sciences (CAS); the National Natural Science Foundation of China (No. 11903009); the Funds for International Cooperation and Exchange of the National Natural Science Foundation of China (11961141001); the Yunnan Key Research and Development Program (2018IA054); the Key Science and Technology Program of Henan Province (No. 202102210152, No. 212102210611, and No. 202102210125); and the Research and Cultivation Fund Project of Anyang Normal University (AYNUKPY-2019-24, AYNUKPY-2020-25). This work is also supported by the Astronomical Big Data Joint Research Center, co-founded by the National Astronomical Observatories, Chinese Academy of Sciences, and Alibaba Cloud. The authors would like to thank Chen Wu from the International Centre for Radio Astronomy Research, University of Western Australia, for providing valuable suggestions to improve the experiments. The authors gratefully acknowledge the helpful comments and suggestions of the reviewers.
References
- An et al. (2019) An, T., Wu, X.-P., & Hong, X. 2019, Nature Astronomy, 3, 1030
- Bada (2017) Bada, A. B. 2017, The International Journal of Engineering and Science (IJES), 6, 64
- Barbosa et al. (2020) Barbosa, D., Antón, S., Barraca, J. P., et al. 2020, arXiv preprint arXiv:2005.01140
- Chrysostomou et al. (2018) Chrysostomou, A., Bolton, R., & Davis, G. R. 2018, in Observatory Operations: Strategies, Processes, and Systems VII, Vol. 10704, International Society for Optics and Photonics, 1070419
- Dewdney et al. (2009) Dewdney, P. E., Hall, P. J., Schilizzi, R. T., & Lazio, T. J. L. 2009, Proceedings of the IEEE, 97, 1482
- Dewdney et al. (2013) Dewdney, P., Turner, W., Millenaar, R., et al. 2013, Document number SKA-TEL-SKO-DD-001 Revision, 1
- Hintjens (2013) Hintjens, P. 2013, ZeroMQ: Messaging for Many Applications (O’Reilly Media, Inc.)
- Li et al. (2018) Li, D., Wang, P., Qian, L., et al. 2018, IEEE Microwave Magazine, 19, 112
- Lin et al. (1984) Lin, S., Costello, D. J., & Miller, M. J. 1984, IEEE Communications magazine, 22, 5
- Nan et al. (2011) Nan, R., Li, D., Jin, C., et al. 2011, International Journal of Modern Physics D, 20, 989
- Schilizzi (2004) Schilizzi, R. T. 2004, in Ground-based Telescopes, Vol. 5489, International Society for Optics and Photonics, 62
- Singh et al. (2013) Singh, N., Browne, L.-M., & Butler, R. 2013, Astronomy and Computing, 2, 1
- Stoehr et al. (2014) Stoehr, F., Lacy, M., Leon, S., et al. 2014, in Observatory Operations: Strategies, Processes, and Systems V, Vol. 9149, International Society for Optics and Photonics, 914902
- Udayakumar et al. (2018) Udayakumar, N., Khera, A., Suri, L., Gupta, C., & Subbulakshmi, T. 2018, in 2018 International Conference on Communication and Signal Processing (ICCSP), IEEE, 0791
- Wicenec et al. (2010) Wicenec, A., Chen, A., Checcucci, A., et al. 2010, in Astronomical Data Analysis Software and Systems XIX, Vol. 434, 457
- Wicenec et al. (2001) Wicenec, A., Knudstrup, J., & Johnston, S. 2001, The Messenger, 106, 11
- Wicenec et al. (2012) Wicenec, A., Pallot, D., Checcucci, A., et al. 2012, in Software and Cyberinfrastructure for Astronomy II, Vol. 8451, International Society for Optics and Photonics, 845118
- Wu et al. (2013) Wu, C., Wicenec, A., Pallot, D., & Checcucci, A. 2013, Experimental Astronomy, 36, 679