Federated Learning for COVID-19 Detection with Generative Adversarial Networks in Edge Cloud Computing

Dinh C. Nguyen, Ming Ding, Pubudu N. Pathirana, Aruna Seneviratne, and Albert Y. Zomaya

Dinh C. Nguyen is with the School of Engineering, Deakin University, Waurn Ponds, VIC 3216, Australia (e-mail: cdnguyen@deakin.edu.au). Ming Ding is with Data61, CSIRO, Australia (e-mail: ming.ding@data61.csiro.au). Pubudu N. Pathirana is with the School of Engineering, Deakin University, Waurn Ponds, VIC 3216, Australia (e-mail: pubudu.pathirana@deakin.edu.au). Aruna Seneviratne is with the School of Electrical Engineering and Telecommunications, University of New South Wales (UNSW), NSW, Australia (e-mail: a.seneviratne@unsw.edu.au). Albert Y. Zomaya is with the School of Computer Science, The University of Sydney, Australia (e-mail: albert.zomaya@sydney.edu.au).

Abstract

COVID-19 has spread rapidly across the globe and become a deadly pandemic. Recently, many artificial intelligence-based approaches have been used for COVID-19 detection, but they often require public data sharing with cloud datacentres and thus raise privacy concerns. This paper proposes a new federated learning scheme, called FedGAN, to generate realistic COVID-19 images for facilitating privacy-enhanced COVID-19 detection with generative adversarial networks (GANs) in edge cloud computing. Particularly, we first propose a GAN where a discriminator and a generator based on convolutional neural networks (CNNs) at each edge-based medical institution are trained alternately to mimic the real COVID-19 data distribution. Then, we propose a new federated learning (FL) solution which allows local GANs to collaborate and exchange learned parameters with a cloud server, aiming to enrich the global GAN model for generating realistic COVID-19 images without the need for sharing actual data. To enhance the privacy in federated COVID-19 data analytics, we integrate a differential privacy solution at each hospital institution. Moreover, we propose a new blockchain-based FedGAN framework for secure COVID-19 data analytics, by decentralizing the FL process with a new mining solution for low running latency. Simulation results demonstrate the superiority of our approach for COVID-19 detection over the state-of-the-art schemes.

Index Terms:

COVID-19, federated learning, generative adversarial network, edge cloud.

I Introduction

The COVID-19 pandemic has had a devastating effect on public health and the global economy. The severity of the outbreak was so great that the World Health Organization (WHO) declared it a pandemic within a month of its wide-scale expansion [1]. Recently, artificial intelligence (AI) techniques such as machine learning (ML) [2] and deep learning (DL) [3] have been employed to automatically diagnose and detect COVID-19 using X-ray images. For example, the projects in [4, 5, 6] develop DL-based techniques such as convolutional neural networks (CNNs) [7] to identify COVID-19 cases by extracting essential features from chest X-rays. To implement DL algorithms for COVID-19-related detection, collecting a large number of COVID-19-related X-ray images is a significant task, as it is a highly expensive and time-consuming process that requires the participation of many patients and experts. Besides, in the context of the COVID-19 pandemic, data collection becomes more challenging since medical staff risk infection during the collection process.

Recently, generative adversarial networks (GANs) [8] have received much attention for medical imaging applications, since they allow for generating high-quality synthetic images from original data based on the interaction of two components, a generator and a discriminator, via a min-max game. In this game, the generator is responsible for generating synthesized data points from random samples, while the discriminator attempts to distinguish between the real samples and those produced by the generator. The ultimate goal of training a GAN is to obtain a generator that captures the real data distribution well enough to generate real-looking synthetic data. The use of GANs thus mitigates the pressure of data collection and improves COVID-19 data training performance [9]. However, in practical scenarios, COVID-19 image datasets are distributed across multiple sites, where the data at each site is too limited in quantity and diversity to accurately train a GAN for the entire data population [10]. Moreover, due to growing user privacy concerns and strict institutional regulations, the data owners, e.g., hospitals, are not willing to share their data with datacentres, which hinders COVID-19 analytics. This dilemma makes it hard to aggregate all distributed data in a single server to implement centralized COVID-19 data training. Therefore, a collaborative solution for the integration of COVID-19 data across multiple institutions is highly needed to overcome these challenges and boost the COVID-19 detection performance.

Federated learning (FL), as an emerging distributed collaborative AI approach [11], [12], is particularly attractive for assisting intelligent health data analytics, including COVID-19 prediction tasks [13]. This learning paradigm is enabled by coordinating multiple data sources to perform collaborative AI training with an aggregator such as a cloud server. In this way, each institution can train its data model by exchanging its model gradients without the need for sharing actual data. This approach thus offers a means to effectively protect data privacy and reduce the cost of data transmission and storage. In the COVID-19 disease scenario, we observe that data for generative learning are distributed among various medical institutions, e.g., COVID-19 X-ray images are stored in hospitals. By integrating with GANs, FL can be naturally leveraged to build a federated generative model over the distributed institutions for high-quality and privacy-enhanced COVID-19 data training. This aims to address three critical challenges in COVID-19 analytics: dataset limitation, privacy protection, and constrained training performance.

There are still several remaining issues in current FL systems. For example, traditional FL frameworks mostly rely on a single server for the model aggregation, which can lead to single-point failures once the server is attacked. Moreover, since the hospitals need to use a central server for the data training, malicious attacks at the server can exploit the user information implicitly carried by the model updates or modify the model information without authorization, which makes the FL unreliable. In this context, blockchain has emerged as a promising solution to replace the central server to coordinate the FL process, by using its decentralized networking topology [14]. With blockchain, hospitals can participate in the FL data training in a decentralized manner without the need for a central server. This solution helps avoid the risks of single-point failures and significantly mitigates model aggregation attacks for secure and reliable FL data training.

I-A Related Works

Several studies using GANs and FL have been proposed for supporting COVID-19 detection tasks. Specifically, the authors in [15], [16] developed a GAN-based approach to detect COVID-19 X-ray images, along with using transfer learning for lung segmentation to facilitate classification. The study in [17] implemented an image synthesis approach to produce high-quality and realistic COVID-19 chest computed tomography (CT) images for use in DL-based semantic segmentation and classification. Moreover, the application of GANs for synthetic chest X-ray image augmentation was also investigated in [18] by using a conditional GAN, which helps speed up the COVID-19 detection with enhanced classification performance. The potential of GANs in solving data limitation in the COVID-19 pandemic was demonstrated in [19], where DL-based models, e.g., AlexNet, GoogLeNet, and ResNet18, are employed to evaluate the synthetic data quality and perform COVID-19-related predictions. Another work in [20] employed GANs for mobility estimation in the COVID-19 pandemic under complex social contexts and constrained training datasets with multiple data sources.

Furthermore, the applications of FL in COVID-19 diagnosis and detection have been investigated in recent works. For example, the study in [21] proposed a novel dynamic fusion-based FL approach for diagnostic image analytics to identify COVID-19 cases. The main focus of this work is on designing a client selection mechanism, which decides which clients join the FL training based on their local model performance, and on developing a model fusion solution to perform FL aggregation. The research in [22] proposed an FL scheme for federated COVID region segmentation using chest CT images through the collaboration of hospitals from China, Italy, and Japan. A multi-national database consisting of 1704 scans distributed among these countries is used to build a global COVID-19 detection model via a federated semi-supervised learning technique. Another work in [23] developed a federated DL framework for privacy-enhanced detection of lung abnormalities caused by COVID-19. Each institution runs a CNN model to detect lesions from COVID-19 CT images, and uploads the gradients to a data centre for building a generalizable, low-cost, and scalable AI model for COVID-19 disease diagnosis and management. Furthermore, the authors in [24] leveraged FL to build a COVID-19 infection screening scheme based on chest X-ray images. Several CNN-based models are employed in the FL setting, showing promising results on COVID-19 classification compared to standalone schemes without federation. The feasibility of FL was also evaluated via real-world experiments in [25] for COVID-19 X-ray image analytics and classification.

In terms of blockchain-based FL, the work in [26] suggested an FL model with blockchain for COVID-19 CT imaging through the cooperation of multiple hospitals. The focus of this work was on developing a data normalization-based FL technique to accurately train the collaborative deep CNN model using the datasets collected from different hospitals and CT scan machines. The work in [27] introduced a conceptual framework integrating blockchain, edge computing, and FL for controlling the COVID-19 pandemic. The potential of blockchain and FL was also investigated in [28] for health data analytics, where the benefits of the integration of these technologies were analyzed. However, no implementation or simulation results have been reported in these works [27], [28]. Other related works in [29], [30] proposed blockchain-based FL schemes for health Internet of Things (IoT) networks, but their roles in COVID-19 detection have not been investigated. The studies in [31], [32] also paid attention to blockchain-based FL designs, where the proposed solutions were mostly used to address attack issues in the data communication and model aggregation. Nonetheless, none of the existing works [26, 27, 28, 29, 31, 32] has addressed the latency issue caused by blockchain mining in blockchain-FL systems. Moreover, the integrated design of blockchain, FL, and GANs and the investigation of this integrated model in the COVID-19 context are still missing in the above literature.

TABLE I: The comparison of the existing works and our scheme.

Features	Schemes
Features	[21]	[22]	[26]	[27]	[29]	[31]	Ours
FL for COVID-19	✓	✓	✓	✓			✓
Integrated FL-GAN for COVID-19							✓
Differential privacy design					✓		✓
Decentralized FL training			✓	✓	✓	✓	✓
Low-latency blockchain design							✓
Integrated blockchain-FL for COVID-19			✓				✓

I-B Motivations and Our Key Contributions

The motivations of our work can be explained as follows. Firstly, despite these research efforts, most existing GAN algorithms [15, 16, 17, 18] for COVID-19 analytics are trained using limited and imbalanced datasets from a single institution and therefore cannot achieve the desired COVID-19 detection accuracy. Secondly, during a pandemic, due to increasing user privacy concerns and strict regulations, medical institutions such as hospitals are not willing to share their COVID-19 image data with a data centre for AI training, which calls for COVID-19 data training without data sharing. Thirdly, the convergence of FL and GANs is a very interesting research direction, which can achieve better COVID-19 image augmentation with privacy enhancement for better disease detection and diagnosis. However, its potential has not been explored for the COVID-19 detection domain in the open literature [21, 22, 23, 24]. Finally, a new blockchain solution for secure and low-latency federated data training is urgently needed to support efficient COVID-19 detection in the pandemic.

Motivated by these limitations, we here propose a novel scheme called FedGAN for privacy-ensured and efficient COVID-19 detection by enabling a joint design of GAN and FL across medical institutions in edge cloud computing. The key purpose of our proposed scheme is to generate high-quality synthetic image data for supporting COVID-19 detection tasks without the need for COVID-19 image data sharing. Our proposed solution not only solves the problems of data limitation and imbalance thanks to generative learning, but also enhances COVID-19 data privacy and improves the COVID-19 detection performance through the collaboration of multiple data sources from distributed institutions. Moreover, we propose a new blockchain-based FedGAN framework for secure COVID-19 data analytics, by decentralizing the FL process with a new mining solution for low running latency. The comparison of our paper and the related works via several key features is summarized in Table I. In a nutshell, the unique contributions of this article are highlighted as follows:

• We propose a novel FedGAN scheme for COVID-19 detection, by enabling a joint design of GAN and FL across the distributed medical institutions in edge cloud computing. This model is highly effective in generating realistic COVID-19 X-ray images and thus facilitating the automatic COVID-19 detection in the current pandemic scenario with COVID-19 data scarcity at each institution.

• We propose a collaborative data augmentation scheme where a discriminator and a generator of the GAN at each edge-based institution alternately train their models and upload their trained parameters to a cloud server without disclosing the actual image samples. The proposed solution thus enables GANs to federate the training for building the global GAN model which is used to generate synthetic COVID-19 X-ray images. To enhance the privacy in federated COVID-19 data analytics, we integrate a differential privacy solution at each hospital institution.

• We further propose a new blockchain-based FedGAN framework for secure COVID-19 data analytics, by decentralizing the FL process over the hospital institutions. Particularly, we propose a novel mining mechanism to mitigate the mining latency caused by the blockchain adoption in the FL system, an issue that has not been addressed in the literature. Further, we investigate the potential of blockchain-based FedGAN in COVID-19 detection scenarios with real-world datasets.

• We also design an efficient CNN-based classifier which can flexibly perform COVID-19 classification in three labelled classes (COVID-19 positive, normal, and pneumonia). Finally, we implement extensive simulations to evaluate the effectiveness of our designs, showing the significant improvement of the proposed scheme in COVID-19 detection with low running latency, compared to existing methods.

I-C Paper Structure

The remainder of this article is organized as follows. Section II describes the system model and explains the standalone GAN model as the basic solution for COVID-19 image augmentation. Next, in Section III, we present our FedGAN model that enhances the COVID-19 image augmentation with privacy awareness and then provide its theoretical analysis. In Section IV, we propose a new blockchain-based FedGAN framework for decentralized COVID-19 data analytics. Section V provides the simulations, evaluates the efficiency of our scheme, and compares it with other related schemes. Finally, Section VI concludes the paper.

II System Model

In this section, we describe the network model of our proposed scheme, and then analyze the baseline standalone approach in COVID-19 detection.

II-A Network Model

Figure 1: The proposed FedGAN architecture for COVID-19 detection.

We consider a FedGAN model for COVID-19 detection as illustrated in Fig. 1, including a set $\mathcal{N}$ of edge nodes (ENs) located at medical institutions (e.g., hospitals) and a cloud server. Note that ENs can be local computers or powerful IoT devices installed in hospitals for data training. Each EN (or institution) $n\in\mathcal{N}$ participates in the FL process using its own COVID-19 image dataset to build a global GAN with the help of a cloud server, aiming to generate high-quality synthetic COVID-19 X-ray images for improving the overall COVID-19 detection. We assume that each institution $n$ has its own dataset, denoted as $D_{n}$, which follows a distribution $p_{n}(x)$, where $x$ denotes a real data sample drawn from the real COVID-19 image dataset.

We design a GAN at each EN, including a generator and a discriminator, by using CNNs. Particularly, each institution trains a generator to learn a generative data distribution $p^{g}_{n}$ based on its dataset $D_{n}$, aiming to mimic the real data distribution $p^{d}_{n}$, i.e., to reach $p^{g}_{n}=p^{d}_{n}$ (at this optimal condition, the discriminator cannot distinguish the real samples from the synthetic samples generated by a well-functioning generator). To do so, given a random noise $z$ from a probability distribution $p^{z}_{n}(z)$, the generator learns to generate a fake COVID-19 image data point $G_{n}(z,\theta_{n}^{g})$, where $G_{n}$ represents the CNN with parameters $\theta_{n}^{g}$. Moreover, we design another CNN as a discriminator $D_{n}(x,\theta_{n}^{d})$ at each institution, which tries to distinguish the real COVID-19 image data point $x$ from the distribution $p_{n}(x)$ from the one produced by the generator. The discriminator outputs a value 1 if the input is $x$ or 0 if the input is $G_{n}(z,\theta_{n}^{g})$. Accordingly, the generator and the discriminator at each hospital interact to derive the parameters $\theta_{n}^{g},\theta_{n}^{d}$ so that the generator produces a synthetic data distribution $p^{g}_{n}$ similar to the real data distribution $p^{d}_{n}$ to fool the discriminator. Mathematically, the objective function of the GAN at each institution $n$ can be formulated via a min-max game with a value function $V_{n}(D,G)$ as:

\underset{G}{\min}\,\underset{D}{\max}\,V_{n}(D,G)=\mathds{E}_{x\sim p_{n}^{d}}\log D_{n}(x)+\mathds{E}_{z\sim p_{n}^{z}}\log\left(1-D_{n}(G_{n}(z))\right),

(1)

where $\mathds{E}$ is the expectation, $D_{n}(x)$ denotes the probability that $D_{n}$ classifies $x$ as a real data sample, and $D_{n}(G_{n}(z))$ represents the probability that $D_{n}$ classifies the data generated by $G_{n}$ as real. In (1), the first term implies that the discriminator $D_{n}$ controls how close the synthetic samples should be to the real samples, while the second term penalizes implausible points produced by the generator. Therefore, the discriminator $D_{n}$ aims to maximize the value function $V_{n}(D,G)$ in (1), while the generator tries to minimize this value.
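To make the value function in (1) concrete, the following minimal sketch estimates it by Monte Carlo sampling on a one-dimensional toy problem. The Gaussian "real" distribution, the affine generator, the logistic discriminator, and all parameter values are illustrative assumptions chosen for readability; they are not the CNN models or the image data considered in this paper.

```python
import math
import random


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def discriminator(x, w, b):
    """Toy logistic discriminator D(x) in (0, 1); w and b are hypothetical parameters."""
    return sigmoid(w * x + b)


def generator(z, scale, shift):
    """Toy affine generator G(z); scale and shift are hypothetical parameters."""
    return scale * z + shift


def estimate_value(real_samples, noise_samples, d_params, g_params):
    """Monte Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]."""
    real_term = sum(math.log(discriminator(x, *d_params)) for x in real_samples)
    fake_term = sum(
        math.log(1.0 - discriminator(generator(z, *g_params), *d_params))
        for z in noise_samples
    )
    return real_term / len(real_samples) + fake_term / len(noise_samples)


rng = random.Random(0)
real = [rng.gauss(2.0, 1.0) for _ in range(5000)]   # assumed real data: N(2, 1)
noise = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # noise z ~ N(0, 1)
g_params = (1.0, 0.0)                               # G(z) = z, so fake data ~ N(0, 1)

# A discriminator that outputs 0.5 everywhere cannot tell real from fake.
v_blind = estimate_value(real, noise, (0.0, 0.0), g_params)
# A discriminator that favours larger x separates the two distributions better.
v_informed = estimate_value(real, noise, (2.0, -2.0), g_params)
print(v_blind, v_informed)
```

In this toy setting the uninformative discriminator yields a value of $-\log 4$, while the informative one yields a larger value, which is what a discriminator maximizing (1) seeks.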

II-B Standalone GAN for COVID-19 Detection

In this subsection, we analyze the standalone GAN, a traditional approach used in [15, 16, 17] for COVID-19 detection. In this case, every institution $n$ only trains the GAN using its own dataset without the federation.
Proposition 1: For a given generator $G_{n}$ , the optimal discriminator is

D^{*}_{n}=\frac{p^{d}_{n}}{p^{d}_{n}+p^{g}_{n}}.

(2)

Proof. For a given generator $G_{n}$, we can derive the probability distribution function for the generator $p^{g}_{n}$. Based on [33], we can express the value function $V_{n}$ in (1) as below

V_{n}(D_{n},G_{n})=\int_{x}p^{d}_{n}(x)\log D_{n}(x)dx+\\ \int_{z}p^{z}_{n}(z)\log(1-D_{n}(G_{n}(z)))dz\\ =\int_{x}\left[p^{d}_{n}(x)\log D_{n}(x)+p^{g}_{n}(x)\log(1-D_{n}(x))\right]dx.

(3)

We know that, $\forall a,b\in(0,1]$, the function $y\mapsto a\log(y)+b\log(1-y)$ achieves its maximum value on $[0,1]$ at $y=\frac{a}{a+b}$. Applying this pointwise to the integrand of (3) with $a=p^{d}_{n}(x)$ and $b=p^{g}_{n}(x)$ leads to the result in (2). Next, we derive the optimum of the standalone GAN.
Theorem 1: The optimal value of a standalone GAN for COVID-19 detection is

V^{*}_{n}(D_{n},G_{n})=-\log 4+2*JSD(p^{d}_{n}\parallel p^{g}_{n}),

(4)

where $JSD(p^{d}_{n}\parallel p^{g}_{n})$ is the Jensen-Shannon divergence between distributions $p^{d}_{n}$ and $p^{g}_{n}$ [33].
Proof. From 2 and 3, we have

V^{*}_{n}(D,G)=\int_{x}\left[p^{d}_{n}(x)\log D^{*}_{n}(x)+p^{g}_{n}(x)\log(1-D^{*}_{n}(x))\right]dx\\ =\int_{x}\left[p^{d}_{n}(x)\log\left(\frac{p^{d}_{n}(x)}{p^{d}_{n}(x)+p^{g}_{n}(x)}\right)+p^{g}_{n}(x)\log\left(\frac{p^{g}_{n}(x)}{p^{d}_{n}(x)+p^{g}_{n}(x)}\right)\right]dx\\ =\int_{x}\left[p^{d}_{n}(x)\log\left(\frac{p^{d}_{n}(x)}{2\cdot\frac{p^{d}_{n}(x)+p^{g}_{n}(x)}{2}}\right)+p^{g}_{n}(x)\log\left(\frac{p^{g}_{n}(x)}{2\cdot\frac{p^{d}_{n}(x)+p^{g}_{n}(x)}{2}}\right)\right]dx\\ =\int_{x}p^{d}_{n}(x)\log\frac{1}{2}dx+\int_{x}p^{g}_{n}(x)\log\frac{1}{2}dx\\ +\int_{x}p^{d}_{n}(x)\log\left(\frac{p^{d}_{n}(x)}{\frac{p^{d}_{n}(x)+p^{g}_{n}(x)}{2}}\right)dx+\int_{x}p^{g}_{n}(x)\log\left(\frac{p^{g}_{n}(x)}{\frac{p^{d}_{n}(x)+p^{g}_{n}(x)}{2}}\right)dx\\ =-\log 4+KL\left(p^{d}_{n}\parallel\frac{p^{d}_{n}+p^{g}_{n}}{2}\right)+KL\left(p^{g}_{n}\parallel\frac{p^{d}_{n}+p^{g}_{n}}{2}\right),

(5)

where $KL$ is the Kullback-Leibler divergence. Based on [33], the Jensen-Shannon divergence $JSD(A\parallel B)$ between probability distributions $A(x)$ and $B(x)$ is a symmetrized and smoothed version of the Kullback-Leibler divergence $KL(A\parallel B)$, and it satisfies $KL(A\parallel C)+KL(B\parallel C)=2*JSD(A\parallel B)$, where $C=\frac{1}{2}(A+B)$. Accordingly, from (5) we derive $V^{*}_{n}(D,G)=-\log 4+2*JSD(p^{d}_{n}\parallel p^{g}_{n})$, completing the proof.
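Proposition 1 and Theorem 1 can be checked numerically on small discrete distributions. The sketch below replaces the integrals by finite sums; the two three-point distributions are hypothetical and serve only as a sanity check of the identities, not as a model of real image data.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence: average KL divergence to the mixture (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def value(p_d, p_g, d):
    """Discrete analogue of (3): sum_x [p_d(x) log D(x) + p_g(x) log(1 - D(x))]."""
    return sum(
        pd * math.log(dx) + pg * math.log(1.0 - dx)
        for pd, pg, dx in zip(p_d, p_g, d)
    )


p_d = [0.5, 0.3, 0.2]  # hypothetical real data distribution
p_g = [0.2, 0.3, 0.5]  # hypothetical generator distribution

# Optimal discriminator from (2), evaluated point by point.
d_star = [pd / (pd + pg) for pd, pg in zip(p_d, p_g)]
v_star = value(p_d, p_g, d_star)

# Any other discriminator should give a smaller value.
d_other = [dx + 0.05 for dx in d_star]
v_other = value(p_d, p_g, d_other)

print(v_star, -math.log(4) + 2 * jsd(p_d, p_g), v_other)
```

The first two printed numbers agree, as stated in (4), and the perturbed discriminator gives a smaller value, consistent with (2) being the maximizer. When the generator matches the data exactly, $D^{*}_{n}=\frac{1}{2}$ everywhere and the value reduces to $-\log 4$.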

Since $JSD(p^{d}_{n}||p^{g}_{n})$ is always non-negative [33], the global minimum of $V^{*}_{n}(D,G)$ in the standalone GAN-based COVID-19 detection is $-\log 4$, attained when $p^{g}_{n}=p^{d}_{n}$. In the following, we present the proposed FedGAN and theoretically show its advantages in training the GAN for efficient COVID-19 detection.

III Proposed FedGAN for COVID-19 Detection

III-A Theoretical Analysis of FedGAN

In FedGAN, the value function can be defined as a multi-agent game of discriminators and generators of all institutions $n\in\mathcal{N}$ . The generators cooperatively learn to generate fake COVID-19 X-ray images in order to fool all discriminators of institutions, whereas the discriminators attempt to differentiate the real data from fake images generated by generators. Then, the value function of the FedGAN can be defined as

V^{fed}(D,G)=\sum_{n=1}^{N}V_{n}(D_{n},G_{n})=\\ \sum_{n=1}^{N}\left[\mathds{E}_{x\sim p_{n}^{d}}\log D_{n}(x)+\mathds{E}_{z\sim p_{n}^{z}}\log\left(1-D_{n}(G_{n}(z))\right)\right].

(6)

Moreover, for any given generators $G_{n}$, the optimal discriminator $D_{n}^{*}$ for the FedGAN is the same as in the standalone GAN and is given by (2). Accordingly, the optimal value of the FedGAN for COVID-19 detection can be calculated as follows:

V^{fed*}(D,G)=\sum_{n=1}^{N}\int_{x}\left[p^{d}_{n}(x)\log D^{*}_{n}(x)+p^{g}_{n}(x)\log(1-D^{*}_{n}(x))\right]dx\\ =\sum_{n=1}^{N}\int_{x}\left[p^{d}_{n}(x)\log\left(\frac{p^{d}_{n}(x)}{p^{d}_{n}(x)+p^{g}_{n}(x)}\right)+p^{g}_{n}(x)\log\left(\frac{p^{g}_{n}(x)}{p^{d}_{n}(x)+p^{g}_{n}(x)}\right)\right]dx\\ =\sum_{n=1}^{N}\int_{x}\left[p^{d}_{n}(x)\log\left(\frac{p^{d}_{n}(x)}{2\cdot\frac{p^{d}_{n}(x)+p^{g}_{n}(x)}{2}}\right)+p^{g}_{n}(x)\log\left(\frac{p^{g}_{n}(x)}{2\cdot\frac{p^{d}_{n}(x)+p^{g}_{n}(x)}{2}}\right)\right]dx\\ =\sum_{n=1}^{N}\left[-\log 4+2*JSD(p^{d}_{n}||p^{g}_{n})\right]=-N\log 4+2*\sum_{n=1}^{N}JSD(p^{d}_{n}||p^{g}_{n}).

(7)

In the multi-agent game, the generator at each institution $n$ is trained to learn perfectly the distribution of the actual data $p^{d}_{n}$, and the minimum of the value function $V^{fed}(D,G)$ is achieved at $p^{d}_{n}=p^{g}_{n}$ for all $n\in\mathcal{N}$. Thus, the solution of (7) yields $\sum_{n=1}^{N}JSD(p^{d}_{n}||p^{g}_{n})=0$. As a result, the optimal value of the FedGAN is $V^{fed*}(D,G)=-N\log 4$.

Based on the above analysis, we can see that the federated approach can achieve a lower global minimum of the GAN value function ($-N\log 4$ versus $-\log 4$) than the standalone approach, due to the federation of multiple institutions, which enables learning the data distribution over the entire population. In other words, the proposed FedGAN approach can better learn the COVID-19 image distribution and produce better synthetic image data, which facilitates the detection tasks. We will provide extensive simulations to verify the advantage of the FedGAN.

III-B Training of FedGAN for COVID-19 Detection

We denote the global training iteration horizon as $T$ and index time by $t$. Each institution $n\in\mathcal{N}$ joins the FedGAN training with the cloud server by updating the parameters of the discriminator and the generator, $\theta^{d}_{n,t}$ and $\theta^{g}_{n,t}$, in each global round and exchanging them with the cloud server for aggregation. We assume that the COVID-19 image data in this work is independent and identically distributed (iid) across the institutions, while the non-iid data case will be considered in future works. In every global round $t$, each institution $n$ trains its discriminator $D_{n}$ and generator $G_{n}$. Specifically, the generator $G_{n}$ produces $k$ minibatches of fake samples from the noise probability distribution $p^{z}_{n}(z)$ as $\{z^{(1)},z^{(2)},...,z^{(k)}\}$. Also, the discriminator $D_{n}$ samples $k$ minibatches of real data from the actual image distribution $p^{d}_{n}(x)$ as $\{x^{(1)},x^{(2)},...,x^{(k)}\}$. Then, each institution $n$ updates the discriminator $D_{n}$ by ascending its stochastic gradient:

\nabla_{\theta^{d}}\frac{1}{k}\sum_{j=1}^{k}\left[\log D\left(x^{(j)}\right)+\log\left(1-D\left(G\left(z^{(j)}\right)\right)\right)\right],

(8)

and updates the generator $G_{n}$ by descending its stochastic gradient:

\nabla_{\theta^{g}}\frac{1}{k}\sum_{j=1}^{k}\log\left(1-D\left(G\left(z^{(j)}\right)\right)\right),

(9)

to update its own weights $\theta_{n}^{d},\theta_{n}^{g}$. These stochastic gradient calculations also characterize the approximation of the value function defined in (6).

After the training, the institutions upload their updates $\theta_{n}^{d},\theta_{n}^{g}$ to the cloud server for model aggregation. In this work, we adopt the popular model averaging approach [13] to aggregate the local model parameters: $\theta^{d}\leftarrow\frac{1}{N}\sum_{n=1}^{N}\theta_{n}^{d}$, $\theta^{g}\leftarrow\frac{1}{N}\sum_{n=1}^{N}\theta_{n}^{g}$. Then, the cloud server broadcasts the new global updates $\theta^{d},\theta^{g}$ to all institutions for the next round of GAN learning. The FedGAN process is iterated until the global loss function converges with a desired accuracy. The training procedure of the proposed FedGAN is summarized in Algorithm 1.

Algorithm 1 Training procedure of the proposed FedGAN

1: Cloud server executes:
2: Initialize global training period $T$, local training epoch $L$, local weights $\theta_{n}^{d},\theta_{n}^{g},\forall n\in\mathcal{N}$, learning rate $\sigma$
3: for each global round $t=1,2,...,T$
4:   for each institution $n\in\mathcal{N}$
5:     $\theta^{d}_{n,t+1},\theta^{g}_{n,t+1}\leftarrow\textbf{LocalUpdate}(n,\theta_{t}^{d},\theta_{t}^{g})$
6:   end for
7:   $\theta^{d}_{t+1}\leftarrow\frac{1}{N}\sum_{n=1}^{N}\theta^{d}_{n,t+1}$
8:   $\theta^{g}_{t+1}\leftarrow\frac{1}{N}\sum_{n=1}^{N}\theta^{g}_{n,t+1}$
9: end for
10: $\textbf{LocalUpdate}(n,\theta^{d},\theta^{g})$: // Run at each institution $n$
11: for each local epoch $i=1,2,...,L$
12:   Sample $k$ minibatches of noise samples $\{z^{(1)},z^{(2)},...,z^{(k)}\}$ from distribution $p^{z}_{n}(z)$
13:   Sample $k$ minibatches of real data $\{x^{(1)},x^{(2)},...,x^{(k)}\}$ from actual COVID-19 image distribution $p^{d}_{n}(x)$
14:   Update the discriminator via (8)
15:   Update the generator via (9)
16:   Update the weights $\theta^{d}$ and $\theta^{g}$: $\theta^{d}\leftarrow\theta^{d}-\sigma\nabla_{\theta^{d}}(\theta^{d},\theta^{g});\ \theta^{g}\leftarrow\theta^{g}-\sigma\nabla_{\theta^{g}}(\theta^{d},\theta^{g})$   (10)
17: end for
18: Return $\theta^{d},\theta^{g}$ to the cloud server
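The control flow of Algorithm 1 can be illustrated with a deliberately small, runnable sketch. It is only a toy under stated assumptions: the discriminator is a one-dimensional logistic model $D(x)=\mathrm{sigmoid}(wx+b)$, the generator is $G(z)=z+\mathrm{shift}$, each institution holds scalar samples instead of X-ray images, and `k` is interpreted as the number of samples drawn per update. The function names and hyperparameter values are hypothetical and do not reproduce the CNN-based implementation evaluated in this paper.

```python
import math
import random


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def average(vectors):
    """Model averaging: element-wise mean of the local parameter vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def local_update(theta_d, theta_g, real_data, rng, local_epochs, k, lr):
    """LocalUpdate for a toy GAN: D(x) = sigmoid(w*x + b), G(z) = z + shift."""
    w, b = theta_d
    (shift,) = theta_g
    for _ in range(local_epochs):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]  # noise samples
        x = rng.sample(real_data, k)                 # real samples
        fake = [zi + shift for zi in z]
        # Discriminator step: ascend the gradient of (8).
        d_real = [sigmoid(w * xi + b) for xi in x]
        d_fake = [sigmoid(w * fi + b) for fi in fake]
        grad_w = sum((1 - dr) * xi - df * fi
                     for dr, xi, df, fi in zip(d_real, x, d_fake, fake)) / k
        grad_b = sum((1 - dr) - df for dr, df in zip(d_real, d_fake)) / k
        w += lr * grad_w
        b += lr * grad_b
        # Generator step: descend the gradient of (9) with the updated D.
        d_fake = [sigmoid(w * fi + b) for fi in fake]
        grad_shift = sum(-df * w for df in d_fake) / k
        shift -= lr * grad_shift
    return [w, b], [shift]


def fedgan_train(datasets, rounds, local_epochs, k, lr, seed=0):
    """Server loop: broadcast global weights, collect local updates, average."""
    theta_d, theta_g = [0.0, 0.0], [0.0]
    rngs = [random.Random(seed + n) for n in range(len(datasets))]
    for _ in range(rounds):
        updates = [
            local_update(theta_d, theta_g, data, rng, local_epochs, k, lr)
            for data, rng in zip(datasets, rngs)
        ]
        theta_d = average([u[0] for u in updates])
        theta_g = average([u[1] for u in updates])
    return theta_d, theta_g


# Three hypothetical institutions with iid scalar data drawn from N(2, 1).
data_rng = random.Random(42)
datasets = [[data_rng.gauss(2.0, 1.0) for _ in range(200)] for _ in range(3)]
theta_d, theta_g = fedgan_train(datasets, rounds=5, local_epochs=5, k=32, lr=0.05)
print(theta_d, theta_g)
```

Only the parameter vectors leave each institution in this sketch; the raw samples in `datasets` are read solely inside `local_update`, mirroring the separation between local training and server-side aggregation in Algorithm 1.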

In FedGAN, the computational complexity mostly comes from the computation at each institution, since the cloud server only implements the parameter aggregation and does not incur much computational cost. Thus, we here focus on analyzing the computational complexity at each institution. In every global training round, each institution $n$ trains its discriminator and generator to compute its $\theta_{n}^{d}$ and $\theta_{n}^{g}$, respectively. To do that, the generator $G_{n}$ produces $k$ minibatches of fake samples of batch size $b^{g}$, and the discriminator $D_{n}$ samples $k$ minibatches of real data of batch size $b^{d}$. Accordingly, in the generator, the batch generation requires $kb^{g}G_{fp}$ floating point operations, where $G_{fp}$ is the number of floating point operations needed by the generator to generate a fake data sample, and a memory of $kb^{g}G_{neu}$, where $G_{neu}$ is the number of neurons of the generator. Therefore, the computational complexity of the generator can be determined as $kb^{g}G_{fp}+kb^{g}G_{neu}=\mathcal{O}\left(kb^{g}(G_{fp}+G_{neu})\right)$. Similarly, the computational complexity of the discriminator can be specified as $\mathcal{O}\left(kb^{d}(D_{fp}+D_{neu})\right)$, where $D_{fp}$ is the number of floating point operations needed by the discriminator to process a data sample and $D_{neu}$ is the number of neurons of the discriminator. To sum up, the computational complexity at an institution in the FedGAN framework after $T$ global training rounds is $\mathcal{O}\left(Tk\left[\left(b^{g}(G_{fp}+G_{neu})\right)+\left(b^{d}(D_{fp}+D_{neu})\right)\right]\right)$.
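As a quick arithmetic illustration of the complexity expression above, the following sketch evaluates the term inside the $\mathcal{O}(\cdot)$ for given settings. The function name and all numerical values are hypothetical placeholders and are not measurements of the models used in this paper.

```python
def institution_cost(T, k, b_g, b_d, G_fp, G_neu, D_fp, D_neu):
    """Evaluate T * k * [b_g * (G_fp + G_neu) + b_d * (D_fp + D_neu)]."""
    generator_cost = k * b_g * (G_fp + G_neu)      # per global round
    discriminator_cost = k * b_d * (D_fp + D_neu)  # per global round
    return T * (generator_cost + discriminator_cost)


# Hypothetical settings: 100 global rounds, 10 minibatches of 32 samples each.
cost = institution_cost(
    T=100, k=10, b_g=32, b_d=32,
    G_fp=2_000_000, G_neu=50_000,
    D_fp=1_000_000, D_neu=30_000,
)
print(cost)
```

The expression is linear in $T$ and $k$, so doubling the number of global rounds or the number of minibatches per round doubles the per-institution cost.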

III-C Differential Privacy for FedGAN

To further enhance privacy for FedGAN training, we integrate an $\epsilon$-differential privacy solution at each hospital site, where $\epsilon$ is the distinguishable bound of all outputs on two adjacent datasets $D,D^{\prime}$ in a database. A randomized function $\mathcal{A}$ is $\epsilon$-differentially private if

Pr[\mathcal{A}(D)\in\mathcal{S}]\leq e^{\epsilon}Pr[\mathcal{A}(D^{\prime})\in\mathcal{S}],

(11)

for every $\mathcal{S}\subseteq range(\mathcal{A})$. To guarantee $\epsilon$-differential privacy, we here apply a gradient perturbation technique with differentially-private stochastic gradient descent (DP-SGD) [12], where Gaussian noise is added to the gradient during the training. Accordingly, we can determine the gradient descent update at training round $t$ as

\theta_{t+1}=\theta_{t}-\sigma\left(\nabla L(\theta_{t})+\zeta\right),

(12)

where $L$ is the loss function of GAN and $\zeta$ is the noise guaranteeing differential privacy.
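A single noisy update of the form (12) can be sketched as follows. Two points are assumptions rather than part of the description above: the gradient is clipped to a maximum L2 norm before the noise is added, which is a common step in DP-SGD implementations, and the noise standard deviation `noise_std` is passed in directly because the calibration of the noise to a target $\epsilon$ is not specified here. Function and parameter names are hypothetical.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Rescale grad so that its L2 norm does not exceed max_norm (assumed step)."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0 or norm <= max_norm:
        return list(grad)
    return [g * max_norm / norm for g in grad]


def dp_sgd_step(theta, grad, lr, noise_std, max_norm, rng):
    """One update theta <- theta - lr * (clipped_grad + zeta), zeta ~ N(0, noise_std^2)."""
    clipped = clip_gradient(grad, max_norm)
    noisy = [g + rng.gauss(0.0, noise_std) for g in clipped]
    return [t - lr * g for t, g in zip(theta, noisy)]


rng = random.Random(0)
theta = [1.0, 2.0]
grad = [3.0, 4.0]  # hypothetical gradient of the GAN loss at theta
theta_next = dp_sgd_step(theta, grad, lr=0.1, noise_std=0.5, max_norm=1.0, rng=rng)
print(theta_next)
```

With `noise_std=0.0` and a large `max_norm`, the step reduces to plain gradient descent; larger noise gives stronger perturbation of the uploaded update at the expense of a noisier training signal.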

IV Blockchain-based Federated Learning for COVID-19 Detection

Although FedGAN can support privacy-enhanced COVID-19 data analytics, how to address the security issues in terms of information leakage caused by a malicious third-party cloud and single-point failures is a critical challenge. More specifically, the traditional FL framework relies on a single cloud server to perform FL aggregation. However, the cloud party may illegally exploit the information uploaded from hospitals without the consent of healthcare users, which potentially leads to the leakage of sensitive health information. Adversaries may also deploy attacks to steal or modify the FL updates during the aggregation process at the cloud server. Moreover, the centralized configuration in traditional FL systems is also vulnerable to the risk of single-point failures if the cloud server is attacked, which would disrupt the entire FL process [14].

The roles of blockchain for secure COVID-19 data analytics have been reviewed in our recent work [34], where blockchain can enable decentralized data learning over distributed institutions without the need for a central server. Therefore, we here present a novel decentralized FedGAN framework by integrating a blockchain-based solution, as illustrated in Fig. 2. Instead of relying on a centralized cloud server to coordinate the FL aggregation, we here replace it with a blockchain that is able to decentralize the FL aggregation for security enhancement [35]. Moreover, the use of blockchain is able to enhance the scalability of FL implementation in practical healthcare networks. This is enabled by its decentralization feature which allows for interconnecting distributed edge nodes and hospitals to train image datasets in a peer-to-peer manner.

IV-A Working Procedure of Blockchain-based FedGAN

The working procedure of our blockchain-based FedGAN framework is explained in the following steps:

1. The EN that is willing to join the FL process downloads an initial model from the blockchain. Note that each EN also needs to set up a wallet account that contains a public key for identification and a private key for transaction signature.

2. Each EN trains the GAN model (i.e., a generator and a discriminator) to compute the gradients using its own local COVID-19 X-ray image dataset. When differential privacy is used, $\epsilon$-differential privacy noise is added to the gradient during training. After local training, each EN submits its model updates to the blockchain by creating a transaction.

3. The miners aggregate the transactions uploaded from ENs to construct a block after a certain period of time. Then, the miners perform the mining to verify the block using a consensus mechanism.

4. After the mining, if all miners reach an agreement on the verified block, this block is appended to the blockchain. Each EN can now download the block that contains all FL updates of other ENs to compute the global model. In this regard, the global model is constructed locally instead of at the central cloud server as in the traditional FL architecture. The GAN training is iterated until the desired accuracy is achieved.
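The steps above can be sketched with a toy in-memory ledger. This is only an illustration under strong simplifications: the class and function names are hypothetical, the consensus is reduced to a single `verify` callback, key management and signatures are omitted, and plain unweighted averaging is assumed for the locally computed global model.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    sender: str      # EN identifier, e.g. its public key (hypothetical format)
    update: list     # flattened local GAN parameters


@dataclass
class Ledger:
    pending: list = field(default_factory=list)
    blocks: list = field(default_factory=list)

    def submit(self, tx):
        """Step 2: an EN uploads its local update as a transaction."""
        self.pending.append(tx)

    def mine_block(self, verify):
        """Steps 3-4: bundle pending transactions, append them if all verify."""
        if not self.pending or not all(verify(tx) for tx in self.pending):
            return False
        self.blocks.append(list(self.pending))
        self.pending.clear()
        return True


def local_global_model(block):
    """Step 4: each EN averages all updates found in the downloaded block."""
    dim = len(block[0].update)
    return [sum(tx.update[i] for tx in block) / len(block) for i in range(dim)]


ledger = Ledger()
ledger.submit(Transaction("EN-1", [1.0, 2.0]))
ledger.submit(Transaction("EN-2", [3.0, 4.0]))
if ledger.mine_block(verify=lambda tx: len(tx.update) == 2):
    print(local_global_model(ledger.blocks[-1]))
```

Because every EN reads the same block, all ENs obtain the same global model without any central aggregator.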

Based on the working procedure, we can see that the total running latency costs of blockchain-based FedGAN training mostly come from the FedGAN training latency and blockchain mining latency. In this particular work, we focus on addressing the blockchain mining latency, by proposing a novel block consensus mechanism as presented in the following.

IV-B Proposed Consensus Mechanism for Blockchain-based FedGAN

In the blockchain-based FedGAN system, when the number of transactions (e.g., local GAN updates) submitted to the blockchain increases, the consensus workload to validate and append them to the blockchain also increases significantly. Although consensus mechanisms such as Delegated Proof-of-Stake (DPoS) have been applied to replace computationally expensive consensus schemes like Proof-of-Work, these solutions still incur high latency. Indeed, in current consensus algorithms, e.g., DPoS [36], each miner must contact more than half of the total nodes in the miner group, which increases latency and limits the scalability of the blockchain system. Moreover, each miner node must carry out a repeated verification process across the miner network, which results in unnecessary consensus latency. A possible solution is to reduce the number of miner nodes to mitigate the consensus latency, but this potentially compromises the security of the blockchain because of the higher probability of adding compromised transactions from malicious nodes [36]. To solve these mining issues, we propose a new lightweight consensus mechanism called Proof of Reputation (PoR) for our blockchain-based FedGAN system. Compared to the DPoS scheme, we improve the miner selection through a reputation score evaluation approach. Moreover, instead of using repeated verification among miner nodes, we implement a lightweight block verification solution that allows each miner to verify only once with another node during the consensus process, which significantly reduces the verification latency. Our PoR consensus has two main parts: miner node selection and block verification.

IV-B1 Miner Node Selection

In this phase, the ENs first calculate the reputation score of miners and then select the miner nodes to implement the mining process.

- Reputation Calculation: In our blockchain-based FedGAN system, in addition to COVID-19 data training, ENs also participate in the delegate selection process to vote for the mining candidates that perform blockchain consensus. In this regard, each EN votes for its preferred miners with the highest reputation. Here, the reputation of a miner is measured by its computing capability to mine the block. That is, a miner that allocates more computational resources to the mining tasks will have a higher reputation score and thus a higher priority for mining the block. To this end, we define a reputation function to determine the score for each miner as follows:

\Psi_{m}=e^{1-\frac{T_{m}^{PoR}}{\tau}}-1.

(13)

Here, $T_{m}^{PoR}$ is the mining latency of the miner $m$ where $m$ is the miner index in the mining group with $m\in\{1,...,M\}$ , and $\tau$ denotes the mining latency threshold. This equation implies that the miner that has lower mining latency will achieve a higher reputation score to obtain a higher priority for mining the block.

- Miner Selection: Based on the calculated reputation scores, each EN votes for miner candidates according to their reputation ranking. The top miners in the mining group with the highest reputation scores are selected to become actual miners (here, we call them edge miners (EMs)) to perform the mining. Besides, similar to the traditional DPoS framework [37], each of the EMs also acts as a block manager during its time slot of the consensus process. The block manager is responsible for generating blocks, broadcasting blocks to other miners for verification, and aggregating blocks after they are verified.
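The two phases above can be sketched as follows. The reputation function implements (13) directly; the number of selected miners (`num_selected`) and the example latencies are hypothetical, since the selection size is not fixed in the text.

```python
import math


def reputation(latency_ms, tau_ms):
    """Eq. (13): Psi_m = exp(1 - T_m^PoR / tau) - 1."""
    return math.exp(1.0 - latency_ms / tau_ms) - 1.0


def select_edge_miners(latencies_ms, tau_ms, num_selected):
    """Rank candidates by reputation and keep the top `num_selected`."""
    scores = {m: reputation(t, tau_ms) for m, t in latencies_ms.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:num_selected], scores


# Hypothetical mining latencies (ms) for three candidates.
candidates = {"miner-a": 200.0, "miner-b": 900.0, "miner-c": 1500.0}
selected, scores = select_edge_miners(candidates, tau_ms=1000.0, num_selected=2)
print(selected)
print(scores)
```

Under (13), a miner whose latency equals the threshold $\tau$ scores exactly zero, and a miner slower than $\tau$ receives a negative score, so the threshold acts as a natural cut-off in the ranking.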

IV-B2 Lightweight Block Verification

The block manager first generates an unverified block that contains several health transactions aggregated in a certain time period, and then transmits this block to all EMs for verification. Different from the traditional DPoS scheme, which relies on a repeated verification process among miners, we implement a lightweight PoR-based verification solution in which each miner verifies only once with another node during the consensus process, which significantly reduces the verification latency. The block manager first divides the block $B$ containing all transactions into $K$ transaction parts $Tr_{k}$ ($k=1,\ldots,K$), each of which is assigned to an EM member $EM_{m}$ within the miner group. Each miner $EM_{m}$ is also assigned a unique random number $R_{m}$. Subsequently, each $EM_{m}$ selects any other miner $s$ ($s\neq m$) for verification of its assigned transaction part $Tr_{k}$. If a majority of EMs (at least 51%) return positive outcomes, the block manager approves the verified block $B$ and appends it to the chain.
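A minimal sketch of this verification flow is given below. It makes several assumptions that go beyond the text: the number of parts equals the number of EMs, a part counts as positive only when both the assigned miner and its single chosen peer accept every transaction in it, and `is_valid(miner, tx)` is a hypothetical stand-in for a miner's local check. The role of the random number $R_{m}$ is not modelled.

```python
import random


def split_block(transactions, num_parts):
    """Divide the block's transactions into `num_parts` nearly equal parts."""
    return [transactions[i::num_parts] for i in range(num_parts)]


def por_verify(transactions, miners, is_valid, threshold=0.51, rng=None):
    """Return (approved, positives) for one block.

    Each miner verifies its own part once together with one other miner.
    """
    if len(miners) < 2:
        raise ValueError("each miner needs at least one peer")
    rng = rng or random.Random()
    parts = split_block(transactions, len(miners))
    positives = 0
    for m, part in zip(miners, parts):
        peer = rng.choice([s for s in miners if s != m])   # any miner s != m
        own_ok = all(is_valid(m, tx) for tx in part)
        peer_ok = all(is_valid(peer, tx) for tx in part)
        positives += own_ok and peer_ok
    return positives / len(miners) >= threshold, positives


# Hypothetical block of nine transactions checked by three edge miners.
approved, positives = por_verify(list(range(9)), ["EM-1", "EM-2", "EM-3"],
                                 is_valid=lambda miner, tx: tx >= 0,
                                 rng=random.Random(0))
print(approved, positives)
```

Each part is checked by two miners rather than by the whole group, which is where the latency saving over repeated verification comes from.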

IV-B3 Latency of Block Verification

In this sub-section, we calculate the verification latency incurred by the mining. Here, we assume that each EM receives a transaction part $Tr_{k}$ of the same size. The EMs contribute their computing resources $C=\{c_{1},\ldots,c_{M}\}$ (in CPU cycles/s) to execute the verification of the transaction part $k$. For each EM $m$, the CPU resource required to verify the transaction part $k$ is $\Phi_{m}$. We also denote by $Tr_{k}^{re}$ the size of the verification result for $Tr_{k}$.

Conceptually, the block verification process in our proposed PoR mechanism at an EM consists of four steps: (1) unverified block transmission from the block manager to the EMs, (2) local block verification at the EM, (3) broadcasting of the verification result between two EMs, and (4) transmission of verification result feedback from the EMs to the manager. The delay caused by the execution of these steps at each miner $m$ can be calculated as:

T_{m}^{PoR}=\frac{Tr_{k}}{r_{m}^{d}}+\frac{\Phi_{m}}{c_{m}}+\xi Tr_{k}|L^{2}|+\frac{Tr_{k}^{re}}{r_{m}^{u}},\quad k\in\{1,\ldots,K\},

(14)

where $r_{m}^{u}$ and $r_{m}^{d}$ represent the transmission rates of the miner-manager uplink and downlink, respectively. Here, the transmission latency of the transaction part $Tr_{k}$ is $\frac{Tr_{k}}{r_{m}^{d}}$, and the latency for local verification is $\frac{\Phi_{m}}{c_{m}}$. The latency for transaction broadcasting between two miners is specified as $\xi Tr_{k}|L^{2}|$, where $\xi$ is a pre-defined parameter of transaction broadcasting between two miners and can be determined via historical verification records [38]. The last component is the verification feedback time, given by $\frac{Tr_{k}^{re}}{r_{m}^{u}}$.

Meanwhile, in the DPoS model [37], each miner needs to perform repeated verification on the whole block $B$ , where its verification latency is computed as [38]:

T_{m}^{DPoS}=\frac{B}{r_{m}^{d}}+\frac{\Phi_{m}^{B}}{c^{B}_{m}}+\xi B|L^{N}|+\frac{B^{re}}{r_{m}^{u}},

(15)

where $\Phi_{m}^{B}$ represents the CPU resource needed for verifying the block $B$ with respect to the total budget $c_{m}^{B}$. Moreover, $B^{re}$ is the size of the verified outcome for block $B$, and $|L^{N}|$ indicates that all miners join the repeated verification of the block. Comparing (14) and (15), the proposed PoR scheme consumes less time in the verification process than the traditional DPoS scheme for the same block size and number of miners, because each miner handles only a part $Tr_{k}$ of the block and exchanges results with a single peer. The benefits of our proposed PoR mechanism will be verified in the following section.
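The comparison between (14) and (15) can be made concrete with a small numerical sketch. It rests on an interpretation that should be read as an assumption: $|L^{2}|$ and $|L^{N}|$ are treated as the number of miners involved in the result exchange (2 for PoR, all miners for DPoS). All parameter values are hypothetical, use arbitrary but consistent units, and are not the settings used in our simulations.

```python
def por_latency(tr, tr_re, phi, c, r_d, r_u, xi, group=2):
    """Eq. (14): download + local verification + pairwise exchange + feedback."""
    return tr / r_d + phi / c + xi * tr * group + tr_re / r_u


def dpos_latency(block, block_re, phi_b, c_b, r_d, r_u, xi, num_miners):
    """Eq. (15): same four terms, but on the whole block and across all miners."""
    return block / r_d + phi_b / c_b + xi * block * num_miners + block_re / r_u


# Hypothetical setting: a block split into K = 10 equal parts among 10 miners.
K, miners = 10, 10
block, block_re = 500.0, 50.0            # data sizes
phi_b, c = 5.0e5, 1.0e5                  # CPU demand and CPU capacity
rates = dict(r_d=200.0, r_u=200.0, xi=0.001)

t_por = por_latency(block / K, block_re / K, phi_b / K, c, **rates)
t_dpos = dpos_latency(block, block_re, phi_b, c, num_miners=miners, **rates)
print(t_por, t_dpos)
```

In this sketch every term of (14) is no larger than the matching term of (15), since the data size, CPU demand, and exchange group all shrink; the size of the gap depends entirely on the assumed parameters.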

V Experiments and Performance Evaluations

V-A Experimental Settings

We use two popular COVID-19 X-ray datasets for simulations: the DarkCOVID dataset [7] with a total of 620 X-ray images and the ChestCOVID dataset [39] with a total of 950 X-ray images. Both cover three classes (COVID-19, normal (no pneumonia), and pneumonia (with no COVID-19 infection)) and have been collected from different regions, which makes them suitable for our FedGAN setting. We split each dataset into training and testing sets with an 80:20 ratio.

In the FedGAN system, we set up five institutions, each of which has a discriminator and a generator based on CNNs, as shown in Fig. 1. Each discriminator $D$ takes COVID-19 X-ray images of size $64\times 64\times 1$, with the mini-batch size set to 32. The discriminator is configured with five hidden layers, each having 128 dimensions along with LeakyReLU and dropout. Furthermore, each generator $G$ takes 64-dimensional noise samples from a standard Gaussian distribution. Every generator has five hidden layers, where the first three layers have 256 dimensions and the last two layers have 128 dimensions. These parameters are selected based on preliminary experimental results. The FedGAN is trained for 500 global rounds, where each local GAN is trained for 20 epochs.

Moreover, we design a CNN-based classifier for COVID-19 detection with three classes: COVID-19 positive, normal, and pneumonia. The CNN architecture consists of an input layer with the shape of $32\times 32\times 3$ and three hidden convolutional layers with $3\times 3$ kernels, ReLU activation functions, and max pooling. Here, the first hidden layer has 32 dimensions and the second layer has 128 dimensions, with ReLU as the activation in each layer. The final layer has three dimensions with a softmax activation that outputs prediction results over the three classes, and SGD is used as the optimizer. Moreover, batch normalization and dropout (0.5) are added to avoid overfitting on the training set. The CNN classifier is trained with a learning rate of 0.001. These hyperparameters are selected via multiple training trials for reliable classification results. All simulations were implemented in PyTorch on a desktop server with an Intel Core i7 4.7 GHz CPU and 128 GB memory with an Nvidia Pascal Titan X and CUDA 8.0.

V-B Performance Evaluations on FedGAN

TABLE II: Generation of synthetic COVID-19 X-ray image data using the FedGAN model.

Dataset	Classes	Original data	Synthetic data
DarkCOVID dataset	COVID-19	150	500
	Normal	232	500
	Pneumonia	238	500
	Sum	620	1500
ChestCOVID dataset	COVID-19	223	800
	Normal	421	800
	Pneumonia	306	800
	Sum	950	2400

TABLE III: Accuracies of CNN classifier on mixed actual data and synthetic data on DarkCOVID dataset.

Training size	$\alpha=0$	$\alpha=1$	$\alpha=2$	$\boldsymbol{\alpha=3}$	$\alpha=4$
500	0.487	0.550	0.842	0.911	0.901
1000	0.522	0.614	0.850	0.925	0.934
1500	0.663	0.749	0.871	0.931	0.927
2000	0.784	0.799	0.872	0.967	0.955

TABLE IV: Accuracies of CNN classifier on mixed actual data and synthetic data on ChestCOVID dataset.

Training size	$\beta=0$	$\beta=1$	$\beta=2$	$\beta=3$	$\boldsymbol{\beta=4}$	$\beta=5$
500	0.561	0.617	0.715	0.890	0.931	0.900
1000	0.547	0.621	0.745	0.846	0.945	0.918
1500	0.57	0.732	0.870	0.871	0.950	0.925
2000	0.61	0.808	0.904	0.932	0.962	0.953
2500	0.690	0.893	0.945	0.945	0.973	0.932

We investigate the performance of our proposed FedGAN scheme and compare it with the state-of-the-art schemes, including: the standalone scheme [5], the standalone scheme with GAN [18], the FL scheme without GAN [24], and the centralized scheme (all datasets are transmitted to the cloud for classification). All considered schemes use a CNN-based classifier for evaluating the COVID-19 detection. For reliable evaluation, the reported results are averaged from five runs of numerical simulations.

TABLE V: Comparison of performance results for COVID-19 detection on ChestCOVID dataset.

Classes	Standalone scheme without GAN			Standalone scheme with GAN			FL scheme without GAN			Proposed FedGAN scheme
Classes	Precision	Sensitivity	F1-score	Precision	Sensitivity	F1-score	Precision	Sensitivity	F1-score	Precision	Sensitivity	F1-score
COVID-19	0.871	0.848	0.907	0.909	0.893	0.903	0.975	0.979	0.988	0.993	0.978	0.991
Normal	0.867	0.997	0.877	0.876	0.959	0.884	0.946	0.999	0.948	0.964	1	0.969
Pneumonia	0.847	0.550	0.696	0.857	0.542	0.689	0.941	0.808	0.891	0.966	0.876	0.932

V-B1 Evaluation of FedGAN Training

We first evaluate the training loss of the FedGAN model, including the discriminator and generator losses for training on the DarkCOVID and ChestCOVID datasets over 500 epochs, as shown in Fig. 3. Notably, the discriminator loss converges stably after 100 epochs on both datasets, which shows that the FedGAN model can synthesize COVID-19 image data whose quality improves over the training time. That is, the FedGAN model can learn the features of real COVID-19 X-ray images and generate high-quality synthetic COVID-19 X-ray images such that the discriminator cannot differentiate them from the actual ones.

In Fig. 4, we compare the discriminator loss of the proposed FedGAN and the standalone scheme with GAN [18]. It can be seen that the standalone scheme cannot achieve a good result in both cases due to the lack of access to the full dataset. By contrast, our FedGAN scheme can achieve better minimum loss thanks to its ability to learn data over the entire distribution of all institutions. These simulated results are also aligned with our theoretical analysis in Section III. Therefore, our model can produce better COVID-19 X-ray images, as illustrated in Fig. 5. The details of data generation for both datasets with synthetic COVID-19 X-ray image numbers are presented in TABLE II. Here, we generate 1500 synthetic DarkCOVID images and 2400 synthetic ChestCOVID images, with equal image volume in each class which thus addresses dataset limitation and imbalance. We will use these synthetic data associated with real data in the following simulations for COVID-19 detection.

V-B2 Evaluation of COVID-19 Detection Performance with FedGAN

To implement COVID-19 detection, an important step is to determine how much synthetic data should be used for the best detection rate. To do so, we mix actual data and synthetic data according to different ratios, defined as the number of synthetic samples over the number of actual samples and denoted as $\alpha$ and $\beta$ on the DarkCOVID and ChestCOVID datasets, respectively. $\alpha=0$ and $\beta=0$ imply that only actual data is used for training. From TABLE III, when the mixing ratio $\alpha$ increases, the accuracy of the CNN classifier generally increases on the DarkCOVID training subsets, which shows the benefit of data augmentation offered by GANs in COVID-19 classification. Notably, $\alpha=3$ yields the best accuracy on most DarkCOVID training sets, but further increasing the amount of synthetic data can degrade the accuracy due to overfitting. Similarly, the results on the ChestCOVID dataset in TABLE IV show that the highest accuracy is achieved with $\beta=4$. Therefore, we will use these mixing ratios for the remaining simulations.
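The selection of the mixing ratio can be reproduced from the reported accuracies. The sketch below reads the values of TABLE III (taking the $\alpha=0$ entry for training size 1000 as 0.522) and picks the best ratio per training size; the helper `mix` is a hypothetical illustration of how a training set with a given ratio could be assembled.

```python
from collections import Counter

ALPHAS = [0, 1, 2, 3, 4]
TABLE_III = {                       # training size -> accuracy for each alpha
    500:  [0.487, 0.550, 0.842, 0.911, 0.901],
    1000: [0.522, 0.614, 0.850, 0.925, 0.934],
    1500: [0.663, 0.749, 0.871, 0.931, 0.927],
    2000: [0.784, 0.799, 0.872, 0.967, 0.955],
}


def best_ratio_per_size(table, ratios):
    """For each training size, return the ratio with the highest accuracy."""
    return {size: ratios[max(range(len(acc)), key=acc.__getitem__)]
            for size, acc in table.items()}


def mix(real, synthetic, ratio):
    """Build a training set with `ratio` synthetic samples per real sample."""
    needed = ratio * len(real)
    if needed > len(synthetic):
        raise ValueError("not enough synthetic samples for this ratio")
    return list(real) + list(synthetic[:needed])


best = best_ratio_per_size(TABLE_III, ALPHAS)
print(best)
print(Counter(best.values()).most_common(1))
```

On these values, $\alpha=3$ is the best ratio for three of the four training sizes, and $\alpha=4$ is marginally better only for the 1000-sample subset.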

Next, we evaluate the performance of COVID-19 detection via common quality metrics including precision, sensitivity, and F1-score. As illustrated in TABLE V, our proposed FedGAN scheme outperforms the other frameworks on all three metrics in nearly all cases. For instance, for the COVID-19 class in the ChestCOVID dataset, our scheme increases the precision and F1-score to 0.993 and 0.991, respectively, compared to lower results for the other schemes. The advantages of our FedGAN in COVID-19 detection are also confirmed via the confusion matrices in Fig. 6 and Fig. 7.
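For reference, the three metrics can be computed per class from a confusion matrix using their standard definitions, as in the minimal sketch below. The confusion matrix shown is hypothetical and is not taken from Fig. 6 or Fig. 7.

```python
def per_class_metrics(confusion, labels):
    """Return {label: (precision, sensitivity, f1)}.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    metrics = {}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                      # missed samples of class i
        fp = sum(row[i] for row in confusion) - tp       # other classes predicted as i
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        metrics[label] = (precision, sensitivity, f1)
    return metrics


# Hypothetical three-class confusion matrix, for illustration only.
labels = ["COVID-19", "Normal", "Pneumonia"]
confusion = [[40, 2, 3],
             [1, 80, 4],
             [4, 6, 50]]
for label, (p, s, f1) in per_class_metrics(confusion, labels).items():
    print(f"{label}: precision={p:.3f} sensitivity={s:.3f} f1={f1:.3f}")
```

Since the F1-score is the harmonic mean of precision and sensitivity, it always lies between the two.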

Furthermore, we evaluate different schemes in terms of FCN scores. They are used to measure the quality of the generated images on an input segmentation map that can be implemented by feeding generated X-ray images into the fully-convolutional semantic segmentation network (FCN). We use three standard segmentation metrics following CycleGAN [40] to evaluate FCN scores, including the per-pixel accuracy, the per-class accuracy, and the mean class Intersection-Over-Union (IOU). As indicated in Table VI, our FedGAN scheme outperforms the other approaches for both datasets. For example, in the training of DarkCOVID dataset, our scheme can improve the per-pixel accuracy by 29%, increase the per-class accuracy by 18% and the mean IOU by 12%, compared to the FL scheme without GAN. The improvements on FCN scores of our scheme over existing schemes are also shown on the training of ChestCOVID dataset in Table VII, showing the better stability of image-label translation of our proposed approach.
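The three segmentation metrics can be computed from a pixel-level confusion matrix using their standard definitions, as sketched below. The matrix is hypothetical; in practice it would be accumulated by comparing the FCN predictions on generated images against the label maps.

```python
def segmentation_scores(confusion):
    """Return (per-pixel accuracy, per-class accuracy, mean IoU).

    confusion[i][j] = number of pixels of true class i predicted as class j.
    Classes that never occur are skipped when averaging.
    """
    n = len(confusion)
    diag = [confusion[i][i] for i in range(n)]
    row = [sum(confusion[i]) for i in range(n)]                        # true pixels
    col = [sum(confusion[r][i] for r in range(n)) for i in range(n)]   # predicted pixels

    pixel_acc = sum(diag) / sum(row)
    class_accs = [d / r for d, r in zip(diag, row) if r]
    ious = [d / (r + c - d) for d, r, c in zip(diag, row, col) if r + c - d]
    return pixel_acc, sum(class_accs) / len(class_accs), sum(ious) / len(ious)


# Hypothetical two-class pixel confusion matrix, for illustration only.
pixel_acc, class_acc, mean_iou = segmentation_scores([[3, 1],
                                                      [1, 5]])
print(pixel_acc, class_acc, mean_iou)
```

Mean IoU penalizes both missed and spurious pixels of a class, which is why it is typically the lowest of the three scores.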

TABLE VI: Comparison of FCN scores on DarkCOVID dataset.

Schemes	Per-pixel acc.	Per-class acc.	Mean IOU
Standalone scheme without GAN	0.32	0.24	0.28
Standalone scheme with GAN	0.47	0.41	0.35
FL scheme without GAN	0.63	0.52	0.49
Proposed FedGAN scheme	0.82	0.65	0.56

TABLE VII: Comparison of FCN scores on ChestCOVID dataset.

Schemes	Per-pixel acc.	Per-class acc.	Mean IOU
Standalone scheme without GAN	0.45	0.40	0.34
Standalone scheme with GAN	0.59	0.58	0.39
FL scheme without GAN	0.75	0.63	0.43
Proposed FedGAN scheme	0.87	0.74	0.59

We then investigate the detection performance in terms of accuracy for different FL schemes and our FedGAN scheme on the testing datasets. The standalone scheme is used as the baseline, which theoretically has the lowest accuracy due to the lack of federation. As shown in Fig. 8, the more institutions participate in data training, the higher the accuracy achieved on both datasets. This can be explained by the enhanced image feature learning efficiency: diverse data sources improve the generalizability of the CNN model as the number of institutions increases. Moreover, our FedGAN scheme achieves the best accuracy among all FL approaches and is close to the centralized scheme with the full dataset. For example, for testing on the DarkCOVID dataset in Fig. 8(a), when the epoch is 200, the accuracy of our scheme stands at 0.992, which is 8%, 19%, 25%, and 28% higher than those of the FL schemes with 2, 3, and 4 institutions and the standalone scheme, respectively. It also achieves a performance level competitive with the ideal centralized scheme (0.995). The accuracy of our scheme is also the highest among the baselines on the ChestCOVID testing dataset in Fig. 8(b), reaching 0.985 at 200 epochs, close to the centralized scheme.

Moreover, we compare the accuracy of our scheme with other COVID-19 detection schemes, as indicated in Fig. 9. Our FedGAN scheme significantly improves the accuracy on both datasets due to its combination of GAN and federated learning. That is, our scheme yields the best accuracy of 0.963 after 200 epochs for testing on the DarkCOVID dataset in Fig. 9(a), compared to the FL scheme without GAN (0.922), the standalone scheme with GAN (0.856), and the standalone scheme without GAN (0.705). Its performance is also close to that of the centralized scheme (0.969). A similar observation is obtained for testing on the ChestCOVID dataset in Fig. 9(b), with a notable accuracy of 0.975 achieved by our proposed FedGAN scheme. These simulation results demonstrate the effectiveness of our design in detecting COVID-19.

V-B3 Evaluation of Differential Privacy-enabled FedGAN Performance

We investigate the accuracy of the FedGAN scheme with differential privacy, where $\epsilon$ is set to 0.3. As can be seen in Fig. 10, although differential privacy provides a degree of privacy to COVID-19 data training, the scheme suffers a degradation in accuracy on both datasets. How to achieve a balance between privacy preservation and data utility (e.g., training accuracy) is still an open problem for further investigation.

Next, we evaluate the FedGAN scheme in terms of the data utility measured by the F1-score when varying the privacy parameter $\epsilon$ from 0.01 to 0.5. As shown in Fig. 11, when $\epsilon$ increases, the level of privacy decreases, which in turn enhances the data utility. This trend is consistent for both the DarkCOVID and ChestCOVID datasets. The simulation result also implies that the selection of the privacy parameter $\epsilon$ in differential privacy settings plays a significant role in the quality of FL training.
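The trade-off can be illustrated by how the required noise scale shrinks as $\epsilon$ grows. The sketch below uses the classical Gaussian-mechanism calibration, which yields $(\epsilon,\delta)$-differential privacy for $0<\epsilon<1$; the choice of this calibration, the value of $\delta$, and the unit sensitivity are assumptions of the sketch and are not necessarily the settings used in our simulations.

```python
import math


def gaussian_noise_std(epsilon, delta, sensitivity=1.0):
    """Noise std of the classical Gaussian mechanism, valid for 0 < epsilon < 1."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("this calibration assumes 0 < epsilon < 1")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


# Assumed delta; epsilon swept over the range studied above.
for eps in (0.01, 0.1, 0.3, 0.5):
    print(eps, round(gaussian_noise_std(eps, delta=1e-5), 2))
```

The noise standard deviation is inversely proportional to $\epsilon$, so moving from $\epsilon=0.01$ to $\epsilon=0.5$ reduces the injected noise by a factor of 50, which is consistent with the higher utility observed at larger $\epsilon$.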

V-C Performance Evaluations on Blockchain-based FedGAN

Here, we evaluate the performance of our proposed PoR consensus scheme via numerical simulations and compare it with the traditional DPoS scheme in terms of the block verification latency. We set up 10 transactions per block and vary the number of miners from 2 to 100. Motivated by [38], the mining parameters are set up as follows: edge computation resources $c_{m}\in[10^{3},10^{6}]$ CPU cycles/s, input/output block data sizes $B=500$ KB and $B^{re}=50$ KB, uplink transmission rate $r_{m}^{u}\in[100,250]$ kbps, downlink transmission rate $r_{m}^{d}\in[100,250]$ kbps, $\xi=0.5$, and $\tau_{m}=1000$ ms.

V-C1 Latency Performance of Block Verification

We evaluate the block mining latency when varying the size of the data block from 50 KB to 500 KB in a blockchain network with 10 miners. As shown in Fig. 12, our proposed mining scheme yields a lower verification latency than the traditional DPoS scheme due to our lightweight verification strategy. In particular, the advantage of our mechanism is most pronounced when the data block is large (e.g., $>400$ KB), where the DPoS scheme requires much more time to verify the block. The simulation results are consistent with the block mining analysis in Section IV-B3.

V-C2 Accuracy Performance

We investigate the accuracy of the blockchain-based FedGAN scheme and compare it with other related schemes. From the simulation results in Fig. 13, we find that the blockchain-based FedGAN scheme achieves accuracy competitive with the baseline FedGAN scheme, especially when training with the DarkCOVID dataset. This shows that the decentralized model aggregation enabled by the blockchain consensus does not affect the overall training performance, while providing a high degree of security for the federated data training. Compared with other existing schemes, such as the standalone scheme and the FL scheme without GAN, the blockchain-based FedGAN scheme also achieves much better accuracy on both datasets.

V-C3 Performance of Total Running Latency

We then compare the cost of the blockchain-based FedGAN scheme with our PoR design in terms of the total running latency which consists of the training latency and the mining latency at a global training cycle. For fair comparison, we use the pure FL scheme without blockchain as the baseline which only has the training latency. As indicated in Fig. 14 for both datasets, our proposed blockchain-based FedGAN scheme can achieve a relatively competitive latency performance with the FL baseline scheme and save much running time compared with the traditional blockchain-based FedGAN scheme with the DPoS. For example, our scheme can reduce the running time by 23.2% and 19.5% when training the DarkCOVID and ChestCOVID datasets, respectively, compared with the traditional scheme with DPoS. These results are enabled by our advanced consensus design that minimizes the mining latency during the aggregation stage, which leads to the reduction in the overall running time.

To fully realize FL-GAN in practical healthcare networks, several issues and challenges should be considered. For example, how to ensure efficient resource scheduling for edge nodes to run GAN models is a critical issue, since training on large-scale X-ray images along with mining requires a significant amount of energy and memory resources. Another challenge is the lack of motivation for edge nodes to join the FL-GAN process. An edge node may not be willing to devote its resources to training on image datasets and running mining if there are no incentives or rewards. Therefore, it is desirable to develop proper incentive mechanisms to encourage edge nodes from hospitals to participate in the FL-GAN process, which in turn ensures the robustness of the federated health data training.

VI Conclusion and Future Work

This paper has proposed FedGAN, a novel scheme for COVID-19 detection that enables the joint design of FL and GAN in a distributed medical network with edge cloud computing. The proposed approach combines data augmentation using distributed GANs with federated data training using FL, without sharing actual data, for COVID-19 detection. To enhance the privacy in federated COVID-19 data training, we have applied a differential privacy solution at each hospital institution. We have then proposed a new blockchain-based FedGAN framework for secure COVID-19 data analytics, by decentralizing the FL process over the hospital institutions with a novel mining mechanism. Our theoretical analysis and numerical simulations have shown that our scheme significantly improves the performance of COVID-19 detection, with a high detection accuracy and low running time, compared to the state-of-the-art schemes.

In future work, it is of interest to extend the proposed FL-blockchain model to other healthcare applications. For example, the integrated FL-blockchain framework can be useful for federated human activity analytics, where wearable sensor devices can collaborate to train a shared human motion classification model. In such cases, blockchain can be used to establish a decentralized network of sensor devices to share the model updates and coordinate the training without relying on a centralized authority.

References

[1] X. Kong, K. Wang, S. Wang, X. Wang, X. Jiang, Y. Guo, G. Shen, X. Chen, and Q. Ni, “Real-time Mask Identification for COVID-19: An Edge Computing-based Deep Learning Framework,” IEEE Internet of Things Journal, pp. 1–1, 2021.
[2] M. Li, D. G. Andersen, J. W. Park, A. J. Smola, A. Ahmed, V. Josifovski, J. Long, E. J. Shekita, and B.-Y. Su, “Scaling distributed machine learning with the parameter server,” in 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), 2014, pp. 583–598.
[3] Q.-V. Pham, D. C. Nguyen, T. Huynh-The, W.-J. Hwang, and P. N. Pathirana, “Artificial Intelligence (AI) and Big Data for Coronavirus (COVID-19) Pandemic: A Survey on the State-of-the-Arts,” IEEE Access, vol. 8, pp. 130 820–130 839, 2020.
[4] Y. Song, S. Zheng, L. Li, X. Zhang, X. Zhang, Z. Huang, J. Chen, R. Wang, H. Zhao, Y. Zha, J. Shen, Y. Chong, and Y. Yang, “Deep learning Enables Accurate Diagnosis of Novel Coronavirus (COVID-19) with CT images,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, pp. 1–1, 2021.
[5] K. H. Abdulkareem, M. A. Mohammed, A. Salim, M. Arif, O. Geman, D. Gupta, and A. Khanna, “Realizing an Effective COVID-19 Diagnosis System Based on Machine Learning and IoT in Smart Hospital Environment,” IEEE Internet of Things Journal, pp. 1–1, 2021.
[6] Y. Tai, B. Gao, Q. Li, Z. Yu, C. Zhu, and V. Chang, “Trustworthy and Intelligent COVID-19 Diagnostic IoMT through XR and Deep Learning-based Clinic Data Access,” IEEE Internet of Things Journal, pp. 1–1, 2021.
[7] T. Ozturk, M. Talo, E. A. Yildirim, U. B. Baloglu, O. Yildirim, and U. R. Acharya, “Automated Detection of COVID-19 cases using Deep Neural Networks with X-ray Images,” Computers in Biology and Medicine, vol. 121, p. 103792, Jun. 2020.
[8] Z. Pan, W. Yu, X. Yi, A. Khan, F. Yuan, and Y. Zheng, “Recent Progress on Generative Adversarial Networks (GANs): A Survey,” IEEE Access, vol. 7, pp. 36 322–36 333, 2019.
[9] J. Vijay Kumar et al., “Advanced Machine Learning-based Analytics on COVID-19 data using Generative Adversarial Networks,” Materials Today: Proceedings, Oct. 2020.
[10] C. Saez, N. Romero, J. A. Conejero, and J. M. Garcia-Gomez, “Potential Limitations in COVID-19 Machine Learning due to Data Source Variability: A case study in the nCov2019 Dataset,” Journal of the American Medical Informatics Association, Oct. 2020.
[11] D. C. Nguyen, M. Ding, P. N. Pathirana, A. Seneviratne, J. Li, D. Niyato, and H. V. Poor, “Federated Learning for Industrial Internet of Things in Future Industries,” IEEE Wireless Communications, pp. 1–8, 2021.
[12] D. C. Nguyen, M. Ding, P. N. Pathirana, A. Seneviratne, J. Li, and H. V. Poor, “Federated Learning for Internet of Things: A Comprehensive Survey,” IEEE Communications Surveys & Tutorials, pp. 1–1, 2021.
[13] J. Pang, J. Li, Z. Xie, Y. Huang, and Z. Cai, “Collaborative City Digital Twin For Covid-19 Pandemic: A Federated Learning Solution,” arXiv:2011.02883 [cs], Nov. 2020, arXiv: 2011.02883.
[14] D. C. Nguyen, P. N. Pathirana, M. Ding, and A. Seneviratne, “BEdgeHealth: A Decentralized Architecture for Edge-Based IoMT Networks Using Blockchain,” IEEE Internet of Things Journal, vol. 8, no. 14, pp. 11 743–11 757, Jul. 2021.
[15] S. Motamed, P. Rogalla, and F. Khalvati, “RANDGAN: Randomized Generative Adversarial Network for Detection of COVID-19 in Chest X-ray,” Oct. 2020, arXiv: 2010.06418.
[16] N. E. M. Khalifa et al., “Detection of Coronavirus (COVID-19) Associated Pneumonia based on Generative Adversarial Networks and a Fine-Tuned Deep Transfer Learning Model using Chest X-ray Dataset,” Apr. 2020, arXiv: 2004.01184.
[17] Y. Jiang, H. Chen, M. H. Loew, and H. Ko, “COVID-19 CT Image Synthesis with a Conditional Generative Adversarial Network,” IEEE Journal of Biomedical and Health Informatics, pp. 1–1, 2020.
[18] A. Waheed, M. Goyal, D. Gupta, A. Khanna, F. Al-Turjman, and P. R. Pinheiro, “CovidGAN: Data Augmentation Using Auxiliary Classifier GAN for Improved Covid-19 Detection,” IEEE Access, vol. 8, pp. 91916–91923, 2020.
[19] M. Loey, F. Smarandache, and N. E. M. Khalifa, “Within the Lack of Chest COVID-19 X-ray Dataset: A Novel Detection Model Based on GAN and Deep Transfer Learning,” Symmetry, vol. 12, no. 4, p. 651, Apr. 2020.
[20] H. Bao, X. Zhou, Y. Zhang, Y. Li, and Y. Xie, “COVID-GAN: Estimating Human Mobility Responses to COVID-19 Pandemic through Spatio-Temporal Conditional Generative Adversarial Networks,” in Proceedings of the 28th International Conference on Advances in Geographic Information Systems. Seattle, WA, USA: ACM, Nov. 2020, pp. 273–282.
[21] W. Zhang, T. Zhou, Q. Lu, X. Wang, C. Zhu, H. Sun, Z. Wang, S. K. Lo, and F.-Y. Wang, “Dynamic Fusion-based Federated Learning for COVID-19 Detection,” IEEE Internet of Things Journal, pp. 1–1, 2021.
[22] D. Yang et al., “Federated Semi-supervised Learning for COVID Region Segmentation in Chest CT using multi-national Data from China, Italy, Japan,” Medical Image Analysis, vol. 70, p. 101992, May 2021.
[23] Q. Dou et al., “Federated Deep Learning for Detecting COVID-19 Lung Abnormalities in CT: a Privacy-preserving Multinational Validation Study,” npj Digital Medicine, vol. 4, no. 1, p. 60, Dec. 2021.
[24] I. Feki, S. Ammar, Y. Kessentini, and K. Muhammad, “Federated Learning for COVID-19 Screening from Chest X-ray Images,” Applied Soft Computing, vol. 106, p. 107330, Jul. 2021.
[25] B. Liu, B. Yan, Y. Zhou, Y. Yang, and Y. Zhang, “Experiments of Federated Learning for COVID-19 Chest X-ray Images,” Jul. 2020, arXiv: 2007.05592.
[26] R. Kumar, A. A. Khan, J. Kumar, A. Zakria, N. A. Golilarz, S. Zhang, Y. Ting, C. Zheng, and W. Wang, “Blockchain-Federated-Learning and Deep Learning Models for COVID-19 detection using CT Imaging,” IEEE Sensors Journal, pp. 1–1, 2021.
[27] S. Otoum, I. Al Ridhawi, and H. T. Mouftah, “Preventing and Controlling Epidemics Through Blockchain-Assisted AI-Enabled Networks,” IEEE Network, vol. 35, no. 3, pp. 34–41, May 2021.
[28] J. Passerat-Palmbach, T. Farnan, M. McCoy, J. D. Harris, S. T. Manion, H. L. Flannery, and B. Gleim, “Blockchain-orchestrated machine learning for privacy preserving federated learning in electronic health data,” in 2020 IEEE International Conference on Blockchain (Blockchain). Rhodes Island, Greece: IEEE, Nov. 2020, pp. 550–555.
[29] Y. Qu, S. R. Pokhrel, S. Garg, L. Gao, and Y. Xiang, “A Blockchained Federated Learning Framework for Cognitive Computing in Industry 4.0 Networks,” IEEE Transactions on Industrial Informatics, vol. 17, no. 4, pp. 2964–2973, Apr. 2021.
[30] M. A. Rahman, M. S. Hossain, M. S. Islam, N. A. Alrajeh, and G. Muhammad, “Secure and Provenance Enhanced Internet of Health Things Framework: A Blockchain Managed Federated Learning Approach,” IEEE Access, vol. 8, pp. 205071–205087, 2020.
[31] L. Feng, Y. Zhao, S. Guo, X. Qiu, W. Li, and P. Yu, “Blockchain-based Asynchronous Federated Learning for Internet of Things,” IEEE Transactions on Computers, pp. 1–1, 2021.
[32] M. Shen, H. Wang, B. Zhang, L. Zhu, K. Xu, Q. Li, and X. Du, “Exploiting Unintended Property Leakage in Blockchain-Assisted Federated Learning for Intelligent Edge Computing,” IEEE Internet of Things Journal, vol. 8, no. 4, pp. 2265–2275, Feb. 2021.
[33] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative Adversarial Nets,” in Advances in Neural Information Processing Systems, vol. 27, 2014, pp. 2672–2680.
[34] D. C. Nguyen, M. Ding, P. N. Pathirana, and A. Seneviratne, “Blockchain and AI-Based Solutions to Combat Coronavirus (COVID-19)-Like Epidemics: A Survey,” IEEE Access, vol. 9, pp. 95730–95753, 2021.
[35] D. C. Nguyen, M. Ding, Q.-V. Pham, P. N. Pathirana, L. B. Le, A. Seneviratne, J. Li, D. Niyato, and H. V. Poor, “Federated Learning Meets Blockchain in Edge Computing: Opportunities and Challenges,” IEEE Internet of Things Journal, Apr. 2021.
[36] T. Zhou, X. Li, and H. Zhao, “DLattice: A Permission-Less Blockchain Based on DPoS-BA-DAG Consensus for Data Tokenization,” IEEE Access, vol. 7, pp. 39273–39287, 2019.
[37] I. D. Apostolopoulos and T. A. Mpesiana, “Covid-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks,” Physical and Engineering Sciences in Medicine, vol. 43, no. 2, pp. 635–640, Jun. 2020.
[38] J. Kang, Z. Xiong, D. Niyato, D. Ye, D. I. Kim, and J. Zhao, “Toward Secure Blockchain-Enabled Internet of Vehicles: Optimizing Consensus Management Using Reputation and Contract Theory,” IEEE Transactions on Vehicular Technology, vol. 68, no. 3, pp. 2906–2920, Mar. 2019.
[39] P. Afshar, S. Heidarian, F. Naderkhani, A. Oikonomou, K. N. Plataniotis, and A. Mohammadi, “COVID-CAPS: A Capsule Network-based Framework for Identification of COVID-19 cases from X-ray Images,” Pattern Recognition Letters, vol. 138, pp. 638–643, 2020.
[40] J. Long, E. Shelhamer, and T. Darrell, “Fully Convolutional Networks for Semantic Segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3431–3440.