Refiner: Data Refining against Gradient Leakage Attacks in Federated Learning

Mingyuan Fan
East China Normal University
fmy2660966@gmail.com Cen Chen
East China Normal University
cenchen@dase.ecnu.edu.cn Chengyu Wang
Alibaba Group
chengyu.wcy@alibaba-inc.com Xiaodan Li
Alibaba Group
fiona.lxd@alibaba-inc.com Wenmeng Zhou
Alibaba Group
wenmeng.zwm@alibaba-inc.com Jun Huang
Alibaba Group
huangjun.hj@alibaba-inc.com

Abstract

Recent works have brought attention to the vulnerability of Federated Learning (FL) systems to gradient leakage attacks. Such attacks exploit clients’ uploaded gradients to reconstruct their sensitive data, thereby compromising the privacy protection capability of FL. In response, various defense mechanisms have been proposed to mitigate this threat by manipulating the uploaded gradients. Unfortunately, empirical evaluations have demonstrated limited resilience of these defenses against sophisticated attacks, indicating an urgent need for more effective defenses. In this paper, we explore a novel defensive paradigm that departs from conventional gradient perturbation approaches and instead focuses on the construction of robust data. Intuitively, if robust data exhibits low semantic similarity with clients’ raw data, the gradients associated with robust data can effectively obfuscate attackers. To this end, we design Refiner that jointly optimizes two metrics for privacy protection and performance maintenance. The utility metric is designed to promote consistency between the gradients of key parameters associated with robust data and those derived from clients’ data, thus maintaining model performance. Furthermore, the privacy metric guides the generation of robust data towards enlarging the semantic gap with clients’ data. Theoretical analysis supports the effectiveness of Refiner, and empirical evaluations on multiple benchmark datasets demonstrate the superior defense effectiveness of Refiner at defending against state-of-the-art attacks.

1 Introduction

The past decade has witnessed a remarkable surge in the demand for extensive datasets [36, 38, 32], which serve as the driving force behind the recent astonishing breakthroughs achieved by deep neural networks (DNNs) across diverse domains [9, 11, 29]. However, the acquisition of large-scale datasets presents significant challenges. Entities and organizations are often reluctant to share their internal data due to concerns about potential leakage of sensitive information [23]. A notable example is the healthcare sector [42, 31], which is bound by privacy protection laws that strictly prohibit the sharing of patient-related information. Thus, training DNNs while safeguarding privacy has become a fundamental problem.

Refer to caption — Figure 1: A sketch map of FL and GLAs.

To address the abovementioned problem, Federated Learning (FL) has emerged as a highly promising privacy-preserving training framework [23]. In FL, as depicted in Figure 1, multiple clients participate by computing gradients locally [27], which are then contributed to a global server. The global server aggregates the uploaded gradients to update a globally-shared model, subsequently distributing the updated model back to the participating clients for the next training round. This collaborative process ensures that the privacy of individual client data is preserved, as the communication is limited to gradients or model parameters rather than the raw data itself. The prevailing belief in the practicability and effectiveness of FL has led to its widespread adoption across various privacy-sensitive scenarios. However, recent works on Gradient Leakage Attacks (GLAs) [54, 53, 12, 48] have challenged this belief by demonstrating that gradients alone are sufficient to reconstruct clients’ raw data. As shown in Figure 1, these attacks align the gradients of model parameters associated with dummy data with uploaded gradients and exploit the optimization process of the dummy data to closely resemble the original client data, thereby reconstructing sensitive information with a high level of fidelity. GLAs highlight the need for effective defenses in FL to ensure the confidentiality of sensitive client information.

Two primary types of methods have been proposed to safeguard privacy in FL: encryption-based methods [28, 8, 50, 10, 5] and perturbation-based methods [1, 54, 52, 39, 44]. Encryption-based methods ensure that the server only obtains encrypted versions of clients’ uploaded gradients, thereby enhancing privacy protection. However, the aggregated gradients still need to be decrypted into plaintext [10, 5], making them still vulnerable to GLAs. Moreover, encryption-based methods introduce considerable computational costs, data storage, and communication overheads which dominate the training process, thus significantly affecting the practicability of such methods [54, 39, 44, 50].

Perturbation-based methods typically impose subtle perturbation into ground-truth gradients, which helps to obscure the adversary’s (the server’s) ability to recover accurate data from the gradients. However, perturbation-based methods inherently involve a trade-off between utility (model performance) and privacy. Notable defenses include differential privacy [1] and gradient pruning [2], which perturb gradients by injecting random noises or discarding part of gradients. Recent research [46, 52] has revealed that gradient compression, specifically low-bit gradient quantization [4, 37, 34], can also contribute to defending against GLAs. State-of-the-art defenses [39, 44] estimate the privacy information contained in each gradient element for gradient pruning, theoretically achieving the optimal trade-off. Unfortunately, recent empirical evaluations [3, 49] have shown that these defenses struggle to achieve a satisfactory balance between privacy protection and model performance. In fact, these defenses often rely on oversimplified assumptions that do not align with the high complexity of DNNs (Section 7). As a result, it remains a challenge to craft appropriate perturbations for DNNs that effectively mislead the adversary while still providing informative gradients for the global model. Designing more effective defenses against GLAs remains to be explored.

In this paper, we explore a novel defensive paradigm that departs from conventional gradient perturbation approaches and instead focuses on the construction of robust data. Our inspiration stems from an intuitive yet compelling idea: if we can generate robust data that exhibit substantial significant semantic dissimilarity compared to clients’ raw data while maintaining utility, the gradients associated with these data are likely to have the potential to effectively obfuscate attackers while causing only minimal degradation in the model’s performance.

To realize this idea, we propose Refiner which crafts robust data by jointly optimizing two key metrics for privacy protection and performance maintenance. As illustrated in Figure 2, the privacy metric, i.e., the evaluation network, encourages robust data to deviate semantically from clients’ raw data, while the utility metric aligns the gradients of model parameters associated with clients’ data and robust data. In practice, to protect privacy, clients do not directly upload the gradients associated with their raw data. Instead, they upload gradients of robust data that are crafted through Refiner for their raw data, thereby maintaining the confidentiality of clients’ raw data.

•

Privacy metric. While norm-based distances between robust data and original data are commonly used to measure privacy [39], their effectiveness is limited due to the significant divergence with human-vision-system difference metrics [40]. To address this limitation, Refiner instead employs an evaluation network to accurately estimate the human perception distance between robust data and original data. The perception distance serves as the privacy metric, reflecting the overall amount of disclosed privacy information associated with clients’ data. By leveraging the human perception distance, Refiner provides a more accurate and effective evaluation of the privacy information contained in robust data.
•

Utility metric. The utility metric is defined as the weighted distance between the gradients of model parameters with respect to robust data and clients’ data. By minimizing the gradient distance, the utility of the robust data approaches that of the clients’ data. The weighting scheme prioritizes the preservation of gradients associated with important parameters, thus better maintaining the utility of the robust data. This weighting mechanism consists of element-wise weight and layer-wise weight. The element-wise weight is determined based on empirical and theoretical observations, considering the values of both gradients and parameters to identify critical parameters. Additionally, the layer-wise weight exploits the significance of earlier layers by allocating more attention to them during the learning process.

We also provide a solid theoretical foundation for these metrics and the effectiveness of Refiner and conduct extensive experiments on multiple benchmark datasets to show the superior performance of Refiner compared to baselines in defending against state-of-the-art attacks.

2 Background & Related Work

2.1 Federated Learning

Training DNNs while maintaining client privacy has become a crucial necessity due to data protection regulations¹¹1https://www.dlapiperdataprotection.com/, such as General Data Protection Regulation of the European Union. FL serves as a promising privacy-preserving solution through a gradient exchange mechanism between clients and a server [23]. We focus on a typical FL scenario, where a server collaborates with a set of $m$ clients to train a model $F_{\theta}(\cdot)$ parameterized by $\theta$ , utilizing loss function $\mathcal{L(\cdot,\cdot)}$ . The $i$ -th client’s local dataset is denoted by $D_{i}$ . The server and clients interact with each other for a total of $N$ rounds by alternately executing the two steps:

•

Model Distribution and Local Training. The server samples a subset of $K$ clients from the entire client pool to participate in the current round while the others await their turn. The server broadcasts the global model $F_{\theta}(\cdot)$ to selected clients. The $k$ -th selected client ( $k=1,\cdots,K$ ) samples raw data $x,y$ from his local dataset and computes gradients $g_{k}=\nabla_{\theta}\mathcal{L}(F_{\theta}(x),y)$ .
•

Global Aggregation. The server performs global aggregation by averaging the gradients received from selected clients, which are subsequently used to update the global model: $\theta=\theta-\frac{\eta}{K}\sum_{i=1}^{K}g_{i}$ where $\eta$ is the learning rate.

In the literature [27, 34], selected clients may update the local model over multiple steps and then upload the final model parameters. In this paper, we employ a default local step of 1, where clients directly upload the gradients without performing multiple local updates. The rationale behind this decision stems from the fact that the server typically derives gradients by comparing the global model parameters with the uploaded local model parameters. Multiple local steps could potentially lead to a discrepancy between the derived gradients and the ground-truth gradients associated with clients’ data, thereby reducing the effectiveness of attacks [52]. Thus, we adopt the setting that considers the most powerful attack scenario to better examine the effectiveness of defenses.

2.2 Threat Model

We assume that the server is honest-but-curious [44], which adheres to the pre-defined training rules but with a strong intention of recovering clients’ raw data by exploiting uploaded gradients. We allow the server to access public data sources and sufficient computational resources, thereby enabling the server to launch stronger attacks to evaluate our defense more thoroughly. We suppose that clients are aware of the privacy risks, and, as a remedy, clients proactively adopt defenses. To simplify the notations, subscripts, and superscripts representing specific clients are omitted throughout the paper.

2.3 Gradient Leakage Attack

The seminal work of GLAs was proposed by Zhu et al. [54]. Given randomly initialized dummy data $x^{\prime}$ with dummy label $y^{\prime}$ and uploaded gradients $g$ , the server tackles a gradient matching problem as follows:

\begin{split}x^{\prime},y^{\prime}={\arg\min}_{x^{\prime},y^{\prime}}||\nabla_{\theta}\mathcal{L}(F_{\theta}(x^{\prime}),y^{\prime})-g||_{2}.\end{split}

(1)

Empirical evaluations illustrated the remarkable similarity between resulting $x^{\prime}$ and clients’ data $x$ . The Follow-up work [43, 53] showed how to infer the ground-truth label $y$ from the gradients of the fully-connected layer, reducing the complexity associated with solving Equation 1. However, several studies [46, 12, 48] empirically demonstrated that bad initialization causes convergence towards local optima, making recovered data noisy. In response, Geiping et al. [12] and Jeon et al. [20] dedicated efforts to explore various regularization techniques and heuristic tricks, including total variation, restart, and improved initialization, aiming to eliminate invalid reconstructions. Recently, since the output of generative models is not noisy, Li et al. [25] and Yue et al. [49] trained a generative network to recover clients’ data. The potential assumption is the availability of an abundance of data that is similar to the clients’ data. They optimized latent vectors of the generative network to produce images whose gradients are similar to uploaded ones.

2.4 Gradient Leakage Defense

While cryptographic methods have emerged as potential solutions in defending GLAs, the substantial overhead limits their applicability [54, 44, 25]. Similar to [54, 44, 25], this paper does not discuss cryptographic methods, but instead mainly investigates alternative perturbation-based methods [54, 3, 47]. One such perturbation-based method [54, 3, 47] is differential privacy (DP), which involves adding random noises to ground-truth gradients to generate perturbed gradients. Moreover, some work [39, 44] explored gradient pruning or gradient quantization (GQ) to obfuscate the server. Soteria, proposed by Sun et al. [39], employs a theoretically-derived metric to evaluate the privacy information contained in each gradient element and selectively prunes gradients in fully-connected layers. However, Balunović et al. [3], Wang et al. [44] discovered that Soteria could be compromised by muting gradients in fully-connected layers during the reconstruction process (Equation 1). To address this vulnerability, they developed a gradient mixing mechanism to solve the problem.

3 Our Approach: Refiner

3.1 Problem Formulation

We here formulate the problem that Refiner solves. The primary objective is to construct robust data $x^{*}$ that possess sufficient utility while minimizing the disclosure of privacy information from clients’ raw data $x$ . Then, Refiner upload gradients associated with $x^{*}$ . Refiner formulates the following optimization problem to craft robust data:

x^{*}=\mathop{\arg\min}_{x^{*}}UM(x^{*},x)-\beta\cdot PM(x^{*},x),

(2)

where $UM$ and $PM$ denote the utility and privacy metrics. $UM$ evaluates the degree to which $x^{*}$ diverges from $x$ in terms of utility, while $PM$ quantifies the reduction in privacy associated with $x^{*}$ compared to $x$ . $\beta$ serves as a balance factor to regulate the trade-off between utility and privacy. Increasing $\beta$ places a higher priority on privacy protection. Intuitively, with well-defined metrics, the optimal solution of Equation 2 aligns with our definition of robust data. Section 3.2 and Section 3.3 will elaborate on our $UM$ and $PM$ , respectively.

3.2 Utility Metric

3.2.1 Motivation

The utility of data is defined as the extent to which it contributes to reducing the model loss. DNNs utilize gradients to update model parameters to minimize loss, making it straightforward to exploit the mean square error (MSE) between gradients associated with $x$ and $x^{*}$ as the utility metric, i.e., $UM(x^{*},x)=||\nabla_{\theta}\mathcal{L}(F_{\theta}(x^{*}),y)-\nabla_{\theta}\mathcal{L}(F_{\theta}(x),y)||_{2}^{2}$ . However, adopting the vanilla form of MSE treats all gradient elements equally. Consequently, minimizing the distance between gradient elements at different positions yields identical improvements in utility, disregarding the varying contributions of different parameters on model performance.

We customize a weighted MSE to more effectively evaluate the utility contained in $x^{*}$ . The overall idea of weighted MSE is to endow the gradients of important parameters with higher weights, so as to pay more attention to aligning the gradients of important parameters associated with $x$ and $x^{*}$ . To determine the weights, we consider two factors: element-wise weight along with layer-wise weight. The product of element-wise weight and layer-wise weight is used as the ultimate weight for the corresponding gradient element.

3.2.2 Element-wise Weight

Insight 1

The significance of parameters is closely tied to both their values and gradients. Intuitively, parameters with high magnitude values are crucial since they can considerably amplify their inputs. Higher outputs are more likely to exert a more substantial influence on the model’s predictions. Besides, the essence of gradient is quantifying the impact of a slight modification in parameters on the loss [6]. Hence, parameters that own gradients with larger magnitudes yield a greater influence in shaping the model performance.

Based on Insight 1, the element-wise weight can be defined as the absolute value resulting from multiplying the parameter’s value with corresponding gradients. The values of parameters reflect their significance to upstream layers, and the gradients of parameters suggest their importance to downstream layers. Consequently, the element-wise weight serves as an effective metric to evaluate the importance of parameters.

Theoretical justification. We offer mathematical insights into the effectiveness of element-wise weight through $Q$ -function. Originating from statistics [41], $Q$ -function serves as a tool for identifying estimators with smaller variances at the expense of accuracy. The core principle involves linearly scaling the estimation parameters. Consider $Q(u)=\mathcal{L}(F_{u\theta}(x),y)$ where $u$ is a linear scaling factor. By differentiating $Q(u)$ , there is $Q^{\prime}(u)=\nabla_{\theta}\mathcal{L}(F_{u\theta}(x),y)^{T}\theta$ . By setting $u=1$ , $Q^{\prime}(1)$ is simplifies to $\nabla_{\theta}\mathcal{L}(F_{\theta}(x),y)^{T}\theta$ . It is well-shared that the model achieves its optimality when $\nabla_{\theta}\mathcal{L}(F_{\theta}(x),y)=0$ . Therefore, the value of $Q^{\prime}(1)$ serves as an auxiliary criterion to check the model’s optimality. Furthermore, we can enforce $Q^{\prime}(1)=0$ by minimizing $|Q^{\prime}(1)|=|\sum_{i}\nabla_{\theta}\mathcal{L}(F_{\theta}(x),y)_{i}\theta_{i}|\leq\sum_{i}|\nabla_{\theta}\mathcal{L}(F_{\theta}(x),y)_{i}\theta_{i}|$ , where the subscript $i$ indicates the $i$ -th element of the vector of interest. If only a subset of parameters is allowed to be optimized, focusing on optimizing the parameters with the maximum absolute weight and gradient product, i.e., the parameters that our element-wise weight identifies, can minimize $Q(1)$ to the fullest extent.

Table 1: The model performance with three different gradient pruning strategies.

Pruning Rate	0.2	0.4	0.6	0.8
Weight	51.33	48.71	46.83	42.76
Grad	51.68	50.89	49.08	47.47
Ours	52.85	51.52	50.91	49.58

Empirical validation. We conduct a quick examination of the effectiveness of element-wise weight. Following Zhu et al. [54], we use LeNet and CIFAR10 dataset, together with hyperparameters including an epoch of 20, a learning rate of 0.01, and a batch size of 512. We employed three distinct gradient pruning strategies: (1) pruning the gradients of parameters with the minimum absolute gradient (Grad), (2) pruning the gradients of parameters with the minimum absolute weight (Weight), and (3) pruning the gradients of parameters based on the minimum absolute dot product between weight and gradient (Ours). Table 1 reports the accuracy of the models with different pruning ratios and strategies on CIFAR10 test set. As can be seen, element-wise weight achieves the best performance, as it can better identify important parameters compared to the other two pruning strategies.

3.2.3 Layer-wise Weight

Insight 2

A neural network consists of multiple layers and each layer receives input from its adjacent upstream layer (except the input layer) and delivers its output to the adjacent downstream layer (except the output layer). The earlier layers are probably more significant than the latter layers. On the one hand, any interference with the learning process of the earlier layers has the potential to propagate and enlarge errors along the forward propagation path, resulting in substantial amplification of undesired effects. On the other hand, a common belief is that the earlier layers concentrate on identifying foundational features shared across various samples, and therefore disrupting the learning process of earlier layers would inevitably lead to significant performance degradation.

Table 2: The model performance when perturbing different layers.

Noise Magnitude	Layer 1	Layer 2	Layer 3	Linear Layer
0.001	52.09	52.49	52.88	53.39
0.01	51.42	52.81	52.89	53.10
0.1	50.63	50.94	50.97	51.77

Drawing upon Insight 2, the parameters in earlier layers commonly are more important than those in late layers. The empirical validation, with a similar setup used in Table 1, can be found in Table 2. We add uniform random noises into the gradients of different layers individually. Notably, perturbing the gradients of early layers has a more detrimental impact on the model’s performance. Therefore, we define layer-wise weight that allocates more attention to the parameters in earlier layers. In this way, the gradients associated with $x^{*}$ and $x$ of parameters in the earlier layers can be better aligned to maintain the utility of $x^{*}$ .

The specific form of layer-wise weight draws inspiration from the multiplicative structure of DNNs. DNNs typically consist of stacked layers, each consisting of a linear function followed by a non-linear activation function $\sigma(\cdot)$ [33]. In mathematical terms, we represent a DNN as $\sigma\cdots\sigma(w_{2}\sigma(w_{1}x))$ , with the bias term absorbed into the weight term for simplicity. This multiplicative structure implies that the perturbation of weight parameters within a single layer can exert an exponential effect on the final model output, specifically when considering piece-wise linear activation functions like ReLU. Hence, our layer-wise weight employs an exponential decay mechanism to assign weights.

Suppose that $F_{\theta}(\cdot)$ comprise a total of $K$ layers, with the $i$ -th layer parameterized with $\theta[i]$ ( $\theta=\{\theta[1],\theta[2],\cdots,\theta[K]\}$ ). The layer-wise weight of gradient elements in $i$ -th layer $\theta[i]$ is defined as $power(\tau,i)$ , where $\tau$ is decay factor ( $0\leq\tau\leq 1$ ) and $power(\cdot,\cdot)$ is power function.

3.2.4 Ultimate Weight

Here we define the ultimate weight for $\theta[i]$ , which is the product between element-wise weight and layer-wise weight:

\begin{split}weight(\theta[i])=|grad(\theta[i])\cdot value(\theta[i])\cdot power(\tau,i)|,\end{split}

(3)

where $grad(\theta[i])$ and $value(\theta[i])$ denotes taking out the gradients associated with $\theta[i]$ and values of $\theta[i]$ . $UM$ now can be expressed as follows:

\begin{split}UM(x^{*},x)&=||\sum_{i=1}^{K}weight(\theta[i])\\ &(\nabla_{\theta[i]}\mathcal{L}(F_{\theta}(x^{*}),y)-grad(\theta[i]))||_{2}^{2}.\end{split}

(4)

3.3 Privacy Metric

3.3.1 Motivation

PM aims to quantify the level of privacy leakage regarding $x$ caused by $x^{*}$ , or in other words, it measures the dissimilarity between $x$ and $x^{*}$ in terms of human perception. However, common distance metrics like MSE fail to serve as an effective PM. These metrics are designed to only ensure high similarity between two inputs when their values are low. Unfortunately, the opposite conclusion does not always hold true. Consider Figure 3, where seemingly indistinguishable images to the human eye are deemed significantly different by MSE. Consequently, employing MSE may cause pseudo-privacy protections, highlighting the need for a better privacy metric.

3.3.2 Metric Design

Inspired by the above, we begin by summarizing the characteristic that an ideal PM should possess, named monotonicity principle: with the increasing value of $PM(x^{*},x)$ , the privacy information involved in $x^{*}$ regarding $x$ should decrease monotonously. One intuitive idea for designing such a PM is to generate a reference sample for $x$ , which does not contain any private information associated with $x$ . This reference sample can then be exploited as a benchmark for evaluating privacy information contained in $x^{*}$ . The closer $x$ is to the reference sample, the less private information from $x$ will be embedded in $x^{*}$ . However, a key challenge arises: how to obtain a suitable reference sample?

At first glance, one might consider utilizing specific random noises²²2Random noises are considered as non-information samples [35]. as the reference sample. Though technically feasible, it produces a bottleneck for the performance of Refiner, because the search space is limited to the underlying route from $x$ to the specific reference sample. In other words, Refiner is unable to navigate other noise regions that may contain more informative samples. To address the problem, we define the distance of $x^{*}$ to the uniform distribution as our PM, i.e., the uniform distribution is our reference sample.

Let $x^{*}$ be distributed according to Dirac delta distribution $p(x)$ that concentrates mass at $x^{*}$ ³³3In mathematical terms, Dirac delta distribution [41, 15] satisfies the integral property $\int p(x)\delta(x-x^{*})dx=1$ . $\delta$ function is zero everywhere except at the point $x^{*}$ , where it becomes infinite.. To measure the distance between $x^{*}$ and uniform noise distribution $q(x)$ , we employ the JS divergence: $\frac{1}{2}\int p(x)log\frac{2p(x)}{p(x)+q(x)}+q(x)log\frac{2q(x)}{p(x)+q(x)}dx$ . Unfortunately, directly evaluating JS divergence becomes computationally infeasible due to its high-dimensional nature. To this end, we leverage conjugate functions to reformulate JS divergence as follows (see Appendix A for details):

\underset{D,D(x)\in[0,1]}{\max}\mathbb{E}_{x\sim p(x)}[log(1-D(x))]+\mathbb{E}_{x\sim q(x)}[logD(x)].

(5)

Practical Implementation. Equation 5 indicates the need to construct a new $D$ for each $x$ to estimate the JS divergence. To reduce costs, we search for a single $D$ for all images. The $D$ in this paper is a neural network dubbed the evaluation network, as DNNs are capable of approximating any function [17, 18]. Intuitively, the underlying goal of Equation 5 is to encourage $D$ to output 1 for random noises and 0 for natural samples. Essentially, the evaluation network indeed quantifies the proportion of noise within $x^{*}$ . This inspires us to create training data-label pairs for the network in an interpolate manner, i.e., $((1-r)\cdot t_{1}+r\cdot t_{2},r),r\sim U(0,1),t_{1}\sim p,t_{2}\sim q$ , and these pairs are supplied into the network as supervised signals for training. Doing so enjoys that the monotonicity principle can be explicitly infused into the evaluation network.

We use TinyImageNet dataset for training the evaluation network and more details can be found in Appendix D. Table 4 reports the predictions of the trained evaluation network for mixtures of random noises and natural images. An illustrative example is provided in Figure 5. Remarkably, the trained evaluation network performs well over different datasets like CIFAR-10, CIFAR-100, and SVHN, demonstrating its good generality. Moreover, it can be observed that when the noise ratio reaches approximately 0.6, human eyes are no longer able to identify privacy information. A byproduct of the evaluation network is its potential application as a privacy metric. In practical scenarios, clients can obtain the trained evaluation network from a trusted third party at no cost or train their own evaluation network using publicly available data. In this paper, all clients share this trained evaluation network by default.

3.4 Solving Optimization Task (Equation 2)

Upon defining the concrete forms of $UM$ and $PM$ , the remaining challenge is figuring out how to solve Equation 2. A straightforward solution is to harness gradient descent algorithm [6], which stands as one of the most commonly employed optimization algorithms. While gradient descent algorithm is capable of tackling optimization problems featuring solely convex terms like UM, it encounters obstacles when confronted with highly non-linear and non-convex neural networks like PM. Specifically, we initialize $x^{*}$ as $x$ , subsequently aiming to maximize PM( $x$ , $x^{*}$ ), i.e., making $x^{*}$ noisy. As shown in Figure 6(b), the resulting $x^{*}$ deviate only slightly from $x$ (Figure 6(a)), yet the model confidently identifies it as a noisy sample. This phenomenon and $x^{*}$ are known as adversarial vulnerability and adversarial examples [13, 7].

To address the above problem, we propose noise-blended initialization, which by blending $x^{*}$ with random noises, i.e., $x^{*}=(1-\alpha)x+\alpha v,\ v\sim q(x),\alpha\in[0,1]$ . $\alpha$ is a blend factor. A big value of $\alpha$ enables $x^{*}$ to be initialized around the area flooding with noises. Subsequently, during the optimization process, the evaluation network will maintain $x^{*}$ as a noisy sample. Figure 6(c) exemplify the noise-blended initialization with $\alpha=0.5$ . The resultant $x^{*}$ does not contain any semantic information regarding $x$ , showing the effectiveness of the noise-blended initialization.

4 Theoretical Analysis

This section conducts a theoretical analysis of Refiner from three fundamental dimensions, including performance maintenance, privacy protection, and time cost.

4.1 Performance Maintenance

Assumption 1

$\mathcal{L}(F_{\theta}(x),y)$ is $L$ -smooth: $\forall\theta_{1},\theta_{2},$ it holds that $\mathcal{L}(F_{\theta_{1}}(x),y)\leq\mathcal{L}(F_{\theta_{2}}(x),y)+\nabla_{\theta_{2}}\mathcal{L}(F_{\theta_{2}}(x),y)^{T}(\theta_{1}-\theta_{2})+\frac{L}{2}||\theta_{1}-\theta_{2}||_{2}^{2}$ .

Assumption 2

$\mathcal{L}(F_{\theta}(x),y)$ is $u$ -strongly convex: $\forall\theta_{1},\theta_{2},$ there is $\mathcal{L}(F_{\theta_{1}}(x),y)\geq\mathcal{L}(F_{\theta_{2}}(x),y)+\nabla_{\theta_{2}}\mathcal{L}(F_{\theta_{2}}(x),y)^{T}(\theta_{1}-\theta_{2})+\frac{u}{2}||\theta_{1}-\theta_{2}||_{2}^{2}$ .

Assumption 3

Let $x_{i},y_{i}$ be uniformly random samples drawn from the local dataset $D_{i}$ of the $i$ -th client. The variance of stochastic gradients is bounded: $\mathbb{E}||\nabla_{\theta}\mathcal{L}(F_{\theta}(x_{i}),y_{i})-g_{i}^{full}||_{2}^{2}\leq\sigma_{i}^{2},i=1,2,\cdots,K.$ Wherein, $g_{i}^{full}$ represents the gradients of model parameters $\nabla_{\theta}\mathcal{L}(F_{\theta}(\cdot),\cdot)$ computed across the complete local dataset $D_{i}$ .

Assumption 4

There exists a real number $G$ and the expected squared norm of stochastic gradients is uniformly bounded by $G$ : $\mathbb{E}||\nabla_{\theta}\mathcal{L}(F_{\theta}(x_{i}),y_{i})||_{2}^{2}\leq G^{2},i=1,2,\cdots,K$ .

Theorem 1

Let Assumption 1, 2, 3, and 4 hold. Let $x^{*}$ be the optimal solution of Equation 2. Suppose the gradient distance associated with $x$ and $x^{*}$ is bounded: $||\nabla_{\theta}\mathcal{L}(F_{\theta}(x),y)-\nabla_{\theta}\mathcal{L}(F_{\theta}(x^{*}),y)||_{2}^{2}\leq\epsilon^{2}$ . Let $\theta^{*}$ represent the global optimal solution and $\theta_{i}^{*}$ ( $i=1,2,\cdots,K$ ) denote the local optimal solution for $i$ -th client. Choose $\kappa=\frac{L}{u},\gamma=\max\{8\kappa,1\},\eta=\frac{2}{u(\gamma+t)}$ . Let $\Gamma=\mathbb{E}[\mathcal{L}(F_{\theta^{*}}(\cdot),\cdot)]-\frac{1}{K}\sum_{i=1}^{K}\mathbb{E}[\mathcal{L}(F_{\theta_{i}^{*}}(\cdot),\cdot)]$ . There is:

\begin{split}&\mathbb{E}[\mathcal{L}(F_{\theta}(\cdot),\cdot)]-\mathcal{L}(F_{\theta^{*}}(\cdot),\cdot)\\ &\leq\frac{2\kappa}{\gamma+N}(\frac{Q+C}{u}+\frac{u\gamma}{2}\mathbb{E}||\theta_{initial}-\theta^{*}||_{2}^{2}),\end{split}

where $Q=\frac{1}{K^{2}}\sum_{i=1}^{K}(\epsilon^{2}+\sigma_{k}^{2})+6L\Gamma,\ C=\frac{4}{K}(\epsilon^{2}+G^{2})$ , and $\theta_{initial}$ is initial parameters of the globally-shared model.

This subsection examines how Refiner contributes to the convergence of the global model. By establishing Assumption 1, 2, 3, and 4 and demonstrating Theorem 1 (proof available in the Appendix B), we shed light on the subsequent discussion. Theorem 1 provides an upper bound on the gap between the optimal solution and the model trained with Refiner after sufficient iterations. As the gradient distance between robust and client data decreases, this upper bound also decreases, implying better performance of the global model. This highlights the effectiveness behind minimizing the gradient distance between robust data and clients’ data in maintaining performance. The inclusion of the gradient distance term ( $UM$ ) in Equation 2 guarantees the boundedness of $||\nabla_{\theta}\mathcal{L}(F_{\theta}(x),y)-\nabla_{\theta}\mathcal{L}(F_{\theta}(x^{*}),y)||$ . In practice, we modify the gradients of robust data slightly. If the difference between the gradients of robust data and clients’ data exceeds $\epsilon$ , we search for the closest gradients within the $\epsilon$ -constraint and use them as the final uploaded gradients (see Appendix C for more details). Doing so enables us to evaluate the utility-privacy trade-off of Refiner by varying values of $\epsilon$ ⁴⁴4Adjusting the value of $\beta$ is also an alternative way, but we find that tuning $\epsilon$ is more convenient..

It is worth noting that FL commonly involves two scenarios: when the client data distributions are identical (IID) and non-identical (Non-IID). Our theoretical analysis does not require uniformity in the client data distributions. Thus, Theorem 1 is applicable to both scenarios. Moreover, these assumptions are consistent with [24] and aim to ensure a unique global extreme point, or that any local optimal point is also a global optimal point. For neural networks with multiple extreme points, Theorem 1 can be understood as guaranteeing convergence to at least one local extreme point.

4.2 Privacy Protection

There exist various types of GLAs, each with unique features. Analyzing the effectiveness of Refiner against each specific attack instance is cumbersome. Therefore, we begin by distilling GLAs into a generalized form by identifying patterns shared across attack instances. Then we analyze the effectiveness of Refiner against the generalized attack. In this way, our privacy analysis can be extended to encompass a wide spectrum of attack variations and potential evolving attacks.

GLAs exploit the mapping relationship between data and gradients to reconstruct high-fidelity data. Specifically, we represent the mapping relationship from data to gradients by $f(x)=g$ . The objective of GLAs is to construct an inverse map $\hat{f}^{-1}(\cdot)$ from gradients to data that yields $\hat{f}^{-1}(g)\approx x$ , thus enabling the recovery of clients’ data. Different GLAs can be derived by specifying the concrete form of $\hat{f}^{-1}(\cdot)$ ⁵⁵5In fact, $f^{-1}(\cdot)$ and $\hat{f}^{-1}(\cdot)$ may not be one-to-one. But, after fixing random factors like random seeds, learning rate, etc., they have a unique solution and can be considered one-to-one. The term ”specified” here includes the meaning of fixing these factors.. Example 1, 2, and 3 are provided to demonstrate this.

Example 1

(Deep Leakage from Gradients (DLG) [54]). DLG’s inverse mapping is defined as the optimal solution to a gradient matching problem, which minimizes the $L_{2}$ -norm distance between the uploaded gradients and reconstructed data gradients: $\hat{x}=\hat{f}^{-1}(g)=\mathop{\arg\min}_{\hat{x}}||\nabla_{\theta}F_{\theta}(\hat{x})-g||_{2}^{2}$ .

Example 2

(Inverting Gradients [12]). Similar to DLG, Geiping et al. [12] also solves a gradient matching problem for data reconstruction. But $L_{2}$ -norm distance is replaced by cosine similarity and a total variation term used to regularize reconstructed data is introduced: $\hat{x}=\hat{f}^{-1}(g)=\mathop{\arg\min}_{\hat{x}}1-\frac{<\nabla_{\theta}F_{\theta}(\hat{x}),g>}{||\nabla_{\theta}F_{\theta}(\hat{x}||_{2}||g||_{2}}+TV(x)$ .

Example 3

(Generative Gradient Leakage (GGA) [25]). GGA optimizes a latent input vector of a generative model to search data whose gradients align well with uploaded gradients. Moreover, GGA adds a KL divergence term to avoid too much deviation between the latent vector and the generator’s latent distribution: $\hat{x}=\hat{f}^{-1}(g)=G(z)$ , where $z=\mathop{\arg\min}_{x}||\nabla_{\theta}F_{\theta}(G(z))-g||_{2}^{2}+KL(z||q)$ . $q$ is Gaussian distribution with a mean of 0 and a standard deviation of 1.

In practice, these attacks demonstrate superiority in recovering high-quality data. Therefore, their $\hat{f}^{-1}(\cdot)$ is quite similar to ground-truth one $f^{-1}(\cdot)$ . Inspired by this, we make $||\hat{f}^{-1}(\cdot)-f^{-1}(\cdot)||\leq\sigma$ , where $\sigma$ is a small positive number.

Refiner uploads the gradients associated with $x^{*}$ , which is denoted by $g^{*}=f(x^{*})$ . The server uses $g^{*}$ to recover data, i.e. $\hat{f}^{-1}(g^{*})$ . We now consider the distance of $x$ and $\hat{f}^{-1}(g^{*})$ :

\begin{split}&||\hat{f}^{-1}(g^{*})-x||=||\hat{f}^{-1}(g^{*})-f^{-1}(g)||\\ &=||\hat{f}^{-1}(g^{*})-f^{-1}(g^{*})+f^{-1}(g^{*})-f^{-1}(g)||\\ &\geq|\ ||f^{-1}(g^{*})-f^{-1}(g)||-||\hat{f}^{-1}(g^{*})-f^{-1}(g^{*})||\ |\\ &=|\ ||x^{*}-x||-||\hat{f}^{-1}(g^{*})-f^{-1}(g^{*})||\ |.\end{split}

(6)

Equation 6 establishes a lower bound of $||\hat{f}^{-1}(g^{*})-x||$ , which is determined by $x^{*}$ , $x$ , and $\sigma$ . In practice, $x^{*}$ obtained by Equation 2 are often noisy and significantly differ from $x$ , i.e., $||x^{*}-x||>\sigma$ (see Appendix F for the empirical validation). When $\sigma$ is small, the lower bound can be simplified as follows:

||\hat{f}^{-1}(g^{*})-x||\geq||x^{*}-x||-\sigma.

(7)

Equation 7 suggests that a smaller $\sigma$ results in higher privacy lower bound, i.e., stronger privacy protection capabilities. On the other hand, considering larger values of $\sigma$ may be a little trivial. In detail, a larger $\sigma$ leads to a significant discrepancy between the recovered data and clients’ data. Consequently, clients can directly upload ground-truth gradients since the attacker is unable to reconstruct high-fidelity data. In summary, the above analysis demonstrates the effectiveness and resilience of Refiner against GLAs, particularly those characterized by smaller $\sigma$ values.

4.3 Time Complexity

Table 3: Time complexity comparison between different defenses.

\mathcal{N},\mathcal{N}_{last},S_{1},S_{2},I

denote the total number of model parameters, the total number of model parameters in the last fully connected layer, the time required for once forward-backward propagation of the globally-shared model and the evaluation network, the iteration number for solving Equation 2, respectively.

DP	GQ	Pruning	Soetria	Ours
$\mathcal{O}(\mathcal{N})$	$\mathcal{O}(\mathcal{N})$	$\mathcal{O}(\mathcal{N}\log(\mathcal{N}))$	$\mathcal{O}(\mathcal{N}_{last}\cdot S_{1})$	$\mathcal{O}(\iota\cdot(2\cdot S_{1}+S_{2}))$

Before delving into the analysis of time complexity, it is essential to acknowledge that achieving excellence in all aspects of utility, privacy, and time is almost impossible. Because striking a better utility-privacy trade-off often necessitates additional time investment for searching. For instance, DP equally perturbs all gradient elements, without considering the inherent privacy information contained in each gradient element. In contrast, Soteria addresses this limitation by estimating the privacy information of each gradient element and selectively pruning those with the highest privacy information, resulting in an improved utility-privacy trade-off. However, this estimation process incurs additional time costs.

DP, GQ, and gradient pruning, none of which necessitate extra forward-backward propagations, are discussed together. DP generates noises for each gradient element, GQ quantizes each gradient element, and gradient pruning sorts and then removes part of gradient elements. By assuming the model of interest owns a total of $\mathcal{N}$ parameters, the time complexity of both DP and GQ scales linearly with the number of parameters, $\mathcal{O}(\mathcal{N})$ . For gradient pruning, the dominant factor is the sorting time complexity, i.e., $\mathcal{O}(\mathcal{N}\log(\mathcal{N}))$ .

Soteria focuses on pruning gradients in the final fully connected layer. This entails a meticulous evaluation of the privacy information associated with each individual gradient element residing in the layer. Given that the total count of parameters contained within the layer to be denoted as $\mathcal{N}_{last}$ , and the time complexity of once forward-backward propagation of the global model is $\mathcal{O}(S_{1})$ . Soteria necessitates executing forward-backward propagation a specific number of times, corresponding to the number of neurons in the layer. Consequently, the overall time complexity of Soteria can be expressed as $\mathcal{O}(\mathcal{N}_{last}\cdot S_{1})$ . The number of elements in a typical fully connected layer is usually $1024$ .

The time complexity of Refiner primarily centers around solving Equation 2. Each iteration of Equation 2 requires twice forward-backward propagations of the global model, along with one forward-backward propagation of the evaluation network. Hence, the overall time complexity is the sum of these operations. Assuming the time complexity of running single forward-backward propagation for the evaluation network is $\mathcal{O}(S_{2})$ and the total number of iterations for solving Equation 2 is $\iota$ , the time complexity of Refiner is $\mathcal{O}(\iota\cdot(2\cdot S_{1}+S_{2}))$ . Table 3 summarizes the time complexities associated with these defenses. Section 6.2 conducts an empirical comparison of the time costs for these defenses.

5 Evaluation Setup

Table 4: The attack hyperparameters follow their original papers. Total variance (TV) quantifies the smoothness of inputs and can be used to enhance the quality of recovered images.

L_{2}

-norm controls the pixel value within a legal range

[0\sim 1]

. BN statistics force the statistics of recovered images to close the statistics of datasets. KL distance restricts the latent variables to follow Gaussian distribution.

Attack

iGLA

InvertingGrad

GradInvertion

GGA

Loss Function

Euclidean

Cosine Similarity

Euclidean

Regularization

None

TV+

L_{2}

-norm

+BN statistics

GAN+KL

Distance

Optimizer

L-BFGS

Adam

Learning Rate

0.01

Label Inference

\checkmark

\checkmark

\checkmark

\checkmark

Attack Iteration

300

4000

5.1 Attack

Four state-of-the-art attacks are considered to examine the performance of Refiner, including iGLA [53], GradInversion [48], InvertingGrad [12], GGA [25, 49]. These attacks cover the primary types of GLAs currently explored. The first three attacks address variants of Equation 1 by employing varying optimizers, loss functions, and so on. GGA, on the other hand, focuses on optimizing the latent vectors of a GAN to align the gradients. We implement these attacks at the start of training, as this stage is most vulnerable to GLAs [3]. For GGA, we use public codes ⁶⁶6https://raw.githubusercontent.com/pytorch/examples/master/dcgan/main.py to train a GAN. Table 4 provides an overview of the attack distinctions and the hyperparameters employed.

5.2 Competitor

We compare Refiner to the following defenses: DP [1], GQ [34, 49], gradient pruning [2, 54], and Soteria [44]⁷⁷7We use Soteria proposed in [44] to compare, which is an improved version compared with the original Soteria [39].. These defenses involve utility-privacy trade-offs that are regulated by hyperparameters such as noise magnitude for DP, discretization level for GQ, and pruning rate for pruning and Soteria. Stronger privacy protection can be achieved by increasing these hyperparameters, but this comes at the cost of performance degradation. We vary these hyperparameters to obtain the utility-privacy trade-off curves for each defense. Specifically, for DP, we employ two commonly used kinds of noises, namely Gaussian (DP-Gaussian) and Laplace (DP-Laplace), with magnitude ranging from $10^{-6}$ to $10^{2}$ and $C$ of 1. As for GQ, we consider discretization levels, including 1, 2, 4, 8, 12, 16, 20, 24, and 28 bits. For pruning and Soteria, we set pruning ratio in $\{0.1,0.2,\cdots,0.9\}$ . Besides, we alter $\epsilon$ of Refiner over $\{0.01,0.03,0.05,0.07,0.1,0.3,0.5,0.7,0.9\}$ (see Section 4.1). Unless specified otherwise, for Refiner, the default values of $\alpha$ , $\beta$ , refinement iterations $\iota$ and $\tau$ are set to 0.5, 1, 10, and 0.95.

5.3 Evaluation Metric

We assess defenses from three fundamental dimensions: performance maintenance, privacy protection, and time cost.

Performance maintenance. We define the performance maintenance metric (PMM), which calculates the accuracy ratio achieved with defenses compared to the original accuracy without any defenses:

PMM=\frac{defense\_acc}{original\_acc}\times 100\%,

where $original\_acc$ and $defense\_acc$ denote the accuracy of the model with and without defense⁸⁸8The original accuracies of LeNet are 84.12%, 54.01%, and 21.45% on SVHN, CIFAR10, and CIFAR100. The original accuracies of ResNet10 and ResNet18 are 83.25% and 84.95% on CIFAR10..

Privacy protection. Privacy protection measures the amount of privacy information recovered from the reconstructed images. Given the lack of a universally perfect privacy assessment metric, we employ a diverse range of metrics including PSNR [16], SSIM [45], LPIPS [51], and the evaluation network to achieve a more comprehensive assessment:

•

PSNR is the logarithmic average distance between the original and reconstructed images. A higher PSNR indicates high-quality reconstructed images.
•

SSIM captures human perception of image quality better by considering factors like brightness, contrast, and structure. SSIM ranges from 0 to 1, with values closer to 1 indicating greater similarity between two images.
•

LPIPS assesses the similarity of two images based on the comparison of features extracted by DNNs. A value close to 0 indicates a higher similarity between two images. LPIPS often better captures semantic and perceptual differences compared to PSNR and SSIM.
•

The evaluation network estimates the proportion of noise in the data, producing noise ratio between 0 and 1. As mentioned earlier, when the noise ratio exceeds 0.6, the human eye probably recognizes it as noisy data.

More details can be found in Appendix E.

Time cost. Time cost is also an important dimension for evaluating a defense. We record the time required for a single iteration of the model when employing defenses.

5.4 Model, Dataset, and Training Setting

We select two widely-used model architectures, LeNet [22] and ResNet [14], on three benchmark datasets SVHN [30], CIFAR10 [21], and CIFAR100 [21]. We include ResNet10 and ResNet18 to study the impact of different scales. Our focus is on a typical FL scenario involving 10 clients collaborating to train a global model, with the assistance of a server.

Our evaluation also examines the effect of data distribution on FL by considering both IID and non-IID settings [19]. In the IID setting, the training datasets are randomly and equally divided among the 10 clients, ensuring identical distribution for each client’s local dataset. In the Non-IID setting, we allocate the training datasets unevenly among the 10 clients in a label imbalance fashion [26, 19].

The training process takes place over 8000 iterations, during which each client computes gradients locally using a batch size of 128. These gradients are then uploaded to the server, which updates the global model using FedAvg with a learning rate of 0.01. We scale the data to the range of 0 to 1 without employing additional data augmentation techniques.

6 Empirical Evaluation

6.1 Privacy-Utility Trade-off

Numerical results. Figure 7 illustrates the trade-off curves between PMM (utility) and PSNR (privacy) for various defenses against four state-of-the-art attacks on CIFAR10. Overall, Refiner consistently outperforms baselines by achieving lower PSNR while maintaining comparable PMM. For instance, when employing InvertingGrad attack, Refiner obtains an improvement of approximately 3 in PSNR compared to state-of-the-art defense Soteria, while Refiner maintains slightly higher utility than Soteria.

Moreover, we observe that the impact of Refiner, Soteria, and Pruning on model performance is relatively mild compared to other defenses, except Refiner. Particularly, excessively aggressive defense strengths, such as employing a noise magnitude of DP greater than or equal to $10^{-2}$ or employing a quantization precision less than or equal to 8 bits with GQ, can significantly deteriorate the model’s performance.

Semantic results. We see that the reconstruction quality of GGA, as measured by PSNR, is significantly inferior to the other three attacks. One possible explanation is that GGA focuses more on semantic rather than pixel-level reconstruction. To substantiate this conjecture, Figure 8 showcases the reconstruction images of GGA. Based on Figure 8, GGA indeed recovers the privacy information embedded in the original image (i.e., airplane), highlighting its attack effectiveness. Furthermore, compared to the PSNR metric, LPIPS places greater emphasis on the similarity of features extracted using DNNs, thereby underscoring the importance of semantic-level similarity over pixel-level similarity. Therefore, we deploy LPIPS metric and the evaluation network for further assessment of GGA’s effectiveness. From Figure 9(a), we observe that the attack performance of GGA is competitive with InvertingGrad against Soteria and Refiner. Remarkably, in Figure 9(b), the evaluation network consistently rates GGA’s attack effectiveness highly regardless of defense strength. This is intuitive since the generated images by GGA are noise-free, as displayed in Figure 8. In short, the above analysis stresses the necessity of employing diverse privacy metrics to assess defense performance.

Evaluating defenses with varied assessment metrics. Figure 10 shows the performance of different defenses under various privacy metrics. As can be seen, Refiner consistently achieves a superior trade-off between privacy and utility when evaluated using different privacy metrics. For instance, Refiner enables the inclusion of approximately 30% noise information in constructed data with negligible impact on the model’s performance, whereas Soteria achieves only around 20% noise inclusion capability.

Safeguarding different model architectures. We here examine the performance of different defenses when applied to ResNet. The corresponding results are presented in Figure 11. Remarkably, Refiner exhibits a superior trade-off between utility and privacy, especially when applied to larger models (ResNet18) compared to baselines. We hypothesize that this can be attributed to the increased non-linearity capacity of larger models, rendering the assumptions of other defenses less valid and consequently leading to degraded effectiveness.

Safeguarding different datasets. We also evaluate the performance of Refiner on different datasets, SVHN and CIFAR100. The results not only confirm the consistent superior performance of Refiner, but also reveal two noteworthy observations. Firstly, we observe a better reconstruction quality for attacks carried out on SVHN, with the highest achieved PSNR approaching 40. This is probably because SVHN is a less sophisticated dataset, rendering the construction easier. Secondly, we observe the remarkable effectiveness of Refiner compared to baselines when applied to the CIFAR100 dataset, suggesting that Refiner exhibits greater resilience in handling more complex datasets.

Defense with Non-IID setting. In the Non-IID setting, the datasets of clients are heterogeneous, posing greater challenges to maintaining performance. We evaluate the effectiveness of our approach in non-IID scenarios, as shown in Figure 13. Wherein, Refiner outperforms baselines. Moreover, while we observe little change in the quality of attacker-reconstructed images (measured by PSNR) under non-IID conditions, the difficulty of maintaining performance increases significantly compared to IID conditions.

Table 5: The execution time (s) for a single iteration of varying models equipped with various defenses using an A10 GPU.

Dfense	LeNet	ResNet10	ResNet18	ResNet34
Pruning	0.0187	0.0495	0.0873	0.1370
Gaussian	0.0467	0.7317	1.8408	3.3020
Laplace	0.0469	0.7322	1.8414	3.3025
GQ	0.0179	0.0277	0.0514	0.1057
Soteria	0.4103	11.3148	21.7539	39.4410
Our	0.3340	1.7341	2.8539	4.7012

6.2 Empirical Time Complexity

The additional time cost they imposed by defenses is a crucial dimension of assessing their effectiveness. In this regard, Table 5 reports the runtime required for a single iteration of models when employing different defenses. For the small-scale model LeNet, Refiner and Soteria exhibit a time cost within the same order of magnitude. However, this time cost is one order of magnitude higher compared to Pruning, Gaussian, Laplace, and GQ. This is due to the fact that Refiner and Soteria require multiple forward-backward propagations to search for better utility-privacy trade-offs.

For larger models ResNet10 and ResNet18, The runtime required by Pruning and GQ increases in an approximate linear fashion. Refiner, Gaussian, and Laplace own comparable runtime overhead. Notably, Soteria incurs a significantly higher time cost of the 40s per iteration on ResNet34, which is 8.4x higher than the time required for Refiner. The heavy overhead of Soteria stems from the repeated execution of forward-backward propagation, which needs to be performed as many times as there are neurons in the fully connected layer. Consequently, as the number of fully connected neurons increases in larger models, the time cost of Soteria escalates significantly. In contrast, Refiner offers a scalable time cost for large models since the number of forward-backward propagations depends mainly on the user-specified number of iterations, granting flexibility based on factors such as clients’ privacy preferences and computing resources. Empirical results indicate that good performance can be achieved with only 10 iterations, far lower than the runtime cost associated with Soteria. In summary, Refiner achieves a better utility-privacy trade-off compared to DP, GQ, and gradient pruning, albeit at the expense of higher time cost. But, Refiner’s time cost is still lower than that of Soteria.

Table 6: The PSNR and PMM achieved by Refiner over varying

\beta

against InvertingGrad in LeNet trained with CIFAR10.

$\beta$	$10^{-3}$	$10^{-2}$	$10^{-1}$	$10^{0}$	$10^{1}$	$10^{2}$	$10^{3}$
PSNR	13.89	13.62	13.44	13.20	13.08	12.70	12.04
PMM	91.93	91.12	91.06	90.69	90.00	89.30	88.90

Table 7: The PSNR and PMM achieved by Refiner over varying

\iota

against InvertingGrad in LeNet trained with CIFAR10.

$\iota$	5	10	15	20
PSNR	13.05	13.20	13.37	13.45
PMM	85.61	90.69	91.19	92.17

Table 8: The PSNR and PMM achieved by Refiner over varying

\alpha

against InvertingGrad in LeNet trained with CIFAR10.

$\alpha$	0	0.1	0.3	0.5	0.7	0.9
PSNR	21.24	16.52	14.99	13.20	12.53	10.52
PMM	98.52	95.45	92.61	90.69	90.08	89.34

Table 9: The PSNR and PMM achieved by Refiner over varying

\tau

against InvertingGrad in LeNet trained with CIFAR10.

$\tau$	0.91	0.93	0.95	0.97	1.00
PSNR	12.93	13.05	13.10	13.15	15.44
PMM	90.98	90.82	90.69	90.46	89.28

6.3 Sub-components Analysis

In this section, we investigate the impact of four sub-components of Refiner on its performance. We fix $\epsilon=0.1$ .

Impact of balance factor $\beta$ . Table 6 reports the performance of Refiner as $\beta$ varies against InvertingGrad. Overall, the higher $\beta$ is, the lower PMM is, and the higher MSE is. Moreover, the performance of Refiner is more robust to tuning $\beta$ compared with $\alpha$ , i.e., there is not a huge performance shift w.r.t. different $\beta$ . Thus, we believe setting $\alpha=1$ by default is a good option in practice.

Impact of refinement iterations $\iota$ . Refinement iterations can influence the quality of resultant robust data and Table 7 reports the performance of Refiner over different refinement iterations against InvertingGrad. An observation is that, the performance of Refiner intensively increases by raising refinement iterations from 1 to 100, but further enhancing refinement iterations from 100 to 200 only produces a slight improvement, suggesting that refinement iterations of 100 are sufficient for optimization convergence in practice.

Impact of $\alpha$ . Table 8 reports the performance of Refiner when configured with varying refinement iteration settings against InvertingGrad. It is observed that increasing the refinement iterations from 5 to 10 leads to a significant improvement in the performance of Refiner, enabling the acquisition of more PMM with similar PSNR (similar privacy protection ability). Intuitively, this is because during the early iterations, Refiner can maintain robust data to be noisy while effectively aligning gradients associated with clients’ and robust data. Further increasing the refinement iterations to 15 or 20 only yields marginal enhancements, suggesting that a refinement iteration value of approximately 10 would be a preferable choice for practical applications if considering the time cost.

Impact of $\tau$ . When considering the promotion of the alignment between client data and robust data gradients, $\tau$ plays a significant role. It directs more attention towards aligning the gradients of early layers, serving a dual purpose. Firstly, it aids in maintaining the performance of the model since the early layers contribute more significantly to its performance. Secondly, according to [39], the issue of privacy leakage is more closely associated with the gradients of later layers. The experimental results are summarized in Table 9. By including $\tau$ in Refiner, i.e., a transition from 1 to 0.97, we see only a marginal 1% decrease in PMM, yet achieve a substantial 2-point reduction in PSNR. Further increasing the value of $\tau$ does not yield significant changes in PSNR and PMM, suggesting that the observed improvements reach a plateau.

7 Discussion and Future Work

Why does Refiner work? We are interested in understanding why Refiner is more effective than perturbation-based methods. We find that perturbation-based methods exhibit optimality under specific assumptions. However, these assumptions rarely hold true for DNNs, thereby posing a challenge to achieving the desired performance using these methods.

To elaborate, perturbing each gradient element equally, as commonly practiced in DP and GQ, assumes that every gradient element carries an equal amount of privacy information and contributes equally to the model performance. Under this assumption, the optimal utility-privacy trade-off can be reached through uniformly perturbing fashions. By regarding all gradient elements as possessing identical privacy information, the optimal solution, derived from Taylor expansion, suggests preserving the gradient elements with the highest magnitudes, i.e.gradient pruning. Soteria takes into account the linearity assumption of DNNs, allowing for an estimation of the privacy information contained in each gradient element as well as assuming the same utility across all gradient elements. In this way, Soteria can theoretically achieve optimality by selectively removing the gradient elements with the highest privacy information.

When employing Refiner, clients upload the gradients associated with robust data that are generated by optimizing two metrics for privacy protection and performance maintenance. The uploaded gradients in Refiner only involve privacy from robust data. By reducing semantic similarity between the robust data and clients’ data, the uploaded gradients contain less privacy information associated with clients’ data. Importantly, such privacy estimation fashion avoids potential estimation errors that may arise from questionable assumptions made by perturbation-based methods. Regarding utility, Refiner prioritizes important parameters and narrows the gap between the gradients of these parameters with respect to robust and clients’ data. Section 3.3 validates that preserving the gradient of important parameters is a better solution than retaining gradients with the largest magnitude, i.e., gradient pruning. Taken together, we grasp why Refiner enjoys so impressive performance compared to perturbation-based methods.

Limitation and future work. The superior performance of Refiner is accompanied by a higher time cost compared to DP, GQ, and gradient pruning, but still lower than Soteria. The non-negligible extra time costs constitute the major drawback of Refiner, probably making some difficulty in its implementation over resource-constrained clients. However, it should be acknowledged that higher performance commonly entails increased computational resources as a trade-off, specifically in deep learning field. In summary, we believe that Refiner introduces a novel idea along with a pair of concrete privacy and utility metrics for mitigating GLAs. In future endeavors, our focus will be on reducing the time cost associated with Refiner, so as to extend it to be more suitable to resource-constrained scenarios.

References

Abadi et al. [2016] Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, pages 308–318, 2016.
Aji and Heafield [2017] Alham Fikri Aji and Kenneth Heafield. Sparse communication for distributed gradient descent. ArXiv, abs/1704.05021, 2017. URL https://api.semanticscholar.org/CorpusID:2140766.
Balunović et al. [2022] Mislav Balunović, Dimitar I Dimitrov, Robin Staab, and Martin Vechev. Bayesian framework for gradient leakage. In International Conference on Learning Representations, 2022.
Bernstein et al. [2018] Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, and Anima Anandkumar. signsgd: compressed optimisation for non-convex problems. In International Conference on Machine Learning, 2018. URL https://api.semanticscholar.org/CorpusID:7763588.
Bonawitz et al. [2017] Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H. B. McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, and Karn Seth. Practical secure aggregation for privacy-preserving machine learning. Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, 2017. URL https://api.semanticscholar.org/CorpusID:3833774.
Boyd and Vandenberghe [2005] Stephen P. Boyd and Lieven Vandenberghe. Convex optimization. Journal of the American Statistical Association, 100:1097 – 1097, 2005. URL https://api.semanticscholar.org/CorpusID:37925315.
Carlini and Wagner [2016] Nicholas Carlini and David A. Wagner. Towards evaluating the robustness of neural networks. 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57, 2016. URL https://api.semanticscholar.org/CorpusID:2893830.
Dowlin et al. [2016] Nathan Dowlin, Ran Gilad-Bachrach, Kim Laine, Kristin E. Lauter, Michael Naehrig, and John Robert Wernsing. Cryptonets: applying neural networks to encrypted data with high throughput and accuracy. In International Conference on Machine Learning, 2016.
Fan et al. [2022] Mingyuan Fan, Yang Liu, Cen Chen, Shengxing Yu, W. Guo, Li Wang, and Ximeng Liu. Toward evaluating the reliability of deep-neural-network-based iot devices. IEEE Internet of Things Journal, 9:17002–17013, 2022. URL https://api.semanticscholar.org/CorpusID:245560775.
Fereidooni et al. [2021] Hossein Fereidooni, Samuel Marchal, Markus Miettinen, Azalia Mirhoseini, Helen Möllering, Thien Duc Nguyen, Phillip Rieger, Ahmad-Reza Sadeghi, T. Schneider, Hossein Yalame, and Shaza Zeitouni. Safelearn: Secure aggregation for private federated learning. 2021 IEEE Security and Privacy Workshops (SPW), pages 56–62, 2021. URL https://api.semanticscholar.org/CorpusID:233176356.
Geert et al. [2017] Geert, Litjens, Thijs, Kooi, Babak, Ehteshami, Bejnordi, Arnaud, Arindra, and Adiyoso. A survey on deep learning in medical image analysis. Medical Image Analysis, 2017.
Geiping et al. [2020] Jonas Geiping, Hartmut Bauermeister, Hannah Dröge, and Michael Moeller. Inverting gradients-how easy is it to break privacy in federated learning? Advances in Neural Information Processing Systems, 33:16937–16947, 2020.
Goodfellow et al. [2014] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2014.
He et al. [2015] Kaiming He, X. Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2015. URL https://api.semanticscholar.org/CorpusID:206594692.
Hoff [2009] Peter D. Hoff. A first course in bayesian statistical methods. 2009. URL https://api.semanticscholar.org/CorpusID:60995805.
Horé and Ziou [2010] Alain Horé and Djemel Ziou. Image quality metrics: Psnr vs. ssim. 2010 20th International Conference on Pattern Recognition, pages 2366–2369, 2010. URL https://api.semanticscholar.org/CorpusID:9506273.
Hornik et al. [1989] Kurt Hornik, Maxwell B. Stinchcombe, and Halbert L. White. Multilayer feedforward networks are universal approximators. Neural Networks, 2:359–366, 1989. URL https://api.semanticscholar.org/CorpusID:2757547.
Hornik et al. [1990] Kurt Hornik, Maxwell B. Stinchcombe, and Halbert L. White. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Networks, 3:551–560, 1990. URL https://api.semanticscholar.org/CorpusID:13533363.
Hsu et al. [2019] Tzu-Ming Harry Hsu, Qi, and Matthew Brown. Measuring the effects of non-identical data distribution for federated visual classification. ArXiv, abs/1909.06335, 2019. URL https://api.semanticscholar.org/CorpusID:202572978.
Jeon et al. [2021] Jinwoo Jeon, Kangwook Lee, Sewoong Oh, Jungseul Ok, et al. Gradient inversion with generative image prior. Advances in Neural Information Processing Systems, 34:29898–29908, 2021.
Krizhevsky [2009] Alex Krizhevsky. Learning multiple layers of features from tiny images. 2009. URL https://api.semanticscholar.org/CorpusID:18268744.
LeCun et al. [1998] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proc. IEEE, 86:2278–2324, 1998. URL https://api.semanticscholar.org/CorpusID:14542261.
Li et al. [2020a] Tian Li, Anit Kumar Sahu, Ameet Talwalkar, and Virginia Smith. Federated learning: Challenges, methods, and future directions. IEEE Signal Processing Magazine, 37(3):50–60, 2020a.
Li et al. [2020b] Xiang Li, Kaixuan Huang, Wenhao Yang, Shusen Wang, and Zhihua Zhang. On the convergence of fedavg on non-iid data. 2020b.
Li et al. [2022] Zhuohang Li, Jiaxin Zhang, Luyang Liu, and Jian Liu. Auditing privacy defenses in federated learning via generative gradient leakage. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10132–10142, 2022.
MacKay [2004] David John Cameron MacKay. Information theory, inference, and learning algorithms. IEEE Transactions on Information Theory, 50:2544–2545, 2004. URL https://api.semanticscholar.org/CorpusID:5436619.
McMahan et al. [2016] H. B. McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Agüera y Arcas. Communication-efficient learning of deep networks from decentralized data. In International Conference on Artificial Intelligence and Statistics, 2016. URL https://api.semanticscholar.org/CorpusID:14955348.
Mohassel and Zhang [2017] Payman Mohassel and Yupeng Zhang. Secureml: A system for scalable privacy-preserving machine learning. 2017 IEEE Symposium on Security and Privacy (SP), pages 19–38, 2017. URL https://api.semanticscholar.org/CorpusID:11605311.
Moshayedi et al. [2022] Ata Jahangir Moshayedi, Atanu Shuvam Roy, Amin Kolahdooz, and Yang Shuxin. Deep learning application pros and cons over algorithm. EAI Endorsed Transactions on AI and Robotics, 2022. URL https://api.semanticscholar.org/CorpusID:247311133.
Netzer et al. [2011] Yuval Netzer, Tao Wang, Adam Coates, A. Bissacco, Bo Wu, and A. Ng. Reading digits in natural images with unsupervised feature learning. 2011. URL https://api.semanticscholar.org/CorpusID:16852518.
Nweke et al. [2019] Henry Friday Nweke, Ying Wah Teh, Ghulam Mujtaba, and Mohammed Ali Al-Garadi. Data fusion and multiple classifier systems for human activity detection and health monitoring: Review and open research directions. Information Fusion, 46:147–170, 2019.
OpenAI [2023] OpenAI. Gpt-4 technical report. ArXiv, abs/2303.08774, 2023. URL https://api.semanticscholar.org/CorpusID:257532815.
Pouyanfar et al. [2018] Samira Pouyanfar, Saad Sadiq, Yilin Yan, Haiman Tian, Yudong Tao, Maria Presa Reyes, Mei-Ling Shyu, Shu-Ching Chen, and Sundaraja S Iyengar. A survey on deep learning: Algorithms, techniques, and applications. ACM Computing Surveys (CSUR), 51(5):1–36, 2018.
Reisizadeh et al. [2019] Amirhossein Reisizadeh, Aryan Mokhtari, Hamed Hassani, Ali Jadbabaie, and Ramtin Pedarsani. Fedpaq: A communication-efficient federated learning method with periodic averaging and quantization. In International Conference on Artificial Intelligence and Statistics, 2019. URL https://api.semanticscholar.org/CorpusID:203593931.
Reza [1994] Fazlollah M Reza. An introduction to information theory. Courier Corporation, 1994.
Russakovsky et al. [2014] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, and Li Fei-Fei. Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115:211 – 252, 2014. URL https://api.semanticscholar.org/CorpusID:2930547.
Sahu et al. [2018] Anit Kumar Sahu, Tian Li, Maziar Sanjabi, Manzil Zaheer, Ameet Talwalkar, and Virginia Smith. Federated optimization in heterogeneous networks. arXiv: Learning, 2018. URL https://api.semanticscholar.org/CorpusID:59316566.
Schuhmann et al. [2022] Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, Patrick Schramowski, Srivatsa Kundurthy, Katherine Crowson, Ludwig Schmidt, Robert Kaczmarczyk, and Jenia Jitsev. Laion-5b: An open large-scale dataset for training next generation image-text models. ArXiv, abs/2210.08402, 2022. URL https://api.semanticscholar.org/CorpusID:252917726.
Sun et al. [2021] Jingwei Sun, Ang Li, Binghui Wang, Huanrui Yang, Hai Li, and Yiran Chen. Soteria: Provable defense against privacy leakage in federated learning from representation perspective. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9311–9319, 2021.
Suárez et al. [2021] Juan Luis Suárez, Salvador García, and Francisco Herrera. A tutorial on distance metric learning: Mathematical foundations, algorithms, experimental analysis, prospects and challenges. Neurocomputing, 425:300–322, 2021. ISSN 0925-2312. doi: https://doi.org/10.1016/j.neucom.2020.08.017. URL https://www.sciencedirect.com/science/article/pii/S0925231220312777.
Theodoridis [2015] Sergios Theodoridis. Machine learning: a Bayesian and optimization perspective. Academic press, 2015.
Vepakomma et al. [2018] Praneeth Vepakomma, Otkrist Gupta, Tristan Swedish, and Ramesh Raskar. Split learning for health: Distributed deep learning without sharing raw patient data. ArXiv, abs/1812.00564, 2018. URL https://api.semanticscholar.org/CorpusID:54439509.
Wainakh et al. [2021] Aidmar Wainakh, Till Müßig, Tim Grube, and Max Mühlhäuser. Label leakage from gradients in distributed machine learning. 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), pages 1–4, 2021. URL https://api.semanticscholar.org/CorpusID:232266184.
Wang et al. [2022] Junxiao Wang, Song Guo, Xin Xie, and Heng Qi. Protect privacy from gradient leakage attack in federated learning. In IEEE INFOCOM 2022-IEEE Conference on Computer Communications, pages 580–589. IEEE, 2022.
Wang et al. [2004] Zhou Wang, Alan Conrad Bovik, Hamid R. Sheikh, and Eero P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13:600–612, 2004. URL https://api.semanticscholar.org/CorpusID:207761262.
Wei et al. [2020] Wenqi Wei, Ling Liu, Margaret Loper, Ka-Ho Chow, Mehmet Emre Gursoy, Stacey Truex, and Yanzhao Wu. A framework for evaluating client privacy leakages in federated learning. In European Symposium on Research in Computer Security, pages 545–566. Springer, 2020.
Wei et al. [2021] Wenqi Wei, Ling Liu, Yanzhao Wut, Gong Su, and Arun Iyengar. Gradient-leakage resilient federated learning. In 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS), pages 797–807. IEEE, 2021.
Yin et al. [2021] Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M Alvarez, Jan Kautz, and Pavlo Molchanov. See through gradients: Image batch recovery via gradinversion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16337–16346, 2021.
Yue et al. [2023] K. Yue, Richeng Jin, Chau-Wai Wong, Dror Baron, and Huaiyu Dai. Gradient obfuscation gives a false sense of security in federated learning. 32nd USENIX Security Symposium (USENIX Security 23), 2023.
Zhang et al. [2020] Chengliang Zhang, Suyi Li, Junzhe Xia, Wei Wang, Feng Yan, and Yang Liu. Batchcrypt: Efficient homomorphic encryption for cross-silo federated learning. In USENIX Annual Technical Conference, 2020.
Zhang et al. [2018] Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 586–595, 2018. URL https://api.semanticscholar.org/CorpusID:4766599.
Zhang et al. [2022] Rui Zhang, Song Guo, Junxiao Wang, Xin Xie, and Dacheng Tao. A survey on gradient inversion: Attacks, defenses and future directions. ArXiv, abs/2206.07284, 2022. URL https://api.semanticscholar.org/CorpusID:249674534.
Zhao et al. [2020] Bo Zhao, Konda Reddy Mopuri, and Hakan Bilen. idlg: Improved deep leakage from gradients. arXiv preprint arXiv:2001.02610, 2020.
Zhu et al. [2019] Ligeng Zhu, Zhijian Liu, and Song Han. Deep leakage from gradients. Advances in Neural Information Processing Systems, 32, 2019.

Appendix A Proof of Privacy Metric

We here prove the following equation:

\begin{split}&\frac{1}{2}\int p(x)log\frac{2p(x)}{p(x)+q(x)}+q(x)log\frac{2q(x)}{p(x)+q(x)}dx\\ &=\underset{D,D(x)\in[0,1]}{\max}\mathbb{E}_{x\sim p(x)}[log(1-D(x))]+\mathbb{E}_{x\sim q(x)}[logD(x)].\end{split}

(8)

A convex function $g(x)$ can be represented using its conjugate form as follows:

\begin{split}&g(x)=\operatorname*{max}_{t}\ tx-g(t),\\ &t=g^{\prime}(x),g(t)=-f(x)+f^{\prime}(x)x,\end{split}

(9)

where $t$ belongs to the range of $g^{\prime}(x)$ . Equation 9 indicates that the value of $g(x)$ at a certain point can be evaluated by maximizing $t$ . When given an $x$ , there exists a fixed corresponding value for $t$ , which we denote as $t=H(x)$ . Consequently, Equation 9 can be rewritten as follows:

\begin{split}g(x)=\operatorname*{max}_{H(x)}\ H(x)x-g(H(x)),\end{split}

(10)

where the range of $H(x)$ belongs to the range of $g^{\prime}(x)$ .

Now let’s return to the JS divergence of $p(x)$ and $q(x)$ . Upon reorganization, we obtain the following expression of JS divergence:

\begin{split}&\frac{1}{2}\int p(x)log\frac{2p(x)}{p(x)+q(x)}+q(x)log\frac{2q(x)}{p(x)+q(x)}dx\\ &=\frac{1}{2}\int p(x)\{log\frac{2}{1+q(x)/p(x)}+\frac{q(x)}{p(x)}log\frac{2q(x)/p(x)}{1+q(x)/p(x)}\}dx.\end{split}

(11)

Let’s consider letting $z=\frac{q(x)}{p(x)}$ and define $h(z)=log\frac{2}{1+z}+zlog\frac{2z}{1+z}$ . It is easy to verify that $h(z)$ is a convex function over its domain ranging from $0$ to $+\infty$ . Figure 14 shows the shape of $h(z)$ .

Now we represent JS divergence using its conjugate form as follows:

\begin{split}&\frac{1}{2}\int p(x)log\frac{2p(x)}{p(x)+q(x)}+q(x)log\frac{2q(x)}{p(x)+q(x)}dx\\ &=\frac{1}{2}\int p(x)h(\frac{q(x)}{p(x)})dx\\ &=\frac{1}{2}\int p(x)\{\operatorname*{max}_{H(\frac{q(x)}{p(x)})}\ H(\frac{q(x)}{p(x)})\frac{q(x)}{p(x)}-g(H(\frac{q(x)}{p(x)}))\}dx\\ &=\frac{1}{2}\operatorname*{max}_{H(\frac{q(x)}{p(x)})}\int q(x)H(\frac{q(x)}{p(x)})-p(x)g(H(\frac{q(x)}{p(x)}))dx.\end{split}

(12)

According to Equation 10, the corresponding form of $g(\cdot)$ to $h(\cdot)$ is $g(t)=-log(2-e^{t})$ . Substituting the specific form of $g(t)=-log(2-e^{t})$ into Equation 12, we obtain:

\begin{split}&\frac{1}{2}\operatorname*{max}_{H(\frac{q(x)}{p(x)})}\int q(x)H(\frac{q(x)}{p(x)})-p(x)g(H(\frac{q(x)}{p(x)}))dx\\ &=\frac{1}{2}\operatorname*{max}_{H(\frac{q(x)}{p(x)})}\int q(x)H(\frac{q(x)}{p(x)})+p(x)log(2-e^{H(\frac{q(x)}{p(x)})})dx.\end{split}

(13)

Setting $H(\frac{q(x)}{p(x)})=log(2D(x))$ and substituting it into Equation 13, we obtain:

\begin{split}&\frac{1}{2}\operatorname*{max}_{H(\frac{q(x)}{p(x)})}\int q(x)H(\frac{q(x)}{p(x)})+p(x)log(2-e^{H(\frac{q(x)}{p(x)})})dx\\ &=\frac{1}{2}\operatorname*{max}_{D(x)}\int q(x)log2D(x)+p(x)log(2-e^{log2D(x)})dx\\ &=\frac{1}{2}\operatorname*{max}_{D(x)}\int q(x)logD(x)+p(x)log(1-D(x))dx+\frac{1}{2}log4\\ &=\frac{1}{2}\operatorname*{max}_{D(x)}\int q(x)logD(x)+p(x)log(1-D(x))dx+log2\\ &=\frac{1}{2}\operatorname*{max}_{D(x)}\mathbb{E}_{x\sim p(x)}[log(1-D(x))]+\mathbb{E}_{x\sim q(x)}[logD(x)]+log2.\end{split}

(14)

Equation 5 and Equation 14 differ only in the absence of $\frac{1}{2}$ and $log2$ , which do not affect the final results. Moreover, the definition of conjugate function requires $H(\cdot)$ to be limited to the range of $g^{\prime}(t)$ . Since the range of $g^{\prime}(t)$ is smaller than $log2$ , we can derive that the range of $D(\cdot)$ is constrained to the interval of 0 to 1.

Appendix B Proof of Theorem 1

We here show proof of Theorem 1. We assume $||\nabla_{\theta}\mathcal{L}(F_{\theta}(x),y)-\nabla_{\theta}\mathcal{L}(F_{\theta}(x^{*}),y)||\leq\epsilon$ . By leveraging Assumption 3, there is:

\begin{split}&\mathbb{E}||\nabla_{\theta}\mathcal{L}(F_{\theta}(x_{i}),y_{i})-g_{i}^{full}||_{2}^{2}\\ &=\mathbb{E}||\nabla_{\theta}\mathcal{L}(F_{\theta}(x_{i}),y_{i})-\nabla_{\theta}\mathcal{L}(F_{\theta}(x^{*}),y)\\ &+\nabla_{\theta}\mathcal{L}(F_{\theta}(x^{*}),y)-g_{i}^{full}||_{2}^{2}\\ &\leq\mathbb{E}||\nabla_{\theta}\mathcal{L}(F_{\theta}(x_{i}),y_{i})-\nabla_{\theta}\mathcal{L}(F_{\theta}(x^{*}),y)||_{2}^{2}\\ &+\mathbb{E}||\nabla_{\theta}\mathcal{L}(F_{\theta}(x^{*}),y)-g_{i}^{full}||_{2}^{2}=\sigma_{i}^{2}+\epsilon^{2}.\end{split}

(15)

Based on Assumption 4, we obtain:

\begin{split}&\mathbb{E}||\nabla_{\theta}\mathcal{L}(F_{\theta}(x_{i}^{*}),y_{i})||_{2}^{2}\\ &=\mathbb{E}||\nabla_{\theta}\mathcal{L}(F_{\theta}(x_{i}^{*}),y_{i})-\nabla_{\theta}\mathcal{L}(F_{\theta}(x_{i}),y_{i})+\nabla_{\theta}\mathcal{L}(F_{\theta}(x_{i}),y_{i})||_{2}^{2}\\ &\leq\mathbb{E}||\nabla_{\theta}\mathcal{L}(F_{\theta}(x_{i}^{*}),y_{i})-\nabla_{\theta}\mathcal{L}(F_{\theta}(x_{i}),y_{i})||_{2}^{2}\\ &+\mathbb{E}||\nabla_{\theta}\mathcal{L}(F_{\theta}(x_{i}),y_{i})||_{2}^{2}=G^{2}+\epsilon^{2}.\end{split}

(16)

By applying Equation 15, Equation 16 in Theorem 2 in [24], we can establish the following convergence guarantee of FedAvg with Refiner, i.e., Theorem 1 in this paper:

\begin{split}&\mathbb{E}[\mathcal{L}(F_{\theta}(\cdot),\cdot)]-\mathcal{L}(F_{\theta^{*}}(\cdot),\cdot)\\ &\leq\frac{2\kappa}{\gamma+N}(\frac{Q+C}{u}+\frac{u\gamma}{2}\mathbb{E}||\theta_{initial}-\theta^{*}||_{2}^{2}).\end{split}

Notably, although we assume a local step of 1, this convergence guarantee can be easily extended to multiple local steps.

Appendix C Projection Algorithm

Appendix D More Detailed Settings

The architecture and training settings of the evaluation network. The evaluation network used in this paper consists of four layers: three convolutional layers with Relu activation function followed by a fully connected layer with a sigmoid function. The size of the convolutional kernels is $3\times 3$ , with a stride of 2 and padding of 1. The number of convolutional kernels in the three convolutional layers is 128, 256, and 512, respectively. The last convolutional layer’s output is flattened to match the required input size of the fully connected layer. The output dimension of the fully connected layer is 1, meaning that it predicts the proportion of noises in given inputs. We optimize the evaluation network using the Adam optimizer with a learning rate of $10^{-4}$ , batch size of 128, trained for 20 epochs. The evaluation network is trained by randomly mixing random noises into each batch eleven times using $\alpha=\{0,0.1,\cdots,0.9,1\}$ (the value of $\alpha$ also is supervised signals) before iterating on the next batch.

Non-IID setting. Here we elaborate on how to generate non-IID datasets of clients. We assign a local distribution for each client, representing the distribution of data categories in local datasets. To achieve this, we employ a symmetric Dirichlet distribution with a concentration parameter of 1. By independently sampling from this distribution for each client, we determine the number of samples associated with specific labels that the client possesses. Finally, we randomly select a specified number of samples from training dataset to form the non-IID local datasets. These datasets are used in Section 6 to evaluate the effectiveness of different defenses.

Appendix E Concrete Formula of Evaluation Metrics

We here elaborate on the metrics used in this paper, including PSNR, SSIM, and LPIPS. The evaluation network has been detailed in Section 3.3. Consider the images of interest, denoted as $X,Y\in\mathbb{R}^{c\times h\times w}$ . $c,h,w$ represent the dimensions of channels, heights, and weights for the images.

PSNR (Peak Signal-to-Noise Ratio). PSNR serves as a useful benchmark to evaluate the quality of reconstructed images compared to the original. It quantifies the level of noise or distortion present in reconstructed images by comparing the mean square error (MSE) between the original and reconstructed images. The formula for PSNR can be expressed as follows:

\begin{split}&MSE(X,Y)=\frac{1}{chw}\sum_{i=1}^{c}\sum_{j=1}^{h}\sum_{k=1}^{w}(X_{i,j,k}-Y_{i,j,k})^{2},\\ &PSNR(X,Y)=10\cdot\log_{10}(\frac{{L^{2}}}{{MSE(X,Y)}}).\end{split}

(17)

MSE is the mean square error between the original and reconstructed images and $L$ is the maximum possible pixel value (e.g., 255 for an 8-bit image). In our evaluation, images are scaled within the range of 0 to 1. Thus, $L$ equals 1.

SSIM (Structural Similarity Index). The SSIM metric gauges the degree of structural similarity between two images, by accounting for differences in their luminance, contrast, and structure. The index ranges from 0 to 1, with a score of 1 indicating complete similarity. The formula for SSIM is:

SSIM(X,Y)=\frac{{(2\mu_{X}\mu_{Y}+C_{1})(2\sigma_{XY}+C_{2})}}{{(\mu_{X}^{2}+\mu_{Y}^{2}+C_{1})(\sigma_{X}^{2}+\sigma_{Y}^{2}+C_{2})}}.

where $\mu_{X}$ , $mu_{Y}$ are the mean values of $X$ and $Y$ , $\sigma_{X}^{2}$ , $\sigma_{Y}^{2}$ are the variances of $X$ and $Y$ , $\sigma_{XY}$ is their covariance, and $C_{1}$ and $C_{2}$ are small constants added for promoting numerical stability. In this paper, we call the off-the-shelf SSIM function provided by the skimage library ⁹⁹9https://github.com/scikit-image/scikit-image to directly compute the SSIM score for a pair of images.

LPIPS (Learned Perceptual Image Patch Similarity). LPIPS assesses the perceptual similarity of two images by leveraging convolutional neural networks. It calculates the dissimilarity between feature representations extracted from convolutional neural networks. Let us denote the extracted features of images $X,Y$ as $A,B\in\mathbb{R}^{C\times H\times W}$ . In this case, the LPIPS is given by:

LPIPS(X,Y)=\sum_{i=1}^{C}\frac{1}{HW}\sum_{j=1}^{H}\sum_{k=1}^{W}||o_{i}(A_{j,k}^{i}-B_{j,k}^{i})||_{2}^{2}.

Here, $o_{i}$ represents the predefined coefficients that weight the different channels. $A_{j,k}^{i},B_{j,k}^{i}$ denotes the value of $i$ -th channel’s element at $j$ -th row and $k$ -th column in the feature map $A,B$ . For practical implementation, we employ the LPIPS function provided by the open-source library lpips ¹⁰¹⁰10https://github.com/richzhang/PerceptualSimilarity, which is built upon the AlexNet backbone.

Appendix F Supplementary Evidence Supporting Section 4.2

Table 10: The MSE distance between

x

and corresponding recovered data using four different attack methods.

Attack	iGLA	GradInvertion	InvertingGrad	GGA
$\sigma$	0.00018	0.00053	0.00046	0.00594

We here empirically demonstrate that $\sigma\ll||x^{*}-x||$ , where $\sigma$ represents the distance between $x$ and the reconstructed data using gradients associated with $x$ . We randomly extract 1000 from the training set of CIFAR10 as $x$ . We utilize LeNet and follow the same configuration as in Section 5. Table 10 reports the MSE distance averaged over 1000 samples using four different attack methods. We further construct robust data for these 1000 samples. The MSE distance, averaged over these 1000 samples, between $x$ and their robust data is 0.0839, significantly larger than $\sigma$ , thus indicating the effectiveness of privacy protection analysis.