
Evaluation of Inference Attack Models for
Deep Learning on Medical Data

Maoqiang Wu, Xinyue Zhang, Jiahao Ding, Hien Nguyen, Rong Yu, Miao Pan, Stephen T. Wong
School of Automation, Guangdong University of Technology, Guangzhou 510006, China
Electrical and Computer Engineering Department, University of Houston, TX 77004, USA
Systems Medicine and Bioengineering, Houston Methodist Cancer Center, 6445 Main Street, TX 77030, USA
Abstract

Deep learning has attracted broad interest in healthcare and medical communities. However, there has been little research into the privacy issues created by deep networks trained for medical applications. Recently developed inference attack algorithms indicate that images and text records can be reconstructed by malicious parties that have the ability to query deep networks. This gives rise to the concern that medical images and electronic health records containing sensitive patient information are vulnerable to these attacks. This paper aims to attract interest from researchers in the medical deep learning community to this important problem. We evaluate two prominent inference attack models, namely, attribute inference attack and model inversion attack. We show that they can reconstruct real-world medical images and clinical reports with high fidelity. We then investigate how to protect patients’ privacy using defense mechanisms, such as label perturbation and model perturbation. We provide a comparison of attack results between the original and the medical deep learning models with defenses. The experimental evaluations show that our proposed defense approaches can effectively reduce the potential privacy leakage of medical deep learning from the inference attacks.

keywords:
attribute inference attack, model inversion attack, medical machine learning, collaborative inference

1 Introduction

Deep learning has become increasingly popular in healthcare and medicine. Medical institutions hold various modalities of medical data, such as electronic health records, biomedical images, and pathology test results. Based on these data, deep neural network models are trained to address important healthcare concerns [1]. Examples include, but are not limited to: 1) deep learning models based on medical records outperformed traditional clinical models for detecting patterns in health trends and risk factors [2]; 2) a deep learning model had high sensitivity and specificity for detecting diabetic retinopathy and macular edema in retinal fundus photographs [3]; 3) a mammography-based deep learning model was more accurate than traditional clinical models for predicting breast cancer risk [4].

However, the application of deep learning in healthcare is confronted with privacy threats. Medical records contain personal private information, such as the drug usage patterns of individual patients. Medical institutions also hold patients’ profile information such as home address, gender, and age. This private information might be unwittingly leaked when the data are used to train a deep learning model [5, 6]. For example, attribute inference attacks [5] can exploit the trained model and incomplete information about a data point to infer the missing information for that point. The adversary could use such an attack to infer target private information from partial information in medical records. Another instance is model inversion attacks, which enable the image data used for classification inference to be recovered from the intermediate outputs of convolutional neural networks (CNNs) [7]. The adversary could deploy a model inversion attack to recover the medical images of any target patient and infer private health conditions accordingly.

The risk of privacy leakage makes medical institutions increasingly less willing to share their data, which inevitably slows down research at the intersection of deep learning and healthcare. Thus, it is necessary to evaluate the potential hazards of various attack models on medical data and develop corresponding defenses against such attacks.

In this paper, we implement two types of attack models on medical data, which are shown in Fig. 1. We use the attribute inference attack to infer sensitive attributes in medical record data from the remaining attributes and class labels when a deep learning model is trained. We also employ the model inversion attack to recover medical image data from intermediate inference outputs. Against these attacks, we present two types of inference attack defense mechanisms. Label perturbation adds noise to the labels predicted by the model and thus hinders privacy leakage from model predictions. Model perturbation adds noise to the parameters of the deep network model and thus disturbs privacy disclosure during model inference. Experimental results show that both attacks successfully disclose the private medical information used in the training and inference processes, and that they are no longer effective under the proposed defense mechanisms.

The main contributions of this paper are summarized as follows.

  • 1.

    We evaluate the attribute inference attack and the model inversion attack on medical data. This demonstrates the privacy vulnerability of deep learning models, which limits their application in the medical area. As far as we know, we are the first to evaluate the model inversion attack on medical data.

  • 2.

    We present inference attack defenses based on label perturbation and model perturbation. The mechanisms can significantly alleviate the privacy breaches of medical data in both the training phase and the inference phase.

The rest of this paper is organized as follows. We summarize the related work in Section 2. In Section 3, we describe the attribute inference attack and model inversion attack. We propose label perturbation mechanism and model perturbation mechanism against the two attacks in Section 4. Simulation results are presented in Section 5. Then we conclude the paper in Section 6.

2 Related Work

There are different types of privacy attacks against training and inference data. These attacks severely threaten patients’ privacy when deep learning is used in the healthcare area. The first type is membership inference attacks [8, 6], which try to infer whether a target sample is contained in the training dataset. The second type is model encoding attacks [9], in which an adversary with direct access to the training data encodes sensitive data into the trained model and later retrieves the encoded information. The third type is attribute inference attacks, in which the adversary, given some attributes of a data point, infers a sensitive attribute. The fourth type is model inversion attacks, in which, given a deep learning model and some features of the input data, the adversary recovers the remaining features. In this paper, we choose two prominent inference attacks, namely the attribute inference attack and the model inversion attack, which may reconstruct medical images and clinical reports and are therefore more threatening to patients’ privacy. We evaluate their attack performance on medical records and medical images, and then propose defense methods against these two inference attacks.

Attribute Inference Attack Attribute inference attack has been studied in various areas. Gong et al. [10, 11] studied attribute inference attacks that infer users’ sensitive attributes in social networks by integrating social friends and behavioral records. Mei et al. [12] proposed a new framework for inference attacks in social networks, which smartly integrates and modifies existing state-of-the-art CNN models. Qian et al. [13] demonstrated that knowledge graphs can strengthen de-anonymization and attribute inference attacks and thus increase the risk of privacy disclosure. However, few studies evaluate attribute inference attacks in the healthcare area. In this paper, we adopt the attribute inference attack of [14], which infers sensitive attributes based on the confidence scores in predictions and can be conveniently deployed on healthcare data. We propose a label perturbation method to effectively defend against the attribute inference attack.

Model Inversion Attack Model inversion attack is a prominent attack that recovers the input data of deep neural networks. Fredrikson et al. [5] proposed a model inversion attack that recovers input images via the confidence scores generated by the softmax layer. He et al. [7] proposed a model inversion attack that reconstructs input images via the intermediate outputs of the neural network. Hitaj et al. [15] utilized generative adversarial networks (GANs) to recover images in a collaborative training system. In this paper, we adopt the model inversion attack of [7] in the medical collaborative deep learning scenario, where two hospitals hold different parts of a deep neural network and collaborate to complete training and inference by transmitting the intermediate outputs. As far as we know, we are the first to evaluate the model inversion attack on medical data via intermediate output information. We propose an effective and convenient perturbation method instead of the defenses suggested in [7], i.e., combining Trusted Execution Environments and Homomorphic Encryption, which requires special architecture support and incurs a huge computational burden.

Other Attacks against Machine Learning Besides attribute inference attacks and model inversion attacks, there exist numerous other types of attacks against ML models [16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38]. A major attack type is adversarial examples [30, 28, 29, 35], in which the adversary carefully crafts noise and adds it to data samples in order to mislead the target classifier. A related type of attack is the backdoor attack, where the adversary tries to embed a trigger into a trained model and to exploit it when the model is deployed [18, 26, 32]. Another line of work is model stealing attacks: [36] proposed the first attack that infers a model’s parameters, and related works focus on inferring a model’s hyperparameters [27, 37].

Possible Defenses To defend against privacy attacks, many researchers have focused on defense methods. Trusted Execution Environments [39] are specialized hardware for secure remote computation and data confidentiality protection against privileged adversaries. Homomorphic Encryption [40] allows training and inference operations on encrypted input data, so the sensitive information will not be leaked. However, these methods require special architecture support and incur a huge computational burden. Differential Privacy (DP) [41] adds noise into the training model, and there exists a trade-off between usability and privacy. However, our attacks mainly target the inference phase rather than the training phase, and thus DP methods are not suitable for defending against them. We propose label perturbation, which adds noise to the predicted label to defend against the attribute inference attack, and model perturbation, which adds noise into the trained model to defend against the model inversion attack. The proposed methods are effective and convenient to apply. We also present results on the trade-off between model accuracy and attack performance. These results provide an intuitive guide for medical staff to adjust the defenses against the two inference attacks.

Figure 1: Inference attack models and defense approaches for medical deep learning. (a) Attribute inference attack. (b) Model inversion attack.

3 Inference Attack Models

3.1 Vulnerability of Medical Deep Learning

By using deep learning algorithms, medical institutions can improve the rate of correct diagnosis [2]. In the training phase, deep neural networks are trained on input medical data and output diagnosis results to learn the inherent relationships between them. In the inference phase, the trained deep neural networks can produce highly accurate diagnoses for new medical data. However, during both the training phase and the inference phase, an adversary could adopt attack methods to infer or recover the input medical data, which contains sensitive information of patients. In this paper, we evaluate two prominent inference attacks, i.e., attribute inference attack and model inversion attack.

As for attribute inference attack, we assume that the adversary knows all attributes of the patients except the sensitive ones when the patients’ medical records are used as input. This assumption covers cases where the attacker can look up the remaining attributes, such as age and gender, in public databases. The adversary exploits the inherent relationships among attributes and labels to recover the patients’ sensitive attributes from the input medical records.

As for model inversion attack, we take the vulnerability of collaborative deep learning as an example, which provides an efficient paradigm to accelerate the learning and prediction process. The fundamental idea is to split a deep neural network into two parts. For example, as shown in Fig. 1b, in medical collaborative learning the first few layers are stored in Hospital A while the rest are kept in Hospital B. In the collaborative training mode [42], Hospital A sends the outputs of the cut layer to Hospital B and then retrieves the gradients of the cut layer. In the collaborative inference mode [7, 43], Hospital A sends the outputs of the cut layer to Hospital B and retrieves the final results. Model training and inference are thus carried out collaboratively without sharing the raw data. However, the shared intermediate outputs may be leaked during transmission. Given this information, the adversary could recover the raw data with a model inversion attack and thus compromise the data privacy of Hospital A.
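To make the collaborative inference setting concrete, the following is a minimal PyTorch sketch of such a split, assuming a toy CNN; the layer shapes, split point, and class count are illustrative assumptions rather than the exact models used in our experiments.

import torch
import torch.nn as nn

# Hospital A keeps the first few layers; Hospital B keeps the rest. Only the
# cut-layer activations cross the network, and that is what an eavesdropper
# could intercept.
class FrontModel(nn.Module):               # held by Hospital A
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )

    def forward(self, x):
        return self.layers(x)              # cut-layer output v = f_theta(x)

class BackModel(nn.Module):                # held by Hospital B
    def __init__(self, num_classes=3):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, v):
        return self.layers(v)

front, back = FrontModel(), BackModel()
x = torch.randn(1, 1, 256, 256)            # a medical image held by Hospital A
v = front(x)                               # intermediate output sent to Hospital B
logits = back(v)                           # diagnosis computed at Hospital B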

3.2 Attribute Inference Attack

As shown in Fig. 1a, attribute inference attack [14] enables an adversary to deduce sensitive attributes in patients’ medical records. In this setting, the goal of the adversary is to guess the value of a sensitive feature of a data point, e.g., the sex attribute, given public knowledge about the data sample and access to the model. Let (x, y) denote a data sample, where x denotes the input patient information and y is the label of this data sample. We assume that a deep network f(x) takes the input x to predict the output y. The network’s parameters are optimized by reducing the discrepancy between the predicted value f(x) and the true outcome y, measured by the cross-entropy loss. We assume there are d attributes in a data sample x and let x_{d} be a sensitive attribute in x that an attacker wants to learn. Given the values of the attributes x_{1}, x_{2}, \cdots, x_{d-1}, the prior probabilities of all attributes, and access to the model f(x), the attacker aims to find the value of x_{d} that maximizes the posterior probability P(x_{d} | x_{1}, x_{2}, \cdots, x_{d-1}, f(x)). The attacker thereby obtains the value of the sensitive attribute x_{d}.
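To illustrate this strategy, the following is a minimal Python sketch of the maximum-a-posteriori guess, assuming the target model exposes a scikit-learn style predict_proba interface and that the sensitive attribute occupies the last feature position; the function and variable names are hypothetical.

import numpy as np

def attribute_inference(model, known_attrs, true_label, candidates, prior):
    # Score each candidate value of the sensitive attribute x_d by the model's
    # confidence on the known label times the candidate's marginal prior, and
    # return the highest-scoring value (MAP estimate).
    scores = []
    for value, p in zip(candidates, prior):
        x = np.append(known_attrs, value).reshape(1, -1)   # assumes x_d is the last feature
        confidence = model.predict_proba(x)[0, true_label]
        scores.append(confidence * p)
    return candidates[int(np.argmax(scores))]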

3.3 Model Inversion Attack

As shown in Fig. 1b, model inversion attack [7] enables an adversary to recover an input medical image x_{0} from the corresponding intermediate output v_{0} = f_{\theta}(x_{0}), where f_{\theta} denotes the first few layers of the model held by Hospital A. We consider the black-box attack setting, where the adversary does not know the structure or parameters \theta of f_{\theta} but can query the black-box model, i.e., it can feed arbitrary data X into the model and observe the intermediate outputs f_{\theta}(X). This assumption corresponds to the use case where Hospital A releases its APIs to other medical entities as training and inference services. In this setting, we build an inverse network model that learns the inverse mapping from output to input without any information about the original model. Roughly, the inverse model g_{\omega} \approx f^{-1}_{\theta} can be regarded as an approximate inverse function of f_{\theta}, which takes v = f_{\theta}(x) as input and produces x as output.

Algorithm 1 shows the detailed model inversion attack, which consists of three phases. In the observation phase, the adversary uses a set of samples X = {x_{1}, \cdots, x_{n}} as inputs to query f_{\theta} and obtains V = {f_{\theta}(x_{1}), \cdots, f_{\theta}(x_{n})}. Here the sample set X is assumed to follow the same distribution as x_{0}. This assumption applies to the case of radiologic images, which usually follow the same distribution. In the training phase, the adversary trains the inverse network g_{\omega} using V as inputs and X as targets. We exploit the l_{2} norm in the pixel space as the loss function, which is given as

l(\omega; X) = \frac{1}{n} \sum^{n}_{i=1} \left\| g_{\omega}\big(f_{\theta}(x_{i})\big) - x_{i} \right\|^{2}_{2}. (1)

In particular, the structure of g_{\omega} is not necessarily related to f_{\theta}. In our experiment, an entirely different architecture is leveraged for the attack. In the recovery phase, the adversary leverages the trained inverse model to recover the raw data from the intermediate value: x_{0}^{\prime} = g_{\omega}(v_{0}).

Algorithm 1 Model Inversion Attack Algorithm

Input: input data X = {x_{1}, x_{2}, \cdots, x_{n}} drawn from the same distribution as the target data x_{0}, intermediate output v_{0} of the target data, batch size B, epoch number E, learning rate \eta
Output: recovered data x_{0}^{\prime}

1: query the model with the input data to obtain V = f_{\theta}(X)
2: initialize \omega_{0}
3: for each epoch t = 1, \cdots, E do
4:     \beta \leftarrow (split V into batches of size B)
5:     for each batch b \in \beta do
6:         \omega_{t+1} \leftarrow \omega_{t} - \eta \nabla l(\omega_{t}; b)
7: recover the target data x_{0}^{\prime} = g_{\omega}(v_{0})
8: return x_{0}^{\prime}
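The following is a minimal PyTorch sketch of Algorithm 1. It assumes black-box query access to the front model through a hypothetical query_front function, a front model that downsamples the image by a factor of two, and an attacker-chosen decoder; it is illustrative rather than the exact inverse network used in our experiments.

import torch
import torch.nn as nn

class InverseNet(nn.Module):
    # A small decoder g_omega; one 2x upsampling block matches a front model with
    # a single max-pooling layer, and deeper cuts would need more such blocks.
    def __init__(self, in_channels=32):
        super().__init__()
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(in_channels, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),                          # pixel values in [0, 1]
        )

    def forward(self, v):
        return self.decode(v)

def train_inverse(query_front, X, epochs=50, batch_size=16, lr=1e-3):
    with torch.no_grad():
        V = query_front(X)                         # observation phase: V = f_theta(X)
    g = InverseNet(in_channels=V.shape[1])
    opt = torch.optim.Adam(g.parameters(), lr=lr)
    loss_fn = nn.MSELoss()                         # pixel-space loss, proportional to Eq. (1)
    for _ in range(epochs):                        # training phase
        for i in range(0, len(X), batch_size):
            v, x = V[i:i + batch_size], X[i:i + batch_size]
            opt.zero_grad()
            loss = loss_fn(g(v), x)
            loss.backward()
            opt.step()
    return g                                       # recovery phase: x0' = g(v0)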

4 Inference Attack Defense Mechanisms

4.1 Label Perturbation Based Protection

We apply randomized response [44] to protect the labels output by the learning model for each data sample against attribute inference attacks. Intuitively, given a flipping probability p, for binary classification the predicted label y is flipped with probability p. Similarly, for a class set \mathcal{C} = \{1, 2, \cdots, C\} (C > 2), the predicted label y \in \mathcal{C} is perturbed with probability p. If the predicted label y is to be replaced, each of the other C - 1 classes has probability 1/(C - 1) of substituting the original label y. The inference accuracy of the attribute inference attack deteriorates when the adversary obtains inaccurate predicted labels. Although the model’s prediction performance can be affected by label perturbation, by controlling the flipping probability p carefully we can still obtain an acceptable model.
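A minimal sketch of this randomized-response style perturbation applied to a single predicted label; the function name and the NumPy implementation are illustrative assumptions.

import numpy as np

def perturb_label(y, num_classes, flip_prob, rng=None):
    # With probability flip_prob, replace the predicted label y with one of the
    # other num_classes - 1 classes, each chosen with probability
    # 1 / (num_classes - 1); otherwise return y unchanged.
    rng = rng or np.random.default_rng()
    if rng.random() < flip_prob:
        others = [c for c in range(num_classes) if c != y]
        return int(rng.choice(others))
    return int(y)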

4.2 Model Perturbation Based Protection

To defend against model inversion attack, we adopt model perturbation in the CNN model. Different from label perturbation, which adds noise to the predicted label, model perturbation adds noise to the model parameters \theta (weights and biases) before forward propagation is carried out. Specifically, we use the Gaussian mechanism with mean 0 and variance \sigma^{2} to generate noise and add it to the model parameters, which is given as

\theta = \theta + \mathcal{N}(0, \sigma^{2} I). (2)

Accordingly, the output of the cut layer is perturbed in collaborative deep learning. It becomes difficult for the model inversion attack to build an accurate mapping from the output to the input image, and thus the quality of the recovered images decreases.
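A minimal PyTorch sketch of this perturbation, assuming the trained front model is available as a torch.nn.Module; the function name is illustrative.

import torch

def perturb_model(model, sigma):
    # Add zero-mean Gaussian noise with standard deviation sigma to every
    # parameter, as in Eq. (2), so that the cut-layer outputs no longer match
    # the mapping learned by the adversary's inverse network.
    with torch.no_grad():
        for p in model.parameters():
            p.add_(torch.randn_like(p) * sigma)
    return model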

5 Performance Evaluation

5.1 Attribute Inference Attack

5.1.1 Experiment Settings

We evaluate the attribute inference attack and the label perturbation approach on two public medical record datasets: the cardiovascular disease dataset and the heart disease dataset. The cardiovascular disease dataset consists of 70,000 records, 11 feature attributes including sensitive information such as age and gender, and labels indicating the presence or absence of cardiovascular disease. The heart disease dataset [45] contains 13 attributes, 303 instances, and labels indicating the presence of heart disease in individual patients. We split each dataset into a training set and a testing set at an 80%/20% ratio. Our experiments use a multilayer perceptron classifier with 2 hidden layers of 100 neurons each.
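A minimal sketch of this setup; the use of scikit-learn and the placeholder data are assumptions, with X and y standing in for the loaded record attributes and disease labels.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Placeholder data in place of the real records (loading is omitted here).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 11))        # e.g., 11 cardiovascular attributes
y = rng.integers(0, 2, size=1000)      # presence/absence of disease

# 80%/20% train/test split and an MLP with two hidden layers of 100 neurons.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(100, 100), max_iter=500)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))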

Figure 2: Attribute attack performance on the “cardiovascular disease” dataset. (a) Smoking. (b) Alcohol intake.

5.1.2 Evaluation Results

As described in Section 3.2, in the experiments we assume the attacker can obtain all of a patient’s information except a single attribute, plus the marginal prior of the targeted attribute. We run each experiment 10 times and show the mean value as the curve and the standard deviation as the error bar. The flip probability denotes the defense level: a higher flip probability means a stronger defense, and a flip probability of 0 means that no defense mechanism is applied. Figs. 2 and 3 demonstrate the attack and defense performance on the two datasets. We select two attributes from the “heart disease” dataset, fasting blood sugar and gender, as the attacker’s targets. For the “cardiovascular disease” dataset, we choose smoking and alcohol intake as the target attributes. We observe that the attack accuracy decreases as the flip probability increases, while the testing accuracy degrades only slightly even when the defense level is high.

Figure 3: Attribute attack performance on the “heart disease” dataset. (a) Fasting blood sugar. (b) Gender.

5.2 Model Inversion Attack

5.2.1 Experiment Settings

We evaluate the model inversion attack and the model perturbation-based defense on two public mammography datasets: MIAS [46] and CBIS-DDSM [47]. All images in the MIAS dataset have been padded/clipped to 1024×1024. A total of 280 samples are taken from MIAS for training (181 normal, 57 benign, 42 malignant), while 50 samples are used for testing (26 normal, 12 benign, 12 malignant). We clip and compress all images in CBIS-DDSM to 256×256. A total of 2,326 samples are taken from CBIS-DDSM for training (1,263 benign, 1,063 malignant), while 772 samples are used for testing (419 benign, 353 malignant). We adopt a CNN with 6 convolution layers and 2 fully connected layers on both datasets. Each convolution layer has 32 channels and a kernel size of 3. There is a max-pooling layer after every two convolution layers. The model is split at the 2nd, 4th, and 6th convolution layers. We select Adam as our optimizer and set the learning rate to 0.001.
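A minimal PyTorch sketch of such a CNN; the input size, fully connected width, and class count are illustrative assumptions.

import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions followed by 2x2 max pooling.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
    )

class MammoCNN(nn.Module):
    def __init__(self, num_classes=2, image_size=256):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(1, 32),     # conv layers 1-2, candidate split point "layer 2"
            conv_block(32, 32),    # conv layers 3-4, candidate split point "layer 4"
            conv_block(32, 32),    # conv layers 5-6, candidate split point "layer 6"
        )
        feat = image_size // 8     # three 2x2 max-pooling layers
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * feat * feat, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))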

Figure 4: Recovered MIAS inputs via model inversion attack.
Figure 5: Recovered CBIS-DDSM inputs via model inversion attack.

Table 1: MSE, PSNR, SSIM for model inversion attack with different split layers

                    MIAS                              DDSM
           layer 2   layer 4   layer 6     layer 2   layer 4   layer 6
MSE          2.925    55.042    88.839       1.672    29.649   110.460
PSNR        44.162    31.039    28.995      46.385    33.962    28.269
SSIM         0.999     0.994     0.990       0.999     0.995     0.984

5.2.2 Evaluation Results

Fig. 4 and Fig. 5 show the recovered images via model inversion attack. When the split point is at a shallower layer, the recovered images have high quality. When the split point is at a deeper layer, the recovered images become relatively blurry and lose certain details. However, even in the image recovered from the output of the 6th layer, the details within the breasts can still be clearly identified. An attacker could diagnose the patient’s breast health from the recovered mammography image with the help of classification models or radiologists.

To quantify the attack results, we adopt three metrics: Mean-Square Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and the Structural Similarity Index (SSIM) [48], which are shown in Table 1. MSE reflects pixel-wise similarity, while PSNR measures the pixel-level recovery quality of the image. SSIM measures the human perceptual similarity of two images by considering their luminance, contrast, and structure. It ranges over [0, 1], where 1 represents the most similar. When the split point is at a deeper layer, the recovered inputs have higher MSE and lower PSNR and SSIM, which means the attack becomes harder.
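A minimal sketch of how these metrics can be computed for a pair of images; the use of scikit-image and the 8-bit pixel range are assumptions.

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def recovery_metrics(original, recovered):
    # Both inputs are 2-D grayscale arrays with pixel values in [0, 255].
    mse = float(np.mean((original.astype(float) - recovered.astype(float)) ** 2))
    psnr = peak_signal_noise_ratio(original, recovered, data_range=255)
    ssim = structural_similarity(original, recovered, data_range=255)
    return mse, psnr, ssim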

Fig. 6 and Fig. 7 show the defense performance of model perturbation with different noise scales when the split point is at the 4th layer. We experiment with zero-mean Gaussian noise with scales from 0.02 to 0.05. As the scale increases, the recovered inputs become more blurry and lose more details.

Table 2 shows the inference accuracy of CNN models injected with different scales of noise, as well as the MSE, PSNR, and SSIM metrics for the attack. Non-noise refers to the original trained model without perturbation. As the scale increases, the recovered inputs have higher MSE and lower PSNR and SSIM. Model perturbation reduces the quality of the recovered inputs while only slightly decreasing the inference performance. These results provide an intuitive guide for medical staff to adjust the scale of the noise to defend against the model inversion attack while keeping satisfactory model performance.

Figure 6: Recovered MIAS inputs with and without model perturbation.
Figure 7: Recovered CBIS-DDSM inputs with and without model perturbation.

Table 2: Inference accuracy, MSE, PSNR, and SSIM for the deep learning model with and without model perturbation

                       MIAS                             DDSM
Noise            non      0.02      0.05        non      0.02      0.05
Accuracy        0.62      0.60      0.55      0.618     0.582     0.553
MSE           55.042  1992.121  6163.927     29.649     327.7   4238.92
PSNR          31.039    15.479    10.528     33.962    23.072    11.897
SSIM           0.994     0.714     0.170      0.995     0.608     0.523

6 Conclusion

In this paper, we have evaluated two types of inference attacks on medical images and clinical records and demonstrated that these attacks can infer sensitive attributes of medical health records as well as recover medical images with high fidelity. Our findings expose the risk of privacy leakage when deep learning models are trained on medical data. To mitigate this problem, we proposed inference attack defenses based on label perturbation and model perturbation. Experimental results showed that the proposed defenses can effectively defend against the malicious inference attacks while largely preserving deep learning performance. The experimental results and the approaches presented help raise awareness of the privacy issues of deploying deep learning networks in medicine and potentially open up a new vista to ensure patients’ privacy and confidentiality amid the increasing adoption of AI-enabled information infrastructure in healthcare delivery and medical research.

References

  • [1] R. Miotto, F. Wang, S. Wang, X. Jiang, J. T. Dudley, Deep learning for healthcare: review, opportunities and challenges, Briefings in bioinformatics 19 (6) (2018) 1236–1246.
  • [2] A. Rajkomar, E. Oren, K. Chen, A. M. Dai, N. Hajaj, M. Hardt, P. J. Liu, X. Liu, J. Marcus, M. Sun, et al., Scalable and accurate deep learning with electronic health records, NPJ Digital Medicine 1 (1) (2018) 18.
  • [3] V. Gulshan, L. Peng, M. Coram, M. C. Stumpe, D. Wu, A. Narayanaswamy, S. Venugopalan, K. Widner, T. Madams, J. Cuadros, et al., Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs, JAMA 316 (22) (2016) 2402–2410.
  • [4] A. Yala, C. Lehman, T. Schuster, T. Portnoi, R. Barzilay, A deep learning mammography-based model for improved breast cancer risk prediction, Radiology 292 (1) (2019) 60–66.
  • [5] M. Fredrikson, S. Jha, T. Ristenpart, Model inversion attacks that exploit confidence information and basic countermeasures, in: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 2015, pp. 1322–1333.
  • [6] R. Shokri, M. Stronati, C. Song, V. Shmatikov, Membership inference attacks against machine learning models, in: 2017 IEEE Symposium on Security and Privacy (SP), IEEE, 2017, pp. 3–18.
  • [7] Z. He, T. Zhang, R. B. Lee, Model inversion attacks against collaborative inference, in: Proceedings of the 35th Annual Computer Security Applications Conference, 2019, pp. 148–162.
  • [8] M. Nasr, R. Shokri, A. Houmansadr, Machine learning with membership privacy using adversarial regularization, in: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2018, pp. 634–646.
  • [9] C. Song, T. Ristenpart, V. Shmatikov, Machine learning models that remember too much, in: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, 2017, pp. 587–601.
  • [10] N. Z. Gong, B. Liu, You are who you know and how you behave: Attribute inference attacks via users’ social friends and behaviors, in: 25th USENIX Security Symposium (USENIX Security 16), 2016, pp. 979–995.
  • [11] N. Z. Gong, B. Liu, Attribute inference attacks in online social networks, ACM Transactions on Privacy and Security (TOPS) 21 (1) (2018) 1–30.
  • [12] B. Mei, Y. Xiao, R. Li, H. Li, X. Cheng, Y. Sun, Image and attribute based convolutional neural network inference attacks in social networks, IEEE Transactions on Network Science and Engineering.
  • [13] J. Qian, X.-Y. Li, C. Zhang, L. Chen, De-anonymizing social networks and inferring private attributes using knowledge graphs, in: IEEE INFOCOM 2016-The 35th Annual IEEE International Conference on Computer Communications, IEEE, 2016, pp. 1–9.
  • [14] M. Fredrikson, E. Lantz, S. Jha, S. Lin, D. Page, T. Ristenpart, Privacy in pharmacogenetics: An end-to-end case study of personalized warfarin dosing, in: 23rd USENIX Security Symposium (USENIX Security 14), San Diego, CA, 2014.
  • [15] B. Hitaj, G. Ateniese, F. Perez-Cruz, Deep models under the gan: information leakage from collaborative deep learning, in: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, 2017, pp. 603–618.
  • [16] Y. Chen, S. Wang, D. She, S. Jana, On training robust PDF malware classifiers, in: 29th USENIX Security Symposium (USENIX Security 20), USENIX Association, Boston, MA, 2020.
    URL https://www.usenix.org/conference/usenixsecurity20/presentation/chen-yizheng
  • [17] K. Ganju, Q. Wang, W. Yang, C. A. Gunter, N. Borisov, Property inference attacks on fully connected neural networks using permutation invariant representations, in: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2018, pp. 619–633.
  • [18] T. Gu, B. Dolan-Gavitt, S. Garg, Badnets: Identifying vulnerabilities in the machine learning model supply chain, arXiv preprint arXiv:1708.06733.
  • [19] W. Guo, D. Mu, J. Xu, P. Su, G. Wang, X. Xing, Lemna: Explaining deep learning based security applications, in: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2018, pp. 364–379.
  • [20] M. Jagielski, N. Carlini, D. Berthelot, A. Kurakin, N. Papernot, High accuracy and high fidelity extraction of neural networks, in: 29th USENIX Security Symposium (USENIX Security 20), 2020.
  • [21] Y. Ji, X. Zhang, S. Ji, X. Luo, T. Wang, Model-reuse attacks on deep learning systems, in: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2018, pp. 349–363.
  • [22] K. Leino, M. Fredrikson, Stolen memories: Leveraging model memorization for calibrated white-box membership inference, arXiv preprint arXiv:1906.11798.
  • [23] J. Li, N. Li, B. Ribeiro, Membership inference attacks and defenses in supervised learning via generalization gap, arXiv preprint arXiv:2002.12062.
  • [24] Z. Li, C. Hu, Y. Zhang, S. Guo, How to prove your model belongs to you: a blind-watermark based framework to protect intellectual property of dnn, in: Proceedings of the 35th Annual Computer Security Applications Conference, 2019, pp. 126–137.
  • [25] X. Ling, S. Ji, J. Zou, J. Wang, C. Wu, B. Li, T. Wang, Deepsec: A uniform platform for security analysis of deep learning model, in: 2019 IEEE Symposium on Security and Privacy (SP), IEEE, 2019, pp. 673–690.
  • [26] Y. Liu, S. Ma, Y. Aafer, W.-C. Lee, J. Zhai, W. Wang, X. Zhang, Trojaning attack on neural networks, in: 25th Annual Network and Distributed System Security Symposium, NDSS 2018, San Diego, California, USA, February 18-21, 2018, The Internet Society, 2018.
  • [27] S. J. Oh, B. Schiele, M. Fritz, Towards reverse-engineering black-box neural networks, in: Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, Springer, 2019, pp. 121–144.
  • [28] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, A. Swami, Practical black-box attacks against machine learning, in: Proceedings of the 2017 ACM on Asia conference on computer and communications security, 2017, pp. 506–519.
  • [29] N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, A. Swami, The limitations of deep learning in adversarial settings, in: 2016 IEEE European symposium on security and privacy (EuroS&P), IEEE, 2016, pp. 372–387.
  • [30] N. Papernot, P. McDaniel, A. Sinha, M. P. Wellman, Sok: Security and privacy in machine learning, in: 2018 IEEE European Symposium on Security and Privacy (EuroS&P), IEEE, 2018, pp. 399–414.
  • [31] E. Quiring, A. Maier, K. Rieck, Misleading authorship attribution of source code using adversarial learning, in: 28th USENIX Security Symposium (USENIX Security 19), 2019, pp. 479–496.
  • [32] A. Salem, R. Wen, M. Backes, S. Ma, Y. Zhang, Dynamic backdoor attacks against machine learning models, arXiv preprint arXiv:2003.03675.
  • [33] A. Shafahi, W. R. Huang, M. Najibi, O. Suciu, C. Studer, T. Dumitras, T. Goldstein, Poison frogs! targeted clean-label poisoning attacks on neural networks, in: Advances in Neural Information Processing Systems, 2018, pp. 6103–6113.
  • [34] D. She, Y. Chen, A. Shah, B. Ray, S. Jana, Neutaint: Efficient dynamic taint analysis with neural networks, in: 2020 IEEE Symposium on Security and Privacy (SP), 2020, pp. 364–380.
  • [35] F. Tramèr, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, P. McDaniel, Ensemble adversarial training: Attacks and defenses, in: International Conference on Learning Representations, 2018.
  • [36] F. Tramèr, F. Zhang, A. Juels, M. K. Reiter, T. Ristenpart, Stealing machine learning models via prediction apis, in: 25th USENIX Security Symposium (USENIX Security 16), 2016, pp. 601–618.
  • [37] B. Wang, N. Z. Gong, Stealing hyperparameters in machine learning, in: 2018 IEEE Symposium on Security and Privacy (SP), IEEE, 2018, pp. 36–52.
  • [38] B. Wang, Y. Yao, S. Shan, H. Li, B. Viswanath, H. Zheng, B. Y. Zhao, Neural cleanse: Identifying and mitigating backdoor attacks in neural networks, in: 2019 IEEE Symposium on Security and Privacy (SP), IEEE, 2019, pp. 707–723.
  • [39] M. Sabt, M. Achemlal, A. Bouabdallah, Trusted execution environment: what it is, and what it is not, in: 2015 IEEE Trustcom/BigDataSE/ISPA, Vol. 1, IEEE, 2015, pp. 57–64.
  • [40] A. Acar, H. Aksu, A. S. Uluagac, M. Conti, A survey on homomorphic encryption schemes: Theory and implementation, ACM Computing Surveys (CSUR) 51 (4) (2018) 1–35.
  • [41] J. Ding, S. M. Errapotu, H. Zhang, Y. Gong, M. Pan, Z. Han, Stochastic admm based distributed machine learning with differential privacy, in: International conference on security and privacy in communication systems, Springer, 2019, pp. 257–277.
  • [42] P. Vepakomma, O. Gupta, T. Swedish, R. Raskar, Split learning for health: Distributed deep learning without sharing raw patient data, arXiv preprint arXiv:1812.00564.
  • [43] E. Li, L. Zeng, Z. Zhou, X. Chen, Edge ai: On-demand accelerating deep neural network inference via edge computing, IEEE Transactions on Wireless Communications.
  • [44] S. L. Warner, Randomized response: A survey technique for eliminating evasive answer bias, Journal of the American Statistical Association 60 (309) (1965) 63–69.
  • [45] R. Detrano, A. Janosi, W. Steinbrunn, M. Pfisterer, J.-J. Schmid, S. Sandhu, K. H. Guppy, S. Lee, V. Froelicher, International application of a new probability algorithm for the diagnosis of coronary artery disease, The American journal of cardiology 64 (5) (1989) 304–310.
  • [46] J. Suckling, et al., The mammographic image analysis society digital mammogram database, Digital Mammography (1994) 375–386.
  • [47] R. S. Lee, F. Gimenez, A. Hoogi, K. K. Miyake, M. Gorovoy, D. L. Rubin, A curated mammography data set for use in computer-aided detection and diagnosis research, Scientific data 4 (2017) 170177.
  • [48] A. Hore, D. Ziou, Image quality metrics: Psnr vs. ssim, in: 2010 20th international conference on pattern recognition, IEEE, 2010, pp. 2366–2369.