Privacy Against Hypothesis-Testing Adversaries for Quantum Computing
Abstract
A novel definition of data privacy for quantum computing, based on quantum hypothesis testing, is presented in this paper. The parameters in this privacy notion possess an operational interpretation based on the success or failure of an omnipotent adversary attempting to distinguish, using arbitrary measurements on quantum states, the private categories to which the data belongs. Important post-processing and composition properties are then proved for the new notion of privacy. The relationship between privacy against hypothesis-testing adversaries, defined in this paper, and quantum differential privacy is then examined. It is shown that these definitions are intertwined in some parameter regimes. This enables us to provide an interpretation for the privacy budget in quantum differential privacy based on its relationship with privacy against hypothesis-testing adversaries.
I Introduction
Quantum computing algorithms have garnered significant attention due to their considerable speedups on several classically difficult problems, such as integer factorisation [1]. These breakthroughs and the added attention have paved the way for the development of new algorithms for big-data processing, such as quantum machine learning [2, 3, 4]. However, data processing can result in unintended information leakage [5]. This is an important issue because, as quantum hardware becomes more commercially available, these algorithms can be implemented on real-world sensitive, private, or proprietary datasets. Therefore, there is a need for frameworks to better understand private information leakage in quantum computing algorithms and to construct privacy-preserving algorithms.
In the classical computing literature, differential privacy has become the gold standard of privacy analysis and private algorithm design [6, 7, 8]. This is often attributed to the fact that differential privacy makes minimal assumptions about the data (e.g., its range rather than its distribution) and meets the important properties of post-processing and composition [9]. Although it possesses powerful guarantees, differential privacy has been polarizing [10, 11, 12]. Criticisms surrounding the conservativeness of differential privacy have motivated a host of studies on privacy in information theory that can handle the privacy-utility trade-off better in certain situations [13, 14, 15, 16, 17]. In fact, the adoption of hypothesis-testing and estimation-based adversaries has been proposed as a less conservative alternative to differential privacy by social scientists following the implementation of differential privacy in the 2020 United States Decennial Census of Population and Housing [10]. Nonetheless, differential privacy has recently been extended to quantum computing [18, 19, 20]. However, very little attention has been paid to other forms of privacy in quantum systems. In this paper, we investigate privacy against hypothesis-testing adversaries. This is of particular interest to us due to the need for an operational, intuitive notion of privacy with real-world interpretations for privacy analysis and guarantees, which is somewhat absent in the differential privacy literature.
In this paper, we propose a novel definition of data privacy for quantum computing based on quantum hypothesis testing. The design parameters in this notion of privacy possess an operational interpretation (accessible even to general lay users) based on the success or failure of an omnipotent adversary attempting to distinguish the private class to which the data belongs (e.g., suffering from a certain disease in health datasets, or belonging to the training dataset in membership-inference attacks) using arbitrary measurement operators. We prove two important properties for the new notion of privacy: post-processing and composition. These properties are highly sought-after in privacy definitions [18] and information-leakage metrics [13]. Subsequently, we investigate the relationship between privacy against hypothesis-testing adversaries and quantum differential privacy. This enables us to provide an interpretation for the parameters of differential privacy based on its relationship with privacy against hypothesis-testing adversaries in certain parameter regimes. We finally investigate the effectiveness of differential privacy against hypothesis-testing adversaries.
The remainder of this paper is organized as follows. We provide a review of basic concepts in quantum computing and information in Section II. The definition of and results on privacy against hypothesis-testing adversaries are presented in Section III. Section IV presents quantum differential privacy and its relationship with privacy against hypothesis-testing adversaries. Finally, we present some concluding remarks and future directions for research in Section V.
II Quantum states and channels
The definitions and preliminary results in this review section are mostly borrowed from [21]. When the results or definitions are from outside this source, appropriate citations are presented.
A quantum system is modelled by a Hilbert space $\mathcal{H}$, i.e., a complex vector space, equipped with an inner product, that is complete with respect to the norm defined by the inner product. Throughout this paper, Dirac's notation is used to denote quantum states. That is, a pure quantum state, which is an element (i.e., vector) of the Hilbert space with unit norm, is denoted by a 'ket' $|\cdot\rangle$, e.g., $|\psi\rangle$. The inner product of two states $|\psi\rangle$ and $|\phi\rangle$ is denoted by $\langle\psi|\phi\rangle$. Here, the 'bra' $\langle\psi|$ is used to refer to the conjugate transpose of $|\psi\rangle$, so that $\langle\psi|\phi\rangle = (|\psi\rangle)^\dagger|\phi\rangle$.
The basic element of interest in quantum information theory is the quantum bit, which is often referred to as the qubit. A qubit is a 2-dimensional quantum state. Any qubit can be written in terms of the so-called computational basis $|0\rangle$ and $|1\rangle$, which form an orthonormal basis for the two-dimensional Hilbert space modelling the qubit; that is, any qubit can be written as $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ with $\alpha,\beta \in \mathbb{C}$ such that $|\alpha|^2 + |\beta|^2 = 1$. The combination of two qubits $|\psi\rangle$ and $|\phi\rangle$ is denoted by their tensor product $|\psi\rangle \otimes |\phi\rangle$, where $\otimes$ is the Kronecker or tensor product. For the sake of brevity, we sometimes refer to $|\psi\rangle \otimes |\phi\rangle$ as $|\psi\rangle|\phi\rangle$ or $|\psi\phi\rangle$. When two qubits $|\psi\rangle$ and $|\phi\rangle$ belong to or are assigned to two distinct registers $A$ and $B$ (e.g., qubits used by two separate parties), and this information is either unclear from the context or must be emphasized, we write $|\psi\rangle_A \otimes |\phi\rangle_B$ or $|\psi\rangle_A|\phi\rangle_B$. A quantum (logic) gate is any unitary operator $U$, i.e., such that $U^\dagger U = U U^\dagger = I$, that acts on a quantum state. Note that, here, $U^\dagger$ denotes the conjugate transpose of $U$.
A mixed quantum state is represented by an ensemble $\{(p_i, |\psi_i\rangle)\}_i$ such that $p_i \ge 0$ for all $i$ and $\sum_i p_i = 1$. A mixed quantum state implies that the quantum system is in pure state $|\psi_i\rangle$ with probability $p_i$ for all $i$. A convenient way to model and analyse mixed quantum states is to use density operators. The density operator corresponding to ensemble $\{(p_i, |\psi_i\rangle)\}_i$ is given by $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$. Evidently, by construction, $\rho \succeq 0$ and $\mathrm{tr}(\rho) = 1$. Note that pure quantum states can also be modelled using the rank-one density operator $\rho = |\psi\rangle\langle\psi|$. Therefore, there is no loss of generality in working with density operators, even when dealing with pure quantum states. The combination of two density operators $\rho$ and $\sigma$ is denoted by their tensor product $\rho \otimes \sigma$.
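As an illustration (ours, not part of the original development), the density-operator construction can be checked numerically; the two-element qubit ensemble below is an arbitrary example:

```python
import numpy as np

# Ensemble {(p_i, |psi_i>)}: the system is in |psi_i> with probability p_i.
# Hypothetical example: |0> with probability 0.25 and |+> with probability 0.75.
ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)

probs = [0.25, 0.75]
kets = [ket0, ket_plus]

# rho = sum_i p_i |psi_i><psi_i|
rho = sum(p * np.outer(k, k.conj()) for p, k in zip(probs, kets))

print(np.trace(rho).real)  # 1.0: density operators have unit trace
```

The resulting matrix is positive semi-definite with unit trace, as required of any density operator.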
A basic operation in quantum systems is measurement, which enables extraction of information about the quantum state of the system. A measurement is modelled by a set of operators $\{M_m\}_m$ with the normalization constraint $\sum_m M_m^\dagger M_m = I$. By performing measurement $\{M_m\}_m$ on a quantum system with state $\rho$, we observe output $m$ with probability $\mathrm{tr}(M_m \rho M_m^\dagger)$, in which case, after the measurement, the state of the quantum system is $M_m \rho M_m^\dagger / \mathrm{tr}(M_m \rho M_m^\dagger)$. When the post-measurement state of the quantum system is of no interest, we can use the positive operator-valued measure (POVM) framework, which is a set of positive semi-definite Hermitian matrices $\{E_m\}_m$ such that $\sum_m E_m = I$. In this case, the probability of obtaining output $m$ when taking a measurement of a system with quantum state $\rho$ is given by $\mathrm{tr}(E_m \rho)$.
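The POVM outcome rule can be sketched numerically (our illustration, with the computational-basis measurement and the state $|+\rangle$ as arbitrary choices):

```python
import numpy as np

# POVM {E_m}: positive semi-definite operators summing to the identity.
# Here: measurement in the computational basis of a single qubit.
E0 = np.array([[1.0, 0.0], [0.0, 0.0]])
E1 = np.array([[0.0, 0.0], [0.0, 1.0]])
assert np.allclose(E0 + E1, np.eye(2))  # normalization: sum_m E_m = I

# State |+><+|
plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = np.outer(plus, plus)

# Pr[m] = tr(E_m rho)
p = [np.trace(E @ rho).real for E in (E0, E1)]
print(p)  # approximately [0.5, 0.5]
```

Measuring $|+\rangle$ in the computational basis yields each outcome with probability one half, as expected.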
A quantum channel is the most general quantum operation. A quantum channel is a mapping from the space of density operators to potentially another space of density operators that is both completely positive and trace preserving. Quantum channels model open quantum systems, i.e., quantum systems that interact with an environment, and can thus model noisy quantum behaviours. According to the Choi-Kraus theorem [21, Theorem 4.4.1], for each quantum channel $\mathcal{N}$, there exists a family of linear operators $\{K_j\}_{j=1}^{J}$ for some $J$ such that $\sum_{j=1}^{J} K_j^\dagger K_j = I$ and $\mathcal{N}(\rho) = \sum_{j=1}^{J} K_j \rho K_j^\dagger$ for all density operators $\rho$. This is referred to as the Kraus representation of quantum channels. For instance, a quantum (logic) gate with unitary operator $U$ can be represented by $\mathcal{N}(\rho) = U \rho U^\dagger$. Similarly, if we discard or delete the outcome of measurement $\{M_m\}_m$, the quantum state transition can be modelled by the quantum channel $\mathcal{N}(\rho) = \sum_m M_m \rho M_m^\dagger$. We define the tensor product of quantum channels $\mathcal{N}_1$ and $\mathcal{N}_2$ as $(\mathcal{N}_1 \otimes \mathcal{N}_2)(\rho \otimes \sigma) = \mathcal{N}_1(\rho) \otimes \mathcal{N}_2(\sigma)$ for all density operators $\rho$ and $\sigma$.
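A channel in Kraus form is easy to apply numerically. The sketch below (ours; the bit-flip channel and the flip probability 0.2 are arbitrary illustrative choices) also verifies the completeness condition of the Choi-Kraus theorem:

```python
import numpy as np

def apply_channel(kraus, rho):
    """Apply a channel in Kraus form: N(rho) = sum_j K_j rho K_j^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus)

# Example: bit-flip channel with flip probability q = 0.2 (illustrative value).
q = 0.2
X = np.array([[0.0, 1.0], [1.0, 0.0]])
kraus = [np.sqrt(1 - q) * np.eye(2), np.sqrt(q) * X]

# Completeness: sum_j K_j^dagger K_j = I (trace preservation)
assert np.allclose(sum(K.conj().T @ K for K in kraus), np.eye(2))

rho = np.array([[1.0, 0.0], [0.0, 0.0]])  # |0><0|
out = apply_channel(kraus, rho)
print(out)  # approximately diag(0.8, 0.2)
```

The input $|0\rangle\langle 0|$ is flipped to $|1\rangle\langle 1|$ with probability 0.2, and the output remains a valid density operator.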
The trace norm or Schatten 1-norm of any linear operator $A$ is defined as $\|A\|_1 := \mathrm{tr}(\sqrt{A^\dagger A})$. Based on this, we can define the trace distance between any two density operators $\rho$ and $\sigma$ as $T(\rho,\sigma) := \frac{1}{2}\|\rho - \sigma\|_1$. Recall that density operators belong to the set of linear operators (i.e., matrices). The distance is equal to zero when the two quantum states are equal, and it attains its maximum value of one when the two quantum states have support on orthogonal subspaces. For $\varepsilon \in [0,1)$, the $\varepsilon$-relative entropy (also known as the hypothesis-testing relative entropy) between two quantum states $\rho$ and $\sigma$ is defined as $D_H^\varepsilon(\rho\|\sigma) := -\log_2 \inf\{\mathrm{tr}(Q\sigma) : 0 \preceq Q \preceq I,\ \mathrm{tr}(Q\rho) \ge 1-\varepsilon\}$. The $\varepsilon$-relative entropy satisfies a few important properties that we will use in this paper. These properties are borrowed from [22]. First, $D_H^\varepsilon(\rho\|\sigma) \ge 0$, with equality if $\rho = \sigma$ and $\varepsilon = 0$. Second, the $\varepsilon$-relative entropy enjoys the data-processing inequality, i.e., $D_H^\varepsilon(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)) \le D_H^\varepsilon(\rho\|\sigma)$ for all density operators $\rho,\sigma$ and all quantum channels $\mathcal{N}$. Also, $D_H^\varepsilon(\rho\|\sigma) \le (D(\rho\|\sigma) + h_b(\varepsilon))/(1-\varepsilon)$, where $h_b$ is the binary entropy function and $D(\rho\|\sigma)$ is the usual relative entropy in quantum information theory. The $\varepsilon$-relative entropy and the trace distance also satisfy a relationship established in [23]. The smooth max-relative entropy is defined as $D_{\max}^\delta(\rho\|\sigma) := \inf_{\rho' \in B_\delta(\rho)} D_{\max}(\rho'\|\sigma)$, where $D_{\max}(\rho\|\sigma) := \log_2 \inf\{\lambda : \rho \preceq \lambda \sigma\}$ and $B_\delta(\rho) := \{\rho' \succeq 0 : \mathrm{tr}(\rho') = 1,\ T(\rho,\rho') \le \delta\}$.
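Since the difference of two density operators is Hermitian, the trace distance can be computed from its eigenvalues. The following sketch (our illustration, using orthogonal pure states as an extreme example) checks the two boundary cases just mentioned:

```python
import numpy as np

def trace_distance(rho, sigma):
    """T(rho, sigma) = (1/2) ||rho - sigma||_1, computed via the
    eigenvalues of the Hermitian difference operator."""
    eigs = np.linalg.eigvalsh(rho - sigma)
    return 0.5 * np.sum(np.abs(eigs))

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
rho = np.outer(ket0, ket0)    # |0><0|
sigma = np.outer(ket1, ket1)  # |1><1|

print(trace_distance(rho, rho))    # 0.0: identical states
print(trace_distance(rho, sigma))  # 1.0: states with orthogonal supports
```

These are exactly the minimum and maximum values of the trace distance described above.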
The depolarizing channel is an important type of quantum noise that is represented by

$\mathcal{N}_p(\rho) = (1-p)\,\rho + p\,\dfrac{I}{d},$ (1)

where $d$ is the dimension of the Hilbert space to which the system belongs and $p \in [0,1]$ is a probability parameter.
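The channel in (1) can be implemented directly (our sketch; the input state and $p = 0.5$ are arbitrary illustrative choices):

```python
import numpy as np

def depolarize(rho, p):
    """Depolarizing channel of (1): keep rho with probability 1 - p,
    replace it with the maximally mixed state I/d with probability p."""
    d = rho.shape[0]
    return (1 - p) * rho + p * np.eye(d) / d

rho = np.array([[1.0, 0.0], [0.0, 0.0]])  # |0><0|
out = depolarize(rho, 0.5)
print(out)  # diag(0.75, 0.25): the state is pulled toward I/2
```

Note that the output is a convex combination of the input and the maximally mixed state, so trace and positivity are preserved automatically.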
III Quantum Hypothesis Testing and Privacy
Consider a quantum hypothesis-testing scenario where a decision maker aims to distinguish between two quantum states $\rho$ (null hypothesis) and $\sigma$ (alternative hypothesis). This is done by performing the POVM $\{Q_\rho, Q_\sigma\}$ with $Q_\rho = Q$ and $Q_\sigma = I - Q$ for some $0 \preceq Q \preceq I$. If the measurement outcome corresponding to the operator $Q_\rho$ is realized, the decision maker guesses that the state is $\rho$ while, if the measurement outcome corresponding to the operator $Q_\sigma$ is realized, the decision maker guesses that the state is $\sigma$. The probability of a type-I error (false positive) is equal to

$\alpha(Q) := \mathrm{tr}((I-Q)\rho).$ (2)

The probability of a type-II error (false negative) is given by

$\beta(Q) := \mathrm{tr}(Q\sigma).$ (3)

The optimal test, which seeks to minimize the false-negative probability subject to a constraint on maintaining the false-positive probability below $\varepsilon$, is given by

$\beta_\varepsilon(\rho\|\sigma) := \inf_{Q}\ \mathrm{tr}(Q\sigma)$ (4a)
$\text{s.t.}\ \mathrm{tr}(Q\rho) \ge 1-\varepsilon,$ (4b)
$0 \preceq Q \preceq I.$ (4c)
This is referred to as asymmetric quantum hypothesis testing [24]. The following well-known result (see, e.g., [22]) can be easily derived from the definitions of $\beta_\varepsilon(\rho\|\sigma)$ and the $\varepsilon$-relative entropy $D_H^\varepsilon(\rho\|\sigma)$.
Proposition 1
$\beta_\varepsilon(\rho\|\sigma) = 2^{-D_H^\varepsilon(\rho\|\sigma)}$.
Proof:
Note that, by the definition of the $\varepsilon$-relative entropy, $D_H^\varepsilon(\rho\|\sigma) = -\log_2 \beta_\varepsilon(\rho\|\sigma)$, and thus $\beta_\varepsilon(\rho\|\sigma) = 2^{-D_H^\varepsilon(\rho\|\sigma)}$. This concludes the proof. ∎
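For commuting (simultaneously diagonalizable) states, the optimization in (4) reduces to a classical Neyman-Pearson problem over the eigenvalue distributions, which can be solved greedily by accepting outcomes in decreasing order of likelihood ratio. The sketch below is our illustration (the distributions are arbitrary examples, and strictly positive entries are assumed); up to the choice of logarithm base, $-\log \beta_\varepsilon$ then recovers the $\varepsilon$-relative entropy:

```python
import numpy as np

def beta(eps, p, q):
    """Minimal type-II error beta_eps(p || q) for commuting (diagonal) states:
        minimize sum_i q_i t_i  s.t.  sum_i p_i t_i >= 1 - eps, 0 <= t_i <= 1.
    Greedy solution: accept outcomes with the largest likelihood ratio
    p_i / q_i first (p_i, q_i > 0 assumed)."""
    order = np.argsort(-p / q)          # descending likelihood ratio
    need, cost = 1.0 - eps, 0.0
    for i in order:
        if need <= 0:
            break
        t = min(1.0, need / p[i])       # fractional acceptance at the boundary
        cost += t * q[i]
        need -= t * p[i]
    return cost

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])

print(beta(0.1, p, q))  # approximately 0.82
print(beta(0.0, p, q))  # 1.0: with eps = 0, every p-outcome must be accepted
```

Since $q$ has full support here, the zero-false-positive test must accept all outcomes, giving $\beta_0 = 1$.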
Alternatively, a combination of the false-positive and false-negative probabilities can be minimized:

$P_{\mathrm{err}}(\rho\|\sigma) := \inf_{0 \preceq Q \preceq I} \big\{\kappa\,\mathrm{tr}((I-Q)\rho) + (1-\kappa)\,\mathrm{tr}(Q\sigma)\big\},$

where $\kappa$ and $1-\kappa$, respectively, denote the prior probability that quantum state $\rho$ and the prior probability that quantum state $\sigma$ are prepared. Clearly, by construction, $P_{\mathrm{err}}(\rho\|\sigma) \ge 0$. This is referred to as symmetric quantum hypothesis testing [24].
Theorem 1 (Helstrom-Holevo theorem [21, p. 254-255])
$P_{\mathrm{err}}(\rho\|\sigma) = \frac{1}{2}\big(1 - \|\kappa\rho - (1-\kappa)\sigma\|_1\big)$.
The most indistinguishable quantum states are $\rho = \sigma$. In this case, a decision maker would not be able to identify the quantum states because their observable statistics are equivalent. Therefore, we can define $P_{\mathrm{err}}^{\max} := P_{\mathrm{err}}(\rho\|\rho) = \frac{1}{2}(1 - |2\kappa - 1|) = \min(\kappa, 1-\kappa)$. Therefore, for general density operators, we have $P_{\mathrm{err}}(\rho\|\sigma) \le P_{\mathrm{err}}^{\max} = \min(\kappa, 1-\kappa)$.
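The Helstrom-Holevo error can be evaluated numerically from the eigenvalues of $\kappa\rho - (1-\kappa)\sigma$. The sketch below (ours; the states $|0\rangle$ and $|+\rangle$ and equal priors are arbitrary illustrative choices) also checks the identical-state case:

```python
import numpy as np

def helstrom_error(rho, sigma, kappa=0.5):
    """Minimal average error kappa*alpha + (1-kappa)*beta for symmetric
    hypothesis testing: (1/2) * (1 - ||kappa*rho - (1-kappa)*sigma||_1)."""
    M = kappa * rho - (1 - kappa) * sigma
    return 0.5 * (1.0 - np.sum(np.abs(np.linalg.eigvalsh(M))))

ket0 = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho, sigma = np.outer(ket0, ket0), np.outer(plus, plus)

# Equal priors: error = (1 - T(rho, sigma)) / 2 and T(|0>, |+>) = 1/sqrt(2).
print(helstrom_error(rho, sigma))  # approximately 0.146
print(helstrom_error(rho, rho))    # 0.5: identical states reduce to a coin flip
```

The second output matches the bound above: with equal priors, $\min(\kappa, 1-\kappa) = 1/2$.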
In quantum data privacy, it is desired to protect the quantum state of a system (which is being used for quantum computation) from being accurately estimated. Particularly, given a quantum state $\rho$, we want to make sure that no decision maker can identify whether the quantum state of the system is $\rho$ or another similar quantum state $\sigma$. Similarity is modelled or captured using the neighbourhood relationship, cf. differential privacy [20].
Definition 1 (Neighbouring Relationship)
A neighbouring or similarity relationship $\sim$ over the set of density operators is a mathematical relation that is both reflexive and symmetric. The notation $\rho \sim \sigma$ signifies that two quantum states $\rho$ and $\sigma$ are neighbouring or similar. Note that, by definition, $\rho \sim \rho$ (reflexivity) and $\rho \sim \sigma$ implies $\sigma \sim \rho$ (symmetry).
An example of a neighbouring or similarity relationship is the notion defined using the trace distance in [18]. In this case, we say $\rho \sim \sigma$ if and only if their trace distance $T(\rho,\sigma)$ is at most some fixed constant. However, we may select another notion of similarity that ensures that two quantum states are neighbouring if they are constructed based on two private datasets that differ in the data of one individual. Such a definition is well-suited for quantum machine learning with privacy guarantees [25].
Definition 2 ($(\varepsilon,\delta)$-Privacy Against Hypothesis-Testing Adversaries)
For any $\varepsilon \in [0,1)$ and $\delta \ge 0$, a quantum channel $\mathcal{N}$ is $(\varepsilon,\delta)$-private (against hypothesis-testing adversaries) if $D_H^\varepsilon(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)) \le \delta$ for all neighbouring states $\rho \sim \sigma$.
This definition implies that, if two states are similar ($\rho \sim \sigma$), a quantum channel is private if it makes distinguishing the reported or output states $\mathcal{N}(\rho)$ and $\mathcal{N}(\sigma)$ difficult for any decision maker. In fact, Proposition 1 shows that the probability of false negatives for any detection mechanism is lower bounded by $2^{-\delta}$ if the probability of false positives is bounded by $\varepsilon$. Therefore, as $\delta$ tends to zero (the privacy guarantee is strengthened/the privacy budget is reduced), the probability of false negatives moves towards one (i.e., the decision maker would become overwhelmed by false negatives).
Proposition 2
Assume that a quantum channel $\mathcal{N}$ is $(\varepsilon,\delta)$-private. Then, the quantum channel is $(\varepsilon',\delta')$-private if $\varepsilon' \le \varepsilon$ and $\delta' \ge \delta$.
Proof:
First note that, if $\varepsilon' \le \varepsilon$, we have $D_H^{\varepsilon'}(\rho\|\sigma) \le D_H^{\varepsilon}(\rho\|\sigma)$,
where the inequality follows from the fact that the feasible set of (4) shrinks as $\varepsilon$ decreases. Therefore, for all $\rho \sim \sigma$, we get $D_H^{\varepsilon'}(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)) \le D_H^{\varepsilon}(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)) \le \delta \le \delta'$. ∎
The following corollary, building on Proposition 2, shows that $(\varepsilon,0)$-privacy against hypothesis-testing adversaries is the strongest notion of privacy and, thus, $(\varepsilon,\delta)$-privacy can be thought of as a relaxation of $(\varepsilon,0)$-privacy.
Corollary 1
Assume that a quantum channel $\mathcal{N}$ is $(\varepsilon,0)$-private. Then, the quantum channel is $(\varepsilon,\delta)$-private for all $\delta \ge 0$.
Although privacy is here defined in terms of asymmetric quantum hypothesis testing, we prove the following important bound on the power of symmetric quantum hypothesis testing.
Theorem 2
For any -private quantum channel ,
(5) |
where
Proof:
Theorem 2 shows that, by decreasing $\delta$, the combined probability of false positives and false negatives, denoted by $P_{\mathrm{err}}$, increases towards its maximum value. Figure 1 illustrates the lower bound on $P_{\mathrm{err}}$ versus the privacy budget $\delta$ for various parameter choices. As expected, reducing the privacy budget strengthens the privacy guarantees.
It is stipulated that any useful notion of privacy should admit two important properties: post-processing and composition [20]. In the remainder of this section, we discuss these properties and their application to privacy against hypothesis-testing adversaries.
Theorem 3 (Post Processing)
Let $\mathcal{N}$ be any $(\varepsilon,\delta)$-private quantum channel and $\mathcal{M}$ be an arbitrary quantum channel. Then, $\mathcal{M} \circ \mathcal{N}$ is $(\varepsilon,\delta)$-private.
Proof:
The proof follows from the data-processing inequality: for all $\rho \sim \sigma$ and all quantum channels $\mathcal{M}$, $D_H^\varepsilon(\mathcal{M}(\mathcal{N}(\rho))\|\mathcal{M}(\mathcal{N}(\sigma))) \le D_H^\varepsilon(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)) \le \delta$ [22]. ∎
Theorem 3 shows that an adversary cannot weaken the privacy guarantees by processing the received quantum information in any way.
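The post-processing principle can be sanity-checked numerically: processing both outputs through a further channel can only shrink (never grow) their trace distance, and hence the adversary's distinguishing advantage. The sketch below is our illustration, using the depolarizing channel with an arbitrary noise level of 0.3 as the post-processing step:

```python
import numpy as np

def trace_distance(rho, sigma):
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))

def depolarize(rho, p):
    d = rho.shape[0]
    return (1 - p) * rho + p * np.eye(d) / d

rng = np.random.default_rng(0)

def random_state():
    """Random qubit density operator via A A^dagger / tr(A A^dagger)."""
    A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

for _ in range(100):
    rho, sigma = random_state(), random_state()
    before = trace_distance(rho, sigma)
    after = trace_distance(depolarize(rho, 0.3), depolarize(sigma, 0.3))
    assert after <= before + 1e-12  # post-processing cannot help the adversary
print("data-processing inequality held on 100 random pairs")
```

For the depolarizing channel the contraction is in fact linear: the output distance is exactly $(1-p)$ times the input distance.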
Theorem 4 (Composition)
Let $\mathcal{N}_1$ be any $(0,\delta_1)$-private and $\mathcal{N}_2$ be any $(0,\delta_2)$-private quantum channel. Assume that $\rho_1 \otimes \rho_2 \sim \sigma_1 \otimes \sigma_2$ if $\rho_1 \sim \sigma_1$ and $\rho_2 \sim \sigma_2$. Then, $\mathcal{N}_1 \otimes \mathcal{N}_2$ is $(0,\delta_1+\delta_2)$-private.
Proof:
Using the additivity results in [26, Appendix A], we get $D_H^0((\mathcal{N}_1 \otimes \mathcal{N}_2)(\rho_1 \otimes \rho_2)\|(\mathcal{N}_1 \otimes \mathcal{N}_2)(\sigma_1 \otimes \sigma_2)) = D_H^0(\mathcal{N}_1(\rho_1)\|\mathcal{N}_1(\sigma_1)) + D_H^0(\mathcal{N}_2(\rho_2)\|\mathcal{N}_2(\sigma_2))$. Therefore, if $D_H^0(\mathcal{N}_i(\rho_i)\|\mathcal{N}_i(\sigma_i)) \le \delta_i$ for $i \in \{1,2\}$, then $D_H^0((\mathcal{N}_1 \otimes \mathcal{N}_2)(\rho_1 \otimes \rho_2)\|(\mathcal{N}_1 \otimes \mathcal{N}_2)(\sigma_1 \otimes \sigma_2)) \le \delta_1 + \delta_2$. ∎
In practical data processing applications, there is often a need to deal with complicated algorithms in which responses from several queries based on private user data are fused together to extract useful statistical information from the data. For instance, when training machine learning models, iterative gradient descent algorithms can be used and the gradient at each epoch can be modelled as a query on the private data used for training [27]. In this case, it is desirable to establish composition rules for combination of several privacy-preserving quantum operations. Theorem 4 provides such a result for privacy against hypothesis testing adversaries.
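The additivity underlying such composition results can be illustrated in the commuting case: there, the zero-false-positive type-II error $\beta_0$ is the $q$-mass of the support of $p$, which is multiplicative over product distributions, so $-\log_2 \beta_0$ is additive. The sketch below is our illustration with arbitrary example distributions:

```python
import numpy as np

def beta0(p, q):
    """beta_0 for commuting (diagonal) states: the q-probability mass that
    any test with zero type-I error must accept, i.e. q(supp(p))."""
    return q[p > 0].sum()

p1, q1 = np.array([0.5, 0.5, 0.0]), np.array([0.3, 0.3, 0.4])
p2, q2 = np.array([1.0, 0.0]), np.array([0.7, 0.3])

# Product experiment: answering both queries at once.
p12, q12 = np.outer(p1, p2).ravel(), np.outer(q1, q2).ravel()

lhs = -np.log2(beta0(p12, q12))
rhs = -np.log2(beta0(p1, q1)) - np.log2(beta0(p2, q2))
print(lhs, rhs)  # equal: the leakage of the product is the sum of the parts
```

This is the mechanism by which the budgets add up when two private channels are composed in parallel.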
IV Quantum Differential Privacy
The gold standard of privacy analysis and enforcement in the computer science literature is differential privacy, which has been recently extended to quantum computing algorithms [18]. In this section, we establish a relationship between differential privacy and privacy against hypothesis testing adversaries.
Definition 3
For any $\varepsilon \ge 0$ and $\delta \in [0,1]$, a quantum channel $\mathcal{N}$ is $(\varepsilon,\delta)$-differentially private if

$\mathrm{tr}(E\,\mathcal{N}(\rho)) \le e^{\varepsilon}\,\mathrm{tr}(E\,\mathcal{N}(\sigma)) + \delta,$ (8)

for all measurement operators $0 \preceq E \preceq I$ and neighbouring density operators $\rho \sim \sigma$.
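Condition (8) can be probed numerically for a concrete channel. The sketch below is our illustration for the depolarizing channel with $\delta = 0$: it heuristically searches random rank-one projectors for the worst-case likelihood ratio between two maximally distinguishable inputs (the states, $p = 0.5$, and the trial count are arbitrary choices, and a random search is evidence, not a proof):

```python
import numpy as np

def depolarize(rho, p):
    d = rho.shape[0]
    return (1 - p) * rho + p * np.eye(d) / d

def worst_ratio(rho, sigma, p, trials=2000, seed=1):
    """Estimate sup_E tr(E N(rho)) / tr(E N(sigma)) over random rank-one
    projectors E = |v><v| (heuristic search, not exhaustive)."""
    rng = np.random.default_rng(seed)
    nr, ns = depolarize(rho, p), depolarize(sigma, p)
    best = 1.0
    for _ in range(trials):
        v = rng.normal(size=2) + 1j * rng.normal(size=2)
        v /= np.linalg.norm(v)
        E = np.outer(v, v.conj())
        best = max(best, np.trace(E @ nr).real / np.trace(E @ ns).real)
    return best

rho = np.diag([1.0, 0.0])
sigma = np.diag([0.0, 1.0])  # maximally distinguishable inputs

# With p = 0.5, both outputs have eigenvalues in [0.25, 0.75], so every
# ratio is at most 0.75 / 0.25 = 3; i.e., (ln 3, 0)-differential privacy
# holds for this particular pair regardless of the measurement.
print(worst_ratio(rho, sigma, 0.5) <= 3.0 + 1e-9)  # True
```

The $\ln 3$ figure here is specific to this state pair and noise level; it is not the general guarantee of Theorem 6 below.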
We can prove the following result regarding the relationship between quantum differential privacy and privacy against hypothesis testing adversaries.
Theorem 5
The following two statements hold:
• If $\mathcal{N}$ is -private, then $\mathcal{N}$ is -differentially private.
• If $\mathcal{N}$ is -differentially private, then $\mathcal{N}$ is -private for all .
Proof:
First, [23, Proposition 4.1]. Therefore, if is -private, we get . From Lemma III.2 in [20], a quantum channel is -differentially private if and only if . This proves that is -differentially private.
For the second part, note that [23, Proposition 4.1]. Therefore, if is -differentially private, we have . This implies that is -private for all . ∎
Theorem 6 (Lemma IV.2 [20])
Consider the neighbourhood notion under which $\rho \sim \sigma$ if . Then, the depolarizing channel (1) is -differentially private with
Corollary 2
Consider the neighbourhood notion under which $\rho \sim \sigma$ if . Then, the depolarizing channel (1) is -private with and all .
Proof:
We finish this section by analysing the performance of hypothesis-testing adversaries for differentially-private quantum channels.
Theorem 7
For any -differentially private quantum channel ,
(9) |
where
Proof:
Assume that . Because of -differential privacy, for all measurements . Therefore, ∎
Theorem 7 provides a lower bound on the false-negative rate of the best asymmetric hypothesis-testing mechanism. The lower bound grows as the differential-privacy parameters decrease, and thus the decision maker becomes overwhelmed by false negatives. Therefore, the privacy guarantees strengthen as the privacy budget is reduced in quantum differential privacy. This is illustrated in Figure 2 [top].
Theorem 8
For any -differentially private quantum channel ,
(10) |
where
Proof:
First, assume that . The definition of differential privacy implies that for all . As a result,
where the last inequality follows from that and that because . Therefore, using Lemma 1 in the appendix, we have
(11) |
Alternatively, assume that . Following the same line of reasoning, we get
(12) |
Therefore,
This concludes the proof. ∎
Theorem 8 provides a lower bound on the combined false-positive and false-negative rates of the best symmetric hypothesis-testing mechanism. The lower bound grows towards its maximum value as the differential-privacy parameters become smaller, which demonstrates that the privacy guarantees strengthen as the privacy budget is reduced in quantum differential privacy. This is illustrated in Figure 2 [bottom].
V Conclusions and Future Work
We presented a novel definition of privacy in quantum computing based on quantum hypothesis testing. Important post-processing and composition properties were proved for this new notion of privacy. We then examined the relationship between privacy against hypothesis-testing adversaries, defined in this paper, and quantum differential privacy. In the composition rules for privacy against hypothesis-testing adversaries, we only considered the case of $\varepsilon = 0$. Future work can expand these results to the general case of $\varepsilon > 0$. Furthermore, we only showed that $\varepsilon$-differential privacy can be translated to privacy against hypothesis-testing adversaries (the converse results are more general in this paper). Therefore, another avenue for future research is to expand these results to general $(\varepsilon,\delta)$-differential privacy. Finally, an important direction for future research is to use the proposed framework in numerical setups based on real-world data.
References
- [1] P. W. Shor, “Algorithms for quantum computation: discrete logarithms and factoring,” in Proceedings 35th Annual Symposium on Foundations of Computer Science, pp. 124–134, IEEE, 1994.
- [2] S. Lloyd, M. Mohseni, and P. Rebentrost, “Quantum principal component analysis,” Nature Physics, vol. 10, no. 9, pp. 631–633, 2014.
- [3] H.-Y. Huang, R. Kueng, and J. Preskill, “Information-theoretic bounds on quantum advantage in machine learning,” Physical Review Letters, vol. 126, no. 19, p. 190505, 2021.
- [4] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, “Quantum machine learning,” Nature, vol. 549, no. 7671, pp. 195–202, 2017.
- [5] M. Kearns and A. Roth, The ethical algorithm: The science of socially aware algorithm design. Oxford University Press, 2019.
- [6] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of Cryptography: Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4-7, 2006. Proceedings 3, pp. 265–284, Springer, 2006.
- [7] C. Dwork, “Differential privacy: A survey of results,” in Theory and Applications of Models of Computation: 5th International Conference, TAMC 2008, Xi’an, China, April 25-29, 2008. Proceedings 5, pp. 1–19, Springer, 2008.
- [8] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang, “Deep learning with differential privacy,” in Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, pp. 308–318, 2016.
- [9] C. Dwork and A. Roth, “The algorithmic foundations of differential privacy,” Foundations and Trends® in Theoretical Computer Science, vol. 9, no. 3–4, pp. 211–407, 2014.
- [10] V. J. Hotz, C. R. Bollinger, T. Komarova, C. F. Manski, R. A. Moffitt, D. Nekipelov, A. Sojourner, and B. D. Spencer, “Balancing data privacy and usability in the federal statistical system,” Proceedings of the National Academy of Sciences, vol. 119, no. 31, p. e2104906119, 2022.
- [11] R. Bhaskar, A. Bhowmick, V. Goyal, S. Laxman, and A. Thakurta, “Noiseless database privacy,” in International Conference on the Theory and Application of Cryptology and Information Security, pp. 215–232, Springer, 2011.
- [12] S. U. Nabar, B. Marthi, K. Kenthapadi, N. Mishra, and R. Motwani, “Towards robustness in query auditing,” in Proceedings of the 32nd international conference on Very large data bases, pp. 151–162, VLDB Endowment, 2006.
- [13] I. Issa, A. B. Wagner, and S. Kamath, “An operational approach to information leakage,” IEEE Transactions on Information Theory, vol. 66, no. 3, pp. 1625–1657, 2019.
- [14] F. Farokhi and N. Ding, “Measuring information leakage in non-stochastic brute-force guessing,” in 2020 IEEE Information Theory Workshop (ITW), pp. 1–5, IEEE, 2021.
- [15] J. Liao, O. Kosut, L. Sankar, and F. du Pin Calmon, “Tunable measures for information leakage and applications to privacy-utility tradeoffs,” IEEE Transactions on Information Theory, vol. 65, no. 12, pp. 8043–8066, 2019.
- [16] Z. Li, T. J. Oechtering, and D. Gündüz, “Privacy against a hypothesis testing adversary,” IEEE Transactions on Information Forensics and Security, vol. 14, no. 6, pp. 1567–1581, 2018.
- [17] F. Farokhi and H. Sandberg, “Fisher information as a measure of privacy: Preserving privacy of households with smart meters using batteries,” IEEE Transactions on Smart Grid, vol. 9, no. 5, pp. 4726–4734, 2017.
- [18] L. Zhou and M. Ying, “Differential privacy in quantum computation,” in 2017 IEEE 30th Computer Security Foundations Symposium (CSF), pp. 249–262, IEEE, 2017.
- [19] S. Aaronson and G. N. Rothblum, “Gentle measurement of quantum states and differential privacy,” in Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pp. 322–333, 2019.
- [20] C. Hirche, C. Rouzé, and D. S. França, “Quantum differential privacy: An information theory perspective,” arXiv preprint arXiv:2202.10717, 2022.
- [21] M. Wilde, Quantum Information Theory. Cambridge University Press, 2013.
- [22] L. Wang and R. Renner, “One-shot classical-quantum capacity and hypothesis testing,” Physical Review Letters, vol. 108, no. 20, p. 200501, 2012.
- [23] F. Dupuis, L. Kraemer, P. Faist, J. M. Renes, and R. Renner, “Generalized entropies,” in XVIIth international congress on mathematical physics, pp. 134–153, World Scientific, 2014.
- [24] B. Regula, L. Lami, and M. M. Wilde, “Postselected quantum hypothesis testing,” arXiv preprint arXiv:2209.10550, 2022.
- [25] W. M. Watkins, S. Y.-C. Chen, and S. Yoo, “Quantum machine learning with differential privacy,” arXiv preprint arXiv:2103.06232, 2021.
- [26] X. Yuan, “Hypothesis testing and entropies of quantum channels,” Physical Review A, vol. 99, no. 3, p. 032317, 2019.
- [27] N. Wu, F. Farokhi, D. Smith, and M. A. Kaafar, “The value of collaboration in convex machine learning with differential privacy,” in 2020 IEEE Symposium on Security and Privacy (SP), pp. 304–317, IEEE, 2020.
Lemma 1
For all density operators $\rho$ and $\sigma$, the following identity holds: $\frac{1}{2}\|\rho - \sigma\|_1 = \max_{0 \preceq Q \preceq I} \mathrm{tr}(Q(\rho - \sigma))$.
Proof:
The proof is similar to the standard argument for the trace distance. Note that the difference operator $\rho - \sigma$ is Hermitian. So we can diagonalize it as $\rho - \sigma = \sum_i \lambda_i |i\rangle\langle i|$, where $\{|i\rangle\}_i$ is an orthonormal basis of eigenvectors and $\{\lambda_i\}_i$ is a set of real eigenvalues. Define matrices $P := \sum_{i:\lambda_i > 0} \lambda_i |i\rangle\langle i|$ and $N := -\sum_{i:\lambda_i < 0} \lambda_i |i\rangle\langle i|$. Evidently, by construction, $\rho - \sigma = P - N$ with $P, N \succeq 0$. Note that $\mathrm{tr}(P) = \mathrm{tr}(N)$,
where the equality follows from $\mathrm{tr}(P) - \mathrm{tr}(N) = \mathrm{tr}(\rho - \sigma) = 0$. Hence, $\|\rho - \sigma\|_1 = \mathrm{tr}(P) + \mathrm{tr}(N) = 2\,\mathrm{tr}(P)$.
For all $0 \preceq Q \preceq I$, we have $\mathrm{tr}(Q(\rho - \sigma)) = \mathrm{tr}(QP) - \mathrm{tr}(QN) \le \mathrm{tr}(QP) \le \mathrm{tr}(P)$,
with equality achieved if $Q = \sum_{i:\lambda_i > 0} |i\rangle\langle i|$. This implies that $\max_{0 \preceq Q \preceq I} \mathrm{tr}(Q(\rho - \sigma)) = \mathrm{tr}(P) = \frac{1}{2}\|\rho - \sigma\|_1$.
∎
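The construction in the proof of Lemma 1 (diagonalize the difference and project onto its positive eigenspace) can be verified numerically; the sketch below is our illustration on a random pair of qutrit states:

```python
import numpy as np

rng = np.random.default_rng(2)

def random_state(d=3):
    """Random density operator via A A^dagger / tr(A A^dagger)."""
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

rho, sigma = random_state(), random_state()

# Diagonalize the Hermitian difference and project onto its positive part.
vals, vecs = np.linalg.eigh(rho - sigma)
pos = vecs[:, vals > 0]
Q = pos @ pos.conj().T  # projector achieving the maximum in the lemma

lhs = np.trace(Q @ (rho - sigma)).real  # value achieved by this projector
rhs = 0.5 * np.sum(np.abs(vals))        # trace distance (1/2)||rho - sigma||_1
print(abs(lhs - rhs) < 1e-10)  # True
```

The achieved value equals the sum of the positive eigenvalues, which matches the trace distance since the difference operator is traceless.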