
VerifyML: Obliviously Checking Model Fairness Resilient to Malicious Model Holder

Guowen Xu, Xingshuo Han, Gelei Deng, Tianwei Zhang, Shengmin Xu, Jianting Ning, Anjia Yang, Hongwei Li
Guowen Xu, Xingshuo Han, Gelei Deng and Tianwei Zhang are with the School of Computer Science and Engineering, Nanyang Technological University (e-mail: guowen.xu@ntu.edu.sg; xingshuo001@e.ntu.edu.sg; GDENG003@e.ntu.edu.sg; tianwei.zhang@ntu.edu.sg). Shengmin Xu and Jianting Ning are with the College of Computer and Cyber Security, Fujian Normal University, Fuzhou, China (e-mail: smxu1989@gmail.com; jtning88@gmail.com). Anjia Yang is with the College of Cyber Security, Jinan University, Guangzhou 510632, China (e-mail: anjiayang@gmail.com). Hongwei Li is with the School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China (e-mail: hongweili@uestc.edu.cn).
Abstract

In this paper, we present VerifyML, the first secure inference framework to check the fairness degree of a given machine learning (ML) model. VerifyML is generic and is immune to any obstruction by the malicious model holder during the verification process. We rely on secure two-party computation (2PC) technology to implement VerifyML, and carefully customize a series of optimization methods to boost its performance for both linear and non-linear layer execution. Specifically, (1) VerifyML allows the vast majority of the overhead to be performed offline, thus meeting the low latency requirements for online inference. (2) To speed up offline preparation, we design novel homomorphic parallel computing techniques to accelerate the generation of authenticated Beaver's triples (including matrix-vector and convolution triples). This achieves up to $1.7\times$ computation speedup and at least $10.7\times$ less communication overhead compared to state-of-the-art work. (3) We also present a new cryptographic protocol to evaluate the activation functions of non-linear layers, which is $4\times$–$42\times$ faster and requires $>48\times$ less communication than existing 2PC protocols against malicious parties. In fact, VerifyML even beats the state-of-the-art semi-honest secure ML inference systems! We provide a formal theoretical analysis of VerifyML's security and demonstrate its performance superiority on mainstream ML models including ResNet-18 and LeNet.

Index Terms:
Privacy Protection, Deep Learning, Cryptography.

1 Introduction

Machine learning (ML) systems are increasingly being used to inform and influence people’s decisions, leading to algorithmic outcomes that have powerful implications for individuals and society. For example, most personal loan default risks are calculated by automated ML tools. This approach greatly speeds up the decision-making process, but as with any decision-making algorithm, there is a tendency to provide accurate results for the majority, leaving certain individuals and minority groups disadvantaged [1, 41]. This problem is widely defined as the unfairness of the ML model. It often stems from the underlying inherent human bias in the training samples, and a trained ML model amplifies this bias to the point of causing discriminatory decisions about certain groups and individuals.

In fact, the unfairness of ML models pervades every corner of society, well beyond financial risk control. A prime example comes from COMPAS [18], automated software used in US courts to assess the probability of criminals reoffending. An investigation of the software revealed a bias against African-Americans: COMPAS had a higher false positive rate for African-American offenders than for white offenders, owing to incorrectly estimating their risk of reoffending. Similar model decision biases pervade other real-world applications including childcare systems [7], employment matching [33], AI chatbots, and ad serving algorithms [16]. As mentioned earlier, these unfair decisions stem from neglected biases and discrimination hidden in data and algorithms.

To alleviate the above problems, a series of recent works [4, 24, 34, 32, 31] has proposed formal measures of fairness for classification models, as well as their variants, with the aim of providing guidance for verifying the fairness of a given model. Several evaluation tools have also been released that facilitate automated checks for discriminatory decisions in a given model. For example, the toolkit Aequitas [36] tests models against several bias and fairness metrics corresponding to different population subgroups. It feeds test reports back to developers, researchers and governments to assist them in making conscious decisions that avoid harming specific population groups. IBM also offers the toolkit AI Fairness 360 [3], which aims to bring fairness research algorithms to the industrial setting, to create a benchmark on which all fairness algorithms can be evaluated, and to provide an environment for researchers to share their ideas.

Existing efforts in theory and tools have led the entire research community to work towards unbiased verification of ML model fairness. However, existing verification mechanisms either require white-box access to the target model or require clients to send queries in plaintext to the model holder, which is impractical as it incurs a range of privacy concerns. Specifically, model holders are often reluctant to disclose model details, because training a commercial model requires substantial human cost, resources, and experience. ML models are therefore precious intellectual property that must be properly protected to ensure a company's competitiveness in the market. On the other hand, the queries that clients use to test model fairness naturally contain sensitive information, including loan records, disease history, and even criminal records. The confidentiality of such highly private data should clearly be guaranteed throughout the verification process. Hence, these privacy requirements raise a challenging but meaningful question:

Can we design a verification framework that only returns the fairness of the model to the client and the parties cannot gain any private information?

We materialize the above question in a scenario where a client interacts with the model holder to verify the fairness of the model. Specifically, before using the target model's inference service, the client sends a set of queries for testing fairness to the model holder, which returns the inference results to the client, enabling it to locally evaluate how fair the model is. In such a scenario, the client is generally considered to be semi-honest, since it needs to evaluate the model correctly for the subsequent service. The model holder may be malicious: it may trick the client into believing that the model is highly fair by arbitrarily violating the verification process. A natural solution to tackle such concerns is to leverage state-of-the-art generic 2PC tools [22, 20, 6] that provide malicious security. They guarantee that if either entity behaves maliciously, it will be caught and the protocol aborted, protecting privacy. However, directly grafting these standard tools incurs enormous redundant overhead, including heavy reliance on zero-knowledge proofs [11] and tedious computational authentication and interaction [15] (see Section 3 for more details).

To reduce the overhead, we propose VerifyML, a 2PC-based secure verification framework built on the model holder-malicious threat model. In this model, the client is considered semi-honest but the model holder is malicious and can arbitrarily violate the specification of the protocol. We adaptively customize a series of optimization methods for VerifyML, which show much better performance than the fully malicious baseline. Our key insight is to move the vast majority of operations to the client, bypassing cumbersome data integrity verification and reducing the frequency of interaction between the entities. Further, we design highly optimized methods to perform the linear and non-linear layer functions of ML, which bring at least $4\times$–$40\times$ speedup compared to state-of-the-art techniques. Overall, our contributions are as follows:

• We leverage a hybrid combination of HE and GC to design VerifyML: the execution of the ML model's linear layers is implemented with homomorphic encryption (HE), while the non-linear layers are performed with the garbled circuit (GC). VerifyML allows more than 95% of operations to be completed in the offline phase, thus providing very low latency in the online inference phase. In fact, VerifyML's online phase even beats DELPHI [29], the state-of-the-art scheme for secure ML inference against only semi-honest adversaries.

• We design a series of optimization methods to reduce the overhead of the offline stage. Specifically, we design new homomorphic parallel computation methods to generate authenticated Beaver's triples, including matrix-vector and convolution triples, in a Single Instruction Multiple Data (SIMD) manner. Compared to existing techniques, we generate matrix-vector multiplication triples without any homomorphic rotation operation, which is very computationally expensive compared to other homomorphic operations such as addition and multiplication. Besides, we reduce the communication complexity of generating convolution triples (aka matrix multiplication triples) from cubic to quadratic, with faster computing performance.

• We design a computationally friendly GC to perform the activation functions of the non-linear layers (mainly ReLU). Our key idea is to minimize the number of expensive multiplication operations in the GC. We then use the GC as a one-time pad to simplify verifying the integrity of the input from the model holder. Compared to state-of-the-art works, our non-linear layer protocol achieves at least an order of magnitude performance improvement.

• We provide a formal theoretical analysis of VerifyML's security and demonstrate its performance superiority on various datasets and mainstream ML models including ResNet-18 and LeNet. Compared to state-of-the-art work, our experiments show that VerifyML achieves up to $1.7\times$ computation speedup and at least $10.7\times$ less communication overhead for linear layer computation. For non-linear layers, VerifyML is $4\times$–$42\times$ faster and requires $>48\times$ less communication than existing 2PC protocols against malicious parties. Meanwhile, VerifyML demonstrates an encouraging online runtime boost of $32.6\times$ and $32.2\times$ over existing works on LeNet and ResNet-18, respectively, and at least an order of magnitude reduction in communication cost.

2 Preliminaries

2.1 Threat Model

We consider a secure ML inference scenario where a model holder $P_0$ and a client $P_1$ interact with each other to evaluate the fairness of the target model. In this model holder-malicious threat model, $P_0$ holds the model $\mathbf{M}$ while the client owns the private test set used to verify the fairness of the model. The client is generally considered to be semi-honest: it follows the protocol's specification during the interaction in order to evaluate the fairness of the model unbiasedly, but it may try to infer the model parameters by passively analyzing the data streams captured during the interaction. The model holder is malicious: it may arbitrarily violate the specification of the protocol to trick the client into believing that it holds a high-fairness model. The network architecture is assumed to be known to both $P_0$ and $P_1$. VerifyML aims to construct a secure inference framework that enables $P_1$ to correctly evaluate the fairness of the model without learning any details of the model parameters, while $P_0$ learns nothing about the client's input. We provide a formal definition of the threat model in Appendix A.

2.2 Notations

We use $\lambda$ and $\sigma$ to denote the computational and statistical security parameters, respectively. $[k]$ represents the set $\{1,2,\cdots,k\}$ for $k>0$. In VerifyML, all arithmetic operations are calculated in the field $\mathbb{F}_{p}$, where $p$ is a prime and we define $\kappa=\lceil\log p\rceil$. This means that there is a natural mapping from elements in $\mathbb{F}_{p}$ to $\{0,1\}^{\kappa}$. For example, $a[i]$ indicates the $i$-th bit of $a$ under this mapping, i.e., $a=\sum_{i\in[\kappa]}a[i]\cdot 2^{i-1}$. Given two vectors $\mathbf{a}$ and $\mathbf{b}$, and an element $\alpha\in\mathbb{F}_{p}$, $\mathbf{a}+\mathbf{b}$ indicates element-wise addition, and $\alpha+\mathbf{a}$ and $\alpha\mathbf{a}$ mean that each component of $\mathbf{a}$ is added to and multiplied by $\alpha$, respectively. $\mathbf{a}\ast\mathbf{b}$ represents the inner product of the vectors $\mathbf{a}$ and $\mathbf{b}$. Similarly, given any function $f:\mathbb{F}_{p}\rightarrow\mathbb{F}_{p}$, $f(\mathbf{a})$ denotes the evaluation of $f$ on each component of $\mathbf{a}$. $a\|b$ represents the concatenation of $a$ and $b$. $U_{n}$ is used to represent the uniform distribution on the set $\{0,1\}^{n}$ for any $n>0$.

For ease of exposition, we consider an ML model, usually a neural network $\mathbf{M}$, consisting of alternating linear and non-linear layers. We assume that the linear layers are $\mathbf{L}_{1},\cdots,\mathbf{L}_{m}$ and the non-linear layers are $f_{1},\cdots,f_{m-1}$. Given an initial input (i.e., query) $\mathbf{x}_{0}$, the model holder sequentially executes $\mathbf{v}_{i}=\mathbf{L}_{i}\mathbf{x}_{i-1}$ and $\mathbf{x}_{i}=f_{i}(\mathbf{v}_{i})$. Finally, $\mathbf{M}$ outputs the inference result $\mathbf{v}_{m}=\mathbf{L}_{m}\mathbf{x}_{m-1}=\mathbf{M}(\mathbf{x}_{0})$.
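To make the notation concrete, here is a minimal plaintext sketch of this alternating evaluation; the layer shapes and the choice of ReLU for $f_1$ are illustrative assumptions, and no cryptography is involved yet.

```python
import numpy as np

# Plaintext sketch of v_i = L_i x_{i-1}, x_i = f_i(v_i); shapes and the
# choice of ReLU for f_1 are illustrative assumptions (no cryptography here).
rng = np.random.default_rng(0)
L1 = rng.integers(-3, 4, size=(4, 8))   # linear layer L_1
L2 = rng.integers(-3, 4, size=(2, 4))   # linear layer L_2 (output layer)
f1 = lambda v: np.maximum(v, 0)         # non-linear layer f_1 (ReLU)

x0 = rng.integers(-3, 4, size=8)        # initial query x_0
v1 = L1 @ x0                            # v_1 = L_1 x_0
x1 = f1(v1)                             # x_1 = f_1(v_1)
v2 = L2 @ x1                            # v_2 = L_2 x_1 = M(x_0)
print(v2)
```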

2.3 ML Fairness Measurement

Let $\mathcal{X}$ be the set of possible inputs and $\mathcal{Y}$ the set of all possible labels. In addition, let $\mathcal{O}$ be a finite set of fairness-related groups (e.g., ethnic groups). We assume that $\mathcal{X}\times\mathcal{Y}\times\mathcal{O}$ is drawn from a probability space $\Omega$ with an unknown distribution $\mathcal{D}$, and use $\mathbf{M}(\mathbf{x})$ to denote the model inference result given an input $\mathbf{x}$. Based on these, we review the empirical fairness gap (EFG) [38], which is widely used to measure the fairness of ML models with respect to specific groups. To formalize EFG, we first describe the definition of conditional risk as follows:

$\digamma_{o}(\mathbf{M})=\mathbb{E}_{(\mathbf{x},y,o')\sim\mathcal{D}}\left[\,\mathbb{I}\{\mathbf{M}(\mathbf{x})\neq y\}\mid o'=o\,\right]$  (1)

Given samples $(\mathbf{x},y,o')$ drawn from the distribution $\mathcal{D}$, $\digamma_{o}(\mathbf{M})$ is the expected misclassification rate over the samples that belong to group $o$, where $\mathbb{I}\{\Phi\}$ represents the indicator function of a predicate $\Phi$. Given an independent sample set $\Psi=\{(\mathbf{x}^{(1)},y^{(1)},o^{(1)}),\cdots,(\mathbf{x}^{(t)},y^{(t)},o^{(t)})\}\sim\mathcal{D}^{t}$, the empirical conditional risk is defined as follows:

$\tilde{\digamma}_{o}(\mathbf{M},\Psi)=\frac{1}{t_{o}}\sum_{i=1}^{t}\left[\,\mathbb{I}\{\mathbf{M}(\mathbf{x}^{(i)})\neq y^{(i)}\}\mid o^{(i)}=o\,\right]$  (2)

where $t_{o}$ indicates the number of samples in $\Psi$ from group $o$. We then describe the fairness gap (FG), which measures the maximum margin between any two groups. Specifically,

$FG=\max_{o_{0},o_{1}\in\mathcal{O}}\left|\digamma_{o_{0}}(\mathbf{M})-\digamma_{o_{1}}(\mathbf{M})\right|$  (3)

Likewise, the empirical fairness gap (EFG) is defined as

$EFG=\max_{o_{0},o_{1}\in\mathcal{O}}\left|\tilde{\digamma}_{o_{0}}(\mathbf{M},\Psi)-\tilde{\digamma}_{o_{1}}(\mathbf{M},\Psi)\right|$  (4)

Lastly, we say an ML model $\mathbf{M}$ is $\epsilon$-fair on $(\mathcal{O},\mathcal{D})$ if its fairness gap is smaller than $\epsilon$ with confidence $1-\delta$. Formally, an $\epsilon$-fair $\mathbf{M}$ satisfies the following condition:

$\Pr\left[\max_{o_{0},o_{1}\in\mathcal{O}}\left|\digamma_{o_{0}}(\mathbf{M})-\digamma_{o_{1}}(\mathbf{M})\right|>\epsilon\right]\leq\delta$  (5)

In practice, we usually replace FG in Eqn. (5) with EFG to facilitate the measurement of fairness. Note that once the client obtains enough predictions from the target model, it can locally evaluate the fairness of the model according to Eqn. (5).
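As a concrete illustration, the hedged Python sketch below estimates the empirical conditional risk of Eqn. (2) for each group and the EFG of Eqn. (4); the toy data and the constant classifier are assumptions for illustration only.

```python
import numpy as np

# Estimating the empirical conditional risk (Eqn. 2) per group and the EFG
# (Eqn. 4); the toy data and constant classifier are illustrative assumptions.
def empirical_efg(model, X, y, groups):
    risks = []
    for o in np.unique(groups):
        idx = (groups == o)
        t_o = idx.sum()                               # samples in group o
        risks.append(np.sum(model(X[idx]) != y[idx]) / t_o)
    return max(risks) - min(risks)                    # max pairwise margin

X = np.arange(10).reshape(10, 1)
y = np.array([0] * 5 + [1] * 5)
groups = np.array([0, 1] * 5)
always_zero = lambda X: np.zeros(len(X), dtype=int)
print(empirical_efg(always_zero, X, y, groups))       # about 0.2 for this toy set
```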

2.4 Fully Homomorphic Encryption

Let the plaintext space be $\mathbb{F}_{p}$. Informally, a fully homomorphic encryption (FHE) scheme in the public-key setting usually consists of the following algorithms:

• $\mathtt{KeyGen}(1^{\lambda})\rightarrow(pk,sk)$. Taking the security parameter $\lambda$ as input, $\mathtt{KeyGen}$ is a randomized algorithm that outputs the public key $pk$ and the corresponding secret key $sk$ required for homomorphic encryption.

• $\mathtt{Enc}(pk,x)\rightarrow c$. Given $pk$ and a plaintext $x\in\mathbb{F}_{p}$, the algorithm $\mathtt{Enc}$ outputs a ciphertext $c$ encrypting $x$.

• $\mathtt{Dec}(sk,c)\rightarrow x$. Taking $sk$ and a ciphertext $c$ as input, $\mathtt{Dec}$ decrypts $c$ and outputs the corresponding plaintext $x$.

• $\mathtt{Eval}(pk,c_{1},c_{2},F)\rightarrow c'$. Given $pk$, two ciphertexts $c_{1}$ and $c_{2}$ encrypting plaintexts $x_{1}$ and $x_{2}$, and a function $F$, the algorithm $\mathtt{Eval}$ outputs a ciphertext $c'$ encrypting $F(x_{1},x_{2})$.

We require FHE to satisfy correctness, semantic security, and functional privacy. (Functional privacy ensures that given a ciphertext $c$, which is an encrypted share of $F(x_{1},x_{2})$ obtained by homomorphically evaluating $F$, $c$ is indistinguishable from a ciphertext $c'$ encrypting a share of $F'(x_{1},x_{2})$ for any $F'$.) In VerifyML, we use the SEAL library [37] to implement fully homomorphic encryption. In addition, we utilize ciphertext packing technology (CPT) [39] to encrypt multiple plaintexts into a single ciphertext, thus enabling homomorphic computation in a SIMD manner. Specifically, given two plaintext vectors $\mathbf{x}=(x_{1},\cdots,x_{n})$ and $\mathbf{x}'=(x_{1}',\cdots,x_{n}')$, we can pack $\mathbf{x}$ and $\mathbf{x}'$ into ciphertexts $c$ and $c'$, each of them containing $n$ plaintext slots. Homomorphic operations between $c$ and $c'$, including addition and multiplication, are equivalent to performing the same element-wise operations on the corresponding plaintext slots.

FHE also provides an algorithm $\mathtt{Rotation}$ to handle operations between data located in different plaintext slots. Informally, given a plaintext vector $\mathbf{x}=(x_{1},\cdots,x_{n})$ encrypted into a single ciphertext $c$, $\mathtt{Rotation}(pk,c,j)$ transforms $c$ into another ciphertext $c'$ whose encrypted plaintext vector is $\mathbf{x}'=(x_{j+1},x_{j+2},\cdots,x_{n},x_{1},\cdots,x_{j})$. In this way, data in different plaintext slots can be moved to the same position to achieve element-wise operations under the ciphertext. In FHE, rotation operations are computationally expensive compared to homomorphic addition and multiplication. Therefore, the optimization criterion for homomorphic SIMD operations is to minimize the number of rotations.
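The following NumPy sketch simulates only the plaintext semantics of packing and rotation described above; it is not the SEAL API, and the slot count is an illustrative assumption.

```python
import numpy as np

# Plaintext-level simulation of CPT/SIMD semantics (not the SEAL API):
# a packed ciphertext acts as a vector of n slots, homomorphic add/mul act
# slot-wise, and Rotation(pk, c, j) cyclically shifts the slots by j.
n = 8
x = np.arange(1, n + 1)          # slots of ciphertext c
y = 10 * np.arange(1, n + 1)     # slots of ciphertext c'

slotwise_sum = x + y             # one homomorphic addition
slotwise_prod = x * y            # one homomorphic multiplication
rotated = np.roll(x, -2)         # Rotation by j=2: (3, 4, ..., 8, 1, 2)

# Rotations are what make cross-slot operations possible; e.g., summing all
# slots costs log2(n) rotate-and-add steps:
s = x.copy()
for j in (1, 2, 4):
    s = s + np.roll(s, -j)
assert s[0] == x.sum()
```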

2.5 Parallel Matrix Homomorphic Multiplication

We review the parallel homomorphic multiplication method between arbitrary matrices proposed by Jiang et al. [17], which will be used to accelerate the generation of authenticated convolution triples in VerifyML. We take the homomorphic multiplication of two $d\times d$ matrices as an example. Given a $d\times d$ matrix $\mathbf{X}=(x_{i,j})_{0\leq i,j<d}$, we first define four useful permutations $\sigma$, $\tau$, $\phi$, and $\varphi$ over $\mathbb{F}_{p}^{d\times d}$: $\sigma(\mathbf{X})_{i,j}=\mathbf{X}_{i,i+j}$, $\tau(\mathbf{X})_{i,j}=\mathbf{X}_{i+j,j}$, $\phi(\mathbf{X})_{i,j}=\mathbf{X}_{i,j+1}$ and $\varphi(\mathbf{X})_{i,j}=\mathbf{X}_{i+1,j}$, where all indices are taken modulo $d$. Then, for two square matrices $\mathbf{X}$ and $\mathbf{Y}$ of order $d$, we can calculate their matrix product by the following formula:

$\mathbf{X}\ast\mathbf{Y}=\sum_{k=0}^{d-1}(\phi^{k}\circ\sigma(\mathbf{X}))\odot(\varphi^{k}\circ\tau(\mathbf{Y}))$  (6)

where \odot denotes the element-wise multiplication. We provide a toy example of the multiplication of two 3×33\times 3 matrices in Figure 1 for ease of understanding.

We can convert a $d\times d$ matrix to a vector of length $d^{2}$ via the encoding map $\mathbb{F}_{p}^{d^{2}}\rightarrow\mathbb{F}_{p}^{d\times d}$: $\mathbf{x}=(x_{0},\cdots,x_{d^{2}-1})\mapsto\mathbf{X}=(x_{d\cdot i+j})_{0\leq i,j<d}$. A ciphertext is said to encrypt a matrix $\mathbf{X}$ if it encrypts the corresponding plaintext vector $\mathbf{x}$. Therefore, given two square matrices $\mathbf{X}$ and $\mathbf{Y}$, their multiplication under the ciphertext is calculated as follows:

$\mathbf{c}_{\mathbf{X}}\circledast\mathbf{c}_{\mathbf{Y}}=\sum_{k=0}^{d-1}(\phi^{k}(\mathtt{Enc}_{pk}(\sigma(\mathbf{X}))))\boxtimes(\varphi^{k}(\mathtt{Enc}_{pk}(\tau(\mathbf{Y}))))$  (7)
Figure 1: Parallel matrix multiplication

In the following sections, we will use $\mathbf{c}_{\mathbf{X}}\circledast\mathbf{c}_{\mathbf{Y}}$ to represent the multiplication between any matrices $\mathbf{X}$ and $\mathbf{Y}$ in ciphertext, and $\boxtimes$ to denote the element-wise homomorphic multiplication between two ciphertexts. In Section 4.1.2, we describe how to utilize the parallel homomorphic multiplication described above to boost the generation of authenticated convolution triples.

2.6 Secret Sharing

• Additive Secret Sharing. Given any $x\in\mathbb{F}_{p}$, a 2-out-of-2 additive secret sharing of $x$ is a pair $(\langle x\rangle_{0},\langle x\rangle_{1})=(x-r,r)\in\mathbb{F}_{p}^{2}$, where $r$ is a random value uniformly selected from $\mathbb{F}_{p}$, and $x=\langle x\rangle_{0}+\langle x\rangle_{1}$. Additive secret sharing is perfectly hiding: given only the share $\langle x\rangle_{0}$ or $\langle x\rangle_{1}$, $x$ is perfectly hidden.

• Authenticated Shares. Given a random value $\alpha$ (known as the MAC key) uniformly chosen from $\mathbb{F}_{p}$, for any $x\in\mathbb{F}_{p}$, the authenticated shares of $x$ under $\alpha$ denote that each party $P_{b}$ holds $[\![x]\!]_{b}=\{\langle\alpha\rangle_{b},\langle x\rangle_{b},\langle\alpha x\rangle_{b}\}_{b\in\{0,1\}}$ (we sometimes omit $\langle\alpha\rangle_{b}$ from $[\![x]\!]_{b}$ for brevity), where $(\langle\alpha\rangle_{0}+\langle\alpha\rangle_{1})\times(\langle x\rangle_{0}+\langle x\rangle_{1})=(\langle\alpha x\rangle_{0}+\langle\alpha x\rangle_{1})$. While in the general malicious 2PC setting $\alpha$ should be generated randomly through interaction between the parties, in our model holder-malicious model $\alpha$ can be picked by $P_{1}$ and secretly shared with $P_{0}$. Authenticated sharing provides $\lfloor\log p\rfloor$ bits of statistical security. Informally, if a malicious $P_{0}$ tries to forge the shared $x$ into $x+\beta$ by tampering with its shares $(\langle x\rangle_{0},\langle\alpha x\rangle_{0})$ into $(\langle x\rangle_{0}+\beta,\langle\alpha x\rangle_{0}+\beta')$ for non-zero $\{\beta,\beta'\}$, the probability that the parties are authenticated to hold a share of $x+\beta$ (i.e., $\alpha x+\beta'=\alpha(x+\beta)$) is at most $2^{-\lfloor\log p\rfloor}$ (see the sketch after this list).
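A minimal plaintext sketch of both primitives (additive and authenticated sharing), assuming an illustrative Mersenne prime for $\mathbb{F}_p$:

```python
import secrets

# Plaintext sketch of 2-out-of-2 additive sharing and authenticated (MAC'd)
# sharing over F_p; the concrete prime is an illustrative assumption.
p = 2**61 - 1

def share(v):
    r = secrets.randbelow(p)          # uniform r
    return ((v - r) % p, r)           # (<v>_0, <v>_1), v = <v>_0 + <v>_1

alpha = secrets.randbelow(p)          # MAC key (picked by P_1 in VerifyML)
x = 42
x0, x1 = share(x)
m0, m1 = share(alpha * x % p)         # shares of alpha*x

assert (x0 + x1) % p == x             # reconstruction
assert (m0 + m1) % p == alpha * x % p # MAC consistency on honest shares

# A malicious P_0 shifting x by beta must also guess alpha*beta to keep the
# MAC relation, which succeeds with probability only about 1/p:
beta, beta_prime = 7, 9999
forged_x = (x0 + beta + x1) % p
forged_m = (m0 + beta_prime + m1) % p
assert forged_m != alpha * forged_x % p   # check fails w.h.p.
```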

2.7 Authenticated Beaver’s Triples

In VerifyML, we require authenticated Beaver's triples to detect possible breaches of the protocol by the malicious model holder. In more detail, an authenticated Beaver's multiplication triple denotes that each $P_{b}$ holds a tuple $\{[\![x]\!]_{b},[\![y]\!]_{b},[\![z]\!]_{b}\}_{b\in\{0,1\}}$, where $x,y,z\in\mathbb{F}_{p}$ satisfy $xy=z$. Given $P_{0}$ and $P_{1}$ holding authenticated shares of $c$ and $d$, i.e., $([\![c]\!]_{0},[\![d]\!]_{0})$ and $([\![c]\!]_{1},[\![d]\!]_{1})$, respectively, to compute the authenticated share of the product of $c$ and $d$, the parties first reveal $c-x$ and $d-y$, and then each party $P_{b}$ locally computes the authenticated share $[\![e=c\cdot d]\!]_{b}$ as follows:

$\langle e\rangle_{b}=b\cdot(c-x)(d-y)+\langle x\rangle_{b}\cdot(d-y)+(c-x)\cdot\langle y\rangle_{b}+\langle z\rangle_{b}$
$\langle\alpha e\rangle_{b}=\langle\alpha\rangle_{b}(c-x)(d-y)+\langle\alpha x\rangle_{b}\cdot(d-y)+(c-x)\cdot\langle\alpha y\rangle_{b}+\langle\alpha z\rangle_{b}$  (8)

where the public term $(c-x)(d-y)$ is included in $\langle e\rangle_{b}$ by only one party (here $b=1$), so that the two shares sum to $cd$.
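The following plaintext Python sketch replays this triple-based multiplication for the value shares in Eqn. (8); the prime and the inputs $c,d$ are illustrative assumptions.

```python
import secrets

# Plaintext replay of the value-share computation in Eqn. (8), using a triple
# (x, y, z = xy); the prime and the inputs c, d are illustrative assumptions.
p = 2**61 - 1

def share(v):
    r = secrets.randbelow(p)
    return ((v - r) % p, r)

c, d = 1234, 5678
x, y = secrets.randbelow(p), secrets.randbelow(p)
z = x * y % p
cs, ds, xs, ys, zs = share(c), share(d), share(x), share(y), share(z)

# Parties reveal c - x and d - y by exchanging shares of the differences:
e1 = (cs[0] - xs[0] + cs[1] - xs[1]) % p    # c - x
e2 = (ds[0] - ys[0] + ds[1] - ys[1]) % p    # d - y

# Each P_b computes its share of e = c*d locally; the public term is added
# only for b = 1:
e_shares = [(b * e1 * e2 + xs[b] * e2 + e1 * ys[b] + zs[b]) % p for b in (0, 1)]
assert (e_shares[0] + e_shares[1]) % p == c * d % p
```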

Authenticated Beaver's multiplication triples are independent of the parties' inputs in the actual execution of the secure computation protocol, and thus can be generated offline (see Section 4) to speed up online secure multiplications. Inspired by existing work that constructs custom triples for specific mathematical operations [30] to improve performance, we generalize traditional Beaver's triples to the matrix-vector multiplication and convolution domains. We provide the definitions of matrix-vector and convolution triples below and leave the description of generating them to Section 4.

• Authenticated matrix-vector triples: each $P_{b}$ holds a tuple $\{[\![\mathbf{X}]\!]_{b},[\![\mathbf{y}]\!]_{b},[\![\mathbf{z}]\!]_{b}\}_{b\in\{0,1\}}$, where $\mathbf{X}$ is a matrix uniformly chosen from $\mathbb{F}_{p}^{d_{1}\times d_{2}}$, $\mathbf{y}$ is a vector chosen from $\mathbb{F}_{p}^{d_{2}}$, and $\mathbf{z}\in\mathbb{F}_{p}^{d_{1}}$ satisfies $\mathbf{X}\ast\mathbf{y}=\mathbf{z}$; the dimensions $d_{1}$ and $d_{2}$ are determined by the ML model architecture.

• Authenticated convolution triples (aka matrix multiplication triples; we can reduce a convolution to a matrix multiplication by transforming the inputs of the convolution appropriately, as detailed in Section 4): each $P_{b}$ holds a tuple $\{[\![\mathbf{X}]\!]_{b},[\![\mathbf{Y}]\!]_{b},[\![\mathbf{Z}]\!]_{b}\}_{b\in\{0,1\}}$, where $\mathbf{X}$ and $\mathbf{Y}$ are tensors uniformly chosen from $\mathbb{F}_{p}^{u_{w}\times u_{h}\times c_{i}}$ and $\mathbb{F}_{p}^{(2l+1)\times(2l+1)\times c_{i}\times c_{o}}$, respectively, and $\mathbf{Z}\in\mathbb{F}_{p}^{u_{w}'\times u_{h}'\times c_{o}}$ satisfies the convolution $\mathtt{Conv}(\mathbf{X},\mathbf{Y})=\mathbf{Z}$; the parameters $u_{w}$, $u_{h}$, $u_{w}'$, $u_{h}'$, $l$, $c_{i}$ and $c_{o}$ are determined by the model architecture.

2.8 Oblivious Transfer

We write $\mathsf{OT}_{n}$ to denote the 1-out-of-2 Oblivious Transfer (OT) [13, 10]. In $\mathsf{OT}_{n}$, the inputs of the sender (assumed to be $P_{0}$ for convenience) are two strings $s_{0},s_{1}\in\{0,1\}^{n}$, and the input of the receiver ($P_{1}$) is a selection bit $b\in\{0,1\}$. At the end of the OT execution, $P_{1}$ learns $s_{b}$ while $P_{0}$ learns nothing. In this paper, we require the instances of $\mathsf{OT}_{n}$ to be secure against a semi-honest sender and a malicious receiver. We use $\mathsf{OT}_{n}^{\kappa}$ to represent $\kappa$ instances of $\mathsf{OT}_{n}$, and exploit [21] to implement $\mathsf{OT}_{n}^{\kappa}$ with a communication complexity of $\kappa\lambda+2n$ bits.

2.9 Garbled Circuits

The garbling scheme [35, 8] for Boolean circuits representing arbitrary functions consists of a pair of algorithms ($\mathtt{Garble}$, $\mathtt{GCEval}$) defined as follows:

• $\mathtt{Garble}(1^{\lambda},C)\rightarrow(\mathtt{GC},\{\{\mathtt{lab}_{i,j}^{in}\}_{i\in[n]},\{\mathtt{lab}_{j}^{out}\}\}_{j\in\{0,1\}})$. Given the security parameter $\lambda$ and an arbitrary Boolean circuit $C:\{0,1\}^{n}\rightarrow\{0,1\}$, the algorithm $\mathtt{Garble}$ outputs a garbled circuit $\mathtt{GC}$, a set of input labels $\{\mathtt{lab}_{i,j}^{in}\}_{i\in[n],j\in\{0,1\}}$ of this $\mathtt{GC}$, and a set of output labels $\{\mathtt{lab}_{j}^{out}\}_{j\in\{0,1\}}$, where the size of each label is $\lambda$ bits. For any $x\in\{0,1\}^{n}$, we refer to $\{\mathtt{lab}_{i,x[i]}^{in}\}_{i\in[n]}$ as the garbled input of $x$, and $\mathtt{lab}_{C(x)}^{out}$ as the garbled output of $C(x)$.

• $\mathtt{GCEval}(\mathtt{GC},\{\mathtt{lab}_{i}\}_{i\in[n]})\rightarrow\mathtt{lab}'$. Given the garbled circuit $\mathtt{GC}$ and a set of input labels $\{\mathtt{lab}_{i}\}_{i\in[n]}$, the algorithm $\mathtt{GCEval}$ outputs a label $\mathtt{lab}'$.

Let $\mathtt{Garble}(1^{\lambda},C)\rightarrow(\mathtt{GC},\{\{\mathtt{lab}_{i,j}^{in}\}_{i\in[n]},\{\mathtt{lab}_{j}^{out}\}\}_{j\in\{0,1\}})$. The above garbling scheme ($\mathtt{Garble}$, $\mathtt{GCEval}$) is required to satisfy the following properties:

• Correctness. $\mathtt{GCEval}$, faithfully performed on $\mathtt{GC}$ with the garbled input of $x$, outputs the correct garbled result. Formally, for any Boolean circuit $C$ and input $x\in\{0,1\}^{n}$, it holds that $\mathtt{GCEval}(\mathtt{GC},\{\mathtt{lab}_{i,x[i]}^{in}\}_{i\in[n]})\rightarrow\mathtt{lab}_{C(x)}^{out}$.
• Security. Given $C$, the garbled circuit $\mathtt{GC}$ of $C$ and the garbled input of any $x\in\{0,1\}^{n}$ can be simulated by a probabilistic polynomial-time simulator $\mathtt{Sim}$. Formally, for any circuit $C$ and input $x\in\{0,1\}^{n}$, we have $(\mathtt{GC},\{\mathtt{lab}_{i,x[i]}^{in}\}_{i\in[n]})\approx\mathtt{Sim}(1^{\lambda},C)$, where $\approx$ indicates computational indistinguishability.

• Authenticity. Given the garbled input of $x$ and $\mathtt{GC}$, it is infeasible to guess the output label of $1-C(x)$. Formally, for any circuit $C$ and $x\in\{0,1\}^{n}$, we have $(\mathtt{lab}_{1-C(x)}^{out}\mid\mathtt{GC},\{\mathtt{lab}_{i,x[i]}^{in}\}_{i\in[n]})\approx U_{\lambda}$.

Without loss of generality, the garbling scheme described above can be naturally extended to securely evaluate Boolean circuits with multi-bit outputs. In VerifyML, we utilize state-of-the-art optimization strategies, including point-and-permute [12], free-XOR [23] and half-gates [42], to construct the garbling scheme.

3 Technical Intuition

VerifyML is essentially a 2PC protocol in the model holder-malicious threat model, where the client unbiasedly learns the inference results on a given test set and thereby faithfully evaluates the fairness of the target model locally. To boost the performance of the 2PC protocol execution, we customize a series of optimization methods by fully exploiting the advantages of cryptographic primitives and their natural ties in the inference process. Below we present a high-level technical overview of VerifyML's design.

3.1 Offline-Online Paradigm

Consistent with state-of-the-art work in the semi-honest setting [29], VerifyML is deconstructed into an offline stage and an online stage, where the preprocessing of the offline stage is independent of the inputs of the model holder and the client. In this way, the majority ($>95\%$) of the computation can be performed offline to minimize the overhead of the online process. Figure 2 provides an overview of VerifyML, describing the computation required in the offline and online phases, respectively.

  Offline Phase. In this phase, the client and the model holder pre-compute data in preparation for the subsequent online execution; it is independent of the inputs of both parties. That is, VerifyML can run this phase without knowing the client's input $\mathbf{x}_0$ or the model holder's input $\mathbf{M}$.
• Preprocessing for the linear layers. The client interacts with the model holder to generate authenticated triples for matrix-vector multiplication and convolution.
• Preprocessing for the non-linear layers. The client constructs a garbled circuit $\mathtt{GC}$ for the circuit $C$ representing ReLU. The client sends $\mathtt{GC}$ and a set of ciphertexts to the model holder for generating the authenticated shares of ReLU's results.
  Online Phase. This phase is divided into the following parts.
• Preamble. The client secretly shares its input $\mathbf{x}_0$ with the model holder, and similarly, the model holder shares the model parameters $\mathbf{M}$ with the client. Thus both the model holder and the client hold authenticated shares of $\mathbf{x}_0$ and $\mathbf{M}$. Note that the sharing of $\mathbf{M}$ can be done offline if the model to be verified is known in advance.
• Layer evaluation. Let $\mathbf{x}_i$ be the result of evaluating the first $i$ layers of model $\mathbf{M}$ on $\mathbf{x}_0$. At the beginning of the $(i+1)$-th layer, the client and the model holder both hold authenticated shares of $\mathbf{x}_i$ and of the $(i+1)$-th layer parameters $\mathbf{L}_{i+1}$, i.e., each party $P_b$, $b\in\{0,1\}$, holds $([\![\mathbf{x}_i]\!]_b,[\![\mathbf{L}_{i+1}]\!]_b)$. 1. Linear layer. The client interacts with the model holder to compute the authenticated shares of $\mathbf{v}_{i+1}=\mathbf{L}_{i+1}\mathbf{x}_{i}$, where both parties securely compute matrix-vector multiplications and convolutions with the aid of the triples generated during preprocessing. 2. Non-linear layer. After the linear layer, the two parties hold the authenticated shares of $\mathbf{v}_{i+1}$. The client and the model holder invoke OT so that the model holder obtains the garbled input of $\mathtt{GC}$. The model holder evaluates $\mathtt{GC}$, and eventually the two parties obtain authenticated shares of the ReLU result.
• Consistency check. The client interacts with the model holder to check for any malicious behavior by the model holder during the entire inference process, using the properties of authenticated sharing to construct the consistency-check protocol. If the check passes, the client locally computes the fairness of the target model; otherwise the client outputs abort.
Figure 2: Overview of VerifyML

3.2 Linear Layer Optimization

As described in Figure 2, we move almost all linear operations into the offline phase, where we construct customized triples for matrix-vector multiplication and convolution to accelerate linear-layer execution. Specifically, (1) we design an efficient construction of matrix-vector multiplication triples instead of generating Beaver's multiplication triples for individual multiplications (see Section 4.1.1). Our core insight is a new packed homomorphic multiplication method for matrices and vectors: we explore the inherent connection between secret sharing and homomorphic encryption to remove all rotation operations in parallel homomorphic computation. (2) We extend the idea of generating matrix multiplication triples in the semi-honest setting [30] to the convolution domain in the model holder-malicious threat model (see Section 4). The core of our construction is derived from E2DM [17], which proposes a state-of-the-art method for parallel homomorphic multiplication between arbitrary matrices. We further optimize E2DM to achieve at least $2\times$ computational speedup compared to its naive use.

Our optimization technique for linear layer computation exhibits superior advantages compared to state-of-the-art methods [22, 20]. (Several efficient parallel homomorphic computation methods [19, 43] with packed ciphertext have been proposed and run in semi-honest or client-malicious models [29, 26, 5] for secure inference. It may be possible to transfer these techniques to our method to speed up triple generation, but this is certainly non-trivial and we leave it for future work.) In more detail, we reduce the communication overhead from cubic to quadratic (for both the offline and online phases) compared to Overdrive [22], the mainstream tool for generating authenticated multiplication triples in malicious adversary models (see Section 4 for a detailed analysis).

3.3 Non-linear Layer Optimization

We use the garbled circuit to achieve secure computation of the non-linear functions (mainly ReLU) in ML models. Specifically, assume that $P_{0}$ and $P_{1}$ learn the authenticated sharing of $\mathbf{v}_{i}=\mathbf{L}_{i}\mathbf{x}_{i-1}$ after executing the $i$-th linear layer, that is, each party $P_{b}$ holds $[\![\mathbf{v}_{i}]\!]_{b}=\{\langle\alpha\rangle_{b},\langle\mathbf{v}_{i}\rangle_{b},\langle\alpha\mathbf{v}_{i}\rangle_{b}\}_{b\in\{0,1\}}$. Then, $\{\langle\mathbf{v}_{i}\rangle_{b}\}_{b\in\{0,1\}}$ will be used as the input of ReLU (denoted $f_{i}$ for brevity) in the $i$-th non-linear layer, for both parties to learn the authenticated sharing of $\mathbf{x}_{i}=f_{i}(\mathbf{v}_{i})$, i.e., $[\![\mathbf{x}_{i}]\!]_{b}$. However, constructing such a garbling scheme raises the following intractable problems.

• How to validate input from the malicious model holder. Since the model holder is malicious, it must be ensured that its input to the $\mathtt{GC}$ (i.e., $\langle\mathbf{v}_{i}\rangle_{0}$) is consistent with the share obtained from the previous linear layer. In the traditional malicious adversary model [22, 20, 6], the standard approach is to verify the correctness of the authenticated sharing of all inputs from malicious entities inside the $\mathtt{GC}$. However, this is very expensive and takes tens of seconds or even minutes to process a single ReLU function. It obviously does not meet the practicality requirements of ML model inference, since a modern ML model usually contains thousands of ReLU functions.

• How to minimize the number of multiplications encapsulated in the $\mathtt{GC}$. For the $i$-th non-linear layer, we need to compute the authenticated shares of the ReLU output, i.e., $[\![\mathbf{x}_{i}]\!]_{b}=\{\langle\alpha\rangle_{b},\langle\mathbf{x}_{i}\rangle_{b},\langle\alpha\mathbf{x}_{i}\rangle_{b}\}_{b\in\{0,1\}}$. This requires at least two multiplications over the field if all computations are encapsulated in the $\mathtt{GC}$. Note that performing arithmetic multiplication in the $\mathtt{GC}$ is expensive and requires at least $O(\kappa^{2}\lambda)$ communication overhead.

We design novel protocols to remedy the above problems through the following insights. (1) Garbled circuits already achieve malicious security against the garbled circuit evaluator (i.e., the model holder in our setting) [26]. This means that we only need to construct a lightweight method to check the consistency between the input of the malicious adversary in the non-linear layer and the results obtained from the previous linear layer; this method can then be integrated with the $\mathtt{GC}$ to achieve end-to-end non-linear secure computation (see Section 4). (2) It suffices to compute the output label for each bit of $f_{i}(\mathbf{v}_{i})$'s share (i.e., $f_{i}(\mathbf{v}_{i})[j]$, for $1\leq j\leq\kappa$) in the GC, rather than obtaining the exact arithmetic share of $f_{i}(\mathbf{v}_{i})$ [5]. Moreover, we can decompose the ReLU function as $ReLU(\mathbf{v}_{i})=\mathbf{v}_{i}\cdot sign(\mathbf{v}_{i})$, where the sign function $sign(t)$ equals 1 if $t\geq 0$ and 0 otherwise (applied component-wise). Hence, we only encapsulate the non-linear part of $ReLU(\mathbf{v}_{i})$ (i.e., $sign(\mathbf{v}_{i})$) in the $\mathtt{GC}$, thereby substantially minimizing the number of multiplications.
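The decomposition in insight (2) can be checked directly in plaintext. In the sketch below, only the bit $sign(\mathbf{v}_i)$ would be computed inside the $\mathtt{GC}$; the multiplication by $\mathbf{v}_i$ stays outside.

```python
import numpy as np

# The decomposition behind the GC optimization: ReLU(v) = v * sign(v), where
# sign(v) in {0, 1} is the only non-linear part. Only this bit needs to be
# computed inside the garbled circuit; the product with v is cheap arithmetic.
v = np.array([-3, -1, 0, 2, 5])
sign_v = (v >= 0).astype(v.dtype)     # the part encapsulated in the GC
assert (v * sign_v == np.maximum(v, 0)).all()
```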

Compared with works [22, 20, 6] in the malicious adversary setting, VerifyML reduces the communication overhead of each ReLU function from $2c\lambda+190\kappa\lambda+232\kappa^{2}$ to $2d\lambda+4\kappa\lambda+6\kappa^{2}$, where $d\ll c$. Our experiments show that VerifyML achieves $4\times$–$42\times$ computation speedup and $48\times$ less communication overhead for non-linear layer computation.

Remark 3.1. Beyond the above optimization strategies, we also adopt a series of implementation-level strategies to reduce overhead, including removing the reliance on the distributed decryption primitives used in previous works [22, 20, 6] and minimizing the number of calls to zero-knowledge proofs over ciphertexts. In the following sections, we provide a comprehensive technical description of the proposed method.

4 The VerifyML Framework

4.1 Offline Phase

In this section, we describe the technical details of VerifyML. As described above, VerifyML is divided into an offline phase and an online phase. We first describe the operations precomputed in the offline phase, including generating the matrix-vector multiplication and convolution triples and constructing the garbled circuits of the objective function. Then, we introduce the technical details of the online phase.

4.1.1    Generating matrix-vector multiplication triple

Figure 3 depicts the interaction between the model holder $P_{0}$ and the client $P_{1}$ to generate matrix-vector multiplication triples. Succinctly, $P_{0}$ first uniformly selects $\langle\mathbf{X}\rangle_{0}$ and $\langle\mathbf{y}\rangle_{0}$ and sends their encryptions to $P_{1}$, along with zero-knowledge proofs about these ciphertexts, where $\langle\mathbf{y}\rangle_{0}$ is transformed into the matrix $\langle\mathbf{Y}\rangle_{0}$ before encryption (step 2 in Figure 3). $P_{1}$ recovers $\mathbf{X}$ and $\mathbf{Y}$ in ciphertext and then computes $(\langle\alpha\mathbf{X}\rangle_{0},\langle\alpha\mathbf{Y}\rangle_{0},\langle\alpha\mathbf{Z}\rangle_{0},\langle\mathbf{Z}\rangle_{0})$ (step 3 in Figure 3). It then returns the corresponding ciphertexts to $P_{0}$, which decrypts them and computes $\langle\alpha\mathbf{y}\rangle_{0}$, $\langle\alpha\mathbf{z}\rangle_{0}$ and $\langle\mathbf{z}\rangle_{0}$ (step 4 in Figure 3).

Input: $P_{b}$ ($b\in\{0,1\}$) holds $\langle\mathbf{X}\rangle_{b}$ uniformly chosen from $\mathbb{F}_{p}^{d_{1}\times d_{2}}$, and $\langle\mathbf{y}\rangle_{b}$ uniformly chosen from $\mathbb{F}_{p}^{d_{2}}$. In addition, $P_{1}$ holds a MAC key $\alpha$ uniformly chosen from $\mathbb{F}_{p}$.
Output: $P_{b}$ obtains $\{[\![\mathbf{X}]\!]_{b},[\![\mathbf{y}]\!]_{b},[\![\mathbf{z}]\!]_{b}\}_{b\in\{0,1\}}$, where $\mathbf{X}\ast\mathbf{y}=\mathbf{z}$.
Procedure:
1. $P_{0}$ and $P_{1}$ participate in a secure two-party computation such that $P_{0}$ obtains an FHE public-secret key pair $(pk,sk)$ while $P_{1}$ obtains the public key $pk$. This process is performed only once.
2. $P_{0}$ first converts $\langle\mathbf{y}\rangle_{0}$ into a $d_{1}\times d_{2}$ matrix $\langle\mathbf{Y}\rangle_{0}$ in which each row is a copy of $\langle\mathbf{y}\rangle_{0}$. Then, $P_{0}$ sends the encryptions $c_{1}\leftarrow\mathtt{Enc}(pk,\langle\mathbf{X}\rangle_{0})$ and $c_{2}\leftarrow\mathtt{Enc}(pk,\langle\mathbf{Y}\rangle_{0})$ to $P_{1}$, along with zero-knowledge (ZK) proofs of plaintext knowledge of the two ciphertexts. (A ZK proof of knowledge for ciphertexts states that $c_{1}$ and $c_{2}$ are valid ciphertexts generated from the given FHE cryptosystem; readers can refer to [22, 6] for more details.)
3. $P_{1}$ also converts $\langle\mathbf{y}\rangle_{1}$ into a $d_{1}\times d_{2}$ matrix $\langle\mathbf{Y}\rangle_{1}$ in which each row is a copy of $\langle\mathbf{y}\rangle_{1}$. It then samples $(\langle\alpha\mathbf{X}\rangle_{1},\langle\alpha\mathbf{Y}\rangle_{1},\langle\alpha\mathbf{Z}\rangle_{1},\langle\mathbf{Z}\rangle_{1})$ from $\mathbb{F}_{p}^{4\times(d_{1}\times d_{2})}$. $P_{1}$ sends $c_{3}=\mathtt{Enc}_{pk}(\alpha(\langle\mathbf{X}\rangle_{1}+\langle\mathbf{X}\rangle_{0})-\langle\alpha\mathbf{X}\rangle_{1})$, $c_{4}=\mathtt{Enc}_{pk}(\alpha(\langle\mathbf{Y}\rangle_{1}+\langle\mathbf{Y}\rangle_{0})-\langle\alpha\mathbf{Y}\rangle_{1})$, $c_{5}=\mathtt{Enc}_{pk}(\alpha(\mathbf{X}\odot\mathbf{Y})-\langle\alpha\mathbf{Z}\rangle_{1})$, and $c_{6}=\mathtt{Enc}_{pk}((\mathbf{X}\odot\mathbf{Y})-\langle\mathbf{Z}\rangle_{1})$ to $P_{0}$.
4. $P_{0}$ decrypts $c_{3}$, $c_{4}$, $c_{5}$ and $c_{6}$ to obtain $(\langle\alpha\mathbf{X}\rangle_{0},\langle\alpha\mathbf{Y}\rangle_{0},\langle\alpha\mathbf{Z}\rangle_{0},\langle\mathbf{Z}\rangle_{0})$, respectively. It then takes the first row of $\langle\alpha\mathbf{Y}\rangle_{0}$ as $\langle\alpha\mathbf{y}\rangle_{0}$ (each row being a copy), and sums the elements of each row of $\langle\alpha\mathbf{Z}\rangle_{0}$ and $\langle\mathbf{Z}\rangle_{0}$ to form the vectors $\langle\alpha\mathbf{z}\rangle_{0}$ and $\langle\mathbf{z}\rangle_{0}$. $P_{1}$ does the same for $(\langle\alpha\mathbf{Y}\rangle_{1},\langle\alpha\mathbf{Z}\rangle_{1},\langle\mathbf{Z}\rangle_{1})$ to obtain $\langle\alpha\mathbf{y}\rangle_{1}$, $\langle\alpha\mathbf{z}\rangle_{1}$ and $\langle\mathbf{z}\rangle_{1}$.
5. $P_{b}$ outputs $\{[\![\mathbf{X}]\!]_{b},[\![\mathbf{y}]\!]_{b},[\![\mathbf{z}]\!]_{b}\}_{b\in\{0,1\}}$, where $\mathbf{X}\ast\mathbf{y}=\mathbf{z}$.
Figure 3: Algorithm $\pi_{Mtriple}$ for generating authenticated matrix-vector multiplication triples
Figure 4: Matrix-vector multiplication

Figure 4 provides an example of the multiplication of a $3\times 4$ matrix $\mathbf{X}$ and a 4-dimensional vector $\mathbf{y}$ to facilitate understanding. To compute the additive sharing of $\mathbf{z}=\mathbf{X}\ast\mathbf{y}$ (step (a) in Figure 4), $\mathbf{y}$ is first transformed into a matrix $\mathbf{Y}$ by copying, where each row of $\mathbf{Y}$ contains a copy of $\mathbf{y}$. $P_{1}$ then performs the element-wise multiplication of $\mathbf{X}$ and $\mathbf{Y}$ under the ciphertext (step (b) in Figure 4). To construct the additive sharing of $\mathbf{z}=\mathbf{X}\ast\mathbf{y}$, $P_{1}$ uniformly chooses a random matrix $\mathbf{R}\in\mathbb{F}_{p}^{3\times 4}$ and computes $\langle\mathbf{Z}\rangle_{0}=\mathbf{X}\odot\mathbf{Y}-\mathbf{R}$ (step (c) in Figure 4). $P_{1}$ sends the ciphertext result to $P_{0}$, which decrypts it and sums each row in plaintext to obtain the vector $\langle\mathbf{z}\rangle_{0}$ (step (d) in Figure 4); similarly, $P_{1}$ performs the same operation on the matrix $\mathbf{R}$ to obtain $\langle\mathbf{z}\rangle_{1}$.
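The following NumPy mock replays steps (a)-(d) of Figure 4 in plaintext (with an illustrative small prime). In the real protocol, the element-wise product and the masking are performed under HE; only the row summations happen in plaintext, which is why no homomorphic rotations are needed.

```python
import numpy as np

# Plaintext mock of the rotation-free trick in Figure 4: replicate y into the
# rows of Y, multiply element-wise (a single SIMD homomorphic multiplication
# in the real protocol), secret-share the packed result, and let each party
# row-sum its own share in plaintext.
p = 8191                              # small Mersenne prime, illustrative
rng = np.random.default_rng(2)
X = rng.integers(0, p, (3, 4))
y = rng.integers(0, p, 4)

Y = np.tile(y, (3, 1))                # each row of Y is a copy of y (step (a))
prod = (X * Y) % p                    # element-wise product (step (b), under HE)
R = rng.integers(0, p, (3, 4))        # P_1's random mask
Z0 = (prod - R) % p                   # sent to P_0 (step (c))
z0 = Z0.sum(axis=1) % p               # P_0 decrypts and row-sums (step (d))
z1 = R.sum(axis=1) % p                # P_1 row-sums its mask
assert ((z0 + z1) % p == (X @ y) % p).all()
```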

Remark 4.1: Compared to generating multiplication triples for single multiplications [22, 20], our matrix-vector multiplication triples make the communication overhead independent of the number of multiplications and dependent only on the size of the input. This reduces the amount of data exchanged between $P_{0}$ and $P_{1}$. In addition, we move the majority of the computation to the semi-honest party, which avoids the distributed decryption and frequent zero-knowledge proofs needed in malicious adversary settings. Compared to existing parallel homomorphic computation methods [17, 14], our matrix-vector multiplication does not involve any rotation operation, which is very computationally expensive compared to other homomorphic operations. This stems from our observation of the inner tie between HE and secret sharing: since the final ciphertext result needs to be secretly shared between $P_{0}$ and $P_{1}$, we can first perform the secret sharing under the ciphertext (steps (c) and (d) in Figure 4), and then perform all rotation and summation operations in plaintext.

Security. Our protocol $\pi_{Mtriple}$ for generating matrix-vector multiplication triples is secure against a malicious model holder $P_{0}$ and a semi-honest client $P_{1}$. We provide the following theorem and prove it in Appendix B.

Theorem 4.1.

Let the fully homomorphic encryption used in $\pi_{Mtriple}$ have the properties defined in Section 2.4. Then $\pi_{Mtriple}$ is secure against a malicious model holder $P_{0}$ and a semi-honest client $P_{1}$.

4.1.2    Generating convolution triple

We now describe the technical details of generating authenticated convolution triples. Briefly, for a given convolution operation, we first convert it to an equivalent matrix multiplication and then generate triples for that matrix multiplication. We start by reviewing the definition of convolution and how to translate it into an equivalent matrix multiplication. Then, we explain how to generate the authenticated triples.

Convolution. Assume an input tensor of size $u_{w}\times u_{h}$ with $c_{i}$ channels, denoted $\mathbf{X}_{ijk}$, where $1\leq i\leq u_{w}$ and $1\leq j\leq u_{h}$ are spatial coordinates and $1\leq k\leq c_{i}$ is the channel. Let $c_{o}$ kernels of size $(2l+1)\times(2l+1)\times c_{i}$ be denoted by the tensor $\mathbf{Y}_{\Delta_{i},\Delta_{j},k,k'}$, where $-l\leq\Delta_{i},\Delta_{j}\leq l$ are shifts of the spatial coordinates, and $1\leq k\leq c_{i}$ and $1\leq k'\leq c_{o}$ index the channels and kernels, respectively. The convolution between $\mathbf{X}$ and $\mathbf{Y}$ (i.e., $\mathbf{Z}=\mathtt{Conv}(\mathbf{X},\mathbf{Y})$) is defined as below:

$\mathbf{Z}_{ijk'}=\sum_{\Delta_{i},\Delta_{j},k}\mathbf{X}_{i+\Delta_{i},\,j+\Delta_{j},\,k}\cdot\mathbf{Y}_{\Delta_{i},\Delta_{j},k,k'}$  (9)

The resulting tensor $\mathbf{Z}_{ijk'}$ has $u_{w}'\times u_{h}'$ spatial coordinates and $c_{o}$ channels, where $u_{w}'=(u_{w}-(2l+1)+2p)/s+1$ and $u_{h}'=(u_{h}-(2l+1)+2p)/s+1$; here $p$ represents the amount of zero-padding of the input and $s$ represents the stride of the kernel movement [25]. Note that the entries of $\mathbf{X}$ are taken to be zero if $i+\Delta_{i}$ or $j+\Delta_{j}$ falls outside the range $[1;u_{w}]$ or $[1;u_{h}]$, respectively.

Conversion between convolution and matrix multiplication. Based on Eqn. (9), we can easily convert a convolution into an equivalent matrix multiplication. Specifically, we construct a matrix $\mathbf{X}'$ of dimension $u_{w}'u_{h}'\times(2l+1)^{2}c_{i}$, where $\mathbf{X}'_{(i,j),(\Delta_{i},\Delta_{j},k)}=\mathbf{X}_{i+\Delta_{i},j+\Delta_{j},k}$. Similarly, we construct a matrix $\mathbf{Y}'$ of dimension $(2l+1)^{2}c_{i}\times c_{o}$ such that $\mathbf{Y}'_{(\Delta_{i},\Delta_{j},k),k'}=\mathbf{Y}_{\Delta_{i},\Delta_{j},k,k'}$. Then, the original convolution is transformed into $\mathbf{Z}'=\mathbf{X}'\ast\mathbf{Y}'$, where $\mathbf{Z}'_{(i,j),k'}=\mathbf{Z}_{ijk'}$. In Appendix 9, we provide a detailed example of the above transformation.
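A hedged NumPy sketch of this conversion for the simplest case ($c_i=c_o=1$, stride 1, no padding, and kernel offsets re-indexed to $0,\ldots,2l$ for convenience); it verifies that the matrix product $\mathbf{X}'\ast\mathbf{Y}'$ reproduces the convolution.

```python
import numpy as np

# im2col-style sketch: a single-channel valid convolution with a
# (2l+1)x(2l+1) kernel, rewritten as the matrix product Z' = X' * Y'.
rng = np.random.default_rng(3)
uw = uh = 5; l = 1; k = 2 * l + 1
X = rng.integers(0, 10, (uw, uh))
Y = rng.integers(0, 10, (k, k))
uw_, uh_ = uw - k + 1, uh - k + 1   # output spatial size for p=0, s=1

# X'[(i,j), (di,dj)] = X[i+di, j+dj]; Y' flattens the kernel into a column
Xp = np.array([[X[i + di, j + dj] for di in range(k) for dj in range(k)]
               for i in range(uw_) for j in range(uh_)])
Yp = Y.reshape(-1, 1)
Zp = Xp @ Yp                        # the convolution as a matrix product

ref = np.array([[(X[i:i + k, j:j + k] * Y).sum() for j in range(uh_)]
                for i in range(uw_)])
assert (Zp.reshape(uw_, uh_) == ref).all()
```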

Input: $P_{b}$ ($b\in\{0,1\}$) holds $\langle\mathbf{X}\rangle_{b}$ uniformly chosen from $\mathbb{F}_{p}^{u_{w}\times u_{h}\times c_{i}}$, and $\langle\mathbf{Y}\rangle_{b}$ uniformly chosen from $\mathbb{F}_{p}^{(2l+1)\times(2l+1)\times c_{i}\times c_{o}}$. In addition, $P_{1}$ holds a MAC key $\alpha$ uniformly chosen from $\mathbb{F}_{p}$.
Output: $P_{b}$ obtains $\{[\![\mathbf{X}]\!]_{b},[\![\mathbf{Y}]\!]_{b},[\![\mathbf{Z}]\!]_{b}\}_{b\in\{0,1\}}$, where $\mathbf{Z}=\mathtt{Conv}(\mathbf{X},\mathbf{Y})$.
Procedure:
1. $P_{0}$ and $P_{1}$ participate in a secure two-party computation such that $P_{0}$ obtains an FHE public-secret key pair $(pk,sk)$ while $P_{1}$ obtains the public key $pk$. This process is performed only once.
2. $P_{0}$ first converts $\langle\mathbf{X}\rangle_{0}$ and $\langle\mathbf{Y}\rangle_{0}$ into the equivalent matrices $\langle\mathbf{X}'\rangle_{0}$ and $\langle\mathbf{Y}'\rangle_{0}$, where $\langle\mathbf{X}'\rangle_{0}\in\mathbb{F}_{p}^{u_{w}'u_{h}'\times(2l+1)^{2}c_{i}}$ and $\langle\mathbf{Y}'\rangle_{0}\in\mathbb{F}_{p}^{(2l+1)^{2}c_{i}\times c_{o}}$. Then, $P_{0}$ sends the encryptions $c_{1}\leftarrow\mathtt{Enc}(pk,\langle\mathbf{X}'\rangle_{0})$ and $c_{2}\leftarrow\mathtt{Enc}(pk,\langle\mathbf{Y}'\rangle_{0})$ to $P_{1}$, along with zero-knowledge (ZK) proofs of plaintext knowledge of the two ciphertexts.
3. $P_{1}$ also converts $\langle\mathbf{X}\rangle_{1}$ and $\langle\mathbf{Y}\rangle_{1}$ into the equivalent matrices $\langle\mathbf{X}'\rangle_{1}$ and $\langle\mathbf{Y}'\rangle_{1}$. It then samples $(\langle\alpha\mathbf{X}'\rangle_{1},\langle\alpha\mathbf{Y}'\rangle_{1},\langle\alpha\mathbf{Z}'\rangle_{1},\langle\mathbf{Z}'\rangle_{1})$ and computes $c_{3}=\mathtt{Enc}_{pk}(\alpha(\langle\mathbf{X}'\rangle_{1}+\langle\mathbf{X}'\rangle_{0})-\langle\alpha\mathbf{X}'\rangle_{1})$, $c_{4}=\mathtt{Enc}_{pk}(\alpha(\langle\mathbf{Y}'\rangle_{1}+\langle\mathbf{Y}'\rangle_{0})-\langle\alpha\mathbf{Y}'\rangle_{1})$, $c_{5}=\alpha\boxtimes(\mathbf{c}_{\mathbf{X}'}\circledast\mathbf{c}_{\mathbf{Y}'})-\mathtt{Enc}_{pk}(\langle\alpha\mathbf{Z}'\rangle_{1})$, and $c_{6}=(\mathbf{c}_{\mathbf{X}'}\circledast\mathbf{c}_{\mathbf{Y}'})-\mathtt{Enc}_{pk}(\langle\mathbf{Z}'\rangle_{1})$. $P_{1}$ sends $c_{3}$, $c_{4}$, $c_{5}$ and $c_{6}$ to $P_{0}$.
4. $P_{0}$ decrypts $c_{3}$, $c_{4}$, $c_{5}$ and $c_{6}$ to obtain $(\langle\alpha\mathbf{X}'\rangle_{0},\langle\alpha\mathbf{Y}'\rangle_{0},\langle\alpha\mathbf{Z}'\rangle_{0},\langle\mathbf{Z}'\rangle_{0})$, respectively. Then, both $P_{0}$ and $P_{1}$ convert these matrices back into tensors to get $(\langle\alpha\mathbf{X}\rangle_{b},\langle\alpha\mathbf{Y}\rangle_{b},\langle\alpha\mathbf{Z}\rangle_{b},\langle\mathbf{Z}\rangle_{b})$ for $b\in\{0,1\}$.
5. $P_{b}$ outputs $\{[\![\mathbf{X}]\!]_{b},[\![\mathbf{Y}]\!]_{b},[\![\mathbf{Z}]\!]_{b}\}_{b\in\{0,1\}}$, where $\mathbf{Z}=\mathtt{Conv}(\mathbf{X},\mathbf{Y})$.
Figure 5: Algorithm $\pi_{Ctriple}$ for generating authenticated convolution triples

Generating convolution triples. Figure 5 depicts the interaction between the model holder $P_{0}$ and the client $P_{1}$ to generate convolution triples. Succinctly, $P_{0}$ first uniformly selects $\langle\mathbf{X}'\rangle_{0}$ and $\langle\mathbf{Y}'\rangle_{0}$ and sends their encryptions to $P_{1}$, along with zero-knowledge proofs about these ciphertexts (step 2 in Figure 5). $P_{1}$ recovers $\mathbf{X}'$ and $\mathbf{Y}'$ under the ciphertext and then computes $(\langle\alpha\mathbf{X}'\rangle_{0},\langle\alpha\mathbf{Y}'\rangle_{0},\langle\alpha\mathbf{Z}'\rangle_{0},\langle\mathbf{Z}'\rangle_{0})$ (step 3 in Figure 5). It then returns the corresponding ciphertexts to $P_{0}$, which decrypts them and computes $\langle\alpha\mathbf{X}\rangle_{0}$, $\langle\alpha\mathbf{Y}\rangle_{0}$, $\langle\alpha\mathbf{Z}\rangle_{0}$ and $\langle\mathbf{Z}\rangle_{0}$ (step 4 in Figure 5). Finally, $P_{b}$ obtains $\{[\![\mathbf{X}]\!]_{b},[\![\mathbf{Y}]\!]_{b},[\![\mathbf{Z}]\!]_{b}\}_{b\in\{0,1\}}$, where $\mathbf{Z}=\mathtt{Conv}(\mathbf{X},\mathbf{Y})$.

Remark 4.2: We utilize the method in [17] to perform the homomorphic multiplications involved in generating convolution triples in parallel. For the multiplication of two d×d matrices, it reduces the computational complexity from O(d²) to O(d) compared with the existing method [14]. Besides, [17] requires only one ciphertext to represent a single matrix, whereas the existing work [14] requires d ciphertexts (assuming the number of plaintext slots n in the FHE scheme is at least d²). In addition, compared with generating a multiplication triple for every single multiplication [22, 20], the communication overhead of our method is independent of the number of multiplications and depends only on the size of the inputs, i.e., it reduces the communication cost from cubic to quadratic in d (in both the offline and online phases).
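As a concrete count behind the last claim (our own back-of-the-envelope illustration): multiplying two d×d matrices involves d³ scalar multiplications, so opening one Beaver triple per scalar multiplication exchanges on the order of d³ masked field elements, whereas a single matrix triple only requires opening the two masked operands once:

```latex
\underbrace{\Theta(d^{3})}_{\text{one scalar triple per multiplication}}
\quad\longrightarrow\quad
\underbrace{2d^{2}}_{\text{open } \mathbf{A}-\mathbf{X},\ \mathbf{B}-\mathbf{Y} \text{ once}}
\ \text{field elements.}
```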

Remark 4.3: We further exploit the semi-honesty of the client to improve the performance of generating convolution triples. Specifically, for the multiplication of matrices X and Y, the permutations σ(X) and φ(Y) can be computed in plaintext beforehand, which halves the number of rotations compared to the original method (see Section 3.2 in [17] for comparison). Moreover, we move the majority of the computation to the semi-honest party, which avoids the distributed decryption and frequent zero-knowledge proofs required in fully malicious settings.

Security. Our protocol for generating authenticated convolution triples, πCtriple\pi_{Ctriple}, is secure against the malicious model holder P0P_{0} and the semi-honest client P1P_{1}. We provide the following theorem.

Theorem 4.2.

Let the fully homomorphic encryption used in πCtriple\pi_{Ctriple} have the properties defined in Section 2.4. πCtriple\pi_{Ctriple} is secure against the malicious model holder P0P_{0} and the semi-honest client P1P_{1}.

Proof.

The proof logic of this theorem is very similar to that of Theorem 4.1, so we omit it for brevity. ∎

4.1.3    Preprocessing for the nonlinear layer

This process is performed by the client to generate garbled circuits of nonlinear functions for the model holder. Note that we do not generate a GC for the whole ReLU but only for its nonlinear part, i.e., sign(v) for an arbitrary input v. We first define a truncation function Trun_h: {0,1}^λ → {0,1}^h, which outputs the last h bits of its input, where λ satisfies λ ≥ 2κ. Then, the client generates random ciphertexts and sends them to the model holder as follows.

  • \bullet

    Given the security parameter λ\lambda, and the boolean circuit boolnCbooln^{C} denoted the nonlinear part of ReLU, P1P_{1} computes 𝙶𝚊𝚛𝚋𝚕𝚎(1λ,boolnC)(𝙶𝙲,{{𝚕𝚊𝚋i,jin}i[2κ],\mathtt{Garble}(1^{\lambda},booln^{C})\rightarrow(\mathtt{GC},\{\{\mathtt{lab}_{i,j}^{in}\}_{i\in[2\kappa]}, {𝚕𝚊𝚋i,jout}i[2κ]}j{0,1})\{\mathtt{lab}_{i,j}^{out}\}_{i\in[2\kappa]}\}_{j\in\{0,1\}}), where 𝙶𝙲\mathtt{GC} is the garbled circuit of boolnCbooln^{C}, {{𝚕𝚊𝚋i,jin}i[2κ],\{\{\mathtt{lab}_{i,j}^{in}\}_{i\in[2\kappa]}, {𝚕𝚊𝚋i,jout}i[2κ]}j{0,1}\{\mathtt{lab}_{i,j}^{out}\}_{i\in[2\kappa]}\}_{j\in\{0,1\}} represent all possible garbled input and output labels, respectively. P1P_{1} sends 𝙶𝙲\mathtt{GC} to the model holder P0P_{0}.

  • \bullet

P_1 uniformly selects η_{i,0}, γ_{i,0} and ι_{i,0} from F_p for every i ∈ [κ]. Then, P_1 sets (η_{i,1}, γ_{i,1}, ι_{i,1}) = (1 + η_{i,0}, α + γ_{i,0}, α + ι_{i,0}).

  • \bullet

    P1P_{1} parses {𝚕𝚊𝚋i,jout}\{\mathtt{lab}_{i,j}^{out}\} as ςi,j||ϑi,j\varsigma_{i,j}||\vartheta_{i,j} for every i[2κ]i\in[2\kappa] and j{0,1}j\in\{0,1\}, where ςi,j{0,1}\varsigma_{i,j}\in\{0,1\} and ϑi,j{0,1}λ1\vartheta_{i,j}\in\{0,1\}^{\lambda-1}.

  • \bullet

    For every i[κ]i\in[\kappa] and j{0,1}j\in\{0,1\}, P1P_{1} sends cti,ςi,jct_{i,\varsigma_{i,j}} and ct^i,ςi+κ,j\hat{ct}_{i,\varsigma_{i+\kappa,j}} to P0P_{0}, where cti,ςi,j=ιi,j𝐓𝐫𝐮𝐧κ(ϑi,j)ct_{i,\varsigma_{i,j}}=\iota_{i,j}\oplus\mathbf{Trun}_{\kappa}(\vartheta_{i,j}) and ct^i,ςi+κ,j=(ηi,j||γi,j)𝐓𝐫𝐮𝐧2κ(ϑi+κ,j)\hat{ct}_{i,\varsigma_{i+\kappa,j}}=(\eta_{i,j}||\gamma_{i,j})\oplus\mathbf{Trun}_{2\kappa}(\vartheta_{i+\kappa,j}).

Security. We defer the explanation of the above ciphertexts sent by P_1 to P_0 to the following sections; here we briefly describe the security of the preprocessing for nonlinear layers. It is easy to see that this preprocessing is secure against the semi-honest client P_1 and the malicious model holder P_0. Specifically, for the client P_1, since the entire preprocessing does not require the participation of the model holder, the client cannot obtain any private information about the model holder. Similarly, for the malicious model holder P_0, since the preprocessing is non-interactive and the generated ciphertexts satisfy the GC security defined in Section 2.9, P_0 cannot obtain the plaintexts corresponding to the ciphertexts sent by the client.

4.2 Online Phase

In this section, we describe the online phase of VerifyML. We first explain how VerifyML utilizes the triples generated in the offline phase to generate authenticated shares for matrix-vector multiplication and convolution. Then, we describe the technical details of the nonlinear operation.

4.2.1    Perform linear layers in the online phase

Preamble: Consider a neural network (NN) consisting of m linear layers and m−1 nonlinear layers. Let the specifications of the linear layers be L_1, L_2, ⋯, L_m and those of the nonlinear layers be f_1, ⋯, f_{m−1}.
Input: P_0 holds {L_i}_{i∈[m]}, i.e., the weights of the m linear layers. P_1 holds x_0, the input of the NN, and a random MAC key α from F_p to be used throughout the protocol execution.
Output: PbP_{b} obtains [[𝐯i=𝐋i𝐱i1]]b[\![\mathbf{v}_{i}=\mathbf{L}_{i}\mathbf{x}_{i-1}]\!]_{b} for i[m]i\in[m] and b={0,1}b=\{0,1\}.
Procedure:
Input Sharing: 1. To share P_0's input {L_i}_{i∈[m]}, both parties pick a fresh authenticated element [[R_i]] of the same dimension as L_i. 2. [[R_i]] is opened to P_0, which then sends ϖ_i = L_i − R_i to P_1. 3. P_b locally computes [[L_i]]_b = [[R_i]]_b + ϖ_i for b ∈ {0,1}. 4. To share P_1's input x_0, P_1 randomly selects two masks ξ and ζ of the same dimension as x_0. Then, it sends [[x_0]]_0 = (x_0 − ξ, αx_0 − ζ) to P_0, and sets [[x_0]]_1 = (ξ, ζ). 5. For each i ∈ [m]: • Matrix-vector Multiplication: To compute an authenticated sharing of the product of a matrix A and a vector b generated during inference, P_0 and P_1 take a fresh authenticated matrix-vector triple {[[X]]_b, [[y]]_b, [[z]]_b}_{b∈{0,1}} of dimensions consistent with A and b. Then, both parties open A − X and b − y. Finally, P_b locally computes [[A ∗ b]]_b based on Eqn.(8). • Convolution: To compute an authenticated sharing of the convolution of tensors A and B generated during inference, P_0 and P_1 take a fresh authenticated convolution triple {[[X]]_b, [[Y]]_b, [[Z]]_b}_{b∈{0,1}} of dimensions consistent with A and B. Then, both parties open A − X and B − Y. Finally, P_b locally computes [[Conv(A, B)]]_b based on Eqn.(8). 6. P_b obtains [[v_i = L_i x_{i−1}]]_b for i ∈ [m] and b ∈ {0,1}.
Figure 5: Online linear layers protocol πOLin\pi_{OLin}

Figure 5 depicts the interaction between the model holder and the client when performing linear-layer operations in the online phase. Specifically, given the model holder's input {L_i}_{i∈[m]} and the client's input x_0, both parties first generate authenticated shares of their respective inputs (steps 1-4 in Figure 5). Since the client is considered semi-honest, its input is shared more efficiently than the model holder's, i.e., only local computations on randomly selected masks are required, while the sharing of the model holder's input follows previous protocols for malicious settings [22, 20, 6]. After that, the model holder and the client use the triples generated in the offline phase (i.e., matrix-vector multiplication triples and convolution triples) to generate authenticated sharings of the linear layers' outputs (step 5 in Figure 5), as sketched below.
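The local combination in step 5 is the standard Beaver-style reconstruction. The following sketch is our own plaintext illustration of the arithmetic behind Eqn.(8), showing value shares only; the MAC shares are obtained the same way from the ⟨αX⟩, ⟨αy⟩, ⟨αz⟩ components, and the tiny prime is a placeholder.

```python
# One matrix-vector triple (X, y, z = X @ y) being consumed in the online phase.
import numpy as np

p = 97                                   # tiny illustrative prime
rng = np.random.default_rng(1)
rand = lambda *s: rng.integers(0, p, size=s, dtype=np.int64)

def share(v):                            # additive sharing over F_p
    r = rand(*v.shape)
    return (v - r) % p, r

A, b = rand(4, 6), rand(6)               # the linear layer's operands
X, y = rand(4, 6), rand(6)               # offline triple components
z = X @ y % p
A0, A1 = share(A); b0, b1 = share(b)
X0, X1 = share(X); y0, y1 = share(y); z0, z1 = share(z)

# Online: both parties open E = A - X and f = b - y ...
E = (A0 - X0 + A1 - X1) % p
f = (b0 - y0 + b1 - y1) % p

# ... and locally combine: A*b = z + X*f + E*y + E*f.
out0 = (z0 + X0 @ f + E @ y0 + E @ f) % p    # P_0's share (takes the E*f term)
out1 = (z1 + X1 @ f + E @ y1) % p            # P_1's share
assert ((out0 + out1) % p == A @ b % p).all()
```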

Security. Our protocol for performing linear layer operations in the online phase, πOLin\pi_{OLin}, is secure against the malicious model holder P0P_{0} and the semi-honest client P1P_{1}. We provide the following theorem.

Theorem 4.3.

Let the triples used in π_OLin be generated by π_Mtriple and π_Ctriple. Then π_OLin is secure against the malicious model holder P_0 and the semi-honest client P_1.

Proof.

The proof logic of this theorem is identical to that of [9]. Interested readers can refer to [9] for more details. ∎

4.2.2    Perform non-linear layers in the online phase

Input: P_0 holds [[v_i]]_0 and P_1 holds [[v_i]]_1 for i ∈ [m]. In addition, P_1 holds the MAC key α.
Output: PbP_{b} obtains [[𝐱i=ReLU(𝐯i)]]b[\![\mathbf{x}_{i}=ReLU(\mathbf{v}_{i})]\!]_{b} and α𝐯ib\langle\alpha\mathbf{v}_{i}\rangle_{b} for i[m]i\in[m] and b={0,1}b=\{0,1\}.
Procedure (take a single v_i as an example): 1. Garbled Circuit Phase: • P_0 and P_1 invoke OT_λ^κ (see Section 2.8), where P_1's inputs are {lab_{j,0}^in, lab_{j,1}^in}_{j∈{κ+1,⋯,2κ}} while P_0's input is ⟨v_i⟩_0. Hence, P_0 learns {lab̃_j^in}_{j∈{κ+1,⋯,2κ}}. Also, P_1 sends its garbled inputs {lab̃_j^in = lab_{j,⟨v_i⟩_1[j]}^in}_{j∈[κ]} to P_0. • With GC and {lab̃_j^in}_{j∈[2κ]}, P_0 evaluates GCEval(GC, {lab̃_j^in}_{j∈[2κ]}) → {lab̃_j^out}_{j∈[2κ]}. 2. Authentication Phase 1: • P_0 parses lab̃_j^out as ς̃_j||ϑ̃_j, where ς̃_j ∈ {0,1} and ϑ̃_j ∈ {0,1}^{λ−1}, for every j ∈ [2κ]. • P_0 computes c_j = ct_{j,ς̃_j} ⊕ Trun_κ(ϑ̃_j) and (d_j||e_j) = ct̂_{j,ς̃_{j+κ}} ⊕ Trun_{2κ}(ϑ̃_{j+κ}) for every j ∈ [κ]. 3. Local Computation Phase: • P_1 outputs ⟨g_1⟩_1 = −Σ_{j∈[κ]} ι_{j,0} 2^{j−1}, ⟨g_2⟩_1 = −Σ_{j∈[κ]} η_{j,0} 2^{j−1} and ⟨g_3⟩_1 = −Σ_{j∈[κ]} γ_{j,0} 2^{j−1}. • P_0 outputs ⟨g_1⟩_0 = Σ_{j∈[κ]} c_j 2^{j−1}, ⟨g_2⟩_0 = Σ_{j∈[κ]} d_j 2^{j−1} and ⟨g_3⟩_0 = Σ_{j∈[κ]} e_j 2^{j−1}. 4. Authentication Phase 2: • For every v_i with i ∈ [m], P_b takes a fresh authenticated triple {[[x]]_b, [[y]]_b, [[z]]_b}_{b∈{0,1}}. • Both parties reveal v_i − x and g_2 − y to each other, and then locally compute ⟨z_2⟩_b = ⟨v_i · sign(v_i)⟩_b and ⟨z_3⟩_b = ⟨αv_i · sign(v_i)⟩_b based on Eqn.(8). • P_b obtains [[x_i = ReLU(v_i)]]_b = (⟨z_2⟩_b, ⟨z_3⟩_b) and ⟨αv_i⟩_b = ⟨g_1⟩_b.
Figure 6: Online non-linear layers protocol πONlin\pi_{ONlin}

In this section, we present the technical details of executing nonlinear functions in the online phase. We mainly focus on how to securely compute the activation function ReLU, the most representative nonlinear function in deep neural networks. As shown in Figure 5, the result v_i of each linear layer L_i is held by both parties in the form of an authenticated sharing. Similarly, for the function f_i in the i-th nonlinear layer, the goal of VerifyML is to securely compute f_i(v_i) and share it between the model holder and the client in the same authenticated manner. We describe the details in Figure 6.

Garbled Circuit Phase. As described in Section 4.1.3, in the offline phase P_1 constructs a GC for the nonlinear part of ReLU (i.e., sign(v_i) for an arbitrary input v_i ∈ F_p) and sends it to P_0. In the online phase, P_0 and P_1 invoke OT_λ^κ, where P_1 acts as the sender with inputs {lab_{j,0}^in, lab_{j,1}^in}_{j∈{κ+1,⋯,2κ}} and P_0 acts as the receiver with input ⟨v_i⟩_0. As a result, P_0 obtains the garbled inputs of v_i for the GC. Then, P_0 evaluates the GC on these garbled inputs and learns the output labels for the bits of v_i and sign(v_i).

Authentication Phase 1. This phase computes shares of the authenticated bits of v_i, i.e., of sign(v_i)[j], αsign(v_i)[j], and αv_i[j] for j ∈ [κ], from the labels learned in the previous phase. We take the computation of αv_i[j] as an example. Clearly, αv_i[j] is either 0 or α depending on whether v_i[j] is 0 or 1. Recall that for each bit v_i[j] the garbler P_1 knows both output labels, which we denote lab_{j,0}^out (for v_i[j] = 0) and lab_{j,1}^out (for v_i[j] = 1), while the evaluator P_0 learns exactly one of them. To calculate the shares of αv_i[j], P_1 randomly selects ι_j ∈ F_p in the offline phase, encrypts ι_j under lab_{j,0}^out and ι_j + α under lab_{j,1}^out, sends the two ciphertexts to P_0, and sets its own share of αv_i[j] to −ι_j. Since P_0 obtained lab_{j,v_i[j]}^out in the previous phase, it can decrypt exactly one of the two ciphertexts and thereby obtain its own share of αv_i[j], namely ι_j + αv_i[j]. The computation of sign(v_i)[j] and αsign(v_i)[j] follows similar logic, using the random values η_j and γ_j chosen by P_1 in the offline phase.

Local Computation Phase. This phase computes shares of sign(v_i), αsign(v_i), and αv_i from the per-bit results learned by both parties in the previous phase. For example, to compute its share of αv_i, each party locally multiplies its share of αv_i[j] by 2^{j−1} and sums the resulting values. Each party computes its shares of sign(v_i) and αsign(v_i) in the same manner.
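The following sketch is our own illustration of Authentication Phase 1 together with the Local Computation Phase for the αv_i shares: κ = 4 bits for brevity (VerifyML uses κ = 44), and the garbled output labels are modeled as random λ-bit integers rather than coming from Garble/GCEval.

```python
import secrets

p, kappa, lam = 97, 4, 128
alpha = 21
v = 0b1011                                   # the value whose bits the GC outputs
bit = lambda x, j: (x >> j) & 1

# Offline (P_1): per-bit masks iota_j; one-time-pad iota_j (for bit 0) and
# iota_j + alpha (for bit 1) under the truncated output labels (Trun).
iota = [secrets.randbelow(p) for _ in range(kappa)]
labels = [[secrets.randbits(lam) for _ in (0, 1)] for _ in range(kappa)]
trun = lambda lab: lab & ((1 << 64) - 1)     # keep the last 64 bits of a label
cts = [[(iota[j] + a * alpha) % p ^ trun(labels[j][a]) for a in (0, 1)]
       for j in range(kappa)]

# Online (P_0): GCEval yields exactly one label per bit, so P_0 can open
# exactly one ciphertext per bit, learning its share iota_j + alpha*v[j].
share0 = [cts[j][bit(v, j)] ^ trun(labels[j][bit(v, j)]) for j in range(kappa)]

# Local Computation Phase: recombine the per-bit shares with powers of two.
g1_0 = sum(s << j for j, s in enumerate(share0)) % p   # P_0's share of alpha*v
g1_1 = -sum(m << j for j, m in enumerate(iota)) % p    # P_1's share of alpha*v
assert (g1_0 + g1_1) % p == alpha * v % p
```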

Authentication Phase 2. We compute the shares of ReLU(𝐯i)=𝐯isign(𝐯i)ReLU(\mathbf{v}_{i})=\mathbf{v}_{i}sign(\mathbf{v}_{i}), and αReLU(𝐯i)\alpha ReLU(\mathbf{v}_{i}). Since each party holds the authenticated shares of 𝐯i\mathbf{v}_{i} and sign(𝐯i)sign(\mathbf{v}_{i}), we can achieve this based on Eqn.(8).

Remark 4.4. We adopt two methods to minimize the number of multiplication operations involved in the GC. One is to compute the garbled output of sign(v_i) bit by bit in the GC. The other is to encapsulate only the nonlinear part of ReLU into the GC. In this way, we avoid computing αReLU(v_i) and ReLU(v_i) inside the GC, which would be multiplication-intensive. Compared with works [22, 20, 6] against malicious adversaries, VerifyML reduces the communication overhead of each ReLU from 2cλ + 190κλ + 232κ² to 2dλ + 4κλ + 6κ² bits, where d ≪ c.
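For intuition on the magnitude of these terms (our own arithmetic, assuming the implementation parameters κ = 44 and λ = 128):

```latex
4\kappa\lambda + 6\kappa^{2} = 4\cdot 44\cdot 128 + 6\cdot 44^{2}
= 22528 + 11616 = 34144\ \text{bits} \approx 4.2\ \text{KB},
```

which already accounts for roughly half of the ≈8.33 KB per-ReLU communication reported in Section 5.3.2; the remaining 2dλ bits scale with the small number d of garbled gates that survive the above optimizations.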

Remark 4.5. We devise a lightweight method to check whether the model holder's input to a nonlinear layer is consistent with what it learned from the previous layer. Specifically, at the end of evaluating the i-th linear layer, both parties learn shares of αv_i. Then, v_i is used as the input of the i-th nonlinear layer. To check that P_0 feeds the correct input, we require αv_i to be recomputed in the GC and shared again between both parties. Therefore, after evaluating each nonlinear layer, both parties hold two independent sharings of αv_i. This provides a way to determine whether P_0 supplied the correct input: verify that the two independent sharings reconstruct the same value (see Section 4.3 for more details).

Correctness. We analyze the correctness of our protocol π_ONlin as follows. By the correctness of OT_λ^κ, the model holder P_0 holds {lab̃_j^in = lab_{j,⟨v_i⟩_0[j]}^in}_{j∈{κ+1,⋯,2κ}}. Using {lab̃_j^in = lab_{j,⟨v_i⟩_1[j]}^in}_{j∈[κ]} and the correctness of (Garble, GCEval) for the circuit booln^C, we learn lab̃_j^out = lab_{j,v_i[j]}^out and lab̃_{j+κ}^out = lab_{j+κ,sign(v_i)[j]}^out for j ∈ [κ]. Therefore, for every j ∈ [κ], we have ς̃_j||ϑ̃_j = ς_{j,v_i[j]}||ϑ_{j,v_i[j]} and ς̃_{j+κ}||ϑ̃_{j+κ} = ς_{j+κ,sign(v_i)[j]}||ϑ_{j+κ,sign(v_i)[j]}. Hence, c_j = ct_{j,ς_{j,v_i[j]}} ⊕ Trun_κ(ϑ_{j,v_i[j]}) = ι_{j,v_i[j]} and (d_j||e_j) = ct̂_{j,ς_{j+κ,sign(v_i)[j]}} ⊕ Trun_{2κ}(ϑ_{j+κ,sign(v_i)[j]}) = η_{j,sign(v_i)[j]}||γ_{j,sign(v_i)[j]}. Based on these, we have

  • \bullet

    g1=j[κ](cjιj,0)2j1=j[κ]α(𝐯i[j])2j1=α𝐯ig_{1}=\sum_{j\in[\kappa]}(c_{j}-\iota_{j,0})2^{j-1}=\sum_{j\in[\kappa]}\alpha(\mathbf{v}_{i}[j])2^{j-1}=\alpha\mathbf{v}_{i}.

  • \bullet

    g2=j[κ](djηj,0)2j1=j[κ](sign(𝐯i)[j])2j1=sign(𝐯i)g_{2}=\sum_{j\in[\kappa]}(d_{j}-\eta_{j,0})2^{j-1}=\sum_{j\in[\kappa]}(sign(\mathbf{v}_{i})[j])2^{j-1}=sign(\mathbf{v}_{i}).

  • \bullet

    g3=j[κ](ejγj,0)2j1=j[κ]α(sign(𝐯i)[j])2j1=αsign(𝐯i)g_{3}=\sum_{j\in[\kappa]}(e_{j}-\gamma_{j,0})2^{j-1}=\sum_{j\in[\kappa]}\alpha(sign(\mathbf{v}_{i})[j])2^{j-1}=\alpha sign(\mathbf{v}_{i}).

Since each party holds the authenticated shares of 𝐯i\mathbf{v}_{i} and sign(𝐯i)sign(\mathbf{v}_{i}), we can easily compute the shares of f(𝐯i)=𝐯isign(𝐯i)f(\mathbf{v}_{i})=\mathbf{v}_{i}sign(\mathbf{v}_{i}), and αf(𝐯i)\alpha f(\mathbf{v}_{i}). This concludes the correctness proof.

Security. Our protocol for performing nonlinear layer operations in the online phase, πONlin\pi_{ONlin}, is secure against the malicious model holder P0P_{0} and the semi-honest client P1P_{1}. We provide the following theorem and prove it in Appendix D.

Theorem 4.4.

Let (𝙶𝚊𝚛𝚋𝚕𝚎\mathtt{Garble}, 𝙶𝙲𝙴𝚟𝚊𝚕\mathtt{GCEval}) be a garbling scheme with the properties defined in Section 2.9. Authenticated shares have the properties defined in Section 2.6. Then our protocol πONlin\pi_{ONlin} is secure against the malicious model holder P0P_{0} and the semi-honest client P1P_{1}.

4.3 Consistency Check

VerifyML performs π_OLin and π_ONlin alternately in the online phase to output the inference result M(x_0) for a given input x_0, where all intermediate results output by the linear and nonlinear layers are held by P_0 and P_1 in an authenticated sharing manner. To verify the correctness of M(x_0), the client needs to perform a consistency check on all computed results. If the verification passes, P_1 locally evaluates the fairness of the ML model based on Eqn.(2); otherwise, it aborts. In more detail, for sharing P_0's input and executing each linear layer {L_i}_{i∈[m]}, VerifyML needs to pick a large number of fresh authenticated single elements or triples (see Figure 5) and open them for computation. Assume that the set of all opened elements is (a_1, a_2, ⋯, a_t), and that P_b holds ⟨ρ_i⟩_b = ⟨αa_i⟩_b as well as ⟨τ_i⟩_b = ⟨a_i⟩_b; we need to perform a consistency check to verify that ρ_i − ατ_i = 0. Besides, for executing each nonlinear layer {f_i}_{i∈[m−1]}, the inputs of π_ONlin are shares of v_i and τ_i = αv_i. To check that P_0 feeds the correct input, we require αv_i to be recomputed in the GC and shared again between both parties, denoting the new αv_i by ξ_i. We also need to perform a consistency check to verify that Σ_{i=1}^{m−1}(τ_i − ξ_i) = 0.

Input:PbP_{b} b{0,1}b\in\{0,1\} holds τib\left\langle\tau_{i}\right\rangle_{b}, ξib\left\langle\xi_{i}\right\rangle_{b} and [[aj]]b[\![a_{j}]\!]_{b} for i[m1]i\in[m-1] and j[t]j\in[t].
Output: P1P_{1} obtains 𝐌(𝐱0)\mathbf{M}(\mathbf{x}_{0}) if verification passes. Otherwise, abort.
Procedure: • For i ∈ [m−1] and j ∈ [t], P_1 uniformly samples r_i and r_j and sends them to P_0. • P_0 computes ⟨q⟩_0 = Σ_{j∈[t]} r_j(⟨ρ_j⟩_0 − α_0 a_j) + Σ_{i∈[m−1]} r_i(⟨τ_i⟩_0 − ⟨ξ_i⟩_0), and sends ⟨q⟩_0 to P_1. • P_1 computes ⟨q⟩_1 = Σ_{j∈[t]} r_j(⟨ρ_j⟩_1 − α_1 a_j) + Σ_{i∈[m−1]} r_i(⟨τ_i⟩_1 − ⟨ξ_i⟩_1). • P_1 aborts if ⟨q⟩_0 + ⟨q⟩_1 ≠ 0 mod p. Otherwise, P_1 reconstructs M(x_0) and locally evaluates the fairness of the ML model based on Eqn.(2).
Figure 7: Consistency check protocol πOcheck\pi_{Ocheck}

Figure 7 presents the details of the consistency check, where we combine all the above checks into a single check using random scalars picked by P_1. The correctness of π_Ocheck can be easily deduced by inspecting the protocol. Specifically, by the correctness of π_OLin, we have ρ_j − ατ_j = (⟨ρ_j⟩_0 − α_0 a_j + ⟨ρ_j⟩_1 − α_1 a_j) = 0 for every linear layer {L_j}_{j∈[m]}. By the correctness of π_ONlin, we have τ_i − ξ_i = (⟨τ_i⟩_0 − ⟨ξ_i⟩_0) + (⟨τ_i⟩_1 − ⟨ξ_i⟩_1) = 0 for all nonlinear layers. Hence, we have ⟨q⟩_0 + ⟨q⟩_1 = Σ_{j∈[t]} r_j(ρ_j − ατ_j) + Σ_{i∈[m−1]} r_i(τ_i − ξ_i) = 0.
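The batched check can be traced with a few lines of plaintext arithmetic. The sketch below is our own illustration, taking α_0 = 0 and α_1 = α since P_1 holds the MAC key in full.

```python
import secrets

p = 2_147_483_647                       # illustrative prime field
rb = lambda: secrets.randbelow(p)

alpha = rb()
# Opened elements a_j with MAC shares rho_j such that rho_j0 + rho_j1 = alpha*a_j.
a = [rb() for _ in range(5)]
rho0 = [rb() for _ in a]
rho1 = [(alpha * aj - r) % p for aj, r in zip(a, rho0)]

# Two independent sharings tau_i, xi_i of the same alpha*v_i (Remark 4.5).
av = [rb() for _ in range(3)]           # the values alpha*v_i
tau0 = [rb() for _ in av]; tau1 = [(x - t) % p for x, t in zip(av, tau0)]
xi0 = [rb() for _ in av];  xi1 = [(x - t) % p for x, t in zip(av, xi0)]

# P_1 samples public random coefficients and sends them to P_0.
r_a, r_t = [rb() for _ in a], [rb() for _ in av]

q0 = (sum(r * rho for r, rho in zip(r_a, rho0))                # alpha_0 = 0
      + sum(r * (t - x) for r, t, x in zip(r_t, tau0, xi0))) % p
q1 = (sum(r * (rho - alpha * aj) for r, rho, aj in zip(r_a, rho1, a))
      + sum(r * (t - x) for r, t, x in zip(r_t, tau1, xi1))) % p
assert (q0 + q1) % p == 0   # honest run passes; tampering fails except w.p. 1/p
```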

Security. We demonstrate that the consistency check protocol π_Ocheck aborts with overwhelming probability if P_0 tampers with its input during execution. We provide the following theorem and prove it in Appendix E.

Theorem 4.5.

In real execution, if P0P_{0} tampers with its input, then P1P_{1} aborts with probability at least 11/p1-1/p.

5 Performance Evaluation

In this section, we conduct experiments to demonstrate the performance of VerifyML. Since there is no secure inference protocol specifically designed for the malicious-model-holder threat model, we choose the state-of-the-art generic MPC framework Overdrive [22] as the baseline. (Although [6] reports better performance than Overdrive, a direct comparison with [6] is difficult because its code is unavailable; we nevertheless clearly outperform [6] by constructing a more efficient method to generate triples. In addition, [6] requires fitting nonlinear functions such as ReLU with quadratic polynomials to facilitate computation, which is also contrary to the motivation of this paper.) Note that we also consider the client as a semi-honest entity when implementing Overdrive, so that Overdrive can likewise exploit the semi-honest client to avoid redundant verification and zero-knowledge proofs. In this way, we can "purely" discuss the technical advantages of VerifyML over Overdrive, excluding the inherent advantages VerifyML enjoys due to the weaker threat model. Specifically, we analyze the performance of VerifyML in the offline and online phases, respectively, discussing its superiority over Overdrive in both computation and communication cost when executing linear and nonlinear layers. In the end, we demonstrate the cost superiority of VerifyML over Overdrive on mainstream models including ResNet-18 and LeNet.

5.1 Implementation details

VerifyML is implemented in C++ and provides 128 bits of computational security and 40 bits of statistical security. The entire system operates over a 44-bit prime field. We utilize the SEAL homomorphic encryption library [37] for the offline generation of matrix-vector multiplication and convolution triples, where we set the maximum number of plaintext slots of a single ciphertext to 4096. The garbled circuits for the nonlinear layers are constructed with the EMP toolkit [40] (with the OT protocol that resists active adversaries). Zero-knowledge proofs of plaintext knowledge are implemented based on MUSE [26]. Our experiments are carried out in both the LAN and WAN settings. The LAN setting is implemented with two workstations in our lab: the client workstation has an AMD EPYC 7282 1.4GHz CPU with 32 threads on 16 cores and 32GB RAM, and the server workstation has an Intel(R) Xeon(R) E5-2697 v3 2.6GHz CPU with 28 threads on 14 cores and 64GB RAM. The WAN setting is based on a connection between a local PC and an Amazon AWS server with an average bandwidth of 963Mbps and a round-trip latency of around 14ms.

5.2 Performance of offline phase

5.2.1    Cost of generating matrix-vector multiplication triple

TABLE I: Cost of generating the matrix-vector multiplication triple
Dimension | Comm. cost (MB): Overdrive | Comm. cost (MB): VerifyML (reduction) | Time (s): Overdrive, LAN | Time (s): Overdrive, WAN | Time (s): VerifyML, LAN (speedup) | Time (s): VerifyML, WAN (speedup)
1×4096 | 27.1 | 2.1 (12.9×) | 2.3 | 17.7 | 0.9 (2.6×) | 12.4 (1.5×)
16×2048 | 216.4 | 17.6 (12.3×) | 15.3 | 26.2 | 7.6 (2.0×) | 14.1 (1.6×)
16×4096 | 432.8 | 34.5 (12.5×) | 30.6 | 43.4 | 15.1 (2.0×) | 26.9 (1.6×)
64×2048 | 865.6 | 68.3 (12.7×) | 60.9 | 72.4 | 29.2 (2.1×) | 40.7 (1.7×)
64×4096 | 1326.2 | 135.7 (9.8×) | 103.0 | 114.8 | 57.8 (1.8×) | 68.2 (1.6×)
128×4096 | 2247.4 | 271.9 (8.3×) | 187.1 | 199.1 | 117.3 (1.6×) | 128.4 (1.5×)

TABLE I compares the overhead of VerifyML and Overdrive for generating matrix-vector multiplication triples of different dimensions. VerifyML clearly outperforms Overdrive in both communication and computation: it achieves more than an 8× reduction in communication overhead and at least a 1.5× speedup in computation. This stems from Overdrive's disadvantage in constructing triples, namely that it constructs a triple for each single multiplication (or for the multiplication between a single matrix row and a vector). In addition, the generation process requires frequent interaction between the client and the model holder (for zero-knowledge proofs and for preventing breaches by either party), which inevitably incurs substantial computation and communication overhead. Our matrix-vector multiplication triples make the communication overhead independent of the number of multiplications and dependent only on the size of the inputs, which substantially reduces the amount of data exchanged between P_0 and P_1. In addition, we move the majority of the computation to P_1, avoiding the distributed decryption and frequent zero-knowledge proofs of fully malicious settings. Moreover, our matrix-vector multiplication does not involve any rotation operation. Together, these optimizations give VerifyML a satisfactory overhead for generating triples.

5.2.2    Cost of generating convolution triple

TABLE II: Cost of generating the convolution triple
Input | Kernel | Comm. cost (GB): Overdrive | Comm. cost (GB): VerifyML | Time (s): Overdrive, LAN | Time (s): Overdrive, WAN | Time (s): VerifyML, LAN (speedup) | Time (s): VerifyML, WAN (speedup)
16×16 @128 | 1×1 @128 | 17.1 | 2.1 | 1476.1 | 1494.6 | 924.7 (1.6×) | 938.4 (1.6×)
16×16 @256 | 1×1 @256 | 67.8 | 8.2 | 6059.3 | 6059.31 | 3568.8 (1.7×) | 3580.8 (1.7×)
16×16 @512 | 3×3 @128 | 467.5 | 56.8 | 40753.4 | 40767.1 | 25387.2 (1.6×) | 25401.5 (1.6×)
32×32 @2048 | 5×5 @512 | 83127.8 | 7324.3 | 7245056.2 | 7245068.8 | 4521023.3 (1.6×) | 4521165.6 (1.6×)

TABLE II compares the performance of VerifyML and Overdrive in generating convolution triples of different dimensions, where an input tensor of size u_w × u_h with c_i channels is denoted as u_w × u_h @c_i, and the size of the corresponding kernel is denoted as k_w × k_h @c_o. We observe that VerifyML incurs much lower computational and communication overhead than Overdrive. For instance, VerifyML gains a reduction of up to 9× in communication cost and a speedup of at least 1.6× in computation. This is due to the optimization method VerifyML customizes for generating convolution triples. While Overdrive constructs authenticated triples for single multiplication operations, VerifyML uses the homomorphic parallel matrix multiplication method of [17] as the underlying structure to construct matrix multiplication triples equivalent to convolution triples. Since a whole matrix is treated as one computational unit, the communication between the client and the model holder depends only on the size of the matrices and not on the number of scalar multiplications between them (that is, the communication complexity is reduced from O(d³) to O(d²) for the multiplication of two d×d matrices). In addition, the optimized parallel matrix multiplication reduces the number of homomorphic rotations from O(d²) to O(d). This enables VerifyML to show significant superiority in computing convolution triples.

5.3 Performance of online phase

In the online phase, VerifyML is required to perform operations at the linear and nonlinear layers alternately. Here we discuss the overhead performance of VerifyML compared to Overdrive separately.

5.3.1    Performance of executing linear layers

TABLE III: Comparison of the communication overhead for executing convolution in the online phase
Input | Kernel | Comm. cost (MB): Overdrive | Comm. cost (MB): VerifyML (reduction)
16×16 @128 | 1×1 @128 | 46.1 | 0.5 (85.3×)
16×16 @256 | 1×1 @256 | 184.5 | 1.4 (128.0×)
16×16 @512 | 3×3 @128 | 1271.7 | 15.7 (81.2×)
32×32 @2048 | 5×5 @512 | 226073.0 | 1459.8 (154.9×)

Since both VerifyML and Overdrive follow the same computational logic in the online linear layers, i.e., they use pre-generated authenticated triples to compute matrix-vector multiplications and convolutions, the two exhibit similar computational overhead. Therefore, we focus on the difference in communication overhead when executing convolutions. TABLE III depicts the communication overhead of VerifyML and Overdrive for computing convolutions of different dimensions. VerifyML shows clearly superior communication performance compared to Overdrive. This is mainly because Overdrive needs to open a fresh authenticated Beaver's multiplication triple for each multiplication operation, which makes the communication overhead of executing the entire linear layer proportional to the total number of multiplications involved. In contrast, VerifyML customizes matrix-vector multiplication and convolution triples, which makes the cost independent of the number of multiplication operations in the linear layer and substantially reduces the amount of data exchanged during execution.

5.3.2    Performance of executing nonlinear layers

Figure 6: Comparison of the overhead for executing nonlinear layers. (a) Running time improvement of VerifyML over Overdrive; the y-axis shows Overdrive time / VerifyML time. (b) Comparison of the communication overhead.

Figure 6 compares the cost of Overdrive and VerifyML. We observe that VerifyML outperforms Overdrive by 4-42× in runtime in the LAN setting and 3-16× in the WAN setting. For example, Overdrive takes 165.4s and 1283.5s to compute 2^15 ReLUs in the LAN and WAN settings, respectively, whereas VerifyML takes just 5.1s and 110.2s. For communication overhead, Overdrive requires 401KB of traffic to perform a single ReLU while we only need 8.33KB, at least a 48× improvement. This is mainly because our optimized GC substantially reduces the multiplication operations evaluated inside the GC. Moreover, Overdrive needs to verify the correctness of the model holder's input inside the GC, which is very expensive, whereas VerifyML achieves this with a lightweight consistency-verification method.

5.4 Performance of end-to-end secure inference

TABLE IV: Cost of end-to-end secure inference
LeNet:
Phase | Comm. cost (MB): Overdrive | Comm. cost (MB): VerifyML | Time (s): Overdrive, LAN | Time (s): Overdrive, WAN | Time (s): VerifyML, LAN (speedup) | Time (s): VerifyML, WAN (speedup)
Offline | 3427.8 | 209.6 | 235.5 | 246.8 | 92.9 (2.5×) | 104.6 (2.4×)
Online | 2543.1 | 54.0 | 32.8 | 254.9 | 1.0 (32.6×) | 21.9 (11.6×)
Total | 5970.9 | 263.6 | 268.3 | 501.7 | 93.9 (2.9×) | 126.5 (4.0×)

ResNet-18:
Phase | Comm. cost (MB): Overdrive | Comm. cost (MB): VerifyML | Time (s): Overdrive, LAN | Time (s): Overdrive, WAN | Time (s): VerifyML, LAN (speedup) | Time (s): VerifyML, WAN (speedup)
Offline | 2116018.6 | 257257.7 | 238774.2 | 238957.4 | 114003.1 (2.1×) | 114978.8 (2.1×)
Online | 19359.5 | 459.4 | 177.0 | 1373.7 | 5.5 (32.2×) | 117.9 (11.7×)
Total | 2135378.1 | 257717.1 | 238951.2 | 240331.1 | 114008.6 (2.1×) | 115096.7 (2.1×)

We compare the performance of VerifyML and Overdrive on real-world ML models. In our experiments, we choose ResNet-18 and LeNet, trained on the CelebA [28] and C-MNIST [2] datasets, respectively. Note that CelebA and C-MNIST are widely used to check how fair a given trained model is. TABLE IV shows the performance of VerifyML and Overdrive in terms of computation and communication overhead. Compared to Overdrive, VerifyML demonstrates an encouraging online runtime boost of 32.6× and 32.2× on LeNet and ResNet-18, respectively, and at least an order-of-magnitude reduction in communication cost. In the online phase, Overdrive takes 32.8s and 177s to process a single query on LeNet and ResNet-18, respectively, whereas VerifyML takes just 1s and 5.5s. Consistent with the previous analysis, this stems from the customized optimization mechanisms we designed for VerifyML.

5.5 Comparison with other works

Compared with DELPHI. We demonstrate that, for the execution of nonlinear layers, the communication overhead of VerifyML is even lower than that of the state-of-the-art scheme DELPHI [29] under the semi-honest threat model. Specifically, for the i-th nonlinear layer, DELPHI needs to calculate shares of f_i(v_i) inside the GC and share them with the two parties. This requires at least 3κ additional AND gates, which incurs at least 6κλ bits of extra communication, compared to computing only each bit of f_i(v_i) as in VerifyML. In our experiments, for κ = 44 and λ = 128, our method gives roughly 9× less communication for generating the shares of f_i(v_i) inside the GC; overall, DELPHI requires 32KB of traffic to perform a single ReLU while we only need 8.33KB.

Compared with MUSE and SIMC. We note that several works such as MUSE [26] and SIMC [5] have been proposed to address secure ML inference under the client-malicious threat model. Such a threat model assumes that the server (i.e., the model holder) is semi-honest while a malicious client may arbitrarily violate the protocol to obtain private information. These works may intuitively seem to transfer to our application scenario with appropriate modification. However, we argue that this is non-trivial. In more detail, in the client-malicious model, the client's inputs are encrypted and sent to the semi-honest model holder, which performs all linear operations to speed up the computation. Since the model holder holds the model parameters in plaintext, executing the linear layers only involves homomorphic operations between plaintexts and ciphertexts. This type of computation is compatible with mainstream homomorphic optimization methods including GALA [43] and GAZELLE [19]. However, in VerifyML, the linear-layer operations cannot be delegated to the model holder because it is considered malicious. One possible approach is to encrypt the model parameters and perform the linear layers via two-party interaction. This essentially performs homomorphic operations between ciphertexts, which is not compatible with the previous optimization strategies. Therefore, instead of simply fine-tuning MUSE [26] and SIMC [5], we must design new parallel homomorphic computation methods to fit this new threat model. On the other hand, we observe that the techniques for nonlinear operations in MUSE [26] and SIMC [5] can clearly be transferred to VerifyML. However, our method still outperforms SIMC (an upgraded version of MUSE), mainly because we encapsulate only the nonlinear part of ReLU into the GC, further reducing the number of multiplication operations. Experiments show that our method incurs about one third of SIMC's computation and communication overhead.

6 Conclusion

In this paper, we proposed VerifyML, the first secure inference framework to check the fairness degree of a given ML model. We designed a series of optimization methods to reduce the overhead of the offline stage. We also presented optimized 𝙶𝙲\mathtt{GC} to substantially speed up operations in the non-linear layers. In the future, we will focus on designing more efficient optimization strategies to further reduce the computation overhead of VerifyML, to make secure ML inference more suitable for a wider range of practical applications.

References

  • [1] Adam Lieberman (Finastra). How data scientists can create a more inclusive financial services landscape, 2022.
  • [2] Martin Arjovsky, Léon Bottou, Ishaan Gulrajani, and David Lopez-Paz. Invariant risk minimization. arXiv preprint arXiv:1907.02893, 2019.
  • [3] Rachel KE Bellamy, Kuntal Dey, Michael Hind, Samuel C Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovic, et al. Ai fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias. arXiv preprint arXiv:1810.01943, 2018.
  • [4] Sumon Biswas and Hridesh Rajan. Do the machine learning models on a crowd sourced platform exhibit bias? an empirical study on model fairness. In Proceedings of ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering (ESEC/FSE), pages 642–653, 2020.
  • [5] Nishanth Chandran, Divya Gupta, Sai Lakshmi Bhavana Obbattu, and Akash Shah. Simc: Ml inference secure against malicious clients at semi-honest cost. Cryptology ePrint Archive, 2021.
  • [6] Hao Chen, Miran Kim, Ilya Razenshteyn, Dragos Rotaru, Yongsoo Song, and Sameer Wagh. Maliciously secure matrix multiplication with applications to private deep learning. In International Conference on the Theory and Application of Cryptology and Information Security (ASIACRYPT), pages 31–59. Springer, 2020.
  • [7] Alexandra Chouldechova, Diana Benavides-Prado, Oleksandr Fialko, and Rhema Vaithianathan. A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions. In Conference on Fairness, Accountability and Transparency, pages 134–148. PMLR, 2018.
  • [8] Michele Ciampi, Vipul Goyal, and Rafail Ostrovsky. Threshold garbled circuits and ad hoc secure computation. In Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), pages 64–93. Springer, 2021.
  • [9] Ivan Damgård, Valerio Pastro, Nigel Smart, and Sarah Zakarias. Multiparty computation from somewhat homomorphic encryption. In Annual Cryptology Conference (CRYPTO), pages 643–662. Springer, 2012.
  • [10] Nico Döttling, Sanjam Garg, Mohammad Hajiabadi, Daniel Masny, and Daniel Wichs. Two-round oblivious transfer from cdh or lpn. Annual International Conference on the Theory and Applications of Cryptographic Techniques(EUROCRYPT), 12106:768, 2020.
  • [11] Uriel Feige, Amos Fiat, and Adi Shamir. Zero-knowledge proofs of identity. Journal of cryptology, 1(2):77–94, 1988.
  • [12] Tore Kasper Frederiksen, Thomas Pelle Jakobsen, Jesper Buus Nielsen, Peter Sebastian Nordholt, and Claudio Orlandi. Minilego: Efficient secure two-party computation from general assumptions. In Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), pages 537–556. Springer, 2013.
  • [13] Alex B Grilo, Huijia Lin, Fang Song, and Vinod Vaikuntanathan. Oblivious transfer is in miniqcrypt. In Annual International Conference on the Theory and Applications of Cryptographic Techniques(EUROCRYPT), pages 531–561. Springer, 2021.
  • [14] Shai Halevi and Victor Shoup. Algorithms in helib. In Annual Cryptology Conference, pages 554–571. Springer, 2014.
  • [15] Carmit Hazay, Emmanuela Orsini, Peter Scholl, and Eduardo Soria-Vazquez. Concretely efficient large-scale mpc with active security (or, tinykeys for tinyot). In International Conference on the Theory and Application of Cryptology and Information Security (ASIACRYPT), pages 86–117. Springer, 2018.
  • [16] Ayanna Howard and Jason Borenstein. The ugly truth about ourselves and our robot creations: the problem of bias and social inequity. Science and engineering ethics, 24(5):1521–1536, 2018.
  • [17] Xiaoqian Jiang, Miran Kim, Kristin Lauter, and Yongsoo Song. Secure outsourced matrix computation and application to neural networks. In Proceedings of the ACM SIGSAC conference on computer and communications security (CCS), pages 1209–1222, 2018.
  • [18] Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner (ProPublica). Machine bias, 2016.
  • [19] Chiraag Juvekar, Vinod Vaikuntanathan, and Anantha Chandrakasan. GAZELLE: A low latency framework for secure neural network inference. In USENIX Security Symposium (USENIX Security 18), pages 1651–1669, 2018.
  • [20] Marcel Keller. Mp-spdz: A versatile framework for multi-party computation. In Proceedings of ACM SIGSAC conference on computer and communications security (CCS), pages 1575–1590, 2020.
  • [21] Marcel Keller, Emmanuela Orsini, and Peter Scholl. Actively secure ot extension with optimal overhead. In Annual Cryptology Conference (CRYPTO), pages 724–741. Springer, 2015.
  • [22] Marcel Keller, Valerio Pastro, and Dragos Rotaru. Overdrive: making spdz great again. In Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), pages 158–189. Springer, 2018.
  • [23] Vladimir Kolesnikov, Payman Mohassel, and Mike Rosulek. Flexor: Flexible garbling for xor gates that beats free-xor. In Annual Cryptology Conference (CRYPTO), pages 440–457. Springer, 2014.
  • [24] Preethi Lahoti, Alex Beutel, Jilin Chen, Kang Lee, Flavien Prost, Nithum Thain, Xuezhi Wang, and Ed Chi. Fairness without demographics through adversarially reweighted learning. Advances in neural information processing systems (NeurIPS), 33:728–740, 2020.
  • [25] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. nature, 521(7553):436–444, 2015.
  • [26] Ryan Lehmkuhl, Pratyush Mishra, Akshayaram Srinivasan, and Raluca Ada Popa. Muse: Secure inference resilient to malicious clients. In USENIX Security Symposium (USENIX Security 21), pages 2201–2218, 2021.
  • [27] Yehuda Lindell. How to simulate it–a tutorial on the simulation proof technique. Tutorials on the Foundations of Cryptography, pages 277–346, 2017.
  • [28] Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In Proceedings of the IEEE international conference on computer vision, pages 3730–3738, 2015.
  • [29] Pratyush Mishra, Ryan Lehmkuhl, Akshayaram Srinivasan, Wenting Zheng, and Raluca Ada Popa. Delphi: A cryptographic inference service for neural networks. In USENIX Security Symposium, pages 2505–2522, 2020.
  • [30] Payman Mohassel and Yupeng Zhang. Secureml: A system for scalable privacy-preserving machine learning. In IEEE symposium on security and privacy (S&P), pages 19–38. IEEE, 2017.
  • [31] Debarghya Mukherjee, Mikhail Yurochkin, Moulinath Banerjee, and Yuekai Sun. Two simple ways to learn individual fairness metrics from data. In International Conference on Machine Learning (ICML), pages 7097–7107. PMLR, 2020.
  • [32] Luca Oneto and Silvia Chiappa. Fairness in machine learning. In Recent Trends in Learning From Data, pages 155–196. Springer, 2020.
  • [33] Osonde A Osoba and William Welser IV. An intelligence in our image: The risks of bias and errors in artificial intelligence. Rand Corporation, 2017.
  • [34] Flavien Prost, Pranjal Awasthi, Nick Blumm, Aditee Kumthekar, Trevor Potter, Li Wei, Xuezhi Wang, Ed H Chi, Jilin Chen, and Alex Beutel. Measuring model fairness under noisy covariates: A theoretical perspective. In Proceedings of AAAI/ACM Conference on AI, Ethics, and Society (AIES), pages 873–883, 2021.
  • [35] Mike Rosulek and Lawrence Roy. Three halves make a whole? beating the half-gates lower bound for garbled circuits. In Annual International Cryptology Conference (CRYPTO), pages 94–124. Springer, 2021.
  • [36] Pedro Saleiro, Benedict Kuester, Loren Hinkson, Jesse London, Abby Stevens, Ari Anisfeld, Kit T Rodolfa, and Rayid Ghani. Aequitas: A bias and fairness audit toolkit. arXiv preprint arXiv:1811.05577, 2018.
  • [37] Microsoft SEAL (release 4.0). https://github.com/Microsoft/SEAL, March 2022. Microsoft Research, Redmond, WA.
  • [38] Shahar Segal, Yossi Adi, Benny Pinkas, Carsten Baum, Chaya Ganesh, and Joseph Keshet. Fairness in the eyes of the data: Certifying machine-learning models. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (AIES), pages 926–935, 2021.
  • [39] Nigel P Smart and Frederik Vercauteren. Fully homomorphic simd operations. Designs, codes and cryptography, 71(1):57–81, 2014.
  • [40] Xiao Wang, Alex J Malozemoff, and Jonathan Katz. Emp-toolkit: Efficient multiparty computation toolkit. https://github.com/emp-toolkit, 2016.
  • [41] Paul, Weiss, Rifkind, Wharton & Garrison LLP. Breaking new ground: CFPB will pursue discrimination as an "unfair" practice across the range of consumer financial services, 2022.
  • [42] Samee Zahur, Mike Rosulek, and David Evans. Two halves make a whole. In Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), pages 220–250. Springer, 2015.
  • [43] Qiao Zhang, Chunsheng Xin, and Hongyi Wu. Gala: Greedy computation for linear algebra in privacy-preserved neural networks. In Proceedings of the Network and Distributed System Security (NDSS), 2021.

Appendix

Appendix A Threat Model

We formalize the threat model of VerifyML in the simulation paradigm [27]. We define two interactions to capture security: a real interaction between P_0 and P_1 in the presence of an adversary A and an environment Z, and an ideal interaction where the parties send their respective inputs to a trusted entity that computes the functionality faithfully. Security requires that for any adversary A in the real interaction, there exists a simulator S in the ideal interaction such that no environment Z can distinguish the real interaction from the ideal one. Specifically, let f = (f_0, f_1) be the two-party functionality such that P_0 and P_1 invoke f on inputs a and b to obtain f_0(a,b) and f_1(a,b), respectively. We say a protocol π securely implements f if it satisfies the following properties.

  • Correctness: If P0P_{0} and P1P_{1} are both honest, then P0P_{0} gets f0(a,b)f_{0}(a,b) and P1P_{1} gets f1(a,b)f_{1}(a,b) from the execution of π\pi on the inputs aa and bb, respectively.

  • Semi-honest Client Security: For a semi-honest adversary 𝐀\mathbf{A} that compromises P1P_{1}, there exists a simulator 𝐒\mathbf{S} such that for any input (a,b)(a,b), we have

View_A^π(a, b) ≈ S(b, f_1(a, b))

    where View𝐀π(a,b)View_{\mathbf{A}}^{\pi}(a,b) represents the view of 𝐀\mathbf{A} during the execution of π\pi, and aa and bb are the inputs of P0P_{0} and P1P_{1}, respectively. 𝐒(b,f1(a,b))\mathbf{S}(b,f_{1}(a,b)) represents the view simulated by 𝐒\mathbf{S} when it is given access to bb and f1(a,b)f_{1}(a,b). \approx indicates computational indistinguishability of two distributions View𝐀π(a,b)View_{\mathbf{A}}^{\pi}(a,b) and 𝐒(b,f1(a,b))\mathbf{S}(b,f_{1}(a,b)).

  • Malicious Model Holder Security: For the malicious adversary 𝐀\mathbf{A} that compromises P0P_{0}, there exists a simulator 𝐒\mathbf{S}, such that for any input bb from P1P_{1}, we have

(Out_{P_1}, View_A^π(b, ·)) ≈ (Ôut, S^{f(b,·)})

where View_A^π(b, ·) denotes A's view during the execution of π with P_1's input b, and Out_{P_1} denotes the output of P_1 in the real protocol execution. Similarly, Ôut and S^{f(b,·)} represent the output of P_1 and the simulated view in the ideal interaction.

Appendix B Proof of Theorem 1

Proof.
Input: P0P_{0} holds 𝐗0\langle\mathbf{X}\rangle_{0} uniformly chosen from 𝔽pd1×d2\mathbb{F}_{p}^{d_{1}\times d_{2}} and 𝐲0\langle\mathbf{y}\rangle_{0} uniformly chosen from 𝔽pd2\mathbb{F}_{p}^{d_{2}}. P1P_{1} holds 𝐗1\langle\mathbf{X}\rangle_{1} uniformly chosen from 𝔽pd1×d2\mathbb{F}_{p}^{d_{1}\times d_{2}}, and 𝐲1\langle\mathbf{y}\rangle_{1} uniformly chosen from 𝔽pd2\mathbb{F}_{p}^{d_{2}} and a MAC key α\alpha uniformly chosen from 𝔽p\mathbb{F}_{p}
Output: PbP_{b} obtains {[[𝐗]]b,[[𝐲]]b,[[𝐳]]b}b{0,1}\{[\![\mathbf{X}]\!]_{b},[\![\mathbf{y}]\!]_{b},[\![\mathbf{z}]\!]_{b}\}_{b\in\{0,1\}}, where 𝐗𝐲=𝐳\mathbf{X}\ast\mathbf{y}=\mathbf{z}.
Figure 8: Functionality of Mtriple\mathcal{F}_{Mtriple}

Let Mtriple\mathcal{F}_{Mtriple} shown in Figure 8 be the functionality of generating matrix-vector multiplication triple. We first prove security for semi-honest clients and then demonstrate security against malicious model holders.

Semi-honest client security. The simulator Sim_c samples (pk, sk) ← KeyGen(1^λ). The simulator and the semi-honest client run a secure two-party protocol to generate the public and secret keys for homomorphic encryption; when the simulator accesses the ideal functionality, it provides pk as the client's output. In addition, Sim_c sends Enc_pk(0) to the client along with simulated zero-knowledge proofs of well-formedness of the ciphertexts. We now show the indistinguishability between the real and simulated views via the following hybrid argument.

  • 𝙷𝚢𝚋𝟷\mathtt{Hyb_{1}}: This corresponds to the real execution of the protocol.

  • Hyb_2: The simulator Sim_c runs the two-party computation protocol with the semi-honest client to generate the public and secret keys for homomorphic encryption. When the simulator accesses the ideal functionality, we sample (pk, sk) ← KeyGen(1^λ) and send pk to the semi-honest client. This hybrid is computationally indistinguishable from Hyb_1.

  • Hyb_3: In this hybrid, instead of sending the encryptions c_1 ← Enc(pk, ⟨X⟩_0) and c_2 ← Enc(pk, ⟨Y⟩_0) to P_1, Sim_c sends ciphertexts of all 0s (i.e., Enc_pk(0)) to the client, together with simulated zero-knowledge (ZK) proofs of plaintext knowledge. The semantic security of FHE ensures that an adversary cannot distinguish the ciphertexts of any two plaintexts, and the zero-knowledge property guarantees that the simulated proofs are indistinguishable from real ones. Therefore, this hybrid is indistinguishable from the previous one.

Malicious model holder security. The simulator $\mathtt{Sim_m}$ samples $(pk,sk)\leftarrow\mathtt{KeyGen}(1^{\lambda})$. The simulator and the malicious model holder run a secure two-party protocol to generate the public and secret keys for homomorphic encryption; when the simulator accesses the ideal functionality, it provides $(pk,sk)$ as outputs. Once $P_0$ sends $c_1\leftarrow\mathtt{Enc}(pk,\langle\mathbf{X}\rangle_0)$ and $c_2\leftarrow\mathtt{Enc}(pk,\langle\mathbf{Y}\rangle_0)$, $\mathtt{Sim_m}$ verifies the validity of the ciphertexts. If the verification passes, $\mathtt{Sim_m}$ extracts $\langle\mathbf{X}\rangle_0$ and $\langle\mathbf{Y}\rangle_0$, together with the randomness used to generate these ciphertexts, from the zero-knowledge proofs of plaintext knowledge. Then, $\mathtt{Sim_m}$ samples $\langle\mathbf{X}\rangle_1$ and $\langle\mathbf{Y}\rangle_1$, and queries the ideal functionality on the inputs $\langle\mathbf{X}\rangle_0$, $\langle\mathbf{X}\rangle_1$, $\langle\mathbf{Y}\rangle_0$ and $\langle\mathbf{Y}\rangle_1$ to obtain $(\langle\alpha\mathbf{X}\rangle_0,\langle\alpha\mathbf{Y}\rangle_0,\langle\alpha\mathbf{Z}\rangle_0,\langle\mathbf{Z}\rangle_0)$. Finally, $\mathtt{Sim_m}$ uses these outputs and the extracted randomness to construct the four simulated ciphertexts and sends them to the model holder.

  • $\mathtt{Hyb_1}$: This corresponds to the real execution of the protocol.

  • $\mathtt{Hyb_2}$: The simulator $\mathtt{Sim_m}$ runs the two-party computation protocol with the malicious model holder to generate the public and secret keys for homomorphic encryption. When the simulator accesses the ideal functionality, we sample $(pk,sk)\leftarrow\mathtt{KeyGen}(1^{\lambda})$ and send them to the malicious model holder. This hybrid is computationally indistinguishable from $\mathtt{Hyb_1}$.

  • $\mathtt{Hyb_3}$: In this hybrid, $\mathtt{Sim_m}$ checks the validity of the ciphertexts from the model holder. If the zero-knowledge proofs are valid, $\mathtt{Sim_m}$ extracts $\langle\mathbf{X}\rangle_0$ and $\langle\mathbf{Y}\rangle_0$, together with the randomness used to generate these ciphertexts, from the proofs of plaintext knowledge. The properties of zero-knowledge proofs ensure that this hybrid is indistinguishable from the previous one.

  • $\mathtt{Hyb_4}$: $\mathtt{Sim_m}$ exploits the function privacy of FHE to generate $c_3=\mathtt{Enc}_{pk}(\alpha(\langle\mathbf{X}\rangle_1+\langle\mathbf{X}\rangle_0)-\langle\alpha\mathbf{X}\rangle_1)$, $c_4=\mathtt{Enc}_{pk}(\alpha(\langle\mathbf{Y}\rangle_1+\langle\mathbf{Y}\rangle_0)-\langle\alpha\mathbf{Y}\rangle_1)$, $c_5=\mathtt{Enc}_{pk}(\alpha(\mathbf{X}\ast\mathbf{Y})-\langle\alpha\mathbf{Z}\rangle_1)$, and $c_6=\mathtt{Enc}_{pk}((\mathbf{X}\ast\mathbf{Y})-\langle\mathbf{Z}\rangle_1)$. This hybrid is computationally indistinguishable from the previous hybrid by the function privacy of the FHE scheme. Note that the view of the model holder in $\mathtt{Hyb_4}$ is identical to the view generated by $\mathtt{Sim_m}$. ∎
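For completeness, a worked check (ours, using only the share reconstruction $\mathbf{X}=\langle\mathbf{X}\rangle_0+\langle\mathbf{X}\rangle_1$) of why these simulated ciphertexts decrypt to the correct authenticated shares:

$$\mathtt{Dec}_{sk}(c_3)=\alpha(\langle\mathbf{X}\rangle_1+\langle\mathbf{X}\rangle_0)-\langle\alpha\mathbf{X}\rangle_1=\alpha\mathbf{X}-\langle\alpha\mathbf{X}\rangle_1=\langle\alpha\mathbf{X}\rangle_0,$$
$$\mathtt{Dec}_{sk}(c_5)=\alpha(\mathbf{X}\ast\mathbf{Y})-\langle\alpha\mathbf{Z}\rangle_1=\alpha\mathbf{Z}-\langle\alpha\mathbf{Z}\rangle_1=\langle\alpha\mathbf{Z}\rangle_0,$$

and analogously for $c_4$ and $c_6$, so $P_0$ receives exactly its shares of the authenticated triple.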

Appendix C Conversion between convolution and matrix multiplication

Figure 9: Conversion between convolution and matrix multiplication

Figure 9 provides an example of converting a given convolution into the corresponding matrix multiplication. As shown in Figure 9, given an input tensor $\mathbf{X}$ of size $5\times 5$ with $3$ channels, and $3$ kernels of size $(2+1)\times(2+1)\times 3$ denoted as a tensor $\mathbf{Y}$, the convolution between $\mathbf{X}$ and $\mathbf{Y}$ is converted into an equivalent matrix multiplication of $\mathbf{X^{\prime}}$ and $\mathbf{Y^{\prime}}$, where the amount of zero-padding is $0$ and the stride is $s=1$. Specifically, we construct a matrix $\mathbf{X^{\prime}}$ of dimension $9\times 27$, where $\mathbf{X^{\prime}}_{(i,j),(\Delta_i,\Delta_j,k)}=\mathbf{X}_{i+\Delta_i,j+\Delta_j,k}$. Similarly, we construct a matrix $\mathbf{Y^{\prime}}$ of dimension $27\times 3$ such that $\mathbf{Y^{\prime}}_{(\Delta_i,\Delta_j,k),k^{\prime}}=\mathbf{Y}_{\Delta_i,\Delta_j,k,k^{\prime}}$. Then, the original convolution is transformed into $\mathbf{Z^{\prime}}=\mathbf{X^{\prime}}\ast\mathbf{Y^{\prime}}$, where $\mathbf{Z^{\prime}}_{(i,j),k^{\prime}}=\mathbf{Z}_{i,j,k^{\prime}}$.
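The following is a minimal im2col-style sketch of this conversion in Python with numpy; the function name and memory layout are illustrative assumptions rather than VerifyML's implementation.

    import numpy as np

    def conv_as_matmul(X, Y, stride=1):
        """Convert a convolution to a matrix product via im2col.
        X: (H, W, C) input tensor; Y: (kh, kw, C, K) kernel tensor.
        Returns Z of shape (H_out, W_out, K)."""
        H, W, C = X.shape
        kh, kw, _, K = Y.shape
        H_out = (H - kh) // stride + 1
        W_out = (W - kw) // stride + 1
        # Row (i, j) of X_mat flattens the receptive field at output position (i, j).
        X_mat = np.stack([
            X[i*stride:i*stride+kh, j*stride:j*stride+kw, :].ravel()
            for i in range(H_out) for j in range(W_out)
        ])                                 # shape (H_out*W_out, kh*kw*C)
        Y_mat = Y.reshape(kh * kw * C, K)  # each column is one flattened kernel
        return (X_mat @ Y_mat).reshape(H_out, W_out, K)

    # With the paper's example, X of shape (5, 5, 3) and Y of shape (3, 3, 3, 3)
    # yield X' of shape 9x27 and Y' of shape 27x3.
    rng = np.random.default_rng(1)
    Z = conv_as_matmul(rng.standard_normal((5, 5, 3)), rng.standard_normal((3, 3, 3, 3)))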

Appendix D Proof of Theorem 4

Proof.

Semi-honest client security. The security of the protocol $\pi_{ONlin}$ against the semi-honest client $P_1$ is evident from the execution of the protocol: $P_1$ obtains no output in OT$_{\lambda}^{\kappa}$ and receives no information from $P_0$ in the subsequent execution. Here we focus on the security analysis of $\pi_{ONlin}$ against the malicious model holder $P_0$.

Malicious model holder security. We first define the functionality of the protocol $\pi_{ONlin}$, denoted as $\mathcal{F}_{ONlin}$, as shown in Figure 10. We use $\mathtt{Real}$ to refer to the view of the real interaction between $P_1$ and the adversary $\mathcal{A}$ controlling $P_0$, and then demonstrate that $\mathtt{Real}$ is indistinguishable from the view simulated by the interaction between the simulator $\mathtt{Sim_m}$ and $\mathcal{A}$, through standard hybrid arguments. In the following we define three hybrid executions $\mathtt{Hyb_1}$, $\mathtt{Hyb_2}$ and $\mathtt{Hyb_3}$, and prove that $\pi_{ONlin}$ is secure against the malicious model holder $P_0$ by proving indistinguishability among these hybrid executions.

Function $f:\mathbb{F}_p\rightarrow\mathbb{F}_p$.
Input: $P_1$ holds $\langle\mathbf{v}_i\rangle_1\in\mathbb{F}_p$ and a MAC key $\alpha$ uniformly chosen from $\mathbb{F}_p$. $P_0$ holds $\langle\mathbf{v}_i\rangle_0\in\mathbb{F}_p$.
Output: $P_b$ obtains $\{(\langle\alpha\mathbf{v}_i\rangle_b,\langle f(\mathbf{v}_i)\rangle_b,\langle\alpha f(\mathbf{v}_i)\rangle_b)\}$ for $b\in\{0,1\}$.
Figure 10: Functionality of the nonlinear layer $\mathcal{F}_{ONlin}$
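As a plaintext reference point, here is a minimal Python sketch of $\mathcal{F}_{ONlin}$ instantiated with $f=\mathrm{ReLU}$; the choice of $f$, the modulus, and the names are illustrative assumptions.

    import random

    p = 2**16 + 1  # illustrative prime

    def to_signed(x):
        """Interpret a field element in [0, p) as a signed value."""
        return x if x <= p // 2 else x - p

    def share(v):
        r = random.randrange(p)
        return r, (v - r) % p

    def f_onlin(v0, v1, alpha):
        """Ideal F_ONlin: inputs are P0's share v0, and P1's share v1 plus the
        MAC key alpha; outputs are each party's shares of (alpha*v, f(v), alpha*f(v))."""
        v = (v0 + v1) % p
        fv = v if to_signed(v) > 0 else 0  # f = ReLU over F_p
        pairs = [share((alpha * v) % p), share(fv), share((alpha * fv) % p)]
        return tuple(x[0] for x in pairs), tuple(x[1] for x in pairs)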

$\mathtt{Hyb_1}$: This hybrid execution is identical to $\mathtt{Real}$ except in the authentication phase. To be precise, in the authentication phase, the simulator $\mathtt{Sim_m}$ uses labels $\hat{\mathtt{lab}}_{i,j}^{out}$ (described below) to replace the labels $\mathtt{lab}_{i,j}^{out}$ used in $\mathtt{Real}$. Note that in this hybrid the simulator $\mathtt{Sim_m}$ can access $P_1$'s inputs $\langle\mathbf{v}_i\rangle_1$ and $\alpha$, where $\langle\mathbf{v}_i\rangle_0+\langle\mathbf{v}_i\rangle_1=\mathbf{v}_i$. Let $\delta=(\mathbf{v}_i||sign(\mathbf{v}_i))$. Then, for $i\in[2\kappa]$, we set $\hat{\mathtt{lab}}_{i,j}^{out}=\mathtt{lab}_{i,j}^{out}$ if $j=\delta[i]$; otherwise, the “other” label $\hat{\mathtt{lab}}_{i,1-\delta[i]}^{out}$ is set to a value chosen uniformly at random from $\{0,1\}^{\lambda}$, subject to its first bit being $1-\varsigma_{i,\delta[i]}$ (a sketch of this label convention follows the list below). We provide the formal description of $\mathtt{Hyb_1}$ as follows; the indistinguishability between the view of $\mathcal{A}$ in $\mathtt{Real}$ and in $\mathtt{Hyb_1}$ follows directly from the authenticity of the garbled circuit.

  • 1. $\mathtt{Sim_m}$ receives $\langle\mathbf{v}_i\rangle_0$ from $\mathcal{A}$ as the input of OT$_{\lambda}^{\kappa}$.

  • 2. Garbled Circuit Phase:

    • For $booln^f$, $\mathtt{Sim_m}$ first computes $\mathtt{Garble}(1^{\lambda},booln^f)\rightarrow(\mathtt{GC},\{\{\mathtt{lab}_{i,j}^{in}\},\{\mathtt{lab}_{i,j}^{out}\}\}_{j\in\{0,1\}})$ for each $i\in[2\kappa]$, and then for $i\in\{\kappa+1,\cdots,2\kappa\}$ sends $\{\tilde{\mathtt{lab}}_j^{in}=\mathtt{lab}_{j,\langle\mathbf{v}_i\rangle_0[j]}^{in}\}$ to $\mathcal{A}$ as the output of OT$_{\lambda}^{\kappa}$. In addition, $\mathtt{Sim_m}$ sends the garbled circuit $\mathtt{GC}$ and its garbled inputs $\{\tilde{\mathtt{lab}}_j^{in}=\mathtt{lab}_{j,\langle\mathbf{v}_i\rangle_1[j]}^{in}\}_{j\in[\kappa]}$ to $\mathcal{A}$.

  • 3. Authentication Phase 1:

    • $\mathtt{Sim_m}$ sets $\delta=(\mathbf{v}_i||sign(\mathbf{v}_i))$.

    • For $i\in[2\kappa]$, $\mathtt{Sim_m}$ sets $\hat{\mathtt{lab}}_{i,j}^{out}=\mathtt{lab}_{i,\delta[i]}^{out}$ if $j=\delta[i]$.

    • For $i\in[2\kappa]$, if $j=1-\delta[i]$, $\mathtt{Sim_m}$ sets $\hat{\mathtt{lab}}_{i,j}^{out}$ to a value chosen uniformly at random from $\{0,1\}^{\lambda}$, where the first bit of $\hat{\mathtt{lab}}_{i,1-\delta[i]}^{out}$ is $1-\varsigma_{i,\delta[i]}$.

    • $\mathtt{Sim_m}$ computes and sends $\{ct_{i,j},\hat{ct}_{i,j}\}_{i\in[\kappa],j\in\{0,1\}}$ to $\mathcal{A}$ using $\{\hat{\mathtt{lab}}_{i,j}^{out}\}_{i\in[2\kappa],j\in\{0,1\}}$. This process is the same as in the $\mathtt{Real}$ execution, which uses $\{\mathtt{lab}_{i,j}^{out}\}_{i\in[2\kappa],j\in\{0,1\}}$.

  • 4. Local Computation Phase: The execution of this phase is indistinguishable from $\mathtt{Real}$ since no information needs to be exchanged between $\mathtt{Sim_m}$ and $\mathcal{A}$.

  • 5. Authentication Phase 2: The execution is identical to $\mathtt{Real}$.
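The first-bit convention used above is the standard point-and-permute trick for garbled circuits; the following Python sketch illustrates it under that assumption (the names and label length are ours, not the paper's):

    import secrets

    LAMBDA_BITS = 128  # illustrative label length

    def output_wire_labels():
        """Sample the two output labels of a wire so that the first bit of
        lab_j carries the masked signal bit varsigma_{i,j}."""
        mask = secrets.randbits(1)            # random permute bit for this wire
        labels = {}
        for truth_value in (0, 1):
            body = secrets.randbits(LAMBDA_BITS - 1)
            signal = truth_value ^ mask       # the signal bit varsigma
            labels[truth_value] = (signal << (LAMBDA_BITS - 1)) | body
        return labels, mask

The evaluator learns only the signal bit, which is independent of the truth value; this is why the simulator can safely set the first bit of the sampled “other” label to $1-\varsigma_{i,\delta[i]}$.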

$\mathtt{Hyb_2}$: We make four changes to $\mathtt{Hyb_1}$ to obtain $\mathtt{Hyb_2}$, and argue that $\mathtt{Hyb_2}$ is indistinguishable from $\mathtt{Hyb_1}$ in the adversary's view. To be precise, let $\mathtt{GCEval}(\mathtt{GC},\{\tilde{\mathtt{lab}}_i^{in}\}_{i\in[2\kappa]})\rightarrow\{(\tilde{\varsigma}_i||\tilde{\vartheta}_i)_{i\in[2\kappa]}=\{\tilde{\mathtt{lab}}_i^{out}\}_{i\in[2\kappa]}\}$. First, we have $\{\tilde{\mathtt{lab}}_i^{out}=\mathtt{lab}_{i,\delta[i]}^{out}\}_{i\in[2\kappa]}$ by the correctness of garbled circuits. Second, the ciphertexts $\{ct_{i,1-\tilde{\varsigma}_i},\hat{ct}_{i,1-\tilde{\varsigma}_{i+\kappa}}\}_{i\in[\kappa]}$ are computed from the “other” set of output labels, which were picked uniformly in $\mathtt{Hyb_1}$; based on this observation, $\mathtt{Sim_m}$ can directly sample these ciphertexts uniformly at random. Third, in the real execution, for every $i\in[\kappa]$ and $j\in\{0,1\}$, $P_1$ sends $ct_{i,\varsigma_{i,j}}$ and $\hat{ct}_{i,\varsigma_{i+\kappa,j}}$ to $P_0$, and $P_0$ then computes $c_i$, $d_i$ and $e_i$ from them. To simulate this, $\mathtt{Sim_m}$ only needs to uniformly select random values $c_i$, $d_i$ and $e_i$ satisfying $\langle\alpha\mathbf{v}_i\rangle_0=-\sum_{j\in[\kappa]}c_j2^{j-1}$, $\langle sign(\mathbf{v}_i)\rangle_0=-\sum_{j\in[\kappa]}d_j2^{j-1}$ and $\langle\alpha sign(\mathbf{v}_i)\rangle_0=-\sum_{j\in[\kappa]}e_j2^{j-1}$ (a small sketch of this constrained sampling follows the description below). Finally, since $\langle\alpha\mathbf{v}_i\rangle_0$, $\langle sign(\mathbf{v}_i)\rangle_0$ and $\langle\alpha sign(\mathbf{v}_i)\rangle_0$ are part of the outputs of the functionality $\mathcal{F}_{ONlin}$, $\mathtt{Sim_m}$ can obtain them from $\mathcal{F}_{ONlin}$. In summary, with the above changes, $\mathtt{Sim_m}$ no longer needs $P_1$'s MAC key $\alpha$. We provide the formal description of $\mathtt{Hyb_2}$ as follows.

  • 1. $\mathtt{Sim_m}$ receives $\langle\mathbf{v}_i\rangle_0$ from $\mathcal{A}$ as the input of OT$_{\lambda}^{\kappa}$.

  • 2. Garbled Circuit Phase: Same as $\mathtt{Hyb_1}$.

  • 3. Authentication Phase 1:

    • $\mathtt{Sim_m}$ runs $\mathtt{GCEval}(\mathtt{GC},\{\tilde{\mathtt{lab}}_i^{in}\}_{i\in[2\kappa]})\rightarrow\{(\tilde{\varsigma}_i||\tilde{\vartheta}_i)_{i\in[2\kappa]}=\{\tilde{\mathtt{lab}}_i^{out}\}_{i\in[2\kappa]}\}$.

    • $\mathtt{Sim_m}$ learns $\langle\alpha\mathbf{v}_i\rangle_0$, $\langle sign(\mathbf{v}_i)\rangle_0$ and $\langle\alpha sign(\mathbf{v}_i)\rangle_0$ by sending $\langle\mathbf{v}_i\rangle_0$ to $\mathcal{F}_{ONlin}$.

    • For $j\in[\kappa]$, $\mathtt{Sim_m}$ uniformly selects random values $c_j$, $d_j$ and $e_j\in\mathbb{F}_p$ satisfying $\langle\alpha\mathbf{v}_i\rangle_0=-\sum_{j\in[\kappa]}c_j2^{j-1}$, $\langle sign(\mathbf{v}_i)\rangle_0=-\sum_{j\in[\kappa]}d_j2^{j-1}$ and $\langle\alpha sign(\mathbf{v}_i)\rangle_0=-\sum_{j\in[\kappa]}e_j2^{j-1}$.

    • For every $i\in[\kappa]$, $\mathtt{Sim_m}$ computes $ct_{i,\tilde{\varsigma}_i}=c_i\oplus\mathbf{Trun}_{\kappa}(\tilde{\vartheta}_i)$ and $\hat{ct}_{i,\tilde{\varsigma}_{i+\kappa}}=(d_i||e_i)\oplus\mathbf{Trun}_{2\kappa}(\tilde{\vartheta}_{i+\kappa})$. The remaining ciphertexts $\{ct_{i,1-\tilde{\varsigma}_i},\hat{ct}_{i,1-\tilde{\varsigma}_{i+\kappa}}\}_{i\in[\kappa]}$ are sampled uniformly at random.

    • $\mathtt{Sim_m}$ sends $\{ct_{i,j},\hat{ct}_{i,j}\}_{i\in[\kappa],j\in\{0,1\}}$ to $\mathcal{A}$.

  • 4. Local Computation Phase: The execution of this phase is indistinguishable from $\mathtt{Real}$ since no information needs to be exchanged between $\mathtt{Sim_m}$ and $\mathcal{A}$.

  • 5. Authentication Phase 2: The execution is identical to $\mathtt{Real}$.
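The constrained sampling step referenced above admits a simple realization: sample all but one coefficient freely and solve for the last one. A minimal Python sketch under illustrative parameters (the modulus and names are our assumptions):

    import random

    p = 2**16 + 1  # illustrative prime

    def sample_with_weighted_sum(target, kappa):
        """Sample c_1..c_kappa uniformly in F_p subject to
        -(sum_{j in [kappa]} c_j * 2^(j-1)) = target (mod p)."""
        cs = [random.randrange(p) for _ in range(kappa - 1)]
        partial = sum(c * pow(2, j, p) for j, c in enumerate(cs)) % p
        # Solve -(partial + c_kappa * 2^(kappa-1)) = target (mod p) for c_kappa.
        c_last = (-target - partial) * pow(pow(2, kappa - 1, p), -1, p) % p
        cs.append(c_last)
        assert (-sum(c * pow(2, j, p) for j, c in enumerate(cs))) % p == target % p
        return cs

    cs = sample_with_weighted_sum(12345, 16)  # e.g., the c_j for one value

Because the last coefficient is fully determined by the others, the output is uniform over the solution set, exactly as the hybrid requires.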

$\mathtt{Hyb_3}$: In this hybrid we remove $\mathtt{Sim_m}$'s dependence on $P_1$'s input $\langle\mathbf{v}_i\rangle_1$. The indistinguishability between $\mathtt{Hyb_3}$ and $\mathtt{Hyb_2}$ stems from the security of the garbled circuit. We provide the formal description of $\mathtt{Hyb_3}$ below.

  • 1. $\mathtt{Sim_m}$ receives $\langle\mathbf{v}_i\rangle_0$ from $\mathcal{A}$ as the input of OT$_{\lambda}^{\kappa}$.

  • 2. Garbled Circuit Phase:

    • $\mathtt{Sim_m}$ samples $\mathtt{Garble}(1^{\lambda},booln^f)\rightarrow(\tilde{\mathtt{GC}},\{\hat{\mathtt{lab}}_i^{in}\}_{i\in\{\kappa+1,\cdots,2\kappa\}})$ and sends $\{\hat{\mathtt{lab}}_i^{in}\}_{i\in\{\kappa+1,\cdots,2\kappa\}}$ to $\mathcal{A}$ as the output of OT$_{\lambda}^{\kappa}$. $\mathtt{Sim_m}$ also sends $\tilde{\mathtt{GC}}$ and $\{\hat{\mathtt{lab}}_i^{in}\}_{i\in[\kappa]}$ to $\mathcal{A}$.

  • 3. Authentication Phase 1: Same as $\mathtt{Hyb_2}$.

  • 4. Local Computation Phase: The execution of this phase is indistinguishable from $\mathtt{Real}$ since no information needs to be exchanged between $\mathtt{Sim_m}$ and $\mathcal{A}$.

  • 5. Authentication Phase 2: Same as $\mathtt{Hyb_2}$, where $\mathtt{Sim_m}$ uses $(\langle\mathbf{v}_i\rangle_0,\tilde{\mathtt{GC}},\{\hat{\mathtt{lab}}_i^{in}\}_{i\in[2\kappa]})$ to process this phase for $\mathcal{A}$. ∎

Appendix E Proof of Theorem 5

Proof.

Assume that $P_0$ tampers with any of the inputs it holds during the execution. Then $q$ can be expressed as follows:

$$q=\Delta+\sum_{j\in[t]}\mathbf{r}_j(\rho_j-\alpha\tau_j)+\sum_{i\in[m-1]}\mathbf{r}_i(\tau_i-\xi_i)$$

where $\Delta$ refers to the offset introduced by $P_0$'s violation of the protocol. The above formula can be viewed as a degree-1 polynomial $Q(\alpha)$ in the variable $\alpha$. $Q(\alpha)$ is a non-zero polynomial whenever $P_0$ introduces errors, and a non-zero degree-1 polynomial has at most one root. Hence, over the uniformly random choice of $\alpha$, the probability that $Q(\alpha)=0$ is at most $1/p$. Therefore, the probability that $P_1$ aborts is at least $1-1/p$ when $P_0$ cheats. ∎
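As a numerical illustration of this $1/p$ bound, the following Python sketch (ours) uses a deliberately tiny prime so the cheating success rate is visible empirically; in the paper $p$ is large, making the bound negligible.

    import random

    p = 101  # deliberately tiny prime; a real field makes 1/p negligible

    def cheat_pass_rate(e, trials=200_000):
        """P0 substitutes tau + e for an authenticated value tau. P1's check
        evaluates Q(alpha) = rho - alpha*(tau + e) = -alpha*e, which vanishes
        only at the single root alpha = 0 when e != 0."""
        passes = 0
        for _ in range(trials):
            alpha = random.randrange(p)        # P1's secret MAC key
            tau = random.randrange(p)          # the authenticated value
            rho = (alpha * tau) % p            # its correct MAC
            passes += (rho - alpha * ((tau + e) % p)) % p == 0
        return passes / trials

    print(cheat_pass_rate(0))  # honest: 1.0
    print(cheat_pass_rate(5))  # cheating: ~1/p, i.e., about 0.0099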