
Appendices

1 Dataset Statistics

Here we list the statistics of the seven public datasets in Table 1.

Dataset Nodes Edges Classes Features
Cora 2,708 5,429 7 1,433
Citeseer 3,327 4,732 6 3,703
Polblogs 1,490 19,025 2 -
USA 1,190 13,599 4 -
Brazil 131 1,038 4 -
AIDS 31,385 64,780 38 4
ENZYMES 19,580 74,564 3 18
Table 1: Dataset statistics

2 Details of Random Sampling

Algorithm 1 Random sampling from probabilistic vector to binary adjacency matrix

Input: Probabilistic vector $\mathbf{a}$, number of trials $K$
Parameter: Edge density $\rho$
Output: Binary adjacency matrix $A$

1:  Normalize the probabilistic vector: $\hat{\mathbf{a}} = \mathbf{a}/\|\mathbf{a}\|_{1}$
2:  for $k = 1, 2, \cdots, K$ do
3:    Draw binary vector $\mathbf{a}^{(k)}$ by sampling $\lfloor\rho n\rfloor$ edges according to the probabilistic vector $\hat{\mathbf{a}}$
4:  end for
5:  Choose the vector $\mathbf{a}^{*}$ from $\{\mathbf{a}^{(k)}\}$ that yields the smallest loss $\mathcal{L}_{attack}$
6:  Convert $\mathbf{a}^{*}$ to the binary adjacency matrix $A$
7:  return $A$
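The sampling loop above can be sketched in Python with NumPy. This is a minimal illustration, not the authors' implementation: the `attack_loss` callable stands in for $\mathcal{L}_{attack}$, and treating $\mathbf{a}$ as a flat probability vector over candidate edges is an assumption.

```python
import numpy as np

def sample_binary_vector(a, rho, K, attack_loss, seed=None):
    """Sketch of Algorithm 1: draw K binary edge vectors from a
    probabilistic vector and keep the one with the smallest attack loss.
    `attack_loss` is a hypothetical callable standing in for L_attack."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    a_hat = a / np.abs(a).sum()          # normalize: a_hat = a / ||a||_1
    n = a.shape[0]
    m = int(np.floor(rho * n))           # sample floor(rho * n) edges per trial
    best, best_loss = None, np.inf
    for _ in range(K):
        # sample m distinct edge indices with probabilities a_hat
        idx = rng.choice(n, size=m, replace=False, p=a_hat)
        a_k = np.zeros(n, dtype=int)
        a_k[idx] = 1
        loss = attack_loss(a_k)
        if loss < best_loss:             # keep the trial with the smallest loss
            best, best_loss = a_k, loss
    return best                          # binary vector; reshape to A as needed
```

Converting the returned vector back to the adjacency matrix $A$ is a reshape/scatter step that depends on how the candidate edges were flattened, so it is left out here.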

3 Proof of Edge Influence

Theorem 1.

The adversary advantage is greater for edges with greater influence.

The proof is based on Lemma 1 from wu2016methodology.

Lemma 1.

Suppose the target model $f$ is trained on data distribution $p(\mathcal{X},\mathcal{Y})$. Let $\mathcal{X}=(x_{s},x_{ns})$, where $x_{s}$ and $x_{ns}$ denote the sensitive and non-sensitive parts of the features, respectively. The optimal adversary advantage is

P𝒳p(𝒳,𝒴)[f(xs=1,xns)f(xs=0,xns)].P_{\mathcal{X}\sim p(\mathcal{X},\mathcal{Y})}[f(x_{s}=1,x_{ns})\neq f(x_{s}=0,x_{ns})].
Proof.

In our graph setting, $x_{ns}$ refers to the feature matrix $X$ and $x_{s}$ refers to the edges. Set $P[f^{i}_{\theta^{*}}(A,X)=y_{i}]=P_{f_{\theta^{*}}}(y_{i}\,|\,A,X)=p$ and $P[f^{i}_{\theta^{*}}(A_{-e},X)=y_{i}]=P_{f_{\theta^{*}}}(y_{i}\,|\,A_{-e},X)=q$. Without loss of generality, assume the prediction accuracy is higher with edge $e$ present, so $q\leq p$. The adversary advantage is $Adv=p(1-q)+q(1-p)$. Through the variable substitution $x=p-q$, $y=p+q$, we have $Adv=y+\frac{x^{2}-y^{2}}{2}$ and $\frac{\partial Adv}{\partial\mathcal{I}(e)}=\frac{\partial Adv}{\partial x}=x=p-q\geq 0$. ∎
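The substitution step can be spelled out as follows; this expansion uses only the quantities $p$, $q$, $x=p-q$, and $y=p+q$ defined above.

```latex
\begin{aligned}
Adv &= p(1-q) + q(1-p) = p + q - 2pq = y - 2pq, \\
pq  &= \frac{y+x}{2}\cdot\frac{y-x}{2} = \frac{y^{2}-x^{2}}{4}, \\
Adv &= y - \frac{y^{2}-x^{2}}{2} = y + \frac{x^{2}-y^{2}}{2}, \qquad
\frac{\partial Adv}{\partial x} = x = p - q \geq 0.
\end{aligned}
```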