Appendices
1 Dataset Statistics
Here we list the statistics of the seven public datasets in Table 1.
Dataset | Nodes | Edges | Classes | Features
Cora | 2,708 | 5,429 | 7 | 1,433
Citeseer | 3,327 | 4,732 | 6 | 3,703
Polblogs | 1,490 | 19,025 | 2 | -
USA | 1,190 | 13,599 | 4 | -
Brazil | 131 | 1,038 | 4 | -
AIDS | 31,385 | 64,780 | 38 | 4
ENZYMES | 19,580 | 74,564 | 3 | 18
2 Details of Random Sampling
Input: probabilistic vector over candidate edges; number of sampling trials
Parameter: edge density
Output: binary adjacency matrix
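The body of the sampling procedure is not reproduced here; a minimal sketch, assuming each entry of the probabilistic vector is treated as an independent Bernoulli probability and the trial whose density is closest to the target edge density is kept (the function name `sample_binary`, the seeding, and the selection criterion are illustrative assumptions, not the paper's exact procedure):

```python
import random

def sample_binary(p, n_trials, density, seed=None):
    """Draw n_trials Bernoulli samples of the probabilistic vector p and
    return the binary vector whose edge density is closest to the target.
    The returned vector can be reshaped into a binary adjacency matrix."""
    rng = random.Random(seed)
    best, best_gap = None, float("inf")
    for _ in range(n_trials):
        # One Bernoulli trial per candidate edge.
        b = [1 if rng.random() < p_i else 0 for p_i in p]
        # Keep the sample whose fraction of present edges best matches
        # the target density.
        gap = abs(sum(b) / len(b) - density)
        if gap < best_gap:
            best, best_gap = b, gap
    return best
```

For example, `sample_binary([0.9, 0.1, 0.5, 0.5], n_trials=10, density=0.5, seed=0)` returns a length-4 binary vector with roughly half of the entries set.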
3 Proof of Edge Influence
Theorem 1.
The adversary advantage is greater for edges with greater influence.
The proof is based on Lemma 1 from wu2016methodology.
Lemma 1.
Suppose the target model is trained on data distribution $\mathcal{D}$. Write $x = (s, x_{ns})$, where $s$ and $x_{ns}$ denote the sensitive and non-sensitive parts of the feature $x$ respectively. The optimal adversary advantage is
$$\mathrm{Adv} = \Pr[\hat{y} = y \mid s = 1] - \Pr[\hat{y} = y \mid s = 0].$$
Proof.
In our graph setting, $x_{ns}$ refers to the feature matrix and $s$ refers to the edges. Set $p_1 = \Pr[\hat{y} = y \mid s = 1]$; $p_0 = \Pr[\hat{y} = y \mid s = 0]$. Without loss of generality, the prediction accuracy is higher with the edge present, so we set $p_1 \geq p_0$. By Lemma 1, the adversary advantage is $\mathrm{Adv} = p_1 - p_0$. Through the variable substitution $I = p_1 - p_0$, where $I$ denotes the edge influence, we have $\mathrm{Adv} = I$ and $\partial \mathrm{Adv} / \partial I = 1 > 0$, so the advantage increases monotonically with the edge influence. ∎
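The monotone relation in the proof can be checked on toy numbers; a minimal sketch, where the baseline and with-edge accuracies below are illustrative values, not measurements from the paper:

```python
def adversary_advantage(p1, p0):
    """Advantage of the optimal adversary per Lemma 1, assuming p1 >= p0:
    the gap between prediction accuracy with and without the sensitive edge."""
    return p1 - p0

# Two hypothetical edges sharing the same no-edge accuracy p0 = 0.6:
p0 = 0.6
low = adversary_advantage(p1=0.65, p0=p0)   # edge influence I = 0.05
high = adversary_advantage(p1=0.90, p0=p0)  # edge influence I = 0.30

# Greater influence yields greater adversary advantage.
assert high > low
```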