On Collective Robustness of Bootstrap Aggregation Against Data Poisoning
Abstract
Bootstrap aggregating (bagging) is a popular ensemble learning protocol owing to its effectiveness, simplicity and robustness. Prior works have derived deterministic robustness certificates against data poisoning for a specific variant of bagging. However, two questions remain open: 1) how to generalize the deterministic robustness certificate to the general form of bagging; 2) how to compute its collective robustness. Collective robustness refers to the minimum number of simultaneously unchanged predictions when predicting a testset, which has been shown to be a more informative and practical robustness measure against data poisoning. In this paper, we propose pseudo bagging, which enjoys deterministic robustness against data poisoning, to approximate an arbitrary form of bagging. Moreover, we propose the first certification that certifies the tight collective robustness of pseudo bagging. Specifically, the collective robustness is computed by solving a binary integer linear programming (BILP) problem, whose time complexity is exponential in the testset size. To reduce the time complexity, a decomposition strategy is devised to compute a lower bound instead. Empirical experiments show notable advantages in terms of practical applicability, collective robustness and certified accuracy.
1 Introduction
Bagging, as a well-known ensemble learning method, is commonly used for improving accuracy or reducing overfitting. (levine2021deep) proves a deterministic robustness certificate for a specific variant of bagging, partition-based bagging, against general data poisoning attacks (the attacker is allowed to arbitrarily insert/delete/modify a bounded number of training samples). This bagging-based defense significantly improves robustness compared to prior works (Wang2020OnCR; Rosenfeld2019CertifiedRT; ma2019data).
However, in terms of defense construction, the practicality of current bagging-based defenses (levine2021deep; jia2021intrinsic) is limited by their hard constraints. Specifically, (levine2021deep) requires the sub-trainsets of the sub-classifiers to be disjoint, and (jia2021intrinsic) needs to train hundreds of sub-classifiers to estimate an acceptable robustness lower bound. Therefore, there exists a gap between the bagging-based defenses and the general form of bagging, since we cannot arbitrarily customize the number of sub-classifiers and the sub-trainset size simultaneously. It is thus meaningful to develop a robustness certification that guarantees robustness for the general form of bagging, rather than only for specific variants.
Moreover, in terms of robustness certification, existing certifications against data poisoning mainly focus on sample-wise robustness (whether a single prediction is changeable under the attacks) and lack a discussion of a more informative robustness measure, collective robustness (the minimum number of unchanged predictions under the attacks). In practice, collective robustness is superior to sample-wise robustness for two reasons. First, sample-wise robustness is a special case of collective robustness where the testset size is one. Second, poisoning attacks have two properties: 1) an attacker cannot craft a different poisoning attack for each testing sample; 2) a poisoning attack globally influences all the predictions. Collective robustness, which jointly considers these two properties, is therefore more practical than sample-wise robustness, and it is preferable to compute collective robustness for bagging-based defenses.
However, computing collective robustness is challenging, as we need to collectively consider the strongest attack over multiple predictions. To our knowledge, there are only two collective certifications, both for specific defenses. The first work (schuchardt2021collective) computes the collective robustness against adversarial examples for graph neural networks by exploiting the locality property, but locality is specific to GNNs and cannot be generalized to all classifiers. The other work (jia2022rnn) computes the collective robustness against data poisoning for a particular machine-learning classifier, rNN (radius Nearest Neighbors), but the certification leverages the unique nature of rNN and cannot be applied to bagging. Therefore, computing collective robustness for bagging-based defenses is non-trivial.
In this paper, we propose pseudo bagging to provide bagging with certified robustness, and a collective certification to compute the tight collective robustness of pseudo bagging defenses. Our main idea is to formulate the problem of computing the collective robustness as a binary integer linear programming (BILP) problem, whose objective is to maximize the number of successfully changed predictions w.r.t. the pre-specified poison budget. Notably, the robustness guaranteed by our method is tight, as we only have access to the predictions. Moreover, with our method, we can estimate a much tighter accuracy lower bound (certified accuracy) than prior methods. The main contributions are three-fold:
1) On defense construction, we propose pseudo bagging, which approximates the general bagging algorithm while enjoying a deterministic robustness certificate against data poisoning.
2) On robustness certification, we show that computing the collective robustness is an NP-hard problem. We thus propose a problem decomposition strategy to compute a lower bound instead of the exact robustness, substantially reducing the time complexity. The lower bound certified by our method is theoretically greater than or equal to that of prior methods.
3) The empirical results on MNIST and CIFAR-10 show that our method significantly improves collective robustness. The source code will be made publicly available.
2 Related Works
We discuss the two lines of related work that our contribution mainly relates to: 1) certified defenses against data poisoning; 2) robustness certifications against general data poisoning.
Certified defenses against data poisoning.
There is a line of certified defenses (steinhardt2017certified; Wang2020OnCR) against data poisoning, such as random flipping (Rosenfeld2019CertifiedRT), randomized smoothing (weber2020rab), differential privacy (ma2019data) and ensemble-based defenses (levine2021deep; jia2021intrinsic). Among them, only (ma2019data; jia2022rnn; jia2021intrinsic; levine2021deep) are designed to defend against general data poisoning attacks. (ma2019data) is limited to training algorithms with differential privacy, and (jia2022rnn) can only be applied to rNN. Compared to (jia2021intrinsic), the robustness guaranteed by (levine2021deep) is often higher under a similar computation cost.
Robustness certification against general data poisoning.
Among existing robustness certifications against general data poisoning (ma2019data; jia2021intrinsic; jia2022rnn; levine2021deep), (ma2019data; jia2021intrinsic) are probabilistic, meaning that they have a nonzero probability of outputting a wrong robustness certificate. The robustness certification of (jia2022rnn) is deterministic, but it only applies to rNN rather than general NN classifiers. Currently, all the certifications for general NN classifiers focus on sample-wise certificates, which often yield a loose robustness certificate when certifying the robustness of multiple predictions. Therefore, in this work, we aim to develop a deterministic and collective certification to certify the exact robustness for multiple predictions.
3 Preliminaries
Strongest threat model.
The attacker is able to arbitrarily insert/delete/modify at most $r$ training samples, where $r$ denotes the poison budget. Since we impose no requirement on the sub-classifiers, we assume the attacker can fully control any sub-classifier trained on poisoned data.
Bootstrap aggregating (bagging) (peter2002bag) is defined as: 1) construct $k$ sub-trainsets by randomly selecting $m$ samples (with replacement) from the trainset of size $n$, repeated $k$ times; 2) train $k$ sub-classifiers on the sub-trainsets independently; 3) predict the majority class among the $k$ sub-classifiers' predictions.
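To make the protocol concrete, below is a minimal sketch of general bagging; the constructor `train_classifier` and all names are our own placeholders, not the paper's implementation.

```python
import random
from collections import Counter

def bag_train(trainset, k, m, train_classifier, seed=0):
    """Train k sub-classifiers, each on m samples drawn uniformly with replacement."""
    rng = random.Random(seed)  # fixing the seed keeps the sub-trainsets reproducible
    sub_classifiers = []
    for _ in range(k):
        sub_trainset = [trainset[rng.randrange(len(trainset))] for _ in range(m)]
        sub_classifiers.append(train_classifier(sub_trainset))
    return sub_classifiers

def bag_predict(sub_classifiers, x):
    """Ensemble prediction: the class with the most sub-classifier votes."""
    votes = Counter(f(x) for f in sub_classifiers)
    top = max(votes.values())
    return min(c for c, v in votes.items() if v == top)  # smallest class index breaks ties
```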
3.1 Certified robustness of bagging-based ensemble learning (levine2021deep).
Main idea.
Bagging-based ensemble learning obtains certified robustness from two ingredients: 1) limiting the influence scope of each poisoned sample by training each sub-classifier on only a subset of the trainset; 2) making predictions via the majority-voting mechanism.
General sample-wise certification of BEL.
We compute the sample-wise robustness in three steps: 1) compute the number of sub-classifier votes for each class; 2) constrain the portion of sub-classifiers that remain unaffected under any attack within the poison budget; 3) check whether the original majority class still wins under the worst-case reallocation of the affected votes.
Example 1 (Partition aggregation)
Example 2 (Bootstrap aggregation)
DPA exploits the majority-voting mechanism to achieve certified robustness. Specifically, given a trainset $D$, we first split $D$ into $k$ ($k$ is pre-specified) disjoint partitions $D_1, \dots, D_k$, based on a deterministic partition rule (e.g., a hash function). Then we learn the sub-classifiers $f_1, \dots, f_k$ on those partitions independently, in a deterministic manner. The ensemble of these sub-classifiers predicts the class with the majority of votes. Let $\bar{N}_c(x)$ ($c \in \mathcal{Y}$, where $\mathcal{Y}$ is the output space) refer to the number of sub-classifier votes for class $c$ after the attack when predicting $x$:
$$\bar{N}_c(x) = \sum_{i=1}^{k} \mathbb{1}\big[\bar{f}_i(x) = c\big], \qquad (1)$$
where $\bar{f}_i$ denotes the $i$-th sub-classifier after the attack. For clarity, $N_c(x)$ denotes the number of votes for class $c$ before the attack. The ensemble prediction is:
$$F(x) = \min\big\{\, c \in \mathcal{Y} : N_c(x) \ge N_{c'}(x),\ \forall c' \in \mathcal{Y} \,\big\}, \qquad (2)$$
where the $\min$ means that the class with the smallest index has a higher priority if there exist multiple majority classes. Remark (Reproducibility). We emphasize that DPA requires both the partition results and the training process to be reproducible. We realize reproducibility by fixing the random seed of all random operations.
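For instance, a minimal deterministic partition rule can hash a stable identifier of each training sample; this is a sketch of one possible rule, not necessarily the one used by DPA.

```python
import hashlib

def partition_index(sample_bytes: bytes, k: int) -> int:
    """Deterministically map a training sample to one of k disjoint partitions."""
    digest = hashlib.sha256(sample_bytes).hexdigest()
    return int(digest, 16) % k
```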
Sample-wise certification of DPA.
(levine2021deep) proves that the ensemble prediction $F(x)$ is certifiably robust against any poison budget $r$ satisfying:
$$N_{c_A}(x) - N_{c_B}(x) \ge 2r + \mathbb{1}[c_B < c_A], \qquad (3)$$
where $c_A$ ($c_B$) denotes the top-1 (top-2) majority class. Remark (Proof sketch of DPA). The attacker (of poison budget $r$) can control at most $r$ sub-classifiers. Consider the worst case where the attacker modifies $r$ votes for $c_A$ into votes for $c_B$: under Eq. (3), $c_A$ still has a higher priority than $c_B$.
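The following sketch illustrates Eq. (3) by computing, for a single test sample, the largest poison budget $r$ for which the ensemble prediction is certified. Class labels are assumed to be integers so that the tie-breaking indicator is well defined; the function name is ours.

```python
from collections import Counter

def dpa_certified_radius(sub_predictions, num_classes):
    """Largest poison budget r certified by Eq. (3) for one test sample.

    sub_predictions: list of integer class labels, one per sub-classifier.
    """
    votes = Counter(sub_predictions)
    counts = [votes.get(c, 0) for c in range(num_classes)]
    c_a = counts.index(max(counts))  # majority class; smallest index wins ties
    radius = None
    for c in range(num_classes):
        if c == c_a:
            continue
        # worst case for class c: every controlled sub-classifier switches its vote from c_a to c
        gap = counts[c_a] - counts[c] - (1 if c < c_a else 0)
        radius = gap // 2 if radius is None else min(radius, gap // 2)
    return radius

# e.g. dpa_certified_radius([0, 0, 0, 1, 2], num_classes=3) == 1
```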
Naive collective certification.
A naive collective certification simply fuses multiple sample-wise certificates, i.e., it counts the number of sample-wise robust predictions. However, the naive collective certification only gives a lower bound of the true collective robustness.
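Reusing the hypothetical `dpa_certified_radius` helper above, the naive collective certificate at budget r is just a count:

```python
def naive_collective_robustness(all_sub_predictions, num_classes, r):
    """Count the predictions whose sample-wise DPA certificate already covers budget r."""
    return sum(
        dpa_certified_radius(preds, num_classes) >= r
        for preds in all_sub_predictions  # one list of sub-classifier votes per testing sample
    )
```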

Certification gap.
We illustrate the gap between the sample-wise certificate and the collective certificate with a toy example in Fig. 1.
4 Methodology
Unlike the sample-wise threat model in Section 3, here we model an attacker (with full knowledge) that aims to maximize the number of successfully changed predictions within the pre-specified poison budget $r$. In fact, based on the nature of DPA, computing the optimal poisoning case can be simplified as: compute the maximum number of changed ensemble predictions if we can arbitrarily select $r$ sub-classifiers to fully control.
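For intuition, this simplified attacker can be evaluated exactly by brute force when the number of sub-classifiers $k$ is tiny. The sketch below (our own naming) enumerates all size-$r$ subsets of sub-classifiers and is exponential in $k$, so it is for illustration only.

```python
from itertools import combinations

def ensemble_predict(preds, num_classes):
    counts = [sum(1 for p in preds if p == c) for c in range(num_classes)]
    return counts.index(max(counts))  # smallest class index wins ties

def brute_force_max_changed(all_sub_predictions, num_classes, r):
    """Max number of ensemble predictions changed when any r sub-classifiers are fully controlled."""
    k = len(all_sub_predictions[0])
    best = 0
    for controlled in combinations(range(k), r):
        controlled = set(controlled)
        changed = 0
        for preds in all_sub_predictions:
            y = ensemble_predict(preds, num_classes)
            kept = [p for i, p in enumerate(preds) if i not in controlled]
            n_y = sum(1 for p in kept if p == y)  # remaining votes for the original prediction
            for c in range(num_classes):
                if c == y:
                    continue
                # controlled sub-classifiers all vote for the challenger class c
                n_c = sum(1 for p in kept if p == c) + len(controlled)
                if n_c > n_y or (n_c == n_y and c < y):
                    changed += 1
                    break
        best = max(best, changed)
    return best
```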
4.1 Compute Collective Certification of DPA
Given a collection of testing samples $x_1, \dots, x_T$, we aim to compute the maximum number of successfully changed predictions under the poison budget $r$. Let $\hat{y}_1, \dots, \hat{y}_T$ denote the ensemble predictions before the attack, and let $f_i(x_j)$ denote the (pre-attack) prediction of the $i$-th sub-classifier on $x_j$. We formulate the problem (BILP):
$$\max_{s}\ \sum_{j=1}^{T} \mathbb{1}\Big[\ \exists\, c \ne \hat{y}_j:\ \bar{N}_c(x_j) + \mathbb{1}[c < \hat{y}_j] > \bar{N}_{\hat{y}_j}(x_j) \Big] \qquad (4)$$
$$\text{s.t.}\quad s_i \in \{0, 1\}, \quad i = 1, \dots, k, \qquad (5)$$
$$\sum_{i=1}^{k} s_i \le r, \qquad (6)$$
$$\bar{N}_{\hat{y}_j}(x_j) = N_{\hat{y}_j}(x_j) - \sum_{i=1}^{k} s_i\, \mathbb{1}\big[f_i(x_j) = \hat{y}_j\big], \quad \forall j, \qquad (7)$$
$$\bar{N}_c(x_j) = N_c(x_j) + \sum_{i=1}^{k} s_i\, \mathbb{1}\big[f_i(x_j) \ne c\big], \quad \forall j,\ \forall c \ne \hat{y}_j. \qquad (8)$$
We now explain each equation. Eq. (5): $s_1, \dots, s_k$ are binary variables that encode the poisoning attack, where $s_i = 1$ denotes that the attacker poisons the $i$-th sub-trainset to control the sub-classifier $f_i$. Eq. (6): the poison budget is bounded by $r$. Eq. (7): $\bar{N}_{\hat{y}_j}(x_j)$ denotes the minimum number of votes for the (original) ensemble prediction $\hat{y}_j$ after the attack, which equals the number of votes for $\hat{y}_j$ before the attack ($N_{\hat{y}_j}(x_j)$) minus the number of attacked sub-classifiers that originally predict $\hat{y}_j$. Eq. (8): $\bar{N}_c(x_j)$ denotes the maximum number of votes for the class $c$ after the attack. Based on our threat model, the attacker can arbitrarily modify the predictions of the poisoned sub-classifiers, so the number of votes for class $c$ equals the number of original votes for $c$ plus the number of poisoned sub-classifiers whose original predictions are not $c$. Eq. (4): we aim to maximize the number of changed ensemble predictions, where the ensemble prediction on $x_j$ is changed only if there exists a class $c \ne \hat{y}_j$ that attains a higher priority than $\hat{y}_j$ after the attack.
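A minimal sketch of this formulation with the gurobipy interface is shown below. The variable names (s, t, z) and the big-M linearization of the indicator in Eq. (4) are our own modeling choices, not necessarily the paper's exact implementation.

```python
import gurobipy as gp
from gurobipy import GRB

def collective_attack_bilp(all_sub_predictions, num_classes, r):
    """Maximum number of ensemble predictions changeable with poison budget r (Problem (P1))."""
    T, k = len(all_sub_predictions), len(all_sub_predictions[0])
    model = gp.Model("collective_certification")
    model.Params.OutputFlag = 0
    s = model.addVars(k, vtype=GRB.BINARY, name="s")  # Eq. (5): s_i = 1 iff sub-trainset i is poisoned
    t = model.addVars(T, vtype=GRB.BINARY, name="t")  # t_j = 1 iff ensemble prediction j is changed
    model.addConstr(s.sum() <= r)                     # Eq. (6): poison budget
    M = k + 2                                         # big-M constant for the indicator in Eq. (4)
    for j, preds in enumerate(all_sub_predictions):
        n = [sum(1 for p in preds if p == c) for c in range(num_classes)]
        y = n.index(max(n))                           # original ensemble prediction (smallest index wins ties)
        n_y_bar = n[y] - gp.quicksum(s[i] for i, p in enumerate(preds) if p == y)      # Eq. (7)
        z = model.addVars(num_classes, vtype=GRB.BINARY)  # z_c = 1 only if class c overtakes y
        for c in range(num_classes):
            if c == y:
                model.addConstr(z[c] == 0)
                continue
            n_c_bar = n[c] + gp.quicksum(s[i] for i, p in enumerate(preds) if p != c)  # Eq. (8)
            tie = 1 if c < y else 0                   # smaller class index wins ties
            model.addConstr(n_c_bar - n_y_bar >= (1 - tie) - M * (1 - z[c]))
        model.addConstr(t[j] <= z.sum())              # prediction j counts as changed only if some z_c = 1
    model.setObjective(t.sum(), GRB.MAXIMIZE)         # Eq. (4)
    model.optimize()
    return int(round(model.ObjVal))
```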
Observations about Problem (BILP).
1. The collective robustness (the minimum number of unchanged ensemble predictions) equals the total number of predictions minus the optimal value of Problem (BILP).
2. The certified collective robustness is tight.
3. Problem (P1) is an NP-hard problem.
4. We can simplify Problem (P1) by ignoring the testing samples whose outcomes are already determined under any admissible attack (e.g., predictions whose sample-wise certificate already covers the budget $r$).
5. We can simplify Problem (P1) by partitioning it into multiple sub-problems.
6. The collective robustness is highly related to prediction diversity.
Implementation.
Alg. 1 shows our practical algorithm for computing the collective certification for a collection of testing samples.
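Alg. 1 is not reproduced here; as a highly simplified stand-in, the following sketch combines the pieces sketched above (sample-wise pre-filtering plus the BILP of Section 4.1) under our own naming, and omits the decomposition strategy.

```python
def collective_robustness(all_sub_predictions, num_classes, r):
    """Certified minimum number of unchanged ensemble predictions under poison budget r.

    Predictions whose sample-wise certificate already covers r can never contribute to the
    attack objective, so they are removed from the BILP and counted as unchanged directly.
    """
    certified = [p for p in all_sub_predictions if dpa_certified_radius(p, num_classes) >= r]
    remaining = [p for p in all_sub_predictions if dpa_certified_radius(p, num_classes) < r]
    changed = collective_attack_bilp(remaining, num_classes, r) if remaining else 0
    return len(certified) + (len(remaining) - changed)
```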
5 Comparisons to Prior Works
We compare our method to: 1) partition aggregation defense (levine2021deep); 2) prior certified defenses with collective certifications (steinhardt2017certified; schuchardt2021collective; jia2022rnn).
Comparison to (levine2021deep) Our collective certification computes the exact certified accuracy against the general data poisoning attack, as we only have access to the predictions and the poison budget.
Comparison to (steinhardt2017certified) The collective certification of (steinhardt2017certified) is substantially different from ours: it instead derives an approximate upper bound on the test loss under data poisoning attacks, and therefore cannot guarantee the certified accuracy.
Comparison to (schuchardt2021collective) (schuchardt2021collective) proves a collective certification against adversarial examples on graph neural networks, which relies on the locality property of the base classifiers (e.g., GCN). However, this technique is invalid for many other tasks (e.g., image classification), where the classifiers lack the locality property.
Comparison to (jia2022rnn) This related work derives the collective certification for the machine learning algorithm rNN (Radius Nearest Neighbors). Since naive rNN performs poorly on tasks such as CIFAR-10, the authors propose to enhance rNN by using a pre-trained encoder to extract features, on top of which rNN operates. Therefore, the performance of rNN (jia2022rnn) highly depends on the encoder. However, a well-trained encoder is often unavailable in practice, and training the encoder is itself vulnerable to data poisoning attacks. This vulnerability is ignored by (jia2022rnn), which makes comparisons to other certified defenses unfair.
Table: experimental settings. MNIST: trainset size 60,000, number of partitions 250/500/1000. CIFAR-10: trainset size 50,000, number of partitions 50/100/200.
6 Experiments
6.1 Experimental Setups
Datasets and models.
We mainly evaluate the robustness certification on MNIST (lecun2010mnist) and CIFAR-10 (krizhevsky2009learning), following the prior works (levine2021deep; jia2021intrinsic; jia2022rnn). Following (levine2021deep), we use the NiN (Network-In-Network) model architecture (min2014nin) for MNIST, and NiN with full data augmentation for CIFAR-10. The trainset sizes of the two datasets are 60,000 and 50,000 respectively, and the testsets used for evaluation are of the same size for both datasets. All the experiments are conducted on CPU (16 Intel(R) Xeon(R) Gold 5222 CPU @ 3.80GHz) and GPU (one NVIDIA RTX 2080 Ti).
Peer methods.
We compare our method (the collective certification of partition aggregation) to the sample-wise certification of partition aggregation (levine2021deep) and the sample-wise certification of bagging (jia2021intrinsic). For (jia2021intrinsic), we take the same confidence level as the original implementation, and set its number of base classifiers equal to our number of partitions for computational fairness. We do not compare to the collective certification of rNN (jia2022rnn) because its pre-trained encoder is trained on additional datasets without considering poisoning attacks, which would make the comparison unfair.
Evaluation metrics.
Following the prior works (levine2021deep; jia2021intrinsic; jia2022rnn), we evaluate the performance of the robustness certification on the given testset by three evaluation metrics: CA(r) (certified accuracy under the poison budget $r$), ACR (average certified robustness) and MCR (median certified robustness). Specifically, CA(r), ACR and MCR are given by:
$$\mathrm{CA}(r) = \frac{1}{n}\,\big(\text{certified number of simultaneously correct predictions under poison budget } r\big), \qquad (9)$$
$$\mathrm{ACR} = \sum_{r=1}^{\infty} \mathrm{CA}(r), \qquad (10)$$
$$\mathrm{MCR} = \max\big\{\, r : \mathrm{CA}(r) \ge 50\% \,\big\}, \qquad (11)$$
where $n$ is the testset size. We emphasize that our definitions of CA(r), ACR and MCR are compatible with the original definitions in the sample-wise certifications (cohen2019certified; levine2021deep). Specifically, 1) both definitions of CA(r) give an accuracy lower bound, but ours gives the exact (tight) lower bound; 2) both definitions of ACR are the sum of the certified accuracy over the poison budget, measuring the average poison budget that a testing sample can tolerate; 3) both definitions of MCR are the largest poison budget at which the certified accuracy is still at least 50%. For computational reasons, we approximate ACR by evaluating CA(r) only on a grid of poison budgets $0 = r_0 < r_1 < \dots < r_m$. We report a lower bound $\underline{\mathrm{ACR}}$, an upper bound $\overline{\mathrm{ACR}}$ and an approximation $\widetilde{\mathrm{ACR}}$, computed by:
$$\underline{\mathrm{ACR}} = \sum_{i=1}^{m} (r_i - r_{i-1})\, \mathrm{CA}(r_i), \qquad (12)$$
$$\overline{\mathrm{ACR}} = \sum_{i=1}^{m} (r_i - r_{i-1})\, \mathrm{CA}(r_{i-1}), \qquad (13)$$
$$\widetilde{\mathrm{ACR}} = \tfrac{1}{2}\big(\underline{\mathrm{ACR}} + \overline{\mathrm{ACR}}\big). \qquad (14)$$
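Under the reconstruction above (and assuming the budget grid extends to where CA(r) reaches zero, so the upper bound is valid), the reported quantities can be computed as follows; the variable names are ours.

```python
def acr_bounds(budgets, ca_values, ca_at_zero):
    """ACR lower bound, upper bound and their midpoint, following Eqs. (12)-(14).

    budgets:   increasing grid r_1 < ... < r_m (r_0 = 0 is implicit).
    ca_values: CA(r_1), ..., CA(r_m);  ca_at_zero: CA(0).
    Since CA(r) is non-increasing in r, the two sums bracket the true ACR.
    """
    lower, upper, prev_r, prev_ca = 0.0, 0.0, 0, ca_at_zero
    for r, ca in zip(budgets, ca_values):
        lower += (r - prev_r) * ca       # right endpoint of each step: underestimate
        upper += (r - prev_r) * prev_ca  # left endpoint of each step: overestimate
        prev_r, prev_ca = r, ca
    return lower, upper, 0.5 * (lower + upper)

def median_certified_robustness(budgets, ca_values):
    """Largest evaluated budget at which the certified accuracy is still at least 50% (Eq. (11))."""
    return max((r for r, ca in zip(budgets, ca_values) if ca >= 0.5), default=0)
```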
Implementation details.
In practice, we solve the optimization problem (P1) with Gurobi 9.0 (gurobi), which has a useful feature: Gurobi can return a lower/upper bound on the objective value within a given time limit.
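For example, the time limit and the returned bounds can be used as follows; this is a sketch assuming the gurobipy interface and a model built as in the Section 4 sketch, and the 60-second limit is purely illustrative.

```python
def solve_with_time_limit(model, num_predictions, seconds=60.0):
    """Optimize the attack BILP under a wall-clock limit and translate the solver's
    incumbent and proven bound into bounds on the collective robustness."""
    model.Params.TimeLimit = seconds
    model.optimize()
    attack_upper = model.ObjBound                               # proven upper bound on the attack objective
    attack_lower = model.ObjVal if model.SolCount > 0 else 0.0  # best feasible attack found so far
    robustness_lower = num_predictions - attack_upper           # guaranteed collective robustness
    robustness_upper = num_predictions - attack_lower
    return robustness_lower, robustness_upper
```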
Table: certified robustness on MNIST with 250 and 500 partitions, reporting MCR, ACR, and certified accuracy CA(r) at poison budgets up to 125 (250 partitions) and 250 (500 partitions).
Table: certified robustness on CIFAR-10 with 50, 100 and 200 partitions, reporting MCR, ACR, and certified accuracy CA(r) at poison budgets up to 25, 50 and 100, respectively.
Evaluation results on MNIST.
Evaluation results on CIFAR-10.
6.2 Runtime Analysis
6.3 Ablation Studies
Impact of partition size.
Impact of testset size.
Impact of ensemble diversity.
7 Conclusion
We have presented a collective certification approach against both training-stage attacks and inference-stage attacks. To the best of our knowledge, this is the first work to consider an NN classifier's robustness certification against data poisoning in a collective manner. Consequently, the collective certification reports a much more precise evaluation of the overall certified robustness on the given testset. Our collective certification can be easily implemented. Empirical results suggest that the collective certification yields much stronger overall robustness certificates.