XG-BoT: An Explainable Deep Graph Neural Network
for Botnet Detection and Forensics

Wai Weng Lo w.w.lo@uq.net.au Gayan Kulatilleke g.kulatilleke@uq.net.au Mohanad Sarhan m.sarhan@uq.net.au Siamak Layeghy siamak.layeghy@uq.net.au Marius Portmann marius@itee.uq.edu.au School of ITEE, The University of Queensland, Brisbane, Australia

Abstract

In this paper, we propose XG-BoT, an explainable deep graph neural network model for botnet node detection. The proposed model comprises a botnet detector and an explainer for automatic forensics. The XG-BoT detector can effectively detect malicious botnet nodes in large-scale networks. Specifically, it utilizes a grouped reversible residual connection with a graph isomorphism network to learn expressive node representations from botnet communication graphs. The explainer, based on the GNNExplainer and saliency map in XG-BoT, can perform automatic network forensics by highlighting suspicious network flows and related botnet nodes. We evaluated XG-BoT using real-world, large-scale botnet network graph datasets. Overall, XG-BoT outperforms state-of-the-art approaches in terms of key evaluation metrics. Additionally, we demonstrate that the XG-BoT explainers can generate useful explanations for automatic network forensics.

Keywords: Graph neural network, Graph representation learning, Botnet detection, Digital forensics, Anomaly detection

1 Introduction

A botnet is a computer network that consists of compromised victim computers or IoT devices (bots) controlled by a "botmaster" to perform malicious activities. The victims are usually used for distributed denial of service (DDoS) attacks, phishing, and malware propagation. Due to dynamic changes in network flows (e.g., rapid changes in botnet sizes), it is very difficult to detect botnet nodes effectively. Current machine learning (ML)-based botnet detection approaches [1, 2] require a large amount of domain knowledge and manual effort from experts to extract features such as packet sizes, packet bytes, and corresponding protocols. Furthermore, network traffic can be encrypted and intentionally manipulated (e.g., payload mutation) [3] to evade ML-based network intrusion detection systems (NIDS).

The botnet detection problem can be approached with a graph-based method, where hosts (victims and attackers) are mapped to nodes and the network flows between them are represented as edges. This makes it possible to consider the overall graph patterns, in addition to the network flows and features, for botnet detection. Previous works [4, 5] only used the topological pattern of the botnet network graphs for detection, ignoring network flow features. These works considered graph centrality as a feature for botnet detection, which might not describe the corresponding botnet patterns sufficiently.

Figure 1: Botnet overview. Left: C2 botnet. Right: P2P botnet.

Graph Neural Networks (GNNs) [6] represent a fast-growing field in machine learning. They can automatically capture the graph data structure to perform downstream tasks, such as node classification. Since the botnet network graph consists of rich structural information, the corresponding graph structures can be utilized for automatic botnet detection based on the GNN. Thus, most recent works [7, 8] have explored the use of vanilla graph convolutional networks (GCNs) for botnet node detection by converting the problem into a node classification problem. However, vanilla GCNs are susceptible to over-smoothing, vanishing gradient, and over-fitting problems [9], and the model performance can degrade significantly with an increase in the number of graph convolutional layers.

In this paper, we propose a deep GNN model to defend against sophisticated botnet attacks. We theorize that GNN models need to be deeper to capture some of the hidden topological botnet patterns. To this end, we first used two real-world botnet graph datasets [7], one for decentralized P2P botnets and one for centralized C2 botnets, as illustrated in Fig. 1, and trained the GNN model to learn the botnet topological patterns. The major problem is that most deep GNN approaches suffer from the over-smoothing problem [9]. Botnet topological patterns can come in drastically varied depths [10], and deeper models are required to detect some of the hidden topological patterns. Thus, we propose XG-BoT, which combines grouped reversible residual connections [11], which can effectively handle the model performance degradation (e.g., the vanishing gradient problem) caused by a higher number of layers, with a powerful graph isomorphism network (GIN) [12] for effective botnet detection.

Flagging a malicious botnet node is not a trivial process. False alarms can distract the network administrator and increase the level of irrelevant workload. Extensive human resources are required to review the model’s detection results, which is inefficient and costly. Therefore, in this paper, we also investigate explainability approaches, specifically the GNNExplainer [13] and saliency maps [14], to provide intuitive explanations for model predictions based on highlighting suspicious network flows and related botnet nodes.

Overall, the aim of this study is to propose a novel and explainable deep GNN-based botnet detection system. Most related works, such as [15, 16, 17], consider network flows independently, without taking into account their interconnected relationship, which is important in botnet detection. On the other hand, other GNN approaches, such as those presented in [7] and [8], are susceptible to over-smoothing problems and lack explainability for network forensics. To address these limitations, we present the XG-BoT model, which captures graph patterns for sophisticated botnet detection. Additionally, we utilize GNNExplainer and saliency maps to highlight suspicious network flows and botnet nodes, which makes the detection process more transparent and interpretable for automatic network forensics; these capabilities are lacking in current botnet detection works. Our results demonstrate that the proposed approach achieves state-of-the-art performance in terms of the detection rate. It can also generate useful explanations based on subgraph visualisation for automatic forensics, which indicates the potential of deep GNN approaches for automatic botnet detection and forensics.

In summary, the key contributions of this paper are as follows:

1. We propose and implement XG-BoT, a deep, explainable, graph neural network-based model that can detect botnets within large-scale communication networks. XG-BoT can also enable automatic forensics by highlighting suspicious network flows and related botnet nodes using GNNExplainer and saliency maps. This, until now, has been lacking in current botnet detection approaches.

2. The comprehensive evaluation of the proposed XG-BoT approach, using two datasets, indicates that it can achieve superior performance compared to state-of-the-art approaches. Additionally, the experiments demonstrate that useful explainable results can be generated for automatic network forensics.

The paper is organized as follows: Section 2 provides background on botnets. In Section 3, we review the latest works related to botnet detection. In Section 4, we describe our proposed method. Section 5 describes the experimental settings. In Section 6, we provide an evaluation of the model. Section 7 discusses explainability, and Section 8 summarises the paper.

2 Botnets

A botnet is a collection of bots, which are computers or IoT devices that have been compromised through malicious software. Bots are connected to botmasters, the attackers who remotely control the botnet by issuing commands to the bots to perform malicious actions. Botmasters typically remain hidden from the public and evade detection by law enforcement. One of the most common botnet architectures is the centralized command-and-control infrastructure (C2) [10], as shown on the left of Fig. 1, consisting of bots and a centralized control entity. In C2, the botmasters use one or more network protocols to command victim computers and coordinate their actions. The instructions can range from the execution of denial-of-service attacks to malicious software propagation. The centralized approach is very similar to the traditional client-server architecture. It can be implemented through the Internet Relay Chat (IRC) protocol, where all bots establish a communication channel with the botmaster.

The C2 architecture enables easy and direct communication with the bots and allows the botmaster to monitor the global distribution and status of the bots. However, the main problem with the C2 architecture is its single point of failure, which allows law enforcement to shut down the botnet easily. This provides motivation for the use of a decentralized P2P [10] botnet architecture, as shown on the right of Fig. 1. Due to the decentralized nature of the P2P architecture, P2P botnets are more resilient and more difficult to shut down because there is no central C2 server that can be disabled.

3 Related Works

McDermott et al. [15] used a deep learning approach to perform botnet detection on IoT networks based on a bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) model to detect botnet activity within consumer IoT devices and networks. They first used word embeddings to encode and convert packets into a tokenized integer format for feature extraction. Then, the word embedding vector was fed to the BLSTM-RNN model to detect botnet activity. The authors applied four attack vectors used by the Mirai botnet malware [16] for the performance evaluation. Shi et al. [18] proposed a deep learning method to detect and classify botnets on the extracted features from the input network traffic by using LSTM and RNN models. Ahmed et al. [17] proposed a deep feed-forward neural network model for botnet detection and compared the proposed method with the Support Vector Machine and Naive Bayes algorithms. Kozik et al. [19] developed an attack detection platform for IoT applications based on Extreme Learning Machines (ELM) and the Apache Spark framework. They first used the CTU-13 Netflow dataset collected from an IoT network and then applied the proposed platform for botnet detection. The results show that the approach achieves high accuracy values of 0.99, 0.76, and 0.95 for scanning, C2 and infected host scenarios, respectively.

In [20], the authors proposed a botnet detector by combining both network flow and graph pattern information. They first built the network graph to represent network connections between hosts and then extracted statistical-based information from the graph features and network flow for feature extraction to train the 1D CNN-RNN to identify botnets. The CTU-13 and ISOT datasets were used to evaluate the model’s effectiveness. The experimental results show that an overall accuracy of 99.3% and an F1-score of 99.1% were achieved. In [21], the authors created a 28 Standard Android Botnet Dataset (28-SABD) and an Android botnet malware dataset, including 14 families of Android botnet malware traffic. They used the ensemble K-Nearest Neighbors (KNN) technique to improve the overall detection accuracy. However, the proposed method only obtained an overall accuracy of 94%, which indicates that further improvement of the proposed method is needed. In [22], the authors introduced an unsupervised botnet detection method for IoT based on the One-Class Support Vector Machine (OCSVM). They applied Grey Wolf optimization (GWO) to optimize the hyperparameters of the OCSVM to improve the botnet detection performance.

Graph-based machine learning methods for botnet detection are explored in [4, 5]. The authors designed a graph-based technique to detect botnet attacks. They first built a botnet communication graph by representing hosts as nodes and the network flows between them as edges. Subsequently, they extracted graph centrality features, such as the degree centrality, and applied various machine learning algorithms to detect the botnets. However, these works considered graph centrality as a feature, which might not capture the hidden botnet topological patterns sufficiently.

In [7, 8], the authors used a basic ’vanilla’ GCN to detect botnets. They exploited generated botnet traffic and normal traffic created by using the CTU-13 [23] dataset to create botnet communication graphs and applied a 12-layer GCN for botnet node classification. Botnet communication graphs do not contain any node features for botnet detection, and those approaches rely on the graph pattern structure of the communication graph for botnet node classification. Nevertheless, these approaches suffer from the over-smoothing problem [24]. It has been indicated that, as the number of graph convolutional layers increases, all node embeddings over a graph will converge to indistinguishable vectors, which can lower the performance in downstream tasks [25]. Zhou et al. [7] theorized that the GNN model needs to reach a certain depth in order to capture some of the hidden topological botnet patterns. Thus, further investigation of the application of deep GNNs for botnet detection is critical.

Table 1: Comparison of state-of-the-art botnet detection methods

Reference	Method	Features	Datasets	Support for Automatic Network Forensics
Chowdhury et al. [4]	Graph-based	Graph centrality	CTU-13 [23]	No
McDermott et al. [15]	BLSTM-RNN	Word embeddings	Custom dataset	No
Kozik et al. [19]	ELM and Spark	Network flow features	CTU-13 [23]	No
Moodi et al. [21]	Ensemble KNN	Network flow features	Custom dataset	No
Abou Daya et al. [5]	Graph-based	Graph centrality	CTU-13 [23]	No
Pektaş et al. [20]	1D CNN-RNN	Graph and network flow features	CTU-13 [23] and ISOT [26]	No
Shi et al. [18]	LSTM and RNN	Network flow features	CTU-13 [23]	No
Ahmed et al. [17]	Deep feed-forward NN	Network flow features	CTU-13 [23]	No
Alqahtani et al. [22]	OCSVM and GWO	Network flow features	N-BaIoT [27]	No
Zhou et al. [7]	GCN	Communication graphs	CTU-13 and CAIDA [7]	No
Zhang et al. [8]	GCN	Communication graphs	CTU-13 and CAIDA [7]	No
XG-BoT	Grouped reversible GINs	Communication graphs	CTU-13 and CAIDA [7]	Yes

Table 1 presents a comparison of various botnet detection methods used in related studies. In contrast to these studies, our XG-BoT approach utilizes grouped reversible residual connections with GINs for botnet detection. This approach helps mitigate the over-smoothing problem present in [7, 8] and captures deeper hidden botnet patterns to improve detection performance. Additionally, we enable GNN explainability for automatic network forensics by highlighting highly correlated hosts and network flows, which until now has been lacking in current botnet detection approaches.

4 Proposed Method

The proposed XG-BoT approach was designed to extract useful graph topological patterns for botnet detection in large-scale botnet graph datasets. Since botnet datasets exhibit a high class imbalance, which affects the detection performance, the proposed method aims to train very deep GCNs and capture as many of the hidden topological patterns for botnet detection as possible. The goal is to improve classification performance by enhancing node representation.

The GCN utilizes a message propagation mechanism to compute node embeddings by incorporating the $K$ -hop neighbours’ node features. The trained GCN models can be applied to different graphs to generate embeddings for the downstream tasks (i.e., node classification).

We consider a network $G=(N,E)$ , where $N$ represents the set of hosts (vertices) and $E$ is the set of communication flows (edges). The adjacency matrix $A$ is an $|N|\times|N|$ sparse matrix whose $(i,j)$ entry is non-zero if there is a communication flow between hosts $i$ and $j$ . Each node has a $K$-dimensional node feature vector, and $X\in\mathbb{R}^{|N|\times K}$ is the feature matrix whose rows are the feature vectors of the $|N|$ nodes.

The $k$ -th layer of a typical GCN is

h_{v}^{(k)}=\sigma\left(W^{(k)}\cdot{\rm MEAN}\left\{h_{u}^{(k-1)},\ \forall u\in\mathcal{N}(v)\cup\{v\}\right\}\right),

(1)

where $h_{v}^{(k)}$ is the embedding of node $v$ at the $k$-th layer, $\mathcal{N}(v)$ is the set of neighbours of $v$ , $W^{(k)}$ is the weight matrix of the $k$-th layer, which is learned for the downstream tasks, and $\sigma$ is an activation function, typically ReLU. Since the vanilla GCN is limited by the number of layers, we propose XG-BoT, which combines the grouped reversible residual connections [11] with GINs [12] for botnet detection to counter the over-smoothing problem and capture deeper hidden botnet patterns.
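As an illustration of Equation 1 and of the over-smoothing effect, the following is a minimal plain-Python sketch, not the implementation used in this work. The toy three-node graph, the identity weight matrix, and the helper names `gcn_mean_layer` and `spread` are assumptions chosen so that the effect of repeated mean aggregation can be seen in isolation.

```python
def relu(x):
    return max(0.0, x)


def gcn_mean_layer(h, neighbours, W):
    """One layer of Eq. (1): h_v = ReLU(W . MEAN{h_u : u in N(v) or u = v})."""
    out = {}
    for v, h_v in h.items():
        group = [h[u] for u in neighbours[v]] + [h_v]
        mean = [sum(vec[d] for vec in group) / len(group) for d in range(len(h_v))]
        out[v] = [relu(sum(w * m for w, m in zip(row, mean))) for row in W]
    return out


def spread(h):
    """Largest per-dimension gap between any two node embeddings."""
    return max(max(col) - min(col) for col in zip(*h.values()))


# Toy undirected graph (assumption): node 0 is connected to nodes 1 and 2.
neighbours = {0: [1, 2], 1: [0], 2: [0]}
h0 = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 1.0]}
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights (assumption)

h1 = gcn_mean_layer(h0, neighbours, W)

# Stacking many mean-aggregation layers drives all embeddings together.
h_deep = h0
for _ in range(50):
    h_deep = gcn_mean_layer(h_deep, neighbours, W)

print("spread after 0 layers:", spread(h0))
print("spread after 50 layers:", spread(h_deep))
```

In this toy setting the node embeddings become numerically indistinguishable after many layers, which is the behaviour that motivates the residual connections described next.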

Proposed XG-BoT model: In this study, we utilized grouped reversible residual connections [11] with GINs [12] to build the XG-BoT model to perform botnet node detection, as shown in Fig. 2. In the XG-BoT model, the input node feature matrix $X$ , which is an all-ones constant vector, is transformed into a 64-dimensional vector using a linear transformation and then uniformly partitioned into $C=2$ groups across the hidden layer channel dimension. Each of the grouped GIN modules only takes the corresponding group of node features to compute the corresponding node embeddings. The forward propagation for computing the embeddings $X^{\prime}$ is performed as follows:

X_{0}^{\prime}=\sum_{i=2}^{C}X_{i}

(2)

X_{i}^{\prime}=f_{w_{i}}\left(X_{i-1}^{\prime},g\right)+X_{i},\quad i\in\{1,\cdots,C\}

(3)

X^{\prime}=X_{1}^{\prime}\|X_{2}^{\prime}

(4)

where the node features are split into $C$ groups $X_{1},\cdots,X_{C}$ for message propagation, and $X_{0}^{\prime}$ , the sum of the groups $X_{2},\cdots,X_{C}$ , is the input to the first GIN block, as shown in Equation 2. In Equation 3, each of the GIN blocks adds its output to the corresponding grouped node feature $X_{i}$ , computing node embeddings with the reversible residual connection mechanism [28] to minimize over-smoothing and memory consumption problems. For example, in Fig. 2, it is assumed that there are $C=2$ groups and each of the reversible GIN blocks takes two inputs $\left(x_{1},x_{2}\right)$ and produces two intermediate node representations $\left(y_{1},y_{2}\right)$ . The residual functions of the GIN blocks $G_{1}$ and $G_{2}$ are

		$\displaystyle y_{1}=x_{1}+G_{1}\left(x_{2}\right)$
		$\displaystyle y_{2}=x_{2}+G_{2}\left(y_{1}\right)$		(5)

In the final step of forward propagation, the node embeddings are reconstructed by concatenating the node embeddings of each group, as shown in Equation 4. The concatenated embedding is then fed to a fully connected (MLP) layer and a Softmax layer to perform the downstream task (i.e., botnet node classification). Due to group processing, the number of training parameters decreases as the number of groups $C$ increases. This allows a deeper XG-BoT model to be built, allowing the capture of hidden topological patterns.
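The reversible coupling in Equation 5 can be sketched in plain Python as follows. The inverse function is not given in the text above; it is the standard property of reversible residual blocks, which lets the block inputs be recomputed from the outputs instead of being stored, and this is where the memory saving comes from. The functions `G1` and `G2` are hypothetical element-wise stand-ins for the two GIN blocks, used only to keep the example self-contained.

```python
def reversible_forward(x1, x2, G1, G2):
    """Eq. (5): y1 = x1 + G1(x2), y2 = x2 + G2(y1)."""
    y1 = [a + b for a, b in zip(x1, G1(x2))]
    y2 = [a + b for a, b in zip(x2, G2(y1))]
    return y1, y2


def reversible_inverse(y1, y2, G1, G2):
    """Recover the inputs from the outputs by undoing Eq. (5) in reverse order."""
    x2 = [a - b for a, b in zip(y2, G2(y1))]
    x1 = [a - b for a, b in zip(y1, G1(x2))]
    return x1, x2


# Hypothetical stand-ins for the GIN blocks (assumption): any deterministic
# function works, since the inverse never needs to invert G1 or G2 themselves.
def G1(x):
    return [max(0.0, 2.0 * v - 1.0) for v in x]


def G2(x):
    return [max(0.0, 0.5 * v + 0.25) for v in x]


x1, x2 = [1.0, -2.0, 0.5], [0.25, 3.0, -1.0]
y1, y2 = reversible_forward(x1, x2, G1, G2)
r1, r2 = reversible_inverse(y1, y2, G1, G2)
print("recovered inputs match:", r1 == x1 and r2 == x2)
```

Note that the inverse only evaluates `G1` and `G2` in the forward direction, so the blocks themselves do not need to be invertible.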

Graph Encoder module: As we mentioned, we adopted GIN [12], a state-of-the-art graph neural network, as the GNN encoder block in Fig. 2. GIN calculates each node representation via a sum-like neighbourhood aggregation function, as shown below:

h_{v}^{(k)}=\mathrm{MLP}^{(k)}\left(\left(1+\epsilon^{(k)}\right)\cdot h_{v}^{(k-1)}+\sum_{u\in\mathcal{N}(v)}h_{u}^{(k-1)}\right)

(6)

where $h_{v}^{(k)}$ is the node embedding of node $v$ at the $k$-th layer, MLP is the multi-layer perceptron, and $\epsilon^{(k)}$ is either a fixed scalar or a trainable parameter. We can stack $L$ layers of GIN to obtain the final node embedding $h_{v}^{(L)}$ . Unlike vanilla GCNs [29], which are much less powerful than the Weisfeiler–Lehman (1-WL) [30] algorithm, the sum-like aggregator in GIN can capture the structural homophily and neighbourhood homophily, which are both critical for representing the botnet behaviour patterns. In terms of the GIN encoder, we used two MLP layers with the ReLU activation function and batch normalization to extract node embeddings, as shown in Fig. 2.
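One way to see why a sum-like aggregator suits inputs that are all-ones constant vectors is the following plain-Python sketch. It computes only the argument of the MLP in Equation 6 (the MLP itself is omitted) and compares it with a mean aggregator on a toy star graph. The graph, the scalar features, and the function names are assumptions for illustration.

```python
def gin_aggregate(h, neighbours, eps=0.0):
    """Argument of the MLP in Eq. (6): (1 + eps) * h_v + sum of neighbour features."""
    return {v: (1.0 + eps) * h[v] + sum(h[u] for u in neighbours[v]) for v in h}


def mean_aggregate(h, neighbours):
    """Mean over the node and its neighbours, as in the aggregator of Eq. (1)."""
    return {
        v: (h[v] + sum(h[u] for u in neighbours[v])) / (1 + len(neighbours[v]))
        for v in h
    }


# Toy star graph (assumption): hub node 0 is connected to leaves 1, 2 and 3.
neighbours = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
ones = {v: 1.0 for v in neighbours}  # "all ones" node features, scalar for brevity

gin_out = gin_aggregate(ones, neighbours)    # hub and leaves receive different values
mean_out = mean_aggregate(ones, neighbours)  # every node receives the same value
print(gin_out)
print(mean_out)
```

With constant inputs, the mean aggregator returns the same value for every node, whereas the sum aggregator already separates the hub from the leaves after one layer, because the sum reflects the number of neighbours.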

5 Experiments

5.1 Datasets

Table 2: Botnet dataset statistics for P2P

Data Split	Graphs	Avg Nodes	Avg Edges	Avg Botnet Nodes
Train	768	143895	1623217	3090
Val	96	143763	1622620	3093
Test	96	144051	1624948	3095

Table 3: Botnet dataset statistics for C2

Data Split	Graphs	Avg Nodes	Avg Edges	Avg Botnet Nodes
Train	768	143895	813237	3211
Val	96	143763	812955	3234
Test	96	144051	814003	3175

In general, a set of network flows from the datasets can be treated as graph data: each network host IP address can be represented as a graph node, and each network communication flow between two hosts can be represented as an edge. Formally, we define the communication graph as $G=(V,E)$ , where $V$ is the set of nodes and $E$ is the set of edges. A communication graph with $n$ nodes can be represented as an adjacency matrix $A\in\mathbb{R}^{n\times n}$ with $a_{ij}=1$ if there is a communication flow between host node $i$ and host node $j$ , and $a_{ij}=0$ otherwise.

We used two publicly available botnet graph datasets [7] with P2P and C2 botnet scenarios. The botnet graph datasets were generated from the CTU-13 original NetFlow dataset [23], and the botnet traffic was generated by real-world malware samples. The botnet nodes and botnet topological patterns were mixed with background traffic collected from CAIDA in 2018 to generate botnet communication graphs.

Both the P2P and C2 botnet datasets consist of 768 training graphs, 96 validation graphs, and 96 testing graphs. Each of these graphs contains around 3,000 botnet nodes. Each node is equipped with an "all ones" constant vector as its node feature. Due to privacy concerns, the IP addresses of the network nodes were replaced with numerical node labels in each graph by the dataset authors. The distribution of the dataset and the statistics of the nodes and edges for each botnet graph are shown in Table 2 and Table 3.
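The construction described in this section can be sketched as follows. This is a minimal plain-Python illustration, not the preprocessing used by the dataset authors: the function name `build_communication_graph`, the example IP addresses, and the choice to keep each flow as a directed (source, destination) pair are assumptions. A sparse edge list is used instead of a dense adjacency matrix because each graph has roughly 144,000 nodes.

```python
def build_communication_graph(flows):
    """Turn (src_ip, dst_ip) flow records into a relabelled sparse graph.

    Returns the IP-to-node-id mapping, a sorted list of unique edges
    (the non-zero entries a_ij of the adjacency matrix), and one
    "all ones" feature vector per node.
    """
    node_ids = {}
    edges = set()
    for src, dst in flows:
        for ip in (src, dst):
            if ip not in node_ids:
                node_ids[ip] = len(node_ids)  # numeric relabelling hides the IP
        edges.add((node_ids[src], node_ids[dst]))
    features = [[1.0] for _ in node_ids]
    return node_ids, sorted(edges), features


# Hypothetical flow records; repeated flows collapse into a single edge.
flows = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.2", "10.0.0.3"),
    ("10.0.0.1", "10.0.0.2"),
]
node_ids, edges, features = build_communication_graph(flows)
print(len(node_ids), "nodes,", len(edges), "edges")
```

If an undirected graph is wanted, each pair can be stored in both directions; which convention a given dataset uses should be checked against its documentation.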

5.2 Training

Our experiments were conducted on a virtual Linux server with a $2.3\mathrm{GHz}$ 2-core Intel(R) Xeon(R) processor, 51 GB of memory, and a Tesla P100 GPU. The proposed model was developed in Python using several machine learning packages, such as Scikit-learn, PyTorch Geometric, and PyTorch.

For hyperparameter tuning, a grid search was performed over the values given in Table 5. Overall, we found that the optimal parameters were 15 XG-BoT layers for the C2 dataset and 6 XG-BoT layers for the P2P dataset, with $C=2$ , 64 hidden channels (32 hidden channels for each GIN, as $C=2$ ) and $\epsilon=0$ . The results for different numbers of layers are shown in Fig. 3. We used the Adam optimizer with a learning rate of 0.001 to train the proposed model.

Table 4: XG-BoT training time and MTTD

Datasets	Training time (hrs)	Mean time to detect (MTTD) ( $\mu$ s)
C2	4.63	1.45
P2P	3.82	1.59

Table 5: Hyperparameter values used in XG-BoT

Hyperparameter	Values
No. Layers	$[3,6,9,12,15,18,21,24]$
No. Hidden Channels	64 (32 hidden channels for each GIN)
No. Groups	2
Learning Rate	$1\times 10^{-3}$
Activation Func.	ReLU
Optimiser	Adam

To demonstrate the computational efficiency of the XG-BoT model, we measured the training time for the optimal model and the Mean time to detect (MTTD). The performance results for the two benchmark datasets are shown in Table 4.

6 Experimental Results

For the performance evaluation of the proposed XG-BoT, the evaluation metrics listed in Table 6 were used, where $TP$ , $TN$ , $FP$ and $FN$ represent the number of True Positives, True Negatives, False Positives, and False Negatives, respectively.

Table 7 shows the performance evaluation results of the proposed XG-BoT model for automatic botnet detection, indicating the Precision, F1-Score, Recall, and FAR for the C2 and P2P datasets. Overall, the XG-BoT model achieved state-of-the-art performance scores. As all the datasets are highly imbalanced, we did not use accuracy as a performance metric. The performance metrics show that XG-BoT achieves extremely low false alarm rates and very high detection rates, in both the C2 and P2P experiments.

Table 6: Evaluation metrics utilised in this study.

Metric	Definition
Recall (Detection Rate)	$\frac{TP}{TP+FN}$
Precision	$\frac{TP}{TP+FP}$
F1-Score	$2\times\frac{Recall\times Precision}{Recall+Precision}$
FAR (False Alarm Rate)	$\frac{FP}{FP+TN}$
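The definitions in Table 6 translate directly into code. The sketch below is an illustration rather than the evaluation script used in this work; the function name `detection_metrics` and the zero-division guards are assumptions, and accuracy is included only to show why it is uninformative here. The trivial classifier uses the average node and botnet-node counts of the P2P test split from Table 2.

```python
def detection_metrics(tp, tn, fp, fn):
    """Recall, Precision, F1-Score and FAR as defined in Table 6, plus accuracy."""
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * recall * precision / (recall + precision)) if recall + precision else 0.0
    far = fp / (fp + tn) if fp + tn else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "far": far,
        "accuracy": accuracy,
    }


# A classifier that labels every node as normal, evaluated on a graph with the
# average P2P test-split size from Table 2 (144051 nodes, 3095 botnet nodes).
nodes, bots = 144051, 3095
trivial = detection_metrics(tp=0, tn=nodes - bots, fp=0, fn=bots)
print(trivial)
```

Such a classifier detects no botnet nodes at all, yet its accuracy is still above 97% because normal nodes dominate each graph, which is why Recall, Precision, F1-Score, and FAR are more informative on these datasets.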

We then used the Recall (detection rate) and F1-Score to compare our proposed model with the state-of-the-art results, i.e., the best classification results reported in the literature. Table 7 shows the corresponding results for the XG-BoT classifier compared to the state-of-the-art results. As we can see in the table, for the C2 dataset experiment, XG-BoT achieved F1 and Recall scores of 99.52% and 99.42%, respectively. In the P2P dataset experiment, XG-BoT achieved F1 and Recall scores of 99.47% and 99.72%, respectively. In regard to the F1-Score and Recall, XG-BoT outperformed all state-of-the-art approaches in both P2P and C2 experiments.

Table 7: Performance of binary classification by XG-BoT compared with the state-of-the-art algorithms

Method	Dataset	Precision	Recall	F1	FAR
Proposed XG-BoT	C2	99.63%	99.42%	99.52%	0.01%
GCN [7]	C2	$-$	99.03%	$-$	0.01%
GCN [8]	C2	$-$	96.40%	98.00%	$-$
XGBoost [8]	C2	$-$	96.00%	11.80%	$-$
Proposed XG-BoT	P2P	99.23%	99.72%	99.47%	0.02%
GCN [7]	P2P	$-$	99.51%	$-$	0.01%
GCN [8]	P2P	$-$	98.40%	98.91%	$-$
XGBoost [8]	P2P	$-$	98.50%	10.20%	$-$
ABD-GN [31]	P2P	$-$	$-$	99.29%	0.01%
isirgn1 [31]	P2P	$-$	$-$	97.85%	0.02%

7 Explainability for Automatic Network Forensics

While there is a huge interest in the explainability of deep learning model predictions [14], the adoption of GNN explainability involves some challenges [32]. In this section, we discuss the XG-BoT explainability methods, with a specific focus on automatic network forensics via subgraph visualization.

GNNExplainer [13]: This is an explainable graph algorithm that provides interpretable explanations for GNN predictions. Given an individual input graph, GNNExplainer emphasizes a subgraph structure and node-level features that are relevant to the prediction. In a botnet node classification case, the algorithm returns the most important hosts and network flow paths. GNNExplainer maximizes the mutual information between the prediction of a graph neural network and the distribution of feasible subgraphs. The goal of GNNExplainer is to identify a subgraph $G_{\mathrm{S}}\subseteq G$ with associated features $X_{\mathrm{S}}=\left\{x_{j}\mid v_{j}\in G_{\mathrm{S}}\right\}$ that are relevant for explaining a target prediction via a mutual information measure MI, where $H$ is an entropy term.

\max_{G_{\mathrm{S}}}\operatorname{MI}\left(Y,\left(G_{\mathrm{S}},X_{\mathrm{S}}\right)\right)=H(Y)-H\left(Y\mid G=G_{\mathrm{S}},X=X_{\mathrm{S}}\right)

(7)

The detection results of XG-BoT can be explained by the contribution of "learnt" node features and the interconnections of nodes towards the suspected bot. The explainable algorithm learns and returns a node feature mask and an edge mask to explain the importance of the corresponding nodes and edges that contributed to the final detection result. The masks can be learned using gradient descent by maximizing the mutual information between the subgraph and the final detection. Fig. 4 shows the explainable results of the P2P botnet graph nodes with the normal/bot node samples. The edges in green (darker indicates more significant) are the most relevant paths, which contribute to the detection of the targeted centre bot (shown in red) on the right of Fig. 4 and the targeted centre normal node (shown in blue) on the left of Fig. 4. Overall, GNNExplainer correctly identifies the corresponding neighbour nodes in the TP scenarios, which contributes to the detection results as it highlights highly correlated hosts and network flows.

Gradient-Based Saliency Maps [14]: These are the derivatives of the class probability $P_{i}$ with respect to the input image $X$ [33], given by $\frac{dP_{i}}{dX}$ . Computed in essentially the same way as backpropagation, they generate a heat map indicating each pixel's importance. This image-domain, CNN-based explainability method can be adapted to identify which edges are important for GNN decisions [14]. Specifically, in the case of XG-BoT classification, an importance score is obtained for every neighbour node in a node's ego network with respect to a certain class prediction $P_{i}$ . We start with all connected edge weights set to one and calculate the gradient of the output with respect to the edge weights $w_{e_{i}}$ . We use the absolute value of the gradient as the attribution value for each edge:

\mathrm{Attribution}_{e_{i}}=\left|\frac{\partial F(x)}{\partial w_{e_{i}}}\right|,

(8)

where $x$ is the input node and $F(x)$ is the output of XG-BoT for $x$ . Figure 5 shows the influence of a node's ego network on the XG-BoT decision for that node. In the FP and FN cases (bottom row), the highlighted context makes the reasoning behind the XG-BoT classification decision clear.
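The attribution in Equation 8 can be illustrated with the toy sketch below. It is not the XG-BoT model: `model_output` is a hypothetical stand-in for $F(x)$ (a sigmoid over edge-weighted neighbour features), the neighbour features are made-up values, and the gradient is approximated with central finite differences so that the example needs only the standard library.

```python
import math


def model_output(edge_weights, neighbour_feats, theta=1.0):
    """Toy stand-in for F(x): sigmoid of the edge-weighted sum of neighbour features."""
    z = theta * sum(w * h for w, h in zip(edge_weights, neighbour_feats))
    return 1.0 / (1.0 + math.exp(-z))


def edge_attributions(neighbour_feats, step=1e-5):
    """Eq. (8): |dF/dw_e| for each edge, evaluated with all edge weights set to one."""
    base = [1.0] * len(neighbour_feats)
    scores = []
    for i in range(len(base)):
        up, down = base[:], base[:]
        up[i] += step
        down[i] -= step
        grad = (
            model_output(up, neighbour_feats) - model_output(down, neighbour_feats)
        ) / (2 * step)
        scores.append(abs(grad))
    return scores


# Hypothetical scalar features of three neighbours in a node's ego network.
feats = [0.1, -0.8, 0.3]
scores = edge_attributions(feats)
most_influential = scores.index(max(scores))
print("attribution per edge:", scores)
print("most influential edge:", most_influential)
```

In a framework with automatic differentiation, the same quantity would normally be obtained by making the edge weights differentiable inputs and reading their gradients after a backward pass, rather than by finite differences.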

In summary, GNN-based explainability is challenging [32], with few works available to explain GNN models [14]. Unlike purely visual inspections, which can be misleading [34], identifying influential neighbours via saliency or gradients is simple and efficient and is recommended for XG-BoT-based detection. For datasets where features are not present, perturbation-based methods, which monitor changes in the predictions while perturbing different input features, are not effective. However, XG-BoT can be used with Integrated Gradients [32], which assigns an importance score based on an approximation of the integral of the gradient, or with clustering-based approaches [35] to provide explainable results. The network administrator can use those results for automatic forensics.

8 Conclusion and Future Work

This paper proposes a novel, explainable GNN-based botnet detection system. We first present XG-BoT, which uses the grouped reversible residual connection and a graph isomorphism network to perform botnet detection. Then GNNExplainer and saliency maps were applied to highlight suspicious network flows and botnet nodes. The experimental results on two benchmark datasets indicate that our approach can outperform the state-of-the-art approaches in terms of F1-Score and detection rate. Furthermore, identifying the suspicious network flows and botnet nodes can facilitate understanding botnet patterns for automatic network forensics. In the future, we plan to apply edge-based graph encoders, such as E-GraphSAGE [36], with XG-BoT to consider network communication flow features, such as the number of packets, for deeper edge-based botnet detection.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

Gu et al. [2008] G. Gu, R. Perdisci, J. Zhang, W. Lee, Botminer: Clustering analysis of network traffic for protocol-and structure-independent botnet detection (2008).
Bilge et al. [2012] L. Bilge, D. Balzarotti, W. Robertson, E. Kirda, C. Kruegel, Disclosure: detecting botnet command and control servers through large-scale netflow analysis, in: Proceedings of the 28th Annual Computer Security Applications Conference, 2012, pp. 129–138.
Cheng et al. [2011] T.-H. Cheng, Y.-D. Lin, Y.-C. Lai, P.-C. Lin, Evasion techniques: Sneaking through your intrusion detection/prevention systems, IEEE Communications Surveys & Tutorials 14 (2011) 1011–1020.
Chowdhury et al. [2017] S. Chowdhury, M. Khanzadeh, R. Akula, F. Zhang, S. Zhang, H. Medal, M. Marufuzzaman, L. Bian, Botnet detection using graph-based feature clustering, Journal of Big Data 4 (2017) 1–23.
Abou Daya et al. [2019] A. Abou Daya, M. A. Salahuddin, N. Limam, R. Boutaba, A graph-based machine learning approach for bot detection, in: 2019 IFIP/IEEE Symposium on Integrated Network and Service Management (IM), IEEE, 2019, pp. 144–152.
Wu et al. [2020] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, P. S. Yu, A comprehensive survey on graph neural networks, IEEE Transactions on Neural Networks and Learning Systems 32 (2020) 4–24.
Zhou et al. [2020] J. Zhou, Z. Xu, A. M. Rush, M. Yu, Automating botnet detection with graph neural networks, AutoML for Networking and Systems Workshop of MLSys 2020 Conference (2020).
Zhang et al. [2021] B. Zhang, J. Li, C. Chen, K. Lee, I. Lee, A practical botnet traffic detection system using gnn, in: International Symposium on Cyberspace Safety and Security, Springer, 2021, pp. 66–78.
Li et al. [2019] G. Li, M. Müller, A. Thabet, B. Ghanem, Deepgcns: Can gcns go as deep as cnns?, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 9267–9276.
Vormayr et al. [2017] G. Vormayr, T. Zseby, J. Fabini, Botnet communication patterns, IEEE Communications Surveys & Tutorials 19 (2017) 2768–2796.
Li et al. [2021] G. Li, M. Müller, B. Ghanem, V. Koltun, Training graph neural networks with 1000 layers, in: International conference on machine learning, PMLR, 2021, pp. 6437–6449.
Xu et al. [2019] K. Xu, W. Hu, J. Leskovec, S. Jegelka, How powerful are graph neural networks?, in: International Conference on Learning Representations, 2019. URL: https://openreview.net/forum?id=ryGs6iA5Km.
Ying et al. [2019] R. Ying, D. Bourgeois, J. You, M. Zitnik, J. Leskovec, Gnnexplainer: Generating explanations for graph neural networks, Advances in neural information processing systems 32 (2019) 9240.
Kasanishi et al. [2021] T. Kasanishi, X. Wang, T. Yamasaki, Edge-level explanations for graph neural networks by extending explainability methods for convolutional neural networks, in: 2021 IEEE International Symposium on Multimedia (ISM), IEEE, 2021, pp. 249–252.
McDermott et al. [2018] C. D. McDermott, F. Majdani, A. V. Petrovski, Botnet detection in the internet of things using deep learning approaches, in: 2018 international joint conference on neural networks (IJCNN), IEEE, 2018, pp. 1–8.
Antonakakis et al. [2017] M. Antonakakis, T. April, M. Bailey, M. Bernhard, E. Bursztein, J. Cochran, Z. Durumeric, J. A. Halderman, L. Invernizzi, M. Kallitsis, et al., Understanding the mirai botnet, in: 26th USENIX security symposium (USENIX Security 17), 2017, pp. 1093–1110.
Ahmed et al. [2020] A. A. Ahmed, W. A. Jabbar, A. S. Sadiq, H. Patel, Deep learning-based classification model for botnet attack detection, Journal of Ambient Intelligence and Humanized Computing (2020) 1–10.
Shi and Sun [2020] W.-C. Shi, H.-M. Sun, Deepbot: a time-based botnet detection with deep learning, Soft Computing 24 (2020) 16605–16616.
Kozik et al. [2018] R. Kozik, M. Choraś, M. Ficco, F. Palmieri, A scalable distributed machine learning approach for attack detection in edge computing environments, Journal of Parallel and Distributed Computing 119 (2018) 18–26.
Pektaş and Acarman [2019] A. Pektaş, T. Acarman, Deep learning to detect botnet via network flow summaries, Neural Computing and Applications 31 (2019) 8021–8033.
Moodi and Ghazvini [2019] M. Moodi, M. Ghazvini, A new method for assigning appropriate labels to create a 28 standard android botnet dataset (28-sabd), Journal of Ambient Intelligence and Humanized Computing 10 (2019) 4579–4593.
Al Shorman et al. [2020] A. Al Shorman, H. Faris, I. Aljarah, Unsupervised intelligent system based on one class support vector machine and grey wolf optimization for iot botnet detection, Journal of Ambient Intelligence and Humanized Computing 11 (2020) 2809–2825.
Garcia et al. [2014] S. Garcia, M. Grill, J. Stiborek, A. Zunino, An empirical comparison of botnet detection methods, Computers & Security 45 (2014) 100–123.
Liu et al. [2020] M. Liu, H. Gao, S. Ji, Towards deeper graph neural networks, in: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining, 2020, pp. 338–348.
Li et al. [2018] Q. Li, Z. Han, X.-M. Wu, Deeper insights into graph convolutional networks for semi-supervised learning, in: Thirty-Second AAAI conference on artificial intelligence, 2018.
ISOT [2010] ISOT botnet dataset, 2010. URL: https://onlineacademiccommunity.uvic.ca/isot/2022/11/27/botnet-and-ransomware-detection-datasets/.
Meidan et al. [2018] Y. Meidan, M. Bohadana, Y. Mathov, Y. Mirsky, A. Shabtai, D. Breitenbacher, Y. Elovici, N-baiot—network-based detection of iot botnet attacks using deep autoencoders, IEEE Pervasive Computing 17 (2018) 12–22.
Gomez et al. [2017] A. N. Gomez, M. Ren, R. Urtasun, R. B. Grosse, The reversible residual network: Backpropagation without storing activations, Advances in neural information processing systems 30 (2017).
Kipf and Welling [2016] T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, arXiv preprint arXiv:1609.02907 (2016).
Shervashidze et al. [2011] N. Shervashidze, P. Schweitzer, E. J. Van Leeuwen, K. Mehlhorn, K. M. Borgwardt, Weisfeiler-Lehman graph kernels, Journal of Machine Learning Research 12 (2011).
Carpenter et al. [2021] J. Carpenter, J. Layne, E. Serra, A. Cuzzocrea, Detecting botnet nodes via structural node representation learning, in: 2021 IEEE International Conference on Big Data (Big Data), IEEE, 2021, pp. 5357–5364.
Sundararajan et al. [2017] M. Sundararajan, A. Taly, Q. Yan, Axiomatic attribution for deep networks, in: International conference on machine learning, PMLR, 2017, pp. 3319–3328.
Simonyan et al. [2014] K. Simonyan, A. Vedaldi, A. Zisserman, Deep inside convolutional networks: Visualising image classification models and saliency maps, in: Proceedings of the International Conference on Learning Representations, 2014.
Adebayo et al. [2018] J. Adebayo, J. Gilmer, M. Muelly, I. Goodfellow, M. Hardt, B. Kim, Sanity checks for saliency maps, Advances in neural information processing systems 31 (2018).
Kulatilleke et al. [2022] G. K. Kulatilleke, M. Portmann, S. S. Chandra, Scgc: Self-supervised contrastive graph clustering, arXiv preprint arXiv:2204.12656 (2022).
Lo et al. [2022] W. W. Lo, S. Layeghy, M. Sarhan, M. Gallagher, M. Portmann, E-graphsage: A graph neural network based intrusion detection system for iot, in: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, 2022, pp. 1–9.
