Supplementary Material for Directed Percolation in Temporal Networks

Arash Badie-Modiri Department of Computer Science, School of Science, Aalto University, FI-0007, Finland Abbas K. Rizi Department of Computer Science, School of Science, Aalto University, FI-0007, Finland Márton Karsai Department of Network and Data Science Central European University, 1100 Vienna, Austria Alfréd Rényi Institute of Mathematics, 1053 Budapest, Hungary Mikko Kivelä Department of Computer Science, School of Science, Aalto University, FI-0007, Finland

I Implementation

The implementation, along with two of the real-world temporal networks used, namely US air transport and Helsinki public transport, are made available online [1]. Please refer to Supplementary Material for [2] for a more detailed usage information.

II Generating synthetic temporal networks and the event graph

The synthetic temporal networks are created with some continuous-time stochastic process based on an underlying static network with a degree distribution of $p_{k}$ and excess degree distribution of $q_{k}$ . The event graph, directed acyclic graph of adjacency relationships between pairs events, can then be produced by iterating through all events $e$ and connecting it to all other events when $e$ happens less that $\delta t$ time before that event and they share at least one node.

Reachability on the event graph will be preserved by removing some of the links so that the in/out-degree varies between 0 to 2 for every node [3] as long as the probability of adjacent events happening at exactly the same time is negligible. Practically, for every event $e$ in the event graph, we can remove directed links to all but the very first events for each of the two nodes involved in $e$ . This preserves connectivity in the event graph since all the events on the other end of the removed adjacency relationships would still be connected through one of the remaining links out-bound from $e$ as they share at least one node and the time difference is less than or equal to the original event. Note that if more than one adjacent events are happening at the same time and no other adjacent events happen before them, we would have to keep all of them to preserve connectivity.

Note that in practice it is often not necessary to explicitly generate the event graph to measure the quantities. It is possible to store the list of associated events for each node in the network sorted by time and generate adjacency relationships on the fly. This can also be combined with other techniques such as using probabilistic data structures for estimating out-component sizes to allow processing of temporal networks much larger than what is possible with the explicit solution [4].

II.1 Degree distribution of the reduced event graph

Let’s assume a vertex on the event graph, an event $e$ , that involves two nodes called $l$ and $r$ which just activated at time $t_{0}$ (Fig 1). The two nodes $l$ and $r$ have respectively $q_{l}$ and $q_{r}$ neighbor nodes, other than each other, over the static network.

Refer to caption — Figure 1: Considering the case of an event between nodes $l$ and $r$ happening at time $t_{0}$ , where each node has $q_{l}$ and $q_{r}$ neighbours other than each other respectively. Assuming link ${l\mathrel{\mkern-3.0mu{-}\mkern-3.0mu}r}$ was selected uniformly at random from the set of all the links in the base network, the values $q_{l}$ and $q_{r}$ are both realisations of the excess degree distribution of the base network $P_{q}$ . Out-degree of the event $e_{0}=(l,r,t_{0})$ is between zero and two depending on the order and timing of events between $l$ , $r$ and their neighbours. If the ${l\mathrel{\mkern-3.0mu{-}\mkern-3.0mu}r}$ link activates before any of the other links incident to $l$ and $r$ (panel a) or only links incident to $l$ (or only $r$ ) other than ${l\mathrel{\mkern-3.0mu{-}\mkern-3.0mu}r}$ fire before ${l\mathrel{\mkern-3.0mu{-}\mkern-3.0mu}r}$ (panel b) at a time $t_{1}>t_{0}$ , event $e_{0}$ would have an out degree of zero if $t_{1}-t_{0}\geq\delta t$ or one if $t_{1}-t_{0}<\delta t$ . All other edges coming out of $e_{0}$ would necessarily get pruned out as shown by the crossed-out links. The only case for $e_{0}$ having a degree two happens when at least one event at $t_{1}<\delta t$ only involving $l$ and not $r$ and one at $t_{2}<\delta t$ only involving $r$ and not $l$ both happen before ${l\mathrel{\mkern-3.0mu{-}\mkern-3.0mu}r}$ fires again.

Let’s also define $Pr(t_{res}<\delta t)$ as the probability that a process with inter-event time distribution $\mathcal{T}$ can activate at least once in time $\delta t$ after a random point in time. This can correspond to probability of one of the links in the underlying network activating within a time period of $\delta t$ . Random variable $t_{res}$ is distributed according to the residual inter-event time distribution $\mathcal{R}$ . Similarly, $Pr(t_{iet}<\delta t)$ is the probability that a process with inter-event time distribution $\mathcal{T}$ can activate at least once in time $\delta t$ right after activation.

Probability of an event having out-degree of zero in the event graph can be calculated as:

P_{out}(0|q_{l},q_{r})=Pr(t_{res}>\delta t)^{q_{l}+q_{r}}Pr(t_{iet}>\delta t)

(1)

where $q_{l}$ and $q_{r}$ are the number of neighbours each of the nodes participating in the event has except for the connection between two nodes of the event in question, $t_{iet}$ is a realisation of the inter-event time distribution of the network $\mathcal{T}$ and $t_{res}$ is a realisation of the residual inter-event time distribution $\mathcal{R}$ . Out-degree of an event is zero if and only if none of the $q_{l}+q_{r}$ adjacent links on the underlying network have an event within $\delta t$ and the two nodes participating in the original event also don’t have any events between them within $\delta t$ . The second term corresponds to the probability of the same link not activating and the first is the probability of all of the other incident links except for the original link not activating in $\delta t$ .

The only case that an event on the event graph can have an out-degree equal to 2 (as shown on Fig. $\ref{fig:schematic_with_mellor}c$ ) is that at least one of the $q_{l}$ neighbours of $l$ and one of the $q_{r}$ neighbours of $r$ activate before $\delta t$ and before reactivation of the link between $l$ and $r$ . Activation of the link between $l$ and $r$ before at least one of the links on each side is activated (Fig. $\ref{fig:schematic_with_mellor}a$ and $\ref{fig:schematic_with_mellor}b$ ) would result in out-degree equal to zero or one depending on the value of $\delta t$ and timing of the events.

Probability of having an out-degree equal to 2 can be calculated this way:

$\displaystyle P_{out}(2\|q_{l},q_{r})=\int^{\infty}_{0}$	$\displaystyle(1-Pr(t_{res}>\delta t\lor t_{res}>t)^{q_{l}})$	(2)
	$\displaystyle(1-Pr(t_{res}>\delta t\lor t_{res}>t)^{q_{r}})$
	$\displaystyle Pr(t\sim\mathcal{T})\differential{t}$

where $t_{res}$ , $\mathcal{T}$ , $q_{l}$ and $q_{r}$ are defined as above. An event has an out-degree equal to 2 if and only if two mutually non-adjacent links adjacent to the link corresponding to the original event activated within $\delta t$ and before the link corresponding to the original event is activated.

P_{out}(1|q_{l},q_{r})=1-(P_{out}(0|q_{l},q_{r})+P_{out}(2|q_{l},q_{r}))

(3)

Based on these equations, it is trivial to construct joint in- and out-degree distribution

	$\displaystyle P(in,out)=\sum^{\infty}_{q_{l},q_{r}=1}P_{in}(in\|q_{l},q_{r})P_{out}(out\|q_{l},q_{r})$		(4)
	$\displaystyle P_{q}(q_{l})P_{q}(q_{r})$		(4)

where $P_{q}(i)$ is the probability mass function of excess degree for the static aggregate base network.

It is possible to construct the joint degree distribution generating function $\mathcal{G}$ using the joint degree distribution itself

\mathcal{G}(x,y)=\sum^{2}_{in,out=0}P(in,out)x^{in}y^{out}

(5)

and in- and out-degree and excess degree distribution generating functions

$\displaystyle\mathcal{G}^{in}_{0}(x)$	$\displaystyle=\mathcal{G}(x,1)$	(6)
$\displaystyle\mathcal{G}^{out}_{0}(y)$	$\displaystyle=\mathcal{G}(1,y)$
$\displaystyle\mathcal{G}^{in}_{1}(x)$	$\displaystyle=\frac{1}{z}\frac{\partial\mathcal{G}(x,y)}{\partial y}\biggl{\|}_{y=1}$
$\displaystyle\mathcal{G}^{out}_{1}(y)$	$\displaystyle=\frac{1}{z}\frac{\partial\mathcal{G}(x,y)}{\partial x}\biggl{\|}_{x=1}$

where $z$ is the mean in- and out-degree derived from

z=\frac{\partial\mathcal{G}(x,y)}{\partial x}\biggl{|}_{x=y=1}=\frac{\partial\mathcal{G}(x,y)}{\partial y}\biggl{|}_{x=y=1}.

(7)

Note that the in- and out-excess degree distribution generating functions we just derived ( $\mathcal{G}^{in}_{1}(x)$ and $\mathcal{G}^{out}_{1}(x)$ ) refer to excess in- and out-degree distribution of a random event in the event graph.

III Details of analytical derivation of critical exponents

To study properties of the event graph, we approximate it by a random directed graph with the same in- and out-degree distribution. The following sections are all based on this assumption. The validity of this assumption and the following results can be verified explicitly by empirically constructing temporal networks of different topologies and temporal dynamics and measuring scaling of quantities such as $\rho(t)$ , $P(t)$ , $M$ , $V$ or $\rho_{\text{stat}}(\tau)$ [2].

III.1 Control Parameter $\tau$

The mean-field rate equation for occupation density in homogeneous occupation initial condition can be constructed as

\partial_{t}\rho(t)=[Q_{out}(2)-Q_{out}(0)]\rho(t)-[Q_{out}(1)+2Q_{out}(2)]\rho^{2}(t)

(8)

where $Q_{out}(i)=\frac{\partial^{i}}{i!\partial y^{i}}G^{out}_{1}(y)$ is the excess out-degree distribution of events in the event graph. Using excess degree distribution captures the fact that in the random temporal model we are using, in- and out-degrees of events in the event graph are correlated and both are a function of degree of the event’s constituting nodes in the static base network. By defining $\tau=Q_{out}(2)-Q_{out}(0)$ and $g=Q_{out}(1)+2Q_{out}(2)$ , Eq. 8 turns into $\partial_{t}\rho(t)=\tau\rho(t)-g\rho^{2}(t)$ with stationary solutions at $\rho=0$ , which represents the absorbing phase, and $\tau=0$ , which corresponds to the mean-field critical point.

An event graph can be presented, without any change in reachability of any event, so that no event has an in- or out-degree larger than two (as discussed in the beginning of this section) $\tau$ and $g$ can be written as $\tau=\left\langle Q_{out}\right\rangle-1$ and $g=\left\langle Q_{out}\right\rangle$ .

The phase transition at $\tau=0$ also complies with the previously know result of phase transition in random directed graphs with arbitrary degree distribution at $\frac{\partial}{\partial y}G^{out}_{1}(y)|_{y=1}=1$ [5].

III.2 Density scaling exponents $\alpha=\beta=1$

For large $t$ , Eq. 8 has one solution for active and absorbing phases and the critical threshold $\tau=0$

\rho(t)=\left\{\begin{aligned} &-\tau\left(g-\frac{\tau}{\rho_{0}}\right)^{-1}\mathrm{e}^{\tau t},&&\text{if}\ \tau<0\\ &\left(\rho_{0}^{-1}+gt\right)^{-1},&&\text{if}\ \tau=0\\ &\frac{\tau}{g}+\frac{\tau}{g^{2}}\left(g-\frac{\tau}{\rho_{0}}\right)\mathrm{e}^{-\tau t},&&\text{if}\ \tau>0\end{aligned}\right.

(9)

where as $t$ grows, $\rho$ approaches zero for $\tau\leq 0$ and $\rho\rightarrow\tau/g$ for the $\tau>0$ , i.e. asymptotically

\rho(t)\propto t^{-1},\text{if}\ \tau=0

(10)

and

\rho_{stat}(\tau)\propto\tau^{1},\text{if}\ \tau>0.

(11)

which leads to

\alpha=\beta=1.

(12)

III.3 Rapidity-reversal symmetry $\beta^{\prime}=\beta$

The fact that survival of a component is measured using out-component of events in the event graph while occupation density is calculated by measuring in-component of all possibly infected nodes, hints at a symmetry in the system under time reversal. Consider $\delta t$ -constrained event graph representation of temporal network $T(\mathcal{V},\mathcal{E})$ and two sets events in bands of time $\delta t$ units of time wide, namely $E_{0}=\{e\in\mathcal{E}\mid 0\leq t_{e}<\delta t\}$ and $E_{t}=\{e\in\mathcal{E}\mid t\leq t_{e}<t+\delta t\}$ where $t_{e}$ is time of activation of event $e$ . Assuming $S_{t}\subseteq E_{t}$ where each member of $S_{t}$ appears in the out-component of at least one of the members of $E_{0}$ and $S_{0}\subseteq E_{0}$ where each member of $S_{0}$ appears in the in-component of at least one of the members of $E_{t}$ (which is to say, one of the members of $S_{t}$ ). Probability of survival at time $t$ can be estimates as the fraction of nodes in $E_{0}$ that can reach at least a node in $E_{t}$ , $P(t)\approx|S_{0}|/|E_{0}|$ . Similarly, since in the homogeneous fully occupied case all the events in $E_{0}$ are occupied, the occupation density at time $t$ can be estimated as $\rho(t)\approx|S_{t}|/|E_{t}|$ .

Under reversal of time $t_{e}\rightarrow(t+\delta t)-t_{e}$ the direction of the links in the event graph will revert which in turn causes switching of in- and out-component set of each node. In this scenario, occupation density is estimated by $\rho(t)\approx|S_{0}|/|E_{0}|$ which is the same as probability of survival in the original case. Conversely, probability of survival is estimated by $P(t)\approx|S_{t}|/|E_{t}|$ which is the same as occupation density in the original case. If the time-reverted event graph has the same likelihood as the original event graph, e.g. if $\mathcal{G}^{out}_{0}=\mathcal{G}^{in}_{0}$ , this leads to the identity

P(t)=\rho(t),

(13)

which in turn, for models belonging to the DP class, leads to the celebrated rapidity-reversal symmetry:

\beta=\beta^{\prime}.

(14)

III.4 Mean component mass exponent $\gamma=1$

The generating function for distribution of out-component sizes $H_{0}(y)$ is the solution to the system

	$\displaystyle H_{0}(y)$	$\displaystyle=y\mathcal{G}^{out}_{0}(H_{1}(y))$		(15)
	$\displaystyle H_{1}(y)$	$\displaystyle=y\mathcal{G}^{out}_{1}(H_{1}(y))$		(15)

and mean out-component size can be calculated as

M=\frac{\partial H_{0}(y)}{\partial y}\biggl{|}_{y=1}=1+\mathcal{G}^{{}^{\prime}\text{out}}_{0}(1)H^{\prime}_{1}(1)\,.

(16)

For $tau<0$ where $H_{1}(1)=1$ this results in a solution in form of

		$\displaystyle H^{\prime}_{1}(1)=1+\mathcal{G}^{{}^{\prime}\text{out}}_{1}(1)H^{\prime}_{1}(1)=(1-\mathcal{G}^{{}^{\prime}\text{out}}_{1}(1))^{-1}$		(17)
		$\displaystyle\rightarrow M=1+\mathcal{G}^{{}^{\prime}out}_{0}(1)(1-\mathcal{G}^{{}^{\prime}\text{out}}_{1}(1))^{-1}\,.$		(17)

Keeping in mind the definition of control parameter $\tau=\left\langle Q_{out}\right\rangle-1=\mathcal{G}^{{}^{\prime}\text{out}}_{1}(1)-1$ and that $\frac{\partial}{\partial y}\mathcal{G}^{\text{out}}_{0}(y)|_{y=1}=z$ (see Eq. 7), we can re-write $M$ as

M=1+z(-\tau)^{-1}=\frac{z-\tau}{-\tau}\,,

(18)

For the special case of random $k$ -regular networks we can prove that $z-\tau=1$ which give the result $M=(-\tau)^{-1}$ . More generally, to find exponent of a power-law asymptote of the form $(-\tau)^{-\gamma}$ as $\tau\rightarrow 0^{-}$ for any random graph we can find the solution to

	$\displaystyle-\gamma$	$\displaystyle=\lim_{\tau\rightarrow 0^{-}}\frac{\ln M}{\ln-\tau}=\lim_{\tau\rightarrow 0^{-}}\frac{\ln(z-\tau)-\ln-\tau}{\ln-\tau}$		(19)
		$\displaystyle=\lim_{\tau\rightarrow 0^{-}}\frac{\ln(z-\tau)}{-\tau}-1=-1\,\text{if}\ 0<z<\infty\,.$		(19)

Under the condition that $0<z<\infty$ , the limit term is equal to zero, resulting in $\gamma=1$ . Given the fact that, assuming probability of co-occurrence of adjacent events is negligible, the maximum out-degree of the event graph is 2, if the mean out-degree is above zero at $\tau=0$ , a power relation with exponent $\gamma=1$ estimates mean component mass.

IV Alternative analogue for mean cluster mass

In classic DP, mean cluster mass $M$ is defined as the integration of the pair-connectedness function across all nodes and time. Based on how we established parallels between spatial dimensions and the base networks and a the possibility of defining pair-connectedness function as existence of a $\delta t$ -limited time-respecting path between a pair of nodes at different times, this might translates more directly to the sum of time-length of all infections started from a random event, or in other words total duration of sickness for all people. This would also imply that the spreading processes start at random nodes and times, as opposed to starting at random events as we use in the manuscript. Values for mean cluster mass $M$ , as well as mean cluster volume $V$ and mean cluster lifetime $T$ for random 9-regular networks with Poisson lik activations are plotted in Fig. 2 for different values of $\delta_{t}$ .

V Description of the real-world temporal network data sets

Four real-world temporal networks were used for demonstrating the measurement of the quantities and phase transition. These are the same datasets used previously for developing the algorithmic method which form the backbone of the more empirical parts of the current manuscript [4].

The air transportation network dataset is a recording of 180 112 flights between 279 airports in the United States, gathered from the website of the The Department of Transportation’s Bureau of Transportation Statistics website in 2017 [6]. The public transportation network is the set of all 664 138 trips during a typical Monday in Helsinki in 2018, where a trip is one public transportation vehicle moving from one of the 6 858 bus, metro and ferry stations to the next [7]. The twitter dataset is a set of 266 179 671 mentions (counting replies) of 17 313 552 user handles [8]. Finally, the mobile phone call dataset is set of 324 576 400 phone calls between over 5 193 086 mobile phone subscribers [9]. A few thousand events were removed from the beginning of the twitter dataset to eliminate a weeks-wide gap in the gathered data.

The first two networks, air and public transportation networks, were processed as directed, delayed temporal networks where each event has a duration as well as a starting time $e=(v_{1},v_{2},t,d)$ where two events are adjacent if the second event starts after the duration of the event is finished and the tail node of the second event is the same as the head node of the first event, e.g. the first plane lands in the destination airport before the second one takes off from that airport. The waiting time $\delta t$ then refers to the time between end of the first event to the beginning of the second one.

There is an argument for measuring waiting time in delayed temporal networks from the beginning of the first event for some processes such as disease spreading. For example, a disease that gets healed less than an hour after infecting someone has a very low chance of spreading through air travel where most trips take longer than that. That method was not used in this manuscript. The second pair of networks, twitter and mobile networks, were treated like undirected, instantaneous temporal networks.

The real-world networks show high degrees of temporal heterogeneity, daily/weekly patterns, peaks at spacial hours of the day or at special days of the year or local or global outages. Measuring representative values for static density $\rho_{\text{stat}}$ and susceptibility $\chi$ for real-world networks would need special consideration. Our current method for measuring these quantities in the homogeneous, fully-occupied initial condition is dependent on the level of activity of the initial time $t_{0}$ of the dataset as well as existence of unlikely periods of very low or very high activity as a result of natural disasters, real-world happenings or simply failure of the measurement apparatus or the measured system. To average out any such outliers we split the original data into 64 equal time windows of time $T/2$ (for air and public transport) and $T/16$ (for mobile and twitter) where $T$ is the time window of the original dataset, each starting at a random point in time.

Distribution of mass, volume and lifetime for each event in the event graph can be seen Figs. 3, 4, 5 and 6 for Helsinki public transport, Air transport, Twitter and Mobile datasets respectively.

VI Joint degree distribution in real-world temporal network event graphs

Symmetry of the joint in- and out-degree distribution of the event graph ( $\forall_{i,o}\,p^{in,out}_{i,o}=p^{in,out}_{o,i}$ ) was discussed as a condition for $\beta=\beta^{\prime}$ under a mean-field assumption of connectivity. While real-world networks often have correlations and inhomogeneities that affect connectivity, it is still interesting to verify the validity of this condition in real-world networks.

Observation of joint in- and out-degree of 100 000 random events of the event graphs for the Mobile and Twitter networks, the two largest real-world systems studied, show degree distributions (degrees 0 to 2) estimated at:

p^{in,out}=\begin{pmatrix}0.11063&0.16668&0.02007\\ 0.16263&0.37322&0.05916\\ 0.02028&0.05570&0.02359\end{pmatrix}

(20)

and

p^{in,out}=\begin{pmatrix}0.30426&0.10793&0.02340\\ 0.11109&0.20426&0.05220\\ 0.02082&0.04992&0.03393\end{pmatrix}

(21)

respectively at $\delta t=7.5\ \text{hours}$ and $\delta t=25\ \text{minutes}$ .

References

Badie-Modiri et al. [2021a] A. Badie-Modiri, A. K. Rizi, M. Karsai, and M. Kivelä, Source code and for ”Directed Percolation in Temporal Networks” and ”Directed Percolation in Random Temporal Network Models with Heterogeneities” (2021a), available at https://doi.org/10.5281/zenodo.5570746.
Badie-Modiri et al. [2021b] A. Badie-Modiri, A. K. Rizi, M. Karsai, and M. Kiveä, Directed percolation in random temporal network models with heterogeneities, arXiv:2110.07698 (2021b).
Mellor [2017] A. Mellor, The temporal event graph, Journal of Complex Networks 6, 639 (2017).
Badie-Modiri et al. [2020] A. Badie-Modiri, M. Karsai, and M. Kivelä, Efficient limited-time reachability estimation in temporal networks, Phys. Rev. E 101, 052303 (2020).
Newman et al. [2001] M. E. J. Newman, S. H. Strogatz, and D. J. Watts, Random graphs with arbitrary degree distributions and their applications, Physical review E 64, 026118 (2001).
Bureau of Transportation Statistics [2017] Bureau of Transportation Statistics, Bureau of transportation statistics website (2017).
Kujala et al. [2018] R. Kujala, C. Weckström, R. K. Darst, M. N. Mladenović, and J. Saramäki, A collection of public transport network data sets for 25 cities, Scientific data 5, 180089 (2018).
Yang and Leskovec [2011] J. Yang and J. Leskovec, Patterns of temporal variation in online media, in Proceedings of the fourth ACM international conference on Web search and data mining (ACM, 2011) pp. 177–186.
Karsai et al. [2011] M. Karsai, M. Kivelä, R. K. Pan, K. Kaski, J. Kertész, A.-L. Barabási, and J. Saramäki, Small but slow world: How network topology and burstiness slow down spreading, Physical Review E 83, 025102(R) (2011).