
Modularity Enhances the Rate of Evolution in a Rugged Fitness Landscape

Jeong-Man Park1,2, Man Chen1, Dong Wang1, and Michael W. Deem1,3
1Department of Physics & Astronomy
Rice University, Houston, TX 77005–1892, USA
2Department of Physics, The Catholic University of Korea, Bucheon 420-743, Korea
3Center for Theoretical Biological Physics, Rice University, Houston, TX 77005–1892, USA
Abstract

Biological systems are modular, and this modularity affects the evolution of biological systems over time and in different environments. We here develop a theory for the dynamics of evolution in a rugged, modular fitness landscape. We show analytically how horizontal gene transfer couples to the modularity in the system and leads to more rapid rates of evolution at short times. The model, in general, analytically demonstrates a selective pressure for the prevalence of modularity in biology. We use this model to show how the evolution of the influenza virus is affected by the modularity of the proteins that are recognized by the human immune system. Approximately 25% of the observed rate of fitness increase of the virus could be ascribed to a modular viral landscape.

1 Introduction

Biological systems are modular, and the organization of their genetic material reflects this modularity [1, 2, 3, 4]. Complementary to this modularity is a set of evolutionary dynamics that evolves the genetic material of biological systems. In particular, horizontal gene transfer (HGT) is an important mechanism of evolution, in which genes, pieces of genes, or multiple genes are transferred from one individual to another [5, 6, 7]. Additionally, multi-body contributions to the fitness function in biology are increasingly thought to be an important factor in evolution [8], leading to a rugged fitness landscape and glassy evolutionary dynamics. The combination of modularity and horizontal gene transfer provides an effective mechanism for evolution upon a rugged fitness landscape [9]. The organization of biology into modules simultaneously restricts the possibilities for function, because the modular organization is a subset of all possible organizations, and may lead to more rapid evolution, because the evolution occurs in a vastly restricted modular subspace of all possibilities [9, 10]. Our results explicitly demonstrate this trade-off, with t* serving as the crossover time from the latter to the former regime.

Thus, the fitness function in biology is increasingly realized to be rugged, yet modular. Nonetheless, nearly all analytical theoretical treatments assume a smooth fitness landscape with a dependence only on Hamming distance from a most-fit sequence [11], a linear or multiplicative fitness landscape for dynamical analysis of horizontal gene transfer [12, 13], or an uncorrelated random energy model [14, 15]. Horizontal gene transfer processes on more general, but still smooth, landscapes have been analyzed [16, 17, 18, 19, 20]. Here we provide, to our knowledge, the first analytical treatment of a finite-population Markov model of evolution showing how horizontal gene transfer couples to modularity in the fitness landscape. We prove analytically that modularity can enhance the rate of evolution for rugged fitness landscapes in the presence of horizontal gene transfer. This foundational result in the physics of biological evolution offers a clue to why biology is so modular. We demonstrate this theory with an application to evolution of the influenza virus.

We introduce and solve a model of individuals evolving on a modular, rugged fitness landscape. The model is constructed to represent several fundamental aspects of biological evolution: a finite population, mutation and horizontal gene transfer, and a rugged fitness landscape. For this model, we will show that the evolved fitness is greater for a modular landscape than for a non-modular landscape. This result holds for t < t*, where t* is a crossover time larger than typical biological timescales. The dependence of the evolved fitness on modularity is multiplicative with the horizontal gene transfer rate, and the advantage of modularity disappears when horizontal gene transfer is not allowed. Our results describe the response of the system to environmental change. In particular, we show that modularity allows the system to recover more rapidly from change, and fitness values attained during the evolved response to change increase with modularity for large or rapid environmental change.

2 Theory of the Rate of Evolution in a Rugged Fitness Landscape

We use a Markov model to describe the evolutionary process. There are N individuals. Each individual α replicates at a rate f_α. The average fitness in the population is defined as ⟨f⟩ = (1/N) Σ_{α=1}^{N} f_α. Each individual has a sequence S^α that is composed of L loci, s_i^α. For simplicity, we take s_i^α = ±1. Each locus can mutate to the opposite state with rate μ. Each sequence is composed of K modules of length l = L/K. A horizontal gene transfer process randomly replaces the kth module in the sequence of individual α, i.e. the sequence loci s^α_{(k-1)l+1} … s^α_{kl}, with the corresponding loci from a randomly chosen individual β, at rate ν. The a priori rate of sequence change in the population is, therefore, NμL + NνL/2. Since the fitness landscape in biology is rugged, we use a spin glass to represent the fitness:

f[S] = 2L + H[S]
H[S] = Σ_{ij} s_i s_j J_{ij} Δ_{ij}        (1)

J_{ij} is a quenched, Gaussian random matrix with variance 1/C. As discussed in Appendix A, the offset value 2L is chosen by Wigner's semicircle law so that the minimum eigenvalue of f is non-negative. The entries in the matrix Δ are zero or one, with probability C/L per entry, so that the average number of connections per row is C. We introduce modularity by an excess of interactions in Δ along the l × l block diagonals of the L × L connection matrix. There are K of these block diagonals. Thus, the probability of a connection is C₀/L when ⌊i/l⌋ ≠ ⌊j/l⌋ and C₁/L when ⌊i/l⌋ = ⌊j/l⌋. The number of connections is C = C₀ + (C₁ - C₀)/K. Modularity is defined by M = (C₁ - C₀)/(KC) and obeys -1/(K-1) ≤ M ≤ 1.
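As an illustration, the block-diagonal connection matrix Δ can be generated directly from these definitions. The following is a minimal Python sketch; the function name and the example parameter values are ours, not from the paper:

```python
import random

def build_connectivity(L, K, C0, C1, seed=0):
    """Symmetric 0/1 connection matrix Delta with K diagonal l x l blocks.

    An entry is 1 with probability C1/L inside a block
    (floor(i/l) == floor(j/l)) and C0/L outside.
    """
    rng = random.Random(seed)
    l = L // K
    delta = [[0] * L for _ in range(L)]
    for i in range(L):
        for j in range(i + 1, L):
            prob = C1 / L if i // l == j // l else C0 / L
            if rng.random() < prob:
                delta[i][j] = delta[j][i] = 1
    return delta

# Average connectivity and modularity, per the definitions in the text:
L, K = 100, 5
C0, C1 = 10.0, 60.0
C = C0 + (C1 - C0) / K        # C = C0 + (C1 - C0)/K = 20
M = (C1 - C0) / (K * C)       # M = (C1 - C0)/(K C) = 0.5
```

With these example values the excess of in-block connections gives an intermediate modularity M = 0.5; setting C₀ = C₁ recovers M = 0.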

The Markov process describing these evolutionary dynamics includes terms for replication (f), mutation (μ), and horizontal gene transfer (ν):

dP({n_a}; t)/dt = Σ_{a} [ f(S_a)(n_a - 1) Σ_{b≠a} ((n_b + 1)/N) P(n_a - 1, n_b + 1; t) - f(S_a) n_a Σ_{b≠a} (n_b/N) P(n_a, n_b; t) ]
    + μ Σ_{a} Σ_{b=∂a} [ (n_b + 1) P(n_a - 1, n_b + 1; t) - n_b P(n_a, n_b; t) ]
    + ν Σ_{a} Σ_{k=1}^{K} Σ_{b, b_k≠a_k} [ (n_{a/b_k} + 1)(n_{b/a_k}/N) P(n_a - 1, n_{a/b_k} + 1; t)
    - n_{a/b_k}(n_{b/a_k}/N) P(n_a, n_{a/b_k}; t) ]        (2)

Here n_a is the number of individuals with sequence S_a, with the vector index a used to label the 2^L sequences. This process conserves N = Σ_a n_a. The notation ∂a denotes the L sequences created by a single mutation from sequence S_a. The notation a/b_k denotes the sequence created by horizontally transferring module k from sequence S_b into sequence S_a.
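The three moves of the master equation can be realized as a Gillespie-type stochastic simulation. The sketch below is our own simplified illustration, not the authors' production code: it treats replication as a Moran step in which the offspring replaces a randomly chosen individual, which keeps N fixed as the master equation requires.

```python
import random

def fitness(s, J, delta, L):
    """f[S] = 2L + sum_ij s_i s_j J_ij Delta_ij, Eq. (1)."""
    return 2 * L + sum(s[i] * s[j] * J[i][j] * delta[i][j]
                       for i in range(L) for j in range(L))

def gillespie_step(pop, J, delta, L, K, mu, nu, rng):
    """Execute one replication, mutation, or HGT event, chosen by its rate."""
    N = len(pop)
    l = L // K
    f = [fitness(s, J, delta, L) for s in pop]
    r_rep = sum(f)          # total replication rate
    r_mut = N * mu * L      # each of the N*L loci mutates at rate mu
    r_hgt = N * nu * K      # each of the N*K modules is transferred at rate nu
    r = rng.random() * (r_rep + r_mut + r_hgt)
    if r < r_rep:
        # Replication: fitness-weighted parent; offspring replaces a random individual.
        a = rng.choices(range(N), weights=f)[0]
        pop[rng.randrange(N)] = pop[a][:]
    elif r < r_rep + r_mut:
        # Mutation: flip one random locus of one random individual.
        pop[rng.randrange(N)][rng.randrange(L)] *= -1
    else:
        # HGT: copy module k from individual b into individual a.
        a, b, k = rng.randrange(N), rng.randrange(N), rng.randrange(K)
        pop[a][k * l:(k + 1) * l] = pop[b][k * l:(k + 1) * l]
    return pop
```

Since a module transfer changes l/2 loci on average, the total a priori rate of locus change in this scheme is NμL + NνK·l/2 = NμL + NνL/2, consistent with the rate stated above.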

We consider how a population of initially random sequences adapts to a given environment, averaged over the distribution of potential environments. For example, in the context of influenza evolution, these sequences arise, essentially randomly, by transmission from swine. As discussed in Appendix B, a short-time expansion for the average fitness can be derived by recursive application of this master equation:

⟨f(t)⟩ = 2L + at + bt²
a = 2L(1 - 1/N)
b = -(4L²/N)(1 - 1/N) - 4μL(1 - 1/N)
        - 2νL[(1 - 1/K)(1 - M)(1 - 4/N) + 1/N](1 - 1/N)        (3)

Result (3) is exact for all finite N. Note that the effect of modularity enters at the quadratic level and requires a non-zero rate of horizontal gene transfer, ν > 0. We see that ⟨f_{M>0}(t)⟩ > ⟨f_0(t)⟩ for short times.
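The coefficients of Eq. (3) are simple enough to evaluate directly. The sketch below (our own helper function, not from the paper) confirms the statements above: a is independent of M, b increases with M, and the modularity dependence vanishes when ν = 0:

```python
def short_time_coeffs(L, N, K, mu, nu, M):
    """Coefficients a, b of <f(t)> = 2L + a t + b t^2, Eq. (3)."""
    a = 2 * L * (1 - 1 / N)
    b = (-(4 * L**2 / N) * (1 - 1 / N)
         - 4 * mu * L * (1 - 1 / N)
         - 2 * nu * L * ((1 - 1 / K) * (1 - M) * (1 - 4 / N) + 1 / N)
         * (1 - 1 / N))
    return a, b

# Parameters of Fig. 1: L = 100, N = 10000, mu = 0.05, nu = 0.6, K = 5
a0, b0 = short_time_coeffs(100, 10000, 5, 0.05, 0.6, M=0)
a1, b1 = short_time_coeffs(100, 10000, 5, 0.05, 0.6, M=1)
# a is the same for both; b is larger (less negative) for the modular landscape.
```

With ν = 0 the bracketed HGT term drops out entirely, so b becomes independent of M, reproducing the statement that the advantage of modularity disappears without horizontal gene transfer.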

From the master equation, we also calculate the sequence divergence, defined as D = (1/N) Σ_{α=1}^{N} ⟨(L - S^α(t)·S^α(0))/2⟩. As discussed in Appendix C, recursive application of Eq. (2) gives

D = αt + βt²
α = L²(1 - 1/N) + μL + (νL/2)(1 - 1/N)
β = -L³(1 - 1/(2L) + 3/(2LN))(1 - 1/N) - 2μL²(1 - 1/N) - νL²(1 - 1/N)
        - μ²L - μνL(1 - 1/N) - (1/4)ν²L(1 - 1/N)        (4)

The sequence divergence does not depend on modularity at second order in time. Note that the terms at order t^n not proportional to μ^m ν^{n-m} are due to discontinuous changes in sequence resulting from the fixed population size constraint.

Introduction of a new sequence into an empty niche corresponds to an initial condition of identical, rather than random, sequences. The population dynamics again follows Eq. (2). To see an effect of modularity, an expansion to fourth order in time is required. As discussed in Appendix B, we find

⟨f(t)⟩ = 2L + bt² + ct³ + dt⁴        (5)

with

(2!) b = 16μL(1 - 1/N)
(3!) c = -64μL²(1 - 1/N)/N - 192μ²L(1 - 1/N) - 32μνL(1 - 1/N)/N
(4!) d = 128μ²νLM(1 - 1/K)(1 - 1/N)(1 - 4/N) + 64μL(1 - 1/N)[2(6 + 14μ² - μν)
        + 2μν(1 - 4/N)/K - 2(36 - 10μL - 9μν)/N + (4L² + 2L + 92 + 4νL + ν²)/N²]        (6)

Interestingly, the fitness function initially increases quadratically. The modularity dependence enters at fourth order, and ⟨f_{M>0}(t)⟩ > ⟨f_0(t)⟩ for short times.
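From Eqs. (5)-(6) one can check directly that b and c carry no modularity dependence, while d grows with M when ν > 0. The following consistency sketch (our own function name and parameter choices) divides out the factorials:

```python
def niche_coeffs(L, N, K, mu, nu, M):
    """b, c, d of <f(t)> = 2L + b t^2 + c t^3 + d t^4, Eqs. (5)-(6)."""
    e = 1 - 1 / N
    b = 16 * mu * L * e / 2                      # (2!) b
    c = (-64 * mu * L**2 * e / N
         - 192 * mu**2 * L * e
         - 32 * mu * nu * L * e / N) / 6         # (3!) c
    d = (128 * mu**2 * nu * L * M * (1 - 1 / K) * e * (1 - 4 / N)
         + 64 * mu * L * e * (2 * (6 + 14 * mu**2 - mu * nu)
                              + 2 * mu * nu * (1 - 4 / N) / K
                              - 2 * (36 - 10 * mu * L - 9 * mu * nu) / N
                              + (4 * L**2 + 2 * L + 92 + 4 * nu * L + nu**2) / N**2)
         ) / 24                                  # (4!) d
    return b, c, d

b0, c0, d0 = niche_coeffs(100, 10000, 5, 0.05, 0.6, M=0)
b1, c1, d1 = niche_coeffs(100, 10000, 5, 0.05, 0.6, M=1)
```

As expected, b and c coincide for M = 0 and M = 1, d is larger for the modular landscape, and setting ν = 0 removes the M dependence of d.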

3 Comparison of Theory to Simulation Results

The response function of the modular and non-modular system can be computed numerically as well. We start the system with random initial sequences, so that the average initial fitness ⟨f(0)⟩ - 2L is zero, and compute the evolution of the average fitness, ⟨f(t)⟩. In Fig. 1 we show the results from a Lebowitz-Gillespie simulation. We see that ⟨f_M(t)⟩ > ⟨f_0(t)⟩ for t < t* for M = 1. That is, the modular system evolves more quickly to improve the average fitness, for times less than a crossover time, t*. For t > t*, the constraint that modularity imposes on the connections leads to the non-modular system dominating, ⟨f_0(t)⟩ > ⟨f_M(t)⟩.

Figure 1: Shown is the population average fitness for the modular (M = 1, solid) and non-modular (M = 0, dashed) systems. Fitness is normalized by L, and the offset 2L is subtracted. The modular system evolves to a greater fitness value for times t < t*. Here t* ≈ 182. Here Eq. (1) has been scaled by ε = 0.1, and L = 100, N = 10000, μ = 0.05, ν = 0.6, K = 5, and C = L/K - 1, motivated by the application to evolution of influenza discussed below. The inset shows the fit of the data to the form of Eq. (7).

Eq. (3) shows these results analytically, and we have checked Eq. (3) by numerical simulation as well. For the parameter values of Fig. 1 and M = 0, theory predicts a/L = 0.019998 and b/L = -0.011636, and simulation gives a/L = 0.020529 ± 0.000173 and b/L = -0.011306 ± 0.000406. For M = 1, theory predicts a/L = 0.019998 and b/L = -0.002041, and simulation gives a/L = 0.019822 ± 0.000200 and b/L = -0.0019084 ± 0.000177.

Since the landscape is rugged, we expect the time it takes the system to reach a given fitness value ⟨f(t)⟩, starting from random sequences, to grow in a stretched exponential way at large times. Moreover, since the combination of horizontal gene transfer and modularity improves the response function at short times, we expect that the difference in times required for a non-modular and a modular system to reach a given ⟨f(t)⟩ also increases in a stretched exponential way, by analogy with statistical mechanics, in which the spin glass energy response function is expected to converge as

⟨f(t)⟩ ∼ f_∞ - c ln^{-2/ν}(t/t₀)        (7)

with ν=1\nu=1 [21]. Figure 1 shows the fit of the data to this functional form, with f=0.958f_{\infty}=0.958, c=0.067c=0.067, t0=3.168t_{0}=3.168 for M=0M=0 and f=0.922f_{\infty}=0.922, c=0.004c=0.004, t0=7.619t_{0}=7.619 for M=1M=1. Figure 2 shows the stretched exponential speedup in the rate of evolution that modularity provides.

Figure 2: Shown is the difference in time required, Δt, for the modular (M = 1) and non-modular (M = 0) systems to reach a specified average fitness value for times t < t*, starting from random initial sequences. Parameter values are as in Fig. 1.

The prediction for the evolution of a population of identical sequences in a new niche is shown in Eq. (5). Analytically, the effect of modularity shows up at fourth order rather than second order when all sequences are initially identical. Qualitatively, however, at all but the very shortest times, the results for random and identical sequences are similar, as shown in Figs. 1 and 3.

Figure 3: Shown is the population average fitness for the modular (M = 1, solid) and non-modular (M = 0, dashed) systems when all individuals initially have the same sequence. Fitness is normalized by L, and the offset 2L is subtracted. The modular system evolves to a greater fitness value for times t < t*. Here t* ≈ 230. Parameter values are as in Fig. 1.

4 Average Fitness in a Changing Environment

We now use these results to calculate the average fitness in a changing environment. We consider that every T time steps, the environment randomly changes. During such an environmental change, each J_{ij} in the fitness, Eq. (1), is randomly redrawn from the Gaussian distribution with probability p. That is, on average, a fraction p of the fitness is randomized. Due to this randomization, the fitness immediately after the environmental change will be, on average, 1 - p times the value immediately before the change. This condition allows us to calculate the average time-dependent fitness during evolution in one environment as a function of p and T, given only the average fitness starting from random initial conditions, ⟨f(t)⟩ [22]. We denote the average fitness reached during evolution in one environment as f_{p,T}(M). This is related to the average fitness with random initial conditions by f_{p,T}(M) = ⟨f_M(t)⟩, where t is chosen to satisfy ⟨f_M(t - T)⟩ = (1 - p)⟨f_M(t)⟩ due to the above condition. Thus, for high rates or large magnitudes of environmental change, Eq. (3) can be directly used along with these two conditions to predict the steady-state average population fitness.
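To illustrate the self-consistency condition, the sketch below solves ⟨f(t - T)⟩ = (1 - p)⟨f(t)⟩ by bisection, applying the condition to the adaptation part φ(t) = ⟨f(t)⟩ - 2L of the quadratic short-time form. This is our own illustrative solver, valid only where the quadratic expansion holds, and the coefficient values at the bottom are hypothetical, not the Fig. 1 values:

```python
def phi(t, a, b):
    """Adaptation part of the fitness, phi(t) = <f(t)> - 2L = a t + b t^2."""
    return a * t + b * t * t

def steady_state_time(T, p, a, b, t_max):
    """Solve phi(t - T) = (1 - p) phi(t) for t in (T, t_max) by bisection.

    g(t) = phi(t - T) - (1 - p) phi(t) starts negative at t = T, since
    phi(0) = 0, and turns positive as the population re-adapts.
    """
    g = lambda t: phi(t - T, a, b) - (1 - p) * phi(t, a, b)
    lo, hi = T, t_max
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical coefficients for illustration:
t = steady_state_time(T=5.0, p=0.5, a=2.0, b=-0.01, t_max=50.0)
f_pT = phi(t, 2.0, -0.01)   # steady-state fitness above the 2L offset
```

The returned t is the effective age of the population on the random-start fitness curve, and f_{p,T} follows by evaluating that curve at t.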



Figure 4: Shown is the fitness, f_{p,T}(M), in an environment that changes with characteristic time T and magnitude a) p = 0.5 or b) p = 0.9. c) The fitness is replotted as a function of T/p. For each p, the fitness is shown for three values of modularity: M = 0.6 (solid), M = 0.8 (dotted), and M = 1 (dashed). Parameters are as in Fig. 1.

Figure 4 shows how the average fitness depends on the rate and magnitude of environmental change. The average fitness values for a given modularity are computed numerically. The figure displays a crossing of the modularity-dependent fitness curves. The curve associated with the largest modularity is greatest at short times. There is a crossing to the curve associated with the second largest modularity at intermediate times, and a crossing to the curve associated with the smallest modularity at somewhat longer times. In other words, the value of modularity that leads to the highest average fitness depends on the time between environmental changes, T, decreasing with increasing T.

Figure 4 demonstrates that larger values of modularity, greater M, lead to higher average fitness values for faster rates of environmental change, smaller T. The advantage of greater modularity persists to larger T for greater magnitudes of environmental change, larger p. In other words, modularity leads to greater average fitness values either for greater frequencies of environmental change, 1/T, or greater magnitudes of environmental change, p.

It has been argued that an approximate measure of the environmental pressure is given by the product of the magnitude and frequency of environmental change, p/T [22]. If this is the case, then the curves in Fig. 4a,b as a function of T for different values of p should collapse to a single curve as a function of T/p. As shown in Fig. 4c, this is approximately true for the three cases M = 0.6, M = 0.8, and M = 1.

5 Application of Theory to Influenza Evolution Data

We here apply our theory to the evolution of the influenza virus. Influenza is an RNA virus of the Orthomyxoviridae family. The 11 proteins of the virus are encoded by 8 gene segments. Reassortment of these gene segments is common [23, 24, 25]. In addition, there is a constant source of genetic diversity, as most human influenza viruses arise from birds and mix in pigs, with a select few transmitted to humans [23, 24, 25, 26]. We model a simplification of this complex coevolutionary dynamics with the theory presented here. We consider K ≈ 5 modules, with mutations in the genetic material encoding them leading to an effective mutation rate at the coarse-grained length scale. We do not consider individual amino acids, but rather a coarse-grained L = 100 representation of the viral protein material. The virus is under strong selective pressure to adapt to the human host, having most often arisen in birds and been transmitted through swine. The virus is also under strong pressure from the human immune system. The virus evolves in response to this pressure, thereby increasing its fitness. The increase in fitness was estimated by tracking the increase in frequency of each viral strain, as observed in the public sequence databases: ln[f_i(t+1)/f_i(t)] = x̂_i(t+1)/x_i(t), where f_i(t) is the fitness of clade i at time t, x_i(t) is the frequency of clade i among all clades observed at time t, and x̂_i(t+1) is the frequency at t+1 predicted by a model that includes a description of the mutational processes [27]. Here "clade" is a term for the quasispecies of closely-related influenza sequences at time t. We use these approximate fitness values, estimated from observed HA sequence patterns and an approximate point mutation model of evolution, for comparison to the present model.

Influenza evolves within a cluster of closely-related sequences for 3–5 years and then jumps to a new cluster [28, 29, 30]. Indeed, the fitness flux data over 17 years [27] show a pattern of discontinuous changes every 3–5 years. Often these jumps are related to the influx of genetic material from swine [23, 24, 25, 26]. By clustering the strains, the clusters and the transitions between them can be identified: the flu strain evolved from Wuhan/359/95 to Sydney/5/97 in 1996–1997, from Panama/2007/1999 to Fujian/411/2002 in 2001–2002, from California/7/2004 to Wisconsin/67/2005 in 2006–2007, and from Brisbane/10/2007 to BritishColumbia/RV1222/09 in 2009–2010 [30]. These cluster transitions correspond to the discontinuous jumps in the fitness flux evolution and discontinuous changes in the sequences. Our theory represents the evolution within one of these clusters, considering reassortment only among human viruses. Thus, we predict the evolution of the fitness during each of these periods. There are 4 periods within the time frame 1993–2009. Figure 5 shows the measured fitness data [27], averaged over these four time periods.

Figure 5: Shown is the fitness of influenza virus H3N2 in humans, estimated from the time derivative of the frequency with which sequences appear in the GenBank sequence database. Each time a new circulating strain is introduced to humans, typically from pigs infected from birds, the fitness increases as a quasispecies expands around the strain. The average fitness increase over four such antigenic shifts between 1993 and 2010 is shown. Data are from [27]. Error bars are one standard error. Theoretical results are shown for the modular (M = 1, solid) and non-modular (M = 0, dotted) model predictions. Theory was fit to the average linear fitness increase at 5 years. Parameter values are as in Fig. 1, with fitness scaled by ε = 0.1. Theoretical results are also shown for ε = 0.125 and M = 0 (dashed).

We scaled Eq. (1) by ε to fit the observed data. The predictions from Eq. (3) are shown in Fig. 5. We find that ε = 0.1 and M = 1 fit the observed data well. We assume the overall rate of evolution is equally contributed to by mutation and horizontal gene transfer. The average observed substitution rate in the 100-aa epitope region of the HA protein is 5 amino acids per year [30, 27]. We interpret this result in our coarse-grained model to imply μ = 0.05 and ν = 0.6.

The value of ε required to fit the non-modular model to the data is 25% greater than the value required to fit the modular model to the data. That is, approximately 25% of the observed rate of fitness increase of the virus could be ascribed to a modular viral fitness landscape. Thus, to achieve the observed rate of evolution, the virus may either have evolved modularity M = 1, or it may have evolved a 25% increase in its replication rate by other evolutionary mechanisms. Given the modular structure of the epitopes on the haemagglutinin protein of the virus, the modular nature of the viral segments, and the modular nature of the naive, B cell, and T cell immune responses, we suggest that influenza has likely evolved a modular fitness landscape.

6 Generalization to Investigate the Effect of Landscape Ruggedness

We here generalize Eq. (1) to p-spin interactions, q letters in the alphabet from which the sequences are constructed, and what is known as the GNK random matrix form of interactions [31]. These generalizations allow us to investigate the effect of landscape ruggedness on the rate of evolution.

6.1 Generalization to the p-spin SK Model

We first generalize the SK model to interactions among p spins, which we term the pSK model. The pSK landscape is more rugged for increasing p. Thus, we might expect modularity to play a more important role for the models with larger p. The generalized fitness is written as

f = Σ_{i₁i₂…i_p} J_{i₁i₂…i_p} Δ_{i₁i₂…i_p} s_{i₁} s_{i₂} … s_{i_p} + 2L        (8)

The J and Δ tensors are symmetric, that is, J_{i₁i₂…i_p} = J_{[i₁i₂…i_p]} and Δ_{i₁i₂…i_p} = Δ_{[i₁i₂…i_p]}, where [i₁i₂…i_p] is any permutation of i₁i₂…i_p. The probability of a connection is C₁/L^{p-1} inside the block, ⌊i₁/l⌋ = ⌊i₂/l⌋ = … = ⌊i_p/l⌋, and C₀/L^{p-1} outside the blocks. The connections satisfy

Σ_{i₂…i_p} Δ_{i₁i₂…i_p} = C        (9)

Since C = C₀[1 - (1/K)^{p-1}] + C₁(1/K)^{p-1}, it follows that C₁ = C[MK^{p-1} + (1 - M)] and C₀ = C(1 - M).
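These relations can be checked mechanically: substituting C₁ and C₀ back into the expression for C recovers it for any M, K, and p, and for p = 2 the definition reduces to M = (C₁ - C₀)/(KC) from Section 2. A quick consistency check (our own function names):

```python
def c_in_out(C, M, K, p):
    """In-block (C1) and out-of-block (C0) connectivities for given C, M."""
    C1 = C * (M * K**(p - 1) + (1 - M))
    C0 = C * (1 - M)
    return C1, C0

def c_total(C0, C1, K, p):
    """C = C0 [1 - (1/K)^(p-1)] + C1 (1/K)^(p-1)."""
    w = (1 / K)**(p - 1)
    return C0 * (1 - w) + C1 * w
```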

Increasing the value of p makes the landscape more rugged, and so it is more difficult for the system to reach a given value of the fitness in a finite time. Thus, modularity is expected to play a more significant role in giving the system an evolutionary advantage. Specifically, we expect that the effect of modularity will show up at shorter times for larger p. As shown in Appendix B, we find

a = 2L(1 - 1/N)
2b = -(8L²/N)(1 - 1/N) - 4μpL(1 - 1/N)
        + 2νL[(2/N - p + 3p/N)(1 - M)(1 - 1/K^{p-1}) - 2/N](1 - 1/N)        (10)

The modular terms increase faster with p than the non-modular terms, so the dependence of evolution on modularity is more significant for larger p. Figure 6 shows the average fitness curves for the p-spin interaction. Due to the normalization of the J values, all fitness curves have the same slope at t = 0, i.e. the same values of a. However, increasing p leads to slower rates of fitness increase at finite time, especially for the M = 0 case. At finite time, for large values of p, the evolutionary advantage of larger M is more pronounced. That is, modularity helps the sequences to evolve more efficiently.

Figure 6: Results for the 2-spin, 12-spin, and 30-spin SK interaction fitness when the initial sequences are random. The fitness is given by Eq. (8). The parameters are as in Fig. 1.

We also calculated the average fitness for the initial condition of all sequences identical. As discussed above, many-spin effects make the landscape more rugged for larger p. We seek to understand this effect quantitatively for the case of identical initial sequences as well. As shown in Appendix B, the Taylor expansion results are

(2!) b = 8μpL(1 - 1/N)
(3!) c = -32μpL²(1 - 1/N)/N - 16μ²p²L(1 - 1/N)
        - 64μ²pL(1 - 1/N) - 16μνpL(1 - 1/N)/N
(4!) d = 64μ²νp(p - 1)LM(1 - 1/K)(1 - 1/N)(1 - 4/N)
        + 32μpL(1 - 1/N)[(6p + 7μ²p² - 2μν(p - 1))
        + 2μν(1 - 4/N)/K - (36p - 10μpL - 5μνp - 8μν(p - 1))/N
        + (4L² + 2L + 46p + 4νL + ν²)/N²]        (11)

6.2 Generalization to the Planar Potts Model

We here generalize the alphabet from two states, s_i = ±1, to q states, with a planar Potts model [32] of the fitness. The q = 2 model is often justified as a projection of the q = 4 nucleic acid alphabet onto purines and pyrimidines. Thus, q = 4 is a relevant generalization. Additionally, q = 20 corresponds to the amino acid alphabet. The alphabet size can affect the evolutionary phase diagram [33], so q is a relevant order parameter. In the planar Potts model, each vector spin takes on one of q equidistant angles, and the angle between two spins is defined by s_i · s_j = cos θ_{ij}. The fitness is

f = Σ_{ij} J_{ij} Δ_{ij} cos θ_{ij} + 2L        (12)

The directions of a spin are evenly spaced in the plane, so when a spin can point in q directions, the angle between two spins is 2kπ/q, where k can be 0, 1, …, q - 1.
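For concreteness, the overlap cos θ_{ij} between two planar Potts spins at angles 2πk_i/q and 2πk_j/q is a one-liner; for q = 2 it reduces to the Ising product s_i s_j = ±1 (our own helper, for illustration only):

```python
import math

def overlap(ki, kj, q):
    """cos(theta_ij) for planar Potts spins at angles 2*pi*k/q, k = 0..q-1."""
    return math.cos(2 * math.pi * (ki - kj) / q)
```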

The evolutionary dynamics in the planar Potts model is distinct from that in the pSK model. As shown in Appendix B,

a = L(1 - 1/N)(1 + δ_{q,2})
2b = -4L²(1 - 1/N)(1 + δ_{q,2})/N - 2μL(1 - 1/N)(q/(q - 1))(1 + δ_{q,2})
        - 2νL[(1 - 1/K)(1 - M)(1 - 4/N) + 1/N](1 - 1/N)(1 + δ_{q,2})        (13)

Figure 7 displays results for two values of q. Increasing q does increase the landscape ruggedness in the planar Potts model. There is an evolutionary speedup provided by modularity. The q dependence, however, is not as dramatic as in the pSK model.

Figure 7: Average fitness in the planar Potts model for alphabet sizes of q = 2 or q = 20. Sequences are initially random. The fitness is given by Eq. (12). Other parameters are the same as in Fig. 1.

6.3 Generalization to the GNK Model

A model related to the SK form is the GNK model, a simple form of which is

f = H + 2L = Σ_{ij} σ_{ij}(s_i, s_j) Δ_{ij} + 2L        (14)

where the σ matrix is a symmetric matrix with random Gaussian entries, and the Δ matrix is the same as that in Eq. (1). Other than the condition of symmetry, the entries in the σ matrices are independently drawn from a Gaussian distribution of zero mean and unit variance, in each matrix and for different i, j. Thus, other than the condition σ_{ij} = σ_{ji}, the values are independent for different i, j, s_i, or s_j. This form is generalized to a p-spin, q-state form as

f = Σ_{i₁i₂…i_p} σ_{i₁i₂…i_p}(s_{i₁}, s_{i₂}, …, s_{i_p}) Δ_{i₁i₂…i_p} + 2L        (15)

where each s_i can take one of the q values.

For the GNK model, as shown in Appendix B, the p-spin, q-state result is

a = 2L(1 - 1/N)(1 - 1/q^p)/ln q
2b = -(8L²/N)(1 - 1/N)(1 - 1/q^p)/ln q - 2μpL(1 - 1/N) q/((q - 1) ln q)
        - 2νKL(1 - 1/N)(1 - M)[(1 - 3/N)(1 + 1/q^p - ((1 - 1/K + q/K)/q)^p - ((q(1 - 1/K) + 1/K)/q)^p)
        - 2((1 - 1/K + q/K)/q)^p/N + 2/(NK) + 2(1 - 1/K)/(Nq^p)]/ln q
        + 4νL(1 - 1/N)(1 + (K - 1)/q^p)/(N ln q)        (16)

As with previous models, modularity increases the rate of evolution.

7 Conclusion

The modular spin glass model captures several important aspects of biological evolution: finite population, rugged fitness landscape, modular correlations in the interactions, and horizontal gene transfer. We showed in both numerical simulation and analytical calculations that a modular landscape allows the system to evolve higher fitness values for times t<tt<t^{*}. Using this theory to analyze fitness data extracted from influenza virus evolution, we find that approximately 25% of the observed rate of fitness increase of the virus could be ascribed to a modular viral landscape. This result is consistent with the success of the modular theory of viral immune recognition, termed the pepitopep_{\rm epitope} theory, over the non-modular theory, termed psequencep_{\rm sequence} [29]. The former correlates with human influenza vaccine effectiveness with R2=0.81R^{2}=0.81, whereas the latter correlates with R2=0.59R^{2}=0.59. The model, in general, analytically demonstrates a selective pressure for the prevalence of modularity in biology. The present model may be useful for understanding the influence of modularity on other evolving biological systems, for example the HIV virus or immune system cells.

Acknowledgments

This research was supported by the US National Institutes of Health under grant 1 R01 GM 100468–01. JMP was also supported by the National Research Foundation of Korea Grant (NRF-2013R1A1A2006983).

8 Appendix A: The Requirement that Fitness be Non-Negative

8.1 The ppSK Model

The normalization of the coupling in Eq. (1), the JijJ_{ij}, affects the maximum and minimum possible values of the HH term. We require that the average energy per site of the sequence be finite when LL\rightarrow\infty, i.e. that HH is on the order of LL. For the 22-spin, 22-state interaction, the Wigner semicircle law [34] shows that an L×LL\times L random symmetric matrix with jJij2Δij2=1i\sum_{j}\langle J_{ij}^{2}\Delta_{ij}^{2}\rangle=1~\forall~i has a minimum eigenvalue of 2-2 in the large-LL limit. Thus, if we consider the sis_{i} in Eq. (1) to be normalized as isi2=L\sum_{i}s_{i}^{2}=L, the 2L2L shift in Eq.  (1) guarantees that H+2L0H+2L\geq 0. So after considering the connection matrix Δ\Delta, we choose Jii=0J_{ii}=0 and JijJkl=(δikδjl+δilδjk)/C\langle J_{ij}J_{kl}\rangle=(\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk})/C.
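As a quick numerical illustration (a sketch only; the size L=2000L=2000 and the dense Gaussian matrix are arbitrary illustrative choices), the smallest eigenvalue of a random symmetric matrix normalized so that each row of squared entries sums to one is indeed close to 2-2:

```python
# Sketch: an L x L random symmetric Gaussian matrix with
# sum_j <A_ij^2> = 1 for each row i has smallest eigenvalue near -2
# by the Wigner semicircle law.  L = 2000 is an illustrative choice.
import numpy as np

def smallest_eigenvalue(L=2000, seed=0):
    rng = np.random.default_rng(seed)
    A = rng.normal(0.0, 1.0 / np.sqrt(L), size=(L, L))
    A = (A + A.T) / np.sqrt(2.0)  # symmetrize; preserves <A_ij^2> = 1/L
    return np.linalg.eigvalsh(A)[0]  # eigenvalues in ascending order

print(smallest_eigenvalue())  # close to -2 for large L
```

The deviation from 2-2 shrinks as L2/3L^{-2/3}, so even moderate matrix sizes land within a percent of the asymptotic edge.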

For the ppSK case, we first assume that Ji1ipJ_{i_{1}...i_{p}} is not symmetric, i.e. Ji1ipJ_{i_{1}...i_{p}} with permutations of the same labels are drawn independently. We will consider the symmetrization afterward. We define

Ki1i2=i3ipJi1i2ipΔi1i2ipK_{i_{1}i_{2}}=\sum_{i_{3}...i_{p}}J_{i_{1}i_{2}...i_{p}}\Delta_{i_{1}i_{2}...i_{p}} (17)

For any i1i_{1},

i2Ki1i22=i2ipJi1i2ip2Δi1i2ip=CJ2\left<\sum_{i_{2}}K_{i_{1}i_{2}}^{2}\right>=\sum_{i_{2}...i_{p}}\langle J_{i_{1}i_{2}...i_{p}}^{2}\rangle\Delta_{i_{1}i_{2}...i_{p}}=C\langle J^{2}\rangle (18)

So,

H\displaystyle H =\displaystyle= i1i2ipsi1si2sipJi1i2ipΔi1i2ip\displaystyle\sum_{i_{1}i_{2}...i_{p}}s_{i_{1}}s_{i_{2}}...s_{i_{p}}J_{i_{1}i_{2}...i_{p}}\Delta_{i_{1}i_{2}...i_{p}} (19)
=\displaystyle= i1i2si1si2(i3ipsi3sipJi1i2ipΔi1i2ip)\displaystyle\sum_{i_{1}i_{2}}s_{i_{1}}s_{i_{2}}(\sum_{i_{3}...i_{p}}s_{i_{3}}...s_{i_{p}}J_{i_{1}i_{2}...i_{p}}\Delta_{i_{1}i_{2}...i_{p}})
=\displaystyle= i1i2si1si2Ki1i2\displaystyle\sum_{i_{1}i_{2}}s_{i_{1}}s_{i_{2}}K_{i_{1}i_{2}}
=\displaystyle= i1i2si1si2Ki1i2+Ki2i12\displaystyle\sum_{i_{1}i_{2}}s_{i_{1}}s_{i_{2}}\frac{K_{i_{1}i_{2}}+K_{i_{2}i_{1}}}{2}

In the last step we symmetrize the KK matrix so that we can apply the Wigner semicircle law. From the semicircle law, if we want the minimum of the pp-spin interaction to be 2L-2L, we need i2[(Ki1i2+Ki2i1)/2]2=1\sum_{i_{2}}\langle[(K_{i_{1}i_{2}}+K_{i_{2}i_{1}})/2]^{2}\rangle=1 for every i1i_{1}; the cross term Ki1i2Ki2i1\langle K_{i_{1}i_{2}}K_{i_{2}i_{1}}\rangle vanishes because the unsymmetrized couplings are independent with zero mean. We, thus, find

i2(Ki1i2+Ki2i12)2=14i2Ki1i22+Ki2i12=C2J2=1\sum_{i_{2}}\left\langle\left(\frac{K_{i_{1}i_{2}}+K_{i_{2}i_{1}}}{2}\right)^{2}\right\rangle=\frac{1}{4}\sum_{i_{2}}\langle K_{i_{1}i_{2}}^{2}+K_{i_{2}i_{1}}^{2}\rangle=\frac{C}{2}\langle J^{2}\rangle=1 (20)

So Ji1ip2=2/C\langle J_{i_{1}...i_{p}}^{2}\rangle=2/C for asymmetric Ji1ipJ_{i_{1}...i_{p}}. We symmetrize the Ji1ipJ_{i_{1}...i_{p}} as

Ji1i2ip=1p!permutationsJi1i2ipJ^{\prime}_{i_{1}i_{2}...i_{p}}=\frac{1}{p!}\sum_{\text{permutations}}J_{i_{1}i_{2}...i_{p}} (21)

to find

Ji1i2ip2=p!p!2Ji1i2ip2=2p!C\langle J^{\prime 2}_{i_{1}i_{2}...i_{p}}\rangle=\frac{p!}{p!^{2}}\langle J_{i_{1}i_{2}...i_{p}}^{2}\rangle=\frac{2}{p!C} (22)

The Wigner approach to calculating the minimal possible value of HH is overly conservative, because in this approach all possible real vectors sis_{i} are considered, whereas in our application si=±1s_{i}=\pm 1. We here use extreme value theory to take this constraint into account. We use the 22-spin interaction to exemplify the method. As the JJ matrix is symmetric, its elements are not independent, so the terms in HH are not independent. We rewrite HH, noting that the diagonal terms are zero, as

H=ijJijsisjΔij=2i<jJijsisjΔij=2HH=\sum_{ij}J_{ij}s_{i}s_{j}\Delta_{ij}=2\sum_{i<j}J_{ij}s_{i}s_{j}\Delta_{ij}=2H^{\prime} (23)

The terms in HH^{\prime} are independent. After taking into account the connection matrix, there are LC/2LC/2 non-zero independent elements. As sis_{i} is either +1+1 or 1-1, multiplying by them does not change the distribution of each element, so HH^{\prime} is the sum of LC/2LC/2 Gaussian variables. We choose the variance of each element to be 1/C1/C, so HN(0,L/2)H^{\prime}\sim N(0,L/2), and HH is twice this, so

HN(0,2L)H\sim N(0,2L) (24)

There are 2L2^{L} different sequences, so there are 2L2^{L} different HH, which differ by a few terms instead of all terms, so they are not independent. We seek the smallest HH. Slepian’s lemma [35] shows that for Gaussian variables XiX_{i} and YiY_{i} with average 0, if Xi2=Yi2\langle X_{i}^{2}\rangle=\langle Y_{i}^{2}\rangle, and XiXjYiYj\langle X_{i}X_{j}\rangle\leq\langle Y_{i}Y_{j}\rangle for all ii and jj, then for any xx,

P(max1inXix)P(max1inYix)P(\max_{1\leq i\leq n}X_{i}\leq x)\leq P(\max_{1\leq i\leq n}Y_{i}\leq x) (25)

Thus, since the XiX_{i} and YiY_{i} have zero mean, the minimum of the less correlated variables is smaller than that of the more correlated ones. We therefore use extreme value theory to determine the minimum for the same number of independent variables with the same distribution, and we use this value as an estimate of a lower bound. When we add this estimated value to HH, Slepian's lemma guarantees the non-negativity of the fitness. We will show shortly that this estimated value is quite near the true value. The extreme value theory calculation proceeds as follows:

HminP(H)𝑑H=1/2L\int_{-\infty}^{H_{\text{min}}}P(H)dH=1/2^{L} (26)

where 2L2^{L} is the sample space size when each spin has 2 directions. From the symmetry of the Gaussian distribution and Hmax=HminH_{\text{max}}=-H_{\text{min}}, we find

12π2LHmaxex2/4L𝑑x\displaystyle\frac{1}{\sqrt{2\pi}\sqrt{2L}}\int_{-\infty}^{H_{\text{max}}}e^{-x^{2}/4L}dx =\displaystyle= 0.5+1π0Hmax/2Lex2𝑑x\displaystyle 0.5+\frac{1}{\sqrt{\pi}}\int_{0}^{H_{\text{max}}/2\sqrt{L}}e^{-x^{2}}dx (27)
=\displaystyle= 0.5+0.5erf(Hmax/2L)\displaystyle 0.5+0.5\operatorname{erf}(H_{\text{max}}/2\sqrt{L})
\displaystyle\approx 1eHmax2/4LπHmax/2L\displaystyle 1-\frac{e^{-H_{\text{max}}^{2}/4L}}{\sqrt{\pi}H_{\text{max}}/2\sqrt{L}}
=\displaystyle= 112L\displaystyle 1-\frac{1}{2^{L}}

where erf()\operatorname{erf}() is the error function. Since LL is large, we find

eHmax2/4LπHmax/2LeHmax2/4L=eHmin2/4L=2L\frac{e^{-H_{\text{max}}^{2}/4L}}{\sqrt{\pi}H_{\text{max}}/2\sqrt{L}}\approx e^{-H_{\text{max}}^{2}/4L}=e^{-H_{\text{min}}^{2}/4L}=2^{-L} (28)

So

Hmin=2Lln22×0.832LH_{\text{min}}=-2L\sqrt{\ln 2}\approx-2\times 0.832L (29)

Numerical results show that the exact value is 2×0.763L-2\times 0.763L [36], quite close to the value in Eq. (29). In this way, for the ppSK interaction, a normalization of

Ji1,i2,,ip2=2/(p!C)\langle J_{i_{1},i_{2},...,i_{p}}^{2}\rangle=2/(p!\,C) (30)

gives a minimum 2Lln2\geq-2L\sqrt{\ln 2}. This bound becomes exact as pp\rightarrow\infty [37]. As pp becomes bigger, the landscape becomes more rugged, and the sequences become more uncorrelated. The correlation for general pp falls between that of p=2p=2 and pp\rightarrow\infty, so according to Slepian's lemma, the exact minimum for any pp will fall between that of p=2p=2 and that of pp\rightarrow\infty. As 0.83210.832\approx 1 and 0.76310.763\approx 1, we will neglect this factor and set the shift to 2L2L when J2=2/p!C\langle J^{2}\rangle=2/p!C, which guarantees f>0f>0.
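The effect of dropping the subexponential prefactor in Eq. (28) can be quantified by solving Eq. (26) with the full Gaussian tail; a minimal sketch (Python, with L=100L=100 chosen for illustration), using the complementary error function for an accurate tail and bisection for the root:

```python
# Sketch: solve P(H <= H_min) = 2^{-L} for H ~ N(0, 2L) by bisection,
# using erfc for tail accuracy, and compare with the asymptotic estimate
# H_min = -2L sqrt(ln 2) of Eq. (29).  L = 100 is illustrative.
import math

def h_min_numeric(L):
    target = 2.0 ** (-L)
    lo, hi = -6.0 * L, 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # CDF of N(0, 2L) at mid is 0.5*erfc(-mid / (2 sqrt(L)))
        if 0.5 * math.erfc(-mid / (2.0 * math.sqrt(L))) > target:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

L = 100
print(h_min_numeric(L), -2.0 * L * math.sqrt(math.log(2.0)))
```

The prefactor shifts the estimate by only a few percent at L=100L=100, consistent with neglecting it in Eq. (28).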

Here we also calculate H2\langle H^{2}\rangle, which is used to obtain the Taylor expansion results in section 9. For a two-spin interaction we find

H2\displaystyle\langle H^{2}\rangle =\displaystyle= ijklJijJklΔijΔklsisjsksl\displaystyle\left<\sum_{ij}\sum_{kl}J_{ij}J_{kl}\Delta_{ij}\Delta_{kl}s_{i}s_{j}s_{k}s_{l}\right> (31)
=\displaystyle= ijkl(δikδjl/C+δilδjk/C)ΔijΔklsisjsksl\displaystyle\sum_{ij}\sum_{kl}(\delta_{ik}\delta_{jl}/C+\delta_{il}\delta_{jk}/C)\Delta_{ij}\Delta_{kl}s_{i}s_{j}s_{k}s_{l}
=\displaystyle= 2ijΔij/C=2L\displaystyle 2\sum_{ij}\Delta_{ij}/C=2L

We similarly calculate pp-spin results. We find that H2\langle H^{2}\rangle is always 2L2L for all pp under this normalization scheme.
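Both the normalization H2=2L\langle H^{2}\rangle=2L and the minimum over ±1\pm 1 sequences can be checked by exhaustive enumeration at small LL; a sketch with illustrative parameters (full connectivity C=L1C=L-1, L=16L=16; finite-size values only approach the quoted large-LL constants):

```python
# Sketch: enumerate all 2^L sequences s_i = +/-1 for a fully connected
# 2-spin SK model with <J_ij^2> = 1/C, C = L-1; compare <H^2> with 2L and
# the minimum of H with -2 x 0.763 L (and the bound -2L sqrt(ln 2)).
import numpy as np

def sk_enumeration(L=16, seed=1):
    rng = np.random.default_rng(seed)
    C = L - 1
    J = rng.normal(0.0, np.sqrt(1.0 / C), size=(L, L))
    J = (J + J.T) / np.sqrt(2.0)   # symmetric, <J_ij^2> = 1/C
    np.fill_diagonal(J, 0.0)
    bits = (np.arange(2 ** L)[:, None] >> np.arange(L)) & 1
    S = 2.0 * bits - 1.0           # all 2^L sequences
    H = np.einsum('ai,ij,aj->a', S, J, S)
    return (H ** 2).mean(), H.min()

h2, hmin = sk_enumeration()
print(h2 / (2 * 16))    # near 1
print(hmin / (2 * 16))  # approaches -0.763 at large L
```

The enumeration confirms that the discrete minimum sits well above the Wigner value 2L-2L, as argued above.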

8.2 The Planar Potts Model

For the planar Potts model, we consider that qq is even, so the configuration space of q=2q=2 is a subset of that of q>2q>2, and the minimum of q>2q>2 is no bigger than that of the q=2q=2 Potts model, which is 0.763×2L-0.763\times 2L [36]. We consider two limiting cases to obtain a lower bound on the minimum. First, when qq\rightarrow\infty, the planar Potts model becomes the XY model, which is defined in this case as H=ijJij𝒔i𝒔j=ijJijcosθijH=\sum_{ij}J_{ij}\boldsymbol{s}_{i}\cdot\boldsymbol{s}_{j}=\sum_{ij}J_{ij}\cos\theta_{ij}, where θij\theta_{ij} is the angle between the vectors 𝒔i\boldsymbol{s}_{i} and 𝒔j\boldsymbol{s}_{j}, and these vectors 𝒔i\boldsymbol{s}_{i} can point in any direction in a two dimensional space. So the configuration space of the planar Potts model is a subset of that of the XY model, and the minimum of the planar Potts model is no smaller than that of the XY model. Numerical results show that the ground state energy of the XY model is roughly 0.90×2L-0.90\times 2L [38].

Second, the vectors in the XY model are restricted in a two dimensional space. If we generalize it to an nn dimensional space, with 𝒔i=(x1,x2,,xn)\boldsymbol{s}_{i}=(x_{1},x_{2},...,x_{n}), and x12+x22++xn2=1x_{1}^{2}+x_{2}^{2}+...+x_{n}^{2}=1, we obtain the nn vector model. When nn\rightarrow\infty, the model becomes the spherical model, the exact minimum of HH can be calculated analytically as 2L-2L when Jij2=1/L\langle J_{ij}^{2}\rangle=1/L [39]. Similarly, we see that the configuration space of the XY model is a subset of the spherical model, and the minimum of the XY model is no smaller than the spherical model. So considering the above two limiting cases, the minimum of the planar Potts model is no smaller than 2L-2L when Jij2=1/L\langle J_{ij}^{2}\rangle=1/L. Thus, we set the shift as 2L2L to guarantee that f>0f>0.

We still need to consider the effect of the connection matrix. We use extreme value theory. We consider that the matrix elements of JJ are randomly chosen, so that HH becomes the sum of LCLC Gaussian-like variables instead of L2L^{2} variables, and the minimum is C/L\sqrt{C/L} times the minimum without the connection matrix. To normalize the minimum back to 2L-2L, we finally choose Jij2=1/C\langle J_{ij}^{2}\rangle=1/C.

We calculate H2\langle H^{2}\rangle for the planar Potts model. When q=2q=2, the average value of cos2θij\cos^{2}\theta_{ij} is 11, while when q>2q>2, the value is 1/21/2. Thus,

H2\displaystyle\langle H^{2}\rangle =\displaystyle= ijJijΔijcosθijklJklΔklcosθkl\displaystyle\left<\sum_{ij}J_{ij}\Delta_{ij}\cos\theta_{ij}\sum_{kl}J_{kl}\Delta_{kl}\cos\theta_{kl}\right> (32)
=\displaystyle= 2ijJij2Δijcos2θij\displaystyle 2\left<\sum_{ij}J_{ij}^{2}\Delta_{ij}\cos^{2}\theta_{ij}\right>
=\displaystyle= ijJij2(1+δq,2)Δij\sum_{ij}\langle J_{ij}^{2}\rangle(1+\delta_{q,2})\Delta_{ij}
=\displaystyle= L(1+δq,2)\displaystyle L(1+\delta_{q,2})

8.3 The GNK Model

We use extreme value theory to discuss the minimum and normalization of the GNK model. When σ2=2p!/C\langle\sigma^{2}\rangle=2p!/C,

Hp!N(0,σ2LC/p!)N(0,2L)H\sim p!N(0,\langle\sigma^{2}\rangle LC/p!)\sim N(0,2L) (33)

so the distribution of the HH for the GNK model is the same as that of the ppSK model. The correlation induced by the dependence of HH on the sis_{i} is, however, different in these two models. Here we follow the method of [40]. We have two sets of variables with the same set size: one is the pp-spin, qq-state GNK HH, denoted XiX_{i} and normalized so that Xi2=1\langle X_{i}^{2}\rangle=1; the other is a set of Gaussian variables YiY_{i}, with Yi2=1\langle Y_{i}^{2}\rangle=1 and YiYj=c<1\langle Y_{i}Y_{j}\rangle=c<1. There are qLq^{L} variables in each set, and we know that Ymax\langle Y_{\text{max}}\rangle is 1c\sqrt{1-c} times the maximum of the same number of uncorrelated Gaussian variables with the same variance and average [41], so from Eq. (29) Ymax=2L(1c)lnq\langle Y_{\text{max}}\rangle=\sqrt{2L(1-c)\ln q}. We define a new variable, Zi=1tXi+tYiZ_{i}=\sqrt{1-t}X_{i}+\sqrt{t}Y_{i}, where 0t10\leq t\leq 1, and we define r(t)=P(Z1(t)a,,ZqL(t)a)r(t)=P(Z_{1}(t)\leq a,...,Z_{q^{L}}(t)\leq a). So r(t=0)r(t=0) is the probability that the maximum of the XiX_{i} is smaller than aa, while r(t=1)r(t=1) is the probability that the maximum of the YiY_{i} is smaller than aa. We seek to make r(0)r(1)r(0)\leq r(1), so that XmaxYmaxX_{\text{max}}\geq Y_{\text{max}}, so we seek dr(t)/dt0dr(t)/dt\geq 0 for 0t10\leq t\leq 1. It is shown on page 71 of [40] that

dr(t)/dt=12ij(YiYjXiXj)aa2ϕ(t,Z1,,ZqL)ZiZj𝑑Z1𝑑ZqLdr(t)/dt=\frac{1}{2}\sum_{ij}(\langle Y_{i}Y_{j}\rangle-\langle X_{i}X_{j}\rangle)\int^{a}_{-\infty}\cdot\cdot\cdot\int^{a}_{-\infty}\frac{\partial^{2}\phi(t,Z_{1},...,Z_{q^{L}})}{\partial Z_{i}\partial Z_{j}}dZ_{1}...dZ_{q^{L}} (34)

where ϕ\phi is the joint distribution of the ZZ. For example, the term corresponding to i=1i=1, j=2j=2 is (Y1Y2X1X2)aaϕ(t,a,a,Z3,,ZqL)𝑑Z3𝑑ZqL(\langle Y_{1}Y_{2}\rangle-\langle X_{1}X_{2}\rangle)\int^{a}_{-\infty}\cdot\cdot\cdot\int^{a}_{-\infty}\phi(t,a,a,Z_{3},...,Z_{q^{L}})dZ_{3}...dZ_{q^{L}}, and aaϕ(t,a,a,Z3,,ZqL)dZ3dZqLϕ(t,Z1=a,Z2=a)\int^{a}_{-\infty}\cdot\cdot\cdot\int^{a}_{-\infty}\phi(t,a,a,Z_{3},...,Z_{q^{L}})dZ_{3}...dZ_{q^{L}}\approx\phi(t,Z_{1}=a,Z_{2}=a), as the probability that all other variables are smaller than the maximum we are looking for is approximately 11. YiYj=c\langle Y_{i}Y_{j}\rangle=c for all iji\neq j, and XiXj\langle X_{i}X_{j}\rangle is the same for all pairs with the same dijd_{ij}, so we group the pairs according to their Hamming distance and rewrite Eq. (34) as

dr(t)/dt=14id=1L(Ld)(q1)d(cXiXj)ϕ(t,Zi=a,Zj=a)dr(t)/dt=\frac{1}{4}\sum_{i}\sum_{d=1}^{L}\binom{L}{d}(q-1)^{d}(c-\langle X_{i}X_{j}\rangle)\phi(t,Z_{i}=a,Z_{j}=a) (35)

where XjX_{j} is any sequence satisfying dij=dd_{ij}=d. As the system is totally random, it does not matter what sequence jj is, and we can set it as the first sequence and rewrite it as

dr(t)/dt=14qLd=1L(Ld)(q1)d(cX1Xi)ϕ(t,Z1=a,Zi=a)dr(t)/dt=\frac{1}{4}q^{L}\sum_{d=1}^{L}\binom{L}{d}(q-1)^{d}(c-\langle X_{1}X_{i}\rangle)\phi(t,Z_{1}=a,Z_{i}=a) (36)

where XiX_{i} is any sequence satisfying d1i=dd_{1i}=d. As only the integral depends on tt, we can write it as

r(1)r(0)=14qLd=1L(Ld)(q1)d(cX1Xi)01ϕ(t,Z1=a,Zi=a)𝑑tr(1)-r(0)=\frac{1}{4}q^{L}\sum_{d=1}^{L}\binom{L}{d}(q-1)^{d}(c-\langle X_{1}X_{i}\rangle)\int_{0}^{1}\phi(t,Z_{1}=a,Z_{i}=a)dt (37)

When r(0)=r(1)r(0)=r(1), we find

c=d=1L(Ld)(q1)d01ϕ(t,Z1=a,Zi=a)dtX1Xid=1L(Ld)(q1)d01ϕ(t,Z1=a,Zi=a)dtc=\frac{\sum_{d=1}^{L}\binom{L}{d}(q-1)^{d}\int_{0}^{1}\phi(t,Z_{1}=a,Z_{i}=a)dt\langle X_{1}X_{i}\rangle}{\sum_{d=1}^{L}\binom{L}{d}(q-1)^{d}\int_{0}^{1}\phi(t,Z_{1}=a,Z_{i}=a)dt} (38)

We first calculate XiXj\langle X_{i}X_{j}\rangle. The correlation is

XiXj\displaystyle\langle X_{i}X_{j}\rangle =\displaystyle= i1ipσ(si1sip)Δi1ipj1jpσ(sj1sjp)Δj1jp\left\langle\sum_{i_{1}...i_{p}}\sigma({s_{i_{1}}...s_{i_{p}}})\Delta_{i_{1}...i_{p}}\sum_{j_{1}...j_{p}}\sigma({s^{\prime}_{j_{1}}...s^{\prime}_{j_{p}}})\Delta_{j_{1}...j_{p}}\right\rangle (39)
=\displaystyle= (p!)2i1<<ipσ(si1sip)σ(si1sip)Δi1ip\displaystyle(p!)^{2}\sum_{i_{1}<...<i_{p}}\langle\sigma({s_{i_{1}}...s_{i_{p}}})\sigma({s^{\prime}_{i_{1}}...s^{\prime}_{i_{p}}})\rangle\Delta_{i_{1}...i_{p}}

We initially neglect the Δ\Delta. Among the (Lp)\binom{L}{p} different kinds of σ\sigma, (Ldp)\binom{L-d}{p} remain unchanged, so XiXj=Xi2(Ldp)/(Lp)=(Ldp)/(Lp)\langle X_{i}X_{j}\rangle=\langle X_{i}^{2}\rangle\binom{L-d}{p}/\binom{L}{p}=\binom{L-d}{p}/\binom{L}{p} if dLpd\leq L-p, and otherwise it is 0. Now consider the connection matrix. If M=0M=0, the connections are randomly chosen, so we expect that the result does not change. If M=1M=1, all connections fall into small modules, and again the dd changed spins are randomly distributed among the modules, so out of the ll spins in each module, dl/Ldl/L spins have changed. Thus, in each module, out of the (lp)\binom{l}{p} choices of σ\sigma, (ldl/Lp)\binom{l-dl/L}{p} remain unchanged, so XiXj=Xi2(ldl/Lp)/(lp)(ldl/L)p/lp(Ld)p/Lp(Ldp)/(Lp)\langle X_{i}X_{j}\rangle=\langle X_{i}^{2}\rangle\binom{l-dl/L}{p}/\binom{l}{p}\approx(l-dl/L)^{p}/l^{p}\approx(L-d)^{p}/L^{p}\approx\binom{L-d}{p}/\binom{L}{p}. For general MM, for a given σ\sigma, consider that hh spins fall outside the module, so that php-h fall inside the module. Among the (Llh)\binom{L-l}{h} possibilities outside the module, ((Ll)(Ld)/Lh)\binom{(L-l)(L-d)/L}{h} remain unchanged, so the ratio of unchanged over all is ((Ll)(Ld)/Lh)/(Llh)(Ld)h/Lh\binom{(L-l)(L-d)/L}{h}/\binom{L-l}{h}\approx(L-d)^{h}/L^{h}. For the (lph)\binom{l}{p-h} possible choices inside the module, (l(Ld)/Lph)\binom{l(L-d)/L}{p-h} remain unchanged, so the ratio is (l(Ld)/Lph)/(lph)(Ld)ph/Lph\binom{l(L-d)/L}{p-h}/\binom{l}{p-h}\approx(L-d)^{p-h}/L^{p-h}. So the probability that a σ\sigma is unchanged is (Ld)h/Lh×(Ld)ph/Lph=(Ld)p/Lp(Ldp)/(Lp)(L-d)^{h}/L^{h}\times(L-d)^{p-h}/L^{p-h}=(L-d)^{p}/L^{p}\approx\binom{L-d}{p}/\binom{L}{p}. So for all MM, XiXj=Xi2(Ldp)/(Lp)=(Ldp)/(Lp)\langle X_{i}X_{j}\rangle=\langle X_{i}^{2}\rangle\binom{L-d}{p}/\binom{L}{p}=\binom{L-d}{p}/\binom{L}{p}. Also, (Ld)(Ldp)/(Lp)=(Lpd)\binom{L}{d}\binom{L-d}{p}/\binom{L}{p}=\binom{L-p}{d}, so
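The distance dependence XiXj=(Ldp)/(Lp)\langle X_{i}X_{j}\rangle=\binom{L-d}{p}/\binom{L}{p} can be spot-checked by Monte Carlo; a sketch for p=2p=2 with illustrative parameters (no connection matrix, which the argument above indicates does not change the correlation):

```python
# Sketch: Monte Carlo check that two GNK fitnesses at Hamming distance d
# have correlation binom(L-d, p)/binom(L, p) for p = 2.  One independent
# Gaussian table sigma(s_i, s_j) per pair (i < j); no connection matrix.
import math
import numpy as np

def gnk_correlation(L=20, q=2, d=5, samples=4000, seed=3):
    rng = np.random.default_rng(seed)
    pairs = np.array([(i, j) for i in range(L) for j in range(i + 1, L)])
    ii, jj = pairs[:, 0], pairs[:, 1]
    m = np.arange(len(pairs))
    vals = np.empty((samples, 2))
    for k in range(samples):
        sigma = rng.normal(size=(len(pairs), q, q))
        s = rng.integers(0, q, size=L)
        sp = s.copy()
        flip = rng.choice(L, size=d, replace=False)  # change exactly d sites
        sp[flip] = (sp[flip] + rng.integers(1, q, size=d)) % q
        vals[k, 0] = sigma[m, s[ii], s[jj]].sum()
        vals[k, 1] = sigma[m, sp[ii], sp[jj]].sum()
    return np.corrcoef(vals[:, 0], vals[:, 1])[0, 1]

expected = math.comb(20 - 5, 2) / math.comb(20, 2)
print(gnk_correlation(), expected)  # both near 0.55
```

Any pair with at least one changed endpoint draws a fresh, independent table entry, so only the (Ldp)\binom{L-d}{p} untouched pairs contribute to the correlation.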

c=d=1Lp(Lpd)(q1)d01ϕ(t,Z1=a,Zi=a)dtd=1L(Ld)(q1)d01ϕ(t,Z1=a,Zi=a)dtc=\frac{\sum_{d=1}^{L-p}\binom{L-p}{d}(q-1)^{d}\int_{0}^{1}\phi(t,Z_{1}=a,Z_{i}=a)dt}{\sum_{d=1}^{L}\binom{L}{d}(q-1)^{d}\int_{0}^{1}\phi(t,Z_{1}=a,Z_{i}=a)dt} (40)

The integral part simplifies as

aaϕ(t,a,Z2,,Zi1,a,Zi+1,,ZqL)dZ2dZi1dZi+1dZqLϕ(t,Z1=a,Zi=a)\int^{a}_{-\infty}\cdot\cdot\cdot\int^{a}_{-\infty}\phi(t,a,Z_{2},...,Z_{i-1},a,Z_{i+1},...,Z_{q^{L}})dZ_{2}...dZ_{i-1}dZ_{i+1}...dZ_{q^{L}}\approx\phi(t,Z_{1}=a,Z_{i}=a)

We write the joint distribution of the Gaussian variables as

f(x1,,xk)=1(2π)k|Σ|e12(𝒁𝒁)TΣ1(𝒁𝒁)f(x_{1},...,x_{k})=\frac{1}{\sqrt{(2\pi)^{k}|\Sigma|}}e^{-\frac{1}{2}(\boldsymbol{Z-\langle Z\rangle})^{T}\Sigma^{-1}(\boldsymbol{Z-\langle Z\rangle})} (41)

where Σ\Sigma is the covariance matrix, and |Σ||\Sigma| is the determinant of Σ\Sigma. For our particular case, where there are only two variables with unit variance and zero average, the covariance matrix is given by

Σ=(Z12Z1Z2Z1Z2Z22)=(1ρρ1)\Sigma=\begin{pmatrix}\langle Z_{1}^{2}\rangle&\langle Z_{1}Z_{2}\rangle\\ \langle Z_{1}Z_{2}\rangle&\langle Z_{2}^{2}\rangle\end{pmatrix}=\begin{pmatrix}1&\rho\\ \rho&1\end{pmatrix} (42)

where ρ=Z1Z2\rho=\langle Z_{1}Z_{2}\rangle. So |Σ|=1ρ2|\Sigma|=1-\rho^{2} and

Σ1=11ρ2(1ρρ1)\Sigma^{-1}=\frac{1}{1-\rho^{2}}\begin{pmatrix}1&-\rho\\ -\rho&1\end{pmatrix} (43)

It follows that

12(𝒁𝒁)TΣ1(𝒁𝒁)=Z12+Z222ρZ1Z222ρ2-\frac{1}{2}(\boldsymbol{Z-\langle Z\rangle})^{T}\Sigma^{-1}(\boldsymbol{Z-\langle Z\rangle})=-\frac{Z_{1}^{2}+Z_{2}^{2}-2\rho Z_{1}Z_{2}}{2-2\rho^{2}} (44)

When both variables equal aa, it is a21+ρ-\frac{a^{2}}{1+\rho}. So putting everything into Eq. (41), the probability density that both variables equal aa is ea2/(1+ρ)/(2π1ρ2)e^{-a^{2}/(1+\rho)}/(2\pi\sqrt{1-\rho^{2}}). We calculate ρij(t)=[(1t)XiXj+tYiYj]/(1t+t)=tc+(1t)(Ldp)/(Lp)=tc+(1t)(1dL)p\rho_{ij}(t)=[(1-t)\langle X_{i}X_{j}\rangle+t\langle Y_{i}Y_{j}\rangle]/(1-t+t)=tc+(1-t)\binom{L-d}{p}/\binom{L}{p}=tc+(1-t)(1-\frac{d}{L})^{p}.

Note that a=2L(1c)lnqa=\sqrt{2L(1-c)\ln q}, so Eq. (40) is a self-consistent equation

c=d=1Lp(Lpd)(q1)d01e2L(1c)lnq/[1+tc+(1t)(1dL)p]/1[tc+(1t)(1dL)p]2𝑑td=1L(Ld)(q1)d01e2L(1c)lnq/[1+tc+(1t)(1dL)p]/1[tc+(1t)(1dL)p]2𝑑tc=\frac{\sum_{d=1}^{L-p}\binom{L-p}{d}(q-1)^{d}\int_{0}^{1}e^{-2L(1-c)\ln q/[1+tc+(1-t)(1-\frac{d}{L})^{p}]}/\sqrt{1-[tc+(1-t)(1-\frac{d}{L})^{p}]^{2}}dt}{\sum_{d=1}^{L}\binom{L}{d}(q-1)^{d}\int_{0}^{1}e^{-2L(1-c)\ln q/[1+tc+(1-t)(1-\frac{d}{L})^{p}]}/\sqrt{1-[tc+(1-t)(1-\frac{d}{L})^{p}]^{2}}dt} (45)

The solution for L=100L=100, p=2p=2, and q=2q=2 is c=0.331c=0.331, so 1c=0.82\sqrt{1-c}=0.82. With p=2p=2, q=2q=2, and L=1000L=1000, c=0.330c=0.330, a value near that for infinite LL. We also applied this method to the ppSK model. For p=2p=2, q=2q=2, the correlation is 14d(Ld)/L21-4d(L-d)/L^{2}. So

c=d=1L[14d(Ld)/L2]01e2L(1c)lnq/[1+tc+(1t)(1dL)p]/1[tc+(1t)(1dL)p]2𝑑td=1L(Ld)01e2L(1c)lnq/[1+tc+(1t)(1dL)p]/1[tc+(1t)(1dL)p]2𝑑tc=\frac{\sum_{d=1}^{L}[1-4d(L-d)/L^{2}]\int_{0}^{1}e^{-2L(1-c)\ln q/[1+tc+(1-t)(1-\frac{d}{L})^{p}]}/\sqrt{1-[tc+(1-t)(1-\frac{d}{L})^{p}]^{2}}dt}{\sum_{d=1}^{L}\binom{L}{d}\int_{0}^{1}e^{-2L(1-c)\ln q/[1+tc+(1-t)(1-\frac{d}{L})^{p}]}/\sqrt{1-[tc+(1-t)(1-\frac{d}{L})^{p}]^{2}}dt} (46)

When p=2p=2, q=2q=2, and L=100L=100 we find c=0.128c=0.128, and Hmin=0.777×2LH_{\text{min}}=-0.777\times 2L, quite near the numerical result Hmin=0.763×2LH_{\text{min}}=-0.763\times 2L. This self-consistent method, thus, is fairly accurate.
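Eq. (45) can be solved numerically by damped fixed-point iteration; a minimal sketch (trapezoidal rule in tt, direct summation over dd; the constant prefactor of the bivariate density cancels between numerator and denominator, so it is omitted):

```python
# Sketch: damped fixed-point solution of the self-consistent equation
# Eq. (45) for the correlation c, with trapezoidal integration over t.
import math

def rhs(c, L=100, p=2, q=2, nt=200):
    A = 2.0 * L * (1.0 - c) * math.log(q)   # a^2 = 2L(1-c) ln q
    def integral(d):
        rho0 = (1.0 - d / L) ** p
        total = 0.0
        for k in range(nt + 1):
            t = k / nt
            rho = t * c + (1.0 - t) * rho0
            w = 0.5 if k in (0, nt) else 1.0
            total += w * math.exp(-A / (1.0 + rho)) / math.sqrt(1.0 - rho * rho)
        return total / nt
    num = sum(math.comb(L - p, d) * (q - 1) ** d * integral(d)
              for d in range(1, L - p + 1))
    den = sum(math.comb(L, d) * (q - 1) ** d * integral(d)
              for d in range(1, L + 1))
    return num / den

def solve_c(c0=0.3, iters=40, **kw):
    c = c0
    for _ in range(iters):
        c = 0.5 * c + 0.5 * rhs(c, **kw)   # damping for stable convergence
    return c

print(solve_c())  # compare with c = 0.331 reported in the text
```

The default parameters correspond to the L=100L=100, p=2p=2, q=2q=2 case discussed above.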

For a given qq, larger pp leads to smaller cc by Slepian's lemma, Eq. (25). To prove this, first assume Vi=HpV,qGNK(Si)V_{i}=H^{\text{GNK}}_{p_{V},q}(S_{i}) and Wi=HpW,qGNK(Si)W_{i}=H^{\text{GNK}}_{p_{W},q}(S_{i}) are GNK model variables with pV>pWp_{V}>p_{W}. For any sequence SiS_{i}, we group all other sequences according to the Hamming distance between them, so the group with Hamming distance dd contains (Ld)(q1)d\binom{L}{d}(q-1)^{d} variables. Assume SjS_{j} has a Hamming distance dd from SiS_{i}, so ViVj=(LdpV)/(LpV)=(LpVd)/(Ld)\langle V_{i}V_{j}\rangle=\binom{L-d}{p_{V}}/\binom{L}{p_{V}}=\binom{L-p_{V}}{d}/\binom{L}{d}, WiWj=(LdpW)/(LpW)=(LpWd)/(Ld)\langle W_{i}W_{j}\rangle=\binom{L-d}{p_{W}}/\binom{L}{p_{W}}=\binom{L-p_{W}}{d}/\binom{L}{d}. As pV>pWp_{V}>p_{W}, (LpVd)<(LpWd)\binom{L-p_{V}}{d}<\binom{L-p_{W}}{d}, so ViVj<WiWj\langle V_{i}V_{j}\rangle<\langle W_{i}W_{j}\rangle. As the SiS_{i} is chosen randomly, the correlation between any pair of VV is smaller than that of the corresponding pair of WW, and according to Slepian's lemma, the minimum of VV is smaller. Thus, larger pp leads to smaller minima.

We calculate the results for p=2p=2 and various qq. Shown in Table 1 are the results for L=100L=100, p=2p=2, and 3q203\leq q\leq 20. We see that cc decreases monotonically with increasing qq in this range.

Table 1: Self-consistent calculation of cc as a function of qq using Eq. (45). Here L=100L=100 and p=2p=2.
qq 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
cc 0.219 0.17 0.15 0.129 0.116 0.107 0.100 0.094 0.089 0.085 0.082 0.079 0.076 0.074 0.072 0.070 0.068 0.067

We also calculated cc for large qq and L=1000000L=1000000; the results are shown in Table 2.

Table 2: Self-consistent calculation of cc as function of qq using Eq. (45). Here L=1000000L=1000000 and p=2p=2.
qq 40 80 160 320 640 1280 2560 5120 10240
cc 0.049 0.038 0.031 0.026 0.022 0.019 0.016 0.014 0.013

It is apparent that all results for cc fall below 0.330.33, so the minimum of the GNK model, normalized to unit variance, will fall between 2Llnq-\sqrt{2L\ln q} and 0.82×2Llnq-0.82\times\sqrt{2L\ln q} for any pp and qq. As 0.8210.82\approx 1, we neglect this factor, i.e. we conservatively assure that the fitness is positive, not merely non-negative. When H2=2L/lnq\langle H^{2}\rangle=2L/\ln q, the minimum will be 2L-2L.

In summary, for all models discussed in this appendix, a shift of 2L2L assures that the fitness is positive.

9 Appendix B: Calculation of Taylor Series Expansion for Average Fitness

We here describe how the coefficients in the Taylor series expansion of f(t)\langle f(t)\rangle are calculated.

9.1 Organizing the Terms in the Taylor Series Expansion

We define f({na})=if(Si)ni/Nf(\{n_{a}\})=\sum_{i}f(S_{i})n_{i}/N, the average fitness of a population with configuration {na}\{n_{a}\}. The first order result of Eq. (2) is a linear combination of the P({na},t)P(\{n_{a}\},t), which can be divided into three parts: the ff part, from natural selection; the μ\mu part, from mutation; and the ν\nu part, from horizontal gene transfer. In compact form,

dPdt=P=fP+μP+νP\frac{dP}{dt}=\mathcal{L}P=\mathcal{L}_{f}P+\mathcal{L}_{\mu}P+\mathcal{L}_{\nu}P (47)

and

dfdt={na}f({na})P({na},t)\frac{d\langle f\rangle}{dt}=\sum_{\{n_{a}\}}f(\{n_{a}\})\mathcal{L}P(\{n_{a}\},t) (48)

In this way we obtain the form of higher order results, for example, the second order result can be expressed as

d2fdt2=ddtdfdt={na}f({na})2P={na}f({na})(f+μ+ν)(f+μ+ν)P\frac{d^{2}\langle f\rangle}{dt^{2}}=\frac{d}{dt}\frac{d\langle f\rangle}{dt}=\sum_{\{n_{a}\}}f(\{n_{a}\})\mathcal{L}^{2}P=\sum_{\{n_{a}\}}f(\{n_{a}\})(\mathcal{L}_{f}+\mathcal{L}_{\mu}+\mathcal{L}_{\nu})(\mathcal{L}_{f}+\mathcal{L}_{\mu}+\mathcal{L}_{\nu})P (49)

which is divided into 99 terms:

d2fdt2\displaystyle\frac{d^{2}\langle f\rangle}{dt^{2}} =\displaystyle= {na}f({na})(ff+fμ+fν+μf\displaystyle\sum_{\{n_{a}\}}f(\{n_{a}\})(\mathcal{L}_{f}\mathcal{L}_{f}+\mathcal{L}_{f}\mathcal{L}_{\mu}+\mathcal{L}_{f}\mathcal{L}_{\nu}+\mathcal{L}_{\mu}\mathcal{L}_{f} (50)
+μμ+μν+νf+νμ+νν)P\displaystyle+\mathcal{L}_{\mu}\mathcal{L}_{\mu}+\mathcal{L}_{\mu}\mathcal{L}_{\nu}+\mathcal{L}_{\nu}\mathcal{L}_{f}+\mathcal{L}_{\nu}\mathcal{L}_{\mu}+\mathcal{L}_{\nu}\mathcal{L}_{\nu})P

Note that f\mathcal{L}_{f}, μ\mathcal{L}_{\mu}, and ν\mathcal{L}_{\nu} do not commute with each other (though each of course commutes with itself); for example, fμμf\mathcal{L}_{f}\mathcal{L}_{\mu}\neq\mathcal{L}_{\mu}\mathcal{L}_{f}, since these two orderings correspond to different evolutionary processes of the population. As 2b=d2f(t)dt2|t=02b=\frac{d^{2}\langle f(t)\rangle}{dt^{2}}|_{t=0}, we can divide 2b2b into nine terms, named TffT_{ff}, TfμT_{f\mu}, TfνT_{f\nu}, TμfT_{\mu f}, TμμT_{\mu\mu}, TμνT_{\mu\nu}, TνfT_{\nu f}, TνμT_{\nu\mu}, and TννT_{\nu\nu}. Note that the naming is in a sense inverted: for example, the TfμT_{f\mu} term, which comes from fμP\mathcal{L}_{f}\mathcal{L}_{\mu}P, represents a process in which a mutation happens before natural selection.

In this way, the third order term consists of 2727 terms, and the rr-th order term consists of 3r3^{r} terms in general. Each term can be named according to the order of the \mathcal{L} operators involved. Below we discuss how to compute each part of an rr-th order term. First we discuss eliminating terms that do not contribute; then we calculate the remaining terms for the different models. We start with the simplest 22-spin, 22-state model, and then discuss the ppSK model, the planar Potts model, and the GNK model.

9.2 Identification of the Terms that Vanish

The number of terms increases exponentially with order, so we would like to eliminate terms that vanish on general grounds. We discuss this according to the initial conditions, that is, random initial sequences or identical initial sequences.

For both initial conditions, a term that does not contain a natural selection process vanishes. The reason is that the Taylor expansion coefficients correspond to changes of fitness, so each has a factor of Δf\Delta f proportional to some linear combination of HH, which is in turn a linear combination of the JJ or σ\sigma. For terms that contain only mutations and horizontal gene transfers, the result is proportional to a linear combination of the JJ or σ\sigma, which vanishes when we take the average. Only when there is natural selection can the JJ or σ\sigma be multiplied into second order or higher even order forms, which are non-zero when averaged. So only terms containing natural selection can contribute.

For random initial sequences, the modular term appears at second order. From the principle above, the first order terms TμT_{\mu} and TνT_{\nu}, and the second order terms TμμT_{\mu\mu}, TμνT_{\mu\nu}, TνμT_{\nu\mu}, and TννT_{\nu\nu} are all 0. In addition, for the TfμT_{f\mu} term, first an individual mutates and then natural selection happens. As the initial system is totally random, the mutation in the first step does not change the statistics of the system, so this term is 0. Thus, up to second order, there are five non-zero terms: TfT_{f}, TffT_{ff}, TfνT_{f\nu}, TμfT_{\mu f}, and TνfT_{\nu f}.

For identical initial sequences, the modular terms appear at fourth order, meaning that we need to calculate 3+9+27+81=1203+9+27+81=120 terms. Fortunately, the initial conditions are so special that we can greatly reduce this burden. As discussed above, terms not containing ff are automatically zero, so only 12024816=90120-2-4-8-16=90 terms can be non-zero. In addition, if a natural selection or horizontal gene transfer process happens first, before any mutation, then nothing changes, because every sequence is initially the same, making the term zero. So any term not ending with μ\mu is zero, leaving 0+1+5+19=250+1+5+19=25 potentially non-zero terms: no first order terms, 11 second order term TfμT_{f\mu}, 55 third order terms whose three-letter labels end with μ\mu and contain an ff in the first two letters, and 1919 fourth order terms whose four-letter labels end with μ\mu and contain an ff in the first three letters. Additionally, for a term starting with ν\nu (the last process is horizontal gene transfer), if fewer than 2 mutational processes occurred before the horizontal gene transfer, then the term is zero. The reason is that if only one mutation happened, all mutated individuals are the same, so a horizontal gene transfer changes the population only when it involves a mutated individual and an unmutated individual. And for a change to occur, the horizontal gene transfer must transfer the part of the sequence that is mutated. The probability of the unmutated one becoming a mutated one through horizontal gene transfer is the same as that of the mutated one becoming unmutated, and the changes of fitness of these two processes cancel, so the contribution is zero. In this way, TνfμT_{\nu f\mu} is zero, and TνffμT_{\nu ff\mu}, TνfνμT_{\nu f\nu\mu}, and TννfμT_{\nu\nu f\mu} are zero.
An additional two terms, TμνfμT_{\mu\nu f\mu} and TνμfμT_{\nu\mu f\mu}, are found to be zero after calculation, so there is one non-zero second order term, four non-zero third order terms, and fourteen non-zero fourth order terms, totaling nineteen non-zero terms.
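The counting of candidate terms through fourth order (0+1+5+19=250+1+5+19=25, before the additional ν\nu rules) follows from two constraints on the operator word: it must end in μ\mu and contain at least one ff among the remaining letters. A short enumeration confirms it:

```python
# Sketch: enumerate operator words over {f, m, v} (selection, mutation,
# horizontal gene transfer) and count those surviving the first two
# elimination rules for identical initial sequences: the word must end
# in a mutation and contain a selection among the remaining letters.
from itertools import product

def surviving_terms(order):
    count = 0
    for word in product("fmv", repeat=order):
        if word[-1] == "m" and "f" in word[:-1]:
            count += 1
    return count

print([surviving_terms(r) for r in (1, 2, 3, 4)])  # [0, 1, 5, 19]
```

The additional zero terms involving ν\nu, identified in the text, must then be removed by hand from this list.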

9.3 The p=2p=2 SK Model with Random Initial Sequences

To illustrate the calculation for random initial sequences, we use the TfT_{f} term as an example.

f|n1,,n2L\displaystyle\mathcal{L}_{f}|n_{1},...,n_{2^{L}}\rangle =\displaystyle= abnanbNfa(|na+1,,nb1|na,,nb)\displaystyle\sum_{ab}\frac{n_{a}n_{b}}{N}f_{a}(|n_{a}+1,...,n_{b}-1\rangle-|n_{a},...,n_{b}\rangle) (51)

So

$$\begin{aligned}
T_{f} &= \sum_{\{n_{a}\}}f(\{n_{a}\})\,\mathcal{L}_{f}P(\{n_{a}\},0) \\
&= \sum_{\{n_{a}\}}P(\{n_{a}\},0)\sum_{ab}\frac{n_{a}n_{b}}{N}f_{a}\left[f(|n_{a}+1,\ldots,n_{b}-1\rangle)-f(|n_{a},\ldots,n_{b}\rangle)\right] \\
&= \sum_{\{n_{a}\}}P(\{n_{a}\},0)\sum_{ab}\frac{n_{a}n_{b}}{N}f_{a}\,\frac{f_{a}-f_{b}}{N} \\
&= \left\langle\sum_{ab}\frac{n_{a}n_{b}}{N}f_{a}\,\frac{f_{a}-f_{b}}{N}\right\rangle \\
&= \left\langle\sum_{ab}\frac{n_{a}n_{b}}{N}f_{a}\,\frac{H_{a}-H_{b}}{N}\right\rangle \\
&= \left\langle\sum_{ab}\frac{n_{a}n_{b}}{N}H_{a}\,\frac{H_{a}-H_{b}}{N}\right\rangle
\end{aligned} \tag{52}$$

The last equality holds because the average over first-order terms of $H$ is always zero. As the initial sequences are completely random, from the point of view of a particular individual $a$ the other $N-1$ individuals are completely random, so their values of $H$ are uncorrelated with that of $a$. Since $\langle H\rangle=0$, it follows that $\langle\sum_{b\neq a}n_{b}H_{a}H_{b}\rangle=0$.

So,

$$\begin{aligned}
T_{f} &= \left\langle\sum_{ab}\frac{n_{a}n_{b}}{N}H_{a}\,\frac{H_{a}-H_{b}}{N}\right\rangle \\
&= \left\langle\sum_{ab}\frac{n_{a}n_{b}H_{a}^{2}}{N^{2}}\right\rangle \\
&= \frac{N(N-1)}{N^{2}}\langle H_{a}^{2}\rangle \\
&= (1-1/N)\sum_{ijkl}\langle J_{ij}J_{kl}\rangle\Delta_{ij}\Delta_{kl}\,s_{i}s_{j}s_{k}s_{l} \\
&= 2(1-1/N)\sum_{ij}\Delta_{ij}/C \\
&= 2L(1-1/N)
\end{aligned} \tag{53}$$

and,

$$a=T_{f} \tag{54}$$
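As a numerical sanity check of the result $T_{f}=(1-1/N)\langle H_{a}^{2}\rangle=2L(1-1/N)$, the following Monte Carlo sketch estimates $\langle H_{a}^{2}\rangle$ for a fully connected $p=2$ landscape. The normalization (symmetric couplings of variance $1/C$ with $C=2$, so that $\sum_{ij}\Delta_{ij}/C=L$) and the sizes are illustrative assumptions, not the paper's exact conventions:

```python
import math
import random

# Estimate <H^2> for H = sum_{i != j} J_ij s_i s_j with symmetric couplings
# J_ij = J_ji ~ N(0, 1/C).  Taking Delta_ij = 1 for all i != j and C = 2
# gives sum_ij Delta_ij / C = n(n-1)/2 = L, so <H^2> should equal 2L.
random.seed(0)
n = 8                         # number of spins (illustrative)
L = n * (n - 1) // 2          # number of interacting pairs
sigma = math.sqrt(1.0 / 2.0)  # sqrt(1/C) with C = 2
trials = 20000
acc = 0.0
for _ in range(trials):
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = J[j][i] = random.gauss(0.0, sigma)
    s = [random.choice((-1, 1)) for _ in range(n)]
    H = sum(J[i][j] * s[i] * s[j]
            for i in range(n) for j in range(n) if i != j)
    acc += H * H
print(acc / trials, 2 * L)  # estimate should be close to 2L = 56
```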

Similarly, we can obtain,

$$\begin{aligned}
T_{ff} &= -8L^{2}(1-1/N)/N \\
T_{\mu f} &= -8\mu L(1-1/N) \\
T_{f\nu} &= -4\nu L(1-1/N)\left[M+(1-M)/K\right]/N \\
T_{\nu f} &= 4\nu L(1-1/N)(1-3/N)\left(1-1/K\right)(M-1)
\end{aligned} \tag{55}$$

and

$$2b=T_{ff}+T_{\mu f}+T_{f\nu}+T_{\nu f} \tag{56}$$

9.4 The $p=2$ SK Model with Identical Initial Sequences

To illustrate how the terms are calculated for identical initial sequences, we use the $T_{f\mu}$ term as an example. We assume all initial sequences are $S_{0}$, so the initial state is $\{n_{a}\}_{0}=(N,0,\ldots,0)=|N\delta_{\boldsymbol{e},0}\rangle$. The operator $\mathcal{L}_{\mu}$ takes this state to

$$\mathcal{L}_{\mu}|N\delta_{\boldsymbol{e},0}\rangle=\mu N\sum_{\boldsymbol{a}=\partial 0}\left[|(N-1)\delta_{\boldsymbol{e},0}+\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle-|N\delta_{\boldsymbol{e},0}\rangle\right] \tag{57}$$

A subsequent f\mathcal{L}_{f} takes the state to

$$\begin{aligned}
\mathcal{L}_{f}\mathcal{L}_{\mu}|N\delta_{\boldsymbol{e},0}\rangle &= \mu N\,\mathcal{L}_{f}\sum_{\boldsymbol{a}=\partial 0}\left[|(N-1)\delta_{\boldsymbol{e},0}+\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle-|N\delta_{\boldsymbol{e},0}\rangle\right] \\
&= \mu N\sum_{\boldsymbol{a}=\partial 0}\Bigg[\frac{N-1}{N}f(S_{0})\left(|N\delta_{\boldsymbol{e},0}\rangle-|(N-1)\delta_{\boldsymbol{e},0}+\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle\right) \\
&\qquad+\frac{N-1}{N}f(S_{\boldsymbol{a}})\left(|(N-2)\delta_{\boldsymbol{e},0}+2\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle-|(N-1)\delta_{\boldsymbol{e},0}+\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle\right)\Bigg] \\
&= \mu(N-1)\sum_{\boldsymbol{a}=\partial 0}\Big[f(S_{0})\left(|N\delta_{\boldsymbol{e},0}\rangle-|(N-1)\delta_{\boldsymbol{e},0}+\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle\right) \\
&\qquad+f(S_{\boldsymbol{a}})\left(|(N-2)\delta_{\boldsymbol{e},0}+2\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle-|(N-1)\delta_{\boldsymbol{e},0}+\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle\right)\Big]
\end{aligned} \tag{58}$$

So after the two operations, $P(|N\delta_{\boldsymbol{e},0}\rangle,0)=\mu(N-1)\sum_{\boldsymbol{a}=\partial 0}f(S_{0})$, $P(|(N-1)\delta_{\boldsymbol{e},0}+\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle,0)=-\mu(N-1)\sum_{\boldsymbol{a}=\partial 0}\left(f(S_{\boldsymbol{a}})+f(S_{0})\right)$, and $P(|(N-2)\delta_{\boldsymbol{e},0}+2\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle,0)=\mu(N-1)\sum_{\boldsymbol{a}=\partial 0}f(S_{\boldsymbol{a}})$, and we find

$$\begin{aligned}
T_{f\mu} &= \sum_{\{n_{a}\}}f(\{n_{a}\})\,\mathcal{L}_{f}\mathcal{L}_{\mu}P(\{n_{a}\},0) \\
&= f(|N\delta_{\boldsymbol{e},0}\rangle)\,P(|N\delta_{\boldsymbol{e},0}\rangle,0) \\
&\quad+f(|(N-1)\delta_{\boldsymbol{e},0}+\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle)\,P(|(N-1)\delta_{\boldsymbol{e},0}+\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle,0) \\
&\quad+f(|(N-2)\delta_{\boldsymbol{e},0}+2\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle)\,P(|(N-2)\delta_{\boldsymbol{e},0}+2\delta_{\boldsymbol{e},\boldsymbol{a}}\rangle,0) \\
&= \mu(1-1/N)\sum_{\boldsymbol{a}=\partial 0}\big[Nf(S_{0})^{2}-(N-1)f(S_{0})\left(f(S_{0})+f(S_{\boldsymbol{a}})\right) \\
&\quad-f(S_{\boldsymbol{a}})\left(f(S_{0})+f(S_{\boldsymbol{a}})\right)+(N-2)f(S_{\boldsymbol{a}})f(S_{0})+2f(S_{\boldsymbol{a}})^{2}\big] \\
&= \mu(1-1/N)\sum_{\boldsymbol{a}=\partial 0}\left(f(S_{\boldsymbol{a}})-f(S_{0})\right)^{2}
\end{aligned} \tag{59}$$

in which

$$\begin{aligned}
\sum_{\boldsymbol{a}=\partial 0}[f(S_{\boldsymbol{a}})-f(S_{0})]^{2} &= \sum_{\boldsymbol{a}=\partial 0}\sum_{ij}\sum_{kl}J_{ij}J_{kl}\Delta_{ij}\Delta_{kl}\,[1-(1-2\delta_{ai})(1-2\delta_{aj})] \\
&\quad\times[1-(1-2\delta_{ak})(1-2\delta_{al})]
\end{aligned} \tag{60}$$

If we average over $J_{ij}J_{kl}$, we find

$$\begin{aligned}
\sum_{\boldsymbol{a}=\partial 0}[f(S_{\boldsymbol{a}})-f(S_{0})]^{2} &= \sum_{\boldsymbol{a}=\partial 0}\sum_{ijkl}\langle J_{ij}J_{kl}\rangle\Delta_{ij}\Delta_{kl}\,[1-(1-2\delta_{ai})(1-2\delta_{aj})] \\
&\quad\times[1-(1-2\delta_{ak})(1-2\delta_{al})] \\
&= \sum_{\boldsymbol{a}=\partial 0}\sum_{ijkl}(\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk})(1-\delta_{ij})\Delta_{ij}\Delta_{kl}/C \\
&\quad\times[1-(1-2\delta_{ai})(1-2\delta_{aj})][1-(1-2\delta_{ak})(1-2\delta_{al})] \\
&= 2\sum_{\boldsymbol{a}=\partial 0}\sum_{ij}\Delta_{ij}[1-(1-2\delta_{ai})(1-2\delta_{aj})]^{2}/C \\
&= 8\sum_{\boldsymbol{a}=\partial 0}\sum_{ij}\Delta_{ij}(\delta_{ai}+\delta_{aj})/C \\
&= 16L
\end{aligned} \tag{61}$$

So

$$T_{f\mu}=16\mu L(1-1/N) \tag{62}$$

and

$$2b=T_{f\mu} \tag{63}$$
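The simplification used in Eq. (61), that for $i\neq j$ the squared bracket reduces to $4(\delta_{ai}+\delta_{aj})$, holds because the bracket equals 2 when the flipped spin $\boldsymbol{a}$ is $i$ or $j$ and 0 otherwise. This can be verified directly (an illustrative check):

```python
# Check the delta-function identity used in Eq. (61): for i != j,
#   [1 - (1 - 2*d(a,i))(1 - 2*d(a,j))]**2 == 4*(d(a,i) + d(a,j)),
# since the bracket is 2 when the flip position a is i or j, else 0.
d = lambda x, y: int(x == y)
L = 6  # illustrative number of spins
all_match = all(
    (1 - (1 - 2 * d(a, i)) * (1 - 2 * d(a, j))) ** 2 == 4 * (d(a, i) + d(a, j))
    for a in range(L) for i in range(L) for j in range(L) if i != j
)
print(all_match)  # True
```

Summing $4(\delta_{ai}+\delta_{aj})$ over the flip positions $\boldsymbol{a}$ counts each interacting pair twice, which is the origin of the final factor of $16L$ once $\sum_{ij}\Delta_{ij}/C=L$.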

Similarly, we find

$$\begin{aligned}
T_{f\mu\mu} &= -128\mu^{2}L(1-1/N) \\
T_{f\nu\mu} &= -32\mu\nu L(1-1/N)/N \\
T_{ff\mu} &= -64\mu L^{2}(1-1/N)/N \\
T_{\mu f\mu} &= -64\mu^{2}L(1-1/N) \\
T_{fff\mu} &= 256\mu L^{3}(1-1/N)/N^{2}+128\mu L^{2}(1-1/N)/N^{2} \\
&\quad+256\mu L(1-1/N)\left(3-\frac{18}{N}+\frac{23}{N^{2}}\right) \\
T_{ff\mu\mu} &= 512\mu^{2}L^{2}(1-1/N)/N \\
T_{ff\nu\mu} &= 128\mu\nu L^{2}(1-1/N)/N^{2} \\
T_{f\mu f\mu} &= 512\mu^{2}L^{2}(1-1/N)/N \\
T_{f\mu\mu\mu} &= 1024\mu^{3}L(1-1/N) \\
T_{f\mu\nu\mu} &= 256\mu^{2}\nu L(1-1/N)/N \\
T_{f\nu f\mu} &= 128\mu\nu L^{2}(1-1/N)/N^{2} \\
T_{f\nu\mu\mu} &= 128\mu^{2}\nu L(1-1/N)\left[2+(1-M)\left(1-1/K\right)\right]/N \\
T_{f\nu\nu\mu} &= 64\mu\nu^{2}L(1-1/N)/N^{2} \\
T_{\mu ff\mu} &= 256\mu^{2}L^{2}(1-1/N)/N \\
T_{\mu f\mu\mu} &= 512\mu^{3}L(1-1/N) \\
T_{\mu f\nu\mu} &= 128\mu^{2}\nu L(1-1/N)/N \\
T_{\mu\mu f\mu} &= 256\mu^{3}L(1-1/N) \\
T_{\nu f\mu\mu} &= -128\mu^{2}\nu L(1-1/N)(1-3/N)(1-M)\left(1-1/K\right)
\end{aligned} \tag{64}$$

and

$$\begin{aligned}
(3!)\,c &= T_{f\mu\mu}+T_{f\nu\mu}+T_{ff\mu}+T_{\mu f\mu} \\
(4!)\,d &= T_{fff\mu}+T_{ff\mu\mu}+T_{ff\nu\mu}+T_{f\mu f\mu}+T_{f\mu\mu\mu}+T_{f\mu\nu\mu}+T_{f\nu f\mu} \\
&\quad+T_{f\nu\mu\mu}+T_{f\nu\nu\mu}+T_{\mu ff\mu}+T_{\mu f\mu\mu}+T_{\mu f\nu\mu}+T_{\mu\mu f\mu}+T_{\nu f\mu\mu}
\end{aligned} \tag{65}$$

9.5 The $p$-Spin SK Interaction

For the $p$-spin interaction, the relationships among the different $J_{i_{1},\ldots,i_{p}}$ are quite similar to the $p=2$ case, and the principles for eliminating zero terms are the same. Moreover, the physical processes corresponding to each Taylor expansion term remain the same, so we need only change the form of $H$. For example, to obtain the $T_{f\mu}$ term for identical initial conditions, Eq. (59) still holds, but we must use the new $H$ to calculate $\sum_{\boldsymbol{a}=\partial 0}[f(S_{\boldsymbol{a}})-f(S_{0})]^{2}$. For the new form,

$$\begin{aligned}
\sum_{\boldsymbol{a}=\partial 0}[f(S_{\boldsymbol{a}})-f(S_{0})]^{2} &= \sum_{\boldsymbol{a}=\partial 0}\sum_{i_{1}\ldots i_{p}j_{1}\ldots j_{p}}\langle J_{i_{1}\ldots i_{p}}J_{j_{1}\ldots j_{p}}\rangle\Delta_{i_{1}\ldots i_{p}}\Delta_{j_{1}\ldots j_{p}} \\
&\quad\times\left[1-\Pi_{q=1}^{p}(1-2\delta_{a,i_{q}})\right]\left[1-\Pi_{q=1}^{p}(1-2\delta_{a,j_{q}})\right] \\
&= \sum_{\boldsymbol{a}=\partial 0}\sum_{i_{1}\ldots i_{p}}p!\,\frac{2}{Cp!}\left(2\sum_{q=1}^{p}\delta_{a,i_{q}}\right)^{2}\Delta_{i_{1}\ldots i_{p}} \\
&= \frac{8}{C}\sum_{\boldsymbol{a}=\partial 0}\sum_{i_{1}\ldots i_{p}}\Delta_{i_{1}\ldots i_{p}}\sum_{q=1}^{p}\delta_{a,i_{q}} \\
&= \frac{8p}{C}\sum_{i_{1}\ldots i_{p}}\Delta_{i_{1}\ldots i_{p}} \\
&= 8pL
\end{aligned} \tag{66}$$

So $T_{f\mu}=8\mu pL(1-1/N)$. In this way, we obtain, for the $p$-spin interaction with random initial sequences,

$$\begin{aligned}
T_{f} &= 2L(1-1/N) \\
T_{ff} &= -8L^{2}(1-1/N)/N \\
T_{\mu f} &= -4\mu pL(1-1/N) \\
T_{f\nu} &= -4\nu L(1-1/N)\left(M+\left(\tfrac{1}{K}\right)^{p-1}(1-M)\right)/N \\
T_{\nu f} &= 2p\nu L(1-1/N)(1-3/N)\left(1-\left(\tfrac{1}{K}\right)^{p-1}\right)(M-1)
\end{aligned} \tag{67}$$

and

$$\begin{aligned}
a &= T_{f} \\
2b &= T_{ff}+T_{\mu f}+T_{f\nu}+T_{\nu f}
\end{aligned} \tag{68}$$

Also, for initial conditions of identical sequences, we find

$$\begin{aligned}
T_{f\mu\mu} &= -64\mu^{2}pL(1-1/N) \\
T_{f\nu\mu} &= -16\mu\nu pL(1-1/N)/N \\
T_{ff\mu} &= -32\mu pL^{2}(1-1/N)/N \\
T_{\mu f\mu} &= -16\mu^{2}p^{2}L(1-1/N) \\
T_{fff\mu} &= 128\mu pL^{3}(1-1/N)/N^{2}+64\mu pL^{2}(1-1/N)/N^{2} \\
&\quad+64\mu p^{2}L(1-1/N)\left(3-\frac{18}{N}+\frac{23}{N^{2}}\right) \\
T_{ff\mu\mu} &= 128\mu^{2}p^{2}L^{2}(1-1/N)/N \\
T_{ff\nu\mu} &= 64\mu\nu pL^{2}(1-1/N)/N^{2} \\
T_{f\mu f\mu} &= 128\mu^{2}p^{2}L^{2}(1-1/N)/N \\
T_{f\mu\mu\mu} &= 128\mu^{3}p^{3}L(1-1/N) \\
T_{f\mu\nu\mu} &= 64\mu^{2}\nu p^{2}L(1-1/N)/N \\
T_{f\nu f\mu} &= 64\mu\nu pL^{2}(1-1/N)/N^{2} \\
T_{f\nu\mu\mu} &= 64\mu^{2}\nu L(1-1/N)\left[p^{2}+p(p-1)(1-M)\left(1-1/K\right)\right]/N \\
T_{f\nu\nu\mu} &= 32\mu\nu^{2}pL(1-1/N)/N^{2} \\
T_{\mu ff\mu} &= 64\mu^{2}p^{2}L^{2}(1-1/N)/N \\
T_{\mu f\mu\mu} &= 64\mu^{3}p^{3}L(1-1/N) \\
T_{\mu f\nu\mu} &= 32\mu^{2}\nu p^{2}L(1-1/N)/N \\
T_{\mu\mu f\mu} &= 32\mu^{3}p^{3}L(1-1/N) \\
T_{\nu f\mu\mu} &= -64\mu^{2}\nu p(p-1)L(1-1/N)(1-3/N)(1-M)\left(1-1/K\right)
\end{aligned} \tag{69}$$

and

$$\begin{aligned}
a &= T_{f} \\
(2!)\,b &= T_{ff}+T_{\mu f}+T_{f\nu}+T_{\nu f} \\
(3!)\,c &= T_{f\mu\mu}+T_{f\nu\mu}+T_{ff\mu}+T_{\mu f\mu} \\
(4!)\,d &= T_{fff\mu}+T_{ff\mu\mu}+T_{ff\nu\mu}+T_{f\mu f\mu}+T_{f\mu\mu\mu}+T_{f\mu\nu\mu}+T_{f\nu f\mu} \\
&\quad+T_{f\nu\mu\mu}+T_{f\nu\nu\mu}+T_{\mu ff\mu}+T_{\mu f\mu\mu}+T_{\mu f\nu\mu}+T_{\mu\mu f\mu}+T_{\nu f\mu\mu}
\end{aligned} \tag{70}$$

9.6 The Planar Potts Model with Random Initial Sequences

The non-zero terms are still $T_{f}$, $T_{ff}$, $T_{\mu f}$, $T_{f\nu}$, and $T_{\nu f}$. As $\langle H^{2}\rangle$ is changed, all terms also change. In addition, the $T_{\mu f}$ term contains a mutational process, which depends on the number of directions a spin can take. We assume that after a mutation, a spin randomly chooses a direction different from its previous one. Using methods similar to Eq. (52), we find $T_{\mu f}=\mu\sum_{\partial a}(1-1/N)\left\langle H_{a}(H_{\partial a}-H_{a})\right\rangle$. Considering the different directions, we find

$$\begin{aligned}
\sum_{\partial a}&\left\langle H_{a}(H_{\partial a}-H_{a})\right\rangle \\
&= \sum_{\partial a}\left\langle\sum_{ij}J_{ij}\Delta_{ij}s_{i}s_{j}\cos\theta_{ij}\sum_{kl}J_{kl}\Delta_{kl}s_{k}s_{l}(\cos\theta_{kl}^{\prime}-\cos\theta_{kl})\right\rangle \\
&= \sum_{\partial a}\left\langle\sum_{ij}\frac{2}{C}\Delta_{ij}\cos\theta_{ij}(\cos\theta_{ij}^{\prime}-\cos\theta_{ij})\right\rangle
\end{aligned} \tag{71}$$

where $\theta_{ij}^{\prime}$ equals $\theta_{ij}$ if neither $i$ nor $j$ is mutated; otherwise $\theta_{ij}^{\prime}$ is equally likely to take any allowed value other than $\theta_{ij}$. So we need only consider the case in which either $i$ or $j$ is mutated, obtaining an extra factor of 2,

$$\sum_{\partial a}\left\langle\sum_{ij}\frac{2}{C}\Delta_{ij}\cos\theta_{ij}(\cos\theta_{ij}^{\prime}-\cos\theta_{ij})\right\rangle = \frac{4}{C}\left\langle\sum_{ij}\Delta_{ij}\cos\theta_{ij}(\cos\theta_{ij}^{\prime}-\cos\theta_{ij})\right\rangle \tag{72}$$

and

$$\begin{aligned}
\langle\cos\theta_{ij}\cos\theta_{ij}^{\prime}\rangle &= \left\langle\frac{1}{q-1}\sum_{r\neq s}\cos\theta_{r}\cos\theta_{s}\right\rangle \\
&= \left\langle\frac{1}{q-1}\Big(\sum_{r}\cos\theta_{r}-\cos\theta_{s}\Big)\cos\theta_{s}\right\rangle \\
&= -\frac{1}{q-1}\langle\cos^{2}\theta_{s}\rangle
\end{aligned} \tag{73}$$

Then,

$$\begin{aligned}
\left\langle H_{a}(H_{\partial a}-H_{a})\right\rangle &= \left\langle\frac{4}{C}\sum_{ij}\Delta_{ij}\cos\theta_{ij}(\cos\theta_{ij}^{\prime}-\cos\theta_{ij})\right\rangle \\
&= \frac{4}{C}\sum_{ij}\Delta_{ij}\left(-\frac{1}{q-1}-1\right)\langle\cos^{2}\theta\rangle \\
&= -4L\,\frac{q}{q-1}\langle\cos^{2}\theta\rangle \\
&= -2L\,\frac{q}{q-1}(1+\delta_{q,2})
\end{aligned} \tag{74}$$

So

$$T_{\mu f}=-2\mu L(1-1/N)\frac{q}{q-1}(1+\delta_{q,2}) \tag{75}$$
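The two averages used above, $\langle\cos^{2}\theta\rangle=(1+\delta_{q,2})/2$ and $\langle\cos\theta\,\cos\theta^{\prime}\rangle=-\langle\cos^{2}\theta\rangle/(q-1)$, can be checked numerically for the planar Potts angles $\theta_{r}=2\pi r/q$; the function names below are introduced here for illustration:

```python
import math

def mean_cos2(q):
    """Average of cos^2(theta) over the q equally spaced planar Potts angles."""
    return sum(math.cos(2 * math.pi * r / q) ** 2 for r in range(q)) / q

def replacement_corr(q):
    """<cos(theta) cos(theta')> with theta' uniform over the q-1 other angles."""
    tot = sum(math.cos(2 * math.pi * r / q) * math.cos(2 * math.pi * s / q)
              for s in range(q) for r in range(q) if r != s)
    return tot / (q * (q - 1))

for q in (2, 3, 4, 5, 8):
    expected = (1 + (q == 2)) / 2          # (1 + delta_{q,2}) / 2
    assert abs(mean_cos2(q) - expected) < 1e-12
    assert abs(replacement_corr(q) + expected / (q - 1)) < 1e-12
print("ok")
```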

The other terms are

$$a=T_{f}=L(1-1/N)(1+\delta_{q,2}) \tag{76}$$

$$\begin{aligned}
T_{ff} &= -4(1+\delta_{q,2})L^{2}(1-1/N)/N \\
T_{\mu f} &= -2(1+\delta_{q,2})\mu L(1-1/N)\frac{q}{q-1} \\
T_{f\nu} &= -2(1+\delta_{q,2})\nu L(1-1/N)\left[M+(1-M)/K\right]/N \\
T_{\nu f} &= 2(1+\delta_{q,2})\nu L(1-1/N)(1-3/N)(1-1/K)(M-1)
\end{aligned} \tag{77}$$

and

$$2b=T_{ff}+T_{\mu f}+T_{f\nu}+T_{\nu f} \tag{78}$$

9.7 GNK Model Calculation

The processes corresponding to the different terms are still the same, but for the GNK model $H$ is quite different, so the Taylor expansion results change. For example, for two randomly chosen sequences $S_{a}$ and $S_{b}$, the correlation is non-zero; it is

$$\begin{aligned}
\langle H_{a}H_{b}\rangle &= \left\langle\sum_{i_{1}\ldots i_{p}}\sigma_{i_{1}\ldots i_{p}}(s_{i_{1}}^{a}\ldots s_{i_{p}}^{a})\Delta_{i_{1}\ldots i_{p}}\sum_{j_{1}\ldots j_{p}}\sigma_{j_{1}\ldots j_{p}}(s_{j_{1}}^{b}\ldots s_{j_{p}}^{b})\Delta_{j_{1}\ldots j_{p}}\right\rangle \\
&= (p!)^{2}\sum_{i_{1}<\ldots<i_{p}}\langle\sigma_{i_{1}\ldots i_{p}}(s_{i_{1}}^{a}\ldots s_{i_{p}}^{a})\,\sigma_{i_{1}\ldots i_{p}}(s_{i_{1}}^{b}\ldots s_{i_{p}}^{b})\rangle\Delta_{i_{1}\ldots i_{p}} \\
&= (p!)^{2}\sum_{i_{1}<\ldots<i_{p}}\frac{1}{q^{p}}\langle\sigma^{2}\rangle\Delta_{i_{1}\ldots i_{p}} \\
&= p!\sum_{i_{1}\ldots i_{p}}\frac{1}{q^{p}}\langle\sigma^{2}\rangle\Delta_{i_{1}\ldots i_{p}} \\
&= p!\,\frac{1}{q^{p}}\,\frac{2}{p!\,C\ln q}\,LC \\
&= \frac{2L}{q^{p}\ln q}
\end{aligned} \tag{79}$$

where the $1/q^{p}$ factor comes from the fact that the correlations of $\sigma$ are non-zero only when the states of all corresponding spins match.
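The origin of this factor can be confirmed by exhaustive enumeration: two independent, uniformly random $q$-state sequences match on all $p$ designated sites with probability $1/q^{p}$. The sketch below is illustrative (`match_prob` and the chosen $q$, $p$ values are introduced here):

```python
from itertools import product

# Exhaustive check that two independent uniformly random q-state sequences
# agree on all p designated sites with probability 1/q**p -- the origin of
# the 1/q^p factor in <H_a H_b> for the GNK model.
def match_prob(q, p):
    hits = total = 0
    for a in product(range(q), repeat=p):
        for b in product(range(q), repeat=p):
            total += 1
            hits += (a == b)
    return hits / total

print(match_prob(2, 3), match_prob(3, 2))  # 1/8 and 1/9
```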

Similarly, we calculate all the terms and obtain

$$a=T_{f}=2L(1-1/N)(1-1/q^{p})/\ln q \tag{80}$$

and

$$\begin{aligned}
2b &= T_{ff}+T_{\mu f}+T_{f\nu}+T_{\nu f} \\
T_{ff} &= -\frac{8L^{2}(1-1/N)}{N\ln q}(1-1/q^{p}) \\
T_{\mu f} &= -2\mu pL(1-1/N)/\ln q \\
T_{f\nu} &= \frac{4\nu K(1-1/N)}{N\ln q}\left[L(1-M)\left(\frac{1-1/K+q/K}{q}\right)^{p}+LM/K+L(1-1/K)M/q^{p}\right] \\
T_{\nu f} &= \frac{2\nu K}{\ln q}(M-1)(1-1/N)(1-3/N) \\
&\quad\times\left[1+1/q^{p}-\left(\frac{1-1/K+q/K}{q}\right)^{p}-\left(\frac{q-q/K+1/K}{q}\right)^{p}\right]
\end{aligned} \tag{81}$$

10 Appendix C: Calculation of the Taylor Series Expansion for the Sequence Divergence

We here describe how the Taylor series expansion of the sequence divergence, Eq. (4), is calculated.

As the divergence is $D=\frac{1}{N}\sum_{\alpha=1}^{N}\langle\frac{L-S_{\alpha}(t)\cdot S_{\alpha}(0)}{2}\rangle$, it is determined by the changes to the sequences of the population, which can be tracked using Eq. (2). The order of a term corresponds to the number of processes involved; for example, second-order terms involve two processes. Using the conventions developed in Section 9.1, we classify the terms according to the evolutionary processes they involve; for example, the term resulting from a mutational process followed by a horizontal gene transfer is called $D_{\nu\mu}$, following the naming convention of Section 9.1. So

$$\begin{aligned}
\alpha &= D_{f}+D_{\mu}+D_{\nu} \\
(2!)\,\beta &= D_{ff}+D_{f\mu}+D_{f\nu}+D_{\mu f}+D_{\mu\mu}+D_{\mu\nu}+D_{\nu f}+D_{\nu\mu}+D_{\nu\nu}
\end{aligned} \tag{82}$$

We use the $D_{f}$ term as an example. From Eq. (51), after a natural selection process, one sequence $S_{b}$ is replaced by a sequence $S_{a}$. If it is replaced by itself, nothing changes. Otherwise, since the initial sequences are completely random, the number of sites changed is on average $L/2$, and the probability of this replacement is $\langle f(a)\rangle(1-1/N)=2L(1-1/N)$. As only this one sequence in the whole population is changed, the divergence can be calculated as

$$\begin{aligned}
D_{f} &= \frac{1}{N}\left\langle\frac{L-S_{b}(t)\cdot S_{b}(0)}{2}\right\rangle \\
&= 2L(1-1/N)\,\frac{L-(L/2-L/2)}{2} \\
&= L^{2}(1-1/N)
\end{aligned} \tag{83}$$

Similarly, we can obtain other terms,

$$\begin{aligned}
D_{\mu} &= \mu L \\
D_{\nu} &= \nu L(1-1/N)/2 \\
D_{ff} &= -2L^{3}(1-1/N)+L^{2}(1-1/N)(1-3/N) \\
D_{f\mu} &= D_{\mu f}=-2\mu L^{2}(1-1/N) \\
D_{f\nu} &= D_{\nu f}=-\nu L^{2}(1-1/N) \\
D_{\mu\mu} &= -2\mu^{2}L \\
D_{\mu\nu} &= D_{\nu\mu}=-\mu\nu L(1-1/N) \\
D_{\nu\nu} &= -\nu^{2}L(1-1/N)/2
\end{aligned} \tag{84}$$

Adding these terms together gives Eq. (4).

References

  • [1] C. H. Waddington. Canalization of development and the inheritance of acquired characters. Nature, 150:563–565, 1942.
  • [2] H. A. Simon. The architecture of complexity. Proc. Amer. Phil. Soc., 106:467–482, 1962.
  • [3] L. H. Hartwell, J. J. Hopfield, S. Leibler, and A. W. Murray. From molecular to modular cell biology. Nature, 402:C47–C52, 1999.
  • [4] R. Rojas. Neural Networks: A Systematic Introduction. Springer, New York, 1996. Ch 16.
  • [5] E. Grohmann. Horizontal gene transfer between bacteria under natural conditions. In I. Ahmad, F. Ahmad, and J. Pichtel, editors, Microbes and Microbial Technology, pages 163–187. Springer, New York, 2011.
  • [6] M. B. Gogarten, J. P. Gogarten, and L. Olendzenski, editors. Horizontal Gene Transfer: Genomes in Flux. Methods in Molecular Biology, Vol. 532. Humana Press, New York, 2009.
  • [7] N. Chia and N. Goldenfeld. Statistical mechanics of horizontal gene transfer in evolutionary ecology. J. Stat. Phys., 142:1287–1301, 2011.
  • [8] M. S. Breen et al. Epistasis as the primary factor in molecular evolution. Nature, 490:535–538, 2012.
  • [9] J. Sun and M. W. Deem. Spontaneous emergence of modularity in a model of evolving individuals. Phys. Rev. Lett., 99:228107, 2007.
  • [10] N. Kashtan, M. Parter, E. Dekel, A. E. Mayo, and U. Alon. Extinctions in heterogeneous environments and the evolution of modularity. Evolution, 63:1964–1975, 2009.
  • [11] J.-M. Park and M. W. Deem. Schwinger boson formulation and solution of the Crow-Kimura and Eigen models of quasispecies theory. J. Stat. Phys., 125:975–1015, 2006.
  • [12] E. Cohen, D. A. Kessler, and H. Levine. Recombination dramatically speeds up evolution of finite populations. Phys. Rev. Lett., 94:098102, 2005.
  • [13] S. Goyal, D. J. Balick, E. R. Jerison, R. A. Neher, B. I. Shraiman, and M. M. Desai. Dynamic mutation–selection balance as an evolutionary attractor. Genetics, 191:1309–1319, 2012.
  • [14] B. Derrida and L. Peliti. Evolution in a flat fitness landscape. Bull. Math. Biol., 53:355–382, 1991.
  • [15] G. Bianconi, D. Fichera, S. Franz, and L. Peliti. Modeling microevolution in a changing environment: the evolving quasispecies and the diluted champion process. J. Stat. Mech., 2011:P08022, 2011.
  • [16] J.-M. Park and M. W. Deem. Phase diagrams of quasispecies theory with recombination and horizontal gene transfer. Phys. Rev. Lett., 98:058101, 2007.
  • [17] E. T. Munoz, J.-M. Park, and M. W. Deem. Quasispecies theory for horizontal gene transfer and recombination. Phys. Rev. E, 78:061921, 2008.
  • [18] Z. Avetisyan and D. B. Saakian. Recombination in one- and two-dimensional fitness landscapes. Phys. Rev. E, 81:051916, 2010.
  • [19] D. B. Saakian. Evolutionary advantage via common action of recombination and neutrality. Phys. Rev. E, 88:052717, 2013.
  • [20] D. B. Saakian, Z. Kirakosyan, and C. K. Hu. Biological evolution in a multidimensional fitness landscape. Phys. Rev. E, 86:031920, 2012.
  • [21] V. S. Dotsenko. Fractal dynamics of spin glasses. J. Phys. C: Solid State Phys., 18:6023–6031, 1985.
  • [22] J.-M. Park, L. R. Niestemski, and M. W. Deem. Quasispecies theory for evolution of modularity. Phys. Rev. E, DOI: 10.1103/PhysRevE.00.002700, 2014. arXiv:1211.5646.
  • [23] G. J. D. Smith, D. Vijaykrishna, J. Bahl, S. J. Lycett, M. Worobey, O. G. Pybus, S. K. Ma, C. L. Cheung, J. Raghwani, S. Bhatt, J. S. Malik Peiris, Y. Guan, and A. Rambaut. Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic. Nature, 459:1122–1125, 2009.
  • [24] A. Rambaut, O. G. Pybus, M. I. Nelson, C. Viboud, J. K. Taubenberger, and E. C. Holmes. The genomic and epidemiological dynamics of human influenza a virus. Nature, 453:615–619, 2008.
  • [25] N. Marshall, L. Priyamvada, Z. Ende, J. Steel, and A. C. Lowen. Influenza virus reassortment occurs with high frequency in the absence of segment mismatch. PLoS Pathog., 9:e1003421, 2013.
  • [26] V. Trifonov, H. Khiabanian, and R. Rabadan. Geographic dependence, surveillance, and origins of the 2009 influenza A (H1N1) virus. N. Engl. J. Med., 361:115–119, 2009.
  • [27] M. Łuksza and M. Lässig. A predictive fitness model for influenza. Nature, 507:57–61, 2014.
  • [28] D. J. Smith, A. S. Lapedes, J. C. de Jong, T. M. Bestebroer, G. F. Rimmelzwaan, A. D. M. E. Osterhaus, and R. A. M. Fouchier. Mapping the antigenic and genetic evolution of influenza virus. Science, 305:371–376, 2004.
  • [29] V. Gupta, D. J. Earl, and M. W. Deem. Quantifying influenza vaccine efficacy and antigenic distance. Vaccine, 24:3881–3888, 2006.
  • [30] J. He and M. W. Deem. Low-dimensional clustering detects incipient dominant influenza strain clusters. PEDS, 23:935–946, 2010.
  • [31] M. W. Deem and H.-Y. Lee. Sequence space localization in the immune system response to vaccination and disease. Phys. Rev. Lett., 91:068101, 2003.
  • [32] F. Y. Wu. The Potts model. Rev. Mod. Phys., 54:235–268, 1982.
  • [33] E. T. Munoz, J.-M. Park, and M. W. Deem. Solution of the Crow-Kimura and Eigen models for alphabets of arbitrary size by Schwinger spin coherent states. J. Stat. Phys., 135:429–465, 2009.
  • [34] E. P. Wigner. On the distribution of the roots of certain symmetric matrices. Annals of Mathematics, pages 325–327, 1958.
  • [35] W. V. Li and Q.-M. Shao. Gaussian processes: inequalities, small ball probabilities and applications. Stochastic processes: theory and methods, 19:533–597, 2001.
  • [36] S.-Y. Kim, S.-J. Lee, and J. Lee. Ground-state energy and energy landscape of the Sherrington-Kirkpatrick spin glass. Phys. Rev. B, 76:184412, 2007.
  • [37] V. M. de Oliveira and J. F. Fontanari. Landscape statistics of the $p$-spin Ising model. J. Phys. A: Math. Gen., 30:8445–8457, 1997.
  • [38] G. S. Grest and C. M. Soukoulis. Ground-state properties of the infinite-range vector spin-glasses. Physical Review B, 28:2886–2889, 1983.
  • [39] J. M. Kosterlitz, D. J. Thouless, and R. C. Jones. Spherical model of a spin-glass. Phys. Rev. Lett., 36:1217–1220, 1976.
  • [40] G. Pisier. The volume of convex bodies and Banach space geometry, volume 94. Cambridge University Press, New York, 1999.
  • [41] D. B. Owen and G. P. Steck. Moments of order statistics from the equicorrelated multivariate normal distribution. The Annals of Mathematical Statistics, 33:1286–1291, 1962.