Cooperative action in eukaryotic gene regulation: physical properties of a viral example
Abstract
The Epstein-Barr virus (EBV) infects more than 90% of the human population, and causes several diseases, both serious and mild. It is a tumor virus, and has been widely studied as a model system for gene (de)regulation in humans. A central feature of the EBV life cycle is its ability to persist in human B cells in states denoted latency I, II and III. In latency III the host cell is driven to cell proliferation and hence expansion of the viral population, but does not enter the lytic pathway, and no new virions are produced, while the latency I state is almost completely dormant. In this paper we study a physico-chemical model of the switch between latency I and latency III in EBV. We show that the unusually large number of binding sites of two competing transcription factors, one viral and one from the host, serves to make the switch sharper (higher Hill coefficient), either by cooperative binding between molecules of the same species, or by competition between the two species if there is sufficient steric hindrance.
pacs: 87.16.Yc, 87.17.Aa, 05.90.+m
I Introduction
Genetic switches, mainly in bacteria, have recently interested statistical physicists, and work in this direction has been extensively reviewed in Bintu et al. (2005a, b). The fundamental assumption is that gene transcription, the copying of a stretch of DNA into mRNA, is either “on” or “off”. This state of transcription depends on whether certain gene-specific DNA binding proteins, transcription factors, are bound, or not, to the promoter region of the gene. A gene may be controlled by one or more transcription factors, each having a varying number of binding sites in the promoter region. The action of the transcription factor may in turn be either inhibitory or excitatory. Inhibition can arise from blocking access of the RNA polymerase to the transcription start site, while a stimulating effect is obtained if the bound factor stabilizes the polymerase-DNA complex. A paradigmatic example where both effects are present is lysogeny maintenance in phage $\lambda$ Ptashne (2005); Ptashne and Gann (2002). DNA looping, where distantly bound transcription factors interact and affect transcription, is also possible.
At a given transcription factor concentration, each possible state of promoter bound factors occurs with a probability given by a grand canonical ensemble formula. The promoter region with the binding sites (with or without transcription factors) corresponds to the small system, and the cytoplasm, with a large number of transcription factors moving around, serves as the reservoir. Quite often transcription factors bind in dimer (or multimer) form, in which case the relevant concentration is determined by balance from the total concentration. In summary, the rate of transcription is a non-linear, sometimes quite complicated, function of the concentrations of the transcription factors regulating the gene.
One important property in gene regulation is cooperativity. If a single copy of a protein molecule in monomer form were to (positively) regulate a certain gene, the activity of that gene would follow the well-known Michaelis-Menten curve. The transcription rate would then be proportional to the concentration of the regulating molecule, up to a threshold above which it would level off. In other words, there would be appreciably high transcription even at very low concentrations of the regulating protein. The rationale for transcription factors often binding in multimer form, and for multiple DNA binding sites enabling cooperative interactions, is therefore assumed to be that this results in a sharper, more “all-or-nothing” switch.
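The contrast between a Michaelis-Menten response and a sharper, cooperative response can be illustrated with a Hill function. The sketch below is only an illustration: the function name `hill_activation`, the half-saturation constant `k_half` and the exponent 8 are assumptions chosen here, not quantities taken from the model studied in this paper.

```python
def hill_activation(conc, k_half, n):
    """Fractional activity of a Hill function with coefficient n.

    n = 1 gives the Michaelis-Menten curve; larger n gives a sharper switch.
    """
    x = (conc / k_half) ** n
    return x / (1.0 + x)


# Compare the two responses below, at, and above half saturation.
for conc in (0.1, 0.5, 1.0, 2.0, 10.0):
    mm = hill_activation(conc, 1.0, 1)
    sharp = hill_activation(conc, 1.0, 8)
    print(f"c = {conc:5.1f}   n = 1: {mm:.3f}   n = 8: {sharp:.3e}")
```

At one tenth of the half-saturation concentration the `n = 1` curve still gives roughly 9% of full activity, while the `n = 8` curve is essentially off.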
Multiple binding sites for one and the same transcription factor are common in eukaryotic promoters. The object of this paper is one particular viral example of no less than 20 binding sites for a viral factor, where transcriptional activity has been observed to require 8 bound molecules Wysokenski and Yates (1989); Zetterberg et al. (2004), see section II below. In addition, these sites are interleaved with an equal number of binding sites of a host transcription factor, presumably imposing the opposite effect. In a previous contribution, Werner et al., we introduced, for reasons of computational simplicity, a thermodynamic model of this promoter switch ignoring possible cooperative binding and allowing some steric hindrance. Although direct experimental evidence is lacking, cooperative binding of the viral transcription factor at this promoter is likely to be present, as well as more extensive blocking scenarios due to the closely spaced sites. Both these mechanisms are likely to affect the sharpness of the switch.
We show in this paper that while cooperative protein interactions are one way to achieve effective cooperativity of the switch, accounting for full steric hindrance (blocking) of one species of molecules on the other is a more effective one. Therefore, a possible functional role of the alternating pattern of binding sites could be to increase effective cooperativity when the promoter architecture does not allow for cooperative molecular interactions.
II The Epstein-Barr virus, the EBNA-1 protein, and the C promoter
The Epstein-Barr virus (EBV) belongs to the gamma-herpes virus family, with relatives among other primate lymphocryptoviruses, and has likely co-evolved with man for a very long time Gerner et al. (2004). Although not discovered until the 1960s, it is now known to infect more than 90% of the human population. The infection is asymptomatic if it occurs early in life, while later infection may result in infectious mononucleosis, more commonly known as “the kissing disease”. The virus infects new hosts by virus particles shed from epithelial cells in the throat, and can persist in the host blood B cells for long times, in at least three distinct latent states known as latency I, II and III. EBV is medically important primarily because some cancer forms are invariably associated with the viral infection Young and Rickinson (2004).
The most vital EBV protein is EBNA-1, a transcription factor involved in replication, episome partitioning as well as gene regulation Leight and Sugden (2000). In latency I, EBNA-1 is produced from RNA transcripts originating from the Q promoter on the EBV genome. EBNA-1 down-regulates transcription from Qp by binding to sites downstream of the transcription start site Sample et al. (1992). In latency III, on the other hand, EBNA-1 is produced together with five other proteins by alternative splicing of a longer RNA transcribed from the EBV C promoter (Cp) Bodescot et al. (1987). EBNA-1 positively regulates Cp activity by binding to the “family-of-repeats” (FR) region, positioned upstream of the start site Reisman and Sugden (1986). The physical description of this regulatory element is the topic of the present paper.
The FR region consists of 20 consecutive binding sites for EBNA-1 Nilsson et al. (2001). There are minor variations in the DNA sequence among these sites, but they are all experimentally verified, and approximately equally strong, binding sites Ambinder et al. (1990). Comparing promoter activity from constructs with varying numbers of binding sites in FR revealed that at least eight sites are necessary to have full transcriptional activation Wysokenski and Yates (1989); Zetterberg et al. (2004), see Table 1. Recent studies have identified an equal number of octamer binding sites at FR, juxtaposed with the EBNA-1 sites Almqvist et al. (in press). The action of the human transcription factor Oct-2, identified as binding to these octamer sites in complex with the co-factors Groucho/TLE, is believed to be inhibitory Malin et al. (2005).
In summary, the Cp activity is largely regulated by binding of two species of molecules, EBNA-1 and Oct-2. They each can bind to 20 sites, and have antagonistic effects when bound. Due to the closely spaced binding sites, Oct-2 and EBNA-1 compete for binding to FR. It is however not experimentally known if one bound Oct-2 blocks out one or both of the neighbouring sites for EBNA-1, and vice versa. The other unknown aspect is whether there exists cooperative binding between EBNA-1 proteins at FR, and if so, the strength of these interactions (L. Frappier, personal communication). Therefore we explore the effects of cooperative binding and blocking, with emphasis on how the effective cooperativity of the promoter switch, i.e. the sharpness of the switch, is affected.
III Cooperative binding and competition
The general thermodynamic framework is the following. Suppose a number of transcription factors $TF_1$, $TF_2$, …, $TF_k$ can bind in different states indexed by $s$ around the start of a gene. The number of transcription factors of type $i$ bound in state $s$ is $n_i(s)$, the association free energy is $\Delta G_s$, and the rate of transcription of the gene is $r_s$. Suppose further $[TF_i]$ is the concentration of transcription factor $i$ in the surrounding cytoplasm, in the form in which this transcription factor binds. Then the binding sites, with or without bound transcription factors, can be considered a small system, exchanging particles (transcription factors) and energy with the larger reservoir. The probability of the small system being in state $s$ is
$$P_s = \frac{[TF_1]^{n_1(s)} \cdots [TF_k]^{n_k(s)}\, e^{-\Delta G_s/RT}}{\sum_{s'} [TF_1]^{n_1(s')} \cdots [TF_k]^{n_k(s')}\, e^{-\Delta G_{s'}/RT}} \tag{1}$$
and the net average rate of transcription is
$$\bar{r} = \sum_s r_s\, P_s \tag{2}$$
The key assumption behind (2) is that the time scale at which the probabilities in (2) equilibrate is much faster than the time scales at which the concentrations $[TF_1]$, $[TF_2]$, …, $[TF_k]$ change appreciably.
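As an illustration of the grand canonical bookkeeping described above, the following sketch evaluates state probabilities and the average transcription rate for an arbitrary list of promoter states. All names (`state_probabilities`, `mean_rate`, the dictionary keys) are hypothetical, and the value of `RT` assumes a temperature of about 310 K; none of this is taken from the authors' own implementation.

```python
import math

RT = 0.616  # kcal/mol, assuming T of about 310 K (R = 1.987e-3 kcal/(mol K))


def state_probabilities(states, conc):
    """Boltzmann probabilities of promoter states.

    states: list of dicts with keys
        'n'    -> dict mapping factor name to number of bound copies,
        'dG'   -> association free energy of the state in kcal/mol,
        'rate' -> transcription rate in that state.
    conc: dict mapping factor name to its free concentration (molar).
    """
    weights = []
    for s in states:
        w = math.exp(-s["dG"] / RT)
        for factor, count in s["n"].items():
            w *= conc[factor] ** count
        weights.append(w)
    z = sum(weights)  # partition sum over all states
    return [w / z for w in weights]


def mean_rate(states, conc):
    """Average transcription rate: sum over states of rate times probability."""
    probs = state_probabilities(states, conc)
    return sum(p * s["rate"] for p, s in zip(probs, states))


# Toy promoter with one site for a hypothetical activator "A".
toy = [
    {"n": {}, "dG": 0.0, "rate": 0.0},          # empty site, gene off
    {"n": {"A": 1}, "dG": -10.0, "rate": 1.0},  # site occupied, gene on
]
print(mean_rate(toy, {"A": 1e-8}))
```

For the toy promoter the average rate follows a simple Michaelis-Menten curve, with half saturation at a concentration of `exp(dG / RT)`.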
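In the present example, states can be labeled by $N_E$, the number of EBNA-1 molecules bound, $N_O$, the number of Oct-2 molecules bound, $C_E$, the number of cooperative bindings between bound EBNA-1 molecules, and $C_O$, the number of cooperative bindings between bound Oct-2 molecules. Every such state has a binding free energy of
$$\Delta G = N_E\,\Delta G_E + N_O\,\Delta G_O + C_E\,\Delta G_{EE} + C_O\,\Delta G_{OO} \tag{3}$$
where $\Delta G_E$ (in kcal/mol, Ambinder et al. (1990)) and $\Delta G_O$ (in kcal/mol, Shah et al. (1997)) are the known binding free energies of EBNA-1 and Oct-2 to binding sites in FR, and $\Delta G_{EE}$ and $\Delta G_{OO}$ are the unknown cooperative binding energies. In the numerical experiments described in this paper we only examine EBNA-1 cooperativity: $\Delta G_{EE}$ is proportional to $\Delta G_E$, in the range from 0% (no cooperativity) up to 40%. The total probability of the states with given values of $N_E$, $N_O$, $C_E$ and $C_O$ is hence
$$P(N_E,N_O,C_E,C_O) = \frac{1}{Z}\,\Omega(N_E,N_O,C_E,C_O)\,[E]^{N_E}\,[O]^{N_O}\,e^{-\Delta G/RT} \tag{4}$$
where $\Omega(N_E,N_O,C_E,C_O)$ is the number of such states, $[E]$ and $[O]$ are the free concentrations of EBNA-1 and Oct-2, and $Z$ is the normalizing sum over all states, and the overall rate of transcription is
$$\bar{r} = \sum_{N_E=0}^{N}\ \sum_{N_O}\ \sum_{C_E}\ \sum_{C_O} r(N_E,N_O,C_E,C_O)\,P(N_E,N_O,C_E,C_O) \tag{5}$$
where $N$ is the number of binding sites.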
As described briefly in the introduction, one can imagine two plausible blocking scenarios at FR. The first and simplest is that each bound molecule hinders binding of the competing species to the closest neighbouring site on one side. This is referred to as single-side blocking (Fig 1a). The other scenario is that each bound molecule sterically hinders both neighbouring sites for the other molecule: double-side blocking (Fig 1b). The blocking model naturally affects the number of possible bound configurations entering Eq. 5. The upper bound in the sum over the number of bound Oct-2 molecules, $N_O$, is $N - N_E$ in the single-side blocking model (with $N_E$ the number of bound EBNA-1 molecules and $N$ the number of binding sites), but can be smaller in the double-side blocking model for all $N_E$ greater than zero, depending on how the bound molecules are arranged. Similarly, the sums over the numbers of cooperative bindings, $C_E$ and $C_O$, may effectively go over smaller ranges, e.g. in the double blocking scenario with both molecules bound and all EBNA-1 and Oct-2 molecules bound together in two groups, hence $C_E = N_E - 1$ and $C_O = N_O - 1$.
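The effect of the two blocking scenarios on the number of configurations can be made concrete by explicit enumeration on a short chain. The sketch below assumes a reduced representation in which each position is either bound by EBNA-1 (`E`), bound by Oct-2 (`O`) or empty (`-`), and in which double-side blocking forbids an `O` immediately followed by an `E`. The function name `count_states` and this encoding are assumptions made for illustration, and the enumeration is only practical for small chains.

```python
from collections import Counter
from itertools import product


def count_states(n_sites, double_blocking=False):
    """Count configurations by (N_E, N_O, C_E) through explicit enumeration.

    Each position is 'E', 'O' or '-' (empty), which corresponds to
    single-side blocking. With double_blocking=True, configurations that
    contain an 'O' directly followed by an 'E' are excluded.
    C_E counts adjacent 'EE' pairs (cooperative contacts).
    """
    omega = Counter()
    for config in product("EO-", repeat=n_sites):
        pairs = list(zip(config, config[1:]))
        if double_blocking and ("O", "E") in pairs:
            continue
        n_e = config.count("E")
        n_o = config.count("O")
        c_e = pairs.count(("E", "E"))
        omega[(n_e, n_o, c_e)] += 1
    return omega


single = count_states(3)
double = count_states(3, double_blocking=True)
print(sum(single.values()), sum(double.values()))
```

The cost grows as three to the power of the chain length, so this is useful as a cross-check on short chains rather than as a way to treat all 20 positions.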

Brute-force counting of the number of states $\Omega$ with given labels is not feasible, as the number of states in this model is up to $3^{20} \approx 3.5 \times 10^{9}$ (in the model with single-side blocking only). Efficient calculation of $\Omega$ involves two aspects. First, elementary combinatorics is used to build up a paradigm “balls-baskets” problem, which counts, under different constraints, the number of ways that one can put a certain number of balls into another number of baskets. Second, we find a way to describe efficiently all effects, including double-side blocking, cooperativity and the combination of both, in a three-step algorithm:
1. Construct a backbone sequence (S0) made up of two types of baskets, one for each of the two types of molecules.
2. Distribute Es and Os among these baskets, forming a sequence (S1) consisting only of E and O.
3. Consider the front, the end and the in-between positions of S1 as baskets for empty binding sites. Insert empty sites into these positions to get the final pattern (S2).
By setting $N = 20$, the actual number of sites is reduced by half, and the single-sided blocking model is the default. The double-side blocking is realized by treating the basket inside each “OE” segment in S2 as a must-be-filled basket (Fig 1b). The number of cooperative units, $C_E$, is counted by recording the number of “EE” in S2, minus the number of these that have been filled with empty sites.
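As a cross-check on the combinatorial counting, the distribution of the number of bound EBNA-1 molecules can also be obtained by a dynamic-programming pass along the chain. This is a different technique from the balls-baskets algorithm described above, not the authors' method. The sketch assumes the reduced representation with one position per pair of sites, dimensionless weights `x_e` and `x_o` for a bound EBNA-1 or Oct-2 molecule (concentration times association constant), and a Boltzmann factor `coop` for each pair of adjacent EBNA-1 molecules. All names are hypothetical.

```python
def ebna_weights(n_sites, x_e, x_o, coop=1.0, double_blocking=False):
    """Unnormalised statistical weight of having n bound EBNA-1, n = 0..n_sites.

    dp[state][n] is the summed weight of all partial chains that end in
    'state' ('-', 'E' or 'O') and contain n bound EBNA-1 molecules.
    With double_blocking=True an 'E' may not directly follow an 'O'.
    """
    states = "-EO"
    dp = {s: [0.0] * (n_sites + 1) for s in states}
    dp["-"][0] = 1.0  # virtual empty position in front of the chain
    for _ in range(n_sites):
        new = {s: [0.0] * (n_sites + 1) for s in states}
        for n in range(n_sites + 1):
            total = dp["-"][n] + dp["E"][n] + dp["O"][n]
            new["-"][n] += total          # leave the next position empty
            new["O"][n] += x_o * total    # bind Oct-2 at the next position
            if n < n_sites:               # bind EBNA-1 at the next position
                after_o = 0.0 if double_blocking else dp["O"][n]
                new["E"][n + 1] += x_e * (dp["-"][n] + coop * dp["E"][n] + after_o)
        dp = new
    return [dp["-"][n] + dp["E"][n] + dp["O"][n] for n in range(n_sites + 1)]


def p_on(weights, threshold=8):
    """Probability that at least 'threshold' EBNA-1 molecules are bound."""
    return sum(weights[threshold:]) / sum(weights)


print(p_on(ebna_weights(20, x_e=1.0, x_o=1.0, double_blocking=True)))
```

The cost is quadratic in the number of positions, so all 20 positions can be treated directly.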
To examine the effective cooperativity in the transition from OFF to ON we compute the Hill coefficient. This is the logarithmic derivative of the ratio of the probability of transcription to the probability of no transcription, with respect to the free ligand concentration. The Hill coefficient is a function of the ligand concentration, but the effective Hill coefficient is customarily taken at half saturation:
$$h = \left.\frac{d\ln\left(P_{ON}/P_{OFF}\right)}{d\ln [E]}\right|_{P_{ON}=P_{OFF}} \tag{6}$$
In this paper we explore the Hill coefficient functions to see how blocking and cooperative binding influence the effective cooperativity of the switch. Three cases are studied: 1) cooperative binding of EBNA-1 and no competing molecular species, 2) cooperative binding of EBNA-1 with single-side blocking between the competing species, and 3) cooperative binding of EBNA-1 with double-side blocking between the competing species.
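For any model that returns the ON probability as a function of ligand concentration, the local and effective Hill coefficients can be estimated numerically. The sketch below uses a central finite difference in log-concentration and a bisection search for half saturation. The function names and the test function `toy` are assumptions made for illustration.

```python
import math


def local_hill(p_on, conc, rel_step=1e-4):
    """Central-difference estimate of d ln(P_on / P_off) / d ln(conc)."""
    def log_odds(c):
        p = p_on(c)
        return math.log(p / (1.0 - p))

    up = conc * (1.0 + rel_step)
    down = conc * (1.0 - rel_step)
    return (log_odds(up) - log_odds(down)) / (math.log(up) - math.log(down))


def half_saturation(p_on, lo, hi, iters=200):
    """Concentration with P_on = 1/2, by bisection in log-concentration.

    Assumes p_on is increasing and that p_on(lo) < 1/2 < p_on(hi).
    """
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if p_on(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# A pure Hill function with exponent 4 should return 4 at every concentration.
toy = lambda c: c ** 4 / (1.0 + c ** 4)
c_half = half_saturation(toy, 1e-3, 1e3)
print(c_half, local_hill(toy, c_half))
```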
IV Effective cooperativity of the switch
One convenient way to visualize the cooperativity of the switch is to plot the ratio $P_{ON}/P_{OFF}$ vs. the local Hill coefficient, given as $d\ln(P_{ON}/P_{OFF})/d\ln[E]$. For very high and very low concentrations of EBNA-1, corresponding to very large and very small values of $P_{ON}/P_{OFF}$, it is easy to see that in our model $P_{ON}/P_{OFF} \approx A\,[E]^{N-7}$ and $P_{ON}/P_{OFF} \approx B\,[E]^{8}$, respectively. $A$ and $B$ are constants, and $N$ is the total number of binding sites in FR. Accordingly, the extreme local Hill coefficients are $N - 7 = 13$ and $8$. Fig. 2 illustrates this limit behaviour for three parameter values.
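The limiting behaviour can be seen from the leading powers of the EBNA-1 concentration, assuming (as in section II) that the ON state requires at least 8 bound EBNA-1 molecules out of $N = 20$. Writing $w_n$ for the coefficient of $[E]^n$ in the unnormalised statistical weight (a notation introduced here only for this sketch), the dominant terms are:

```latex
\begin{aligned}
[E]\to 0:&\quad P_{ON} \simeq w_8\,[E]^{8},\quad P_{OFF} \simeq w_0
  &&\Rightarrow\quad \frac{P_{ON}}{P_{OFF}} \propto [E]^{8},\\
[E]\to\infty:&\quad P_{ON} \simeq w_N\,[E]^{N},\quad P_{OFF} \simeq w_7\,[E]^{7}
  &&\Rightarrow\quad \frac{P_{ON}}{P_{OFF}} \propto [E]^{N-7}.
\end{aligned}
```

With $N = 20$ the logarithmic slopes are therefore 8 at low and 13 at high EBNA-1 concentration, independent of the values of the coefficients $w_n$.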


In the region of main interest, where $P_{ON}/P_{OFF} \approx 1$, the Hill coefficient curves show very different behavior for the three models. Without any cooperative interactions, and without competition, the effective Hill coefficient is substantially lower than both its limits. This baseline function for the system has an effective Hill coefficient of 3.5 (Fig 3, circled lines). This low Hill coefficient remains even with competition from Oct-2 binding: for single-side blocking, the effective cooperativity is practically insensitive to Oct-2 levels. In contrast, competition with double-side blocking dramatically alters the shape of the Hill coefficient curve, to a sigmoidal interpolation between the limits 8 and 13. The effective Hill coefficient then changes from 3.5 up to 10.5, for saturating amounts of Oct-2 (Fig 3, dotted lines).
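From a theoretical point of view, the thermodynamic model of the switch is a (finite, one-dimensional) Ising-like model with three states at each site: bound by EBNA-1, bound by Oct-2, or free. The only complication in computing the “ON” probability ($P_{ON}$) is that only states with enough bound EBNA-1 count, which mixes a global variable into the elementary statistical mechanical model. The single-blocking results can however be readily understood. With no cooperative binding and only single blocking, one can sum over $N_O$ in (4) to obtain the model studied in Werner et al., that is
$$P(N_E) \propto \binom{N}{N_E}\,\left(K_E[E]\right)^{N_E}\left(1+K_O[O]\right)^{N-N_E} \tag{7}$$
where $N_E$ and $N_O$ are the numbers of bound EBNA-1 and Oct-2 molecules, $N$ is the number of binding sites, $[E]$ and $[O]$ are the free concentrations, and $K_E = e^{-\Delta G_E/RT}$ and $K_O = e^{-\Delta G_O/RT}$ are the corresponding association constants. Including the normalization this means
$$P(N_E) = \binom{N}{N_E}\,\frac{q^{N_E}}{(1+q)^{N}},\qquad q = \frac{K_E[E]}{1+K_O[O]} \tag{8}$$
and the ratio between ON and OFF probabilities is therefore a function of the variable $q$ only:
$$\frac{P_{ON}}{P_{OFF}} = \frac{\sum_{n=8}^{N}\binom{N}{n}\,q^{n}}{\sum_{n=0}^{7}\binom{N}{n}\,q^{n}} \tag{9}$$
The local Hill coefficients are
$$h = \frac{d\ln\left(P_{ON}/P_{OFF}\right)}{d\ln[E]} = \langle N_E\rangle_{ON} - \langle N_E\rangle_{OFF} \tag{10}$$
where the averages are taken over the ON ($N_E \geq 8$) and OFF ($N_E \leq 7$) states respectively, which like the ratio depends on the concentration of the second molecule only through $q$. The effective cooperativity in the model without cooperative binding and only single blocking hence does not depend on $[O]$, as shown in the curves in Fig 3. The Hill coefficient at $P_{ON} = P_{OFF}$ can be estimated by approximating the binomials with a Gaussian distribution, i.e.
$$P(N_E) \approx \frac{1}{\sqrt{2\pi\sigma^{2}}}\,\exp\left(-\frac{(N_E-\mu)^{2}}{2\sigma^{2}}\right) \tag{11}$$
where, in the case at hand, $\mu = Nq/(1+q)$, $\sigma^{2} = Nq/(1+q)^{2}$, and $N = 20$. Half-filling is achieved at $\mu \approx 8$, and the Hill coefficient is $h \approx \sqrt{8/\pi}\,\sigma \approx 3.5$, which accords quite well with the minimum value in Fig. 3.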
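The Gaussian estimate can be checked against the exact binomial sums for the model without cooperative binding and with single blocking only. The sketch below assumes weights proportional to $\binom{N}{n} q^{n}$ for $n$ bound EBNA-1 molecules, where `q` is a dimensionless effective concentration, and an ON threshold of 8 out of 20 sites. It uses the identity that the logarithmic derivative of the ON/OFF ratio equals the difference of the mean occupancies in the ON and OFF states. Function names are hypothetical.

```python
import math

N_SITES = 20    # EBNA-1 binding positions in FR
THRESHOLD = 8   # bound EBNA-1 molecules needed for the ON state


def hill_binomial(q, n_sites=N_SITES, threshold=THRESHOLD):
    """Return (P_on / P_off, local Hill coefficient) for weights C(N, n) q**n."""
    w = [math.comb(n_sites, n) * q ** n for n in range(n_sites + 1)]
    on, off = w[threshold:], w[:threshold]
    mean_on = sum(n * x for n, x in zip(range(threshold, n_sites + 1), on)) / sum(on)
    mean_off = sum(n * x for n, x in enumerate(off)) / sum(off)
    return sum(on) / sum(off), mean_on - mean_off


def q_half(lo=1e-3, hi=1e3, iters=200):
    """Value of q with P_on = P_off, by bisection in log(q)."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if hill_binomial(mid)[0] < 1.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


q = q_half()
p = q / (1.0 + q)                              # filling fraction per site
sigma = math.sqrt(N_SITES * p * (1.0 - p))     # binomial standard deviation
print("exact Hill coefficient:", hill_binomial(q)[1])
print("Gaussian estimate:     ", math.sqrt(8.0 / math.pi) * sigma)
```

The same function also reproduces the limiting slopes of 8 and 13 when `q` is made very small or very large.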
The switch is therefore much less sharp than the limits of 8 and 13, at respectively low and high $[E]$, could have led one to believe. We note that the sharpness increases with $N$ (as long as the threshold stays around a fixed fraction of $N$), but only as the square root of $N$: more than a hundred consecutive binding sites are necessary to reach a Hill coefficient of about ten in a model of this kind.
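The square-root scaling can be turned into a rough estimate of the number of sites needed for a given sharpness. The sketch assumes the Gaussian estimate $h \approx \sqrt{8/\pi}\,\sigma$ at half saturation with binomial variance $\sigma^{2} = Np(1-p)$, and a filling fraction at threshold of $p = 8/20 = 0.4$ that is kept fixed as $N$ grows; both are assumptions of this illustration.

```latex
h \approx \sqrt{\frac{8}{\pi}\,N p\,(1-p)}
\quad\Longrightarrow\quad
N \approx \frac{\pi\,h^{2}}{8\,p\,(1-p)},
\qquad
h = 10,\ p = 0.4:\quad
N \approx \frac{100\,\pi}{1.92} \approx 164 .
```

For $N = 20$ the same expression gives $h \approx 3.5$.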
In the model with double blocking, on the other hand, the effective cooperativity can clearly be much larger, and also depends on $[O]$. That is easy to understand in the limit where $[O]$ is large; if so EBNA-1 and Oct-2 compete for binding sites, and the possibility that a site is left free can be disregarded. Therefore, if $N_E$ copies of EBNA-1 are bound, then also $N - N_E$ copies of Oct-2 are bound, altogether in the pattern $E\cdots E\,O\cdots O$ with statistical weight
$$[E]^{N_E}\,[O]^{N-N_E}\,e^{-\left(N_E\,\Delta G_E + (N-N_E)\,\Delta G_O\right)/RT} \tag{12}$$
The Hill coefficient is then only a function of $[E]/[O]$, such that the curve in Fig 3 has a limit when $[O]$ becomes large, and the value of the Hill coefficient at e.g. $P_{ON} = P_{OFF}$ then lies between the limits of 8 and 13. Competition with a second molecule therefore makes the switch sharper for double-side blocking, in contrast to the situation in single-sided blocking.
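The saturating limit of the double-blocking model can be explored numerically under the assumption that exactly one configuration contributes for each number $n$ of bound EBNA-1 molecules, with a weight proportional to $r^{n}$, where `r` is a dimensionless ratio of the EBNA-1 and Oct-2 binding weights. The name `hill_geometric` and the parameter `r` are introduced here for illustration only.

```python
def hill_geometric(r, n_sites=20, threshold=8):
    """Return (P_on / P_off, local Hill coefficient) for weights r**n.

    One configuration per number n of bound EBNA-1 molecules is assumed,
    as in the limit of saturating Oct-2 with double-side blocking.
    The local Hill coefficient equals the difference between the mean
    number of bound EBNA-1 in the ON and in the OFF states.
    """
    w = [r ** n for n in range(n_sites + 1)]
    on, off = w[threshold:], w[:threshold]
    mean_on = sum(n * x for n, x in zip(range(threshold, n_sites + 1), on)) / sum(on)
    mean_off = sum(n * x for n, x in enumerate(off)) / sum(off)
    return sum(on) / sum(off), mean_on - mean_off


for r in (1e-3, 0.5, 1.0, 2.0, 1e3):
    ratio, hill = hill_geometric(r)
    print(f"r = {r:8.3g}   P_on/P_off = {ratio:10.3g}   Hill = {hill:6.2f}")
```

At `r = 1` all occupancies are equally likely, the ON/OFF ratio is 13/8 and the local Hill coefficient is 14 - 3.5 = 10.5, between the limits 8 and 13.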
The case with cooperativity can be understood qualitatively with the helix-coil model of protein physics. Without Oct-2, the statistical model can be written as a factor $s$ for each letter $E$, and a penalty $\sigma$ for every start letter of a string of $E$’s. In an infinitely long string, the fraction of letters $E$ as well as the frequency of initiation of a string of $E$’s are calculated from the leading eigenvalue of the transfer matrix Sneppen and Zocchi (2005). In our case, the interesting region is obviously when that fraction is around $8/20$, as 8 sites out of 20 need to be filled to have transcription from Cp. If $\sigma$ is close to one, cooperative binding is weak, and the switch is similar to the single-blocking case discussed above. If on the other hand $\sigma$ is much less than one, the expected fraction of letters $E$ can be larger than $8/20$, while the expected frequency of initiation of a string of $E$’s is less than once in twenty sites. Eventually, we would expect that either all twenty sites are bound, or no sites in FR are bound. This describes a situation where all twenty molecules have to bind simultaneously, in which case the Hill coefficient is 20.
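The helix-coil picture can be sketched with a two-state transfer matrix. The code below assumes the Zimm-Bragg convention, with a weight `s` for every bound position and an extra factor `sigma` for the first position of each run of bound positions; these symbols and all function names are choices made for this illustration. It checks the transfer-matrix partition sum against explicit enumeration on a short chain and evaluates the bound fraction of an infinite chain from the leading eigenvalue.

```python
import math
from itertools import product


def partition_transfer(n_sites, s, sigma):
    """Partition sum of a finite chain by iterating the 2x2 transfer matrix."""
    empty, bound = 1.0, 0.0  # start from a virtual empty position
    for _ in range(n_sites):
        empty, bound = empty + bound, empty * sigma * s + bound * s
    return empty + bound


def partition_brute(n_sites, s, sigma):
    """Same partition sum by explicit enumeration (small chains only)."""
    total = 0.0
    for config in product((0, 1), repeat=n_sites):
        weight, prev = 1.0, 0
        for site in config:
            if site:
                weight *= s * (sigma if prev == 0 else 1.0)
            prev = site
        total += weight
    return total


def leading_eigenvalue(s, sigma):
    """Largest eigenvalue of the transfer matrix [[1, sigma*s], [1, s]]."""
    return 0.5 * (1.0 + s + math.sqrt((1.0 - s) ** 2 + 4.0 * sigma * s))


def bound_fraction(s, sigma, eps=1e-6):
    """Fraction of bound positions in an infinite chain: d ln(lambda) / d ln(s)."""
    up = math.log(leading_eigenvalue(s * (1.0 + eps), sigma))
    down = math.log(leading_eigenvalue(s * (1.0 - eps), sigma))
    return (up - down) / (math.log(1.0 + eps) - math.log(1.0 - eps))


print(partition_transfer(6, 1.3, 0.1), partition_brute(6, 1.3, 0.1))
for sigma in (1.0, 0.1, 0.01):
    print(sigma, [round(bound_fraction(s, sigma), 3) for s in (0.5, 1.0, 2.0)])
```

For small `sigma` the bound fraction changes much more abruptly around `s = 1`, which is the qualitative sharpening described in the text.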
The addition of cooperative binding of EBNA-1 to the single- and double-sided models hence changes the effective Hill coefficient differently, depending on the model. Fig 4 displays the curves for 5 different cooperative binding strengths, when no Oct-2 is competing for the FR sites. The range of cooperative strength here is from 0% up to 40% of the DNA affinity of EBNA-1, where the effective Hill coefficient is increased from 3.5 to 16.

However, in the real system the competing protein Oct-2 is likely to be present, perhaps even at very high concentrations. For single-sided blocking, an additional cooperative binding of EBNA-1 does not have the same impact when Oct-2 levels are high. Instead of a 4-fold change, from 3.5 to 16, the effective Hill coefficient is now only doubled, from 3.5 to 7 (compare Fig 4 and 5, solid lines). This is to be compared with the double-sided blocking model, which has a relatively high effective cooperativity even without cooperative binding. Adding up to 40% cooperative binding strength, the Hill coefficient is almost doubled, from 10.5 to 18 (Fig 3).
A conclusion to draw from this is that this type of architecture, with alternating binding sites for two antagonistic factors, can be one approach to creating an effective switch for genetic control. For EBV, the FR region is known for its enhancer function, as well as for forming a looped structure with another EBNA-1 binding region on the viral genome, the dyad symmetry (DS) Frappier and O’Donnell (1991); Su et al. (1991). This structure is involved in replication initiation control. If the EBNA-1 binding sites in FR were arranged in the same manner as in DS, i.e. much closer in space, cooperative binding might form even at FR. However, since FR also seems to play an important role in forming a looped structure, there might be a structural reason behind these more sparsely placed sites, which do not enable the same type of tight interactions. And, as we show here, there is no need for cooperative interactions to get a sharp switch of Cp activity, as long as there is efficient steric hindrance.


Acknowledgements.
We thank Ingemar Ernberg for sharing his knowledge of the Epstein-Barr virus and many discussions on mechanisms and modelling of the EBV lat I/lat III switch. This work was supported by the Swedish Science Council (M.W. and E.A.).
References
- Bintu et al. (2005a) L. Bintu, N. E. Buchler, H. G. Garcia, U. Gerland, T. Hwa, J. Kondev, T. Kuhlman, and R. Phillips, Curr Opin Genet Dev 15, 116 (2005a).
- Bintu et al. (2005b) L. Bintu, N. E. Buchler, H. G. Garcia, U. Gerland, T. Hwa, J. Kondev, T. Kuhlman, and R. Phillips, Curr Opin Genet Dev 15, 125 (2005b).
- Ptashne (2005) M. Ptashne, A Genetic Switch (3rd edition) (Cold Spring Harbor Laboratory Press, 2005).
- Ptashne and Gann (2002) M. Ptashne and A. Gann, Genes and Signals (Cold Spring Harbor Laboratory Press, 2002).
- Wysokenski and Yates (1989) D. A. Wysokenski and J. L. Yates, Journal of Virology 63, 2657 (1989).
- Zetterberg et al. (2004) H. Zetterberg, C. Borestrom, T. Nilsson, and L. Rymo, International Journal of Oncology 25, 693 (2004).
- (7) M. Werner, I. Ernberg, J. Zou, J. Almqvist, and E. Aurell, submitted to BMC Systems Biology.
- Gerner et al. (2004) C. Gerner, A. Dolan, and D. McGeoch, Virus Research 99, 187 (2004).
- Young and Rickinson (2004) L. S. Young and A. B. Rickinson, Nat Rev Cancer 4, 757 (2004).
- Leight and Sugden (2000) E. R. Leight and B. Sugden, Reviews in Medical Virology 10, 83 (2000).
- Sample et al. (1992) J. Sample, E. B. Henson, and C. Sample, J Virol 66, 4654 (1992).
- Bodescot et al. (1987) M. Bodescot, M. Perricaudet, and P. J. Farrell, Journal of Virology 61, 3424 (1987).
- Reisman and Sugden (1986) D. Reisman and B. Sugden, Molecular and Cellular Biology 6, 3838 (1986).
- Nilsson et al. (2001) T. Nilsson, H. Zetterberg, Y. C. Wang, and L. Rymo, Journal of Virology 75, 5796 (2001).
- Ambinder et al. (1990) R. F. Ambinder, W. A. Shah, D. R. Rawlins, G. S. Hayward, and S. D. Hayward, Journal of Virology 64, 2369 (1990).
- Almqvist et al. (in press) J. Almqvist, J. Zou, Y. Linderson, C. Borestrom, E. Altiok, H. Zetterberg, L. Rymo, S. Petterson, and I. Ernberg, Journal of General Virology pp. – (in press).
- Malin et al. (2005) S. Malin, Y. Linderson, J. Almqvist, I. Ernberg, T. Tallone, and S. Petterson, Nucleic Acids Research 33, 4618 (2005).
- (18) L. Frappier, personal communication to M. Werner.
- Shah et al. (1997) P. C. Shah, E. Bertolino, and H. Singh, The EMBO Journal 16, 7105 (1997).
- Sneppen and Zocchi (2005) K. Sneppen and G. Zocchi, Physics in Molecular Biology (Cambridge University Press, 2005).
- Frappier and O’Donnell (1991) L. Frappier and M. O’Donnell, Proc. Natl. Acad. Sci. 88, 10875 (1991).
- Su et al. (1991) W. Su, T. Middleton, B. Sugden, and H. Echols, Proc. Natl. Acad. Sci. 88, 10870 (1991).
Tables
Number of sites | Activity |
---|---|
20 | 280 |
19 | 229 |
17 | 226 |
14 | 169 |
12 | 206 |
11 | 169 |
8 | 87 |
6 | 19 |
5 | 19 |
4 | 11 |
3 | 3.3 |
2 | 2.1 |
1 | 1.2 |
0 | 3.3 |