Analyzing effective models: An example from JAK/STAT5 signaling
1 Abstract
In systems biology effective models are widely used due to the complexity of biological system. They result from a coarse-graining process which employs specific assumptions. Frequently one does not start with a model taking all details into account and then performs a coarse-graining process, but rather one starts right away with the effective equations and often the underlying assumptions remain hidden or unclear. We exemplify the analysis of an effective model by analyzing a time delay equation for the JAK/STAT5 signaling pathway and show how one can avoid wrong conclusions and obtain a deeper understanding of the biological system . By analyzing the assumptions leading to a coarse-grained model one might be able to gain new insight into the involved biological processes. Further, the compliance of the model with experimental data can be considered as a validation of the assumptions made in the derivation of the mathematical equations.
2 Introduction
Biological processes such as, e.g., signal transduction pathways, often involve many different time-scales ranging from a s to hours and also span often several length scales. A prominent example for an event on a short time and length scale involved in signal transduction is the binding of a ligand to a receptor or protein-kinase interaction and the subsequent phosphorylation [1, 2]. Each of these is a complex phenomenon itself, e.g., the binding of a ligand to a receptor is governed by different physical interactions, van der Waals, electrostatic, etc., but also by the mobility of the ligand and the structure of the proximity of the receptor [3]. Clearly, models including all these details would be intractable. Rather, one refers to coarse-grained or simplified effective models reflecting the level of detail necessary to study a specific problem and avoid considering all details of the underlying complex process. Another well known example of coarse-grained models are Langevin equation where the influence of the water molecules on collodial particles is comprised by a stochastic force [4]. By using coarse-grained models fundamental design principles are not masked by unnecessary details [5]. Alternatively, employing effective models might be necessary if not all components of the system are accessible by experiments. Examples of the effective treatment of biological processes are the synthesis and the degradation of proteins which involve many sub-steps, such as transcription factor binding, binding of the RNA polymerase, etc., or several ubiquitination steps and translocation to the proteasome, etc [6]. In all these examples sub-processes are condensed into a single rate which appear in effective models. Another common reduction of sub-processes is their description by specific mathematical functions. Prominent examples are the Michaelis-Menten or Hill functions often used to describe saturating sub-processes by coarse-grained models [7]. Sub-processes can also introduce time delays into the effective mathematical description of a biological system. One of the most famous example of a dynamical system with time delays was suggested by Mackey and Glass [8, 9], where nonlinear differential-delay equations describe physiological control systems. Other examples occur in population models [10, 11], infection models [12, 13, 14], signal transduction networks [15, 16], and gene regulation [17, 18]. Although effective models are inevitable their use should be a matter of great care, because they result from a coarse-graining process which involves certain assumptions. Of course, in most case one does not start with a model taking all necessary details into account and then performs a coarse-graining process to derive the relevant effective model for the problem at hand. Rather one starts right away with the effective model or equation and very often the underlying assumptions remain hidden or unclear. In systems biology effective models are widely used as otherwise one would be overwhelmed by the amount of details of a complex biological system. However, employing such models blindly without understanding the range of its validity clearly impairs reliable statements about the system under investigation. Further, models for dynamic biological systems are often tested against quantitative experimental data. If a model reproduces the behavior of a system it is understood that the model sufficiently reflects the dominant dynamics of the system under consideration. This must be then also correct for the implicit assumptions underlying the employed effective model. In turn this means that by analyzing the assumptions leading to the effective model one might be able to gain a deeper understanding about the involved biological processes. In this paper, we exemplify the analysis of an effective model and how one can obtain a deeper understanding of the biological system by analyzing a time delay equation for the JAK/STAT5 signaling pathway previously presented in [16].
Results and Discussion
The delay differential equation for the JAK/STAT5 signaling pathway as an example for an effective model
The mathematical model of the JAK/STAT presented by Swameye et al. [16] presupposes the binding of the ligand erythropoietin (Epo) to the Epo receptor (EpoR) located at the cell membrane. This results in an activation of the receptor (via cross-phosphorylation of the JAK proteins) and a subsequent phosphorylation of the STAT5 molecule. Two phosphorylated STAT5 proteins form a homodimer which enters the cell nucleus, where it stimulates the transcription of target genes. Then the dimers dissociate, are dephosphorylated, and relocated back to the cytoplasm. A delay represents the time the STAT5 proteins reside in the nucleus. This time-delay is introduced to lump all nuclear processes into a single variable. This biochemical reaction scheme can be visualized by
EpoR + STAT5 | EpoR + pSTAT5 | ||||
pSTAT5+pSTAT5 | |||||
(1) |
where pSTAT5 represents phosphorylated STAT5 in the cytoplasm, the STAT5 dimer, and the STAT5 dimer inside the nucleus is represented by . Reaction rates are represented by and the time-delay by . In the study of Swameye et al. [16] the authors suggested the following model:
(2) |
Here, the concentration of cytoplasmic unphosphorylated STAT5 is represented by , whereas denotes the phosphorylated STAT5. Moreover, describes the concentration of the dimer and is the nuclear STAT5. The concentration of the activated receptor sites is given by . Eq. (The delay differential equation for the JAK/STAT5 signaling pathway as an example for an effective model) is an example of an effective description of a biological process. Based on this model the authors found from a fit to experimental data that the import rate and the export rate are very similar, in particular, and . The question arises whether it is valid to draw the obvious conclusion that the import and the export rates for the STAT5 molecules are equal. In order to answer this question we analyze the underlying assumptions of the model. To this end we translate the reaction scheme given by Eq. (The delay differential equation for the JAK/STAT5 signaling pathway as an example for an effective model) into a mathematical model, assuming that all reactions inside the nucleus can be summarized by series of reactions. Furthermore, we assume that due to the size of the nucleus, any delay-distribution arising from diffusion can safely be neglected. In the next section we derive a set of differential equation for a simple two-compartment system.
Delays as an approximation for a series of reactions
Let us assume that a compartment B is divided into several sub-compartments each representing a specific state of the molecules in B. The transition from state into occurs with rate , where we assume that the back-reaction rates are significantly smaller than . This allows us to neglect any reaction from to , such that we clearly assign a direction towards the last sub-compartment . The sub-division process of B is graphically illustrated in Fig. 1. Note that if the rates are very heterogeneous, the behavior of the system is dominated by the smallest rate, as shown in [19]. However, to simplify the discussion we set . These considerations lead to the following system of differential equations:

(3) |
where is the concentration of state in sub-compartment . We further assume that the initial concentration in all sub-compartments is zero. To obtain an equation similar to Eq. (The delay differential equation for the JAK/STAT5 signaling pathway as an example for an effective model), we have to relate each to the input of compartment B. Then, the total concentration in B is given by the sum over the sub-compartments which then turns the differential equations into a single equation containing and . To this end, we perform a Laplace-transformation of Eq. (Delays as an approximation for a series of reactions), which leads to
(4) |
where denotes the Laplace-transformation of the corresponding function. Recursively, solving these equations, we arrive at
(5) |
Summing over the sub-compartments and performing the Laplace back transformation finally yields
(6) |
Thus the sequence of reactions shown in Fig. 1 results in a system of integro-differential equations, where the integration kernel is given by
(7) |
where we set and the function is the incomplete gamma function defined by: [20].
Before we employ our results to the JAK-STAT signaling case, we discuss some limiting cases of . First, for a large number of sub-compartments the integration kernel converges to
(8) |
In Fig. 2 we present for different number of sub-compartments . It can be seen that already for the limit kernel provides a good approximation for . For smaller the difference between and becomes pronounced for . For the kernel is dominated by the exponential decay and the difference between and the limit kernel is negligible small. This means that for small and large even the case of having only a few reactions in B the process can be modeled with sufficient accuracy by using instead of . In the opposite case, where is large or is small it is necessary to use . If we further set , the integration kernel given in Eq. (8) simplifies to
(9) |
which is a gamma distribution. The fact that a linear chain such as that in Eq. (Delays as an approximation for a series of reactions) is equivalent to a delay differential equation with a gamma distributed delay is a known result [21]. We would like to stress that in our model the rate is conceptually different from the rates and . While and represent the import and export rate, resp., is related to unknown reactions inside B. The steady state of Eq. (6) is given by (cf. Appendix A):
(10) |
where denotes the initial concentration of and the limit is motivated by the observation that the dynamics is quite insensitive to . The steady state value of is decreased compared to a system without delay due to the extra factor in the denominator which reflects the storage of molecules in the reaction chain.

Analysis of the JAK-STAT system
We are now equipped to analyze the delay system given by Eq. (The delay differential equation for the JAK/STAT5 signaling pathway as an example for an effective model). Taking the same approach as above we obtain for the JAK-STAT signaling pathway the following integro-differential equations:
(11) |
The resulting integro-differential equation is difficult to handle; to simplify the effective equations further we employ the fact that and assume , which results in (cf. Appendix B):
(12) |
It is important to recognize that the rate at which STAT5 is exported from the nucleus does not appear explicitly in these equations, a direct consequence of the assumption that the export rate is fast compared to the rate of change of the STAT5 dimer concentration in the cytoplasm. In addition, the dynamic of the nuclear STAT5, , is completely determined by the dynamic of the cytoplasmic STAT5 dimers, . This might be counter-intuitive and will not be generally true, but results from the assumptions underlying Eq. (Analysis of the JAK-STAT system) and is a consequence of incorporating all reactions into a single delay.
We derived starting from a system of equations in which all reactions steps inside the nucleus explicitly appear, a set of effective equations, Eq. (Analysis of the JAK-STAT system), which are almost equal to the equations used by Swameye et al. [16], given by Eq. (Delays as an approximation for a series of reactions). The subtle but important difference is that the export rate does not appear explicitly in Eq. (Analysis of the JAK-STAT system) as a consequence of our derivation, whereas it is still present in model Eq. (Delays as an approximation for a series of reactions). However, by parameter estimation using quantitative experimental data Swameye et al. found , which can be regarded as an experimental validation of the assumptions underlying Eq. (Analysis of the JAK-STAT system). In particular, this means that the processes inside the nucleus can be described as a linear forward reaction process. Moreover, we can conclude that the export process is, compared to the other processes involved, a rather fast process, since we required this in order to achieve Eq. (Analysis of the JAK-STAT system) from Eq. (Analysis of the JAK-STAT system). Our analysis shows that the seemingly obvious conclusion that the export and the import rate of the STAT5 protein from and into the nucleus, resp., are approximatively equal is not valid and would have been drawn due to a misunderstanding of the equations Eq. (The delay differential equation for the JAK/STAT5 signaling pathway as an example for an effective model).
3 Conclusions
Many biological reactions consist of intermediate steps introducing a complexity which is difficult to handle explicitly. These intermediate steps are often subsumed in effective models such as delay differential equations. We showed that it is necessary to introduce delays with great care, since intuitive modeling might lead to a set of equations which are hard or even not possible to interpret from a mechanistic point of view. We argue that it is important to make the assumptions underlying an effective model explicit in order to arrive to valid statements about the systems behavior. As a consequence, being aware of the requirements leading to the effective model yields extra insight into the system. The compliance of the model with experimental data can be considered as a validation of the assumptions made in the derivation of the mathematical equations.
As an example of a system of effective equations successfully employed in systems biology we chose the JAK/STAT5 signaling cascade proposed in [16]. One of the important results for JAK/STAT5 signaling obtained in [16] is the STAT5 molecule is involved in many subsequent signaling steps inside the nucleus. Instead of guessing the effective equation we derived the delay differential equation through a sequence of simplification steps based on certain assumptions, finally arriving to a similar system of delay differential equations suggested in [16]. By this procedure we could make the assumptions underlying effective delay differential equations explicit. In particular, since we can describe the action of STAT5 in the nucleus by a linear directed series of reactions it follows that there are no significant feedbacks or back reactions in the signaling cascade inside the nucleus. Moreover, we show that the model does not allow any statements about the export rate of the STAT5 molecules from the nucleus. The considered effective model is well supported by quantitative data and hence this suggests that the conditions for the coarse-graining are fulfilled.
Acknowledgements
This work was supported by FP7 CancerSys Project (HEALTH-F4-2008-223188) and BMBF Project FRISYS (0313921).
Appendix A Steady state of the series of reaction model
Starting point for the calculation of the steady state is Eq. (7) using the limiting kernel for :
(13) |
Taking the derivative with respect to time results in:
(14) |
Using Eq. (13) to replace the integral yields:
(15) |
Next we expand in a Taylor series in giving rise to:
(16) |
For Eq. (13) can be solved exactly with the result: . This can be exploited in the following way. We integrate Eq. (16) over the interval leading to:
(17) |
Using the we obtain for the steady state :
(18) |
which results into Eq. (10) for the steady state.
Appendix B Simplifying the integro-differential equation
We use the identity to rewrite the integral from Eq. (Analysis of the JAK-STAT system)
Performing partial integration yields:
If holds we can neglect the integral on the right hand side.
References
- [1] Tony Pawson and John D Scott. Protein phosphorylation in signaling–50 years and counting. Trends in Biochemical Sciences, 30(6):286–290, jun 2005.
- [2] AH Brivanlou and James E. Darnell Jr. Signal transduction and the control of gene expression. Science (New York, NY), 2002.
- [3] R Grima and S Schnell. A systematic investigation of the rate laws valid in intracellular environments. Biophysical Chemistry, 2006.
- [4] Ian Snook. The Langevin and generalised Langevin approach to the dynamics of atomic, polymeric and colloidal systems. Elsevier Science, 2007.
- [5] Didier Gonze, Wassim Abou-Jaoudé, Djomangan Adama Ouattara, and José Halloy. How molecular should your molecular model be? On the level of molecular detail required to simulate biological networks in systems and synthetic biology. Methods in Enzymology, 487:171–215, 2011.
- [6] B. Alberts. Molecular Biology of the Cell. Garland Science, 5, illustrated edition, 2008.
- [7] J N Weiss. The Hill equation revisited: uses and misuses. The FASEB journal, 11(11):835–841, sep 1997.
- [8] M.C. Mackey and L. Glass. Oscillations and chaos in physiological control systems. Science, 197:287–289, 1977.
- [9] L. Glass and M.C. Mackey. Pathological conditions resulting from instabilities in physiological control systems. Ann. N.Y. Acad. Sci., 316:214–235, 1979.
- [10] R.M. May. Stability and Complexity in Model Ecosystems. Princeton University Press, 1975.
- [11] W.S.C. Gurney and R.M. Nisbet. Age- and density-dependent population dynamics in static and variable environments. Theor. Popul. Biol., 17:321–344, 1980.
- [12] RV Culshaw and S Ruan. A delay-differential equation model of hiv infection of cd4+ t-cells. Mathematical Biosciences, 165(1):27–39, 2000.
- [13] PW Nelson and AS Perelson. Mathematical analysis of delay differential equation models of hiv-1 infection. Mathematical Biosciences, 179(1):73–94, 2002.
- [14] PW Nelson, JD Murray, and AS Perelson. A model of hiv-1 pathogenesis that includes an intracellular delay. Mathematical biosciences, 163(2):201–215, 2000.
- [15] J. Srividhya, M.S. Gopinathan, and S. Schnell. The effects of time delays in phosphorylation-dephosphorylation pathway. Biophys. Chemistry, 125:286–297, 2007.
- [16] I. Swameye, T.G. Müller, J. Timmer, O. Sandra, and U. Klingmüller. Identification of nucleocytoplasmic cycling as a remote sensor in cellular signaling by data-based modeling. PNAS, 100:1028–1033, 2003.
- [17] M.H. Jensen, K. Sneppen, and G. Tiana. Sustained oscillations and time delays in gene expression of protein Hes1. FEBS Letters, 541:176–177, 2003.
- [18] D. Bratsun, D. Volfson L.S. Tsimring, and J. Hasty. Delay-induced stochastic oscillations in gene regulation. PNAS, 102:14593–14598, 2005.
- [19] I. Epstein. Delay effects and differential delay equations in chemical kinetics. Reviews in Physical Chemistry, 11:135–160, 1992.
- [20] L.S. Gradshteyn and I.M. Ryzhik. Table of Integrals, Series and Products. Academic Press, 2000.
- [21] N. MacDonald. Biological Delay Systems: Linear Stability Theory. Cambridge University Press, 1989.