This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\RS@ifundefined

subsecref \newrefsubsecname = \RSsectxt \RS@ifundefinedthmref \newrefthmname = theorem  \RS@ifundefinedlemref \newreflemname = lemma

Katu: a fast open-source full lepto-hadronic kinetic code suitable for Bayesian inference modeling of blazars

 Bruno Jiménez-Fernández,1 H. J. van Eerten.1
1Department of Physics, University of Bath, Claverton Down, Bath BA2 7AY, United Kingdom
E-mail: B.Jimenez.Fernandez@bath.ac.ukE-mail: hjve20@bath.ac.uk
(Accepted XXX. Received YYY; in original form ZZZ)
Abstract

In the modelling of blazars and other radiating plasma flows, correctly evolving in time a series of kinetic equation coupled for different particle species is a computationally challenging problem. In this paper we introduce the code Katu, capable of simultaneously evolving a large number of particle species including photons, leptons and hadrons and their interactions. For each particle, the numerical expressions for the kinetic equations are optimized for efficiency and the code is written to easily allow for extensions and modifications. Additionally, we tie Katu to two pieces of statistical software, emcee and pymultinest, which makes a suite ready to apply to blazars and extract relevant statistical information from their electromagnetic (and neutrino, if applicable) flux. As an example, we apply Katu to Mrk 421 and compare a leptonic and a leptohadronic model. In this first ever direct Bayesian comparison between full kinetic simulations of these models, we find substantial evidence (Bayes factor 7\sim 7) for the leptohadronic model purely from the electromagnetic spectrum. The code is available online at https://github.com/hveerten/katu under the LGPL license111Alternatively, a mirror is maintained at https://gitlab.com/Noxbru/Katu.

keywords:
Radiation mechanisms: non-thermal, methods: statistical, methods: numerical, BL Lacertae objects: individual: Mrk 421.
pubyear: 2021pagerange: Katu: a fast open-source full lepto-hadronic kinetic code suitable for Bayesian inference modeling of blazars7

1 Introduction

Simulating the behaviour of energetic particles both individually and collectively is relevant in many fields of particle physics and astrophysics. In the first case, understanding how a particle emits and interacts with others has allowed us to build the theory of the standard model of particle physics. In the second case, by putting together all the results from the first, we can develop theories to explain the emission from astrophysical objects such as stars, planetary nebulae or, as for this work, blazars.

The theory of emission processess that occur between interacting non-thermal distributions of lepto-hadronic particles has evolved over many years (see e.g. Jones, 1968; Lightman & Zdziarski, 1987; Mastichiadis & Kirk, 1992), as have methods to solve these systems numerically (e.g. Mastichiadis & Kirk, 1995). More recently, the increase in computational power has allowed for the development of more complete models with better approximations to the different processes and models that deal explicitly with more kinds of particles, such as Mastichiadis et al. (2005); Dimitrakoudis et al. (2012); Cerruti et al. (2015); Petropoulou et al. (2015) and Gao et al. (2017).

In most cases, the computation of a final steady state for a model given a series of parameters is time-consuming, which has prevented the use of complex statistical tools in order to analyze the data, the models, the parameters and their relations. Although some studies use grid-based searches Gao et al. (2019) or genetic algorithms Rodrigues (2019) in order to systematically look for the set of parameters that best fit the data, numerous studies still fit models by eye Keivani et al. (2018); Petropoulou et al. (2020).

In this work we introduce a novel code to model the behaviour of different particle species in a relativistic plasma, Katu222Basque word for ’cat’ https://en.wiktionary.org/wiki/katu#Basque. Katu aims to be an efficient, highly-configurable model that simulates the kinetic equations in a time-dependent manner.

Besides Katu, we also introduce two wrappers over commonly used bayesian statistical packages: multinest Feroz et al. (2013) and emcee Foreman-Mackey et al. (2013). This allows us to not only have a powerful tool for simulating the emission from plasmas, but to have a full suite ready to quickly analyse and fit datasets producing high level statistical information. By having comprehensive statistical information that goes beyond assigning a likelihood value to a set of parameters, we also obtain converged probability distributions of the parameters and their correlations along with a measure of their Bayesian evidence that can be used for model comparisons..

This paper is structured as follows. We start in 2 by presenting the physics of the model, in particular, the kinetic equation. We continue by presenting the two numerical approaches that we take for solving the kinetic equation in 3, with some of the details of the implementation in 4. Some simple tests of the numerical schemes we implement are shown in 5. Next, an overview of bayesian inference, along with a simple presentation of emcee and multinest is presented in 6. We continue with a demonstration of the suite by analysing the data of Mrk 421 during 2009 in 7. We wrap up with our conclusions in 8. Additionally, we include a full explanation of all the processes that we solve in Appendix A and the configuration options for Katu in Appendix B.

2 Physics of the model

The model that the code Katu simulates is based on solving the kinetic equation associated with 1616 different species of particles: photons, protons, neutrons, electrons positrons, charged and neutral pions, muons and antimuons, separated between left and right handed333The decay rates from pions and to neutrinos are different depending on the handedness of the muons, and electron and muon neutrinos and antineutrinos. The general form of a kinetic equation is:

nt=𝒬e+𝒬i+nτescnτdecγ(γtn).\frac{\partial n}{\partial t}=\mathcal{Q}_{e}+\mathcal{Q}_{i}+\mathcal{L}-\frac{n}{\tau_{esc}}-\frac{n}{\tau_{dec}}-\frac{\partial}{\partial\gamma}\left(\frac{\partial\gamma}{\partial t}n\right). (1)

Here, 𝒬e\mathcal{Q}_{e} is an external injection of particles; 𝒬i\mathcal{Q}_{i} is an internal injection of particles, that is particles that are produced from processes inside the simulated volume; \mathcal{L} is the term that denotes the loss of particles due to internal processes; τesc\tau_{esc} represents the escape timescale of the particle, which we take to be a multiple of the free escape timescale (τesc=στfree,esc)\left(\tau_{esc}=\sigma\tau_{free,esc}\right); τdec\tau_{dec} represents the decay timescale of the particle; and γ/t\partial\gamma/\partial t represents continuous gains and losses of energy that affect the particles, where γ\gamma represents the energy of the particle normalized to its energy at rest.

In our model, the last term is used to represent accelerations proportional to the energy of the particle, and losses due to synchrotron emission and inverse Compton scattering, which are quadratic in energy. Thus, we take:

γt=1τaccγ+Sγ2,\frac{\partial\gamma}{\partial t}=\frac{1}{\tau_{acc}}\gamma+S\gamma^{2}, (2)

where τacc\tau_{acc} represents the acceleration timescale and SS represents the proportionality constant between synchrotron and inverse Compton losses and the energy squared.

Usually, the expressions for photon injection and losses are given in terms of emission and absorption coefficients (jν and αν)\left(j_{\nu}\text{ and }\alpha_{\nu}\right), which can be related to the injection and losses terms as:

𝒬i\displaystyle\mathcal{Q}_{i} |nγ(ε)t|inj=4πεmec2dνdεjν(ν)=2εjν(ν),\displaystyle\coloneqq\left|\frac{\partial n_{\gamma}\left(\varepsilon\right)}{\partial t}\right|_{inj}=\frac{4\pi}{\varepsilon m_{e}c^{2}}\frac{d\nu}{d\varepsilon}j_{\nu}\left(\nu\right)=\frac{2}{\hbar\varepsilon}j_{\nu}\left(\nu\right), (3)
\displaystyle\mathcal{L} |nγ(ε)t|abs=cαν(ν)nγ(ε),\displaystyle\coloneqq\left|\frac{\partial n_{\gamma}\left(\varepsilon\right)}{\partial t}\right|_{abs}=c\alpha_{\nu}\left(\nu\right)n_{\gamma}\left(\varepsilon\right), (4)

where ε\varepsilon is the energy of the photons normalized to the energy of an electron at rest (ε=Eν/mec2)\left(\varepsilon=E_{\nu}/m_{e}c^{2}\right) and \hbar is the reduced Plank constant.


In general, models that simulate the emission from relativistic plasmas can be separated in three types depending on their treatment of the primary and secondary particles. With primary and secondary particles we refer, respectively, to those that are set up at the beginning of the simulation or are injected from outside versus those that are created by any internal process.

Of the first type we have the static models, where the primary particles are set to a static distribution and the resulting photon field is derived. Typically, no secondary particles are included. Effectively, this usually corresponds to fixing the populations of electrons and deriving the resulting photon population by applying (LABEL:kinetic_equation) with nγ/t=0\partial n_{\gamma}/\partial t=0. The main disadvantage of this kind of models is, evidently, that the primary particles can not evolve and phenomena such as dynamic cascades can only be included in an approximate manner.

The second type uses a hybrid approach where a population of primary particles is fixed but a population of secondaries is allowed to vary dynamically allowing the production of cascades of particles. Some examples of these models are Dimitrakoudis et al. (2012); Cerruti et al. (2015).

The final option when modelling the system is to allow both primary and secondary particles to vary with time. Examples of this third type of models can be found in Mastichiadis & Kirk (1995); Gao et al. (2017).

Aiming to be comprehensive, we have set up Katu to include a large number of processes in a large amount of detail. Appendix A provides an in-depth description of each process, along with the equations that are implemented in Katu. For general reference, we recommend Rybicki & Lightman (1979, chapter 3); Böttcher et al. (2012, chapter 3) and Ghisellini (2013). LABEL:Tab:List-of-processes presents all the processes that we simulate along with the main references that we have consulted.

Process References
Synchrotron Emission Crusius & Schlickeiser (1986)
and Absorption Böttcher et al. (2012)
Inverse Compton Scattering Jones (1968)
Böttcher et al. (2012)
Two Photon Pair Production Böttcher & Schlickeiser (1997)
Pair Annihilation Svensson (1982)
Photo-meson Production Hümmer et al. (2010)
Bethe-Heitler Pair Production Kelner & Aharonian (2008, Section IV)
Pion & Muon decays Lipari et al. (2007)
Hümmer et al. (2010)
Table 1: List of processes and References.

3 Numerics

LABEL:Eq:kinetic_equation has a very complex form which can not be solved analytically. Therefore, we must solve it numerically. Depending on the type of particle whose population we want to evolve, certain terms will not appear and the equation will be simpler to solve. But first, we show the general solution of (LABEL:kinetic_equation) in terms of exponential integrators and an alternative solution using an implicit scheme.

First, we substitute (LABEL:energy_change) in (LABEL:kinetic_equation), expand the derivative in energy and merge similar terms:

nt\displaystyle\frac{\partial n}{\partial t} =𝒬++An+Bnγ,\displaystyle=\mathcal{Q}+\mathcal{L}+An+B\frac{\partial n}{\partial\gamma}, (5)

where:

A\displaystyle A =1τesc1τdec1τacc2Sγ,\displaystyle=-\frac{1}{\tau_{esc}}-\frac{1}{\tau_{dec}}-\frac{1}{\tau_{acc}}-2S\gamma, (6)
B\displaystyle B =γτaccSγ2.\displaystyle=-\frac{\gamma}{\tau_{acc}}-S\gamma^{2}. (7)

In general, every term in this equation depends on time, whether explicitly (for example, if external injection is varied over time), through a variable of the model (for example, τesc\tau_{esc} depends on the size of the volume we are simulating, which can evolve over time) or through the dependency on the population of other particles.

The loss term \mathcal{L} can be separated as =Ln\mathcal{L}=Ln (see for example (LABEL:[)s]L_for_photons and 65) and LL can be grouped with AA. As a final step in transforming the original equation, we note that BB depends on γ\gamma, so we can change the derivative to logarithmic space. Additionally, we know that it is common that the dependency of the population on the energy takes the form of a power law. In this case, it is best to take the derivatives of the logarithm of the population with respect to the logarithm of the energy as it tends to be a small constant that does not depend upon the population or the energy. Then we have the two compact forms of the kinetic equation:

nt\displaystyle\frac{\partial n}{\partial t} =𝒬+An+Bnlnγ.\displaystyle=\mathcal{Q}+A^{\prime}n+B^{\prime}\frac{\partial n}{\partial\ln\gamma}. (8)
lnnt\displaystyle\frac{\partial\ln n}{\partial t} =𝒬n+A+Blnnlnγ.\displaystyle=\frac{\mathcal{Q}}{n}+A^{\prime}+B^{\prime}\frac{\partial\ln n}{\partial\ln\gamma}. (9)

where:

A\displaystyle A^{\prime} =L1τesc1τdec1τacc2Sγ,\displaystyle=L-\frac{1}{\tau_{esc}}-\frac{1}{\tau_{dec}}-\frac{1}{\tau_{acc}}-2S\gamma, (10)
B\displaystyle B^{\prime} =1τaccSγ.\displaystyle=-\frac{1}{\tau_{acc}}-S\gamma. (11)

In the first case (equation 8), by taking the derivative as an operator, we have a simple equation that can be solved with an exponential integrator. This solution, under the assumptions of constant 𝒬\mathcal{Q}, AA^{\prime} and BB^{\prime} is exact (unless not binned numerically over a finite number over energies with a given representation of the derivative). This solution is very interesting because, for cases where there is no acceleration or cooling, that is, there is no derivative term, this remains exact when implemented numerically. This is the case for neutral particles, which have τacc=0\tau_{acc}=0 and S=0S=0, and thus B=0B^{\prime}=0.

Where this is not the case and we have to keep the derivative, even though the solution that we find through the exponential integrator is correct, its calculation involves obtaining the exponential and the inverse of an operator, which is very costly in computational time, so we explore other solutions in the form of an implicit scheme of (LABEL:kinetic_equation_compact_log_log).

3.1 Exponential Integrator Solutions

Taking the derivative as an operator, the solution of (LABEL:kinetic_equation_compact) is:

n(t0+Δt)\displaystyle n\left(t_{0}+\Delta_{t}\right) =exp[t0t0+Δt(A+Blnγ)𝑑t]n(t0)+\displaystyle=\exp\left[\int_{t_{0}}^{t_{0}+\Delta_{t}}\left(A^{\prime}+B^{\prime}\frac{\partial}{\partial\ln\gamma}\right)dt^{\prime}\right]n\left(t_{0}\right)+
t0t0+Δtexp[tt0+Δt(A+Blnγ)𝑑t′′]𝒬𝑑t\displaystyle\int_{t_{0}}^{t_{0}+\Delta_{t}}\exp\left[\int_{t^{\prime}}^{t_{0}+\Delta_{t}}\left(A^{\prime}+B^{\prime}\frac{\partial}{\partial\ln\gamma}\right)dt^{\prime\prime}\right]\mathcal{Q}dt^{\prime} (12)

where 𝒬\mathcal{Q}, AA^{\prime} and BB^{\prime} all depend on time.

To obtain the first order solution to (LABEL:Exponential_Integrator_Solution), we consider that every term varies very little between t0t_{0} and t0+Δtt_{0}+\Delta_{t} and can be taken as constant. Then we obtain:

n(t0+Δt)\displaystyle n\left(t_{0}+\Delta_{t}\right) =exp[Δt(A+Blnγ)]n(t0)+\displaystyle=\exp\left[\Delta_{t}\left(A^{\prime}+B^{\prime}\frac{\partial}{\partial\ln\gamma}\right)\right]n\left(t_{0}\right)+
{exp[Δt(A+Blnγ)]I}[A+Blnγ]1𝒬,\displaystyle\left\{\exp\left[\Delta_{t}\left(A^{\prime}+B^{\prime}\frac{\partial}{\partial\ln\gamma}\right)\right]-I\right\}\left[A^{\prime}+B^{\prime}\frac{\partial}{\partial\ln\gamma}\right]^{-1}\mathcal{Q}, (13)

where the exponential and the inverse of an operator are defined through their Taylor expansions:

exp[Δt(A+Blnγ)]\displaystyle\exp\left[\Delta_{t}\left(A^{\prime}+B^{\prime}\frac{\partial}{\partial\ln\gamma}\right)\right] i=01i![Δt(A+Blnγ)]i,\displaystyle\coloneqq\sum_{i=0}^{\infty}\frac{1}{i!}\left[\Delta_{t}\left(A^{\prime}+B^{\prime}\frac{\partial}{\partial\ln\gamma}\right)\right]^{i}, (14)
(A+Blnγ)1\displaystyle\left(A^{\prime}+B^{\prime}\frac{\partial}{\partial\ln\gamma}\right)^{-1} i=0(IABlnγ)i.\displaystyle\coloneqq\sum_{i=0}^{\infty}\left(I-A^{\prime}-B^{\prime}\frac{\partial}{\partial\ln\gamma}\right)^{i}. (15)

Note that both AA^{\prime} and BB^{\prime} depend on γ\gamma, so the cross terms of BlnγAB^{\prime}\frac{\partial}{\partial\ln\gamma}A^{\prime} and Blnγ(Blnγ)B^{\prime}\frac{\partial}{\partial\ln\gamma}\left(B^{\prime}\frac{\partial}{\partial\ln\gamma}\right) are not null.

This step usually involves the exponentiation and inversion of an operator, which is very costly computationally. But, as we have said in the previous section, this solution is very interesting in the case that B=0B^{\prime}=0 as it reduces to:

n(t0+Δt)=exp[ΔtA]n(t0)+exp[ΔtA]IA𝒬,n\left(t_{0}+\Delta_{t}\right)=\exp\left[\Delta_{t}A^{\prime}\right]n\left(t_{0}\right)+\frac{\exp\left[\Delta_{t}A^{\prime}\right]-I}{A^{\prime}}\mathcal{Q}, (16)

where the exponential and the inverse are easily calculable. This is the solution that we employ for neutral particles (neutrons, neutral pions and neutrinos).

3.2 Implicit Method Solution

An alternative way of solving (LABEL:kinetic_equation) is to discretize the derivative in energy as the first step and then look for a solution. In order to make the solution stable for high values of Δt\Delta_{t}, we work from an implicit expression for the derivative. In this case, it is better to work with logarithmic derivatives, as this makes the numerical solution more stable. To account for the possible flows of particles due to the acceleration and cooling, we have to look for an equilibrium point where the two fluxes converge and separate the solution in two parts at the equilibrium energy.

We note from (LABEL:energy_change) that there is an equilibrium point in energy at γeq=1τaccS\gamma_{eq}=-\frac{1}{\tau_{acc}S} (remember that S<0S<0) and that for γ<γeq\gamma<\gamma_{eq}, we have that γt>0\frac{\partial\gamma}{\partial t}>0 and conversely, for γ>γeq\gamma>\gamma_{eq}, we have that γt<0\frac{\partial\gamma}{\partial t}<0. This gives us the energy direction and thus, the direction in which we have to take the derivatives when discretizing. This is equivalent to upwind schemes for flows of matter, although in energy space.

In this case the discretized version of (LABEL:kinetic_equation_compact_log_log), taking index ii as the index of the highest energy that is lower than γeq\gamma_{eq}, now reads:

lnnjtt={𝒬jnjt+1+Aj+Bj(lnnjt+1lnnj1t+1Δlnγ)for j<i𝒬jnjt+1+Aj+Bj(lnnj+1t+1lnnjt+1Δlnγ)for j>i.\frac{\partial\ln n_{j}^{t}}{\partial t}=\begin{cases}\frac{\mathcal{Q}_{j}}{n_{j}^{t+1}}+A_{j}^{\prime}+B_{j}^{\prime}\left(\frac{\ln n_{j}^{t+1}-\ln n_{j-1}^{t+1}}{\Delta\ln\gamma}\right)&\text{for }j<i\\ \frac{\mathcal{Q}_{j}}{n_{j}^{t+1}}+A_{j}^{\prime}+B_{j}^{\prime}\left(\frac{\ln n_{j+1}^{t+1}-\ln n_{j}^{t+1}}{\Delta\ln\gamma}\right)&\text{for }j>i\end{cases}. (17)

Discretizing now in time, and isolating the terms in lnnjt+1\ln n_{j}^{t+1} we obtain:

lnnjt+1\displaystyle\ln n_{j}^{t+1} ={lnnjt+Δt[AjBjlnnj1t+1Δlnγ]+Δt[𝒬jnjt+1+Bjlnnjt+1Δlnγ]for j<ilnnjt+Δt[Aj+Bjlnnj+1t+1Δlnγ]+Δt[𝒬jnjt+1Bjlnnjt+1Δlnγ]for j>i.\displaystyle=\begin{cases}\ln n_{j}^{t}+\Delta_{t}\left[A_{j}^{\prime}-B_{j}^{\prime}\frac{\ln n_{j-1}^{t+1}}{\Delta\ln\gamma}\right]+\\ \phantom{\ln n_{j}^{t}+\;\,}\Delta_{t}\left[\frac{\mathcal{Q}_{j}}{n_{j}^{t+1}}+B_{j}^{\prime}\frac{\ln n_{j}^{t+1}}{\Delta\ln\gamma}\right]&\text{for }j<i\\ \ln n_{j}^{t}+\Delta_{t}\left[A_{j}^{\prime}+B_{j}^{\prime}\frac{\ln n_{j+1}^{t+1}}{\Delta\ln\gamma}\right]+\\ \phantom{\ln n_{j}^{t}+\;\,}\Delta_{t}\left[\frac{\mathcal{Q}_{j}}{n_{j}^{t+1}}-B_{j}^{\prime}\frac{\ln n_{j}^{t+1}}{\Delta\ln\gamma}\right]&\text{for }j>i\end{cases}. (18)

For convenience, we define the following constants:

C±\displaystyle C_{\pm} Aj±Bjlnnj±1t+1Δlnγ,\displaystyle\coloneqq A_{j}^{\prime}\pm B_{j}^{\prime}\frac{\ln n_{j\pm 1}^{t+1}}{\Delta\ln\gamma}, (19)
D±\displaystyle D_{\pm} ±ΔtΔlnγBj1.\displaystyle\coloneqq\pm\frac{\Delta_{t}}{\Delta\ln\gamma}B_{j}^{\prime}-1. (20)

The solution can now be found in terms of the Lambert W-function:

njt+1={Δt𝒬jD+W[Δt𝒬jD+exp(lnnjt+ΔtCD+)]for j<iΔt𝒬jDW[Δt𝒬jDexp(lnnjt+ΔtC+D)]for j>i.n_{j}^{t+1}=\begin{cases}-\frac{\Delta_{t}\mathcal{Q}_{j}}{D_{+}W\left[-\frac{\Delta_{t}\mathcal{Q}_{j}}{D_{+}}\exp\left(\frac{\ln n_{j}^{t}+\Delta_{t}C_{-}}{D_{+}}\right)\right]}&\text{for }j<i\\ -\frac{\Delta_{t}\mathcal{Q}_{j}}{D_{-}W\left[-\frac{\Delta_{t}\mathcal{Q}_{j}}{D_{-}}\exp\left(\frac{\ln n_{j}^{t}+\Delta_{t}C_{+}}{D_{-}}\right)\right]}&\text{for }j>i\end{cases}. (21)

The Lambert function has the property that lnW(x)=lnxW(x)\ln W\left(x\right)=\ln x-W\left(x\right) for x>0x>0, so we can obtain the logarithm of the population as:

lnnjt+1\displaystyle\ln n_{j}^{t+1} ={lnnjt+ΔtCD++W[Δt𝒬jD+exp(lnnjt+ΔtCD+)]for j<ilnnjt+ΔtC+D+W[Δt𝒬jDexp(lnnjt+ΔtC+D)]for j>i.\displaystyle=\begin{cases}-\frac{\ln n_{j}^{t}+\Delta_{t}C_{-}}{D_{+}}+\\ \quad W\left[-\frac{\Delta_{t}\mathcal{Q}_{j}}{D_{+}}\exp\left(\frac{\ln n_{j}^{t}+\Delta_{t}C_{-}}{D_{+}}\right)\right]&\text{for }j<i\\ -\frac{\ln n_{j}^{t}+\Delta_{t}C_{+}}{D_{-}}+\\ \quad W\left[-\frac{\Delta_{t}\mathcal{Q}_{j}}{D_{-}}\exp\left(\frac{\ln n_{j}^{t}+\Delta_{t}C_{+}}{D_{-}}\right)\right]&\text{for }j>i\end{cases}. (22)

By imposing some boundary conditions for lnn0t+1\ln n_{0}^{t+1} and lnnnt+1\ln n_{n}^{t+1}, the rest of the populations can be solved for in a straightforward manner. This is the solution that we employ for charged particles (protons, electrons and positrons, muons and charged pions).

4 Computational Implementation

4.1 Initialization

Katu is initialized according to a configuration file with an extensive range of options (explained in Appendix B). Here, we explain how the configuration settings are translated into the model variables.

The initial electron and proton populations are set using background density (ρ)\left(\rho\right), ratio of protons to electrons (η)\left(\eta\right) and the shapes of their distributions over energies as set in the configuration file. The resulting initial number densities for electrons and protons respectively are therefore given by:

ne(γe)=\displaystyle n_{e}\left(\gamma_{e}\right)= Nefe(γe)=ρme+ηmpfe(γe),\displaystyle N_{e}f_{e}\left(\gamma_{e}\right)=\frac{\rho}{m_{e}+\eta m_{p}}f_{e}\left(\gamma_{e}\right), (23)
np(γp)=\displaystyle n_{p}\left(\gamma_{p}\right)= Npfp(γp)=ηρme+ηmpfp(γp).\displaystyle N_{p}f_{p}\left(\gamma_{p}\right)=\frac{\eta\rho}{m_{e}+\eta m_{p}}f_{p}\left(\gamma_{p}\right). (24)

Here fef_{e} is the normalized distribution function for electrons and NeN_{e} the total electron number density (likewise for protons).


For the external injection of particles, we follow a similar procedure, except that instead of using the density of the fluid ρ\rho, we use the total particle luminosity LL. Then, if we inject protons and electrons in a plasma volume (V)\left(V\right) we have:

LV\displaystyle\frac{L}{V} =mec2γe,minγe,maxγeQe,0fe(γe)𝑑γe+\displaystyle=m_{e}c^{2}\int_{\gamma_{e,min}}^{\gamma_{e,max}}\gamma_{e}Q_{e,0}f_{e}\left(\gamma_{e}\right)d\gamma_{e}+
mpc2γp,minγp,maxγpQp,0fp(γp)𝑑γp,\displaystyle\phantom{=\;}m_{p}c^{2}\int_{\gamma_{p,min}}^{\gamma_{p,max}}\gamma_{p}Q_{p,0}f_{p}\left(\gamma_{p}\right)d\gamma_{p}, (25)

where Q0,eQ_{0,e} and Q0,pQ_{0,p} are respectively the number densities of injected electrons and protons, respectively. Injecting η\eta protons per electron (Qp,0=ηQe,0)\left(Q_{p,0}=\eta Q_{e,0}\right), and calculating the integrals, we have that:

LV=Qe,0γemec2+ηQe,0γpmpc2,\frac{L}{V}=Q_{e,0}\left\langle\gamma_{e}\right\rangle m_{e}c^{2}+\eta Q_{e,0}\left\langle\gamma_{p}\right\rangle m_{p}c^{2},

where γ\left\langle\gamma\right\rangle is the average energy of the injected particles. Solving for Qe,0Q_{e,0}, we can obtain that the injection distributions are given by:

Qe(γe)\displaystyle Q_{e}\left(\gamma_{e}\right) =Qe,0fe(γe),\displaystyle=\phantom{\eta}Q_{e,0}f_{e}\left(\gamma_{e}\right), (26)
Qp(γp)\displaystyle Q_{p}\left(\gamma_{p}\right) =ηQe,0fp(γp).\displaystyle=\eta Q_{e,0}f_{p}\left(\gamma_{p}\right). (27)

The external injection of photons is much simpler as only one particle kind is involved and thus we have:

Qγ(ε)=LγεVmec2,Q_{\gamma}\left(\varepsilon\right)=\frac{L_{\gamma}}{\left\langle\varepsilon\right\rangle Vm_{e}c^{2}}, (28)

where LγL_{\gamma} is the luminosity of the external photon field and ε\left\langle\varepsilon\right\rangle is its normalized average energy.


The value of the free escape timescale is calculated from the volume as:

tfree,esc={34Rcif sphereπ4hcif disk,t_{free,esc}=\begin{cases}\frac{3}{4}\frac{R}{c}&\text{if sphere}\\ \frac{\pi}{4}\frac{h}{c}&\text{if disk}\end{cases}, (29)

where RR is the radius of the sphere and hh is the height of the disk.

4.1.1 Numerical Initialization

Even though mathematically the values that we have found for the different injections, initial number densities and so on are exact, some of the distributions that we use do not have formulas with which to get their normalized form. This means that we actually have to numerically calculate them.

Then, to obtain the values of the injection in the population distribution, we bin the corresponding injection onto the population binning.


Only the boundary values for the ranges of the proton, electron and photon populations are set by the user (see Appendix B). For other particles, their boundaries have to be calculated so that we are sure that the injection of secondary particles does not fall outside the range of their distributions. In particular we take:

γn,min\displaystyle\gamma_{n,min} =γp,min,\displaystyle=\gamma_{p,min}, (30)
γn,max\displaystyle\gamma_{n,max} =γp,max,\displaystyle=\gamma_{p,max}, (31)
γe+,min\displaystyle\gamma_{e^{+},min} =γe,min,\displaystyle=\gamma_{e^{-},min}, (32)
γe+,max\displaystyle\gamma_{e^{+},max} =γe,max,\displaystyle=\gamma_{e^{-},max}, (33)
γπ,min\displaystyle\gamma_{\pi,min} =max(0.2γp,minmpmπ,100.5),\displaystyle=\max\left(0.2\cdot\gamma_{p,min}\cdot\frac{m_{p}}{m_{\pi}},10^{0.5}\right), (34)
γπ,max\displaystyle\gamma_{\pi,max} =0.6γp,maxmpmπ,\displaystyle=0.6\cdot\gamma_{p,max}\cdot\frac{m_{p}}{m_{\pi}}, (35)
γμ,min\displaystyle\gamma_{\mu,min} =max(γπ,minmπmμ,100.5),\displaystyle=\max\left(\gamma_{\pi,min}\cdot\frac{m_{\pi}}{m_{\mu}},10^{0.5}\right), (36)
γμ,max\displaystyle\gamma_{\mu,max} =γπ,maxmπmμ,\displaystyle=\gamma_{\pi,max}\cdot\frac{m_{\pi}}{m_{\mu}}, (37)
γν,min\displaystyle\gamma_{\nu,min} =min(γπ,minmπme,γμ,minmμme),\displaystyle=\min\left(\gamma_{\pi,min}\cdot\frac{m_{\pi}}{m_{e}},\gamma_{\mu,min}\cdot\frac{m_{\mu}}{m_{e}}\right), (38)
γν,max\displaystyle\gamma_{\nu,max} =max(γπ,maxmπme(1mμ2me2),γμ,maxmμme),\displaystyle=\max\left(\gamma_{\pi,max}\cdot\frac{m_{\pi}}{m_{e}}\cdot\left(1-\frac{m_{\mu}^{2}}{m_{e}^{2}}\right),\gamma_{\mu,max}\cdot\frac{m_{\mu}}{m_{e}}\right), (39)

where γn,γe+,γπ,γμ\gamma_{n},\gamma_{e^{+}},\gamma_{\pi},\gamma_{\mu} and γν\gamma_{\nu} refer to the limits for neutrons, positrons, pions, muons and neutrinos respectively. It must be noted that we do not modify the higher limit for the electron population, and thus, the user must make sure that the electrons created by the Bethe-Heitler process do not fall outside the boundaries.


In order for the numerical schemes to be stable, the maximum possible timestep has to be selected as lower than the free escape timescale:

Δt,max=max(Δt,conf,tfree,esc).\Delta_{t,max}=\max\left(\Delta_{t,conf},t_{free,esc}\right). (40)

4.2 Stopping Criteria

Katu evolves the population of the particles according to the kinetic equation (1). As such, it would continuing stepping indefinitely if no stopping criteria are provided. In our modelling, and in the examples that we provide with the code, we have added two stopping criteria.

In the first case, we stop the modelling if the elapsed time is bigger than a tmaxt_{max} set in the configuration file.

In the second case, we stop if the simulation reaches a steady state with a tolerance better than the one required in the configuration file. For this, we check the evolution of three species of particles: electrons, photons and neutrinos, according to the following formula:

[1n(x)n(x)t]2𝑑x<tol,\sqrt{\int\left[\frac{1}{n\left(x\right)}\frac{\partial n\left(x\right)}{\partial t}\right]^{2}dx}<\text{tol}, (41)

where nn is the number density for a given type of particle.

4.3 Computational Optimizations

A number of strategies have been implemented to optimize the computational efficiency of our code.

4.3.1 Look-Up-Tables

Most, if not all, processes have an equation that fall into one of the following forms:

n(x)t\displaystyle\frac{\partial n\left(x\right)}{\partial t} =yminymaxn(y)R(x,y)𝑑y,\displaystyle=\int_{y_{min}}^{y_{max}}n\left(y\right)R\left(x,y\right)\,dy, (42)
n(x)t\displaystyle\frac{\partial n\left(x\right)}{\partial t} =yminymaxn(y)zminzmaxn(z)R(x,y,z)𝑑z𝑑y,\displaystyle=\int_{y_{min}}^{y_{max}}n\left(y\right)\int_{z_{min}}^{z_{max}}n\left(z\right)R\left(x,y,z\right)\,dzdy, (43)

where RR is a generic function describing an interaction between particles.

For example, synchrotron emission and absorption and almost all loss terms follow the first form, whereas all the processes that require two particles interacting follow the second form. The important part is that the function RR does only depend on the energy of the implied particles and thus, can be tabulated as part of the start-up of the modelling process once and be used repeatedly.

For the more extreme cases of complicated functions, such as for the Bethe-Heitler process, we create a super-table that encompasses the most likely boundaries for the energy and interpolate it onto the energy grid.

Reducing every processes to the form of (LABEL:[)s]lut_1 and (43) allows us to transform their implementations into a loop over the produced particles along with two nested integrals. In practice, this means that the worst case scenario for a process is to scale cubically with the number of bins for a particle population. The processes currently included in Katu scale at most quadratically with the population sizes of photons and electrons and linearly with the other particles.

4.3.2 Threading

During a single time step (assumed to be sufficiently small) all processes are treated as independent and can therefore be computed in parallel. As part of the start-up process, we create a series of workers and add them to a pool to wait for work. When the moment of calculating each process arrives, the manager thread adds each process to a queue and wakes up the workers, which take a work item from the queue and complete it. Then, the workers check the queue until it is empty and go back to sleep, with the last one signalling the manager thread that it can all have finished and the program can continue.

5 Tests of the model

The model on which we base our code included linear acceleration, handled through an acceleration timescale tacct_{acc}, and cooling caused by the synchrotron and inverse Compton processes. These terms, while mathematically simple, pose a computational challenge to calculate correctly, as discussed in 3. Hence, we concentrate our tests on verifying acceleration and cooling.

All tests have an extremely low value of background plasma (ρ=1038)\left(\rho=10^{-38}\right), injection (L=1030)\left(L=10^{30}\right) and ratio of protons to electrons (η=1010)\left(\eta=10^{-10}\right). This is in order to have a low population of pure electrons and a minimal quantity of photons, so that the losses through Inverse Compton cooling are minimal and the cooling behaviour can be controlled by setting the magnetic field strength. The geometry of the blob is chosen as a sphere with R=1016R=10^{16}, which gives tesc250173t_{esc}\approx 250173 s.


To test the acceleration code, we inject electrons at low energy γe,inj,max=102\gamma_{e,inj,max}=10^{2} and wait until the system has reached a steady state. According to Mastichiadis & Kirk (1995), above the maximum energy of injection, the electron population should adopt a power law behaviour with index 1+tacc/tesc1+t_{acc}/t_{esc}.

Refer to caption
Figure 1: Acceleration tests. Solid lines correspond to the output of our tests. The bands represent power laws with index 1+tacc/tesc1+t_{acc}/t_{esc}.

LABEL:Fig:Acceleration_Tests shows the result of the tests with different ratios of tacc/tesct_{acc}/t_{esc} implemented by varying the value of tacct_{acc}. We obtain a near-exact match between the numerical curves and their theoretical values.


To test the cooling code, we take a similar approach to the acceleration test, but we inject electrons only at high energies (γe,inj,min=104)\left(\gamma_{e,inj,min}=10^{4}\right) with a power law index of p=2.3p=2.3. The first part of this test studies the behaviour of the particle population below the injection point. By assuming a simplified version of (LABEL:kinetic_equation) in which there is neither injection nor acceleration and the only losses are due to escape and cooling, we can obtain a steady state solution for the population as:

n(γ)=Cexp(1Sτescγ)1γ2n\left(\gamma\right)=C\exp\left(\frac{1}{S\tau_{esc}\gamma}\right)\frac{1}{\gamma^{2}} (44)

where CC is a constant. In the absence of other contributions to the particle population, this solution fully represents the particle population distribution below the lower cut-off of the injection. For values of Sτescγ1S\tau_{esc}\gamma\gg 1, the exponential term tends to 11 and we obtain the result that the population behaves like a power law with index 22. For other cases, the exponential term is important and causes an exponential drop in the population towards lower energies. Physically, the first case represents that the cooling timescale is lower than the escape timescale where particles cool faster than they escape, which gives the power law behaviour. In the second case, we have that particles can escape much faster than they cool, so the population drops exponentially towards lower energies.

Refer to caption
Figure 2: Cooling tests below the injection lower limit. Solid lines correspond to the output of our tests. The bands represent (LABEL:population_low_energy_steady_state).

In 2 we show the resulting population of electrons for six different values of the magnetic field where on one extreme (B=0.1)\left(B=0.1\right) we have that the main driver of the particle population is escape, whereas for the other (B=2)\left(B=2\right) it is cooling. The bands follow (LABEL:population_low_energy_steady_state) with CC chosen as to match approximately the value of the population at the lower end of the injection point. As we can see, the match is quite good, which demonstrates the validity of our approach for the cooling algorithms.

The second part of the cooling test involves the behaviour of the particles in the injection range. For this range, we have again two behaviours depending on the cooling and escape timescales. The transition between these two behaviours is defined by a cooling break energy, which can be estimated as γc1/(1pinj)Sτesc\gamma_{c}\approx 1/\left(1-p_{inj}\right)S\tau_{esc}. Hence, keeping pinjp_{inj} and τesc\tau_{esc} constant, we can vary the value of the magnetic field to control the position of the cooling break. Below the cooling break, we have a population that is escape dominated, and develops into a population that keeps the original behaviour of the injection profile. Above the cooling break, the population is cooling dominated and as a result the population follows a power law steepened by 11.

Refer to caption
Figure 3: Cooling tests in the injection range. Solid lines correspond to the output of our tests. The bands represent the theoretical approximations for the population. The vertical dashed blue line corresponds to the cooling break for B=0.1B=0.1.

For 3 we have tested two values of the magnetic field (B=0.1 and B=2)\left(B=0.1\text{ and }B=2\right). In the first case (blue line), the cooling break is between the injection ends, so we get two behaviours as expected. For the lower energy part, the resulting population has an index of 2.32.3 and for the higher energy part, the index is 3.33.3, just as expected. Below the injection lower limit, the population of electrons escapes freely and that is reflected by the abrupt drop in population. For the second case (brown line), the cooling timescale is much lower and the cooling break is below the lower injection limit. Hence, between the injection ends, we see that the whole population has a index of 3.33.3.


In addition to the above tests, which are designed to stress the acceleration and cooling parts of our numerical schemes, we have also performed tests aimed towards checking the correctness of the implementation of the physical processes. In this case, we have checked that the tests of Dimitrakoudis et al. (2012) pass. Further demonstrations of the code can be found in Jiménez-Fernández & van Eerten (2020).

6 Bayesian Inference, Markov Chain Monte Carlo and Nested Sampling analysis

Bayesian inference provides a useful approach to parameter estimation and model selection. In this section we describe some of the key concepts of Bayesian inference and how it can be used in the task of fitting a model. For the case of blazar modelling, we additionally describe two different approaches that Katu is able to work with via two available wrappers: Markov Chain Monte Carlo (MCMC) analysis in the form of an ensemble sampler with affine invariance Goodman & Weare (2010) and Nested Sampling (NS) analysis in the form of nested elliptical sampling Feroz & Hobson (2008).

6.1 Bayesian Inference

We define a dataset 𝒟\mathcal{D} as the combination of observations yy and their associated errors σ\sigma. A model \mathcal{M} is defined as some function that can be applied to a set of parameters θ\theta to obtain some model predictions yy_{\mathcal{M}} that can be compared with the dataset. Then, we can define the probability that a dataset has been obtained given a model and a set or parameters as:

P(𝒟|θ,),P\left(\mathcal{D}|\theta,\mathcal{M}\right)\equiv\mathcal{L}, (45)

where \mathcal{L} is called the likelihood. Likewise, we can also define the a priori probability, or belief, of the parameters that we have used:

P(θ|)π,P\left(\theta|\mathcal{M}\right)\equiv\pi, (46)

where π\pi is denoted the prior. For example, when comparing leptonic and lepto-hadronic models, the parameter that most determines this division is η\eta. If we have any a priori information suggesting a certain distribution for η\eta, we can include this information as a prior. If not, the prior is typically set to be flat. Note however that this does not mean that no prior assumptions were included: a flat prior in logarithmic space will for example have a different impact than a flat prior in linear space.

By integrating over the whole parameter space Θ\Theta, we can obtain the probability of obtaining a dataset given a model:

P(𝒟|)𝒵=ΘP(𝒟|θ,)P(θ|)𝑑θ,P\left(\mathcal{D}|\mathcal{M}\right)\equiv\mathcal{Z}=\int_{\Theta}P\left(\mathcal{D}|\theta,\mathcal{M}\right)P\left(\theta|\mathcal{M}\right)d\theta, (47)

where 𝒵\mathcal{Z} is the evidence of the model, or, simplifying the notation:

𝒵=Θ(θ)π(θ)𝑑θ.\mathcal{Z}=\int_{\Theta}\mathcal{L}\left(\theta\right)\pi\left(\theta\right)d\theta. (48)

The evidence can be understood as a weighted average of the likelihood \mathcal{L} over the whole parameter space with the prior π\pi as a weight function. Thus, only the regions where the product of both the likelihood and the prior is high will mainly contribute to the value of the evidence. By integrating over the whole parameter space with a normalized prior, we are implicitly dividing by the volume of the space so models with more parameters that do not improve the likelihood will obtain a lower evidence.

By applying Bayes’ theorem, we can find the probability of a model given a dataset:

P(|𝒟)=P(𝒟|)P()P(𝒟),P\left(\mathcal{M}|\mathcal{D}\right)=\frac{P\left(\mathcal{D}|\mathcal{M}\right)P\left(\mathcal{M}\right)}{P\left(\mathcal{D}\right)}, (49)

where P()P\left(\mathcal{M}\right) and P(𝒟)P\left(\mathcal{D}\right) are the a priori probability, or belief, of the model and the data respectively. Here, P(𝒟)P\left(\mathcal{D}\right) denotes the probability of measuring the dataset 𝒟\mathcal{D} from the underlying physics and the observation method. P()P\left(\mathcal{M}\right) encapsulates our prior belief in the model \mathcal{M} that is not already reflected in the priors of the parameters. For example, if we know that blazars could be fit from two different models with a ratio of 11 to 44, we can incorporate this information in this term.

Denoting two models by aa and bb, we can take the ratio of their probabilities to obtain:

R=P(a|𝒟)P(b|𝒟)=𝒵aP(a)𝒵bP(b).R=\frac{P\left(\mathcal{M}_{a}|\mathcal{D}\right)}{P\left(\mathcal{M}_{b}|\mathcal{D}\right)}=\frac{\mathcal{Z}_{a}P\left(\mathcal{M}_{a}\right)}{\mathcal{Z}_{b}P\left(\mathcal{M}_{b}\right)}. (50)

Assuming that both models have the same a priori probability, we are left with the ratio of evidences as a way of comparing two models, so that if R>1R>1 model aa is more likely and vice versa, if R<1R<1, model bb is more likely.

As with all criteria for model selection, the actual amount of (relative) confidence in a given model that is linked to a given numerical value of R remains subjective. Taking as a guideline Kass & Raftery (1995), we give the following references for the value of RR: 13.21-3.2, ‘not worth more than a bare mention’; 3.2103.2-10, ‘substantial’; 1010010-100, ‘strong’ and >100>100, ‘decisive’.


By inverting (LABEL:evidence_expanded) and the conditioning, we can find the probability of a point in parameter space given a dataset and a model:

P(θ|𝒟,)=P(𝒟|θ,)P(θ|)P(𝒟|),P\left(\theta|\mathcal{D},\mathcal{M}\right)=\frac{P\left(\mathcal{D}|\theta,\mathcal{M}\right)P\left(\theta|\mathcal{M}\right)}{P\left(\mathcal{D}|\mathcal{M}\right)}, (51)

which we call the posterior distribution of θ\theta: 𝒫(θ)\mathcal{P}\left(\theta\right). Simplifying the notation, we obtain:

𝒫(θ)=(θ)π(θ)𝒵.\mathcal{P}\left(\theta\right)=\frac{\mathcal{L}\left(\theta\right)\pi\left(\theta\right)}{\mathcal{Z}}. (52)

For the purpose of parameter estimation, it is not necessary to compute the evidence.

Assuming that the data follow a Gaussian distribution, we can define the log-likelihood444The reader should not confuse the Likelihood ()\left(\mathcal{L}\right) with the log-likelihood (L)\left(L\right). associated between a dataset and the model predictions as:

L=12in[ln(2πσi2)+(yiy,iσi)2].L=-\frac{1}{2}\sum_{i}^{n}\left[\ln\left(2\pi\sigma_{i}^{2}\right)+\left(\frac{y_{i}-y_{\mathcal{M},i}}{\sigma_{i}}\right)^{2}\right]. (53)

The first term is a constant for every log-likelihood, as it does not depend on the model, only on the errors of the data. It can be proven that this term appears as a constant multiplicative factor upon the calculation of the evidence, so it divides out when comparing the evidence of two models.

6.2 MCMC and Nested Sampling

MCMC and Nested Sampling are two different approaches to analyse, using bayesian inference, a model given its likelihood. MCMC can sample from a given function, which in this case it is the likelihood of the model, and thus, we can reconstruct the shape of the likelihood and the posterior probabilities of any of its parameters. Nested sampling works by sampling from the likelihood function in a monotonically increasingly way. By assigning a statistical weight to each of the samples, the shape of the likelihood can be reconstructed and the evidence of the model can be calculated.


Even though sampling using MCMC is simple in principle, it has some drawbacks when the shape of the sampled distribution has various maxima and/or is very different from a simple gaussian. In order to avoid partially these problems, Goodman & Weare (2010) introduce the ’Ensemble Samplers with Affine Invariance’ and Foreman-Mackey et al. (2013) creates, using ideas from that work, emcee. The MCMC steps introduced in Goodman & Weare (2010) and implemented in emcee allows for sampling efficiently from any distribution that can be reduced to a multivariate gaussian by an affine transformation, increasing greatly the distributions for which MCMC is efficient at the cost of increasing the memory requirements. In particular, whereas simple MCMC uses a ’walker’ to traverse the parameter space, emcee uses an ensemble of them.

Even though the affine invariant ensemble samplers and emcee are a great statistical tool, they still have some problems inherited from the basic MCMC analysis. First, MCMC tends to perform very badly in situations where multiple modes exist, as it can take a long time for a walker to go from one maximum to another because in its path there is a volume of likelihood that is taken with lower probability than other steps. Second, MCMC needs some period of burn-in until the sampler starts to produce samples that are independent and representative of the sampled probability distribution. Third, not every distribution is well represented by affine transformations of a multivariate gaussian, so there are cases where emcee will still perform not optimally.

Nevertheless, after some iterations, we obtain a chain of parameters for each walker from which we can derive the sought for posterior probabilities by simply plotting the appropiate histograms.


In contrast to MCMC analysis, nested sampled is aimed towards the calculation of the evidence. The basic idea is that, by creating a sequence of points with increasingly likelihood and by assigning a weight to each of them depending on their position in the sequence, the evidence can be calculated. As the value of the likelihood increases in the sequence, the probability of drawing a new point with a better value decreases. multinest avoids this problem by using a set of points as ’live points’ and using them to define ellipses from where to drawn the new points (Feroz et al., 2009). This way, the sampling is much more efficient and the sequence can grow faster.

As the set of ’live points’ define the sampled volume, it can happen that this volume can be separated into isolated sub-volumes. In these cases, each of the sub-volumes is associated with one maximum of the likelihood function, and Nested Sampling, and multinest by extension, can detect and handle these separated maxima with ease.

Even though the length of the sequence can be made as big as wanted, there is a point where the last points contribute very little to the calculation of the evidence. multinest, by default, is set to stop the calculation of the evidence when the best point of the sequence increases the value of the log evidence by less than some tolerance.

Each of the points in the sequence is assigned a weight, and thus, can be used to create the histograms of the posterior probabilities of the parameters.

7 Application to Mrk 421

7.1 Dataset

The blazar Markarian 421 (Mrk 421) has been observed for decades. Mrk 421 is one of the nearest blazars to earth, and therefore one of the brightest. Mrk 421 is a BL Lacertae object, which tend to be variable in their flux levels and can typically be modelled without a need for an external source of photons.

Refer to caption
Figure 4: Combined SED for the two statistical packages. The blue line corresponds to the best SED obtained by multinest, the orange to the one obtained by emcee and the green one to multinest with a leptonic model. The respective log likelihoods are 39.78-39.78, 39.33-39.33 and 42.89-42.89.

The full dataset that we study was gathered during a year long multi-wavelength campaing during 2009, whose details are reported on Abdo et al. (2011) and we reproduce it in 4. It is important to notice that not all the experimental points follow the same trend, even when taking into account their errors. As an example of this problem, we reproduce the X-rays part of the SED in 5, where it can be clearly seen that there are two different trends with the different instruments, Swift and RXTE, where the first shows a lower flux than the second.

Refer to caption
Figure 5: MRK 421 X-ray data separated by instruments. The original errors of each observation are displayed in the same colour as the data with an errorbar. The error bands correspond to that of taking a=0.2a=0.2 with (LABEL:error_formula).

Similarly, observations at optical frequencies display a spectral variability that goes beyond their (very small) observational errors. To account for these systematic deviations from the base line physics captured by our model, we take a conservative approach to the sizes of the error bars of the observations (for an alternative approach when fitting complex models to broadband transient data, see Aksulu et al. (2020)). By increasing the error bars according to the prescription below, we acknowledge that our model fit is to the generic state of the blazar SED during the period of observation rather than to its instantaneous value at a given time of observation for a specific instrument. We use

σi=max(a×yi,σi,0),\sigma_{i}=\max\left(a\times y_{i},\sigma_{i,0}\right), (54)

where σi\sigma_{i} is the final error that we take, aa is a constant, yiy_{i} is the experimental flux and σi,0\sigma_{i,0} is the experimental error. For this analysis, we have chosen to separate the value of aa by band and have set aoptical=aγrays=0.1a_{optical}=a_{\gamma-rays}=0.1 and axrays=0.2a_{x-rays}=0.2. In 5 we show with bands the resulting error bands that we obtain and see that a common trend can be more easily fitted between the observations.

For the case of X-rays, the existence of two trends in the data had an important impact upon the fitting procedure and its results. The first consequence is that it creates two maxima in the likelihood function where each one is linked to each of the trends. The existence of more than one maximum can cause problems to fitting algorithms. For emcee, as we have discussed in 6.2, because it needs to have walkers in both maxima as it would be complicated for the walkers to go from one maxima to the other, and because in the degenerate case of having all walkers in one of the maxima the fitting procedure would need a very high time to ’discover’ the other maxima. For multinest the multiples maxima are not such a serious problem as it has the capability of discovering, separating and analysing each of the maxima separately, although in some cases, depending on the shape of the likelihood function this is not easy, or even doable. In both cases, what we would find is that the running time and the acceptance ratio of the fitting algorithms would be very high and low respectively.


The flux detected by SMA and most of the radio observatories should be handled carefully as they can not resolve the jet and instead detect the integrated flux from the galaxy. To take this into account, we have chosen to use the SMA point as an upper limit in our fits. Upper limits can be treated in different ways in a model fit. In this case, we have chosen to discard all model-generated SEDs that led to flux levels above the SMA value. Because the SMA upper limit in practice then automatically forces the SED to go below the radio upper limits at longer wavelengths, we can safely omit these from our fitting process without changing the outcome (confirmation of this is provided in 7.3).

7.2 Configuration

As we have discussed in 4 and can be seen in B, Katu supports a high number of configuration options. To focus on one particular set of models, we choose to set the plasma volume to a sphere and to allow for all the particles to evolve. Additionally, all the initial population distributions have ben set to simple power laws.

The background density (ρ)\left(\rho\right) and its ratio of protons to electrons has been set to 1026gcm310^{-26}g\cdot cm^{-3} and 11 respectively. Note that these values have no reflection on the final steady state, as the initial distribution will be overwhelmed by the injected one.

For the electrons, we have chosen a lower cut-off for the energy range of 22 and higher cut-off of max(γe,max,mpmeγp,max)\max\left(\gamma_{e,max},\frac{m_{p}}{m_{e}}\gamma_{p,max}\right), where γe,max\gamma_{e,max} and γp,max\gamma_{p,max} are the values of the upper cut-offs of the electron and proton injection distributions respectively. These limits allow that electrons can cool down to very low energies and that all the electrons that are produced through photo-meson and the Bethe-Heitler processes can be injected in the distribution.

Photons are also produced by many processes, so their energy limits have to be chosen to reflect this. We have taken a lower cut-off of 101210^{-12} and a higher cutoff of γe,0,max\gamma_{e,0,max}, which is the value of the higher energy cut-off for the electrons. We have also chosen to not inject photons from the exterior of the volume, so we set the value of the photon luminosity (Lγ)\left(L_{\gamma}\right) as 0.

For protons, we have taken that both, the initial and the injection distribution have the same shape, with a lower cut-off value of γp,min=10\gamma_{p,min}=10, a higher cut-off value of γp,max=106\gamma_{p,max}=10^{6} and a power law index of 22.

We assume that the charged particles scape slower than neutral particles, so we have set the charged to free escape ratio (σ)\left(\sigma\right) as 1010.

Even though we do not know of any process that would preferentially accelerate one kind of particle over the other, we have chosen to set the ratio of protons to electrons (η)\left(\eta\right) as a free variable to further explore the hadronic capabilities of our model. In 7.4 we fix this ratio to 101010^{-10} in order to have virtually no protons and thus, be able to simulate a fully leptonic model.


Numerically, we have chosen to set the tolerance to 10610^{-6} and have set the number of bins for the primary particles as: ne=np=160n_{e}=n_{p}=160 and nγ=128n_{\gamma}=128. Experience shows that these settings provide a good balance between numerical accuracy and reasonable convergence times for the simulations.

Finally, we are left with 88 free variables, which are gathered in 2, along with the respective search ranges. γe,inj,min\gamma_{e,inj,min}, γe,inj,max\gamma_{e,inj,max} and pp correspond to the parameters of the injected population of electrons. BB (G)\left(\text{G}\right) and δ\delta are the magnetic field and the Doppler boosting parameter of the volume, respectively; RR (cm)\left(\text{cm}\right) is the radius of the sphere we simulate; and LL (ergs1)\left(\text{erg}\cdot\text{s}^{-1}\right) and η\eta correspond to the total luminosity of the injected particles (in the plasma frame) and its ratio of protons to electrons. As cosmological parameters, we have chosen those of Collaboration et al. (2019): H0=67.66H_{0}=67.66, Ωm=0.3111\Omega_{m}=0.3111 and ΩΛ=0.6889\Omega_{\Lambda}=0.6889.

Parameter Search Range
γe,inj,min\gamma_{e,inj,min} 1010410-10^{4}
γe,inj,max\gamma_{e,inj,max} 104101010^{4}-10^{10}
pp 131-3
BB 102100.510^{-2}-10^{-0.5}
δ\delta 1010010-100
RR 1014101810^{14}-10^{18}
LL 1038104510^{38}-10^{45}
η\eta 100.110510^{-0.1}-10^{5}
Table 2: Free parameters and their search ranges.

7.3 Results

The results of the two statistical packages that we have tested are presented in 3 and 4. The corresponding corner plots for both packages can be seen in LABEL:fig:[s]Corner-Plot-multinestand 7 respectively.

Parameter multinest emcee
Best Median Best Median
log10(γe,inj,min)\log_{10}\left(\gamma_{e,inj,min}\right) 2.502.50 2.640.11+0.162.64_{-0.11}^{+0.16} 2.552.55 2.660.13+0.172.66_{-0.13}^{+0.17}
log10(γe,inj,max)\log_{10}\left(\gamma_{e,inj,max}\right) 5.505.50 5.610.11+0.125.61_{-0.11}^{+0.12} 5.585.58 5.610.12+0.145.61_{-0.12}^{+0.14}
pp 2.232.23 2.200.07+0.062.20_{-0.07}^{+0.06} 2.212.21 2.200.07+0.062.20_{-0.07}^{+0.06}
log10(B)\log_{10}\left(B\right) 0.96-0.96 1.030.10+0.09-1.03_{-0.10}^{+0.09} 1.04-1.04 1.030.10+0.10-1.03_{-0.10}^{+0.10}
log10(δ)\log_{10}\left(\delta\right) 1.871.87 1.820.07+0.071.82_{-0.07}^{+0.07} 1.841.84 1.820.08+0.071.82_{-0.08}^{+0.07}
log10(R)\log_{10}\left(R\right) 15.2615.26 15.420.20+0.1915.42_{-0.20}^{+0.19} 15.4015.40 15.410.22+0.2115.41_{-0.22}^{+0.21}
log10(L)\log_{10}\left(L\right) 45.4745.47 45.460.92+0.2945.46_{-0.92}^{+0.29} 45.6245.62 45.420.97+0.3145.42_{-0.97}^{+0.31}
log10(η)\log_{10}\left(\eta\right) 3.923.92 4.010.98+0.284.01_{-0.98}^{+0.28} 4.044.04 3.991.11+0.313.99_{-1.11}^{+0.31}
Table 3: Parameter comparisons with the two tested statistical packages.

It can be clearly seen from 3 that both approaches give very similar results for the best parameters, the median and the 1616 and 8484 percentiles, which show the consistency of the analysis of both packages. Likewise, from both corner plots, LABEL:fig:[s]Corner-Plot-multinestand 7, we see that the posterior shape and the 2D correlations are consistent. Furthermore, we see in 4 that both SEDs are very close together, and indeed, their log likelihoods are 39.78-39.78 and 39.33-39.33 for multinest and emcee, respectively. For multinest, we can also quote the value of the log evidence, which is not available for emcee, ln(𝒵)=65.750±0.232\ln\left(\mathcal{Z}\right)=-65.750\pm 0.232.

The choice that we made in 7.1, namely to choose the SMA datapoint as an upper limit and ignore the rest of the radio detections on the basis that the SMA was the most restricting one, is correct, as none of the best fits of the SED go above any of the radio datapoints.

In computational terms, it is complicated to compare both algorithms due to how they handle points with null likelihood, as multinest silently discards them while emcee treats them as any other point; and how the first steps of emcee had been handled. multinest has taken about a day of computations and at least on the order of 120 000120\,000 evaluations of the log likelihood function, whereas emcee has taken more than 400 000400\,000 evaluations and more than double the time, although part of these evaluations were done in initial short runs and due to leaving emcee to work overnight without a clear stopping condition.555All the computations were done on a server with 32 available cores with multinest and emcee calculating 8 log likelihoods in parallel and Katu with 4 threads. Even though emcee seems worse from the point of view of finishing the runs, it tends to give a result for the best log likelihood much sooner, and usually better, than multinest.

From 4, it can be seen that in the range of 104107\sim 10^{4}-10^{7} eV, there is a small hump between the synchrotron and inverse Compton peaks. In our simulations, this is caused by the pairs created by the Bethe-Heitler process Mastichiadis et al. (2005); Petropoulou & Mastichiadis (2015), highlighting the need for protons in order to completely fit the SED.

Abdo et al. (2011) estimate the luminosity of the blazar between 2.6×10462.6\times 10^{46} and 1.2×10471.2\times 10^{47}. Taking that the luminosity from the jet frame to the blazar frame transforms as Lblazarδ2Ljet/2L_{blazar}\approx\delta^{2}L_{jet}/2, we obtain Lblazar,multinest6.44×1048L_{blazar,multinest}\approx 6.44\times 10^{48} and Lblazar,emcee9.98×1048L_{blazar,emcee}\approx 9.98\times 10^{48}. Both values are clearly above the estimation of Abdo et al. (2011), an issue shared with other lepto hadronic models Liodakis & Petropoulou (2020).

7.4 Model selection using Bayesian inference: leptonic versus lepto-hadronic models

To verify that the small hump that appears between the peaks in 4 is indeed caused by the presence of protons and to compare our results against a purely leptonic model, we choose to repeat the statistical analysis of Mrk 421’s data with multinest but setting η=1010\eta=10^{-10}. This ratio of injected protons to electrons means that, effectively, the hadronic processes have no impact upon the final fit. The choice of multinest over emcee is to allow us to obtain a value for the evidence that we can compare against the evidence for the hadronic model.

As we can see in 4, the best SED that the leptonic model can produce is unable to fit the high energy part of the X-ray observations, and this is indeed reflected in the best log likelihood that it can achieve: 42.89-42.89. We also find that the log evidence has a value of ln(𝒵)=67.685±0.231\ln\left(\mathcal{Z}\right)=-67.685\pm 0.231.

Assuming that both models have the same a priori probability, we find that the hadronic model is favoured as:

ln(R)1.935±0.327.\ln\left(R\right)\approx 1.935\pm 0.327.

The corresponding Bayes’ Factor is 6.921.93+2.686.92_{-1.93}^{+2.68}. This value is consistent with the qualification substantial for the evidence in favour of the lepto-hadronic model, as discussed in 6.1. Note that this measure already accounts for the different number of parameters between the two models.

Lepto-hadronic models can successfully explain some of the features of blazar SEDs, such as the gamma ray peak being caused by neutral pion decay or proton synchrotron Mannheim (1993); Cerruti et al. (2015). But, leptonic models can also fit this peak with inverse compton scattering of synchrotron photons, and the energy requirements for the proton population are usually above the Eddington luminosity Liodakis & Petropoulou (2020). Some characteristics of the SEDs can only be explained by a hadronic population, such as a population of pairs produced by the Bethe-Heitler pair production process, or the presence of neutrinos Mastichiadis et al. (2005); Petropoulou & Mastichiadis (2015); Petropoulou et al. (2015); Rodrigues (2019); Rodrigues et al. (2020). However, to our knowledge, this is the first time that the dichotomy of leptonic vs. lepto-hadronic modelling has been explored from a fully bayesian point of view using kinetic equation modelling.

8 Summary

In this work we introduce the code Katu, for simulating the evolution of plasmas according to a set of kinetic equation for a series of particles. We have tried to capture as much of the physics as possible by adding diverse processes in a detailed way. We have tried to make it as performant as possible, on the mathematical front by adapting the kinetic equation to the species of particle and separating neutral from charged particles, and on the computational front by takind advantage of multi-threading and by calculating as much as possible up-front.

Given that Katu is fast, we are able to tie it to two commonly used statistical packages that can perform bayesian analysis of models, emcee and multinest. emcee is a python package that performs bayesian analysis by using affine invariant samplers for MCMC and multinest is a Fortran package, which we employ using pymultinest, which implements Elliptical Nested Sampling.

As an example of its usage, we show how Katu and both wrappers can be used to analyse the well known blazar Mrk 421 obtaining, as expected, that both statistical packages report similar values for the best, median and 1616 and 8484 percentiles of the posterior of the parameters.

Additionally, we perform the test of exploring a purely leptonic model, finding that there is substantial evidence to reject it in favour of a lepto-hadronic model.

The code Katu was also employed for the study of TXS 0506+056, as described in Jiménez-Fernández & van Eerten (2020).

Acknowledgements

The authors would like to thank Matteo Cerruti and Anatoli Fedynitch for the valuable discussions about their models and Petar Mimica for hosting them during their meetings at the University of Valencia. B.J.F. would also like to thank the late Cosmo the Cat, Jinx, Loki and Mina the Cats and Lottie the Dog for their support. B.J.F. acknowledges funding from STFC through the joint STFC-IAC scholarship program. H.J.v.E. acknowledges partial support by the European Union Horizon 2020 Programme under the AHEAD2020 project (grant agreement number 871158).

References

Appendix A Detailed Description of the processes

In this appendix we try to give a detailed description of the equations that we implement in our model, along with the references. The reader must notice two things. First, we have adapted the notation to make it consistent with our conventions through the text. Additionally, some equations have been rearranged into equivalent forms that are a bit simpler. Second, our numerical implementation, even though it follows these equations as closely as possible, has some quirks that the reader and anyone implementing the model should be aware of.

Note also that we do not explicitly add anything about the losses of particles due to decay because this term is explicitly tracked in the kinetic equation, 1.

In the first place, some of the equations are physically valid in limits such as γ1\gamma\rightarrow 1, ε0\varepsilon\rightarrow 0, 1β01-\beta\rightarrow 0, etc. but numerically they either become unstable or they fall into regimes where ’catastrophic loss of precision’ could happen. In these cases, the special cases have been pulled out and are treated separately or are accounted for by using approximations.

In the second case, in general, the integrals are taken in logarithmic space in energy, as that makes the bins equally sized and in general the energy term that is introduced can be simplified, which also simplifies the equations. Numerically, they are performed using the trapezium rule, which gives a good precision taking into account the errors already introduced by the approximations in the models and the final data we compare with. Likewise, the derivatives are calculated using a first order scheme, although in log-log space, as they are much more stable this way.

A.1 Synchrotron Emission and Absorption

In our model, all the charged particles create synchrotron emission, have synchrotron losses and cause synchrotron absorption. For this, we model the emission as:

jν=14π1n(γ)Pν(γ)𝑑γ,j_{\nu}=\frac{1}{4\pi}\int_{1}^{\infty}n\left(\gamma\right)P_{\nu}\left(\gamma\right)d\gamma, (55)

where Pν(γ)P_{\nu}\left(\gamma\right) is the energy emitted by a particle with energy γ\gamma at frequency ν\nu and has the form (Crusius & Schlickeiser, 1986):

Pν(γ)=34πq3Bmc2xCS(x),P_{\nu}\left(\gamma\right)=\frac{\sqrt{3}}{4\pi}\frac{q^{3}B}{mc^{2}}xCS\left(x\right), (56)

where qq and mm are the charge and the mass of the emitting particle, BB is the magnetic field strength, xx is the frequency normalized to a critical frequency x=ν/νcx=\nu/\nu_{c} with

νcν0γ234πqBmcγ2,\nu_{c}\coloneqq\nu_{0}\gamma^{2}\coloneqq\frac{3}{4\pi}\frac{qB}{mc}\gamma^{2}, (57)

and CS(x)CS\left(x\right) is a function given in terms of Whittaker functions:

CS(x)=W0,43(x)W0,13(x)W12,56(x)W12,56(x).CS\left(x\right)=W_{0,\frac{4}{3}}\left(x\right)W_{0,\frac{1}{3}}\left(x\right)-W_{\frac{1}{2},\frac{5}{6}}\left(x\right)W_{-\frac{1}{2},\frac{5}{6}}\left(x\right). (58)

The loss of energy of the emitting particles is modelled as:

γt=43cσTuBmec2(mem)3γ2,\frac{\partial\gamma}{\partial t}=-\frac{4}{3}c\sigma_{T}\frac{u_{B}}{m_{e}c^{2}}\left(\frac{m_{e}}{m}\right)^{3}\gamma^{2}, (59)

where σT\sigma_{T} is the Thomson cross-section and uBu_{B} is the energy density of the magnetic field.

The absorption of photons is modelled as:

αν=18πmν21γ2γ[nγ2]Pν(γ)𝑑γ.\alpha_{\nu}=-\frac{1}{8\pi m\nu^{2}}\int_{1}^{\infty}\gamma^{2}\frac{\partial}{\partial\gamma}\left[\frac{n}{\gamma^{2}}\right]P_{\nu}\left(\gamma\right)d\gamma. (60)

A.2 Inverse Compton

Photons interact with energetic electrons due to the inverse Compton process, which we model as:

jν(εs)=εs21ne(γe)0nγ(ε)gu,d(γe,ε,εs)𝑑ε𝑑γe,j_{\nu}\left(\varepsilon_{s}\right)=\frac{\hbar\varepsilon_{s}}{2}\int_{1}^{\infty}n_{e}\left(\gamma_{e}\right)\int_{0}^{\infty}n_{\gamma}\left(\varepsilon\right)g_{u,d}\left(\gamma_{e},\varepsilon,\varepsilon_{s}\right)d\varepsilon d\gamma_{e}, (61)

where ε\varepsilon is the initial energy of the photon, εs\varepsilon_{s} is the energy after the scattering, nen_{e} is the electron population, nγn_{\gamma} is the photon population and gu,d(γ,ε,εs)g_{u,d}\left(\gamma,\varepsilon,\varepsilon_{s}\right) are two functions referring to the upscattering and downscattering of photons which where calculated by Jones (Jones, 1968, equations 40 and 44):

gu(γ,ε,εs)\displaystyle g_{u}\left(\gamma,\varepsilon,\varepsilon_{s}\right) =3cσT4γ2ε[2qulnqu+(1+2qu)(1qu)+εs2γ(γεs)1qu2]if qu1εsε,εs<γ,\displaystyle=\frac{3c\sigma_{T}}{4\gamma^{2}\varepsilon}\left[2q_{u}\ln q_{u}+\left(1+2q_{u}\right)\left(1-q_{u}\right)+\frac{\varepsilon_{s}^{2}}{\gamma\left(\gamma-\varepsilon_{s}\right)}\frac{1-q_{u}}{2}\right]\quad\text{if }q_{u}\leq 1\leq\frac{\varepsilon_{s}}{\varepsilon},\,\varepsilon_{s}<\gamma, (62)
gd(γ,ε,εs)\displaystyle g_{d}\left(\gamma,\varepsilon,\varepsilon_{s}\right) =3cσT16γ4ε[(qd1)(1+2qd)2lnqd]if 1qd1εεs,\displaystyle=\frac{3c\sigma_{T}}{16\gamma^{4}\varepsilon}\left[\left(q_{d}-1\right)\left(1+\frac{2}{q_{d}}\right)-2\ln q_{d}\right]\qquad\qquad\qquad\qquad\qquad\text{if }\frac{1}{q_{d}}\leq 1\leq\frac{\varepsilon}{\varepsilon_{s}}, (63)

with

qu\displaystyle q_{u} =εs4εγ(γεs),\displaystyle=\frac{\varepsilon_{s}}{4\varepsilon\gamma\left(\gamma-\varepsilon_{s}\right)}, qd\displaystyle q_{d} =4γεs2ε.\displaystyle=\frac{4\gamma{{}^{2}}\varepsilon_{s}}{\varepsilon}. (64)

The loss term for photons can be obtained as:

nγ(ε)t=nγ(ε)1ne(γe)(γe,ε)𝑑γe,\frac{\partial n_{\gamma}\left(\varepsilon\right)}{\partial t}=-n_{\gamma}\left(\varepsilon\right)\int_{1}^{\infty}n_{e}\left(\gamma_{e}\right)\mathcal{R}\left(\gamma_{e},\varepsilon\right)d\gamma_{e}, (65)

where (γe,ε)\mathcal{R}\left(\gamma_{e},\varepsilon\right) is the reaction rate of electrons of energy γe\gamma_{e} with photons of energy ε\varepsilon and can be found from integrating the Klein-Nishina cross section.

(γe,ε)\displaystyle\mathcal{R}\left(\gamma_{e},\varepsilon\right) =316cσT1βγe2ε2[x4+14(x+1)+(x2+9x+8)2xln(1+x)+2Li2(x)]xMxm\displaystyle=\frac{3}{16}c\sigma_{T}\frac{1}{\beta\gamma_{e}^{2}\varepsilon^{2}}\left[-\frac{x}{4}+\frac{1}{4\left(x+1\right)}+\frac{\left(x^{2}+9x+8\right)}{2x}\ln\left(1+x\right)+2\operatorname{Li}_{2}\left(-x\right)\right]_{x_{M}}^{x_{m}} (66)

where:

xm\displaystyle x_{m} =2εγe(1+β),\displaystyle=2\varepsilon\gamma_{e}\left(1+\beta\right), xM\displaystyle x_{M} =2εγe(1β),\displaystyle=2\varepsilon\gamma_{e}\left(1-\beta\right), (67)

and Li2\operatorname{Li}_{2} the dilogarithm.

Following Böttcher et al. (2012), we use the simplified prescription for the electron energy losses in inverse Compton scattering:

γt=43cσTuγmec2γ2,\frac{\partial\gamma}{\partial t}=-\frac{4}{3}c\sigma_{T}\frac{u_{\gamma}}{m_{e}c^{2}}\gamma^{2}, (68)

where uγu_{\gamma} is the energy density of the photon field:

uγ=0εnγ(ε)𝑑ε.u_{\gamma}=\int_{0}^{\infty}\varepsilon n_{\gamma}\left(\varepsilon\right)d\varepsilon. (69)

A.3 Two photon Pair Production

Photons can interact among themselves creating electron and positron pairs, which we model according to (Böttcher & Schlickeiser, 1997).

Let us define

E\displaystyle E =ε1+ε2,\displaystyle=\varepsilon_{1}+\varepsilon_{2}, c±\displaystyle c_{\pm} =(ε1,2γ)21,\displaystyle=\left(\varepsilon_{1,2}-\gamma\right)^{2}-1, (70)
a\displaystyle a =γ(Eγ)+1,\displaystyle=\gamma\left(E-\gamma\right)+1, d±\displaystyle d_{\pm} =ε1,22+ε1ε2±γ(ε2ε1),\displaystyle=\varepsilon_{1,2}^{2}+\varepsilon_{1}\varepsilon_{2}\pm\gamma\left(\varepsilon_{2}-\varepsilon_{1}\right), (71)

where ε1,2\varepsilon_{1,2} are the energies of the initial photons and γ\gamma is the energy of the final electron or positron.

Then we can obtain the rate of created particles as:

ne±(γ)t=34σTc0𝑑ε1nγ(ε1)ε12max(1ε1,γ+1ε1)𝑑ε2nγ(ε2)ε22R(ε1,ε2,γ),\frac{\partial n_{e^{\pm}}\left(\gamma\right)}{\partial t}=\frac{3}{4}\sigma_{T}c\int_{0}^{\infty}d\varepsilon_{1}\frac{n_{\gamma}\left(\varepsilon_{1}\right)}{\varepsilon_{1}^{2}}\int_{\max\left(\frac{1}{\varepsilon_{1}},\gamma+1-\varepsilon_{1}\right)}^{\infty}d\varepsilon_{2}\frac{n_{\gamma}\left(\varepsilon_{2}\right)}{\varepsilon_{2}^{2}}R\left(\varepsilon_{1},\varepsilon_{2},\gamma\right), (72)

where

R(ε1,ε2,γ)=[E24ε24+H++H]εminεmax.R\left(\varepsilon_{1},\varepsilon_{2},\gamma\right)=\left[\frac{\sqrt{E^{2}-4\varepsilon^{\prime 2}}}{4}+H_{+}+H_{-}\right]_{\varepsilon_{min}^{\prime}}^{\varepsilon_{max}^{\prime}}. (73)

The values for ε\varepsilon^{\prime} are given by energetic and kinematic reasoning:

εmax\displaystyle\varepsilon_{max}^{\prime} =min(ε1ε2,ε),\displaystyle=\min\left(\sqrt{\varepsilon_{1}\varepsilon_{2}},\varepsilon^{\prime\star}\right), εmin\displaystyle\varepsilon_{min}^{\prime} =max(1,ε),\displaystyle=\max\left(1,\varepsilon^{\prime\dagger}\right), (74)

where

(ε,)2=a±a2E22.\left(\varepsilon^{\prime\star,\dagger}\right)^{2}=\frac{a\pm\sqrt{a^{2}-E^{2}}}{2}. (75)

The function H±H_{\pm} is given depending on the value of c±c_{\pm}:

H±\displaystyle H_{\pm} ={ε8ε1ε2+c±ε2[2(ε2+1ε2)+2ε1ε21c±+2c±d±ε1ε2]+I±4(2ε1ε21c±)if c±01(ε1ε2)3/2(ε312εd±8)+1ε1ε2(ε36+ε2+14ε)if c±=0,\displaystyle=\begin{cases}\frac{\varepsilon^{\prime}}{8\sqrt{\varepsilon_{1}\varepsilon_{2}+c_{\pm}\varepsilon^{\prime 2}}}\left[2\left(\varepsilon^{\prime 2}+\frac{1}{\varepsilon^{\prime 2}}\right)+2\frac{\varepsilon_{1}\varepsilon_{2}-1}{c_{\pm}}+\frac{2c_{\pm}-d_{\pm}}{\varepsilon_{1}\varepsilon_{2}}\right]+\frac{I_{\pm}}{4}\left(2-\frac{\varepsilon_{1}\varepsilon_{2}-1}{c_{\pm}}\right)&\text{if }c_{\pm}\neq 0\\ \frac{1}{\left(\varepsilon_{1}\varepsilon_{2}\right)^{3/2}}\left(\frac{\varepsilon^{\prime 3}}{12}-\frac{\varepsilon^{\prime}d_{\pm}}{8}\right)+\frac{1}{\sqrt{\varepsilon_{1}\varepsilon_{2}}}\left(\frac{\varepsilon^{\prime 3}}{6}+\frac{\varepsilon^{\prime}}{2}+\frac{1}{4\varepsilon^{\prime}}\right)&\text{if }c_{\pm}=0\end{cases}, (76)

with

I±={1c±log(εc±+ε1ε2+c±ε2)if c±>01c±arcsin(εc±ε1ε2)if c±<0.I_{\pm}=\begin{cases}\frac{1}{\sqrt{c_{\pm}}}\log\left(\varepsilon^{\prime}\sqrt{c_{\pm}}+\sqrt{\varepsilon_{1}\varepsilon_{2}+c_{\pm}\varepsilon^{\prime 2}}\right)&\text{if }c_{\pm}>0\\ \frac{1}{\sqrt{-c_{\pm}}}\arcsin\left(\varepsilon^{\prime}\sqrt{-\frac{c_{\pm}}{\varepsilon_{1}\varepsilon_{2}}}\right)&\text{if }c_{\pm}<0\end{cases}. (77)

Photon losses are modelled according to the fully integrated cross section in a similar way to the photon loss term for IC.

(x)\displaystyle\mathcal{R}\left(x\right) =3σT4x2{a2xa+Li2(1a2)Li2(1+a2)atanh(a)[ln(4x)1x2x+2]},\displaystyle=\frac{3\sigma_{T}}{4x^{2}}\left\{a-2xa+\operatorname{Li}_{2}\left(\frac{1-a}{2}\right)-\operatorname{Li}_{2}\left(\frac{1+a}{2}\right)-\operatorname{atanh}\left(a\right)\left[-\ln\left(4x\right)-\frac{1}{x}-2x+2\right]\right\}, (78)

where x=ε1ε2x=\varepsilon_{1}\varepsilon_{2} and a=11xa=\sqrt{1-\frac{1}{x}}.

A.4 Pair Annihilation

Just like a pair of photons can interact to produce a pair electron-positron, the opposite process can happen where electrons and positrons annihilate among themselves to produce a pair of photons. We model this process following Svensson (1982).

Let us start by defining the auxiliary variables:

γcm2\displaystyle\gamma_{cm}^{*2} ε(γ++γε),\displaystyle\coloneqq\varepsilon\left(\gamma_{+}+\gamma_{-}-\varepsilon\right), d±\displaystyle d_{\pm} γ(γ++γ)±ε(γ+γ),\displaystyle\coloneqq\gamma_{\mp}\left(\gamma_{+}+\gamma_{-}\right)\pm\varepsilon\left(\gamma_{+}-\gamma_{-}\right), (79)
c±\displaystyle c_{\pm} (γε)21,\displaystyle\coloneqq\left(\gamma_{\mp}-\varepsilon\right)^{2}-1, u±\displaystyle u_{\pm} γcm2+c±γcm2,\displaystyle\coloneqq\sqrt{\gamma_{cm}^{*2}+c_{\pm}\gamma_{cm}^{2}}, (80)

note that these definitions are similar but different to the case of pair production. In particular, c±c_{\pm} and d±d_{\pm} go with γ\gamma_{\mp} not the other way around.

Svensson (1982) defines the spectral emissivity as:

nγ(ε)t=γ+,minγ+,maxne+(γ+)γ,minγ,maxne(γ)vdσdε¯(ε,γ+,γ)𝑑γ𝑑γ+\frac{\partial n_{\gamma}\left(\varepsilon\right)}{\partial t}=\int_{\gamma_{+,min}}^{\gamma_{+,max}}n_{e^{+}}\left(\gamma_{+}\right)\int_{\gamma_{-,min}}^{\gamma_{-,max}}n_{e^{-}}\left(\gamma_{-}\right)\overline{v\frac{d\sigma}{d\varepsilon}}\left(\varepsilon,\gamma_{+},\gamma_{-}\right)d\gamma_{-}d\gamma_{+} (81)

where vdσdε¯\overline{v\frac{d\sigma}{d\varepsilon}} is the angle averaged emissivity and the limits γ±,min\gamma_{\pm,min} and γ±,max\gamma_{\pm,max} are defined using the auxiliary functions:

γA(ε)\displaystyle\gamma_{A}\left(\varepsilon\right) 4ε2+14ε,\displaystyle\coloneqq\frac{4\varepsilon^{2}+1}{4\varepsilon}, F±(ε,γ)\displaystyle F_{\pm}\left(\varepsilon,\gamma\right) 2εγ(1±β),\displaystyle\coloneqq 2\varepsilon-\gamma\left(1\pm\beta\right), (82)
γB(ε)\displaystyle\gamma_{B}\left(\varepsilon\right) 2ε22ε+12ε1,\displaystyle\coloneqq\frac{2\varepsilon^{2}-2\varepsilon+1}{2\varepsilon-1}, γ±(ε,γ)\displaystyle\gamma^{\pm}\left(\varepsilon,\gamma\right) 12(F±+1F±),\displaystyle\coloneqq\frac{1}{2}\left(F_{\pm}+\frac{1}{F_{\pm}}\right), (83)

which give the integration limits as:

γ±,max\displaystyle\gamma_{\pm,max} =,\displaystyle=\infty, (84)
γ+,min\displaystyle\gamma_{+,min} ={1if ε>12γA(ε)if ε12,\displaystyle=\begin{cases}1&\text{if }\varepsilon>\frac{1}{2}\\ \gamma_{A}\left(\varepsilon\right)&\text{if }\varepsilon\leq\frac{1}{2}\end{cases}, γ,min\displaystyle\gamma_{-,min} ={1if γ+γB(ε) and ε12γ+(ε,γ+)if γ+<γB(ε) and ε1γ(ε,γ+)for all other cases,\displaystyle=\begin{cases}1&\text{if }\gamma_{+}\geq\gamma_{B}\left(\varepsilon\right)\text{ and }\varepsilon\geq\frac{1}{2}\\ \gamma^{+}\left(\varepsilon,\gamma_{+}\right)&\text{if }\gamma_{+}<\gamma_{B}\left(\varepsilon\right)\text{ and }\varepsilon\geq 1\\ \gamma^{-}\left(\varepsilon,\gamma_{+}\right)&\text{for all other cases}\end{cases}, (85)

The angle averaged emissivity is given by:

vdσdε¯(ε,γ+,γ)=πcre21β+βγ+2γ2[(γ++γ)24γcm2+H++H]γcm,minγcm.max,\overline{v\frac{d\sigma}{d\varepsilon}}\left(\varepsilon,\gamma_{+},\gamma_{-}\right)=\pi cr_{e}^{2}\frac{1}{\beta_{+}\beta_{-}\gamma_{+}^{2}\gamma_{-}^{2}}\left[\sqrt{\left(\gamma_{+}+\gamma_{-}\right)^{2}-4\gamma_{cm}^{2}}+H_{+}+H_{-}\right]_{\gamma_{cm,min}}^{\gamma_{cm.max}}, (86)

where the limits are given by kinetic reasoning:

γcm±2\displaystyle\gamma_{cm}^{\pm 2} =12(1+γ+γ±β+βγ+γ),\displaystyle=\frac{1}{2}\left(1+\gamma_{+}\gamma_{-}\pm\beta_{+}\beta_{-}\gamma_{+}\gamma_{-}\right), (87)
γcm,min\displaystyle\gamma_{cm,min} =γcm,\displaystyle=\gamma_{cm}^{-}, γcm,max\displaystyle\gamma_{cm,max} =min(γcm+,γcm),\displaystyle=\min\left(\gamma_{cm}^{+},\gamma_{cm}^{*}\right), (88)

The H±H_{\pm} is given again in a form that depends on whether c±c_{\pm} is equal or different from 0.

H±={(2+1γcm2c±)I±+1u±[1γcmγcmc±+γcm2γcm2(2c±d±)]+u±γcmc±if c±0(23γcm3+2γcm+1γcm)1γcm+12(23γcm3d±γcm)1γcm3if c±=0,H_{\pm}=\begin{cases}\left(2+\frac{1-\gamma_{cm}^{*2}}{c_{\pm}}\right)I_{\pm}+\frac{1}{u_{\pm}}\left[\frac{1}{\gamma_{cm}}-\frac{\gamma_{cm}}{c_{\pm}}+\frac{\gamma_{cm}}{2\gamma_{cm}^{*2}}\left(2c_{\pm}-d_{\pm}\right)\right]+u_{\pm}\frac{\gamma_{cm}}{c_{\pm}}&\text{if }c_{\pm}\neq 0\\ \left(\frac{2}{3}\gamma_{cm}^{3}+2\gamma_{cm}+\frac{1}{\gamma_{cm}}\right)\frac{1}{\gamma_{cm}^{*}}+\frac{1}{2}\left(\frac{2}{3}\gamma_{cm}^{3}-d_{\pm}\gamma_{cm}\right)\frac{1}{\gamma_{cm}^{*3}}&\text{if }c_{\pm}=0\end{cases}, (89)

where

I±={1c±ln(γcmc±+u±)if c±>01c±arcsin(γcmγcmc±)if c±<0.I_{\pm}=\begin{cases}\frac{1}{\sqrt{c_{\pm}}}\ln\left(\gamma_{cm}\sqrt{c_{\pm}}+u_{\pm}\right)&\text{if }c_{\pm}>0\\ \frac{1}{\sqrt{-c_{\pm}}}\arcsin\left(\frac{\gamma_{cm}}{\gamma_{cm}^{*}}\sqrt{-c_{\pm}}\right)&\text{if }c_{\pm}<0\end{cases}. (90)

A.5 Photo-Meson Production

Hadrons (protons and neutrons) interact with photons producing hadronic resonances which ultimately decay in the form of hadrons and pions. To model this we follow the approximations of Hümmer et al (Hümmer et al., 2010) in their SimB model. We diverge though in the calculations of the loss and self-injection term for hadrons as, instead of applying a cooling term, we model every process as a loss and where applicable, we reinject the corresponding hadron with reduced energies.

For this process, and in order to avoid adding constants and units we choose to follow the original convention on units and notation. We begin with the equation for the injection of particles from an interaction between a photon and a proton:

𝒬bIT=Np(EbχIT)mpEbϵth/2𝑑ynγ(mpyχITEb)MbITfIT(y),\mathcal{Q}_{b}^{IT}=N_{p}\left(\frac{E_{b}}{\chi^{IT}}\right)\frac{m_{p}}{E_{b}}\int_{\epsilon_{th}/2}^{\infty}dyn_{\gamma}\left(\frac{m_{p}y\chi^{IT}}{E_{b}}\right)M_{b}^{IT}f^{IT}\left(y\right), (91)

where ITIT makes reference to the Interaction Type and bb to the daughter particle. The constants χIT\chi^{IT}, MbITM_{b}^{IT} and ϵth\epsilon_{th} and the function fIT(y)f^{IT}\left(y\right) depend on the interaction type.


For the Resonant Pion production, we follow their Simplified Model B (Hümmer et al., 2010, section 4.2.2 & Table 4), whose parameters we reproduce in 4.

IT ϵr\epsilon_{r} Range (GeV) σ\sigma(μ\mubarn) Initial Proton Initial Neutron K
Mπ0M_{\pi^{0}} Mπ+M_{\pi^{+}} MπM_{\pi^{-}} MnM_{n} Mπ0M_{\pi^{0}} Mπ+M_{\pi^{+}} MπM_{\pi^{-}} MnM_{n}
χπ0\chi_{\pi^{0}} χπ+\chi_{\pi^{+}} χπ\chi_{\pi^{-}} MpM_{p} χπ0\chi_{\pi^{0}} χπ+\chi_{\pi^{+}} χπ\chi_{\pi^{-}} MpM_{p}
LR 0.20.50.2-0.5 200200 2/32/3 1/31/3 1/31/3 2/32/3 1/31/3 2/32/3 0.220.22
0.220.22 0.220.22 2/32/3 0.220.22 0.220.22 1/31/3
HR 0.51.20.5-1.2 9090 0.470.47 0.770.77 0.340.34 0.430.43 0.470.47 0.340.34 0.770.77 0.570.57 0.390.39
0.260.26 0.250.25 0.220.22 0.570.57 0.260.26 0.220.22 0.250.25 0.430.43
Table 4: Sim-B parameters for the resonant pion production.

The ff functions are given depending on the Interaction Type as:

fLR={0if 2y<0.2GeV200μbarn(1(0.2GeV)2(2y)2)if 0.2GeV2y<0.5GeV200μbarn((0.5GeV)2(0.2GeV)2(2y)2)if 0.5GeV2y,f^{LR}=\begin{cases}0&\text{if }2y<0.2\text{GeV}\\ 200\mu\text{barn}\left(1-\frac{\left(0.2\text{GeV}\right)^{2}}{\left(2y\right)^{2}}\right)&\text{if }0.2\text{GeV}\leq 2y<0.5\text{GeV}\\ 200\mu\text{barn}\left(\frac{\left(0.5\text{GeV}\right)^{2}-\left(0.2\text{GeV}\right)^{2}}{\left(2y\right)^{2}}\right)&\text{if }0.5\text{GeV}\leq 2y\end{cases}, (92)
fHR={0if 2y<0.5GeV90μbarn(1(0.5GeV)2(2y)2)if 0.5GeV2y<1.2GeV90μbarn((1.2GeV)2(0.5GeV)2(2y)2)if 1.2GeV2yf^{HR}=\begin{cases}0&\text{if }2y<0.5\text{GeV}\\ 90\mu\text{barn}\left(1-\frac{\left(0.5\text{GeV}\right)^{2}}{\left(2y\right)^{2}}\right)&\text{if }0.5\text{GeV}\leq 2y<1.2\text{GeV}\\ 90\mu\text{barn}\left(\frac{\left(1.2\text{GeV}\right)^{2}-\left(0.5\text{GeV}\right)^{2}}{\left(2y\right)^{2}}\right)&\text{if }1.2\text{GeV}\leq 2y\end{cases} (93)

For the direct pion production, we follow (Hümmer et al., 2010, section 4.3 & Table 5), whose parameters we reproduce in 5.

IT ϵminIT\epsilon_{min}^{IT} (GeV) ϵmaxIT\epsilon_{max}^{IT} (GeV) χ\chi K Initial Proton Initial Neutron
Mπ0M_{\pi^{0}} Mπ+M_{\pi^{+}} MπM_{\pi^{-}} Mπ0M_{\pi^{0}} Mπ+M_{\pi^{+}} MπM_{\pi^{-}}
T1LT_{1L} 0.170.17 0.560.56 0.130.13 0.130.13 11 11
T1MT_{1M} 0.560.56 1010 0.050.05 0.050.05 11 11
T1HT_{1H} 1010 \infty 0.0010.001 0.0010.001 11 11
T2aLT_{2aL} 0.40.4 1.581.58 0.080.08 0.280.28 1/41/4 3/43/4 3/43/4 1/41/4
T2aMT_{2aM} 1.581.58 1010 0.020.02 0.220.22 1/41/4 3/43/4 3/43/4 1/41/4
T2aHT_{2aH} 1010 \infty 0.0010.001 0.2010.201 1/41/4 3/43/4 3/43/4 1/41/4
T2bT_{2b} 0.40.4 \infty 0.20.2 1/61/6 3/43/4 1/121/12 1/61/6 1/121/12 3/43/4
Table 5: Parameters for the direct pion production

The ff functions are given by;

fIT={0if 2y<ϵminIT12y2(IIT(2y)IIT(ϵminIT))if ϵminIT2y<ϵmaxIT12y2(IIT(ϵmaxIT)IIT(ϵminIT))if ϵmaxIT2y.f^{IT}=\begin{cases}0&\text{if }2y<\epsilon_{min}^{IT}\\ \frac{1}{2y^{2}}\left(I^{IT}\left(2y\right)-I^{IT}\left(\epsilon_{min}^{IT}\right)\right)&\text{if }\epsilon_{min}^{IT}\leq 2y<\epsilon_{max}^{IT}\\ \frac{1}{2y^{2}}\left(I^{IT}\left(\epsilon_{max}^{IT}\right)-I^{IT}\left(\epsilon_{min}^{IT}\right)\right)&\text{if }\epsilon_{max}^{IT}\leq 2y\end{cases}. (94)

The functions IITI^{IT} are parametrized in terms of xlog10(y(GeV))x\equiv\log_{10}\left(y\text{(GeV)}\right) and are different depending on the interaction type:

IT1={0if 2y<0.17GeV35.9533+84.0859x+110.765x2+102.728x3+40.4699x4if 0.17GeV<2y<0.96GeV30.2004+40.5478x+2.0374x20.387884x3+0.025044x4if 0.96GeV<2y,I^{T_{1}}=\begin{cases}0&\text{if }2y<0.17\text{GeV}\\ 35.9533+84.0859x+110.765x^{2}+102.728x^{3}+40.4699x^{4}&\text{if }0.17\text{GeV}<2y<0.96\text{GeV}\\ 30.2004+40.5478x+2.0374x^{2}-0.387884x^{3}+0.025044x^{4}&\text{if }0.96\text{GeV}<2y\end{cases}, (95)
IT2={0if 2y<0.4GeV3.4083+16.28642y+40.7160ln(2y)if 0.4GeV<2y.I^{T_{2}}=\begin{cases}0&\text{if }2y<0.4\text{GeV}\\ -3.4083+\frac{16.2864}{2y}+40.7160\ln\left(2y\right)&\text{if }0.4\text{GeV}<2y\end{cases}. (96)

For the multi pion production, we follow (Hümmer et al., 2010, section 4.4.2 & Table 6), whose parameters we reproduce in 6.

IT ϵminIT\epsilon_{min}^{IT} (GeV) ϵmaxIT\epsilon_{max}^{IT} (GeV) σ\sigma χ\chi K Initial Proton Initial Neutron
Mπ0M_{\pi^{0}} Mπ+M_{\pi^{+}} MπM_{\pi^{-}} Mπ0M_{\pi^{0}} Mπ+M_{\pi^{+}} MπM_{\pi^{-}}
M1LM_{1L} 0.5 0.9 60 0.1 0.27 0.32 0.34 0.04 0.32 0.04 0.34
M1HM_{1H} 0.5 0.9 60 0.4 0.17 0.29 0.05 0.17 0.05 0.29
M2LM_{2L} 0.9 1.5 85 0.15 0.34 0.42 0.31 0.07 0.42 0.07 0.31
M2HM_{2H} 0.9 1.5 85 0.35 0.19 0.35 0.08 0.19 0.08 0.35
M3LM_{3L} 1.5 5.0 120 0.15 0.39 0.59 0.57 0.30 0.59 0.30 0.57
M3HM_{3H} 1.5 5.0 120 0.35 0.16 0.21 0.13 0.16 0.13 0.21
M4LM_{4L} 5.0 50 120 0.07 0.49 1.38 1.37 1.11 1.38 1.11 1.37
M4HM_{4H} 5.0 50 120 0.35 0.16 0.25 0.23 0.16 0.23 0.25
M5LM_{5L} 50 500 120 0.02 0.45 3.01 2.86 2.64 3.01 2.64 2.86
M5HM_{5H} 50 500 120 0.5 0.20 0.21 0.14 0.20 0.14 0.21
M6LM_{6L} 500 5000 120 0.007 0.44 5.13 4.68 4.57 5.13 4.57 4.68
M6HM_{6H} 500 5000 120 0.5 0.27 0.29 0.12 0.27 0.12 0.29
M7LM_{7L} 5000 \infty 120 0.002 0.44 7.59 6.80 6.65 7.59 6.65 6.80
M7HM_{7H} 5000 \infty 120 0.6 0.26 0.27 0.13 0.26 0.13 0.27
Table 6: Parameters for the multi pion production.

The ff functions are given by;

fIT={0if 2y<ϵminMiσMi(2y)2((2y)2(ϵminMi)2)if ϵminMi2y<ϵmaxMiσMi(2y)2((ϵmaxMi)2(ϵminMi)2)if ϵmaxMi2y.f^{IT}=\begin{cases}0&\text{if }2y<\epsilon_{min}^{M_{i}}\\ \frac{\sigma^{M_{i}}}{\left(2y\right)^{2}}\left(\left(2y\right)^{2}-\left(\epsilon_{min}^{M_{i}}\right)^{2}\right)&\text{if }\epsilon_{min}^{M_{i}}\leq 2y<\epsilon_{max}^{M_{i}}\\ \frac{\sigma^{M_{i}}}{\left(2y\right)^{2}}\left(\left(\epsilon_{max}^{M_{i}}\right)^{2}-\left(\epsilon_{min}^{M_{i}}\right)^{2}\right)&\text{if }\epsilon_{max}^{M_{i}}\leq 2y\end{cases}. (97)

To account for energy losses in protons and neutrons, we also follow Hümmer et al. (2010, Appendix B), although we apply the framework of reinjection to all the particles and we ignore cooling. This way, all the losses can be treated with the same equation:

𝒬pIT=Np(Ep1KIT)mpEpϵth/2𝑑ynγ(mpy(1KIT)Ep)MpITfIT(y),\mathcal{Q}_{p^{\prime}}^{IT}=N_{p}\left(\frac{E_{p^{\prime}}}{1-K^{IT}}\right)\frac{m_{p}}{E_{p^{\prime}}}\int_{\epsilon_{th}/2}^{\infty}dyn_{\gamma}\left(\frac{m_{p}y\left(1-K^{IT}\right)}{E_{p^{\prime}}}\right)M_{p^{\prime}}^{IT}f^{IT}\left(y\right), (98)

where pp^{\prime} is the resulting hadron and KITK^{IT} is the inelasticity of the interaction type and the corresponding channel.

A.6 Bethe-Heitler Process

Besides producing mesons, a photon interacting with a proton can produce pairs electron-positron through the Bethe-Heitler process. We have chosen to model this process using the results of Kelner & Aharonian (2008, Section IV).

First, the injection of produced particles is defined as:

dne±(γe)dγe=c2γp3(γp+γe)24γp2γe𝑑εnγ(ε)ε2(γp+γe)22γpγe2γpε𝑑ωωγp2+γe22γpγeω1dγpW(ω,γ,ξ),\frac{dn_{e^{\pm}}\left(\gamma_{e}\right)}{d\gamma_{e}}=\frac{c}{2\gamma_{p}^{3}}\int_{\frac{\left(\gamma_{p}+\gamma_{e}\right)^{2}}{4\gamma_{p}^{2}\gamma_{e}}}^{\infty}d\varepsilon\frac{n_{\gamma}\left(\varepsilon\right)}{\varepsilon^{2}}\int_{\frac{\left(\gamma_{p}+\gamma_{e}\right)^{2}}{2\gamma_{p}\gamma_{e}}}^{2\gamma_{p}\varepsilon}d\omega\omega\int_{\frac{\gamma_{p}^{2}+\gamma_{e}^{2}}{2\gamma_{p}\gamma_{e}}}^{\omega-1}\frac{d\gamma_{-}}{p_{-}}W\left(\omega,\gamma_{-},\xi\right), (99)

where γe\gamma_{e} is the energy of the final electron (or correspondingly positron), γp\gamma_{p} is the energy of the initial proton and ε\varepsilon is the energy of the incoming photon. ω\omega and γ\gamma_{-} act as integration variables which act as energies in the rest frame of the proton. Note that we have added a factor cc with respect to the original equation to correct the units.

In the limit of highly energetic protons we have that:

cosθ=ξ=γpγγeγpβγ.\cos\theta=\xi=\frac{\gamma_{p}\gamma_{-}-\gamma_{e}}{\gamma_{p}\beta_{-}\gamma_{-}}. (100)

WW is the differential reaction rate:

W(ω,γ,ξ)=dσ2dγdcosθ,W\left(\omega,\gamma_{-},\xi\right)=\frac{d\sigma^{2}}{d\gamma_{-}d\cos\theta_{-}}, (101)

which is available in Blumenthal (1970, equation 10) under de assumption of having a proton with infinite mass:

dσ2dγdcosθ\displaystyle\frac{d\sigma^{2}}{d\gamma_{-}d\cos\theta_{-}} =(αZ2r02pp+2ω3){4sin2θ2γ2+1p2Δ4+5γ22γ+γ+3p2Δ2+p2ω2T2Δ2+2γ+p2Δ+\displaystyle=\left(\frac{\alpha Z^{2}r_{0}^{2}p_{-}p_{+}}{2\omega^{3}}\right)\left\{-4\sin^{2}\theta_{-}\frac{2\gamma_{-}^{2}+1}{p_{-}^{2}\Delta_{-}^{4}}+\frac{5\gamma_{-}^{2}-2\gamma_{+}\gamma_{-}+3}{p_{-}^{2}\Delta_{-}^{2}}+\frac{p_{-}^{2}-\omega^{2}}{T^{2}\Delta_{-}^{2}}+\frac{2\gamma_{+}}{p_{-}^{2}\Delta_{-}}+\right.
+Ypp+[2γsin2θ3ω+p2γ+Δ4+2γ2(γ2+γ+2)7γ23γ+γγ+2+1Δ2+ω(γ2γγ+1)Δ]\displaystyle+\frac{Y}{p_{-}p_{+}}\left[2\gamma_{-}\sin^{2}\theta_{-}\frac{3\omega+p_{-}^{2}\gamma_{+}}{\Delta_{-}^{4}}+\frac{2\gamma_{-}^{2}\left(\gamma_{-}^{2}+\gamma_{+}^{2}\right)-7\gamma_{-}^{2}-3\gamma_{+}\gamma_{-}-\gamma_{+}^{2}+1}{\Delta_{-}^{2}}+\frac{\omega\left(\gamma_{-}^{2}-\gamma_{-}\gamma_{+}-1\right)}{\Delta_{-}}\right] (102)
δ+Tp+T[2Δ23ωΔω(p2ω2)T2Δ]2y+Δ},\displaystyle\left.-\frac{\delta_{+}^{T}}{p_{+}T}\left[\frac{2}{\Delta_{-}^{2}}-\frac{3\omega}{\Delta_{-}}-\frac{\omega\left(p_{-}^{2}-\omega^{2}\right)}{T^{2}\Delta_{-}}\right]-\frac{2y_{+}}{\Delta_{-}}\right\},

where

T\displaystyle T =p+2+2ωΔ,\displaystyle=\sqrt{p_{+}^{2}+2\omega\Delta_{-}}, δ+T\displaystyle\delta_{+}^{T} =ln(1+p+p++TωΔ),\displaystyle=\ln\left(1+p_{+}\frac{p_{+}+T}{\omega\Delta_{-}}\right), (103)
Y\displaystyle Y =2p2ln(γγ++pp++1ω),\displaystyle=\frac{2}{p_{-}^{2}}\ln\left(\frac{\gamma_{-}\gamma_{+}+p_{-}p_{+}+1}{\omega}\right), Δ\displaystyle\Delta_{-} =γeγp,\displaystyle=\frac{\gamma_{e}}{\gamma_{p}}, (104)
y+\displaystyle y_{+} =2p+atanh(β+),\displaystyle=\frac{2}{p_{+}}\operatorname{atanh}\left(\beta_{+}\right), (105)

and α\alpha is the fine structure constant, ZZ is the electric charge of the scattering nucleus (in our case for protons Z=1Z=1) and r0r_{0} is the classical radius of the electron. We have chosen to keep the notation compact p±=β±γ±p_{\pm}=\beta_{\pm}\gamma_{\pm} to denote the modulus of the momentum of a particle. Under the assumption of the proton having infinite mass, or being at rest, we have that ω=γ+γ+\omega=\gamma_{-}+\gamma_{+}.

We will focus on the second and third integrals, which only depend on external variables:

F(γp,γe,ε)=12(1Δ+Δ)+12γpε𝑑ωω12(1Δ+Δ)ω1dγpW(ω,γ,Δ).F\left(\gamma_{p},\gamma_{e},\varepsilon\right)=\int_{\frac{1}{2}\left(\frac{1}{\Delta_{-}}+\Delta_{-}\right)+1}^{2\gamma_{p}\varepsilon}d\omega\omega\int_{\frac{1}{2}\left(\frac{1}{\Delta_{-}}+\Delta_{-}\right)}^{\omega-1}\frac{d\gamma_{-}}{p_{-}}W\left(\omega,\gamma_{-},\Delta_{-}\right). (106)

Pulling out some constants we obtain:

F(γp,γe,ε)\displaystyle F\left(\gamma_{p},\gamma_{e},\varepsilon\right) =αZ2r022Δ12(1Δ+Δ)+12γpεdωω2G(ω,Δ),\displaystyle=\frac{\alpha Z^{2}r_{0}^{2}}{2\Delta_{-}}\int_{\frac{1}{2}\left(\frac{1}{\Delta_{-}}+\Delta_{-}\right)+1}^{2\gamma_{p}\varepsilon}\frac{d\omega}{\omega^{2}}G^{\prime}\left(\omega,\Delta_{-}\right), (107)
G(ω,Δ)\displaystyle G^{\prime}\left(\omega,\Delta_{-}\right) =12(1Δ+Δ)ω1𝑑EW(ω,E,Δ),\displaystyle=\int_{\frac{1}{2}\left(\frac{1}{\Delta_{-}}+\Delta_{-}\right)}^{\omega-1}dE_{-}W^{\prime}\left(\omega,E_{-},\Delta_{-}\right), (108)
W1(ω,γ,Δ)\displaystyle W_{1}^{\prime}\left(\omega,\gamma_{-},\Delta_{-}\right) =p+(4sin2θ2γ2+1p2Δ3+5γ22γ+γ+3p2Δ+p2ω2T2Δ+2γ+p2),\displaystyle=p_{+}\left(-4\sin^{2}\theta_{-}\frac{2\gamma_{-}^{2}+1}{p_{-}^{2}\Delta_{-}^{3}}+\frac{5\gamma_{-}^{2}-2\gamma_{+}\gamma_{-}+3}{p_{-}^{2}\Delta_{-}}+\frac{p_{-}^{2}-\omega^{2}}{T^{2}\Delta_{-}}+\frac{2\gamma_{+}}{p_{-}^{2}}\right), (109)
W2(ω,γ,Δ)\displaystyle W_{2}^{\prime}\left(\omega,\gamma_{-},\Delta_{-}\right) =Yp[2γsin2θ3ω+p2γ+Δ3+2γ2(γ2+γ+2)7γ23γ+γγ+2+1Δ+ω(γ2γγ+1)],\displaystyle=\frac{Y}{p_{-}}\left[2\gamma_{-}\sin^{2}\theta_{-}\frac{3\omega+p_{-}^{2}\gamma_{+}}{\Delta_{-}^{3}}+\frac{2\gamma_{-}^{2}\left(\gamma_{-}^{2}+\gamma_{+}^{2}\right)-7\gamma_{-}^{2}-3\gamma_{+}\gamma_{-}-\gamma_{+}^{2}+1}{\Delta_{-}}+\omega\left(\gamma_{-}^{2}-\gamma_{-}\gamma_{+}-1\right)\right], (110)
W3(ω,γ,Δ)\displaystyle W_{3}^{\prime}\left(\omega,\gamma_{-},\Delta_{-}\right) =δ+TT[2Δ3ωω(p2ω2)T2],\displaystyle=-\frac{\delta_{+}^{T}}{T}\left[\frac{2}{\Delta_{-}}-3\omega-\frac{\omega\left(p_{-}^{2}-\omega^{2}\right)}{T^{2}}\right], (111)
W4(ω,γ,Δ)\displaystyle W_{4}^{\prime}\left(\omega,\gamma_{-},\Delta_{-}\right) =4atanh(β+).\displaystyle=-4\operatorname{atanh}\left(\beta_{+}\right). (112)

As far as we know, no simple form for these integrals exist so they are performed numerically using an adaptive integration procedure from the Gnu Scientific Library.

Finally, to calculate the injection rate for electrons or positrons, (LABEL:BH_dne/dgp) has to be integrated over the proton population:

𝒬e(γe)=1dne±(γe)dγenp(γp)𝑑γp.\mathcal{Q}_{e}\left(\gamma_{e}\right)=\int_{1}^{\infty}\frac{dn_{e^{\pm}}\left(\gamma_{e}\right)}{d\gamma_{e}}n_{p}\left(\gamma_{p}\right)d\gamma_{p}. (113)

A.7 Pion-Muon Decays

To account for the cascade of pion and muon decays we follow the same model of Lipari et al. (2007) and Hümmer et al. (2010).

The injection of produced particles can be factorized as:

Qb(Eb)=𝑑EaNa(Ea)τaFab(Eb/Ea)Ea,Q_{b}\left(E_{b}\right)=\int dE_{a}\frac{N_{a}\left(E_{a}\right)}{\tau_{a}}\frac{F_{a\rightarrow b}\left(E_{b}/E_{a}\right)}{E_{a}}, (114)

where EaE_{a} and EbE_{b} are the energies of the parent and daughter particles (not normalized), τa\tau_{a} is the decay timescale of the parent particle and FabF_{a\rightarrow b} is a function that gives the ratio of decaying.

Charged pions decay to muons, which subsequently decay through the following channels:

π+\displaystyle\pi^{+}\rightarrow μ++νμ\displaystyle\mu^{+}+\nu_{\mu}
μ+e++νμ¯+νe\displaystyle\mu^{+}\rightarrow e^{+}+\overline{\nu_{\mu}}+\nu_{e}
π\displaystyle\pi^{-}\rightarrow μ+νμ¯\displaystyle\mu^{-}+\overline{\nu_{\mu}}
μe+νμ¯+νe\displaystyle\mu^{-}\rightarrow e^{-}+\overline{\nu_{\mu}}+\nu_{e}

The rate of charged pions decaying into neutrinos is given by Lipari et al. (2007):

Fπ+νμ(x)=Fπνμ¯(x)=11rπΘ(1rπx).F_{\pi^{+}\rightarrow\nu_{\mu}}\left(x\right)=F_{\pi^{-}\rightarrow\overline{\nu_{\mu}}}\left(x\right)=\frac{1}{1-r_{\pi}}\Theta\left(1-r_{\pi}-x\right). (115)

For the decay into muons we need to track the helicity of the final muons, as it will impact the subsequent decays into neutrinos, Hümmer et al. (2010):

Fπ+μR+(x)\displaystyle F_{\pi^{+}\rightarrow\mu_{R}^{+}}\left(x\right) =FπμL(x)=(1x)rπx(1rπ)2Θ(xrπ),\displaystyle=F_{\pi^{-}\rightarrow\mu_{L}^{-}}\left(x\right)=\frac{\left(1-x\right)r_{\pi}}{x\left(1-r_{\pi}\right)^{2}}\Theta\left(x-r_{\pi}\right), (116)
Fπ+μL+(x)\displaystyle F_{\pi^{+}\rightarrow\mu_{L}^{+}}\left(x\right) =FπμR(x)=xrπx(1rπ)2Θ(xrπ).\displaystyle=F_{\pi^{-}\rightarrow\mu_{R}^{-}}\left(x\right)=\frac{x-r_{\pi}}{x\left(1-r_{\pi}\right)^{2}}\Theta\left(x-r_{\pi}\right). (117)

Then, decay of muons to neutrinos is given by Hümmer et al. (2010):

Fμ+νμ¯(x,h)\displaystyle F_{\mu^{+}\rightarrow\overline{\nu_{\mu}}}\left(x,h\right) =Fμνμ(x,h)=(53+3x2+34x3)+h(13+3x283x3),\displaystyle=F_{\mu^{-}\rightarrow\nu_{\mu}}\left(x,-h\right)=\left(\frac{5}{3}+3x^{2}+\frac{3}{4}x^{3}\right)+h\left(-\frac{1}{3}+3x^{2}-\frac{8}{3}x^{3}\right), (118)
Fμ+νe(x,h)\displaystyle F_{\mu^{+}\rightarrow\nu_{e}}\left(x,h\right) =Fμνe¯(x,h)=(26x2+4x3)+h(212x+18x28x3),\displaystyle=F_{\mu^{-}\rightarrow\overline{\nu_{e}}}\left(x,-h\right)=\left(2-6x^{2}+4x^{3}\right)+h\left(2-12x+18x^{2}-8x^{3}\right), (119)

with h=1h=1 for right handed muons and h=1h=-1 for left handed muons.

The functions that govern the subsequent injection of electrons and positrons do depend on the helicity of the parent muons and can be simplified to:

FμL+e+\displaystyle F_{\mu_{L}^{+}\rightarrow e^{+}} (x)=FμRe(x)=23x2+6x3,\displaystyle\left(x\right)=F_{\mu_{R}^{-}\rightarrow e^{-}}\left(x\right)=2-3x^{2}+6x^{3}, (120)
FμR+e+\displaystyle F_{\mu_{R}^{+}\rightarrow e^{+}} (x)=FμLe(x)=4342x3.\displaystyle\left(x\right)=F_{\mu_{L}^{-}\rightarrow e^{-}}\left(x\right)=\frac{4}{3}-\frac{4}{2}x^{3}. (121)

A.8 Neutral Pion Decay

The injection of photons from neutral pions follows the simple equation:

𝒬γ(ε)=min(1,memπε)nπ0(γπ0)γπ0τπ0𝑑γπ0.\mathcal{Q}_{\gamma}\left(\varepsilon\right)=\int_{\min\left(1,\frac{m_{e}}{m_{\pi}}\varepsilon\right)}\frac{n_{\pi^{0}\left(\gamma_{\pi^{0}}\right)}}{\gamma_{\pi^{0}}\tau_{\pi^{0}}}d\gamma_{\pi^{0}}. (122)

Appendix B Configuration File

Katu is very easy to configure, this is done is done with the Toml666https://toml.io/en/ language, which allows us to separate variables per tables and subtables. An example configuration can be found in the default.toml file. As a reference, we detail all the possible options here.

The [general] table refers to general options for the simulation and the plasma. The possible keys are:

  • density: The density of the background plasma

  • magnetic_field: The magnitude of the background magnetic field

  • eta: The ratio of protons to electrons in the background plasma

  • cfe_ratio: The ratio of charged to free escape

  • t_acc: The acceleration timescale

  • dt: The initial value of the time step

  • dt_max: The maximum value of the time step

  • t_max: The maximum simulation time

  • tol: The requested tolerance for the steady state

The [volume] table refers to options referred to the shape of the simulation volume.

  • shape: The shape of the volume, currently it supports "sphere" and "disk"

  • R: The radius of the volume

  • h: The height of the disk. (Only used for the "disk" shape)

The external_injection table refers to the injection of particles from outside the simulated volume. The possible keys are:

  • luminosity: The integrated luminosity of the injected particles

  • eta: The ratio of protons to electrons to inject.

Note that the subtable external_injection.photons also has luminosity as a key separated from the particle injection.

The distribution tables and subtables ([protons], [electrons] and [photons]) refer to the distribution of particles, their limits and number of bins. The available keys are:

  • gamma_min: The minimum value for gamma for protons and electrons

  • gamma_max: The maximum value for gamma for protons and electrons

  • epsilon_min: The minimum value for epsilon for photons

  • epsilon_max: The maximum value for epsilon for photons

  • size: The number of bins for the particle kind

The distribution of the corresponding particle is controlled by the key distribution_type and supports the following options, along with the corresponding keys:

  • "maxwell_juttner": This option gives a thermal distribution of particles

    • temperature: The temperature of the distribution, in terms of the rest mass energy of the particle

  • "black_body": This option gives a thermal distribution of photons

    • temperature: The temperature of the distribution, in terms of the rest mass energy of an electron

  • "power_law": This option gives a simple power law

    • slope: The value of the exponent of the power law

  • "broken_power_law": This option gives a broken power law

    • break_point: The value of the energy where the the power law changes from the first slope to the second

    • first_slope: The value of the first exponent of the power law

    • second_slope: The value of the second exponent of the power law

  • "connected_power_law": This option gives a smooth power law with two slopes

    • connection_point: The value of the energy where the the power law changes from the first slope to the second

    • first_slope: The value of the first exponent of the power law

    • second_slope: The value of the second exponent of the power law

  • "power_law_with_exponential_cutoff": This option gives a power law with an exponential cutoff

    • slope: The value of the exponent of the power law

    • break_point: The normalization value for the energy in the exponential cutoff

  • "hybrid": This option gives a thermal distribution of particles at low energies coupled with a power law at higher energies

    • temperature: The temperature of the distribution, in terms of the rest mass energy

    • slope: The value of the exponent of the power law

Appendix C Corner Plots

Refer to caption
Figure 6: Corner Plot for multinest. The blue line represents the best loglikelihood that has been found.
Refer to caption
Figure 7: Corner Plot for emcee. The blue line represents the best loglikelihood that has been found.