On the impact of baryons on the halo mass function, bias, and cluster cosmology

Tiago Castro,^1,2,3,4 Stefano Borgani,^1,2,3,4 Klaus Dolag,^5,6 Valerio Marra,^2,3,7 Miguel Quartin,^8,9 Alexandro Saro,^1,2,3,4 Emiliano Sefusatti^2,3,4
¹Dipartimento di Fisica, Sezione di Astronomia, Università di Trieste, Via Tiepolo 11, I-34143 Trieste, Italy
²INAF – Osservatorio Astronomico di Trieste, via Tiepolo 11, I-34131 Trieste, Italy
³IFPU – Institute for Fundamental Physics of the Universe, via Beirut 2, 34151, Trieste, Italy
⁴INFN – Sezione di Trieste, I-34100 Trieste, Italy
⁵University Observatory Munich, Scheinerstr. 1, 81679 Munich, Germany
⁶Max-Planck-Institut fur Astrophysik, Karl-Schwarzschild Strasse 1, 85748 Garching, Germany
⁷Núcleo de Astrofísica e Cosmologia & Departamento de Física, Universidade Federal do Espírito Santo, 29075-910, ES, Brazil
⁸Instituto de Física, Universidade Federal do Rio de Janeiro, 21941-972, Rio de Janeiro, RJ, Brazil
⁹Observatório do Valongo, Universidade Federal do Rio de Janeiro, 20080-090, Rio de Janeiro, Brazil

(Accepted XXX. Received YYY; in original form ZZZ)

Abstract

Luminous matter produces very energetic events, such as active galactic nuclei and supernova explosions, that significantly affect the internal regions of galaxy clusters. Although the current uncertainty in the effect of baryonic physics on cluster statistics is subdominant as compared to other systematics, the picture is likely to change soon as the amount of high-quality data is growing fast, urging the community to keep theoretical systematic uncertainties below the ever-growing statistical precision. In this paper, we study the effect of baryons on galaxy clusters, and their impact on the cosmological applications of clusters, using the Magneticum suite of cosmological hydrodynamical simulations. We show that the impact of baryons on the halo mass function can be recast in terms on a variation of the mass of the halos simulated with pure N-body, when baryonic effects are included. The halo mass function and halo bias are only indirectly affected. Finally, we demonstrate that neglecting baryonic effects on halos mass function and bias would significantly alter the inference of cosmological parameters from high-sensitivity next-generations surveys of galaxy clusters.

keywords:

Large-scale structure – cosmology:simulations – galaxy clusters

^†^†pubyear: 2020^†^†pagerange: On the impact of baryons on the halo mass function, bias, and cluster cosmology–C

1 Introduction

Clusters of galaxies — clusters, in short — are the most massive gravitational self-bound structures in the Universe. Within the standard cosmological framework, they form a continuous merging of smaller dark matter halos, and sit atop of the hierarchy of collapsed cosmic structures. During this process, cosmic baryons fall into the potential wells of galaxy clusters, thereby giving rise to the variety of astrophysical processes that determine the observational properties of clusters, and of their galaxy population, at different wavelengths (e.g. Kravtsov & Borgani, 2012, for a review). The way in which galaxy clusters take shape from primordial density perturbations and in turn define the most extreme environment for galaxy formation are per se fascinating subjects of study (see, eg., McDonald et al., 2012; Webb et al., 2015; Ellien et al., 2019; Schellenberger et al., 2019; Yuan et al., 2020), whose complexity is best captured by advanced cosmological simulations (e.g. Borgani & Kravtsov, 2011).

Clusters also provide powerful cosmological tests of both expansion history and growth of density perturbations (Allen et al., 2011, for a review). Their abundance and spatial distribution on very large scales — described by the halo mass function and 2-point statistics — are tightly connected with fluctuations in the primordial matter density field. The exact connection is a very active topic in cosmology, where cosmological simulations are usually the primary theoretical tool.

A common assumption in the study of halo mass function and bias is that they are determined only by gravitational instability acting on primordial fluctuations of a collisionless fluid (e.g., Sheth & Tormen, 1999; Jenkins et al., 2001; Tinker et al., 2008, 2010; Watson et al., 2011; Despali et al., 2016; McClintock et al., 2019; Nishimichi et al., 2019; Bocquet et al., 2020). This assumption is justifiable as the matter content of the Universe is dominated by non-baryonic and collisionless dark matter (DM, hereafter).

However, physical and astrophysical processes related to the baryonic component (e.g. radiative cooling, star formation, feedback effects from supernova and AGN) are known to produce small, but sizeable, modifications in the evolution of density perturbations by impacting on both the statistics of total matter distribution and on the internal structure of non-linear collapsed structures. While these processes affect scales which are well resolved by cosmological simulations and sampled by observations of cosmic structures, they take place on small scales that cannot be explicitly resolved in simulations covering cosmological volumes. This prevents any possibility to simulate such processes explicitly from first principles, and one must resort to sub-resolution models (e.g. Hirschmann et al., 2014; Vogelsberger et al., 2014; Schaye et al., 2015; Crain et al., 2015; McCarthy et al., 2017; Borgani & Kravtsov, 2011; Vogelsberger et al., 2020). The flourishing of such sub-resolution descriptions of a variety of astrophysical processes provided invaluable insights to quantify the impact of baryonic effects on large scale structure, while offering solutions for small-scale tensions between the predictions of the standard cosmological model and observations of the internal structure of galaxy-sized halos (see, e.g., Teyssier et al., 2011; Martizzi et al., 2013; Sawala et al., 2016; Bullock & Boylan-Kolchin, 2017).

However, the sub-grid approach is susceptible to criticism. Due to a limited understanding of those physical processes, their implementation often involves parameters that have to be calibrated to reproduce observables correctly. This calibration introduces a fundamental problem as the reproduction of a given observable does not strictly validate a simulation. Due to the interplay between several processes on the baryonic sector, a given observable might be reproduced with a non-realistic choice of parameters governing the different processes. For instance, the simulations used in Martizzi et al. (2014), Cui et al. (2014) and Bocquet et al. (2015) all reproduce reasonably well several observational properties of galaxy clusters, yet, they provide somewhat different results for the impact of baryonic processes on the halo mass function.

The current uncertainty in the effect of baryonic physics on the halo mass function is subdominant when compared to both limited statistics of current surveys and systematic effects in the cosmological application of galaxy clusters (e.g., Ade et al., 2016; Mantz et al., 2016; Bocquet et al., 2019; Costanzi et al., 2019). However, this picture will soon change as next-generation surveys will provide an increase by orders of magnitude of the statistics of detected clusters and a corresponding improvement of the control of systematics related to survey selection function and, most importantly, cluster mass measurements (see, for instance, Laureijs et al., 2011; Merloni et al., 2012; DESI Collaboration, 2016; Bonoli et al., 2020). Such a leap forward in the increase of both precision and accuracy in the cosmological applications of galaxy clusters calls for the need of a corresponding increase in the precision and accuracy in the calibration of halo mass function (HMF hereafter) and halo bias (HB, hereafter), which represent the theoretical pillars of cluster cosmology. This highlight the relevance of properly including the effect of baryons on the measurement from simulations of HMF and HB, and how uncertainties in the description of sub-resolution processes propagate to the accuracy of their calibration.

In this paper, we use the Magneticum¹¹1http://www.magneticum.org suite of simulations to study the effect of baryons on the HMF and linear. The Magneticum suite of simulations offers the unique advantage of combining large volume covered, so as to sample HMF and HB in the high-mass end, and sufficient resolution to describe in a realistic way the baryonic processes inside cluster-sized halos. In our analysis, we will show that the impact of baryons on HMF and HB can be entirely interpreted in terms of a modification of the mass of individual halos, induced by the non-linear baryonic processes. Furthermore, we will demonstrate that neglecting such baryonic effects would induce a bias in the derivation of cosmological parameters by assuming a cluster survey specification mimicking what expected from the ESA’s Euclid mission (Sartoris et al., 2016; Euclid Collaboration, 2019).

This paper is organized as follows. In Section 2, we present the Magneticum set of simulations used in this paper and describe the identification of clusters in these simulations. In Section 3, we introduce the halo mass function and linear bias. In Section 4, we present our methodology to measure the HMF and HB. Results for the baryonic effect on clusters and cosmological parameters are presented in Sections 5 and 6. We draw our conclusions in Section 7. Finally, in Appendix A, we revise the key points of the Peak-Background-Split prediction for the linear HB. Appendixes B and C address the validity of the linear regime on the scales used to measure the linear bias in our simulations and the validity of the Gaussian approximation for the distribution of the halo and matter power-spectrum.

2 The Magneticum Simulations

The Magneticum suite of simulations (Biffi et al., 2013; Saro et al., 2014; Steinborn et al., 2015; Steinborn et al., 2016; Dolag et al., 2015, 2016; Teklu et al., 2015; Bocquet et al., 2016; Remus et al., 2017; Castro et al., 2018) describes the evolution cosmic structures by following up to $2\times 10^{11}$ particles of dark matter, gas, stars, and black holes, while covering more than $300$ Gpc³ in comoving volume. The simulations were performed with the TreePM+SPH code P-Gadget3 — a more efficient version of the publicly available Gadget-2 code (Springel et al., 2001; Springel, 2005). Our SPH solver implements the improved model of Beck et al. (2016), which overcomes a number of limitations of the standard SPH. Hydrodynamics is coupled to the treatment of radiative cooling, heating by a uniform evolving UV background, star formation based on the original model by Springel et al. (2005), and the treatment of stellar evolution and chemical enrichment processes as described by (Tornatore et al., 2007). Chemical enrichment from AGB stars, Type-Ia and Type-II SN follows a total of $11$ chemical elements (H;He;C;N;O;Ne;Mg;Si;S;Ca;Fe). Metallicity dependent cooling is implemented by following Wiersma et al. (2009), using cooling tables produced by the publicly available CLOUDY photo-ionization code (Ferland et al., 1998). Black Holes are modeled as sink particles (see, for instance, Springel et al., 2005; Di Matteo et al., 2008). Lastly, AGN feedback and black hole growth are modeled as described in Hirschmann et al. (2014).

Table 1: Set of boxes from Magneticum simulations and halo selection parameters used in this work. Updated from Bocquet et al. (2016).

Box	Size $L_{\textrm{box}}$	grav. softening length (kpc/h)			$N_{\textrm{particles}}$	$m_{\textrm{DM}}$	$M_{\textrm{halo, min}}$	Simulation		$N_{\rm 200m}\,(z=0)$
	$(\textrm{Mpc/h})$	DM	gas	stars		( $M_{\odot}$ )	( $M_{\odot}$ )	Hydro	DMO	Hydro	DMO
$4$ /uhr	$48$	$1.40$	$1.40$	$0.7$	$2\times 576^{3}$	$3.6\times 10^{7}$	$6.2\times 10^{11}$	✓	✓	²²2Box $4$ /uhr ran only until $z=0.137$ . At this redshift, it contains $815$ halos.	$835$
$2$ /hr	$352$	$3.75$	$3.75$	$2.0$	$2\times 1584^{3}$	$9.8\times 10^{8}$	$1.1\times 10^{13}$	✓	✓	$21\,013$	$23\,416$
$2b$ /hr	$640$	$3.75$	$3.75$	$2.0$	$2\times 2880^{3}$	$9.8\times 10^{8}$	$1.1\times 10^{13}$	✓	✓	³³3Box $2b$ /hr ran only until $z=0.2$ . At this redshift, it contains $104\,823$ halos.	$140\,664$
$0$ /mr	$2688$	$10.0$	$10.0$	$5.0$	$2\times 4563^{3}$	$1.9\times 10^{10}$	$2.2\times 10^{14}$	✓	✓	$232\,816$	$241\,733$

The Magneticum suite also cover different cosmologies (see Singh et al., 2020). In this work, we restrict our analysis to the sub-set in agreement with the WMAP7 results (Komatsu et al., 2011), with total matter density parameter $\Omega_{m}=0.272$ ; baryonic fraction of $16.8$ per cent; Hubble constant $H_{0}=70.4$ km/s/Mpc; primordial spectral index $n_{s}=0.963$ , and a normalization of the matter power spectrum $\sigma_{8}=0.809$ . The basic properties of the boxes from the Magneticum set used in this analysis are shown in Table 1.

In each simulation box, halos are identified through the Spherical Overdensity (SO) implemented within the SUBFIND algorithm (Springel et al., 2001; Dolag et al., 2009): spheres are grown around halo centers, which are identified with the position of the DM particles that have the minimum value of the gravitational potential, within Friend-of-Friends (FoF) groups defined by a linking length $b=0.16$ in units of the mean DM interparticle distance. Each sphere is then grown until the enclosed density is $\Delta$ times either the mean or the critical matter density. In the following, we will consider the values $\Delta={\rm\{200m,200c\}}$ , where $m$ and $c$ stand for, respectively, the mean and critical matter density of the Universe at a given redshift.

In order to guarantee the robustness of our results, we use the same conservative selection criteria in terms of minimum number of particles per halo to consider it as unaffected by numerical resolution Bocquet et al. (2016). As presented in Table 1, for each simulation box, we have a minimum mass cut, that was chosen to ensure that only halos with more than $10^{4}$ DM particles are considered. A maximum mass cut is also applied, and for each box, the maximum mass considered is the minimum mass cut of the next box. For the largest box a mass limit of $10^{16}M_{\odot}$ is used.

The simulation outputs, from which cluster catalogs are constructed, are selected in the redshift range $0\leq z\leq 2$ and roughly equispaced in time by $t\approx 1.6$ Gyr. This time interval — arguably larger than the typical dynamic time of a halo — was chosen in order to reduce the correlation between the different snapshots. The redshifts considered are $\{0.00,0.14,0.29,0.47,0.90,1.18,1.98\}$ .

As mentioned above, we will follow closely the work of Bocquet et al. (2016), but with the following differences in the set of simulations used for the HMF calibration:

•

The Boxes $1a$ , $3$ , and $4$ used in Bocquet et al. (2016) are too small for covering the large-scale modes required to compute the variance over the scales of interest (see Eq. (3) below), i.e. such boxes are far from the infinite volume limit. Therefore, they are not used in this paper.
•

The cooling tables implemented for Box $2$ differ slightly from the one implemented in the other high-resolution simulation Box $2b$ . We verified that this difference corresponds to a $5$ per cent increase on the expected number of clusters in Box $2$ with respect to Box $2b$ . In order not to contaminate the HMF calibration with data sets in tension with each other, we have kept only the larger box and have not used Box $2$ for the calibration.
•

The Box $2b$ with full hydrodynamics (Hydro) and its DM-only counterpart (DMO) were added to the set of simulations used.

3 Theory

3.1 The halo mass function

We define the halo mass function (HMF) as the comoving density $n(M,z)$ of halos of mass $M$ at redshift $z$ , so that

\frac{{\rm d}n(M,z)}{{\rm d}M}\,{\rm d}M\,=\,{\frac{\rho_{\rm mc}}{M}}\,f(M,z){\rm d}M\,,

(1)

is the comoving number density of such halos with mass in the range $\{M,M+{\rm d}M\}$ at the same redshift. In the above relation, the multiplicity function (MF, hereafter) $f(M,z)$ gives the fraction of total mass contained within halos of mass M, while $\rho_{\rm m}$ is the mean matter density at redshift $z=0$ .

The Press-Schechter formalism (Press & Schechter, 1974) predicts the following Universal expression for the MF:

f_{\rm PS}(\nu)\,=\,\sqrt{\frac{1}{2\,\pi}}\nu\,e^{-\nu^{2}/2},

(2)

where the cosmological dependence is manifested through the peak height $\nu\equiv\nu(M,z)=\delta_{c}(z)/\sigma_{R}(M,z)$ , with $\delta_{c}$ the linear density contrast for spherical collapse (Bryan & Norman, 1998). The variance of the linear density fluctuations at the scale $R$ , $\sigma^{2}_{R}(z)$ , is defined as

\sigma^{2}_{R}(z)\,=\,\frac{1}{2\,\pi^{2}}\int_{0}^{\infty}dk\,k^{2}\,P_{\rm lin}(k,z)W^{2}(k\,R)\,,

(3)

where $W(k\,R)$ is the Fourier transform of the top-hat window function of a sphere of radius $R$ , and $P_{\rm lin}$ is the linear matter power spectrum. For the top-hat window function, the comoving smoothing length $R$ is related to the mass scale by $M=(4\pi/3)\,R^{3}\,\rho_{mc}.$

For the comparison with simulations, the evaluation of Eq. (3) should take into account the effects introduced by the finite simulated volume and the finite grid on which initial conditions are defined. In addition, we should also account for the specific, stochastic realisation of the initial power spectrum. We therefore compute Eq. (3) using the measured power spectrum from the sampled initial conditions. Due to the large volume of our boxes, such effects collectively affect our computation of Eq. (3) by less than $0.5$ per cent in the mass ranges defined for each box. We compared the initial conditions $P_{\rm lin}$ and the fiducial $P_{\rm lin}$ . The former was calculated by integrating from $k_{\rm min}=2\pi/L_{\rm box}$ to 1/2 Nyquist frequency of the meshed grid;⁴⁴4For this test specifically we have used a grid with $1024^{3}$ mesh points and not $256^{3}$ as in the rest of the paper, see Section 4. the latter was integrated over the $k$ -range from $2\pi/(10L_{\rm box})$ to 10 Nyquist frequency. Given the good agreement and in order to avoid numerical issues, such as those related to the interpolation of the power spectrum measured on the initial conditions, in the following we will use in Eq. (3) the fiducial $P_{\rm lin}$ , instead of its random realization.

The MF of Eq. (2) provides a qualitative prediction for the density of objects observed in simulations. In order to have an accurate description, more sophisticated HMF modelings have been proposed and calibrated against N-body simulations. For instance, Sheth & Tormen (1999); Jenkins et al. (2001); Tinker et al. (2008, 2010); Watson et al. (2011); Despali et al. (2016); Bocquet et al. (2016) followed the approach of calibrating a parametric prescription of the HMF measured in different sets of N-body simulations, while McClintock et al. (2019); Nishimichi et al. (2019); Bocquet et al. (2020) used Gaussian process regression to build HMF emulators based on the results of large sets of N-body simulations.

In this paper, we adopt the HMF expression proposed by Tinker et al. (2008),

$\displaystyle f(\sigma,z)$	$\displaystyle\,=\,A\,\left[\left(\sigma/b\right)^{-a}+\sigma^{-c}\right]\,e^{-d/\sigma^{2}},$	(4)
$\displaystyle A(z)$	$\displaystyle\,=\,A_{0}\,(1+z)^{A_{z}},$	(5)
$\displaystyle a(z)$	$\displaystyle\,=\,a_{0}\,(1+z)^{a_{z}},$	(6)
$\displaystyle b(z)$	$\displaystyle\,=\,b_{0}\,(1+z)^{b_{z}},$	(7)
$\displaystyle c(z)$	$\displaystyle\,=\,c_{0}\,(1+z)^{c_{z}},$	(8)
$\displaystyle d(z)$	$\displaystyle\,=\,d_{0}\,(1+z)^{d_{z}},$	(9)

with the specific purpose of analysing in detail the impact of baryons. Therefore, rather than calibrating accurate fitting functions to describe the HMF from the Magneticum set of simulations, we resort to an HMF functional expression, which has been already calibrated on N-body simulations, and quantify the modification of this HMF induced by baryonic effects. As such, our HMF at $z=0$ is completely described by the $5$ parameters $\{A_{0},\,a_{0},\,b_{0},\,c_{0},\,d_{0}\}$ , where $A_{0}$ is a normalization factor, $\{a_{0},\,b_{0},\,c_{0}\}$ describe the HMF at low masses and $d_{0}$ determine the location of the exponential cutoff in the high mass end. For each of these parameters, we assume a power-law dependence on the expansion factor, with exponents $\{A_{z},\,a_{z},\,b_{z},\,c_{z},\,d_{z}\}$ . In total, $10$ free parameters capture the mass- and redshift-dependence of the universal (i.e. cosmology-independent) MF.

3.2 Halo linear bias

Halos are biased tracers of the underlying dark matter field. To first order, their local overdensity $\delta_{h}(\textbf{r},M,z)=n(\textbf{r},M,z)/\bar{n}(M,z)-1$ can be written as a function of the matter density contrast $\delta_{m}(\textbf{r},z)$ as

\delta_{h}(\textbf{r},M,z)=b(M,z)\,\delta_{m}(\textbf{r},z)+\epsilon(\textbf{r},M,z)\,,

(10)

where $b$ is the halo bias and $\epsilon$ is a stochastic term, that in the following we assume to be associated to shot-noise.

From Eq. (10) it follows that, for sufficiently large scales, the halo-halo, $P_{hh}$ , and halo-matter power spectrum, $P_{hm}$ , are written as a function of the linear matter power spectrum, $P_{mm}$ , as:

	$\displaystyle P_{hh}(k,M,z)$	$\displaystyle=$	$\displaystyle b^{2}(M,z)\,P_{mm}(k,z)+P_{\rm SN}\,,$		(11)
	$\displaystyle P_{hm}(k,M,z)$	$\displaystyle=$	$\displaystyle b(M,z)\,P_{mm}(k,z)\,,$		(12)

where $P_{\rm SN}$ represents a shot-noise component, which is equal to the Poisson term, $P_{\rm SN}=1/\bar{n}$ , under the assumption that halos provide a discrete Poisson sampling of the underlying continuous matter density field. In fact, both negative and positive corrections to Poisson shot-noise are expected respectively for low and high mass halos (Casas-Miranda et al., 2002; Hamaus et al., 2010).

Using the Peak-Background Split (PBS) (Mo & White, 1996) it is possible to obtain the halo bias $b(M,z)$ directly from the halo mass function under the assumption of a universal MF. For the Press-Schechter MF presented in Eq. (2), the PBS prediction is:

b(\nu)=1+\frac{\nu^{2}-1}{\delta_{c}},

(13)

while for the adopted HMF functional form of Eq. (4) we obtain:

b(\sigma,z)=\frac{-c+\delta_{c}(z)+2\,d/\sigma^{2}+(c-a)\,\sigma^{c}/\left[\sigma^{c}+(\sigma/b)^{a}\right]}{\delta_{c}(z)}\,.

(14)

In Appendix A we present the key points for the derivation of the above equation.

Although the PBS provides us a fair estimation of the bias, in Tinker et al. (2010) its performance has been observed to be not better than $10-20$ per cent. In order to obtain a better fit, we will also consider the bias fitting formula introduced in Tinker et al. (2010):

b(\nu)\,=\,1-A\frac{\nu^{a}}{\nu^{a}+\delta_{c}^{a}}+B\nu^{b}+C\nu^{c}.

(15)

4 Methodology

At each redshift analysed in our simulations, we have binned the halo distribution in $\log_{10}M_{\rm halo}$ with equispaced intervals of width $0.1$ dex. We have then measured $P_{hh}$ and $P_{hm}$ of the halos falling within each mass bin, along with the matter power spectrum $P_{mm}$ .

We have used PYLIANS⁵⁵5https://github.com/franciscovillaescusa/Pylians, a set of PYthon LIbraries for the Analysis of Numerical Simulations, to construct both the corresponding density field and to compute the corresponding power spectra. All power spectra were computed from a 3D grid with $256^{3}$ mesh points populated according to a CIC (Cloud In Cell) mass assignment scheme. Lastly, we have averaged the power-spectrum measurement within shells in $k$ -space, having width given by the fundamental mode of the box, $k_{f}\equiv 2\pi/L$ .

To fit the parameters of the HMF and linear HB against results from simulations, we have adopted a maximum likelihood approach assuming uninformative uniform priors on all parameters. The best-fits were determined using the AMPGO (Adaptive Memory Programming for Global Optimization, Lasdon et al., 2010) global optimization algorithm, while the covariance between the parameters was estimated using EMCEE (Foreman-Mackey et al., 2013). For the sake of investigating the robustness of our best-fit estimations, we have also used the implementation of Virtanen et al. (2020) of the global optimization method described in Xiang et al. (2013), dubbed Dual Annealing. The relative differences between both estimations were found to be less than $1$ per cent.

For all these statistical methods we have used the Python interface from LMFIT (Newville et al., 2019).

4.1 HMF likelihood

For the calibration of the HMF, we assume that the number of halos $N_{i}$ with masses $M_{\rm halo}\in[M_{i},M_{i+1})$ in the snapshot at redshift $z$ , follows a Poisson distribution. The adopted likelihood $\mathcal{L}(N_{i}|\boldsymbol{\theta},z)$ is therefore given by:

2\ln\mathcal{L}(N_{i}|\boldsymbol{\theta},z)\,=2\left(N_{i}^{\rm sim.}\ln N_{i}(\boldsymbol{\theta},z)-\,N_{i}(\boldsymbol{\theta},z)\right),

(16)

where $N_{i}^{\rm sim.}$ stands for the number of halos within the above mass bin for the snapshot at redshift $z$ , $N_{i}(\boldsymbol{\theta},z)$ is the theoretical expectation of halos computed by integrating Eq. (1) and multiplying it by the simulated volume assuming the HMF of Eq. (4) with parameter vector $\boldsymbol{\theta}$ .

The final log-likelihood is computed by summing Eq. (16) over all redshifts, mass bins, and simulations. This amount to assume that different mass bins at fixed redshift and snapshots at different redshifts are independent of each other.

4.2 Linear HB likelihood

From the power spectra computed from the halo distribution with masses $M_{\rm halo}\in[M_{i},M_{i+1})$ we have selected only modes with $k$ smaller than $k_{\rm min.}=0.05\,({\rm Mpc/h})^{-1}$ in order to guarantee that the linear approximation is accurate (see Appendix B for more details).

For every mass bin, the bias is measured using the ratio of the halo-matter cross-spectrum and the matter power-spectrum: $b_{i,j}^{\rm sim.}=P_{hm}(k_{j})/P_{mm}(k_{j})$ where $i$ and $j$ indexes the mass bin and the $k$ -shell.

The shell-average estimation of the power spectrum for a Gaussian field realization has a $\chi^{2}$ -distribution. As the number of modes $N_{k}$ inside the shell grows rapidly with $k$ , this distribution converges to a Gaussian distribution due to the central limit theorem. Thus, the distribution of the ratio is approximately described by the ratio of two Gaussian-distributed variables. In Appendix C we show that this approximation is in fact valid for our analysis.

Although computing the probability distribution function of a ratio of two random variables is straightforward, its computational cost is high if one has to compute it many times, as needed in order to find a global extrema or when running a MCMC. Fortunately, for the Gaussian case the following variable transformation leads to a normal distribution in the new variable (Geary, 1930):

\zeta=\frac{\mu_{x}w-\mu_{y}}{\sqrt{(\sigma_{x}w)^{2}-2\,\Sigma(x,y)w+\sigma_{y}^{2}}},

(17)

where $w=y/x$ , $\{\mu_{y},\sigma_{y}\}$ and $\{\mu_{x},\sigma_{x}\}$ are the mean and standard deviation of the numerator ( $y$ ) and denominator ( $x$ ) distributions respectively, and $\Sigma(x,y)$ is their covariance. Applying Eq. (17) to the HB likelihood estimator, we get:

	$\displaystyle 2\ln\mathcal{L}(b_{i,j}^{\rm sim.}\|\boldsymbol{\theta},z)$	$\displaystyle=$
	$\displaystyle\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\frac{\,(b_{i,j}(\boldsymbol{\theta})-b_{i,j}^{\rm sim.})^{2}\,P_{mm}^{2}}{(\sigma_{P_{mm}}b_{i,j}^{\rm sim.})^{2}-2\,\Sigma(P_{mm},P_{hm})b_{i,j}^{\rm sim.}+\sigma_{P_{hm}}^{2}},$			(18)

with the covariances computed using linear theory:

$\displaystyle\left(\frac{\sigma_{P_{mm}}}{P_{mm}}\right)^{2}\,$	$\displaystyle=$	$\displaystyle\,\frac{2}{N_{k}}\,,$
$\displaystyle\left(\frac{\sigma_{P_{hm}}}{\sigma_{P_{mm}}}\right)^{2}\,$	$\displaystyle=$	$\displaystyle\,\frac{1}{2}\,\left(2\,b_{i,j}(\boldsymbol{\theta})^{2}+\frac{1}{\bar{n}\,P_{mm}}\right)\,,$
$\displaystyle\frac{\Sigma(P_{mm},P_{hm})}{{\sigma_{P_{mm}}^{2}}}\,$	$\displaystyle=$	$\displaystyle\,b_{i,j}(\boldsymbol{\theta})\,,$

where $\boldsymbol{\theta}$ represents the vector of parameters of Eq. (15).

The final log-likelihood is obtained summing Eq. (18) in all mass bins, selected modes, and different boxes.

5 Results

5.1 Halo masses in Hydro and DMO simulations

Before presenting the results on the HMF and HB, we discuss in details the way in which the baryons affect such quantities. This subsection is entirely based on results from the Hydro and DMO versions of Box $2$ . The numerical convergence of our results has been assessed using Box $4$ — our highest-resolution simulation. Due to its limited volume, Box $4$ allows us to assess the convergence only at at low masses and redshift $z>0.137$ . In this regime, the halo statistics presented in this section is entirely consistent in both simulations, meaning that any possible high-redshift effect due to limited resolution does not propagate to low-redshift. For redshift higher than unity, Box $4$ and Box $2$ are qualitatively in agreement; however, we verified that the latter predicts halos slightly more massive than the former by $5-10$ per cent. This tension could be due to the lower resolution of Box $2$ as well as the limited volume covered by Box $4$ . Thus, this section’s results are expected to be accurate at the level of $5-10$ per cent for high-redshifts.

Refer to caption — Figure 1: Scatter plot of the ratio of masses $M_{\rm 200c}$ of matched halos identified in the Hydro and DMO version of Box 2, as a function of $M_{\rm 200c}$ for the Hydro simulation, at three different redshifts. The color code is for the ratio between the baryon fraction of each single halo within $R_{{\rm 200c}}$ and the median baryon fraction for halos of the same mass. The solid red curves mark the median value of the mass ration, with the red-shaded regions encompassing the 16th-84th percentiles. The solid vertical line denotes the Box $2$ minimum halo mass cut presented in Table 1. Halos in the two simulations have been matched selecting the pairs that are closest to each other (see text).

5.1.1 Effect of baryons on halo masses

Table 2: Pearson’s correlation coefficient (PCC; Column 3)

r

and Spearman’s rank correlation coefficient (SRCC; Column 4)

\rho

between the Hydro-to-DMO mass ratio,

M_{200c}^{\rm Hydro}/M_{200c}^{\rm DMO}

, and the normalized baryon fraction

f_{b}/\widetilde{f}_{b}

(shown in Figure 1), stellar mass fraction

f_{\star}/\widetilde{f}_{\star}

, and gas mass fraction

f_{\rm gas}/\widetilde{f}_{\rm gas}

. Also shown are the corresponding

p

-values of having the same absolute correlation coefficient by chance (Columns 4 and 6, respectively).

		PCC		SRCC
Redshift	Var.	$r$	$\log p$ -value	$\rho$	$\log p$ -value
$z=1.98$	$(f_{b}/\widetilde{f}_{b})_{\rm 200c}$	$0.38$	$-637$	$0.44$	$<-708$
	$(f_{\star}/\widetilde{f}_{\star})_{\rm 200c}$	$0.24$	$-247$	$0.29$	$-382$
	$(f_{\rm gas}/\widetilde{f}_{\rm gas})_{\rm 200c}$	$0.05$	$-10.5$	$0.04$	$-7.02$
$z=1.18$	$(f_{b}/\widetilde{f}_{b})_{\rm 200c}$	$0.56$	$<-708$	$0.67$	$<-708$
	$(f_{\star}/\widetilde{f}_{\star})_{\rm 200c}$	$0.04$	$-22.3$	$0.03$	$-11.4$
	$(f_{\rm gas}/\widetilde{f}_{\rm gas})_{\rm 200c}$	$0.46$	$<-708$	$0.55$	$<-708$
$z=0.00$	$(f_{b}/\widetilde{f}_{b})_{\rm 200c}$	$0.35$	$<-708$	$0.45$	$<-708$
	$(f_{\star}/\widetilde{f}_{\star})_{\rm 200c}$	$0.01$	$-4.05$	$-0.03$	$-28.9$
	$(f_{\rm gas}/\widetilde{f}_{\rm gas})_{\rm 200c}$	$0.33$	$<-708$	$0.45$	$<-708$

In Figure 1 we present a scatter plot of the halo masses in the Hydro run and its ratio with respect to the mass of the corresponding matched halo in the DMO counter-part. Halos have been matched by selecting the halo pairs (one each in the Hydro and DMO runs) that are closest to each other.⁶⁶6We have validated the matching procedure by applying it to the DMO catalog and a synthetic catalog created from it, where clusters have been randomly displaced, mimicking the Hydro catalog. We have observed that, for halos more massive than Box $2b$ minimum mass cut, both the matched catalog’s completeness and purity are greater than $95\%$ . The color code is the ratio of the baryon fraction inside $R_{{\rm 200c}}$ and the median baryon fraction in halos of the same mass. The median baryon fraction $\tilde{f}_{b}(M)$ has been computed by interpolating the median of the halo sample in different mass bins ( $\Delta\log_{10}M/M_{\odot}=0.2$ ). The red shaded region comprises the 16th-84th percentiles around the median. In general we note a trend for this ratio to decrease with redshift, with halos in the Hydro run being on average significantly lighter than their DMO counter-parts at $z\lesssim 1$ . The effect is stronger for lower masses and weaker for larger ones. Still, even at the largest sampled masses the ratio does not asymptotically tend to unity but to a slightly lower value that grows with redshift. Similar results have been reported independently by Sawala et al. (2013); Cui et al. (2014); Velliscig et al. (2014). For the $z=1.98$ panel the picture flips, and halos are slightly heavier in the Hydro run.

In order to quantify the statistical significance of the correlations shown in Figure 1, we present in Table 2 the value of the Pearson’s correlation coefficient (PCC, $r$ ) and the Spearman’s rank correlation coefficient (SRCC, $\rho$ ) between the Hydro-to-DMO halo mass ratio, $M_{200c}^{\rm Hydro}/M_{200c}^{\rm DMO}$ , and the baryon fraction $f_{b}/\widetilde{f}_{b}$ , normalized to the median baryon fraction within each mass bin (shown with the color-code in Figure 1), the stellar fraction $f_{\star}/\widetilde{f}_{\star}$ , and the gas fraction $f_{\rm gas}/\widetilde{f}_{\rm gas}$ . The former statistics measures the linear correlation between two random variables while the latter measures how monotonic is their relation regardless of the complexity of such relation. We also present the $p$ -value of having the same absolute value of the correlation by chance between two uncorrelated distributions of same size.

From Table 2 we note that the correlation with the total baryon content is highly significant at all redshifts. The PCC correlation is the highest at $z=1.18$ and has similar values $z\simeq 2$ and $z=0$ . The initial increase and the later reduction of the correlation is due to the interplay of the mass accretion due to in-falling matter and halo mergers. The former increases the correlation while the latter introduces further stochasticity reducing the correlation with the baryon content at halo formation. In addition, the similar values for $r$ and $\rho$ between the mass ratio and the baryon fraction indicate that the relation between these two variables can be described reasonably well by a linear relation.

We also note that the correlation with the baryon excess is driven by different components (i.e. gas and stars) at different epochs. Stellar component is the main driver of the correlation at $z\approx 2$ , when halos tend to be more massive in the Hydro simulation. While being almost insignificant at this redshift, the gas component becomes the main responsible for the correlation at redshift $z\lesssim 1$ , when the mass ratio flips and halos becomes less massive in the Hydro simulation. The transition between these two regimes correspond to the transition between the relative action played by two physical mechanisms. At high redshift, Hydro halos are more massive than DMO ones, due to efficient gas cooling that causes a rapid condensation of baryons in central halo regions, thus causing a halo contraction. The resulting conversion of cooled gas particles in stars thus causes the positive correlation with the stellar mass fraction reported in Table 2 at high redshift. Cooling gas also feeds the Supermassive Black Hole (SMBH) hosted at the cluster center, thus igniting AGN feedback that suddenly heats and displace the surrounding gas. In turn, gas displacement will cause an expansion of the gravitational potential and reduce the halo gas content, thus halting gas re-accretion (e.g. Ettori et al., 2006; Zhang et al., 2020). All together this will lead to less massive and less gaseous halos at late times, explaining the correlation with the gas fraction.

The effect of the above described mechanism can also be observed in Figure 2 where we present the evolution of the median baryon, gas and stellar mass fractions, all normalized to the cosmic baryon fraction, of the main progenitors of halos identified at $z=0$ . The median has been computed over all the halos belonging to the same mass bin of width $\Delta\log_{10}M/M_{\odot}=0.2$ . Curves are color-coded according to the halo mass at $z=0$ . In the left panel we observe that, down to $z\approx 2$ , the baryon content of halos exceeds, although by a small amount, the cosmic value. During this period the gas fraction significantly drops while the star fraction increases at least by the same amount due to gas cooling and star formation. The stellar mass fraction reaches a peak of about $25$ per cent at $z\simeq 2-3$ and then reduces gradually to a value around $15$ per cent at late times with a mild dependence on the halo mass. On the other hand, the gas fraction shows a more interesting behaviour: after the star fraction peaks, AGN feedback displaces and heats the gas; however only in the less massive halos, corresponding to the shallower gravitational potential wells, the effect is strong enough to significantly reduce the gas fraction. For halos with mass around $10^{13}\,M_{\odot}/h$ the gas fraction drops from $70$ per cent at $z=2.0$ to $40$ per cent at $z=0.0$ . The most massive halos are instead able to keep their gas fraction constant during this period.

5.1.2 AGN feedback and halo accretion history

AGN feedback is the main responsible for the halo mass reduction observed at low redshift in the Hydro simulations. The energy injected by the AGN is proportional to the mass accretion rate of the black hole $\dot{M}_{\rm BH}$ (e.g., Springel et al., 2005). In Figure 3 we show the history of the AGN activity scaled by the halo thermal energy, $\dot{M}_{\rm BH}/(M_{\rm 200c}^{\rm Gas}\,T_{\rm 200c}^{\rm Gas})$ , of the main progenitors of the halos identified at $z=0$ . Each curve shows the median value of such quantity, computed among all the cluster with mass in a given range. Qualitatively, the history of the relative AGN feedback has a nearly universal behavior. It peaks at slightly higher redshift than the baryon fraction peak (shown in Figure 2), then decays quickly around $z=1.0$ , reaching finally a slow decaying phase at recent times. The slightly shift between the peaks of the AGN activity and the stellar fraction is expected as the feedback suppresses the star formation. Quantitatively, the amplitude of the AGN activity decreases with the halo mass in agreement with the trend of baryon fraction and of the low- $z$ behaviour of the halo mass reduction in Hydro simulation shown in Figure 1.

In order to understand how the halo mass assembly reacts to AGN feedback, we show in Figure 4 the SRCC $\rho$ between the Hydro-to-DMO mass ratio at $z=0.0$ and the relative AGN energy feedback of the main progenitor at a given redshift, against the mass fraction $M(Z)/M(z=0)$ accreted at the same redshift. The quantity reported on the $y$ axis is expected to have negative (positive) values whenever the action of AGN feedback at a given redshift tend to produce a decrease (increase) of halo mass at redshift zero. Therefore, this plot shows whether there is a characteristic epoch in the halo accretion history at which AGN feedback leaves its imprint on the variation of the final halo mass. In order to present only significant correlations we select the mass bins with at least $100$ halos. Quite interestingly, We note that the response to the AGN feedback has a rather universal trend. The variation of the halo mass has the most significant, and negative, correlation with the relative intensity of AGN feedback when halo progenitors reach $\sim 30$ -50 per cent of their final mass. This regime has been shown by Wang et al. (2020) to be also the most informative with respect to the halo concentration. This confirms that the decrease of halo mass in the Hydro simulations is induced by the action of AGN feedback in a relatively early phase of the halo assembly, when the shallower potential well can more easily back-react to the sudden displacement of gas heated by feedback. This result also suggests that the Hydro mass variation might correlate with other halo properties and possibly present secondary effects of the halo clustering (Mao et al., 2018). This possibility will be left for further investigation.

The variation of halo masses in the Hydro simulations is the key aspect to understand the bias induced by neglecting baryonic processes on cluster cosmology. In the following sub-sections we will discuss the results for the HMF and HB, and show how they are tightly connected.

5.2 The halo mass function calibration

In Figure 5 we plot the number density of halos per log-interval of the halo mass, as a function of the halo mass from our simulations. The top and bottom panels show results for masses computed at $\Delta=200m$ and $200c$ halos, respectively. The errorbars are computed using the Gaussian approximation to the Poisson distribution. Notice that the error bars are for illustration purposes only since we use a Poison likelihood on the calibration, see Section 4. Form this figure we observe small but appreciable differences between the HMF for Hydro and DMO runs in the high mass end, where Hydro runs systematically form less halos at a fixed mass.

Differences on lower masses can be better appreciated in Figure 6 where we show the relative difference of the halo number in the Hydro and in the DMO runs, relative to the best-fitting function for the DMO HMF (see below). Left and right columns are for halo mass definitions given by $\Delta=200m$ and $200c$ , respectively. Different different rows are for the different redshifts considered. Dashed regions represent the $68$ per cent confidence regions for our best-fitting HMF, calculated by propagating the uncertainties on the best-fit parameters (Table 4) using the covariance matrix shown in Figures 7 and 8. For the two lowest redshifts and for masses below the minimum mass cut of Box 0 (see Table 1), we plot results from Box $2$ instead of $2b$ , since only the former has been run down to $z=0.2$ . Although this mass regime is a extrapolation of our fit, the performance of our fit is only slightly affected.

In order to quantify the overall quality of our HMF calibration, we present in Table 3 a two-tailed test for the reduced log-likelihood. The latter is defined by using Eq. (16), summed over all mass bins, and dividing it by the number of degrees of freedom (henceforth d.o.f.). This test has similar interpretation of the frequentist reduced- $\chi^{2}$ for Gaussian likelihoods and was done as follows: for each mass definition and simulation (Hydro and DMO), we have created $1000$ synthetic catalogs by Poisson-sampling the corresponding best fit HMF. Then, for each catalog we have computed the reduced log-likelihood at the best fit point. The $p$ -value reported in Table 3 is the fraction of the catalogs that have an absolute difference between the reduced log-likelihood and the corresponding mean value of the whole set, which is smaller than the difference between our best-fit and the simulation data. The similar results for all cases indicate that our calibration performs statistically well for all of them, and the $p$ -values indicate that our calibration is in agreement with the data variance, with no evidence of either under or over-fitting.

Table 3: Two-tailed test for the reduced log-likelihood of

1000

synthetic catalogs randomly generated by Poisson-sampling our best-fit HMF calibrations. The

p

-values are the fraction of catalogs with an absolute difference between the reduced log-likelihood and the corresponding mean value of the whole set which is smaller than the same difference measured for our catalogues extracted from the simulations.

	$200c$		$200m$
	Hydro	DMO	Hydro	DMO
$p$ -value	$0.245$	$0.289$	$0.279$	$0.321$

For both mass definitions, at fixed mass, halos are scarcer on Hydro than on DMO runs. The relative difference presents a characteristic mass where it is minimal and grows for both larger and smaller masses. This is the result of the interplay between the mass reduction in Hydro halos, which is less pronounced at large masses, and the exponential sensitivity of the HMF at high masses. At the low mass end and low redshifts (where critical and mean cosmic densities differ mostly), the effect of using $\Delta=200c$ for halo masses is $\sim 50$ per cent higher than for $\Delta=200m$ . The enhancement of the effect when comparing critical and mean thresholds at low redshift is not surprising since the $200m$ overdensity traces larger radii where the baryonic effects are smaller.

Table 4: Best-fit parameters for the HMF presented in Eq. (4). The covariances between different parameters are presented in Figures 7 and 8.

	$A_{0}$	$A_{z}$	$a_{0}$	$a_{z}$	$b_{0}$	$b_{z}$	$c_{0}$	$c_{z}$	$d_{0}$	$d_{z}$
	200m
Hydro	$0.862$	$-0.475$	$-2.631$	$1.279$	$3.000$	$-1.388$	$1.725$	$0.323$	$1.256$	$0.046$
DMO	$0.610$	$-0.457$	$2.721$	$-0.113$	$0.846$	$0.213$	$0.731$	$-1.959$	$1.297$	$-0.003$
	200c
Hydro	$0.372$	$1.140$	$3.164$	$0.769$	$1.094$	$-0.629$	$0.528$	$0.927$	$1.710$	$-0.102$
DMO	$0.333$	$1.093$	$2.110$	$0.870$	$1.147$	$-0.652$	$0.379$	$0.691$	$1.515$	$-0.066$

5.3 The linear halo bias calibration

As the calibration of the HB for $\Delta=200m$ and $200c$ involved slightly different technical aspects, we separate their presentations in Sections 5.3.1 and 5.3.2, respectively.

5.3.1 Results for $\Delta=200m$

The fitting function presented in Eq. (15) can be split in two components: a low- $\nu$ component, which is governed by the parameters $\{A,a\}$ , and a high- $\nu$ component which is determined by two power-law redshift dependencies, involving the parameters $\{B,b,C,c\}$ . For the calibration of the linear halo bias we will rely on the results presented by Tinker et al. (2010) and only refit for the low- $\nu$ component. The reason for this choice is twofold. Firstly, the effect of baryons, as it has been observed for the HMF, is expected to be stronger at low mass. Secondly, Eq. (15) is too flexible given the constraining power of our simulated power-spectra, if all the parameters are left free. The reduced $\chi^{2}$ for our DMO and Hydro best-fits are $1.056$ (579 d.o.f.) and $1.028$ (567 d.o.f.), respectively. Both values are strongly reduced if all parameters are left free due to overfitting.

The best-fit parameters for the model HB and the covariance between them are presented in Table 5. In Figure 9 we plot our estimation for the halo bias of both Hydro and DMO runs. We plot the measured halo bias as the ratio $P_{hm}/P_{mm}$ . The effect of baryons on the clustering is better appreciated in the bottom panel where the residuals with respect to the DMO predictions are shown. In order to avoid an overcrowded plot, we plot only the points with $\chi^{2}<2$ . We also add the Tinker et al. (2010) prediction. Obviously, no effect is observed for rare halos since we have fixed the high- $\nu$ parameters to the values found by Tinker et al. In the low- $\nu$ regime, halos are more clustered on the Hydro than in the DMO runs. At low- $\nu$ the different predictions differ by less than $10$ per cent and are comparable to the scatter presented by Tinker et al. (2010) for their calibration set of simulations. It is important to note that in the present analysis we are limiting our calibration on a single cosmological model. Thus, the confidence regions presented for our different calibrations do not take into account issues like the universality and the cosmology dependence. In addition, differences in the halo-finder algorithm have been also shown to change the halo statistics by several percent, see (Knebe et al., 2011; Garcia & Rozo, 2019). Therefore, the level of concordance of our DMO results and those by Tinker et al. (2010) is quite satisfactory.

Table 5: Best-fit parameters (upper part) and their covariance (lower part) for the linear halo bias presented in Eq. (10) for

200m

halos.

	$A$	$a$	$B^{\dagger}$	$b^{\dagger}$	$C^{\dagger}$	$c^{\dagger}$
Hydro	$1.065$	$0.177$	$0.183$	$1.50$	$0.265$	$2.40$
DMO	$1.136$	$0.111$	$0.183$	$1.50$	$0.265$	$2.40$
$\dagger$ Parameters fixed at the best-fit presented by Tinker et al. (2010).

	Hydro		DMO
	$A$	$a$	$A$	$a$
$A$	$7.50\times 10^{-4}$	$-3.84\times 10^{-3}$	$6.70\times 10^{-4}$	$-3.15\times 10^{-3}$
$a$	$-3.84\times 10^{-3}$	$2.92\times 10^{-2}$	$-3.15\times 10^{-3}$	$2.28\times 10^{-2}$

5.3.2 Results for $\Delta=200c$

As we did in Section 5.3.1, we will base our calibration of the HB on the expression provided by Eq. (15), as proposed by Tinker et al. (2010). While they presented in their paper results only for overdensities with respect to $\rho_{m}$ , they also provide useful interpolation expressions for the values of the fitting parameters $\{A,a,B,b,C,c\}$ as a function of $y\equiv\log_{10}\Delta_{m}$ :

$\displaystyle A(y)$	$\displaystyle\,=10+24\,y\,\exp{\big{[}-(4/y)^{4}\big{]}}\,,$	(19)
$\displaystyle a(y)$	$\displaystyle\,=044\,y-88\,,$
$\displaystyle B(y)$	$\displaystyle\,=0183\,,$
$\displaystyle b(y)$	$\displaystyle\,=150\,,$
$\displaystyle C(y)$	$\displaystyle\,=0019+107\,y+19\,\exp\big{[}{-(4/y)^{4}}\big{]}\,,$
$\displaystyle c(y)$	$\displaystyle\,=24\,.$

Critical and mean quantities are connected through the matter density of the Universe. Thus, in order to obtain the prediction of the linear halo bias for $200c$ using Eqs. (10) and (19) one has to use $y(z)=\log_{10}\,(200/\Omega_{m}(z))$ . The latter equation introduces a redshift dependence on the halo bias prediction. Notice that, only the parameters $\{A,a,C\}$ depend on the overdensity threshold for the mass definition.

While the prescription given by Eqs. (15) and (19) provides a non-optimal fit of our data, we verified that this approach is substantially better than a raw re-fit of parameters $\{A,a,B,b,C,c\}$ assuming no redshift evolution. However, we obtained an improved fitting performance by re-calibrating the amplitude of the mass dependent parameters $\{A,a,C\}$ . To this purpose, we defined our model for the z-dependent linear halo bias at $\Delta=200c$ as given by Eq. (10) using the parameters:

\{A,a,B,b,C,c\}=\{A_{r},a_{r},0,0,C_{r},0\}+b_{\rm T10}(z)\,.

(20)

In the above equation, $b_{\rm T10}(z)$ represents the set of parameters given by Eq. (19), with $y(z)=\log_{10}\,(200/\Omega_{m}(z))$ , to which we add and $\{A_{r},a_{r},C_{r}\}$ that are included as fitting parameters. Best-fit values and corresponding covariance matrix for the Hydro and DMO cases are presented in Table 6.

In Figure 10 we plot the best-fit estimation for the halo bias of both Hydro and DMO runs for $z=0.293$ . As we did for Figure 9, we plot the measured halo bias as the ratio of $b=P_{hm}/P_{mm}$ . In the bottom panel of Figure 9 we show the residuals of the prescriptions with respect to the DMO one. Also in this case, for clarity we plot only the points with $\chi^{2}<2$ . Differently from what we observe in Figure 9, a small but noticeable effect is observed for rare halos. This is due to the extra freedom added by $C_{r}$ . On the low- $\nu$ regime, as it has been observed in Figure 9, halos are more clustered in the Hydro than in the DMO case. With respect to $200m$ results, there is an enhancement of the effect of baryons of about $50$ per cent at low- $\nu$ . A similar enhancement has been observed as well in the HMF, as shown in the different panels of Figure 6. Again, this more marked effect in the low- $\nu$ regime is in line with the fact that baryonic effects on the halo mass are more pronounced for lower-mass halos. The fit quality for $200c$ is very similar to that reported for $200m$ , with $\chi_{\nu}^{2}=1.053$ ( $485$ d.o.f.) and $1.008$ ( $479$ d.o.f.) for DMO and Hydro, respectively.

Table 6: Best-fit parameters and their covariance for the linear halo bias for

200c

halos presented in Eqs. (15), (19), and (20). Upper and lower part of the table are for the Hydro and DMO simulations, respectively.

		Hydro
	best-fit	$A_{r}$	$a_{r}$	$C_{r}$
$A_{r}$	0.684	$4.87\times 10^{-3}$	$5.94\times 10^{-2}$	$4.85\times 10^{-4}$
$a_{r}$	-4.53	$5.94\times 10^{-2}$	$1.24$	$8.15\times 10^{-3}$
$C_{r}$	-0.033	$4.85\times 10^{-4}$	$8.15\times 10^{-3}$	$6.28\times 10^{-5}$
		DMO
	best-fit	$A_{r}$	$a_{r}$	$C_{r}$
$A_{r}$	0.761	$3.85\times 10^{-3}$	$4.39\times 10^{-2}$	$3.68\times 10^{-4}$
$a_{r}$	-4.81	$4.39\times 10^{-2}$	$8.66\times 10^{-1}$	$5.86\times 10^{-3}$
$C_{r}$	-0.0345	$3.68\times 10^{-4}$	$5.86\times 10^{-3}$	$4.66\times 10^{-5}$

5.4 Performance of the Peak-Background Split

In Figure 11 we present our best-fit estimation for the linear halo bias, compared with the predictions of the Peak-Background Split (PBS) prescription, all computed at $z=0.293$ (our lowest redshift at which we have data from all simulations). In the top and the bottom panels we show our results for halos identified with $\Delta=200m$ and $200c$ , respectively. The residual of the PBS prescription with respect to the best-fit estimation is attached on the subplot of each panel as well. The shaded area represent the $68$ per cent confidence level of the measured ratio and it has been calculated propagating the uncertainties on both HMF and linear halo bias parameters.

For the DMO results the accuracy of the PBS prediction over the $\nu$ range covered by our simulations is similar to what has been observed by Tinker et al. (2010). The quick degradation of the PBS performance at low $\nu$ is caused by our high mass cut that affects the derivative of the HMF more drastically than the HMF itself closer to the extremes. At high- $\nu$ the PBS performs with very similar accuracy on both Hydro and DMO runs. However, at intermediate $\nu$ ( $0.2\lesssim\log_{10}\nu\lesssim 0.4$ ), the PBS performance in the Hydro is markedly worse than for the DMO. For both $200c$ and $200m$ halos, the PBS model under-predict the value of the bias by $\sim 10$ per cent for the Hydro and by few percent for the DMO halos.

The worse accuracy of the PBS prediction on the halo bias for the Hydro case is again a consequence of the variation induced by baryonic effect on halo masses. Since halos in the Hydro simulations tend to be less massive than their DMO counter-part, part of the mass inside the collapsed Lagrangian patch has not been taken into account when the mass variance is computed. In other words, baryons induce a modification of the relationship between the multiplicity function and the probability of fluctuations above a threshold on the linearly evolved density field (see Appendix A for a review of the key aspects). As the Lagrangian radius of a halo tend to be smaller in the Hydro than in the DMO, it is not surprising that also the PBS prescription underestimate the bias, as the former is a monotonic growing function of the latter.

In order to study the direct effect of baryons on clustering, we present in Figure 12 the ratio of the $P_{hh}$ computed on both Hydro and DMO runs for a matched halo catalog constructed from Box $2b$ at $z=0.25$ . Different colors are different mean masses of the corresponding DMO sample. The effect of baryons on the halo clustering is fully consistent with null effect. The relative difference fluctuates around $0$ with a scatter less than $0.3$ per cent for $k<0.1$ (Mpc/ $h$ )^-1. This demonstrates that the baryonic effect of the mass-dependent halo bias is entirely due to the variation of halo masses, while no effect is detected on the clustering pattern of matched halo catalogues.

6 Impact on cosmological constraints

Here, we will investigate the impact on cosmological inference of adopting a halo mass function and bias that neglect the effect of baryons, as calibrated in our analysis. As we will see, depending on the halo mass range explored by the data, the bias on the cosmological parameters can be quite significant. In order to address this issue, in the following we will propose to absorb the impact of baryons by suitably correcting the DMO HMF.

6.1 Cluster Counts

Galaxy cluster number counts directly constrain the halo mass function and so the cosmological parameters on which it depends.⁷⁷7See Castro et al. (2016) for a study that uses observations to constrain the halo mass function itself rather than the cosmological parameters. The number of clusters expected in a survey with sky coverage $\Omega_{\rm sky}$ within the $i$ -th redshift bin $\Delta z_{i}$ centered around $z_{i}$ and the $j$ -th mass bin of width $\Delta M_{j}=M_{j+1}-M_{j}$ is (see, e.g., Sartoris et al., 2016):

\displaystyle N_{ij}=\frac{\Omega_{\rm sky}}{8\pi}\int_{\Delta z_{i}}\!\!{\rm d}z\frac{{\rm d}V}{{\rm d}z}\int_{0}^{\infty}\!\!{\rm d}M\,n(M,z)\left(\textrm{erfc}\,x_{j}-\textrm{erfc}\,x_{j+1}\right)\!.

(21)

In the above equation $\textrm{erfc}(x_{j})$ is the complementary error function, whose argument conveys information on the halo mass through

x_{j}\equiv x(M^{\rm ob}_{j})\,=\,\frac{\ln M^{\rm ob}_{j}-\ln M_{\rm bias}-\ln M}{\sqrt{2}\sigma_{\ln M}}\,,

(22)

where $M_{\rm bias}$ models a possible bias in the mass estimation (not to be confused with the bias in the halo distribution) and $\sigma_{\ln M}$ is the intrinsic scatter in the relation between true and observed mass (masses are defined according to $\Delta=200m$ ). Following Sartoris et al. (2016), we model the latter two quantities as

	$\displaystyle\ln M_{\rm bias}$	$\displaystyle\,=\,B_{M0}+\alpha\ln(1+z)\,,$		(23)
	$\displaystyle\sigma_{\text{ln}M}^{2}$	$\displaystyle\,=\,\sigma_{\text{ln}M0}^{2}+(1+z)^{2\beta}-1\,.$		(24)

The lowest mass bin at a given redshift corresponds to $M^{ob}_{j=0}(z)=M_{\rm thr}(z)$ which defines the survey selection function, i.e. the limiting value of the observed cluster mass, $M^{ob}$ for a cluster at redshift $z$ to be included in the survey. Figure 13 reports the value of this $z$ -dependent limiting mass for a Euclid-like survey, with blue and red curves corresponding to a detection threshold of signal-to-noise 5 and 3, respectively (see Sartoris et al., 2016, Fig. 2).

Furthermore, the quantity ${\rm d}V/{\rm d}z$ appearing in Eq. (21)is the cosmology-dependent comoving volume element per unit redshift interval which is given by:

\frac{{\rm d}V}{{\rm d}z}\,=\,4\pi(1+z)^{2}\,\frac{d_{A}^{2}(z)}{c^{-1}H(z)}\,,

(25)

with $d_{A}$ is angular diameter distance and $H(z)$ the Hubble rate at redshift $z$ .

We assume Poisson errors for the cluster counts so that we can use the Cash $C$ statistics (Cash, 1979; Holder et al., 2001):

C\,=\,-2\ln L_{\rm cc}\,=\,2\sum_{ij}\left(N_{ij}-N_{ij}^{\rm obs}\ln N_{ij}\right)+K,

(26)

where $L_{\rm cc}$ is the cluster count likelihood, and $N_{ij}$ and $N_{ij}^{\rm obs}$ are expected and observed counts, respectively, and $K$ is a constant. Eq. (26) is valid under the assumption that different mass- and redshift-bins are uncorrelated. Testing this hypothesis goes beyond the purpose of this analysis, as it requires using a large ensemble of mock Euclid-like surveys.

6.2 Cluster power spectrum

The cluster catalogs discussed in Section 6.3 can also be used to calculate the power spectrum of clusters identified according to a given selection function, $P_{\rm cl}$ (Majumdar & Mohr, 2004). It is given by (see e.g. Sartoris et al., 2010):

P_{\rm cl}(k,z)=b^{2}_{\rm eff}(z)\,P(k,z)\,,

(27)

where $P(k,z)$ is the linear power spectrum and $b_{\rm eff}(z)$ is the linear bias weighted by the HMF:

b_{\rm eff}(z)=\frac{1}{\bar{n}(z)}\int_{0}^{\infty}{\rm d}M\,n(M,z)\,\textrm{erfc}\{x[M_{\rm thr}(z)]\}\,b(M,z)\,.

(28)

The normalization factor $\bar{n}(z)$ is the average number density of objects included in the survey at the redshift $z$ :

\bar{n}(z)\,=\,\int_{0}^{\infty}{\rm d}M\,n(M,z)\,\textrm{erfc}\{x[M_{\rm thr}(z)]\}\,.

(29)

Note that in Eq. (27) redshift space distortions (RSD) have been neglected.

The cluster power spectrum of Eq. (27) is valid for a small redshift interval centered around $z$ . Observationally, it is convenient to measure the $P_{\rm cl}$ within wide redshift intervals. The $P_{\rm cl}$ averaged over the $i$ -th redshift bin $\Delta z_{i}$ centered around $z_{i}$ is then (Majumdar & Mohr, 2004):

\bar{P}_{\rm cl}(k,z_{i})\,=\,\frac{\displaystyle\int_{\Delta z_{i}}{\rm d}z\,\frac{{\rm d}V}{{\rm d}z}\,\bar{n}^{2}(z)P_{\rm cl}(k,z)}{\displaystyle\int_{\Delta z_{i}}{\rm d}z\,\frac{{\rm d}V}{{\rm d}z}\,\bar{n}^{2}(z)}\,,

(30)

that is, the cluster power spectrum is weighted according to the square of the number density of clusters that are included in the survey at redshift $z$ .

As the so-defined $P_{\rm cl}$ probes linear scales, we can assume uncorrelated Gaussian errors so that we can build the following likelihood:

-2\ln L_{\rm cps}\!=\!\sum_{i,j}\!\frac{\big{[}\bar{P}_{\rm cl}(k_{j},z_{i})-\hat{P}_{\rm cl}^{\rm obs}(k_{j},z_{i})\big{]}^{2}}{\sigma^{2}_{P}(k_{j},z_{i})}+K^{\prime}

(31)

where we define again the likelihood up to a constant $K^{\prime}$ and the product runs over the redshift bins $\Delta z_{i}$ centered around $z_{i}$ and the wavenumber bins $\Delta k_{j}$ centered around $k_{j}$ . We adopt constant widths for the redshift bins, $\Delta z=0.2$ (see Figure 13), and for the $k$ -bin, $\Delta\log(k\,\text{Mpc}/h)=0.1$ with:

\{k_{\rm min},k_{\rm max}\}=\{5\times 10^{-3},\,5\times 10^{-2}\}(\text{Mpc}/h)^{-1}\,.

(32)

A coarser redshift bins should make correlations between adjacent bins negligible and while our choice for the value of $k_{\rm max}$ should make non-linear corrections to the power spectrum negligible (see Appendix B).

In Eq. (31) the variance is given by (Scoccimarro et al., 1999):

\frac{\sigma^{2}_{P}}{\bar{P}^{2}_{\rm cl}}=\frac{(2\pi)^{3}}{V_{s}V_{k}/2}\Big{[}1+\frac{1}{\bar{n}(z)\bar{P}_{\rm cl}(k,z)}\Big{]}^{2}\,,

(33)

where $V_{k}$ is the $k$ -space volume of the bin, $V_{k}=4\pi k^{2}{\rm d}k$ , and $V_{s}$ is the survey volume for the redshift bin $\Delta z_{i}$ , which can be computed using (25): $V_{s}=\Omega_{\rm sky}(4\pi)^{-1}\!\int_{\Delta z_{i}}\!{\rm d}z\,({\rm d}V/{\rm d}z)$ . As for the forecasts of this paper, we are assuming constant (in real space) window function and power spectrum, Eq. (33) is in agreement with the optimal weighting scheme of Feldman et al. (1994). Also, Eq. (33) neglects any anisotropy in the survey volume.

Galaxy clusters sample discretely the underlying matter field and the resulting shot noise has to be accounted for. Finally, in the forecasts of Section 6.3 we model the observed power spectrum via $P_{\rm cl}^{\rm obs}=\bar{P}_{\rm cl}^{\rm fid}+1/\bar{n}^{\rm fid}$ .

6.3 Fisher Matrix forecasts

We consider number counts forecasts for an Euclid-like cluster survey. The Euclid telescope is scheduled for launch in 2022 and will observe approximately $\Omega_{\rm sky}=15000\deg^{2}$ of the extragalactic sky (Laureijs et al., 2011). Following Sartoris et al. (2016), the photometric cluster survey to be carried out by Euclid will have a redshift-dependent limiting mass as shown in Figure 13 for a detection signal-to-noise (S/N) threshold of 3 and 5 (see also Euclid Collaboration, 2019). The shape of the selection functions is higher at $z\sim 0.2$ than at $z\sim 0.7$ , while one would in principle expect progressively more and more massive clusters to be only included at higher redshift. Sartoris et al. (2016) explains this counter-intuitive behavior as due to the competing contributions of cosmic variance and Poisson noise in the contaminating field counts. As in Sartoris et al. (2016), we assume

	$\displaystyle\{B_{M0,\rm{fid}},\;\alpha_{\rm{fid}}\}$	$\displaystyle\;=\;\{0,0\}\,,$		(34)
	$\displaystyle\{\sigma_{\text{ln}M0,\rm{fid}},\;\beta_{\rm{fid}}\}$	$\displaystyle\;=\;\{2,0125\}$		(34)

for the fiducial values of the four nuisance parameters of Eqs. (23)–(24). We point out that we vary these nuisance parameters in the following Fisher-Matrix forecast analysis by assuming no prior knowledge on them.

The cluster count likelihood is computed over the redshift range of Figure 13 with bins of $\Delta z=0.1$ . The observed mass range extends from the lowest mass limit ( $M_{\rm thr}$ ) to $\log M/M_{\odot}\leq 16$ , with $\Delta\log_{10}M/M_{\odot}=0.2$ .

As we will show in the following, neglecting baryonic effects can strongly bias cosmological inference. In the following we will show both the effect of assuming the DMO HMF and HB to analyse a population of clusters whose statistical properties are defined according to the Hydro calibration, and of correcting the DMO calibrations for baryonic effects by correcting halo masses according to what shown in Figure 1. To this end, we define the correcting function $hc(M,z)$ , which is defined as the inverse of the mass variation shown in Figure 1:

\displaystyle hc(M_{\rm Hydro},z)=\frac{M_{\rm DMO}}{M_{\rm Hydro}}\,.

(35)

In the above equation $M_{\rm DMO}$ is the mass in the DMO simulation of the halo that has mass $M_{\rm Hydro}$ in the hydro simulation. We then correct the DMO number density and bias according to:

	$\displaystyle n(M_{\rm Hydro},z)\,$	$\displaystyle\longrightarrow\,n\big{(}M_{\rm Hydro}\,hc,z\big{)}\frac{\textrm{d}M_{\rm DMO}}{\textrm{d}M_{\rm Hydro}}\,,$		(36)
	$\displaystyle b(M_{h},z)\,$	$\displaystyle\longrightarrow\,b\big{(}M_{\rm Hydro}\,hc,z\big{)}\,,$		(37)

where the Jacobian is:

\displaystyle\frac{\textrm{d}M_{\rm DMO}}{\textrm{d}M_{\rm Hydro}}=hc(M_{\rm Hydro},z)+M_{\rm Hydro}\frac{\textrm{d}hc}{\textrm{d}M_{\rm Hydro}}\,.

(38)

The corrected functions are then fed into the expressions for number counts and effective bias.

Forecasts based on the Fisher approximation are shown in Figures 14 and 15 for a Euclid-like survey with detection thresholds of $3\sigma$ and $5\sigma$ , respectively. We checked that the Fisher approximation is valid via a low-resolution full posterior exploration. The forecast was done by generating synthetic catalogues from the Hydro calibrations of HMF and HB, and analysing it with the corresponding DMO calibration, with (green contours) and without (blue contours) correction for the halo mass variation defined by Eqs. (36) and (37), as well as using the Hydro fit itself (red contours). This in practice means that the assumed values for the nuisance parameters will correspond to the fiducial ones only for the Hydro HMF. Figure 14 shows a significant impact of baryonic effects on the inference of cosmological parameters when assuming the lower S/N selection function. In this case, the neglecting the baryonic effects would lead to a highly significant bias toward larger values of $\sigma_{8}$ and $\Omega_{m}$ . The direction of this bias is consistent with the fact that the DMO simulations produce a significantly larger number of halos at fixed mass. Correcting for the calibrated variation of halo masses induced by baryonic effects significantly reduces the bias on the cosmological parameters. Both the impact and the constraining power significantly weakens in the case of Figure 15 as the higher mass threshold (see Figure 13) strongly reduces the statistics of clusters included in the survey, thus suppressing the relative important of baryonic effects with respect to purely statistical uncertainties. Notice that the forecast constraining power for the $5\sigma$ selection adds little information to current constraints (see e.g., Costanzi et al., 2019), illustrating the importance of better understanding the baryonic sector in order to optimize the capabilities of the survey’s next-generation.

We remark that the excellent recovery of the cosmological parameters comes with an increasing tension in the recovery of the nuisance parameters that describe the evolution and scatter of the scaling relations. This is due to the simplistic implementation of our mass correction that does not consider the full distribution of the $M_{\rm Hydro}/M_{\rm DMO}$ but instead only the median. We leave the study of a more precise implementation to future investigation. However, we stress that the tension on the nuisance parameters might, firstly, again induce a bias on cosmological constraints if, for instance, tight priors are assumed due to a better understanding of the scaling relations. Secondly, the tension might compromise the analysis’s robustness as observed for the hydrostatic mass bias (see e.g., Ade et al., 2016).

In order to quantify the impact of the baryonic effects, we compute the tension between the constraints that adopt the baryon-calibrated HMF and bias and the ones that adopt the DMO ones, either with or without the inclusion correcting function for halo masses, $hc$ . To this purpose, we use the index of inconsistency (IOI) (Lin & Ishak, 2017), which in $\sigma$ -units reads:

\displaystyle\sqrt{2\text{IOI}}

\displaystyle\equiv\sqrt{\delta^{T}{(C_{\rm hydro}+C_{\rm other})}^{-1}\delta\,}\,.

(39)

Here $\delta$ is the difference between the two vectors defined by the best-fit parameters of DMO and Hydro forecasts. We also calculate the Figure of Merit (FoM) as the square root of the Fisher matrix. For Both IOI and FoM, the covariance matrices are relative to the posteriors on $\Omega_{m}$ and $\sigma_{8}$ , marginalized over the other nuisance parameters defining mass bias and intrinsic scatter between true and observed masses. We list the results in Table 7.

From Table 7 we see that the correction of Eq. (36) significantly reduces the bias to less than a 1 $\sigma$ shift on the constraints on $\Omega_{m0}$ and $\sigma_{8}$ . The results reported in this Table confirm that the change in halo mass due to baryonic effects is responsible for most of the impact on the inference of cosmological parameters. The remaining bias is due to the way in which the correction was modeled, which does not take into account the full distribution of the relation between DMO and Hydro halo masses.

To understand the effect of baryonic physics on Cluster Counts and Cluster Power-Spectrum separately, we have also forecast a Cluster Count only analysis assuming a $3\sigma$ selection. With respect to the corresponding full analysis presented in Table 7, the FoM is reduced by a factor of $4$ , while the IOI is reduced from $8.1$ to $7.0$ . The better consistency for Cluster Counts only with respect to the full analysis is due to loosen constrain on $\Omega_{m0}$ while the constraints on $\sigma_{8}$ are not significantly changed — in agreement with Sartoris et al. (2016).

Table 7: FoM in units of

10^{3}

and Index of Inconsistency (IOI) in

\sigma

units (Lin & Ishak, 2017), without and with corrections for mass variations.

S/N of Euclid forecast	3	5
FoM (Hydro)	137	19.4
FoM (DMO)	114	17.0
$\sqrt{2\text{IOI}}$ DMO wrt Hydro	8.1	1.1
FoM (DMO+Hydro correction)	120	15.6
$\sqrt{2\text{IOI}}$ Corrected DMO wrt hydro	1.1	1.0

7 Conclusions

We have studied the effect of baryonic physics the statistical properties of halos having sizes of groups and clusters of galaxies, i.e. more massive than $10^{13}M_{\odot}/h$ , over the redshift range of $[0,2]$ , which is typical of the next generation of SZ and optical/near-IR cluster surveys. In particular, we have focused our analysis on the impact that baryonic physics has on the halo mass function (HMF) and the linear halo bias (HB), and on the implications of neglecting such baryonic effects in the inference of cosmological parameters from such quantities. To this purpose, we analyzed the Magneticum suite of cosmological hydrodynamical simulations (e.g. Dolag et al., 2016). We note that Bocquet et al. (2015) had already studied the effect of baryons on the HMF using the Magneticum set of simulations. Not surprisingly our results are broadly consistent with theirs, while extending them in several directions. Small differences with respect to the results by Bocquet et al. (2015) are due to the improved statistics of larger simulations used in this work. Furthermore, to our knowledge, we addressed for the first time the effect of baryons on the halo bias of cluster size halos.

The main results of our analysis can be summarized as follows.

•

The major role of baryons is induce small but sizeable changes in the halo masses. Halos in the Hydro runs are, at low redshift, on average, lighter than the corresponding halos in the DM-only (DMO) simulations (see Figure 1). While decreasing at increasing halo mass, this effect is present even at the highest masses sampled in our simulations, with the Hydro-to-DMO halo mass ratio asymptotically tend a value slightly lower than unity. This effect decreases at with increasing redshift. Our results are in qualitative agreement with previous analyses, such as Sawala et al. (2013); Cui et al. (2014); Velliscig et al. (2014). At redshift $z\simeq 2$ we observe that this trend is inverted, with halos in the Hydro simulations becoming slightly heavier than their DMO counterparts, especially in the low-mass end.
•

The relative difference of masses of Hydro and DMO halos is found to correlate tightly with the halo baryon mass fraction (see Table 2). At high redshift, $z\simeq 2$ , when $M_{\rm Hydro}/M_{\rm DMO}>1$ , the former correlation is dominantly driven by the stellar content: halos with a larger stellar mass fraction tend to be relatively more massive that in DMO simulations. In fact, a halo mass increase is driven by rapid gas cooling, and the ensuing star formation, that accelerates mass accretion and halo adiabatic contraction in the Hydro simulations. At lower redshifts, when $M_{\rm Hydro}/M_{\rm DMO}<1$ , the correlation is driven by the halo gas mass fraction: AGN feedback displaces and heats the gas leading to less gaseous objects at lower redshifts, that tend to have a lower mass than their DMO counterparts.
•

As a consequence of halo mass decrease in the Hydro simulations, halos in the Hydro simulations at fixed mass are less abundant and more biased than in the DMO case (see Figures 6-10). However, by comparing the halo power-spectrum of halos matched in Hydro and DMO simulations, we have shown that the effect of baryons directly on the clustering is smaller than $0.3$ per cent on linear scales, and fully consistent with null-effect (see Figure 12).
•

We verified that neglecting the effect of baryons can induce a significant bias in the inference of cosmological parameters from a Euclid-like cluster survey. However, our results also show that correcting the DMO HMF and halo bias by assuming a deterministic relationship between halo masses in Hydro and DMO simulations reduces this bias to a level comparable to the statistical uncertainties.

It is worth pointing out that the validity of the exact calibration of baryonic effects on HMF and HB presented in this work are specific to the choice of sub-resolution models adopted in the Magneticum simulations. On the other hand, the sensitivity of cosmological constraints on the inclusion of such baryonic effects calls for the need of having such effects under such a good control, for them not to dominate over the statistical uncertainties expected from the next generation of cluster surveys. Progress in this direction would probably requires following two lines of investigation. Firstly, a systematic comparison of baryonic effects in different suite of simulations, carried out by different groups and based on different sub-resolution models of star formation and feedback, should allow to set priors on the parameters describing baryonic effects (primarily the halo mass variation). Secondly, observational data on the gas and stellar mass fractions in clusters should help in assessing the actual impact of baryonic effects. In fact, the results of our analysis show that variation of halo masses correlates with the amount and distribution of both the gaseous and stellar content of clusters, two quantities that can be obtained from observational data.

Acknowledgements

It is a pleasure to thank Francisco Villaescusa-Navarro for assistance with the methodology to compute the power-spectrum. We also would like to thank Sebastian Bocquet and Antonio Ragagnin for useful discussions. TC and SB are supported by the INFN INDARK PD51 grant and by the PRIN-MIUR 2015W7KAWC grant. KD acknowledges support by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC-2094 – 3907833. VM thanks CNPq and FAPES for partial financial support. MQ is supported by the Brazilian research agencies CNPq and FAPERJ. AS is supported by the ERC-StG ‘ClustersXCosmo’ grant agreement 716762, and by the FARE-MIUR grant ’ClustersXEuclid’ R165SBKTMA. The analysis has been performed using the ‘Leibniz-Rechenzentrum’ with CPU time assigned to the Project “pr86re” and “pr83li”. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 888258.

Data availability

The data underlying this article will be shared on reasonable request to the corresponding author.

References

Ade et al. (2016) Ade P., et al., 2016, A&A, 594, A24, 1502.01597
Allen et al. (2011) Allen S. W., Evrard A. E., Mantz A. B., 2011, Ann. Rev. Astron. Astrophys., 49, 409, 1103.4829
Beck et al. (2016) Beck A. M., et al., 2016, MNRAS, 455, 2110, 1502.07358
Biffi et al. (2013) Biffi V., Dolag K., Böhringer H., 2013, MNRAS, 428, 1395, 1210.4158, ADS
Bocquet et al. (2015) Bocquet S., et al., 2015, ApJ, 799, 214, 1407.2942
Bocquet et al. (2019) Bocquet S., et al., 2019, ApJ, 878, 55, 1812.01679
Bocquet et al. (2020) Bocquet S., Heitmann K., Habib S., Lawrence E., Uram T., Frontiere N., Pope A., Finkel H., 2020, 2003.12116
Bocquet et al. (2016) Bocquet S., Saro A., Dolag K., Mohr J. J., 2016, MNRAS, 456, 2361, 1502.07357
Bonoli et al. (2020) Bonoli S., et al., 2020, 2007.01910
Borgani & Kravtsov (2011) Borgani S., Kravtsov A., 2011, Advanced Science Letters, 4, 204, 0906.4370, ADS
Bryan & Norman (1998) Bryan G. L., Norman M. L., 1998, ApJ, 495, 80, astro-ph/9710107
Bullock & Boylan-Kolchin (2017) Bullock J. S., Boylan-Kolchin M., 2017, Ann. Rev. Astron. Astrophys., 55, 343, 1707.04256
Casas-Miranda et al. (2002) Casas-Miranda R., Mo H. J., Sheth R. K., Boerner G., 2002, MNRAS, 333, 730, astro-ph/0105008
Cash (1979) Cash W., 1979, ApJ, 228, 939
Castro et al. (2016) Castro T., Marra V., Quartin M., 2016, MNRAS, 463, 1666, 1605.07548
Castro et al. (2018) Castro T., Quartin M., Giocoli C., Borgani S., Dolag K., 2018, MNRAS, 478, 1305, 1711.10017
Costanzi et al. (2019) Costanzi M., et al., 2019, MNRAS, 488, 4779, 1810.09456
Crain et al. (2015) Crain R. A., et al., 2015, MNRAS, 450, 1937, 1501.01311
Cui et al. (2014) Cui W., Borgani S., Murante G., 2014, MNRAS, 441, 1769, 1402.1493
DESI Collaboration (2016) DESI Collaboration 2016, 1611.00036
Despali et al. (2016) Despali G., Giocoli C., Angulo R. E., Tormen G., Sheth R. K., Baso G., Moscardini L., 2016, MNRAS, 456, 2486, 1507.05627
Di Matteo et al. (2008) Di Matteo T., Colberg J., Springel V., Hernquist L., Sijacki D., 2008, ApJ, 676, 33, 0705.2269
Dolag et al. (2009) Dolag K., Borgani S., Murante G., Springel V., 2009, MNRAS, 399, 497, 0808.3401
Dolag et al. (2015) Dolag K., Gaensler B., Beck A., Beck M., 2015, MNRAS, 451, 4277, 1412.4829
Dolag et al. (2016) Dolag K., Komatsu E., Sunyaev R., 2016, MNRAS, 463, 1797, 1509.05134
Ellien et al. (2019) Ellien A., Durret F., Adami C., Martinet N., Lobo C., Jauzac M., 2019, A&A, 628, A34, 1905.10816, ADS
Ettori et al. (2006) Ettori S., Dolag K., Borgani S., Murante G., 2006, MNRAS, 365, 1021, astro-ph/0509024
Euclid Collaboration (2019) Euclid Collaboration 2019, A&A, 627, A23, 1906.04707
Feldman et al. (1994) Feldman H. A., Kaiser N., Peacock J. A., 1994, ApJ, 426, 23, astro-ph/9304022
Ferland et al. (1998) Ferland G. J., Korista K. T., Verner D. A., Ferguson J. W., Kingdon J. B., Verner E. M., 1998, Publ. Astron. Soc. Pac., 110, 761
Foreman-Mackey et al. (2013) Foreman-Mackey D., Hogg D. W., Lang D., Goodman J., 2013, Publ. Astron. Soc. Pac., 125, 306, 1202.3665
Garcia & Rozo (2019) Garcia R., Rozo E., 2019, MNRAS, 489, 4170, 1903.01709
Geary (1930) Geary R. C., 1930, Journal of the Royal Statistical Society, 93, 442
Hamaus et al. (2010) Hamaus N., Seljak U., Desjacques V., Smith R. E., Baldauf T., 2010, Phys. Rev., D82, 043515, 1004.5377
Hirschmann et al. (2014) Hirschmann M., Dolag K., Saro A., Bachmann L., Borgani S., Burkert A., 2014, MNRAS, 442, 2304, 1308.0333
Holder et al. (2001) Holder G., Haiman Z., Mohr J., 2001, ApJ, 560, L111, astro-ph/0105396
Jenkins et al. (2001) Jenkins A., Frenk C., White S. D., Colberg J., Cole S., et al., 2001, MNRAS, 321, 372, astro-ph/0005260
Knebe et al. (2011) Knebe A., et al., 2011, MNRAS, 415, 2293, 1104.0949
Komatsu et al. (2011) Komatsu E., et al., 2011, ApJS, 192, 18, 1001.4538
Kravtsov & Borgani (2012) Kravtsov A. V., Borgani S., 2012, ARA&A, 50, 353, 1205.5556, ADS
Lasdon et al. (2010) Lasdon L., Duarte A., Glover F., Laguna M., Martí R., 2010, Comput. Oper. Res., 37, 1500
Laureijs et al. (2011) Laureijs R., et al., 2011, 1110.3193
Lin & Ishak (2017) Lin W., Ishak M., 2017, Phys. Rev., D96, 023532, 1705.05303
McCarthy et al. (2017) McCarthy I. G., Schaye J., Bird S., Le Brun A. M., 2017, MNRAS, 465, 2936, 1603.02702
McClintock et al. (2019) McClintock T., Rozo E., Becker M. R., DeRose J., Mao Y.-Y., McLaughlin S., Tinker J. L., Wechsler R. H., Zhai Z., 2019, ApJ, 872, 53, 1804.05866
McDonald et al. (2012) McDonald M., et al., 2012, Nature, 488, 349, 1208.2962
Majumdar & Mohr (2004) Majumdar S., Mohr J. J., 2004, ApJ, 613, 41, astro-ph/0305341
Mantz et al. (2016) Mantz A. B., Allen S. W., Morris R. G., von der Linden A., Applegate D. E., Kelly P. L., Burke D. L., Donovan D., Ebeling H., 2016, MNRAS, 463, 3582, 1606.03407
Mao et al. (2018) Mao Y.-Y., Zentner A. R., Wechsler R. H., 2018, MNRAS, 474, 5143, 1705.03888
Martizzi et al. (2014) Martizzi D., Mohammed I., Teyssier R., Moore B., 2014, MNRAS, 440, 2290, 1307.6002
Martizzi et al. (2013) Martizzi D., Teyssier R., Moore B., 2013, MNRAS, 432, 1947, 1211.2648
Merloni et al. (2012) Merloni A., et al., 2012, 1209.3114
Mo & White (1996) Mo H. J., White S. D. M., 1996, MNRAS, 282, 347, astro-ph/9512127
Newville et al. (2019) Newville M., et al.,, 2019, lmfit/lmfit-py 0.9.14
Nishimichi et al. (2019) Nishimichi T., et al., 2019, ApJ, 884, 29, 1811.09504
Press & Schechter (1974) Press W. H., Schechter P., 1974, ApJ, 187, 425
Remus et al. (2017) Remus R.-S., Dolag K., Hoffmann T. L., 2017, Galaxies, 5, 49, 1709.02393
Saro et al. (2014) Saro A., et al., 2014, MNRAS, 440, 2610, 1312.2462
Sartoris et al. (2010) Sartoris B., Borgani S., Fedeli C., Matarrese S., Moscardini L., Rosati P., Weller J., 2010, MNRAS, 407, 2339, 1003.0841
Sartoris et al. (2016) Sartoris B., et al., 2016, MNRAS, 459, 1764, 1505.02165
Sawala et al. (2016) Sawala T., et al., 2016, MNRAS, 457, 1931, 1511.01098
Sawala et al. (2013) Sawala T., Frenk C. S., Crain R. A., Jenkins A., Schaye J., Theuns T., Zavala J., 2013, MNRAS, 431, 1366, 1206.6495
Schaye et al. (2015) Schaye J., et al., 2015, MNRAS, 446, 521, 1407.7040
Schellenberger et al. (2019) Schellenberger G., David L., O’Sullivan E., Vrtilek J. M., Haines C. P., 2019, ApJ, 882, 59, 1907.10581, ADS
Scoccimarro et al. (1999) Scoccimarro R., Zaldarriaga M., Hui L., 1999, ApJ, 527, 1, astro-ph/9901099
Sheth & Tormen (1999) Sheth R. K., Tormen G., 1999, Mon.Not.Roy.Astron.Soc., 308, 119, astro-ph/9901122
Singh et al. (2020) Singh P., Saro A., Costanzi M., Dolag K., 2020, MNRAS, 494, 3728, 1911.05751
Springel (2005) Springel V., 2005, MNRAS, 364, 1105, astro-ph/0505010
Springel et al. (2005) Springel V., Di Matteo T., Hernquist L., 2005, MNRAS, 361, 776, astro-ph/0411108
Springel et al. (2005) Springel V., Di Matteo T., Hernquist L., 2005, MNRAS, 361, 776, astro-ph/0411108, ADS
Springel et al. (2005) Springel V., White S. D., Jenkins A., Frenk C. S., Yoshida N., et al., 2005, Nature, 435, 629, astro-ph/0504097
Springel et al. (2001) Springel V., White S. D. M., Tormen G., Kauffmann G., 2001, MNRAS, 328, 726, , ADS
Springel et al. (2001) Springel V., Yoshida N., White S. D., 2001, New Astron., 6, 79, astro-ph/0003162
Steinborn et al. (2016) Steinborn L. K., Dolag K., Comerford J. M., Hirschmann M., Remus R.-S., Teklu A. F., 2016, MNRAS, 458, 1013, 1510.08465, ADS
Steinborn et al. (2015) Steinborn L. K., Dolag K., Hirschmann M., Prieto M. A., Remus R.-S., 2015, MNRAS, 448, 1504, 1409.3221, ADS
Teklu et al. (2015) Teklu A. F., Remus R.-S., Dolag K., Beck A. M., Burkert A., Schmidt A. S., Schulze F., Steinborn L. K., 2015, ApJ, 812, 29, 1503.03501, ADS
Teyssier et al. (2011) Teyssier R., Moore B., Martizzi D., Dubois Y., Mayer L., 2011, MNRAS, 414, 195, 1003.4744
Tinker et al. (2008) Tinker J. L., Kravtsov A. V., Klypin A., Abazajian K., Warren M. S., Yepes G., Gottlober S., Holz D. E., 2008, ApJ, 688, 709, 0803.2706
Tinker et al. (2010) Tinker J. L., Robertson B. E., Kravtsov A. V., Klypin A., Warren M. S., Yepes G., Gottlober S., 2010, ApJ, 724, 878, 1001.3162
Tornatore et al. (2007) Tornatore L., Borgani S., Dolag K., Matteucci F., 2007, MNRAS, 382, 1050, 0705.1921
Velliscig et al. (2014) Velliscig M., van Daalen M. P., Schaye J., McCarthy I. G., Cacciato M., Le Brun A. M., Vecchia C. D., 2014, MNRAS, 442, 2641, 1402.4461
Virtanen et al. (2020) Virtanen P., et al., 2020, Nature Meth., 1907.10121
Vogelsberger et al. (2014) Vogelsberger M., Genel S., Springel V., Torrey P., Sijacki D., Xu D., Snyder G. F., Nelson D., Hernquist L., 2014, MNRAS, 444, 1518, 1405.2921
Vogelsberger et al. (2020) Vogelsberger M., Marinacci F., Torrey P., Puchwein E., 2020, Nature Reviews Physics, 2, 42, 1909.07976, ADS
Wang et al. (2020) Wang K., Mao Y.-Y., Zentner A. R., Lange J. U., van den Bosch F. C., Wechsler R. H., 2020, 2004.13732
Watson et al. (2011) Watson D., Denney K., Vestergaard M., Davis T., 2011, Astrophys. J. Lett., 740, L49, 1109.4632
Webb et al. (2015) Webb T., et al., 2015, ApJ, 809, 173, 1508.04982
Wiersma et al. (2009) Wiersma R. P. C., Schaye J., Smith B. D., 2009, MNRAS, 393, 99, 0807.3748
Xiang et al. (2013) Xiang Y., Gubian S., Suomela B., Hoeng J., 2013, The R Journal, 5, 13
Yuan et al. (2020) Yuan T., Elagali A., Labbé I., Kacprzak G. G., Lagos C. d. P., Alcorn L. Y., Cohn J. H., Tran K.-V. H., Glazebrook K., Groves B. A., Freeman K. C., Spitler L. R., Straatman C. M. S., Fisher D. B., Sweet S. M., 2020, Nature Astronomy, 2005.11880, ADS
Zhang et al. (2020) Zhang C., Churazov E., Dolag K., Forman W. R., Zhuravleva I., 2020, MNRAS, 494, 4539, 2001.10959

Appendix A Derivation of the PBS prediction

In the Press-Schechter formalism (Press & Schechter, 1974) the fraction of the matter in form of halos more massive than $M$ at redshift $z$ is equivalent to the probability of peaks higher than $\nu(M,z)=\delta_{c}(z)/\sigma(M,z)$ on the linearly evolved smoothed matter density field $\delta_{M}(z,\mathbf{x})$ . Thus, the number density $n(M,z)$ of halos with mass $M$ in the range $[M,M+\textrm{d}M]$ at redshift $z$ is:

\frac{\textrm{d}n(M,z)}{\textrm{d}M}=\frac{\rho_{\rm mc}}{M}\frac{\textrm{d}P(\delta_{M}/\sigma(M,z)>\nu)}{\textrm{d}\nu}\,,

where masses and redshift dependence has been omitted for the sake of a shorter notation. Comparing the equation above with Eq. (1) we can recognize and re-interpret the multiplicity function $f(M,z)$ as the probability:

f(M,z)\equiv\frac{\textrm{d}P}{\textrm{d}\nu}\,\frac{\textrm{d}\nu}{\textrm{d}M}=\frac{\textrm{d}P}{\textrm{d}M}\,.

The Peak-Background Split makes the additional assumption that the linearly evolved matter density field $\delta(\mathbf{x})$ can be split in two components: one with short correlation length $\delta_{s}(\mathbf{x})$ and one with long correlation length $\delta_{l}(\mathbf{x})$ . Notice that, given its short correlation length, $\delta_{s}$ gives a null contribution to $\delta_{M}$ while $\delta_{l}$ is basically unchanged by the smoothing — $\delta_{M}=\delta_{l}$ . Therefore, the conditional cumulative probability function $P(\delta_{M}/\sigma(M)>\nu(M,z)|\mathbf{x})$ can be expanded as:

	$\displaystyle P(\delta_{M}/\sigma>\nu\|\,\mathbf{x})$	$\displaystyle=P(\delta_{M}/\sigma>\nu-\delta_{M}(\mathbf{x})/\sigma)$
		$\displaystyle\approx\left(1-\frac{\delta_{M}(\mathbf{x})}{\sigma}\,\frac{\textrm{d}\,\,}{\textrm{d}\nu}\right)\,P(\delta_{M}/\sigma>\nu).$

Finally, one can write the halo density contrast at $\mathbf{x}$ :

\delta_{h}(M,z,\mathbf{x})\equiv\frac{n(M,z,\mathbf{x})}{n(M,z)}-1=\frac{1}{\sigma(M,z)}\frac{\textrm{d}\log f}{\textrm{d}\nu}\,\delta_{M}(z,\mathbf{x})\,.

Recognizing the halo bias on Lagrangian space $b_{\rm L}=\delta_{h}/\delta_{M}$ and changing the variable to $\sigma$ :

b_{\rm L}=-\frac{\delta_{c}(z)}{\sigma(M,z)}\frac{\textrm{d}\log f(z,\sigma)}{\textrm{d}\sigma}\,.

That on Eulerian space — $b=b_{L}+1$ , see (Sheth & Tormen, 1999) — and for Eq. (4) reduces to Eq. (14).

Appendix B On the choice of $k_{\rm min.}$

In Figure 16 we present all measurement of the bias for the DMO simulation at $z=0$ . The blue line is the ratio $P_{hm}/P_{mm}$ while the blue filled regions represent the $1$ and $2\,\sigma$ error bars. In black we present our best-fit prediction. The area in grey are the modes $k>k_{\rm min.}=0.05\,h$ /Mpc. As can be seen in the different panels of Figure 16, the linear approximation provides a good overall description of the simulated bias.

Appendix C On the Gaussian approximation for the distribution of $P(k)$

In order to verify the validity regime of the Gaussian approximation of the distribution of the power-spectrum of a Gaussian field, in Figure 17 we present a Kolmogorov–Smirnov (KS) test of synthetic data drawn from a $\chi^{2}$ -distribution with respect to both normal distribution $\mathcal{N}$ and the $\chi^{2}$ -distribution itself. The test is reproduced as a function of the degree of freedom of the $\chi^{2}$ distribution $\nu$ . The sample size was chosen to match the total number of mass bins summed on all simulations and redshifts, to wit $150$ . In Figure 17 solid lines represents the median of $10.000$ realizations of the KS test. Filled regions represent the $68$ percentiles around the median.

The KS test characterizes the difference between a sample and a hypothesized distribution by the supremum value of the difference between the respective cumulative distributions. Thus, the lower the value the better the distribution fits the sample. In Figure 17 we observe that the Gaussian approximation poorly fits the $\chi^{2}$ -distribution for low- $\nu$ . However, the quality of fit grows fast with the number of degrees of freedom and for $\nu$ bigger than the number of modes inside the first $k$ -shell (represented by the vertical solid black line) the differences is already subdominant with respect to the sample variance. Therefore, the Gaussian approximation introduces only a minor systematic in this work.