This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

DESI DR1 Ly𝜶\alpha 1D power spectrum: The Fast Fourier Transform estimator measurement

Corentin Ravoux 0000-0002-3500-6635    Marie-Lynn Abdul-Karim 0009-0000-7133-142X    Jean-Marc Le Goff    Eric Armengaud 0000-0001-7600-5148    Jessica N. Aguilar    Steven Ahlen 0000-0001-6098-7247    Stephen Bailey 0000-0003-4162-6619    Davide Bianchi 0000-0001-9712-0006    Allyson Brodzeller 0000-0002-8934-0954    David Brooks    Jonás Chaves-Montero 0000-0002-9553-4261    Todd Claybaugh    Andrei Cuceu 0000-0002-2169-0595    Roger de Belsunce 0000-0003-3660-4028    Axel de la Macorra 0000-0002-1769-1640    Arjun Dey 0000-0002-4928-4003    Zhejie Ding 0000-0002-3369-3718    Peter Doel    Simone Ferraro 0000-0003-4992-7854    Andreu Font-Ribera 0000-0002-3033-7312    Jaime E. Forero-Romero 0000-0002-2890-3725    Enrique Gaztañaga    Naim Göksel Karaçaylı 0000-0001-7336-8912    Satya Gontcho A Gontcho 0000-0003-3142-233X    Gaston Gutierrez    Julien Guy 0000-0001-9822-6793    Hiram K. Herrera-Alcantar 0000-0002-9136-9609    Mustapha Ishak 0000-0002-6024-466X    Robert Kehoe    David Kirkby 0000-0002-8828-5463    Theodore Kisner 0000-0003-3510-7134    Anthony Kremin 0000-0001-6356-7424    Martin Landriau 0000-0003-1838-8528    Laurent Le Guillou 0000-0001-7178-8868    Michael E. Levi 0000-0003-1887-1018    Marc Manera 0000-0003-4962-8934    Paul Martini 0000-0002-4279-4182    Aaron Meisner 0000-0002-1125-7384    Ramon Miquel    Paulo Montero-Camacho 0000-0002-6998-6678    Andrea Muñoz-Gutiérrez    Seshadri Nadathur 0000-0001-9070-3102    Gustavo Niz 0000-0002-1544-8946    Nathalie Palanque-Delabrouille 0000-0003-3188-784X    Zhiwei Pan 0000-0003-0230-6436    Will J. Percival 0000-0002-0644-5727    Ignasi Pérez-Ràfols 0000-0001-6979-0125    Matthew M. Pieri 0000-0003-0247-8991    Francisco Prada 0000-0001-7145-8674    Graziano Rossi    Eusebio Sanchez 0000-0002-9646-8198    Christoph Saulder 0000-0002-0408-5633    David Schlegel    Michael Schubnell    Hee-Jong Seo 0000-0002-6588-3508    Joseph H. Silber 0000-0002-3461-0320    Małgorzata Siudek 0000-0002-2949-2155    David Sprayberry    Ting Tan 0000-0001-8289-1481    Ji-Jia Tang 0000-0002-1860-0886    Gregory Tarlé 0000-0003-1704-0781    Michael Walther 0000-0002-1748-3745    Benjamin A. Weaver    Christophe Yèche 0000-0001-5146-8533    Jiaxi Yu 0009-0001-7217-8006    Rongpu Zhou 0000-0001-5381-4372    Hu Zou 0000-0002-6684-3997
Abstract

We present the one-dimensional Lyman-α\alpha forest power spectrum measurement derived from the data release 1 (DR1) of the Dark Energy Spectroscopic Instrument (DESI). The measurement of the Lyman-α\alpha forest power spectrum along the line of sight from high-redshift quasar spectra provides information on the shape of the linear matter power spectrum, neutrino masses, and the properties of dark matter. In this work, we use a Fast Fourier Transform (FFT)-based estimator, which is validated on synthetic data in a companion paper. Compared to the FFT measurement performed on the DESI early data release, we improve the noise characterization with a cross-exposure estimator and test the robustness of our measurement using various data splits. We also refine the estimation of the uncertainties and now present an estimator for the covariance matrix of the measurement. Furthermore, we compare our results to previous high-resolution and eBOSS measurements. In another companion paper, we present the same DR1 measurement using the Quadratic Maximum Likelihood Estimator (QMLE). These two measurements are consistent with each other and constitute the most precise one-dimensional power spectrum measurement to date, while being in good agreement with results from the DESI early data release.

1 Introduction

The Lyman-α\alpha (Lyα\alpha) forest is a set of resonant Lyα\alpha (λ=1215.67\lambda=1215.67Å) absorption features imprinted in the spectra of background objects, such as quasars, caused by intervening neutral hydrogen overdensities in the intergalactic medium (IGM) along their lines-of-sight. From ground-based observations, the Lyα\alpha forest is measured in the redshift range 2z62\lesssim z\lesssim 6. At higher redshift, the ionization state of the IGM is such that Lyα\alpha absorption is near-complete, while the atmospheric UV cutoff constrains lower redshift observations.

Thanks to the redshift effect, the absorption features measured at different but closeby wavelengths probe separate, closeby, IGM regions: for example, at z=2.4z=2.4, absorptions seen at 22 Å separation are sourced on average by clouds of IGM separated by 1.5h1\sim 1.5\,h^{-1} Mpc. This property makes the Lyα\alpha forest a unique probe to study the matter density field on small cosmological scales at z>2z>2 [1, 2, 3, 4].

Many summary statistics based on Lyα\alpha measurements have been used to infer properties of the IGM and the underlying matter field (e.g. [5, 6, 7]). Here, we focus on P1D,αP_{1\mathrm{D},\alpha}, the one-dimensional power spectrum of the fluctuations of the Lyα\alpha transmission field along lines-of-sight. The Lyα\alpha transmitted flux fraction F(λ)F(\lambda) is the relative IGM Lyα\alpha absorption as a function of wavelength in the observer’s frame, i.e., the ratio of the measured flux to the background source’s intrinsic flux, in the absence of any contaminants such as absorptions by metals or instrumental effects. The corresponding flux contrast is δF=F/F¯(λ)1\delta_{F}=F/\bar{F}(\lambda)-1, and P1D,αP_{1\mathrm{D},\alpha} is the one-dimensional power spectrum of the Lyα\alpha forest signal in δF\delta_{F}. As a decisive advantage, P1D,αP_{1\mathrm{D},\alpha} is quite directly connected to the linear matter power spectrum at redshift z>2z>2 and near-Mpc scales (e.g. [8, 9]). As a consequence, P1D,αP_{1\mathrm{D},\alpha} measurements has been used to test several cosmological scenarios, such as a running of the primordial matter power spectrum, the effect of the sum of neutrino masses [10, 11, 12, 13, 14], and several dark matter models such as warm [15, 16, 17, 18, 13, 19, 14], fuzzy [20, 21], or interacting dark matter [22], or primordial black holes [23].

The Dark Energy Spectroscopic Instrument (DESI) survey [24, 25, 26] provides an unprecedented sample of high-redshift quasar spectra from which Lyα\alpha forest information can be extracted. A major goal of DESI is to measure the three-dimensional auto-correlation of the Lyα\alpha forest, as well as its cross-correlation with quasar positions, on large scales, to derive the BAO scale and linear growth of structures at redshift z2.3z\sim 2.3 [27, 28, 29, 30, 31, 32]. In this article, we focus on the one-dimensional correlations of the Lyα\alpha forest, which can be studied down to small scales thanks to DESI spectral resolution (R=Δλ/λ[2,0005,100]R=\Delta\lambda/\lambda\in[2,000-5,100] for wavelength ranging from 3,6003,600 to 9,8009,800 Å). In [33, 34], we reported on the first measurements of the related power spectrum based on the DESI Early Data Release (EDR) [35, 36] and its first two months (M2) data sample (noted EDR+M2 hereafter). Two different estimators were used to compute P1D,αP_{1\mathrm{D},\alpha}, one based on the Fast-Fourier Transform of individual 1D spectra (FFT [33] and noted R23 hereafter), and one based on a Quadratic Maximum Likelihood Estimator (QMLE, [34]). In this article, as well as companion papers [37, 38] (noted K25 and KR25 hereafter), we update those measurements, using the vastly larger (five times more) Lyα\alpha sample included in the DESI Data Release 1 (DR1) [39], built from the first year of DESI main survey. In addition to the massive statistical gain, we present a series of analysis improvements and robustness tests.

Given the abundance of quasar spectra collected over the years from different instruments, several measurements of P1D,αP_{1\mathrm{D},\alpha} were performed over a range of redshifts and wavenumbers. On the one hand, with high-resolution spectra from, e.g., VLT/UVES [40], VLT/X-shooter [41] or Keck [42, 43], P1D,αP_{1\mathrm{D},\alpha} was measured for large values of the wavenumber kk. On the other hand, P1D,αP_{1\mathrm{D},\alpha} was estimated for smaller values of kk, and with higher statistical precision, thanks to the massive Lyα\alpha forest samples from the first SDSS spectroscopic survey, the Baryon Oscillation Spectroscopic Survey (BOSS)[44] and follow-up eBOSS[45] surveys. Compared to BOSS and eBOSS data, the Lyα\alpha spectra from DESI have a better spectral resolution, therefore enabling measurements of P1D,αP_{1\mathrm{D},\alpha} up to larger kk, overlapping with the measurements derived from the high-resolution samples.

The outline of this paper is as follows. After presenting the dataset used in section 2, including the catalogs of contaminants, we describe the current implementation of the FFT method in section 3, and present a new characterization of the impact of instrumental properties, especially the noise in section 4. We then discuss in section 5 a series of data splits and analysis variations that were used to assess the robustness of the measurement. After that, we turn to our estimate of the statistical covariance and systematic uncertainties in section 6. Finally, we present in section 7 our measurement and compare it to high-resolution and eBOSS measurements.

2 Data

2.1 The DESI DR1 data set

Refer to caption
Figure 1: Histogram of the quasar redshifts in the DR1 data release. All quasars with redshift 2.0<z<5.02.0<z<5.0 are represented in blue. The distribution of Broad Absorption Line (BAL) quasars defined by the Absorption Index criterion (AI>0AI>0) appears in yellow, and the distribution of quasars with Balnicity Index criterion (BI>0BI>0) is in green. The distribution of quasars containing at least one detected Damped Lyα\alpha (DLA) object in their spectra is shown in red.

The Dark Energy Spectroscopic Instrument (DESI) is a multi-object spectrograph whose purpose is to measure the spectra of more than 4040 million galactic (stars) and extra-galactic objects (galaxies and quasars). The main science goal of DESI is to infer the properties of dark energy using Baryon Acoustic Oscillations (BAO) and Redshift Space Distortions (RSD) measured in galaxy, quasars, and Lyα\alpha correlation functions. Additionally, the significant statistics and variety of observations allows DESI to cover broader scientific objectives, including the one presented in this article. The focal plane of DESI is equipped with 5,0005,000 robotic fiber positioners [25, 46, 47, 48, 49] to quickly reconfigure the observation pattern and point towards pre-selected targets [50]. A full description of the DESI instrument is given in [26]. We use the first data release noted DESI-DR1 (or DR1 for conciseness), which results from the first year of DESI observation [39].

2.2 Quasar catalog and spectra

Quasars candidates are pre-selected with a dedicated targeting to obtain a denser sample of high-redshift targets [50, 51, 52]. The DESI-DR1 quasar sample is constructed by automated classifiers. A first template fitting software called redrock \faGithub111https://github.com/desihub/redrock [53] is used to classify quasars, galaxies, and stars, and to provide a robust redshift measurement. A dedicated neural network software called QuasarNet \faGithub222https://github.com/desihub/QuasarNP [54, 55] picks up quasars missed by redrock. As QuasarNet is designed explicitly for classification, the redshift of retrieved quasars is determined by running redrock again with an informed prior. Finally, a Mgii line fitter algorithm catches low-redshift quasars (1.5<z<2.01.5<z<2.0) whose redshifts were correctly measured, but which have been classified as objects of another type. The full description of the pipeline used to generate the quasar catalog is given in [52]. The histogram of the DR1 quasar redshifts is represented in figure 1. It contains 521,509521,509 quasars with a redshift in the Lyα\alpha forest range of interest (2.0<z<5.02.0<z<5.0).

Several modifications are made to this quasar catalog. First, spectra from secondary target programs dedicated to the search for high-redshift quasars are added. A visual inspection campaign of those targets is carried out (see the companion paper K25 for more details). Second, we remove quasars whose measured individual power spectrum appears as an outlier. The full DR1 sample cannot be entirely visually inspected and still contains outliers passing the automated classifier criteria. We carry out a pre-study in which the individual power spectrum of each Lyα\alpha forest is measured. Taking the full DR1 catalog, we remove quasars for which the average of individual power spectrum over large scales (k<0.8k<0.8 Å-1) is more than 20σ20\sigma higher than the average for all quasars, where σ\sigma is the standard deviation over all quasars. This analysis is carried out in the Lyα\alpha and side-band SB1 (1,270<λrf<1,3801,270<\lambda_{\mathrm{rf}}<1,380 Å) regions used later in this paper. A visual inspection showed that the removed spectra are indeed quasar spectra but exhibiting spikes caused by instrumental artifacts. A small number of 77 and 1818 quasars are removed by conducting this study in the Lyα\alpha and SB1 region, respectively. This removal eliminates spurious oscillations in the measurement of the one-dimensional power spectrum.

2.3 Other catalogs

The computation of one-dimensional power spectra necessitates accounting for several effects that causes changes in the spectra (absorptions or emissions), and that are not directly related to IGM absorptions. Our choice concerning these contaminants is to remove the spectra from the data sample or mask the impacted spectral region.

First, the Broad Absorption Lines (BAL) quasars present specific blueshifted absorptions caused by the quasar torus, thus directly related to its surroundings. These specific quasars are found using the baltools \faGithub333https://github.com/paulmartini/baltools software [56], which is a χ2\chi^{2} minimizer that measures blueshifted Civ or Siiv absorptions compared to an unabsorbed quasar continuum. The baltools software computes for each quasar spectrum the Balnicity Index (BI) and Absorption Index (AI) [56] defined as

BI=250003000[1f^(v)0.9]C(v)𝑑v,AI=250000[1f^(v)0.9]C(v)𝑑v,\begin{split}BI&=-\int_{25000}^{3000}\left[1-\frac{\hat{f}(v)}{0.9}\right]C(v)dv\,,\\ AI&=-\int_{25000}^{0}\left[1-\frac{\hat{f}(v)}{0.9}\right]C(v)dv\,,\end{split} (2.1)

where f^\hat{f} is the normalized flux, i.e., the observed quasar flux divided by a model fitted to the quasar without BAL features, vv is the separation from the Civ line center in velocity units, and CC is a constant null term unless [1f^(v)/0.9][1-\hat{f}(v)/0.9] is greater than zero for more than 2,000kms12,000\ \text{km$\cdot$s}^{-1}. The BI>0BI>0 is a stricter criterion to define a BAL quasar, thus all BI>0BI>0 BAL quasars are included in the AI>0AI>0 sample. The P1D,αP_{1\mathrm{D},\alpha} measurement is carried out after removing all quasars that satisfies the BI>0BI>0 criterion (4%\sim 4\% in the total data set).

Specific absorptions called High-Column Density (HCD) systems are present in the spectrum of some quasars. They are caused by the passage of the quasar light in the vicinity of a galaxy, called a circumgalactic medium. HCD objects contaminate the Lyα\alpha forest by imprinting a saturated absorption that spans a much larger spectral range than the size of the galaxy itself and adds metal absorption lines present in the circumgalactic medium [57]. The HCD objects are detected using the combination of a convolutional neural network (CNN) algorithm desi-dlas \faGithub444https://github.com/cosmodesi/desi-dlas [58] and a Gaussian process (GP) algorithm \faGithub555https://github.com/jibanCat/gpy_dla_detection [59]. The GP algorithm provides a better estimate of the HCD column density NHiN_{\mathrm{H{\textsc{i}}}}, and the CNN a better detection completeness. We only consider the Damped Lyα\alpha(DLA) objects, defined by NHi>1020.3cm2N_{\mathrm{H{\textsc{i}}}}>10^{20.3}~\mathrm{cm}^{-2}. We take the same criteria as R23 to define our sample: valid DLA detection by CNN with a confidence level higher than 0.20.2 for quasars continuum-to-noise ratio CNR¯>3\overline{\mathrm{CNR}}>3 and a confidence level higher than 0.30.3 when CNR¯<3\overline{\mathrm{CNR}}<3. For the GP algorithm, we take a minimal 0.90.9 confidence level. The final catalog results from the merging of DLA detected by either algorithm. When both algorithms detect the DLA, the GP column density is used. The core region of a detected DLA, defined by an induced absorption larger than 20%20~\%, is masked at the spectrum level. The remaining DLA damping wings are corrected at the spectrum stage using a Voigt profile (see e.g. [27] for details regarding DLA masking). Approximately 1616 % of the quasars in the DR1 catalog have at least one detected DLA in their spectrum. The redshift distribution of quasars presenting at least one detected DLA and that of quasars passing the AI>0AI>0 or BI>0BI>0 BAL quasar criteria are shown in figure 1.

Finally, absorption lines caused by passing through the Milky Way and atmospheric emission lines cause spurious spectra modifications even after being corrected by the DESI pipeline. We mask the lines in the line catalog adapted to DESI spectral resolution and developed in R23(\faGithub666https://github.com/corentinravoux/p1desi/tree/main/etc/skylines/list_mask_p1d_DESI_EDR.txt). The transmitted flux fraction of masked pixels (DLA or lines) is fixed to its average value (F=F¯F=\overline{F}), which corresponds to a zero flux contrast. This masking induces a bias on P1D,αP_{1\mathrm{D},\alpha} that will be corrected in section 7.

3 Method

The one-dimensional power spectrum is computed with the same method as R23. This section aims to summarize the continuum fitting procedure and the P1D,αP_{1\mathrm{D},\alpha} FFT estimator used in this paper and to update the studies carried out in R23 with the DR1 dataset employed here.

3.1 Continuum fitting procedure

The continuum fitting procedure estimates the flux contrast δF\delta_{F}, i.e., the normalization of the flux variation caused by all absorptions in the Lyα\alpha forest, from the measured flux ff at a specific wavelength λ\lambda:

δF(λ)=F(λ)F¯(λ)1=f(λ)Cq(λ,zq)F¯(λ)1,\delta_{F}(\lambda)=\frac{F(\lambda)}{\overline{F}(\lambda)}-1=\frac{f(\lambda)}{C_{\mathrm{q}}(\lambda,z_{\mathrm{q}})\overline{F}(\lambda)}-1\,, (3.1)

where Cq(λ,zq)C_{\mathrm{q}}(\lambda,z_{\mathrm{q}}) is the quasar continuum corresponding to the unabsorbed flux emitted by the quasar qq and expressed at the quasar redshift zqz_{q}. The term F¯\overline{F} corresponds to the average fraction of transmitted flux in the IGM. We use the picca \faGithub777https://github.com/igmhub/picca [60] software package to fit for the product Cq(λ,zq)F¯(λ)C_{\mathrm{q}}(\lambda,z_{\mathrm{q}})\overline{F}(\lambda) using the following parameterization for the continuum:

Cq(λ,zq,aq,bq)=(aq+bqλ)C(λrf),C_{\mathrm{q}}(\lambda,z_{\mathrm{q}},a_{\mathrm{q}},b_{\mathrm{q}})=\left(a_{\mathrm{q}}+b_{\mathrm{q}}\lambda\right)C\left(\lambda_{\mathrm{rf}}\right)\,, (3.2)

where λrf=λ/(1+zq)\lambda_{\mathrm{rf}}=\lambda/(1+z_{\mathrm{q}}) is the rest-frame wavelength, aqa_{\mathrm{q}} and bqb_{\mathrm{q}} are parameters fitted for each quasar individually, accounting for their variability, CC is a function used to model the common continuum of all quasars. The latter is computed as the average of all the quasar spectra available in the sample. The fitting procedure is done iteratively for the ensemble of all quasars, updating the common continuum and fitting for the aqa_{\mathrm{q}} and bqb_{\mathrm{q}} terms at each iteration, by minimizing the following log-likelihood:

ln=12i[f(λi)F¯(λi)Cq(λi,zq,aq,bq)]2(F¯(λi)Cq(λi,zq,aq,bq))2ln[(F¯(λi)Cq(λi,zq,aq,bq))2].\ln\mathcal{L}=-\frac{1}{2}\sum_{i}\frac{\left[f(\lambda_{i})-\overline{F}(\lambda_{i})C_{\mathrm{q}}\left(\lambda_{i},z_{\mathrm{q}},a_{\mathrm{q}},b_{\mathrm{q}}\right)\right]^{2}}{\left(\overline{F}(\lambda_{i})C_{\mathrm{q}}\left(\lambda_{i},z_{\mathrm{q}},a_{\mathrm{q}},b_{\mathrm{q}}\right)\right)^{2}}-\ln\left[\left(\overline{F}(\lambda_{i})C_{\mathrm{q}}\left(\lambda_{i},z_{\mathrm{q}},a_{\mathrm{q}},b_{\mathrm{q}}\right)\right)^{2}\right]\,. (3.3)

The noise associated to each flux contrast is directly given by the estimate of the DESI pipeline noise, noted σpip,q(λ)\sigma_{\mathrm{pip,q}}(\lambda) (see [61] and R23), normalized by the Cq(λ,zq)F¯(λ)C_{\mathrm{q}}(\lambda,z_{\mathrm{q}})\overline{F}(\lambda) product. We then define the mean signal-to-noise ratio of a quasar qq as

SNR¯=f(λ)σpip,q(λ)λ.\overline{\mathrm{SNR}}=\left\langle\frac{f(\lambda)}{\sigma_{\mathrm{pip,q}}(\lambda)}\right\rangle_{\lambda}\,. (3.4)

Following R23, we only keep quasars with SNR¯>1\overline{\mathrm{SNR}}>1. The fitting procedure is performed for an observed wavelength range of 3,600<λ<7,6003,600<\lambda<7,600 Åand a rest-frame range of 1,050<λrf<1,1801,050<\lambda_{\mathrm{rf}}<1,180 Å. The quasar spectra are linearly binned in observed wavelength with Δλpix=0.8\Delta\lambda_{\mathrm{pix}}=0.8 Å. We adopt the same rebinning scheme of the common continuum CC in the rest-frame basis with a bin size equal to Δλpix,rf=2.67\Delta\lambda_{\mathrm{pix,rf}}=2.67 Å. Finally, the stack of all flux contrasts is set to zero at the end of the procedure.

3.2 Fast Fourier Transform estimator

The one-dimensional Lyα\alpha power spectrum, noted P1D,αP_{1\mathrm{D},\alpha}, is defined as the power spectrum of the Lyα\alpha absorption contrast δLyα\delta_{\mathrm{Ly\alpha}} along the quasar line-of-sight. Following R23, we include in the P1D,αP_{1\mathrm{D},\alpha} definition the cross-correlations between the Lyα\alpha absorption line and other IGM elements whose rest-frame absorption is very close to the Lyα\alpha line. In particular, we include absorptions from Siii and Siiii (λSiii=1,190\lambda_{\mathrm{Si{\textsc{ii}}}}=1,190 and 1,1931,193 Å, and λSiiii=1,206.50\lambda_{\mathrm{Si{\textsc{iii}}}}=1,206.50 Å) whose cross-correlation with Lyα\alpha absorption is modeled at the cosmological interpretation stage. Neglecting the Siii and Siiii auto-correlations as they correspond to minor metal lines, we mathematically define P1D,αP_{1\mathrm{D},\alpha} as

(2π)δD(kk)P1D,α(k)=δLyα(k)δLyα(k)+δLyα(k)δSiii(k)+δLyα(k)δSiii(k)+δLyα(k)δSiiii(k)+δLyα(k)δSiiii(k),\begin{split}(2\pi)\delta_{\mathrm{D}}(k-k^{\prime})P_{1\mathrm{D},\alpha}(k)=&\left\langle\delta_{\mathrm{Ly\alpha}}(k)\delta_{\mathrm{Ly\alpha}}^{*}(k^{\prime})\right\rangle+\left\langle\delta_{\mathrm{Ly\alpha}}(k)\delta_{\mathrm{Si{\textsc{ii}}}}^{*}(k)\right\rangle+\left\langle\delta_{\mathrm{Ly\alpha}}^{*}(k)\delta_{\mathrm{Si{\textsc{ii}}}}(k)\right\rangle\\ &+\left\langle\delta_{\mathrm{Ly\alpha}}(k)\delta_{\mathrm{Si{\textsc{iii}}}}^{*}(k)\right\rangle+\left\langle\delta_{\mathrm{Ly\alpha}}^{*}(k)\delta_{\mathrm{Si{\textsc{iii}}}}(k)\right\rangle\,,\end{split} (3.5)

where δD\delta_{\mathrm{D}} is the Dirac function, δSiii\delta_{\mathrm{Si{\textsc{ii}}}} and δSiiii\delta_{\mathrm{Si{\textsc{iii}}}} are the absorption contrasts associated to the Siii and Siiii lines.

To compute P1D,αP_{1\mathrm{D},\alpha}, we first separate the flux contrast δF\delta_{F} from each spectrum into three non-overlapping sub-forests of equal size, Lsub=43.3L_{\mathrm{sub}}=43.3 Å in the rest frame, with a flux contrast noted δF,s\delta_{F,s}. Each of the 3Nq3N_{\mathrm{q}} sub-forests (with NqN_{\mathrm{q}} the total number of quasars) is referred to with the index ss in the following equations. This splitting reduces correlations between redshift bins and increases the smallest accessible wavenumber to kmin=2π/(Lsub(1+zmin))=0.0468k_{\mathrm{min}}=2\pi/(L_{\mathrm{sub}}(1+z_{\mathrm{min}}))=0.0468 Å-1. Furthermore, the resulting sub-forest covers at most Δz=0.2\Delta z=0.2, and we use this value to define the redshift binning for P1D,αP_{1\mathrm{D},\alpha}. The redshift associated with a sub-forest is determined by the center of its observed wavelength range, which assigns it to a redshift bin. Therefore, each redshift bin contains information from adjacent redshift bins, as sub-forests with a central redshift near a bin edge have half their spectrum extending beyond their redshift bin. It implies the existence of an inter-redshift correlation, which is not accounted for in this work but will be in future studies. The sub-forest flux contrasts are then transformed to Fourier space by picca using a multiprocessed Fast Fourier Transform (FFT) algorithm.

The P1D,αP_{1\mathrm{D},\alpha} FFT estimator is built under the main assumption that each sub-forest in our dataset provides an independent realization of the IGM Lyα\alpha absorptions at the redshift at which they occur. In other words, we assume that the average of all sub-forests gives an unbiased estimator of the one-dimensional power spectrum, and we neglect correlations between sub-forests. The FFT estimator is derived from the relation between the observed flux contrast δF,s\delta_{F,s} obtained after continuum fitting and the Lyα\alpha contrast δLyα,s\delta_{\mathrm{Ly\alpha},s}. This relation is derived by considering all the instrumental (noise, resolution) and astrophysical (metals) effects that imprint absorptions in the Lyα\alpha region. Following the decomposition detailed in R23, the FFT P1D,αP_{1\mathrm{D},\alpha} estimator for the wavenumber bin AA and redshift bin zz is defined by

P1D,α(A,z)=Praw,s(k)Pnoise,s(k)Rs2(k)sz,kAPmetals(A,z)=Ps(k)sz,kAPmetals(A,z),P_{1\mathrm{D},\alpha}(A,z)=\left\langle\frac{P_{\mathrm{raw},s}(k)-P_{\mathrm{noise},s}(k)}{R_{s}^{2}(k)}\right\rangle_{s\in z,k\in A}-P_{\mathrm{metals}}(A,z)=\left\langle P_{s}(k)\right\rangle_{s\in z,k\in A}-P_{\mathrm{metals}}(A,z)\,, (3.6)

where Praw,sP_{\mathrm{raw},s} is the power spectrum of the flux contrast δF,s\delta_{F,s}, Pnoise,sP_{\mathrm{noise},s} is the noise power spectrum, PmetalsP_{\mathrm{metals}} is the metal power spectrum resulting from the combination of metal forests whose emission lines are located redwards of the three Lyα\alpha, Siii and Siiii absorptions, PsP_{s} is a practical term called individual power spectrum of the sub-forest ss, and RsR_{s} is the average of the Fourier transform of the resolution matrix provided by the DESI pipeline. This matrix encodes for each spectrum and each wavelength the complete resolution information, resulting from spectroscopic extraction (see e.g. [62, 61] and R23 for more details). The resolution matrix is symmetric and is averaged for each spectra pixel in the wavelength space before applying the Fourier transform. We call Praw,sP_{\mathrm{raw},s}, Pnoise,sP_{\mathrm{noise},s} and PsP_{s} power spectra although they correspond to one single sub-forest and only their average over many sub-forest is a measurement of the power spectrum.

The picca software is used to compute this FFT estimator. In agreement with R23, we fix our minimal redshift to zmin=2.1z_{\mathrm{min}}=2.1 to avoid the λ3,700\lambda\lesssim 3,700 Å region of the spectra where atmospheric absorptions significantly increase the flux noise. As we apply several masks (DLA, BAL, and atmospheric emission line catalogs), we remove sub-forests shorter than 75 wavelength pixels or with more than 120 masked pixels. We compute the raw power spectrum for each sub-forest directly from the flux contrast as Praw,s=|δF,s(k)2|P_{\mathrm{raw},s}=\left|\delta_{F,s}(k)^{2}\right|. The noise power spectrum Pnoise,sP_{\mathrm{noise},s} is computed using the DESI pipeline noise. The software generates a large number (NG=2500N_{\mathrm{G}}=2500) of Gaussian signals δnoise,s\delta_{\mathrm{noise},s} with standard deviation equal to σpip,s(λ)\sigma_{\mathrm{pip,s}}(\lambda) for each wavelength, and the noise power spectrum is obtained by averaging those signals in Fourier space, i.e. Pnoise,s(k)=|δnoise,s|2NGP_{\mathrm{noise},s}(k)=\left\langle\left|\delta_{\mathrm{noise},s}\right|^{2}\right\rangle_{N_{\mathrm{G}}}.

The picca software averages the individual power spectra with a SNR¯\overline{\mathrm{SNR}} weighting scheme. In each bin (A,z)(A,z), we distribute the power spectra of individual sub-forests in bins in the signal-to-noise ratio of the sub-forests SNR¯s\overline{\mathrm{SNR}}_{s}. We compute the variance of the power spectra in each SNR¯s\overline{\mathrm{SNR}}_{s} bin. We then fit these variances with a simple least-squares minimization as a function of SNR¯s\overline{\mathrm{SNR}}_{s} according to:

σ(A,z)2(SNR¯s)=aA,z(SNR¯s1)2+bA,z.\sigma_{(A,z)}^{2}\left(\overline{\mathrm{SNR}}_{s}\right)=\frac{a_{A,z}}{\left(\overline{\mathrm{SNR}}_{s}-1\right)^{2}}+b_{A,z}\,. (3.7)

The fitted terms aA,za_{A,z} and bA,zb_{A,z} are used to define the inverse-variance weights of each sub-forest as wA,s=1/σ(A,z)2(SNR¯s)w_{A,s}=1/\sigma_{(A,z)}^{2}\left(\overline{\mathrm{SNR}}_{s}\right). By construction, the weights tend to zero when SNR¯\overline{\mathrm{SNR}} tends to 1, in agreement with the applied SNR¯>1\overline{\mathrm{SNR}}>1 cut. Finally, the weighted average in equation 3.6 is explicitly computed as:

Ps(k)sz,kA=sz,kAwA,sPs(k)sz,kAwA,s.\left\langle P_{s}(k)\right\rangle_{s\in z,k\in A}=\frac{\sum_{s\in z,k\in A}w_{A,s}P_{s}(k)}{\sum_{s\in z,k\in A}w_{A,s}}\,. (3.8)

4 Instrumental characterization of one-dimensional power spectrum

4.1 Updates on DR1 data set

Some methods applied to the analysis of EDR+M2 DESI datasets in R23 are applied in this paper to the DR1 data set without any modifications. We review the new characterization of those instrumental features with the DR1 data set. The updated figures are located in appendix A.

The metal power spectrum, PmetalsP_{\mathrm{metals}} in equation 3.6, is estimated using the side-band region SB1 (1,270<λrf<1,3801,270<\lambda_{\mathrm{rf}}<1,380 Å). We show the SB1 power spectrum in figure 16, which is measured using the averaging of equation 3.6 in the SB1 rest-frame range. Here, we use a rest-frame range distinct from the Lyα\alpha forest one, but with the same observed wavelength range. The quasars selected for SB1 are then located at lower redshifts than those selected for Lyα\alpha. Consequently, our correction assumes that the continuum fitting process can catch the quasar emission variability as a function of redshift. As in R23, we average the side-band power spectrum over redshift bin on a common rest-frame wavenumber binning krest=(1+z)×kobsk_{\mathrm{rest}}=(1+z)\times k_{\mathrm{obs}}. This average is fitted with a least-square minimization by a physically motivated model considering Siiv and Civ doublets oscillations:

PSB1,m(k)=C×kϵ+i[Siiv,Civ]Cieciksin(2π(kki)+ψi).P_{\mathrm{SB1,m}}(k)=C\times k^{-\epsilon}+\sum_{i\in[Si{\textsc{iv}},C{\textsc{iv}}]}C_{i}e^{-c_{i}k}\sin\left(2\pi\left(\frac{k}{k_{i}}\right)+\psi_{i}\right)\,. (4.1)

The evolution of the SB1 power spectrum as a function of redshift is accounted for in a second step, by multiplying and fitting a linear function independently for each redshift, PSB1,m,z(k)=(azk+bz)PSB1,m(k)P_{\mathrm{SB1,m,z}}(k)=(a_{z}k+b_{z})P_{\mathrm{SB1,m}}(k). The resulting fitted functions are used as the metal power spectrum correction in equation 3.6. The fitted parameters for SB1 power spectrum are similar to the EDR+M2 ones in R23. The DR1 power spectrum contains a larger number of sub-forests (by at least a factor 55 for all redshifts) and consequently gives lower error bars.

The R23 method is applied to measure the resolution damping with DR1 dataset. The resolution damping term in equation 3.6 is the average of the Fourier transform of the resolution matrix. The average resolution damping is shown in Fig. 15 and is almost identical to the one measured for EDR+M2 data set in R23. Consequently, we choose the same maximal wavenumber value kmax=2.0k_{\mathrm{max}}=2.0 Å-1, corresponding to a factor 5 correction.

4.2 Improving noise assessment

Refer to caption
Figure 2: (left) Average raw (Praw,s(k)sz,kA\left\langle P_{\mathrm{raw},s}(k)\right\rangle_{s\in z,k\in A}) and noise (Pnoise,s(k)sz,kA\left\langle P_{\mathrm{noise},s}(k)\right\rangle_{s\in z,k\in A}) power spectra for the DR1 dataset as a function of the wavenumber expressed in Å-1. (right) Difference between raw and noise power spectra on the same dataset and resulting asymptotic difference α\alpha.

We measure the noise power spectrum with the method detailed in section 3.2 directly from the DESI pipeline output. As highlighted in R23, characterizing the noise sources in DESI is challenging, and the noise level can be underestimated. To have an alternative way to estimate the noise level, we use the smallest scales of P1D,αP_{1\mathrm{D},\alpha}, for which the resolution damping completely suppresses the signal, resulting in an expected equality between the raw and noise power spectra. Consequently, the measurement of Praw,sP_{\mathrm{raw},s} at the largest wavenumber is a suitable noise level estimator. The noise and raw power spectra are compared in figure 2 for the DR1 dataset.

We derive an additive noise correction by averaging the Praw,sPnoise,sP_{\mathrm{raw},s}-P_{\mathrm{noise},s} difference for wavenumber where the resolution damping is higher than 98 % (kres,98=3.1k_{\mathrm{res},98}=3.1 Å-1), which gives α=Praw,s(k)z,k>kres,98Pnoise,s(k)z,k>kres,98\alpha=\left\langle P_{\mathrm{raw},s}(k)\right\rangle_{\forall z,k>k_{\mathrm{res},98}}-\left\langle P_{\mathrm{noise},s}(k)\right\rangle_{\forall z,k>k_{\mathrm{res},98}}. The measured α=6.9×104\alpha=6.9\times 10^{-4} Å for DR1 corresponds to a 2.8 % noise power spectrum underestimation. This is lower than the value α=1.09×103\alpha=1.09\times 10^{-3} obtained for DESI-M2 in R23, a dataset with similar properties, which means we have better control over the noise characterization thanks to the dataset quality. In contrast to R23, the noise power spectrum is not flat for k2.0k\lesssim 2.0 Å-1. This is due to the SNR¯\overline{\mathrm{SNR}} weighting used for DR1 and not for R23. Indeed, the noise power spectrum is flat for all individual power spectra, but since we are considering different weights in the k[0.02.0]k\in[0.0-2.0] Å-1 range, the SNR¯\overline{\mathrm{SNR}} weights distribution impacts the averaging of the noise power spectrum in a wavenumber-dependent way.

To control the noise power spectrum even better, we developed an alternative power spectrum estimator called cross-exposure, which does not need to model or measure the noise power spectrum explicitly. The DESI standard pipeline coadds the multiple exposures of a given quasar to create less noisy spectra [61]. In contrast, the cross-exposure estimator computes the one-dimensional individual cross-power spectrum between all distinct exposures of a given quasar. This provides an estimator free from noise influence if the noise realizations are independent between exposures.

Therefore, to minimize the impact of correlated noise sources, we select exposures to get only one per quasar and per night. It removes the noise sources resulting from exposures using the same calibration images, such as bias and flat calibrations, but not necessarily that resulting from dark calibration, which is not performed each night (see e.g. [61] or appendix C of R23 for an extensive description of noise sources).

For a quasar with NexpN_{\mathrm{exp}} exposures in DESI, we define the following cross-exposure operator:

𝒫X[δ1,δ2]=1Nexp(Nexp1)mNexpnmNexp|δ1m(k)δ2n(k)|,\mathcal{P}_{X}\left[\vec{\delta}_{1},\vec{\delta}_{2}\right]=\frac{1}{N_{\mathrm{exp}}\left(N_{\mathrm{exp}}-1\right)}\sum_{m}^{N_{\mathrm{exp}}}\sum_{n\neq m}^{N_{\mathrm{exp}}}\left|\delta_{1}^{m}(k)\delta_{2}^{n}(k)^{*}\right|\,, (4.2)

where δ1\vec{\delta}_{1} and δ2\vec{\delta}_{2} are two contrast sets of size NexpN_{\mathrm{exp}} representing the contrast for different exposures. For example, δ1\vec{\delta}_{1} can be a vector of NexpN_{\mathrm{exp}} Lyα\alpha contrasts and δ2\vec{\delta}_{2} a vector of NexpN_{\mathrm{exp}} noise contrasts. For this section only, the vector notation \vec{\cdot} refers to a set of contrasts for different exposures.

We replace the simple raw power spectrum coadd estimator given in section 3.2 by the cross-exposure estimator Praw,s(k)=𝒫X[δF,s(k),δF,s(k)]P_{\mathrm{raw},s}(k)=\mathcal{P}_{X}\left[\vec{\delta}_{F,s}(k),\vec{\delta}_{F,s}(k)\right], where δF,s\vec{\delta}_{F,s} is the set of sub-forest flux contrast for the NexpN_{\mathrm{exp}} exposures.

Noise correction and resolution damping must be accounted for to build a one-dimensional power spectrum estimator. Following R23, the flux contrast can be decomposed in Fourier space as δF,s(k)=(δLyα,s(k)+δnoise,s(k))×Rs(k)+δmetal(k)\vec{\delta}_{F,s}(k)=\left(\vec{\delta}_{\mathrm{Ly\alpha},s}(k)+\vec{\delta}_{\mathrm{noise},s}(k)\right)\times\vec{R}_{s}(k)+\vec{\delta}_{\mathrm{metal}}(k). Following the demonstration in R23 to build the P1D,αP_{1\mathrm{D},\alpha} estimator, the cross-exposure estimator P1D,α,crossP_{1\mathrm{D},\alpha,\mathrm{cross}} yields:

P1D,α,cross(A,z)=𝒫X[δF,s(k)Rs(k),δF,s(k)Rs(k)]+2𝒫X[δF,s(k)Rs(k),δnoise,s(k)Rs(k)]+𝒫X[δnoise,s(k)Rs(k),δnoise,s(k)Rs(k)]sz,kAPmetals(A,z).\begin{split}P_{1\mathrm{D},\alpha,\mathrm{cross}}(A,z)&=\left\langle\mathcal{P}_{X}\left[\overrightarrow{\frac{\delta_{F,s}(k)}{R_{s}(k)}},\overrightarrow{\frac{\delta_{F,s}(k)}{R_{s}(k)}}\right]+2\mathcal{P}_{X}\left[\overrightarrow{\frac{\delta_{F,s}(k)}{R_{s}(k)}},\overrightarrow{\frac{\delta_{\mathrm{noise},s}(k)}{R_{s}(k)}}\right]\right.\\ &+\left.\mathcal{P}_{X}\left[\overrightarrow{\frac{\delta_{\mathrm{noise},s}(k)}{R_{s}(k)}},\overrightarrow{\frac{\delta_{\mathrm{noise},s}(k)}{R_{s}(k)}}\right]\right\rangle_{s\in z,k\in A}-P_{\mathrm{metals}}(A,z)\,.\end{split} (4.3)

In this decomposition, since the noise contrasts are purely random Gaussian signals, they are all independent from each other and from Lyα\alpha contrasts. Consequently, the last two terms of the decomposition are null. We numerically verified this assumption on the DR1 data set by assessing that the average of the last two terms is consistent with zero. Consequently, the cross-exposure estimator is given by:

P1D,α,cross(A,z)=𝒫X[δF,s(k)Rs(k),δF,s(k)Rs(k)]sz,kAPmetals(A,z).P_{1\mathrm{D},\alpha,\mathrm{cross}}(A,z)=\left\langle\mathcal{P}_{X}\left[\overrightarrow{\frac{\delta_{F,s}(k)}{R_{s}(k)}},\overrightarrow{\frac{\delta_{F,s}(k)}{R_{s}(k)}}\right]\right\rangle_{s\in z,k\in A}-P_{\mathrm{metals}}(A,z)\,. (4.4)

We use a new implementation in picca to compute this cross-exposure estimator by performing the continuum fitting on each exposure of all the DR1 quasar catalog. Since distinct exposures of the same quasar have different SNR¯\overline{\mathrm{SNR}}, we perform the continuum fitting independently, as if each exposure were an independent quasar. We tried alternative continuum fitting methods and concluded that using the continuum fitted on the coadd of all exposures or on the exposure with the highest SNR¯\overline{\mathrm{SNR}} introduces biases and spurious oscillations in P1D,α,crossP_{1\mathrm{D},\alpha,\mathrm{cross}}.

Refer to caption
Figure 3: Comparison between the power spectrum without noise correction (UNCORR), the noise-corrected power spectrum corrected by subtracting the α\alpha term (CORR), and the power spectrum obtained with the cross-exposure estimator defined by equation 4.4 (XEXP). The normalized one-dimensional power spectrum (Δ1D,α(k)=kP1D,α/π\Delta_{1\mathrm{D},\alpha}(k)=kP_{1\mathrm{D},\alpha}/\pi) is shown on the top panel, and the ratio between them is shown on the two others. For this comparison, all P1D,αP_{1\mathrm{D},\alpha} have no corrections or metal power spectrum subtraction, and only DLA and atmospheric emission lines are masked. Only the first six redshift bins are shown for clarity. For the same reason, the error bar associated with only one redshift bin (z=2.4z=2.4) is shown in the ratios comparing the three measurements.

Figure 3 compares the one-dimensional power spectrum measured on DR1 dataset before noise correction with P1D,αP_{1\mathrm{D},\alpha} after noise correction, and with the cross-exposure estimation, P1D,α,crossP_{1\mathrm{D},\alpha,\mathrm{cross}}. The two bottom panels are showing the ratios between those three analysis variations. At large wavenumber (k1.5k\gtrsim 1.5 Å-1), the cross-exposure estimator has the same trend as the noise-corrected measurement, indicating that both noise estimations match. The uncorrected P1D,αP_{1\mathrm{D},\alpha} have a discrepancy at small redshifts for those scales, even if the difference is still within the statistical error bar. Near k0.6k\sim 0.6 Å-1, both uncorrected and corrected measurements show a slight discrepancy with the cross-exposure estimator. This bump is caused by a pattern in a specific faulty DESI amplifier and is studied in detail in section 5.4. The cross-exposure estimator can remove part of this contamination at low redshifts.

The cross-exposure estimator only uses between 3535 and 3939 % (depending on the redshift bin) of the quasar catalog because of the need for several exposures and the fact that we select exposures from different nights. Consequently, we do not directly use the cross-exposure estimator for high redshifts, where the statistical errors are large. As discussed in 7, to better control the noise estimate, we use the cross-exposure estimator as the baseline for the redshift bins where we see small-scale and amplifier discrepancies, i.e. z=2.2,z=2.2, 2.42.4, and 2.62.6. For the other bins, we use the noise-corrected measurement as it agrees with the cross-exposure but results in smaller error bars due to the larger number of sub-forests and exposures available. Additionally, considering the extensive studies we performed here and in R23, we assert here that we have sufficient control over the noise level to drop the associated systematic error bar, which was already overly conservative in R23. We assume that the increase in the statistical error at low redshift due to using the cross-exposure estimator makes the systematic error due to noise negligible.

5 Data splits and variations

We aim at improving the robustness of the P1D,αP_{1\mathrm{D},\alpha} measurement by testing most of the hypotheses and choices we performed regarding the treatment of quasar spectra, the catalogs used (see section 2), and the methodology itself developed in section 3. As it is now standard with other Lyα\alpha analyses [29, 31], we vary several analysis options and perform data splitting concerning different types of properties, and compare the resulting P1D,αP_{1\mathrm{D},\alpha} to a baseline. This first application of data variations and splits for P1D,αP_{1\mathrm{D},\alpha} also aims to set the basis for this measurement. It will be improved in future studies, notably by performing it for the final baseline analysis and covariance matrix and considering additional effects. To improve readability, we explicitly label on the plot the tests that we intuitively expect to fail or pass.

For the sake of simplicity, we take a simple baseline, which uses the same methodology as the EDR+M2 measurement in R23, without any correction. To be more explicit, DLAs and atmospheric emission lines are masked, and the BAL quasars verifying the BI>0BI>0 criterion are removed. We do not apply noise correction, cross-exposure estimator, metal subtraction, or multiplicative correction from synthetic data. Except for cross-exposure, all the corrections that will be applied to our measurement in section 7 are the same for all analysis variations and data splits performed here. Considering that cross-exposure only slightly impacts P1D,αP_{1\mathrm{D},\alpha}, we can safely suppose that all the tests performed in this section are also valid for the final measurement for which all corrections are applied.

This section aims to conduct a blind exploration of four data variations (DLA and BAL catalogs, continuum fitting, and averaging parameters) and four data splits (number of exposures, amplifiers, sky regions, and Civ equivalent width).

The ratio between the baseline and the variation or split power spectra is noted r(A,z)=PBASE(A,z)/PTEST(A,z)r(A,z)=P_{\mathrm{BASE}}(A,z)/P_{\mathrm{TEST}}(A,z). The statistical uncertainty on this ratio depends on the correlation between the two power spectra. In the data split case, we have a sub-sample of the baseline, we can compute the correlation and obtain 888This is coming from (δ(a/b)a/b)2=(δaa)2+(δbb)2cov(ab)ab\left(\frac{\delta(a/b)}{a/b}\right)^{2}=\left(\frac{\delta a}{a}\right)^{2}+\left(\frac{\delta b}{b}\right)^{2}-\frac{{\rm cov}(ab)}{ab} and cov(PTEST,PBASE)=var(PBASE){\rm cov}(P_{\mathrm{TEST}},P_{\mathrm{BASE}})={\rm var}(P_{\mathrm{BASE}}).:

σr(A,z)2=r(A,z)2[(σPBASE(A,z)PBASE(A,z))2+(σPTEST(A,z)PTEST(A,z))22(σPBASE(A,z))2PBASE(A,z)PTEST(A,z)]\sigma_{r}(A,z)^{2}=r(A,z)^{2}\left[\left(\frac{\sigma_{P_{\mathrm{BASE}}}(A,z)}{P_{\mathrm{BASE}}(A,z)}\right)^{2}+\left(\frac{\sigma_{P_{\mathrm{TEST}}}(A,z)}{P_{\mathrm{TEST}}(A,z)}\right)^{2}-2\frac{\left(\sigma_{P_{\mathrm{BASE}}}(A,z)\right)^{2}}{P_{\mathrm{BASE}}(A,z)P_{\mathrm{TEST}}(A,z)}\right]\ \, (5.1)

where σPBASE\sigma_{P_{\mathrm{BASE}}} and σPTEST\sigma_{P_{\mathrm{TEST}}} are the standard deviations of the two measurements, see section 6.1. We then compute a pp-value given from the χ2\chi^{2} cumulative distribution function Fχ2F_{\chi^{2}}, and note it hereafter “χ2\chi^{2} pp-value”:

p=1Fχ2((A,z)[(r(A,z)1)σr(A,z)]2).p=1-F_{\chi^{2}}\left(\sum_{(A,z)}\left[\frac{(r(A,z)-1)}{\sigma_{r}(A,z)}\right]^{2}\right)\,. (5.2)

We cannot easily compute the error bars on the ratio for analysis variations; thus, they are not shown in the following. We design other statistical tests to quantify the ratio positioning and the trend as a function of wavenumber. The first test characterizes the distribution of the ratio with respect to unity. We consider the fraction of pixels below and above unity and associate a simple “Distribution pp-value” equal to twice the lower of the two fractions.

To characterize a potential trend with respect to the wavenumber, we want to compute the Pearson sample correlation coefficient RpR_{p} between the wavenumber kk and the ratio rr. However, the error on rr varies with kk. In order to have a quantity whose error does not depend on kk, we compute for each redshift the correlation between kk and (rr)/σr\left(r-\langle r\rangle\right)/\sigma_{r}, where σr\sigma_{r} is given by equation 5.1 without the last covariance term. It is largely overestimating σr\sigma_{r}, but only the variation with kk matters here, and we assume that we should get a fair estimate of this variation. The associated “Correlation pp-value” to test the existence of a correlation is given by the two-sided Student tt-distribution:

p=2PS(X|RpN21Rp2|)p=2P_{\mathrm{S}}\left(X\geq\left|R_{p}\sqrt{\frac{N-2}{1-R_{p}^{2}}}\right|\right) (5.3)

where NN is the number of data points. This pp-value is computed for all redshifts independently, and we report the minimal one.

Some of the tests performed in this section are used to see the impact on P1D,αP_{1\mathrm{D},\alpha}, and some to compare to a purer sample, which removes specific contaminants but decreases the statistics. We use the p>0.05p>0.05 criterion to check if the test is passed in the latter case. This study allows us to conduct a data-driven exploration of potential analysis issues and modify our baseline accordingly.

5.1 DLA and BAL catalog variations

Refer to caption
Refer to caption
Figure 4: Ratios of the baseline power spectrum (BASE) to the power spectrum obtained from analysis variations. The baseline P1D,αP_{1\mathrm{D},\alpha} is the uncorrected power spectrum without metal subtraction. (left) Variation of the DLA treatment: without DLA masked in calculating P1D,αP_{1\mathrm{D},\alpha} (NOMASK) and with a high-purity catalog (PURITY). The level of failure in the PURITY test is well below the DLA completeness systematics associated with the final measurement in section 6.2 (right) Variation of BAL treatment. The baseline removes the BAL quasar spectra detected with the BI>0BI>0 criterion. The variations considered are removing the BAL quasar spectra with the AI>0AI>0 criterion (AICUT), not removing any quasars from the data set, and masking all BAL absorption with the AI>0AI>0 criterion (AIMASK), and removing BIBI BAL quasar and masking AIAI BAL absorptions (BIC_AIM). The failure of the AICUT test shows that AI BAL significantly affect the baseline, motivating us to change the baseline.

The DLA catalog used for masking P1D,αP_{1\mathrm{D},\alpha} was created with the specific CNN and GP confidence cuts in section 3. Variations of those parameters provide insight into DLA completeness. The left panel of figure 4 shows the two variations related to the DLA catalog. We start by not masking DLA (NOMASK) and comparing it to the baseline. In agreement with the previous measurement in R23 or on simulation in [63], not masking DLAs greatly increases the power spectrum at the largest scales considered. This test is used in section 6.2 to derive a systematic uncertainty associated with the DLA completeness.

Conjointly with the companion paper K25, we modify the confidence cut to obtain a higher purity version of the DLA catalog. This alternative cutting uses the same confidence cut (CL>0.9CL>0.9) for GP, and a continuum-to-noise ratio (CNR¯\overline{\mathrm{CNR}})-dependent cut for the CNN: CL>0.2CL>0.2 for CNR¯>3\overline{\mathrm{CNR}}>3, CL>0.4CL>0.4 for 2<CNR¯<32<\overline{\mathrm{CNR}}<3, and CL>0.5CL>0.5 for CNR¯<1\overline{\mathrm{CNR}}<1. The resulting catalog contains approximately 3131 % less DLA than the high-completeness version used for the baseline. When using this high-purity catalog, the obtained power spectrum is compared to the baseline in the last panel of Figure 4 (PURITY). The pp-values reported show a statistically significant correlation with respect to wavenumber, coming from small wavenumbers and small redshifts. We note, however, that this difference is small (around 2%\%) and is well below the level of the systematic uncertainties that we derive in section 6.2 for DLA completeness. Thus, employing a purer or more complete sample of DLA does not largely impact our measurement. Furthermore, even if the two catalogs give different DLA numbers, the impact of masking is not significantly changed. We choose to keep the high-completeness catalog as the baseline.

Refer to caption
Figure 5: Ratios of the new baseline power spectrum (NEWBASE) with the AICUT BAL variation of figure 4 for which all BAL quasars passing the AI>0AI>0 criterion are removed. The new baseline removes BIBI BAL quasars, masks AIAI BAL absorption, and is corrected from the effect of masking with correction derived in KR25.

We have a larger number of sub-forests than in R23, sufficient to study the effect of the completeness of BAL catalog, as illustrated in the right panels of figure 4. The baseline measurement (BASE) uses a catalog in which quasars tagged as BAL with the BI>0BI>0 criterion have been removed. The upper panel of the figure compares it to an analysis variation (AICUT) with the AI>0AI>0 cut, which removes 2222 % of quasars, including the 44 % of BI>0BI>0 quasars. We observe a near 55 % difference at small scales, well above the level of statistical uncertainties, giving a 0.00.0 Correlation pp-value. However, removing all AIAI BAL quasars would induce a large drop in statistics, and we aim to match the power spectrum obtained with this severe cut while keeping most of the quasars.

Two additional variations are then performed: masking at the spectrum level all AIAI BAL absorptions while keeping all quasars in the catalog (AIMASK) and masking all AIAI BAL pixels while removing BIBI BAL quasars from the catalog (BIC_AIM). AIMASK is a drastic change relative to baseline as no quasars are removed from the input catalog, while BIC_AIM can be considered as the baseline while additionally masking for BAL regions that are passing AI>0AI>0 criterion. As shown on the lower two right panels of the same figure, the impact of masking BAL is very similar for AIMASK and BIC_AIM: it decreases the P1D,αP_{1\mathrm{D},\alpha} level for all scales due to the effect of pixel masking while accounting for the small scale difference due to the unaccounted AIAI BAL that are masked. Since the impact of masking is way more prominent when the BIBI BAL quasars are kept (AIMASK), we choose BIC_AIM as our new baseline.

To verify that the impact of AIAI BAL quasars is correctly accounted for, we compare our new baseline to the case where all AIAI BAL quasars are removed from the catalog (AICUT). The new baseline is corrected for the impact of BAL masking using the correction derived in KR25. The number of AIAI BAL quasars is larger in the DR1 data than in the mocks used in KR25. We normalize the absolute level of the BAL masking correction to account for the larger fraction of BAL in the data. The resulting new baseline (BIC_AIM with corrections) is compared to the AIAI cutting case in figure 5. Compared to the first left panel of figure 4, the new baseline properly accounts for the AIAI BAL, as shown by the p=0.05p=0.05 value. This p-value is the lower value for six bins in redshift, so p=0.05p=0.05 indicates that the test is passed for all redshift bins. We will use this study of the impact of AIAI BAL to derive systematic uncertainties linked to BAL completeness and the BAL masking correction.

5.2 FFT method parameters variation

Refer to caption
Refer to caption
Figure 6: Ratios of the baseline power spectrum (BASE) to the power spectrum obtained from analysis variations. (left) Variation of the continuum fitting method: using wavelength-dependent correction terms in the computation of the continuum fitting weights (VARW), using a second-order polynomial to compute the quasar-specific continuum (ORDER2), taking a shorter definition of the Lyα\alpha forests in terms of rest-frame wavelength (LSMALL), and performing the continuum fitting independently on the different sub-forests with a zero-order polynomial for the quasar-specific continuum (SFCUT). (right) Variation of the SNR¯\overline{\mathrm{SNR}} cut and the P1D,αP_{1\mathrm{D},\alpha} averaging method: Performing the average without weights and a SNR¯>1\overline{\mathrm{SNR}}>1 cut (NOW), without weights and an SNR¯>3\overline{\mathrm{SNR}}>3 cut (NOWSNR3), and with the SNR¯\overline{\mathrm{SNR}} weighting scheme used for baseline and a SNR¯>3\overline{\mathrm{SNR}}>3 cut (SNR3).

We vary some parameters chosen as fixed in section 3 to blindly search for potential sources of analysis systematic error. First, the parameters used for continuum fitting in section 3.1 are varied, and the ratio with baseline is shown in the left panels of figure 6. We carry out the continuum fitting while allowing noise-dependent weights in the likelihood used for fitting (VARW). In equation 3.3, the weights σq=F¯(λi)Cq(λi,zq,aq,bq)\sigma_{q}=\overline{F}(\lambda_{i})C_{\mathrm{q}}\left(\lambda_{i},z_{\mathrm{q}},a_{\mathrm{q}},b_{\mathrm{q}}\right) in the likelihood are replaced by:

σq2(F¯(λi)Cq(λi,zq,aq,bq))2=η(λ)σpip,q2(λ)(F¯(λi)Cq(λi,zq,aq,bq))2+σlss2(λ)+ϵ(λ)(F¯(λi)Cq(λi,zq,aq,bq))2σpip,q2(λ),\begin{split}\frac{\sigma_{\mathrm{q}}^{2}}{\left(\overline{F}(\lambda_{i})C_{\mathrm{q}}\left(\lambda_{i},z_{\mathrm{q}},a_{\mathrm{q}},b_{\mathrm{q}}\right)\right)^{2}}&=\eta(\lambda)\frac{\sigma_{\mathrm{pip},\mathrm{q}}^{2}(\lambda)}{\left(\overline{F}(\lambda_{i})C_{\mathrm{q}}\left(\lambda_{i},z_{\mathrm{q}},a_{\mathrm{q}},b_{\mathrm{q}}\right)\right)^{2}}+\sigma_{\mathrm{lss}}^{2}(\lambda)\\ &+\epsilon(\lambda)\frac{\left(\overline{F}(\lambda_{i})C_{\mathrm{q}}\left(\lambda_{i},z_{\mathrm{q}},a_{\mathrm{q}},b_{\mathrm{q}}\right)\right)^{2}}{\sigma_{\mathrm{pip},\mathrm{q}}^{2}(\lambda)}\,,\end{split} (5.4)

with additional wavelength-dependent parameters η\eta, σlss\sigma_{\mathrm{lss}} and ϵ\epsilon, as it is the case for BAO analysis. In a second variation, we added a cqλ2c_{q}\lambda^{2} term to compute individual quasar continuum in equation 3.2 (ORDER2). In a third variation we modify the minimum and maximum wavelength considered in the computation of flux contrast from 1,050<λrf<1,1801,050<\lambda_{\mathrm{rf}}<1,180 Å to the reduced range 1,070<λrf<1,1601,070<\lambda_{\mathrm{rf}}<1,160 Å (LSMALL). Finally, we perform the sub-forest cut before the continuum fitting and fit the continuum independently for each sub-forest (SFCUT). For the latter variation, in comparison to baseline, we also remove the bqλb_{q}\lambda in equation 3.2 as this term would generate a broken linear continuum fitting when separated in sub-forests and would suppress small-wavenumber modes. Note that the SFCUT variation corresponds to how continuum fitting was performed for eBOSS [45].

The four variations considered do not induce a significant decrease in the number of sub-forests: only a nearly 55 % decrease for LSMALL due to the creation of shorter sub-forests that do not pass the size cuts. The second test (ORDER2) has a very small effect on P1D,αP_{1\mathrm{D},\alpha} as indicated by the reported pp-values. All the three other tests (VARW, LSMALL, SFCUT) shows some statistically significant correlation between the ratio and the wavenumber. The VARW test has a small but significant effect on the power spectrum. However, including σLSS2\sigma^{2}_{\rm LSS} in the weights is biasing the evaluation of the power spectrum, so this effect is not surprising. The LSMALL and SFCUT variations show more significant fluctuations. We interpret those spikes as caused by adding smaller sub-forests that appear as outliers in the averaging, and that are potentially not accounted by the outlier finder detailed in section 2. Those two variations should be considered in a cosmological interpretation study as they could potentially yield different results.

In a separate study, we test the average P1D,αP_{1\mathrm{D},\alpha} computation by varying the SNR¯\overline{\mathrm{SNR}} cut and the weighting scheme. The right panels of figure 6 show the variation performed here: using a constant weighting scheme with the same SNR¯>1\overline{\mathrm{SNR}}>1 cut (NOW), with a SNR¯>3\overline{\mathrm{SNR}}>3 cut (NOWSNR3) which is close to what was done in eBOSS [45], or using the same SNR¯\overline{\mathrm{SNR}} weighting scheme as baseline with the SNR¯>3\overline{\mathrm{SNR}}>3 cut (SNR3). The SNR¯>3\overline{\mathrm{SNR}}>3 cut implies keeping only between 3232 to 4040 % of the DR1 sub-forests depending on the redshift bin. It corresponds to a drastic cut in statistics, but can be considered a robust sample for small scales, for which noisy sub-forests can significantly impact the measurement. Indeed, noisy sub-forests have a large noise power spectrum and a slight noise misestimation causes a large small-scale variation. The upper plot exhibits a large discrepancy between the baseline and the case without weighting.

For the second plot when a SNR¯>3\overline{\mathrm{SNR}}>3 cut is applied to the unweighted analysis, the ”Distribution” p-value is improved, but there still exists a wavenumber correlation. However, we note that the ratio is very close to unity with a maximal value near 22 %. The NOW discrepancy can therefore be ascribed to noisy spectra with 1<SNR¯<31<\overline{\mathrm{SNR}}<3, and probably mostly to spectra with SNR¯1\overline{\mathrm{SNR}}\gtrsim 1. These spectra have a very small weight and a negligible contribution to the baseline power spectrum, which, therefore, is hardly affected by removing SNR¯<3\overline{\mathrm{SNR}}<3 spectra, as illustrated by the lower plot, which shows a small (near 11 %) difference, which is well below the level of small-scale systematics (e.g., resolution) associated to the final measurement in section 6.2. Even if the impact of biasing caused by low SNR¯\overline{\mathrm{SNR}} sub-forest is mostly accounted for by our SNR¯\overline{\mathrm{SNR}}-weighting, the SNR¯>3\overline{\mathrm{SNR}}>3 can be considered a potential variation for cosmological interpretations.

5.3 Comparison of exposure numbers

Refer to caption
Refer to caption
Figure 7: Ratios of the baseline power spectrum (BASE) to the power spectrum obtained from analysis variations. (left) Splitting of the quasar catalog between quasars observed with one (N=1), two or three (N=2||3), and four or more (N>>=4) exposures. (right) Splitting of the amplifier sets used to compute P1D,αP_{1\mathrm{D},\alpha}. We show the ratio for the power spectrum measured using the 18th18^{\mathrm{th}} amplifier set, which exhibits a noticeable discrepancy with the baseline. A variation of weighting scheme and continuum fitting are performed: without averaging weights (18NOW), with SNR¯\overline{\mathrm{SNR}} weighting (18W), and with SNR¯\overline{\mathrm{SNR}} weighting but taking the continuum fitting measure on the baseline for the 18th18^{\mathrm{th}} amplifier set power spectrum (18WCONT). The amplifier-related failures in the column are tackled in the final analysis by using the cross-exposure estimator for z2.7z\leq 2.7.

Each quasar is planned to be observed several times by the DESI instrument, and since DR1 is an intermediate data set, it contains spectra observed with very different numbers of exposures. The coadding of exposures can potentially introduce a bias, especially when it involves a large number of exposures with very different integration times and observing conditions. A data split involving quasars observed over only one (N=1), two or three (N=2||3), or four or more (N>>=4) exposures is shown in the left panels of figure 7. The data splits contain, on average over redshift bins, 3939 %, 3636 %, and 2525 % of the total number of sub-forests, respectively, thus resulting in larger error bars. The ratios do not show strong variations from unity; however, only the (N=2 ||3) and (N >>=4) are passing the χ2\chi^{2} test. The low number of exposures (N=1) can potentially bias our measurement, especially at small scales, because a quasar with a low number of exposures tends to be noisier, thus giving different weights when averaging. Removing this sample implies a too-large decrease in statistics, but this variation should be considered for cosmological interpretation and future measurement with larger data sets and a larger number of exposures.

5.4 Measurement with different amplifier sets

The focal plane of the DESI instrument is composed of ten modules called petals, each containing 500 robotic fiber positioners. The light measured by those positioners is linked to ten spectrographs by optical fibers. Each spectrograph receives the fiber associated with one petal and scatters the received light using volume phase holographic gratings. Three CCD cameras per spectrograph read the spectra, covering different wavelength ranges. Finally, four amplifiers per CCD are positioned in squares to perform the readout. In the xx direction, a CCD image contains different spectra chunks, while yy is the direction of wavelengths. A given quasar spectrum is then measured over six amplifiers (2 per CCD camera). For one spectrograph, the fibers directed to the left and right parts of the CCD cameras constitute two sets, measured with completely distinct amplifiers. At the spectrum level, the data can be split between 20 amplifier sets, according to the left and right parts of the ten spectrographs.

We measured P1D,αP_{1\mathrm{D},\alpha} over those 20 amplifier set splits independently, which constitutes a significant drop in terms of statistics. The right panels of figure 7 show the ratio between baseline and the power spectrum using only spectra from the amplifier set number 18 with constant-weight averaging (18NOW), with SNR¯\overline{\mathrm{SNR}}-weight averaging (18W), and with the SNR¯\overline{\mathrm{SNR}}-weight averaging but using the continuum from the complete DR1 data set (18WCONT). The 18th amplifier set was chosen because it shows a considerable difference for low redshift near wavenumber k0.65k\sim 0.65 Å-1. Among the 20 splits of the amplifier set, only two exhibit this behavior, with the 16th data split showing a smaller spike amplitude. Those spikes originate from oscillatory patterns of period Δλ=9.6\Delta\lambda=9.6 Å(corresponding to k0.65k\sim 0.65 Å-1) present on DR1 calibrations images used for bias corrections.

The source of those patterns was not identified and was highly complicated to correct due to their partial coverage of the amplifier and sporadic occurrence. In contrast to the companion paper K25, the impact of this feature drastically decreases when using the SNR¯\overline{\mathrm{SNR}} weighting scheme. Furthermore, comparing 18W and 18WCONT indicates that a subsequent part of the spike is caused by the continuum fitting procedure itself when using only the 18th amplifier set. The spike is more prominent at small redshifts. As pointed out in section 4.2, the cross-exposure estimator accounts for this spike because it uses the individual cross-power spectrum between different fibers and, consequently, different amplifier sets. As we use the cross-exposure estimator for the lower redshift (z<2.7z<2.7) where the 0.650.65 Å-1 spike is the largest (2.3σ2.3\sigma for z=2.2z=2.2 and 3.5σ3.5\sigma for z=2.4z=2.4 and dropping to 1σ1\sigma or below for larger redshifts), we consider that this feature does not impact our measurement, and we choose not to add a dedicated systematic uncertainty. We interpret the low pp-value for the last panel to be mostly caused by fluctuations and not by the 0.650.65 Å-1 spike itself.

5.5 Galactic hemisphere split

Refer to caption
Refer to caption
Figure 8: Ratios of the baseline power spectrum (BASE) to the power spectrum obtained from analysis variations. (left) Splitting of the quasar catalog between north (NGC) and south (SGC) galactic cap. (right) Splitting of the quasar catalog based on the Civ equivalent width noted EW. The split is performed such that the quasar (but not the sub-forest) catalog is split in half: EW>>39 and EW<<39.

Different imaging surveys are used for the quasar target selection of the DESI survey [50, 51, 52]. Inhomogeneities in the photometric target pre-selection can significantly impact galaxy clustering studies, since they affect the target density and subsequently the estimation of the matter density. We do not expect an impact on the measurement of P1D,αP_{1\mathrm{D},\alpha} since each spectrum pixel is a proxy for the IGM density independently of the quasar density. However, as a sanity check, we still perform a data split between the North Galactic Cap (NGC) and the South Galactic Cap (SGC), as shown on the left panels of figure 8. The NGC dataset contains 6565 % of the sub-forests, and the SGC 3535 %. As expected, the P1D,αP_{1\mathrm{D},\alpha} measured on both data splits show no visible difference with the baseline, as also indicated by the reported pp-values.

5.6 CIV equivalent width

The last data split performed follows the BAO analysis [29, 31] and is related to the Civ equivalent width (EW) in angstrom, i.e. the spectral area associated with the quasar Civ emission line. The EW variable is a good quasar population discriminant because of the Baldwin effect, which causes EW to be anti-correlated to the quasar continuum luminosity [64]. Consequently, the high EW quasars will tend to be noisier. We use the EW measurement performed with fastspecfit\faGithub999https://github.com/desihub/fastspecfit in [29, 31] to create two populations equal in terms of quasar number: EW>>39 and EW<<39. The result of P1D,αP_{1\mathrm{D},\alpha} computations for those two EW populations is shown in the right panels of figure 8. Since we apply a SNR¯>1\overline{\mathrm{SNR}}>1 quality cut, the noisier EW>>39 measurement contains only 3535 % of the total sub-forests. The EW>>39 data set shows a 55 % discrepancy for small redshift and k<0.2k<0.2 Å-1. This discrepancy can be interpreted as a proximity effect impacting the larger scales of the quasar spectra. The EW<<39 sample is also showing a discrepancy, shown by the pp-value, but with a smaller level (near 22 % at large scales). Removing the EW>>39 sample would imply a statistical drop that is too large. For this analysis, we keep the two populations of quasars in the P1D,αP_{1\mathrm{D},\alpha} baseline measurement. However, the EW<<39 data sample should also be considered as a variation for cosmological analysis, and this EW test should be investigated more in future studies. For now, the level of the 1%\sim 1\% induced bias compared to EW<<39 is well below the level of large-scale systematics such as DLA completeness detailed in section 6.2.

5.7 Analysis variation and data split conclusion

We performed an extensive data split study to find potential biases in our P1D,αP_{1\mathrm{D},\alpha} measurement. Those data splits allow us to validate our measurement and improve the characterization of systematic uncertainties. Notably, this study indicates that the impact of AIAI BAL on the P1D,αP_{1\mathrm{D},\alpha} measurement cannot be ignored, and we changed the baseline measurement to mask the associated absorptions. This analysis shows that some effects are slightly changing the global trend of P1D,αP_{1\mathrm{D},\alpha}, and some imprint fluctuations sufficient not to pass our statistical tests. We can also note that the ”Correlation pp-value” test is especially severe. Removing the sub-samples that do not pass the tests would imply a significant statistical drop; we then keep them for this paper. First, those effects (continuum fitting variations, number of exposures, amplifier splits, and Civ EW quasar population) should be considered as baseline variations for any cosmological interpretation. Secondly, those sub-samples should potentially be removed for future DESI releases with even larger data sets available.

6 Power spectrum uncertainties

6.1 Statistical uncertainties and covariance matrix

Refer to caption
Figure 9: Statistical uncertainties of the DR1 P1D,αP_{1\mathrm{D},\alpha} measurement, computed using equation 6.2. The uncertainties are smoothed in each redshift bin using a Savitzky-Golay filter. The left panel shows the absolute level of those uncertainties, and the right panel shows their level relative to the measured power spectrum.

As the aim of this DR1 measurement is to be interpreted in terms of cosmological parameters, we estimate the covariance matrix associated with P1D,αP_{1\mathrm{D},\alpha}. Furthermore, we improve the study conducted in R23 concerning the estimation of statistical and systematic uncertainties. The full derivation of the variance and covariance estimators used here is presented in Appendix B.

The expression of the power spectrum P1D,α(A,z)P_{1\mathrm{D},\alpha}(A,z) involves a weighted average of Ps(k)P_{s}(k) (Eq. 3.6). In Appendix B, we derive an estimator for the variance of a weighted average wiAi/wi\sum w_{i}A_{i}/\sum w_{i} and for the covariance between two such weighted averages. However, this derivation requires that the AiA_{i} be uncorrelated. When a sub-forest ss has several wavenumbers kk in a given AA bin, these wavenumbers are correlated, so we cannot use our variance estimator for the weighted average Ps(k)sz,kA\langle P_{s}(k)\rangle_{s\in z,k\in A}.

We therefore introduce AsA_{s} as the ensemble of wavenumbers associated with sub-forest ss that fall into bin AA, NA,sN_{A,s} the number of wavenumbers in AsA_{s}, and PA,s=kAsPs(k)/NA,sP_{A,s}=\sum_{k\in A_{s}}P_{s}(k)/N_{A,s} the average of individual power spectra in bin AA. We can neglect the correlation between different sub-forests as accounting for them would give a three-dimensional estimator. Therefore, we use our variance estimator for the PA,sP_{A,s}. We introduce the weights wA,s=NA,swA,sw_{A,s}^{\prime}=N_{A,s}w_{A,s} to rewrite Ps(k)sz,kA\langle P_{s}(k)\rangle_{s\in z,k\in A} as a weighted average of the PA,sP_{A,s}:

Ps(k)kA,sz=kA,szwA,sPs(k)kA,szwA,s=szwA,sPA,sszwA,s=PA,ssz.\langle P_{s}(k)\rangle_{k\in A,s\in z}=\frac{\sum_{k\in A,s\in z}w_{A,s}P_{s}(k)}{\sum_{k\in A,s\in z}w_{A,s}}=\frac{\sum_{s\in z}w_{A,s}^{\prime}P_{A,s}}{\sum_{s\in z}w_{A,s}^{\prime}}=\left\langle P_{A,s}\right\rangle_{s\in z}\,. (6.1)

We can then use our estimator for the variance

σ1D,α2(A,z)=((szwA,s)2sz(wA,s)21)1[sz(wA,s)2PA,s2sz(wA,s)2PA,ssz2],\sigma_{1\mathrm{D},\alpha}^{2}(A,z)=\left(\frac{(\sum_{s\in z}w^{\prime}_{A,s})^{2}}{\sum_{s\in z}(w^{\prime}_{A,s})^{2}}-1\right)^{-1}\left[\frac{\sum_{s\in z}(w^{\prime}_{A,s})^{2}P_{A,s}^{2}}{\sum_{s\in z}(w^{\prime}_{A,s})^{2}}-\left\langle P_{A,s}\right\rangle_{s\in z}^{2}\right]\,, (6.2)

and the covariance

𝒞1D,α(A,B,z)=\displaystyle\mathcal{C}_{1\mathrm{D},\alpha}(A,B,z)= (szwA,sszwB,sszwA,swB,s1)1\displaystyle\left(\frac{\sum_{s\in z}w^{\prime}_{A,s}\sum_{s\in z}w^{\prime}_{B,s}}{\sum_{s\in z}w^{\prime}_{A,s}w^{\prime}_{B,s}}-1\right)^{-1}
[szwA,swB,sPA,sPB,sszwA,swB,sPA,sszPB,ssz].\displaystyle\left[\frac{\sum_{s\in z}w^{\prime}_{A,s}w^{\prime}_{B,s}P_{A,s}P_{B,s}}{\sum_{s\in z}w^{\prime}_{A,s}w^{\prime}_{B,s}}-\left\langle P_{A,s}\right\rangle_{s\in z}\left\langle P_{B,s}\right\rangle_{s\in z}\right]\,. (6.3)

We note that our estimator fluctuates significantly and can even result in a negative variance for a small dataset size (at high redshift in our case). This issue is solved by smoothing the variance with a Savitzky-Golay filter implemented in scipy [65]. Similarly, the covariance matrix is smoothed along its two dimensions using a Savitzky-Golay filter implemented in sgolay2 \faGithub101010https://github.com/espdev/sgolay2. The covariance and variance estimators have been validated in the companion KR25 paper by comparing the variation between 2020 DR1-like uncontaminated mocks with the average of those mocks. This validation showed that our estimator slightly overestimates the covariance by 1010 %. We correct our covariance matrix by renormalizing by this factor, independently of redshift.

Figure 9 shows the standard deviation calculated with equation 6.2 for the DR1 power spectrum. The three smallest redshift bins (2.22.2, 2.42.4, and 2.62.6) have a different profile because the cross-exposure estimator (see section 4.2) is used for these bins. Due to the reduced number of sub-forests available for the cross-exposure estimator, the error bars increase, especially at large wavenumber. Compared to R23 results, the uncertainties are smoother due to the applied smoothing but have a similar profile. The level of uncertainties is much lower than for R23: between 2.62.6 and 33 times decrease depending on redshift. This improvement is even higher than the five times increase in quasar statistics, which should decrease the error bar only by 2.23\sim 2.23 times. Considering that SNR¯\overline{\mathrm{SNR}} weighting is used in both measurements, we interpret this significant improvement as caused by decreasing noise in the input data set.

Refer to caption
Figure 10: Correlation matrix of the statistical error of the DR1 power spectrum measurement, obtained by normalizing the covariance in equation 6.1. Similarly to figure 9, the covariance is computed for the new P1D,αP_{1\mathrm{D},\alpha} baseline and smoothed with a 2D Savitzky-Golay filter.

Figure 10 shows the correlation matrix for the DR1 data set. The correlation profile is very similar to the one obtained for eBOSS in [45], with a significant correlation (near 2020 %) at small wavenumber and decreasing as a function of redshift. There are some instabilities for high redshift caused by the low number of available sub-forests. The cosmological interpretation of this measurement should account for those instabilities.

6.2 Systematic uncertainties

Refer to caption
Figure 11: Systematic uncertainties of the DR1 P1D,αP_{1\mathrm{D},\alpha} measurement computed from the different sources considered in this study. The left panel shows the absolute level of those uncertainties, while the right panel shows their level relative to statistical uncertainties (see figure 9). Each systematic uncertainty is shown in a different line of panels, and the total systematic error, computed in quadrature, is shown on the bottom panels.

We re-evaluated the systematic uncertainty budget of R23 to account for new sources of systematics and improvements developed in this article and in the validation companion paper KR25. More specifically, pixel masking and resolution corrections were reevaluated to match DR1 statistics. The continuum fitting correction, see below, was also derived in the same paper.

All systematic uncertainties included in the total error budget are shown in figure 11. As mentioned in section 4.2, with the development of a cross-exposure estimator that measures the noise in a fully data-driven way, we consider that the systematic error associated with the noise level estimation is negligible and can safely be removed from the systematic error budget. Other systematic uncertainties associated to a bin (A,z)(A,z) and noted σsyst,X\sigma_{\mathrm{syst,X}}, are defined similarly to R23 in the following way:

  • Resolution damping: We ascribe an uncertainty to the stability of the Point Spread Function (PSF) and its impact on the DESI resolution matrix. Following R23, the average resolution damping in figure 15 is fitted with a simplified resolution damping model exp(0.5(kΔλ)2)sinc(0.5kΔλpix )\exp\left(-0.5(k\Delta\lambda)^{2}\right)\cdot\operatorname{sinc}\left(0.5k\Delta\lambda_{\text{pix }}\right) to measure the effective spectral resolution Δλ\Delta\lambda in Å. The P1D,αP_{1\mathrm{D},\alpha} uncertainty is given by σsyst,res(A,z)=2kA2ΔλσΔλP1D,α(A,z)\sigma_{\mathrm{syst,res}}(A,z)=2k_{A}^{2}\Delta\lambda\sigma_{\Delta\lambda}\cdot P_{1\mathrm{D},\alpha}(A,z), where σΔλ\sigma_{\Delta\lambda} is the error on Δλ\Delta\lambda for which we take a conservative absolute error of 1%1\% following the PSF stability measurement in [61].

  • Resolution correction: A correction for the resolution modeling Ares(A,z)A_{\mathrm{res}}(A,z) is derived in the companion paper KR25. We apply this multiplicative correction to the main P1D,αP_{1\mathrm{D},\alpha} measurement and ascribe a systematic error that is 3030 % of the correction, i.e., σsyst,res(A,z)=0.3|Ares(A,z)1|P1D,α(A,z)\sigma_{\mathrm{syst,res}}(A,z)=0.3\left|A_{\mathrm{res}}(A,z)-1\right|P_{1\mathrm{D},\alpha}(A,z).

  • Metal power spectrum: The metal power spectrum is measured in section 4.1 in the SB1 side-band wavelength range and shown in figure 16. The P1D,αP_{1\mathrm{D},\alpha} measurement is directly corrected by a physically motivated fit PSB1,m,zP_{\mathrm{SB1,m,z}}. We add a systematic uncertainty equal to the statistical error bar of the side-band power spectrum.

  • Continuum fitting: A multiplicative correction accounting for the bias introduced by the continuum fitting is derived from DR1-like mocks in KR25. This correction, noted Acont(A,z)A_{\mathrm{cont}}(A,z), directly multiplies the power spectrum measurement, and a 3030 % relative systematic uncertainty is added to the total budget.

  • Pixel masking corrections: At the continuum fitting stage, some spectrum pixels are masked due to DLA, BAL passing the AI>0AI>0 criterion, and atmospheric emission lines, as discussed in sections 2 and 5.1. Multiplicative corrections, respectively noted Adla(z)A_{\mathrm{dla}}(z), Abal(A,z)A_{\mathrm{bal}}(A,z), and Aline(A,z)A_{\mathrm{line}}(A,z) are computed in KR25. As for the previous multiplicative corrections, we correct the power spectrum measurement and associate a 3030 % relative systematic uncertainty. For the special case of BAL masking, a dedicated study in KR25 shows that a 30 %\% value is over-estimating the systematic uncertainty, and we take a 6 %\% value.

  • Catalog completeness: The DLA and BAL finding algorithms detailed in section 2 result in incomplete detections, especially for the low SNR¯\overline{\mathrm{SNR}} spectra. The data splits presented in section 5.1 provide insight into the impact of undetected DLA and BAL. In contrast with R23, we derive the impact of DLA and BAL from the data splits instead of mocks, making this analysis independent of the distribution of DLA and BAL in mocks. For DLA, we compare the power spectrum obtained when DLA are masked and the power spectrum corrected by Adla(z)A_{\mathrm{dla}}(z) to the case where no DLA are masked. The resulting ratio corresponds to the top left panel of figure 4 with the additional DLA masking correction. We fit this ratio with a power-law function provided by [63]. Following R23 which uses the same DLA finding algorithms, we ascribe a conservative 2020 % incompleteness to the DLA finders. The associated systematic uncertainty is obtained by multiplying this incompleteness factor by the product of the fitted function times P1D,αP_{1\mathrm{D},\alpha}. In a very similar way, we estimate the impact of unmasked AIAI BAL absorptions by comparing our new baseline for which we masked AIAI BAL, corrected by Abal(A,z)A_{\mathrm{bal}}(A,z), to the previous baseline measurement where those absortions where left unmasked. We use the same fitting function and associate a systematic uncertainty considering a 1515 % incompleteness for the finder, following [56].

For multiplicative correction, the 3030 % relative systematic uncertainty is motivated by considering a random shift of the correction between 0 and 100100 %. Considering a uniform distribution for this shift gives a 3030 % standard deviation. It is a straightforward but very conservative way of generating an error bar associated with the multiplicative corrections because it gives overestimated error bars related to larger corrections, such as atmospheric lines. In contrast, taking the standard deviation of the correction for different mock realizations gives too small and non-conservative error bars. We chose not to use this method based on mocks for systematic uncertainties.

All systematic uncertainties are added in quadrature to compute the total systematic error, which is shown in the last panels of figure 11. The correlations between wavenumber bins induced by these systematics are complex to model, and their contribution to the covariance matrix will be discussed in the upcoming inference of this work. Given the improvement in statistics compared to R23, the statistical error bars have decreased by between 260 to 300 % depending on redshift, and now the P1D,αP_{1\mathrm{D},\alpha} measurement is dominated by systematic errors for most redshifts and wavenumbers.

7 Measurement

Refer to caption
Figure 12: The normalized one-dimensional Lyα\alpha forest power spectrum (Δ1D,α(k)=kP1D,α/π\Delta_{1\mathrm{D},\alpha}(k)=kP_{1\mathrm{D},\alpha}/\pi) obtained from the DESI-DR1 data set using the FFT P1D,αP_{1\mathrm{D},\alpha} estimator given in figure 7.1. This power spectrum is computed from the new baseline, which includes BAL AIAI absorption masking and the use of the cross-exposure estimator for low redshifts. The error bars reported are obtained by adding in quadrature the statistical and systematic uncertainties shown in figures 9 and 11. The measurement is shown with Å units but the equivalent skm1\text{s$\cdot$km}^{-1} units are shown for different redshift on the top of the figure. The number of sub-forest used (noted here chunks) are reported for each redshift bin.
Refer to caption
Figure 13: Ratios of our DR1 one-dimensional power spectrum measurement to original measurements from eBOSS data [45] (top) and to the combination of KODIAQ, SQUAD, and XQ-100 high-resolution surveys [66] (bottom). For clarity, only the ratio error bar associated to the z=2.8z=2.8 redshift bin is shown with the shaded area. This error bar contains the combination of our statistical and systematic uncertainties with the compared P1D,αP_{1\mathrm{D},\alpha} error bars. For the high-resolution comparison, our P1D,αP_{1\mathrm{D},\alpha} measurement and the error bars are rebinned to the same wavenumber binning.

The P1D,αP_{1\mathrm{D},\alpha} measurement is obtained from running the entire pipeline given in 3, correcting for underestimated noise for high redshifts (z>2.7z>2.7), using the cross-exposure estimator for low redshift (z<2.7z<2.7), applying pixel masking, continuum fitting and resolution corrections, and subtracting the estimated metal power spectrum PSB1,m,zP_{\mathrm{SB1,m,z}}. The final P1D,αP_{1\mathrm{D},\alpha} estimator is defined as

P1D,α(A,z)=\displaystyle P_{1\mathrm{D},\alpha}(A,z)= Aline(A,z)Adla(z)Abal(A,z)Acont(A,z)Ares(A,z)\displaystyle A_{\mathrm{line}}(A,z)A_{\mathrm{dla}}(z)A_{\mathrm{bal}}(A,z)A_{\mathrm{cont}}(A,z)A_{\mathrm{res}}(A,z) (7.1)
×(Ps(k)sz,kAPSB1,m(A,z)),\displaystyle\times\left(\left\langle P_{s}(k)\right\rangle_{s\in z,k\in A}-P_{\mathrm{SB1,m}}(A,z)\right)\,,

where the individual power spectra definition is redshift dependent:

Ps(k)={𝒫X[δF,s(k)Rs(k),δF,s(k)Rs(k)]for z<2.7,[Praw,s(k)Pnoise,s(k)α]Rs2(k)for z>2.7.P_{s}(k)=\left\{\begin{array}[]{ll}\mathcal{P}_{X}\left[\overrightarrow{\frac{\delta_{F,s}(k)}{R_{s}(k)}},\overrightarrow{\frac{\delta_{F,s}(k)}{R_{s}(k)}}\right]&\text{for }z<2.7,\\ \left[P_{\mathrm{raw},s}(k)-P_{\mathrm{noise},s}(k)-\alpha\right]\cdot R_{s}^{-2}(k)&\text{for }z>2.7\,.\end{array}\right. (7.2)

Figure 12 shows the one-dimensional power spectrum measurement obtained from DESI-DR1 data set with this full FFT estimator. The error bars are the quadratic sum of the statistical and total systematic uncertainties detailed in previous sections. The increase in statistics is visible by eye compared to the previous EDR+M2 measurement in R23. We did not compute the P1D,αP_{1\mathrm{D},\alpha} for redshift higher than 4.24.2 because of the too-low number of sub-forests and the significant variation of P1D,αP_{1\mathrm{D},\alpha}.

We compare the resulting P1D,αP_{1\mathrm{D},\alpha} measurement with the previous eBOSS [45] and high-resolution measurements from the combination of KODIAQ, SQUAD, and XQ-100 surveys [66] in figure 13. As those studies expressed P1D,αP_{1\mathrm{D},\alpha} in velocity units (kms1\text{km$\cdot$s}^{-1}), the DR1 power spectrum is converted using k[skm1]=k[Å1]×λα(1+z)/ck\left[\text{s$\cdot$km}^{-1}\right]=k\left[\text{\AA }^{-1}\right]\times\lambda_{\alpha}(1+z)/c before the P1D,αP_{1\mathrm{D},\alpha} averaging stage using the redshift of individual sub-forests. We reproduce all aforementioned corrections with the velocity unit measurements on DR1 data and synthetic data in KR25.

Refer to caption
Figure 14: (top) eBOSS (shaded area) and DESI DR1 (points) P1D,αP_{1\mathrm{D},\alpha} measurements resulting from a similar analysis pipeline and using the DESI DLA catalog. (bottom) Ratio of DESI DR1 over this updated eBOSS measurement.

We observe a discrepancy with eBOSS, especially at large scales (k<0.005kms1k<0.005\ \text{km$\cdot$s}^{-1}) and for the first two redshift bins (z=2.2,2.4z=2.2,2.4). This is primarily due to the different DLA catalogs. Indeed, analyzing eBOSS data with our DLA catalog and using procedures close to the DESI analysis results in a DESI to eBOSS ratio independent of zz and kk for k<0.015k<0.015 s/km, as illustrated in figure 14. The origin of this 1.05 ratio is unknown and will be further studied in future publications. We ascribe the differences at higher kk in figure 14 to the noise characterization, which is much less controlled for eBOSS than for DESI data set. This is confirmed by the fact that DESI data appear in agreement with high-resolution experiments up to k<0.02k<0.02 s/km in figure 13.

The difference between our DR1 measurement and previous non-DESI ones is smaller than for EDR+M2 measurement in R23. In particular, our measurement is in better agreement with the high-resolution measurement at the smallest scales, implying an improved control of small-scale systematics in our study. However, we note that there still exists a disagreement larger than the error bar at k>0.02skm1k>0.02\ \text{s$\cdot$km}^{-1} which will need to be explored in future work before using both measurements for a cosmological interpretation. Since the DESI spectroscopic pipeline is built on a more robust methodology (e.g., ”spectroperfectionism” algorithm detailed in [62]) than the eBOSS one, and considering the extended data split analysis performed in our study, we are confident that our DR1 measurement is more robust than eBOSS for cosmological interpretation.

8 Conclusion and prospects

Using a Fast Fourier Transform estimator, we measured the one-dimensional Lyα\alpha power spectrum with the DESI-DR1 data set. Following the work performed in the previous DESI study R23 [33], we extended the instrumental characterization to the entire DR1 data set. The noise control was improved by introducing a new P1D,αP_{1\mathrm{D},\alpha} estimator, which uses the individual cross-power spectrum between different DESI exposures of the same quasar. This cross-exposure estimator, used for the low redshift sample of the DR1 measurement, does not require explicit modeling of the noise power spectrum and allows in addition for the reduction of fiber-related instrumental effects.

An extended analysis of potential sources of systematics was performed by computing P1D,αP_{1\mathrm{D},\alpha} for various data splits and method variations. Those studies notably concluded that the broad absorption line quasars detected with the absorption index should be accounted for by masking the associated absorptions. All the other data splits and method variations highlighted the robustness of our measurement. We conducted a detailed investigation of the statistical and systematic uncertainty budget. Notably, the variance estimator had to be modified to take into account spectrum weighting, and the impact of broad absorption line quasars was added to the systematic budget. A covariance estimator that accounts for SNR¯\overline{\mathrm{SNR}} weighting was developed and used to compute the covariance matrix, which will be used for cosmological interpretation of the measurement.

The resulting P1D,αP_{1\mathrm{D},\alpha} measured on the DR1 data set constitutes the best intermediate-resolution power spectrum measurement in terms of resolution and statistics. In particular, our measurement represent a drastic statistical improvement (55 times increase in number of quasars, and around 2.82.8 times decrease in statistical uncertainty) compared to the previous DESI measurement [33, 34]. We compared the DR1 measurement to eBOSS [45] and high-resolution [66] measurements. Relative to R23, we observe a global improvement in the agreement with previous measurements, although we still have large and small-scale discrepancies.

The FFT measurement is associated with a companion paper [37] presenting the measurement of P1D,αP_{1\mathrm{D},\alpha} with a quadratic maximum likelihood estimator. Both measurements agree in a comparison made in [37]. They will be cosmologically interpreted in another paper in preparation. High-resolution hydrodynamic simulations will be used in hand with state-of-the-art Gaussian process emulators [67, 68, 69] to interpret those P1D,αP_{1\mathrm{D},\alpha} measurements. They will put strong constraints on the amplitude and slope of the matter power spectrum, the sum of neutrino masses, and exotic dark matter models.

The uncertainty budgeting conducted in this article showed that systematic uncertainties now dominate the one-dimensional power spectrum measurement for most scales. Future measurements should aim to enhance the characterization of the different systematic sources to lower their contribution, as done for the noise systematic in this study. The increase in statistics with the next data release of DESI will provide the opportunity to measure P1D,αP_{1\mathrm{D},\alpha} data from sub-samples less contaminated by systematics. For example, the systematics related to pixel masking can be suppressed using masking window methods as in [70]. We note that other surveys such as 4MOST Cosmology Redshift Survey [71] or WEAVE-QSO [72] will also provide clean Lyα\alpha forest data sets that can be used to measure P1D,αP_{1\mathrm{D},\alpha}.

Appendix A Updated figures for DR1 data set

Refer to caption
Figure 15: Resolution damping (Rs2(k)sz,kA\left\langle R_{s}^{2}(k)\right\rangle_{s\in z,k\in A}) computed for the DR1 data set. This damping is shown in dashed black lines for the different redshifts and is very weakly dependent on it. Blue points represent the average over all redshifts. The shaded region represents the wavenumbers for which the impact due to resolution damping is larger than 8080 %. Using this criterion, the vertical line defines the maximal wavenumber used in our DR1 analysis.
Refer to caption
Figure 16: One-dimensional power spectra measured in the side-band region SB1 of DR1 data set with noise correction adapted to SB1 region. The noise power spectrum used for this SB1 calculation is corrected with an additive term α=0.00018\alpha=0.00018 Å, measured with the same method performed in figure 2 adapted to SB1. The continuous lines are obtained by fitting the model in equation 4.1 with a first-order redshift-dependent polynomial function to the data points represented by points. This fitted model is used to correct for the metal power spectrum in the main measurement.

This appendix contains the figures for which we used in this paper the same methods as were used for EDR+M2 data set in R23. We provide those figures to show an update using the new DESI-DR1 data set. Figure 15 shows the resolution damping computed as Rs2(k)sz,kA\left\langle R_{s}^{2}(k)\right\rangle_{s\in z,k\in A} from the average of the Fourier transform of the DESI pipeline resolution matrix. This resolution damping determines both the maximal wavenumber considered and the systematic uncertainty from PSF stability.

Figure 16 shows the power spectrum measured in the side-band SB1 (1,270<λrf<1,3801,270<\lambda_{\mathrm{rf}}<1,380 Å) and used to estimate the metal power spectrum PmetalsP_{\mathrm{metals}}.

Appendix B Covariance and variance estimators

Let us consider we have nn pairs of measurements (Xi,Yi)(X_{i},Y_{i}) of two physics quantities XX and YY and the weighted means X¯=i=1nviXi/i=1nvi\overline{X}=\sum_{i=1}^{n}v_{i}X_{i}/\sum_{i=1}^{n}v_{i} and Y¯=i=1nwiYi/i=1nwi\overline{Y}=\sum_{i=1}^{n}w_{i}Y_{i}/\sum_{i=1}^{n}w_{i}. We note E(Xi)=μE(X_{i})=\mu, E(Yi)=νE(Y_{i})=\nu and cov(Xi,Yj)=δijci{\rm cov}(X_{i},Y_{j})=\delta_{ij}c_{i}, which means we assume cov(Xi,Yj)=0{\rm cov}(X_{i},Y_{j})=0 when iji\neq j and in particular cov(Xi,Xj)=0{\rm cov}(X_{i},X_{j})=0. We have

cov(X¯,Y¯)=ijviwjcov(Xi,Yj)iviiwi=iviwiciiviiwi.{\rm cov}(\overline{X},\overline{Y})=\frac{\sum_{ij}v_{i}w_{j}{\rm cov}(X_{i},Y_{j})}{\sum_{i}v_{i}\sum_{i}w_{i}}=\frac{\sum_{i}v_{i}w_{i}c_{i}}{\sum_{i}v_{i}\sum_{i}w_{i}}\,. (B.1)

We want to extract this covariance from the dispersions of the XiX_{i} and YiY_{i}. In order to get the iviwici\sum_{i}v_{i}w_{i}c_{i} term, we should include iviwiXiYi\sum_{i}v_{i}w_{i}X_{i}Y_{i} in the estimator and normalize it by iviwi\sum_{i}v_{i}w_{i}. But this will produce a term in μν\mu\nu that we should try to cancel out, let’s try the following quantity:

T=iviwiXiYiiviwiiviXiiviiwiYiiwi.T=\frac{\sum_{i}v_{i}w_{i}X_{i}Y_{i}}{\sum_{i}v_{i}w_{i}}-\frac{\sum_{i}v_{i}X_{i}}{\sum_{i}v_{i}}\frac{\sum_{i}w_{i}Y_{i}}{\sum_{i}w_{i}}\,. (B.2)

We have

E(iviwiXiYi)=iviwi[cov(XiYi)+E(Xi)E(Yi)]=iviwi(ci+μν),E\left(\sum_{i}v_{i}w_{i}X_{i}Y_{i}\right)=\sum_{i}v_{i}w_{i}[{\rm cov}(X_{i}Y_{i})+E(X_{i})E(Y_{i})]=\sum_{i}v_{i}w_{i}(c_{i}+\mu\nu)\,, (B.3)

and

E(iviXijwjXj)\displaystyle E\left(\sum_{i}v_{i}X_{i}\sum_{j}w_{j}X_{j}\right) =\displaystyle= ijviwiE(XiYj)=ijviwj(μν+δijci)\displaystyle\sum_{ij}v_{i}w_{i}E(X_{i}Y_{j})=\sum_{ij}v_{i}w_{j}(\mu\nu+\delta_{ij}c_{i}) (B.4)
=\displaystyle= μνiwijwj+iviwici.\displaystyle\mu\nu\sum_{i}w_{i}\sum_{j}w_{j}+\sum_{i}v_{i}w_{i}c_{i}\,. (B.5)

We can now compute E(T)E(T) where the μν\mu\nu terms do cancel out and get

E(T)=iviwiciiviwiiviwiciivijwi=(viwiviwi1)iviwiciivijwi.\displaystyle E(T)=\frac{\sum_{i}v_{i}w_{i}c_{i}}{\sum_{i}v_{i}w_{i}}-\frac{\sum_{i}v_{i}w_{i}c_{i}}{\sum_{i}v_{i}\sum_{j}w_{i}}=\left(\frac{\sum v_{i}\sum w_{i}}{\sum v_{i}w_{i}}-1\right)\frac{\sum_{i}v_{i}w_{i}c_{i}}{\sum_{i}v_{i}\sum_{j}w_{i}}\,. (B.6)

So we just have to divide TT by the above parenthesis to get an unbiased estimator of the covariance:

cov^=(viwiviwi1)1[viwiXiYiviwiiviXiiviiwiYiiwi],\widehat{\rm cov}=\left(\frac{\sum v_{i}\sum w_{i}}{\sum v_{i}w_{i}}-1\right)^{-1}\left[\frac{\sum v_{i}w_{i}X_{i}Y_{i}}{\sum v_{i}w_{i}}-\frac{\sum_{i}v_{i}X_{i}}{\sum_{i}v_{i}}\frac{\sum_{i}w_{i}Y_{i}}{\sum_{i}w_{i}}\right]\,, (B.7)

which gives for the variance:

var^=((wi)2wi21)1[wi2Xi2wi2(iwiXiiwi)2].\widehat{\rm var}=\left(\frac{(\sum w_{i})^{2}}{\sum w_{i}^{2}}-1\right)^{-1}\left[\frac{\sum w_{i}^{2}X_{i}^{2}}{\sum w_{i}^{2}}-\left(\frac{\sum_{i}w_{i}X_{i}}{\sum_{i}w_{i}}\right)^{2}\right]\,. (B.8)

We can set Xi=PA,sX_{i}=P_{A,s}, Yi=PB,sY_{i}=P_{B,s}, vi=wA,sv_{i}=w_{A,s} and wi=wB,sw_{i}=w_{B,s} to get our estimators of the variance and covariance of P1D,αP_{1\mathrm{D},\alpha}.

An alternative estimator of the variance of a weighted average is sometimes proposed [73], where w2w^{2} is replaced by ww. However, the same kind of algebra as above shows that this estimator fails if the weights are not proportional to the inverse variance. In our case, this proportionality is only approximate, which makes this estimator inappropriate. In addition, this ww estimator cannot be generalized to a covariance estimator. These analytic findings were confirmed by a small Monte Carlo simulation that showed that our w2w^{2} estimator for the variance is unbiased and that the ww estimator fails in the general case.

Acknowledgments

This material is based upon work supported by the U.S. Department of Energy (DOE), Office of Science, Office of High-Energy Physics, under Contract No. DE–AC02–05CH11231, and by the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility under the same contract. Additional support for DESI was provided by the U.S. National Science Foundation (NSF), Division of Astronomical Sciences under Contract No. AST-0950945 to the NSF’s National Optical-Infrared Astronomy Research Laboratory; the Science and Technology Facilities Council of the United Kingdom; the Gordon and Betty Moore Foundation; the Heising-Simons Foundation; the French Alternative Energies and Atomic Energy Commission (CEA); the National Council of Humanities, Science and Technology of Mexico (CONAHCYT); the Ministry of Science, Innovation and Universities of Spain (MICIU/AEI/10.13039/501100011033), and by the DESI Member Institutions: https://www.desi.lbl.gov/collaborating-institutions. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the U. S. National Science Foundation, the U. S. Department of Energy, or any of the listed funding agencies.

The authors are honored to be permitted to conduct scientific research on I’oligam Du’ag (Kitt Peak), a mountain with particular significance to the Tohono O’odham Nation.

The project leading to this publication has received funding from Excellence Initiative of Aix-Marseille University - A*MIDEX, a French “Investissements d’Avenir” program (AMX-20-CE-02 - DARKUNI).

Data availability

The DESI spectra and associated catalogs from DESI-DR1 are publicly available111111https://data.desi.lbl.gov/doc/releases/dr1/. All the plots of this article are generated with picca7{}^{\ref{igmhub/picca}} (9.12.3) and p1desi \faGithub121212https://github.com/corentinravoux/p1desi (2.0.1). All the data points of the figures in this article will be made available upon publication according to the data management policy of DESI.

References