On the limitations of H $\alpha$ luminosity as a star formation tracer in spatially resolved observations

Zipeng Hu,

{}^{\href https://orcid.org/0000-0002-3758-552X\,1}

Benjamin D. Wibking,

{}^{\href https://orcid.org/0000-0003-3175-2291\,2}

Mark R. Krumholz

{}^{\href https://orcid.org/0000-0003-3893-854X\,1,3}

, and Christoph Federrath

{}^{\href https://orcid.org/0000-0002-0706-2306\,1,3}

¹Research School of Astronomy and Astrophysics, Australian National University, Canberra, ACT 2611, Australia
²Department of Physics and Astronomy, Michigan State University,East Lansing, MI,USA
³ARC Centre of Excellence for Astronomy in Three Dimensions (ASTRO-3D), Canberra, ACT 2611, Australia
zphu.charles@gmail.com (ZPH)

(Accepted XXX. Received YYY; in original form ZZZ)

Abstract

This study examines the limitations of H $\alpha$ luminosity as a tracer of star formation rates (SFR) in spatially resolved observations. We carry out high-resolution simulations of a Milky Way-like galaxy including both supernova and photoionization feedback, and from these we generate synthetic H $\alpha$ emission maps that we compare to maps of the true distribution of young stellar objects (YSOs) on scales from whole-galaxy to individual molecular clouds ( $\lesssim 100$ pc). Our results reveal significant spatial mismatches between H $\alpha$ and true YSO maps on sub-100 pc scales, primarily due to ionizing photon leakage, with a secondary contribution from young stars drifting away from their parent molecular clouds. On small scales these effects contribute significantly to the observed anti-correlation between gas and star formation, such that there is $\sim 50\%$ less anti-correlation if we replace an H $\alpha$ -based star formation map with a YSO-based one; this in turn implies that previous studies have underestimated the time it takes for young stars to disperse their parent molecular clouds. However, these effects are limited in dense regions with hydrogen columns $N_{\mathrm{H}}>3\times 10^{21}$ cm^-2, where the H $\alpha$ - and YSO-based SFR maps show better agreement. Based on this finding we propose a calibration model that can precisely measure the SFR of large molecular clouds (mean radius > 100 pc) with a combination of H $\alpha$ and CO observations, which provides a foundation for future study of star formation processes in extragalactic molecular clouds.

keywords:

ISM: clouds – ISM: HII regions – galaxies: star formation

^†^†pubyear: 2024^†^†pagerange: On the limitations of H

\alpha

luminosity as a star formation tracer in spatially resolved observations–On the limitations of H

\alpha

luminosity as a star formation tracer in spatially resolved observations

1 Introduction

The H $\alpha$ emission line produced by hydrogen recombination is one of the most commonly used tracers for star formation, since it is emitted by the photoionized gas created by extreme ultraviolet photons from young massive stars. Observers routinely use this line to measure star formation rates (SFRs) for galaxies out to redshift $z\sim 2.5$ , and on scales from whole-galaxy to kpc (e.g., Kennicutt, 1983; Shivaei et al., 2015; Ellison et al., 2017; Belfiore et al., 2018). There are two major advantages of this method: first, the H $\alpha$ line is easy to observe due to its brightness and location in the optical part of the spectrum, and because hydrogen is ubiquitous throughout the Universe; second, compared with other SFR tracers as the dust-reprocessed IR and UV continuum, H $\alpha$ traces only the most recent star formation ( $<5$ Myr) due to the short lifetimes of the stars that produce the bulk of the H $\alpha$ flux and the short recombination times of H ii regions (e.g., Weisz et al., 2012; Caplar & Tacchella, 2019; Haydon et al., 2020).

Driven in part by the success of H $\alpha$ -based galactic-scale SFR measurements, there has been great interest of extending such studies to the smaller scales characteristic of individual molecular clouds, $\lesssim 100$ pc. Measurements on such small scales are important for testing star formation theories under different conditions, and have been used to measure quantities such as the characteristic lifetimes of molecular clouds and the strength of stellar feedback (e.g., Kruijssen & Longmore, 2014; Kruijssen et al., 2018; Chevance et al., 2020, 2021). Of all star formation tracers that are bright enough for routine use in extragalactic measurements, H $\alpha$ is the only one that traces star formation on time scales similar to cloud life times, which can be as short as $\sim 10$ Myr (Chevance et al., 2023, and references therein). However, systematic uncertainties arise when converting H $\alpha$ luminosity into SFR on local regions around molecular clouds. The first uncertainty source is the dust attenuation effect, which can cause significant underestimation of local SFRs in regions of high extinction. However, as long as the extinction is not too severe, this effect can be calibrated out using the ratios of Balmer recombination lines (the Balmer decrement method) on both galactic and cloud scales (e.g., Sánchez et al., 2012). The second problem is the stochasticity of massive star formation. In a $\lesssim 100$ pc region the SFR may be too small to sample all stellar masses and evolutionary states, which most often leads to underestimation of the SFR derived from the H $\alpha$ tracer (da Silva et al., 2014; Gregg et al., 2024). The third and to date most poorly-calibrated bias depends on the size of H ii regions. As summarized in Kennicutt & Evans (2012), such regions vary in scale from $<100$ pc in the Milky way to $200-500$ pc in actively star-forming galaxies, implying that H $\alpha$ -bright H ii region boundaries may be displaced away from the original star-forming sites. This effect is further complicated by non-uniform gas distributions around newborn stars, which allows ionizing photons to travel farther in directions with less gas and thus make the bias anisotropic (Jin et al., 2022).

There have been limited efforts to characterise and calibrate the effects of stochasticity and finite H ii region size. Observationally, one can compare H $\alpha$ to direct counts of newborn stars, or young stellar objects (YSOs), identified from IR observations. However, beyond the Milky Way such efforts are severely limited by resolution. To date the only published extragalactic comparison of these tracers is from the Large Magellanic Cloud (Ochsendorf et al., 2017), and analysis of these data together with comparable Galactic data sets indicates significant disagreements on cloud scales (Krumholz et al., 2019).

This situation leaves simulations as the best available tool for understanding the spatial scales on which the SFRs traced by H $\alpha$ begins to deviate from the true star formation rate as characterised by direct counts of YSOs, and for quantifying the size of the discrepancy. Although no theoretical effort has been made thus far to study this question, there have been several numerical works that try to calibrate other aspects of H $\alpha$ emission as a star formation tracer. Tacchella et al. (2022) use simulations of Milky Way-like and dwarf galaxies to study the effect of dust attenuation and calibrate the relation between H $\alpha$ luminosity and SFR accounting for it, while Flores Velázquez et al. (2020) have estimated the time scales to which different SFR tracers are sensitive with cosmological simulations. However, such galactic scale models generally do not reach spatial resolutions of parsecs, or mass resolutions $\lesssim 100$ M_⊙, and thus can not resolve the star formation histories of individual molecular clouds. Kiloparsec-scale galactic box simulations with periodic boundaries can reach such resolutions (e.g., Peters et al., 2017; Kado-Fong et al., 2020), but lack a realistic large-scale galactic environment, which may affect molecular cloud density and spatial structure and ionizing photon transmission. For obvious reasons, they can not be used to study the reliability of H $\alpha$ over a full range of scales from galactic to sub-kpc.

In this work, we perform high resolution MHD simulations to address these challenges. We start from a moderate resolution Milky-Way like spiral galaxy simulation to create a realistic galactic environment for molecular cloud evolution that we run to statistical steady-state, then increase the mass resolution and continue the simulation for a shorter period to resolve the structures on parsec scales. Our star formation algorithm stochastically draws stars from the initial mass function (IMF), traces the individual stellar evolution to calculate the feedback, and directly follows the propagation of ionizing photons through the galaxy using a Strömgren volume method. Via post-processing, we derive local chemical compositions, and generate synthetic CO, H $\alpha$ , and YSO observations. Our goal is to answer the following questions: on what scales do star formation rates inferred from H $\alpha$ agree reasonably well with those inferred directly from YSOs? On scales where agreement breaks down, what are the physical mechanisms responsible? And can we calibrate the H $\alpha$ tracer to the YSO tracer to allow more accurate mesurements on the scales of individual molecular clouds?

The present paper is organized as follows: Section 2 summarizes the numerical method and simulation setup, then outlines the synthetic observation generation method. Section 3 compares our synthetic H $\alpha$ observation with observations of real galaxies in order to validate our methods, and presents scale-dependent comparisons of star formation rates inferred by H $\alpha$ versus by direct YSO counting. Section 4 discusses the physical mechanisms behind the difference, develops a calibration method, and proposes possible future work in this area. Section 5 concludes the work done in this paper.

2 Methods

We describe in Section 2.1 our Milky-Way like galaxy simulation, and list basic physical parameters of the simulated galaxy. We also describe the underlying physics and numerical methods. Then in Section 2.2 we introduce our method to produce synthetic SFR measurements traced by both H $\alpha$ luminosity and YSO counts.

2.1 Simulations

Our galactic scale simulation is designed based on the whole-galaxy simulations by Wibking & Krumholz (2023, hereafter WK23) and Hu et al. (2023, hereafter HWK23). The simulation code we use is GIZMO (Hopkins, 2015; Hopkins & Raives, 2016; Hopkins, 2016), which adopts a mesh-free, Lagrangian finite-mass Godunov method to solve the equations of magneto-hydrodynamics (MHD). The simulation include a time-dependent chemistry network that tracks the abundances of H i, H ii, He i, He ii, and He iii, and uses these abundances to calculate hydrogen and helium cooling rates. It also includes metal line cooling computed by interpolating on a set of pre-computed CLOUDY (Ferland et al., 1998) tables assuming Solar metallicity and collisional ionization equilibrium, and as implemented by the GRACKLE chemistry and cooling library (Smith et al., 2017). We implement algorithms for star formation, stellar feedback, and radiative cooling in the same way as in HWK23. Here we provide a brief summary of the physics and numerical methods most relevant to this project, and refer the readers to HWK23 and WK23 for more details.

For every gas particle with a density $\rho_{\rm g}$ above a prescribed threshold $\rho_{\rm crit}$ , we determine its local SFR density as

\rho_{\rm SFR}=\epsilon_{\rm ff}\frac{\rho_{\rm g}}{t_{\rm ff}},

(1)

where $\epsilon_{\rm ff}$ is the star formation efficiency, and $t_{\rm ff}$ is the local gas free-fall time. Multiple observational methods across a wide density range yield a mean value of $\epsilon_{\mathrm{ff}}\approx 0.01$ (Krumholz et al., 2019), with a dispersion about 0.2 dex in the Milky Way (Hu et al., 2022). Therefore we adopt $\epsilon_{\rm ff}=0.01$ in equation 1. The star formation density threshold we apply in our simulation is $\rho_{\rm crit}=1000$ m_H cm^-3, chosen to produce approximate equality between the Jeans mass and the particle mass at our peak gas mass resolution $\Delta M=89$ M_⊙ (see below). To avoid extremely small time steps in high density region, and the resulting high computational cost, we increase $\epsilon_{\rm ff}$ to 1 for gas particles with $\rho_{\rm g}>100\rho_{\rm crit}$ . In this way, we quickly convert high density gas particles into stellar particles, and set the highest density we resolve in this simulation to approximately $100\rho_{\rm crit}$ . During one time step $\Delta t$ , gas particles with $\rho_{\mathrm{SFR}}>0$ have a probability

P=1-\text{exp}(-\epsilon_{\rm ff}\Delta t/t_{\rm ff})

(2)

of being converted to a stellar particle. Once formed, the stellar particles represent sub-clusters that only interact with the galactic environment via gravity and stellar feedback.

When calculating the stellar feedback, we cannot integrate over the IMF because our stellar particles are too small: at our mass resolution, the expected number of SNe per particle is $\lesssim 1$ , and the expected number of stars $\gtrsim 20$ M_⊙ – the mass range that dominates ionizing photon production – is $\ll 1$ . Instead, for each stellar particle, we use the stellar population synthesis code SLUG (da Silva et al., 2012; Krumholz et al., 2015) to stochastically draw its stellar population from a Chabrier IMF (Chabrier, 2005). In this simulation we only include photonionization and type II supernova feedback, since they are the dominant mechanisms at the scale with which we are concerned. At each hydrodynamical time-step, SLUG reports for each star particle the instantaneous ionizing luminosity, $\dot{N}_{\rm ion}$ , the number of individual type II supernovae, $N_{\rm SNe}$ , and the mass of the ejecta, $M_{\rm ej}$ , added during that time step; we derive the ionizing luminosities from a combination of Padova stellar tracks (Schaller et al., 1992; Meynet et al., 1994; Girardi et al., 2000) and SLUG’s “starburst99” treatment of stellar atmospheres (Leitherer et al., 1999), and draw our estimates of which stars end their lives in type II SNe, and the ejecta mass for those that do, from the tabulated results provided by Sukhbold et al. (2016). Assuming each supernova ejects an equal energy of 10⁵¹ erg, we can easily derive the energy of the ejecta as $E_{\rm ej}=N_{\rm SNe}\times 10^{51}$ erg, and the momentum as $p_{\rm ej}=\sqrt{2E_{\rm ej}M_{\rm ej}}$ . Then for type II supernovae feedback, we inject the ejected mass, momentum, and energy into the neighbouring gas particles proportional to the solid angle centered on the stellar particle and subtended by the effective intersection area between each nearby gas particle and the star, following the approach described in Hopkins et al. (2018a, b). As for the implementation of photonionization, we adopt the algorithm as in Hopkins et al. (2018a): for each star particle, we sort its nearby gas particles with increasing distance, and start from the first gas particle that has a temperature below 10⁴ K, which we define as non-ionized. We then calculate the ionization rate $\Delta\dot{N}_{\rm ion}$ needed to fully ionize it, set the temperature of this gas particle to 10⁴ K, and deduct $\Delta\dot{N}_{\rm ion}$ from $\dot{N}_{\rm ion}$ . We repeat this process on the next closest non-ionized gas particle until $\dot{N}_{\rm ion}\approx 0$ .

The initial conditions and setup of our simulation follow the same routine as described in HWK23. Following that paper, we initially run the simulation for 0.7 Gyr at a mass resolution of $\Delta M=859.3$ M_⊙. During this period the galaxy reaches a stable SFR close to $3\;\rm M_{\odot}$ yr^-1. We then increase the mass resolution by a factor of 10. To do so, we replace each gas particle with 10 smaller particles randomly distributed around their parent particle’s center of mass, with a distance of one fifth of the parental smoothing length. After the split, the child particle mass is divided by 10, the child smoothing length is divided by $10^{1/3}$ , while all other physical parameters remain the same. In this way we improve the mass resolution to about 90 M_⊙; as noted above, for our cooling curve this is approximately equal to the Jeans mass at our star formation threshold, and the corresponding Jeans length is $\approx$ 1 pc, which we take to be the characteristic resolution of our simulation. This split snapshot becomes the initial condition for the next stage of our simulation, which we run for 23 Myr, which is about 4 times of the maximum age of massive stars that can emit ionizing photons ( $\sim$ 5 Myr) and thus is long enough for the ionization pattern to reach a steady state at the new resolution. During this phase we output snapshots at a 1 Myr frequency. We show the face-on gas column density map $N_{\rm H}$ at 720 Myr (i.e., 20 Myr into the high resolution run) in the upper left panel of Figure 1.

2.2 Synthetic observation

In order to study the difference between SFR measurements derived from H $\alpha$ and those using direct star counts, we need to produce two different synthetic maps from our simulation: a YSO mass distribution map and an H $\alpha$ luminosity map. In addition, we also produce a synthetic CO emission map, for comparison with real extragalactic molecular cloud observations.

Due to their different evolution stages and spectral energy distributions, protostars identified from IR observations are normally classified into two types (e.g., Pokhrel et al., 2020): class 1 YSOs are embedded in a circumstellar envelope, and have a smaller age limit ( $\lesssim 0.5$ Myr), while class 2 YSOs are surrounded by an optically thick circumstellar disk but not an opaque envelope, and have a larger age limit ( $<2$ Myr). We elect to use a 2 Myr age limit to define YSOs for our analysis, both for consistency with existing extragalactic observational analyses (Ochsendorf et al., 2017) and to reduce the shot noise that arises from the finite number of star particles in our simulation, which is limited by our mass resolution. To construct our YSO map, we identify all stellar particles formed in the 2 Myr prior to the time of a given snapshot as YSOs, and project their masses along the z-axis (normal to the galactic plane) onto a 2D histogram to produce the YSO mass distribution map; we defer for the moment a discussion of the pixel size we use in these maps. The SFR per unit area inside each pixel can be simply calculated as

\Sigma_{\rm SFR,YSO}=\frac{\sum m_{\rm YSO,pix}}{\Delta A_{\rm pix}\Delta t_{\rm YSO}},

(3)

where $m_{\rm YSO,pix}$ is the total mass of YSOs inside a given pixel, $\Delta A_{\rm pix}$ is the pixel area, and $\Delta t_{\rm YSO}=2$ Myr is the age range over which we identify YSOs. We plot the map of $\Sigma_{\rm SFR,YSO}$ in the lower right panel of Figure 1, which illustrates the intrinsic spatial distribution of SFR in our simulated galaxy.

To construct the H $\alpha$ luminosity map, we first compute the case B H $\alpha$ emission coefficient from the approximation suggested by Draine (2011),

\alpha_{\rm eff,H\alpha}\approx 1.17\times 10^{-13}T_{4}^{-0.942-0.031\;\mathrm{ln}(T_{4})}\mathrm{cm^{3}s^{-1}},

(4)

where $T_{4}$ is the ratio between the gas temperature $T$ and 10⁴ K. Since we trace the abundances of hydrogen and helium atoms and ions in our simulation, we can easily derive the local number density of protons $n_{\mathrm{p}}=n_{\mathrm{HII}}$ and electrons $n_{\mathrm{e}}=n_{\mathrm{HII}}+n_{\mathrm{HeII}}+2n_{\mathrm{HeIII}}$ .¹¹1A minor technical issue is that, due to the difference between the time steps of GIZMO and GRACKLE, the chemical abundance of gas particles is not always updated to include the effects of photoionization when GIZMO writes out a snapshot. This results in a small number of particles that have been ionized by the photoionization algorithm and whose temperatures have therefore been set to $10^{4}$ K, but whose H ii abundances have not yet been updated. To handle these cases we treat all gas particles with $T\geq 10^{4}$ K as fully ionized in hydrogen, consistent with our photonionzation feedback algorithm. Then we can derive the total H $\alpha$ luminosity from each gas particle simply as

L_{\rm H\alpha}=\alpha_{\rm eff,H\alpha}h\nu_{\alpha}n_{\mathrm{e}}n_{\mathrm{p}}V,

(5)

where $\nu_{\alpha}$ = 457 THz is the frequency of the H $\alpha$ emission line, and $V$ is the effective volume of the gas particle. Since equation 4 is no longer a good approximation for low temperature regions, we discard $L_{\rm H\alpha}$ from gas particles with $T<1000$ K. We also ignore $L_{\rm H\alpha}$ for gas particles with $\rho>10\rho_{\mathrm{crit}}$ , since we are modifying the physics at these densities with our star formation algorithm and thus the particle properties are unreliable. To convert $L_{\rm H\alpha}$ to SFR, we consider a slightly modified version of the fit provided by Kennicutt & Evans (2012), who find a relationship

\mathrm{log\;SFR}(\mathrm{M_{\odot}\,yr^{-1}})=\mathrm{log}\;L_{\rm H\alpha}(\mathrm{erg\,s^{-1}})-C_{\mathrm{H\alpha}},

(6)

where $C_{\mathrm{H\alpha}}=41.27$ is the calibration parameter. This calibration includes the effects of dust scattering and attenuation, both external and internal to H ii regions (i.e., the fact that not every ionizing photon is absorbed by a hydrogen atom), which are not included in our simulation and post-analysis; it is also at least somewhat dependent on the choice of IMF and stellar evolution tracks to atmospheres. To ensure consistency, we therefore determine a different calibration factor as follows: for all snapshots past the first 3 Myr (during which time the high resolution simulation is still settling), we calculate both the total $L_{\rm H\alpha}$ and the total star formation rate traced by YSOs with ages $<2$ Myr over the whole simulation domain, and set our value of $C_{\mathrm{H\alpha}}$ so that the star formation rates from H $\alpha$ and YSO counts agree when averaged over these snapshots. The resulting value is $C_{\mathrm{H\alpha}}=40.67$ , a factor of $\approx 3$ smaller than the Kennicutt & Evans (2012) value, which is consistent with the lack of dust attenuation in our simulations being the principle difference. We adopt this value throughout our analysis. We use this calibration to compute the H $\alpha$ -derived star formation rate of every gas particle, and project the star formation rate converted from H $\alpha$ luminosity SFR_Hα along the z axis. We show a sample SFR surface density map from the 720 Myr snapshot in the lower left panel of Figure 1.

The last mock observation we produce is the synthetic CO emission map, which is the most commonly used tracer of molecular mass. Our approach to this map is similar to that in HWK23, so here we provide only a brief summary. While it is possible to include chemistry on the fly in simulations (e.g., Glover et al., 2010; Tritsis et al., 2022), this is computationally intensive, and also does not by itself capture non-LTE excitation effects, which are equally important for predicting observable emission. For this reason, we instead use the DESPOTIC astrochemistry and radiative transfer code (Krumholz, 2014) to post-process the output snapshots. Under the assumption of Solar elemental abundances and Solar radiation field environment, DESPOTIC uses an escape probability approximation to derive the CO $J=1\rightarrow 0$ emission line luminosity per hydrogen atom $l_{\mathrm{CO}}$ ( $\nu_{\mathrm{CO}}$ = 115.271 GHz) as a function of three local physical parameters: hydrogen atom number density $n_{\mathrm{H}}$ , column density $N_{\mathrm{H}}$ , and velocity gradient $\mathrm{d}v/\mathrm{d}r$ . We first run DESPOTIC on a grid covering the parameter range $n_{\rm H}=10^{-2}-10^{6}\;\rm cm^{-3}$ , $N_{\rm H}=10^{19}-10^{25}\;\rm cm^{-2}$ , and $dv/dr=10^{-3}-10^{2}\;\rm km/s/pc$ . The result is a table of $l_{\mathrm{CO}}$ versus $(n_{\rm H},N_{\rm H},\mathrm{d}v/\mathrm{d}r)$ ; we determine $l_{\mathrm{CO}}$ values for each gas particle by performing trilinear interpolation on this table. We take values of $n_{\rm H}$ and $\mathrm{d}v/\mathrm{d}r$ for each cell directly from GIZMO output, while we derive $N_{\rm H}$ from the $N_{\mathrm{H}}$ value at the gas particle position in the column density map. Then we can derive the total CO $J=1\rightarrow 0$ emission line luminosity as $L_{\mathrm{CO}}=X_{\mathrm{H}}m_{\mathrm{g}}l_{\mathrm{CO}}/m_{\mathrm{H}}$ , where $X_{\mathrm{H}}=0.76$ is the hydrogen mass fraction, and $m_{\mathrm{g}}$ is the gas particle mass. We set $L_{\mathrm{CO}}=0$ for gas particles with ionized mass ratio $X_{\mathrm{HII}}>10\%$ ; this is necessary because our DESPOTIC grid does not include the effects of photoionization. We refer the readers to HWK23 and Krumholz (2014) for more details. To facilitate comparison to observations, we convert $L_{\mathrm{CO}}$ into the CO-inferred molecular mass as $m_{\mathrm{H_{2},CO}}/\mathrm{M}_{\odot}=2.3\times 10^{-29}L_{\mathrm{CO}}/(\mathrm{erg\,s}^{-1})$ ; this conversion factor combines the CO $J=1\rightarrow 0$ line-luminosity conversion factor from Solomon & Vanden Bout (2005) and the mass-to-luminosity conversion factor from Bolatto et al. (2013). The resulting face-on synthetic $\Sigma_{\mathrm{H_{2},CO}}$ surface density map for the 720 Myr snapshot is plotted on the upper right panel in Figure 1.

Refer to caption — Figure 1: Milky-Way-like spiral galaxy simulation at 720 Myr. Upper left panel: total gas column density map; here $N_{\mathrm{H}}$ is the column density of H nuclei, regardless of their chemical state. Upper right panel: surface density map of molecular mass as traced by CO $J=1\rightarrow 0$ emission (see main text for details). Lower left panel: SFR surface density map traced by H $\alpha$ emission. Lower right panel: SFR surface density map traced by YSOs with ages below 2 Myr. The dimensions of each map are $30\times 30$ kpc, while the pixel size of each map is 20 pc. The ruler in the upper left panel indicates 3 kpc.

3 Results

In this section, we first demonstrate that the spatial distribution of our synthetic H $\alpha$ luminosity map is similar to what is found in real observed galaxies by producing the so-called tuning fork diagram (Section 3.1). Then we illustrate and quantify the differences between the SFR measurements traced by H $\alpha$ and YSOs as a function of scale (Section 3.2).

3.1 Tuning fork diagram

To test the reliability of our stellar feedback mechanisms and the resulting synthetic H $\alpha$ maps, especially on spatial scales of molecular clouds ( $<100$ pc), we need to demonstrate that our simulations are capable of reproducing the spatial decorrelation between H $\alpha$ emission peaks and molecular clouds that has been recently observed in a sample of nearby galaxies (Kruijssen et al., 2018, 2019; Chevance et al., 2020, 2021; Kim et al., 2022). Reproducing this decorrelation has proven to be an exquisively sensitive test of feedback implementations in galaxy simulation codes (e.g., Fujimoto et al., 2019; Semenov et al., 2021; Jeffreson et al., 2021, 2024).

Such decorrelation can be quantified by measuring the depletion time of molecular gas, defined as the ratio between the molecular mass traced by CO emission, and SFR traced by H $\alpha$ emission: $\tau_{\mathrm{dep}}=M_{\rm H_{2},CO}/\rm SFR_{H\alpha}$ . One measures $\tau_{\mathrm{dep}}$ in apertures of different sizes and centered on peaks of either the CO or H $\alpha$ map, then compare the results to the mean depletion time of the whole galaxy $\tau_{\mathrm{dep,galaxy}}$ . Due to the spatial mismatch between CO and H $\alpha$ peaks, on sub-100 pc scales $\tau_{\mathrm{dep}}>\tau_{\mathrm{dep,galaxy}}$ when measured around CO peaks, and $\tau_{\mathrm{dep}}<\tau_{\mathrm{dep,galaxy}}$ when measured around H $\alpha$ peaks, but the difference diminishes as the aperture size increases, vanishing almost completely at $\sim$ kpc scales.

To mimic real observations, we first modify our maps by applying the resolution and detection limits reported by Kruijssen et al. (2019) for the nearby spiral galaxy NGC 300, which are typical of other applications of this method. Their observations have a pixel size of 20 pc, a $\Sigma_{\mathrm{H_{2},CO}}$ detection limit of 13 M_⊙ pc^-2, and a $\Sigma_{\rm SFR,H\alpha}$ detection limit corresponding to an H $\alpha$ -inferred star formation rate $\Sigma_{\mathrm{SFR,H\alpha}}=8.05\times 10^{-3}\;\mathrm{M_{\odot}yr^{-1}kpc^{-2}}$ . Then we follow the pipeline described by Kruijssen et al. (2018) to generate the tuning fork diagram. The first step is to plot both $\Sigma_{\mathrm{H_{2},CO}}$ and $\Sigma_{\mathrm{SFR,H\alpha}}$ maps at the finest available resolution of 20 pc (Figure 1), then cut all pixels below the detection limits above. Applying these detection limit thresholds removes $\sim$ 11% of the molecular mass and $\sim$ 13% of the H $\alpha$ emission, so the effects on star formation and molecular mass are comparable. We then determine a mean depletion time $\tau_{\mathrm{dep,0}}$ by summing the molecular gas mass and star formation rate over the remaining pixels and taking their ratio. Secondly, for a given scale $L$ , we smooth both maps with a top hat filter with aperture size $L$ , then identify local maxima from each map as emission peaks. If $L$ is large enough for some of the apertures to overlap, we randomly select 500 sub-samples of non-overlapping apertures around identified peaks, then calculate the mean $\Sigma_{\mathrm{H_{2},CO}}$ and $\Sigma_{\mathrm{SFR,H\alpha}}$ values from the whole collection of sub-samples. The depletion time of aperture size $L$ ( $\tau_{\mathrm{dep}}(L)$ ) is calculated as the ratio between these two mean values. Due to the stochasticity of our star formation algorithm, the birth of young massive stars is random in our simulation. To minimize this noise, we perform the analysis above for 20 snapshots between 704 Myr and 723 Myr, then plot the 16th, 50th and 84th percentiles of the ratio $\tau_{\mathrm{dep}}(L)/\tau_{\mathrm{dep,0}}$ around CO emission peaks (top branch) and H $\alpha$ peaks (bottom branch) in Figure 2. We also show the measured values for NGC 628 (Chevance et al., 2020) for comparison; we choose this galaxy as an appropriate comparison because it has a total mass and size comparable to our simulated galaxy.

The plot shows that our simulated galaxy reproduces the observed tuning fork-like shape. At large scales ( $L\leq$ 1 kpc), the two branches of the fork converge because the aperture size is large enough to include many CO peaks and H $\alpha$ peaks, making the depletion time equal to the galactic average value. When the aperture scale comes down, especially to $L<100$ pc, the branches diverge as a result of decoupling between H ii region and molecular gas. The overall shape is in good agreement with the results from NGC 628 (Chevance et al., 2020), and a quantitative comparison confirms this visual impression. Below in Section 4.2 we apply the same heisenberg code (Kruijssen et al., 2018) that Chevance et al. apply to NGC 628 to estimate the characteristic spacing $\lambda$ between the CO emission peaks and H $\alpha$ emission peaks from our simulated images, and we obtain $\lambda=103.2^{+58.3}_{-19.7}$ pc, whereas Chevance et al. find $\lambda=113^{+22}_{-14}$ pc for NGC 628 – identical to within the uncertainties. This confirms the reliability of our stellar feedback models, especially on cloud scale, and gives us the confidence to begin examining how star formation rate traced by H $\alpha$ differs from the “true” star formation rate one can infer directly from YSOs.

3.2 Comparison of SFRs from H $\alpha$ emission and direct YSO counts

To study how well H $\alpha$ luminosity traces recent star formation history, we first compare their spatial distributions qualitatively. In Figure 3, we plot contours of $\Sigma_{\rm SFR,H\alpha}$ (red) and $\Sigma_{\rm SFR,YSO}$ (black); the H $\alpha$ contour level corresponds to a typical detection limit of $8.05\times 10^{-3}\;\mathrm{M_{\odot}yr^{-1}kpc^{-2}}$ , and the $\Sigma_{\rm SFR,YSO}$ contour level is set to enclose pixels that include at least one YSO with age $<2$ Myr. To make the differences easier to see, we zoom in to a 3 kpc $\times$ 3 kpc patch of the galaxy, with the galactic center at the lower right corner; the pixel size is 20 pc for consistency. From the plot, we can clearly see a spatial mismatch: many H ii regions contain no YSOs (though the great majority of these do contain older stars, with ages of $2-5$ Myr), while others are much more spatially extended than the corresponding $\Sigma_{\rm SFR,YSO}$ contours. Averaged over 20 snapshots, the overlapping area between the two kinds of contours accounts for about only 38% of the total area of H $\alpha$ contours, and about 70% of the total area of YSO contours. Such mismatch indicates that on sub-100 pc scale, H $\alpha$ luminosity is an imperfect tracer of recent star formation history.

To quantify the difference between the H $\alpha$ - and YSO-based star formation rates as a function of spatial scale, we first construct maps of both $\Sigma_{\rm SFR,H\alpha}$ and $\Sigma_{\rm SFR,YSO}$ with pixel sizes $\ell$ ranging from 10 pc to 3 kpc. For each pair of maps with same pixel size, we define their relative difference as

\delta(\ell)=\frac{\sum_{i}\left|\Sigma_{{\rm SFR,H\alpha},i}-\Sigma_{{\rm SFR,YSO},i}\right|}{2\sum_{i}\Sigma_{{\rm SFR,YSO},i}},

(7)

where $\Sigma_{{\rm SFR,YSO},i}$ and $\Sigma_{{\rm SFR,H\alpha},i}$ are the YSO- and H $\alpha$ -derived SFRs in pixel $i$ for the map of pixel size $\ell$ , and the sums run over all pixels. We illustrate this procedure in Figure 4, where we plot the galactic $\Sigma_{\rm SFR}$ maps for H $\alpha$ and YSOs (1st and 2nd columns), and the normalised absolute difference $|\Sigma_{\mathrm{SFR,H\alpha}}-\Sigma_{\mathrm{SFR,YSO}}|/2\bar{\Sigma}_{\rm SFR,YSO}$ (3rd column), where $\bar{\Sigma}_{\rm SFR,YSO}$ is the mean of $\Sigma_{\mathrm{SFR,YSO}}$ over all pixels, at pixel sizes $\ell=0.1,$ 1, and 3 kpc (top to bottom rows). The quantity $\delta(\ell)$ defined in equation 7 is simply the mean value of the maps in the third column. As shown in the figure, when $\ell$ increases, the difference between $\Sigma_{\rm SFR}$ maps drops, and thus $\delta$ becomes smaller.

Because we have normalised the total SFRs derived from H $\alpha$ and YSOs to be equal when summed over the full galaxy, $\delta$ is constrained to lie in the range $0-1$ , with $\delta=0$ when the two maps are exactly the same and $\delta=1$ when they are completely non-overlapping. We plot the $\delta(\ell)$ versus the pixel size $\ell$ in Figure 5. Similar to Figure 2, we present the results stacked from 20 snapshots between 704 Myr and 723 Myr by showing the 50th percentile as the dotted lines, and the 16th and 84th percentiles as band edges. We also show the results from YSOs with over two ranges, for reasons we discuss below in Section 4.1. From $\ell=1$ kpc to $\ell=10$ pc, we see that $\delta(\ell)$ quickly increases, and reaching values $\sim 0.5$ at sub-100 pc scales. This indicates H $\alpha$ luminosity intrinsically fails to trace the recent star formation rate on sub-100 pc scales, and cannot be used for studying the star formation history of molecular clouds without a great deal of caution.

4 Discussion

In this section, we first discuss the physical mechanisms behind the breakdown of H $\alpha$ as a SFR tracer on small scales (Section 4.1), and consider the implications of this for inferences of molecular cloud properties (Section 4.2). Then we develop an improved method to calibrate H $\alpha$ measurements for individual molecular clouds (Section 4.3). Finally we propose future work with the help of most recent observations (Section 4.4).

4.1 Why H $\alpha$ fails on small scales

We first seek to understand the physical mechanisms responsible for the breakdown of H $\alpha$ as a SFR tracer on small scales. As a start, we can immediately discard one obvious possibility, which is stochastic sampling of stellar populations (e.g., da Silva et al., 2012, 2014; Krumholz et al., 2015; Arora et al., 2021). While this is potentially an important effect at small scales, as we shall see below, stochasticity is not consistent with the pattern shown in Figure 4. If the primary effect were stochasticity, we would expect to see many contours containing YSOs but little H $\alpha$ , corresponding to regions containing a recently-formed population that is deficient in massive stars. While there are a few examples of such structures, Figure 4 shows that a far larger effect is H $\alpha$ contours that are more extended than YSO ones, or without any YSOs present at all.

This leads us to focus on two alternative explanations: drift of young massive stars, and transport of of ionizing photons. With regard to the former, we note that in Galactic observations YSOs identified from IR excess are normally younger than 2 Myr old, and we have followed this convention in our definition of YSOs. However, stars with significant ionizing luminosities can have lifetimes up to $\approx 5$ Myr. Once the stars are formed, they are no long subject to gas pressure and may drift out of their mother molecular cloud, and so one might hypothesise that H $\alpha$ differs from YSOs because it traces an older and more dispersed stellar population. We test for this effect in Figure 5, where we compare the mismatch parameter $\delta(\ell)$ computed for our fiducial YSO age limit or 2 Myr (gray band) to one derived using an age limit of 5 Myr instead (orange band); the latter should include all significant sources of ionizing photons. We see that the extending age limit to 5 Myr does reduce the mismatch between H $\alpha$ and YSOs, but only by $\sim 10\%$ . This suggests that stellar drift is not the primary culprit.

This leaves ionizing photon transport as the prime suspect. With regard to this phenomenon, we note that previous observations and theoretical models have found that ionizing photons can leak out of compact H ii regions, and create diffuse H $\alpha$ components up to even $\sim$ 1 kpc above the galactic plane (e.g., Mathis, 1986; Sembach et al., 2000; Wood et al., 2010; Belfiore et al., 2022). In the study of nearby disk galaxies by Pflamm-Altenburg & Kroupa (2008), they find that this photon leak effect becomes significant for gas surface densities $\Sigma_{\mathrm{gas}}<10$ M_⊙ pc^-2. However, their $\Sigma_{\mathrm{gas}}$ measurements are averaged over $\sim$ kpc scales, so the photon leak effect on cloud scales is still not well quantified. Our simulation, however, is perfectly suited to such a study, because when a gas parcel is newly ionized we set its temperature to exactly $10^{4}$ K, and therefore gas parcels at this temperature perfectly outline the outer boundaries of H ii regions. We can easily locate these particles in the output snapshots, allowing us to determine how far ionizing photons can travel.

Our analysis comes in three steps: first we record the coordinates of all gas particles with temperature of exactly $T=10^{4}$ K; then we build a k-d tree containing the coordinates of stars with ages $<5$ Myr, which are potential sources of ionizing photons; finally, for each identified gas particle, we find the closet young star using the k-d tree. For each gas particle, we record both the distance $D$ to the closest star and the total gas column density $N_{\mathrm{H}}$ around the star; we measure $N_{\mathrm{H}}$ in a circular region around the star with a radius of 100 pc in order to smear fluctuations while retaining cloud scale information. We perform this analysis on snapshots from 704 Myr to 723 Myr, and we plot the 16th, 50th, and 84th percentiles of $D$ versus $N_{\mathrm{H}}$ in Figure 6. This plot shows a clear trend that $D$ decreases with $N_{\mathrm{H}}$ , which indicates that denser regions can absorb more ionizing photons, and confine the H ii regions to smaller sizes. The median H ii region size drops below 100 pc for $N_{\mathrm{H}}>2\times 10^{21}$ cm^-2, suggesting that we should find better agreement between H $\alpha$ and YSO-based SFRs if the YSOs are embedded in dense regions.

To test this hypothesis, we plot the galactic SFR maps from the same snapshots with a pixel size of 100 pc; then for each pixel containing at least one YSO, we calculate the ratio $\Sigma_{\rm SFR,YSO}/\Sigma_{\rm SFR,H\alpha}$ and the column density $N_{\mathrm{H}}$ . We plot this ratio as a function of $N_{\mathrm{H}}$ in Figure 7, where we again show the median value as the solid line and the 16th to 84th percentiles as the shaded region. As the Figure shows, when YSOs reside in a low surface density region, the corresponding H ii region greatly expands beyond the 100 pc pixel, leading to a substantial mismatch between the H $\alpha$ and YSO-based SFRs. Conversely, when the local column density becomes high enough ( $N_{\mathrm{H}}\gtrsim 3\times 10^{21}$ cm^-2, or $\Sigma_{\mathrm{gas}}>32\;\mathrm{M_{\odot}\;pc^{-2}}$ ), H $\alpha$ begins to agree very well with YSOs because the H ii region is well confined. We therefore conclude that H $\alpha$ luminosity is only locally trustworthy for dense regions with $N_{\mathrm{H}}\gtrsim 10^{21}$ cm^-2, while the exact threshold needs calibration from future high-resolution observations.

Finally, we note that while we have shown above that drift of young stars away from their birth sites is not the direct cause of the H $\alpha$ -YSO mismatch, it may play a significant role in enhancing ionizing photon transport, together with the closely related effect that young stars drive away nearby dense gas with pre-supernovae stellar feedback. In either case, the effect is that the local column density around the stars drops with time, which enables the ionizing photons to travel further and enlarges the SFR tracers mismatch. Quantitatively, we find that the column density around a star on average drops by a factor of $\approx 10$ from the moment of its birth to the age of 5 Myr. Moreover, about 71% of the YSOs $<2$ Myr old reside in CO contours where $\Sigma_{\mathrm{H_{2},CO}}\geq 13$ M_⊙ pc^-2, our rough threshold for H ii region confinement to $\lesssim 100$ pc sizes; for stars with ages between 2 and 5 Myr, this fraction drops to $\approx 38\%$ . Thus the mismatch between H $\alpha$ and YSOs due to ionizing photon transport is mainly driven by older YSOs that have, either by drift or by dispersing their natal gas, escapes to low-density regions from which their ionizing photons can propagate long distances. This phenomenon has also been observed in another recent theoretical modeling effort by McClymont et al. (2024), who find that the diffuse ionized gas (DIG) is primarily ionized by older stars instead of embedded YSOs.

4.2 Implications for statistical inferences of the molecular cloud life cycle

The fact that H $\alpha$ -based SFRs are significantly distorted on small scales by photon transport effects has important implications for the interpretation of the small-scale gas-SFR decorrelation, which is often used to deduce properties of the molecular cloud life cycle and star formation process. To understand these implications, we begin by quantifying how photon transport effects contribute to the tuning fork diagram on small scales. We do this by constructing tuning fork diagrams using the same analysis method described in Section 3.1, but this time using the SFR map derived from YSOs instead of H $\alpha$ , both for YSOs with ages $<2$ Myr and $<5$ Myr. We show the result in Figure 8, together with H $\alpha$ -based result we obtained in Section 3.1.

We find that, unlike H $\alpha$ , YSOs with ages $<2$ Myr are spatially well correlated with the CO emission, so the upper branch of the black plot is close to unity across different scales. The lower branch is below unity, since when the aperture is located around YSO mass peaks, the local stellar particles are grouped together more tightly than the corresponding gas, making the local depletion time smaller than the galactic average value. The comparison between H $\alpha$ and YSOs with ages $<5$ Myr is perhaps more informative, since in this case we are probing very close to the same time scale as H $\alpha$ , and thus any differences between the H $\alpha$ and 5 Myr YSO tuning forks translate directly to changes in the inferred properties of molecular clouds. We find that, when the stellar age increases, either YSOs drift away from the CO emission peaks or molecular clouds are dispersed by feedback, resulting in both the upper and lower tuning fork branches deviating from unity. However, the deviation is noticeably smaller than for H $\alpha$ .

Examining published analyses based on the SFR-gas decorrelation (e.g., Kruijssen et al., 2018, 2019; Chevance et al., 2020, 2021), it is clear that a tuning fork that opens more narrowly and at small scales implies molecular clouds whose positions are correlated on smaller scales and which spend significantly longer in the “overlap” phase when both tracers of star formation and gas are present, and clouds have not been dispersed by feedback; it is likely that the inferred overall molecular cloud lifetime would increase as well.

To quantify this effect, we use the publicly-available heisenberg code (Kruijssen et al., 2018) to analyze all three tuning fork diagrams in Figure 8. As a first step in this analysis, we test whether heisenberg’s diffuse emission filter can remove the effects of H $\alpha$ emission coming from diffuse ionised gas (DIG) far from YSOs. The full algorithm implemented by this filter is described by Hygate et al. (2019), so here we only summarize for reader convenience. The first step is to first draw a tuning fork diagram and fit the average spacing between emission peaks $\lambda$ from the original emission map; doing so we obtain $\lambda=103.2_{-19.7}^{58.3}$ pc. We then filter out the diffuse component by transforming the H $\alpha$ map into Fourier space, applying a high-pass filter to remove modes with wave number $\gtrsim 1/k$ , and then transforming back to physical space. We then re-fit $\lambda$ by applying heisenberg to the filtered map, and iterate the process of fitting, filtering, and re-fitting until $\lambda$ changes by less than 5% between two successive iterations. Applying this procedure to our H $\alpha$ map, we find that the process converges in six iterations, with the characteristic separation stablizing at $\lambda=91^{+13}_{-11}$ pc; the mean is only 10% less than the non-filtered result, and the two are consistent within the formal uncertainties. Moreover, the fraction of diffuse H $\alpha$ flux is measured to be only $\sim 8\%$ . These indicate that the effect of DIG is limited in our synthetic observations, a result that can be understood as coming from two effects: first, we have already suppressed much of the DIG emission by removing H $\alpha$ emission below the detection limit; second, the spatially extended H $\alpha$ emission due to photon transmission effect is not distinct from compact H ii regions, as they are originally generated by the same sources. Given the minimal effects of filtering, we turn off this module for the remainder of our analysis with heisenberg, which allows us to use the same procedure to analyze both the H $\alpha$ -derived and YSO-derived star formation maps, eliminating a possible source of systematic difference.

We next use heisenberg to derive the durations of two phases in the cloud life cycle: the gas-only phase $t_{\mathrm{gas}}$ during which only the molecular gas tracer is present and the “overlap” phase $t_{\mathrm{over}}$ during which both tracers of star formation and gas are present. We do this for both the H $\alpha$ map and the two YSO-derived maps. When fitting the durations of these two phases, heisenberg requires that we supply a reference duration of the star-only phase $t_{\mathrm{ref}}$ . For H $\alpha$ emission, we adopt $t_{\mathrm{ref}}=4.3^{+0.1}_{-0.2}$ as measured from a disc galaxy simulation at Solar metallicity (Hygate et al., 2019). For the star formation maps derived from YSOs with ages $<2$ Myr and $<5$ Myr, we set $t_{\mathrm{ref}}$ to 2 and 5 Myr, respectively. We report the results returned by Heisenberg in Table 1.

Tracer	$\lambda$	$t_{\mathrm{gas}}$	$t_{\mathrm{over}}$	$t_{\mathrm{ref}}$
	(pc)	(Myr)	(Myr)	(Myr)
H $\alpha$	$103.2^{+58.3}_{-19.7}$	$27.6^{+15.8}_{-4.7}$	$8.4^{+6.0}_{-2.0}$	$4.3^{+0.1}_{-0.2}$
YSO < 5 Myr	$51.1^{+19.3}_{-8.6}$	$33.2^{+31.6}_{-7.9}$	$10.6^{+12.4}_{-3.3}$	$5.0$
YSO < 2 Myr	$38.5^{+10.8}_{-5.9}$	$141.7^{+33.0}_{-48.4}$	$30.2^{+8.7}_{-10.5}$	$2.0$

Table 1: heisenberg fitting results for the tuning fork diagrams shown in Figure 8. The first column lists the SFR tracer used, the middle three columns are the fitted physical quantities describing mean cloud separation, duration of gas-only phase, and duration of gas-star overlap phase, and the last column provides the star-only phase duration

t_{\mathrm{ref}}

, which is used as the reference time scale in fitting.

From this table it is obvious that when the SFR is traced by YSOs with ages $<5$ Myr instead of H $\alpha$ emission, the fitted value of $\lambda$ decreases by $\sim 50\%$ , and the duration of the “overlap” phase increases by $\sim 25\%$ . The effect is even more dramatic if we consider YSOs with ages $<2$ Myr, but the uncertainties are extremely large, and it is possible that the method itself breaks down in this case because for these young YSOs the upper branch of the tuning fork does not open at all, contradicting a basic assumption of the method. Nonetheless, this analysis makes clear that the very short overlap time inferred by previous studies is substantially biased by photon transport effects in H $\alpha$ , and that molecular cloud dispersal is perhaps not quite as fast as had been supposed. It is worth noting that this result helps resolve a longstanding tension: existing measurements from CO-H $\alpha$ decorrelation seem to suggest that molecular clouds have an extended quiescent phase when there are no massive stars present, followed by a very rapid dispersal phase once such stars appear, but this is hard to reconcile with the observation that starless molecular clouds in the Milky Way are exceedingly rare. Our finding suggests a resolution: the dispersal time has been significantly underestimated because much of the CO-H $\alpha$ decorrelation is due not to cloud dispersal, but to ionizing photons produced by stars at ages $\approx 2-5$ Myr traveling far from their parent stars before being absorbed and giving rise to H $\alpha$ emission. This picture is also consistent with the recent discovery by Calzetti et al. (2023) using James Webb Space Telescope (JWST) of a substantial population of star clusters with ages $\gtrsim 5$ Myr that are still highly dust-embedded.

4.3 SFR calibration model

In Section 4.1, we find that in regions where YSOs are present in regions of high gas column density, H $\alpha$ luminosity does approximately trace the recent star formation rate in the 100-pc scale. However, direct identification of extragalactic YSOs is unavailable for all but the few nearest galaxies due to resolution limits, making it impossible to directly apply such criteria to study cloud star formation rate with H $\alpha$ emission. Nevertheless, calibration of the H $\alpha$ tracer to YSOs is still possible because of the strong correlation between molecular clouds and star formation. In essence, if the cloud mass and area is large enough, we can safely assume the existence of embedded YSOs, and therefore assume that we are looking at a region where H $\alpha$ will be reliable. In our synthetic observations, every CO contour with a mean radius larger than 100 pc has YSOs within it, so we select such CO contours and calculate the total SFR within each of them using both H $\alpha$ and YSOs. Via this process we obtain a sample of 464 distinct CO contours, with a minimum total molecular gas mass of $\sim 10^{6}\;\rm M_{\odot}$ and a minimum SFR_Hα of $\sim 10^{-3}\;\rm M_{\odot}yr^{-1}$ . The latter limit is also helpful, because this corresponds to an ionising photon flux of $\sim 10^{49}$ yr^-1, about the luminosity of a single O type star. Therefore, this is roughly also the limit below which we expect stochasticity to foil measurements of the SFR from H $\alpha$ even with limited photon transport effects – an intuition confirmed by detailed slug simulations by da Silva et al. (2014). Thus by selecting contiguous regions of significant CO emission with radii $>100$ pc, we ensure that these regions almost certainly contain YSOs, and almost certainly contain enough massive stars that stochastic IMF sampling has minimal effects.

We plot the measured SFR_YSO vs. SFR_Hα for our contours in Figure 9. We find that the relation between the two tracers is well-fit by a simple linear (on log scale) model

\log_{10}\frac{\mathrm{SFR_{YSO}}}{\mathrm{M}_{\odot}\;\mathrm{yr}^{-1}}=k\log_{10}\frac{\mathrm{SFR_{H\alpha}}}{\mathrm{M}_{\odot}\;\mathrm{yr}^{-1}}+b.

(8)

The fit coefficients are $k=0.71$ and $b=-0.33$ , and the $R^{2}$ value is 0.81, indicating a strong linear relation. This relation can be applied in observation to calibrate the star formation rate of individual clouds where a local YSO catalog is unavailable. Although this relation is limited to large molecular clouds, this is not a severe limitation for extragalactic studies, which commonly achieve spatial resolutions of only $\approx 100$ pc (e.g., Chevance et al., 2020), close to our cloud radius limit. Note that we did not consider the effect of dust attenuation in our post-analysis, we assume that we observe the total H $\alpha$ emission. For application to real observations one must first dust-correct the H $\alpha$ luminosity (e.g., Kennicutt & Evans, 2012; Tacchella et al., 2022).

4.4 Future work on nearby galaxies

The results in this work are derived from a simulation of a Milky Way-like galaxy. It is unclear to what extent these results will differ for galaxies of very different star formation rates and sizes, where different stellar feedback effects may dominate compared to those in Milky Way-like systems; the properties of the interstellar medium and dust may also differ between galaxies, which can also affect the transmission of ionizing photons.

One approach to mitigate this is of course to extend our simulations to different galaxy types. However, it is also essential to test our results by comparing real high resolution H $\alpha$ luminosity maps with YSO distribution maps. At present the only viable options for such a test are the “Local Group” galaxies, including the Large and Small Magellanic Clouds (LMC and SMC), and the M33. H $\alpha$ maps of these galaxies are already available at sub-arcsec resolution, corresponding to physical scale of $\sim 0.2$ pc. The limiting factor is YSOs: the current YSO catalog for the LMC, for example, is still limited to several small regions. Using Spitzer IR survey, previous works have (Indebetouw et al., 2008; Chen et al., 2009, 2010) identified YSOs from the LMC’s molecular ridge and two H ii complexes. However, the observed area in total is less than 1 kpc², and thus not sufficient for systematic comparison of H $\alpha$ and YSOs across scales. Moreover, their YSO mass detection lower limit varies between 3 M_⊙ and 8 M_⊙, making SFR estimates strongly depend on both complex completeness calculations and assumptions about the shape of underlying IMF, especially at the low mass end. Nevertheless, high resolution observations empowered by JWST now have great potential to identify extragalactic YSOs. Peltonen et al. (2024) observe one spiral arm of M33, and find 793 YSOs with a lower mass limit of $\sim 6$ M_⊙. For closer galaxies like the LMC, JWST should be able to detect YSOs down to sub-Solar masses. If identified from the whole galaxy, these YSO catalogs will enable a repetition of our simulation analysis here on real data from a suite of nearby galaxies, which can help us better understand the effect of different galactic environments on GMC life cycles. In the future, we plan to run simulations of these galaxies suitable for direct comparisons to such observations.

5 Conclusion

In this work, we investigate the discrepancy between estimates of galaxy star formation rates based on H $\alpha$ emission and direct measurements from counts of young stellar objects (YSOs), on spatial scales from a whole galaxy to smaller than individual molecular clouds. To this end we first perform high resolution magneto-hydrodynamic (MHD) simulations of a Milky Way-like galaxy that can resolve cloud-scale structures, with realistic star formation and stellar feedback algorithms, in particular including a treatment of photoionization feedback that allows us to track the locations of ionized gas. Then we post-process the simulation outputs to derive local chemical compositions, non-LTE level populations, and emissivities of both H $\alpha$ and CO $J=1\rightarrow 0$ emission lines. We use these to produce synthetic H $\alpha$ and CO emission maps, and compare the former to maps of the true star formation rate as traced by YSOs born with ages $<2$ Myr. We confirm the reliability of our stellar feedback treatment by showing that our simulations reproduce the so-called “tuning fork” diagram for observed Milky Way-like galaxies, which quantifies the degree of small-scale decorrelation between CO and H $\alpha$ maps of galaxies.

We find that the H $\alpha$ - and YSO-based SFR maps have good agreement on kpc scales, but exhibit significant spatial mismatch on sub-100 pc scales. We show that the primary cause for this discrepancy is the leakage of ionizing photons from H ii regions, especially in low column density regions, with a subdominant contribution due to stars with ages of $\approx 2-5$ Myr drifting away from their birth sites. We therefore conclude that on sub-100 pc scales, H $\alpha$ should be used as a star formation tracer only with extreme caution, and we show that tuning fork diagrams derived from true YSO positions rather than H $\alpha$ emission differ significantly in a way that may have led prior analyses to substantially underestimate the time it takes for newborn stars to disperse their parent clouds.

However, we also show that H $\alpha$ emission does become a reasonably reliable tracer of star formation in regions of high gas density, $N_{\mathrm{H}}\gtrsim 3\times 10^{21}$ cm^-2 when averaged over 100 pc scales, because in such regions ionizing photons are absorbed close to their emission sites and leakage has minimal effects. We propose a calibration model that exploits this result and allows reliable measurements of the SFRs of large molecular regions ( $r\gtrsim 100$ pc) from H $\alpha$ luminosity. Future high-resolution observations powered by JWST are essential to test our findings and refine this calibration model, and allow more accurate measurements of star formation rates across different environments.

Acknowledgements

We thank the anonymous referee for a helpful and constructive report that improved this paper. We acknowledge Dr. Diederik Kruijssen and Dr. Mélanie Chevance for suggestions on running the heisenberg code. The authors acknowledge funding from the Australian Research Council through its Future Fellowship and Laureate Fellowship funding schemes, awards FT180100375 and FL220100020, as well as its Discovery Projects scheme (award DP230102280), the Australia-Germany Joint Research Cooperation Scheme (UA-DAAD), and high-performance computing resources provided by the Australian National Computational Infrastructure (grants jh2 and ek9) and the Pawsey Supercomputing Centre (grant pawsey0810) through the National and ANU Computational Merit Allocation Schemes, and by the Leibniz Rechenzentrum and the Gauss Centre for Supercomputing (grants pr32lo, pr48pi and GCS Large-scale project 10391).

Data Availability

The data underlying this article will be shared upon reasonable request to the corresponding author.

References

Arora et al. (2021) Arora R., Krumholz M. R., Federrath C., 2021, MNRAS, 508, 3290
Belfiore et al. (2018) Belfiore F., et al., 2018, MNRAS, 477, 3014
Belfiore et al. (2022) Belfiore F., et al., 2022, A&A, 659, A26
Bolatto et al. (2013) Bolatto A. D., Wolfire M., Leroy A. K., 2013, ARA&A, 51, 207–268
Calzetti et al. (2023) Calzetti D., et al., 2023, ApJ, 946, 1
Caplar & Tacchella (2019) Caplar N., Tacchella S., 2019, MNRAS, 487, 3845–3869
Chabrier (2005) Chabrier G., 2005, in Corbelli E., Palla F., Zinnecker H., eds, Astrophysics and Space Science Library Vol. 327, The Initial Mass Function 50 Years Later. p. 41, doi:10.1007/978-1-4020-3407-7_5
Chen et al. (2009) Chen C. H. R., Chu Y.-H., Gruendl R. A., Gordon K. D., Heitsch F., 2009, ApJ, 695, 511
Chen et al. (2010) Chen C. H. R., et al., 2010, ApJ, 721, 1206
Chevance et al. (2020) Chevance M., et al., 2020, MNRAS, 493, 2872
Chevance et al. (2021) Chevance M., et al., 2021, MNRAS, 509, 272–288
Chevance et al. (2023) Chevance M., Krumholz M. R., McLeod A. F., Ostriker E. C., Rosolowsky E. W., Sternberg A., 2023, in Inutsuka S., Aikawa Y., Muto T., Tomida K., Tamura M., eds, Astronomical Society of the Pacific Conference Series Vol. 534, Protostars and Planets VII. p. 1 (arXiv:2203.09570), doi:10.48550/arXiv.2203.09570
Draine (2011) Draine B. T., 2011, Physics of the Interstellar and Intergalactic Medium. Princeton University Press
Ellison et al. (2017) Ellison S. L., Sánchez S. F., Ibarra-Medel H., Antonio B., Mendel J. T., Barrera-Ballesteros J., 2017, MNRAS, 474, 2039–2054
Ferland et al. (1998) Ferland G. J., Korista K. T., Verner D. A., Ferguson J. W., Kingdon J. B., Verner E. M., 1998, PASP, 110, 761
Flores Velázquez et al. (2020) Flores Velázquez J. A., et al., 2020, MNRAS, 501, 4812–4824
Fujimoto et al. (2019) Fujimoto Y., Chevance M., Haydon D. T., Krumholz M. R., Kruijssen J. M. D., 2019, MNRAS, 487, 1717
Girardi et al. (2000) Girardi L., Bressan A., Bertelli G., Chiosi C., 2000, A&AS, 141, 371
Glover et al. (2010) Glover S. C. O., Federrath C., Mac Low M. M., Klessen R. S., 2010, MNRAS, 404, 2
Gregg et al. (2024) Gregg B., et al., 2024, arXiv e-prints, p. arXiv:2405.09667
Haydon et al. (2020) Haydon D. T., Kruijssen J. M. D., Chevance M., Hygate A. P. S., Krumholz M. R., Schruba A., Longmore S. N., 2020, MNRAS, 498, 235
Hopkins (2015) Hopkins P. F., 2015, MNRAS, 450, 53–110
Hopkins (2016) Hopkins P. F., 2016, MNRAS, 462, 576
Hopkins & Raives (2016) Hopkins P. F., Raives M. J., 2016, MNRAS, 455, 51
Hopkins et al. (2018a) Hopkins P. F., et al., 2018a, MNRAS, 477, 1578
Hopkins et al. (2018b) Hopkins P. F., et al., 2018b, MNRAS, 480, 800
Hu et al. (2022) Hu Z., Krumholz M. R., Pokhrel R., Gutermuth R. A., 2022, MNRAS
Hu et al. (2023) Hu Z., Wibking B. D., Krumholz M. R., 2023, MNRAS, 521, 5604–5615
Hygate et al. (2019) Hygate A. P. S., Kruijssen J. M. D., Chevance M., Schruba A., Haydon D. T., Longmore S. N., 2019, MNRAS, 488, 2800
Indebetouw et al. (2008) Indebetouw R., et al., 2008, AJ, 136, 1442
Jeffreson et al. (2021) Jeffreson S. M. R., Krumholz M. R., Fujimoto Y., Armillotta L., Keller B. W., Chevance M., Kruijssen J. M. D., 2021, MNRAS, 505, 3470
Jeffreson et al. (2024) Jeffreson S. M. R., Semenov V. A., Krumholz M. R., 2024, MNRAS, 527, 7093
Jin et al. (2022) Jin Y., Kewley L. J., Sutherland R. S., 2022, ApJ, 934, L8
Kado-Fong et al. (2020) Kado-Fong E., Kim J.-G., Ostriker E. C., Kim C.-G., 2020, ApJ, 897, 143
Kennicutt (1983) Kennicutt R. C. J., 1983, ApJ, 272, 54
Kennicutt & Evans (2012) Kennicutt R. C., Evans N. J., 2012, ARA&A, 50, 531–608
Kim et al. (2022) Kim J., et al., 2022, MNRAS, 516, 3006
Kruijssen & Longmore (2014) Kruijssen J. M. D., Longmore S. N., 2014, MNRAS, 439, 3239
Kruijssen et al. (2018) Kruijssen J. M. D., Schruba A., Hygate A. P. S., Hu C.-Y., Haydon D. T., Longmore S. N., 2018, MNRAS, 479, 1866
Kruijssen et al. (2019) Kruijssen J. M. D., et al., 2019, Nature, 569, 519–522
Krumholz (2014) Krumholz M. R., 2014, MNRAS, 437, 1662
Krumholz et al. (2015) Krumholz M. R., Fumagalli M., da Silva R. L., Rendahl T., Parra J., 2015, MNRAS, 452, 1447
Krumholz et al. (2019) Krumholz M. R., McKee C. F., Bland-Hawthorn J., 2019, ARA&A, 57, 227
Leitherer et al. (1999) Leitherer C., et al., 1999, ApJS, 123, 3
Mathis (1986) Mathis J. S., 1986, ApJ, 301, 423
McClymont et al. (2024) McClymont W., et al., 2024, MNRAS, 532, 2016
Meynet et al. (1994) Meynet G., Maeder A., Schaller G., Schaerer D., Charbonnel C., 1994, A&AS, 103, 97
Ochsendorf et al. (2017) Ochsendorf B. B., Meixner M., Roman-Duval J., Rahman M., Evans II N. J., 2017, ApJ, 841, 109
Peltonen et al. (2024) Peltonen J., et al., 2024, MNRAS, 527, 10668
Peters et al. (2017) Peters T., et al., 2017, MNRAS, 466, 3293
Pflamm-Altenburg & Kroupa (2008) Pflamm-Altenburg J., Kroupa P., 2008, Nature, 455, 641
Pokhrel et al. (2020) Pokhrel R., et al., 2020, ApJ, 896, 60
Sánchez et al. (2012) Sánchez S. F., et al., 2012, A&A, 538, A8
Schaller et al. (1992) Schaller G., Schaerer D., Meynet G., Maeder A., 1992, A&AS, 96, 269
Sembach et al. (2000) Sembach K. R., Howk J. C., Ryans R. S. I., Keenan F. P., 2000, ApJ, 528, 310–324
Semenov et al. (2021) Semenov V. A., Kravtsov A. V., Gnedin N. Y., 2021, ApJ, 918, 13
Shivaei et al. (2015) Shivaei I., Reddy N. A., Steidel C. C., Shapley A. E., 2015, ApJ, 804, 149
Smith et al. (2017) Smith B. D., et al., 2017, MNRAS, 466, 2217
Solomon & Vanden Bout (2005) Solomon P., Vanden Bout P., 2005, ARA&A, 43, 677–725
Sukhbold et al. (2016) Sukhbold T., Ertl T., Woosley S. E., Brown J. M., Janka H. T., 2016, ApJ, 821, 38
Tacchella et al. (2022) Tacchella S., et al., 2022, MNRAS, 513, 2904–2929
Tritsis et al. (2022) Tritsis A., Federrath C., Willacy K., Tassis K., 2022, MNRAS, 510, 4420
Weisz et al. (2012) Weisz D. R., et al., 2012, ApJ, 744, 44
Wibking & Krumholz (2023) Wibking B. D., Krumholz M. R., 2023, MNRAS, 521, 5972
Wood et al. (2010) Wood K., Hill A. S., Joung M. R., Mac Low M.-M., Benjamin R. A., Haffner L. M., Reynolds R. J., Madsen G. J., 2010, ApJ, 721, 1397–1403
da Silva et al. (2012) da Silva R. L., Fumagalli M., Krumholz M., 2012, ApJ, 745, 145
da Silva et al. (2014) da Silva R. L., Fumagalli M., Krumholz M. R., 2014, MNRAS, 444, 3275

On the limitations of Hα\alpha luminosity as a star formation tracer in spatially resolved observations