Population Models for Star Formation Timescales in Early Galaxies: The First Step Towards Solving Outshining in Star Formation History Inference

Bingjie Wang (王冰洁) Department of Astronomy & Astrophysics, The Pennsylvania State University, University Park, PA 16802, USA Institute for Computational & Data Sciences, The Pennsylvania State University, University Park, PA 16802, USA Institute for Gravitation and the Cosmos, The Pennsylvania State University, University Park, PA 16802, USA bwang@psu.edu Joel Leja Department of Astronomy & Astrophysics, The Pennsylvania State University, University Park, PA 16802, USA Institute for Computational & Data Sciences, The Pennsylvania State University, University Park, PA 16802, USA Institute for Gravitation and the Cosmos, The Pennsylvania State University, University Park, PA 16802, USA joel.leja@psu.edu Hakim Atek Institut d’Astrophysique de Paris, CNRS, Sorbonne Université, 98bis Boulevard Arago, 75014, Paris, France hakim.atek@iap.fr Rachel Bezanson Department of Physics & Astronomy and PITT PACC, University of Pittsburgh, Pittsburgh, PA 15260, USA rachel.bezanson@pitt.edu Emilie Burnham Department of Astronomy & Astrophysics, The Pennsylvania State University, University Park, PA 16802, USA efb5552@psu.edu Pratika Dayal Kapteyn Astronomical Institute, University of Groningen, P.O. Box 800, 9700 AV Groningen, The Netherlands p.dayal@rug.nl Robert Feldmann Department of Astrophysics, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland robert.feldmann@uzh.ch Jenny E. Greene Department of Astrophysical Sciences, Princeton University, Princeton, NJ 08544, USA jgreene@astro.princeton.edu Benjamin D. Johnson Center for Astrophysics

|

Harvard & Smithsonian, Cambridge, MA 02138, USA benjamin.johnson@cfa.harvard.edu Ivo Labbe Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Melbourne, VIC 3122, Australia ilabbe@swin.edu.au Michael V. Maseda Department of Astronomy, University of Wisconsin-Madison, Madison, WI 53706, USA maseda@astro.wisc.edu Themiya Nanayakkara Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Melbourne, VIC 3122, Australia wnanayakkara@swin.edu.au Sedona H. Price Department of Physics & Astronomy and PITT PACC, University of Pittsburgh, Pittsburgh, PA 15260, USA sedona.price@pitt.edu Katherine A. Suess Department for Astrophysical and Planetary Science, University of Colorado, Boulder, CO 80309, USA wren.suess@colorado.edu John R. Weaver Department of Astronomy, University of Massachusetts, Amherst, MA 01003, USA jweaver@astro.umass.edu Katherine E. Whitaker Department of Astronomy, University of Massachusetts, Amherst, MA 01003, USA Cosmic Dawn Center (DAWN), Niels Bohr Institute, University of Copenhagen, Jagtvej 128, København N, DK-2200, Denmark kwhitaker@astro.umass.edu

Abstract

JWST have revealed temporarily-quenched and ultraviolet-luminous galaxies in the early universe, suggesting enhanced star formation stochasticity. Verifying this hypothesis is critical, yet challenging. Outshining, wherein light from young stars dominates the spectral energy distribution, represents perhaps the greatest challenge in inferring the formation histories of unresolved galaxies. In this paper, we take a simple model of burstiness and show that state-of-the-art inference methods with flexible star formation histories (SFHs) and neutral priors, while recovering average star formation rates (SFRs; $\sim 0.1$ dex median offset), fail to recover the complexities of fluctuations on tens of Myr timescales, and typically underestimate masses in bursty systems ( $\sim 0.15$ dex). Surprisingly, detailed SFH recovery is still sensitive to priors even when data quality is optimal, e.g., including high signal-to-noise ( $\rm 20~pixel^{-1}$ ) spectroscopy with wide coverage (rest-frame $0.12-1.06~\mu$ m). Crucially, however, refitting the same data with a prior correctly encoding the bursty expectation eliminates these biases: median offsets in mass and SFRs decrease to $\sim 0.04$ dex and $\sim 0.05$ dex, respectively. Under the assumption that current population burstiness predicts past SFH, the solution to outshining in modeling statistical samples is empirically measuring recent galaxy SFHs with population modeling. A prototype is H $\alpha$ /UV: while helpful, it is insufficient to constrain the expected complex burstiness. To this end, we introduce a more complete, quantitative population-level approach and demonstrate that it promises to recover the typical amplitude, timescale, and slope of the recent SFH to high accuracy. This approach thus has the strong potential to solve outshining using observations from JWST.

Galaxy evolution (594) – Galaxy formation (595) – Post-starburst galaxies (2176) – Spectral energy distribution (2129) – Starburst galaxies (1570) – Star formation (1569)

^†^†software: Dynesty (J. S. Speagle, 2020), Matplotlib (J. D. Hunter, 2007), Nautilus (J. U. Lange, 2023), NumPy (C. R. Harris et al., 2020), Prospector (B. D. Johnson et al., 2021), SciPy (P. Virtanen et al., 2020).

1 Introduction

The growth of galaxies is regulated by physical processes operating across different timescales, ranging from the creation and destruction of giant molecular clouds at below $\sim$ 10 Myr to quenching at over $\sim 10^{3}$ Myr (see C. F. McKee & E. C. Ostriker 2007 for a theoretical overview). Understanding the observed diversity of the galaxy populations—a cumulative effect of these intertwined individual processes—has long been a focus in the field of galaxy formation and evolution.

Theoretical studies based on cosmological simulations offer a promising approach to dissect the formation and evolution of galaxies, but face the challenge of the wide dynamic range required to describe the physics involved. While models agree on a core set of critical processes shaping galaxy properties, their phenomenological implementations (sub-grid physics) differ, leading to a dispersion in model predictions (R. S. Somerville & R. Davé, 2015; T. Naab & J. P. Ostriker, 2017). Observationally, the imprints of these individual processes are anticipated to shape the star formation histories (SFHs) of galaxies, with each process exerting its influence over distinct characteristic timescales (e.g., K. G. Iyer et al. 2020). This means that by constraining the variability of SFHs, it would be possible to disentangle the various mechanisms driving or suppressing star formation.

Recent breakthroughs facilitated by the James Webb Space Telescope (JWST) suggest a staggering diversity of inferred SFHs within galaxy populations at early times, reigniting interest in the study of its variability (e.g., T. J. Looser et al. 2025; C. Witten et al. 2025). The discovery of mini-quenched galaxies at early epochs (T. J. Looser et al., 2024; V. Strait et al., 2023) and the over-abundance of luminous high-redshift galaxies have challenged theoretical models (e.g., Y. Harikane et al. 2023; C. A. Mason et al. 2023). Large amplitude, short timescale fluctuations in star formation rates (SFRs) have been suggested in $z>1$ studies (A. van der Wel et al., 2011; H. Atek et al., 2011; M. V. Maseda et al., 2018). This phenomenon may be prominent in the gas-rich $z>4$ universe, and has been proposed to explain these exotic objects (A. Pallottini & A. Ferrara, 2023; X. Shen et al., 2023; G. Sun et al., 2023b; A. L. Faisst & T. Morishita, 2024), in addition to several other theoretical possibilities, including an evolving initial mass function (IMF; E. R. Cueto et al. 2024; P. van Dokkum & C. Conroy 2024; L. Y. A. Yung et al. 2024), a decrease in dust attenuation with increasing redshift (V. Mauerhofer & P. Dayal, 2023; A. Ferrara, 2024), a high efficiency of gas conversion into stars (A. Dekel et al., 2023; A. Renzini, 2023; J. Sipple & A. Lidz, 2024), and black hole contribution to the UV luminosity (Y. Ono et al., 2018; F. Pacucci & A. Loeb, 2022). This stochasticity is predicted by theoretical models of galaxy formation at high redshift as the feedback timescale approaches or becomes smaller than the dynamical time of the system, disrupting the otherwise simple correlation between smooth gas accretion and SFRs (G. S. Stinson et al., 2007; S. Tacchella et al., 2016; R. Feldmann et al., 2017; C.-A. Faucher-Giguère, 2018; L. Bassini et al., 2023; P. F. Hopkins et al., 2023a; T. Dome et al., 2024). Short-lived and frequent deviations from an “average” SFR would naturally explain both unexpected low-mass quiescent systems as post-starbursts and overly luminous galaxies as starbursts.

The standard methodology for constraining SFH variability is via H $\alpha$ -to-UV flux ratios, as the two are expected to trace star formation on different timescales (D. R. Weisz et al., 2012; B. D. Johnson et al., 2013; M. Sparre et al., 2017; N. Caplar & S. Tacchella, 2019; A. L. Faisst et al., 2019; J. A. Flores Velázquez et al., 2021). H $\alpha$ emission arises from recombination in nebular clouds, primarily driven by O and B-type stars on timescales of $\lesssim$ 5 or 10 Myr. UV emission is generated by O and B-type stars with lifetimes of $\lesssim 300$ Myr (J. Kennicutt, 1998), although due to outshining of the older stellar light by recent star formation (e.g., C. Papovich et al. 2001), the UV emission in star-forming galaxies typically traces $30-70$ Myr timescales. Pre-JWST studies at $z<3$ have demonstrated the feasibility of this method (e.g., J. C. Lee et al. 2009; N. Emami et al. 2019), but it faces two main challenges: (i) a purely rising SFH would lead to maximal H $\alpha$ and UV fluxes similar to a bursty SFH (V. Mehta et al., 2023); (ii) translating the H $\alpha$ -to-UV flux ratios to timescales is difficult as the timescale probed by the UV is a strong function of the intrinsic SFH, meaning that every object would need to be uniquely and explicitly modeled (e.g., B. D. Johnson et al. 2013; J. A. Flores Velázquez et al. 2021).

In principle, one can solve both these problems by performing individual spectral energy distribution (SED) fits instead. Indeed, with advancements in panchromatic SED fitting techniques, numerous studies attempt to estimate SFH variability by fitting photometric observations from deep JWST surveys (e.g., L. Ciesla et al. 2024), in addition to using the H $\alpha$ -to-UV flux ratios (Y. Asada et al., 2024; R. Endsley et al., 2024). However, the challenge here is that the inference of SFHs from individual SED fitting of photometric data, at least in broad-band photometry, cannot distinguish between an “extremely bursty” and a “smooth” model. B. Wang et al. (2024a) has demonstrated that both models can produce statistically acceptable solutions, as the Bayes factor shows no preference for either model.

Fundamentally, the challenge is in solving outshining. To infer SFH from integrated light is to infer the past from the present, but with asymmetric information: a recent burst of star formation dramatically outshines the older, fainter stellar populations, even in the near-infrared where old stars are relatively brightest. This prevents an accurate inference of the full SFH, introducing biases in the stellar mass and SFR estimates. C. Papovich et al. (2001) analyzed the HST photometric data of a sample of Lyman-break galaxies at $2\lesssim z\lesssim 3.5$ , and found that a hypothetical old stellar population could in principle contain up to $\approx 3-8$ times the stellar mass of the young stars that dominate the observed SED; B. Wang et al. (2024a) showed that highly stochastic SFHs constrained only by broadband photometry produce $\sim 0.8$ dex systematics in SFR averaged over a short timescale and $\sim 0.3$ dex systematics in average stellar age. Recent studies performed on cosmological simulations also hint at systematic biases in inferred stellar masses (e.g., D. Narayanan et al. 2024; R. K. Cochrane et al. 2025). Owing to the expected increasing prevalence of burstiness at $z\gtrsim 4$ as outlined previously, and paired with the fact that the SFH is degenerate with most key parameters measured from galaxy SEDs, outshining is perhaps the biggest challenge in interpreting the light from galaxies in the early universe.

In addition, galaxy brightness (and therefore detectability in a flux-limited survey) and burstiness are inherently interconnected, as a recent burst makes galaxies bright. It follows that the duty cycle of the SFH variation would be biased in a flux-limited sample; that is, galaxies without a burst at the low-mass end (over a range of $10-100$ in stellar mass according to a study done on the FIRE simulation; G. Sun et al. 2023a) are preferentially missed. This selection effect has broad implications for statistical studies of the galaxy population such as the mass assembly history and global SFH.

All of the above factors motivate the need for a thorough examination of the extent to which burstiness can be observationally constrained. In this paper, we take a simple model of burstiness—essentially, repeated up-and-down fluctuations in the SFH—and evaluate the accuracy of recovered SFHs for bursty systems inferred with various methods, including state-of-the-art SED fitting of simulated high signal-to-noise (S/N) spectra, and population-level analyses of spectral features. The latter is motivated by recent developments in making the population distributions of the parameters of interest, such as photometric redshift or stellar population parameters, as the inference objective, bypassing individual fits (J. Alsing et al., 2024; J. Li et al., 2024). All tests have a quantitative target for recovery, enabled by our parameterization of the SFH. This work provides a comprehensive assessment of burstiness constraints, and critically, sets out a potential path forward for solving outshining.

The structure of this paper is as follows. Section 2 presents the SFH model and mock observations. Section 3 focuses on inferring SFH variations from individual galaxies, using common setups in SED fitting. Section 4 proceeds to show the improvement on these constraints given a correct SFH prior. Section 5 assesses the performance of the widely used population-level burstiness indicator, the H $\alpha$ -to-UV flux ratio. Section 6 proposes our population-level approach to complement the H $\alpha$ -to-UV method, and demonstrates its additional constraining power. Section 7 ties all the pieces together to address outshining, and also provides additional discussions on the individual SED fits including the prior influences, and the observability and selection effects due to SFH variations. We conclude in Section 8.

Where applicable, we adopt the best-fit cosmological parameters from the WMAP 9 yr results: $H_{0}=69.32$ ${\rm km\,s^{-1}\,Mpc^{-1}}$ , $\Omega_{M}=0.2865$ , and $\Omega_{\Lambda}=0.7135$ (G. Hinshaw et al., 2013), and a G. Chabrier (2003) IMF over the mass range of $0.08-120~{\rm M}_{\odot}$ . Unless otherwise mentioned, we report the median of the posterior, and 1 $\sigma$ error bars are the 16th and 84th percentiles.

2 Simulations

2.1 Models of Star Formation History

Figure 1: Models of star-formation history. (Left) Two examples of our parameterization of the population-level SFH parameters. One SFH simulated with a fluctuation amplitude

\sigma=0.3

dex around the star forming sequence (gray dash-dotted line), a slowly rising slope,

\alpha=-0.40

, and a short duration of the high/low SFR phase,

\delta t=20

Myr, is plotted in light brown. One SFH simulated with a greater fluctuation amplitude, duration, and a steeply rising slope is plotted in blue. (Middle) The distributions of sSFRs averaged over one duration, for galaxy population observed at different phases, are shown as histograms in the corresponding colors. (Right) A SFH from the FIREbox simulation (R. Feldmann et al., 2023), and a parameterized SFH generated for this paper are shown in black and orange, respectively. While our parameterization of the SFH is simple, it captures the key features in many of the FIREbox SFHs. This simplified model provides a clear basis to evaluate the different inference methods. A second SFH, also from FIREbox but with larger fluctuating amplitudes, is over-plotted in gray. The bursts in this example are

\sim 2

dex peak-to-peak, illustrating that the range of the fluctuating amplitude explored in this work is well motivated.

Figure 2: Variations in the F277W and F444W fluxes due to SFH variations alone, with variations due to mass, dust, and redshift removed. For each filter, we include two panels showing the distributions of fluxes from SFHs with a slowly rising slope (

\alpha=-0.40

), and a steeply rising slope (

\alpha=-0.65

). The four curves in each panel reflect the changes in fluxes due to various fluctuating amplitudes and durations. For rising SFHs with the same slope, the varying fluctuating amplitudes and durations alone produce over 2 (1.5) magnitudes of variability in the observed rest-frame optical (near-IR) fluxes.

Bursty star formation, defined as repeated up-and-down fluctuations in an otherwise smoothly-varying SFH, is a generic prediction in various simulations of galaxy formation (e.g., G. S. Stinson et al. 2007; F. Governato et al. 2012; P. F. Hopkins et al. 2014; C.-A. Faucher-Giguère 2018; L. Legrand et al. 2022). An example from the FIREbox cosmological volume simulation (R. Feldmann et al., 2023) is shown in Figure 1. Motivated by these SFHs, we develop a simple parameterization, emulating the recurrent “on” and “off” phases in star formation. It is defined by the following population-level SFH parameters, describing the population distribution from which individual galaxies can be drawn for testing purposes.

1.

$\sigma$ : amplitude of fluctuations in specific star-formation rates (sSFR) around the star-forming sequence, measured in dex.
2.

$\delta t$ : duration of the high/low SFR phase, measured in linear time.
3.

$\alpha$ : the power-law slope of the SFR during the most recent 500 Myr in lookback time, i.e., $\log({\rm SFR})\propto\alpha\log(t)$ .
4.

$\phi$ : phase; $\in$ [0, 2 $\pi$ ], with 0 ( $\pi$ ) referring to the instant when the galaxy is observed at the end of a relatively quiescent (star-forming) period.

The above set of parameters also specifies the total mass formed in the most recent 500 Myr. The choice of 500 Myr is arbitrary, and the results do not depend on this particular choice as long as the time scale is sufficiently long ( $\gtrsim 200$ Myr). As we would like to remove the effect on the simulations due to varying galaxy masses, while reaching a target sSFR corresponding to the scatter in the observed star-forming main sequence (e.g., K. E. Whitaker et al. 2012; J. S. Speagle et al. 2014; J. Leja et al. 2022), we locate any remaining mass in the last bin.

While the above parameterization is simple, it effectively captures the key features of the oscillating patterns present in many FIREbox SFHs. Certainly, the FIREbox simulation predicts SFHs with more complex forms that cannot be fully described by our model. More generally, different cosmological simulations produce distinct SFHs with power at different timescales (K. G. Iyer et al., 2020), largely due to different feedback implementations. It is worth emphasizing that the purpose of our model is not to replicate SFHs from cosmological simulations. Instead, the simplicity of our parameterization is an intentional choice, providing a clear basis from which the inference of SFH variations from observed SEDs via different methods can be straightforwardly evaluated and compared. As suggested in Figure 1, this model is sufficient for our testing purposes. We will explore more flexible formalisms in the future.

As a pilot study, we explore a few representative points in the parameter space, bracketing a range guided by observations and simulations. These choices are summarized in Figure 1, and explained in more detail below.

1.

$\sigma$ = 0.3, 0.8 dex. $\sigma=0.3$ dex represents the population of “normal” star-forming galaxies, motivated by the 0.3 dex scatter seen in the star-forming sequence (K. E. Whitaker et al., 2012; J. S. Speagle et al., 2014; J. Leja et al., 2022). $\sigma=0.8$ dex resembles a burstier case. Note that this parameter represents (the half-width of) the peak-to-peak fluctuations in sSFR(t), i.e., not smoothed over some timescale. Its amplitude can thus be greater than the observed scatter in the star-forming sequence, and we use $0.8$ dex as an illustrative contrasting case to the smaller $\sigma$ value.
2.

$\delta t$ = 20, 40 Myr. T. Dome et al. (2024) find variations in simulations are typically on the order of 40 Myr, whereas M. V. Maseda et al. (2014) posit that the bursts need to be $<50$ Myr for stability purposes, we thus consider these two illustrative cases.
3.

$\alpha$ = $-0.40$ , $-0.65$ . As the SFRs of a galaxy have to rise with time in order for it to stay on the star-forming sequence, we consider one slowly and one steeply rising slope. $\alpha=-0.40$ ( $\alpha=-0.65$ ) roughly corresponds to a mass-doubling time of 120 (50) Myr.
4.

$\phi$ . We Monte Carlo $\phi$ over the full range of [0, 2 $\pi$ ], meaning that the galaxies are observed at random phases in our simulations.

Again, we stress that the goal of this work is to perform a strict recovery test; that is, to check whether the individual spectrum, or spectral features contain sufficient information to constrain burstiness. The results will not be dependent on whether the adopted SFH parameterization is a truthful representation of the population-level burstiness of real galaxies, as long as the models capture the spectral characteristics of bursty SFHs.

2.2 Mock Observations

Following the above parameterization, we simulate observations of galaxy populations using Prospector (B. D. Johnson et al., 2021). We adopt the MIST stellar isochrones (J. Choi et al., 2016; A. Dotter, 2016) and MILES stellar library (P. Sánchez-Blázquez et al., 2006) from FSPS (C. Conroy & J. E. Gunn, 2010). Stellar metallicity is drawn randomly in the range of $10^{-1.98}$ to $10^{0.19}~{\rm Z}_{\odot}$ . Although the focus is on high redshift, super-solar metallicities are included simply to allow for the mock observations to cover a wide parameter space. This is a conservative approach as the varying metallicities act to increase the scatter in timescale-sensitive features. Nebular emission is included using a pre-computed Cloudy (G. J. Ferland et al., 2017) grid (N. Byler et al., 2017). Gas-phase metallicity is fixed to $10^{-0.5}~{\rm Z}_{\odot}$ . The attenuation of the intergalactic medium (IGM) is assumed to follow P. Madau (1995). Dust attenuation is set to zero for simplicity—in practice, for star-forming galaxies with rest-optical spectra, it can be reliably inferred with H $\alpha$ /H $\beta$ .

All spectra are shifted to a common redshift of $z=4$ . This simplification is not expected to impact our results, since the goal is to test the constraining power of the data. Provided that key timescale-sensitive features, such as the Balmer break and Balmer emission lines, remain present, the exact redshift range is irrelevant for our purpose. The model spectra, examples of which are shown in Figure 3, are then smoothed with the instrumental line-spread function of the JWST/Prism spectroscopy from JDox¹¹1https://jwst-docs.stsci.edu. We include all the JWST NIRCam broad and medium-band photometry, and HST F435W, F606W and F814W filters. This setup is driven by the observing strategy of CANUCS (C. J. Willott et al., 2022) and UNCOVER (R. Bezanson et al., 2024; K. A. Suess et al., 2024; S. H. Price et al., 2025), which represents the best wavelength coverage for distant galaxies. We additionally test the constraining power from the longer wavelengths by including all the MIRI bands, the results of which are presented in the Appendix. Noise is modeled as Gaussian with S/N $=$ 20, again motivated by the feasibility of obtaining statistical samples with JWST. We simulate 50 galaxies observed at random times for each model to mitigate random measurement uncertainties and sample different phases.

As an illustration, the distributions of the observed F277W (corresponding to rest-frame optical) and F444W flux (corresponding to rest-frame near-infrared), for a galaxy population at the same redshift and mass, but different SFH variations, are shown in Figure 2. The SFH variations alone can produce over one order of magnitude of variability in the observed brightness, driven by the dim objects being in the quiescent phase whereas the bright objects are in the star-bursting phase.

It follows that star formation variability is critical to the interpretation of flux-limited samples. A lack of understanding in SFH variations translates into a selection bias, where we preferentially observe objects in a starburst. Seen from a different perspective, characterizing star formation variability is also of interest for studies attempting completeness corrections.

Figure 3: Examples of SFHs and the corresponding model spectra, illustrating the timescale-sensitive spectral features. SFHs simulated with a slowly rising slope of

\alpha=-0.40

, a fluctuation amplitude of

\sigma=0.3

dex around the star forming sequence (gray dotted line) and two durations of

\delta t=20

Myr and

\delta t=40

Myr are plotted in light and dark green, respectively in the upper left panel. The resulting model spectra are to the right, with the key observations identified in this paper annotated. The lower panel shows the simulated SFHs and model spectra in a similar format, only that the SFHs assume a steeply rising slope of

\alpha=-0.65

3 Constraining SFH Variations from Individual Fits

From Figure 3, it appears that different SFHs can have noticeable impacts on the resulting spectra. The most straightforward approach to estimating SFH variations then is via SED fitting of individual galaxies. We thus begin our analysis by joint photometric and spectral fitting of simulated galaxies, using a setup that follows the most common practices. For ease of comparison, this section contains the methods and results of this individual approach, with further discussions supplied in Section 7.3.

We note that the goal is to infer the population-level SFH parameters, as opposed to the usual integrated stellar population properties. For this reason, we conduct this experiment in three stages, described as follows.

3.1 SED Fitting

We perform individual SED fitting using the Prospector model (B. D. Johnson et al., 2021), representative of the state-of-the-art in Bayesian inference of stellar population parameters (e.g., J. Leja et al. 2019; S. Tacchella et al. 2022). We use the same stellar libraries in Section 2.2, and the set of free parameters and priors is chosen to follow common practices (J. Leja et al., 2022; S. Tacchella et al., 2022; B. Wang et al., 2024b). In brief, the free parameters include the total mass formed, stellar metallicity, 3 parameters controlling the dust based on the two-component model (S. Charlot & S. M. Fall, 2000), a power-law index for the D. Calzetti et al. (2000) dust attenuation curve (S. Noll et al., 2009), and nebular emission described by gas-phase metallicity and an ionization parameter. Nuisance parameters, unconstrained by the available wavelength coverage but included here to be marginalized over, comprise the normalization and dust optical depth of mid-infrared AGN (J. Leja et al., 2018), and 3 parameters describing dust emission (B. T. Draine & A. Li, 2007). We follow the standard SED-fitting practice of assuming that all line emission is produced by starlight (without contribution from e.g., shocks or AGN). This means that the strength of the emission lines as probed by our mock spectra directly constrain the short-term SFH. Altogether, these constitute 13 free parameters.

A 6th-order polynomial is fit to the ratio of the model and observed spectra, so the normalization is set by the photometry, and effectively the shape of the continuum is removed. This ensures that the likelihood is not overly dominated by the continuum shape, allowing the fit to be more sensitive to the spectral features of interest. The polynomial order, however, is low enough to preserve the Balmer break strength, which is also an important age indicator. Note that the use of the polynomial calibration does not mean disregarding the information encoded in the continuum shape. This is well captured by the excellent photometric data, which includes all the JWST/NIRCam broad- and medium-band filters.

We devote special attention to the assumed SFH forms used in the SED fitting. As a useful reference point for assessing the recovery of critical stellar population properties such as mass and averaged SFRs, we adopt the widely used Prospector- $\alpha$ model (J. Leja et al., 2017), in which SFH is described non-parametrically via mass formed in seven logarithmically spaced time bins in lookback time. A Student’s-t prior distribution is placed on the logarithm of SFR ratios between adjacent age bins. The expectation value, $\mu$ , of the logarithm SFR ratios is 0. The resulting prior SFH tends toward constant SFR(t), and down-weights dramatic changes between adjacent time bins. The scale, $\sigma$ —analogous to the standard deviation for a Gaussian distribution—is 0.3. The allowed prior range is capped at $\pm 10$ for numerical stability. Following J. Leja et al. (2017), we refer to this prior as the continuity prior. We emphasize that this setup is one of the most common ways for using Prospector to fit large samples of distant galaxies; therefore our results should be interpreted as applicable in a broader context as well.

The variations in the bursty SFHs at tens of Myr scale necessitate finer time bins. We thus modify the bin scheme in the Prospector- $\alpha$ model. As a best-case scenario, we tailor our SFH bins to exactly match the periods of quiescence and bursting for each individual galaxy. In practice, this would be impossible as it requires a priori knowledge of a galaxy’s SFH. While the influence of the choice of the width and location of the age bins has been examined for smoothly varying SFHs in J. Leja et al. (2019), the more complex bursty SFH forms would require a new assessment, especially given that rebinning of a bursty SFH can change the predicted spectrum (see Appendix A). Given the possible influence of priors on the inferred SFHs, we consider two Student’s-t distributions. First, we use the same continuity prior as proposed in J. Leja et al. (2017). Second, we use the bursty continuity prior from S. Tacchella et al. (2022). This differs from the continuity prior on the scale, which is changed from 0.3 to 1, permitting more dramatic changes in SFRs.

In short, we consider three SFH models in this section: Prospector- $\alpha$ , which follows the exact definition as in J. Leja et al. (2017) using 7 time bins; finer time bins matched to the input SFHs, assuming a continuity prior; and the same finer time bins but assuming a bursty continuity prior. As an additional test, we conduct the same analysis using flexible time bins (K. A. Suess et al., 2022). This implementation of non-parametric SFH contains a number of flexible width bins, which is designed to efficiently produce the flexibility required to model post-starburst SFHs. Examples are included in Appendix B.

Finally, sampling is performed using the dynamic nested sampler dynesty (J. S. Speagle, 2020). The simulated and inferred model spectra, as well as SFHs, for two galaxies are included in Figures 4–5 as examples.

Figure 4: Individual SED fits. (a) An example of a well recovered SFH, assuming a continuity prior in the SED fit. The left panel shows the simulated spectrum in black, and the best-fit model spectrum assuming the continuity SFH prior in purple. The middle panel shows the true SFH in black, the non-parametric SFH inferred from SED fit in purple, and its best-fit periodic interpretation in green. The right panel shows the posterior distributions from the periodic interpretation, with the truths over-plotted in black. (b) Results for the same galaxy, but assuming the bursty continuity prior are shown in the same manner.

Figure 5: Individual SED fits. An example of a poorly recovered SFH assuming the continuity prior and the bursty continuity prior are shown in the same format as Figure 5. The individual fits give noisy information. Even though the spectra are well-fit in all cases, the fidelity of the SFH recovery varies. Occasionally, the continuity prior tends to recover the average SFH better than the bursty continuity prior, likely because it prefers less dramatically varying SFHs.

3.2 Periodic Interpretation of Individual SFHs

During SED fitting, we opt not to directly fit for the functional form of the SFH determined by the population-level SFH parameters. This is a more realistic approach, since the SFHs of real galaxies are not expected to follow a simple form, meaning that more complex variations must be allowed. However, the inference objective of interest is the population-level SFH parameters. Therefore, in this second stage, we obtain a periodic interpretation of individual SFHs by fitting the SFHs corresponding to the posterior medians (with $1\sigma$ uncertainty corresponding to the 16 and 84th percentiles) with realizations of the population-level SFH parameters, again using dynesty (J. S. Speagle, 2020). Examples are also shown in Figures 4–5.

3.3 Distribution of Population-level SFH parameters from Individual Fits

The above two steps are repeated for random realizations of the sampled population-level SFH parameters chosen in Section 2.1, with 50 simulated galaxies in each SFH family. In the final step, we examine the distributions of the posterior medians obtained in Section 3.2.

3.4 Assessing the Accuracy of Individual Fits

Figure 6: Recovery of mass and averaged SFR from SED fitting, under different model assumptions. (a) The beige and blue histograms show the distributions of difference between the recovered mass and the true mass, assuming the continuity prior and the bursty continuity prior, respectively. The red histograms show the same results using the Prospector-

\alpha

model (J. Leja et al., 2017), which describes the SFH in less number of time bins and assumes the continuity prior. This model is included here as a fiducial reference. The black histograms correspond to the results when encoding the correct SFH model in the inference process. The recovery is quantified by three summary statistics; the median offset,

\mu

, the scatter

\sigma_{\rm NMAD}

, and the outlier fraction,

f_{\rm out}

. These are annotated in each panel in the same colors. The bursty continuity prior underestimates the mass more often than the continuity prior. This is perhaps due to the bursty continuity prior favoring more dramatic changes in adjacent time bins, leading it to put more mass in a recent burst; it is thus more prone to outshining. Interestingly, the coarse time bins, while insufficient to infer SFH variations on short time scales, lead to less biased mass estimates. (b) The recoveries of SFRs averaged over the most recent 30 and 100 Myr are depicted in the same format. The continuity prior results in a marginally better recovery of the SFR. In all cases, the SFR averaged over the most recent 100 Myr is better calibrated than the SFRs averaged over the most recent 30 Myr.

We now evaluate the results of the three previous sections, before concluding with a brief summary. Immediately seen from Figure 4 is that all the spectra are fit equally well, but the recovery in the critical stellar population properties (mass and averaged SFR) and detailed behaviors in the SFH varies.

3.4.1 Mass and SFR Recovery

Histograms showing the recovery of the mass and the SFR averaged over 30, 100 Myr timescale and the corresponding uncertainty calibrations are in Figure 6. The median offsets in the inferred mass, assuming the continuity prior, for the slowly and steeply rising SFHs are -0.11 and -0.18 dex, respectively. These values for the cases assuming the bursty continuity prior, for the slowly and steeply rising SFHs, are -0.36 and -0.48 dex, respectively. Such underestimations in mass are clear demonstrations of outshining, where the older stellar populations are too faint to be inferred correctly from the integrated light. Using the bursty continuity prior systematically underestimates the mass slightly more, since this prior prefers to locate most of the star formation activity in a recent burst. Notably, the Prospector- $\alpha$ model, which uses fewer time bins to describe the SFH, is much less prone to underestimate the masses. The mean bias in the inferred masses for slowing (steeply) rising SFHs, is $-0.01$ dex ( $-0.03$ dex).

The scatter in residuals is quantified using the normalized median absolute deviation (NMAD), given its advantage of being less sensitive to outliers than root mean square. It is defined as $\sigma_{\rm NMAD}=1.48\times{\rm median}|x_{\rm obs}-x_{\rm true}|$ . The Prospector- $\alpha$ model results in the least scatter in the inferred stellar masses, with $\sigma_{\rm NMAD}\approx 0.16$ dex. With finer time bins, the scatter for the cases assuming the continuity prior and the bursty continuity prior $\approx 0.27$ and 0.62 dex, respectively.

We quantify the uncertainty calibration via an outlier fraction, $f_{\rm out}$ , defined as the number of objects where $|x_{\rm obs}-x_{\rm true}|>3\sigma$ , where $3\sigma$ refers to the 0.1 and 99.9% credible interval value distance from the posterior median. In the case of slowly rising SFHs, the outlier fractions for the inferred stellar masses for the Prospector- $\alpha$ model, finer time bins with the continuity prior, and finer time bins with the bursty continuity prior are 0.21, 0.34 and 0.80, respectively. In the case of steeply rising SFHs, these values increase to 0.35, 0.50, and 0.87, respectively. Interestingly, the standard Prospector- $\alpha$ model (J. Leja et al., 2017) performs the best in all metrics for mass recovery. Consistent with the findings in J. Leja et al. (2019), this non-parametric description with sufficiently coarse time resolutions is able to recover mass across a range of SFH forms.

However, the robust performance in mass recovery with the coarse time bins is not retained in the inference of SFRs. As shown in Figure 6-b, the median offset, scatter, and outlier fraction in SFR averaged over the most recent 30 (100) Myr are 0.2 (0.1) dex, 0.5 (0.5) dex, and 0.9 (0.6), respectively. Finer age bins with the continuity prior perform marginally better, as it tends to trace the average SFH more accurately. The median offset, scatter, and outlier fraction in SFR averaged over the most recent 30 (100) Myr are 0.1 (0.1) dex, 0.2 (0.2) dex, and 0.6 (0.5), respectively. Finer age bins with the bursty continuity prior result in similar median offset and scatter as the standard Prospector- $\alpha$ model, except that the outlier fraction is slightly lower (0.7 for SFR averaged over the most recent 30 Myr, and 0.5 for SFR averaged over the most recent 100 Myr). In all cases, SFRs averaged over a longer time scale (100 Myr) are recovered more accurately than those averaged over a short time scale (30 Myr), at $<0.1$ dex level in median offsets. This finding is consistent with B. Wang et al. (2024a), where the systematic uncertainty in the inferred SFRs from broad-band photometry, driven by the modeling choices of a smooth and an extremely bursty SFH prior, is found to be substantially worse for SFRs averaged over short time scales.

Overall, under the finer time bin scheme, the masses and averaged SFRs are reasonably recovered, with $<0.2$ dex offsets. The concerning aspect is that the uncertainties tend to be significantly underestimated, as seen from the large outlier fractions. This is likely due to a complex likelihood surface that is populated with local maxima, creating a unusually challenging situation for the sampler. We examine this point further in Appendix C.

3.4.2 Inferring Short-Term Variations in the SFH

The individual non-parametric SFH can be a poor description to the input SFH. Earlier studies have shown the influence of priors on the inferred SFHs using photometric data (J. Leja et al., 2019; S. Tacchella et al., 2022; B. Wang et al., 2024a), but here we demonstrate that their impact can still be significant in the presence of high S/N spectra. We also find that the continuity prior occasionally recovers the SFH averaged over a full period better than the bursty continuity prior, as it prefers more smoothly varying SFHs.

The above challenge translates to a wide dispersion in the quality of the periodic interpretation of individual SFHs (Figures 4–5). Even in the case of reasonably traced SFH variations from the individual non-parametric SFH fit, it is challenging to estimate the SFH parameters, as illustrated by the wide, multi-modal posterior distributions. In the case of a poor non-parametric fit, the truths can fall outside of the posterior distributions. The amplitude and the slope are often degenerate, further complicating the interpretation.

Third, the distributions of posterior moments sufficiently illustrate the lack of constraint in the inferred population-level SFH parameters. The results based on SED modeling assuming the continuity prior are shown in Figure 7. The distributions based on the bursty continuity prior exhibit similar trends, albeit with larger deviations from the true values. A simplified version of Figure 7 is included as Figure 8, which summarizes the results on a scatter plot with error bars indicating the 16 and 84th percentiles of the distributions of the posterior medians. The accuracy and precision of the inferred population-level SFH parameters from individual fits vary wildly. In most cases, the uncertainties are so large that render the constraints meaningless; i.e., they span the allowed prior range. In some cases of the duration and slope parameters, the answers are wrong with high confidence. The latter is particularly concerning: if we were infer the burstiness from these individual fits, we would arrive at the wrong, but confident conclusion that there is a great diversity in the level of burstiness. Looking only at the distributions of posterior medians, which are annotated in the right corners of Figure 7, the spread in the posterior medians are often larger than the spacing of the model grid. For instance, the $1\sigma$ scatters in the inferred amplitudes and durations can be up to 1 dex. This means that our sampled population-level SFH parameters—the periodic bursts of 20 and 40 Myr, the fluctuation amplitudes of 0.3 and 0.8 dex, and doubling times of 50 and 120 Myr—cannot be distinguished.

Simply put, the substantial spread in the inferred population-level SFH parameters limit our ability for an accurate accounting of the population-level burstiness, even though all the spectra are well fit. We therefore conclude that the information content in the spectra is insufficient for inferring the population-level SFH parameters via the state-of-the-art inference methods with flexible SFHs and neutral priors.

Figure 7: The accuracy and precision of the inferred population-level SFH parameters from individual fits vary wildly. Each row corresponds to a SFH family. Each panel shows the posterior moments (median / 16th / 84th quantiles) of the population-level SFH parameters, amplitude

\sigma

, duration

\delta t

, and slope

\alpha

, inferred assuming the continuity prior, in gray. The median of the distribution of the posterior medians in gray are shown as a colored data point, with the error bars indicate the 16th and 84th quantiles. The actual numerical values are annotated in the upper right corners. Truths are indicated as black horizontal lines to guide the eye. The spread in the posterior medians are often larger than the spacing of the model grid, demonstrating that it is not possible to determine which population model any individual galaxy was drawn from. Notably, some gray points are far from the truth and have small errorbars. This means that if we were to infer the burstiness from these individual fits, we would arrive at the wrong, but confident conclusion that there is a great diversity in the level of burstiness.

Figure 8: Contrasting the recoveries of the population-level SFH parameters via individual SED fit (Section 3), H

\alpha

/UV (Section 5), and population fit (Section 6). The colored data points indicate 50th quantiles of the distributions of posterior medians of the population-level SFH parameters, amplitude

\sigma

, duration

\delta t

, and slope

\alpha

, inferred from individual fits, assuming the continuity prior, and the error bars indicate the 16 and 84th quantiles. The colors correspond to different SFH models. The

1\sigma

spread in the posterior medians are often larger than the spacing of the model grid, demonstrating that it is not possible to determine which population model any individual galaxy was drawn from. The 100% success rate of identifying the correct model population-level SFH parameters via the population-level approach is illustrated by the stars. Note here that the error bars mean the spread in posterior medians. The individual fits often have uncertainties far smaller than their biases.

Figure 9: Inferring SFHs with medium-resolution spectra. The results are presented in the same format as Figure 4. It remains difficult to recover the SFH with medium-resolution spectra; despite the excellent fit to the data and systematic-free modeling (impossible with real data!), the input SFH is not recovered. This reinforces our finding that detailed SFH and therefore its variability often pushes beyond the limit of the state-of-the-art stellar population synthesis modeling.

3.5 Comparing the Accuracy of Individual Fits across Different Data Resolutions

Having focused on recovering the oscillating SFH models using JWST broad and medium-band photometry + Prism spectroscopy, it is also instructive to know whether this is a fundamental problem driven by lack of observational information, or whether higher resolution spectra would provide more constraining power. Therefore, we devote this section to a comparison of information recovery for the individual fits using different and more detailed observations.

We adopt the same general Prospector settings as outlined in Section 3.1. Again, we stress that our setup is representative of the most common practices in performing SED fits. Two additional observing strategies are considered: photometry + medium-resolution G235M spectroscopy, and photometry + medium-resolution G395M spectroscopy. At our redshift range of interest ( $z\sim 4-7$ ), the G235M grating covers a critical age indicator—the Balmer break—and also H $\beta$ , whereas the G395M grating covers H $\alpha$ and H $\beta$ .

The results are summarized in Figure 9. In this example, the observable signatures—including the Balmer break, stellar absorption features, and emission lines—are all encapsulated by the G235M grating, and they are distinct enough for the models to correctly infer the recent truncation in star formation. However, despite the excellent fit to the data, the details of the recent SFH are not recovered. The same effects on the inferred SFHs due to the two priors are also seen here; that is, the continuity prior tends to recover the average SFH, whereas the bursty continuity prior tends to locate most of the star formation activity in a recent burst. As for the photometry + G395M grating combination, the medium bands capture the strength of the Balmer break, but the spectral coverage only includes the H $\alpha$ emission. The recent SFH inferred with the continuity prior is less accurate possibly due to the lack of constraints from the rich absorption features around the Balmer break.

Additional comparisons using a rejuvenating SFH, fit with the flexible time bins, produce similar results. Those are presented in Appendix B.

The above results all support the conclusions in Section 3.4 that the information content contained in the individual spectra is, even in the most favorable systematics-free situation, too limited to correctly infer the complex behaviors in the SFH. However, it remains to be demonstrated how much the individual fits can be improved, given a correct model of burstiness. We do so in the next section.

4 Individual SED fits, revised with a correct model of burstiness

Figure 10: With the appropriate model for burstiness, inferring the SFH via SED fitting is indeed feasible. The results here are shown in the same format as Figure 4. (a) Fitting the photometry and Prism spectrum simultaneously with the correct burstiness model recovers the input SFH. Note that we assume no dust attenuation here for simplicity. (b) Under the same settings, fitting only the broad- and medium-band photometric data can produce different performances. Generally, for cases where clear age indicators (e.g., the Balmer break in the left example) are captured by the medium bands, the input SFH can be reasonably recovered without spectroscopy. The x-axis range is truncated to emphasize the recent SFH.

Thus far, we have adopted the common setup (the continuity and bursty continuity priors) used in SED fits, but these state-of-the-art priors do not specifically weight towards the oscillating behavior of the simple SFHs; i.e., the model allows for but does not prefer the oscillating SFHs used as inputs. Having demonstrated that even high-quality spectra with a neutral prior cannot recover the detailed SFH, it is natural to ask whether individual SFHs can be accurately inferred with a prior tuned to the underlying population SFH. Such priors may be needed to solve the outshining problem, and can be empirically measured: B. Wang et al. (2023b) demonstrated the utility of physically motivated, informative priors in breaking degeneracies among stellar population parameters. More recently, Z. Gao et al. (2025) recovered cosmic SFR density and stellar mass density by constructing a prior based on a correlation between photometric colors and SFHs from the IllustrisTNG simulation (R. Weinberger et al., 2017; A. Pillepich et al., 2018), whereas J. T. Wan et al. (2024) showed the potential for a hierarchical modeling approach with a flexible stochastic SFH prior that aims to describe a wide range of SFH behaviors.

In this section, we test the hypothesis that, if the short-term SFH variability is known a priori, the SFH of individual galaxies can indeed be recovered via SED fitting. To this end, we fit mock observations using an oscillating SFH model as parameterized in Section 2.1. While this choice is intentionally avoided in Section 3.1—given that real galaxies are not expected to strictly follow a simple SFH form—it serves as a clear test here to address the question: if the model is informed that the SFH follows an oscillating pattern, can we recover the SFH from fitting individual spectra? Here, we use uniform priors on all four population-level SFH parameters: $\phi\in[0,2\pi]$ , $\sigma\in[0,1.0]$ dex, $\alpha\in[-1,0]$ , and $\delta t\in[10,70]$ Myr.

The summary statistics quantifying the stellar population parameter recovery are included in Figure 6. With the correct model for population-level burstiness, the distribution of the residuals in the inferred mass deviates from 0 by $<0.05$ dex, the scatter is $0.1$ dex, and the distribution of uncertainty-normalized residuals becomes close to a unit Gaussian with an outlier fraction of $0.2$ .

Furthermore, the intricate variations in the SFH can also be recovered. An example is shown in Figure 10, where the inferred SFH is consistent with the true SFH. We note that the good recovery is contingent upon an accurate measurement of dust attenuation. This is because dust attenuation is needed to decode the Balmer emission lines, tracing SFR in the most recent $\sim 10$ Myr, and UV fluxes tracing SFR in the most recent $\sim 30-70$ Myr scale. The degeneracy between dust attenuation and intrinsic emission-line luminosity, given an observed flux, can cause a factor of $>2$ (up to orders of magnitudes) deviation from the true recent SFRs. Uncertainty in the deblending of H $\alpha$ and [N ii] introduces a second order effect where the recent SFH can be incorrectly inferred at a level of $\sim 20$ %, since [N ii] typically is $\sim 10$ % of H $\alpha$ flux.

Figure 10-b additionally includes two examples of inferring the SFHs with photometry alone. Generally, for cases where clear age indicators (e.g., a Balmer break) are captured by the medium bands, the input SFH can be recovered without spectroscopy. However, as alluded to earlier, in some cases accurate and precise measurements of the Balmer emission lines are necessary for recovering the SFH.

The above findings suggest that given the correct SFH model, it is indeed possible to infer the SFH by performing SED fitting. Put in another way, outshining can be solved if the right SFH model is known. A key assumption here is that the current burstiness observable in the population is consistent with the past SFH behavior. For cases where star formation is primarily self-regulated and stationary—rather than driven by external events like mergers, or evolving sharply with time due to e.g., gas availability—this assumption is likely reasonable. We revisit this point in Section 7.

To summarize, the key to address outshining is to empirically measure the population distribution of bursty SFHs. Possible ways forward to achieve this is the focus of the second half of the paper.

5 H $\alpha$ -to-UV Flux Ratio

Figure 11: H

\alpha

-to-UV flux ratios are sensitive to some, but not all population-level SFH parameters. (Upper) The flux ratios are plotted as functions of observed phases in the first panel, whereas the same flux ratios are plotted as histograms to illustrate their distributions, assuming all phases are observed, in the second panel. The two SFH families considered have different fluctuation amplitudes, and the H

\alpha

-to-UV flux ratios show different distributions, meaning that they are sensitive to the fluctuation amplitudes. (Lower) Same as the upper panel, but the two SFH families considered here share the same fluctuation amplitude. Although one SFH family has a slowly rising slope and a short duration (purple), and the other SFH family has a steeply rising slope and a longer duration (cyan), the distributions of the flux ratios exhibit marginal difference.

The most well-studied measurement of burstiness is likely the scatter in H $\alpha$ -to-UV flux ratios. There is rich literature on this variability indicator (K. Glazebrook et al., 1999; J. C. Lee et al., 2009; D. R. Weisz et al., 2012; Y. Guo et al., 2016; N. Emami et al., 2019; T. Nanayakkara et al., 2020), and it can be considered as a prototype of a population-level fit. We therefore start the presentation of population-level constraints of burstiness with H $\alpha$ -to-UV flux ratios, and test its constraining power using our mock observations.

In Figure 11, we first show the H $\alpha$ -to-UV flux ratios as functions of the observed phases for two SFH families that differ in fluctuation amplitudes. For these cases, the H $\alpha$ -to-UV flux ratios differ markedly. The larger amplitude leads to an increased scatter in H $\alpha$ /UV, as expected. However, this difference becomes significantly less noticeable for models with the same fluctuation amplitude, but different durations and/or slopes.

The significant changes in the sSFRs for the more steeply rising SFH models drive the sudden change in H $\alpha$ /UV near $\phi=\pi$ . However, some scatter is present in this transition, making it appear smoother than one might expect from the abrupt changes in the simulated SFHs at $\phi=\pi$ . The varying metallicities partly contribute to the smoothness. More importantly, the abrupt changes in the simulated SFHs are on the instantaneous sSFRs. As shown in J. Choi et al. (2017), the number of ionization photons—and therefore—the H $\alpha$ -to-UV flux ratios, is proportional to SFR averaged over 10 Myr, as opposed to instantaneous SFRs.

The simulated H $\alpha$ /UV ratios are broadly consistent with the observationally inferred values reported in Y. Asada et al. (2024), although the maximum value of $\sim 0.07$ lies at the extreme end of their distribution. This upper limit also exceeds the maximum value found in R. Endsley et al. (2024), likely due to differences in modeling assumptions, as R. Endsley et al. (2024) derived their H $\alpha$ /UV ratios from BEAGLE (J. Chevallard & S. Charlot, 2016) fits to the photometry. We stress that our primary goal is not to precisely match the observations but rather to encompass a reasonable range of burstiness, since these simulations are designed to simply serve as a controlled baseline for evaluating the constraining power of different methods.

To quantify the constraining power of H $\alpha$ -to-UV flux ratios, we compute the Wasserstein distance between the observed, noisy distributions to the model distributions. Also known as the earth mover’s distance, the Wasserstein distance is a measure of the similarity between two probability distributions. It is similar to the perhaps more widely known Kullback-Leibler divergence, but is more robust to outliers and is sensitive to the spatial arrangement of the probability mass (V. M. Panaretos & Y. Zemel, 2019), and has been applied to astronomical data (e.g., J. Li et al. 2024). It is possible to explore alternative distance metrics in the future, but here the best-fit model is found by simply identifying the shortest Wasserstein distance.

We note that while the observed, noisy distributions are from the same model grid used to test the individual SED fits, the model distributions being compared to are generated from a denser grid ( $\sigma\in$ [0.2, 0.9] dex with 0.1 dex increments, $\delta t\in$ [15, 45] Myr with 5 Myr increments, $\alpha\in$ [-0.3, -0.7] dex with 0.05 increments). We repeat the mock observations with different realized noises and the resulting inference procedure 1,000 times for each SFH family sampled on the denser model grid as a robustness test. We then calculate the fraction of population-level SFH parameters correctly identified as the best-fit, and consider the modal best-fit parameters as the overall best-fit. For example, suppose we have a noisy distribution of H $\alpha$ /UV corresponding to the SFH family of $\sigma=0.3,\delta t=40,\alpha=-0.4$ . In the 1000 trials, 80% of the time the shortest Wasserstein distance gives $\sigma=0.4$ , 10% of the time gives $\sigma=0.3$ , and 10% of the time gives $\sigma=0.5$ . We take $\sigma=0.4$ as the best-fit, and plot it as a hexagon in Figure 8. The shade indicates the possible range (i.e., from 0.3 to 0.5 in this example).

As seen from Figure 8, the H $\alpha$ -to-UV flux ratio almost always gets the large fluctuating amplitude right, and gets the small fluctuating amplitude quite close ( $\sigma\sim 0.5$ dex when the truth is 0.4 dex). Notably, it never mistakenly infer one model with true $\sigma=0.4$ dex as $\sigma\sim 0.8$ dex, and vice versa. This performance is markedly better than the individual SED fits, where the spread in the posterior medians can be larger than the difference between these two amplitudes. However, such discerning power vanishes when considering the other two parameters, namely duration and slope, as the best-fits only differ marginally for the different cases. Put another way, the H $\alpha$ -to-UV flux ratio only has constraining power on the fluctuating amplitude: it is not very constraining for either the timescale of bursts or the underlying SFH slope.

These results demonstrate that the H $\alpha$ -to-UV flux ratio is sensitive to recent bursts, which in turn can be linked to many physical processes driving the SFH, and that it is difficult to disentangle them based on the distribution of H $\alpha$ -to-UV flux ratio alone. This challenge has also been implied in V. Mehta et al. (2023), where they found that a smoothly rising or declining SFR over the long term can produce similar scatter in H $\alpha$ -to-UV flux ratios as short-term bursts.

In summary, the H $\alpha$ -to-UV flux ratio is a useful indicator for burstiness, but going beyond inferring a general characteristic of SFH variability (e.g., the fluctuation amplitude in our case) to constrain all the population-level SFH parameters requires additional information. This motivates us to explore a simultaneous population-level approach, as detailed below.

6 A Simultaneous Population-level Approach to Constrain SFH Variations

The physical processes of various characteristic timescales influence certain spectral features in different ways. Figure 3 illustrates this with four examples of model SFHs and their corresponding spectra, and the spectral features of Balmer breaks, Balmer emission lines, near- and far-ultraviolet flux densities (NUV, FUV) are highlighted. While isolated analysis of individual spectra is insufficient to robustly infer the population-level SFH parameters, simultaneous analysis of the distribution of the relevant spectral features may give greater discriminating power. This is because while determining the behavior of a single galaxy is challenging, but many galaxies living within the same space can effectively trace out phase-space to provide a comprehensive view of the population-level characteristics.

The above serves as a motivation for our proposed population-level approach; at the same time, it is also a key assumption that the formation histories of most galaxies vary on similar timescales. Here we show that galaxies drawn from the same SFH family can be used to identify that family. In real observations, one must somehow also first group galaxies into different SFH families.

Simulations suggest that this may be challenging, as the probability of a burst (at fixed stellar mass) depends on a stability parameter, which involves gas mass and its specific angular momentum (E. Cenci et al., 2024). In other words, even at fixed mass the burstiness can depend on other galaxy properties such as their structure (see also P. F. Hopkins et al. 2023b). However, as a first-order analysis, we may assume that galaxies in similar redshift and mass ranges likely experience comparable environmental conditions, leading to similar SFH variations.

6.1 Spectral Features Sensitive to SFH Variations

Figure 12: Proposed timescale-sensitive observables of this paper. The sampled population-level SFH parameters are the same as those used in testing the individual fits. Each row includes the predicted distributions drawn from SFH families with the same slope. The distributions of the various spectral features are sensitive to different aspects of the SFH variability.

The suite of spectral features is chosen for their sensitivity to SFH variability over different timescales as described below, loosely following K. G. Iyer et al. (2024).

1.

Balmer break strength: caused by the bound-free absorption by electrons in the electron band $n=2$ , A-type ( $1.4-2.1~{\rm M}_{\odot}$ ) and later stars contribute to the break strength. We measure this from spectroscopy using the revised index from B. Wang et al. (2024c), which uses two windows at [3620, 3720] Å and [4000, 4100] Å. This new definition captures the full Balmer break, as opposed to the 4000 Å break (M. L. Balogh et al., 1999). Critically, the 4000 Å break only appears after Gyr of quiescence, while the Balmer break appears after $\sim 100$ Myr. Given that the age of the universe at $z>4$ is $<1.6$ Gyr, a 4000 Å break is rarely seen. Our revised index is thus a more suitable age indicator at high redshifts.
2.

Balmer line emission: emerging from H ii regions, this emission is indicative of the ionizing radiation from short-lived stars such as O and B-type stars ( $5-10$ Myr). The readily available H $\alpha$ emission is thus often taken as an “instantaneous” SFR indicator.
3.

Dust-corrected NUV and FUV flux densities: due to outshining of the older stellar light by recent star formation, the UV in star-forming galaxies typically traces $30-70$ Myr timescales.

We measure these spectral features from the simulated spectra (§ 2.2). The resulting distributions for each of the population-level SFH parameters are shown in Figure 12. We find that all the features are sensitive to the fluctuation amplitude, where a larger amplitude leads to a wider distribution. They are also moderately sensitive to the duration, which results in distributions peaking at different values. Additionally, the dust-corrected NUV and FUV flux densities are sensitive to the slope, as either peaks become better separated, or their locations shift for a steeper slope.

It is worth emphasizing that most of the above distributions show wide spreads. This means that any one object, labeled with only one value from each observable, cannot trace the population distribution.

6.2 Inferring Population-level SFH parameters

Given the demonstrated sensitivity, then, in principle, the population-level description of burstiness (as parameterized by the population-level SFH parameters) could be fit to a population-level distribution of observables. As a proof of concept, we test whether the population-level distributions of observables can correctly identify the SFH parameters, sampled on the same model grid that the H $\alpha$ -to-UV method is tested on in Section 5; that is, $\sigma\in$ [0.2, 0.9] dex with 0.1 dex increments, $\delta t\in$ [15, 45] Myr with 5 Myr increments, and $\alpha\in$ [-0.3, -0.7] dex with 0.05 increments). This analysis is justified on the basis that the spread in the posterior medians from the individual fits exceed the spacing of this model grid.

We begin by forward-modeling the observations corresponding to various SFH families (different combinations of the population-level SFH parameters), assuming Gaussian noise for each feature with S/N=20, and randomly generate a sample of 300 galaxies. Notably, the S/N here is generally noisier than the assumed individual spectra because each of these features is composed of multiple pixels, which are then averaged or combined to produce a spectral feature. This is especially true for FUV and NUV flux densities, which are the sum of many such pixels. We note that this setup represents a data quality level that is achievable in a JWST spectroscopic survey, e.g., UNCOVER (R. Bezanson et al., 2024) and JADES (D. J. Eisenstein et al., 2023).

We then identify the best-fit model via the shortest Wasserstein distance in the same manner as outlined in Section 5, with a minor modification to account for the fact that four observables are considered here. We treat each observable separately at first by calculating the Wasserstein distances of each, and then add the four Wasserstein distances in quadrature. The best-fit is again determined by this shortest total distance.

It is worth pointing out that we only consider the marginals here. Cross-correlations can in principle provide additional information. As the current setup already always identifies the correct population-level SFH parameters, we defer adding the cross-correlations to a future paper, where a full pipeline for the population-level fit will be presented.

6.3 Assessing the Accuracy of Population Fits

While the distribution of one or two observables alone may not always identify the correct population-level SFH parameter (e.g., the H $\alpha$ /UV flux ratio discussed in Section 5), we find that by utilizing the four observables proposed in this paper, the correct SFH model can always be identified.

This performance is substantially more accurate and precise than the individual fits. Figure 8 shows the performance comparisons across all the methods considered in this paper in an illustrative manner. Specifically, the individual fits cannot distinguish between the sampled SFH parameters (e.g., 0.3 vs 0.8 dex fluctuation amplitudes) since the range in posterior medians from the individual fits exceeds the spacing of the sparse model grid. This justifies the simplified test on the population-level method in this work. Certainly, real galaxy populations reside in a more complex parameter space, and the next steps involve exploring the full density field, which will be detailed in a future paper.

7 Discussion

Having discussed the individual and population-level fits separately in Sections 3.4, 4, 5, and 6.3, we now tie all the pieces together. We begin this section by a brief summary, presenting the implications from this work for solving the longstanding challenge of outshining. Key lingering questions and future improvements are discussed in Section 7.2.

The remaining two subsections supply additional discussion on the effects of priors placed on the SFH in individual SED fitting, and the implications of burstiness for studies assessing completeness.

7.1 A Solution to Outshining?

Inferring the past SFH from integrated light, where recent bursts dominate the SED—a phenomenon known as outshining (C. Papovich et al., 2001)—remains a longstanding challenge. Outshining makes it inherently difficult to recover SFH behavior $>100-200$ Myr in the past. The problem is further exacerbated in the high-redshift universe, where bursty SFHs are expected to be more common, complicating the inference of the behavior of SFR(t) even at timescales $<100-200$ Myr.

Our analysis demonstrates that state-of-the-art inference methods with flexible SFHs and neutral priors systematically underestimate mass in bursty systems with rising SFHs by $\sim 0.15$ dex. While these methods reasonably recover average SFRs with a median offset of $\sim 0.1$ dex, they fail to capture the tens of Myr fluctuations in bursty SFHs. However, fitting a bursty SFH with the correct model for burstiness removes the biases: median offsets in mass and averaged SFR decrease to $\sim 0.04$ dex and $\sim 0.05$ dex, respectively (§ 4). Furthermore, this approach facilitates the correct recovery of the detailed recent SFH.

The profound implication here is that the key to addressing outshining is to empirically measure the population distribution of bursty SFHs for a representative sample, and then apply this as an informed prior for SFR(t) for individual systems. Note that here we attempt to solve outshining at a statistical level; that is, we do not claim to solve it for individual galaxies.

To this end, we introduce a population-level approach in Section 6, aiming to recover the typical amplitude, timescale, and slope of the recent SFH to very high accuracy. Given the distributions of spectral features, we are able to correctly identify the corresponding SFH models via the shortest Wasserstein distance between the observed and the predicted distributions. By showing that an accurate model for SFH fluctuations does not suffer from the outshining problem, and that a model for SFH fluctuations can be accurately inferred from a complete and representative sample of galaxies, this work thus provides the first step towards solving outshining.

7.2 Outstanding Questions

We elaborate on the lingering questions, along with possible ways forward, in this subsection. First, in this work, we only sample a few representative points in the SFH model space, serving as a proof of concept. This choice is justified by the fact that the scatter in the part of the inferred SFH parameters from the individual approach already deviate from the truths by values larger than the spacing of the model grid, and that the scatters often exceed the spacing as well. Certainly this is a simple test, but sufficiently illustrates that the population-level approach is a promising alternative. Going forward requires running a full fit exploring the complete parameter space, possibly adapting a more flexible and realistic SFH model, and obtaining the probability distributions of the inferred parameters. A toy model along these lines is presented in K. G. Iyer et al. 2024. Although this test is limited, as it was conducted on ensemble observations of only about 30 galaxies, it demonstrates that the constraints are sufficient to differentiate between the timescale parameters. Another option is to perform Bayesian hierarchical modeling, as alluded to in J. T. Wan et al. (2024). Fundamentally, we can consider our problem as attempting to learn about a population from many individual measurements, one that hierarchical models are designed to address.

Second, a critical assumption in using population-level burstiness model to solve outshining is that the population-level burstiness predicts past SFH. This holds true when the conditions for star formation evolve slowly, i.e., SFR(t) in the past few hundred Myr is similar to SFR(t) in the past few Gyr. This can “solve” outshining in a statistical sense, but not necessarily for individual galaxies; for example, individual galaxy bursts driven by external events (e.g., mergers or interactions) won’t necessarily be captured by an improved prior, though they will be more properly marginalized over. It has been shown that merger-induced SFR is particularly important in more massive galaxies ( ${\rm M}_{\star}>10^{10}~{\rm M}_{\odot}$ ) at lower redshifts ( $z=0-1$ ) (E. Cenci et al., 2024) using the FIREbox cosmological volume simulation (R. Feldmann et al., 2023). However, such interactions are not the main driver of burstiness at lower stellar masses (see also P. F. Hopkins et al. 2023a); instead self-regulated star formation is more common here, which is well-suited to a population model. Therefore, at least to first-order, a measurement of the population distribution of bursty SFHs promises to mitigate the substantial uncertainty in interpreting the light of the galaxies where recent bursts dominate the SEDs.

Third, our parameterization of the population-level SFHs is simplified. This is meant to be illustrative. Rigorous tests are needed to confirm whether more complex forms of SFHs, e.g., fluctuations over many timescales, can still be accurately inferred with the proposed methodologies. On a related note, we perform the population-level fits with simulations assuming no dust, as is done in G. Sun et al. (2025) where a semi-analytic framework is developed to model bursty SFHs. While high-redshift, low-mass galaxies are expected to be less dusty in general, uncertainty in the dust attenuation when fitting for real galaxies may degrade the SFH recovery. Specifically, the best way forward is to have a Balmer decrement and UV flux in order to accuratly constrain dust attenuation—generally this means medium-resolution rest-optical spectroscopy alongside rest-UV photometry or spectroscopy.

Fourth, this work does not account for systematics in stellar mass and SFH recovery that arise from assumptions in SPS model choices. SPS models include multiple components, such as an IMF, stellar isochrones, and nebular physics (see C. Conroy 2013 for a review). Previous studies have demonstrated that inferred parameters can be sensitive to these fundamental modeling assumptions (C. Pacifici et al., 2023; L. Whitler et al., 2023; B. Wang et al., 2024b). The extent to which these assumptions influence our proposed population-level approach remains an open question.

Nevertheless, we emphasize that this paper presents a thorough examination of the constraining bursty SFHs via multiple approaches, and the findings suggest that a population-level burstiness model is a promising first step to solve outshining. We will explore the remaining questions in future work. Below, we supply additional discussions regarding the individual fits and observability and selection effects as implied by our burstiness model.

7.3 Individual Inference for Galaxy Burstiness

7.3.1 The Challenges: Limited Information Content in Individual Spectra and a Complex Likelihood Surface

To start, while all the spectra are well fit, the complex variations in the input SFHs are rarely recovered based on individual spectra. This holds true even with high-quality data and no model mismatch. Here, no model mismatch means that the mock observations are created using the same stellar population synthesis as the forward model used in the SED fitting. Simply put, for a single galaxy, it is not possible to distinguish between the burst amplitude, duration, and an underlying smooth increase in the SFH.

The above finding suggests that the fundamental problem is the information content—the individual spectrum does not contain sufficient information to accurately infer the population-level SFH parameters. The additional test by including the full MIRI coverage suggests a similar problem of insufficient information content (the mean bias typically is improved by $0.1-0.2$ dex for the inferred masses, and $<0.06$ dex for the averaged SFRs; see Appendix D). High S/N medium resolution spectra can produce greater constraining power, but still insufficient to constrain the complex SFH behaviors, as demonstrated in Figure 9.

Meanwhile, in light of recent developments in inferring SFHs from SED fitting with more flexible or more physically realistic models, it is reasonable to ask whether this individual approach could be improved by adopting these new formalisms. Examples include Dense Basis (K. G. Iyer et al., 2019), which constructs SFHs independent of any functional form using Gaussian processes, and stochastic priors (J. T. Wan et al., 2024) based on power spectral density analysis (N. Caplar & S. Tacchella, 2019). However, as we demonstrated in Section 3, a significantly different SFH from the input SFH can still fit the data equally well, even in our best-case scenario where no model mismatch is present. The fact that the input SFH can fall outside the posterior distribution of the inferred SFHs suggests that the likelihood space is full of local maxima which produce fits of similar overall quality—a challenge that more flexibility will exacerbate. Even with our simple models, the sampler struggles to locate the global maximum; otherwise, the true SFH should be included in the posterior distributions. This issue of missing the global maximum is further illustrated in Appendix C by the variations in inferred parameters found when fitting the same data multiple times. Making the SFH model more flexible would only add complexity to the likelihood surface, making it unlikely that SFH recovery would improve this way without a corresponding leap in sampling efficiency (e.g., through simulation-based inference; C. Hahn & P. Melchior 2022; G. Khullar et al. 2022; B. Wang et al. 2023a; K. Zhang et al. 2023; M. Ho et al. 2024, or machine-learning-improved sampling (e.g., M. Karamanis et al. 2022; J. U. Lange 2023).

7.3.2 Continuity vs. Bursty Continuity Priors

The above lists the key points regarding the individual fits, here we further elaborate on an interesting finding—namely, the different performances between the continuity and bursty continuity priors. Previous works have shown that inferring SFH from fitting photometric data are influenced by prior choices (J. Leja et al., 2019; B. Wang et al., 2024a), and here we demonstrate that this can still be true when fitting high S/N spectra.

Surprisingly, the continuity prior which favors smooth and constant SFHs better recovers masses (Appendix D) as well as the average trends in SFHs. As illustrated in Figure 4, the bursty continuity prior permits more dramatic changes in the SFH, and thus is more likely to show long-term deviations from the true average SFH. This is most evident in the case of the input SFH having a steeply rising slope, where the continuity prior captures the overall shape of the SFH, but the bursty continuity prior tends to put most star formation in a recent burst (i.e., making these objects form in a shorter period of time). Figure 9 illustrates this prior impact on the fitting of medium-resolution spectra, in which we see that the bursty continuity prior favors an immediate quench after a recent burst. These findings suggest that the bursty continuity prior is more prone to the classic outshining problem, where light from recent star formation eclipses that of older stellar populations (C. Papovich et al., 2001).

A puzzle here is that the bursty continuity prior, in principle, still favors constant SFH, as the expectation value of the ratio between the adjunct time bins is one. This means that the solution found by the continuity prior, which traces the average SFH better, should have higher prior probability. In addition, the likelihood for the data should also be higher for an SFH that is more similar to the true input SFH by design. However, the observed discrepancy between the results obtained with the two priors challenges this expectation. To investigate, we compare the likelihood and probability values between fits using the two priors, and find small differences with no systematic trends. Moreover, model comparison based on Bayes factors (R. Trotta, 2008) suggests that the data are agnostic regarding the choice of prior. It is thus possible that by allowing more dramatic changes in the SFH, the bursty continuity prior leads to a more complex likelihood surface for the sampler to efficiently explore, potentially affecting the resulting inference.

It is worth emphasizing that the differences in performance depend on the forms of the SFH. While the combination of bursty and rising SFHs in this paper results in the worsened performance of the bursty continuity prior, it has been shown to be more effective at detecting rejuvenation events (M. Park et al., 2024). This is yet another example highlighting the importance of carefully considering the choice of prior based on the relevant physical scenarios.

With the addition of the full MIRI coverage, the outshining problem is partially mitigated. The $\sim 0.2$ dex systematic underestimation of the masses found when fitting the SEDs, assuming the continuity prior, is no longer present (see Appendix D). The mean bias in the averaged SFR is decreased by $<0.06$ dex. These improvements are likely due to the mass-to-light ratio exhibiting minimal variation in this wavelength range, particularly at rest 2 $\mu$ m (e.g., T. Kettlety et al. 2018). However, the MIRI bands are still insufficient to accurately trace the SFH variations from individual SED fits.

7.4 Observability and Selection Effects

Figure 13: Population-level burstiness can shift the distributions of magnitudes at a fixed mass and redshift by

>2

magnitudes. Fluctuation amplitudes in the SFHs are plotted as functions of observed fluxes in F277W band to the left, corresponding to rest-frame optical, and in F444W band to the right, corresponding to rest-frame near-infrared. Colors indicate the different families of SFH models, but the exact correspondence is irrelevant here. For a flux-limited survey, the observed sample of the galaxy population would most likely be dominated by those having larger fluctuation amplitudes (greater

\sigma

As hinted in Figure 2, SFH variations directly influence the observability of individual objects. For a galaxy population with the same mass and at the same redshift, different models for SFH variations alone produce over a magnitude of variability in observed fluxes in the F444W filter band, which corresponds to rest near-infrared. This is already the best-case scenario, since the fluxes in the rest near-infrared are expected to be the most stable. The variability in the F277W filter band, corresponding to rest-optical is even more dramatic, up to two magnitudes.

Figure 13 further connects the observability to various aspects of SFH variability by showing distributions flux densities for different population-level models of burstiness. For a flux-limited survey, the observed galaxy population near the observable limit would be dominated by those having SFHs with large fluctuation amplitudes (larger $\sigma$ ). Although this variability in brightness is mainly driven by the bustier models with $\sigma=0.8$ dex, even with the more conservative case of $\sigma=0.3$ dex, the range of flux densities spans over a magnitude.

As pointed out in recent studies on observability using FIRE simulations (G. Sun et al., 2023a) and a sample of Lyman-break galaxies (R. Endsley et al., 2024), our results also demonstrate that it is essential to quantify burstiness before completeness—the distribution of masses at a certain fixed flux level—can be assessed. This has further implications for determining stellar mass functions and global SFHs. Changes on the SFMS due to mass incompleteness have already been shown in observations (J. Leja et al., 2022) and in simulations (W. McClymont et al., 2025).

8 Conclusions

This paper establishes the first step toward solving outshining—a longstanding challenge in interpreting the light from galaxies—by empirically measuring and then applying a model for short-term galaxy burstiness. This issue has become especially pertinent in light of recent observations from JWST of the early mini-quenched galaxies (e.g., T. J. Looser et al. 2024; V. Strait et al. 2023; R. Endsley et al. 2024; A. Weibel et al. 2025; W. M. Baker et al. 2025; J. A. A. Trussler et al. 2025) and the over-abundance of luminous high-redshift galaxies (e.g., R. P. Naidu et al. 2022; H. Atek et al. 2023; C. M. Casey et al. 2023; S. L. Finkelstein et al. 2023; B. E. Robertson et al. 2023).

We present a thorough examination of the extent of constraints attainable regarding bursty star formation, defined here as recurrent up-and-down fluctuations on tens of Myr timescales over the past 500 Myr of formation history. We begin our study by assessing the state-of-the-art SED-fitting methods for inferring burstiness, and find that standard techniques with neutral priors fail to recover detailed recent SFHs and often under-estimate the total stellar mass; however, a successful recovery is achievable with the correct model for burstiness. We then introduce a population-level technique leveraging the distributions of timescale-sensitive spectral features, and demonstrate its potential for accurately inferring burstiness. While motivated by high-redshift findings, our results are also applicable to studies of lower-redshift galaxies. The main findings are summarized as follows.

First, in the presence of bursty star formation, the population-level SFH parameters, particularly the fluctuation amplitude, lead to strong selection effects in flux-limited surveys. The simulated brightness from our SFH models, for a galaxy population of the same redshift and mass, span over one magnitude in rest near-infrared, where the fluxes are expected to be the most stable. The variability in rest-optical is greater—more than two magnitudes. This may explain the prevalence of bursty star formation at high redshift in objects typically selected in the rest-optical, or in spectroscopic samples with more complex selection functions.

Second, even with exquisite data with no modeling systematics, individual SED fits are unable to place meaningful constraints on the population-level SFH parameters. The spread in posterior medians in the inferred fluctuation amplitude, duration, and slope, from fitting an ensemble SEDs simulated from one SFH family, range from 0.5 to 1.0 dex, 10 to 120 Myr, and 0.5 to 1.2, respectively. These spreads are greater than the spacing of the model grid, meaning that the sampled SFH parameters cannot be distinguished. Critically, if these were real data, we would interpret the wide spread in posterior medians as evidence of a diversity in galaxy SFH timescales. This is especially concerning when the true SFH parameters fall outside the posterior coverage (Figures 4–5).

Third, the above findings are consistent for the tests done using the continuity and the bursty continuity priors. Surprisingly, the continuity prior performs marginally better than the bursty continuity prior. Assuming the continuity prior, the mean biases in the inferred masses, and SFRs averaged over the most 100 Myr are approximately $-0.06$ dex, and $0.1$ dex, respectively; assuming the bursty continuity prior, the biases increase to $-0.3$ dex and $0.2$ dex. This is because by preferring smooth transitions between time bins in the SFH, the continuity prior is less prone to outshining.

Fourth, the population distributions of spectral features—Balmer break strength, Balmer emission lines, NUV/FUV flux densities—are sensitive to at least some of the population-level SFH parameters. Considering these four observables simultaneously, we can always identify the correct population-level SFH parameters via the shortest Wasserstein distance. This work serves as a proof of concept, and the full methodology and validation of our proposed population-level method will be presented in a future work.

Finally, under the assumption that current population-level burstiness predicts past SFH, the complete formation history, even when the light from recent bursts dominates the SED, can be accurately inferred given the correct burstiness model. In other words, the proposed population-level method for constraining burstiness is the key to address outshining.

To conclude, bursty star formation has emerged as a potential unifying explanation for the high-redshift temporarily quenched galaxies and overly luminous galaxies discovered by JWST; yet observational constraints on burstiness remain uncertain. Outshining represents perhaps the greatest challenge in interpreting the light from galaxies experiencing bursty star formation. The simultaneous population-level method proposed in this paper has the strong potential to constrain burstiness, and, thereby, addressing the issue of outshining. This work paves the way for a complete understanding of the observed diverse populations of galaxies.

Acknowledgments

We thank Elia Cenci for providing the FIREbox SFHs. We also thank Hiranya Peiris, Chris Hayward, and Kartheik Iyer for valuable discussions. We are grateful to the anonymous referee for the helpful comments. B.W. and J.L. acknowledge support from JWST-GO-02561.022-A. Computations for this research were performed on the Pennsylvania State University’s Institute for Computational and Data Sciences’ Roar supercomputer. This publication made use of the NASA Astrophysical Data System for bibliographic information.

Appendix

A Effect of Re-binning SFHs in Sed Fitting

Figure A.1: Rebinning of the model SFH changes the predicted spectrum. (Left) The SFH in intrinsic time bins is shown in black, whereas the rebinned SFH is shown in purple. (Right) Predicted spectra from the two SFHs are shown in the same colors.

In the main text, we adopt the same time bins as used in the input SFH when fitting SEDs. However, as it is difficult to know the timescales that the SFH varies a priori, certain form of rebinning is likely required.

J. Leja et al. (2019) has tested different number of time bins extensively on smoothly varying SFHs, and concluded that rebinning has no impact on the SED-fitting results. We revisit this issue here given the more complex bursty SFHs tested in this work. We rebin the input SFH into 5 Myr fine bins in the first 100 Myr, and the rest into a uniform logarithmic grid with the last bin fixed to 90% of the age of the universe. This choice is motivated by the need to be sufficiently flexible to describe SFHs potentially varying on a range of timescales, while keeping the total number of bins (and hence the total number of free parameters) to be within a manageable number. As shown in Figure A.1, the input and the rebinned SFHs predict different model spectra. While alternative non-parametric SFH implementations exist (e.g., K. A. Suess et al. 2022; K. G. Iyer et al. 2024), we demonstrate, in this abbreviated manner, that the binning of SFH would need to be carefully selected when inferring SFH variability in SED fitting.

B Individual Fits across Different Data Resolutions using Flexible time bins

Figure B.1: Inferring rejuvenation in the SFH from individual fits, using flexible time bins describing the SFH. The plots are organized in the same format as Figure 9. A short (200 Myr) period of past star formation cannot be accurately recovered, as the recent burst dominate the observed SED. Here we assume the ideal scenario where dust attention is known perfectly.

We adopt the same general Prospector settings as outlined in Section 3.1, but using flexible time bins (K. A. Suess et al., 2022), which allows the bin widths to change, so that the period where the SFH is inferred to have the most variations can be more described in greater resolution.

Four observing strategies are considered: photometry only (including broad and medium bands), photometry + low-resolution Prism spectroscopy, photometry + medium-resolution G235M spectroscopy, and photometry + medium-resolution G395M spectroscopy. At our redshift range of interest ( $z\sim 4-7$ ), the G235M grating covers a critical age indicator—the Balmer break—and also H $\beta$ , whereas the G395M grating covers H $\alpha$ and H $\beta$ .

The results, where we assume the ideal scenario of dust attention being known perfectly, are presented in Figure B.1. This rejuvenating case—where short periods of star formation and quiescence occur in the recent past—produces a model spectrum in which the most recent burst dominate the observed SED. Specially, this galaxy experiences a period of star formation over the most recent 340–540 Myr, followed by a 200 Myr long quiescence, and then forms stars again during the most recent 40 Myr.

Without the complication of the degeneracy between dust attention and emission-line strength, the most recent SFHs are inferred accurately in all cases. However, outshining induces significant uncertainty in inferring the past SFHs.

We also perform the same fits but letting dust attention to be fit for. The inferred becomes all consistent with a single burst, and the most recent SFRs are overestimated by a factor of $\sim 2$ .

C Challenge in Mapping out the Likelihood Surface

Figure C.1: Testing the sampler’s ability to find the global maximum in the likelihood space. Total masses inferred from fitting the same SED multiple times are plotted in purple. The true mass is indicated by the black dashed line. The scatter as well as the changes in the uncertainties mean that locating the global maximum is not guaranteed.

The challenge being put forth by the complex likelihood surface is manifested in the underestimated uncertainties in Figures 6 and D.1. To verify the hypothesis that the sampler fails to locate the global maximum on the likelihood surface, we fit the same galaxy multiple times. The only randomness in this process is the noise—while the S/N is always set to 20, we let the realized noise to be generated randomly for each trial. As seen from Figure C.1, the inferred masses as well as their uncertainties can change significantly. In addition, we repeat the same exercise using nautilus, which is also a nested sampler but with enhanced efficiency aided by neural networks (J. U. Lange, 2023). Interestingly, nautilus tends to consistently identify the underestimated mass as the final solution.

A rigorous test of various samplers is out of the scope of this paper; however, our findings suggest that the sampling problem here is indeed a challenging one. The bias in mass recovery is perhaps primarily driven by the challenge in finding the global maximum on the likelihood surface. High-dimensional SED modeling is known to have a complex likelihood space, and the complex behavior of bursty and rising SFHs exacerbates this challenge. As discussed in B. Wang et al. (2024b), the nested sampling method (J. Skilling, 2004), which is also used in this paper, is already better suited to sample multi-modal posteriors than other traditional techniques such as Markov chain Monte Carlo (J. Goodman & J. Weare, 2010). In principle, the problem of locating the global maximal likelihood modes can be mitigated by substantially increasing in the accuracy settings, but the resulting increase in the CPU-hours would quickly become prohibitively expensive.

Possible improvements on the current state-of-art sampling process used in astronomy (F. Feroz et al., 2009; D. Foreman-Mackey et al., 2013; W. J. Handley et al., 2015; J. S. Speagle, 2020; J. Buchner, 2021) include gradient-based samplers (M. D. Hoffman & A. Gelman, 2011), combining with machine-learning techniques (M. Karamanis et al., 2022; J. U. Lange, 2023), or new inference workflows (C. Hahn & P. Melchior, 2022; G. Khullar et al., 2022; B. Wang et al., 2023a; K. Zhang et al., 2023; M. Ho et al., 2024). It is worth noting that this missing-mode issue does not undermine the main conclusion in the paper that the individual spectrum does not contain enough information for complex recent SFH recovery. We have shown that all the spectra are well fit, but the inferred SFH can be very different from the input SFH. A sampler that is able to better map likelihood space can improve the uncertainty calibration, but unlikely to solve the fundamental problem regarding the information content.

D Effect of Adding JWST/MIRI in the Mass and SFR Recovery from SED Fitting

Figure D.1: The recovery of mass and SFR after including all the MIRI bands, shown in the same format as Figure 6. The biases in inferred mass see improvements at a

0.1-0.2

dex level, whereas the improvements in the averaged SFRs are more marginal,

<0.06

dex.

The long-wavelength coverage enabled by MIRI improves the recovery of mass and averaged SFR. The bias typically is improved by $0.1-0.2$ dex for the inferred masses, and $<0.06$ dex for the SFRs. This is likely because the mass-to-light ratio shows minimal variation in this wavelength range, particularly at rest 2 $\mu$ m (e.g., T. Kettlety et al. 2018). However, the MIRI bands are still insufficient to accurately trace the SFH variations from individual SED fits.

References

J. Alsing et al. (2024) Alsing, J., Thorp, S., Deger, S., et al. 2024, \bibinfotitlepop-cosmos: A Comprehensive Picture of the Galaxy Population from COSMOS Data, ApJS, 274, 12, doi: 10.3847/1538-4365/ad5c69
Y. Asada et al. (2024) Asada, Y., Sawicki, M., Abraham, R., et al. 2024, \bibinfotitleBursty star formation and galaxy-galaxy interactions in low-mass galaxies 1 Gyr after the Big Bang, MNRAS, 527, 11372, doi: 10.1093/mnras/stad3902
H. Atek et al. (2011) Atek, H., Siana, B., Scarlata, C., et al. 2011, \bibinfotitleVery Strong Emission-line Galaxies in the WFC3 Infrared Spectroscopic Parallel Survey and Implications for High-redshift Galaxies, ApJ, 743, 121, doi: 10.1088/0004-637X/743/2/121
H. Atek et al. (2023) Atek, H., Chemerynska, I., Wang, B., et al. 2023, \bibinfotitleJWST UNCOVER: discovery of z ¿ 9 galaxy candidates behind the lensing cluster Abell 2744, MNRAS, 524, 5486, doi: 10.1093/mnras/stad1998
W. M. Baker et al. (2025) Baker, W. M., D’Eugenio, F., Maiolino, R., et al. 2025, \bibinfotitleZapped then napped? A rapidly quenched remnant leaker candidate with a steep spectroscopic $\beta$ _UV slope at z = 8.5, A&A, 697, A90, doi: 10.1051/0004-6361/202553766
M. L. Balogh et al. (1999) Balogh, M. L., Morris, S. L., Yee, H. K. C., Carlberg, R. G., & Ellingson, E. 1999, \bibinfotitleDifferential Galaxy Evolution in Cluster and Field Galaxies at z~0.3, ApJ, 527, 54, doi: 10.1086/308056
L. Bassini et al. (2023) Bassini, L., Feldmann, R., Gensior, J., et al. 2023, \bibinfotitleThe inefficiency of stellar feedback in driving galactic outflows in massive galaxies at high redshift, MNRAS, 525, 5388, doi: 10.1093/mnras/stad2617
R. Bezanson et al. (2024) Bezanson, R., Labbe, I., Whitaker, K. E., et al. 2024, \bibinfotitleThe JWST UNCOVER Treasury Survey: Ultradeep NIRSpec and NIRCam Observations before the Epoch of Reionization, ApJ, 974, 92, doi: 10.3847/1538-4357/ad66cf
J. Buchner (2021) Buchner, J. 2021, \bibinfotitleUltraNest - a robust, general purpose Bayesian inference engine, The Journal of Open Source Software, 6, 3001, doi: 10.21105/joss.03001
N. Byler et al. (2017) Byler, N., Dalcanton, J. J., Conroy, C., & Johnson, B. D. 2017, \bibinfotitleNebular Continuum and Line Emission in Stellar Population Synthesis Models, ApJ, 840, 44, doi: 10.3847/1538-4357/aa6c66
D. Calzetti et al. (2000) Calzetti, D., Armus, L., Bohlin, R. C., et al. 2000, \bibinfotitleThe Dust Content and Opacity of Actively Star-forming Galaxies, ApJ, 533, 682, doi: 10.1086/308692
N. Caplar & S. Tacchella (2019) Caplar, N., & Tacchella, S. 2019, \bibinfotitleStochastic modelling of star-formation histories I: the scatter of the star-forming main sequence, MNRAS, 487, 3845, doi: 10.1093/mnras/stz1449
C. M. Casey et al. (2023) Casey, C. M., Kartaltepe, J. S., Drakos, N. E., et al. 2023, \bibinfotitleCOSMOS-Web: An Overview of the JWST Cosmic Origins Survey, ApJ, 954, 31, doi: 10.3847/1538-4357/acc2bc
E. Cenci et al. (2024) Cenci, E., Feldmann, R., Gensior, J., et al. 2024, \bibinfotitleStarbursts driven by central gas compaction, MNRAS, 527, 7871, doi: 10.1093/mnras/stad3709
G. Chabrier (2003) Chabrier, G. 2003, \bibinfotitleGalactic Stellar and Substellar Initial Mass Function, PASP, 115, 763, doi: 10.1086/376392
S. Charlot & S. M. Fall (2000) Charlot, S., & Fall, S. M. 2000, \bibinfotitleA Simple Model for the Absorption of Starlight by Dust in Galaxies, ApJ, 539, 718, doi: 10.1086/309250
J. Chevallard & S. Charlot (2016) Chevallard, J., & Charlot, S. 2016, \bibinfotitleModelling and interpreting spectral energy distributions of galaxies with BEAGLE, MNRAS, 462, 1415, doi: 10.1093/mnras/stw1756
J. Choi et al. (2017) Choi, J., Conroy, C., & Byler, N. 2017, \bibinfotitleThe Evolution and Properties of Rotating Massive Star Populations, ApJ, 838, 159, doi: 10.3847/1538-4357/aa679f
J. Choi et al. (2016) Choi, J., Dotter, A., Conroy, C., et al. 2016, \bibinfotitleMesa Isochrones and Stellar Tracks (MIST). I. Solar-scaled Models, ApJ, 823, 102, doi: 10.3847/0004-637X/823/2/102
L. Ciesla et al. (2024) Ciesla, L., Elbaz, D., Ilbert, O., et al. 2024, \bibinfotitleIdentification of a transition from stochastic to secular star formation around z = 9 with JWST, A&A, 686, A128, doi: 10.1051/0004-6361/202348091
R. K. Cochrane et al. (2025) Cochrane, R. K., Katz, H., Begley, R., Hayward, C. C., & Best, P. N. 2025, \bibinfotitleHigh-z Stellar Masses Can Be Recovered Robustly with JWST Photometry, ApJ, 978, L42, doi: 10.3847/2041-8213/ad9a4d
C. Conroy (2013) Conroy, C. 2013, \bibinfotitleModeling the Panchromatic Spectral Energy Distributions of Galaxies, ARA&A, 51, 393, doi: 10.1146/annurev-astro-082812-141017
C. Conroy & J. E. Gunn (2010) Conroy, C., & Gunn, J. E. 2010, \bibinfotitleThe Propagation of Uncertainties in Stellar Population Synthesis Modeling. III. Model Calibration, Comparison, and Evaluation, ApJ, 712, 833, doi: 10.1088/0004-637X/712/2/833
E. R. Cueto et al. (2024) Cueto, E. R., Hutter, A., Dayal, P., et al. 2024, \bibinfotitleASTRAEUS. IX. Impact of an evolving stellar initial mass function on early galaxies and reionisation, A&A, 686, A138, doi: 10.1051/0004-6361/202349017
A. Dekel et al. (2023) Dekel, A., Sarkar, K. C., Birnboim, Y., Mandelker, N., & Li, Z. 2023, \bibinfotitleEfficient formation of massive galaxies at cosmic dawn by feedback-free starbursts, MNRAS, 523, 3201, doi: 10.1093/mnras/stad1557
T. Dome et al. (2024) Dome, T., Tacchella, S., Fialkov, A., et al. 2024, \bibinfotitleMini-quenching of z = 4-8 galaxies by bursty star formation, MNRAS, 527, 2139, doi: 10.1093/mnras/stad3239
A. Dotter (2016) Dotter, A. 2016, \bibinfotitleMESA Isochrones and Stellar Tracks (MIST) 0: Methods for the Construction of Stellar Isochrones, ApJS, 222, 8, doi: 10.3847/0067-0049/222/1/8
B. T. Draine & A. Li (2007) Draine, B. T., & Li, A. 2007, \bibinfotitleInfrared Emission from Interstellar Dust. IV. The Silicate-Graphite-PAH Model in the Post-Spitzer Era, ApJ, 657, 810, doi: 10.1086/511055
D. J. Eisenstein et al. (2023) Eisenstein, D. J., Willott, C., Alberts, S., et al. 2023, \bibinfotitleOverview of the JWST Advanced Deep Extragalactic Survey (JADES), arXiv e-prints, arXiv:2306.02465, doi: 10.48550/arXiv.2306.02465
N. Emami et al. (2019) Emami, N., Siana, B., Weisz, D. R., et al. 2019, \bibinfotitleA Closer Look at Bursty Star Formation with L _Hα and L _UV Distributions, ApJ, 881, 71, doi: 10.3847/1538-4357/ab211a
R. Endsley et al. (2024) Endsley, R., Chisholm, J., Stark, D. P., Topping, M. W., & Whitler, L. 2024, \bibinfotitleThe Burstiness of Star Formation at $z\sim 6$ : A Huge Diversity in the Recent Star Formation Histories of Very UV-faint Galaxies, arXiv e-prints, arXiv:2410.01905, doi: 10.48550/arXiv.2410.01905
A. L. Faisst et al. (2019) Faisst, A. L., Capak, P. L., Emami, N., Tacchella, S., & Larson, K. L. 2019, \bibinfotitleThe Recent Burstiness of Star Formation in Galaxies at z $\sim$ 4.5 from H $\alpha$ Measurements, ApJ, 884, 133, doi: 10.3847/1538-4357/ab425b
A. L. Faisst & T. Morishita (2024) Faisst, A. L., & Morishita, T. 2024, \bibinfotitleDead or Alive? How Bursty Star Formation and Patchy Dust Can Cause Temporary Quiescence in High-redshift Galaxies, ApJ, 971, 47, doi: 10.3847/1538-4357/ad58e2
C.-A. Faucher-Giguère (2018) Faucher-Giguère, C.-A. 2018, \bibinfotitleA model for the origin of bursty star formation in galaxies, MNRAS, 473, 3717, doi: 10.1093/mnras/stx2595
R. Feldmann et al. (2017) Feldmann, R., Quataert, E., Hopkins, P. F., Faucher-Giguère, C.-A., & Kereš, D. 2017, \bibinfotitleColours, star formation rates and environments of star-forming and quiescent galaxies at the cosmic noon, MNRAS, 470, 1050, doi: 10.1093/mnras/stx1120
R. Feldmann et al. (2023) Feldmann, R., Quataert, E., Faucher-Giguère, C.-A., et al. 2023, \bibinfotitleFIREbox: simulating galaxies at high dynamic range in a cosmological volume, MNRAS, 522, 3831, doi: 10.1093/mnras/stad1205
G. J. Ferland et al. (2017) Ferland, G. J., Chatzikos, M., Guzmán, F., et al. 2017, \bibinfotitleThe 2017 Release Cloudy, Rev. Mexicana Astron. Astrofis., 53, 385, doi: 10.48550/arXiv.1705.10877
F. Feroz et al. (2009) Feroz, F., Hobson, M. P., & Bridges, M. 2009, \bibinfotitleMULTINEST: an efficient and robust Bayesian inference tool for cosmology and particle physics, MNRAS, 398, 1601, doi: 10.1111/j.1365-2966.2009.14548.x
A. Ferrara (2024) Ferrara, A. 2024, \bibinfotitleSuper-early JWST galaxies, outflows, and Ly $\alpha$ visibility in the Epoch of Reionization, A&A, 684, A207, doi: 10.1051/0004-6361/202348321
S. L. Finkelstein et al. (2023) Finkelstein, S. L., Bagley, M. B., Ferguson, H. C., et al. 2023, \bibinfotitleCEERS Key Paper. I. An Early Look into the First 500 Myr of Galaxy Formation with JWST, ApJ, 946, L13, doi: 10.3847/2041-8213/acade4
J. A. Flores Velázquez et al. (2021) Flores Velázquez, J. A., Gurvich, A. B., Faucher-Giguère, C.-A., et al. 2021, \bibinfotitleThe time-scales probed by star formation rate indicators for realistic, bursty star formation histories from the FIRE simulations, MNRAS, 501, 4812, doi: 10.1093/mnras/staa3893
D. Foreman-Mackey et al. (2013) Foreman-Mackey, D., Hogg, D. W., Lang, D., & Goodman, J. 2013, \bibinfotitleemcee: The MCMC Hammer, PASP, 125, 306, doi: 10.1086/670067
Z. Gao et al. (2025) Gao, Z., Peng, Y., Wang, K., et al. 2025, \bibinfotitleFrom Halos to Galaxies. X. Decoding Galaxy SEDs with Physical Priors and Accurate Star Formation History Reconstruction, ApJ, 979, 66, doi: 10.3847/1538-4357/ad9a5c
K. Glazebrook et al. (1999) Glazebrook, K., Blake, C., Economou, F., Lilly, S., & Colless, M. 1999, \bibinfotitleMeasurement of the star formation rate from H $\alpha$ in field galaxies at z=1, MNRAS, 306, 843, doi: 10.1046/j.1365-8711.1999.02576.x
J. Goodman & J. Weare (2010) Goodman, J., & Weare, J. 2010, \bibinfotitleEnsemble samplers with affine invariance, Communications in Applied Mathematics and Computational Science, 5, 65, doi: 10.2140/camcos.2010.5.65
F. Governato et al. (2012) Governato, F., Zolotov, A., Pontzen, A., et al. 2012, \bibinfotitleCuspy no more: how outflows affect the central dark matter and baryon distribution in $\Lambda$ cold dark matter galaxies, MNRAS, 422, 1231, doi: 10.1111/j.1365-2966.2012.20696.x
Y. Guo et al. (2016) Guo, Y., Rafelski, M., Faber, S. M., et al. 2016, \bibinfotitleThe Bursty Star Formation Histories of Low-mass Galaxies at 0.4 ¡ z ¡ 1 Revealed by Star Formation Rates Measured From H $\beta$ and FUV, ApJ, 833, 37, doi: 10.3847/1538-4357/833/1/37
C. Hahn & P. Melchior (2022) Hahn, C., & Melchior, P. 2022, \bibinfotitleAccelerated Bayesian SED Modeling Using Amortized Neural Posterior Estimation, ApJ, 938, 11, doi: 10.3847/1538-4357/ac7b84
W. J. Handley et al. (2015) Handley, W. J., Hobson, M. P., & Lasenby, A. N. 2015, \bibinfotitlePOLYCHORD: next-generation nested sampling, MNRAS, 453, 4384, doi: 10.1093/mnras/stv1911
Y. Harikane et al. (2023) Harikane, Y., Ouchi, M., Oguri, M., et al. 2023, \bibinfotitleA Comprehensive Study of Galaxies at z 9-16 Found in the Early JWST Data: Ultraviolet Luminosity Functions and Cosmic Star Formation History at the Pre-reionization Epoch, ApJS, 265, 5, doi: 10.3847/1538-4365/acaaa9
C. R. Harris et al. (2020) Harris, C. R., Millman, K. J., van der Walt, S. J., et al. 2020, \bibinfotitleArray programming with NumPy, Nature, 585, 357, doi: 10.1038/s41586-020-2649-2
G. Hinshaw et al. (2013) Hinshaw, G., Larson, D., Komatsu, E., et al. 2013, \bibinfotitleNine-year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Cosmological Parameter Results, ApJS, 208, 19, doi: 10.1088/0067-0049/208/2/19
M. Ho et al. (2024) Ho, M., Bartlett, D. J., Chartier, N., et al. 2024, \bibinfotitleLtU-ILI: An All-in-One Framework for Implicit Inference in Astrophysics and Cosmology, The Open Journal of Astrophysics, 7, 54, doi: 10.33232/001c.120559
M. D. Hoffman & A. Gelman (2011) Hoffman, M. D., & Gelman, A. 2011, \bibinfotitleThe No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo, arXiv e-prints, arXiv:1111.4246, doi: 10.48550/arXiv.1111.4246
P. F. Hopkins et al. (2014) Hopkins, P. F., Kereš, D., Oñorbe, J., et al. 2014, \bibinfotitleGalaxies on FIRE (Feedback In Realistic Environments): stellar feedback explains cosmologically inefficient star formation, MNRAS, 445, 581, doi: 10.1093/mnras/stu1738
P. F. Hopkins et al. (2023a) Hopkins, P. F., Gurvich, A. B., Shen, X., et al. 2023a, \bibinfotitleWhat causes the formation of discs and end of bursty star formation?, MNRAS, 525, 2241, doi: 10.1093/mnras/stad1902
P. F. Hopkins et al. (2023b) Hopkins, P. F., Wetzel, A., Wheeler, C., et al. 2023b, \bibinfotitleFIRE-3: updated stellar evolution models, yields, and microphysics and fitting functions for applications in galaxy simulations, MNRAS, 519, 3154, doi: 10.1093/mnras/stac3489
J. D. Hunter (2007) Hunter, J. D. 2007, \bibinfotitleMatplotlib: A 2D Graphics Environment, Computing in Science and Engineering, 9, 90, doi: 10.1109/MCSE.2007.55
K. G. Iyer et al. (2019) Iyer, K. G., Gawiser, E., Faber, S. M., et al. 2019, \bibinfotitleNonparametric Star Formation History Reconstruction with Gaussian Processes. I. Counting Major Episodes of Star Formation, ApJ, 879, 116, doi: 10.3847/1538-4357/ab2052
K. G. Iyer et al. (2024) Iyer, K. G., Speagle, J. S., Caplar, N., et al. 2024, \bibinfotitleStochastic Modeling of Star Formation Histories. III. Constraints from Physically Motivated Gaussian Processes, ApJ, 961, 53, doi: 10.3847/1538-4357/acff64
K. G. Iyer et al. (2020) Iyer, K. G., Tacchella, S., Genel, S., et al. 2020, \bibinfotitleThe diversity and variability of star formation histories in models of galaxy evolution, MNRAS, 498, 430, doi: 10.1093/mnras/staa2150
B. D. Johnson et al. (2021) Johnson, B. D., Leja, J., Conroy, C., & Speagle, J. S. 2021, \bibinfotitleStellar Population Inference with Prospector, ApJS, 254, 22, doi: 10.3847/1538-4365/abef67
B. D. Johnson et al. (2013) Johnson, B. D., Weisz, D. R., Dalcanton, J. J., et al. 2013, \bibinfotitleMeasuring Galaxy Star Formation Rates from Integrated Photometry: Insights from Color-Magnitude Diagrams of Resolved Stars, ApJ, 772, 8, doi: 10.1088/0004-637X/772/1/8
M. Karamanis et al. (2022) Karamanis, M., Beutler, F., Peacock, J. A., Nabergoj, D., & Seljak, U. 2022, \bibinfotitleAccelerating astronomical and cosmological inference with preconditioned Monte Carlo, MNRAS, 516, 1644, doi: 10.1093/mnras/stac2272
J. Kennicutt (1998) Kennicutt, Robert C., J. 1998, \bibinfotitleStar Formation in Galaxies Along the Hubble Sequence, ARA&A, 36, 189, doi: 10.1146/annurev.astro.36.1.189
T. Kettlety et al. (2018) Kettlety, T., Hesling, J., Phillipps, S., et al. 2018, \bibinfotitleGalaxy and mass assembly (GAMA): the consistency of GAMA and WISE derived mass-to-light ratios, MNRAS, 473, 776, doi: 10.1093/mnras/stx2379
G. Khullar et al. (2022) Khullar, G., Nord, B., Ćiprijanović, A., Poh, J., & Xu, F. 2022, \bibinfotitleDIGS: deep inference of galaxy spectra with neural posterior estimation, Machine Learning: Science and Technology, 3, 04LT04, doi: 10.1088/2632-2153/ac98f4
J. U. Lange (2023) Lange, J. U. 2023, \bibinfotitleNAUTILUS: boosting Bayesian importance nested sampling with deep learning, MNRAS, 525, 3181, doi: 10.1093/mnras/stad2441
J. C. Lee et al. (2009) Lee, J. C., Gil de Paz, A., Tremonti, C., et al. 2009, \bibinfotitleComparison of H $\alpha$ and UV Star Formation Rates in the Local Volume: Systematic Discrepancies for Dwarf Galaxies, ApJ, 706, 599, doi: 10.1088/0004-637X/706/1/599
L. Legrand et al. (2022) Legrand, L., Hutter, A., Dayal, P., et al. 2022, \bibinfotitleAstraeus IV: quantifying the star formation histories of galaxies in the Epoch of Reionization, MNRAS, 509, 595, doi: 10.1093/mnras/stab3034
J. Leja et al. (2019) Leja, J., Carnall, A. C., Johnson, B. D., Conroy, C., & Speagle, J. S. 2019, \bibinfotitleHow to Measure Galaxy Star Formation Histories. II. Nonparametric Models, ApJ, 876, 3, doi: 10.3847/1538-4357/ab133c
J. Leja et al. (2018) Leja, J., Johnson, B. D., Conroy, C., & van Dokkum, P. 2018, \bibinfotitleHot Dust in Panchromatic SED Fitting: Identification of Active Galactic Nuclei and Improved Galaxy Properties, ApJ, 854, 62, doi: 10.3847/1538-4357/aaa8db
J. Leja et al. (2017) Leja, J., Johnson, B. D., Conroy, C., van Dokkum, P. G., & Byler, N. 2017, \bibinfotitleDeriving Physical Properties from Broadband Photometry with Prospector: Description of the Model and a Demonstration of its Accuracy Using 129 Galaxies in the Local Universe, ApJ, 837, 170, doi: 10.3847/1538-4357/aa5ffe
J. Leja et al. (2022) Leja, J., Speagle, J. S., Ting, Y.-S., et al. 2022, \bibinfotitleA New Census of the 0.2 ¡ z ¡ 3.0 Universe. II. The Star-forming Sequence, ApJ, 936, 165, doi: 10.3847/1538-4357/ac887d
J. Li et al. (2024) Li, J., Melchior, P., Hahn, C., & Huang, S. 2024, \bibinfotitlePopSED: Population-level Inference for Galaxy Properties from Broadband Photometry with Neural Density Estimation, AJ, 167, 16, doi: 10.3847/1538-3881/ad0be4
T. J. Looser et al. (2024) Looser, T. J., D’Eugenio, F., Maiolino, R., et al. 2024, \bibinfotitleA recently quenched galaxy 700 million years after the Big Bang, Nature, 629, 53, doi: 10.1038/s41586-024-07227-0
T. J. Looser et al. (2025) Looser, T. J., D’Eugenio, F., Maiolino, R., et al. 2025, \bibinfotitleJADES: Differing assembly histories of galaxies: Observational evidence for bursty star formation histories and (mini-)quenching in the first billion years of the Universe, A&A, 697, A88, doi: 10.1051/0004-6361/202347102
P. Madau (1995) Madau, P. 1995, \bibinfotitleRadiative Transfer in a Clumpy Universe: The Colors of High-Redshift Galaxies, ApJ, 441, 18, doi: 10.1086/175332
M. V. Maseda et al. (2014) Maseda, M. V., van der Wel, A., Rix, H.-W., et al. 2014, \bibinfotitleThe Nature of Extreme Emission Line Galaxies at z = 1-2: Kinematics and Metallicities from Near-infrared Spectroscopy, ApJ, 791, 17, doi: 10.1088/0004-637X/791/1/17
M. V. Maseda et al. (2018) Maseda, M. V., van der Wel, A., Rix, H.-W., et al. 2018, \bibinfotitleThe Number Density Evolution of Extreme Emission Line Galaxies in 3D-HST: Results from a Novel Automated Line Search Technique for Slitless Spectroscopy, ApJ, 854, 29, doi: 10.3847/1538-4357/aaa76e
C. A. Mason et al. (2023) Mason, C. A., Trenti, M., & Treu, T. 2023, \bibinfotitleThe brightest galaxies at cosmic dawn, MNRAS, 521, 497, doi: 10.1093/mnras/stad035
V. Mauerhofer & P. Dayal (2023) Mauerhofer, V., & Dayal, P. 2023, \bibinfotitleThe dust enrichment of early galaxies in the JWST and ALMA era, MNRAS, 526, 2196, doi: 10.1093/mnras/stad2734
W. McClymont et al. (2025) McClymont, W., Tacchella, S., Smith, A., et al. 2025, \bibinfotitleThe THESAN-ZOOM project: Burst, quench, repeat – unveiling the evolution of high-redshift galaxies along the star-forming main sequence, arXiv e-prints, arXiv:2503.00106, doi: 10.48550/arXiv.2503.00106
C. F. McKee & E. C. Ostriker (2007) McKee, C. F., & Ostriker, E. C. 2007, \bibinfotitleTheory of Star Formation, ARA&A, 45, 565, doi: 10.1146/annurev.astro.45.051806.110602
V. Mehta et al. (2023) Mehta, V., Teplitz, H. I., Scarlata, C., et al. 2023, \bibinfotitleA Spatially Resolved Analysis of Star Formation Burstiness by Comparing UV and H $\alpha$ in Galaxies at z $\sim$ 1 with UVCANDELS, ApJ, 952, 133, doi: 10.3847/1538-4357/acd9cf
T. Naab & J. P. Ostriker (2017) Naab, T., & Ostriker, J. P. 2017, \bibinfotitleTheoretical Challenges in Galaxy Formation, ARA&A, 55, 59, doi: 10.1146/annurev-astro-081913-040019
R. P. Naidu et al. (2022) Naidu, R. P., Oesch, P. A., van Dokkum, P., et al. 2022, \bibinfotitleTwo Remarkably Luminous Galaxy Candidates at z $\approx$ 10-12 Revealed by JWST, ApJ, 940, L14, doi: 10.3847/2041-8213/ac9b22
T. Nanayakkara et al. (2020) Nanayakkara, T., Brinchmann, J., Glazebrook, K., et al. 2020, \bibinfotitleReconstructing the Observed Ionizing Photon Production Efficiency at z $\sim$ 2 Using Stellar Population Models, ApJ, 889, 180, doi: 10.3847/1538-4357/ab65eb
D. Narayanan et al. (2024) Narayanan, D., Lower, S., Torrey, P., et al. 2024, \bibinfotitleOutshining by Recent Star Formation Prevents the Accurate Measurement of High-z Galaxy Stellar Masses, ApJ, 961, 73, doi: 10.3847/1538-4357/ad0966
S. Noll et al. (2009) Noll, S., Burgarella, D., Giovannoli, E., et al. 2009, \bibinfotitleAnalysis of galaxy spectral energy distributions from far-UV to far-IR with CIGALE: studying a SINGS test sample, A&A, 507, 1793, doi: 10.1051/0004-6361/200912497
Y. Ono et al. (2018) Ono, Y., Ouchi, M., Harikane, Y., et al. 2018, \bibinfotitleGreat Optically Luminous Dropout Research Using Subaru HSC (GOLDRUSH). I. UV luminosity functions at z $\sim$ 4-7 derived with the half-million dropouts on the 100 deg² sky, PASJ, 70, S10, doi: 10.1093/pasj/psx103
C. Pacifici et al. (2023) Pacifici, C., Iyer, K. G., Mobasher, B., et al. 2023, \bibinfotitleThe Art of Measuring Physical Parameters in Galaxies: A Critical Assessment of Spectral Energy Distribution Fitting Techniques, ApJ, 944, 141, doi: 10.3847/1538-4357/acacff
F. Pacucci & A. Loeb (2022) Pacucci, F., & Loeb, A. 2022, \bibinfotitleThe search for the farthest quasar: consequences for black hole growth and seed models, MNRAS, 509, 1885, doi: 10.1093/mnras/stab3071
A. Pallottini & A. Ferrara (2023) Pallottini, A., & Ferrara, A. 2023, \bibinfotitleStochastic star formation in early galaxies: Implications for the James Webb Space Telescope, A&A, 677, L4, doi: 10.1051/0004-6361/202347384
V. M. Panaretos & Y. Zemel (2019) Panaretos, V. M., & Zemel, Y. 2019, \bibinfotitleStatistical Aspects of Wasserstein Distances, Annual Review of Statistics and Its Application, 6, 405, doi: 10.1146/annurev-statistics-030718-104938
C. Papovich et al. (2001) Papovich, C., Dickinson, M., & Ferguson, H. C. 2001, \bibinfotitleThe Stellar Populations and Evolution of Lyman Break Galaxies, ApJ, 559, 620, doi: 10.1086/322412
M. Park et al. (2024) Park, M., Belli, S., Conroy, C., et al. 2024, \bibinfotitleWidespread Rapid Quenching at Cosmic Noon Revealed by JWST Deep Spectroscopy, ApJ, 976, 72, doi: 10.3847/1538-4357/ad7e15
A. Pillepich et al. (2018) Pillepich, A., Springel, V., Nelson, D., et al. 2018, \bibinfotitleSimulating galaxy formation with the IllustrisTNG model, MNRAS, 473, 4077, doi: 10.1093/mnras/stx2656
S. H. Price et al. (2025) Price, S. H., Bezanson, R., Labbe, I., et al. 2025, \bibinfotitleThe UNCOVER Survey: First Release of Ultradeep JWST/NIRSpec PRISM Spectra for $\sim$ 700 Galaxies from z $\sim$ 0.3–13 in A2744, ApJ, 982, 51, doi: 10.3847/1538-4357/adaec1
A. Renzini (2023) Renzini, A. 2023, \bibinfotitleA transient overcooling in the early Universe? Clues from globular clusters formation, MNRAS, 525, L117, doi: 10.1093/mnrasl/slad091
B. E. Robertson et al. (2023) Robertson, B. E., Tacchella, S., Johnson, B. D., et al. 2023, \bibinfotitleIdentification and properties of intense star-forming galaxies at redshifts z ¿ 10, Nature Astronomy, 7, 611, doi: 10.1038/s41550-023-01921-1
P. Sánchez-Blázquez et al. (2006) Sánchez-Blázquez, P., Peletier, R. F., Jiménez-Vicente, J., et al. 2006, \bibinfotitleMedium-resolution Isaac Newton Telescope library of empirical spectra, MNRAS, 371, 703, doi: 10.1111/j.1365-2966.2006.10699.x
X. Shen et al. (2023) Shen, X., Vogelsberger, M., Boylan-Kolchin, M., Tacchella, S., & Kannan, R. 2023, \bibinfotitleThe impact of UV variability on the abundance of bright galaxies at z $\geq$ 9, MNRAS, 525, 3254, doi: 10.1093/mnras/stad2508
J. Sipple & A. Lidz (2024) Sipple, J., & Lidz, A. 2024, \bibinfotitleThe Star Formation Efficiency during Reionization as Inferred from the Hubble Frontier Fields, ApJ, 961, 50, doi: 10.3847/1538-4357/ad06a7
J. Skilling (2004) Skilling, J. 2004, in American Institute of Physics Conference Series, Vol. 735, Bayesian Inference and Maximum Entropy Methods in Science and Engineering: 24th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, ed. R. Fischer, R. Preuss, & U. V. Toussaint (AIP), 395–405, doi: 10.1063/1.1835238
R. S. Somerville & R. Davé (2015) Somerville, R. S., & Davé, R. 2015, \bibinfotitlePhysical Models of Galaxy Formation in a Cosmological Framework, ARA&A, 53, 51, doi: 10.1146/annurev-astro-082812-140951
M. Sparre et al. (2017) Sparre, M., Hayward, C. C., Feldmann, R., et al. 2017, \bibinfotitle(Star)bursts of FIRE: observational signatures of bursty star formation in galaxies, MNRAS, 466, 88, doi: 10.1093/mnras/stw3011
J. S. Speagle (2020) Speagle, J. S. 2020, \bibinfotitleDYNESTY: a dynamic nested sampling package for estimating Bayesian posteriors and evidences, MNRAS, 493, 3132, doi: 10.1093/mnras/staa278
J. S. Speagle et al. (2014) Speagle, J. S., Steinhardt, C. L., Capak, P. L., & Silverman, J. D. 2014, \bibinfotitleA Highly Consistent Framework for the Evolution of the Star-Forming “Main Sequence” from z ~0-6, ApJS, 214, 15, doi: 10.1088/0067-0049/214/2/15
G. S. Stinson et al. (2007) Stinson, G. S., Dalcanton, J. J., Quinn, T., Kaufmann, T., & Wadsley, J. 2007, \bibinfotitleBreathing in Low-Mass Galaxies: A Study of Episodic Star Formation, ApJ, 667, 170, doi: 10.1086/520504
V. Strait et al. (2023) Strait, V., Brammer, G., Muzzin, A., et al. 2023, \bibinfotitleAn Extremely Compact, Low-mass Galaxy on its Way to Quiescence at z = 5.2, ApJ, 949, L23, doi: 10.3847/2041-8213/acd457
K. A. Suess et al. (2022) Suess, K. A., Leja, J., Johnson, B. D., et al. 2022, \bibinfotitleRecovering the Star Formation Histories of Recently Quenched Galaxies: The Impact of Model and Prior Choices, ApJ, 935, 146, doi: 10.3847/1538-4357/ac82b0
K. A. Suess et al. (2024) Suess, K. A., Weaver, J. R., Price, S. H., et al. 2024, \bibinfotitleMedium Bands, Mega Science: A JWST/NIRCam Medium-band Imaging Survey of A2744, ApJ, 976, 101, doi: 10.3847/1538-4357/ad75fe
G. Sun et al. (2023a) Sun, G., Faucher-Giguère, C.-A., Hayward, C. C., & Shen, X. 2023a, \bibinfotitleSeen and unseen: bursty star formation and its implications for observations of high-redshift galaxies with JWST, MNRAS, 526, 2665, doi: 10.1093/mnras/stad2902
G. Sun et al. (2023b) Sun, G., Faucher-Giguère, C.-A., Hayward, C. C., et al. 2023b, \bibinfotitleBursty Star Formation Naturally Explains the Abundance of Bright Galaxies at Cosmic Dawn, ApJ, 955, L35, doi: 10.3847/2041-8213/acf85a
G. Sun et al. (2025) Sun, G., Muñoz, J. B., Mirocha, J., & Faucher-Giguère, C.-A. 2025, \bibinfotitleConstraining bursty star formation histories with galaxy UV and H $\alpha$ luminosity functions and clustering, J. Cosmology Astropart. Phys, 2025, 034, doi: 10.1088/1475-7516/2025/04/034
S. Tacchella et al. (2016) Tacchella, S., Dekel, A., Carollo, C. M., et al. 2016, \bibinfotitleEvolution of density profiles in high-z galaxies: compaction and quenching inside-out, MNRAS, 458, 242, doi: 10.1093/mnras/stw303
S. Tacchella et al. (2022) Tacchella, S., Conroy, C., Faber, S. M., et al. 2022, \bibinfotitleFast, Slow, Early, Late: Quenching Massive Galaxies at z $\sim$ 0.8, ApJ, 926, 134, doi: 10.3847/1538-4357/ac449b
R. Trotta (2008) Trotta, R. 2008, \bibinfotitleBayes in the sky: Bayesian inference and model selection in cosmology, ConPh, 49, 71, doi: 10.1080/00107510802066753
J. A. A. Trussler et al. (2025) Trussler, J. A. A., Conselice, C. J., Adams, N., et al. 2025, \bibinfotitleLike a candle in the wind: the embers of once aflame, now smouldering galaxies at 5 ¡ z ¡ 8, MNRAS, 537, 3662, doi: 10.1093/mnras/staf213
A. van der Wel et al. (2011) van der Wel, A., Rix, H.-W., Wuyts, S., et al. 2011, \bibinfotitleThe Majority of Compact Massive Galaxies at z ~2 are Disk Dominated, ApJ, 730, 38, doi: 10.1088/0004-637X/730/1/38
P. van Dokkum & C. Conroy (2024) van Dokkum, P., & Conroy, C. 2024, \bibinfotitleReconciling M/L Ratios Across Cosmic Time: a Concordance IMF for Massive Galaxies, ApJ, 973, L32, doi: 10.3847/2041-8213/ad77b8
P. Virtanen et al. (2020) Virtanen, P., Gommers, R., Oliphant, T. E., et al. 2020, \bibinfotitleSciPy 1.0: fundamental algorithms for scientific computing in Python, Nature Methods, 17, 261, doi: 10.1038/s41592-019-0686-2
J. T. Wan et al. (2024) Wan, J. T., Tacchella, S., Johnson, B. D., et al. 2024, \bibinfotitleStochastic prior for non-parametric star-formation histories, MNRAS, 532, 4002, doi: 10.1093/mnras/stae1734
B. Wang et al. (2023a) Wang, B., Leja, J., Villar, V. A., & Speagle, J. S. 2023a, \bibinfotitleSBI⁺⁺: Flexible, Ultra-fast Likelihood-free Inference Customized for Astronomical Applications, ApJ, 952, L10, doi: 10.3847/2041-8213/ace361
B. Wang et al. (2023b) Wang, B., Leja, J., Bezanson, R., et al. 2023b, \bibinfotitleInferring More from Less: Prospector as a Photometric Redshift Engine in the Era of JWST, ApJ, 944, L58, doi: 10.3847/2041-8213/acba99
B. Wang et al. (2024a) Wang, B., Leja, J., Atek, H., et al. 2024a, \bibinfotitleQuantifying the Effects of Known Unknowns on Inferred High-redshift Galaxy Properties: Burstiness, IMF, and Nebular Physics, ApJ, 963, 74, doi: 10.3847/1538-4357/ad187c
B. Wang et al. (2024b) Wang, B., Leja, J., Labbé, I., et al. 2024b, \bibinfotitleThe UNCOVER Survey: A First-look HST+JWST Catalog of Galaxy Redshifts and Stellar Population Properties Spanning 0.2 $\lesssim$ z $\lesssim$ 15, ApJS, 270, 12, doi: 10.3847/1538-4365/ad0846
B. Wang et al. (2024c) Wang, B., Leja, J., de Graaff, A., et al. 2024c, \bibinfotitleRUBIES: Evolved Stellar Populations with Extended Formation Histories at z $\sim$ 7–8 in Candidate Massive Galaxies Identified with JWST/NIRSpec, ApJ, 969, L13, doi: 10.3847/2041-8213/ad55f7
A. Weibel et al. (2025) Weibel, A., de Graaff, A., Setton, D. J., et al. 2025, \bibinfotitleRUBIES Reveals a Massive Quiescent Galaxy at z = 7.3, ApJ, 983, 11, doi: 10.3847/1538-4357/adab7a
R. Weinberger et al. (2017) Weinberger, R., Springel, V., Hernquist, L., et al. 2017, \bibinfotitleSimulating galaxy formation with black hole driven thermal and kinetic feedback, MNRAS, 465, 3291, doi: 10.1093/mnras/stw2944
D. R. Weisz et al. (2012) Weisz, D. R., Johnson, B. D., Johnson, L. C., et al. 2012, \bibinfotitleModeling the Effects of Star Formation Histories on H $\alpha$ and Ultraviolet Fluxes in nearby Dwarf Galaxies, ApJ, 744, 44, doi: 10.1088/0004-637X/744/1/44
K. E. Whitaker et al. (2012) Whitaker, K. E., van Dokkum, P. G., Brammer, G., & Franx, M. 2012, \bibinfotitleThe Star Formation Mass Sequence Out to z = 2.5, ApJ, 754, L29, doi: 10.1088/2041-8205/754/2/L29
L. Whitler et al. (2023) Whitler, L., Stark, D. P., Endsley, R., et al. 2023, \bibinfotitleStar formation histories of UV-luminous galaxies at z ≃ 6.8: implications for stellar mass assembly at early cosmic times, MNRAS, 519, 5859, doi: 10.1093/mnras/stad004
C. J. Willott et al. (2022) Willott, C. J., Doyon, R., Albert, L., et al. 2022, \bibinfotitleThe Near-infrared Imager and Slitless Spectrograph for the James Webb Space Telescope. II. Wide Field Slitless Spectroscopy, PASP, 134, 025002, doi: 10.1088/1538-3873/ac5158
C. Witten et al. (2025) Witten, C., McClymont, W., Laporte, N., et al. 2025, \bibinfotitleRising from the ashes: evidence of old stellar populations and rejuvenation events in the very early Universe, MNRAS, 537, 112, doi: 10.1093/mnras/staf001
L. Y. A. Yung et al. (2024) Yung, L. Y. A., Somerville, R. S., Finkelstein, S. L., Wilkins, S. M., & Gardner, J. P. 2024, \bibinfotitleAre the ultra-high-redshift galaxies at z ¿ 10 surprising in the context of standard galaxy formation models?, MNRAS, 527, 5929, doi: 10.1093/mnras/stad3484
K. Zhang et al. (2023) Zhang, K., Bloom, J., & Hernitschek, N. 2023, in Machine Learning for Astrophysics, 38, doi: 10.48550/arXiv.2312.03824