Predicting the 21 cm field with a Hybrid Effective Field Theory approach
Abstract
A detection of the 21 cm signal can provide a unique window of opportunity for uncovering complex astrophysical phenomena at the epoch of reionization and placing constraints on cosmology at high redshifts, which are usually elusive to large-scale structure surveys. In this work, we provide a theoretical model based on a quadratic bias expansion capable of recovering the 21 cm power spectrum with high accuracy sufficient for upcoming ground-based radio interferometer experiments. In particular, we develop a hybrid effective field theory (HEFT) model in redshift space that leverages the accuracy of -body simulations with the predictive power of analytical bias expansion models, and test it against the Thesan suite of radiative transfer hydrodynamical simulations. We make predictions of the 21 cm brightness temperature field at several distinct redshifts, ranging between and 11, thus probing a large fraction of the reionization history of the Universe (), and compare our model to the ‘true’ 21 cm brightness in terms of the correlation coefficient, power spectrum and modeling error. We find percent-level agreement at large and intermediate scales, , and favorable behavior down to small scales, , outperforming pure perturbation-theory-based models. To put our findings into context, we show that even in the absence of any foreground contamination the thermal noise of a futuristic HERA-like experiment is comparable with the theoretical uncertainty in our model in the allowed ‘wedge’ of observations, providing further evidence in support of using HEFT-based models to approximate a range of cosmological observables.
I Introduction
Detecting the hyperfine transition of neutral hydrogen in the form of emitted radiation at a rest wavelength of 21 cm offers a viable path for mapping out structure in the Universe at high redshifts, which are otherwise hard to probe due to the small number of detectable luminous sources. Currently, most of the cosmological information used to test our theories about the Universe comes from either large-scale structure (LSS) surveys at low redshifts or the cosmic microwave background (CMB), emitted at during the epoch of recombination. As such, much of the formation history of the Universe at intermediate redshifts remains unexplored. The ideal tracer of the period spanning between the epoch of recombination and the epoch of reionization (EoR) is the diffuse neutral hydrogen gas, detectable in the form of 21 cm emission.
The EoR refers to the period during which the first sources of (ionizing) UV radiation turned on and ionized the neutral hydrogen in the Universe (2016ARA&A..54..313M, ). In addition to providing avenues for testing central assumptions in cosmology, studying the EoR is invaluable to understanding the formation of galaxies and other complex astrophysical phenomena. As indicated by CMB and Ly forest measurements, the mean redshift of reionization is (2016A&A…596A.108P, ), and it is mostly complete by (2015MNRAS.447..499M, ). Improvements in the measurements of the reionization process could reveal information about the properties of the ionizing sources (2006MNRAS.365..115F, ; 2007MNRAS.377.1043M, ), the abundance of small-scale structures that absorb many of the ionizing photons (2006MNRAS.366..689C, ; 2009MNRAS.394..960C, ; 2008ApJ…682…14F, ), and overall help us to better grasp the process of galaxy formation.
Several experiments are already well on their way to directly image the neutral hydrogen and map the 21 cm signal from the EoR. A number of them attempt to do so at the level of the mean global signal (monopole) such as EDGES (2017ApJ…835…49M, ), LEDA (2018MNRAS.478.4193P, ), PRIZM (2019JAI…..850004P, ), and SARAS (2018ExA….45..269S, ), while others are after detecting the fluctuations in the signal such as PAPER (2010AJ….139.1468P, ), the MWA (2013PASA…30….7T, ; 2013PASA…30…31B, ), LOFAR (2013A&A…556A…2V, ), HERA (2017PASP..129d5001D, ), and eventually the Square Kilometre Array (SKA) (2020PASA…37….2W, ). Additionally, efforts such as CHIME (2014SPIE.9145E..22B, ), HIRAX (2016SPIE.9906E..5XN, ), and CHORD (2019clrp.2020…28V, ) attempt to trace the post-reionization history of the Universe by adopting intensity mapping techniques. Once a detection of the redshifted 21 cm signal is confirmed, the challenge is to robustly extract information about the EoR and provide constraints on our cosmological model.
To allow for a reliable interpretation of the data, a number of numerical methods have been developed, including semi-analytic reionization models such as 21CMFAST (2011MNRAS.411..955M, ) and SimFast21 (2010MNRAS.406.2421S, ). The advantage of these methods is that they are computationally inexpensive and can therefore be scaled up to cosmological volumes. In these models, ionized regions are typically ‘painted’ on top of a realization of the matter density distribution at the locations where sources are likely to reside (2005MNRAS.363.1031F, ; 2012ApJ…747..126A, ; 2014MNRAS.440.1662S, ). While some of these prescriptions can also incorporate models for the photon sinks that retard reionization, they do not directly resolve them and as such are not as faithful on small scales.
The most reliable method for simulating the reionization process is through the use of radiative transfer simulations (1997ApJ…486..581G, ; 2003MNRAS.343.1101C, ; 2006MNRAS.369.1625I, ; 2007ApJ…671….1T, ; 2014ApJ…793…30G, ; 2017MNRAS.466..960P, ), which are much more computationally expensive and are therefore often limited to smaller box sizes (100 Mpc) (2005ApJ…624L..65B, ). The challenge is that these simulations need to have a sufficiently large number of resolution elements in order to capture the smallest intergalactic structures that act as photon sinks (2000ApJ…530….1M, ; 2000ApJ…542..535G, ; 2004MNRAS.348..753S, ), and at the same time have a sufficiently large volume to provide a representative sample of the largest structures found in the Universe. Nonethlesess, these simulations are prohibitively expensive to survey the vast parameter space of potential source models. However, perturbative models, inspired by Effective Field Theory (EFT) (e.g., 2005MNRAS.363.1031F, ; 2007ApJ…654…12Z, ; 2011MNRAS.414..727Z, ), can potentially provide a fast way of analyzing 21 cm observables and thus, elucidate many astrophysical and cosmological conundrums.
Until recently, it was believed that the 21 cm signal cannot be treated perturbatively in the regime probed by 21 cm experiments due to the presence of large ionized bubbles. However, developments in the last few years (2007MNRAS.375..324Z, ; 2015PhRvD..91h3015M, ; 2018JCAP…10..016M, ; 2022PhRvD.106l3506Q, ; 2022JCAP…10..007S, ) suggest otherwise: the 21cm signal is likely perturbative on many of the scales that future interferometric instruments will be sensitive to. On these scales, the signal is modeled in terms of ‘bias’ parameters, which connect it to the underlying matter density field and which can be tied to physical quantities relevant to the physics of reionization.
Apart from being able to model the 21 cm field in real space, it is of utmost importance that we develop techniques that work in redshift space as well, where the 21 cm field is affected by the presence of redshift space distortions (RSDs). These contributions to the observed redshift result from the peculiar velocities of the neutral hydrogen along the line-of-sight (LOS) and as such, make it difficult to disentangle the ‘real-space’ distribution of the field from its intrinsic motion along the LOS. Interestingly, future experiments such as HERA will predominantly observe modes that are nearly parallel to the LOS for three main reasons: i) the instrument needs to be shielded from ground radio sources; ii) obtaining high spectral resolution is easier than obtaining high angular resolution; iii) modes parallel to the LOS are less affected by foreground contamination due to the chromatic instrument response, which even in the case of spectrally smooth foregrounds induces an unsmooth ringing that is mostly confined to the low and high regime (see e.g., 1997ApJ…486..581G, ; 2003MNRAS.343.1101C, ; 2015aska.confE…1K, ). This underscores the importance of developing redshift-space methods such as the one proposed in this work.
In this paper, we develop a method originally from cosmological perturbation theory for modeling the 21 cm signal in redshift space. Our method provides a viable path towards extracting the cosmological information about the composition of the Universe, encapsulated in the large-scale modes, and the astrophysical information about the morphology of reionization, encoded in the small-scale modes. Since the 21 cm field obeys basic symmetries such as rotational invariance and the equivalence principle, we can write it as an expansion in powers of the density field, tidal field and low-order derivatives on scales larger than the typical size of the ionized bubbles (2009JCAP…08..020M, ; 2018JCAP…10..016M, ; 2018PhR…733….1D, ; 2022PhRvD.106l3506Q, ). To enable us to work to smaller scales, we combine the power of such a symmetries-based bias expansion and the accuracy of -body simulations in the non-linear regime. The free (bias) parameters of our theory can be linked to valuable astrophysical quantities and can reveal the answers to questions such as when reionization occurred, what fraction of the Universe is neutral, what the bias of the sources is, and what the characteristic bubble sizes are.
This paper is organized as follows. In Section II, we provide details about our hybrid perturbative model and describe the radiative transfer simulation Thesan-1, against which we test our model. In Section III, we present our main findings in terms of the two-point correlation function and the relative modeling error, putting it in the context of the sensitivity of future experiments. Finally, we summarize our main conclusions in Section IV.
II Methods
In this section, we describe the radiative transfer simulations adopted in this study and our theoretical model for approximating the 21 cm field.
II.1 Thesan simulations
We test the theoretical model developed in this work via the high-resolution radiation-magneto-hydrodynamical simulation suite, Thesan. This suite is designed to faithfully reproduce both the complex physics of reionization as well as galaxy formation at high redshifts. The comoving box size of these simulations is Mpc, and they are sufficiently well-resolved to capture the formation of atomic cooling halos, which are the smallest structures significantly contributing to the process of reionization. The galaxy formation model is built on top of the well-tried and tested IllustrisTNG model (2017MNRAS.465.3291W, ; 2018MNRAS.473.4077P, ), whereas the initial conditions are produced according to Ref. (2016MNRAS.462L…1A, ). The simulations are run using the radiative transfer version of the Arepo code, Arepo-RT (2010MNRAS.401..791S, ; 2019MNRAS.485..117K, ). Radiation-magnetohydrodynamics equations are solved on a mesh, with mesh-generating points that approximately follow the gas flow via their Voronoi tessellation.
The Thesan simulations have been shown to reproduce a number of observables of the reionization history of the Universe, the evolution of the temperature of the intergalactic medium (IGM), the optical depth to the CMB, the UV luminosity function at (2022MNRAS.514.3857K, ), the photo-ionization rate, the mean-free-path of ionizing photons, the IGM opacity, and the temperature-density relation (2022MNRAS.512.4909G, ). In this work, we employ Thesan-1, the highest-resolution simulation, which contains 21003 dark matter particles with mass and 21003 gas particles of mass , and its dark-matter-only (-body) counterpart Thesan-Dark-1, which contains 21003 dark matter particles with mass . The cosmology of the suite is identical to that of IllustrisTNG: , , , , , .
In this work, we utilize outputs at several different redshifts: , 9, 8.3, 7.5, 7, and 6.5, with neutral hydrogen fraction of , 0.8, 0.71, 0.57, 0.45 and 0.3, respectively, thus sampling a large fraction of the reionization history of the Universe. We show the reionization history predicted by the Thesan-1 simulation in Fig. 1. Consistent with CMB experiments, the mid-point in reionization according to Thesan-1 occurs around . Most of the redshifts at which we probe our theoretical model are in the first half of reionization, where the bubble sizes are smaller, and we expect the 21 cm field to be more perturbative. We also display the characteristic bubble size in the same figure, identified through the mean-free path analysis of Ref. (2024MNRAS.tmp.1328N, ). Towards the end of reionization , the bubbles expand exponentially, increasing the scale range affected by non-local physics. Identifying the scale of non-locality is crucial for perturbative models, as it indicates the regime in which we expect the model to break.
II.2 Theoretical model
In this section, we summarize our approach for modeling the 21 cm signal, which is inspired by the treatment of biased tracers in Lagrangian Perturbation Theory (LPT). For a review of perturbation theory, see Ref. 2002PhR…367….1B ; 2018PhR…733….1D ; 2020moco.book…..D .
In the Lagrangian picture, we work with infinitesimal fluid elements labeled by their initial (Lagrangian) positions . Their dynamics are encoded in a displacement vector , sourced by the gravitational potential and defined such that the Eulerian (comoving) positions of the fluid element at some conformal time is 2015JCAP…09..014V ; 2014JCAP…05..022P . The 3D distribution of cosmological observables such as the galaxy field or the 21cm signal is, in general, determined by complex small-scale astrophysical processes and their response to the large-scale matter distribution (2018PhR…733….1D, ).
We can approximate the behavior of any field obeying simple symmetries such as rotation invariance and the equivalence principle in terms of the matter density, velocity gradients and tidal fields in a neighborhood around its trajectory, i.e., within the scale of ‘nonlocality’. In the case of galaxies, this roughly corresponds to the size of the Lagrangian halo radius, since the dominant physical process is the gravitational collapse of the Lagrangian patch into a halo. On the other hand, the 21cm signal is sensitive to potentially larger nonlocalities resulting from the mean free path of ionizing photons, which can travel significant distances in the absence of absorption. In that case, the size of the nonlocality is roughly set by the size of the ionized bubbles. Thus, at higher redshifts, when the bubbles are smaller, we can use perturbation theory to model the signal over a wider range of scales 2018JCAP…10..016M .
We can thus write the dependence of the 21cm signal along its trajectory as an expansion to second order in the initial conditions adopting the Lagrangian EFT framework 2009JCAP…08..020M ; 2015JCAP…11..007S ; 2014JCAP…08..056A ; 2016JCAP…12..007V :
(1) | ||||
where , , and are free bias parameters and is the shear/tidal field, with . Note that the functional can only depend on scalar combinations of the density field and the tidal field. Nonlocality effects are handled by performing a Taylor expansion of the field around . To lowest order, this calls for the inclusion of . We note that on nonlocal scales, i.e., smaller than the bubble size, the Lagrangian approximation breaks down.
The functional can then be advected to the real-space (Eulerian) position 2008PhRvD..77f3530M :
(2) |
Thus, the 21cm clustering signal contains a dynamics piece, , as well as a piece depending on the initial conditions, .
To account for the fact that we observe the signal in redshift space, , and thus pick up contributions in the form of redshift-space distortions (RSD) due to the peculiar velocity along the LOS, , we write the redshift-space signal as 2008PhRvD..78j9901M ; 2019JCAP…03..007V ; 2021JCAP…03..100C :
(3) |
where is the redshift-space displacement field, defined as:
(4) |
In this work, instead of adopting the usual LPT steps of perturbatively expanding the displacement fields and , we use an -body simulation as an accurate solver of the dynamics of the matter field. This method has recently been referred to in the literature as Hybrid Effective Field Theory (HEFT) (2020MNRAS.492.5754M, ; 2021JCAP…09..020H, ; 2021MNRAS.505.1422K, ). While in real space, this is a trivial operation as described below, in redshift space, we need to be a bit more careful, as taking directly the peculiar velocity field and adding it to the displacement field can incur unwanted contributions from the motions of particles on small scales, also known as Finger-of-God (FoG) effects, which can affect observables such as the power spectrum at the scales of cosmological interest. This is exacerbated by the fact that we use the highest-resolution simulation with 21003 particles, for which such FoG effects are inevitable. To mitigate this effect, we low-pass filter the LOS velocity field in Fourier space with the following kernel:
(5) |
We adopt the following values for the free parameters: and and check that our results are unchanged if we vary these parameters by 50%. We then add the low-pass filtered velocities sampled at the location of the DM particles to the displacement field. Our approach for handling redshift-space displacements is not unique, and other methods have been proposed to model the redshift-space clustering of large-scale structure tracers using the HEFT model (see e.g., 2023MNRAS.520.3725P, , for an application on galaxies). A nice property of the field-level HEFT model is that in principle it can be used to model not only the two-point correlation function (power spectrum in Fourier space), but also higher-order statistics such as the 21 cm bispectrum, which are expected to contain much of the information, since the 21 cm field is highly non-Gaussian (2022MNRAS.510.3838W, ).
We summarize the empirical steps we employ in order to obtain the advected operators as follows:
-
1.
We use the initial conditions at to calculate the 4 fields on a cubic grid of size in Fourier space. We note that we have tested our entire pipeline at a resolution of and find that our results are unchanged at the scales of interest.
-
2.
We then evolve the initial condition fields linearly to some redshift of choice, , by tagging each particle at by the values of the four initial conditions fields and then performing TSC interpolation, using these tags as weights. Including a case in which we set the weights to one, this yields the following five advected bias operators: by assigning each particle a weight given by the value of its position in the initial conditions.
-
3.
Finally, we store all the advected fields at that redshift and construct a model of our observable as a linear combination of these operators:
We note that our expansion is performed in the product of the neutral hydrogen fraction and the gas density rather than as a combined expansion in both fields separately, which is more natural from the perspective of perturbation theory applied to large-scale structure 2022PhRvD.106l3506Q ; 2022JCAP…10..007S .
II.3 Hybrid Lagrangian operators
The end product of our theoretical model for the 21 cm field is the Lagrangian fields, , , and , advected to some redshift of interest using an -body simulation. In this section, we compare their power spectra to a theoretical prediction coming from the second-order LPT code, velocileptors 2020JCAP…07..062C and study how strongly correlated these fields are with our observable, the 21 cm field.
We first visualize the advected fields (Lagrangian operators) as well as the 21 cm field in Fig. 2. These are shown at when the Universe is 80% neutral and the bubbles have not yet started to coalesce into large ionized regions. Visible in this plot are the bubble nucleation sites, the sharp ionization fronts where ionizing photons get absorbed, and the regions of high density in neutral hydrogen.
Looking at the higher-order Lagrangian operators, we notice that the field highlights the regions with high bias, the emphasizes regions with large anisotropies, and accentuates the smallest scales aiming to capture non-local effects. We display our observable as well as the Lagrangian operators at the fiducial resolution of this analysis, , which corresponds to a cell size of . This is sufficiently fine to resolve the scales of cosmological interest for future radio interferometer experiments, i.e. Fourier modes smaller than , or configuration-space scales larger than .
We next compare the 15 combinations of cross-power spectra constructed from our five numerically advected fields against the predictions from second-order LPT obtained using the python package velocileptors111https://github.com/sfschen/velocileptors. We compute the power spectrum using 20 bins between and , and note that the fundamental mode of the box is . We show this comparison at in Fig. 3, but we find a consistent picture for all other redshifts used in this study.
We see good agreement overall between theory and numerics: terms involving the and tend to agree best with theory, whereas terms involving the and exhibit the largest deviations from theory likely because both are particularly sensitive to small-scale noise due to aliasing effects. The numerical noise is particularly pronounced on all scales for the -squared operators, and . We note that smoothing of the initial conditions fields before performing the advection helps in reconciling these differences. Thus, the cause of the differences in all of these operators is well-understood from a theoretical point of view and irrelevant to the analysis performed in this work, as it can be diminished via a renormalization.
We are also interested in understanding how correlated the advected fields are with our observable of interest, the 21 cm field . To this end, we compute the cross-correlation coefficient between and the advected fields, which allows us to quantify the amount of correlation, as follows:
(7) |
Qualitatively, we expect the coefficient to drop sharply on small scales once we approach the typical scale of bubbles at the given redshift, which corresponds to the scale of non-locality. In Fig. 4, we show this quantity at for all five advected fields. We note that qualitatively this plot looks very similar at the higher and lower redshifts we consider, with the higher redshifts yielding generally higher correlation and the lower redshifts generating a lower correlation. This can be easily understood as at these higher redshifts there are fewer bubbles to decorrelate the signal from the underlying density fields as well as the Universe is more linear as structure has barely begun to collapse gravitationally.
While on large scales, , we are limited by cosmic variance due to the small volume of the simulation, recently a powerful technique known as Zel’dovich Control Variates has been developed and applied to cosmological observables such as the power spectrum and the correlation function to significantly reduce the sample variance of simulation-measured quantities (2022JCAP…09..059K, ; 2023JCAP…02..008D, ; 2023OJAp….6E..38H, ). This method performs best in the regime where the cross-correlation coefficient between the observable of interest and the Lagrangian/Zel’dovich operators is close to one, as is the case for the 21 cm field at (i.e., when the bubble size is smaller than the box, and is therefore close to one). Since radiative transfer hydrodynamical simulations are prohibitively expensive to run in Gpc-sized boxes, the Control Variates technique could be extremely useful for partially overcoming the volume limitation.
Looking at Fig. 4, we see that the correlation with the DM field peaks at 60% on the largest available scales for our box, then crosses zero around , the characteristic bubble size at this redshift, and finally peaks at very small scales . We note that on these scales the Universe is mostly ionized and only small pockets of neutral hydrogen exist due to recombination and self-shielding. Interestingly, these surviving pockets of neutral hydrogen are traced well by the regions of high density. Furthermore, we see that the operator exhibits the strongest correlation () on large scales and then drops steadily on small scales, crossing zero at , which occurs on smaller scales than the bubble size due to the mode-coupling of Fourier modes. The is the least correlated operator on large scales. As will be discussed later, at these redshifts , the linear bias appears to vanish, i.e. in the Lagrangian picture (see Fig. 6), which implies that the higher-order operators dominate, which provides a plausible explanation for the high cross-correlation coefficient between the field and .
We interpret this due to the fact that the matter displacements are much smaller than the distances photons travel during reionization (i.e., the bubble sizes). In addition, because the sources driving reionization are highly biased, we expect their tidal bias, which is proportional to the 21 cm tidal bias, to be small (2018JCAP…10..016M, ). Finally, the behaves similarly to the term. This is because can be approximated to first order as . Thus, the bias parameter associated with that field can be linked to the characteristic bubble size , as , where is the scale at which this term surges up (and reaches zero). To create this plot, we use a finer grid size than used in the rest of the paper: 60 for all operators except for , for which we adopt a grid size of 0.5 due to its numerical instability on small scales.
III Results
In this section, we present the main findings of this study. We first describe our bias-fitting method and then assess its performance in terms of the cross-correlation coefficient between our model and the 21 cm field and the error power spectrum. To put our findings into context, we also draw a comparison between the thermal noise of a futuristic radio interferometer experiment and the theoretical noise of our model.
III.1 Fitting method
Here we elaborate on the methodology to extract and minimize the stochastic contributions in LSS models by enhancing the accuracy of bias parameter estimates (see PhysRevD.100.043514, ; 2022MNRAS.514.2198K, , for more details). The goal is to estimate the bias parameters of our model by minimizing the difference between the model and the observable . Our result can be expressed through the stochastic field , which represents residuals after removing deterministic contributions from the model:
(8) |
We obtain the bias parameters by solving the least squares problem of minimizing the error power spectrum, which leads us to the following expression for the bias transfer functions:
(9) |
In particular, we do this as follows:
(10) |
where and are defined as
(11) |
(12) |
and
(13) |
(14) |
where the subscript b represents the index that runs from 1 to , which corresponds to the number of bins into which we have divided our Fourier modes. Once the coefficients are obtained, they can be substituted back into our model to estimate the error power spectrum as follows:
(15) |
While typically in cosmological analyses we marginalize over the bias parameters, they can also be linked to astrophysical quantities relevant to reionization. Following Ref. (2018JCAP…10..016M, ), we can draw the following connections (though we note that our analysis is performed in Lagrangian space):
-
•
constrains the source bias and the global neutral fraction and within a simple model is roughly , where is the source bias. If we assume that the source bias varies more slowly than , then we roughly expect to increase (in absolute magnitude), as we plunge deeper into the EoR.
-
•
Because reionization is a patchy process and highly correlated with regions of high density (and thus large source bias), the parameter is expected to be quite large and in fact be the dominant term when .
-
•
is linked to of the source and is likely subdominant, as the scale of displacements in the matter field are smaller than the reionization front scale.
-
•
Finally, the bias associated with the tells us how significant and what scale non-localities occur, which in the case of reionization is related to the characteristic bubble size.
At the end of reionization (), there is some residual neutral hydrogen found in galaxies, and thus the 21 cm field effectively becomes a positively biased tracer of large-scale structure. In this regime, perturbative bias expansion methods have been extensively tested and can be straightforwardly applied to model the late-time 21 cm field 2018arXiv181009572C ; 2019JCAP…09..024M ; 2019JCAP…11..023M ; 2021JCAP…12..049S ; 2023PhRvD.108h3528O ; 2024arXiv240518559F .
III.2 Field-level fits
We first explore the scale dependence of the first-order bias parameter in Fig. 6, which as argued in Section III.1 can be connected to the mean reionization fraction and the source bias. As claimed above, in the case of a nearly constant (in time) source bias, we expect to become more and more negative as we go to lower redshifts. This is also what we find in this figure, though we note that these arguments break down towards the end of reionization where the 21 cm field becomes highly anisotropic and non-linear and the bubbles merge to form large ionized regions. We see that as a function of scale, is stable on scales where , corresponding to scales larger than the bubble sizes. This characteristic wave number is larger at high redshifts ( at ) and smaller at low redshifts, at which the neutral regions occupy an increasingly smaller volume, making both the measurement more noisy and the theory more inaccurate.
In Fig. 7, we explore the connection between the characteristic bubble size, , and , which as conjectured in Section III.1 is related to as (note that is thus expected to be positive). We find that is fairly scale independent at high redshifts, where the approximation works quite well. At low redshifts (), it starts to break down likely because the term struggles to catch up when the bubble size grows exponentially (see Fig. 1).
We next study the monopole of the redshift-space power spectrum and the cross-correlation coefficient between model and observable in Fig. 8 for the best-fit bias parameters obtained as described in Section III.1. At very high redshifts , the 21 cm power spectrum drops on large scales as ionization bubbles start to form and take out regions with high bias. Around , the power spectrum amplitude picks up again as the density fluctuations in the 21 cm field start to evolve. At the same time, due to the growing reionization fronts, the 21 cm signal begins to downgrade on small scales, leading to the characteristic flattening of (see e.g., 2008ApJ…680..962L, , for a discussion). This is exactly what we observe in the top panel. We also see that the perturbative model provides a good description of the true 21 cm power spectrum on large scales and breaks down as we reach the scales of non-locality, where the model power spectrum becomes deficient. As expected, our model is more successful at high redshifts compared with low redshifts.
Luckily, the cosmological information to be gained from these lower-redshift slices is likely quite negligible. One can understand this by noticing that the 21 cm power spectrum, , is very flat towards the end of reionization (see Fig. 8), so the power spectrum is effectively ‘featureless’ and all we can extract is an amplitude (2008ApJ…680..962L, ). This is also made worse by the fact that current and near-future experiments will only measure the signal in a few bandpowers, thus additionally smearing any features that could otherwise yield information. A lot of the power found on small scales is likely due to non-linear astrophysical effects related to the growing and merging population of ionization bubbles. It is thus uncorrelated with the large-scale density field and effectively contributes ‘noise’ from the perspective of our perturbation model. We verify that indeed at the correlation between the dark matter field and the 21 cm signal drops off sharply (and more so than at higher redshifts) at , providing evidence in support of this conjecture. For completeness, we provide the wavenumber values, , at which the cross-correlation coefficient with the DM field, , drops below 0.5: at , at , at , and note that increases smoothly with redshift.
The cross-correlation coefficient between our model and the cm field exhibits a similar trend (Fig. 8). On large scales, we see that all redshifts display an almost perfect correlation , with the curve being consistently larger than all the way until . Even at low redshifts, close to the end of reionization, the cross-correlation coefficient is still very close to one up until . This indicates that our model does not simply match the power spectrum out of shear luck but does so at the field level, matching the phases of the 21 cm Fourier modes. This is important, because the bias parameters affect only the amplitudes of the terms, not their phases. There is a significant amount of noise in all curves, which we attribute to the limited volume of the simulation. We expect to be more stable and consistently closer to one in a larger-box simulation. We note that when becomes small, , then we can say that the 21 cm field is not faithfully tracing large-scale structure in the Universe.
In addition to the lower cross-correlation coefficient with the matter field, we also verify that this holds for its cross-correlation coefficient with halos (and galaxies, by extension), as expected, since they are biased tracers of large-scale structure. This needs to be taken into account for science cases that rely on cross-correlation of cm surveys with surveys of galaxies or star-formation-based line emission. We show the cross-correlation coefficient between halos and the 21 cm field at in Fig. 5. We split the halos by virial mass into three samples: , where the masses are expressed in units of , corresponding to number densities of , , and , respectively. The cross-correlation is highest for the lowest-mass sample, as it has the lowest bias, i.e. is closest to the matter field that the neutral hydrogen traces, and has the highest number density, i.e. the lowest level of shotnoise, which additionally decorrelates the two fields. Additionally, the threshold corresponds to roughly a virial temperature of , which is close to the expected virial temperature of to form ionizing bubbles (2022GReGr..54..102C, ). An open question in EoR studies is the connection between the properties of the ionizing halos and the characteristic bubble size. As proposed in Ref. (2005MNRAS.363.1031F, ), the mean free path of ionizing photons and thus the characteristic bubble size are likely related to the number density and mass of the photon sinks. It is thus through cross-correlations between the 21 cm emission signal and large-scale structure tracers such as galaxies and quasars that we can reveal information about the UV radiation sources such as their characteristic mass.
Due to this limitation in the number of Fourier modes on large scales, we also find a significant amount of noise in the ratio between the error power spectrum (of the difference between model and observable) and the observable power spectrum, studied in Fig. 9. We find that the error is smallest on large scales, as expected, and is around 1%. It then rapidly increases as we push to smaller scales near the scale of non-locality at that redshift. Generally, the highest redshifts have the smallest error power spectra, with the curve reaching the 10% mark only at . Moreover, we note that the error on the power spectrum is well-behaved even on small scales, indicating that our method might be well-suited for analysis even on scales beyond and overall seems to perform better in that regime than pure perturbative methods (2018JCAP…10..016M, ; 2022PhRvD.106l3506Q, ; 2022JCAP…10..007S, ), where the error tends to increase exponentially on these scales. We do not find a strong dependence of this ratio on the angle with respect to the LOS, suggesting that our method for handling redshift-space distortions is adequate. We thus opt to only show the monopole result, as - slices suffer from a significant amount of cosmic variance. We note that the breakdown of the bias expansion towards the end of reionization also suggests that the (small) box does not contain a representative sample of bubble morphologies and sizes at that epoch, i.e. is approaching at late times. In fact, as seen in Fig. 1, by , the bubbles cover almost the entire volume of the Thesan-1 simulation, i.e. .
III.3 Survey realism
In this section, we present a comparison between the thermal noise of a futuristic experiment and the theoretical noise associated with our model. The goal is to qualitatively explore the question of whether our model is sufficiently good at predicting the 21 cm field signal at the level that meets the sensitivity of next-generation experiments. We note that the frequency coverage of existing and planned experiments is sufficient to encompass most of the relevant range of redshifts. In order to probe the regime of at , we need hundred-meter baseline interferometers, which next-generation facilities aim to provide. However, at these intermediate scales and high redshifts, we are sky noise dominated, so the exact baseline distribution is of secondary importance, and we are mostly sensitive to the total collecting area and the integration time. The minimum is limited by the ability of the experiment to subtract foregrounds that are many orders of magnitude brighter than the signal. We note that in this analysis, we ignore other sources of contamination from e.g., Galactic and extragalactic foregrounds such as synchrotron radiation and radio point sources as well as non-uniform instrument noise such as cross-talk, confusion from spurious Earth and electronics radio signals, which can have a spatially anisotropic contribution and thus be confused with the 21 cm signal, especially for large incidence angles, an effect referred to as ‘foreground leakage.’
Here, we consider a baseline interferometer instrument that covers 10% of the sky with a cadence of at and a hexagonally packed array of 14 m dishes. This design is in effect a scaled-up version of the Hydrogen Epoch of Reionization Array (HERA 2017PASP..129d5001D, ). The integration time we consider is 2 years (inversely proportional to the thermal noise), and we adopt a power spectrum binning of , .
In this analysis, we ignore all contributions to the noise budget apart from thermal noise, and we direct the reader to Ref. (2012ApJ…752..137M, ) for a more thorough discussion of ‘foreground wedges’.
We compute the thermal noise power spectrum following Appendix D (see Eq. D4) of Ref. 2018arXiv181009572C (see also Refs. 2015ApJ…803…21B ; 2010ApJ…721..164S ) with the baseline distribution calculated with 21cmSense
222https://github.com/steven-murray/21cmSense. We then divide the thermal noise power spectrum by the mean 21cm brightness temperature squared to convert from units.
We note that because the foregrounds vary smoothly in frequency, most of their power is confined to large-scale modes parallel to the line of sight, so we typically filter out the lowest , which removes the bulk of the foreground contamination.
As noted above, we do not explicitly model a foreground ‘wedge’. Our understanding of instrument calibration and data analysis techniques for minimizing the influence of the wedge are evolving rapidly, so any choice would be quickly rendered obsolete. We will note that in the presence of a significant wedge only the high and high regime would be accessible. This is unfortunately where the cosmological information is minimized compared to the complex details of the reionization process and where our models perform the worst. Thus, qualitatively, our ability to extract large-scale structure information depends upon our success in handling the wedge. We direct the reader to Refs. (2014PhRvD..90b3019L, ; 2015MNRAS.447.1705P, ; 2016MNRAS.456.3142S, ; 2020PASP..132f2001L, ; 2022ApJ…925..221A, ) for extensive discussions of foreground issues in current and future experiments.
In Fig. 10, we show the comparison between the error and the thermal noise ratio with respect to the 21 cm power spectrum for the two slices: and , at three different redshifts. It is evident that for the futuristic HERA-like experiment described in this work (and ignoring the presence of foreground leakage and other systematic errors), the thermal noise, which is the absolute minimum noise threshold, is comparable with the error of the model we propose. This indicates that the HEFT method developed in this work provides a viable description of the 21 cm field on the scales accessible by terrestrial radio interferometers, and even in idealized conditions (of no foregrounds) can be utilized to analyze the signal with a high degree of success.
IV Discussion and conclusions
In this work, we have developed a hybrid effective field theory (HEFT) model that combines -body simulations and a quadratic bias model to describe the redshift-space spatial fluctuations in the 21 cm radiation field during the epoch of reionization. We compare our model predictions to the radiative transfer hydrodynamical simulation Thesan and demonstrate that the 21 cm signal is well-approximated by our HEFT model over the observationally relevant wavenumber range. There exist other methods for modeling the 21 cm signal, including directly through the use of radiative transfer hydrodynamical simulations (2017MNRAS.466..960P, ; 2022MNRAS.514.3857K, ), semi-analytic (2022MNRAS.511.3657M, ), semi-numerical (2022ApJ…927..186T, ), and purely perturbative bias expansion (2019MNRAS.487.3050H, ; 2018JCAP…10..016M, ; 2022PhRvD.106l3506Q, ) models. Our approach performs a symmetries-based bias expansion in Lagrangian space and advects the initial conditions fields by directly looking up the non-linear displacements in an -body simulation. Our method thus takes advantage of the fact that despite the complexity of the physical processes that govern reionization, the 21 cm field must still obey certain symmetries (e.g. rotational invariance and the equivalence principle), allowing us to construct a model that is valid on scales larger than the typical size of an ionized bubble.
In particular, we study the performance of our model in terms of metrics relevant to cosmological analyses such as power spectrum, cross-correlation coefficient and model error at the field-level. The performance at the power spectrum and correlation function level is summarized in Fig. 8 for 6 different redshifts, ranging from and 10.8, with corresponding neutral hydrogen fraction of 0.92 and 0.3. Visually, we see excellent cross-correlation on the largest scales available in the Thesan simulation (), which rapidly falls off on small scales. As expected, at the dawn of reionization, when the bubble sizes are much smaller, the HEFT model works very well. As can be seen in Fig. 9 where we study the error power spectrum ratio, even deep into reionization our model is capable of recovering the 21 cm power spectrum well. We show that our method works best at high redshifts and on large scales, where we find agreement between model and the observable at 10% down to , with that wavenumber increasing to at and dropping down to at .
We compare our findings for the bias parameter evolution to Ref. (2022PhRvD.106l3506Q, ), which develops a Standard perturbation theory and tests it against the Thesan suite of simulations. In Fig. 11, we display the evolution of the Eulerian bias coefficients, obtained by converting our best-fit Lagrangian biases to the Eulerian picture biases according to App. A of Ref. (2024arXiv240518559F, ). Qualitatively, we find very similar trends to Ref. (2022PhRvD.106l3506Q, ). Namely, is close to zero at high redshifts and becomes increasingly more negative as reionization progresses. is negative throughout the EoR and is largest in terms of its absolute value. This also provides a plausible explanation for the high cross-correlation coefficient between and we find in Fig. 4. As expected from previous arguments in Section III.1, is positive and small in magnitude. Specifically, differs the most: we find that in our model it is proportional to the bubble size squared and increases steadily in magnitude, whereas in Ref. (2022PhRvD.106l3506Q, ) it is negative and grows exponentially. We attribute remaining differences to renormalization choices and scale cuts used in the fits. The relatively large values of compared with are not concerning, as the field is quite small and multiplies an even smaller quantity, . We caution that at the lowest redshifts, where the size of the bubbles is comparable to the size of the simulation box, the bias parameters cannot be reliably measured.
Finally, we study the thermal noise of a futuristic baseline interferometer experiment and compare it with the theoretical noise associated with our model in Fig. 10. We find that our model performs well in the range of scales that such an experiment would be sensitive to. This suggests that a HEFT-like model is more than suitable to model the observed data in the range of interest.
In the future, we aim to further validate our model in simulations of larger volumes and different resolutions. We conjecture that the photon sinks in high-resolution simulations (which are reflective of photon sinks in the Universe) may be very small which would lead to a decrease in the characteristic size of bubbles relative to Thesan-1 and thus would extend our method deep into the large-wavenumber regime. We also aim to investigate the effect of foregrounds and foreground leakage more thoroughly by e.g., running the ‘clean’ output of Thesan-1 through an end-to-end simulator of a future 21 cm experiment.
Finally, a potential simplification of our model involves using a ‘cheaper’ and purely theoretical forward model that uses e.g. Zel’dovich or second-order LPT displacements as opposed to -body displacements333See https://github.com/andrejobuljen/Hi-Fi_mocks for an example of such a model., since at most redshifts during reionization, the non-linear scale occurs at larger modes than the scale at which the bias expansion breaks down due to the reionization bubbles growing.
Acknowledgements.
We would like to thank Aaron Smith, Rahul Kannan, Nick Kokron, Ruediger Pakmor and Wenzer Qin both for several fruitful discussions during the preparation of this paper, and we are especially grateful to Enrico Garaldi and Meredith Neyer for providing additional information for running the initial conditions code and studying the bubble size distribution. We wish to acknowledge the support of the N3AS undergraduate research program at UC Berkeley for supporting D.B. through the academic 2023-2024 year. B.H. is grateful to the Miller Institute for their generous support.References
- (1) M. McQuinn, ARA&A54, 313 (2016).
- (2) Planck Collaboration et al., A&A596, A108 (2016).
- (3) I. D. McGreer, A. Mesinger, and V. D’Odorico, MNRAS447, 499 (2015).
- (4) S. R. Furlanetto, M. McQuinn, and L. Hernquist, MNRAS365, 115 (2006).
- (5) M. McQuinn et al., MNRAS377, 1043 (2007).
- (6) B. Ciardi et al., MNRAS366, 689 (2006).
- (7) T. R. Choudhury, M. G. Haehnelt, and J. Regan, MNRAS394, 960 (2009).
- (8) S. R. Furlanetto and S. P. Oh, ApJ682, 14 (2008).
- (9) R. A. Monsalve, A. E. E. Rogers, J. D. Bowman, and T. J. Mozdzen, ApJ835, 49 (2017).
- (10) D. C. Price et al., MNRAS478, 4193 (2018).
- (11) L. Philip et al., Journal of Astronomical Instrumentation 8, 1950004 (2019).
- (12) S. Singh et al., Experimental Astronomy 45, 269 (2018).
- (13) A. R. Parsons et al., AJ139, 1468 (2010).
- (14) S. J. Tingay et al., PASA30, e007 (2013).
- (15) J. D. Bowman et al., PASA30, e031 (2013).
- (16) M. P. van Haarlem et al., A&A556, A2 (2013).
- (17) D. R. DeBoer et al., PASP129, 045001 (2017).
- (18) A. Weltman et al., PASA37, e002 (2020).
- (19) K. Bandura et al., in Ground-based and Airborne Telescopes V, Vol. 9145 of Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, edited by L. M. Stepp, R. Gilmozzi, and H. J. Hall (PUBLISHER, ADDRESS, 2014), p. 914522.
- (20) L. B. Newburgh et al., in Ground-based and Airborne Telescopes VI, Vol. 9906 of Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, edited by H. J. Hall, R. Gilmozzi, and H. K. Marshall (PUBLISHER, ADDRESS, 2016), p. 99065X.
- (21) K. Vanderlinde et al., in Canadian Long Range Plan for Astronomy and Astrophysics White Papers (PUBLISHER, ADDRESS, 2019), Vol. 2020, p. 28.
- (22) A. Mesinger, S. Furlanetto, and R. Cen, MNRAS411, 955 (2011).
- (23) M. G. Santos et al., MNRAS406, 2421 (2010).
- (24) S. R. Furlanetto and S. P. Oh, MNRAS363, 1031 (2005).
- (25) M. A. Alvarez and T. Abel, ApJ747, 126 (2012).
- (26) E. Sobacchi and A. Mesinger, MNRAS440, 1662 (2014).
- (27) N. Y. Gnedin and J. P. Ostriker, ApJ486, 581 (1997).
- (28) B. Ciardi, F. Stoehr, and S. D. M. White, MNRAS343, 1101 (2003).
- (29) I. T. Iliev et al., MNRAS369, 1625 (2006).
- (30) H. Trac and R. Cen, ApJ671, 1 (2007).
- (31) N. Y. Gnedin and A. A. Kaurov, ApJ793, 30 (2014).
- (32) A. H. Pawlik et al., MNRAS466, 960 (2017).
- (33) R. Barkana and A. Loeb, ApJ624, L65 (2005).
- (34) J. Miralda-Escudé, M. Haehnelt, and M. J. Rees, ApJ530, 1 (2000).
- (35) N. Y. Gnedin, ApJ542, 535 (2000).
- (36) P. R. Shapiro, I. T. Iliev, and A. C. Raga, MNRAS348, 753 (2004).
- (37) O. Zahn et al., ApJ654, 12 (2007).
- (38) O. Zahn et al., MNRAS414, 727 (2011).
- (39) J. Zhang, L. Hui, and Z. Haiman, MNRAS375, 324 (2007).
- (40) Y. Mao et al., Phys. Rev. D91, 083015 (2015).
- (41) M. McQuinn and A. D’Aloisio, J. Cosmology Astropart. Phys2018, 016 (2018).
- (42) W. Qin et al., Phys. Rev. D106, 123506 (2022).
- (43) N. Sailer, S.-F. Chen, and M. White, J. Cosmology Astropart. Phys2022, 007 (2022).
- (44) L. Koopmans et al., in Advancing Astrophysics with the Square Kilometre Array (AASKA14) (PUBLISHER, ADDRESS, 2015), p. 1.
- (45) P. McDonald and A. Roy, J. Cosmology Astropart. Phys2009, 020 (2009).
- (46) V. Desjacques, D. Jeong, and F. Schmidt, Phys. Rep.733, 1 (2018).
- (47) R. Weinberger et al., MNRAS465, 3291 (2017).
- (48) A. Pillepich et al., MNRAS473, 4077 (2018).
- (49) R. E. Angulo and A. Pontzen, MNRAS462, L1 (2016).
- (50) V. Springel, MNRAS401, 791 (2010).
- (51) R. Kannan et al., MNRAS485, 117 (2019).
- (52) R. Kannan et al., MNRAS514, 3857 (2022).
- (53) E. Garaldi et al., MNRAS512, 4909 (2022).
- (54) M. Neyer et al., MNRAS(2024).
- (55) F. Bernardeau, S. Colombi, E. Gaztañaga, and R. Scoccimarro, Phys. Rep.367, 1 (2002).
- (56) S. Dodelson and F. Schmidt, Modern Cosmology (PUBLISHER, ADDRESS, 2020).
- (57) Z. Vlah, M. White, and A. Aviles, J. Cosmology Astropart. Phys2015, 014 (2015).
- (58) R. A. Porto, L. Senatore, and M. Zaldarriaga, J. Cosmology Astropart. Phys2014, 022 (2014).
- (59) L. Senatore, J. Cosmology Astropart. Phys2015, 007 (2015).
- (60) V. Assassi, D. Baumann, D. Green, and M. Zaldarriaga, J. Cosmology Astropart. Phys2014, 056 (2014).
- (61) Z. Vlah, E. Castorina, and M. White, J. Cosmology Astropart. Phys2016, 007 (2016).
- (62) T. Matsubara, Phys. Rev. D77, 063530 (2008).
- (63) T. Matsubara, Phys. Rev. D78, 109901 (2008).
- (64) Z. Vlah and M. White, J. Cosmology Astropart. Phys2019, 007 (2019).
- (65) S.-F. Chen, Z. Vlah, E. Castorina, and M. White, J. Cosmology Astropart. Phys2021, 100 (2021).
- (66) C. Modi, S.-F. Chen, and M. White, MNRAS492, 5754 (2020).
- (67) B. Hadzhiyska et al., J. Cosmology Astropart. Phys2021, 020 (2021).
- (68) N. Kokron et al., MNRAS505, 1422 (2021).
- (69) M. Pellejero Ibañez et al., MNRAS520, 3725 (2023).
- (70) C. A. Watkinson, B. Greig, and A. Mesinger, MNRAS510, 3838 (2022).
- (71) S.-F. Chen, Z. Vlah, and M. White, J. Cosmology Astropart. Phys2020, 062 (2020).
- (72) N. Kokron et al., J. Cosmology Astropart. Phys2022, 059 (2022).
- (73) J. DeRose, S.-F. Chen, N. Kokron, and M. White, J. Cosmology Astropart. Phys2023, 008 (2023).
- (74) B. Hadzhiyska et al., The Open Journal of Astrophysics 6, 38 (2023).
- (75) M. Schmittfull, M. Simonović, V. Assassi, and M. Zaldarriaga, Phys. Rev. D 100, 043514 (2019).
- (76) N. Kokron et al., MNRAS514, 2198 (2022).
- (77) Cosmic Visions 21 cm Collaboration et al., arXiv e-prints arXiv:1810.09572 (2018).
- (78) C. Modi, E. Castorina, Y. Feng, and M. White, J. Cosmology Astropart. Phys2019, 024 (2019).
- (79) C. Modi, M. White, A. Slosar, and E. Castorina, J. Cosmology Astropart. Phys2019, 023 (2019).
- (80) N. Sailer, E. Castorina, S. Ferraro, and M. White, J. Cosmology Astropart. Phys2021, 049 (2021).
- (81) A. Obuljen, M. Simonović, A. Schneider, and R. Feldmann, Phys. Rev. D108, 083528 (2023).
- (82) S. Foreman, A. Obuljen, and M. Simonović, arXiv e-prints arXiv:2405.18559 (2024).
- (83) A. Lidz et al., ApJ680, 962 (2008).
- (84) T. R. Choudhury, General Relativity and Gravitation 54, 102 (2022).
- (85) M. F. Morales, B. Hazelton, I. Sullivan, and A. Beardsley, ApJ752, 137 (2012).
- (86) P. Bull, P. G. Ferreira, P. Patel, and M. G. Santos, ApJ803, 21 (2015).
- (87) H.-J. Seo et al., ApJ721, 164 (2010).
- (88) A. Liu, A. R. Parsons, and C. M. Trott, Phys. Rev. D90, 023019 (2014).
- (89) J. C. Pober, MNRAS447, 1705 (2015).
- (90) H.-J. Seo and C. M. Hirata, MNRAS456, 3142 (2016).
- (91) A. Liu and J. R. Shaw, PASP132, 062001 (2020).
- (92) Z. Abdurashidova et al., ApJ925, 221 (2022).
- (93) J. B. Muñoz et al., MNRAS511, 3657 (2022).
- (94) H. Trac et al., ApJ927, 186 (2022).
- (95) K. Hoffmann et al., MNRAS487, 3050 (2019).