Removal of interloper contamination to line-intensity maps using correlations with ancillary tracers of the large-scale structure

José Luis Bernal Instituto de Física de Cantabria (IFCA), CSIC-Univ. de Cantabria, Avda. de los Castros s/n, E-39005 Santander, Spain    Antón Baleato Lizancos Berkeley Center for Cosmological Physics, UC Berkeley, CA 94720, USA Department of Physics, University of California, Berkeley, CA 94720, USA Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
Abstract

Line-intensity mapping (LIM) offers an approach to obtain three-dimensional maps of the large-scale structure by collecting the aggregate emission from all emitters along the line of sight. The procedure hinges on reconstructing the radial positions of sources by relating the observed frequency to the rest-frame frequency of a target emission line. However, this step is hindered by ‘interloper-line’ emission from different cosmological volumes that redshifts into the same observed frequency. In this work, we propose a model-independent technique to remove the contamination of line interlopers using their statistical correlation with external tracers of the large-scale structure, and identify the weights that minimize the variance of the cleaned field. Furthermore, we derive expressions for the resulting power spectra after applying our cleaning procedure, and validate them against simulations. We find that the cleaning performance improves as the correlation between the line interlopers and the external tracer increases, resulting in a gain in the signal-to-noise ratio of up to a factor 6 (2) for the auto- (cross-)power spectrum in idealized scenarios. This approach has the advantage of being model-independent, and is highly complementary to other techniques, as it removes large-scale clustering modes instead of individually masking the brightest sources of contamination.

I Introduction

Line-intensity mapping (LIM) is an observational technique that employs flux integrated over cosmological scales to obtain three-dimensional maps of the Universe, recovering the line-of-sight information by targeting bright spectral lines with known rest-frame frequency [1, 2]. This approach grants access to the aggregate emission of otherwise undetectable faint sources and offers sensitivity to the astrophysical processes triggering the line emission. Therefore, LIM provides a powerful avenue to trace the large-scale structure of the Universe and galaxy formation and evolution at high redshifts, especially in regimes close to the high-noise or high-confusion limits [3, 4].

The LIM experimental landscape is growing rapidly, with an increasing number of facilities currently observing [5, 6, 7, 8, 9, 10, 11] and many others that will start in the forthcoming years [12, 13, 14, 15, 16, 17, 18]. These experiments are expected to confirm and increase the significance of preliminary detections (see e.g., Refs. [19, 20, 21, 22, 23, 24, 25, 26, 27]), scale up the size and sensitivity of the current pathfinder stage, and bridge the low-redshift and recombination epochs of the Universe probed by current experiments (see e.g., Refs. [28, 29, 30, 31, 32, 33]).

Nonetheless, some of the particularities that make LIM so appealing also feature among its main challenges. A critical one among these stems from the fact that, by using the integrated flux, LIM observations also collect contamination from any unwanted emission along the line of sight. This contamination includes Galactic foregrounds, continuum extragalactic emission, and line interlopers —i.e., emission from lines other than the target that redshifts into the same observed frequency. The contribution of the first two is expected to be smooth enough in frequency to be mitigated with component separation (see e.g., Ref. [34]), though at the cost of losing the longest-wavelength modes along the line of sight [35, 36]. On the other hand, removing line-interloper contamination is a greater challenge.

Although the intensity fluctuations of the interloper lines are uncorrelated with those of the target line,111Most of the commonly-targeted spectral lines are chosen to be bright and to have no other lines close to them in frequency to avoid line confusion. they nevertheless contribute to the total intensity measured and suffer from strong projection effects due to the confusion in redshift when projecting the position on the sky and observed frequency onto three-dimensional, Cartesian positions. Foreground line interlopers are typically brighter than the contribution of interest, even though the target spectral line may be intrinsically more intense, and can even dominate the measurements (see e.g, [37]). In fact, line interlopers may well be the most problematic source of contamination in LIM observations above radio and below optical frequencies.

Line-interloper contamination can be avoided cross-correlating the line-intensity maps either with other spectral line at the same redshift or with other tracers of the large-scale structure (see e.g., [38, 39, 40, 41]). The auto-power spectrum of the line of interest can be partially reconstructed from the cross-correlations of three tracers of the same density fluctuations at large scales [42]. However, if not mitigated, the contribution from line interlopers significantly increases the covariance of not only auto-correlations, but also cross-correlations. Although in this paper we will focus on line interlopers in the context of LIM, they can also be a challenge for certain galaxy surveys (see e.g., [43, 44, 45]), introducing catastrophic redshift errors and leading to impure galaxy catalogs.

There are two main strategies to deal with interlopers: masking and modeling. Masking can be either blind or guided. Guided masking [46, 47, 48] uses external observations to locate galaxies that are expected to be bright in the interloper lines and masks the voxels222A voxel is a three-dimensional pixel —a pixel within some frequency bin. that they fall into. Therefore, this approach relies on some astrophysical model to guide the masking strategy, which may result in suboptimal cleaning. On the other hand, blind masking [49, 50, 51] masks the brightest voxels (which are more prone to be dominated by foreground interloper emission) without requiring external observations, but incurs some (limited) information loss. Knowledge of the rest-frame frequency of the interlopers can also be used to separate interloper and target signals, either using spectral templates at the pixel level [52] or following machine learning approaches [53, 54, 55]. Finally, the contribution from interlopers can be modeled and considered as part of the analysis (see e.g., Refs. [50, 56, 57, 58]), which allows for the search of exotic contributions [59, 60]. Nonetheless, if the line interlopers are too bright they will hinder the study of the target line, despite efforts to model them.

In this work we propose an alternative approach that statistically minimizes the contamination from foreground line interlopers. Our approach is inspired by lensing-nulling techniques [61], especially those developed to undo the effects of lensing from the observed cosmic microwave background (CMB) anisotropies using external tracers [62, 63, 64] or to minimize contributions from certain redshifts to the reconstructed CMB lensing convergence [65, 66, 67, 68]. Following a loose analogy, we will employ ancillary tracers of the large-scale structure —typically fluctuations in the number density of galaxies measured by spectroscopic surveys— that overlap in redshift with the foreground interlopers to remove contributions from the latter from LIM observations. Crucially, the ancillary tracers are first filtered in such a way that the variance of the cleaned map is minimized.

We derive a prediction for the power spectrum of the cleaned map, and successfully validate it against log-normal simulations. This prediction can be calibrated directly from the data and is thus model independent. Since the performance of the cleaning is mostly driven by the extent of correlation between the interloper intensity fluctuations and the ancillary tracer we do the cleaning with, spectroscopic (or very-narrow-band photometric) galaxies are likely to be tracer best suited for this technique.

A key benefit of our approach is that, by (partially) removing the exact realization of the interloper fluctuations present in the sky, the covariance of both LIM auto- and cross-spectra can be reduced. We provide estimates of the improvement factor in the signal-to-noise ratio on the detection of such summary statistics, which can be as high as a factor of 6 (2) for auto- (cross-) power spectra —assuming Gaussian covariances. Given the strong dependence of the correlation coefficients —and the severity of the projection effects— on the specific redshifts and emission lines considered, we choose to remain as general as possible in this proof-of-concept study and defer in-detail investigation of specific examples to future work.

This paper is structured as follows. We review the theory of LIM signal, power spectrum and interloper contamination in Sec. II. The technique we propose to remove interloper contributions to the LIM power spectra is presented in Sec. III and validated with log-normal simulations in Sec. IV. The results and implications of the technique are discussed in Sec. V. Finally, our conclusions are presented in Sec. VI. Further details are relegated to the appendices, including a harmonic-space version of our formalism (appendix A) and an explanation of how to combine information from several tracers of the large-scale structure (appendix B).

II LIM: target signal and line interlopers

Consider any line emission sourced from a position 𝒏^^𝒏\hat{\boldsymbol{n}}over^ start_ARG bold_italic_n end_ARG on the sky at redshift z𝑧zitalic_z, with local specific luminosity density ρLsubscript𝜌𝐿\rho_{L}italic_ρ start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT at rest-frame frequency ν𝜈\nuitalic_ν. Its specific intensity at νobs=ν/(1+z)subscript𝜈obs𝜈1𝑧\nu_{\rm obs}=\nu/(1+z)italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT = italic_ν / ( 1 + italic_z ) is given by

Iνobs(𝒏^)=c4πνobs(1+z)H(z)ρL(𝒏^χ(z))XLIνobs(z)ρL(𝒏^χ(z)),subscript𝐼subscript𝜈obs^𝒏𝑐4𝜋subscript𝜈obs1𝑧𝐻𝑧subscript𝜌𝐿^𝒏𝜒𝑧superscriptsubscript𝑋LIsubscript𝜈obs𝑧subscript𝜌𝐿^𝒏𝜒𝑧\begin{split}I_{\nu_{\rm obs}}(\hat{\boldsymbol{n}})=&\frac{c}{4\pi\nu_{\rm obs% }(1+z)H(z)}\rho_{L}(\hat{\boldsymbol{n}}\chi(z))\\ \equiv&X_{\rm LI}^{\nu_{\rm obs}}(z)\rho_{L}(\hat{\boldsymbol{n}}\chi(z))\,,% \end{split}start_ROW start_CELL italic_I start_POSTSUBSCRIPT italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( over^ start_ARG bold_italic_n end_ARG ) = end_CELL start_CELL divide start_ARG italic_c end_ARG start_ARG 4 italic_π italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT ( 1 + italic_z ) italic_H ( italic_z ) end_ARG italic_ρ start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ( over^ start_ARG bold_italic_n end_ARG italic_χ ( italic_z ) ) end_CELL end_ROW start_ROW start_CELL ≡ end_CELL start_CELL italic_X start_POSTSUBSCRIPT roman_LI end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( italic_z ) italic_ρ start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ( over^ start_ARG bold_italic_n end_ARG italic_χ ( italic_z ) ) , end_CELL end_ROW (1)

where c𝑐citalic_c is the speed of light, H(z)𝐻𝑧H(z)italic_H ( italic_z ) is the Hubble parameter, χ𝜒\chiitalic_χ is the comoving radial distance to redshift z𝑧zitalic_z, and we have defined XLIνobssuperscriptsubscript𝑋LIsubscript𝜈obsX_{\rm LI}^{\nu_{\rm obs}}italic_X start_POSTSUBSCRIPT roman_LI end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT end_POSTSUPERSCRIPT to absorb all constants and quantities that depend only on redshift. Then, the total specific intensity observed at a given νobssubscript𝜈obs\nu_{\rm obs}italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT is the sum of all contributions that redshift into such frequency:

Iνobs(𝒏^)=iXLIνobs(zi)ρL(i)(𝒏^χ(zi)).subscript𝐼subscript𝜈obs^𝒏subscript𝑖superscriptsubscript𝑋LIsubscript𝜈obssubscript𝑧𝑖superscriptsubscript𝜌𝐿𝑖^𝒏𝜒subscript𝑧𝑖I_{\nu_{\rm obs}}(\hat{\boldsymbol{n}})=\sum_{i}X_{\rm LI}^{\nu_{\rm obs}}(z_{% i})\rho_{L}^{(i)}(\hat{\boldsymbol{n}}\chi(z_{i}))\,.italic_I start_POSTSUBSCRIPT italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( over^ start_ARG bold_italic_n end_ARG ) = ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT roman_LI end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) italic_ρ start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ( over^ start_ARG bold_italic_n end_ARG italic_χ ( italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) . (2)

In the equation above we only consider line emission (assuming that continuum sources have been already cleaned from observations), so that we can replace an integral over the line of sight with the sum of contributions from lines with rest-frame frequency ν(i)=νobs(1+z(i))superscript𝜈𝑖subscript𝜈obs1superscript𝑧𝑖\nu^{(i)}=\nu_{\rm obs}(1+z^{(i)})italic_ν start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT = italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT ( 1 + italic_z start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ). Therefore, Eq. (2) accounts for the line of interest of the LIM survey as well as all the line interlopers that contaminate that signal.

As is customary in LIM analyses, we will consider three-dimensional maps,333Note that there are also proposed methodologies employing angular statistics, see e.g., Ref. [58]. recovering line-of-sight information from the spectral resolution of the instrument assuming the rest-frame frequency νtsubscript𝜈t\nu_{\rm t}italic_ν start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT of the line of interest. This transformation means that two points that are separated by an angle ϑitalic-ϑ\varthetaitalic_ϑ on the sky and a redshift interval δz𝛿𝑧\delta zitalic_δ italic_z (centered at z𝑧zitalic_z) are mapped onto separations

r=DM(z)ϑ,andr=cδzH(z)formulae-sequencesubscript𝑟perpendicular-tosubscript𝐷M𝑧italic-ϑandsubscript𝑟parallel-to𝑐𝛿𝑧𝐻𝑧r_{\perp}=D_{\rm M}(z)\vartheta\,,\qquad\text{and}\qquad r_{\parallel}=\frac{c% \,\delta z}{H(z)}\,italic_r start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT = italic_D start_POSTSUBSCRIPT roman_M end_POSTSUBSCRIPT ( italic_z ) italic_ϑ , and italic_r start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT = divide start_ARG italic_c italic_δ italic_z end_ARG start_ARG italic_H ( italic_z ) end_ARG (3)

transverse to and along the line of sight, respectively, where DMsubscript𝐷MD_{\rm M}italic_D start_POSTSUBSCRIPT roman_M end_POSTSUBSCRIPT is the comoving angular diameter distance.444This transformation assumes the plane-parallel approximation, which introduces spurious artifacts in the computation of two-point statistics. We refer the interested reader to Ref. [69] for a careful treatment in the context of LIM. As mentioned above, the redshift zt=νt/νobs1subscript𝑧tsubscript𝜈tsubscript𝜈obs1z_{\rm t}=\nu_{\rm t}/\nu_{\rm obs}-1italic_z start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT = italic_ν start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT / italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT - 1 is used to apply the transformation in Eq. (3), so that contributions from line interlopers will be subject to projection effects, which introduce severe anisotropies in their fluctuations. For each interloper line, the measured and true components of the three-dimensional separations between two points are related by rjtrue=rjmeasqjsuperscriptsubscript𝑟𝑗truesuperscriptsubscript𝑟𝑗meassubscript𝑞𝑗r_{j}^{\rm true}=r_{j}^{\rm meas}q_{j}italic_r start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT = italic_r start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_meas end_POSTSUPERSCRIPT italic_q start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT, where the subscript j𝑗jitalic_j stands for transverse to and parallel to the line of sight, and555These projection effects are qualitatively similar to the Alcock-Paczyński distortions [70]. Since the latter —sourced by differences between the assumed cosmology applied in Eq. (3) and the actual one— are decoupled from the question we would like to investigate in this work, we neglect them and assume that our fiducial cosmology matches the true cosmology.

q(i)=DM(zi)DM(zt),andq(i)=(1+zi)/H(zi)(1+zt)/H(zt).formulae-sequencesuperscriptsubscript𝑞perpendicular-to𝑖subscript𝐷𝑀subscript𝑧𝑖subscript𝐷𝑀subscript𝑧tandsuperscriptsubscript𝑞parallel-to𝑖1subscript𝑧𝑖𝐻subscript𝑧𝑖1subscript𝑧t𝐻subscript𝑧tq_{\perp}^{(i)}=\frac{D_{M}(z_{i})}{D_{M}(z_{\rm t})},\qquad\text{and}\qquad q% _{\parallel}^{(i)}=\frac{(1+z_{i})/H(z_{i})}{(1+z_{\rm t})/H(z_{\rm t})}\,.italic_q start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT = divide start_ARG italic_D start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ( italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) end_ARG start_ARG italic_D start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ( italic_z start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT ) end_ARG , and italic_q start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT = divide start_ARG ( 1 + italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) / italic_H ( italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) end_ARG start_ARG ( 1 + italic_z start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT ) / italic_H ( italic_z start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT ) end_ARG . (4)

In Fourier space, this corresponds to kjtrue=kjmeas/qjsuperscriptsubscript𝑘𝑗truesuperscriptsubscript𝑘𝑗meassubscript𝑞𝑗k_{j}^{\rm true}=k_{j}^{\rm meas}/q_{j}italic_k start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT = italic_k start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_meas end_POSTSUPERSCRIPT / italic_q start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT. If we describe the wavevector 𝒌𝒌\boldsymbol{k}bold_italic_k in terms of its modulus k𝑘kitalic_k and the cosine μ𝜇\muitalic_μ of the angle it subtends relative to the line-of-sight component, μk/k𝜇subscript𝑘parallel-to𝑘\mu\equiv k_{\parallel}/kitalic_μ ≡ italic_k start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT / italic_k, we have that the relation between the true and measured values for the interloper i𝑖iitalic_i are [71]

kitrue=kmeasq(i)[1+(μ)2((Fproj(i))21)]1/2andμitrue=μmeasFproj(i)[1+(μ)2((Fproj(i))21)]1/2,subscriptsuperscript𝑘true𝑖superscript𝑘meassuperscriptsubscript𝑞perpendicular-to𝑖superscriptdelimited-[]1superscript𝜇2superscriptsuperscriptsubscript𝐹proj𝑖2112andsubscriptsuperscript𝜇true𝑖superscript𝜇meassuperscriptsubscript𝐹proj𝑖superscriptdelimited-[]1superscript𝜇2superscriptsuperscriptsubscript𝐹proj𝑖2112\begin{split}&k^{\rm true}_{i}=\frac{k^{\rm meas}}{q_{\perp}^{(i)}}\left[1+% \left(\mu\right)^{2}\left((F_{\rm proj}^{(i)})^{-2}-1\right)\right]^{1/2}\,% \text{and}\\ &\mu^{\rm true}_{i}=\frac{\mu^{\rm meas}}{F_{\rm proj}^{(i)}}\left[1+\left(\mu% \right)^{2}\left((F_{\rm proj}^{(i)})^{-2}-1\right)\right]^{-1/2},\end{split}start_ROW start_CELL end_CELL start_CELL italic_k start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = divide start_ARG italic_k start_POSTSUPERSCRIPT roman_meas end_POSTSUPERSCRIPT end_ARG start_ARG italic_q start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT end_ARG [ 1 + ( italic_μ ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( ( italic_F start_POSTSUBSCRIPT roman_proj end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT - 1 ) ] start_POSTSUPERSCRIPT 1 / 2 end_POSTSUPERSCRIPT and end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL italic_μ start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = divide start_ARG italic_μ start_POSTSUPERSCRIPT roman_meas end_POSTSUPERSCRIPT end_ARG start_ARG italic_F start_POSTSUBSCRIPT roman_proj end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT end_ARG [ 1 + ( italic_μ ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( ( italic_F start_POSTSUBSCRIPT roman_proj end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT - 1 ) ] start_POSTSUPERSCRIPT - 1 / 2 end_POSTSUPERSCRIPT , end_CELL end_ROW (5)

where Fproj(i)q(i)/q(i)superscriptsubscript𝐹proj𝑖superscriptsubscript𝑞parallel-to𝑖superscriptsubscript𝑞perpendicular-to𝑖F_{\rm proj}^{(i)}\equiv\ q_{\parallel}^{(i)}/q_{\perp}^{(i)}italic_F start_POSTSUBSCRIPT roman_proj end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ≡ italic_q start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT / italic_q start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT. In Fig. 1, we show values of qsubscript𝑞perpendicular-toq_{\perp}italic_q start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT and qsubscript𝑞parallel-toq_{\parallel}italic_q start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT for different sets of ztsubscript𝑧tz_{\rm t}italic_z start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT and zintsubscript𝑧intz_{\rm int}italic_z start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT, explicitly marking some actual experiments.666In some cases, the projection effects may be very large, and this linear rescaling of separations may be inaccurate. We leave an exploration of such non-linear corrections to future work. For simplicity, we will drop the ‘meas’ superscript from here on out and refer to ‘measured’ distances by default unless otherwise stated.

Refer to caption
Refer to caption
Figure 1: Rescaling factors qsubscript𝑞perpendicular-toq_{\perp}italic_q start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT and qsubscript𝑞parallel-toq_{\parallel}italic_q start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT (red and blue, respectively, each on its own triangular panel) as functions of the redshift of the target signal and the redshift that the foreground interloper is emitted from (satisfying zint<ztsubscript𝑧intsubscript𝑧tz_{\rm int}<z_{\rm t}italic_z start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT < italic_z start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT). The markers label, for both factors, the configuration of specific experiments. These factors relate the true wavenumbers in the foreground frame with the measured ones after being projected to ztsubscript𝑧tz_{\rm t}italic_z start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT as ktrue=kmeas/qsubscriptsuperscript𝑘trueparallel-tosubscriptsuperscript𝑘measparallel-tosubscript𝑞parallel-tok^{\rm true}_{\parallel}=k^{\rm meas}_{\parallel}/q_{\parallel}italic_k start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT = italic_k start_POSTSUPERSCRIPT roman_meas end_POSTSUPERSCRIPT start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT / italic_q start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT and ktrue=kmeas/qsubscriptsuperscript𝑘trueperpendicular-tosubscriptsuperscript𝑘measparallel-tosubscript𝑞perpendicular-tok^{\rm true}_{\perp}=k^{\rm meas}_{\parallel}/q_{\perp}italic_k start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT = italic_k start_POSTSUPERSCRIPT roman_meas end_POSTSUPERSCRIPT start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT / italic_q start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT.

II.1 Power Spectrum

Let us denote the intensity fluctuations as δI(𝒙)I(𝒙)I𝛿𝐼𝒙𝐼𝒙delimited-⟨⟩𝐼\delta I(\boldsymbol{x})\equiv I(\boldsymbol{x})-\langle I\rangleitalic_δ italic_I ( bold_italic_x ) ≡ italic_I ( bold_italic_x ) - ⟨ italic_I ⟩, where Idelimited-⟨⟩𝐼\langle I\rangle⟨ italic_I ⟩ is the intensity averaged over the volume probed. These fluctuations are a biased tracer of the matter density fluctuations. In fact, for spectral lines other than 21 cm before reionization, we can safely assume that all emission comes from galaxies.777With the exception of diffuse Lyman-α𝛼\alphaitalic_α, which may require mild tailoring of the formalism discussed in this work.

We will focus on the power spectrum of these intensity fluctuations —the Fourier-space analog of the two-point correlation function. This receives two kinds of contributions: the shot noise associated with the discreteness of the emitters, and the clustering.

Let us first address the clustering part. Assuming that all contributions to Eq. (2) come from non-overlapping volumes,888This assumption is always fulfilled when the rest-frame frequencies of the relevant contributors are far enough apart. Otherwise, the cross-correlation between them must also be taken into account in Eq. (6). and working to linear order,999Technically, our treatment of redshift-space distortions goes beyond linear order in using a Lorentzian damping factor in Eq. (8) to mimic the effect of Fingers-of-God. the total clustering component is given by

Pclust(k,μ)=ii2(kitrue,μitrue,zi)𝒜v(i)Pm(kitrue,zi),subscript𝑃clust𝑘𝜇subscript𝑖superscriptsubscript𝑖2superscriptsubscript𝑘𝑖truesuperscriptsubscript𝜇𝑖truesubscript𝑧𝑖superscriptsubscript𝒜𝑣𝑖subscript𝑃msuperscriptsubscript𝑘𝑖truesubscript𝑧𝑖P_{\rm clust}(k,\mu)=\sum_{i}\frac{\mathcal{B}_{i}^{2}(k_{i}^{\rm true},\mu_{i% }^{\rm true},z_{i})}{\mathcal{A}_{v}^{(i)}}P_{\rm m}(k_{i}^{\rm true},z_{i})\,,italic_P start_POSTSUBSCRIPT roman_clust end_POSTSUBSCRIPT ( italic_k , italic_μ ) = ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT divide start_ARG caligraphic_B start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_k start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT , italic_μ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT , italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) end_ARG start_ARG caligraphic_A start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT end_ARG italic_P start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT ( italic_k start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT , italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) , (6)

where the i𝑖iitalic_i subscripts index contributions from the various lines, 𝒜vq2qsubscript𝒜𝑣superscriptsubscript𝑞perpendicular-to2subscript𝑞parallel-to\mathcal{A}_{v}\equiv q_{\perp}^{2}q_{\parallel}caligraphic_A start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT ≡ italic_q start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_q start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT is a factor due to the isotropic volume dilation from the projection effects, Pmsubscript𝑃mP_{\rm m}italic_P start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT is the matter power spectrum in real space, and

iXLI(i)(zi)dMdndM(zi)××Li(M,zi)bh(M,zi)FRSD(kitrue,μitrue,zi).subscript𝑖superscriptsubscript𝑋LI𝑖subscript𝑧𝑖d𝑀d𝑛d𝑀subscript𝑧𝑖subscript𝐿𝑖𝑀subscript𝑧𝑖subscript𝑏𝑀subscript𝑧𝑖subscript𝐹RSDsuperscriptsubscript𝑘𝑖truesuperscriptsubscript𝜇𝑖truesubscript𝑧𝑖\begin{split}\mathcal{B}_{i}\equiv&X_{\rm LI}^{(i)}(z_{i})\int{\rm d}M\frac{{% \rm d}n}{{\rm d}M}(z_{i})\times\\ \times&L_{i}(M,z_{i})b_{h}(M,z_{i})F_{\rm RSD}(k_{i}^{\rm true},\mu_{i}^{\rm true% },z_{i})\,.\end{split}start_ROW start_CELL caligraphic_B start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ≡ end_CELL start_CELL italic_X start_POSTSUBSCRIPT roman_LI end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ( italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ∫ roman_d italic_M divide start_ARG roman_d italic_n end_ARG start_ARG roman_d italic_M end_ARG ( italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) × end_CELL end_ROW start_ROW start_CELL × end_CELL start_CELL italic_L start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_M , italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) italic_b start_POSTSUBSCRIPT italic_h end_POSTSUBSCRIPT ( italic_M , italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) italic_F start_POSTSUBSCRIPT roman_RSD end_POSTSUBSCRIPT ( italic_k start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT , italic_μ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT , italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) . end_CELL end_ROW (7)

In this last expression, dn/dMd𝑛d𝑀{\rm d}n/{\rm d}Mroman_d italic_n / roman_d italic_M is the halo mass function; L𝐿Litalic_L and bhsubscript𝑏b_{h}italic_b start_POSTSUBSCRIPT italic_h end_POSTSUBSCRIPT are parametrizations of the line luminosity and halo bias, respectively, that depend on both the halo mass and redshift, and FRSDsubscript𝐹RSDF_{\rm RSD}italic_F start_POSTSUBSCRIPT roman_RSD end_POSTSUBSCRIPT is a factor encoding redshift-space distortions:

FRSD(k,μ,z)=1+fμ2/bh(z)1+(kμσpv(i))2/2,subscript𝐹RSD𝑘𝜇𝑧1𝑓superscript𝜇2subscript𝑏𝑧1superscript𝑘𝜇superscriptsubscript𝜎pv𝑖22F_{\rm RSD}(k,\mu,z)=\frac{1+f\mu^{2}/b_{h}(z)}{1+(k\mu\sigma_{{\rm pv}}^{(i)}% )^{2}/2}\,,italic_F start_POSTSUBSCRIPT roman_RSD end_POSTSUBSCRIPT ( italic_k , italic_μ , italic_z ) = divide start_ARG 1 + italic_f italic_μ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT / italic_b start_POSTSUBSCRIPT italic_h end_POSTSUBSCRIPT ( italic_z ) end_ARG start_ARG 1 + ( italic_k italic_μ italic_σ start_POSTSUBSCRIPT roman_pv end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT / 2 end_ARG , (8)

where f𝑓fitalic_f is the dimensionless linear growth factor. Note that this expression includes the Kaiser effect on large scales as well as a phenomenological suppression on small scales, which we model as a Lorentzian function of the velocity dispersion σpv2(z)=dkPm(k,z)/6π2superscriptsubscript𝜎pv2𝑧dksubscript𝑃m𝑘𝑧6superscript𝜋2\sigma_{\rm pv}^{2}(z)=\int{\rm dk}P_{\rm m}(k,z)/6\pi^{2}italic_σ start_POSTSUBSCRIPT roman_pv end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_z ) = ∫ roman_dk italic_P start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT ( italic_k , italic_z ) / 6 italic_π start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. We assume the halo mass function from Ref. [72] and the halo bias from Ref. [73].

Under the same assumption of no volume overlap between the contributors to the total line intensity, and considering only Poissonian shot noise, the shot-noise component is

Pshot=i(XLI(i)(zi))2𝒜v(i)dMdndM(zi)Li2(M,zi).subscript𝑃shotsubscript𝑖superscriptsuperscriptsubscript𝑋LI𝑖subscript𝑧𝑖2superscriptsubscript𝒜𝑣𝑖differential-d𝑀d𝑛d𝑀subscript𝑧𝑖superscriptsubscript𝐿𝑖2𝑀subscript𝑧𝑖P_{\rm shot}=\sum_{i}\frac{\left(X_{\rm LI}^{(i)}(z_{i})\right)^{2}}{\mathcal{% A}_{v}^{(i)}}\int{\rm d}M\frac{{\rm d}n}{{\rm d}M}(z_{i})L_{i}^{2}(M,z_{i})\,.italic_P start_POSTSUBSCRIPT roman_shot end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT divide start_ARG ( italic_X start_POSTSUBSCRIPT roman_LI end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ( italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG caligraphic_A start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT end_ARG ∫ roman_d italic_M divide start_ARG roman_d italic_n end_ARG start_ARG roman_d italic_M end_ARG ( italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) italic_L start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_M , italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) . (9)

LIM experiments have limited angular and spectral resolution. Assuming single-dish observations with antennas of diameter D𝐷Ditalic_D, the full width at half maximum of the beam is given by θFWHM=1.22c/νobsDsubscript𝜃FWHM1.22𝑐subscript𝜈obs𝐷\theta_{\rm FWHM}=1.22c/\nu_{\rm obs}Ditalic_θ start_POSTSUBSCRIPT roman_FWHM end_POSTSUBSCRIPT = 1.22 italic_c / italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT italic_D. If we consider a Gaussian beam, this corresponds to a standard deviation σbeam=θFWHM/8log2subscript𝜎beamsubscript𝜃FWHM82\sigma_{\rm beam}=\theta_{\rm FWHM}/\sqrt{8\log 2}italic_σ start_POSTSUBSCRIPT roman_beam end_POSTSUBSCRIPT = italic_θ start_POSTSUBSCRIPT roman_FWHM end_POSTSUBSCRIPT / square-root start_ARG 8 roman_log 2 end_ARG. In terms of the frequency resolution, we consider Gaussian channels with width δν𝛿𝜈\delta\nuitalic_δ italic_ν. This corresponds to characteristic resolution limits in the directions along and transverse to the line of sight

σ=cδν(1+z)H(z)νobs,andσ=DM(z)σbeam,formulae-sequencesubscript𝜎parallel-to𝑐𝛿𝜈1𝑧𝐻𝑧subscript𝜈obsandsubscript𝜎perpendicular-tosubscript𝐷M𝑧subscript𝜎beam\sigma_{\parallel}=\frac{c\delta\nu(1+z)}{H(z)\nu_{\rm obs}},\quad\text{and}% \quad\sigma_{\perp}=D_{\rm M}(z)\sigma_{\rm beam}\,,italic_σ start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT = divide start_ARG italic_c italic_δ italic_ν ( 1 + italic_z ) end_ARG start_ARG italic_H ( italic_z ) italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT end_ARG , and italic_σ start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT = italic_D start_POSTSUBSCRIPT roman_M end_POSTSUBSCRIPT ( italic_z ) italic_σ start_POSTSUBSCRIPT roman_beam end_POSTSUBSCRIPT , (10)

respectively, so that the suppression associated with the beam can be expressed as [74]

Wres(k,μ)=exp{k2[σ2(1μ2)+σ2μ2]}.subscript𝑊res𝑘𝜇superscript𝑘2delimited-[]superscriptsubscript𝜎perpendicular-to21superscript𝜇2superscriptsubscript𝜎parallel-to2superscript𝜇2W_{\rm res}(k,\mu)=\exp\left\{-k^{2}\left[\sigma_{\perp}^{2}(1-\mu^{2})+\sigma% _{\parallel}^{2}\mu^{2}\right]\right\}.italic_W start_POSTSUBSCRIPT roman_res end_POSTSUBSCRIPT ( italic_k , italic_μ ) = roman_exp { - italic_k start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT [ italic_σ start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( 1 - italic_μ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) + italic_σ start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_μ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] } . (11)

The total power spectrum is then given by

PI(k,μ)=Wres2(k,μ)(Pclust(k,μ)+Pshot).subscript𝑃𝐼𝑘𝜇superscriptsubscript𝑊res2𝑘𝜇subscript𝑃clust𝑘𝜇subscript𝑃shotP_{I}(k,\mu)=W_{\rm res}^{2}(k,\mu)\left(P_{\rm clust}(k,\mu)+P_{\rm shot}% \right)\,.italic_P start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ( italic_k , italic_μ ) = italic_W start_POSTSUBSCRIPT roman_res end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_k , italic_μ ) ( italic_P start_POSTSUBSCRIPT roman_clust end_POSTSUBSCRIPT ( italic_k , italic_μ ) + italic_P start_POSTSUBSCRIPT roman_shot end_POSTSUBSCRIPT ) . (12)

For simplicity, we consider only the monopole component of the redshift-space power spectrum, which can be obtained as P0(k)=dμP(k,μ)/2subscript𝑃0𝑘differential-d𝜇𝑃𝑘𝜇2P_{0}(k)=\int{\rm d}\mu P(k,\mu)/2italic_P start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( italic_k ) = ∫ roman_d italic_μ italic_P ( italic_k , italic_μ ) / 2.

At various points in this paper we will invoke other tracers of the large-scale structure, either as proxies of the interloper fluctuations, or as fluctuation fields to be cross-correlated with our target signal. Though our formalism —described in the next section— is general enough to apply to any tracer of the large-scale structure, we will focus on spectroscopic galaxy surveys for simplicity. Whenever we claim that one such tracer originates from the same redshift as some emission line, we will assume that the two also cover the same volume and project them to ztsubscript𝑧tz_{\rm t}italic_z start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT in exactly the same way. Note, however, that the redshift of the ancillary tracer is known, so that we can use separately the contributions from foreground and target volumes, contrary to the case of line interlopers.

We define galaxy number count fluctuations as δg(ng(𝒙)n¯g)/n¯gsubscript𝛿gsubscript𝑛g𝒙subscript¯𝑛gsubscript¯𝑛g\delta_{\rm g}\equiv(n_{\rm g}(\boldsymbol{x})-\bar{n}_{\rm g})/\bar{n}_{\rm g}italic_δ start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT ≡ ( italic_n start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT ( bold_italic_x ) - over¯ start_ARG italic_n end_ARG start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT ) / over¯ start_ARG italic_n end_ARG start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT,101010More optimal fluctuation definitions can be obtained —including for the line-intensity fluctuations— by applying weights to minimize the variance (see e.g., Ref. [75] for a detailed example pertaining to LIM-galaxy cross-correlations). where ngsubscript𝑛gn_{\rm g}italic_n start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT is the local number density of galaxies and n¯gsubscript¯𝑛g\bar{n}_{\rm g}over¯ start_ARG italic_n end_ARG start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT its mean. In this case, we have g=bgFRSD,gsubscriptgsubscript𝑏gsubscript𝐹RSDg\mathcal{B}_{\rm g}=b_{\rm g}F_{\rm RSD,g}caligraphic_B start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT = italic_b start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT roman_RSD , roman_g end_POSTSUBSCRIPT, where bgsubscript𝑏𝑔b_{g}italic_b start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT is the linear galaxy bias, and the galaxy power spectrum, including also the shot noise, is given by

Pg(k,μ)=𝒜v(g)[g2(ktrue,μtrue,zg)Pm(kitrue,zg)+1n¯g],subscript𝑃g𝑘𝜇superscriptsubscript𝒜𝑣gdelimited-[]superscriptsubscriptg2superscript𝑘truesuperscript𝜇truesubscript𝑧gsubscript𝑃msuperscriptsubscript𝑘𝑖truesubscript𝑧g1subscript¯𝑛gP_{\rm g}(k,\mu)=\mathcal{A}_{v}^{(\rm g)}\left[\mathcal{B}_{\rm g}^{2}(k^{\rm true% },\mu^{\rm true},z_{\rm g})P_{\rm m}(k_{i}^{\rm true},z_{\rm g})+\frac{1}{\bar% {n}_{\rm g}}\right]\,,italic_P start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT ( italic_k , italic_μ ) = caligraphic_A start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( roman_g ) end_POSTSUPERSCRIPT [ caligraphic_B start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_k start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT , italic_μ start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT , italic_z start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT ) italic_P start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT ( italic_k start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT , italic_z start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT ) + divide start_ARG 1 end_ARG start_ARG over¯ start_ARG italic_n end_ARG start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT end_ARG ] , (13)

where we have assumed the shot noise is Poissonian. On the other hand, the cross-power spectrum with line-intensity fluctuations is

PIg=Wres𝒜v(i)[igPm+XLI(i)dMdndM|gLin¯g],subscript𝑃𝐼gsubscript𝑊ressuperscriptsubscript𝒜𝑣𝑖delimited-[]subscript𝑖subscriptgsubscript𝑃mevaluated-atsuperscriptsubscript𝑋LI𝑖differential-d𝑀d𝑛d𝑀absentgsubscript𝐿𝑖subscript¯𝑛gP_{I{\rm g}}=\frac{W_{\rm res}}{\mathcal{A}_{v}^{(i)}}\left[\mathcal{B}_{i}% \mathcal{B}_{\rm g}P_{\rm m}+X_{\rm LI}^{(i)}\int{\rm d}M\left.\frac{{\rm d}n}% {{\rm d}M}\right|_{\in{\rm g}}\frac{L_{i}}{\bar{n}_{\rm g}}\right]\,,italic_P start_POSTSUBSCRIPT italic_I roman_g end_POSTSUBSCRIPT = divide start_ARG italic_W start_POSTSUBSCRIPT roman_res end_POSTSUBSCRIPT end_ARG start_ARG caligraphic_A start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT end_ARG [ caligraphic_B start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT caligraphic_B start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT italic_P start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT + italic_X start_POSTSUBSCRIPT roman_LI end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ∫ roman_d italic_M divide start_ARG roman_d italic_n end_ARG start_ARG roman_d italic_M end_ARG | start_POSTSUBSCRIPT ∈ roman_g end_POSTSUBSCRIPT divide start_ARG italic_L start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG start_ARG over¯ start_ARG italic_n end_ARG start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT end_ARG ] , (14)

where we have dropped explicit dependences to ease the reading of the expression and only the source galaxies that are also featured in the galaxy survey contribute to the average luminosity density in the last integral.

II.2 Power spectrum covariance

Assuming white effective experiment noise, the noise power spectrum of single-dish LIM surveys is given by

𝒩I=σN2Vvox2Ndet,subscript𝒩𝐼superscriptsubscript𝜎N2superscriptsubscript𝑉vox2subscript𝑁det\mathcal{N}_{I}=\frac{\sigma_{\rm N}^{2}V_{\rm vox}^{2}}{N_{\rm det}}\,,caligraphic_N start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT = divide start_ARG italic_σ start_POSTSUBSCRIPT roman_N end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_V start_POSTSUBSCRIPT roman_vox end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_N start_POSTSUBSCRIPT roman_det end_POSTSUBSCRIPT end_ARG , (15)

where σNsubscript𝜎N\sigma_{\rm N}italic_σ start_POSTSUBSCRIPT roman_N end_POSTSUBSCRIPT is the standard deviation of the effective noise per voxel, Vvoxsubscript𝑉voxV_{\rm vox}italic_V start_POSTSUBSCRIPT roman_vox end_POSTSUBSCRIPT is the voxel volume, and Ndetsubscript𝑁detN_{\rm det}italic_N start_POSTSUBSCRIPT roman_det end_POSTSUBSCRIPT is the effective number of detectors.

For analytic derivations in this work, we will only consider the Gaussian covariance of the monopole power spectra. Including the contributions from signal and instrumental noise, the covariance of the auto-power spectrum of LIM observations is given by

𝒞II=2(PI+𝒩I)2Nmodes,subscript𝒞𝐼𝐼2superscriptsubscript𝑃𝐼subscript𝒩𝐼2subscript𝑁modes\mathcal{C}_{II}=\frac{2\left(P_{I}+\mathcal{N}_{I}\right)^{2}}{N_{\rm modes}}\,,caligraphic_C start_POSTSUBSCRIPT italic_I italic_I end_POSTSUBSCRIPT = divide start_ARG 2 ( italic_P start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT + caligraphic_N start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_N start_POSTSUBSCRIPT roman_modes end_POSTSUBSCRIPT end_ARG , (16)

where Nmodessubscript𝑁modesN_{\rm modes}italic_N start_POSTSUBSCRIPT roman_modes end_POSTSUBSCRIPT is the number of modes per k𝑘kitalic_k-bin.

For cross-correlations, we will assume that the noise of the LIM experiment is uncorrelated with the galaxy fluctuations, yielding

𝒞Ig=(PI+𝒩I)Pg+PIg2Nmodes,subscript𝒞𝐼gsubscript𝑃𝐼subscript𝒩𝐼subscript𝑃gsubscriptsuperscript𝑃2𝐼gsubscript𝑁modes\mathcal{C}_{I{\rm g}}=\frac{\left(P_{I}+\mathcal{N}_{I}\right)P_{\rm g}+P^{2}% _{I{\rm g}}}{N_{\rm modes}}\,,caligraphic_C start_POSTSUBSCRIPT italic_I roman_g end_POSTSUBSCRIPT = divide start_ARG ( italic_P start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT + caligraphic_N start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ) italic_P start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT + italic_P start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_I roman_g end_POSTSUBSCRIPT end_ARG start_ARG italic_N start_POSTSUBSCRIPT roman_modes end_POSTSUBSCRIPT end_ARG , (17)

where Pgsubscript𝑃gP_{\rm g}italic_P start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT is the total monopole power spectrum of galaxy clustering, including shot noise.

III Cleaning interlopers

In this section, we describe our proposed methodology to minimize the contribution from interlopers to LIM observations using external tracers of the large-scale structure. Our approach is inspired by nulling techniques as applied to weak lensing [61], and in particular to CMB lensing [63, 65, 66, 64].

Let us consider a simplified scenario where in addition to the target spectral line there is a single foreground line interloper, with the two lines separated enough in frequency so that there is no overlap in the volume they trace. In this limit, the intensity fluctuations are the sum of three uncorrelated fields:

δI(𝒙)=δIint(𝒙)+δIt(𝒙)+δ𝒩(𝒙),𝛿𝐼𝒙𝛿subscript𝐼int𝒙𝛿subscript𝐼t𝒙subscript𝛿𝒩𝒙\delta I(\boldsymbol{x})=\delta I_{\rm int}(\boldsymbol{x})+\delta I_{\rm t}(% \boldsymbol{x})+\delta_{\mathcal{N}}(\boldsymbol{x})\,,italic_δ italic_I ( bold_italic_x ) = italic_δ italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT ( bold_italic_x ) + italic_δ italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT ( bold_italic_x ) + italic_δ start_POSTSUBSCRIPT caligraphic_N end_POSTSUBSCRIPT ( bold_italic_x ) , (18)

where the interloper component is subject to projection effects, and δ𝒩subscript𝛿𝒩\delta_{\mathcal{N}}italic_δ start_POSTSUBSCRIPT caligraphic_N end_POSTSUBSCRIPT is the noise fluctuation, assumed to be randomly sampled from a power spectrum given by Eq. (15). The measured power spectrum from these fluctuations is PIPIt+PIint+𝒩Isubscript𝑃𝐼subscript𝑃subscript𝐼tsubscript𝑃subscript𝐼intsubscript𝒩𝐼P_{I}\equiv P_{I_{\rm t}}+P_{I_{\rm int}}+\mathcal{N}_{I}italic_P start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ≡ italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT + italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT end_POSTSUBSCRIPT + caligraphic_N start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT. Similarly, we consider galaxy number counts from a galaxy survey covering the same volume as the interloper line emission above, leading to an overdensity field δgintsuperscriptsubscript𝛿gint\delta_{\rm g}^{\rm int}italic_δ start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT.

After transforming to Fourier space, let us define a ‘cleaned’ version of the line-intensity fluctuations as

δI^(𝒌)=δI(𝒌)(𝒌)δgint(𝒌),𝛿^𝐼𝒌𝛿𝐼𝒌𝒌superscriptsubscript𝛿gint𝒌\delta\hat{I}(\boldsymbol{k})=\delta I(\boldsymbol{k})-\mathcal{F}(\boldsymbol% {k})\delta_{\rm g}^{\rm int}(\boldsymbol{k})\,,italic_δ over^ start_ARG italic_I end_ARG ( bold_italic_k ) = italic_δ italic_I ( bold_italic_k ) - caligraphic_F ( bold_italic_k ) italic_δ start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT ( bold_italic_k ) , (19)

where \mathcal{F}caligraphic_F is a set of weights yet to be determined. Without loss of generality, the measured power spectrum after cleaning can be modeled as

P~I^=P~I2P~Igint+2P~gint,subscript~𝑃^𝐼subscript~𝑃𝐼2subscript~𝑃𝐼superscriptgintsuperscript2subscript~𝑃superscriptgint\tilde{P}_{\hat{I}}=\tilde{P}_{I}-2\mathcal{F}\tilde{P}_{I\rm g^{\rm int}}+% \mathcal{F}^{2}\tilde{P}_{{\rm g}^{\rm int}}\,,over~ start_ARG italic_P end_ARG start_POSTSUBSCRIPT over^ start_ARG italic_I end_ARG end_POSTSUBSCRIPT = over~ start_ARG italic_P end_ARG start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT - 2 caligraphic_F over~ start_ARG italic_P end_ARG start_POSTSUBSCRIPT italic_I roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT + caligraphic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT over~ start_ARG italic_P end_ARG start_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT , (20)

where tildes denote measured quantities including noise. Then, the model that would be used for PIsubscript𝑃𝐼P_{I}italic_P start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT can be applied to the cleaned version of the map, since we know \mathcal{F}caligraphic_F exactly.

We want to improve the precision of the measurement of the power spectrum of the target line by removing contributions from line interlopers. Therefore, we choose \mathcal{F}caligraphic_F to ensure that δI^𝛿^𝐼\delta\hat{I}italic_δ over^ start_ARG italic_I end_ARG is unbiased and the variance of its power spectrum is minimized.

The first condition is fulfilled by definition —since δI=δgint=0delimited-⟨⟩𝛿𝐼delimited-⟨⟩superscriptsubscript𝛿gint0\langle\delta I\rangle=\langle\delta_{\rm g}^{\rm int}\rangle=0⟨ italic_δ italic_I ⟩ = ⟨ italic_δ start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT ⟩ = 0 and \mathcal{F}caligraphic_F is not a statistical quantity. To fulfill the second condition we need the weights to be

(k)=P~Igint(k)P~gint(k)=P~Iintgint(k)P~gint(k).𝑘subscript~𝑃𝐼superscriptgint𝑘subscript~𝑃superscriptgint𝑘subscript~𝑃subscript𝐼intsuperscriptgint𝑘subscript~𝑃superscriptgint𝑘\mathcal{F}(k)=\frac{\tilde{P}_{I{\rm g}^{\rm int}}(k)}{{\tilde{P}}_{{\rm g}^{% \rm int}(k)}}=\frac{\tilde{P}_{I_{\rm int}{\rm g}^{\rm int}}(k)}{{\tilde{P}}_{% {\rm g}^{\rm int}(k)}}\,.caligraphic_F ( italic_k ) = divide start_ARG over~ start_ARG italic_P end_ARG start_POSTSUBSCRIPT italic_I roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ( italic_k ) end_ARG start_ARG over~ start_ARG italic_P end_ARG start_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT ( italic_k ) end_POSTSUBSCRIPT end_ARG = divide start_ARG over~ start_ARG italic_P end_ARG start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ( italic_k ) end_ARG start_ARG over~ start_ARG italic_P end_ARG start_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT ( italic_k ) end_POSTSUBSCRIPT end_ARG . (21)

Note that the denominator in the filter includes noise —which for galaxy surveys is only the shot noise already included in the definition of Pgsubscript𝑃gP_{\rm g}italic_P start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT; if a different tracer including instrumental noise is used, the noise must be explicitly included in the computation of \mathcal{F}caligraphic_F. Alternatively, the power spectra employed to compute the filter could be obtained from a model for the signal component only; this would null the interloper contribution exactly, but it could also introduce noise and, potentially, bias due to model mismatch (cf. Refs. [66, 65, 68]). Instead, we prefer to obtain them from the actual data, rendering the whole procedure model independent.111111This is exactly analogous to the determination of the optimal weights for CMB delensing [62, 63] and ‘redshift-cleaning’ [68]. These applications have been shown to be relatively robust against inaccuracies in the determination of the filters [76, 77]. The key is that we know exactly what filters are applied to the data, such that any filter misspecification is only likely to result in suboptimal variance reduction, but no bias [63, 76, 77].

If we define the cross-correlation coefficient between two fields as

ρab=Pab(Paa+𝒩a)(Pbb+𝒩b),subscript𝜌𝑎𝑏subscript𝑃𝑎𝑏subscript𝑃𝑎𝑎subscript𝒩𝑎subscript𝑃𝑏𝑏subscript𝒩𝑏\rho_{ab}=\frac{{P}_{ab}}{\sqrt{({P}_{aa}+\mathcal{N}_{a})({P}_{bb}+\mathcal{N% }_{b})}}\,,italic_ρ start_POSTSUBSCRIPT italic_a italic_b end_POSTSUBSCRIPT = divide start_ARG italic_P start_POSTSUBSCRIPT italic_a italic_b end_POSTSUBSCRIPT end_ARG start_ARG square-root start_ARG ( italic_P start_POSTSUBSCRIPT italic_a italic_a end_POSTSUBSCRIPT + caligraphic_N start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ) ( italic_P start_POSTSUBSCRIPT italic_b italic_b end_POSTSUBSCRIPT + caligraphic_N start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT ) end_ARG end_ARG , (22)

the resulting power spectrum for δI^𝛿^𝐼\delta\hat{I}italic_δ over^ start_ARG italic_I end_ARG using the filter defined in Eq. (21) is given by

P~I^=PIt+PIint(1ρIintgint2)+𝒩I==P~I(1ρ~Igint2),subscript~𝑃^𝐼subscript𝑃subscript𝐼tsubscript𝑃subscript𝐼int1subscriptsuperscript𝜌2subscript𝐼intsuperscriptgintsubscript𝒩𝐼subscript~𝑃𝐼1subscriptsuperscript~𝜌2𝐼superscript𝑔int\begin{split}\tilde{P}_{\hat{I}}&=P_{I_{\rm t}}+P_{I_{\rm int}}\left(1-\rho^{2% }_{I_{\rm int}{\rm g}^{\rm int}}\right)+\mathcal{N}_{I}=\\ &=\tilde{P}_{I}\left(1-\tilde{\rho}^{2}_{Ig^{\rm int}}\right)\,,\end{split}start_ROW start_CELL over~ start_ARG italic_P end_ARG start_POSTSUBSCRIPT over^ start_ARG italic_I end_ARG end_POSTSUBSCRIPT end_CELL start_CELL = italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT + italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( 1 - italic_ρ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ) + caligraphic_N start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT = end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL = over~ start_ARG italic_P end_ARG start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ( 1 - over~ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_I italic_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ) , end_CELL end_ROW (23)

where ρIintgint2subscriptsuperscript𝜌2subscript𝐼intsuperscriptgint\rho^{2}_{I_{\rm int}{\rm g^{int}}}italic_ρ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT in the first line is computed without the noise in the line-intensity map. The two expressions above are equivalent, since the inclusion of noise and target fluctuations in the cross-correlation coefficient reduces its value. While the former expression provides a clearer idea of the origin of the cleaning and the reduction in the power spectrum due to a suppression of interloper clustering, the latter shows that we can predict the resulting power spectrum after the cleaning only with measurable quantities from the data.

Note that, even if the ancillary galaxy sample traces the same underlying matter fluctuations as the interloper line, noise, instrumental resolution and nonlinear clustering will prevent the cleaning from being perfect. The correlation coefficient can differ from unity even on large scales due to shot noise from the discreteness of the galaxies, instrumental noise on the LIM observations, and contributions like non-linear galaxy bias and halo exclusion which mimic stochasticity on large scales. This has been shown explicitly in the context of LIM auto- and cross-correlations in simulations (see e.g. [78, 79, 80]).

The set of weights above can be straightforwardly extended to remove multiple interlopers, or to combine several external tracers for more effective cleaning —more details are provided in appendix B in the harmonic formalism. When the contributions from different lines trace overlapping volumes, the formalism must be extended to take into account the correlation between them. We leave the exploration of the cases that would follow this scenario for future work.

A rather unique benefit of our approach to removing interloper contamination is that we eliminate the actual realization of the contaminant present in our observations —at least partially. Along with the contaminant, then, we also remove some of its sample variance, improving the precision of the statistics we can extract from the observations. Explicitly, the Gaussian covariance of the power spectrum of the cleaned intensity given in Eq. (23) is

𝒞I^I^=2[PIt+PIint(1ρIintgint2)+𝒩I]2Nmodes.subscript𝒞^𝐼^𝐼2superscriptdelimited-[]subscript𝑃subscript𝐼tsubscript𝑃subscript𝐼int1subscriptsuperscript𝜌2subscript𝐼intsuperscriptgintsubscript𝒩𝐼2subscript𝑁modes\mathcal{C}_{\hat{I}\hat{I}}=\frac{2\left[P_{I_{\rm t}}+P_{I_{\rm int}}\left(1% -\rho^{2}_{I_{\rm int}{\rm g}^{\rm int}}\right)+\mathcal{N}_{I}\right]^{2}}{N_% {\rm modes}}\,.caligraphic_C start_POSTSUBSCRIPT over^ start_ARG italic_I end_ARG over^ start_ARG italic_I end_ARG end_POSTSUBSCRIPT = divide start_ARG 2 [ italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT + italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( 1 - italic_ρ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ) + caligraphic_N start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ] start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_N start_POSTSUBSCRIPT roman_modes end_POSTSUBSCRIPT end_ARG . (24)

Removing the contribution from interlopers will also be crucial to increase the detection significance of cross-correlations. This is because the auto-power spectra of both tracers contribute to the covariance of their cross-power spectrum, as shown in Eq. (17). To see this more explicitly, consider the cross-correlation of a target line-intensity map with galaxy number counts ngtsuperscriptsubscript𝑛gtn_{\rm g}^{\rm t}italic_n start_POSTSUBSCRIPT roman_g end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_t end_POSTSUPERSCRIPT probing the same volume. The covariance of their cross-spectrum is

𝒞I^gt=[PIt+PIint(1ρIintgint2)+𝒩I]Pgt+PItgt2Nmodes.subscript𝒞^𝐼superscriptgtdelimited-[]subscript𝑃subscript𝐼tsubscript𝑃subscript𝐼int1superscriptsubscript𝜌subscript𝐼intsuperscriptgint2subscript𝒩𝐼subscript𝑃superscriptgtsuperscriptsubscript𝑃subscript𝐼tsuperscriptgt2subscript𝑁modes\mathcal{C}_{\hat{I}{\rm g}^{\rm t}}=\frac{\left[{P}_{I_{\rm t}}+P_{I_{\rm int% }}\left(1-\rho_{I_{\rm int}{\rm g}^{\rm int}}^{2}\right)+\mathcal{N}_{I}\right% ]{P}_{{\rm g}^{\rm t}}+P_{I_{\rm t}{\rm g}^{\rm t}}^{2}}{N_{\rm modes}}\,.caligraphic_C start_POSTSUBSCRIPT over^ start_ARG italic_I end_ARG roman_g start_POSTSUPERSCRIPT roman_t end_POSTSUPERSCRIPT end_POSTSUBSCRIPT = divide start_ARG [ italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT + italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( 1 - italic_ρ start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) + caligraphic_N start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ] italic_P start_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_t end_POSTSUPERSCRIPT end_POSTSUBSCRIPT + italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_t end_POSTSUPERSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_N start_POSTSUBSCRIPT roman_modes end_POSTSUBSCRIPT end_ARG . (25)

Note that the quantity in the square brackets in Eqs. (24) and (25) can equally be expressed in terms of PItot(1ρItotgint2)subscript𝑃subscriptItot1subscriptsuperscript𝜌2subscript𝐼totsuperscript𝑔intP_{\rm I_{\rm tot}}\left(1-\rho^{2}_{I_{\rm tot}g^{\rm int}}\right)italic_P start_POSTSUBSCRIPT roman_I start_POSTSUBSCRIPT roman_tot end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( 1 - italic_ρ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_tot end_POSTSUBSCRIPT italic_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ), as done in Eq. (23).

IV Validation against simulations

Before investigating the gains the cleaning offers, let us validate the analytic prediction for the power spectrum of the cleaned line-intensity map shown in Eq. (23). We use the code simple121212Publicly available at https://github.com/mlujnie/simple. [81] to generate log-normal simulations of the fields of interest. simple is based on log-normal galaxy simulations131313https://bitbucket.org/komatsu5147/lognormal_galaxies/src/master/ [82], to which it adds line intensities according to an input line-luminosity function.

We choose a setup mimicking HETDEX observations [11, 81] to validate our analytic derivations. We consider observations in the wavelength range λ[439.4, 541.4]𝜆439.4541.4\lambda\in\left[439.4,\,541.4\right]italic_λ ∈ [ 439.4 , 541.4 ] nm. This range corresponds to zt[2.61, 3.45]subscript𝑧t2.613.45z_{\rm t}\in\left[2.61,\,3.45\right]italic_z start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT ∈ [ 2.61 , 3.45 ] for the target Lyman-α𝛼\alphaitalic_α line (Lyα𝛼\alphaitalic_α, 121.6 nm at rest frame), and zint[0.18, 0.45]subscript𝑧int0.180.45z_{\rm int}\in\left[0.18,\,0.45\right]italic_z start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT ∈ [ 0.18 , 0.45 ] for the foreground interloper [OII] line (373.7 nm at rest frame); the center redshifts, computed from the mean frequency, correspond to zt=2.99subscript𝑧t2.99z_{\rm t}=2.99italic_z start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT = 2.99 and zint=0.30subscript𝑧int0.30z_{\rm int}=0.30italic_z start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT = 0.30, respectively. In addition to the line-intensity maps, we generate log-normal realizations of galaxy number counts at the same two redshifts.

We compute the Lyα𝛼\alphaitalic_α and [OII] luminosity functions to input to simple at the corresponding redshifts assuming empirical relations between the line luminosity and the star-formation rate (SFR). For the Lyα𝛼\alphaitalic_α line we use [40]

LLyαL=4.18×108(SFR(M)Myr1)fesc(SFR(M),z),subscript𝐿Ly𝛼subscript𝐿direct-product4.18superscript108SFR𝑀subscript𝑀direct-productsuperscriptyr1subscript𝑓escSFR𝑀𝑧\frac{L_{\rm Ly\alpha}}{L_{\odot}}=4.18\times 10^{8}\left(\frac{{\rm SFR}(M)}{% M_{\odot}\text{yr}^{-1}}\right)f_{\rm esc}({\rm SFR}(M),z)\,,divide start_ARG italic_L start_POSTSUBSCRIPT roman_Ly italic_α end_POSTSUBSCRIPT end_ARG start_ARG italic_L start_POSTSUBSCRIPT ⊙ end_POSTSUBSCRIPT end_ARG = 4.18 × 10 start_POSTSUPERSCRIPT 8 end_POSTSUPERSCRIPT ( divide start_ARG roman_SFR ( italic_M ) end_ARG start_ARG italic_M start_POSTSUBSCRIPT ⊙ end_POSTSUBSCRIPT yr start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT end_ARG ) italic_f start_POSTSUBSCRIPT roman_esc end_POSTSUBSCRIPT ( roman_SFR ( italic_M ) , italic_z ) , (26)

where the escape fraction is parametrized as

fesc(SFR(M),z)=(1+e1.6z+5)1/2××[0.18+0.821+0.8(SFR(M)Myr1)0.875]2.subscript𝑓escSFR𝑀𝑧superscript1superscript𝑒1.6𝑧512superscriptdelimited-[]0.180.8210.8superscriptSFR𝑀subscript𝑀direct-productsuperscriptyr10.8752\begin{split}f_{\rm esc}&({\rm SFR}(M),z)=\left(1+e^{-1.6z+5}\right)^{-1/2}% \times\\ &\times\left[0.18+\frac{0.82}{1+0.8\left(\frac{{\rm SFR}(M)}{M_{\odot}\text{yr% }^{-1}}\right)^{0.875}}\right]^{2}\,.\end{split}start_ROW start_CELL italic_f start_POSTSUBSCRIPT roman_esc end_POSTSUBSCRIPT end_CELL start_CELL ( roman_SFR ( italic_M ) , italic_z ) = ( 1 + italic_e start_POSTSUPERSCRIPT - 1.6 italic_z + 5 end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT - 1 / 2 end_POSTSUPERSCRIPT × end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL × [ 0.18 + divide start_ARG 0.82 end_ARG start_ARG 1 + 0.8 ( divide start_ARG roman_SFR ( italic_M ) end_ARG start_ARG italic_M start_POSTSUBSCRIPT ⊙ end_POSTSUBSCRIPT yr start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT end_ARG ) start_POSTSUPERSCRIPT 0.875 end_POSTSUPERSCRIPT end_ARG ] start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT . end_CELL end_ROW (27)

In turn, for [OII] we use  [83]

LOIIL=1.87×107(SFR(M)Myr1)100.62/2.5.subscript𝐿OIIsubscript𝐿direct-product1.87superscript107SFR𝑀subscript𝑀direct-productsuperscriptyr1superscript100.622.5\frac{L_{\rm OII}}{L_{\odot}}=1.87\times 10^{7}\left(\frac{{\rm SFR}(M)}{M_{% \odot}\text{yr}^{-1}}\right)10^{-0.62/2.5}\,.divide start_ARG italic_L start_POSTSUBSCRIPT roman_OII end_POSTSUBSCRIPT end_ARG start_ARG italic_L start_POSTSUBSCRIPT ⊙ end_POSTSUBSCRIPT end_ARG = 1.87 × 10 start_POSTSUPERSCRIPT 7 end_POSTSUPERSCRIPT ( divide start_ARG roman_SFR ( italic_M ) end_ARG start_ARG italic_M start_POSTSUBSCRIPT ⊙ end_POSTSUBSCRIPT yr start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT end_ARG ) 10 start_POSTSUPERSCRIPT - 0.62 / 2.5 end_POSTSUPERSCRIPT . (28)

We assume the SFR-to-halo mass and redshift relation from Ref. [84] along with the quenching fractions from Ref. [85]. We transform these relations to luminosity functions assuming the halo mass function from Ref. [72], and compute the luminosity-averaged bias with the halo-bias fitting function from Ref. [73].

We generate 100 redshift-space realizations for the setup described above. We do not include instrumental noise or foreground-continuum signal and mitigation, and boost the number density for the galaxy number counts to consider a signal-dominated regime and thus a stricter test for the validation. We neglect line broadening for simplicity; we refer the reader to Refs. [86] and [87] for a discussion about the effects of line broadening on the LIM power spectrum and 1-point statistics. The details for the procedure followed to generate each of the realizations implementing the projection effects are described below:

Refer to caption
Figure 2: Validation of the analytic prediction for the power spectrum of the line-intensity map cleaned using the approach proposed in this paper. Top panel: monopole of the redshift-space power spectrum of the target Lyα𝛼\alphaitalic_α line (red) and the total, interloper-contaminated intensity [OII]+++Lyα𝛼\alphaitalic_α before (blue) and after cleaning (yellow). The theoretical prediction for the power after cleaning (derived from the measured power spectra) lies exactly on top of the simulated result. Bottom panel: cross-correlation coefficient between the [OII] intensity fluctuations and the foreground galaxy number counts.
  1. 1.

    We generate a log-normal realization of ‘target’ Lyα𝛼\alphaitalic_α emission at zt=2.99subscript𝑧t2.99z_{\rm t}=2.99italic_z start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT = 2.99 over a cubic box with side length 256 Mpc/habsent/h/ italic_h using a grid of 1283 cubic cells of side length 2Mpc/h2Mpc2\,{\rm Mpc}/h2 roman_Mpc / italic_h. Using the same random seed —to ensure the same density fluctuations underlie every realization— we generate a log-normal realization of number counts with linear bias bg=2.0subscript𝑏𝑔2.0b_{g}=2.0italic_b start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT = 2.0 and number density 0.17 (Mpc/habsent/h/ italic_h)-3.

  2. 2.

    We determine the volume for the simulated foreground emission. The foreground volume at zint=0.30subscript𝑧int0.30z_{\rm int}=0.30italic_z start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT = 0.30 that projects onto the cubic box described above is a rectangular box with side lengths L=48.8Mpc/hsubscript𝐿perpendicular-to48.8MpcL_{\perp}=48.8\,{\rm Mpc}/hitalic_L start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT = 48.8 roman_Mpc / italic_h and L=321.2Mpc/hsubscript𝐿parallel-to321.2MpcL_{\parallel}=321.2\,{\rm Mpc}/hitalic_L start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT = 321.2 roman_Mpc / italic_h. Similarly, the foreground volume that projects onto each cell of the background box is 0.382×2.51(Mpc/h)3superscript0.3822.51superscriptMpc30.38^{2}\times 2.51\,({\rm Mpc}/h)^{3}0.38 start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT × 2.51 ( roman_Mpc / italic_h ) start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT. However, in order to simulate all the relevant density-fluctuation modes, we consider a larger cubic box of side 321.2Mpc/h321.2Mpc321.2\,{\rm Mpc}/h321.2 roman_Mpc / italic_h and grid of 8433 cells.

  3. 3.

    We change the random seed for the foreground distributions to avoid any correlation with the realizations described above. We generate the foreground [OII] realization and a galaxy-number-counts realization with number density 2.8 (Mpc/habsent/h/ italic_h)-3 and bias bg=1.3subscript𝑏𝑔1.3b_{g}=1.3italic_b start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT = 1.3 using the same random seed.

  4. 4.

    We downsample the realization along the line of sight into a grid of 8432×128superscript8432128843^{2}\times 128843 start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT × 128 cells, and save only the central 1283superscript1283128^{3}128 start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT of them. The stored (rectangular) volume of 1283superscript1283128^{3}128 start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT(rectangular) cells corresponds exactly to the volume that would project onto the background box when interpreted at ztsubscript𝑧tz_{\rm t}italic_z start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT.

  5. 5.

    In order to analyze only the fluctuations in the line intensities and galaxy number counts, we remove the mean of each slice along the line of sight from each map.

  6. 6.

    To facilitate the analysis, we interpret all 4 fields as being contained in a nominal box of (256 Mpc/h)3/h)^{3}/ italic_h ) start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT.

  7. 7.

    We apply an anisotropic Gaussian filter to the line-intensity realizations, using standard deviation widths of σ=4Mpc/hsubscript𝜎perpendicular-to4Mpc\sigma_{\perp}=4\,{\rm Mpc}/hitalic_σ start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT = 4 roman_Mpc / italic_h and σ=6Mpc/hsubscript𝜎parallel-to6Mpc\sigma_{\parallel}=6\,{\rm Mpc}/hitalic_σ start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT = 6 roman_Mpc / italic_h to simulate limited experimental resolution.

Once we have all the realizations, we compute the auto- and cross-power spectra of all correlated combinations, considering also the total line-intensity power spectrum by adding together the Lyα𝛼\alphaitalic_α and [OII] maps. Note that since the random seed used to generate the background and foreground tracers is different, they are not correlated.

Refer to caption
Figure 3: Square-root of the variance (leftmost panels) and correlation matrices (additional panels to the right) of the LIM auto-spectrum (top) and cross-spectrum with galaxies (middle), estimated from the lognormal realizations described in Sec. IV. For each of these, we show the case of the total intensity fluctuations (blue lines in the leftmost panels), the cleaned intensity maps (yellow lines), and the Lyα𝛼\alphaitalic_α-only intensity maps (red); the correlation matrices are labeled accordingly. We show slices of the correlation matrices in the bottom panels.

From these products we can also compute the cleaning weights \mathcal{F}caligraphic_F from Eq. (21) and generate the cleaned maps. Finally, we compute the power spectrum of the cleaned map and its cross-correlation with the background galaxies. We also compute the cross-correlation coefficients and use them to evaluate the theoretical prediction for the power spectrum of the cleaned maps, Eq. (23), in combination with the measured power spectra of the uncleaned maps and our choice of \mathcal{F}caligraphic_F.

Figure 2 shows the measured intensity power spectra of only Lyα𝛼\alphaitalic_α fluctuations, total intensity, and the cleaned intensity maps, as well as the theoretical prediction for the power spectrum after cleaning. We can see that the prediction and the measurement of the power spectrum of the cleaned map match very precisely, validating the theoretical prediction derived in Sec. III.

As anticipated in Sec. III —see Eqs. (24) and (25)—, partially removing the contributions from interlopers to the line-intensity maps does indeed significantly reduce the covariance of the auto- and cross-power spectra. We demonstrate this reduction in Fig. 3 by showing the square root of the diagonal of the covariance matrices, estimated from the 100 realizations described above, and the corresponding correlation matrices 𝒞ab𝒞aa𝒞bbsubscript𝒞𝑎𝑏subscript𝒞𝑎𝑎subscript𝒞𝑏𝑏\mathcal{R}\equiv\mathcal{C}_{ab}\sqrt{\mathcal{C}_{aa}\mathcal{C}_{bb}}caligraphic_R ≡ caligraphic_C start_POSTSUBSCRIPT italic_a italic_b end_POSTSUBSCRIPT square-root start_ARG caligraphic_C start_POSTSUBSCRIPT italic_a italic_a end_POSTSUBSCRIPT caligraphic_C start_POSTSUBSCRIPT italic_b italic_b end_POSTSUBSCRIPT end_ARG. Notice how the cleaning significantly reduces the covariance even for the cross-power spectrum —the amplitude of which does not receive a contribution from interlopers.

Interestingly, Fig. 3 reveals an additional, problematic feature arising from the interloper contribution to the measured line-intensity maps. Given the disparate values of qsubscript𝑞parallel-toq_{\parallel}italic_q start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT and qsubscript𝑞perpendicular-toq_{\perp}italic_q start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT that drive the projection effects of the interloper fluctuations (see Fig. 1), a measured k𝑘kitalic_k mode in the target volume corresponds to much higher ktruesuperscriptsubscript𝑘perpendicular-totruek_{\perp}^{\rm true}italic_k start_POSTSUBSCRIPT ⟂ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT and smaller ktruesuperscriptsubscript𝑘parallel-totruek_{\parallel}^{\rm true}italic_k start_POSTSUBSCRIPT ∥ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_true end_POSTSUPERSCRIPT. This results in strong mode coupling of the measured auto-power spectrum, which is manifested as strong off-diagonal covariance, as shown in the correlation matrices of Fig. 3. Even if the diagonal covariance is significantly reduced after cleaning, the reduction of the off-diagonal elements in the correlation matrix is much smaller. This is likely due to the fact that we used an isotropic set of filtering weights. If instead they were anisotropic —for which they could also be tailored to minimize higher-order Legendre multipoles—, we expect that the off-diagonal correlations would be reduced further. Similarly, harmonic cleaning in narrow frequency bins effectively corresponds to anisotropic filtering weights, although the cleaning performance in this case may be limited by an increased galaxy shot noise due to the binning of the galaxy catalog. We leave the exploration of these improved filtering weights to future work.

V Discussion

In the previous section we validated the theoretical prediction for the power spectrum of the cleaned line-intensity map after removing the interloper using Eq. (19). We also checked that this cleaning procedure significantly reduces the covariance of the auto- and cross-power spectra. Nonetheless, the case considered is but an example used to validate the predictions. In this section, we provide an approximate metric for the gains we can expect from removing interlopers more generally.

Let us start by assuming the (diagonal) Gaussian covariances from Sec. III, and quantifying the signal-to-noise ratio on the power spectrum of just the target line. After removing the interlopers, this improves to

SNRauto2=kNmodesPIt22[PIt+PIint(1ρIintgint2)+𝒩I]2,subscriptsuperscriptSNR2autosubscript𝑘subscript𝑁modessuperscriptsubscript𝑃subscript𝐼t22superscriptdelimited-[]subscript𝑃subscript𝐼tsubscript𝑃subscript𝐼int1subscriptsuperscript𝜌2subscript𝐼intsuperscriptgintsubscript𝒩𝐼2{\rm SNR}^{2}_{\rm auto}=\sum_{k}\frac{N_{\rm modes}P_{I_{\rm t}}^{2}}{2\left[% P_{I_{\rm t}}+P_{I_{\rm int}}\left(1-\rho^{2}_{I_{\rm int}{\rm g}^{\rm int}}% \right)+\mathcal{N}_{I}\right]^{2}}\,,roman_SNR start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_auto end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT divide start_ARG italic_N start_POSTSUBSCRIPT roman_modes end_POSTSUBSCRIPT italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 [ italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT + italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( 1 - italic_ρ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ) + caligraphic_N start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ] start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG , (29)

while in the case of the cross-correlation of the target signal with galaxies, we have instead

SNR×2=kNmodesPItgt2[PIt+PIint(1ρIintgt2)+𝒩I]Pgt+PItgt2.subscriptsuperscriptSNR2subscript𝑘subscript𝑁modessuperscriptsubscript𝑃subscript𝐼tsuperscriptgt2delimited-[]subscript𝑃subscript𝐼tsubscript𝑃subscript𝐼int1superscriptsubscript𝜌subscript𝐼intsuperscriptgt2subscript𝒩𝐼subscript𝑃superscriptgtsuperscriptsubscript𝑃subscript𝐼tsuperscriptgt2{\rm SNR}^{2}_{\times}=\sum_{k}\frac{N_{\rm modes}P_{I_{\rm t}{\rm g^{t}}}^{2}% }{\left[{P}_{I_{\rm t}}+P_{I_{\rm int}}\left(1-\rho_{I_{\rm int}{\rm g^{t}}}^{% 2}\right)+\mathcal{N}_{I}\right]P_{\rm g^{t}}+P_{I_{\rm t}{\rm g^{t}}}^{2}}\,.roman_SNR start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT × end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT divide start_ARG italic_N start_POSTSUBSCRIPT roman_modes end_POSTSUBSCRIPT italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_t end_POSTSUPERSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG [ italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT + italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( 1 - italic_ρ start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_t end_POSTSUPERSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) + caligraphic_N start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ] italic_P start_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_t end_POSTSUPERSCRIPT end_POSTSUBSCRIPT + italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_t end_POSTSUPERSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG . (30)

From the expressions above, it is clear that the signal-to-noise ratio on the target-signal components improves after cleaning. Moreover, following on from the conclusions of Eq. (23) for the power spectrum, the improvements in SNR increases as the cleaning tracer and the interloper intensity fluctuations become more correlated. If we take the ratio of the SNRs per k𝑘kitalic_k-mode before (i.e., taking ρIintgint=0subscript𝜌subscript𝐼intsuperscriptgint0\rho_{I_{\rm int}{\rm g^{\rm int}}}=0italic_ρ start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT = 0) and after interloper removal, we get, after minimal manipulations

SNRauto(k)SNRauto,nonull.(k)=1+PIintPIt+𝒩(1ρIintgint2)1+PIintPIt+𝒩,subscriptSNRauto𝑘subscriptSNRautononull𝑘1subscript𝑃subscript𝐼intsubscript𝑃subscript𝐼t𝒩1subscriptsuperscript𝜌2subscript𝐼intsuperscriptgint1subscript𝑃subscript𝐼intsubscript𝑃subscript𝐼t𝒩\frac{{\rm SNR}_{\rm auto}(k)}{{\rm SNR}_{\rm auto,\,no\,null.}(k)}=\frac{1+% \frac{P_{I_{\rm int}}}{P_{I_{\rm t}}+\mathcal{N}}\left(1-\rho^{2}_{I_{\rm int}% {\rm g}^{\rm int}}\right)}{{1+\frac{P_{I_{\rm int}}}{P_{I_{\rm t}}+\mathcal{N}% }}}\,,divide start_ARG roman_SNR start_POSTSUBSCRIPT roman_auto end_POSTSUBSCRIPT ( italic_k ) end_ARG start_ARG roman_SNR start_POSTSUBSCRIPT roman_auto , roman_no roman_null . end_POSTSUBSCRIPT ( italic_k ) end_ARG = divide start_ARG 1 + divide start_ARG italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG start_ARG italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT + caligraphic_N end_ARG ( 1 - italic_ρ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ) end_ARG start_ARG 1 + divide start_ARG italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG start_ARG italic_P start_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT roman_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT + caligraphic_N end_ARG end_ARG , (31)

and similarly for the cross-power spectrum. Note that using the Gaussian covariance as our metric to evaluate the increase in detection significance we ignore any mode coupling in the power spectra. As we find for our example in Fig. 3, mode coupling can be quite large. Hence, the estimation using the expression above corresponds to idealized scenarios and shall be used just as a qualitative guide.

Refer to caption
Figure 4: Contour plot of the expected boost factor in the signal-to-noise ratio of auto- and cross-spectra at a given k𝑘kitalic_k value after removing the interlopers. We show results for auto-correlation (red) and cross-correlation (assuming a cross-correlation coefficient of 0.7 and 1 in purple and orange, respectively) as functions of the ratio of the interloper and target-line power spectra and the cross-correlation coefficient between the interloper fluctuations and the tracer of the large-scale structure used to remove them.

This is still useful, since we can estimate the boost factor in the detection significance for each wavenumber in the most general possible way, as a function of only the correlation coefficient between the foreground tracers and the interloper as well as of the ratio between the projected interloper power spectrum and the target-line power spectrum (including noise). We show this in Fig. 4. As expected, the improvement factor can be very large for auto-correlations, since interlopers contribute to the measured power spectrum and its covariance. For bright interlopers and a high correlation between their fluctuations and the tracer used to null them, we can expect improvements in the SNR of auto- and cross-correlations of up to a factors of 6similar-toabsent6\sim 6∼ 6 and 2, respectively, per wavenumber. This shows how powerful this approach can be when it comes to reducing line-interloper contamination in LIM observations.

The performance of the technique proposed in this work to remove the interlopers depends on the relative intensity between the interloper and target emissions: the brighter the interloper emission, the higher the gain after cleaning. In addition, and more importantly, the performance depends on the correlation between the interloper-intensity fluctuations and the tracer of the large-scale structure used to remove them: as this is a statistical removal, the higher the correlation, more effective is the cleaning. We can anticipate two sources of decorrelation that may limit this technique. First, shot noise from galaxy number counts reduces the correlation coefficient with other tracers. Since photometric surveys are not ideal for cross-correlations with LIM141414This is because of the loss of short parallel modes in photometric surveys due to photometric redshift errors and the loss of long parallel modes in LIM observations due to continuum-contamination cleaning. Therefore, there is only a small overlap in Fourier modes between the two observed fields. and spectroscopic galaxy surveys typically include many fewer galaxies, this may be a problem (photometric surveys with very good redshift determination may however be useful). Second, non-linear clustering effects also reduce the correlation, mostly due to stochasticity and non-linear biasing —see e.g., Ref. [79] for an example in LIM simulations painted over analyses N-body simulation. This latter point gets aggravated by the projection effects (see Fig. 1): a given k𝑘kitalic_k measured at the background volume corresponds to much smaller scales —and is hence more affected by nonlinear clustering— in the perpendicular direction for the foreground fluctuations. This is why we see the interloper removal work better on large scales, where correlations are higher. Since the correlation coefficient is highly case-dependent —depending on emission line, redshift, etc.—, we defer in-detail investigations to future work.

Nonetheless, the main benefit of this technique is that it is completely model independent —it does not assume anything about the interloper, since both the ancillary map used as a tracer of its intensity fluctuations and the filter are obtained from direct measurements— and it is straightforward to apply. The only requirements for the technique to be effective is that the ancillary map used as a proxy of the interlopers overlap with them in volume and have good redshift determination. This is fulfilled for most current and near future, as they have been planned to overlap with spectroscopic galaxy surveys (see Fig. 9 in Ref. [2]). Crucially, since our procedure involves removing the exact realization of the interloper fluctuations, it offers a reduction in variance unmatched by similar cleaning techniques based on pure modeling approaches. Finally, estimating how much interloper signal has been removed requires only knowledge of the correlation coefficient of the intensity map with the cleaning tracer, and thus can also be obtained from the measurements.

With the exception of blind masking, other proposed techniques to remove interloper contamination rely on model assumptions. For instance, targeted masking must assume a relation between the galaxy properties and interloper intensity to select which voxels to mask; the direct modeling of interlopers or the use of spectral templates require a model of the line intensities; and the application of machine learning approaches needs accurate simulations. Considering that this method is complementary to any of the techniques listed above, its model independence can be very beneficial for an unbiased and optimal cleaning of interlopers. As an example, masking attempts to remove the brightest sources, while our interloper removal aims to eliminate statistical contributions, which may also take care of the diffuse emission. We leave the exploration of the synergies between the interloper-cleaning technique proposed here and other interloper mitigation techniques to future work.

Note that the performance of interloper removal can be optimized by using more than one tracer of the foreground fluctuations. Appendix A explains how to optimally combine multiple tracers in such a scenario. Moreover, these weights can also be used to combine samples split into various redshift bins; doing so provides added freedom to weight each shell independently and thus better match the effective redshift distribution of the combined tracer to that of the target line (discussions on this abound in the CMB lensing literature; see e.g., Refs. [63, 66, 77]).

VI Conclusions

In this work we have proposed to minimize the contributions from line interlopers to LIM observations by using ancillary tracers of the large-scale structure as proxies for the spatial intensity fluctuations of the contaminating signal. The only requisite for the application of our method is the availability of external, ancillary observations of tracers of the large-scale structure overlapping in volume with the interloper contribution that we aim to null. Since current and upcoming LIM surveys are being designed to overlap with galaxy surveys, such ancillary tracers should be readily available.

The removal is achieved by subtracting the fluctuations of the ancillary tracer from the LIM observations, after having first filtered the former using a set of weights designed to minimize the variance of the power spectrum of the cleaned map. These weights can be directly calibrated from the data, so the changes in the power spectrum after cleaning can be modeled using only measurable quantities. Hence, interloper removal as presented in this work reduces the interloper contribution to the LIM power spectrum in a model-independent way without increasing the noise of the resulting map. Therefore, this is a cheap and ready-to-use technique that allows us to statistically remove line-interloper contamination.

We have derived the optimal filter to perform the cleaning as well as a model for the power spectra of the cleaned line-intensity fluctuations. Afterwards, we have successfully validated this prediction with log-normal simulations, and estimated the boost in detection significance for the LIM power spectrum and the cross-power spectrum with other large-scale structure tracers —assuming a Gaussian covariance. The gain in detection significance of the power spectrum of interest depends on two factors: first, the relative brightness of the interlopers with respect to the target signal; and second, the extent of correlation between the external ancillary tracer used and the interloper fluctuations. We have found that for bright interlopers, the detection significance can improve up to a factor 6 and 2 for the LIM auto- and cross-power spectra, respectively. Note, however, that this idealized treatment ignores mode coupling, which can be quite large due to the anisotropy induced on the interloper fluctuations by projection effects.

In this work we have focused on the Legendre monopole of the redshift-space power spectrum. Given the strong anisotropies introduced in the interloper fluctuations by the projection effects, extending the cleaning to the anisotropic power spectrum may improve the performance of the nulling. For instance, the filtering weights can be designed to minimize the variance of the anisotropic power spectrum, and may also result in a reduction of the correlation between different k𝑘kitalic_k-bins. Note that improving the detection significance of the quadrupole may have a large impact on the scientific output since higher-order multipoles of the power spectrum can break degeneracies between astrophysics and cosmology [88]. Similarly, the cleaning can be optimized for other summary statistics (e.g., higher-order correlators) by imposing the minimization of the interloper contributions at the time of deriving the optimal weights.

We have applied the cleaning in three-dimensional Fourier space. Alternatively, the cleaning can also be applied separately to each frequency channel in harmonic space, after which the observations can be projected to a three-dimensional Cartesian grid. In such cases, projection effects may be less of a problem and the removal of interlopers may be more tailored to each observed frequency. In addition, this approach may more efficiently reduce the mode coupling due to the projection effects of the line interloper. However, higher shot noise of the ancillary tracer may limit the performance as it reduces the correlation with the interloper signal. Moreover, we have focused on ancillary tracers with good redshift determination. Other alternatives such as photometric galaxy surveys with good redshift determination might be efficient too, but their low correlation with LIM observations after continuum foreground cleaning may severely hinder the nulling performance. Note also that the high-redshift tail of the distribution of the ancillary tracers may correlate with the target signal, which would require developing new weights —following Ref. [68]. Finally, the cleaning can be further improved employing more than one tracer of the large-scale structure —as explored in the appendix—, and splitting the tracer into redshift bins that are weighted separately in order to better match the redshift distribution of the target.

In summary, this work proposes a new way to remove line-interloper emission from LIM observations that is also highly complementary to previous mitigation strategies. While other cleaning techniques focus on the brightest sources and are very model dependent, our proposed removal approach eliminates contributions statistically and in a model independent way. Moreover, it can potentially deal with diffuse emission on large scales too. Therefore, we expect this approach to perform especially well in combination with other techniques. In a similar fashion, this approach can easily be extended to deal with continuum foregrounds both for LIM and wide-band observations.

In light of the expected sensitivity of forthcoming LIM surveys and their extensive overlap with spectroscopic galaxy surveys, we think it is worthwhile to further develop this technique as required by specific scenarios to deliver on the vast scientific promise of line-intensity mapping.

Acknowledgements.
The authors would like to thank the Centro de Ciencias de Benasque Pedro Pascual, Spain, and the organizers of the Understanding Cosmological Observations workshop, where work on this paper first started. JLB acknowledges funding from the Ramón y Cajal Grant RYC2021-033191-I, financed by MCIN/AEI/10.13039/501100011033 and by the European Union “NextGenerationEU”/PRTR, as well as the project UC-LIME (PID2022-140670NA-I00), financed by MCIN/AEI/ 10.13039/501100011033/FEDER, UE. The authors thank the computer resources provided by the Spanish Supercomputing Network (RES) node at Universidad de Cantabria and the Institute of Physics of Cantabria (IFCA-CSIC).

Appendix A Harmonic-space formalism

We have focused on three-dimensional line-intensity maps, as is customary in LIM analyses. Alternatively, it is also possible to null interlopers before projecting the observations to three-dimensional space, in the context of a radial-angular split that is better suited to the basis of the observations and later apply the projection.

Suppose an observed field is given by X(𝒏^,νobs)𝑋^𝒏subscript𝜈obsX(\hat{\boldsymbol{n}},\nu_{\rm obs})italic_X ( over^ start_ARG bold_italic_n end_ARG , italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT ). The map of diffuse emission in a given channel —or group of channels— can be binned into a pixelized map and transformed to spherical harmonic space to obtain the spherical harmonic coefficients Xm(νobs)subscript𝑋𝑚subscript𝜈obsX_{\ell m}(\nu_{\rm obs})italic_X start_POSTSUBSCRIPT roman_ℓ italic_m end_POSTSUBSCRIPT ( italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT ). In the case of discrete sources such as galaxies, the pixelization step can be bypassed and replaced with a direct spherical harmonic transform in order to avoid issues such as aliasing [89]. In this radial-angular formalism, the connection to the CMB lensing nulling/delensing literature is readily apparent (see e.g., Refs. [62, 63, 64, 65, 66, 90, 68]). Hereinafter, we consider each νobssubscript𝜈obs\nu_{\rm obs}italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT separately and therefore drop the explicit notation.

The cleaning operation then proceeds by replacing equation (19) with its harmonic-space equivalent

δI^m=δImδg,mint.𝛿subscript^𝐼𝑚𝛿subscript𝐼𝑚subscriptsuperscriptsubscript𝛿g𝑚int\delta\hat{I}_{\ell m}=\delta I_{\ell m}-\mathcal{F}_{\ell}\delta_{{\rm g},\,% \ell m}^{\rm int}\,.italic_δ over^ start_ARG italic_I end_ARG start_POSTSUBSCRIPT roman_ℓ italic_m end_POSTSUBSCRIPT = italic_δ italic_I start_POSTSUBSCRIPT roman_ℓ italic_m end_POSTSUBSCRIPT - caligraphic_F start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT italic_δ start_POSTSUBSCRIPT roman_g , roman_ℓ italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT . (32)

In this basis, the filter takes the form

=CIgintCgint=CIintgintCgint,subscriptsuperscriptsubscript𝐶𝐼superscriptgintsuperscriptsubscript𝐶superscriptgintsuperscriptsubscript𝐶subscript𝐼intsuperscriptgintsuperscriptsubscript𝐶superscriptgint\mathcal{F}_{\ell}=\frac{C_{\ell}^{I{\rm g^{int}}}}{C_{\ell}^{\rm g^{int}}}=% \frac{C_{\ell}^{I_{\rm int}{\rm g^{int}}}}{C_{\ell}^{\rm g^{int}}}\,,caligraphic_F start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT = divide start_ARG italic_C start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_I roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_C start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT end_ARG = divide start_ARG italic_C start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_C start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_g start_POSTSUPERSCRIPT roman_int end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT end_ARG , (33)

which now involves angular power spectra Csubscript𝐶C_{\ell}italic_C start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT of the fluctuations in the various different channels including all sources of noise. Once this cleaning has been performed, the inverse harmonic transform can be applied to δI^m(νobs)𝛿subscript^𝐼𝑚subscript𝜈obs\delta\hat{I}_{\ell m}(\nu_{\rm obs})italic_δ over^ start_ARG italic_I end_ARG start_POSTSUBSCRIPT roman_ℓ italic_m end_POSTSUBSCRIPT ( italic_ν start_POSTSUBSCRIPT roman_obs end_POSTSUBSCRIPT ) to build the three-dimensional map.

While applying interloper removal in harmonic space may be more convenient to avoid artifacts due to the projection onto a Cartesian grid [69] and reduce the mode coupling introduced by the strong projection effects of the interlopers, the larger shot noise incurred when using narrow frequency channels may limit the performance of the cleaning. In contrast, using wider frequency bands to reduce the shot noise may hinder the cleaning due to the loss of radial modes. The next appendix will use the harmonic-space formalism to adapt more easily the results in the CMB lensing context, but they can be straightforwardly adapted to three-dimensional Fourier space.

Appendix B Interloper removal with multiple tracers

The combination of multiple tracers of the large-scale structure to improve CMB lensing nulling or delensing has been already proposed. In particular, Ref. [63] includes a description of the filters that maximize the cross-correlation coefficient between the co-added tracer and the CMB lensing potential, showing that these are also the ones that minimize the variance of CMB B𝐵Bitalic_B-mode polarization after delensing. Refs. [76, 77] then showed that the procedure is relatively robust against inaccuracies in the determination of the filters.

The optimal cleaned map coadding information from several tracers takes the form

δI^m=δImpc(p)Bm(p),𝛿subscript^𝐼𝑚𝛿subscript𝐼𝑚subscript𝑝subscriptsuperscript𝑐𝑝subscriptsuperscript𝐵𝑝𝑚\delta\hat{I}_{\ell m}=\delta I_{\ell m}-\sum_{p}c^{(p)}_{\ell}B^{(p)}_{\ell m% }\,,italic_δ over^ start_ARG italic_I end_ARG start_POSTSUBSCRIPT roman_ℓ italic_m end_POSTSUBSCRIPT = italic_δ italic_I start_POSTSUBSCRIPT roman_ℓ italic_m end_POSTSUBSCRIPT - ∑ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_c start_POSTSUPERSCRIPT ( italic_p ) end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT italic_B start_POSTSUPERSCRIPT ( italic_p ) end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_ℓ italic_m end_POSTSUBSCRIPT , (34)

where p𝑝pitalic_p indexes the various tracers B𝐵Bitalic_B. The weights that maximize the cross-correlation coefficient between the cleaning tracer and the interloper line emission —hence minimizing the contribution of interlopers in the cleaned map, see Eq. (23)— are [63]

c(p)=jρjIintρpjCIintCB(p)+𝒩B(p).subscriptsuperscript𝑐𝑝subscript𝑗subscriptsuperscript𝜌𝑗subscript𝐼intsubscriptsuperscript𝜌𝑝𝑗superscriptsubscript𝐶subscript𝐼intsuperscriptsubscript𝐶superscript𝐵𝑝superscriptsubscript𝒩superscript𝐵𝑝c^{(p)}_{\ell}=\sum_{j}\frac{\rho^{jI_{\rm int}}_{\ell}}{\rho^{pj}_{\ell}}% \sqrt{\frac{C_{\ell}^{I_{\rm{int}}}}{C_{\ell}^{B^{(p)}}+\mathcal{N}_{\ell}^{B^% {(p)}}}}\,.italic_c start_POSTSUPERSCRIPT ( italic_p ) end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT divide start_ARG italic_ρ start_POSTSUPERSCRIPT italic_j italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_ARG start_ARG italic_ρ start_POSTSUPERSCRIPT italic_p italic_j end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_ARG square-root start_ARG divide start_ARG italic_C start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_C start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_B start_POSTSUPERSCRIPT ( italic_p ) end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT + caligraphic_N start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_B start_POSTSUPERSCRIPT ( italic_p ) end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT end_ARG end_ARG . (35)

Qualitatively, on a given angular scale and frequency channel, this scheme weights more heavily the tracers that best correlate with the interloper line. A similar expression can be derived for three-dimensional statistics in Fourier space, so that they can be applied to improve the formalism described in the main text, if more than one tracer is available.

It is worth noting that the dependence on CIintsuperscriptsubscript𝐶subscript𝐼intC_{\ell}^{I_{\rm{int}}}italic_C start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_I start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT end_POSTSUPERSCRIPT in Eq. (35) is rather benign. Though not measurable directly, this component can typically be modeled. Moreover, the performance is moderately robust againts model misspecification [63, 76, 77], while most likely not introducing any bias in the cleaned map. This is because the filters that were applied to the data are known exactly, however suboptimal they may be.

References