11email: jyao@sissa.it 22institutetext: INFN, Via Valerio 2, 34127 Trieste, Italy 33institutetext: IFPU, Via Beirut 2, 34014 Trieste, Italy 44institutetext: Instituto de Astrofísica de Andalucía, Glorieta de la Astronomía s/n, 18008 Granada, Spain 55institutetext: Dipartimento di Fisica e Astronomia, Università degli Studi di Catania, via S. Sofia, 64, 95123, Catania, Italy 66institutetext: INFN - Sezione di Catania, Via S. Sofia 64, 95123 Catania, Italy 77institutetext: INAF - Osservatorio Astrofisico di Catania, via S. Sofia 78, 95123 Catania, Italy
ForSE+: Simulating non-Gaussian CMB foregrounds at 3 arcminutes in a stochastic way based on a generative adversarial network
We present ForSE+, a Python package that produces non-Gaussian diffuse Galactic thermal dust emission maps at arcminute angular scales and that has the capacity to generate random realizations of small scales. This represents an extension of the ForSE (Foreground Scale Extender) package, which was recently proposed to simulate non-Gaussian small scales of thermal dust emission using generative adversarial networks (GANs). With the input of the large-scale polarization maps from observations, ForSE+ has been trained to produce realistic polarized small scales at following the statistical properties, mainly the non-Gaussianity, of observed intensity small scales, which are evaluated through Minkowski functionals. Furthermore, by adding different realizations of random components to the large-scale foregrounds, we show that ForSE+ is able to generate small scales in a stochastic way. In both cases, the output small scales have a similar level of non-Gaussianity compared with real observations and correct amplitude scaling as a power law. These realistic new maps will be useful, in the future, to understand the impact of non-Gaussian foregrounds on the measurements of the cosmic microwave background (CMB) signal, particularly on the lensing reconstruction, de-lensing, and the detection of cosmological gravitational waves in CMB polarization B-modes.
Key Words.:
Cosmology: Cosmic Microwave Background, Cosmology: diffuse radiation, methods: data analysis1 Introduction
The cosmic microwave background (CMB) contains valuable information about the initial conditions in the early Universe and physical processes that happened during its propagation from the last scattering surface to our observation (Hu & Dodelson, 2002). In addition to the great success of observations and interpretation of the CMB total intensity signal, mainly sourced by scalar perturbations (Hinshaw et al., 2013; Planck Collaboration VI, 2020), the focus of experimental efforts has now been transferred to the observations of the CMB polarization signal. In particular, the quest for the curl or B-mode component in the CMB polarized signal has drawn the most attention nowadays, as it can be a unique probe of primordial gravitational waves (PGWs, i.e., tensor-mode cosmological perturbations) (Zaldarriaga & Seljak, 1997; Kamionkowski et al., 1997; Kamionkowski & Kovetz, 2016). These tensor perturbations are predicted by the families of inflationary theories, which describe a history of accelerated expansion in the very early Universe (Guth, 1981). Great efforts are being made with many ongoing and future telescopes, including ground-based ones, such as PolarBear111Polarization of the Background Radiation experiment (POLARBEAR Collaboration, 2022), SPT222South Pole Telescope (SPT Collaboration, 2023), ACT333Atacama Cosmology Telescope (Madhavacheril et al., 2024), SO444Simons Observatory(Ade et al., 2019), CMB-S4555CMB-Stage-IV (Abazajian et al., 2016), BICEP666Background Imaging of Cosmic Extragalactic Polarization (The BICEP/Keck Collaboration, 2022), and AliCPT777Ali CMB Polarization Telescope (Li et al., 2018), balloon-borne ones such as SPIDER888Suborbital Polarimeter for Inflation Dust and the Epoch of Reionization (Spider Collaboration, 2022), and telescopes on satellites, like LiteBIRD999Lite (Light) satellite for the studies of B-mode polarization and Inflation from the cosmic background Radiation Detection (LiteBIRD Collaboration et al., 2023). The tightest constraint on the amplitude of the PGW, in terms of the tensor-to-scalar ratio, , is (Tristram et al., 2022), at a confidence level.
While the scalar perturbations do not contribute to the CMB B-modes and only source the parity-even E-modes, the gravitational lensing effect from the forming cosmological structures along the propagation of CMB photons, with a typical deflection angle of a few arcmins, converts a small portion of E-modes into B-ones. These lensing-induced B-modes dominate over the PGWs at arcminute angular scales (Hu, 2000; Planck Collaboration VIII, 2020). Some inflationary models also predict nearly Gaussian statistics of the CMB anisotropies with small deviations known as primordial non-Gaussianity (PNG), which can be used to further constrain the vast space of inflationary theories that go beyond a simple single field slow roll scenario (Bartolo et al., 2004; Planck Collaboration IX, 2020). However, the lensing effect will also induce non-Gaussianity into the distribution of CMB photons by distorting the paths and coupling different multipole bins of power spectra.
Despite being a contaminant to PGWs and PNG, CMB lensing carries cosmological information on processes occurring during structure formation, on neutrino masses (Allison et al., 2015), and Dark Energy (Acquaviva & Baccigalupi, 2006). Effective techniques capable of reconstructing the lensing field have been developed, mainly relying on detecting the non-Gaussianity presented in the observed CMB maps (Hu & Okamoto, 2002; Maniyar et al., 2021). The latest detection of lensing potential comes from the ACT experiment, which achieves a detection of the CMB lensing signal (Qu et al., 2024).
The primary source of contamination to CMB B-modes arises from astrophysical processes causing diffuse emissions within our own Galaxy. The two main polarized components are represented by the thermal dust and synchrotron emission, dominating at high ( GHz) and low ( GHz) frequencies, respectively, exceeding the CMB B-mode signal in any frequency and any position on the sky (Krachmalnicoff et al., 2016, 2018). They are caused by dust grains heated by starlight, exhibiting a quasi-thermal emission at about 18 K, and cosmic ray electrons spiraling around the lines of the Galactic magnetic field, respectively. The class of data analysis algorithms capable of extracting the cosmological B-mode anisotropies out of a multi-frequency dataset is known as component separation. Several implementations exist, exploiting the different properties that CMB and foregrounds have, such as frequency dependency and spatial distribution. Depending on the assumptions made about the foregrounds, these methods can generally be classified into blind (Yao et al., 2018; Santos et al., 2021; Zhang et al., 2019; Delabrouille et al., 2009) and parametric ones (Stompor et al., 2009; Planck Collaboration IV, 2020; BeyondPlanck Collaboration I, 2023).
Unlike the CMB, diffuse foregrounds generically exhibit a large degree of non-Gaussianity. This comes primarily from large-scale observations (Coulton & Spergel, 2019; Allys et al., 2019) and is expected to also be true at smaller scales, since dust grains are highly locally distributed and the magnetic field in the diffuse interstellar medium is highly turbulent. The foreground non-Gaussianity introduces coupling between different modes in the angular power spectra, which induces non-diagonal terms in the covariance matrices of the observed signal. On the other hand, covariance matrices are usually treated as diagonal under the assumption that all the sky components behave like Gaussian random fields. Additionally, approaches relying on power spectra will inherently ignore the non-Gaussianity, as the two-point functions are insufficient in capturing its presence. Therefore, the existence of non-Gaussianity in the Galactic foregrounds will potentially induce systematic errors in primordial B-mode detection.
Abril-Cabezas et al. (2024) claim that the constraints on are robust against the non-Gaussianity in the polarized dust at the power spectra level, but they only focus on , the angular scales at which the PGWs cause a maximum B-mode contribution known as recombination bump. At smaller scales where lensing dominates, non-Gaussian foreground will possibly cause bias to lensing reconstruction (Beck et al., 2020) and de-lensing. Moreover, non-Gaussianity also affects PNG measurements (Cabella et al., 2010).
The main reason for neglecting non-Gaussianity is that observations of foregrounds at arcminute scales are lacking, and so is our capability of generating realizations of simulations of their patterns. Current observations of polarized foregrounds are limited to degree scales over a large sky fraction and can only reach a higher resolution at sub-degree or arcminute scales for portions of the sky (Planck Collaboration XI, 2020; Bernardi et al., 2004; Remazeilles et al., 2015).
The way to simulate foregrounds relies on data and phenomenology. Foreground observations at a given frequency are used as templates and extrapolated to other frequency bands by making assumptions about their spectral energy distributions (SEDs). These templates are mostly based on the observations by the Planck (Planck Collaboration IV, 2020) and Wilkinson Microwave Anisotropy Probe (WMAP) satellite observations (Remazeilles et al., 2015; Bennett et al., 2013), which characterize the signal down to a limited angular resolution (around ). There exist several widely used packages that simulate Galactic foreground maps, such as the Python Sky Model (PySM, Thorne et al. (2017)) and Planck Sky Model (Delabrouille et al., 2013). The latest version of PySM3 package (Zonca et al., 2021; PanEx Collaboration, 2024) uses foreground templates from the latest available observations and extrapolates them to arcminute angular scales. These packages usually fit a power law to the observed large-scale power spectra of foregrounds and extrapolate to small scales with Gaussian realizations of them. Although commonly used, these packages are not able to produce foreground small scales with significant non-Gaussianity comparable to the observed large scales.
Clark & Hensley (2019) provides a data-driven framework to construct three-dimensional Stokes parameter maps of thermal dust emission in position–position–velocity space using only neutral hydrogen data on the basis that HI is strongly correlated with the dust in the diffuse interstellar medium (Lenz et al., 2017). By integrating over the velocity space, they obtained the polarized dust emission maps over the full sky at a resolution of , corresponding to that of HI data, which are in good agreement with the Planck observed 353 GHz dust maps in terms of several physical properties such as the polarization fraction and power spectra, although they do not have a thorough discussion about the non-Gaussianity contained in their maps.
Hervías-Caimapo & Huffenberger (2022) uses a large number of filaments in the distribution of the thermal dust grains together with the large-scale template from data to reproduce the main features of the Planck 353 GHz map. These include the power spectrum slopes of intensity and polarization maps, the ratios between EE, BB, and TE power spectra, and the level of non-Gaussianity in the total intensity map, which can be controlled by the density of filaments. When focusing on scales of arcminutes or tens of arcminutes, , however, the Planck 353 GHz total intensity map has more non-Gaussianity with respect to the generated small scales from the model based on filaments.
Diffuse foreground models are also generated by exploiting magnetohydrodynamic (MHD) simulations. They model the physical processes of the interstellar medium, such as heating and cooling of the gas and the interaction between dust grains and the turbulent magnetic field. The thermal dust emission can be obtained by integrating the emission of dust grains along the line of sight (Padoan et al., 2001; Planck Collaboration Int. XX, 2015; Kim et al., 2019). The simulated maps are thus non-Gaussian because of the MHD processes and can reach small scales according to the resolution of the MHD simulation itself. However, the MHD simulations can only reproduce the statistical properties of the Galaxy, and thus fail to generate the specific morphology of Galactic foreground emission. Another important limiting factor of MHD simulations is that they are computationally expensive, especially in achieving a high resolution.
An additional technique consists of simulating the foreground by exploiting innovative algorithms capable of modeling the emission pattern at a high resolution on the basis of the observed one at moderated and low angular scales. The ForSE model introduced by Krachmalnicoff & Puglisi (2021) (hereafter KP2021) is able to generate small scales, up to , starting from the low-resolution polarized dust observations. It utilizes a generative adversarial network (GAN), which is trained to inject small-scale features with statistical properties like the ones observed at a high resolution in the intensity maps.
In this study, we further extend the ForSE algorithm to: i) produce dust polarization maps at resolution, which is vital for checking the stability of lensing reconstruction and de-lensing algorithms in CMB observations, ii) stochastically generate multiple realizations of small-scale features, which are most important in estimating the uncertainty associated with foreground variations.
Additional uses of deep generative models for dust simulations exist in the literature. For example, Aylor et al. (2021) use GANs to generate simulated total intensity maps from the observed Planck generalized needlet internal linear combination (GNILC) map at 353GHz. Other common generative models, such as variational autoencoders (Thorne et al., 2021) or diffusion models (Heurtel-Depeiges et al., 2023), are also used.
The outline of the paper is as follows. In Section 2, we summarize the first version of the ForSE algorithm and present the extension and methodology used to achieve the new version operating at arcminute scale. In Section 3, we describe the pre-processing, training, and post-processing of the new model and also present our validation procedure. In Section 4, full-sky maps are presented and compared with maps from the latest version of PySM3 package. Conclusions are given in Section 5.
2 The ForSE+ model
In this section, we briefly review the basic structure and assumptions of the ForSE algorithm presented in KP2021. We then introduce our new version of the code, ForSE+, which allows one to generate maps of the thermal dust emission with non-Gaussian structures at angular resolution, in both a deterministic and a stochastic way.
2.1 Review of the ForSE model: From to
As has already been mentioned, the ForSE model, introduced and validated in KP2021, is based on GANs and allows one to produce non-Gaussian full-sky maps of polarized thermal dust emission at an angular resolution of from low-resolution Planck observations at .
Generative adversarial networks are a particular family of networks (Goodfellow et al., 2014) whose characteristic feature is to be composed of two sub-networks called Generator () and Discriminator (), which are trained to compete against each other. In practice, the goal of is to produce new images that are compared by with a set of real images that are the training set. Once the training of the two sub-networks is done in an adversarial way, is able to produce images that have the same statistical properties as those belonging to the training set, in such a way that mock and real images are no longer distinguishable by .
In the ForSE implementation, the input to are images at a low resolution () of the thermal dust emission observed by the Planck satellite and processed through the GNILC method101010http://pla.esac.esa.int/pla/aio/product-action?MAP.MAP_ID=COM_CompMap_IQU-thermaldust-gnilc-unires_2048_R3.00.fits (Planck Collaboration IV, 2020). The output of are small-scale features at , which are then compared by with real observations in total intensity at the same angular resolution. We note that different training is performed for total intensity and Stokes and maps, but always having as the training set images of small scales in total intensity. KP2021 therefore assumed that thermal dust statistical properties of small scales in polarization are the same as for Stokes maps. We also rely on that assumption in this work.
In KP2021 and in this work, the following definition of “small scales” is used. Let be a foreground map at a given angular resolution, which can be seen as the sum of two maps containing, respectively, only large- and small-scale structures:
(1) |
Assuming that the map encoding small-scale features, , is modulated by the large scales, – that is, – we have
(2) |
where represents the small-scale map generated by network , having as an input .
Although there exist neural networks designed to work on the sphere (Krachmalnicoff & Tomasi, 2019), our GAN deals with flat two-dimensional images. For this reason, we had to project the input maps onto flat patches, and project output patches back onto the sphere, after the application of the trained GAN model. We used the same projection strategy as that described in the appendix of KP2021, which divides the GNILC thermal dust Stokes and maps at 80 with 111111Presented in the Hierarchical Equal Area Latitude Pixels (HEALPix) (Górski et al., 2005) format. into 174 square patches that have pixels and a physical side length of . We note that this projection on flat patches and reprojection on the sphere can introduce distortions in the final full-sky map. In order to mitigate this effect, the flat patches overlap each other. We estimate that the final level of distortion induced by our procedure is always less than 7% of the signal (around 2% on average).
From now on we will use and to refer to the small-scale (output of the network) and large-scale (input of the network) patches, respectively, where defines the Stokes parameters, the physical dimension of the patch in degrees, and its angular resolution in arcminutes.
The GAN architecture, training procedure, input maps, and output validation of the ForSE model are fully described in KP2021.
2.2 ForSE+: Producing non-Gaussian dust maps at
In this work, we implement ForSE+, an updated version of the ForSE code and its training procedure, with two objectives:
-
1.
To allow the generation of full-sky polarization maps with non-Gaussian features at an enhanced resolution of .
-
2.
Starting from the same low-resolution maps, to generate multiple realizations of stochastic small scales that still have the correct non-Gaussian statistical properties.
In the sections below, we explain the assumptions and methodology used to achieve these two goals. Figure 1 sketches the input and output of the new ForSE+ models and in Table 1 we summarize the new models trained in this paper.
Model | Description (Output) | Input |
---|---|---|
ForSE | KP2021 version to simulate deterministic thermal dust emission maps at | Planck GNILC maps at |
ForSE+S12 | new model to simulate stochastic thermal dust emission at | Planck GNILC maps at plus random component |
ForSE+D | new model to simulate deterministic thermal dust emission maps at | Maps at , smoothed from ForSE maps |
ForSE+S3 | new model to simulate stochastic thermal dust emission maps at | Maps at smoothed from ForSE+S12 plus random component |
In order to incorporate these extensions, we used the same GAN architecture as for the ForSE model, re-implemented using the new Tensorflow121212https://www.tensorflow.org framework (version 2.6.0), and we performed additional fine-tuning steps of some hyper-parameters of the networks to improve our results.
2.2.1 Scale invariance assumption
The first goal of our implementation of ForSE+ is to generate a map of polarized dust emission with non-Gaussian features at arcminute angular scales. As we anticipated, this is crucial to understand the possible impact of non-Gaussianity in the extraction of the CMB -mode lensing signals, which peak at these scales. In order to achieve this, we need to find a way to overcome the current limitation in the observational data at our disposal. As a matter of fact, in order to train our GANs we can only rely on a training set composed of 350 patches, with dimensions of and pixels, at an angular resolution of , taken from the total intensity GNILC Planck map at 353 GHz, described in KP2021. No other observations of thermal dust emission at a higher angular resolution in a portion of the sky large enough to be used as training set are available.
The idea to circumvent this restriction and still be able to reach a resolution higher than , even in the absence of a proper training set, is to make a scale-invariance assumption, applying our GAN model in an iterative way. In practice, our set of training squared patches with dimensions of and a resolution of can equally be treated as having dimensions of and resolution of , since the network does not have a sense of physical units and is only sensitive to the ratio between the dimension of the patch and its angular resolution (i.e., ). In this way, the same dataset used to train the first version of ForSE in KP2021 can be used to train a new GAN model. This model takes the output of the first iteration of the code as input and generates non-Gaussian scales at . As has been mentioned, this implies that we are assuming scale invariance for the statistical properties of dust emission; that is, scales at have the same properties as those at . This assumption is justified by the fact that current observations of the dust polarization power spectra shows that they can, at the first order, be approximated as a power law as a function of the angular scales (Planck Collaboration XI, 2020).
Additional pre- and post-processing of the input patches (including upsampling, smoothing, and the sub-division of patches) is needed to train the GAN in the correct way and to be able to restore full-sky maps, as is described in Section 3.2.1.
2.2.2 Stochasticity
Our second goal is to produce different realizations of the non-Gaussian small-scale structures. This is important to estimate the variance of the signal we are simulating as well as the correlation among different angular scales. The way we achieved this was by simply adding a random component to the large-scale maps that are the input of our GANs. We then trained new models on these signal + random component maps, always using the 350 patches as the training set.
The random component that we considered was simply generated as a random realization from a uniform distribution in the range . Since our input maps were always re-scaled to have pixel values ranging in the same interval to be compatible with the input of our neural networks, we had a signal-to-noise ratio ( of . We refer to this as “stochastic training;” in other words, a random component was added to the input signal, ForSE+S, as opposed to the deterministic case (ForSE+D), in which we did not add any random component to the input maps.
3 Pre-processing, training, and post-processing of ForSE+
In this section, we describe the training details of ForSE+S and ForSE+D, including the pre-processing and post-processing steps, and present the results on flat-squared patches, including maps and power spectra, before reprojecting them to full-sky maps.
3.1 ForSE+S to
We first describe how we generate stochastic maps with non-Gaussian features at a resolution of . We also applied a similar procedure to construct maps at , as is described in Section 3.2. For clarity, we will call ForSE+S12 the model that generates stochastic maps at and ForSE+S3 the one that goes up to .
As was mentioned above, we injected stochasticity into our generative process by simply adding a random component to the low-resolution maps that are the input of our GAN. By doing so, ForSE+S is able to generate non-Gaussian small-scale features that still partially depend on the real observed large-scale structures but will have a different morphology as we change the realization of the random component in the input dataset.
3.1.1 Training and post-processing
As in KP2021, we trained two models for and maps separately. The inputs to the generator, , were the 174 patches that together cover the full sky, with dust signal plus an additional random component. The training set, which encodes the target statistical distribution of small scales, was the 350 maps. The weights of the neural works were updated for epochs and saved every 500 steps. Since we do not want to generalize the usage of the trained model – the trained model only needs to predict the output from the training data – it is not a problem if there is an over-fitting during training. Therefore, there was no consideration of a separate validation set during the training process.
Following KP2021, we used the Minkowski functionals (MFs), as implemented in (Mantz et al., 2008), as the criteria with which to choose the best epoch. For two-dimensional fields, three kinds of MFs can be built: , , and . They fully describe the statistical properties of the field and represent the covered area (), the boundary length (), and the number difference of connected domains and holes of the image’s feature (), as a function of the threshold, (Hadwiger, 1957).
During the GAN training process, the goal is to produce small-scale feature maps with statistical properties as close as possible to the ones of the training set. We quantified the level of agreement by calculating the overlapping fractions between the distributions of MFs of the maps in the training set () and those of the generated ones (). The distributions of MFs are indicated by the variation among the training set or generated maps and the overlapping fractions were computed as the ratio between the intersection area and the total area spanned by the two distributions. In practice, we calculated the MFs overlapping for each saved epoch of and selected as the best model the one with the highest score. In doing so, we computed the MFs for the output maps, , by applying to maps with the realization of a random component different from the one used for training.
The best models are obtained after 5500 epochs and 6000 epochs for and , respectively. Their MFs are shown in Figure 2 and compared with the ones from the training set. The overlap among the distributions is at a level of , comparable with the one obtained in Krachmalnicoff & Puglisi (2023) (that includes corrections to KP2021). In comparison, Gaussian small scales have MFs with obviously different shapes from these two sets of maps, as is illustrated in Fig.7 of KP2021.
We also computed the overlapping fraction of the MFs by considering 100 different realizations of the small-scale maps (obtained by changing the random component in the input). The mean (standard variation) values of the overlapping fraction for , , and are 59.1%(1.5%), 62.9%(0.6%), and 55.8%(0.7%), respectively, for Stokes maps, and 58.8%(2%), 56.3% (1.3%), and 45.5%(1.1%), respectively, for maps. These numbers show the robustness of ForSE+S in generating stochastic small scales with non-Gaussian high-order statistics.
Since in the training procedure both the input maps and the training sets are normalized in the range of [-1,1], the output maps also have pixel values in this range, and therefore need to be normalized to restore physical units. We achieved this by following the procedure of KP2021, hence ensuring that, for each patch and for both and , the amplitude of the power spectrum of the produced small scales matches the extrapolation at higher multipoles of the power spectrum of the low-resolution input maps at .
3.1.2 Results of ForSE+S at
Figure 3 shows patches with two different realizations of small scales at from ForSE+S12, after the normalization mentioned above, in the second and third columns, compared with the deterministic results from ForSE in the first column. Both the differences at small scales and consistency at large scales between the outputs from ForSE and ForSE+S12 show the capability of the trained model to produce small-scale features in the map space.
To further validate these maps, we calculated the second-order statistics – the power spectrum – using the Namaster package (Alonso et al., 2019).131313https://namaster.readthedocs.io/en/latest/sample_flat.html In Figure 4, we show the EE and BB power spectra from the squared patches of low-resolution maps, the output from ForSE, and 100 realizations of ForSE+S12 in black, green, and gray, respectively. The mean values of power spectra from 100 stochastic realizations are also shown in red. The output maps from ForSE+S12 are consistent with the deterministic output maps in terms of the power spectra, as was expected.
3.2 ForSE+D and ForSE+S3 to
We describe now the procedures that we followed to generate maps at the resolution of , by using our GAN model iteratively both in the case of ForSE+D and for ForSE+S3. The whole procedures, including several pre- and post-processing steps, are sketched in Figure 5 and described in the following (see also Foschi (2021)).
3.2.1 Pre-processing for the training of ForSE+D
As was mentioned above, we reached the resolution of by assuming scale invariance in the statistics of the thermal dust emission. We exploited the output of the first GAN model as the input to a second generative step by using the same training set and considering it to be composed of 350 patches with dimensions of at a resolution of , as was explained in Section 2.2.1. Therefore, since the first iteration of ForSE allows one to go from maps with dimensions of and a resolution of to maps at a resolution of , the second iteration can produce maps at a resolution of . In order to preserve the proportions among all the relevant quantities (i.e., patch dimension, resolution of input, and resolution of output), the input patches for the second iteration should have dimensions of and a resolution of . We can obtain those patches by smoothing and subdividing the output of ForSE.
In practice, we pre-processed each of the 174 maps with pixels obtained from the first iteration in the following way:
-
1.
We upsampled each map with pixels to by repeating each pixel four times.
-
2.
We smoothed these maps with a Gaussian kernel in order to reach a resolution of .
-
3.
We subdivided each of these large squared patches into patches, each one containing pixels. We divided each large patch into 49 small ones, with a large overlap among them, made of 160 pixels on each side. This overlap is needed in order to avoid border effects when the composition and reprojection on the sphere is performed.
-
4.
At the end of this procedure, we have a set of patches for and , which is the total amount of patches covering the full sky and represents the that will be used as the input to the second GAN iteration.
We applied the same pre-processing to the output of ForSE and ForSE+S12 in order to produce maps at in both the deterministic case and the stochastic one. In the stochastic case, we added an additional random component, as is described in Section 2.2.2.
3.2.2 Training and post-processing of ForSE+D
Once the pre-processing steps described above are performed, the obtained 8526 patches can be used to train a new GAN model. However, the time cost is basically linear with the number of patches. We note that the 8526 patches have repeating pixels among them (see the steps above to get 8526 patches); thus, they are actually not independent. In order to save computational time, we fed as input to the generator, , only a subset of 696 patches, randomly selected from the total 8526 patches. Once the network was trained, we applied it to the remaining patches.
In Figure 6 we show the MFs of the generated small scales at . The overlap with the target distribution is at a level of 70-80% in the deterministic case. In the stochastic case (not shown), on the other hand, the overlap ranges between 60% and 70%, showing therefore that we are also able to generate non-Gaussian small-scale features at this higher resolution.
The output of the GAN models are patches, , containing the small-scale features that, as in the first iteration, have pixel values in the range of . We therefore normalized them in physical units before multiplying them with the large-scale maps at the resolution of , obtaining 8526 maps.
We recall that each of these patches comes from a subdivision of the larger maps with physical dimensions of (as is described in Section 3.2.1) into 49 sub-patches with a large overlap among each other. Therefore, before reprojecting them on the sphere to obtain the full-sky maps, we needed to recombine them. In order to avoid border effects, we did this by using apodization window functions, as is shown in Figure 7: each sub-patch was weighted with the corresponding window function, then they were summed together to form the 174 maps that could then be reprojected on the sphere.
3.2.3 Results of ForSE+D and ForSE+S3 at
In Figure 8, we show the comparisons of a selected patch at , from ForSE and maps at from ForSE+D, from left to right. It is clear that ForSE+D can inject small-scale structures, following the modulation of the large-scale emission.
The power spectra, computed on the same patch, are shown in Figure 9. As can be seen, the amplitude at low matches the one from the low-resolution GNILC maps, and the generated small scales have power at higher multipoles, with no breaks in the power spectrum that follows a power law, as was expected.
ForSE+S3 was utilized to generate stochastic small scales at , as is mentioned in 2.2.2. Two realizations of small scales for the map in the range of [-1, 1] with a side length of and centered on (i.e., at the position of the dashed red box in the upper right plot of Figure 8) are shown in the upper panel of Figure 10, and in the lower panel we show the normalized maps combined with the large scales. The differences between the two realizations should be noted and both of them trace the large-scale features of the input maps. We also show the power spectra of 100 realizations in Figure 9, where we also plot the mean, validating the results of stochasticity on the power spectrum level.
4 Validations of full-sky maps from ForSE+
In this section, we show the maps and power spectra after reprojecting the flat patches back to the sphere, following the algorithm in the appendix of KP2021. Before showing the results, we introduce a further step to adjust the E-to-B ratio in the simulated maps to match the observations at large scales.
4.1 Fine-tuning of the E-to-B ratio to match observations
We recall that, due to the limitation of observational data in polarization, we used the statistical properties of the total intensity small scales to be the ground truth of both and during the training process. This implies that the output high-resolution map will have the same power for and modes. However, high-frequency observations of the Planck satellite have shown the existence of an asymmetry between the measured power of thermal dust emission in the and modes, with over the multipole range of (Planck Collaboration XI, 2020). Therefore, in order to match the real observations, we applied a fine-tuning to the full-sky maps, obtained after all the steps mentioned above.
We first transformed maps into the harmonic space to obtain the and coefficients. Then we applied a factor of to the to decrease its power, since and have the same variance as was just mentioned. Finally, we transformed the tuned back to the map space to get maps that match the observed ratio of 0.5 at the power spectra level. We note that the large scales in the output maps of ForSE/ForSE+ remain to be the observed ones, so the tuning was applied only to the small scales; that is, belonging to 141414Corresponding to 80: = and the transition between large and small scales was smoothed with a sigmoid function to avoid discontinuities in the final power spectra.
The E-to-B ratios of the power spectra calculated for different fractions of sky are shown in Figure 11. The red lines show the ratios after the tuning steps, which were indeed corrected to for the injected small scales.
4.2 Full-sky maps
We are now ready to present the full-sky maps and specifically, we show the ForSE+D maps at , at Nside = 4096, in Figure 12, compared with the input GNILC ones at . The difference between the two highlights the presence of small-scale features in the ForSE+D maps. When zooming into a patch that consists of two patches we also find no clear edge effects. These results show that border effects from reprojection are negligible.
In Figure 13, the power spectra of the maps on the sphere at different resolutions are shown for different sky fractions. As was expected, our maps at can further extrapolate the power up to , which corresponds to the drop scale of . We also note the absence of discontinuities at the transition from large scales to small scales (), implying that the small scales are produced with a similar pattern as the one at large scales, attributed to the normalization step to rescale the generated small scales in the range of [-1, 1] to the correct amplitude, as was mentioned in Section 3.2.2. In order to make a comparison with the latest d9 dust model from PySM3151515https://github.com/galsci/pysm (PanEx Collaboration, 2024), we also show the power spectra of d9 maps at 353GHz in blue. Although the latter was produced with entirely different methods, the power spectra of ForSE+D maps at are impressively close to those of d9 maps, even for different sky fractions.
We also calculated the power spectra of 100 realizations of full-sky maps at , generated with ForSE+S3, with a sky mask, up to and with a bin width of . The covariance matrix from these 100 realizations is shown in Figure 14. The correlation among multipoles at small scales is further evidence that the small scales at were synthesized in a non-Gaussian way. In fact, if the small-scale features were produced with Gaussian properties, we would not observe any non-diagonal correlation, which is verified from our experiment to calculate the covariance matrix of a purely Gaussian field. We devote the next section to further deepening the non-Gaussian properties of our maps.
4.3 Non-Gaussianity measured on the sphere
In previous sections, we considered the MFs in order to characterize non-Gaussianity for flat patches. Here, we expand the analysis to the full-sky maps, by exploiting the algorithms described in Grewal et al. (2022), in order to calculate MFs for spherical maps in the HEALPix format. We focus on the small scales injected by ForSE and ForSE+D and here we used the multipole range [200, 2048] to band-pass filter the raw maps and applied a sky mask to ignore the Galactic plane.
In Figure 15 we show the results of the MFs statistics applied to the sphere, confirming our previous claims. Differences in the ForSE+D map are visible with respect to the case of Gaussian maps in dashed gray for all three kinds of MFs. The Gaussian maps were generated from a random realization from the power spectra of the ForSE+D map. We also show the results of the latest PySM3 d9 dust maps in blue, which exhibits a deviation from Gaussianity, although being similar to the modulated Gaussian maps in orange, whose non-Gaussianity is supposed to derive only from the modulation of large scales. We further checked that the results are robust for the sky mask and for = 500 and 1000.
5 Discussion
In this paper, we extend the ability of the ForSE model proposed in KP2021 based on the GAN technique to simulate non-Gaussian stochastic small scales of polarized thermal dust emission up to . We have trained three new models, which are summarized in Table 1, together with the one trained in KP2021.
Based on the results obtained in KP2021, our first test was to bring stochasticity into our models by adding a random component into the input and to train a new network called ForSE+S12 so that it can generate different maps with different seeds. Minkowski functionals were used to quantify the level of non-Gaussianity in the maps and in Figure 2 we demonstrate that the polarized thermal dust small scales have a similar level of non-Gaussianity to that in the intensity small scales. Different realizations of maps at for a specific patch and the corresponding power spectra are shown in Figure 4, which indeed have expected variations at small scales and the correct amplitude scaling with multipoles for power spectra.
We then considered the case of a angular scale, where lensing -modes have the strongest signal, and we studied the deterministic case first, ForSE+D. By relying on the scale invariance assumption discussed in Section 2.2.1 and the pre- and post-processing steps outlined in Figure 5, we trained the model to generate maps at out of those at , which are the output from ForSE. MFs distributions in Figure 6 represent a verification of the training process. Maps and power spectra of a selected patch of maps at are shown in Figure 8, where the power is present as expected. ForSE+S3 was finally trained in order to generate stochastic small scales at . Two realizations of small scales are shown in Figure 10, with their power spectra shown in the right panel of Figure 8, showing the expected multipole scaling.
After obtaining the small scales on flat patches, we reprojected them onto the celestial sphere in order to get full-sky maps in the HEALPix format, with for maps at and 4096 for maps at . We show the Mollview projection of the deterministic maps at in Figure 12 and compare them with the observed Planck GNILC maps, validating both the effectiveness of the injected small scales and the reprojection process. The power spectra of deterministic maps at , and with , and sky masks are plotted in Figure 13, indicating that the small scales generated preserve the same anisotropies as the low-resolution observations.
We further obtain the covariance matrix of power spectra up to from 100 realizations of maps at in Figure 14, which has a strong correlation between different multipoles at small scales, and thus highlights the non-Gaussianity in our maps. We note that for a Gaussian field, the variances along the diagonal line are smaller than the variances we obtained here and the off-diagonal correlation is zero. We repeated the calculation of the covariance matrix for the PySM3 d11 model, which is also designed to generate multiple realizations of small scales of thermal dust emission (PanEx Collaboration, 2024), and find that off-diagonal elements are also close to zero, meaning that the level of non-Gaussianity of the injected small scales on the full sky is small.
Finally, we measured the level of non-Gaussianity using MFs in the simulated full-sky maps in Figure 15, and compared it with the small scales simulated within the Gaussian assumption. The difference between the MFs obtained from the two kinds of maps is another clear indication of the non-Gaussian small-scale component generated with ForSE+.
Once the steps above are achieved, we can use the synthetic dust Stokes and maps at 353GHz at resolution as a template and appropriately scale across different frequencies by taking the observed dust SEDs. The series of maps can then serve as a simulation suite to obtain the scatter of component separation against variations in the foreground realization. Moreover, high-resolution maps are essential in order to validate the lensing reconstruction methods, tested so far on foreground models where the arcminute structure is either absent or Gaussian (Lonappan et al., 2023), or based on models of the Galactic magnetic fields in order to reproduce a non-Gaussian pattern (Beck et al., 2020). In a forthcoming work, we will investigate if any variation of the results shown in Beck et al. (2020) occurs when considering the foreground maps exhibiting non-Gaussianity generated through GANs in this paper.
With the new version of PySM3 under development, we capitalize on integrating our maps into a new model of the latest PySM packages for public use. Maps will be shared upon request to the authors. The code to generate maps has also been made publicly available.161616https://github.com/yaojian95/ForSEplus. Finally, we report here future improvements in the ForSE algorithm. First, more observations are needed. In fact, the assumptions made in Section 2.2 are somewhat a compromise due to the lack of observation at the required resolution. The models will get more reliable as more observations become available. Second, by considering the network architecture, the loss function accounting for the non-Gaussianity of the targets produced by the GAN may be considered, as in the current implementation this process is not automatized. A way to quantify the level of non-Gaussianity with a formalism that is differentiable with respect to the input pixels would be desirable, as then we could construct the loss function of the network in order to include the information of non-Gaussianity, which will effectively guide the generator. The last point we want to mention is that the operation of adding noise is to some extent like the training process of diffusion models in deep learning, which is more natural than what is done in this work, so if we turn to use the network of diffusion model, it may have a better performance.
Acknowledgement
The authors acknowledge partial support by the Italian Space Agency LiteBIRD Project (ASI Grants No. 2020-9-HH.0 and 2016-24-H.1-2018), as well as the InDark and LiteBIRD Initiative of the National Institute for Nuclear Phyiscs, and the RadioForegroundsPlus Project (HORIZON-CL4-2023-SPACE-01, GA 101135036). G.P. acknowledges support from Italian Research Center on High Performance Computing Big Data and Quantum Computing (ICSC), project funded by European Union - NextGenerationEU - and National Recovery and Resilience Plan (NRRP) - Mission 4 Component 2 within the activities of Spoke 3 (Astrophysics and Cosmos Observations). J.Y. thanks Yun Zheng for useful discussions. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory, operated under Contract No. DE-AC02-05CH11231.
Software used: Astropy (Astropy Collaboration et al., 2013, 2018), PySM (Zonca et al., 2021; Thorne et al., 2017), NumPy (Harris et al., 2020), Namaster (Alonso et al., 2019), numba (Lam et al., 2015), Tensorflow (Abadi et al., 2015), reproject (Robitaille et al., 2023), HEALPix (Zonca et al., 2019; Górski et al., 2005)
References
- Abadi et al. (2015) Abadi, M., Agarwal, A., Barham, P., et al. 2015, TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems, software available from tensorflow.org
- Abazajian et al. (2016) Abazajian, K. N., Adshead, P., Ahmed, Z., et al. 2016, arXiv e-prints, arXiv:1610.02743
- Abril-Cabezas et al. (2024) Abril-Cabezas, I., Hervías-Caimapo, C., von Hausegger, S., Sherwin, B. D., & Alonso, D. 2024, MNRAS, 527, 5751
- Acquaviva & Baccigalupi (2006) Acquaviva, V. & Baccigalupi, C. 2006, Phys. Rev. D, 74, 103510
- Ade et al. (2019) Ade, P., Aguirre, J., Ahmed, Z., et al. 2019, J. Cosmology Astropart. Phys., 2019, 056
- Allison et al. (2015) Allison, R., Caucal, P., Calabrese, E., Dunkley, J., & Louis, T. 2015, Phys. Rev. D, 92, 123535
- Allys et al. (2019) Allys, E., Levrier, F., Zhang, S., et al. 2019, A&A, 629, A115
- Alonso et al. (2019) Alonso, D., Sanchez, J., Slosar, A., & LSST Dark Energy Science Collaboration. 2019, MNRAS, 484, 4127
- Astropy Collaboration et al. (2018) Astropy Collaboration, Price-Whelan, A. M., Sipőcz, B. M., et al. 2018, AJ, 156, 123
- Astropy Collaboration et al. (2013) Astropy Collaboration, Robitaille, T. P., Tollerud, E. J., et al. 2013, A&A, 558, A33
- Aylor et al. (2021) Aylor, K., Haq, M., Knox, L., Hezaveh, Y., & Perreault-Levasseur, L. 2021, MNRAS, 500, 3889
- Bartolo et al. (2004) Bartolo, N., Komatsu, E., Matarrese, S., & Riotto, A. 2004, Phys. Rep, 402, 103
- Beck et al. (2020) Beck, D., Errard, J., & Stompor, R. 2020, J. Cosmology Astropart. Phys., 2020, 030
- Bennett et al. (2013) Bennett, C. L., Larson, D., Weiland, J. L., et al. 2013, ApJS, 208, 20
- Bernardi et al. (2004) Bernardi, G., Carretti, E., Fabbri, R., et al. 2004, MNRAS, 351, 436
- BeyondPlanck Collaboration I (2023) BeyondPlanck Collaboration I. 2023, A&A, 675, A1
- Cabella et al. (2010) Cabella, P., Pietrobon, D., Veneziani, M., et al. 2010, MNRAS, 405, 961
- Clark & Hensley (2019) Clark, S. E. & Hensley, B. S. 2019, ApJ, 887, 136
- Coulton & Spergel (2019) Coulton, W. R. & Spergel, D. N. 2019, J. Cosmology Astropart. Phys., 2019, 056
- Delabrouille et al. (2013) Delabrouille, J., Betoule, M., Melin, J. B., et al. 2013, A&A, 553, A96
- Delabrouille et al. (2009) Delabrouille, J., Cardoso, J. F., Le Jeune, M., et al. 2009, A&A, 493, 835
- Foschi (2021) Foschi, M. 2021, Master’s thesis, University of Trento
- Goodfellow et al. (2014) Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., et al. 2014, arXiv e-prints, arXiv:1406.2661
- Górski et al. (2005) Górski, K. M., Hivon, E., Banday, A. J., et al. 2005, ApJ, 622, 759
- Grewal et al. (2022) Grewal, N., Zuntz, J., Tröster, T., & Amon, A. 2022, The Open Journal of Astrophysics, 5, 13
- Guth (1981) Guth, A. H. 1981, Phys. Rev. D, 23, 347
- Hadwiger (1957) Hadwiger, H. 1957, Vorlesungen ueber Inhalt, Oberflache und Isoperimetrie, Die Grundlehren der mathematischen Wissenschaften (Springer)
- Harris et al. (2020) Harris, C. R., Millman, K. J., van der Walt, S. J., et al. 2020, Nature, 585, 357
- Hervías-Caimapo & Huffenberger (2022) Hervías-Caimapo, C. & Huffenberger, K. M. 2022, ApJ, 928, 65
- Heurtel-Depeiges et al. (2023) Heurtel-Depeiges, D., Burkhart, B., Ohana, R., & Régaldo-Saint Blancard, B. 2023, arXiv e-prints, arXiv:2310.16285
- Hinshaw et al. (2013) Hinshaw, G., Larson, D., Komatsu, E., et al. 2013, ApJS, 208, 19
- Hu (2000) Hu, W. 2000, Phys. Rev. D, 62, 043007
- Hu & Dodelson (2002) Hu, W. & Dodelson, S. 2002, ARA&A, 40, 171
- Hu & Okamoto (2002) Hu, W. & Okamoto, T. 2002, ApJ, 574, 566
- Kamionkowski et al. (1997) Kamionkowski, M., Kosowsky, A., & Stebbins, A. 1997, Phys. Rev. D, 55, 7368
- Kamionkowski & Kovetz (2016) Kamionkowski, M. & Kovetz, E. D. 2016, ARA&A, 54, 227
- Kim et al. (2019) Kim, C.-G., Choi, S. K., & Flauger, R. 2019, ApJ, 880, 106
- Krachmalnicoff et al. (2016) Krachmalnicoff, N., Baccigalupi, C., Aumont, J., Bersanelli, M., & Mennella, A. 2016, A&A, 588, A65
- Krachmalnicoff et al. (2018) Krachmalnicoff, N., Carretti, E., Baccigalupi, C., et al. 2018, A&A, 618, A166
- Krachmalnicoff & Puglisi (2021) Krachmalnicoff, N. & Puglisi, G. 2021, ApJ, 911, 42
- Krachmalnicoff & Puglisi (2023) Krachmalnicoff, N. & Puglisi, G. 2023, The Astrophysical Journal, 947, 93
- Krachmalnicoff & Tomasi (2019) Krachmalnicoff, N. & Tomasi, M. 2019, A&A, 628, A129
- Lam et al. (2015) Lam, S. K., Pitrou, A., & Seibert, S. 2015, in Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, 1–6
- Lenz et al. (2017) Lenz, D., Hensley, B. S., & Doré, O. 2017, ApJ, 846, 38
- Li et al. (2018) Li, H., Li, S.-Y., Liu, Y., et al. 2018, National Science Review, 6, 145
- LiteBIRD Collaboration et al. (2023) LiteBIRD Collaboration, Allys, E., Arnold, K., et al. 2023, Progress of Theoretical and Experimental Physics, 2023, 042F01
- Lonappan et al. (2023) Lonappan, A. I., Namikawa, T., Piccirilli, G., et al. 2023, arXiv e-prints, arXiv:2312.05184
- Madhavacheril et al. (2024) Madhavacheril, M. S., Qu, F. J., Sherwin, B. D., et al. 2024, ApJ, 962, 113
- Maniyar et al. (2021) Maniyar, A. S., Ali-Haïmoud, Y., Carron, J., Lewis, A., & Madhavacheril, M. S. 2021, Phys. Rev. D, 103, 083524
- Mantz et al. (2008) Mantz, H., Jacobs, K., & Mecke, K. 2008, Journal of Statistical Mechanics: Theory and Experiment, 2008, 12015
- Padoan et al. (2001) Padoan, P., Goodman, A., Draine, B. T., et al. 2001, ApJ, 559, 1005
- PanEx Collaboration (2024) PanEx Collaboration. 2024, in preparation
- Planck Collaboration Int. XX (2015) Planck Collaboration Int. XX. 2015, A&A, 576, A105
- Planck Collaboration IV (2020) Planck Collaboration IV. 2020, A&A, 641, A4
- Planck Collaboration IX (2020) Planck Collaboration IX. 2020, A&A, 641, A9
- Planck Collaboration VI (2020) Planck Collaboration VI. 2020, A&A, 641, A6
- Planck Collaboration VIII (2020) Planck Collaboration VIII. 2020, A&A, 641, A8
- Planck Collaboration XI (2020) Planck Collaboration XI. 2020, A&A, 641, A11
- POLARBEAR Collaboration (2022) POLARBEAR Collaboration. 2022, ApJ, 931, 101
- Qu et al. (2024) Qu, F. J., Sherwin, B. D., Madhavacheril, M. S., et al. 2024, ApJ, 962, 112
- Remazeilles et al. (2015) Remazeilles, M., Dickinson, C., Banday, A. J., Bigot-Sazy, M. A., & Ghosh, T. 2015, MNRAS, 451, 4311
- Robitaille et al. (2023) Robitaille, T., Ginsburg, A., Mumford, S., et al. 2023, astropy/reproject: v0.10.0
- Santos et al. (2021) Santos, L., Yao, J., Zhang, L., et al. 2021, A&A, 650, A65
- Spider Collaboration (2022) Spider Collaboration. 2022, ApJ, 927, 174
- SPT Collaboration (2023) SPT Collaboration. 2023, Phys. Rev. D, 108, 023510
- Stompor et al. (2009) Stompor, R., Leach, S., Stivoli, F., & Baccigalupi, C. 2009, MNRAS, 392, 216
- The BICEP/Keck Collaboration (2022) The BICEP/Keck Collaboration. 2022, ApJ, 927, 77
- Thorne et al. (2017) Thorne, B., Dunkley, J., Alonso, D., & Næss, S. 2017, MNRAS, 469, 2821
- Thorne et al. (2021) Thorne, B., Knox, L., & Prabhu, K. 2021, MNRAS, 504, 2603
- Tristram et al. (2022) Tristram, M., Banday, A. J., Górski, K. M., et al. 2022, Phys. Rev. D, 105, 083524
- Yao et al. (2018) Yao, J., Zhang, L., Zhao, Y., et al. 2018, ApJS, 239, 36
- Zaldarriaga & Seljak (1997) Zaldarriaga, M. & Seljak, U. 1997, Phys. Rev. D, 55, 1830
- Zhang et al. (2019) Zhang, P., Zhang, J., & Zhang, L. 2019, MNRAS, 484, 1616
- Zonca et al. (2019) Zonca, A., Singer, L., Lenz, D., et al. 2019, Journal of Open Source Software, 4, 1298
- Zonca et al. (2021) Zonca, A., Thorne, B., Krachmalnicoff, N., & Borrill, J. 2021, Journal of Open Source Software, 6, 3783