LADDER: Revisiting the Cosmic Distance Ladder with Deep Learning Approaches and Exploring Its Applications
Rahul Shah, Soumadeep Saha, Purba Mukherjee, Utpal Garain, and Supratik Pal
Published 2024 July 26 • © 2024. The Author(s). Published by the American Astronomical Society.
The Astrophysical Journal Supplement Series, Volume 273, Number 2
Citation Rahul Shah et al 2024 ApJS 273 27
DOI 10.3847/1538-4365/ad5558
Abstract
We investigate the prospect of reconstructing the “cosmic distance ladder” of the Universe using a novel deep learning framework called LADDER—Learning Algorithm for Deep Distance Estimation and Reconstruction. LADDER is trained on the apparent magnitude data from the Pantheon Type Ia supernova compilation, incorporating the full covariance information among data points, to produce predictions along with corresponding errors. After employing several validation tests with a number of deep learning models, we pick LADDER as the best-performing one. We then demonstrate applications of our method in the cosmological context, including serving as a model-independent tool for consistency checks for other data sets like baryon acoustic oscillations, calibration of high-redshift data sets such as gamma-ray bursts, and use as a model-independent mock-catalog generator for future probes. Our analysis advocates for careful consideration of machine learning techniques applied to cosmological contexts.
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
1. Introduction
Knowledge of accurate distances to astronomical entities at various redshifts is essential for deducing the expansion history of the Universe. Observationally, however, this task is not simple since one single standardizable measure of distances at all scales of cosmological interest does not exist. Hence one has to resort to a progressive method of calibrating distances, called the “cosmic distance ladder” method, using overlapping regions of potentially different standardizable objects as “rungs of the ladder.” The conventional distance ladder method (Riess & Breuval 2024) starts with direct measures of geometric distances and progresses to calibrating Cepheid variables (Freedman & Madore 2024) or tip of the red giant branch stars (Freedman et al. 2020), and finally Type Ia supernovae (SN Ia). Conversely, the “inverse” distance ladder begins with cosmology-dependent constraints on the sound horizon at the drag epoch from the cosmic microwave background (CMB), which is then used to calibrate distances to baryon acoustic oscillations (BAOs) and ultimately to SN Ia at lower redshifts (Cuesta et al. 2015; Camarena & Marra 2020). SN Ia are the preferred endpoints for both ladders given their property of being reliable standard candles over a wide redshift range.
A physical theory describing the expansion history of a spatially flat, homogeneous, and isotropic universe is given by a cosmological model, which is assumed to be valid over the entire range of observed scales, i.e., from the present epoch (z = 0) to the epoch of recombination ($z_{\rm CMB} \sim 1100$). The Lambda cold dark matter (ΛCDM) model is the current standard, with six free parameters to be fixed by observations. For a Friedmann–Lemaître–Robertson–Walker universe, the cosmic distance duality relation $d_L = (1 + z)^2 d_A$ enables switching between luminosity distance $d_L$ and angular diameter distance $d_A$, the two primary measures of distance in cosmology. The luminosity distances are related to this physical model as

$$d_L(z) = \frac{c\,(1 + z)}{H_0} \int_0^z \frac{dz'}{E(z')},$$

where $E(z) = H(z)/H_0$ is the reduced Hubble parameter, and $H_0 = H(z = 0)$ is the Hubble constant, signifying the rate of the Universe’s expansion today. For sufficiently low redshifts, $d_L$ is well approximated by Hubble’s law, $z = H_0 d_L / c$, offering a means to obtain $H_0$ without assuming a cosmological model. However, of late, inconsistencies have arisen in the concordance model, with the most significant being the tension in the measurement of the Hubble constant ($H_0$) (Novosyadlyj et al. 2014; Hazra et al. 2015; Bernal et al. 2016). This, and other growing issues with ΛCDM, has prompted the community to turn either to more complicated cosmological models or to cosmological-model-independent (henceforth referred to as simply “model-independent”) approaches, the second route proving more effective with time.
The simplest method involves cosmography (Visser 2005), which, being merely a Taylor expansion of the scale factor, does not introduce bias toward any particular cosmological model. There is, however, an ambiguity as to the number of terms to consider in such a series. The aforementioned issues in contemporary cosmology, such as the emergence of tensions, arise when models are confronted with precision data from observations. Any alternative method of building the distance ladder must therefore maintain, if not improve, the precision of the data being used. Premature truncation of the cosmographic series may induce significant numerical errors at higher redshifts, while considering higher-order terms raises doubts about convergence. Although alternatives to the Taylor series, such as Padé (Wei et al. 2014) and Chebyshev (Capozziello et al. 2018), help overcome convergence issues to some extent, there is still no clear consensus on the exact number of terms to consider to faithfully mimic the underlying cosmology.
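The truncation problem can be illustrated with a small numerical sketch. It assumes a flat ΛCDM fiducial model with Ωm = 0.3 (an illustrative value, not one taken from this article), for which the deceleration and jerk parameters are q0 = 3Ωm/2 − 1 and j0 = 1. It compares the standard third-order flat-universe cosmographic series for the luminosity distance with a direct numerical integration of (1 + z) ∫ dz′/E(z′). The function names are hypothetical, and distances are in units of c/H0.

```python
import math

def E(z, omega_m=0.3):
    """Reduced Hubble parameter H(z)/H0 for flat LCDM (assumed fiducial model)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)

def dl_exact(z, omega_m=0.3, n=2000):
    """d_L in units of c/H0: (1+z) * integral_0^z dz'/E(z'), by Simpson's rule.

    n must be even for Simpson's rule.
    """
    h = z / n
    total = 1.0 / E(0.0, omega_m) + 1.0 / E(z, omega_m)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight / E(i * h, omega_m)
    return (1.0 + z) * total * h / 3.0

def dl_cosmographic(z, omega_m=0.3):
    """Third-order cosmographic series for d_L (flat universe), in units of c/H0."""
    q0 = 1.5 * omega_m - 1.0  # deceleration parameter of flat LCDM
    j0 = 1.0                  # jerk parameter of flat LCDM
    return (z
            + 0.5 * (1.0 - q0) * z ** 2
            - (1.0 - q0 - 3.0 * q0 ** 2 + j0) * z ** 3 / 6.0)

def rel_err(z):
    """Relative error of the truncated series with respect to the integral."""
    exact = dl_exact(z)
    return abs(dl_cosmographic(z) - exact) / exact

for z in (0.1, 0.5, 1.0, 2.0):
    print(f"z = {z:3.1f}  relative error = {rel_err(z):.2e}")
```

Under these assumptions the series is accurate at low redshift and degrades steadily toward z ≈ 2, near the upper end of the Pantheon redshift range.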
This has motivated the community to resort to reverse engineering by employing model-independent methods for reconstructing distances, and estimation of cosmological parameters therefrom. There have been multiple attempts to reconstruct cosmic distances using Gaussian processes (GPs) and genetic algorithms, by various authors, both with present data and with simulated data from future observatories (Arjona et al. 2021; Keeley et al. 2021; Mukherjee & Mukherjee 2021; Li et al. 2024). Ambiguity over the choice of kernels, the function dictionary, the mean function, and overfitting concerns with overwhelming errors in data-scarce regions have significantly limited the prospects of these approaches (Ó Colgáin & Sheikh-Jabbari 2021; Hwang et al. 2023). This has led to an active use of deep learning with artificial neural networks (Wang et al. 2020a, 2020b; Tang et al. 2021; Escamilla-Rivera et al. 2022; Olvera et al. 2022; Dialektopoulos et al. 2023, 2024; Giambagli et al. 2023; Gómez-Vargas et al. 2023a, 2023b; Liu et al. 2023; Mehrabi 2023; Xie et al. 2023; Zhang et al. 2023, 2024; Mukherjee et al. 2024a) in this domain.
As important as accuracy is, measuring distances is limited by experimental precision, due to astrophysical uncertainties, foregrounds, peculiar velocity effects, and other practical limitations. Precision is also limited by the available number of data points. These are a critical concern, as precision tests are essential for scrutinizing the standard, or alternative, cosmological models. Although both of these issues are expected to improve considerably with upcoming observatories, the use of innovative analysis methods, such as machine learning (ML) techniques, on current data could help overcome these challenges. Another common limitation of a majority of previous ML attempts in cosmology lies in the correct and stable prediction of errors at relatively higher redshifts, which makes them unsuitable for undertaking precision cosmological tests when it comes to the issue of tensions. With this motivation, we present our approach, LADDER (Learning Algorithm for Deep Distance Estimation and Reconstruction), which has been designed from the ground up keeping the above considerations in mind. Moreover, almost every straightforward technique fails at extrapolation to any redshift beyond the range of available data in a model-independent manner, because prediction uncertainties play a dominant role. Being able to extrapolate beyond the range of available data is attractive, since it could allow simulations of intermediate-redshift data or, at least, serve as a stable augmentation of currently available data to higher redshifts.
In this spirit, we aim to revisit the cosmic distance ladder by presenting this novel deep learning algorithm LADDER, which is trained using the Pantheon SN Ia data set (Scolnic et al. 2018), taking into account the corresponding errors and complete covariances in the data. Our algorithm interpolates from the joint distribution of a randomly chosen subset of the data set to estimate the target variable and errors simultaneously, and elegantly incorporates correlations and the sequential nature of the data. This leads to predictions that are robust to input noise and outliers and helps make precise predictions even in data-sparse regions. In the following sections, we first outline the data sets and the proposed algorithm, followed by performance validation. We then point out a few cosmological applications that can be explored further using our algorithm. In particular, we demonstrate LADDER's versatility in conducting consistency checks for a similar SN Ia data set, Pantheon+ (Scolnic et al. 2022). Subsequently, we analyze the implications of our model-independent predictions using the BAO data set, with regard to their alleged dependence on fiducial cosmology (Sherwin & White 2019). We then use the LADDER predictions to calibrate the high-redshift gamma-ray burst (GRB) data set to derive constraints on the ΛCDM and wCDM models. Additionally, we discuss the potential of our deep learning network as a model-independent mock-data generator for cosmological studies, and some future directions.
2. Training Data Set
In principle, one can utilize LADDER with any data set. However, in this article, our primary focus is on achieving a model-independent reconstruction of the cosmological distance ladder. To accomplish this, we employ the Pantheon SN Ia compilation (Scolnic et al. 2018) as our training data set. The Pantheon data set offers several cosmological advantages for the model-independent reconstruction of the cosmic distance ladder. First, it remains uncalibrated, which guarantees that it is neither plagued by the choice of cosmological models nor biased by any inherent systematics in otherwise model-independent direct distance measurement calibration methods. Moreover, it involves fewer sources of uncertainties, enhancing its robustness. Additionally, its broad redshift range and diverse data samples help ensure unbiased learning. Notably, no priors on the calibration parameter ($M_B$, the absolute magnitude of SN Ia in the B band) are imposed on the “Pantheon” data set during training, either from early-time model-dependent CMB constraints or from direct late-time distance measurements.
Pantheon features rich data from 1048 spectroscopically confirmed SN Ia spanning a broad range of redshifts 0.01 ≲ z < 2.3, with a higher sample density at lower redshifts and notable sparsity with increasing z. The data set comprises direct measurements of the apparent magnitude (m), with statistical uncertainties (Δm), tabulated at different redshifts (z). Additionally, there is a 1048 × 1048 matrix corresponding to covariances among the data points. This data set thus allows for a thorough exploration of cosmic distances covering a wide range of redshifts, and is well suited for model-independent analyses.
Given knowledge of $M_B$, one can find the luminosity distance $d_L(z)$ independent of any cosmological model. This is expressed by the equation

$$\mu(z) = 5 \log_{10}\left[\frac{d_L(z)}{1\,\mathrm{Mpc}}\right] + 25,$$

where $\mu(z) = m(z) - M_B$ is the distance modulus.
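As a worked illustration, the standard distance-modulus relation μ = 5 log10(dL / 1 Mpc) + 25 can be inverted to convert an apparent magnitude into a luminosity distance once a value of MB is adopted. The sketch below assumes that relation; the function names and the numerical values (m = 24.0, Δm = 0.15, MB = −19.3) are purely illustrative and are not taken from the Pantheon data or from this article.

```python
import math

def luminosity_distance_mpc(m, M_B):
    """Invert mu = 5 log10(d_L / Mpc) + 25 with mu = m - M_B; returns d_L in Mpc."""
    mu = m - M_B
    return 10.0 ** ((mu - 25.0) / 5.0)

def apparent_magnitude(d_l_mpc, M_B):
    """Forward relation: m = 5 log10(d_L / Mpc) + 25 + M_B."""
    return 5.0 * math.log10(d_l_mpc) + 25.0 + M_B

def distance_error_mpc(m, delta_m, M_B):
    """First-order error propagation: delta_d = d_L * ln(10) / 5 * delta_m."""
    return luminosity_distance_mpc(m, M_B) * math.log(10.0) / 5.0 * delta_m

# Illustrative values only (hypothetical supernova, hypothetical calibration).
m, delta_m, M_B = 24.0, 0.15, -19.3
d_l = luminosity_distance_mpc(m, M_B)
print(f"d_L = {d_l:.1f} Mpc +/- {distance_error_mpc(m, delta_m, M_B):.1f} Mpc")
```

Because dL depends exponentially on m − MB, any shift in the adopted MB rescales every distance by the same factor, which is why the calibration is kept free during training.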
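The observed apparent magnitudes ($m_{\rm obs}$) for each SN Ia light curve as measured on Earth depend on the heliocentric ($z_{\rm hel}$) and CMB frame ($z_{\rm cmb}$) redshifts. In terms of only $z_{\rm cmb}$ (i.e., in the absence of peculiar velocities) we have

$$m(z_{\rm cmb}) = m_{\rm obs}(z_{\rm hel}, z_{\rm cmb}) - 5 \log_{10}\left(\frac{1 + z_{\rm hel}}{1 + z_{\rm cmb}}\right).$$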
We then also propagate the errors in $z_{\rm cmb}$ into $m(z_{\rm cmb})$. This gives us the data in the final form $m(z_{\rm cmb})$ versus $z_{\rm cmb}$, which we henceforth refer to as simply m versus z, with corresponding statistical errors Δm and covariance matrix $C_{\rm sys}$.
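One common way to combine these two error sources, sketched below, is to add the squared statistical errors to the diagonal of the systematic covariance matrix. This is an assumption about how the pieces fit together, not a description of the authors' pipeline; the function name and the toy numbers are hypothetical.

```python
def total_covariance(delta_m, c_sys):
    """Return C = diag(delta_m^2) + C_sys as a list of lists.

    Assumption: statistical errors are uncorrelated and add in quadrature
    to the diagonal of the systematic covariance matrix.
    """
    n = len(delta_m)
    if len(c_sys) != n or any(len(row) != n for row in c_sys):
        raise ValueError("c_sys must be an n x n matrix matching delta_m")
    return [
        [c_sys[i][j] + (delta_m[i] ** 2 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]

# Toy 3-point example (made-up numbers, not Pantheon values).
delta_m = [0.10, 0.12, 0.20]
c_sys = [
    [0.0100, 0.0020, 0.0010],
    [0.0020, 0.0150, 0.0030],
    [0.0010, 0.0030, 0.0300],
]
for row in total_covariance(delta_m, c_sys):
    print(["%.4f" % value for value in row])
```

For the full compilation the same construction would yield a 1048 × 1048 matrix, from which sub-blocks for any chosen subset of supernovae can be extracted.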
Armed with this data, we aim to train a neural network capable of proficiently learning, and extrapolating to higher redshifts, the apparent magnitude data set independently of the underlying cosmological model.
3. Methodology
3.1. Formal Problem Description
Given the Pantheon data set, , which is drawn from some a priori unknown distribution, and , we are interested in estimating the distribution of with the assumption , for some value of the functions μθ and σθ and parameter θ. In ML parlance this would be restated as follows: given find , such that for any new input z, we have
for a certain class of functions , which could be a deep learning network, and , which is a risk functional. This risk functional is typically the empirical risk,
where ℓ is a loss function, usually the Kullback–Leibler (KL) divergence since we are measuring the distance between distributions.
Although our goal is to interpolate from the given points, this problem notably differs from standard regression as the samples are not independent. In particular, since , our typical empirical risk minimization does not work, and we are left dealing with the following intractable empirical risk:
3.2. Our Approach—LADDER
Since our data points are not independent, any predictive model we devise would have to depend on the entire data set. This presents a challenge, as, whenever we have access to any new data, we must readjust our predictive model taking into account the correlations between the new and old data. In order to mitigate this issue without ignoring the correlations between the data instances, we assume that at most K many samples from the data set are correlated with each other, and rewrite the empirical risk as
where
are the predicted and observed covariances, respectively, and f is our predictive model. This way, although we are not considering the whole covariance matrix Σm for each sample, all possible correlations are accounted for, since we are minimizing risk over all observed data points in aggregate. This motivates our choice of function to be of the form
Our objective then is to minimize
$D_{\rm KL}$ is the KL divergence, which measures the “distance” between the distributions P and Q; it is not symmetric in P and Q, so it is not a metric in the strict sense. The parameter θ can be found with an algorithm like stochastic gradient descent.
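For intuition, the KL divergence has a closed form when both distributions are univariate Gaussians. The sketch below uses that textbook expression; it is a simplified one-dimensional stand-in, not the multivariate objective used for LADDER, and the function name is hypothetical.

```python
import math

def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) for P = N(mu_p, sigma_p^2) and Q = N(mu_q, sigma_q^2)."""
    if sigma_p <= 0.0 or sigma_q <= 0.0:
        raise ValueError("standard deviations must be positive")
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
            - 0.5)

# Identical distributions have zero divergence.
print(kl_gaussian(0.0, 1.0, 0.0, 1.0))
# Swapping the arguments changes the value: the divergence is asymmetric.
print(kl_gaussian(0.0, 1.0, 1.0, 2.0), kl_gaussian(1.0, 2.0, 0.0, 1.0))
```

The divergence penalizes both a mismatch in the predicted mean and a mismatch in the predicted width, which is what allows a single loss to train the value and its error estimate together.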
During training, we first choose K points from
and designate K − 1 as “support” points, and the remaining point is dubbed the “query point.” We create a training instance by sampling from
to get
, and create
and
(we rearrange the indices such that
). Put simply, the training proxy objective asks: given these K − 1 points from the data set, predict (m, Δm) corresponding to my point of interest.
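The construction of one training instance can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that the K magnitudes are drawn from a multivariate normal centered on the observed values with the K × K sub-block of the covariance matrix, that the query point is picked uniformly among the K points, and that the target is the observed magnitude together with the square root of the corresponding diagonal covariance entry. All function names and the toy data are hypothetical.

```python
import math
import random

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def make_training_instance(z, m, cov, K, rng):
    """Build (support, query_z, target) from K randomly chosen data points."""
    idx = sorted(rng.sample(range(len(z)), K))         # K distinct indices
    sub_cov = [[cov[i][j] for j in idx] for i in idx]  # K x K covariance block
    low = cholesky(sub_cov)
    eps = [rng.gauss(0.0, 1.0) for _ in range(K)]
    # Correlated draw: m_sampled = m_sub + L @ eps (assumed sampling scheme).
    m_sampled = [
        m[i] + sum(low[r][c] * eps[c] for c in range(r + 1))
        for r, i in enumerate(idx)
    ]
    q = rng.randrange(K)                               # position of the query point
    support = [(z[idx[r]], m_sampled[r]) for r in range(K) if r != q]
    query_z = z[idx[q]]
    target = (m[idx[q]], math.sqrt(sub_cov[q][q]))     # assumed target (m, delta m)
    return support, query_z, target

# Toy data (made-up values, not Pantheon measurements).
z = [0.01, 0.1, 0.3, 0.6, 1.0, 1.8]
m = [14.0, 19.0, 21.5, 23.2, 24.5, 26.0]
cov = [[0.02 if i == j else 0.001 for j in range(6)] for i in range(6)]

support, query_z, target = make_training_instance(z, m, cov, K=4, rng=random.Random(0))
print("support:", support)
print("query z:", query_z, "target (m, delta m):", target)
```

Resampling the support magnitudes on every draw exposes the network to the noise level and correlations of the data, which is one plausible route to the robustness to input noise described in the Introduction.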
Given j1, j2, …, jK , we compute
as follows:
Our job then is to find
, such that
, where fθ is a suitably chosen deep neural network, with parameter θ. The full algorithm is outlined in Algorithm 1, with a schematic outlined in Figure 1.
Figure 1. Schematic overview of the training algorithm.
Algorithm 1. LADDER: Learning Algorithm for Deep Distance Estimation and Reconstruction
Given
and batch size B.
Initialize
.
while not StopCondition do
for
do
(Equation (9))
Forward pass.
end for
Compute
Gradient update (illustrative).
if … then
StopCondition ← True (check if model converged)
end if