.. _students_introduction:

Introduction for students
=========================

This page is written for a reader with a solid background in physics who is
new to observational cosmology. It explains *what* this package does and *why*
each measurement matters, before you dive into the technical details elsewhere
in the documentation. Experts may want to skip ahead to the API reference.

.. contents:: Contents
   :local:
   :depth: 1

----

What is a galaxy survey?
------------------------

A **galaxy survey** is a systematic census of the universe — a telescope
photographs the sky and/or disperses the light of each target into a spectrum,
and software detects hundreds of millions of galaxies and stars. It is
important to separate what is *directly observed* from what is *physically
inferred*.

Directly measured quantities
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* **Sky position** — `right ascension (RA) and declination (Dec) `__, the
  astronomical equivalent of longitude and latitude, measured from the
  centroid of the detected light distribution.
* **Apparent flux in photometric bands** — the number of photons detected per
  second through each broad-band filter (e.g. *u*, *g*, *r*, *i*, *z*, *Y* in
  the optical/near-IR). This is the fundamental output of an imaging survey;
  see `astronomical photometry `__. The ratio of fluxes in two bands is called
  a **colour** and encodes information about the galaxy's temperature and
  stellar population.
* **Spectra** — the wavelength-resolved flux, obtained by dispersing the light
  through a `spectrograph `__. `Spectral lines `__ — the fingerprints of
  specific atomic or molecular transitions — appear at well-known rest-frame
  wavelengths and are Doppler-shifted by the motion of the source and
  cosmologically redshifted due to the expansion of the universe.
* **Galaxy shape (ellipticity)** — from high-resolution images, the projected
  shape of each galaxy can be measured. Shape catalogues are the raw input for
  weak gravitational lensing analyses.

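
As a small numerical illustration of two entries in the list above, the sketch
below turns a pair of band fluxes into a colour and shifts a rest-frame line to
its observed wavelength. The flux values and the helper names ``colour_mag``
and ``observed_wavelength`` are invented for this example and are not part of
``sum_stat``. Band zero points are ignored, so the fluxes are assumed to be in
the same units.

.. code-block:: python

   import math

   H_ALPHA_REST_NM = 656.28  # rest-frame H-alpha wavelength in air, in nm


   def colour_mag(flux_a, flux_b):
       """Colour m_a - m_b in magnitudes from two fluxes in the same units."""
       return -2.5 * math.log10(flux_a / flux_b)


   def observed_wavelength(rest_nm, z):
       """Wavelength at which a rest-frame line is observed at redshift z."""
       return (1.0 + z) * rest_nm


   # Hypothetical fluxes: the source is twice as bright in r as in g.
   g_minus_r = colour_mag(1.0, 2.0)
   print(f"g - r = {g_minus_r:.3f} mag")

   lam = observed_wavelength(H_ALPHA_REST_NM, 0.1)
   print(f"H-alpha at z = 0.1 is observed near {lam:.1f} nm")

A positive ``g - r`` means the source is brighter in the redder band, which is
why such colours are loosely described as "red".
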

Quantities inferred from the observations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* **Redshift** :math:`z` — the fractional shift of spectral lines toward
  longer (redder) wavelengths, caused by the expansion of the universe
  (`Wikipedia `__). In **spectroscopic surveys** :math:`z` is measured
  precisely by identifying known lines in the dispersed spectrum (precision
  :math:`\sigma_z \sim 0.0001`). In **photometric surveys** it is estimated
  from the shape of the broad-band
  `spectral energy distribution (SED) `__ — a technique called
  `photometric redshift `__ — with typical precision
  :math:`\sigma_z/(1+z) \sim 0.02`. As a rough distance scale:
  :math:`z = 0.1` corresponds to a light-travel distance of roughly
  1.3 billion light-years; :math:`z = 1` to roughly 8 billion light-years.
* **Absolute luminosity and distance** — once :math:`z` is known, a
  cosmological model converts it to a `luminosity distance `__, which combined
  with the apparent flux yields the intrinsic luminosity (and hence absolute
  magnitude). This step introduces a dependence on the assumed cosmology.
* **Stellar mass** :math:`M_\star` — the total mass locked up in stars,
  inferred by fitting `stellar population synthesis (SPS) `__ models to the
  observed SED (`Wikipedia `__). The result depends on the assumed
  `initial mass function `__, star-formation history, and dust attenuation
  law, all of which introduce systematic uncertainties at the factor-of-two
  level.
* **Star formation rate (SFR)** — derived from nebular emission-line fluxes
  (e.g. H\ :math:`\alpha`, [O II]) or from UV and infrared luminosities; all
  of these trace the rate at which interstellar gas is converted into new
  stars (`Wikipedia `__).

Modern surveys observe tens to hundreds of millions of galaxies, producing a
3-D map of the observable universe that can be statistically compared to
theoretical predictions.

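
The rough distance scale quoted in the redshift entry above, and the
flux-to-luminosity step, can be checked with a few lines of numerical
integration. This is only a sketch: it assumes a flat ΛCDM model with
:math:`H_0 = 70` km/s/Mpc and :math:`\Omega_m = 0.3` (values chosen for
illustration, not taken from this package), and every function name is
hypothetical. The lookback time in Gyr equals the light-travel distance in
billions of light-years.

.. code-block:: python

   import numpy as np

   H0 = 70.0              # assumed Hubble constant, km/s/Mpc
   OMEGA_M = 0.3          # assumed matter density (flat: Omega_Lambda = 0.7)
   C_KM_S = 299792.458    # speed of light, km/s
   HUBBLE_TIME_GYR = 977.8 / H0   # 1/H0 in Gyr
   MPC_IN_M = 3.0857e22   # metres per megaparsec


   def _trapezoid(y, x):
       """Plain trapezoidal rule, written out to avoid NumPy version issues."""
       return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


   def e_of_z(z):
       """Dimensionless expansion rate H(z)/H0 for flat LambdaCDM."""
       return np.sqrt(OMEGA_M * (1.0 + z) ** 3 + (1.0 - OMEGA_M))


   def lookback_time_gyr(z, n=2001):
       zz = np.linspace(0.0, z, n)
       return HUBBLE_TIME_GYR * _trapezoid(1.0 / ((1.0 + zz) * e_of_z(zz)), zz)


   def luminosity_distance_mpc(z, n=2001):
       zz = np.linspace(0.0, z, n)
       comoving = (C_KM_S / H0) * _trapezoid(1.0 / e_of_z(zz), zz)
       return (1.0 + z) * comoving  # valid for a spatially flat universe


   def luminosity_watt(flux_w_m2, z):
       """Intrinsic luminosity from an observed bolometric flux."""
       d_m = luminosity_distance_mpc(z) * MPC_IN_M
       return 4.0 * np.pi * d_m ** 2 * flux_w_m2


   for z in (0.1, 1.0):
       print(z, lookback_time_gyr(z), luminosity_distance_mpc(z))

Changing ``H0`` or ``OMEGA_M`` changes both outputs, which is the cosmology
dependence mentioned in the luminosity entry.
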

----

Major galaxy surveys
--------------------

Surveys are broadly divided into **photometric** surveys — which prioritise
wide sky coverage and galaxy-shape measurements — and **spectroscopic**
surveys — which measure precise redshifts for millions of targets.

Ongoing and upcoming reference surveys
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* `Euclid `__ — ESA space mission launched in 2023. Its wide-field imager
  (VIS) and near-infrared spectrograph/photometer (NISP) will map the
  distribution and shapes of more than a billion galaxies out to
  :math:`z \sim 2`. Primary probes: weak lensing and galaxy clustering.
* `Rubin Observatory / LSST `__ — the Legacy Survey of Space and Time; a
  10-year ground-based photometric survey from Cerro Pachón, Chile. Six-band
  (*ugrizy*) imaging to unprecedented depth over half the sky, with science
  goals spanning dark energy, dark matter, transients, and the Solar System.
* `DESI `__ — the Dark Energy Spectroscopic Instrument, a massively
  multiplexed fibre spectrograph at Kitt Peak National Observatory. It will
  deliver spectroscopic redshifts for ~40 million galaxies and quasars over
  :math:`0 < z < 3.5`.

Previous weak-lensing and photometric surveys
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* `HSC (Hyper Suprime-Cam Subaru Strategic Program) `__ — deep wide-field
  imaging on the 8.2-m Subaru Telescope, noted for its exceptional image
  quality and depth.
* `DES (Dark Energy Survey) `__ — six-year imaging survey covering
  ~5000 deg² of the southern sky in five bands; released shape catalogues of
  ~100 million galaxies.
* `KiDS (Kilo-Degree Survey) `__ — European weak-lensing survey with the
  OmegaCAM camera at the VLT Survey Telescope.
* `COSMOS `__ — a deep pencil-beam survey covering a 2 deg² field with data
  from X-ray to radio, widely used as a photometric-redshift calibration field
  and for lensing studies.


Previous spectroscopic surveys
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* `SDSS (Sloan Digital Sky Survey) `__ — the landmark survey that established
  wide-area spectroscopy. Multiple generations (SDSS-I through SDSS-V) have
  provided redshifts for over 3 million objects and photometry across
  one-third of the sky.
* `2dFGRS (Two-degree Field Galaxy Redshift Survey) `__ — obtained ~220 000
  galaxy spectra in the early 2000s and established key results on large-scale
  structure and the galaxy luminosity function.
* `GAMA (Galaxy and Mass Assembly) `__ — deep spectroscopic survey designed to
  study galaxy evolution and the galaxy–halo connection out to
  :math:`z \sim 0.5` at high spectroscopic completeness.

----

Why do we compress data into summary statistics?
------------------------------------------------

A catalogue of one billion galaxies cannot be compared directly to a
theoretical model — it would take impossibly long to compute the probability
of the entire catalogue under any cosmological model. Instead, we *compress*
the catalogue into a small number of **summary statistics**: compact numbers
(or curves) that

1. retain most of the cosmological information we care about, and
2. can be quickly predicted from a model.

This is analogous to characterising the height distribution of a population
not by listing every person, but by reporting the mean and standard deviation.
The reduction in data volume is enormous (from :math:`\sim 10^9` numbers to
:math:`\sim 10^2`), yet the compressed form still allows us to constrain the
cosmological parameters that govern the large-scale structure of the universe.

``sum_stat`` computes three families of summary statistics from galaxy
catalogues and weak-lensing data.

----

The three families of statistics
---------------------------------

One-point statistics: how many galaxies of each mass exist?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The **stellar mass function** :math:`\phi(M_\star)` answers the question:
*how many galaxies per cubic megaparsec have a stellar mass between*
:math:`M_\star` *and* :math:`M_\star + dM_\star`?

Think of it as a histogram of galaxy masses, normalised by the volume of
universe surveyed. Its shape — a power law at low masses that drops
exponentially above a characteristic mass :math:`M^*` (the **Schechter
function**) — encodes how efficiently the gas inside dark matter halos has
been converted into stars.

Similarly, the **luminosity function** :math:`\phi(M)` counts galaxies by
their absolute brightness rather than their mass; here :math:`M` denotes
absolute magnitude, not mass.

.. note::

   *Megaparsec (Mpc):* 1 Mpc = 3.09 × 10²² m ≈ 3.26 million light-years.
   A typical distance between galaxies is a few Mpc.

Two-point statistics: do galaxies cluster?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Galaxies are not scattered randomly — they cluster along **cosmic filaments**
and around massive **dark matter halos**, leaving voids almost devoid of
galaxies. **Two-point correlation functions** quantify this clustering by
measuring the excess probability of finding a galaxy pair at a given
separation, compared with a uniform random distribution. Specifically:

* **Angular correlation function** :math:`w(\theta)` — pairs separated by an
  angle :math:`\theta` on the sky, regardless of distance. No redshift
  information is needed.
* **Projected correlation function** :math:`w_p(r_p)` — pairs at a transverse
  comoving separation :math:`r_p`, a physical length rather than an angle.
  The line-of-sight component is integrated out to remove the distortions
  that galaxy peculiar velocities introduce to measured distances.
* **Multipole decomposition** :math:`\xi_\ell(s)` — retains the directional
  anisotropy of the clustering signal to constrain the **growth rate** of
  large-scale structure.

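
The idea of an "excess probability compared with a uniform random
distribution" can be made concrete with a toy estimator. The sketch below
works on a flat 2-D patch with brute-force pair counts and the simple
"natural" estimator :math:`\xi = DD/RR - 1`. It is an illustration only, not
how ``sum_stat`` computes its statistics: the mock catalogue and all function
names are made up, and real analyses usually rely on tree-based pair counters
and lower-variance estimators such as Landy-Szalay.

.. code-block:: python

   import numpy as np


   def pair_counts(points, edges):
       """Histogram of separations over all distinct pairs of points."""
       diff = points[:, None, :] - points[None, :, :]
       dist = np.sqrt((diff ** 2).sum(axis=-1))
       upper = np.triu_indices(len(points), k=1)  # count each pair once
       counts, _ = np.histogram(dist[upper], bins=edges)
       return counts


   def natural_estimator(data, randoms, edges):
       """xi = DD/RR - 1, with pair counts normalised by the number of pairs."""
       n_d, n_r = len(data), len(randoms)
       dd = pair_counts(data, edges) / (n_d * (n_d - 1) / 2.0)
       rr = pair_counts(randoms, edges) / (n_r * (n_r - 1) / 2.0)
       return dd / rr - 1.0


   rng = np.random.default_rng(42)

   # Mock "galaxies": 20 clumps of 20 points each, scattered around
   # random centres in the unit square.
   centres = rng.uniform(0.0, 1.0, size=(20, 2))
   data = np.repeat(centres, 20, axis=0) + rng.normal(0.0, 0.01, size=(400, 2))

   # Unclustered reference catalogue covering the same area.
   randoms = rng.uniform(0.0, 1.0, size=(1000, 2))

   edges = np.array([0.0, 0.02, 0.05, 0.1, 0.2])
   xi = natural_estimator(data, randoms, edges)
   print(xi)

For this clumpy mock the estimate is strongly positive in the smallest
separation bin and much smaller at large separations, which is the
qualitative signature of clustering.
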

The amplitude and shape of these functions depend on how galaxies trace the
underlying dark matter distribution — this connection is what we ultimately
want to model.

Galaxy-galaxy lensing: weighing dark matter halos
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Weak gravitational lensing** is one of the most direct ways to measure mass
in the universe, including the invisible dark matter. When light from a
distant **source** galaxy passes near a massive **lens** galaxy or cluster,
the gravitational field slightly deflects the light path. This deflection
makes the source appear very slightly stretched in the tangential direction
around the lens (a distortion called *shear*), rather than drawn out into a
visible arc as in strong lensing. For a single lens–source pair the effect is
far too small to see, but by averaging over millions of pairs the coherent
shear signal becomes measurable.

The quantity we compute is the **excess surface density**:

.. math:: \Delta\Sigma(r_p) = \bar{\Sigma}(