Simulating stellar populations

Many astronomical investigations require simulating populations of stars, and isochrones contains some utilities to help enable this. Given population distributions of the quantities required to simulate individual stars, a StarPopulation object can be defined and used to generate sample populations following this distribution. Binary stars, ubiquitous as they are, are necessarily built into this framework, so the parameters needed to simulate an individual stellar observation are the following:

\[M_A, M_B, T, [Fe/H], d, A_V\]

where \(M_A, M_B\) are the primary and (if present) secondary masses, \(T\) is age, \([Fe/H]\) is the metalicity, \(d\) is distance, and \(A_V\) is the \(V\)-band extinction, quantifying the effect of dust along the line of sight. Generating a population of such stars then requires sampling from distributions of each of the above quantities. A StarPopulation takes metallicity, distance, and extinction distributions as arguments, and samples from each of those distributions when generating a sample population.

Sampling primary/secondary masses and ages is a bit less straightforward. For \(M_A, M_B\), isochrones parametrizes the distribution with a primary initial mass function (IMF), binary fraction \(f_B\), and mass-ratio (\(q = M_B/M_A\)) distribution \(p(q) \propto q^\gamma\). The age distribution of stars in a population is often described as a “star-formation history” (SFH)—sampling a population with a given SFH is the same as treating the SFH as the probability distribution function of stellar age, sampling ages from this distribution, and then truncating any stars that have reached the end of their evolution. Practically, this truncation happens by rejection sampling: evaluating the ModelGridInterpolator at each sampled set of parameters, and rejecting samples for which the interpolator returns np.nan values for the observed stellar properties (which will happen when trying to interpolate out-of-bounds, which happens when a star is requested beyond the end of its lifetime).

StarPopulation object

Here is an example of StarPopulation usage:

from scipy.stats import uniform, norm
from isochrones import get_ichrone
from isochrones.priors import GaussianPrior, SalpeterPrior, DistancePrior, FlatPrior
from isochrones.populations import StarFormationHistory, StarPopulation

# Initialize interpolator
mist = get_ichrone('mist')

# Initialize distributions

# Ingredients required to generate primary & secondary masses
imf = SalpeterPrior(bounds=(1, 10))  # minimum 1 Msun
fB = 0.4
gamma = 0.3

# SFH distribution takes a scipy stats distribution, of age in Gyr
sfh = StarFormationHistory(dist=uniform(0, 10))

# The following are all isochrones.priors.Prior objects,
# or anything with a .sample(N) method
feh = GaussianPrior(-0.2, 0.2)
distance = DistancePrior(max_distance=3000)
AV = FlatPrior(bounds=[0, 1])

pop = StarPopulation(mist, imf=imf, fB=fB, gamma=gamma, sfh=sfh, feh=feh, distance=distance, AV=AV)

Once the object is created, it can be used to generate a population of stars.

df = pop.generate(1000)
mass_0 logg_0 delta_nu_0 initial_mass_0 phase_0 eep_0 radius_0 Mbol_0 logTeff_0 feh_0 ... W1_mag A_W1 W2_mag A_W2 W3_mag A_W3 TESS_mag A_TESS Kepler_mag A_Kepler
0 1.553332 4.380359 102.670924 1.553406 0.0 299.894473 1.332984 2.451403 3.927937 -0.327345 ... 14.208519 0.014534 14.203228 0.008647 14.194746 0.002359 14.477501 0.161784 14.609700 0.225833
1 1.549665 4.211669 78.515862 1.549955 0.0 340.291660 1.617098 2.454235 3.885739 -0.111990 ... 13.706165 0.048358 13.686343 0.028772 13.662330 0.007847 14.462563 0.529039 14.812933 0.730147
2 1.127802 4.009138 66.783488 1.128399 0.0 447.067891 1.740697 3.206514 3.794381 -0.366979 ... 14.346362 0.010542 14.338646 0.006274 14.318684 0.001710 15.193251 0.114144 15.567852 0.155572
3 1.046129 4.299633 109.308356 1.046413 0.0 384.695516 1.199744 3.803016 3.815517 -0.668881 ... 14.342954 0.028781 14.327035 0.017125 14.301296 0.004667 15.269802 0.311150 15.679683 0.424941
4 1.267605 3.757970 42.600593 1.268362 2.0 460.286164 2.463683 2.543589 3.785193 -0.307018 ... 13.221399 0.041585 13.205129 0.024749 13.170042 0.006745 14.403408 0.445389 14.910002 0.604260

5 rows × 110 columns

Note that this operation is not nearly as fast as directly interpolating an isochrone or evolution track grid (since generating properites given mass, age, and metallicity necessarily involves solving for EEP first):

%timeit pop.generate(1000)
1.24 s ± 152 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Also, this can be made much faster if you loosen the requirement on getting exactly a particularly desired number of stars (as part of the generating algorithm involves replacing stars that come out as nan until no nans are left):

print(len(pop.generate(1000, exact_N=False)))

%timeit pop.generate(1000, exact_N=False)
64.9 ms ± 381 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

The full column list of this table of simulated stars is the following:

', '.join(df.columns)
'mass_0, logg_0, delta_nu_0, initial_mass_0, phase_0, eep_0, radius_0, Mbol_0, logTeff_0, feh_0, density_0, nu_max_0, logL_0, Teff_0, interpolated_0, star_age_0, age_0, dt_deep_0, J_mag_0, H_mag_0, K_mag_0, G_mag_0, BP_mag_0, RP_mag_0, W1_mag_0, W2_mag_0, W3_mag_0, TESS_mag_0, Kepler_mag_0, distance_0, AV_0, initial_feh_0, requested_age_0, A_J_0, A_H_0, A_K_0, A_G_0, A_BP_0, A_RP_0, A_W1_0, A_W2_0, A_W3_0, A_TESS_0, A_Kepler_0, mass_1, logg_1, delta_nu_1, initial_mass_1, phase_1, eep_1, radius_1, Mbol_1, logTeff_1, feh_1, density_1, nu_max_1, logL_1, Teff_1, interpolated_1, star_age_1, age_1, dt_deep_1, J_mag_1, H_mag_1, K_mag_1, G_mag_1, BP_mag_1, RP_mag_1, W1_mag_1, W2_mag_1, W3_mag_1, TESS_mag_1, Kepler_mag_1, distance_1, AV_1, initial_feh_1, requested_age_1, A_J_1, A_H_1, A_K_1, A_G_1, A_BP_1, A_RP_1, A_W1_1, A_W2_1, A_W3_1, A_TESS_1, A_Kepler_1, J_mag, A_J, H_mag, A_H, K_mag, A_K, G_mag, A_G, BP_mag, A_BP, RP_mag, A_RP, W1_mag, A_W1, W2_mag, A_W2, W3_mag, A_W3, TESS_mag, A_TESS, Kepler_mag, A_Kepler'

All quantities with a tag _0 refer to the primary star; all quantities with _1 refer to the secondary. Columns ending in just _mag represent the combined magnitude of both primary and secondary component. Let’s look the Gaia color-magnitude diagram for this simulated population. Note also the A_[x] columns, which give the specific extinction per band for each system (and for the individual components of the binary).

import holoviews as hv
import hvplot.pandas

def hr_plot(df):
    df['BpRp'] = df.BP_mag - df.RP_mag
    hr = df.hvplot.scatter('BpRp', 'G_mag',
                           hover_cols=['mass_0', 'mass_1', 'age_0', 'AV_0'],
    return hr.options(height=400, width=500, invert_yaxis=True)