Samplers

AMPy uses Markov Chain Monte Carlo (MCMC) sampling to estimate posterior distributions of model parameters given observational data. Samplers explore the parameter space according to the posterior probability:

\[\log p(\theta \mid D) = \log \mathcal{L}(D \mid \theta) + \sum_i \log p_i(\theta_i) + \text{const}\]
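As a minimal sketch of the expression above, the log-posterior can be assembled from a log-likelihood and independent per-parameter log-priors. The straight-line model, the Gaussian likelihood, the uniform priors, and every name below (log_uniform, log_likelihood, log_posterior) are illustrative assumptions and not part of AMPy.

```python
import math

def log_uniform(x, lo, hi):
    """Log-density of a uniform prior on [lo, hi]; -inf outside the bounds."""
    return -math.log(hi - lo) if lo <= x <= hi else -math.inf

def log_likelihood(theta, data, sigma):
    """Gaussian log-likelihood for a hypothetical straight-line model y = m*x + b."""
    m, b = theta
    norm = math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(-0.5 * ((y - (m * x + b)) / sigma) ** 2 - norm for x, y in data)

def log_posterior(theta, data, sigma, bounds):
    """log L(D | theta) + sum_i log p_i(theta_i), up to an additive constant."""
    log_prior = sum(log_uniform(t, lo, hi) for t, (lo, hi) in zip(theta, bounds))
    if not math.isfinite(log_prior):
        return -math.inf  # skip the likelihood when a prior excludes theta
    return log_prior + log_likelihood(theta, data, sigma)

data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]   # made-up (x, y) pairs
bounds = [(-10.0, 10.0), (-10.0, 10.0)]       # one prior range per parameter
inside = log_posterior((2.0, 1.0), data, 0.5, bounds)
outside = log_posterior((20.0, 1.0), data, 0.5, bounds)  # violates the first prior
```

Returning -inf as soon as a prior excludes the point avoids an unnecessary model evaluation, which is usually the expensive part of the posterior.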

AMPy provides high-level sampler classes that wrap well-established MCMC implementations while integrating tightly with the AMPy modeling framework. These samplers handle parameter transformations, priors, and model evaluation automatically, so users typically interact with them only through the ampy.AMPy interface or configuration files.

Two samplers are currently available:

EnsembleSampler — affine-invariant ensemble sampler based on emcee
PTSampler — parallel-tempered ensemble sampler based on ptemcee

Choosing a sampler

EnsembleSampler: Efficient for moderately correlated posteriors and typical afterglow parameter spaces.
PTSampler: Recommended for highly multimodal or degenerate posteriors. Parallel tempering allows chains to move between separated probability regions at the cost of additional computation.
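To illustrate why tempering helps with separated modes, the sketch below shows the standard swap test between two chains that sample the tempered posteriors L(D | θ)^β p(θ). The geometric temperature ladder and all function names are assumptions made for illustration; ptemcee builds and adapts its own ladder, which is not reproduced here.

```python
import math
import random

def geometric_betas(ntemps, ratio=2.0):
    """Inverse temperatures beta_k = ratio**(-k); beta_0 = 1 is the target posterior."""
    return [ratio ** (-k) for k in range(ntemps)]

def swap_log_ratio(beta_i, beta_j, log_like_i, log_like_j):
    """Log acceptance ratio for exchanging the states of chains i and j."""
    return (beta_i - beta_j) * (log_like_j - log_like_i)

def accept_swap(beta_i, beta_j, log_like_i, log_like_j, rng=random):
    """Metropolis test for the swap: accept with probability min(1, exp(log_ratio))."""
    log_ratio = swap_log_ratio(beta_i, beta_j, log_like_i, log_like_j)
    return log_ratio >= 0.0 or rng.random() < math.exp(log_ratio)

betas = geometric_betas(4)
# A hot chain (beta = 0.5) that found a better likelihood hands it to the cold chain.
always = accept_swap(betas[0], betas[1], log_like_i=-10.0, log_like_j=-2.0)
```

Hot chains see a flattened likelihood and cross low-probability valleys easily; swaps then pass their discoveries down to the β = 1 chain, whose samples are the ones used for inference.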

Configuration

The sampler is selected in the inference configuration:

[inference]
sampler     = "ensemble"
n_walkers   = 100
burn_length = 1000
run_length  = 2000

For parallel tempering:

[inference]
sampler     = "tempered"
n_walkers   = 100
n_temps     = 10
burn_length = 1000
run_length  = 2000

class ampy.inference.samplers.EnsembleSampler(nwalkers, ndim, log_prob_fn, args, **kw)

Bases: emcee.EnsembleSampler

Adapter for emcee.EnsembleSampler.

name = 'ensemble'

classmethod load(nwalkers, ndim, log_prob_fn, pool, args=None, **kw)

Instantiate the EnsembleSampler.

Parameters:

nwalkers (int) – The number of walkers in the ensemble.
ndim (int) – Number of dimensions in the parameter space.
log_prob_fn (callable) – A function that takes a vector in the parameter space as input and returns the natural logarithm of the posterior probability (up to an additive constant) for that position.
pool (ProcessPoolExecutor or ThreadPoolExecutor or nullcontext) – The pool to use for multithreading/processing.
args (iterable, optional) – A list of extra positional arguments for log_prob_fn. log_prob_fn will be called with the sequence log_prob_fn(p, *args, **kwargs).
kw (dict, optional) – Extra keyword arguments for log_prob_fn. log_prob_fn will be called with the sequence log_prob_fn(p, *args, **kwargs).

Returns:

The loaded ensemble sampler.

Return type:

ampy.inference.samplers.ensemble.EnsembleSampler
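The calling sequence log_prob_fn(p, *args, **kwargs) and the role of pool can be illustrated without emcee. In this sketch the posterior function, its arguments, and the walker positions are all hypothetical; it only mimics how extra arguments are forwarded and how a pool's map method spreads the evaluations across workers.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def log_prob_fn(p, data, sigma, prior_width=10.0):
    """Hypothetical log-posterior: Gaussian likelihood for a constant model
    with mean p[0], plus a flat prior on |p[0]| <= prior_width."""
    mu = p[0]
    if abs(mu) > prior_width:
        return -math.inf
    return sum(-0.5 * ((y - mu) / sigma) ** 2 for y in data)

args = ([1.0, 2.0, 3.0], 1.0)    # extra positional arguments: data, sigma
kwargs = {"prior_width": 5.0}    # extra keyword arguments

def wrapped(p):
    """Evaluate one walker using the documented calling sequence."""
    return log_prob_fn(p, *args, **kwargs)

walkers = [[0.0], [2.0], [7.0]]  # three walkers, ndim = 1

# pool.map evaluates the walkers concurrently and preserves their order.
with ThreadPoolExecutor(max_workers=2) as pool:
    log_probs = list(pool.map(wrapped, walkers))
```

A ProcessPoolExecutor would need wrapped and its arguments to be picklable, which is one reason to keep log_prob_fn a module-level function.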

save(path)

Saves the chain and log posterior to path.

It is highly recommended to use emcee.backends.HDFBackend instead of this method.

Parameters:

path (str or pathlib.Path) – The path to save the file.

draw_positions(params, **kwargs)

Draw the initial positions from the priors.

Parameters:

params (Parameters) – The MCMC parameters.
kwargs – For compatibility with PTSampler.draw_positions.

Returns:

The starting positions.

Return type:

np.ndarray of float with shape [nwalkers, ndim]
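A minimal sketch of drawing starting positions from priors, one draw per parameter and per walker. The prior list and the function name draw_from_priors are hypothetical stand-ins for the Parameters object, and the result is a nested list rather than the np.ndarray that AMPy returns.

```python
import random

def draw_from_priors(priors, nwalkers, seed=None):
    """Return nwalkers positions, each with one draw per prior (shape [nwalkers, ndim])."""
    rng = random.Random(seed)
    return [[draw(rng) for draw in priors] for _ in range(nwalkers)]

# Hypothetical priors: each entry maps a random generator to one sample.
priors = [
    lambda rng: rng.uniform(-3.0, 3.0),   # uniform prior
    lambda rng: rng.gauss(0.0, 1.0),      # Gaussian prior
    lambda rng: rng.uniform(0.0, 1.0),    # uniform prior on the unit interval
]

p0 = draw_from_priors(priors, nwalkers=8, seed=42)
```

Drawing from the priors guarantees that every walker starts with non-zero prior probability, although the likelihood may still be undefined at some of the drawn points.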

class ampy.inference.samplers.PTSampler(ntemps, nwalkers, ndim, log_like, log_prior, log_l_args=(), log_p_args=(), log_l_kwargs=(), log_p_kwargs=(), pool=None, **kwargs)

Bases: object

Provides an API adapter that matches emcee.

Parallel tempering is no longer supported in emcee and was removed from its official releases. Using the last emcee version that still includes a PTSampler forces users onto very old versions of the packages it depends on. A community-developed alternative, ptemcee, exists, but its authors stopped maintaining it years ago.

This class aims to provide an API that matches emcee. Methods are added only as needed, so the API is by no means complete.

Parameters:

ntemps (int) – The number of temperatures.
nwalkers (int) – The number of walkers.
ndim (int) – The number of fitting dimensions.
log_like (callable) – The log-likelihood function.
log_prior (callable) – The log-prior function.
log_l_args (array_like, optional) – Extra positional arguments for log_like.
log_p_args (array_like, optional) – Extra positional arguments for log_prior.
log_l_kwargs (array_like, optional) – Extra keyword arguments for log_like.
log_p_kwargs (array_like, optional) – Extra keyword arguments for log_prior.
pool (optional) – An object with a map method that follows the same calling sequence as Python's built-in map function. This is used to compute the log-probabilities in parallel.
kwargs – Any kwargs accepted by ptemcee.Sampler.

name = 'tempered'

property sampler

property chain

property iteration

property ntemps

property nwalkers

property ndim

property acor

property acceptance_fraction

property swap_acceptance_fraction

property lnprobability

run_mcmc(x0, iterations, **kwargs)

Perform MCMC sampling.

Parameters:

x0 (np.ndarray) – The initial position vector.
iterations (int) – The number of steps to run.
kwargs – Any kwargs accepted by ptemcee.Chain.

Returns:

The last samples.

Return type:

np.ndarray with shape [ntemps, nwalkers, ndim]

reset()

Overwrites the sampler with a new one.

ptemcee does not appear to provide a reset method. Overwriting with a new sampler is safer than attempting to reset attributes individually.

save(path)

Saves the chain and log posterior to path.

To load the data, use data = np.load(path). The chain is then available as data['chain'].

Parameters:

path (str or pathlib.Path) – The path to save the file.

draw_positions(params, engine, log_post_fn)

Draw the initial positions from the priors.

PTSampler requires that the start positions be valid. To meet this, any invalid position is overwritten with the best position for that temperature.

Parameters:

params (ampy.core.params.ParameterView) – The MCMC parameters.
engine (ampy.modeling.engine.ModelingEngine) – The MCMC models.
log_post_fn (func) – The log-posterior function.

Returns:

The starting positions.

Return type:

np.ndarray of float with shape [ntemps, nwalkers, ndim]
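The repair step described above can be sketched as follows. It assumes that "valid" means a finite log-posterior and that positions are stored as a nested list indexed [temperature][walker][dimension]; the function name repair_positions and the toy posterior are hypothetical and do not reflect AMPy's actual implementation.

```python
import math

def repair_positions(positions, log_post_fn):
    """Replace invalid walkers with the best walker at the same temperature."""
    repaired = []
    for temp_positions in positions:
        # Treat NaN and +/-inf alike so that max() always compares real scores.
        scores = []
        for p in temp_positions:
            lp = log_post_fn(p)
            scores.append(lp if math.isfinite(lp) else -math.inf)
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] == -math.inf:
            raise ValueError("no valid starting position at this temperature")
        repaired.append([
            list(p) if s > -math.inf else list(temp_positions[best])
            for p, s in zip(temp_positions, scores)
        ])
    return repaired

def toy_log_post(p):
    """Hypothetical posterior: standard normal truncated to |x| <= 2."""
    return -0.5 * p[0] ** 2 if abs(p[0]) <= 2.0 else -math.inf

start = [[[0.5], [3.0], [-1.0]],   # temperature 0: second walker is invalid
         [[5.0], [1.5], [0.1]]]    # temperature 1: first walker is invalid
fixed = repair_positions(start, toy_log_post)
```

Copying the best walker keeps every start position valid at the price of duplicated walkers, which spread apart once sampling begins.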

get_autocorr_time(): Returns the autocorrelation time for the 0th temperature.

get_last_sample(): Returns last samples with shape [ntemps, nwalkers, ndim].

get_value(name, flat=False, thin=1, discard=0, temp=0)

Get the attribute specified by name.

Parameters:

name (str) – Name of the attribute to retrieve.
flat (bool, optional, default=False) – Flatten the chain across the ensemble.
thin (int, optional, default=1) – Take only every thin-th step from the chain.
discard (int, optional, default=0) – Discard the first discard steps in the chain as burn-in.
temp (int, optional, default=0) – Index of the temperature to return. Defaults to index 0, which corresponds to the highest-probability temperature.

Return type:

np.ndarray

get_chain(**kwargs)

Get the stored chain of MCMC samples.

Parameters:

kwargs –

flat (bool, optional, default=False) – Flatten the chain across the ensemble.
thin (int, optional, default=1) – Take only every thin-th step from the chain.
discard (int, optional, default=0) – Discard the first discard steps in the chain as burn-in.
temp (int, optional, default=0) – Take only the chain at temperature index temp. Defaults to index 0, which corresponds to the highest-probability temperature.

Returns:

The samples contained in ptemcee.Chain.x.

Return type:

np.ndarray with shape […, nwalkers, ndim]
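The effect of discard, thin, and flat can be sketched on a toy chain for a single temperature. The [step][walker][dimension] layout, the nested-list representation, and the exact offset at which thinning starts are assumptions here; AMPy returns an np.ndarray and may follow a slightly different convention.

```python
def select_samples(chain, flat=False, thin=1, discard=0):
    """Apply burn-in removal, thinning, and optional flattening to one chain.

    chain is a nested list indexed as [step][walker][dimension].
    """
    kept = chain[discard::thin]   # drop burn-in, then keep every thin-th step
    if flat:
        # Merge the step and walker axes into a single list of samples.
        return [sample for step in kept for sample in step]
    return kept

# Toy chain: 10 steps, 2 walkers, 1 dimension.
chain = [[[float(s)], [float(s) + 0.5]] for s in range(10)]

steps = select_samples(chain, discard=4, thin=3)
samples = select_samples(chain, discard=4, thin=3, flat=True)
```

Flattened output is the form most plotting and summary tools expect, while the unflattened form preserves per-walker histories for convergence checks.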

get_log_prob(**kwargs)

Get the chain of log probabilities evaluated at the MCMC samples.

Parameters:

kwargs –

flat (bool, optional, default=False) – Flatten the chain across the ensemble.
thin (int, optional, default=1) – Take only every thin-th step from the chain.
discard (int, optional, default=0) – Discard the first discard steps in the chain as burn-in.
temp (int, optional, default=0) – Take only the log probabilities at temperature index temp. Defaults to index 0, which corresponds to the highest-probability temperature.

Returns:

The chain of log probabilities.

Return type:

np.ndarray with shape […, nwalkers]