Inference Engine

The inference engine is the main class for running MCMC inference with AMPy. Because the AMPy class provides a high-level method for running MCMC, most users will not need to interact with this class directly.

API Reference

ampy.inference.engine.get_pool_context(workers=None, executor='process')

If executor==process and workers>1:

Returns ProcessPoolExecutor(max_workers=workers).

Because each worker process must load everything into memory, use this only when a single likelihood evaluation takes about one second or more.

If executor==thread and workers>1:

Returns ThreadPoolExecutor(max_workers=workers).

Note that unless a free-threaded Python is installed, multithreading will not yield any benefits. Even if a no-GIL Python version is used, the performance increase depends on the likelihood implementation. Pure Python implementations will see a large performance increase. If the likelihood uses Cython, then it depends on how the code is compiled and optimized.

Parameters:

workers (int, optional, default=None) – The max number of workers.
executor (str, optional, default='process') – See the description above for details. Must be 'process' or 'thread'.

Returns:

The pool context manager.

Return type:

ProcessPoolExecutor or ThreadPoolExecutor or nullcontext
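
The selection logic described above can be sketched as follows. This is a minimal illustration, not AMPy's actual implementation: make_pool_context is a hypothetical stand-in, and the fall-through to nullcontext when workers is None or 1 is an assumption inferred from the documented return type.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from contextlib import nullcontext


def make_pool_context(workers=None, executor="process"):
    """Hypothetical stand-in for get_pool_context; not AMPy's actual code."""
    if executor not in ("process", "thread"):
        raise ValueError("executor must be 'process' or 'thread'")
    # Assumption: with no workers (or a single worker) there is nothing to
    # parallelize, so a do-nothing context manager is returned instead.
    if workers is None or workers <= 1:
        return nullcontext()
    if executor == "process":
        return ProcessPoolExecutor(max_workers=workers)
    return ThreadPoolExecutor(max_workers=workers)


# Every return value can be used in a `with` block.
with make_pool_context(workers=2, executor="thread") as pool:
    squares = list(pool.map(lambda x: x * x, range(4)))

# A nullcontext yields None, which a caller can treat as "run serially".
with make_pool_context(workers=1) as serial_pool:
    pass
```

Returning a context manager in all three cases lets the caller use a single `with` statement regardless of whether parallelism is enabled.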

class ampy.inference.engine.InferenceEngine(modeling_engine, param_view)

Bases: object

High-level driver for MCMC inference over a ModelingEngine.

It manages sampler construction, initialization of starting positions, optional burn-in, and the production run.

modeling_engine

The engine used for forward modeling.

Type: ampy.modeling.engine.ModelingEngine

param_view

Parameter and plugin view used by the sampler callbacks.

Type: ampy.core.params.ParameterView

sampler

The active sampler instance (e.g., EnsembleSampler or PTSampler).

Type: object or None

burn_chain

Copy of the burn-in chain, if burn-in was run and stored.

Type: np.ndarray or None

property ndim: The number of fitting dimensions.

set_sampler(sampler, nwalkers, pool, ntemps=None, **kwargs)

Construct and assign the sampler used for inference.

Parameters:

sampler ({"ensemble", "tempered"}) –

Sampler backend to use.

"ensemble" uses EnsembleSampler (affine-invariant ensemble sampling).
"tempered" uses PTSampler (parallel tempering).

nwalkers (int) – The number of walkers.
pool (ProcessPoolExecutor or ThreadPoolExecutor or nullcontext) – The pool to use for multithreading or multiprocessing.
ntemps (int, optional) – Number of temperatures (required for "tempered").
kwargs – Any keyword arguments accepted by the sampler.

set_start_positions(resume=False)

Determine starting positions for the sampler.

Parameters:

resume (bool, optional, default=False) – If True and the sampler exposes get_last_sample(), start from the last recorded sample (typically from a backend).

Returns:

Starting positions in the format expected by the active sampler.

Return type:

np.ndarray or emcee.State

run(nwalkers, iterations, burn=0, sampler='ensemble', workers=None, ntemps=None, sampler_kw=None, run_kw=None, resume=False)

Runs the MCMC sampling routine.

Parameters:

nwalkers (int) – The number of walkers.
iterations (int) – The number of post-burn-in iterations.
burn (int, optional, default=0) – The number of burn-in iterations. If burn>0, a copy of the burn-in chain is stored in self.burn_chain before the sampler is reset for the main run.
sampler (str, optional, default='ensemble') – Must be ensemble or tempered.
workers (int, optional, default=None) – The max number of workers to use.
ntemps (int, optional, default=None) – The number of temperatures for PTSampler.
sampler_kw (dict, optional) – Any kwargs accepted by the sampler.
run_kw (dict, optional) – Any kwargs accepted by run_mcmc.
resume (bool, optional, default=False) – If True, resume from a previous run.

Notes

Burn-in storage uses a deepcopy of the burn chain rather than deep-copying the sampler, because copying the sampler can detach or invalidate the pool used for parallelism.
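
The pattern in this note can be illustrated with a toy example. ToySampler below is a hypothetical stand-in with only a chain, a run method, and a reset method; it is not an emcee or AMPy class. The sketch copies the chain (plain data) before the reset and leaves the sampler object itself untouched.

```python
import copy


class ToySampler:
    """Hypothetical sampler stand-in: it only records a chain of steps."""

    def __init__(self):
        self.chain = []

    def run(self, iterations):
        # Append one fake one-dimensional sample per iteration.
        for i in range(iterations):
            self.chain.append([float(i)])

    def reset(self):
        # Discard everything recorded so far.
        self.chain.clear()


sampler = ToySampler()

# Burn-in phase: run, then keep an independent copy of the chain only.
sampler.run(3)
burn_chain = copy.deepcopy(sampler.chain)

# Reset the same sampler object and start the production run.
sampler.reset()
sampler.run(5)
```

Because burn_chain is a deep copy, clearing the sampler does not affect it, and the sampler (with whatever pool it holds) is never copied.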

summary()

Summarize the current sampling run.

Returns:

Dictionary containing:

sampler: sampler name
burn_iters: number of burn-in iterations stored (0 if none)
prod_iters: number of production iterations completed
nwalkers: number of walkers
nmap_idx: index of the maximum a-posteriori (MAP) sample in the flattened log-probability array
nmap_val: -2 * max(log_prob), which is often used as a deviance-like quantity (smaller is better)

Return type:

dict
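
As an illustration of the last two entries, the sketch below finds the MAP index in a flattened log-probability list and converts the maximum to the -2 * max(log_prob) quantity. summarize_log_prob is a hypothetical helper written for this example, and the input values are made up; only the key names come from the description above.

```python
def summarize_log_prob(flat_log_prob):
    """Hypothetical helper: MAP index and -2 * max(log_prob) for a flat list."""
    # Index of the largest log-probability (first occurrence on ties).
    map_idx = max(range(len(flat_log_prob)), key=flat_log_prob.__getitem__)
    return {
        "nmap_idx": map_idx,
        "nmap_val": -2.0 * flat_log_prob[map_idx],
    }


# Made-up log-probabilities for four flattened samples.
flat_log_prob = [-12.5, -3.0, -7.25, -3.5]
result = summarize_log_prob(flat_log_prob)
```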

ampy.inference.engine.log_posterior_model(theta, engine, param_view)

Calculates the natural log of the posterior probability.

The posterior probability is the probability of the parameters, theta, given the evidence X, denoted p(theta | X).

Parameters:

theta (np.ndarray of float, with length of params.fitting) – The MCMC sampled values.
engine (ampy.modeling.engine.ModelingEngine) – The MCMC modeling engine.
param_view (ampy.core.params.ParameterView) – The parameter view used to map the MCMC samples to dicts.

Returns:

The natural log of the posterior.

Return type:

float
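
Up to an additive constant, the log posterior is the sum of the log prior and the log likelihood. The sketch below shows that combination with hypothetical callables (flat_prior and toy_likelihood are invented for this example). Skipping the likelihood when the prior is -inf is a common pattern and is assumed here; it is not confirmed as AMPy's behavior.

```python
import math


def log_posterior(theta, log_prior, log_likelihood):
    """Generic sketch: log prior + log likelihood, up to a constant."""
    lp = log_prior(theta)
    # Assumed shortcut: skip the likelihood when the prior rules theta out.
    if not math.isfinite(lp):
        return -math.inf
    return lp + log_likelihood(theta)


def flat_prior(theta):
    # Hypothetical unnormalized flat prior on (-10, 10) for every parameter.
    return 0.0 if all(-10.0 < t < 10.0 for t in theta) else -math.inf


def toy_likelihood(theta):
    # Hypothetical likelihood: standard normal in each parameter, unnormalized.
    return -0.5 * sum(t * t for t in theta)


inside = log_posterior([1.0, 2.0], flat_prior, toy_likelihood)
outside = log_posterior([1.0, 20.0], flat_prior, toy_likelihood)
```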

ampy.inference.engine.log_prior_model(theta, params)

Evaluates the natural log of the priors.

Parameters:

theta (np.ndarray of float, with length of params.fitting) – The sampled MCMC parameter values.
params (Parameters) – MCMC parameter container.

Returns:

The log of the evaluated priors.

Return type:

float
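
For independent parameters, the log prior is the sum of the per-parameter log priors. The sketch below assumes one prior callable per entry of theta; log_uniform, log_gaussian, and the priors list are hypothetical and do not reflect how AMPy's Parameters container stores priors.

```python
import math


def log_uniform(x, low, high):
    # Normalized uniform density on [low, high]; -inf outside the bounds.
    if low <= x <= high:
        return -math.log(high - low)
    return -math.inf


def log_gaussian(x, mean, sigma):
    # Normalized Gaussian log-density.
    z = (x - mean) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_prior(theta, priors):
    """Sum one log-prior term per parameter (assumes independent priors)."""
    return sum(prior(value) for value, prior in zip(theta, priors))


# Hypothetical prior choice: uniform on [0, 2], then a unit Gaussian.
priors = [
    lambda x: log_uniform(x, 0.0, 2.0),
    lambda x: log_gaussian(x, 0.0, 1.0),
]

in_bounds = log_prior([1.0, 0.0], priors)
out_of_bounds = log_prior([3.0, 0.0], priors)
```

A single out-of-bounds parameter makes the whole sum -inf, so the sampler rejects that position.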

ampy.inference.engine.log_likelihood_model(theta, engine, param_view)

Calculates the natural log of the likelihood.

Parameters:

theta (np.ndarray of float) – The MCMC sampled values.
engine (ampy.modeling.engine.ModelingEngine) – The MCMC Modeling engine.
param_view (ampy.core.params.ParameterView) – The parameter viewer used to map the MCMC samples to dicts.

Returns:

The log of the likelihood if the parameters are valid; otherwise -np.inf.

Return type:

float or -np.inf
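
The return convention above can be sketched with a toy Gaussian likelihood for a straight-line model. Everything in this example is an assumption made for illustration: the model, the data, and the finiteness check standing in for AMPy's parameter validation, which the documentation does not describe.

```python
import math


def gaussian_log_likelihood(theta, xs, ys, sigmas):
    """Toy Gaussian log-likelihood for y = slope * x + intercept."""
    slope, intercept = theta
    # Placeholder validity check (assumption): reject non-finite parameters.
    if not (math.isfinite(slope) and math.isfinite(intercept)):
        return -math.inf
    total = 0.0
    for x, y, s in zip(xs, ys, sigmas):
        resid = (y - (slope * x + intercept)) / s
        total += -0.5 * resid * resid - math.log(s * math.sqrt(2.0 * math.pi))
    return total


# Made-up data lying exactly on y = 2x + 1 with unit uncertainties.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]
sigmas = [1.0, 1.0, 1.0]

valid = gaussian_log_likelihood([2.0, 1.0], xs, ys, sigmas)
invalid = gaussian_log_likelihood([math.nan, 1.0], xs, ys, sigmas)
```

Returning -inf instead of raising lets the sampler treat an invalid position as having zero probability and move on.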