Inference Engine

The inference engine is the main class for running MCMC inference with AMPy. Because the AMPy class provides a high-level method for running MCMC, most users will not need to interact with this class directly.

API Reference

ampy.inference.engine.get_pool_context(workers=None, executor='process')
If executor == 'process' and workers > 1:

Returns ProcessPoolExecutor(max_workers=workers).

Because each worker process must load everything into memory, this should only be used if a single likelihood evaluation takes about one second or more.

If executor == 'thread' and workers > 1:

Returns ThreadPoolExecutor(max_workers=workers).

Note that unless a free-threaded (no-GIL) Python build is installed, multithreading will not yield any benefit. Even with a free-threaded build, the performance gain depends on the likelihood implementation. Pure-Python implementations will see a large gain. If the likelihood uses Cython, the gain depends on how the code is compiled and optimized.

Parameters:
  • workers (int, optional, default=None) – The max number of workers.

  • executor (str, optional, default='process') – The executor type; see the description above for details. Must be 'process' or 'thread'.

Returns:

The pool context manager.

Return type:

ProcessPoolExecutor or ThreadPoolExecutor or nullcontext
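
The selection logic described above can be sketched with the standard library alone. This is an illustrative re-implementation, not AMPy's source: the name make_pool_context is hypothetical, and the fallback to nullcontext when workers is None or 1 is an assumption inferred from the documented return type.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from contextlib import nullcontext


def make_pool_context(workers=None, executor="process"):
    """Hypothetical stand-in for get_pool_context (not AMPy's code)."""
    if executor not in ("process", "thread"):
        raise ValueError("executor must be 'process' or 'thread'")

    # Assumption: with no workers or a single worker, run serially.
    if workers is None or workers <= 1:
        return nullcontext()

    if executor == "process":
        return ProcessPoolExecutor(max_workers=workers)
    return ThreadPoolExecutor(max_workers=workers)


def square(x):
    """Toy workload standing in for a likelihood evaluation."""
    return x * x


if __name__ == "__main__":
    with make_pool_context(workers=2, executor="thread") as pool:
        # nullcontext() yields None, so fall back to the built-in map.
        mapper = pool.map if pool is not None else map
        print(list(mapper(square, range(4))))
```

Because nullcontext yields None from its with statement, calling code has to check for None before using pool.map, as the usage lines do.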

class ampy.inference.engine.InferenceEngine(modeling_engine, param_view)

Bases: object

High-level driver for MCMC inference over a ModelingEngine.

It manages sampler construction, initialization of starting positions, optional burn-in, and the production run.

modeling_engine

The engine used for forward modeling.

Type:

ampy.modeling.engine.ModelingEngine

param_view

Parameter and plugin view used by the sampler callbacks.

Type:

ampy.core.params.ParameterView

sampler

The active sampler instance (e.g., EnsembleSampler or PTSampler).

Type:

object or None

burn_chain

Copy of the burn-in chain, if burn-in was run and stored.

Type:

np.ndarray or None

property ndim

The number of fitting dimensions.

set_sampler(sampler, nwalkers, pool, ntemps=None, **kwargs)

Construct and assign the sampler used for inference.

Parameters:

  • sampler ({"ensemble", "tempered"}) – Sampler backend to use.

  • nwalkers (int) – The number of walkers.

  • pool (ProcessPoolExecutor or ThreadPoolExecutor or nullcontext) – The pool to use for multithreading/processing.

  • ntemps (int, optional) – Number of temperatures (required for "tempered").

  • kwargs – Any kwargs accepted by the sampler.

set_start_positions(resume=False)

Determine starting positions for the sampler.

Parameters:

resume (bool, optional, default=False) – If True and the sampler exposes get_last_sample(), start from the last recorded sample (typically from a backend).

Returns:

Starting positions in the format expected by the active sampler.

Return type:

np.ndarray or emcee.State

run(nwalkers, iterations, burn=0, sampler='ensemble', workers=None, ntemps=None, sampler_kw=None, run_kw=None, resume=False)

Runs the MCMC sampling routine.

Parameters:
  • nwalkers (int) – The number of walkers.

  • iterations (int) – The number of post-burn-in iterations.

  • burn (int, optional, default=0) – The number of burn-in iterations. If burn > 0, a copy of the burn-in chain is stored in self.burn_chain before the sampler is reset for the main run.

  • sampler (str, optional, default='ensemble') – Must be ensemble or tempered.

  • workers (int, optional, default=None) – The max number of workers to use.

  • ntemps (int, optional, default=None) – The number of temperatures for PTSampler. Required when sampler is tempered.

  • sampler_kw (dict, optional) – Any kwargs accepted by the sampler.

  • run_kw (dict, optional) – Any kwargs accepted by run_mcmc.

  • resume (bool, optional, default=False) – If True, resume from a previous run.

Notes

Burn-in storage uses a deepcopy of the burn chain rather than deep-copying the sampler, because copying the sampler can detach or invalidate the pool used for parallelism.
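
The pattern in this note can be illustrated with a toy sampler. Everything below is a hypothetical sketch rather than AMPy's implementation: ToySampler and run_with_burn are invented names, and the "sampling" is a deterministic walk so that the behavior is easy to follow.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


class ToySampler:
    """Hypothetical sampler that holds a pool and an in-memory chain."""

    def __init__(self, pool):
        self.pool = pool
        self.chain = []

    def run(self, start, iterations):
        # Stand-in for real sampling: record a deterministic walk.
        for i in range(iterations):
            self.chain.append([start + i])
        return self.chain[-1][0]

    def reset(self):
        self.chain.clear()


def run_with_burn(sampler, start, iterations, burn=0):
    """Run burn-in (if any), keep a copy of its chain, then run production."""
    burn_chain = None
    if burn > 0:
        start = sampler.run(start, burn)
        # Copy only the chain; the sampler and its pool are left untouched.
        burn_chain = copy.deepcopy(sampler.chain)
        sampler.reset()
    sampler.run(start, iterations)
    return burn_chain


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        sampler = ToySampler(pool)
        burn_chain = run_with_burn(sampler, start=0, iterations=3, burn=2)
        print(len(burn_chain), len(sampler.chain), sampler.pool is pool)
```

The copied burn chain survives the reset, and the sampler keeps its original pool reference for the production run.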

summary()

Summarize the current sampling run.

Returns:

Dictionary containing:

  • sampler: sampler name

  • burn_iters: number of burn-in iterations stored (0 if none)

  • prod_iters: number of production iterations completed

  • nwalkers: number of walkers

  • nmap_idx: index of the maximum a posteriori (MAP) sample in the flattened log-probability array

  • nmap_val: -2 * max(log_prob), which is often used as a deviance-like quantity (smaller is better)

Return type:

dict
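
As a rough illustration of the two MAP entries, the sketch below computes them from a log-probability array with NumPy. The helper name map_summary is hypothetical, and the (iterations, walkers) array layout is an assumption; only the flattening, the argmax, and the factor of -2 come from the description above.

```python
import numpy as np


def map_summary(log_prob):
    """Hypothetical helper mirroring the nmap_idx and nmap_val entries."""
    flat = np.asarray(log_prob, dtype=float).ravel()
    idx = int(np.argmax(flat))  # position of the MAP sample
    return {"nmap_idx": idx, "nmap_val": float(-2.0 * flat[idx])}


if __name__ == "__main__":
    # Assumed layout: 2 iterations x 2 walkers.
    log_prob = np.array([[-5.0, -3.5], [-2.0, -4.0]])
    print(map_summary(log_prob))
```

Because the index refers to the flattened array, it can be applied directly to a flattened chain with the same ordering to recover the MAP parameter vector.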

ampy.inference.engine.log_posterior_model(theta, engine, param_view)

Calculates the natural log of the posterior probability.

The posterior probability, p(theta | X), is the probability of the parameters theta given the observed data X.

Parameters:
  • theta (np.ndarray of float, with length of params.fitting) – The MCMC sampled values.

  • engine (ampy.modeling.engine.ModelingEngine) – The MCMC modeling engine.

  • param_view (ampy.core.params.ParameterView) – The parameter view used to map the MCMC samples to dicts.

Returns:

The natural log of the posterior.

Return type:

float
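
By Bayes' theorem, the log-posterior equals the log-prior plus the log-likelihood, up to an additive constant that MCMC sampling does not need. The sketch below shows that composition with toy ingredients. All names are hypothetical, and returning early when the prior is not finite is a common convention assumed here, not a confirmed detail of AMPy's implementation.

```python
import math


def log_posterior(theta, log_prior, log_likelihood):
    """Combine a log-prior and a log-likelihood into a log-posterior."""
    lp = log_prior(theta)
    # Skip the (usually expensive) likelihood when the prior excludes theta.
    if not math.isfinite(lp):
        return -math.inf
    return lp + log_likelihood(theta)


def flat_prior(theta):
    """Toy prior: flat on [0, 10] in every dimension."""
    return 0.0 if all(0.0 <= t <= 10.0 for t in theta) else -math.inf


def gaussian_like(theta):
    """Toy likelihood: unit-variance Gaussian centred on 5, constants dropped."""
    return -0.5 * sum((t - 5.0) ** 2 for t in theta)


if __name__ == "__main__":
    print(log_posterior([5.0, 6.0], flat_prior, gaussian_like))
    print(log_posterior([-1.0, 6.0], flat_prior, gaussian_like))
```

Returning -inf for out-of-bounds parameters makes the sampler reject the proposal without ever evaluating the forward model.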

ampy.inference.engine.log_prior_model(theta, params)

Evaluates the natural log of the priors.

Parameters:
  • theta (np.ndarray of float, with length of params.fitting) – The sampled MCMC parameter values.

  • params (Parameters) – MCMC parameter container.

Returns:

The log of the evaluated priors.

Return type:

float

ampy.inference.engine.log_likelihood_model(theta, engine, param_view)

Calculates the natural log of the likelihood.

Parameters:
  • theta (np.ndarray of float) – The MCMC sampled values.

  • engine (ampy.modeling.engine.ModelingEngine) – The MCMC modeling engine.

  • param_view (ampy.core.params.ParameterView) – The parameter view used to map the MCMC samples to dicts.

Returns:

The log of the likelihood if the parameters are valid; otherwise, -np.inf.

Return type:

float or -np.inf