threeML.bayesian package

Submodules

threeML.bayesian.autoemcee_sampler module

class threeML.bayesian.autoemcee_sampler.AutoEmceeSampler(likelihood_model=None, data_list=None, **kwargs)[source]

Bases: UnitCubeSampler

sample(quiet=False)[source]

Sample the posterior using the autoemcee ensemble MCMC method.

Returns:

setup(num_global_samples=10000, num_chains=4, num_walkers=None, max_ncalls=1000000, max_improvement_loops=4, num_initial_steps=100, min_autocorr_times=0)[source]

Sample until MCMC chains have converged.

The steps are:

  1. Draw num_global_samples from prior. The highest num_walkers points are selected.

  2. Set num_steps to num_initial_steps

  3. Run num_chains MCMC ensembles for num_steps steps

  4. For each walker chain, compute auto-correlation length (Convergence requires num_steps/autocorrelation length > min_autocorr_times)

  5. For each parameter, compute the Geweke convergence diagnostic (Convergence requires |z| < 2)

  6. For each ensemble, compute the Gelman-Rubin rank convergence diagnostic (Convergence requires rhat < 1.2)

  7. If converged, stop and return results.

  8. Increase num_steps by 10, and repeat from (3) up to max_improvement_loops times.
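As a rough illustration of the check in step 6, the sketch below computes the classic Gelman-Rubin statistic for a single parameter across several chains and compares it with the 1.2 threshold. It is not the code autoemcee runs: the list above refers to a rank-based variant, and the function name gelman_rubin and the synthetic chains are assumptions made for this example.

```python
import random
import statistics


def gelman_rubin(chains):
    """Classic (non-rank) Gelman-Rubin R-hat for one parameter.

    chains: list of equal-length lists of samples, one list per chain.
    """
    n = len(chains[0])
    # W: average of the within-chain variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: variance of the per-chain means.
    between_over_n = statistics.variance([statistics.fmean(c) for c in chains])
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * within + between_over_n
    return (var_hat / within) ** 0.5


rng = random.Random(0)
# Four chains drawn from the same distribution: should look converged.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
# Four chains stuck around different values: should fail the check.
stuck = [[rng.gauss(3.0 * k, 1.0) for _ in range(500)] for k in range(4)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(rhat_mixed < 1.2, rhat_stuck < 1.2)
```

A value close to 1 means the chains agree with each other; chains sitting in different regions inflate the between-chain term and push rhat well above the threshold.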

num_global_samples: int

Number of samples to draw from the prior to select the initial walker positions (see step 1)

num_chains: int

Number of independent ensembles to run. If running with MPI, this is set to the number of MPI processes.

num_walkers: int

Ensemble size. If None, max(100, 4 * dim) is used

max_ncalls: int

Maximum number of likelihood function evaluations

num_initial_steps: int

Number of sampler steps to take in first iteration

max_improvement_loops: int

Number of times MCMC should be re-attempted (see above)

min_autocorr_times: float

If positive, additionally require for convergence that the number of samples is larger than min_autocorr_times times the autocorrelation length.

threeML.bayesian.bayesian_analysis module

class threeML.bayesian.bayesian_analysis.BayesianAnalysis(likelihood_model: Model, data_list: DataList, **kwargs)[source]

Bases: object

property analysis_type: str
convergence_plots(n_samples_in_each_subset, n_subsets)[source]

Compute the mean and variance for subsets of the samples, and plot them. They should all be around the same values if the MCMC has converged to the posterior distribution.

The subsamples are taken with two different strategies: the first is to slide a fixed-size window, the second is to take random samples from the chain (bootstrap)

Parameters:
  • n_samples_in_each_subset – number of samples in each subset

  • n_subsets – number of subsets to take for each strategy

Returns:

a matplotlib.figure instance
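The two subsampling strategies can be sketched in plain Python as follows. This is only an illustration of the idea, not the threeML implementation: the helper name subset_stats, the evenly spaced window positions and the synthetic chain are assumptions, and the plotting step is left out.

```python
import random
import statistics


def subset_stats(samples, n_samples_in_each_subset, n_subsets, seed=0):
    """Return (sliding, bootstrap): two lists of (mean, variance) pairs."""
    size = n_samples_in_each_subset
    if size > len(samples):
        raise ValueError("subset size exceeds number of samples")

    # Strategy 1: slide a fixed-size window along the chain.
    # Start indices are spread evenly (an assumption of this sketch).
    stride = (len(samples) - size) // max(1, n_subsets - 1)
    windows = [samples[i * stride: i * stride + size] for i in range(n_subsets)]

    # Strategy 2: bootstrap, i.e. draw with replacement from the whole chain.
    rng = random.Random(seed)
    boots = [rng.choices(samples, k=size) for _ in range(n_subsets)]

    def summarize(subsets):
        return [(statistics.fmean(s), statistics.variance(s)) for s in subsets]

    return summarize(windows), summarize(boots)


rng = random.Random(1)
chain = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # stand-in for one parameter's samples
sliding, bootstrap = subset_stats(chain, n_samples_in_each_subset=200, n_subsets=10)
```

For a converged chain the means and variances in both lists scatter around the same values; a drift in the sliding-window results that the bootstrap results do not show suggests the chain is still moving.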

property data_list: DataList

Returns:

the data list for this analysis

property likelihood_model: Model

Returns:

the likelihood model (a Model instance)

property log_like_values: Optional[ndarray]

Returns the values of the log-likelihood found by the Bayesian sampler while sampling from the posterior. If you need to find the values of the parameters which generated a given value of the log-likelihood, remember that the samples accessible through the property .raw_samples are ordered in the same way as the vector returned by this method.

Returns:

a vector of log. like values
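Because the two arrays share the same ordering, an index found in one can be used to look up the other. The snippet below shows this with small made-up lists standing in for .raw_samples and .log_like_values; the numbers and variable names are purely illustrative.

```python
# Stand-ins for bayes.raw_samples (one row per sample) and bayes.log_like_values.
raw_samples = [
    [1.0, 0.2],
    [1.5, 0.1],
    [0.9, 0.3],
]
log_like_values = [-12.3, -10.1, -15.0]

# Index of the sample with the highest log-likelihood.
best_index = max(range(len(log_like_values)), key=log_like_values.__getitem__)

# The same index selects the parameter values that produced it.
best_parameters = raw_samples[best_index]
print(best_index, best_parameters)
```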

property log_marginal_likelihood: Optional[float]

Return the log marginal likelihood (evidence), if computed.

property log_probability_values: Optional[ndarray]

Returns the values of the log probability (posterior) found by the Bayesian sampler while sampling from the posterior. If you need to find the values of the parameters which generated a given value of the log probability, remember that the samples accessible through the property .raw_samples are ordered in the same way as the vector returned by this method.

Returns:

a vector of log probability values

plot_chains(thin=None)[source]

Produce a plot of the series of samples for each parameter

Parameters:

thin – use only one sample every ‘thin’ samples

Returns:

a matplotlib.figure instance

property raw_samples: Optional[ndarray]

Access the samples from the posterior distribution generated by the selected sampler in raw form (i.e., in the format returned by the sampler)

Returns:

the samples as returned by the sampler

restore_MAP_fit() None[source]

Sets the model parameters to the MAP of the probability

restore_median_fit()[source]

Sets the model parameters to the median of the marginal distributions

property results: Optional[BayesianResults]
sample(quiet=False) None[source]

sample the posterior of the model with the selected algorithm

If no algorithm has been set, then the configured default algorithm with default parameters will be run

Parameters:

quiet – if True, then no output is displayed

Returns:

property sampler: Optional[SamplerBase]

Access the instance of the sampler used to sample the posterior distribution

Returns:

an instance of the sampler

property samples: Optional[Dict[str, ndarray]]

Access the samples from the posterior distribution generated by the selected sampler

Returns:

a dictionary with the samples from the posterior distribution for each parameter

set_sampler(sampler_name: str = 'default', **kwargs)[source]

Set the sampler

Parameters:
  • sampler_name (str) – Name of sampler

  • share_spectrum – (optional) Option to share the spectrum calculation between detectors with the same input energy bins

threeML.bayesian.dynesty_sampler module

class threeML.bayesian.dynesty_sampler.DynestyDynamicSampler(likelihood_model=None, data_list=None, **kwargs)[source]

Bases: UnitCubeSampler

sample(quiet=False)[source]

Sample the posterior using the dynesty dynamic nested sampling method.

Returns:

setup(nlive_init=500, maxiter_init=None, maxcall_init=None, dlogz_init=0.01, logl_max_init=inf, n_effective_init=inf, nlive_batch=500, wt_function=None, wt_kwargs=None, maxiter_batch=None, maxcall_batch=None, maxiter=None, maxcall=None, maxbatch=None, n_effective=inf, stop_function=None, stop_kwargs=None, use_stop=True, save_bounds=True, print_func=None, live_points=None, bound='multi', sample='auto', periodic=None, reflective=None, update_interval=None, first_update=None, npdim=None, rstate=None, use_pool=None, logl_args=None, logl_kwargs=None, ptform_args=None, ptform_kwargs=None, gradient=None, grad_args=None, grad_kwargs=None, compute_jac=False, enlarge=None, bootstrap=0, walks=25, facc=0.5, slices=5, fmove=0.9, max_move=100, update_func=None, **kwargs)[source]

TODO describe function

Parameters:
  • nlive_init

  • maxiter_init

  • maxcall_init

  • dlogz_init

  • logl_max_init

  • n_effective_init

  • nlive_batch

  • wt_function

  • wt_kwargs

  • maxiter_batch

  • maxcall_batch

  • maxiter

  • maxcall

  • maxbatch

  • n_effective

  • stop_function

  • stop_kwargs

  • use_stop

  • save_bounds

  • print_func

  • live_points

  • bound

  • sample

  • periodic

  • reflective

  • update_interval

  • first_update

  • npdim

  • rstate

  • use_pool

  • logl_args

  • logl_kwargs

  • ptform_args

  • ptform_kwargs

  • gradient

  • grad_args

  • grad_kwargs

  • compute_jac

  • enlarge

  • bootstrap

  • vol_dec

  • vol_check

  • walks

  • facc

  • slices

  • fmove

  • max_move

  • update_func

Returns:

class threeML.bayesian.dynesty_sampler.DynestyNestedSampler(likelihood_model=None, data_list=None, **kwargs)[source]

Bases: UnitCubeSampler

sample(quiet=False)[source]

Sample the posterior using the dynesty nested sampling method.

Returns:

setup(n_live_points=400, maxiter=None, maxcall=None, dlogz=None, logl_max=inf, n_effective=None, add_live=True, print_func=None, save_bounds=True, bound='multi', sample='auto', periodic=None, reflective=None, update_interval=None, first_update=None, npdim=None, rstate=None, use_pool=None, live_points=None, logl_args=None, logl_kwargs=None, ptform_args=None, ptform_kwargs=None, gradient=None, grad_args=None, grad_kwargs=None, compute_jac=False, enlarge=None, bootstrap=0, walks=25, facc=0.5, slices=5, fmove=0.9, max_move=100, update_func=None, **kwargs)[source]

TODO describe function

Parameters:
  • n_live_points

  • maxiter

  • maxcall

  • dlogz

  • logl_max

  • n_effective

  • add_live

  • print_func

  • save_bounds

  • bound

  • sample

  • periodic

  • reflective

  • update_interval

  • first_update

  • npdim

  • rstate

  • use_pool

  • live_points

  • logl_args

  • logl_kwargs

  • ptform_args

  • ptform_kwargs

  • gradient

  • grad_args

  • grad_kwargs

  • compute_jac

  • enlarge

  • bootstrap

  • vol_dec

  • vol_check

  • walks

  • facc

  • slices

  • fmove

  • max_move

  • update_func

Returns:

class threeML.bayesian.dynesty_sampler.DynestyPool(dview)[source]

Bases: object

A simple wrapper for dview.

map(function, tasks)[source]

threeML.bayesian.emcee_sampler module

class threeML.bayesian.emcee_sampler.EmceeSampler(likelihood_model=None, data_list=None, **kwargs)[source]

Bases: MCMCSampler

sample(quiet=False)[source]
setup(n_iterations: int, n_burn_in: Optional[int] = None, n_walkers: int = 20, seed=None, **kwargs)[source]

TODO describe function

Parameters:
  • n_iterations (int) –

  • n_burn_in (Optional[int]) –

  • n_walkers (int) –

  • seed

Returns:

threeML.bayesian.multinest_sampler module

class threeML.bayesian.multinest_sampler.MultiNestSampler(likelihood_model: Optional[Model] = None, data_list: Optional[DataList] = None, **kwargs)[source]

Bases: UnitCubeSampler

sample(quiet: bool = False)[source]

sample using the MultiNest numerical integration method

Returns:

Return type:

setup(n_live_points: int = 400, chain_name: str = 'chains/fit-', resume: bool = False, importance_nested_sampling: bool = False, auto_clean: bool = False, **kwargs)[source]

Setup the MultiNest Sampler. For details see: https://github.com/farhanferoz/MultiNest

Parameters:
  • n_live_points – number of live points for the evaluation

  • chain_name – the chain name

  • resume – resume from previous fit

  • importance_nested_sampling – use importance nested sampling (INS)

  • auto_clean – automatically remove MultiNest chains after run

Returns:

Return type:

threeML.bayesian.nautilus_sampler module

class threeML.bayesian.nautilus_sampler.NautilusSampler(likelihood_model=None, data_list=None, **kwargs)[source]

Bases: UnitCubeSampler

sample(quiet=False)[source]

Sample the posterior using the nautilus importance nested sampling method.

Returns:

setup(n_live: int = 2000, n_update: Optional[int] = None, enlarge_per_dim: float = 1.1, n_points_min: Optional[int] = None, split_threshold: int = 100, n_networks: int = 4, neural_network_kwargs: Dict[str, Any] = {}, prior_args: List[Any] = [], prior_kwargs: Dict[str, Any] = {}, likelihood_args: List[Any] = [], likelihood_kwargs: Dict[str, Any] = {}, n_batch: int = 100, n_like_new_bound: Optional[int] = None, vectorized: bool = False, pass_dict: Optional[bool] = None, pool: Optional[int] = None, seed: Optional[int] = None, filepath: Optional[str] = None, resume: bool = True, f_live: float = 0.01, n_shell: Optional[int] = None, n_eff: int = 10000, discard_exploration: bool = False, verbose: bool = False)[source]

Set up the nautilus sampler.

See: https://nautilus-sampler.readthedocs.io/en/stable/index.html

Parameters:
  • n_live (int) – Number of so-called live points. New bounds are constructed so that they encompass the live points. Default is 2000.

  • n_update (Optional[int]) – The maximum number of additions to the live set before a new bound is created. If None, use n_live. Default is None.

  • enlarge_per_dim (float) – Along each dimension, outer ellipsoidal bounds are enlarged by this factor. Default is 1.1.

  • n_points_min (Optional[int]) – The minimum number of points each ellipsoid should have. Effectively, ellipsoids with less than twice that number will not be split further. If None, uses n_points_min = n_dim + 50. Default is None.

  • split_threshold (int) – Threshold used for splitting the multi-ellipsoidal bound used for sampling. If the volume of the bound prior to enlarging is larger than split_threshold times the target volume, the multi-ellipsoidal bound is split further, if possible. Default is 100.

  • n_networks (int) – Number of networks used in the estimator. Default is 4.

  • neural_network_kwargs (Dict[Any]) – Non-default keyword arguments passed to the constructor of MLPRegressor.

  • prior_args (List[Any]) – List of extra positional arguments for prior. Only used if prior is a function.

  • prior_kwargs (Dict[Any]) – Dictionary of extra keyword arguments for prior. Only used if prior is a function.

  • likelihood_args (List[Any]) – List of extra positional arguments for likelihood.

  • likelihood_kwargs (Dict[Any]) – Dictionary of extra keyword arguments for likelihood.

  • n_batch (int) – Number of likelihood evaluations that are performed at each step. If likelihood evaluations are parallelized, should be multiple of the number of parallel processes. Very large numbers can lead to new bounds being created long after n_update additions to the live set have been achieved. This will not cause any bias but could reduce efficiency. Default is 100.

  • n_like_new_bound (Optional[int]) – The maximum number of likelihood calls before a new bound is created. If None, use 10 times n_live. Default is None.

  • vectorized (bool) – If True, the likelihood function can receive multiple input sets at once. For example, if the likelihood function receives arrays, it should be able to take an array with shape (n_points, n_dim) and return an array with shape (n_points). Similarly, if the likelihood function accepts dictionaries, it should be able to process dictionaries where each value is an array with shape (n_points). Default is False.

  • pass_dict (Optional[bool]) – If True, the likelihood function expects model parameters as dictionaries. If False, it expects regular numpy arrays. Default is to set it to True if prior was a nautilus.Prior instance and False otherwise

  • pool (Optional[int]) – Pool used for parallelization of likelihood calls and sampler calculations. If None, no parallelization is performed. If an integer, the sampler will use a multiprocessing.Pool object with the specified number of processes. Finally, if specifying a tuple, the first one specifies the pool used for likelihood calls and the second one the pool for sampler calculations. Default is None.

  • seed (Optional[int]) – Seed for random number generation used for reproducible results across different runs. If None, results are not reproducible. Default is None.

  • filepath (Optional[str]) – Path to the file where results are saved. Must have a ‘.h5’ or ‘.hdf5’ extension. If None, no results are written. Default is None.

  • resume (bool) – If True, resume from previous run if filepath exists. If False, start from scratch and overwrite any previous file. Default is True.

  • f_live (float) – Maximum fraction of the evidence contained in the live set before building the initial shells terminates. Default is 0.01.

  • n_shell (Optional[int]) – Minimum number of points in each shell. The algorithm will sample from the shells until this is reached. Default is the batch size of the sampler which is 100 unless otherwise specified.

  • n_eff (int) – Minimum effective sample size. The algorithm will sample from the shells until this is reached. Default is 10000.

  • discard_exploration (bool) – Whether to discard points drawn in the exploration phase. This is required for a fully unbiased posterior and evidence estimate. Default is False.

  • verbose (bool) – If True, print additional information. Default is False.
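Several of the parameters above fall back to a value derived from another setting when left as None. The helper below restates those documented fallbacks in code. It is a hypothetical illustration, not part of threeML or nautilus; the function name resolve_nautilus_defaults and its n_dim argument (the number of free parameters) are assumptions.

```python
def resolve_nautilus_defaults(
    n_dim,
    n_live=2000,
    n_update=None,
    n_points_min=None,
    n_like_new_bound=None,
    n_batch=100,
    n_shell=None,
):
    """Return the effective values implied by the parameter descriptions."""
    return {
        # "If None, use n_live."
        "n_update": n_live if n_update is None else n_update,
        # "If None, uses n_points_min = n_dim + 50."
        "n_points_min": n_dim + 50 if n_points_min is None else n_points_min,
        # "If None, use 10 times n_live."
        "n_like_new_bound": 10 * n_live if n_like_new_bound is None else n_like_new_bound,
        # "Default is the batch size of the sampler."
        "n_shell": n_batch if n_shell is None else n_shell,
    }


resolved = resolve_nautilus_defaults(n_dim=5)
print(resolved)
```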

Returns:

threeML.bayesian.nautilus_sampler.capture_arguments(func, *args, **kwargs)[source]

threeML.bayesian.sampler_base module

class threeML.bayesian.sampler_base.MCMCSampler(likelihood_model, data_list, **kwargs)[source]

Bases: SamplerBase

class threeML.bayesian.sampler_base.SamplerBase(likelihood_model: Model, data_list: DataList, **kwargs)[source]

Bases: object

get_posterior(trial_values) float[source]

Compute the posterior for the normal sampler

property log_like_values: Optional[ndarray]

Returns the values of the log-likelihood found by the Bayesian sampler while sampling from the posterior. If you need to find the values of the parameters which generated a given value of the log-likelihood, remember that the samples accessible through the property .raw_samples are ordered in the same way as the vector returned by this method.

Returns:

a vector of log. like values

property log_marginal_likelihood: Optional[float]

Return the log marginal likelihood (evidence), if computed.

property log_probability_values: Optional[ndarray]

Returns the values of the log probability (posterior) found by the Bayesian sampler while sampling from the posterior. If you need to find the values of the parameters which generated a given value of the log probability, remember that the samples accessible through the property .raw_samples are ordered in the same way as the vector returned by this method.

Returns:

a vector of log probability values

property raw_samples: Optional[ndarray]

Access the samples from the posterior distribution generated by the selected sampler in raw form (i.e., in the format returned by the sampler)

Returns:

the samples as returned by the sampler

restore_MAP_fit() None[source]

Sets the model parameters to the MAP of the probability

restore_median_fit() None[source]

Sets the model parameters to the median of the log probability
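One way to read "the median of the log probability" is: pick the stored sample whose log probability is the median of all log probability values, then set the parameters to that sample. The sketch below shows how such an index could be found in plain Python. It is an assumption about the intent, not the library's code; the helper name median_index and the choice of the lower middle element for even-length inputs are illustrative.

```python
def median_index(values):
    """Index of the median element (lower middle one for even lengths)."""
    # Indices sorted by the value they point at.
    order = sorted(range(len(values)), key=values.__getitem__)
    return order[(len(values) - 1) // 2]


log_probability_values = [-5.0, -1.0, -3.0, -2.0, -4.0]
idx = median_index(log_probability_values)

# The same index would select the matching row of the raw samples.
print(idx, log_probability_values[idx])
```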

property results: BayesianResults
abstract sample()[source]
property samples: Optional[Dict[str, ndarray]]

Access the samples from the posterior distribution generated by the selected sampler

Returns:

a dictionary with the samples from the posterior distribution for each parameter

abstract setup() None[source]
class threeML.bayesian.sampler_base.UnitCubeSampler(likelihood_model, data_list, **kwargs)[source]

Bases: SamplerBase

threeML.bayesian.sampler_base.arg_median(a)[source]

threeML.bayesian.tutorial_material module

class threeML.bayesian.tutorial_material.BayesianAnalysisWrap(likelihood_model: Model, data_list: DataList, **kwargs)[source]

Bases: BayesianAnalysis

sample(*args, **kwargs)[source]

sample the posterior of the model with the selected algorithm

If no algorithm has been set, then the configured default algorithm with default parameters will be run

Parameters:

quiet – if True, then no output is displayed

Returns:

threeML.bayesian.tutorial_material.array_to_cmap(values, cmap, use_log=False)[source]

Generates a color map and color list that is normalized to the values in an array. Allows for adding a 3rd dimension onto a plot

Parameters:
  • values – a list of values to map into a cmap

  • cmap – the mpl colormap to use

  • use_log – if the mapping should be done in log space
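The normalisation step that such a mapping relies on can be sketched without matplotlib. The function below only rescales values to the range [0, 1], linearly or in log space; passing the result to a colormap is omitted. The name normalize and the handling of constant input are assumptions of this example and do not describe the actual implementation.

```python
import math


def normalize(values, use_log=False):
    """Rescale values to [0, 1], optionally in log10 space."""
    if use_log:
        # Log scaling requires strictly positive values.
        values = [math.log10(v) for v in values]
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to the bottom of the range.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


linear = normalize([1.0, 10.0, 100.0])
logged = normalize([1.0, 10.0, 100.0], use_log=True)
print(linear, logged)
```

With use_log set, values spanning several decades are spread evenly over the colour range instead of being crowded near one end.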

threeML.bayesian.tutorial_material.get_bayesian_analysis_object_complex_likelihood()[source]
threeML.bayesian.tutorial_material.get_bayesian_analysis_object_simple_likelihood()[source]
threeML.bayesian.tutorial_material.plot_likelihood_function(bayes, fig=None, show_prior=False)[source]
threeML.bayesian.tutorial_material.plot_sample_path(bayes, burn_in=None, truth=None)[source]
Parameters:

jl (JointLikelihood) –

Returns:

threeML.bayesian.ultranest_sampler module

class threeML.bayesian.ultranest_sampler.UltraNestSampler(likelihood_model=None, data_list=None, **kwargs)[source]

Bases: UnitCubeSampler

sample(quiet=False)[source]

Sample using the UltraNest numerical integration method.

Returns:

setup(min_num_live_points: int = 400, dlogz: float = 0.5, chain_name: Optional[str] = None, resume: str = 'overwrite', wrapped_params=None, stepsampler=None, use_mlfriends: bool = True, **kwargs)[source]

Set up the UltraNest sampler. Consult the documentation:

https://johannesbuchner.github.io/UltraNest/ultranest.html?highlight=reactive#ultranest.integrator.ReactiveNestedSampler

Parameters:
  • min_num_live_points (int) – minimum number of live points throughout the run

  • dlogz (float) – Target evidence uncertainty. This is the std between bootstrapped logz integrators.

  • chain_name – where to store output files

  • resume (str) – one of ‘resume’, ‘resume-similar’, ‘overwrite’ or ‘subfolder’. If ‘overwrite’, overwrite previous data. If ‘subfolder’, create a fresh subdirectory in log_dir. If ‘resume’ or True, continue previous run if available. Only works when dimensionality, transform or likelihood are consistent. If ‘resume-similar’, continue previous run if available. Only works when dimensionality and transform are consistent. If a likelihood difference is detected, the existing likelihoods are updated until the live point order differs. Otherwise, behaves like resume.

  • wrapped_params (list of bools) – indicating whether this parameter wraps around (circular parameter).

  • stepsampler

  • use_mlfriends (bool) – Whether to use MLFriends+ellipsoidal+tellipsoidal region (better for multi-modal problems) or just ellipsoidal sampling (faster for high-dimensional, gaussian-like problems).

Returns:

threeML.bayesian.zeus_sampler module

class threeML.bayesian.zeus_sampler.ZeusSampler(likelihood_model=None, data_list=None, **kwargs)[source]

Bases: MCMCSampler

sample(quiet=False)[source]
setup(n_iterations, n_burn_in=None, n_walkers=20, seed=None)[source]

set up the zeus sampler

Parameters:
  • n_iterations

  • n_burn_in

  • n_walkers

  • seed

Returns:

Module contents