threeML.bayesian package
Submodules
threeML.bayesian.autoemcee_sampler module
- class threeML.bayesian.autoemcee_sampler.AutoEmceeSampler(likelihood_model=None, data_list=None, **kwargs)[source]
Bases:
UnitCubeSampler
- sample(quiet=False)[source]
Sample using the autoemcee ensemble MCMC method.
- Returns:
- setup(num_global_samples=10000, num_chains=4, num_walkers=None, max_ncalls=1000000, max_improvement_loops=4, num_initial_steps=100, min_autocorr_times=0)[source]
Sample until MCMC chains have converged.
The steps are:
1. Draw num_global_samples from prior. The highest num_walkers points are selected.
2. Set num_steps to num_initial_steps.
3. Run num_chains MCMC ensembles for num_steps steps.
4. For each walker chain, compute the auto-correlation length (convergence requires num_steps / autocorrelation length > min_autocorr_times).
5. For each parameter, compute the Geweke convergence diagnostic (convergence requires |z| < 2).
6. For each ensemble, compute the Gelman-Rubin rank convergence diagnostic (convergence requires rhat < 1.2).
7. If converged, stop and return results.
8. Increase num_steps by 10, and repeat from (3) up to max_improvement_loops times.
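The control flow of the steps above can be sketched in a few lines. This is a minimal illustration, not the autoemcee implementation: `sample_until_converged`, `run_ensembles`, `is_converged` and `step_increment` are hypothetical names, and the increment of 10 simply follows the wording of the last step.

```python
from typing import Callable, List, Tuple


def sample_until_converged(
    run_ensembles: Callable[[int], List[List[float]]],
    is_converged: Callable[[List[List[float]]], bool],
    num_initial_steps: int = 100,
    max_improvement_loops: int = 4,
    step_increment: int = 10,  # assumption: taken literally from the text above
) -> Tuple[List[List[float]], int, bool]:
    """Run the ensembles, test convergence, and retry with more steps if needed."""
    num_steps = num_initial_steps
    chains = run_ensembles(num_steps)
    converged = is_converged(chains)
    loops = 0
    while not converged and loops < max_improvement_loops:
        num_steps += step_increment
        chains = run_ensembles(num_steps)
        converged = is_converged(chains)
        loops += 1
    return chains, num_steps, converged


# Hypothetical stand-ins for the sampler and the convergence checks.
def fake_run(num_steps: int) -> List[List[float]]:
    return [[0.0] * num_steps for _ in range(4)]


def fake_check(chains: List[List[float]]) -> bool:
    return len(chains[0]) >= 120


chains, final_steps, ok = sample_until_converged(fake_run, fake_check)
print(final_steps, ok)  # 120 True
```

If the loop budget is exhausted first, the function still returns the last chains together with `converged=False`, so the caller can decide whether to use them.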
- num_global_samples: int
Number of samples to draw from the prior to select the initial walker points
- num_chains: int
Number of independent ensembles to run. If running with MPI, this is set to the number of MPI processes.
- num_walkers: int
Ensemble size. If None, max(100, 4 * dim) is used
- max_ncalls: int
Maximum number of likelihood function evaluations
- num_initial_steps: int
Number of sampler steps to take in first iteration
- max_improvement_loops: int
Number of times MCMC should be re-attempted (see above)
- min_autocorr_times: float
If positive, additionally require for convergence that the number of samples is larger than min_autocorr_times times the autocorrelation length.
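The thresholds quoted for this method (|z| < 2 and rhat < 1.2) can be made concrete with simplified versions of the two diagnostics. The sketch below is an assumption-laden illustration: it uses the classic Gelman-Rubin statistic rather than the rank-based variant, and a Geweke score that ignores autocorrelation. The function names and the synthetic chains are hypothetical.

```python
import math
import random
import statistics


def gelman_rubin(chains):
    """Classic (non-rank) R-hat for equally long chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke z-score comparing the start and the end of a chain.

    Plain sample variances are used, so autocorrelation is ignored.
    """
    a = chain[: int(first * len(chain))]
    b = chain[int((1 - last) * len(chain)):]
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se


rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
stuck = [list(c) for c in mixed]
stuck[3] = [x + 5.0 for x in stuck[3]]  # one ensemble sits in a different region

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)

drifting = [x + 5.0 * i / 2000 for i, x in enumerate(mixed[0])]
z_flat = geweke_z(mixed[0])
z_drift = geweke_z(drifting)

print(f"rhat: mixed={rhat_mixed:.3f} stuck={rhat_stuck:.3f}")
print(f"geweke z: flat={z_flat:.2f} drifting={z_drift:.2f}")
```

Well-mixed chains give rhat close to 1, while the shifted ensemble and the drifting chain fail the respective thresholds.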
threeML.bayesian.bayesian_analysis module
- class threeML.bayesian.bayesian_analysis.BayesianAnalysis(likelihood_model: Model, data_list: DataList, **kwargs)[source]
Bases:
object
- property analysis_type: str
- convergence_plots(n_samples_in_each_subset, n_subsets)[source]
Compute the mean and variance for subsets of the samples, and plot them. They should all be around the same values if the MCMC has converged to the posterior distribution.
The subsamples are taken with two different strategies: the first is to slide a fixed-size window, the second is to take random samples from the chain (bootstrap)
- Parameters:
n_samples_in_each_subset – number of samples in each subset
n_subsets – number of subsets to take for each strategy
- Returns:
a matplotlib.figure instance
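The two subsetting strategies described above can be illustrated on a plain list of samples. This is a sketch under stated assumptions, not the threeML code: the helper names are hypothetical, the windows are spaced evenly over the chain, and the bootstrap draws with replacement.

```python
import random
import statistics


def sliding_window_stats(chain, n_samples_in_each_subset, n_subsets):
    """Mean and variance of n_subsets equally spaced, fixed-size windows."""
    last_start = len(chain) - n_samples_in_each_subset
    starts = [round(i * last_start / max(n_subsets - 1, 1)) for i in range(n_subsets)]
    windows = [chain[s: s + n_samples_in_each_subset] for s in starts]
    return [(statistics.fmean(w), statistics.variance(w)) for w in windows]


def bootstrap_stats(chain, n_samples_in_each_subset, n_subsets, rng):
    """Mean and variance of n_subsets random subsets drawn with replacement."""
    subsets = [rng.choices(chain, k=n_samples_in_each_subset) for _ in range(n_subsets)]
    return [(statistics.fmean(s), statistics.variance(s)) for s in subsets]


rng = random.Random(1)
chain = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # synthetic, already "converged"

sliding = sliding_window_stats(chain, 500, 10)
boot = bootstrap_stats(chain, 500, 10, rng)

for label, stats in (("window", sliding), ("bootstrap", boot)):
    means = [m for m, _ in stats]
    print(label, f"mean range: {min(means):.3f} .. {max(means):.3f}")
```

For a converged chain the means and variances from both strategies scatter around the same values; a trend across the sliding windows would indicate that the chain is still moving.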
- property likelihood_model: Model
- Returns:
the likelihood model (a Model instance)
- property log_like_values: ndarray | None
Returns the values of the log-likelihood found by the Bayesian sampler while sampling from the posterior. If you need to find the values of the parameters which generated a given value of the log-likelihood, remember that the samples accessible through the property .raw_samples are ordered in the same way as the vector returned by this method.
- Returns:
a vector of log-likelihood values
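Because the log-likelihood vector and the raw samples share the same ordering, the index of one can be used to look up the other. The sketch below uses plain lists with made-up numbers in place of the arrays returned by the properties; `best_sample` is a hypothetical helper.

```python
def best_sample(raw_samples, log_like_values):
    """Return (index, sample, log_like) for the highest log-likelihood entry."""
    if len(raw_samples) != len(log_like_values):
        raise ValueError("samples and log-likelihood values must have equal length")
    idx = max(range(len(log_like_values)), key=log_like_values.__getitem__)
    return idx, raw_samples[idx], log_like_values[idx]


# Illustrative values only: three samples of a two-parameter model.
raw_samples = [[1.0, 0.5], [1.2, 0.4], [0.9, 0.6]]
log_like_values = [-12.3, -10.1, -11.7]

idx, params, logl = best_sample(raw_samples, log_like_values)
print(idx, params, logl)  # 1 [1.2, 0.4] -10.1
```

With numpy arrays the same lookup is typically written with `numpy.argmax` on the log-likelihood vector.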
- property log_marginal_likelihood: float | None
Return the log marginal likelihood (evidence), if computed.
- property log_probability_values: ndarray | None
Returns the values of the log probability (posterior) found by the Bayesian sampler while sampling from the posterior. If you need to find the values of the parameters which generated a given value of the log probability, remember that the samples accessible through the property .raw_samples are ordered in the same way as the vector returned by this method.
- Returns:
a vector of log probability values
- plot_chains(thin=None)[source]
Produce a plot of the series of samples for each parameter
- Parameters:
thin – use only one sample every ‘thin’ samples
- Returns:
a matplotlib.figure instance
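Thinning as described for the thin parameter amounts to keeping every thin-th sample. A minimal sketch, assuming a simple stride (the helper name is hypothetical and the plotting code itself may differ in detail):

```python
def thin_chain(chain, thin=None):
    """Keep one sample every `thin` samples; return a copy if thin is None."""
    if thin is None:
        return list(chain)
    if thin < 1:
        raise ValueError("thin must be a positive integer")
    return list(chain[::thin])


print(thin_chain(list(range(10)), thin=3))  # [0, 3, 6, 9]
```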
- property raw_samples: ndarray | None
Access the samples from the posterior distribution generated by the selected sampler in raw form (i.e., in the format returned by the sampler)
- Returns:
the samples as returned by the sampler
- property results: BayesianResults | None
- sample(quiet=False) None [source]
Sample the posterior of the model with the selected algorithm.
If no algorithm has been set, the configured default algorithm is run with its default parameters.
- Parameters:
quiet – if True, then no output is displayed
- Returns:
- property sampler: SamplerBase | None
Access the instance of the sampler used to sample the posterior distribution
- Returns:
an instance of the sampler
- property samples: Dict[str, ndarray] | None
Access the samples from the posterior distribution generated by the selected sampler
- Returns:
a dictionary with the samples from the posterior distribution for each parameter
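A dictionary of per-parameter samples lends itself to simple posterior summaries. The sketch below computes the median and a central 68% interval for each entry; the parameter names and values are invented for illustration, and `summarize` is a hypothetical helper rather than part of threeML.

```python
import statistics


def summarize(samples):
    """Median and central 68% interval for each parameter in a samples dict."""
    summary = {}
    for name, values in samples.items():
        # quantiles(n=100) returns the 1st to 99th percentiles, so index 15
        # is the 16th percentile and index 83 the 84th.
        pct = statistics.quantiles(values, n=100, method="inclusive")
        summary[name] = {
            "median": statistics.median(values),
            "lo": pct[15],
            "hi": pct[83],
        }
    return summary


# Illustrative stand-in for the dictionary returned by the property.
samples = {
    "x": [float(v) for v in range(101)],
    "y": [0.1 * v for v in range(101)],
}

summary = summarize(samples)
print(summary["x"])  # {'median': 50.0, 'lo': 16.0, 'hi': 84.0}
```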
threeML.bayesian.dynesty_sampler module
- class threeML.bayesian.dynesty_sampler.DynestyDynamicSampler(likelihood_model=None, data_list=None, **kwargs)[source]
Bases:
UnitCubeSampler
- sample(quiet=False)[source]
Sample using the dynesty dynamic nested sampling method.
- Returns:
- setup(nlive_init=500, maxiter_init=None, maxcall_init=None, dlogz_init=0.01, logl_max_init=inf, n_effective_init=inf, nlive_batch=500, wt_function=None, wt_kwargs=None, maxiter_batch=None, maxcall_batch=None, maxiter=None, maxcall=None, maxbatch=None, n_effective=inf, stop_function=None, stop_kwargs=None, use_stop=True, save_bounds=True, print_func=None, live_points=None, bound='multi', sample='auto', periodic=None, reflective=None, update_interval=None, first_update=None, npdim=None, rstate=None, use_pool=None, logl_args=None, logl_kwargs=None, ptform_args=None, ptform_kwargs=None, gradient=None, grad_args=None, grad_kwargs=None, compute_jac=False, enlarge=None, bootstrap=0, walks=25, facc=0.5, slices=5, fmove=0.9, max_move=100, update_func=None, **kwargs)[source]
TODO describe function
- Parameters:
nlive_init
maxiter_init
maxcall_init
dlogz_init
logl_max_init
n_effective_init
nlive_batch
wt_function
wt_kwargs
maxiter_batch
maxcall_batch
maxiter
maxcall
maxbatch
n_effective
stop_function
stop_kwargs
use_stop
save_bounds
print_func
live_points
bound
sample
periodic
reflective
update_interval
first_update
npdim
rstate
use_pool
logl_args
logl_kwargs
ptform_args
ptform_kwargs
gradient
grad_args
grad_kwargs
compute_jac
enlarge
bootstrap
vol_dec
vol_check
walks
facc
slices
fmove
max_move
update_func
- Returns:
- class threeML.bayesian.dynesty_sampler.DynestyNestedSampler(likelihood_model=None, data_list=None, **kwargs)[source]
Bases:
UnitCubeSampler
- sample(quiet=False)[source]
Sample using the dynesty nested sampling method.
- Returns:
- setup(n_live_points=400, maxiter=None, maxcall=None, dlogz=None, logl_max=inf, n_effective=None, add_live=True, print_func=None, save_bounds=True, bound='multi', sample='auto', periodic=None, reflective=None, update_interval=None, first_update=None, npdim=None, rstate=None, use_pool=None, live_points=None, logl_args=None, logl_kwargs=None, ptform_args=None, ptform_kwargs=None, gradient=None, grad_args=None, grad_kwargs=None, compute_jac=False, enlarge=None, bootstrap=0, walks=25, facc=0.5, slices=5, fmove=0.9, max_move=100, update_func=None, **kwargs)[source]
TODO describe function
- Parameters:
n_live_points
maxiter
maxcall
dlogz
logl_max
n_effective
add_live
print_func
save_bounds
bound
sample
periodic
reflective
update_interval
first_update
npdim
rstate
use_pool
live_points
logl_args
logl_kwargs
ptform_args
ptform_kwargs
gradient
grad_args
grad_kwargs
compute_jac
enlarge
bootstrap
vol_dec
vol_check
walks
facc
slices
fmove
max_move
update_func
- Returns:
threeML.bayesian.emcee_sampler module
- class threeML.bayesian.emcee_sampler.EmceeSampler(likelihood_model=None, data_list=None, **kwargs)[source]
Bases:
MCMCSampler
threeML.bayesian.multinest_sampler module
- class threeML.bayesian.multinest_sampler.MultiNestSampler(likelihood_model: Model | None = None, data_list: DataList | None = None, **kwargs)[source]
Bases:
UnitCubeSampler
- sample(quiet: bool = False)[source]
sample using the MultiNest numerical integration method
- Returns:
- Return type:
- setup(n_live_points: int = 400, chain_name: str = 'chains/fit-', resume: bool = False, importance_nested_sampling: bool = False, auto_clean: bool = False, **kwargs)[source]
Setup the MultiNest Sampler. For details see: https://github.com/farhanferoz/MultiNest
- Parameters:
n_live_points – number of live points for the evaluation
chain_name – the chain name
importance_nested_sampling – use INS
auto_clean – automatically remove multinest chains after run
resume – resume from previous fit
- Returns:
- Return type:
threeML.bayesian.nautilus_sampler module
- class threeML.bayesian.nautilus_sampler.NautilusSampler(likelihood_model=None, data_list=None, **kwargs)[source]
Bases:
UnitCubeSampler
- sample(quiet=False)[source]
Sample using the nautilus sampler.
- Returns:
- setup(n_live: int = 2000, n_update: int | None = None, enlarge_per_dim: float = 1.1, n_points_min: int | None = None, split_threshold: int = 100, n_networks: int = 4, neural_network_kwargs: Dict[str, Any] = {}, prior_args: List[Any] = [], prior_kwargs: Dict[str, Any] = {}, likelihood_args: List[Any] = [], likelihood_kwargs: Dict[str, Any] = {}, n_batch: int = 100, n_like_new_bound: int | None = None, vectorized: bool = False, pass_dict: bool | None = None, pool: int | None = None, seed: int | None = None, filepath: str | None = None, resume: bool = True, f_live: float = 0.01, n_shell: int | None = None, n_eff: int = 10000, discard_exploration: bool = False, verbose: bool = False)[source]
Set up the nautilus sampler.
See: https://nautilus-sampler.readthedocs.io/en/stable/index.html
- Parameters:
n_live (int) – Number of so-called live points. New bounds are constructed so that they encompass the live points. Default is 2000.
n_update (Optional[int]) – The maximum number of additions to the live set before a new bound is created. If None, use n_live. Default is None.
enlarge_per_dim (float) – Along each dimension, outer ellipsoidal bounds are enlarged by this factor. Default is 1.1.
n_points_min (Optional[int]) – The minimum number of points each ellipsoid should have. Effectively, ellipsoids with less than twice that number will not be split further. If None, uses n_points_min = n_dim + 50. Default is None.
split_threshold (int) – Threshold used for splitting the multi-ellipsoidal bound used for sampling. If the volume of the bound prior to enlarging is larger than split_threshold times the target volume, the multi-ellipsoidal bound is split further, if possible. Default is 100.
n_networks (int) – Number of networks used in the estimator. Default is 4.
neural_network_kwargs (Dict[Any]) – Non-default keyword arguments passed to the constructor of MLPRegressor.
prior_args (List[Any]) – List of extra positional arguments for prior. Only used if prior is a function.
prior_kwargs (Dict[Any]) – Dictionary of extra keyword arguments for prior. Only used if prior is a function.
likelihood_args (List[Any]) – List of extra positional arguments for likelihood.
likelihood_kwargs (Dict[Any]) – Dictionary of extra keyword arguments for likelihood.
n_batch (int) – Number of likelihood evaluations that are performed at each step. If likelihood evaluations are parallelized, should be multiple of the number of parallel processes. Very large numbers can lead to new bounds being created long after n_update additions to the live set have been achieved. This will not cause any bias but could reduce efficiency. Default is 100.
n_like_new_bound (Optional[int]) – The maximum number of likelihood calls before a new bound is created. If None, use 10 times n_live. Default is None.
vectorized (bool) – If True, the likelihood function can receive multiple input sets at once. For example, if the likelihood function receives arrays, it should be able to take an array with shape (n_points, n_dim) and return an array with shape (n_points). Similarly, if the likelihood function accepts dictionaries, it should be able to process dictionaries where each value is an array with shape (n_points). Default is False.
pass_dict (Optional[bool]) – If True, the likelihood function expects model parameters as dictionaries. If False, it expects regular numpy arrays. Default is to set it to True if prior was a nautilus.Prior instance and False otherwise.
pool (Optional[int]) – Pool used for parallelization of likelihood calls and sampler calculations. If None, no parallelization is performed. If an integer, the sampler will use a multiprocessing.Pool object with the specified number of processes. Finally, if specifying a tuple, the first one specifies the pool used for likelihood calls and the second one the pool for sampler calculations. Default is None.
seed (Optional[int]) – Seed for random number generation used for reproducible results across different runs. If None, results are not reproducible. Default is None.
filepath (Optional[str]) – Path to the file where results are saved. Must have a ‘.h5’ or ‘.hdf5’ extension. If None, no results are written. Default is None.
resume (bool) – If True, resume from previous run if filepath exists. If False, start from scratch and overwrite any previous file. Default is True.
f_live (float) – Maximum fraction of the evidence contained in the live set before building the initial shells terminates. Default is 0.01.
n_shell (Optional[int]) – Minimum number of points in each shell. The algorithm will sample from the shells until this is reached. Default is the batch size of the sampler which is 100 unless otherwise specified.
n_eff (int) – Minimum effective sample size. The algorithm will sample from the shells until this is reached. Default is 10000.
discard_exploration (bool) – Whether to discard points drawn in the exploration phase. This is required for a fully unbiased posterior and evidence estimate. Default is False.
verbose (bool) – If True, print additional information. Default is False.
- Returns:
threeML.bayesian.sampler_base module
- class threeML.bayesian.sampler_base.MCMCSampler(likelihood_model, data_list, **kwargs)[source]
Bases:
SamplerBase
- class threeML.bayesian.sampler_base.SamplerBase(likelihood_model: Model, data_list: DataList, **kwargs)[source]
Bases:
object
- property log_like_values: ndarray | None
Returns the values of the log-likelihood found by the Bayesian sampler while sampling from the posterior. If you need to find the values of the parameters which generated a given value of the log-likelihood, remember that the samples accessible through the property .raw_samples are ordered in the same way as the vector returned by this method.
- Returns:
a vector of log-likelihood values
- property log_marginal_likelihood: float | None
Return the log marginal likelihood (evidence), if computed.
- property log_probability_values: ndarray | None
Returns the values of the log probability (posterior) found by the Bayesian sampler while sampling from the posterior. If you need to find the values of the parameters which generated a given value of the log probability, remember that the samples accessible through the property .raw_samples are ordered in the same way as the vector returned by this method.
- Returns:
a vector of log probability values
- property raw_samples: ndarray | None
Access the samples from the posterior distribution generated by the selected sampler in raw form (i.e., in the format returned by the sampler)
- Returns:
the samples as returned by the sampler
- property results: BayesianResults
- property samples: Dict[str, ndarray] | None
Access the samples from the posterior distribution generated by the selected sampler
- Returns:
a dictionary with the samples from the posterior distribution for each parameter
- class threeML.bayesian.sampler_base.UnitCubeSampler(likelihood_model, data_list, **kwargs)[source]
Bases:
SamplerBase
threeML.bayesian.tutorial_material module
- class threeML.bayesian.tutorial_material.BayesianAnalysisWrap(likelihood_model: Model, data_list: DataList, **kwargs)[source]
Bases:
BayesianAnalysis
- threeML.bayesian.tutorial_material.array_to_cmap(values, cmap, use_log=False)[source]
Generates a color map and color list that is normalized to the values in an array. Allows for adding a 3rd dimension onto a plot
- Parameters:
values – a list of values to map into a cmap
cmap – the mpl colormap to use
use_log – if the mapping should be done in log space
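The normalisation step behind such a mapping can be sketched without any plotting library. This is an illustration of the idea only, not the threeML function: `normalize_values` and `two_color_map` are hypothetical helpers, and the two-color interpolation stands in for a real matplotlib colormap.

```python
import math


def normalize_values(values, use_log=False):
    """Map values linearly (or in log10 space) onto the interval [0, 1]."""
    mapped = [math.log10(v) for v in values] if use_log else list(values)
    lo, hi = min(mapped), max(mapped)
    if hi == lo:
        return [0.0 for _ in mapped]
    return [(m - lo) / (hi - lo) for m in mapped]


def two_color_map(fraction, start=(0.0, 0.0, 1.0), end=(1.0, 0.0, 0.0)):
    """Stand-in colormap: interpolate between two RGB colors."""
    return tuple(s + fraction * (e - s) for s, e in zip(start, end))


values = [1.0, 10.0, 100.0]
fractions = normalize_values(values, use_log=True)
colors = [two_color_map(f) for f in fractions]

print(fractions)  # [0.0, 0.5, 1.0]
print(colors[1])  # (0.5, 0.0, 0.5)
```

With matplotlib the same idea is usually expressed with `matplotlib.colors.Normalize` or `matplotlib.colors.LogNorm` together with a colormap object.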
- threeML.bayesian.tutorial_material.plot_likelihood_function(bayes, fig=None, show_prior=False)[source]
- threeML.bayesian.tutorial_material.plot_sample_path(bayes, burn_in=None, truth=None)[source]
- Parameters:
jl (JointLikelihood)
- Returns:
threeML.bayesian.ultranest_sampler module
- class threeML.bayesian.ultranest_sampler.UltraNestSampler(likelihood_model=None, data_list=None, **kwargs)[source]
Bases:
UnitCubeSampler
- sample(quiet=False)[source]
Sample using the UltraNest numerical integration method.
- Returns:
- setup(min_num_live_points: int = 400, dlogz: float = 0.5, chain_name: str | None = None, resume: str = 'overwrite', wrapped_params=None, stepsampler=None, use_mlfriends: bool = True, **kwargs)[source]
Set up the UltraNest sampler. Consult the UltraNest documentation for details.
- Parameters:
min_num_live_points (int) – minimum number of live points throughout the run
dlogz (float) – Target evidence uncertainty. This is the std between bootstrapped logz integrators.
chain_name – where to store output files
resume (str) – one of ‘resume’, ‘resume-similar’, ‘overwrite’ or ‘subfolder’. If ‘overwrite’, overwrite previous data. If ‘subfolder’, create a fresh subdirectory in log_dir. If ‘resume’ or True, continue previous run if available. Only works when dimensionality, transform or likelihood are consistent. If ‘resume-similar’, continue previous run if available. Only works when dimensionality and transform are consistent. If a likelihood difference is detected, the existing likelihoods are updated until the live point order differs. Otherwise, behaves like resume.
wrapped_params (list of bools) – indicating whether each parameter wraps around (circular parameter).
stepsampler
use_mlfriends (bool) – Whether to use MLFriends+ellipsoidal+tellipsoidal region (better for multi-modal problems) or just ellipsoidal sampling (faster for high-dimensional, gaussian-like problems).
- Returns:
threeML.bayesian.zeus_sampler module
- class threeML.bayesian.zeus_sampler.ZeusSampler(likelihood_model=None, data_list=None, **kwargs)[source]
Bases:
MCMCSampler