abcpy package¶
This reference gives details about the API of the modules, classes and functions included in ABCpy.
abcpy.acceptedparametersmanager module¶

class
abcpy.acceptedparametersmanager.
AcceptedParametersManager
(model)[source]¶ Bases:
object

__init__
(model)[source]¶ This class manages the accepted parameters and other bds objects.
Parameters: model (list) – List of all root probabilistic models

broadcast
(backend, observations)[source]¶ Broadcasts the observations to observations_bds using the specified backend.
Parameters:  backend (abcpy.backends object) – The backend used by the inference algorithm
 observations (list) – A list containing all observed data

update_kernel_values
(backend, kernel_parameters)[source]¶ Broadcasts new parameters for each kernel
Parameters:  backend (abcpy.backends object) – The backend used by the inference algorithm
 kernel_parameters (list) – A list, in which each entry contains the values of the parameters associated with the corresponding kernel in the joint perturbation kernel

update_broadcast
(backend, accepted_parameters=None, accepted_weights=None, accepted_cov_mats=None)[source]¶ Updates the broadcasted values using the specified backend
Parameters:  backend (abcpy.backend object) – The backend to be used for broadcasting
 accepted_parameters (list) – The accepted parameters to be broadcasted
 accepted_weights (list) – The accepted weights to be broadcasted
 accepted_cov_mats (np.ndarray) – The accepted covariance matrix to be broadcasted

get_mapping
(models, is_root=True, index=0)[source]¶ Returns the order in which the models are discovered during a recursive depth-first search. Commonly used when returning the accepted_parameters_bds for certain models.
Parameters:  models (list) – List of the root probabilistic models of the graph.
 is_root (boolean) – Specifies whether the current list of models is the list of overall root models
 index (integer) – The current index in the depth-first search.
Returns: The first entry corresponds to the mapping of the root model, as well as all its parents. The second entry corresponds to the next index in the depth-first search.
Return type: list
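
The traversal described for get_mapping can be illustrated with a small sketch. The Node class and the depth_first_mapping function below are hypothetical stand-ins, not ABCpy's implementation; they only show how a recursive depth-first search can assign increasing indices to models in discovery order and return the next free index alongside the mapping.

```python
class Node:
    """Hypothetical stand-in for a probabilistic model with parent models."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)


def depth_first_mapping(models, index=0, seen=None):
    """Return ([(model, index), ...], next_index) in discovery order."""
    if seen is None:
        seen = set()
    mapping = []
    for model in models:
        if id(model) in seen:
            continue  # a model shared by several children is indexed once
        seen.add(id(model))
        mapping.append((model, index))
        index += 1
        # Recurse into the parents before moving on to the next sibling.
        parent_mapping, index = depth_first_mapping(model.parents, index, seen)
        mapping.extend(parent_mapping)
    return mapping, index


mu = Node("mu")
sigma = Node("sigma")
height = Node("height", parents=[mu, sigma])
weight = Node("weight", parents=[mu])

mapping, next_index = depth_first_mapping([height, weight])
order = [(model.name, i) for model, i in mapping]
print(order, next_index)
```

In this sketch the shared parent mu is indexed only once, when it is first reached through height.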

get_accepted_parameters_bds_values
(models)[source]¶ Returns the accepted bds values for the specified models.
Parameters: models (list) – Contains the probabilistic models for which the accepted bds values should be returned
Returns: The accepted_parameters_bds values of all the probabilistic models specified in models.
Return type: list

abcpy.approx_lhd module¶

class
abcpy.approx_lhd.
Approx_likelihood
(statistics_calc)[source]¶ Bases:
object
This abstract base class defines the approximate likelihood function.

__init__
(statistics_calc)[source]¶ The constructor of a subclass must accept a non-optional statistics calculator, which is stored in self.statistics_calc.
Parameters: statistics_calc (abcpy.statistics.Statistics) – Statistics extractor object that conforms to the Statistics class.

likelihood
(y_sim)[source]¶ To be overwritten by any subclass: should compute the approximate likelihood value given the observed data set y_obs and the data set y_sim simulated from the model at the given parameter value.
Parameters:  y_obs (Python list) – Observed data set.
 y_sim (Python list) – Data set simulated from the model at the given parameter value.
Returns: Computed approximate likelihood.
Return type: float


class
abcpy.approx_lhd.
SynLiklihood
(statistics_calc)[source]¶ Bases:
abcpy.approx_lhd.Approx_likelihood
This class implements the approximate likelihood function which computes the approximate likelihood using the synthetic likelihood approach described in Wood [1]. For synthetic likelihood approximation, we compute the robust precision matrix using Ledoit and Wolf’s [2] method.
[1] S. N. Wood. Statistical inference for noisy nonlinear ecological dynamic systems. Nature, 466(7310):1102–1104, Aug. 2010.
[2] O. Ledoit and M. Wolf, A Well-Conditioned Estimator for Large-Dimensional Covariance Matrices, Journal of Multivariate Analysis, Volume 88, Issue 2, pages 365–411, February 2004.

__init__
(statistics_calc)[source]¶ The constructor of a subclass must accept a non-optional statistics calculator, which is stored in self.statistics_calc.
Parameters: statistics_calc (abcpy.statistics.Statistics) – Statistics extractor object that conforms to the Statistics class.

likelihood
(y_obs, y_sim)[source]¶ To be overwritten by any subclass: should compute the approximate likelihood value given the observed data set y_obs and the data set y_sim simulated from the model at the given parameter value.
Parameters:  y_obs (Python list) – Observed data set.
 y_sim (Python list) – Data set simulated from the model at the given parameter value.
Returns: Computed approximate likelihood.
Return type: float
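
As a rough illustration of the synthetic likelihood idea, the sketch below fits a Gaussian to the summary statistics of the simulated data sets and evaluates its density at the observed statistic. It is a deliberately reduced version under stated assumptions: the summary statistic is a single number (the sample mean), the plain sample variance is used instead of the Ledoit and Wolf estimator, and the function name synthetic_likelihood_1d is hypothetical rather than part of ABCpy.

```python
import math
import statistics


def synthetic_likelihood_1d(obs_stat, sim_stats):
    """Gaussian density of obs_stat under the mean/variance of sim_stats."""
    mean = statistics.fmean(sim_stats)
    var = statistics.variance(sim_stats)  # sample variance, needs >= 2 values
    z = (obs_stat - mean) ** 2 / var
    return math.exp(-0.5 * z) / math.sqrt(2.0 * math.pi * var)


# Each simulated data set is reduced to one summary statistic (its mean).
simulated_sets = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
sim_stats = [statistics.fmean(s) for s in simulated_sets]
observed = [2.0, 3.0, 4.0]

value = synthetic_likelihood_1d(statistics.fmean(observed), sim_stats)
print(value)
```

With a vector of summary statistics, the scalar mean and variance would be replaced by a mean vector and a covariance (or precision) matrix.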


class
abcpy.approx_lhd.
PenLogReg
(statistics_calc, model, n_simulate, n_folds=10, max_iter=100000, seed=None)[source]¶ Bases:
abcpy.approx_lhd.Approx_likelihood
,abcpy.graphtools.GraphTools
This class implements the approximate likelihood function which computes the approximate likelihood up to a constant using penalized logistic regression described in Dutta et al. [1]. It takes one additional function handle defining the true model and two additional parameters, n_folds and n_simulate, which respectively define the number of folds used to estimate the prediction error by cross-validation and the number of simulated data sets sampled from each parameter to approximate the likelihood function. For lasso-penalized logistic regression we use glmnet of Friedman et al. [2].
[1] R. Dutta, J. Corander, S. Kaski, and M. U. Gutmann. Likelihood-free inference by penalised logistic regression. arXiv:1611.10242, Nov. 2016.
[2] Friedman, J., Hastie, T., and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1–22.
Parameters:  statistics_calc (abcpy.statistics.Statistics) – Statistics extractor object that conforms to the Statistics class.
 model (abcpy.models.Model) – Model object that conforms to the Model class.
 n_simulate (int) – Number of data points in the simulated data set.
 n_folds (int, optional) – Number of folds for cross-validation. The default value is 10.
 max_iter (int, optional) – Maximum passes over the data. The default is 100000.
 seed (int, optional) – Seed for the random number generator. The glmnet solver used is not deterministic; this seed determines the cross-validation folds. The default value is None.

__init__
(statistics_calc, model, n_simulate, n_folds=10, max_iter=100000, seed=None)[source]¶ The constructor of a subclass must accept a non-optional statistics calculator, which is stored in self.statistics_calc.
Parameters: statistics_calc (abcpy.statistics.Statistics) – Statistics extractor object that conforms to the Statistics class.

likelihood
(y_obs, y_sim)[source]¶ To be overwritten by any subclass: should compute the approximate likelihood value given the observed data set y_obs and the data set y_sim simulated from the model at the given parameter value.
Parameters:  y_obs (Python list) – Observed data set.
 y_sim (Python list) – Data set simulated from the model at the given parameter value.
Returns: Computed approximate likelihood.
Return type: float
abcpy.backends module¶

class
abcpy.backends.base.
Backend
[source]¶ Bases:
object
This is the base class for every parallelization backend. It essentially resembles the map/reduce API from Spark.
An idea for the future is to implement an MPI version of the backend, in the hope of being more compliant with standard HPC infrastructure and of gaining a potential speedup.

parallelize
(list)[source]¶ This method distributes the list on the available workers and returns a reference object.
The list should be split into as many parts as there are workers. Each part should then be sent to a separate worker node.
Parameters: list (Python list) – the list that should get distributed on the worker nodes
Returns: A reference object that represents the parallelized list
Return type: PDS class (parallel data set)
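
One way to read the splitting rule above is sketched below. The helper split_into_parts is hypothetical and not part of ABCpy; it assumes contiguous parts whose sizes differ by at most one element, which is only one of several reasonable splitting strategies.

```python
def split_into_parts(items, n_workers):
    """Split items into n_workers contiguous parts of near-equal size."""
    base, extra = divmod(len(items), n_workers)
    parts, start = [], 0
    for worker in range(n_workers):
        # The first `extra` workers receive one additional element.
        size = base + (1 if worker < extra else 0)
        parts.append(items[start:start + size])
        start += size
    return parts


parts = split_into_parts(list(range(10)), 4)
print(parts)
```

Each entry of parts would then be sent to one worker node.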

broadcast
(object)[source]¶ Send object to all worker nodes without splitting it up.
Parameters: object (Python object) – An arbitrary object that should be available on all workers
Returns: A reference to the broadcasted object
Return type: BDS class (broadcast data set)

map
(func, pds)[source]¶ A distributed implementation of map that works on parallel data sets (PDS).
On every element of pds the function func is called.
Parameters:  func (Python func) – A function that can be applied to every element of the pds
 pds (PDS class) – A parallel data set to which func should be applied
Returns: a new parallel data set that contains the result of the map
Return type: PDS class


class
abcpy.backends.base.
PDS
[source]¶ Bases:
object
The reference class for parallel data sets (PDS).

class
abcpy.backends.base.
BDS
[source]¶ Bases:
object
The reference class for broadcast data set (BDS).

class
abcpy.backends.base.
BackendDummy
[source]¶ Bases:
abcpy.backends.base.Backend
This is a dummy parallelization backend, meaning it doesn’t parallelize anything. It is mainly implemented for testing purposes.

parallelize
(python_list)[source]¶ This actually does nothing: it just wraps the Python list into dummy pds (PDSDummy).
Parameters: python_list (Python list) – The list to be wrapped
Return type: PDSDummy (parallel data set)

broadcast
(object)[source]¶ This actually does nothing: it just wraps the object into BDSDummy.
Parameters: object (Python object) – The object to be wrapped
Return type: BDSDummy class

map
(func, pds)[source]¶ This is a wrapper for the Python internal map function.
Parameters:  func (Python func) – A function that can be applied to every element of the pds
 pds (PDSDummy class) – A pseudo-parallel data set to which func should be applied
Returns: a new pseudo-parallel data set that contains the result of the map
Return type: PDSDummy class
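
The three operations of a non-parallel backend can be sketched in a few lines. The classes SketchBackend, SketchPDS and SketchBDS below are hypothetical and only mirror the interface described above (parallelize, broadcast, map); the attribute and method names on the wrappers are assumptions, not ABCpy's actual implementation.

```python
class SketchPDS:
    """Wraps a plain Python list to stand in for a parallel data set."""

    def __init__(self, python_list):
        self.python_list = python_list


class SketchBDS:
    """Wraps a plain Python object to stand in for a broadcast data set."""

    def __init__(self, obj):
        self.obj = obj

    def value(self):
        return self.obj


class SketchBackend:
    """Sequential backend: nothing is distributed, everything stays local."""

    def parallelize(self, python_list):
        return SketchPDS(python_list)

    def broadcast(self, obj):
        return SketchBDS(obj)

    def map(self, func, pds):
        # Python's built-in map does the work; the result is wrapped again.
        return SketchPDS(list(map(func, pds.python_list)))


backend = SketchBackend()
offset = backend.broadcast(10)
pds = backend.parallelize([1, 2, 3])
result = backend.map(lambda x: x + offset.value(), pds)
print(result.python_list)
```

Because the sketch exposes the same three calls as a real backend, code written against it would not need to know whether the work is distributed.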


class
abcpy.backends.base.
PDSDummy
(python_list)[source]¶ Bases:
abcpy.backends.base.PDS
This is a wrapper for a Python list to fake parallelization.

class
abcpy.backends.base.
BDSDummy
(object)[source]¶ Bases:
abcpy.backends.base.BDS
This is a wrapper for a Python object to fake parallelization.

class
abcpy.backends.spark.
BackendSpark
(sparkContext, parallelism=4)[source]¶ Bases:
abcpy.backends.base.Backend
A parallelization backend for Apache Spark. It is essentially a wrapper for the required Spark functionality.

__init__
(sparkContext, parallelism=4)[source]¶ Initialize the backend with an existing and configured SparkContext.
Parameters:  sparkContext (pyspark.SparkContext) – an existing and fully configured PySpark context
 parallelism (int) – defines on how many workers a distributed dataset can be distributed

parallelize
(python_list)[source]¶ This is a wrapper of pyspark.SparkContext.parallelize().
Parameters: python_list (Python list) – list that is distributed on the workers
Returns: A reference object that represents the parallelized list
Return type: PDSSpark class (parallel data set)

broadcast
(object)[source]¶ This is a wrapper for pyspark.SparkContext.broadcast().
Parameters: object (Python object) – An arbitrary object that should be available on all workers
Returns: A reference to the broadcasted object
Return type: BDSSpark class (broadcast data set)

map
(func, pds)[source]¶ This is a wrapper for pyspark.rdd.map()
Parameters:  func (Python func) – A function that can be applied to every element of the pds
 pds (PDSSpark class) – A parallel data set to which func should be applied
Returns: a new parallel data set that contains the result of the map
Return type: PDSSpark class


class
abcpy.backends.spark.
PDSSpark
(rdd)[source]¶ Bases:
abcpy.backends.base.PDS
This is a wrapper for Apache Spark RDDs.

class
abcpy.backends.spark.
BDSSpark
(bcv)[source]¶ Bases:
abcpy.backends.base.BDS
This is a wrapper for Apache Spark Broadcast variables.

class
abcpy.backends.mpi.
BackendMPIMaster
(master_node_ranks=[0], chunk_size=1)[source]¶ Bases:
abcpy.backends.base.Backend
Defines the behavior of the master process.
This class defines the behavior of the master process (the one with rank == 0) in MPI.

OP_PARALLELIZE
= 1¶

OP_MAP
= 2¶

OP_COLLECT
= 3¶

OP_BROADCAST
= 4¶

OP_DELETEPDS
= 5¶

OP_DELETEBDS
= 6¶

OP_FINISH
= 7¶

finalized
= False¶

__init__
(master_node_ranks=[0], chunk_size=1)[source]¶ Parameters:  master_node_ranks (Python list) – list of ranks computation should not happen on. Should include the master so it doesn’t get overwhelmed with work.
 chunk_size (Integer) – size of one block of data to be sent to free executors

parallelize
(python_list)[source]¶ This method distributes the list on the available workers and returns a reference object.
The list is split, as a numpy array, into as many parts as there are workers. Each part is sent to a separate worker node using the MPI scatter.
MASTER: python_list is the real data that is to be split up
Parameters: python_list (Python list) – the list that should get distributed on the worker nodes
Returns: A reference object that represents the parallelized list
Return type: PDSMPI class (parallel data set)

orchestrate_map
(pds_id)[source]¶ Orchestrates the slaves/workers to perform a map function
This works by keeping track of the workers who haven’t finished executing, waiting for them to request the next chunk of data when they are free, responding to them with the data and then sending them a Sentinel signalling that they can exit.
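
The request-and-respond pattern described above can be sketched without MPI. The example below is an assumption-laden illustration, not ABCpy's code: threads and queues stand in for MPI processes and messages, and the names worker, orchestrate and SENTINEL are hypothetical. Free workers ask for work, the master answers with the next chunk, and once the data is exhausted each worker receives a sentinel and exits.

```python
import queue
import threading

SENTINEL = None  # tells a worker that no data is left


def worker(worker_id, requests, reply, func, results):
    while True:
        requests.put(worker_id)   # announce that this worker is free
        message = reply.get()     # wait for the master's answer
        if message is SENTINEL:
            break
        chunk_index, chunk = message
        results[chunk_index] = [func(x) for x in chunk]


def orchestrate(data, chunk_size, n_workers, func):
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    requests = queue.Queue()
    replies = [queue.Queue() for _ in range(n_workers)]
    results = {}
    threads = [
        threading.Thread(target=worker,
                         args=(w, requests, replies[w], func, results))
        for w in range(n_workers)
    ]
    for t in threads:
        t.start()

    next_chunk = 0
    unfinished = set(range(n_workers))  # workers not yet released
    while unfinished:
        worker_id = requests.get()
        if next_chunk < len(chunks):
            replies[worker_id].put((next_chunk, chunks[next_chunk]))
            next_chunk += 1
        else:
            replies[worker_id].put(SENTINEL)
            unfinished.discard(worker_id)

    for t in threads:
        t.join()
    # Reassemble the chunks in their original order.
    return [y for i in range(len(chunks)) for y in results[i]]


squares = orchestrate(list(range(7)), chunk_size=2, n_workers=3,
                      func=lambda x: x * x)
print(squares)
```

Handing out chunks on demand keeps fast workers busy instead of fixing the assignment up front, which is presumably the motivation for the chunk_size parameter of the master.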

map
(func, pds)[source]¶ A distributed implementation of map that works on parallel data sets (PDS).
On every element of pds the function func is called.
Parameters:  func (Python func) – A function that can be applied to every element of the pds
 pds (PDS class) – A parallel data set to which func should be applied
Returns: a new parallel data set that contains the result of the map
Return type: PDSMPI class

collect
(pds)[source]¶ Gathers the pds from all the workers, sends it to the master and returns it as a standard Python list.
Parameters: pds (PDS class) – a parallel data set
Returns: all elements of pds as a list
Return type: Python list

broadcast
(value)[source]¶ Send object to all worker nodes without splitting it up.
Parameters: value (Python object) – An arbitrary object that should be available on all workers
Returns: A reference to the broadcasted object
Return type: BDS class (broadcast data set)


class
abcpy.backends.mpi.
BackendMPISlave
[source]¶ Bases:
abcpy.backends.base.Backend
Defines the behavior of the slaves/worker processes.
This class defines how the slaves should behave during operation. Slaves are those processes (not nodes, as in Spark) that have rank != 0 and whose ids are not present in the list of non-workers.

OP_PARALLELIZE
= 1¶

OP_MAP
= 2¶

OP_COLLECT
= 3¶

OP_BROADCAST
= 4¶

OP_DELETEPDS
= 5¶

OP_DELETEBDS
= 6¶

OP_FINISH
= 7¶

slave_run
()[source]¶ This method is the infinite loop a slave enters directly from init. It makes the slave wait for a command to perform from the master and then calls the appropriate function.
This method also takes care of the synchronization of data between the master and the slaves by matching PDSs based on the pds_ids sent by the master with the command.
Commands received from the master are of the form of a tuple. The first component of the tuple is always the operation to be performed and the rest are conditional on the operation.
 (op, pds_id) where op == OP_PARALLELIZE for parallelize
 (op, pds_id, pds_id_result, func) where op == OP_MAP for map
 (op, pds_id) where op == OP_COLLECT for a collect operation
 (op, pds_id) where op == OP_DELETEPDS for a delete of the remote PDS on slaves
 (op,) where op == OP_FINISH for the slave to break out of the loop and terminate
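
The command tuples listed above lend themselves to a simple dispatch loop. The sketch below is hypothetical: the function run_commands and its local store are illustrative, commands are read from a plain list instead of being received over MPI, and only the operation codes (taken from the constants documented for this class) come from the source.

```python
OP_PARALLELIZE, OP_MAP, OP_COLLECT, OP_DELETEPDS, OP_FINISH = 1, 2, 3, 5, 7


def run_commands(commands, local_data):
    """Process command tuples until OP_FINISH; return collected results."""
    store = {}      # pds_id -> list held by this worker
    collected = []
    for command in commands:
        op = command[0]  # the first component is always the operation
        if op == OP_PARALLELIZE:
            _, pds_id = command
            store[pds_id] = list(local_data)
        elif op == OP_MAP:
            _, pds_id, pds_id_result, func = command
            store[pds_id_result] = [func(x) for x in store[pds_id]]
        elif op == OP_COLLECT:
            _, pds_id = command
            collected.append(store[pds_id])
        elif op == OP_DELETEPDS:
            _, pds_id = command
            del store[pds_id]
        elif op == OP_FINISH:
            break  # leave the loop and terminate
    return collected, store


commands = [
    (OP_PARALLELIZE, 0),
    (OP_MAP, 0, 1, lambda x: x + 1),
    (OP_COLLECT, 1),
    (OP_DELETEPDS, 0),
    (OP_FINISH,),
]
collected, store = run_commands(commands, [1, 2, 3])
print(collected, sorted(store))
```

Keying the store by pds_id is what lets master and worker agree on which data set a later command refers to.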

parallelize
()[source]¶ This method distributes the list on the available workers and returns a reference object.
The list should be split into as many parts as there are workers. Each part should then be sent to a separate worker node.
Parameters: list (Python list) – the list that should get distributed on the worker nodes
Returns: A reference object that represents the parallelized list
Return type: PDS class (parallel data set)

map
(func)[source]¶ A distributed implementation of map that works on parallel data sets (PDS).
On every element of pds the function func is called.
Parameters: func (Python func) – A function that can be applied to every element of the pds
Returns: a new parallel data set that contains the result of the map
Return type: PDSMPI class


class
abcpy.backends.mpi.
BackendMPI
(master_node_ranks=[0])[source]¶ Bases:
abcpy.backends.mpi.BackendMPISlave
A backend parallelized by using MPI.
The backend conditionally inherits either the BackendMPIMaster class or the BackendMPISlave class depending on its rank. This lets BackendMPI have a uniform interface for the user but allows for a logical split between functions performed by the master and the slaves.
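
Choosing a base class at run time, as described above, is possible in Python because a class statement accepts any expression as its base. The sketch below is illustrative only: SketchMaster, SketchWorker and make_backend_class are hypothetical names, the rank is passed in as an argument instead of being queried from MPI, and the rule "rank 0 is the master" is a simplifying assumption.

```python
class SketchMaster:
    role = "master"


class SketchWorker:
    role = "worker"


def make_backend_class(rank):
    """Build a backend class whose base depends on the process rank."""
    base = SketchMaster if rank == 0 else SketchWorker

    class SketchBackendMPI(base):
        def describe(self):
            return f"rank {rank} acts as {self.role}"

    return SketchBackendMPI


print(make_backend_class(0)().describe())
print(make_backend_class(3)().describe())
```

User code instantiates the same class name on every process, while the inherited behavior differs by rank.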

class
abcpy.backends.mpi.
PDSMPI
(python_list, pds_id, backend_obj)[source]¶ Bases:
abcpy.backends.base.PDS
This is an MPI wrapper for a Python parallel data set.

class
abcpy.backends.mpi.
BDSMPI
(object, bds_id, backend_obj)[source]¶ Bases:
abcpy.backends.base.BDS
This is a wrapper for MPI’s BDS class.
abcpy.continuousmodels module¶

class
abcpy.continuousmodels.
Uniform
(parameters, name='Uniform')[source]¶ Bases:
abcpy.probabilisticmodels.ProbabilisticModel
,abcpy.probabilisticmodels.Continuous

__init__
(parameters, name='Uniform')[source]¶ This class implements a probabilistic model following a uniform distribution.
Parameters:  parameters (list) – Contains two lists. The first list specifies the probabilistic models and hyperparameters from which the lower bound of the uniform distribution derive. The second list specifies the probabilistic models and hyperparameters from which the upper bound derives.
 name (string, optional) – The name that should be given to the probabilistic model in the journal file.

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Samples from a uniform distribution using the current values for each probabilistic model from which the model derives.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 k (integer) – The number of samples that should be drawn.
 rng (Random number generator) – Defines the random number generator to be used. The default value uses a random seed to initialize the generator.
Returns: list – A list containing the sampled values as nparray.
Return type: [np.ndarray]
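
The role of the k and rng arguments can be shown with a small stand-alone sketch. It is not ABCpy's implementation: the function forward_simulate_uniform is hypothetical, it handles a one-dimensional uniform only, it uses the standard library's random.Random where ABCpy documents a NumPy generator, and it returns plain floats instead of arrays.

```python
import random


def forward_simulate_uniform(input_values, k, rng=None):
    """Draw k samples from a uniform on [lower, upper]."""
    lower, upper = input_values
    if rng is None:
        rng = random.Random()  # unseeded generator when none is supplied
    return [rng.uniform(lower, upper) for _ in range(k)]


# Passing a seeded generator makes the draws reproducible.
samples = forward_simulate_uniform([0.0, 1.0], 5, rng=random.Random(42))
repeat = forward_simulate_uniform([0.0, 1.0], 5, rng=random.Random(42))
print(samples == repeat, len(samples))
```

Supplying the generator explicitly, as in the usage lines, is what allows an inference run to be repeated exactly.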

get_output_dimension
()[source]¶ Provides the output dimension of the current model.
This function is particularly important if the current model is used as an input for other models. In such a case it is assumed that the output is always a vector of int or float. The length of the vector is the dimension that should be returned here.
Returns: The dimension of the output vector of a single forward simulation.
Return type: int

pdf
(input_values, x)[source]¶ Calculates the probability density function at point x. Commonly used to determine whether perturbed parameters are still valid according to the pdf.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 x (list) – The point at which the pdf should be evaluated.
Returns: The evaluated pdf at point x.
Return type: Float
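
The remark that the pdf is used to check whether perturbed parameters are still valid can be made concrete with a sketch. The function uniform_pdf below is hypothetical and assumes independent uniform components with per-dimension bounds; a density of zero signals that the point lies outside the support.

```python
def uniform_pdf(lower, upper, x):
    """Density of independent uniforms on [lower[i], upper[i]] at point x."""
    density = 1.0
    for lo, hi, xi in zip(lower, upper, x):
        if not lo <= xi <= hi:
            return 0.0  # outside the support: the point is not valid
        density /= (hi - lo)
    return density


inside = uniform_pdf([0.0, 1.0], [2.0, 5.0], [1.0, 3.0])
outside = uniform_pdf([0.0, 1.0], [2.0, 5.0], [3.0, 3.0])
print(inside, outside)
```

A perturbation step could reject any proposal for which such a function returns 0.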


class
abcpy.continuousmodels.
Normal
(parameters, name='Normal')[source]¶ Bases:
abcpy.probabilisticmodels.ProbabilisticModel
,abcpy.probabilisticmodels.Continuous

__init__
(parameters, name='Normal')[source]¶ This class implements a probabilistic model following a normal distribution with mean mu and variance sigma.
Parameters:  parameters (list) – Contains the probabilistic models and hyperparameters from which the model derives. The list has two entries: the first determines the mean of the distribution and the second its variance. Note that the second value must be strictly greater than 0.
 name (string) – The name that should be given to the probabilistic model in the journal file.

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Samples from a normal distribution using the current values for each probabilistic model from which the model derives.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 k (integer) – The number of samples that should be drawn.
 rng (Random number generator) – Defines the random number generator to be used. The default value uses a random seed to initialize the generator.
Returns: list – A list containing the sampled values as nparray.
Return type: [np.ndarray]

get_output_dimension
()[source]¶ Provides the output dimension of the current model.
This function is particularly important if the current model is used as an input for other models. In such a case it is assumed that the output is always a vector of int or float. The length of the vector is the dimension that should be returned here.
Returns: The dimension of the output vector of a single forward simulation.
Return type: int

pdf
(input_values, x)[source]¶ Calculates the probability density function at point x. Commonly used to determine whether perturbed parameters are still valid according to the pdf.
Parameters:  input_values (list) – List of input parameters of the form [mu, sigma]
 x (list) – The point at which the pdf should be evaluated.
Returns: The evaluated pdf at point x.
Return type: Float


class
abcpy.continuousmodels.
StudentT
(parameters, name='StudentT')[source]¶ Bases:
abcpy.probabilisticmodels.ProbabilisticModel
,abcpy.probabilisticmodels.Continuous

__init__
(parameters, name='StudentT')[source]¶ This class implements a probabilistic model following the Student’s T-distribution.
Parameters:  parameters (list) – Contains the probabilistic models and hyperparameters from which the model derives. The list has two entries: the first determines the mean of the distribution and the second its degrees of freedom. Note that the second value must be strictly greater than 0.
 name (string) – The name that should be given to the probabilistic model in the journal file.

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Samples from a Student’s T-distribution using the current values for each probabilistic model from which the model derives.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 k (integer) – The number of samples that should be drawn.
 rng (Random number generator) – Defines the random number generator to be used. The default value uses a random seed to initialize the generator.
Returns: list – A list containing the sampled values as nparray.
Return type: [np.ndarray]

get_output_dimension
()[source]¶ Provides the output dimension of the current model.
This function is particularly important if the current model is used as an input for other models. In such a case it is assumed that the output is always a vector of int or float. The length of the vector is the dimension that should be returned here.
Returns: The dimension of the output vector of a single forward simulation.
Return type: int

pdf
(input_values, x)[source]¶ Calculates the probability density function at point x. Commonly used to determine whether perturbed parameters are still valid according to the pdf.
Parameters:  input_values (list) – List of input parameters
 x (list) – The point at which the pdf should be evaluated.
Returns: The evaluated pdf at point x.
Return type: Float


class
abcpy.continuousmodels.
MultivariateNormal
(parameters, name='Multivariate Normal')[source]¶ Bases:
abcpy.probabilisticmodels.ProbabilisticModel
,abcpy.probabilisticmodels.Continuous

__init__
(parameters, name='Multivariate Normal')[source]¶ This class implements a probabilistic model following a multivariate normal distribution with a mean and a covariance matrix.
Parameters:  parameters (list of length 2) – Contains the probabilistic models and hyperparameters from which the model derives. The first entry defines the mean, while the second entry defines the covariance matrix. Note that if the mean is n-dimensional, the covariance matrix is required to be of dimension n x n, symmetric and positive-definite.
 name (string) – The name that should be given to the probabilistic model in the journal file.

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Samples from a multivariate normal distribution using the current values for each probabilistic model from which the model derives.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 k (integer) – The number of samples that should be drawn.
 rng (Random number generator) – Defines the random number generator to be used. The default value uses a random seed to initialize the generator.
Returns: list – A list containing the sampled values as nparray.
Return type: [np.ndarray]

get_output_dimension
()[source]¶ Provides the output dimension of the current model.
This function is particularly important if the current model is used as an input for other models. In such a case it is assumed that the output is always a vector of int or float. The length of the vector is the dimension that should be returned here.
Returns: The dimension of the output vector of a single forward simulation.
Return type: int

pdf
(input_values, x)[source]¶ Calculates the probability density function at point x. Commonly used to determine whether perturbed parameters are still valid according to the pdf.
Parameters:  input_values (list) – List of input parameters
 x (list) – The point at which the pdf should be evaluated.
Returns: The evaluated pdf at point x.
Return type: Float


class
abcpy.continuousmodels.
MultiStudentT
(parameters, name='MultiStudentT')[source]¶ Bases:
abcpy.probabilisticmodels.ProbabilisticModel
,abcpy.probabilisticmodels.Continuous

__init__
(parameters, name='MultiStudentT')[source]¶ This class implements a probabilistic model following the multivariate Student’s T-distribution.
Parameters:  parameters (list) – All but the last two entries contain the probabilistic models and hyperparameters from which the model derives. The second to last entry contains the covariance matrix. If the mean is of dimension n, the covariance matrix is required to be nxn dimensional. The last entry contains the degrees of freedom.
 name (string) – The name that should be given to the probabilistic model in the journal file.

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Samples from a multivariate Student’s T-distribution using the current values for each probabilistic model from which the model derives.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 k (integer) – The number of samples that should be drawn.
 rng (Random number generator) – Defines the random number generator to be used. The default value uses a random seed to initialize the generator.
Returns: list – A list containing the sampled values as nparray.
Return type: [np.ndarray]

get_output_dimension
()[source]¶ Provides the output dimension of the current model.
This function is particularly important if the current model is used as an input for other models. In such a case it is assumed that the output is always a vector of int or float. The length of the vector is the dimension that should be returned here.
Returns: The dimension of the output vector of a single forward simulation.
Return type: int

pdf
(input_values, x)[source]¶ Calculates the probability density function at point x. Commonly used to determine whether perturbed parameters are still valid according to the pdf.
Parameters:  input_values (list) – List of input parameters
 x (list) – The point at which the pdf should be evaluated.
Returns: The evaluated pdf at point x.
Return type: Float

abcpy.discretemodels module¶

class
abcpy.discretemodels.
Bernoulli
(parameters, name='Bernoulli')[source]¶ Bases:
abcpy.probabilisticmodels.Discrete
,abcpy.probabilisticmodels.ProbabilisticModel

__init__
(parameters, name='Bernoulli')[source]¶ This class implements a probabilistic model following a Bernoulli distribution.
Parameters:  parameters (list) – A list containing one entry, the probability of the distribution.
 name (string) – The name that should be given to the probabilistic model in the journal file.

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Samples from the Bernoulli distribution associated with the probabilistic model.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 k (integer) – The number of samples to be drawn.
 rng (random number generator) – The random number generator to be used.
Returns: list – A list containing the sampled values as nparray.
Return type: [np.ndarray]

get_output_dimension
()[source]¶ Provides the output dimension of the current model.
This function is particularly important if the current model is used as an input for other models. In such a case it is assumed that the output is always a vector of int or float. The length of the vector is the dimension that should be returned here.
Returns: The dimension of the output vector of a single forward simulation.
Return type: int

pmf
(input_values, x)[source]¶ Evaluates the probability mass function at point x.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 x (float) – The point at which the pmf should be evaluated.
Returns: The pmf evaluated at point x.
Return type: float


class
abcpy.discretemodels.
Binomial
(parameters, name='Binomial')[source]¶ Bases:
abcpy.probabilisticmodels.Discrete
,abcpy.probabilisticmodels.ProbabilisticModel

__init__
(parameters, name='Binomial')[source]¶ This class implements a probabilistic model following a binomial distribution.
Parameters:  parameters (list) – Contains the probabilistic models and hyperparameters from which the model derives. Note that the first entry of the list, n, must be an integer greater than or equal to 0, while the second entry, p, has to be in the interval [0,1].
 name (string) – The name that should be given to the probabilistic model in the journal file.

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Samples from a binomial distribution using the current values for each probabilistic model from which the model derives.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 k (integer) – The number of samples that should be drawn.
 rng (Random number generator) – Defines the random number generator to be used. The default value uses a random seed to initialize the generator.
Returns: list – A list containing the sampled values as nparray.
Return type: [np.ndarray]

get_output_dimension
()[source]¶ Provides the output dimension of the current model.
This function is particularly important if the current model is used as an input for other models. In such a case it is assumed that the output is always a vector of int or float. The length of the vector is the dimension that should be returned here.
Returns: The dimension of the output vector of a single forward simulation.
Return type: int

pmf
(input_values, x)[source]¶ Calculates the probability mass function at point x.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 x (list) – The point at which the pmf should be evaluated.
Returns: The evaluated pmf at point x.
Return type: float
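As an illustration of what forward_simulate and pmf compute for a binomial model, the following is a minimal standard-library sketch. It does not use ABCpy; the function names binomial_sample and binomial_pmf are hypothetical and only mirror the roles of the methods documented above.

```python
import random
from math import comb


def binomial_sample(n, p, k, rng):
    """Draw k values from Binomial(n, p) by counting Bernoulli successes."""
    return [sum(rng.random() < p for _ in range(n)) for _ in range(k)]


def binomial_pmf(n, p, x):
    """P(X = x) for X ~ Binomial(n, p); zero outside {0, ..., n}."""
    if not (isinstance(x, int) and 0 <= x <= n):
        return 0.0
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


rng = random.Random(1)  # seeded generator, playing the role of `rng`
samples = binomial_sample(n=10, p=0.5, k=5, rng=rng)
print(samples)
print(binomial_pmf(10, 0.5, 5))
```

Here n and p play the role of the two entries of input_values, and k is the number of samples to draw.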


class
abcpy.discretemodels.
Poisson
(parameters, name='Poisson')[source]¶ Bases:
abcpy.probabilisticmodels.Discrete
,abcpy.probabilisticmodels.ProbabilisticModel

__init__
(parameters, name='Poisson')[source]¶ This class implements a probabilistic model following a Poisson distribution.
Parameters:  parameters (list) – A list containing one entry, the mean of the distribution.
 name (string) – The name that should be given to the probabilistic model in the journal file.

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Samples k values from the defined Poisson distribution.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 k (integer) – The number of samples.
 rng (random number generator) – The random number generator to be used.
Returns: list – A list containing the sampled values as np.ndarray.
Return type: [np.ndarray]

get_output_dimension
()[source]¶ Provides the output dimension of the current model.
This function is in particular important if the current model is used as an input for other models. In such a case it is assumed that the output is always a vector of int or float. The length of the vector is the dimension that should be returned here.
Returns: The dimension of the output vector of a single forward simulation. Return type: int

pmf
(input_values, x)[source]¶ Calculates the probability mass function of the distribution at point x.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 x (integer) – The point at which the pmf should be evaluated.
Returns: The evaluated pmf at point x.
Return type: float
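The quantity returned by pmf for a Poisson model can be sketched with the standard library alone. The function name poisson_pmf is hypothetical; its mean argument corresponds to the single entry of input_values.

```python
from math import exp, factorial


def poisson_pmf(mean, x):
    """P(X = x) for X ~ Poisson(mean); zero for non-integer or negative x."""
    if not (isinstance(x, int) and x >= 0):
        return 0.0
    return exp(-mean) * mean ** x / factorial(x)


print(poisson_pmf(2.0, 3))
```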


class
abcpy.discretemodels.
DiscreteUniform
(parameters, name='DiscreteUniform')[source]¶ Bases:
abcpy.probabilisticmodels.Discrete
,abcpy.probabilisticmodels.ProbabilisticModel

__init__
(parameters, name='DiscreteUniform')[source]¶ This class implements a probabilistic model following a Discrete Uniform distribution.
Parameters:  parameters (list) – A list containing two entries, the upper and lower bound of the range.
 name (string) – The name that should be given to the probabilistic model in the journal file.

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Samples from the Discrete Uniform distribution associated with the probabilistic model.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 k (integer) – The number of samples to be drawn.
 rng (random number generator) – The random number generator to be used.
Returns: list – A list containing the sampled values as np.ndarray.
Return type: [np.ndarray]

get_output_dimension
()[source]¶ Provides the output dimension of the current model.
This function is in particular important if the current model is used as an input for other models. In such a case it is assumed that the output is always a vector of int or float. The length of the vector is the dimension that should be returned here.
Returns: The dimension of the output vector of a single forward simulation. Return type: int

pmf
(input_values, x)[source]¶ Evaluates the probability mass function at point x.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 x (float) – The point at which the pmf should be evaluated.
Returns: The pmf evaluated at point x.
Return type: float
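A minimal sketch of a discrete uniform pmf follows. It assumes that both bounds are inclusive, which this reference does not state, and the function name discrete_uniform_pmf is hypothetical.

```python
def discrete_uniform_pmf(lower, upper, x):
    """P(X = x) for X uniform on the integers lower..upper (both inclusive)."""
    # Non-integer points and points outside the range have probability zero.
    if x != int(x) or not (lower <= x <= upper):
        return 0.0
    return 1.0 / (upper - lower + 1)


print(discrete_uniform_pmf(1, 6, 3.0))
```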

abcpy.distances module¶

class
abcpy.distances.
Distance
(statistics_calc)[source]¶ Bases:
object
This abstract base class defines how the distance between the observed and simulated data should be implemented.

__init__
(statistics_calc)[source]¶ The constructor of a subclass must accept a non-optional statistics calculator as a parameter. If stored to self.statistics_calc, the private helper method _calculate_summary_stat can be used.
Parameters: statistics_calc (abcpy.statistics.Statistics) – Statistics extractor object that conforms to the Statistics class.

distance
(d2)[source]¶ To be overwritten by any subclass: should calculate the distance between two sets of data d1 and d2 using their respective statistics.
Notes
The data sets d1 and d2 are array-like structures that contain n1 and n2 data points each. An implementation of the distance function should proceed in the following steps:
1. Transform both input sets dX = [ dX1, dX2, …, dXn ] to sX = [sX1, sX2, …, sXn] using the statistics object. See _calculate_summary_stat method.
2. Calculate the mutual desired distance, here denoted by -, between the statistics: dist = [s11 - s21, s12 - s22, …, s1n - s2n].
Important: any subclass must not calculate the distance between data sets d1 and d2 directly. This is the reason why any subclass must be initialized with a statistics object.
Parameters:  d1 (Python list) – Contains n1 data points.
 d2 (Python list) – Contains n2 data points.
Returns: The distance between the two input data sets.
Return type: numpy.ndarray
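The two steps in the notes above can be sketched as follows. This is a standalone illustration that does not inherit from abcpy.distances.Distance; the classes MeanStatistic and ToyEuclidean and the method name statistics are hypothetical stand-ins for a statistics calculator and a distance subclass.

```python
from math import sqrt


class MeanStatistic:
    """Hypothetical statistics calculator: maps a data set to [mean]."""

    def statistics(self, data):
        return [sum(data) / len(data)]


class ToyEuclidean:
    """Distance computed on summary statistics, never on the raw data."""

    def __init__(self, statistics_calc):
        self.statistics_calc = statistics_calc

    def distance(self, d1, d2):
        # Step 1: transform both data sets into summary statistics.
        s1 = self.statistics_calc.statistics(d1)
        s2 = self.statistics_calc.statistics(d2)
        # Step 2: compute the distance between the statistics.
        return sqrt(sum((a - b) ** 2 for a, b in zip(s1, s2)))

    def dist_max(self):
        # The Euclidean distance is unbounded.
        return float("inf")


dist = ToyEuclidean(MeanStatistic())
print(dist.distance([1, 2, 3], [4, 5, 6]))
```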

dist_max
()[source]¶ To be overwritten by subclass: should return maximum possible value of the desired distance function.
Examples
If the desired distance maps to \(\mathbb{R}\), this method should return numpy.inf.
Returns: The maximal possible value of the desired distance function. Return type: numpy.float


class
abcpy.distances.
Euclidean
(statistics)[source]¶ Bases:
abcpy.distances.Distance
This class implements the Euclidean distance between two vectors.
The maximum value of the distance is np.inf.

__init__
(statistics)[source]¶ The constructor of a subclass must accept a non-optional statistics calculator as a parameter. If stored to self.statistics_calc, the private helper method _calculate_summary_stat can be used.
Parameters: statistics_calc (abcpy.statistics.Statistics) – Statistics extractor object that conforms to the Statistics class.

distance
(d1, d2)[source]¶ Calculates the distance between two datasets.
Parameters: d1, d2 (list) – Each a list, containing a list describing the data set

dist_max
()[source]¶ To be overwritten by subclass: should return maximum possible value of the desired distance function.
Examples
If the desired distance maps to \(\mathbb{R}\), this method should return numpy.inf.
Returns: The maximal possible value of the desired distance function. Return type: numpy.float


class
abcpy.distances.
PenLogReg
(statistics)[source]¶ Bases:
abcpy.distances.Distance
This class implements a distance measure based on the classification accuracy.
The classification accuracy between two data sets d1 and d2 is computed using lasso-penalized logistic regression and returned as a distance. The lasso-penalized logistic regression is done using the glmnet package of Friedman et al. [2]. While computing the distance, the algorithm automatically chooses the most relevant summary statistics, as explained in Gutmann et al. [1]. The maximum value of the distance is 1.0.
[1] Gutmann, M., Dutta, R., Kaski, S., and Corander, J. (2014). Statistical inference of intractable generative models via classification. arXiv:1407.4981.
[2] Friedman, J., Hastie, T., and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1–22.

__init__
(statistics)[source]¶ The constructor of a subclass must accept a non-optional statistics calculator as a parameter. If stored to self.statistics_calc, the private helper method _calculate_summary_stat can be used.
Parameters: statistics_calc (abcpy.statistics.Statistics) – Statistics extractor object that conforms to the Statistics class.

distance
(d1, d2)[source]¶ Calculates the distance between two datasets.
Parameters: d1, d2 (list) – Each a list, containing a list describing the data set

dist_max
()[source]¶ To be overwritten by subclass: should return maximum possible value of the desired distance function.
Examples
If the desired distance maps to \(\mathbb{R}\), this method should return numpy.inf.
Returns: The maximal possible value of the desired distance function. Return type: numpy.float


class
abcpy.distances.
LogReg
(statistics)[source]¶ Bases:
abcpy.distances.Distance
This class implements a distance measure based on the classification accuracy [1]. The classification accuracy between two data sets d1 and d2 is computed using logistic regression and returned as a distance. The maximum value of the distance is 1.0.
[1] Gutmann, M., Dutta, R., Kaski, S., and Corander, J. (2014). Statistical inference of intractable generative models via classification. arXiv:1407.4981.

__init__
(statistics)[source]¶ The constructor of a subclass must accept a non-optional statistics calculator as a parameter. If stored to self.statistics_calc, the private helper method _calculate_summary_stat can be used.
Parameters: statistics_calc (abcpy.statistics.Statistics) – Statistics extractor object that conforms to the Statistics class.

distance
(d1, d2)[source]¶ Calculates the distance between two datasets.
Parameters: d1, d2 (list) – Each a list, containing a list describing the data set
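The idea of using classification accuracy as a distance can be sketched without any external library. The sketch below is not ABCpy's implementation: it fits a one-feature logistic regression by plain gradient descent, measures accuracy on the training points, and rescales it as 2 * (accuracy - 0.5) so that the maximum is 1.0. The rescaling, the optimiser settings, and the function name classification_distance are assumptions made for illustration.

```python
from math import exp


def _sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def classification_distance(s1, s2, lr=0.1, n_iter=500):
    """s1, s2: lists of scalar summary statistics, one per data point."""
    xs = list(s1) + list(s2)
    ys = [0] * len(s1) + [1] * len(s2)  # label each point by its data set
    centre = sum(xs) / len(xs)
    xs = [x - centre for x in xs]  # centre the single feature

    # Fit logistic regression by batch gradient descent.
    w, b = 0.0, 0.0
    for _ in range(n_iter):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = _sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)

    # Training accuracy of the fitted classifier.
    correct = sum((_sigmoid(w * x + b) >= 0.5) == (y == 1)
                  for x, y in zip(xs, ys))
    accuracy = correct / len(xs)
    # Chance-level accuracy (0.5) maps to 0, perfect accuracy to 1.
    return max(0.0, 2.0 * (accuracy - 0.5))


print(classification_distance([0, 1, 2], [10, 11, 12]))
print(classification_distance([0, 1, 2], [0, 1, 2]))
```

When the two sets are easy to tell apart the classifier is accurate and the distance is large; when they are indistinguishable the classifier does no better than chance and the distance is zero.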

dist_max
()[source]¶ To be overwritten by subclass: should return maximum possible value of the desired distance function.
Examples
If the desired distance maps to \(\mathbb{R}\), this method should return numpy.inf.
Returns: The maximal possible value of the desired distance function. Return type: numpy.float

abcpy.graphtools module¶

class
abcpy.graphtools.
GraphTools
[source]¶ Bases:
object
This class implements all methods that will be called recursively on the graph structure.

sample_from_prior
(model=None, rng=np.random.RandomState())[source]¶ Samples values for all random variables of the model. Commonly used to sample new parameter values on the whole graph.
Parameters:  model (abcpy.ProbabilisticModel object) – The root model for which sample_from_prior should be called.
 rng (Random number generator) – Defines the random number generator to be used

pdf_of_prior
(models, parameters, mapping=None, is_root=True)[source]¶ Calculates the joint probability density function of the prior of the specified models at the given parameter values. Commonly used to check whether new parameters are valid given the prior, as well as to calculate acceptance probabilities.
Parameters:  models (list of abcpy.ProbabilisticModel objects) – Defines the models for which the pdf of their prior should be evaluated
 parameters (python list) – The parameters at which the pdf should be evaluated
 mapping (list of tuples) – Defines the mapping of probabilistic models and index in a parameter list.
 is_root (boolean) – A flag specifying whether the provided models are the root models. This is to ensure that the pdf is calculated correctly.
Returns: The resulting pdf, as well as the next index to be considered in the parameters list.
Return type: list

get_parameters
(models=None, is_root=True)[source]¶ Returns the current values of all free parameters in the model. Commonly used before perturbing the parameters of the model.
Parameters:  models (list of abcpy.ProbabilisticModel objects) – The models for which, together with their parents, the parameter values should be returned. If no value is provided, the root models are assumed to be the model of the inference method.
 is_root (boolean) – Specifies whether the current models are at the root. This ensures that the values corresponding to simulated observations will not be returned.
Returns: A list containing all currently sampled values of the free parameters.
Return type: list

set_parameters
(parameters, models=None, index=0, is_root=True)[source]¶ Sets new values for the currently used values of each random variable. Commonly used after perturbing the parameter values using a kernel.
Parameters:  parameters (list) – Defines the values to which the respective parameter values of the models should be set
 models (list of abcpy.ProbabilisticModel objects) – Defines all models for which, together with their parents, new values should be set. If no value is provided, the root models are assumed to be the model of the inference method.
 index (integer) – The current index to be considered in the parameters list
 is_root (boolean) – Defines whether the current models are at the root. This ensures that only values corresponding to random variables will be set.
Returns: list – Returns whether it was possible to set all parameters and the next index to be considered in the parameters list.
Return type: [boolean, integer]
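The interplay of get_parameters and set_parameters can be illustrated on a toy graph. The sketch below is not ABCpy code: the ToyModel class, the parents-before-children visiting order, and the _seen bookkeeping are assumptions chosen to mirror the documented behaviour, namely that root models (the simulated observations) contribute no values and that set_parameters returns a success flag together with the next index.

```python
class ToyModel:
    """Hypothetical graph node: one free parameter value plus parent models."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = list(parents)


def get_parameters(models, is_root=True, _seen=None):
    """Collect the values of all non-root models, depth first."""
    seen = set() if _seen is None else _seen
    values = []
    for model in models:
        if id(model) in seen:  # visit shared parents only once
            continue
        seen.add(id(model))
        values.extend(get_parameters(model.parents, False, seen))
        if not is_root:
            values.append(model.value)
    return values


def set_parameters(parameters, models, index=0, is_root=True, _seen=None):
    """Write values back in the same order; return [success, next index]."""
    seen = set() if _seen is None else _seen
    for model in models:
        if id(model) in seen:
            continue
        seen.add(id(model))
        ok, index = set_parameters(parameters, model.parents, index, False, seen)
        if not ok:
            return [False, index]
        if not is_root:
            if index >= len(parameters):  # not enough values supplied
                return [False, index]
            model.value = parameters[index]
            index += 1
    return [True, index]


mu = ToyModel(1.0)
sigma = ToyModel(2.0)
observed = ToyModel(None, parents=[mu, sigma])  # root model

print(get_parameters([observed]))
print(set_parameters([3.0, 4.0], [observed]))
print(get_parameters([observed]))
```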

get_correct_ordering
(parameters_and_models, models=None, is_root=True)[source]¶ Orders the parameters returned by a kernel in the order required by the graph. Commonly used when perturbing the parameters.
Parameters:  parameters_and_models (list of tuples) – Contains tuples containing as the first entry the probabilistic model to be considered and as the second entry the parameter values associated with this model
 models (list) – Contains the root probabilistic models that make up the graph. If no value is provided, the root models are assumed to be the model of the inference method.
Returns: The ordering which can be used by recursive functions on the graph.
Return type: list

simulate
(n_samples_per_param, rng=np.random.RandomState())[source]¶ Simulates data of each model using the currently sampled or perturbed parameters.
Parameters: rng (random number generator) – The random number generator to be used.
Returns: Each entry corresponds to the simulated data of one model.
Return type: list

abcpy.output module¶

class
abcpy.output.
Journal
(type)[source]¶ Bases:
object
The journal holds information created by the run of inference schemes.
It can also be configured to hold intermediate results.

parameters
¶ numpy.array – an n x p x t matrix

weights
¶ numpy.array – an n x t matrix

opt_value
¶ numpy.array – an n x p matrix containing for each parameter the evaluated objective function for every time step

configuration
¶ Python dictionary – dictionary containing the scheme's configuration parameters

__init__
(type)[source]¶ Initializes a new output journal of given type.
Parameters: type (int (identifying type)) – type=0 only logs final parameters and weights (production use); type=1 logs all generated information (reproducibility use).

classmethod
fromFile
(filename)[source]¶ This method reads a saved journal from disk and returns it as an object.
Notes
To store a journal use Journal.save(filename).
Parameters: filename (string) – The string representing the location of a file
Returns: The journal object serialized in <filename>
Return type: abcpy.output.Journal
Example
>>> jnl = Journal.fromFile('example_output.jnl')

add_parameters
(params)[source]¶ Saves provided parameters by appending them to the journal. If type==0, old parameters get overwritten.
Parameters: params (numpy.array) – nxp matrix containing n parameters of dimension p
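The difference between type==0 and type==1 can be shown with a small stand-in class. ToyJournal is hypothetical and stores plain Python lists; it only mimics the overwrite-versus-append behaviour described above, not the real Journal.

```python
class ToyJournal:
    """Hypothetical journal: type 0 keeps the latest entry, type 1 keeps all."""

    def __init__(self, type):
        self.type = type
        self.parameters = []

    def add_parameters(self, params):
        if self.type == 0:
            self.parameters = [params]  # overwrite older results
        else:
            self.parameters.append(params)  # keep every iteration

    def get_parameters(self, iteration=None):
        # Without an iteration, return the most recent entry.
        if iteration is None:
            return self.parameters[-1]
        return self.parameters[iteration]


final_only = ToyJournal(0)
everything = ToyJournal(1)
for params in ([[0.1], [0.2]], [[0.3], [0.4]]):
    final_only.add_parameters(params)
    everything.add_parameters(params)

print(len(final_only.parameters), len(everything.parameters))
```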

add_user_parameters
(names_and_params)[source]¶ Saves the provided parameters and names of the probabilistic models corresponding to them. If type==0, old parameters get overwritten.
Parameters: names_and_params (list) – Each entry is a tuple, where the first entry is the name of the probabilistic model, and the second entry is the parameters associated with this model.

get_parameters
(iteration=None)[source]¶ Returns the parameters from a sampling scheme.
For intermediate results, pass the iteration.
Parameters: iteration (int) – specify the iteration for which to return parameters

get_weights
(iteration=None)[source]¶ Returns the weights from a sampling scheme.
For intermediate results, pass the iteration.
Parameters: iteration (int) – specify the iteration for which to return weights

add_weights
(weights)[source]¶ Saves provided weights by appending them to the journal. If type==0, old weights get overwritten.
Parameters: weights (numpy.array) – vector containing n weights

get_distances
(iteration=None)[source]¶ Returns the distances from a sampling scheme.
For intermediate results, pass the iteration.
Parameters: iteration (int) – specify the iteration for which to return distances

add_distances
(distances)[source]¶ Saves provided distances by appending them to the journal. If type==0, old distances get overwritten.
Parameters: distances (numpy.array) – vector containing n distances

add_opt_values
(opt_values)[source]¶ Saves provided values of the evaluation of the scheme's objective function. If type==0, old values get overwritten.
Parameters: opt_values (numpy.array) – vector containing n evaluations of the scheme's objective function

save
(filename)[source]¶ Stores the journal to disk.
Parameters: filename (string) – the location of the file to store the current object to.

posterior_mean
()[source]¶ Computes the posterior mean from the samples drawn from the posterior distribution.
Returns: posterior mean Return type: np.ndarray
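A posterior mean over weighted samples is commonly computed as a weight-normalised average. The sketch below assumes that this is what is meant here; the function name weighted_mean is hypothetical, and the inputs are plain lists rather than the numpy arrays stored in the journal.

```python
def weighted_mean(parameters, weights):
    """parameters: n rows of p values; weights: n non-negative weights."""
    total = sum(weights)
    dim = len(parameters[0])
    # Average each parameter dimension separately, normalising by the total weight.
    return [
        sum(w * row[j] for row, w in zip(parameters, weights)) / total
        for j in range(dim)
    ]


print(weighted_mean([[0.0, 10.0], [2.0, 20.0]], [1.0, 3.0]))
```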

abcpy.inferences module¶

class
abcpy.inferences.
InferenceMethod
[source]¶ Bases:
abcpy.graphtools.GraphTools
This abstract base class represents an inference method.

sample
()[source]¶ To be overwritten by any subclass: Samples from the posterior distribution of the model parameter given the observed data observations.

model
¶ To be overwritten by any subclass – an attribute specifying the model to be used

rng
¶ To be overwritten by any subclass – an attribute specifying the random number generator to be used

backend
¶ To be overwritten by any subclass – an attribute specifying the backend to be used.

n_samples
¶ To be overwritten by any subclass – an attribute specifying the number of samples to be generated

n_samples_per_param
¶ To be overwritten by any subclass – an attribute specifying the number of data points in each simulated data set.


class
abcpy.inferences.
BaseMethodsWithKernel
[source]¶ Bases:
object
This abstract base class represents inference methods that have a kernel.

kernel
¶ To be overwritten by any subclass – an attribute specifying the transition or perturbation kernel.

perturb
(column_index, epochs=10, rng=np.random.RandomState())[source]¶ Perturbs all free parameters, given the current weights. Commonly used during inference.
Parameters:  column_index (integer) – The index of the column in the accepted_parameters_bds that should be used for perturbation
 epochs (integer) – The number of times perturbation should happen before the algorithm is terminated
Returns: Whether it was possible to set new parameter values for all probabilistic models
Return type: boolean


class
abcpy.inferences.
BaseLikelihood
[source]¶ Bases:
abcpy.inferences.InferenceMethod
,abcpy.inferences.BaseMethodsWithKernel
This abstract base class represents inference methods that use the likelihood.

likfun
¶ To be overwritten by any subclass – an attribute specifying the likelihood function to be used.


class
abcpy.inferences.
BaseDiscrepancy
[source]¶ Bases:
abcpy.inferences.InferenceMethod
,abcpy.inferences.BaseMethodsWithKernel
This abstract base class represents inference methods using discrepancy.

distance
¶ To be overwritten by any subclass – an attribute specifying the distance function.


class
abcpy.inferences.
RejectionABC
(root_models, distances, backend, seed=None)[source]¶ Bases:
abcpy.inferences.InferenceMethod
This base class implements the rejection algorithm based inference scheme [1] for Approximate Bayesian Computation.
[1] Tavaré, S., Balding, D., Griffith, R., Donnelly, P.: Inferring coalescence times from DNA sequence data. Genetics 145(2), 505–518 (1997).
Parameters:  model (list) – A list of the Probabilistic models corresponding to the observed datasets
 distance (abcpy.distances.Distance) – Distance object defining the distance measure to compare simulated and observed data sets.
 backend (abcpy.backends.Backend) – Backend object defining the backend to be used.
 seed (integer, optional) – Optional initial seed for the random number generator. The default value is generated randomly.

n_samples
= None¶

n_samples_per_param
= None¶

epsilon
= None¶

__init__
(root_models, distances, backend, seed=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.

model
= None¶

distance
= None¶

backend
= None¶

rng
= None¶

sample
(observations, n_samples, n_samples_per_param, epsilon, full_output=0)[source]¶ Samples from the posterior distribution of the model parameter given the observed data observations.
Parameters:  observations (list) – A list, containing lists describing the observed data sets
 n_samples (integer) – Number of samples to generate
 n_samples_per_param (integer) – Number of data points in each simulated data set.
 epsilon (float) – Value of threshold
 full_output (integer, optional) – If full_output==1, intermediate results are included in output journal. The default value is 0, meaning the intermediate results are not saved.
Returns: a journal containing simulation results, metadata and optionally intermediate results.
Return type: abcpy.output.Journal
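The rejection scheme behind this method can be sketched in a few lines of standard-library Python. This is a generic illustration, not ABCpy's implementation: the function rejection_abc, its callable arguments, and the toy Gaussian model are assumptions, while n_samples, n_samples_per_param, epsilon and seed carry the meanings documented above.

```python
import random


def rejection_abc(observed, prior_sample, simulate, distance,
                  n_samples, n_samples_per_param, epsilon, seed=None):
    """Keep prior draws whose simulated data lie within epsilon of the data."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n_samples:
        theta = prior_sample(rng)  # propose from the prior
        simulated = simulate(theta, n_samples_per_param, rng)
        d = distance(observed, simulated)
        if d < epsilon:  # accept only close simulations
            accepted.append((theta, d))
    return accepted


def mean(values):
    return sum(values) / len(values)


# Toy problem: infer the mean of a Gaussian with known unit variance.
observed = [4.8, 5.1, 5.3, 4.9, 5.0]
accepted = rejection_abc(
    observed,
    prior_sample=lambda rng: rng.uniform(0.0, 10.0),
    simulate=lambda theta, k, rng: [rng.gauss(theta, 1.0) for _ in range(k)],
    distance=lambda a, b: abs(mean(a) - mean(b)),
    n_samples=50,
    n_samples_per_param=20,
    epsilon=0.5,
    seed=1,
)
print(mean([theta for theta, _ in accepted]))
```

A smaller epsilon gives a closer approximation to the posterior at the cost of more rejected proposals.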

class
abcpy.inferences.
PMCABC
(root_models, distances, backend, kernel=None, seed=None)[source]¶ Bases:
abcpy.inferences.BaseDiscrepancy
,abcpy.inferences.InferenceMethod
This base class implements a modified version of the Population Monte Carlo based inference scheme for Approximate Bayesian Computation of Beaumont et al. [1]. Here the threshold value at the t-th generation is chosen adaptively by taking the maximum of the epsilon_percentile-th percentile of the discrepancies of the accepted parameters at the (t-1)-th generation and the threshold value provided for this generation by the user. If epsilon_percentile is zero (the default), this method becomes the inference scheme described in [1], where the threshold values considered at each generation are the ones provided by the user.
[1] M. A. Beaumont. Approximate Bayesian computation in evolution and ecology. Annual Review of Ecology, Evolution, and Systematics, 41(1):379–406, Nov. 2010.
Parameters:  model (list) – A list of the Probabilistic models corresponding to the observed datasets
 distance (abcpy.distances.Distance) – Distance object defining the distance measure to compare simulated and observed data sets.
 kernel (abcpy.distributions.Distribution) – Distribution object defining the perturbation kernel needed for the sampling.
 backend (abcpy.backends.Backend) – Backend object defining the backend to be used.
 seed (integer, optional) – Optional initial seed for the random number generator. The default value is generated randomly.

n_samples
= 2¶

n_samples_per_param
= None¶

__init__
(root_models, distances, backend, kernel=None, seed=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.

model
= None¶

distance
= None¶

kernel
= None¶

backend
= None¶

rng
= None¶

sample
(observations, steps, epsilon_init, n_samples=10000, n_samples_per_param=1, epsilon_percentile=0, covFactor=2, full_output=0, journal_file=None)[source]¶ Samples from the posterior distribution of the model parameter given the observed data observations.
Parameters:  observations (list) – A list, containing lists describing the observed data sets
 steps (integer) – Number of iterations in the sequential algorithm (“generations”)
 epsilon_init (numpy.ndarray) – An array of proposed values of epsilon to be used at each step. Either a single value, used as the threshold in step 1, or a steps-dimensional array of values, used as the thresholds in every step.
 n_samples (integer, optional) – Number of samples to generate. The default value is 10000.
 n_samples_per_param (integer, optional) – Number of data points in each simulated data set. The default value is 1.
 epsilon_percentile (float, optional) – A value in [0, 100]. The default value is 0, meaning that the threshold value provided by the user is used.
 covFactor (float, optional) – scaling parameter of the covariance matrix. The default value is 2 as considered in [1].
 full_output (integer, optional) – If full_output==1, intermediate results are included in output journal. The default value is 0, meaning the intermediate results are not saved.
Returns: A journal containing simulation results, metadata and optionally intermediate results.
Return type: abcpy.output.Journal
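The adaptive threshold rule described for this class (the maximum of a percentile of the previous generation's discrepancies and the user-supplied value) can be sketched as follows. The helper names percentile and next_epsilon are hypothetical, and the linear-interpolation percentile is an assumption about how the percentile is defined.

```python
def percentile(values, q):
    """q-th percentile (0 <= q <= 100) with linear interpolation."""
    xs = sorted(values)
    rank = q / 100.0 * (len(xs) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (rank - lo) * (xs[hi] - xs[lo])


def next_epsilon(previous_distances, epsilon_percentile, user_epsilon):
    """Threshold for the next generation: the larger of the two candidates."""
    return max(percentile(previous_distances, epsilon_percentile), user_epsilon)


distances = [1.0, 2.0, 3.0, 4.0, 5.0]
print(next_epsilon(distances, 50, 2.0))
print(next_epsilon(distances, 0, 2.0))
```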

class
abcpy.inferences.
PMC
(root_models, likfuns, backend, kernel=None, seed=None)[source]¶ Bases:
abcpy.inferences.BaseLikelihood
,abcpy.inferences.InferenceMethod
Population Monte Carlo based inference scheme of Cappé et al. [1].
This algorithm assumes that a likelihood function is available and can be evaluated at any parameter value given the observed data set. In the absence of the likelihood function, or when it cannot be evaluated at a reasonable computational expense, we use the approximated likelihood functions in the abcpy.approx_lhd module, for which the argument for the consistency of the inference schemes is based on Andrieu and Roberts [2].
[1] Cappé, O., Guillin, A., Marin, J.M., and Robert, C. P. (2004). Population Monte Carlo. Journal of Computational and Graphical Statistics, 13(4), 907–929.
[2] C. Andrieu and G. O. Roberts. The pseudo-marginal approach for efficient Monte Carlo computations. Annals of Statistics, 37(2):697–725, 04 2009.
Parameters:  model (list) – A list of the Probabilistic models corresponding to the observed datasets
 likfun (abcpy.approx_lhd.Approx_likelihood) – Approx_likelihood object defining the approximated likelihood to be used.
 kernel (abcpy.distributions.Distribution) – Distribution object defining the perturbation kernel needed for the sampling.
 backend (abcpy.backends.Backend) – Backend object defining the backend to be used.
 seed (integer, optional) – Optional initial seed for the random number generator. The default value is generated randomly.

n_samples
= None¶

n_samples_per_param
= None¶

__init__
(root_models, likfuns, backend, kernel=None, seed=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.

model
= None¶

likfun
= None¶

kernel
= None¶

backend
= None¶

rng
= None¶

sample
(observations, steps, n_samples=10000, n_samples_per_param=100, covFactors=None, iniPoints=None, full_output=0, journal_file=None)[source]¶ Samples from the posterior distribution of the model parameter given the observed data observations.
Parameters:  observations (list) – A list, containing lists describing the observed data sets
 steps (integer) – number of iterations in the sequential algorithm (“generations”)
 n_samples (integer, optional) – number of samples to generate. The default value is 10000.
 n_samples_per_param (integer, optional) – number of data points in each simulated data set. The default value is 100.
 covFactors (list of float, optional) – scaling parameter of the covariance matrix. The default is a p-dimensional array of ones, where p is the dimension of the parameter.
 iniPoints (numpy.ndarray, optional) – parameter values from where the sampling starts. By default sampled from the prior.
 full_output (integer, optional) – If full_output==1, intermediate results are included in output journal. The default value is 0, meaning the intermediate results are not saved.
Returns: A journal containing simulation results, metadata and optionally intermediate results.
Return type: abcpy.output.Journal

class
abcpy.inferences.
SABC
(root_models, distances, backend, kernel=None, seed=None)[source]¶ Bases:
abcpy.inferences.BaseDiscrepancy
,abcpy.inferences.InferenceMethod
This base class implements a modified version of Simulated Annealing Approximate Bayesian Computation (SABC) of [1] when the prior is noninformative.
[1] C. Albert, H. R. Kuensch and A. Scheidegger. A Simulated Annealing Approach to Approximate Bayes Computations. Statistics and Computing, (2014).
Parameters:  model (list) – A list of the Probabilistic models corresponding to the observed datasets
 distance (abcpy.distances.Distance) – Distance object defining the distance measure used to compare simulated and observed data sets.
 kernel (abcpy.distributions.Distribution) – Distribution object defining the perturbation kernel needed for the sampling.
 backend (abcpy.backends.Backend) – Backend object defining the backend to be used.
 seed (integer, optional) – Optional initial seed for the random number generator. The default value is generated randomly.

n_samples
= None¶

n_samples_per_param
= None¶

epsilon
= None¶

__init__
(root_models, distances, backend, kernel=None, seed=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.

model
= None¶

distance
= None¶

kernel
= None¶

backend
= None¶

rng
= None¶

smooth_distances_bds
= None¶

all_distances_bds
= None¶

sample
(observations, steps, epsilon, n_samples=10000, n_samples_per_param=1, beta=2, delta=0.2, v=0.3, ar_cutoff=0.5, resample=None, n_update=None, adaptcov=1, full_output=0, journal_file=None)[source]¶ Samples from the posterior distribution of the model parameter given the observed data observations.
Parameters:  observations (list) – A list, containing lists describing the observed data sets
 steps (integer) – Maximum number of iterations in the sequential algorithm (“generations”)
 epsilon (numpy.float) – A proposed value of threshold to start with.
 n_samples (integer, optional) – Number of samples to generate. The default value is 10000.
 n_samples_per_param (integer, optional) – Number of data points in each simulated data set. The default value is 1.
 beta (numpy.float) – Tuning parameter of SABC
 delta (numpy.float) – Tuning parameter of SABC
 v (numpy.float, optional) – Tuning parameter of SABC. The default value is 0.3.
 ar_cutoff (numpy.float) – Acceptance ratio cutoff. The default value is 0.5.
 resample (int, optional) – Resample after this many acceptances. The default value is n_samples.
 n_update (int, optional) – Number of perturbed parameters at each step. The default value is n_samples.
 adaptcov (boolean, optional) – Whether the covariance matrix is adapted in the iteration stage. The default value is True.
 full_output (integer, optional) – If full_output==1, intermediate results are included in output journal. The default value is 0, meaning the intermediate results are not saved.
Returns: A journal containing simulation results, metadata and optionally intermediate results.
Return type: abcpy.output.Journal

class
abcpy.inferences.
ABCsubsim
(root_models, distances, backend, kernel=None, seed=None)[source]¶ Bases:
abcpy.inferences.BaseDiscrepancy
,abcpy.inferences.InferenceMethod
This base class implements Approximate Bayesian Computation by subset simulation (ABCsubsim) algorithm of [1].
[1] M. Chiachio, J. L. Beck, J. Chiachio, and G. Rus. Approximate Bayesian computation by subset simulation. SIAM J. Sci. Comput., 36(3):A1339–A1358, 2014.
Parameters:  model (list) – A list of the Probabilistic models corresponding to the observed datasets
 distance (abcpy.distances.Distance) – Distance object defining the distance used to compare the simulated and observed data sets.
 kernel (abcpy.distributions.Distribution) – Distribution object defining the perturbation kernel needed for the sampling.
 backend (abcpy.backends.Backend) – Backend object defining the backend to be used.
 seed (integer, optional) – Optional initial seed for the random number generator. The default value is generated randomly.

n_samples
= None¶

n_samples_per_param
= None¶

chain_length
= None¶

__init__
(root_models, distances, backend, kernel=None, seed=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.

model
= None¶

distance
= None¶

kernel
= None¶

backend
= None¶

rng
= None¶

anneal_parameter
= None¶

sample
(observations, steps, n_samples=10000, n_samples_per_param=1, chain_length=10, ap_change_cutoff=10, full_output=0, journal_file=None)[source]¶ Samples from the posterior distribution of the model parameter given the observed data observations.
Parameters:  observations (list) – A list, containing lists describing the observed data sets
 steps (integer) – Number of iterations in the sequential algorithm (“generations”)
 ap_change_cutoff (float, optional) – The cutoff value for the percentage change in the anneal parameter. If the change is less than ap_change_cutoff the iterations are stopped. The default value is 10.
 full_output (integer, optional) – If full_output==1, intermediate results are included in output journal. The default value is 0, meaning the intermediate results are not saved.
Returns: A journal containing simulation results, metadata and optionally intermediate results.
Return type:
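The stopping rule controlled by ap_change_cutoff can be sketched as follows. This is a minimal illustration and not ABCpy's implementation: the helper name anneal_change_below_cutoff, the example values, and the exact formula for the percentage change (measured relative to the previous value) are assumptions.

```python
def anneal_change_below_cutoff(previous, current, ap_change_cutoff=10.0):
    """Return True when the percentage change in the anneal parameter
    is below ap_change_cutoff (hypothetical helper, assumed formula)."""
    if previous == 0:
        # The percentage change is undefined here, so keep iterating.
        return False
    percent_change = 100.0 * abs(current - previous) / abs(previous)
    return percent_change < ap_change_cutoff


# Hypothetical anneal-parameter values over successive generations.
history = [4.0, 2.0, 1.2, 1.1]

for step in range(1, len(history)):
    if anneal_change_below_cutoff(history[step - 1], history[step]):
        stopped_at = step  # the change dropped below the cutoff at this step
        break
else:
    stopped_at = None  # the cutoff was never reached
```

With the default cutoff of 10, the loop stops once two consecutive values differ by less than 10 percent.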

class
abcpy.inferences.
RSMCABC
(root_models, distances, backend, kernel=None, seed=None)[source]¶ Bases:
abcpy.inferences.BaseDiscrepancy
,abcpy.inferences.InferenceMethod
This base class implements Replenishment Sequential Monte Carlo Approximate Bayesian computation of Drovandi and Pettitt [1].
[1] C. C. Drovandi and A. N. Pettitt, Estimation of parameters for macroparasite population evolution using approximate Bayesian computation. Biometrics, 67(1):225–233, 2011.
Parameters:  model (list) – A list of the Probabilistic models corresponding to the observed datasets
 distance (abcpy.distances.Distance) – Distance object defining the distance measure used to compare simulated and observed data sets.
 kernel (abcpy.distributions.Distribution) – Distribution object defining the perturbation kernel needed for the sampling.
 backend (abcpy.backends.Backend) – Backend object defining the backend to be used.
 seed (integer, optional) – Optional initial seed for the random number generator. The default value is generated randomly.

n_samples
= None¶

n_samples_per_param
= None¶

alpha
= None¶

__init__
(root_models, distances, backend, kernel=None, seed=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.

model
= None¶

distance
= None¶

kernel
= None¶

backend
= None¶

R
= None¶

rng
= None¶

accepted_dist_bds
= None¶

sample
(observations, steps, n_samples=10000, n_samples_per_param=1, alpha=0.1, epsilon_init=100, epsilon_final=0.1, const=0.01, covFactor=2.0, full_output=0, journal_file=None)[source]¶ Samples from the posterior distribution of the model parameter given the observed data observations.
Parameters:  observations (list) – A list, containing lists describing the observed data sets
 steps (integer) – Number of iterations in the sequential algorithm (“generations”)
 n_samples (integer, optional) – Number of samples to generate. The default value is 10000.
 n_samples_per_param (integer, optional) – Number of data points in each simulated data set. The default value is 1.
 alpha (float, optional) – A parameter taking values in [0,1]. The default value is 0.1.
 epsilon_init (float, optional) – Initial value of the threshold. The default value is 100.
 epsilon_final (float, optional) – Terminal value of the threshold. The default value is 0.1.
 const (float, optional) – A constant used to compute the acceptance probability. The default value is 0.01.
 covFactor (float, optional) – scaling parameter of the covariance matrix. The default value is 2.
 full_output (integer, optional) – If full_output==1, intermediate results are included in output journal. The default value is 0, meaning the intermediate results are not saved.
Returns: A journal containing simulation results, metadata and optionally intermediate results.
Return type:

class
abcpy.inferences.
APMCABC
(root_models, distances, backend, kernel=None, seed=None)[source]¶ Bases:
abcpy.inferences.BaseDiscrepancy
,abcpy.inferences.InferenceMethod
This base class implements Adaptive Population Monte Carlo Approximate Bayesian computation of M. Lenormand et al. [1].
[1] M. Lenormand, F. Jabot and G. Deffuant, Adaptive approximate Bayesian computation for complex models. Computational Statistics, 28:2777–2796, 2013.
Parameters:  model (list) – A list of the Probabilistic models corresponding to the observed datasets
 distance (abcpy.distances.Distance) – Distance object defining the distance measure used to compare simulated and observed data sets.
 kernel (abcpy.distributions.Distribution) – Distribution object defining the perturbation kernel needed for the sampling.
 backend (abcpy.backends.Backend) – Backend object defining the backend to be used.
 seed (integer, optional) – Optional initial seed for the random number generator. The default value is generated randomly.

n_samples
= None¶

n_samples_per_param
= None¶

alpha
= None¶

accepted_dist
= None¶

__init__
(root_models, distances, backend, kernel=None, seed=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.

model
= None¶

distance
= None¶

kernel
= None¶

backend
= None¶

epsilon
= None¶

rng
= None¶

sample
(observations, steps, n_samples=10000, n_samples_per_param=1, alpha=0.9, acceptance_cutoff=0.03, covFactor=2.0, full_output=0, journal_file=None)[source]¶ Samples from the posterior distribution of the model parameter given the observed data observations.
Parameters:  observations (list) – A list, containing lists describing the observed data sets
 steps (integer) – Number of iterations in the sequential algorithm (“generations”)
 n_samples (integer, optional) – Number of samples to generate. The default value is 10000.
 n_samples_per_param (integer, optional) – Number of data points in each simulated data set. The default value is 1.
 alpha (float, optional) – A parameter taking values in [0,1]. The default value is 0.9.
 acceptance_cutoff (float, optional) – Acceptance ratio cutoff, which should be chosen between 0.01 and 0.05. The default value is 0.03.
 covFactor (float, optional) – scaling parameter of the covariance matrix. The default value is 2.
 full_output (integer, optional) – If full_output==1, intermediate results are included in output journal. The default value is 0, meaning the intermediate results are not saved.
Returns: A journal containing simulation results, metadata and optionally intermediate results.
Return type:
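The roles of alpha and acceptance_cutoff can be illustrated with a small sketch. It is not taken from ABCpy's source: the helper names keep_closest and should_stop are hypothetical, and the reading of alpha as "the fraction of particles with the smallest distances that is kept" is an assumption based on the adaptive scheme described in [1].

```python
import math


def keep_closest(distances, alpha):
    """Keep the fraction alpha of particles with the smallest distances.

    Returns the kept indices (closest first) and the new threshold,
    taken here as the largest distance among the kept particles.
    """
    n_keep = max(1, math.floor(alpha * len(distances)))
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    kept = order[:n_keep]
    epsilon = distances[kept[-1]]
    return kept, epsilon


def should_stop(n_accepted, n_proposed, acceptance_cutoff=0.03):
    """Stop once the acceptance ratio falls below the cutoff."""
    return n_accepted / n_proposed < acceptance_cutoff


# Hypothetical distances of ten particles to the observed data.
distances = [0.9, 0.2, 0.5, 0.7, 0.1, 0.3, 0.8, 0.4, 0.6, 1.0]
kept, epsilon = keep_closest(distances, alpha=0.5)
```

A larger alpha keeps more particles per generation and therefore lowers the threshold more slowly.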

class
abcpy.inferences.
SMCABC
(root_models, distances, backend, kernel=None, seed=None)[source]¶ Bases:
abcpy.inferences.BaseDiscrepancy
,abcpy.inferences.InferenceMethod
This base class implements Sequential Monte Carlo Approximate Bayesian computation of Del Moral et al. [1].
[1] P. Del Moral, A. Doucet, A. Jasra, An adaptive sequential Monte Carlo method for approximate Bayesian computation. Statistics and Computing, 22(5):1009–1020, 2012.
Parameters:  model (list) – A list of the Probabilistic models corresponding to the observed datasets
 distance (abcpy.distances.Distance) – Distance object defining the distance measure used to compare simulated and observed data sets.
 kernel (abcpy.distributions.Distribution) – Distribution object defining the perturbation kernel needed for the sampling.
 backend (abcpy.backends.Backend) – Backend object defining the backend to be used.
 seed (integer, optional) – Optional initial seed for the random number generator. The default value is generated randomly.

n_samples
= None¶

n_samples_per_param
= None¶

__init__
(root_models, distances, backend, kernel=None, seed=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.

model
= None¶

distance
= None¶

kernel
= None¶

backend
= None¶

epsilon
= None¶

rng
= None¶

accepted_y_sim_bds
= None¶

sample
(observations, steps, n_samples=10000, n_samples_per_param=1, epsilon_final=0.1, alpha=0.95, covFactor=2, resample=None, full_output=0, journal_file=None)[source]¶ Samples from the posterior distribution of the model parameter given the observed data observations.
Parameters:  observations (list) – A list, containing lists describing the observed data sets
 steps (integer) – Number of iterations in the sequential algorithm (“generations”)
 epsilon_final (float, optional) – The final threshold value of epsilon to be reached. The default value is 0.1.
 n_samples (integer, optional) – Number of samples to generate. The default value is 10000.
 n_samples_per_param (integer, optional) – Number of data points in each simulated data set. The default value is 1.
 alpha (float, optional) – A parameter taking values in [0,1], determining the rate of change of the threshold epsilon. The default value is 0.95.
 covFactor (float, optional) – scaling parameter of the covariance matrix. The default value is 2.
 full_output (integer, optional) – If full_output==1, intermediate results are included in output journal. The default value is 0, meaning the intermediate results are not saved.
Returns: A journal containing simulation results, metadata and optionally intermediate results.
Return type:
abcpy.perturbationkernel module¶

class
abcpy.perturbationkernel.
PerturbationKernel
(models)[source]¶ Bases:
object
This abstract base class represents all perturbation kernels.

__init__
(models)[source]¶ Parameters: models (list) – The list of abcpy.probabilisticmodel objects that should be perturbed by this kernel.

calculate_cov
(accepted_parameters_manager, kernel_index)[source]¶ Calculates the covariance matrix for the kernel.
Parameters:  accepted_parameters_manager (abcpy.acceptedparametersmanager object) – The accepted parameters manager that manages all bds objects.
 kernel_index (integer) – The index of the kernel in the list of kernels of the joint perturbation kernel.
Returns: The covariance matrix for the kernel.
Return type: numpy.ndarray

update
(accepted_parameters_manager, row_index, rng)[source]¶ Perturbs the parameters for this kernel.
Parameters:  accepted_parameters_manager (abcpy.acceptedparametersmanager object) – The accepted parameters manager that manages all bds objects.
 row_index (integer) – The index of the accepted parameters bds that should be perturbed.
 rng (random number generator) – The random number generator to be used.
Returns: The perturbed parameters.
Return type: numpy.ndarray

pdf
(accepted_parameters_manager, kernel_index, row_index, x)[source]¶ Calculates the pdf of the kernel at point x.
Parameters:  accepted_parameters_manager (abcpy.acceptedparametersmanager object) – The accepted parameters manager that manages all bds objects.
 kernel_index (integer) – The index of the kernel in the list of kernels of the joint perturbation kernel.
 row_index (integer) – The index of the accepted parameters bds for which the pdf should be evaluated.
 x (list or float) – The point at which the pdf should be evaluated.
Returns: The pdf evaluated at point x.
Return type: float


class
abcpy.perturbationkernel.
ContinuousKernel
[source]¶ Bases:
object
This abstract base class represents all perturbation kernels acting on continuous parameters.

class
abcpy.perturbationkernel.
DiscreteKernel
[source]¶ Bases:
object
This abstract base class represents all perturbation kernels acting on discrete parameters.

class
abcpy.perturbationkernel.
JointPerturbationKernel
(kernels)[source]¶ Bases:
abcpy.perturbationkernel.PerturbationKernel

__init__
(kernels)[source]¶ This class joins different kernels to make up the overall perturbation kernel. Any user-implemented perturbation kernel should derive from this class. Any kernels defined on their own should be joined in the end using this class.
Parameters: kernels (list) – List of abcpy.PerturbationKernels

calculate_cov
(accepted_parameters_manager)[source]¶ Calculates the covariance matrix corresponding to each kernel. Commonly used before calculating weights to avoid repeated calculation.
Parameters: accepted_parameters_manager (abcpy.AcceptedParametersManager object) – The AcceptedParametersManager to be used.
Returns: Each entry corresponds to the covariance matrix of the corresponding kernel.
Return type: list

update
(accepted_parameters_manager, row_index, rng=np.random.RandomState())[source]¶ Perturbs the parameter values contained in accepted_parameters_manager. Commonly used while perturbing.
Parameters:  accepted_parameters_manager (abcpy.AcceptedParametersManager object) – Defines the AcceptedParametersManager to be used.
 row_index (integer) – The index of the row that should be considered from the accepted_parameters_bds matrix.
 rng (random number generator) – The random number generator to be used.
Returns: A list of tuples. Each tuple contains a probabilistic model as its first entry and the perturbed parameter values corresponding to this model as its second entry.
Return type: list

pdf
(mapping, accepted_parameters_manager, index, x)[source]¶ Calculates the overall pdf of the kernel. Commonly used to calculate weights.
Parameters:  mapping (list) – Each entry is a tuple whose first entry is an abcpy.ProbabilisticModel object and whose second entry is the index in the accepted_parameters_bds list corresponding to an output of this model.
 accepted_parameters_manager (abcpy.AcceptedParametersManager object) – The AcceptedParametersManager to be used.
 index (integer) – The row to be considered in the accepted_parameters_bds matrix.
 x – The point at which the pdf should be evaluated.
Returns: The pdf evaluated at point x.
Return type: float
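How an overall density can be assembled from several kernels is sketched below. This is an illustration under stated assumptions and not ABCpy's code: it assumes the individual kernels are independent, so that the joint density is the product of the per-kernel densities, and the helper names normal_pdf, random_walk_pmf and joint_density are hypothetical.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def random_walk_pmf(x, centre):
    """Mass of an integer walk whose steps -1, 0, +1 are equally likely
    (an assumed form of a naive random walk)."""
    return 1.0 / 3.0 if abs(x - centre) <= 1 else 0.0


def joint_density(components):
    """Multiply the densities of independent kernels.

    components is a list of zero-argument callables, one per kernel.
    """
    result = 1.0
    for density in components:
        result *= density()
    return result


# One continuous and one discrete component, evaluated at a proposed point.
value = joint_density([
    lambda: normal_pdf(0.0, mean=0.0, std=1.0),
    lambda: random_walk_pmf(3, centre=2),
])
```

In a sequential scheme, a value like this would enter the denominator of an importance weight for the proposed point.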


class
abcpy.perturbationkernel.
MultivariateNormalKernel
(models)[source]¶ Bases:
abcpy.perturbationkernel.PerturbationKernel
,abcpy.perturbationkernel.ContinuousKernel
This class defines a kernel perturbing the parameters using a multivariate normal distribution.

__init__
(models)[source]¶ Parameters: models (list) – The list of abcpy.probabilisticmodel objects that should be perturbed by this kernel.

calculate_cov
(accepted_parameters_manager, kernel_index)[source]¶ Calculates the covariance matrix relevant to this kernel.
Parameters:  accepted_parameters_manager (abcpy.AcceptedParametersManager object) – AcceptedParametersManager to be used.
 kernel_index (integer) – The index of the kernel in the list of kernels of the joint kernel.
Returns: The covariance matrix corresponding to this kernel.
Return type: list

update
(accepted_parameters_manager, kernel_index, row_index, rng=np.random.RandomState())[source]¶ Updates the parameter values contained in the accepted_parameters_manager using a multivariate normal distribution.
Parameters:  accepted_parameters_manager (abcpy.AcceptedParametersManager object) – Defines the AcceptedParametersManager to be used.
 kernel_index (integer) – The index of the kernel in the list of kernels in the joint kernel.
 row_index (integer) – The index of the row that should be considered from the accepted_parameters_bds matrix.
 rng (random number generator) – The random number generator to be used.
Returns: The perturbed parameter values.
Return type: np.ndarray

pdf
(accepted_parameters_manager, kernel_index, index, x)[source]¶ Calculates the pdf of the kernel. Commonly used to calculate weights.
Parameters:  accepted_parameters_manager (abcpy.AcceptedParametersManager object) – The AcceptedParametersManager to be used.
 kernel_index (integer) – The index of the kernel in the list of kernels in the joint kernel.
 index (integer) – The row to be considered in the accepted_parameters_bds matrix.
 x – The point at which the pdf should be evaluated.
Returns: The pdf evaluated at point x.
Return type: float


class
abcpy.perturbationkernel.
MultivariateStudentTKernel
(models, df)[source]¶ Bases:
abcpy.perturbationkernel.PerturbationKernel
,abcpy.perturbationkernel.ContinuousKernel

__init__
(models, df)[source]¶ This class defines a kernel perturbing the parameters using a multivariate Student's t distribution.
Parameters:  models (list of abcpy.probabilisticmodel objects) – The models that should be perturbed using this kernel
 df (integer) – The degrees of freedom to be used.

calculate_cov
(accepted_parameters_manager, kernel_index)[source]¶ Calculates the covariance matrix relevant to this kernel.
Parameters:  accepted_parameters_manager (abcpy.AcceptedParametersManager object) – AcceptedParametersManager to be used.
 kernel_index (integer) – The index of the kernel in the list of kernels of the joint kernel.
Returns: The covariance matrix corresponding to this kernel.
Return type: list

update
(accepted_parameters_manager, kernel_index, row_index, rng=np.random.RandomState())[source]¶ Updates the parameter values contained in the accepted_parameters_manager using a multivariate Student's t distribution.
Parameters:  accepted_parameters_manager (abcpy.AcceptedParametersManager object) – Defines the AcceptedParametersManager to be used.
 kernel_index (integer) – The index of the kernel in the list of kernels in the joint kernel.
 row_index (integer) – The index of the row that should be considered from the accepted_parameters_bds matrix.
 rng (random number generator) – The random number generator to be used.
Returns: The perturbed parameter values.
Return type: np.ndarray

pdf
(accepted_parameters_manager, kernel_index, index, x)[source]¶ Calculates the pdf of the kernel. Commonly used to calculate weights.
Parameters:  accepted_parameters_manager (abcpy.AcceptedParametersManager object) – The AcceptedParametersManager to be used.
 kernel_index (integer) – The index of the kernel in the list of kernels in the joint kernel.
 index (integer) – The row to be considered in the accepted_parameters_bds matrix.
 x – The point at which the pdf should be evaluated.
Returns: The pdf evaluated at point x.
Return type: float


class
abcpy.perturbationkernel.
RandomWalkKernel
(models)[source]¶ Bases:
abcpy.perturbationkernel.PerturbationKernel
,abcpy.perturbationkernel.DiscreteKernel

__init__
(models)[source]¶ This class defines a kernel perturbing discrete parameters using a naive random walk.
Parameters: models (list) – List of abcpy.ProbabilisticModel objects

update
(accepted_parameters_manager, kernel_index, row_index, rng=np.random.RandomState())[source]¶ Updates the parameter values contained in the accepted_parameters_manager using a random walk.
Parameters:  accepted_parameters_manager (abcpy.AcceptedParametersManager object) – Defines the AcceptedParametersManager to be used.
 kernel_index (integer) – The index of the kernel in the list of kernels in the joint kernel.
 row_index (integer) – The index of the row that should be considered from the accepted_parameters_bds matrix.
 rng (random number generator) – The random number generator to be used.
Returns: The perturbed parameter values.
Return type: np.ndarray

calculate_cov
(accepted_parameters_manager, kernel_index)[source]¶ Calculates the covariance matrix of this kernel. Since there is no covariance matrix associated with this random walk, it returns an empty list.

pmf
(accepted_parameters_manager, kernel_index, index, x)[source]¶ Calculates the pmf of the kernel. Commonly used to calculate weights.
Parameters:  cov (list) – The covariance matrix used for this kernel. This is a dummy input.
 accepted_parameters_manager (abcpy.AcceptedParametersManager object) – The AcceptedParametersManager to be used.
 kernel_index (integer) – The index of the kernel in the list of kernels of the joint kernel.
 index (integer) – The row to be considered in the accepted_parameters_bds matrix.
 x – The point at which the pmf should be evaluated.
Returns: The pmf evaluated at point x.
Return type: float
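A naive random walk on integer-valued parameters, together with its probability mass function, can be sketched as follows. This is a stand-alone illustration: the step set {-1, 0, +1} with equal probabilities is an assumption about what "naive random walk" means, and random_walk_update and random_walk_pmf are hypothetical helpers, not ABCpy functions.

```python
import random


def random_walk_update(values, rng):
    """Move each integer parameter by -1, 0 or +1 with equal probability."""
    return [v + rng.choice((-1, 0, 1)) for v in values]


def random_walk_pmf(current, proposed):
    """Probability of proposing `proposed` from `current` under the walk above."""
    pmf = 1.0
    for c, p in zip(current, proposed):
        # Each coordinate contributes 1/3 if reachable in one step, else 0.
        pmf *= 1.0 / 3.0 if abs(p - c) <= 1 else 0.0
    return pmf


rng = random.Random(0)  # seeded for reproducibility
current = [5, 10]
proposed = random_walk_update(current, rng)
```

Because every reachable point has the same mass, no covariance matrix is needed for this kind of kernel.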


class
abcpy.perturbationkernel.
DefaultKernel
(models)[source]¶ Bases:
abcpy.perturbationkernel.JointPerturbationKernel

__init__
(models)[source]¶ This class implements a kernel that perturbs all continuous parameters using a multivariate normal, and all discrete parameters using a random walk. To be used as an example for user-defined kernels.
Parameters: models (list) – List of abcpy.ProbabilisticModel objects, the models for which the kernel should be defined.
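The grouping described above (continuous models to one kernel, discrete models to another) can be sketched with stand-in classes. Nothing here imports ABCpy: Continuous, Discrete, ToyModel, ToyNormal, ToyBinomial and default_kernel_plan are hypothetical names used only to show the partitioning step, and the kernels are represented by their names rather than constructed.

```python
class Continuous:
    """Stand-in marker for continuous models (not ABCpy's class)."""


class Discrete:
    """Stand-in marker for discrete models (not ABCpy's class)."""


class ToyModel:
    def __init__(self, name):
        self.name = name


class ToyNormal(ToyModel, Continuous):
    pass


class ToyBinomial(ToyModel, Discrete):
    pass


def default_kernel_plan(models):
    """Return (kernel name, models) pairs, skipping empty groups."""
    continuous = [m for m in models if isinstance(m, Continuous)]
    discrete = [m for m in models if isinstance(m, Discrete)]
    plan = []
    if continuous:
        plan.append(("MultivariateNormalKernel", continuous))
    if discrete:
        plan.append(("RandomWalkKernel", discrete))
    return plan


plan = default_kernel_plan([ToyNormal("mu"), ToyBinomial("n"), ToyNormal("sigma")])
```

A user-defined kernel could follow the same pattern with a different assignment of models to kernels before joining them.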

abcpy.probabilisticmodels module¶

class
abcpy.probabilisticmodels.
InputConnector
(dimension)[source]¶ Bases:
object

__init__
(dimension)[source]¶ Creates input parameters of given dimensionality. Each dimension needs to be specified using the set method.
Parameters: dimension (int) – Dimensionality of the input parameters.

from_number
()[source]¶ Convenient initializer that converts a number to a hyperparameter input parameter.
Parameters: number
Return type: InputConnector

from_model
()[source]¶ Convenient initializer that converts the full output of a model to input parameters.
Parameters: ProbabilisticModel
Return type: InputConnector

from_list
()[source]¶ Creates an InputConnector object from a list of ProbabilisticModels.
In this case, the number of input parameters equals the sum of the output dimensions of all models in the parameter list. Further, the outputs and models are connected to the input parameters in the order in which they appear in the parameter list.
For convenience, the parameter list can contain nested lists, and the method also accepts numbers instead of models, which are automatically converted to hyperparameters.
Parameters: parameters (list) – A list of ProbabilisticModels
Return type: InputConnector
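The connection rule described for from_list can be sketched without ABCpy. In this illustration ToyModel, ToyHyperparameter, flatten and connect are hypothetical stand-ins; the sketch only shows how nested lists are flattened, how numbers become hyperparameters, and how each output of each model yields one input parameter.

```python
from numbers import Number


class ToyModel:
    """Stand-in for a probabilistic model with a known output dimension."""

    def __init__(self, name, output_dimension):
        self.name = name
        self.output_dimension = output_dimension


class ToyHyperparameter(ToyModel):
    """Stand-in for a fixed value, which has a single output."""

    def __init__(self, value):
        super().__init__("hyper", 1)
        self.value = value


def flatten(parameters):
    """Yield models in order, unpacking nested lists and wrapping numbers."""
    for item in parameters:
        if isinstance(item, list):
            yield from flatten(item)
        elif isinstance(item, Number):
            yield ToyHyperparameter(item)
        else:
            yield item


def connect(parameters):
    """Return one (model, model_index) pair per input parameter."""
    connections = []
    for model in flatten(parameters):
        for model_index in range(model.output_dimension):
            connections.append((model, model_index))
    return connections


connections = connect([ToyModel("mu", 2), [1.5, ToyModel("sigma", 1)]])
```

Here the two-dimensional model contributes two input parameters, and the number and the one-dimensional model contribute one each, for four in total.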

get_model
(index)[source]¶ Returns the model at index.
Return type: ProbabilisticModel

set
(index, model, model_index)[source]¶ Sets for an input parameter index the input model and the model index to use.
For convenience, model can also be a number, which is automatically cast to a hyperparameter.
Parameters:  index (int) – Index of the input parameter to be set.
 model (ProbabilisticModel, Number) – The model to be set for the input parameter.
 model_index (int) – Index of model’s output to be used as input parameter.

all_models_fixed_values
()[source]¶ Checks whether all input models have fixed an output value (pseudo data).
In order to get a fixed output value (a realization of the random variable described by the model), a model has to run a forward simulation, which is not done automatically upon initialization.
Return type: boolean


class
abcpy.probabilisticmodels.
ProbabilisticModel
(input_connector, name='')[source]¶ Bases:
object
This abstract class represents all probabilistic models.

__init__
(input_connector, name='')[source]¶ This initializer must be called from any derived class to properly connect it to its input models.
It accepts as input an InputConnector object that fully specifies how to connect all parent models to the current model.
Parameters:  input_connector (list) – A list of input parameters.
 name (string) – A human readable name for the model. Can be the variable name for example.

get_input_values
()[source]¶ Returns the input values from the parent models as a list. Commonly used when sampling from the distribution.
Return type: list

get_stored_output_values
()[source]¶ Returns the stored sampled value of the probabilistic model after setting the values explicitly.
At initialization the function should return None.
Return type: numpy.array or None

get_input_connector
()[source]¶ Returns the input connector object that connects the current model to its parents.
In case of no dependencies, this function should return None.
Return type: InputConnector, None

get_input_dimension
()[source]¶ Returns the input dimension of the current model.
Return type: int

set_output_values
(values)[source]¶ Sets the output values of the model. This method is commonly used to set new values after perturbing the old ones.
Parameters: values (numpy array of dimension equal to the output dimension)
Returns: True if it was possible to set the values, False otherwise.
Return type: boolean

__add__
(other)[source]¶ Overload the + operator for probabilistic models.
Parameters: other (probabilistic model or Hyperparameter) – The model to be added to self.
Returns: A probabilistic model describing a model coming from summation.
Return type: SummationModel

__sub__
(other)[source]¶ Overload the - operator for probabilistic models.
Parameters: other (probabilistic model or Hyperparameter) – The model to be subtracted from self.
Returns: A probabilistic model describing a model coming from subtraction.
Return type: SubtractionModel

__mul__
(other)[source]¶ Overload the * operator for probabilistic models.
Parameters: other (probabilistic model or Hyperparameter) – The model to be multiplied with self.
Returns: A probabilistic model describing a model coming from multiplication.
Return type: MultiplicationModel

__truediv__
(other)[source]¶ Overload the / operator for probabilistic models.
Parameters: other (probabilistic model or Hyperparameter) – The model by which self is divided.
Returns: A probabilistic model describing a model coming from division.
Return type: DivisionModel
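The operator overloads above follow a common pattern: each operator returns a new model object that remembers its two parents and combines their outputs on demand. The sketch below shows that pattern with hypothetical stand-in classes (ToyModel, ToyOperationModel and its subclasses); it does not reproduce ABCpy's classes, and the sample method is an assumption made to keep the example short.

```python
import operator


class ToyModel:
    """Stand-in for a probabilistic model that returns a fixed value."""

    def __init__(self, value, name=""):
        self.value = value
        self.name = name

    def sample(self):
        return self.value

    def __add__(self, other):
        return ToySummationModel([self, other])

    def __sub__(self, other):
        return ToySubtractionModel([self, other])

    def __mul__(self, other):
        return ToyMultiplicationModel([self, other])

    def __truediv__(self, other):
        return ToyDivisionModel([self, other])


class ToyOperationModel(ToyModel):
    """A model whose output is an operation applied to two parent outputs."""

    operate = None  # set by subclasses

    def __init__(self, parents, name=""):
        super().__init__(value=None, name=name)
        self.parents = parents

    def sample(self):
        left, right = (parent.sample() for parent in self.parents)
        return self.operate(left, right)


class ToySummationModel(ToyOperationModel):
    operate = staticmethod(operator.add)


class ToySubtractionModel(ToyOperationModel):
    operate = staticmethod(operator.sub)


class ToyMultiplicationModel(ToyOperationModel):
    operate = staticmethod(operator.mul)


class ToyDivisionModel(ToyOperationModel):
    operate = staticmethod(operator.truediv)


x = ToyModel(6.0, name="x")
y = ToyModel(3.0, name="y")
combined = (x + y) * y  # operation models can themselves be combined
```

Because the result of each operator is again a model, expressions can be nested to build a graph of dependent models.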

pdf
(input_values, x)[source]¶ Calculates the probability density function at point x.
Commonly used to determine whether perturbed parameters are still valid according to the pdf.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 x (list) – The point at which the pdf should be evaluated.
Returns: The pdf evaluated at point x.
Return type: float

calculate_and_store_pdf_if_needed
(x)[source]¶ Calculates the probability density function at point x and stores the result internally for later use.
This function is intended to be used within the inference computation.
Parameters: x (list) – The point at which the pdf should be evaluated.

flush_stored_pdf
()[source]¶ This function flushes the internally stored value of a previously computed pdf.

get_stored_pdf
()[source]¶ Retrieves the value of a previously calculated pdf.
Return type: number
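The three methods above describe a compute-once, reuse, then flush pattern. A minimal stand-alone sketch of that pattern is given below; CachedPdfModel is a hypothetical class that borrows the documented method names for readability, and the assumption that the stored value is reused until it is flushed (regardless of x) is part of the sketch, not a statement about ABCpy's behaviour.

```python
class CachedPdfModel:
    """Stand-in model that stores one pdf value until it is flushed."""

    def __init__(self, pdf_function):
        self._pdf_function = pdf_function
        self._stored_pdf = None
        self.evaluations = 0  # counts how often the pdf was really computed

    def calculate_and_store_pdf_if_needed(self, x):
        # Only compute when nothing is stored yet.
        if self._stored_pdf is None:
            self.evaluations += 1
            self._stored_pdf = self._pdf_function(x)

    def get_stored_pdf(self):
        return self._stored_pdf

    def flush_stored_pdf(self):
        self._stored_pdf = None


def uniform_pdf(x):
    """Density of a uniform distribution on [0, 2]."""
    return 0.5 if 0.0 <= x <= 2.0 else 0.0


model = CachedPdfModel(uniform_pdf)
model.calculate_and_store_pdf_if_needed(1.0)
model.calculate_and_store_pdf_if_needed(1.0)  # reuses the stored value
stored = model.get_stored_pdf()
model.flush_stored_pdf()
```

Flushing between proposals ensures that a value computed for one point is not reused for another.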

forward_simulate
(input_values, k, rng)[source]¶ Provides the output (pseudo data) from a forward simulation of the current model.
In case the model is intended to be used as input for another model, a forward simulation must return a list of k numpy arrays with shape (get_output_dimension(),).
In case the model is directly used for inference, and not as input for another model, a forward simulation also must return a list, but the elements can be arbitrarily defined. In this case it is only important that the used statistics and distance functions can read the input.
Parameters:  input_values (list) – A list of numbers that are the concatenation of all parent model outputs in the order specified by the InputConnector object that was passed during initialization.
 k (integer) – The number of forward simulations that should be run
 rng (Random number generator) – Defines the random number generator to be used. The default value uses a random seed to initialize the generator.
Returns: A list of k elements, where each element is of type numpy array and represents the result of a single forward simulation.
Return type: list

get_output_dimension
()[source]¶ Provides the output dimension of the current model.
This function is in particular important if the current model is used as an input for other models. In such a case it is assumed that the output is always a vector of int or float. The length of the vector is the dimension that should be returned here.
Returns: The dimension of the output vector of a single forward simulation.
Return type: int


class
abcpy.probabilisticmodels.
Continuous
[source]¶ Bases:
object
This abstract class represents all continuous probabilistic models.

pdf
(input_values, x)[source]¶ Calculates the probability density function of the model.
Parameters:  input_values (list) – A list of numbers that are the concatenation of all parent model outputs in the order specified by the InputConnector object that was passed during initialization.
 x (float) – The location at which the probability density function should be evaluated.


class
abcpy.probabilisticmodels.
Discrete
[source]¶ Bases:
object
This abstract class represents all discrete probabilistic models.

pmf
(input_values, x)[source]¶ Calculates the probability mass function of the model.
Parameters:  input_values (list) – A list of numbers that are the concatenation of all parent model outputs in the order specified by the InputConnector object that was passed during initialization.
 x (float) – The location at which the probability mass function should be evaluated.


class
abcpy.probabilisticmodels.
Hyperparameter
(value, name='Hyperparameter')[source]¶ Bases:
abcpy.probabilisticmodels.ProbabilisticModel
This class represents all hyperparameters (i.e. fixed parameters).

__init__
(value, name='Hyperparameter')[source]¶ Parameters: value (list) – The values to which the hyperparameter should be set

set_output_values
(values, rng=np.random.RandomState())[source]¶ Sets the output values of the model. This method is commonly used to set new values after perturbing the old ones.
Parameters: values (numpy array of dimension equal to the output dimension)
Returns: True if it was possible to set the values, False otherwise.
Return type: boolean

get_input_dimension
()[source]¶ Returns the input dimension of the current model.
Return type: int

get_output_dimension
()[source]¶ Provides the output dimension of the current model.
This function is in particular important if the current model is used as an input for other models. In such a case it is assumed that the output is always a vector of int or float. The length of the vector is the dimension that should be returned here.
Returns: The dimension of the output vector of a single forward simulation.
Return type: int

get_input_connector
()[source]¶ Returns the input connector object that connects the current model to its parents.
In case of no dependencies, this function should return None.
Return type: InputConnector, None

get_input_values
()[source]¶ Returns the input values from the parent models as a list. Commonly used when sampling from the distribution.
Return type: list

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Provides the output (pseudo data) from a forward simulation of the current model.
In case the model is intended to be used as input for another model, a forward simulation must return a list of k numpy arrays with shape (get_output_dimension(),).
In case the model is directly used for inference, and not as input for another model, a forward simulation also must return a list, but the elements can be arbitrarily defined. In this case it is only important that the used statistics and distance functions can read the input.
Parameters:  input_values (list) – A list of numbers that are the concatenation of all parent model outputs in the order specified by the InputConnector object that was passed during initialization.
 k (integer) – The number of forward simulations that should be run
 rng (Random number generator) – Defines the random number generator to be used. The default value uses a random seed to initialize the generator.
Returns: A list of k elements, where each element is of type numpy array and represents the result of a single forward simulation.
Return type: list

pdf
(input_values, x)[source]¶ Calculates the probability density function at point x.
Commonly used to determine whether perturbed parameters are still valid according to the pdf.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 x (list) – The point at which the pdf should be evaluated.
Returns: The pdf evaluated at point x.
Return type: float


class
abcpy.probabilisticmodels.
ModelResultingFromOperation
(parameters, name='')[source]¶ Bases:
abcpy.probabilisticmodels.ProbabilisticModel
This class implements probabilistic models returned after performing an operation on two probabilistic models

__init__
(parameters, name='')[source]¶ Parameters: parameters (list) – List containing two probabilistic models that should be added together.

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Provides the output (pseudo data) from a forward simulation of the current model.
In case the model is intended to be used as input for another model, a forward simulation must return a list of k numpy arrays with shape (get_output_dimension(),).
In case the model is directly used for inference, and not as input for another model, a forward simulation also must return a list, but the elements can be arbitrarily defined. In this case it is only important that the used statistics and distance functions can read the input.
Parameters:  input_values (list) – A list of numbers that are the concatenation of all parent model outputs in the order specified by the InputConnector object that was passed during initialization.
 k (integer) – The number of forward simulations that should be run
 rng (Random number generator) – Defines the random number generator to be used. The default value uses a random seed to initialize the generator.
Returns: A list of k elements, where each element is a numpy array representing the result of a single forward simulation.
Return type: list

get_output_dimension
()[source]¶ Provides the output dimension of the current model.
This function is in particular important if the current model is used as an input for other models. In such a case it is assumed that the output is always a vector of int or float. The length of the vector is the dimension that should be returned here.
Returns: The dimension of the output vector of a single forward simulation.
Return type: int

pdf
(input_values, x)[source]¶ Calculates the probability density function at point x.
Parameters:  input_values (list) – List of input parameters, in the same order as specified in the InputConnector passed to the init function
 x (float or list) – The point at which the pdf should be evaluated.
Returns: The probability density function evaluated at point x.
Return type: float

sample_from_input_models
(k, rng=np.random.RandomState())[source]¶ Returns k samples for each input model.
Parameters: k (int) – Specifies the number of samples to generate from each input model.
Returns: A dictionary of type ProbabilisticModel:[], where the list contains k samples of the corresponding model.
Return type: dict
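The shape of the returned dictionary can be sketched as below. ToyModel and the free function sample_from_input_models are hypothetical stand-ins written for this illustration, not ABCpy classes; the sketch only shows that each input model is used as a key whose value is a list of k samples.

```python
import random


class ToyModel:
    """Hypothetical stand-in for a probabilistic model with a fixed mean."""

    def __init__(self, mean):
        self.mean = mean

    def forward_simulate(self, k, rng):
        # Each sample is a one-element list, mimicking a 1-d output vector.
        return [[rng.gauss(self.mean, 1.0)] for _ in range(k)]


def sample_from_input_models(models, k, rng):
    # Map every input model to its own list of k samples.
    return {model: model.forward_simulate(k, rng) for model in models}


parents = [ToyModel(0.0), ToyModel(5.0)]
samples = sample_from_input_models(parents, k=4, rng=random.Random(1))
```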


class
abcpy.probabilisticmodels.
SummationModel
(parameters, name='')[source]¶ Bases:
abcpy.probabilisticmodels.ModelResultingFromOperation
This class represents all probabilistic models resulting from an addition of two probabilistic models

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Adds the sampled values of both parent distributions.
Parameters:  input_values (list) – List of input values
 k (integer) – The number of samples that should be sampled
 rng (random number generator) – The random number generator to be used.
Returns: The first entry is True, since it is always possible to sample given two parent values. The second entry is the sum of the parents' values.
Return type: list


class
abcpy.probabilisticmodels.
SubtractionModel
(parameters, name='')[source]¶ Bases:
abcpy.probabilisticmodels.ModelResultingFromOperation
This class represents all probabilistic models resulting from a subtraction of two probabilistic models

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Computes the difference of the sampled values of both parent distributions.
Parameters:  input_values (list) – List of input values
 k (integer) – The number of samples that should be sampled
 rng (random number generator) – The random number generator to be used.
Returns: The first entry is True, since it is always possible to sample given two parent values. The second entry is the difference of the parents' values.
Return type: list


class
abcpy.probabilisticmodels.
MultiplicationModel
(parameters, name='')[source]¶ Bases:
abcpy.probabilisticmodels.ModelResultingFromOperation
This class represents all probabilistic models resulting from a multiplication of two probabilistic models

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Multiplies the sampled values of both parent distributions element-wise.
Parameters:  input_values (list) – List of input values
 k (integer) – The number of samples that should be sampled
 rng (random number generator) – The random number generator to be used.
Returns: The first entry is True, since it is always possible to sample given two parent values. The second entry is the product of the parents' values.
Return type: list


class
abcpy.probabilisticmodels.
DivisionModel
(parameters, name='')[source]¶ Bases:
abcpy.probabilisticmodels.ModelResultingFromOperation
This class represents all probabilistic models resulting from a division of two probabilistic models

forward_simulate
(input_valus, k, rng=np.random.RandomState())[source]¶ Divides the sampled values of both parent distributions.
Parameters:  input_values (list) – List of input values
 k (integer) – The number of samples that should be sampled
 rng (random number generator) – The random number generator to be used.
Returns: The first entry is True, since it is always possible to sample given two parent values. The second entry is the quotient of the parents' values.
Return type: list
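The four arithmetic models above differ only in the binary operation applied to the paired parent samples. The following sketch illustrates this with plain numpy arrays; the helper combine and the sample values are hypothetical, and the operand order (first parent, then second parent) is an assumption rather than something the text above states.

```python
import operator

import numpy as np


def combine(samples_a, samples_b, op):
    """Apply a binary operation to paired parent samples, element-wise."""
    return [op(a, b) for a, b in zip(samples_a, samples_b)]


# k = 2 samples from each of two hypothetical parents with output dimension 2.
parent_a = [np.array([6.0, 8.0]), np.array([1.0, 9.0])]
parent_b = [np.array([2.0, 4.0]), np.array([0.5, 3.0])]

sums = combine(parent_a, parent_b, operator.add)
differences = combine(parent_a, parent_b, operator.sub)
products = combine(parent_a, parent_b, operator.mul)
quotients = combine(parent_a, parent_b, operator.truediv)
```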


class
abcpy.probabilisticmodels.
ExponentialModel
(parameters, name='')[source]¶ Bases:
abcpy.probabilisticmodels.ModelResultingFromOperation
This class represents all probabilistic models resulting from an exponentiation of two probabilistic models

__init__
(parameters, name='')[source]¶ Specific initializer for exponential models that does additional checks.
Parameters: parameters (list) – List of the probabilistic models on which the exponentiation is performed.

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Raises the sampled values of the base to the power of the exponent.
Parameters:  input_values (list) – List of input values
 k (integer) – The number of samples that should be sampled
 rng (random number generator) – The random number generator to be used.
Returns: The first entry is True, since it is always possible to sample given two parent values. The second entry is the base raised to the power of the exponent.
Return type: list


class
abcpy.probabilisticmodels.
RExponentialModel
(parameters, name='')[source]¶ Bases:
abcpy.probabilisticmodels.ModelResultingFromOperation
This class represents all probabilistic models resulting from an exponentiation of a Hyperparameter by another probabilistic model.

__init__
(parameters, name='')[source]¶ Specific initializer for exponential models that does additional checks.
Parameters: parameters (list) – List of the models on which the exponentiation is performed.

forward_simulate
(input_values, k, rng=np.random.RandomState())[source]¶ Raises the base to the power of the sampled value of the exponent.
Parameters:  input_values (list) – List of input values
 k (integer) – The number of samples that should be sampled
 rng (random number generator) – The random number generator to be used.
Returns: The first entry is True, since it is always possible to sample given two parent values. The second entry is the base raised to the power of the exponent.
Return type: list
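The difference between the two exponentiation models can be sketched with plain Python lists. All values below are hypothetical, and using a fixed exponent in the first case is a simplification for illustration; the sketch only contrasts a sampled base raised to an exponent with a fixed base raised to a sampled exponent.

```python
# ExponentialModel-style: each sampled base value is raised to the exponent.
base_samples = [[2.0], [3.0]]   # k = 2 hypothetical samples of the base
exponent = 2.0                  # hypothetical fixed exponent
powers = [[b ** exponent for b in sample] for sample in base_samples]

# RExponentialModel-style: a fixed base is raised to each sampled exponent.
base = 2.0                          # hypothetical fixed base
exponent_samples = [[1.0], [3.0]]   # k = 2 hypothetical samples of the exponent
r_powers = [[base ** e for e in sample] for sample in exponent_samples]
```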

abcpy.modelselections module¶

class
abcpy.modelselections.
ModelSelections
(model_array, statistics_calc, backend, seed=None)[source]¶ Bases:
object
This abstract base class defines a model selection rule of how to choose a model from a set of models given an observation.

__init__
(model_array, statistics_calc, backend, seed=None)[source]¶ Constructor that must be overwritten by the subclass.
The constructor of a subclass must accept an array of models to choose from and two non-optional parameters, a statistics calculator and a backend, which are stored in self.statistics_calc and self.backend. They define how to calculate summary statistics from data and what kind of parallelization to use.
Parameters:  model_array (list) – A list of models which are of type abcpy.probabilisticmodels
 statistics_calc (abcpy.statistics.Statistics) – Statistics object that conforms to the Statistics class.
 backend (abcpy.backends.Backend) – Backend object that conforms to the Backend class.
 seed (integer, optional) – Optional initial seed for the random number generator. The default value is generated randomly.

select_model
(observations, n_samples=1000, n_samples_per_param=100)[source]¶ To be overwritten by any subclass: returns the model that the model selection procedure finds most suitable for the observed data set observations. Two further optional integer arguments, n_samples and n_samples_per_param, denote the number of samples in the reference table and the number of data points in each simulated data set.
Parameters:  observations (python list) – The observed data set.
 n_samples (integer, optional) – Number of samples to generate for reference table.
 n_samples_per_param (integer, optional) – Number of data points in each simulated data set.
Returns: A model of type abcpy.probabilisticmodels
Return type: abcpy.probabilisticmodels
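The roles of n_samples and n_samples_per_param described above can be sketched as follows. This is an assumption about one common layout of a reference table (one row per simulated data set, holding a model label and a summary statistic) and not ABCpy's implementation; the two candidate simulators and the choice of the mean as summary statistic are hypothetical.

```python
import random
import statistics

rng = random.Random(0)

# Two hypothetical candidate models, each a simulator returning one data point.
candidate_models = {
    "normal": lambda: rng.gauss(0.0, 1.0),
    "uniform": lambda: rng.uniform(-1.0, 1.0),
}

n_samples = 6            # rows in the reference table
n_samples_per_param = 5  # data points in each simulated data set

reference_table = []
for _ in range(n_samples):
    # Pick a candidate model at random, then simulate one data set from it.
    name = rng.choice(sorted(candidate_models))
    data_set = [candidate_models[name]() for _ in range(n_samples_per_param)]
    # Store the model label together with a summary statistic of the data set.
    reference_table.append((name, statistics.mean(data_set)))
```

A model selection rule would then be trained on such a table to predict the label from the summary statistics.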

posterior_probability
(observations)[source]¶ To be overwritten by any subclass: returns the approximate posterior probability of the chosen model given the observed data set observations.
Parameters: observations (python list) – The observed data set.
Returns: A vector containing the approximate posterior probability of the model chosen.
Return type: np.ndarray


class
abcpy.modelselections.
RandomForest
(model_array, statistics_calc, backend, N_tree=100, n_try_fraction=0.5, seed=None)[source]¶ Bases:
abcpy.modelselections.ModelSelections
,abcpy.graphtools.GraphTools
This class implements the model selection procedure based on the Random Forest ensemble learner as described in Pudlo et al. [1].
[1] Pudlo, P., Marin, J.M., Estoup, A., Cornuet, J.M., Gautier, M. and Robert, C. (2016). Reliable ABC model choice via random forests. Bioinformatics, 32 859–866.

__init__
(model_array, statistics_calc, backend, N_tree=100, n_try_fraction=0.5, seed=None)[source]¶ Parameters:  N_tree (integer, optional) – Number of trees in the random forest. The default value is 100.
 n_try_fraction (float, optional) – The fraction of the summary statistics that determines the number of covariates randomly sampled at each node by the randomised CART. The default value is 0.5.
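As a small worked example of n_try_fraction, the number of covariates sampled at each node follows from the number of summary statistics. The count of 12 statistics is hypothetical, and the rounding rule (truncation, with a minimum of one covariate) is an assumption, since the text above does not specify it.

```python
n_summary_statistics = 12  # hypothetical number of summary statistics
n_try_fraction = 0.5       # default value documented above

# Assumption: the product is truncated to an integer, with at least one covariate.
n_covariates_per_node = max(1, int(n_try_fraction * n_summary_statistics))
```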

select_model
(observations, n_samples=1000, n_samples_per_param=1)[source]¶ Parameters:  observations (python list) – The observed data set.
 n_samples (integer, optional) – Number of samples to generate for reference table. The default value is 1000.
 n_samples_per_param (integer, optional) – Number of data points in each simulated data set. The default value is 1.
Returns: A model of type abcpy.probabilisticmodels
Return type: abcpy.probabilisticmodels

abcpy.statistics module¶

class
abcpy.statistics.
Statistics
(degree=2, cross=True)[source]¶ Bases:
object
This abstract base class defines how to calculate statistics from a data set.
The base class also implements a polynomial expansion with cross-product terms that can be applied to the calculated statistics.

__init__
(degree=2, cross=True)[source]¶ Constructor that must be overwritten by the subclass.
The constructor of a subclass must accept the arguments for the polynomial expansion that is applied after extraction of the summary statistics: degree, the degree of the polynomial expansion, and cross, indicating whether cross-product terms are included.
Parameters:  degree (integer, optional) – Degree of the polynomial expansion. The default value is 2, meaning second-order polynomial expansion.
 cross (boolean, optional) – Defines whether to include the cross-product terms. The default value is True, meaning the cross-product terms are included.
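The effect of degree and cross can be illustrated with a small numpy sketch. The function polynomial_expansion below is hypothetical and shows one plausible column layout (powers 1 to degree of every statistic, followed by pairwise products of distinct statistics); the ordering and exact terms used by the library may differ.

```python
from itertools import combinations

import numpy as np


def polynomial_expansion(stats, degree=2, cross=True):
    """Expand an n x p statistics matrix with powers and pairwise products."""
    stats = np.asarray(stats, dtype=float)
    # Powers 1..degree of every column.
    columns = [stats ** d for d in range(1, degree + 1)]
    if cross:
        # One extra column per unordered pair of distinct statistics.
        pairs = combinations(range(stats.shape[1]), 2)
        columns.extend(stats[:, [i]] * stats[:, [j]] for i, j in pairs)
    return np.hstack(columns)


expanded = polynomial_expansion([[1.0, 2.0], [3.0, 4.0]], degree=2, cross=True)
```

With p = 2 statistics, degree 2 and cross-product terms enabled, each row grows from 2 to 5 columns in this layout.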

statistics
(data: object) → object[source]¶ To be overwritten by any subclass: should extract statistics from the data set data. It is assumed that data is a list of n elements of the same type (e.g., a list containing n time series, n graphs or n np.ndarray).
Parameters: data (python list) – Contains n data sets.
Returns: n x p matrix where p statistics are calculated for each of the n data points.
Return type: numpy.ndarray


class
abcpy.statistics.
Identity
(degree=2, cross=True)[source]¶ Bases:
abcpy.statistics.Statistics
This class implements identity statistics, returning an n x p matrix when the data set contains n numpy.ndarray of length p.

__init__
(degree=2, cross=True)[source]¶ Constructor that must be overwritten by the subclass.
The constructor of a subclass must accept the arguments for the polynomial expansion that is applied after extraction of the summary statistics: degree, the degree of the polynomial expansion, and cross, indicating whether cross-product terms are included.
Parameters:  degree (integer, optional) – Degree of the polynomial expansion. The default value is 2, meaning second-order polynomial expansion.
 cross (boolean, optional) – Defines whether to include the cross-product terms. The default value is True, meaning the cross-product terms are included.

statistics
(data)[source]¶ To be overwritten by any subclass: should extract statistics from the data set data. It is assumed that data is a list of n elements of the same type (e.g., a list containing n time series, n graphs or n np.ndarray).
Parameters: data (python list) – Contains n data sets.
Returns: n x p matrix where p statistics are calculated for each of the n data points.
Return type: numpy.ndarray
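The n x p layout of identity statistics can be sketched with numpy alone. The data values are hypothetical, and stacking with np.vstack is an illustration of the described result rather than the library's own code.

```python
import numpy as np

# A data set of n = 3 simulations, each a numpy array of length p = 2.
data = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]

# Stack the n arrays row-wise to obtain the n x p statistics matrix.
stats_matrix = np.vstack(data)
```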

abcpy.summaryselections module¶

class
abcpy.summaryselections.
Summaryselections
(model, statistics_calc, backend, n_samples=1000, seed=None)[source]¶ Bases:
object
This abstract base class defines a way to choose the summary statistics.

__init__
(model, statistics_calc, backend, n_samples=1000, seed=None)[source]¶ The constructor of a subclass must accept a non-optional model, statistics calculator and backend, which are stored in self.model, self.statistics_calc and self.backend. It further accepts two optional parameters, n_samples and seed, defining the number of simulated data sets used in the pilot step to decide the summary statistics and the integer used to initialize the random number generator.
Parameters:  model (abcpy.models.Model) – Model object that conforms to the Model class.
 statistics_calc (abcpy.statistics.Statistics) – Statistics object that conforms to the Statistics class.
 backend (abcpy.backends.Backend) – Backend object that conforms to the Backend class.
 n_samples (int, optional) – The number of (parameter, simulated data) tuples generated to learn the summary statistics in the pilot step. The default value is 1000.
 n_samples_per_param (int, optional) – Number of data points in each simulated data set.
 seed (integer, optional) – Optional initial seed for the random number generator. The default value is generated randomly.


class
abcpy.summaryselections.
Semiautomatic
(model, statistics_calc, backend, n_samples=1000, n_samples_per_param=1, seed=None)[source]¶ Bases:
abcpy.summaryselections.Summaryselections
,abcpy.graphtools.GraphTools
This class implements the semi-automatic summary statistics choice described in Fearnhead and Prangle [1].
[1] Fearnhead P., Prangle D. 2012. Constructing summary statistics for approximate Bayesian computation: semiautomatic approximate Bayesian computation. J. Roy. Stat. Soc. B 74:419–474.

__init__
(model, statistics_calc, backend, n_samples=1000, n_samples_per_param=1, seed=None)[source]¶ The constructor of a subclass must accept a non-optional model, statistics calculator and backend, which are stored in self.model, self.statistics_calc and self.backend. It further accepts two optional parameters, n_samples and seed, defining the number of simulated data sets used in the pilot step to decide the summary statistics and the integer used to initialize the random number generator.
Parameters:  model (abcpy.models.Model) – Model object that conforms to the Model class.
 statistics_calc (abcpy.statistics.Statistics) – Statistics object that conforms to the Statistics class.
 backend (abcpy.backends.Backend) – Backend object that conforms to the Backend class.
 n_samples (int, optional) – The number of (parameter, simulated data) tuples generated to learn the summary statistics in the pilot step. The default value is 1000.
 n_samples_per_param (int, optional) – Number of data points in each simulated data set.
 seed (integer, optional) – Optional initial seed for the random number generator. The default value is generated randomly.
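The pilot step described above can be sketched in the spirit of the cited semi-automatic approach: simulate (parameter, data) pairs, regress the parameter on initial statistics of the data, and use the fitted linear combination as the learned summary statistic. Everything in this sketch is an assumption made for illustration (the uniform prior, the toy simulator, the choice of mean and variance as initial statistics, and the helper learned_summary); it is not ABCpy's implementation.

```python
import numpy as np

rng = np.random.RandomState(0)

# Pilot step: n_samples pairs of (parameter, simulated data).
n_samples = 200
n_samples_per_param = 5
theta = rng.uniform(0.0, 10.0, size=n_samples)  # hypothetical prior draws
data = theta[:, None] + rng.normal(size=(n_samples, n_samples_per_param))

# Initial statistics of each simulated data set (here: mean and variance).
stats = np.column_stack([data.mean(axis=1), data.var(axis=1)])

# Regress the parameter on the statistics, including an intercept column.
design = np.column_stack([np.ones(n_samples), stats])
coefficients, *_ = np.linalg.lstsq(design, theta, rcond=None)


def learned_summary(new_stats):
    """Map initial statistics to the fitted linear summary statistic."""
    new_stats = np.atleast_2d(new_stats)
    return coefficients[0] + new_stats @ coefficients[1:]


summary = learned_summary([5.0, 1.0])
```

The learned summary has one entry per parameter, so it is typically much lower-dimensional than the initial statistics it is built from.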
