straindesign.strainDesignProblem

Classes and functions for the construction of strain design MILPs

This module contains functions that help construct mixed-integer linear problems, mainly functions that facilitate the construction of LP and Farkas dual problems from linear problems of the type A_ineq*x<=b_ineq, A_eq*x<=b_eq, lb<=x<=ub. The functions also help keeping track of the relationship of constraints and variables and their individual counterparts in dual problems, which is essential when simulating knockouts in dual problems. Most of the time, the sparse datatype is used to store and edit matrices for improved speed and memory.

Module Contents

class straindesign.strainDesignProblem.ContMILP(A_ineq, b_ineq, A_eq, b_eq, lb, ub, c, z_map_constr_ineq, z_map_constr_eq, z_map_vars)[source]

Continuous representation of the strain design MILP.

This MILP can be used to verify computation results. Since this class also stores the relationship between intervention variables z and corresponding (in)equality constraints and variables in the problem, it can be used to verify computed designs quickly and in a numerically stable manner.

class straindesign.strainDesignProblem.SDProblem(model: cobra.Model, sd_modules: List[straindesign.SDModule], *args, **kwargs)[source]

Strain design MILP

The construcor of this class translates a model and strain design modules into a mixed integer linear problem. This class, however, is the backbone of the strain design computation. Preprocessing steps that enable gene, reaction and regulatory interventions, or network compression usually preceed the construction of an SDProblem-object and are integrated in the function compute_strain_designs.

Parameters:

model (cobra.Model) – A metabolic model that is an instance of the cobra.Model class.
sd_modules ((list of) straindesign.SDModule) – Modules that specify the strain design problem, e.g., protected or suppressed flux states for MCS strain design or inner and outer objective functions for OptKnock. See description of SDModule for more information on how to set up modules.
ko_cost (optional (dict)) – (Default: None) A dictionary of reaction identifiers and their associated knockout costs. If not specified, all reactions are treated as knockout candidates, equivalent to ko_cost = {‘r1’:1, ‘r2’:1, …}. If a subset of reactions is listed in the dict, all other are not considered as knockout candidates.
ki_cost (optional (dict)) – (Default: None) A dictionary of reaction identifiers and their associated costs for addition. If not specified, all reactions are treated as knockout candidates. Reaction addition candidates must be present in the original model with the intended flux boundaries after insertion. Additions are treated adversely to knockouts, meaning that their exclusion from the network is not associated with any cost while their presence entails intervention costs.
max_cost (optional (int)) – (Default: inf): The maximum cost threshold for interventions. Every possible intervention is associated with a cost value (1, by default). Strain designs cannot exceed the max_cost threshold. Individual intervention cost factors may be defined through ki_cost and ko_cost.
solver (optional (str)) – (Default: same as defined in model / COBRApy) The solver that should be used for preparing and carrying out the strain design computation. Allowed values are ‘cplex’, ‘gurobi’, ‘scip’ and ‘glpk’.
M (optional (int)) – (Default: None) If this value is specified (and non-zero, not None), the computation uses the big-M method instead of indicator constraints. Since GLPK does not support indicator constraints it uses the big-M method by default (with COBRA standard M=1000). M should be chosen ‘sufficiently large’ to avoid computational artifacts and ‘sufficiently small’ to avoid numerical issues.
essential_kis (optional (set)) – A set of reactions that are marked as addable and that are essential for at least one of the strain design modules. Providing such “essential knock-ins” may speed up the strain design computation.

Returns:

An instance of SDProblem containing the strain design MILP

Return type:

(SDProblem)

addModule(sd_module)[source]

Generate module LP and z-linking-matrix for each module and add them to the strain design MILP

Parameters:: sd_module (straindesign.SDModule) – Modules to describe strain design problems like protected or suppressed flux states for MCS strain design or inner and outer objective functions for OptKnock. See description of SDModule for more information on how to set up modules.

link_z()[source]

Connect binary intervention variables to variables and constraints of the strain design problem

Function that uses the maps between intervention indicators z and variables and constraints of the linear strain design (in)equality system (self.z_map_constr_ineq, self.z_map_constr_eq and self.z_map_vars) to set up the strain design MILP.

MILP construction uses the following steps:

Translate equality-KOs/KIs to two inequality-KOs/KIs
Translate variable-KOs/KIs to inequality-KIs/KOs
Try to bound the problem with LPs
Use LP-determined bounds to link z-variables, where such bounds were found

(5) Translate remaining inequalities back to equalities when possible and link z via indicator constraints. If necessary, the solver interface will translate them to big-M constraints. (6) Remove redundant equalities from static problem

straindesign.strainDesignProblem.LP_dualize(A_ineq_p, b_ineq_p, A_eq_p, b_eq_p, lb_p, ub_p, c_p, z_map_constr_ineq_p=None, z_map_constr_eq_p=None, z_map_vars_p=None) → Tuple[scipy.sparse.csr_matrix, Tuple, scipy.sparse.csr_matrix, Tuple, Tuple, Tuple, scipy.sparse.csr_matrix, scipy.sparse.csr_matrix, scipy.sparse.csr_matrix][source]

Construct the LP dual (dual-feasibility system) of a primal system

The primal is read as a MAXIMIZATION in the standard form: max{c’x} s.t. A_ineq x <= b_ineq, A_eq x = b_eq, lb <= x <= ub. This returns the dual variables y, the dual-feasibility constraints A’y {>=,=,<=} c (sign rules below), and the dual objective coefficients (= the primal right-hand sides). It is the dual-feasibility block of the KKT conditions – deliberately NOT a complete/solved dual on its own. Callers compose it:

farkas_dualize adds the normalization b’y <= -1 (with c == 0) -> an infeasibility certificate of the primal system;

the bilevel / strong-duality path adds the primal system and the objective equality c’x = b’y (strong duality) -> an optimality certificate.

(LP duality: the optimum of the maximization equals the minimal optimum of its dual; enforcing equality of the two objective values encodes inner optimality for nested optimization.)

Construction of the LP dual:

Variables translate to constraints:: x={R} -> = x>=0 -> >= (new constraint is multiplied with -1 to translate to <= e.g. -A_i’ y <= -c_i) x<=0 -> <=
Constraints translate to variables:: = -> y={R} <= -> y>=0

Parameters:

A_ineq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear inequalities of the primal LP A_ineq_p*x <= b_ineq_p
b_ineq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear inequalities of the primal LP A_ineq_p*x <= b_ineq_p
A_eq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear equalities of the primal LP A_eq_p*x <= b_eq_p
b_eq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear equalities of the primal LP A_eq_p*x <= b_eq_p
lb_p (list of float) – Upper and lower variable bounds in vector form.
ub_p (list of float) – Upper and lower variable bounds in vector form.
c_p (list of float) – The objective coefficient vector of the primal minimization-LP. z_map_constr_ineq_p, z_map_constr_eq_p, z_map_vars_p
z_map_constr_ineq (optional (sparse.csr_matrix)) – Matrices that contain the relationship between metabolic reactions and different parts of the LP, such as reactions, metabolites or other (in)equalities. These matrices help keeping track of the parts of the LP that are affected by reaction knockouts and additions. When a reaction (i) knockout removes the variable or constraint (j), the respective matrix contains a coefficient 1 at this position. -1 marks additions. E.g.: If the knockout of reaction i corresponds to the removal of inequality constraint j, there is a matrix entry z_map_constr_ineq_(i,j) = 1. If these matrices are provided, they are updated for the dualized LP, if not, the dual problem is constructed without returning information about these relationships.
z_map_constr_eq (optional (sparse.csr_matrix)) – Matrices that contain the relationship between metabolic reactions and different parts of the LP, such as reactions, metabolites or other (in)equalities. These matrices help keeping track of the parts of the LP that are affected by reaction knockouts and additions. When a reaction (i) knockout removes the variable or constraint (j), the respective matrix contains a coefficient 1 at this position. -1 marks additions. E.g.: If the knockout of reaction i corresponds to the removal of inequality constraint j, there is a matrix entry z_map_constr_ineq_(i,j) = 1. If these matrices are provided, they are updated for the dualized LP, if not, the dual problem is constructed without returning information about these relationships.
z_map_vars (optional (sparse.csr_matrix)) – Matrices that contain the relationship between metabolic reactions and different parts of the LP, such as reactions, metabolites or other (in)equalities. These matrices help keeping track of the parts of the LP that are affected by reaction knockouts and additions. When a reaction (i) knockout removes the variable or constraint (j), the respective matrix contains a coefficient 1 at this position. -1 marks additions. E.g.: If the knockout of reaction i corresponds to the removal of inequality constraint j, there is a matrix entry z_map_constr_ineq_(i,j) = 1. If these matrices are provided, they are updated for the dualized LP, if not, the dual problem is constructed without returning information about these relationships.

Returns: (Tuple): The LP dual of the problem in the format: A_ineq, b_ineq, A_eq, b_eq, c, lb, ub and optionally also z_map_constr_ineq, z_map_constr_eq, z_map_vars

straindesign.strainDesignProblem.build_primal_from_cbm(model, V_ineq=None, v_ineq=None, V_eq=None, v_eq=None, c=None) → Tuple[scipy.sparse.csr_matrix, Tuple, scipy.sparse.csr_matrix, Tuple, Tuple, Tuple, scipy.sparse.csr_matrix, scipy.sparse.csr_matrix, scipy.sparse.csr_matrix][source]

Builds primal LP from constraint-based model and (optionally) additional constraints.

standard form: A_ineq x <= b_ineq, A_eq x = b_eq, lb <= x <= ub, min{c’x}. Additionally, this function also returns a set of matrices that associate each variable (and constraint) with reactions. In the primal problems all variables correspond to reactions (z), therefore, the z_map_vars matrix is an identity matrix. The constraints correspond to metabolites, thus z_map_constr_ineq, z_map_constr_eq are all-zero.

Parameters:

model (cobra.Model) – A metabolic model that is an instance of the cobra.Model class
V_ineq (sparse.csr_matrix, list of float) – Linear inequality constraints of the form V_ineq*x <= v_ineq. Ensure that the number of columns in V_ineq is identical to the number of reactions in the model.
v_ineq (sparse.csr_matrix, list of float) – Linear inequality constraints of the form V_ineq*x <= v_ineq. Ensure that the number of columns in V_ineq is identical to the number of reactions in the model.
V_eq (sparse.csr_matrix, list of float) – Linear equality constraints of the form V_eq*x = v_eq. Ensure that the number of columns in V_eq is identical to the number of reactions in the model.
v_eq (sparse.csr_matrix, list of float) – Linear equality constraints of the form V_eq*x = v_eq. Ensure that the number of columns in V_eq is identical to the number of reactions in the model.
c (list of float) – Object coefficient vector (same lenght as variable vector).

Returns:

A_ineq, b_ineq, A_eq, b_eq, lb, ub, c, z_map_constr_ineq, z_map_constr_eq, z_map_vars. A constraint-based steady-state model in the form of a linear (in)equality system. The matrices z_map_constr_ineq, z_map_constr_eq, z_map_vars contain the association between reactions and different parts of the LP, such as reactions, metabolites or other (in)equalities.

Return type:

(Tuple)

straindesign.strainDesignProblem.farkas_dualize(A_ineq_p, b_ineq_p, A_eq_p, b_eq_p, lb_p, ub_p, z_map_constr_ineq_p=None, z_map_constr_eq_p=None, z_map_vars_p=None) → Tuple[scipy.sparse.csr_matrix, Tuple, scipy.sparse.csr_matrix, Tuple, Tuple, scipy.sparse.csr_matrix, scipy.sparse.csr_matrix, scipy.sparse.csr_matrix][source]

Translate a primal system of linear (in)equality to its Farkas dual

The primal system must be given in the standard form: A_ineq x <= b_ineq, A_eq x = b_eq, lb <= x < ub. Farkas’ lemma defines a set of two systems of linear (in)equalities of which exactly one is feasible. Since the feasibility of one is a certificate for the infeasibility of the other one, this theorem can be used to set up problems that imply the infeasibility and thus exclusion of a certain subspace. This priciple is used for MCS calculation (the SUPPRESS module).

Consider that the following is not implemented: In the case of (1) A x = b, (2) x={R}, (3) b~=0, Farkas’ lemma is special, because b’y ~= 0 is required to make the primal infeasible instead of b’y < 0. 1. This does not occur very often. 2. Splitting the equality into two inequalities that translate to y>=0 would be posible, and yield b’y < 0 in the farkas’ lemma. Maybe splitting is required, but I actually don’t think so. Using the special case of b’y < 0 for b’y ~= 0 should be enough.

Parameters:

A_ineq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear inequalities of the primal LP A_ineq_p*x <= b_ineq_p
b_ineq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear inequalities of the primal LP A_ineq_p*x <= b_ineq_p
A_eq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear equalities of the primal LP A_eq_p*x <= b_eq_p
b_eq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear equalities of the primal LP A_eq_p*x <= b_eq_p
lb_p (list of float) – Upper and lower variable bounds in vector form.
ub_p (list of float) – Upper and lower variable bounds in vector form.
z_map_constr_ineq (optional (sparse.csr_matrix)) – Matrices that contain the relationship between metabolic reactions and different parts of the LP, such as reactions, metabolites or other (in)equalities. These matrices help keeping track of the parts of the LP that are affected by reaction knockouts and additions. When a reaction (i) knockout removes the variable or constraint (j), the respective matrix contains a coefficient 1 at this position. -1 marks additions. E.g.: If the knockout of reaction i corresponds to the removal of inequality constraint j, there is a matrix entry z_map_constr_ineq_(i,j) = 1. If these matrices are provided, they are updated for the dualized LP, if not, the dual problem is constructed without returning information about these relationships.
z_map_constr_eq (optional (sparse.csr_matrix)) – Matrices that contain the relationship between metabolic reactions and different parts of the LP, such as reactions, metabolites or other (in)equalities. These matrices help keeping track of the parts of the LP that are affected by reaction knockouts and additions. When a reaction (i) knockout removes the variable or constraint (j), the respective matrix contains a coefficient 1 at this position. -1 marks additions. E.g.: If the knockout of reaction i corresponds to the removal of inequality constraint j, there is a matrix entry z_map_constr_ineq_(i,j) = 1. If these matrices are provided, they are updated for the dualized LP, if not, the dual problem is constructed without returning information about these relationships.
z_map_vars (optional (sparse.csr_matrix)) – Matrices that contain the relationship between metabolic reactions and different parts of the LP, such as reactions, metabolites or other (in)equalities. These matrices help keeping track of the parts of the LP that are affected by reaction knockouts and additions. When a reaction (i) knockout removes the variable or constraint (j), the respective matrix contains a coefficient 1 at this position. -1 marks additions. E.g.: If the knockout of reaction i corresponds to the removal of inequality constraint j, there is a matrix entry z_map_constr_ineq_(i,j) = 1. If these matrices are provided, they are updated for the dualized LP, if not, the dual problem is constructed without returning information about these relationships.

Returns: (Tuple): The Farkas dual of the linear (in)equality system in the format: A_ineq, b_ineq, A_eq, b_eq, lb, ub and optionally also z_map_constr_ineq, z_map_constr_eq, z_map_vars

straindesign.strainDesignProblem.prevent_boundary_knockouts(A_ineq, b_ineq, lb, ub, z_map_constr_ineq, z_map_vars) → Tuple[scipy.sparse.csr_matrix, Tuple, Tuple, Tuple, scipy.sparse.csr_matrix][source]

Put positive lower bounds and negative upper bounds into (notknockable) inequalities

This is a helper function that puts positive lower bounds and negative upper bounds into (not-knockable) inequalities. Later on, one may simulate the knockouts by multiplying the upper and lower bounds with a binary variable z. This functions prevents that

Parameters:

A_ineq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear inequalities of the primal LP A_ineq_p*x <= b_ineq_p
b_ineq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear inequalities of the primal LP A_ineq_p*x <= b_ineq_p
lb_p (list of float) – Upper and lower variable bounds in vector form.
ub_p (list of float) – Upper and lower variable bounds in vector form.
z_map_constr_ineq (optional (sparse.csr_matrix)) – Matrices that contain the relationship between metabolic reactions and different parts of the LP, such as reactions, metabolites or other (in)equalities. These matrices help keeping track of the parts of the LP that are affected by reaction knockouts and additions. When a reaction (i) knockout removes the variable or constraint (j), the respective matrix contains a coefficient 1 at this position. -1 marks additions. E.g.: If the knockout of reaction i corresponds to the removal of inequality constraint j, there is a matrix entry z_map_constr_ineq_(i,j) = 1. If these matrices are provided, they are updated. Otherwise, all reactions are assumed to be knockable and thus all negative upper and positive lower bounds are translated into constraints.
z_map_vars (optional (sparse.csr_matrix)) – Matrices that contain the relationship between metabolic reactions and different parts of the LP, such as reactions, metabolites or other (in)equalities. These matrices help keeping track of the parts of the LP that are affected by reaction knockouts and additions. When a reaction (i) knockout removes the variable or constraint (j), the respective matrix contains a coefficient 1 at this position. -1 marks additions. E.g.: If the knockout of reaction i corresponds to the removal of inequality constraint j, there is a matrix entry z_map_constr_ineq_(i,j) = 1. If these matrices are provided, they are updated. Otherwise, all reactions are assumed to be knockable and thus all negative upper and positive lower bounds are translated into constraints.

Returns: (Tuple): A linear (in)equality system in the format: A_ineq, b_ineq, A_eq, b_eq, lb, ub and optionally also updated z_map_constr_ineq, z_map_constr_eq

straindesign.strainDesignProblem.reassign_lb_ub_from_ineq(A_ineq, b_ineq, A_eq, b_eq, lb, ub, z_map_constr_ineq=None, z_map_constr_eq=None, z_map_vars=None) → Tuple[scipy.sparse.csr_matrix, Tuple, scipy.sparse.csr_matrix, Tuple, Tuple, Tuple, scipy.sparse.csr_matrix, scipy.sparse.csr_matrix][source]

Remove single constraints in A_ineq or A_eq in favor of lower and upper bounds on variables

Constraints on single variables instead translated into lower and upper bounds (lb, ub). This is useful to filter out redundant bounds on variables and keep the (in)equality system concise. To avoid interference with the knock-out logic, negative upper bounds and positive lower bounds are not put into lb and ub, when reactions are flagged knockable with z_map_vars.

Parameters:

A_ineq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear inequalities of the primal LP A_ineq_p*x <= b_ineq_p
b_ineq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear inequalities of the primal LP A_ineq_p*x <= b_ineq_p
A_eq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear equalities of the primal LP A_eq_p*x <= b_eq_p
b_eq_p (sparse.csr_matrix and list of float) – A coefficient matrix and a vector that describe the linear equalities of the primal LP A_eq_p*x <= b_eq_p
lb_p (list of float) – Upper and lower variable bounds in vector form.
ub_p (list of float) – Upper and lower variable bounds in vector form.
z_map_constr_ineq (optional (sparse.csr_matrix)) – Matrices that contain the relationship between metabolic reactions and different parts of the LP, such as reactions, metabolites or other (in)equalities. These matrices help keeping track of the parts of the LP that are affected by reaction knockouts and additions. When a reaction (i) knockout removes the variable or constraint (j), the respective matrix contains a coefficient 1 at this position. -1 marks additions. E.g.: If the knockout of reaction i corresponds to the removal of inequality constraint j, there is a matrix entry z_map_constr_ineq_(i,j) = 1. If these matrices are provided, they are updated. Otherwise, all reactions are assumed to be notknockable and thus all constraints on single variables put into lb and ub.
z_map_constr_eq (optional (sparse.csr_matrix)) – Matrices that contain the relationship between metabolic reactions and different parts of the LP, such as reactions, metabolites or other (in)equalities. These matrices help keeping track of the parts of the LP that are affected by reaction knockouts and additions. When a reaction (i) knockout removes the variable or constraint (j), the respective matrix contains a coefficient 1 at this position. -1 marks additions. E.g.: If the knockout of reaction i corresponds to the removal of inequality constraint j, there is a matrix entry z_map_constr_ineq_(i,j) = 1. If these matrices are provided, they are updated. Otherwise, all reactions are assumed to be notknockable and thus all constraints on single variables put into lb and ub.
z_map_vars (optional (sparse.csr_matrix)) – Matrices that contain the relationship between metabolic reactions and different parts of the LP, such as reactions, metabolites or other (in)equalities. These matrices help keeping track of the parts of the LP that are affected by reaction knockouts and additions. When a reaction (i) knockout removes the variable or constraint (j), the respective matrix contains a coefficient 1 at this position. -1 marks additions. E.g.: If the knockout of reaction i corresponds to the removal of inequality constraint j, there is a matrix entry z_map_constr_ineq_(i,j) = 1. If these matrices are provided, they are updated. Otherwise, all reactions are assumed to be notknockable and thus all constraints on single variables put into lb and ub.

Returns: (Tuple): A linear (in)equality system in the format: A_ineq, b_ineq, A_eq, b_eq, lb, ub and optionally also updated z_map_constr_ineq, z_map_constr_eq

straindesign.strainDesignProblem.worker_compute(i) → Tuple[int, float][source]: Helper function for determining bounds on linear expressions

straindesign.strainDesignProblem.worker_init(A, A_ineq, b_ineq, A_eq, b_eq, lb, ub, solver, seed)[source]: Helper function for determining bounds on linear expressions