Solve the true problem

Now that we have constructed the true problem and discretize it. The MSPPy package provides two kinds of solvers, namely SDDP and extensive, to solve the problem. After solving the problem, the obtained policy can be evaluated both on the discretization problem and the true problem.

Stage-wise independent discrete finite problem

Both the extensive solver and SDDP solver shows that, the optimum is $3.18 and an optimal amount of newspaper to buy today is 7 for the true problem

from msppy.utils.examples import construct_nvid
from msppy.solver import SDDP, Extensive
from msppy.evaluation import Evaluation, EvaluationTrue
nvid = construct_nvid()
nvid_ext = Extensive(nvid)
nvid_ext.solve(outputFlag=0)
nvid_ext.first_stage_solution
nvid_sddp = SDDP(nvid)
nvid_sddp.solve(max_iterations=10)
nvid_sddp.db[-1]
nvid_sddp.first_stage_solution
res = EvaluationTrue(nvid)
res.run(n_simulations=-1)
res.gap

Stage-wise independent continuous problem

In this case, for both solvers, the problem needs to be discretized. The following snippet discretizes the data process by 100 number of samples and a random seed 888. We refer the resulting disretization problem as SAA(1). Both solvers show that, for SAA(1), the optimum is $2.79 and an optimal amount of newspaper to buy today is 9.08

from msppy.utils.examples import construct_nvic
from msppy.solver import SDDP, Extensive
from msppy.evaluation import Evaluation, EvaluationTrue
nvic = construct_nvic()
nvic.discretize(random_state=1, n_samples=100)
nvic_ext = Extensive(nvic)
nvic_ext.solve(outputFlag=0)
nvic_ext.first_stage_solution
nvic_sddp = SDDP(nvic)
nvic_sddp.solve(max_iterations=10)
nvic_sddp.db[-1]
nvic_sddp.first_stage_solution
res = Evaluation(nvic)
res.run(n_simulations=-1)
res.gap
res_true = EvaluationTrue(nvid)
res_true.run(n_simulations=3000, percentile=95)
res_true.CI

Our intention is solve the true problem rather than the SAA(1). The advantage of SDDP solver is that by solving SAA(1), it also offers a policy to the true problem (though not optimal). By simulation as shown above, the 95% confidence interval of the value of the obtained policy on the true problem is [3.10,3.37]. Therefore, we conclude that by solving SAA(1), the SDDP solver provides you with a policy whose value is within [3.10,3.37] with 95% confidence for the true problem.

Markov chain problem

Both the extensive solver and SDDP solver shows that, the optimum is $9.05 and an optimal amount of newspaper to buy today is 6 for the true problem.

from msppy.utils.examples import construct_nvmc
from msppy.solver import SDDP, Extensive
from msppy.evaluation import Evaluation, EvaluationTrue
nvmc = construct_nvmc()
nvmc_ext = Extensive(nvmc)
nvmc_ext.solve()
nvmc_ext.first_stage_solution
nvmc_sddp = SDDP(nvmc)
nvmc_sddp.solve(max_iterations=10)
nvmc_sddp.db[-1]
nvmc_sddp.first_stage_solution
res = Evaluation(nvmc)
res.run(n_simulations=-1)
res.gap

Markovian continuous problem

In this case, for both solvers, the problem needs to be discretized. The following snippet ten dimensional Markov chain using stochastic approximation method (stochastic gradient descent) with 1000 iterations. We refer the resulting disretization problem as MC. Both solvers show that, for MC, the optimum is $2.67 and an optimal amount of newspaper to buy today is 17.76

from msppy.utils.examples import construct_nvm
from msppy.solver import SDDP, Extensive
from msppy.evaluation import Evaluation, EvaluationTrue
nvm = construct_nvm()
nvm.discretize(n_Markov_states=10, n_sample_paths=1000, method='SA');
nvm_ext = Extensive(nvm)
nvm_ext.solve(outputFlag=0)
nvm_ext.first_stage_solution
nvm_sddp = SDDP(nvm)
nvm_sddp.solve(max_iterations=100, logToConsole=0)
nvm_sddp.db[-1]
nvm_sddp.first_stage_solution
res = Evaluation(nvm)
res.run(n_simulations=-1)
res.gap
res_true = EvaluationTrue(nvm)
res_true.run(n_simulations=3000)
res_true.CI

But our intention is solve the true problem rather than the MC. The advantage of SDDP solver is that by solving MC, it also offers a policy to the true problem (though not guaranteed to be optimal). By simulation as above, the 95% confidence interval of the value of the obtained policy on the true problem is [26.54,27.99]. Therefore, we conclude that by solving MC, the SDDP solver provides you with a policy whose value is within [26.54,27.99] with 95% confidence for the true problem.

Risk averse problem

Direct method

from msppy.utils.examples import construct_nvid
from msppy.solver import SDDP
nvid = construct_nvid()
nvid.set_AVaR(l=0.5, a=0.1, method='direct')
nvid_sddp = SDDP(nvid)
nvid_sddp.solve(max_iterations=10)
nvid_sddp.first_stage_solution

Indirect method

from msppy.utils.examples import construct_nvid
from msppy.solver import SDDP
nvid = construct_nvid()
nvid.set_AVaR(l=0.5, a=0.1, method='indirect')
nvid_sddp = SDDP(nvid)
nvid_sddp.solve(max_iterations=20)
nvid_sddp.first_stage_solution

Biased sampling

For both finite-horizon and infinite horizon risk averse problems, we provide a biased sampling scheme to solve the problem. Details can be found in the paper. A biased sampling scheme is in essence a change of the refrence probability measure making `bad’ scenarios more frequent and is shown to improve the rate of convergence.

Finite horizon

from msppy.utils.examples import construct_nvid
from msppy.solver import SDDP
nvid = construct_nvid()
nvid.set_AVaR(l=0.5, a=0.1)
nvid_sddp = SDDP(nvid, biased_sampling = True)
nvid_sddp.solve(max_iterations=20)

Inifinite horizon

from msppy.utils.examples import construct_nvid
from msppy.solver import SDDP
nvid = construct_nvid()
nvid.set_AVaR(l=0.5, a=0.1)
nvid_sddp = PSDDP(nvid, biased_sampling = True)
nvid_sddp.solve(max_iterations=20)

Integer problem

SDDiP solver and Extensive solver can solve MSIP problems. The built-in cuts provided are Benders’ cut, strengthened Benders’ cut and Lagaragian cut.

from msppy.utils.examples import construct_nvidi
from msppy.solver import SDDiP, Extensive
nvidi = construct_nvidi()
nvidi_ext = Extensive(nvidi)
nvidi_ext.solve(outputFlag=0)
nvidi_ext.first_stage_solution
nvidi_sddip = SDDiP(nvidi)
nvidi_sddip.solve(max_iterations=10, cuts=['SB'])
nvidi_sddip.db[-1]
nvidi_sddip.first_stage_solution

Infinite horizon problem

PSDDP solver use periodical SDDP algorithm to solve MSLP problems. Recall that we have constructed a 3-stage MSLP with period of 2 stages. So the backward pass will add cuts to these 3 stages. In the following snippet, forward_T is set to be 10, meaning that the forward pass will solve the first 10 stages and select trial solutions.

from msppy.utils.examples import construct_nvidinf
from msppy.solver import PSDDP
nvidinf = construct_nvidinf()
nvidinf_sddp = PSDDP(nvidinf)
nvidinf_sddp.solve(max_iterations=10, forward_T=10)
nvidinf_sddp.db[-1]
nvidinf_sddp.first_stage_solution
from msppy.evaluation import Evaluation
res = Evaluation(nvidinf)
res.run(n_simulations=1000, T=100, n_processes=3)
res.gap

Stopping criterion of the SDDP solver

SDDP algorithm is an iterative algorithm and thus we need to specify when to stop the algorithm. Besides simple numeric stopping criterion (max_iterations, max_time, max_stable_iterations, see documentation below), more sophisticated criterion are available. Returns back to the stage-wise independent finite discrete problem introduced at the beginning of this section, in which I specify max_iterations to be 10. Because this problem is tiny, one can be certain that after 10 iterations, it converges. When problem is large-scale and we cannot estimate a good threshold for number of iterations or computational time. We need something else.

Optimality gap

The following snippet specifies an optimality-gap based stopping criteria. Specifically, the SDDP solver evaluates the obtained policy every 1 iteration by compute the exact expected policy value (turning off simulation). If the gap becomes not larger than 1e-4, the algorithm will be stopped. We can see that the algorithm stops after 6 iterations.

from msppy.utils.examples import construct_nvid
from msppy.solver import SDDP
nvid = construct_nvid()
nvid_sddp = SDDP(nvid)
nvid_sddp.solve(freq_evaluations=1, n_simulations=-1, tol=1e-4)
nvid_sddp.iteration
nvid_sddp.db[-1]
nvid_sddp.first_stage_solution

Stabilization

The following snippet specifies a stabilization based stopping criteria. Specifically, the SDDP solver compares the policy every 1 iteration by computing the difference of the expected policy values. If the difference becomes not larger than 1e-4, the algorithm will be stopped. We can see that again, the algorithm stops after 6 iterations.

from msppy.utils.examples import construct_nvid
from msppy.solver import SDDP
nvid = construct_nvid()
nvid_sddp = SDDP(nvid)
nvid_sddp.solve(freq_comparisons=1, n_simulations=-1, tol=1e-4)
nvid_sddp.iteration
nvid_sddp.db[-1]
nvid_sddp.first_stage_solution

Module Reference

The Solver module

class msppy.solver.Extensive(MSP)[source]

Extensive solver class. Can solve 1. small-scale stgage-wise independent finite discrete risk netural problem 2. small-scale Markov chain risk neutral problem.

Parameters:MSP (list) – A multi-stage stochastic program object.
extensive_model

The constructed extensive model

solving_time

The time cost in solving extensive model

construction_time

The time cost in constructing extensive model

first_stage_solution

the obtained solution in the first stage

solve(start=0, flag_rolling=0, **kwargs)[source]

Call extensive solver to solve the discretized problem. It will first construct the extensive model and then call Gurobi solver to solve it.

Parameters:**kwargs (optional) – Gurobipy attributes to specify on extensive model.
class msppy.solver.PSDDP(*args, **kwargs)[source]
solve(forward_T=None, *args, **kwargs)[source]

Solve the discretized problem.

Parameters:forward_T (int, optional) – The number of stages to consider in forward passes and trial point selection
class msppy.solver.PSDDiP(*args, **kwargs)[source]
class msppy.solver.SDDP(MSP, biased_sampling=False)[source]

SDDP solver base class.

Parameters:MSP (list) – A multi-stage stochastic program object.
bounds

dataframe of the obtained bound

first_stage_solution

the obtained solution in the first stage

plot_bounds(start=0, window=1, smooth=0, ax=None)[source]

plot the evolution of bounds

Parameters:
  • ax (Matplotlib AxesSubplot instance, optional) – The specified subplot is used to plot; otherwise a new figure is created.
  • window (int, optional (default=1)) – The length of the moving windows to aggregate the policy values. If length is bigger than 1, approximate confidence interval of the policy values and statistical bounds will be plotted.
  • smooth (bool, optional (default=0)) – If 1, fit a smooth line to the policy values to better visualize the trend of statistical values/bounds.
  • start (int, optional (default=0)) – The start iteration to plot the bounds. Set start to other values can zoom in the evolution of bounds in most recent iterations.
Returns:

Return type:

matplotlib.pyplot.figure instance

solve(n_processes=1, n_steps=1, max_iterations=10000, max_stable_iterations=10000, max_time=1000000.0, tol=0.001, freq_evaluations=None, percentile=95, tol_diff=-inf, random_state=None, evaluation_true=False, freq_comparisons=None, n_simulations=3000, n_simulations_true=3000, query=None, query_T=None, query_dual=None, query_stage_cost=False, query_policy_value=False, freq_clean=None, logFile=1, logToConsole=1, directory='', rgl_norm='L2', rgl_a=0, rgl_b=0.95)[source]

Solve the discretized problem.

Parameters:
  • n_processes (int, optional (default=1)) – The number of processes to run in parallel. Run serial SDDP if 1. If n_steps is 1, n_processes is coerced to be 1.
  • n_steps (int, optional (default=1)) – The number of forward/backward steps to run in each cut iteration. It is coerced to be 1 if n_processes is 1.
  • max_iterations (int, optional (default=10000)) – The maximum number of iterations to run SDDP.
  • max_stable_iterations (int, optional (default=10000)) – The maximum number of iterations to have same deterministic bound
  • tol (float, optional (default=1e-3)) – tolerance for convergence of bounds
  • freq_evaluations (int, optional (default=None)) – The frequency of evaluating gap on the discretized problem. It will be ignored if risk averse
  • percentile (float, optional (default=95)) – The percentile used to compute confidence interval
  • diff (float, optional (default=-inf)) – The stabilization threshold
  • freq_comparisons (int, optional (default=None)) – The frequency of comparisons of policies
  • n_simulations (int, optional (default=10000)) – The number of simluations to run when evaluating a policy on the discretized problem
  • freq_clean (int/list, optional (default=None)) – The frequency of removing redundant cuts. If int, perform cleaning at the same frequency for all stages. If list, perform cleaning at different frequency for each stage; must be of length T-1 (the last stage does not have any cuts).
  • random_state (int, RandomState instance or None, optional (default=None)) – Used in evaluations and comparisons. (In the forward step, there is an internal random_state which is not supposed to be changed.) If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by numpy.random.
  • logFile (binary, optional (default=1)) – Switch of logging to log file
  • logToConsole (binary, optional (default=1)) – Switch of logging to console

Examples

>>> SDDP().solve(max_iterations=10, max_time=10,
    max_stable_iterations=10)
Optimality gap based stopping criteria: evaluate the obtained policy
every freq_evaluations iterations by running n_simulations Monte Carlo
simulations. If the gap becomes not larger than tol, the algorithm
will be stopped.
>>> SDDP().solve(freq_evaluations=10, n_simulations=1000, tol=1e-2)
Simulation can be turned off; the solver will evaluate the exact expected
policy value.
>>> SDDP().solve(freq_evaluation=10, n_simulations=-1, tol=1e-2)
Stabilization based stopping criteria: compare the policy every
freq_comparisons iterations by computing the CI of difference of the
expected policy values. If the upper end of CI becomes not larger
than tol diff, the algorithm will be stopped.
>>> SDDP().solve(freq_comparisons=10, n_simulations=1000, tol=1e-2)
Turn off simulation and
class msppy.solver.SDDiP(MSP, biased_sampling=False)[source]

SDDP solver base class.

Parameters:MSP (list) – A multi-stage stochastic program object.
solve(cuts, pattern=None, relax_stage=None, level_step_size=0.2929, level_max_stable_iterations=1000, level_max_iterations=1000, level_max_time=1000, level_mip_gap=0.0001, level_tol=0.001, *args, **kwargs)[source]

Call SDDiP solver to solve the discretized problem.

Parameters:
  • cuts (list) – Entries of the list could be ‘B’,’SB’,’LG’
  • pattern (dict, optional (default=None)) – The pattern of adding cuts can be cyclical or barrier-in. See the example below.
  • relax_stage (int, optional (default=None)) – All stage models after relax_stage (exclusive) will be relaxed.
  • level_step_size (float, optional (default=0.2929)) – Step size for level method.
  • level_max_stable_iterations (int, optional (default=1000)) – The maximum number of iterations to have the same deterministic g_* for the level method.
  • level_mip_gap (float, optional (default=1e-4)) – The MIPGap used when solving the inner problem for the level method.
  • level_max_iterations (int, optional (default=1000)) – The maximum number of iterations to run for the level method.
  • level_max_time (int, optional (default=1000)) – The maximum number of time to run for the level method.
  • level_tol (float, optional (default=1e-3)) – Tolerance for convergence of bounds for the level method.

Examples

>>> SDDiP().solve(max_iterations=10, cut=['SB'])

The following cyclical add difference cuts. Specifically, for every six iterations add Benders’ cuts for the first four, strengthened Benders’ cuts for the fifth, and Lagrangian cuts for the last.

>>> SDDiP().solve(max_iterations=10, cut=['B','SB','LG'],
...     pattern={"cycle": (4, 1, 1)})

The following add difference cuts from certain iterations. Specifically, add Benders’ cuts from the beginning, Strengthened Benders’ cuts from the fourth iteration, and Lagragian cuts from the fifth iteration.

>>> SDDiP().solve(max_iterations=10, cut=['B','SB','LG'],
...     pattern={'in': (0, 4, 5)})

The Evaluation module

class msppy.evaluation.Evaluation(MSP)[source]

Evaluation base class.

Parameters:MSP (list) – A multi-stage stochastic program object.
db

The deterministic bounds.

Type:list
pv

The simulated policy values.

Type:list
epv

The exact value of expected policy value (only available for approximation model).

Type:float
CI

The CI of simulated policy values.

Type:tuple
gap

The gap between upper end of the CI and deterministic bound.

Type:float
stage_cost

The cost of individual stage models.

Type:dataframe
solution

The solution of queried variables.

Type:dataframe
n_sample_paths

The number of sample paths to evaluate policy.

Type:int
sample_paths_idx

The index list of exhaustive sample paths if simulation is turned off.

Type:list
markovian_samples

The simulated Markovian type samples.

markovian_idx

The Markov state that is the closest to the markovian_samples.

Type:list
run(*args, **kwargs)[source]

Run a Monte Carlo simulation to evaluate the policy.

Parameters:
  • n_simulations (int/-1) – If int: the number of simulations; If -1: exhuastive evaluation.
  • percentile (float, optional (default=95)) – The percentile used to compute the confidence interval.
  • query (list, optional (default=None)) – The names of variables that are intended to query.
  • query_dual (list, optional (default=None)) – The names of constraints whose dual variables are intended to query.
  • query_stage_cost (bool, optional (default=False)) – Whether to query values of individual stage costs.
  • n_processes (int, optional (default=1)) – The number of processes to run the simulation.
  • T (int, optional (default=None)) – For infinite horizon problem, the number stages to evaluate the policy.
class msppy.evaluation.EvaluationTrue(MSP)[source]

Evaluation base class.

Parameters:MSP (list) – A multi-stage stochastic program object.
db

The deterministic bounds.

Type:list
pv

The simulated policy values.

Type:list
epv

The exact value of expected policy value (only available for approximation model).

Type:float
CI

The CI of simulated policy values.

Type:tuple
gap

The gap between upper end of the CI and deterministic bound.

Type:float
stage_cost

The cost of individual stage models.

Type:dataframe
solution

The solution of queried variables.

Type:dataframe
n_sample_paths

The number of sample paths to evaluate policy.

Type:int
sample_paths_idx

The index list of exhaustive sample paths if simulation is turned off.

Type:list
markovian_samples

The simulated Markovian type samples.

markovian_idx

The Markov state that is the closest to the markovian_samples.

Type:list
run(*args, **kwargs)[source]

Run a Monte Carlo simulation to evaluate the policy.

Parameters:
  • n_simulations (int/-1) – If int: the number of simulations; If -1: exhuastive evaluation.
  • percentile (float, optional (default=95)) – The percentile used to compute the confidence interval.
  • query (list, optional (default=None)) – The names of variables that are intended to query.
  • query_dual (list, optional (default=None)) – The names of constraints whose dual variables are intended to query.
  • query_stage_cost (bool, optional (default=False)) – Whether to query values of individual stage costs.
  • n_processes (int, optional (default=1)) – The number of processes to run the simulation.
  • T (int, optional (default=None)) – For infinite horizon problem, the number stages to evaluate the policy.