Evaluate the obtained policy
The previous section showed how to evaluate the obtained policy. This section looks at evaluation in more detail.
Take the stage-wise independent continuous problem introduced earlier. We set the number of stages to T=3 so that the problem is slightly more complex. We use an optimality gap below 1e-3 as the stopping criterion and set n_simulations=-1 to turn off simulation, so that the gap is computed exactly. As shown below, the evaluation after ten iterations reports an optimality gap of 0.22%; after twenty iterations the gap drops to 0.00%, which is below the tolerance, so the algorithm stops. The resulting optimal value is 6.68, and the first-stage solution is to buy 9.08 units.
[1]:
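from msppy.msp import MSLP
import numpy as np
from msppy.solver import SDDP
from msppy.evaluation import Evaluation, EvaluationTrue

# Three-stage maximization problem (sense=-1) with an initial bound of 100
nvic = MSLP(T=3, sense=-1, bound=100)

# Random demand: lognormal sampler used as the uncertain right-hand side
def f(random_state):
    return random_state.lognormal(mean=np.log(4), sigma=2)

for t in range(3):
    m = nvic[t]
    buy_now, buy_past = m.addStateVar(name='bought', obj=-1.0)
    if t != 0:
        sold = m.addVar(name='sold', obj=2)
        unsatisfied = m.addVar(name='unsatisfied')
        recycled = m.addVar(name='recycled', obj=0.5)
        # sold + unsatisfied equals the random demand
        m.addConstr(sold + unsatisfied == 0, uncertainty={'rhs': f})
        # everything bought in the previous stage is either sold or recycled
        m.addConstr(sold + recycled == buy_past)

# Approximate the continuous demand with 100 samples
nvic.discretize(random_state=1, n_samples=100)
nvic_sddp = SDDP(nvic)
# Evaluate every 10 iterations; n_simulations=-1 turns off simulation
nvic_sddp.solve(max_iterations=30, freq_evaluations=10, n_simulations=-1, tol=1e-3)
nvic_sddp.db[-1]
nvic_sddp.first_stage_solution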
Academic license - for non-commercial use only
----------------------------------------------------------------
SDDP Solver, Lingquan Ding
----------------------------------------------------------------
Iteration Bound Value Time
----------------------------------------------------------------
----------------------------------------------------------------------------
Evaluation for approximation model, Lingquan Ding
----------------------------------------------------------------------------
Iteration Bound Value Time
----------------------------------------------------------------------------
1 75.000000 0.000000 0.030841
2 16.762950 4.681206 0.023101
3 14.183936 -4.237863 0.014929
4 7.146810 18.153307 0.033554
5 7.077500 -0.167953 0.022254
6 7.013682 17.689941 0.023259
7 6.792215 6.187859 0.019248
8 6.720049 32.249301 0.017918
9 6.699237 11.650188 0.017348
10 6.687975 5.936189 0.017623
10 6.687975 6.673283 8.646298 0.22%
11 6.683078 -6.759604 0.021045
12 6.681344 -5.749741 0.028955
13 6.680848 1.394741 0.019374
14 6.680848 7.709294 0.021415
15 6.680848 15.143449 0.022055
16 6.680848 15.751499 0.019867
17 6.680848 -9.404420 0.015488
18 6.680848 8.153084 0.021040
19 6.680848 4.383224 0.021477
20 6.680848 9.160585 0.016293
20 6.680848 6.680848 9.266528 0.00%
----------------------------------------------------------------
Time: 0.42708349227905273 seconds
Algorithm stops since convergence tolerance:0.001 has reached
----------------------------------------------------------------------------
Time: 17.91282606124878 seconds
[1]:
{'bought': 9.082937518406089}
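The last column of the two evaluation rows in the log (iterations 10 and 20) is the gap that the stopping rule compares with `tol`. As a minimal sketch, assuming the gap is the absolute difference between the bound and the evaluated policy value divided by the bound (the log does not state the formula, and `relative_gap` is a hypothetical helper, not part of msppy), the reported percentages can be reproduced from the logged numbers:

```python
def relative_gap(bound, value):
    """Relative gap between bound and policy value (assumed definition)."""
    return abs(bound - value) / abs(bound)

# (bound, evaluated policy value) copied from the two evaluation rows of the log
evaluations = {10: (6.687975, 6.673283), 20: (6.680848, 6.680848)}
tol = 1e-3  # same tolerance as passed to solve()

results = {}
for iteration, (bound, value) in evaluations.items():
    gap = relative_gap(bound, value)
    results[iteration] = (gap, gap < tol)
    print(f"iteration {iteration}: gap = {gap:.2%}, below tol = {gap < tol}")
```

With the logged values this gives 0.22% at iteration 10 (above the tolerance of 0.1%) and 0.00% at iteration 20, which is consistent with the solver stopping after the second evaluation.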
Evaluate the policy on the true problem
[2]:
res_true = EvaluationTrue(nvic)
res_true.run(n_simulations=3000, percentile=95,
             query=['sold','bought','unsatisfied','recycled'], query_stage_cost=True)
res_true.CI
[2]:
(4.673637020980986, 5.326828207132874)
[3]:
res_true.stage_cost
[3]:
| | 0 | 1 | 2 |
|---|---|---|---|
| 0 | -9.082938 | 6.136286 | 9.204739 |
| 1 | -9.082938 | 6.136286 | 6.121598 |
| 2 | -9.082938 | -4.323174 | 24.059178 |
| 3 | -9.082938 | -5.103329 | 24.059178 |
| 4 | -9.082938 | -5.171771 | 19.555382 |
| ... | ... | ... | ... |
| 2995 | -9.082938 | -0.296567 | 24.059178 |
| 2996 | -9.082938 | -7.281342 | 8.099300 |
| 2997 | -9.082938 | -7.119978 | 24.059178 |
| 2998 | -9.082938 | 0.424552 | 6.393926 |
| 2999 | -9.082938 | 6.136286 | 6.346240 |
3000 rows × 3 columns
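Each row of `res_true.stage_cost` is one simulated sample path and each column is a stage, so summing a row gives the total value of the policy on that path. The sketch below does this for the ten rows displayed above and builds a normal-approximation 95% confidence interval for the mean. The helper name `mean_confidence_interval` and the choice of a normal approximation are assumptions made for illustration; how `EvaluationTrue` computes `CI` internally is not shown here, and ten rows will not reproduce the interval obtained from all 3000 simulations.

```python
import math
import statistics

# Stage costs (stages 0, 1, 2) copied from the ten rows displayed above.
stage_cost_rows = [
    (-9.082938, 6.136286, 9.204739),
    (-9.082938, 6.136286, 6.121598),
    (-9.082938, -4.323174, 24.059178),
    (-9.082938, -5.103329, 24.059178),
    (-9.082938, -5.171771, 19.555382),
    (-9.082938, -0.296567, 24.059178),
    (-9.082938, -7.281342, 8.099300),
    (-9.082938, -7.119978, 24.059178),
    (-9.082938, 0.424552, 6.393926),
    (-9.082938, 6.136286, 6.346240),
]

def mean_confidence_interval(samples, percentile=95):
    """Normal-approximation confidence interval for the sample mean."""
    n = len(samples)
    mean = statistics.fmean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(n)
    # Two-sided interval: percentile=95 uses the 97.5% normal quantile
    z = statistics.NormalDist().inv_cdf(0.5 + percentile / 200)
    return mean - z * std_err, mean + z * std_err

# Total value of the policy on each displayed sample path
totals = [sum(row) for row in stage_cost_rows]
lower, upper = mean_confidence_interval(totals)
print(f"mean = {statistics.fmean(totals):.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

With the full DataFrame, the same row totals could be obtained with `res_true.stage_cost.sum(axis=1)`, a standard pandas call.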
[4]:
res_true.solution['sold']
[4]:
| | 0 | 1 | 2 |
|---|---|---|---|
| 0 | NaN | 9.082938 | 2.126630 |
| 1 | NaN | 9.082938 | 0.071202 |
| 2 | NaN | 2.109964 | 12.029589 |
| 3 | NaN | 1.589861 | 12.029589 |
| 4 | NaN | 1.544233 | 9.027058 |
| ... | ... | ... | ... |
| 2995 | NaN | 4.794369 | 12.029589 |
| 2996 | NaN | 0.137852 | 1.389670 |
| 2997 | NaN | 0.245428 | 12.029589 |
| 2998 | NaN | 5.275114 | 0.252754 |
| 2999 | NaN | 9.082938 | 0.220964 |
3000 rows × 3 columns
[5]:
res_true.solution['bought']
[5]:
| | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 9.082938 | 12.029589 | 0.0 |
| 1 | 9.082938 | 12.029589 | 0.0 |
| 2 | 9.082938 | 12.029589 | 0.0 |
| 3 | 9.082938 | 12.029589 | 0.0 |
| 4 | 9.082938 | 12.029589 | 0.0 |
| ... | ... | ... | ... |
| 2995 | 9.082938 | 12.029589 | 0.0 |
| 2996 | 9.082938 | 12.029589 | 0.0 |
| 2997 | 9.082938 | 12.029589 | 0.0 |
| 2998 | 9.082938 | 12.029589 | 0.0 |
| 2999 | 9.082938 | 12.029589 | 0.0 |
3000 rows × 3 columns
[6]:
res_true.solution['unsatisfied']
[6]:
| | 0 | 1 | 2 |
|---|---|---|---|
| 0 | NaN | 5.314715 | 0.000000 |
| 1 | NaN | 2.338170 | 0.000000 |
| 2 | NaN | 0.000000 | 10.243121 |
| 3 | NaN | 0.000000 | 2.060445 |
| 4 | NaN | 0.000000 | 0.000000 |
| ... | ... | ... | ... |
| 2995 | NaN | 0.000000 | 2.027065 |
| 2996 | NaN | 0.000000 | 0.000000 |
| 2997 | NaN | 0.000000 | 10.667744 |
| 2998 | NaN | 0.000000 | 0.000000 |
| 2999 | NaN | 21.167075 | 0.000000 |
3000 rows × 3 columns
[7]:
res_true.solution['recycled']
[7]:
| | 0 | 1 | 2 |
|---|---|---|---|
| 0 | NaN | 0.000000 | 9.902959 |
| 1 | NaN | 0.000000 | 11.958387 |
| 2 | NaN | 6.972973 | 0.000000 |
| 3 | NaN | 7.493077 | 0.000000 |
| 4 | NaN | 7.538705 | 3.002531 |
| ... | ... | ... | ... |
| 2995 | NaN | 4.288569 | 0.000000 |
| 2996 | NaN | 8.945085 | 10.639919 |
| 2997 | NaN | 8.837509 | 0.000000 |
| 2998 | NaN | 3.807823 | 11.776834 |
| 2999 | NaN | 0.000000 | 11.808625 |
3000 rows × 3 columns
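The queried solutions can be cross-checked against the model definition. The sketch below takes row 0 of the tables above and recomputes each stage cost from the objective coefficients in the model (-1 for `bought`, 2 for `sold`, 0.5 for `recycled`), and also evaluates the balance constraint `sold + recycled == buy_past`. The numbers are copied by hand from the displayed output, so a small tolerance (an arbitrary choice here) absorbs the rounding to six decimals.

```python
# Row 0 of each table above, indexed by stage (0, 1, 2).
# Stage 0 has no sold/unsatisfied/recycled variables, hence None (NaN in the tables).
bought = [9.082938, 12.029589, 0.0]
sold = [None, 9.082938, 2.126630]
unsatisfied = [None, 5.314715, 0.0]
recycled = [None, 0.0, 9.902959]
reported_cost = [-9.082938, 6.136286, 9.204739]

TOLERANCE = 1e-5  # allows for six-decimal rounding in the displayed tables

recomputed_cost = []
balance_residual = []
for t in range(3):
    cost = -1.0 * bought[t]  # purchase cost at every stage
    if t > 0:
        cost += 2.0 * sold[t] + 0.5 * recycled[t]  # sales revenue and recycling value
        # sold + recycled should equal the amount bought one stage earlier
        balance_residual.append(sold[t] + recycled[t] - bought[t - 1])
        demand = sold[t] + unsatisfied[t]  # realized demand on this path
        print(f"stage {t}: demand = {demand:.6f}")
    recomputed_cost.append(cost)
    matches = abs(cost - reported_cost[t]) < TOLERANCE
    print(f"stage {t}: recomputed cost = {cost:.6f}, matches reported = {matches}")
```

For this row the recomputed costs agree with `res_true.stage_cost` up to rounding, and the balance residuals are zero to the displayed precision.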