## Seminar 9: Value of Information {#voi}
<!-- reference with [Models](#value_of_information_seminar) -->
Welcome to the Value of Information seminar for the course **Decision Analysis and Forecasting for Agricultural Development**. Feel free to bring up any questions or concerns in Slack, or send them to [Dr. Cory Whitney](mailto:[email protected]?subject=[Seminar_4]%20Decision%20Analysis%20Lecture) or the course tutor.
### Value of information and value of control
<!-- Cory's EVPI video -->
Since the model inputs are uncertain variables, there is always a chance that the decision turns out to be wrong. Gathering new data to inform the decision will, on average, lead to better decisions. But how much better? Is it worth the measurement effort? And what if we not only *measure* but even *manipulate* the model inputs? These concepts are informally known as the "value of clairvoyance" and the "value of wizardry". Please watch the following lecture by Johannes Kopton and enter the magical world of decision analysis:
<iframe width="560" height="315" src="https://www.youtube.com/embed/g-MxM_2rWJg" frameborder="0" allowfullscreen></iframe>
```{r johannes-forecasts-question-1, echo=FALSE}
question("A lottery sells 200 tickets at 10€ each. The winner receives 1000€. The decision is whether or not to buy a ticket. What is the value of information on whether a specific ticket will be a winner?",
answer("0 €"),
answer("4.95 €", correct = TRUE),
answer("10 €"),
answer("990 €"),
incorrect = "Watch [the talk](https://youtu.be/g-MxM_2rWJg) and try again.",
allow_retry = TRUE
)
```
```{r johannes-forecasts-question-2, echo=FALSE}
question("What is the value of controlling whether a specific ticket will be a winner?",
answer("0 €"),
answer("4.95 €"),
answer("10 €"),
answer("990 €", correct = TRUE),
incorrect = "Watch [the talk](https://youtu.be/g-MxM_2rWJg) and try again.",
allow_retry = TRUE
)
```
```{r schiffers-forecasts-question-2, echo=FALSE}
question("Why do we calculate Expected Value of Perfect Information (EVPI)?",
answer("To help identify those variables for which additional information would be most valuable for the decision.", correct = TRUE),
answer("To define potential research priorities.", correct = TRUE),
answer("It is more a hypothetical concept without any real application."),
incorrect = "Watch [the talk](https://youtu.be/g-MxM_2rWJg) and try again.",
allow_retry = TRUE
)
```
```{r schiffers-forecasts-question-3, echo=FALSE}
question("How is EVPI calculated?",
answer("It is the expected maximum value minus the expected value given perfect information."),
answer("It is the expected value given perfect information minus the best expected value without further information.", correct = TRUE),
answer("It is the Variable Importance in the Projection (VIP) score multiplied by the number of variables in the model."),
incorrect = "Watch [the talk](https://youtu.be/g-MxM_2rWJg) and try again.",
allow_retry = TRUE
)
```
Technically, we assess the possibility of making the wrong decision with a payoff matrix $R_{ij}$, where the row index $i$ describes a choice and the column index $j$ describes a state of a random variable that the decision maker does not yet know; the system is in state $j$ with probability $p_{j}$. If the decision maker has to choose $i$ without knowing the value of $j$, the best choice is the one that maximizes the expected value:
${\mbox{EMV}}=\max _{i}\sum _{j}p_{j}R_{ij}$
where $\sum _{j}p_{j}R_{ij}$ is the expected payoff of action $i$, i.e. its expectation value, and $\max _{i}$ selects the action with the highest of these expectations.
However, with perfect knowledge of $j$, the decision maker would choose the value of $i$ that optimizes the payoff for that specific $j$. Therefore, the expected value given perfect information is
${\mbox{EV}}|{\mbox{PI}}=\sum _{j}p_{j}(\max _{i}R_{ij})$
where $p_{j}$ is the probability that the system is in state $j$, and $R_{ij}$ is the pay-off if one follows action $i$ while the system is in state $j$. Here $(\max_{i}R_{ij})$ indicates the best choice of action $i$ for each state $j$.
The expected value of perfect information is the difference between these two quantities,
${\mbox{EVPI}}={\mbox{EV}}|{\mbox{PI}}-{\mbox{EMV}}$
This difference describes, in expectation, how much more the decision maker can hope to gain by learning $j$ and picking the best $i$ for that $j$, compared with picking a value of $i$ before $j$ is known. Since $EV|PI$ is necessarily greater than or equal to $EMV$, $EVPI$ is always non-negative.
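Applied to the lottery from the quiz above, these formulas can be checked numerically in base R. This is a sketch for illustration, not part of the seminar code; the payoff matrix and variable names are our own construction:

```r
# Lottery example: 200 tickets at 10 EUR each, prize 1000 EUR.
# States j: the specific ticket wins or loses.
p <- c(win = 1 / 200, lose = 199 / 200)   # probabilities p_j of each state

# Payoff matrix R[i, j]: rows are actions, columns are states.
R <- rbind(buy  = c(1000 - 10, -10),      # buy: net 990 if it wins, -10 if not
           skip = c(0, 0))                # not buying pays nothing either way

EMV   <- max(R %*% p)                     # best action without information
EV_PI <- sum(p * apply(R, 2, max))        # best action chosen per state j
EVPI  <- EV_PI - EMV                      # expected value of perfect information

EVPI   # 4.95, matching the quiz answer
```

Note that the expected payoff of buying is $-5$€, so without information the best action is not to buy ($EMV = 0$); with perfect information one buys only the winning ticket, giving $EV|PI = 990/200 = 4.95$€.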
### Projection to Latent Structures
Projection to Latent Structures (PLS), also sometimes known as Partial Least Squares regression, is a multivariate statistical technique that can deal with multiple collinear dependent and independent variables [@wold_pls-regression_2001]. It can be used as another means to assess the outcomes of a Monte Carlo model. Read more in ['A Simple Explanation of Partial Least Squares' by Kee Siong Ng](http://users.cecs.anu.edu.au/~kee/pls.pdf).
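As a rough illustration of what PLS does under the hood, the following base-R sketch implements a minimal single-response PLS fit via the NIPALS algorithm on simulated data. The function name, toy data, and component count are assumptions for illustration only; in practice you would use the built-in functions of the `decisionSupport` or `pls` packages:

```r
# Minimal single-response PLS (NIPALS), base R only -- a sketch for intuition.
pls1_nipals <- function(X, y, A = 2) {
  X <- scale(X); y <- as.vector(scale(y))
  K <- ncol(X)
  W <- P <- matrix(0, K, A); q <- numeric(A)
  for (a in seq_len(A)) {
    w <- crossprod(X, y)
    w <- w / sqrt(sum(w^2))                  # unit-norm weight vector
    t_a <- as.vector(X %*% w)                # component scores
    p_a <- crossprod(X, t_a) / sum(t_a^2)    # X loadings
    q[a] <- sum(y * t_a) / sum(t_a^2)        # y loading
    X <- X - tcrossprod(t_a, p_a)            # deflate X ...
    y <- y - q[a] * t_a                      # ... and y, then repeat
    W[, a] <- w; P[, a] <- p_a
  }
  list(weights = W, loadings = P, y_loadings = q)
}

# Toy data: y depends strongly on variable 1, weakly on 2, not on 3.
set.seed(1)
X <- matrix(rnorm(100 * 3), 100, 3)
y <- 2 * X[, 1] + X[, 2] + rnorm(100, sd = 0.1)
fit <- pls1_nipals(X, y)
```

The first weight vector `fit$weights[, 1]` is dominated by variable 1, reflecting its influence on `y`; these weights are the $W_a$ that enter the VIP scores discussed next.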
Variable Importance in Projection (VIP) scores estimate the importance of each variable in the projection used in a PLS model. VIP is a parameter used for calculating the cumulative measure of the influence of individual $X$-variables on the model. For a given PLS dimension $a$, the squared PLS weight ($W_a^2$) of a term is multiplied by the explained sum of squares ($SS$) of that PLS dimension; the value obtained is then divided by the total $SS$ explained by the PLS model and multiplied by the number of terms in the model. The final VIP is the square root of that number.
$VIP_{PLS} = \sqrt{K\times \frac{\sum_{a=1}^{A}(W_{a}^{2} \times SSY_{comp,a})}{SSY_{cum}}}$
Technically, VIP is a weighted combination, over all components, of the squared PLS weights ($W_{a}^{2}$), where $SSY_{comp,a}$ is the sum of squares of $Y$ explained by component $a$, $A$ is the total number of components, and $K$ is the total number of variables. The average of the squared VIP values equals 1, because the sum of squares of all VIP values is equal to the number of variables in $X$. A variable with a VIP score close to or greater than 1 can be considered important in a given model. The input is a PLS model and the output is a set of column vectors equal in length to the number of variables included in the model. See @galindo-prieto_variable_2014 for a detailed description of variations of VIP analysis.
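The VIP formula can be illustrated with assumed toy numbers in base R; the weight matrix `W` and the explained sums of squares `SSY_comp` below are made up for illustration, not taken from any real model:

```r
# Toy VIP calculation: K = 3 variables, A = 2 PLS components.
K <- 3; A <- 2

# Assumed PLS weights, one column per component, normalized to unit length.
W <- cbind(c(0.8, 0.5, 0.33), c(0.1, 0.6, 0.79))
W <- apply(W, 2, function(w) w / sqrt(sum(w^2)))

SSY_comp <- c(0.6, 0.2)       # assumed variance in Y explained per component
SSY_cum  <- sum(SSY_comp)     # total explained variance

# VIP_k = sqrt( K * sum_a( W_ka^2 * SSY_comp_a ) / SSY_cum )
VIP <- sqrt(K * as.vector(W^2 %*% SSY_comp) / SSY_cum)
round(VIP, 2)
```

Here variable 1, which carries most weight in the dominant first component, scores above 1 and would be flagged as important; by construction `mean(VIP^2)` equals 1, matching the property stated above.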
In the seminar we will go into detail about how to calculate PLS and the VIP scores with the built-in functions of the `decisionSupport` package.