---
output: github_document
bibliography: pomdp.bib
---
```{r echo=FALSE, results = 'asis'}
pkg <- 'pomdp'
source("https://raw.githubusercontent.com/mhahsler/pkg_helpers/main/pkg_helpers.R")
pkg_title(pkg)
```
## Introduction
A partially observable Markov decision process (POMDP) models a decision process
in which the agent cannot directly observe the state of the environment and must instead rely on
observations. The goal is to find an optimal policy to guide the agent's actions.
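
Because the state is hidden, the agent typically maintains a belief, that is, a probability distribution over states, and revises it after every observation. The following is a minimal base-R sketch of that Bayesian update for a tiger-style toy problem. The function `belief_update()`, the observation labels, and the 85% listening accuracy are illustrative assumptions and are not part of the package.

```{r}
# Illustrative two-state problem; all names and numbers here are assumptions.
states <- c("tiger-left", "tiger-right")

# Listening does not move the tiger, so the transition matrix is the identity.
# Rows are the current state s, columns the next state s'.
trans <- diag(2)
dimnames(trans) <- list(states, states)

# Observation probabilities P(o | s') for the listen action:
# the agent hears the correct side 85% of the time.
obs <- matrix(c(0.85, 0.15,
                0.15, 0.85),
              nrow = 2, byrow = TRUE,
              dimnames = list(states, c("hear-left", "hear-right")))

# Bayesian belief update:
# b'(s') is proportional to P(o | s') * sum_s P(s' | s) * b(s).
belief_update <- function(belief, trans, obs, observation) {
  predicted <- as.vector(belief %*% trans)         # push the belief through the transition model
  unnormalized <- obs[, observation] * predicted   # weight by the observation likelihood
  unnormalized / sum(unnormalized)                 # renormalize to a probability distribution
}

b0 <- c(0.5, 0.5)                                  # uniform prior over the two states
b1 <- belief_update(b0, trans, obs, "hear-left")
b1
```

Starting from a uniform prior, hearing the tiger on the left once shifts the belief to 0.85 for `tiger-left` and 0.15 for `tiger-right`.
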
The `pomdp` package provides the infrastructure to define optimal control problems
formulated as POMDPs and to analyze their solutions.
The package uses the solvers from [pomdp-solve](http://www.pomdp.org/code/) (Cassandra, 2015)
available in the companion R package [**pomdpSolve**](https://github.com/mhahsler/pomdpSolve) to solve
POMDPs using a variety of exact and approximate algorithms.
The package provides
fast functions (using C++, sparse matrix representation, and parallelization with `foreach`)
to perform experiments: sampling from the belief space,
simulating trajectories,
updating beliefs, and
calculating the regret of a policy. The package also interfaces
to the following algorithms:
* Exact value iteration
  - __Enumeration algorithm__ [@Sondik1971; @Monahan1982].
  - __Two pass algorithm__ [@Sondik1971].
  - __Witness algorithm__ [@Littman1995].
  - __Incremental pruning algorithm__ [@Zhang1996; @Cassandra1997].
* Approximate value iteration
  - __Finite grid algorithm__ [@Cassandra2015], a variation of point-based value iteration to solve larger POMDPs (__PBVI__; see [@Pineau2003]) without dynamic belief set expansion.
  - __SARSOP__ [@Kurniawati2008], a point-based algorithm that approximates optimally reachable belief spaces for infinite-horizon problems (via package [sarsop](https://github.com/boettiger-lab/sarsop)).
If you are new to POMDPs, start with the [POMDP Tutorial](https://pomdp.org/tutorial/).
```{r echo=FALSE, results = 'asis'}
pkg_citation(pkg, 1)
pkg_install(pkg)
```
## Usage
Solving the simple infinite-horizon Tiger problem.
```{r}
library("pomdp")
data("Tiger")
Tiger
```
```{r}
sol <- solve_POMDP(model = Tiger)
sol
```
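
The call above relies on the default solver. As a hedged sketch, one of the algorithms listed in the introduction can be requested explicitly. This assumes that `solve_POMDP()` accepts a `method` argument with the value `"incprune"` for incremental pruning and that `policy()` extracts the solved policy, as described in the package documentation; check `?solve_POMDP` for the options available in the installed version.

```{r}
library("pomdp")
data("Tiger")

# Assumption: method = "incprune" selects the incremental pruning algorithm.
sol_ip <- solve_POMDP(model = Tiger, method = "incprune")
sol_ip

# Inspect the policy found by the solver.
policy(sol_ip)
```
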
Display the value function.
```{r value_function, fig.height=4}
plot_value_function(sol, ylim = c(0, 20))
```
Display the policy graph.
```{r policy_graph, fig.height=7}
plot_policy_graph(sol)
```
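
The experiment helpers mentioned in the introduction can be tried on the solved model. The sketch below assumes the functions `update_belief()` and `simulate_POMDP()` with the argument names shown, and the action and observation labels `"listen"` and `"tiger-left"` from the Tiger data set. Argument names and the structure of the returned objects may differ between package versions, so treat this as a starting point rather than a reference.

```{r}
library("pomdp")
data("Tiger")
sol <- solve_POMDP(model = Tiger)

# Update a uniform belief after listening and hearing the tiger on the left.
# Assumption: these action and observation labels match the Tiger model.
b1 <- update_belief(sol, belief = c(0.5, 0.5),
                    action = "listen", observation = "tiger-left")
b1

# Simulate trajectories that follow the solved policy.
# Assumption: n is the number of trajectories and horizon limits their length.
set.seed(42)
sim <- simulate_POMDP(sol, n = 100, horizon = 10)
```
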
## Acknowledgments
Development of this package was supported in part by
the National Institute of Standards and Technology (NIST) under grant number
[60NANB17D180](https://www.nist.gov/ctl/pscr/safe-net-integrated-connected-vehicle-computing-platform).
## References