Releases · mhahsler/pomdp
pomdp 1.2.4 (12/04/2024)
New Features
- Added the DynaMaze MDP dataset.
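  A minimal sketch of using the new dataset, assuming it loads as a regular MDP object that the default solver accepts:

  ```r
  library(pomdp)

  # load the new DynaMaze MDP and solve it with the default MDP solver
  data("DynaMaze")
  sol <- solve_MDP(DynaMaze)
  policy(sol)
  ```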
Bugfixes
- gridworld_maze_MDP: start state is now recorded in info.
- policy_graph: use complete parameter name.
pomdp 1.2.3 (05/04/2024)
Bugfixes
- Fixed possible memory violation in observation_matrix() and
transition_matrix().
pomdp 1.2.0
New Features
- Added functions to work with MDP policies (see ? MDP_policy_functions).
- Added MDP solver functions: Q-learning, Sarsa, and expected Sarsa (see the sketch after this list).
- simulate_MDP() and simulate_POMDP() gained the parameter return_trajectories.
- New functions absorbing_states() and reachable_states() for MDPs and POMDPs.
- Support for gridworlds (see ? gridworld).
- New datasets: Cliff_walking, Windy_gridworld, RussianTiger
- plot_transition_graph() now hides unavailable actions.
- Added actions() to find available actions (unavailable actions have a reward of -Inf).
- Added make_partially_observable() and make_fully_observable() to convert between MDPs and POMDPs.
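A minimal sketch of the new MDP solvers and trajectory output; the method string "q_learning" and the $trajectories element are assumptions based on the notes above:

```r
library(pomdp)

# solve the new Cliff_walking gridworld with Q-learning
# (method string assumed from the release notes)
data("Cliff_walking")
sol <- solve_MDP(Cliff_walking, method = "q_learning")

# simulate the learned policy and keep the trajectories
sim <- simulate_MDP(sol, n = 10, return_trajectories = TRUE)
head(sim$trajectories)
```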
Changes
- simulate_POMDP(): Better calculation of T for infinite-horizon problems.
- Several functions are now generics with methods for POMDP and MDP.
- policy() lost the parameters alpha and action.
- policy() and value_function() gained the parameter drop.
- regret(): renamed parameter belief to start. Regret is now available for MDPs (see the sketch after this list).
- simulate_MDP() now stops at absorbing states.
- simulate_MDP_cpp() now works with the sparse model representation.
- POMDP and MDP gained a field for additional info.
- approx_MDP_policy_evaluation() is now called MDP_policy_evaluation() and gained the parameter theta as an additional stopping criterion.
- Rewrote all accessor code (reward_matrix, transition_matrix, observation_matrix) for better and faster access.
- normalize() gained parameters for more detailed normalization.
- POMDP() and MDP() lost the parameter normalize.
- model.h now has support for keywords in transition_prob and observation_prob.
- MDP2POMDP() is now called make_partially_observable().
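A minimal sketch of the renamed regret() interface; the benchmark argument name and the "uniform" start value are assumptions for illustration:

```r
library(pomdp)

data("Tiger")
sol       <- solve_POMDP(Tiger)              # benchmark policy
sol_short <- solve_POMDP(Tiger, horizon = 3) # policy to evaluate

# 'start' replaces the former 'belief' parameter
regret(sol_short, benchmark = sol, start = "uniform")
```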
Bugfixes
- q_values_MDP(), solve_MDP(): Fixed reward representation issue.
- reward_val_cpp(): fixed observation matching bug.
pomdp 1.1.1 (09/04/2023)
Changes
- plot_policy_graph(): The parameter order has slightly changed; belief_col is now called state_col; unreachable states are now suppressed.
- policy() gained the parameters alpha and action.
- Color palettes are now exported.
- POMDP accessors gained the parameter drop.
- The POMDP constructor and read_POMDP() gained the parameter normalize and, by default, normalize the POMDP definition.
New Features
- Large POMDP descriptions are now handled better by keeping the reward as a data.frame and supporting sparse matrices in the C++ code.
- New function value_function() to access alpha vectors (see the sketch after this list).
- New function regret() to calculate the regret of a policy.
- transition_graph() to visualize the transition model.
- Problem descriptions are now normalized by default.
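A minimal sketch of inspecting alpha vectors with the new value_function():

```r
library(pomdp)

data("Tiger")
sol <- solve_POMDP(Tiger)

value_function(sol)       # alpha vectors of the solution
plot_value_function(sol)  # plot the value function over the belief space
```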
pomdp 1.1.0 (01/23/2023)
New Features
- Added C++ (Rcpp) support. Speed-up for simulate_POMDP, sample_belief_space, reward, ...
- simulate_POMDP and sample_belief_space now have parallel (foreach) support (see the sketch after this list).
- Sparse matrices from package Matrix are now used for matrices with a density below 50%.
- Added support to parse matrices for POMDP files.
- Added model normalization.
- is_solved_POMDP(), is_converged_POMDP(), is_timedependent_POMDP(), and is_solved_MDP() are now exported.
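A minimal sketch of the parallel simulation support; registering a doParallel backend is one option, and any foreach backend should work:

```r
library(pomdp)
library(doParallel)

# register a foreach backend; simulate_POMDP() then runs episodes in parallel
registerDoParallel(cores = 2)

data("Tiger")
sol <- solve_POMDP(Tiger)
simulate_POMDP(sol, n = 1000)
```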
Changes
- Accessors are now called transition_val() and observation_val().
- simulate_POMDP() and simulate_MDP() now return a list.
- reimplemented round_stochastic() to improve speed.
- MDP policy now uses factors for actions.
- estimate_belief_for_nodes() now can also use trajectories to estimate beliefs faster.
- cleaned up the interface for episodes and epochs.
pomdp 1.0.2
- policy_graph() can now produce policy trees for finite-horizon problems, and the initial belief can be specified.
- simulate_POMDP(): fixed bug with not using horizon.
- reward() and reward_node_action() have now been separated.
- sample_belief_space() gained method 'trajectories' (see the sketch after this list).
- simulate_POMDP(): now supports epsilon-greedy policies.
- added x_prob() and x_val() functions to access individual parts of the matrices.
- Fixed the converged finite-horizon case. It now only returns the converged graph/alpha.
- We now use NA internally to represent * in the POMDP definition.
- actions, states, and observations are now factors in most places.
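A minimal sketch of the new sampling method; the argument defaults are assumptions:

```r
library(pomdp)

data("Tiger")
# sample belief points along simulated trajectories instead of
# sampling the belief simplex directly
sample_belief_space(Tiger, n = 10, method = "trajectories")
```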
pomdp 1.0.0
- POMDP objects no longer have a list element model; they are the model list directly.
- moved pomdp-solve to package pomdpSolve.
- added solve_MDP().
- Transition probabilities, observation probabilities, and rewards can now be specified as a function (see the sketch after this list).
- transition_matrix() et al. can now also return a function.
- Improved POMDP file writer.
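A minimal sketch of specifying a model with functions; the transition function arguments (action, start.state, end.state) follow the package documentation, while the reward function signature and the toy model are assumptions for illustration:

```r
library(pomdp)

# a toy 2-state, 2-action MDP given with a transition function
m <- MDP(
  states  = c("s1", "s2"),
  actions = c("stay", "go"),
  transition_prob = function(action, start.state, end.state)
    if (action == "stay") as.numeric(start.state == end.state)
    else                  as.numeric(start.state != end.state),
  # reward as a function (signature assumed)
  reward = function(action, start.state, end.state)
    ifelse(action == "go", 1, 0),
  discount = 0.9
)

# accessors can expand the functions into matrices
transition_matrix(m)
```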
pomdp 0.99.3
- Moved Ternary and visNetwork to Suggests.
- Removed a clang warning for the lex scanners.
pomdp 0.99.2 (05/14/2021)
Bugfix
- Removed nonportable flag -C from Makefile.
pomdp 0.99.1 (05/13/2021)
New Features
- Added a wrapper for the sarsop library.
Changes
- Improved error messages when accessing fields not parsed by read_POMDP.
- policy() no longer returns the graph, but just alphas and the optimal action.
- The maintainer is now mhahsler.
Bugfix
- Resolved issues with factors for R 4.0. We now mostly use character instead of factors.
- States and actions as numbers are now handled correctly (reported by meeheal).
- Added spelling fixes by brianrice2.
- Fixed buffer overflow for filename parameters in pomdpsolve.
pomdp 0.99.0 (05/04/2020)
Changes from pomdp_0.9.2
- Support finite-horizon POMDPs and store epochs.
- reward now looks at different epochs, calculates the optimal action, and has improved parameter names.
- solve_POMDP now checks for convergence.
- solve_POMDP gained parameter terminal_values.
- solve_POMDP gained parameter discount to overwrite the discount rate specified in the model.
- solve_POMDP can now solve POMDPs with time-dependent transition probabilities, observation probabilities and reward structure.
- solve_POMDP gained parameter grid in the parameter list to specify a custom belief point grid for the grid method.
- write_POMDP and solve_POMDP gained parameter digits.
- added read_POMDP to read POMDP files.
- plot for POMDP is now replaced by plot_policy_graph.
- added policy graph visualization with visNetwork.
- added plot_value_function.
- added function sample_belief_space to sample from the belief space.
- added function plot_belief_space.
- added function transition_matrix.
- added function observation_matrix.
- added function reward_matrix.
- POMDP model now also contains horizon and terminal_values.
- added MDP formulated as a POMDP.
- added the policy function to extract a more readable policy.
- added update_belief (see the sketch after this list).
- added simulate_POMDP.
- added round_stochastic.
- added optimal_action.
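A minimal sketch of a belief update with the new update_belief(), using the standard Tiger labels:

```r
library(pomdp)

data("Tiger")
# Bayesian belief update after listening and hearing the tiger on the left
update_belief(Tiger, belief = c(0.5, 0.5),
              action = "listen", observation = "tiger-left")
```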