
Releases: mhahsler/pomdp

pomdp_1.2.4


pomdp 1.2.4 (12/04/2024)

New Features

  • Added the DynaMaze MDP dataset.
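
    As a quick illustration, a minimal sketch of using the new dataset (assuming it ships as a
    standard R dataset and that solve_MDP() accepts it directly; the method name shown is an
    assumption):

    library(pomdp)

    # load the new DynaMaze MDP dataset
    data("DynaMaze")

    # solve it with one of the MDP solvers and inspect the resulting policy
    # (the method name "value_iteration" is an assumption)
    sol <- solve_MDP(DynaMaze, method = "value_iteration")
    policy(sol)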

Bugfixes

  • gridworld_maze_MDP(): the start state is now recorded in info.
  • policy_graph(): now uses the complete parameter name.

pomdp 1.2.3 (05/04/2024)

Bugfixes

  • Fixed possible memory violation in observation_matrix() and
    transition_matrix().

pomdp_1.2.0


New Features

  • Added functions to work with MDP policies (see ? MDP_policy_functions).
  • Added MDP solver functions: Q-learning, Sarsa, and expected Sarsa (see the sketch after this list).
  • simulate_MDP() and simulate_POMDP() gained parameter return_trajectories.
  • New functions absorbing_states() and reachable_states() for MDPs and POMDPs.
  • Support for gridworlds (see ? gridworld).
  • New datasets: Cliff_walking, Windy_gridworld, and RussianTiger.
  • plot_transition_graph() now hides unavailable actions.
  • Added actions() to find available actions (unavailable actions have a reward
    of -Inf).
  • Added make_partially_observable() and make_fully_observable() to convert
    between MDPs and POMDPs.
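
    A sketch of the new MDP tooling on one of the new gridworld datasets. The method string
    "q_learning", the state label and argument names used with actions(), and the returned
    trajectory component are assumptions:

    library(pomdp)

    # Cliff_walking is one of the new gridworld datasets
    data("Cliff_walking")

    # actions available in a state (actions with a reward of -Inf are unavailable);
    # the state label format is an assumption
    actions(Cliff_walking, state = "s(1,1)")

    # solve with the new Q-learning solver (the method string is an assumption)
    sol <- solve_MDP(Cliff_walking, method = "q_learning")
    policy(sol)

    # simulate the learned policy and keep the trajectories
    sim <- simulate_MDP(sol, n = 10, return_trajectories = TRUE)
    head(sim$trajectories)   # the component name is an assumption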

Changes

  • simulate_POMDP(): Better calculation of T for infinite-horizon problems.
  • Several functions are now generics with methods for POMDP and MDP.
  • policy() lost the parameters alpha and action.
  • policy() and value_function() gained the parameter drop.
  • regret(): renamed parameter belief to start. Regret is now available for MDPs.
  • simulate_MDP() now stops at absorbing states.
  • simulate_MDP_cpp() now works with the sparse model representation.
  • POMDP and MDP gained a field for additional info.
  • approx_MDP_policy_evaluation() is now called MDP_policy_evaluation() and gained
    parameter theta as an additional stopping criterion.
  • Rewrote the accessor code for reward_matrix(), transition_matrix(), and
    observation_matrix() for better and faster access.
  • normalize() gained parameters for more detailed normalization.
  • POMDP() and MDP() lost the parameter normalize.
  • model.h now has support for keywords in transition_prob and observation_prob.
  • MDP2POMDP is now make_partially_observable() (see the sketch below).
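
    A minimal sketch of the renamed conversion helpers. The use of the Cliff_walking data is just
    for illustration, and the assumption that the conversion makes the state directly observable
    is noted in the comments:

    library(pomdp)
    data("Cliff_walking")

    # turn the MDP into an equivalent POMDP (assumed: the state stays directly
    # observable via the added observation model)
    pomdp_model <- make_partially_observable(Cliff_walking)

    # and convert a POMDP back into a fully observable MDP
    mdp_model <- make_fully_observable(pomdp_model)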

Bugfixes

  • q_values_MDP(), solve_MDP(): Fixed reward representation issue.
  • reward_val_cpp(): fixed observation matching bug.

pomdp_1.1.1


pomdp 1.1.1 (09/04/2023)

Changes

  • plot_policy_graph(): The parameter order has slightly changed; belief_col is now called state_col;
    unreachable states are now suppressed.
  • policy() gained parameters alpha and action.
  • Color palettes are now exported.
  • POMDP accessors gained the parameter drop.
  • The POMDP() constructor and read_POMDP() gained the parameter normalize and, by default,
    normalize the POMDP definition.

New Features

  • Large POMDP descriptions are now handled better by keeping the reward as a data.frame and
    supporting sparse matrices in the C++ code.
  • New function value_function() to access alpha vectors.
  • New function regret() to calculate the regret of a policy (see the sketch after this list).
  • New function transition_graph() to visualize the transition model.
  • Problem descriptions are now normalized by default.
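
    A sketch of the new functions on a solved model. The Tiger example model, the benchmark
    argument name in regret(), and the short-horizon comparison are assumptions:

    library(pomdp)
    data("Tiger")                      # assumed example POMDP
    sol <- solve_POMDP(Tiger)

    # alpha vectors of the converged solution
    value_function(sol)

    # regret of a cheaper, short-horizon solution relative to the benchmark
    sol3 <- solve_POMDP(Tiger, horizon = 3)
    regret(sol3, benchmark = sol)

    # visualize the transition model
    transition_graph(sol)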

pomdp 1.1.0 (01/23/2023)

New Features

  • Added C++ (Rcpp) support: speed-ups for simulate_POMDP(), sample_belief_space(), reward(), ...
  • simulate_POMDP() and sample_belief_space() now have parallel (foreach) support (see the sketch after this list).
  • Sparse matrices from package Matrix are now used for matrices with a density below 50%.
  • Added support for parsing matrices in POMDP files.
  • Added model normalization.
  • is_solved_POMDP(), is_converged_POMDP(), is_timedependent_POMDP(), and is_solved_MDP() are now exported.
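
    A sketch of the parallel simulation support. Registering any foreach backend should suffice
    (doParallel is used here); the Tiger example model and the argument name n are assumptions:

    library(pomdp)
    library(doParallel)

    # register a foreach backend; simulate_POMDP() then distributes the runs
    registerDoParallel(cores = 2)

    data("Tiger")                      # assumed example POMDP
    sol <- solve_POMDP(Tiger)

    # run many simulations in parallel (the argument name n is an assumption)
    sim <- simulate_POMDP(sol, n = 1000)
    str(sim)                           # now returns a list (see Changes below)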

Changes

  • Accessors are now called transition_val() and observation_val().
  • simulate_POMDP() and simulate_MDP() now return a list.
  • Reimplemented round_stochastic() to improve speed.
  • MDP policies now use factors for actions.
  • estimate_belief_for_nodes() can now also use trajectories to estimate beliefs faster.
  • Cleaned up the interface for episodes and epochs.

pomdp_1.0.2

  • policy_graph() can now produce policy trees for finite-horizon problems, and the initial belief can be specified.
  • simulate_POMDP(): fixed a bug where the horizon was not used.
  • reward() and reward_node_action() have now been separated.
  • sample_belief_space() gained method 'trajectories' (see the sketch after this list).
  • simulate_POMDP(): now supports epsilon-greedy policies.
  • Added x_prob() and x_val() functions to access individual parts of the matrices.
  • Fixed the converged finite-horizon case. It now only returns the converged graph/alpha.
  • We now use NA internally to represent * in the POMDP definition.
  • Actions, states, and observations are now factors in most places.
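
    A sketch of the new 'trajectories' sampling method mentioned above. The Tiger example model
    and the sample-size argument name n are assumptions:

    library(pomdp)
    data("Tiger")                      # assumed example POMDP
    sol <- solve_POMDP(Tiger)

    # sample belief points by following simulated trajectories
    samp <- sample_belief_space(sol, n = 100, method = "trajectories")
    head(samp)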

pomdp_1.0.0

  • POMDP objects no longer have a list element model; they now are the model list directly.
  • Moved pomdp-solve to package pomdpSolve.
  • Added solve_MDP().
  • Transition probabilities, observation probabilities, and rewards can now
    be specified as a function (see the sketch after this list).
  • transition_matrix() et al. can now also return a function.
  • Improved POMDP file writer.
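
    A sketch of specifying the transition model and reward as functions. The constructor arguments
    shown and the argument names of the probability and reward functions (action, start.state,
    end.state) are assumptions:

    library(pomdp)

    # a tiny two-state MDP defined with functions instead of matrices
    m <- MDP(
      states  = c("s1", "s2"),
      actions = c("stay", "move"),
      transition_prob = function(action, start.state, end.state) {
        if (action == "stay") as.numeric(start.state == end.state)
        else                  as.numeric(start.state != end.state)
      },
      reward = function(action, start.state, end.state) {
        if (action == "move") -1 else 0
      },
      discount = 0.9
    )

    # accessors can expand the functions into matrices
    transition_matrix(m)
    reward_matrix(m)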

pomdp_0.99.3

  • Moved Ternary and visNetwork to Suggests.
  • Removed a clang warning for the lex scanners.

pomdp_0.99.2


pomdp 0.99.2 (05/14/2021)

Bugfix

  • Removed nonportable flag -C from Makefile.

pomdp 0.99.1 (05/13/2021)

New Features

  • Added a wrapper for the sarsop library.

Changes

  • Improved error messages when accessing fields not parsed by read_POMDP.
  • policy() no longer returns the graph, but just alphas and the optimal action.
  • The maintainer is now mhahsler.

Bugfixes

  • Resolved issues with factors for R 4.0. We now mostly use character instead of factors.
  • States and actions as numbers are now handled correctly (reported by meeheal).
  • Added spelling fixes by brianrice2.
  • Fixed buffer overflow for filename parameters in pomdpsolve.

pomdp_0.99.0


pomdp 0.99.0 (05/04/2020)

Changes from pomdp_0.9.2

  • Support finite-horizon POMDPs and store epochs.
  • reward() now looks at different epochs and calculates the optimal actions; parameter names are improved.
  • solve_POMDP() now checks for convergence.
  • solve_POMDP gained parameter terminal_values.
  • solve_POMDP gained parameter discount to overwrite the discount rate specified in the model.
  • solve_POMDP can now solve POMDPs with time-dependent transition probabilities, observation probabilities and reward structure.
  • solve_POMDP gained parameter grid in the parameter list to specify a custom belief point grid for the grid method.
  • write_POMDP and solve_POMDP gained parameter digits.
  • Added read_POMDP() to read POMDP files.
  • plot() for POMDP is now replaced by plot_policy_graph().
  • Added policy graph visualization with visNetwork.
  • Added plot_value_function().
  • Added function sample_belief_space() to sample from the belief space.
  • Added function plot_belief_space().
  • Added function transition_matrix().
  • Added function observation_matrix().
  • Added function reward_matrix().
  • The POMDP model now also contains horizon and terminal_values.
  • Added MDP formulated as a POMDP.
  • Added a policy() function to extract a more readable policy.
  • Added update_belief() (see the sketch after this list).
  • Added simulate_POMDP().
  • Added round_stochastic().
  • Added optimal_action().
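
    A sketch tying several of the functions introduced in this release together. The Tiger example
    model, the argument names of update_belief() and optimal_action(), and the action and
    observation labels are assumptions:

    library(pomdp)
    data("Tiger")                      # assumed example POMDP
    sol <- solve_POMDP(Tiger)

    # inspect the solution
    policy(sol)
    plot_value_function(sol)
    plot_policy_graph(sol)

    # track a belief after acting and observing, then look up the best action
    # (argument names and labels are assumptions)
    b <- update_belief(sol, belief = c(0.5, 0.5),
                       action = "listen", observation = "tiger-left")
    optimal_action(sol, belief = b)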