Skip to content

Latest commit

 

History

History
82 lines (67 loc) · 2.93 KB

README.md

File metadata and controls

82 lines (67 loc) · 2.93 KB

QuickPOMDPs

Build Status Docs - Stable Docs - Dev Coverage Status codecov.io

Simplified interfaces for specifying POMDPs.jl models.

The package contains two interfaces - the Quick interface, and the Discrete Explicit interface.

Please see the documentation for more information on each.

The package can also be used from Python via pyjulia. See examples/tiger.py for an example.

Quick Interface

The Quick Interface exposes nearly all of the features of POMDPs.jl as constructor keyword arguments. Documentation, Mountain Car Example:

mountaincar = QuickMDP(
    function (s, a, rng)        
        x, v = s
        vp = clamp(v + a*0.001 + cos(3*x)*-0.0025, -0.07, 0.07)
        xp = x + vp
        if xp > 0.5
            r = 100.0
        else
            r = -1.0
        end
        return (sp=(xp, vp), r=r)
    end,
    actions = [-1., 0., 1.],
    initialstate = (-0.5, 0.0),
    discount = 0.95,
    isterminal = s -> s[1] > 0.5
)

Discrete Explicit Interface

The Discrete Explicit Interface is suitable for problems with small discrete state, action, and observation spaces. This interface is pedagogically useful because each element of the (S, A, O, R, T, Z, γ) tuple for a POMDP and (S, A, R, T, γ) tuple for an MDP is defined explicitly in a straightforward manner. Documentation, Tiger POMDP Example:

S = [:left, :right]
A = [:left, :right, :listen]
O = [:left, :right]
γ = 0.95

function T(s, a, sp)
    if a == :listen
        return s == sp
    else # a door is opened
        return 0.5 #reset
    end
end

function Z(a, sp, o)
    if a == :listen
        if o == sp
            return 0.85
        else
            return 0.15
        end
    else
        return 0.5
    end
end

function R(s, a)
    if a == :listen  
        return -1.0
    elseif s == a # the tiger was found
        return -100.0
    else # the tiger was escaped
        return 10.0
    end
end

m = DiscreteExplicitPOMDP(S,A,O,T,Z,R,γ)