Releases: markkho/msdm
v0.11 Release
v0.10 Release
Summary of changes/additions:
- Implemented a `Table` class that provides a dict- and numpy-like interface with a numpy array backend
- `MarkovDecisionProcess` and `PartiallyObservableMDP` algorithms return `Results` objects with attributes in the form of `Table`s (e.g., `state_value`, `action_value`, `policy`); note that this is a breaking change
- For all MDPs and derived problem classes, `is_terminal` has been changed to `is_absorbing`
- `FunctionalPolicy` and `TabularPolicy` classes introduced
- `PolicyIteration`, `ValueIteration`, and `MultichainPolicyIteration` have been (re-)implemented
- Tests have been streamlined
- Organization of core modules has been streamlined
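The headline change here is a table type that can be indexed like a dict but stores its data in a numpy array. The following is a minimal illustrative sketch of that idea, not msdm's actual `Table` implementation (the `KeyedTable` name and its methods are invented for this example):

```python
import numpy as np

class KeyedTable:
    """Sketch of a dict-like table backed by a numpy array.
    Illustrative only; not msdm's actual `Table` class."""
    def __init__(self, row_keys, col_keys, values):
        self._rows = {k: i for i, k in enumerate(row_keys)}
        self._cols = {k: i for i, k in enumerate(col_keys)}
        self._arr = np.asarray(values, dtype=float)

    def __getitem__(self, key):
        # Dict-like access: table[state, action]
        row, col = key
        return self._arr[self._rows[row], self._cols[col]]

    def as_array(self):
        # Expose the raw numpy backend for vectorized operations
        return self._arr

tbl = KeyedTable(["s0", "s1"], ["a0", "a1"], [[0.0, 1.0], [2.0, 3.0]])
print(tbl["s1", "a0"])       # keyed, dict-like access
print(tbl.as_array().max())  # numpy-style access on the backend
```

The appeal of this design is that algorithm results (value functions, policies) stay addressable by meaningful state/action keys while remaining cheap to use in vectorized numpy code.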
v0.9 Release
Summary of changes/additions:
- RMAX implementation
- Fix TD Learning bug
- Fix `TabularMDP.reachable_states`
- New tests
v0.8 Release
Summary of changes/additions:
- `LAOStar` error handling
- New `DictDistribution` methods
- New `condition`, `chain`, and `is_normalized` methods in `FiniteDistribution`
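To illustrate what these three finite-distribution operations do, here is a self-contained sketch over a plain `{outcome: probability}` dict. The `DictDist` class and its exact signatures are invented for this example and are not msdm's implementation; only the method names come from the release notes:

```python
class DictDist:
    """Sketch of a finite distribution as {outcome: probability}.
    Method names mirror the release notes; the code is illustrative."""
    def __init__(self, probs):
        self.probs = dict(probs)

    def is_normalized(self, tol=1e-9):
        # True if probabilities sum to 1 (within tolerance)
        return abs(sum(self.probs.values()) - 1.0) < tol

    def condition(self, predicate):
        # Keep outcomes satisfying the predicate, then renormalize
        kept = {x: p for x, p in self.probs.items() if predicate(x)}
        z = sum(kept.values())
        return DictDist({x: p / z for x, p in kept.items()})

    def chain(self, f):
        # Monadic bind: f maps an outcome to a new DictDist;
        # the intermediate outcome is marginalized out
        out = {}
        for x, p in self.probs.items():
            for y, q in f(x).probs.items():
                out[y] = out.get(y, 0.0) + p * q
        return DictDist(out)

d = DictDist({1: 0.5, 2: 0.25, 3: 0.25})
evens = d.condition(lambda x: x % 2 == 0)   # all mass on 2
```

`chain` is the workhorse for composing stochastic steps (e.g., pushing a state distribution through a stochastic transition function), since it keeps everything in normalized dict form.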
v0.7 Release
Summary of changes/additions:
- POMDP solvers:
  - `FSCBoundedPolicyIteration` (new)
  - `FSCGradientAscent` (minor changes)
- Planning algorithms:
  - Major refactor of `LAOStar` to support event listener pattern (note interface changes)
  - Minor refactor of `LRTDP` to support event listener pattern
- Core classes:
  - Fix to `TabularPolicy.from_q_matrices` calculation of softmax distribution
  - Minor changes to core POMDP implementation
- New domains:
  - `GridMDP` base class and plotting tools
  - `WindyGridWorldMDP`
- Clean up
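The softmax fix mentioned above concerns turning a Q-matrix into a stochastic policy. A standard pitfall in that calculation is numerical overflow in `exp`; the usual remedy, sketched below, is to subtract the row-wise maximum before exponentiating (this is a generic numerically stable softmax, not msdm's actual `from_q_matrices` code):

```python
import numpy as np

def softmax_policy(q, inv_temp=1.0):
    """Numerically stable softmax over the action axis of a |S| x |A|
    Q-matrix. Subtracting the per-state max leaves the result
    unchanged but prevents overflow in exp for large Q-values."""
    z = inv_temp * np.asarray(q, dtype=float)
    z = z - z.max(axis=-1, keepdims=True)
    expz = np.exp(z)
    return expz / expz.sum(axis=-1, keepdims=True)

# Large Q-values would overflow a naive exp(q) implementation
q = np.array([[1000.0, 1001.0], [0.0, 0.0]])
pi = softmax_policy(q)
```

Each row of `pi` is a proper action distribution: non-negative and summing to one, with ties in Q producing uniform probabilities.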
v0.6
v0.5 Release
This release mainly includes interfaces, algorithms, and test domains for tabular partially observable markov decision processes (POMDPs).
Summary of changes:
- Core POMDP classes:
  - `PartiallyObservableMDP`
  - `TabularPOMDP`
  - `BeliefMDP`
  - `POMDPPolicy`
  - `ValueBasedTabularPOMDPPolicy`
  - `AlphaVectorPolicy`
  - `FiniteStateController`
  - `StochasticFiniteStateController`
- Domains:
  - `HeavenOrHell`
  - `LoadUnload`
  - `Tiger`
- Algorithms:
  - `PointBasedValueIteration`
  - `QMDP`
  - `FSCGradientAscent`
- JuliaPOMDPs wrapper
- Fixes to Policy Iteration and Value Iteration
- Updated README.md
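The conceptual core behind a class like `BeliefMDP` is the Bayesian belief update that converts a POMDP into an MDP over belief states. The following is the textbook discrete belief update (a generic sketch; the function signature and the dict-of-probabilities conventions are chosen for this example, not taken from msdm):

```python
def belief_update(belief, action, obs, T, O):
    """Textbook discrete POMDP belief update (Bayes filter).
    belief: {state: prob}; T(s, a) -> {s': prob}; O(a, s') -> {obs: prob}.
    Returns the posterior belief after taking `action` and seeing `obs`."""
    posterior = {}
    for s, b in belief.items():
        for s2, p in T(s, action).items():
            w = b * p * O(action, s2).get(obs, 0.0)
            if w > 0:
                posterior[s2] = posterior.get(s2, 0.0) + w
    z = sum(posterior.values())  # = probability of observing `obs`
    return {s2: w / z for s2, w in posterior.items()}

# Tiger-style usage: listening leaves the hidden state unchanged and
# yields a noisy observation of the tiger's location
T = lambda s, a: {s: 1.0}
O = lambda a, s2: ({"hear-left": 0.85, "hear-right": 0.15}
                   if s2 == "tiger-left" else
                   {"hear-left": 0.15, "hear-right": 0.85})
b = belief_update({"tiger-left": 0.5, "tiger-right": 0.5},
                  "listen", "hear-left", T, O)
```

With an 0.85-accurate observation model and a uniform prior, hearing the tiger on the left shifts the posterior to 0.85 on `tiger-left`.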
v0.4 Release
New Features
- QLearning, SARSA, Expected SARSA, DoubleQLearning
- Policy Iteration
- Entropy Regularized Policy Iteration
- Works with python 3.9
- QuickMDP and QuickTabularMDP constructors
- Construction of TabularMDPs from matrices
- New domains: CliffWalking, GridMDP generic class, Russell & Norvig gridworld example
- Gridworld plotting of action values
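For orientation, here is the textbook tabular Q-learning loop that algorithms like those above are built on. This is a generic sketch, not msdm's implementation; the `env` interface (`reset`, `actions`, `step`) is invented for this example:

```python
import random

def q_learning(env, episodes=300, alpha=0.1, gamma=0.95, eps=0.3, seed=0):
    """Textbook tabular Q-learning (sketch; not msdm's implementation).
    `env` must provide: reset() -> s, actions(s) -> list of actions,
    step(s, a) -> (next_state, reward, done)."""
    rng = random.Random(seed)
    q = {}
    def Q(s, a):
        return q.get((s, a), 0.0)
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            acts = env.actions(s)
            if rng.random() < eps:
                a = rng.choice(acts)                   # explore
            else:
                a = max(acts, key=lambda a_: Q(s, a_)) # exploit
            s2, r, done = env.step(s, a)
            # TD target: bootstrap from the best next action unless terminal
            target = r if done else r + gamma * max(Q(s2, a2) for a2 in env.actions(s2))
            q[(s, a)] = Q(s, a) + alpha * (target - Q(s, a))
            s = s2
    return q
```

SARSA and Expected SARSA differ only in the TD target (the taken next action, or the policy-expected Q-value, instead of the max), and Double Q-learning maintains two Q-tables to reduce the maximization bias.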
Refactoring of core
Major overhaul of core and tabular methods:
- States/actions are assumed to be hashable (e.g., Gridworld now uses frozendict; no built-in hashing functions; dictionaries are the main way to create maps)
- The distribution classes have been streamlined (Multinomial has been removed and DictDistribution is the main way to represent categorical distributions; .sample() takes a random number generator)
- Policy classes have been simplified
- More thorough type hints
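Two of the conventions above, hashable states and sampling through an explicit random number generator, can be sketched in plain Python. The `GridState` class and `sample` helper below are invented for illustration (msdm uses `frozendict` rather than frozen dataclasses, and its distribution classes have their own `sample` methods):

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class GridState:
    # frozen=True makes instances immutable and hashable,
    # so states can key dictionaries (value tables, policies, ...)
    x: int
    y: int

def sample(dist, rng):
    """Sample from a {outcome: probability} dict using an explicit
    generator, so results are reproducible when the caller seeds it."""
    outcomes, probs = zip(*dist.items())
    return rng.choices(outcomes, weights=probs, k=1)[0]

dist = {GridState(0, 0): 0.25, GridState(0, 1): 0.75}
s = sample(dist, random.Random(42))  # deterministic given the seed
```

Passing the generator in, rather than relying on global random state, is what makes experiments repeatable end to end.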
Minor additions to algorithms
v0.2 Add makefile