Releases: markkho/msdm
v0.11 Release
v0.10 Release
Summary of changes/additions:
- Implemented a `Table` class that provides a dict- and numpy-like interface with a numpy array backend
- `MarkovDecisionProcess` and `PartiallyObservableMDP` algorithms return `Results` objects with attributes in the form of `Table`s (e.g., `state_value`, `action_value`, `policy`); note that this is a breaking change
- For all MDPs and derived problem classes, `is_terminal` has been changed to `is_absorbing`
- `FunctionalPolicy` and `TabularPolicy` classes introduced
- `PolicyIteration`, `ValueIteration`, and `MultichainPolicyIteration` have been (re-)implemented
- Tests have been streamlined
- Organization of core modules has been streamlined
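The headline change here is a table type that can be indexed like a dict but stores its data in a numpy array. The following is a minimal illustrative sketch of that idea, not msdm's actual `Table` implementation (the `KeyedTable` name and its methods are invented for this example):

```python
import numpy as np

class KeyedTable:
    """Sketch of a dict-like table backed by a numpy array.
    Illustrative only; not msdm's actual `Table` class."""
    def __init__(self, row_keys, col_keys, values):
        self._rows = {k: i for i, k in enumerate(row_keys)}
        self._cols = {k: i for i, k in enumerate(col_keys)}
        self._arr = np.asarray(values, dtype=float)

    def __getitem__(self, key):
        # Dict-like access: table[state, action]
        row, col = key
        return self._arr[self._rows[row], self._cols[col]]

    def as_array(self):
        # Expose the raw numpy backend for vectorized operations
        return self._arr

tbl = KeyedTable(["s0", "s1"], ["a0", "a1"], [[0.0, 1.0], [2.0, 3.0]])
print(tbl["s1", "a0"])       # keyed, dict-like access
print(tbl.as_array().max())  # numpy-style access on the backend
```

The appeal of this design is that algorithm results (value functions, policies) stay addressable by meaningful state/action keys while remaining cheap to use in vectorized numpy code.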
v0.9 Release
Summary of changes/additions:
- RMAX implementation
- Fix TD Learning bug
- Fix `TabularMDP.reachable_states`
- New tests
v0.8 Release
Summary of changes/additions:
- `LAOStar` error handling
- New `DictDistribution` methods
- New `condition`, `chain`, and `is_normalized` methods in `FiniteDistribution`
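To illustrate what these three finite-distribution operations do, here is a self-contained sketch over a plain `{outcome: probability}` dict. The `DictDist` class and its exact signatures are invented for this example and are not msdm's implementation; only the method names come from the release notes:

```python
class DictDist:
    """Sketch of a finite distribution as {outcome: probability}.
    Method names mirror the release notes; the code is illustrative."""
    def __init__(self, probs):
        self.probs = dict(probs)

    def is_normalized(self, tol=1e-9):
        # True if probabilities sum to 1 (within tolerance)
        return abs(sum(self.probs.values()) - 1.0) < tol

    def condition(self, predicate):
        # Keep outcomes satisfying the predicate, then renormalize
        kept = {x: p for x, p in self.probs.items() if predicate(x)}
        z = sum(kept.values())
        return DictDist({x: p / z for x, p in kept.items()})

    def chain(self, f):
        # Monadic bind: f maps an outcome to a new DictDist;
        # the intermediate outcome is marginalized out
        out = {}
        for x, p in self.probs.items():
            for y, q in f(x).probs.items():
                out[y] = out.get(y, 0.0) + p * q
        return DictDist(out)

d = DictDist({1: 0.5, 2: 0.25, 3: 0.25})
evens = d.condition(lambda x: x % 2 == 0)   # all mass on 2
```

`chain` is the workhorse for composing stochastic steps (e.g., pushing a state distribution through a stochastic transition function), since it keeps everything in normalized dict form.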
v0.7 Release
Summary of changes/additions:
- POMDP solvers:
  - `FSCBoundedPolicyIteration` (new)
  - `FSCGradientAscent` (minor changes)
- Planning algorithms:
  - Major refactor of `LAOStar` to support event listener pattern (note interface changes)
  - Minor refactor of `LRTDP` to support event listener pattern
- Core classes:
  - Fix to `TabularPolicy.from_q_matrices` calculation of softmax distribution
  - Minor changes to core POMDP implementation
- New domains:
  - `GridMDP` base class and plotting tools
  - `WindyGridWorldMDP`
- Clean up
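The softmax fix mentioned above concerns turning a Q-matrix into a stochastic policy. A standard pitfall in that calculation is numerical overflow in `exp`; the usual remedy, sketched below, is to subtract the row-wise maximum before exponentiating (this is a generic numerically stable softmax, not msdm's actual `from_q_matrices` code):

```python
import numpy as np

def softmax_policy(q, inv_temp=1.0):
    """Numerically stable softmax over the action axis of a |S| x |A|
    Q-matrix. Subtracting the per-state max leaves the result
    unchanged but prevents overflow in exp for large Q-values."""
    z = inv_temp * np.asarray(q, dtype=float)
    z = z - z.max(axis=-1, keepdims=True)
    expz = np.exp(z)
    return expz / expz.sum(axis=-1, keepdims=True)

# Large Q-values would overflow a naive exp(q) implementation
q = np.array([[1000.0, 1001.0], [0.0, 0.0]])
pi = softmax_policy(q)
```

Each row of `pi` is a proper action distribution: non-negative and summing to one, with ties in Q producing uniform probabilities.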
v0.6
v0.5 Release
This release mainly includes interfaces, algorithms, and test domains for tabular partially observable markov decision processes (POMDPs).
Summary of changes:
- Core POMDP classes:
  - `PartiallyObservableMDP`
  - `TabularPOMDP`
  - `BeliefMDP`
  - `POMDPPolicy`
  - `ValueBasedTabularPOMDPPolicy`
  - `AlphaVectorPolicy`
  - `FiniteStateController`
  - `StochasticFiniteStateController`
- Domains:
  - `HeavenOrHell`
  - `LoadUnload`
  - `Tiger`
- Algorithms:
  - `PointBasedValueIteration`
  - `QMDP`
  - `FSCGradientAscent`
- JuliaPOMDPs wrapper
- Fixes to Policy Iteration and Value Iteration
- Updated README.md
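The conceptual core behind a class like `BeliefMDP` is the Bayesian belief update that converts a POMDP into an MDP over belief states. The following is the textbook discrete belief update (a generic sketch; the function signature and the dict-of-probabilities conventions are chosen for this example, not taken from msdm):

```python
def belief_update(belief, action, obs, T, O):
    """Textbook discrete POMDP belief update (Bayes filter).
    belief: {state: prob}; T(s, a) -> {s': prob}; O(a, s') -> {obs: prob}.
    Returns the posterior belief after taking `action` and seeing `obs`."""
    posterior = {}
    for s, b in belief.items():
        for s2, p in T(s, action).items():
            w = b * p * O(action, s2).get(obs, 0.0)
            if w > 0:
                posterior[s2] = posterior.get(s2, 0.0) + w
    z = sum(posterior.values())  # = probability of observing `obs`
    return {s2: w / z for s2, w in posterior.items()}

# Tiger-style usage: listening leaves the hidden state unchanged and
# yields a noisy observation of the tiger's location
T = lambda s, a: {s: 1.0}
O = lambda a, s2: ({"hear-left": 0.85, "hear-right": 0.15}
                   if s2 == "tiger-left" else
                   {"hear-left": 0.15, "hear-right": 0.85})
b = belief_update({"tiger-left": 0.5, "tiger-right": 0.5},
                  "listen", "hear-left", T, O)
```

With an 0.85-accurate observation model and a uniform prior, hearing the tiger on the left shifts the posterior to 0.85 on `tiger-left`.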
v0.4 Release
New Features
- QLearning, SARSA, Expected SARSA, DoubleQLearning
- Policy Iteration
- Entropy Regularized Policy Iteration
- Works with python 3.9
- QuickMDP and QuickTabularMDP constructors
- Construction of TabularMDPs from matrices
- New domains: CliffWalking, GridMDP generic class, Russell & Norvig gridworld example
- Gridworld plotting of action values
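For orientation, here is the textbook tabular Q-learning loop that algorithms like those above are built on. This is a generic sketch, not msdm's implementation; the `env` interface (`reset`, `actions`, `step`) is invented for this example:

```python
import random

def q_learning(env, episodes=300, alpha=0.1, gamma=0.95, eps=0.3, seed=0):
    """Textbook tabular Q-learning (sketch; not msdm's implementation).
    `env` must provide: reset() -> s, actions(s) -> list of actions,
    step(s, a) -> (next_state, reward, done)."""
    rng = random.Random(seed)
    q = {}
    def Q(s, a):
        return q.get((s, a), 0.0)
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            acts = env.actions(s)
            if rng.random() < eps:
                a = rng.choice(acts)                   # explore
            else:
                a = max(acts, key=lambda a_: Q(s, a_)) # exploit
            s2, r, done = env.step(s, a)
            # TD target: bootstrap from the best next action unless terminal
            target = r if done else r + gamma * max(Q(s2, a2) for a2 in env.actions(s2))
            q[(s, a)] = Q(s, a) + alpha * (target - Q(s, a))
            s = s2
    return q
```

SARSA and Expected SARSA differ only in the TD target (the taken next action, or the policy-expected Q-value, instead of the max), and Double Q-learning maintains two Q-tables to reduce the maximization bias.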
Refactoring of core
Major overhaul of core and tabular methods:
- States/actions are assumed to be hashable (e.g., Gridworld now uses frozendict; no built-in hashing functions; dictionaries are the main way to create maps)
- The distribution classes have been streamlined (Multinomial has been removed and DictDistribution is the main way to represent categorical distributions; .sample() takes a random number generator)
- Policy classes have been simplified
- More thorough type hints
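Two of the conventions above, hashable states and sampling through an explicit random number generator, can be sketched in plain Python. The `GridState` class and `sample` helper below are invented for illustration (msdm uses `frozendict` rather than frozen dataclasses, and its distribution classes have their own `sample` methods):

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class GridState:
    # frozen=True makes instances immutable and hashable,
    # so states can key dictionaries (value tables, policies, ...)
    x: int
    y: int

def sample(dist, rng):
    """Sample from a {outcome: probability} dict using an explicit
    generator, so results are reproducible when the caller seeds it."""
    outcomes, probs = zip(*dist.items())
    return rng.choices(outcomes, weights=probs, k=1)[0]

dist = {GridState(0, 0): 0.25, GridState(0, 1): 0.75}
s = sample(dist, random.Random(42))  # deterministic given the seed
```

Passing the generator in, rather than relying on global random state, is what makes experiments repeatable end to end.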
Minor additions to algorithms
v0.2 Add makefile