Organizing some thoughts on how to clean up tabular MDP code for a v1.0.0 release (@cgc)
- Streamline testing with a standard set of benchmark tasks and comparison of outputs for exact algorithms
- Create a modified policy iteration implementation that has value iteration and policy iteration as special cases
- Include a non-vectorized version of MPI/VI/PI for easier debugging
- Reorganize the `msdm/core` directory structure to be flatter
- Explicitly define `MDPPolicy` since these will be different from `POMDPPolicies`, etc.
- Streamline translation between different policy and state/action mapping representations (e.g., matrices vs. dictionaries), and HTML-table-like output for Jupyter notebooks
- Change mdp `run_on` to return a trajectory of `Step` objects defined by `namedtuple` (similar to what pomdp does now)
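
To make the modified policy iteration item concrete, here is a rough, non-vectorized sketch of how one function could cover value iteration and policy iteration as special cases. Everything in it is an assumption for illustration: the function name, the dict-based `T` and `R` layout, and the `eval_sweeps` parameter are hypothetical and not part of the current `msdm` API.

```python
def modified_policy_iteration(states, actions, T, R, gamma,
                              eval_sweeps, tol=1e-10, max_iters=10_000):
    """Hypothetical sketch, not the msdm implementation.

    T[s][a] is a dict {next_state: probability}; R[s][a] is the expected reward.
    eval_sweeps = 0    -> value iteration (greedy backup only)
    eval_sweeps = None -> policy iteration (iterative evaluation to convergence)
    eval_sweeps = k    -> modified policy iteration with k evaluation sweeps
    """
    def backup(s, a, V):
        # One-step lookahead value of taking action a in state s under V.
        return R[s][a] + gamma * sum(p * V[ns] for ns, p in T[s][a].items())

    V = {s: 0.0 for s in states}
    policy = {}
    for _ in range(max_iters):
        # Policy improvement: act greedily with respect to the current V.
        policy = {s: max(actions, key=lambda a: backup(s, a, V)) for s in states}
        V_new = {s: backup(s, policy[s], V) for s in states}

        # Partial policy evaluation: extra backups under the fixed greedy policy.
        sweeps = 0
        while eval_sweeps is None or sweeps < eval_sweeps:
            V_next = {s: backup(s, policy[s], V_new) for s in states}
            delta = max(abs(V_next[s] - V_new[s]) for s in states)
            V_new = V_next
            sweeps += 1
            if delta < tol:
                break

        # Stop once the value function has stopped changing.
        if max(abs(V_new[s] - V[s]) for s in states) < tol:
            return V_new, policy
        V = V_new
    return V, policy
```

Because every variant should converge to the same values on a small task, a sketch like this could also serve as a reference output when comparing exact algorithms in the benchmark tests.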