feat: add track completion detection and comprehensive environment guide #138
EldarAlvik wants to merge 2 commits into trackmania-rl:master
Add reached_finishline flag to info dict and create detailed documentation for using the TrackMania Gymnasium environment.

Changes:
- Add 'reached_finishline' boolean flag to info dict in all TM2020 interfaces
  - Available in TM20FULL, TM20LIDAR, and TM20LIDARPROGRESS environments
  - Allows users to reliably detect successful track completion
  - Backward compatible with existing code
- Create comprehensive environment guide (readme/tmrl_gym_enviroment.md)
  - Setup instructions and verification steps
  - Detailed observation structures for all environment types
  - Action space documentation (continuous vs discrete)
  - Usage examples and best practices
  - Troubleshooting section for ViGEm controller issues
  - Complete config.json example
- Include troubleshooting image (Nefarius2.png)
- Update README.md with link to new environment guide
- Fix reward computation to properly handle race completion
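As a rough illustration of how the new flag might be consumed, the sketch below runs one episode and reads 'reached_finishline' from the final info dict. FakeTrackEnv, run_episode, and the policy argument are hypothetical names invented for this example so that it runs without the game. Only the info key and the Gymnasium-style 5-tuple returned by step() are taken from this PR, and the sketch assumes the flag is present in the info dict of the last step.

```python
# Hypothetical stand-in for a TMRL Gymnasium environment, so the sketch runs
# without TrackMania. Only the 'reached_finishline' info key and the 5-tuple
# shape of step() come from this PR; everything else is illustrative.
class FakeTrackEnv:
    def __init__(self, finish_at_step=3):
        self.finish_at_step = finish_at_step
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return 0.0, {}

    def step(self, action):
        self.t += 1
        finished = self.t >= self.finish_at_step
        info = {"reached_finishline": finished}
        # observation, reward, terminated, truncated, info
        return float(self.t), 1.0, finished, False, info


def run_episode(env, policy, max_steps=1000):
    """Run one episode; return (total reward, whether the finish line was reached)."""
    obs, info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += reward
        if terminated or truncated:
            break
    # .get() keeps this working with interfaces that do not set the flag.
    return total_reward, bool(info.get("reached_finishline", False))


total, finished = run_episode(FakeTrackEnv(finish_at_step=3), policy=lambda obs: 0)
print(total, finished)
```

Reading the flag with .get() and a False default is one way to stay compatible with environments or older versions that do not provide the key.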
Nice, thanks for the contribution. I will try to look into it soon. I am unsure about what you are describing regarding the distinction between termination and truncation, though. The way I understand it, termination means an MDP terminal state (which includes going off-limits), whereas truncation means the episode was cut off in a non-terminal state, i.e., ended because of some non-observable constraint, typically a time limit. The point is that an episode should not be considered terminated in a truncated state: the value function should not be treated as 0 at that time, since the constraint is not part of the MDP. The off-limits constraint is clearly a terminal state: the entire point of this constraint is to set the value function to 0 when the car strays too far from the road.
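The distinction above can be made concrete with a one-step TD target. This is a minimal sketch, not code from tmrl: td_target and its parameters are hypothetical names, and the numbers are made up. It only shows that the next-state value is dropped on termination and kept on truncation.

```python
def td_target(reward, next_value, terminated, truncated=False, gamma=0.99):
    """One-step TD target: reward + gamma * V(next_state).

    Only `terminated` zeroes the bootstrap term. `truncated` is accepted
    to mirror the Gymnasium step() outputs but is deliberately ignored,
    because a time limit is not part of the MDP.
    """
    bootstrap = 0.0 if terminated else next_value
    return reward + gamma * bootstrap


# Off-limits (terminal state): the value of the next state is treated as 0.
off_track = td_target(reward=-1.0, next_value=5.0, terminated=True, gamma=0.9)

# Time limit (truncation): the episode stops, but the target still bootstraps.
timed_out = td_target(reward=1.0, next_value=5.0, terminated=False,
                      truncated=True, gamma=0.9)

print(off_track, timed_out)
```

With these made-up inputs the terminated target equals the reward alone, while the truncated target still includes the discounted next-state value.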
OK. I guess it could be argued either way, as the agent could continue acting. But after looking at how car_racing handles it, I agree with you. I updated the pull request, undid the change, and updated the readme.
Thank you for the project. I wanted to contribute some extra documentation that I put together while working with the gym environment through the get_environment method. I also added info extraction for whether the agent has reached the finish line, plus a tiny semantics change for compute reward.
Changes
1. Add reached_finishline flag to the info dict (tmrl/custom/tm/tm_gym_interfaces.py)
2. Create gymnasium environment documentation (readme/tmrl_gym_enviroment.md)
3. Link gymnasium environment guide in README (README.md)
4. Updated terminated/truncated semantics slightly for tmrl/custom/tm/utils/compute_reward.py
   - terminated → truncated for timeout/off-track failures
   - terminated = goal state, truncated = constraint violation

Testing
python -m tmrl --test