Open
Description
Sim wishlist for agentic evals and higher-order tests (navigation, person following, etc.)
- should run faster than real time (agent evals run hundreds of tests, with multiple executions per test case, since LLMs are stochastic)
- lightweight, with good visual fidelity (for VLMs)
- potentially no detailed physics simulation; a floating camera is enough for agentic evals
- an easy way to import different 3D models (.obj files?) and have them as collidable objects (I want to pick a random map, e.g. from https://sketchfab.com/tags/map, and use it for evals)
https://sketchfab.com/3d-models/lowpoly-fps-tdm-game-map-by-resoforge-d41a19f699ea421a9aa32b407cb7537b
https://sketchfab.com/3d-models/tdm-map-3-fps-game-environment-low-poly-4c645286d8554eb0b8cf2bee6642d710
https://sketchfab.com/3d-models/lowpoly-stylized-classroom-35762c5a787e40c8b1daae410c1429de
Environment collection: https://sketchfab.com/bonku/collections/enviroments-6356df7a6de54491bf61fb8f2c3000f9
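To make the "floating camera, collisions only" idea concrete, here is a minimal sketch, not a proposal for the actual implementation. It assumes axis-aligned boxes as stand-ins for imported collidable meshes and a sphere-shaped camera; `Box`, `FloatingCamera`, and `step` are hypothetical names. Because stepping is driven by a fixed `dt` and never sleeps, the loop runs as fast as the CPU allows rather than at wall-clock speed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box standing in for an imported collidable mesh (assumption)."""
    lo: tuple
    hi: tuple

    def intersects_sphere(self, center, radius):
        # Squared distance from the sphere center to the closest point on the box.
        d2 = 0.0
        for c, lo, hi in zip(center, self.lo, self.hi):
            nearest = min(max(c, lo), hi)
            d2 += (c - nearest) ** 2
        return d2 < radius ** 2


@dataclass
class FloatingCamera:
    """Camera with no dynamics: just a position and a collision radius."""
    position: tuple
    radius: float = 0.2


def step(camera, velocity, dt, obstacles):
    """Move the camera by velocity * dt unless that would hit an obstacle.

    Returns True if the camera moved, False if the move was rejected.
    """
    candidate = tuple(p + v * dt for p, v in zip(camera.position, velocity))
    if any(box.intersects_sphere(candidate, camera.radius) for box in obstacles):
        return False
    camera.position = candidate
    return True


if __name__ == "__main__":
    wall = Box(lo=(2.0, -5.0, 0.0), hi=(2.5, 5.0, 3.0))
    cam = FloatingCamera(position=(0.0, 0.0, 1.0))
    # 100 simulated steps of 0.1 s each, executed without waiting on the wall clock.
    moved = sum(step(cam, (1.0, 0.0, 0.0), 0.1, [wall]) for _ in range(100))
    print(f"moved {moved} of 100 steps, stopped at x={cam.position[0]:.2f}")
```

A real version would swap the boxes for collision geometry derived from the imported model; the point is only that collision rejection is all the "physics" the agentic evals need.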
nice to haves
we want to support:
- running in parallel (faster than real time?) for CI/evals
- spatio-temporal memory in sim (visual fidelity and navigation; low-fidelity physics, just for collisions)
- basic navigation perception (lightweight)
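As a rough illustration of the parallel CI use case, the sketch below runs every test case several times and reports a pass rate per case, since a single run of a stochastic agent says little. `run_episode` is a hypothetical placeholder for one simulated episode, and its 0.8 success probability is made up. Threads keep the sketch portable; a CPU-bound simulator would more likely use separate processes.

```python
import random
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_episode(case_name, seed):
    """Hypothetical stand-in for one simulated episode; returns pass/fail."""
    rng = random.Random(f"{case_name}:{seed}")  # seeded so reruns are reproducible
    return rng.random() < 0.8  # placeholder success probability (assumption)


def pass_rates(case_names, repeats, max_workers=8, episode=run_episode):
    """Run each case `repeats` times in parallel and return its pass rate."""
    jobs = [(name, seed) for name in case_names for seed in range(repeats)]
    passed = defaultdict(int)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves job order, so results line up with `jobs`.
        results = pool.map(lambda job: episode(*job), jobs)
        for (name, _seed), ok in zip(jobs, results):
            passed[name] += int(ok)
    return {name: passed[name] / repeats for name in case_names}


if __name__ == "__main__":
    rates = pass_rates(["nav_replanning", "obstacle_avoidance"], repeats=20)
    for name, rate in rates.items():
        print(f"{name}: {rate:.0%} passed")
```

A CI job could then gate on a threshold per case (for example, fail the build if any pass rate drops below an agreed value) instead of on a single run.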
Example test cases
- Navigation replanning. Tell the robot to go to another room. When the robot gets close to the door, close it. The robot should then reach the room through a secondary door, exploring until it finds that door.
- Obstacle avoidance. The robot should be given a goal point a large distance away in a busy office. Multiple people should cut into its path, and it should successfully plan around them and arrive at the goal.
- Drone follow. In a small section of a low-poly city, cars of different colors should move along predictable paths. The drone should be given the task of following a red car and should successfully follow the car along its planned route.
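One way the drone-follow case could be scored automatically is sketched below, under assumptions not stated in this issue: the car moves at constant speed along a polyline of waypoints and stops at the end, and a step counts as "following" when the drone is within `max_gap` of the car. `position_on_route` and `follow_score` are hypothetical helper names.

```python
import math


def position_on_route(waypoints, speed, t):
    """Position at time t of an object moving at constant speed along a polyline.

    The object stops at the last waypoint once the route is finished.
    """
    remaining = speed * t
    for a, b in zip(waypoints, waypoints[1:]):
        seg = math.dist(a, b)
        if seg > 0 and remaining <= seg:
            f = remaining / seg
            return tuple(pa + f * (pb - pa) for pa, pb in zip(a, b))
        remaining -= seg
    return tuple(waypoints[-1])


def follow_score(drone_positions, waypoints, speed, dt, max_gap):
    """Fraction of time steps in which the drone is within max_gap of the car."""
    if not drone_positions:
        return 0.0
    close = 0
    for i, drone in enumerate(drone_positions):
        car = position_on_route(waypoints, speed, i * dt)
        if math.dist(drone, car) <= max_gap:
            close += 1
    return close / len(drone_positions)


if __name__ == "__main__":
    route = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
    # A drone that never leaves the start point loses the car after a few steps.
    parked = [(0.0, 0.0)] * 4
    print(follow_score(parked, route, speed=1.0, dt=1.0, max_gap=1.5))
```

The test would pass when the score stays above an agreed threshold; the same pattern (scripted scene plus a scalar score) could likely be adapted to the replanning and obstacle-avoidance cases.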
Synced from DIM-383