ROS 2 package that provides a Co‑Pilot Vision agent with a PyQt6 GUI. It lets you run a configurable vision pipeline on an image, visualize the live overlay produced by the pipeline, and interact with the agent via LLM-backed tools.
The package is designed to work both from an installed ROS share (installed with colcon) and directly from source during development. Key assets such as prompts.yaml and vision_functions.json are resolved from the package install path with a safe fallback to the local repository.
- PyQt6 GUI: enter the image name, select the tool view (FunctionsView), pick a model from prompts.yaml, and run the agent.
- Live overlay preview: the pipeline writes an overlay image during processing; the GUI watches and refreshes it automatically.
- Final vs overlay: the pipeline’s final output image is still saved and used by the agent, while the GUI displays the overlay so you can see intermediate annotations.
- ROS 2 node lifecycle: the GUI hosts a ROS 2 node; it can be launched directly or from a ROS entry point.
- pm_co_pilot_vision/gui/agent_gui.py: PyQt6 GUI (right pane shows Original on top and Overlay at the bottom).
- pm_co_pilot_vision/pm_co_pilot_vision.py: entry point that can launch the GUI inside a ROS 2 node context.
- pm_co_pilot_vision/co_pilot_modules/agent.py: the Agent wrapper, with the FunctionsView enum and an optional model override.
- pm_co_pilot_vision/utils/vision_functions.py: VisionHandler that orchestrates the vision pipeline and file outputs.
- config/prompts.yaml: models and prompt configuration; the GUI reads available models from here.
- files/vision_functions.json: vision tool/function specs used by the agent.
- launch/pm_co_pilot_vision.launch.py: example launch file.
- ROS 2 (tested with Humble)
- Python 3.10+
- PyQt6
- Project dependencies that the package imports at runtime:
  - pm_vision_manager (pipeline, camera configs)
You can install pm_vision_manager by cloning its repository into your ROS 2 workspace and building it with colcon.
Install Python user deps (PyQt6) into the environment you use to run ROS:
pip install --user PyQt6
If you’re using a venv, activate it first and drop --user, which pip typically rejects inside a virtual environment; if you’re using the ROS Python, consider creating a venv to avoid mixing system packages.
Place the package in your ROS 2 workspace and build with colcon:
cd ~/ros2_ws/src
git clone <this-repo-url> pm_co_pilot_vision
cd ..
colcon build --packages-select pm_co_pilot_vision
source install/setup.bash
You can run the GUI via the package executable or with a launch file (if wired in your environment):
# direct executable (installed via entry_points)
ros2 run pm_co_pilot_vision pm_co_pilot_vision_gui
# or, if you prefer using the launch file (example)
ros2 launch pm_co_pilot_vision pm_co_pilot_vision.launch.py
At runtime the GUI expects only the image file name (e.g., sensor_corner.png). It locates paths as follows:
- PM_CO_PILOT_IMAGE_PATH (optional): directory containing your input images.
- PM_CO_PILOT_PROCESSES_PATH (optional): directory where pipeline JSON files will be written.
If these variables are not set, the code uses sensible fallbacks that match typical pm_vision_manager locations, e.g.:
- Images default: /home/<user>/Documents/ros2_ws/src/pm_vision_manager/pm_vision_manager/vision_db/co_pilot_tests/
- Processes default: /home/<user>/Documents/ros2_ws/src/pm_vision_manager/pm_vision_manager/vision_processes/co_pilot_tests
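The lookup described above can be sketched as follows. This is a minimal illustration, not the package's actual code: the helper names `resolve_dir` and `resolve_image` are hypothetical, and the defaults assume that `Path.home()` corresponds to `/home/<user>`.

```python
import os
from pathlib import Path

# Assumed defaults, mirroring the fallback paths listed above.
_VISION_MANAGER_ROOT = (
    Path.home() / "Documents" / "ros2_ws" / "src"
    / "pm_vision_manager" / "pm_vision_manager"
)
DEFAULT_IMAGE_DIR = _VISION_MANAGER_ROOT / "vision_db" / "co_pilot_tests"
DEFAULT_PROCESSES_DIR = _VISION_MANAGER_ROOT / "vision_processes" / "co_pilot_tests"


def resolve_dir(env_var: str, default: Path) -> Path:
    """Return the directory named by env_var, or default if it is unset or empty."""
    value = os.environ.get(env_var, "").strip()
    return Path(value).expanduser() if value else default


def resolve_image(image_name: str) -> Path:
    """Build the full input path for a bare image file name (hypothetical helper)."""
    return resolve_dir("PM_CO_PILOT_IMAGE_PATH", DEFAULT_IMAGE_DIR) / image_name
```

Treating an empty variable the same as an unset one avoids resolving images against the current working directory by accident.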
Exporting them before launching is recommended:
export PM_CO_PILOT_IMAGE_PATH=/path/to/images
export PM_CO_PILOT_PROCESSES_PATH=/path/to/vision_processes
- Start the app: ros2 run pm_co_pilot_vision pm_co_pilot_vision_gui.
- Fill the fields on the left:
  - FunctionsView: Names only or Full specs
  - Model: comes from config/prompts.yaml (see below)
  - Image name: just the filename (e.g., sensor_corner.png) located in PM_CO_PILOT_IMAGE_PATH
  - User prompt: free text prompt for the agent
- Click “Run Agent”.
- Right pane shows two images:
  - Original (top)
  - Overlay (bottom): this updates live as the pipeline saves the overlay file.
Outputs:
- Final processed image is saved as <image>_processed.png under an auto-created result directory.
- Live overlay is saved as <image>_overlay.png in the same directory; the GUI watches it and refreshes automatically.
- A results JSON (vision_results.json) is also written with serializable content extracted from the pipeline/agent.
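As a rough sketch of the naming scheme above, the three output paths could be derived from the image name like this. It assumes that `<image>` means the file name without its extension; `OutputPaths`, `output_paths`, and the `result_dir` argument are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class OutputPaths:
    processed: Path  # final image used by the agent
    overlay: Path    # live overlay shown in the GUI
    results: Path    # serializable pipeline/agent results


def output_paths(image_name: str, result_dir: Path) -> OutputPaths:
    """Derive the output file paths for one input image."""
    stem = Path(image_name).stem  # "sensor_corner.png" -> "sensor_corner"
    return OutputPaths(
        processed=result_dir / f"{stem}_processed.png",
        overlay=result_dir / f"{stem}_overlay.png",
        results=result_dir / "vision_results.json",
    )
```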
The GUI reads “available_models” from prompts.yaml to populate the model dropdown. The file is resolved in this order:
- Package share directory: $AMENT_PREFIX/share/pm_co_pilot_vision/prompts.yaml or $AMENT_PREFIX/share/pm_co_pilot_vision/config/prompts.yaml.
- Fallback to local repo: config/prompts.yaml.
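The resolution order above might look like the following sketch. The function names and the `repo_root` parameter are assumptions made for illustration; `get_package_share_directory` is the standard `ament_index_python` lookup, wrapped here so the code also runs outside a sourced ROS 2 environment.

```python
from pathlib import Path
from typing import Optional


def package_share_dir(package: str = "pm_co_pilot_vision") -> Optional[Path]:
    """Return the installed share directory, or None when it cannot be found."""
    try:
        from ament_index_python.packages import get_package_share_directory
        return Path(get_package_share_directory(package))
    except (ImportError, LookupError):
        # ROS is not sourced, or the package is not installed.
        return None


def find_prompts_yaml(share_dir: Optional[Path], repo_root: Path) -> Path:
    """Try the share directory first, then fall back to the local repository."""
    candidates = []
    if share_dir is not None:
        candidates.append(share_dir / "prompts.yaml")
        candidates.append(share_dir / "config" / "prompts.yaml")
    candidates.append(repo_root / "config" / "prompts.yaml")
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    tried = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"prompts.yaml not found; tried: {tried}")
```

A call such as `find_prompts_yaml(package_share_dir(), repo_root)` then works both after a colcon install and when running from source.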
You can add a new model to the dropdown by editing config/prompts.yaml:
available_models:
- gpt-5
- gpt-5-mini
# add any other model available through LangChain
files/vision_functions.json holds the function/tool specifications for the agent. It is resolved in this order:
- Package share directory: $AMENT_PREFIX/share/pm_co_pilot_vision/vision_functions.json
- Fallback to local repo: files/vision_functions.json
- Agent (co_pilot_modules/agent.py): wraps the LLM, accepts functions_view: FunctionsView and an optional model override.
- VisionHandler (utils/vision_functions.py): interfaces with pm_vision_manager to run the pipeline, writes output files, and builds serializable results.
- GUI (gui/agent_gui.py):
  - Runs the agent work on a QThread with a QObject worker to keep the UI responsive.
  - Emits a path hint for the overlay early so the GUI can start watching and updating live.
  - Scales images to fit the viewport; borders hug the image contents (no excess frames). Scrollbars are disabled.
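One way to watch the overlay file once its path is known is to poll its modification time and size. The sketch below is framework-independent and only illustrates the idea; the GUI's actual mechanism may differ, and `OverlayWatcher` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional, Tuple


class OverlayWatcher:
    """Reports when the overlay file appears or is rewritten."""

    def __init__(self, overlay_path: Path) -> None:
        self.overlay_path = overlay_path
        # (mtime in ns, size in bytes) of the last version that was reported
        self._last_seen: Optional[Tuple[int, int]] = None

    def poll(self) -> bool:
        """Return True if the file changed since the previous poll."""
        try:
            stat = self.overlay_path.stat()
        except FileNotFoundError:
            return False  # the pipeline has not written the overlay yet
        signature = (stat.st_mtime_ns, stat.st_size)
        if signature == self._last_seen:
            return False
        self._last_seen = signature
        return True
```

In a Qt application, `poll()` could be called from a `QTimer` timeout on the main thread, reloading the image only when it returns True; `QFileSystemWatcher` is the Qt-native alternative to polling.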
- PyQt6 isn’t found
  - Install it in your runtime Python with pip install --user PyQt6, and ensure you run the GUI in that environment.
- Image not found dialog
  - Set PM_CO_PILOT_IMAGE_PATH or place the image file under the default path shown in the dialog.
- NameError: QObject is not defined
  - Ensure the GUI imports include from PyQt6.QtCore import Qt, QObject, pyqtSignal, QThread.
- KeyError: agent in prompts.yaml
  - The loader now aliases 'agent' to 'agent_all_functions'. Make sure your prompts.yaml structure matches the examples or use the updated keys.
- GUI freezes while running
  - The agent runs in a worker QThread. If you changed that code and see freezes, verify the long-running calls are off the main thread.
- Scrollbars on the image panel
  - The GUI scales images to the viewport and hides scrollbars. If the window is too small, enlarge it or resize the panes.
- Build fast (rebuild only this package, then re-source the workspace):
colcon build --packages-select pm_co_pilot_vision
source install/setup.bash