# Triple Depth2Object Position

A ROS2 Humble package for real-time 3D object detection and mapping using dual Intel RealSense D435 depth cameras with AI-powered object recognition via GroundingDINO.
## Features
- Dual Camera System: Synchronized operation of two Intel RealSense D435 cameras
- 3D Object Detection: Real-time object detection with 3D position estimation
- AI-Powered Recognition: GroundingDINO model for robust object detection
- Stereo Calibration: Precise camera-to-camera calibration using ChArUco boards
- Base Frame Calibration: Camera-to-robot base frame transformation
- Point Cloud Fusion: Merged point cloud from multiple cameras
- Socket Interface: External application integration via TCP socket
- RViz Visualization: Real-time visualization of cameras, objects, and point clouds
## Hardware Requirements
- 2x Intel RealSense D435 cameras
- NVIDIA Jetson Orin (or compatible CUDA-capable system)
- Minimum 8GB RAM (16GB recommended)
- USB 3.0 ports for cameras
## Software Requirements
- Ubuntu 20.04/22.04
- ROS2 Humble
- Python 3.8+
- CUDA 11.4+ (for GPU acceleration)
- PyTorch 1.13+ with CUDA support
## Installation

### Install Dependencies

```bash
# Install ROS2 dependencies
sudo apt update
sudo apt install ros-humble-realsense2-camera ros-humble-realsense2-description

# Install Python packages
pip3 install opencv-contrib-python==4.7.0.72
pip3 install numpy scipy pyyaml
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu118
pip3 install transformers pillow matplotlib
```

### Download the GroundingDINO Model
```bash
# Create model directory
sudo mkdir -p /opt/models/groundingdino/

# Download model files
cd /opt/models/groundingdino/
sudo wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
sudo wget https://raw.githubusercontent.com/IDEA-Research/GroundingDINO/main/groundingdino/config/GroundingDINO_SwinT_OGC.py
```

### Build the Package
```bash
# Clone the repository
cd ~/ros2_ws/src
git clone https://github.com/yourusername/triple_depth2object_position.git

# Build the package
cd ~/ros2_ws
colcon build --packages-select triple_depth2object_position
source install/setup.bash

# Create a directory for calibration data
mkdir -p ~/ros2_ws/calibration_data
```

## Camera Configuration

The package is configured for the following camera serial numbers by default:
- Left Camera: `419622073822`
- Right Camera: `033422072712`
To use different cameras, update the serial numbers in the launch files.
## Calibration
### Generate the Calibration Board

```bash
# Generate ChArUco calibration board PDF
python3 ~/ros2_ws/src/triple_depth2object_position/scripts/generate_charuco_board_pdf.py
```

Print the generated board (`charuco_board_4x6.pdf`).

### Stereo Calibration

Calibrate the relative position between the two cameras:

```bash
ros2 launch triple_depth2object_position charuco_calibration.launch.py mode:=stereo
```

- Show the ChArUco board to both cameras simultaneously
- Press SPACE to capture frames (collect 50-200 samples)
- Press 'c' to run calibration
- Press 's' to save calibration
- Press ESC to exit
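Stereo calibration produces the extrinsics between the two cameras: a rotation `R` and translation `t` that map a point from one camera's frame into the other's. A minimal sketch of applying them, with made-up values (the actual storage format of the saved calibration is not shown here):

```python
import numpy as np

# Hypothetical extrinsics from stereo calibration: rotation R and translation t
# mapping points from the right-camera frame into the left-camera frame,
# i.e. p_left = R @ p_right + t. These values are illustrative only.
R = np.array([
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0],
])  # 90-degree rotation about the Y axis
t = np.array([0.5, 0.0, 0.0])  # cameras 0.5 m apart along X

def right_to_left(p_right: np.ndarray) -> np.ndarray:
    """Transform a 3D point from the right-camera frame to the left-camera frame."""
    return R @ p_right + t

p = right_to_left(np.array([0.0, 0.0, 2.0]))
print(p)  # a point 2 m in front of the right camera, expressed in the left frame
```

The same form of rigid transform is chained again by the base-frame calibration below to express detections in robot coordinates.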
### Base Frame Calibration

Calibrate the cameras to the robot base frame:

```bash
# Measure the board position in the base frame (example values in meters)
ros2 launch triple_depth2object_position charuco_calibration.launch.py mode:=base \
    board_x:=0.5637134 board_y:=-0.1516032 board_z:=0.1560714
```

## Running the System

```bash
# Launch the complete object detection system
ros2 launch triple_depth2object_position object_detection.launch.py

# In another terminal, send detection queries
ros2 topic pub /object_query std_msgs/String "data: '[\"bottle\", \"can\", \"cup\"]'" --once

# Launch RViz for visualization
ros2 launch triple_depth2object_position visualization.launch.py
```

## Launch Files

| Launch File | Description |
|---|---|
| `cameras.launch.py` | Launch both RealSense cameras |
| `charuco_calibration.launch.py` | Camera calibration (stereo or base mode) |
| `mapping.launch.py` | 3D point cloud mapping |
| `object_detection.launch.py` | Complete object detection system |
| `visualization.launch.py` | RViz visualization |
## Topics

- `/camera_left/camera/color/image_raw` - Left RGB image
- `/camera_right/camera/color/image_raw` - Right RGB image
- `/camera_left/camera/depth/points` - Left point cloud
- `/camera_right/camera/depth/points` - Right point cloud
- `/object_query` (std_msgs/String) - Objects to detect (JSON array)
- `/object_detection_result` (std_msgs/String) - Detection results (JSON)
- `/merged_pointcloud` (sensor_msgs/PointCloud2) - Combined point cloud
- `/camera_frames` (visualization_msgs/MarkerArray) - Camera visualizations
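The `/object_query` payload is a JSON array of object labels carried inside a `std_msgs/String`. Building the payload in Python avoids the shell-escaping pitfalls of the CLI form:

```python
import json

# Serialize the list of target labels into the JSON array expected on /object_query
labels = ["bottle", "can", "cup"]
payload = json.dumps(labels)
print(payload)  # '["bottle", "can", "cup"]'

# Equivalent CLI call, for reference:
#   ros2 topic pub /object_query std_msgs/String "data: '[\"bottle\", \"can\", \"cup\"]'" --once
```

In an rclpy node, `payload` would go into the `data` field of the `std_msgs/String` message before publishing.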
## Socket Interface

For external application integration, a TCP socket server accepts JSON detection requests:

```python
import socket
import json

# Connect to the detection server
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(('localhost', 5000))

# Send detection request (sendall avoids partial writes)
request = {"objects": ["bottle", "can", "cup"]}
client.sendall(json.dumps(request).encode())

# Receive results
response = client.recv(4096).decode()
result = json.loads(response)
print(f"Detected objects: {result}")
client.close()
```

## Custom Messages

`DetectedObject3D`:

```
string label
geometry_msgs/Point position
float32 confidence
string source_camera
```

Detection result:

```
DetectedObject3D[] detected_objects
string[] not_found_objects
```
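For consumers outside ROS (e.g. the socket interface above), the detection result arrives as JSON rather than as ROS messages. A sketch of mirroring `DetectedObject3D` in plain Python; the field names come from the message definition, but the exact JSON schema is an assumption:

```python
import json
from dataclasses import dataclass

@dataclass
class DetectedObject3D:
    """Plain-Python mirror of the DetectedObject3D ROS message."""
    label: str
    position: tuple  # (x, y, z) in meters
    confidence: float
    source_camera: str

def parse_result(raw: str) -> list:
    """Parse a detection-result JSON string (schema assumed, not normative)."""
    data = json.loads(raw)
    return [
        DetectedObject3D(
            label=obj["label"],
            position=(obj["position"]["x"], obj["position"]["y"], obj["position"]["z"]),
            confidence=obj["confidence"],
            source_camera=obj["source_camera"],
        )
        for obj in data.get("detected_objects", [])
    ]

raw = ('{"detected_objects": [{"label": "bottle", '
       '"position": {"x": 0.4, "y": -0.1, "z": 0.2}, '
       '"confidence": 0.87, "source_camera": "camera_left"}], '
       '"not_found_objects": ["cup"]}')
objs = parse_result(raw)
print(objs[0].label, objs[0].position)
```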
## Configuration

Edit `config/camera_config.yaml` to adjust:

- Camera resolution
- Frame rate
- Depth settings
- Point cloud filters

Edit `config/object_detection_config.yaml` to configure:

- Detection confidence threshold
- Depth filtering parameters
- AI model settings

Edit `config/charuco_board.yaml` to change the board specifications.
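These files are standard YAML and can be inspected from Python with PyYAML. The keys below are hypothetical (the real key names live in the files themselves); the snippet only illustrates the loading pattern:

```python
import yaml

# Hypothetical excerpt of config/camera_config.yaml -- the actual key names
# may differ; this only shows how such a file is typically read.
example = """
camera:
  width: 848
  height: 480
  fps: 30
depth:
  min_range: 0.1
  max_range: 5.0
"""

config = yaml.safe_load(example)
print(config["camera"]["fps"], config["depth"]["max_range"])
```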
## Troubleshooting

### Cameras Not Detected

```bash
# Check if cameras are connected; both should appear with their serial numbers
rs-enumerate-devices
```

### GPU / CUDA Issues

```bash
# Verify CUDA installation
python3 -c "import torch; print(torch.cuda.is_available())"

# Check GPU memory
nvidia-smi
```

### Calibration Quality

- Ensure good lighting conditions
- Keep board flat and clearly visible
- Collect diverse board poses
- Verify board measurements match config
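A quick way to sanity-check a calibration is the RMS reprojection error between observed corner pixels and their reprojections. The arrays below are made up; in practice they come from the calibration routine:

```python
import numpy as np

# Observed ChArUco corner pixels vs. their reprojected positions (illustrative values)
observed = np.array([[100.0, 200.0], [150.0, 210.0], [120.0, 260.0]])
reprojected = np.array([[100.5, 199.5], [149.0, 211.0], [120.0, 260.5]])

# RMS of the per-corner Euclidean pixel errors
err = np.sqrt(np.mean(np.sum((observed - reprojected) ** 2, axis=1)))
print(f"RMS reprojection error: {err:.3f} px")
```

As a rule of thumb, a well-collected dataset yields sub-pixel RMS error; much larger values suggest bad board measurements or too few poses.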
## Performance

- Processing time: ~2-3 seconds per query
- Detection range: 0.1 m - 5.0 m
- Accuracy depends on calibration quality
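Detections outside the stated 0.1 m - 5.0 m range are unreliable and are best discarded. A minimal sketch of such a range filter with NumPy (points and thresholds are illustrative):

```python
import numpy as np

# Keep only 3D points whose distance from the camera lies in the detection range
MIN_RANGE, MAX_RANGE = 0.1, 5.0

points = np.array([
    [0.0, 0.0, 0.05],  # too close
    [0.2, 0.1, 1.5],   # in range
    [1.0, 2.0, 6.0],   # too far
])

dist = np.linalg.norm(points, axis=1)
mask = (dist >= MIN_RANGE) & (dist <= MAX_RANGE)
kept = points[mask]
print(kept)  # only the in-range point remains
```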
## Project Structure

```
triple_depth2object_position/
├── dual_camera_system/              # Core Python package
│   ├── detection/                   # Object detection modules
│   │   ├── object_detection_3d.py
│   │   └── grounding_dino_wrapper.py
│   ├── mapping/                     # Point cloud processing
│   │   └── pointcloud_merger.py
│   ├── visualization/               # RViz visualizations
│   │   ├── marker_visualizer.py
│   │   └── camera_frame_publisher.py
│   ├── charuco_stereo_calibration.py
│   └── charuco_base_calibration.py
├── launch/                          # ROS2 launch files
├── config/                          # Configuration files
├── msg/                             # Custom message definitions
├── scripts/                         # Utility scripts
├── package.xml                      # ROS2 package manifest
├── setup.py                         # Python package setup
└── CMakeLists.txt                   # Build configuration
```
## Contributing

1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request
## License

Apache License 2.0 - see the LICENSE file for details.
## Acknowledgments

- Intel RealSense SDK
- GroundingDINO by IDEA Research
- ROS2 Community
## Support

For issues and questions:
- Create an issue on GitHub
- Check existing documentation in the repository
## Citation

If you use this package in your research, please cite:

```bibtex
@software{triple_depth2object,
  title = {Triple Depth2Object Position: ROS2 Package for 3D Object Detection},
  year  = {2024},
  url   = {https://github.com/yourusername/triple_depth2object_position}
}
```