
Triple Depth2Object Position

A ROS2 Humble package for real-time 3D object detection and mapping using dual Intel RealSense D435 depth cameras with AI-powered object recognition via GroundingDINO.

Features

  • Dual Camera System: Synchronized operation of two Intel RealSense D435 cameras
  • 3D Object Detection: Real-time object detection with 3D position estimation
  • AI-Powered Recognition: GroundingDINO model for robust object detection
  • Stereo Calibration: Precise camera-to-camera calibration using ChArUco boards
  • Base Frame Calibration: Camera-to-robot base frame transformation
  • Point Cloud Fusion: Merged point cloud from multiple cameras
  • Socket Interface: External application integration via TCP socket
  • RViz Visualization: Real-time visualization of cameras, objects, and point clouds

System Requirements

Hardware

  • 2x Intel RealSense D435 cameras
  • NVIDIA Jetson Orin (or compatible CUDA-capable system)
  • Minimum 8GB RAM (16GB recommended)
  • USB 3.0 ports for cameras

Software

  • Ubuntu 20.04/22.04
  • ROS2 Humble
  • Python 3.8+
  • CUDA 11.4+ (for GPU acceleration)
  • PyTorch 1.13+ with CUDA support

Installation

1. Install Dependencies

# Install ROS2 dependencies
sudo apt update
sudo apt install ros-humble-realsense2-camera ros-humble-realsense2-description

# Install Python packages
pip3 install opencv-contrib-python==4.7.0.72
pip3 install numpy scipy pyyaml
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu118
pip3 install transformers pillow matplotlib

2. Download GroundingDINO Model

# Create model directory
sudo mkdir -p /opt/models/groundingdino/

# Download model files
cd /opt/models/groundingdino/
sudo wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
sudo wget https://raw.githubusercontent.com/IDEA-Research/GroundingDINO/main/groundingdino/config/GroundingDINO_SwinT_OGC.py

3. Clone and Build Package

# Clone the repository
cd ~/ros2_ws/src
git clone https://github.com/yourusername/triple_depth2object_position.git

# Build the package
cd ~/ros2_ws
colcon build --packages-select triple_depth2object_position
source install/setup.bash

4. Create Calibration Directory

mkdir -p ~/ros2_ws/calibration_data

Camera Configuration

The package is configured for the following camera serials by default:

  • Left Camera: 419622073822
  • Right Camera: 033422072712

To use different cameras, update the serial numbers in the launch files.

Usage

Step 1: Generate Calibration Board

# Generate ChArUco calibration board PDF
python3 ~/ros2_ws/src/triple_depth2object_position/scripts/generate_charuco_board_pdf.py

# Print the generated board (charuco_board_4x6.pdf)

Step 2: Stereo Camera Calibration

Calibrate the relative position between the two cameras:

ros2 launch triple_depth2object_position charuco_calibration.launch.py mode:=stereo

  1. Show the ChArUco board to both cameras simultaneously
  2. Press SPACE to capture frames (collect 50-200 samples)
  3. Press 'c' to run calibration
  4. Press 's' to save the calibration results
  5. Press ESC to exit
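Stereo calibration produces the rotation and translation relating the two camera frames. The actual storage format is defined by the package's saved calibration files; the snippet below is only a sketch of the underlying math, showing how a point seen by the right camera maps into the left camera's frame (the numeric values are illustrative, not real extrinsics):

```python
import numpy as np

# Illustrative extrinsics -- real values come from the saved calibration,
# not from this snippet.
R = np.eye(3)                      # rotation: right camera -> left camera
t = np.array([-0.06, 0.0, 0.0])    # translation in meters (~6 cm baseline)

# Homogeneous 4x4 transform T_left_right
T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

# A 3D point observed in the right camera frame (homogeneous coordinates)...
p_right = np.array([0.1, 0.0, 1.0, 1.0])

# ...expressed in the left camera frame: shifted by the baseline
p_left = T @ p_right
print(p_left[:3])
```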

Step 3: Base Frame Calibration

Calibrate cameras to robot base frame:

# Measure board position in base frame (example values in meters)
ros2 launch triple_depth2object_position charuco_calibration.launch.py mode:=base \
  board_x:=0.5637134 board_y:=-0.1516032 board_z:=0.1560714
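Conceptually, base-frame calibration combines the measured board pose in the base frame with the board pose detected by the camera to recover the camera's pose in the base frame: T_base_cam = T_base_board · inv(T_cam_board). A minimal numeric sketch (the rotations and the camera-side translation below are illustrative placeholders, not real measurements):

```python
import numpy as np

def make_T(R, t):
    """Build a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Board pose measured in the base frame (translation taken from the example
# launch arguments above; identity rotation purely for illustration).
T_base_board = make_T(np.eye(3), [0.5637134, -0.1516032, 0.1560714])

# Board pose as detected by the camera (illustrative values).
T_cam_board = make_T(np.eye(3), [0.0, 0.0, 0.8])

# Camera pose in the base frame:
#   T_base_cam = T_base_board @ inv(T_cam_board)
T_base_cam = T_base_board @ np.linalg.inv(T_cam_board)
print(T_base_cam[:3, 3])
```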

Step 4: Run Object Detection

# Launch the complete object detection system
ros2 launch triple_depth2object_position object_detection.launch.py

# In another terminal, send detection queries
ros2 topic pub /object_query std_msgs/String "data: '[\"bottle\", \"can\", \"cup\"]'" --once
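The query string published above is simply a JSON-encoded list of labels placed in the `data` field of a `std_msgs/String`. A client built with rclpy would construct the same payload like this:

```python
import json

# The /object_query payload is a JSON array of object labels; this string
# goes into the `data` field of a std_msgs/String message.
labels = ["bottle", "can", "cup"]
payload = json.dumps(labels)
print(payload)  # ["bottle", "can", "cup"]
```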

Step 5: Visualization

# Launch RViz for visualization
ros2 launch triple_depth2object_position visualization.launch.py

Launch Files

| Launch File | Description |
|---|---|
| cameras.launch.py | Launch both RealSense cameras |
| charuco_calibration.launch.py | Camera calibration (stereo or base mode) |
| mapping.launch.py | 3D point cloud mapping |
| object_detection.launch.py | Complete object detection system |
| visualization.launch.py | RViz visualization |

ROS2 Topics

Subscribed Topics

  • /camera_left/camera/color/image_raw - Left RGB image
  • /camera_right/camera/color/image_raw - Right RGB image
  • /camera_left/camera/depth/points - Left point cloud
  • /camera_right/camera/depth/points - Right point cloud
  • /object_query (std_msgs/String) - Objects to detect (JSON array)

Published Topics

  • /object_detection_result (std_msgs/String) - Detection results (JSON)
  • /merged_pointcloud (sensor_msgs/PointCloud2) - Combined point cloud
  • /camera_frames (visualization_msgs/MarkerArray) - Camera visualizations
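On the subscriber side, the `/object_detection_result` string is parsed as JSON. The payload below is hypothetical (the real schema is defined by the package's detection node; the field names here merely mirror the custom message fields described under Custom Messages):

```python
import json

# Hypothetical result payload -- the actual schema is defined by the
# detection node and may differ.
msg_data = '''{
  "detected_objects": [
    {"label": "bottle", "position": {"x": 0.41, "y": -0.12, "z": 0.30},
     "confidence": 0.87, "source_camera": "left"}
  ],
  "not_found_objects": ["can"]
}'''

result = json.loads(msg_data)
for obj in result["detected_objects"]:
    pos = obj["position"]
    print(f'{obj["label"]} at ({pos["x"]:.2f}, {pos["y"]:.2f}, {pos["z"]:.2f}) '
          f'conf={obj["confidence"]:.2f} from {obj["source_camera"]}')
```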

Socket Interface

For external application integration:

import socket
import json

# Connect to the detection server
with socket.create_connection(('localhost', 5000)) as client:
    # Send detection request (sendall guarantees the full payload is sent)
    request = {"objects": ["bottle", "can", "cup"]}
    client.sendall(json.dumps(request).encode())

    # Receive results (a single recv is enough for small responses;
    # for larger payloads, loop until the full message has arrived)
    response = client.recv(4096).decode()

result = json.loads(response)
print(f"Detected objects: {result}")

Custom Messages

DetectedObject3D.msg

string label
geometry_msgs/Point position
float32 confidence
string source_camera

ObjectDetectionResult.msg

DetectedObject3D[] detected_objects
string[] not_found_objects

Configuration Files

Camera Parameters

Edit config/camera_config.yaml to adjust:

  • Camera resolution
  • Frame rate
  • Depth settings
  • Point cloud filters
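As a rough sketch of what such a file might contain (the key names below are illustrative assumptions, not the package's actual schema; edit the real `config/camera_config.yaml`, not this example):

```yaml
# Illustrative sketch only -- actual keys are defined by the package.
camera:
  color_width: 640
  color_height: 480
  fps: 30
depth:
  min_range: 0.1    # meters
  max_range: 5.0
pointcloud:
  voxel_size: 0.01  # downsampling leaf size, meters
```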

Detection Parameters

Edit config/object_detection_config.yaml to configure:

  • Detection confidence threshold
  • Depth filtering parameters
  • AI model settings

Calibration Board

Edit config/charuco_board.yaml to change the board specifications.

Troubleshooting

Camera Not Detected

# Check if cameras are connected
rs-enumerate-devices

# Should show both cameras with their serial numbers

CUDA/GPU Issues

# Verify CUDA installation
python3 -c "import torch; print(torch.cuda.is_available())"

# Check GPU memory
nvidia-smi

Calibration Issues

  • Ensure good lighting conditions
  • Keep board flat and clearly visible
  • Collect diverse board poses
  • Verify board measurements match config

Detection Performance

  • Processing time: ~2-3 seconds per query
  • Detection range: 0.1m - 5.0m
  • Accuracy depends on calibration quality

Project Structure

triple_depth2object_position/
├── dual_camera_system/           # Core Python package
│   ├── detection/               # Object detection modules
│   │   ├── object_detection_3d.py
│   │   └── grounding_dino_wrapper.py
│   ├── mapping/                # Point cloud processing
│   │   └── pointcloud_merger.py
│   ├── visualization/          # RViz visualizations
│   │   ├── marker_visualizer.py
│   │   └── camera_frame_publisher.py
│   ├── charuco_stereo_calibration.py
│   └── charuco_base_calibration.py
├── launch/                     # ROS2 launch files
├── config/                     # Configuration files
├── msg/                       # Custom message definitions
├── scripts/                   # Utility scripts
├── package.xml               # ROS2 package manifest
├── setup.py                  # Python package setup
└── CMakeLists.txt           # Build configuration

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

License

Apache License 2.0 - See LICENSE file for details

Acknowledgments

  • Intel RealSense SDK
  • GroundingDINO by IDEA Research
  • ROS2 Community

Support

For issues and questions:

  • Create an issue on GitHub
  • Check existing documentation in the repository

Citation

If you use this package in your research, please cite:

@software{triple_depth2object,
  title = {Triple Depth2Object Position: ROS2 Package for 3D Object Detection},
  year = {2024},
  url = {https://github.com/yourusername/triple_depth2object_position}
}
