leon0514/trt-sam3

TensorRT SAM3 (C++ Inference)

This repository is a TensorRT-based C++ implementation of SAM3 inference. It currently implements image preprocessing, image encoding, text encoding, decoding, and post-processing, and supports inference on images with multiple text prompts.
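As an illustration of the preprocessing stage, here is a minimal CPU reference sketch of what such a step typically does: resize an interleaved 8-bit RGB image and normalize it into a planar float tensor. The repository does this work in CUDA kernels; the function name `preprocess_hwc_to_chw`, the nearest-neighbour resize, and the mean/std defaults of 0.5 are assumptions made for this example, not values taken from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference sketch (hypothetical helper, not part of the repository):
// nearest-neighbour resize of an interleaved RGB image (HWC, uint8) to
// dst_w x dst_h, followed by normalization into planar float (CHW).
// The mean/stddev defaults are illustrative assumptions.
std::vector<float> preprocess_hwc_to_chw(const std::vector<std::uint8_t>& src,
                                         int src_w, int src_h,
                                         int dst_w, int dst_h,
                                         float mean = 0.5f, float stddev = 0.5f) {
    assert(src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);
    assert(src.size() == static_cast<std::size_t>(src_w) * src_h * 3);

    const std::size_t plane = static_cast<std::size_t>(dst_w) * dst_h;
    std::vector<float> dst(plane * 3);

    for (int y = 0; y < dst_h; ++y) {
        const int sy = y * src_h / dst_h;  // nearest source row
        for (int x = 0; x < dst_w; ++x) {
            const int sx = x * src_w / dst_w;  // nearest source column
            const std::size_t s = (static_cast<std::size_t>(sy) * src_w + sx) * 3;
            for (int c = 0; c < 3; ++c) {
                const float v = src[s + c] / 255.0f;  // scale to [0, 1]
                dst[c * plane + static_cast<std::size_t>(y) * dst_w + x] =
                    (v - mean) / stddev;
            }
        }
    }
    return dst;
}
```

A GPU version would usually assign one thread per output pixel and write the three channel planes directly into the engine's input buffer.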

Key Features:

  • Runs inference with TensorRT engines
  • Preprocessing and post-processing kernels implemented in C++ and CUDA for efficient execution on the GPU
  • Supports mask and box output from text prompts and geometric (bounding-box) prompts
  • Uses batching and memory reuse to recognize multiple text-prompt categories simultaneously
  • Supports cross-image prompting: draw boxes on image A, then recognize on image B
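To make the mask/box output more concrete, the following is a minimal CPU sketch of a common post-processing step: binarizing mask logits and computing a tight bounding box around the resulting mask. The `Box` type and both function names are hypothetical, and the decoder may well emit boxes directly, so deriving a box from the mask here is only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical box type with inclusive pixel bounds.
struct Box { int x0, y0, x1, y1; };

// Binarize mask logits. A logit above 0 corresponds to a sigmoid
// probability above 0.5, so 0 is a common default threshold.
std::vector<std::uint8_t> binarize_mask(const std::vector<float>& logits,
                                        float threshold = 0.0f) {
    std::vector<std::uint8_t> mask(logits.size());
    for (std::size_t i = 0; i < logits.size(); ++i) {
        mask[i] = logits[i] > threshold ? 1 : 0;
    }
    return mask;
}

// Compute the tight bounding box of all set pixels in a w x h mask.
// Returns false (and leaves `box` untouched) when the mask is empty.
bool mask_to_box(const std::vector<std::uint8_t>& mask, int w, int h, Box& box) {
    assert(w > 0 && h > 0);
    assert(mask.size() == static_cast<std::size_t>(w) * h);
    int x0 = w, y0 = h, x1 = -1, y1 = -1;
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            if (mask[static_cast<std::size_t>(y) * w + x]) {
                x0 = std::min(x0, x);
                y0 = std::min(y0, y);
                x1 = std::max(x1, x);
                y1 = std::max(y1, y);
            }
        }
    }
    if (x1 < 0) return false;
    box = Box{x0, y0, x1, y1};
    return true;
}
```

With several text prompts, a step like this would run once per prompt and per detected instance, which is where reusing output buffers across prompts pays off.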

ONNX Model and TensorRT Model Export

Vision Encode Model Quantization

Environment

  • Server: Ubuntu 24.04
  • GPU: NVIDIA GeForce RTX 4090
  • Docker image: nvcr.io/nvidia/tensorrt:25.10-py3

Recognition Results

  • Multi-word Text Prompts: multiple categories can be recognized simultaneously
  • Geometric Prompts
  • Mixed Prompts
  • Prompt boxes on image A, recognition on image B

Speed

Around 50 ms
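The README does not say how this figure was measured. One common way to obtain such a number is to average wall-clock time over several runs after a few warm-up iterations, sketched below. `average_latency_ms` is a hypothetical helper, and the callable passed to it stands in for whatever inference call is being timed.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Average wall-clock latency of `run` in milliseconds, measured over
// `iters` calls after `warmup` untimed calls. `run` must block until the
// work has finished; for GPU work that means synchronizing the CUDA
// stream inside it, otherwise only the launch overhead is measured.
double average_latency_ms(const std::function<void()>& run, int warmup, int iters) {
    assert(warmup >= 0 && iters > 0);
    for (int i = 0; i < warmup; ++i) run();  // untimed warm-up

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) run();
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iters;
}
```

Warm-up runs matter because the first few inferences often include one-time costs such as memory allocation and lazy initialization.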

Build and Run

# Out-of-source build; pybind11 must be installed for python3
mkdir -p build && cd build
cmake .. -DCMAKE_PREFIX_PATH="$(python3 -m pybind11 --cmakedir)"
make -j$(nproc)

Web UI

References

https://github.com/jamjamjon/usls.git

License and Contributions

  • This repository is an example for personal and research use. Issues are welcome.