This folder contains scripts for exporting models to TensorRT format for NVIDIA Triton Inference Server deployment.
The export process transforms PyTorch models into optimized TensorRT engines for high-performance GPU inference.
| Script | Purpose | Output |
|---|---|---|
| `export_models.py` | YOLO11 object detection with end2end NMS | TensorRT engine |
| `export_scrfd.py` | SCRFD-10G face detection + landmarks | TensorRT engine |
| `export_face_recognition.py` | ArcFace face embeddings | TensorRT engine |
| `export_mobileclip_image_encoder.py` | MobileCLIP image encoder | TensorRT engine |
| `export_mobileclip_text_encoder.py` | MobileCLIP text encoder | TensorRT engine |
| `export_paddleocr_det.py` | PP-OCRv5 text detection | TensorRT engine |
| `export_paddleocr_rec.py` | PP-OCRv5 text recognition | TensorRT engine |
| `download_face_models.py` | Download pre-trained face models | PyTorch weights |
| `download_paddleocr.py` | Download PP-OCRv5 models | ONNX models |
| `download_pytorch_models.py` | Download YOLO11 PyTorch models | PyTorch weights |
```
pytorch_models/
├── yolo11s.pt                          # YOLO11 PyTorch model
├── arcface_w600k_r50.onnx              # ArcFace ONNX model
├── mobileclip2_s2/                     # MobileCLIP checkpoint
├── mobileclip2_s2_image_encoder.onnx   # MobileCLIP image encoder ONNX
└── mobileclip2_s2_text_encoder.onnx    # MobileCLIP text encoder ONNX
```
```
models/
├── yolov11_small_trt/              # YOLO11 TensorRT (standard)
│   ├── 1/model.plan
│   └── config.pbtxt
├── yolov11_small_trt_end2end/      # YOLO11 TensorRT with GPU NMS
│   ├── 1/model.plan
│   └── config.pbtxt
├── scrfd_10g_bnkps/                # SCRFD-10G face detection TensorRT
│   ├── 1/model.plan
│   └── config.pbtxt
├── arcface_w600k_r50/              # ArcFace TensorRT
│   ├── 1/model.plan
│   └── config.pbtxt
├── mobileclip2_s2_image_encoder/   # MobileCLIP image encoder
│   ├── 1/model.plan
│   └── config.pbtxt
├── mobileclip2_s2_text_encoder/    # MobileCLIP text encoder
│   ├── 1/model.plan
│   └── config.pbtxt
├── ppocr_det_v5/                   # PP-OCRv5 detection
│   ├── 1/model.plan
│   └── config.pbtxt
└── ppocr_rec_v5/                   # PP-OCRv5 recognition
    ├── 1/model.plan
    └── config.pbtxt
```
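Each `config.pbtxt` follows Triton's model-configuration schema. As a rough sketch of what one entry might contain (tensor names, dims, and batch sizes here are illustrative assumptions, not copied from the repository — inspect the exported engine's bindings for the real values):

```protobuf
name: "arcface_w600k_r50"
platform: "tensorrt_plan"
max_batch_size: 128

input [
  {
    name: "input"          # assumed binding name
    data_type: TYPE_FP16
    dims: [ 3, 112, 112 ]
  }
]
output [
  {
    name: "embedding"      # assumed binding name
    data_type: TYPE_FP16
    dims: [ 512 ]
  }
]

dynamic_batching {
  max_queue_delay_microseconds: 100
}
```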
```bash
# Export TensorRT with GPU NMS (recommended)
make export-models

# Or directly:
docker compose exec yolo-api python /app/export/export_models.py \
    --models small \
    --formats trt trt_end2end \
    --normalize-boxes
```

```bash
# Export SCRFD to TensorRT
docker compose exec yolo-api python /app/export/export_scrfd.py
```

```bash
# Download pre-trained model
docker compose exec yolo-api python /app/export/download_face_models.py

# Export to TensorRT
docker compose exec yolo-api python /app/export/export_face_recognition.py
```

```bash
# Export both image and text encoders
make export-mobileclip

# Or individually:
docker compose exec yolo-api python /app/export/export_mobileclip_image_encoder.py
docker compose exec yolo-api python /app/export/export_mobileclip_text_encoder.py
```

```bash
# Download PP-OCRv5 models
docker compose exec yolo-api python /app/export/download_paddleocr.py

# Export detection and recognition
docker compose exec yolo-api python /app/export/export_paddleocr_det.py
docker compose exec yolo-api python /app/export/export_paddleocr_rec.py
```

**YOLO11:**
- Input: `[B, 3, 640, 640]` FP16, normalized [0, 1]
- Output (end2end): `num_dets`, `det_boxes`, `det_scores`, `det_classes`
- Dynamic batching: 1-64 (configurable)

**SCRFD-10G:**
- Input: `[B, 3, 640, 640]` FP32, RGB, (x - 127.5) / 128.0 normalized
- Output: 9 tensors (3 FPN strides x score/bbox/kps), CPU post-processed
- Dynamic batching: 1-32

**ArcFace:**
- Input: `[B, 3, 112, 112]` FP16, aligned face crops
- Output: `[B, 512]` L2-normalized embeddings
- Dynamic batching: 1-128

**MobileCLIP image encoder:**
- Input: `[B, 3, 256, 256]` FP32, normalized [0, 1]
- Output: `[B, 512]` L2-normalized embeddings
- Dynamic batching: 1-128

**MobileCLIP text encoder:**
- Input: `[B, 77]` INT64 token IDs
- Output: `[B, 512]` L2-normalized embeddings
- Dynamic batching: 1-64

**PP-OCRv5 detection:**
- Input: `[B, 3, H, W]` FP32, dynamic size
- Output: Text region polygons

**PP-OCRv5 recognition:**
- Input: `[B, 3, 48, W]` FP32, dynamic width
- Output: Character sequence probabilities
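The two image-normalization conventions above are easy to mix up. A minimal NumPy sketch of both (function names are illustrative, not taken from the export scripts):

```python
import numpy as np

def preprocess_yolo(img: np.ndarray) -> np.ndarray:
    """HWC uint8 RGB -> NCHW FP16 scaled to [0, 1] (YOLO11 convention)."""
    x = img.astype(np.float32) / 255.0      # scale to [0, 1]
    x = x.transpose(2, 0, 1)[None]          # HWC -> NCHW, add batch dim
    return x.astype(np.float16)

def preprocess_scrfd(img: np.ndarray) -> np.ndarray:
    """HWC uint8 RGB -> NCHW FP32 with (x - 127.5) / 128.0 (SCRFD convention)."""
    x = (img.astype(np.float32) - 127.5) / 128.0
    return x.transpose(2, 0, 1)[None]

img = np.zeros((640, 640, 3), dtype=np.uint8)
print(preprocess_yolo(img).shape)   # (1, 3, 640, 640)
print(preprocess_scrfd(img).dtype)  # float32
```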
All exports use these common settings:
- Precision: FP16 (configurable)
- Workspace: 4GB
- Optimization profiles for dynamic batching
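For reference, these settings map onto `trtexec` flags roughly as follows. This is an illustrative sketch only (the ONNX filename and the `images` tensor name are assumptions; the export scripts may build engines through the TensorRT Python API instead):

```bash
trtexec --onnx=yolo11s.onnx --saveEngine=model.plan \
    --fp16 \
    --memPoolSize=workspace:4096MiB \
    --minShapes=images:1x3x640x640 \
    --optShapes=images:8x3x640x640 \
    --maxShapes=images:64x3x640x640
```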
Download PyTorch models first:

```bash
make download-pytorch
```

Check GPU memory and reduce batch size if needed:

```bash
docker compose exec yolo-api nvidia-smi
```

Verify file structure and restart Triton:

```bash
ls -lh models/{model_name}/1/
docker compose restart triton-server
```
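The file-structure check can also be scripted. A small stdlib-only sketch (the `models` path and function name are illustrative) that flags any model directory missing its `config.pbtxt` or a versioned `model.plan`:

```python
from pathlib import Path

def check_model_repo(root: str) -> list[str]:
    """Return a list of problems found in a Triton model repository layout."""
    repo = Path(root)
    if not repo.is_dir():
        return [f"{root}: repository directory not found"]
    problems = []
    for model_dir in sorted(p for p in repo.iterdir() if p.is_dir()):
        if not (model_dir / "config.pbtxt").is_file():
            problems.append(f"{model_dir.name}: missing config.pbtxt")
        if not list(model_dir.glob("*/model.plan")):
            problems.append(f"{model_dir.name}: no <version>/model.plan found")
    return problems

for issue in check_model_repo("models"):
    print(issue)
```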