diff --git a/deployment/README.md b/deployment/README.md
new file mode 100644
index 000000000..bf8bdfb77
--- /dev/null
+++ b/deployment/README.md
@@ -0,0 +1,74 @@
+# AWML Deployment Framework
+
+AWML ships a unified, task-agnostic deployment stack that turns trained PyTorch checkpoints into production-ready ONNX and TensorRT artifacts. The verification and evaluation toolchain runs across every backend, ensuring numerical parity and consistent metrics across different projects.
+
+At the center is a shared runner/pipeline/exporter architecture that teams can extend with lightweight wrappers or workflows. CenterPoint, YOLOX, CalibrationStatusClassification, and future models plug into the same export and verification flow while still layering in task-specific logic where needed.
+
+
+## Quick Start
+
+```bash
+# Deployment entrypoint
+python -m deployment.cli.main <project> [project-specific args]
+
+# Example: CenterPoint deployment
+python -m deployment.cli.main centerpoint --rot-y-axis-reference
+```
+
+## Documentation Map
+
+| Topic | Description |
+| --- | --- |
+| [`docs/overview.md`](docs/overview.md) | Design principles, key features, precision policies. |
+| [`docs/architecture.md`](docs/architecture.md) | Workflow diagram, core components, file layout. |
+| [`docs/usage.md`](docs/usage.md) | CLI usage, runner patterns, typed contexts, export modes. |
+| [`docs/configuration.md`](docs/configuration.md) | Config structure, typed schemas, backend enums. |
+| [`docs/projects.md`](docs/projects.md) | CenterPoint, YOLOX, and Calibration deployment specifics. |
+| [`docs/export_pipeline.md`](docs/export_pipeline.md) | ONNX/TRT export steps and pipeline patterns. |
+| [`docs/verification_evaluation.md`](docs/verification_evaluation.md) | Verification scenarios, evaluation metrics, core contract. |
+| [`docs/best_practices.md`](docs/best_practices.md) | Best practices, troubleshooting, roadmap. |
+| [`docs/contributing.md`](docs/contributing.md) | How to add new deployment projects end-to-end. |
+
+Refer to `deployment/docs/README.md` for the same index.
+
+## Architecture Snapshot
+
+- **Entry point** (`deployment/cli/main.py`) loads a project bundle from `deployment/projects/<project>/`.
+- **Runtime** (`deployment/runtime/*`) coordinates load → export → verify → evaluate via shared orchestrators.
+- **Exporters** live under `exporters/common/` with typed config classes; project wrappers/pipelines compose the base exporters as needed.
+- **Pipelines** are registered by each project bundle and resolved via `PipelineFactory`.
+- **Core package** (`core/`) supplies typed configs, runtime contexts, task definitions, and shared verification utilities.
+
+See [`docs/architecture.md`](docs/architecture.md) for diagrams and component details.
+
+## Export & Verification Flow
+
+1. Load the PyTorch checkpoint and run ONNX export (single or multi-file) using the injected wrappers/pipelines.
+2. Optionally build TensorRT engines with precision policies such as `auto`, `fp16`, `fp32_tf32`, or `strongly_typed`.
+3. Register artifacts via `ArtifactManager` for downstream verification and evaluation.
+4. Run verification scenarios defined in config—pipelines are resolved by backend and device, and outputs are recursively compared with typed tolerances.
+5. Execute evaluation across enabled backends and emit typed metrics.
+
+Implementation details live in [`docs/export_pipeline.md`](docs/export_pipeline.md) and [`docs/verification_evaluation.md`](docs/verification_evaluation.md).
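+
+The entire flow is driven by the deploy config consumed by `BaseDeploymentConfig`. The sketch below is illustrative only: the section and key names follow `deployment/core/config/base_config.py`, while the file names, backend entries, and tolerances are placeholders (see [`docs/configuration.md`](docs/configuration.md) for the authoritative schema).
+
+```python
+# deploy_cfg.py (illustrative sketch)
+checkpoint_path = "work_dirs/model/checkpoint.pth"
+devices = dict(cpu="cpu", cuda="cuda:0")
+export = dict(mode="both", work_dir="work_dirs/model")
+runtime_io = dict(info_file="data/info/val_infos.pkl", sample_idx=0)
+tensorrt_config = dict(precision_policy="fp16", max_workspace_size=1 << 30)
+components = dict(
+    model=dict(
+        onnx_file="model.onnx",
+        io=dict(inputs=[dict(name="input")], outputs=[dict(name="output")]),
+    ),
+)
+verification = dict(
+    enabled=True,
+    num_verify_samples=3,
+    tolerance=0.1,
+    scenarios=dict(
+        both=[dict(ref_backend="pytorch", ref_device="cpu", test_backend="tensorrt", test_device="cuda:0")],
+    ),
+)
+evaluation = dict(
+    enabled=True,
+    num_samples=10,
+    backends=dict(tensorrt=dict(enabled=True, engine_dir="work_dirs/model/tensorrt")),
+)
+```
+
+Pass this file as the `deploy_cfg` positional argument of the CLI; the model config follows as `model_cfg`.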
+
+## Project Coverage
+
+- **CenterPoint** – multi-file export orchestrated by dedicated ONNX/TRT pipelines; see [`docs/projects.md`](docs/projects.md).
+- **YOLOX** – single-file export with output reshaping via `YOLOXOptElanONNXWrapper`.
+- **CalibrationStatusClassification** – binary classification deployment with identity wrappers and simplified pipelines.
+
+Each project ships its own deployment bundle under `deployment/projects/<project>/`.
+
+## Core Contract
+
+[`core_contract.md`](docs/core_contract.md) defines the boundaries between runners, orchestrators, evaluators, pipelines, and metrics interfaces. Follow the contract when introducing new logic to keep refactors safe and dependencies explicit.
+
+## Contributing & Best Practices
+
+- Start with [`docs/contributing.md`](docs/contributing.md) for the required files and patterns when adding a new deployment project.
+- Consult [`docs/best_practices.md`](docs/best_practices.md) for export patterns, troubleshooting tips, and roadmap items.
+- Keep documentation for project-specific quirks in the appropriate file under `deployment/docs/`.
+
+## License
+
+See LICENSE at the repository root.
diff --git a/deployment/__init__.py b/deployment/__init__.py
new file mode 100644
index 000000000..708e0b666
--- /dev/null
+++ b/deployment/__init__.py
@@ -0,0 +1,24 @@
+"""
+Autoware ML Unified Deployment Framework
+
+This package provides a unified, task-agnostic deployment framework for
+exporting, verifying, and evaluating machine learning models across different
+tasks (classification, detection, segmentation, etc.) and backends (ONNX,
+TensorRT).
+"""
+
+from deployment.core.config.base_config import BaseDeploymentConfig
+from deployment.core.evaluation.base_evaluator import BaseEvaluator
+from deployment.core.io.base_data_loader import BaseDataLoader
+from deployment.core.io.preprocessing_builder import build_preprocessing_pipeline
+from deployment.runtime.runner import BaseDeploymentRunner
+
+__all__ = [
+    "BaseDeploymentConfig",
+    "BaseDataLoader",
+    "BaseEvaluator",
+    "BaseDeploymentRunner",
+    "build_preprocessing_pipeline",
+]
+
+__version__ = "1.0.0"
diff --git a/deployment/cli/__init__.py b/deployment/cli/__init__.py
new file mode 100644
index 000000000..4f413b78e
--- /dev/null
+++ b/deployment/cli/__init__.py
@@ -0,0 +1 @@
+"""Deployment CLI package."""
diff --git a/deployment/cli/main.py b/deployment/cli/main.py
new file mode 100644
index 000000000..76e3e94db
--- /dev/null
+++ b/deployment/cli/main.py
@@ -0,0 +1,101 @@
+"""
+Single deployment entrypoint.
+
+Usage:
+    python -m deployment.cli.main <project> [project-specific args]
+"""
+
+from __future__ import annotations
+
+import argparse
+import importlib
+import pkgutil
+import sys
+import traceback
+from typing import List
+
+import deployment.projects as projects_pkg
+from deployment.core.config.base_config import parse_base_args
+from deployment.projects import project_registry
+
+
+def _discover_project_packages() -> List[str]:
+    """Discover project package names under deployment.projects (without importing them)."""
+
+    names: List[str] = []
+    for mod in pkgutil.iter_modules(projects_pkg.__path__):
+        if not mod.ispkg:
+            continue
+        if mod.name.startswith("_"):
+            continue
+        names.append(mod.name)
+    return sorted(names)
+
+
+def _import_and_register_project(project_name: str) -> None:
+    """Import project package, which should register itself into project_registry."""
+    importlib.import_module(f"deployment.projects.{project_name}")
+
+
+def build_parser() -> argparse.ArgumentParser:
+    """Build the unified deployment CLI parser.
+
+    This discovers `deployment.projects.<project>` bundles, imports them to trigger
+    registration into `deployment.projects.project_registry`, then creates a
+    subcommand per registered project.
+    """
+    parser = argparse.ArgumentParser(
+        description="AWML Deployment CLI",
+        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
+    )
+
+    subparsers = parser.add_subparsers(dest="project", required=True)
+
+    # Discover projects and import them so they can contribute args.
+    failed_projects: List[str] = []
+    for project_name in _discover_project_packages():
+        try:
+            _import_and_register_project(project_name)
+        except Exception as e:
+            tb = traceback.format_exc()
+            failed_projects.append(f"- {project_name}: {e}\n{tb}")
+            continue
+
+        try:
+            adapter = project_registry.get(project_name)
+        except KeyError:
+            continue
+
+        sub = subparsers.add_parser(project_name, help=f"{project_name} deployment")
+        parse_base_args(sub)  # adds deploy_cfg, model_cfg, --log-level
+        adapter.add_args(sub)
+        sub.set_defaults(_adapter_name=project_name)
+
+    if not project_registry.list_projects():
+        details = "\n".join(failed_projects) if failed_projects else "(no project packages discovered)"
+        raise RuntimeError(
+            "No deployment projects were registered. This usually means project imports failed.\n" f"{details}"
+        )
+
+    return parser
+
+
+def main(argv: List[str] | None = None) -> int:
+    """CLI entrypoint.
+
+    Args:
+        argv: Optional argv list (without program name). If None, uses `sys.argv[1:]`.
+
+    Returns:
+        Process exit code (0 for success).
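+
+    Example:
+        # Illustrative only: the config paths are placeholders and any extra
+        # flags come from the selected project adapter.
+        exit_code = main(["centerpoint", "configs/deploy_cfg.py", "configs/model_cfg.py"])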
+ """ + argv = sys.argv[1:] if argv is None else argv + parser = build_parser() + args = parser.parse_args(argv) + + adapter = project_registry.get(args._adapter_name) + return int(adapter.run(args) or 0) + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/deployment/core/__init__.py b/deployment/core/__init__.py new file mode 100644 index 000000000..1afb0be1a --- /dev/null +++ b/deployment/core/__init__.py @@ -0,0 +1,103 @@ +"""Core components for deployment framework.""" + +from deployment.core.artifacts import ( + Artifact, + get_component_files, + resolve_artifact_path, + resolve_engine_path, + resolve_onnx_path, +) +from deployment.core.backend import Backend +from deployment.core.config.base_config import ( + BaseDeploymentConfig, + DeviceConfig, + EvaluationConfig, + ExportConfig, + ExportMode, + RuntimeConfig, + TensorRTConfig, + VerificationConfig, + VerificationScenario, + parse_base_args, + setup_logging, +) +from deployment.core.contexts import ( + CalibrationExportContext, + CenterPointExportContext, + ExportContext, + YOLOXExportContext, +) +from deployment.core.evaluation.base_evaluator import ( + EVALUATION_DEFAULTS, + BaseEvaluator, + EvalResultDict, + EvaluationDefaults, + InferenceInput, + ModelSpec, + TaskProfile, + VerifyResultDict, +) +from deployment.core.evaluation.verification_mixin import VerificationMixin +from deployment.core.io.base_data_loader import BaseDataLoader +from deployment.core.io.preprocessing_builder import build_preprocessing_pipeline +from deployment.core.metrics import ( + BaseMetricsConfig, + BaseMetricsInterface, + ClassificationMetricsConfig, + ClassificationMetricsInterface, + Detection2DMetricsConfig, + Detection2DMetricsInterface, + Detection3DMetricsConfig, + Detection3DMetricsInterface, +) + +__all__ = [ + # Backend and configuration + "Backend", + # Typed contexts + "ExportContext", + "YOLOXExportContext", + "CenterPointExportContext", + "CalibrationExportContext", + "BaseDeploymentConfig", + "ExportConfig", + "ExportMode", + "RuntimeConfig", + "TensorRTConfig", + "DeviceConfig", + "EvaluationConfig", + "VerificationConfig", + "VerificationScenario", + "setup_logging", + "parse_base_args", + # Constants + "EVALUATION_DEFAULTS", + "EvaluationDefaults", + # Data loading + "BaseDataLoader", + # Evaluation + "BaseEvaluator", + "TaskProfile", + "InferenceInput", + "EvalResultDict", + "VerifyResultDict", + "VerificationMixin", + # Artifacts + "Artifact", + "resolve_artifact_path", + "resolve_onnx_path", + "resolve_engine_path", + "get_component_files", + "ModelSpec", + # Preprocessing + "build_preprocessing_pipeline", + # Metrics interfaces (using autoware_perception_evaluation) + "BaseMetricsInterface", + "BaseMetricsConfig", + "Detection3DMetricsInterface", + "Detection3DMetricsConfig", + "Detection2DMetricsInterface", + "Detection2DMetricsConfig", + "ClassificationMetricsInterface", + "ClassificationMetricsConfig", +] diff --git a/deployment/core/artifacts.py b/deployment/core/artifacts.py new file mode 100644 index 000000000..aa30d3239 --- /dev/null +++ b/deployment/core/artifacts.py @@ -0,0 +1,231 @@ +""" +Artifact Path Resolution for Deployment Pipelines. + +This module provides: +1. Artifact dataclass - represents an exported model artifact +2. Path resolution functions - resolve artifact paths from deploy config + +Supports: +- Single-component models (YOLOX, Calibration): use component="model" +- Multi-component models (CenterPoint): use component="voxel_encoder", "backbone_head", etc. 
+""" + +from __future__ import annotations + +import logging +import os +import os.path as osp +from dataclasses import dataclass +from typing import Any, Dict, Mapping, Optional + +logger = logging.getLogger(__name__) + + +# ============================================================================ +# Artifact Dataclass +# ============================================================================ + + +@dataclass(frozen=True) +class Artifact: + """ + Represents an exported model artifact (ONNX file, TensorRT engine, etc.). + + Attributes: + path: Filesystem path to the artifact (file or directory). + multi_file: True if artifact is a directory containing multiple files + (e.g., CenterPoint has voxel_encoder.onnx + backbone_head.onnx). + """ + + path: str + multi_file: bool = False + + @property + def exists(self) -> bool: + """Whether the artifact exists on disk.""" + return os.path.exists(self.path) + + @property + def is_directory(self) -> bool: + """Whether the artifact is a directory.""" + return os.path.isdir(self.path) + + def __str__(self) -> str: + return self.path + + +# ============================================================================ +# Path Resolution Functions +# ============================================================================ + +# File extension mapping +FILE_EXTENSIONS: Dict[str, str] = { + "onnx_file": ".onnx", + "engine_file": ".engine", +} + + +def resolve_artifact_path( + *, + base_dir: str, + components_cfg: Optional[Mapping[str, Any]], + component: str, + file_key: str, +) -> str: + """Resolve artifact path for any component. + + This is the entry point for artifact path resolution. + + Args: + base_dir: Base directory for artifacts (onnx_dir or tensorrt_dir), + or direct path to an artifact file. + components_cfg: The `components` dict from deploy_config. + Can be None for backwards compatibility. + component: Component name (e.g., 'model', 'voxel_encoder', 'backbone_head') + file_key: Key to look up ('onnx_file' or 'engine_file') + + Returns: + Resolved path to the artifact file + + Resolution strategy (single supported mode): + 1. `base_dir` must be a directory (e.g., `.../onnx` or `.../tensorrt`) + 2. Require `components_cfg[component][file_key]` to be set + - must be a relative path resolved under `base_dir` + 3. The resolved path must exist and be a file + + This function intentionally does NOT: + - scan directories for matching extensions + - fall back to default filenames + - accept `base_dir` as a file path + - accept absolute paths in `components` (enforces fully config-driven, workspace-relative artifacts) + + Examples: + # Single-component model (YOLOX) + resolve_artifact_path( + base_dir="work_dirs/yolox/onnx", + components_cfg={"model": {"onnx_file": "yolox.onnx"}}, + component="model", + file_key="onnx_file", + ) + + # Multi-component model (CenterPoint) + resolve_artifact_path( + base_dir="work_dirs/centerpoint/tensorrt", + components_cfg={"voxel_encoder": {"engine_file": "voxel.engine"}}, + component="voxel_encoder", + file_key="engine_file", + ) + """ + if not os.path.isdir(base_dir): + raise ValueError( + "Artifact resolution requires `base_dir` to be a directory. " + f"Got: {base_dir}. " + "Set evaluation.backends..{model_dir|engine_dir} to the artifact directory, " + "and set the artifact filename in deploy config under components.*.{onnx_file|engine_file}." 
+ ) + + # Require filename from components config + filename = _get_filename_from_config(components_cfg, component, file_key) + if not filename: + raise KeyError( + "Missing artifact filename in deploy config. " + f"Expected components['{component}']['{file_key}'] to be set." + ) + + if osp.isabs(filename): + raise ValueError( + "Absolute artifact paths are not allowed. " + f"Set components['{component}']['{file_key}'] to a relative filename under base_dir instead. " + f"(got: {filename})" + ) + + base_abs = osp.abspath(base_dir) + path = osp.abspath(osp.join(base_abs, filename)) + # Prevent escaping base_dir via '../' + if osp.commonpath([base_abs, path]) != base_abs: + raise ValueError( + "Artifact path must stay within base_dir. " + f"Got components['{component}']['{file_key}']={filename} which resolves to {path} outside {base_abs}." + ) + if not os.path.isfile(path): + raise FileNotFoundError( + f"Configured artifact file not found: {path}. " + f"(base_dir={base_dir}, component={component}, file_key={file_key})" + ) + return path + + +def _get_filename_from_config( + components_cfg: Optional[Mapping[str, Any]], + component: str, + file_key: str, +) -> Optional[str]: + """Extract filename from components config.""" + if not components_cfg: + return None + + comp_cfg = components_cfg.get(component, {}) + if not isinstance(comp_cfg, Mapping): + return None + + filename = comp_cfg.get(file_key) + if isinstance(filename, str) and filename: + return filename + return None + + +def get_component_files( + components_cfg: Mapping[str, Any], + file_key: str, +) -> Dict[str, str]: + """Get all component filenames for a given file type. + + Useful for multi-component models to enumerate all artifacts. + + Args: + components_cfg: The unified `components` dict from deploy_config + file_key: Key to look up ('onnx_file' or 'engine_file') + + Returns: + Dict mapping component name to filename + + Example: + >>> components = {"voxel_encoder": {"onnx_file": "voxel.onnx"}, + ... 
"backbone_head": {"onnx_file": "head.onnx"}} + >>> get_component_files(components, "onnx_file") + {"voxel_encoder": "voxel.onnx", "backbone_head": "head.onnx"} + """ + result = {} + for comp_name, comp_cfg in components_cfg.items(): + if isinstance(comp_cfg, Mapping) and file_key in comp_cfg: + result[comp_name] = comp_cfg[file_key] + return result + + +# Convenience aliases for common use cases +def resolve_onnx_path( + base_dir: str, + components_cfg: Optional[Mapping[str, Any]] = None, + component: str = "model", +) -> str: + """Convenience function for resolving ONNX paths.""" + return resolve_artifact_path( + base_dir=base_dir, + components_cfg=components_cfg, + component=component, + file_key="onnx_file", + ) + + +def resolve_engine_path( + base_dir: str, + components_cfg: Optional[Mapping[str, Any]] = None, + component: str = "model", +) -> str: + """Convenience function for resolving TensorRT engine paths.""" + return resolve_artifact_path( + base_dir=base_dir, + components_cfg=components_cfg, + component=component, + file_key="engine_file", + ) diff --git a/deployment/core/backend.py b/deployment/core/backend.py new file mode 100644 index 000000000..87cfc3109 --- /dev/null +++ b/deployment/core/backend.py @@ -0,0 +1,43 @@ +"""Backend enum used across deployment configs and runtime components.""" + +from __future__ import annotations + +from enum import Enum +from typing import Union + + +class Backend(str, Enum): + """Supported deployment backends.""" + + PYTORCH = "pytorch" + ONNX = "onnx" + TENSORRT = "tensorrt" + + @classmethod + def from_value(cls, value: Union[str, Backend]) -> Backend: + """ + Normalize backend identifiers coming from configs or enums. + + Args: + value: Backend as string or Backend enum + + Returns: + Backend enum instance + + Raises: + ValueError: If value cannot be mapped to a supported backend + """ + if isinstance(value, cls): + return value + + if isinstance(value, str): + normalized = value.strip().lower() + try: + return cls(normalized) + except ValueError as exc: + raise ValueError(f"Unsupported backend '{value}'. Expected one of {[b.value for b in cls]}.") from exc + + raise TypeError(f"Backend must be a string or Backend enum, got {type(value)}") + + def __str__(self) -> str: # pragma: no cover - convenience for logging + return self.value diff --git a/deployment/core/config/__init__.py b/deployment/core/config/__init__.py new file mode 100644 index 000000000..7859ffe7e --- /dev/null +++ b/deployment/core/config/__init__.py @@ -0,0 +1,32 @@ +"""Configuration subpackage for deployment core.""" + +from deployment.core.config.base_config import ( + BaseDeploymentConfig, + EvaluationConfig, + ExportConfig, + ExportMode, + PrecisionPolicy, + RuntimeConfig, + TensorRTConfig, + VerificationConfig, + VerificationScenario, + parse_base_args, + setup_logging, +) +from deployment.core.evaluation.base_evaluator import EVALUATION_DEFAULTS, EvaluationDefaults + +__all__ = [ + "TensorRTConfig", + "BaseDeploymentConfig", + "EvaluationConfig", + "ExportConfig", + "ExportMode", + "PrecisionPolicy", + "VerificationConfig", + "VerificationScenario", + "parse_base_args", + "setup_logging", + "EVALUATION_DEFAULTS", + "EvaluationDefaults", + "RuntimeConfig", +] diff --git a/deployment/core/config/base_config.py b/deployment/core/config/base_config.py new file mode 100644 index 000000000..65fcf776b --- /dev/null +++ b/deployment/core/config/base_config.py @@ -0,0 +1,661 @@ +""" +Base configuration classes for deployment framework. 
+ +This module provides the foundation for task-agnostic deployment configuration. +Task-specific deployment configs should extend BaseDeploymentConfig. +""" + +from __future__ import annotations + +import argparse +import logging +from dataclasses import dataclass, field +from enum import Enum +from types import MappingProxyType +from typing import Any, Dict, Mapping, Optional, Tuple, Union + +import torch +from mmengine.config import Config + +from deployment.core.backend import Backend +from deployment.exporters.common.configs import ( + ONNXExportConfig, + TensorRTExportConfig, + TensorRTModelInputConfig, + TensorRTProfileConfig, +) + +# Constants +DEFAULT_WORKSPACE_SIZE = 1 << 30 # 1 GB + + +def _empty_mapping() -> Mapping[Any, Any]: + """Return an immutable empty mapping.""" + return MappingProxyType({}) + + +class PrecisionPolicy(str, Enum): + """Precision policy options for TensorRT.""" + + AUTO = "auto" + FP16 = "fp16" + FP32_TF32 = "fp32_tf32" + STRONGLY_TYPED = "strongly_typed" + + +class ExportMode(str, Enum): + """Export pipeline modes.""" + + ONNX = "onnx" + TRT = "trt" + BOTH = "both" + NONE = "none" + + @classmethod + def from_value(cls, value: Optional[Union[str, ExportMode]]) -> ExportMode: + """Parse strings or enum members into ExportMode (defaults to BOTH).""" + if value is None: + return cls.BOTH + if isinstance(value, cls): + return value + if isinstance(value, str): + normalized = value.strip().lower() + for member in cls: + if member.value == normalized: + return member + raise ValueError(f"Invalid export mode '{value}'. Must be one of {[m.value for m in cls]}.") + + +# Precision policy mapping for TensorRT +PRECISION_POLICIES = { + PrecisionPolicy.AUTO.value: {}, # No special flags, TensorRT decides + PrecisionPolicy.FP16.value: {"FP16": True}, + PrecisionPolicy.FP32_TF32.value: {"TF32": True}, # TF32 for FP32 operations + PrecisionPolicy.STRONGLY_TYPED.value: {"STRONGLY_TYPED": True}, # Network creation flag +} + + +@dataclass(frozen=True) +class ExportConfig: + """Configuration for model export settings.""" + + mode: ExportMode = ExportMode.BOTH + work_dir: str = "work_dirs" + onnx_path: Optional[str] = None + + @classmethod + def from_dict(cls, config_dict: Mapping[str, Any]) -> ExportConfig: + """Create ExportConfig from dict.""" + return cls( + mode=ExportMode.from_value(config_dict.get("mode", ExportMode.BOTH)), + work_dir=config_dict.get("work_dir", cls.work_dir), + onnx_path=config_dict.get("onnx_path"), + ) + + @property + def should_export_onnx(self) -> bool: + """Whether ONNX export is requested.""" + return self.mode in (ExportMode.ONNX, ExportMode.BOTH) + + @property + def should_export_tensorrt(self) -> bool: + """Whether TensorRT export is requested.""" + return self.mode in (ExportMode.TRT, ExportMode.BOTH) + + +@dataclass(frozen=True) +class DeviceConfig: + """Normalized device settings shared across deployment stages.""" + + cpu: str = "cpu" + cuda: Optional[str] = "cuda:0" + + def __post_init__(self) -> None: + object.__setattr__(self, "cpu", self._normalize_cpu(self.cpu)) + object.__setattr__(self, "cuda", self._normalize_cuda(self.cuda)) + + @classmethod + def from_dict(cls, config_dict: Mapping[str, Any]) -> DeviceConfig: + """Create DeviceConfig from dict.""" + return cls(cpu=config_dict.get("cpu", cls.cpu), cuda=config_dict.get("cuda", cls.cuda)) + + @staticmethod + def _normalize_cpu(device: Optional[str]) -> str: + """Normalize CPU device string.""" + if not device: + return "cpu" + normalized = str(device).strip().lower() + if 
normalized.startswith("cuda"): + raise ValueError("CPU device cannot be a CUDA device") + return normalized + + @staticmethod + def _normalize_cuda(device: Optional[str]) -> Optional[str]: + """Normalize CUDA device string to 'cuda:N' format.""" + if device is None: + return None + if not isinstance(device, str): + raise ValueError("cuda device must be a string (e.g., 'cuda:0')") + normalized = device.strip().lower() + if normalized == "": + return None + if normalized == "cuda": + normalized = "cuda:0" + if not normalized.startswith("cuda"): + raise ValueError(f"Invalid CUDA device '{device}'. Must start with 'cuda'") + suffix = normalized.split(":", 1)[1] if ":" in normalized else "0" + suffix = suffix.strip() or "0" + if not suffix.isdigit(): + raise ValueError(f"Invalid CUDA device index in '{device}'") + device_id = int(suffix) + if device_id < 0: + raise ValueError("CUDA device index must be non-negative") + return f"cuda:{device_id}" + + @property + def cuda_device_index(self) -> Optional[int]: + """Return CUDA device index as integer (if configured).""" + if self.cuda is None: + return None + return int(self.cuda.split(":", 1)[1]) + + +@dataclass(frozen=True) +class RuntimeConfig: + """Configuration for runtime I/O settings.""" + + info_file: str = "" + sample_idx: int = 0 + + @classmethod + def from_dict(cls, config_dict: Mapping[str, Any]) -> RuntimeConfig: + """Create RuntimeConfig from dictionary.""" + return cls( + info_file=config_dict.get("info_file", ""), + sample_idx=config_dict.get("sample_idx", 0), + ) + + +@dataclass(frozen=True) +class TensorRTConfig: + """ + Configuration for TensorRT backend-specific settings. + + Uses config structure: + tensorrt_config = dict(precision_policy="auto", max_workspace_size=1<<30) + + TensorRT profiles are defined in components.*.tensorrt_profile. + + Note: + The deploy config key for this section is **`tensorrt_config`**. + """ + + precision_policy: str = PrecisionPolicy.AUTO.value + max_workspace_size: int = DEFAULT_WORKSPACE_SIZE + + def __post_init__(self) -> None: + """Validate TensorRT precision policy at construction time.""" + if self.precision_policy not in PRECISION_POLICIES: + raise ValueError( + f"Invalid precision_policy '{self.precision_policy}'. 
" + f"Must be one of {list(PRECISION_POLICIES.keys())}" + ) + + @classmethod + def from_dict(cls, config_dict: Mapping[str, Any]) -> TensorRTConfig: + return cls( + precision_policy=config_dict.get("precision_policy", PrecisionPolicy.AUTO.value), + max_workspace_size=config_dict.get("max_workspace_size", DEFAULT_WORKSPACE_SIZE), + ) + + @property + def precision_flags(self) -> Mapping[str, bool]: + """TensorRT precision flags for the configured policy.""" + return PRECISION_POLICIES[self.precision_policy] + + +@dataclass(frozen=True) +class EvaluationConfig: + """Typed configuration for evaluation settings.""" + + enabled: bool = False + num_samples: int = 10 + verbose: bool = False + backends: Mapping[Any, Mapping[str, Any]] = field(default_factory=_empty_mapping) + models: Mapping[Any, Any] = field(default_factory=_empty_mapping) + devices: Mapping[str, str] = field(default_factory=_empty_mapping) + + @classmethod + def from_dict(cls, config_dict: Mapping[str, Any]) -> EvaluationConfig: + backends_raw = config_dict.get("backends", None) + if backends_raw is None: + backends_raw = {} + if not isinstance(backends_raw, Mapping): + raise TypeError(f"evaluation.backends must be a mapping, got {type(backends_raw).__name__}") + backends_frozen = {key: MappingProxyType(dict(value)) for key, value in backends_raw.items()} + + models_raw = config_dict.get("models", None) + if models_raw is None: + models_raw = {} + if not isinstance(models_raw, Mapping): + raise TypeError(f"evaluation.models must be a mapping, got {type(models_raw).__name__}") + + devices_raw = config_dict.get("devices", None) + if devices_raw is None: + devices_raw = {} + if not isinstance(devices_raw, Mapping): + raise TypeError(f"evaluation.devices must be a mapping, got {type(devices_raw).__name__}") + + return cls( + enabled=config_dict.get("enabled", False), + num_samples=config_dict.get("num_samples", 10), + verbose=config_dict.get("verbose", False), + backends=MappingProxyType(backends_frozen), + models=MappingProxyType(dict(models_raw)), + devices=MappingProxyType(dict(devices_raw)), + ) + + +@dataclass(frozen=True) +class VerificationConfig: + """Typed configuration for verification settings.""" + + enabled: bool = True + num_verify_samples: int = 3 + tolerance: float = 0.1 + devices: Mapping[str, str] = field(default_factory=_empty_mapping) + scenarios: Mapping[ExportMode, Tuple[VerificationScenario, ...]] = field(default_factory=_empty_mapping) + + @classmethod + def from_dict(cls, config_dict: Mapping[str, Any]) -> VerificationConfig: + scenarios_raw = config_dict.get("scenarios") + if scenarios_raw is None: + scenarios_raw = {} + if not isinstance(scenarios_raw, Mapping): + raise TypeError(f"verification.scenarios must be a mapping, got {type(scenarios_raw).__name__}") + + scenario_map: Dict[ExportMode, Tuple[VerificationScenario, ...]] = {} + for mode_key, scenario_list in scenarios_raw.items(): + mode = ExportMode.from_value(mode_key) + if scenario_list is None: + scenario_list = [] + elif not isinstance(scenario_list, (list, tuple)): + raise TypeError( + f"verification.scenarios.{mode_key} must be a list or tuple, got {type(scenario_list).__name__}" + ) + scenario_entries = tuple(VerificationScenario.from_dict(entry) for entry in scenario_list) + scenario_map[mode] = scenario_entries + + devices_raw = config_dict.get("devices") + if devices_raw is None: + devices_raw = {} + if not isinstance(devices_raw, Mapping): + raise TypeError(f"verification.devices must be a mapping, got {type(devices_raw).__name__}") + + 
return cls( + enabled=config_dict.get("enabled", True), + num_verify_samples=config_dict.get("num_verify_samples", 3), + tolerance=config_dict.get("tolerance", 0.1), + devices=MappingProxyType(dict(devices_raw)), + scenarios=MappingProxyType(scenario_map), + ) + + def get_scenarios(self, mode: ExportMode) -> Tuple[VerificationScenario, ...]: + """Return scenarios for a specific export mode.""" + return self.scenarios.get(mode, ()) + + +@dataclass(frozen=True) +class VerificationScenario: + """Immutable verification scenario specification.""" + + ref_backend: Backend + ref_device: str + test_backend: Backend + test_device: str + + @classmethod + def from_dict(cls, data: Mapping[str, Any]) -> VerificationScenario: + missing_keys = {"ref_backend", "ref_device", "test_backend", "test_device"} - data.keys() + if missing_keys: + raise ValueError(f"Verification scenario missing keys: {missing_keys}") + + return cls( + ref_backend=Backend.from_value(data["ref_backend"]), + ref_device=str(data["ref_device"]), + test_backend=Backend.from_value(data["test_backend"]), + test_device=str(data["test_device"]), + ) + + +class BaseDeploymentConfig: + """ + Base configuration container for deployment settings. + + This class provides a task-agnostic interface for deployment configuration. + Task-specific configs should extend this class and add task-specific settings. + + Attributes: + checkpoint_path: Single source of truth for the PyTorch checkpoint path. + Used by both export (for ONNX conversion) and evaluation + (for PyTorch backend). Defined at top-level of deploy config. + """ + + def __init__(self, deploy_cfg: Config): + """ + Initialize deployment configuration. + + Args: + deploy_cfg: MMEngine Config object containing deployment settings + """ + self.deploy_cfg = deploy_cfg + self._validate_config() + + self._checkpoint_path: Optional[str] = deploy_cfg.get("checkpoint_path") + self._device_config = DeviceConfig.from_dict(deploy_cfg.get("devices", {})) + + # Initialize config sections + self.export_config = ExportConfig.from_dict(deploy_cfg.get("export", {})) + self.runtime_config = RuntimeConfig.from_dict(deploy_cfg.get("runtime_io", {})) + self.tensorrt_config = TensorRTConfig.from_dict(deploy_cfg.get("tensorrt_config", {})) + self._evaluation_config = EvaluationConfig.from_dict(deploy_cfg.get("evaluation", {})) + self._verification_config = VerificationConfig.from_dict(deploy_cfg.get("verification", {})) + + self._validate_cuda_device() + + def _validate_config(self) -> None: + """Validate configuration structure and required fields.""" + # Validate required sections + if "export" not in self.deploy_cfg: + raise ValueError( + "Missing 'export' section in deploy config. " "Please update your config to include 'export' section." + ) + + # Validate export mode + try: + ExportMode.from_value(self.deploy_cfg.get("export", {}).get("mode", ExportMode.BOTH)) + except ValueError as exc: + raise ValueError(str(exc)) from exc + + # Validate precision policy if present + tensorrt_config = self.deploy_cfg.get("tensorrt_config") + if tensorrt_config is None: + tensorrt_config = {} + if not isinstance(tensorrt_config, Mapping): + raise TypeError(f"tensorrt_config must be a mapping, got {type(tensorrt_config).__name__}") + precision_policy = tensorrt_config.get("precision_policy", PrecisionPolicy.AUTO.value) + if precision_policy not in PRECISION_POLICIES: + raise ValueError( + f"Invalid precision_policy '{precision_policy}'. 
" f"Must be one of {list(PRECISION_POLICIES.keys())}" + ) + + def _validate_cuda_device(self) -> None: + """Validate CUDA device availability once at config stage.""" + if not self._needs_cuda_device(): + return + + cuda_device = self.devices.cuda + device_idx = self.devices.cuda_device_index + + if cuda_device is None or device_idx is None: + raise RuntimeError( + "CUDA device is required (TensorRT export/verification/evaluation enabled) but no CUDA device was" + " configured in deploy_cfg.devices." + ) + + if not torch.cuda.is_available(): + raise RuntimeError( + "CUDA device is required (TensorRT export/verification/evaluation enabled) " + "but torch.cuda.is_available() returned False." + ) + + device_count = torch.cuda.device_count() + if device_idx >= device_count: + raise ValueError( + f"Requested CUDA device '{cuda_device}' but only {device_count} CUDA device(s) are available." + ) + + def _needs_cuda_device(self) -> bool: + """Determine if current deployment config requires a CUDA device.""" + if self.export_config.should_export_tensorrt: + return True + + evaluation_cfg = self.evaluation_config + backends_cfg = evaluation_cfg.backends + tensorrt_backend = backends_cfg.get(Backend.TENSORRT.value) or backends_cfg.get(Backend.TENSORRT, {}) + if tensorrt_backend and tensorrt_backend.get("enabled", False): + return True + + verification_cfg = self.verification_config + + for scenario_list in verification_cfg.scenarios.values(): + for scenario in scenario_list: + if Backend.TENSORRT in (scenario.ref_backend, scenario.test_backend): + return True + + return False + + @property + def checkpoint_path(self) -> Optional[str]: + """ + Get checkpoint path - single source of truth for PyTorch model. + + This path is used by: + - Export pipeline: to load the PyTorch model for ONNX conversion + - Evaluation: for PyTorch backend evaluation + - Verification: when PyTorch is used as reference or test backend + + Returns: + Path to the PyTorch checkpoint file, or None if not configured + """ + return self._checkpoint_path + + @property + def evaluation_config(self) -> EvaluationConfig: + """Get evaluation configuration.""" + return self._evaluation_config + + @property + def onnx_config(self) -> Mapping[str, Any]: + """Get ONNX configuration.""" + return self.deploy_cfg.get("onnx_config", {}) + + @property + def verification_config(self) -> VerificationConfig: + """Get verification configuration.""" + return self._verification_config + + @property + def devices(self) -> DeviceConfig: + """Get normalized device settings.""" + return self._device_config + + @property + def evaluation_backends(self) -> Mapping[Any, Mapping[str, Any]]: + """ + Get evaluation backends configuration. + + Returns: + Dictionary mapping backend names to their configuration + """ + return self.evaluation_config.backends + + def get_verification_scenarios(self, export_mode: ExportMode) -> Tuple[VerificationScenario, ...]: + """ + Get verification scenarios for the given export mode. + + Args: + export_mode: Export mode (`ExportMode`) + + Returns: + Tuple of verification scenarios + """ + return self.verification_config.get_scenarios(export_mode) + + @property + def task_type(self) -> Optional[str]: + """Get task type for pipeline building.""" + return self.deploy_cfg.get("task_type") + + def get_onnx_settings(self) -> ONNXExportConfig: + """ + Get ONNX export settings from unified components configuration. 
+ + Reads I/O from components.model.io.{inputs, outputs, dynamic_axes} + + Returns: + ONNXExportConfig instance containing ONNX export parameters + """ + onnx_config = self.onnx_config + components_io = self._get_model_io_from_components() + + # Get input/output names from components + input_names = [inp.get("name", "input") for inp in components_io.get("inputs", [])] + output_names = [out.get("name", "output") for out in components_io.get("outputs", [])] + + # Fallback to defaults if components not configured + if not input_names: + input_names = ["input"] + if not output_names: + output_names = ["output"] + + settings_dict = { + "opset_version": onnx_config.get("opset_version", 16), + "do_constant_folding": onnx_config.get("do_constant_folding", True), + "input_names": tuple(input_names), + "output_names": tuple(output_names), + "dynamic_axes": components_io.get("dynamic_axes"), + "export_params": onnx_config.get("export_params", True), + "keep_initializers_as_inputs": onnx_config.get("keep_initializers_as_inputs", False), + "verbose": onnx_config.get("verbose", False), + "save_file": components_io.get("onnx_file") or onnx_config.get("save_file", "model.onnx"), + "batch_size": None, + } + + if "simplify" in onnx_config: + settings_dict["simplify"] = onnx_config["simplify"] + + return ONNXExportConfig.from_mapping(settings_dict) + + def _get_model_io_from_components(self) -> Dict[str, Any]: + """ + Extract model I/O configuration from components. + + For end-to-end models (single component), returns the io config + from components.model. + + Returns: + Dictionary with inputs, outputs, dynamic_axes, and onnx_file. + """ + components = self.deploy_cfg.get("components", {}) + if not components: + return {} + + # For single-component models, look for 'model' component + if "model" in components: + comp_cfg = components["model"] + io_cfg = comp_cfg.get("io", {}) + return { + "inputs": io_cfg.get("inputs", None), + "outputs": io_cfg.get("outputs", None), + "dynamic_axes": io_cfg.get("dynamic_axes"), + "onnx_file": comp_cfg.get("onnx_file"), + } + + return {} + + def get_tensorrt_settings(self) -> TensorRTExportConfig: + """ + Get TensorRT export settings from unified components configuration. + + TensorRT profiles are read from components.model.tensorrt_profile. + + Returns: + TensorRTExportConfig instance containing TensorRT export parameters + """ + model_inputs = self._build_model_inputs() + + settings_dict = { + "max_workspace_size": self.tensorrt_config.max_workspace_size, + "precision_policy": self.tensorrt_config.precision_policy, + "policy_flags": self.tensorrt_config.precision_flags, + "model_inputs": model_inputs, + } + return TensorRTExportConfig.from_mapping(settings_dict) + + def _build_model_inputs(self) -> Optional[Tuple[TensorRTModelInputConfig, ...]]: + """ + Build model_inputs from components configuration. + + For end-to-end models (single component), extracts tensorrt_profile + from components.model and converts to TensorRTModelInputConfig format. + + Returns: + Tuple of TensorRTModelInputConfig, or None if not configured. 
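+
+        Example (illustrative input name and shapes; actual values are project-specific):
+            components = dict(
+                model=dict(
+                    tensorrt_profile=dict(
+                        input=dict(
+                            min_shape=[1, 3, 416, 416],
+                            opt_shape=[1, 3, 416, 416],
+                            max_shape=[1, 3, 416, 416],
+                        ),
+                    ),
+                ),
+            )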
+ """ + components = self.deploy_cfg.get("components", {}) + if not components or "model" not in components: + return None + + comp_cfg = components["model"] + tensorrt_profile = comp_cfg.get("tensorrt_profile", {}) + + if not tensorrt_profile: + return None + + input_shapes = {} + for input_name, shape_cfg in tensorrt_profile.items(): + if isinstance(shape_cfg, Mapping): + input_shapes[input_name] = TensorRTProfileConfig( + min_shape=shape_cfg.get("min_shape", None), + opt_shape=shape_cfg.get("opt_shape", None), + max_shape=shape_cfg.get("max_shape", None), + ) + + if input_shapes: + return (TensorRTModelInputConfig(input_shapes=MappingProxyType(input_shapes)),) + + return None + + +def setup_logging(level: str = "INFO") -> logging.Logger: + """ + Setup logging configuration. + + Args: + level: Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) + + Returns: + Configured logger instance + """ + logging.basicConfig(level=getattr(logging, level), format="%(levelname)s:%(name)s:%(message)s") + return logging.getLogger("deployment") + + +def parse_base_args(parser: Optional[argparse.ArgumentParser] = None) -> argparse.ArgumentParser: + """ + Create argument parser with common deployment arguments. + + Args: + parser: Optional existing ArgumentParser to add arguments to + + Returns: + ArgumentParser with deployment arguments + """ + if parser is None: + parser = argparse.ArgumentParser( + description="Deploy model to ONNX/TensorRT", + formatter_class=argparse.ArgumentDefaultsHelpFormatter, + ) + + parser.add_argument("deploy_cfg", help="Deploy config path") + parser.add_argument("model_cfg", help="Model config path") + # Optional overrides + parser.add_argument( + "--log-level", + default="INFO", + choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"], + help="Logging level", + ) + + return parser diff --git a/deployment/core/contexts.py b/deployment/core/contexts.py new file mode 100644 index 000000000..8df2f8a23 --- /dev/null +++ b/deployment/core/contexts.py @@ -0,0 +1,80 @@ +""" +Typed context objects for deployment workflows. + +Usage: + # Create context for export + ctx = ExportContext(sample_idx=0) + + # Project-specific context + ctx = CenterPointExportContext(rot_y_axis_reference=True) + + # Pass to orchestrator + result = export_orchestrator.run(ctx) +""" + +from __future__ import annotations + +from dataclasses import dataclass, field +from types import MappingProxyType +from typing import Any, Mapping, Optional + + +@dataclass(frozen=True) +class ExportContext: + """ + Base context for export operations. + + This context carries parameters needed during the export workflow, + including model loading and ONNX/TensorRT export settings. + + Attributes: + sample_idx: Index of sample to use for tracing/shape inference (default: 0) + extra: Dictionary for project-specific or debug-only options that don't + warrant a dedicated field. Use sparingly. + """ + + sample_idx: int = 0 + extra: Mapping[str, Any] = field(default_factory=lambda: MappingProxyType({})) + + def get(self, key: str, default: Any = None) -> Any: + """Get a value from extra dict with a default.""" + return self.extra.get(key, default) + + +@dataclass(frozen=True) +class YOLOXExportContext(ExportContext): + """ + YOLOX-specific export context. + + Attributes: + model_cfg_path: Path to model configuration file. If None, attempts + to extract from model_cfg.filename. 
+ """ + + model_cfg: Optional[str] = None + + +@dataclass(frozen=True) +class CenterPointExportContext(ExportContext): + """ + CenterPoint-specific export context. + + Attributes: + rot_y_axis_reference: Whether to use y-axis rotation reference for + ONNX-compatible output format. This affects + how rotation and dimensions are encoded. + """ + + rot_y_axis_reference: bool = False + + +@dataclass(frozen=True) +class CalibrationExportContext(ExportContext): + """ + Calibration model export context. + + Currently uses only base ExportContext fields. + Extend with calibration-specific parameters as needed. + """ + + pass diff --git a/deployment/core/evaluation/__init__.py b/deployment/core/evaluation/__init__.py new file mode 100644 index 000000000..1125ce5e0 --- /dev/null +++ b/deployment/core/evaluation/__init__.py @@ -0,0 +1,18 @@ +"""Evaluation subpackage for deployment core.""" + +from deployment.core.evaluation.base_evaluator import BaseEvaluator, TaskProfile +from deployment.core.evaluation.evaluator_types import ( + EvalResultDict, + ModelSpec, + VerifyResultDict, +) +from deployment.core.evaluation.verification_mixin import VerificationMixin + +__all__ = [ + "BaseEvaluator", + "TaskProfile", + "EvalResultDict", + "ModelSpec", + "VerifyResultDict", + "VerificationMixin", +] diff --git a/deployment/core/evaluation/base_evaluator.py b/deployment/core/evaluation/base_evaluator.py new file mode 100644 index 000000000..e15accf41 --- /dev/null +++ b/deployment/core/evaluation/base_evaluator.py @@ -0,0 +1,332 @@ +""" +Base evaluator for model evaluation in deployment. + +This module provides: +- Type definitions (EvalResultDict, VerifyResultDict, ModelSpec) +- BaseEvaluator: the single base class for all task evaluators +- TaskProfile: describes task-specific metadata + +All project evaluators should extend BaseEvaluator and implement +the required hooks for their specific task. +""" + +from __future__ import annotations + +import logging +from abc import ABC, abstractmethod +from dataclasses import dataclass +from typing import Any, Dict, List, Mapping, Optional, Tuple, Union + +import numpy as np +import torch + +from deployment.core.backend import Backend +from deployment.core.evaluation.evaluator_types import ( + EvalResultDict, + InferenceInput, + InferenceResult, + LatencyBreakdown, + LatencyStats, + ModelSpec, + VerifyResultDict, +) +from deployment.core.evaluation.verification_mixin import VerificationMixin +from deployment.core.io.base_data_loader import BaseDataLoader +from deployment.core.metrics import BaseMetricsInterface + +# Re-export types +__all__ = [ + "EvalResultDict", + "VerifyResultDict", + "ModelSpec", + "InferenceInput", + "InferenceResult", + "LatencyStats", + "LatencyBreakdown", + "TaskProfile", + "BaseEvaluator", + "EvaluationDefaults", + "EVALUATION_DEFAULTS", +] + +logger = logging.getLogger(__name__) + + +@dataclass(frozen=True) +class EvaluationDefaults: + """Default values for evaluation settings.""" + + LOG_INTERVAL: int = 50 + GPU_CLEANUP_INTERVAL: int = 10 + + +EVALUATION_DEFAULTS = EvaluationDefaults() + + +@dataclass(frozen=True) +class TaskProfile: + """ + Profile describing task-specific evaluation behavior. + + Attributes: + task_name: Internal identifier for the task + class_names: Tuple of class names for the task + num_classes: Number of classes + display_name: Human-readable name for display (defaults to task_name) + """ + + task_name: str + class_names: Tuple[str, ...] 
+ num_classes: int + display_name: str = "" + + def __post_init__(self): + if not self.display_name: + object.__setattr__(self, "display_name", self.task_name) + + +class BaseEvaluator(VerificationMixin, ABC): + """ + Base class for all task-specific evaluators. + + This class provides: + - A unified evaluation loop (iterate samples → infer → accumulate → compute metrics) + - Verification support via VerificationMixin + - Common utilities (latency stats, model device management) + + Subclasses implement task-specific hooks: + - _create_pipeline(): Create backend-specific pipeline + - _prepare_input(): Prepare model input from sample + - _parse_predictions(): Normalize pipeline output + - _parse_ground_truths(): Extract ground truth from sample + - _add_to_interface(): Feed a single frame to the metrics interface + - _build_results(): Construct final results dict from interface metrics + - print_results(): Format and display results + """ + + def __init__( + self, + metrics_interface: BaseMetricsInterface, + task_profile: TaskProfile, + model_cfg: Any, + ): + """ + Initialize evaluator. + + Args: + metrics_interface: Metrics interface for computing task-specific metrics + task_profile: Profile describing the task + model_cfg: Model configuration (MMEngine Config or similar) + """ + self.metrics_interface = metrics_interface + self.task_profile = task_profile + self.model_cfg = model_cfg + self.pytorch_model: Any = None + + @property + def class_names(self) -> Tuple[str, ...]: + """Get class names from task profile.""" + return self.task_profile.class_names + + def set_pytorch_model(self, pytorch_model: Any) -> None: + """Set PyTorch model (called by deployment runner).""" + self.pytorch_model = pytorch_model + + def _ensure_model_on_device(self, device: str) -> Any: + """Ensure PyTorch model is on the correct device.""" + if self.pytorch_model is None: + raise RuntimeError( + f"{self.__class__.__name__}.pytorch_model is None. " + "DeploymentRunner must set evaluator.pytorch_model before calling verify/evaluate." + ) + + current_device = next(self.pytorch_model.parameters()).device + target_device = torch.device(device) + + if current_device != target_device: + logger.info(f"Moving PyTorch model from {current_device} to {target_device}") + self.pytorch_model = self.pytorch_model.to(target_device) + + return self.pytorch_model + + # ================== Abstract Methods (Task-Specific) ================== + + @abstractmethod + def _create_pipeline(self, model_spec: ModelSpec, device: str) -> Any: + """Create a pipeline for the specified backend.""" + raise NotImplementedError + + @abstractmethod + def _prepare_input( + self, + sample: Mapping[str, Any], + data_loader: BaseDataLoader, + device: str, + ) -> InferenceInput: + """Prepare model input from a sample. 
+ + Returns: + InferenceInput containing: + - data: The actual input data (e.g., points tensor) + - metadata: Sample metadata forwarded to postprocess() + """ + raise NotImplementedError + + @abstractmethod + def _parse_predictions(self, pipeline_output: Any) -> Any: + """Normalize pipeline output to standard format.""" + raise NotImplementedError + + @abstractmethod + def _parse_ground_truths(self, gt_data: Mapping[str, Any]) -> Any: + """Extract ground truth from sample data.""" + raise NotImplementedError + + @abstractmethod + def _add_to_interface(self, predictions: Any, ground_truths: Any) -> None: + """Add a single frame to the metrics interface.""" + raise NotImplementedError + + @abstractmethod + def _build_results( + self, + latencies: List[float], + latency_breakdowns: List[Dict[str, float]], + num_samples: int, + ) -> EvalResultDict: + """Build final results dict from interface metrics.""" + raise NotImplementedError + + @abstractmethod + def print_results(self, results: EvalResultDict) -> None: + """Pretty print evaluation results.""" + raise NotImplementedError + + # ================== VerificationMixin Implementation ================== + + def _create_pipeline_for_verification( + self, + model_spec: ModelSpec, + device: str, + log: logging.Logger, + ) -> Any: + """Create pipeline for verification.""" + self._ensure_model_on_device(device) + return self._create_pipeline(model_spec, device) + + def _get_verification_input( + self, + sample_idx: int, + data_loader: BaseDataLoader, + device: str, + ) -> InferenceInput: + """Get verification input.""" + sample = data_loader.load_sample(sample_idx) + return self._prepare_input(sample, data_loader, device) + + # ================== Core Evaluation Loop ================== + + def evaluate( + self, + model: ModelSpec, + data_loader: BaseDataLoader, + num_samples: int, + verbose: bool = False, + ) -> EvalResultDict: + """ + Run evaluation on the specified model. 
+ + Args: + model: Model specification (backend/device/path) + data_loader: Data loader for samples + num_samples: Number of samples to evaluate + verbose: Whether to print progress + + Returns: + Evaluation results dictionary + """ + logger.info(f"\nEvaluating {model.backend.value} model: {model.path}") + logger.info(f"Number of samples: {num_samples}") + + self._ensure_model_on_device(model.device) + pipeline = self._create_pipeline(model, model.device) + self.metrics_interface.reset() + + latencies = [] + latency_breakdowns = [] + + actual_samples = min(num_samples, data_loader.num_samples) + + for idx in range(actual_samples): + if verbose and idx % EVALUATION_DEFAULTS.LOG_INTERVAL == 0: + logger.info(f"Processing sample {idx + 1}/{actual_samples}") + + sample = data_loader.load_sample(idx) + inference_input = self._prepare_input(sample, data_loader, model.device) + + if "ground_truth" not in sample: + raise KeyError("DataLoader.load_sample() must return 'ground_truth' for evaluation.") + gt_data = sample.get("ground_truth") + ground_truths = self._parse_ground_truths(gt_data) + + infer_result = pipeline.infer(inference_input.data, metadata=inference_input.metadata) + latencies.append(infer_result.latency_ms) + if infer_result.breakdown: + latency_breakdowns.append(infer_result.breakdown) + + predictions = self._parse_predictions(infer_result.output) + self._add_to_interface(predictions, ground_truths) + + if model.backend is Backend.TENSORRT and idx % EVALUATION_DEFAULTS.GPU_CLEANUP_INTERVAL == 0: + if torch.cuda.is_available(): + torch.cuda.empty_cache() + + # Cleanup pipeline resources + try: + pipeline.cleanup() + except Exception as e: + logger.warning(f"Error during pipeline cleanup: {e}") + + return self._build_results(latencies, latency_breakdowns, actual_samples) + + # ================== Utilities ================== + + def compute_latency_stats(self, latencies: List[float]) -> LatencyStats: + """Compute latency statistics from a list of measurements.""" + if not latencies: + return LatencyStats.empty() + + arr = np.array(latencies) + return LatencyStats( + mean_ms=float(np.mean(arr)), + std_ms=float(np.std(arr)), + min_ms=float(np.min(arr)), + max_ms=float(np.max(arr)), + median_ms=float(np.median(arr)), + ) + + def _compute_latency_breakdown( + self, + latency_breakdowns: List[Dict[str, float]], + ) -> LatencyBreakdown: + """Compute statistics for each latency stage.""" + if not latency_breakdowns: + return LatencyBreakdown.empty() + + stage_order = list(dict.fromkeys(stage for breakdown in latency_breakdowns for stage in breakdown.keys())) + + return LatencyBreakdown( + stages={ + stage: self.compute_latency_stats([bd[stage] for bd in latency_breakdowns if stage in bd]) + for stage in stage_order + } + ) + + def format_latency_stats(self, stats: Union[Mapping[str, float], LatencyStats]) -> str: + """Format latency statistics as a readable string.""" + stats_dict = stats.to_dict() if isinstance(stats, LatencyStats) else stats + return ( + f"Latency: {stats_dict['mean_ms']:.2f} ± {stats_dict['std_ms']:.2f} ms " + f"(min: {stats_dict['min_ms']:.2f}, max: {stats_dict['max_ms']:.2f}, " + f"median: {stats_dict['median_ms']:.2f})" + ) diff --git a/deployment/core/evaluation/evaluator_types.py b/deployment/core/evaluation/evaluator_types.py new file mode 100644 index 000000000..10eed0d1f --- /dev/null +++ b/deployment/core/evaluation/evaluator_types.py @@ -0,0 +1,154 @@ +""" +Type definitions for model evaluation in deployment. 
+ +This module contains the shared type definitions used by evaluators, +runners, and orchestrators. +""" + +from __future__ import annotations + +from dataclasses import asdict, dataclass, field +from typing import Any, Dict, Mapping, Optional, TypedDict + +from deployment.core.artifacts import Artifact +from deployment.core.backend import Backend + + +class EvalResultDict(TypedDict, total=False): + """ + Structured evaluation result used across deployments. + + Attributes: + primary_metric: Main scalar metric for quick ranking (e.g., accuracy, mAP). + metrics: Flat dictionary of additional scalar metrics. + per_class: Optional nested metrics keyed by class/label name. + latency: Latency statistics as returned by compute_latency_stats(). + metadata: Arbitrary metadata that downstream components might need. + """ + + primary_metric: float + metrics: Dict[str, float] + per_class: Dict[str, Any] + latency: Dict[str, float] + metadata: Dict[str, Any] + + +class VerifyResultDict(TypedDict, total=False): + """ + Structured verification outcome shared between runners and evaluators. + + Attributes: + summary: Aggregate pass/fail counts. + samples: Mapping of sample identifiers to boolean pass/fail states. + """ + + summary: Dict[str, int] + samples: Dict[str, bool] + error: str + + +@dataclass(frozen=True) +class LatencyStats: + """ + Immutable latency statistics for a batch of inferences. + + Provides a typed alternative to loose dictionaries and a convenient + ``to_dict`` helper for interoperability with existing call sites. + """ + + mean_ms: float + std_ms: float + min_ms: float + max_ms: float + median_ms: float + + @classmethod + def empty(cls) -> LatencyStats: + """Return a zero-initialized stats object.""" + return cls(0.0, 0.0, 0.0, 0.0, 0.0) + + def to_dict(self) -> Dict[str, float]: + """Convert to a plain dictionary for serialization.""" + return asdict(self) + + +@dataclass(frozen=True) +class LatencyBreakdown: + """ + Stage-wise latency statistics keyed by stage name. + + Stored as a mapping of stage -> LatencyStats, with a ``to_dict`` helper + to preserve backward compatibility with existing dictionary consumers. + """ + + stages: Dict[str, LatencyStats] + + @classmethod + def empty(cls) -> LatencyBreakdown: + """Return an empty breakdown.""" + return cls(stages={}) + + def to_dict(self) -> Dict[str, Dict[str, float]]: + """Convert to ``Dict[str, Dict[str, float]]`` for downstream use.""" + return {stage: stats.to_dict() for stage, stats in self.stages.items()} + + +@dataclass(frozen=True) +class InferenceInput: + """Prepared input for pipeline inference. + + Attributes: + data: The actual input data (e.g., points tensor, image tensor). + metadata: Sample metadata forwarded to postprocess(). + """ + + data: Any + metadata: Mapping[str, Any] = field(default_factory=dict) + + +@dataclass(frozen=True) +class InferenceResult: + """Standard inference return payload.""" + + output: Any + latency_ms: float + breakdown: Optional[Dict[str, float]] = None + + @classmethod + def empty(cls) -> InferenceResult: + """Return an empty inference result.""" + return cls(output=None, latency_ms=0.0, breakdown={}) + + def to_dict(self) -> Dict[str, Any]: + """Convert to a plain dictionary for logging/serialization.""" + return { + "output": self.output, + "latency_ms": self.latency_ms, + "breakdown": dict(self.breakdown or {}), + } + + +@dataclass(frozen=True) +class ModelSpec: + """ + Minimal description of a concrete model artifact to evaluate or verify. 
+ + Attributes: + backend: Backend identifier such as 'pytorch', 'onnx', or 'tensorrt'. + device: Target device string (e.g., 'cpu', 'cuda:0'). + artifact: Filesystem representation of the produced model. + """ + + backend: Backend + device: str + artifact: Artifact + + @property + def path(self) -> str: + """Backward-compatible access to artifact path.""" + return self.artifact.path + + @property + def multi_file(self) -> bool: + """True if the artifact represents a multi-file bundle.""" + return self.artifact.multi_file diff --git a/deployment/core/evaluation/verification_mixin.py b/deployment/core/evaluation/verification_mixin.py new file mode 100644 index 000000000..e7e866e04 --- /dev/null +++ b/deployment/core/evaluation/verification_mixin.py @@ -0,0 +1,497 @@ +""" +Verification mixin providing shared verification logic for evaluators. + +This mixin extracts the common verification workflow that was duplicated +across CenterPointEvaluator, YOLOXOptElanEvaluator, and ClassificationEvaluator. +""" + +from __future__ import annotations + +import logging +from abc import abstractmethod +from dataclasses import dataclass, field +from typing import Any, Dict, List, Mapping, Optional, Tuple, Union + +import numpy as np +import torch + +from deployment.core.backend import Backend +from deployment.core.evaluation.evaluator_types import InferenceInput, ModelSpec, VerifyResultDict +from deployment.core.io.base_data_loader import BaseDataLoader + + +@dataclass(frozen=True) +class ComparisonResult: + """Result of comparing two outputs (immutable).""" + + passed: bool + max_diff: float + mean_diff: float + num_elements: int = 0 + details: Tuple[Tuple[str, ComparisonResult], ...] = () + + def to_dict(self) -> Dict[str, Any]: + """Convert to dictionary for serialization.""" + result = { + "passed": self.passed, + "max_diff": self.max_diff, + "mean_diff": self.mean_diff, + "num_elements": self.num_elements, + } + if self.details: + result["details"] = {k: v.to_dict() for k, v in self.details} + return result + + +class VerificationMixin: + """ + Mixin providing shared verification logic for all evaluators. + + Subclasses must implement: + - _create_pipeline_for_verification(): Create backend-specific pipeline + - _get_verification_input(): Extract inputs for verification + + Subclasses may optionally override: + - _get_output_names(): Provide meaningful names for list/tuple outputs + """ + + @abstractmethod + def _create_pipeline_for_verification( + self, + model_spec: ModelSpec, + device: str, + logger: logging.Logger, + ) -> Any: + """Create a pipeline for the specified backend.""" + raise NotImplementedError + + @abstractmethod + def _get_verification_input( + self, + sample_idx: int, + data_loader: BaseDataLoader, + device: str, + ) -> InferenceInput: + """Get input data for verification. + + Returns: + InferenceInput containing: + - data: The actual input data (e.g., points tensor) + - metadata: Sample metadata forwarded to postprocess() + """ + raise NotImplementedError + + def _get_output_names(self) -> Optional[List[str]]: + """ + Optional: Provide meaningful names for list/tuple outputs. + + Override this method to provide task-specific output names for better logging. + Returns None by default, which uses generic naming (output_0, output_1, ...). + """ + return None + + def _compare_outputs( + self, + reference: Any, + test: Any, + tolerance: float, + logger: logging.Logger, + path: str = "output", + ) -> ComparisonResult: + """ + Recursively compare outputs of any structure. 
+ + Handles: + - Tensors (torch.Tensor, np.ndarray) + - Scalars (int, float) + - Dictionaries + - Lists/Tuples + - None values + + Args: + reference: Reference output + test: Test output + tolerance: Maximum allowed difference + logger: Logger instance + path: Current path in the structure (for logging) + + Returns: + ComparisonResult with comparison statistics + """ + # Handle None + if reference is None and test is None: + return ComparisonResult(passed=True, max_diff=0.0, mean_diff=0.0) + + if reference is None or test is None: + logger.error(f" {path}: One output is None while the other is not") + return ComparisonResult(passed=False, max_diff=float("inf"), mean_diff=float("inf")) + + # Handle dictionaries + if isinstance(reference, dict) and isinstance(test, dict): + return self._compare_dicts(reference, test, tolerance, logger, path) + + # Handle lists/tuples + if isinstance(reference, (list, tuple)) and isinstance(test, (list, tuple)): + return self._compare_sequences(reference, test, tolerance, logger, path) + + # Handle tensors and arrays + if self._is_array_like(reference) and self._is_array_like(test): + return self._compare_arrays(reference, test, tolerance, logger, path) + + # Handle scalars + if isinstance(reference, (int, float)) and isinstance(test, (int, float)): + diff = abs(float(reference) - float(test)) + passed = diff < tolerance + if not passed: + logger.warning(f" {path}: scalar diff={diff:.6f} > tolerance={tolerance:.6f}") + return ComparisonResult(passed=passed, max_diff=diff, mean_diff=diff, num_elements=1) + + # Type mismatch + logger.error(f" {path}: Type mismatch - {type(reference).__name__} vs {type(test).__name__}") + return ComparisonResult(passed=False, max_diff=float("inf"), mean_diff=float("inf")) + + def _compare_dicts( + self, + reference: Mapping[str, Any], + test: Mapping[str, Any], + tolerance: float, + logger: logging.Logger, + path: str, + ) -> ComparisonResult: + """Compare dictionary outputs.""" + ref_keys = set(reference.keys()) + test_keys = set(test.keys()) + + if ref_keys != test_keys: + missing = ref_keys - test_keys + extra = test_keys - ref_keys + if missing: + logger.error(f" {path}: Missing keys in test: {missing}") + if extra: + logger.warning(f" {path}: Extra keys in test: {extra}") + return ComparisonResult(passed=False, max_diff=float("inf"), mean_diff=float("inf")) + + max_diff = 0.0 + total_diff = 0.0 + total_elements = 0 + all_passed = True + details_list = [] + + for key in sorted(ref_keys): + child_path = f"{path}.{key}" + result = self._compare_outputs(reference[key], test[key], tolerance, logger, child_path) + details_list.append((key, result)) + + max_diff = max(max_diff, result.max_diff) + total_diff += result.mean_diff * result.num_elements + total_elements += result.num_elements + all_passed = all_passed and result.passed + + mean_diff = total_diff / total_elements if total_elements > 0 else 0.0 + return ComparisonResult( + passed=all_passed, + max_diff=max_diff, + mean_diff=mean_diff, + num_elements=total_elements, + details=tuple(details_list), + ) + + def _compare_sequences( + self, + reference: Union[List, Tuple], + test: Union[List, Tuple], + tolerance: float, + logger: logging.Logger, + path: str, + ) -> ComparisonResult: + """Compare list/tuple outputs.""" + if len(reference) != len(test): + logger.error(f" {path}: Length mismatch - {len(reference)} vs {len(test)}") + return ComparisonResult(passed=False, max_diff=float("inf"), mean_diff=float("inf")) + + # Get optional output names from subclass + output_names = 
self._get_output_names() + + max_diff = 0.0 + total_diff = 0.0 + total_elements = 0 + all_passed = True + details_list = [] + + for idx, (ref_item, test_item) in enumerate(zip(reference, test)): + # Use provided names or generic naming + if output_names and idx < len(output_names): + name = output_names[idx] + else: + name = f"output_{idx}" + + child_path = f"{path}[{name}]" + result = self._compare_outputs(ref_item, test_item, tolerance, logger, child_path) + details_list.append((name, result)) + + max_diff = max(max_diff, result.max_diff) + total_diff += result.mean_diff * result.num_elements + total_elements += result.num_elements + all_passed = all_passed and result.passed + + mean_diff = total_diff / total_elements if total_elements > 0 else 0.0 + return ComparisonResult( + passed=all_passed, + max_diff=max_diff, + mean_diff=mean_diff, + num_elements=total_elements, + details=tuple(details_list), + ) + + def _compare_arrays( + self, + reference: Any, + test: Any, + tolerance: float, + logger: logging.Logger, + path: str, + ) -> ComparisonResult: + """Compare array-like outputs (tensors, numpy arrays).""" + ref_np = self._to_numpy(reference) + test_np = self._to_numpy(test) + + if ref_np.shape != test_np.shape: + logger.error(f" {path}: Shape mismatch - {ref_np.shape} vs {test_np.shape}") + return ComparisonResult(passed=False, max_diff=float("inf"), mean_diff=float("inf")) + + diff = np.abs(ref_np - test_np) + max_diff = float(np.max(diff)) + mean_diff = float(np.mean(diff)) + num_elements = int(diff.size) + + passed = max_diff < tolerance + logger.info(f" {path}: shape={ref_np.shape}, max_diff={max_diff:.6f}, mean_diff={mean_diff:.6f}") + + return ComparisonResult( + passed=passed, + max_diff=max_diff, + mean_diff=mean_diff, + num_elements=num_elements, + ) + + @staticmethod + def _is_array_like(obj: Any) -> bool: + """Check if object is array-like (tensor or numpy array).""" + return isinstance(obj, (torch.Tensor, np.ndarray)) + + @staticmethod + def _to_numpy(tensor: Any) -> np.ndarray: + """Convert tensor to numpy array.""" + if isinstance(tensor, torch.Tensor): + return tensor.detach().cpu().numpy() + if isinstance(tensor, np.ndarray): + return tensor + return np.array(tensor) + + def _compare_backend_outputs( + self, + reference_output: Any, + test_output: Any, + tolerance: float, + backend_name: str, + logger: logging.Logger, + ) -> Tuple[bool, Dict[str, float]]: + """ + Compare outputs from reference and test backends. + + This is the main entry point for output comparison. + Uses recursive comparison to handle any output structure. 
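+
+        Returns a ``(passed, stats)`` tuple. A minimal illustrative call
+        (``ref_boxes``/``trt_boxes`` are hypothetical tensors, not taken from a
+        real pipeline):
+
+            passed, stats = self._compare_backend_outputs(
+                reference_output={"boxes": ref_boxes},
+                test_output={"boxes": trt_boxes},
+                tolerance=0.1,
+                backend_name="tensorrt",
+                logger=logger,
+            )
+            # stats == {"max_diff": <float>, "mean_diff": <float>}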
+ """ + result = self._compare_outputs(reference_output, test_output, tolerance, logger) + + logger.info(f"\n Overall Max difference: {result.max_diff:.6f}") + logger.info(f" Overall Mean difference: {result.mean_diff:.6f}") + + if result.passed: + logger.info(f" {backend_name} verification PASSED ✓") + else: + logger.warning( + f" {backend_name} verification FAILED ✗ " + f"(max diff: {result.max_diff:.6f} > tolerance: {tolerance:.6f})" + ) + + return result.passed, {"max_diff": result.max_diff, "mean_diff": result.mean_diff} + + def _normalize_verification_device( + self, + backend: Backend, + device: str, + logger: logging.Logger, + ) -> Optional[str]: + """Normalize device for verification based on backend requirements.""" + if backend is Backend.PYTORCH and device.startswith("cuda"): + logger.warning("PyTorch verification is forced to CPU; overriding device to 'cpu'") + return "cpu" + + if backend is Backend.TENSORRT: + if not device.startswith("cuda"): + return None + if device != "cuda:0": + logger.warning("TensorRT verification only supports 'cuda:0'. Overriding.") + return "cuda:0" + + return device + + def verify( + self, + reference: ModelSpec, + test: ModelSpec, + data_loader: BaseDataLoader, + num_samples: int = 1, + tolerance: float = 0.1, + verbose: bool = False, + ) -> VerifyResultDict: + """Verify exported models using policy-based verification.""" + logger = logging.getLogger(__name__) + + results: VerifyResultDict = { + "summary": {"passed": 0, "failed": 0, "total": 0}, + "samples": {}, + } + + ref_device = self._normalize_verification_device(reference.backend, reference.device, logger) + test_device = self._normalize_verification_device(test.backend, test.device, logger) + + if test_device is None: + results["error"] = f"{test.backend.value} requires CUDA" + return results + + self._log_verification_header(reference, test, ref_device, test_device, num_samples, tolerance, logger) + + logger.info(f"\nInitializing {reference.backend.value} reference pipeline...") + ref_pipeline = self._create_pipeline_for_verification(reference, ref_device, logger) + + logger.info(f"\nInitializing {test.backend.value} test pipeline...") + test_pipeline = self._create_pipeline_for_verification(test, test_device, logger) + + actual_samples = min(num_samples, data_loader.num_samples) + for i in range(actual_samples): + logger.info(f"\n{'='*60}") + logger.info(f"Verifying sample {i}") + logger.info(f"{'='*60}") + + passed = self._verify_single_sample( + i, + ref_pipeline, + test_pipeline, + data_loader, + ref_device, + test_device, + reference.backend, + test.backend, + tolerance, + logger, + ) + results["samples"][f"sample_{i}"] = passed + + if torch.cuda.is_available(): + torch.cuda.empty_cache() + + # Cleanup pipeline resources - all pipelines now have cleanup() via base class + for pipeline in [ref_pipeline, test_pipeline]: + if pipeline is not None: + try: + pipeline.cleanup() + except Exception as e: + logger.warning(f"Error during pipeline cleanup in verification: {e}") + + sample_values = results["samples"].values() + passed_count = sum(1 for v in sample_values if v is True) + failed_count = sum(1 for v in sample_values if v is False) + + results["summary"] = {"passed": passed_count, "failed": failed_count, "total": len(results["samples"])} + self._log_verification_summary(results, logger) + + return results + + def _verify_single_sample( + self, + sample_idx: int, + ref_pipeline: Any, + test_pipeline: Any, + data_loader: BaseDataLoader, + ref_device: str, + test_device: str, + 
ref_backend: Backend, + test_backend: Backend, + tolerance: float, + logger: logging.Logger, + ) -> bool: + """Verify a single sample.""" + inference_input = self._get_verification_input(sample_idx, data_loader, ref_device) + + ref_name = f"{ref_backend.value} ({ref_device})" + logger.info(f"\nRunning {ref_name} reference...") + ref_result = ref_pipeline.infer( + inference_input.data, + metadata=inference_input.metadata, + return_raw_outputs=True, + ) + logger.info(f" {ref_name} latency: {ref_result.latency_ms:.2f} ms") + + test_input = self._move_input_to_device(inference_input.data, test_device) + test_name = f"{test_backend.value} ({test_device})" + logger.info(f"\nRunning {test_name} test...") + test_result = test_pipeline.infer( + test_input, + metadata=inference_input.metadata, + return_raw_outputs=True, + ) + logger.info(f" {test_name} latency: {test_result.latency_ms:.2f} ms") + + passed, _ = self._compare_backend_outputs(ref_result.output, test_result.output, tolerance, test_name, logger) + return passed + + def _move_input_to_device(self, input_data: Any, device: str) -> Any: + """Move input data to specified device.""" + device_obj = torch.device(device) + + if isinstance(input_data, torch.Tensor): + return input_data.to(device_obj) if input_data.device != device_obj else input_data + if isinstance(input_data, dict): + return {k: self._move_input_to_device(v, device) for k, v in input_data.items()} + if isinstance(input_data, (list, tuple)): + return type(input_data)(self._move_input_to_device(item, device) for item in input_data) + return input_data + + def _log_verification_header( + self, + reference: ModelSpec, + test: ModelSpec, + ref_device: str, + test_device: str, + num_samples: int, + tolerance: float, + logger: logging.Logger, + ) -> None: + """Log verification header information.""" + logger.info("\n" + "=" * 60) + logger.info("Model Verification (Policy-Based)") + logger.info("=" * 60) + logger.info(f"Reference: {reference.backend.value} on {ref_device} - {reference.path}") + logger.info(f"Test: {test.backend.value} on {test_device} - {test.path}") + logger.info(f"Number of samples: {num_samples}") + logger.info(f"Tolerance: {tolerance}") + logger.info("=" * 60) + + def _log_verification_summary(self, results: VerifyResultDict, logger: logging.Logger) -> None: + """Log verification summary.""" + logger.info("\n" + "=" * 60) + logger.info("Verification Summary") + logger.info("=" * 60) + + for key, value in results["samples"].items(): + status = "PASSED" if value else "FAILED" + logger.info(f" {key}: {status}") + + summary = results["summary"] + logger.info("=" * 60) + logger.info( + f"Total: {summary['passed']}/{summary['total']} passed, {summary['failed']}/{summary['total']} failed" + ) + logger.info("=" * 60) diff --git a/deployment/core/io/__init__.py b/deployment/core/io/__init__.py new file mode 100644 index 000000000..fae8a5fc0 --- /dev/null +++ b/deployment/core/io/__init__.py @@ -0,0 +1,9 @@ +"""I/O utilities subpackage for deployment core.""" + +from deployment.core.io.base_data_loader import BaseDataLoader +from deployment.core.io.preprocessing_builder import build_preprocessing_pipeline + +__all__ = [ + "BaseDataLoader", + "build_preprocessing_pipeline", +] diff --git a/deployment/core/io/base_data_loader.py b/deployment/core/io/base_data_loader.py new file mode 100644 index 000000000..ee3aaf4bf --- /dev/null +++ b/deployment/core/io/base_data_loader.py @@ -0,0 +1,129 @@ +""" +Abstract base class for data loading in deployment. 
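+
+A minimal subclass sketch (illustrative only; ``RandomImageLoader`` and its
+random-tensor samples are hypothetical and not part of this package):
+
+    import torch
+
+    class RandomImageLoader(BaseDataLoader):
+        def load_sample(self, index: int) -> SampleData:
+            if index >= self.num_samples:
+                raise IndexError(index)
+            return {"input": torch.rand(3, 224, 224), "ground_truth": 0, "metadata": {}}
+
+        def preprocess(self, sample: SampleData) -> torch.Tensor:
+            return sample["input"].unsqueeze(0)  # add a batch dimension
+
+        @property
+        def num_samples(self) -> int:
+            return 8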
+ +Each task (classification, detection, segmentation, etc.) must implement +a concrete DataLoader that extends this base class. +""" + +from __future__ import annotations + +from abc import ABC, abstractmethod +from typing import Any, Dict, Mapping, TypedDict + +import torch + + +class SampleData(TypedDict, total=False): + """ + Typed representation of a data sample handled by data loaders. + + Attributes: + input: Raw input data such as images or point clouds. + ground_truth: Labels or annotations if available. + metadata: Additional information required for evaluation. + """ + + input: Any + ground_truth: Any + metadata: Dict[str, Any] + + +class BaseDataLoader(ABC): + """ + Abstract base class for task-specific data loaders. + + This class defines the interface that all task-specific data loaders + must implement. It handles loading raw data from disk and preprocessing + it into a format suitable for model inference. + """ + + def __init__(self, config: Mapping[str, Any]): + """ + Initialize data loader. + + Args: + config: Configuration dictionary containing task-specific settings + """ + self.config = config + + @abstractmethod + def load_sample(self, index: int) -> SampleData: + """ + Load a single sample from the dataset. + + Args: + index: Sample index to load + + Returns: + Dictionary containing raw sample data. Structure is task-specific, + but should typically include: + - Raw input data (image, point cloud, etc.) + - Ground truth labels/annotations (if available) + - Any metadata needed for evaluation + + Raises: + IndexError: If index is out of range + FileNotFoundError: If sample data files don't exist + """ + raise NotImplementedError + + @abstractmethod + def preprocess(self, sample: SampleData) -> Any: + """ + Preprocess raw sample data into model input format. + + Args: + sample: Raw sample data returned by load_sample() + + Returns: + Preprocessed model input ready for inference. Type/shape is task-specific. + (e.g., torch.Tensor, Dict[str, torch.Tensor], tuple, etc.) + + Raises: + ValueError: If sample format is invalid + """ + raise NotImplementedError + + @property + @abstractmethod + def num_samples(self) -> int: + """ + Get total number of samples in the dataset. + + Returns: + Total number of samples available + """ + raise NotImplementedError + + def get_shape_sample(self, index: int = 0) -> Any: + """ + Return a representative sample used for export shape configuration. + + This method provides a consistent interface for exporters to obtain + shape information without needing to know the internal structure of + preprocessed inputs (e.g., whether it's a single tensor, tuple, or list). + + The default implementation: + 1. Loads a sample using load_sample() + 2. Preprocesses it using preprocess() + 3. If the result is a list/tuple, returns the first element + 4. Otherwise returns the preprocessed result as-is + + Subclasses can override this method to provide custom shape sample logic + if the default behavior is insufficient. + + Args: + index: Sample index to use (default: 0) + + Returns: + A representative sample for shape configuration. Typically a torch.Tensor, + but the exact type depends on the task-specific implementation. 
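+
+        Example (illustrative; assumes a hypothetical loader whose
+        ``preprocess`` returns a ``(points, metadata)`` tuple):
+
+            shape_sample = loader.get_shape_sample(0)
+            # -> ``points``, i.e. the first element of the preprocessed tuple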
+ """ + sample = self.load_sample(index) + preprocessed = self.preprocess(sample) + + # Handle nested structures: if it's a list/tuple, use first element for shape + if isinstance(preprocessed, (list, tuple)): + return preprocessed[0] if len(preprocessed) > 0 else preprocessed + + return preprocessed diff --git a/deployment/core/io/preprocessing_builder.py b/deployment/core/io/preprocessing_builder.py new file mode 100644 index 000000000..1472f1e3a --- /dev/null +++ b/deployment/core/io/preprocessing_builder.py @@ -0,0 +1,167 @@ +""" +Preprocessing pipeline builder for deployment data loaders. + +This module provides functions to extract and build preprocessing pipelines +from MMDet/MMDet3D/MMPretrain configs for use in deployment data loaders. + +This module is compatible with the BaseDeploymentPipeline. +""" + +from __future__ import annotations + +import logging +from typing import Any, List, Mapping, Optional + +from mmengine.config import Config +from mmengine.dataset import Compose +from mmengine.registry import init_default_scope + +logger = logging.getLogger(__name__) + +TransformConfig = Mapping[str, Any] + + +class ComposeBuilder: + """ + Unified builder for creating Compose objects with different MM frameworks. + + Uses MMEngine-based Compose with init_default_scope for all frameworks. + """ + + @staticmethod + def build( + pipeline_cfg: List[TransformConfig], + scope: str, + import_modules: List[str], + ) -> Any: + """ + Build Compose object using MMEngine with init_default_scope. + + Args: + pipeline_cfg: List of transform configurations + scope: Default scope name (e.g., 'mmdet', 'mmdet3d', 'mmpretrain') + import_modules: List of module paths to import for transform registration + + Returns: + Compose object + + Raises: + ImportError: If required packages are not available + """ + # Import transform modules to register transforms + for module_path in import_modules: + try: + __import__(module_path) + except ImportError as e: + raise ImportError( + f"Failed to import transform module '{module_path}' for scope '{scope}'. " + f"Please ensure the required package is installed. Error: {e}" + ) from e + + # Set default scope and build Compose + try: + init_default_scope(scope) + logger.info( + "Building pipeline with mmengine.dataset.Compose (default_scope='%s')", + scope, + ) + return Compose(pipeline_cfg) + except Exception as e: + raise RuntimeError( + f"Failed to build Compose pipeline for scope '{scope}'. " + f"Check your pipeline configuration and transforms. Error: {e}" + ) from e + + +TASK_PIPELINE_CONFIGS: Mapping[str, Mapping[str, Any]] = { + "detection2d": { + "scope": "mmdet", + "import_modules": ["mmdet.datasets.transforms"], + }, + "detection3d": { + "scope": "mmdet3d", + "import_modules": ["mmdet3d.datasets.transforms"], + }, + "classification": { + "scope": "mmpretrain", + "import_modules": ["mmpretrain.datasets.transforms"], + }, + "segmentation": { + "scope": "mmseg", + "import_modules": ["mmseg.datasets.transforms"], + }, +} + +# Valid task types +VALID_TASK_TYPES = list(TASK_PIPELINE_CONFIGS.keys()) + + +def build_preprocessing_pipeline( + model_cfg: Config, + task_type: str = "detection3d", +) -> Any: + """ + Build preprocessing pipeline from model config. + + This function extracts the test pipeline configuration from a model config + and builds a Compose pipeline that can be used for preprocessing in deployment data loaders. + + Args: + model_cfg: Model configuration containing test pipeline definition. 
+ Supports config (``model_cfg.test_pipeline``) + task_type: Explicit task type ('detection2d', 'detection3d', 'classification', 'segmentation'). + Must be provided either via this argument or via + ``model_cfg.task_type`` / ``model_cfg.deploy.task_type``. + Recommended: specify in deploy_config.py as ``task_type = "detection3d"``. + Returns: + Pipeline compose object (e.g., mmdet.datasets.transforms.Compose) + + Raises: + ValueError: If no valid test pipeline found in config or invalid task_type + ImportError: If required transform packages are not available + + Examples: + >>> from mmengine.config import Config + >>> cfg = Config.fromfile('model_config.py') + >>> pipeline = build_preprocessing_pipeline(cfg, task_type='detection3d') + >>> # Use pipeline to preprocess data + >>> results = pipeline({'img_path': 'image.jpg'}) + """ + pipeline_cfg = _extract_pipeline_config(model_cfg) + if task_type not in VALID_TASK_TYPES: + raise ValueError( + f"Invalid task_type '{task_type}'. Must be one of {VALID_TASK_TYPES}. " + f"Please specify a supported task type in the deploy config or function argument." + ) + + logger.info("Building preprocessing pipeline with task_type: %s", task_type) + try: + task_cfg = TASK_PIPELINE_CONFIGS[task_type] + except KeyError: + raise ValueError(f"Unknown task_type '{task_type}'. " f"Must be one of {VALID_TASK_TYPES}") + return ComposeBuilder.build(pipeline_cfg=pipeline_cfg, **task_cfg) + + +def _extract_pipeline_config(model_cfg: Config) -> List[TransformConfig]: + """ + Extract pipeline configuration from model config. + + Args: + model_cfg: Model configuration + + Returns: + List of transform configurations + + Raises: + ValueError: If no valid pipeline found + """ + try: + pipeline_cfg = model_cfg["test_pipeline"] + except (KeyError, TypeError) as exc: + raise ValueError("No test pipeline found in config. Expected pipeline at: test_pipeline.") from exc + + if not pipeline_cfg: + raise ValueError("test_pipeline is defined but empty. Please provide a valid test pipeline.") + + logger.info("Found test pipeline at: test_pipeline") + return pipeline_cfg diff --git a/deployment/core/metrics/__init__.py b/deployment/core/metrics/__init__.py new file mode 100644 index 000000000..1acbbae10 --- /dev/null +++ b/deployment/core/metrics/__init__.py @@ -0,0 +1,79 @@ +""" +Unified Metrics Interfaces for AWML Deployment Framework. + +This module provides task-specific metric interfaces that use autoware_perception_evaluation +as the single source of truth for metric computation. This ensures consistency between +training evaluation (T4MetricV2) and deployment evaluation. + +Design Principles: + 1. 3D Detection → Detection3DMetricsInterface (mAP, mAPH using autoware_perception_eval) + 2. 2D Detection → Detection2DMetricsInterface (mAP using autoware_perception_eval, 2D mode) + 3. Classification → ClassificationMetricsInterface (accuracy, precision, recall, F1) + +Usage: + # For 3D detection (CenterPoint, etc.) + from deployment.core.metrics import Detection3DMetricsInterface, Detection3DMetricsConfig + + config = Detection3DMetricsConfig( + class_names=["car", "truck", "bus", "bicycle", "pedestrian"], + ) + interface = Detection3DMetricsInterface(config) + interface.add_frame(predictions, ground_truths) + metrics = interface.compute_metrics() + + # For 2D detection (YOLOX, etc.) 
+ from deployment.core.metrics import Detection2DMetricsInterface, Detection2DMetricsConfig + + config = Detection2DMetricsConfig( + class_names=["car", "truck", "bus", ...], + ) + interface = Detection2DMetricsInterface(config) + interface.add_frame(predictions, ground_truths) + metrics = interface.compute_metrics() + + # For classification (Calibration, etc.) + from deployment.core.metrics import ClassificationMetricsInterface, ClassificationMetricsConfig + + config = ClassificationMetricsConfig( + class_names=["miscalibrated", "calibrated"], + ) + interface = ClassificationMetricsInterface(config) + interface.add_frame(prediction_label, ground_truth_label, probabilities) + metrics = interface.compute_metrics() +""" + +from deployment.core.metrics.base_metrics_interface import ( + BaseMetricsConfig, + BaseMetricsInterface, + ClassificationSummary, + DetectionSummary, +) +from deployment.core.metrics.classification_metrics import ( + ClassificationMetricsConfig, + ClassificationMetricsInterface, +) +from deployment.core.metrics.detection_2d_metrics import ( + Detection2DMetricsConfig, + Detection2DMetricsInterface, +) +from deployment.core.metrics.detection_3d_metrics import ( + Detection3DMetricsConfig, + Detection3DMetricsInterface, +) + +__all__ = [ + # Base classes + "BaseMetricsInterface", + "BaseMetricsConfig", + "ClassificationSummary", + "DetectionSummary", + # 3D Detection + "Detection3DMetricsInterface", + "Detection3DMetricsConfig", + # 2D Detection + "Detection2DMetricsInterface", + "Detection2DMetricsConfig", + # Classification + "ClassificationMetricsInterface", + "ClassificationMetricsConfig", +] diff --git a/deployment/core/metrics/base_metrics_interface.py b/deployment/core/metrics/base_metrics_interface.py new file mode 100644 index 000000000..b1ee8bf8a --- /dev/null +++ b/deployment/core/metrics/base_metrics_interface.py @@ -0,0 +1,177 @@ +""" +Base Metrics Interface for unified metric computation. + +This module provides the abstract base class that all task-specific metrics interfaces +must implement. It ensures a consistent contract across 3D detection, 2D detection, +and classification tasks. + +All metric interfaces use autoware_perception_evaluation as the underlying computation +engine to ensure consistency between training (T4MetricV2) and deployment evaluation. +""" + +import logging +from abc import ABC, abstractmethod +from dataclasses import dataclass, field +from typing import Any, Dict, List, Optional + +logger = logging.getLogger(__name__) + + +@dataclass(frozen=True) +class BaseMetricsConfig: + """Base configuration for all metrics interfaces. + + Attributes: + class_names: List of class names for evaluation. + frame_id: Frame ID for evaluation (e.g., "base_link" for 3D, "camera" for 2D). 
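+
+    Note: the 2D interfaces in this package override ``frame_id`` with a
+    concrete camera frame such as "cam_front" rather than a generic "camera".
+    Minimal construction sketch (illustrative; the base config is usually
+    subclassed rather than instantiated directly):
+
+        config = BaseMetricsConfig(class_names=["car", "pedestrian"])
+        config.frame_id  # -> "base_link"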
+ """ + + class_names: List[str] + frame_id: str = "base_link" + + +@dataclass(frozen=True) +class ClassificationSummary: + """Structured summary for classification metrics.""" + + accuracy: float = 0.0 + precision: float = 0.0 + recall: float = 0.0 + f1score: float = 0.0 + per_class_accuracy: Dict[str, float] = field(default_factory=dict) + confusion_matrix: List[List[int]] = field(default_factory=list) + num_samples: int = 0 + detailed_metrics: Dict[str, float] = field(default_factory=dict) + + def to_dict(self) -> Dict[str, Any]: + """Convert to a serializable dictionary.""" + return { + "accuracy": self.accuracy, + "precision": self.precision, + "recall": self.recall, + "f1score": self.f1score, + "per_class_accuracy": dict(self.per_class_accuracy), + "confusion_matrix": [list(row) for row in self.confusion_matrix], + "num_samples": self.num_samples, + "detailed_metrics": dict(self.detailed_metrics), + } + + +@dataclass(frozen=True) +class DetectionSummary: + """Structured summary for detection metrics (2D/3D). + + All matching modes computed by autoware_perception_evaluation are included. + The `mAP_by_mode` and `mAPH_by_mode` dicts contain results for each matching mode. + """ + + mAP_by_mode: Dict[str, float] = field(default_factory=dict) + mAPH_by_mode: Dict[str, float] = field(default_factory=dict) + per_class_ap_by_mode: Dict[str, Dict[str, float]] = field(default_factory=dict) + num_frames: int = 0 + detailed_metrics: Dict[str, float] = field(default_factory=dict) + + def to_dict(self) -> Dict[str, Any]: + """Convert to dict.""" + return { + "mAP_by_mode": dict(self.mAP_by_mode), + "mAPH_by_mode": dict(self.mAPH_by_mode), + "per_class_ap_by_mode": {k: dict(v) for k, v in self.per_class_ap_by_mode.items()}, + "num_frames": self.num_frames, + "detailed_metrics": dict(self.detailed_metrics), + } + + +class BaseMetricsInterface(ABC): + """ + Abstract base class for all task-specific metrics interfaces. + + This class defines the common interface that all metric interfaces must implement. + Each interface wraps autoware_perception_evaluation to compute metrics consistent + with training evaluation (T4MetricV2). + + The workflow is: + 1. Create interface with task-specific config + 2. Call reset() to start a new evaluation session + 3. Call add_frame() for each sample + 4. Call compute_metrics() to get final metrics + 5. Optionally call get_summary() for a human-readable summary + + Example: + interface = SomeMetricsInterface(config) + interface.reset() + for pred, gt in data: + interface.add_frame(pred, gt) + metrics = interface.compute_metrics() + """ + + def __init__(self, config: BaseMetricsConfig): + """ + Initialize the metrics interface. + + Args: + config: Configuration for the metrics interface. + """ + self.config = config + self.class_names = config.class_names + self.frame_id = config.frame_id + self._frame_count = 0 + + @abstractmethod + def reset(self) -> None: + """ + Reset the interface for a new evaluation session. + + This method should clear all accumulated frame data and reinitialize + the underlying evaluator. + """ + pass + + @abstractmethod + def add_frame(self, *args) -> None: + """ + Add a frame of predictions and ground truths for evaluation. 
+ + The specific arguments depend on the task type: + - 3D Detection: predictions: List[Dict], ground_truths: List[Dict] + - 2D Detection: predictions: List[Dict], ground_truths: List[Dict] + - Classification: prediction: int, ground_truth: int, probabilities: List[float] + """ + pass + + @abstractmethod + def compute_metrics(self) -> Dict[str, float]: + """ + Compute metrics from all added frames. + + Returns: + Dictionary of metric names to values. + """ + pass + + @property + @abstractmethod + def summary(self) -> Any: + """ + Get a summary of the evaluation including primary metrics. + + Returns: + Dictionary with summary metrics and additional information. + """ + pass + + @property + def frame_count(self) -> int: + """Return the number of frames added so far.""" + return self._frame_count + + def format_metrics_report(self) -> Optional[str]: + """Format the metrics report as a human-readable string. + + This is an optional method that can be overridden by subclasses to provide + task-specific formatting. By default, returns None. + + Returns: + Formatted metrics report string. None if not implemented. + """ + return None diff --git a/deployment/core/metrics/classification_metrics.py b/deployment/core/metrics/classification_metrics.py new file mode 100644 index 000000000..e18edbd73 --- /dev/null +++ b/deployment/core/metrics/classification_metrics.py @@ -0,0 +1,381 @@ +""" +Classification Metrics Interface using autoware_perception_evaluation. + +This module provides an interface to compute classification metrics (accuracy, precision, +recall, F1) using autoware_perception_evaluation, ensuring consistent metrics between +training evaluation and deployment evaluation. + +Usage: + config = ClassificationMetricsConfig( + class_names=["miscalibrated", "calibrated"], + ) + interface = ClassificationMetricsInterface(config) + + for pred_label, gt_label in zip(predictions, ground_truths): + interface.add_frame(prediction=pred_label, ground_truth=gt_label) + + metrics = interface.compute_metrics() + # Returns: {"accuracy": 0.95, "precision": 0.94, "recall": 0.96, "f1score": 0.95, ...} +""" + +import logging +import time +from dataclasses import dataclass +from typing import Any, Dict, List, Optional + +import numpy as np +from perception_eval.common.dataset import FrameGroundTruth +from perception_eval.common.label import AutowareLabel, Label +from perception_eval.common.object2d import DynamicObject2D +from perception_eval.common.schema import FrameID +from perception_eval.config.perception_evaluation_config import PerceptionEvaluationConfig +from perception_eval.evaluation.metrics import MetricsScore +from perception_eval.evaluation.result.perception_frame_config import ( + CriticalObjectFilterConfig, + PerceptionPassFailConfig, +) +from perception_eval.manager import PerceptionEvaluationManager + +from deployment.core.metrics.base_metrics_interface import ( + BaseMetricsConfig, + BaseMetricsInterface, + ClassificationSummary, +) + +logger = logging.getLogger(__name__) + +# Valid 2D frame IDs for camera-based classification +VALID_2D_FRAME_IDS = [ + "cam_front", + "cam_front_right", + "cam_front_left", + "cam_front_lower", + "cam_back", + "cam_back_left", + "cam_back_right", + "cam_traffic_light_near", + "cam_traffic_light_far", + "cam_traffic_light", +] + + +@dataclass(frozen=True) +class ClassificationMetricsConfig(BaseMetricsConfig): + """Configuration for classification metrics. + + Attributes: + class_names: List of class names for evaluation. 
+ frame_id: Camera frame ID for evaluation (default: "cam_front"). + evaluation_config_dict: Configuration dict for perception evaluation. + critical_object_filter_config: Config for filtering critical objects. + frame_pass_fail_config: Config for pass/fail criteria. + """ + + frame_id: str = "cam_front" + evaluation_config_dict: Optional[Dict[str, Any]] = None + critical_object_filter_config: Optional[Dict[str, Any]] = None + frame_pass_fail_config: Optional[Dict[str, Any]] = None + + def __post_init__(self): + if self.frame_id not in VALID_2D_FRAME_IDS: + raise ValueError( + f"Invalid frame_id '{self.frame_id}' for classification. " f"Valid options: {VALID_2D_FRAME_IDS}" + ) + + if self.evaluation_config_dict is None: + object.__setattr__( + self, + "evaluation_config_dict", + { + "evaluation_task": "classification2d", + "target_labels": self.class_names, + "center_distance_thresholds": None, + "center_distance_bev_thresholds": None, + "plane_distance_thresholds": None, + "iou_2d_thresholds": None, + "iou_3d_thresholds": None, + "label_prefix": "autoware", + }, + ) + + if self.critical_object_filter_config is None: + object.__setattr__( + self, + "critical_object_filter_config", + { + "target_labels": self.class_names, + "ignore_attributes": None, + }, + ) + + if self.frame_pass_fail_config is None: + object.__setattr__( + self, + "frame_pass_fail_config", + { + "target_labels": self.class_names, + "matching_threshold_list": [1.0] * len(self.class_names), + "confidence_threshold_list": None, + }, + ) + + +class ClassificationMetricsInterface(BaseMetricsInterface): + """Interface for computing classification metrics using autoware_perception_evaluation. + + Metrics computed: + - Accuracy: TP / (num_predictions + num_gt - TP) + - Precision: TP / (TP + FP) + - Recall: TP / num_gt + - F1 Score: 2 * precision * recall / (precision + recall) + - Per-class accuracy, precision, recall, F1 + """ + + def __init__( + self, + config: ClassificationMetricsConfig, + data_root: str = "data/t4dataset/", + result_root_directory: str = "/tmp/perception_eval_classification/", + ): + """Initialize the classification metrics interface. + + Args: + config: Configuration for classification metrics. + data_root: Root directory of the dataset. + result_root_directory: Directory for saving evaluation results. 
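+
+        Example (illustrative; the output directory is a placeholder):
+
+            config = ClassificationMetricsConfig(class_names=["miscalibrated", "calibrated"])
+            interface = ClassificationMetricsInterface(
+                config,
+                result_root_directory="/tmp/my_eval_run/",
+            )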
+ """ + super().__init__(config) + self.config: ClassificationMetricsConfig = config + + self.perception_eval_config = PerceptionEvaluationConfig( + dataset_paths=data_root, + frame_id=config.frame_id, + result_root_directory=result_root_directory, + evaluation_config_dict=config.evaluation_config_dict, + load_raw_data=False, + ) + + self.critical_object_filter_config = CriticalObjectFilterConfig( + evaluator_config=self.perception_eval_config, + **config.critical_object_filter_config, + ) + + self.frame_pass_fail_config = PerceptionPassFailConfig( + evaluator_config=self.perception_eval_config, + **config.frame_pass_fail_config, + ) + + self.evaluator: Optional[PerceptionEvaluationManager] = None + + def reset(self) -> None: + """Reset the interface for a new evaluation session.""" + self.evaluator = PerceptionEvaluationManager( + evaluation_config=self.perception_eval_config, + load_ground_truth=False, + metric_output_dir=None, + ) + self._frame_count = 0 + + def _convert_index_to_label(self, label_index: int) -> Label: + """Convert a label index to a Label object.""" + if 0 <= label_index < len(self.class_names): + class_name = self.class_names[label_index] + else: + class_name = "unknown" + + autoware_label = AutowareLabel.__members__.get(class_name.upper(), AutowareLabel.UNKNOWN) + return Label(label=autoware_label, name=class_name) + + def _create_dynamic_object_2d( + self, + label_index: int, + unix_time: int, + score: float = 1.0, + uuid: Optional[str] = None, + ) -> DynamicObject2D: + """Create a DynamicObject2D for classification (roi=None for image-level).""" + return DynamicObject2D( + unix_time=unix_time, + frame_id=FrameID.from_value(self.frame_id), + semantic_score=score, + semantic_label=self._convert_index_to_label(label_index), + roi=None, + uuid=uuid, + ) + + def add_frame( + self, + prediction: int, + ground_truth: int, + probabilities: Optional[List[float]] = None, + frame_name: Optional[str] = None, + ) -> None: + """Add a single prediction and ground truth for evaluation. + + Args: + prediction: Predicted class index. + ground_truth: Ground truth class index. + probabilities: Optional probability scores for each class. + frame_name: Optional name for the frame. + """ + if self.evaluator is None: + self.reset() + + unix_time = int(time.time() * 1e6) + if frame_name is None: + frame_name = str(self._frame_count) + + # Get confidence score from probabilities if available + score = 1.0 + if probabilities is not None and len(probabilities) > prediction: + score = float(probabilities[prediction]) + + # Create prediction and ground truth objects + estimated_object = self._create_dynamic_object_2d( + label_index=prediction, unix_time=unix_time, score=score, uuid=frame_name + ) + gt_object = self._create_dynamic_object_2d( + label_index=ground_truth, unix_time=unix_time, score=1.0, uuid=frame_name + ) + + frame_ground_truth = FrameGroundTruth( + unix_time=unix_time, + frame_name=frame_name, + objects=[gt_object], + transforms=None, + raw_data=None, + ) + + try: + self.evaluator.add_frame_result( + unix_time=unix_time, + ground_truth_now_frame=frame_ground_truth, + estimated_objects=[estimated_object], + critical_object_filter_config=self.critical_object_filter_config, + frame_pass_fail_config=self.frame_pass_fail_config, + ) + self._frame_count += 1 + except Exception as e: + logger.warning(f"Failed to add frame {frame_name}: {e}") + + def compute_metrics(self) -> Dict[str, float]: + """Compute metrics from all added predictions. 
+ + Returns: + Dictionary of metrics including accuracy, precision, recall, f1score, + and per-class metrics. + """ + if self.evaluator is None or self._frame_count == 0: + logger.warning("No samples to evaluate") + return {} + + try: + metrics_score: MetricsScore = self.evaluator.get_scene_result() + return self._process_metrics_score(metrics_score) + except Exception as e: + logger.error(f"Error computing metrics: {e}") + import traceback + + traceback.print_exc() + return {} + + def _process_metrics_score(self, metrics_score: MetricsScore) -> Dict[str, float]: + """Process MetricsScore into a flat dictionary.""" + metric_dict = {} + + for classification_score in metrics_score.classification_scores: + # Get overall metrics + accuracy, precision, recall, f1score = classification_score._summarize() + + # Handle inf values (replace with 0.0) + metric_dict["accuracy"] = 0.0 if accuracy == float("inf") else accuracy + metric_dict["precision"] = 0.0 if precision == float("inf") else precision + metric_dict["recall"] = 0.0 if recall == float("inf") else recall + metric_dict["f1score"] = 0.0 if f1score == float("inf") else f1score + + # Process per-class metrics + for acc in classification_score.accuracies: + if not acc.target_labels: + continue + + target_label = acc.target_labels[0] + class_name = getattr(target_label, "name", str(target_label)) + + metric_dict[f"{class_name}_accuracy"] = 0.0 if acc.accuracy == float("inf") else acc.accuracy + metric_dict[f"{class_name}_precision"] = 0.0 if acc.precision == float("inf") else acc.precision + metric_dict[f"{class_name}_recall"] = 0.0 if acc.recall == float("inf") else acc.recall + metric_dict[f"{class_name}_f1score"] = 0.0 if acc.f1score == float("inf") else acc.f1score + metric_dict[f"{class_name}_tp"] = acc.num_tp + metric_dict[f"{class_name}_fp"] = acc.num_fp + metric_dict[f"{class_name}_num_gt"] = acc.num_ground_truth + metric_dict[f"{class_name}_num_pred"] = acc.objects_results_num + + metric_dict["total_samples"] = self._frame_count + return metric_dict + + # TODO(vividf): Remove after autoware_perception_evaluation supports confusion matrix. + @property + def confusion_matrix(self) -> np.ndarray: + """Get the confusion matrix. + + Returns: + 2D numpy array where cm[i][j] = count of ground truth i predicted as j. + """ + num_classes = len(self.class_names) + if self.evaluator is None or self._frame_count == 0: + return np.zeros((num_classes, num_classes), dtype=int) + + confusion_matrix = np.zeros((num_classes, num_classes), dtype=int) + + for frame_result in self.evaluator.frame_results: + if not frame_result.object_results: + continue + + for obj_result in frame_result.object_results: + if obj_result.ground_truth_object is None: + continue + + pred_name = obj_result.estimated_object.semantic_label.name + gt_name = obj_result.ground_truth_object.semantic_label.name + + # Find indices + pred_idx = next( + (i for i, n in enumerate(self.class_names) if n.lower() == pred_name.lower()), + -1, + ) + gt_idx = next( + (i for i, n in enumerate(self.class_names) if n.lower() == gt_name.lower()), + -1, + ) + + if 0 <= pred_idx < num_classes and 0 <= gt_idx < num_classes: + confusion_matrix[gt_idx, pred_idx] += 1 + + return confusion_matrix + + @property + def summary(self) -> ClassificationSummary: + """Get a summary of the evaluation. + + Returns: + ClassificationSummary with aggregate metrics. 
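+
+        Example (illustrative; assumes frames were already added via
+        ``add_frame``):
+
+            s = interface.summary
+            s.accuracy, s.f1score
+            s.to_dict()["confusion_matrix"]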
+ """ + metrics = self.compute_metrics() + + if not metrics: + return ClassificationSummary() + + per_class_accuracy = { + name: metrics[f"{name}_accuracy"] for name in self.class_names if f"{name}_accuracy" in metrics + } + + return ClassificationSummary( + accuracy=metrics.get("accuracy", 0.0), + precision=metrics.get("precision", 0.0), + recall=metrics.get("recall", 0.0), + f1score=metrics.get("f1score", 0.0), + per_class_accuracy=per_class_accuracy, + confusion_matrix=self.confusion_matrix.tolist(), + num_samples=self._frame_count, + detailed_metrics=metrics, + ) diff --git a/deployment/core/metrics/detection_2d_metrics.py b/deployment/core/metrics/detection_2d_metrics.py new file mode 100644 index 000000000..c462d8256 --- /dev/null +++ b/deployment/core/metrics/detection_2d_metrics.py @@ -0,0 +1,501 @@ +""" +2D Detection Metrics Interface using autoware_perception_evaluation. + +This module provides an interface to compute 2D detection metrics (mAP) +using autoware_perception_evaluation in 2D mode, ensuring consistent metrics +between training evaluation and deployment evaluation. + +For 2D detection, the interface uses: +- IoU 2D thresholds for matching (e.g., 0.5, 0.75) +- Only AP is computed (no APH since there's no heading in 2D) + +Usage: + config = Detection2DMetricsConfig( + class_names=["car", "truck", "bus", "bicycle", "pedestrian", "motorcycle", "trailer", "unknown"], + frame_id="camera", + ) + interface = Detection2DMetricsInterface(config) + + # Add frames + for pred, gt in zip(predictions_list, ground_truths_list): + interface.add_frame( + predictions=pred, # List[Dict] with bbox (x1,y1,x2,y2), label, score + ground_truths=gt, # List[Dict] with bbox (x1,y1,x2,y2), label + ) + + # Compute metrics + metrics = interface.compute_metrics() + # Returns: {"mAP_iou_2d_0.5": 0.7, "mAP_iou_2d_0.75": 0.65, ...} +""" + +import logging +import time +from dataclasses import dataclass, field +from typing import Any, Dict, List, Optional + +import numpy as np +from perception_eval.common.dataset import FrameGroundTruth +from perception_eval.common.label import AutowareLabel, Label +from perception_eval.common.object2d import DynamicObject2D +from perception_eval.common.schema import FrameID +from perception_eval.config.perception_evaluation_config import PerceptionEvaluationConfig +from perception_eval.evaluation.metrics import MetricsScore +from perception_eval.evaluation.result.perception_frame_config import ( + CriticalObjectFilterConfig, + PerceptionPassFailConfig, +) +from perception_eval.manager import PerceptionEvaluationManager + +from deployment.core.metrics.base_metrics_interface import BaseMetricsConfig, BaseMetricsInterface, DetectionSummary + +logger = logging.getLogger(__name__) + + +# Valid 2D frame IDs for camera-based detection +VALID_2D_FRAME_IDS = [ + "cam_front", + "cam_front_right", + "cam_front_left", + "cam_front_lower", + "cam_back", + "cam_back_left", + "cam_back_right", + "cam_traffic_light_near", + "cam_traffic_light_far", + "cam_traffic_light", +] + + +@dataclass(frozen=True) +class Detection2DMetricsConfig(BaseMetricsConfig): + """Configuration for 2D detection metrics. + + Attributes: + class_names: List of class names for evaluation. + frame_id: Frame ID for evaluation. Valid values for 2D: + "cam_front", "cam_front_right", "cam_front_left", "cam_front_lower", + "cam_back", "cam_back_left", "cam_back_right", + "cam_traffic_light_near", "cam_traffic_light_far", "cam_traffic_light" + iou_thresholds: List of IoU thresholds for evaluation. 
+ evaluation_config_dict: Configuration dict for perception evaluation. + critical_object_filter_config: Config for filtering critical objects. + frame_pass_fail_config: Config for pass/fail criteria. + """ + + # Override default frame_id for 2D detection (camera frame instead of base_link) + frame_id: str = "cam_front" + iou_thresholds: List[float] = field(default_factory=lambda: [0.5, 0.75]) + evaluation_config_dict: Optional[Dict[str, Any]] = None + critical_object_filter_config: Optional[Dict[str, Any]] = None + frame_pass_fail_config: Optional[Dict[str, Any]] = None + + def __post_init__(self): + # Validate frame_id for 2D detection + if self.frame_id not in VALID_2D_FRAME_IDS: + raise ValueError( + f"Invalid frame_id '{self.frame_id}' for 2D detection. Valid options: {VALID_2D_FRAME_IDS}" + ) + + # Set default evaluation config if not provided + if self.evaluation_config_dict is None: + default_eval_config = { + "evaluation_task": "detection2d", + "target_labels": self.class_names, + "iou_2d_thresholds": self.iou_thresholds, + "center_distance_bev_thresholds": None, + "plane_distance_thresholds": None, + "iou_3d_thresholds": None, + "label_prefix": "autoware", + } + object.__setattr__(self, "evaluation_config_dict", default_eval_config) + + # Set default critical object filter config if not provided + if self.critical_object_filter_config is None: + default_filter_config = { + "target_labels": self.class_names, + "ignore_attributes": None, + } + object.__setattr__(self, "critical_object_filter_config", default_filter_config) + + # Set default frame pass fail config if not provided + if self.frame_pass_fail_config is None: + num_classes = len(self.class_names) + default_pass_fail_config = { + "target_labels": self.class_names, + "matching_threshold_list": [0.5] * num_classes, + "confidence_threshold_list": None, + } + object.__setattr__(self, "frame_pass_fail_config", default_pass_fail_config) + + +class Detection2DMetricsInterface(BaseMetricsInterface): + """ + Interface for computing 2D detection metrics using autoware_perception_evaluation. + + This interface provides a simplified interface for the deployment framework to + compute mAP for 2D object detection tasks (YOLOX, etc.). + + Unlike 3D detection, 2D detection: + - Uses IoU 2D for matching (based on bounding box overlap) + - Does not compute APH (no heading information in 2D) + - Works with image-space bounding boxes [x1, y1, x2, y2] + + Example usage: + config = Detection2DMetricsConfig( + class_names=["car", "truck", "bus", "bicycle", "pedestrian"], + iou_thresholds=[0.5, 0.75], + ) + interface = Detection2DMetricsInterface(config) + + # Add frames + for pred, gt in zip(predictions_list, ground_truths_list): + interface.add_frame( + predictions=pred, # List[Dict] with bbox, label, score + ground_truths=gt, # List[Dict] with bbox, label + ) + + # Compute metrics + metrics = interface.compute_metrics() + """ + + _UNKNOWN = "unknown" + + def __init__( + self, + config: Detection2DMetricsConfig, + data_root: str = "data/t4dataset/", + result_root_directory: str = "/tmp/perception_eval_2d/", + ): + """ + Initialize the 2D detection metrics interface. + + Args: + config: Configuration for 2D detection metrics. + data_root: Root directory of the dataset. + result_root_directory: Directory for saving evaluation results. 
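+
+        Example (illustrative; class names and thresholds are arbitrary):
+
+            config = Detection2DMetricsConfig(
+                class_names=["car", "pedestrian"],
+                iou_thresholds=[0.5],
+            )
+            interface = Detection2DMetricsInterface(config)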
+ """ + super().__init__(config) + self.config: Detection2DMetricsConfig = config + self.data_root = data_root + self.result_root_directory = result_root_directory + + # Create perception evaluation config + self.perception_eval_config = PerceptionEvaluationConfig( + dataset_paths=data_root, + frame_id=config.frame_id, + result_root_directory=result_root_directory, + evaluation_config_dict=config.evaluation_config_dict, + load_raw_data=False, + ) + + # Create critical object filter config + self.critical_object_filter_config = CriticalObjectFilterConfig( + evaluator_config=self.perception_eval_config, + **config.critical_object_filter_config, + ) + + # Create frame pass fail config + self.frame_pass_fail_config = PerceptionPassFailConfig( + evaluator_config=self.perception_eval_config, + **config.frame_pass_fail_config, + ) + + # Initialize evaluation manager + self.evaluator: Optional[PerceptionEvaluationManager] = None + + def reset(self) -> None: + """Reset the interface for a new evaluation session.""" + self.evaluator = PerceptionEvaluationManager( + evaluation_config=self.perception_eval_config, + load_ground_truth=False, + metric_output_dir=None, + ) + self._frame_count = 0 + + def _convert_index_to_label(self, label_index: int) -> Label: + """Convert a label index to a Label object. + + Args: + label_index: Index of the label in class_names. + + Returns: + Label object with AutowareLabel. + """ + if 0 <= label_index < len(self.class_names): + class_name = self.class_names[label_index] + else: + class_name = self._UNKNOWN + + autoware_label = AutowareLabel.__members__.get(class_name.upper(), AutowareLabel.UNKNOWN) + return Label(label=autoware_label, name=class_name) + + def _predictions_to_dynamic_objects_2d( + self, + predictions: List[Dict[str, Any]], + unix_time: int, + ) -> List[DynamicObject2D]: + """Convert prediction dicts to DynamicObject2D instances. + + Args: + predictions: List of prediction dicts with keys: + - bbox: [x1, y1, x2, y2] (image coordinates) + - label: int (class index) + - score: float (confidence score) + unix_time: Unix timestamp in microseconds. + + Returns: + List of DynamicObject2D instances. + """ + estimated_objects = [] + frame_id = FrameID.from_value(self.frame_id) + + for pred in predictions: + bbox = pred.get("bbox", []) + if len(bbox) < 4: + continue + + # Extract bbox components [x1, y1, x2, y2] + x1, y1, x2, y2 = bbox[0], bbox[1], bbox[2], bbox[3] + + # Convert [x1, y1, x2, y2] to [xmin, ymin, width, height] format + # as required by DynamicObject2D.roi + xmin = int(x1) + ymin = int(y1) + width = int(x2 - x1) + height = int(y2 - y1) + + # Get label + label_idx = pred.get("label", 0) + semantic_label = self._convert_index_to_label(int(label_idx)) + + # Get score + score = float(pred.get("score", 0.0)) + + # Create DynamicObject2D + # roi format: (xmin, ymin, width, height) + dynamic_obj = DynamicObject2D( + unix_time=unix_time, + frame_id=frame_id, + semantic_score=score, + semantic_label=semantic_label, + roi=(xmin, ymin, width, height), + uuid=None, + ) + estimated_objects.append(dynamic_obj) + + return estimated_objects + + def _ground_truths_to_dynamic_objects_2d( + self, + ground_truths: List[Dict[str, Any]], + unix_time: int, + ) -> List[DynamicObject2D]: + """Convert ground truth dicts to DynamicObject2D instances. + + Args: + ground_truths: List of ground truth dicts with keys: + - bbox: [x1, y1, x2, y2] (image coordinates) + - label: int (class index) + unix_time: Unix timestamp in microseconds. 
+ + Returns: + List of DynamicObject2D instances. + """ + gt_objects = [] + frame_id = FrameID.from_value(self.frame_id) + + for gt in ground_truths: + bbox = gt.get("bbox", []) + if len(bbox) < 4: + continue + + # Extract bbox components [x1, y1, x2, y2] + x1, y1, x2, y2 = bbox[0], bbox[1], bbox[2], bbox[3] + + # Convert [x1, y1, x2, y2] to [xmin, ymin, width, height] format + # as required by DynamicObject2D.roi + xmin = int(x1) + ymin = int(y1) + width = int(x2 - x1) + height = int(y2 - y1) + + # Get label + label_idx = gt.get("label", 0) + semantic_label = self._convert_index_to_label(int(label_idx)) + + # Create DynamicObject2D (GT always has score 1.0) + # roi format: (xmin, ymin, width, height) + dynamic_obj = DynamicObject2D( + unix_time=unix_time, + frame_id=frame_id, + semantic_score=1.0, + semantic_label=semantic_label, + roi=(xmin, ymin, width, height), + uuid=None, + ) + gt_objects.append(dynamic_obj) + + return gt_objects + + def add_frame( + self, + predictions: List[Dict[str, Any]], + ground_truths: List[Dict[str, Any]], + frame_name: Optional[str] = None, + ) -> None: + """Add a frame of predictions and ground truths for evaluation. + + Args: + predictions: List of prediction dicts with keys: + - bbox: [x1, y1, x2, y2] (image coordinates) + - label: int (class index) + - score: float (confidence score) + ground_truths: List of ground truth dicts with keys: + - bbox: [x1, y1, x2, y2] (image coordinates) + - label: int (class index) + frame_name: Optional name for the frame. + """ + if self.evaluator is None: + self.reset() + + # Unix time in microseconds (int) + unix_time = int(time.time() * 1e6) + if frame_name is None: + frame_name = str(self._frame_count) + + # Convert predictions to DynamicObject2D + estimated_objects = self._predictions_to_dynamic_objects_2d(predictions, unix_time) + + # Convert ground truths to DynamicObject2D list + gt_objects = self._ground_truths_to_dynamic_objects_2d(ground_truths, unix_time) + + # Create FrameGroundTruth for 2D + frame_ground_truth = FrameGroundTruth( + unix_time=unix_time, + frame_name=frame_name, + objects=gt_objects, + transforms=None, + raw_data=None, + ) + + # Add frame result to evaluator + try: + self.evaluator.add_frame_result( + unix_time=unix_time, + ground_truth_now_frame=frame_ground_truth, + estimated_objects=estimated_objects, + critical_object_filter_config=self.critical_object_filter_config, + frame_pass_fail_config=self.frame_pass_fail_config, + ) + self._frame_count += 1 + except Exception as e: + logger.warning(f"Failed to add frame {frame_name}: {e}") + + def compute_metrics(self) -> Dict[str, float]: + """Compute metrics from all added frames. + + Returns: + Dictionary of metrics with keys like: + - mAP_iou_2d_0.5 + - mAP_iou_2d_0.75 + - car_AP_iou_2d_0.5 + - etc. + """ + if self.evaluator is None or self._frame_count == 0: + logger.warning("No frames to evaluate") + return {} + + try: + # Get scene result (aggregated metrics) + metrics_score: MetricsScore = self.evaluator.get_scene_result() + + # Process metrics into a flat dictionary + return self._process_metrics_score(metrics_score) + + except Exception as e: + logger.error(f"Error computing metrics: {e}") + import traceback + + traceback.print_exc() + return {} + + def _process_metrics_score(self, metrics_score: MetricsScore) -> Dict[str, float]: + """Process MetricsScore into a flat dictionary. + + Args: + metrics_score: MetricsScore instance from evaluator. + + Returns: + Flat dictionary of metrics. 
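+
+        Keys follow ``{class}_AP_{matching_mode}_{threshold}`` for per-class AP
+        and ``mAP_{matching_mode}`` for the aggregate, e.g. (values and the
+        exact matching-mode string are illustrative):
+
+            {"car_AP_iou_2d_0.5": 0.72, "mAP_iou_2d": 0.65}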
+ """ + metric_dict = {} + + for map_instance in metrics_score.mean_ap_values: + matching_mode = map_instance.matching_mode.value.lower().replace(" ", "_") + + # Process individual AP values + for label, aps in map_instance.label_to_aps.items(): + label_name = label.value + + for ap in aps: + threshold = ap.matching_threshold + ap_value = ap.ap + + # Create the metric key + key = f"{label_name}_AP_{matching_mode}_{threshold}" + metric_dict[key] = ap_value + + # Add mAP value (no mAPH for 2D detection) + map_key = f"mAP_{matching_mode}" + metric_dict[map_key] = map_instance.map + + return metric_dict + + @property + def summary(self) -> DetectionSummary: + """Get a summary of the evaluation including mAP and per-class metrics for all matching modes.""" + metrics = self.compute_metrics() + + # Extract matching modes from metrics + modes = [] + for k in metrics.keys(): + if k.startswith("mAP_") and k != "mAP": + modes.append(k[len("mAP_") :]) + modes = list(dict.fromkeys(modes)) # Remove duplicates while preserving order + + if not modes: + return DetectionSummary( + mAP_by_mode={}, + mAPH_by_mode={}, + per_class_ap_by_mode={}, + num_frames=self._frame_count, + detailed_metrics=metrics, + ) + + # Collect mAP and per-class AP for each matching mode + mAP_by_mode: Dict[str, float] = {} + per_class_ap_by_mode: Dict[str, Dict[str, float]] = {} + + for mode in modes: + map_value = metrics.get(f"mAP_{mode}", 0.0) + mAP_by_mode[mode] = float(map_value) + + # Collect AP values per class for this mode + per_class_ap_values: Dict[str, List[float]] = {} + ap_key_infix = f"_AP_{mode}_" + for key, value in metrics.items(): + if ap_key_infix not in key or key.startswith("mAP"): + continue + class_name = key.split("_AP_", 1)[0] + per_class_ap_values.setdefault(class_name, []).append(float(value)) + + if per_class_ap_values: + per_class_ap_by_mode[mode] = {k: float(np.mean(v)) for k, v in per_class_ap_values.items() if v} + + return DetectionSummary( + mAP_by_mode=mAP_by_mode, + mAPH_by_mode={}, # 2D detection doesn't have mAPH + per_class_ap_by_mode=per_class_ap_by_mode, + num_frames=self._frame_count, + detailed_metrics=metrics, + ) diff --git a/deployment/core/metrics/detection_3d_metrics.py b/deployment/core/metrics/detection_3d_metrics.py new file mode 100644 index 000000000..4cbf5ac95 --- /dev/null +++ b/deployment/core/metrics/detection_3d_metrics.py @@ -0,0 +1,712 @@ +""" +3D Detection Metrics Interface using autoware_perception_evaluation. + +This module provides an interface to compute 3D detection metrics (mAP, mAPH) +using autoware_perception_evaluation, ensuring consistent metrics between +training evaluation (T4MetricV2) and deployment evaluation. 
+ +Usage: + config = Detection3DMetricsConfig( + class_names=["car", "truck", "bus", "bicycle", "pedestrian"], + frame_id="base_link", + ) + interface = Detection3DMetricsInterface(config) + + # Add frames + for pred, gt in zip(predictions_list, ground_truths_list): + interface.add_frame( + predictions=pred, # List[Dict] with bbox_3d, label, score + ground_truths=gt, # List[Dict] with bbox_3d, label + ) + + # Compute metrics + metrics = interface.compute_metrics() + # Returns: {"mAP_center_distance_bev_0.5": 0.7, ...} +""" + +import logging +import re +import time +from dataclasses import dataclass +from typing import Any, Dict, List, Mapping, Optional + +import numpy as np +from perception_eval.common.dataset import FrameGroundTruth +from perception_eval.common.label import AutowareLabel, Label +from perception_eval.common.object import DynamicObject +from perception_eval.common.shape import Shape, ShapeType +from perception_eval.config.perception_evaluation_config import PerceptionEvaluationConfig +from perception_eval.evaluation.metrics import MetricsScore +from perception_eval.evaluation.result.perception_frame_config import ( + CriticalObjectFilterConfig, + PerceptionPassFailConfig, +) +from perception_eval.manager import PerceptionEvaluationManager +from pyquaternion import Quaternion + +from deployment.core.metrics.base_metrics_interface import BaseMetricsConfig, BaseMetricsInterface, DetectionSummary + +logger = logging.getLogger(__name__) + + +@dataclass(frozen=True) +class Detection3DMetricsConfig(BaseMetricsConfig): + """Configuration for 3D detection metrics. + + Attributes: + class_names: List of class names for evaluation. + frame_id: Frame ID for evaluation (e.g., "base_link"). + evaluation_config_dict: Configuration dict for perception evaluation. + Example: + { + "evaluation_task": "detection", + "target_labels": ["car", "truck", "bus", "bicycle", "pedestrian"], + "center_distance_bev_thresholds": [0.5, 1.0, 2.0, 4.0], + "plane_distance_thresholds": [2.0, 4.0], + "iou_2d_thresholds": None, + "iou_3d_thresholds": None, + "label_prefix": "autoware", + "max_distance": 121.0, + "min_distance": -121.0, + "min_point_numbers": 0, + } + critical_object_filter_config: Config for filtering critical objects. + Example: + { + "target_labels": ["car", "truck", "bus", "bicycle", "pedestrian"], + "ignore_attributes": None, + "max_distance_list": [121.0, 121.0, 121.0, 121.0, 121.0], + "min_distance_list": [-121.0, -121.0, -121.0, -121.0, -121.0], + } + frame_pass_fail_config: Config for pass/fail criteria. 
+ Example: + { + "target_labels": ["car", "truck", "bus", "bicycle", "pedestrian"], + "matching_threshold_list": [2.0, 2.0, 2.0, 2.0, 2.0], + "confidence_threshold_list": None, + } + """ + + evaluation_config_dict: Optional[Dict[str, Any]] = None + critical_object_filter_config: Optional[Dict[str, Any]] = None + frame_pass_fail_config: Optional[Dict[str, Any]] = None + + def __post_init__(self): + # Set default evaluation config if not provided + if self.evaluation_config_dict is None: + default_eval_config = { + "evaluation_task": "detection", + "target_labels": self.class_names, + "center_distance_bev_thresholds": [0.5, 1.0, 2.0, 4.0], + "plane_distance_thresholds": [2.0, 4.0], + "iou_2d_thresholds": None, + "iou_3d_thresholds": None, + "label_prefix": "autoware", + "max_distance": 121.0, + "min_distance": -121.0, + "min_point_numbers": 0, + } + object.__setattr__(self, "evaluation_config_dict", default_eval_config) + + # Set default critical object filter config if not provided + if self.critical_object_filter_config is None: + num_classes = len(self.class_names) + default_filter_config = { + "target_labels": self.class_names, + "ignore_attributes": None, + "max_distance_list": [121.0] * num_classes, + "min_distance_list": [-121.0] * num_classes, + } + object.__setattr__(self, "critical_object_filter_config", default_filter_config) + + # Set default frame pass fail config if not provided + if self.frame_pass_fail_config is None: + num_classes = len(self.class_names) + default_pass_fail_config = { + "target_labels": self.class_names, + "matching_threshold_list": [2.0] * num_classes, + "confidence_threshold_list": None, + } + object.__setattr__(self, "frame_pass_fail_config", default_pass_fail_config) + + +class Detection3DMetricsInterface(BaseMetricsInterface): + # TODO(vividf): refactor this class after refactoring T4MetricV2 + """ + Interface for computing 3D detection metrics using autoware_perception_evaluation. + + This interface provides a simplified interface for the deployment framework to + compute mAP, mAPH, and other detection metrics that are consistent with + the T4MetricV2 used during training. + """ + + _UNKNOWN = "unknown" + + def __init__( + self, + config: Detection3DMetricsConfig, + data_root: str = "data/t4dataset/", + result_root_directory: str = "/tmp/perception_eval/", + ): + """ + Initialize the 3D detection metrics interface. + + Args: + config: Configuration for 3D detection metrics. + data_root: Root directory of the dataset. + result_root_directory: Directory for saving evaluation results. + """ + super().__init__(config) + self.data_root = data_root + self.result_root_directory = result_root_directory + + cfg_dict = config.evaluation_config_dict + if cfg_dict is None: + cfg_dict = {} + if not isinstance(cfg_dict, Mapping): + raise TypeError(f"evaluation_config_dict must be a mapping, got {type(cfg_dict).__name__}") + self._evaluation_cfg_dict: Dict[str, Any] = dict(cfg_dict) + + # Create multiple evaluators for different distance ranges (like T4MetricV2) + min_distance = cfg_dict.get("min_distance") + max_distance = cfg_dict.get("max_distance") + + if isinstance(min_distance, (int, float)) and isinstance(max_distance, (int, float)): + min_distance = [float(min_distance)] + max_distance = [float(max_distance)] + elif not isinstance(min_distance, list) or not isinstance(max_distance, list): + raise ValueError( + "min_distance and max_distance must be either scalars (int/float) or lists for multi-evaluator mode. 
" + f"Got min_distance={type(min_distance)}, max_distance={type(max_distance)}" + ) + + if len(min_distance) != len(max_distance): + raise ValueError( + f"min_distance and max_distance must have the same length. " + f"Got len(min_distance)={len(min_distance)}, len(max_distance)={len(max_distance)}" + ) + + if not min_distance or not max_distance: + raise ValueError("min_distance and max_distance lists cannot be empty") + + # Create distance ranges and evaluators + self._bev_distance_ranges = list(zip(min_distance, max_distance)) + self.evaluators: Dict[str, Dict[str, Any]] = {} + self._create_evaluators(config) + + self.gt_count_total: int = 0 + self.pred_count_total: int = 0 + self.gt_count_by_label: Dict[str, int] = {} + self.pred_count_by_label: Dict[str, int] = {} + self._last_metrics_by_eval_name: Dict[str, MetricsScore] = {} + + def _create_evaluators(self, config: Detection3DMetricsConfig) -> None: + """Create multiple evaluators for different distance ranges (like T4MetricV2).""" + range_filter_name = "bev_center" + + for min_dist, max_dist in self._bev_distance_ranges: + # Create a copy of evaluation_config_dict with single distance values + eval_config_dict_raw = config.evaluation_config_dict + if eval_config_dict_raw is None: + eval_config_dict_raw = {} + if not isinstance(eval_config_dict_raw, Mapping): + raise TypeError(f"evaluation_config_dict must be a mapping, got {type(eval_config_dict_raw).__name__}") + eval_config_dict = dict(eval_config_dict_raw) + eval_config_dict["min_distance"] = min_dist + eval_config_dict["max_distance"] = max_dist + + # Create perception evaluation config for this range + evaluator_config = PerceptionEvaluationConfig( + dataset_paths=self.data_root, + frame_id=config.frame_id, + result_root_directory=self.result_root_directory, + evaluation_config_dict=eval_config_dict, + load_raw_data=False, + ) + + # Create critical object filter config + critical_object_filter_config = CriticalObjectFilterConfig( + evaluator_config=evaluator_config, + **config.critical_object_filter_config, + ) + + # Create frame pass fail config + frame_pass_fail_config = PerceptionPassFailConfig( + evaluator_config=evaluator_config, + **config.frame_pass_fail_config, + ) + + evaluator_name = f"{range_filter_name}_{min_dist}-{max_dist}" + + self.evaluators[evaluator_name] = { + "evaluator": None, # Will be created on reset + "evaluator_config": evaluator_config, + "critical_object_filter_config": critical_object_filter_config, + "frame_pass_fail_config": frame_pass_fail_config, + "bev_distance_range": (min_dist, max_dist), + } + + def reset(self) -> None: + """Reset the interface for a new evaluation session.""" + # Reset all evaluators + for eval_name, eval_data in self.evaluators.items(): + eval_data["evaluator"] = PerceptionEvaluationManager( + evaluation_config=eval_data["evaluator_config"], + load_ground_truth=False, + metric_output_dir=None, + ) + + self._frame_count = 0 + self.gt_count_total = 0 + self.pred_count_total = 0 + self.gt_count_by_label = {} + self.pred_count_by_label = {} + self._last_metrics_by_eval_name = {} + + def _convert_index_to_label(self, label_index: int) -> Label: + """Convert a label index to a Label object. + + Args: + label_index: Index of the label in class_names. + + Returns: + Label object with AutowareLabel. 
+ """ + if 0 <= label_index < len(self.class_names): + class_name = self.class_names[label_index] + else: + class_name = self._UNKNOWN + + autoware_label = AutowareLabel.__members__.get(class_name.upper(), AutowareLabel.UNKNOWN) + return Label(label=autoware_label, name=class_name) + + def _predictions_to_dynamic_objects( + self, + predictions: List[Dict[str, Any]], + unix_time: float, + ) -> List[DynamicObject]: + """Convert prediction dicts to DynamicObject instances. + + Args: + predictions: List of prediction dicts with keys: + - bbox_3d: [x, y, z, l, w, h, yaw] or [x, y, z, l, w, h, yaw, vx, vy] + (Same format as mmdet3d LiDARInstance3DBoxes) + - label: int (class index) + - score: float (confidence score) + unix_time: Unix timestamp for the frame. + + Returns: + List of DynamicObject instances. + """ + estimated_objects = [] + for pred in predictions: + bbox = pred.get("bbox_3d", []) + if len(bbox) < 7: + continue + + # Extract bbox components + # mmdet3d LiDARInstance3DBoxes format: [x, y, z, l, w, h, yaw, vx, vy] + # where l=length, w=width, h=height + x, y, z = bbox[0], bbox[1], bbox[2] + l, w, h = bbox[3], bbox[4], bbox[5] + yaw = bbox[6] + + # Velocity (optional) + vx = bbox[7] if len(bbox) > 7 else 0.0 + vy = bbox[8] if len(bbox) > 8 else 0.0 + + # Create quaternion from yaw + orientation = Quaternion(np.cos(yaw / 2), 0, 0, np.sin(yaw / 2)) + + # Get label + label_idx = pred.get("label", 0) + semantic_label = self._convert_index_to_label(int(label_idx)) + + # Get score + score = float(pred.get("score", 0.0)) + + # Shape size follows autoware_perception_evaluation convention: (length, width, height) + dynamic_obj = DynamicObject( + unix_time=unix_time, + frame_id=self.frame_id, + position=(x, y, z), + orientation=orientation, + shape=Shape(shape_type=ShapeType.BOUNDING_BOX, size=(l, w, h)), + velocity=(vx, vy, 0.0), + semantic_score=score, + semantic_label=semantic_label, + ) + estimated_objects.append(dynamic_obj) + + return estimated_objects + + def _ground_truths_to_frame_ground_truth( + self, + ground_truths: List[Dict[str, Any]], + unix_time: float, + frame_name: str = "0", + ) -> FrameGroundTruth: + """Convert ground truth dicts to FrameGroundTruth instance. + + Args: + ground_truths: List of ground truth dicts with keys: + - bbox_3d: [x, y, z, l, w, h, yaw] or [x, y, z, l, w, h, yaw, vx, vy] + (Same format as mmdet3d LiDARInstance3DBoxes) + - label: int (class index) + - num_lidar_pts: int (optional, number of lidar points) + unix_time: Unix timestamp for the frame. + frame_name: Name/ID of the frame. + + Returns: + FrameGroundTruth instance. 
+ """ + gt_objects = [] + for gt in ground_truths: + bbox = gt.get("bbox_3d", []) + if len(bbox) < 7: + continue + + # Extract bbox components + # mmdet3d LiDARInstance3DBoxes format: [x, y, z, l, w, h, yaw, vx, vy] + # where l=length, w=width, h=height + x, y, z = bbox[0], bbox[1], bbox[2] + l, w, h = bbox[3], bbox[4], bbox[5] + yaw = bbox[6] + + # Velocity (optional) + vx = bbox[7] if len(bbox) > 7 else 0.0 + vy = bbox[8] if len(bbox) > 8 else 0.0 + + # Create quaternion from yaw + orientation = Quaternion(np.cos(yaw / 2), 0, 0, np.sin(yaw / 2)) + + # Get label + label_idx = gt.get("label", 0) + semantic_label = self._convert_index_to_label(int(label_idx)) + + # Get point count (optional) + num_pts = gt.get("num_lidar_pts", 0) + + # Shape size follows autoware_perception_evaluation convention: (length, width, height) + dynamic_obj = DynamicObject( + unix_time=unix_time, + frame_id=self.frame_id, + position=(x, y, z), + orientation=orientation, + shape=Shape(shape_type=ShapeType.BOUNDING_BOX, size=(l, w, h)), + velocity=(vx, vy, 0.0), + semantic_score=1.0, # GT always has score 1.0 + semantic_label=semantic_label, + pointcloud_num=int(num_pts), + ) + gt_objects.append(dynamic_obj) + + return FrameGroundTruth( + unix_time=unix_time, + frame_name=frame_name, + objects=gt_objects, + transforms=None, + raw_data=None, + ) + + def add_frame( + self, + predictions: List[Dict[str, Any]], + ground_truths: List[Dict[str, Any]], + frame_name: Optional[str] = None, + ) -> None: + """Add a frame of predictions and ground truths for evaluation. + + Args: + predictions: List of prediction dicts with keys: + - bbox_3d: [x, y, z, l, w, h, yaw] or [x, y, z, l, w, h, yaw, vx, vy] + - label: int (class index) + - score: float (confidence score) + ground_truths: List of ground truth dicts with keys: + - bbox_3d: [x, y, z, l, w, h, yaw] or [x, y, z, l, w, h, yaw, vx, vy] + - label: int (class index) + - num_lidar_pts: int (optional) + frame_name: Optional name for the frame. 
+ """ + needs_reset = any(eval_data["evaluator"] is None for eval_data in self.evaluators.values()) + if needs_reset: + self.reset() + + unix_time = time.time() + if frame_name is None: + frame_name = str(self._frame_count) + + self.pred_count_total += len(predictions) + self.gt_count_total += len(ground_truths) + + for p in predictions: + try: + label = int(p.get("label", -1)) + except Exception: + label = -1 + if 0 <= label < len(self.class_names): + name = self.class_names[label] + self.pred_count_by_label[name] = self.pred_count_by_label.get(name, 0) + 1 + + for g in ground_truths: + try: + label = int(g.get("label", -1)) + except Exception: + label = -1 + if 0 <= label < len(self.class_names): + name = self.class_names[label] + self.gt_count_by_label[name] = self.gt_count_by_label.get(name, 0) + 1 + + # Convert predictions to DynamicObject + estimated_objects = self._predictions_to_dynamic_objects(predictions, unix_time) + + # Convert ground truths to FrameGroundTruth + frame_ground_truth = self._ground_truths_to_frame_ground_truth(ground_truths, unix_time, frame_name) + + # Add frame result to all evaluators + try: + for eval_name, eval_data in self.evaluators.items(): + if eval_data["evaluator"] is None: + eval_data["evaluator"] = PerceptionEvaluationManager( + evaluation_config=eval_data["evaluator_config"], + load_ground_truth=False, + metric_output_dir=None, + ) + eval_data["evaluator"].add_frame_result( + unix_time=unix_time, + ground_truth_now_frame=frame_ground_truth, + estimated_objects=estimated_objects, + critical_object_filter_config=eval_data["critical_object_filter_config"], + frame_pass_fail_config=eval_data["frame_pass_fail_config"], + ) + self._frame_count += 1 + except Exception as e: + logger.warning(f"Failed to add frame {frame_name}: {e}") + + def compute_metrics(self) -> Dict[str, float]: + """Compute metrics from all added frames. + + Returns: + Dictionary of metrics with keys like: + - mAP_center_distance_bev (mean AP across all classes, no threshold) + - mAPH_center_distance_bev (mean APH across all classes, no threshold) + - car_AP_center_distance_bev_0.5 (per-class AP with threshold) + - car_AP_center_distance_bev_1.0 (per-class AP with threshold) + - car_APH_center_distance_bev_0.5 (per-class APH with threshold) + - etc. + For multi-evaluator mode, metrics are prefixed with evaluator name: + - bev_center_0.0-50.0_mAP_center_distance_bev + - bev_center_0.0-50.0_car_AP_center_distance_bev_0.5 + - bev_center_50.0-90.0_mAP_center_distance_bev + - etc. + Note: mAP/mAPH keys do not include threshold; only per-class AP/APH keys do. 
+ """ + if self._frame_count == 0: + logger.warning("No frames to evaluate") + return {} + + try: + # Cache scene results to avoid recomputing + scene_results = {} + for eval_name, eval_data in self.evaluators.items(): + evaluator = eval_data["evaluator"] + if evaluator is None: + continue + + try: + metrics_score = evaluator.get_scene_result() + scene_results[eval_name] = metrics_score + except Exception as e: + logger.warning(f"Error computing metrics for {eval_name}: {e}") + + # Process cached metrics with evaluator name prefix + all_metrics = {} + for eval_name, metrics_score in scene_results.items(): + eval_metrics = self._process_metrics_score(metrics_score, prefix=eval_name) + all_metrics.update(eval_metrics) + + # Cache results for reuse by format_metrics_report() and summary property + self._last_metrics_by_eval_name = scene_results + + return all_metrics + + except Exception as e: + logger.error(f"Error computing metrics: {e}") + import traceback + + traceback.print_exc() + return {} + + def format_metrics_report(self) -> str: + """Format the metrics report as a human-readable string. + + For multi-evaluator mode, returns reports for all evaluators with distance range labels. + Uses cached results from compute_metrics() if available to avoid recomputation. + """ + # Use cached results if available, otherwise compute them + if not self._last_metrics_by_eval_name: + # Cache not available, compute now + self.compute_metrics() + + # Format reports for all evaluators using cached results + reports = [] + for eval_name, metrics_score in self._last_metrics_by_eval_name.items(): + try: + # Extract distance range from evaluator name (e.g., "bev_center_0.0-50.0" -> "0.0-50.0") + distance_range = eval_name.replace("bev_center_", "") + report = f"\n{'='*80}\nDistance Range: {distance_range} m\n{'='*80}\n{str(metrics_score)}" + reports.append(report) + except Exception as e: + logger.warning(f"Error formatting report for {eval_name}: {e}") + + return "\n".join(reports) if reports else "" + + def _process_metrics_score(self, metrics_score: MetricsScore, prefix: Optional[str] = None) -> Dict[str, float]: + """Process MetricsScore into a flat dictionary. + + Args: + metrics_score: MetricsScore instance from evaluator. + prefix: Optional prefix to add to metric keys (for multi-evaluator mode). + + Returns: + Flat dictionary of metrics. 
+ """ + metric_dict = {} + key_prefix = f"{prefix}_" if prefix else "" + + for map_instance in metrics_score.mean_ap_values: + matching_mode = map_instance.matching_mode.value.lower().replace(" ", "_") + + # Process individual AP values + for label, aps in map_instance.label_to_aps.items(): + label_name = label.value + + for ap in aps: + threshold = ap.matching_threshold + ap_value = ap.ap + + # Create the metric key + key = f"{key_prefix}{label_name}_AP_{matching_mode}_{threshold}" + metric_dict[key] = ap_value + + # Process individual APH values + label_to_aphs = getattr(map_instance, "label_to_aphs", None) + if label_to_aphs: + for label, aphs in label_to_aphs.items(): + label_name = label.value + for aph in aphs: + threshold = aph.matching_threshold + aph_value = getattr(aph, "aph", None) + if aph_value is None: + aph_value = getattr(aph, "ap", None) + if aph_value is None: + continue + key = f"{key_prefix}{label_name}_APH_{matching_mode}_{threshold}" + metric_dict[key] = aph_value + + # Add mAP and mAPH values + map_key = f"{key_prefix}mAP_{matching_mode}" + maph_key = f"{key_prefix}mAPH_{matching_mode}" + metric_dict[map_key] = map_instance.map + metric_dict[maph_key] = map_instance.maph + + return metric_dict + + @staticmethod + def _extract_matching_modes(metrics: Mapping[str, float]) -> List[str]: + """Extract matching modes from metrics dict keys (e.g., 'mAP_center_distance_bev' -> 'center_distance_bev'). + + Supports both prefixed and non-prefixed formats: + - Non-prefixed: "mAP_center_distance_bev" + - Prefixed: "bev_center_0.0-50.0_mAP_center_distance_bev" + """ + # Matches either "mAP_" or "_mAP_" + pat = re.compile(r"(?:^|_)mAP_(.+)$") + modes: List[str] = [] + for k in metrics.keys(): + m = pat.search(k) + if m: + modes.append(m.group(1)) + # Remove duplicates while preserving order + return list(dict.fromkeys(modes)) + + @property + def summary(self) -> DetectionSummary: + """Get a summary of the evaluation including mAP and per-class metrics for all matching modes. + + Only uses metrics from the last distance bucket. 
+ """ + metrics = self.compute_metrics() + + if not self._bev_distance_ranges: + return DetectionSummary( + mAP_by_mode={}, + mAPH_by_mode={}, + per_class_ap_by_mode={}, + num_frames=self._frame_count, + detailed_metrics=metrics, + ) + + # Use the last distance bucket (should be the full range) + last_min_dist, last_max_dist = self._bev_distance_ranges[-1] + last_evaluator_name = f"bev_center_{last_min_dist}-{last_max_dist}" + + last_metrics_score = self._last_metrics_by_eval_name.get(last_evaluator_name) + if last_metrics_score is None: + return DetectionSummary( + mAP_by_mode={}, + mAPH_by_mode={}, + per_class_ap_by_mode={}, + num_frames=self._frame_count, + detailed_metrics=metrics, + ) + + last_bucket_metrics = self._process_metrics_score(last_metrics_score, prefix=None) + + modes = self._extract_matching_modes(last_bucket_metrics) + if not modes: + return DetectionSummary( + mAP_by_mode={}, + mAPH_by_mode={}, + per_class_ap_by_mode={}, + num_frames=self._frame_count, + detailed_metrics=metrics, + ) + + mAP_by_mode: Dict[str, float] = {} + mAPH_by_mode: Dict[str, float] = {} + per_class_ap_by_mode: Dict[str, Dict[str, float]] = {} + + for mode in modes: + # Get mAP and mAPH directly from last bucket metrics + map_key = f"mAP_{mode}" + maph_key = f"mAPH_{mode}" + + mAP_by_mode[mode] = last_bucket_metrics.get(map_key, 0.0) + mAPH_by_mode[mode] = last_bucket_metrics.get(maph_key, 0.0) + + # Collect AP values per class for this mode from the last bucket + per_class_ap_values: Dict[str, List[float]] = {} + ap_key_separator = f"_AP_{mode}_" + + for key, value in last_bucket_metrics.items(): + idx = key.find(ap_key_separator) + if idx < 0: + continue + + # Label is the token right before "_AP_{mode}_" + prefix_part = key[:idx] + class_name = prefix_part.split("_")[-1] if prefix_part else "" + if class_name: + per_class_ap_values.setdefault(class_name, []).append(float(value)) + + if per_class_ap_values: + per_class_ap_by_mode[mode] = {k: float(np.mean(v)) for k, v in per_class_ap_values.items() if v} + + return DetectionSummary( + mAP_by_mode=mAP_by_mode, + mAPH_by_mode=mAPH_by_mode, + per_class_ap_by_mode=per_class_ap_by_mode, + num_frames=self._frame_count, + detailed_metrics=metrics, + ) diff --git a/deployment/docs/README.md b/deployment/docs/README.md new file mode 100644 index 000000000..262f195cf --- /dev/null +++ b/deployment/docs/README.md @@ -0,0 +1,13 @@ +# Deployment Docs Index + +Reference guides extracted from the monolithic deployment README: + +- [`overview.md`](./overview.md) – high-level summary, design principles, and key features. +- [`architecture.md`](./architecture.md) – workflow diagram, core components, pipelines, and layout. +- [`usage.md`](./usage.md) – commands, runner setup, typed contexts, CLI args, export modes. +- [`configuration.md`](./configuration.md) – configuration structure, typed config classes, backend enums. +- [`projects.md`](./projects.md) – CenterPoint, YOLOX, and Calibration deployment specifics. +- [`export_pipeline.md`](./export_pipeline.md) – ONNX/TensorRT export details plus export pipelines. +- [`verification_evaluation.md`](./verification_evaluation.md) – verification mixin, evaluation metrics, core contract. +- [`best_practices.md`](./best_practices.md) – best practices, troubleshooting, and roadmap items. +- [`contributing.md`](./contributing.md) – steps for adding new deployment projects. 
diff --git a/deployment/docs/architecture.md b/deployment/docs/architecture.md new file mode 100644 index 000000000..900975dd1 --- /dev/null +++ b/deployment/docs/architecture.md @@ -0,0 +1,77 @@ +# Deployment Architecture + +## High-Level Workflow + +``` +┌─────────────────────────────────────────────────────────┐ +│ Project Entry Points │ +│ (projects/*/deploy/main.py) │ +│ - CenterPoint, YOLOX-ELAN, Calibration │ +└──────────────────┬──────────────────────────────────────┘ + │ +┌──────────────────▼──────────────────────────────────────┐ +│ BaseDeploymentRunner + Project Runners │ +│ - Coordinates load → export → verify → evaluate │ +│ - Delegates to helper orchestrators │ +│ - Projects extend the base runner for custom logic │ +└──────────────────┬──────────────────────────────────────┘ + │ + ┌──────────┴────────────┐ + │ │ +┌───────▼────────┐ ┌────────▼───────────────┐ +│ Exporters │ │ Helper Orchestrators │ +│ - ONNX / TRT │ │ - ArtifactManager │ +│ - Wrappers │ │ - VerificationOrch. │ +│ - Export Ppl. │ │ - EvaluationOrch. │ +└────────────────┘ └────────┬───────────────┘ + │ +┌───────────────────────────────▼─────────────────────────┐ +│ Evaluators & Pipelines │ +│ - BaseDeploymentPipeline + task-specific variants │ +│ - Backend-specific implementations (PyTorch/ONNX/TRT) │ +└────────────────────────────────────────────────────────┘ +``` + +## Core Components + +### BaseDeploymentRunner & Project Runners + +`BaseDeploymentRunner` orchestrates the export/verification/evaluation loop. Project runners (CenterPoint, YOLOX, Calibration, …): + +- Implement model loading. +- Inject wrapper classes and optional export pipelines. +- Reuse `ExporterFactory` to lazily create ONNX/TensorRT exporters. +- Delegate artifact registration plus verification/evaluation to the shared orchestrators. + +### Core Package (`deployment/core/`) + +- `BaseDeploymentConfig` – typed deployment configuration container. +- `Backend` – enum guaranteeing backend name consistency. +- `Artifact` – dataclass describing exported artifacts. +- `VerificationMixin` – recursive comparer for nested outputs. +- `BaseEvaluator` – task-specific evaluation contract. +- `BaseDataLoader` – data-loading abstraction. +- `build_preprocessing_pipeline` – extracts preprocessing steps from MMDet/MMDet3D configs. +- Typed value objects (`constants.py`, `runtime_config.py`, `task_config.py`, `results.py`) keep configuration and metrics structured. + +### Exporters & Export Pipelines + +- `exporters/common/` hosts the base exporters, typed config objects, and `ExporterFactory`. +- Project wrappers live in `exporters/{project}/model_wrappers.py`. +- Complex projects add export pipelines (e.g., `CenterPointONNXExportPipeline`) that orchestrate multi-file exports by composing the base exporters. + +### Pipelines + +`BaseDeploymentPipeline` defines `preprocess → run_model → postprocess`, while `PipelineFactory` builds backend-specific implementations for each task (`Detection2D`, `Detection3D`, `Classification`). Pipelines are encapsulated per backend (PyTorch/ONNX/TensorRT) under `deployment/pipelines/{task}/`. + +### File Structure Snapshot + +``` +deployment/ +├── core/ # Core dataclasses, configs, evaluators +├── exporters/ # Base exporters + project wrappers/export pipelines +├── pipelines/ # Task-specific pipelines per backend +├── runners/ # Shared runner + project adapters +``` + +Project entry points follow the same pattern under `projects/*/deploy/` with `main.py`, `data_loader.py`, `evaluator.py`, and `configs/deploy_config.py`. 
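+
+### Pipeline Template Sketch
+
+To make the `preprocess → run_model → postprocess` split concrete, the sketch below shows the template-method shape it implies. The class names, signatures, and bodies here are illustrative assumptions only; the actual `BaseDeploymentPipeline` and its backend subclasses live under `deployment/pipelines/` and may differ in detail.
+
+```python
+from abc import ABC, abstractmethod
+from typing import Any, Dict
+
+
+class SketchPipeline(ABC):
+    """Illustrative template: shared pre/postprocessing around a backend-specific step."""
+
+    def run(self, sample: Dict[str, Any]) -> Dict[str, Any]:
+        model_input = self.preprocess(sample)     # shared across backends
+        raw_output = self.run_model(model_input)  # backend-specific inference
+        return self.postprocess(raw_output)       # shared across backends
+
+    def preprocess(self, sample: Dict[str, Any]) -> Any:
+        return sample["input"]
+
+    @abstractmethod
+    def run_model(self, model_input: Any) -> Any:
+        ...
+
+    def postprocess(self, raw_output: Any) -> Dict[str, Any]:
+        return {"output": raw_output}
+
+
+class SketchONNXPipeline(SketchPipeline):
+    """A backend subclass only supplies the inference glue code."""
+
+    def __init__(self, session: Any) -> None:
+        self.session = session  # e.g. an onnxruntime.InferenceSession
+
+    def run_model(self, model_input: Any) -> Any:
+        return self.session.run(None, {"input": model_input})
+```
+
+Keeping backend-specific code confined to `run_model` is what lets `PipelineFactory` swap PyTorch, ONNX, and TensorRT implementations behind a single interface.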
diff --git a/deployment/docs/best_practices.md b/deployment/docs/best_practices.md new file mode 100644 index 000000000..a7ddeec23 --- /dev/null +++ b/deployment/docs/best_practices.md @@ -0,0 +1,84 @@ +# Best Practices & Troubleshooting + +## Configuration Management + +- Keep deployment configs separate from training/model configs. +- Use relative paths for datasets and artifacts when possible. +- Document non-default configuration options in project READMEs. + +## Model Export + +- Inject wrapper classes (and optional export pipelines) into project runners; let `ExporterFactory` build exporters lazily. +- Store wrappers under `exporters/{model}/model_wrappers.py` and reuse `IdentityWrapper` when reshaping is unnecessary. +- Add export-pipeline modules only when orchestration beyond single file export is required. +- Always verify ONNX exports before TensorRT conversion. +- Choose TensorRT precision policies (`auto`, `fp16`, `fp32_tf32`, `strongly_typed`) based on deployment targets. + +## Unified Architecture Pattern + +``` +exporters/{model}/ +├── model_wrappers.py +├── [optional] onnx_export_pipeline.py +└── [optional] tensorrt_export_pipeline.py +``` + +- Simple models: use base exporters + wrappers, no subclassing. +- Complex models: compose export pipelines that call the base exporters multiple times. + +## Dependency Injection Pattern + +```python +runner = YOLOXOptElanDeploymentRunner( + ..., + onnx_wrapper_cls=YOLOXOptElanONNXWrapper, +) +``` + +- Keeps dependencies explicit. +- Enables lazy exporter construction. +- Simplifies testing via mock wrappers/pipelines. + +## Verification Tips + +- Start with strict tolerances (0.01) and relax only when necessary. +- Verify a representative sample set. +- Ensure preprocessing/postprocessing is consistent across backends. + +## Evaluation Tips + +- Align evaluation settings across backends. +- Report latency statistics alongside accuracy metrics. +- Compare backend-specific outputs for regressions. + +## Pipeline Development + +- Inherit from the correct task-specific base pipeline. +- Share preprocessing/postprocessing logic where possible. +- Keep backend-specific implementations focused on inference glue code. + +## Troubleshooting + +1. **ONNX export fails** + - Check for unsupported ops and validate input shapes. + - Try alternative opset versions. +2. **TensorRT build fails** + - Validate the ONNX model. + - Confirm input shape/profile configuration. + - Adjust workspace size if memory errors occur. +3. **Verification fails** + - Tweak tolerance settings. + - Confirm identical preprocessing across backends. + - Verify device assignments. +4. **Evaluation errors** + - Double-check data loader paths. + - Ensure model outputs match evaluator expectations. + - Confirm the correct `task_type` in config. + +## Future Enhancements + +- Support more task types (segmentation, etc.). +- Automatic precision tuning for TensorRT. +- Distributed evaluation support. +- MLOps pipeline integration. +- Performance profiling tools. diff --git a/deployment/docs/configuration.md b/deployment/docs/configuration.md new file mode 100644 index 000000000..9b4ba654c --- /dev/null +++ b/deployment/docs/configuration.md @@ -0,0 +1,224 @@ +# Configuration Reference + +Configurations remain dictionary-driven for flexibility, with typed dataclasses layered on top for validation and IDE support. 
+ +## Structure + +### Single-Model Export (Simple Models) + +For simple models with a single ONNX/TensorRT output: + +```python +# Task type +task_type = "detection3d" # or "detection2d", "classification" + +# Checkpoint (single source of truth) +checkpoint_path = "model.pth" + +devices = dict( + cpu="cpu", + cuda="cuda:0", +) + +export = dict( + mode="both", # "onnx", "trt", "both", "none" + work_dir="work_dirs/deployment", + onnx_path=None, # Required when mode="trt" and ONNX already exists +) + +runtime_io = dict( + info_file="data/info.pkl", + sample_idx=0, +) + +model_io = dict( + input_name="input", + input_shape=(3, 960, 960), + input_dtype="float32", + output_name="output", + batch_size=1, + dynamic_axes={...}, +) + +onnx_config = dict( + opset_version=16, + do_constant_folding=True, + save_file="model.onnx", +) + +tensorrt_config = dict( + precision_policy="auto", + max_workspace_size=1 << 30, +) +``` + +### Multi-File Export (Complex Models like CenterPoint) + +For models that export to multiple ONNX/TensorRT files, use the `components` config: + +```python +task_type = "detection3d" +checkpoint_path = "work_dirs/centerpoint/best_checkpoint.pth" + +devices = dict( + cpu="cpu", + cuda="cuda:0", +) + +export = dict( + mode="both", + work_dir="work_dirs/centerpoint_deployment", +) + +# Unified component configuration (single source of truth) +# Each component defines: name, file paths, IO spec, and TensorRT profile +components = dict( + voxel_encoder=dict( + name="pts_voxel_encoder", + onnx_file="pts_voxel_encoder.onnx", + engine_file="pts_voxel_encoder.engine", + io=dict( + inputs=[dict(name="input_features", dtype="float32")], + outputs=[dict(name="pillar_features", dtype="float32")], + dynamic_axes={ + "input_features": {0: "num_voxels", 1: "num_max_points"}, + "pillar_features": {0: "num_voxels"}, + }, + ), + tensorrt_profile=dict( + input_features=dict( + min_shape=[1000, 32, 11], + opt_shape=[20000, 32, 11], + max_shape=[64000, 32, 11], + ), + ), + ), + backbone_head=dict( + name="pts_backbone_neck_head", + onnx_file="pts_backbone_neck_head.onnx", + engine_file="pts_backbone_neck_head.engine", + io=dict( + inputs=[dict(name="spatial_features", dtype="float32")], + outputs=[ + dict(name="heatmap", dtype="float32"), + dict(name="reg", dtype="float32"), + # ... more outputs + ], + dynamic_axes={...}, + ), + tensorrt_profile=dict( + spatial_features=dict( + min_shape=[1, 32, 760, 760], + opt_shape=[1, 32, 760, 760], + max_shape=[1, 32, 760, 760], + ), + ), + ), +) + +# Shared ONNX settings (applied to all components) +onnx_config = dict( + opset_version=16, + do_constant_folding=True, + simplify=False, +) + +# Shared TensorRT settings (applied to all components) +tensorrt_config = dict( + precision_policy="auto", + max_workspace_size=2 << 30, +) +``` + +### Verification and Evaluation + +```python +verification = dict( + enabled=True, + num_verify_samples=3, + tolerance=0.1, + devices=devices, + scenarios={ + "both": [ + {"ref_backend": "pytorch", "ref_device": "cpu", + "test_backend": "onnx", "test_device": "cuda"}, + ] + } +) + +evaluation = dict( + enabled=True, + num_samples=100, + verbose=False, + backends={ + "pytorch": {"enabled": True, "device": devices["cpu"]}, + "onnx": {"enabled": True, "device": devices["cpu"]}, + "tensorrt": {"enabled": True, "device": devices["cuda"]}, + } +) +``` + +### Device Aliases + +Keep device definitions centralized by declaring a top-level `devices` dictionary and referencing aliases (for example, `devices["cuda"]`). 
Updating the mapping once automatically propagates to export, evaluation, and verification blocks without digging into nested dictionaries. + +## Backend Enum + +Use `deployment.core.Backend` to avoid typos while keeping backward compatibility with plain strings. + +```python +from deployment.core import Backend + +evaluation = dict( + backends={ + Backend.PYTORCH: {"enabled": True, "device": devices["cpu"]}, + Backend.ONNX: {"enabled": True, "device": devices["cpu"]}, + Backend.TENSORRT: {"enabled": True, "device": devices["cuda"]}, + } +) +``` + +## Typed Exporter Configs + +Typed classes in `deployment.exporters.common.configs` provide schema validation and IDE hints. + +```python +from deployment.exporters.common.configs import ( + ONNXExportConfig, + TensorRTExportConfig, + TensorRTModelInputConfig, + TensorRTProfileConfig, +) + +onnx_config = ONNXExportConfig( + input_names=("input",), + output_names=("output",), + opset_version=16, + do_constant_folding=True, + simplify=True, + save_file="model.onnx", + batch_size=1, +) + +trt_config = TensorRTExportConfig( + precision_policy="auto", + max_workspace_size=1 << 30, + model_inputs=( + TensorRTModelInputConfig( + input_shapes={ + "input": TensorRTProfileConfig( + min_shape=(1, 3, 960, 960), + opt_shape=(1, 3, 960, 960), + max_shape=(1, 3, 960, 960), + ) + } + ), + ), +) +``` + +Use `from_mapping()` / `from_dict()` helpers to instantiate typed configs from existing dictionaries. + +## Example Config Paths + +- `deployment/projects/centerpoint/config/deploy_config.py` - Multi-file export example diff --git a/deployment/docs/contributing.md b/deployment/docs/contributing.md new file mode 100644 index 000000000..a2d1b5c6a --- /dev/null +++ b/deployment/docs/contributing.md @@ -0,0 +1,29 @@ +# Contributing to Deployment + +## Adding a New Project + +1. **Evaluator & Data Loader** + - Implement `BaseEvaluator` with task-specific metrics. + - Implement `BaseDataLoader` variant for the dataset(s). + +2. **Project Bundle** + - Create a new bundle under `deployment/projects//`. + - Put **all project deployment code** in one place: `runner.py`, `evaluator.py`, `data_loader.py`, `config/deploy_config.py`. + +3. **Pipelines** + - Add backend-specific pipelines under `deployment/projects//pipelines/` and register a factory into `deployment.pipelines.registry.pipeline_registry`. + +4. **Export Pipelines (optional)** + - If the project needs multi-stage export, implement under `deployment/projects//export/` (compose the generic exporters in `deployment/exporters/common/`). + +5. **CLI wiring** + - Register a `ProjectAdapter` in `deployment/projects//__init__.py`. + - The unified entry point is `python -m deployment.cli.main ...` + +6. **Documentation** + - Update `deployment/README.md` and the relevant docs in `deployment/docs/`. + - Document special requirements, configuration flags, or export pipelines. + +## Core Contract + +Before touching shared components, review `deployment/docs/core_contract.md` to understand allowed dependencies between runners, evaluators, pipelines, and exporters. Adhering to the contract keeps refactors safe and ensures new logic lands in the correct layer. diff --git a/deployment/docs/core_contract.md b/deployment/docs/core_contract.md new file mode 100644 index 000000000..dd5e9cdd8 --- /dev/null +++ b/deployment/docs/core_contract.md @@ -0,0 +1,57 @@ +## Deployment Core Contract + +This document defines the responsibilities and boundaries between the primary deployment components. 
Treat it as the “architecture contract” for contributors. + +### BaseDeploymentRunner (and project runners) +- Owns the end-to-end deployment flow: load PyTorch model → export ONNX/TensorRT → verify → evaluate. +- Constructs exporters via `ExporterFactory` and never embeds exporter-specific logic. +- Injects project-provided `BaseDataLoader`, `BaseEvaluator`, model configs, wrappers, and optional export pipelines. +- Ensures evaluators receive: + - Loaded PyTorch model (`set_pytorch_model`) + - Runtime/export artifacts (via `ArtifactManager`) + - Verification/evaluation requests (via orchestrators) +- Must not contain task-specific preprocessing/postprocessing; defer to evaluators/pipelines. + +### BaseEvaluator (and task evaluators) +- The single base class for all task evaluators, integrating `VerificationMixin`. +- Provides the unified evaluation loop: iterate samples → infer → accumulate → compute metrics. +- Requires a `TaskProfile` (task name, class names) and a `BaseMetricsInterface` at construction. +- Responsible for: + - Creating backend pipelines through `PipelineFactory` + - Preparing verification inputs from the data loader + - Computing task metrics using metrics interfaces + - Printing/reporting evaluation summaries +- Subclasses implement task-specific hooks: + - `_create_pipeline(model_spec, device)` → create backend pipeline + - `_prepare_input(sample, data_loader, device)` → extract model input + inference kwargs + - `_parse_predictions(pipeline_output)` → normalize raw output + - `_parse_ground_truths(gt_data)` → extract ground truth + - `_add_to_interface(predictions, ground_truths)` → feed metrics interface + - `_build_results(latencies, breakdowns, num_samples)` → construct final results dict + - `print_results(results)` → format and display results +- Inherits `VerificationMixin` automatically; subclasses only need `_get_output_names()` if custom names are desired. +- Provides common utilities: `_ensure_model_on_device()`, `_compute_latency_breakdown()`, `compute_latency_stats()`. + +### BaseDeploymentPipeline & PipelineFactory +- `BaseDeploymentPipeline` defines the inference template (`preprocess → run_model → postprocess`). +- Backend-specific subclasses handle only the inference mechanics for their backend. +- `PipelineFactory` is the single entrypoint for creating pipelines per task/backend: + - Hides backend instantiation details from evaluators. + - Ensures consistent constructor signatures (PyTorch models vs. ONNX paths vs. TensorRT engines). + - Central location for future pipeline wiring (new tasks/backends). +- Pipelines must avoid loading artifacts or computing metrics; they only execute inference. + +### Metrics Interfaces (Autoware-based interfaces) +- Provide a uniform interface for adding frames and computing summaries regardless of task. +- Encapsulate conversion from model predictions/ground truth to Autoware perception evaluation inputs. +- Return metric dictionaries that evaluators incorporate into `EvalResultDict` results. +- Should not access loaders, runners, or exporters directly; evaluators pass in the data they need. 
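+
+### Illustrative Evaluator Skeleton
+
+The skeleton below restates the `BaseEvaluator` hook list above as code, purely to show where project-specific logic lands. Only the hook names come from the contract; the class name, type hints, and the omission of the real base class are assumptions of this sketch.
+
+```python
+from typing import Any, Dict, List, Tuple
+
+
+class SketchDetectionEvaluator:
+    """Skeleton only: a real evaluator inherits BaseEvaluator and receives a
+    TaskProfile and a BaseMetricsInterface at construction (see above)."""
+
+    def _create_pipeline(self, model_spec: Any, device: str) -> Any:
+        # Ask PipelineFactory for the backend/device-specific pipeline.
+        raise NotImplementedError
+
+    def _prepare_input(self, sample: Dict[str, Any], data_loader: Any, device: str) -> Tuple[Any, Dict[str, Any]]:
+        # Return the model input plus any extra inference kwargs.
+        raise NotImplementedError
+
+    def _parse_predictions(self, pipeline_output: Any) -> List[Dict[str, Any]]:
+        # Normalize raw pipeline output into prediction dicts.
+        raise NotImplementedError
+
+    def _parse_ground_truths(self, gt_data: Any) -> List[Dict[str, Any]]:
+        # Extract ground truth in the format the metrics interface expects.
+        raise NotImplementedError
+
+    def _add_to_interface(self, predictions: List[Dict[str, Any]], ground_truths: List[Dict[str, Any]]) -> None:
+        # Feed one frame into the metrics interface (e.g. add_frame).
+        raise NotImplementedError
+
+    def _build_results(self, latencies: List[float], breakdowns: List[Dict[str, float]], num_samples: int) -> Dict[str, Any]:
+        # Assemble the final results dict, including latency statistics.
+        raise NotImplementedError
+
+    def print_results(self, results: Dict[str, Any]) -> None:
+        raise NotImplementedError
+```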
+ +### Summary of Allowed Dependencies +- **Runner → Evaluator** (injection) ✓ +- **Evaluator → PipelineFactory / Pipelines / Metrics Interfaces** ✓ +- **PipelineFactory → Pipelines** ✓ +- **Pipelines ↔ Metrics Interfaces** ✗ (evaluators mediate) +- **Metrics Interfaces → Runner/PipelineFactory** ✗ + +Adhering to this contract keeps responsibilities isolated, simplifies testing, and allows independent refactors of runners, evaluators, pipelines, and metrics logic. diff --git a/deployment/docs/export_pipeline.md b/deployment/docs/export_pipeline.md new file mode 100644 index 000000000..2fe5ffe7a --- /dev/null +++ b/deployment/docs/export_pipeline.md @@ -0,0 +1,104 @@ +# Export Pipelines + +## ONNX Export + +1. **Model preparation** – load PyTorch model and apply the wrapper if output reshaping is required. +2. **Input preparation** – grab a representative sample from the data loader. +3. **Export** – call `torch.onnx.export()` with the configured settings. +4. **Simplification** – optionally run ONNX simplification. +5. **Save** – store artifacts under `work_dir/onnx/`. + +## TensorRT Export + +1. **Validate ONNX** – ensure the ONNX model exists and is compatible. +2. **Network creation** – parse ONNX and build a TensorRT network. +3. **Precision policy** – apply the configured precision mode (`auto`, `fp16`, `fp32_tf32`, `strongly_typed`). +4. **Optimization profile** – configure dynamic-shape ranges. +5. **Engine build** – compile and serialize the engine. +6. **Save** – store artifacts under `work_dir/tensorrt/`. + +## Multi-File Export (CenterPoint) + +CenterPoint splits the model into multiple ONNX/TensorRT artifacts using a unified `components` configuration: + +```python +components = dict( + voxel_encoder=dict( + name="pts_voxel_encoder", + onnx_file="pts_voxel_encoder.onnx", # ONNX output filename + engine_file="pts_voxel_encoder.engine", # TensorRT output filename + io=dict( + inputs=[dict(name="input_features", dtype="float32")], + outputs=[dict(name="pillar_features", dtype="float32")], + dynamic_axes={...}, + ), + tensorrt_profile=dict( + input_features=dict(min_shape=[...], opt_shape=[...], max_shape=[...]), + ), + ), + backbone_head=dict( + name="pts_backbone_neck_head", + onnx_file="pts_backbone_neck_head.onnx", + engine_file="pts_backbone_neck_head.engine", + io=dict(...), + tensorrt_profile=dict(...), + ), +) +``` + +### Configuration Structure + +Each component in `deploy_cfg.components` defines: + +- `name`: Component identifier used during export +- `onnx_file`: Output ONNX filename +- `engine_file`: Output TensorRT engine filename +- `io`: Input/output specification (names, dtypes, dynamic_axes) +- `tensorrt_profile`: TensorRT optimization profile (min/opt/max shapes) + +### Export Pipeline Orchestration + +Export pipelines orchestrate: + +- Sequential export of each component +- Input/output wiring between stages +- Directory structure management + +CenterPoint uses a project-specific `ModelComponentExtractor` implementation that provides: + +- `extract_features(model, data_loader, sample_idx)`: project-specific feature extraction for tracing +- `extract_components(model, sample_data)`: splitting into ONNX-exportable submodules and per-component config overrides + +## Verification-Oriented Exports + +- Exporters register artifacts via `ArtifactManager`, making the exported files discoverable for verification and evaluation. +- Wrappers ensure consistent tensor ordering and shape expectations across backends. 
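+
+## Component Export Sketch
+
+The loop below sketches how the unified `components` mapping above can drive a sequential multi-file export. The `export_fn` callback and the helper itself are hypothetical placeholders for this illustration; the real orchestration is done by the project export pipelines, which compose the generic `ONNXExporter`/`TensorRTExporter`.
+
+```python
+from pathlib import Path
+from typing import Any, Callable, Dict, Mapping
+
+# Hypothetical callback that exports a single component to ONNX.
+ExportFn = Callable[[str, Dict[str, Any], Path], None]
+
+
+def export_components(components: Mapping[str, Dict[str, Any]], work_dir: str, export_fn: ExportFn) -> None:
+    """Iterate the unified component config and export each entry in order."""
+    onnx_dir = Path(work_dir) / "onnx"
+    onnx_dir.mkdir(parents=True, exist_ok=True)
+
+    for component in components.values():
+        output_path = onnx_dir / component["onnx_file"]
+        # `io` carries the input/output names, dtypes, and dynamic axes for this component.
+        export_fn(component["name"], component["io"], output_path)
+```
+
+With the CenterPoint configuration above, the callback would run once for the voxel encoder and once for the backbone/head, producing the two ONNX files that the TensorRT stage then converts.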
+ +## Dependency Injection Pattern + +Projects inject wrappers and export pipelines when instantiating the runner: + +```python +runner = CenterPointDeploymentRunner( + ..., + onnx_pipeline=CenterPointONNXExportPipeline(...), + tensorrt_pipeline=CenterPointTensorRTExportPipeline(...), +) +``` + +Simple projects can skip export pipelines entirely and rely on the base exporters provided by `ExporterFactory`. + +## Runtime Pipeline Usage + +Runtime pipelines receive the `components_cfg` through constructor injection: + +```python +pipeline = CenterPointONNXPipeline( + pytorch_model=model, + onnx_dir="/path/to/onnx", + device="cuda:0", + components_cfg=deploy_cfg["components"], # Pass component config +) +``` + +This allows pipelines to resolve artifact paths from the unified config. diff --git a/deployment/docs/overview.md b/deployment/docs/overview.md new file mode 100644 index 000000000..bebb7a47b --- /dev/null +++ b/deployment/docs/overview.md @@ -0,0 +1,59 @@ +# Deployment Overview + +The AWML Deployment Framework provides a standardized, task-agnostic approach to exporting PyTorch models to ONNX and TensorRT with verification and evaluation baked in. It abstracts the common workflow steps while leaving space for project-specific customization so that CenterPoint, YOLOX, CalibrationStatusClassification, and future models can share the same deployment flow. + +## Design Principles + +1. **Unified interface** – a shared `BaseDeploymentRunner` with thin project-specific subclasses. +2. **Task-agnostic core** – base classes support detection, classification, and segmentation tasks. +3. **Backend flexibility** – PyTorch, ONNX, and TensorRT backends are first-class citizens. +4. **Pipeline architecture** – common pre/postprocessing with backend-specific inference stages. +5. **Configuration-driven** – configs plus typed dataclasses provide predictable defaults and IDE support. +6. **Dependency injection** – exporters, wrappers, and export pipelines are explicitly wired for clarity and testability. +7. **Type-safe building blocks** – typed configs, runtime contexts, and result objects reduce runtime surprises. +8. **Extensible verification** – mixins compare nested outputs so that evaluators stay lightweight. + +## Key Features + +### Unified Deployment Workflow + +``` +Load Model → Export ONNX → Export TensorRT → Verify → Evaluate +``` + +### Scenario-Based Verification + +`VerificationMixin` normalizes devices, reuses pipelines from `PipelineFactory`, and recursively compares nested outputs with per-node logging. Scenarios define which backend pairs to compare. + +```python +verification = dict( + enabled=True, + scenarios={ + "both": [ + {"ref_backend": "pytorch", "ref_device": "cpu", + "test_backend": "onnx", "test_device": "cpu"}, + {"ref_backend": "onnx", "ref_device": "cpu", + "test_backend": "tensorrt", "test_device": "cuda:0"}, + ] + } +) +``` + +### Multi-Backend Evaluation + +Evaluators return typed results via `EvalResultDict` (TypedDict) ensuring consistent structure across backends. Metrics interfaces (`Detection3DMetricsInterface`, `Detection2DMetricsInterface`, `ClassificationMetricsInterface`) compute task-specific metrics using `autoware_perception_evaluation`. + +### Pipeline Architecture + +Shared preprocessing/postprocessing steps plug into backend-specific inference. Preprocessing can be generated from MMDet/MMDet3D configs via `build_preprocessing_pipeline`. + +### Flexible Export Modes + +- `mode="onnx"` – PyTorch → ONNX only. 
+- `mode="trt"` – Build TensorRT from an existing ONNX export. +- `mode="both"` – Full export pipeline. +- `mode="none"` – Skip export and only run evaluation. + +### TensorRT Precision Policies + +Supports `auto`, `fp16`, `fp32_tf32`, and `strongly_typed` modes with typed configuration to keep engine builds reproducible. diff --git a/deployment/docs/projects.md b/deployment/docs/projects.md new file mode 100644 index 000000000..e6ea0d118 --- /dev/null +++ b/deployment/docs/projects.md @@ -0,0 +1,74 @@ +# Project Guides + +## CenterPoint (3D Detection) + +**Highlights** + +- Multi-file ONNX export (voxel encoder + backbone/head) orchestrated via export pipelines. +- ONNX-compatible model configuration that mirrors training graph. +- Composed exporters keep logic reusable. + +**Pipelines & Wrappers** + +- `CenterPointONNXExportPipeline` – drives multiple ONNX exports using the generic `ONNXExporter`. +- `CenterPointTensorRTExportPipeline` – converts each ONNX file via the generic `TensorRTExporter`. +- `CenterPointONNXWrapper` – identity wrapper. + +**Key Files** + +- `deployment/cli/main.py` (single entrypoint) +- `deployment/projects/centerpoint/entrypoint.py` +- `deployment/projects/centerpoint/evaluator.py` +- `deployment/projects/centerpoint/pipelines/` +- `deployment/projects/centerpoint/export/` + +**Pipeline Structure** + +``` +preprocess() → run_voxel_encoder() → process_middle_encoder() → +run_backbone_head() → postprocess() +``` + +## YOLOX (2D Detection) + +**Highlights** + +- Standard single-file ONNX export. +- `YOLOXOptElanONNXWrapper` reshapes output to Tier4-compatible format. +- ReLU6 → ReLU replacement for ONNX compatibility. + +**Export Stack** + +- `ONNXExporter` and `TensorRTExporter` instantiated via `ExporterFactory` with the YOLOX wrapper. + +**Key Files** + +- `deployment/cli/main.py` (single entrypoint) +- `deployment/projects/yolox_opt_elan/` (planned bundle; not migrated yet) + +**Pipeline Structure** + +``` +preprocess() → run_model() → postprocess() +``` + +## CalibrationStatusClassification + +**Highlights** + +- Binary classification deployment with calibrated/miscalibrated data loaders. +- Single-file ONNX export with no extra output reshaping. + +**Export Stack** + +- `ONNXExporter` and `TensorRTExporter` with `CalibrationONNXWrapper` (identity wrapper). + +**Key Files** + +- `deployment/projects/calibration_status_classification/legacy/main.py` (legacy script) + +**Pipeline Structure** + +``` +preprocess() → run_model() → postprocess() +``` diff --git a/deployment/docs/usage.md b/deployment/docs/usage.md new file mode 100644 index 000000000..4c81382ba --- /dev/null +++ b/deployment/docs/usage.md @@ -0,0 +1,107 @@ +# Usage & Entry Points + +## Basic Commands + +```bash +# Single deployment entrypoint (project is a subcommand) +python -m deployment.cli.main centerpoint \ + \ + + +# Example with CenterPoint-specific flag +python -m deployment.cli.main centerpoint \ + \ + \ + --rot-y-axis-reference +``` + +## Creating a Project Runner + +Projects pass lightweight configuration objects (wrapper classes and optional export pipelines) into the runner. Exporters are created lazily via `ExporterFactory`. + +```python +# Project bundles live under deployment/projects/ and are resolved by the CLI. +# The runtime layer is under deployment/runtime/*. +``` + +Key points: + +- Pass wrapper classes (and optional export pipelines) instead of exporter instances. +- Exporters are constructed lazily inside `BaseDeploymentRunner`. 
+- Entry points remain explicit and easily testable. + +## Typed Context Objects + +Typed contexts carry parameters through the workflow, improving IDE discoverability and refactor safety. + +```python +from deployment.core import ExportContext, YOLOXExportContext, CenterPointExportContext + +results = runner.run(context=YOLOXExportContext( + sample_idx=0, + model_cfg_path="/path/to/config.py", +)) +``` + +Available contexts: + +- `ExportContext` – default context with `sample_idx` and `extra` dict. +- `YOLOXExportContext` – adds `model_cfg_path`. +- `CenterPointExportContext` – adds `rot_y_axis_reference`. +- `CalibrationExportContext` – calibration-specific options. + +Create custom contexts by subclassing `ExportContext` and adding dataclass fields. + +## Command-Line Arguments + +```bash +python deploy/main.py \ + \ # Deployment configuration file + \ # Model configuration file + --log-level # Optional: DEBUG, INFO, WARNING, ERROR (default: INFO) +``` + +## Export Modes + +### ONNX Only + +```python +checkpoint_path = "model.pth" + +export = dict( + mode="onnx", + work_dir="work_dirs/deployment", +) +``` + +### TensorRT From Existing ONNX + +```python +export = dict( + mode="trt", + onnx_path="work_dirs/deployment/onnx/model.onnx", + work_dir="work_dirs/deployment", +) +``` + +### Full Export Pipeline + +```python +checkpoint_path = "model.pth" + +export = dict( + mode="both", + work_dir="work_dirs/deployment", +) +``` + +### Evaluation-Only + +```python +checkpoint_path = "model.pth" + +export = dict( + mode="none", + work_dir="work_dirs/deployment", +) +``` diff --git a/deployment/docs/verification_evaluation.md b/deployment/docs/verification_evaluation.md new file mode 100644 index 000000000..e4f13e4df --- /dev/null +++ b/deployment/docs/verification_evaluation.md @@ -0,0 +1,65 @@ +# Verification & Evaluation + +## Verification + +`VerificationMixin` coordinates scenario-based comparisons: + +1. Resolve reference/test pipelines through `PipelineFactory`. +2. Normalize devices per backend (PyTorch → CPU, TensorRT → `cuda:0`, …). +3. Run inference on shared samples. +4. Recursively compare nested outputs with tolerance controls. +5. Emit per-sample pass/fail statistics. + +Example configuration: + +```python +verification = dict( + enabled=True, + scenarios={ + "both": [ + { + "ref_backend": "pytorch", + "ref_device": "cpu", + "test_backend": "onnx", + "test_device": "cpu" + } + ] + }, + tolerance=0.1, + num_verify_samples=3, +) +``` + +## Evaluation + +Task-specific evaluators share typed metrics so reports stay consistent across backends. + +### Detection + +- mAP and per-class AP. +- Latency statistics (mean, std, min, max). + +### Classification + +- Accuracy, precision, recall. +- Per-class metrics and confusion matrix. +- Latency statistics. + +Evaluation configuration example: + +```python +evaluation = dict( + enabled=True, + num_samples=100, + verbose=False, + backends={ + "pytorch": {"enabled": True, "device": "cpu"}, + "onnx": {"enabled": True, "device": "cpu"}, + "tensorrt": {"enabled": True, "device": "cuda:0"}, + } +) +``` + +## Core Contract + +`deployment/docs/core_contract.md` documents the responsibilities and allowed dependencies between runners, evaluators, pipelines, `PipelineFactory`, and metrics interfaces. Following the contract keeps refactors safe and ensures new projects remain compatible with shared infrastructure. 
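+
+## Output Comparison Sketch
+
+Step 4 of the verification flow (recursive comparison of nested outputs) can be pictured with the helper below. This is a simplified illustration under assumed names; it is not the actual `VerificationMixin` implementation, which also handles device normalization and per-sample statistics.
+
+```python
+from typing import Any
+
+import numpy as np
+
+
+def outputs_match(ref: Any, test: Any, tolerance: float, path: str = "output") -> bool:
+    """Recursively compare nested dict/list/array outputs within an absolute tolerance."""
+    if isinstance(ref, dict) and isinstance(test, dict):
+        return all(
+            key in test and outputs_match(ref[key], test[key], tolerance, f"{path}.{key}")
+            for key in ref
+        )
+    if isinstance(ref, (list, tuple)) and isinstance(test, (list, tuple)):
+        if len(ref) != len(test):
+            return False
+        return all(
+            outputs_match(r, t, tolerance, f"{path}[{i}]")
+            for i, (r, t) in enumerate(zip(ref, test))
+        )
+    ref_arr = np.asarray(ref, dtype=np.float64)
+    test_arr = np.asarray(test, dtype=np.float64)
+    if ref_arr.shape != test_arr.shape:
+        return False
+    max_diff = float(np.max(np.abs(ref_arr - test_arr))) if ref_arr.size else 0.0
+    if max_diff > tolerance:
+        print(f"[{path}] max abs diff {max_diff:.6f} exceeds tolerance {tolerance}")  # per-node logging
+        return False
+    return True
+```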
diff --git a/deployment/exporters/__init__.py b/deployment/exporters/__init__.py new file mode 100644 index 000000000..31f34d6b6 --- /dev/null +++ b/deployment/exporters/__init__.py @@ -0,0 +1,17 @@ +"""Model exporters for different backends.""" + +from deployment.exporters.common.base_exporter import BaseExporter +from deployment.exporters.common.configs import ONNXExportConfig, TensorRTExportConfig +from deployment.exporters.common.model_wrappers import BaseModelWrapper, IdentityWrapper +from deployment.exporters.common.onnx_exporter import ONNXExporter +from deployment.exporters.common.tensorrt_exporter import TensorRTExporter + +__all__ = [ + "BaseExporter", + "ONNXExportConfig", + "TensorRTExportConfig", + "ONNXExporter", + "TensorRTExporter", + "BaseModelWrapper", + "IdentityWrapper", +] diff --git a/deployment/exporters/common/base_exporter.py b/deployment/exporters/common/base_exporter.py new file mode 100644 index 000000000..e2105694b --- /dev/null +++ b/deployment/exporters/common/base_exporter.py @@ -0,0 +1,81 @@ +""" +Abstract base class for model exporters. + +Provides a unified interface for exporting models to different formats. +""" + +import logging +from abc import ABC, abstractmethod +from typing import Any, Optional + +import torch + +from deployment.exporters.common.configs import BaseExporterConfig +from deployment.exporters.common.model_wrappers import BaseModelWrapper + + +class BaseExporter(ABC): + """ + Abstract base class for model exporters. + + This class defines a unified interface for exporting models + to different backend formats (ONNX, TensorRT, TorchScript, etc.). + + Enhanced features: + - Support for model wrappers (preprocessing before export) + - Flexible configuration with overrides + - Better logging and error handling + """ + + def __init__( + self, + config: BaseExporterConfig, + model_wrapper: Optional[BaseModelWrapper] = None, + logger: Optional[logging.Logger] = None, + ): + """ + Initialize exporter. + + Args: + config: Typed export configuration dataclass (e.g., ``ONNXExportConfig``, + ``TensorRTExportConfig``). This ensures type safety and clear schema. + model_wrapper: Optional model wrapper class or callable. + If a class is provided, it will be instantiated with the model. + If an instance is provided, it should be a callable that takes a model. + logger: Optional logger instance + """ + self.config: BaseExporterConfig = config + self.logger = logger or logging.getLogger(__name__) + self._model_wrapper = model_wrapper + + def prepare_model(self, model: torch.nn.Module) -> torch.nn.Module: + """ + Prepare model for export (apply wrapper if configured). + + Args: + model: Original PyTorch model + + Returns: + Prepared model (wrapped if wrapper configured) + """ + if self._model_wrapper is None: + return model + + self.logger.info("Applying model wrapper for export") + + return self._model_wrapper(model) + + @abstractmethod + def export(self, model: torch.nn.Module, sample_input: Any, output_path: str) -> None: + """ + Export model to target format. 
+ + Args: + model: PyTorch model to export + sample_input: Example model input(s) for tracing/shape inference + output_path: Path to save exported model + + Raises: + RuntimeError: If export fails + """ + raise NotImplementedError diff --git a/deployment/exporters/common/configs.py b/deployment/exporters/common/configs.py new file mode 100644 index 000000000..3872eec7a --- /dev/null +++ b/deployment/exporters/common/configs.py @@ -0,0 +1,167 @@ +"""Typed configuration helpers shared by exporter implementations.""" + +from __future__ import annotations + +from dataclasses import dataclass, field +from types import MappingProxyType +from typing import Any, Iterable, Mapping, Optional, Tuple + + +def _empty_mapping() -> Mapping[Any, Any]: + """Return an immutable empty mapping.""" + return MappingProxyType({}) + + +@dataclass(frozen=True) +class TensorRTProfileConfig: + """Optimization profile description for a TensorRT input tensor.""" + + min_shape: Tuple[int, ...] = field(default_factory=tuple) + opt_shape: Tuple[int, ...] = field(default_factory=tuple) + max_shape: Tuple[int, ...] = field(default_factory=tuple) + + @classmethod + def from_dict(cls, data: Mapping[str, Any]) -> TensorRTProfileConfig: + return cls( + min_shape=cls._normalize_shape(data.get("min_shape")), + opt_shape=cls._normalize_shape(data.get("opt_shape")), + max_shape=cls._normalize_shape(data.get("max_shape")), + ) + + @staticmethod + def _normalize_shape(shape: Optional[Iterable[int]]) -> Tuple[int, ...]: + if shape is None: + return tuple() + return tuple(int(dim) for dim in shape) + + @property + def has_complete_profile(self) -> bool: + """Whether all three shape profiles (min, opt, max) are configured.""" + return bool(self.min_shape and self.opt_shape and self.max_shape) + + +@dataclass(frozen=True) +class TensorRTModelInputConfig: + """Typed container for TensorRT model input shape settings.""" + + input_shapes: Mapping[str, TensorRTProfileConfig] = field(default_factory=_empty_mapping) + + @classmethod + def from_dict(cls, data: Mapping[str, Any]) -> TensorRTModelInputConfig: + input_shapes_raw = data.get("input_shapes") + if input_shapes_raw is None: + input_shapes_raw = {} + if not isinstance(input_shapes_raw, Mapping): + raise TypeError(f"input_shapes must be a mapping, got {type(input_shapes_raw).__name__}") + + profile_map = {} + for name, shape_dict in input_shapes_raw.items(): + if shape_dict is None: + shape_dict = {} + elif not isinstance(shape_dict, Mapping): + raise TypeError(f"input_shapes.{name} must be a mapping, got {type(shape_dict).__name__}") + profile_map[name] = TensorRTProfileConfig.from_dict(shape_dict) + + return cls(input_shapes=MappingProxyType(profile_map)) + + +class BaseExporterConfig: + """ + Base class for typed exporter configuration dataclasses. + + Concrete configs should extend this class and provide typed fields + for all configuration parameters. + """ + + pass + + +@dataclass(frozen=True) +class ONNXExportConfig(BaseExporterConfig): + """ + Typed schema describing ONNX exporter configuration. + + Attributes: + input_names: Ordered collection of input tensor names. + output_names: Ordered collection of output tensor names. + dynamic_axes: Optional dynamic axes mapping identical to torch.onnx API. + simplify: Whether to run onnx-simplifier after export. + opset_version: ONNX opset to target. + export_params: Whether to embed weights inside the ONNX file. + keep_initializers_as_inputs: Mirror of torch.onnx flag. + verbose: Whether to log torch.onnx export graph debugging. 
+ do_constant_folding: Whether to enable constant folding. + save_file: Output filename for the ONNX model. + batch_size: Fixed batch size for export (None for dynamic batch). + """ + + input_names: Tuple[str, ...] = ("input",) + output_names: Tuple[str, ...] = ("output",) + dynamic_axes: Optional[Mapping[str, Mapping[int, str]]] = None + simplify: bool = True + opset_version: int = 16 + export_params: bool = True + keep_initializers_as_inputs: bool = False + verbose: bool = False + do_constant_folding: bool = True + save_file: str = "model.onnx" + batch_size: Optional[int] = None + + @classmethod + def from_mapping(cls, data: Mapping[str, Any]) -> ONNXExportConfig: + """Instantiate config from a plain mapping.""" + return cls( + input_names=tuple(data.get("input_names", cls.input_names)), + output_names=tuple(data.get("output_names", cls.output_names)), + dynamic_axes=data.get("dynamic_axes"), + simplify=data.get("simplify", cls.simplify), + opset_version=data.get("opset_version", cls.opset_version), + export_params=data.get("export_params", cls.export_params), + keep_initializers_as_inputs=data.get("keep_initializers_as_inputs", cls.keep_initializers_as_inputs), + verbose=data.get("verbose", cls.verbose), + do_constant_folding=data.get("do_constant_folding", cls.do_constant_folding), + save_file=data.get("save_file", cls.save_file), + batch_size=data.get("batch_size", cls.batch_size), + ) + + +@dataclass(frozen=True) +class TensorRTExportConfig(BaseExporterConfig): + """ + Typed schema describing TensorRT exporter configuration. + + Attributes: + precision_policy: Name of the precision policy (matches PrecisionPolicy enum). + policy_flags: Mapping of TensorRT builder/network flags. + max_workspace_size: Workspace size in bytes. + model_inputs: Tuple of TensorRTModelInputConfig entries describing shapes. + """ + + precision_policy: str = "auto" + policy_flags: Mapping[str, bool] = field(default_factory=dict) + max_workspace_size: int = 1 << 30 + model_inputs: Tuple[TensorRTModelInputConfig, ...] = field(default_factory=tuple) + + @classmethod + def from_mapping(cls, data: Mapping[str, Any]) -> TensorRTExportConfig: + """Instantiate config from a plain mapping.""" + inputs_raw = data.get("model_inputs") or () + parsed_inputs = tuple( + entry if isinstance(entry, TensorRTModelInputConfig) else TensorRTModelInputConfig.from_dict(entry) + for entry in inputs_raw + ) + return cls( + precision_policy=str(data.get("precision_policy", cls.precision_policy)), + policy_flags=MappingProxyType(data.get("policy_flags", {})), + max_workspace_size=int(data.get("max_workspace_size", cls.max_workspace_size)), + model_inputs=parsed_inputs, + ) + + +__all__ = [ + "BaseExporterConfig", + "ONNXExportConfig", + "TensorRTExportConfig", + "TensorRTModelInputConfig", + "TensorRTProfileConfig", +] diff --git a/deployment/exporters/common/factory.py b/deployment/exporters/common/factory.py new file mode 100644 index 000000000..c58192890 --- /dev/null +++ b/deployment/exporters/common/factory.py @@ -0,0 +1,59 @@ +""" +Factory helpers for creating exporter instances from deployment configs. 
+""" + +from __future__ import annotations + +import logging +from typing import Optional, Type + +from deployment.core import BaseDeploymentConfig +from deployment.exporters.common.configs import TensorRTExportConfig +from deployment.exporters.common.model_wrappers import BaseModelWrapper +from deployment.exporters.common.onnx_exporter import ONNXExporter +from deployment.exporters.common.tensorrt_exporter import TensorRTExporter + + +class ExporterFactory: + """ + Factory class for instantiating exporters using deployment configs. + """ + + @staticmethod + def create_onnx_exporter( + config: BaseDeploymentConfig, + wrapper_cls: Type[BaseModelWrapper], + logger: logging.Logger, + ) -> ONNXExporter: + """ + Build an ONNX exporter using the deployment config settings. + """ + + return ONNXExporter( + config=config.get_onnx_settings(), + model_wrapper=wrapper_cls, + logger=logger, + ) + + @staticmethod + def create_tensorrt_exporter( + config: BaseDeploymentConfig, + logger: logging.Logger, + config_override: Optional[TensorRTExportConfig] = None, + ) -> TensorRTExporter: + """ + Build a TensorRT exporter using the deployment config settings. + + Args: + config: Deployment configuration + logger: Logger instance + config_override: Optional TensorRT config to use instead of the one + derived from the deployment config. Useful for + per-component configurations in multi-file exports. + """ + trt_config = config_override if config_override is not None else config.get_tensorrt_settings() + + return TensorRTExporter( + config=trt_config, + logger=logger, + ) diff --git a/deployment/exporters/common/model_wrappers.py b/deployment/exporters/common/model_wrappers.py new file mode 100644 index 000000000..7f40efa07 --- /dev/null +++ b/deployment/exporters/common/model_wrappers.py @@ -0,0 +1,62 @@ +""" +Base model wrappers for ONNX export. + +This module provides the base classes for model wrappers that prepare models +for ONNX export with specific output formats and processing requirements. + +Each project should define its own wrapper in {project}/model_wrappers.py, +either by using IdentityWrapper or by creating a custom wrapper that inherits +from BaseModelWrapper. +""" + +from abc import ABC, abstractmethod + +import torch +import torch.nn as nn + + +class BaseModelWrapper(nn.Module, ABC): + """ + Abstract base class for ONNX export model wrappers. + + Wrappers modify model forward pass to produce ONNX-compatible outputs + with specific formats required by deployment backends. + + Each project should create its own wrapper class that inherits from this + base class if special output format conversion is needed. + """ + + def __init__(self, model: nn.Module): + """ + Initialize wrapper. + + Args: + model: PyTorch model to wrap + """ + super().__init__() + self.model = model + + @abstractmethod + def forward(self, *args): + """ + Forward pass for ONNX export. + + Must be implemented by subclasses to define ONNX-specific output format. + """ + raise NotImplementedError + + +class IdentityWrapper(BaseModelWrapper): + """ + Identity wrapper that doesn't modify the model. + + Useful for models that don't need special ONNX export handling. + This is the default wrapper for most models. 
+ """ + + def __init__(self, model: nn.Module): + super().__init__(model) + + def forward(self, *args): + """Forward pass without modification.""" + return self.model(*args) diff --git a/deployment/exporters/common/onnx_exporter.py b/deployment/exporters/common/onnx_exporter.py new file mode 100644 index 000000000..ca1ed9631 --- /dev/null +++ b/deployment/exporters/common/onnx_exporter.py @@ -0,0 +1,205 @@ +"""ONNX model exporter.""" + +import logging +import os +from dataclasses import replace +from typing import Any, Optional + +import onnx +import onnxsim +import torch + +from deployment.exporters.common.base_exporter import BaseExporter +from deployment.exporters.common.configs import ONNXExportConfig + + +class ONNXExporter(BaseExporter): + """ + ONNX model exporter with enhanced features. + + Exports PyTorch models to ONNX format with: + - Optional model wrapping for ONNX-specific output formats + - Optional model simplification + - Multi-file export support for complex models + - Configuration override capability + """ + + def __init__( + self, + config: ONNXExportConfig, + model_wrapper: Optional[Any] = None, + logger: logging.Logger = None, + ): + """ + Initialize ONNX exporter. + + Args: + config: ONNX export configuration dataclass instance. + model_wrapper: Optional model wrapper class (e.g., YOLOXOptElanONNXWrapper) + logger: Optional logger instance + """ + super().__init__(config, model_wrapper=model_wrapper, logger=logger) + self._validate_config(config) + + def _validate_config(self, config: ONNXExportConfig) -> None: + """ + Validate ONNX export configuration. + + Args: + config: Configuration to validate + + Raises: + ValueError: If configuration is invalid + """ + if config.opset_version < 11: + raise ValueError(f"opset_version must be >= 11, got {config.opset_version}") + + if not config.input_names: + raise ValueError("input_names cannot be empty") + + if not config.output_names: + raise ValueError("output_names cannot be empty") + + if len(config.input_names) != len(set(config.input_names)): + raise ValueError("input_names contains duplicates") + + if len(config.output_names) != len(set(config.output_names)): + raise ValueError("output_names contains duplicates") + + def export( + self, + model: torch.nn.Module, + sample_input: Any, + output_path: str, + *, + config_override: Optional[ONNXExportConfig] = None, + ) -> None: + """ + Export model to ONNX format. + + Args: + model: PyTorch model to export + sample_input: Sample input tensor + output_path: Path to save ONNX model + config_override: Optional configuration override. If provided, will be merged + with base config using dataclasses.replace. + + Raises: + RuntimeError: If export fails + ValueError: If configuration is invalid + """ + model = self._prepare_for_onnx(model) + export_cfg = self._build_export_config(config_override) + self._do_onnx_export(model, sample_input, output_path, export_cfg) + if export_cfg.simplify: + self._simplify_model(output_path) + + def _prepare_for_onnx(self, model: torch.nn.Module) -> torch.nn.Module: + """ + Prepare model for ONNX export. + + Applies model wrapper if configured and sets model to eval mode. + + Args: + model: PyTorch model to prepare + + Returns: + Prepared model ready for ONNX export + """ + model = self.prepare_model(model) + model.eval() + return model + + def _build_export_config(self, config_override: Optional[ONNXExportConfig] = None) -> ONNXExportConfig: + """ + Build export configuration by merging base config with override. 
+ + Args: + config_override: Optional configuration override. If provided, all fields + from the override will replace corresponding fields in base config. + + Returns: + Merged configuration ready for export + + Raises: + ValueError: If merged configuration is invalid + """ + if config_override is None: + export_cfg = self.config + else: + export_cfg = replace(self.config, **config_override.__dict__) + + # Validate merged config + self._validate_config(export_cfg) + return export_cfg + + def _do_onnx_export( + self, + model: torch.nn.Module, + sample_input: Any, + output_path: str, + export_cfg: ONNXExportConfig, + ) -> None: + """ + Perform ONNX export using torch.onnx.export. + + Args: + model: Prepared PyTorch model + sample_input: Sample input tensor + output_path: Path to save ONNX model + export_cfg: Export configuration + + Raises: + RuntimeError: If export fails + """ + self.logger.info("Exporting model to ONNX format...") + if hasattr(sample_input, "shape"): + self.logger.info(f" Input shape: {sample_input.shape}") + self.logger.info(f" Output path: {output_path}") + self.logger.info(f" Opset version: {export_cfg.opset_version}") + + # Ensure output directory exists + os.makedirs(os.path.dirname(output_path) if os.path.dirname(output_path) else ".", exist_ok=True) + + try: + with torch.no_grad(): + torch.onnx.export( + model, + sample_input, + output_path, + export_params=export_cfg.export_params, + keep_initializers_as_inputs=export_cfg.keep_initializers_as_inputs, + opset_version=export_cfg.opset_version, + do_constant_folding=export_cfg.do_constant_folding, + input_names=list(export_cfg.input_names), + output_names=list(export_cfg.output_names), + dynamic_axes=export_cfg.dynamic_axes, + verbose=export_cfg.verbose, + ) + + self.logger.info(f"ONNX export completed: {output_path}") + + except Exception as e: + self.logger.error(f"ONNX export failed: {e}") + import traceback + + self.logger.error(traceback.format_exc()) + raise RuntimeError("ONNX export failed") from e + + def _simplify_model(self, onnx_path: str) -> None: + """ + Simplify ONNX model using onnxsim. + + Args: + onnx_path: Path to ONNX model file + """ + self.logger.info("Simplifying ONNX model...") + try: + model_simplified, success = onnxsim.simplify(onnx_path) + if success: + onnx.save(model_simplified, onnx_path) + self.logger.info("ONNX model simplified successfully") + else: + self.logger.warning("ONNX model simplification failed") + except Exception as e: + self.logger.warning(f"ONNX simplification error: {e}") diff --git a/deployment/exporters/common/tensorrt_exporter.py b/deployment/exporters/common/tensorrt_exporter.py new file mode 100644 index 000000000..6abc4026f --- /dev/null +++ b/deployment/exporters/common/tensorrt_exporter.py @@ -0,0 +1,411 @@ +"""TensorRT model exporter.""" + +import logging +from typing import Any, Dict, Mapping, Optional, Sequence, Tuple + +import tensorrt as trt +import torch + +from deployment.core.artifacts import Artifact +from deployment.exporters.common.base_exporter import BaseExporter +from deployment.exporters.common.configs import TensorRTExportConfig, TensorRTModelInputConfig, TensorRTProfileConfig + + +class TensorRTExporter(BaseExporter): + """ + TensorRT model exporter. + + Converts ONNX models to TensorRT engine format with precision policy support. + """ + + def __init__( + self, + config: TensorRTExportConfig, + model_wrapper: Optional[Any] = None, + logger: logging.Logger = None, + ): + """ + Initialize TensorRT exporter. 
+ + Args: + config: TensorRT export configuration dataclass instance. + model_wrapper: Optional model wrapper class (usually not needed for TensorRT) + logger: Optional logger instance + """ + super().__init__(config, model_wrapper=model_wrapper, logger=logger) + self.logger = logger or logging.getLogger(__name__) + + def export( + self, + model: torch.nn.Module, # Not used for TensorRT, kept for interface compatibility + sample_input: Any, + output_path: str, + onnx_path: str = None, + ) -> Artifact: + """ + Export ONNX model to TensorRT engine. + + Args: + model: Not used (TensorRT converts from ONNX) + sample_input: Sample input for shape configuration + output_path: Path to save TensorRT engine + onnx_path: Path to source ONNX model + + Returns: + Artifact object representing the exported TensorRT engine + + Raises: + RuntimeError: If export fails + ValueError: If ONNX path is missing + """ + if onnx_path is None: + raise ValueError("onnx_path is required for TensorRT export") + + precision_policy = self.config.precision_policy + self.logger.info(f"Building TensorRT engine with precision policy: {precision_policy}") + self.logger.info(f" ONNX source: {onnx_path}") + self.logger.info(f" Engine output: {output_path}") + + return self._do_tensorrt_export(onnx_path, output_path, sample_input) + + def _do_tensorrt_export( + self, + onnx_path: str, + output_path: str, + sample_input: Any, + ) -> Artifact: + """ + Export a single ONNX file to TensorRT engine. + + This method handles the complete export workflow with proper resource management. + + Args: + onnx_path: Path to source ONNX model + output_path: Path to save TensorRT engine + sample_input: Sample input for shape configuration + + Returns: + Artifact object representing the exported TensorRT engine + + Raises: + RuntimeError: If export fails + """ + # Initialize TensorRT + trt_logger = trt.Logger(trt.Logger.WARNING) + trt.init_libnvinfer_plugins(trt_logger, "") + + builder = trt.Builder(trt_logger) + try: + builder_config, network, parser = self._create_builder_and_network(builder, trt_logger) + try: + self._parse_onnx(parser, network, onnx_path) + self._configure_input_profiles(builder, builder_config, network, sample_input) + serialized_engine = self._build_engine(builder, builder_config, network) + self._save_engine(serialized_engine, output_path) + return Artifact(path=output_path, multi_file=False) + finally: + del parser + del network + finally: + del builder + + def _create_builder_and_network( + self, + builder: trt.Builder, + trt_logger: trt.Logger, + ) -> Tuple[trt.IBuilderConfig, trt.INetworkDefinition, trt.OnnxParser]: + """ + Create builder config, network, and parser. 
+ + Args: + builder: TensorRT builder instance + trt_logger: TensorRT logger instance + + Returns: + Tuple of (builder_config, network, parser) + """ + builder_config = builder.create_builder_config() + + max_workspace_size = self.config.max_workspace_size + builder_config.set_memory_pool_limit(pool=trt.MemoryPoolType.WORKSPACE, pool_size=max_workspace_size) + + # Create network with appropriate flags + flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH) + + # Handle strongly typed flag (network creation flag) + policy_flags = self.config.policy_flags + if policy_flags.get("STRONGLY_TYPED", False): + flags |= 1 << int(trt.NetworkDefinitionCreationFlag.STRONGLY_TYPED) + self.logger.info("Using strongly typed TensorRT network creation") + + network = builder.create_network(flags) + + # Apply precision flags to builder config + for flag_name, enabled in policy_flags.items(): + if flag_name == "STRONGLY_TYPED": + continue + if enabled and hasattr(trt.BuilderFlag, flag_name): + builder_config.set_flag(getattr(trt.BuilderFlag, flag_name)) + self.logger.info(f"BuilderFlag.{flag_name} enabled") + + parser = trt.OnnxParser(network, trt_logger) + + return builder_config, network, parser + + def _parse_onnx( + self, + parser: trt.OnnxParser, + network: trt.INetworkDefinition, + onnx_path: str, + ) -> None: + """ + Parse ONNX model into TensorRT network. + + Args: + parser: TensorRT ONNX parser instance + network: TensorRT network definition + onnx_path: Path to ONNX model file + + Raises: + RuntimeError: If parsing fails + """ + with open(onnx_path, "rb") as f: + if not parser.parse(f.read()): + self._log_parser_errors(parser) + raise RuntimeError("TensorRT export failed: unable to parse ONNX file") + self.logger.info("Successfully parsed ONNX file") + + def _configure_input_profiles( + self, + builder: trt.Builder, + builder_config: trt.IBuilderConfig, + network: trt.INetworkDefinition, + sample_input: Any, + ) -> None: + """ + Configure TensorRT optimization profiles for input shapes. + + Creates an optimization profile and configures min/opt/max shapes for each input. + See `_configure_input_shapes` for details on shape configuration. + + Note: + ONNX `dynamic_axes` and TensorRT profiles serve different purposes: + + - **ONNX dynamic_axes**: Used during ONNX export to define which dimensions + are symbolic (dynamic) in the ONNX graph. This allows the ONNX model to + accept inputs of varying sizes at those dimensions. + + - **TensorRT profile**: Defines the runtime shape envelope (min/opt/max) that + TensorRT will optimize for. TensorRT builds kernels optimized for shapes + within this envelope. The profile must be compatible with the ONNX dynamic + axes, but they are configured separately and serve different roles: + - dynamic_axes: Export-time graph structure + - TRT profile: Runtime optimization envelope + + They are related but not equivalent. The ONNX model may have dynamic axes, + but TensorRT still needs explicit min/opt/max shapes to build optimized kernels. 
+ + Args: + builder: TensorRT builder instance + builder_config: TensorRT builder config + network: TensorRT network definition + sample_input: Sample input for shape configuration (typically obtained via + BaseDataLoader.get_shape_sample()) + """ + profile = builder.create_optimization_profile() + self._configure_input_shapes(profile, sample_input, network) + builder_config.add_optimization_profile(profile) + + def _build_engine( + self, + builder: trt.Builder, + builder_config: trt.IBuilderConfig, + network: trt.INetworkDefinition, + ) -> bytes: + """ + Build TensorRT engine from network. + + Args: + builder: TensorRT builder instance + builder_config: TensorRT builder config + network: TensorRT network definition + + Returns: + Serialized engine as bytes + + Raises: + RuntimeError: If engine building fails + """ + self.logger.info("Building TensorRT engine (this may take a while)...") + serialized_engine = builder.build_serialized_network(network, builder_config) + + if serialized_engine is None: + self.logger.error("Failed to build TensorRT engine") + raise RuntimeError("TensorRT export failed: builder returned None") + + return serialized_engine + + def _save_engine( + self, + serialized_engine: bytes, + output_path: str, + ) -> None: + """ + Save serialized TensorRT engine to file. + + Args: + serialized_engine: Serialized engine bytes + output_path: Path to save engine file + """ + with open(output_path, "wb") as f: + f.write(serialized_engine) + + max_workspace_size = self.config.max_workspace_size + self.logger.info(f"TensorRT engine saved to {output_path}") + self.logger.info(f"Engine max workspace size: {max_workspace_size / (1024**3):.2f} GB") + + def _configure_input_shapes( + self, + profile: trt.IOptimizationProfile, + sample_input: Any, + network: trt.INetworkDefinition = None, + ) -> None: + """ + Configure input shapes for TensorRT optimization profile. + + Note: + ONNX dynamic_axes is used for export; TRT profile is the runtime envelope; + they are related but not equivalent. + + - **ONNX dynamic_axes**: Controls symbolic dimensions in the ONNX graph during + export. Defines which dimensions can vary at runtime in the ONNX model. + + - **TensorRT profile (min/opt/max)**: Defines the runtime shape envelope that + TensorRT optimizes for. TensorRT builds kernels optimized for shapes within + this envelope. The profile must be compatible with the ONNX dynamic axes, + but they are configured separately: + - dynamic_axes: Export-time graph structure (what dimensions are variable) + - TRT profile: Runtime optimization envelope (what shapes to optimize for) + + They are complementary but independent. The ONNX model may have dynamic axes, + but TensorRT still needs explicit min/opt/max shapes to build optimized kernels. + + Raises: + ValueError: If neither model_inputs config nor sample_input is provided + """ + model_inputs_cfg = self.config.model_inputs + + # Validate that we have shape information + first_input_shapes = None + if model_inputs_cfg: + first_input_shapes = self._extract_input_shapes(model_inputs_cfg[0]) + + if not model_inputs_cfg or not first_input_shapes: + if sample_input is None: + raise ValueError( + "TensorRT export requires shape information. Please provide either:\n" + " 1. Explicit 'model_inputs' with 'input_shapes' (min/opt/max) in config, OR\n" + " 2. 
A 'sample_input' tensor for automatic shape inference\n" + "\n" + "Current config has:\n" + f" - model_inputs: {model_inputs_cfg}\n" + f" - sample_input: {sample_input}\n" + "\n" + "Example config:\n" + " tensorrt_config = dict(\n" + " model_inputs=[\n" + " dict(\n" + " input_shapes={\n" + " 'input': dict(\n" + " min_shape=(1, 3, 960, 960),\n" + " opt_shape=(1, 3, 960, 960),\n" + " max_shape=(1, 3, 960, 960),\n" + " )\n" + " }\n" + " )\n" + " ]\n" + " )" + ) + # If we have sample_input but no config, we could infer shapes + # For now, just require explicit config + self.logger.warning( + "sample_input provided but no explicit model_inputs config. " + "TensorRT export may fail if ONNX has dynamic dimensions." + ) + + if not model_inputs_cfg: + raise ValueError("model_inputs is not set in the config") + + # model_inputs is already a Tuple[TensorRTModelInputConfig, ...] + first_entry = model_inputs_cfg[0] + input_shapes = first_input_shapes + + if not input_shapes: + raise ValueError("TensorRT model_inputs[0] missing 'input_shapes' definitions") + + for input_name, profile_cfg in input_shapes.items(): + min_shape, opt_shape, max_shape = self._resolve_profile_shapes(profile_cfg, sample_input, input_name) + self.logger.info(f"Setting {input_name} shapes - min: {min_shape}, opt: {opt_shape}, max: {max_shape}") + profile.set_shape(input_name, min_shape, opt_shape, max_shape) + + def _log_parser_errors(self, parser: trt.OnnxParser) -> None: + """Log TensorRT parser errors.""" + self.logger.error("Failed to parse ONNX model") + for error in range(parser.num_errors): + self.logger.error(f"Parser error: {parser.get_error(error)}") + + def _extract_input_shapes(self, entry: Any) -> Mapping[str, Any]: + if isinstance(entry, TensorRTModelInputConfig): + return entry.input_shapes + if isinstance(entry, Mapping): + input_shapes = entry.get("input_shapes") + if input_shapes is None: + input_shapes = {} + if not isinstance(input_shapes, Mapping): + raise TypeError(f"input_shapes must be a mapping, got {type(input_shapes).__name__}") + return input_shapes + raise TypeError(f"Unsupported TensorRT model input entry: {type(entry)}") + + def _resolve_profile_shapes( + self, + profile_cfg: Any, + sample_input: Any, + input_name: str, + ) -> Sequence[Sequence[int]]: + if isinstance(profile_cfg, TensorRTProfileConfig): + min_shape = self._shape_to_list(profile_cfg.min_shape) + opt_shape = self._shape_to_list(profile_cfg.opt_shape) + max_shape = self._shape_to_list(profile_cfg.max_shape) + elif isinstance(profile_cfg, Mapping): + min_shape = self._shape_to_list(profile_cfg.get("min_shape")) + opt_shape = self._shape_to_list(profile_cfg.get("opt_shape")) + max_shape = self._shape_to_list(profile_cfg.get("max_shape")) + else: + raise TypeError(f"Unsupported TensorRT profile type for input '{input_name}': {type(profile_cfg)}") + + return ( + self._ensure_shape(min_shape, sample_input, input_name, "min"), + self._ensure_shape(opt_shape, sample_input, input_name, "opt"), + self._ensure_shape(max_shape, sample_input, input_name, "max"), + ) + + @staticmethod + def _shape_to_list(shape: Optional[Sequence[int]]) -> Optional[Sequence[int]]: + if shape is None: + return None + return [int(dim) for dim in shape] + + def _ensure_shape( + self, + shape: Optional[Sequence[int]], + sample_input: Any, + input_name: str, + bucket: str, + ) -> Sequence[int]: + if shape: + return list(shape) + if sample_input is None or not hasattr(sample_input, "shape"): + raise ValueError(f"{bucket}_shape missing for {input_name} and sample_input is 
not provided") + inferred = list(sample_input.shape) + self.logger.debug("Falling back to sample_input.shape=%s for %s:%s", inferred, input_name, bucket) + return inferred diff --git a/deployment/exporters/export_pipelines/__init__.py b/deployment/exporters/export_pipelines/__init__.py new file mode 100644 index 000000000..e65b55e0a --- /dev/null +++ b/deployment/exporters/export_pipelines/__init__.py @@ -0,0 +1,16 @@ +"""Export pipeline interfaces and component extraction helpers.""" + +from deployment.exporters.export_pipelines.base import OnnxExportPipeline, TensorRTExportPipeline +from deployment.exporters.export_pipelines.interfaces import ( + ExportableComponent, + ModelComponentExtractor, +) + +__all__ = [ + # Base export pipelines + "OnnxExportPipeline", + "TensorRTExportPipeline", + # Component extraction interfaces + "ModelComponentExtractor", + "ExportableComponent", +] diff --git a/deployment/exporters/export_pipelines/base.py b/deployment/exporters/export_pipelines/base.py new file mode 100644 index 000000000..438fc45c6 --- /dev/null +++ b/deployment/exporters/export_pipelines/base.py @@ -0,0 +1,70 @@ +""" +Base export pipeline interfaces for specialized export flows. +""" + +from __future__ import annotations + +from abc import ABC, abstractmethod +from typing import Any + +from deployment.core.artifacts import Artifact +from deployment.core.config.base_config import BaseDeploymentConfig +from deployment.core.io.base_data_loader import BaseDataLoader + + +class OnnxExportPipeline(ABC): + """ + Base interface for ONNX export pipelines. + """ + + @abstractmethod + def export( + self, + *, + model: Any, + data_loader: BaseDataLoader, + output_dir: str, + config: BaseDeploymentConfig, + sample_idx: int = 0, + ) -> Artifact: + """ + Execute the ONNX export pipeline and return the produced artifact. + + Args: + model: PyTorch model to export + data_loader: Data loader for samples + output_dir: Directory for output files + config: Deployment configuration + sample_idx: Sample index for tracing + + Returns: + Artifact describing the exported ONNX output + """ + + +class TensorRTExportPipeline(ABC): + """ + Base interface for TensorRT export pipelines. + """ + + @abstractmethod + def export( + self, + *, + onnx_path: str, + output_dir: str, + config: BaseDeploymentConfig, + device: str, + ) -> Artifact: + """ + Execute the TensorRT export pipeline and return the produced artifact. + + Args: + onnx_path: Path to ONNX model file/directory + output_dir: Directory for output files + config: Deployment configuration + device: CUDA device string + + Returns: + Artifact describing the exported TensorRT output + """ diff --git a/deployment/exporters/export_pipelines/interfaces.py b/deployment/exporters/export_pipelines/interfaces.py new file mode 100644 index 000000000..09524326d --- /dev/null +++ b/deployment/exporters/export_pipelines/interfaces.py @@ -0,0 +1,91 @@ +""" +Interfaces for export pipeline components. + +This module defines interfaces that allow project-specific code to provide +model-specific knowledge to generic deployment export pipelines. +""" + +from abc import ABC, abstractmethod +from dataclasses import dataclass +from typing import Any, List, Optional + +import torch + +from deployment.exporters.common.configs import ONNXExportConfig + + +@dataclass(frozen=True) +class ExportableComponent: + """ + A model component ready for ONNX export. 
+ + Attributes: + name: Component name (e.g., "voxel_encoder", "backbone_head") + module: PyTorch module to export + sample_input: Sample input tensor for tracing + config_override: Optional ONNX export config override + """ + + name: str + module: torch.nn.Module + sample_input: Any + config_override: Optional[ONNXExportConfig] = None + + +class ModelComponentExtractor(ABC): + """ + Interface for extracting exportable model components. + + This interface allows project-specific code to provide model-specific + knowledge (model structure, component extraction, input preparation) + without the deployment framework needing to know about specific models. + + This solves the dependency inversion problem: instead of deployment + framework importing from projects/, projects/ implement this interface + and inject it into export pipelines. + """ + + @abstractmethod + def extract_components(self, model: torch.nn.Module, sample_data: Any) -> List[ExportableComponent]: + """ + Extract all components that need to be exported to ONNX. + + This method should handle all model-specific logic: + - Running model inference to prepare inputs + - Creating combined modules (e.g., backbone+neck+head) + - Preparing sample inputs for each component + - Specifying ONNX export configs for each component + + Args: + model: PyTorch model to extract components from + sample_data: Sample data for preparing inputs + + Returns: + List of ExportableComponent instances ready for ONNX export + """ + ... + + @abstractmethod + def extract_features( + self, + model: torch.nn.Module, + data_loader: Any, + sample_idx: int, + ) -> Any: + """ + Extract model-specific intermediate features required for multi-component export. + + Some models require running a portion of the network to generate the input + tensor(s) for later components. This method encapsulates that model-specific + logic and returns a standardized tuple used by `extract_components`. + + Args: + model: PyTorch model used for feature extraction + data_loader: Data loader used to access the sample + sample_idx: Sample index used for tracing/feature extraction + + Returns: + A tuple of (input_features, voxel_dict) or other model-specific payload + that `extract_components` expects. + """ + ... diff --git a/deployment/pipelines/__init__.py b/deployment/pipelines/__init__.py new file mode 100644 index 000000000..8eaa99d9c --- /dev/null +++ b/deployment/pipelines/__init__.py @@ -0,0 +1,18 @@ +"""Deployment pipeline infrastructure. + +Project-specific pipeline implementations live under `deployment/projects//pipelines/` +and should register themselves into `deployment.pipelines.registry.pipeline_registry`. +""" + +from deployment.pipelines.base_factory import BasePipelineFactory +from deployment.pipelines.base_pipeline import BaseDeploymentPipeline +from deployment.pipelines.factory import PipelineFactory +from deployment.pipelines.registry import PipelineRegistry, pipeline_registry + +__all__ = [ + "BaseDeploymentPipeline", + "BasePipelineFactory", + "PipelineRegistry", + "pipeline_registry", + "PipelineFactory", +] diff --git a/deployment/pipelines/base_factory.py b/deployment/pipelines/base_factory.py new file mode 100644 index 000000000..0576777c5 --- /dev/null +++ b/deployment/pipelines/base_factory.py @@ -0,0 +1,69 @@ +""" +Base Pipeline Factory for Project-specific Pipeline Creation. + +Flattened from `deployment/pipelines/common/base_factory.py`. 
+""" + +import logging +from abc import ABC, abstractmethod +from typing import Any, Optional + +from deployment.core.backend import Backend +from deployment.core.evaluation.evaluator_types import ModelSpec +from deployment.pipelines.base_pipeline import BaseDeploymentPipeline + +logger = logging.getLogger(__name__) + + +class BasePipelineFactory(ABC): + """Project-specific factory interface for building deployment pipelines. + + A project registers a subclass into `deployment.pipelines.registry.pipeline_registry`. + Evaluators then call into the registry/factory to instantiate the correct pipeline + for a given (project, backend) pair. + """ + + @classmethod + @abstractmethod + def get_project_name(cls) -> str: + """Return the unique project identifier used for registry lookup.""" + raise NotImplementedError + + @classmethod + @abstractmethod + def create_pipeline( + cls, + model_spec: ModelSpec, + pytorch_model: Any, + device: Optional[str] = None, + components_cfg: Optional[Any] = None, + ) -> BaseDeploymentPipeline: + """Build and return a pipeline instance for the given model spec. + + Implementations typically: + - Validate/dispatch based on `model_spec.backend` + - Wrap `pytorch_model` or load an ONNX/TensorRT runtime + - Construct a `BaseDeploymentPipeline` subclass configured for the backend + + Args: + model_spec: Describes the model path/device/backend and any metadata. + pytorch_model: A loaded PyTorch model (used for PYTORCH backends). + device: Optional device override (defaults to `model_spec.device`). + components_cfg: Project-specific component configuration (e.g., file paths, IO specs). + """ + raise NotImplementedError + + @classmethod + def get_supported_backends(cls) -> list: + """Return the list of backends this project factory can instantiate.""" + return [Backend.PYTORCH, Backend.ONNX, Backend.TENSORRT] + + @classmethod + def _validate_backend(cls, backend: Backend) -> None: + """Raise a ValueError if `backend` is not supported by this factory.""" + supported = cls.get_supported_backends() + if backend not in supported: + supported_names = [b.value for b in supported] + raise ValueError( + f"Unsupported backend '{backend.value}' for {cls.get_project_name()}. Supported backends: {supported_names}" + ) diff --git a/deployment/pipelines/base_pipeline.py b/deployment/pipelines/base_pipeline.py new file mode 100644 index 000000000..47ee46120 --- /dev/null +++ b/deployment/pipelines/base_pipeline.py @@ -0,0 +1,169 @@ +""" +Base Deployment Pipeline for Unified Model Deployment. + +Flattened from `deployment/pipelines/common/base_pipeline.py`. +""" + +import logging +import time +from abc import ABC, abstractmethod +from typing import Any, Dict, Mapping, Optional, Tuple, Union + +import torch + +from deployment.core.evaluation.evaluator_types import InferenceResult + +logger = logging.getLogger(__name__) + + +class BaseDeploymentPipeline(ABC): + """Base contract for a deployment inference pipeline. + + A pipeline is responsible for the classic 3-stage inference flow: + `preprocess -> run_model -> postprocess`. + + The default `infer()` implementation measures per-stage latency and returns an + `InferenceResult` with optional breakdown information. + """ + + def __init__(self, model: Any, device: str = "cpu", task_type: str = "unknown", backend_type: str = "unknown"): + """Create a pipeline bound to a model and a device. + + Args: + model: Backend-specific callable/model wrapper used by `run_model`. + device: Target device string (e.g. "cpu", "cuda:0") or torch.device. 
+ task_type: High-level task label (e.g. "detection3d") for logging/metrics. + backend_type: Backend label (e.g. "pytorch", "onnx", "tensorrt") for logging/metrics. + """ + self.model = model + self.device = torch.device(device) if isinstance(device, str) else device + self.task_type = task_type + self.backend_type = backend_type + self._stage_latencies: Dict[str, float] = {} + + logger.info(f"Initialized {self.__class__.__name__} on device: {self.device}") + + @abstractmethod + def preprocess(self, input_data: Any) -> Any: + """Convert raw input into model-ready tensors/arrays. + + Implementations may optionally return a tuple `(model_input, metadata_dict)` + where metadata is merged into `infer(..., metadata=...)` and forwarded to + `postprocess`. + """ + raise NotImplementedError + + @abstractmethod + def run_model(self, preprocessed_input: Any) -> Union[Any, Tuple[Any, Dict[str, float]]]: + """Run the underlying model and return its raw outputs. + + Implementations may optionally return `(model_output, stage_latency_dict)`. + Latencies are merged into the `InferenceResult.breakdown`. + """ + raise NotImplementedError + + @abstractmethod + def postprocess(self, model_output: Any, metadata: Optional[Mapping[str, Any]] = None) -> Any: + """Convert raw model outputs into final predictions/results.""" + raise NotImplementedError + + def infer( + self, input_data: Any, metadata: Optional[Mapping[str, Any]] = None, return_raw_outputs: bool = False + ) -> InferenceResult: + """Run end-to-end inference with latency breakdown. + + Flow: + 1) preprocess(input_data) + 2) run_model(model_input) + 3) postprocess(model_output, merged_metadata) unless `return_raw_outputs=True` + + Args: + input_data: Raw input sample(s) in a project-defined format. + metadata: Optional auxiliary context merged with preprocess metadata. + return_raw_outputs: If True, skip `postprocess` and return raw model output. + + Returns: + InferenceResult with `output`, total latency, and per-stage breakdown. 
+ """ + if metadata is None: + metadata = {} + + latency_breakdown: Dict[str, float] = {} + + try: + start_time = time.perf_counter() + + preprocessed = self.preprocess(input_data) + + preprocess_metadata = {} + model_input = preprocessed + if isinstance(preprocessed, tuple) and len(preprocessed) == 2 and isinstance(preprocessed[1], dict): + model_input, preprocess_metadata = preprocessed + + preprocess_time = time.perf_counter() + latency_breakdown["preprocessing_ms"] = (preprocess_time - start_time) * 1000 + + merged_metadata = {} + if metadata is not None: + merged_metadata.update(metadata) + if preprocess_metadata is not None: + merged_metadata.update(preprocess_metadata) + + model_start = time.perf_counter() + model_result = self.run_model(model_input) + model_time = time.perf_counter() + latency_breakdown["model_ms"] = (model_time - model_start) * 1000 + + if isinstance(model_result, tuple) and len(model_result) == 2: + model_output, stage_latencies = model_result + if isinstance(stage_latencies, dict): + latency_breakdown.update(stage_latencies) + else: + model_output = model_result + + # Legacy stage latency aggregation (kept) + if hasattr(self, "_stage_latencies") and isinstance(self._stage_latencies, dict): + latency_breakdown.update(self._stage_latencies) + self._stage_latencies = {} + + total_latency = (time.perf_counter() - start_time) * 1000 + + if return_raw_outputs: + return InferenceResult(output=model_output, latency_ms=total_latency, breakdown=latency_breakdown) + + postprocess_start = time.perf_counter() + predictions = self.postprocess(model_output, merged_metadata) + postprocess_time = time.perf_counter() + latency_breakdown["postprocessing_ms"] = (postprocess_time - postprocess_start) * 1000 + + total_latency = (time.perf_counter() - start_time) * 1000 + return InferenceResult(output=predictions, latency_ms=total_latency, breakdown=latency_breakdown) + + except Exception: + logger.exception("Inference failed.") + raise + + def cleanup(self) -> None: + """Release resources owned by the pipeline. + + Subclasses should override when they hold external resources (e.g., CUDA + buffers, TensorRT engines/contexts, file handles). `infer()` does not call + this automatically; use the context manager (`with pipeline:`) or call it + explicitly. + """ + pass + + def __repr__(self): + return ( + f"{self.__class__.__name__}(" + f"device={self.device}, " + f"task={self.task_type}, " + f"backend={self.backend_type})" + ) + + def __enter__(self): + return self + + def __exit__(self, exc_type, exc_val, exc_tb): + self.cleanup() + return False diff --git a/deployment/pipelines/factory.py b/deployment/pipelines/factory.py new file mode 100644 index 000000000..0dcce4cef --- /dev/null +++ b/deployment/pipelines/factory.py @@ -0,0 +1,110 @@ +""" +Pipeline Factory for Centralized Pipeline Instantiation. + +This module provides a unified interface for creating deployment pipelines +using the registry pattern. Each project registers its own factory, and +this module provides convenience methods for pipeline creation. 
+ +Architecture: + - Each project implements `BasePipelineFactory` in its own directory + - Factories are registered with `pipeline_registry` using decorators + - This factory provides a unified interface for pipeline creation + +Usage: + from deployment.pipelines.factory import PipelineFactory + pipeline = PipelineFactory.create("centerpoint", model_spec, pytorch_model) + + # Or use registry directly: + from deployment.pipelines.registry import pipeline_registry + pipeline = pipeline_registry.create_pipeline("centerpoint", model_spec, pytorch_model) +""" + +import logging +from typing import Any, List, Optional + +from deployment.core.evaluation.evaluator_types import ModelSpec +from deployment.pipelines.base_pipeline import BaseDeploymentPipeline +from deployment.pipelines.registry import pipeline_registry + +logger = logging.getLogger(__name__) + + +class PipelineFactory: + """ + Factory for creating deployment pipelines. + + This class provides a unified interface for creating pipelines across + different projects and backends. It delegates to project-specific + factories through the pipeline registry. + + Example: + # Create a pipeline using the generic method + pipeline = PipelineFactory.create("centerpoint", model_spec, pytorch_model) + + # List available projects + projects = PipelineFactory.list_projects() + """ + + @staticmethod + def create( + project_name: str, + model_spec: ModelSpec, + pytorch_model: Any, + device: Optional[str] = None, + components_cfg: Optional[Any] = None, + ) -> BaseDeploymentPipeline: + """ + Create a pipeline for the specified project. + + Args: + project_name: Name of the project (e.g., "centerpoint", "yolox") + model_spec: Model specification (backend/device/path) + pytorch_model: PyTorch model instance + device: Override device (uses model_spec.device if None) + components_cfg: Project-specific component configuration + + Returns: + Pipeline instance + + Raises: + KeyError: If project is not registered + ValueError: If backend is not supported + + Example: + >>> pipeline = PipelineFactory.create( + ... "centerpoint", + ... model_spec, + ... pytorch_model, + ... components_cfg=components_cfg, + ... ) + """ + return pipeline_registry.create_pipeline( + project_name=project_name, + model_spec=model_spec, + pytorch_model=pytorch_model, + device=device, + components_cfg=components_cfg, + ) + + @staticmethod + def list_projects() -> List[str]: + """ + List all registered projects. + + Returns: + List of registered project names + """ + return pipeline_registry.list_projects() + + @staticmethod + def is_project_registered(project_name: str) -> bool: + """ + Check if a project is registered. + + Args: + project_name: Name of the project + + Returns: + True if project is registered + """ + return pipeline_registry.is_registered(project_name) diff --git a/deployment/pipelines/gpu_resource_mixin.py b/deployment/pipelines/gpu_resource_mixin.py new file mode 100644 index 000000000..3f4db2048 --- /dev/null +++ b/deployment/pipelines/gpu_resource_mixin.py @@ -0,0 +1,149 @@ +""" +GPU Resource Management utilities for TensorRT Pipelines. + +Flattened from `deployment/pipelines/common/gpu_resource_mixin.py`. 
+""" + +import logging +from abc import ABC, abstractmethod +from typing import Any, Dict, List, Optional + +import pycuda.driver as cuda +import torch + +logger = logging.getLogger(__name__) + + +def clear_cuda_memory() -> None: + """Best-effort CUDA memory cleanup for long-running deployment workflows.""" + if torch.cuda.is_available(): + torch.cuda.empty_cache() + torch.cuda.synchronize() + + +class GPUResourceMixin(ABC): + """Mixin that provides idempotent GPU resource cleanup. + + Subclasses implement `_release_gpu_resources()` and this mixin ensures cleanup + is called exactly once (including via context-manager or destructor paths). + """ + + _cleanup_called: bool = False + + @abstractmethod + def _release_gpu_resources(self) -> None: + """Release backend-specific GPU resources owned by the instance.""" + raise NotImplementedError + + def cleanup(self) -> None: + """Release GPU resources once and clear CUDA caches (best effort).""" + if self._cleanup_called: + return + + try: + self._release_gpu_resources() + clear_cuda_memory() + self._cleanup_called = True + logger.debug(f"{self.__class__.__name__}: GPU resources released") + except Exception as e: + logger.warning(f"Error during GPU resource cleanup: {e}") + + def __enter__(self): + return self + + def __exit__(self, exc_type, exc_val, exc_tb): + self.cleanup() + return False + + def __del__(self): + try: + self.cleanup() + except Exception: + pass + + +class TensorRTResourceManager: + """Helper that tracks CUDA allocations/stream for TensorRT inference. + + This is intentionally minimal: allocate device buffers, provide a stream, + and free everything on context exit. + """ + + def __init__(self): + """Create an empty manager (no allocations and no stream).""" + self._allocations: List[Any] = [] + self._stream: Optional[Any] = None + + def allocate(self, nbytes: int) -> Any: + """Allocate `nbytes` on the device and track it for automatic cleanup.""" + allocation = cuda.mem_alloc(nbytes) + self._allocations.append(allocation) + return allocation + + @property + def stream(self) -> Any: + """Return a lazily-created CUDA stream shared by the manager.""" + if self._stream is None: + self._stream = cuda.Stream() + return self._stream + + def synchronize(self) -> None: + """Synchronize the tracked CUDA stream (if created).""" + if self._stream is not None: + self._stream.synchronize() + + def _release_all(self) -> None: + """Free all tracked allocations and drop the stream reference.""" + for allocation in self._allocations: + try: + allocation.free() + except Exception: + pass + self._allocations.clear() + self._stream = None + + def __enter__(self): + return self + + def __exit__(self, exc_type, exc_val, exc_tb): + self.synchronize() + self._release_all() + return False + + +def release_tensorrt_resources( + engines: Optional[Dict[str, Any]] = None, + contexts: Optional[Dict[str, Any]] = None, + cuda_buffers: Optional[List[Any]] = None, +) -> None: + """Best-effort release of TensorRT engines/contexts and CUDA buffers. + + This is defensive cleanup for cases where objects need explicit deletion and + CUDA buffers need manual `free()`. 
+ """ + if contexts: + for _, context in list(contexts.items()): + if context is not None: + try: + del context + except Exception: + pass + contexts.clear() + + if engines: + for _, engine in list(engines.items()): + if engine is not None: + try: + del engine + except Exception: + pass + engines.clear() + + if cuda_buffers: + for buffer in cuda_buffers: + if buffer is not None: + try: + buffer.free() + except Exception: + pass + cuda_buffers.clear() diff --git a/deployment/pipelines/registry.py b/deployment/pipelines/registry.py new file mode 100644 index 000000000..585da3c09 --- /dev/null +++ b/deployment/pipelines/registry.py @@ -0,0 +1,104 @@ +""" +Pipeline Registry for Dynamic Project Pipeline Registration. + +Flattened from `deployment/pipelines/common/registry.py`. +""" + +import logging +from typing import Any, Dict, Optional, Type + +from deployment.core.evaluation.evaluator_types import ModelSpec +from deployment.pipelines.base_factory import BasePipelineFactory +from deployment.pipelines.base_pipeline import BaseDeploymentPipeline + +logger = logging.getLogger(__name__) + + +class PipelineRegistry: + """Registry for mapping project names to pipeline factories. + + Factories are responsible for creating a `BaseDeploymentPipeline` instance + given a `ModelSpec` and (optionally) a loaded PyTorch model. + """ + + def __init__(self): + """Initialize an empty registry. + + The registry is populated at import-time by project modules that register + their `BasePipelineFactory` subclasses (typically via a decorator call to + `pipeline_registry.register(...)`). + """ + self._factories: Dict[str, Type[BasePipelineFactory]] = {} + + def register(self, factory_cls: Type[BasePipelineFactory]) -> Type[BasePipelineFactory]: + """Register a project factory class. + + Args: + factory_cls: A subclass of `BasePipelineFactory`. + + Returns: + The same class, enabling decorator usage: + `@pipeline_registry.register` + `class MyFactory(BasePipelineFactory): ...` + """ + if not issubclass(factory_cls, BasePipelineFactory): + raise TypeError(f"Factory class must inherit from BasePipelineFactory, got {factory_cls.__name__}") + + project_name = factory_cls.get_project_name() + + if project_name in self._factories: + logger.warning( + f"Overwriting existing factory for project '{project_name}': " + f"{self._factories[project_name].__name__} -> {factory_cls.__name__}" + ) + + self._factories[project_name] = factory_cls + logger.debug(f"Registered pipeline factory: {project_name} -> {factory_cls.__name__}") + return factory_cls + + def get_factory(self, project_name: str) -> Type[BasePipelineFactory]: + """Return the registered factory for a project name. + + Raises: + KeyError: If no factory is registered for the given project. + """ + if project_name not in self._factories: + available = list(self._factories.keys()) + raise KeyError(f"No factory registered for project '{project_name}'. Available projects: {available}") + return self._factories[project_name] + + def create_pipeline( + self, + project_name: str, + model_spec: ModelSpec, + pytorch_model: Any, + device: Optional[str] = None, + components_cfg: Optional[Any] = None, + ) -> BaseDeploymentPipeline: + """Create a project-specific pipeline instance using the registered factory. + + This is the central instantiation path used by evaluators and by the + convenience wrapper `deployment.pipelines.factory.PipelineFactory`. 
+ """ + factory = self.get_factory(project_name) + return factory.create_pipeline( + model_spec=model_spec, + pytorch_model=pytorch_model, + device=device, + components_cfg=components_cfg, + ) + + def list_projects(self) -> list: + """List registered project names.""" + return list(self._factories.keys()) + + def is_registered(self, project_name: str) -> bool: + """Return True if a project is registered.""" + return project_name in self._factories + + def reset(self) -> None: + """Clear all registrations (primarily useful for tests).""" + self._factories.clear() + + +pipeline_registry = PipelineRegistry() diff --git a/deployment/projects/__init__.py b/deployment/projects/__init__.py new file mode 100644 index 000000000..649917eb7 --- /dev/null +++ b/deployment/projects/__init__.py @@ -0,0 +1,9 @@ +"""Deployment project bundles. + +Each subpackage under `deployment/projects//` should register a +`ProjectAdapter` into `deployment.projects.registry.project_registry`. +""" + +from deployment.projects.registry import ProjectAdapter, project_registry + +__all__ = ["ProjectAdapter", "project_registry"] diff --git a/deployment/projects/centerpoint/config/deploy_config.py b/deployment/projects/centerpoint/config/deploy_config.py new file mode 100644 index 000000000..227811a43 --- /dev/null +++ b/deployment/projects/centerpoint/config/deploy_config.py @@ -0,0 +1,183 @@ +""" +CenterPoint Deployment Configuration +""" + +# ============================================================================ +# Task type for pipeline building +# Options: 'detection2d', 'detection3d', 'classification', 'segmentation' +# ============================================================================ +task_type = "detection3d" + +# ============================================================================ +# Checkpoint Path - Single source of truth for PyTorch model +# ============================================================================ +checkpoint_path = "work_dirs/centerpoint/best_checkpoint.pth" + +# ============================================================================ +# Device settings (shared by export, evaluation, verification) +# ============================================================================ +devices = dict( + cpu="cpu", + cuda="cuda:0", +) + +# ============================================================================ +# Export Configuration +# ============================================================================ +export = dict( + mode="both", + work_dir="work_dirs/centerpoint_deployment", + onnx_path=None, +) + +# Derived artifact directories +_WORK_DIR = str(export["work_dir"]).rstrip("/") +_ONNX_DIR = f"{_WORK_DIR}/onnx" +_TENSORRT_DIR = f"{_WORK_DIR}/tensorrt" + +# ============================================================================ +# Unified Component Configuration (Single Source of Truth) +# +# Each component defines: +# - name: Component identifier used in export +# - onnx_file: Output ONNX filename +# - engine_file: Output TensorRT engine filename +# - io: Input/output specification for ONNX export +# - tensorrt_profile: TensorRT optimization profile (min/opt/max shapes) +# ============================================================================ +components = dict( + voxel_encoder=dict( + name="pts_voxel_encoder", + onnx_file="pts_voxel_encoder.onnx", + engine_file="pts_voxel_encoder.engine", + io=dict( + inputs=[ + dict(name="input_features", dtype="float32"), + ], + outputs=[ + dict(name="pillar_features", dtype="float32"), + ], + dynamic_axes={ 
+ "input_features": {0: "num_voxels", 1: "num_max_points"}, + "pillar_features": {0: "num_voxels"}, + }, + ), + tensorrt_profile=dict( + input_features=dict( + min_shape=[1000, 32, 11], + opt_shape=[20000, 32, 11], + max_shape=[64000, 32, 11], + ), + ), + ), + backbone_head=dict( + name="pts_backbone_neck_head", + onnx_file="pts_backbone_neck_head.onnx", + engine_file="pts_backbone_neck_head.engine", + io=dict( + inputs=[ + dict(name="spatial_features", dtype="float32"), + ], + outputs=[ + dict(name="heatmap", dtype="float32"), + dict(name="reg", dtype="float32"), + dict(name="height", dtype="float32"), + dict(name="dim", dtype="float32"), + dict(name="rot", dtype="float32"), + dict(name="vel", dtype="float32"), + ], + dynamic_axes={ + "spatial_features": {0: "batch_size", 2: "height", 3: "width"}, + "heatmap": {0: "batch_size", 2: "height", 3: "width"}, + "reg": {0: "batch_size", 2: "height", 3: "width"}, + "height": {0: "batch_size", 2: "height", 3: "width"}, + "dim": {0: "batch_size", 2: "height", 3: "width"}, + "rot": {0: "batch_size", 2: "height", 3: "width"}, + "vel": {0: "batch_size", 2: "height", 3: "width"}, + }, + ), + tensorrt_profile=dict( + spatial_features=dict( + min_shape=[1, 32, 1020, 1020], + opt_shape=[1, 32, 1020, 1020], + max_shape=[1, 32, 1020, 1020], + ), + ), + ), +) + +# ============================================================================ +# Runtime I/O settings +# ============================================================================ +runtime_io = dict( + # This should be a path relative to `data_root` in the model config. + info_file="info/t4dataset_j6gen2_infos_val.pkl", + sample_idx=1, +) + +# ============================================================================ +# ONNX Export Settings (shared across all components) +# ============================================================================ +onnx_config = dict( + opset_version=16, + do_constant_folding=True, + export_params=True, + keep_initializers_as_inputs=False, + simplify=False, +) + +# ============================================================================ +# TensorRT Build Settings (shared across all components) +# ============================================================================ +tensorrt_config = dict( + precision_policy="auto", + max_workspace_size=2 << 30, +) + +# ============================================================================ +# Evaluation Configuration +# ============================================================================ +evaluation = dict( + enabled=True, + num_samples=1, + verbose=True, + backends=dict( + pytorch=dict( + enabled=True, + device=devices["cuda"], + ), + onnx=dict( + enabled=True, + device=devices["cuda"], + model_dir=_ONNX_DIR, + ), + tensorrt=dict( + enabled=True, + device=devices["cuda"], + engine_dir=_TENSORRT_DIR, + ), + ), +) + +# ============================================================================ +# Verification Configuration +# ============================================================================ +verification = dict( + enabled=False, + tolerance=1e-1, + num_verify_samples=1, + devices=devices, + scenarios=dict( + both=[ + dict(ref_backend="pytorch", ref_device="cpu", test_backend="onnx", test_device="cpu"), + dict(ref_backend="onnx", ref_device="cuda", test_backend="tensorrt", test_device="cuda"), + ], + onnx=[ + dict(ref_backend="pytorch", ref_device="cpu", test_backend="onnx", test_device="cpu"), + ], + trt=[ + dict(ref_backend="onnx", ref_device="cuda", test_backend="tensorrt", 
test_device="cuda"), + ], + none=[], + ), +) diff --git a/deployment/projects/registry.py b/deployment/projects/registry.py new file mode 100644 index 000000000..c5932323c --- /dev/null +++ b/deployment/projects/registry.py @@ -0,0 +1,55 @@ +""" +Project registry for deployment bundles. + +Each deployment project registers an adapter that knows how to: +- add its CLI args +- construct data_loader / evaluator / runner +- execute the deployment workflow + +This keeps `deployment/cli/main.py` project-agnostic. +""" + +from __future__ import annotations + +from dataclasses import dataclass +from typing import Callable, Dict, Optional + + +@dataclass(frozen=True) +class ProjectAdapter: + """Minimal adapter interface for a deployment project.""" + + name: str + add_args: Callable # (argparse.ArgumentParser) -> None + run: Callable # (argparse.Namespace) -> int + + +class ProjectRegistry: + """In-memory registry of deployment project adapters. + + The unified CLI discovers and imports `deployment.projects.` packages; + each package registers a `ProjectAdapter` here. This keeps core/cli code + project-agnostic while enabling project-specific argument wiring and run logic. + """ + + def __init__(self) -> None: + self._adapters: Dict[str, ProjectAdapter] = {} + + def register(self, adapter: ProjectAdapter) -> None: + name = adapter.name.strip().lower() + if not name: + raise ValueError("ProjectAdapter.name must be non-empty") + self._adapters[name] = adapter + + def get(self, name: str) -> ProjectAdapter: + key = (name or "").strip().lower() + if key not in self._adapters: + available = ", ".join(sorted(self._adapters.keys())) + raise KeyError(f"Unknown project '{name}'. Available: [{available}]") + return self._adapters[key] + + def list_projects(self) -> list[str]: + return sorted(self._adapters.keys()) + + +project_registry = ProjectRegistry() diff --git a/deployment/runtime/__init__.py b/deployment/runtime/__init__.py new file mode 100644 index 000000000..6f0d383a2 --- /dev/null +++ b/deployment/runtime/__init__.py @@ -0,0 +1,25 @@ +"""Shared deployment runtime (runner + orchestrators). + +This package contains the project-agnostic runtime execution layer: +- BaseDeploymentRunner +- Export/Verification/Evaluation orchestrators +- ArtifactManager + +Project-specific code should live under `deployment/projects//`. +""" + +from deployment.runtime.artifact_manager import ArtifactManager +from deployment.runtime.evaluation_orchestrator import EvaluationOrchestrator +from deployment.runtime.export_orchestrator import ExportOrchestrator, ExportResult +from deployment.runtime.runner import BaseDeploymentRunner, DeploymentResult +from deployment.runtime.verification_orchestrator import VerificationOrchestrator + +__all__ = [ + "ArtifactManager", + "ExportOrchestrator", + "ExportResult", + "VerificationOrchestrator", + "EvaluationOrchestrator", + "BaseDeploymentRunner", + "DeploymentResult", +] diff --git a/deployment/runtime/artifact_manager.py b/deployment/runtime/artifact_manager.py new file mode 100644 index 000000000..2996e4c15 --- /dev/null +++ b/deployment/runtime/artifact_manager.py @@ -0,0 +1,133 @@ +""" +Artifact management for deployment workflows. + +This module handles registration and resolution of model artifacts (PyTorch checkpoints, +ONNX models, TensorRT engines) across different backends. 
+""" + +import logging +import os.path as osp +from collections.abc import Mapping +from typing import Any, Dict, Optional, Tuple + +from deployment.core.artifacts import Artifact +from deployment.core.backend import Backend +from deployment.core.config.base_config import BaseDeploymentConfig + + +class ArtifactManager: + """ + Manages model artifacts and path resolution for deployment workflows. + + Resolution Order (consistent for all backends): + 1. Registered artifacts (from export operations) - highest priority + 2. Explicit paths from evaluation.backends. config: + - ONNX: evaluation.backends.onnx.model_dir + - TensorRT: evaluation.backends.tensorrt.engine_dir + 3. Backend-specific fallback paths: + - PyTorch: checkpoint_path + - ONNX: export.onnx_path + """ + + def __init__(self, config: BaseDeploymentConfig, logger: logging.Logger): + """ + Initialize artifact manager. + + Args: + config: Deployment configuration + logger: Logger instance + """ + self.config = config + self.logger = logger + self.artifacts: Dict[str, Artifact] = {} + + def register_artifact(self, backend: Backend, artifact: Artifact) -> None: + """ + Register an artifact for a given backend. + + Args: + backend: Backend to register the artifact for + artifact: Artifact to register + """ + self.artifacts[backend.value] = artifact + self.logger.debug(f"Registered {backend.value} artifact: {artifact.path}") + + def get_artifact(self, backend: Backend) -> Optional[Artifact]: + """ + Get an artifact for a given backend. + + Args: + backend: Backend to get the artifact for + Returns: + Artifact for the given backend + """ + return self.artifacts.get(backend.value) + + def resolve_artifact(self, backend: Backend) -> Tuple[Optional[Artifact], bool]: + """ + Resolve an artifact for a given backend. + + Args: + backend: Backend to resolve the artifact for + Returns: + Tuple containing the artifact and a boolean indicating if the artifact exists + """ + artifact = self.artifacts.get(backend.value) + if artifact: + return artifact, artifact.exists + + config_path = self._get_config_path(backend) + if config_path: + is_dir = osp.isdir(config_path) if osp.exists(config_path) else False + artifact = Artifact(path=config_path, multi_file=is_dir) + return artifact, artifact.exists + + return None, False + + def _get_config_path(self, backend: Backend) -> Optional[str]: + """ + Get the configuration path for a given backend. + + Args: + backend: Backend to get the configuration path for + Returns: + Configuration path for the given backend + """ + eval_backends = self.config.evaluation_config.backends + backend_cfg = self._get_backend_entry(eval_backends, backend) + if backend_cfg and isinstance(backend_cfg, Mapping): + if backend == Backend.ONNX: + path = backend_cfg.get("model_dir") + if path: + return path + elif backend == Backend.TENSORRT: + path = backend_cfg.get("engine_dir") + if path: + return path + + if backend == Backend.PYTORCH: + return self.config.checkpoint_path + if backend == Backend.ONNX: + return self.config.export_config.onnx_path + + return None + + @staticmethod + def _get_backend_entry(mapping: Optional[Mapping], backend: Backend) -> Any: + """ + Get a backend entry from a mapping. 
+ + Args: + mapping: Mapping to get the backend entry from + backend: Backend to get the entry for + Returns: + Backend entry from the mapping + """ + if not mapping: + return None + + value = mapping.get(backend.value) + if value is not None: + return value + + return mapping.get(backend) diff --git a/deployment/runtime/evaluation_orchestrator.py b/deployment/runtime/evaluation_orchestrator.py new file mode 100644 index 000000000..1759535da --- /dev/null +++ b/deployment/runtime/evaluation_orchestrator.py @@ -0,0 +1,217 @@ +""" +Evaluation orchestration for deployment workflows. + +This module handles cross-backend evaluation with consistent metrics. +""" + +from __future__ import annotations + +import logging +from typing import Any, Dict, List, Mapping + +from deployment.core.backend import Backend +from deployment.core.config.base_config import BaseDeploymentConfig +from deployment.core.evaluation.base_evaluator import BaseEvaluator +from deployment.core.evaluation.evaluator_types import ModelSpec +from deployment.core.io.base_data_loader import BaseDataLoader +from deployment.runtime.artifact_manager import ArtifactManager + + +class EvaluationOrchestrator: + """ + Orchestrates evaluation across backends with consistent metrics. + + This class handles: + - Resolving models to evaluate from configuration + - Running evaluation for each enabled backend + - Collecting and formatting evaluation results + - Logging evaluation progress and results + - Cross-backend metric comparison + """ + + def __init__( + self, + config: BaseDeploymentConfig, + evaluator: BaseEvaluator, + data_loader: BaseDataLoader, + logger: logging.Logger, + ): + """ + Initialize the evaluation orchestrator. + + Args: + config: Deployment configuration + evaluator: Evaluator instance for running evaluation + data_loader: Data loader for loading samples + logger: Logger instance + """ + self.config = config + self.evaluator = evaluator + self.data_loader = data_loader + self.logger = logger + + def run(self, artifact_manager: ArtifactManager) -> Dict[str, Any]: + """ + Run the evaluation orchestration. 
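+
+ Each enabled backend in ``evaluation.backends`` is resolved through the
+ artifact manager and evaluated in turn; CUDA caches are cleared between
+ backends, and per-backend failures are recorded under an "error" key
+ instead of aborting the whole run.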
+ + Args: + artifact_manager: Artifact manager for resolving model paths + Returns: + Dictionary of evaluation results + """ + eval_config = self.config.evaluation_config + + if not eval_config.enabled: + self.logger.info("Evaluation disabled, skipping...") + return {} + + self.logger.info("=" * 80) + self.logger.info("Running Evaluation") + self.logger.info("=" * 80) + + models_to_evaluate = self._get_models_to_evaluate(artifact_manager) + if not models_to_evaluate: + self.logger.warning("No models found for evaluation") + return {} + + num_samples = eval_config.num_samples + if num_samples == -1: + num_samples = self.data_loader.num_samples + + verbose_mode = eval_config.verbose + all_results: Dict[str, Any] = {} + + for spec in models_to_evaluate: + backend = spec.backend + backend_device = self._normalize_device_for_backend(backend, spec.device) + normalized_spec = ModelSpec(backend=backend, device=backend_device, artifact=spec.artifact) + + self.logger.info(f"\nEvaluating {backend.value} on {backend_device}...") + try: + results = self.evaluator.evaluate( + model=normalized_spec, + data_loader=self.data_loader, + num_samples=num_samples, + verbose=verbose_mode, + ) + all_results[backend.value] = results + self.logger.info(f"\n{backend.value.upper()} Results:") + self.evaluator.print_results(results) + except Exception as e: + self.logger.error(f"Evaluation failed for {backend.value}: {e}", exc_info=True) + all_results[backend.value] = {"error": str(e)} + finally: + from deployment.pipelines.gpu_resource_mixin import clear_cuda_memory + + clear_cuda_memory() + + if len(all_results) > 1: + self._print_cross_backend_comparison(all_results) + + return all_results + + def _get_models_to_evaluate(self, artifact_manager: ArtifactManager) -> List[ModelSpec]: + """ + Get the models to evaluate from the configuration. + + Args: + artifact_manager: Artifact manager for resolving model paths + Returns: + List of model specifications + """ + backends = self.config.evaluation_backends + models_to_evaluate: List[ModelSpec] = [] + + for backend_key, backend_cfg in backends.items(): + backend_enum = Backend.from_value(backend_key) + if not backend_cfg.get("enabled", False): + continue + + device = str(backend_cfg.get("device", "cpu") or "cpu") + artifact, is_valid = artifact_manager.resolve_artifact(backend_enum) + + if is_valid and artifact: + spec = ModelSpec(backend=backend_enum, device=device, artifact=artifact) + models_to_evaluate.append(spec) + self.logger.info(f" - {backend_enum.value}: {artifact.path} (device: {device})") + elif artifact is not None: + self.logger.warning(f" - {backend_enum.value}: {artifact.path} (not found or invalid, skipping)") + + return models_to_evaluate + + def _normalize_device_for_backend(self, backend: Backend, device: str) -> str: + """ + Normalize the device for a backend. + + Args: + backend: Backend to normalize the device for + device: Device to normalize + Returns: + Normalized device string + """ + normalized_device = str(device or self._get_default_device(backend) or "cpu") + + if backend in (Backend.PYTORCH, Backend.ONNX): + if normalized_device not in ("cpu",) and not normalized_device.startswith("cuda"): + self.logger.warning( + f"Unsupported device '{normalized_device}' for backend '{backend.value}'. Falling back to CPU." 
+ ) + normalized_device = "cpu" + elif backend is Backend.TENSORRT: + if not normalized_device or normalized_device == "cpu": + normalized_device = self.config.devices.cuda or "cuda:0" + if not normalized_device.startswith("cuda"): + self.logger.warning( + "TensorRT evaluation requires CUDA device. " + f"Overriding device from '{normalized_device}' to 'cuda:0'." + ) + normalized_device = "cuda:0" + + return normalized_device + + def _get_default_device(self, backend: Backend) -> str: + """ + Get the default device for a backend. + + Args: + backend: Backend to get the default device for + Returns: + Default device string + """ + if backend is Backend.TENSORRT: + return self.config.devices.cuda or "cuda:0" + return self.config.devices.cpu or "cpu" + + def _print_cross_backend_comparison(self, all_results: Mapping[str, Any]) -> None: + """ + Print the cross-backend comparison results. + + Args: + all_results: Dictionary of all results + """ + self.logger.info("\n" + "=" * 80) + self.logger.info("Cross-Backend Comparison") + self.logger.info("=" * 80) + + for backend_label, results in all_results.items(): + self.logger.info(f"\n{backend_label.upper()}:") + if results and "error" not in results: + if "accuracy" in results: + self.logger.info(f" Accuracy: {results.get('accuracy', 0):.4f}") + if "mAP_by_mode" in results: + mAP_by_mode = results.get("mAP_by_mode", {}) + if mAP_by_mode: + for mode, map_value in mAP_by_mode.items(): + self.logger.info(f" mAP ({mode}): {map_value:.4f}") + + if "mAPH_by_mode" in results: + mAPH_by_mode = results.get("mAPH_by_mode", {}) + if mAPH_by_mode: + for mode, maph_value in mAPH_by_mode.items(): + self.logger.info(f" mAPH ({mode}): {maph_value:.4f}") + + if "latency" in results: + latency = results["latency"] + self.logger.info(f" Latency: {latency.mean_ms:.2f} ± {latency.std_ms:.2f} ms") + else: + self.logger.info(" No results available") diff --git a/deployment/runtime/export_orchestrator.py b/deployment/runtime/export_orchestrator.py new file mode 100644 index 000000000..633f2d80a --- /dev/null +++ b/deployment/runtime/export_orchestrator.py @@ -0,0 +1,534 @@ +""" +Export orchestration for deployment workflows. + +This module handles all model export logic (PyTorch loading, ONNX export, TensorRT export) +in a unified orchestrator, keeping the deployment runner thin. +""" + +from __future__ import annotations + +import logging +import os +from dataclasses import dataclass +from typing import Any, Callable, Mapping, Optional, Type + +import torch + +from deployment.core.artifacts import Artifact +from deployment.core.backend import Backend +from deployment.core.config.base_config import BaseDeploymentConfig +from deployment.core.contexts import ExportContext +from deployment.core.io.base_data_loader import BaseDataLoader +from deployment.exporters.common.factory import ExporterFactory +from deployment.exporters.common.model_wrappers import BaseModelWrapper +from deployment.exporters.common.onnx_exporter import ONNXExporter +from deployment.exporters.common.tensorrt_exporter import TensorRTExporter +from deployment.exporters.export_pipelines.base import OnnxExportPipeline, TensorRTExportPipeline +from deployment.runtime.artifact_manager import ArtifactManager + + +@dataclass +class ExportResult: + """ + Result of the export orchestration. 
+ + Attributes: + pytorch_model: Loaded PyTorch model (if loaded) + onnx_path: Path to exported ONNX artifact + tensorrt_path: Path to exported TensorRT engine + """ + + pytorch_model: Optional[Any] = None + onnx_path: Optional[str] = None + tensorrt_path: Optional[str] = None + + +class ExportOrchestrator: + """ + Orchestrates model export workflows (PyTorch loading, ONNX, TensorRT). + + This class centralizes all export-related logic: + - Determining when PyTorch model is needed + - Loading PyTorch model via injected loader + - ONNX export (via workflow or standard exporter) + - TensorRT export (via workflow or standard exporter) + - Artifact registration + + By extracting this logic from the runner, the runner becomes a thin + orchestrator that coordinates Export, Verification, and Evaluation. + """ + + ONNX_DIR_NAME = "onnx" + TENSORRT_DIR_NAME = "tensorrt" + DEFAULT_ENGINE_FILENAME = "model.engine" + + def __init__( + self, + config: BaseDeploymentConfig, + data_loader: BaseDataLoader, + artifact_manager: ArtifactManager, + logger: logging.Logger, + model_loader: Callable[..., Any], + evaluator: Any, + onnx_wrapper_cls: Optional[Type[BaseModelWrapper]] = None, + onnx_pipeline: Optional[OnnxExportPipeline] = None, + tensorrt_pipeline: Optional[TensorRTExportPipeline] = None, + ): + """ + Initialize export orchestrator. + + Args: + config: Deployment configuration + data_loader: Data loader for loading samples + artifact_manager: Artifact manager for resolving model paths + logger: Logger instance + model_loader: Model loader for loading PyTorch model + evaluator: Evaluator instance for running verification + onnx_wrapper_cls: ONNX wrapper class for exporting ONNX model + onnx_pipeline: ONNX export pipeline + tensorrt_pipeline: TensorRT export pipeline + """ + self.config = config + self.data_loader = data_loader + self.artifact_manager = artifact_manager + self.logger = logger + self._model_loader = model_loader + self._evaluator = evaluator + self._onnx_wrapper_cls = onnx_wrapper_cls + self._onnx_pipeline = onnx_pipeline + self._tensorrt_pipeline = tensorrt_pipeline + + self._onnx_exporter: Optional[ONNXExporter] = None + self._tensorrt_exporter: Optional[TensorRTExporter] = None + + def run(self, context: Optional[ExportContext] = None) -> ExportResult: + """ + Execute the complete export workflow. + + This method: + 1. Determines if PyTorch model is needed + 2. Loads PyTorch model if needed + 3. Exports to ONNX if configured + 4. Exports to TensorRT if configured + 5. Resolves external artifact paths + + Args: + context: Typed export context with parameters. If None, a default + ExportContext is created. 
+ + Returns: + ExportResult containing model and artifact paths + """ + if context is None: + context = ExportContext() + + result = ExportResult() + + should_export_onnx = self.config.export_config.should_export_onnx + should_export_trt = self.config.export_config.should_export_tensorrt + checkpoint_path = self.config.checkpoint_path + external_onnx_path = self.config.export_config.onnx_path + + requires_pytorch = self._determine_pytorch_requirements() + + pytorch_model = None + if requires_pytorch: + pytorch_model, success = self._ensure_pytorch_model_loaded(pytorch_model, checkpoint_path, context, result) + if not success: + return result + + if should_export_onnx: + result.onnx_path = self._run_onnx_export(pytorch_model, context) + + if should_export_trt: + onnx_path = self._resolve_onnx_path_for_trt(result.onnx_path, external_onnx_path) + if not onnx_path: + return result + result.onnx_path = onnx_path + self._register_external_onnx_artifact(onnx_path) + result.tensorrt_path = self._run_tensorrt_export(onnx_path, context) + + self._resolve_external_artifacts(result) + return result + + def _determine_pytorch_requirements(self) -> bool: + """ + Determine if PyTorch model is required based on configuration. + + Returns: + True if PyTorch model is needed, False otherwise + """ + if self.config.export_config.should_export_onnx: + return True + + eval_config = self.config.evaluation_config + if eval_config.enabled: + backends_cfg = eval_config.backends + pytorch_cfg = backends_cfg.get(Backend.PYTORCH.value) or backends_cfg.get(Backend.PYTORCH, {}) + if pytorch_cfg and pytorch_cfg.get("enabled", False): + return True + + verification_cfg = self.config.verification_config + if verification_cfg.enabled: + export_mode = self.config.export_config.mode + scenarios = self.config.get_verification_scenarios(export_mode) + if scenarios and any( + policy.ref_backend is Backend.PYTORCH or policy.test_backend is Backend.PYTORCH for policy in scenarios + ): + return True + + return False + + def _load_and_register_pytorch_model(self, checkpoint_path: str, context: ExportContext) -> Optional[Any]: + """ + Load and register a PyTorch model from checkpoint. + + Args: + checkpoint_path: Path to the PyTorch checkpoint + context: Export context with sample index + Returns: + Loaded PyTorch model or None if loading failed + """ + if not checkpoint_path: + self.logger.error( + "Checkpoint required but not provided. Please set export.checkpoint_path in config or pass via CLI." + ) + return None + + self.logger.info("\nLoading PyTorch model...") + try: + pytorch_model = self._model_loader(checkpoint_path, context) + self.artifact_manager.register_artifact(Backend.PYTORCH, Artifact(path=checkpoint_path)) + + if hasattr(self._evaluator, "set_pytorch_model"): + self._evaluator.set_pytorch_model(pytorch_model) + self.logger.info("Updated evaluator with pre-built PyTorch model via set_pytorch_model()") + + return pytorch_model + except Exception as e: + self.logger.error(f"Failed to load PyTorch model: {e}") + return None + + def _ensure_pytorch_model_loaded( + self, + pytorch_model: Optional[Any], + checkpoint_path: str, + context: ExportContext, + result: ExportResult, + ) -> tuple[Optional[Any], bool]: + """ + Ensure a PyTorch model is loaded and registered. 
+ + Args: + pytorch_model: Existing PyTorch model (if any) + checkpoint_path: Path to the PyTorch checkpoint + context: Export context with sample index + result: Export result object to store the model + Returns: + Tuple containing the loaded model and success flag + """ + if pytorch_model is not None: + return pytorch_model, True + + if not checkpoint_path: + self.logger.error("PyTorch model required but checkpoint_path not provided.") + return None, False + + pytorch_model = self._load_and_register_pytorch_model(checkpoint_path, context) + if pytorch_model is None: + self.logger.error("Failed to load PyTorch model; aborting export.") + return None, False + + result.pytorch_model = pytorch_model + return pytorch_model, True + + def _run_onnx_export(self, pytorch_model: Any, context: ExportContext) -> Optional[str]: + """ + Run the ONNX export workflow. + + Args: + pytorch_model: PyTorch model to export + context: Export context with sample index + Returns: + Path to the exported ONNX artifact or None if export failed + """ + onnx_artifact = self._export_onnx(pytorch_model, context) + if onnx_artifact: + return onnx_artifact.path + self.logger.error("ONNX export requested but no artifact was produced.") + return None + + def _resolve_onnx_path_for_trt( + self, exported_onnx_path: Optional[str], external_onnx_path: Optional[str] + ) -> Optional[str]: + """ + Resolve the ONNX path for TensorRT export. + + Args: + exported_onnx_path: Path to an exported ONNX artifact (if any) + external_onnx_path: Path to an external ONNX artifact (if any) + Returns: + Resolved ONNX path or None if resolution failed + """ + onnx_path = exported_onnx_path or external_onnx_path + if not onnx_path: + self.logger.error( + "TensorRT export requires an ONNX path. " + "Please set export.onnx_path in config or enable ONNX export." + ) + return None + return onnx_path + + def _register_external_onnx_artifact(self, onnx_path: str) -> None: + """ + Register an external ONNX artifact. + + Args: + onnx_path: Path to the ONNX artifact + """ + if not os.path.exists(onnx_path): + return + multi_file = os.path.isdir(onnx_path) + self.artifact_manager.register_artifact(Backend.ONNX, Artifact(path=onnx_path, multi_file=multi_file)) + + def _run_tensorrt_export(self, onnx_path: str, context: ExportContext) -> Optional[str]: + """ + Run the TensorRT export workflow. + + Args: + onnx_path: Path to the ONNX artifact + context: Export context with sample index + Returns: + Path to the exported TensorRT engine or None if export failed + """ + trt_artifact = self._export_tensorrt(onnx_path, context) + if trt_artifact: + return trt_artifact.path + self.logger.error("TensorRT export requested but no artifact was produced.") + return None + + def _export_onnx(self, pytorch_model: Any, context: ExportContext) -> Optional[Artifact]: + """ + Export a PyTorch model to ONNX. 
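+
+ Prefers an injected ONNX export pipeline when one is provided; otherwise
+ falls back to the standard ONNX exporter built from the configured
+ wrapper class.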
+ + Args: + pytorch_model: PyTorch model to export + context: Export context with sample index + Returns: + Artifact representing the exported ONNX model or None if export is not configured + """ + if not self.config.export_config.should_export_onnx: + return None + + if self._onnx_pipeline is None and self._onnx_wrapper_cls is None: + raise RuntimeError("ONNX export requested but no wrapper class or export pipeline provided.") + + onnx_settings = self.config.get_onnx_settings() + sample_idx = context.sample_idx if context.sample_idx != 0 else self.config.runtime_config.sample_idx + + onnx_dir = os.path.join(self.config.export_config.work_dir, self.ONNX_DIR_NAME) + os.makedirs(onnx_dir, exist_ok=True) + output_path = os.path.join(onnx_dir, onnx_settings.save_file) + + if self._onnx_pipeline is not None: + self.logger.info("=" * 80) + self.logger.info(f"Exporting to ONNX via pipeline ({type(self._onnx_pipeline).__name__})") + self.logger.info("=" * 80) + artifact = self._onnx_pipeline.export( + model=pytorch_model, + data_loader=self.data_loader, + output_dir=onnx_dir, + config=self.config, + sample_idx=sample_idx, + ) + self.artifact_manager.register_artifact(Backend.ONNX, artifact) + self.logger.info(f"ONNX export successful: {artifact.path}") + return artifact + + exporter = self._get_onnx_exporter() + self.logger.info("=" * 80) + self.logger.info(f"Exporting to ONNX (Using {type(exporter).__name__})") + self.logger.info("=" * 80) + + sample = self.data_loader.load_sample(sample_idx) + single_input = self.data_loader.preprocess(sample) + + batch_size = onnx_settings.batch_size + if batch_size is None: + input_tensor = single_input + self.logger.info("Using dynamic batch size") + else: + if isinstance(single_input, (list, tuple)): + input_tensor = tuple( + inp.repeat(batch_size, *([1] * (len(inp.shape) - 1))) if len(inp.shape) > 0 else inp + for inp in single_input + ) + else: + input_tensor = single_input.repeat(batch_size, *([1] * (len(single_input.shape) - 1))) + self.logger.info(f"Using fixed batch size: {batch_size}") + + exporter.export(pytorch_model, input_tensor, output_path) + + multi_file = bool(self.config.onnx_config.get("multi_file", False)) + artifact_path = onnx_dir if multi_file else output_path + artifact = Artifact(path=artifact_path, multi_file=multi_file) + self.artifact_manager.register_artifact(Backend.ONNX, artifact) + self.logger.info(f"ONNX export successful: {artifact.path}") + return artifact + + def _export_tensorrt(self, onnx_path: str, context: ExportContext) -> Optional[Artifact]: + """ + Export an ONNX model to TensorRT. 
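+
+ Uses the injected TensorRT export pipeline when available; otherwise
+ builds the engine with the standard TensorRT exporter. A CUDA device must
+ be configured for either path.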
+ + Args: + onnx_path: Path to the ONNX artifact + context: Export context with sample index + Returns: + Artifact representing the exported TensorRT engine or None if export is not configured + """ + if not self.config.export_config.should_export_tensorrt: + return None + + if not onnx_path: + self.logger.warning("ONNX path not available, skipping TensorRT export") + return None + + exporter_label = None if self._tensorrt_pipeline else type(self._get_tensorrt_exporter()).__name__ + self.logger.info("=" * 80) + if self._tensorrt_pipeline: + self.logger.info(f"Exporting to TensorRT via pipeline ({type(self._tensorrt_pipeline).__name__})") + else: + self.logger.info(f"Exporting to TensorRT (Using {exporter_label})") + self.logger.info("=" * 80) + + tensorrt_dir = os.path.join(self.config.export_config.work_dir, self.TENSORRT_DIR_NAME) + os.makedirs(tensorrt_dir, exist_ok=True) + output_path = self._get_tensorrt_output_path(onnx_path, tensorrt_dir) + + cuda_device = self.config.devices.cuda + device_id = self.config.devices.cuda_device_index + if cuda_device is None or device_id is None: + raise RuntimeError("TensorRT export requires a CUDA device. Set deploy_cfg.devices['cuda'].") + torch.cuda.set_device(device_id) + self.logger.info(f"Using CUDA device for TensorRT export: {cuda_device}") + + sample_idx = context.sample_idx if context.sample_idx != 0 else self.config.runtime_config.sample_idx + sample_input = self.data_loader.get_shape_sample(sample_idx) + + if self._tensorrt_pipeline is not None: + artifact = self._tensorrt_pipeline.export( + onnx_path=onnx_path, + output_dir=tensorrt_dir, + config=self.config, + device=cuda_device, + ) + self.artifact_manager.register_artifact(Backend.TENSORRT, artifact) + self.logger.info(f"TensorRT export successful: {artifact.path}") + return artifact + + exporter = self._get_tensorrt_exporter() + artifact = exporter.export( + model=None, + sample_input=sample_input, + output_path=output_path, + onnx_path=onnx_path, + ) + self.artifact_manager.register_artifact(Backend.TENSORRT, artifact) + self.logger.info(f"TensorRT export successful: {artifact.path}") + return artifact + + def _get_onnx_exporter(self) -> ONNXExporter: + """ + Get the ONNX exporter instance. + + Returns: + ONNX exporter instance + """ + if self._onnx_exporter is None: + if self._onnx_wrapper_cls is None: + raise RuntimeError("ONNX wrapper class not provided. Cannot create ONNX exporter.") + self._onnx_exporter = ExporterFactory.create_onnx_exporter( + config=self.config, + wrapper_cls=self._onnx_wrapper_cls, + logger=self.logger, + ) + return self._onnx_exporter + + def _get_tensorrt_exporter(self) -> TensorRTExporter: + """ + Get the TensorRT exporter instance. + + Returns: + TensorRT exporter instance + """ + if self._tensorrt_exporter is None: + self._tensorrt_exporter = ExporterFactory.create_tensorrt_exporter( + config=self.config, + logger=self.logger, + ) + return self._tensorrt_exporter + + def _get_tensorrt_output_path(self, onnx_path: str, tensorrt_dir: str) -> str: + """ + Get the output path for the TensorRT engine. 
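+
+ A multi-file ONNX directory maps to the default engine filename; a single
+ ONNX file keeps its basename with a ``.engine`` suffix.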
+ + Args: + onnx_path: Path to the ONNX artifact + tensorrt_dir: Directory for TensorRT output + Returns: + Path to the TensorRT engine file + """ + if os.path.isdir(onnx_path): + return os.path.join(tensorrt_dir, self.DEFAULT_ENGINE_FILENAME) + onnx_filename = os.path.basename(onnx_path) + engine_filename = onnx_filename.replace(".onnx", ".engine") + return os.path.join(tensorrt_dir, engine_filename) + + def _resolve_external_artifacts(self, result: ExportResult) -> None: + """ + Resolve and register external artifacts from configuration. + + Args: + result: Export result object to store the artifacts + """ + if not result.onnx_path: + self._resolve_and_register_artifact(Backend.ONNX, result, "onnx_path") + + if not result.tensorrt_path: + self._resolve_and_register_artifact(Backend.TENSORRT, result, "tensorrt_path") + + def _resolve_and_register_artifact(self, backend: Backend, result: ExportResult, attr_name: str) -> None: + """ + Resolve and register an artifact from configuration. + + Args: + backend: Backend to resolve the artifact for + result: Export result object to store the artifact + attr_name: Attribute name to set in the result + """ + eval_models = self.config.evaluation_config.models + artifact_path = self._get_backend_entry(eval_models, backend) + + if artifact_path and os.path.exists(artifact_path): + setattr(result, attr_name, artifact_path) + multi_file = os.path.isdir(artifact_path) + self.artifact_manager.register_artifact(backend, Artifact(path=artifact_path, multi_file=multi_file)) + elif artifact_path: + self.logger.warning(f"{backend.value} file from config does not exist: {artifact_path}") + + @staticmethod + def _get_backend_entry(mapping: Optional[Mapping[Any, Any]], backend: Backend) -> Any: + """ + Get a backend entry from a mapping. + + Args: + mapping: Mapping to get the backend entry from + backend: Backend to get the entry for + Returns: + Backend entry from the mapping + """ + if not mapping: + return None + if backend.value in mapping: + return mapping[backend.value] + return mapping.get(backend) diff --git a/deployment/runtime/runner.py b/deployment/runtime/runner.py new file mode 100644 index 000000000..1c0ae6692 --- /dev/null +++ b/deployment/runtime/runner.py @@ -0,0 +1,109 @@ +""" +Unified deployment runner for common deployment workflows. 
+ +Project-agnostic runtime runner that orchestrates: +- Export (PyTorch -> ONNX -> TensorRT) +- Verification (scenario-based comparisons) +- Evaluation (metrics/latency across backends) +""" + +from __future__ import annotations + +import logging +from dataclasses import asdict, dataclass, field +from typing import Any, Dict, Optional, Type + +from mmengine.config import Config + +from deployment.core import BaseDataLoader, BaseDeploymentConfig, BaseEvaluator +from deployment.core.contexts import ExportContext +from deployment.exporters.common.model_wrappers import BaseModelWrapper +from deployment.exporters.export_pipelines.base import OnnxExportPipeline, TensorRTExportPipeline +from deployment.runtime.artifact_manager import ArtifactManager +from deployment.runtime.evaluation_orchestrator import EvaluationOrchestrator +from deployment.runtime.export_orchestrator import ExportOrchestrator +from deployment.runtime.verification_orchestrator import VerificationOrchestrator + + +@dataclass +class DeploymentResult: + """Standardized structure returned by `BaseDeploymentRunner.run()`.""" + + pytorch_model: Optional[Any] = None + onnx_path: Optional[str] = None + tensorrt_path: Optional[str] = None + verification_results: Dict[str, Any] = field(default_factory=dict) + evaluation_results: Dict[str, Any] = field(default_factory=dict) + + def to_dict(self) -> Dict[str, Any]: + return asdict(self) + + +class BaseDeploymentRunner: + """Base deployment runner for common deployment pipelines.""" + + def __init__( + self, + data_loader: BaseDataLoader, + evaluator: BaseEvaluator, + config: BaseDeploymentConfig, + model_cfg: Config, + logger: logging.Logger, + onnx_wrapper_cls: Optional[Type[BaseModelWrapper]] = None, + onnx_pipeline: Optional[OnnxExportPipeline] = None, + tensorrt_pipeline: Optional[TensorRTExportPipeline] = None, + ): + self.data_loader = data_loader + self.evaluator = evaluator + self.config = config + self.model_cfg = model_cfg + self.logger = logger + + self._onnx_wrapper_cls = onnx_wrapper_cls + self._onnx_pipeline = onnx_pipeline + self._tensorrt_pipeline = tensorrt_pipeline + + self.artifact_manager = ArtifactManager(config, logger) + + self._export_orchestrator: Optional[ExportOrchestrator] = None + self.verification_orchestrator = VerificationOrchestrator(config, evaluator, data_loader, logger) + self.evaluation_orchestrator = EvaluationOrchestrator(config, evaluator, data_loader, logger) + + @property + def export_orchestrator(self) -> ExportOrchestrator: + if self._export_orchestrator is None: + self._export_orchestrator = ExportOrchestrator( + config=self.config, + data_loader=self.data_loader, + artifact_manager=self.artifact_manager, + logger=self.logger, + model_loader=self.load_pytorch_model, + evaluator=self.evaluator, + onnx_wrapper_cls=self._onnx_wrapper_cls, + onnx_pipeline=self._onnx_pipeline, + tensorrt_pipeline=self._tensorrt_pipeline, + ) + return self._export_orchestrator + + def load_pytorch_model(self, checkpoint_path: str, context: ExportContext) -> Any: + raise NotImplementedError(f"{self.__class__.__name__}.load_pytorch_model() must be implemented by subclasses.") + + def run(self, context: Optional[ExportContext] = None) -> DeploymentResult: + if context is None: + context = ExportContext() + + results = DeploymentResult() + + export_result = self.export_orchestrator.run(context) + results.pytorch_model = export_result.pytorch_model + results.onnx_path = export_result.onnx_path + results.tensorrt_path = export_result.tensorrt_path + + 
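+ # Verification and evaluation resolve model paths through the shared
+ # artifact manager populated during export.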
results.verification_results = self.verification_orchestrator.run(artifact_manager=self.artifact_manager) + results.evaluation_results = self.evaluation_orchestrator.run(self.artifact_manager) + + self.logger.info("\n" + "=" * 80) + self.logger.info("Deployment Complete!") + self.logger.info("=" * 80) + + return results diff --git a/deployment/runtime/verification_orchestrator.py b/deployment/runtime/verification_orchestrator.py new file mode 100644 index 000000000..1151e0fc7 --- /dev/null +++ b/deployment/runtime/verification_orchestrator.py @@ -0,0 +1,180 @@ +""" +Verification orchestration for deployment workflows. + +This module handles scenario-based verification across different backends. +""" + +from __future__ import annotations + +import logging +from typing import Any, Dict, Mapping + +from deployment.core.backend import Backend +from deployment.core.config.base_config import BaseDeploymentConfig +from deployment.core.evaluation.base_evaluator import BaseEvaluator +from deployment.core.evaluation.evaluator_types import ModelSpec +from deployment.core.io.base_data_loader import BaseDataLoader +from deployment.runtime.artifact_manager import ArtifactManager + + +class VerificationOrchestrator: + """ + Orchestrates verification across backends using scenario-based verification. + + This class handles: + - Running verification scenarios from config + - Resolving model paths via ArtifactManager + - Collecting and aggregating verification results + - Logging verification progress and results + """ + + def __init__( + self, + config: BaseDeploymentConfig, + evaluator: BaseEvaluator, + data_loader: BaseDataLoader, + logger: logging.Logger, + ): + """ + Initialize verification orchestrator. + + Args: + config: Deployment configuration + evaluator: Evaluator instance for running verification + data_loader: Data loader for loading samples + logger: Logger instance + """ + self.config = config + self.evaluator = evaluator + self.data_loader = data_loader + self.logger = logger + + def run(self, artifact_manager: ArtifactManager) -> Dict[str, Any]: + """ + Run verification on exported models using policy-based verification. + + Args: + artifact_manager: Artifact manager for resolving model paths + Returns: + Verification results dictionary + """ + verification_cfg = self.config.verification_config + + if not verification_cfg.enabled: + self.logger.info("Verification disabled (verification.enabled=False), skipping...") + return {} + + export_mode = self.config.export_config.mode + scenarios = self.config.get_verification_scenarios(export_mode) + + if not scenarios: + self.logger.info(f"No verification scenarios for export mode '{export_mode.value}', skipping...") + return {} + + needs_pytorch = any( + policy.ref_backend is Backend.PYTORCH or policy.test_backend is Backend.PYTORCH for policy in scenarios + ) + if needs_pytorch: + _, pytorch_valid = artifact_manager.resolve_artifact(Backend.PYTORCH) + if not pytorch_valid: + self.logger.warning( + "PyTorch checkpoint not available, but required by verification scenarios. Skipping verification." 
+ ) + return {} + + num_verify_samples = verification_cfg.num_verify_samples + tolerance = verification_cfg.tolerance + devices_raw = verification_cfg.devices + if devices_raw is None: + devices_raw = {} + if not isinstance(devices_raw, Mapping): + raise TypeError(f"verification.devices must be a mapping, got {type(devices_raw).__name__}") + devices_map = dict(devices_raw) + devices_map.setdefault("cpu", self.config.devices.cpu or "cpu") + if self.config.devices.cuda: + devices_map.setdefault("cuda", self.config.devices.cuda) + + self.logger.info("=" * 80) + self.logger.info(f"Running Verification (mode: {export_mode.value})") + self.logger.info("=" * 80) + + all_results: Dict[str, Any] = {} + total_passed = 0 + total_failed = 0 + + for i, policy in enumerate(scenarios): + ref_device = self._resolve_device(policy.ref_device, devices_map) + test_device = self._resolve_device(policy.test_device, devices_map) + + self.logger.info( + f"\nScenario {i+1}/{len(scenarios)}: " + f"{policy.ref_backend.value}({ref_device}) vs {policy.test_backend.value}({test_device})" + ) + + ref_artifact, ref_valid = artifact_manager.resolve_artifact(policy.ref_backend) + test_artifact, test_valid = artifact_manager.resolve_artifact(policy.test_backend) + + if not ref_valid or not test_valid: + ref_path = ref_artifact.path if ref_artifact else None + test_path = test_artifact.path if test_artifact else None + self.logger.warning( + " Skipping: missing or invalid artifacts " + f"(ref={ref_path}, valid={ref_valid}, test={test_path}, valid={test_valid})" + ) + continue + + reference_spec = ModelSpec(backend=policy.ref_backend, device=ref_device, artifact=ref_artifact) + test_spec = ModelSpec(backend=policy.test_backend, device=test_device, artifact=test_artifact) + + verification_results = self.evaluator.verify( + reference=reference_spec, + test=test_spec, + data_loader=self.data_loader, + num_samples=num_verify_samples, + tolerance=tolerance, + verbose=False, + ) + + policy_key = f"{policy.ref_backend.value}_{ref_device}_vs_{policy.test_backend.value}_{test_device}" + all_results[policy_key] = verification_results + + if "summary" in verification_results: + summary = verification_results["summary"] + passed = summary.get("passed", 0) + failed = summary.get("failed", 0) + total_passed += passed + total_failed += failed + if failed == 0: + self.logger.info(f"Scenario {i+1} passed ({passed} comparisons)") + else: + self.logger.warning(f"Scenario {i+1} failed ({failed}/{passed+failed} comparisons)") + + self.logger.info("\n" + "=" * 80) + if total_failed == 0: + self.logger.info(f"All verifications passed! ({total_passed} total)") + else: + self.logger.warning(f"{total_failed}/{total_passed + total_failed} verifications failed") + self.logger.info("=" * 80) + + all_results["summary"] = { + "passed": total_passed, + "failed": total_failed, + "total": total_passed + total_failed, + } + + return all_results + + def _resolve_device(self, device_key: str, devices_map: Mapping[str, str]) -> str: + """ + Resolve a device key to a full device string. + + Args: + device_key: Device key to resolve + devices_map: Mapping of device keys to full device strings + Returns: + Resolved device string + """ + if device_key in devices_map: + return devices_map[device_key] + self.logger.warning(f"Device alias '{device_key}' not found in devices map, using as-is") + return device_key