Feature: Kernel Launch Overhead Reduction #10

@ShlokVFX

Description

Investigate and implement techniques to reduce kernel launch overhead,
including kernel fusion and persistent kernels.

Focus on inference workloads with small batch sizes, where launch overhead
makes up a larger share of per-step latency.
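
As a rough illustration of why small batches are the interesting case, here is a toy cost model. It is only a sketch: both constants are assumed placeholders rather than measurements, and `step_latency_us` and `launch_fraction` are hypothetical helper names.

```python
# Toy cost model, not a measurement. Both constants are assumed placeholders.
LAUNCH_OVERHEAD_US = 5.0     # assumed fixed cost per kernel launch
COMPUTE_US_PER_SAMPLE = 0.5  # assumed compute time per op per sample


def step_latency_us(num_kernels: int, ops: int, batch_size: int) -> float:
    """Modeled latency of one inference step: launch cost plus compute cost."""
    launch = num_kernels * LAUNCH_OVERHEAD_US
    compute = ops * batch_size * COMPUTE_US_PER_SAMPLE
    return launch + compute


def launch_fraction(num_kernels: int, ops: int, batch_size: int) -> float:
    """Share of the modeled step latency that is pure launch overhead."""
    launch = num_kernels * LAUNCH_OVERHEAD_US
    return launch / step_latency_us(num_kernels, ops, batch_size)


if __name__ == "__main__":
    for batch in (1, 64):
        unfused = step_latency_us(num_kernels=8, ops=8, batch_size=batch)
        fused = step_latency_us(num_kernels=1, ops=8, batch_size=batch)
        print(batch, unfused, fused, round(launch_fraction(8, 8, batch), 3))
```

With these placeholder numbers, fusing 8 kernels into 1 takes the modeled step from 44 µs to 9 µs at batch size 1, but only from 296 µs to 261 µs at batch size 64. Real numbers will depend on the hardware and driver.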

Planned Benchmarks

  • Kernel launch count
  • CPU-side overhead
  • End-to-end latency impact
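
One possible way to collect the first two metrics is to wrap whatever function submits a kernel. This is a minimal sketch: `LaunchStats`, `instrument`, and `fake_launch` are hypothetical names, and `fake_launch` stands in for a real launch call. Because GPU launches are typically asynchronous, timing the call on the host captures the CPU-side submission cost only; end-to-end latency would need a separate measurement that waits for the device to finish.

```python
import time
from dataclasses import dataclass


@dataclass
class LaunchStats:
    count: int = 0    # number of launch calls observed
    host_ns: int = 0  # total host time spent inside launch calls


def instrument(launch_fn, stats: LaunchStats):
    """Wrap a launch function so each call is counted and timed on the host."""
    def wrapped(*args, **kwargs):
        start = time.perf_counter_ns()
        try:
            return launch_fn(*args, **kwargs)
        finally:
            stats.count += 1
            stats.host_ns += time.perf_counter_ns() - start
    return wrapped


def fake_launch(name: str) -> str:
    """Hypothetical stand-in for a real kernel launch call."""
    return name


if __name__ == "__main__":
    stats = LaunchStats()
    launch = instrument(fake_launch, stats)
    for kernel in ("layernorm", "matmul", "gelu"):
        launch(kernel)
    print(stats.count, stats.host_ns / stats.count, "ns per launch (host side)")
```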

Learning Objectives

  • Launch overhead sources
  • Persistent kernel design
  • Scheduling strategies
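
To make the persistent kernel idea concrete, here is a host-side analogy in plain Python. It is a sketch only: threads stand in for kernels, and all names are hypothetical. A persistent design starts one long-lived worker that pulls work from a queue, instead of paying one launch per task.

```python
import queue
import threading

_STOP = object()  # sentinel telling the persistent worker to exit


def _persistent_worker(tasks: queue.Queue, results: list) -> None:
    """Long-lived loop: keep pulling work until the stop sentinel arrives."""
    while True:
        item = tasks.get()
        if item is _STOP:
            break
        results.append(item * 2)  # placeholder for the real per-task work


def run_persistent(inputs):
    """Process all inputs with a single 'launch' of one persistent worker."""
    tasks, results = queue.Queue(), []
    worker = threading.Thread(target=_persistent_worker, args=(tasks, results))
    worker.start()  # the only launch
    for x in inputs:
        tasks.put(x)
    tasks.put(_STOP)
    worker.join()
    return results, 1


def run_per_task(inputs):
    """Baseline: one 'launch' (thread start) per task."""
    results, launches = [], 0
    for x in inputs:
        t = threading.Thread(target=lambda v=x: results.append(v * 2))
        t.start()
        t.join()
        launches += 1
    return results, launches


if __name__ == "__main__":
    print(run_persistent([1, 2, 3]))
    print(run_per_task([1, 2, 3]))
```

Both paths produce the same results; the difference is the launch count, which stays at 1 for the persistent version regardless of how many tasks are queued. On a real GPU the queue would live in device-visible memory, and the scheduling strategy decides how workers claim items from it.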
