Implement optimized rotary position embedding (RoPE) kernels for attention,
supporting both the prefill and decode paths.
Focus on minimizing compute overhead and improving data locality while
applying the rotary embeddings to the query and key tensors.
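As a starting point, a minimal sketch of an in-place decode-path kernel is shown below. It assumes float32 q/k in a contiguous [num_tokens, num_heads, head_dim] layout, the interleaved (x_2i, x_2i+1) pairing, and the same head count for q and k; the names rope_inplace_kernel and rope_inplace are placeholders, not an existing API.

```cuda
// Sketch: in-place RoPE for the decode path (one block per token).
// Assumes contiguous [num_tokens, num_heads, head_dim] q and k buffers.
#include <cuda_runtime.h>
#include <math.h>

__global__ void rope_inplace_kernel(float* __restrict__ q,
                                    float* __restrict__ k,
                                    const int* __restrict__ positions, // absolute position per token
                                    int num_heads,
                                    int head_dim,
                                    float theta_base /* typically 10000.0f */) {
    int token = blockIdx.x;          // one block per token
    int pos = positions[token];
    int pairs = head_dim / 2;

    // Each thread rotates one (even, odd) channel pair of one head.
    for (int idx = threadIdx.x; idx < num_heads * pairs; idx += blockDim.x) {
        int head = idx / pairs;
        int i    = idx % pairs;

        // theta_i = pos * base^(-2i / head_dim)
        float inv_freq = powf(theta_base, -2.0f * i / (float)head_dim);
        float angle = pos * inv_freq;
        float s, c;
        sincosf(angle, &s, &c);

        int base = (token * num_heads + head) * head_dim + 2 * i;

        float q0 = q[base], q1 = q[base + 1];
        q[base]     = q0 * c - q1 * s;
        q[base + 1] = q0 * s + q1 * c;

        float k0 = k[base], k1 = k[base + 1];
        k[base]     = k0 * c - k1 * s;
        k[base + 1] = k0 * s + k1 * c;
    }
}

// Hypothetical launch helper. One block per token keeps a token's pairs in
// registers; adjacent threads touch adjacent 8-byte chunks, so global
// accesses stay coalesced.
void rope_inplace(float* q, float* k, const int* positions,
                  int num_tokens, int num_heads, int head_dim,
                  float theta_base, cudaStream_t stream) {
    dim3 grid(num_tokens);
    dim3 block(256);
    rope_inplace_kernel<<<grid, block, 0, stream>>>(
        q, k, positions, num_heads, head_dim, theta_base);
}
```

An optimized version would likely precompute the inverse frequencies (or a cos/sin cache) instead of calling powf per element, and handle fp16/bf16 inputs; this sketch only fixes the data layout and parallelization pattern.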
Planned Benchmarks
- RoPE overhead relative to the attention kernel itself (see the timing sketch after this list)
- Prefill vs. decode performance
- Kernel fusion opportunities (e.g., applying RoPE as part of the attention QK load)
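A minimal timing harness for the overhead benchmark could look like the sketch below. It links against the rope_inplace helper from the kernel sketch above; the tensor sizes and iteration count are placeholders, and the attention baseline would be timed the same way for comparison.

```cuda
// Sketch: CUDA-event timing of the RoPE launcher (placeholder sizes).
#include <cuda_runtime.h>
#include <cstdio>

// Launcher defined in the RoPE kernel sketch above.
extern void rope_inplace(float* q, float* k, const int* positions,
                         int num_tokens, int num_heads, int head_dim,
                         float theta_base, cudaStream_t stream);

int main() {
    const int num_tokens = 4096, num_heads = 32, head_dim = 128;
    const size_t elems = (size_t)num_tokens * num_heads * head_dim;

    float *q, *k;
    int *positions;
    cudaMalloc(&q, elems * sizeof(float));
    cudaMalloc(&k, elems * sizeof(float));
    cudaMalloc(&positions, num_tokens * sizeof(int));
    cudaMemset(q, 0, elems * sizeof(float));
    cudaMemset(k, 0, elems * sizeof(float));
    cudaMemset(positions, 0, num_tokens * sizeof(int));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm up once, then time a batch of launches with CUDA events.
    rope_inplace(q, k, positions, num_tokens, num_heads, head_dim, 10000.0f, 0);
    const int iters = 100;
    cudaEventRecord(start);
    for (int i = 0; i < iters; ++i)
        rope_inplace(q, k, positions, num_tokens, num_heads, head_dim, 10000.0f, 0);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("RoPE: %.3f us per call\n", 1000.0f * ms / iters);
    // Divide by the attention kernel's time (measured identically) to get
    // RoPE overhead as a fraction of attention.

    cudaFree(q); cudaFree(k); cudaFree(positions);
    cudaEventDestroy(start); cudaEventDestroy(stop);
    return 0;
}
```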
Learning Objectives
- Positional encoding math (see the rotation formula after this list)
- Kernel fusion with attention
- Register and shared memory usage patterns
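For the math objective, the rotation the kernel applies can be written per channel pair as below (interleaved formulation from the original RoPE paper, with the conventional base of 10000; m is the token position and d the head dimension):

```latex
% RoPE applied to one (even, odd) channel pair of q or k at position m.
\[
\theta_i = 10000^{-2i/d}, \qquad
\begin{pmatrix} x'_{2i} \\ x'_{2i+1} \end{pmatrix}
=
\begin{pmatrix}
\cos(m\,\theta_i) & -\sin(m\,\theta_i) \\
\sin(m\,\theta_i) & \phantom{-}\cos(m\,\theta_i)
\end{pmatrix}
\begin{pmatrix} x_{2i} \\ x_{2i+1} \end{pmatrix}
\]
```

The key property is that after rotating both q and k this way, their dot product depends on positions only through the offset between them, which is what makes it natural to fuse the rotation directly into the attention kernel rather than adding a separate positional bias.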