Skip to content

More tasks (FFT, matrix multiply, sorting networks) #11

@SolidRegardless

Description

@SolidRegardless

Summary

Expand the benchmark task suite with three new GPU kernel implementations: Fast Fourier Transform (FFT), matrix multiplication, and sorting networks.

Motivation

The current five tasks demonstrate a range from bandwidth-bound to compute-bound workloads, but they're all element-wise operations with no inter-element dependencies. FFT, matrix multiply, and sorting networks introduce fundamentally different parallel patterns — butterfly networks, tiled shared-memory access, and compare-and-swap — that stress different aspects of GPU architecture and reveal new performance characteristics.

Acceptance Criteria

  • Implement FftTask — a radix-2 Cooley-Tukey FFT on a power-of-two input array
  • Implement MatrixMultiplyTask — dense matrix multiplication (e.g. 1024×1024 × 1024×1024)
  • Implement SortingNetworkTask — a bitonic sort or odd-even merge sort on the GPU
  • All three tasks implement IBenchmarkTask and integrate with the existing harness
  • Each task includes proper host↔device memory transfers (no shortcuts)
  • Results should be validated for correctness against a CPU reference implementation
  • Performance should be benchmarked and reported alongside existing tasks

Technical Notes

  • FFT requires shared memory and butterfly-pattern access — may need ILGPU shared memory primitives
  • Matrix multiply benefits hugely from tiled/blocked algorithms using shared memory
  • Sorting networks are interesting because they have a fixed comparison pattern, making them ideal for GPU parallelism despite sorting being traditionally sequential
  • Batch size semantics differ per task: for FFT it's the transform length, for matrix multiply it's the matrix dimension, for sorting it's the array length
  • Consider adding these under src/GpuStreamingDemo.Core/Tasks/ following existing patterns

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions