An HTTP server for benchmarking RGBA-to-grayscale image conversion using pure Go, CGO, and CGO with SIMD instructions. The project demonstrates what an image conversion pipeline entails when SIMD instructions are called through CGO.
The server contains 4 different conversion types for benchmarking conversion performance:
- NativeGO: a basic loop using standard library calls that allocate a `color.Color` value per pixel.
- NativeGONoAlloc: similar to `nativego`, but avoids the per-pixel allocation by modifying the data directly in the image buffer.
- CGOBasic: uses the `cgo` FFI to convert using C code. Pixel data is modified within the image buffer.
- CGOSIMD: uses the `cgo` FFI to call into C, where the conversion logic uses SIMD instructions to load and modify multiple pixels at a time on the CPU.
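The difference between the two pure-Go variants can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `grayAlloc`, `grayNoAlloc`, and `demoGray` are hypothetical, the separate `*image.Gray` destination is an assumption, and the integer luma weights are the ones used by the standard library's `color.GrayModel`. Alpha is ignored in the direct version.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// grayAlloc goes through the generic image API. src.At returns a color.Color
// interface value for every pixel, which can cost one heap allocation each.
func grayAlloc(src *image.RGBA) *image.Gray {
	b := src.Bounds()
	dst := image.NewGray(b)
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			dst.Set(x, y, src.At(x, y)) // Set converts through color.GrayModel
		}
	}
	return dst
}

// grayNoAlloc indexes the backing Pix slices directly, so no color.Color
// values are created. dst and src are assumed to have the same dimensions.
func grayNoAlloc(dst *image.Gray, src *image.RGBA) {
	b := src.Bounds()
	for y := 0; y < b.Dy(); y++ {
		s := src.Pix[y*src.Stride : y*src.Stride+4*b.Dx()]
		d := dst.Pix[y*dst.Stride : y*dst.Stride+b.Dx()]
		for x := range d {
			r, g, bl := uint32(s[4*x]), uint32(s[4*x+1]), uint32(s[4*x+2])
			d[x] = uint8((19595*r + 38470*g + 7471*bl + 1<<15) >> 16)
		}
	}
}

// demoGray converts a 3x1 image (white, black, red) with both variants.
func demoGray() (viaAPI, direct []uint8) {
	src := image.NewRGBA(image.Rect(0, 0, 3, 1))
	src.SetRGBA(0, 0, color.RGBA{R: 255, G: 255, B: 255, A: 255})
	src.SetRGBA(1, 0, color.RGBA{A: 255})
	src.SetRGBA(2, 0, color.RGBA{R: 255, A: 255})
	dst := image.NewGray(src.Bounds())
	grayNoAlloc(dst, src)
	return grayAlloc(src).Pix, dst.Pix
}

func main() {
	viaAPI, direct := demoGray()
	fmt.Println(viaAPI, direct)
}
```

The direct version trades the convenience of `At`/`Set` for explicit index arithmetic, which is also the memory layout a C function would see when the same buffer is handed over through cgo.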
NativeGONoAlloc, CGOBasic, and CGOSIMD run without copying or allocating when the image buffers are allocated before the call; see grayscale_test.go for the buffer allocations.
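One way to check the zero-allocation property with pre-allocated buffers is `testing.AllocsPerRun` from the standard library. The sketch below is an assumption about how such a check could look, not a copy of grayscale_test.go; `toGray` and `allocsPerConversion` are hypothetical names, and `toGray` assumes `rgba` holds at least four bytes per byte of `dst`.

```go
package main

import (
	"fmt"
	"testing"
)

// toGray writes one luma byte per RGBA pixel into dst without allocating.
func toGray(dst, rgba []byte) {
	for i := range dst {
		r, g, b := uint32(rgba[4*i]), uint32(rgba[4*i+1]), uint32(rgba[4*i+2])
		dst[i] = uint8((19595*r + 38470*g + 7471*b + 1<<15) >> 16)
	}
}

// allocsPerConversion allocates both buffers once, up front, and then
// reports the average number of heap allocations per toGray call.
func allocsPerConversion(w, h int) float64 {
	rgba := make([]byte, w*h*4)
	gray := make([]byte, w*h)
	return testing.AllocsPerRun(100, func() { toGray(gray, rgba) })
}

func main() {
	fmt.Println(allocsPerConversion(100, 100))
}
```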
The server currently allocates two buffers per request, one for the source image and one for the grayscale image. These per-request allocations could be eliminated by reusing buffers from a sync.Pool, which would cut down on GC pressure in high-throughput environments.
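A minimal sketch of the sync.Pool idea follows. The server does not currently do this; `bufPool`, `getBuf`, `putBuf`, and `withBuf` are hypothetical names, and the copy loop in `main` is only a placeholder for a real conversion call.

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool holds reusable byte slices. Pooling *[]byte instead of []byte
// avoids allocating a new slice header each time a buffer is put back.
var bufPool = sync.Pool{
	New: func() any { return new([]byte) },
}

// getBuf returns a buffer of length n, growing the pooled slice if needed.
// Contents are not zeroed, so callers must overwrite every byte they read.
func getBuf(n int) *[]byte {
	bp := bufPool.Get().(*[]byte)
	if cap(*bp) < n {
		*bp = make([]byte, n)
	}
	*bp = (*bp)[:n]
	return bp
}

// putBuf hands the buffer back to the pool for a later request.
func putBuf(bp *[]byte) { bufPool.Put(bp) }

// withBuf borrows an n-byte buffer for the duration of fn and returns it
// afterwards. fn must not keep a reference to the slice.
func withBuf(n int, fn func(buf []byte)) {
	bp := getBuf(n)
	defer putBuf(bp)
	fn(*bp)
}

func main() {
	rgba := []byte{255, 255, 255, 255, 0, 0, 0, 255} // two RGBA pixels
	withBuf(len(rgba)/4, func(gray []byte) {
		for i := range gray {
			gray[i] = rgba[4*i] // placeholder for the real conversion
		}
		fmt.Println(gray)
	})
}
```

Note that sync.Pool may drop pooled buffers at any garbage collection, so it reduces allocations under sustained load but does not guarantee reuse.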
Run the make target benchmark-all to get a baseline across the 4 different conversion types.
| Method | Image Size | Time/op | Memory | Allocs |
|---|---|---|---|---|
| NativeGO | 100×100 | 124.5µs | 49KB | 10,002 |
| NativeGO | 500×500 | 3.02ms | 1.2MB | 250,002 |
| NativeGO | 1000×1000 | 12.2ms | 4.9MB | 1,000,002 |
| NativeGO | 1920×1080 | 25.4ms | 10.1MB | 2,073,602 |
| NativeGO | 3840×2160 | 103.6ms | 40.5MB | 8,294,402 |
| NativeGO (No Alloc) | 100×100 | 9.2µs | 0B | 0 |
| NativeGO (No Alloc) | 500×500 | 231.2µs | 0B | 0 |
| NativeGO (No Alloc) | 1000×1000 | 918.5µs | 0B | 0 |
| NativeGO (No Alloc) | 1920×1080 | 1.89ms | 0B | 0 |
| NativeGO (No Alloc) | 3840×2160 | 7.49ms | 0B | 0 |
| CGO Basic | 100×100 | 2.2µs | 0B | 0 |
| CGO Basic | 500×500 | 56.2µs | 0B | 0 |
| CGO Basic | 1000×1000 | 215.0µs | 0B | 0 |
| CGO Basic | 1920×1080 | 431.0µs | 0B | 0 |
| CGO Basic | 3840×2160 | 1.71ms | 0B | 0 |
| CGO SIMD | 100×100 | 0.9µs | 0B | 0 |
| CGO SIMD | 500×500 | 17.7µs | 0B | 0 |
| CGO SIMD | 1000×1000 | 73.8µs | 0B | 0 |
| CGO SIMD | 1920×1080 | 144.8µs | 0B | 0 |
| CGO SIMD | 3840×2160 | 578.1µs | 0B | 0 |
Tested on Apple M3 Max (darwin/arm64)
Note: see grayscale.c for the implementation of the cgo calls.
Clone the repo and run make. After compilation, the binary is located in the build folder: ./build/img2gray.
% ./build/img2gray -h
Image to grayscale cgo testing server
Usage of ./build/img2gray:
-host string
server listening host (default "localhost")
-port string
server listening port (default "8080")
% ./build/img2gray
2026-01-14T13:05:26.140-0800 INFO img2gray/main.go:325 Server started {"address": "localhost:8080"}