PyFlameRT

High-performance inference runtime for Cerebras WSE (Wafer-Scale Engine).

PyFlameRT provides an ONNX Runtime-compatible API for deploying deep learning models on Cerebras hardware, with a CPU reference backend for development and testing.

Features

ONNX Runtime-compatible API - Easy migration from existing ONNX Runtime code
C++ Core with Python Bindings - High performance with convenient Python interface
CPU Reference Backend - Development and testing without Cerebras hardware
Extensible Architecture - Support for custom operators and backends

Installation

From Source

# Clone the repository
git clone https://github.com/pyflame/pyflame-rt.git
cd pyflame-rt

# Build with CMake
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --parallel

# Install Python package
pip install .

Requirements

CMake 3.18+
C++17 compiler (GCC 8+, Clang 8+, MSVC 2019+)
Python 3.9+
NumPy 1.21+

Quick Start

import numpy as np
import pyflame_rt

# Create session options
options = pyflame_rt.SessionOptions()
options.device = "cpu"
options.num_threads = 4

# Load model
session = pyflame_rt.InferenceSession("model.pfm", options)

# Get input/output info
inputs = session.get_inputs()
outputs = session.get_outputs()
print(f"Input: {inputs[0].name}, shape: {inputs[0].shape}")
print(f"Output: {outputs[0].name}, shape: {outputs[0].shape}")

# Run inference
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
results = session.run(None, {"input": input_data})

print(f"Output shape: {results[0].shape}")

Project Structure

PyFlameRT/
├── include/pyflame_rt/    # Public C++ headers
├── src/                   # C++ implementation
│   ├── backends/cpu/      # CPU reference backend
│   └── io/                # Model loading
├── bindings/              # pybind11 Python bindings
├── python/pyflame_rt/     # Python package
├── tests/                 # C++ and Python tests
└── docs/                  # Documentation

Development

Building

# Configure with tests
cmake -B build -DPYFLAME_RT_BUILD_TESTS=ON

# Build
cmake --build build --parallel

# Run C++ tests
cd build && ctest --output-on-failure

# Run Python tests
pytest tests/python/

Supported Operators

PyFlameRT supports common neural network operators including:

Math: Add, Sub, Mul, Div, MatMul, Gemm, Sqrt, Exp, Log, Pow
Activation: Relu, Sigmoid, Tanh, Softmax, LeakyRelu, Gelu, SELU
Tensor: Reshape, Transpose, Concat, Squeeze, Unsqueeze, Flatten
NN: Conv, MaxPool, AveragePool, BatchNormalization, LayerNormalization, Dropout
Reduction: ReduceSum, ReduceMean, ReduceMax, ReduceMin, ArgMax

License

MIT License - see LICENSE for details.

Related Projects

PyFlame - Core tensor operations and IR graph
PyFlameVision - Computer vision models and utilities

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
bindings		bindings
deploy		deploy
docs		docs
include/pyflame_rt		include/pyflame_rt
python/pyflame_rt		python/pyflame_rt
src		src
tests		tests
third_party		third_party
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
CMakeLists.txt		CMakeLists.txt
Project_Overview.md		Project_Overview.md
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PyFlameRT

Features

Installation

From Source

Requirements

Quick Start

Project Structure

Development

Building

Supported Operators

License

Related Projects

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

CTO92/PyFlameRT

Folders and files

Latest commit

History

Repository files navigation

PyFlameRT

Features

Installation

From Source

Requirements

Quick Start

Project Structure

Development

Building

Supported Operators

License

Related Projects

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages