Skip to content

damienos61/SynapSwap

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SynapSwap Logo

...

SynapSwap

Predictive VRAM Virtualization Engine for Large-Scale AI Inference

Break the VRAM wall. Run massive AI models on consumer hardware.

SynapSwap is a low-level memory virtualization engine designed to push beyond the physical limits of GPU VRAM. It enables the execution of massive AI models (LLMs, Vision, Diffusion) on consumer-grade hardware by transforming system RAM into an intelligent extension of VRAM, using predictive prefetching and fully asynchronous, non-blocking memory swapping.

Unlike traditional paging-based solutions, SynapSwap anticipates memory needs instead of reacting to memory pressure.


The Problem

Today, the primary bottleneck in AI systems is no longer compute — it is video memory.

  • Consumer GPUs typically ship with 8–24 GB of VRAM
  • Modern AI models routinely exceed 40, 80, or even 100+ GB
  • The result: OOM (Out Of Memory) errors, crashes, or the need for prohibitively expensive hardware

Existing mechanisms (Unified Memory, driver-level paging) are reactive, expensive, and unpredictable.


The Solution: SynapSwap

SynapSwap introduces proactive VRAM virtualization driven by awareness of the model execution graph.

VRAM becomes an intelligent cache, not a hard limit.


Key Features

Overlapping Computation & Transfer

Hides up to 90% of PCIe latency by loading layer N+1 while layer N is executing.

Graph-Aware Memory Scheduling

Leverages execution graph dependencies to predict future memory requirements.

Adaptive Prefetching (EMA Telemetry)

Dynamically tunes prefetch aggressiveness using Exponential Moving Averages.

Asynchronous Transfer Engine

Dedicated engine for fully non-blocking memory transfers (async memcpy).

Smart VRAM Eviction (LRU+)

Intelligent VRAM cleanup to prevent fragmentation and execution stalls.

Cross-Platform

Runs on Linux and Windows (MinGW).


Internal Architecture

SynapSwap is built around three core components:

1 Scheduler

Analyzes declared dependencies and decides what to load, when to load it, and why.

2 Transfer Engine

Dedicated thread responsible for asynchronous memory transfers without blocking the inference pipeline.

3 Eviction Manager

Advanced LRU-based algorithm that keeps VRAM clean, coherent, and performant.


Installation & Build

Prerequisites

  • Compiler: GCC ≥ 4.8 or MinGW-w64
  • OS: Linux / Windows
  • Libraries: pthreads (included by default)

Build Instructions

git clone https://github.com/your-username/synapswap.git
cd synapswap
make clean
make -j$(nproc)

API Integration (C / C++)

The API is designed to be hook-ready and easy to inject into existing inference engines.

#include "synapswap.h"

// 1. Initialization (physical VRAM limit: 2 GB)
synapswap_init(2048ULL * 1024 * 1024, true);

// 2. Allocate a virtualized memory block
void* layer_1 = synapswap_malloc(
    512 * 1024 * 1024,
    10,
    SS_POLICY_AUTO,
    "Transformer_Block_1"
);

// 3. Declare execution graph dependencies
synapswap_register_dependency(0, layer_1, 1);

// 4. Inference loop
synapswap_precompute_hint(0);
synapswap_wait_for_data(layer_1);

//  GPU kernel invocation (CUDA / OpenCL / Vulkan)

synapswap_shutdown();

Monitoring & Telemetry

SynapSwap includes a real-time ANSI dashboard:

[SynapSwap Dashboard]
├─ VRAM Usage : [||||||||||          ] 50.0% (1024 / 2048 MB)
├─ Hit Rate   : 98.2%   ← prediction accuracy
├─ Efficiency : OPTIMAL
└─ Stall Time : 1.25 ms

Roadmap

Planned features:

  • Native CUDA backend (cudaMemcpyAsync)
  • PyTorch wrapper (ctypes / pybind11)
  • Multi-GPU support
  • Weight compression in system RAM
  • Vulkan / ROCm integration

Contributing

Contributions are highly encouraged.

  1. Fork the project
  2. Create a feature branch:
git checkout -b feature/AmazingFeature
  1. Commit your changes:
git commit -m "Add AmazingFeature"
  1. Push and open a Pull Request

License

Distributed under the MIT License. See the LICENSE file for more information.

Academic citation is welcome if used in research.


Author

Developed with by DamienOS

Optimizing AI inference, one byte at a time.

⭐ If this project taught you something, consider leaving a star ⭐.

It helps SynapSwap reach more developers and researchers.