
OpenAI API-compatible server for Petals distributed inference 👋


🎯 PROJECT STATUS: MISSION ACCOMPLISHED

OpenAI-Petal (v0.6.3) has successfully completed its mission as a technical sandbox for distributed AI inference.

✅ Achievements

This project proved the viability of distributed inference and established production-ready infrastructure:

  • Production deployments: ~100 active nodes across developer/enthusiast community
  • Cross-platform installers: Working Linux and macOS installers with GPU support (NVIDIA, AMD, Intel, Apple Silicon)
  • Robust daemon management: 99%+ reliability with health monitoring, auto-calibration, and auto-update
  • Real-world testing: Discovered and solved zombie states, network issues, GPU compatibility problems
  • Beautiful UX: Professional CLI with visual feedback, contextual help, and smart defaults
  • Comprehensive documentation: 104 Python files, extensive guides, and operational knowledge
  • Open source contribution: CC-BY-4.0 licensed, community-driven development

The concepts are proven. The technology works. Mission accomplished. 🎉

⚠️ Architectural Limitations (Why We're Moving On)

While successful as a prototype, the Python/Petals architecture faces fundamental constraints that prevent mass adoption:

  • 🔒 Security: 8 CVEs in transformers dependency (cannot fix without breaking Petals compatibility)
  • ⏱️ Onboarding time: 30-45 minute setup (model download bottleneck kills viral growth)
  • 🚫 Browser/Mobile: Architecturally impossible with Python runtime requirement
  • 💻 Windows support: Installer broken, not viable to fix in current architecture
  • 📦 Dependency hell: Every update risks breaking (transformers, triton, bitsandbytes conflicts)
  • 📈 Scale ceiling: ~10K maximum technical users (Python/Docker adoption barrier)

These aren't bugs to fix—they're architectural constraints that require a ground-up rewrite.

🚀 The Future: KwaaiNet

Development efforts are focused on KwaaiNet - a ground-up rewrite in Rust for 1B+ users:

  • Instant onboarding: <10 seconds to network visibility (vs 30-45 minutes)
  • Browser/mobile-first: WASM + WebGPU, runs everywhere
  • Modern security: No Python dependency trap, memory-safe Rust
  • Mass adoption: Targeting app stores, one-click installation

Timeline:

  • Q1 2026: Browser extension launch
  • Q2-Q3 2026: Mobile apps (iOS/Android)
  • Q4 2026: OpenAI-Petal deprecation notice
  • Q2 2027: Final shutdown (estimated)

Current users: Your nodes will continue working through Q2 2027. Migration tools will be provided.

📚 Read more: docs/LEGACY_STATUS.md


Overview

OpenAI-Petal is an OpenAI API-compatible server that bridges to the Petals distributed inference network. It enables you to run large language models through Petals' distributed network while maintaining full compatibility with OpenAI's API format, making it easy to integrate into existing applications.

🚀 Recent Updates

🎯 v0.6.3 - Linux Installer Curl|Bash Fix (Latest)

  • Fixed curl | bash execution: Resolved installer failure when piped from curl
  • PATH preservation: Fixed environment variable handling in piped execution context
  • Improved reliability: Enhanced error handling for remote installation
  • Production tested: Verified on multiple Linux distributions

🎯 v0.6.2 - Health Monitor Infrastructure Improvements

  • Infrastructure failure handling: Graceful degradation when map.kwaai.ai is unreachable
  • Enhanced error reporting: Better diagnostics for network and API failures
  • Stability improvements: Reduced false positives in health monitoring
  • Connection resilience: Improved handling of transient network issues

🎯 v0.6.1 - Version Management Fix

  • Dynamic version detection: Fixed VERSION file reading across all platforms
  • Display improvements: Consistent version reporting in CLI commands
  • Auto-update compatibility: Enhanced version comparison for update checks
  • Build system fixes: Resolved version synchronization issues

🎯 v0.6.0 - Health Monitoring Refactor & Strategy Pattern

  • Modular Architecture: Refactored health monitoring using Strategy pattern for extensibility
  • Abstract Base Classes: Plugin architecture supports multiple service types (KwaaiNet nodes, MapAPI, Bootstrap DHT)
  • Enhanced Thread Safety: Comprehensive concurrency fixes eliminate race conditions
  • 50 Passing Tests: Full test coverage across concurrency, architecture, and integration
  • 100% Backward Compatible: Existing code works without modification
  • Production Tested: Node running stable with 99%+ health monitoring success rate

Architecture Improvements:

  • KwaaiNetHealthCheck Strategy: 6-step comprehensive health checking (API reachability, data freshness, bootstrap health, node visibility, state, throughput)
  • ExponentialBackoffStrategy: AWS best practice with full jitter (30s → 1800s max delay)
  • HealthMonitorOrchestrator: Coordinates strategies while preserving thread safety
  • Future-Ready: Easy to add new service types via plugins

🎯 v0.5.2 - Triton Compatibility Fix

  • Runtime Crash Fixed: Resolved ModuleNotFoundError: No module named 'triton.ops'
  • Dependency Pinning: Added triton<3.0 constraint (bitsandbytes 0.41.1 requires triton.ops)
  • Wider Compatibility: Updated transformers to <4.45.0, tokenizers to <0.20.0 for petals 2.3.0.dev2
  • Stability Verified: Node tested stable for 3.7+ hours (previously crashed at 26 seconds)

🎯 v0.4.7 - Auto-Calibration & Smart Block Allocation

  • Auto-Calibration on Startup: Automatically determines optimal block count based on available hardware
  • Smart Default Behavior: New nodes automatically calibrate on first start (no manual tuning needed)
  • Hardware-Aware: Detects GPU type (CUDA/ROCm/CPU), memory, and CPU cores
  • Intelligent Recommendations: Suggests min/recommended/max block counts with safety margins
  • Calibration Caching: Stores profiles to avoid re-calibration on every start
  • Production-Ready: Tested on Linux server (16 blocks recommended vs 1 block default)

🎯 v0.4.6 - Linux Process Cleanup & Production Stability

  • Process Cleanup on Linux: Ported automatic cleanup feature from macOS to Linux installer
  • Zombie Prevention: Prevents defunct processes from accumulating during restarts
  • Auto-Cleanup by Default: Automatically stops existing instances before starting new ones
  • --concurrent Flag: Optional flag to allow multiple instances for testing/development
  • Reboot-Tested: Verified autostart reliability on production server (systemd user services)
  • Feature Parity: 76% complete (13/17 features), 100% high-priority features done

🎯 v0.4.5 - Linux Auto-Update & Reconnect

  • Auto-Update on Linux: kwaainet update command ported from macOS
  • Reconnect Command: kwaainet reconnect forces P2P network reconnection
  • Version Management: Dynamic VERSION file reading (no more hardcoded versions)
  • GitHub Integration: Auto-detects installation method (git/installer/pip) and updates accordingly
  • Configuration Backup: Automatic backup before updates with rollback capability

🎯 v0.4.3 - Concurrent Instance Prevention & MPS Compatibility

  • Smart Instance Management: kwaainet start now automatically stops existing instances by default
  • --concurrent Flag: New optional flag to allow multiple instances when needed
  • Duplicate Prevention: Eliminates accidental duplicate nodes on network (common with launchd + manual starts)
  • MPS Patch Fix: Fixed torch.mps.current_device compatibility issues on macOS with PyTorch 2.8+
  • Clean Process Management: Automatically cleans up orphaned petals/p2pd processes

🔧 v0.4.2 - macOS Auto-Start Status Detection Fix

  • Service Detection: Status command now correctly detects launchd-managed processes
  • Auto-Start Support: Full compatibility with macOS auto-start service (no PID file needed)
  • Enhanced Monitoring: Process detection via command line for service-managed instances
  • Improved Cleanup: Better process management during service restarts
  • Status Indicators: Includes service_managed flag for service-started processes

🔄 v0.4.1 - Auto-Update System

  • Version Detection: Automatic checks for new releases via GitHub API and VERSION file
  • Smart Caching: 1-hour cache to avoid rate limits, with force refresh option
  • Update Notifications: Non-intrusive notifications in status display
  • Interactive Updates: Guided update process with confirmation prompts
  • Config Backup: Automatic configuration backup before updates
  • Multi-Method Detection: Supports git, installer, and pip installation methods

📈 v0.4.0 - P2P Network Monitoring & Reconnection

  • Connection Monitoring: 24-hour time-series tracking of P2P network health
  • Smart Alerting: Webhook notifications for prolonged disconnections (configurable thresholds)
  • Manual Reconnect: kwaainet reconnect command for network refresh without full restart
  • Statistics Dashboard: kwaainet monitor stats shows uptime, disconnections, and network health
  • Development Workflow: Installer now uses local source in git repositories (no more GitHub cloning)
  • Alert Configuration: Customize webhook URLs, thresholds, and minimum connection counts

🔧 v0.3.1 - Enhanced Reliability & User Experience

  • Shell Script Quality: Fixed all critical shellcheck issues for better installer robustness
  • No-Build-Tools Default: Pre-built wheels only by default (saves ~5GB disk space)
  • Improved Verification Messages: Less alarming warning symbols (⚠️ vs ❌) for better UX
  • Better Error Handling: Enhanced CUDA detection and package management reliability
  • Code Robustness: Safer array handling, proper variable declarations, performance optimizations
  • Maintained Functionality: All features preserved while improving underlying quality

🚀 Daemon Mode Complete

  • Stable daemon operation with PID tracking and process supervision
  • Full daemon management: start, stop, restart, status, logs commands
  • Cross-platform compatibility (macOS, Linux)
  • Network connectivity fixes with working KwaaiNet bootstrap peers
  • Beautiful CLI interface with enhanced visual design and Unicode borders

🛠️ Installer Improvements

  • ⚠️ Windows installer temporarily unavailable (being rewritten)
  • Linux installer v0.3.1 with enhanced reliability, shell script quality fixes, and no-build-tools default
  • macOS installer with development mode installation to prevent version conflicts
  • Enhanced user experience with improved verification messages and reduced alarm
  • Better resource efficiency with ~5GB space savings using pre-built wheels by default
  • Automatic setup integration in all installers

🎨 User Experience Enhancements

  • Beautified CLI output with elegant borders, emojis, and visual hierarchy
  • Enhanced status display with contextual icons and smart uptime formatting
  • Professional help system with organized examples and clear documentation
  • Improved error handling and user feedback across all components

📊 Current Status

All core features are complete and stable:

  • Cross-platform installers working on Linux, macOS (Windows under development)
  • Daemon mode with full management capabilities
  • Network connectivity to KwaaiNet distributed inference network
  • Beautiful CLI interface with professional visual design

⚡ Key Features

  • 🔌 OpenAI API Compatibility: Drop-in replacement supporting standard endpoints
  • 🌐 Petals Integration: Leverages distributed inference for efficient model serving
  • 🛠️ Advanced Tool Calling: Function calling with model-specific formatting (Hermes, Llama 3, Mistral, etc.)
  • 💻 Cross-Platform: Linux and macOS support with automatic GPU detection (NVIDIA, AMD, Intel, Apple Silicon). Windows support under development.
  • ⚡ High Performance: FastAPI backend with streaming support and smart token processing
  • 📦 Easy Setup: One-step installers with enhanced reliability, shell script quality, and efficient pre-built wheel installation (saves ~5GB)
  • 🤖 Stable Daemon Mode: Background operation with PID tracking, process supervision, and automatic restart
  • 🎨 Beautiful CLI Interface: Professional visual design with Unicode borders, contextual icons, and enhanced UX
  • 🔧 Comprehensive Management: Full daemon control with start, stop, restart, status, logs commands
  • 📊 Smart Status Monitoring: Real-time process metrics with CPU, memory, uptime, and connection tracking
  • 📈 P2P Network Monitoring: 24-hour connection history with statistics, alerts, and webhook notifications
  • 🔄 Manual Reconnection: Force P2P network refresh without daemon restart
  • 🔄 Auto-Update System: Automatic version checking with guided updates and configuration backup

🏗️ Architecture

  • FastAPI Backend: High-performance async web server with CORS support
  • Model Management: Automatic model loading/unloading with graceful shutdown
  • Streaming Support: Real-time response streaming for both completions and chat
  • Token Processing: Smart special token handling and cleanup with configurable stop sequences
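
The stop-sequence handling mentioned above can be illustrated with a simplified sketch (illustrative only, not the server's actual implementation):

# Toy illustration of stop-sequence handling: emit text until any
# configured stop sequence appears, then truncate at the earliest match.
def apply_stop_sequences(text, stops):
    """Return (possibly truncated text, True if a stop sequence was hit)."""
    cut = min((text.find(s) for s in stops if s in text), default=-1)
    return (text[:cut], True) if cut >= 0 else (text, False)

print(apply_stop_sequences("Hello world\nUser:", ["\nUser:", "</s>"]))
# -> ('Hello world', True)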

📡 API Endpoints

  • /v1/models - List available models
  • /v1/completions - Text completion endpoint
  • /v1/chat/completions - Chat completion endpoint with tool calling support
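
For example, here is a minimal sketch that calls the chat endpoint with the official OpenAI Python client. It assumes the API server is listening on http://localhost:8000 (as in the container setup below) and that no API key is enforced; the model name is the project default:

from openai import OpenAI

# Point the standard OpenAI client at the local KwaaiNet API server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="unsloth/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)

The same client covers /v1/completions via client.completions.create(...).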

The best way to support the project is to give us a ⭐ on GitHub, join the Kwaai community, and connect with us on Slack!

💾 Installation and Setup

The steps below set up the environment for this project. The installation runs with or without a GPU. If you are running a private swarm node, you may need GPU support to share the load with community inference servers. This project needs some resources for the tokenizer portion of inference and will run on CPU-only or GPU-equipped machines.

Note: The default setup and run process provided here will allow you to connect to Petals' public swarm. Data you send will be public. Please be aware!

Installation Options

Choose the installation method that best fits your environment:

  1. One-Step Installation (Recommended) - Automated installers for each platform
  2. Container-Based Installation - Docker/Podman with pre-built images
  3. Manual Installation - Direct pip installation for advanced users

One-Step Installation (Recommended)

For a complete one-step installation that handles Python, dependencies, and environment setup:

Windows

⚠️ Windows installer temporarily unavailable due to critical issues.

Alternative options:

  1. Use WSL2 with Linux installer (Recommended):
# Install WSL2 Ubuntu first, then:
curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/linux/linuxinstaller.sh | bash
  2. Manual installation for advanced users:
git clone https://github.com/Kwaai-AI-Lab/OpenAI-Petal.git
cd OpenAI-Petal\Installer\windows
pip install -e .

Linux

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/linux/linuxinstaller.sh)"

v0.3.1 Features:

  • Default no-build-tools (saves ~5GB disk space by using pre-built wheels)
  • Enhanced reliability with improved shell script quality
  • Better user experience with less alarming verification messages

The Linux installer supports additional options:

# Skip system package installation (if you already have dependencies)
curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/linux/linuxinstaller.sh | bash -s -- --no-system-packages

# Enable build tools for source compilation (if needed)
curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/linux/linuxinstaller.sh | bash -s -- --with-build-tools

# Force specific Python environment
curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/linux/linuxinstaller.sh | bash -s -- --force-venv
curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/linux/linuxinstaller.sh | bash -s -- --force-conda

# Show help
curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/linux/linuxinstaller.sh | bash -s -- --help

macOS

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/macOS/macinstaller.sh)"

This will:

  • Install Python and required tools
  • Set up the conda environment efficiently using pre-built wheels
  • Install KwaaiNet with enhanced reliability
  • Create a launcher for easy usage
  • Detect and configure GPU support (NVIDIA, AMD, Intel, Apple Silicon)
  • Save ~5GB disk space with optimized installation

Container-Based Installation (Docker/Podman)

For containerized deployment, you can use Docker or Podman with the provided compose configuration:

Using Docker

# Download the compose file and .env.example
curl -O https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/docker/compose.yml
curl -O https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/docker/.env.example

# Configure your deployment
cp .env.example .env
# Edit .env to set PUBLIC_IP and PUBLIC_NAME

# Start the services
docker-compose up -d

# Check status
docker-compose ps

Using Podman (Rootless - Recommended)

# Download the compose file and .env.example
curl -O https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/docker/compose.yml
curl -O https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/docker/.env.example

# Configure your deployment
cp .env.example .env
# Edit .env to set PUBLIC_IP and PUBLIC_NAME

# Start the services (no sudo required)
podman-compose up -d

# Check status
podman-compose ps

Configuration (.env file)

Create a .env file with your server's configuration:

# Required for network map visibility
PUBLIC_IP=your.public.ip.address

# Identifies your node (note: _docker suffix distinguishes from bare-metal)
PUBLIC_NAME=yourname_docker@kwaai

# Optional: Number of model blocks (adjust based on VRAM)
# KWAAINET_BLOCKS=4

Container Services:

  • kwaainet-node: Distributed inference node (port 8080)
    • Connects to KwaaiNet bootstrap peers
    • Serves 4 blocks of Llama-3.1-8B-Instruct (configurable)
    • Public name uses _docker suffix to distinguish from bare-metal nodes
    • GPU-accelerated (NVIDIA support included)
  • kwaainet-api: OpenAI-compatible API server (port 8000)
    • Fully compatible with OpenAI API format
    • Supports completions and chat endpoints

Container Advantages:

  • No system dependencies - everything runs in containers
  • Rootless operation with Podman (enhanced security)
  • Easy cleanup and management
  • GPU support included for NVIDIA devices
  • Automatic restarts after reboot with unless-stopped policy
  • Network map visibility with PUBLIC_IP announcement
  • Flexible deployment - node-only, API-only, or combined configurations

Auto-Restart After Reboot (Podman)

For existing Podman deployments, enable auto-restart:

# Download and run the fix script
curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/docker/fix-restart.sh | bash

Quick Start with Containers:

# Download compose file
curl -O https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/docker/compose.yml

# Configure (set your PUBLIC_IP and PUBLIC_NAME)
echo "PUBLIC_IP=your.public.ip" > .env
echo "PUBLIC_NAME=yourname_docker@kwaai" >> .env

# Start services
podman-compose up -d

# Verify services
curl http://localhost:8000/v1/models     # API endpoint
curl http://localhost:8080/health        # Node health check

Testing Container Installation:

# Download and run the comprehensive test script
curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/docker/test_container_installation.sh | bash

# Or quick test for running containers
curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/docker/test_container_quick.sh | bash

The test script automatically:

  • Detects Docker or Podman
  • Downloads and validates the compose file
  • Starts containers and validates all services
  • Tests API endpoints and network connectivity
  • Checks GPU access and performance
  • Provides detailed logging and results

Manual Installation

If you prefer to handle the environment yourself, you can install directly:

Windows

⚠️ Manual installation only (installer temporarily unavailable):

git clone https://github.com/Kwaai-AI-Lab/OpenAI-Petal.git
cd OpenAI-Petal\Installer\windows
pip install -e .

Linux

pip install -e ./Installer/linux/

macOS

pip install -e ./Installer/macOS/

⚠️ Make sure you are using Python 3.8+ and that pip is from the correct environment (virtualenv, conda, or system Python).

Windows Requirements: Windows 10+ (64-bit), PowerShell 5.1+

🧪 Testing Your Installation

After installation, you can validate that everything is working correctly with our comprehensive test scripts:

Windows Test Script

# Download and run the Windows test script
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/windows/test_kwaainet_installer.ps1" -OutFile "test.ps1"; powershell.exe -ExecutionPolicy Bypass -File "test.ps1"

Test Features:

  • Dependency validation - Verifies all packages installed correctly
  • Tokenizers build prevention - Confirms pre-built wheels were used
  • Version alignment - Checks transformers/huggingface_hub versions
  • Enhanced error reporting - Validates detailed error capture
  • Daemon functionality - Tests start/stop/status commands
  • Installation timing - Measures installation performance

Test Options:

# Skip cleanup (test existing installation)
.\test.ps1 -SkipCleanup

# Quiet mode (minimal output)
.\test.ps1 -Quiet

Linux Test Script

# Download and run the Linux test script
curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/linux/test_kwaainet_installer.sh | bash

Test Features:

  • Automatic compatibility patches - Verifies that all 4 patches apply correctly
  • Complete system cleanup - Uses official uninstaller for clean state
  • Fresh installation validation - Comprehensive installation testing
  • Daemon stability testing - 20-second daemon monitoring
  • Network connectivity - Validates connection to KwaaiNet network
  • Management commands - Tests all daemon control functions

Expected Output:

🎉 ALL TESTS PASSED!
✅ All 4 compatibility patches applied automatically
✅ Daemon functionality working correctly
✅ Management commands working correctly
✅ NO MANUAL PATCHES REQUIRED

The KwaaiNet Linux installer is PRODUCTION READY! 🚀

Container Installation Test

# Test Docker/Podman container installation
curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/docker/test_container_installation.sh | bash

Why Test Your Installation:

  • Validate dependency fixes - Ensures tokenizers build failures are prevented
  • Confirm version compatibility - Verifies all packages work together
  • Test daemon stability - Validates background operation works correctly
  • Network connectivity - Confirms connection to distributed network
  • Performance benchmarking - Measures installation and startup times

🗑️ Uninstallation

To completely remove KwaaiNet and its environment:

Windows

Manual uninstallation (uninstaller temporarily unavailable):

# Remove KwaaiNet package
pip uninstall kwaainet

# Remove installation directory
Remove-Item -Recurse -Force "$env:USERPROFILE\.kwaainet"

Linux

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/linux/linuxuninstaller.sh)"

macOS

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/macOS/macuninstaller.sh)"

🚀 Usage

Beautiful CLI Interface

KwaaiNet features a professional, visually appealing CLI with elegant borders and contextual icons:

# Get help with beautiful formatting
kwaainet --help

# Start daemon mode
kwaainet start --daemon

# Check status with visual indicators
kwaainet status

Status Output Example:

╭─────────────────────────────────────────────────────────────────────╮
│                      📊 KwaaiNet Daemon Status                       │
╰─────────────────────────────────────────────────────────────────────╯

  🟢 Status: Running (PID: 12345)
  ⏰ Uptime: 2.3 hours
  🖥️  CPU: 15.2%
  💾 Memory: 8.5% (1024.0 MB)
  🔗 Connections: 12
  🧵 Threads: 29
─────────────────────────────────────────────────────────────────────

Daemon Management

KwaaiNet runs as a stable background daemon with full process management:

# Start in daemon mode (background)
kwaainet start --daemon

# Check daemon status  
kwaainet status

# View logs with beautiful formatting
kwaainet logs --lines 50

# Restart daemon
kwaainet restart

# Stop daemon
kwaainet stop

Health Monitoring & Auto-Reconnection

Monitor node health and automatically recover from network disconnections:

# Check health status
kwaainet health-status

# Enable/disable health monitoring
kwaainet health-enable
kwaainet health-disable

Health Monitoring Features (v0.6.0):

  • Network-Aware Detection: Monitors map.kwaai.ai API for authoritative node state
  • 4-State Health Model: HEALTHY/DEGRADED/UNHEALTHY/CRITICAL states
  • Automatic Reconnection: Triggers after 3 consecutive failures (configurable)
  • Exponential Backoff: AWS best practice with full jitter (30s → 1800s max; see the sketch after this list)
  • Zombie State Detection: Prevents "process running but invisible on network" scenarios
  • Strategy Pattern Architecture: Extensible for monitoring multiple service types
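
The exponential backoff strategy above can be sketched as follows (illustrative only; the base and cap match the documented 30s → 1800s range):

import random

# Full-jitter backoff: the delay cap doubles each attempt
# (30s, 60s, 120s, ...) up to 1800s, and the actual sleep is
# drawn uniformly from [0, cap].
def backoff_delay(attempt, base=30.0, cap=1800.0):
    return random.uniform(0, min(cap, base * (2 ** attempt)))

for attempt in range(8):
    print(f"attempt {attempt}: sleep {backoff_delay(attempt):.1f}s")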

📚 Technical Deep Dive: See docs/NETWORK_VISIBILITY_ARCHITECTURE.md for a comprehensive explanation of:

  • How nodes appear on map.kwaai.ai (DHT announcement mechanism)
  • The 6-step startup sequence with precise timing
  • Why TCP connections ≠ network visibility (zombie state analysis)
  • Health monitoring strategy and troubleshooting guide

Example Status Output:

📊 Health Monitoring Status
Enabled: True
Running: True
Check interval: 60s
Failure threshold: 3

Last Check: 2025-11-10T19:35:08
Status: healthy

Metrics:
  Total checks: 221
  Healthy: 219 (99.1%)
  Degraded: 0
  Unhealthy: 2
  Critical: 0
  Reconnections triggered: 0

Configuration Management

View and modify configuration with a clean interface:

# View current configuration
kwaainet config --view

# Set configuration values
kwaainet config --set model "meta-llama/Llama-2-7b-hf"
kwaainet config --set blocks 4

# Configure health monitoring
kwaainet config --set health_monitoring.check_interval 60
kwaainet config --set health_monitoring.failure_threshold 3

P2P Network Monitoring & Reconnection

Monitor network health and manage connections without restarting:

# Force P2P network reconnection (no restart needed)
kwaainet reconnect

# View connection statistics (last 60 minutes)
kwaainet monitor stats

# Configure alerts for disconnections
kwaainet monitor alert --enable
kwaainet monitor alert --threshold 10              # Alert after 10 min disconnect
kwaainet monitor alert --webhook "https://your-webhook.com/alert"
kwaainet monitor alert --min-connections 2         # Alert if connections < 2

# View current alert configuration
kwaainet monitor alert

Monitoring Features:

  • 24-hour history: Tracks connections, threads, CPU, memory every 60 seconds
  • Disconnection detection: Identifies periods of network isolation
  • Webhook alerts: POST JSON notifications to configured endpoints (see the receiver sketch after this list)
  • Cooldown protection: 1-hour cooldown prevents alert spam
  • Persistent storage: History saved to ~/.kwaainet/monitoring/
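
As a minimal sketch of consuming these webhook alerts, the following stand-alone Python server logs whatever JSON is POSTed to it (the exact alert payload schema is not documented here, so the handler treats it as opaque):

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class AlertHandler(BaseHTTPRequestHandler):
    # Log any JSON POST body; no assumptions about the alert's fields.
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length) if length else b"{}"
        print("KwaaiNet alert:", json.dumps(json.loads(body), indent=2))
        self.send_response(200)
        self.end_headers()

HTTPServer(("0.0.0.0", 9000), AlertHandler).serve_forever()

Point the monitor at it with kwaainet monitor alert --webhook "http://localhost:9000".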

Example Statistics Output:

╭─────────────────────────────────────────────────────────────────────╮
│                  📈 P2P Connection Statistics                        │
╰─────────────────────────────────────────────────────────────────────╯

  📊 Samples: 60 (last 60 minutes)
  🔗 Current Connections: 12
  📈 Average Connections: 10.5
  📉 Min/Max: 8 / 15
  ⏱️  Uptime: 98.3%

  ⚠️  Disconnection Periods:
     • 2.5 minutes (ended 2025-01-03T14:30:00)

Auto-Update System

Keep KwaaiNet up-to-date with automatic version checking and guided updates:

# Check for available updates
kwaainet update --check

# Install the latest version (with confirmation)
kwaainet update

# Force update check (bypass 1-hour cache)
kwaainet update --force

Update Features:

  • Automatic detection: Checks GitHub for new releases
  • Smart caching: 1-hour cache prevents rate limiting
  • Status notifications: Non-intrusive update alerts in kwaainet status
  • Configuration backup: Automatic backup before updates
  • User control: Requires confirmation before installing
  • Daemon management: Automatically stops/restarts daemon during updates

Update Process:

  1. Check status: kwaainet status shows if update available
  2. Review changes: kwaainet update --check displays release notes
  3. Install: kwaainet update backs up config, installs, and guides restart

Example Update Check:

╭─────────────────────────────────────────────────────────────────────╮
│                        🔄 KwaaiNet Update                            │
╰─────────────────────────────────────────────────────────────────────╯

  📌 Current version: v0.4.0
  🔍 Checking for updates...

  🎉 New version available: v0.4.1
  🔗 Details: https://github.com/Kwaai-AI-Lab/OpenAI-Petal/releases/tag/v0.4.1

  💡 Run 'kwaainet update' (without --check) to install

Initial Setup

If you installed using the one-step installer, you can immediately start using KwaaiNet.

If you installed manually, first run the setup command to configure your environment:

kwaainet setup

This will:

  • Set up required environment variables
  • Create cache directories
  • Install or verify dependencies
  • Check GPU compatibility

Starting a Node

To start a KwaaiNet node with default settings:

kwaainet start

Or with custom settings:

kwaainet start --model "unsloth/Llama-3.1-8B-Instruct" --blocks 4 --port 8080 --public-name "anon@kwaai"

⚙️ Configuration

View current configuration:

kwaainet config --view

Update configuration:

kwaainet config --set model "unsloth/Llama-3.1-8B-Instruct"
kwaainet config --set blocks 4
kwaainet config --set public_name "anon@kwaai"

🛠️ Available Command-line Options

The kwaainet start command supports the following options:

  • --model: Model to use (default: "unsloth/Llama-3.1-8B-Instruct")
  • --blocks: Number of blocks to share (default: 4)
  • --port: Port to listen on (default: 8080)
  • --no-gpu: Disable GPU acceleration
  • --public-name: Public name for your node
  • --public-ip: Explicitly set the public IP address
  • --announce-addr: Custom announce address for P2P networking
  • --no-relay: Disable automatic relay

🐍 Python API

You can also use KwaaiNet programmatically in your Python code:

import kwaainet

# Setup environment
kwaainet.setup()

# Start a node
kwaainet.start_node(
    model="unsloth/Llama-3.1-8B-Instruct",
    blocks=4,
    port=8080
)

🌍 Environment Variables

KwaaiNet respects the following environment variables:

  • KWAAINET_MODEL: Model to use (default: "unsloth/Llama-3.1-8B-Instruct")
  • KWAAINET_BLOCKS: Number of blocks to share (default: 4)
  • KWAAINET_PORT: Port to listen on (default: 8080)
  • INITIAL_PEERS: Initial peers for connecting to the network
  • KWAAINET_LOG_LEVEL: Logging level (default: "INFO")
  • KWAAINET_MAX_MEMORY: Maximum memory to use (in GB)
  • PUBLIC_NAME: Public name for your node
  • PUBLIC_IP: Override the public IP address (auto-detected by default)
  • ANNOUNCE_ADDR: Custom announce address for P2P networking
  • NORELAY: Set to any value to disable automatic relay
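
A node can be configured entirely through these variables. A minimal sketch (values are illustrative; assumes kwaainet is on your PATH):

import os
import subprocess

# Apply environment-based configuration, then launch the node.
env = dict(
    os.environ,
    KWAAINET_MODEL="unsloth/Llama-3.1-8B-Instruct",
    KWAAINET_BLOCKS="4",
    KWAAINET_PORT="8080",
    PUBLIC_NAME="anon@kwaai",
)
subprocess.run(["kwaainet", "start"], env=env, check=True)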

⚡ Performance Considerations

Windows Systems

NVIDIA GPUs

Windows systems with NVIDIA GPUs will use CUDA acceleration when properly configured. Manual setup required until Windows installer is restored.

AMD GPUs

AMD GPU support on Windows requires manual configuration. ROCm support is limited compared to Linux.

Intel GPUs

Intel integrated and discrete GPUs are detected and supported through Intel Extension for PyTorch when available.

CPU-only

On systems without dedicated GPUs, KwaaiNet will run in CPU-only mode with optimized PyTorch CPU libraries.

Linux Systems

NVIDIA GPUs

Linux systems with NVIDIA GPUs will automatically use CUDA acceleration when available. The installer detects NVIDIA GPUs and configures the appropriate drivers and libraries.

AMD GPUs

Systems with AMD GPUs can use ROCm for acceleration. The installer will detect AMD GPUs and attempt to configure ROCm support.

Intel GPUs

Intel integrated and discrete GPUs are supported through Intel Extension for PyTorch on compatible systems.

CPU-only

On systems without dedicated GPUs, KwaaiNet will run in CPU-only mode with optimized PyTorch CPU libraries.

Apple Silicon (M1/M2/M3/M4) Macs

On Apple Silicon Macs, GPU acceleration via Metal Performance Shaders (MPS) is used automatically if available. This provides significantly better performance than CPU-only mode.

Intel Macs

Intel Macs will primarily use CPU for computation as Metal support for PyTorch on Intel is limited.

🔧 Troubleshooting

Common Issues

Windows-specific Issues

⚠️ Windows installer temporarily unavailable due to critical issues.

For Windows users:

  • Use WSL2 with Linux installer (recommended)
  • Manual installation requires advanced PowerShell/Python knowledge

GPU not detected:

  • Ensure proper GPU drivers are installed (NVIDIA GeForce Experience, AMD Adrenalin, Intel Arc Control)
  • Check Device Manager for GPU hardware detection
  • Verify nvidia-smi command works for NVIDIA GPUs

Installation fails with permission errors:

  • Run PowerShell as Administrator if needed
  • Some dependencies may require elevated privileges
  • The installer will use winget when available for automatic dependency installation

Python version issues:

  • The installer supports Python 3.8+ and will set up Miniconda if system Python is incompatible
  • Windows Store Python installations may cause issues - prefer python.org or Miniconda installations (applies to manual installation)

ModuleNotFoundError: No module named 'kwaainet':

  • This was a known issue that has been fixed in recent installer updates
  • For existing installations, activate your environment and run: pip install "git+https://github.com/Kwaai-AI-Lab/OpenAI-Petal.git#subdirectory=Installer/windows"
  • Or use WSL2 with Linux installer

Antivirus software interfering:

  • Some antivirus software may block the installer or conda operations
  • Add exclusions for the KwaaiNet directory and Python environments if needed

Linux-specific Issues

Tokenizers build failure (wheel compilation error):

  • Error: Building wheel for tokenizers (pyproject.toml) ... error
  • Cause: Missing Rust compiler or build dependencies
  • 🚨 IMMEDIATE WORKAROUND (if issues persist in v0.2.0):
    # Use the --no-build-tools flag to force pre-built wheels only
    curl -fsSL https://raw.githubusercontent.com/Kwaai-AI-Lab/OpenAI-Petal/main/Installer/linux/linuxinstaller.sh | bash -s -- --no-build-tools
  • Long-term solutions:
    # Install Rust compiler
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    source ~/.cargo/env
    
    # Or install build dependencies
    sudo apt-get update && sudo apt-get install build-essential
    # For RHEL/CentOS: sudo yum groupinstall "Development Tools"
    # For Arch: sudo pacman -S base-devel
    
    # Force use of pre-built wheels manually
    pip install --only-binary=tokenizers tokenizers

GPU not detected:

  • Ensure proper GPU drivers are installed (NVIDIA, AMD, or Intel)
  • Run lspci | grep -i vga to verify GPU hardware detection
  • Check if nvidia-smi, rocm-smi, or Intel GPU tools are working

Installation fails with permission errors:

  • The installer will automatically detect if sudo is needed and only use it when necessary
  • If you have all dependencies installed, use --no-system-packages to avoid sudo requirement
  • Ensure you have administrative privileges for system package installation

ModuleNotFoundError: No module named 'kwaainet':

  • This was a known issue that has been fixed in recent installer updates
  • For existing installations, run: source ~/.kwaainet-venv/bin/activate && pip install "git+https://github.com/Kwaai-AI-Lab/OpenAI-Petal.git#subdirectory=Installer/linux"
  • Or reinstall using the latest installer

Dependency conflicts (transformers version issues):

  • The installer pins transformers==4.43.1 for Petals compatibility
  • Note: this version carries known CVEs; see the Security Considerations section below for details and mitigations
  • Conflicts should be resolved automatically in new installations

Python version issues:

  • The installer supports Python 3.8+ and will set up conda if system Python is too old
  • Check your Python version with python3 --version

macOS-specific Issues

"MPS is not available" error:

  • Ensure you have macOS 12.3 or later
  • Make sure PyTorch 2.0+ is installed

General Issues

High memory usage:

  • Reduce the number of blocks being shared
  • Set a lower KWAAINET_MAX_MEMORY value

Node doesn't connect to network:

  • Check your network connection
  • Verify the initial peers configuration
  • Ensure firewall allows the configured port

🔒 Security Considerations

⚠️ Known Security Trade-offs (January 2025)

Transformers Vulnerability Status: The current installation uses transformers==4.43.1 due to Petals compatibility constraints. This version is vulnerable to 8 known CVEs:

  • CVE-2025-1194 (ReDoS in tokenizers) - 🔴 CRITICAL
  • CVE-2025-2099 (ReDoS in testing_utils) - 🔴 CRITICAL
  • CVE-2024-11392 (Code injection vulnerability) - 🟠 HIGH
  • CVE-2024-11393 (Deserialization vulnerability) - 🟠 HIGH
  • CVE-2024-11394 (Path traversal vulnerability) - 🟠 HIGH
  • Additional ReDoS vulnerabilities in various components

Why This Trade-off Exists: Petals (both stable v2.2.0 and development versions) strictly requires transformers==4.43.1. Updating to the secure transformers>=4.50.0 breaks Petals compatibility entirely.

Risk Mitigation Strategies:

  • 🛡️ Run in isolated environments/containers
  • 🚫 Avoid processing untrusted input through tokenizers
  • 🔒 Use network firewalls to limit exposure
  • 📊 Monitor for unusual CPU usage (ReDoS indicators)
  • 🔄 Regularly check for Petals updates that support newer transformers

Resolution Timeline: This will be resolved when:

  1. Petals releases a version supporting transformers>=4.50.0, OR
  2. A security fork of transformers 4.43.1 patches these CVEs, OR
  3. Alternative distributed inference solutions become available

Other Security Updates (December 2024)

  • ✅ Fixed CVE-2024-24762 (FastAPI ReDoS vulnerability)
  • ✅ Updated LangChain to address CVE-2023-46229 and CVE-2024-21513
  • ✅ Updated all other dependencies to latest secure versions

📝 Recent Fixes and Improvements

Installer Improvements (December 2024)

  • ✅ Fixed critical Linux installer bug causing "No module named 'kwaainet'" error
  • ⚠️ Windows installer temporarily removed due to critical syntax errors
  • ✅ Added intelligent sudo handling - only uses sudo when necessary
  • ✅ Fixed transformers dependency conflicts (pinned to exact version 4.43.1)
  • ✅ Added Linux installer options: --no-system-packages, --force-venv, --force-conda

Compatibility

  • ✅ All installers now properly install the kwaainet package
  • ✅ Dependency conflicts resolved with Petals compatibility maintained
  • ✅ Works on Ubuntu 24.04 and macOS (Intel and Apple Silicon); Windows 10+ via WSL2 or manual installation

🤖 Daemon Mode

KwaaiNet now supports running as a daemon with advanced process management:

Basic Daemon Operations

# Start in daemon mode (background)
kwaainet start --daemon

# Or use daemon commands
kwaainet daemon start     # Start daemon
kwaainet daemon stop      # Stop daemon  
kwaainet daemon restart   # Restart daemon
kwaainet daemon status    # Show detailed status
kwaainet daemon logs      # Show recent logs

Process Management

# Regular commands (work with daemon or foreground)
kwaainet start            # Start in foreground
kwaainet stop             # Stop running instance
kwaainet restart          # Restart instance
kwaainet status           # Show status with metrics

Features

  • PID Management: Automatic PID file handling and process tracking
  • Health Monitoring: CPU, memory, and connection monitoring
  • Log Management: Automatic log rotation and structured logging
  • Signal Handling: Graceful shutdown on SIGTERM/SIGINT
  • Auto-Recovery: Process monitoring with restart capability
  • Status Reporting: JSON status output with system metrics

🔄 Reliability and Management Features (In Development)

KwaaiNet is being enhanced with additional reliability features for production deployments. Several of the items below have already shipped in recent releases (see Recent Updates above):

Planned Improvements

  • 🔧 Process Cleanup: kwaainet start will automatically stop any existing processes to prevent conflicts
  • 📦 Auto-Update: Automatic detection and installation when new versions are available
  • ⏰ Scheduled Restarts: Daily restart capability (configurable time, default midnight) for maintenance
  • 🌐 Connection Monitoring: Automatic restart when node is dropped from the network map due to connectivity issues

Why These Features Matter

  • Process Cleanup: Prevents port conflicts and resource contention from orphaned processes
  • Auto-Update: Ensures nodes stay current with security fixes and performance improvements
  • Scheduled Restarts: Maintains long-term stability and applies configuration changes
  • Connection Monitoring: Ensures continuous participation in the distributed network

These features will enhance KwaaiNet's suitability for production environments and long-running deployments.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Development Guidelines

  • Versioning: This project follows Semantic Versioning (SemVer). See VERSIONING.md for detailed version schema and update procedures.
  • Testing: Test your changes on the target platform before submitting
  • Documentation: Update relevant documentation for new features

📄 License

This project is CC-BY-4.0 licensed.
