LLVM Binary Obfuscator

Production-ready multi-layer code obfuscation toolkit for C/C++ binaries

What is This?

The LLVM Binary Obfuscator is a comprehensive code protection toolkit that makes reverse engineering C/C++ binaries significantly harder. It uses a 4-layer defense strategy combining cryptographic symbol renaming, compiler-level optimizations, control-flow obfuscation, and string encryption.

Key Features

4 Sequential Obfuscation Layers - Applied in execution order (1→2→3→4)
UPX Binary Packing - Optional 5th layer: compress binaries 50-70% while adding obfuscation
Production Ready - Thoroughly tested, all layers functional
Zero Runtime Overhead (Layers 1-2) - Source-level transformations only
CLI & API - Easy integration with existing build pipelines
Comprehensive Reports - Detailed obfuscation metrics and validation
Cross-Platform - Supports Linux, macOS, and Windows targets

Quick Stats

Metric	Without Obfuscation	With All 4 Layers	With UPX
Symbol Count	20 symbols	1 symbol (-95%)	1 symbol (-95%)
Secret Visibility	100% exposed	0% visible	0% visible
RE Difficulty	2-4 hours	8-12 weeks (50x harder)	10-14 weeks
Binary Size	16 KB	35 KB (+119%)	18 KB (+12%) ⭐
Overhead	-	~15-25%	~15-25% + 10-50ms startup

Installation

Prerequisites

Requirement	Version	Notes
Python	3.10+	Required for CLI and API
Clang	15+	Or GCC 11+ for compilation
CMake	3.20+	For building MLIR components
LLVM	22	For OLLVM passes and MLIR obfuscation
MLIR	22	Included with LLVM 22
OpenSSL	Any	For cryptographic symbol hashing
UPX	4.0+	Optional, for binary packing

Option 1: Quick Install (Recommended)

# Clone repository
git clone https://github.com/devalgupta404/oaas.git
cd oaas

# Create and activate virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install Python dependencies
cd cmd/llvm-obfuscator
pip install -r requirements.txt

# Build MLIR obfuscation library (for MLIR passes)
cd ../../mlir-obs
./build.sh
cd ..

# Verify installation
cd cmd/llvm-obfuscator
python3 -m cli.obfuscate --help

Option 2: Full Build (with OLLVM Layer 2)

# Clone LLVM with OLLVM passes
git clone https://github.com/llvm/llvm-project.git
cd llvm-project
git checkout llvmorg-22.0.0  # Or appropriate version
mkdir build && cd build

# Build LLVM with OLLVM plugin
cmake -G Ninja ../llvm \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_PROJECTS="clang;mlir" \
  -DLLVM_TARGETS_TO_BUILD="X86;ARM;AArch64"
ninja

# Add to PATH
export PATH="$(pwd)/bin:$PATH"

# Continue with Quick Install steps above

Option 3: Docker Installation

# Build the Docker image
docker build -t llvm-obfuscator -f cmd/llvm-obfuscator/Dockerfile.backend .

# Run the container
docker run -p 8000:8000 llvm-obfuscator

Building from Source

Complete Build Steps

# 1. Clone the repository
git clone https://github.com/devalgupta404/oaas.git
cd oaas

# 2. Set up Python environment
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip

# 3. Install Python dependencies
cd cmd/llvm-obfuscator
pip install -r requirements.txt

# 4. Install development dependencies (optional, for testing)
pip install pytest pytest-cov httpx

# 5. Build MLIR obfuscation library
cd ../../mlir-obs
chmod +x build.sh
./build.sh

# 6. Verify the installation
cd ../cmd/llvm-obfuscator
python3 -m cli.obfuscate --help

Platform-Specific Notes

Linux (Ubuntu/Debian)

# Install system dependencies
sudo apt-get update
sudo apt-get install -y \
    build-essential \
    cmake \
    ninja-build \
    clang \
    llvm \
    python3-dev \
    python3-venv \
    libssl-dev \
    upx-ucl

macOS

# Install via Homebrew
brew install cmake ninja llvm python@3.11 openssl upx

# Add LLVM to PATH
export PATH="/opt/homebrew/opt/llvm/bin:$PATH"

Windows

# Install via Chocolatey
choco install cmake ninja llvm python3 openssl

# Or via winget
winget install LLVM.LLVM
winget install Kitware.CMake
winget install Python.Python.3.11

Running Tests

Test Suite Overview

The project includes comprehensive unit tests for all core modules:

tests/
├── conftest.py          # Shared fixtures and configuration
├── pytest.ini           # Pytest settings
├── test_config.py       # Tests for configuration module
├── test_utils.py        # Tests for utility functions
├── test_cli.py          # Tests for CLI commands
└── test_api.py          # Tests for API endpoints

Running Tests

# Navigate to project root
cd oaas

# Run all tests
pytest tests/ -v

# Run tests with coverage report
pytest tests/ --cov=cmd/llvm-obfuscator/core --cov-report=html

# Run specific test file
pytest tests/test_config.py -v

# Run specific test class
pytest tests/test_config.py::TestPlatform -v

# Run tests matching a pattern
pytest tests/ -k "test_platform" -v

# Run only unit tests (fast)
pytest tests/ -m "unit" -v

# Run integration tests
pytest tests/ -m "integration" -v

Test Categories

Marker	Description
`unit`	Fast unit tests without external dependencies
`integration`	Tests requiring external tools (LLVM, etc.)
`slow`	Tests that take more than 5 seconds

CI/CD Integration

Tests are automatically run on:

Push to main/master branch
Pull requests
Scheduled nightly builds

# Example GitHub Actions configuration
- name: Run tests
  run: |
    pip install pytest pytest-cov
    pytest tests/ --cov=core --cov-report=xml

Quick Start

Basic Obfuscation (Layers 0+1+3 + UPX - Production Ready)

# Navigate to CLI directory
cd cmd/llvm-obfuscator

# Obfuscate a C source file with UPX compression
python3 -m cli.obfuscate compile ../../src/simple_auth.c \
  --output ./obfuscated \
  --level 3 \
  --string-encryption \
  --enable-symbol-obfuscation \
  --enable-upx \
  --custom-flags "-flto -fvisibility=hidden -O3" \
  --report-formats "json,html"

# Test the obfuscated binary
./obfuscated/simple_auth "test_password"

Verify Protection

# Check symbols (should see only 1-2 symbols)
nm obfuscated/simple_auth | grep -v ' U ' | wc -l

# Check for secrets (should return empty)
strings obfuscated/simple_auth | grep -iE "password|secret|key"

# View obfuscation report
cat obfuscated/report.json

Architecture Overview

System Design

┌─────────────────────────────────────────────────────────┐
│                    LLVM Obfuscator                       │
├─────────────────────────────────────────────────────────┤
│                                                           │
│  Input: source.c  →  [4 Layers]  →  Output: binary     │
│                                                           │
│  Layer 1: Symbol Obfuscation (Pre-compilation)          │
│    ├─ Cryptographic hashing (SHA256/BLAKE2B)            │
│    ├─ Function/variable renaming                         │
│    └─ Mapping file generation                            │
│                                                           │
│  Layer 4: Compiler Flags (Compilation)                  │
│    ├─ Link-time optimization (-flto)                     │
│    ├─ Symbol hiding (-fvisibility=hidden)               │
│    ├─ Frame pointer removal (-fomit-frame-pointer)      │
│    └─ Speculative hardening (-mspeculative-load-hardening)│
│                                                           │
│  Layer 3: OLLVM Passes (IR-level) [Optional]            │
│    ├─ Control flow flattening                            │
│    ├─ Instruction substitution                           │
│    ├─ Bogus control flow                                 │
│    └─ Basic block splitting                              │
│                                                           │
│  Layer 2: String Encryption (Pre-compilation)           │
│    ├─ XOR encryption                                     │
│    ├─ Runtime decryption                                 │
│    └─ Constructor-based initialization                   │
│                                                           │
│  Layer 5: UPX Binary Packing (Post-compilation) [New!]  │
│    ├─ 50-70% binary size reduction                       │
│    ├─ Additional obfuscation layer                       │
│    ├─ LZMA compression                                   │
│    └─ Transparent runtime decompression                  │
│                                                           │
└─────────────────────────────────────────────────────────┘

Data Flow

Source File (auth.c)
       ↓
[Layer 1] Symbol Obfuscation
       ├─ validate_password → f_dabe0a778dd2
       ├─ failed_attempts → v_3f4e5d6c7b8a
       └─ Output: auth_symbols.c
       ↓
[Layer 2] String Encryption
       ├─ "AdminPass2024!" → {0xCA, 0xCF, 0xC6, ...}
       ├─ Inject decryption functions
       └─ Output: auth_encrypted.c
       ↓
[Layer 3] OLLVM Passes (if enabled)
       ├─ Control flow flattening
       ├─ Instruction substitution
       └─ Output: auth_obfuscated.ll
       ↓
[Layer 4] Compilation with Flags
       ├─ -flto -fvisibility=hidden -O3
       ├─ Symbol stripping
       └─ Output: auth_binary
       ↓
[Layer 5] UPX Packing (optional)
       ├─ LZMA compression
       ├─ 50-70% size reduction
       └─ Output: auth_binary (packed)
       ↓
Final Binary

4-Layer System (Execution Order: 1→2→3→4)

Layer 1: Symbol Obfuscation (PRE-COMPILE, FIRST)

Purpose: Remove semantic meaning from all symbol names Type: Pre-compilation source transformation Execution: FIRST (before everything else) Overhead: 0% runtime (compile-time only)

How it Works:

Uses cryptographic hashing (SHA256/BLAKE2B/SipHash)
Renames functions, variables, structures with meaningless hashes
Preserves system symbols (main, standard library, etc.)
Generates reversible mapping file for debugging

Example:

// Before
int validate_password(const char* user_input) {
    if (strcmp(user_input, MASTER_PASSWORD) == 0) {
        failed_attempts = 0;
        return 1;
    }
    failed_attempts++;
    return 0;
}

// After
int f_dabe0a778dd2(const char* v_9e8d7c6b5a4f) {
    if (strcmp(v_9e8d7c6b5a4f, v_a1b2c3d4e5f6) == 0) {
        v_3f4e5d6c7b8a = 0;
        return 1;
    }
    v_3f4e5d6c7b8a++;
    return 0;
}

Usage:

python3 -m cli.obfuscate compile source.c \
  --level 1 \
  --enable-symbol-obfuscation \
  --symbol-algorithm sha256 \
  --symbol-hash-length 12

Results:

Symbol reduction: 20 → 6 symbols
RE difficulty: 2x baseline
Cryptographically hard to reverse without mapping file

Layer 2: String Encryption (PRE-COMPILE, SECOND)

Purpose: Hide hardcoded secrets and string literals Type: Pre-compilation source transformation Execution: SECOND (after symbol obfuscation) Overhead: ~1-3% runtime (decryption at startup)

Critical Importance:

⚠️ 100% of binaries without string encryption exposed secrets in our testing
Compiler obfuscation (Layers 3+4) does NOT hide strings
Strings are stored in .rodata section unchanged
MANDATORY for any binary containing secrets

How it Works:

Scans source code for const string literals
Encrypts strings using XOR cipher
Converts to char arrays: {0xCA, 0xCF, 0xC6, ...}
Injects __attribute__((constructor)) decryption function
Strings decrypted automatically before main() executes

Example:

// Before
const char* MASTER_PASSWORD = "AdminPass2024!";
const char* API_SECRET = "sk_live_secret_12345";

// After
char* MASTER_PASSWORD = NULL;
char* API_SECRET = NULL;

__attribute__((constructor)) static void _init_encrypted_strings(void) {
    MASTER_PASSWORD = _xor_decrypt((const unsigned char[]){0xCA,0xCF,...}, 14, 0xAB);
    API_SECRET = _xor_decrypt((const unsigned char[]){0x9E,0x86,...}, 20, 0xED);
}

Usage:

python3 -m cli.obfuscate compile source.c \
  --level 1 \
  --string-encryption

Verification:

# Before string encryption
$ strings binary | grep password
AdminPass2024!          ← EXPOSED

# After string encryption
$ strings binary | grep password
(no output)             ← HIDDEN

Results:

String hiding: 100% (all secrets hidden)
Binary size increase: ~6%
Secrets in strings output: 0
RE difficulty: 5-10x baseline

Layer 3: OLLVM Compiler Passes (COMPILE, THIRD - Optional)

Purpose: Transform control flow and instructions at IR level Type: LLVM IR transformations Execution: THIRD (during compilation, optional) Overhead: ~10-20% runtime

4 Obfuscation Passes:

Control Flow Flattening
- Converts functions into state machines
- All basic blocks become case statements
- Original control flow completely hidden
Instruction Substitution
- Replaces simple operations with complex equivalents
- Example: a = b + c → a = -(-b - c)
- Harder to analyze, more complex patterns
Bogus Control Flow
- Inserts fake conditional branches (never taken)
- Adds opaque predicates (always true/false)
- Increases CFG complexity without changing behavior
Basic Block Splitting
- Divides basic blocks into smaller pieces
- Inserts unconditional jumps between them
- Increases code size and complexity

Important Notes:

⚠️ Requires OLLVM plugin (must be built separately)
⚠️ Modern optimizations (O1/O3) reduce OLLVM effectiveness by 30-41%
✅ Best resilience: Bogus CF (+40% entropy) > Flattening (+16%) > Split (+6%) > Substitution (+0.7%)
✅ Use with -O0 or -O2 (avoid -O1 and -O3)

Usage:

python3 -m cli.obfuscate compile source.c \
  --level 4 \
  --enable-flattening \
  --enable-substitution \
  --enable-bogus-cf \
  --enable-split \
  --custom-pass-plugin /path/to/LLVMObfuscationPlugin.dylib

Results:

Symbol count: 28 symbols (increases due to new basic blocks)
Entropy increase: +180% (0.6 → 1.8)
Function complexity: 20-30x harder to analyze
RE difficulty: 20-30x baseline

Research Finding: Based on testing 42 configurations, OLLVM passes provide minimal benefit when combined with modern optimizations. Recommended only for extreme security needs.

Layer 4: Compiler Flags (COMPILE, FINAL)

Purpose: Leverage compiler optimizations for obfuscation Type: Compilation-level transformations Execution: FOURTH/FINAL (last step) Overhead: ~2-5% runtime

Optimal Flag Set (9 flags):

-flto - Link-time optimization (inlines functions)
-fvisibility=hidden - Hides external symbols
-O3 - Maximum optimization
-fno-builtin - Disables built-in functions
-flto=thin - Thin LTO for faster builds
-fomit-frame-pointer - Breaks debugger stack traces
-mspeculative-load-hardening - Spectre mitigation (adds noise)
-O1 - Secondary optimization pass
-Wl,-s - Strip all symbols from binary

Synergy Effect:

Individual flags: 14 → 8 symbols (-43%)
Combined flags: 14 → 1 symbol (-93%)
Synergy factor: 2.16x (flags work together)

Usage:

python3 -m cli.obfuscate compile source.c \
  --level 3 \
  --custom-flags "-flto -fvisibility=hidden -O3 -fno-builtin -flto=thin -fomit-frame-pointer -mspeculative-load-hardening -O1 -Wl,-s"

Results:

Symbol reduction: 14 → 1 symbol (-93%)
Function hiding: 8 → 1 function
Binary size increase: ~100% (due to inlining)
RE difficulty: 10-15x baseline

Layer 3: OLLVM Compiler Passes

Purpose: Transform control flow and instructions at IR level Type: LLVM IR transformations Overhead: ~10-20% runtime

4 Obfuscation Passes:

Control Flow Flattening
- Converts functions into state machines
- All basic blocks become case statements
- Original control flow completely hidden
Instruction Substitution
- Replaces simple operations with complex equivalents
- Example: a = b + c → a = -(-b - c)
- Harder to analyze, more complex patterns
Bogus Control Flow
- Inserts fake conditional branches (never taken)
- Adds opaque predicates (always true/false)
- Increases CFG complexity without changing behavior
Basic Block Splitting
- Divides basic blocks into smaller pieces
- Inserts unconditional jumps between them
- Increases code size and complexity

Important Notes:

⚠️ Requires OLLVM plugin (must be built separately)
⚠️ Modern optimizations (O1/O3) reduce OLLVM effectiveness by 30-41%
✅ Best resilience: Bogus CF (+40% entropy) > Flattening (+16%) > Split (+6%) > Substitution (+0.7%)
✅ Use with -O0 or -O2 (avoid -O1 and -O3)

Usage:

python3 -m cli.obfuscate compile source.c \
  --level 4 \
  --enable-flattening \
  --enable-substitution \
  --enable-bogus-cf \
  --enable-split \
  --custom-pass-plugin /path/to/LLVMObfuscationPlugin.dylib

Results:

Symbol count: 28 symbols (increases due to new basic blocks)
Entropy increase: +180% (0.6 → 1.8)
Function complexity: 20-30x harder to analyze
RE difficulty: 20-30x baseline

Research Finding: Based on testing 42 configurations, OLLVM passes provide minimal benefit when combined with modern optimizations. Recommended only for extreme security needs.

Layer 2: String Encryption

Purpose: Hide hardcoded secrets and string literals Type: Pre-compilation source transformation Overhead: ~1-3% runtime (decryption at startup)

Critical Importance:

⚠️ 100% of binaries without string encryption exposed secrets in our testing
Compiler obfuscation (Layers 1+2) does NOT hide strings
Strings are stored in .rodata section unchanged
MANDATORY for any binary containing secrets

How it Works:

Scans source code for const string literals
Encrypts strings using XOR cipher
Converts to char arrays: {0xCA, 0xCF, 0xC6, ...}
Injects __attribute__((constructor)) decryption function
Strings decrypted automatically before main() executes

Example:

// Before
const char* MASTER_PASSWORD = "AdminPass2024!";
const char* API_SECRET = "sk_live_secret_12345";

// After
char* MASTER_PASSWORD = NULL;
char* API_SECRET = NULL;

__attribute__((constructor)) static void _init_encrypted_strings(void) {
    MASTER_PASSWORD = _xor_decrypt((const unsigned char[]){0xCA,0xCF,...}, 14, 0xAB);
    API_SECRET = _xor_decrypt((const unsigned char[]){0x9E,0x86,...}, 20, 0xED);
}

Usage:

python3 -m cli.obfuscate compile source.c \
  --level 1 \
  --string-encryption

Verification:

# Before string encryption
$ strings binary | grep password
AdminPass2024!          ← EXPOSED

# After string encryption
$ strings binary | grep password
(no output)             ← HIDDEN

Results:

String hiding: 100% (all secrets hidden)
Binary size increase: ~6%
Secrets in strings output: 0
RE difficulty: 5-10x baseline

Usage Examples

Example 1: Standard Production Binary (with UPX)

Use Case: Web application backend, moderate security requirements

python3 -m cli.obfuscate compile src/auth.c \
  --output ./release \
  --level 2 \
  --string-encryption \
  --enable-symbol-obfuscation \
  --enable-upx \
  --custom-flags "-flto -fvisibility=hidden -O3 -fno-builtin -flto=thin -fomit-frame-pointer -mspeculative-load-hardening -O1 -Wl,-s" \
  --report-formats "json,html"

Results:

Compilation time: ~2 seconds
Binary size: 18 KB (from 16 KB, only +12% instead of +119%!)
Symbol count: 1 (-95%)
Secrets visible: 0
Overhead: ~5-10% + 10-50ms startup
RE difficulty: 4-8 weeks (vs 2-4 hours)

Example 2: High-Security Financial Application

Use Case: Payment processing, regulatory compliance required

python3 -m cli.obfuscate compile src/payment.c \
  --output ./secure \
  --level 4 \
  --enable-symbol-obfuscation \
  --symbol-algorithm blake2b \
  --symbol-hash-length 16 \
  --symbol-salt "$(openssl rand -hex 32)" \
  --string-encryption \
  --custom-flags "-flto -fvisibility=hidden -O3 -fno-builtin -flto=thin -fomit-frame-pointer -mspeculative-load-hardening -O1 -Wl,-s" \
  --report-formats "json,html,markdown"

Results:

Symbol count: 1
Hash collisions: 0 (16-char Blake2b)
Secrets visible: 0
Obfuscation score: 80-85/100
RE difficulty: 8-12 weeks

Example 3: Ultra-Critical (Military/IP Protection)

Use Case: Proprietary algorithm, maximum protection required

python3 -m cli.obfuscate compile src/proprietary.c \
  --output ./maximum \
  --level 5 \
  --enable-symbol-obfuscation \
  --enable-flattening \
  --enable-bogus-cf \
  --string-encryption \
  --fake-loops 10 \
  --cycles 1 \
  --custom-flags "-flto -fvisibility=hidden -O2 -fno-builtin" \
  --custom-pass-plugin /path/to/LLVMObfuscationPlugin.dylib \
  --report-formats "json,html,markdown"

Important: Use -O2 instead of -O3 to preserve OLLVM obfuscation!

Results:

Symbol count: 2
Entropy: 1.08 (+180%)
Secrets visible: 0
Obfuscation score: 90-95/100
Overhead: ~25-30%
RE difficulty: 12+ weeks

Example 4: Batch Processing Multiple Files

Create batch_config.yaml:

jobs:
  - source: "src/auth.c"
    output: "./obfuscated/auth"
    config:
      level: 3
      string_encryption: true
      enable_symbol_obfuscation: true

  - source: "src/license.cpp"
    output: "./obfuscated/license"
    config:
      level: 4
      enable_flattening: true
      enable_bogus_cf: true
      string_encryption: true

  - source: "src/api.c"
    output: "./obfuscated/api"
    config:
      level: 3
      string_encryption: true

Run batch:

python3 -m cli.obfuscate batch batch_config.yaml

CLI Reference

Commands

python -m cli.obfuscate [COMMAND] [OPTIONS]

Available Commands:

compile - Obfuscate and compile a source file
analyze - Analyze an existing binary for obfuscation level
compare - Compare baseline vs obfuscated binaries
batch - Process multiple files from config file

Options

Core Options

--level <1-5>              Obfuscation level (1=minimal, 5=maximum)
--output <path>            Output directory
--report-formats <list>    Generate reports (json,html,markdown)

Layer 1: Symbol Obfuscation

--enable-symbol-obfuscation       Enable cryptographic symbol renaming
--symbol-algorithm <algo>         Hash algorithm (sha256, blake2b, siphash)
--symbol-hash-length <n>          Hash truncation length (default: 12)
--symbol-prefix <style>           Prefix style (typed, single, none)
--symbol-salt <string>            Custom salt for deterministic builds

Layer 4: Compiler Flags

--custom-flags "<flags>"   Custom compiler flags (overrides defaults)

Layer 3: OLLVM Passes

--enable-flattening              Enable control flow flattening
--enable-substitution            Enable instruction substitution
--enable-bogus-cf                Enable bogus control flow
--enable-split                   Enable basic block splitting
--custom-pass-plugin <path>      Path to OLLVM plugin (required for Layer 2)
--cycles <n>                     Number of obfuscation cycles (1-5)

Layer 2: String Encryption

--string-encryption              Enable string literal encryption
--fake-loops <n>                 Number of fake loops to insert (0-50)

Layer 5: UPX Binary Packing

--enable-upx                     Enable UPX binary packing (compression + obfuscation)
--upx-compression <level>        Compression level (fast, default, best, brute)
--upx-lzma / --no-upx-lzma      Use LZMA compression (default: true)
--upx-preserve-original          Keep backup of pre-UPX binary

Configuration File

--config-file <path>     Load settings from YAML file

Examples

Simple obfuscation:

python3 -m cli.obfuscate compile source.c --level 2 --string-encryption

Maximum security:

python3 -m cli.obfuscate compile source.c \
  --level 5 \
  --enable-symbol-obfuscation \
  --enable-flattening \
  --enable-bogus-cf \
  --string-encryption \
  --custom-pass-plugin /path/to/plugin.dylib

Analyze existing binary:

python3 -m cli.obfuscate analyze ./binary --output report.json

Compare binaries:

python3 -m cli.obfuscate compare ./baseline ./obfuscated --output diff.html

Best Practices

Security Guidelines

Always enable string encryption if binary contains secrets
- Passwords, API keys, encryption keys
- Database credentials, internal URLs
- License keys, proprietary algorithms
Use strong salts for symbol obfuscation
```
--symbol-salt "$(openssl rand -hex 32)"
```
Store mapping files securely
- Required for debugging crashes
- Don't distribute with binaries
- Backup in secure location
Test obfuscated binaries thoroughly
- Functional tests (same inputs → same outputs)
- Performance tests (acceptable overhead)
- Security tests (secrets not visible)
Use Layer 2 sparingly
- Only for ultra-critical code
- Adds 15-30% overhead
- Modern optimizations reduce effectiveness

Build Integration

Makefile Integration

# Release build with obfuscation
release:
	python3 -m cli.obfuscate compile src/main.c \
		--output ./release \
		--level 3 \
		--string-encryption \
		--enable-symbol-obfuscation \
		--custom-flags "-flto -fvisibility=hidden -O3"

# Development build (no obfuscation)
dev:
	gcc -O0 -g src/main.c -o ./dev/main

CMake Integration

# Add custom target for obfuscated build
add_custom_target(obfuscate
    COMMAND python3 -m cli.obfuscate compile ${CMAKE_SOURCE_DIR}/src/main.c
        --output ${CMAKE_BINARY_DIR}/obfuscated
        --level 3
        --string-encryption
        --enable-symbol-obfuscation
    DEPENDS ${CMAKE_SOURCE_DIR}/src/main.c
)

CI/CD Integration

# .github/workflows/release.yml
name: Release Build

on:
  push:
    tags: ['v*']

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install dependencies
        run: |
          pip install -r requirements.txt

      - name: Build obfuscated binary
        run: |
          python3 -m cli.obfuscate compile src/main.c \
            --level 3 \
            --string-encryption \
            --enable-symbol-obfuscation \
            --output ./release

      - name: Verify obfuscation
        run: |
          # Check symbols
          nm release/main | grep -v ' U ' | wc -l

          # Check secrets
          ! strings release/main | grep -iE "password|secret"

      - name: Upload release
        uses: actions/upload-artifact@v3
        with:
          name: obfuscated-binary
          path: release/main

MLIR-Based Obfuscation (Advanced)

In addition to the 4-layer system, the obfuscator now supports MLIR-based obfuscation for advanced control-flow and constant transformations.

Two Pipeline Options

1. Default Pipeline (CLANG) - Stable, Production-Ready

python3 -m cli.obfuscate compile source.c \
  --enable-string-encrypt \
  --enable-crypto-hash \
  --output ./output

2. ClangIR Pipeline (CLANGIR) - Advanced C/C++ Semantics

python3 -m cli.obfuscate compile source.c \
  --mlir-frontend clangir \
  --enable-string-encrypt \
  --enable-crypto-hash \
  --enable-constant-obfuscate \
  --output ./output

MLIR Passes Available

Pass	Description	Flag
String Encryption	XOR-based string literal encryption	`--enable-string-encrypt`
Symbol Obfuscation	Random hex symbol names (RNG-based)	`--enable-symbol-obfuscate`
Crypto Hash	Cryptographic symbol hashing (SHA256/BLAKE2B/SipHash)	`--enable-crypto-hash`
Constant Obfuscation	Obfuscate ALL constants (int/float/string/array)	`--enable-constant-obfuscate`

ClangIR Advantages

✅ Preserves high-level C/C++ semantics for better obfuscation quality ✅ More accurate control flow analysis at MLIR level ✅ Native LLVM 22 support - officially part of LLVM project ✅ Better type preservation compared to lowered LLVM IR ✅ Future-proof - active development by LLVM community

Documentation

MLIR Integration Guide: MLIR_INTEGRATION_GUIDE.md - Quick start and usage examples
ClangIR Pipeline Guide: CLANGIR_PIPELINE_GUIDE.md - Comprehensive ClangIR documentation
MLIR Passes README: mlir-obs/README.md - Pass implementation details

Research & Testing

Comprehensive Testing (42 Configurations)

We tested 42 different obfuscation configurations to validate effectiveness:

Key Findings:

Modern LLVM optimizations destroy OLLVM obfuscation
- O1: -41% entropy reduction
- O3: -30% entropy reduction
- Substitution pass almost completely destroyed
Layer 1 alone > OLLVM passes
- Layer 1: 1 symbol, 2% overhead
- OLLVM + O3: 28 symbols, 10% overhead
- Layer 1 is MORE effective and FASTER
String encryption is mandatory
- 100% of binaries without Layer 3 exposed secrets
- Compiler obfuscation does NOT hide strings
Pass ordering matters significantly
- Best: split → substitution → boguscf → flattening
- Worst: flattening → boguscf → substitution → split
- Difference: 68% entropy variation

Full Research Report: See OBFUSCATION_COMPLETE.md

Test Data

Location: /Users/akashsingh/Desktop/llvm/test_results/
- comprehensive_metrics.csv (42 test configurations)
- binaries/ (42 compiled binaries)
- ir/ (LLVM IR transformation stages)

Obfuscation Effectiveness Ranking

Rank	Configuration	Symbols	Entropy	Overhead	Score
1	Layer 1 + OLLVM	2	1.0862	15%	0.762
2	Layer 1 only	1	0.8092	2%	0.758 ⭐
3	OLLVM + LTO	8	1.1302	10%	0.626
4	LTO only	8	0.7679	1%	0.590

Winner: Layer 1 only (best benefit/cost ratio)

API Documentation

RESTful API (Optional)

The obfuscator includes an optional FastAPI backend for web-based obfuscation.

Start server:

cd cmd/llvm-obfuscator
uvicorn api.main:app --reload

API Endpoints:

POST /obfuscate

Obfuscate source code

Request:

{
  "source_code": "int main() { return 0; }",
  "language": "c",
  "level": 3,
  "string_encryption": true,
  "symbol_obfuscation": true
}

Response:

{
  "status": "success",
  "binary_url": "https://api.example.com/download/abc123",
  "obfuscation_score": 75.0,
  "metrics": {
    "symbols_count": 1,
    "functions_count": 1,
    "strings_encrypted": 5,
    "entropy": 0.8092
  }
}

POST /analyze

Analyze existing binary

Request:

{
  "binary_url": "https://storage.example.com/binary"
}

Response:

{
  "status": "success",
  "analysis": {
    "symbol_count": 15,
    "function_count": 8,
    "entropy": 0.6474,
    "estimated_re_difficulty": "2-4 hours"
  }
}

Deployment

Production Deployment

Option 1: Docker Container

# Build image
docker build -t llvm-obfuscator:latest .

# Run container
docker run -p 8000:8000 llvm-obfuscator:latest

Option 2: Kubernetes

# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llvm-obfuscator
spec:
  replicas: 3
  selector:
    matchLabels:
      app: llvm-obfuscator
  template:
    metadata:
      labels:
        app: llvm-obfuscator
    spec:
      containers:
      - name: api
        image: llvm-obfuscator:latest
        ports:
        - containerPort: 8000

Option 3: Serverless (AWS Lambda)

# Package for Lambda
./package_lambda.sh

# Deploy with AWS CLI
aws lambda create-function \
  --function-name llvm-obfuscator \
  --runtime provided.al2 \
  --handler bootstrap \
  --zip-file fileb://function.zip

Full Deployment Guide: See AUTOMATED_DEPLOYMENT.md

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Development Setup

# Clone repository
git clone https://github.com/your-org/llvm-obfuscator.git
cd llvm-obfuscator

# Install development dependencies
pip install -r requirements-dev.txt

# Run tests
pytest

# Run linters
flake8 cmd/llvm-obfuscator
mypy cmd/llvm-obfuscator

Areas for Contribution

OLLVM plugin packaging - Distribute pre-built plugins
IDE integration - VSCode/CLion extensions
Additional hash algorithms - xxHash, BLAKE3
Control flow analysis - Detect weak obfuscation
Commercial obfuscator comparison - Benchmark vs Tigress, VMProtect

License

MIT License - See LICENSE file for details.

Acknowledgments

OLLVM Project - Control flow obfuscation passes
LLVM Community - Compiler infrastructuree
Research Contributors - Comprehensive testing and validation

Support & Contact

Documentation: Full docs in OBFUSCATION_COMPLETE.md
Issues: https://github.com/your-org/llvm-obfuscator/issues
Discussions: https://github.com/your-org/llvm-obfuscator/discussions

Project Status: Production Ready ✅ Last Updated: 2025-10-13 Version: 1.0.0

Name		Name	Last commit message	Last commit date
Latest commit History 1,008 Commits
.github/workflows		.github/workflows
benchmark_suite/test_programs		benchmark_suite/test_programs
binary_obfuscation_pipeline		binary_obfuscation_pipeline
cmd/llvm-obfuscator		cmd/llvm-obfuscator
docs		docs
mlir-obs		mlir-obs
obfuscation_test_suite		obfuscation_test_suite
phoronix		phoronix
scripts		scripts
spec_cpu		spec_cpu
test_binaries		test_binaries
tests		tests
.gitattributes		.gitattributes
.gitignore		.gitignore
CLANGIR_PIPELINE_GUIDE.md		CLANGIR_PIPELINE_GUIDE.md
CLANGIR_POLYGEIST_IMPLEMENTATION_PLAN.md		CLANGIR_POLYGEIST_IMPLEMENTATION_PLAN.md
CLAUDE.md		CLAUDE.md
CONSTANT_OBFUSCATION_GUIDE.md		CONSTANT_OBFUSCATION_GUIDE.md
CRYPTO_HASH_IMPLEMENTATION.md		CRYPTO_HASH_IMPLEMENTATION.md
Dockerfile.test		Dockerfile.test
INTEGRATION_SUMMARY.md		INTEGRATION_SUMMARY.md
MLIR_INTEGRATION_GUIDE.md		MLIR_INTEGRATION_GUIDE.md
POLYGEIST_INTEGRATION_SUMMARY.md		POLYGEIST_INTEGRATION_SUMMARY.md
POLYGEIST_LLVM22_PATCH_ANALYSIS.md		POLYGEIST_LLVM22_PATCH_ANALYSIS.md
QUICK_TEST.md		QUICK_TEST.md
QUICK_TEST_COMMANDS.md		QUICK_TEST_COMMANDS.md
README.md		README.md
README_TESTING.md		README_TESTING.md
SECRETS_SETUP.md		SECRETS_SETUP.md
SETUP_AND_TEST_CHECKLIST.md		SETUP_AND_TEST_CHECKLIST.md
SETUP_POLYGEIST.md		SETUP_POLYGEIST.md
TESTING_GUIDE.md		TESTING_GUIDE.md
TEST_COMMANDS.txt		TEST_COMMANDS.txt
TEST_SUITE_AUDIT_REPORT.md		TEST_SUITE_AUDIT_REPORT.md
bugs.md		bugs.md
demo_pgrm_13_reports.md		demo_pgrm_13_reports.md
docker-compose.test.yaml		docker-compose.test.yaml
docker-compose.yml		docker-compose.yml
fea4c43f85834c97b66fa026e28cd7ff.pdf		fea4c43f85834c97b66fa026e28cd7ff.pdf
image.png		image.png
missing_headers.tar.gz		missing_headers.tar.gz
obfuscate_headers.py		obfuscate_headers.py
setup_polygeist.sh		setup_polygeist.sh
test_complete_implementation.sh		test_complete_implementation.sh
test_mlir_integration.sh		test_mlir_integration.sh
test_polygeist_e2e.sh		test_polygeist_e2e.sh

devalgupta404/oaas

Folders and files

Latest commit

History

Repository files navigation

LLVM Binary Obfuscator

What is This?

Key Features

Quick Stats

Table of Contents

Installation

Prerequisites

Option 1: Quick Install (Recommended)

Option 2: Full Build (with OLLVM Layer 2)

Option 3: Docker Installation

Building from Source

Complete Build Steps

Platform-Specific Notes

Linux (Ubuntu/Debian)

macOS

Windows

Running Tests

Test Suite Overview

Running Tests

Test Categories

CI/CD Integration

Quick Start

Basic Obfuscation (Layers 0+1+3 + UPX - Production Ready)

Verify Protection

Architecture Overview

System Design

Data Flow

4-Layer System (Execution Order: 1→2→3→4)

Layer 1: Symbol Obfuscation (PRE-COMPILE, FIRST)

Layer 2: String Encryption (PRE-COMPILE, SECOND)

Layer 3: OLLVM Compiler Passes (COMPILE, THIRD - Optional)

Layer 4: Compiler Flags (COMPILE, FINAL)

Layer 3: OLLVM Compiler Passes

Layer 2: String Encryption

Usage Examples

Example 1: Standard Production Binary (with UPX)

Example 2: High-Security Financial Application

Example 3: Ultra-Critical (Military/IP Protection)

Example 4: Batch Processing Multiple Files

CLI Reference

Commands

Options

Core Options

Layer 1: Symbol Obfuscation

Layer 4: Compiler Flags

Layer 3: OLLVM Passes

Layer 2: String Encryption

Layer 5: UPX Binary Packing

Configuration File

Examples

Best Practices

Security Guidelines

Build Integration

Makefile Integration

CMake Integration

CI/CD Integration

MLIR-Based Obfuscation (Advanced)

Two Pipeline Options

MLIR Passes Available

ClangIR Advantages

Documentation

Research & Testing

Comprehensive Testing (42 Configurations)

Test Data

Obfuscation Effectiveness Ranking

API Documentation

RESTful API (Optional)

POST /obfuscate

POST /analyze

Deployment

Production Deployment

Contributing

Development Setup

Areas for Contribution

License

Packages