Skip to content

SkySingh04/oaas

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LLVM Binary Obfuscator

Production-ready multi-layer code obfuscation toolkit for C/C++ binaries

Status Platform License


What is This?

The LLVM Binary Obfuscator is a comprehensive code protection toolkit that makes reverse engineering C/C++ binaries significantly harder. It uses a 4-layer defense strategy combining cryptographic symbol renaming, compiler-level optimizations, control-flow obfuscation, and string encryption.

Key Features

  • 4 Sequential Obfuscation Layers - Applied in execution order (1→2→3→4)
  • UPX Binary Packing - Optional 5th layer: compress binaries 50-70% while adding obfuscation
  • Production Ready - Thoroughly tested, all layers functional
  • Zero Runtime Overhead (Layers 1-2) - Source-level transformations only
  • CLI & API - Easy integration with existing build pipelines
  • Comprehensive Reports - Detailed obfuscation metrics and validation
  • Cross-Platform - Supports Linux, macOS, and Windows targets

Quick Stats

Metric Without Obfuscation With All 4 Layers With UPX
Symbol Count 20 symbols 1 symbol (-95%) 1 symbol (-95%)
Secret Visibility 100% exposed 0% visible 0% visible
RE Difficulty 2-4 hours 8-12 weeks (50x harder) 10-14 weeks
Binary Size 16 KB 35 KB (+119%) 18 KB (+12%)
Overhead - ~15-25% ~15-25% + 10-50ms startup

Table of Contents

  1. Installation
  2. Quick Start
  3. Architecture Overview
  4. 4-Layer System
  5. Usage Examples
  6. CLI Reference
  7. Best Practices
  8. MLIR-Based Obfuscation (Advanced)
  9. Research & Testing
  10. API Documentation
  11. Deployment
  12. Contributing
  13. License

Installation

Prerequisites

  • Python 3.10+
  • Clang 15+ (or GCC 11+)
  • CMake 3.20+
  • LLVM 22 (for OLLVM passes and MLIR obfuscation)
  • MLIR 22 (for MLIR-based passes - included with LLVM 22)
  • OpenSSL (for cryptographic symbol hashing)
  • ClangIR (optional - for advanced C/C++ pipeline, see CLANGIR_PIPELINE_GUIDE.md)

Option 1: Quick Install (Recommended)

# Clone repository
git clone https://github.com/your-org/llvm-obfuscator.git
cd llvm-obfuscator

# Install Python dependencies
cd cmd/llvm-obfuscator
pip install -r requirements.txt

# Build MLIR obfuscation library (for MLIR passes)
cd ../../mlir-obs
./build.sh
cd ..

# Build symbol obfuscator (optional - legacy)
cd symbol-obfuscator
mkdir build && cd build
cmake .. && make
cd ../..

# Verify installation
python3 -m cmd.llvm-obfuscator.cli.obfuscate --help

Option 2: Full Build (with OLLVM Layer 2)

# Clone LLVM with OLLVM passes
git clone https://github.com/your-org/llvm-project.git
cd llvm-project
mkdir build && cd build

# Build LLVM with OLLVM plugin
cmake -G Ninja ../llvm \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_PROJECTS="clang" \
  -DLLVM_TARGETS_TO_BUILD="X86;ARM;AArch64"
ninja

# Continue with Quick Install steps above

Quick Start

Basic Obfuscation (Layers 0+1+3 + UPX - Production Ready)

# Navigate to CLI directory
cd cmd/llvm-obfuscator

# Obfuscate a C source file with UPX compression
python3 -m cli.obfuscate compile ../../src/simple_auth.c \
  --output ./obfuscated \
  --level 3 \
  --string-encryption \
  --enable-symbol-obfuscation \
  --enable-upx \
  --custom-flags "-flto -fvisibility=hidden -O3" \
  --report-formats "json,html"

# Test the obfuscated binary
./obfuscated/simple_auth "test_password"

Verify Protection

# Check symbols (should see only 1-2 symbols)
nm obfuscated/simple_auth | grep -v ' U ' | wc -l

# Check for secrets (should return empty)
strings obfuscated/simple_auth | grep -iE "password|secret|key"

# View obfuscation report
cat obfuscated/report.json

Architecture Overview

System Design

┌─────────────────────────────────────────────────────────┐
│                    LLVM Obfuscator                       │
├─────────────────────────────────────────────────────────┤
│                                                           │
│  Input: source.c  →  [4 Layers]  →  Output: binary     │
│                                                           │
│  Layer 1: Symbol Obfuscation (Pre-compilation)          │
│    ├─ Cryptographic hashing (SHA256/BLAKE2B)            │
│    ├─ Function/variable renaming                         │
│    └─ Mapping file generation                            │
│                                                           │
│  Layer 4: Compiler Flags (Compilation)                  │
│    ├─ Link-time optimization (-flto)                     │
│    ├─ Symbol hiding (-fvisibility=hidden)               │
│    ├─ Frame pointer removal (-fomit-frame-pointer)      │
│    └─ Speculative hardening (-mspeculative-load-hardening)│
│                                                           │
│  Layer 3: OLLVM Passes (IR-level) [Optional]            │
│    ├─ Control flow flattening                            │
│    ├─ Instruction substitution                           │
│    ├─ Bogus control flow                                 │
│    └─ Basic block splitting                              │
│                                                           │
│  Layer 2: String Encryption (Pre-compilation)           │
│    ├─ XOR encryption                                     │
│    ├─ Runtime decryption                                 │
│    └─ Constructor-based initialization                   │
│                                                           │
│  Layer 5: UPX Binary Packing (Post-compilation) [New!]  │
│    ├─ 50-70% binary size reduction                       │
│    ├─ Additional obfuscation layer                       │
│    ├─ LZMA compression                                   │
│    └─ Transparent runtime decompression                  │
│                                                           │
└─────────────────────────────────────────────────────────┘

Data Flow

Source File (auth.c)
       ↓
[Layer 1] Symbol Obfuscation
       ├─ validate_password → f_dabe0a778dd2
       ├─ failed_attempts → v_3f4e5d6c7b8a
       └─ Output: auth_symbols.c
       ↓
[Layer 2] String Encryption
       ├─ "AdminPass2024!" → {0xCA, 0xCF, 0xC6, ...}
       ├─ Inject decryption functions
       └─ Output: auth_encrypted.c
       ↓
[Layer 3] OLLVM Passes (if enabled)
       ├─ Control flow flattening
       ├─ Instruction substitution
       └─ Output: auth_obfuscated.ll
       ↓
[Layer 4] Compilation with Flags
       ├─ -flto -fvisibility=hidden -O3
       ├─ Symbol stripping
       └─ Output: auth_binary
       ↓
[Layer 5] UPX Packing (optional)
       ├─ LZMA compression
       ├─ 50-70% size reduction
       └─ Output: auth_binary (packed)
       ↓
Final Binary

4-Layer System (Execution Order: 1→2→3→4)

Layer 1: Symbol Obfuscation (PRE-COMPILE, FIRST)

Purpose: Remove semantic meaning from all symbol names Type: Pre-compilation source transformation Execution: FIRST (before everything else) Overhead: 0% runtime (compile-time only)

How it Works:

  • Uses cryptographic hashing (SHA256/BLAKE2B/SipHash)
  • Renames functions, variables, structures with meaningless hashes
  • Preserves system symbols (main, standard library, etc.)
  • Generates reversible mapping file for debugging

Example:

// Before
int validate_password(const char* user_input) {
    if (strcmp(user_input, MASTER_PASSWORD) == 0) {
        failed_attempts = 0;
        return 1;
    }
    failed_attempts++;
    return 0;
}

// After
int f_dabe0a778dd2(const char* v_9e8d7c6b5a4f) {
    if (strcmp(v_9e8d7c6b5a4f, v_a1b2c3d4e5f6) == 0) {
        v_3f4e5d6c7b8a = 0;
        return 1;
    }
    v_3f4e5d6c7b8a++;
    return 0;
}

Usage:

python3 -m cli.obfuscate compile source.c \
  --level 1 \
  --enable-symbol-obfuscation \
  --symbol-algorithm sha256 \
  --symbol-hash-length 12

Results:

  • Symbol reduction: 20 → 6 symbols
  • RE difficulty: 2x baseline
  • Cryptographically hard to reverse without mapping file

Layer 2: String Encryption (PRE-COMPILE, SECOND)

Purpose: Hide hardcoded secrets and string literals Type: Pre-compilation source transformation Execution: SECOND (after symbol obfuscation) Overhead: ~1-3% runtime (decryption at startup)

Critical Importance:

  • ⚠️ 100% of binaries without string encryption exposed secrets in our testing
  • Compiler obfuscation (Layers 3+4) does NOT hide strings
  • Strings are stored in .rodata section unchanged
  • MANDATORY for any binary containing secrets

How it Works:

  1. Scans source code for const string literals
  2. Encrypts strings using XOR cipher
  3. Converts to char arrays: {0xCA, 0xCF, 0xC6, ...}
  4. Injects __attribute__((constructor)) decryption function
  5. Strings decrypted automatically before main() executes

Example:

// Before
const char* MASTER_PASSWORD = "AdminPass2024!";
const char* API_SECRET = "sk_live_secret_12345";

// After
char* MASTER_PASSWORD = NULL;
char* API_SECRET = NULL;

__attribute__((constructor)) static void _init_encrypted_strings(void) {
    MASTER_PASSWORD = _xor_decrypt((const unsigned char[]){0xCA,0xCF,...}, 14, 0xAB);
    API_SECRET = _xor_decrypt((const unsigned char[]){0x9E,0x86,...}, 20, 0xED);
}

Usage:

python3 -m cli.obfuscate compile source.c \
  --level 1 \
  --string-encryption

Verification:

# Before string encryption
$ strings binary | grep password
AdminPass2024!          ← EXPOSED

# After string encryption
$ strings binary | grep password
(no output)             ← HIDDEN

Results:

  • String hiding: 100% (all secrets hidden)
  • Binary size increase: ~6%
  • Secrets in strings output: 0
  • RE difficulty: 5-10x baseline

Layer 3: OLLVM Compiler Passes (COMPILE, THIRD - Optional)

Purpose: Transform control flow and instructions at IR level Type: LLVM IR transformations Execution: THIRD (during compilation, optional) Overhead: ~10-20% runtime

4 Obfuscation Passes:

  1. Control Flow Flattening

    • Converts functions into state machines
    • All basic blocks become case statements
    • Original control flow completely hidden
  2. Instruction Substitution

    • Replaces simple operations with complex equivalents
    • Example: a = b + ca = -(-b - c)
    • Harder to analyze, more complex patterns
  3. Bogus Control Flow

    • Inserts fake conditional branches (never taken)
    • Adds opaque predicates (always true/false)
    • Increases CFG complexity without changing behavior
  4. Basic Block Splitting

    • Divides basic blocks into smaller pieces
    • Inserts unconditional jumps between them
    • Increases code size and complexity

Important Notes:

  • ⚠️ Requires OLLVM plugin (must be built separately)
  • ⚠️ Modern optimizations (O1/O3) reduce OLLVM effectiveness by 30-41%
  • ✅ Best resilience: Bogus CF (+40% entropy) > Flattening (+16%) > Split (+6%) > Substitution (+0.7%)
  • ✅ Use with -O0 or -O2 (avoid -O1 and -O3)

Usage:

python3 -m cli.obfuscate compile source.c \
  --level 4 \
  --enable-flattening \
  --enable-substitution \
  --enable-bogus-cf \
  --enable-split \
  --custom-pass-plugin /path/to/LLVMObfuscationPlugin.dylib

Results:

  • Symbol count: 28 symbols (increases due to new basic blocks)
  • Entropy increase: +180% (0.6 → 1.8)
  • Function complexity: 20-30x harder to analyze
  • RE difficulty: 20-30x baseline

Research Finding: Based on testing 42 configurations, OLLVM passes provide minimal benefit when combined with modern optimizations. Recommended only for extreme security needs.


Layer 4: Compiler Flags (COMPILE, FINAL)

Purpose: Leverage compiler optimizations for obfuscation Type: Compilation-level transformations Execution: FOURTH/FINAL (last step) Overhead: ~2-5% runtime

Optimal Flag Set (9 flags):

  1. -flto - Link-time optimization (inlines functions)
  2. -fvisibility=hidden - Hides external symbols
  3. -O3 - Maximum optimization
  4. -fno-builtin - Disables built-in functions
  5. -flto=thin - Thin LTO for faster builds
  6. -fomit-frame-pointer - Breaks debugger stack traces
  7. -mspeculative-load-hardening - Spectre mitigation (adds noise)
  8. -O1 - Secondary optimization pass
  9. -Wl,-s - Strip all symbols from binary

Synergy Effect:

  • Individual flags: 14 → 8 symbols (-43%)
  • Combined flags: 14 → 1 symbol (-93%)
  • Synergy factor: 2.16x (flags work together)

Usage:

python3 -m cli.obfuscate compile source.c \
  --level 3 \
  --custom-flags "-flto -fvisibility=hidden -O3 -fno-builtin -flto=thin -fomit-frame-pointer -mspeculative-load-hardening -O1 -Wl,-s"

Results:

  • Symbol reduction: 14 → 1 symbol (-93%)
  • Function hiding: 8 → 1 function
  • Binary size increase: ~100% (due to inlining)
  • RE difficulty: 10-15x baseline

Layer 3: OLLVM Compiler Passes

Purpose: Transform control flow and instructions at IR level Type: LLVM IR transformations Overhead: ~10-20% runtime

4 Obfuscation Passes:

  1. Control Flow Flattening

    • Converts functions into state machines
    • All basic blocks become case statements
    • Original control flow completely hidden
  2. Instruction Substitution

    • Replaces simple operations with complex equivalents
    • Example: a = b + ca = -(-b - c)
    • Harder to analyze, more complex patterns
  3. Bogus Control Flow

    • Inserts fake conditional branches (never taken)
    • Adds opaque predicates (always true/false)
    • Increases CFG complexity without changing behavior
  4. Basic Block Splitting

    • Divides basic blocks into smaller pieces
    • Inserts unconditional jumps between them
    • Increases code size and complexity

Important Notes:

  • ⚠️ Requires OLLVM plugin (must be built separately)
  • ⚠️ Modern optimizations (O1/O3) reduce OLLVM effectiveness by 30-41%
  • ✅ Best resilience: Bogus CF (+40% entropy) > Flattening (+16%) > Split (+6%) > Substitution (+0.7%)
  • ✅ Use with -O0 or -O2 (avoid -O1 and -O3)

Usage:

python3 -m cli.obfuscate compile source.c \
  --level 4 \
  --enable-flattening \
  --enable-substitution \
  --enable-bogus-cf \
  --enable-split \
  --custom-pass-plugin /path/to/LLVMObfuscationPlugin.dylib

Results:

  • Symbol count: 28 symbols (increases due to new basic blocks)
  • Entropy increase: +180% (0.6 → 1.8)
  • Function complexity: 20-30x harder to analyze
  • RE difficulty: 20-30x baseline

Research Finding: Based on testing 42 configurations, OLLVM passes provide minimal benefit when combined with modern optimizations. Recommended only for extreme security needs.


Layer 2: String Encryption

Purpose: Hide hardcoded secrets and string literals Type: Pre-compilation source transformation Overhead: ~1-3% runtime (decryption at startup)

Critical Importance:

  • ⚠️ 100% of binaries without string encryption exposed secrets in our testing
  • Compiler obfuscation (Layers 1+2) does NOT hide strings
  • Strings are stored in .rodata section unchanged
  • MANDATORY for any binary containing secrets

How it Works:

  1. Scans source code for const string literals
  2. Encrypts strings using XOR cipher
  3. Converts to char arrays: {0xCA, 0xCF, 0xC6, ...}
  4. Injects __attribute__((constructor)) decryption function
  5. Strings decrypted automatically before main() executes

Example:

// Before
const char* MASTER_PASSWORD = "AdminPass2024!";
const char* API_SECRET = "sk_live_secret_12345";

// After
char* MASTER_PASSWORD = NULL;
char* API_SECRET = NULL;

__attribute__((constructor)) static void _init_encrypted_strings(void) {
    MASTER_PASSWORD = _xor_decrypt((const unsigned char[]){0xCA,0xCF,...}, 14, 0xAB);
    API_SECRET = _xor_decrypt((const unsigned char[]){0x9E,0x86,...}, 20, 0xED);
}

Usage:

python3 -m cli.obfuscate compile source.c \
  --level 1 \
  --string-encryption

Verification:

# Before string encryption
$ strings binary | grep password
AdminPass2024!          ← EXPOSED

# After string encryption
$ strings binary | grep password
(no output)             ← HIDDEN

Results:

  • String hiding: 100% (all secrets hidden)
  • Binary size increase: ~6%
  • Secrets in strings output: 0
  • RE difficulty: 5-10x baseline

Usage Examples

Example 1: Standard Production Binary (with UPX)

Use Case: Web application backend, moderate security requirements

python3 -m cli.obfuscate compile src/auth.c \
  --output ./release \
  --level 2 \
  --string-encryption \
  --enable-symbol-obfuscation \
  --enable-upx \
  --custom-flags "-flto -fvisibility=hidden -O3 -fno-builtin -flto=thin -fomit-frame-pointer -mspeculative-load-hardening -O1 -Wl,-s" \
  --report-formats "json,html"

Results:

  • Compilation time: ~2 seconds
  • Binary size: 18 KB (from 16 KB, only +12% instead of +119%!)
  • Symbol count: 1 (-95%)
  • Secrets visible: 0
  • Overhead: ~5-10% + 10-50ms startup
  • RE difficulty: 4-8 weeks (vs 2-4 hours)

Example 2: High-Security Financial Application

Use Case: Payment processing, regulatory compliance required

python3 -m cli.obfuscate compile src/payment.c \
  --output ./secure \
  --level 4 \
  --enable-symbol-obfuscation \
  --symbol-algorithm blake2b \
  --symbol-hash-length 16 \
  --symbol-salt "$(openssl rand -hex 32)" \
  --string-encryption \
  --custom-flags "-flto -fvisibility=hidden -O3 -fno-builtin -flto=thin -fomit-frame-pointer -mspeculative-load-hardening -O1 -Wl,-s" \
  --report-formats "json,html,markdown"

Results:

  • Symbol count: 1
  • Hash collisions: 0 (16-char Blake2b)
  • Secrets visible: 0
  • Obfuscation score: 80-85/100
  • RE difficulty: 8-12 weeks

Example 3: Ultra-Critical (Military/IP Protection)

Use Case: Proprietary algorithm, maximum protection required

python3 -m cli.obfuscate compile src/proprietary.c \
  --output ./maximum \
  --level 5 \
  --enable-symbol-obfuscation \
  --enable-flattening \
  --enable-bogus-cf \
  --string-encryption \
  --fake-loops 10 \
  --cycles 1 \
  --custom-flags "-flto -fvisibility=hidden -O2 -fno-builtin" \
  --custom-pass-plugin /path/to/LLVMObfuscationPlugin.dylib \
  --report-formats "json,html,markdown"

Important: Use -O2 instead of -O3 to preserve OLLVM obfuscation!

Results:

  • Symbol count: 2
  • Entropy: 1.08 (+180%)
  • Secrets visible: 0
  • Obfuscation score: 90-95/100
  • Overhead: ~25-30%
  • RE difficulty: 12+ weeks

Example 4: Batch Processing Multiple Files

Create batch_config.yaml:

jobs:
  - source: "src/auth.c"
    output: "./obfuscated/auth"
    config:
      level: 3
      string_encryption: true
      enable_symbol_obfuscation: true

  - source: "src/license.cpp"
    output: "./obfuscated/license"
    config:
      level: 4
      enable_flattening: true
      enable_bogus_cf: true
      string_encryption: true

  - source: "src/api.c"
    output: "./obfuscated/api"
    config:
      level: 3
      string_encryption: true

Run batch:

python3 -m cli.obfuscate batch batch_config.yaml

CLI Reference

Commands

python -m cli.obfuscate [COMMAND] [OPTIONS]

Available Commands:

  • compile - Obfuscate and compile a source file
  • analyze - Analyze an existing binary for obfuscation level
  • compare - Compare baseline vs obfuscated binaries
  • batch - Process multiple files from config file

Options

Core Options

--level <1-5>              Obfuscation level (1=minimal, 5=maximum)
--output <path>            Output directory
--report-formats <list>    Generate reports (json,html,markdown)

Layer 1: Symbol Obfuscation

--enable-symbol-obfuscation       Enable cryptographic symbol renaming
--symbol-algorithm <algo>         Hash algorithm (sha256, blake2b, siphash)
--symbol-hash-length <n>          Hash truncation length (default: 12)
--symbol-prefix <style>           Prefix style (typed, single, none)
--symbol-salt <string>            Custom salt for deterministic builds

Layer 4: Compiler Flags

--custom-flags "<flags>"   Custom compiler flags (overrides defaults)

Layer 3: OLLVM Passes

--enable-flattening              Enable control flow flattening
--enable-substitution            Enable instruction substitution
--enable-bogus-cf                Enable bogus control flow
--enable-split                   Enable basic block splitting
--custom-pass-plugin <path>      Path to OLLVM plugin (required for Layer 2)
--cycles <n>                     Number of obfuscation cycles (1-5)

Layer 2: String Encryption

--string-encryption              Enable string literal encryption
--fake-loops <n>                 Number of fake loops to insert (0-50)

Layer 5: UPX Binary Packing

--enable-upx                     Enable UPX binary packing (compression + obfuscation)
--upx-compression <level>        Compression level (fast, default, best, brute)
--upx-lzma / --no-upx-lzma      Use LZMA compression (default: true)
--upx-preserve-original          Keep backup of pre-UPX binary

Configuration File

--config-file <path>     Load settings from YAML file

Examples

Simple obfuscation:

python3 -m cli.obfuscate compile source.c --level 2 --string-encryption

Maximum security:

python3 -m cli.obfuscate compile source.c \
  --level 5 \
  --enable-symbol-obfuscation \
  --enable-flattening \
  --enable-bogus-cf \
  --string-encryption \
  --custom-pass-plugin /path/to/plugin.dylib

Analyze existing binary:

python3 -m cli.obfuscate analyze ./binary --output report.json

Compare binaries:

python3 -m cli.obfuscate compare ./baseline ./obfuscated --output diff.html

Best Practices

Security Guidelines

  1. Always enable string encryption if binary contains secrets

    • Passwords, API keys, encryption keys
    • Database credentials, internal URLs
    • License keys, proprietary algorithms
  2. Use strong salts for symbol obfuscation

    --symbol-salt "$(openssl rand -hex 32)"
  3. Store mapping files securely

    • Required for debugging crashes
    • Don't distribute with binaries
    • Backup in secure location
  4. Test obfuscated binaries thoroughly

    • Functional tests (same inputs → same outputs)
    • Performance tests (acceptable overhead)
    • Security tests (secrets not visible)
  5. Use Layer 2 sparingly

    • Only for ultra-critical code
    • Adds 15-30% overhead
    • Modern optimizations reduce effectiveness

Build Integration

Makefile Integration

# Release build with obfuscation
release:
	python3 -m cli.obfuscate compile src/main.c \
		--output ./release \
		--level 3 \
		--string-encryption \
		--enable-symbol-obfuscation \
		--custom-flags "-flto -fvisibility=hidden -O3"

# Development build (no obfuscation)
dev:
	gcc -O0 -g src/main.c -o ./dev/main

CMake Integration

# Add custom target for obfuscated build
add_custom_target(obfuscate
    COMMAND python3 -m cli.obfuscate compile ${CMAKE_SOURCE_DIR}/src/main.c
        --output ${CMAKE_BINARY_DIR}/obfuscated
        --level 3
        --string-encryption
        --enable-symbol-obfuscation
    DEPENDS ${CMAKE_SOURCE_DIR}/src/main.c
)

CI/CD Integration

# .github/workflows/release.yml
name: Release Build

on:
  push:
    tags: ['v*']

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install dependencies
        run: |
          pip install -r requirements.txt

      - name: Build obfuscated binary
        run: |
          python3 -m cli.obfuscate compile src/main.c \
            --level 3 \
            --string-encryption \
            --enable-symbol-obfuscation \
            --output ./release

      - name: Verify obfuscation
        run: |
          # Check symbols
          nm release/main | grep -v ' U ' | wc -l

          # Check secrets
          ! strings release/main | grep -iE "password|secret"

      - name: Upload release
        uses: actions/upload-artifact@v3
        with:
          name: obfuscated-binary
          path: release/main

MLIR-Based Obfuscation (Advanced)

In addition to the 4-layer system, the obfuscator now supports MLIR-based obfuscation for advanced control-flow and constant transformations.

Two Pipeline Options

1. Default Pipeline (CLANG) - Stable, Production-Ready

python3 -m cli.obfuscate compile source.c \
  --enable-string-encrypt \
  --enable-crypto-hash \
  --output ./output

2. ClangIR Pipeline (CLANGIR) - Advanced C/C++ Semantics

python3 -m cli.obfuscate compile source.c \
  --mlir-frontend clangir \
  --enable-string-encrypt \
  --enable-crypto-hash \
  --enable-constant-obfuscate \
  --output ./output

MLIR Passes Available

Pass Description Flag
String Encryption XOR-based string literal encryption --enable-string-encrypt
Symbol Obfuscation Random hex symbol names (RNG-based) --enable-symbol-obfuscate
Crypto Hash Cryptographic symbol hashing (SHA256/BLAKE2B/SipHash) --enable-crypto-hash
Constant Obfuscation Obfuscate ALL constants (int/float/string/array) --enable-constant-obfuscate

ClangIR Advantages

Preserves high-level C/C++ semantics for better obfuscation quality ✅ More accurate control flow analysis at MLIR level ✅ Native LLVM 22 support - officially part of LLVM project ✅ Better type preservation compared to lowered LLVM IR ✅ Future-proof - active development by LLVM community

Documentation


Research & Testing

Comprehensive Testing (42 Configurations)

We tested 42 different obfuscation configurations to validate effectiveness:

Key Findings:

  1. Modern LLVM optimizations destroy OLLVM obfuscation

    • O1: -41% entropy reduction
    • O3: -30% entropy reduction
    • Substitution pass almost completely destroyed
  2. Layer 1 alone > OLLVM passes

    • Layer 1: 1 symbol, 2% overhead
    • OLLVM + O3: 28 symbols, 10% overhead
    • Layer 1 is MORE effective and FASTER
  3. String encryption is mandatory

    • 100% of binaries without Layer 3 exposed secrets
    • Compiler obfuscation does NOT hide strings
  4. Pass ordering matters significantly

    • Best: split → substitution → boguscf → flattening
    • Worst: flattening → boguscf → substitution → split
    • Difference: 68% entropy variation

Full Research Report: See OBFUSCATION_COMPLETE.md

Test Data

Location: /Users/akashsingh/Desktop/llvm/test_results/
- comprehensive_metrics.csv (42 test configurations)
- binaries/ (42 compiled binaries)
- ir/ (LLVM IR transformation stages)

Obfuscation Effectiveness Ranking

Rank Configuration Symbols Entropy Overhead Score
1 Layer 1 + OLLVM 2 1.0862 15% 0.762
2 Layer 1 only 1 0.8092 2% 0.758
3 OLLVM + LTO 8 1.1302 10% 0.626
4 LTO only 8 0.7679 1% 0.590

Winner: Layer 1 only (best benefit/cost ratio)


API Documentation

RESTful API (Optional)

The obfuscator includes an optional FastAPI backend for web-based obfuscation.

Start server:

cd cmd/llvm-obfuscator
uvicorn api.main:app --reload

API Endpoints:

POST /obfuscate

Obfuscate source code

Request:

{
  "source_code": "int main() { return 0; }",
  "language": "c",
  "level": 3,
  "string_encryption": true,
  "symbol_obfuscation": true
}

Response:

{
  "status": "success",
  "binary_url": "https://api.example.com/download/abc123",
  "obfuscation_score": 75.0,
  "metrics": {
    "symbols_count": 1,
    "functions_count": 1,
    "strings_encrypted": 5,
    "entropy": 0.8092
  }
}

POST /analyze

Analyze existing binary

Request:

{
  "binary_url": "https://storage.example.com/binary"
}

Response:

{
  "status": "success",
  "analysis": {
    "symbol_count": 15,
    "function_count": 8,
    "entropy": 0.6474,
    "estimated_re_difficulty": "2-4 hours"
  }
}

Deployment

Production Deployment

Option 1: Docker Container

# Build image
docker build -t llvm-obfuscator:latest .

# Run container
docker run -p 8000:8000 llvm-obfuscator:latest

Option 2: Kubernetes

# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llvm-obfuscator
spec:
  replicas: 3
  selector:
    matchLabels:
      app: llvm-obfuscator
  template:
    metadata:
      labels:
        app: llvm-obfuscator
    spec:
      containers:
      - name: api
        image: llvm-obfuscator:latest
        ports:
        - containerPort: 8000

Option 3: Serverless (AWS Lambda)

# Package for Lambda
./package_lambda.sh

# Deploy with AWS CLI
aws lambda create-function \
  --function-name llvm-obfuscator \
  --runtime provided.al2 \
  --handler bootstrap \
  --zip-file fileb://function.zip

Full Deployment Guide: See AUTOMATED_DEPLOYMENT.md


Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Development Setup

# Clone repository
git clone https://github.com/your-org/llvm-obfuscator.git
cd llvm-obfuscator

# Install development dependencies
pip install -r requirements-dev.txt

# Run tests
pytest

# Run linters
flake8 cmd/llvm-obfuscator
mypy cmd/llvm-obfuscator

Areas for Contribution

  • OLLVM plugin packaging - Distribute pre-built plugins
  • IDE integration - VSCode/CLion extensions
  • Additional hash algorithms - xxHash, BLAKE3
  • Control flow analysis - Detect weak obfuscation
  • Commercial obfuscator comparison - Benchmark vs Tigress, VMProtect

License

MIT License - See LICENSE file for details.


Acknowledgments

  • OLLVM Project - Control flow obfuscation passes
  • LLVM Community - Compiler infrastructuree
  • Research Contributors - Comprehensive testing and validation

Support & Contact


Project Status: Production Ready ✅ Last Updated: 2025-10-13 Version: 1.0.0

About

Obfuscation As A Service

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 6