A research-grade polyhedral compiler for optimizing loop nests
Features • Installation • Quick Start • Examples • Commands • Theory
PolyOpt is a polyhedral compiler that automatically optimizes loop nests for:
- Parallelism: Identify and exploit parallel loops
- Data Locality: Improve cache utilization through tiling
- Vectorization: Enable SIMD optimizations
It takes programs written in a simple .poly language and generates optimized C code with OpenMP annotations.
┌─────────────┐ ┌──────────────┐ ┌────────────────┐ ┌─────────────┐
│ .poly file │ ──► │ Parse & IR │ ──► │ Dependence │ ──► │ Transform │
│ │ │ Lowering │ │ Analysis │ │ & Schedule │
└─────────────┘ └──────────────┘ └────────────────┘ └──────┬──────┘
│
▼
┌─────────────┐
│ Generate │
│ C/OpenMP │
└─────────────┘
- Frontend: Custom parser for a simple loop-focused language
- Analysis:
- Dependence analysis (GCD test, Banerjee test; see the sketch after this list)
- Parallelism detection
- Loop-carried vs loop-independent dependencies
- Transformations:
- Loop tiling (blocking) for cache optimization
- Loop interchange for better memory access
- Loop fusion and distribution
- Automatic scheduling (Pluto, Feautrier, Greedy algorithms)
- Code Generation:
- Clean C output
- OpenMP parallelization pragmas
- Vectorization hints
- Benchmark code generation
- Advanced Features (optional):
- ISL Integration: Exact polyhedral math via Integer Set Library
- Auto-Tuning: Automatic parameter optimization
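To make the dependence-analysis bullet above concrete, here is a minimal sketch of the textbook GCD test (an illustration, not PolyOpt's actual implementation). For subscripts `a*i + b` (write) and `c*j + d` (read), the accesses can only touch the same element if `gcd(a, c)` divides `d - b`:

```c
#include <stdio.h>
#include <stdlib.h>

// Greatest common divisor (Euclid's algorithm).
static int gcd(int a, int b) {
    while (b != 0) { int t = b; b = a % b; a = t; }
    return abs(a);
}

// GCD test: can A[a*i + b] (write) and A[c*j + d] (read) ever touch
// the same element? a*i - c*j = d - b has an integer solution iff
// gcd(a, c) divides d - b. The test is conservative: "maybe" only
// means a dependence cannot be ruled out.
static int gcd_test_may_depend(int a, int b, int c, int d) {
    int g = gcd(a, c);
    if (g == 0) return b == d;       // both subscripts are constants
    return (d - b) % g == 0;
}

int main(void) {
    // A[2*i] vs A[2*i + 1]: gcd(2, 2) = 2 does not divide 1,
    // so the accesses are provably independent.
    printf("%d\n", gcd_test_may_depend(2, 0, 2, 1)); // 0 (independent)
    printf("%d\n", gcd_test_may_depend(1, 0, 1, 1)); // 1 (may depend)
    return 0;
}
```

The test can prove independence but never confirm a dependence; a "maybe" answer is refined by stronger tests such as Banerjee's.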
PolyOpt can use the Integer Set Library (ISL) for exact polyhedral computations. ISL is the same library used by LLVM's Polly and GCC's Graphite.
Installation:
# macOS
brew install isl
# Ubuntu/Debian
sudo apt install libisl-dev isl-utils
# Fedora
sudo dnf install isl-devel
Build with ISL support:
cargo build --release --features isl
Usage:
# Check ISL installation
polyopt isl input.poly --check
# Analyze domains in ISL format
polyopt isl input.poly --domain
# Compute optimal schedule
polyopt isl input.poly --schedule
# Show dependences in ISL format
polyopt isl input.poly --deps
Example output:
=== ISL Analysis: matmul ===
--- Domains (ISL format) ---
S0:
{ [i0,i1,i2] : 0 <= i0 < N and 0 <= i1 < M and 0 <= i2 < K }
ISL canonical: [N, M, K] -> { [i0, i1, i2] : 0 <= i0 < N and 0 <= i1 < M and 0 <= i2 < K }
PolyOpt includes an auto-tuning framework that finds optimal transformation parameters by empirical search.
Build with auto-tuning:
cargo build --release --features autotuning
Usage:
# Basic auto-tuning (60 second budget)
polyopt autotune examples/matmul.poly -N 1000
# Quick mode (30 seconds)
polyopt autotune examples/matmul.poly --quick
# Custom search parameters
polyopt autotune examples/matmul.poly \
--time 120 \
--tiles "16,32,64,128" \
--threads "1,2,4,8" \
--search genetic
# Save results to CSV
polyopt autotune examples/matmul.poly -o results.csv
Search strategies:
- `grid`: Exhaustive search (default; best for small spaces)
- `random`: Random sampling (good for large spaces)
- `genetic`: Genetic algorithm (good for complex optimization)
Example output:
=== Auto-Tuning: matmul.poly ===
Configuration:
Max time: 60s
Tile sizes: [16, 32, 64, 128]
Thread counts: [1, 2, 4, 8]
Search: Grid
Baseline: 0.4521s
[ 1] t16_p1 -> 0.3892s (1.16x)
[ 2] t32_p1 -> 0.3456s (1.31x)
[ 3] t32_p4 -> 0.1234s (3.66x)
*** New best: t32_p4 (3.66x speedup)
[ 4] t64_p4 -> 0.0987s (4.58x)
*** New best: t64_p4 (4.58x speedup)
...
=== Best Configuration Found ===
Tile sizes: [64]
Threads: 4
Speedup: 4.58x
To generate optimized code:
polyopt compile matmul.poly --openmp --tile 64
- Rust 1.70+ (install from rustup.rs)
- GCC or Clang (for compiling generated code)
# Clone the repository
git clone https://github.com/yourusername/polyopt.git
cd polyopt
# Build in release mode
cargo build --release
# Run tests
cargo test
# Install (optional)
cargo install --path .
# Check version
cargo run --bin polyopt -- --version
# See help
cargo run --bin polyopt -- --help
Create a file hello.poly:
// Vector addition: C[i] = A[i] + B[i]
func vector_add(A[N], B[N], C[N]) {
for i = 0 to N {
C[i] = A[i] + B[i];
}
}
cargo run --bin polyopt -- compile hello.poly
Output:
// Generated by PolyOpt - Polyhedral Optimizer
#include <stdio.h>
#include <stdlib.h>
...
void vector_add(int N, double* A, double* B, double* C) {
// Statement S0: S0
for (int i = 0; i < N; i++) {
C[i] = (A[i] + B[i]);
}
}
# With OpenMP parallelization
cargo run --bin polyopt -- compile hello.poly --openmp
# Analyze dependences and parallelism
cargo run --bin polyopt -- analyze hello.poly --deps --parallel
# Visualize the iteration space
cargo run --bin polyvis -- hello.poly --params N=8
// Comments use // or /* */
func function_name(ArrayA[N][M], ArrayB[K], scalar_param) {
// For loops
for i = start to end {
// Statements
ArrayA[i][j] = expression;
}
// For loops with step
for i = 0 to N step 2 {
...
}
}
// 1D array with symbolic size N
A[N]
// 2D array with sizes N x M
B[N][M]
// 3D array
C[N][M][K]
// Arithmetic
A[i] + B[j]
A[i] * 2.5
A[i] / B[j]
A[i] - B[j]
// Array access with affine indices
A[i]
B[i][j]
C[i+1][j-1]
C[2*i][j]
// Constants
0
1.5
3.14159
// Simple bounds
for i = 0 to N { ... }
// Offset bounds
for i = 1 to N-1 { ... }
// Triangular bounds (planned)
for j = 0 to i { ... }
Compile a .poly file to C code.
# Basic compilation
polyopt compile input.poly
# Output to file
polyopt compile input.poly -o output.c
# With OpenMP
polyopt compile input.poly --openmp
# With vectorization hints
polyopt compile input.poly --openmp --vectorize
# Generate benchmark wrapper
polyopt compile input.poly --benchmark
Options:
| Option | Description |
|---|---|
| `-o, --output` | Output file (default: stdout) |
| `-t, --target` | Target: `c`, `openmp`, `cuda`, `opencl` |
| `--openmp` | Enable OpenMP parallelization |
| `--vectorize` | Add SIMD pragmas |
| `--benchmark` | Generate timing code |
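For reference, the SIMD pragmas behind `--vectorize` typically look like the following in OpenMP-annotated C (a hand-written illustration; PolyOpt's actual output may differ):

```c
// Element-wise addition with both thread-level and SIMD parallelism.
// "simd" asks the compiler to vectorize the loop body; combining it
// with "parallel for" also distributes iterations across threads.
void vector_add_simd(int N, const double *A, const double *B, double *C) {
    #pragma omp parallel for simd
    for (int i = 0; i < N; i++) {
        C[i] = A[i] + B[i];
    }
}
```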
Analyze dependencies and parallelism opportunities.
# Basic analysis
polyopt analyze input.poly
# Show dependencies
polyopt analyze input.poly --deps
# Show parallelism
polyopt analyze input.poly --parallel
# Full analysis
polyopt analyze input.poly --deps --parallel --stats
# JSON output
polyopt analyze input.poly --json
Example Output:
=== Analysis: matmul ===
Statements: 2
Parameters: ["N", "K", "M"]
Arrays: ["A", "C", "B"]
--- Dependences ---
Total: 6
Flow (RAW): 2
Anti (WAR): 2
Output (WAW): 2
Loop-carried: 6
Loop-independent: 0
--- Parallelism ---
Level 0: ✗ sequential
Level 1: ✗ sequential
Level 2: ✗ sequential
No directly parallelizable loops found
Consider applying transformations (tiling, skewing)
Apply optimizations and generate code.
# Auto-optimize with Pluto scheduler
polyopt optimize input.poly --schedule pluto
# Apply tiling
polyopt optimize input.poly --tile 32
# Combined optimizations
polyopt optimize input.poly --schedule pluto --tile 32 --openmp
# Output to file
polyopt optimize input.poly --tile 32 -o optimized.c
Options:
| Option | Description |
|---|---|
| `--tile SIZE` | Apply tiling with given block size |
| `--schedule ALG` | Scheduling: `pluto`, `feautrier`, `greedy` |
| `--openmp` | Enable OpenMP in output |
| `-o, --output` | Output file |
View the parsed representation of a program.
# Show AST
polyopt parse input.poly
# Show HIR (High-level IR)
polyopt parse input.poly --hir
# Show PIR (Polyhedral IR)
polyopt parse input.poly --pir
Compile and run benchmarks.
# Run with default size (N=1000)
polyopt bench input.poly
# Specify size
polyopt bench input.poly -N 2000
# Multiple iterations
polyopt bench input.poly -N 1000 --iterations 10
# With OpenMP
polyopt bench input.poly -N 1000 --openmp
Text-based visualization of iteration spaces and dependencies.
# Basic visualization
polyvis input.poly
# With parameter values
polyvis input.poly --params N=10,M=10
# Limit displayed iterations
polyvis input.poly --params N=20 --max-iters 100
# Verbose mode (show dependency details)
polyvis input.poly --params N=8 --verbose
Example Output:
╔══════════════════════════════════════════════════════════════════╗
║ PolyOpt Iteration Space Visualizer ║
╚══════════════════════════════════════════════════════════════════╝
Program: matmul
Parameters: N=8, K=8, M=8
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Statement S0 (2D domain):
Iteration space (8 x 8 = 64 points):
j→ 0 1 2 3 4 5 6 7
i↓
0 ● ● ● ● ● ● ● ●
1 ● ● ● ● ● ● ● ●
2 ● ● ● ● ● ● ● ●
...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Dependence Summary:
Flow (RAW): 2
Anti (WAR): 2
Output (WAW): 2
Parallelism: No parallel loops detected at any level
# File: examples/basic/vector_add.poly
polyopt analyze examples/basic/vector_add.poly --parallel
# Output: Level 0: ✓ PARALLEL
# File: examples/basic/array_copy.poly
polyopt compile examples/basic/array_copy.poly --openmp
# Analyze dependencies
polyopt analyze examples/linalg/matmul.poly --deps
# Generate tiled code
polyopt optimize examples/linalg/matmul.poly --tile 32 --openmp -o matmul_tiled.c
# Benchmark
polyopt bench examples/linalg/matmul.poly -N 512 --openmp
# File: examples/linalg/matvec.poly
polyopt compile examples/linalg/matvec.poly --openmp
# File: examples/linalg/transpose.poly
# Both loops parallelizable
polyopt analyze examples/linalg/transpose.poly --parallel
# File: examples/stencils/jacobi_2d.poly
# Reads from A, writes to B - no loop-carried dependencies
polyopt analyze examples/stencils/jacobi_2d.poly --parallel
polyopt compile examples/stencils/jacobi_2d.poly --openmp
# File: examples/stencils/gauss_seidel_2d.poly
# In-place update creates dependencies
polyopt analyze examples/stencils/gauss_seidel_2d.poly --deps
# Shows loop-carried dependencies - needs wavefront parallelization
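A hand-written wavefront version of such a sweep illustrates the idea (a C sketch of the skewing transformation, not PolyOpt output): all points on an anti-diagonal `i + j = d` are mutually independent, so each diagonal can run in parallel while the diagonals themselves execute in order.

```c
// Wavefront traversal of an in-place 5-point update: every point with
// the same d = i + j reads only from earlier diagonals, so the inner
// loop over one diagonal is safe to parallelize.
void gauss_seidel_wavefront(int N, double A[N][N]) {
    for (int d = 2; d <= 2 * (N - 2); d++) {     // anti-diagonals, in order
        #pragma omp parallel for
        for (int i = 1; i < N - 1; i++) {
            int j = d - i;
            if (j < 1 || j >= N - 1) continue;   // stay inside the interior
            A[i][j] = 0.25 * (A[i-1][j] + A[i+1][j]
                            + A[i][j-1] + A[i][j+1]);
        }
    }
}
```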
# File: examples/stencils/stencil_2d_5pt.poly
polyvis examples/stencils/stencil_2d_5pt.poly --params N=10
# File: examples/reductions/vector_sum.poly
polyopt compile examples/reductions/vector_sum.poly --openmp
# Generates: #pragma omp parallel for reduction(+:result)
# File: examples/reductions/dot_product.poly
polyopt compile examples/reductions/dot_product.poly --openmp
# File: examples/transformations/interchange_example.poly
# Original has poor cache behavior (column-major access)
polyopt analyze examples/transformations/interchange_example.poly --deps
polyopt optimize examples/transformations/interchange_example.poly --schedule pluto
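The effect of interchange is easiest to see in plain C (a hand-written before/after for illustration): swapping the loops turns the cache-hostile column walk into a row walk.

```c
// Before interchange: the inner loop strides down a column, touching
// a new cache line on every iteration (C arrays are row-major).
void scale_colmajor(int N, double A[N][N], double B[N][N]) {
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            B[i][j] = 2.0 * A[i][j];
}

// After interchange: the inner loop walks a row, so consecutive
// iterations touch consecutive memory locations.
void scale_rowmajor(int N, double A[N][N], double B[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            B[i][j] = 2.0 * A[i][j];
}
```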
# File: examples/transformations/tiling_example.poly
# Matrix multiply benefits from tiling
polyopt optimize examples/transformations/tiling_example.poly --tile 32 -o tiled.c
# File: examples/transformations/fusion_example.poly
# Multiple loops over same iteration space
polyopt analyze examples/transformations/fusion_example.poly --deps
Create mykernel.poly:
func saxpy(X[N], Y[N]) {
for i = 0 to N {
Y[i] = 2.0 * X[i] + Y[i];
}
}
$ polyopt analyze mykernel.poly --deps --parallel
=== Analysis: saxpy ===
Statements: 1
Parameters: ["N"]
Arrays: ["X", "Y"]
--- Dependences ---
Total: 0
Flow (RAW): 0
Anti (WAR): 0
Output (WAW): 0
--- Parallelism ---
Level 0: ✓ PARALLEL
Recommendation: Parallelize at level 0
$ polyvis mykernel.poly --params N=10
Statement S0 (1D domain):
Iteration space (10 points):
i→ 0 1 2 3 4 5 6 7 8 9
● ● ● ● ● ● ● ● ● ●
Parallelism: Parallel at level 0
$ polyopt compile mykernel.poly --openmp
void saxpy(int N, double* restrict X, double* restrict Y) {
// Statement S0: S0
#pragma omp parallel for
for (int i = 0; i < N; i++) {
Y[i] = ((2.0 * X[i]) + Y[i]);
}
}
$ polyopt bench mykernel.poly -N 10000000 --iterations 5 --openmp
=== Benchmark: saxpy ===
Size: N = 10000000
Iterations: 5
OpenMP: enabled
Running 5 iterations...
Run 1: Time: 0.0234 seconds
Run 2: Time: 0.0231 seconds
Run 3: Time: 0.0229 seconds
Run 4: Time: 0.0232 seconds
Run 5: Time: 0.0230 seconds
The polyhedral model represents loop nests as geometric objects:
- Iteration Space: The set of all loop iterations, represented as integer points in a polyhedron
- Schedule: A mapping that determines execution order
- Access Functions: Affine functions describing array accesses
| Traditional Compiler | Polyhedral Compiler |
|---|---|
| Pattern matching | Mathematical reasoning |
| Limited scope | Global optimization |
| Heuristic | Exact dependence analysis |
Original code:
for (i = 0; i < N; i++)
for (j = 0; j < N; j++)
for (k = 0; k < N; k++)
C[i][j] += A[i][k] * B[k][j];
Polyhedral representation:
- Domain: `{ S[i,j,k] : 0 ≤ i,j,k < N }`
- Schedule: `{ S[i,j,k] → [i,j,k] }` (original order)
- Read accesses: `{ S[i,j,k] → A[i,k] }`, `{ S[i,j,k] → B[k,j] }`, `{ S[i,j,k] → C[i,j] }`
- Write access: `{ S[i,j,k] → C[i,j] }`
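From these relations, dependences follow mechanically: the write to `C[i][j]` in iteration `(i,j,k)` conflicts with the read of `C[i][j]` in every later iteration sharing `i` and `j`, a loop-carried flow dependence along `k`, written in the same notation:

```
{ S[i,j,k] -> S[i,j,k'] : 0 <= i,j,k,k' < N and k' > k }
```

Exact dependence analysis tightens this to the last writer, `k' = k + 1`.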
| Type | Also Called | Pattern | Example |
|---|---|---|---|
| Flow | RAW (Read After Write) | Write → Read | A[i] = ...; ... = A[i]; |
| Anti | WAR (Write After Read) | Read → Write | ... = A[i]; A[i] = ...; |
| Output | WAW (Write After Write) | Write → Write | A[i] = ...; A[i] = ...; |
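A dependence is loop-carried when its source and sink lie in different iterations, and loop-independent when both lie in the same iteration; only the former serializes the loop. A minimal C illustration:

```c
void carried_vs_independent(int N, double *A, double *B, double *C) {
    // Loop-carried flow dependence: iteration i reads what iteration
    // i-1 wrote, so iterations must execute in order.
    for (int i = 1; i < N; i++)
        A[i] = A[i - 1] + 1.0;

    // Loop-independent flow dependence: the write to B[i] and the read
    // of B[i] happen in the same iteration, so this loop is parallel.
    for (int i = 0; i < N; i++) {
        B[i] = 2.0 * A[i];
        C[i] = B[i] + 1.0;
    }
}
```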
| Transformation | Effect | When to Use |
|---|---|---|
| Tiling | Block loops for cache (see the sketch below) | Large working sets |
| Interchange | Swap loop order | Improve memory access pattern |
| Fusion | Merge loops | Improve temporal locality |
| Distribution | Split loops | Enable parallelization |
| Skewing | Shear the iteration space | Enable wavefront parallelism |
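To picture the tiling row, here is a hand-written tiled matrix multiply with 32x32 blocks, the shape of code `--tile 32` aims for (illustration only, not actual PolyOpt output):

```c
#define T 32  // tile (block) size

// Tiled matrix multiply: each (ii, jj, kk) step works on T x T
// subblocks that fit in cache, so elements of A and B are reused
// many times before being evicted.
void matmul_tiled(int N, double A[N][N], double B[N][N], double C[N][N]) {
    for (int ii = 0; ii < N; ii += T)
        for (int jj = 0; jj < N; jj += T)
            for (int kk = 0; kk < N; kk += T)
                for (int i = ii; i < ii + T && i < N; i++)
                    for (int j = jj; j < jj + T && j < N; j++)
                        for (int k = kk; k < kk + T && k < N; k++)
                            C[i][j] += A[i][k] * B[k][j];
}
```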
examples/
├── basic/
│ ├── vector_add.poly # Simple parallel loop
│ ├── vector_scale.poly # Scalar multiplication
│ ├── array_copy.poly # 2D parallel copy
│ └── array_init.poly # Index-dependent init
│
├── linalg/
│ ├── matmul.poly # Matrix multiplication
│ ├── matvec.poly # Matrix-vector multiply
│ ├── transpose.poly # Matrix transpose
│ ├── trisolve.poly # Triangular solve
│ ├── symm.poly # Symmetric matrix multiply
│ ├── atax.poly # A^T * A * x kernel
│ └── bicg.poly # Biconjugate gradient
│
├── stencils/
│ ├── stencil_1d.poly # 1D 3-point stencil
│ ├── stencil_2d_5pt.poly # 2D 5-point stencil
│ ├── stencil_2d_9pt.poly # 2D 9-point stencil
│ ├── jacobi_2d.poly # Jacobi iteration (parallel)
│ ├── gauss_seidel_2d.poly # Gauss-Seidel (sequential)
│ └── conv2d.poly # 2D convolution
│
├── reductions/
│ ├── vector_sum.poly # Sum reduction
│ ├── dot_product.poly # Dot product
│ ├── frobenius_norm.poly # Matrix norm
│ └── prefix_sum.poly # Prefix sum (sequential)
│
├── parallel/
│ ├── embarrassingly_parallel.poly # Perfect parallelism
│ └── rowsum.poly # Parallel outer, reduction inner
│
└── transformations/
├── interchange_example.poly # Loop interchange demo
├── tiling_example.poly # Tiling demo
├── fusion_example.poly # Loop fusion demo
└── distribution_example.poly # Loop distribution demo
- Use affine bounds: `0 to N`, `1 to N-1`, not `0 to f(x)`
- Use affine indices: `A[i][j]`, `A[i+1][j-1]`, not `A[B[i]]`
- Separate read/write arrays when possible (enables more parallelism)
- Use compound assignment for reductions: `sum = sum + x`
- Analyze first: Understand dependencies before optimizing
- Check parallelism: Use `--parallel` to find parallel loops
- Consider tiling: Large iteration spaces benefit from tiling
- Benchmark: Always verify optimizations improve performance
| Pattern | Parallelizable? | Optimization |
|---|---|---|
| Element-wise ops | ✓ Yes | OpenMP parallel for |
| Stencil (double-buffer) | ✓ Yes | Tiling + OpenMP |
| Stencil (in-place) | ✗ No | Wavefront/skewing |
| Reduction | Partial | OpenMP reduction clause |
| Sequential scan | ✗ No | Algorithm change needed |
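On the reduction row: a plain `#pragma omp parallel for` over a sum races on the accumulator, which is why reductions are only partially parallelizable. The OpenMP reduction clause, which PolyOpt emits for recognized reduction patterns (see the vector_sum example above), gives each thread a private accumulator and combines them at the end. A minimal hand-written version:

```c
// Parallel sum reduction: each thread accumulates into a private copy
// of "sum"; OpenMP combines the copies with "+" when the loop ends.
double vector_sum(int N, const double *A) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += A[i];
    return sum;
}
```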
The variable wasn't declared as a loop iterator or parameter.
// Wrong - T not defined
for i = 0 to T { ... }
// Right - T is a parameter
func foo(A[T]) {
for i = 0 to T { ... }
}
Check for loop-carried dependencies:
polyopt analyze input.poly --deps
Common causes:
- In-place array updates: `A[i] = f(A[i-1])`
- Reduction without proper pattern
- Complex index expressions
Ensure indices match array dimensions:
// Wrong - A is 1D but accessed as 2D
func foo(A[N]) {
for i = 0 to N {
A[i][j] = 0; // Error!
}
}
PolyOpt includes a polyhedral math module with both a pure Rust simulation and optional native ISL support.
# Parse and analyze an ISL set expression directly
polyopt isl "{ [i, j] : 0 <= i < 10 and 0 <= j < 10 }" --expr --enumerate
# Analyze a .poly file's domains
polyopt isl examples/basic/array_copy.poly --domain
# Generate tiled loop code
polyopt isl "{ [i, j] : 0 <= i < N and 0 <= j < N }" --expr --tile 32
# Show scheduling info
polyopt isl examples/linalg/matmul.poly --schedule
Example output:
=== ISL Polyhedral Analysis ===
Using: Pure Rust polyhedral simulation
(Install ISL for exact operations: brew install isl)
Set: { [i, j] : i >= 0 and -i + 9 >= 0 and j >= 0 and -j + 9 >= 0 }
Variables: ["i", "j"]
Constraints: 4
--- Point Enumeration ---
0: ([0, 0])
1: ([0, 1])
...
Total points: 100
For exact polyhedral operations, install ISL:
# macOS
brew install isl
# Ubuntu/Debian
sudo apt install libisl-dev isl-utils
# Fedora
sudo dnf install isl-devel
The auto-tuner automatically searches for optimal transformation parameters by compiling and benchmarking multiple variants.
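Under the hood, each candidate configuration is compiled, run, and timed. Conceptually the measurement looks like this minimal C harness (a sketch of the idea; the harness PolyOpt actually generates may differ):

```c
#include <time.h>

// Time one run of a kernel variant. The auto-tuner compiles each
// candidate (tile size, thread count, ...), measures it like this a
// few times, and keeps the fastest configuration.
double time_variant(void (*kernel)(void)) {
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    kernel();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
}
```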
# Basic auto-tuning with OpenMP
polyopt autotune input.poly --openmp
# Specify tile sizes to try
polyopt autotune input.poly --tiles "8,16,32,64,128" --openmp
# Use different search strategies
polyopt autotune input.poly --strategy exhaustive # Try all combinations
polyopt autotune input.poly --strategy random # Random sampling
polyopt autotune input.poly --strategy genetic # Genetic algorithm
polyopt autotune input.poly --strategy annealing # Simulated annealing
# Customize benchmark parameters
polyopt autotune input.poly -N 1000 --iterations 5 --openmp
# Save results to file
polyopt autotune input.poly -o results.csv --openmp
| Strategy | Description | Use When |
|---|---|---|
| `exhaustive` | Tests all combinations | Small search space, need optimal |
| `random` | Random sampling (20 samples) | Quick exploration |
| `genetic` | Evolutionary optimization | Large search space |
| `annealing` | Simulated annealing | Avoid local minima |
Contributions welcome! Areas of interest:
- Additional scheduling algorithms
- GPU code generation (CUDA/OpenCL)
- More transformation passes
- Performance benchmarking
MIT License - see LICENSE file
PolyOpt is inspired by:
- ISL - Integer Set Library
- Pluto - Automatic parallelizer
- Polly - LLVM polyhedral optimizer
- PolyBench - Polyhedral benchmark suite
Made with ❤️ for the HPC community