Research-grade framework for end-to-end selective verifiability with explicit soundness guarantees.
Complete results are already present in the result/criterion folder. To view the paper results directly, use the extraction scripts in the scripts/ directory (see the Extracting Results Tables section below).
Reproducing the entire project from scratch takes roughly 2-4 hours, so plan accordingly. This includes:
- Running all tests: ~60-90 minutes
- Running all benchmarks: ~120-150 minutes
- Additional setup and compilation time
Only run the full reproduction if you need to verify results or generate new data. For viewing existing results, use the extraction scripts instead.
Rust Toolchain:
- Rust 1.56.0 or later (Rust edition 2021)
- Cargo (comes with Rust)
- Install from: https://rustup.rs/
- Verify installation:
rustc --version and cargo --version
Operating System:
- Windows: Windows 10/11 (PowerShell 5.1+ or PowerShell 7+)
- Linux: Most modern distributions (Ubuntu 20.04+, Debian 11+, etc.)
- macOS: macOS 10.15+ (Catalina or later)
Memory:
- Minimum: 8 GB RAM
- Recommended: 16 GB RAM or more
- For low-memory systems, use CARGO_BUILD_JOBS=1 to limit parallel compilation
Disk Space:
- Build artifacts: ~2-5 GB (in the target/ directory)
- Cargo cache: ~1-3 GB (global cache)
- Project + results: ~500 MB - 1 GB
- Total recommended: At least 10 GB free space
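Before a full build, it can help to check free space and the size of any existing artifacts from the repository root; this is a minimal sketch for Unix-like shells (on Windows, Get-PSDrive and Get-ChildItem serve the same purpose):

```shell
# Free space on the drive holding the repository
df -h .

# Size of existing build artifacts and the global Cargo cache, if present
du -sh target 2>/dev/null
du -sh "$HOME/.cargo" 2>/dev/null
```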
For viewing results (without running benchmarks):
- Windows: PowerShell 5.1+ or PowerShell 7+ (for extraction scripts)
- Unix/Linux/macOS: Bash shell (standard on most systems)
For post-quantum cryptography features (enabled by default):
- No additional system dependencies required
- All cryptographic libraries are pure Rust or have Rust bindings
For network features (beacon clients):
- Internet connection (for NIST Beacon and drand randomness sources)
- Feature flag: beacon-clients (disabled by default)
Windows:
- Uses PowerShell scripts (.ps1 files) for result extraction
- May need to set the execution policy: Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
- Or use: pwsh -ExecutionPolicy Bypass -File scripts/extract_*.ps1
- Some post-quantum features are optimized for Windows compatibility
Linux/macOS:
- Uses standard bash shell
- All scripts should work out of the box
- May need to make scripts executable:
chmod +x scripts/*.sh
After installing Rust, verify your setup:
# Check Rust version (should be 1.56.0+)
rustc --version
# Check Cargo version
cargo --version
# Verify you can build the project
cargo check

# Build
cargo build --release
# Run tests (MUST use --test-threads=1 to prevent crashes)
cargo test --release -- --test-threads=1
# Or use helper script
.\scripts\test.ps1 # Windows
./scripts/test.sh # Unix/Linux/macOS

cargo build --release # Build optimized
cargo check # Type-check only
cargo clean # Clean artifacts

Always use --test-threads=1 or tests will crash!
# All tests (recommended)
cargo test --release -- --test-threads=1
# With resource limits (for low-memory systems)
$env:CARGO_BUILD_JOBS="1"; cargo test --release -- --test-threads=1 # Windows
CARGO_BUILD_JOBS=1 cargo test --release -- --test-threads=1 # Unix/Linux/macOS
# Specific test suites (most common)
cargo test --release --test config -- --test-threads=1 # Config (~1 min)
cargo test --release --lib -- --test-threads=1 # Unit tests (~5-10 min)
cargo test --release --lib core:: -- --test-threads=1 # Core unit tests (~5-10 min)
cargo test --release --lib apps:: -- --test-threads=1 # Apps unit tests (~5-10 min)
cargo test --release --lib analysis:: -- --test-threads=1 # Analysis unit tests (~5-10 min)
cargo test --release --test integration -- --test-threads=1 # Integration (~2 hours)
cargo test --release --test integration core:: -- --test-threads=1 # Core integration tests
cargo test --release --test integration apps:: -- --test-threads=1 # Apps integration tests
cargo test --release --test integration analysis:: -- --test-threads=1 # Analysis integration tests
# Specific test by name
cargo test --release --lib test_name -- --test-threads=1 # Unit test by name
cargo test --release --test integration test_name -- --test-threads=1 # Integration test by name

cargo run --example basic_usage
cargo run --example threshold_sum
cargo run --example linear_risk

cargo run --release

cargo fmt
cargo clippy --release

cargo clean
.\scripts\cleanup.ps1

Run all 7 benchmarks to get all results needed for your paper:
# Windows:
$env:SVG_EXPORTS="1"; cargo bench --bench end_to_end_bench
$env:SVG_EXPORTS="1"; cargo bench --bench evaluation_bench
$env:SVG_EXPORTS="1"; cargo bench --bench cost_bench
$env:SVG_EXPORTS="1"; cargo bench --bench crypto_bench
$env:SVG_EXPORTS="1"; cargo bench --bench memory_bench
$env:SVG_EXPORTS="1"; cargo bench --bench core_bench
$env:SVG_EXPORTS="1"; cargo bench --bench ivc_spine_bench
# Unix/Linux/macOS:
SVG_EXPORTS=1 cargo bench --bench end_to_end_bench
SVG_EXPORTS=1 cargo bench --bench evaluation_bench
SVG_EXPORTS=1 cargo bench --bench cost_bench
SVG_EXPORTS=1 cargo bench --bench crypto_bench
SVG_EXPORTS=1 cargo bench --bench memory_bench
SVG_EXPORTS=1 cargo bench --bench core_bench
SVG_EXPORTS=1 cargo bench --bench ivc_spine_bench

What you get:
- result/end_to_end_bench/comprehensive_e2e_size_{N}_topo_{T}_strat_{S}.json - Full vs selective verification costs, speedups (19x), cost reductions (95%)
- result/end_to_end_bench/program_a_threshold_sum.json - Program A (Threshold Sum) results
- result/end_to_end_bench/program_b_linear_risk.json - Program B (Linear Risk Score) results
- result/end_to_end_bench/matrix_multiplication.json - Matrix multiplication results
- result/coverage_validation_*.json - Coverage statistics
- result/soundness_monte_carlo_*.json - Soundness error analysis
- result/ablation_study_*.json - Ablation study results
- result/full_scalability_size_*.json - Full verification scalability
- result/selective_scalability_size_*.json - Selective verification scalability
- result/selective_vs_full_comparison_size_*.json - Direct comparison
- result/cost_model_validation_*.json - Cost model validation (actual vs estimated)
- result/stark_proof_*.json - STARK proof generation, verification, sizes (from crypto_bench)
- result/*_memory_*.json - Memory footprint measurements (from memory_bench)
- result/subset_selection.json, result/submodular_optimization.json, result/freivalds_verification.json - Core operations (from core_bench)
- result/mock_ivc_spine.json, result/stark_ivc_spine.json - IVC/PCD spines (from ivc_spine_bench)
- Plus Criterion HTML reports in result/criterion/
Time: ~120-150 minutes total (all benchmarks)
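To spot-check any exported file before running the extraction scripts, pretty-printing it with Python's standard-library json.tool is enough; the filename below is one of the exports listed above (yours may differ by configuration):

```shell
# Pretty-print one exported result file for manual inspection
python3 -m json.tool result/end_to_end_bench/program_a_threshold_sum.json
```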
All 7 benchmarks support SVG_EXPORTS=1 and generate detailed JSON files:
# Windows:
$env:SVG_EXPORTS="1"; cargo bench --bench end_to_end_bench # Main contribution: speedup ratios, cost comparisons
$env:SVG_EXPORTS="1"; cargo bench --bench evaluation_bench # Security validation: tamper detection, soundness analysis
$env:SVG_EXPORTS="1"; cargo bench --bench cost_bench # Cost analysis: scalability, cost model validation
$env:SVG_EXPORTS="1"; cargo bench --bench crypto_bench # STARK proof times, proof sizes
$env:SVG_EXPORTS="1"; cargo bench --bench memory_bench # Memory footprint comparison
$env:SVG_EXPORTS="1"; cargo bench --bench core_bench # Subset selection, Freivalds verification
$env:SVG_EXPORTS="1"; cargo bench --bench ivc_spine_bench # IVC/PCD spines
# Unix/Linux/macOS:
SVG_EXPORTS=1 cargo bench --bench end_to_end_bench
SVG_EXPORTS=1 cargo bench --bench evaluation_bench
SVG_EXPORTS=1 cargo bench --bench cost_bench
SVG_EXPORTS=1 cargo bench --bench crypto_bench
SVG_EXPORTS=1 cargo bench --bench memory_bench
SVG_EXPORTS=1 cargo bench --bench core_bench
SVG_EXPORTS=1 cargo bench --bench ivc_spine_bench

Output:
- end_to_end_bench:
  - result/end_to_end_bench/comprehensive_e2e_size_{N}_topo_{T}_strat_{S}.json - Comprehensive comparison (verification costs, speedups, cost reductions)
  - result/end_to_end_bench/program_a_threshold_sum.json - Program A results
  - result/end_to_end_bench/program_b_linear_risk.json - Program B results
  - result/end_to_end_bench/matrix_multiplication.json - Matrix multiplication results
- evaluation_bench: multiple JSON files (coverage, soundness, ablation study)
- cost_bench: scalability files (full_scalability_*.json, selective_scalability_*.json), comparison (selective_vs_full_comparison_*.json), cost model validation (cost_model_validation_*.json)
- crypto_bench: result/stark_proof_generation.json, result/stark_proof_verification.json, result/stark_proof_sizes.json
- memory_bench: result/graph_memory_usage.json, result/he_ciphertext_memory.json, result/stark_proof_memory.json, result/selective_verifier_memory.json
- core_bench: result/subset_selection.json, result/submodular_optimization.json, result/freivalds_verification.json
- ivc_spine_bench: result/mock_ivc_spine.json, result/stark_ivc_spine.json (if pq-zk enabled)
- All benchmarks also generate Criterion HTML reports in result/criterion/
Run all benchmarks (essential + supporting):
# Windows:
# Essential (with JSON exports)
$env:SVG_EXPORTS="1"; cargo bench --bench end_to_end_bench
$env:SVG_EXPORTS="1"; cargo bench --bench evaluation_bench
$env:SVG_EXPORTS="1"; cargo bench --bench cost_bench
# Supporting (with JSON exports)
$env:SVG_EXPORTS="1"; cargo bench --bench crypto_bench
$env:SVG_EXPORTS="1"; cargo bench --bench memory_bench
# Optional (with JSON exports)
$env:SVG_EXPORTS="1"; cargo bench --bench core_bench
$env:SVG_EXPORTS="1"; cargo bench --bench ivc_spine_bench
# Unix/Linux/macOS:
# Essential (with JSON exports)
SVG_EXPORTS=1 cargo bench --bench end_to_end_bench
SVG_EXPORTS=1 cargo bench --bench evaluation_bench
SVG_EXPORTS=1 cargo bench --bench cost_bench
# Supporting (with JSON exports)
SVG_EXPORTS=1 cargo bench --bench crypto_bench
SVG_EXPORTS=1 cargo bench --bench memory_bench
# Optional (with JSON exports)
SVG_EXPORTS=1 cargo bench --bench core_bench
SVG_EXPORTS=1 cargo bench --bench ivc_spine_bench

Time: ~2-3 hours total
All benchmarks generate (Criterion):
- HTML reports: result/criterion/<benchmark_name>/**/report/index.html
- Execution times: result/criterion/<benchmark_name>/**/estimates.json
- Available for ALL benchmarks
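When only the raw numbers are needed, the mean point estimates can be pulled straight from the estimates.json files; this sketch assumes Criterion's usual layout, where estimates.json carries a mean.point_estimate field in nanoseconds:

```shell
# Print the mean execution time from every estimates.json under result/criterion/
find result/criterion -name estimates.json | while read -r f; do
  python3 -c 'import json,sys; d=json.load(open(sys.argv[1])); print(sys.argv[1], d["mean"]["point_estimate"], "ns")' "$f"
done
```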
Only end_to_end_bench and evaluation_bench generate (when SVG_EXPORTS=1):
- Custom JSON exports:
  - result/end_to_end_bench/comprehensive_e2e_size_{N}_topo_{T}_strat_{S}.json - Comprehensive E2E results
  - result/end_to_end_bench/program_a_threshold_sum.json - Program A results
  - result/end_to_end_bench/program_b_linear_risk.json - Program B results
  - result/end_to_end_bench/matrix_multiplication.json - Matrix multiplication results
  - result/coverage_validation_*.json, etc. - Evaluation benchmark results
- These contain the actual verification costs/metrics for paper analysis
- Important: SVG_EXPORTS=1 generates BOTH Criterion outputs AND custom JSON exports
When to use SVG_EXPORTS=1:
- ✅ end_to_end_bench - generates custom JSON exports
- ✅ evaluation_bench - generates custom JSON exports
- ❌ Other benchmarks - SVG_EXPORTS=1 has no effect (only Criterion HTML/JSON)
After running benchmarks with SVG_EXPORTS=1, your result/ directory will contain:
result/
├── end_to_end_bench/ # End-to-end benchmark outputs
│ ├── comprehensive_e2e_size_10_topo_chain_strat_sequential.json
│ ├── comprehensive_e2e_size_10_topo_chain_strat_entropy.json
│ ├── comprehensive_e2e_size_10_topo_tree_strat_sequential.json
│ ├── comprehensive_e2e_size_10_topo_tree_strat_entropy.json
│ ├── comprehensive_e2e_size_25_topo_chain_strat_sequential.json
│ ├── comprehensive_e2e_size_25_topo_chain_strat_entropy.json
│ ├── comprehensive_e2e_size_25_topo_tree_strat_sequential.json
│ ├── comprehensive_e2e_size_25_topo_tree_strat_entropy.json
│ ├── program_a_threshold_sum.json
│ ├── program_b_linear_risk.json
│ └── matrix_multiplication.json
├── coverage_validation_*.json # Evaluation benchmark outputs
├── soundness_monte_carlo_*.json
├── ablation_study_*.json
├── full_scalability_size_*.json # Cost benchmark outputs
├── selective_scalability_size_*.json
├── selective_vs_full_comparison_size_*.json
├── cost_model_validation_*.json
├── stark_proof_*.json # Crypto benchmark outputs
├── *_memory_*.json # Memory benchmark outputs
├── subset_selection.json # Core benchmark outputs
├── submodular_optimization.json
├── freivalds_verification.json
├── mock_ivc_spine.json # IVC/PCD benchmark outputs
├── stark_ivc_spine.json
└── criterion/ # Criterion HTML reports (all benchmarks)
├── ComprehensiveE2E/
├── ProgramA_ThresholdSum/
├── ProgramB_LinearRiskScore/
├── MatrixMultiplicationExperiment/
└── ... (other benchmark groups)
Note: The exact number of comprehensive_e2e_*.json files depends on your benchmark configuration (graph sizes, topologies, strategies).
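A quick way to confirm how many configurations a run produced is to count the exported files (assuming the default result/ directory):

```shell
# Count comprehensive E2E exports from the last benchmark run
ls result/end_to_end_bench/comprehensive_e2e_*.json 2>/dev/null | wc -l
```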
Generate End-to-End results table:
# Windows (bypasses execution policy):
pwsh -ExecutionPolicy Bypass -File scripts/extract_e2e_table.ps1
# Save to file:
pwsh -ExecutionPolicy Bypass -File scripts/extract_e2e_table.ps1 | Out-File -FilePath result/e2e_table.txt
# Alternative (if execution policy allows):
.\scripts\extract_e2e_table.ps1

This script reads all result/end_to_end_bench/comprehensive_e2e_*.json files and generates a formatted table showing:
- Full vs Selective verification costs
- Speedup factors (7x-19x)
- Cost reduction percentages (85%-95%)
Generate Evaluation benchmark results tables:
# All evaluation tables (coverage, soundness, ablation, baselines):
pwsh -ExecutionPolicy Bypass -File scripts/extract_evaluation_tables.ps1
# Save to file:
pwsh -ExecutionPolicy Bypass -File scripts/extract_evaluation_tables.ps1 | Out-File -FilePath result/evaluation_tables.txt

Generate Cost benchmark results tables:
# All cost tables (scalability, comparison, validation):
pwsh -ExecutionPolicy Bypass -File scripts/extract_cost_tables.ps1

Generate Cryptographic benchmark results tables:
# STARK proof performance tables:
pwsh -ExecutionPolicy Bypass -File scripts/extract_crypto_tables.ps1

Generate Memory benchmark results tables:
# Memory footprint tables:
pwsh -ExecutionPolicy Bypass -File scripts/extract_memory_tables.ps1

Generate Core operations benchmark results tables:
# Core operations tables (subset selection, optimization, Freivalds):
pwsh -ExecutionPolicy Bypass -File scripts/extract_core_tables.ps1

Generate IVC/PCD spine benchmark results tables:
# IVC/PCD spine tables:
pwsh -ExecutionPolicy Bypass -File scripts/extract_ivc_tables.ps1

Generate ALL benchmark results tables:
# All tables from all benchmarks (use this script for the paper results):
pwsh -ExecutionPolicy Bypass -File scripts/extract_all_tables.ps1
# Save to file:
pwsh -ExecutionPolicy Bypass -File scripts/extract_all_tables.ps1 | Out-File -FilePath result/all_tables.txt

Extraction scripts (7 scripts, one per benchmark):
- extract_e2e_table.ps1 - End-to-End benchmark
- extract_evaluation_tables.ps1 - Evaluation benchmark (coverage, soundness, ablation, baselines)
- extract_cost_tables.ps1 - Cost benchmark (scalability, comparison, validation)
- extract_crypto_tables.ps1 - Cryptographic benchmark (STARK proof performance)
- extract_memory_tables.ps1 - Memory benchmark (memory footprint)
- extract_core_tables.ps1 - Core operations benchmark (subset selection, optimization, Freivalds)
- extract_ivc_tables.ps1 - IVC/PCD spine benchmark
Convenience script:
- extract_all_tables.ps1 - Runs all 7 scripts and combines output
# Save baseline for comparison
cargo bench --bench <benchmark_name> -- --save-baseline default
# Compare against baseline
cargo bench --bench <benchmark_name> -- --baseline default

- Format: size_{N}_topo_{T}_strat_{S}
- Topologies (topo):
  - topo_0 = Chain (linear) - sequential dependencies, O(n) depth
  - topo_1 = Tree (hierarchical) - binary tree structure, O(log n) depth
  - topo_2 = Star (hub-and-spoke) - all inputs connect to a center, O(1) depth (not used in current benchmarks)
  - topo_3+ = Branching - multiple parallel branches converging (not used in current benchmarks)
- Strategies (strat):
  - strat_0 = Sequential (first-N) - simple deterministic selection, fastest
  - strat_1 = EntropyRegularized (randomized) - privacy-preserving sampling with temperature control
  - strat_2 = Greedy - cost-optimized selection with (1-1/e) approximation (not used in current benchmarks)
  - strat_3 = LPRelaxed - LP relaxation with randomized rounding (not used in current benchmarks)
- See docs/BENCHMARK_PARAMETERS.md for detailed explanations
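For ad-hoc scripting over these files, the parameters can be recovered from a filename with plain shell parameter expansion; the filename here is a hypothetical example following the scheme above:

```shell
# Recover N, T, S from a size_{N}_topo_{T}_strat_{S} filename
f="comprehensive_e2e_size_25_topo_tree_strat_entropy.json"
name="${f%.json}"                            # drop the extension
size="${name#*_size_}";  size="${size%%_*}"  # N -> 25
topo="${name#*_topo_}";  topo="${topo%%_*}"  # T -> tree
strat="${name#*_strat_}"                     # S -> entropy
echo "size=$size topo=$topo strat=$strat"
```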
Environment variables:
SVG_EXPORTS=1 # Enable custom JSON exports (for end_to_end_bench and evaluation_bench only)
# Generates BOTH Criterion HTML/JSON AND custom JSON export files
SVG_ABLATION_DISABLE_RANDOMIZATION=1
SVG_ABLATION_DISABLE_SPINE=1
SVG_ABLATION_DISABLE_ALGEBRAIC=1
RESULT_DIR=result # Set custom JSON export directory (default: "result")
# Criterion outputs always go to result/criterion/

Reduce Memory Usage:
# Limit compilation jobs (prevents OOM during build)
$env:CARGO_BUILD_JOBS="1" # Windows - set before cargo commands
export CARGO_BUILD_JOBS=1 # Unix/Linux/macOS
# Clean before building (removes old artifacts)
cargo clean
# Move target directory to different drive
$env:CARGO_TARGET_DIR="D:\rust-target" # Windows
export CARGO_TARGET_DIR="/mnt/d/rust-target" # Unix/Linux

Space Saving:
# Clean build artifacts regularly
cargo clean # Frees ~2-5 GB
# Clean Cargo global cache (all Rust projects)
$cargoHome = "$env:USERPROFILE\.cargo"
Remove-Item -Recurse -Force "$cargoHome\registry\cache" -ErrorAction SilentlyContinue

Memory errors: Always use --test-threads=1 + CARGO_BUILD_JOBS=1
Tests slow: Normal with sequential execution (~60-90 min total)
Tests hanging: Press Ctrl+C, check --test-threads=1 flag
High disk usage: Run cargo clean regularly, move target/ to different drive