Data Fabrication is a WASM evaluation module for AI agents on the Bittensor network via platform. Miners submit Python harnesses that generate conversation datasets. The WASM module runs inside platform validators to validate submissions, evaluate generated datasets, and compute scores. A companion native CLI (data-cli) provides a TUI for monitoring leaderboards, evaluation progress, and network health. A native server library (server/) implements the ServerChallenge trait for running challenge logic outside the WASM sandbox.
data-fabrication/
├── Cargo.toml # workspace with members = [".", "core", "wasm", "server", "cli", "executor"]
├── src/
│ └── lib.rs # Root library crate entry point
├── wasm/
│ ├── Cargo.toml # cdylib, depends on platform-challenge-sdk-wasm
│ └── src/
│ ├── lib.rs # Challenge impl + register_challenge!
│ └── types.rs # Submission, ChallengeParams, etc.
├── core/
│ ├── Cargo.toml # Shared types and traits (feature-gated: std, alloc)
│ └── src/
│ ├── lib.rs # Root module with conditional exports
│ ├── schema.rs # JSONL parsing, conversation schema
│ ├── config.rs # EvaluationConfig, HarnessExecutionConfig
│ ├── scoring_types.rs # DatasetScore, ConversationScore, LlmEvaluationScore
│ ├── error.rs # DataFabricationError (std only)
│ ├── sandbox.rs # Python execution sandbox (std only)
│ ├── ast_validation.rs # Python AST validation (std only)
│ ├── ast_similarity.rs # AST structural comparison (std only)
│ ├── llm_client.rs # LLM client trait + implementations (std only)
│ ├── consensus.rs # P2P consensus helpers (std only)
│ ├── cache.rs # Evaluation result cache (std only)
│ └── resource_limits.rs # Unix resource limits (std + unix only)
├── server/
│ ├── Cargo.toml # lib + bin, depends on platform-challenge-sdk (server mode)
│ └── src/
│ ├── lib.rs # ServerChallenge implementation
│ ├── main.rs # Binary entry point
│ └── types.rs # Server-specific types
├── executor/
│ ├── Cargo.toml # Native binary for harness execution
│ └── src/
│ ├── lib.rs # Re-exports executor and LLM inference
│ ├── executor.rs # PythonExecutor, ExecutionResult
│ ├── llm_inference.rs # LLM inference with retry, plagiarism detection
│ └── error.rs # ExecutorError, ExecutorResult
└── cli/
├── Cargo.toml # Native binary, ratatui TUI
└── src/
├── main.rs # Entry point, event loop
├── lib.rs # Library exports
└── ui.rs # Ratatui UI rendering
| Package | Depends On | Purpose |
|---|---|---|
wasm |
core (no-default-features) |
WASM module for validators |
server |
core (std), platform-challenge-sdk |
Native server mode |
executor |
core (std) |
Native harness execution engine |
cli |
— (standalone) | TUI monitoring tool |
| Flag | Description | Dependencies |
|---|---|---|
default |
["alloc"] |
— |
alloc |
No-std alloc support | — |
std |
Full std support | thiserror, serde/std, serde_json/std, rustpython-parser, tempfile, log |
http-client |
HTTP LLM client | reqwest, tokio |
# Build WASM module (for validators)
cargo build --target wasm32-unknown-unknown -p data-fabrication-wasm
# Build WASM module (release)
cargo build --release --target wasm32-unknown-unknown -p data-fabrication-wasm
# Build native release (all packages)
cargo build --release
# Build specific package
cargo build --release -p data-fabrication-core
cargo build --release -p data-fabrication-server
cargo build --release -p data-executor
cargo build --release -p data-cli
# Check without building
cargo check -p data-fabrication-wasm
cargo check -p data-fabrication-core --features std# Run all tests (workspace, excluding WASM)
cargo test --workspace --exclude data-fabrication-wasm
# Run tests for specific package
cargo test -p data-fabrication-core
cargo test -p data-executor
# Run specific test
cargo test -p data-fabrication-core test_harness_submission_serialization
# Run tests with all features
cargo test -p data-fabrication-core --all-features/// Miner's submission with Python harness
pub struct HarnessSubmission {
pub hotkey: String,
pub epoch: u64,
pub code_hash: String,
pub package: Vec<u8>,
}
/// Generated dataset from harness execution
pub struct GeneratedDataset {
pub conversations: Vec<ConversationEntry>,
pub metadata: DatasetMetadata,
pub generation_time_ms: u64,
}DatasetScore— Overall dataset quality scoreConversationScore— Per-conversation quality metricsLlmEvaluationScore— LLM-based evaluation resultsCriteriaScores— Multi-criteria scoring breakdown
pub struct EvaluationConfig {
pub memory_limit_bytes: u64,
pub timeout_seconds: u64,
pub max_conversation_count: u64,
pub max_dataset_size_bytes: u64,
}
pub struct HarnessExecutionConfig {
pub python_path: String,
pub working_directory: String,
pub env_vars: HashMap<String, String>,
}- Create the module file in
core/src/ - Add module declaration in
core/src/lib.rs:- Without std:
pub mod my_module; - With std only:
#[cfg(feature = "std")] pub mod my_module; - With std + unix:
#[cfg(all(feature = "std", unix))] pub mod my_module;
- Without std:
- Add any new dependencies to
core/Cargo.tomlwith appropriate feature gates
The WASM module (wasm/src/lib.rs) must remain #![no_std] compatible:
- Use
alloc::collections:alloc::vec::Vec,alloc::string::String - Use
bincodewithdefault-features = false - Dependencies must also be
no_stdcompatible - Cannot access filesystem, network, or threads directly
- Use host functions for external I/O (provided by platform-challenge-sdk-wasm)
| Test Type | Location | Command |
|---|---|---|
| Unit tests | core/src/*.rs |
cargo test -p data-fabrication-core |
| Integration tests | core/tests/*.rs |
cargo test -p data-fabrication-core |
| Executor tests | executor/tests/*.rs |
cargo test -p data-executor |
| CLI tests | cli/tests/*.rs |
cargo test -p data-cli |
# Format all code
cargo fmt --all
# Check formatting
cargo fmt --all -- --check
# Run clippy
cargo clippy --workspace --exclude data-fabrication-wasm -- -D warnings
# WASM clippy (requires target)
cargo clippy -p data-fabrication-wasm --target wasm32-unknown-unknown -- -D warningsThe data-cli crate is a native binary that provides a terminal user interface for monitoring the data-fabrication network.
- Framework: Built with ratatui for TUI rendering
- Transport: Connects to validators via HTTP JSON-RPC
- Target: Standard
x86_64/aarch64native targets (not WASM)
| Key | Action |
|---|---|
Tab / Shift+Tab |
Switch between tabs |
r |
Refresh data |
q |
Quit |
- No
stdin WASM code. Thewasm/module compiles with#![no_std]. Usealloc::equivalents. - Core uses feature gates. All std-dependent code checks
#[cfg(feature = "std")]. - WASM uses
no-default-features. When depending on core in WASM, usedefault-features = false, features = ["alloc"]. - No
.unwrap()or.expect()in library paths. Use pattern matching orunwrap_or_default(). - Keep WASM minimal. Do not add heavy dependencies to the WASM crate.
- Do NOT break the WASM ABI. The
register_challenge!macro and Challenge trait must remain compatible.
Note: The
cli/,executor/, andserver/crates are native code exempt from theno_stdrules. They use fullstdfeatures.
- Use
alloc::types in WASM code (alloc::vec::Vec,alloc::string::String) - Use
serdewithdefault-features = false, features = ["derive", "alloc"]in WASM - Use
bincodewithdefault-features = falsefor WASM serialization - Feature-gate std-dependent imports in core with
#[cfg(feature = "std")] - Keep the
register_challenge!macro ABI intact
- Do NOT use
std::,println!,std::collections::HashMapin WASM code - Do NOT add heavy dependencies to the WASM crate
- Do NOT break the Challenge trait interface
- Do NOT use
#[allow(dead_code)]broadly — fix or remove unused code
The data-executor crate handles native harness execution and LLM inference:
| Module | Purpose |
|---|---|
executor.rs |
Python execution in sandboxed environment |
llm_inference.rs |
LLM API calls with retry logic, plagiarism detection |
error.rs |
Error types for execution failures |
- Sandboxed execution: Python harness runs with resource limits
- LLM inference: Quality evaluation via external LLM APIs
- Plagiarism detection: AST-based similarity comparison
- Retry logic: Automatic retry on transient failures
# Build executor
cargo build -p data-executor
# Build executor (release)
cargo build --release -p data-executor
# Run executor tests
cargo test -p data-executorThe data-fabrication-server crate implements ServerChallenge trait:
- Uses
platform-challenge-sdkfor server mode - Storage via sled key-value store
- HTTP routes via axum
- Reuses
coretypes withstdfeature
# Build server
cargo build -p data-fabrication-server
# Run server (requires configuration)
cargo run -p data-fabrication-server