Skip to content

Latest commit

 

History

History
310 lines (224 loc) · 9.29 KB

File metadata and controls

310 lines (224 loc) · 9.29 KB

Contributing to IDB Utils

Thank you for your interest in contributing to innodb-utils (inno). This guide covers development setup, architecture, conventions, and the contribution workflow.

Development Setup

Requirements: Rust 1.70+ (2021 edition), cargo, optionally Node.js for the web UI.

# Clone and build
git clone https://github.com/ringo380/idb-utils.git
cd idb-utils
cargo build

# Run tests (unit + integration + doc-tests)
cargo test

# Lint (zero warnings enforced)
cargo clippy -- -D warnings

# Format check
cargo fmt --check

# Build optimized binary
cargo build --release

# Build with MySQL query support (requires mysql_async + tokio)
cargo build --release --features mysql

# Security audit (requires cargo-audit)
cargo audit

WASM Build

# Install wasm-pack if needed
cargo install wasm-pack

# Build WASM package
wasm-pack build --release --target web --no-default-features

# Quick check without full build
cargo check --target wasm32-unknown-unknown --no-default-features

Web UI Development

cd web
npm ci
npm run dev    # Start dev server
npm run build  # Production build

Code Architecture

The project is organized into three layers plus a web frontend:

Binary (inno)

The CLI entry point lives in src/main.rs. It parses arguments with clap derive macros and dispatches to the appropriate subcommand module. CLI definitions (Cli, Commands, ColorMode) are in src/cli/app.rs, shared between main.rs and build.rs via include!().

Library (idb)

Core InnoDB parsing logic in src/innodb/:

  • tablespace.rs -- File I/O, page size detection, page iteration
  • page.rs -- FIL header/trailer parsing, page type identification
  • checksum.rs -- CRC-32C, legacy InnoDB, and MariaDB full_crc32 algorithms
  • sdi.rs -- SDI metadata extraction (MySQL 8.0+), multi-page zlib decompression
  • log.rs -- Redo log file parsing and block analysis
  • record.rs -- Record parsing within INDEX pages
  • compression.rs -- Compressed page handling
  • encryption.rs -- Encrypted tablespace detection
  • vendor.rs -- MySQL/Percona/MariaDB identification from FSP flags and redo headers
  • constants.rs -- InnoDB constants matching MySQL source names

WASM

src/wasm.rs provides thin wrapper functions over the library layer, returning JSON strings via wasm-bindgen. Built with --no-default-features to exclude filesystem and MySQL dependencies.

Web UI

web/ contains a Vite + Tailwind SPA. Components live in web/src/components/, shared utilities in web/src/utils/.

Module Organization

src/
  main.rs              # Binary entry point, subcommand dispatch
  lib.rs               # Library crate root
  cli/
    app.rs             # Cli, Commands, ColorMode (shared with build.rs)
    mod.rs             # wprintln!/wprint! macros, progress bar helper
    parse.rs           # inno parse
    pages.rs           # inno pages
    dump.rs            # inno dump
    checksum.rs        # inno checksum
    diff.rs            # inno diff
    corrupt.rs         # inno corrupt
    recover.rs         # inno recover
    find.rs            # inno find
    tsid.rs            # inno tsid
    sdi.rs             # inno sdi
    log.rs             # inno log
    info.rs            # inno info
    watch.rs           # inno watch
  innodb/              # Core parsing library
  util/
    hex.rs             # Hex dump formatting
    mysql.rs           # MySQL connection (feature-gated)
    fs.rs              # Directory traversal helpers
  wasm.rs              # WASM bindings

Key Patterns

Writer Pattern

Every subcommand accepts a writer for testability:

pub fn execute(opts: &Options, writer: &mut dyn Write) -> Result<(), IdbError> {
    // ...
}

In main.rs, this is called with &mut std::io::stdout(). In tests, use Vec<u8> as the writer to capture output.

Output Macros

wprintln! and wprint! are wrappers around writeln!/write! that convert std::io::Error into IdbError:

wprintln!(writer, "Page {}: type={}", page_num, page_type);

Clap Derive

Each subcommand defines an Options struct with clap derive attributes:

#[derive(Parser)]
pub struct Options {
    /// Path to the .ibd file
    #[arg(short, long)]
    pub file: PathBuf,

    /// Output as JSON
    #[arg(long)]
    pub json: bool,
}

Binary Parsing

All binary parsing uses byteorder::BigEndian for InnoDB's big-endian format:

use byteorder::{BigEndian, ReadBytesExt};
let page_number = cursor.read_u32::<BigEndian>()?;

Error Handling

A single IdbError enum (defined with thiserror) covers all error cases:

  • Io -- wraps std::io::Error
  • Parse -- invalid data, unexpected values
  • Argument -- invalid CLI arguments or combinations

Constants

Constants use UPPERCASE_WITH_UNDERSCORES and match MySQL/InnoDB source names exactly (from fil0fil.h, page0page.h, fsp0fsp.h, etc.):

pub const FIL_PAGE_DATA: usize = 38;
pub const FIL_PAGE_INDEX: u16 = 17855;

Adding a New Subcommand

  1. Create the module: Add src/cli/newcmd.rs with an Options struct (clap derive) and a pub fn execute(opts: &Options, writer: &mut dyn Write) -> Result<(), IdbError> function.

  2. Add the command variant: In src/cli/app.rs, add a variant to the Commands enum:

    /// Description of what the command does
    Newcmd(cli::newcmd::Options),
  3. Add dispatch: In src/main.rs, add a match arm:

    Commands::Newcmd(opts) => cli::newcmd::execute(&opts, &mut writer),
  4. Register the module: Add pub mod newcmd; to src/cli/mod.rs.

  5. Write unit tests: Add a #[cfg(test)] module at the bottom of src/cli/newcmd.rs.

  6. Write integration tests: Add test files in tests/ that exercise the command end-to-end.

  7. Update CHANGELOG.md: Document the new subcommand under the appropriate version heading.

Testing

Unit Tests

Inline #[cfg(test)] modules in each source file. Use the writer pattern to capture and assert on output:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_something() {
        let mut output = Vec::new();
        execute(&opts, &mut output).unwrap();
        let result = String::from_utf8(output).unwrap();
        assert!(result.contains("expected text"));
    }
}

Integration Tests

Located in tests/. Build synthetic .ibd files using tempfile, byteorder, and CRC-32C checksums to create valid test data without requiring real database files.

Required Checks

# All of these must pass before submitting a PR
cargo test
cargo clippy -- -D warnings
cargo fmt --check

# Non-blocking but recommended
cargo audit

# WASM compatibility
cargo check --target wasm32-unknown-unknown --no-default-features

# Web UI build
cd web && npm run build

Testing Gotchas

  • Tablespace and LogFile do not derive Debug (they hold Box<dyn ReadSeek>). You cannot use .unwrap_err() in tests; use match instead.
  • Always test against real-world .ibd files from multiple MySQL versions when possible, not just synthetic data.

CI/CD

Three GitHub Actions workflows:

ci.yml (on push/PR)

  • Format check (cargo fmt --check)
  • Tests (cargo test)
  • Clippy lint (cargo clippy -- -D warnings)
  • Security audit (cargo audit, non-blocking)
  • Build on Ubuntu and macOS
  • Build with MySQL feature (cargo build --features mysql)
  • WASM compatibility check (cargo check --target wasm32-unknown-unknown --no-default-features)

pages.yml (on push to master)

  • Builds the WASM package
  • Builds the web UI
  • Deploys to GitHub Pages

release.yml (on v* tag push)

  • Builds optimized binaries for 4 targets:
    • x86_64-unknown-linux-gnu
    • aarch64-unknown-linux-gnu
    • x86_64-apple-darwin
    • aarch64-apple-darwin
  • Creates a GitHub release with the built artifacts
  • Dispatches a repository_dispatch event to the Homebrew tap repo (ringo380/homebrew-tap) to update the formula

Pull Request Guidelines

  • Keep PRs focused on a single change or feature.
  • Ensure all CI checks pass (tests, clippy, format).
  • Add tests for new functionality -- both unit and integration where appropriate.
  • Update CHANGELOG.md with a description of the change.
  • Keep commits clean and well-described.
  • All subcommands must support --json output via #[derive(Serialize)] structs.

Release Process

  1. Bump version in Cargo.toml.
  2. Update CHANGELOG.md with the new version and changes.
  3. Commit the version bump.
  4. Push to master.
  5. Create and push a tag: git tag vX.Y.Z && git push origin vX.Y.Z.
  6. GitHub Actions builds binaries, creates the release, and updates the Homebrew tap automatically.
  7. Publish to crates.io: cargo publish --allow-dirty (Cargo.lock may change from the build).

Code Style

  • Constants: UPPERCASE_WITH_UNDERSCORES, matching InnoDB/MySQL source names exactly.
  • Subcommand pattern: Options struct (clap derive) + execute() function per subcommand.
  • Zero warnings: cargo clippy -- -D warnings must pass with no warnings.
  • Edition: Rust 2021.
  • JSON output: All output structs derive Serialize. Use #[serde(skip_serializing_if = "...")] for optional fields.
  • Error handling: Use the IdbError enum. Do not introduce new error types.
  • Formatting: Run cargo fmt before committing.