Data residue scanner — detect if specific column values persist anywhere in tablespace pages #175

@ringo380

Summary

Scan all pages in a tablespace (data, free space, undo, LOB) for byte patterns matching specific column values to verify complete data deletion.

Implementation

Module: src/innodb/compliance.rs (new)

Approach:

  • Accept configurable search patterns (string literals, regexes, format matchers such as email addresses, etc.)
  • Scan every byte of every page, including free space and garbage regions
  • Report exact page number + byte offset of each match
  • Classify match location: active record, free space, undo segment, LOB data, FIL header
  • Performance: scan pages in parallel with rayon, with an early-exit option that stops at the first match
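
The core of the approach above can be sketched as a literal byte search over fixed-size pages. This is only an illustration under stated assumptions, not the proposed implementation: `scan_pages`, its parameters, and the tuple return type are hypothetical names, only literal patterns are handled (no regex), and the loop is sequential where the real scanner would use rayon. Matches that straddle a page boundary are not detected.

```rust
/// Hypothetical helper (not part of the proposed API): search every byte of
/// every `page_size`-sized page in `data` for the literal `needle`.
/// Returns (page number, byte offset within that page) for each match.
/// With `first_only` set, returns as soon as one match is found (early exit).
pub fn scan_pages(
    data: &[u8],
    page_size: usize,
    needle: &[u8],
    first_only: bool,
) -> Vec<(u32, u32)> {
    let mut hits = Vec::new();
    // `chunks(0)` and `windows(0)` would panic, so reject degenerate input.
    if needle.is_empty() || page_size == 0 {
        return hits;
    }
    for (page_no, page) in data.chunks(page_size).enumerate() {
        // Slide a window over the whole page, free space and garbage included.
        for (offset, window) in page.windows(needle.len()).enumerate() {
            if window == needle {
                hits.push((page_no as u32, offset as u32));
                if first_only {
                    return hits;
                }
            }
        }
    }
    hits
}
```

Because each page is scanned independently, replacing the outer loop with a parallel iterator should be straightforward; the early-exit path would then need a shared stop flag or a short-circuiting search.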

API:

pub struct ResidueMatch { page_no: u32, offset: u32, location: MatchLocation, context: String }
pub enum MatchLocation { ActiveRecord, FreeSpace, UndoSegment, LobData, Other }
pub fn scan_for_residue(tablespace: &Tablespace, patterns: &[Pattern]) -> Vec<ResidueMatch>;
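
One possible way to fill in `MatchLocation` is a coarse classification from the page type in the FIL header, refined by the heap top on index pages. The sketch below is an assumption about how that could work, not the planned design: `classify` and `read_u16_be` are hypothetical helpers, the constants are the InnoDB on-disk values as I understand them (page type at byte 24, `PAGE_HEAP_TOP` at byte 40), and the newer LOB page types in MySQL 8.0 are not handled. It also cannot tell live records from delete-marked or garbage-list records below the heap top; that would require walking the record and free lists.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MatchLocation { ActiveRecord, FreeSpace, UndoSegment, LobData, Other }

// Assumed InnoDB layout constants (classic 38-byte FIL header).
const FIL_HEADER_SIZE: usize = 38;
const FIL_PAGE_TYPE_OFFSET: usize = 24;
const PAGE_HEAP_TOP_OFFSET: usize = FIL_HEADER_SIZE + 2;
const FIL_PAGE_TYPE_ALLOCATED: u16 = 0;
const FIL_PAGE_UNDO_LOG: u16 = 2;
const FIL_PAGE_TYPE_BLOB: u16 = 10;
const FIL_PAGE_INDEX: u16 = 17855;

/// Read a big-endian u16, or None if the page is too short.
fn read_u16_be(page: &[u8], at: usize) -> Option<u16> {
    let bytes = page.get(at..at + 2)?;
    Some(u16::from_be_bytes([bytes[0], bytes[1]]))
}

/// Hypothetical classifier: map a match at `offset` within `page` to a location.
pub fn classify(page: &[u8], offset: usize) -> MatchLocation {
    // The enum has no FIL-header variant, so header hits fall under Other.
    if offset < FIL_HEADER_SIZE {
        return MatchLocation::Other;
    }
    let page_type = match read_u16_be(page, FIL_PAGE_TYPE_OFFSET) {
        Some(t) => t,
        None => return MatchLocation::Other,
    };
    match page_type {
        FIL_PAGE_TYPE_ALLOCATED => MatchLocation::FreeSpace,
        FIL_PAGE_UNDO_LOG => MatchLocation::UndoSegment,
        FIL_PAGE_TYPE_BLOB => MatchLocation::LobData,
        FIL_PAGE_INDEX => match read_u16_be(page, PAGE_HEAP_TOP_OFFSET) {
            // At or past the heap top: unused space (this coarse check also
            // lumps in the page directory and trailer at the end of the page).
            Some(heap_top) if offset >= heap_top as usize => MatchLocation::FreeSpace,
            // Below the heap top: record heap; treated as active here.
            Some(_) => MatchLocation::ActiveRecord,
            None => MatchLocation::Other,
        },
        _ => MatchLocation::Other,
    }
}
```

For a deletion audit, a hit classified as `ActiveRecord` by this sketch would still need a record-level check before concluding that the row is live.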

Part of

Epic #165 (GDPR & Compliance Verification)
