feat: Implement enum-based output formatters (Table, JSON, YARA) #127
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,3 @@ | ||
| { | ||
| - "enabledPlugins": { | ||
| - "commit@cc-marketplace": true | ||
| - } | ||
| + "enabledPlugins": {} | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| { | ||
| "name": "Rust", | ||
| "image": "mcr.microsoft.com/devcontainers/rust:2-1-trixie", | ||
| "features": { | ||
| "ghcr.io/devcontainers/features/docker-outside-of-docker:1": { | ||
| "installDockerBuildx": true, | ||
| "version": "latest", | ||
| "dockerDashComposeVersion": "v2", | ||
| "moby": false | ||
| }, | ||
| "ghcr.io/devcontainers/features/github-cli:1": { | ||
| "installDirectlyFromGitHubRelease": true, | ||
| "version": "latest" | ||
| }, | ||
| "ghcr.io/eitsupi/devcontainer-features/mdbook:1": { | ||
| "version": "latest" | ||
| }, | ||
| "ghcr.io/devcontainers-extra/features/claude-code:1": { | ||
| "version": "latest" | ||
| }, | ||
| "ghcr.io/devcontainers-extra/features/mise:1": { | ||
| "version": "latest" | ||
| } | ||
| }, | ||
| "customizations": { | ||
| "vscode": { | ||
| "extensions": [ | ||
| "mikestead.dotenv", | ||
| "EditorConfig.EditorConfig", | ||
| "tamasfe.even-better-toml", | ||
| "github.vscode-github-actions", | ||
| "GitHub.vscode-pull-request-github", | ||
| "skellock.just", | ||
| "yzhang.markdown-all-in-one", | ||
| "bierner.markdown-checkbox", | ||
| "bierner.markdown-footnotes", | ||
| "bierner.markdown-mermaid", | ||
| "bierner.markdown-yaml-preamble", | ||
| "DavidAnson.vscode-markdownlint", | ||
| "rust-lang.rust-analyzer", | ||
| "foxundermoon.shell-format", | ||
| "redhat.vscode-yaml", | ||
| "ms-vscode-remote.remote-containers" | ||
| ] | ||
| } | ||
| } | ||
| } | ||
|
Contributor
🧹 Nitpick | 🔵 Trivial: Add trailing newline. File ends without a newline. POSIX convention and most linters expect files to end with a newline character. |
||
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| megalinter-reports/* | ||
| target/* | ||
| stringy-output/* | ||
| tests/fixtures/* |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| { | ||
| "ruff.path": [ | ||
| "${workspaceFolder}/.vscode/mise-tools/ruff" | ||
| ], | ||
| "ruff.interpreter": [ | ||
| "${workspaceFolder}/.vscode/mise-tools/python" | ||
| ], | ||
| "python.defaultInterpreterPath": "${workspaceFolder}/.vscode/mise-tools/python", | ||
| "debug.javascript.defaultRuntimeExecutable": { | ||
| "pwa-node": "${workspaceFolder}/.vscode/mise-tools/node" | ||
| } | ||
| } | ||
unclesp1d3r marked this conversation as resolved.
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,53 @@ | ||||||||||||||||||||
| # Changelog | ||||||||||||||||||||
|
|
||||||||||||||||||||
| All notable changes to this project will be documented in this file. | ||||||||||||||||||||
|
|
||||||||||||||||||||
| The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), | ||||||||||||||||||||
| and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). | ||||||||||||||||||||
|
|
||||||||||||||||||||
| ## [Unreleased] | ||||||||||||||||||||
|
|
||||||||||||||||||||
| ### Added | ||||||||||||||||||||
| - Output formatters: JSON (JSONL), table (TTY-friendly), and YARA rule templates | ||||||||||||||||||||
| - `generated_at` timestamp support in output metadata for deterministic outputs | ||||||||||||||||||||
| - Ranking system for prioritizing extracted strings by relevance | ||||||||||||||||||||
| - Symbol demangling support for Rust mangled names | ||||||||||||||||||||
| - File path classification for POSIX, Windows, and registry paths | ||||||||||||||||||||
| - Semantic classification for URLs, domains, and IP addresses (IPv4/IPv6) | ||||||||||||||||||||
| - String deduplication with full occurrence metadata preservation | ||||||||||||||||||||
| - `CanonicalString` type for deduplicated strings with occurrence tracking | ||||||||||||||||||||
| - UTF-16 string extraction with confidence scoring | ||||||||||||||||||||
| - Noise filtering framework with entropy, linguistic, and repetition filters | ||||||||||||||||||||
| - Mach-O load command extraction with section weight normalization | ||||||||||||||||||||
| - Comprehensive PE support: section classification, import/export parsing, resource extraction | ||||||||||||||||||||
| - ELF symbol extraction with type support and visibility filtering | ||||||||||||||||||||
| - `#[non_exhaustive]` and builder pattern for `FoundString` public API | ||||||||||||||||||||
| - Contributing guidelines document | ||||||||||||||||||||
|
|
||||||||||||||||||||
| ### Changed | ||||||||||||||||||||
| - Repository renamed from StringyMcStringFace to Stringy | ||||||||||||||||||||
| - Improved YARA formatter code quality and test coverage | ||||||||||||||||||||
| - Clarified ASCII rule for Unicode handling in documentation | ||||||||||||||||||||
|
Comment on lines +27 to +30
Contributor
Document the `extract_utf16le_strings` to `extract_utf16_strings` rename. The UTF-16 extraction function was renamed in this PR. This is a breaking change that should be explicitly documented under "Changed" for users upgrading.
Proposed addition:
### Changed
- Repository renamed from StringyMcStringFace to Stringy
- Improved YARA formatter code quality and test coverage
- Clarified ASCII rule for Unicode handling in documentation
+- Renamed `extract_utf16le_strings` to `extract_utf16_strings` for consistency |
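As a complement to the changelog entry, a rename like this can be softened with a deprecated forwarding alias. The following is a minimal sketch only: the signature `(data: &[u8], min_len: usize) -> Vec<String>` and the decoding logic are assumptions for illustration, not Stringy's actual API.

```rust
/// Hypothetical stand-in for the renamed extractor; the real signature in
/// Stringy may differ. Scans 2-byte-aligned UTF-16LE code units and collects
/// runs of printable ASCII that are at least `min_len` characters long.
pub fn extract_utf16_strings(data: &[u8], min_len: usize) -> Vec<String> {
    let mut out = Vec::new();
    let mut current = String::new();
    for pair in data.chunks_exact(2) {
        let unit = u16::from_le_bytes([pair[0], pair[1]]);
        match char::from_u32(u32::from(unit)) {
            // Printable ASCII extends the current run.
            Some(c) if c.is_ascii_graphic() || c == ' ' => current.push(c),
            // Anything else ends the run; keep it only if it is long enough.
            _ => {
                if current.chars().count() >= min_len {
                    out.push(std::mem::take(&mut current));
                } else {
                    current.clear();
                }
            }
        }
    }
    if current.chars().count() >= min_len {
        out.push(current);
    }
    out
}

/// Old name kept as a thin alias so existing callers get a compiler warning
/// instead of a hard build failure.
#[deprecated(note = "renamed to `extract_utf16_strings`")]
pub fn extract_utf16le_strings(data: &[u8], min_len: usize) -> Vec<String> {
    extract_utf16_strings(data, min_len)
}
```

Whether an alias is worth carrying before a 0.1.0 release is a judgment call; documenting the rename under "Changed" is still needed either way.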
||||||||||||||||||||
|
|
||||||||||||||||||||
| ### Fixed | ||||||||||||||||||||
| - Rustdoc warning for IPv6 address example in documentation | ||||||||||||||||||||
|
|
||||||||||||||||||||
| ### Dependencies | ||||||||||||||||||||
| - Updated criterion to 0.8.1 | ||||||||||||||||||||
| - Updated actions/checkout to v6 | ||||||||||||||||||||
| - Updated actions/download-artifact to v7 | ||||||||||||||||||||
| - Updated actions/attest-build-provenance to v3 | ||||||||||||||||||||
| - Updated actions/upload-artifact to v5 | ||||||||||||||||||||
| - Updated github/codeql-action to v4 | ||||||||||||||||||||
| - Updated EmbarkStudios/cargo-deny-action to v2 | ||||||||||||||||||||
|
|
||||||||||||||||||||
| ## [0.1.0] - TBD | ||||||||||||||||||||
|
|
||||||||||||||||||||
| Initial release with core functionality: | ||||||||||||||||||||
|
|
||||||||||||||||||||
| ### Added | ||||||||||||||||||||
| - ELF, PE, and Mach-O binary format detection and parsing | ||||||||||||||||||||
| - ASCII and UTF-8 string extraction from binary sections | ||||||||||||||||||||
| - Section-aware extraction with weight-based prioritization | ||||||||||||||||||||
| - Basic semantic tagging infrastructure | ||||||||||||||||||||
| - Command-line interface (in development) | ||||||||||||||||||||
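For readers skimming the changelog, the "enum-based output formatters" named in the PR title can be pictured roughly as below. This is a sketch under stated assumptions: the `FoundString` fields, the `OutputFormat` name, the `render` method, and the exact column and rule layout are hypothetical and are not taken from the PR's code.

```rust
use std::fmt::Write;

/// Hypothetical, pared-down record; the real `FoundString` carries more fields.
pub struct FoundString {
    pub text: String,
    pub offset: u64,
    pub score: i32,
}

/// One variant per output format; dispatch is a single `match`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum OutputFormat {
    Table,
    Json,
    Yara,
}

/// Escape a string for a JSON string literal (quotes, backslashes, controls).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if c.is_control() => {
                let _ = write!(out, "\\u{:04x}", c as u32);
            }
            c => out.push(c),
        }
    }
    out
}

/// Escape a string for a YARA text string; non-printable bytes become `\xNN`.
fn yara_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for b in s.bytes() {
        match b {
            b'"' => out.push_str("\\\""),
            b'\\' => out.push_str("\\\\"),
            0x20..=0x7e => out.push(b as char),
            _ => {
                let _ = write!(out, "\\x{:02x}", b);
            }
        }
    }
    out
}

impl OutputFormat {
    /// Render all strings in the selected format.
    pub fn render(self, strings: &[FoundString]) -> String {
        let mut out = String::new();
        match self {
            OutputFormat::Table => {
                out.push_str("offset      score  text\n");
                for s in strings {
                    let _ = writeln!(out, "{:#010x}  {:>5}  {}", s.offset, s.score, s.text);
                }
            }
            OutputFormat::Json => {
                // JSONL: one JSON object per line.
                for s in strings {
                    let _ = writeln!(
                        out,
                        "{{\"text\":\"{}\",\"offset\":{},\"score\":{}}}",
                        json_escape(&s.text),
                        s.offset,
                        s.score
                    );
                }
            }
            OutputFormat::Yara => {
                // A rule with no strings is not valid YARA; callers should skip empty input.
                out.push_str("rule stringy_generated {\n  strings:\n");
                for (i, s) in strings.iter().enumerate() {
                    let _ = writeln!(out, "    $s{} = \"{}\" ascii", i, yara_escape(&s.text));
                }
                out.push_str("  condition:\n    any of them\n}\n");
            }
        }
        out
    }
}
```

An enum keeps the set of formats closed and lets the compiler flag any `match` that misses a newly added variant, which is one plausible reason to prefer it over trait objects for a small CLI.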
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,90 @@ | ||||||||||||||||||||||||||||||||||
| # Contributing to Stringy | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| Thanks for your interest in Stringy. This guide explains how to propose changes and what we expect for code quality. | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| ## Quick start | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| 1. Search existing issues and pull requests before filing a new one. | ||||||||||||||||||||||||||||||||||
| 2. For bugs, open an issue with a clear reproduction and expected vs actual behavior. | ||||||||||||||||||||||||||||||||||
| 3. For new features or larger changes, open an issue first to discuss scope. | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| ## Development setup | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| Stringy uses Rust 2024 (MSRV 1.85+, see `rust-toolchain.toml`). We also use just for common tasks. | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| Recommended workflow: | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| - `just setup` (to install tools) | ||||||||||||||||||||||||||||||||||
| - `just build` (compiles a debug build) | ||||||||||||||||||||||||||||||||||
| - `just test` (runs tests) | ||||||||||||||||||||||||||||||||||
| - `just lint` (runs linters) | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| If you do not use just, the critical requirement is that: | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| - `cargo clippy -- -D warnings` passes | ||||||||||||||||||||||||||||||||||
| - `cargo fmt` produces no changes | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| ## Coding standards | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| These rules are enforced by CI: | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| - No unsafe code | ||||||||||||||||||||||||||||||||||
| - Zero warnings (`clippy -D warnings`) | ||||||||||||||||||||||||||||||||||
| - ASCII only in code and documentation, unless explicitly working with Unicode handling | ||||||||||||||||||||||||||||||||||
| - Keep files under 500-600 lines; split when needed | ||||||||||||||||||||||||||||||||||
| - No blanket `#[allow]` on modules or files | ||||||||||||||||||||||||||||||||||
| - No async; this is a synchronous CLI tool | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| Use thiserror for structured errors and include context (offsets, section names, file paths) when relevant. | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| ## Project-specific guidance | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| Module layout: | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| - `container/` handles format detection and section analysis | ||||||||||||||||||||||||||||||||||
| - `extraction/` handles string extraction, filtering, and deduplication | ||||||||||||||||||||||||||||||||||
| - `classification/` handles semantic tagging and ranking | ||||||||||||||||||||||||||||||||||
| - `output/` handles output formatters | ||||||||||||||||||||||||||||||||||
| - `types.rs` contains core data structures and error types | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| Key patterns: | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| - Section weights: add new section weights in `container/*.rs` using existing match patterns. Higher weight means more likely to contain useful strings. | ||||||||||||||||||||||||||||||||||
| - Semantic tags: add new Tag variants in `types.rs`, implement detection in `classification/semantic.rs`, and update any tag merging logic if needed. | ||||||||||||||||||||||||||||||||||
| - Deduplication: preserve all occurrences and merge tags across occurrences in `extraction/dedup.rs`. | ||||||||||||||||||||||||||||||||||
| - Public structs: keep public API structs non_exhaustive and provide explicit constructors. | ||||||||||||||||||||||||||||||||||
| - Imports: prefer `stringy::extraction` or `stringy::types`. Do not import locally-defined types inside `extraction/mod.rs`. | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| ## Tests | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| - Add or update tests for behavior changes. | ||||||||||||||||||||||||||||||||||
| - Use insta snapshots for output verification when appropriate. | ||||||||||||||||||||||||||||||||||
| - Integration tests live in tests/ and fixtures in tests/fixtures/. | ||||||||||||||||||||||||||||||||||
| - Use insta snapshots for output verification when changing output formatters. | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| Run: | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| - `just test` | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
|
Comment on lines +58 to +68
Contributor
Duplicate guidance and missing language specifier. Lines 61 and 63 both give guidance about insta snapshots ("when appropriate" and "when changing output formatters"), so one of them is redundant.
Also, line 59 has an empty code block; consider adding a language specifier or removing it if unneeded.
Proposed fix:
 ## Tests
 - Add or update tests for behavior changes.
-- Use insta snapshots for output verification when appropriate.
+- Use insta snapshots for output verification, especially when changing output formatters.
 - Integration tests live in tests/ and fixtures in tests/fixtures/.
-- Use insta snapshots for output verification when changing output formatters.
-
-Run:
-
-- `just test`
+- Run tests with `just test`
Tools: markdownlint-cli2 (0.18.1) 59-59: Fenced code blocks should have a language specified (MD040, fenced-code-language) |
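To illustrate why the snapshot guidance pairs naturally with the `generated_at` changelog entry, here is a small golden-output test sketch using only the standard library. `Metadata`, `tool_version`, `render_header`, and the header layout are hypothetical; only the `generated_at` name comes from the changelog. With insta, the hand-written expected string would be replaced by `insta::assert_snapshot!`.

```rust
/// Hypothetical output metadata; only the `generated_at` name is from the changelog.
pub struct Metadata {
    pub tool_version: String,
    /// Caller-supplied timestamp; `None` omits it so output stays deterministic.
    pub generated_at: Option<String>,
}

/// Render a one-line header for an output file.
pub fn render_header(meta: &Metadata) -> String {
    match &meta.generated_at {
        Some(ts) => format!("# stringy {} generated_at={}\n", meta.tool_version, ts),
        None => format!("# stringy {}\n", meta.tool_version),
    }
}

#[test]
fn header_is_stable_with_fixed_timestamp() {
    let meta = Metadata {
        tool_version: "0.1.0".to_string(),
        generated_at: Some("2025-01-01T00:00:00Z".to_string()),
    };
    // With insta this would be: insta::assert_snapshot!(render_header(&meta));
    assert_eq!(
        render_header(&meta),
        "# stringy 0.1.0 generated_at=2025-01-01T00:00:00Z\n"
    );
}
```

Passing the timestamp in, rather than reading the clock inside the formatter, is what lets a snapshot stay byte-identical between runs.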
||||||||||||||||||||||||||||||||||
| ## Pull requests | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| - Keep PRs focused and small when possible. | ||||||||||||||||||||||||||||||||||
| - Include a clear description of the problem and the solution. | ||||||||||||||||||||||||||||||||||
| - Link related issues in the PR description. | ||||||||||||||||||||||||||||||||||
| - Update documentation when behavior changes. | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| ## Documentation | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| Docs live under docs/ and project planning artifacts are in project_plan/. Update them when you change user-facing behavior. | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| ## Security | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| If you believe you found a security issue, please do not open a public issue. Use GitHub Security Advisories if available, or contact the maintainers privately. | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| ## AI-assisted development | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| This project includes Claude Code configuration in `.claude/settings.json`. These settings enable plugins that help maintain code quality and follow project conventions. If you use Claude Code, the configuration will be applied automatically. | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| ## Questions | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| If you are unsure where to start, open an issue with your question and we will point you in the right direction. | ||||||||||||||||||||||||||||||||||
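The deduplication rule in the "Key patterns" list above (preserve all occurrences, merge tags across occurrences) could look roughly like this. It is a sketch, not the contents of `extraction/dedup.rs`: the `Tag` variants, the `Occurrence` type, the field names, and the `deduplicate` function are assumptions; only the `FoundString` and `CanonicalString` names appear in the diff.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical subset of semantic tags.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tag {
    Url,
    Domain,
    FilePath,
}

/// Where a single copy of a string was found.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Occurrence {
    pub offset: u64,
    pub section: String,
}

/// Hypothetical, pared-down extraction result.
pub struct FoundString {
    pub text: String,
    pub offset: u64,
    pub section: String,
    pub tags: Vec<Tag>,
}

/// One entry per distinct text, with every occurrence and the union of tags.
#[derive(Debug, PartialEq, Eq)]
pub struct CanonicalString {
    pub text: String,
    pub occurrences: Vec<Occurrence>,
    pub tags: BTreeSet<Tag>,
}

/// Group by text; the `BTreeMap` keeps the output order deterministic.
pub fn deduplicate(found: Vec<FoundString>) -> Vec<CanonicalString> {
    let mut by_text: BTreeMap<String, CanonicalString> = BTreeMap::new();
    for s in found {
        let entry = by_text
            .entry(s.text.clone())
            .or_insert_with(|| CanonicalString {
                text: s.text.clone(),
                occurrences: Vec::new(),
                tags: BTreeSet::new(),
            });
        // Never drop an occurrence: each duplicate adds its own location.
        entry.occurrences.push(Occurrence {
            offset: s.offset,
            section: s.section,
        });
        // Merge tags so the canonical entry carries the union.
        entry.tags.extend(s.tags);
    }
    by_text.into_values().collect()
}
```

Using ordered collections here is a deliberate choice for the sketch: it makes formatter output stable without a separate sort, which matters for snapshot tests.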
🧹 Nitpick | 🔵 Trivial
Consider pinning feature versions for reproducible dev environments.
All features use `version: "latest"`, which can cause inconsistent environments across team members or over time. For a binary analysis tool, reproducible builds and test environments reduce debugging headaches.
Pin to specific versions or at minimum document the expected versions in a comment.
Example version pinning:
 "ghcr.io/devcontainers/features/docker-outside-of-docker:1": {
   "installDockerBuildx": true,
-  "version": "latest",
+  "version": "27.4",
   "dockerDashComposeVersion": "v2",
   "moby": false
 },
 "ghcr.io/devcontainers/features/github-cli:1": {
   "installDirectlyFromGitHubRelease": true,
-  "version": "latest"
+  "version": "2.64"
 },