Signal Score: Ruby scoring scripts and data pipeline #591

@divideby0

Port the Signal Score analysis tooling to Ruby to align with the existing Rails codebase.

Context

Initial prototyping for Signal Score (#590) was done in Bun/TypeScript. Now that the approach is validated, we should rewrite the scoring pipeline in Ruby so it can eventually become a Rails service object (app/extras/signal_scorer.rb) running via ActiveJob.

Scripts to Port

  • score_grants.rb — Batch scoring via Anthropic API with Trust Equation rubric
  • discover_patterns.rb — Qualitative pattern analysis across labeled applications
  • import_data.rb — CSV → Parquet → DuckDB setup (already Ruby ✅)
  • export_sample.rb — Stratified sample generation for validation
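As an illustration of the stratified sampling that export_sample.rb is meant to perform, here is a minimal stdlib-only sketch. The method name, the row shape (an array of hashes), and the `:outcome` stratum key used in the usage note are all assumptions for illustration; the issue does not specify them.

```ruby
# Hypothetical sketch: draw up to `per_stratum` rows from each group of rows
# that share the same value for `key`. Names here are illustrative only.
def stratified_sample(rows, key:, per_stratum:, seed: 42)
  # A seeded generator keeps the sample reproducible between runs.
  rng = Random.new(seed)

  rows.group_by { |row| row[key] }.flat_map do |_stratum, members|
    # Array#sample returns at most members.size elements, so small strata
    # are included whole rather than raising an error.
    members.sample(per_stratum, random: rng)
  end
end
```

Calling `stratified_sample(rows, key: :outcome, per_stratum: 50)` would return at most 50 rows per distinct `:outcome` value. Fixing the seed is one way to keep a validation sample stable while the rubric is being tuned.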

Dependencies

  • anthropic gem — Batch API for scoring
  • duckdb gem — Analytical queries on historical data
  • dotenv-rails — Already in Gemfile, use .env for API keys
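To make the batch scoring dependency concrete, the sketch below builds the request entries that a Message Batches submission expects: one `custom_id` plus a `params` hash per application. It deliberately stops short of calling the anthropic gem, so the submission step is left out. The method name, the `:id` and `:text` fields, and the `grant-` prefix are assumptions; the rubric text and model name are passed in by the caller rather than guessed here.

```ruby
# Hypothetical sketch: build one batch request entry per application.
# `applications` is assumed to be an array of hashes with :id and :text keys.
def build_batch_requests(applications, rubric:, model:, max_tokens: 1024)
  applications.map do |app|
    {
      # custom_id lets each result be matched back to its application later.
      custom_id: "grant-#{app.fetch(:id)}",
      params: {
        model: model,
        max_tokens: max_tokens,
        system: rubric, # the scoring rubric goes in the system prompt
        messages: [{ role: "user", content: app.fetch(:text) }]
      }
    }
  end
end
```

Keeping payload construction separate from the API call means this part can be unit tested without network access or an API key, and it should move into a service object with little change.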

Data Setup

Historical application data (~64K rows) lives in a privileged Google Drive folder; contact @divideby0 for access. Once you have the CSVs, place them in .scratch/data/ and run import_data.rb to build the Parquet files and the DuckDB setup.
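For readers unfamiliar with the CSV → Parquet → DuckDB flow, the sketch below generates the kind of DuckDB SQL such an import could run. This is not the actual contents of import_data.rb. The method name, the `.scratch/parquet` output directory in the usage note, and the choice to expose each file as a view are assumptions; `read_csv_auto`, `read_parquet`, and `COPY ... (FORMAT PARQUET)` are standard DuckDB SQL.

```ruby
# Hypothetical sketch: produce DuckDB SQL that converts each CSV to Parquet
# and exposes the result as a view named after the file.
def import_statements(csv_paths, parquet_dir:)
  csv_paths.flat_map do |csv_path|
    # Derive a SQL-safe identifier from the file name, e.g. "apps-2023" -> "apps_2023".
    table = File.basename(csv_path, ".csv").gsub(/\W/, "_")
    parquet_path = File.join(parquet_dir, "#{table}.parquet")

    # Paths are interpolated directly, so this assumes they contain no single quotes.
    [
      "COPY (SELECT * FROM read_csv_auto('#{csv_path}')) TO '#{parquet_path}' (FORMAT PARQUET);",
      "CREATE OR REPLACE VIEW #{table} AS SELECT * FROM read_parquet('#{parquet_path}');"
    ]
  end
end
```

A caller could pass `Dir.glob(".scratch/data/*.csv")` with `parquet_dir: ".scratch/parquet"` and execute each statement on a DuckDB connection.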

Refs: #590
