Merged
31 commits
c1219a4
feat(tree-sitter): add web-tree-sitter dep and grammar discovery
butttons Mar 4, 2026
f25e60c
feat(tree-sitter): add normalized types and Zod schemas
butttons Mar 4, 2026
eed8efe
feat(tree-sitter): add TS/JS S-expression queries and language registry
butttons Mar 4, 2026
5c046bb
feat(tree-sitter): add core parser module
butttons Mar 4, 2026
58659ae
feat(tree-sitter): add dora fn command
butttons Mar 4, 2026
0a5ad52
feat(tree-sitter): add dora class command
butttons Mar 4, 2026
9393105
feat(tree-sitter): add dora smells command
butttons Mar 4, 2026
ff43446
feat(tree-sitter): enhance dora file with metrics
butttons Mar 4, 2026
dcf0e2c
feat(tree-sitter): enhance dora symbol with function signatures
butttons Mar 4, 2026
81af061
feat(tree-sitter): add grammar availability to dora status
butttons Mar 4, 2026
f5d0872
docs: add tree-sitter cookbook recipe
butttons Mar 4, 2026
cf6fc22
fix(tree-sitter): apply code convention fixes
butttons Mar 4, 2026
42b8ec5
fix(tree-sitter): runtime wasm resolution, grammar discovery, error h…
butttons Mar 4, 2026
2d53370
test(tree-sitter): add tests for function/class captures and file met…
butttons Mar 4, 2026
6235fa9
chore: apply biome formatting across src and test
butttons Mar 4, 2026
b5e6846
fix(tests): resolve type-check errors in tree-sitter test files
butttons Mar 4, 2026
35dab3e
docs: rewrite README, CONTRIBUTING, and rename CLAUDE.md to AGENTS.md
butttons Mar 4, 2026
6912afd
docs(website): add tree-sitter commands, fix output format, update AG…
butttons Mar 4, 2026
8bfce36
docs(website): add tree-sitter installation instructions to docs and …
butttons Mar 4, 2026
30c511d
fix(tree-sitter): count ?? operator in cyclomatic complexity
butttons Mar 4, 2026
b386554
fix(tree-sitter): initialize Parser once via singleton promise
butttons Mar 4, 2026
6d1c9b4
fix(config): detect bun.lock in addition to bun.lockb
butttons Mar 4, 2026
edbc957
fix(commands): use resolveAbsolute() for absolute path construction
butttons Mar 4, 2026
3350587
fix(status): report all registered grammar availability
butttons Mar 4, 2026
0e96f87
fix(smells): scan TODO comments inside multi-line block comments
butttons Mar 4, 2026
3d9257e
fix(mcp): add required inline comment for any types in handleToolCall
butttons Mar 4, 2026
b5418b2
fix(converter): replace interface with type per project convention
butttons Mar 4, 2026
9f5b289
feat(smells): add class-level smell detection
butttons Mar 4, 2026
6d3c4b1
fix(fn,class): validate --sort option and throw on unknown values
butttons Mar 4, 2026
daeb466
fix(grammar): cache global node_modules path as singleton promise
butttons Mar 4, 2026
7210fda
chore: version 2.0.0
butttons Mar 10, 2026
3 changes: 3 additions & 0 deletions .gitignore
@@ -37,3 +37,6 @@ report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json
.dora/

stress-test/*

.beans/*
.beans.yml
2 changes: 1 addition & 1 deletion .pi/skills/toon/SKILL.md
@@ -1,6 +1,6 @@
---
name: toon
-description: **Token-Oriented Object Notation** is a compact, human-readable encoding of the JSON data model that minimizes tokens and makes structure easy for models to follow. It's intended for *LLM input* as a drop-in, lossless representation of your existing JSON.
+description: Token-Oriented Object Notation is a compact, human-readable encoding of the JSON data model that minimizes tokens and makes structure easy for models to follow. It's intended for LLM input as a drop-in, lossless representation of your existing JSON.
---

![TOON logo with step‑by‑step guide](./.github/og.png)
80 changes: 80 additions & 0 deletions AGENTS.md
@@ -0,0 +1,80 @@
# CLAUDE.md

dora is a CLI that converts SCIP indexes into a queryable SQLite database. AI agents use it to answer questions about large codebases without reading files or tracing imports manually.

## Stack

- Runtime: Bun
- Database: SQLite via `bun:sqlite`
- Language: TypeScript
- Protobuf parsing: `@bufbuild/protobuf`
- AST parsing: `web-tree-sitter` (on-demand, per file)
- Output format: TOON by default (`dora status`), JSON via `--json` (`dora status --json`)

## Source layout

```
src/
├── commands/ # one file per CLI command
├── converter/ # SCIP protobuf parser + SQLite converter
├── db/ # schema and all SQL queries
├── mcp/ # MCP server, tool definitions, handlers
├── schemas/ # Zod schemas and inferred types
├── tree-sitter/ # grammar discovery, parser, language registry
└── utils/ # config, errors, output formatting
```

## Two indexing layers

**SCIP** — runs the configured indexer (e.g. `scip-typescript`), produces a `.scip` protobuf, converts it to SQLite. Gives you symbols, references, and file-to-file dependencies derived from actual import resolution.

**Tree-sitter** — parses source files on-demand using wasm grammars. Covers what SCIP doesn't: function signatures, cyclomatic complexity, class hierarchy, code smells. Grammar discovery checks local `node_modules`, then global bun packages, then explicit config paths.
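
The discovery order described above (local `node_modules`, then global bun packages, then explicit config paths) could be sketched roughly as below. This is only an illustration under stated assumptions: `resolveGrammar`, `GrammarCandidate`, and the example paths are hypothetical names, not dora's actual code, and the existence check is injected so the sketch needs no disk access.

```typescript
// Hypothetical names throughout; dora's real discovery code may differ.
type GrammarSource = "local" | "global" | "config";

type GrammarCandidate = { source: GrammarSource; path: string };

type ResolveGrammarParams = {
  // Candidates listed in priority order: local, then global, then config.
  candidates: GrammarCandidate[];
  // Injected existence check so the sketch performs no disk I/O.
  isPresent: (path: string) => boolean;
};

// Returns the first candidate whose wasm file is present, or null if none is.
function resolveGrammar(params: ResolveGrammarParams): GrammarCandidate | null {
  for (const candidate of params.candidates) {
    if (params.isPresent(candidate.path)) {
      return candidate;
    }
  }
  return null;
}

// Placeholder paths for illustration only.
const found = resolveGrammar({
  candidates: [
    { source: "local", path: "/repo/node_modules/example-grammar.wasm" },
    { source: "global", path: "/global/node_modules/example-grammar.wasm" },
    { source: "config", path: "/explicit/path/to/tree-sitter-typescript.wasm" },
  ],
  // Pretend only the global install exists.
  isPresent: (path) => path.startsWith("/global/"),
});
```

Keeping the priority order in the data rather than in nested conditionals makes it easy to see which source wins when a grammar is installed in more than one place.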

## Database design

Denormalized counts (`symbol_count`, `dependency_count`, `dependent_count`, `reference_count`) are pre-computed at index time. Most queries are index lookups, not aggregations.

Local symbols (function parameters, closure variables) are flagged `is_local = 1` and filtered out by default. Symbol kinds are extracted from SCIP documentation strings since `scip-typescript` doesn't populate the kind field.

Schema: `src/converter/schema.sql`. All queries: `src/db/queries.ts`.

## Config file: `.dora/config.json`

```json
{
"root": "/absolute/path/to/repo",
"scip": ".dora/index.scip",
"db": ".dora/dora.db",
"commands": {
"index": "scip-typescript index --output .dora/index.scip"
},
"lastIndexed": "2025-01-15T10:30:00Z",
"ignore": ["test/**", "**/*.generated.ts"],
"treeSitter": {
"grammars": {
"typescript": "/explicit/path/to/tree-sitter-typescript.wasm"
}
}
}
```
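
As a rough guide to the shape of this file, the config above could be typed and loaded along these lines. The type name `DoraConfig`, the helpers `parseDoraConfig` and `explicitGrammarPath`, and which fields are optional are all assumptions made for illustration; the project itself keeps its schemas in `src/schemas/` using Zod, which this sketch does not reproduce.

```typescript
// Field optionality is an assumption; the source shows only one example file.
type DoraConfig = {
  root: string;
  scip: string;
  db: string;
  commands?: { index?: string };
  lastIndexed?: string;
  ignore?: string[];
  treeSitter?: { grammars?: Record<string, string> };
};

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical loader: checks only the three path fields, then trusts the rest.
function parseDoraConfig(params: { text: string }): DoraConfig {
  const raw: unknown = JSON.parse(params.text);
  if (!isRecord(raw)) {
    throw new Error("config must be a JSON object");
  }
  for (const key of ["root", "scip", "db"]) {
    if (typeof raw[key] !== "string") {
      throw new Error(`config.${key} must be a string`);
    }
  }
  return raw as DoraConfig;
}

// Hypothetical helper: the explicit grammar path for a language, if configured.
function explicitGrammarPath(params: { config: DoraConfig; language: string }): string | undefined {
  return params.config.treeSitter?.grammars?.[params.language];
}

const exampleConfig = parseDoraConfig({
  text: JSON.stringify({
    root: "/absolute/path/to/repo",
    scip: ".dora/index.scip",
    db: ".dora/dora.db",
    treeSitter: { grammars: { typescript: "/explicit/path/to/tree-sitter-typescript.wasm" } },
  }),
});
const typescriptGrammar = explicitGrammarPath({ config: exampleConfig, language: "typescript" });
```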

## Code conventions

- Single object parameter — never multiple positional params
- No inline comments, no section separators, no file headers
- No `any` — use `unknown` or proper types
- Boolean variables prefixed with `is` or `has`
- Use `type` not `interface`
- No emojis
- Output JSON to stdout, errors to stderr as `{"error": "message"}`, exit 1 on error

## Adding a tree-sitter language

1. Create `src/tree-sitter/languages/mylang.ts` — export `functionQueryString`, `classQueryString`, `parseFunctionCaptures`, `parseClassCaptures`
2. Register in `src/tree-sitter/languages/registry.ts` with grammar name and extensions
3. Add tests in `test/tree-sitter/` — see `function-captures.test.ts` as the reference. Tests mock `Parser.QueryCapture[]` objects directly; no wasm or disk I/O needed.
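
A language module following the steps above might look roughly like this for the function half (the class half would be analogous). Everything here is a sketch under assumptions: the capture names, the node and field names in the query, and the `FunctionInfo` shape are hypothetical, and `CaptureLike` is a minimal stand-in for the library's capture type so that, as in the tests described above, no wasm is needed. In a real module the query string and parser function would be exported.

```typescript
// Minimal stand-in for a tree-sitter query capture; only the fields used here.
type CaptureLike = {
  name: string;
  node: { text: string; startPosition: { row: number }; endPosition: { row: number } };
};

// Hypothetical result shape; dora's normalized types may carry more fields.
type FunctionInfo = { name: string; startLine: number; endLine: number };

// Node and field names are placeholders; real ones depend on the grammar.
const functionQueryString = `
(function_definition
  name: (identifier) @fn.name) @fn.definition
`;

// Pairs each definition capture with the name capture that follows it.
// Assumes captures arrive in document order, outer node first.
function parseFunctionCaptures(params: { captures: CaptureLike[] }): FunctionInfo[] {
  const functions: FunctionInfo[] = [];
  let definition: CaptureLike | null = null;
  for (const capture of params.captures) {
    if (capture.name === "fn.definition") {
      definition = capture;
    } else if (capture.name === "fn.name" && definition !== null) {
      functions.push({
        name: capture.node.text,
        // Tree-sitter rows are zero-based; report one-based line numbers.
        startLine: definition.node.startPosition.row + 1,
        endLine: definition.node.endPosition.row + 1,
      });
      definition = null;
    }
  }
  return functions;
}

// Mocked captures, in the spirit of the reference tests.
const parsedFunctions = parseFunctionCaptures({
  captures: [
    { name: "fn.definition", node: { text: "def add(a, b) ...", startPosition: { row: 0 }, endPosition: { row: 2 } } },
    { name: "fn.name", node: { text: "add", startPosition: { row: 0 }, endPosition: { row: 0 } } },
  ],
});
```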

## Hooks (`.claude/settings.json`)

- **Stop**: runs `dora index` in the background after each turn
- **SessionStart**: checks index health, prompts to init if missing
29 changes: 29 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,34 @@
# @butttons/dora

## 2.0.0

### New Commands

- `dora fn <path>` — list all functions in a file with cyclomatic complexity, LOC, parameter types, return type, async/export flags, and SCIP reference count. Sort by complexity, loc, or name. Filter by minimum complexity.
- `dora class <path>` — list all classes with inheritance, implemented interfaces, decorators, abstract flag, method list, and property count. Sort by name, method count, or total complexity.
- `dora smells <path>` — detect code smells: high cyclomatic complexity, long functions, too many parameters, god classes (too many methods), large classes (too many properties), and TODO/FIXME/HACK comments. All thresholds configurable.
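
The sort and filter behaviour described for `dora fn` could be sketched as follows. This is an illustration only: `FnRow`, `parseSortKey`, and `listFunctions` are hypothetical names, the descending order for numeric keys is an assumption, and the error message wording is invented. Rejecting unknown sort values mirrors the `--sort` validation fix listed under Bug Fixes.

```typescript
// Hypothetical row shape; the real output carries more fields.
type FnRow = { name: string; complexity: number; loc: number };

const SORT_KEYS = ["complexity", "loc", "name"] as const;
type SortKey = (typeof SORT_KEYS)[number];

// Validates a raw option value and throws instead of silently falling back.
function parseSortKey(params: { value: string }): SortKey {
  const match = SORT_KEYS.find((key) => key === params.value);
  if (match === undefined) {
    throw new Error(`Unknown sort value "${params.value}". Expected one of: ${SORT_KEYS.join(", ")}`);
  }
  return match;
}

// Drops rows below the minimum complexity, then sorts a copy of the rest.
// Assumption: numeric keys sort descending, names sort ascending.
function listFunctions(params: { rows: FnRow[]; sort: SortKey; minComplexity: number }): FnRow[] {
  const key = params.sort;
  const kept = params.rows.filter((row) => row.complexity >= params.minComplexity);
  return [...kept].sort((a, b) => (key === "name" ? a.name.localeCompare(b.name) : b[key] - a[key]));
}

const sortedRows = listFunctions({
  rows: [
    { name: "parse", complexity: 4, loc: 30 },
    { name: "walk", complexity: 9, loc: 12 },
    { name: "noop", complexity: 1, loc: 2 },
  ],
  sort: parseSortKey({ value: "complexity" }),
  minComplexity: 2,
});
```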

### Enhancements

- `dora file` now includes per-file metrics (LOC, SLOC, comment lines, blank lines, function count, avg/max complexity) and a full function list when a tree-sitter grammar is available.
- `dora symbol` now enriches function and method results with cyclomatic complexity, parameter types, and return type from tree-sitter.
- `dora status` now reports grammar availability for every registered language instead of only the project's primary language.

### Tree-sitter Integration

- Grammar discovery checks project-local `node_modules`, global bun packages, and explicit paths in `.dora/config.json` under `treeSitter.grammars`.
- TypeScript, TSX, JavaScript, and JSX supported out of the box. Additional languages can be added by implementing a query module and registering it.
- Parser WASM is initialized once per process. Language grammars are cached after first load. Global node_modules path is resolved once and cached.
- A tree-sitter cookbook recipe (`dora cookbook show tree-sitter`) covers workflows, complexity thresholds, and pre-refactor checklists.
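
The "initialized once per process" behaviour is commonly achieved with a singleton promise, which the commit titles in this PR also mention. A minimal sketch of that pattern, with a hypothetical `once` helper and a stand-in for the real wasm initialization, might be:

```typescript
// Hypothetical helper: wraps an async initializer so it runs at most once.
function once<T>(params: { init: () => Promise<T> }): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    // Store the promise itself, not the result, so concurrent callers share it.
    if (pending === null) {
      pending = params.init();
    }
    return pending;
  };
}

let initCount = 0;
const initParser = once({
  init: async () => {
    // Stands in for the real, expensive wasm initialization.
    initCount += 1;
    return "parser-ready";
  },
});

// Both calls receive the same promise; the initializer body runs only once.
const firstInit = initParser();
const secondInit = initParser();
```

Caching the promise rather than the resolved value avoids a race where two callers both start initialization before either finishes. One trade-off of this simple form is that a rejected promise is cached as well, so a failed initialization is not retried.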

### Bug Fixes

- Nullish coalescing (`??`) was not counted in cyclomatic complexity. The operator is a `binary_expression` node in tree-sitter, not a standalone node type.
- `bun.lock` (text format, Bun 1.1+) was not recognized by workspace detection. Only `bun.lockb` was checked, causing wrong `scip-typescript` flags for newer Bun projects.
- TODO comment scanning in `dora smells` missed markers inside multi-line block comments.
- `dora fn` and `dora class` silently fell back to the default sort on unrecognized `--sort` values instead of throwing an error.
- Absolute paths in tree-sitter commands were built with string interpolation instead of `path.resolve`, which is fragile on Windows and with trailing slashes.
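
To illustrate the `??` fix, the sketch below counts decision points over a simplified tree. It is not dora's implementation: `NodeLike` is a hypothetical stand-in for a syntax node, the set of branching node types is an assumption, and the operator is modelled as a plain string property. The point it shows is that `??` has to be detected by inspecting the operator of a `binary_expression`, since no dedicated node type exists for it.

```typescript
// Simplified stand-in for a syntax node; only what the sketch needs.
type NodeLike = { type: string; operator?: string; children: NodeLike[] };

// Assumed list of branching node types; a real list may be longer.
const BRANCH_TYPES = new Set([
  "if_statement",
  "for_statement",
  "while_statement",
  "catch_clause",
  "ternary_expression",
]);

// Short-circuit operators add a path each, including nullish coalescing.
const SHORT_CIRCUIT_OPERATORS = new Set(["&&", "||", "??"]);

function countDecisionPoints(params: { node: NodeLike }): number {
  const { node } = params;
  let count = 0;
  if (BRANCH_TYPES.has(node.type)) {
    count += 1;
  }
  // `??` shows up here, as the operator of a binary_expression.
  if (
    node.type === "binary_expression" &&
    node.operator !== undefined &&
    SHORT_CIRCUIT_OPERATORS.has(node.operator)
  ) {
    count += 1;
  }
  for (const child of node.children) {
    count += countDecisionPoints({ node: child });
  }
  return count;
}

// Cyclomatic complexity: one base path plus one per decision point.
function cyclomaticComplexity(params: { body: NodeLike }): number {
  return 1 + countDecisionPoints({ node: params.body });
}

// Models `if (a ?? b) {}`: one if_statement containing one `??` expression.
const exampleBody: NodeLike = {
  type: "statement_block",
  children: [
    {
      type: "if_statement",
      children: [{ type: "binary_expression", operator: "??", children: [] }],
    },
  ],
};
const exampleComplexity = cyclomaticComplexity({ body: exampleBody });
```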

## 1.7.0

### Minor Changes