plotnik-lang · zharinov · Jan 7, 2026 · Jan 7, 2026
diff --git a/README.md b/README.md
@@ -1,6 +1,3 @@
-<br/>
-<br/>
-
 <p align="center">
   <img width="400" alt="The logo: a curled wood shaving on a workbench" src="https://github.com/user-attachments/assets/8f1162aa-5769-415d-babe-56b962256747" />
 </p>
@@ -11,7 +8,7 @@
 
 <p align="center">
   A type-safe query language for <a href="https://tree-sitter.github.io">Tree-sitter</a>.<br/>
-  Query in, typed data out.
+  Powered by the <a href="https://github.com/bearcove/arborium">arborium</a> grammar collection.
 </p>
 
 <br/>
@@ -26,202 +23,124 @@
 <br/>
 
 <p align="center">
-  ⚠️ <a href="#status">ALPHA STAGE</a>: not for production use ⚠️<br/>
+    <sub>
+    ⚠️ Beta: not for production use ⚠️<br/>
+    </sub>
 </p>
 
 <br/>
-<br/>
-
-## The problem
-
-Tree-sitter solved parsing. It powers syntax highlighting and code navigation at GitHub, drives the editing experience in Zed, Helix, and Neovim. It gives you a fast, accurate, incremental syntax tree for virtually any language.
-
-The hard problem now is what comes _after_ parsing: extracting structured data from the tree:
-
-```typescript
-function extractFunction(node: SyntaxNode): FunctionInfo | null {
-  if (node.type !== "function_declaration") {
-    return null;
-  }
-  const name = node.childForFieldName("name");
-  const body = node.childForFieldName("body");
-  if (!name || !body) {
-    return null;
-  }
-  return {
-    name: name.text,
-    body,
-  };
-}
-```
-
-Every extraction requires a new function, each one a potential source of bugs that won't surface until production.
-
-## The solution
-
-Plotnik extends Tree-sitter queries with type annotations:
-
-```clojure
-(function_declaration
-  name: (identifier) @name :: string
-  body: (statement_block) @body
-) @func :: FunctionInfo
-```
-
-The query describes structure, and Plotnik infers the output type:
-
-```typescript
-interface FunctionInfo {
-  name: string;
-  body: SyntaxNode;
-}
-```
-
-This structure is guaranteed by the query engine. No defensive programming needed.
-
-## But what about Tree-sitter queries?
-
-Tree-sitter already has queries:
-
-```clojure
-(function_declaration
-  name: (identifier) @name
-  body: (statement_block) @body)
-```
-
-The result is a flat capture list:
 
-```typescript
-query.matches(tree.rootNode);
-// → [{ captures: [{ name: "name", node }, { name: "body", node }] }, ...]
-```
-
-The assembly layer is up to you:
-
-```typescript
-const name = match.captures.find((c) => c.name === "name")?.node;
-const body = match.captures.find((c) => c.name === "body")?.node;
-if (!name || !body) throw new Error("Missing capture");
-return { name: name.text, body };
-```
-
-This means string-based lookup, null checks, and manual type definitions kept in sync by convention.
-
-Tree-sitter queries are designed for matching. Plotnik adds the typing layer: the query _is_ the type definition.
-
-## Why Plotnik?
+Tree-sitter gives you the syntax tree. Extracting structured data from it still means writing imperative navigation code and null checks, and maintaining type definitions by hand. Plotnik makes extraction declarative: write a pattern, get typed data. The query is the type definition.
 
-| Hand-written extraction    | Plotnik                      |
-| -------------------------- | ---------------------------- |
-| Manual navigation          | Declarative pattern matching |
-| Runtime type errors        | Compile-time type inference  |
-| Repetitive extraction code | Single-query extraction      |
-| Ad-hoc data structures     | Generated structs/interfaces |
+## Features
 
-Plotnik extends Tree-sitter's query syntax with:
+- [x] Static type inference from query structure
+- [x] Named expressions for composition and reuse
+- [x] Recursion for nested structures
+- [x] Tagged unions (discriminated unions)
+- [x] TypeScript type generation
+- [x] CLI: `exec` for matches, `infer` for types, `ast`/`trace`/`dump` for debugging
+- [ ] Grammar verification (validate queries against Tree-sitter node types)
+- [ ] Compile-time queries via proc-macro
+- [ ] LSP server
+- [ ] Editor extensions
 
-- **Named expressions** for composition and reuse
-- **Recursion** for arbitrarily nested structures
-- **Type annotations** for precise output shapes
-- **Alternations**: untagged for simplicity, tagged for precision (discriminated unions)
+## Example
 
-## Use cases
+Extract function signatures from Rust. `Type` references itself to handle nested generics like `Option<Vec<String>>`.
 
-- **Scripting:** Count patterns, extract metrics, audit dependencies
-- **Custom linters:** Encode your business rules and architecture constraints
-- **LLM Pipelines:** Extract signatures and types as structured data for RAG
-- **Code Intelligence:** Outline views, navigation, symbol extraction across grammars
-
-## Language design
-
-Start simple—extract all function names from a file:
+`query.ptk`:
 
 ```clojure
-Functions = (program
-  {(function_declaration name: (identifier) @name :: string)}* @functions)
-```
+Type = [
+  Simple: [(type_identifier) (primitive_type)] @name :: string
+  Generic: (generic_type
+    type: (type_identifier) @name :: string
+    type_arguments: (type_arguments (Type)* @args))
+]
 
-Plotnik infers the output type:
+Func = (function_item
+  name: (identifier) @name :: string
+  parameters: (parameters
+    (parameter
+      pattern: (identifier) @param :: string
+      type: (Type) @type
+    )* @params))
 
-```typescript
-type Functions = {
-  functions: { name: string }[];
-};
+Funcs = (source_file (Func)* @funcs)
 ```
 
-Scale up to tagged unions for richer structure:
-
-```clojure
-Statement = [
-  Assign: (assignment_expression
-    left: (identifier) @target :: string
-    right: (Expression) @value)
-  Call: (call_expression
-    function: (identifier) @func :: string
-    arguments: (arguments (Expression)* @args))
-]
+`lib.rs`:
 
-Expression = [
-  Ident: (identifier) @name :: string
-  Num: (number) @value :: string
-]
+```rust
+fn get(key: Option<Vec<String>>) {}
 
-TopDefinitions = (program (Statement)+ @statements)
+fn set(key: String, val: i32) {}
 ```
 
-This produces:
+Plotnik infers TypeScript types from the query structure. `Type` is recursive: `args: Type[]`.
 
-```typescript
-type Statement =
-  | { $tag: "Assign"; $data: { target: string; value: Expression } }
-  | { $tag: "Call"; $data: { func: string; args: Expression[] } };
+```sh
+❯ plotnik infer query.ptk -l rust
+export type Type =
+  | { $tag: "Simple"; $data: { name: string } }
+  | { $tag: "Generic"; $data: { name: string; args: Type[] } };
 
-type Expression =
-  | { $tag: "Ident"; $data: { name: string } }
-  | { $tag: "Num"; $data: { value: string } };
+export interface Func {
+  name: string;
+  params: { param: string; type: Type }[];
+}
 
-type TopDefinitions = {
-  statements: [Statement, ...Statement[]];
-};
+export interface Funcs {
+  funcs: Func[];
+}
 ```
 
-Then process the results:
-
-```typescript
-for (const stmt of result.statements) {
-  switch (stmt.$tag) {
-    case "Assign":
-      console.log(`Assignment to ${stmt.$data.target}`);
-      break;
-    case "Call":
-      console.log(
-        `Call to ${stmt.$data.func} with ${stmt.$data.args.length} args`,
-      );
-      break;
-  }
+Run the query against `lib.rs` to extract structured JSON:
+
+```sh
+❯ plotnik exec query.ptk lib.rs
+{
+  "funcs": [
+    {
+      "name": "get",
+      "params": [{
+        "param": "key",
+        "type": {
+          "$tag": "Generic",
+          "$data": {
+            "name": "Option",
+            "args": [{
+              "$tag": "Generic",
+              "$data": {
+                "name": "Vec",
+                "args": [{ "$tag": "Simple", "$data": { "name": "String" } }]
+              }
+            }]
+          }
+        }
+      }]
+    },
+    {
+      "name": "set",
+      "params": [
+        { "param": "key", "type": { "$tag": "Simple", "$data": { "name": "String" } } },
+        { "param": "val", "type": { "$tag": "Simple", "$data": { "name": "i32" } } }
+      ]
+    }
+  ]
 }
 ```
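The `exec` output above lines up with the types printed by `plotnik infer`, so a consumer can switch on `$tag` and recurse through `args`. The following is a minimal sketch of that idea, not part of Plotnik itself: `formatType` and `signature` are hypothetical helper names, the type declarations are copied from the `infer` output, and the JSON string is a compact copy of the `exec` output rather than something read from the CLI.

```typescript
// Types copied from the `plotnik infer` output shown earlier.
type Type =
  | { $tag: "Simple"; $data: { name: string } }
  | { $tag: "Generic"; $data: { name: string; args: Type[] } };

interface Func {
  name: string;
  params: { param: string; type: Type }[];
}

interface Funcs {
  funcs: Func[];
}

// Hypothetical helper: render a Type back into Rust-like syntax.
// The switch on `$tag` narrows `$data`, and `Generic` recurses into `args`.
function formatType(t: Type): string {
  switch (t.$tag) {
    case "Simple":
      return t.$data.name;
    case "Generic":
      return `${t.$data.name}<${t.$data.args.map(formatType).join(", ")}>`;
  }
}

// Hypothetical helper: rebuild a one-line signature from a Func record.
function signature(f: Func): string {
  const params = f.params.map((p) => `${p.param}: ${formatType(p.type)}`);
  return `fn ${f.name}(${params.join(", ")})`;
}

// Compact copy of the JSON printed by `plotnik exec query.ptk lib.rs`.
const execOutput = `{
  "funcs": [
    { "name": "get", "params": [
      { "param": "key", "type": { "$tag": "Generic", "$data": { "name": "Option", "args": [
        { "$tag": "Generic", "$data": { "name": "Vec", "args": [
          { "$tag": "Simple", "$data": { "name": "String" } }
        ] } }
      ] } } }
    ] },
    { "name": "set", "params": [
      { "param": "key", "type": { "$tag": "Simple", "$data": { "name": "String" } } },
      { "param": "val", "type": { "$tag": "Simple", "$data": { "name": "i32" } } }
    ] }
  ]
}`;

// The cast is unchecked; it relies on the JSON matching the inferred types.
const result = JSON.parse(execOutput) as Funcs;
const signatures: string[] = result.funcs.map(signature);

console.log(signatures.join("\n"));
// fn get(key: Option<Vec<String>>)
// fn set(key: String, val: i32)
```

Because `Type` is a discriminated union, the TypeScript compiler checks that the `switch` covers every `$tag`, so adding a new variant to the query would surface as a type error in `formatType`.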
 
-For the detailed specification, see the [Language Reference](docs/lang-reference.md).
+## Why
 
-## Documentation
+Pattern matching over syntax trees is powerful, but Tree-sitter queries produce flat capture lists. You still need to assemble the results, handle missing captures, and define types by hand. Plotnik closes this gap: the query describes the structure, and the engine guarantees it.
 
-- [CLI Guide](docs/cli.md) — Command-line tool usage
-- [Language Reference](docs/lang-reference.md) — Complete syntax and semantics
-- [Type System](docs/type-system.md) — How output types are inferred from queries
-- [Runtime Engine](docs/runtime-engine.md) — VM execution model (for contributors)
-
-## Supported Languages
-
-Plotnik bundles 15 languages out of the box: Bash, C, C++, CSS, Go, HTML, Java, JavaScript, JSON, Python, Rust, TOML, TSX, TypeScript, and YAML. The underlying [arborium](https://github.com/bearcove/arborium) collection includes 60+ permissively-licensed grammars—additional languages can be enabled as needed.
-
-## Status
-
-**Working now:** Parser with error recovery, type inference, query execution, CLI tools (`check`, `dump`, `infer`, `exec`, `trace`, `tree`, `langs`).
-
-**Next up:** CLI distribution (Homebrew, npm), language bindings (TypeScript/WASM, Python), LSP server, editor extensions.
+## Documentation
 
-⚠️ Alpha stage—API may change. Not for production use.
+- [CLI Guide](docs/cli.md)
+- [Language Reference](docs/lang-reference.md)
+- [Type System](docs/type-system.md)
 
 ## Acknowledgments