Merged · 28 commits
10 changes: 5 additions & 5 deletions README.md
@@ -13,16 +13,16 @@

## 📖 Description

With `extrai`, you can extract data from text documents with LLMs, which will be formatted into a given `SQLModel` and registered in your database.
`extrai` extracts data from text documents using LLMs, formatting the output into a given `SQLModel` and registering it in a database.

The core of the library is its [Consensus Mechanism](https://docs.extrai.xyz/concepts/consensus_mechanism.html). We make the same request multiple times, using the same or different providers, and then select the values that meet a certain threshold.
The library utilizes a [Consensus Mechanism](https://docs.extrai.xyz/concepts/consensus_mechanism.html) to ensure accuracy. It makes the same request multiple times, using the same or different providers, and then selects the values that meet a configured threshold.

`extrai` also has other features, like [generating `SQLModel`s](https://docs.extrai.xyz/how_to/generate_sql_model.html) from a prompt and documents, and [generating few-shot examples](https://docs.extrai.xyz/how_to/generate_example_json.html). For complex, nested data, the library offers [Hierarchical Extraction](https://docs.extrai.xyz/how_to/handle_complex_data_with_hierarchical_extraction.html), breaking down the extraction into manageable, hierarchical steps. It also includes [built-in analytics](https://docs.extrai.xyz/analytics_collector.html) to monitor performance and output quality.

## ✨ Key Features

- **[Consensus Mechanism](https://docs.extrai.xyz/concepts/consensus_mechanism.html)**: Improves extraction accuracy by consolidating multiple LLM outputs.
- **[Dynamic SQLModel Generation](https://docs.extrai.xyz/sqlmodel_generator.html)**: Generate `SQLModel` schemas from natural language descriptions.
- **[Consensus Mechanism](https://docs.extrai.xyz/concepts/consensus_mechanism.html)**: Consolidates multiple LLM outputs to improve extraction accuracy.
- **[Dynamic SQLModel Generation](https://docs.extrai.xyz/sqlmodel_generator.html)**: Generates `SQLModel` schemas from natural language descriptions.
- **[Hierarchical Extraction](https://docs.extrai.xyz/how_to/handle_complex_data_with_hierarchical_extraction.html)**: Handles complex, nested data by breaking down the extraction into manageable, hierarchical steps.
- **[Extensible LLM Support](https://docs.extrai.xyz/llm_providers.html)**: Integrates with various LLM providers through a client interface.
- **[Built-in Analytics](https://docs.extrai.xyz/analytics_collector.html)**: Collects metrics on LLM performance and output quality to refine prompts and monitor errors.
@@ -59,7 +59,7 @@ For a complete guide, please see the full documentation. Here are the key sectio
- **Community**
- [Contributing Guide](https://docs.extrai.xyz/contributing.html)

## ⚙️ Worflow Overview
## ⚙️ Workflow Overview

The library is built around a few key components that work together to manage the extraction workflow. The following diagram illustrates the high-level workflow (see [Architecture Overview](https://docs.extrai.xyz/concepts/architecture_overview.html)):

120 changes: 120 additions & 0 deletions docs/advanced/batch_deep_dive.md
@@ -0,0 +1,120 @@
# Batch Processing Deep Dive

Batch processing in `extrai` is designed for high-volume extraction tasks where immediate results are not required and cost optimization is a priority. It leverages the "Batch API" features of LLM providers (like OpenAI's Batch API) to process requests asynchronously at a lower cost (often 50% cheaper).

## The Batch State Machine

The batch pipeline is managed by a robust state machine that transitions a job through various phases. This ensures that even long-running jobs can be tracked, resumed, and recovered in case of failures.

### States

The `BatchJobStatus` enum defines the possible states:

* **SUBMITTED**: The initial extraction request has been sent to the LLM provider.
* **PROCESSING**: The provider is currently running the batch.
* **READY_TO_PROCESS**: The provider has finished, and the results are downloaded but not yet processed by `extrai`.
* **COMPLETED**: All results have been processed, consensus run, and objects hydrated.
* **FAILED**: The batch job failed at the provider or during local processing.
* **CANCELLED**: The job was manually cancelled.

#### Counting Phase States

If entity counting is enabled, the job goes through a "pre-flight" counting phase:

* **COUNTING_SUBMITTED**: The counting request is with the provider.
* **COUNTING_PROCESSING**: Counting is running.
* **COUNTING_READY_TO_PROCESS**: Counts are ready to be used for generating the main extraction prompts.
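
Taken together, the states above could be rendered as a Python enum roughly like the following (a sketch for illustration; the actual `BatchJobStatus` definition and its values live in `extrai` and may differ):

```python
from enum import Enum

class BatchJobStatus(str, Enum):
    # Counting "pre-flight" phase (only when entity counting is enabled)
    COUNTING_SUBMITTED = "counting_submitted"
    COUNTING_PROCESSING = "counting_processing"
    COUNTING_READY_TO_PROCESS = "counting_ready_to_process"
    # Main extraction phase
    SUBMITTED = "submitted"
    PROCESSING = "processing"
    READY_TO_PROCESS = "ready_to_process"
    # Terminal states
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"
```

Subclassing `str` keeps the states comparable to plain strings, which is convenient when status values round-trip through a database column.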

### State Transition Diagram

```mermaid
stateDiagram-v2
    [*] --> COUNTING_SUBMITTED: count_entities=True
    [*] --> SUBMITTED: count_entities=False

    COUNTING_SUBMITTED --> COUNTING_PROCESSING
    COUNTING_PROCESSING --> COUNTING_READY_TO_PROCESS
    COUNTING_READY_TO_PROCESS --> SUBMITTED: Generate Extraction Prompts

    SUBMITTED --> PROCESSING
    PROCESSING --> READY_TO_PROCESS
    READY_TO_PROCESS --> COMPLETED: Consensus & Hydration

    PROCESSING --> FAILED
    SUBMITTED --> FAILED
    COUNTING_PROCESSING --> FAILED

    FAILED --> [*]
    COMPLETED --> [*]
```

## Production Workflows

### Submission & Polling

The `WorkflowOrchestrator.synthesize_batch` method handles the lifecycle. You can use it in a blocking or non-blocking way.

**Blocking (Simplest):**
```python
results = await orchestrator.synthesize_batch(
    input_strings=docs,
    wait_for_completion=True
)
```
This will poll the provider internally, handle transitions from counting to extraction, and return the final objects.

**Non-Blocking (Async):**
```python
# Submit
batch_id = await orchestrator.synthesize_batch(
    input_strings=docs,
    wait_for_completion=False
)

# Later... check status
current_status = await orchestrator.get_batch_status(batch_id, db_session)
if current_status.status == BatchJobStatus.COMPLETED:
    results = await orchestrator.get_batch_results(batch_id)
```

### Error Recovery & Resuming

If your application crashes while a batch is running, you don't lose the job. The `root_batch_id` is the key to recovery.

**Resuming Monitoring (e.g. after script restart)**

If the batch job is still active or completed at the provider, but your script stopped monitoring it, simply call `monitor_batch_job` again:

```python
# Resume monitoring
results = await orchestrator.monitor_batch_job(
    root_batch_id="batch_123_abc",
    db_session=session,
    poll_interval=60
)
```
This method inspects the current state (e.g., `COUNTING_SUBMITTED`, `READY_TO_PROCESS`) and automatically picks up where it left off, handling transitions between phases.

**Retrying or Extending a Batch**

If a batch failed or if you want to extend a completed workflow (e.g., adding more hierarchical steps), use `create_continuation_batch`. This creates a *new* batch job that copies the successful steps from the old one, saving time and money.

```python
# Continue from step 2
new_batch_id = await orchestrator.create_continuation_batch(
    original_batch_id="failed_batch_id",
    db_session=session,
    start_from_step_index=2,  # Copy steps 0 and 1, restart from 2
    wait_for_completion=True
)
```

## Hierarchical Batches & Shallow Schemas

When using `use_hierarchical_extraction=True` with batches, the process involves multiple "hops".

1. **Level 1 Extraction**: The root object is extracted.
2. **Shallow Schema Generation**: For lists of children, `extrai` generates "Shallow Schemas" (just IDs and essential fields) to keep the context window small.
3. **Child Batches**: New batch jobs are spawned for each child entity to extract full details.

This complex coordination is handled automatically by the `BatchPipeline`.
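
To illustrate the shallow-schema idea, here is a minimal sketch of a full child model next to its shallow counterpart. All model and field names are invented for the example, and plain dataclasses stand in for `SQLModel` classes to keep the sketch dependency-free:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Invoice:
    """Hypothetical full child model with all extracted fields."""
    id: Optional[int]
    number: str
    total: float
    notes: str

@dataclass
class InvoiceShallow:
    """Shallow counterpart: just the ID and essential fields."""
    id: Optional[int]
    number: str

def to_shallow(invoice: Invoice) -> InvoiceShallow:
    # Project the full record down to its shallow form so that
    # Level 1 prompts stay small; full details are extracted later
    # in the per-child batches.
    return InvoiceShallow(id=invoice.id, number=invoice.number)
```

The shallow form gives the parent extraction enough information to link children by ID without paying the context-window cost of every field.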
15 changes: 15 additions & 0 deletions docs/analytics_collector.rst
@@ -57,6 +57,19 @@ These methods track the health of LLM API calls and output processing.
   # To record an error when the LLM's JSON output fails schema validation
   analytics_collector.record_llm_output_validation_error()

**Token Usage Tracking**

The collector automatically aggregates token usage from supported LLM providers (e.g., OpenAI, Gemini).

.. code-block:: python

   # This is typically called internally by the LLM client
   analytics_collector.record_llm_usage(input_tokens=150, output_tokens=50)

   # Access totals
   print(f"Total Input: {analytics_collector.total_input_tokens}")
   print(f"Total Output: {analytics_collector.total_output_tokens}")

**Consensus Run Details**

This method is used to log the outcome of a consensus process. It takes a dictionary of metrics, usually generated by the ``JSONConsensus`` component.
@@ -137,6 +150,8 @@ A report provides a detailed summary of the entire workflow.
       "llm_output_parse_errors": 2,
       "llm_output_validation_errors": 1,
       "total_invalid_parsing_errors": 3,
       "total_input_tokens": 1500,
       "total_output_tokens": 450,
       "number_of_consensus_runs": 1,
       "hydrated_objects_successes": 10,
       "hydration_failures": 0,