Goal
Replace void's Gemini 2.5 Pro with a fine-tuned open model (Llama 3.1 8B or similar) running on self-hosted infrastructure. Void has 49,786 posts and extensive cognition records. The fine-tune should capture void's voice, analytical style, and engagement patterns at a fraction of the inference cost.
Data Available
Posts: 49,786 on comind.network PDS (99% replies with fetchable parent context)
Collections:
- `app.bsky.feed.post` - public Bluesky posts
- `stream.thought.reasoning` - internal reasoning traces
- `stream.thought.memory` - episodic memory records
- `stream.thought.tool.call` - tool usage patterns
Context window (already exported to data/void-context/):
- System prompt: 5,618 chars
- 25 memory blocks: 64,561 chars total
- Key blocks: `void-persona` (5.1k), `operational_protocols` (18.8k), `communication_guidelines` (6k), `zeitgeist` (450)
- Prompt template from the `~/code/void/bsky.py` handler
Agent: void-prime (agent-01086cda-be1f-4986-bf3e-ca5b6297cc5d) on Letta Cloud
Pipeline (built, partially tested)
1. Export raw data
```
uv run python tools/export_training_data.py void.comind.network \
  -o data/void-raw.jsonl \
  --collections app.bsky.feed.post stream.thought.reasoning stream.thought.memory \
  --filter-chars
```
- Paginates the PDS via `com.atproto.repo.listRecords`
- Fetches parent/root post context for every reply
- Filters character creation loop content (known failure mode: D&D-style sheets)
- Outputs JSONL with `id`, `text`, `parent_text`, `parent_author`, `root_text`, etc.
- Estimated time: hours (50k records + parent fetches with rate limiting)
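The exporter's internals aren't shown here, but the cursor pagination it relies on can be sketched as below. The PDS base URL and the injectable `fetch` hook are illustrative assumptions, not the actual interface of `export_training_data.py`:

```python
import json
import urllib.parse
import urllib.request

def fetch_page(pds_url: str, params: dict) -> dict:
    """Fetch one page of com.atproto.repo.listRecords from a PDS."""
    qs = urllib.parse.urlencode(params)
    url = f"{pds_url}/xrpc/com.atproto.repo.listRecords?{qs}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def list_all_records(repo: str, collection: str,
                     pds_url: str = "https://comind.network",  # assumed PDS host
                     fetch=fetch_page, limit: int = 100):
    """Yield every record in a collection, following the cursor until exhausted."""
    cursor = None
    while True:
        params = {"repo": repo, "collection": collection, "limit": limit}
        if cursor:
            params["cursor"] = cursor
        page = fetch(pds_url, params)
        yield from page.get("records", [])
        cursor = page.get("cursor")
        if not cursor:  # no cursor in the response means the last page was reached
            break
```

The injectable `fetch` keeps the pagination logic testable without network access; rate limiting and parent-post fetching would layer on top of this loop.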
2. Format for fine-tuning
```
uv run python tools/format_training_data.py data/void-raw.jsonl \
  -o data/void-finetune.jsonl \
  --system-prompt data/void-context/full-context.txt \
  --replies-only \
  --format sharegpt
```
- Injects void's actual context window as the system prompt
- Reconstructs thread context as user messages
- Formats as chat completions (OpenAI, ShareGPT, or Alpaca)
- Filters short responses (<20 chars)
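The per-record transformation can be sketched as follows. Field names (`text`, `parent_text`, `parent_author`) come from the export step's description; the exact raw schema and the context template are assumptions, not the formatter's actual code:

```python
def to_sharegpt(record: dict, system_prompt: str, min_chars: int = 20):
    """Convert one exported reply record into a ShareGPT-style conversation,
    reconstructing the parent post as the 'human' turn. Returns None when the
    record is not a reply or the response is too short to train on."""
    text = record.get("text", "")
    parent = record.get("parent_text")
    if parent is None or len(text) < min_chars:
        return None
    context = f"@{record.get('parent_author', 'unknown')}: {parent}"
    return {"conversations": [
        {"from": "system", "value": system_prompt},
        {"from": "human", "value": context},
        {"from": "gpt", "value": text},
    ]}
```

A fuller version would also splice in `root_text` for deep threads; the filter threshold mirrors the <20 char rule above.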
3. Fine-tune
Base model candidates:
| Model | Size | VRAM needed (QLoRA) | Notes |
|---|---|---|---|
| Llama 3.1 8B Instruct | 8B | ~12GB | Best quality/cost ratio |
| Mistral 7B v0.3 | 7B | ~10GB | Good at conversation |
| Llama 3.2 3B | 3B | ~6GB | Cheapest, may lose nuance |
| Qwen 2.5 7B | 7B | ~10GB | Strong multilingual |
Training approach: QLoRA (4-bit quantization + LoRA adapters)
- Hardware: a single A100 (80 GB) or RTX 4090 (24 GB)
- Estimated training time: 2-4 hours on 40k+ pairs
- Framework: Axolotl, Unsloth, or Hugging Face TRL
4. Evaluation
This is the hardest part. Proposed approach:
- Held-out test set: 500 reply pairs void actually wrote, not seen during training
- A/B comparison: show test inputs to both fine-tuned model and base Llama, compare against void's actual response
- Voice metrics: response length distribution, vocabulary overlap, analytical depth (manual review of 50 samples)
- Failure mode check: does it generate character sheets? Does it break voice on edge cases?
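The voice metrics above can be made concrete with simple deterministic checks. This is a minimal sketch of two of them (vocabulary overlap and length distribution), not a full evaluation harness:

```python
import re

def vocab_overlap(generated: list[str], reference: list[str]) -> float:
    """Jaccard overlap between the vocabularies of two sets of posts --
    a crude proxy for 'does the fine-tune use void's lexicon'."""
    def vocab(posts):
        return {w for p in posts for w in re.findall(r"[a-z']+", p.lower())}
    gen, ref = vocab(generated), vocab(reference)
    if not gen and not ref:
        return 1.0
    return len(gen & ref) / len(gen | ref)

def length_stats(posts: list[str]) -> tuple[float, int]:
    """Mean and max character length -- compare the fine-tune's distribution
    against void's held-out replies, not just a single average."""
    lengths = [len(p) for p in posts]
    return sum(lengths) / len(lengths), max(lengths)
```

Analytical depth and the character-sheet failure mode still need the manual review pass; these metrics only catch gross drift cheaply.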
5. Serve
Options:
- vLLM on a dedicated GPU instance (most performant)
- llama.cpp on CPU (cheapest, slower)
- Ollama for easy deployment
- Together.ai / Fireworks for managed inference (middle ground)
Then point void's handler at the new endpoint instead of Gemini.
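Both vLLM and Ollama expose an OpenAI-compatible `/v1/chat/completions` endpoint, so the handler swap mostly amounts to building a different request body. A sketch, where the served model name is an assumption:

```python
import json

def build_chat_request(system_prompt: str, thread_context: str,
                       model: str = "void-llama-3.1-8b",  # hypothetical served model name
                       max_tokens: int = 300) -> bytes:
    """Build the JSON body for an OpenAI-compatible /v1/chat/completions call,
    to be POSTed to the self-hosted endpoint instead of calling Gemini."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": thread_context},
        ],
        "max_tokens": max_tokens,
        "temperature": 0.8,  # sampling temperature is a placeholder, tune per eval
    }
    return json.dumps(body).encode("utf-8")
```

Keeping the request shape OpenAI-compatible means the serving backend (vLLM, Ollama, Together, Fireworks) can be swapped by changing only the base URL.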
Known Issues
- Character creation loop: 46% of recent `stream.thought.memory` records are D&D character sheets. Must filter aggressively. The keyword list lives in `export_training_data.py`.
- Context window size: void's full context is 64k chars. Most 8B models have 8k-32k context. May need to trim to essential blocks (persona, operational_protocols, communication_guidelines) or use a long-context model.
- Memory block drift: void's blocks change over time. The exported context is a snapshot from 2026-02-13; training data from 6 months ago was generated under different blocks, which could cause a distribution mismatch.
- Tool calls: void uses the `add_post_to_bluesky_reply_thread` tool. Either the fine-tuned model learns this tool-calling pattern, or the handler is restructured to extract text from the model and call the tool externally.
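For the context-window issue, one simple option is a greedy character budget over the prioritized blocks. The block names come from this issue; the interface is hypothetical:

```python
def trim_context(blocks: dict[str, str], budget_chars: int,
                 priority=("void-persona", "operational_protocols",
                           "communication_guidelines")):
    """Greedily keep the highest-priority memory blocks that fit within a
    character budget (a rough stand-in for the model's token limit).
    Blocks that would overflow the budget are skipped, not truncated."""
    kept, used = [], 0
    for name in priority:
        text = blocks.get(name)
        if text is None:
            continue
        if used + len(text) <= budget_chars:
            kept.append((name, text))
            used += len(text)
    return kept
```

A token-aware version using the base model's tokenizer would be more accurate, since chars-per-token varies with void's vocabulary.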
Files
- `tools/export_training_data.py` - PDS export with parent fetching
- `tools/format_training_data.py` - Chat completion formatter
- `data/void-context/` - Exported context window (system prompt + 25 blocks)
- `data/void-sample.jsonl` - 20-record test sample
- `data/void-finetune-sample.jsonl` - Formatted sample
Next Steps
- Run full 50k export (long-running, background)
- Decide on context window trimming strategy
- Choose base model + training framework
- Set up training environment (GPU instance)
- Train + evaluate
- Deploy and wire into void's handler