Logsqueak

Turn your Logseq journal chaos into organized knowledge.

Logsqueak helps you extract lasting insights from daily journal entries using AI. Review what the AI finds, refine the content, and integrate it into your knowledge base—all through an interactive keyboard-driven interface.

Quick Start

Get running in 5 minutes:

# 1. Clone and install
git clone https://github.com/twaugh/logsqueak.git
cd logsqueak
./setup-dev.sh

# 2. Try it with the included test graph
source venv/bin/activate
logsqueak init  # Follow the interactive setup wizard

# 3. Extract knowledge from a sample journal
logsqueak extract 2025-01-15

What happens next:

Phase 1: AI identifies knowledge blocks (you can select/deselect)
Phase 2: AI suggests better wording (you can edit or accept)
Phase 3: AI suggests where to save it (you approve each one)

That's it! Your knowledge is now organized in your Logseq graph.

What is Logsqueak?

If you use Logseq journals to capture ideas during your day, you've probably noticed:

Great insights get buried in daily logs
Finding that one useful tip from last month is hard
Your knowledge base stays empty while journals pile up

Logsqueak solves this by:

Finding knowledge - AI reads your journals and identifies valuable content (technical tips, lessons learned, insights)
Cleaning it up - AI removes temporal context ("today I learned...") and improves clarity
Organizing it - AI suggests where to save it in your knowledge base (you review and approve)

All through a keyboard-driven terminal interface—no mouse needed.

Before You Start

You'll need:

✓ Python 3.11 or later
✓ A Logseq graph with journal entries
- Don't have one? Use the included test-graph/ directory to try it out
✓ Access to an AI assistant (choose one):
- Free: Ollama running locally (recommended for beginners)
- Paid: OpenAI API key
✓ ~500MB disk space for dependencies

New to Ollama? It's free software that runs AI models on your computer. Install guide: https://ollama.com/download

Installation

Step 1: Install Logsqueak

Recommended: Automated setup

# Clone the repository
git clone https://github.com/twaugh/logsqueak.git
cd logsqueak

# Run setup script (creates virtual environment and installs everything)
./setup-dev.sh

Manual setup (if you prefer):


# Clone the repository
git clone https://github.com/twaugh/logsqueak.git
cd logsqueak

# Create virtual environment
python3.11 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install --upgrade pip
pip install -e .
pip install -e src/logseq-outline-parser/

# Verify installation
pytest -v

Step 2: Set Up Your AI Assistant

Option A: Ollama (Free, runs locally)

# 1. Install Ollama from https://ollama.com/download

# 2. Pull the recommended model:
ollama pull mistral:7b-instruct

# 3. Make sure Ollama is running
ollama serve

Option B: OpenAI (Paid, cloud-based)

Requires an API key from platform.openai.com/api-keys

Or: use any service that provides an OpenAI-compatible API.

Step 3: Configure Logsqueak

Interactive setup wizard (recommended):

source venv/bin/activate
logsqueak init

The wizard will guide you through:

Selecting your Logseq graph location
Configuring your AI assistant (Ollama or OpenAI-compatible)
Setting up semantic search

Manual configuration (advanced):


Create ~/.config/logsqueak/config.yaml:

mkdir -p ~/.config/logsqueak
nano ~/.config/logsqueak/config.yaml

For Ollama (local AI):

llm:
  endpoint: http://localhost:11434/v1
  api_key: ollama  # Any string works for local Ollama
  model: mistral:7b-instruct
  num_ctx: 32768  # Optional: controls VRAM usage

logseq:
  graph_path: ~/Documents/logseq-graph  # Path to your graph

For OpenAI:

llm:
  endpoint: https://api.openai.com/v1
  api_key: sk-proj-xxxxxxxxxxxxx  # Your API key
  model: your-chosen-model

logseq:
  graph_path: ~/Documents/logseq-graph

Set correct permissions:

chmod 600 ~/.config/logsqueak/config.yaml
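
The config file holds an API key, which is why it should be readable only by its owner. As a rough illustration, the Python sketch below checks that no group or other permission bits are set. The function name `is_private` is hypothetical, and this is not necessarily how Logsqueak itself inspects the file.

```python
import os
import stat
from pathlib import Path


def is_private(path: Path) -> bool:
    """Return True if neither group nor others have any access to path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # S_IRWXG / S_IRWXO cover read, write and execute for group / others.
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

After `chmod 600`, `is_private(Path.home() / ".config/logsqueak/config.yaml")` should return True on a POSIX system.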

Usage

Try It with the Test Graph

The repository includes a sample Logseq graph with realistic journal entries:

source venv/bin/activate

# Configure to use test-graph (if not already done)
logsqueak init  # Point to /path/to/logsqueak/test-graph

# Extract knowledge from a sample journal entry
logsqueak extract 2025-01-15

What you'll see:

Phase 1 - Block Selection

The AI reads the journal and highlights blocks like:
✓ "Python 3.12 type hints improvements..." (knowledge)
✗ "Morning standup at 9am" (activity log)

Navigate with j/k, press Space to select/deselect, then 'n' to continue.

Phase 2 - Content Editing

Original: "Learned about TDD best practices..."
AI suggests: "Test-Driven Development (TDD) best practices include..."

Press 'a' to accept AI version, 'r' to revert, or Tab to edit manually.

Phase 3 - Integration Review

AI suggests: Add to "TDD" page under "Best Practices" section

You'll see a preview with the insertion point marked in green.
Press 'y' to accept, 's' to skip.

Use with Your Own Graph

# Extract from today's journal
logsqueak extract

# Extract from specific date
logsqueak extract 2025-01-20

# Extract from date range
logsqueak extract 2025-01-15..2025-01-20
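
The three argument forms above (no date, a single date, a `start..end` range) could be expanded into a list of journal dates roughly as follows. This is a sketch, not Logsqueak's own parser: `expand_date_arg` is a hypothetical name, and treating the range as inclusive of both ends is an assumption.

```python
from datetime import date, timedelta
from typing import Optional


def expand_date_arg(arg: Optional[str], today: Optional[date] = None) -> list[date]:
    """Expand None, 'YYYY-MM-DD' or 'YYYY-MM-DD..YYYY-MM-DD' into dates."""
    if not arg:
        # No argument: fall back to today's journal.
        return [today or date.today()]
    start_text, sep, end_text = arg.partition("..")
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text) if sep else start
    if end < start:
        raise ValueError(f"range end {end} is before start {start}")
    # Assumption: both endpoints are included.
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]
```

Under that assumption, `expand_date_arg("2025-01-15..2025-01-20")` yields six dates, one per journal file.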

Search Your Knowledge Base

# Semantic search (finds similar content by meaning)
logsqueak search "python testing tips"

# Force rebuild search index
logsqueak search "docker best practices" --reindex

Results show clickable logseq:// links (works in modern terminals).
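
Clickable links in a terminal are usually produced with the OSC 8 hyperlink escape sequence. The sketch below shows how a `logseq://` URL might be wrapped that way. The graph name `my-graph` and the `?page=` URL layout are assumptions for illustration, and `terminal_link` is a hypothetical helper rather than part of Logsqueak.

```python
from urllib.parse import quote


def terminal_link(url: str, label: str) -> str:
    """Wrap label in an OSC 8 escape sequence pointing at url."""
    # ESC ] 8 ; ; URL ESC \  opens the link, ESC ] 8 ; ; ESC \  closes it.
    return f"\x1b]8;;{url}\x1b\\{label}\x1b]8;;\x1b\\"


# Hypothetical graph and page names; the URL layout is an assumption.
url = "logseq://graph/my-graph?page=" + quote("Best Practices")
print(terminal_link(url, "Best Practices"))
```

Terminals without OSC 8 support typically ignore the escape sequence and show only the label.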

Understanding the 3 Phases

Phase 1: Block Selection

What's happening: AI reads your journal and classifies each block as "knowledge" (worth saving) or "activity log" (daily noise).

Your job: Review the selections. The AI is pretty good, but you know best.

Keyboard shortcuts:

j/k or arrows: Navigate blocks
Space: Select/deselect current block
a: Accept all AI suggestions
c: Clear all selections
Shift+j/Shift+k: Jump to next/previous knowledge block
n: Proceed to Phase 2
q: Quit

Phase 2: Content Editing

What's happening: AI rewrites selected blocks to remove temporal context ("today I learned...") and improve clarity.

Your job: Accept AI suggestions, edit them, or keep the original.

Keyboard shortcuts:

j/k: Navigate blocks (auto-saves changes)
a: Accept AI reworded version
r: Revert to original
Tab: Focus/unfocus editor for manual editing
n: Proceed to Phase 3 (waits for semantic search to complete)
q: Go back to Phase 1

Three panels:

Left: Original journal content
Middle: AI's suggested rewrite
Right: Current version (editable)

Phase 3: Integration Review

What's happening: AI suggests where to save each knowledge block in your graph (which page, which section).

Your job: Review each suggestion and approve or skip.

Keyboard shortcuts:

j/k: Navigate decisions
y: Accept decision (writes to file immediately)
s: Skip this decision
a: Accept all decisions for current block
n: Move to next knowledge block
q: Go back to Phase 2

What you see:

Target page preview with green bar showing insertion point
Integration action (add new section, add under existing, etc.)
Provenance: Journal gets extracted-to:: markers after successful writes

Keyboard Shortcuts Cheat Sheet

All phases use vim-style navigation:

Key	Action
`j` / `k`	Navigate down/up
`Space`	Select/deselect (Phase 1)
`a`	Accept AI suggestion / Accept all
`r`	Revert to original (Phase 2)
`y`	Yes, accept decision (Phase 3)
`s`	Skip decision (Phase 3)
`Tab`	Focus/unfocus editor (Phase 2)
`n`	Next phase / Next block
`q`	Quit / Go back

No mouse needed! Everything is keyboard-driven.

How It Works (Under the Hood)

For the curious:

Parsing: Logsqueak uses a custom Logseq markdown parser that preserves exact structure (round-trip tested)
Classification: AI analyzes each journal block to identify knowledge vs. activity logs
Rewording: AI removes temporal context and improves clarity while preserving meaning
Semantic Search (RAG):
- Builds a searchable index of your entire graph
- Finds similar content by meaning, not just keywords
- Uses hierarchical chunks for context-aware search
Integration Planning:
- AI searches for relevant pages in your graph
- Analyzes page structure to suggest insertion points
- Optimized prompts
Atomic Writes:
- Writes to target pages happen immediately on approval
- Journal gets extracted-to:: markers only after successful write
- Every integrated block gets a unique id:: property for traceability

Non-destructive guarantee: All operations are traceable. Nothing gets deleted. You can always find where content came from.

Configuration Reference

Complete Config File

llm:
  # LLM API endpoint
  endpoint: http://localhost:11434/v1  # Ollama local
  # endpoint: https://api.openai.com/v1  # OpenAI cloud

  # API key (any string for Ollama, real key for OpenAI)
  api_key: ollama

  # Model name
  model: mistral:7b-instruct  # Ollama model (recommended)
  # model: your-chosen-model  # OpenAI model

  # Context window size (Ollama only, optional)
  # Controls VRAM usage - smaller = less memory, smaller context
  num_ctx: 32768

logseq:
  # Path to your Logseq graph directory
  # Must contain journals/ and logseq/ subdirectories
  graph_path: ~/Documents/logseq-graph

rag:
  # Number of similar blocks to retrieve per search
  # Higher = more context but slower (default: 20)
  top_k: 20
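
The comment on `graph_path` says the directory must contain `journals/` and `logseq/` subdirectories. A quick way to check a path before putting it in the config might look like this. `check_graph_path` is a hypothetical helper, not a Logsqueak function.

```python
from pathlib import Path


def check_graph_path(graph_path: str) -> list[str]:
    """Return a list of problems; an empty list means the path looks like a graph."""
    root = Path(graph_path).expanduser()  # handles paths such as ~/Documents/...
    if not root.is_dir():
        return [f"{root} is not a directory"]
    return [
        f"missing {name}/ subdirectory"
        for name in ("journals", "logseq")
        if not (root / name).is_dir()
    ]
```

For example, `check_graph_path("~/Documents/logseq-graph")` returns `[]` when both subdirectories exist.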

Note on semantic search: Logsqueak uses the all-mpnet-base-v2 embedding model for semantic search. This is not currently configurable but provides excellent quality for finding similar content in your knowledge base.

Advanced Topics

Understanding Semantic Search

Logsqueak builds a searchable index of your entire Logseq graph:

# First run: Builds index (takes a minute)
logsqueak search "python tips"

# Subsequent runs: Uses cached index (instant)
logsqueak search "docker containers"

# Force rebuild (if you've added lots of new pages)
logsqueak search "test" --reindex

How it works:

Converts your pages into "embeddings" (AI representations of meaning)
Searches by semantic similarity, not just keyword matching
Boosts results that have explicit links to relevant pages
Shows hierarchical context (parent blocks) for better understanding
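
To make the ranking idea concrete, here is a toy sketch of similarity search with a link boost. It uses hand-written two-dimensional vectors in place of real embeddings, and the flat `boost` value and the chunk dictionary layout are assumptions made for illustration. Logsqueak's actual scoring may differ.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, chunks, linked_pages, boost=0.1):
    """Sort chunks by similarity, boosting those that link to linked_pages."""
    scored = []
    for chunk in chunks:
        score = cosine(query_vec, chunk["vector"])
        # Hypothetical rule: add a flat bonus when the chunk links to a relevant page.
        if linked_pages & set(chunk["links"]):
            score += boost
        scored.append((score, chunk["text"]))
    return sorted(scored, reverse=True)
```

A chunk that is slightly less similar to the query can still rank first if it explicitly links to a relevant page, which is the effect the boost is meant to have.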

When to rebuild:

After adding many new pages manually
If search results seem stale
If you changed your graph structure significantly

Provenance Tracking

Every integration is traceable:

In your journal:

- Learned about TDD best practices
  extracted-to:: [[TDD]]#65b1c1f0-1234-5678-89ab-cdef01234567

In the target page:

## Best Practices
- Test-Driven Development emphasizes writing tests first
  id:: 65b1c1f0-1234-5678-89ab-cdef01234567

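The id:: property gives the integrated block a unique identifier. The extracted-to:: marker in the journal records the target page and that identifier, so every journal entry can be traced to where it went.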
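
If you want to follow these markers with a script of your own, a regular expression is enough. The sketch below assumes the marker always has the shape shown in the example above (`[[Page]]#uuid`). `find_targets` is a hypothetical helper, not part of Logsqueak.

```python
import re

# Assumed shape, taken from the example: extracted-to:: [[Page]]#<36-char uuid>
MARKER = re.compile(
    r"extracted-to::\s*\[\[(?P<page>[^\]]+)\]\]#(?P<block_id>[0-9a-f-]{36})"
)


def find_targets(journal_text: str) -> list[tuple[str, str]]:
    """Return (page name, block id) pairs for every marker in the text."""
    return [(m["page"], m["block_id"]) for m in MARKER.finditer(journal_text)]
```

Each returned block id can then be searched for as an `id::` property in the named page.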

File Safety

Logsqueak uses "atomic two-phase writes":

Read target page and verify it hasn't changed
Write new content to a temporary file
Move temp file to final location (atomic operation)
Mark journal with extracted-to:: marker

If any step fails, the operation is rolled back. You never get partial writes or corrupted files.

Concurrent modification detection: If you edit a target page in Logseq while Logsqueak is running, the write will fail with an error instead of overwriting your changes.
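
The first three steps can be sketched in Python as below. This is an illustration of the general write-to-temp-then-rename technique, not Logsqueak's code: `atomic_update` and `PageChangedError` are hypothetical names, and using the file's modification time to detect concurrent edits is an assumption. Marking the journal (step 4) would happen only after this function returns without error.

```python
import os
import tempfile
from pathlib import Path


class PageChangedError(RuntimeError):
    """Raised when the target page changed after it was read."""


def atomic_update(path: Path, expected_mtime_ns: int, new_text: str) -> None:
    # Step 1: refuse to write if the file changed since it was read.
    if path.stat().st_mtime_ns != expected_mtime_ns:
        raise PageChangedError(f"{path} was modified by another program")
    # Step 2: write to a temp file in the same directory (same filesystem).
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(new_text)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Step 3: replace the target in a single atomic rename.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the temp file if it was never renamed into place.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Because the rename either happens completely or not at all, a reader of the page sees the old content or the new content, never a half-written file.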

Worker Dependencies

Background tasks run in a specific order:

Phase 1:
  - LLM Classification (immediate)
  - Embedding Model Loading (immediate)
    └─→ Page Indexing (waits for model)

Phase 2:
  - LLM Rewording (immediate)
  - RAG Search (waits for indexing)
    └─→ Integration Planning (waits for RAG)

Phase 3:
  - Decision Review (uses results from Phase 2)

The UI shows progress for all background tasks. You can navigate while workers run in the background.
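
The dependency chain above can be modelled with asyncio tasks that await their prerequisites. The sketch below is a simplified illustration with simulated work. The `step` helper and the task names are hypothetical, and it starts every worker at once rather than waiting for the user to move between phases.

```python
import asyncio


async def main() -> list[str]:
    log: list[str] = []

    async def step(name: str, *deps: asyncio.Task) -> None:
        # Wait for every prerequisite before doing this step's work.
        if deps:
            await asyncio.gather(*deps)
        await asyncio.sleep(0)  # stand-in for real work
        log.append(name)

    # Phase 1 workers
    classify = asyncio.create_task(step("classification"))
    model = asyncio.create_task(step("model loading"))
    index = asyncio.create_task(step("page indexing", model))
    # Phase 2 workers
    reword = asyncio.create_task(step("rewording"))
    rag = asyncio.create_task(step("rag search", index))
    plan = asyncio.create_task(step("integration planning", rag))
    await asyncio.gather(classify, reword, plan)
    return log


order = asyncio.run(main())
print(order)
```

Independent workers (classification, rewording) finish whenever they are ready, while indexing, RAG search, and integration planning always complete in that order.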

Development

Want to contribute or customize Logsqueak? See CLAUDE.md for developer documentation.

Quick Dev Commands

# Activate virtual environment (REQUIRED for all commands below)
source venv/bin/activate

# Run tests
pytest -v

# Run specific test suite
pytest tests/unit/ -v           # Unit tests only
pytest tests/integration/ -v    # Integration tests only
pytest tests/ui/ -v             # UI tests only

# Code quality
black src/ tests/               # Format code
ruff check src/ tests/          # Lint code
mypy src/                       # Type checking

# Coverage report
pytest --cov=logsqueak --cov=logseq_outline --cov-report=html -v

Project Structure

logsqueak/
├── src/
│   ├── logsqueak/              # Main application
│   │   ├── models/             # Data models (Pydantic)
│   │   ├── services/           # LLM, RAG, file operations
│   │   ├── tui/                # Interactive UI (Textual)
│   │   ├── wizard/             # Setup wizard
│   │   ├── cli.py              # CLI commands
│   │   └── config.py           # Configuration management
│   └── logseq-outline-parser/  # Logseq markdown parser library
├── tests/                      # Test suite (376 tests)
│   ├── unit/                   # Unit tests (241 tests)
│   ├── integration/            # Integration tests (97 tests)
│   └── ui/                     # UI tests (38 tests)
├── specs/                      # Feature specifications
│   ├── 002-logsqueak-spec/     # Interactive TUI spec (complete)
│   └── 003-setup-wizard/       # Setup wizard spec (complete)
├── test-graph/                 # Sample Logseq graph for testing
└── pyproject.toml              # Dependencies and configuration

Key Resources

CLAUDE.md - Developer guide, architecture, API docs

Getting Help

Bugs: GitHub Issues
Questions: GitHub Discussions
Documentation: See CLAUDE.md for developer docs

Acknowledgments

Built with:

Textual - Modern TUI framework
Ollama - Local LLM runtime
ChromaDB - Vector database for semantic search
sentence-transformers - Embedding models

Developed with assistance from Claude Code.
