Better brain. Knowledge management tool. Stop saving things you'll never read. Work in progress.

cooperability/BMX-bookmark-extractor

BMX (BookMark eXtractor)

BMX is a knowledge management system that turns bookmark collections into actionable knowledge through AI-powered analysis and graph-based exploration.

Quick Start (Local Development)

Prerequisites

  • Docker Desktop - Must be running before starting development
  • Git - Version control
  • Cursor or VS Code (recommended) - With Dev Containers extension for seamless development
  • WSL2 (Windows only) - For optimal Docker performance

Setup Methods

Method 1: DevContainer (Recommended for VS Code/Cursor Users)

git clone <repository-url>
cd BMX-bookmark-extractor

# Quick launch (opens Cursor/VS Code in devcontainer)
./scripts/open_devcontainer

# Or manually:
# 1. Ensure Docker Desktop is running first!
docker ps  # Should not return ENOENT error

# 2. Open in Cursor/VS Code
cursor .  # or: code .

# 3. Reopen in container when prompted
# Or manually: F1 → "Dev Containers: Reopen in Container"

The dev container will automatically:

  • Set up Python environment with Poetry
  • Configure SvelteKit frontend with all dependencies
  • Initialize Neo4j database
  • Install development tools and extensions

From the integrated terminal (inside the container):

# Backend (FastAPI)
poetry run uvicorn src.main:app --host 0.0.0.0 --port 8000 --reload
# Or: ./scripts-devcontainer/dev

# Frontend (SvelteKit)
cd /project/frontend && yarn dev

Method 2: Docker Compose (Works with Any Editor)

git clone <repository-url>
cd BMX-bookmark-extractor

# Start all services
./scripts/dc_up

# Backend (separate terminal)
./scripts/dc_exec backend poetry run uvicorn src.main:app --host 0.0.0.0 --port 8000 --reload

# Frontend (separate terminal)
./scripts/dc_exec frontend yarn dev

Troubleshooting

Common Issues:

Issue                       Solution
--------------------------  --------------------------------------------------
Docker not running          Start Docker Desktop, wait for green icon. Verify: docker ps
Port conflict (8000/3000)   docker compose down or netstat -ano | findstr :8000 to find conflicting process
WSL2 corruption             rm -rf ~/.vscode-server ~/.cursor-server then restart
Neo4j auth failed           Default credentials: neo4j / bmxpassword
Build too slow              Ensure .dockerignore exists in backend/ and frontend/
Can't connect to services   Check logs: docker compose logs -f <service>
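
The port-conflict check can also be done in a platform-independent way. The following is a minimal sketch using only the Python standard library; the helper name port_in_use is hypothetical and not part of BMX, and the ports are the ones named in the troubleshooting entry above.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8000 (backend) and 3000 (frontend) are taken from the troubleshooting entry.
    for port in (8000, 3000):
        state = "in use" if port_in_use(port) else "free"
        print(f"port {port}: {state}")
```

If a port is reported as in use before the BMX containers are started, another process is holding it and has to be stopped first.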

Nuclear Option (Fresh Start):

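# WARNING: "down -v" also deletes the named volumes declared in the Compose file
# (typically where database data lives), and "docker system prune -af" removes
# ALL unused images, stopped containers, networks, and build cache on this
# machine, not only the ones that belong to BMX.
docker compose down -v && docker system prune -af
rm -rf ~/.vscode-server ~/.cursor-server
docker compose up --build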

System Architecture

BMX pairs a hybrid database layer with a FastAPI backend and a SvelteKit frontend:

  • PostgreSQL for efficient full-text content storage and complex queries
  • Neo4j for relationship mapping and graph-based knowledge exploration
  • FastAPI backend for robust API and data processing
  • SvelteKit frontend for modern, responsive user interface
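
To illustrate the split between the two stores, here is a minimal sketch of how one analyzed bookmark might be divided: bulk text for the relational side, identifiers and relationships for the graph side. Everything named here (AnalyzedBookmark, the Document and Concept labels, the MENTIONS relationship) is a hypothetical choice for illustration, not the actual BMX schema, and no database driver is involved.

```python
from dataclasses import dataclass, field


@dataclass
class AnalyzedBookmark:
    """Hypothetical result of analyzing one bookmark."""
    url: str
    title: str
    full_text: str
    concepts: list[str] = field(default_factory=list)


def to_relational_row(doc: AnalyzedBookmark) -> dict:
    """Bulk text goes to the relational side (PostgreSQL in BMX)."""
    return {"url": doc.url, "title": doc.title, "full_text": doc.full_text}


# Hypothetical labels and relationship type. Values are passed as query
# parameters so they are never interpolated into the Cypher string.
MENTIONS_QUERY = (
    "MERGE (d:Document {url: $url}) "
    "MERGE (c:Concept {name: $concept}) "
    "MERGE (d)-[:MENTIONS]->(c)"
)


def to_graph_statements(doc: AnalyzedBookmark) -> list[tuple[str, dict]]:
    """Only identifiers and relationships go to the graph side (Neo4j in BMX)."""
    return [(MENTIONS_QUERY, {"url": doc.url, "concept": c}) for c in doc.concepts]
```

In this sketch the URL acts as the shared key: the graph holds only what is needed to traverse relationships, and the full text is looked up in the relational store when a node is opened.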

Key Features

  • Multi-Source Ingestion: Process bookmarks from browser exports, Anki flashcards, and direct web scraping
  • AI-Powered Analysis: Use Google Gemini API for intelligent content summarization and entity extraction
  • Graph-Based Knowledge Discovery: Visualize and explore relationships between concepts, documents, and ideas
  • Hybrid Storage Strategy: Optimize for both performance and cost with intelligent data distribution
  • Real-Time Processing: Stream-based ingestion and processing for immediate insights
  • Educational Content Integration: Structured learning materials processed through the knowledge graph for personalized learning paths
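
As an illustration of the browser-export path, the sketch below pulls titles and URLs out of a Netscape-style bookmark HTML file, the format most browsers export. It uses only the Python standard library; the Bookmark and BookmarkExportParser names are hypothetical and do not describe the BMX ingestion code.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import Optional


@dataclass
class Bookmark:
    title: str
    url: str


class BookmarkExportParser(HTMLParser):
    """Collect <A HREF=...> entries from a Netscape-style bookmark export."""

    def __init__(self) -> None:
        super().__init__()
        self.bookmarks: list[Bookmark] = []
        self._href: Optional[str] = None  # URL of the link being read, if any
        self._text: list[str] = []        # text fragments inside the current link

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.bookmarks.append(Bookmark(title=title, url=self._href))
            self._href = None


def parse_bookmarks(html: str) -> list[Bookmark]:
    """Return all links with an href found in the export, in document order."""
    parser = BookmarkExportParser()
    parser.feed(html)
    parser.close()
    return parser.bookmarks
```

Folder structure and attributes such as ADD_DATE are ignored here; a fuller version would likely keep them as metadata for the graph.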

Educational Content Strategy

BMX includes structured educational materials designed for both human learning and LLM knowledge graph integration:

Features:

  • Interactive Format: Jupyter notebooks with executable code examples
  • Structured Metadata: YAML frontmatter with learning objectives and prerequisites
  • Knowledge Graph Integration: Content processed through BMX pipeline to extract:
    • Programming concepts and relationships
    • Code-to-concept mappings
    • Progressive learning paths
  • Cross-Domain Connections: Links educational content with other knowledge domains
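
A minimal sketch of the first step in reading such metadata: separating a frontmatter block from the content that follows it. It assumes the frontmatter is delimited by lines containing only --- at the top of a text source (for example a notebook's first Markdown cell); where BMX actually stores the block is not specified here, and split_frontmatter is a hypothetical helper.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (frontmatter, body); frontmatter is '' when no block is present."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            # Everything between the two delimiters is the raw YAML text.
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    # An opening delimiter without a closing one is treated as plain content.
    return "", text
```

The returned frontmatter string is still raw YAML; it can be handed to a YAML parser (for example the third-party PyYAML package's yaml.safe_load) to obtain fields such as learning objectives and prerequisites.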

LLM Integration Goals: Educational content becomes queryable knowledge, enabling the system to:

  • Recommend personalized learning paths
  • Explain concepts with executable examples
  • Connect theoretical knowledge with practical implementation
  • Provide context-aware coding assistance
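
One way a learning path could be derived from prerequisite metadata is a topological sort over the prerequisite graph. The sketch below uses the standard library's graphlib (Python 3.9+); the learning_path function and the topic names in the usage note are hypothetical, and a real recommendation would presumably also weigh what the learner already knows.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def learning_path(prerequisites: dict[str, set[str]], target: str) -> list[str]:
    """Order the target topic and everything it depends on, prerequisites first."""
    # Collect only the topics reachable from the target.
    needed: dict[str, set[str]] = {}
    stack = [target]
    while stack:
        topic = stack.pop()
        if topic in needed:
            continue
        needed[topic] = set(prerequisites.get(topic, ()))
        stack.extend(needed[topic])
    # TopologicalSorter takes {node: predecessors} and raises CycleError
    # if the prerequisites contain a cycle.
    return list(TopologicalSorter(needed).static_order())
```

For example, with prerequisites {"fastapi": {"python-basics", "http"}}, calling learning_path(prerequisites, "fastapi") returns the two prerequisites in some order followed by "fastapi"; unrelated topics are left out.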

Documentation Guide

Find what you need based on your goal:

🚀 Getting Started

📋 Planning & Development

🏗️ Architecture & Design

🔌 Integration Guides

📚 Additional Documentation

🔧 Component Documentation

  • Backend - FastAPI application, Poetry dependencies, Docker configuration
  • Frontend - SvelteKit application, npm dependencies, UI components
  • DevContainer - Development environment setup and tools
  • Scripts - Helper scripts for Docker Compose operations

Project Status

Current Phase: Early development with foundational architecture established
Next Milestone: Neo4j integration and hybrid storage implementation
Target: Production-ready MVP with Anki data ingestion and basic graph visualization

BMX aims to turn scattered bookmarks and saved information into a single, explorable knowledge graph that helps users find connections between the things they have saved.

License

See the LICENSE file for license details.
