MattKotzbauer/kafka

Kafka

Fast, vim-like language learning.

Features

  • Vim-style navigation: Move through text with vim keybindings
  • Vocabulary tracking: Track word familiarity across 6 stages (unseen to known)
  • Multiple import formats: PDF, EPUB, plain text
  • AI import and cleanup: OCR for non-text sources, plus AI cleanup of OCR errors in the extracted text
  • Fast and responsive: Built with Svelte

Quick Start

Prerequisites

  • Python 3.10+
  • Node.js 18+
  • (Optional) Tesseract OCR for PDF text extraction. On Debian/Ubuntu: sudo apt install tesseract-ocr tesseract-ocr-chi-sim (the second package adds Simplified Chinese language data)
  • (Optional) ANTHROPIC_API_KEY environment variable for AI text cleanup

Running the Application

# Make start script executable
chmod +x start.sh

# Start both backend and frontend
./start.sh

Or run them separately:

# Terminal 1: Backend
cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python run.py

# Terminal 2: Frontend
cd frontend
npm install
npm run dev

Then open http://localhost:5173 in your browser.

Vim Keybindings

Navigation

Key                Action
h / l              Previous / next word
j / k              Down / up (by line)
w / b              Next / previous word
W / B              Next / previous sentence
gg                 Go to beginning
G                  Go to end
Ctrl+f / Ctrl+b    Page down / up
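
The one binding in the navigation table that needs more than a lookup is gg, because the first g has to be held until the next key arrives. The sketch below shows one way to handle that. It is illustrative only: the project's real handler lives in the frontend, and the class name, action names, and the retry-after-dead-end behavior here are assumptions, not the project's code.

```python
from typing import Optional

# Bindings copied from the navigation table; the action names are hypothetical.
# Each binding is a tuple of key events, so "gg" is two events and "Ctrl+f" is one.
KEYMAP: dict[tuple[str, ...], str] = {
    ("h",): "prev_word", ("l",): "next_word",
    ("j",): "line_down", ("k",): "line_up",
    ("w",): "next_word", ("b",): "prev_word",
    ("W",): "next_sentence", ("B",): "prev_sentence",
    ("g", "g"): "go_to_start", ("G",): "go_to_end",
    ("Ctrl+f",): "page_down", ("Ctrl+b",): "page_up",
}


class KeyDispatcher:
    """Resolve key events to actions, buffering prefixes such as a lone 'g'."""

    def __init__(self, keymap: dict[tuple[str, ...], str]):
        self.keymap = keymap
        self.pending: tuple[str, ...] = ()

    def press(self, key: str) -> Optional[str]:
        seq = self.pending + (key,)
        if seq in self.keymap:
            self.pending = ()
            return self.keymap[seq]
        # Keep waiting if seq is the start of a longer binding.
        if any(binding[:len(seq)] == seq for binding in self.keymap):
            self.pending = seq
            return None
        # Dead end: drop the buffer and, if there was one, retry the key alone.
        had_prefix = bool(self.pending)
        self.pending = ()
        return self.press(key) if had_prefix else None
```

With this design, pressing g then g yields go_to_start, while g followed by j discards the stray g and still moves down a line.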

Vocabulary

Key    Action
0      Mark as unseen
1      Mark as unknown (light yellow)
2-4    Learning stages (darker yellow)
5      Mark as known (blue)
x      Mark as irrelevant

General

Key    Action
i      Import file (in library)
q      Close reader
?      Toggle help
Esc    Close dialogs

Vocabulary Stages

Stage    Color            Description
0        None             Unseen
1        Light yellow     Seen but unknown
2-4      Darker yellow    Learning stages
5        Blue             Known
-1       None             Irrelevant (names, etc.)
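
The stage numbers and colors in the table above can be written down as a small data model. This is a minimal sketch, assuming stages are stored as plain integers. The enum, its member names, and both helper functions are hypothetical and are not taken from the project's models.py.

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """Familiarity stages, numbered as in the stages table."""
    IRRELEVANT = -1  # names, etc.
    UNSEEN = 0
    UNKNOWN = 1      # seen but unknown
    LEARNING_2 = 2
    LEARNING_3 = 3
    LEARNING_4 = 4
    KNOWN = 5


def highlight(stage: Stage) -> Optional[str]:
    """Return the highlight color name for a stage, or None for no highlight."""
    if stage in (Stage.IRRELEVANT, Stage.UNSEEN):
        return None
    if stage == Stage.UNKNOWN:
        return "light yellow"
    if stage == Stage.KNOWN:
        return "blue"
    return "darker yellow"  # stages 2-4


def stage_for_key(key: str) -> Optional[Stage]:
    """Map a vocabulary key press ('0'-'5' or 'x') to a stage, else None."""
    if key == "x":
        return Stage.IRRELEVANT
    if len(key) == 1 and key in "012345":
        return Stage(int(key))
    return None
```

Using an IntEnum keeps the values interchangeable with the integers that appear in URLs such as /vocabulary/{word}/stage/{stage}.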

Project Structure

lingq_copy/
├── backend/           # FastAPI Python backend
│   ├── app/
│   │   ├── main.py   # API entry point
│   │   ├── models.py # Database models
│   │   ├── routers/  # API endpoints
│   │   └── services/ # Business logic
│   └── data/         # SQLite DB and content files
├── frontend/          # Svelte TypeScript frontend
│   └── src/
│       ├── components/
│       └── lib/      # API client, stores, vim handler
└── context/          # Project documentation

API Endpoints

  • GET /vocabulary/ - List vocabulary words
  • PUT /vocabulary/{word}/stage/{stage} - Set word stage
  • GET /content/ - List imported content
  • GET /content/{id} - Get content with text and word positions
  • POST /import/upload - Upload PDF/EPUB/TXT file
  • POST /import/text - Import raw text

Full interactive API documentation (generated by FastAPI) is available at http://localhost:8000/docs while the backend is running.
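
As a rough illustration of calling these endpoints from a script, the sketch below builds the PUT request for setting a word's stage and fetches the vocabulary list, using only the standard library. It assumes the backend is reachable at http://localhost:8000 (the host and port of the docs URL) and that responses are JSON. The function names are hypothetical, and the response shape is not specified here.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: same host and port as the /docs URL.
BASE_URL = "http://localhost:8000"


def set_stage_request(word: str, stage: int, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) PUT /vocabulary/{word}/stage/{stage}."""
    # Percent-encode the word so non-ASCII text (e.g. Chinese) is URL-safe.
    url = f"{base_url}/vocabulary/{quote(word, safe='')}/stage/{stage}"
    return Request(url, method="PUT")


def list_vocabulary(base_url: str = BASE_URL):
    """Send GET /vocabulary/ and decode the body, assumed to be JSON."""
    with urlopen(f"{base_url}/vocabulary/") as resp:
        return json.load(resp)
```

To actually mark a word, pass the built request to urlopen, for example urlopen(set_stage_request("学习", 3)), with the backend running.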

Technology Stack

  • Frontend: Svelte + TypeScript + Vite
  • Backend: Python + FastAPI
  • Database: SQLite
  • Chinese tokenization: jieba
  • PDF parsing: PyMuPDF + pytesseract (OCR)
  • EPUB parsing: ebooklib
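
To connect tokenization with the "word positions" returned by GET /content/{id}, here is a minimal sketch of turning a token list into character offsets. It assumes positions are (start, end) character offsets into the original text and that the tokens appear in order, as they would from a tokenizer such as jieba. The function name and output shape are hypothetical and may differ from what the backend stores.

```python
def word_positions(text: str, tokens: list[str]) -> list[tuple[str, int, int]]:
    """Locate each token in text, returning (token, start, end) offsets.

    Tokens must occur in order; characters between tokens (e.g. whitespace
    dropped by the tokenizer) are skipped.
    """
    positions = []
    cursor = 0
    for token in tokens:
        start = text.find(token, cursor)
        if start == -1:
            raise ValueError(f"token {token!r} not found after offset {cursor}")
        end = start + len(token)  # end offset is exclusive
        positions.append((token, start, end))
        cursor = end
    return positions
```

Offsets like these let a reader view highlight each word by stage without re-tokenizing the text on every render.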
