Live Demo: https://kvbgkvw4mehwhhdjt7crrg.streamlit.app/
LexTransition AI is an open-source, offline-first legal assistant. It helps users navigate the transition from the old Indian laws (IPC/CrPC/IEA) to the new BNS/BNSS/BSA frameworks. Using local machine learning and OCR, it analyzes legal documents and maps law sections, grounding every answer in official citations.
- 🔄 The Law Transition Mapper: The core engine that maps old IPC sections to new BNS equivalents. It highlights specific changes in wording, penalties, and scope.
- 🖼️ Multimodal Document Analysis (OCR): Upload photos of legal notices or FIRs. The system extracts text using local OCR and explains "action items" in simple language.
- 📚 Grounded Fact-Checking: Every response is backed by official citations. The AI identifies the exact Section, Chapter, and Page from the official Law PDFs to prevent hallucinations.
To ensure privacy and offline accessibility, this project can be configured to run without external APIs:
- Backend: Python, LangChain/LlamaIndex.
- OCR: EasyOCR or PyTesseract (Local engines).
- Vector DB: ChromaDB or FAISS (Local storage instead of Pinecone/Milvus).
- Local LLM: Llama 3 or Mistral via Ollama or LM Studio (Runs on your GPU/CPU).
- Frontend: Streamlit Dashboard.
```
LexTransition-AI/
├── app.py               # Streamlit UI
├── requirements.txt     # Local ML libraries
├── engine/
│   ├── ocr_processor.py # Local OCR logic
│   ├── mapping_logic.py # IPC to BNS mapping dictionary
│   └── rag_engine.py    # Local Vector Search logic
└── models/              # Local LLM weights (quantized)
```

The easiest way to run LexTransition-AI is with Docker. This handles all dependencies (including Tesseract OCR and system libraries) automatically.
- Clone the repository:

  ```bash
  git clone https://github.com/centiceron/LexTransition-AI.git
  cd LexTransition-AI
  ```

- Build the Docker image:

  ```bash
  docker build -t lextransition .
  ```

- Run the application:

  ```bash
  docker run -p 8501:8501 lextransition
  ```

- Open the app at http://localhost:8501.
- Streamlit UI (app.py) — implemented (interactive pages for Mapper, OCR, Fact-check).
- OCR — local helper supporting EasyOCR and pytesseract (install the system Tesseract binary for pytesseract); a fallback sketch follows this list.
- IPC→BNS Mapping — in-memory mapping with fuzzy match; UI supports adding mappings at runtime.
- Grounded Fact-Check — simple PDF ingestion and page-level keyword search using pdfplumber (add PDFs to ./law_pdfs via UI).
- RAG/LLM & full offline guarantees — NOT implemented yet (placeholders/stubs present).
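A minimal sketch of the OCR fallback described above; the function name and structure are illustrative, not the project's actual API:

```python
def extract_text(image_path: str) -> str:
    """Try EasyOCR first; fall back to pytesseract if it is not installed."""
    try:
        import easyocr  # pip install easyocr
        reader = easyocr.Reader(["en"], gpu=False)  # CPU-only, works offline
        # readtext() yields (bbox, text, confidence) tuples; keep the text.
        return " ".join(text for _, text, _ in reader.readtext(image_path))
    except ImportError:
        import pytesseract  # needs the system tesseract binary (see setup below)
        from PIL import Image
        return pytesseract.image_to_string(Image.open(image_path))
```

Note that EasyOCR downloads its model weights on first use, so run it once while online if you need the app fully offline afterwards.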
- Install Python dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- (Optional) Install the Tesseract binary for pytesseract:
  - Ubuntu:

    ```bash
    sudo apt install tesseract-ocr
    ```

  - macOS (Homebrew):

    ```bash
    brew install tesseract
    ```

- Launch:

  ```bash
  streamlit run app.py
  ```
To use Grounded Fact-Check, upload law PDFs on the Fact-Check page (or drop them into `./law_pdfs`) and click "Verify with Law PDFs".
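For reference, a minimal sketch of that page-level keyword search using pdfplumber; the function name and snippet window are assumptions, not the project's actual code:

```python
import pdfplumber
from pathlib import Path

def find_citations(keyword: str, pdf_dir: str = "./law_pdfs"):
    """Return (file, page number, snippet) for every page mentioning the keyword."""
    hits = []
    for pdf_path in Path(pdf_dir).glob("*.pdf"):
        with pdfplumber.open(pdf_path) as pdf:
            for page in pdf.pages:
                text = page.extract_text() or ""
                idx = text.lower().find(keyword.lower())
                if idx != -1:
                    snippet = text[max(0, idx - 60): idx + 60]
                    hits.append((pdf_path.name, page.page_number, snippet))
    return hits
```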
- Mappings are persisted to `mapping_db.json` (in the project root). You can add mappings in the UI; they are saved to this file. A sketch of this store follows the next step.
- Run tests:

  ```bash
  pip install -r requirements.txt
  pytest -q
  ```
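As promised above, an illustrative sketch of a JSON-backed mapping store with fuzzy lookup, using only the standard library; the helper names are hypothetical:

```python
import json
from difflib import get_close_matches
from pathlib import Path

DB_PATH = Path("mapping_db.json")

def load_mappings() -> dict:
    return json.loads(DB_PATH.read_text()) if DB_PATH.exists() else {}

def save_mapping(ipc_section: str, bns_section: str) -> None:
    mappings = load_mappings()
    mappings[ipc_section] = bns_section
    DB_PATH.write_text(json.dumps(mappings, indent=2))

def lookup(ipc_section: str) -> str | None:
    mappings = load_mappings()
    if ipc_section in mappings:
        return mappings[ipc_section]
    # Tolerate near-miss queries, e.g. "IPC420" vs "IPC 420".
    close = get_close_matches(ipc_section, list(mappings), n=1, cutoff=0.8)
    return mappings[close[0]] if close else None
```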
Use `scripts/ocr_benchmark.py` with a CSV dataset (columns: `image_path,ground_truth`) to compute:
- Character Error Rate (CER)
- Keyword Recall
Example:

```bash
python scripts/ocr_benchmark.py --dataset data/ocr_dataset.csv --report ocr_report.md
```
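For clarity, Character Error Rate is the edit (Levenshtein) distance between the OCR output and the ground truth, divided by the ground-truth length; a self-contained sketch of the metric (not the benchmark script itself):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def cer(prediction: str, ground_truth: str) -> float:
    return levenshtein(prediction, ground_truth) / max(1, len(ground_truth))
```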
- Install (optional):

  ```bash
  pip install sentence-transformers numpy faiss-cpu
  ```

- Enable:

  ```bash
  export LTA_USE_EMBEDDINGS=1
  ```

- The index persists in `./vector_store`.
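A minimal sketch of building such an index with sentence-transformers and FAISS; the model choice and file layout are assumptions:

```python
import os

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small, CPU-friendly model

def build_index(passages: list[str]) -> faiss.IndexFlatIP:
    # normalize_embeddings=True makes inner product equal cosine similarity.
    vectors = model.encode(passages, normalize_embeddings=True)
    index = faiss.IndexFlatIP(vectors.shape[1])
    index.add(np.asarray(vectors, dtype="float32"))
    os.makedirs("./vector_store", exist_ok=True)
    faiss.write_index(index, "./vector_store/index.faiss")
    return index
```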
- Configure:

  ```bash
  export LTA_OLLAMA_URL=http://localhost:11434
  ```

- The app will use this endpoint for better plain-language summaries.
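A minimal sketch of calling that endpoint through Ollama's standard /api/generate API; the model name and prompt wording are assumptions:

```python
import os
import requests

OLLAMA_URL = os.environ.get("LTA_OLLAMA_URL", "http://localhost:11434")

def summarise(text: str, model: str = "llama3") -> str:
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": model,
              "prompt": f"Explain this legal text in plain language:\n\n{text}",
              "stream": False},  # return one JSON object instead of a stream
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```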
- A GitHub Actions workflow (`lextransition-ci.yml`) runs pytest for the project on PRs.
- Replace page-level keyword search with embeddings + vector store (Chroma/FAISS) + exact citation offsets.
- Add persistent mapping DB + import tools for official IPC→BNS mappings.
- Integrate local LLM for summaries/explanations (Ollama / LM Studio).
- Add tests and CI for engine modules.