A Java-based Retrieval-Augmented Generation (RAG) system that combines document ingestion, vector storage, and language model inference to provide contextual question-answering capabilities.
This project implements a complete RAG pipeline using:
- Document Ingestion: Loads and processes text documents
- Vector Storage: Uses Milvus for storing document embeddings
- Embedding Generation: Python-based API using SentenceTransformer models
- Language Model: Integration with Ollama for text generation
- Retrieval System: Vector similarity search for relevant context
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    Document     │    │    Embedding    │    │     Vector      │
│    Ingestion    │───▶│   Generation    │───▶│     Storage     │
│                 │    │  (Python API)   │    │    (Milvus)     │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                                       │
                                                       ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│     Answer      │    │    Language     │    │     Context     │
│   Generation    │◀───│      Model      │◀───│    Retrieval    │
│                 │    │    (Ollama)     │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```
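The two flows in the diagram can be sketched in plain Java. The lambdas below are illustrative stand-ins for the real `RemoteEmbedder`, `VectorRetriever`, and Ollama client, and the prompt format is an assumption, not the one used in `RagPipeline.java`:

```java
import java.util.List;
import java.util.function.Function;

public class RagFlowSketch {

    // Query flow: question -> embedding -> nearest chunks -> prompt -> answer.
    static String answer(String question,
                         Function<String, float[]> embed,          // stand-in for RemoteEmbedder
                         Function<float[], List<String>> retrieve, // stand-in for VectorRetriever
                         Function<String, String> generate) {      // stand-in for the Ollama model
        float[] queryVector = embed.apply(question);
        List<String> context = retrieve.apply(queryVector);
        String prompt = "Context:\n" + String.join("\n", context)
                + "\n\nQuestion: " + question;
        return generate.apply(prompt);
    }

    public static void main(String[] args) {
        // Stub implementations so the sketch runs without any external services.
        String out = answer("What is Milvus?",
                q -> new float[] {0.1f, 0.2f},
                v -> List.of("Milvus is a vector database."),
                prompt -> "LLM saw: " + prompt);
        System.out.println(out);
    }
}
```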
- Document Processing: Load and process plain-text documents
- Semantic Search: Vector-based similarity search using embeddings
- Context-aware Responses: Generate answers based on retrieved relevant context
- Modular Design: Separate components for ingestion, retrieval, and generation
- External LLM Integration: Uses Ollama for language model inference
- Scalable Vector Storage: Milvus database for efficient vector operations
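The semantic-search feature ultimately reduces to comparing embedding vectors. Milvus performs this search at scale; the cosine-similarity sketch below just illustrates the idea and is not part of the project:

```java
public class CosineSimilarity {

    // Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means identical direction.
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Identical vectors score 1.0; orthogonal vectors score 0.0.
        System.out.println(cosine(new float[] {1f, 0f}, new float[] {1f, 0f}));
        System.out.println(cosine(new float[] {1f, 0f}, new float[] {0f, 1f}));
    }
}
```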
- Java 11 or higher
- Maven 3.6+
- Milvus: Vector database (default: localhost:19530)
- Ollama: Language model server (default: localhost:11434)
- Python Embedding API: SentenceTransformer service (default: localhost:5005)
```
git clone <repository-url>
cd Java-Rag-System
```

Build the Java project:

```
mvn clean install
```

Start the Python embedding service:

```
cd src/embedding-api
pip install flask sentence-transformers
python embedding_api.py
```

Follow the Milvus installation guide or use Docker:

```
docker run -d --name milvus -p 19530:19530 milvusdb/milvus:latest
```

Install Ollama from https://ollama.ai and pull the required model:

```
ollama pull gemma3:1b
```

First, run the ingestion process to load documents into the vector database:

```
mvn exec:java -Dexec.mainClass="ingestion.App"
```

Then start the main RAG application:

```
mvn exec:java -Dexec.mainClass="LLM.AppRag"
```

The system will prompt you to enter questions, and it will:
- Retrieve relevant context from the vector database
- Generate contextual answers using the language model
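The question loop in `AppRag` presumably resembles the sketch below; the pipeline is injected here as a plain function so the sketch runs standalone, and the real class and method names may differ:

```java
import java.util.Scanner;
import java.util.function.Function;

public class AskLoop {

    // Reads questions line by line until "exit", printing an answer for each.
    static void run(Scanner in, Function<String, String> pipeline, StringBuilder out) {
        while (in.hasNextLine()) {
            String question = in.nextLine().trim();
            if (question.equalsIgnoreCase("exit")) break;
            out.append("Answer: ").append(pipeline.apply(question)).append('\n');
        }
    }

    public static void main(String[] args) {
        // Canned input so the demo runs non-interactively; the stub pipeline echoes.
        StringBuilder out = new StringBuilder();
        run(new Scanner("hello\nexit\n"), q -> "echo " + q, out);
        System.out.print(out);
    }
}
```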
```
src/
├── main/java/
│   ├── ingestion/              # Document processing and vector storage
│   │   ├── App.java            # Main ingestion application
│   │   ├── SimpleDocumentLoader.java
│   │   ├── RemoteEmbedder.java
│   │   ├── MilvusVectorStore.java
│   │   └── MilvusConnection.java
│   ├── retrieval/              # Vector search and context retrieval
│   │   └── VectorRetriever.java
│   └── LLM/                    # Language model integration
│       ├── AppRag.java         # Main RAG application
│       └── RagPipeline.java    # RAG pipeline orchestration
├── embedding-api/              # Python embedding service
│   └── embedding_api.py
└── resources/
    └── doc1.txt                # Sample documents
```
Ollama (text generation):
- Model: gemma3:1b (configurable in RagPipeline.java)
- Temperature: 0.2
- Base URL: http://localhost:11434
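A sketch of how `RagPipeline.java` might wire these settings into the LangChain4J Ollama client; builder option names can vary between LangChain4J versions, so treat this as a configuration fragment rather than the project's actual code:

```java
import dev.langchain4j.model.ollama.OllamaChatModel;

public class ModelConfig {
    static OllamaChatModel buildModel() {
        return OllamaChatModel.builder()
                .baseUrl("http://localhost:11434") // Ollama server
                .modelName("gemma3:1b")            // pulled via `ollama pull gemma3:1b`
                .temperature(0.2)                  // low temperature for grounded answers
                .build();
    }
}
```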
Milvus (vector storage):
- Host: 127.0.0.1
- Port: 19530
- Collection: Automatically managed

Embedding API:
- Model: all-MiniLM-L6-v2
- Endpoint: http://127.0.0.1:5005/embed
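A sketch of what `RemoteEmbedder` likely sends to that endpoint, using only the JDK's HTTP client. The `texts` field name is an assumption; check `embedding_api.py` for the actual request contract:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class EmbedRequestSketch {

    // Builds (but does not send) a POST to the embedding endpoint.
    static HttpRequest buildRequest(String text) {
        String body = "{\"texts\": [\"" + text + "\"]}"; // assumed request shape
        return HttpRequest.newBuilder()
                .uri(URI.create("http://127.0.0.1:5005/embed"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = buildRequest("hello world");
        System.out.println(req.method() + " " + req.uri());
    }
}
```

Sending it with `HttpClient.newHttpClient().send(...)` would return the embedding vector as JSON, which the project parses with Gson.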
Java (Maven):
- LangChain4J: Framework for LLM applications
- Ollama Integration: Language model client
- Milvus SDK: Vector database client
- Gson: JSON processing

Python (embedding service):
- Flask: Web framework for embedding API
- SentenceTransformers: Pre-trained embedding models
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Free to use for educational purposes.
- Connection refused: Ensure all external services (Milvus, Ollama, Python API) are running
- Model not found: Make sure to pull the required Ollama model: `ollama pull gemma3:1b`
- Port conflicts: Check if default ports (19530, 11434, 5005) are available
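A quick way to probe all three services before digging further is a plain TCP connect check (a diagnostic helper, not part of the project):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ServiceCheck {

    // Returns true if a TCP connection to host:port succeeds within 500 ms.
    static boolean isUp(String host, int port) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), 500);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Milvus    (19530): " + isUp("127.0.0.1", 19530));
        System.out.println("Ollama    (11434): " + isUp("127.0.0.1", 11434));
        System.out.println("Embeddings (5005): " + isUp("127.0.0.1", 5005));
    }
}
```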
For issues and questions, please create an issue in the repository.