A minimal template built with Java 21 and Spring Boot 4.0.3, designed to create modern agents that handle and orchestrate AI requests with high performance and efficiency.
- Spring Boot 4.0.3: Latest version with enhanced performance and security.
- Java 21 virtual threads: Lightweight concurrency for massively scalable services.
- WebFlux reactive stack: Non-blocking I/O for optimal resource utilization.
- REST controllers: Clean, minimal endpoints for agent interactions.
- Health check: `/actuator/health` endpoint ready for orchestration tools.
- 100% test coverage: Comprehensive unit tests with JaCoCo reporting.
- Clean architecture: Separated concerns with controllers, components, and models.
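To illustrate the virtual-threads feature in isolation, here is a minimal standalone sketch in plain Java 21 (not taken from the template's code): each blocking task gets its own lightweight thread, so tens of thousands of concurrent tasks stay cheap.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadsDemo {

    // Runs n blocking tasks, each on its own virtual thread, and returns
    // how many completed.
    static int runTasks(int n) {
        AtomicInteger done = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < n; i++) {
                executor.submit(() -> {
                    Thread.sleep(10); // simulated blocking I/O
                    done.incrementAndGet();
                    return null;
                });
            }
        } // close() waits for all submitted tasks to finish
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(10_000) + " tasks completed on virtual threads");
    }
}
```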
```
minimal-java-agent/
├── src/
│   ├── main/
│   │   ├── java/com/example/agent/
│   │   │   ├── AgentApplication.java
│   │   │   ├── controller/
│   │   │   │   └── AgentController.java
│   │   │   ├── component/
│   │   │   │   └── AgentInfoProvider.java
│   │   │   └── model/
│   │   │       └── ChatResponse.java
│   │   └── resources/
│   │       └── prompts/
│   │           └── system.st
│   └── test/
│       └── java/com/example/agent/
│           ├── AgentApplicationTest.java
│           ├── controller/
│           │   └── AgentControllerTest.java
│           ├── component/
│           │   └── AgentInfoProviderTest.java
│           └── model/
│               └── ChatResponseTest.java
├── build.gradle
├── Dockerfile
└── README.md
```
- Java 21 or later
- Gradle (required for local builds; no wrapper is included)
- Docker (optional, for containerized deployment)
No Gradle wrapper is included; the project expects a global Gradle installation. This keeps the repository smaller and lets you use your preferred Gradle version.
```shell
# Clone the repository
git clone https://github.com/carlosquijano/minimal-java-agent.git
cd minimal-java-agent

# Build the project
gradle build

# Run the application
gradle bootRun
```

The included Dockerfile uses multi-stage builds for optimal image size:
```dockerfile
# Build stage
FROM gradle:9.3.1-jdk21-alpine AS builder
WORKDIR /app
COPY build.gradle settings.gradle ./
COPY src ./src
RUN gradle bootJar --no-daemon

# Run stage
FROM eclipse-temurin:21-jre-alpine
RUN addgroup -S spring && adduser -S spring -G spring
USER spring:spring
COPY --from=builder /app/build/libs/*.jar agent.jar
EXPOSE 8080
ENTRYPOINT ["java", "-Xmx256m", "-jar", "/agent.jar"]
```

```shell
# Build Docker image
docker build -t minimal-java-agent .

# Run container
docker run -p 8080:8080 minimal-java-agent
```

The system prompt defines your agent's personality, role, and purpose. This template loads it in two layers, concatenated in order at startup:
| Layer | Source | When loaded |
|---|---|---|
| Base | `src/main/resources/prompts/system.st` | Always; defines the agent's core identity |
| Extension | File path set via `AGENT_SYSTEM_PROMPT_PATH` | When the env var is set; appends role-specific behavior |
Separating the two layers gives you flexibility at different levels:
- Base prompt: changing it requires modifying the source and rebuilding the image. Use it for stable, structural instructions that define what kind of agent this is.
- Extension prompt: injected at runtime from a mounted file, no rebuild needed. Use it for role-specific behavior that may vary per deployment or instance.
```shell
docker run -p 8080:8080 \
  -e AGENT_SYSTEM_PROMPT_PATH=file:/app/prompts/extension.st \
  -v ./my-prompts:/app/prompts:ro \
  minimal-java-agent
```

Or with Docker Compose:

```yaml
agent-a:
  image: minimal-java-agent
  environment:
    AGENT_SYSTEM_PROMPT_PATH: file:/app/prompts/extension.st
  volumes:
    - ./prompts/agent-a:/app/prompts:ro
agent-b:
  image: minimal-java-agent
  environment:
    AGENT_SYSTEM_PROMPT_PATH: file:/app/prompts/extension.st
  volumes:
    - ./prompts/agent-b:/app/prompts:ro
```

If `AGENT_SYSTEM_PROMPT_PATH` is not set, only the base prompt is used. The agent starts normally either way.
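As a rough sketch of the two-layer loading described above (a hypothetical helper in plain Java; the template's actual wiring, e.g. through Spring resource loading, may differ):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper illustrating the two-layer prompt loading; the
// template's actual implementation may differ.
public class SystemPromptLoader {

    // basePrompt: contents of src/main/resources/prompts/system.st
    // extensionPath: value of AGENT_SYSTEM_PROMPT_PATH, or null when unset
    static String load(String basePrompt, String extensionPath) {
        if (extensionPath == null || extensionPath.isBlank()) {
            return basePrompt; // base prompt only; the agent starts normally
        }
        // Strip the "file:" prefix before reading from disk
        String path = extensionPath.startsWith("file:")
                ? extensionPath.substring("file:".length())
                : extensionPath;
        try {
            // Base first, then the role-specific extension appended
            return basePrompt + "\n\n" + Files.readString(Path.of(path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```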
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/chat` | Send a message to the agent |
```shell
# Send a message
curl -X POST http://localhost:8080/api/chat \
  -H "Content-Type: text/plain" \
  -d "Hello, agent!"
```

This project uses Ollama to run language models 100% locally, offline, and private.
```shell
# 1. Run Ollama container with sufficient memory
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama --memory=2g ollama/ollama

# 2. Pull a lightweight model
docker exec -it ollama ollama pull llama3.2:1b-q4_0

# 3. Start your Spring Boot application
gradle bootRun
```

The `llama3.2:1b-q4_0` model is well balanced for development: fast responses (~30-40 tokens/sec) with minimal resource usage. For higher quality, try `llama3.2:3b` (needs ~2.2 GB RAM).
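Under the hood, requests reach Ollama over plain HTTP. As a standalone illustration using only the JDK's `HttpClient` (this bypasses the template's own model wiring; the `/api/generate` endpoint and fields follow Ollama's public API):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class OllamaClientSketch {

    // Builds the JSON body for Ollama's /api/generate endpoint
    // (stream:false requests a single, complete response).
    static String generateBody(String model, String prompt) {
        return "{\"model\":\"" + model + "\",\"prompt\":\"" + prompt + "\",\"stream\":false}";
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:11434/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        generateBody("llama3.2:1b-q4_0", "Hello, agent!")))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

This requires the Ollama container from step 1 to be running on port 11434.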
Send a message:

```shell
curl -X POST http://localhost:8080/api/chat \
  -H "Content-Type: text/plain" \
  -d "¡Hola Mundo!"
```

Expected response:
```json
{
  "instanceId": "e9c2",
  "agentName": "java-agent",
  "thread": "Thread[#42,reactor-http-nio-9,5,main]",
  "response": "¡Hola Mundo! ¿En qué puedo ayudarte hoy?"
}
```

Notes on the above:
- The Java agent talked to Ollama running locally
- Llama 3.2 generated a contextual response shaped by the system prompt
- Everything ran 100% offline in your local environment
- Virtual threads handled the reactive flow seamlessly
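The fields in the sample JSON map naturally onto a small record; a sketch of what `ChatResponse.java` could look like (field names inferred from the response above, so the actual class may differ):

```java
// Sketch of the ChatResponse model; fields inferred from the sample JSON,
// not copied from the template's source.
public record ChatResponse(
        String instanceId, // short id distinguishing agent instances
        String agentName,  // logical agent name, e.g. "java-agent"
        String thread,     // the thread that handled the request
        String response    // the model's reply
) {
    // Convenience factory capturing the current thread automatically
    static ChatResponse of(String instanceId, String agentName, String response) {
        return new ChatResponse(instanceId, agentName,
                Thread.currentThread().toString(), response);
    }
}
```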
```shell
# Run tests
gradle test

# Generate coverage report
gradle jacocoTestReport

# View coverage report
open build/reports/jacoco/test/html/index.html
```

The application exposes health and metrics endpoints via Spring Boot Actuator:
- `/actuator/health` - Application health status
- `/actuator/info` - Application information
Contributions, issues, and feature requests are welcome! Feel free to check the issues page.
This project is Apache 2.0 licensed.
Suggestions on how to either minimize or enhance this further are welcome!