A simple memory-enabled LLM Agent capable of handling tool calls. It communicates with large language models through the Ollama API and can use predefined tools (weather, time, reservation, etc.).
This project is designed to run on Linux systems (tested on Ubuntu 24.04.3 LTS) and requires the following dependencies:
- ✅ Node.js v18.19.1 – Installed and available in your PATH
- 📦 npm v9.2.0 – Comes bundled with Node.js, required for package management
- 🐳 Docker v28.3.3 (Docker Desktop) – Container runtime environment
git clone https://github.com/EmreMutlu99/Llama-Based-Agent.git
cd Llama-Based-Agent
npm install
cd ollama
docker compose up -d
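To confirm the Ollama container is reachable before running the demos, a quick sanity check against Ollama's HTTP API can help. The script below is a small sketch (it is not part of the repository) and assumes Ollama's default port 11434; adjust it if your compose file maps a different port.
// check-ollama.js – hypothetical sanity check that the Ollama API is reachable
(async () => {
  try {
    const res = await fetch('http://localhost:11434/api/tags'); // lists locally pulled models
    const { models } = await res.json();
    console.log('Ollama is up. Models:', models.map((m) => m.name).join(', ') || '(none pulled yet)');
  } catch (err) {
    console.error('Ollama is not reachable:', err.message);
    process.exit(1);
  }
})();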
- You can run one of the following demo scripts depending on what you want to test:
# General demo
node src/main.js
# Direct tool call tests (weather, time, reservation)
node src/tools/tools-test.js
# Conversation threads + memory test
node src/threads-test.js
mkdir demo-app && cd demo-app
npm init -y
npm i ../Llama-Based-Agent
// index.js
const { Agent, SimpleMemory } = require('llama-based-agent');
(async () => {
// Initialize the agent
const agent = new Agent();
// Create a new thread
const { thread_id } = await agent.new_thread({ source: 'cli-demo', label: 'full-flow' });
console.log('THREAD:', thread_id);
// --- (Memory) Introduce yourself and state your name
let q = "Hi! My name is Omer. Keep replies short.";
let r = await agent.generate({ input: q, thread_id });
console.log('Q1:', q);
console.log('A1:', r.text);
// --- (Memory) Does it remember the name?
q = "What is my name?";
r = await agent.generate({ input: q, thread_id });
console.log('Q2:', q);
console.log('A2:', r.text);
// --- Tool: get_weather
q = "What's the weather in Paris?";
r = await agent.generate({ input: q, thread_id, tool_choice: 'auto' });
console.log('Q3:', q);
console.log('A3:', r.text);
// --- Tool: reserve_table (async)
q = "Book a table for 3 tomorrow at 20:00 under Omer, phone 05001234567.";
r = await agent.generate({ input: q, thread_id, tool_choice: 'auto' });
console.log('Q4:', q);
console.log('A4:', r.text);
// --- General knowledge question (no tool)
q = "Explain AI in two short sentences.";
r = await agent.generate({ input: q, thread_id });
console.log('Q5:', q);
console.log('A5:', r.text);
// --- (Memory) Ask for a summary of the conversation
q = "Summarize our conversation so far in 3 short bullet points.";
r = await agent.generate({ input: q, thread_id });
console.log('Q6:', q);
console.log('A6:', r.text);
})();
node index.js
- Communicates with the LLM, handles tool calls, manages memory.
- JSONL-based lightweight memory layer (a minimal sketch of the idea appears after this list).
- get_weather: returns weather from a static LUT.
- get_time: returns local time from a static LUT.
- reserve_table: logs a reservation request to an API or a queue.
- Usable in external projects with require('llama-based-agent').
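As a rough illustration of the JSONL idea mentioned above: each conversation turn is stored as one JSON object per line and appended as the thread grows. The class below is a hypothetical sketch, not the repository's actual SimpleMemory implementation; the file name and field names are illustrative.
const fs = require('fs');

// Hypothetical JSONL memory store: one JSON object per line, append-only.
class JsonlMemorySketch {
  constructor(file = 'memory.jsonl') {
    this.file = file;
  }

  // Append one conversation turn for a given thread.
  append(thread_id, role, content) {
    const entry = { thread_id, role, content, ts: new Date().toISOString() };
    fs.appendFileSync(this.file, JSON.stringify(entry) + '\n');
  }

  // Load all turns belonging to a thread, in insertion order.
  load(thread_id) {
    if (!fs.existsSync(this.file)) return [];
    return fs
      .readFileSync(this.file, 'utf8')
      .split('\n')
      .filter(Boolean)
      .map((line) => JSON.parse(line))
      .filter((entry) => entry.thread_id === thread_id);
  }
}
This is also why deleting memory*.jsonl (see the command reference below) resets the conversation history: removing the file removes every persisted turn.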
The chart above compares the English response times of different Large Language Models (LLMs) across short, medium, and long scenarios.
- Warmup runs: A few warmup requests (default: 2) are made before measurement to avoid cold-start latency.
- Main measurement: Each scenario is executed with multiple runs (default: RUNS=20).
- Prompt randomization: Each prompt is appended with a nonce (random token) and a timestamp to avoid caching effects.
- Timing: Response time is measured with process.hrtime.bigint() (nanosecond precision).
- Before switching scenarios or models: docker restart is used. This ensures each test runs under fresh conditions without cached context.
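Put together, the measurement loop can be pictured roughly as in the sketch below. The endpoint, model name, and helper names are assumptions for illustration, not the benchmark's exact code.
const crypto = require('crypto');

// Hypothetical helper: one non-streamed generation call to the Ollama API.
async function callModel(prompt, model = 'llama3.1:8b') {
  const res = await fetch('http://localhost:11434/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  return res.json();
}

// Sketch of the loop: warmups, nonce-randomized prompts, nanosecond timing.
async function benchmark(basePrompt, runs = 20, warmups = 2) {
  for (let i = 0; i < warmups; i++) await callModel(basePrompt); // discard cold-start runs

  const seconds = [];
  for (let i = 0; i < runs; i++) {
    // Nonce + timestamp make each prompt unique and defeat caching.
    const prompt = `${basePrompt} [nonce:${crypto.randomBytes(4).toString('hex')} ts:${Date.now()}]`;
    const start = process.hrtime.bigint();
    await callModel(prompt);
    seconds.push(Number(process.hrtime.bigint() - start) / 1e9);
  }
  return {
    avg_second: seconds.reduce((a, b) => a + b, 0) / seconds.length,
    min_second: Math.min(...seconds),
    max_second: Math.max(...seconds),
  };
}
The avg_second, min_second, and max_second values returned here correspond to the three bar series in the chart: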
- Blue bars (avg_second): Average response time
- Red bars (min_second): Fastest response time
- Yellow bars (max_second): Slowest response time
This visualization shows how models perform when generating English responses with different input lengths. Some models are more consistent (narrow min–max range), while others show higher variability (large gap between fastest and slowest times).
- I built and ran a 10-task, 31-point Turkish benchmark across 13 local models (0.5B–8B).
- I evaluated language fidelity, instruction following (1–2 sentence limits), memory recall (KV/facts: name=Ömer, language=Türkçe, codename=Atlas, short-reply preference), context use, translation, and simple reasoning.
- I used automated checks per task, summed the scores, and normalized them to percentages for ranking. Mid-size instruct models (e.g., mistral:7b-instruct, llama3.1:8b) were consistently stronger, while very small models struggled with strict Turkish adherence and brevity constraints.
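The ranking step is plain arithmetic: per-task points from the automated checks are summed and divided by the 31-point maximum. A minimal sketch, with made-up task names and point values:
// Illustrative scoring only: task names and point values below are invented.
const MAX_POINTS = 31;

function normalize(taskScores) {
  const total = Object.values(taskScores).reduce((a, b) => a + b, 0);
  return { total, percent: (total / MAX_POINTS) * 100 };
}

// Example: a hypothetical model's per-task results.
console.log(normalize({
  language_fidelity: 3,
  instruction_following: 2,
  memory_recall: 4,
  translation: 3,
  reasoning: 2,
})); // { total: 14, percent: 45.16... }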
docker compose up -d # start container from compose file (returns terminal immediately)
docker compose up # useful during development or when you want to follow logs in real time
docker compose down # stops + removes the container
docker compose stop # only stops, does not remove
docker compose logs -f # follow logs (e.g., model download progress)
docker ps # list running containers
docker ps -a # list all containers
docker exec -it ollama bash # enter the container shell
ollama list # list available models inside Ollama
npm install # Installs all dependencies from package.json into node_modules/
node main.js # Runs the example runner directly with Node (bypasses npm scripts)
rm memory*.jsonl # Deletes persisted memory files (e.g., memory.jsonl) to reset conversation/history
npm start # Runs the project’s default start script (aliased to node src/main.js in package.json)