# 🧠 NanoAgent — A 135M Parameter Agentic SLM

*An agent that can run everywhere - even on your watch!*

NanoAgent is a 135M parameter, 8k context length, open-source language model designed for agentic tasks such as tool calling, instruction following, and lightweight reasoning.
It’s small enough (~135 MB in 8-bit) to run on edge devices like personal laptops, low-memory CPUs, and even wearables — yet smart enough to make tool calls, parse web information, and give structured answers.

Quick inference resource: notebooks/inference.ipynb

- Hugging Face model: NanoAgent-135M
- Run in Ollama: `ollama run quwsarohi/NanoAgent`
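
For a quick local test, the model can also be loaded through the transformers library. The snippet below is a minimal sketch, assuming the Hugging Face id `quwsarohi/NanoAgent-135M` and the standard chat-template API; notebooks/inference.ipynb is the authoritative example.

```python
# Minimal inference sketch. Assumptions: the Hugging Face id is
# quwsarohi/NanoAgent-135M and the model ships a standard chat template;
# see notebooks/inference.ipynb for the authoritative example.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "quwsarohi/NanoAgent-135M"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How do I bake a cake?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```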

## 🌍 Real-World Use Cases

- 🕹️ Runs on edge devices — laptops, smartwatches, browsers, or CPU-only environments.
- 🌐 Parses and answers from the web — supports tool calling to fetch real-time information.
- 🔎 Answers recent questions with live web-search tools.
- 💬 Continues conversations — ideal for assistant or agent frameworks.
- ⚙️ Tool-calling support enables chaining multiple tools and parsing their results to produce final answers.

## ✨ What NanoAgent Supports

| Capability | Description |
| --- | --- |
| 💬 Basic conversation | Casual small talk |
| 🌐 Information retrieval | e.g., “How to bake a cake?”, “Weather in Toronto” via web search; extracts answers from information returned by tools (scraping/search) |
| 🧰 Tool calling | Single and multi-tool calls with structured explanations (see the sketch after this table) |
| 🧠 Question decomposition | Breaks complex questions into steps |
| 🧭 Question classification | Identifies the type of user query (e.g., fact, reasoning, instruction) |
| 📝 Following system prompts | Responds properly to system-level instructions |
| ✍️ Writing emails and tasks | Writes emails and structured messages |
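
To make the tool-calling row concrete, here is a sketch of the dispatch side of a tool-call loop. The JSON shape (`{"name": ..., "arguments": ...}`) and the `web_search` tool are illustrative assumptions; the exact format NanoAgent emits is defined by its chat template.

```python
# Tool-call parsing sketch. Assumptions: the model replies with a JSON
# object {"name": ..., "arguments": ...}, and web_search is a hypothetical
# tool used only for illustration.
import json

TOOLS = [{
    "name": "web_search",
    "description": "Search the web and return result snippets.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}]

def parse_tool_call(text):
    """Try to read a model reply as a JSON tool call; return None for plain text."""
    try:
        call = json.loads(text)
        return call["name"], call.get("arguments", {})
    except (json.JSONDecodeError, TypeError, KeyError):
        return None

reply = '{"name": "web_search", "arguments": {"query": "Weather in Toronto"}}'
print(parse_tool_call(reply))  # -> ('web_search', {'query': 'Weather in Toronto'})
```

In an agent loop, the tool's output would be appended to the conversation as a tool message and the model queried again to produce the final answer.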

## 🧪 Training Overview

### 📚 Datasets Used

This model was trained on a combination of datasets released under different open licenses.
Each dataset retains its original license, and use of those datasets is subject to their respective terms. A minimal loading sketch follows the table below.

| Dataset | Purpose | License |
| --- | --- | --- |
| microsoft/orca-agentinstruct-1M-v1 | RAG, MCQ answering, JSON parsing, text classification, instruction following | Community Data License Agreement – Permissive, Version 2.0 |
| microsoft/orca-math-word-problems-200k | Lightweight reasoning, word-level reasoning | MIT |
| allenai/tulu-3-sft-personas-instruction-following | Instruction following with personas | Open Data Commons License Attribution family |
| xingyaoww/code-act | ReAct-style reasoning and acting | Apache-2.0 |
| m-a-p/Code-Feedback | Feedback alignment | Apache-2.0 |
| HuggingFaceTB/smoltalk | General conversation, system prompt handling | Apache-2.0 |
| HuggingFaceTB/smoltalk/apigen | Tool-calling stabilization | Creative Commons Attribution 4.0 (sourced from 1, 2) |
| weijie210/gsm8k_decomposed | Question decomposition | - |
| Locutusque/function-calling-chatml | Tool-call response formatting | Apache-2.0 |
| Salesforce/xlam-function-calling-60k | Stronger function-calling coverage | Creative Commons Attribution 4.0 |
| HuggingFaceTB/smoltalk2/SFT/smolagents_toolcalling_traces_think | Web search, scraping, real-time reasoning | Apache-2.0 |
| NousResearch/hermes-function-calling-v1 | Tool calling with thinking | Apache-2.0 |
| HuggingFaceTB/smoltalk/smol-magpie-ultra | Python code writing | Apache-2.0 |
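
As an illustration, a single listed dataset can be loaded and mapped to chat format with the `datasets` library. This is a minimal sketch only; the project's actual preparation pipeline is data/dataprep.py, and the column names below come from the dataset card (verify before use).

```python
# Loading sketch for one of the listed datasets. Assumption: orca-math
# exposes "question"/"answer" columns (check the dataset card). The real
# cleaning and formatting lives in data/dataprep.py.
from datasets import load_dataset

ds = load_dataset("microsoft/orca-math-word-problems-200k", split="train")

def to_chat(example):
    # Convert one Q/A pair into the messages format used for SFT.
    return {"messages": [
        {"role": "user", "content": example["question"]},
        {"role": "assistant", "content": example["answer"]},
    ]}

chat_ds = ds.map(to_chat, remove_columns=ds.column_names)
print(chat_ds[0]["messages"][0]["content"][:80])
```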

## 🧭 Key Explorations & Findings

- ✂️ Dataset deduplication significantly improved performance by removing noisy or duplicate Q/A pairs (a minimal sketch follows this list).
- ✂️ Shortening casual responses and using shorter Python code in training improved performance and reduced repeated token generation.
- 🧮 Word-level reasoning from orca-math enhanced the model’s ability to handle stepwise logic.
- 🧰 Designing tool-calling prompts from six open-source tool-calling datasets resulted in stronger structured output generation.
- 🌐 Tool-calling integration enabled the model to extract answers from parsed web data, supporting up-to-date queries.
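
A minimal sketch of the deduplication idea from the first bullet: normalize each Q/A pair and hash it, keeping only the first occurrence. The project's actual pipeline lives in data/dataprep.py; this only illustrates the approach.

```python
# Exact-duplicate removal sketch: hash normalized (question, answer) pairs
# and keep first occurrences. Illustrative only; not the project's pipeline.
import hashlib

def normalize(text):
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return " ".join(text.lower().split())

def dedupe(pairs):
    """Keep the first occurrence of each normalized (question, answer) pair."""
    seen, kept = set(), []
    for q, a in pairs:
        key = hashlib.sha256(
            (normalize(q) + "\x00" + normalize(a)).encode()
        ).hexdigest()
        if key not in seen:
            seen.add(key)
            kept.append((q, a))
    return kept

pairs = [("What is 2+2?", "4"), ("  what is  2+2?", "4"), ("Capital of France?", "Paris")]
print(dedupe(pairs))  # the whitespace/case variant of the first pair is dropped
```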

## ⚡ Benchmark

| Metric / Task | SmolLM2-135M-Instruct | NanoAgent |
| --- | --- | --- |
| 🧮 Parameters | 135M | 135M |
| 📏 Context length | 8k | 8k |
| 📊 IFEval score (overall) | --- | --- |
| 🧰 Tool-call tasks | ❌ Not supported | ✅ Supported |
| 🧭 Instruction following | 🟡 Moderate | 🟢 Improved |
| 🧠 Reasoning (light) | 🟡 Moderate | 🟡 Moderate |
| 📝 Training method | Baseline (SFT) | SFT + agentic fine-tuning |
| 🧪 Strength | Instruction following | Tool-call ability + structured outputs |
| ⚠️ Limitations | No tool calling | Occasional tool errors; still beta |

## 🧭 Roadmap

- 📊 Benchmark more agentic tasks
- 🧠 Explore GRPO for tool-calling improvement
- 🔀 Experiment with weight merging
- 🧪 Evaluate multi-turn tool chaining
- 🧹 Further refine datasets for stability

## Directory Tree

```
NanoAgent/
├── data/
│   ├── dataprep.py          # Dataset preparation, cleaning, and formatting
│   └── utils.py             # Helper utilities for data processing
│
├── grpo/
│   └── grpo-mlx.py          # Experimental GRPO (agentic fine-tuning) implementation using MLX
│
├── notebooks/
│   └── inference.ipynb      # Demo notebook for inference and evaluation
│
├── sft/
│   └── train-mlx.py         # Supervised fine-tuning (SFT) training script using MLX
│
├── utils/
│   ├── gguf_conv.py         # Conversion script for exporting the model to GGUF format (for llama.cpp etc.)
│   ├── tokenizer.py         # Tokenizer helper functions and configs
│   └── webtool.py           # Example tool interface for web search / parsing (see the sketch below)
│
├── LICENSE                  # Apache 2.0 license file
├── NOTICE                   # Notices and attributions for datasets and dependencies
└── README.md                # Project overview, usage guide, and dataset details
```
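
For orientation, utils/webtool.py provides the web search / parsing tool interface. The sketch below shows one plausible shape for such an interface; the names and fields are illustrative assumptions, not the file's actual API.

```python
# Hypothetical shape of a web-search tool interface. The project's real
# implementation is utils/webtool.py; names here are illustrative only.
from dataclasses import dataclass

@dataclass
class SearchResult:
    # One parsed search hit, trimmed to what the model needs.
    title: str
    url: str
    snippet: str

def web_search(query: str, max_results: int = 3) -> list[SearchResult]:
    """Query a search backend and return parsed results (backend left abstract)."""
    raise NotImplementedError("wire up a real search backend here")

def format_for_model(results: list[SearchResult]) -> str:
    """Flatten results into plain text to feed back as the tool response."""
    return "\n\n".join(f"{r.title} ({r.url})\n{r.snippet}" for r in results)
```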

## 📄 License

This project (code, model weights, and training recipes) is licensed under the Apache License 2.0.

## 📢 Notice

- Model & code are © quwsarohi, licensed under Apache 2.0.
- Portions of the training data were sourced from third-party datasets under CDLA-P 2.0, MIT, CC-BY 4.0, ODC-BY, and Apache 2.0.
- The licensors of these datasets do not endorse this project or its outputs.
- If you redistribute or fine-tune this model, ensure your use complies with all applicable dataset licenses.
