Learn Pydantic AI agents, step by step
A step-by-step guide to building intelligent AI agents using Pydantic AI and local models (Ollama or any OpenAI-compatible server). This tutorial demonstrates how to create structured, vision-capable, and tool-equipped AI agents.
All examples have been updated to use the latest Pydantic-AI library APIs, including:
- Modern OpenAIModel with OpenAIProvider structure
- Updated parameter names (output_type, max_result_retries)
- Latest multimodal/vision processing with BinaryContent
- Fixed deprecated API usage (response.output instead of response.data)
- Environment variable configuration for dynamic model setup
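The API changes listed above can be sketched roughly as follows. This is a minimal illustration, not code taken from the repository: the `CityInfo` schema, the `ollama_base_url`, `build_agent`, and `run_demo` helpers, the model name, and the prompt are all assumptions made up for the example, and it assumes an Ollama server listening on its default port.

```python
def ollama_base_url(host: str = "localhost", port: int = 11434) -> str:
    """Return the OpenAI-compatible endpoint exposed by an Ollama server."""
    return f"http://{host}:{port}/v1"


def build_agent(model_name: str = "qwen2.5:14b"):
    """Build an agent whose output is validated against a hypothetical schema."""
    # Imported lazily so this file can be loaded without pydantic-ai installed.
    from pydantic import BaseModel
    from pydantic_ai import Agent
    from pydantic_ai.models.openai import OpenAIModel
    from pydantic_ai.providers.openai import OpenAIProvider

    class CityInfo(BaseModel):  # hypothetical output schema
        city: str
        country: str

    # Model + provider structure: the provider carries the endpoint and key.
    # The key is a placeholder; a local Ollama server does not check it.
    provider = OpenAIProvider(base_url=ollama_base_url(), api_key="ollama")
    model = OpenAIModel(model_name, provider=provider)
    # output_type replaces the older result_type parameter.
    return Agent(model, output_type=CityInfo)


def run_demo() -> None:
    """Needs a running Ollama server with the model already pulled."""
    result = build_agent().run_sync("Which city hosted the 2012 Summer Olympics?")
    # result.output replaces the deprecated result.data.
    print(result.output)
```

Calling `run_demo()` should print a validated `CityInfo` object, provided the server is reachable and the model follows the schema.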
- 📝 Structured output
- 🔧 Custom tool integration
- 👁️ Using Vision models (llama3.2 vision / minicpm-v)
- 🤝 Simple Multi-agent systems
- 📁 File handling
- 🔄 Dynamic prompting
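To give a feel for two of the features above, custom tools and dynamic prompting, here is a small sketch. The `days_between` tool, the `build_agent` helper, and the prompt text are hypothetical and not taken from the repository's examples; `model` stands for whatever model object or model string you already pass to `Agent`.

```python
from datetime import date


def days_between(start: str, end: str) -> int:
    """Return the number of days from start to end, both ISO dates (YYYY-MM-DD)."""
    return (date.fromisoformat(end) - date.fromisoformat(start)).days


def build_agent(model):
    """Attach a plain tool and a dynamic system prompt to a new agent."""
    # Imported lazily so this file can be loaded without pydantic-ai installed.
    from pydantic_ai import Agent

    agent = Agent(model, system_prompt="You are a concise assistant.")

    # Register a plain tool, i.e. one that does not need the run context.
    agent.tool_plain(days_between)

    # A dynamic system prompt is computed at run time,
    # not when the agent is defined.
    @agent.system_prompt
    def add_current_date() -> str:
        return f"Today's date is {date.today().isoformat()}."

    return agent
```

Keeping the tool as an ordinary function means it can be unit-tested without a model, for example `days_between("2024-01-01", "2024-03-01")`.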
- Clone the repository
- Create and activate a conda environment:
conda create -n pydantic python=3.12 pip -y
conda activate pydantic
- Install dependencies:
pip install pydantic-ai llm-sandbox
- Configure your environment:
# Copy the example environment file
cp .env.example .env
# Edit .env and uncomment/configure your preferred LLM provider
# The file includes examples for OpenAI, Ollama, vLLM, DeepSeek, and more
- Set up Ollama and pull the models you need (any of the following will work):
# Best agent models for local runs
ollama pull llama3.1:8b-instruct-q8_0
ollama pull qwen2.5:14b
ollama pull qwen2.5:32b
# vision models
ollama pull llama3.2-vision:latest
ollama pull minicpm-v
If you prefer faster local inference, you can use vLLM. The examples below show how to prepare it:
# create vllm conda environment
conda create -n vllm python=3.12 pip -y
conda activate vllm
pip install vllm
# Run vLLM with Llama 3.1 8B, with flags that enable tool calling
# (assumes you downloaded the GPTQ-Q8 version)
vllm serve Meta-Llama-3.1-8B-Instruct-GPTQ-Q_8 \
--port 5003 \
--enforce-eager \
--kv-cache-dtype fp8 \
--enable-chunked-prefill \
--enable-auto-tool-choice \
--tool-call-parser llama3_json \
--guided-decoding-backend xgrammar \
--chat-template ./tool_chat_template_llama3.1_json.jinja
You can download the Jinja template from the vLLM GitHub repo: vllm/examples/tool_chat_template_llama3.1_json.jinja
According to the official documentation (Qwen2.5 function_call), Qwen2.5 uses Hermes-style tool calling and the Hermes Jinja template.
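For readers unfamiliar with Hermes-style tool calling: the model emits each call as a JSON object wrapped in `<tool_call>` tags, and the server-side parser turns that into a structured tool call. The sketch below only illustrates the text format; it is not vLLM's parser, and the `extract_tool_calls` helper and the `get_weather` tool name are made up for the example.

```python
import json
import re

# Hermes-style models wrap each call in <tool_call>...</tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return the decoded JSON payload of every <tool_call> block in text."""
    return [json.loads(payload) for payload in TOOL_CALL_RE.findall(text)]


# Example of raw model output containing one call to a hypothetical tool.
raw_output = (
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Paris"}}\n'
    "</tool_call>"
)
calls = extract_tool_calls(raw_output)
```

With `--enable-auto-tool-choice` and `--tool-call-parser hermes`, this extraction happens inside the server, so client code only sees the structured result.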
# Running Qwen2.5-32B with dual RTX cards.
vllm serve Qwen2.5-32B-Instruct-AWQ --port 5003 \
--tensor-parallel-size 2 \
--enforce-eager \
--kv-cache-dtype fp8 \
--enable-chunked-prefill \
--enable-auto-tool-choice \
--tool-call-parser hermes \
--guided-decoding-backend xgrammar \
--chat-template ./tool_chat_template_hermes.jinja
# Running Qwen3-30B-A3B-GPTQ-Int4 with dual RTX cards.
vllm serve Qwen3-30B-A3B-GPTQ-Int4 \
--disable-log-requests \
--port 5003 \
--tensor-parallel-size 2 \
--enable-prefix-caching \
--enable-auto-tool-choice \
--tool-call-parser hermes \
--enable-reasoning \
--reasoning-parser deepseek_r1 \
--guided-decoding-backend xgrammar \
--kv-cache-dtype fp8
After starting your vLLM server, update your .env file to use it. For example, if you're running Qwen2.5-32B-Instruct-AWQ:
# Edit your .env file
MODEL_NAME=Qwen2.5-32B-Instruct-AWQ
BASE_URL=http://localhost:5003/v1
API_KEY=vllm
The examples will automatically use your vLLM server configuration. No code changes needed!
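One way an example script could pick up these settings is sketched below. It is an assumption about the general pattern rather than the repository's actual code: the `LLMSettings` class, the `load_llm_settings` and `build_model` helpers, and the fallback defaults are hypothetical, and it assumes the variables from .env have already been exported or loaded into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LLMSettings:
    model_name: str
    base_url: str
    api_key: str


def load_llm_settings(env: Mapping[str, str] = os.environ) -> LLMSettings:
    """Read MODEL_NAME, BASE_URL and API_KEY, falling back to assumed defaults."""
    return LLMSettings(
        model_name=env.get("MODEL_NAME", "qwen2.5:14b"),  # assumed default
        base_url=env.get("BASE_URL", "http://localhost:11434/v1"),  # assumed default
        api_key=env.get("API_KEY", "ollama"),  # assumed placeholder key
    )


def build_model(settings: LLMSettings):
    """Create an OpenAI-compatible model object from the settings."""
    # Imported lazily so this file can be loaded without pydantic-ai installed.
    from pydantic_ai.models.openai import OpenAIModel
    from pydantic_ai.providers.openai import OpenAIProvider

    provider = OpenAIProvider(base_url=settings.base_url, api_key=settings.api_key)
    return OpenAIModel(settings.model_name, provider=provider)
```

Because the settings come from a mapping, switching between Ollama and vLLM only means changing the three variables, which matches the "no code changes" behavior described above.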
The repository contains progressive examples:
- 0_hello_world.py - Basic agent setup
- 1_hello_with_OAI_api.py - OpenAI compatible API integration
- 2_simple_structured.py - Structured outputs
- 3_simple_structured_table.py - Table formatting + Structured outputs
- 4_lets_make_tools.py - Custom tool creation
- 5_mix_tools_with_structured_output.py - Combining tools and structure
- 6_code_as_tool.py - Code execution capabilities
- 7_code_with_added_libs.py - External library integration
- 8_lets_make_dynamic_prompt.py - Dynamic prompt generation
- 9_mix_multiple_prompts.py - Multiple prompt handling
- 10_lets_open_files.py - File operations
- 11_hello_vision.py - Basic vision capabilities
- 12_running_2_mixed_agents_with_vision.py - Advanced vision and multi-agent systems
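As a taste of what the vision examples cover, the sketch below shows one way to send a local image to a vision-capable model using `BinaryContent`. It is not copied from the repository: the `guess_media_type` and `describe_image` helpers and the prompt are hypothetical, and `agent` is assumed to be an `Agent` backed by a vision model such as llama3.2-vision or minicpm-v.

```python
import mimetypes
from pathlib import Path


def guess_media_type(path: str) -> str:
    """Guess an image media type from the file extension."""
    media_type, _ = mimetypes.guess_type(path)
    if media_type is None or not media_type.startswith("image/"):
        raise ValueError(f"{path} does not look like an image file")
    return media_type


def describe_image(agent, path: str) -> str:
    """Ask a vision-capable agent to describe the image stored at path."""
    # Imported lazily so this file can be loaded without pydantic-ai installed.
    from pydantic_ai import BinaryContent

    image = BinaryContent(
        data=Path(path).read_bytes(),
        media_type=guess_media_type(path),
    )
    # The prompt is a list mixing text and binary parts.
    result = agent.run_sync(["Describe this image in one sentence.", image])
    return result.output
```

Validating the media type up front gives a clearer error than letting the model server reject an unsupported file.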
Licensed under the Apache License, Version 2.0. See LICENSE file for details.
Contributions welcome! Please feel free to submit a Pull Request.