Local AI workstation setup — Goose (Block's AI agent) + Ollama (local LLM inference) on NVIDIA GPU, managed with Ansible.
Prerequisites:

- Ubuntu/Debian with NVIDIA GPU and drivers installed
- CUDA toolkit
- Ansible 2.18+
- `gh` CLI (for repo management)
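A quick way to sanity-check the prerequisites (this assumes the NVIDIA driver and CUDA toolkit put `nvidia-smi` and `nvcc` on your PATH):

```sh
nvidia-smi          # driver loaded and GPU visible
nvcc --version      # CUDA toolkit present
ansible --version   # should report 2.18 or newer
gh --version        # GitHub CLI installed
```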
Clone and run the playbook:

```sh
git clone https://github.com/eschoeller/goose-nest.git
cd goose-nest
ansible-playbook deploy.yml -K
```

You'll be prompted for the sudo password for tasks that need root (Ollama install, systemd config, model directory creation).
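If you want to see what the playbook will do before running it, Ansible's standard inspection flags work here:

```sh
ansible-playbook deploy.yml --list-tasks   # every task the play would run
ansible-playbook deploy.yml --list-tags    # tags you can target (see below)
```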
After the playbook completes, you need to configure Goose manually:
```sh
goose configure
```

- Provider: Ollama
- Model: `qwen3-coder:30b` (recommended) or `qwen2.5-coder:14b`
- Telemetry: your choice
By default, Goose has very few extensions enabled. For full agentic capabilities (file access, shell commands, code editing), enable these extensions via `goose configure`:
| Extension | What it provides |
|---|---|
| developer | Shell access and code editing (essential) |
| code_execution | Token-efficient tool calls via code |
| computercontroller | Web scraping, file caching, automations |
| memory | Persistent memory across sessions |
| chatrecall | Search past conversations |
| autovisualiser | Data visualization and UI generation |
Without the developer extension, Goose cannot read files or run commands — it will only be able to chat.
To verify, start a session:

```sh
goose session
```

Ask Goose to read a file or run a command to confirm the extensions are working.
Goose stores its configuration at `~/.config/goose/config.yaml`.
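As a rough sketch of what that file ends up containing (the key names here are assumptions and may differ between Goose versions; treat it as illustrative, not authoritative):

```yaml
# ~/.config/goose/config.yaml -- illustrative only; written by `goose configure`
GOOSE_PROVIDER: ollama
GOOSE_MODEL: qwen3-coder:30b
extensions:
  developer:
    enabled: true
    type: builtin
    name: developer
```

Re-running `goose configure` updates this file, so there is normally no need to edit it by hand.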
The playbook runs the following roles:

- gpu_verify — Confirms NVIDIA GPU is present, reports VRAM/CUDA/driver info
- ollama — Installs/upgrades Ollama, moves model storage to `/games/models/ollama`, configures systemd
- models — Pulls configured LLM models
- goose — Installs/upgrades Goose CLI
Both Ollama and Goose are automatically upgraded when new versions are available.
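For context, relocating Ollama's model directory is normally done by pointing the `OLLAMA_MODELS` environment variable at the new path via a systemd drop-in; the exact file the role writes is an assumption here, but it would look roughly like:

```ini
# /etc/systemd/system/ollama.service.d/override.conf (hypothetical path)
[Service]
Environment="OLLAMA_MODELS=/games/models/ollama"
```

A `systemctl daemon-reload` followed by `systemctl restart ollama` makes the change take effect.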
Edit `group_vars/all.yml` to customize:

```yaml
ollama_models_dir: /games/models/ollama   # Where models are stored on disk
ollama_host: "http://localhost:11434"
ollama_models:                            # Models to pull
  - qwen3-coder:30b                       # ~19GB  - MoE, best tool-calling
  - qwen2.5-coder:14b                     # ~9GB   - Dense coding model
  - qwen2.5:7b                            # ~4.7GB - General purpose
  - deepseek-r1:7b                        # ~4.7GB - Reasoning
goose_default_model: qwen3-coder:30b
```
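You can also override any of these at run time with Ansible's `--extra-vars` instead of editing the file; for example, to pull only the smaller coder model on a VRAM-constrained machine:

```sh
ansible-playbook deploy.yml -K \
  --extra-vars '{"ollama_models": ["qwen2.5-coder:14b"], "goose_default_model": "qwen2.5-coder:14b"}'
```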
Use tags to run specific parts:

```sh
ansible-playbook deploy.yml --tags gpu      # GPU check only
ansible-playbook deploy.yml --tags ollama   # Ollama setup only
ansible-playbook deploy.yml --tags models   # Pull models only
ansible-playbook deploy.yml --tags goose    # Goose setup only
ansible-playbook deploy.yml --check         # Dry run
```
Verify the install:

```sh
ollama list     # Verify models
goose -V        # Verify Goose version
goose session   # Start interactive session
```
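If something looks wrong, the Ollama server itself can be checked directly; `/api/tags` is Ollama's standard endpoint for listing local models, and the service runs under systemd:

```sh
curl http://localhost:11434/api/tags        # JSON list of pulled models
systemctl status ollama                     # service state
journalctl -u ollama --since "10 min ago"   # recent logs
```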