This repository contains examples demonstrating how to build voice-enabled conversational AI agents using NVIDIA services, built on the Pipecat framework. The examples cover a range of implementation patterns, from simple LLM-based conversations to complex agentic workflows, and from WebSocket-based solutions to advanced WebRTC implementations with real-time capabilities.
- Voice Agent WebSocket: A simple voice assistant pipeline using a WebSocket-based transport. This example demonstrates integration with the NVIDIA LLM service and the Riva ASR and TTS NIMs.
- Voice Agent WebRTC: A more advanced voice agent using a WebRTC transport with real-time transcripts, dynamic prompt configuration, and TTS voice selection via the UI.
- NAT Agent (NeMo Agent Toolkit): An end-to-end intelligent voice assistant powered by the NeMo Agent Toolkit. The ReWoo agent uses a planning-based approach for efficient task decomposition and execution, with custom tools for menu browsing, pricing, and cart management.
We recommend starting with the Voice Agent WebSocket example for a simple introduction, then progressing to the WebRTC-based examples for production use cases. More details can be found in the examples README.md.
The NVIDIA Pipecat library augments the Pipecat framework with additional frame processors and NVIDIA services. This includes the integration of NVIDIA services and NIMs such as Riva ASR, Riva TTS, LLM NIMs, NAT (NeMo Agent Toolkit), and Foundational RAG. It also introduces several processors focused on improving the end-user experience for multimodal conversational agents, along with speculative speech processing, which reduces latency for faster bot responses.
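As a rough illustration of the frame-processor pattern that pipelines like these build on, here is a minimal plain-Python sketch. The class and frame names below are illustrative stand-ins invented for this example, not Pipecat's or nvidia-pipecat's actual base classes:

```python
"""Schematic sketch of a frame-processor chain: frames (audio, transcripts,
text) flow through linked processors, each of which may transform a frame
before passing it downstream. All names here are hypothetical."""

from dataclasses import dataclass


@dataclass
class Frame:
    """A unit of data flowing through the pipeline."""
    kind: str
    payload: str


class FrameProcessor:
    """Base class: receives a frame and forwards it to the next processor."""

    def __init__(self):
        self.next = None

    def link(self, nxt):
        self.next = nxt
        return nxt

    def process(self, frame):
        if self.next:
            self.next.process(frame)


class UppercaseTranscript(FrameProcessor):
    """Toy stand-in for an ASR post-processing step."""

    def process(self, frame):
        if frame.kind == "transcript":
            frame = Frame("transcript", frame.payload.upper())
        super().process(frame)


class Collector(FrameProcessor):
    """Terminal processor that records every frame it receives."""

    def __init__(self):
        super().__init__()
        self.frames = []

    def process(self, frame):
        self.frames.append(frame)


# Build a tiny two-stage chain and push one frame through it.
head = UppercaseTranscript()
sink = Collector()
head.link(sink)
head.process(Frame("transcript", "hello world"))
print(sink.frames[0].payload)  # HELLO WORLD
```

The real library's processors work on richer frame types (audio chunks, interim transcripts, LLM tokens), but the chaining idea is the same.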
The NVIDIA Pipecat package is released as a wheel on PyPI. Create a Python virtual environment and use the pip command to install the nvidia-pipecat package.
```
pip install nvidia-pipecat
```
You can then start building Pipecat pipelines utilizing services from the NVIDIA Pipecat package.
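Conceptually, a voice pipeline chains ASR, LLM, and TTS stages in order. The sketch below illustrates that flow with plain-Python stand-ins; the class names are invented for illustration and are not the nvidia-pipecat API:

```python
"""Schematic sketch of composing a voice pipeline as an ordered list of
stages, each consuming the previous stage's output. The stage classes here
are hypothetical stand-ins for real ASR/LLM/TTS services."""

import asyncio


class FakeASR:
    async def run(self, item: str) -> str:
        return f"transcript({item})"


class FakeLLM:
    async def run(self, item: str) -> str:
        return f"reply({item})"


class FakeTTS:
    async def run(self, item: str) -> str:
        return f"audio({item})"


async def run_pipeline(stages, item):
    # Data passes through each stage in order, mirroring how a
    # Pipecat-style pipeline routes frames through its processors.
    for stage in stages:
        item = await stage.run(item)
    return item


result = asyncio.run(run_pipeline([FakeASR(), FakeLLM(), FakeTTS()], "mic-audio"))
print(result)  # audio(reply(transcript(mic-audio)))
```

In a real pipeline the stages would be the Riva ASR, LLM NIM, and Riva TTS services, and frames would stream asynchronously rather than pass through as single values.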
If you wish to work directly with the source code or modify services from the nvidia-pipecat package, you can use either the UV or Nix development setup as outlined below.
To get started, first install the UV package manager.
Then, create a virtual environment with all the required dependencies by running the following commands:
```
uv venv
source .venv/bin/activate
uv sync
```
Once the environment is set up, you can begin building pipelines or modifying the services in the source code.
If you wish to contribute your changes to the repository, please ensure you run the unit tests, linter, and formatting tool.
To run unit tests, use:
```
uv run pytest
```
To format the code, use:
```
ruff format
```
To run the linter, use:
```
ruff check
```
To set up your development environment using Nix, follow these steps:
Initialize the development environment by running the following command:
```
nix develop
```
This setup provides you with a fully configured environment, allowing you to focus on development without worrying about dependency management.
To ensure that all repository checks, such as formatting and linting, pass, use the following command:
```
nix flake check
```
The project documentation includes:
- Voice Agent Examples - Voice agent examples built using Pipecat and NVIDIA services
- NVIDIA Pipecat - Custom Pipecat processors implemented for NVIDIA services
- Best Practices - Performance optimization guidelines and production deployment strategies
- Speculative Speech Processing - Advanced speech processing techniques for reducing latency
We invite contributions! Open a GitHub issue or pull request. See the contributing guidelines here.