AWS Voice Agent with LangSmith Tracing

Production-ready voice agent deployed on AWS with comprehensive observability via LangSmith and OpenTelemetry.

Demo

The demonstration shows real-time tracing of voice agent interactions in LangSmith, capturing STT transcriptions, LLM inference, TTS synthesis, and conversation turn metrics.

Overview

This project implements a real-time voice agent built with Pipecat that functions as an AWS instructor. Users engage in natural voice conversations to learn about AWS services, prepare for certifications, and receive hands-on guidance.

The system is designed for production deployment on AWS infrastructure using Terraform, with a React frontend for user interaction and comprehensive observability through LangSmith tracing. The voice agent uses SmallWebRTCTransport for peer-to-peer WebRTC connections, eliminating dependency on third-party WebRTC providers.

Key Features

Real-time voice interaction with sub-second latency
AWS Bedrock Claude 3.5 Haiku for natural language understanding
OpenAI Whisper for speech-to-text transcription
OpenAI TTS for text-to-speech synthesis
Native WebRTC transport using Pipecat SmallWebRTCTransport
AWS Cognito for user authentication
ECS Fargate for serverless container deployment
S3 and CloudFront for frontend hosting
LangSmith integration for end-to-end tracing and observability

Architecture

AWS Deployment Architecture

                                 AWS Cloud
+--------------------------------------------------------------------------------+
|                                                                                |
|  +------------------+     +------------------+     +------------------------+  |
|  |   CloudFront     |     |   API Gateway    |     |   Cognito User Pool    |  |
|  |   + S3 Bucket    |     |   (HTTP API)     |     |   (Authentication)     |  |
|  |   (Frontend)     |     |                  |     |                        |  |
|  +--------+---------+     +--------+---------+     +-----------+------------+  |
|           |                        |                           |               |
|           |                        v                           |               |
|           |               +------------------+                 |               |
|           |               |   Application    |                 |               |
|           |               |   Load Balancer  |                 |               |
|           |               +--------+---------+                 |               |
|           |                        |                           |               |
|           |                        v                           |               |
|           |               +----------------------------------+ |               |
|           |               |   ECS Fargate                    | |               |
|           |               |   +----------------------------+ | |               |
|           +-------------->|   |  Voice Agent Container     | | |               |
|                           |   |  - FastAPI Server          |<+-+               |
|                           |   |  - WebRTC Signaling        | |                 |
|                           |   |  - Pipecat Pipeline        | |                 |
|                           |   |  - LangSmith Tracing       | |                 |
|                           |   +----------------------------+ |                 |
|                           +----------------------------------+                 |
|                                        |                                       |
|                                        v                                       |
|                           +----------------------------------+                 |
|                           |   External Services              |                 |
|                           |   - AWS Bedrock (Claude)         |                 |
|                           |   - OpenAI (STT/TTS)             |                 |
|                           |   - LangSmith (Tracing)          |                 |
|                           +----------------------------------+                 |
+--------------------------------------------------------------------------------+

Voice Pipeline Architecture

+-------------+     +-------------+     +-------------+     +-------------+
|   WebRTC    |     |   OpenAI    |     |    AWS      |     |   OpenAI    |
|   Input     | --> |   Whisper   | --> |   Bedrock   | --> |    TTS      |
|   (Audio)   |     |   (STT)     |     |   Claude    |     |             |
+-------------+     +-------------+     +-------------+     +-------------+
                           |                   |                   |
                           v                   v                   v
                    +--------------------------------------------------+
                    |            LangSmith OpenTelemetry               |
                    |                   Tracing                        |
                    +--------------------------------------------------+
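
The data flow in the diagram can be sketched as a chain of stages, each producing one traced span. This is a minimal illustration with stub functions; it does not use the Pipecat pipeline or the LangSmith exporter, and `Span`, `run_turn`, and `STUB_STAGES` are hypothetical names chosen for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Span:
    """One traced stage: its name, its output, and its wall-clock latency."""
    name: str
    output: str
    latency_s: float


def run_turn(audio: str, stages: list[tuple[str, Callable[[str], str]]]) -> list[Span]:
    """Pass the input through each stage in order, recording a span per stage."""
    spans = []
    value = audio
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)
        spans.append(Span(name, value, time.perf_counter() - start))
    return spans


# Stub stages standing in for Whisper, Bedrock Claude, and OpenAI TTS (assumption).
STUB_STAGES = [
    ("stt_transcription", lambda audio: f"text({audio})"),
    ("llm_generation", lambda text: f"reply({text})"),
    ("tts_synthesis", lambda reply: f"audio({reply})"),
]

if __name__ == "__main__":
    for span in run_turn("mic-frames", STUB_STAGES):
        print(f"{span.name}: {span.output} in {span.latency_s:.6f}s")
```

The span names match the ones listed under Traced Components; everything else is a placeholder.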

Project Structure

voice-agent-tracing/
├── bot.py                     # Legacy Daily.co voice agent
├── bot_webrtc.py              # Voice agent with SmallWebRTCTransport
├── server.py                  # Legacy Daily.co server
├── server_webrtc.py           # FastAPI server with WebRTC signaling
├── tracing_observer.py        # Custom LangSmith tracing observer
├── requirements.txt           # Python dependencies
├── Dockerfile                 # Backend container definition
├── docker-compose.yml         # Local development orchestration
├── deploy.sh                  # Deployment automation script
│
├── frontend/                  # React frontend application
│   ├── src/
│   │   ├── App.tsx            # Main application with routing
│   │   ├── components/
│   │   │   ├── auth/
│   │   │   │   └── AuthLogin.tsx    # Authentication forms
│   │   │   └── chat/
│   │   │       └── VoiceAgent.tsx   # WebRTC voice client
│   │   └── pages/
│   │       └── ChatPage.tsx         # Chat interface
│   ├── package.json           # Node.js dependencies
│   ├── vite.config.ts         # Vite configuration
│   └── Dockerfile             # Frontend container definition
│
└── infrastructure/            # Terraform IaC
    ├── main.tf                # Provider configuration
    ├── variables.tf           # Input variables
    ├── outputs.tf             # Output values
    ├── cognito.tf             # Cognito User Pool resources
    ├── networking.tf          # VPC, subnets, security groups, ALB
    ├── ecs.tf                 # ECS cluster, task definition, service
    ├── iam.tf                 # IAM roles and policies
    ├── secrets.tf             # Secrets Manager configuration
    ├── frontend.tf            # S3 bucket, CloudFront distribution
    └── terraform.tfvars       # Variable values

Technology Stack

Component	Technology	Purpose
LLM	AWS Bedrock Claude 3.5 Haiku	Natural language understanding and response generation
STT	OpenAI Whisper API	Speech-to-text transcription
TTS	OpenAI TTS	Text-to-speech synthesis
Transport	Pipecat SmallWebRTCTransport	Peer-to-peer WebRTC connections
Backend Framework	FastAPI	HTTP API and WebRTC signaling server
Frontend Framework	React + TypeScript	User interface
Styling	Tailwind CSS	UI styling
Build Tool	Vite	Frontend build and development
Authentication	AWS Cognito	User authentication and authorization
Container Orchestration	AWS ECS Fargate	Serverless container deployment
CDN	AWS CloudFront	Frontend content delivery
Storage	AWS S3	Static asset hosting
Secrets	AWS Secrets Manager	API key management
IaC	Terraform	Infrastructure provisioning
Tracing	LangSmith + OpenTelemetry	Observability and monitoring

Prerequisites

Required Software

Python 3.11 or higher
Node.js 20 or higher
Docker and Docker Compose
Terraform 1.0 or higher
AWS CLI v2 configured with appropriate credentials

Required API Keys

Service	Purpose	Obtain From
OpenAI	STT (Whisper) and TTS	https://platform.openai.com
AWS	Bedrock Claude access	AWS Console with Bedrock model access enabled
LangSmith	Tracing and observability	https://smith.langchain.com

AWS Permissions

The AWS credentials used for deployment require the following permissions:

EC2, VPC, and networking resources
ECS cluster and service management
ECR repository management
Cognito User Pool management
S3 bucket management
CloudFront distribution management
Secrets Manager access
IAM role and policy management
Bedrock model invocation

Local Development

Backend Setup

# Clone the repository
git clone https://github.com/ihatesea69/monitoring-voice-agent-langsmith-aws.git
cd monitoring-voice-agent-langsmith-aws

# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate  # Linux/macOS
# .venv\Scripts\activate   # Windows

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env
# Edit .env with your API keys

# Start the backend server
python server_webrtc.py

The backend server will start on http://localhost:7860.

Frontend Setup

cd frontend

# Install dependencies
npm install

# Start development server
npm run dev

The frontend development server will start on http://localhost:5173 and automatically proxy API requests to the backend.

Docker Compose Development

For running both services together:

docker-compose up

This starts the backend on port 7860 and frontend on port 5173.

AWS Deployment

Step 1: Configure Terraform Variables

Edit infrastructure/terraform.tfvars:

project_name = "voice-agent"
environment  = "dev"
aws_region   = "us-east-1"
ecs_cpu      = 512
ecs_memory   = 1024
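
Fargate accepts only certain CPU and memory pairings, so it can help to check `ecs_cpu` and `ecs_memory` before running Terraform. The sketch below covers the task sizes up to 4 vCPU; `FARGATE_SIZES` and `is_valid_fargate_size` are hypothetical helper names, not part of this repository.

```python
# Fargate task sizes up to 4 vCPU: CPU units -> allowed memory values in MiB.
FARGATE_SIZES = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(cpu: int, memory: int) -> bool:
    """Return True when the CPU/memory pair is one Fargate accepts."""
    return memory in FARGATE_SIZES.get(cpu, [])


if __name__ == "__main__":
    # The values from terraform.tfvars above.
    print(is_valid_fargate_size(512, 1024))
```

With the values shown above (512 CPU units and 1024 MiB), the check passes.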

Step 2: Initialize and Apply Terraform

cd infrastructure

# Initialize Terraform
terraform init

# Preview changes
terraform plan

# Apply infrastructure
terraform apply

Terraform creates the following resources:

VPC with two public subnets
Internet Gateway and route tables
Security groups for ALB and ECS tasks
Application Load Balancer
ECS Cluster with Fargate capacity provider
ECR repository for container images
Cognito User Pool and App Client
S3 bucket for frontend assets
CloudFront distribution
Secrets Manager secrets for API keys
IAM roles with Bedrock access

Step 3: Store API Keys in Secrets Manager

# Store OpenAI API key
aws secretsmanager put-secret-value \
    --secret-id voice-agent-dev-openai-api-key \
    --secret-string "sk-your-openai-api-key"

# Store LangSmith API key
aws secretsmanager put-secret-value \
    --secret-id voice-agent-dev-langsmith-api-key \
    --secret-string "lsv2_your-langsmith-api-key"
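
The secret IDs above look like they are built from `project_name` and `environment` in terraform.tfvars plus a fixed suffix. That convention is inferred from the names in this README rather than confirmed from the Terraform files, so treat the following as a sketch; `resource_name` is a hypothetical helper.

```python
def resource_name(project: str, environment: str, suffix: str) -> str:
    """Join project, environment, and suffix with hyphens (assumed convention)."""
    return f"{project}-{environment}-{suffix}"


if __name__ == "__main__":
    for suffix in ("openai-api-key", "langsmith-api-key"):
        print(resource_name("voice-agent", "dev", suffix))
```

If `project_name` or `environment` is changed, the secret IDs in the commands above would likely need to change in the same way.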

Step 4: Build and Push Docker Image

# Run from the infrastructure/ directory so that terraform output resolves
ECR_URL=$(terraform output -raw ecr_repository_url)

# Authenticate Docker with the ECR registry
aws ecr get-login-password --region us-east-1 | \
    docker login --username AWS --password-stdin "${ECR_URL%%/*}"

# Build the image (the backend Dockerfile is in the repository root)
docker build -t "${ECR_URL}:latest" ..

# Push to ECR
docker push "${ECR_URL}:latest"

# Force ECS service update
aws ecs update-service \
    --cluster voice-agent-dev-cluster \
    --service voice-agent-dev-backend \
    --force-new-deployment

Step 5: Deploy Frontend

# From the repository root (use cd ../frontend if you are still in infrastructure/)
cd frontend

# Create production environment file
cat > .env << EOF
VITE_API_URL=$(cd ../infrastructure && terraform output -raw backend_url)
VITE_COGNITO_USER_POOL_ID=$(cd ../infrastructure && terraform output -raw cognito_user_pool_id)
VITE_COGNITO_CLIENT_ID=$(cd ../infrastructure && terraform output -raw cognito_user_pool_client_id)
VITE_COGNITO_REGION=us-east-1
EOF

# Build production bundle
npm run build

# Sync to S3
aws s3 sync dist/ s3://$(cd ../infrastructure && terraform output -raw frontend_bucket)/ --delete

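# Invalidate CloudFront cache (optional)
# Note: Items[0] selects the first distribution in the account; substitute
# the correct distribution ID if the account has more than one.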
aws cloudfront create-invalidation \
    --distribution-id $(aws cloudfront list-distributions --query "DistributionList.Items[0].Id" --output text) \
    --paths "/*"

Step 6: Create Test User

# Run from the repository root
POOL_ID=$(cd infrastructure && terraform output -raw cognito_user_pool_id)

# Create user
aws cognito-idp admin-create-user \
    --user-pool-id $POOL_ID \
    --username test@example.com \
    --user-attributes Name=email,Value=test@example.com \
    --message-action SUPPRESS

# Set permanent password
aws cognito-idp admin-set-user-password \
    --user-pool-id $POOL_ID \
    --username test@example.com \
    --password "SecurePassword123!" \
    --permanent

LangSmith Tracing

The system implements comprehensive tracing using OpenTelemetry exported to LangSmith.

Traced Components

Span Name	Type	Captured Attributes
stt_transcription	LLM	transcript, word_count, character_count
llm_generation	LLM	prompt, completion, token usage, latency
tts_synthesis	LLM	input_text, character_count, voice, latency
conversation_turn	Chain	user_message, assistant_response, total_latency
voice_agent_session	Chain	conversation_id, session duration
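
As an illustration of the attributes in the table, the following sketch derives the `stt_transcription` and `tts_synthesis` fields from plain strings. It is not the code in tracing_observer.py; the function names are hypothetical, latency is omitted because it needs timing around the real service call, and the voice name is only an example value.

```python
def stt_span_attributes(transcript: str) -> dict:
    """Attributes listed for the stt_transcription span."""
    return {
        "transcript": transcript,
        "word_count": len(transcript.split()),
        "character_count": len(transcript),
    }


def tts_span_attributes(input_text: str, voice: str) -> dict:
    """Attributes listed for the tts_synthesis span, without latency."""
    return {
        "input_text": input_text,
        "character_count": len(input_text),
        "voice": voice,
    }


if __name__ == "__main__":
    print(stt_span_attributes("What is Amazon S3?"))
    print(tts_span_attributes("S3 is object storage.", voice="alloy"))
```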

Viewing Traces

Navigate to https://smith.langchain.com
Select the project specified in the LANGSMITH_PROJECT environment variable
Traces appear in real-time as conversations occur

Trace Data Captured

For each conversation turn, the following data is captured:

STT transcript and processing latency
LLM prompt messages and completion response
Estimated token usage (prompt and completion tokens)
TTS input text and synthesis latency
End-to-end turn latency
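
The README does not say how token usage is estimated. One common rough heuristic for English text is about four characters per token; the sketch below uses that ratio as an explicit assumption, and `estimate_tokens` and `estimate_usage` are hypothetical names rather than functions from this project.

```python
import math

# Rough rule of thumb for English text; an assumption, not the project's method.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a string from its character length."""
    if not text:
        return 0
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_usage(prompt_messages: list[str], completion: str) -> dict:
    """Estimate prompt, completion, and total tokens for one LLM call."""
    prompt_tokens = sum(estimate_tokens(message) for message in prompt_messages)
    completion_tokens = estimate_tokens(completion)
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }


if __name__ == "__main__":
    print(estimate_usage(["You are an AWS instructor.", "What is S3?"], "S3 is object storage."))
```

An estimate like this is only suitable for dashboards; exact counts would have to come from the model provider's usage metadata.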

API Reference

Health Check

GET /health

Response:

{
  "status": "healthy",
  "service": "voice-agent"
}
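
A small client for this endpoint might look like the sketch below. It assumes the backend is reachable at the local development address from this README; `fetch_health` and `is_healthy` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url: str = "http://localhost:7860", timeout: float = 5.0) -> dict:
    """GET /health and return the decoded JSON body."""
    with urlopen(f"{base_url}/health", timeout=timeout) as response:
        return json.load(response)


def is_healthy(payload: dict) -> bool:
    """True when the payload reports the documented healthy status."""
    return payload.get("status") == "healthy"


if __name__ == "__main__":
    # Requires the backend to be running locally.
    print(is_healthy(fetch_health()))
```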

WebRTC Signaling

POST /offer
Content-Type: application/json

{
  "sdp": "<SDP offer string>",
  "type": "offer"
}

Response:

{
  "sdp": "<SDP answer string>",
  "type": "answer"
}
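
The request and response shapes above can be handled with two small helpers. This is a sketch of the JSON handling only: producing a real SDP offer requires a WebRTC stack, and `build_offer` and `parse_answer` are hypothetical names.

```python
import json


def build_offer(sdp: str) -> bytes:
    """Serialize an SDP offer into the JSON body expected by POST /offer."""
    return json.dumps({"sdp": sdp, "type": "offer"}).encode("utf-8")


def parse_answer(body: bytes) -> str:
    """Extract the SDP answer string, rejecting responses of the wrong shape."""
    answer = json.loads(body)
    if answer.get("type") != "answer" or not answer.get("sdp"):
        raise ValueError("response is not a valid SDP answer")
    return answer["sdp"]


if __name__ == "__main__":
    print(build_offer("v=0 ..."))
    print(parse_answer(b'{"sdp": "v=0 ...", "type": "answer"}'))
```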

Server Status

GET /api/status

Response:

{
  "active_connections": 1,
  "tracing_enabled": true,
  "region": "us-east-1"
}

Configuration

Environment Variables

Variable	Required	Default	Description
OPENAI_API_KEY	Yes	-	OpenAI API key for STT and TTS
AWS_REGION	No	us-east-1	AWS region for Bedrock
AWS_ACCESS_KEY_ID	No*	-	AWS access key (uses IAM role on ECS)
AWS_SECRET_ACCESS_KEY	No*	-	AWS secret key (uses IAM role on ECS)
LANGSMITH_API_KEY	No	-	LangSmith API key for tracing
LANGSMITH_PROJECT	No	aws-voice-agent	LangSmith project name
HOST	No	0.0.0.0	Server bind address
PORT	No	7860	Server port
COGNITO_USER_POOL_ID	No	-	Cognito pool ID for auth
COGNITO_CLIENT_ID	No	-	Cognito client ID

*Required for local development, not needed when running on ECS with IAM role.
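
The defaults in the table can be applied with a loader along these lines. This is a sketch, not the configuration code in server_webrtc.py; `Settings` and `load_settings` are hypothetical names, and only a subset of the variables is shown.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    aws_region: str
    langsmith_api_key: Optional[str]
    langsmith_project: str
    host: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, applying the documented defaults."""
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise ValueError("OPENAI_API_KEY is required")
    return Settings(
        openai_api_key=api_key,
        aws_region=env.get("AWS_REGION", "us-east-1"),
        langsmith_api_key=env.get("LANGSMITH_API_KEY"),  # tracing is optional
        langsmith_project=env.get("LANGSMITH_PROJECT", "aws-voice-agent"),
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "7860")),
    )


if __name__ == "__main__":
    print(load_settings({"OPENAI_API_KEY": "sk-your-openai-api-key"}))
```

Passing the environment as a mapping keeps the loader easy to test without touching the real process environment.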

Frontend Environment Variables

Variable	Required	Description
VITE_API_URL	No	Backend API URL (empty for same-origin)
VITE_COGNITO_USER_POOL_ID	No	Cognito User Pool ID
VITE_COGNITO_CLIENT_ID	No	Cognito App Client ID
VITE_COGNITO_REGION	No	Cognito region

Troubleshooting

WebRTC Connection Fails

If WebRTC connections fail to establish:

Verify the backend is accessible from the browser
Check browser console for WebRTC errors
For production deployments across networks, configure STUN/TURN servers

No Traces in LangSmith

Verify LANGSMITH_API_KEY is set correctly
Check that the API key has write access to the project
Review backend logs for OpenTelemetry export errors

ECS Task Fails to Start

Check CloudWatch logs at /ecs/voice-agent-dev-backend
Verify secrets are stored in Secrets Manager
Ensure the Docker image was pushed successfully to ECR

Bedrock Access Denied

Verify the ECS task role has Bedrock permissions
Ensure Claude model access is enabled in AWS Bedrock console
Confirm the model ID matches an available model in the region

Contributing

Contributions are welcome. To contribute:

Fork the repository
Create a feature branch (git checkout -b feature/improvement)
Commit changes (git commit -m 'Add improvement')
Push to branch (git push origin feature/improvement)
Open a Pull Request

License

Distributed under the MIT License. See LICENSE for details.

Contact

Project Link: https://github.com/ihatesea69/monitoring-voice-agent-langsmith-aws

Acknowledgments

Pipecat - Voice agent framework
LangSmith - LLM observability platform
AWS Bedrock - Managed LLM service
OpenAI - Whisper and TTS services
Terraform - Infrastructure as code
