This document describes the telemetry system used by Vibe Tools, covering the infrastructure, data structure, data flow/pipeline, and privacy considerations.
The telemetry infrastructure for vibe-tools is designed to collect anonymous usage data to help improve the tool. It involves:
- Client-side data collection within the `vibe-tools` CLI
- A Cloudflare Worker acting as an ingestion endpoint
- A Cloudflare Pipeline for data processing and batching
- Cloudflare R2 for final storage

Telemetry collection happens in `src/telemetry/index.ts`.
The `CommandState` interface defines the core data points:
- command: The `vibe-tools` command executed (e.g., `repo`, `web`, `plan`)
- startTime: Timestamp when the command started
- Token counts: `tokenCount` (overall context tokens), `promptTokens`, `completionTokens`
- AI model details: `provider` (e.g., `gemini`, `openai`), `model` name
- Plan command specific: `fileProvider`, `fileModel`, `thinkingProvider`, `thinkingModel`
- options: Sanitized command-line options used (e.g., `--debug`, `--saveTo`, but not sensitive values)
- error: If an error occurred, its `type` (constructor name) and `message` are recorded
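Pulling the bullet points above together, the interface might look roughly like this (field names follow this document; the exact declaration in `src/telemetry/index.ts` may differ in optionality and naming):

```typescript
// Sketch of CommandState as described above; not the verbatim source.
interface CommandState {
  command: string;                    // e.g. "repo", "web", "plan"
  startTime: number;                  // timestamp (ms) when the command started
  tokenCount?: number;                // overall context tokens
  promptTokens?: number;
  completionTokens?: number;
  provider?: string;                  // e.g. "gemini", "openai"
  model?: string;
  fileProvider?: string;              // 'plan' command only
  fileModel?: string;                 // 'plan' command only
  thinkingProvider?: string;          // 'plan' command only
  thinkingModel?: string;             // 'plan' command only
  options?: Record<string, unknown>;  // sanitized command-line options
  error?: { type: string; message: string };
}

// A freshly started command only needs the required fields:
const state: CommandState = { command: "repo", startTime: Date.now() };
```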
The `TELEMETRY_DATA_DESCRIPTION` explicitly states that the following are not tracked:
- User queries
- Prompts
- File contents
- Code
- Personal data
- API keys
- A `userId` (UUID) is generated and stored in `~/.vibe-tools/diagnostics.json`
- If telemetry is disabled, `userId` can be `anonymous_opt_out` or `anonymous_pending_prompt`
- A `sessionId` (UUID) is generated for each CLI invocation
- Users can opt out via the `VIBE_TOOLS_NO_TELEMETRY=1` environment variable
- During `vibe-tools install`, users are prompted to enable or disable telemetry, with the choice stored in `~/.vibe-tools/diagnostics.json`
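The identifier rules above can be sketched as follows. This is a hypothetical helper, not the actual implementation in `src/telemetry/index.ts`; the `TelemetryChoice` type and `resolveUserId` name are illustrative:

```typescript
import { randomUUID } from "node:crypto";

type TelemetryChoice = "enabled" | "disabled" | "unset";

// Hypothetical: pick the userId according to the opt-in state.
function resolveUserId(choice: TelemetryChoice, storedId?: string): string {
  if (process.env.VIBE_TOOLS_NO_TELEMETRY === "1" || choice === "disabled") {
    return "anonymous_opt_out";
  }
  if (choice === "unset") return "anonymous_pending_prompt"; // not yet prompted
  // Enabled: reuse the UUID persisted in ~/.vibe-tools/diagnostics.json,
  // generating a fresh one on first run.
  return storedId ?? randomUUID();
}

// A new sessionId is generated for every CLI invocation:
const sessionId = randomUUID();
```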
- `startCommand(command, options)`: Initializes `currentCommandState` when a command begins
- `updateCommandState(update)`: Allows updating the state with information like token counts as the command progresses
- `recordError(error)`: Captures error details if a command fails
- `endCommand()`: Calculates duration, finalizes the payload, and calls `trackEvent`
- `trackEvent(eventName, properties)`:
  - Constructs the final JSON payload
  - Sends an HTTP POST request to the `TELEMETRY_ENDPOINT`
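The lifecycle above can be sketched roughly as follows. This is a simplified reconstruction, not the real implementation; the endpoint URL is a placeholder:

```typescript
// Simplified sketch of the command lifecycle described above.
const TELEMETRY_ENDPOINT = "https://example.invalid/api/pipeline"; // placeholder

interface State { command: string; startTime: number; [key: string]: unknown }
let currentCommandState: State | null = null;

function startCommand(command: string, options: Record<string, unknown>): void {
  currentCommandState = { command, startTime: Date.now(), options };
}

function updateCommandState(update: Record<string, unknown>): void {
  if (currentCommandState) Object.assign(currentCommandState, update);
}

function recordError(error: Error): void {
  updateCommandState({ errorType: error.constructor.name, errorMessage: error.message });
}

async function endCommand(): Promise<void> {
  if (!currentCommandState) return;
  const duration = Date.now() - currentCommandState.startTime;
  await trackEvent("command_executed", { ...currentCommandState, duration });
  currentCommandState = null;
}

async function trackEvent(eventName: string, properties: Record<string, unknown>): Promise<void> {
  const payload = { data: { eventName, timestamp: new Date().toISOString(), ...properties } };
  // Fire-and-forget: telemetry failures must never break the CLI.
  await fetch(TELEMETRY_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  }).catch(() => {});
}
```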
The JSON payload sent by the client has the following structure:

```jsonc
{
  "data": {
    "eventName": "command_executed" | "command_error", // Type of event
    "userId": "string",             // User's unique identifier (or anonymous placeholder)
    "sessionId": "string",          // Unique ID for the current CLI session
    "timestamp": "ISO8601_string",  // Time of the event
    "toolVersion": "string",        // Version of vibe-tools
    // --- Additional properties from CommandState ---
    "command": "string",
    "duration": "number_milliseconds",
    "contextTokens": "number_optional",
    "promptTokens": "number_optional",
    "completionTokens": "number_optional",
    "provider": "string_optional",
    "model": "string_optional",
    "fileProvider": "string_optional",     // Specific to 'plan' command
    "fileModel": "string_optional",        // Specific to 'plan' command
    "thinkingProvider": "string_optional", // Specific to 'plan' command
    "thinkingModel": "string_optional",    // Specific to 'plan' command
    "options": { /* sanitized key-value pairs */ },
    "hasError": "boolean",
    "errorType": "string_optional"         // e.g., "ProviderError", "FileError"
  }
}
```

The infrastructure is provisioned using Alchemy for Cloudflare. The key configuration is in `infra/alchemy/alchemy.run.ts`.
The Cloudflare Worker:
- Acts as the ingestion endpoint for telemetry data
- The entry point is `infra/app/index.ts`
- Requests to `/api/pipeline` are handled by a Nuxt server route defined in `infra/server/api/pipeline.post.ts`
- This route receives the JSON payload from the `vibe-tools` CLI
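In spirit, the ingestion route does something like the following. It is shown here as a bare fetch handler purely for illustration; the real code is a Nuxt server route, and the `PIPELINE` binding name and error handling are assumptions:

```typescript
// Sketch of the ingestion step: extract `data` from the payload and forward
// it, wrapped in an array, to the bound Cloudflare Pipeline.
interface Env {
  PIPELINE: { send(records: unknown[]): Promise<void> };
}

const handler = {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== "POST") {
      return new Response("Method Not Allowed", { status: 405 });
    }
    const payload = (await request.json()) as { data?: unknown };
    if (!payload?.data) return new Response("Bad Request", { status: 400 });
    await env.PIPELINE.send([payload.data]); // pipelines accept arrays of records
    return new Response("ok", { status: 202 });
  },
};
```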
The Cloudflare Pipeline:
- Source: Configured with a `binding` source, receiving data pushed by the Cloudflare Worker
- Processing: Batches incoming JSON data based on:
  - Size (max 10 MB)
  - Time (max 5 seconds)
  - Row count (max 100 rows)
- Destination: The R2 bucket
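The three limits combine as an "any threshold wins" rule. Purely for illustration (the actual batching happens inside Cloudflare's Pipeline service, not in vibe-tools code):

```typescript
// A batch is flushed as soon as ANY of the configured limits is reached.
const MAX_BATCH_BYTES = 10 * 1024 * 1024; // 10 MB
const MAX_BATCH_AGE_MS = 5_000;           // 5 seconds
const MAX_BATCH_ROWS = 100;               // 100 rows

function shouldFlush(bytes: number, ageMs: number, rows: number): boolean {
  return bytes >= MAX_BATCH_BYTES || ageMs >= MAX_BATCH_AGE_MS || rows >= MAX_BATCH_ROWS;
}
```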
The R2 bucket:
- Final storage for the batched telemetry data
- Data is stored in JSON format

R2 write access:
- An account-level API token with `Workers R2 Storage Bucket Item Write` permissions
- Used by the Cloudflare Pipeline to write data to R2
- The `alchemy.run.ts` script updates the `TELEMETRY_ENDPOINT` constant in `src/telemetry/index.ts` after deploying the worker
- It sets the endpoint to the live URL of the deployed worker's API endpoint (e.g., `https://vibe-tools-infra.aejefferson.workers.dev/api/pipeline`)
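Conceptually the update is a string substitution over the source file. A hypothetical helper (the real `alchemy.run.ts` logic may do this differently):

```typescript
// Hypothetical: rewrite the TELEMETRY_ENDPOINT constant to the deployed URL.
function patchTelemetryEndpoint(source: string, workerUrl: string): string {
  return source.replace(
    /const TELEMETRY_ENDPOINT = ['"][^'"]*['"]/,
    `const TELEMETRY_ENDPOINT = '${workerUrl}/api/pipeline'`
  );
}
```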
- Collection: The `vibe-tools` CLI gathers telemetry data during command execution
- Transmission: Upon command completion (or error), `endCommand()` calls `trackEvent()`, which sends an HTTP POST request with the JSON payload to the `TELEMETRY_ENDPOINT`
- Ingestion: The Cloudflare Worker receives the request
  - It reads the request body
  - It extracts the `data` object from the payload
  - It sends this `data` object, wrapped in an array (`[data]`), to its bound Cloudflare Pipeline
- Processing & Batching: The Cloudflare Pipeline batches the data according to its configuration
- Storage: Once a batch is ready, the Pipeline writes the batch of JSON objects to the `vibe-tools-telemetry` R2 bucket
```
┌─────────────────┐      HTTP POST       ┌─────────────────────┐
│                 │     JSON Payload     │                     │
│  vibe-tools CLI ├──────────────────────►  Cloudflare Worker  │
│                 │                      │                     │
└─────────────────┘                      └─────────┬───────────┘
                                                   │
                                                   │ pipeline.send([data])
                                                   ▼
                                         ┌─────────────────────┐
                                         │                     │
                                         │ Cloudflare Pipeline │
                                         │                     │
                                         └─────────┬───────────┘
                                                   │
                                                   │ Batch writes
                                                   ▼
                                         ┌─────────────────────┐
                                         │                     │
                                         │    Cloudflare R2    │
                                         │                     │
                                         └─────────────────────┘
```
- Anonymity: `userId` is designed to be anonymous. Opting out or a pending prompt results in placeholder IDs
- Opt-out: Users can disable telemetry via the `VIBE_TOOLS_NO_TELEMETRY` environment variable or through the interactive installer
- Data Minimization: The system collects only specific, non-sensitive data points
- Sanitization: The `sanitizeOptions` function ensures that only a whitelist of command-line option keys is tracked
- Transparency: The `TELEMETRY_DATA_DESCRIPTION` details what is and isn't collected
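The whitelist idea behind `sanitizeOptions` can be sketched like this (the allowed-key set here is illustrative; the real list lives in `src/telemetry/index.ts`):

```typescript
// Illustrative whitelist; only these option keys would survive sanitization.
const TRACKED_OPTION_KEYS = new Set(["debug", "saveTo", "provider", "model"]);

function sanitizeOptions(options: Record<string, unknown>): Record<string, unknown> {
  const sanitized: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(options)) {
    if (TRACKED_OPTION_KEYS.has(key)) sanitized[key] = value;
  }
  return sanitized;
}
```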
Based on the current implementation, the telemetry system focuses primarily on data collection and storage. There are no built-in tools or systems within this repository for analyzing the telemetry data after it's stored in the R2 bucket.
The data is structured in JSON format and stored in batches in the Cloudflare R2 bucket, which makes it suitable for:
- Manual Analysis: Downloading the stored JSON files for analysis using data analysis tools
- Custom Tooling: Building custom analysis tools that read directly from the R2 bucket
- BI Integration: Integrating with business intelligence platforms by exporting data from R2
For implementing a data analysis solution, one approach would be to:
- Create a scheduled job to process data from R2
- Transform and load the data into a database or data warehouse
- Build dashboards or reporting tools on top of the processed data
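As a starting point for the first step, a processor might parse each downloaded batch file and aggregate command usage. This assumes the Pipeline writes newline-delimited JSON with one event per line, which may differ from the actual batch format:

```typescript
// Count how often each command appears in one batch file (NDJSON assumed).
function countCommands(batchFile: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of batchFile.split("\n")) {
    if (!line.trim()) continue;
    const event = JSON.parse(line) as { command?: string };
    if (event.command) counts.set(event.command, (counts.get(event.command) ?? 0) + 1);
  }
  return counts;
}
```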
- `src/telemetry/index.ts` - Client-side telemetry collection
- `infra/alchemy/alchemy.run.ts` - Infrastructure provisioning
- `infra/server/api/pipeline.post.ts` - Telemetry ingestion endpoint
- `infra/app/index.ts` - Cloudflare Worker entry point
- `infra/env.ts` - Environment configuration