name: voicecall
description: Place task-driven outbound phone calls via Twilio and ElevenLabs, from the CLI or the `@kvendrik/voicecall` SDK. Use when the user wants a live phone call made on their behalf to complete a specific task (e.g. schedule or confirm an appointment, ask a question, or get information).
requires:
  • env:TWILIO_ACCOUNT_SID
  • env:TWILIO_AUTH_TOKEN
  • env:TWILIO_FROM_NUMBER
  • env:ELEVENLABS_KEY
  • env:ELEVENLABS_VOICE_ID
  • env:NGROK_AUTHTOKEN
  • env:ANTHROPIC_API_KEY

Voicecall Agent

Task-driven outbound voice calls from the CLI or SDK (callWithTask, doctor), powered by Twilio, ElevenLabs, ngrok, and an LLM backend (default: Claude via ANTHROPIC_API_KEY).

Provide as much context as possible. Before running a call, gather from the user (or from available data) every detail that would help the voice agent succeed: who is calling on whose behalf, relationship to the callee, time preferences, constraints, fallback options, and any other background the callee might need. Put that into --context (CLI) or the context option (SDK). A call with rich, specific context is far more likely to succeed than one with minimal context.

Quick start

1. Install

  • Published package (needs Bun on PATH; the CLI binary is voicecall):

    bun install -g @kvendrik/voicecall

    Or run without a global install (package name ≠ CLI name):

    bunx -p @kvendrik/voicecall voicecall doctor

  • This repository (development):

    bun install

    Then use the package.json scripts (they run index.ts directly, so no dist/ build is required):

    bun run doctor
    bun run call -- --to "+12065551234" --task "" --context ""

2. Set required environment variables

  • Validated by voicecall doctor / doctor() (see validate() in ./config.ts):
    TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN, TWILIO_FROM_NUMBER, ELEVENLABS_KEY, ELEVENLABS_VOICE_ID, NGROK_AUTHTOKEN

  • Needed when placing a call with the default LLM (not checked by doctor):
    ANTHROPIC_API_KEY — used by callWithTask unless you pass a custom createAgent (SDK) that uses another provider.

doctor throws an error listing the missing variables if any variable in the first group is absent.
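
The two groups above can be mirrored in a small pre-flight check of your own. This is a minimal sketch, not the package's validate() from ./config.ts: findMissingEnv, DOCTOR_CHECKED, and DEFAULT_LLM_ONLY are hypothetical names, and only the variable names themselves come from this document.

```typescript
// Hypothetical pre-flight check; NOT the package's validate() implementation.

// Variables that `voicecall doctor` validates, per the list above.
const DOCTOR_CHECKED = [
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_FROM_NUMBER",
  "ELEVENLABS_KEY",
  "ELEVENLABS_VOICE_ID",
  "NGROK_AUTHTOKEN",
] as const;

// Needed only for the default LLM backend; doctor does not check it.
const DEFAULT_LLM_ONLY = ["ANTHROPIC_API_KEY"] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank in `env`.
function findMissingEnv(env: Env, usesDefaultLlm = true): string[] {
  const required: string[] = [...DOCTOR_CHECKED];
  if (usesDefaultLlm) required.push(...DEFAULT_LLM_ONLY);
  return required.filter((name) => !env[name]?.trim());
}
```

Calling findMissingEnv(process.env) before a call would also surface a missing ANTHROPIC_API_KEY, which doctor alone does not report. Pass false as the second argument if you supply a custom createAgent.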

3. Check health

voicecall doctor

(From the repo: bun run doctor.)

4. Run a call

Global / PATH install:

voicecall call \
  --to "+12065551234" \
  --task "Confirm tomorrow's dentist appointment and note the time" \
  --context "You are my assistant calling my dentist. Be brief and polite."

From the repo:

bun run call -- \
  --to "+12065551234" \
  --task "Confirm tomorrow's dentist appointment and note the time" \
  --context "You are my assistant calling my dentist. Be brief and polite."

  • --to: E.164 phone number to dial.
  • --task: What you want the agent to accomplish.
  • --context: Extra background or constraints.
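
Because --to expects E.164, it can help to validate the number before dialing. The sketch below uses the common E.164 shape check (a leading "+", a non-zero first digit, at most 15 digits in total). isE164 and normalizeToE164 are hypothetical helpers and not part of the voicecall package. The check is purely syntactic and does not confirm that a number exists or is reachable.

```typescript
// E.164 shape: "+", then 2 to 15 digits, the first of which is not 0.
const E164 = /^\+[1-9]\d{1,14}$/;

function isE164(value: string): boolean {
  return E164.test(value);
}

// Hypothetical helper: strips spaces, parentheses, dots, and hyphens,
// then returns the compact number, or null if it is still not E.164.
function normalizeToE164(raw: string): string | null {
  const compact = raw.replace(/[\s().-]/g, "");
  return isE164(compact) ? compact : null;
}
```

For example, normalizeToE164("+1 (206) 555-1234") yields "+12065551234", while a number without the leading "+" and country code yields null rather than a guess.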

The CLI will:

  • Place the call.
  • Talk to the callee using the LLM + TTS.
  • Print a final conclusion and transcript-style lines from onSpeech logging once the task concludes.

SDK: add @kvendrik/voicecall as a dependency, call doctor(), then call callWithTask({ to, task, context, … }) (see README.md).
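
The SDK flow described above (health check first, then the call) can be sketched as follows. The function names doctor and callWithTask and the option names to, task, and context come from this document. Everything else is an assumption: the VoicecallSdk interface, the return types, and the placeCall wrapper are hypothetical, and README.md remains the reference for the real signatures.

```typescript
// Option names as given in this document; further options are elided there.
interface CallOptions {
  to: string;
  task: string;
  context: string;
}

// ASSUMED shape of the two SDK functions. Return types are guesses.
interface VoicecallSdk {
  doctor: () => unknown;
  callWithTask: (options: CallOptions) => Promise<unknown>;
}

// Hypothetical wrapper: run the health check, then place the call.
async function placeCall(sdk: VoicecallSdk, options: CallOptions): Promise<unknown> {
  // `await` works whether doctor() is sync or async; a throw propagates.
  await sdk.doctor();
  return sdk.callWithTask(options);
}
```

Passing the SDK in as an argument keeps the sketch runnable with a stub. With the real package, sdk would presumably be the module's exports, provided doctor and callWithTask are exposed as named exports.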

Typical usage patterns

  • Quick one-off task

    Ask the agent to complete a single, concrete task:

    voicecall call \
      --to "+12065551234" \
      --task "Ask when my order #1234 will ship and summarize the answer" \
      --context "You are my assistant."

  • Richer context (default behavior)

    Always supply as much useful context as you can in --context. Do not default to a minimal one-liner; treat context as the main lever for call success.

    • When constructing a voicecall call CLI command, always ask:
      • What background details does the callee need?
      • What constraints or preferences does the caller have?
    • For times and scheduling especially, give a time range or several acceptable options rather than a single exact time.
    • Encode those details in --context rather than overloading --task.
    • Never provide sensitive information, such as the names of events on the person's calendar.

    Examples:

    • Scheduling a dentist appointment

      voicecall call \
        --to "+12065551234" \
        --task "Schedule a dentist appointment for me" \
        --context "I am available next Monday–Thursday between 9am and 2pm, prefer mornings, and I need a routine check-up and cleaning. If 9am is not available, any time between 9am and 11am on those days is fine."

    • Rescheduling a meeting

      voicecall call \
        --to "+12065554321" \
        --task "Reschedule my project meeting" \
        --context "I cannot make the original 3pm time today because of a conflict. Offer any 10–11am slot this week; prioritize Tuesday or Wednesday."

    As a rule of thumb: if a human assistant would need to know it to make the call effective, include it in --context. When in doubt, include it—more context (within the safety rules below) is better than less.
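
One way to avoid a minimal one-liner is to collect the details in a structure first and render the --context string from it. This is a sketch under stated assumptions: CallBrief and buildContext are hypothetical and not part of the package, and the field names simply follow the details this document says to gather.

```typescript
// Hypothetical structure for what a human assistant would need to know.
interface CallBrief {
  onBehalfOf: string;       // who the call is made for
  relationship?: string;    // caller's relationship to the callee
  availability?: string[];  // acceptable time ranges, most preferred first
  constraints?: string[];   // preferences or hard limits
  fallback?: string;        // what to do if the first choice is unavailable
}

// Renders the brief as one plain-text string suitable for --context.
function buildContext(brief: CallBrief): string {
  const parts = [`You are calling on behalf of ${brief.onBehalfOf}.`];
  if (brief.relationship) {
    parts.push(`Relationship to the callee: ${brief.relationship}.`);
  }
  if (brief.availability?.length) {
    parts.push(`Acceptable times, in order of preference: ${brief.availability.join("; ")}.`);
  }
  if (brief.constraints?.length) {
    parts.push(`Constraints: ${brief.constraints.join("; ")}.`);
  }
  if (brief.fallback) {
    parts.push(`Fallback: ${brief.fallback}.`);
  }
  return parts.join(" ");
}
```

An empty availability list is a useful prompt to go back to the user for a time range before placing a scheduling call.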

    Do NOT put the following into --context (or otherwise reveal it to the callee) unless the user has explicitly asked for it and it is clearly required for the task:

    • Full payment details (credit card numbers, CVV, bank account numbers, full IBANs).
    • Highly sensitive identifiers (full social security numbers, full national IDs, authentication codes, passwords, API keys).
    • Irrelevant private medical history when, for example, booking a simple dentist check-up (only share what the user explicitly provided, such as “routine check-up and cleaning” or “follow-up for last week’s root canal”).
    • Private information about third parties (their health, finances, internal company data) that the user did not provide as part of the task.

    When in doubt, prefer omitting unnecessary personal or sensitive data from --context, and only include what is clearly needed to achieve the user’s goal.
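
A coarse automated check can back up the rules above before a context string is sent. The sketch below is an assumption-laden illustration, not a feature of the package: findSensitive and its patterns are hypothetical, the patterns will produce both false positives and false negatives, and they do not replace the judgment the rules call for.

```typescript
// Hypothetical pre-send check. Deliberately coarse pattern matching.
const SENSITIVE_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  // 13 to 19 digits, optionally separated by spaces or hyphens.
  { label: "possible payment card number", pattern: /\b(?:\d[ -]?){13,19}\b/ },
  // US SSN written as 123-45-6789.
  { label: "possible US social security number", pattern: /\b\d{3}-\d{2}-\d{4}\b/ },
  // IBAN written without spaces: country code, check digits, account part.
  { label: "possible IBAN", pattern: /\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b/ },
  // A credential keyword followed by ":" or "=".
  { label: "possible password or API key", pattern: /\b(?:password|passcode|api[ _-]?key|secret)\b\s*[:=]/i },
];

// Returns a label for every pattern that matches `context`.
function findSensitive(context: string): string[] {
  return SENSITIVE_PATTERNS
    .filter(({ pattern }) => pattern.test(context))
    .map(({ label }) => label);
}
```

A non-empty result is a reason to stop and confirm with the user that the data was explicitly requested and is required for the task, not a reason to silently strip it.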

  • Tuning timeouts / behavior

    • Call pickup: waitForStatus('in-progress', …) in ./task.ts uses a 120s timeout (see ./VoiceCall/VoiceCall.ts).
    • Task duration: a 120s timer starts after the agent session is created; on expiry the process logs "Call timed out — hanging up." and exits with code 1 (./task.ts).
    • Max call length on the media server: config.maxCallDurationSeconds in ./config.ts (default 10 minutes) is used in ./VoiceCall/VoiceCall.ts.
    • Port, STT/TTS defaults: ./config.ts.
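
The pickup and task limits above are both deadlines placed on asynchronous work. A generic way to express such a deadline is to race the work against a timer, as sketched here. This is not the code in ./task.ts or ./VoiceCall/VoiceCall.ts: withTimeout and the two constants are hypothetical names, and only the 120s values come from this document.

```typescript
// Values stated in this document; the constant names are hypothetical.
const PICKUP_TIMEOUT_MS = 120_000; // wait for status 'in-progress'
const TASK_TIMEOUT_MS = 120_000;   // starts after the agent session is created

// Rejects with `message` if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, message: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  // Clear the timer either way so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

Note that a rejected deadline does not cancel the underlying work; the caller still has to hang up or clean up, which matches the described behavior of logging and then exiting.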

Error handling (what you’ll see)

  • Missing env (doctor checks): doctor / voicecall doctor throws with
    "Missing required environment variables for voice calling:" followed by a bullet list (./config.ts).
  • Missing ANTHROPIC_API_KEY on call: error when creating the default agent session (./task.ts), e.g. "ANTHROPIC_API_KEY is not set".
  • Call never reaches in-progress / fails early: waitForStatus throws with messages such as
    "Call <id> reached terminal status 'failed' before 'in-progress'." or
    "Timed out waiting for call <id> to reach status 'in-progress'." (./VoiceCall/VoiceCall.ts).
  • Task timer: stderr line "Call timed out — hanging up." then non-zero exit (./task.ts).
  • Twilio / network / other runtime errors: uncaught Error messages or stack traces from the relevant client; fix env or connectivity and rerun.

If something fails, you can usually fix the env or network issue and rerun the same voicecall call … command (or SDK call).
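
When wrapping the CLI or SDK in automation, the messages listed above can be mapped to a next step. The sketch below matches on the message fragments quoted in this section; classifyFailure and FailureKind are hypothetical, and matching on message text is brittle, since the wording may change between versions of the package.

```typescript
type FailureKind =
  | "missing-env"      // fix the variables doctor reports
  | "missing-llm-key"  // set ANTHROPIC_API_KEY or pass a custom createAgent
  | "call-failed"      // call ended before reaching 'in-progress'
  | "pickup-timeout"   // nobody answered within the pickup window
  | "task-timeout"     // task timer expired during the call
  | "unknown";         // Twilio, network, or other runtime errors

// Maps an error or stderr message to a coarse failure category.
function classifyFailure(message: string): FailureKind {
  if (message.includes("Missing required environment variables")) return "missing-env";
  if (message.includes("ANTHROPIC_API_KEY is not set")) return "missing-llm-key";
  if (/reached terminal status '[^']+' before 'in-progress'/.test(message)) return "call-failed";
  if (message.includes("Timed out waiting for call")) return "pickup-timeout";
  if (message.includes("Call timed out")) return "task-timeout";
  return "unknown";
}
```

The first two categories need a configuration fix before rerunning; the others may succeed on a plain rerun of the same command.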