11 changes: 10 additions & 1 deletion src/pages/docs/ai-transport/index.mdx
@@ -134,4 +134,13 @@ Take a look at some example code running in-browser of the sorts of features you

## Pricing

// Todo
AI Transport uses Ably's [usage-based billing model](/docs/platform/pricing) at your package rates. Your consumption costs depend on the number of inbound messages (published to Ably) and outbound messages (delivered to subscribers), and on how long channels and connections are active. [Contact Ably](https://ably.com/contact) to discuss options for Enterprise pricing and volume discounts.

The cost of streaming token responses over Ably depends on:

- the number of tokens in the LLM responses that you are streaming. For example, a simple support chatbot response is around 300 tokens, a coding session response is typically 2,000-3,000 tokens, and a deep reasoning response can exceed 50,000 tokens.
- the rate at which your agent publishes tokens to Ably and the number of messages it uses to do so. Some LLMs emit each token as a separate event, while others batch multiple tokens together. Similarly, your agent may publish tokens as they are received from the LLM, or perform its own processing and batching first.
- the number of subscribers receiving the response.
- the [token streaming pattern](/docs/ai-transport/features/token-streaming#token-streaming-patterns) you choose.

For example, an AI support chatbot sending a response of 250 tokens at 70 tokens/s to a single client using the [message-per-response](/docs/ai-transport/features/token-streaming/message-per-response) pattern would consume 90 inbound messages, 90 outbound messages and 90 persisted messages. See the [AI support chatbot pricing example](/docs/platform/pricing/examples/ai-chatbot) for a full breakdown of the costs in this scenario.
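The message counts in that example can be reproduced with a short sketch. This is an illustration only, assuming the default 40ms append rollup window described in the token streaming docs; the parameter names are hypothetical, not part of any Ably SDK:

```python
import math

# Assumed parameters from the example above: a 250-token response
# streamed at 70 tokens/s to one subscriber, with the default 40ms
# append rollup window of the message-per-response pattern.
tokens = 250
tokens_per_second = 70
rollup_window_s = 0.04
subscribers = 1

stream_duration_s = tokens / tokens_per_second            # ~3.57s of streaming
inbound = math.ceil(stream_duration_s / rollup_window_s)  # one rolled-up message per window
outbound = inbound * subscribers                          # delivered to each subscriber
persisted = inbound                                       # every inbound message is persisted

print(inbound, outbound, persisted)  # 90 90 90
```

With more subscribers, only the outbound count scales; inbound and persisted messages are fixed by the streaming duration and rollup window.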
57 changes: 57 additions & 0 deletions src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
@@ -0,0 +1,57 @@
---
title: AI support chatbot
meta_description: "Calculate AI Transport pricing for conversations with an AI chatbot. Example shows how using the message-per-response pattern and modifying the append rollup window can generate cost savings."
meta_keywords: "chatbot, support chat, token streaming, token cost, AI Transport pricing, Ably AI Transport pricing, stream cost, Pub/Sub pricing, realtime data delivery, Ably Pub/Sub pricing"
intro: "This example applies consumption-based pricing to an AI support chatbot use case, where a single agent publishes tokens to users over AI Transport."
---

### Assumptions

The scale and features used in this calculation.

| Scale | Features |
|-------|----------|
| 4 user prompts to get to resolution | ✓ Message-per-response |
| 250 tokens per LLM response | |
| 70 appends per second from the agent | |
| 3 minute average chat duration | |
| 1 million chats | |

### Cost summary

The high-level cost breakdown for this scenario. Messages are billed both inbound (published to Ably) and outbound (delivered to subscribers). Creating the "Message updates and deletes" [channel rule](/docs/ai-transport/features/token-streaming/message-per-response#enable) automatically enables message persistence.

| Item | Calculation | Cost |
|------|-------------|------|
| Messages | 1092M × $2.50/M | $2730.00 |
| Connection minutes | 6M × $1.00/M | $6.00 |
| Channel minutes | 3M × $1.00/M | $3.00 |
| Package fee | | [See plans](/pricing) |
| **Total** | | **~$2739.00/M chats** |

### Message breakdown

How the message cost breaks down. The message-per-response pattern includes [automatic rollup of append events](/docs/ai-transport/features/token-streaming/token-rate-limits#per-response) to reduce consumption costs and avoid rate limits.

| Type | Calculation | Inbound | Outbound | Total messages | Cost |
|------|-------------|---------|----------|----------------|------|
| User prompts | 1M chats × 4 prompts | 4M | 4M | 8M | $20.00 |
| Agent responses | 1M chats × 4 responses × 90 rolled-up messages per response (250 token events at 70/s, 40ms rollup) | 360M | 360M | 720M | $1800.00 |
| Persisted messages | Every inbound message is persisted | 364M | 0 | 364M | $910.00 |

### Effect of append rollup

The calculation above uses the default append rollup window of 40ms, chosen to control costs with minimal impact on responsiveness. For a text chatbot use case, you could increase the window to 200ms without noticeably impacting the user experience.

| Rollup window | Inbound response messages | Total messages | Cost |
|---------------|---------------------------|----------------|------|
| 40ms | 360 per chat | 1092M | $2730.00/M chats |
| 100ms | 144 per chat | 444M | $1110.00/M chats |
| 200ms | 72 per chat | 228M | $570.00/M chats |
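The figures in the tables above can be derived with a short sketch. This is an illustrative model only, under the stated assumptions (1M chats, 4 prompts and 4 responses per chat, 250 tokens per response at 70 appends/s, $2.50 per million messages, one subscriber per chat, persistence enabled); the function names are hypothetical:

```python
import math

CHATS = 1_000_000
PROMPTS_PER_CHAT = 4
RESPONSES_PER_CHAT = 4
TOKENS_PER_RESPONSE = 250
APPENDS_PER_SECOND = 70
PRICE_PER_MILLION_MSGS = 2.50

def response_messages(window_s):
    """Rolled-up messages per response: one per rollup window while streaming."""
    duration_s = TOKENS_PER_RESPONSE / APPENDS_PER_SECOND  # ~3.57s
    return math.ceil(duration_s / window_s)

def total_messages_and_cost(window_s):
    responses = response_messages(window_s) * RESPONSES_PER_CHAT * CHATS
    inbound = responses + PROMPTS_PER_CHAT * CHATS
    outbound = inbound            # one subscriber per chat
    persisted = inbound           # every inbound message is persisted
    total = inbound + outbound + persisted
    return total, total / 1_000_000 * PRICE_PER_MILLION_MSGS

for window_ms in (40, 100, 200):
    total, cost = total_messages_and_cost(window_ms / 1000)
    print(f"{window_ms}ms: {total // 1_000_000}M messages, ${cost:,.2f}")
```

Widening the rollup window only reduces the rolled-up response messages; the prompt messages and the per-chat connection and channel minutes are unchanged.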

<Aside data-type='further-reading'>
- [Talk with our sales team](https://ably.com/contact) to get a personalised quote.
- [Learn how HubSpot uses Ably to enable 128,000 businesses with live chat that just works](https://ably.com/case-studies/hubspot)
- [See how doxy.me turned realtime from a liability into a strategic asset](https://ably.com/case-studies/doxyme)
</Aside>