85 changes: 48 additions & 37 deletions ai/factory/README.md
@@ -21,18 +21,21 @@ User question
Dependencies (PyYAML, numpy, openai, anthropic, etc.) are installed inside the Docker container automatically. You don't need to install them on your host machine.

You'll need:

- **Docker** — for local development (`docker compose up -d`)
- **AWS CLI** — configured with credentials for DynamoDB and S3

Claude Integration (skip if you intend to use Bedrock)

- **OpenAI API key** — set as `OPENAI_API_KEY` environment variable (for embeddings)
- **Anthropic API key** — set as `ANTHROPIC_API_KEY` environment variable (for Claude responses)

Bedrock
After you make your first call to Bedrock, you'll need to do the following to continue:

- Go to the AWS Console:
- Bedrock → Model catalog (or Model access)
- There should be a prompt to submit use case details. Mine was at the top of the page.
- Fill it out — keep it simple ("AI chatbot for personal portfolio website")

## Creating a New Bot
@@ -46,6 +49,7 @@ ai/factory/bots/{bot_id}/
```

Example:

```bash
mkdir -p ai/factory/bots/cooking
```
@@ -110,6 +114,7 @@ ai/factory/bots/{bot_id}/prompt.yml
```

Example:

```yaml
prompt: |
You are ChefBot, a friendly cooking assistant.
@@ -133,6 +138,7 @@ ai/factory/bots/{bot_id}/data/
Two entry types are supported:

**String entries** — content is already readable, embedded as-is:

```yaml
- id: knife_basics
format: string
@@ -142,6 +148,7 @@ Two entry types are supported:
```

**Object entries** — a template applied to each item:

```yaml
- id: cooking_temps
format: object
@@ -213,6 +220,7 @@ python3 ai/factory/scaffold_bot.py {bot_id}
```

This reads your config.yml and creates:

- `app/{bot_id}.html` — the bot's page, fully wired up
- `app/bot_scripts/{bot_id}/` — for bot-specific CSS and JS
- `app/assets/{bot_id}/` — for bot-specific images (logo, etc.)
@@ -292,36 +300,36 @@ app/ ← frontend (generated by scaffold_bot.py)

### bot (required)

| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Bot identifier. Drives folder names, endpoints, filenames. |
| `enabled` | boolean | Set `false` to disable without deleting. |
| `name` | string | Display name (shown in header, chat labels). |
| `personality` | string | Personality hint for prompt context. |
| Field | Type | Description |
| ------------- | ------- | ---------------------------------------------------------- |
| `id` | string | Bot identifier. Drives folder names, endpoints, filenames. |
| `enabled` | boolean | Set `false` to disable without deleting. |
| `name` | string | Display name (shown in header, chat labels). |
| `personality` | string | Personality hint for prompt context. |

### bot.response_style

| Field | Type | Description |
|-------|------|-------------|
| `tone` | string | `"conversational"`, `"formal"`, `"technical"` |
| `length` | string | `"concise"`, `"detailed"` |
| `suggestions` | boolean | Show suggestion chips in the UI. |
| Field | Type | Description |
| ------------- | ------- | --------------------------------------------- |
| `tone` | string | `"conversational"`, `"formal"`, `"technical"` |
| `length` | string | `"concise"`, `"detailed"` |
| `suggestions` | boolean | Show suggestion chips in the UI. |

### bot.model

| Field | Type | Description |
|-------|------|-------------|
| `provider` | string | `"anthropic"` |
| `name` | string | Model ID, e.g., `"claude-sonnet-4-20250514"` |
| `max_tokens` | integer | Max response length. |
| Field | Type | Description |
| ------------ | ------- | -------------------------------------------- |
| `provider` | string | `"anthropic"` |
| `name` | string | Model ID, e.g., `"claude-sonnet-4-20250514"` |
| `max_tokens` | integer | Max response length. |

### bot.rag

| Field | Type | Description |
|-------|------|-------------|
| `embedding_model` | string | `"openai"` (uses text-embedding-3-small) |
| `top_k` | integer | Number of chunks to retrieve. Use 10+ if data has many similar entries. |
| `similarity_threshold` | float | Minimum cosine similarity (0.0–1.0). |
| Field | Type | Description |
| ---------------------- | ------- | ----------------------------------------------------------------------- |
| `embedding_model` | string | `"openai"` (uses text-embedding-3-small) |
| `top_k` | integer | Number of chunks to retrieve. Use 10+ if data has many similar entries. |
| `similarity_threshold` | float | Minimum cosine similarity (0.0–1.0). |
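The two retrieval knobs interact like this: every chunk below `similarity_threshold` is discarded, and of the survivors only the `top_k` highest-scoring are sent to the model. A minimal sketch of that selection logic (illustration only; the real implementation in `core/retrieval.py` uses numpy and the OpenAI embedding API):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def select_chunks(query_vec, chunks, top_k, similarity_threshold):
    """Score every chunk, drop those below the threshold, keep the best top_k."""
    scored = [(cosine_similarity(query_vec, c["embedding"]), c) for c in chunks]
    scored = [s for s in scored if s[0] >= similarity_threshold]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [c for _, c in scored[:top_k]]
```

This is also why a low threshold plus a small `top_k` can crowd out the right answer when many entries are near-duplicates: the cap, not the threshold, does the filtering.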

### bot.boundaries

@@ -333,13 +341,13 @@ List of starter questions shown as chips in the chat UI.

### frontend (required for scaffold)

| Field | Type | Description |
|-------|------|-------------|
| `subtitle` | string | Shown below the bot name in the header. |
| `welcome` | string | First message displayed in the chat. |
| `placeholder` | string | Input field hint text. |
| `badge` | string | Header badge text (e.g., "Beta", "v1"). |
| `nav` | list | Left sidebar links. Each item has `icon`, `label`, `section`. |
| Field | Type | Description |
| ------------- | ------ | ------------------------------------------------------------- |
| `subtitle` | string | Shown below the bot name in the header. |
| `welcome` | string | First message displayed in the chat. |
| `placeholder` | string | Input field hint text. |
| `badge` | string | Header badge text (e.g., "Beta", "v1"). |
| `nav` | list | Left sidebar links. Each item has `icon`, `label`, `section`. |

## Custom Formatters

@@ -349,7 +357,7 @@ The formatter registers itself on `window.BOT_CONFIG.formatMessage`:

```javascript
function myFormatMessage(text, container) {
// custom rendering logic
// custom rendering logic
}

window.BOT_CONFIG = window.BOT_CONFIG || {};
@@ -359,7 +367,9 @@ window.BOT_CONFIG.formatMessage = myFormatMessage;
Load it in your HTML **after** the BOT_CONFIG block and **before** chat.js:

```html
<script>window.BOT_CONFIG = { ... };</script>
<script>
window.BOT_CONFIG = { ... };
</script>
<script src="bot_scripts/{bot_id}/formatter.js"></script>
<script src="bot_scripts/chat.js"></script>
```
@@ -371,21 +381,22 @@ If no formatter is registered, `chat.js` uses its default plain text renderer.
The factory uses auto-discovery in `__init__.py`. At startup, it scans every folder in `bots/`, reads each `config.yml`, and registers API routes for any bot with `enabled: true`. Adding a new bot never requires editing `main.py`.

Each bot gets three endpoints:

- `POST /api/{bot_id}/chat` — send a message, get a response
- `GET /api/{bot_id}/config` — frontend configuration
- `GET /api/{bot_id}/warmup` — pre-load embedding cache
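The discovery loop can be sketched roughly as follows. This is an assumed shape, not the actual `__init__.py`; the loader function is parameterized here (e.g. `yaml.safe_load`) so the sketch stays dependency-free:

```python
from pathlib import Path

def discover_bots(bots_dir, load_config):
    """Yield (bot_id, config) for every enabled bot folder.

    load_config parses the text of a config.yml (e.g. yaml.safe_load).
    Folders without a config.yml, and bots with enabled: false, are skipped.
    """
    for folder in sorted(Path(bots_dir).iterdir()):
        config_path = folder / "config.yml"
        if not config_path.is_file():
            continue
        config = load_config(config_path.read_text())
        if config.get("bot", {}).get("enabled"):
            yield folder.name, config
```

In the real factory, each yielded `(bot_id, config)` pair would then be handed to the router builder that mounts the three endpoints above.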

## Existing Bots

| Bot | ID | Endpoint | Description |
|-----|----|----------|-------------|
| RobbAI | — | `/api/ai/chat` | Resume assistant. Runs on legacy code in `ai/`, not yet migrated to factory. |
| GuitarBot | `guitar` | `/api/guitar/chat` | Electric guitar instruction. First factory bot. |
| Bot | ID | Endpoint | Description |
| ------------------ | -------- | ------------------ | ---------------------------------------------------------------------------- |
| RobbAI | — | `/api/ai/chat` | Resume assistant. Runs on legacy code in `ai/`, not yet migrated to factory. |
| The Fret Detective | `guitar` | `/api/guitar/chat` | Electric guitar instruction. First factory bot. |

## Embedding Notes

All bot embeddings share one DynamoDB table (`ChatbotRAG`), partitioned by bot ID. Each record's primary key is `{bot_id}_{entry_id}` and includes a `bot_id` field for filtering.

The kill-and-fill approach on `--force` only deletes rows matching the target bot ID. Running embeddings for one bot never affects another.

If your bot has many similar entries (like GuitarBot's 48 triad voicings), increase `top_k` in your config to 10 or higher so the right result isn't crowded out by near-duplicates.
If your bot has many similar entries (like The Fret Detective's 48 triad voicings), increase `top_k` in your config to 10 or higher so the right result isn't crowded out by near-duplicates.
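The key scheme and the scoped delete can be illustrated with a couple of one-liners (a sketch; the real kill-and-fill issues DynamoDB batch writes against the `ChatbotRAG` table):

```python
def record_key(bot_id, entry_id):
    """Primary key for a ChatbotRAG record: '{bot_id}_{entry_id}'."""
    return f"{bot_id}_{entry_id}"

def rows_to_delete(rows, target_bot_id):
    """Kill-and-fill on --force: select only the target bot's rows,
    using the bot_id field each record carries for filtering."""
    return [r for r in rows if r["bot_id"] == target_bot_id]
```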
2 changes: 1 addition & 1 deletion ai/factory/bots/guitar/config.yml
@@ -1,7 +1,7 @@
bot:
id: "guitar"
enabled: true
name: "GuitarBot"
name: "The Fret Detective"
personality: "friendly"

response_style:
14 changes: 14 additions & 0 deletions ai/factory/bots/guitar/data/guitar-knowledge.yml
@@ -2254,6 +2254,20 @@ entries:
A|---0---|

E|---x---|'

- id: chord_b_major_barre
format: string
category: Chords
heading: B Major Barre Chord
content: 'B Major barre chord (A-shape at 2nd fret):
e|---2---|
B|---4---|
G|---4---|
D|---4---|
A|---2---|
E|---x---|
Bar your index finger across fret 2 on strings A through high E, then use your ring finger to bar the D, G, and B strings at fret 4. Do not play the low E string. This is the standard way to play B major on guitar since there is no easy open chord shape for B.'

- id: chord_e_minor
format: object
category: Chords
9 changes: 6 additions & 3 deletions ai/factory/bots/guitar/prompt.yml
@@ -1,5 +1,5 @@
prompt: |
You are GuitarBot, a friendly guitar instructor assistant built to help people learn electric guitar.
You are The Fret Detective, a friendly guitar instructor assistant built to help people learn electric guitar.

Today's date is {current_date}.

@@ -40,9 +40,12 @@ prompt: |

CRITICAL RULES:
- Keep ALL responses to 1-3 sentences maximum unless asked for more detail
- When a user asks for a chord without specifying the type (like "show me an A chord"), always default to the major chord. Only show minor, 7th, or other variations if they specifically ask for them.
- When a user asks for a chord without specifying the type, default to major. If no major chord exists for that note, show the triad voicings or explain it's typically played as a barre chord.
- Answer ONLY from the provided context
- If context doesn't contain the answer, say "I don't have that in my knowledge base" and suggest what you can help with

- If context doesn't contain the answer, say "I don't have that in my knowledge base," mention that you're logging it to be added, and suggest what you can help with
- When showing a tab diagram that already has a label (like "A Major open chord:"), do not add a redundant introduction. Just show the labeled diagram directly.

What You Can Discuss:
- Guitar basics: parts of the guitar, string names, how to hold a guitar, reading tab, fret numbering
- Restringing an electric guitar
69 changes: 69 additions & 0 deletions ai/factory/core/chatbot.py
@@ -162,3 +162,72 @@ def generate_response(
for chunk in relevant_chunks
]
}

def generate_response_stream(
bot_id: str,
user_message: str,
top_k: int,
similarity_threshold: float,
conversation_history: list[dict] = None,
):
"""
Same as generate_response, but yields text chunks for streaming.
"""
print(f"[STREAM] Starting stream for {bot_id}: {user_message[:50]}...") # STREAM_DEBUG

if conversation_history is None:
conversation_history = []

# Retrieve relevant context for this bot
relevant_chunks = retrieve_relevant_chunks(
bot_id=bot_id,
query=user_message,
top_k=top_k,
similarity_threshold=similarity_threshold
)

# Format context for the prompt
context = format_context_for_llm(relevant_chunks)

# Build messages array
messages = []

# Add conversation history
for msg in conversation_history:
messages.append({
"role": msg["role"],
"content": [{"text": msg["content"]}]
})

# Add current user message with context
user_content = f"""## Relevant Context:
{context}

## User Question:
{user_message}

Remember: Keep your response short and conversational. Write in PLAIN TEXT ONLY - do not use ** or any markdown. If you can't answer from the context, say so politely."""

messages.append({
"role": "user",
"content": [{"text": user_content}]
})

# Load this bot's system prompt
system_prompt = load_system_prompt(bot_id)

# Call Claude with streaming
client = get_bedrock_client()
response = client.converse_stream(
modelId="us.anthropic.claude-sonnet-4-20250514-v1:0",
inferenceConfig={"maxTokens": 1000},
system=[{"text": system_prompt}],
messages=messages
)

# Yield text chunks as they arrive
for event in response["stream"]:
if "contentBlockDelta" in event:
chunk = event["contentBlockDelta"]["delta"]["text"] # STREAM_DEBUG
print(f"[STREAM] chunk: {chunk[:20]}...") # STREAM_DEBUG
yield chunk
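The event parsing at the bottom of `generate_response_stream` can be exercised in isolation. A minimal sketch with fabricated event dicts (the field names mirror the `converse_stream` response shape used above, but the events here are hand-built, not real Bedrock output):

```python
def stream_text(events):
    """Extract text deltas from a converse_stream-style event sequence.

    Non-delta events (messageStart, messageStop, metadata) are ignored,
    exactly as in generate_response_stream's yield loop.
    """
    for event in events:
        if "contentBlockDelta" in event:
            yield event["contentBlockDelta"]["delta"]["text"]
```

Concatenating the yielded chunks reconstructs the full reply, which is the contract the `/chat/stream` endpoint and the frontend rely on.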
3 changes: 2 additions & 1 deletion ai/factory/core/retrieval.py
@@ -83,6 +83,7 @@ def get_openai_client() -> OpenAI:

def generate_query_embedding(query: str) -> list[float]:
"""Convert a user's question to an embedding vector."""
print(f"[STREAM] Embedding query: {query[:50]}...") # STREAM_DEBUG
client = get_openai_client()
response = client.embeddings.create(
model="text-embedding-3-small",
@@ -154,7 +155,7 @@ def retrieve_relevant_chunks(
'text': item['text'],
'similarity': float(similarity),
})

print(f"[STREAM] Retrieved {len(results)} chunks, top score: {results[0]['similarity'] if results else 'N/A'}") # STREAM_DEBUG
print(f"Found {len(results)} results above threshold ({similarity_threshold})")

print(f" Above 0.6: {len([r for r in results if r['similarity'] >= 0.6])}")
33 changes: 33 additions & 0 deletions ai/factory/core/router.py
@@ -23,6 +23,7 @@
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
from typing import Optional
from fastapi.responses import StreamingResponse


# ---------------------------------------------------------------------------
@@ -164,6 +165,38 @@ async def chat(request: ChatRequest):
print(f"Chatbot error ({bot_id}): {e}")
raise HTTPException(status_code=500, detail="Error processing your message")


@router.post("/chat/stream")
async def chat_stream(request: ChatRequest):
"""Send a message and stream the response."""
if not os.getenv('OPENAI_API_KEY'):
raise HTTPException(status_code=503, detail="Missing OpenAI API key")

if not request.message.strip():
raise HTTPException(status_code=400, detail="Message cannot be empty")

try:
config = load_bot_config(bot_id)
rag_config = config.get('bot', {}).get('rag', {})

from .chatbot import generate_response_stream

def stream_generator():
for chunk in generate_response_stream(
bot_id=bot_id,
user_message=request.message,
conversation_history=[msg.model_dump() for msg in request.conversation_history],
top_k=rag_config.get('top_k', 5),
similarity_threshold=rag_config.get('similarity_threshold')
):
yield chunk

return StreamingResponse(stream_generator(), media_type="text/plain")

except Exception as e:
print(f"Chatbot stream error ({bot_id}): {e}")
raise HTTPException(status_code=500, detail="Error processing your message")

@router.get("/config", response_model=BotConfigResponse)
async def get_config():
"""Return bot configuration for the frontend."""