22 changes: 12 additions & 10 deletions .gitignore
@@ -1,12 +1,14 @@
*.egg-info
venv
*_data.json
*_embeddings.bin
.ui_history
__pycache__
/typeagent.egg-info
/.env
/.ui_history
/.venv
/db
/*.db
/evals
/gmail/client_secret.json
/gmail/token.json
/gmail/mail_dump/
/testdata/Episode_53_Answer_results.json
/testdata/Episode_53_Search_results.json
/evals
/junk
__pycache__
testdata/MP/
db
/testdata/MP/
4 changes: 1 addition & 3 deletions AGENTS.md
@@ -1,6 +1,4 @@
---
applyTo: '**/*.py'
---
# Absolute edict

**DO NOT BE OBSEQUIOUS**

25 changes: 9 additions & 16 deletions TADA.md
@@ -4,7 +4,7 @@ Talk at PyBay is on Sat, Oct 18 in SF

## Software

- Design and implement high-level API to support ingestion and querying
- Rename utool.py to query.py
- Unify Podcast and VTT ingestion (use shared message and metadata classes)?
- Code structure (do podcasts and transcripts need to be under typeagent?)?
- Distinguish between release deps and build/dev deps?
@@ -18,8 +18,6 @@ Talk at PyBay is on Sat, Oct 18 in SF
### Minor (can do without)

- Reduce duplication between ingest_vtt.py and typeagent/transcripts/
- Why add speaker detection? Doesn't WebVTT support `<v ...>`? In fact things like `[MUSIC]` are used as stage directions, not for the speaker.
- Change 'import' to 'ingest' in file/class/function/comment (etc.) when it comes to entering data into the database; import is too ambiguous
- `get_transcript_speakers` and `get_transcript_duration` should not re-parse the transcript -- they should just take the parsed vtt object.

### Not doing:
@@ -35,23 +33,19 @@ Talk at PyBay is on Sat, Oct 18 in SF

- Getting Started
- Document the high-level API
- Document the MCP API
- Document what should go in `.env` and where it should live
- And alternatively what to put in shell env directly
- Document build/release process
- Document how to run evals (but don't reveal all the data)
- And alternatively (first?) what to put in shell env directly
- Document test/build/release process
- Document how to run evaluations (but don't reveal all the data)

## Demos

- Adrian Tchaikovsky Podcast: ready
- Monty Python Episode: almost ready
- Documents demo (doesn't look so easy)
- Email demo: Umesh has almost working prototype
- Monty Python Episode: ready but need to pick a list of sketches to index
- Email demo: Umesh has a working prototype

## Talk

- Write slides
- Make a pretty design for slides?
- Practice in private, timing, updating slides as needed
- Practice run for the team?
- Anticipate questions about (Lazy) GraphRAG?
@@ -97,21 +91,21 @@ this summer and its API.
```sh
pip install typeagent-py # Installs typeagent and dependencies
```
2. Create conversation:
2. Create conversation (TENTATIVE):
```py
import typeagent

conv = typeagent.get_conversation(dbfile="mymemory.sqlite")
# Could be empty (new) or could contain previously ingested data
# You can always ingest additional messages
```
3. Ingest messages:
3. Ingest messages (TENTATIVE):
```py
for message in ...: # Source of message strings
    metadata = ...  # Set date/time, speaker(s), listener(s)
    conv.ingest_message(message, metadata)
```
4. Query:
4. Query (TENTATIVE):
```py
request = input("> ")
answer = conv.query(request)
@@ -124,4 +118,3 @@ this summer and its API.
- To PyPI project
- To GitHub (microsoft/typeagent-py)
- To docs

99 changes: 99 additions & 0 deletions docs/query-method.md
@@ -0,0 +1,99 @@
# Conversation Query Method

The `query()` method provides a simple, end-to-end API for querying conversations using natural language.

## Usage

```python
from typeagent import create_conversation
from typeagent.transcripts.transcript import TranscriptMessage

# Create a conversation
conv = await create_conversation(
    "my_conversation.db",
    TranscriptMessage,
    name="My Conversation",
)

# Add messages
messages: list[TranscriptMessage] = [...]
await conv.add_messages_with_indexing(messages)

# Query the conversation
question: str = input("typeagent> ")
answer: str = await conv.query(question)
print(answer)
```

## How It Works

The `query()` method encapsulates the full TypeAgent query pipeline:

1. **Natural Language Understanding**: Uses TypeChat to translate the natural language question into a structured search query
2. **Search**: Executes the search across the conversation's messages and knowledge base
3. **Answer Generation**: Uses an LLM to generate a natural language answer based on the search results
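
The three stages above can be sketched as a chain of awaited calls. This is only a toy illustration of the control flow: `SearchQuery`, `translate`, `search`, `generate_answer`, and `toy_query` are hypothetical names invented here, and keyword matching stands in for the TypeChat translation, index search, and LLM answer generation that the real pipeline performs.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SearchQuery:
    """Hypothetical structured query; not TypeAgent's actual schema."""

    terms: list[str]


async def translate(question: str) -> SearchQuery:
    # Stage 1 stand-in: the real pipeline asks an LLM (via TypeChat) for this.
    words = [w.strip("?.,!").lower() for w in question.split()]
    return SearchQuery(terms=[w for w in words if len(w) > 3])


async def search(query: SearchQuery, messages: list[str]) -> list[str]:
    # Stage 2 stand-in: naive substring match instead of index lookups.
    return [m for m in messages if any(t in m.lower() for t in query.terms)]


async def generate_answer(question: str, hits: list[str]) -> str:
    # Stage 3 stand-in: the real pipeline asks an LLM to synthesize an answer.
    if not hits:
        return "No answer found: no messages matched the question."
    return " ".join(hits)


async def toy_query(question: str, messages: list[str]) -> str:
    # Each stage consumes the previous stage's output.
    query = await translate(question)
    hits = await search(query, messages)
    return await generate_answer(question, hits)


if __name__ == "__main__":
    demo_messages = [
        "The async keyword is used to define asynchronous functions.",
        "You use await to wait for asynchronous operations to complete.",
    ]
    print(asyncio.run(toy_query("What does await do?", demo_messages)))
```

Note that the toy version, like the documented method, always returns a string, including when nothing matches.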

## Method Signature

```python
async def query(self, question: str) -> str:
    """
    Run an end-to-end query on the conversation.

    Args:
        question: The natural language question to answer

    Returns:
        A natural language answer string. If the answer cannot be determined,
        returns an explanation of why no answer was found.
    """
```

## Behavior

- **Success**: Returns a natural language answer synthesized from the conversation content
- **No Answer Found**: Returns a message explaining why the answer couldn't be determined
- **Search Failure**: Returns an error message describing the failure

## Performance Considerations

The `query()` method caches the TypeChat translators per conversation instance, so repeated queries on the same conversation are more efficient.
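
One common way to get this kind of per-instance caching is lazy initialization: build the expensive object on first use and keep it on the instance. The sketch below shows that pattern only; `Translator`, `ToyConversation`, and `_get_translator` are hypothetical names, and the actual caching code in `query()` may be organized differently.

```python
from typing import Optional


class Translator:
    """Hypothetical stand-in for an object that is expensive to construct."""

    instances_built = 0  # Counts constructions so the caching is observable

    def __init__(self) -> None:
        Translator.instances_built += 1


class ToyConversation:
    def __init__(self) -> None:
        # Nothing is built until the first query arrives.
        self._translator: Optional[Translator] = None

    def _get_translator(self) -> Translator:
        # Build on first use, then reuse for the lifetime of this instance.
        if self._translator is None:
            self._translator = Translator()
        return self._translator

    def query(self, question: str) -> str:
        self._get_translator()
        return f"(toy answer to {question!r})"


if __name__ == "__main__":
    conv = ToyConversation()
    conv.query("first question")
    conv.query("second question")
    print(Translator.instances_built)  # One translator served both queries
```

Because the cache lives on the instance, a second conversation object builds its own translator rather than sharing the first one's.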

## Example: Interactive Loop

```python
while True:
    question: str = input("typeagent> ")
    if not question.strip():
        continue
    if question.lower() in ("quit", "exit"):
        break

    answer: str = await conv.query(question)
    print(answer)
```

## Example: Batch Processing

```python
questions = [
    "What was discussed?",
    "Who were the speakers?",
    "What topics came up?",
]

for question in questions:
    answer = await conv.query(question)
    print(f"Q: {question}")
    print(f"A: {answer}")
    print()
```

## Related APIs

For more control over the query pipeline, you can use the lower-level APIs:

- `searchlang.search_conversation_with_language()` - Search only
- `answers.generate_answers()` - Answer generation from search results

See `tools/utool.py` for examples of using these lower-level APIs with debugging options.
102 changes: 102 additions & 0 deletions examples/simple_query_demo.py
@@ -0,0 +1,102 @@
#!/usr/bin/env python3
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

"""
Simple demo of the conversation.query() method.

This demonstrates the end-to-end query pattern:
    question = input("typeagent> ")
    answer = await conv.query(question)
    print(answer)
"""

import asyncio

from typeagent import create_conversation
from typeagent.aitools.embeddings import AsyncEmbeddingModel
from typeagent.aitools.utils import load_dotenv
from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.transcripts.transcript import TranscriptMessage, TranscriptMessageMeta


async def main():
    """Demo the simple query API."""
    # Load API keys
    load_dotenv()

    # Create a conversation with some sample content
    print("Creating conversation...")
    conv = await create_conversation(
        None,
        TranscriptMessage,
        name="Demo Conversation",
    )

    # Add some sample messages
    messages = [
        TranscriptMessage(
            text_chunks=["Welcome to the Python programming tutorial."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=["Today we'll learn about async/await in Python."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=[
                "Python is a great language for beginners and experts alike."
            ],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=["The async keyword is used to define asynchronous functions."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=[
                "You use await to wait for asynchronous operations to complete."
            ],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
    ]

    print("Adding messages and building indexes...")
    result = await conv.add_messages_with_indexing(messages)
    print(f"Conversation ready with {await conv.messages.size()} messages.")
    print(f"Added {result.messages_added} messages, {result.semrefs_added} semantic refs")

    # Check indexes
    if conv.secondary_indexes:
        if conv.secondary_indexes.message_index:
            msg_index_size = await conv.secondary_indexes.message_index.size()
            print(f"Message index has {msg_index_size} entries")
    print()

    # Interactive query loop
    print("You can now ask questions about the conversation.")
    print("Type 'quit' or 'exit' to stop.\n")

    while True:
        try:
            question: str = input("typeagent> ")
            if not question.strip():
                continue
            if question.strip().lower() in ("quit", "exit", "q"):
                break

            # This is the simple API pattern
            answer: str = await conv.query(question)
            print(answer)
            print()

        except EOFError:
            print()
            break
        except KeyboardInterrupt:
            print("\nExiting...")
            break


if __name__ == "__main__":
    asyncio.run(main())
3 changes: 0 additions & 3 deletions gmail/.gitignore

This file was deleted.

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "typeagent"
version = "0.2.4"
version = "0.2.5"
description = "TypeAgent implements an agentic memory framework."
readme = { file = "README.md", content-type = "text/markdown" }
authors = [