17 changes: 17 additions & 0 deletions .github/workflows/mkdocs.yml
@@ -0,0 +1,17 @@
name: Deploy MkDocs

on:
  push:
    branches:
      - main

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: 3.x
      - run: pip install mkdocs-material
      - run: mkdocs gh-deploy --force
30 changes: 23 additions & 7 deletions README.md
@@ -3,6 +3,8 @@
We develop a cascaded voice assistant system that includes ASR, TTS and a
ReAct based Agent for reasoning and action taking.

[![Documentation](https://img.shields.io/badge/docs-mkdocs-blue)](https://sentientia.github.io/Aura/)

## Aura: Demo

[![Aura Demo](https://img.youtube.com/vi/cb7w0GVwwF0/0.jpg)](https://www.youtube.com/watch?v=cb7w0GVwwF0)
@@ -11,18 +13,28 @@ ReAct based Agent for reasoning and action taking.

![Aura System Architecture](docs/images/aura_system_white.png)

## Documentation

For detailed documentation, please visit our [Documentation Website](https://sentientia.github.io/Aura/).

The documentation includes:
- [Installation Guide](https://sentientia.github.io/Aura/installation/)
- [Architecture Overview](https://sentientia.github.io/Aura/architecture/)
- [Agent Documentation](https://sentientia.github.io/Aura/agents/)
- [Action Documentation](https://sentientia.github.io/Aura/actions/)
- [UI Documentation](https://sentientia.github.io/Aura/ui/)
- [Contributing Guide](https://sentientia.github.io/Aura/contributing/)

## Repository Structure

```
.
├── agent/ # Core agent implementation
│ ├── actions/ # Action handlers for different tasks.
│ ├── actions/ # Action handlers for different tasks
│ ├── controller/ # Agent state and control logic
│ ├── llm/ # Language model integration
│ ├── secrets/ # Secure credential storage
│ └── agenthub/ # Agent implementations
├── ui/ # User interface components
│ ├── local_speech_app.py # Speech interface implementation (using gradio)
@@ -32,12 +44,12 @@ ReAct based Agent for reasoning and action taking.
├── llm_serve/ # Language model serving script
├── dst/ # Dialog State Tracking. Has the scripts for finetuning LLMs for DST
|
└── environment.yaml # Conda environment configuration
├── dst/ # Dialog State Tracking. Has the scripts for finetuning LLMs for DST
└── environment.yaml # Conda environment configuration
```

## Setup
## Quick Setup

1. Create the conda environment:
```bash
@@ -62,4 +74,8 @@ ReAct based Agent for reasoning and action taking.
python ui/local_speech_app.py
```

## Human in the Loop Data: https://docs.google.com/spreadsheets/d/16_DApAlgunmG3pR4f8p9JYjO-v-2m8ZxduN9fZ-AblI/edit?usp=sharing
For more detailed setup instructions, please refer to the [Installation Guide](https://sentientia.github.io/Aura/installation/) in our documentation.

## Human in the Loop Data

For human-in-the-loop data, please visit [this Google Sheets document](https://docs.google.com/spreadsheets/d/16_DApAlgunmG3pR4f8p9JYjO-v-2m8ZxduN9fZ-AblI/edit?usp=sharing).
76 changes: 76 additions & 0 deletions docs/actions/answer_action.md
@@ -0,0 +1,76 @@
# Answer Action

The Answer Action is used by the QA Agent to provide direct answers to questions. It is a specialized action for question-answering tasks.

## Overview

The Answer Action is implemented in the `AnswerAction` class, which extends the `Action` class. It is designed to provide direct answers to questions without engaging in extended conversations.

## Capabilities

The Answer Action can:

- Provide direct answers to questions
- Format answers for different question types
- Handle multiple-choice questions
- Provide explanations for answers

## Implementation

The Answer Action is implemented in the `agent/actions/answer_action.py` file. It uses the following components:

- **Action Base Class**: Extends the `Action` class defined in `agent/actions/action.py`.
- **Result Processing**: Formats the answer for the user.
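
As a rough illustration of the structure described above, the sketch below shows one way an `Action` base class and an `AnswerAction` subclass could fit together. The class bodies, the return value of `execute`, and the dictionary-shaped `state` are assumptions made for illustration; the actual code in `agent/actions/` may differ.

```python
from abc import ABC, abstractmethod
from typing import Any


class Action(ABC):
    """Hypothetical stand-in for the base class in agent/actions/action.py."""

    def __init__(self, thought: str, payload: Any) -> None:
        self.thought = thought  # the agent's reasoning behind the action
        self.payload = payload  # the data the action operates on

    @abstractmethod
    def execute(self, state: dict) -> Any:
        """Run the action against the current state and return an observation."""


class AnswerAction(Action):
    """Illustrative subclass: returns the payload as the final answer."""

    def execute(self, state: dict) -> str:
        answer = str(self.payload).strip()
        # Assumed convention: record the answer on the state for later steps.
        state["answer"] = answer
        return answer


state: dict = {}
action = AnswerAction(
    thought="I know that the capital of France is Paris",
    payload="The capital of France is Paris.",
)
observation = action.execute(state)
```

In this sketch the observation is simply the formatted payload, which matches the description that the observation contains the answer.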

## Usage

The Answer Action is used by the QA Agent to provide answers to questions. To use the Answer Action:

1. Create a new instance of the `AnswerAction` class with the appropriate thought and payload:
```python
from agent.actions.answer_action import AnswerAction

action = AnswerAction(
    thought="I know that the capital of France is Paris",
    payload="The capital of France is Paris."
)
```

2. Execute the action with the current state:
```python
observation = action.execute(state)
```

3. The observation will contain the answer.

## Example

Here's an example of how the Answer Action is used to answer a question:

1. Agent creates an Answer Action:
```python
action = AnswerAction(
    thought="Based on the search results, I can see that the current president of the United States is Joe Biden",
    payload="The current president of the United States is Joe Biden."
)
```

2. Agent executes the action:
```python
observation = action.execute(state)
```

3. The action provides the answer "The current president of the United States is Joe Biden."

4. The observation contains the answer, which is returned to the user.

## Integration with Other Actions

The Answer Action is typically used as the final action in a question-answering flow. For example:

1. User asks a question.
2. Agent uses a Web Search Action to find information to answer the question.
3. Agent processes the search results.
4. Agent uses an Answer Action to provide the answer to the user.

This combination of actions allows the agent to provide accurate and informative answers to user questions.
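
The flow described above can be sketched as a single function. This is a minimal illustration, not the agent's actual control loop: `fake_web_search` is a hypothetical stand-in for a Web Search Action that returns canned snippets, and `answer_question` is an invented helper name.

```python
from typing import Callable, List


def fake_web_search(query: str) -> List[str]:
    """Hypothetical stand-in for a Web Search Action; returns canned snippets."""
    return [f"Snippet about: {query}"]


def answer_question(question: str, search: Callable[[str], List[str]]) -> dict:
    """Walk the steps: take the question, search, process results, answer."""
    results = search(question)                  # search for information
    evidence = results[0] if results else ""    # process the search results
    # Fall back to an explicit message when the search returns nothing.
    answer = evidence or "I could not find an answer."
    return {"question": question, "evidence": results, "answer": answer}


trace = answer_question("capital of France", fake_web_search)
```

Passing the search function as a parameter keeps the sketch testable without network access; a real agent would presumably call its own search action here.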
94 changes: 94 additions & 0 deletions docs/actions/calendar_action.md
@@ -0,0 +1,94 @@
# Calendar Action

The Calendar Action is used to manage calendar events. It can create, read, update, and delete events on the user's calendar.

## Overview

The Calendar Action is implemented in the `CalendarAction` class, which extends the `Action` class. It is designed to interact with calendar services to manage events.

## Capabilities

The Calendar Action can:

- Create new calendar events
- Delete existing calendar events
- Retrieve calendar events
- Update calendar events

## Implementation

The Calendar Action is implemented in the `agent/actions/calendar_action.py` file. It uses the following components:

- **Action Base Class**: Extends the `Action` class defined in `agent/actions/action.py`.
- **Calendar API Integration**: Uses external calendar APIs to manage events.
- **Result Processing**: Processes and formats calendar operation results for the agent.

## Usage

The Calendar Action is used by the agent to manage calendar events. To use the Calendar Action:

1. Create a new instance of the `CalendarAction` class with the appropriate thought and payload:
```python
from agent.actions.calendar_action import CalendarAction

action = CalendarAction(
    thought="I should create a calendar event for the meeting",
    payload={
        "event": "create",
        "start_time": "2025-07-10T14:00:00",
        "end_time": "2025-07-10T15:00:00",
        "title": "Team Meeting",
        "description": "Weekly team sync-up meeting"
    }
)
```
```

2. Execute the action with the current state:
```python
observation = action.execute(state)
```

3. The observation will contain the result of the calendar operation.
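
Because the payload carries ISO 8601 timestamps, it can be useful to check it before executing the action. The sketch below is one possible check, assuming the payload keys shown above. The helper name `validate_calendar_payload` is hypothetical, and the set of event names other than `"create"` is an assumption inferred from the listed capabilities.

```python
from datetime import datetime

# Only "create" appears in this document; the other names are assumed.
ALLOWED_EVENTS = {"create", "read", "update", "delete"}


def validate_calendar_payload(payload: dict) -> list:
    """Return a list of problems found in a calendar payload (empty if none)."""
    problems = []
    event = payload.get("event")
    if event not in ALLOWED_EVENTS:
        problems.append(f"unsupported event: {event!r}")
    try:
        start = datetime.fromisoformat(payload["start_time"])
        end = datetime.fromisoformat(payload["end_time"])
    except (KeyError, ValueError, TypeError) as exc:
        problems.append(f"bad or missing timestamp: {exc}")
    else:
        # Only compare the times when both parsed successfully.
        if end <= start:
            problems.append("end_time must be after start_time")
    return problems


problems = validate_calendar_payload({
    "event": "create",
    "start_time": "2025-07-10T14:00:00",
    "end_time": "2025-07-10T15:00:00",
    "title": "Team Meeting",
})
```

Returning a list of problems rather than raising on the first one lets the agent report every issue to the user in a single message.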

## Example

Here's an example of how the Calendar Action is used to create a calendar event:

1. Agent creates a Calendar Action:
```python
action = CalendarAction(
    thought="I should create a calendar event for the doctor's appointment",
    payload={
        "event": "create",
        "start_time": "2025-07-15T10:00:00",
        "end_time": "2025-07-15T11:00:00",
        "title": "Doctor's Appointment",
        "description": "Annual check-up with Dr. Smith"
    }
)
```
```

2. Agent executes the action:
```python
observation = action.execute(state)
```

3. The action creates a calendar event for the doctor's appointment on July 15, 2025, from 10:00 AM to 11:00 AM.

4. The observation contains the result of the calendar operation, which might include:
- Confirmation that the event was created
- Details of the created event
- Any errors or warnings that occurred during the operation

5. The agent processes the observation and creates a new action based on the result, typically a Chat Action to confirm the calendar operation with the user.
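
The last step, turning a calendar observation into a confirmation for the user, might look like the sketch below. The shape of the observation (a dictionary with optional `"error"` and `"event"` keys) and the helper name `confirmation_message` are assumptions; this document does not specify the actual observation format.

```python
def confirmation_message(observation: dict) -> str:
    """Build the text of a follow-up chat message from a calendar result."""
    # Report errors first so the user is never told a failed event succeeded.
    if observation.get("error"):
        return f"Sorry, I could not update your calendar: {observation['error']}"
    event = observation.get("event", {})
    title = event.get("title", "your event")
    start = event.get("start_time", "the requested time")
    return f"I have scheduled {title} starting at {start}."


message = confirmation_message(
    {"event": {"title": "Doctor's Appointment", "start_time": "2025-07-15T10:00:00"}}
)
```

The resulting string would then serve as the payload of the Chat Action that confirms the operation.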

## Integration with Other Actions

The Calendar Action is often used in conjunction with other actions to provide a complete interaction flow. For example:

1. User asks to schedule a meeting.
2. Agent uses a Chat Action to gather details about the meeting.
3. Agent uses a Calendar Action to create the meeting event.
4. Agent uses a Chat Action to confirm the meeting details with the user.

Chaining these actions lets the agent gather the event details, create the event, and confirm the result with the user within a single conversation.
75 changes: 75 additions & 0 deletions docs/actions/chat_action.md
@@ -0,0 +1,75 @@
# Chat Action

The Chat Action is used to engage in conversation with the user. It is the most basic action and is used for all text-based interactions.

## Overview

The Chat Action is implemented in the `ChatAction` class, which extends the `Action` class. It is designed to handle text-based interactions between the agent and the user.

## Capabilities

The Chat Action can:

- Send text messages to the user
- Process user responses
- Update the conversation history

## Implementation

The Chat Action is implemented in the `agent/actions/chat_action.py` file. It uses the following components:

- **Action Base Class**: Extends the `Action` class defined in `agent/actions/action.py`.
- **State Management**: Updates the state with the conversation history.
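
To make the state-management point concrete, here is a minimal sketch of a state object that records the conversation history as each message is exchanged. `ConversationState` and `chat_turn` are hypothetical names; the real state class lives elsewhere in the repository and may be structured differently.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ConversationState:
    """Hypothetical state object holding (speaker, text) turns in order."""

    history: List[Tuple[str, str]] = field(default_factory=list)

    def add_turn(self, speaker: str, text: str) -> None:
        self.history.append((speaker, text))


def chat_turn(state: ConversationState, message: str, user_reply: str) -> str:
    """Record the agent message and the user's reply; return the reply."""
    state.add_turn("agent", message)
    state.add_turn("user", user_reply)
    # The user's reply plays the role of the observation in this sketch.
    return user_reply


state = ConversationState()
observation = chat_turn(
    state,
    "What kind of restaurant are you looking for?",
    "I'm looking for an Italian restaurant.",
)
```

Here the user's reply is passed in directly so the example runs without a speech or text interface; in the agent it would come from the UI.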

## Usage

The Chat Action is used by the agent to communicate with the user. To use the Chat Action:

1. Create a new instance of the `ChatAction` class with the appropriate thought and payload:
```python
from agent.actions.chat_action import ChatAction

action = ChatAction(thought="I should greet the user", payload="Hello, how can I help you today?")
```

2. Execute the action with the current state:
```python
observation = action.execute(state)
```

3. The observation will be the user's response to the message.

## Example

Here's an example of how the Chat Action is used in a conversation:

1. Agent creates a Chat Action:
```python
action = ChatAction(thought="I should ask about the user's preferences", payload="What kind of restaurant are you looking for?")
```

2. Agent executes the action:
```python
observation = action.execute(state)
```

3. The message "What kind of restaurant are you looking for?" is sent to the user.

4. The user responds with "I'm looking for an Italian restaurant."

5. The observation contains the user's response: "I'm looking for an Italian restaurant."

6. The agent processes the observation and creates a new action based on the user's response.

## Integration with Other Actions

The Chat Action is often used in conjunction with other actions to provide a complete interaction flow. For example:

1. Agent uses a Web Search Action to find information about Italian restaurants.
2. Agent processes the search results.
3. Agent uses a Chat Action to present the information to the user.
4. User responds with a preference.
5. Agent uses a Calendar Action to add the reservation to the user's calendar.
6. Agent uses a Chat Action to confirm the reservation with the user.

This combination of actions allows the agent to provide a seamless and natural interaction experience for the user.