This repository contains the implementation of SARA (Stepwise Advisory agent with Retrieval Augmentation), a system designed for question-answering tasks on the HotPotQA dataset. SARA uses an advisory memory system to improve its performance by reflecting on and learning from past experiences.
- Create a conda environment with Python 3.9:

  ```bash
  conda create -n sara python=3.9
  conda activate sara
  ```
- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```
This code requires Snowflake API access to run. You need to configure your Snowflake connection parameters:
- Edit `snow_params.json` and fill in your Snowflake credentials:

  ```json
  {
    "account": "YOUR_ACCOUNT",
    "user": "YOUR_USERNAME",
    "password": "YOUR_PASSWORD",
    "role": "SYSADMIN",
    "database": "",
    "schema": "",
    "warehouse": ""
  }
  ```

- Make sure you have the necessary Snowflake permissions and access to the required models (e.g., `llama3.1-70b`).
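The connection itself is handled by the `SnowClient` class in `models.py`. For orientation, here is a minimal standalone sketch of what these credentials enable, using the official `snowflake-connector-python` package and Snowflake's Cortex `COMPLETE` function (an illustration, not the repository's actual client code):

```python
# Minimal sketch of a Snowflake Cortex completion call. The repository wraps
# this kind of logic in SnowClient (src/models.py); this standalone version
# is for illustration only.
import json

import snowflake.connector

with open("snow_params.json") as f:
    params = json.load(f)

# The keys in snow_params.json match the connector's keyword arguments.
conn = snowflake.connector.connect(**params)
cur = conn.cursor()

# SNOWFLAKE.CORTEX.COMPLETE is Snowflake's built-in LLM completion function.
cur.execute(
    "SELECT SNOWFLAKE.CORTEX.COMPLETE('llama3.1-70b', %s)",
    ("What is a multi-hop question?",),
)
print(cur.fetchone()[0])
cur.close()
conn.close()
```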
To run SARA with the pre-built advisory memory, execute:

```bash
bash run_hotpotQA.sh
```

The script is configured to:
- Use the `llama3.1-70b` model
- Test on easy/medium/hard difficulty samples (configurable via the `diff` variable)
- Load advisory memory from `build_advisory_memory/advisory_memory.json`
- Save results to `results/sara_70b_{difficulty}.jsonl`
- Enable reflection and memory usage
- Track token usage in `token_tracking/log_70b/{difficulty}/`
You can modify the script to change:

- `lm_name`: the language model to use
- `lm_codename`: a short name for the model (used in output paths)
- `diff`: the difficulty level (`easy`, `medium`, or `hard`)
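For example, the configurable block at the top of `run_hotpotQA.sh` might look roughly like this (values and exact syntax are illustrative; check the script for the actual interface):

```bash
# Illustrative configuration -- the real script may name or pass these differently.
lm_name="llama3.1-70b"   # language model to use
lm_codename="70b"        # short model name used in output paths
diff="medium"            # difficulty level: easy | medium | hard
```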
After running inference, you can evaluate the results using:

```bash
python evaluate/eval.py
```

The evaluation script:
- Loads results from `results/sara_{model}_{difficulty}.jsonl`
- Calculates SQuAD-style F1 scores for all predictions
- Reports the overall F1 score and the F1 score for cases where advice was requested
- Helps analyze the impact of the advisory memory system
Make sure to update the `lm_codename` and `diff` variables in `evaluate/eval.py` to match your run configuration.
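For reference, SQuAD-style F1 treats the prediction and the gold answer as bags of normalized tokens. A minimal sketch of the metric (the exact normalization in `evaluate/eval.py` may differ slightly):

```python
# Minimal SQuAD-style token-level F1. evaluate/eval.py may normalize
# differently (e.g., additional whitespace handling).
import re
import string
from collections import Counter

def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and articles, split into tokens."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()

def f1_score(prediction: str, gold: str) -> float:
    pred_tokens, gold_tokens = normalize(prediction), normalize(gold)
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```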
The `build_advisory_memory/` directory contains scripts and utilities for building the advisory memory system from training data. The advisory memory stores past experiences and advice that SARA can retrieve during inference.
The advisory memory building process consists of several steps:
- `0_get_valid_ids.py`: Extracts valid sample IDs from the training data based on F1 score thresholds. This identifies cases where the model initially failed but could potentially benefit from advice.
- `1_1_filter_advice.py`: Filters reactive advice cases from the training data, keeping only those that correspond to valid IDs. This creates a curated set of advice-seeking scenarios.
- `1_2_analyze_context_reactive.py`: Analyzes the context of reactive advice cases, extracting relevant situational information that will be used as keys for memory retrieval.
- `2_reflect_reactive.py`: Performs reflection on reactive advice cases using the language model. This step generates reflective insights about why certain advice was helpful, what the situation was, and what the consequences were. The reflection process helps create more meaningful memory entries.
- `3_save_advisory_memory.py`: Builds and saves the final advisory memory knowledge base (see the schema sketch after this list). This script:
  - Processes reflected reactive advice records
  - Generates embeddings for memory keys (situational contexts)
  - Organizes the memory structure with keys, embeddings, and records
  - Saves the final `advisory_memory.json` file that can be loaded during inference
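The exact schema of `advisory_memory.json` is determined by `3_save_advisory_memory.py`; conceptually, each entry pairs a situational key and its embedding with the reflected advice record, along these lines (all field names and values here are illustrative assumptions, and the embedding vector is truncated for display):

```json
{
  "entries": [
    {
      "key": "The agent kept searching with an overly broad query and retrieved irrelevant pages.",
      "embedding": [0.012, -0.087, 0.153],
      "record": {
        "advice": "Narrow the search query using entities already confirmed in earlier hops.",
        "reflection": "The advice helped because it focused retrieval on the missing bridge entity."
      }
    }
  ]
}
```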
- `models.py`: Contains the `SnowClient` class for interacting with the Snowflake API
- `utils_build.py`: Utility functions for building advisory memory, including embedding generation (see the sketch below) and similarity search
- `snow_params.json`: Snowflake connection parameters (same format as in the root directory)
- `advisory_memory.json`: The final output file containing the built advisory memory
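For embedding generation, one plausible route (an assumption about the approach, not a description of `utils_build.py` itself) is Snowflake Cortex's `EMBED_TEXT_768` function:

```python
# Sketch: embed a memory key via Snowflake Cortex. utils_build.py may use a
# different model or call path; 'e5-base-v2' is one of Cortex's supported
# embedding models.
import json

def embed_text(cursor, text: str) -> list[float]:
    cursor.execute(
        "SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_768('e5-base-v2', %s)",
        (text,),
    )
    value = cursor.fetchone()[0]
    # Depending on the connector version, the vector may arrive as a Python
    # list or as a JSON-encoded string.
    return value if isinstance(value, list) else json.loads(value)
```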
To build the advisory memory from scratch:

- Ensure you have the required training data files in the appropriate locations
- Configure `snow_params.json` with your Snowflake credentials
- Run the scripts in sequence:

  ```bash
  cd build_advisory_memory
  python 0_get_valid_ids.py
  python 1_1_filter_advice.py
  python 1_2_analyze_context_reactive.py
  python 2_reflect_reactive.py
  python 3_save_advisory_memory.py
  ```
The final `advisory_memory.json` will be created in the `build_advisory_memory/` directory and can be used with the main SARA inference script.
```
├── build_advisory_memory/   # Scripts for building advisory memory
├── data/                    # HotPotQA dataset files
├── evaluate/                # Evaluation scripts
├── results/                 # Inference results
├── src/                     # Main source code
│   ├── hotpot_agent_sara.py # Main SARA agent implementation
│   ├── models.py            # Model definitions
│   ├── utils_general.py     # General utilities
│   ├── utils_sara.py        # SARA-specific utilities
│   └── prompt/              # Prompt templates
├── token_tracking/          # Token usage logs
├── run_hotpotQA.sh          # Main execution script
├── snow_params.json         # Snowflake configuration
└── requirements.txt         # Python dependencies
```
- The advisory memory system uses semantic similarity search to retrieve relevant past experiences during inference (see the retrieval sketch after these notes)
- Token tracking files are generated during inference for analysis purposes
- The system supports different difficulty levels (easy, medium, hard) for evaluation
- Reflection is a key component that helps the agent learn from past mistakes and successes
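As a rough picture of that retrieval step, here is a minimal sketch assuming entries are ranked by cosine similarity between a query embedding and the stored key embeddings (the standard approach for semantic search; the actual logic lives in the repository's utilities, and the `"embedding"` field name follows the illustrative schema above):

```python
# Sketch of semantic similarity retrieval over the advisory memory:
# rank stored entries by cosine similarity to the query embedding.
import numpy as np

def retrieve(query_emb: list[float], memory: list[dict], top_k: int = 3) -> list[dict]:
    q = np.asarray(query_emb, dtype=float)
    q /= np.linalg.norm(q)
    keys = np.asarray([entry["embedding"] for entry in memory], dtype=float)
    keys /= np.linalg.norm(keys, axis=1, keepdims=True)
    scores = keys @ q  # cosine similarity, since both sides are unit vectors
    top = np.argsort(scores)[::-1][:top_k]
    return [memory[i] for i in top]
```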