🧠 Automated Event Recall Assessment

EventRecall is an open-source tool for automating event segmentation and recall assessments using LLMs and sentence embeddings. It provides a scalable and efficient method to investigate event segmentation and recall accuracy, supporting both research and applied cognitive science.

This repository contains data and scripts used for:
Panela, R. A., Barnett, A. J., Barense, M. D., & Herrmann, B. (2025). Event segmentation applications in large language model enabled automated recall assessments. Communications Psychology, 3(1), 184. https://doi.org/10.1038/s44271-025-00359-7


🚀 Features

  • ✅ Segment events from narrative and recall files using OpenAI (GPT) or Meta (LLaMA) models
  • ✅ Evaluate recall accuracy using embedding-based similarity
  • ✅ Output results as CSV files and/or visual heatmaps
  • ✅ Flexible usage via Python API (CLI support planned)
  • ✅ Includes data for reanalysis

📦 Module Installation

To use the Python tools for segmentation and recall scoring:

pip install git+https://github.com/ryanapanela/EventRecall.git

📥 Clone Repository for Data and Reanalysis

To access all data and scripts used in the manuscript, clone this repository to your local environment:

git clone https://github.com/ryanapanela/EventRecall.git
cd EventRecall

🐍 Python API Usage

📘 Segmentation

from segmentation import run_segmentation

# Segment the narrative and a recall transcript into discrete events
narrative_events = run_segmentation('data/stories/Run.txt', model='gpt-4', api_key='sk-...')
recall_events = run_segmentation('Recall.txt', model='gpt-4', api_key='sk-...')

📊 Recall Evaluation

from recall import evaluate_recall

# Evaluate recall using pre-segmented events
results = evaluate_recall(
    narrative_events=narrative_events,
    recall_events=recall_events,
    model_name='sentence-transformers/LaBSE',
    generate_plots=True,
    output_path='recall_results.csv'
)

# Evaluate recall using a file path for recall events
results = evaluate_recall(
    narrative_events=narrative_events,
    recall_path='recall.txt',
    model_name='sentence-transformers/LaBSE',
    generate_plots=True,
    output_path='recall_results.csv'
)

print(results)

📁 Output

  • CSV: Recall scores for each narrative event
  • Plot: Optional heatmap of similarity matrix
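
The scoring idea behind these outputs can be sketched with plain NumPy: cosine similarity between every narrative-event embedding and every recall-event embedding yields the heatmap matrix, and row-wise maxima give each narrative event's best-matching recall event. This is a minimal illustration, not the package's implementation; the 2-D vectors below are toy stand-ins for real sentence-transformer embeddings.

```python
import numpy as np

def cosine_similarity_matrix(narrative_emb, recall_emb):
    """Pairwise cosine similarity: rows = narrative events, cols = recall events."""
    a = narrative_emb / np.linalg.norm(narrative_emb, axis=1, keepdims=True)
    b = recall_emb / np.linalg.norm(recall_emb, axis=1, keepdims=True)
    return a @ b.T

# Toy 2-D "embeddings" standing in for real sentence embeddings
narrative = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 1.0]])
recall = np.array([[0.9, 0.1], [0.1, 0.9]])

sim = cosine_similarity_matrix(narrative, recall)  # (3, 2) similarity matrix (heatmap)
best_match = sim.argmax(axis=1)                    # best recall event per narrative event
best_score = sim.max(axis=1)                       # its similarity (the per-event recall score)
```

The real pipeline substitutes sentence-transformer embeddings (e.g. LaBSE) for the toy vectors; the matrix becomes the heatmap and the row maxima become the per-event CSV scores.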

🧪 Research Code & Data

All data and analysis scripts used in the accompanying manuscript are available in:

  • data/segmentation - segmentation data for humans, GPT, and LLaMA
  • data/recall - recall transcripts and pre-processed narrative recall scores
  • code/segmentation - segmentation analysis code
  • code/recall - recall analysis code

Scripts for reproducing automated segmentations using LLMs are available in code/segmentation/produce_segmentation. These include:

  • gpt_llama_segmentation.py — runs segmentation using GPT-4 or LLaMA
  • segmentation_functions.py — helper functions for LLM prompting and text processing

The narratives used in the manuscript are available in data/stories.

Note: The segmentation data used in the final analyses may differ slightly from results generated by these scripts due to the non-deterministic nature of LLM outputs (even at low temperature settings). Additionally, outputs from Python were manually reformatted for compatibility with RMarkdown scripts. As such, there is no single linear pipeline from raw narrative to final plot provided in this repository. Instead, we provide all processed data and relevant scripts to enable reanalysis and inspection.

To rerun the analyses:

  • Use the Python scripts to regenerate LLM segmentations if desired
  • Execute the RMarkdown notebooks for statistical analyses and visualization

🧩 File Structure

EventRecall/
│
├── code/                                    # Analysis notebooks to reproduce results
│   ├── recall
│   └── segmentation
│
├── data/                                   # Pre-processed data used in manuscript analyses
│   ├── recall                              
│   ├── segmentation
│   └── stories
│
├── module/                                 # Python package for installation
│   ├── __init__.py
│   ├── cli.py
│   ├── recall.py                              
│   ├── segmentation.py                     
│   └── utils.py       
│                     
├── requirements.txt                        # Dependencies
├── LICENSE
├── README.md                        
└── setup.py                                # Installation script                                

🔍 Function Reference

run_segmentation(path, model='gpt-4', api_key=None)

Segments a text file into discrete events using the specified model.

Returns: list[str]


evaluate_recall(...)

Evaluates recall accuracy between narrative and recall events.

Inputs:

  • narrative_path or narrative_events
  • recall_path or recall_events
  • model_name: Embedding model (default: 'sentence-transformers/LaBSE')
  • segmentation_model: LLM used for segmentation
  • api_key: Required for OpenAI models
  • generate_plots: Whether to generate heatmaps
  • output_path: Path to save CSV

Returns: pd.DataFrame of recall scores


recall_score(narrative_events, recall_events, model_name)

Returns detailed recall metrics, including:

  • Full recall matrix
  • Best matches per event
  • Forward and reverse diagonal scores
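
The exact definition of the forward and reverse diagonal scores comes from the manuscript; one plausible reading, sketched here with a hypothetical helper (not the package's API), is that after each recall event is matched to a narrative event, the fraction of consecutive matches that move forward (or backward) through the narrative indexes temporal-order fidelity.

```python
def order_scores(matched_indices):
    """Hypothetical sketch: fraction of consecutive best-match transitions
    that move forward vs. backward through the narrative order."""
    steps = [b - a for a, b in zip(matched_indices, matched_indices[1:])]
    if not steps:
        return 0.0, 0.0
    forward = sum(s > 0 for s in steps) / len(steps)
    reverse = sum(s < 0 for s in steps) / len(steps)
    return forward, reverse

# A recall whose events map onto narrative events 0, 1, 1, 3, 2:
# steps = [1, 0, 2, -1] -> forward 0.5, reverse 0.25
fwd, rev = order_scores([0, 1, 1, 3, 2])
```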

recall_matrix(narrative_events, recall_events, model_name)

Returns the full similarity matrix between each pair of events.


embedding(text, model_name)

Generates embeddings for a list of sentences/events.


🛠 Requirements

  • Python 3.8+
  • pandas, numpy, scipy, matplotlib, seaborn, sentence-transformers
  • For LLM support: OpenAI API key

Install dependencies:

pip install -r requirements.txt

📋 License

MIT License © 2025 Ryan A. Panela
