psyonp/core

CORE: Measuring Multi-Agent LLM Interaction Quality under Game-Theoretic Pressures

This repository presents the Conversational Robustness Evaluation Score (CORE), a multi-faceted metric that quantifies the effectiveness of language use in multi-agent LLM systems across game-theoretic interaction types (cooperative, competitive, and neutral). It also evaluates vocabulary structure using Zipf's Law and Heaps' Law.

Structure

  • 'config/' - model paths and experiment parameters
  • 'models/' - model loading and inference
  • 'experiment/' - simulation and orchestration logic
  • 'analysis/' - Zipf and Heaps fitting
  • 'utils/' - tokenization, plotting, and I/O
  • 'experiment_results/' - saved outputs and plots
  • 'core/' - runner code for CORE computation
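
To illustrate the kind of fitting that 'analysis/' performs, here is a minimal sketch of estimating a Zipf exponent and Heaps parameters from a token list. It is not the repository's code: the function names (fit_line, zipf_exponent, heaps_params) are hypothetical, and the use of ordinary least squares on log-log data is an assumption chosen for simplicity.

```python
import math
from collections import Counter


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def zipf_exponent(tokens):
    """Estimate s in f(r) ~ C * r**(-s) from the rank-frequency curve."""
    # Frequencies sorted in descending order; rank 1 is the most frequent token.
    freqs = sorted(Counter(tokens).values(), reverse=True)
    log_r = [math.log(r) for r in range(1, len(freqs) + 1)]
    log_f = [math.log(f) for f in freqs]
    _, slope = fit_line(log_r, log_f)
    return -slope


def heaps_params(tokens):
    """Estimate (K, beta) in V(n) ~ K * n**beta from vocabulary growth."""
    seen = set()
    log_n, log_v = [], []
    for n, tok in enumerate(tokens, start=1):
        seen.add(tok)
        # Record vocabulary size V(n) after reading the first n tokens.
        log_n.append(math.log(n))
        log_v.append(math.log(len(seen)))
    intercept, beta = fit_line(log_n, log_v)
    return math.exp(intercept), beta
```

Both functions need at least two data points (two distinct token types for zipf_exponent, two tokens for heaps_params). A least-squares fit in log-log space is a common first approximation; a maximum-likelihood estimator may be preferable when precise exponents matter.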

Usage

Run experiments by submitting a SLURM script that executes:

python main.py

Once the data have been aggregated into 8x8 heatmaps per category, together with the Heaps' and Zipf's law results, run the CORE computation:

python ./core/main.py

Best,

Authors
