This directory contains executable scripts for running experiments with Spatial Lab.
# From the repository root:
cd /path/to/spatial-lab
# Set up environment
cp .env.example .env
# Edit .env with your API keys
# Run calibration experiment (~5 minutes)
PYTHONPATH=. python scripts/calibration_experiment.py
# Run basic experiment
PYTHONPATH=. python scripts/run_experiment.py --trials 10

Purpose: LLM Confidence Calibration Study
Runtime: ~5 minutes (144 trials)
API Required: Groq (GROQ_API_KEY)

Investigates whether LLM-reported confidence scores are well-calibrated predictors of spatial reasoning accuracy.

PYTHONPATH=. python scripts/calibration_experiment.py

Output: experiment_results/calibration_*_{results,metrics,analysis}.json
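
To make "well-calibrated" concrete, here is a minimal sketch of expected calibration error (ECE), one common way to compare reported confidence with observed accuracy. Which metrics the calibration script actually computes is not stated here, so the function name, the bin count, and the choice of ECE itself are illustrative assumptions.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between mean confidence and accuracy per bin.

    Illustrative only: not taken from calibration_experiment.py.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one trial is required")
    if any(c < 0.0 or c > 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")

    # Each bin collects (confidence, correctness) pairs for one confidence range.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's gap by the share of trials it contains.
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece
```

A value near 0 means stated confidence tracks accuracy closely; larger values indicate over- or under-confidence.
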

Purpose: Main experiment runner with CLI interface
Runtime: Variable (depends on configuration)
API Required: Groq or Gemini

# Quick test (10 trials)
PYTHONPATH=. python scripts/run_experiment.py --trials 10
# Full experiment
PYTHONPATH=. python scripts/run_experiment.py --trials 50 --robots 5

Purpose: Groq/Llama-specific spatial reasoning experiments
Runtime: ~2-3 minutes (30 trials)
API Required: Groq (GROQ_API_KEY)

PYTHONPATH=. python scripts/run_groq_experiment.py

Purpose: Full integration experiments with multiple LLM providers
Runtime: ~5-10 minutes
API Required: Groq and/or Gemini

PYTHONPATH=. python scripts/run_real_experiment.py

Purpose: Validate API connectivity before running experiments
Runtime: ~30 seconds
API Required: Groq and Gemini

PYTHONPATH=. python scripts/test_llm_apis.py

All scripts require API keys set in .env:

GROQ_API_KEY=gsk_... # Required for most experiments
GOOGLE_API_KEY=AIza... # Required for Gemini fallback
DEFAULT_LLM_MODEL=llama-3.3-70b-versatile

All experiment results are saved to experiment_results/ with timestamped filenames:

experiment_results/
├── calibration_YYYYMMDD_HHMMSS_results.json # Raw trial data
├── calibration_YYYYMMDD_HHMMSS_metrics.json # Computed metrics
├── calibration_YYYYMMDD_HHMMSS_analysis.json # Statistical tests
└── SCIENTIFIC_REPORT.md # Human-readable report
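
Because each calibration run writes three files that share one timestamp, it can be handy to group them by run. The sketch below relies only on the filename pattern shown in the tree above; the function name `group_calibration_runs` is hypothetical and is not part of the repository.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches names such as calibration_20250101_120000_results.json
PATTERN = re.compile(
    r"^calibration_(\d{8}_\d{6})_(results|metrics|analysis)\.json$"
)


def group_calibration_runs(results_dir):
    """Map each run timestamp to {kind: path} for the files that exist."""
    runs = defaultdict(dict)
    for path in Path(results_dir).glob("calibration_*.json"):
        match = PATTERN.match(path.name)
        if match:
            timestamp, kind = match.groups()
            runs[timestamp][kind] = path
    # YYYYMMDD_HHMMSS timestamps sort chronologically as plain strings.
    return dict(sorted(runs.items()))
```

With this ordering, the last key of `group_calibration_runs("experiment_results")` is the most recent run.
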
cp .env.example .env
# Edit .env with your Groq API key from https://console.groq.com

The scripts include rate limiting (0.25s between requests). If you still hit limits:
- Wait 1-2 minutes and retry
- Upgrade your Groq tier at https://console.groq.com/settings/billing
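
Instead of retrying by hand, a wrapper can retry with growing delays. This is a minimal sketch of exponential backoff, not the rate limiting the scripts themselves implement; `RateLimitError` is a hypothetical stand-in for whatever exception your API client raises when it is throttled, and the 0.25s starting delay simply mirrors the interval mentioned above.

```python
import time


class RateLimitError(Exception):
    """Hypothetical placeholder for a client's rate-limit exception."""


def call_with_backoff(request, max_attempts=4, base_delay=0.25, sleep=time.sleep):
    """Call request(); on RateLimitError wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            # Give up after the final attempt and surface the error.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

With the defaults, the waits are 0.25s, 0.5s, and 1s; raise `base_delay` or `max_attempts` if limits reset more slowly than that.
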

Ensure you're running from the repository root with PYTHONPATH set:
cd /path/to/spatial-lab
PYTHONPATH=. python scripts/your_script.py
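
If setting PYTHONPATH on every invocation is inconvenient, one possible alternative is to add the repository root to `sys.path` at the top of a script. The helper below is a hypothetical sketch that assumes the script lives directly inside `scripts/`; the existing scripts are not known to do this.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(script_file):
    """Put the parent of the script's directory at the front of sys.path."""
    # scripts/your_script.py -> scripts/ -> repository root
    repo_root = Path(script_file).resolve().parent.parent
    root_str = str(repo_root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return repo_root
```

Calling `add_repo_root_to_path(__file__)` before any project imports would then have a similar effect to `PYTHONPATH=.` when run from the repository root.
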