(NeurIPS 2025) SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions

XianzheFan/SoMi-ToM

NeurIPS 2025

⭐ If our project helps you, please give us a star on GitHub to support us!

SoMi Embodied Interaction Environment

SoMi is easily extendable and supports LVLM agents controlling characters in the open-world game Minecraft, allowing them to collaborate with other agents to achieve crafting goals. The interaction logs, game screenshots, and videos generated by the interactive environment will be used for the SoMi-ToM evaluation.

@article{fan2025somi,
  title={SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions},
  author={Fan, Xianzhe and Zhou, Xuhui and Jin, Chuanyang and Nottingham, Kolby and Zhu, Hao and Sap, Maarten},
  journal={arXiv preprint arXiv:2506.23046},
  year={2025}
}

Requirements

Install and Run

  1. Make sure you have the requirements above.

  2. Clone or download this repository (big green button).

  3. Rename keys.example.json to keys.json and fill in your API keys (you only need one).

  4. In a terminal or command prompt, run npm install from the repository directory.

  5. Clone or download the feature/minecraft-update branch of the Sotopia repository, then run the following commands from that checkout:

cd examples/experimental/minecraft_agents
uvicorn group_discussion_agents:app --reload --port 8080
# Open a new terminal
cd examples/experimental/minecraft_agents
export OPENAI_API_KEY=sk-  # Enter your OpenAI API key here
uv run aact run-dataflow group_discussion_agents.toml

  6. Enter Minecraft Java Edition, select Singleplayer, version 1.20.1, and Survival Mode, then open the world to LAN on port 55916.

  7. Open a new terminal, then run node src/agent/index.js from this repository.
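
The steps above call node, npm, uv, and uvicorn, so it can help to confirm they are on your PATH before starting. The following is a minimal preflight sketch, not part of this repository; the function names missing_tools and report_missing are hypothetical, and the tool list is taken only from the commands shown above.

```shell
#!/usr/bin/env sh
# Preflight sketch: report which command-line tools are missing.
# missing_tools and report_missing are illustrative helpers (assumptions),
# not scripts shipped with SoMi or Sotopia.

missing_tools() {
    # Print each argument that cannot be found on PATH, one per line.
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

report_missing() {
    # Return 1 and list the missing tools, or return 0 if all were found.
    missing=$(missing_tools "$@")
    if [ -n "$missing" ]; then
        printf 'Missing tools:\n%s\n' "$missing"
        return 1
    fi
    printf 'All tools found.\n'
}

# Example call (uncomment to run):
# report_missing node npm uv uvicorn
```

If a tool is reported missing, install it before continuing with the steps above.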

Bot Profiles

Bot profiles are TOML files that define:

  1. Crafting Goal

You and your friends need to craft 2 “boat”.

  2. Knowledge - Specific Crafting Rule

The complete process for crafting a “boat” in Minecraft is as follows:

......

Patches

Some of the node modules that we depend on have bugs in them. To add a patch, edit the affected file in your local node_modules directory and run npx patch-package [package-name].
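
As a rough sketch of that workflow, the helper below wraps the patch-package call with a check that the module is actually installed. The name save_patch and the package name some-package are placeholders introduced for illustration; they are not part of this repository.

```shell
#!/usr/bin/env sh
# Sketch of the patch workflow. save_patch is a hypothetical helper and
# "some-package" in the example call is a placeholder package name.

save_patch() {
    pkg=$1
    # Refuse to run if the package is not installed locally.
    if [ ! -d "node_modules/$pkg" ]; then
        printf 'node_modules/%s not found; run npm install first\n' "$pkg" >&2
        return 1
    fi
    # patch-package compares the edited module with the published version
    # and writes the difference to a .patch file under patches/.
    npx patch-package "$pkg"
}

# Example call after editing a file under node_modules/some-package:
# save_patch some-package
```

Projects that use patch-package commonly reapply the saved patches through a postinstall script in package.json; check this repository's package.json to confirm whether that is configured here.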

SoMi-ToM Benchmark

We propose the SoMi-ToM benchmark, designed to evaluate multi-perspective ToM in complex, embodied, multi-agent social interactions. The benchmark is built on rich multimodal interaction data generated by the SoMi interaction environment, covering diverse crafting goals and social relationships. See dataset at SoMi-ToM.

🔥 Latest LVLM Benchmark Table

Performance of humans and leading closed-source and open-source LVLMs in the first-person evaluation (state inference). There are 350 questions for self-ToM reasoning and 700 questions for others’ ToM reasoning.

[Image: first-person evaluation results table]

Performance of humans and leading closed-source and open-source LVLMs in the Third-Person Perspective ToM test (175 questions in total). Highest accuracy without CoT is shown in red bold, and with CoT in blue bold.

[Image: third-person perspective ToM results table]

❤️ Acknowledgements

The SoMi-ToM benchmark references the following code repositories:

https://github.com/PrismarineJS/prismarine-viewer

https://github.com/kolbytn/mindcraft

https://github.com/ProKil/aact

https://sotopia.world/projects/sotopia

Thanks for their awesome work!

📺 Easter Egg: More AI in Minecraft!

For more fascinating videos on AI playing Minecraft, check out the Emergent Garden YouTube channel. The codebase for the AI in these videos comes from kolbytn/mindcraft.
