Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning
💌 Contact: jamse_yuan@163.com
2026.01.14🎉 Excited to share that our work, Kardia-R1, has been accepted at WWW 2026!2025.12.02🎉 Our Kardia-R1 paper released on arXiv — check it out now!2025.12.03🚀 The full KardiaBench dataset (22K multi-turn dialogues, 671 personas) is now officially released and open-sourced on HuggingFace!
### Authentication & Loading the Dataset huggingface-cli login from datasets import load_dataset dataset = load_dataset("Jhcircle/KadiaBench")
🧠 Kardia-R1 is a reasoning-centric empathetic dialogue framework that unifies
user understanding → emotional reasoning → safe, supportive responses,
empowered by Rubric-as-Judge GRPO Reinforcement Learning for transparent and controllable empathy.
- KardiaBench: 671 real-world user profiles + 22,080 multi-turn empathetic dialogues
- Four-span structured cognition
<|understanding_begin|>...<|understanding_end|>— Interpret user intent & emotions using persona and context<|reasoning_begin|>...<|reasoning_end|>— Perform internal appraisal and empathetic reasoning<|emotion_begin|>...<|emotion_end|>— Identify the correct fine-grained emotion label<|response_begin|>...<|response_end|>— Generate supportive, persona-aligned empathetic replies
- Rubric-as-Judge RL (Verifiable)
- Interpretable, criterion-based, LLM-judged reinforcement learning
- Backbone-Agnostic Gains
- Improves Qwen, Gemma, and more across all empathy metrics
- Superior to SoTA LLMs
- Outperforms GPT-4o, DeepSeek-R1, PsyLLM in emotion accuracy & empathetic quality
- Human-interpretable rubric: Relevance · Empathy · Persona Consistency · Safety · Fluency
- Transparent scoring → controllable improvement
- No black-box reward models → fully interpretable and aligned behavior
- Consistent gains across every empathy dimension
- Stronger emotional grounding and persona alignment
- Scalable to Qwen (3B/7B) / Gemma (2B/7B) backbones
- Robust, generalizable empathetic cognition across diverse emotional contexts
🌟 Kardia-R1 achieves state-of-the-art empathy, persona consistency, and emotion accuracy,
surpassing both general-purpose LLMs and specialized empathetic dialogue systems.
KardiaBench is a large-scale empathetic dialogue dataset designed for
reasoning-centered emotional support, containing:
- 22,080 empathetic multi-turn dialogues
- 671 fully documented personas
- Fine-grained emotional states
- Four-span structured reasoning format
- Fully anonymized & cleaned
HuggingFace dataset: 👉 KardiaBench
To prevent misuse of sensitive data, our dataset requires an access request on HuggingFace. Please follow the instructions below to obtain access.
- Submit an Access Request describing the intended use
- Wait for approval from the maintainers — we will review and approve requests as quickly as possible
from datasets import load_dataset
dataset = load_dataset("Jhcircle/KadiaBench")If you encounter AccessDenied or 403 Forbidden errors, your access request may still be pending or your HuggingFace authentication may be missing.
Login manually if needed:
huggingface-cli login| Field | Description |
|---|---|
| person | Full raw user profile string including MBTI, About, Signature, and Recent Activities. |
| mbti | The user’s MBTI type extracted from the profile (e.g., “INFP”, “ISTP”). |
| emotion | Target emotional state representing the user’s current feelings in the scenario (e.g., “anxious”, “terrified”). |
| situation | Starting background context or emotional scenario for the conversation. |
| anon_username | An anonymized username for privacy-preserving user identity. |
| messages | Full structured dialogue as a list of message objects, including the system prompt, user turns, and assistant responses. |
If our work is helpful, please cite:
@article{yuan2025kardia,
title={Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning},
author={Yuan, Jiahao and Cui, Zhiqing and Wang, Hanqing and Gao, Yuansheng and Zhou, Yucheng and Naseem, Usman},
journal={arXiv preprint arXiv:2512.01282},
year={2025}
}We gratefully acknowledge EmpatheticDialogues for foundational inspiration, PersonalityCafe for publicly shared personas, DeepSeek-R1 and Qwen3 for their GRPO insights, and all annotators and psychology experts for their invaluable support in building KardiaBench.

