
@AITherapySolutions AITherapySolutions commented Dec 13, 2025

Proposed Addition: Conversational AI Safety

Why This Category Is Needed

Conversational AI safety has emerged as a critical and distinct subdomain requiring specialized tools and approaches. Three developments validate this as a standalone category:

1. Common Sense Media AI Risk Assessment (2025)

Common Sense Media, in partnership with Stanford's Brainstorm Lab for Mental Health Innovation, released comprehensive risk assessments concluding that AI companion apps pose "unacceptable risks" to users under 18. Their testing of Character.AI, Replika, Nomi, and others revealed systemic failures in crisis detection, grooming prevention, and harmful content moderation. 73% of teens have now used AI companions, yet safety infrastructure remains critically underdeveloped.

2. Google's DICES Dataset (NeurIPS 2023)

Google Research released the DICES (Diversity In Conversational AI Evaluation for Safety) dataset—the first large-scale benchmark specifically designed for evaluating safety in conversational AI systems. DICES contains 1,340 adversarial human-bot conversations rated by 296 demographically diverse raters across 24 safety criteria. This dataset acknowledges that conversational AI safety requires distinct evaluation approaches from general content moderation.
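A distinctive feature of DICES is that it keeps every individual rater judgment rather than collapsing them into a single gold label. The sketch below (a minimal illustration, not the official DICES tooling; the row fields and the `Q_overall` column name are assumptions modeled loosely on the released CSVs) shows how per-rater safety votes might be aggregated per conversation while preserving the disagreement distribution:

```python
from collections import Counter, defaultdict

# Hypothetical rows mimicking the DICES layout: each rater labels a
# conversation "Yes" (unsafe), "No" (safe), or "Unsure" on a criterion.
ratings = [
    {"conversation_id": "c1", "rater_id": "r1", "Q_overall": "Yes"},
    {"conversation_id": "c1", "rater_id": "r2", "Q_overall": "Yes"},
    {"conversation_id": "c1", "rater_id": "r3", "Q_overall": "No"},
    {"conversation_id": "c2", "rater_id": "r1", "Q_overall": "No"},
    {"conversation_id": "c2", "rater_id": "r2", "Q_overall": "Unsure"},
    {"conversation_id": "c2", "rater_id": "r3", "Q_overall": "No"},
]

def aggregate(rows):
    """Majority-vote label per conversation, keeping the full vote
    distribution so rater disagreement is not discarded."""
    by_conv = defaultdict(list)
    for row in rows:
        by_conv[row["conversation_id"]].append(row["Q_overall"])
    result = {}
    for conv, labels in by_conv.items():
        counts = Counter(labels)
        label, _ = counts.most_common(1)[0]
        result[conv] = {"label": label, "votes": dict(counts)}
    return result

print(aggregate(ratings))
```

Keeping the vote distribution matters because DICES was built to surface demographic variation in safety judgments, which a single collapsed label would hide.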

3. Safety4ConvAI Workshop Series (2020-2024)

The academic community has formalized this domain through the Safety for Conversational AI (Safety4ConvAI) workshop series, now in its third iteration at LREC-COLING 2024. Organized by researchers from Heriot-Watt University, Bocconi University, Google, and Meta AI, the workshop focuses specifically on:

  • Detecting safety-critical situations in dialogue (self-harm, medical advice, crisis states)

  • Conversational abuse detection and mitigation

  • Privacy leaks in conversational contexts

  • Benchmarks for dialogue-level safety evaluation

  • Workshop: https://sites.google.com/view/safety-conv-ai-workshop
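To make the first task concrete, here is a deliberately minimal sketch of dialogue-level crisis flagging (an assumption-laden toy, not any workshop system: real crisis detection uses trained classifiers with clinical review, and the pattern list and turn schema below are invented for illustration):

```python
import re

# Illustrative patterns only; a production system would not rely on
# keyword matching for safety-critical detection.
CRISIS_PATTERNS = [
    r"\bhurt(?:ing)? myself\b",
    r"\bend (?:my|it) (?:life|all)\b",
    r"\bsuicid\w*\b",
]

def flag_crisis_turns(dialogue):
    """Return indices of user turns matching any crisis pattern."""
    flagged = []
    for i, turn in enumerate(dialogue):
        if turn["role"] != "user":
            continue
        text = turn["text"].lower()
        if any(re.search(p, text) for p in CRISIS_PATTERNS):
            flagged.append(i)
    return flagged

dialogue = [
    {"role": "user", "text": "I had a rough day."},
    {"role": "bot", "text": "I'm sorry to hear that."},
    {"role": "user", "text": "Sometimes I think about hurting myself."},
]
print(flag_crisis_turns(dialogue))  # → [2]
```

The point of operating on whole dialogues rather than isolated messages is exactly what separates this subdomain from generic content moderation: crisis states often emerge across turns, not within one.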

Proposed Category Structure

This pull request adds a Conversational AI Safety section, seeded with a new resource.

Signed-off-by: AITherapySolutions <tammy@aitherapysolutions.com>
