@NicolasLafitteMM
Collaborator

Summary

This PR implements automated post-mortem reminders and notifications for the FireFighter incident management system:

  • Post-mortem creation announcements: When a post-mortem (PM) is created for a P1-P3 production incident, an announcement is automatically sent to #critical-incidents
  • 5-day overdue reminders: Automated reminders sent 5 days after an incident reaches MITIGATED status, posted to both the incident channel and #critical-incidents (for eligible incidents)
  • Periodic execution: Reminders run twice daily at 10 AM and 3 PM (Paris time)
  • Testing tools: Management commands to test the feature without waiting 5 days

Changes

Core Features

  • New Slack messages (slack_messages.py):

    • SlackMessageIncidentPostMortemCreatedAnnouncement: Announcement for PM creation in #critical-incidents
    • SlackMessagePostMortemReminder5Days: Reminder message for incident channel
    • SlackMessagePostMortemReminder5DaysAnnouncement: Reminder announcement for #critical-incidents
  • New Celery task (send_postmortem_reminders.py):

    • Periodic task to check incidents mitigated 5+ days ago
    • Sends reminders to incident channel and #critical-incidents (when applicable)
    • Prevents duplicate reminders using Message tracking
  • New rule function (slack/rules.py):

    • should_publish_pm_in_general_channel(): Determines if PM announcements should be sent to #critical-incidents (P1-P3 production incidents only)
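
As an illustration of the rule described above, here is a minimal sketch of the eligibility check (P1-P3 production incidents only). It is not the code from this PR: the `IncidentInfo` dataclass, its `priority` and `environment` fields, and the function name `should_publish_pm_sketch` are hypothetical stand-ins for the real `Incident` model and `should_publish_pm_in_general_channel()`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentInfo:
    """Hypothetical stand-in for the Incident model (not the real class)."""

    priority: int  # assumed encoding: 1 for P1, 2 for P2, and so on
    environment: str  # assumed encoding: "production", "staging", ...


def should_publish_pm_sketch(incident: IncidentInfo) -> bool:
    """Return True when a PM message should also go to #critical-incidents.

    Mirrors the rule stated in the PR description: only P1-P3 incidents
    in production are eligible.
    """
    is_high_priority = 1 <= incident.priority <= 3
    is_production = incident.environment == "production"
    return is_high_priority and is_production
```

Under these assumptions, a P2 production incident is eligible, while a P4 production incident or a P1 staging incident is not.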

Database Changes

  • Incident model: Added mitigated_at timestamp field to track when incidents reach MITIGATED status
  • Celery Beat: Created periodic task with CrontabSchedule for 10 AM and 3 PM executions
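
To show how the mitigated_at timestamp can drive the 5-day reminder, the following sketch decides whether a reminder is due for one incident. It is an assumption-laden illustration, not the task from this PR: `is_reminder_due`, `REMINDER_DELAY`, and the `already_reminded` flag (standing in for the Message-table lookup) are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Delay taken from the PR description: reminders go out 5 days after mitigation.
REMINDER_DELAY = timedelta(days=5)


def is_reminder_due(
    mitigated_at: Optional[datetime],
    now: datetime,
    already_reminded: bool,
) -> bool:
    """Hypothetical helper: True when a post-mortem reminder should be sent.

    `already_reminded` stands in for the duplicate check that the PR
    performs through Message tracking.
    """
    if mitigated_at is None or already_reminded:
        # Never mitigated, or a reminder was already recorded: nothing to send.
        return False
    return now - mitigated_at >= REMINDER_DELAY


# Usage: an incident mitigated 6 days before the check is due for a reminder.
now = datetime(2025, 12, 19, 10, 0, tzinfo=timezone.utc)
due = is_reminder_due(now - timedelta(days=6), now, already_reminded=False)
```

Because the task runs twice a day, the duplicate check is what keeps an incident from being reminded again on the next run.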

Code Quality

  • Refactored postmortem_created_handler: Reduced complexity by extracting helper functions:
    • _update_mitigated_at_timestamp()
    • _create_confluence_postmortem()
    • _create_jira_postmortem()
    • _publish_postmortem_announcement()

Testing Tools

  • Management command backdate_incident_mitigated: Backdate incident mitigated_at timestamp for testing
  • Management command test_postmortem_reminders: Execute reminder task manually with optional --list-only mode
  • Documentation: Comprehensive testing guide at docs/contributing/testing-postmortem-reminders.md

Test Plan

  • Run migrations successfully
  • Backdate test incident by 6 days
  • Execute test_postmortem_reminders command
  • Verify messages sent to incident channel
  • Verify messages sent to #critical-incidents for eligible incidents
  • Verify duplicate prevention (Message table tracking)
  • Verify status filter includes both MITIGATED and POST_MORTEM
  • All type checks pass (mypy)
  • All linting passes (ruff)
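
The status filter item in the test plan can be illustrated with a small sketch comparing a range comparison with an explicit status list. The `Status` enum and its numeric values are hypothetical and chosen only for the example; the real values in FireFighter may differ.

```python
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical status values, for illustration only."""

    OPEN = 10
    INVESTIGATING = 20
    MITIGATING = 30
    MITIGATED = 40
    POST_MORTEM = 50
    CLOSED = 60


# Explicit list: the eligible statuses are visible at a glance.
REMINDER_STATUSES = frozenset({Status.MITIGATED, Status.POST_MORTEM})


def matches_by_range(status: Status) -> bool:
    # Range comparison: correct only while the numeric ordering stays as assumed.
    return Status.MITIGATED <= status < Status.CLOSED


def matches_by_list(status: Status) -> bool:
    # Membership test: independent of the numeric ordering.
    return status in REMINDER_STATUSES
```

With these assumed values both checks select the same statuses, but the explicit list keeps working if a new status is later inserted between MITIGATED and CLOSED.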

Files Changed

  • 12 files modified
  • +893 insertions, -34 deletions

New Files

  • docs/contributing/testing-postmortem-reminders.md
  • src/firefighter/incidents/management/commands/backdate_incident_mitigated.py
  • src/firefighter/incidents/management/commands/test_postmortem_reminders.py
  • src/firefighter/incidents/migrations/0030_add_mitigated_at_field.py
  • src/firefighter/slack/migrations/0009_add_postmortem_reminder_periodic_task.py
  • src/firefighter/slack/tasks/send_postmortem_reminders.py

Modified Files

  • src/firefighter/incidents/models/incident.py
  • src/firefighter/jira_app/signals/postmortem_created.py
  • src/firefighter/slack/messages/slack_messages.py
  • src/firefighter/slack/rules.py

Add automated post-mortem reminders for incidents mitigated 5+ days ago
and notifications in #critical-incidents when post-mortems are created.

Features:
- Send reminders 5 days after incident mitigation to both incident
  channel and #critical-incidents (for P1-P3 production incidents)
- Announce post-mortem creation in #critical-incidents channel
- Add mitigated_at timestamp to Incident model to track mitigation date
- Create Celery periodic task running twice daily (10 AM and 3 PM Paris time)

Technical changes:
- Add SlackMessagePostMortemReminder5Days and announcement variants
- Add SlackMessageIncidentPostMortemCreatedAnnouncement
- Add should_publish_pm_in_general_channel() rule
- Refactor postmortem_created_handler to reduce complexity
- Add send_postmortem_reminders Celery task
- Add mitigated_at field to Incident model with migration

Testing:
- Add management commands for testing (backdate_incident_mitigated,
  test_postmortem_reminders)
- Add comprehensive testing documentation

All code, documentation, and comments are in English.
Use an explicit status list instead of a range comparison for clarity:
change the filter from _status__gte/_status__lt to _status__in=[MITIGATED, POST_MORTEM].

Also add type annotations to management commands to fix mypy errors.
Add the new testing documentation to mkdocs.yml to fix strict build failure.

codecov-commenter commented Dec 19, 2025

Codecov Report

❌ Patch coverage is 28.03738% with 77 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.26%. Comparing base (8ee053b) to head (869dea8).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...firefighter/jira_app/signals/postmortem_created.py | 20.00% | 40 Missing ⚠️ |
| src/firefighter/slack/messages/slack_messages.py | 33.33% | 36 Missing ⚠️ |
| src/firefighter/slack/rules.py | 50.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #204      +/-   ##
==========================================
- Coverage   67.49%   67.26%   -0.24%     
==========================================
  Files         213      213              
  Lines        9938    10027      +89     
  Branches     1086     1097      +11     
==========================================
+ Hits         6708     6745      +37     
- Misses       2941     2993      +52     
  Partials      289      289              

| Flag | Coverage Δ |
|---|---|
| unittests | 67.26% <28.03%> (-0.24%) ⬇️ |

