feat(spp_mis_demo_v2): deterministic blueprint-based demo data generation#62
feat(spp_mis_demo_v2): deterministic blueprint-based demo data generation#62
Conversation
…inistic blueprints - Add household blueprints with ~730 deterministic households - Add SeededVolumeGenerator for reproducible demo data - Add locale-aware Filipino/Togolese name providers in spp_demo - Populate cycle beneficiaries and entitlements - Improve demo data realism (enrollment dates, payment history) - Add blueprint reproducibility tests
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the reliability and realism of the demo data generation for the OpenSPP MIS module. By transitioning from a random, Faker-driven approach to a deterministic, blueprint-based system, the generated data is now fully reproducible. This change ensures consistent testing environments and predictable demo scenarios, while also introducing locale-aware naming to make the demo data more relevant to specific regional contexts. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
| # Use explicit registrant_name if provided, otherwise derive from story_id | ||
| story_name = cr_def.get("registrant_name") or self._get_story_name(story_id) | ||
| # Always use locale-aware name resolution (handles CR-specific IDs too) | ||
| story_name = self._get_story_name(story_id) |
Check warning
Code scanning / Semgrep OSS
Semgrep Finding: semgrep.odoo-sudo-without-context Warning
| # Always use locale-aware name resolution (handles CR-specific IDs too) | ||
| story_name = self._get_story_name(story_id) | ||
|
|
||
| registrant = self.env["res.partner"].search( |
Check warning
Code scanning / Semgrep OSS
Semgrep Finding: semgrep.odoo-sudo-without-context Warning
| story_name = self._get_story_name(story_id) | ||
|
|
||
| registrant = self.env["res.partner"].search( | ||
| [("name", "=", story_name), ("is_registrant", "=", True)], |
Check warning
Code scanning / Semgrep OSS
Semgrep Finding: semgrep.odoo-sudo-without-context Warning
| @@ -2813,7 +2900,7 @@ def _create_single_change_request(self, registrant, cr_def, stats, demo_user=Non | |||
| detail.write(detail_vals) | |||
Check warning
Code scanning / Semgrep OSS
Semgrep Finding: semgrep.odoo-sudo-without-context Warning
There was a problem hiding this comment.
Code Review
This pull request is a significant improvement, introducing deterministic demo data generation using a blueprint and seed-based approach. This will greatly enhance testing reliability and reproducibility. The addition of new locales for Filipino and Togolese names also adds valuable realism to the demo data. The code is generally well-structured, and the new functionality is supported by a good set of unit tests. I've identified a few areas for improvement, including a bug in the new SeededVolumeGenerator, some opportunities for code cleanup and efficiency gains, and a question about a potentially removed feature. I also noticed that the methods _generate_random_groups and _generate_volume_enrollments in spp_mis_demo_v2/models/mis_demo_generator.py are no longer called after your refactoring. It would be good to remove this dead code to keep the codebase clean.
I am having trouble creating individual review comments. Click here to see my feedback.
spp_mis_demo_v2/models/mis_demo_generator.py (1206-1248)
The _configure_compliance_manager method and its call have been removed. Was this intentional? This seems to remove the functionality of configuring compliance rules for demo programs. If this is an intended change, it would be good to mention it in the PR description.
spp_mis_demo_v2/models/seeded_volume_generator.py (185-193)
To fix a bug in _find_member_spec and improve the design, it's better to associate the member_spec with the created individual record here. This avoids fragile lookups later. You can achieve this by storing a tuple of (individual, member_spec) in the members list for each household.
if group_record.id not in group_households:
group_households[group_record.id] = {
"group": group_record,
"members": [],
"blueprint": member_specs[list(groups).index(group_record)][0],
}
_group_record, member_spec = individual_to_group[ind_idx]
group_households[group_record.id]["members"].append((individual, member_spec))spp_mis_demo_v2/models/seeded_volume_generator.py (238-296)
The _find_member_spec method is buggy. It doesn't use the member_record argument and will always return the first elderly member spec from the blueprint, which is incorrect for households with multiple elderly members.
By associating the spec with the member during creation (as suggested in another comment), you can simplify the logic in enroll_in_programs and remove the _find_member_spec method entirely.
# Individual-target program: enroll qualifying members
for member, member_spec in hh["members"]:
# ESP: only enroll elderly members (age >= 60 from blueprint)
if prog_id == "elderly_social_pension":
if not member_spec or member_spec.get("age_range", (0, 0))[0] < 60:
continue
enrollment_vals.append(
{
"program_id": program.id,
"partner_id": member.id,
"state": "enrolled",
}
)
enrollment_dates.append(member.registration_date or reg_date)
else:
# Group-target program: enroll the household
enrollment_vals.append(
{
"program_id": program.id,
"partner_id": group.id,
"state": "enrolled",
}
)
enrollment_dates.append(reg_date)
# Individual-level food assistance
if bp.get("individual_food_assistance"):
fa_program = program_map.get("food_assistance")
if fa_program:
for member, member_spec in hh["members"]:
enrollment_vals.append(
{
"program_id": fa_program.id,
"partner_id": member.id,
"state": "enrolled",
}
)
enrollment_dates.append(member.registration_date or reg_date)
if not enrollment_vals:
return
_logger.info("Enrolling %d program memberships...", len(enrollment_vals))
memberships = self._batch_create("spp.program.membership", enrollment_vals)
# Add state variety and backdate enrollment dates in one pass.
# enrollment_date is @api.depends("state") so we must do BOTH via SQL
# after all ORM operations are complete, to prevent recomputation.
self.env.flush_all()
self._apply_membership_realism(memberships, enrollment_dates)
def _apply_membership_realism(self, memberships, enrollment_dates):
"""Apply state variety and backdate enrollment dates via SQL.
enrollment_date is @api.depends("state") — any ORM state change triggers
recomputation to Datetime.now(). To prevent this, we:
1. flush_all() to commit ORM state
2. Apply state variety + date backdating together in raw SQL
3. Invalidate the cache so ORM sees our changes
"""
if not memberships:
return
membership_ids = memberships.ids
exited_count = paused_count = not_eligible_count = 0
for idx, mem_id in enumerate(membership_ids):
# Determine state
roll = self.rng.random()
state = "enrolled"
if roll < 0.02:
state = "not_eligible"
not_eligible_count += 1
elif roll < 0.05:
state = "paused"
paused_count += 1
elif roll < 0.10:
state = "exited"
exited_count += 1
# Determine enrollment date from registration date
if idx < len(enrollment_dates):
reg_date = enrollment_dates[idx]
enrollment_dt = datetime.datetime.combine(reg_date, datetime.time(8, 0, 0))
else:
enrollment_dt = datetime.datetime.now()
# Single SQL update for both state and enrollment_date
self.env.cr.execute(
"UPDATE spp_program_membership SET state = %s, enrollment_date = %s WHERE id = %s",
(state, enrollment_dt, mem_id),
)
memberships.invalidate_recordset(["state", "enrollment_date"])
_logger.info(
"Realism for %d memberships: %d exited, %d paused, %d not_eligible, dates backdated",
len(membership_ids),
exited_count,
paused_count,
not_eligible_count,
)
# =========================================================================
# Internal helpers
# =========================================================================spp_demo/models/demo_stories.py (1219-1229)
The call to get_story_by_id(story["id"]) is inside a loop over story.get("journey", []). This is inefficient as it repeatedly searches for the same story. You can improve performance by fetching the original story once before the loop.
if "children" in pnames:
orig_story = get_story_by_id(story["id"])
if not orig_story:
return story
orig_profile = orig_story.get("profile", {})
orig_children = orig_profile.get("children", [])
for step in story.get("journey", []):
if "member" in step:
# Find matching child by position
for idx, child in enumerate(orig_children):
if child.get("name") == step["member"] and idx < len(pnames["children"]):
step["member"] = pnames["children"][idx]
break
spp_mis_demo_v2/docs/USE_CASES.md (475-477)
This markdown table appears to be broken because all columns are on a single line. For the table to render correctly, each row, including the header and separator, should be on its own line. This issue also occurs for other tables in this file.
**Test Cases:**
| Persona | PMT Score | Expected | Result |
| -------------- | --------- | -------- | ------------ |
| Maria Santos | 38 | `true` | ✓ Eligible |
| Rosa Garcia | 42 | `true` | ✓ Eligible |
| Carlos Morales | 48 | `false` | ✗ Not eligible |
| Ibrahim Hassan | 35 | `true` | ✓ Eligible |
spp_mis_demo_v2/models/mis_demo_generator.py (2903-2904)
The nosec comment for this raw SQL query has been removed. While this pattern is common in Odoo and detail._table is generally safe as it comes from the model, it's good practice to keep the nosec comment to explicitly acknowledge and silence security scanner warnings about potential SQL injection.
f"UPDATE {detail._table} SET create_date = %s WHERE id = %s", # nosec B608
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 19.0 #62 +/- ##
===========================================
- Coverage 71.36% 59.76% -11.61%
===========================================
Files 319 182 -137
Lines 25878 15761 -10117
===========================================
- Hits 18467 9419 -9048
+ Misses 7411 6342 -1069
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…NG for name generation Replace all Faker usage with deterministic random.Random(seed).choice() from locale-specific name arrays. This ensures fully reproducible output across runs with the same seed. - Remove create_faker import and fake variable from mis_demo_generator - Delete dead code: _generate_random_groups, _create_random_individual, _generate_volume_enrollments - Remove fake parameter from _create_program_cycles - Remove faker from external_dependencies in manifest - Remove tests that called deleted _create_random_individual method - Update docs and docstrings to reflect seeded RNG approach
Add mis_demo_loaded boolean on res.company, set on successful generation. Wizard shows warning and hides Load button if already triggered. Also show GRM tickets / Case records bullets conditionally based on whether spp_grm_demo / spp_case_demo are installed.
Why is this change needed?
The MIS demo module previously used random Faker-based generation for volume data, producing different results on each run. This makes testing unreliable and screenshots non-reproducible. This syncs the blueprint/seed approach from openspp-modules-v2 (
feature/mis-demo-blueprintsbranch).How was the change implemented?
spp_mis_demo_v2:
household_blueprints.py— ~20 blueprint definitions covering diverse household archetypes (~730 total households)seeded_volume_generator.py—SeededVolumeGeneratorclass usingrandom.Random(seed=42)for deterministic outputmis_demo_generator.py— replaces_generate_random_households()with blueprint-based generation, populates cycle beneficiaries and entitlementstest_blueprint_reproducibility.py— verifies same seed produces identical outputspp_demo:
fil_PH(Filipino names) andfr_TG(Togolese names) with 70+ first names and 85+ last names eachdemo_stories.py— locale-aware story character name generationNew unit tests
test_blueprint_reproducibility.py— determinism verification (same seed = identical farms)Unit tests executed by the author
Tests pass in openspp-modules-v2
How to test manually
spp_mis_demo_v2Related links
openspp-modules-v2 branch:
feature/mis-demo-blueprints