Fix #510: Implement viral attributes logic #631
Draft
javihern98 wants to merge 15 commits into main from
- Add VIRAL_ATTRIBUTE = "Viral Attribute" to Role enum and Role_keys
- Add get_viral_attributes() and get_viral_attributes_names() to Dataset
- Fix _InternalApi.py: convert "ViralAttribute" → "Viral Attribute" (both loading paths)
- Fix nullable default to include VIRAL_ATTRIBUTE
- Add "ViralAttribute" to JSON schema for backward compatibility
- Update existing test to expect VIRAL_ATTRIBUTE instead of ATTRIBUTE

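A minimal sketch of the data-model change in this commit. It assumes simplified stand-ins for `Role`, `Component`, and `Dataset` (the real classes in `vtlengine.Model` carry more fields), and `normalize_role` is a hypothetical name for the legacy-string conversion done in `_InternalApi.py`:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Role(Enum):
    # Simplified stand-in for the engine's Role enum.
    IDENTIFIER = "Identifier"
    MEASURE = "Measure"
    ATTRIBUTE = "Attribute"
    VIRAL_ATTRIBUTE = "Viral Attribute"


def normalize_role(raw: str) -> Role:
    """Map a role string from a JSON structure to a Role (hypothetical helper)."""
    # The legacy spelling "ViralAttribute" is accepted for backward compatibility.
    if raw == "ViralAttribute":
        raw = "Viral Attribute"
    return Role(raw)


@dataclass
class Component:
    name: str
    role: Role


@dataclass
class Dataset:
    components: Dict[str, Component] = field(default_factory=dict)

    def get_attributes(self) -> List[Component]:
        # Plain attributes only; viral attributes are reported separately.
        return [c for c in self.components.values() if c.role is Role.ATTRIBUTE]

    def get_viral_attributes(self) -> List[Component]:
        return [c for c in self.components.values() if c.role is Role.VIRAL_ATTRIBUTE]

    def get_viral_attributes_names(self) -> List[str]:
        return [c.name for c in self.get_viral_attributes()]


ds = Dataset(
    {
        "Id_1": Component("Id_1", Role.IDENTIFIER),
        "Me_1": Component("Me_1", Role.MEASURE),
        "At_1": Component("At_1", normalize_role("ViralAttribute")),
        "At_2": Component("At_2", normalize_role("Attribute")),
    }
)
```

In this sketch `ds.get_viral_attributes_names()` lists only `At_1`, while `ds.get_attributes()` keeps returning the plain attribute `At_2`.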
- Replace NotImplementedError in visitViralAttribute with return Role.VIRAL_ATTRIBUTE
- Add ViralAttribute class to RoleSetter.py
- Add VIRAL_ATTRIBUTE to ROLE_SETTER_MAPPING in Utils
- Fix VIRAL_ATTRIBUTE token to match "viral attribute" (lowered Role.value)

- Update Binary.dataset_validation() to keep VIRAL_ATTRIBUTE components
- Collect viral attributes from both operands (not just base_operand)
- Extract _cleanup_attributes_after_merge() for viral attr suffix handling
- Update dataset_scalar_validation/evaluation to keep viral attr columns

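To illustrate the suffix handling, here is a sketch on plain dictionaries rather than the engine's DataFrames. It assumes the merge left `_x`/`_y` copies of every attribute present in both operands; `cleanup_row` and the `resolve` callable are hypothetical names, and `max` is only a placeholder policy for combining the two copies:

```python
from typing import Any, Callable, Dict, Set


def cleanup_row(
    row: Dict[str, Any],
    viral: Set[str],
    plain: Set[str],
    resolve: Callable[[str, Any, Any], Any],
) -> Dict[str, Any]:
    """Drop plain attributes and collapse <name>_x / <name>_y viral columns."""
    out: Dict[str, Any] = {}
    for col, value in row.items():
        base = col[:-2] if col.endswith(("_x", "_y")) else col
        if base in plain:
            continue  # non-viral attributes are not propagated
        if base in viral and col != base:
            if base not in out:
                # Both operands carried the attribute: combine the two copies.
                out[base] = resolve(base, row[f"{base}_x"], row[f"{base}_y"])
            continue
        out[col] = value
    return out


row = {
    "Id_1": 1,
    "Me_1": 10,
    "At_1_x": "A",
    "At_1_y": "B",
    "At_2_x": "p",
    "At_2_y": "q",
    "At_3": "C",
}
result = cleanup_row(
    row,
    viral={"At_1", "At_3"},
    plain={"At_2"},
    resolve=lambda name, a, b: max(a, b),  # placeholder combination policy
)
```

Here `At_1` exists in both operands and is collapsed into one column, `At_3` exists in only one operand and passes through unchanged, and the plain attribute `At_2` is dropped.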
- Update Unary.dataset_validation() to keep VIRAL_ATTRIBUTE components
- Update Unary.dataset_evaluation() to include viral attr in cols_to_keep
- Add parametrized tests for numeric, string, boolean, and comparison ops

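The validation change amounts to widening a role filter. A sketch on `(name, role)` pairs, not the engine's code; `result_components` and `KEPT_ROLES` are hypothetical names:

```python
from typing import List, Tuple

# Roles that survive a unary operator in this sketch.
KEPT_ROLES = {"Identifier", "Measure", "Viral Attribute"}


def result_components(components: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep identifiers, measures and viral attributes; drop plain attributes."""
    return [(name, role) for name, role in components if role in KEPT_ROLES]


operand = [
    ("Id_1", "Identifier"),
    ("Me_1", "Measure"),
    ("At_1", "Attribute"),
    ("At_2", "Viral Attribute"),
]
result = result_components(operand)
cols_to_keep = [name for name, _role in result]
```

The same list of names can then serve as the column selection at evaluation time, which is the role `cols_to_keep` plays in the commit above.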
- Update Between, Time_Aggregation, Check, Membership to keep VIRAL_ATTRIBUTE
- Update Set operators (Intersection, Symdiff) to include viral attr columns
- Update Interpreter HAVING clause to preserve VIRAL_ATTRIBUTE
- Verified Aggregation, Nvl, Check_Hierarchy already correct (no changes needed)

- Add PROPAGATION token to VtlTokens.g4
- Add defViralPropagation, vpSignature, vpBody, vpClause grammar rules
- Regenerate parser/lexer with ANTLR 4.9.3
- Define EnumeratedVpClause, AggregateVpClause, ViralPropagationDef AST nodes
- Implement visitor methods in ASTConstructor
- Update DAG to accept ViralPropagationDef as top-level statement
- Add visit_ViralPropagationDef stub in Interpreter

- Create ViralPropagation package with ViralPropagationRule and ViralPropagationRegistry
- Registry supports resolve_pair (binary) and resolve_group (aggregation)
- Wire registry into Interpreter: initialize per run, register rules from AST nodes
- Wire into Binary._cleanup_attributes_after_merge: use resolve_pair for dual-viral attrs
- Wire into Aggregation.evaluate: propagate viral attrs using resolve_group after grouping
- Move imports to top of files per project convention

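A rough sketch of how such a registry could resolve values. The class and field names (`Rule`, `Registry`, `pairs`, `singles`) are simplified assumptions, not the PR's actual definitions. Treating enumerated pairs as unordered, returning `None` when no rule is registered, and applying the aggregate function inside `resolve_pair` are also assumptions:

```python
from dataclasses import dataclass, field
from functools import reduce
from typing import Any, Callable, Dict, FrozenSet, Iterable, Optional


@dataclass
class Rule:
    """Simplified stand-in for ViralPropagationRule (field names are assumptions)."""

    target: str
    # Enumerated binary clauses: unordered pair of input values -> result.
    pairs: Dict[FrozenSet[Any], Any] = field(default_factory=dict)
    # Enumerated unary clauses: a single value that decides the result when present.
    singles: Dict[Any, Any] = field(default_factory=dict)
    aggregate: Optional[Callable[[Iterable[Any]], Any]] = None
    default: Any = None


class Registry:
    def __init__(self) -> None:
        self._rules: Dict[str, Rule] = {}

    def register(self, rule: Rule) -> None:
        self._rules[rule.target] = rule

    def resolve_pair(self, variable: str, a: Any, b: Any) -> Any:
        rule = self._rules.get(variable)
        if rule is None:
            return None  # assumption: no rule means no propagated value
        if rule.aggregate is not None:
            return rule.aggregate([a, b])  # assumption: aggregate rules also cover pairs
        key = frozenset((a, b))
        if key in rule.pairs:  # binary clauses are checked first
            return rule.pairs[key]
        for value in (a, b):  # then unary clauses
            if value in rule.singles:
                return rule.singles[value]
        return rule.default  # then the default

    def resolve_group(self, variable: str, values: Iterable[Any]) -> Any:
        rule = self._rules.get(variable)
        items = list(values)
        if rule is None or not items:
            return None
        if rule.aggregate is not None:
            return rule.aggregate(items)
        # No aggregate function: fold the group pairwise.
        return reduce(lambda x, y: self.resolve_pair(variable, x, y), items)


reg = Registry()
reg.register(
    Rule(target="At_1", pairs={frozenset(("C", "N")): "C"}, singles={"M": "M"}, default="N")
)
reg.register(Rule(target="At_2", aggregate=max))
```

With these two rules, `reg.resolve_pair("At_1", "C", "N")` hits the binary clause, `reg.resolve_pair("At_1", "N", "M")` falls through to the unary clause, and `reg.resolve_group("At_2", [1, 5, 3])` applies the aggregate function directly.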
- Add error codes 1-3-3-1 through 1-3-3-4 to messages.py
- Validate: no duplicate rules for same variable/valuedomain
- Validate: no duplicate enumeration combinations
- Validate: no mixing enumerated and aggregate clauses

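The three checks could look roughly like the following. This is a sketch under assumptions: `RuleDefinitionError` stands in for the engine's `SemanticError`, the messages do not reproduce the real `1-3-3-x` texts, and enumeration combinations are treated as unordered:

```python
from typing import List, Optional, Set, Tuple

Clause = Tuple[Tuple[str, ...], str]  # (input values, resulting value)


class RuleDefinitionError(Exception):
    """Stand-in for the engine's SemanticError; the name is hypothetical."""


def validate_rule(
    target: str,
    enumerated: List[Clause],
    aggregate: Optional[str],
    registered: Set[str],
) -> None:
    """Raise if the rule definition is invalid, otherwise record its target."""
    if target in registered:
        raise RuleDefinitionError(f"duplicate rule for {target}")
    if enumerated and aggregate is not None:
        raise RuleDefinitionError(f"rule for {target} mixes enumerated and aggregate clauses")
    seen: Set[Tuple[str, ...]] = set()
    for inputs, _result in enumerated:
        key = tuple(sorted(inputs))  # assumption: combinations are unordered
        if key in seen:
            raise RuleDefinitionError(f"duplicate enumeration combination {key} in rule for {target}")
        seen.add(key)
    registered.add(target)


registered: Set[str] = set()
validate_rule("At_1", [(("C", "N"), "C"), (("M",), "M")], None, registered)

bad_definitions = [
    ("At_1", [], "max"),  # a rule for At_1 already exists
    ("At_2", [(("C",), "C")], "max"),  # enumerated and aggregate clauses mixed
    ("At_3", [(("C", "N"), "C"), (("N", "C"), "N")], None),  # same combination twice
]
errors = []
for target, enumerated, aggregate in bad_definitions:
    try:
        validate_rule(target, enumerated, aggregate, registered)
    except RuleDefinitionError as exc:
        errors.append(str(exc))
```

Each of the three bad definitions is rejected before it is recorded, so only `At_1` ends up in `registered`.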
Merged 9 test files into 4 well-organized files with shared fixtures:
- test_viral_role.py: data model unit tests (6 tests)
- test_viral_operators.py: operator propagation for binary, unary, and other ops (24 tests)
- test_viral_propagation.py: define viral propagation parsing, e2e, validation (10 tests)
- test_viral_prettify.py: prettify support (4 tests)

Extracted shared data structure builders (_ds, _id, _me, _va, _at, _run) to eliminate boilerplate duplication across tests.

- Add viral_propagation.vtl input files for ast_string and prettier tests
- Add reference_viral_propagation.vtl expected output
- Register in params and params_prettier lists in test_AST_String.py
- Remove standalone test_viral_prettify.py

- Layered dataset approach: one base, progressively add 1/2/3 viral attrs
- Parametrize operators × num_viral_attrs (unary, binary, scalar, other)
- Shared propagation rules defined once, reused across parametrized ops
- Multi-attribute test: enumerated (At_1) + aggregate max (At_2) in one script
- 79 tests covering all operator categories with 1, 2, and 3 viral attrs

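The layered approach can be sketched with the standard library alone. The structure layout, the `layered_structure` helper, and the operator subset below are illustrative assumptions rather than the PR's actual fixtures:

```python
from itertools import product
from typing import Dict, List, Tuple

# One shared base structure; viral attributes are layered on top of it.
BASE: Dict[str, str] = {"Id_1": "Identifier", "Me_1": "Measure"}


def layered_structure(num_viral: int) -> Dict[str, str]:
    """Return the base structure plus At_1..At_n declared as viral attributes."""
    structure = dict(BASE)  # copy so the shared base is never mutated
    for i in range(1, num_viral + 1):
        structure[f"At_{i}"] = "Viral Attribute"
    return structure


UNARY_OPS = ["abs", "ceil", "floor"]  # illustrative subset of operators
CASES: List[Tuple[str, int]] = list(product(UNARY_OPS, (1, 2, 3)))
```

A list like `CASES` could then be passed to `pytest.mark.parametrize("op, num_viral", CASES)` so that every operator runs against each layer.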
Summary
Implements the VTL 2.2 `define viral propagation` construct and the viral attribute propagation mechanism per the draft spec.

What changed:
- `VIRAL_ATTRIBUTE` role: a new `Role` enum value that behaves like `ATTRIBUTE` except viral attributes are automatically propagated through operators instead of being dropped
- `define viral propagation` construct: new ANTLR grammar rules, AST nodes (`ViralPropagationDef`, `EnumeratedVpClause`, `AggregateVpClause`), and parser support for defining propagation rules
- `ViralPropagationRegistry` stores rules and resolves viral attribute values via `resolve_pair` (binary ops) and `resolve_group` (aggregations)
- `define viral propagation` statements are correctly rendered back to VTL
- Semantic error codes `1-3-3-1` through `1-3-3-4` for duplicate rules, mixed clause types, and duplicate enumeration combinations

Closes #510
Checklist
- Code quality checks (`ruff format`, `ruff check`, `mypy`)
- Tests (`pytest`): 4144 tests, 79 new viral attribute tests

Impact / Risk
- Components with `"role": "ViralAttribute"` (legacy format) are now loaded as `Role.VIRAL_ATTRIBUTE` instead of being silently converted to `Role.ATTRIBUTE`. This is the correct behavior per the VTL spec. The JSON schema accepts both `"Viral Attribute"` and `"ViralAttribute"` for backward compatibility.
- `run_sdmx()` does not yet support viral attributes: `VTL_ROLE_MAPPING` has no viral attribute mapping. A follow-up issue in pysdmx is needed.
- A `PROPAGATION` token was added to `VtlTokens.g4`.

Implementation details (excluding AST folder)
New file:
`src/vtlengine/ViralPropagation/__init__.py`

The core propagation engine. Contains:
- `ViralPropagationRule`: dataclass storing a single rule definition (name, signature type, target, enumerated clauses, aggregate function, default value)
- `ViralPropagationRegistry`: stores rules and resolves viral attribute values:
  - `register(rule)`: stores a rule keyed by variable or value domain
  - `resolve_pair(variable, val_a, val_b)`: resolves two values for binary operators (binary clauses checked before unary, then default)
  - `resolve_group(variable, values)`: resolves N values for aggregation operators (aggregate function or pairwise reduce)
- `get_current_registry` / `set_current_registry`: the Interpreter sets a fresh registry per `run()` call; operators access it without threading it through signatures

`src/vtlengine/Model/__init__.py`
- Added `VIRAL_ATTRIBUTE = "Viral Attribute"` to the `Role` enum
- Added `"Viral Attribute"` to the `Role_keys` validation list
- Added `get_viral_attributes()` and `get_viral_attributes_names()` methods to `Dataset`: they return only `VIRAL_ATTRIBUTE` components (separate from `get_attributes()`, which continues to return only `ATTRIBUTE`)

`src/vtlengine/API/_InternalApi.py`
- Changed the `"ViralAttribute"` → `"Attribute"` conversion hack to `"ViralAttribute"` → `"Viral Attribute"` (both the `structures` and `DataStructure` loading paths)
- Added `Role.VIRAL_ATTRIBUTE` to the nullable default tuple so viral attributes default to `nullable=True`

`src/vtlengine/API/data/schema/json_schema_2.1.json`
- Added `"ViralAttribute"` to the role enum for backward compatibility (alongside existing `"Viral Attribute"`)

`src/vtlengine/Operators/__init__.py`
- `Binary.dataset_validation()`: filter includes `Role.VIRAL_ATTRIBUTE`; also collects viral attributes from the other operand (not just the base)
- `Binary.dataset_scalar_validation()` / `Binary.dataset_set_validation()`: filter includes `Role.VIRAL_ATTRIBUTE`
- `Binary._cleanup_attributes_after_merge()`: new static method extracting the attribute cleanup logic; drops non-viral attributes, resolves viral attribute merge suffixes (`_x`/`_y`) using `registry.resolve_pair()`
- `Binary.dataset_scalar_evaluation()`: `cols_to_keep` includes viral attribute columns
- `Unary.dataset_validation()`: filter includes `Role.VIRAL_ATTRIBUTE`
- `Unary.dataset_evaluation()`: `cols_to_keep` includes viral attribute columns

`src/vtlengine/Operators/Aggregation.py`
- `Aggregation.evaluate()`: after grouping, applies `registry.resolve_group()` to each viral attribute's grouped values

`src/vtlengine/Operators/Comparison.py`
- `Between.validate()`: filter includes `Role.VIRAL_ATTRIBUTE`

`src/vtlengine/Operators/General.py`
- `Membership.validate()`: filter includes `Role.VIRAL_ATTRIBUTE`

`src/vtlengine/Operators/Set.py`
- `Intersection.evaluate()` and `Symdiff.evaluate()`: `not_identifiers` list includes `get_viral_attributes_names()`

`src/vtlengine/Operators/Time.py`
- `Time_Aggregation.dataset_validation()`: filter includes `Role.VIRAL_ATTRIBUTE`

`src/vtlengine/Operators/Validation.py`
- `Check.validate()`: filter includes `Role.VIRAL_ATTRIBUTE`

`src/vtlengine/Operators/RoleSetter.py`
- Added `class ViralAttribute(RoleSetter): role = Role.VIRAL_ATTRIBUTE`, which enables `calc viral attribute` in VTL scripts

`src/vtlengine/Utils/__init__.py`
- Added the `VIRAL_ATTRIBUTE` import and `VIRAL_ATTRIBUTE: ViralAttribute` to `ROLE_SETTER_MAPPING`

`src/vtlengine/Interpreter/__init__.py`
- `visit_Start()`: initializes a fresh `ViralPropagationRegistry` per run
- `visit_ViralPropagationDef()`: new method that validates (no mixed clauses, no duplicate enumerations, no duplicate rules) and registers the propagation rule in the registry
- Updated the `visit_Start` type check to accept `ViralPropagationDef` as a valid top-level statement
- The HAVING clause preserves `Role.VIRAL_ATTRIBUTE` components

`src/vtlengine/Exceptions/messages.py`
- Added `SemanticError` codes (`1-3-3-1` through `1-3-3-4`) for: duplicate variable rule, duplicate value domain rule, mixed clause types, duplicate enumeration combination

Notes
Known limitations (v1):
- `run_sdmx()` pathway not supported yet (needs pysdmx viral attribute role support)
- `Component` with a `value_domain` field (deferred)
- The `aggregate` keyword in the VTL 2.2 spec maps to `aggr` in the grammar (matching the existing VTL token)

Test structure (79 tests):
- `test_viral_role.py`: data model unit tests (6 tests)
- `test_viral_operators.py`: operator propagation with 1/2/3 viral attributes across unary, binary, scalar, and other operators (58 tests)
- `test_viral_propagation.py`: `define viral propagation` parsing, end-to-end runs with enumerated + aggregate rules, multi-attribute propagation, and semantic validation (15 tests)
- `tests/AST/test_AST_String.py`: prettify round-trip via `viral_propagation.vtl` data files (5 tests)