FEAT: Extended conformance validation #270
Merged
carlwilson merged 4 commits into dev/1.18.5 on Nov 13, 2025
Many real-world ODF documents use foreign XML elements and attributes that are not defined by the ODF XML schema. Such a document is known as an extended document. This PR provides a pre-processing XML filter that either removes foreign elements or uses their content, following the rules set out in [Section 3.17 of the ODF specification part 3 (Schema)](https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part3-schema/OpenDocument-v1.3-os-part3-schema.html#__RefNumPara__623372_981565270).

- created an `ExtendedConformanceFilter`, a pre-processing XML filter that resolves the content of foreign elements/attributes or removes them;
- added a second validation method to `XmlValidator` that takes any XML filter; this allows use of the extended conformance filter and may also be useful for more complex filtered workflows;
- added an `isExtended` member variable to `OdfValidator` and new factory methods to instantiate extended and non-extended versions;
- added an `isExtended` member variable to `ValidatingParserImpl` and new factory methods to instantiate extended and non-extended versions;
- added an `isExtended` member variable to `ProfileImpl` and new factory methods to instantiate extended and non-extended versions;
- created an extended version of `ValidPackageRule` so that it supports both forms of validation;
- fixed reporting of foreign namespaces so that they are reported as errors during normal validation, but as info messages during extended validation;
- added `EnumSets` of `OdfNamespaces` for the sets of local namespaces for ODF v1.0/1.1 and ODF v1.2 onwards;
- added tests for extended conformance;
- added a `-e`/`--extended` option to the CLI to invoke extended document validation;
- tidied up creation of the SAX `XMLReader` and added it to the `XmlUtils` class; and
- added valid and invalid extended documents for testing.
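To illustrate the general idea of a pre-processing SAX filter, here is a minimal sketch that is not the PR's `ExtendedConformanceFilter`. The class name `ForeignFilterSketch`, the helper `survivingElements`, and the `http://example.com/foreign` namespace are all hypothetical. The sketch only drops foreign elements (with their subtrees) and foreign attributes; it does not attempt the content-preserving rule from Section 3.17.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch: drops elements and attributes whose namespace is not "known".
class ForeignFilterSketch extends XMLFilterImpl {
    private final Set<String> knownNamespaces;
    private int skipDepth = 0; // > 0 while inside a dropped foreign subtree

    ForeignFilterSketch(Set<String> knownNamespaces) {
        this.knownNamespaces = knownNamespaces;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (skipDepth > 0 || !knownNamespaces.contains(uri)) {
            skipDepth++; // swallow the foreign element and everything below it
            return;
        }
        AttributesImpl kept = new AttributesImpl();
        for (int i = 0; i < atts.getLength(); i++) {
            String attUri = atts.getURI(i);
            // Keep unqualified attributes and attributes in a known namespace.
            if (attUri.isEmpty() || knownNamespaces.contains(attUri)) {
                kept.addAttribute(attUri, atts.getLocalName(i), atts.getQName(i),
                        atts.getType(i), atts.getValue(i));
            }
        }
        super.startElement(uri, localName, qName, kept);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (skipDepth > 0) {
            skipDepth--;
            return;
        }
        super.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (skipDepth == 0) {
            super.characters(ch, start, length);
        }
    }

    // Parses xml through the filter and lists the elements (with attribute names) that survive.
    static List<String> survivingElements(String xml, Set<String> known) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        ForeignFilterSketch filter = new ForeignFilterSketch(known);
        filter.setParent(reader);
        List<String> seen = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                StringBuilder sb = new StringBuilder(localName);
                for (int i = 0; i < atts.getLength(); i++) {
                    sb.append(" @").append(atts.getLocalName(i));
                }
                seen.add(sb.toString());
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String office = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";
        String text = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";
        String xml = "<office:document xmlns:office='" + office + "' xmlns:text='" + text
                + "' xmlns:x='http://example.com/foreign'>"
                + "<text:p x:flag='1' text:style-name='P1'>Hi"
                + "<x:note><text:span>gone</text:span></x:note></text:p></office:document>";
        System.out.println(survivingElements(xml, Set.of(office, text)));
    }
}
```

In this sketch the `x:note` subtree and the `x:flag` attribute never reach the downstream handler, so a schema validator placed after the filter would only see content from the known namespaces.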
Pull Request Overview
This PR implements extended conformance validation for ODF documents that contain foreign XML elements and attributes. The implementation follows Section 3.17 of the ODF specification, adding a preprocessing filter that either removes or processes foreign elements according to the specification rules.
- Created `ExtendedConformanceFilter` to handle foreign XML elements/attributes per ODF spec requirements
- Added `isExtended` parameter throughout the validation chain (validators, parsers, profiles, rules) to support both standard and extended validation modes
- Added CLI flag `-e`/`--extended` to enable extended document validation via command-line interface
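How an `isExtended` flag might change the reporting of foreign namespaces can be sketched as follows. This is an assumption-laden illustration, not the project's code: `ExtendedModeSketch`, `Ns`, `Severity`, and both set names are hypothetical, and the membership of the two namespace sets is illustrative only.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of mode-dependent severity and per-version namespace sets.
class ExtendedModeSketch {
    enum Severity { INFO, WARNING, ERROR }

    // Stand-in for the project's OdfNamespaces enum; only three entries are shown.
    enum Ns {
        OFFICE("urn:oasis:names:tc:opendocument:xmlns:office:1.0"),
        TEXT("urn:oasis:names:tc:opendocument:xmlns:text:1.0"),
        OF("urn:oasis:names:tc:opendocument:xmlns:of:1.2");

        final String uri;

        Ns(String uri) {
            this.uri = uri;
        }
    }

    // Illustrative membership only: the real sets are defined by the PR.
    static final Set<Ns> ODF_10_11 = EnumSet.of(Ns.OFFICE, Ns.TEXT);
    static final Set<Ns> ODF_12_PLUS = EnumSet.allOf(Ns.class);

    // A namespace is foreign when it is not in the local set for the document's version.
    static boolean isForeign(String uri, Set<Ns> local) {
        return local.stream().noneMatch(ns -> ns.uri.equals(uri));
    }

    // Foreign namespaces are errors in normal mode and info messages in extended mode.
    static Severity foreignNamespaceSeverity(boolean isExtended) {
        return isExtended ? Severity.INFO : Severity.ERROR;
    }

    public static void main(String[] args) {
        String uri = "http://example.com/foreign";
        if (isForeign(uri, ODF_12_PLUS)) {
            System.out.println("normal:   " + foreignNamespaceSeverity(false));
            System.out.println("extended: " + foreignNamespaceSeverity(true));
        }
    }
}
```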
Reviewed Changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| extended_valid.xml | Test resource: valid extended ODF document with foreign elements |
| extended_invalid.xml | Test resource: invalid extended ODF document for negative testing |
| ValidPackageRuleTest.java | Removed EqualsVerifier test |
| ProfileImplTest.java | Updated to use new API with isExtended parameter |
| ValidatingParserTest.java | Improved null handling by using Path instead of File |
| PackageParserTest.java | Improved null handling by using Path instead of File |
| TestFiles.java | Added constants for new extended conformance test resources |
| XmlValidatorTest.java | Added comprehensive tests for extended conformance validation |
| OdfSchemaFactory.java | Removed overloaded getSchema method that defaulted to ODF_13 |
| OdfNamespaces.java | Added EnumSets for ODF 1.0/1.1 and 1.2+ namespace sets |
| ExtendedConformanceFilter.java | New SAX filter implementing extended conformance preprocessing |
| ValidPackageRule.java | Added isExtended parameter to support extended validation mode |
| Rules.java | Added extended profile support and factory methods |
| ProfileImpl.java | Added isExtended member and updated constructors |
| ValidationResultImpl.java | Minor formatting improvement (blank line) |
| ValidatingParserImpl.java | Integrated extended conformance filter and severity adjustments |
| OdfValidators.java | Added factory methods for extended/non-extended validators |
| OdfValidatorImpl.java | Added isExtended support and conditional validation logic |
| XmlValidator.java | Added validate method accepting XMLFilter for filtered validation |
| XmlUtils.java | Added utility methods for creating SAX readers with/without filters |
| XmlParserImpl.java | Refactored to use XmlUtils for SAX reader creation |
| CliValidator.java | Added -e/--extended CLI option for extended validation |
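The `XmlValidator.java` row above describes a validate method that accepts an `XMLFilter`. A rough sketch of that pattern using only JDK APIs might look like the following. Everything here is an assumption for illustration: `FilteredValidationSketch`, `isValid`, `dropForeign`, and the `urn:example:*` namespaces are hypothetical, and the sketch uses a tiny W3C XML Schema because the JDK does not ship a RELAX NG validator, whereas the ODF schemas are RELAX NG.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch: validate a document directly, or through an optional XMLFilter.
class FilteredValidationSketch {
    static final String KNOWN_NS = "urn:example:known";

    // Toy schema: a root element containing one or more item elements.
    static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
            + " targetNamespace='urn:example:known' elementFormDefault='qualified'>"
            + "<xs:element name='root'><xs:complexType><xs:sequence>"
            + "<xs:element name='item' type='xs:string' maxOccurs='unbounded'/>"
            + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Drops every element (and its subtree) that is not in KNOWN_NS.
    static XMLFilter dropForeign() {
        return new XMLFilterImpl() {
            private int skipDepth = 0;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts)
                    throws SAXException {
                if (skipDepth > 0 || !KNOWN_NS.equals(uri)) {
                    skipDepth++;
                    return;
                }
                super.startElement(uri, local, qName, atts);
            }

            @Override
            public void endElement(String uri, String local, String qName) throws SAXException {
                if (skipDepth > 0) {
                    skipDepth--;
                    return;
                }
                super.endElement(uri, local, qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) throws SAXException {
                if (skipDepth == 0) {
                    super.characters(ch, start, length);
                }
            }
        };
    }

    // Returns true when xml validates against XSD; filter may be null for unfiltered validation.
    static boolean isValid(String xml, XMLFilter filter) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        if (filter != null) {
            filter.setParent(reader); // the validator then reads events from the filter
            reader = filter;
        }
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator()
                    .validate(new SAXSource(reader, new InputSource(new StringReader(xml))));
            return true;
        } catch (SAXException e) {
            return false; // with no ErrorHandler set, validation errors are thrown
        }
    }

    public static void main(String[] args) throws Exception {
        String doc = "<k:root xmlns:k='urn:example:known' xmlns:x='urn:example:foreign'>"
                + "<k:item>a</k:item><x:extra>b</x:extra></k:root>";
        System.out.println("unfiltered valid: " + isValid(doc, null));
        System.out.println("filtered valid:   " + isValid(doc, dropForeign()));
    }
}
```

Accepting any `XMLFilter` rather than a specific class keeps the validation step independent of what the pre-processing does, which matches the PR's remark that the method may be useful for other filtered workflows.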
odf-core/src/main/java/org/openpreservation/odf/validation/ValidatingParserImpl.java
/**
 * Validate the supplied InputStream against the supplied schema.
 *
 * @param parseResult the {@link ParseResult} obtained form parsign the file
Multiple typos in JavaDoc comment: "form" should be "from" and "parsign" should be "parsing".
Author (Member):
Fixed minor spelling issues in comments
odf-core/src/main/java/org/openpreservation/odf/validation/rules/ValidPackageRule.java
odf-core/src/test/resources/org/openpreservation/odf/fmt/xml/extended_invalid.xml
odf-core/src/main/java/org/openpreservation/odf/validation/ValidatingParserImpl.java
odf-core/src/main/java/org/openpreservation/odf/validation/OdfValidators.java
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…xtended_invalid.xml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…Validators.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>