
REL: ODF Validator 0.18.5 #271

Merged
carlwilson merged 16 commits into main from rel/0.18.5 on Nov 13, 2025

@carlwilson

  • ODF v1.4 validation;
  • extended conformance checking;
  • memory usage fixes;
  • debug information; and
  • minor bug fixes.

- if no schema is found for a particular version then throw an exception that reports the error and lists the currently supported versions;
- added `1.4` to `Version` enum in preparation for 1.4 validation;
- added static method to `Version` enum to output the supported versions;
- if detected version is `UNKNOWN` then default to v1.1 (prep for v1.0 and v1.1 validation); and
- bumped version to `0.18.5-SNAPSHOT` in project poms.
- used derived version rather than call versionFromPath twice; and
- moved prepend assignment inside string build condition.
- added schema documents for ODF v1.4 to resources;
- added new schema documents to schema map; and
- checked in batch files for `0.18.5-SNAPSHOT`.
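
The version handling described above could look roughly like the following. This is a minimal sketch, not the project's code: only `Version`, `1.4` and `UNKNOWN` come from the description, while `VersionSketch`, `supportedVersions`, `schemaFor`, the other enum constants and the schema file names are hypothetical.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative only: the real Version enum and schema map may differ.
final class VersionSketch {
    enum Version {
        UNKNOWN("unknown"), V1_0("1.0"), V1_1("1.1"), V1_2("1.2"), V1_3("1.3"), V1_4("1.4");

        private final String label;

        Version(String label) {
            this.label = label;
        }

        // Static helper that outputs the supported versions (assumed here
        // to be every constant except UNKNOWN).
        public static String supportedVersions() {
            return Arrays.stream(values())
                    .filter(v -> v != UNKNOWN)
                    .map(v -> v.label)
                    .collect(Collectors.joining(", "));
        }
    }

    // Looks up a schema, defaulting UNKNOWN to v1.1 and failing with a
    // message that lists the supported versions when nothing is mapped.
    public static String schemaFor(Version detected, Map<Version, String> schemaMap) {
        Version effective = (detected == Version.UNKNOWN) ? Version.V1_1 : detected;
        String schema = schemaMap.get(effective);
        if (schema == null) {
            throw new IllegalStateException("No schema found for ODF version "
                    + effective.label + "; supported versions: "
                    + Version.supportedVersions());
        }
        return schema;
    }
}
```

Reporting the supported versions inside the exception message means a user with an unsupported document sees what the validator can handle without consulting the documentation.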
The build date displayed by the command line app always defaults to the current date.
The Maven variables to substitute the date and date format in the build properties files weren't in the parent pom.

- added `<odfapps.timestamp>` and `<maven.build.timestamp.format>` Maven properties to parent pom;
- made some small improvements to the `BuildVersionProvider` class:
  - made the properties resource reader static as they only need to be read once;
  - added convenience method to get the version string alone; and
  - improved the exception message thrown when the resources couldn't be loaded.
- bumped Maven version -> 0.18.5;
- generated new documentation site, README and batch files; and
- added new message constant file left out in error.
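
The `BuildVersionProvider` changes can be illustrated with a small sketch. It assumes a properties file on the classpath; the class name `BuildInfoSketch`, the resource name `build.properties` and the `version` key are hypothetical and not taken from the project.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Illustrative only: resource name and property key are assumptions.
final class BuildInfoSketch {
    static final String RESOURCE = "build.properties"; // hypothetical name
    private static Properties cached;

    // The properties are read once and reused for every later call.
    public static synchronized Properties properties() {
        if (cached == null) {
            cached = load(RESOURCE);
        }
        return cached;
    }

    // Loads a classpath resource, failing with a message that names it.
    public static Properties load(String resource) {
        ClassLoader loader = BuildInfoSketch.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException(
                        "Build properties resource not found on classpath: " + resource);
            }
            Properties props = new Properties();
            props.load(in);
            return props;
        } catch (IOException e) {
            throw new IllegalStateException(
                    "Could not read build properties resource: " + resource, e);
        }
    }

    // Convenience accessor for the version string alone.
    public static String version() {
        return properties().getProperty("version", "UNKNOWN");
    }
}
```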
When users report issues, it helps to know something about their environment.
This PR adds:

- a CLI option `-d`/`--debug` that outputs information about the user's JVM and OS;
- a CLI option `-v` that controls verbosity of reporting in general, currently used to filter the output of the debug information.

Also added a primitive exception handler around validation in the top-level CLI task. For now this simply outputs the exception details and rethrows the exception.
- moved debug info initialisation within `call` method to ensure that it's initialised with the correct object values;
- improved the debug information output; and
- fixed small compiler warnings about unnecessary imports.
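
As a rough illustration of the kind of environment report the `-d`/`--debug` option could produce, the sketch below collects JVM and OS details from standard system properties and reports memory in MB. The class and method names are hypothetical, and the actual fields and layout in the validator may differ.

```java
// Illustrative only: not the project's debug information class.
final class DebugInfoSketch {
    static final long MB = 1024L * 1024L;

    // Formats a byte count as whole megabytes.
    public static String toMb(long bytes) {
        return (bytes / MB) + " MB";
    }

    // Builds a plain-text report from standard JVM system properties
    // and the Runtime memory counters.
    public static String describe() {
        Runtime rt = Runtime.getRuntime();
        StringBuilder sb = new StringBuilder();
        sb.append("Java version: ").append(System.getProperty("java.version")).append('\n');
        sb.append("Java vendor: ").append(System.getProperty("java.vendor")).append('\n');
        sb.append("OS name: ").append(System.getProperty("os.name")).append('\n');
        sb.append("OS version: ").append(System.getProperty("os.version")).append('\n');
        sb.append("OS architecture: ").append(System.getProperty("os.arch")).append('\n');
        sb.append("Max memory: ").append(toMb(rt.maxMemory())).append('\n');
        sb.append("Total memory: ").append(toMb(rt.totalMemory())).append('\n');
        sb.append("Free memory: ").append(toMb(rt.freeMemory())).append('\n');
        return sb.toString();
    }
}
```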
The zip archive cache was unlimited and large files could exhaust the heap memory.

- added two cache checks:
  1. `MAX_CACHE_SIZE` is equal to heap memory divided by 4; and
  2. `MAX_ITEM_SIZE` is `MAX_CACHE_SIZE` divided by 20;
- items larger than `MAX_ITEM_SIZE` or larger than the remaining cache size aren't cached;
- `ZipFileProcessor` now only opens the zip file once, necessary so that delivered `InputStream`s aren't closed;
- method to retrieve a stream now uses the cache calculation method; and
- removed unnecessary method to list cached file names.
- made `debugInfo` a local variable for now;
- fixed some small typos; and
- added JavaDoc comments to public methods.
- memory stats now shown in MB; and
- print stack trace for exceptions.
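
The two cache limits for the zip archive cache can be sketched as follows. This is an assumption-laden illustration rather than the `ZipFileProcessor` implementation: the class name, the in-memory map, and the use of `Runtime.maxMemory()` as "heap memory" are all hypothetical choices.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: shows the size checks, not the real cache.
final class BoundedCacheSketch {
    final long maxCacheSize; // heap memory divided by 4
    final long maxItemSize;  // maxCacheSize divided by 20
    private long used = 0;
    private final Map<String, byte[]> items = new LinkedHashMap<>();

    BoundedCacheSketch(long heapBytes) {
        this.maxCacheSize = heapBytes / 4;
        this.maxItemSize = this.maxCacheSize / 20;
    }

    // Assumption: "heap memory" is read from Runtime.maxMemory().
    public static BoundedCacheSketch forCurrentJvm() {
        return new BoundedCacheSketch(Runtime.getRuntime().maxMemory());
    }

    // An item is cacheable only if it fits both the per-item limit
    // and the space remaining in the cache.
    public boolean isCacheable(long size) {
        return size <= maxItemSize && size <= (maxCacheSize - used);
    }

    // Returns true if the item was cached, false if it was skipped.
    public boolean put(String name, byte[] data) {
        if (items.containsKey(name) || !isCacheable(data.length)) {
            return false;
        }
        items.put(name, data);
        used += data.length;
        return true;
    }

    public byte[] get(String name) {
        return items.get(name);
    }
}
```

With these ratios a single entry can never take more than a twentieth of the cache, so one very large package entry cannot evict the chance of caching many small ones.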
Many real-world ODF documents use foreign XML elements and attributes not defined by the ODF XML schema. Such a document is known as an extended document.
This PR provides a pre-processing XML filter that either removes foreign elements, or uses their content, following the rules set out in
[Section 3.17 of the ODF specification part 3 (Schema)](https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part3-schema/OpenDocument-v1.3-os-part3-schema.html#__RefNumPara__623372_981565270).

- created an `ExtendedConformanceFilter`, a pre-processing XML filter that resolves the content of foreign elements/attributes or removes them;
- added a second validation method to `XmlValidator` that takes any XML filter; this allows use of the extended conformance filter and may also be useful for more complex filtered workflows;
- added an `isExtended` member variable to `OdfValidator` and new factory methods to instantiate extended and non-extended versions;
- added an `isExtended` member variable to `ValidatingParserImpl` and new factory methods to instantiate extended and non-extended versions;
- added an `isExtended` member variable to `ProfileImpl` and new factory methods to instantiate extended and non-extended versions;
- created an extended version of `ValidPackageRule` so that it supports both forms of validation;
- fixed reporting of foreign namespaces so they are reported as errors during normal validation, but as info messages during extended validation;
- added `EnumSets` of `OdfNamespaces` for the sets of local namespaces for ODF v1.0/1.1 and ODF v1.2 onwards;
- added tests for extended conformance;
- added a `-e`/`--extended` option to the CLI to invoke extended document validation;
- tidied up creation of SAX `XMLReader` and added to `XmlUtils` class; and
- added valid and invalid extended documents for testing.
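
To illustrate the general idea of a pre-processing SAX filter, here is a deliberately simplified sketch built on the standard `XMLFilterImpl`. It is not the `ExtendedConformanceFilter` itself: it only removes foreign elements (with their subtrees) and foreign attributes, and does not implement the case where a foreign element's content is used. The class name, the helper method and the set of known namespaces passed in are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Illustrative only: drops anything outside the supplied namespaces.
final class ForeignContentFilterSketch extends XMLFilterImpl {
    private final Set<String> knownNamespaces;
    private int skipDepth = 0; // > 0 while inside a foreign subtree

    ForeignContentFilterSketch(XMLReader parent, Set<String> knownNamespaces) {
        super(parent);
        this.knownNamespaces = knownNamespaces;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if (skipDepth > 0 || !knownNamespaces.contains(uri)) {
            skipDepth++; // swallow the foreign element and everything in it
            return;
        }
        AttributesImpl kept = new AttributesImpl();
        for (int i = 0; i < atts.getLength(); i++) {
            String attUri = atts.getURI(i);
            // Keep un-namespaced attributes and those in known namespaces.
            if (attUri.isEmpty() || knownNamespaces.contains(attUri)) {
                kept.addAttribute(attUri, atts.getLocalName(i), atts.getQName(i),
                        atts.getType(i), atts.getValue(i));
            }
        }
        super.startElement(uri, local, qName, kept);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (skipDepth > 0) {
            skipDepth--;
            return;
        }
        super.endElement(uri, local, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (skipDepth == 0) {
            super.characters(ch, start, length);
        }
    }

    // Helper for experimenting: returns the local names that survive filtering.
    public static List<String> elementNames(String xml, Set<String> known) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            ForeignContentFilterSketch filter = new ForeignContentFilterSketch(reader, known);
            List<String> names = new ArrayList<>();
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName, Attributes a) {
                    names.add(local);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Filtering failed", e);
        }
    }
}
```

Because the filter sits between the parser and the downstream handler, a validator that accepts any `XMLFilter` can run the same schema validation over the filtered event stream without knowing that foreign content was removed.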
@carlwilson carlwilson requested a review from Copilot November 13, 2025 11:25
@carlwilson carlwilson self-assigned this Nov 13, 2025
Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@carlwilson carlwilson merged commit 9f8ecf8 into main Nov 13, 2025
5 checks passed