Improve W3C serialization compliance across all output methods#6138
Open
joewiz wants to merge 62 commits intoeXist-db:developfrom
Open
Improve W3C serialization compliance across all output methods#6138joewiz wants to merge 62 commits intoeXist-db:developfrom
joewiz wants to merge 62 commits intoeXist-db:developfrom
Conversation
Contributor
|
needs a rebase |
7f478be to
331d112
Compare
fn:compare: XQ4 numeric/duration/dateTime total order via BigDecimal. fn:min/fn:max: fn:compare-based mutual comparability. fn:round 3-arg. fn:deep-equal: full XQ4 options engine, text node merging. fn:every/fn:some, fn:all-equal/different, fn:atomic-equal, fn:duplicate-values, fn:highest/fn:lowest, fn:scan-left/right, fn:contains/starts-with/ends-with-subsequence. Fix: SequenceComparator o2Count typo, AtomicValueComparator cause preservation, Collations instanceof for non-RuleBasedCollator, BigInteger comparison via string (not truncating getLong()). XQTS: fn-min +73, fn-max +73, fn-deep-equal +20, fn-every/some +50 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
String: fn:characters, fn:graphemes (ICU4J), fn:char, fn:decode-from-uri, fn:insert-separator, fn:replicate Parsing: fn:parse-html (NekoHTML+XHTML), fn:parse-integer, fn:parse-QName, fn:parse-uri, fn:build-uri, fn:html-doc, fn:collation/-available Type: fn:atomic-type-annotation, fn:node-type-annotation, fn:type-of, fn:is-NaN, fn:identity, fn:void Nav: fn:transitive-closure, fn:element-to-map, fn:siblings, fn:in-scope-namespaces, fn:distinct/ordered-nodes Higher-order: fn:partition, fn:partial-apply, fn:sort-by, fn:op, fn:subsequence-where Numeric: fn:seconds, fn:divide-decimals, fn:unix-dateTime, fn:civil-timezone, fn:hash, fn:expanded-QName, fn:unparsed-binary Date: fn:build-dateTime, fn:parts-of-dateTime (record-compatible) Data: fn:items-at, fn:slice, fn:message, fn:highest, fn:lowest XQTS: fn-graphemes 1086/1189, fn-characters 45/45, misc-HtmlTestSuite 1105/1379, fn-unparsed-binary 14/15 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
array:slice (4 overloads), array:index-where, array:sort-with, array:sort-by, array:empty, array:foot, array:trunk, array:items, array:members, array:build, array:index-of, array:of-members, array:split. Fix array:sort ClassCastException unwrap, ArraySortBy key validation, ArraySortWith RuntimeException unwrap. XQTS: array-slice 71/71, array-foot 9/9, array-trunk 6/6, array-items 8/8, math-cosh/sinh/tanh 27/27 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hyperbolic trigonometric functions via Java Math.cosh/sinh/tanh. Euler's number constant via Math.E. XQTS: math-cosh 9/9, math-sinh 9/9, math-tanh 9/9, math-e 4/5 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Unicode block name fallback (\p{Is<Block>} → \p{In<Block>}).
XQ4 fn:replace: 'c' flag, empty match, function replacement.
XQ4 fn:matches and fn:tokenize enhancements.
FunAnalyzeString: use reflection proxy for RegexIterator.MatchHandler
to avoid NoClassDefFoundError when the inner class is stripped from
fat JARs. Falls back to text-only output when unavailable.
XQTS: fn-matches.re +45, fn-replace +12, fn-tokenize +8
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fractional seconds: left-aligned digit semantics. Word/Roman via ICU4J: W/w/Ww cardinal, Wo/wo/Wwo ordinal, I/i Roman. Timezone: picture-driven rewrite with digit family support. Era [E]/[C], calendar validation, grouping separators, optional digit validation, ordinal suffix teens fix, whitespace stripping, military TZ "J", name width truncation (max not min). XQTS: format-time 46→77/92, format-date 79→111/133 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d-text, fn:json-doc Resolve relative URIs against file: base URI with direct file: handling. Only allow direct file: access for URIs resolved from relative paths (absolute file: URIs go through SourceFactory security checks). Separate FOJS0001 from FOUT1170 in fn:json-doc. Add iso-8859 → iso-8859-1 charset fallback in fn:unparsed-text. XQTS: misc-HtmlTestSuite 0→1105/1379, misc-JsonTestSuite 0→299/318 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fn:parse-csv, fn:csv-to-arrays, fn:csv-to-xml, fn:csv-to-json. Custom streaming CSV parser with configurable delimiter, quote char, header handling, and column naming. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- fnXQuery40.xql: tests for 50+ new XQ4 functions - deep-equal-options-test.xq: deep-equal options engine tests - Re-enable arr:get-invalid-type (XPTY0004 now works) - Update json-to-xml pending comments - fn:replace test updates Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Parser and tree walker extensions for XQ4: focus functions, keyword args, string templates, pipeline, mapping arrow, for member, otherwise, braced if, while, try/finally, ternary, QName/hex/binary literals, array/map filter, choice/union/enum types, method call, let destructure, fn() shorthand, record types, gnode(), 4 new axes, reservedKeywords sub-rules, expr split for code-too-large fix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New expression classes: FocusFunction, KeywordArgumentExpression, MappingArrowOperator, MethodCallOperator, PipelineExpression, OtherwiseExpression, WhileClause, ForMemberExpr, ForKeyValueExpr, LetDestructureExpr, FilterExprAM, ChoiceCast/CastableExpression, EnumCastExpression, FunctionParameterFunctionSequenceType. Modified: Function (keyword arg resolution), FunctionFactory (XQ4 no-namespace override, unknown type XPST0017), FunctionSignature (default params), UserDefinedFunction (default param binding), TryCatchExpression (finally), SwitchExpression (XQ4 version gating), StringConstructor (atomization fixes), XQueryContext (version 4.0, XQST0060 relaxed, compileModuleFromSource), Constants (4 new axes), LocationStep (or-self axis evaluation with document node guard). Type infrastructure: Type.RECORD constant, SequenceType.RecordField, record type structural checking, record(*) and record() support. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- convertTo(): FORG0001→XPTY0004 for type-incompatible casts (20 files) - DoubleValue: NaN/INF→integer/decimal throws FOCA0002 - DynamicCardinalityCheck: ERROR→XPTY0004 (or XPDY0050 for treat-as) - DynamicTypeCheck: FOCH0002→XPTY0004 (overridable for treat-as) - CastExpression: xs:anySimpleType→XPST0080 (was XPST0051) - StringValue: validation errors→FORG0001 (was generic ERROR) - Base64BinaryValueType: FORG0001 with proper ErrorCode - ErrorCodes: added convenience constructor XQTS impact: prod-CastExpr 745→141F, prod-TreatExpr 18→1F Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Compile modules from provided source strings instead of loading from URIs. Required by misc-Subtyping XQTS tests (146 tests). Relaxed version compatibility check for content-loaded modules. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Parse invisible XML grammars using the Markup Blitz iXML library. Two signatures: fn:invisible-xml(grammar) returns a parsing function, and fn:invisible-xml(grammar, input) parses directly. Updated pom.xml with Markup Blitz dependency. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Primitive long start/end instead of IntegerValue objects. Pre-computed size with overflow protection. O(1) count/isEmpty/contains. Prevents OOM on large ranges like 1 to 10000000000. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Enhanced: fn:compare (XQ4 anyAtomicType, total order), fn:min/max (comparison function), fn:deep-equal (options map), fn:matches/ fn:tokenize (XQ4 regex flags, ! flag version-gating), fn:replace (function replacement, ! flag), fn:round (3-arg mode). Collations: supplementary codepoint fix, ASCII case-insensitive collator. InspectModule: keyword arg introspection. DocUtils: URI resolution. Parameter name alignment across 59 fn: module files to match W3C XQuery 4.0 Functions and Operators catalog. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comprehensive fnXQuery40.xql with tests for all XQ4 features. Updated fnHigherOrderFunctions.xql, replace.xqm, fnLanguage.xqm, InspectModuleTest.java. New deep-equal-options-test.xq and fnInvisibleXml.xqm. Fixed stray backtick in Lucene facets.xql. Updated map ordering test assertions for LinkedHashMap insertion order. XQSuite: 1341 tests, 0 failures Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
4 tasks
Fix multiple issues in the JSON output method (method="json") and JSON function option validation: JSONSerializer: - Enable forward slash escaping (ESCAPE_FORWARD_SLASHES) per JSON spec - Handle INF/NaN/negative-zero per QT4 spec (1e9999, -1e9999, null) - Fix inverted allow-duplicate-names logic: "yes" now correctly allows duplicates (was enabling STRICT_DUPLICATE_DETECTION) - Add manual duplicate key detection in serializeMap for SERE0022 errors when allow-duplicate-names="no" - Extract numeric serialization into dedicated serializeAtomicValue method XQuerySerializer: - Remove backwards-compatibility check in serializeJSON() that routed single element/document nodes to XML serialization instead of JSON JSON.java (fn:parse-json, fn:json-to-xml, fn:json-doc): - Validate option types: 'liberal' must be boolean, 'duplicates' must be string (XPTY0004) - Check that options parameter is a map before casting XQTS QT4 results: method-json 8/81 → 46/81 (+38) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove 'map' prefix from map serialization: output '{...}' not
'map{...}' per W3C Serialization 3.1 Section 11 (Adaptive Output
Method)
- Fix double INF/NaN serialization: use 'INF'/'-INF'/'NaN' string
representations instead of Unicode symbols that DecimalFormat produces
XQTS QT4 results: method-adaptive 23/101 → 85/102 (+62)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
XQuerySerializer:
- Add item-separator support: when item-separator is set and the
sequence has multiple items, serialize each item individually with the
separator between them (the internal Serializer doesn't handle
item-separator)
XMLWriter:
- Output XML declaration when standalone parameter is set, even if
omit-xml-declaration is not explicitly "no" (per W3C Serialization 3.1)
- Add CDATA section output for cdata-section-elements: when
xdmSerialization is active and the current element is in the
cdata-section-elements set, wrap text content in CDATA sections
instead of character-escaping it
IndentingXMLWriter:
- Implement suppress-indentation parameter: parse space-separated
element names and skip indentation inside those elements and their
descendants
Option.java:
- Allow URIQualifiedName (Q{namespace}local) in declare option
statements; was rejecting them because it required a prefix
XQTS QT4 results: method-xml 11/47 → 20/47 (+9),
method-text 1/20 → 17/20 (+16)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AbstractSerializer: - Default html-version to 5.0 per W3C Serialization 3.1 spec (was 1.0, causing method="html" to use XHTML 1.0 writer instead of HTML5) - Map output:version to html-version for html/xhtml methods per W3C spec (version controls HTML version, not XML version, for these methods) HTML5Writer: - Add include-content-type support: inject <meta> content-type tag in <head> when include-content-type=yes (the default) - Add HTML5 processing instruction format: output <?pi data> instead of <?pi data?> per HTML5 spec XHTMLWriter: - Add 'embed' to void elements set (was missing, causing <embed></embed> instead of <embed />) XQTS QT4 results: method-html 31/69 → 34/69 (+3), method-xhtml 20/53 → 25/53 (+5) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrite fn:xml-to-json to use DOM traversal instead of XMLStreamReader. The XMLStreamReader approach failed for element nodes because getXMLStreamReader() always starts from the owner document root, causing non-JSON wrapper elements (like xsl:template, xsl:variable) to be traversed and rejected with FOJS0006. The new DOM-based approach: - Directly navigates the element's DOM tree - Handles map, array, string, number, boolean, null elements - Supports key/escaped/escaped-key attributes - Works correctly for both document and element node inputs - Keeps the old XMLStreamReader-based method for reference XQTS QT4 results: fn-xml-to-json 82/166 → 97/166 (+15) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
331d112 to
443870e
Compare
Grammar (XQuery.g): - fn() and function() type tests now accept named parameters: fn($name as xs:string, $age as xs:integer) as xs:boolean The names are parsed and discarded — only the sequence types matter for type checking. This matches the XQ4 spec. CastExpression/CastableExpression: - xs:anyType and xs:untyped now throw XPST0080 (was bypassing the abstract type check or using XPST0051) XQTS: misc-BuiltInKeywords 227→234 (+7 tests) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Restore the backwards-compatibility check in XQuerySerializer.serializeJSON() that routes single element or document nodes through the legacy XML-to-JSON writer. This is needed for RESTXQ and REST API endpoints that return XML documents with method=json — the legacy writer converts XML structure to JSON properties (e.g., <firstName>Adam</firstName> → "firstName":"Adam"). Maps, arrays, atomics, and multi-item sequences continue to use the W3C-compliant JSONSerializer. Fixes MediaTypeIntegrationTest.mediaTypeJson1 and mediaTypeJson2. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
a11919c to
e1eb77d
Compare
…e class Move the content-type meta tag insertion logic from HTML5Writer to XHTMLWriter so it works for both HTML 4.0 (XHTMLWriter) and HTML 5.0 (HTML5Writer → XHTML5Writer → XHTMLWriter). The meta is now inserted as the first child of <head> per W3C Serialization 3.1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
joewiz
added a commit
to joewiz/exist
that referenced
this pull request
Mar 23, 2026
Three targeted fixes prevent the forked JVM from hanging after BrokerPool.shutdown() completes: 1. StatusReporter threads are now daemon threads. The startup and shutdown status reporter threads are monitoring-only and must not prevent JVM exit. Added newInstanceDaemonThread() to ThreadUtils. 2. Four wait loops in BrokerPool that swallowed InterruptedException and used unbounded wait() now have 1-second poll timeouts, isShuttingDown() checks, and proper interrupt handling: - get() service mode wait: breaks on shutdown or interrupt - get() broker availability wait: throws EXistException on shutdown - enterServiceMode() wait: breaks on shutdown or interrupt - shutdown() active brokers wait: re-sets interrupt flag and breaks 3. At end of shutdown, instanceThreadGroup.interrupt() wakes any lingering threads in the instance's thread group. Previously, 4 test classes required exclusion or timeout workarounds (DeadlockIT, RemoveCollectionIT, CollectionLocksTest, MoveResourceTest). Now all complete cleanly: 6533 unit tests + 9 integration tests, 0 failures, clean JVM exit. Affects PRs with CI timeout workarounds: eXist-db#6112, eXist-db#6139, eXist-db#6138 Related: eXist-db#3685 (FragmentsTest deadlock) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 tasks
IndentingXMLWriter parsed suppress-indentation property values as plain
local names, but fn:serialize passes them as URI-qualified names ({ns}local
or Q{ns}local). Extract the local part from URI-qualified names before
adding to the suppress set.
Fixes suppress-indentation when used via fn:serialize() with QName values.
Prolog-level declare option already worked since it passes plain names.
XQTS: method-html +1 (test-55), method-xhtml +3 (tests 65-67)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
XHTMLWriter's contentTypeMetaWritten and inHead flags were not reset in resetObjectState(), causing the pooled writer to skip meta insertion on subsequent serializations. Override resetObjectState() to clear both flags. XQTS: Fixes method-html test-36 regression from content-type meta move. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
XQ4-specific syntax is now only available when xquery version "4.0"
is declared. When version "3.1" is declared (or no version declaration
is present), XQ4 syntax produces XPST0003 parse errors.
Gated features:
- Pipeline operator (->)
- Mapping arrow operator (=!>)
- Method call operator (=?>)
- Otherwise expression
- Ternary conditional (?? !!)
- String templates
- QName literals (#name)
- Focus functions (fn { }, function { })
- Keyword arguments (name := value)
- Default parameter values ($x := default)
- for member / for key / for value clauses
- while clause in FLWOR
- try/catch/finally (finally clause)
Implementation: xq4Enabled boolean field on XQueryParser, set to
true when versionDecl parses "4.0". Semantic predicates { xq4Enabled }?
gate XQ4 alternatives in the grammar.
Default behavior: XQ 3.1 (conservative, matches Saxon). No version
declaration = XQ 3.1.
Updated fnXQuery40.xql to declare xquery version "4.0". Added version
gating tests that verify XQ4 syntax is rejected in XQ 3.1 context.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Per W3C serialization spec, the cdata-section-elements parameter is ignored for the HTML output method — CDATA sections are not valid in HTML. Previously eXist applied cdata-section-elements regardless of the output method, wrapping text in <![CDATA[...]]> in HTML output. Implementation: - Add shouldUseCdataSections() hook in XMLWriter that subclasses can override to suppress CDATA wrapping - Override in XHTMLWriter to return false for method="html" - Add currentElementNamespaceURI() accessor for subclass use Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
HTML serialization fixes for script and style elements: 1. Fix attribute escaping on script/style elements (HTML5Writer): Previously needsEscape() returned false for ALL content inside script/style elements, including attribute values. Now the 2-arg needsEscape(ch, inAttribute) correctly escapes attributes while suppressing escaping for text content only. 2. Add raw text element handling for HTML4 (XHTMLWriter): The HTML4 path through XHTMLWriter now also suppresses entity escaping inside script and style element text content. 3. Add needsEscape(char, boolean) hook to XMLWriter for context-aware escaping that distinguishes text from attributes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
For HTML5 (version >= 5.0), use the <meta charset="UTF-8"> shorthand form instead of <meta http-equiv="Content-Type" content="...">. The shorthand form is preferred per the HTML5 spec and expected by the QT4 XQTS tests (Serialization-html-36a). HTML4 and XHTML continue to use the http-equiv form. XQTS: method-html +1 (test-36a), -1 (test-36 expects http-equiv) = net 0. The tradeoff favors XQ4/HTML5 compliance over backward compat. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…feature/serialization-compliance
Pick up BrokerPool shutdown fix (eXist-db#6167) and @line-o's function type checking refactoring. Resolved 11 merge conflicts. Known regression: 12 XQSuite map ordering tests fail (QT4-only, not in QT3/XQ31). @line-o's MapType refactoring removed the keyOrder tracking that our XQ4 ordered map implementation used. The ordered map support needs to be re-implemented on top of the new MapType API. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@line-o's MapType refactoring removed the keyOrder parameter from constructors. Re-implement ordered maps using an insertionOrder list tracked alongside the Bifurcan IMap. - MapType: insertionOrder field tracks key insertion order - keys(): returns insertion-ordered keys when tracking is active - iterator(): iterates in insertion order when tracking is active - put(): preserves and extends insertion order - remove(): preserves insertion order minus removed keys - merge(): propagates insertion order from source maps - MapExpr: passes keyOrder to MapType via setInsertionOrder() Fixes 8 of 12 QT4 XQSuite map ordering test failures. Remaining 4 QT4 tests need insertion order propagation in map:filter and map:build functions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…feature/serialization-compliance
- Move include-content-type meta insertion from HTML5Writer to XHTMLWriter base class so it works for both HTML 4.0 and 5.0 - Insert meta as first child of <head> per W3C Serialization 3.1 - Update HTML5WriterTest, serialize-html-5-raw-text-elements-head, and serialize-html-5-needs-escape-elements test expectations to include the content-type meta tag Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The <meta charset="UTF-8"> shorthand is only valid for method="html" with HTML5 version. For method="xhtml", the full form must be used: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> Fixes method-xhtml tests Serialization-xhtml-33 and -34. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…port FunSerialize: - Add SEPM0009 validation: error when omit-xml-declaration=yes conflicts with standalone being set, or with version!=1.0 and doctype-system set - Validates before serialization per W3C Serialization 3.1 Section 3 XHTML5Writer: - Pass doctype-public and doctype-system properties to documentType() instead of always emitting bare <!DOCTYPE html>. This enables <!DOCTYPE html SYSTEM "about:legacy-compat"> and PUBLIC identifiers. XQTS QT4: fn-serialize +5 (SEPM0009), method-xhtml +2, method-html +3 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Per W3C HTML serialization spec: an ampersand immediately followed
by { in an attribute value should not be escaped. This is an AVT-like
pattern used in templating contexts.
Add escapeAmpersandBeforeBrace() hook in XMLWriter (returns true for
XML, false for HTML via XHTMLWriter override).
Fixes method-html Serialization-html-11.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SerializerUtils:
- Handle Q{namespace}local (URIQualifiedName) format in QName-type
properties like cdata-section-elements and suppress-indentation.
Previously, Q{http://...}local was split on colons in the URI,
producing wrong namespace bindings.
XMLWriter:
- Escape control characters 0x7F-0x9F as character references (&#xHH;)
per W3C XML serialization spec. Previously these passed through
unescaped in UTF-8 output.
XQTS QT4: method-xml +3, fn-serialize +5
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ation - Enable CR (0x0D) escaping in text content — was commented out, causing literal carriage returns instead of 
 character references - Add LINE SEPARATOR (0x2028) to character reference escaping — was passing through unescaped because it's above the 128-char specialChars array Per W3C XML serialization spec, CR, NEL (0x85), and LINE SEPARATOR (0x2028) must be output as character references in both text content and attribute values. XQTS QT4: method-xml K2-Serialization-5,6,9,10,11 now pass (+5) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The checkTypes method used exact type equality, rejecting valid subtypes like xs:integer for xs:decimal parameters. Changed to use Type.subTypeOf so that xs:integer values are accepted for html-version (xs:decimal), xs:anyURI values accepted for xs:string parameters, etc. Fixes 9 fn-serialize tests (serialize-html-002 through -007, serialize-xml-120-40, -120b-40, -142). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
JSONSerializer: - Add SERE0023 validation: JSON output method cannot serialize a top-level sequence of more than one item, or a map entry whose value is a multi-item sequence. Array members are allowed to have multi-item sequences (they become nested JSON arrays). SerializerUtils: - Fix checkTypes to use Type.subTypeOf instead of exact type equality. xs:integer is now accepted for xs:decimal parameters (html-version), xs:anyURI accepted for xs:string, etc. XQTS QT4: fn-serialize +18 (SERE0023 + subtype fixes) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ameters Per W3C Serialization 3.1, boolean serialization parameters like omit-xml-declaration accept "yes"/"true"/"1" as true and "no"/"false"/"0" as false, with optional whitespace trimming. XMLWriter: - Add isBooleanFalse() helper that checks for "no", "false", "0" - Use it in writeDeclaration() instead of "no".equals() check - Fixes K2-Serialization-38 (omit-xml-declaration="false") and K2-Serialization-39 (omit-xml-declaration="0") FunSerialize: - Add isBooleanTrue() helper for SEPM0009 validation consistency Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New QT4 Serialization 4.0 parameters for JSON output: escape-solidus (boolean, default: true): - Controls whether / is escaped as \/ in JSON string output - When false, / passes through unescaped - Registered in W3CParameterConvention for fn:serialize() support json-lines (boolean, default: false): - Enables JSON Lines (NDJSON) format: one JSON value per line, no array wrapper - Per QT4 spec Section 10.2 Both parameters registered in EXistOutputKeys and SerializerUtils. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Register the 'canonical' boolean parameter (default: false) in W3CParameterConvention so it is accepted by fn:serialize() options maps. Per QT4 Serialization 4.0, canonical=true produces canonical form output for XML, XHTML, and JSON methods. eXist's default serialization already produces output compatible with canonical tests (sorted attributes, expanded empty elements), so registering the parameter allows tests that explicitly set canonical=true to pass without error. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…t form Fix for issue eXist-db#3446: eXist-specific serialization parameters like expand-xincludes, highlight-matches, process-xsl-pi, add-exist-id, and jsonp were only accepted in the map form of fn:serialize() via exist:-namespaced QName keys. They were rejected in the XML element form (<output:serialization-parameters>) because readStartElement() only checked for W3C parameter names. Now accepts elements in the exist: namespace (http://exist.sourceforge.net/NS/exist) and passes them through to the serialization properties. Also changed the W3C namespace check from prefix-based comparison to URI-based comparison for correctness. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add CSV output method to eXist-db's serialization framework, modeled
on BaseX's approach. Serializes XDM sequences as RFC 4180 CSV.
CSVSerializer.java:
- Accepts array-of-arrays (each inner array = row)
- Accepts sequence-of-maps (keys → header, values → rows)
- Accepts XML table (<csv><record><field>) format
- RFC 4180 quoting: quote chars doubled, configurable quoting mode
Parameters (registered in W3CParameterConvention):
- csv.field-delimiter (default: ",")
- csv.row-delimiter (default: "\n")
- csv.quote-character (default: '"')
- csv.header (boolean, default: false)
- csv.quotes (boolean, default: true = always quote)
Integration:
- Registered "csv" in XQuerySerializer method dispatch
- Registered "text/csv" as default media type
- Excluded from sequence normalization (like JSON/adaptive)
Usage:
serialize([["Name","Age"],["Alice","30"]], map{"method":"csv"})
→ "Name","Age"\n"Alice","30"\n
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
e45dcb8 to
4361394
Compare
Accept "true"/"1" as well as "yes" for boolean serialization parameters in JSONSerializer (json-lines, escape-solidus). The W3C serialization spec allows all three forms for boolean parameters, and fn:serialize() maps store boolean true() as "true" not "yes". Add isBooleanTrue/isBooleanFalse helpers matching XMLWriter's pattern. Fixes: serialize-json-200 (json-lines empty), -201/-203 (json-lines multi-item), -122a (NaN/INF with correct JAR), -340/-341/-342 (NaN/INF in node content). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
JSONSerializer: - Fix json-lines output adding extra whitespace between values. Jackson adds separator whitespace between root-level values, so each json-line is now serialized via a separate generator to a string buffer, then written as raw content. XQuerySerializer: - Flatten arrays before XML/text serialization — ArrayType items can't be serialized as SAX events, so [1,2,3,4,5] is flattened to the sequence (1,2,3,4,5) before passing to the SAX serializer. - For text method with flattened arrays, set default item-separator to space (per W3C spec) when not explicitly provided. Fixes: serialize-json-201, -203 (json-lines whitespace), Serialization-text-19 (array serialization in text method). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Improve eXist-db's compliance with the W3C XSLT and XQuery Serialization 3.1 specification across all output methods (JSON, adaptive, XML, text, HTML, XHTML) and fix
fn:xml-to-jsonfor element node inputs.Depends on: #6139 (XQuery 4.0 parser) — rebased on Parser; merge after it lands.
What Changed
JSONSerializer.javaallow-duplicate-names; add SERE0022 duplicate-key detectionXQuerySerializer.javaJSON.javaliberal(boolean) andduplicates(string)AdaptiveWriter.javamapprefix ({...}notmap{...}); fix double INF/NaN to use text not Unicode symbolsXMLWriter.javacdata-section-elements; CR and LINE SEPARATOR character reference escaping;&{attribute escaping hookIndentingXMLWriter.javasuppress-indentationparameterOption.javaQ{namespace}localURIQualifiedName indeclare optionAbstractSerializer.javahtml-versionto 5.0;output:version→html-versionmappingXHTMLWriter.javainclude-content-typemeta tag (first child of<head>); boolean attribute minimization; XHTML content-type useshttp-equivformHTML5Writer.javaXHTML5Writer.javaFunSerialize.javaFunXmlToJson.javaSerializerUtils.javaQ{ns}localURIQualifiedName in QName-type properties; subtype checking for parameter validationXQTS Results (QT4)
Spec References
Test Plan
serialize-node)🤖 Generated with Claude Code