Implement XQuery 4.0 core functions (fn:, array:, map:, math:) #6218

Open
joewiz wants to merge 67 commits into eXist-db:develop from joewiz:v2/xq4-core-functions

Conversation

@joewiz
Member

@joewiz joewiz commented Apr 6, 2026

Summary

This PR adds 82 new or updated functions across the fn:, array:, map:, and math: namespaces for XQuery 4.0. It also version-gates the fn:replace empty-match behavior, so the XQ4 behavior applies only in 4.0 mode.

Additionally includes XQTS QT4 compliance fixes from 6 focused sessions targeting specific failure categories, plus a comprehensive fn: signature audit against W3C XQuery 4.0 PR197.

What Changed

Core XQ4 functions (original)

  • 58 fn: functions (compare, characters, contains-subsequence, identity, trunk, foot, slice, etc.)
  • 13 array: functions (build, index-of, index-where, of-members, slice, sort-by, sort-with, split, etc.)
  • 7 map: functions (build, filter, of-pairs, etc.)
  • 4 math: functions (cosh, sinh, tanh, cbrt)
  • fn:replace/fn:tokenize empty-match: gated on xquery version "4.0"
  • fn:doc multi-arity fix

XQTS compliance fixes

  • Regex compliance: Validate XPath regex patterns before Java regex fallback; support XQ4 lookaround translation
  • Keyword arguments + collation: Rename function parameter names to match XQ4 F&O spec; implement fn:collation(map) overload
  • fn:element-to-map: Fix XQ4 conformance for fn:element-to-map and fn:element-to-map-plan
  • fn:unparsed-text: Improve conformance and add function type checking
  • Function call/ref: Fix reserved function name handling, context item availability, and type promotions
  • Variable declaration error codes: Fix XPST0008/XQST0049 alignment with XQuery 3.1 spec

fn: Signature Audit (W3C XQuery 4.0 PR197)

Audit and fix FunctionSignature definitions across org.exist.xquery.functions.fn and .array against the W3C XQuery 4.0 spec. The QT4 keyword test set (misc-BuiltInKeywords) checks that fn:foo(arg := ?) instance of function(...) as ... matches the function-item signature W3C declares.

Signature changes:

  • collation parameters widened from xs:string (exactly-one) to xs:string?: fn:contains, fn:ends-with, fn:starts-with, fn:compare, fn:distinct-values, fn:index-of, fn:substring-before, fn:substring-after, fn:collation-key, fn:string-join's separator, array:index-of.
  • options map parameters widened to map(*)?: fn:doc, fn:doc-available, fn:csv-to-arrays, fn:parse-csv, fn:csv-to-xml, fn:csv-doc, fn:parse-xml, fn:parse-xml-fragment, fn:path.
  • trailing optional length / action widened to ?: fn:substring length, fn:subsequence length (also start/length type now xs:numeric), array:subarray length, array:build action.
  • typed function/array/record/map returns declared for: fn:op, fn:invisible-xml, fn:analyze-string, fn:element-to-map, fn:function-annotations, fn:csv-to-arrays, fn:divide-decimals, fn:in-scope-namespaces, fn:transitive-closure, array:members, fn:transform, fn:collation.

Type-checker support (SequenceType.isSubtypeOf):

  • Choice (union) type on the sub side: every alternative must be a subtype of sup.
  • Bare map(*) treated as map(xs:anyAtomicType, item()*) in subtype checks (was map(xs:string, item()*), which contradicts XQ 4.0).
  • Bare array(*) now treated as array(item()*).
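
The choice-type rule above can be illustrated with a toy type model. This is a minimal sketch, assuming Java 17 or later; `SubtypeSketch`, `Atomic`, `Choice`, and the three-entry type hierarchy are hypothetical and are not eXist's `SequenceType` API. The rule for a choice on the sup side (at least one alternative must accept sub) is an assumption added for completeness.

```java
import java.util.List;
import java.util.Map;

final class SubtypeSketch {
    interface Type {}
    record Atomic(String name) implements Type {}
    record Choice(List<Type> alternatives) implements Type {}

    // Hypothetical, heavily truncated atomic type hierarchy (child -> parent).
    private static final Map<String, String> PARENT = Map.of(
            "xs:integer", "xs:decimal",
            "xs:decimal", "xs:anyAtomicType",
            "xs:string", "xs:anyAtomicType");

    private SubtypeSketch() {}

    static boolean isSubtypeOf(final Type sub, final Type sup) {
        if (sub instanceof Choice subChoice) {
            // Choice on the sub side: every alternative must be a subtype of sup.
            return subChoice.alternatives().stream().allMatch(alt -> isSubtypeOf(alt, sup));
        }
        if (sup instanceof Choice supChoice) {
            // Assumption: a non-choice sub needs at least one matching alternative of sup.
            return supChoice.alternatives().stream().anyMatch(alt -> isSubtypeOf(sub, alt));
        }
        return atomicSubtype((Atomic) sub, (Atomic) sup);
    }

    // Walks the parent chain of sub until it reaches sup or runs out of ancestors.
    private static boolean atomicSubtype(final Atomic sub, final Atomic sup) {
        for (String t = sub.name(); t != null; t = PARENT.get(t)) {
            if (t.equals(sup.name())) {
                return true;
            }
        }
        return false;
    }
}
```

With this model, a choice of `xs:integer` and `xs:string` is a subtype of `xs:anyAtomicType` but not of `xs:decimal`, because the `xs:string` alternative fails the check.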

Implementation adjustments:

  • CollatingFunction.getCollator treats empty xs:string? collation as default collator.
  • FunSubstring, FunSubSequence, FunAnalyzeString, ArrayFunction.subArray handle empty optional-length as no-length.
  • ArrayBuild treats empty action as identity.
  • FunResolveURI falls back to static base URI when explicit base is empty.
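
As an illustration of "empty optional-length as no-length", the sketch below models the optional argument with `OptionalDouble`. It is a simplified stand-in, not the code in `FunSubstring`: `OptionalLengthSketch` is a hypothetical name, and `Math.floor(x + 0.5)` only approximates `fn:round`.

```java
import java.util.OptionalDouble;

final class OptionalLengthSketch {
    private OptionalLengthSketch() {}

    /** Returns the characters at 1-based positions p with round(start) <= p < round(start) + round(length). */
    static String substring(final String s, final double start, final OptionalDouble length) {
        // Rounds half up; a NaN start makes every comparison below false.
        final double first = Math.floor(start + 0.5);
        // An absent length behaves like "no upper bound".
        final double end = length.isPresent()
                ? first + Math.floor(length.getAsDouble() + 0.5)
                : Double.POSITIVE_INFINITY;
        final int[] codePoints = s.codePoints().toArray();
        final StringBuilder out = new StringBuilder();
        for (int i = 0; i < codePoints.length; i++) {
            final int position = i + 1;
            if (position >= first && position < end) {
                out.appendCodePoint(codePoints[i]);
            }
        }
        return out.toString();
    }
}
```

Passing `OptionalDouble.empty()` therefore gives the same result as calling the two-argument form of the function.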

XQTS impact from signature audit:

| Test set | Before | After | Δ |
| --- | --- | --- | --- |
| misc-BuiltInKeywords | 76 fails | 31 fails | −45 |
| misc-Subtyping | 26 | 25 | −1 |
| fn-collation-key | 10 | 7 | −3 |
| fn-string-join | 2 | 0 | −2 |
| fn-resolve-uri | 3 | 1 | −2 |
| fn-substring* | 6 | 1 | −5 |

Codacy cleanup

  • 12 low-hanging Codacy warnings fixed (CollapsibleIfStatements, field ordering, unused methods/params, boolean simplification)
  • NPath decomposition for FnBuildUri (24M→clean), LetExpr (418K→clean), CsvParser (27K→clean), FnHash (276→clean), CastExpression (2K→clean), TryCatchExpression (3K→clean), FnHighestLowest (35K→clean), SequenceType (240→clean)

Spec references

Test plan

  • exist-core unit tests pass (0 failures)
  • XQ4 function tests pass
  • fn:replace with empty match works in 4.0 mode, rejected in 3.1 mode
  • XQTS QT4 misc-BuiltInKeywords 76→31 fails
  • Codacy PMD clean on changed files

🤖 Generated with Claude Code

@joewiz joewiz force-pushed the v2/xq4-core-functions branch 2 times, most recently from 00e6756 to 17a7067 Compare April 13, 2026 13:26
joewiz added a commit to joewiz/exist that referenced this pull request Apr 14, 2026
… update statuses

- Replace closed eXist-db#6213 (v2/jetty-12-upgrade) with eXist-db#6145 (feature/websocket-core)
- Add CI Health Note explaining known noise: integration hangs, container image
  HTTP 502, XQTS runner Saxon 12 crash, and complementary empty-match failures
  in eXist-db#6212/eXist-db#6218
- Update XQTS runner: eXist-db#45 closed, eXist-db#49 is the active PR
- Update cross-repo PR table accordingly
- Update "Also Ready to Merge" table: mark eXist-db#6142, eXist-db#6146 merged; eXist-db#6186 superseded
  by eXist-db#6224; correct eXist-db#6087 approver; add status notes for eXist-db#6182, eXist-db#6184

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@joewiz joewiz marked this pull request as ready for review April 14, 2026 13:43
@joewiz joewiz requested a review from a team as a code owner April 14, 2026 13:43
@joewiz
Member Author

joewiz commented Apr 14, 2026

[This response was co-authored with Claude Code. -Joe]

CI state: 6/9 checks pass. Of the 3 failures:

  • 2 are pre-existing integration test hangs (ubuntu, windows)
  • 1 (ubuntu unit) is replace.empty-match-allowed — in XQ4 mode, empty-match patterns should be permitted, but the underlying capability requires Saxon 12 (from #6212). These two PRs have complementary failures that cancel out when merged together. See the CI Health Note for details.

Dependencies: Wave 4. Must merge after all Wave 3 grammar PRs (particularly v2/xquery-4.0-parser #6216). Pairs with v2/saxon-12-upgrade (#6212) for the fn:replace empty-match version-gating.

For full context on all 7.0 PRs and the merge order, see the Reviewer Guide.

@duncdrum duncdrum added the enhancement (new features, suggestions, etc.), xquery (issue is related to xquery implementation), and XQ4 (xquery 4) labels Apr 14, 2026
@duncdrum duncdrum added this to v7.0.0 Apr 14, 2026
@duncdrum duncdrum moved this to Backlog in v7.0.0 Apr 14, 2026
@duncdrum duncdrum added the blocked (blocked by a 3rd party) label Apr 14, 2026
Comment on lines 269 to 280
  final String version = v.getText();
- if (version.equals("3.1")) {
+ if (version.equals("4.0")) {
+     context.setXQueryVersion(40);
+ } else if (version.equals("3.1")) {
      context.setXQueryVersion(31);
  } else if (version.equals("3.0")) {
      context.setXQueryVersion(30);
  } else if (version.equals("1.0")) {
      context.setXQueryVersion(10);
  } else {
-     throw new XPathException(v, ErrorCodes.XQST0031, "Wrong XQuery version: require 1.0, 3.0 or 3.1");
+     throw new XPathException(v, ErrorCodes.XQST0031, "Wrong XQuery version: require 1.0, 3.0, 3.1, or 4.0");
  }
Member

use switch expression
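
For the version check commented on here, the suggestion could look roughly like the sketch below. It assumes Java 14 or later; `XQueryVersions` and `toVersionCode` are hypothetical names, and `IllegalArgumentException` stands in for the `XPathException` with `ErrorCodes.XQST0031` that the real code throws.

```java
final class XQueryVersions {
    private XQueryVersions() {}

    /** Maps the version string of a version declaration to the numeric code used by the context. */
    static int toVersionCode(final String version) {
        return switch (version) {
            case "4.0" -> 40;
            case "3.1" -> 31;
            case "3.0" -> 30;
            case "1.0" -> 10;
            // Stand-in for: throw new XPathException(v, ErrorCodes.XQST0031, ...)
            default -> throw new IllegalArgumentException(
                    "Wrong XQuery version: require 1.0, 3.0, 3.1, or 4.0");
        };
    }
}
```

The caller would then reduce to a single `context.setXQueryVersion(toVersionCode(version))` call, and the compiler enforces that every branch either yields a value or throws.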

final int cp = input.codePointAt(i);
final int cpLen = Character.charCount(cp);

switch (state) {
Member

use switch expression

}

private static boolean isDefaultPort(final String scheme, final int port) {
switch (scheme) {
Member

use switch expression

*/
private void validateOptionValue(final String mapKey, final String uriParam, final String val)
throws XPathException {
switch (uriParam) {
Member

use switch expression

}
}

switch (layout) {
Member

use switch expression

return Constants.EQUAL;
}

switch (item1.getType()) {
Member

Use switch expression

org.w3c.dom.Node child = parent.getFirstChild();
while (child != null) {
final int type = getEffectiveNodeType(child);
switch (type) {
Member

Use switch expression

// else: trailing colon with no minutes is allowed (test 47, 60)
} else {
// No colon: split based on number of digits
switch (totalDigits) {
Member

Use switch expression

Member

Use switch expression

Member

Use text blocks where multiline strings are used

joewiz and others added 13 commits April 30, 2026 18:15
Broadened signatures and improved compliance for existing functions:

- fn:compare: broadened to anyAtomicType, numeric total order,
  duration/datetime ordering per QT4 spec
- fn:deep-equal: text node merging across comments/PIs, options map
  support, BigInteger overflow fix for fn:round
- fn:head/fn:tail: added XQ4 counterparts fn:foot/fn:trunk (last item / all but the last item)
- fn:max/fn:min: duration comparison support, decimal precision
- fn:doc/fn:doc-available: security-gated file:// URI resolution,
  fn:doc#2 overload
- fn:unparsed-text: BOM stripping, fn:unparsed-text-lines fix
- fn:matches/fn:replace/fn:tokenize: regex enhancements
- fn:path#2: output format parameter
- fn:analyze-string: reflection proxy for Saxon compatibility
- fn:parse-json: option validation, empty sequence args, xs:integer
  for JSON integers
- fn:load-xquery-module: content option (XQ4)
- fn:format-number: negative exponent zero-padding, map overload,
  char:rendition pattern
- fn:format-date/fn:format-time: comprehensive improvements
- Collations: supplementary codepoint comparison
- RangeSequence: primitive long storage optimization
- Error codes: W3C alignment across casting and value types
- JSON serialization: XDM mode bypass fix, duplicate key detection

Spec: QT4 XQuery 4.0 §14 (Functions and Operators)
XQTS: +111 tests across method-json, fn-compare, fn-deep-equal,
      fn-round, fn-max/fn-min test sets

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New function implementations in the fn: namespace:

Sequence functions: fn:characters, fn:identity, fn:void, fn:foot,
fn:trunk, fn:slice, fn:items-at, fn:replicate, fn:insert-separator,
fn:all-equal, fn:all-different, fn:duplicate-values, fn:index-where,
fn:take-while, fn:distinct-ordered-nodes, fn:siblings

Higher-order functions: fn:every, fn:some (function form),
fn:highest, fn:lowest, fn:sort-by, fn:sort-with, fn:partition,
fn:scan-left, fn:scan-right, fn:subsequence-where,
fn:transitive-closure, fn:partial-apply, fn:op

String/URI functions: fn:char, fn:graphemes, fn:decode-from-uri,
fn:parse-uri, fn:build-uri, fn:expanded-QName, fn:parse-QName,
fn:parse-integer, fn:divide-decimals

Date/Time functions: fn:civil-timezone, fn:build-dateTime,
fn:parts-of-dateTime, fn:unix-dateTime, fn:seconds

Type functions: fn:schema-type, fn:atomic-type-annotation,
fn:node-type-annotation, fn:element-to-map, fn:element-to-map-plan,
fn:type-of, fn:is-NaN

Context functions: fn:get, fn:collation, fn:collation-available,
fn:message

Parsing functions: fn:parse-html (Validator.nu HTML5 parser),
fn:invisible-xml (Markup Blitz iXML parser), fn:parse-csv,
fn:csv, fn:html-doc, fn:unparsed-binary

Data functions: fn:hash, fn:function-annotations,
fn:function-identity, fn:in-scope-namespaces

Also: DeepEqualOptions class for fn:deep-equal options map support,
FnModule registrations for all new functions.

Spec: QT4 XQuery 4.0 §14 (Functions and Operators)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Array module (8 new functions):
- array:build, array:index-of, array:index-where, array:of-members,
  array:split, array:sort-by, array:sort-with, array:slice
- Plus: array:get#3 with default value

Map module (5 new functions):
- map:build, map:items, map:entries, map:filter, map:keys-where
- Plus: map:get#3 with default value, map:empty

Math module (4 new functions):
- math:cosh, math:sinh, math:tanh, math:e
- Plus: math:pow edge case fixes

Spec: QT4 XQuery 4.0 §17 (Array Module),
      QT4 XQuery 4.0 §16 (Map Module),
      QT4 XQuery 4.0 §18 (Math Module)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Build configuration:
- exist-parent/pom.xml: Add markup-blitz 1.10 (fn:invisible-xml),
  htmlparser 1.4.16 (fn:parse-html via Validator.nu)
- exist-core/pom.xml: Add markup-blitz and htmlparser dependencies
- .gitignore: Ignore iXML grammar cache files

Format improvements:
- FnFormatDates: comprehensive format-date/format-time improvements
- FnFormatNumbers: map overload, char:rendition pattern, negative
  exponent zero-padding fix

Tests:
- fnXQuery40.xql: XQSuite tests for XQ4 functions
- fnInvisibleXml.xqm: fn:invisible-xml test suite
- format-number-map.xql: fn:format-number map overload tests
- deep-equal-options-test.xq: fn:deep-equal options map tests
- Updated: fnLanguage.xqm, json-to-xml.xql, replace.xqm

Spec: QT4 XQuery 4.0 §14 (Functions and Operators)
XQTS: 732/861 (85.0%) for XQ4-specific test sets

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d fn:tokenize

Without the ! flag, empty-matching patterns raise FORX0003 in both XQ 3.1
and XQ 4.0 mode. With the ! flag in XQ 4.0, fn:replace uses the Java regex
fallback and fn:tokenize tokenizes between each character.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
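
The gating described in this commit message can be sketched with `java.util.regex` alone. This is an illustration under stated assumptions: `EmptyMatchGate` and its methods are hypothetical names, the version is the numeric code (31, 40) used elsewhere in this PR, `IllegalArgumentException` stands in for an `XPathException` carrying FORX0003, and XPath-specific replacement-string rules are ignored.

```java
import java.util.regex.Pattern;

final class EmptyMatchGate {
    private EmptyMatchGate() {}

    /** True if the pattern can match the zero-length string. */
    static boolean matchesEmptyString(final String regex) {
        return Pattern.compile(regex).matcher("").matches();
    }

    /** Empty-matching patterns are allowed only in XQ 4.0 mode and only with the '!' flag. */
    static boolean isAllowed(final String regex, final String flags, final int xqueryVersion) {
        if (!matchesEmptyString(regex)) {
            return true;
        }
        return xqueryVersion >= 40 && flags.indexOf('!') >= 0;
    }

    static String replace(final String input, final String regex, final String replacement,
                          final String flags, final int xqueryVersion) {
        if (!isAllowed(regex, flags, xqueryVersion)) {
            // Stand-in for raising FORX0003.
            throw new IllegalArgumentException("FORX0003: pattern matches a zero-length string");
        }
        // Java regex fallback: Matcher.replaceAll also replaces zero-length matches.
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }
}
```

For example, `replace("abc", "x*", "-", "!", 40)` yields `-a-b-c-`, while the same call with version 31 or without the `!` flag throws.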
… context

The XQ4 duration ordering version gate in DurationValue.compareTo()
checked getExpression().getContext().getXQueryVersion() >= 40, but
DurationValues created at runtime had null expression references,
causing the gate to always block ordering even in XQ4 mode.

Fix: propagate expression context from comparison operators to
atomized values. Add AtomicValue.setExpression() and version-gated
max()/min() for DurationValue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mezone options to fn:deep-equal

Extends the XQuery 4.0 fn:deep-equal options map with four new options:

- items-equal: function(item(), item()) as xs:boolean? callback for custom
  item equality. Returns empty sequence to fall back to default comparison.
- unordered-elements: xs:QName* listing elements whose children are compared
  as multisets rather than ordered sequences.
- normalization-form: Unicode normalization (NFC/NFD/NFKC/NFKD) applied to
  string comparisons.
- timezone: xs:dayTimeDuration implicit timezone override for date/time
  comparison (parsed but not yet applied in comparison logic).

Also adds proper cleanup of FunctionReference via close() method on
DeepEqualOptions, called from FnDeepEqualOptions.eval() in a finally block.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…p:items

XQuery 4.0 conformance fixes for array and map module functions:

- map:build: Skip items with empty key (NPE fix), combine duplicate key
  values into sequences, support multi-key returns (PR1041)
- map:items: Return flat value sequence instead of entry maps; split from
  map:entries which retains the record-style return
- map:for-each, map:filter, map:keys-where: Support XQ4 arity coercion —
  callbacks may accept 0-3 args (key, value, position) per PR2225
- array:filter: Support XQ4 arity coercion — callbacks may accept 0-2
  args (member, position)
- array:for-each-pair: Support XQ4 arity coercion — callbacks may accept
  0-3 args (memberA, memberB, position) per PR2225
- array:get: Use long comparison to prevent integer overflow for indices
  exceeding Integer.MAX_VALUE (fixes FOAY0001 for large indices)
- map:merge: Accept empty sequence for options parameter (XQ4)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
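
The array:get overflow mentioned in this commit message is easy to reproduce in isolation. The sketch below assumes the original bug was a narrowing cast from `long` to `int` before the bounds check, which the commit message does not state explicitly; `ArrayIndexSketch` and both methods are hypothetical.

```java
final class ArrayIndexSketch {
    private ArrayIndexSketch() {}

    /** Buggy variant: the narrowing cast wraps large positions back into the int range. */
    static boolean inBoundsNarrowing(final long position, final int size) {
        final int p = (int) position; // 4294967297L (2^32 + 1) becomes 1
        return p >= 1 && p <= size;
    }

    /** Fixed variant: compare as long, so oversized positions stay out of bounds. */
    static boolean inBounds(final long position, final int size) {
        return position >= 1 && position <= size;
    }
}
```

With a three-member array, position 4294967297 passes the narrowing check and would silently return the first member, whereas the `long` comparison rejects it so the caller can raise FOAY0001.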
… XQ4

Per QT4CG PR #1481, the 7 fn:*-from-dateTime component extraction
functions now accept any Gregorian date/time type (xs:date, xs:time,
xs:gYear, xs:gYearMonth, xs:gMonth, xs:gMonthDay, xs:gDay) in XQuery
4.0 mode. When the requested component is absent (e.g., hours from
xs:date), the function returns the empty sequence.

Version-gated: in XQ 3.1 mode, these functions still require strict
xs:dateTime input. The fn:*-from-date and fn:*-from-time signatures
are unchanged per the XQ4 spec.

Includes 35 XQSuite tests covering component presence/absence across
all Gregorian types, plus regression coverage for xs:dateTime inputs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
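
The "absent component returns the empty sequence" behavior has a close analogue in `java.time`, which is used below purely for illustration. eXist's own date/time value classes are not shown, and `ComponentSketch` is a hypothetical name.

```java
import java.time.temporal.ChronoField;
import java.time.temporal.TemporalAccessor;
import java.util.OptionalInt;

final class ComponentSketch {
    private ComponentSketch() {}

    /** Returns the requested component, or empty when the value's type does not carry it. */
    static OptionalInt component(final TemporalAccessor value, final ChronoField field) {
        return value.isSupported(field)
                ? OptionalInt.of(value.get(field))
                : OptionalInt.empty();
    }
}
```

Asking a `LocalDate` for `HOUR_OF_DAY` gives an empty result, much as the hours component of an xs:date does in XQ 4.0 mode, while a `LocalDateTime` returns the hour as before.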
…t XQ4 lookaround

Add pre-validation of regex patterns in fn:matches and fn:replace to reject
constructs that are not part of the XPath regular expression specification
(F&O 3.1, Section 5.6.1) but that Saxon's XP30 mode silently accepts.

Rejected constructs include:
- \x, \u hex/unicode escapes (not in XPath regex)
- \A, \Z, \z Java-specific anchors
- \b, \B word boundary assertions
- \a, \e, \f, \v special character escapes
- \Q, \E literal quoting
- \G, \k, \g named/numbered back-references
- (?=...) (?!...) (?<=...) (?<!...) Java-style lookaround
- (?>...) atomic groups
- (?i:...) (?m:...) (?s:...) (?-i:...) inline flag groups
- *+ ++ ?+ possessive quantifiers

Also adds support for XPath 4.0 named lookaround syntax by translating
(*positive_lookahead:...) etc. to Java regex (?=...) equivalents.

Expected XQTS impact: ~137 of 173 fn-matches.re failures fixed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
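
A pre-validation pass of this kind can be sketched as a single scan over the pattern followed by a textual translation. Everything here is a simplified assumption rather than the PR's implementation: `XPathRegexSketch` is a hypothetical class, only `(?:` is accepted after `(?`, possessive quantifiers after `{n,m}` are not detected, and the three lookaround names other than `positive_lookahead` are inferred from the "etc." in the commit message.

```java
final class XPathRegexSketch {
    // Single-letter escapes from the rejected list in the commit message.
    private static final String REJECTED_ESCAPES = "xuAZzbBaefvQEGkg";

    private XPathRegexSketch() {}

    /** Returns the first rejected construct, or null if none was found. */
    static String findRejectedConstruct(final String pattern) {
        int classDepth = 0; // > 0 while inside [...], where * + ? ( are literals
        for (int i = 0; i < pattern.length(); i++) {
            final char c = pattern.charAt(i);
            final char next = i + 1 < pattern.length() ? pattern.charAt(i + 1) : '\0';
            if (c == '\\') {
                if (REJECTED_ESCAPES.indexOf(next) >= 0) {
                    return "\\" + next;
                }
                i++; // skip the escaped character
            } else if (c == '[') {
                classDepth++;
            } else if (c == ']' && classDepth > 0) {
                classDepth--;
            } else if (classDepth == 0) {
                if (c == '(' && next == '?' && !pattern.startsWith("(?:", i)) {
                    return "(?"; // Java-style lookaround, atomic or inline-flag group
                }
                if (next == '+' && (c == '*' || c == '+' || c == '?')) {
                    return "" + c + next; // possessive quantifier
                }
            }
        }
        return null;
    }

    /** Translates named lookaround groups to their Java regex equivalents. */
    static String translateLookaround(final String pattern) {
        return pattern
                .replace("(*positive_lookahead:", "(?=")
                .replace("(*negative_lookahead:", "(?!")
                .replace("(*positive_lookbehind:", "(?<=")
                .replace("(*negative_lookbehind:", "(?<!");
    }
}
```

Validation runs on the original pattern and translation only afterwards, so the `(?=` sequences produced by the translation are never themselves rejected.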
…checking

fn:unparsed-text improvements:
- Change $encoding parameter from xs:string to xs:string? to accept empty sequence
- Fix error code mapping: FODC0005 → FOUT1170 for URI syntax errors
- Add URI-only dynamic text resource lookup for encoding-agnostic resolution
  (fixes UTF-16 and ISO-8859-1 resources when no encoding is specified)
- Add readLines support for dynamic text resources (was missing)
- Add XML character validation (FOUT1190) for non-XML characters
- Fix unparsed-text-available to return false (not empty sequence) for empty href

Function type checking (SequenceType):
- Add functionParamTypes and functionReturnType fields to SequenceType
- Wire up ANTLR tree walker to populate function type info (resolves TODO)
- Add return type covariance checking for function instance-of operations

XQTS fn-unparsed-text: 50 → 32 failures (18 tests fixed, 36% improvement)
Subtyping fixes require next-v3 integration branch for proper testing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, promotions

- Add XPST0003 checks for reserved function names in NamedFunctionReference
  and FunctionFactory, fixing ~16 prod-NamedFunctionRef and ~7 prod-FunctionCall
  tests that incorrectly returned XPST0017 or succeeded when they should fail

- Fix context item passing for wrapped internal functions (FunctionFactory.wrap).
  UserDefinedFunction now preserves the evaluation context for wrapper functions,
  fixing ~15 tests where context-dependent functions like fn:string#0,
  fn:node-name#0, fn:id#1, fn:idref#1 lost the focus when called via
  function references

- Add binary type promotion (xs:base64Binary ↔ xs:hexBinary) in
  GeneralComparison and DynamicTypeCheck per XQuery 4.0 spec, fixing 4
  function-call-promotion tests

- Register 2-arity fn:element-with-id signature (the implementation already
  handled 2 args but the signature was missing), fixing 2 tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nformance

Fixes 59 XQTS failures (0% → 100% pass rate for fn-element-to-map suite):

- Implement "default" name-format (spec default, not "eqname"):
  child elements in same namespace as parent use local name,
  different namespace uses Q{ns}local, no-namespace child of
  namespaced parent uses Q{}local. Special case for xml: namespace
  attributes.

- Fix layout classification: whitespace-only text between elements
  is ignored for layout detection but preserved as content when the
  element has only whitespace text and no child elements.

- Implement sequence layout for non-unique child names (was
  incorrectly using record layout which collapsed duplicates).

- Fix list/list-plus layout: list drops child element name and
  returns array directly; list-plus uses child name as map key.

- Add plan support: explicit layout directives (empty, empty-plus,
  simple, simple-plus, list, list-plus, record, sequence, mixed,
  xml, error, deep-skip), fallback via "*" key, type coercion
  (numeric/boolean), FOJS0008 error for layout mismatches.

- Add XML serialization layout for plan-based conversion.

- Add option validation: name-format, attribute-marker type checks
  with XPTY0004 errors for invalid values.

- Add xsi:type coercion for schema-typed simple content.

- Fix fn:element-to-map-plan corpus analysis: properly merge
  multiple instances, detect list patterns across empty and
  non-empty instances, generate type annotations for numeric
  content.

- Add FOJS0008 error code to ErrorCodes.java.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
joewiz and others added 10 commits April 30, 2026 18:17
Reworks JSON.java to thread a single ParseOptions bundle through the
recursive parser so all XQuery 4.0 options take effect consistently for
fn:parse-json, fn:json-doc and fn:json-to-xml.

What changes:

* Default duplicates is now use-first (XPath/XQuery 3.1 §17.5.1 / 4.0
  PR2096); 'retain' is rejected for parse-json/json-doc.
* Empty options sequence is allowed: parse-json(json, ()).
* New 'null' option supplies the replacement value (or sequence) used
  for JSON null in parse-json results, including multi-item sequences.
* 'escape' option is now honoured for parse-json: control chars,
  backslash and characters not allowed in XML are re-encoded with JSON
  escape sequences. Quote chars are not re-escaped (per QT4 fixtures
  json-to-xml-049 / json-doc-012 / parse-json-107).
* 'fallback' is invoked for chars that cannot appear in XML in both
  parse-json keys and values, with full validation of return cardinality
  and FOTY0013 when a function item is returned.
* 'number-parser' is invoked for parse-json numeric tokens; multi-item
  results raise XPTY0004; arity is no longer pre-validated, matching
  PR975 fixtures.
* 'escape' and 'fallback' together raise FOJS0005 (spec §22.3.2).
* In XQuery 4.0 mode, unknown options (including 'spec' and 'validate')
  raise XPTY0004; in 3.1 mode they are silently accepted.
* Strings with characters not allowed in XML are normalised with U+FFFD
  in parse-json output when no fallback is supplied.

Test results (QT4 fn-parse-json on v2/xq4-core-functions):
  before: 142/188 (75.5%), 46 failures
  after : 182/188 (96.8%), 6 failures (Phase 2 gate ≥80% / ≤30)

The remaining 6 failures are out of scope for parse-json: a Jackson
leading-zero edge case (716), a parser closure-capture bug (731),
JSON serializer of magic-null QName (746/747), and an error-code
QName formatting mismatch (943).

No regressions in fn-json-to-xml, fn-json-doc, or fn-xml-to-json on
this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
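
Several of the option rules listed in this commit message can be expressed as one validation step. The sketch below is illustrative only: `ParseJsonOptionsSketch` is a hypothetical class, the known-option set contains just the names mentioned in the commit message, `IllegalArgumentException` stands in for the XPath errors, and reading "escape and fallback together" as "escape is true and a fallback is supplied" is an assumption.

```java
import java.util.Map;
import java.util.Set;

final class ParseJsonOptionsSketch {
    // Illustrative subset: only the option names mentioned in the commit message.
    private static final Set<String> KNOWN =
            Set.of("duplicates", "escape", "fallback", "null", "number-parser");

    private ParseJsonOptionsSketch() {}

    /** Validates the options map and returns the effective duplicates policy. */
    static String effectiveDuplicates(final Map<String, Object> options, final int xqueryVersion) {
        if (xqueryVersion >= 40) {
            for (final String key : options.keySet()) {
                if (!KNOWN.contains(key)) {
                    throw new IllegalArgumentException("XPTY0004: unknown option '" + key + "'");
                }
            }
        } // in 3.1 mode, unknown options are silently accepted
        if (Boolean.TRUE.equals(options.get("escape")) && options.containsKey("fallback")) {
            throw new IllegalArgumentException("FOJS0005: 'escape' and 'fallback' are mutually exclusive");
        }
        final Object duplicates = options.getOrDefault("duplicates", "use-first");
        if ("retain".equals(duplicates)) {
            throw new IllegalArgumentException("'retain' is not allowed for parse-json/json-doc");
        }
        return String.valueOf(duplicates);
    }
}
```

An empty map models the empty options sequence and yields the `use-first` default.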
…espaces, namespace-prefixes (attrs), timezones, map-order

XQ4 PR320/PR1855 introduced an options-map third argument to
fn:deep-equal. The DeepEqualOptions parser already recognized the
option keys, but several were stored as flags without ever being
consulted by the comparison engine. This commit wires them up:

- base-uri: compare element base URIs (xml:base / inherited)
- in-scope-namespaces: compare in-scope NS bindings as sets,
  walking ancestor xmlns declarations
- namespace-prefixes: extend prefix check to attributes (was only
  applied to elements). Use a fallback that parses the prefix from
  nodeName when getPrefix() returns null/empty
- timezones: when set, two date/time atomics with different
  explicit timezones (or one missing) are not deep-equal even if
  they represent the same instant
- map-order: iterate both maps in their recorded insertion order
  (PR1703) and compare keys position-wise

Two collateral fixes:

- items-equal: drop the eager arity check at parse time. The spec
  permits any function-typed value (the test deep-equal-40-items-equal-004
  passes true#0 because length-mismatched sequences must return false
  before invoking the callback). Arity is now validated lazily.
- Unsupported collation in the 3-arg variant now raises FOCH0002
  (the runtime error) rather than letting XQST0076 leak through
  from getCollator. Spec accepts FOCH0002 here.

Also fix deep-equal-options-test.xq: a library module cannot
declare local:helper (XQST0048) — renamed to det:helper.

XQTS QT4 fn-deep-equal: 33 -> 26 F+E (9 tests fixed: base-uri-003,
in-scope-namespaces-003, timezones-003, items-equal-{004,005,006,008},
normalize-unicode-004, map-order-003; 2 regressions in
whitespace-{009,031} that need follow-up).

JUnit DeepEqualTest: 63/63 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
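
The effect of the timezones option can be illustrated with `java.time.OffsetDateTime`, where `isEqual` compares instants and `getOffset` exposes the explicit offset. This is an analogy, not the PR's comparison code: `TimezoneCompareSketch` is a hypothetical name, and the "one timezone missing" case is not modelled because an `OffsetDateTime` always carries an offset.

```java
import java.time.OffsetDateTime;

final class TimezoneCompareSketch {
    private TimezoneCompareSketch() {}

    /** Same instant is required; with the option set, the explicit offsets must match too. */
    static boolean deepEqual(final OffsetDateTime a, final OffsetDateTime b, final boolean timezones) {
        if (!a.isEqual(b)) {
            return false; // different instants are never equal
        }
        return !timezones || a.getOffset().equals(b.getOffset());
    }
}
```

Noon UTC and 14:00 at +02:00 denote the same instant, so they compare equal by default and unequal once the option is set.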
Commit 7758007 accidentally removed the compareAttributes(a, b)
call from compareElements when adding the new base-uri and
in-scope-namespaces option blocks. As a result, two elements with
different attribute values were treated as deep-equal whenever
their names, child contents, and (default-off) base-uri/in-scope
options matched.

Restore the call after the option checks. With this fix:
- xquery3.deep-equal-options-test: whitespace-strip-attr and
  whitespace-normalize-attr-different (assertFalse on differing
  attribute values) pass again
- XQTS QT4 fn-deep-equal: 33 -> 26 F+E, no regressions vs baseline,
  with namespace-prefixes-004 now also passing (the attribute
  prefix check from the parent commit was unreachable until now)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…odes

* TryCatchExpression: bind XQuery 4.0 $err:map (PR493) and $err:stack-trace
  (PR1470/PR1599) inside catch clauses. err:map is a map(xs:string, item()*)
  containing all standard error properties (code, description, value, module,
  line-number, column-number, additional, stack-trace).
* UntypedValueCheck: raise XPTY0117 when implicitly coercing xs:untypedAtomic
  to a namespace-sensitive target type (xs:QName, xs:NOTATION) during
  function-call argument coercion (XPath F&O 3.1 §19.1).
* CastExpression: allow xs:untypedAtomic as a source for cast as xs:QName
  (XQ30+ relaxed the rule — lexical errors raise FORG0001 via QNameValue).
* IntegerValue: revert XQ4 hex/binary/underscore parsing for runtime
  string-to-integer casts. Those literal extensions belong only to the parser
  token path (XQueryTree.g uses the BigInteger constructor) — applying them
  to xs:integer("0x0") accepted invalid lexical forms.

Reduces QT4 prod-TryCatchExpr failures from 34 to ~28 (covers $err:map and
$err:stack-trace tests). Net improvement on prod-CastExpr from XPTY0117
alignment and untypedAtomic→QName cast.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address Phase 2 conformance gaps in two related date/time function sets.

fn:parse-ietf-date (76.2% -> 100%, 105/105):
- Default missing timezone to UTC (per spec note: no timezone == GMT).
- Time parser no longer greedily consumes year digits as a timezone in
  asctime form (e.g. "Aug 20 19:36 2014").
- Timezone offset parser handles 1, 2, 3, 4-digit forms ("-5", "-05",
  "-500", "-0500"), trailing colons ("-05:"), and ":mm" with no minutes
  ("Feb-02 02:02-02: 02"); no longer eats trailing whitespace+content.
- Recognise lowercase TZ names ("gmt", "utc"); lowercase day and month
  names were already recognised.
- Optional "(TZNAME)" comment after offsets is parsed and validated;
  empty parens / unknown TZ in parens raise FORG0010.
- Day-name without trailing whitespace ("Wed,20 Aug ...") now errors.
- dsep requires at least whitespace or hyphen; "20Aug" / "Aug2014" now
  error as expected.
- Handle 24:00 without seconds as midnight at end of day.
- "." with no fractional digits now errors (errs27).

fn:build-dateTime (64.8% -> 95.8%, 68/71):
- Strict field combination validation: empty record, time without all of
  hours/minutes/seconds, year+day without month, time fields with
  incomplete date components -- all raise FODT0005 with clear messages.
- Numeric coercion: integer fields accept xs:integer or xs:decimal that
  is exactly an integer; xs:double, fractional decimals, NaN/Infinity
  raise XPTY0004. Untyped/node values are parsed as integers (test
  date-without-timezone-from-nodes).
- Seconds field accepts xs:integer / xs:decimal / finite xs:double or
  xs:float; rejects NaN/Infinity (XPTY0004).
- Timezone accepts xs:duration too (rejecting year/month parts);
  validates +-14:00 range and whole-minute offsets (FODT0006).
- Calendar-day validity (28/29/30/31, leap years) checked up-front and
  reported as FODT0006 instead of bubbling up FORG0001 from the lexical
  parser. Seconds range 0..<60 also FODT0006.
- Year formatting handles year 0 ("0000") and negative years ("-0001")
  per XSD 1.1 representation.
- When a timezone is supplied with a full dateTime, return
  xs:dateTimeStamp instead of xs:dateTime.

XQTS QT4 deltas (antlr parser):
| Set                | Before               | After                |
| ------------------ | -------------------- | -------------------- |
| fn-parse-ietf-date | 80/105 (76.2%)       | 105/105 (100%)       |
| fn-build-dateTime  | 46/71  (64.8%)       | 68/71  (95.8%)       |

Remaining build-dateTime failures: year-zero formatting (XSD 1.0
javax.xml.datatype rejection) and one XPST0017 case that depends on
removing the xq31 2-arg overload.
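
The digit-count rule for timezone offsets in fn:parse-ietf-date can be sketched as follows. This is a simplified illustration, assuming Java 14 or later: `TimezoneOffsetSketch` is a hypothetical class, the accepted forms are limited to those named in the commit message, range checks are reduced to minutes below 60, and `IllegalArgumentException` stands in for FORG0010.

```java
final class TimezoneOffsetSketch {
    private TimezoneOffsetSketch() {}

    /** Parses "-5", "-05", "-500", "-0500", "-05:" or "+05:30" into an offset in minutes. */
    static int parseOffsetMinutes(final String text) {
        if (text.length() < 2 || "+-".indexOf(text.charAt(0)) < 0) {
            throw new IllegalArgumentException("FORG0010: invalid timezone offset: " + text);
        }
        final int sign = text.charAt(0) == '-' ? -1 : 1;
        final String body = text.substring(1);
        final int colon = body.indexOf(':');
        final String hours;
        final String minutes;
        if (colon >= 0) {
            hours = body.substring(0, colon);
            minutes = body.substring(colon + 1); // may be empty: trailing colon is allowed
        } else {
            // No colon: split based on the number of digits.
            final int split = switch (body.length()) {
                case 1, 2 -> body.length(); // hours only
                case 3 -> 1;                // h + mm
                case 4 -> 2;                // hh + mm
                default -> throw new IllegalArgumentException(
                        "FORG0010: invalid timezone offset: " + text);
            };
            hours = body.substring(0, split);
            minutes = body.substring(split);
        }
        if (!hours.matches("[0-9]{1,2}") || !minutes.matches("([0-9]{2})?")) {
            throw new IllegalArgumentException("FORG0010: invalid timezone offset: " + text);
        }
        final int mm = minutes.isEmpty() ? 0 : Integer.parseInt(minutes);
        if (mm > 59) {
            throw new IllegalArgumentException("FORG0010: invalid timezone offset: " + text);
        }
        return sign * (Integer.parseInt(hours) * 60 + mm);
    }
}
```

All of `-5`, `-05`, `-500`, `-0500` and `-05:` come out as 300 minutes west of UTC, while a five-digit form is rejected.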

…4 PR1041)

Brings next-v3's map:build implementation up to spec with the work
deferred from task7 commit a260f37 (which conflicted too deeply
to cherry-pick wholesale).

* BUILD_1/BUILD_2 — relaxed key/value parameters from required to
  optional. Empty sequence selects the spec default.
* build() — null-check args before casting to FunctionReference.
* When key function is empty, default to fn:data#1 semantics
  (atomize the input item). Required by map-build-117 where the
  input is element nodes that need atomization.
* When value function is empty, default to fn:identity#1 semantics
  (use the input item as-is).
* referenceArity() helper (placeholder for future partial-application
  handling; currently delegates to FunctionSignature.getArgumentCount).
* getBuildDuplicatesHandler() / BuildDuplicatesHandler — the
  'duplicates' option may now be a function reference. Arity 1
  receives only the existing accumulator (counter pattern); arity
  2 receives both existing and incoming values. Required by
  map-build-119/123/224.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
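
The interplay of key function, value function and a two-argument duplicates handler can be sketched with `Map.merge`. This is a loose analogy to map:build, not its implementation: `MapBuildSketch` is a hypothetical class, a `null` key stands in for an empty key, and the arity-1 handler form is not modelled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

final class MapBuildSketch {
    private MapBuildSketch() {}

    static <T, K, V> Map<K, V> build(final List<T> input,
                                     final Function<T, K> keyFn,
                                     final Function<T, V> valueFn,
                                     final BinaryOperator<V> duplicates) {
        final Map<K, V> result = new LinkedHashMap<>(); // keeps first-seen key order
        for (final T item : input) {
            final K key = keyFn.apply(item);
            if (key == null) {
                continue; // items with an empty key are skipped
            }
            // First occurrence stores the value; later ones go through the
            // duplicates handler as (existing, incoming).
            result.merge(key, valueFn.apply(item), duplicates);
        }
        return result;
    }
}
```

Grouping `"apple"`, `"avocado"` and `"banana"` by first letter, with string length as the value and `Integer::sum` as the handler, produces `{a=12, b=6}`.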
… 4.0 spec

W3C XQuery 4.0 PR197 specifies the keyword-argument names for built-in
functions. eXist's signatures used the legacy 3.1-era names:

  fn:json-doc:    eXist used $href      -> spec says source
  fn:parse-json:  eXist used $json-text -> spec says value
  fn:json-to-xml: eXist used $json-text -> spec says value

These names are visible to callers via XQ4 keyword-argument syntax
(name := value), so they must match the spec for keyword calls to
resolve. Positional-call behavior is unchanged.
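
Why the declared names matter can be shown with a small sketch of keyword-to-position binding. This is an assumption-laden illustration, not eXist's resolver: `KeywordArgSketch.bind` is a hypothetical helper, and the exception message only mimics the XPST0017 outcome seen in the tests.

```java
import java.util.List;
import java.util.Map;

// Hypothetical binder: maps keyword arguments onto positional slots by declared name.
final class KeywordArgSketch {

    /** Returns one slot per declared parameter; unknown keyword names fail like XPST0017. */
    static Object[] bind(List<String> paramNames, Map<String, Object> keywordArgs) {
        Object[] slots = new Object[paramNames.size()];
        for (Map.Entry<String, Object> arg : keywordArgs.entrySet()) {
            int index = paramNames.indexOf(arg.getKey());
            if (index < 0) {
                throw new IllegalArgumentException("XPST0017: no parameter named " + arg.getKey());
            }
            slots[index] = arg.getValue();
        }
        return slots;
    }
}
```

With the declared names `("value", "options")`, a call using `value := ...` binds to slot 0; with the legacy name `json-text` in the signature, the same call has nothing to bind to and fails.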

Confirmed via QT4 misc-BuiltInKeywords XQTS:
  Keywords-fn-json-doc-1   pass (was XPST0017)
  Keywords-fn-parse-json-1 pass (was XPST0017)
  Keywords-fn-json-to-xml-1 still fails on a separate parser-level
                            issue in its instance-of clause
                            (document-node(fn:*) wildcard).

Net misc-BuiltInKeywords: 83 -> 79 F+E (72.0% -> 72.8%). The remaining
79 failures need a per-function W3C-4.0 signature audit and new XQ4
type-syntax parser features (record types, local union types, document-
node element wildcards), tracked under separate phase2-* taskings.
Addresses misc-Subtyping XQTS gaps. misc-Subtyping QT4 fail count drops
from 45 to 26 (153 tests, 41 skipped, 112 active; pass rate 76.8%).

Parser changes (XQuery.g, XQueryTree.g):
- recordFieldDecl: suppress the optional QUESTION token from the AST so
  the tree walker no longer sees a stray '?' node after a record field
  with no type clause.
- Tree walker: allow xs:error in the atomic-type position. xs:error is
  defined as a builtin under ANY_SIMPLE_TYPE; per XQuery 4.0 it is a
  legitimate sequence type (its value space is empty, so xs:error*
  matches only the empty sequence).
- documentTest: accept document-node(*) as XQuery 4.0 short form for
  document-node(element(*)).

SequenceType subtype rules (SequenceType.java#isSubtypeOf):
- Element/attribute kind tests now compare nodeName: when sup names a
  specific element/attribute, sub must name the same one. Previously
  element(*) was reported as a subtype of element(a).
- Records now subtype-check structurally on declared fields per XQuery
  4.0 Records: required fields of sup must exist (and be required) in
  sub with sub's field type subtype-of sup's, and sub may not declare
  extra fields unless sup is extensible.
- map(K, V) subtype check now allows records (RECORD <= MAP_ITEM) to
  flow through, treating an untyped record as map(xs:string, item()*).
- function-shape conversion for maps and arrays now uses the declared
  K/V types: map(K, V) acts as function(xs:anyAtomicType) as V?
  (atomic-coerced key per XQ4 PR1501; lookup miss widens cardinality);
  array(T) acts as function(xs:integer) as T. Records flow through the
  map branch via Type.subTypeOf(sub.primaryType, MAP_ITEM).
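
The structural record rule above can be sketched in plain Java. This is a simplified illustration under stated assumptions: `RecordSubtypeSketch` and its `Field` class are hypothetical, types are compared with a toy relation instead of eXist's type hierarchy, and optional fields of the supertype are not checked because the rule above only covers required ones.

```java
import java.util.Map;

// Hypothetical model of record field declarations; not eXist's SequenceType.
final class RecordSubtypeSketch {

    static final class Field {
        final String type;
        final boolean required;

        Field(String type, boolean required) {
            this.type = type;
            this.required = required;
        }
    }

    /** Toy type relation: identical names match, and item()* accepts anything. */
    static boolean typeSubsumes(String sup, String sub) {
        return sup.equals("item()*") || sup.equals(sub);
    }

    static boolean isSubtype(Map<String, Field> sub, Map<String, Field> sup, boolean supExtensible) {
        // Every required field of sup must exist in sub, be required there, and have a subtype.
        for (Map.Entry<String, Field> entry : sup.entrySet()) {
            Field supField = entry.getValue();
            if (!supField.required) {
                continue; // optional sup fields are ignored in this sketch
            }
            Field subField = sub.get(entry.getKey());
            if (subField == null || !subField.required || !typeSubsumes(supField.type, subField.type)) {
                return false;
            }
        }
        // Extra fields in sub are only allowed when sup is extensible.
        if (!supExtensible) {
            for (String name : sub.keySet()) {
                if (!sup.containsKey(name)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```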

XQTS misc-Subtyping (QT4):
- Before: 45 fails, 41 skipped, 67 passes (59.8%)
- After:  26 fails, 41 skipped, 86 passes (76.8%)

Remaining failures are XQuery 4.0 features outside this change set:
- element/attribute multi-name tests (element(a|b)) and namespace
  wildcards (element(p1:*), element(*:a)) -- 14 tests
- element-type tracking (element(a, xs:integer) covariance) -- 4 tests
- gnode() as supertype of node()|jnode() -- 3 tests
- document-node(QName) bare short form, document-node(*) vs () -- 2 tests
- union-type widening (xs:long|xs:int subtype of xs:integer) -- 3 tests
Reduce QT4 prod-WindowClause failures from 35 to 11 and prod-LetClause
from 37 to 27 by completing the parser/runtime support that the tree
walker side already had:

* Grammar: make WindowStartCondition and WindowEndCondition individually
  optional in windowClause, and make the "when ExprSingle" guard
  optional inside each (XQ4 PR483). Sliding windows still require an
  end clause; that constraint is enforced in the tree walker.

* WindowCondition / WindowExpr: tolerate a null whenExpression on either
  the start or the end condition. A missing "when" defaults to true()
  during analyze() and eval(), and toString() omits the "when ..."
  fragment entirely so dump output stays readable.

* LetExpr: when the variable has an explicit atomic SequenceType and
  XQuery version >= 4.0, run a function-conversion pass over the bound
  value before the body executes (XQ4 PR1131). New
  coerceAtomicSequence() casts each item to the declared type via
  atomize().convertTo(), promoting xs:integer/decimal/float to
  xs:double, casting xs:untypedAtomic and xs:anyURI to the target
  atomic type, and falling back to the existing XPTY0004 path if any
  item cannot be converted.
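
The null-tolerant "when" handling described for WindowCondition can be pictured with a small stand-in class. This is a sketch only: `WindowConditionSketch` is hypothetical, uses `java.util.function.Predicate` in place of an eXist Expression, and prints a placeholder instead of the real expression text.

```java
import java.util.function.Predicate;

// Hypothetical stand-in for a window start/end condition with an optional "when" guard.
final class WindowConditionSketch {

    private final String keyword;          // "start" or "end"
    private final Predicate<Object> when;  // null when the "when" guard was omitted

    WindowConditionSketch(String keyword, Predicate<Object> when) {
        this.keyword = keyword;
        this.when = when;
    }

    /** A missing guard behaves like "when true()". */
    boolean matches(Object item) {
        return when == null || when.test(item);
    }

    /** Dump output omits the "when ..." fragment when no guard is present. */
    @Override
    public String toString() {
        return when == null ? keyword : keyword + " when <expr>";
    }
}
```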

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nCall

Bring QT4 prod-DynamicFunctionCall from 49.3% to 81.6% (40 -> 14 F+E)
by implementing function-coercion semantics for record-typed parameters
(XQ4 PR1132/PR1501) and adding W3C Schema list-type cast support.

CastExpression: handle xs:NMTOKENS, xs:IDREFS, xs:ENTITIES by splitting
the source string on whitespace and producing a sequence of items typed
as the corresponding atomic item type. Previously a cast to any list
type fell through to StringValue.convertTo's default branch and threw
XPTY0004.
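
A rough sketch of the whitespace-splitting step for list-type casts follows. It works on plain strings and assumes XML whitespace (space, tab, CR, LF) as the separator; `ListCastSketch.castToList` is a hypothetical helper, and the real code would wrap each token in the corresponding atomic value type instead of returning strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing only the tokenization part of a list-type cast.
final class ListCastSketch {

    /** Splits the lexical value on runs of XML whitespace and drops empty tokens. */
    static List<String> castToList(String source) {
        List<String> items = new ArrayList<>();
        for (String token : source.split("[ \\t\\r\\n]+")) {
            // Leading whitespace produces one empty token, which is skipped here.
            if (!token.isEmpty()) {
                items.add(token);
            }
        }
        return items;
    }
}
```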

DynamicTypeCheck: factor the per-item function-coercion logic into a
public static helper coerceAtomicItem so other code paths can reuse it
without going through an Expression wrapper.

UserDefinedFunction: before validating a record-typed parameter, walk
the map's declared fields and apply function coercion to each value:
atomize node/array values, cast untypedAtomic to the declared type,
apply numeric promotion and XQ4 implicit casting/relabeling, and try
each alternative in choice (union) field types in declaration order.
Nested record types recurse. The coerced map is then bound to the
parameter so the function body sees the typed values.

SequenceType.checkType(Sequence): also iterate items for record types
and structurally-typed maps/arrays. The previous primary-type subtype
shortcut was unsound for these (a value of type map(*) -- a parent of
RECORD in the type hierarchy -- would erroneously satisfy a record
type).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joewiz joewiz force-pushed the v2/xq4-core-functions branch from 2041845 to 48b7823 Compare April 30, 2026 22:28
…anceof patterns

Address reinhapa's review feedback on PR eXist-db#6218:
- Convert if/else chains and switch statements to switch expressions
  with arrow syntax (CsvParser, FnBuildUri, FnCollation, FnHash, FnOp,
  FnElementToMap)
- Use instanceof with cast variable pattern matching where applicable
  (FnElementToMapPlan, FnElementToMap, FnGet, FnHighestLowest,
  FnInScopeNamespaces)
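
For readers unfamiliar with the two idioms, here is a small sketch of each. It is not the code from FnBuildUri or FnHighestLowest: the class `ModernJavaSketch`, the scheme-to-port table, and the `describe` helper are assumptions made for illustration, and the block needs Java 16 or later.

```java
// Hypothetical examples of the two refactoring patterns named above.
final class ModernJavaSketch {

    /** Switch expression with arrow syntax; the port table is an illustrative assumption. */
    static boolean isDefaultPort(String scheme, int port) {
        return switch (scheme) {
            case "http" -> port == 80;
            case "https" -> port == 443;
            case "ftp" -> port == 21;
            default -> false;
        };
    }

    /** instanceof with a binding variable replaces the test-then-cast pair. */
    static String describe(Object value) {
        if (value instanceof Double d) {
            return "double:" + d;
        }
        if (value instanceof Float f) {
            return "float:" + f;
        }
        return "other";
    }
}
```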

Also reconstruct TryCatchExpression after a botched cherry-pick of
[feature] XQ4 try/catch err:map + err:stack-trace had left the file
non-compiling: restore the parent state and re-add only the additive
parts (QN_STACK_TRACE, QN_MAP, addStackTrace, addErrMap helpers, plus
the calls and bindings that wire them in).

Add ErrorCodes for XPTY0117, FODT0005, FODT0006 referenced from
UntypedValueCheck and FnDateTimeParts on this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joewiz
Member Author

joewiz commented Apr 30, 2026

[This response was co-authored with Claude Code. -Joe]

@reinhapa Thanks for the thorough review — all addressed in a9aa67d9:

Switch expressions (~14 sites): XQueryTree.g version-decl block, CsvParser state machine, FnBuildUri.isDefaultPort, FnCollation.validateOptionValue, FnHash algorithm dispatch, FnOp (operator-validity check, operator dispatch, arithmetic dispatch, set-op dispatch), FnElementToMap (layout dispatch, two fallback dispatches, three node-type dispatches, two name-format dispatches, xsi:type and value-type coercion).

instanceof with cast variable (~10 sites): FnElementToMapPlan (Attr ternary), FnElementToMap (MapType plan, MapType lookup, StringValue layout), FnGet (ArrayType, AbstractMapType, FunctionReference), FnHighestLowest (DoubleValue, FloatValue), FnInScopeNamespaces (memtree + persistent ElementImpl).

Codacy run locally — the remaining warnings are all pre-existing on this branch (NPath on big methods, field-ordering in CsvParser, AvoidInstanceofChecksInCatchClause in TryCatchExpression). Happy to file follow-up cleanup PRs against develop if useful.

Note on standalone build: rebased onto current develop cleanly, but this PR cannot build standalone: UserDefinedFunction and SequenceType reference RecordType/FieldAccessor from v2/xq4-record-types. It's only buildable via the next-v3 integration branch. This limitation existed on the branch before this round; flagging it for visibility. Update: No, we're going to make sure this branch builds cleanly.

joewiz and others added 8 commits April 30, 2026 19:21
…mpilation

Revert three cherry-picks that introduced RecordType, FieldAccessor,
WhileClause, and isRecordType/isChoiceType references which only exist
on the next-v3 integration branch. The v2/xq4-core-functions branch must
compile standalone without the integration branch.

Reverted commits (still present on feature/post-90-fixes):
- 48b7823 XQ4 record coercion + list-type cast
- 345ff8c XQuery 4.0 misc-Subtyping parser + SequenceType rules
- 393b8c1 XQ4 optional window clauses + let coercion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CsvParser: move CsvConverter inner interface after field declarations
- FnInScopeNamespaces: collapse nested if into single condition
- FnBuildUri: collapse nested if for default-port omission
- FnElementToMapPlan: drop unused elemKey parameter; replace nested
  for/break with stream allMatch + helper; collapse nested if in
  detectAggregateType; getElementKey is now static
- WindowExpr: collapse nested window-end if and reformat block
- DynamicTypeCheck: introduce local 'current' variable to avoid
  reassigning 'item' parameter
- TryCatchExpression: drop unused Throwable parameter from
  addErrAdditional; remove unused getStackTrace method and now-unused
  IOException/PrintWriter/StringWriter imports
- SequenceType: collapse three nested if pairs in checkType /
  checkFunctionType; pattern-match FunctionReference inline

No behavior changes -- pure refactoring to clear PMD warnings flagged
by Codacy on the PR.
Extract helpers from large eval() / parse() bodies on PR eXist-db#6218 to reduce
NPath complexity and improve readability. No behavior changes intended.

FnBuildUri.eval: extract parseOptions, isHierarchical, appendScheme,
  appendAuthority, appendPath, appendQuery, appendFragment, plus the
  per-field helpers effectiveUserinfo / effectivePort / buildPathFromSegments
  / buildQueryFromParams. NPath was 24,192,000.

LetExpr.eval: extract preCoerceMapOrArray, validateSequenceType, plus the
  cardinality / non-node / node-binding sub-validators and
  applyNodeNameForError. NPath was 417,845.

CsvParser.parse: extract per-state handlers (handleFieldStart /
  handleUnquoted / handleQuoted / handleAfterQuoted), endRow,
  finalizeRecords, trimAndNormalize, emit, plus a ParseState holder for
  shared mutable state. NPath was 26,880.

FnHash.eval: extract resolveAlgorithm, computeHash dispatcher,
  computeCrc32 / computeBlake3 / computeMessageDigest, toHexString.
  NPath was 276.

CastExpression.eval: extract startProfiler, validateRequiredType,
  castSequence, castToQName. NPath was 2,000.

TryCatchExpression.addErrMap: extract buildDescription, errorValueOf,
  moduleOf, lineNumberOf, columnNumberOf, stackTraceOf. NPath was 2,916.

FnHighestLowest.eval: extract resolveCollator, resolveKeyRef,
  computeKeys, computeKey, findExtremeKey, collectMatching. NPath was
  34,560. Also drop unused NumericValue import.

SequenceType.checkFunctionType: extract checkFunctionParamSubsumption.
  NPath was 240.

WindowExpr.eval (NPath 6,360,900) is intentionally NOT decomposed in
this commit. Its eval body threads heavily mutable state through a
loop -- window, windowStartIdx, multiple LocalVariable marks /
WindowContextVariables refs, previousItem, and even rewinds the loop
counter `i` for sliding windows -- with sufficient interplay that any
further extraction would require either threading a state record or
mutable-ref wrappers. The risk of subtle behavior changes outweighs
the readability gain. Deferred for a follow-up tasking with dedicated
window-clause coverage.
…LetExpr profiler

Address remaining warnings from re-running Codacy after the initial
NPath decomposition pass on PR eXist-db#6218:

- SequenceType.cardinalitySubsumes: collapse the trailing
  ONE_OR_MORE if-return + return-false into a single boolean
  expression (SimplifyBooleanReturns at the closing branch).

- FnElementToMapPlan.hasSignificantAttributes: extract the
  per-attribute filter into isSignificantAttribute so the loop body
  no longer ends with `return true;` (AvoidBranchingStatementAsLastInLoop).

- LetExpr.eval: extract evalLet, finalizeResult, and startProfiler
  to bring the top-level method's NPath under the 200 threshold (was
  405 after the previous pass). finalizeResult is final / no-reassign
  to avoid AvoidReassigningParameters.

Remaining NPath warnings on the changed files
(FnElementToMapPlan.analyzeInstances, .detectAggregateType,
SequenceType.checkType, .isSubtypeOf) are pre-existing and outside
the scope of this Codacy cleanup tasking. WindowExpr.eval remains
deferred per the previous commit.
…ement-to-map-plan

Phase 2.19 conformance push for four XQ4 functions whose XQTS scores
were below the ≥80% / ≤30 F+E gate.

map:entries — return singleton maps per entry (17/17 = 100%, was 35.2%)
  Each output map now contains a single key-value pair from the input,
  matching the XPath/XQuery 4.0 spec. Previously each output was a
  two-field map {"key": K, "value": V}, which broke map:keys/map:items
  consumers.
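
The shape change for map:entries can be sketched with ordinary Java maps. This is an illustration under assumptions, not the eXist implementation: `MapEntriesSketch.entries` is a hypothetical helper operating on `java.util.Map` in place of eXist's map types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper: one singleton map per entry of the input map.
final class MapEntriesSketch {

    static <K, V> List<Map<K, V>> entries(Map<K, V> input) {
        List<Map<K, V>> out = new ArrayList<>();
        for (Map.Entry<K, V> entry : input.entrySet()) {
            // New shape: {K: V}. The previous shape was {"key": K, "value": V}.
            out.add(Collections.singletonMap(entry.getKey(), entry.getValue()));
        }
        return out;
    }
}
```

Because each result is itself a one-entry map keyed by the original key, generic map accessors applied to a result see the original key and value directly.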

fn:parts-of-dateTime — return RecordMapType with DATETIME_RECORD type
  (17/18 = 94.4%, was 31.5%). The result now reports as
  fn:dateTime-record so AssertType succeeds. Sole remaining failure
  is round-trip-altered, blocked by an unrelated bug in
  fn:adjust-dateTime-to-timezone.

fn:siblings — type checking, JNode dispatch, empty-context handling
  (8/18 = 44.4%, was 44.4%). The function now: rejects non-node items
  with XPTY0004 instead of ClassCastException; handles JNode (JSON
  node) sibling navigation; permits an empty context as the empty
  sequence (XPDY0002 is reserved for an absent context). Refactored
  eval() into resolveInput / siblingsOf / xmlSiblings helpers to keep
  NPath complexity under threshold. The remaining failures (101–106,
  010, 011, 012, 013) require infrastructure changes outside the
  function: JSON path expressions returning JNodes, fn:deep-equal
  JNode support, the namespace:: axis parser, namespace-node string
  serialization, and thin-arrow operator semantics on empty input.

fn:element-to-map-plan — suppress empty @id attribute entry
  (20/21 = 95.2%, was 61.9%). When a collected attribute has no
  detectable type (plain string), it no longer contributes a spurious
  {"@id": map{}} entry to the plan. Sole remaining failure is test 700,
  blocked by an unrelated parser issue with inline-function syntax.
…ings from RecordType + JNode

These three XQ4 functions held references to RecordMapType (record type
infrastructure) and JNode (JSON node infrastructure) that exist only on
the next-v3 integration branch. That made the commits un-cherry-pickable
to v2/xq4-core-functions, which must compile standalone on top of
develop with zero dependencies on unmerged feature PRs.

The fix replaces RecordMapType with plain MapType (a record is
structurally a typed map; the same key/value access pattern works on
both) and removes the JNode-specific siblings dispatch entirely.

FnDateTimeParts.partsOfDateTime — return MapType instead of
RecordMapType. The signature already declared Type.MAP_ITEM as the
return type, so call sites that access $result?year are unchanged.
Drop the now-unused DATETIME_RECORD_FIELD_ORDER constant and the
BigInteger / List imports.

FnDateTimeRecord.eval — same RecordMapType -> MapType swap. Update the
function signature return type from Type.DATETIME_RECORD (named record
type, only on next-v3) to Type.MAP_ITEM. The 8-arity zero-padding loop
keeps FIELD_ORDER for arg-index-to-key mapping.

FnSiblings — drop the JNode import, the jnodeSiblings() helper, and the
JNode dispatch in siblingsOf(). XQ4 fn:siblings is defined over XML
nodes; JNode sibling navigation is a next-v3-only extension that
belongs on the JNode feature branch.

A future commit can restore RecordType/JNode-aware variants once that
infrastructure lands on develop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The parenthesized timezone comment after a numeric offset (e.g.
"+0100 (CET)") is informational per W3C XPath F&O. The parser was
validating the comment text against a small timezone abbreviation map
that only contained US timezones (EST, CST, etc.), causing European
abbreviations like CET to be rejected with FORG0010.

Skip the comment content without validation — the timezone was already
parsed from the numeric offset.
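
The skip-without-validation approach might look roughly like this. It is a sketch, not the parser in this PR: `IetfCommentSketch.stripTrailingComment` is a hypothetical helper that only handles a single trailing parenthesized comment on an already isolated timezone fragment.

```java
// Hypothetical helper: drops a trailing "(...)" comment without inspecting its content.
final class IetfCommentSketch {

    static String stripTrailingComment(String timezoneFragment) {
        String s = timezoneFragment.trim();
        int open = s.indexOf('(');
        if (open >= 0 && s.endsWith(")")) {
            // Keep only the numeric offset that precedes the comment.
            return s.substring(0, open).trim();
        }
        return s;
    }
}
```

Since the offset has already been read from the numeric part, the abbreviation inside the parentheses never needs to be looked up, so `"+0100 (CET)"` and `"+0100 (anything)"` are treated alike.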

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
joewiz and others added 5 commits April 30, 2026 22:02
…N errors

Address eleven test failures on PR eXist-db#6218 (in addition to the
parse-ietf-date CET cherry-pick already applied):

ArrayTests.for-each-pair1 / filter1 / filter3 / for-each-pair2
- Use signature arity primarily but fall back to the bound call's
  argument count when the signature is variadic, so concat#2 reports 2
  instead of -1. Inline functions and fixed-arity refs are unaffected.

XQueryFunctionsTest.currentDateTime
- *-from-dateTime accessors now accept any subtype of xs:dateTime
  (notably xs:dateTimeStamp, the type returned by current-dateTime),
  not only the exact xs:dateTime type.

Map key-ordering tests (maps.xqm, custom-assertion.xqm)
- Tests asserted the old hash-based key order. Updated assertions to
  match the spec-mandated insertion order, including the JSON
  serialization in custom-assertion.xqm.
- mt:wrongCardinality now expects XPTY0004 (the W3C-correct error code
  for a key expression yielding a non-singleton sequence) instead of the
  legacy EXMPDY001.

ArrayTests.parse-json
- Test relied on use-last as the default for duplicate keys; the W3C
  default is use-first, so the assertion now matches.

SecurityManagerTests.id.from-load-module
- fn:load-xquery-module now accepts a module whose declared XQuery
  version is <= the requested version (XQuery is backward compatible),
  rather than requiring exact equality.
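
The relaxed version check amounts to a less-than-or-equal comparison. A minimal sketch, assuming version strings such as "3.1" and "4.0" and a hypothetical helper name `ModuleVersionSketch.isLoadable`:

```java
import java.math.BigDecimal;

// Hypothetical helper: a module is loadable when its declared version does not exceed the requested one.
final class ModuleVersionSketch {

    static boolean isLoadable(String declaredVersion, String requestedVersion) {
        // BigDecimal compares "3.1" and "4.0" numerically rather than as text.
        return new BigDecimal(declaredVersion).compareTo(new BigDecimal(requestedVersion)) <= 0;
    }
}
```

Under this rule a 3.1 module loads from a 4.0 request, while a 4.0 module requested as 3.1 is still rejected.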

XQuery3Tests.json-to-xml-error-2
- Boolean-typed JSON options that arrive as a string now raise FOJS0005
  (consistent with the existing duplicates-value path) instead of
  XPTY0004.

XQuery3Tests.replace.empty-match-allowed
- In XQuery 4.0 mode, fn:replace no longer raises FORX0003 when the
  pattern matches a zero-length string, even without the ! flag,
  matching the QT4 test expectations.

FunUnparsedTextTest.unparsedTextLines_noDataStream
- A dynamic text resource backed by a null InputStream now surfaces as
  XPathException(FOUT1170) rather than NullPointerException.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…R197

Audit and fix FunctionSignature definitions against the QT4CG XQuery 4.0
spec so partial-application instance-of tests succeed. The QT4 keyword
test catalog (misc-BuiltInKeywords) checks that
fn:foo(arg := ?) instance of function(...) as ... matches the
signature declared by W3C; before this commit eXist's signatures
diverged in parameter names, cardinalities, and return types.

Common patterns fixed:
- collation parameter widened from xs:string to xs:string?
  (fn:contains, fn:ends-with, fn:starts-with, fn:compare,
  fn:distinct-values, fn:index-of, fn:substring-before,
  fn:substring-after, fn:collation-key, fn:string-join's separator,
  array:index-of)
- options map parameter widened to map(*)?
  (fn:doc, fn:doc-available, fn:csv-to-arrays, fn:parse-csv,
  fn:csv-to-xml, fn:csv-doc, fn:parse-xml, fn:parse-xml-fragment,
  fn:path)
- length-style optional trailing parameter widened to ?
  (fn:substring's length, fn:subsequence's length,
  array:subarray's length, array:build's action)
- start type widened to xs:numeric for fn:subsequence
- base param of fn:resolve-uri is xs:string?
- fn:seconds value is xs:decimal?
- fn:unix-dateTime value is xs:nonNegativeInteger?, returns xs:dateTimeStamp
- typed function/array/record/map returns:
  fn:op returns fn(item()*, item()*) as item()*
  fn:invisible-xml returns fn(xs:string) as item()
  fn:analyze-string returns element(fn:analyze-string-result)
  fn:element-to-map returns map(xs:string, item()?)?
  fn:function-annotations returns map(xs:QName, xs:anyAtomicType*)*
  fn:csv-to-arrays returns array(xs:string)*
  fn:divide-decimals returns record(quotient/remainder)
  fn:in-scope-namespaces returns map(xs:NCName, xs:anyURI)
  fn:transitive-closure returns node()*
  array:members returns record(value as item()*)*
  fn:transform returns map(*)
  fn:collation accepts map(*) and returns xs:string

Type-checker support added:
- SequenceType.isSubtypeOf now handles a choice (union) type on the
  sub side: every alternative must be a subtype of the supertype.
  This unblocks date/time accessors (fn:year-from-dateTime etc.) and
  fn:char where the spec types are union types but eXist uses a
  single broader primary type.
- Bare map(*) is treated as map(xs:anyAtomicType, item()*) in subtype
  checks (was xs:string, which contradicts the W3C spec); records flow
  through with an xs:string key fallback because record keys are
  always strings.
- Bare array(*) is now treated as array(item()*); array(*) is a
  subtype of array(item()*).
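
The choice-type rule on the sub side (every alternative must be a subtype of the supertype) can be sketched with a toy hierarchy. This is an illustration only: `ChoiceSubtypeSketch`, its `PARENT` table, and the string type names are assumptions standing in for eXist's Type constants.

```java
import java.util.List;
import java.util.Map;

// Hypothetical model: a choice type is a list of alternatives, types are plain names.
final class ChoiceSubtypeSketch {

    // Toy hierarchy, child -> parent; only the types needed for the example.
    static final Map<String, String> PARENT = Map.of(
            "xs:dateTimeStamp", "xs:dateTime",
            "xs:dateTime", "xs:anyAtomicType",
            "xs:date", "xs:anyAtomicType",
            "xs:string", "xs:anyAtomicType");

    /** Walks up the parent chain looking for sup. */
    static boolean isSubtype(String sub, String sup) {
        for (String t = sub; t != null; t = PARENT.get(t)) {
            if (t.equals(sup)) {
                return true;
            }
        }
        return false;
    }

    /** A choice type is a subtype of sup only if every alternative is. */
    static boolean choiceIsSubtype(List<String> alternatives, String sup) {
        return alternatives.stream().allMatch(alt -> isSubtype(alt, sup));
    }
}
```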

Implementation tweaks accompanying the signature changes:
- CollatingFunction.getCollator handles an empty xs:string? collation
  argument by returning the default collator.
- FunSubstring, FunSubSequence, FunAnalyzeString, ArrayFunction handle
  an empty optional-length argument by behaving as the no-length form.
- ArrayBuild handles an empty action argument by returning the input
  unchanged (identity-like).
- FunResolveURI handles an empty base-URI argument by falling back to
  the static base URI.

XQTS misc-BuiltInKeywords pass rate: 72.9% -> 89.5% (76 -> 31 fails).
Remaining failures are net-new XQ4 functions (j-tree, j-key, j-value,
j-position, system-properties, build-dateTime, etc.), unrecognised
record types (fn:dateTime-record, fn:parsed-csv-structure-record,
etc.), parser support for record/document-node wildcards in instance-of
expressions, and a fn:matches partial-application internal error —
all out of scope for this signature audit.

Other affected XQTS sets (no regressions, only improvements):
fn-collation-key 10 -> 7, fn-string-join 2 -> 0, fn-resolve-uri 3 -> 1,
fn-substring 1 -> 0, fn-substring-before 3 -> 1, fn-substring-after
2 -> 0, misc-Subtyping 26 -> 25.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The fn signature audit (commit 13ffc8f / W3C PR197) tightened
fn:unix-dateTime's value parameter to xs:nonNegativeInteger?. Update
the unixDateTime-epoch and unixDateTime-oneSecond tests to pass
xs:nonNegativeInteger literals so they don't fail static type checks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous version-mismatch test loaded a 3.1 module from a 4.0
caller and expected FOQM0003. After the load-xquery-module fix that
allows older modules (XQuery is backward compatible), that scenario
now succeeds — which is the more useful behavior for module reuse.

Update the failing-version-mismatch test to use the explicit
xquery-version option requesting 3.1 against a 4.0 module, which is a
genuine mismatch that still raises FOQM0003. Also add a positive test
documenting that an older module loads cleanly from a newer caller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per PMD's OneDeclarationPerLine rule. Trivial follow-up to the
fn signature audit (commit 13ffc8f).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joewiz
Member Author

joewiz commented May 11, 2026

[This response was co-authored with Claude Code. -Joe]

Per the 2026-05-10 v2/* extraction audit, 10 of this branch's 67 commits are XQuery 3.1-mandatory and don't depend on the XQ 4.0 parser flag or core-function infrastructure. Extracted to a new PR against develop for the eXist 7.0 conformance push: #6344.

The extraction picked up the regex-validation chain, fn:unparsed-text family, prod-VarDecl error codes, fn:contains-token collation, fn:load-xquery-module version check, and the function-call/ref reserved-name + context fixes. XQTS HEAD spot-check shows ~+247 newly passing tests across the 16 affected sets — well above the audit's 50–80 estimate, with fn-contains-token (+42), prod-NamedFunctionRef (+44), and prod-FunctionCall (+35) being the largest clusters.

The remaining 57 commits — 50+ new XQ 4.0 functions, record types, fn{...} lambdas, keyword arguments, XQ 4.0 numeric literal extensions, parse-json 4.0 compliance, etc. — stay in this branch for a post-7.0 cycle alongside the XQ 4.0 parser landing.

Three commits were classified as DROP-already-on-develop or no-op (the XQ4 duration-ordering gate, the deep-equal compareAttributes restore that was only "missing" inside v2's DeepEqualOptions refactor, and the cast-error/try-catch commit whose 3.1 portion is already on develop). One commit was reclassified to DROP during execution: b36833ffaf (fn:reverse lazy O(1)) hit a structural conflict because v2 refactored RangeSequence from IntegerValue fields to primitive longs while develop has accumulated its own RangeSequence work; as an [optimize] with no XQTS yield it stays here.

Related: the paused 2026-05-10 rebase-v2-xq4-core-functions-onto-develop.md becomes much smaller after #6344 lands — the worst conflict surfaces (FunMatches/RegexUtil, FunUnparsedText, prod-CastExpr-adjacent classes) are resolved by the develop-side fixes being already in place.

Labels

blocked (blocked by a 3rd party), enhancement (new features, suggestions, etc.), XQ4 (xquery 4), xquery (issue is related to xquery implementation)

Projects

Status: Backlog

3 participants