Implement XQuery 4.0 core functions (fn:, array:, map:, math:) #6218
joewiz wants to merge 67 commits into
Force-pushed from 00e6756 to 17a7067
… update statuses
- Replace closed eXist-db#6213 (v2/jetty-12-upgrade) with eXist-db#6145 (feature/websocket-core)
- Add CI Health Note explaining known noise: integration hangs, container image HTTP 502, XQTS runner Saxon 12 crash, and complementary empty-match failures in eXist-db#6212/eXist-db#6218
- Update XQTS runner: eXist-db#45 closed, eXist-db#49 is the active PR
- Update cross-repo PR table accordingly
- Update "Also Ready to Merge" table: mark eXist-db#6142, eXist-db#6146 merged; eXist-db#6186 superseded by eXist-db#6224; correct eXist-db#6087 approver; add status notes for eXist-db#6182, eXist-db#6184
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
[This response was co-authored with Claude Code. -Joe] CI state: 6/9 checks pass. Of the 3 failures:
Dependencies: Wave 4. Must merge after all Wave 3 grammar PRs (particularly For full context on all 7.0 PRs and the merge order, see the Reviewer Guide. |
  final String version = v.getText();
- if (version.equals("3.1")) {
+ if (version.equals("4.0")) {
+     context.setXQueryVersion(40);
+ } else if (version.equals("3.1")) {
      context.setXQueryVersion(31);
  } else if (version.equals("3.0")) {
      context.setXQueryVersion(30);
  } else if (version.equals("1.0")) {
      context.setXQueryVersion(10);
  } else {
-     throw new XPathException(v, ErrorCodes.XQST0031, "Wrong XQuery version: require 1.0, 3.0 or 3.1");
+     throw new XPathException(v, ErrorCodes.XQST0031, "Wrong XQuery version: require 1.0, 3.0, 3.1, or 4.0");
  }
final int cp = input.codePointAt(i);
final int cpLen = Character.charCount(cp);

switch (state) {

}

private static boolean isDefaultPort(final String scheme, final int port) {
    switch (scheme) {

 */
private void validateOptionValue(final String mapKey, final String uriParam, final String val)
        throws XPathException {
    switch (uriParam) {

    }
}

switch (layout) {

    return Constants.EQUAL;
}

switch (item1.getType()) {

org.w3c.dom.Node child = parent.getFirstChild();
while (child != null) {
    final int type = getEffectiveNodeType(child);
    switch (type) {

    // else: trailing colon with no minutes is allowed (test 47, 60)
} else {
    // No colon: split based on number of digits
    switch (totalDigits) {
Use text blocks where multiline strings are used
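As a rough illustration of this suggestion, the sketch below writes the same multiline string once with concatenation and once as a Java text block (Java 15 or later). The class name, constant names and the query text are made up for the example and are not taken from the PR.

```java
final class TextBlockExample {
    // Hypothetical multiline string built the pre-Java-15 way.
    static final String CONCATENATED =
            "xquery version \"4.0\";\n" +
            "for $i in 1 to 3\n" +
            "return $i * 2\n";

    // The same content as a text block. Indentation common to all lines
    // (including the closing delimiter) is stripped by the compiler.
    static final String TEXT_BLOCK = """
            xquery version "4.0";
            for $i in 1 to 3
            return $i * 2
            """;
}
```

Both constants hold identical content; the text block avoids the escaped quotes and explicit newline characters.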
Broadened signatures and improved compliance for existing functions:
- fn:compare: broadened to anyAtomicType, numeric total order,
duration/datetime ordering per QT4 spec
- fn:deep-equal: text node merging across comments/PIs, options map
support, BigInteger overflow fix for fn:round
- fn:head/fn:tail: expanded to fn:foot/fn:trunk (XQ4 aliases)
- fn:max/fn:min: duration comparison support, decimal precision
- fn:doc/fn:doc-available: security-gated file:// URI resolution,
fn:doc#2 overload
- fn:unparsed-text: BOM stripping, fn:unparsed-text-lines fix
- fn:matches/fn:replace/fn:tokenize: regex enhancements
- fn:path#2: output format parameter
- fn:analyze-string: reflection proxy for Saxon compatibility
- fn:parse-json: option validation, empty sequence args, xs:integer
for JSON integers
- fn:load-xquery-module: content option (XQ4)
- fn:format-number: negative exponent zero-padding, map overload,
char:rendition pattern
- fn:format-date/fn:format-time: comprehensive improvements
- Collations: supplementary codepoint comparison
- RangeSequence: primitive long storage optimization
- Error codes: W3C alignment across casting and value types
- JSON serialization: XDM mode bypass fix, duplicate key detection
Spec: QT4 XQuery 4.0 §14 (Functions and Operators)
XQTS: +111 tests across method-json, fn-compare, fn-deep-equal,
fn-round, fn-max/fn-min test sets
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New function implementations in the fn: namespace:
Sequence functions:
- fn:characters, fn:identity, fn:void, fn:foot, fn:trunk, fn:slice, fn:items-at, fn:replicate, fn:insert-separator, fn:all-equal, fn:all-different, fn:duplicate-values, fn:index-where, fn:take-while, fn:distinct-ordered-nodes, fn:siblings
Higher-order functions:
- fn:every, fn:some (function form), fn:highest, fn:lowest, fn:sort-by, fn:sort-with, fn:partition, fn:scan-left, fn:scan-right, fn:subsequence-where, fn:transitive-closure, fn:partial-apply, fn:op
String/URI functions:
- fn:char, fn:graphemes, fn:decode-from-uri, fn:parse-uri, fn:build-uri, fn:expanded-QName, fn:parse-QName, fn:parse-integer, fn:divide-decimals
Date/Time functions:
- fn:civil-timezone, fn:build-dateTime, fn:parts-of-dateTime, fn:unix-dateTime, fn:seconds
Type functions:
- fn:schema-type, fn:atomic-type-annotation, fn:node-type-annotation, fn:element-to-map, fn:element-to-map-plan, fn:type-of, fn:is-NaN
Context functions:
- fn:get, fn:collation, fn:collation-available, fn:message
Parsing functions:
- fn:parse-html (Validator.nu HTML5 parser), fn:invisible-xml (Markup Blitz iXML parser), fn:parse-csv, fn:csv, fn:html-doc, fn:unparsed-binary
Data functions:
- fn:hash, fn:function-annotations, fn:function-identity, fn:in-scope-namespaces
Also: DeepEqualOptions class for fn:deep-equal options map support, FnModule registrations for all new functions.
Spec: QT4 XQuery 4.0 §14 (Functions and Operators)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Array module (8 new functions):
- array:build, array:index-of, array:index-where, array:of-members,
array:split, array:sort-by, array:sort-with, array:slice
- Plus: array:get#3 with default value
Map module (5 new functions):
- map:build, map:items, map:entries, map:filter, map:keys-where
- Plus: map:get#3 with default value, map:empty
Math module (4 new functions):
- math:cosh, math:sinh, math:tanh, math:e
- Plus: math:pow edge case fixes
Spec: QT4 XQuery 4.0 §17 (Array Module),
QT4 XQuery 4.0 §16 (Map Module),
QT4 XQuery 4.0 §18 (Math Module)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Build configuration:
- exist-parent/pom.xml: Add markup-blitz 1.10 (fn:invisible-xml), htmlparser 1.4.16 (fn:parse-html via Validator.nu)
- exist-core/pom.xml: Add markup-blitz and htmlparser dependencies
- .gitignore: Ignore iXML grammar cache files
Format improvements:
- FnFormatDates: comprehensive format-date/format-time improvements
- FnFormatNumbers: map overload, char:rendition pattern, negative exponent zero-padding fix
Tests:
- fnXQuery40.xql: XQSuite tests for XQ4 functions
- fnInvisibleXml.xqm: fn:invisible-xml test suite
- format-number-map.xql: fn:format-number map overload tests
- deep-equal-options-test.xq: fn:deep-equal options map tests
- Updated: fnLanguage.xqm, json-to-xml.xql, replace.xqm
Spec: QT4 XQuery 4.0 §14 (Functions and Operators)
XQTS: 732/861 (85.0%) for XQ4-specific test sets
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d fn:tokenize
Without the ! flag, empty-matching patterns raise FORX0003 in both XQ 3.1 and XQ 4.0 mode. With the ! flag in XQ 4.0, fn:replace uses the Java regex fallback and fn:tokenize tokenizes between each character.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… context
The XQ4 duration ordering version gate in DurationValue.compareTo() checked getExpression().getContext().getXQueryVersion() >= 40, but DurationValues created at runtime had null expression references, causing the gate to always block ordering even in XQ4 mode.
Fix: propagate expression context from comparison operators to atomized values. Add AtomicValue.setExpression() and version-gated max()/min() for DurationValue.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mezone options to fn:deep-equal
Extends the XQuery 4.0 fn:deep-equal options map with four new options:
- items-equal: function(item(), item()) as xs:boolean? callback for custom item equality. Returns empty sequence to fall back to default comparison.
- unordered-elements: xs:QName* listing elements whose children are compared as multisets rather than ordered sequences.
- normalization-form: Unicode normalization (NFC/NFD/NFKC/NFKD) applied to string comparisons.
- timezone: xs:dayTimeDuration implicit timezone override for date/time comparison (parsed but not yet applied in comparison logic).
Also adds proper cleanup of FunctionReference via close() method on DeepEqualOptions, called from FnDeepEqualOptions.eval() in a finally block.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…p:items
XQuery 4.0 conformance fixes for array and map module functions:
- map:build: Skip items with empty key (NPE fix), combine duplicate key values into sequences, support multi-key returns (PR1041)
- map:items: Return flat value sequence instead of entry maps; split from map:entries which retains the record-style return
- map:for-each, map:filter, map:keys-where: Support XQ4 arity coercion: callbacks may accept 0-3 args (key, value, position) per PR2225
- array:filter: Support XQ4 arity coercion: callbacks may accept 0-2 args (member, position)
- array:for-each-pair: Support XQ4 arity coercion: callbacks may accept 0-3 args (memberA, memberB, position) per PR2225
- array:get: Use long comparison to prevent integer overflow for indices exceeding Integer.MAX_VALUE (fixes FOAY0001 for large indices)
- map:merge: Accept empty sequence for options parameter (XQ4)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
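The array:get overflow fix can be illustrated with a small sketch. It assumes a 1-based index arriving as a long and a plain java.util.List standing in for the array members; the class and method names are hypothetical and do not reflect eXist's actual ArrayType code.

```java
import java.util.List;

final class ArrayIndexCheck {
    /**
     * Narrowing before the bounds check: 4294967297L (2^32 + 1) becomes 1,
     * so an index far beyond the array appears to be in range.
     */
    static boolean inBoundsNarrowed(final List<?> members, final long index) {
        final int narrowed = (int) index;
        return narrowed >= 1 && narrowed <= members.size();
    }

    /** Comparing as long keeps large indices out of range (the FOAY0001 case). */
    static boolean inBoundsLong(final List<?> members, final long index) {
        return index >= 1 && index <= (long) members.size();
    }
}
```

With a two-member list, `inBoundsNarrowed(list, 4294967297L)` wrongly accepts the index, while `inBoundsLong` rejects it.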
… XQ4
Per QT4CG PR eXist-db#1481, the 7 fn:*-from-dateTime component extraction functions now accept any Gregorian date/time type (xs:date, xs:time, xs:gYear, xs:gYearMonth, xs:gMonth, xs:gMonthDay, xs:gDay) in XQuery 4.0 mode. When the requested component is absent (e.g., hours from xs:date), the function returns the empty sequence.
Version-gated: in XQ 3.1 mode, these functions still require strict xs:dateTime input. The fn:*-from-date and fn:*-from-time signatures are unchanged per the XQ4 spec.
Includes 35 XQSuite tests covering component presence/absence across all Gregorian types, plus regression coverage for xs:dateTime inputs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
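The "absent component yields the empty sequence" rule can be sketched with java.time types standing in for the XDM date/time values. This is an analogy only: eXist uses its own value classes, and the class and method names here are hypothetical.

```java
import java.time.temporal.ChronoField;
import java.time.temporal.TemporalAccessor;
import java.util.OptionalInt;

final class ComponentExtraction {
    /**
     * Returns the hour component when the value carries one, otherwise an
     * empty result (the analogue of the XQuery empty sequence).
     */
    static OptionalInt hoursFrom(final TemporalAccessor value) {
        return value.isSupported(ChronoField.HOUR_OF_DAY)
                ? OptionalInt.of(value.get(ChronoField.HOUR_OF_DAY))
                : OptionalInt.empty();
    }
}
```

Here a `LocalDateTime` plays the role of xs:dateTime and yields its hour, while a `LocalDate` (the xs:date analogue) yields an empty result instead of an error.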
…t XQ4 lookaround
Add pre-validation of regex patterns in fn:matches and fn:replace to reject constructs that are not part of the XPath regular expression specification (F&O 3.1, Section 5.6.1) but that Saxon's XP30 mode silently accepts. Rejected constructs include:
- \x, \u hex/unicode escapes (not in XPath regex)
- \A, \Z, \z Java-specific anchors
- \b, \B word boundary assertions
- \a, \e, \f, \v special character escapes
- \Q, \E literal quoting
- \G, \k, \g named/numbered back-references
- (?=...) (?!...) (?<=...) (?<!...) Java-style lookaround
- (?>...) atomic groups
- (?i:...) (?m:...) (?s:...) (?-i:...) inline flag groups
- *+ ++ ?+ possessive quantifiers
Also adds support for XPath 4.0 named lookaround syntax by translating (*positive_lookahead:...) etc. to Java regex (?=...) equivalents.
Expected XQTS impact: ~137 of 173 fn-matches.re failures fixed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
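A minimal sketch of the named-lookaround translation, under stated assumptions: only (*positive_lookahead:...) is named in the commit message, so the other three keywords below are assumed by analogy, and the class name is hypothetical. The substitution is purely textual and ignores escaped parentheses and character classes, which a real implementation would have to respect.

```java
import java.util.regex.Pattern;

final class LookaroundTranslator {
    /** Rewrites named lookaround openers into Java regex syntax (naive, text-based). */
    static String translate(final String xpathRegex) {
        return xpathRegex
                .replace("(*positive_lookahead:", "(?=")
                .replace("(*negative_lookahead:", "(?!")     // assumed keyword
                .replace("(*positive_lookbehind:", "(?<=")   // assumed keyword
                .replace("(*negative_lookbehind:", "(?<!");  // assumed keyword
    }

    /** Convenience wrapper: true when the translated pattern matches somewhere in input. */
    static boolean matches(final String input, final String xpathRegex) {
        return Pattern.compile(translate(xpathRegex)).matcher(input).find();
    }
}
```

For example, `translate("a(*positive_lookahead:b)")` produces `a(?=b)`, which `java.util.regex` then compiles as an ordinary lookahead.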
…checking
fn:unparsed-text improvements:
- Change $encoding parameter from xs:string to xs:string? to accept empty sequence
- Fix error code mapping: FODC0005 → FOUT1170 for URI syntax errors
- Add URI-only dynamic text resource lookup for encoding-agnostic resolution (fixes UTF-16 and ISO-8859-1 resources when no encoding is specified)
- Add readLines support for dynamic text resources (was missing)
- Add XML character validation (FOUT1190) for non-XML characters
- Fix unparsed-text-available to return false (not empty sequence) for empty href
Function type checking (SequenceType):
- Add functionParamTypes and functionReturnType fields to SequenceType
- Wire up ANTLR tree walker to populate function type info (resolves TODO)
- Add return type covariance checking for function instance-of operations
XQTS fn-unparsed-text: 50 → 32 failures (18 tests fixed, 36% improvement)
Subtyping fixes require next-v3 integration branch for proper testing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
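The XML character validation step might look roughly like the sketch below. It assumes the check follows the XML 1.0 Char production (the commit does not say whether XML 1.0 or 1.1 rules apply) and uses hypothetical names. It iterates by code point so supplementary characters are treated as a single unit.

```java
final class XmlChars {
    /** XML 1.0 Char production: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]. */
    static boolean isXmlChar(final int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the char index of the first non-XML character, or -1 if all are valid. */
    static int firstInvalid(final String text) {
        for (int i = 0; i < text.length(); ) {
            final int cp = text.codePointAt(i);
            if (!isXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

A caller could raise FOUT1190 when `firstInvalid` returns a non-negative index; that error plumbing is omitted here.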
…, promotions
- Add XPST0003 checks for reserved function names in NamedFunctionReference and FunctionFactory, fixing ~16 prod-NamedFunctionRef and ~7 prod-FunctionCall tests that incorrectly returned XPST0017 or succeeded when they should fail
- Fix context item passing for wrapped internal functions (FunctionFactory.wrap). UserDefinedFunction now preserves the evaluation context for wrapper functions, fixing ~15 tests where context-dependent functions like fn:string#0, fn:node-name#0, fn:id#1, fn:idref#1 lost the focus when called via function references
- Add binary type promotion (xs:base64Binary ↔ xs:hexBinary) in GeneralComparison and DynamicTypeCheck per XQuery 4.0 spec, fixing 4 function-call-promotion tests
- Register 2-arity fn:element-with-id signature (the implementation already handled 2 args but the signature was missing), fixing 2 tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nformance
Fixes 59 XQTS failures (0% → 100% pass rate for fn-element-to-map suite):
- Implement "default" name-format (spec default, not "eqname"):
child elements in same namespace as parent use local name,
different namespace uses Q{ns}local, no-namespace child of
namespaced parent uses Q{}local. Special case for xml: namespace
attributes.
- Fix layout classification: whitespace-only text between elements
is ignored for layout detection but preserved as content when the
element has only whitespace text and no child elements.
- Implement sequence layout for non-unique child names (was
incorrectly using record layout which collapsed duplicates).
- Fix list/list-plus layout: list drops child element name and
returns array directly; list-plus uses child name as map key.
- Add plan support: explicit layout directives (empty, empty-plus,
simple, simple-plus, list, list-plus, record, sequence, mixed,
xml, error, deep-skip), fallback via "*" key, type coercion
(numeric/boolean), FOJS0008 error for layout mismatches.
- Add XML serialization layout for plan-based conversion.
- Add option validation: name-format, attribute-marker type checks
with XPTY0004 errors for invalid values.
- Add xsi:type coercion for schema-typed simple content.
- Fix fn:element-to-map-plan corpus analysis: properly merge
multiple instances, detect list patterns across empty and
non-empty instances, generate type annotations for numeric
content.
- Add FOJS0008 error code to ErrorCodes.java.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reworks JSON.java to thread a single ParseOptions bundle through the recursive parser so all XQuery 4.0 options take effect consistently for fn:parse-json, fn:json-doc and fn:json-to-xml.
What changes:
* Default duplicates is now use-first (XPath/XQuery 3.1 §17.5.1 / 4.0 PR2096); 'retain' is rejected for parse-json/json-doc.
* Empty options sequence is allowed: parse-json(json, ()).
* New 'null' option supplies the replacement value (or sequence) used for JSON null in parse-json results, including multi-item sequences.
* 'escape' option is now honoured for parse-json: control chars, backslash and characters not allowed in XML are re-encoded with JSON escape sequences. Quote chars are not re-escaped (per QT4 fixtures json-to-xml-049 / json-doc-012 / parse-json-107).
* 'fallback' is invoked for chars that cannot appear in XML in both parse-json keys and values, with full validation of return cardinality and FOTY0013 when a function item is returned.
* 'number-parser' is invoked for parse-json numeric tokens; multi-item results raise XPTY0004; arity is no longer pre-validated, matching PR975 fixtures.
* 'escape' and 'fallback' together raise FOJS0005 (spec §22.3.2).
* In XQuery 4.0 mode, unknown options (including 'spec' and 'validate') raise XPTY0004; in 3.1 mode they are silently accepted.
* Strings with characters not allowed in XML are normalised with U+FFFD in parse-json output when no fallback is supplied.
Test results (QT4 fn-parse-json on v2/xq4-core-functions):
before: 142/188 (75.5%), 46 failures
after : 182/188 (96.8%), 6 failures (Phase 2 gate ≥80% / ≤30)
The remaining 6 failures are out of scope for parse-json: a Jackson leading-zero edge case (716), a parser closure-capture bug (731), JSON serializer of magic-null QName (746/747), and an error-code QName formatting mismatch (943).
No regressions in fn-json-to-xml, fn-json-doc, or fn-xml-to-json on this branch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
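The use-first duplicates policy described above can be sketched as follows. This is not the JSON.java implementation: the members are modelled as a list of key/value entries, values are assumed non-null, and all names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DuplicateKeys {
    /**
     * Builds an object map where the first occurrence of a key wins and
     * later duplicates are ignored. Assumes non-null values.
     */
    static <V> Map<String, V> useFirst(final List<Map.Entry<String, V>> members) {
        final Map<String, V> result = new LinkedHashMap<>();
        for (final Map.Entry<String, V> member : members) {
            // putIfAbsent keeps the existing value when the key was already seen
            result.putIfAbsent(member.getKey(), member.getValue());
        }
        return result;
    }
}
```

Given members `a=1, b=2, a=3`, the resulting map holds `a=1` and `b=2`, in that order.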
…espaces, namespace-prefixes (attrs), timezones, map-order
XQ4 PR320/PR1855 introduced an options-map third argument to
fn:deep-equal. The DeepEqualOptions parser already recognized the
option keys, but several were stored as flags without ever being
consulted by the comparison engine. This commit wires them up:
- base-uri: compare element base URIs (xml:base / inherited)
- in-scope-namespaces: compare in-scope NS bindings as sets,
walking ancestor xmlns declarations
- namespace-prefixes: extend prefix check to attributes (was only
applied to elements). Use a fallback that parses the prefix from
nodeName when getPrefix() returns null/empty
- timezones: when set, two date/time atomics with different
explicit timezones (or one missing) are not deep-equal even if
they represent the same instant
- map-order: iterate both maps in their recorded insertion order
(PR1703) and compare keys position-wise
Two collateral fixes:
- items-equal: drop the eager arity check at parse time. The spec
permits any function-typed value (the test deep-equal-40-items-equal-004
passes true#0 because length-mismatched sequences must return false
before invoking the callback). Arity is now validated lazily.
- Unsupported collation in the 3-arg variant now raises FOCH0002
(the runtime error) rather than letting XQST0076 leak through
from getCollator. Spec accepts FOCH0002 here.
Also fix deep-equal-options-test.xq: a library module cannot
declare local:helper (XQST0048) — renamed to det:helper.
XQTS QT4 fn-deep-equal: 33 -> 26 F+E (9 tests fixed: base-uri-003,
in-scope-namespaces-003, timezones-003, items-equal-{004,005,006,008},
normalize-unicode-004, map-order-003; 2 regressions in
whitespace-{009,031} that need follow-up).
JUnit DeepEqualTest: 63/63 pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Commit 7758007 accidentally removed the compareAttributes(a, b) call from compareElements when adding the new base-uri and in-scope-namespaces option blocks. As a result, two elements with different attribute values were treated as deep-equal whenever their names, child contents, and (default-off) base-uri/in-scope options matched.
Restore the call after the option checks. With this fix:
- xquery3.deep-equal-options-test: whitespace-strip-attr and whitespace-normalize-attr-different (assertFalse on differing attribute values) pass again
- XQTS QT4 fn-deep-equal: 33 -> 26 F+E, no regressions vs baseline, with namespace-prefixes-004 now also passing (the attribute prefix check from the parent commit was unreachable until now)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…odes
* TryCatchExpression: bind XQuery 4.0 $err:map (PR493) and $err:stack-trace
(PR1470/PR1599) inside catch clauses. err:map is a map(xs:string, item()*)
containing all standard error properties (code, description, value, module,
line-number, column-number, additional, stack-trace).
* UntypedValueCheck: raise XPTY0117 when implicitly coercing xs:untypedAtomic
to a namespace-sensitive target type (xs:QName, xs:NOTATION) during
function-call argument coercion (XPath F&O 3.1 §19.1).
* CastExpression: allow xs:untypedAtomic as a source for cast as xs:QName
(XQ30+ relaxed the rule — lexical errors raise FORG0001 via QNameValue).
* IntegerValue: revert XQ4 hex/binary/underscore parsing for runtime
string-to-integer casts. Those literal extensions belong only to the parser
token path (XQueryTree.g uses the BigInteger constructor) — applying them
to xs:integer("0x0") accepted invalid lexical forms.
Reduces QT4 prod-TryCatchExpr failures from 34 to ~28 (covers $err:map and
$err:stack-trace tests). Net improvement on prod-CastExpr from XPTY0117
alignment and untypedAtomic→QName cast.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address Phase 2 conformance gaps in two related date/time function sets.
fn:parse-ietf-date (76.2% -> 100%, 105/105):
- Default missing timezone to UTC (per spec note: no timezone == GMT).
- Time parser no longer greedily consumes year digits as a timezone in
asctime form (e.g. "Aug 20 19:36 2014").
- Timezone offset parser handles 1, 2, 3, 4-digit forms ("-5", "-05",
"-500", "-0500"), trailing colons ("-05:"), and ":mm" with no minutes
("Feb-02 02:02-02: 02"); no longer eats trailing whitespace+content.
- Recognise lowercase TZ names ("gmt", "utc"); recognise lowercase day
and month names already worked.
- Optional "(TZNAME)" comment after offsets is parsed and validated;
empty parens / unknown TZ in parens raise FORG0010.
- Day-name without trailing whitespace ("Wed,20 Aug ...") now errors.
- dsep requires at least whitespace or hyphen; "20Aug" / "Aug2014" now
error as expected.
- Handle 24:00 without seconds as midnight at end of day.
- "." with no fractional digits now errors (errs27).
fn:build-dateTime (64.8% -> 95.8%, 68/71):
- Strict field combination validation: empty record, time without all of
hours/minutes/seconds, year+day without month, time fields with
incomplete date components -- all raise FODT0005 with clear messages.
- Numeric coercion: integer fields accept xs:integer or xs:decimal that
is exactly an integer; xs:double, fractional decimals, NaN/Infinity
raise XPTY0004. Untyped/node values are parsed as integers (test
date-without-timezone-from-nodes).
- Seconds field accepts xs:integer / xs:decimal / finite xs:double or
xs:float; rejects NaN/Infinity (XPTY0004).
- Timezone accepts xs:duration too (rejecting year/month parts);
validates +-14:00 range and whole-minute offsets (FODT0006).
- Calendar-day validity (28/29/30/31, leap years) checked up-front and
reported as FODT0006 instead of bubbling up FORG0001 from the lexical
parser. Seconds range 0..<60 also FODT0006.
- Year formatting handles year 0 ("0000") and negative years ("-0001")
per XSD 1.1 representation.
- When a timezone is supplied with a full dateTime, return
xs:dateTimeStamp instead of xs:dateTime.
XQTS QT4 deltas (antlr parser):
| Set | Before | After |
| ------------------ | -------------------- | -------------------- |
| fn-parse-ietf-date | 80/105 (76.2%) | 105/105 (100%) |
| fn-build-dateTime | 46/71 (64.8%) | 68/71 (95.8%) |
Remaining build-dateTime failures: year-zero formatting (XSD 1.0
javax.xml.datatype rejection) and one XPST0017 case that depends on
removing the xq31 2-arg overload.
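Two of the fn:build-dateTime checks listed above (calendar-day validity and the timezone range) can be sketched with java.time. These types stand in for eXist's own value classes, the names are hypothetical, and the FODT0006 error raising is left out.

```java
import java.time.Duration;
import java.time.YearMonth;

final class DateTimeFieldChecks {
    /** True when the day exists in the given ISO (proleptic Gregorian) year and month. */
    static boolean isValidDay(final int year, final int month, final int day) {
        if (month < 1 || month > 12 || day < 1) {
            return false;
        }
        // lengthOfMonth() accounts for leap years (28/29/30/31)
        return day <= YearMonth.of(year, month).lengthOfMonth();
    }

    /** True when the offset lies within +-14:00 and is a whole number of minutes. */
    static boolean isValidTimezone(final Duration offset) {
        return offset.abs().compareTo(Duration.ofHours(14)) <= 0
                && offset.getNano() == 0
                && offset.getSeconds() % 60 == 0;
    }
}
```

Checking these up front is what lets the function report a specific error instead of letting a lexical-parser failure surface later.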
…4 PR1041)
Brings next-v3's map:build implementation up to spec with the work deferred from task7 commit a260f37 (which conflicted too deeply to cherry-pick wholesale).
* BUILD_1/BUILD_2: relaxed key/value parameters from required to optional. Empty sequence selects the spec default.
* build(): null-check args before casting to FunctionReference.
* When key function is empty, default to fn:data#1 semantics (atomize the input item). Required by map-build-117 where the input is element nodes that need atomization.
* When value function is empty, default to fn:identity#1 semantics (use the input item as-is).
* referenceArity() helper (placeholder for future partial-application handling; currently delegates to FunctionSignature.getArgumentCount).
* getBuildDuplicatesHandler() / BuildDuplicatesHandler: the 'duplicates' option may now be a function reference. Arity 1 receives only the existing accumulator (counter pattern); arity 2 receives both existing and incoming values. Required by map-build-119/123/224.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… 4.0 spec
W3C XQuery 4.0 PR197 specifies the keyword-argument names for built-in
functions. eXist's signatures used the legacy 3.1-era names:
fn:json-doc: eXist used $href -> spec says source
fn:parse-json: eXist used $json-text -> spec says value
fn:json-to-xml: eXist used $json-text -> spec says value
These names are visible to callers via XQ4 keyword-argument syntax
(name := value), so they must match the spec for keyword calls to
resolve. Positional-call behavior is unchanged.
Confirmed via QT4 misc-BuiltInKeywords XQTS:
Keywords-fn-json-doc-1 pass (was XPST0017)
Keywords-fn-parse-json-1 pass (was XPST0017)
Keywords-fn-json-to-xml-1 still fails on a separate parser-level
issue in its instance-of clause
(document-node(fn:*) wildcard).
Net misc-BuiltInKeywords: 83 -> 79 F+E (72.0% -> 72.8%). The remaining
79 failures need a per-function W3C-4.0 signature audit and new XQ4
type-syntax parser features (record types, local union types, document-
node element wildcards), tracked under separate phase2-* taskings.
Addresses misc-Subtyping XQTS gaps. misc-Subtyping QT4 fail count drops from 45 to 26 (153 tests, 41 skipped, 112 active; pass rate 75%).
Parser changes (XQuery.g, XQueryTree.g):
- recordFieldDecl: suppress the optional QUESTION token from the AST so the tree walker no longer sees a stray '?' node after a record field with no type clause.
- Tree walker: allow xs:error in the atomic-type position. xs:error is defined as a builtin under ANY_SIMPLE_TYPE; per XQuery 4.0 it is a legitimate sequence type (its value space is empty, so xs:error* matches only the empty sequence).
- documentTest: accept document-node(*) as XQuery 4.0 short form for document-node(element(*)).
SequenceType subtype rules (SequenceType.java#isSubtypeOf):
- Element/attribute kind tests now compare nodeName: when sup names a specific element/attribute, sub must name the same one. Previously element(*) was reported as a subtype of element(a).
- Records now subtype-check structurally on declared fields per XQuery 4.0 Records: required fields of sup must exist (and be required) in sub with sub's field type subtype-of sup's, and sub may not declare extra fields unless sup is extensible.
- map(K, V) subtype check now allows records (RECORD <= MAP_ITEM) to flow through, treating an untyped record as map(xs:string, item()*).
- function-shape conversion for maps and arrays now uses the declared K/V types: map(K, V) acts as function(xs:anyAtomicType) as V? (atomic-coerced key per XQ4 PR1501; lookup miss widens cardinality); array(T) acts as function(xs:integer) as T. Records flow through the map branch via Type.subTypeOf(sub.primaryType, MAP_ITEM).
XQTS misc-Subtyping (QT4):
- Before: 45 fails, 41 skipped, 67 passes (59.8%)
- After: 26 fails, 41 skipped, 86 passes (76.8%)
Remaining failures are XQuery 4.0 features outside this change set:
- element/attribute multi-name tests (element(a|b)) and namespace wildcards (element(p1:*), element(*:a)) -- 14 tests
- element-type tracking (element(a, xs:integer) covariance) -- 4 tests
- gnode() as supertype of node()|jnode() -- 3 tests
- document-node(QName) bare short form, document-node(*) vs () -- 2
- union-type widening (xs:long|xs:int subtype of xs:integer) -- 3
Reduce QT4 prod-WindowClause failures from 35 to 11 and prod-LetClause from 37 to 27 by completing the parser/runtime support that the tree walker side already had:
* Grammar: make WindowStartCondition and WindowEndCondition individually optional in windowClause, and make the "when ExprSingle" guard optional inside each (XQ4 PR483). Sliding windows still require an end clause; that constraint is enforced in the tree walker.
* WindowCondition / WindowExpr: tolerate a null whenExpression on either the start or the end condition. A missing "when" defaults to true() during analyze() and eval(), and toString() omits the "when ..." fragment entirely so dump output stays readable.
* LetExpr: when the variable has an explicit atomic SequenceType and XQuery version >= 4.0, run a function-conversion pass over the bound value before the body executes (XQ4 PR1131). New coerceAtomicSequence() casts each item to the declared type via atomize().convertTo(), promoting xs:integer/decimal/float to xs:double, casting xs:untypedAtomic and xs:anyURI to the target atomic type, and falling back to the existing XPTY0004 path if any item cannot be converted.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nCall
Bring QT4 prod-DynamicFunctionCall from 49.3% to 81.6% (40 -> 14 F+E) by implementing function-coercion semantics for record-typed parameters (XQ4 PR1132/PR1501) and adding W3C Schema list-type cast support.
CastExpression: handle xs:NMTOKENS, xs:IDREFS, xs:ENTITIES by splitting the source string on whitespace and producing a sequence of items typed as the corresponding atomic item type. Previously a cast to any list type fell through to StringValue.convertTo's default branch and threw XPTY0004.
DynamicTypeCheck: factor the per-item function-coercion logic into a public static helper coerceAtomicItem so other code paths can reuse it without going through an Expression wrapper.
UserDefinedFunction: before validating a record-typed parameter, walk the map's declared fields and apply function coercion to each value: atomize node/array values, cast untypedAtomic to the declared type, apply numeric promotion and XQ4 implicit casting/relabeling, and try each alternative in choice (union) field types in declaration order. Nested record types recurse. The coerced map is then bound to the parameter so the function body sees the typed values.
SequenceType.checkType(Sequence): also iterate items for record types and structurally-typed maps/arrays. The previous primary-type subtype shortcut was unsound for these (a value of type map(*) -- a parent of RECORD in the type hierarchy -- would erroneously satisfy a record type).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
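The list-type cast described for CastExpression amounts to tokenizing on whitespace. The sketch below shows that step only, returning plain strings; typing each token as xs:NMTOKEN, xs:IDREF or xs:ENTITY is omitted, and the names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class ListTypeCast {
    /**
     * Splits a list-type lexical value on XML whitespace (space, tab, CR, LF)
     * and drops the empty token produced by leading whitespace or empty input.
     */
    static List<String> splitListValue(final String lexical) {
        return Arrays.stream(lexical.split("[ \\t\\r\\n]+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }
}
```

For instance, `"  a b\tc\n"` yields the three tokens `a`, `b`, `c`, each of which would then be wrapped as an item of the list's item type.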
Force-pushed from 2041845 to 48b7823
…anceof patterns
Address reinhapa's review feedback on PR eXist-db#6218:
- Convert if/else chains and switch statements to switch expressions with arrow syntax (CsvParser, FnBuildUri, FnCollation, FnHash, FnOp, FnElementToMap)
- Use instanceof with cast variable pattern matching where applicable (FnElementToMapPlan, FnElementToMap, FnGet, FnHighestLowest, FnInScopeNamespaces)
Also reconstruct TryCatchExpression after a botched cherry-pick of [feature] XQ4 try/catch err:map + err:stack-trace had left the file non-compiling: restore the parent state and re-add only the additive parts (QN_STACK_TRACE, QN_MAP, addStackTrace, addErrMap helpers, plus the calls and bindings that wire them in).
Add ErrorCodes for XPTY0117, FODT0005, FODT0006 referenced from UntypedValueCheck and FnDateTimeParts on this branch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
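For readers unfamiliar with the two idioms named in this commit, here is a small sketch (Java 16 or later). The method name isDefaultPort appears in the reviewed code, but the scheme/port table below is an assumption, and lengthOf is a made-up example rather than code from the PR.

```java
import java.util.Collection;

final class ModernJavaPatterns {
    /** Switch expression with arrow syntax; the scheme/port pairs are assumed. */
    static boolean isDefaultPort(final String scheme, final int port) {
        return switch (scheme) {
            case "http", "ws" -> port == 80;
            case "https", "wss" -> port == 443;
            case "ftp" -> port == 21;
            default -> false;
        };
    }

    /** instanceof pattern matching binds the cast variable in the condition itself. */
    static int lengthOf(final Object value) {
        if (value instanceof String s) {
            return s.length();
        } else if (value instanceof Collection<?> c) {
            return c.size();
        }
        return 0;
    }
}
```

Compared with a classic switch statement, the expression form has no fall-through and must cover every case, which is why the `default` arm is required here.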
[This response was co-authored with Claude Code. -Joe] @reinhapa Thanks for the thorough review — all addressed in Switch expressions (~14 sites):
Codacy run locally — the remaining warnings are all pre-existing on this branch (NPath on big methods, field-ordering in
…mpilation
Revert three cherry-picks that introduced RecordType, FieldAccessor, WhileClause, and isRecordType/isChoiceType references which only exist on the next-v3 integration branch. The v2/xq4-core-functions branch must compile standalone without the integration branch.
Reverted commits (still present on feature/post-90-fixes):
- 48b7823 XQ4 record coercion + list-type cast
- 345ff8c XQuery 4.0 misc-Subtyping parser + SequenceType rules
- 393b8c1 XQ4 optional window clauses + let coercion
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CsvParser: move CsvConverter inner interface after field declarations
- FnInScopeNamespaces: collapse nested if into single condition
- FnBuildUri: collapse nested if for default-port omission
- FnElementToMapPlan: drop unused elemKey parameter; replace nested for/break with stream allMatch + helper; collapse nested if in detectAggregateType; getElementKey is now static
- WindowExpr: collapse nested window-end if and reformat block
- DynamicTypeCheck: introduce local 'current' variable to avoid reassigning 'item' parameter
- TryCatchExpression: drop unused Throwable parameter from addErrAdditional; remove unused getStackTrace method and now-unused IOException/PrintWriter/StringWriter imports
- SequenceType: collapse three nested if pairs in checkType / checkFunctionType; pattern-match FunctionReference inline

No behavior changes: pure refactoring to clear PMD warnings flagged by Codacy on the PR.
Extract helpers from large eval() / parse() bodies on PR eXist-db#6218 to reduce NPath complexity and improve readability. No behavior changes intended.

- FnBuildUri.eval: extract parseOptions, isHierarchical, appendScheme, appendAuthority, appendPath, appendQuery, appendFragment, plus the per-field helpers effectiveUserinfo / effectivePort / buildPathFromSegments / buildQueryFromParams. NPath was 24,192,000.
- LetExpr.eval: extract preCoerceMapOrArray, validateSequenceType, plus the cardinality / non-node / node-binding sub-validators and applyNodeNameForError. NPath was 417,845.
- CsvParser.parse: extract per-state handlers (handleFieldStart / handleUnquoted / handleQuoted / handleAfterQuoted), endRow, finalizeRecords, trimAndNormalize, emit, plus a ParseState holder for shared mutable state. NPath was 26,880.
- FnHash.eval: extract resolveAlgorithm, computeHash dispatcher, computeCrc32 / computeBlake3 / computeMessageDigest, toHexString. NPath was 276.
- CastExpression.eval: extract startProfiler, validateRequiredType, castSequence, castToQName. NPath was 2,000.
- TryCatchExpression.addErrMap: extract buildDescription, errorValueOf, moduleOf, lineNumberOf, columnNumberOf, stackTraceOf. NPath was 2,916.
- FnHighestLowest.eval: extract resolveCollator, resolveKeyRef, computeKeys, computeKey, findExtremeKey, collectMatching. NPath was 34,560. Also drop unused NumericValue import.
- SequenceType.checkFunctionType: extract checkFunctionParamSubsumption. NPath was 240.

WindowExpr.eval (NPath 6,360,900) is intentionally NOT decomposed in this commit. Its eval body threads heavily mutable state through a loop (window, windowStartIdx, multiple LocalVariable marks / WindowContextVariables refs, previousItem) and even rewinds the loop counter `i` for sliding windows, with sufficient interplay that any further extraction would require either threading a state record or mutable-ref wrappers. The risk of subtle behavior changes outweighs the readability gain. Deferred to a follow-up task with dedicated window-clause coverage.
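The FnHash decomposition (a computeHash dispatcher delegating to per-algorithm helpers plus a toHexString formatter) might look roughly like the sketch below. Only the helper names come from the commit message; the signatures, the algorithm spellings, and the error handling are assumptions, and BLAKE3 is left out because the JDK does not ship it.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;
import java.util.zip.CRC32;

// Sketch under stated assumptions; not the PR's FnHash implementation.
final class HashSketch {

    // Dispatcher: route CRC-32 to its own helper, everything else to the JDK.
    static String computeHash(final String algorithm, final byte[] data) {
        return switch (algorithm.toUpperCase(Locale.ROOT)) {
            case "CRC-32" -> computeCrc32(data);
            default -> computeMessageDigest(algorithm, data);
        };
    }

    // CRC-32 checksum rendered as eight lowercase hex digits.
    static String computeCrc32(final byte[] data) {
        final CRC32 crc = new CRC32();
        crc.update(data);
        return String.format("%08x", crc.getValue());
    }

    // Any algorithm the installed JDK providers support (MD5, SHA-256, ...).
    static String computeMessageDigest(final String algorithm, final byte[] data) {
        try {
            return toHexString(MessageDigest.getInstance(algorithm).digest(data));
        } catch (final NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported hash algorithm: " + algorithm, e);
        }
    }

    // Lowercase hex encoding, two digits per byte.
    static String toHexString(final byte[] bytes) {
        final StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (final byte b : bytes) {
            hex.append(Character.forDigit((b >> 4) & 0xF, 16));
            hex.append(Character.forDigit(b & 0xF, 16));
        }
        return hex.toString();
    }
}
```

Splitting the work this way keeps each method's branching small, which is the property the NPath metric measures.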
…LetExpr profiler

Address remaining warnings from re-running Codacy after the initial NPath decomposition pass on PR eXist-db#6218:
- SequenceType.cardinalitySubsumes: collapse the trailing ONE_OR_MORE if-return + return-false into a single boolean expression (SimplifyBooleanReturns at the closing branch).
- FnElementToMapPlan.hasSignificantAttributes: extract the per-attribute filter into isSignificantAttribute so the loop body no longer ends with `return true;` (AvoidBranchingStatementAsLastInLoop).
- LetExpr.eval: extract evalLet, finalizeResult, and startProfiler to bring the top-level method's NPath under the 200 threshold (was 405 after the previous pass). finalizeResult is final / no-reassign to avoid AvoidReassigningParameters.

Remaining NPath warnings on the changed files (FnElementToMapPlan.analyzeInstances, .detectAggregateType, SequenceType.checkType, .isSubtypeOf) are pre-existing and outside the scope of this Codacy cleanup task. WindowExpr.eval remains deferred per the previous commit.
…ement-to-map-plan
Phase 2.19 conformance push for four XQ4 functions whose XQTS scores
were below the ≥80% / ≤30 F+E gate.
map:entries — return singleton maps per entry (17/17 = 100%, was 35.2%)
Each output map now contains a single key-value pair from the input,
matching the XPath/XQuery 4.0 spec. Previously each output was a
two-field map {"key": K, "value": V}, which broke map:keys/map:items
consumers.
fn:parts-of-dateTime — return RecordMapType with DATETIME_RECORD type
(17/18 = 94.4%, was 31.5%). The result now reports as
fn:dateTime-record so AssertType succeeds. Sole remaining failure
is round-trip-altered, blocked by an unrelated bug in
fn:adjust-dateTime-to-timezone.
fn:siblings — type checking, JNode dispatch, empty-context handling
(8/18 = 44.4%, was 44.4%). The function now: rejects non-node items
with XPTY0004 instead of ClassCastException; handles JNode (JSON
node) sibling navigation; permits an empty context as the empty
sequence (XPDY0002 is reserved for an absent context). Refactored
eval() into resolveInput / siblingsOf / xmlSiblings helpers to keep
NPath complexity under threshold. The remaining failures (101–106,
010, 011, 012, 013) require infrastructure changes outside the
function: JSON path expressions returning JNodes, fn:deep-equal
JNode support, the namespace:: axis parser, namespace-node string
serialization, and thin-arrow operator semantics on empty input.
fn:element-to-map-plan — suppress empty @id attribute entry
(20/21 = 95.2%, was 61.9%). When a collected attribute has no
detectable type (plain string), it no longer contributes a spurious
{"@id": map{}} entry to the plan. Sole remaining failure is test 700,
blocked by an unrelated parser issue with inline-function syntax.
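The map:entries change described in this commit (one singleton map per entry instead of a {"key": K, "value": V} record) can be sketched with plain Java collections. This is an analogy only: the class name is hypothetical and java.util.Map stands in for eXist-db's map type.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Analogy using java.util collections; not eXist-db's map implementation.
final class MapEntriesSketch {

    // Return one single-entry map per key-value pair of the input,
    // preserving the input's iteration order.
    static <K, V> List<Map<K, V>> entries(final Map<K, V> input) {
        final List<Map<K, V>> result = new ArrayList<>(input.size());
        for (final Map.Entry<K, V> entry : input.entrySet()) {
            result.add(Collections.singletonMap(entry.getKey(), entry.getValue()));
        }
        return result;
    }
}
```

Because each result is itself a map holding the original key, consumers that call the equivalent of map:keys or map:items on it see the original key and value rather than the literal strings "key" and "value".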
…ings from RecordType + JNode

These three XQ4 functions held references to RecordMapType (record type infrastructure) and JNode (JSON node infrastructure) that exist only on the next-v3 integration branch. That made the commits un-cherry-pickable to v2/xq4-core-functions, which must compile standalone on top of develop with zero dependencies on unmerged feature PRs.

The fix replaces RecordMapType with plain MapType (a record is structurally a typed map; the same key/value access pattern works on both) and removes the JNode-specific siblings dispatch entirely.

FnDateTimeParts.partsOfDateTime: return MapType instead of RecordMapType. The signature already declared Type.MAP_ITEM as the return type, so call sites that access $result?year are unchanged. Drop the now-unused DATETIME_RECORD_FIELD_ORDER constant and the BigInteger / List imports.

FnDateTimeRecord.eval: same RecordMapType -> MapType swap. Update the function signature return type from Type.DATETIME_RECORD (named record type, only on next-v3) to Type.MAP_ITEM. The 8-arity zero-padding loop keeps FIELD_ORDER for arg-index-to-key mapping.

FnSiblings: drop the JNode import, the jnodeSiblings() helper, and the JNode dispatch in siblingsOf(). XQ4 fn:siblings is defined over XML nodes; JNode sibling navigation is a next-v3-only extension that belongs on the JNode feature branch.

A future commit can restore RecordType/JNode-aware variants once that infrastructure lands on develop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The parenthesized timezone comment after a numeric offset (e.g. "+0100 (CET)") is informational per W3C XPath F&O. The parser was validating the comment text against a small timezone abbreviation map that only contained US timezones (EST, CST, etc.), causing European abbreviations like CET to be rejected with FORG0010. Skip the comment content without validation — the timezone was already parsed from the numeric offset. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
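The fix above amounts to stepping over a trailing parenthesized comment without inspecting its content. A minimal sketch follows, assuming the numeric offset has already been consumed and that an unterminated comment leaves the position unchanged (the real parser may instead raise an error). The class and method names are hypothetical.

```java
// Hypothetical helper; not the PR's parse-ietf-date implementation.
final class TimezoneCommentSketch {

    // Starting at pos (just after the numeric offset), skip optional
    // whitespace and a "(...)" comment. Returns the index after the
    // closing parenthesis, or pos unchanged if no complete comment follows.
    static int skipComment(final String input, final int pos) {
        int i = pos;
        while (i < input.length() && Character.isWhitespace(input.charAt(i))) {
            i++;
        }
        if (i >= input.length() || input.charAt(i) != '(') {
            return pos; // no comment present
        }
        final int close = input.indexOf(')', i + 1);
        // The comment text between the parentheses is never validated.
        return close < 0 ? pos : close + 1;
    }
}
```

With this shape, "+0100 (CET)" and "+0100 (anything)" are treated identically, since the offset alone determines the timezone.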
…N errors

Address eleven test failures on PR eXist-db#6218 (in addition to the parse-ietf-date CET cherry-pick already applied):

ArrayTests.for-each-pair1 / filter1 / filter3 / for-each-pair2
- Use signature arity primarily but fall back to the bound call's argument count when the signature is variadic, so concat#2 reports 2 instead of -1. Inline functions and fixed-arity refs are unaffected.

XQueryFunctionsTest.currentDateTime
- *-from-dateTime accessors now accept any subtype of xs:dateTime (notably xs:dateTimeStamp, the type returned by current-dateTime), not only the exact xs:dateTime type.

Map key-ordering tests (maps.xqm, custom-assertion.xqm)
- Tests asserted the old hash-based key order. Updated assertions to match the spec-mandated insertion order, including the JSON serialization in custom-assertion.xqm.
- mt:wrongCardinality now expects XPTY0004 (the W3C-correct error code for a key expression yielding a non-singleton sequence) instead of the legacy EXMPDY001.

ArrayTests.parse-json
- Test relied on use-last as the default for duplicate keys; the W3C default is use-first, so the assertion now matches.

SecurityManagerTests.id.from-load-module
- fn:load-xquery-module now accepts a module whose declared XQuery version is <= the requested version (XQuery is backward compatible), rather than requiring exact equality.

XQuery3Tests.json-to-xml-error-2
- Boolean-typed JSON options that arrive as a string now raise FOJS0005 (consistent with the existing duplicates-value path) instead of XPTY0004.

XQuery3Tests.replace.empty-match-allowed
- In XQuery 4.0 mode, fn:replace no longer raises FORX0003 when the input string is empty even without the ! flag, matching the QT4 test expectations.

FunUnparsedTextTest.unparsedTextLines_noDataStream
- A dynamic text resource backed by a null InputStream now surfaces as XPathException(FOUT1170) rather than NullPointerException.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…R197

Audit and fix FunctionSignature definitions against the QT4CG XQuery 4.0 spec so partial-application instance-of tests succeed. The QT4 keyword test catalog (misc-BuiltInKeywords) checks that fn:foo(arg := ?) instance of function(...) as ... matches the signature declared by W3C; before this commit eXist's signatures diverged in parameter names, cardinalities, and return types.

Common patterns fixed:
- collation parameter widened from xs:string to xs:string? (fn:contains, fn:ends-with, fn:starts-with, fn:compare, fn:distinct-values, fn:index-of, fn:substring-before, fn:substring-after, fn:collation-key, fn:string-join's separator, array:index-of)
- options map parameter widened to map(*)? (fn:doc, fn:doc-available, fn:csv-to-arrays, fn:parse-csv, fn:csv-to-xml, fn:csv-doc, fn:parse-xml, fn:parse-xml-fragment, fn:path)
- length-style optional trailing parameter widened to ? (fn:substring's length, fn:subsequence's length, array:subarray's length, array:build's action)
- start type widened to xs:numeric for fn:subsequence
- base param of fn:resolve-uri is xs:string?
- fn:seconds value is xs:decimal?
- fn:unix-dateTime value is xs:nonNegativeInteger?, returns dateTimeStamp
- typed function/array/record/map returns:
  - fn:op returns fn(item()*, item()*) as item()*
  - fn:invisible-xml returns fn(xs:string) as item()
  - fn:analyze-string returns element(fn:analyze-string-result)
  - fn:element-to-map returns map(xs:string, item()?)?
  - fn:function-annotations returns map(xs:QName, xs:anyAtomicType*)*
  - fn:csv-to-arrays returns array(xs:string)*
  - fn:divide-decimals returns record(quotient/remainder)
  - fn:in-scope-namespaces returns map(xs:NCName, xs:anyURI)
  - fn:transitive-closure returns node()*
  - array:members returns record(value as item()*)*
  - fn:transform returns map(*)
  - fn:collation accepts map(*) and returns xs:string

Type-checker support added:
- SequenceType.isSubtypeOf now handles a choice (union) type on the sub side: every alternative must be a subtype of the supertype. This unblocks date/time accessors (fn:year-from-dateTime etc.) and fn:char where the spec types are union types but eXist uses a single broader primary type.
- Bare map(*) is treated as map(xs:anyAtomicType, item()*) in subtype checks (was xs:string, which contradicts the W3C spec); records flow through with an xs:string key fallback because record keys are always strings.
- Bare array(*) is now treated as array(item()*); array(*) is a subtype of array(item()*).

Implementation tweaks accompanying the signature changes:
- CollatingFunction.getCollator handles an empty xs:string? collation argument by returning the default collator.
- FunSubstring, FunSubSequence, FunAnalyzeString, ArrayFunction handle an empty optional-length argument by behaving as the no-length form.
- ArrayBuild handles an empty action argument by returning the input unchanged (identity-like).
- FunResolveURI handles an empty base-URI argument by falling back to the static base URI.

XQTS misc-BuiltInKeywords pass rate: 72.9% -> 89.5% (76 -> 31 fails). Remaining failures are net-new XQ4 functions (j-tree, j-key, j-value, j-position, system-properties, build-dateTime, etc.), unrecognised record types (fn:dateTime-record, fn:parsed-csv-structure-record, etc.), parser support for record/document-node wildcards in instance-of expressions, and a fn:matches partial-application internal error; all are out of scope for this signature audit.

Other affected XQTS sets (no regressions, only improvements): fn-collation-key 10 -> 7, fn-string-join 2 -> 0, fn-resolve-uri 3 -> 1, fn-substring 1 -> 0, fn-substring-before 3 -> 1, fn-substring-after 2 -> 0, misc-Subtyping 26 -> 25.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
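The choice-type rule added to SequenceType.isSubtypeOf (a union on the sub side is a subtype only if every alternative is) can be sketched over a toy type hierarchy. The hierarchy table, the string-based type names, and both method names are assumptions made for illustration; eXist-db's real type model is not shown here.

```java
import java.util.List;
import java.util.Map;

// Toy model of the rule; not eXist-db's SequenceType.
final class ChoiceSubtypeSketch {

    // Child -> parent links for a small, assumed slice of the atomic hierarchy.
    private static final Map<String, String> PARENT = Map.of(
            "xs:dateTimeStamp", "xs:dateTime",
            "xs:dateTime", "xs:anyAtomicType",
            "xs:date", "xs:anyAtomicType",
            "xs:integer", "xs:decimal",
            "xs:decimal", "xs:anyAtomicType");

    // Walk up the parent chain from sub until sup is found or the chain ends.
    static boolean isSubtype(final String sub, final String sup) {
        for (String type = sub; type != null; type = PARENT.get(type)) {
            if (type.equals(sup)) {
                return true;
            }
        }
        return false;
    }

    // A choice type is a subtype of sup only if every alternative is.
    static boolean choiceIsSubtype(final List<String> alternatives, final String sup) {
        return alternatives.stream().allMatch(alt -> isSubtype(alt, sup));
    }
}
```

Requiring all alternatives (rather than any one) is what keeps the check sound: a value of the choice type may be an instance of whichever alternative fits worst.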
The fn signature audit (commit 13ffc8f / W3C PR197) tightened fn:unix-dateTime's value parameter to xs:nonNegativeInteger?. Update the unixDateTime-epoch and unixDateTime-oneSecond tests to pass xs:nonNegativeInteger literals so they don't fail static type checks. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous version-mismatch test loaded a 3.1 module from a 4.0 caller and expected FOQM0003. After the load-xquery-module fix that allows older modules (XQuery is backward compatible), that scenario now succeeds — which is the more useful behavior for module reuse. Update the failing-version-mismatch test to use the explicit xquery-version option requesting 3.1 against a 4.0 module, which is a genuine mismatch that still raises FOQM0003. Also add a positive test documenting that an older module loads cleanly from a newer caller. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
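The relaxed version check behind these tests can be sketched as follows. The integer encoding (10, 30, 31, 40) mirrors the version codes visible in the diff for this PR; the class and method names, and the idea of comparing the two codes directly, are illustrative assumptions rather than the actual fn:load-xquery-module code.

```java
// Sketch only; names are hypothetical.
final class ModuleVersionSketch {

    // Map a declared version string to the integer code used for comparison.
    static int toVersionCode(final String version) {
        return switch (version) {
            case "1.0" -> 10;
            case "3.0" -> 30;
            case "3.1" -> 31;
            case "4.0" -> 40;
            default -> throw new IllegalArgumentException(
                    "Wrong XQuery version: require 1.0, 3.0, 3.1, or 4.0");
        };
    }

    // A module is loadable when its declared version does not exceed the
    // requested one: older modules load from newer callers, not vice versa.
    static boolean isLoadable(final String moduleVersion, final String requestedVersion) {
        return toVersionCode(moduleVersion) <= toVersionCode(requestedVersion);
    }
}
```

Under this rule a 3.1 module requested as 4.0 loads, while a 4.0 module requested as 3.1 is the genuine mismatch that would still raise FOQM0003.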
Per PMD's OneDeclarationPerLine rule. Trivial follow-up to the fn signature audit (commit 13ffc8f). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
[This response was co-authored with Claude Code. -Joe] Per the 2026-05-10 v2/* extraction audit, 10 of this branch's 67 commits are XQuery 3.1-mandatory and don't depend on the XQ 4.0 parser flag or core-function infrastructure. Extracted to a new PR against The extraction picked up the regex-validation chain, The remaining 57 commits — 50+ new XQ 4.0 functions, record types, Three commits were classified as DROP-already-on-develop or no-op (the XQ4 duration-ordering gate, the deep-equal Related: the paused |
Summary
82 new or updated functions across fn:, array:, map:, and math: namespaces for XQuery 4.0. Includes fn:replace empty-match version gating (XQ4 behavior only in 4.0 mode).
Additionally includes XQTS QT4 compliance fixes from 6 focused sessions targeting specific failure categories, plus a comprehensive fn: signature audit against W3C XQuery 4.0 PR197.
What Changed
Core XQ4 functions (original)
XQTS compliance fixes
- fn:collation(map) overload
- fn:element-to-map and fn:element-to-map-plan

fn: Signature Audit (W3C XQuery 4.0 PR197)
Audit and fix FunctionSignature definitions across org.exist.xquery.functions.fn and .array against the W3C XQuery 4.0 spec. The QT4 keyword test set (misc-BuiltInKeywords) checks that fn:foo(arg := ?) instance of function(...) as ... matches the function-item signature W3C declares.

Signature changes:
- Collation parameter widened from xs:string (exactly-one) to xs:string?: fn:contains, fn:ends-with, fn:starts-with, fn:compare, fn:distinct-values, fn:index-of, fn:substring-before, fn:substring-after, fn:collation-key, fn:string-join's separator, array:index-of.
- Options map parameter widened to map(*)?: fn:doc, fn:doc-available, fn:csv-to-arrays, fn:parse-csv, fn:csv-to-xml, fn:csv-doc, fn:parse-xml, fn:parse-xml-fragment, fn:path.
- Optional trailing parameter widened to ?: fn:substring length, fn:subsequence length (also start/length type now xs:numeric), array:subarray length, array:build action.
- Typed function/array/record/map returns: fn:op, fn:invisible-xml, fn:analyze-string, fn:element-to-map, fn:function-annotations, fn:csv-to-arrays, fn:divide-decimals, fn:in-scope-namespaces, fn:transitive-closure, array:members, fn:transform, fn:collation.

Type-checker support (SequenceType.isSubtypeOf):
- A choice (union) type on the sub side: every alternative must be a subtype of sup.
- map(*) treated as map(xs:anyAtomicType, item()*) in subtype checks (was map(xs:string, item()*), which contradicts XQ 4.0).
- array(*) now treated as array(item()*).

Implementation adjustments:
- CollatingFunction.getCollator treats empty xs:string? collation as default collator.
- FunSubstring, FunSubSequence, FunAnalyzeString, ArrayFunction.subArray handle empty optional-length as no-length.
- ArrayBuild treats empty action as identity.
- FunResolveURI falls back to static base URI when explicit base is empty.

XQTS impact from signature audit:
- misc-BuiltInKeywords
- misc-Subtyping
- fn-collation-key
- fn-string-join
- fn-resolve-uri
- fn-substring*

Codacy cleanup
Spec references
- qt4tests/misc/BuiltInKeywords.xml

Test plan
🤖 Generated with Claude Code