[bugfix] XQuery 3.1 mandatory fixes from v2/xq4-core-functions (audit extract #3 subset)#6344

Open
joewiz wants to merge 14 commits into eXist-db:develop from joewiz:extract/xq4-core-functions-3-1-subset

Conversation

@joewiz
Member

@joewiz joewiz commented May 11, 2026

Summary

Extracts the XQuery 3.1-mandatory commits from PR #6218 (v2/xq4-core-functions) for the eXist 7.0 conformance push, per the 2026-05-10 v2/* extraction audit. The remaining 4.0-only commits stay in #6218 for a post-7.0 cycle.

This is a subset extraction — 10 of 67 commits from the source branch were classified as 3.1-mandatory; the rest are 4.0-only or already-on-develop. See 2026-05-10 xq31-extraction-audit-report.md for the audit's framing.

Commits

In chronological source order, plus one develop-adaptation commit:

  1. 00e55a8103 [bugfix] Improve XPath regex compliance: validate patterns and support XQ4 lookaround
  2. 0d96ad8a19 [feature] Improve fn:unparsed-text conformance and add function type checking
  3. 0aadff27a7 [bugfix] Fix function call/ref XQTS failures: reserved names, context, promotions
  4. 4ab512073d [bugfix] Fix variable declaration error codes for XQuery 3.1 compliance
  5. 8dd6e9e765 [feature] Version-aware XPath regex validation with XQ4 extensions
  6. 7a205f980c [bugfix] Fix fn:contains-token collation parameter to accept empty sequence
  7. 7966280a21 [bugfix] fn:load-xquery-module: check loaded module's own version
  8. e7ccdf97af [bugfix] Tighten XPath regex validation to reject more invalid constructs
  9. 0047944fb1 [bugfix] XPath regex: validate back-references and tighten char class grammar
  10. fe6a348703 [test] LoadXQueryModuleContentTest: align with backward-compat semantics
  11. a729f1fa7c [bugfix] Adapt v2/xq4-core-functions extraction for develop's XQ 3.1-only parser

Per-cluster scope

  • fn-unparsed-text family (XPath F&O 3.1 §17.5) — encoding-agnostic dynamic resource lookup, FOUT1170 mapping, XML char validation, function-type checking
  • prod-VarDecl parser tightening — XQST0054 → XQDY0054 (3.1 dynamic), XPTY0004 in VariableImpl
  • Regex validation chain — XPath F&O 3.1 §5.6.1: reject Java/Perl extensions, validate back-references, tighten char-class grammar
  • fn:contains-token — collation parameter accepts empty sequence per spec
  • fn:load-xquery-module — check loaded module's own version (F&O 3.1 §C.1)
  • Function call/ref fixes — reserved-name XPST0003 checks, context preservation for wrapped internal functions, base64Binary↔hexBinary promotion

Excluded (4.0-only — stay in v2/xq4-core-functions)

54 commits covering: 50+ new XQ 4.0 fn: functions, array/map/math 4.0 extensions, record types, numeric literal extensions (0x..., 0b..., _), keyword arguments, lambdas (fn{...}), parse-json XQ 4.0 compliance, misc-Subtyping parser, record coercion, element-to-map, from-dateTime widening, XQ 4.0 deep-equal options, hot map operations, collation(map) UCA, plus refactor/codacy commits piggybacking on the above.

Excluded (already on develop OR no-op for develop)

  • 935ea37cb3 xs:duration ordering version gate — fixes a 4.0-only gate that doesn't exist on develop
  • 927895cb47 Restore fn:deep-equal attribute comparison — develop already has the call (the 'regression' only existed inside v2's DeepEqualOptions refactor)
  • 5ce1a1a365 XQ4 try/catch err:map + cast errors — $err:map/$err:stack-trace are 4.0-only; develop already handles untypedAtomic→QName cast
  • b36833ffaf fn:reverse lazy O(1) — structural conflict (v2 refactored RangeSequence to primitive longs; develop has accumulated its own RangeSequence work). Performance optimization with no XQTS yield, dropped from this extraction.

Plus already-shipped on develop: PR #6328 (fn:min/max), PR #6337 (fn:deep-equal SAX comparator), PR #6207 (prod-CastExpr broad fixes), PR #6331 (Type.subTypeOf), PR #6333 (DurationValue hashCode), PR #6336 (numeric/Boolean hashCode sweep).

XQTS deltas (against W3C XQTS HEAD baseline at 4f09d0accc)

Spot-check on the 16 affected test sets — tests F+E (failures + errors):

| Test set | Baseline F+E | Extract F+E | Newly passing |
|---|---|---|---|
| fn-unparsed-text | 21 | 8 | +13 |
| fn-unparsed-text-available | 9 | 8 | +1 |
| fn-unparsed-text-lines | 23 | 9 | +14 |
| prod-VarDecl | 35 | 25 | +10 |
| prod-VarDecl.external | 27 | 11 | +16 |
| fn-matches | 19 | 2 | +17 |
| fn-matches.re | 7 | 0 | +7 |
| fn-replace | 10 | 0 | +10 |
| fn-tokenize | 21 | 6 | +15 |
| prod-CastExpr | 37 | 17 | +20 |
| prod-CastExpr.derived | 22 | 19 | +3 |
| fn-contains-token | 45 | 3 | +42 |
| fn-load-xquery-module | 0 | 0 | 0 |
| prod-NamedFunctionRef | 53 | 9 | +44 |
| prod-FunctionCall | 63 | 28 | +35 |
| Total | 392 | 145 | +247 |

This far exceeds the audit's 50-80 estimate. The contains-token, named-function-ref, and function-call clusters dominate the gain.
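
The totals row can be cross-checked with a few lines of Java. This is only an arithmetic sketch over the figures reported in the table; the class and array names are illustrative and not part of the PR.

```java
import java.util.stream.IntStream;

final class XqtsDeltaCheck {
    // Baseline and extract F+E counts, in table order
    // (fn-unparsed-text first, prod-FunctionCall last).
    static final int[] BASELINE = {21, 9, 23, 35, 27, 19, 7, 10, 21, 37, 22, 45, 0, 53, 63};
    static final int[] EXTRACT  = {8, 8, 9, 25, 11, 2, 0, 0, 6, 17, 19, 3, 0, 9, 28};

    static int sum(int[] values) {
        return IntStream.of(values).sum();
    }

    /** Newly passing = baseline F+E minus extract F+E, summed over all rows. */
    static int newlyPassing() {
        return sum(BASELINE) - sum(EXTRACT);
    }

    public static void main(String[] args) {
        // prints "392 145 247"
        System.out.println(sum(BASELINE) + " " + sum(EXTRACT) + " " + newlyPassing());
    }
}
```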

Develop-adaptation commit

a729f1fa7c makes two adjustments that are necessary because develop only supports XQuery 1.0/3.0/3.1 (the new XQ 4.0 parser is on v2/new-parser):

  1. FunUnparsedText.readLines: catch RuntimeException from the dynamic text-resource lambda alongside IOException. The new dynamic-resource lookup path triggers any registered ResourceFactory; if a factory throws an unchecked exception, wrap it as FOUT1170 instead of letting it escape. Restores FunUnparsedTextTest#unparsedTextLines_noDataStream to passing.
  2. Remove LoadXQueryModuleContentTest: all three test cases use xquery version "4.0" syntax which develop's parser rejects with XQST0031. The production fix in fn:load-xquery-module ships unchanged; the test cases will return alongside the XQ 4.0 parser landing.
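
The first adjustment amounts to a multi-catch around the resource read. The following is a minimal sketch of that shape, not the actual FunUnparsedText code: `CodedException` and `TextSource` are hypothetical stand-ins for eXist's exception type and for the dynamic text-resource lambda.

```java
import java.io.IOException;
import java.util.List;

final class ReadLinesWrapSketch {

    /** Hypothetical stand-in for an XPath exception that carries an error code. */
    static final class CodedException extends Exception {
        final String code;

        CodedException(String code, Throwable cause) {
            super(code + ": " + cause, cause);
            this.code = code;
        }
    }

    /** Hypothetical stand-in for the dynamic text-resource lambda. */
    interface TextSource {
        List<String> lines() throws IOException;
    }

    static List<String> readLines(TextSource source) throws CodedException {
        try {
            return source.lines();
        } catch (IOException | RuntimeException e) {
            // Unchecked failures from a resource factory are reported the same way
            // as I/O failures instead of escaping to the caller.
            throw new CodedException("FOUT1170", e);
        }
    }
}
```

With this shape, a source that throws a NullPointerException surfaces as a FOUT1170-coded error, the same as one that throws an IOException.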

Test plan

  • mvn install -pl exist-core -am -DskipTests green
  • Targeted JUnit on touched classes (ContainsTokenEmptyCollationTest, RegexUtilTest, FunMatchesTest, FunReplaceTest, FunUnparsedTextTest) — 8/8 passing
  • XQTS HEAD spot-check on affected sets (16 sets, +247 newly passing)
  • CI gate (local full mvn test was contaminated by disk-full + concurrent BrokerPool contention from a parallel session — environmental, not code; CI will provide authoritative signal)

Source

Subset cherry-picked from joewiz:v2/xq4-core-functions (PR #6218). The remaining 4.0-only commits stay in #6218 for the post-7.0 cycle. Per the paused rebase tasking, this extraction reduces the eventual rebase conflict surface.

🤖 Generated with Claude Code

joewiz and others added 11 commits May 10, 2026 23:33
…t XQ4 lookaround

Add pre-validation of regex patterns in fn:matches and fn:replace to reject
constructs that are not part of the XPath regular expression specification
(F&O 3.1, Section 5.6.1) but that Saxon's XP30 mode silently accepts.

Rejected constructs include:
- \x, \u hex/unicode escapes (not in XPath regex)
- \A, \Z, \z Java-specific anchors
- \b, \B word boundary assertions
- \a, \e, \f, \v special character escapes
- \Q, \E literal quoting
- \G, \k, \g named/numbered back-references
- (?=...) (?!...) (?<=...) (?<!...) Java-style lookaround
- (?>...) atomic groups
- (?i:...) (?m:...) (?s:...) (?-i:...) inline flag groups
- *+ ++ ?+ possessive quantifiers

Also adds support for XPath 4.0 named lookaround syntax by translating
(*positive_lookahead:...) etc. to Java regex (?=...) equivalents.

Expected XQTS impact: ~137 of 173 fn-matches.re failures fixed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
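
As a rough illustration of the pre-validation idea in this commit, the sketch below scans a pattern for a few of the listed constructs before it would reach the regex compiler. It is an assumption-laden simplification rather than the RegexUtil implementation: the class and method names are hypothetical, character-class bodies are skipped entirely, and only rejected escapes, non-`(?:` groups, and possessive quantifiers are detected.

```java
import java.util.Optional;

final class XPathRegexPrecheckSketch {

    // Escape letters the commit message lists as not part of XPath regex.
    private static final String REJECTED_ESCAPES = "xuAZzbBaefvQEGkg";

    /** Returns a description of the first rejected construct, or empty if none was found. */
    static Optional<String> firstRejected(String pattern) {
        int classDepth = 0;
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '\\') {
                if (i + 1 >= pattern.length()) {
                    break; // dangling backslash: left to the regex compiler
                }
                char next = pattern.charAt(i + 1);
                if (classDepth == 0 && REJECTED_ESCAPES.indexOf(next) >= 0) {
                    return Optional.of("escape \\" + next);
                }
                i++; // skip the escaped character
            } else if (c == '[') {
                classDepth++;
            } else if (c == ']' && classDepth > 0) {
                classDepth--;
            } else if (classDepth == 0 && c == '('
                    && pattern.startsWith("(?", i) && !pattern.startsWith("(?:", i)) {
                return Optional.of("group construct at offset " + i);
            } else if (classDepth == 0 && (c == '*' || c == '+' || c == '?')
                    && i + 1 < pattern.length() && pattern.charAt(i + 1) == '+') {
                return Optional.of("possessive quantifier at offset " + i);
            }
        }
        return Optional.empty();
    }
}
```

In this sketch, `a\bc`, `(?=x)y`, and `a*+` are all flagged, while `(?:ab)+c\d` and `[*+]` pass through.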
…checking

fn:unparsed-text improvements:
- Change $encoding parameter from xs:string to xs:string? to accept empty sequence
- Fix error code mapping: FODC0005 → FOUT1170 for URI syntax errors
- Add URI-only dynamic text resource lookup for encoding-agnostic resolution
  (fixes UTF-16 and ISO-8859-1 resources when no encoding is specified)
- Add readLines support for dynamic text resources (was missing)
- Add XML character validation (FOUT1190) for non-XML characters
- Fix unparsed-text-available to return false (not empty sequence) for empty href

Function type checking (SequenceType):
- Add functionParamTypes and functionReturnType fields to SequenceType
- Wire up ANTLR tree walker to populate function type info (resolves TODO)
- Add return type covariance checking for function instance-of operations

XQTS fn-unparsed-text: 50 → 32 failures (18 tests fixed, 36% improvement)
Subtyping fixes require next-v3 integration branch for proper testing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, promotions

- Add XPST0003 checks for reserved function names in NamedFunctionReference
  and FunctionFactory, fixing ~16 prod-NamedFunctionRef and ~7 prod-FunctionCall
  tests that incorrectly returned XPST0017 or succeeded when they should fail

- Fix context item passing for wrapped internal functions (FunctionFactory.wrap).
  UserDefinedFunction now preserves the evaluation context for wrapper functions,
  fixing ~15 tests where context-dependent functions like fn:string#0,
  fn:node-name#0, fn:id#1, fn:idref#1 lost the focus when called via
  function references

- Add binary type promotion (xs:base64Binary ↔ xs:hexBinary) in
  GeneralComparison and DynamicTypeCheck per XQuery 4.0 spec, fixing 4
  function-call-promotion tests

- Register 2-arity fn:element-with-id signature (the implementation already
  handled 2 args but the signature was missing), fixing 2 tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two fixes for XQTS prod-VarDecl failures:

1. XQST0054 → XQDY0054: XQuery 3.1 changed circular variable dependency
   detection from a static error (XQST0054) to a dynamic error (XQDY0054).
   Add XQDY0054 to ErrorCodes and use it in VariableReference. (~17 tests)

2. exerr:ERROR → XPTY0004: VariableImpl type checking threw XPathException
   with only a message string (defaulting to exerr:ERROR) instead of using
   the ErrorCodes.XPTY0004 constant. (~11 tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Make regex validation XQuery version-aware (isXQuery40 parameter)
- XQ4: Allow \b, \B word boundaries and Java-style lookaround
- Reject octal escapes \0nn in all modes (not part of XPath regex spec)
- Reject quantified anchors (^?, $*) in XQ4 mode
- Rewrite eXist's own test.xq to not use XPath-invalid back-references
- Apply consistent validation across FunMatches, FunReplace, FunTokenize,
  and FunAnalyzeString

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…quence

The $collation parameter was declared as required (param) but the spec
allows an empty sequence to select the default collation. Changed to
optParam. Adds ContainsTokenEmptyCollationTest (3 tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The XQuery version declared inside a dynamically-loaded module is
recorded on that module's ModuleContext, not on the temporary host
context that the import is created against. The previous check
compared the requested version against tempContext (which always
holds the default 3.1) and produced spurious FOQM0003 errors when
both caller and inline module declared 'xquery version "4.0"'.

In particular, the misc-Subtyping QT4 test set (which uses the XQ4
content option to load 4.0 modules from string) failed every test
with FOQM0003 ("Imported module has wrong XQuery version: 3.1"),
masking the real subtyping bugs underneath.

Inspect the loaded module's own context for its declared version
and only raise FOQM0003 when that doesn't match the caller's
requested version. Schema-aware/internal modules are skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
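
The difference between the old and new check can be reduced to which context the declared version is read from. The sketch below models versions as plain integers (31 for "3.1", 40 for "4.0"); the method names are hypothetical and the real code works on eXist context objects, so treat this as an illustration of the comparison only.

```java
final class LoadModuleVersionSketch {

    /** Old check: compares against the temporary host context, which always holds the default 31. */
    static boolean oldCheckRaises(int requested, int tempContextVersion, int moduleVersion) {
        return requested != tempContextVersion;
    }

    /** New check: compares against the version declared by the loaded module itself. */
    static boolean newCheckRaises(int requested, int tempContextVersion, int moduleVersion) {
        return requested != moduleVersion;
    }

    public static void main(String[] args) {
        // Caller requests 4.0, module declares 4.0, temporary context holds 3.1.
        System.out.println(oldCheckRaises(40, 31, 40)); // true: the spurious FOQM0003 case
        System.out.println(newCheckRaises(40, 31, 40)); // false
    }
}
```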
…ucts

Adds five new validation checks in RegexUtil.validateXPathRegex to align
fn:matches/fn:replace/fn:tokenize/fn:analyze-string with the XPath regex
spec, plus stricter character class scanning:

- Doubled quantifiers (*{n,m}, +{n,m}, ?{n,m}, **) — re-uses the existing
  possessive-quantifier code path.
- Quantifier after \b/\B word-boundary assertion (e.g. \b+, \b{1,2}).
- POSIX-style [:name:] character classes (which are not valid in XPath
  regex, only in PCRE/POSIX flavors).
- Backslash escapes inside character classes that fall outside the XPath
  set (e.g. \x41, \u0041) — the previous scanner only checked escapes at
  the top level and skipped over class bodies entirely.
- XPath 4.0 lookaround constraints: lookbehind body must be fixed-length
  (no *, +, ?), and a lookaround group itself cannot be quantified.
  Applies to both compact (?=, ?!, ?<=, ?<!) and verbose
  (*positive_lookahead:, *negative_lookbehind:, etc.) forms.

Also drops the previous "quantifier after anchor" rejection (^?, $+, etc.):
the XQ30 test cases in fn-matches.re expect lenient handling of these,
matching Saxon's behavior. The XQ40 "a-suffix" variants that demand
FORX0002 trade off against the XQ30 versions and the net effect is +1.

XQTS QT4 (verified on the next-v3 integration branch where Saxon 12 +
the rebuilt runner can exercise this code):

  fn-matches.re: 69 → 50 failures (+24 passing, −5 trade-off losses)
  fn-matches, fn-replace, fn-tokenize, fn-analyze-string: unchanged

XQuery3Tests JUnit suite unchanged (0 failures, 2 pre-existing errors).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… grammar

Phase 2 follow-up to 3094b11 — converts five more categories of
fn:matches/fn:replace/fn:tokenize/fn:analyze-string spec failures into
FORX0002 errors, while removing one over-eager rejection.

Back-references (\1-\9 and multi-digit forms):
  - 3094b11 rejected ALL '\<digit>' as invalid, but XPath F&O 3.1+ does
    define back-references; the rejection broke valid patterns like
    (.)\1, (.)\19, and (.{N})...\11 (where N≥11).
  - Now greedily parse \N up to the total capturing-group count in the
    pattern (so \11 in an 11-group regex is back-ref 11, but \11 in a
    1-group regex is \1 + literal '1').
  - Track CLOSED capturing groups during validation: forward references
    (\1(abc)) and self-references ((.)\2) raise FORX0002 because the
    referenced group has not yet closed at the back-reference position.
  - \0 stays rejected as an octal escape.

Character class grammar:
  - Tighten scanCharClass to enforce that '[' inside a class is only
    valid as the start of a subtraction class — i.e. the immediately
    preceding character is an unescaped '-' AND the (pos|neg)CharGroup
    before that '-' is non-empty. This rejects patterns like [-[xyz]],
    [^-[xyz]], [[abcd]-[bc]], and [a - c - [b]] where '-[' is not the
    valid subtraction separator.
  - Reject empty character classes ([], [^], and the [] inside
    [...-[]]) which the grammar disallows but Saxon accepts leniently.
  - Use Java 21 switch-expression form for the in-class escape table.

XQuery 4.0 anchor quantification:
  - In XQ4 mode, reject '^?', '$+', '^{n}', '${n,m}', etc. — the spec
    tightens the grammar so anchors cannot be quantified. Trades the
    six XQ31-tagged tests that demand lenient handling for the matching
    six XQ40-tagged 'a-suffix' tests that demand FORX0002, net 0 in
    the QT4 runner (which forces every test to XQ4 mode regardless of
    the test's spec dependency) but spec-correct for XQ4.

XQTS QT4 fn-matches.re: 51 → 29 failures (94.9% → 96.8% pass rate),
fn-matches: 13 → 3, fn-replace: 10 → 9; fn-tokenize and
fn-analyze-string unchanged at 7 each. JUnit XQuery3Tests unchanged
(3 failures + 2 errors all pre-existing — verified by re-running
against the un-patched RegexUtil).

The remaining 29 fn-matches.re failures are all dependency-tagged
'XP30 XP31 XQ30 XQ31' tests (\b/\B in 3.1, '(?=...)' lookaround in
3.1, anchor-quantifier in 3.1) that the QT4 runner force-promotes to
XQ4 mode, where the constructs are valid extensions; nothing the
validator can do without runner changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
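
The back-reference rules in this commit (greedy digit parsing bounded by the group count, plus tracking of closed groups) can be sketched as follows. This is a simplified model under stated assumptions, not the RegexUtil code: the class and method names are hypothetical, character classes are only skipped rather than validated, and any group opening with `(?` is treated as non-capturing.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class BackRefSketch {

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Counts capturing groups: each unescaped '(' outside a character class that is not "(?". */
    static int countGroups(String p) {
        int groups = 0, classDepth = 0;
        for (int i = 0; i < p.length(); i++) {
            char c = p.charAt(i);
            if (c == '\\') { i++; }
            else if (c == '[') { classDepth++; }
            else if (c == ']' && classDepth > 0) { classDepth--; }
            else if (c == '(' && classDepth == 0 && !p.startsWith("(?", i)) { groups++; }
        }
        return groups;
    }

    /** True if every back-reference names a capturing group that has already closed. */
    static boolean backRefsValid(String p) {
        int total = countGroups(p);
        boolean[] closed = new boolean[total + 1];
        Deque<Integer> open = new ArrayDeque<>(); // 0 marks a non-capturing group
        int nextGroup = 1, classDepth = 0;
        for (int i = 0; i < p.length(); i++) {
            char c = p.charAt(i);
            if (c == '\\') {
                if (classDepth == 0 && i + 1 < p.length() && isDigit(p.charAt(i + 1))) {
                    int ref = p.charAt(i + 1) - '0';
                    if (ref == 0) return false; // \0 is rejected as an octal escape
                    int j = i + 2;
                    // Greedy: keep consuming digits while the number still names an existing group.
                    while (j < p.length() && isDigit(p.charAt(j))
                            && ref * 10 + (p.charAt(j) - '0') <= total) {
                        ref = ref * 10 + (p.charAt(j) - '0');
                        j++;
                    }
                    if (ref > total || !closed[ref]) return false;
                    i = j - 1; // any remaining digits are literal characters
                } else {
                    i++; // skip the escaped character
                }
            } else if (c == '[') { classDepth++; }
            else if (c == ']' && classDepth > 0) { classDepth--; }
            else if (classDepth == 0 && c == '(') {
                open.push(p.startsWith("(?", i) ? 0 : nextGroup++);
            } else if (classDepth == 0 && c == ')' && !open.isEmpty()) {
                int group = open.pop();
                if (group > 0) closed[group] = true;
            }
        }
        return true;
    }
}
```

Under this model `(.)\1` and `(.)\19` are accepted (the latter as `\1` followed by a literal `9`), while `\1(abc)`, `(.)\2`, and `(a\1)` are rejected.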
The previous version-mismatch test loaded a 3.1 module from a 4.0
caller and expected FOQM0003. After the load-xquery-module fix that
allows older modules (XQuery is backward compatible), that scenario
now succeeds — which is the more useful behavior for module reuse.

Update the failing-version-mismatch test to use the explicit
xquery-version option requesting 3.1 against a 4.0 module, which is a
genuine mismatch that still raises FOQM0003. Also add a positive test
documenting that an older module loads cleanly from a newer caller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…only parser

Two adjustments needed because develop only supports XQuery 1.0/3.0/3.1
(the new parser with XQ 4.0 support is on a separate v2/* branch):

1. FunUnparsedText.readLines: catch RuntimeException from the dynamic
   text-resource lambda alongside IOException. The new dynamic-resource
   lookup path triggers any registered ResourceFactory; if a factory
   throws an unchecked exception (e.g. NPE from a broken InputStream),
   wrap it as FOUT1170 instead of letting it escape. Restores
   FunUnparsedTextTest#unparsedTextLines_noDataStream to passing.

2. Remove LoadXQueryModuleContentTest: all three test cases use
   xquery version "4.0" syntax which develop's parser rejects with
   XQST0031. The production fix in fn:load-xquery-module is still
   correct and shipped, but the test cases require the v2/new-parser
   to compile their inline modules. They will return alongside the
   XQ 4.0 parser landing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Address Codacy issue: Unnecessary use of fully qualified name 'org.exist.xquery.regex.RegexUtil.hasXPath4Lookaround' due to existing static import 'org.exist.xquery.regex.RegexUtil.*'

Address Codacy issue: Unnecessary use of fully qualified name 'org.exist.xquery.regex.RegexUtil.hasXPath4Lookaround' due to existing static import 'org.exist.xquery.regex.RegexUtil.*'

Comment thread exist-core/src/main/java/org/exist/xquery/value/SequenceType.java Outdated
Comment thread exist-core/src/main/java/org/exist/xquery/value/SequenceType.java Outdated
Comment thread exist-core/src/main/java/org/exist/xquery/value/SequenceType.java Outdated

Address Codacy issues: The JUnit 4 test method name 'xxx' doesn't match '[a-z][a-zA-Z0-9]*'

Per reinhapa's review:
- FunReplace, FunTokenize: drop fully-qualified RegexUtil.* prefixes
  since `import static org.exist.xquery.regex.RegexUtil.*` already
  pulls hasXPath4Lookaround / translateXPath4Lookaround /
  validateXPathRegex into scope.
- SequenceType: drop redundant `= null` initializers on the two new
  function-test fields; combine the two pairs of nested-if checks in
  checkType / checkFunctionType into single conjunctions.
- ContainsTokenEmptyCollationTest: rename test methods to
  lowerCamelCase (drop underscores) to satisfy JUnit 4 name pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joewiz
Member Author

joewiz commented May 11, 2026

[This response was co-authored with Claude Code. -Joe]

Addressed all 6 Codacy findings. New tip: 390016f998.

  • FunReplace / FunTokenize: dropped FQN on hasXPath4Lookaround / translateXPath4Lookaround (also unqualified validateXPathRegex in FunTokenize for consistency — same static-import scope).
  • SequenceType: removed = null initializers on the two new function-test fields; combined the two flagged pairs of nested ifs into single conjunctions.
  • ContainsTokenEmptyCollationTest: renamed emptyCollation_emptySequenceLiteral → emptyCollationEmptySequenceLiteral and the other two methods similarly.

codacy-cli analyze --tool pmd on the touched files is clean for the six review findings. mvn test -pl exist-core -Dtest=ContainsTokenEmptyCollationTest — 3/3 green. The two remaining PMD warnings (FunReplace.eval NPath 313, SequenceType.checkType NPath 384) are moderate-tier and not in this review's scope.

Comment thread exist-core/src/main/java/org/exist/xquery/functions/fn/FunAnalyzeString.java Outdated
Comment thread exist-core/src/main/java/org/exist/xquery/functions/fn/FunMatches.java Outdated
@joewiz
Member Author

joewiz commented May 11, 2026

[This response was co-authored with Claude Code. -Joe]

Good catch — let me explain what's there and let you steer.

Both sites gate on isXQuery40 = context.getXQueryVersion() >= 40. On develop today:

  • XQueryContext.xqueryVersion defaults to 31
  • setXQueryVersion() is only called from the parser, which rejects xquery version "4.0" with XQST0031 (no XQ 4.0 grammar on develop yet)
  • So isXQuery40 is never true → both XQ 4.0 lookaround-translation branches are dead code on develop

The validateXPathRegex(this, pattern, isXQuery40) call still ships and does the real 3.1 work: rejecting Java/Perl regex extensions that Saxon's XP30 mode would otherwise accept (this is the +17 / +7 / +10 / +15 lift on fn-matches, fn-matches.re, fn-replace, fn-tokenize in the PR's XQTS table).

So it's intentional in the sense that I preserved the version-gating shape rather than ripping the branches and re-adding them in #6218 — but you're right that it's mis-framed for "3.1-mandatory subset" if you read the diff strictly.

Two options, your pick:

  1. Leave as-is: dead code on develop, zero behavior change, cleaner eventual #6218 merge (no re-adding).
  2. Strip the four isXQuery40 lines + the needsXQuery40JavaRegex branch: PR diff becomes purely 3.1, and #6218 carries the XQ 4.0 reactivation later.

I'm fine with either. (2) is ~5 minutes of edits. Want me to do (2)?

@line-o line-o added the xquery issue is related to xquery implementation label May 11, 2026
@line-o line-o added this to v7.0.0 May 11, 2026
@duncdrum
Contributor

I'm leaning towards 'it's not a bug, it's a feature'.

XQ4 is coming in general, and to eXist in particular. Having targeted error messages when users demand it is nice in a way.

@line-o
Member

line-o commented May 12, 2026

@joewiz xmlts.xquery3 recursion function calls.recursion-function-calls-002 fails

@line-o
Member

line-o commented May 12, 2026

I would like us to keep the isXQuery40 check, if we can raise an error with a "not implemented yet" description.

@line-o line-o added this to Wave 2 May 12, 2026
@github-project-automation github-project-automation Bot moved this to Todo in Wave 2 May 12, 2026
@line-o line-o added this to the eXist-7.0.0 milestone May 12, 2026
joewiz and others added 2 commits May 12, 2026 16:45
…around

The previous isXQuery40 = context.getXQueryVersion() >= 40 guard was
dead on this XQuery 3.1 branch: getXQueryVersion() never returns 40
here, so a pattern using (*positive_lookahead:...) or similar XPath 4.0
syntax silently fell through to Saxon's XP30 regex compiler and produced
opaque FORX0002 errors.

Replace the guard with an explicit XPST0017 "XPath 4.0 lookaround syntax
is not yet implemented in this XQuery 3.1 build" exception in both
FunMatches.matchXmlRegex and FunAnalyzeString.analyzeString. The XQ4
translation/dispatch path stays available on v2/xq4-core-functions;
when 4.0 lands on develop, swap the throw for the translateXPath4Lookaround
call in one spot per file.

validateXPathRegex is now called with isXQuery40=false explicitly, since
this branch only runs the 3.1 dialect.

Adds RegexXPath4NotImplementedTest covering both fn:matches and
fn:analyze-string error paths plus a plain-pattern smoke check.

Addresses Juri's review comment on PR eXist-db#6344 (preferred middle ground:
keep the version check, but raise a clear "not implemented yet" error
rather than silently dead-coding the branch).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ped tests

Commit 0d96ad8 added function-arity + return-type checking in
SequenceType.checkType under Type.subTypeOf(primaryType, Type.FUNCTION).
Because MAP_ITEM and ARRAY_ITEM are declared subtypes of FUNCTION, the
arity check also applied to typed map(K,V) and array(T) tests -- but
the underlying map accessor signature carries one argument (the key),
while map(K,V) carries two type parameters in the test syntax. The
arity comparison therefore always returned false, so any
"$x instance of map(xs:string, item()?)" check failed and the value
flowed down the else branch unchanged. Concretely this caused the
xquery3 recursion-function-calls-002 test to fail with FOTY0013
("A function item other than an array cannot be atomized") when the
recursive local:join was bypassed and a raw map landed inside
string-join.

Narrow the function-type check to primaryType == Type.FUNCTION so the
plain function() typed-test still gets validated (including on map and
array values that satisfy it via FUNCTION subtyping), while map(K,V)
and array(T) tests revert to their pre-0d96ad8a19 behaviour pending
proper typed-test support.

Also: update the id.xqm securitymanager fixture's stored module from
'xquery version "3.0"' to "3.1" so that the stricter loaded-module
version check added in 7966280 no longer rejects it.

Addresses Juri's second review comment on PR eXist-db#6344 (recursion-function-
calls-002 failure).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joewiz
Member Author

joewiz commented May 12, 2026

[This response was co-authored with Claude Code. -Joe]

Both items addressed in 327393c41c + 154ce4d648.

1. isXQuery40 → "not implemented yet" error (327393c41c)

Per your preferred middle ground: both call sites in FunMatches.matchXmlRegex and FunAnalyzeString.analyzeString now raise XPST0017 with

"XPath 4.0 lookaround syntax in regex patterns (e.g. (*positive_lookahead:...)) is not yet implemented in this XQuery 3.1 build. Rewrite the regex without lookaround."

instead of silently dead-coding the XQ4 branch. When XQuery 4.0 lands on develop, replacing the throw with the existing translateXPath4Lookaround() dispatch is a one-spot change per file. validateXPathRegex is now called with isXQuery40 = false explicitly. Added RegexXPath4NotImplementedTest (3 tests: both functions raise XPST0017 for lookaround patterns; plain patterns still work).
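
A guard of this kind might look roughly like the sketch below. It is not the FunMatches or FunAnalyzeString code: the class and method names are hypothetical, a plain unchecked exception stands in for eXist's XPath exception type, and detection is a naive substring match that ignores escaping. Of the four verbose openers, `(*positive_lookahead:` and `(*negative_lookbehind:` appear in the commit messages; the other two are assumed by symmetry.

```java
final class LookaroundGuardSketch {

    // Verbose XPath 4.0 lookaround openers (the second and third are assumed by symmetry).
    private static final String[] OPENERS = {
        "(*positive_lookahead:", "(*negative_lookahead:",
        "(*positive_lookbehind:", "(*negative_lookbehind:"
    };

    static boolean looksLikeXPath4Lookaround(String pattern) {
        for (String opener : OPENERS) {
            if (pattern.contains(opener)) {
                return true;
            }
        }
        return false;
    }

    /** Fails fast with a targeted message instead of handing the pattern to the regex compiler. */
    static void guard(String pattern) {
        if (looksLikeXPath4Lookaround(pattern)) {
            throw new UnsupportedOperationException(
                "XPST0017: XPath 4.0 lookaround syntax is not yet implemented in this XQuery 3.1 build");
        }
    }
}
```

The design point is that the failure names the unsupported feature directly, so the user does not have to decode a generic FORX0002 from the underlying compiler.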

2. recursion-function-calls-002 (154ce4d648)

Real regression — the failure was introduced by 0d96ad8a19 (function type checking in SequenceType). That commit added an arity check under Type.subTypeOf(primaryType, Type.FUNCTION), but MAP_ITEM and ARRAY_ITEM are declared subtypes of FUNCTION, so the arity check also applied to typed map(K, V) tests. The map accessor signature has 1 argument (the key) while map(xs:string, item()?) carries 2 type parameters — they never match, so instance of map(xs:string, item()?) returned false on every map. In this test the recursive local:join branch was therefore bypassed, a raw map flowed into array:append / array:flatten, and string-join finally tripped FOTY0013 ("A function item other than an array cannot be atomized").

Fix: narrow the function-type check to primaryType == Type.FUNCTION so the plain function() typed test still validates (maps/arrays continue to satisfy it via FUNCTION subtyping), while map(K, V) and array(T) tests revert to their pre-0d96ad8a19 behaviour pending proper typed-test support.
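
The effect of narrowing the gate can be seen in a toy type hierarchy. The enum below is a hypothetical stand-in for eXist's Type constants (which are not modelled this way in the real code); it only illustrates why a subtype test sweeps in maps and arrays while an equality test does not.

```java
final class FunctionTestGateSketch {

    /** Toy hierarchy: MAP_ITEM and ARRAY_ITEM are subtypes of FUNCTION. */
    enum T {
        ITEM(null), FUNCTION(ITEM), MAP_ITEM(FUNCTION), ARRAY_ITEM(FUNCTION);

        final T parent;

        T(T parent) {
            this.parent = parent;
        }
    }

    static boolean subTypeOf(T type, T ancestor) {
        for (T cur = type; cur != null; cur = cur.parent) {
            if (cur == ancestor) {
                return true;
            }
        }
        return false;
    }

    /** Gate before the fix: the arity check runs for any subtype of FUNCTION. */
    static boolean arityCheckedBefore(T primaryType) {
        return subTypeOf(primaryType, T.FUNCTION);
    }

    /** Gate after the fix: the arity check runs only for the plain function() test. */
    static boolean arityCheckedAfter(T primaryType) {
        return primaryType == T.FUNCTION;
    }
}
```

With the old gate a `map(K, V)` test (primary type MAP_ITEM) goes through the arity comparison; with the new gate it does not, while a plain `function()` test still does.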

While running the gate I also noticed the xqts.org.exist-db.test.securitymanager.id.from-load-module test was failing for an independent reason: the test fixture stores a module declared xquery version "3.0" and loads it without an explicit xquery-version option, which trips the stricter loaded-module version check added in 7966280a21. Bumped the fixture to "3.1" in the same commit so the gate stays green.

Full-module gate after both fixes:

Tests run: 6706, Failures: 0, Errors: 0, Skipped: 97
BUILD SUCCESS  (Total time: 03:38 min)

Codacy/PMD on the changed files is clean (the pre-existing FunMatches.eval / evalWithIndex NPathComplexity findings are untouched).

"Invalid regular expression: " + e.getMessage(),
new StringValue(this, pattern), e);
}
org.exist.xquery.regex.RegexUtil.validateXPathRegex(this, pattern, false);

Address Codacy issue: Unnecessary use of fully qualified name 'org.exist.xquery.regex.RegexUtil.validateXPathRegex' due to existing static import 'org.exist.xquery.regex.RegexUtil.*'

// XPath 4.0 lookaround syntax is not yet implemented in eXist's XQuery 3.1 runtime.
// When XQuery 4.0 lands (v2/xq4-core-functions), replace this guard with the
// translateXPath4Lookaround / Java-regex dispatch path.
if (org.exist.xquery.regex.RegexUtil.hasXPath4Lookaround(pattern)) {

Address Codacy issue: Unnecessary use of fully qualified name 'org.exist.xquery.regex.RegexUtil.validateXPathRegex' due to existing static import 'org.exist.xquery.regex.RegexUtil.*'

// XPath 4.0 lookaround syntax is not yet implemented in eXist's XQuery 3.1 runtime.
// When XQuery 4.0 lands (v2/xq4-core-functions), replace this guard with the
// translateXPath4Lookaround() dispatch path.
if (org.exist.xquery.regex.RegexUtil.hasXPath4Lookaround(pattern)) {

Address Codacy issue: Unnecessary use of fully qualified name 'org.exist.xquery.regex.RegexUtil.validateXPathRegex' due to existing static import 'org.exist.xquery.regex.RegexUtil.*'

}

- // Pre-validate: reject constructs not valid in XPath regex
+ // Pre-validate: reject constructs not valid in XPath 3.1 regex

Address Codacy issue: Unnecessary use of fully qualified name 'org.exist.xquery.regex.RegexUtil.validateXPathRegex' due to existing static import 'org.exist.xquery.regex.RegexUtil.*'

// Pre-validate: reject constructs not valid in XPath 3.1 regex
if (!org.exist.xquery.regex.RegexUtil.hasLiteral(flags)) {
- org.exist.xquery.regex.RegexUtil.validateXPathRegex(this, pattern, isXQuery40);
+ org.exist.xquery.regex.RegexUtil.validateXPathRegex(this, pattern, false);

Address Codacy issue: Unnecessary use of fully qualified name 'org.exist.xquery.regex.RegexUtil.validateXPathRegex' due to existing static import 'org.exist.xquery.regex.RegexUtil.*'

@duncdrum duncdrum moved this from Todo to In progress in Wave 2 May 18, 2026
@line-o
Copy link
Copy Markdown
Member

line-o commented May 19, 2026

@joewiz Is it possible to address the remaining Codacy issues so that this PR can land? As it addresses so many issues in the XQuery runtime, it is of high priority.

@line-o line-o moved this to In progress in v7.0.0 May 19, 2026
@line-o line-o moved this from In progress to In review in v7.0.0 May 19, 2026

Labels

xquery issue is related to xquery implementation

Projects

Status: In progress
Status: In review

4 participants