
Extend XQTS runner: QT4/FTTS/Update suites, assertion fixes, batch runner, Saxon 12 #49

Draft
joewiz wants to merge 41 commits into eXist-db:feature/qt4-xquery-update from joewiz:feature/saxon-12-runner

Conversation

Member

@joewiz joewiz commented Apr 6, 2026

Summary

Extends the XQTS runner to support three additional W3C/community test suites beyond the existing XQuery 3.1 suite, fixes assertion evaluation bugs that could mask real test failures, adds performance improvements that cut full QT4 run time from 37+ minutes to under 6 minutes, and upgrades Saxon-HE from 9.9 to 12.5 to match eXist-db's Saxon 12 upgrade.

Consolidates what was previously split across PR #45 and PR #49.

  • Add qt4cg/qt4tests as a downloadable test suite (--xqts-version QT4)
  • Add W3C XQFTTS 1.0.4 Full Text Test Suite (--xqts-version FTTS)
  • Parse and execute multi-step XQuery Update test cases with sandpit (writable temp directory) support
  • Register the EXPath File module for expath-file test set
  • Fix assertion evaluation bugs that could produce false positives or false negatives
  • Fix completion detection bug that caused 5+ minute hangs after tests finished
  • Add batch runner with parallel execution support
  • Support custom local Maven repository for isolated multi-session builds
  • Upgrade Saxon-HE from 9.9.1-8 to 12.5 (for use with eXist-db 7.0 Saxon 12 branch)

Performance Improvements

| Issue | Before | After | Fix |
| --- | --- | --- | --- |
| Test set completion hang | 5+ min hang after tests finish | <2s clean exit | isTestSetCompleted() compared against full catalog instead of started tests; Pekko dispatcher blocked by BrokerPool prevented ParsedTestSet delivery |
| Stall timeout too long | 600s (10 min) before detecting hung tests | 60s with per-test-case reporting of which tests hung | |
| Actor system shutdown hang | Indefinite hang during context.system.terminate() | 30s deadline via standalone thread calling halt(0) | |
| Sequential batch execution | ~9 min for full QT4 | ~6 min with --parallel 3 | Batch runner distributes batches across N concurrent JVMs with isolated exist.home directories |
| Total QT4 run time | 37+ min (hangs + sequential) | ~6 min (fixes + parallel) | |

Commits

1. Core test suite support (3 commits)

  • Add QT4 test suite and XQuery Update support
  • Add XQFTTS 1.0.4 (W3C Full Text Test Suite) support
  • Register EXPath File module for expath-file test set

2. Build & infrastructure (2 commits)

  • Support custom local Maven repository via -Dmaven.repo.local
  • Fix assembly build and execution for exist-expath dependency

3. Sandpit support (1 commit)

  • Implement XQTS sandpit support for writable test directories

4. Assertion reliability fixes (9 commits)

  • Fix XML comparison and copy-modify-return result handling
  • Fix XMLUnit comparison and whitespace handling (scoped to XQFTTS)
  • Use admin authentication for embedded server connections
  • Pass result as context item for single-item assert evaluations
  • Preserve namespace URI in error code comparison
  • Handle AllOf assertions containing Error
  • Treat assertion evaluation errors as failures, not runner errors
  • Add raw serialization and serialization properties to ExistServer
  • Propagate static base URI to assertion evaluation context

5. Completion detection & shutdown (2 commits)

  • Add stall detection watchdog with shutdown timeout
  • Fix test set completion detection and add shutdown safeguards

6. Batch runner (5 commits)

  • Add batch runner with timing reports
  • Add parallel batch execution (--parallel N)
  • Fix parallel streams exiting after first batch
  • Add jstack thread dump capture on batch timeout
  • Add -XX:+ExitOnOutOfMemoryError to JVM args

7. Additional fixes (4 commits)

  • Add PARSER env var support for parser comparison runs
  • Exclude all Jetty transitive dependencies for Jetty 12 compatibility
  • Fix Not assertion parsing and serializationMatches query building
  • Various batch timeout and cleanup improvements

8. Saxon upgrade (1 commit)

  • Upgrade Saxon-HE from 9.9.1-8 to 12.5

When to merge Saxon upgrade

The Saxon upgrade commit is the last commit in this branch. The commit can be cherry-picked on its own, or this PR can be merged after eXist-db/exist#6212 (Saxon 12 upgrade) lands on develop. The rest of the PR is independent of the Saxon version.

Usage

# Run QT4 test sets
java -jar exist-xqts-runner-assembly-2.0.0-SNAPSHOT.jar \
  --xqts-version QT4 --test-set-pattern 'fn-.*'

# Run XQuery Update tests
java -jar ... --xqts-version QT4 --enable-feature XQUpdate --test-set-pattern 'upd-.*'

# Run Full Text tests
java -jar ... --xqts-version FTTS --test-set-pattern 'fts-.*'

# Batch runner (fresh JVM per batch, avoids OOM/thread leaks)
./run-batched.sh --xqts-version QT4 --batch-size 50 --heap 4g --output-dir results/qt4

# Parallel batch execution (3 concurrent JVM streams)
./run-batched.sh --xqts-version QT4 --parallel 3 --batch-size 50 --output-dir results/qt4

# Use custom local Maven repo (for multi-session isolation)
sbt -Dmaven.repo.local=/path/to/.m2-repo assembly

Test Plan

  • sbt compile passes
  • sbt assembly builds without deduplication errors
  • Assertion base URI propagation confirmed working
  • Whitespace filter scoped to XQFTTS only (no false positives on assertXml)
  • Completion detection fix verified: fn-format-number exits in <2s (was 5+ min hang)
  • Shutdown deadline verified: op-to exits in ~95s via halt(0) (was 300s timeout)
  • Batch runner tested across full QT4 suite (630 test sets, 85.5% pass rate)
  • Parallel execution verified: --parallel 3 produces identical results to sequential
  • Full QT4 re-run on next integration branch: 630 test sets, 35581/41611 (85.5%)
  • Saxon 12.5: sbt compile and assembly pass, sample test sets verified

[This PR was co-authored with Claude Code. -Joe]

🤖 Generated with Claude Code

joewiz and others added 30 commits March 16, 2026 15:31
Add qt4cg/qt4tests as a downloadable test suite (--xqts-version QT4). Parse and execute multi-step XQuery Update test cases with mutable in-memory documents. Add XP40/XQ40 spec values and XQUpdate feature. Handle revalidation and put dependency types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add W3C XQFTTS as a downloadable test suite (--xqts-version FTTS). Handle Fragment and Inspect comparison types. Support stop-word and thesaurus URI maps with thread-safe registration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add exist-expath dependency and register the ExpathFileModule
(http://expath.org/ns/file) in the embedded server's conf.xml.
This resolves all 190 expath-file test failures caused by
XPST0017 "Call to undeclared function: file:exists".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Allows running sbt with -Dmaven.repo.local=/path/to/repo to use a
session-local Maven repository, avoiding conflicts between concurrent
sessions that install different exist-core snapshots.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move jetty-jakarta-servlet-api exclusion to global excludeDependencies
  so it covers transitive paths through exist-expath (fixes dedup error)
- Disable prepended shell script in CI builds (corrupts ZIP offsets,
  causing "An unexpected error" from the Java launcher)
- Use explicit java -jar in CI test step

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Parse the <sandpit> element from test environments, copy the sandpit
source directory to a per-test-case temp directory before execution,
set the static base URI to the temp directory so relative file paths
resolve correctly, and clean up after execution.

This enables the EXPath File test set (190 tests) and upd-fn-put
test set (17 tests) which require a writable working directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add whitespace-only text node filter in XMLUnit diff to avoid spurious
  mismatches from insignificant whitespace differences
- Capture update expression return values for copy-modify-return tests
  that have no separate verification query

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Change getBroker() to authenticate("admin", "") to get an admin-level
broker instead of a guest broker. Required for modules that need
filesystem or privileged access (e.g., EXPath File module operations).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The XQTS assert expressions like ?columns and ?get(1,1) use unary
lookup which requires a context item. Previously, the result was only
available as $result variable but not as the context item, causing
XPDY0002 errors for any assertion using the ? lookup operator.

Only set context for single-item results (e.g., maps from parse-csv)
to avoid per-item evaluation for multi-item sequences like csv-to-arrays.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…dard namespaces

The runner was extracting only the local part of XPathException error
codes, losing the namespace URI. QT4 test catalogs use EQName notation
(e.g., Q{http://expath.org/ns/file}is-dir) for extension module error
codes. Now error codes from non-standard namespaces (not xqt-errors or
exist-xqt-errors) are formatted as Q{ns}local for proper matching.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
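The formatting rule in the commit above can be sketched in shell. This is only an illustration, not the runner's Scala code: `format_error_code` is a hypothetical helper, and the value used for the exist-xqt-errors namespace is an assumption that should be checked against exist-core.

```shell
# Hypothetical sketch of the error-code formatting rule described above.
# The W3C namespace is the standard xqt-errors URI; the eXist one is an
# assumed value used purely for illustration.
W3C_ERR_NS="http://www.w3.org/2005/xqt-errors"
EXIST_ERR_NS="http://www.exist-db.org/xqt-errors/"

# format_error_code NAMESPACE LOCAL_NAME
# Codes in the two standard namespaces keep only their local part;
# everything else is rendered in EQName notation, Q{ns}local.
format_error_code() {
  ns=$1
  local_name=$2
  case "$ns" in
    "$W3C_ERR_NS"|"$EXIST_ERR_NS")
      printf '%s\n' "$local_name" ;;
    *)
      printf 'Q{%s}%s\n' "$ns" "$local_name" ;;
  esac
}
```

Called as `format_error_code "http://expath.org/ns/file" "is-dir"`, the sketch yields the EQName form that the QT4 catalog uses for extension-module errors.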
…tion

When a query returns an error and the expected result is an AllOf
assertion containing an Error assertion, match the error code against
the Error inside the AllOf. Previously, only direct Error and AnyOf
assertions were matched; AllOf was not handled, causing false failures
for tests like EXPath File read-binary bounds checking where the
catalog wraps the expected error in <all-of>.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rors

When an XPath assertion (assert, assert-eq, assert-deep-eq,
assert-permutation, assert-serialization) raises a query error
during evaluation (e.g., XPTY0004 type mismatch), this indicates
the assertion failed — not that the runner itself errored.

Previously these were reported as ErrorResults, inflating the error
count and masking the real nature of the failure. Now they are
reported as FailureResults with the error details in the message.

Fixes fn-parse-json-717 and fn-parse-json-731 which errored due to
type mismatches in assertion XPath evaluation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tServer

Add sequenceToStringRaw() for serialization-matches assertions where
the exact serialized output must be preserved without newline
replacement. Refactor sequenceToString into a shared implementation
with a sanitize flag.

Add serializationProperties field to Result to capture query context
serialization options (e.g., declare option output:method "json") for
use in assertion evaluation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The sandpit implementation (fab30c1) sets the static base URI for test
query execution, but assertion evaluation ran in a separate XQuery
context that defaulted to the JVM working directory. This caused 21
EXPath File tests to fail because assertion expressions like
Q{http://expath.org/ns/file}read-text("test.txt") resolved relative
paths against the wrong directory.

Thread the test case's static base URI through processAssertion → all
assertion methods → executeQueryWith$Result → connection.executeQuery,
so assertion expressions see the same base URI as the test query.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix isTestSetCompleted() to compare completed test cases against started tests rather than the full catalog (which includes test cases filtered by spec/feature requirements). Add fallback path for when ParsedTestSet is stuck in the Pekko mailbox. Prevent duplicate serialization. Reduce stall timeout from 600s to 60s with hung test reporting. Add 30-second shutdown deadline for actor system termination.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
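A minimal shell sketch of the completion check described above, assuming test case names contain no whitespace. The function name and the space-separated lists are illustrative only; the real check lives in the runner's actor code.

```shell
# Hypothetical sketch: a test set counts as completed once every test case
# that was actually STARTED has completed. Comparing against the full
# catalog instead can never succeed when some cases were filtered out.
# is_test_set_completed "STARTED NAMES" "COMPLETED NAMES"
is_test_set_completed() {
  started=$1
  completed=$2
  for t in $started; do
    case " $completed " in
      *" $t "*) ;;          # this started case has completed
      *) return 1 ;;        # at least one started case is still pending
    esac
  done
  return 0
}
```

With a catalog of three cases where one is filtered out by a spec or feature dependency, checking against the two started cases succeeds, while checking against the whole catalog keeps waiting for a case that will never run.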
Add run-batched.sh that splits test sets into batches, runs each in a fresh JVM, and aggregates results. Includes per-test-set timing report, resume mode, and configurable batch size/heap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add --parallel N flag to run-batched.sh that distributes batches across
N concurrent streams. Each stream runs in its own JVM with an isolated
eXist-db home directory (via -Dexist.home) to avoid BrokerPool data
directory lock conflicts.

Batches are distributed round-robin across streams. Results accumulate
in the same output directory (JUnit XML files have unique names per
test set, so no conflicts).

QT4 full run: 5m37s with --parallel 3 (was ~9m sequential).
FTTS verified: identical results between sequential and parallel modes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
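The round-robin distribution mentioned above could look roughly like this. It is a sketch under assumptions: zero-based batch and stream numbers, and a hypothetical helper name that does not come from run-batched.sh.

```shell
# Hypothetical sketch of round-robin batch assignment.
# batches_for_stream STREAM_INDEX STREAM_COUNT TOTAL_BATCHES
# Prints the (zero-based) batch numbers handled by the given stream.
batches_for_stream() {
  stream=$1
  streams=$2
  total=$3
  i=0
  out=""
  while [ "$i" -lt "$total" ]; do
    if [ $(( i % streams )) -eq "$stream" ]; then
      out="$out${out:+ }$i"   # append with a separating space
    fi
    i=$(( i + 1 ))
  done
  printf '%s\n' "$out"
}
```

For 13 batches and `--parallel 3`, stream 0 would take batches 0 3 6 9 12, stream 1 would take 1 4 7 10, and stream 2 would take 2 5 8 11.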
…atch

The set +e / set -e toggle inside run_batch() caused backgrounded
subshells to exit after the first batch failure. When set -e is
re-enabled inside a function running in a backgrounded subshell,
any subsequent command failure terminates the entire stream.

Fix: replace set +e / set -e with || true pattern to capture the
timeout exit code without toggling errexit. This allows each stream
to continue processing all its batches regardless of individual
batch outcomes.

Before: --parallel 3 produced 150 results (1 batch per stream).
After: --parallel 3 produces 630 results (all 13 batches).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
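The `|| true`-style capture described above can be sketched as follows. `run_batch` here is a simplified stand-in that runs whatever command it is given; it is not the function from run-batched.sh.

```shell
# Sketch: capture a command's exit code without toggling errexit.
# Because the failing command sits on the left of ||, `set -e` does not
# terminate the (sub)shell, and rc still records the real exit status.
run_batch() {
  rc=0
  "$@" || rc=$?
  printf '%s\n' "$rc"
}
```

A caller running under `set -e` can then inspect the printed code and move on to the next batch, so a single failed or timed-out batch no longer ends the whole stream.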
…pressions

The W3C XQTS 3.1 test suite uses the deprecated map{key:=value} syntax
in some assertion expressions and test queries. Since eXist-db no longer
supports := in maps, these fail with XPST0003.

Uses a heuristic: replace := preceded by a non-whitespace char (map
entries like "key":=value) but not variable bindings ($var := value
which have a space before :=).
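As an illustration only, the heuristic above can be approximated with a one-line sed filter. The runner applies its own rewrite in Scala; `fix_map_syntax` is a hypothetical name, and the replacement of `:=` with `:` is inferred from the description.

```shell
# Hypothetical sketch of the := heuristic: rewrite ':=' to ':' only when
# the character immediately before it is not whitespace.
# Reads the query on stdin and writes the rewritten query to stdout.
fix_map_syntax() {
  sed -E 's/([^[:space:]]):=/\1:/g'
}
```

Under this sketch `map{"key":=1}` becomes `map{"key":1}`, while `let $x := 1` is left alone. A binding written without a space, such as `$x:=1`, would be rewritten too, which is the known limit of the heuristic.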
When a batch approaches its 300s timeout, automatically capture a
thread dump of the Java process via jstack. The dump is saved to
$OUTPUT_DIR/jstack-batch-N.txt, showing exactly which threads are
stuck and what locks they're contending on.

Tested with op-to (known OOM hanger): thread dump reveals Pekko
dispatcher threads BLOCKED on java.lang.Shutdown monitor — OOM'd
threads all trying to call System.exit() simultaneously, deadlocking
on the JVM shutdown lock.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OOM during test execution causes multiple Pekko dispatcher threads to
call System.exit() simultaneously, deadlocking on java.lang.Shutdown's
monitor. ExitOnOutOfMemoryError makes the JVM call _exit() immediately
on OOM — no shutdown hooks, no deadlock. The batch runner's timeout +
jstack handles diagnostics; the JVM just needs to die cleanly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Support PARSER=rd or PARSER=antlr2 environment variable in
run-batched.sh. Sets -Dexist.parser on the JVM command line.
Defaults to antlr2 if not specified. Displays parser name in
the run header for clear identification of results.

Usage:
  PARSER=rd ./run-batched.sh --xqts-version QT4 --output-dir results/rd
  PARSER=antlr2 ./run-batched.sh --xqts-version QT4 --output-dir results/antlr2

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tibility

The runner doesn't use Jetty directly — it only needs exist-core for
XQuery evaluation. But Jetty comes in as a transitive dependency, and
Ivy can't resolve Jetty 12 Maven POM constructs (property references
in dependencyManagement), producing broken pseudo-versions.

Exclude all org.eclipse.jetty group IDs (core, toolchain, websocket,
ee10) so the runner builds against both Jetty 11 (develop) and
Jetty 12 (next) branches of eXist-db.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lding

Two bugs blocking serialization tests:

1. Not assertion parsing (9 tests): stepOutAssertions only handled
   Assertions (AllOf/AnyOf) on the stack, not Not. When </not>
   triggered stepOutAssertions with a Not(Some(...)) on top, it threw
   "Unable to associate non-assertions object". Also consolidated the
   duplicate ALL_OF end-element handler — ALL_OF, ANY_OF, and NOT are
   now handled in a single case.

2. serializationMatches query building (4 tests): The method embedded
   the expected regex in a backtick string constructor ``[...]``, but
   patterns containing <? (e.g., <?xml) trigger an eXist parser bug
   where <? is interpreted as a processing instruction start inside
   the string constructor (XPST0003). Fixed by passing the regex and
   flags as external variables.

Note: The eXist parser bug with <? inside backtick string constructors
should be investigated separately — ``[<?xml]`` should be valid XQuery.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bug 1: Not(None) inside AllOf/AnyOf — When <not> is a child of <all-of>
or <any-of>, addAssertion() appended Not() to the Assertions list, then
appended the inner assertion as a sibling instead of nesting it inside
the Not. Fix: check if the last element of an Assertions container is
Not(None) and fill it with the new assertion.

Bug 2: Serialization options lost after context.reset() — The runner
called the 3-arg XQuery.execute() which passes null for outputProperties.
eXist's execute() calls context.checkOptions(null) (no-op) then
context.reset(), clearing declared serialization options before the
runner could read them. Fix: pass a Properties object to execute() so
eXist extracts options BEFORE resetting the context. This restores
declare option output:method "html" etc. for serialization tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add --timeout flag (default 180s, was hardcoded 300s) for faster
  recovery when BrokerPool shutdown hangs
- Kill lingering Java processes after each batch via pkill -9 matching
  the batch's unique exist.home directory
- Adjust jstack capture timing relative to the configurable timeout

The FLWOR fix increased rd parser test execution by ~4,483 tests,
overwhelming the batch runner with BrokerPool shutdown hangs on
13/13 batches. Shorter timeout + process cleanup allows the runner
to recover and continue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
180s was too short for rd parser batches — the rd parser legitimately
takes longer on some test sets (FLWOR parsing complexity), causing
32/32 batches to time out during test execution. Keep the configurable
--timeout flag and process cleanup, just set the default higher.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
joewiz and others added 2 commits March 30, 2026 23:09
Exit code 1 from the XQTS runner means "some test failures found" —
normal expected behavior. Only count timeout (124/137) and crashes
(exit > 1, not 255) as batch failures. Exit 255 is a runner error
(non-fatal, produces partial results).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
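The exit-code policy above maps naturally onto a `case` statement. The sketch below is an assumption about shape only: the function names and category labels are invented for illustration, and treating exit 0 like exit 1 is an added assumption.

```shell
# Hypothetical sketch of the exit-code classification described above.
classify_exit() {
  case "$1" in
    0|1)     echo ok ;;            # 1 = some test failures, expected
    124|137) echo timeout ;;       # timeout expired / process killed
    255)     echo runner-error ;;  # non-fatal, partial results written
    *)       echo crash ;;         # any other exit code > 1
  esac
}

# Only timeouts and crashes count as batch failures.
is_batch_failure() {
  case "$(classify_exit "$1")" in
    timeout|crash) return 0 ;;
    *)             return 1 ;;
  esac
}
```

A batch loop could call `is_batch_failure "$rc"` after each JVM exits and increment its failure counter only when that returns success.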
Update the runner's Saxon dependency to match eXist-db's Saxon 12
upgrade (eXist-db/exist#6143). The runner's own Saxon usage
(AnyURIValue, TransformerFactoryImpl) is compatible with both
versions — no source changes needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@line-o line-o mentioned this pull request Apr 13, 2026
2 tasks
@joewiz joewiz changed the title Upgrade Saxon-HE dependency from 9.9 to 12.5 Extend XQTS runner: QT4/FTTS/Update suites, assertion fixes, batch runner, Saxon 12 Apr 13, 2026
joewiz and others added 8 commits April 20, 2026 09:24
Resolve conflicts in favor of saxon-12-runner branch:
- build.sbt: Saxon-HE 12.5 (not 9.9), broader Jetty exclusions for
  both Jetty 11 and 12 compatibility
- ExistServer: pass outputProperties to execute() so serialization
  options are captured before context.reset()
- TestCaseRunnerActor: use map-based serialization with all properties
  (not just method/indent); bind regex/flags variables in
  serializationMatches assertion
- XQTSRunnerActor: 60s stall timeout (not 600s), hung test tracking,
  forceSerializeAndShutdown helper

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When exist.xqts.default-version=4.0 system property is set (by the
batch runner for QT4 suite), prepend 'xquery version "4.0";' to test
queries that don't already have a version declaration. This enables
XQ4 syntax (mapping arrow =!>, pipeline ->, etc.) in QT4 tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add exist-expath-file and exist-expath-binary as dependencies so the
runner can execute EXPath File 4.0 and Binary 4.0 conformance tests.

- Register both modules in the runner's embedded conf.xml
- Add Binary feature to Feature enum and DEFAULT_FEATURES
- Update File module class name to match new built-in extension

Verified: File 181/190 (95.2%), Binary 277/379 (73.0%)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Register addModuleLocationHint before importModule so sub-modules
can resolve their imports during compilation. Catch XPathException
from importModule to handle XQTS catalog entries that map a
namespace to a file declaring a different namespace.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The QT4 XQTS catalog inherits many test cases authored against pre-XQ40
semantics (XQ10/XQ30/XQ31). Without a version declaration in the test
body, those tests run under whatever default eXist applies — typically
XQ4 in QT4 mode — which breaks tests whose expected error/result depends
on rules that changed in 4.0:

  * Reserved function names (XQ10 originally allowed `function attribute()`).
  * Default function namespace mapping for unprefixed declarations.
  * Default parameter values (`:= expr`) only legal in XQ4.
  * Subtype substitutability of xs:integer/xs:decimal.

Look at `<dependency type="spec" value="...">` on the test case and
prepend a matching `xquery version "<v>";`:

  * Any "+" form (XQ10+, XQ30+, XQ31+, XQ40+) → "4.0".
  * Strict XQ40 → "4.0". Strict XQ31 → "3.1". XQ30 → "3.0". XQ10 → "1.0".
  * No spec dep → unchanged (let ExistServer's default apply).

This restores the right outcome for ~25 prod-FunctionDecl tests that
the QT4 catalog inherits from XQ10/XQ30 with no version decl.

Also: stop calling exist-core's `addModuleLocationHint` directly. That
method was added on a newer branch and is missing on older worktrees
(e.g. v2/*), which prevents the runner from even compiling against
those branches. Use reflection so the call is a no-op when the method
is absent — the subsequent `importModule` still works either way.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
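The spec-to-version mapping above, sketched in shell for a single spec token. Both function names are hypothetical, multi-token values such as "XP20 XQ10" are outside this sketch, and the query passed in is assumed to have no version declaration of its own.

```shell
# Hypothetical sketch: map one spec dependency token to an XQuery version.
# Prints nothing when no version should be prepended.
version_for_spec() {
  case "$1" in
    *+)   echo "4.0" ;;   # any "+" form: XQ10+, XQ30+, XQ31+, XQ40+
    XQ40) echo "4.0" ;;
    XQ31) echo "3.1" ;;
    XQ30) echo "3.0" ;;
    XQ10) echo "1.0" ;;
    *)    ;;              # no spec dep (or unknown token): leave unchanged
  esac
}

# with_version_decl SPEC QUERY
# Prepends a matching version declaration when the mapping yields one.
with_version_decl() {
  v=$(version_for_spec "$1")
  if [ -n "$v" ]; then
    printf 'xquery version "%s";\n%s\n' "$v" "$2"
  else
    printf '%s\n' "$2"
  fi
}
```

For a strict XQ10 test, `with_version_decl XQ10 'declare function local:f() { 1 }; local:f()'` would emit the query with `xquery version "1.0";` on the first line, so it runs under the version it was written for.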
…M exclusions

- Add --parser flag handling (was falling through to EXTRA_ARGS and
  causing "Unknown option" errors from the runner JAR)
- Pass -Dexist.xqts.default-version=4.0 for QT4 runs so tests without
  version declarations get xq4Enabled=true in the ANTLR parser
- Add --exclude-test-case flag with QT4 defaults for OOM-prone op-to
  tests (RangeExpr-408f-k, 409c-d, 410f-k)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
executeQueryWith$Result passed Seq.empty for namespaces, so XPath
assertions like $result/*/*[1] instance of element(j:string,xs:untyped)
failed with XPST0081 even when the testCase's environment defined the
prefix (e.g. <namespace prefix="j" uri="..."/>).

Capture the current testCase's namespaces in a private actor field
when an assertion is dispatched (actors are single-threaded so this is
safe), and forward them to the embedded ExistConnection.executeQuery
call. Cleared after the assertion completes.

This unblocks ~9 fn-json-to-xml tests, ~7 op-union and op-except
"fn-*-node-args-*" assertion tests, and any other QT4 assertion that
uses a prefix declared in the test-set <environment>.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ures

The QT4 runner prepends `xquery version "4.0"` to every test, and previously ran every
test regardless of its `<dependency type="spec" ...>`. Tests with strict
(non-"+") spec deps that exclude XQ40 -- e.g. zero-arg constructor tests
declaring `<dependency type="spec" value="XP20 XQ10"/>` and XQ10/XQ30+
test pairs like K-SeqExprCast-71a/71b -- were therefore evaluated under
XQ40 semantics, where they fail spuriously: under XQ40's focus-constructor
rule `xs:double()` is legal, so tests expecting XPST0017 see a valid result
instead.

Restrict the default enabled specs for `--xqts-version QT4` to {XP40, XQ40}
so that strict pre-XQ40 spec deps fail the dependency check and the test is
correctly recorded as AssumptionFailed rather than RUN-and-FAIL. Plus-form
deps (e.g. `XP30+ XQ10+`) continue to expand via Spec.atLeast and match
XQ40, so they still run.

Effect on prod-CastExpr (next-v3, ANTLR parser): F=34 -> F=4, with 55
additional tests now correctly skipped. Other XQTS versions (3.1, HEAD,
FTTS) keep the historical "all specs enabled" defaults.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When the watchdog (or any other early-exit path) calls
forceSerializeAndShutdown(), the previous behaviour was to send pending
TestSetResults to the JUnitResultsSerializerRouter and immediately call
shutdown(), which terminated the actor system before the router's
children could finish writing. The in-flight messages landed in
deadLetters, causing scattered test results to silently disappear from
the JUnit output (observed: 4 prod-ModuleImport tests lost across
catalog positions 18/36/87/94 in a recent run, plus 17 pekkoDeadLetter
warnings for TestSetResults sent to terminated children).

Reuse the existing SerializedTestSetResults / FinalizeSerialization /
FinishedSerialization handshake to drain in-flight messages before terminating:

- forceSerializeAndShutdown now sets a forcedShutdown flag, sends
  pending results, tracks them in unserializedTestSets, and returns —
  the existing ack handler triggers FinalizeSerialization once
  unserializedTestSets drains. Idempotent so repeated watchdog ticks
  cannot start parallel drains.
- SerializedTestSetResults handler relaxes its finalize trigger under
  forcedShutdown (hung-but-never-completed test cases mean
  allTestSetsCompleted() can never become true), and guards
  FinalizeSerialization against being sent twice.
- A 60s drain backstop terminates anyway if the serializer itself is
  wedged (separate from the existing 30s actor-system-termination
  backstop in shutdown()).
- shutdown() is now idempotent so the backstop's self-message and the
  normal FinishedSerialization path can't race.
@joewiz joewiz marked this pull request as draft May 20, 2026 13:10
Member Author

joewiz commented May 20, 2026

[This response was co-authored with Claude Code. -Joe]

Moving this PR to draft to give the eXist 7 release a clear focus.

Per recent discussion among the devs, the eXist 7 scope is being narrowed to maximize XQuery 3.1 conformance rather than landing XQ 4.0 / XQ Update / XQ Full Text. Code freeze for the eXist 7 beta is planned for later today, so I'd like to get the XQ 4 / XQUF / XQFT-related runner work out of the review queue and let reviewers focus on the 3.1 PRs.

The relevant 3.1-only pieces from this bundle will land separately:

I'll bring this PR back out of draft post-7.0 when XQ 4 / XQUF / XQFT support is ready to integrate.
