[feature] spatial: add distance and distance-sphere functions #6308
joewiz wants to merge 1 commit
Add spatial:distance/2 (cartesian, source SRS) and spatial:distance/3 (unit-aware: degree, meter, kilometer, mile, nautical-mile). Non-degree units transform to EPSG:4326 and use haversine on the closest-point pair found by JTS DistanceOp. Cartesian computation is JTS native.

Closes the long-standing gap that the spatial module had no distance function: previously a developer asking "how far apart are these two points?" had no spatial:* answer.

Part of the spatial index modernization tasking; first of ~10 small surgical PRs adding the function surface developers expect today (distance, DWithin, KNN, GeoJSON I/O, geohash, lat/lon constructors). The backend stays HSQLDB; no schema change, no migration.

New function class follows the same pattern as FunSpatialSearch and FunGeometricProperties: BasicFunction + IndexUseReporter; geometry resolution via the existing AbstractGMLJDBCIndexWorker public methods (getGeometryForNode for persistent nodes, streamNodeToGeometry for in-memory).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
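For readers unfamiliar with the haversine step mentioned in the commit message, here is a minimal Java sketch of a great-circle distance between two latitude/longitude points. It is not the PR's code: the class name `HaversineSketch`, the method name, the argument order, and the Earth radius constant (the IUGG mean radius) are all assumptions made for illustration. In the PR the two input points would be the closest-point pair that JTS `DistanceOp` returns; here they are passed in directly.

```java
/** Hypothetical helper for illustration; names and the radius constant are assumptions, not the PR's code. */
final class HaversineSketch {

    /** Mean Earth radius in meters (IUGG value). The constant used in the PR is not shown in this thread. */
    static final double EARTH_RADIUS_M = 6_371_008.8;

    private HaversineSketch() {
    }

    /**
     * Great-circle distance in meters between two points given in decimal degrees.
     * Argument order (lat, lon) is a choice made for this sketch only.
     */
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);

        double sinHalfPhi = Math.sin(dPhi / 2);
        double sinHalfLambda = Math.sin(dLambda / 2);

        // Haversine of the central angle between the two points.
        double a = sinHalfPhi * sinHalfPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfLambda * sinHalfLambda;

        // Clamp to 1.0 so rounding error near antipodal points cannot push asin out of range.
        double centralAngle = 2 * Math.asin(Math.sqrt(Math.min(1.0, a)));
        return EARTH_RADIUS_M * centralAngle;
    }
}
```

With this radius, one degree of longitude along the equator comes out at roughly 111.2 km, which is a quick sanity check for any implementation of the formula.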
3ce6d79 to 9878912
|
I'm wondering if we really want to stick with the HSQLDB backend, or if we shouldn't aim to use our own storage, combined with Lucene-spatial? That being said, I was never an active user of the spatial module, so I'm curious what those who actually use(-d) it think.
|
Hi,
HSQL was a proof of concept in... 2007 :-D
In fact, I wanted to be able to use:
- non-spatial DBMS (using query refining as in the current HSQL code). HSQL was a lightweight one...
- spatial DBMS (namely PostGIS, or recent versions of MySQL)
- Lucene spatial
- ArcGIS
...
and maybe other formats... using hierarchical interfaces.
The point was to show that the indexing engine could do many things for many purposes ;-)
Cheers,
p.b.
----- Original message -----
From: "Duncan Paterson"
To: "eXist-db/exist"
Cc: "Subscribed"
Sent: Tuesday 12 May 2026 17:32:32
Subject: Re: [eXist-db/exist] [feature] spatial: add distance and distance-sphere functions (PR #6308)
duncdrum left a comment (eXist-db/exist#6308)
I'm wondering if we really want to stick with the HSQLDB backend, or if we shouldn't aim to use our own storage, combined with Lucene-spatial?
That being said, I was never an active user of the spatial module, so I'm curious what those who actually use(-d) it think.
|
|
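[This response was co-authored with Claude Code. -Joe] @duncdrum @brihaye I mulled over this HSQLDB-vs-Lucene-spatial question when, prompted by our work on the ngram index, I asked Claude to examine the state of the spatial module. Claude prepped a four-phase analysis (baseline → capability gap → design → implementation). Phase 2 concluded with a two-track recommendation:
1. Near-term: keep HSQLDB + JTS, modernize the function surface. Add 12-15 high-value functions developers expect today (distance, DWithin, GeoJSON I/O, geohash, lat/lon constructors, bounding-box filter). I haven't seen complaints about the backend, and HSQLDB seems well-maintained, so dusting the extension off and adding a couple of functions that developers could use seemed like the best approach. This PR is the first deliverable of this track (distance + distance-sphere).
2. Long-term: migrate the storage layer from HSQLDB to Lucene 10 (LatLonShape + spatial-extras), deprecating HSQLDB with a release window. This aligns with the trajectory of eXist's other index modules toward Lucene, removes one whole database engine from the runtime, and the BKD tree under LatLonShape should work well for the point-in-polygon and KNN queries Tier A introduces. Hearing Pierrick had other backends in mind and likes Lucene-spatial makes it seem like the right choice.
This PR is squarely on track 1. The function additions don't preempt any track-2 decision. Here's the full analysis: 2026-05-05 spatial-index-modernization-tasking.md.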
|
Hi,
One more (important) thing:
eXist's spatial index was designed for GML 2.1, if I remember correctly. Newer versions of GML should also be taken into consideration.
Cheers,
p.b.
----- Original message -----
From: "Joe Wicentowski"
To: "eXist-db/exist"
Cc: "Pierrick Brihaye", "Mention"
Sent: Tuesday 12 May 2026 19:15:49
Subject: Re: [eXist-db/exist] [feature] spatial: add distance and distance-sphere functions (PR #6308)
joewiz left a comment (eXist-db/exist#6308)
[This response was co-authored with Claude Code. -Joe]
@duncdrum @brihaye I mulled over this HSQLDB-vs-Lucene-spatial question when, prompted by our work on the ngram index, I asked Claude to examine the state of the spatial module. Claude prepped a four-phase analysis (baseline → capability gap → design → implementation). Phase 2 concluded with a two-track recommendation:
1. Near-term: keep HSQLDB + JTS, modernize the function surface. Add 12-15 high-value functions developers expect today (distance, DWithin, GeoJSON I/O, geohash, lat/lon constructors, bounding-box filter). I haven't seen complaints about the backend, and HSQLDB seems well-maintained, so dusting the extension off and adding a couple of functions that developers could use seemed like the best approach. This PR is the first deliverable of this track (distance + distance-sphere).
2. Long-term: migrate the storage layer from HSQLDB to Lucene 10 (LatLonShape + spatial-extras), deprecating HSQLDB with a release window. This aligns with the trajectory of eXist's other index modules toward Lucene, removes one whole database engine from the runtime, and the BKD tree under LatLonShape should work well for the point-in-polygon and KNN queries Tier A introduces. Hearing Pierrick had other backends in mind and likes Lucene-spatial makes it seem like the right choice.
This PR is squarely on track 1. The function additions don't preempt any track-2 decision.
Here's the full analysis: 2026-05-05 spatial-index-modernization-tasking.md.
|
Summary
Adds spatial:distance/2 (cartesian, source-SRS units) and spatial:distance/3 (unit-aware: degree, meter, kilometer, mile, nautical-mile) to the spatial XQuery module. Closes the long-standing gap that there was no way to compute distance between two GML geometries.

This is the first of ~10 small surgical PRs modernizing the spatial module's function surface to add capabilities developers reach for today (distance, DWithin, KNN, GeoJSON I/O, geohash, lat/lon constructors).
The backend stays HSQLDB; no schema change, no migration story.
What changed
All files in extensions/indexes/spatial/src/main/java/org/exist/xquery/modules/
- spatial/FunSpatialDistance.java: FunctionSignatures for spatial:distance/2 and /3. Geometry resolution mirrors the existing FunGeometricProperties pattern: getGeometryForNode for persistent nodes, streamNodeToGeometry for in-memory. Cartesian uses JTS Geometry.distance; haversine uses JTS DistanceOp.nearestPoints plus an inline haversine helper on the closest-point pair, then converts to the requested unit via Java 21 switch expression.
- spatial/SpatialModule.java: functions[] under a // --- Distance & proximity --- labeled block.
- spatial/FunSpatialDistanceTest.java: XPathException, explicit 'degree' unit equals 2-arity cartesian.
XQuery API
$unit is one of "degree" (default; cartesian, source SRS), "meter", "kilometer", "mile", "nautical-mile". For non-degree units both geometries are transformed to EPSG:4326 and great-circle distance is computed on the closest-point pair via haversine.
Examples
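The unit handling described above can be illustrated with a small Java sketch that converts a distance already computed in meters into the requested unit using a switch expression. This is a hypothetical stand-in, not the PR's implementation: the class and method names are invented, `IllegalArgumentException` replaces the `XPathException` with `ErrorCodes.FOER0000` that the PR describes (to keep the sketch free of eXist dependencies), and the conversion factors are the standard international mile and nautical mile, which the PR is assumed but not confirmed to use.

```java
/** Hypothetical unit-conversion helper; names and exception type are assumptions, not the PR's code. */
final class DistanceUnitSketch {

    private DistanceUnitSketch() {
    }

    /**
     * Converts a great-circle distance in meters to one of the non-degree $unit values.
     * The unit strings are the ones listed in the PR description.
     */
    static double fromMeters(double meters, String unit) {
        return switch (unit) {
            case "meter" -> meters;
            case "kilometer" -> meters / 1_000.0;
            case "mile" -> meters / 1_609.344;        // international mile
            case "nautical-mile" -> meters / 1_852.0; // international nautical mile
            // In the PR, 'degree' takes the cartesian path and never goes through meters.
            case "degree" -> throw new IllegalArgumentException(
                    "'degree' is the cartesian unit and is not derived from meters");
            // Name the offending unit in the message, as the PR's test plan expects.
            default -> throw new IllegalArgumentException(
                    "Unsupported distance unit: '" + unit + "'");
        };
    }
}
```

For instance, `DistanceUnitSketch.fromMeters(2500.0, "kilometer")` yields 2.5, while an unknown unit such as `"furlong"` raises an exception whose message contains the unit name.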
Design notes
- spatial:transform.
- Geometry.distance does, in source SRS units. For EPSG:4326 that's degrees, which is "wrong" in casual use but is the documented OGC SFA semantics for unit-less distance. The 3-arity form with $unit = 'meter' is the right call for lat/lon developers.
- IndexUseReporter: implemented; reports true when the geometry was pulled from SPATIAL_INDEX_V1, false when streamed from in-memory or unindexed nodes. Mirrors FunSpatialSearch and FunGeometricProperties.
- No Optimizable integration in this PR: distance doesn't filter, so there's no FLWOR rewrite opportunity. DWithin (next PR) will need it.
- DistanceOp, Geometry.distance).
- ErrorCodes.FOER0000: used for the unsupported-unit error and for "unable to resolve geometry"; consistent with how other eXist modules surface user-facing errors that don't have a more specific W3C code. Avoids the deprecated XPathException(Expression, String) constructor.
Spec references
- PostGIS ST_Distance and ST_DistanceSphere: de-facto reference for SQL spatial.
- MarkLogic geo:distance: closest XML-database peer; comparable signature shape.
- EXPath Geo geo:distance: XQuery-native; this PR is closer to MarkLogic's because EXPath Geo punts on units.
Test plan
- XPathException with the offending unit named in the message.
- 'degree' unit equals 2-arity default: asserted.
- (mvn test -pl extensions/indexes/spatial): 18 passed, 1 skipped (pre-existing XQUF @Ignore), 0 failures.
- (mvn license:check): clean.
Out of scope (future PRs)
- spatial:within-distance (DWithin filter): needs the index-side bbox-prefilter; planned as PR 2.
- spatial:nearest (KNN): builds on distance; planned as PR 3.
- gt-referencing.GeodeticCalculator: 0.5% accuracy loss vs full ellipsoidal; users can transform to a projected CRS for sub-meter accuracy. Could land later as spatial:distance($g1, $g2, 'meter', 'ellipsoidal') if requested.
- getEPSG4326* redundant signatures: separate [refactor] PR.
- [bugfix] PR.
🤖 Generated with Claude Code