Speed Up Nearest Zone Calculation in Disaggregate Accessibilities by dhensle · Pull Request #1031 · ActivitySim/activitysim

dhensle · 2026-01-27T20:29:35Z

Fix for #1030

This pull request introduces significant performance improvements to zone lookup operations in the accessibility calculations by vectorizing nearest zone searches and optimizing the mapping of zone IDs to skim indices. The changes focus on reducing redundant operations and leveraging efficient numpy-based lookups, which should result in faster computations, especially for large datasets.

Performance improvements in zone lookup and mapping:

Replaced the per-origin nearest zone search with a new vectorized function find_nearest_zones_via_skims, enabling a single batched skim lookup for all origin-destination pairs instead of one lookup per origin zone. This reduces computational overhead in disaggregate_accessibility.py.
Updated the code to use the new vectorized nearest zone search in place of the old loop-based approach.

Optimizations in OffsetMapper for skim index mapping:

Added a fast-path numpy array (_offset_array) to OffsetMapper for O(1) zone ID to index mapping when the offset is specified as a pandas Series, replacing the slower pandas .map() approach. The array is constructed only when zone IDs are non-negative and the span of zone IDs is at most 10 times the number of zones, which avoids allocating huge arrays for sparse zone IDs.
Modified the map method to use this numpy array when available, including safe handling of out-of-range indices, which improves performance for large zone lists.
Ensured that the numpy array is not used or constructed when the offset is a simple integer, maintaining correct behavior for all offset types.

Copilot

Pull request overview

This PR targets the performance bottleneck in disaggregate accessibility zone lookups (Issue #1030) by optimizing (1) nearest-zone identification when using skims and (2) zone-id-to-skim-index mapping.

Changes:

Introduces a vectorized nearest-zone skim lookup helper and switches find_nearest_accessibility_zone to use it.
Adds a numpy-array fast path to OffsetMapper to speed up zone id → skim index mapping when an offset series is used.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `activitysim/core/skim_dictionary.py` | Adds a numpy-based fast path for `OffsetMapper` mapping to reduce pandas `.map()` overhead. |
| `activitysim/abm/tables/disaggregate_accessibility.py` | Replaces per-origin skim nearest-zone lookups with a vectorized approach intended to reduce Python-level overhead. |


+        # Build numpy lookup array for fast O(1) mapping
+        # This replaces slow pandas Series.map() with direct numpy indexing
+        index_vals = offset_series.index.values
+        if len(index_vals) > 0:
+            min_zone = int(index_vals.min())
+            max_zone = int(index_vals.max())
+            # Only build array if zone IDs are non-negative and range is reasonable
+            # (avoid huge arrays for sparse zone IDs)
+            if min_zone >= 0 and (max_zone - min_zone + 1) <= len(index_vals) * 10:
+                self._offset_array = np.full(
+                    max_zone + 1, NOT_IN_SKIM_ZONE_ID, dtype=np.int32
+                )
+                self._offset_array[
+                    index_vals.astype(int)
+                ] = offset_series.values.astype(np.int32)
+            else:
+                self._offset_array = None
+        else:
+            self._offset_array = None


+        if self._offset_array is not None:
+            # Fast path: use numpy array indexing (O(1) per element)
+            zone_ids = np.asanyarray(zone_ids).astype(int)
+            # Clip to valid range to avoid index errors, then mark out-of-range as NOT_IN_SKIM
+            max_valid = len(self._offset_array) - 1
+            valid_mask = (zone_ids >= 0) & (zone_ids <= max_valid)
+            # Use clip to safely index, then apply mask
+            clipped_ids = np.clip(zone_ids, 0, max_valid)
+            offsets = np.where(
+                valid_mask,
+                self._offset_array[clipped_ids],
+                NOT_IN_SKIM_ZONE_ID,
+            )
+            return offsets
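
To make the two `OffsetMapper` hunks above easier to follow outside the class, here is a minimal standalone sketch of the same lookup-array idea. The helper names `build_offset_array` and `map_zone_ids` are hypothetical and do not appear in the PR, and the value `-1` for `NOT_IN_SKIM_ZONE_ID` is an assumption made for this example. The sketch also compares the result against the pandas `.map()` approach it is meant to replace.

```python
import numpy as np
import pandas as pd

NOT_IN_SKIM_ZONE_ID = -1  # assumed sentinel value for this sketch


def build_offset_array(offset_series, max_sparsity=10):
    """Hypothetical helper: build a dense zone_id -> offset array, or None."""
    index_vals = offset_series.index.values
    if len(index_vals) == 0:
        return None
    min_zone = int(index_vals.min())
    max_zone = int(index_vals.max())
    # Same guard as the hunk above: non-negative IDs, limited ID span
    if min_zone < 0 or (max_zone - min_zone + 1) > len(index_vals) * max_sparsity:
        return None
    arr = np.full(max_zone + 1, NOT_IN_SKIM_ZONE_ID, dtype=np.int32)
    arr[index_vals.astype(int)] = offset_series.values.astype(np.int32)
    return arr


def map_zone_ids(offset_array, zone_ids):
    """Hypothetical helper: map zone IDs to offsets, sentinel if unknown."""
    zone_ids = np.asanyarray(zone_ids).astype(int)
    max_valid = len(offset_array) - 1
    valid_mask = (zone_ids >= 0) & (zone_ids <= max_valid)
    # Clip so the indexing never raises, then overwrite the invalid slots
    clipped_ids = np.clip(zone_ids, 0, max_valid)
    return np.where(valid_mask, offset_array[clipped_ids], NOT_IN_SKIM_ZONE_ID)


# Zones 3, 5, 8 sit at skim offsets 0, 1, 2
offsets = pd.Series([0, 1, 2], index=[3, 5, 8])
queries = [5, 8, 4, -2, 100]

offset_array = build_offset_array(offsets)
result = map_zone_ids(offset_array, queries)

# Reference result using the pandas .map() approach
pandas_result = (
    pd.Series(queries).map(offsets).fillna(NOT_IN_SKIM_ZONE_ID).astype(int).to_numpy()
)
print(result, pandas_result)
```

One detail worth noticing in the hunk, and mirrored in this sketch: the sparsity guard measures `max_zone - min_zone + 1`, while the array is allocated with `max_zone + 1` entries, so a compact block of very large zone IDs would pass the guard and still allocate a large array.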


+        origin_zones = np.asarray(origin_zones)
+        dest_zones = np.asarray(dest_zones)
+        n_origins = len(origin_zones)
+        n_dests = len(dest_zones)
+
+        # handle empty input case
+        if n_origins == 0 or n_dests == 0:
+            return []
+
+        # Create all origin-destination pairs in one go
+        # all_orig: [o1, o1, o1, ..., o2, o2, o2, ..., oN, oN, oN, ...]
+        # all_dest: [d1, d2, d3, ..., d1, d2, d3, ..., d1, d2, d3, ...]
+        all_orig = np.repeat(origin_zones, n_dests)
+        all_dest = np.tile(dest_zones, n_origins)
+
+        # Single skim lookup for all pairs
+        all_dists = skim_dict.lookup(all_orig, all_dest, "DIST")
+
+        # Reshape to (n_origins, n_dests) and find argmin per origin
+        dist_matrix = np.asarray(all_dists).reshape(n_origins, n_dests)
+        nearest_indices = np.argmin(dist_matrix, axis=1)
+
+        # Return list of (origin, nearest_dest) tuples
+        return list(zip(origin_zones, dest_zones[nearest_indices]))
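
The batched pairing in the hunk above can be tried on a small example. The following is a sketch under stated assumptions: `ToySkimDict` is a hypothetical stand-in that only mimics a `lookup(orig, dest, key)` call and treats zone IDs as matrix indices, and `nearest_zones` is an illustrative wrapper around the hunk's logic rather than the function signature used in the PR.

```python
import numpy as np


class ToySkimDict:
    """Hypothetical stand-in for a skim dictionary (not the ActivitySim class)."""

    def __init__(self, dist):
        self._dist = np.asarray(dist, dtype=float)

    def lookup(self, orig, dest, key):
        # In this toy, zone IDs double as row/column indices
        assert key == "DIST"
        return self._dist[np.asarray(orig), np.asarray(dest)]


def nearest_zones(origin_zones, dest_zones, skim_dict):
    """Illustrative wrapper: nearest destination zone for each origin zone."""
    origin_zones = np.asarray(origin_zones)
    dest_zones = np.asarray(dest_zones)
    n_origins, n_dests = len(origin_zones), len(dest_zones)
    if n_origins == 0 or n_dests == 0:
        return []
    # Every origin paired with every destination, origin-major order
    all_orig = np.repeat(origin_zones, n_dests)
    all_dest = np.tile(dest_zones, n_origins)
    # One lookup for all pairs, then one row per origin
    dists = np.asarray(skim_dict.lookup(all_orig, all_dest, "DIST"))
    dist_matrix = dists.reshape(n_origins, n_dests)
    nearest_indices = np.argmin(dist_matrix, axis=1)
    return list(zip(origin_zones, dest_zones[nearest_indices]))


dist = [
    [0.0, 4.0, 9.0, 2.0],
    [4.0, 0.0, 1.0, 7.0],
    [9.0, 1.0, 0.0, 3.0],
    [2.0, 7.0, 3.0, 0.0],
]
skims = ToySkimDict(dist)

# Origins 0, 1, 2; candidate destinations 2 and 3
pairs = nearest_zones([0, 1, 2], [2, 3], skims)
print([(int(o), int(d)) for o, d in pairs])  # [(0, 3), (1, 2), (2, 2)]
```

The pair arrays hold `n_origins * n_dests` elements, so memory use grows with that product. Processing origins in chunks would be one way to bound it; that is a suggestion for illustration and not something shown in the hunk above.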


dhensle added 4 commits January 27, 2026 09:33

optimize nearest_zone from skims function

c5a50eb

increase speed of skim dict _lookup mapping function

cb2f9c0

blacken

5867433

handle empty inputs for nearest zone calc

5145c06

jpn-- mentioned this pull request Jan 29, 2026

2026-01-29 Engineering Team ActivitySim/meeting-notes#68

Closed

jpn-- self-requested a review January 29, 2026 19:31

bhargavasana mentioned this pull request Feb 19, 2026

using skims with find_nearest_accessibility_zone is a bottleneck when working with large number of MAZs #1030

Open

Merge branch 'main' into disagg_access_nearest_zone_speed_up

42bf2f7

jpn-- mentioned this pull request Apr 9, 2026

2026-04-09 Engineering Team ActivitySim/meeting-notes#88

Closed

jpn-- mentioned this pull request Apr 16, 2026

2026-04-16 Engineering Team ActivitySim/meeting-notes#90

Closed

jpn-- mentioned this pull request Apr 30, 2026

2026-04-30 Engineering Team ActivitySim/meeting-notes#95

Closed

Merge branch 'main' into disagg_access_nearest_zone_speed_up

8cc4480

jpn-- requested a review from Copilot June 4, 2026 16:34

Copilot started reviewing on behalf of jpn-- June 4, 2026 16:34

Copilot AI reviewed Jun 4, 2026

dhensle wants to merge 6 commits into ActivitySim:main from RSGInc:disagg_access_nearest_zone_speed_up
