Support (N, M, L) per-dimension zones in octree containers by cphyc · Pull Request #5391 · yt-project/yt

cphyc · 2026-02-26T12:31:52Z

PR Summary

Currently, yt only supports cubic "zones" for octrees: this means that octree datasets can only contain (N, N, N) cells in each leaf oct, where N is controlled with the dataset parameter num_zones.

This PR allows passing a tuple of values for num_zones, increasing flexibility in support of #5390. In practice, this allows having an octree of blocks, where each block contains (N, M, L) cells.
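As a rough illustration of the idea, a scalar or per-dimension num_zones could be normalised to an (N, M, L) tuple as sketched below. The helper name normalize_num_zones is hypothetical and is not taken from the yt source; it only shows how the cubic case stays a special case of the tuple form.

```python
from math import prod


def normalize_num_zones(num_zones):
    """Return a 3-tuple (N, M, L) from an int or a length-3 sequence.

    Hypothetical helper for illustration only; not part of yt.
    """
    if isinstance(num_zones, int):
        # Legacy cubic case: the same number of zones along each axis.
        zones = (num_zones,) * 3
    else:
        zones = tuple(int(n) for n in num_zones)
    if len(zones) != 3 or any(n < 1 for n in zones):
        raise ValueError(f"expected a positive int or 3 positive ints, got {num_zones!r}")
    return zones


cubic = normalize_num_zones(2)             # (2, 2, 2), the usual octree
blocks = normalize_num_zones((4, 8, 16))   # per-dimension zones
cells_per_oct = prod(blocks)               # 4 * 8 * 16 cells in each leaf oct
```

With this convention, any code that previously computed nz**3 would instead use the product of the three entries.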

This PR should leave the behaviour of the RAMSES, ART and Stream frontends unchanged.

PR Checklist

New features are documented, with docstrings and narrative docs
Adds a test for any bugs fixed. Adds tests for new features.

Blocks may contain more than 256 cells - prevent overflow of the uint8

matthewturk

I have gone over this and everything I can see suggests it looks exactly correct.

This is a really great improvement. I'd like to see if we could discuss reviving efforts to move the Enzo-P frontend to use this ...

With arbitrary zones, is it also potentially the case that any and all block-based grid systems could utilize all this machinery, rather than the more complex (and not fully working) grid visitor stuff? I suspect not, based on how the refinement patterns go, but I wanted to ask.

matthewturk · 2026-03-06T18:07:07Z

-            new_shape = (nz, nz, nz, n_oct)
-        elif arr.size == nz * nz * nz * n_oct * 3:
-            new_shape = (nz, nz, nz, n_oct, 3)
+        if arr.size == nzones * n_oct:


I don't think this changes the F/C ordering issues, right? Like, if we were ordered correctly with symmetric dimensions we'd be similarly ordered correctly with asymmetric dimensions? (And vice versa.)

It's too late in the week for me to fully process that question, but I think I understand it, and I think the answer is yes. How could I convince us of it?
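One possible way to gain confidence, sketched here with plain index arithmetic and assuming the reshape uses Fortran ordering (first axis fastest): the flat offset of a cell depends only on the per-axis strides, which are built from each axis length separately, so nothing requires the three lengths to be equal. The function names below are made up for this sketch, and it does not exercise the actual yt readers.

```python
from itertools import product


def flat_index_f(idx, shape):
    """Fortran-order flat index: the first axis varies fastest."""
    flat, stride = 0, 1
    for i, n in zip(idx, shape):
        flat += i * stride
        stride *= n  # stride of the next axis uses this axis' own length
    return flat


def first_axis_fastest(shape):
    """Check that iterating with the first axis fastest visits 0, 1, 2, ..."""
    # product() varies its last argument fastest, so feed the axes reversed
    # and flip each index tuple back to (i, j, k, oct) order.
    axes_slowest_first = [range(n) for n in reversed(shape)]
    visited = [
        flat_index_f(idx[::-1], shape) for idx in product(*axes_slowest_first)
    ]
    return visited == list(range(len(visited)))


cubic_ok = first_axis_fastest((2, 2, 2, 5))   # (nz, nz, nz, n_oct)
asym_ok = first_axis_fastest((2, 3, 4, 5))    # (N, M, L, n_oct)
```

Both checks rely on the same stride rule, which suggests that data ordered correctly for symmetric zones would also be ordered correctly for asymmetric ones.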

chrishavlin

only took a cursory look to see where the test failures were coming from

chrishavlin · 2026-03-20T21:17:58Z

        # We create oct arrays of the correct size
        cdef np.ndarray[np.uint8_t, ndim=1] levels
-        cdef np.ndarray[np.uint8_t, ndim=1] cell_inds
+        cdef np.ndarray[np.uint32_t, ndim=1] cell_inds


This is the source of the test failures: the return will now be uint32_t, but fill_level is expecting uint8_t still, so presumably fill_level's types should be updated

and should file_index_octs_with_ghost_zones be updated too?

oh and fill_level_with_domain ?

So I followed the white rabbit and ended up replacing lots of hits. Let's see whether tests pass now!

Funnily, I think yt never worked with nz > 6, for which nz**3 > 255, which overflows a uint8_t!
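The wrap-around can be reproduced outside yt with a small sketch. It uses ctypes.c_uint8 as a stand-in for the Cython np.uint8_t buffers (an assumption made for illustration; the real code stores the indices in NumPy arrays).

```python
import ctypes


def largest_cell_index(nz):
    """Largest in-oct cell index for a cubic oct with nz zones per side."""
    return nz**3 - 1


# Map nz -> (true largest index, value after storing it in 8 unsigned bits).
wrapped = {
    nz: (largest_cell_index(nz), ctypes.c_uint8(largest_cell_index(nz)).value)
    for nz in (2, 6, 7, 8)
}
# nz = 2 and nz = 6 fit in a uint8; nz = 7 and nz = 8 silently wrap modulo 256.
overflowing = [nz for nz, (true, stored) in wrapped.items() if true != stored]
```

Because the wrapped value is still a valid-looking index, this kind of failure would not raise an error on its own.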

Oh dang, that's somehow not surprising, but should be. I wonder if this is related to some other weird problems I saw in some other work a few years ago.

This will potentially increase the peak memory usage quite a lot, but I don't know that it's going to be a long-lived increase that will make a difference.

To be fair, the memory increase is on the order of $(32-8) \mathrm{bit} \times N_\mathrm{cell}$ to be compared to the $64\mathrm{bit} \times N_\mathrm{cell}$ used by file_inds and a similar amount used to read the data in double precision.
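For a sense of scale, the estimate above can be worked through for a hypothetical dataset; the cell count below is an arbitrary example and not a measurement from any real run.

```python
def extra_index_bytes(n_cell, old_bits=8, new_bits=32):
    """Extra bytes needed when widening one per-cell index array."""
    return (new_bits - old_bits) // 8 * n_cell


n_cell = 512**3                      # hypothetical number of cells
extra = extra_index_bytes(n_cell)    # 3 extra bytes per cell
file_inds_bytes = 8 * n_cell         # 64-bit file_inds, one entry per cell
ratio = extra / file_inds_bytes      # extra cost relative to file_inds
```

Under these assumptions the widened cell indices add three eighths of what file_inds already occupies.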

I agree this is slightly wasteful, but in the grand scheme of things, it's probably OK. But I could make it a uint16 if you wanted.

I definitely do not! I think we're in fine territory.

When allowing larger block sizes, we may easily overflow 8-bit counters. Note that this was *already* overflowing for nz > 6, but it appears it was never tested.

cphyc · 2026-03-20T23:00:56Z

        cdef np.uint8_t[::1] cell_inds
        cdef np.int64_t[::1] oct_inds

-        cell_inds = np.full(num_octs*4**3, 8, dtype=np.uint8)


So, reading this with a fresh mind, I think this is incorrect again, and never worked for cases where nz ≠ 2. The 4**3 is one oct (hard-coded 2×2×2) with one neighbour in each dimension (so 4×4×4).
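If the 4**3 is read as "one oct plus one ghost cell on each side of each dimension" (an assumption; the thread itself notes it might count leaf nodes rather than cells), a generic size could be sketched as follows. The function padded_cells_per_oct is hypothetical and not part of yt.

```python
from math import prod


def padded_cells_per_oct(num_zones, n_ghost=1):
    """Cells in one oct once n_ghost cells are added on both sides of each axis."""
    return prod(n + 2 * n_ghost for n in num_zones)


legacy = padded_cells_per_oct((2, 2, 2))     # reproduces the hard-coded 4**3
generic = padded_cells_per_oct((4, 8, 16))   # 6 * 10 * 18 for an (N, M, L) block
```

Whether this is the right generalisation depends on how the ghost-zone code defines its neighbourhood, which is the subject of the follow-up issue.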

At this point, I'm starting to wonder how deep I need to go to fix all these hard-coded assumptions.

I could either try to fix them all in this PR or open another one to fix it afterwards. What do you suggest @chrishavlin?

This is an interesting one, and I'm not sure how it worked before, if indeed it is a problem.

One thing to keep in mind is that there are, occasionally, sloppy references to "cells" when it should be "leaf nodes." So while this may indeed be an error, it's possible that it's actually just marking the selected leaf nodes and not the cells.

I think I'm the author of this sloppy piece of code, and I'm pretty sure I hard-coded that it only works in the 2×2×2 case.

I wanted to say that it's not that big a deal, but in looking it over, it does get used in the Stream and ART frontends, so it's possible that this may impede our ability moving forward if we wanted to convert any Block-structured code to using this.

For instance, it could potentially be very beneficial to use the octree machinery for enzo-p (and a student worked on that a couple summers ago). (Maybe he even issued a pull request to make the number of zones generic? I will investigate.)

cphyc · 2026-03-20T23:04:19Z


        cdef np.int64_t[:, :, :, :] cell_inds

        cell_inds = np.full((self.nocts, 2, 2, 2), -1, dtype=np.int64)


For reference, here is another place where it is hard-coded that nz=2.

cphyc · 2026-03-21T09:53:55Z

@matthewturk and @chrishavlin I've:

made the requested modification,
opened an issue ([BUG] Oct container has hard-coded assumptions nz=2 #5402) for the incorrect assumption about the number of cells in an oct that was already in yt. I propose fixing it in another PR. The impact seems to be limited to neighbour finding and ghost zone reconstruction.

matthewturk · 2026-04-28T16:34:56Z

If you think this is good to go in with a followup, I say do it.

cphyc · 2026-04-28T16:42:48Z

> If you think this is good to go in with a followup, I say do it.

I think that would be the best option (especially if we want to release a new version soon). All the bugs with hardcoded 2×2×2 assumptions have been with us essentially forever; this PR is merely revealing them and it's documented in #5402.

Plus, I'm starting to work with datasets that break this assumption explicitly, so I'll eventually fix them!

matthewturk · 2026-04-28T16:51:20Z

OK -- so you think this one is good to go in? I reviewed it and it looks like you made the appropriate changes ... going once, going twice?

cphyc · 2026-04-28T18:31:38Z

> OK -- so you think this one is good to go in? I reviewed it and it looks like you made the appropriate changes ... going once, going twice?

I do!

chrishavlin · 2026-04-28T18:39:50Z

I won't have time to get to this for a while, but don't hold it up for me! if @matthewturk took a look and you're both happy, merge away!

cphyc added this to the 4.5.0 milestone Feb 26, 2026

cphyc added the "new feature" and "index: octree" labels Feb 26, 2026

cphyc mentioned this pull request Feb 26, 2026

Dyablo frontend #5390

Draft

Support (N, M, L) per-dimension zones in octree containers

d126ef5

cphyc force-pushed the feature/support-non-cubic-zones branch from cf2d51e to d126ef5 February 26, 2026 15:13

cphyc marked this pull request as ready for review February 27, 2026 08:40

Prevent overflow for big blocks

283b0cf

Blocks may contain more than 256 cells - prevent overflow of the uint8

chrishavlin self-assigned this Mar 2, 2026

matthewturk previously approved these changes Mar 6, 2026


chrishavlin requested changes Mar 20, 2026


Replace hits of uint8_t -> uint32_t for cell indices

e569ea9

When allowing larger block sizes, we may easily overflow 8-bit counters. Note that this was *already* overflowing for nz > 6, but it appears it was never tested.

cphyc dismissed matthewturk’s stale review via e569ea9 March 20, 2026 22:18

cphyc commented Mar 20, 2026


cphyc mentioned this pull request Mar 21, 2026

[BUG] Oct container has hard-coded assumptions nz=2 #5402

Open

cphyc requested a review from chrishavlin March 31, 2026 07:59

matthewturk merged commit 11e166f into yt-project:main Apr 28, 2026
38 of 41 checks passed

cphyc deleted the feature/support-non-cubic-zones branch April 29, 2026 06:37

