Support (N, M, L) per-dimension zones in octree containers#5391
Support (N, M, L) per-dimension zones in octree containers#5391matthewturk merged 3 commits intoyt-project:mainfrom
Conversation
cf2d51e to
d126ef5
Compare
Blocks may contain more than 256 cells - prevent overflow of the uint8
matthewturk
left a comment
There was a problem hiding this comment.
I have gone over this and everything I can see suggests it looks exactly correct.
This is a really great improvement. I'd like to see if we could discuss reviving efforts to move the Enzo-P frontend to use this ...
With arbitrary zones, is it also potentially the case that any and all block-based grid systems could utilize all this machinery, rather than the more complex (and not fully working) grid visitor stuff? I suspect not, based on how the refinement patterns go, but I wanted to ask.
| new_shape = (nz, nz, nz, n_oct) | ||
| elif arr.size == nz * nz * nz * n_oct * 3: | ||
| new_shape = (nz, nz, nz, n_oct, 3) | ||
| if arr.size == nzones * n_oct: |
There was a problem hiding this comment.
I don't think this changes the F/C ordering issues, right? Like, if we were ordered correctly with symmetric dimensions we'd be similarly ordered correctly with asymmetric dimensions? (And vice versa.)
There was a problem hiding this comment.
It's too late in the week for me to fully process that question, but I think I understand it, and I think the answer is yes. How could I convince us of it?
chrishavlin
left a comment
There was a problem hiding this comment.
only took a cursory look to see where the test failures were coming from
| # We create oct arrays of the correct size | ||
| cdef np.ndarray[np.uint8_t, ndim=1] levels | ||
| cdef np.ndarray[np.uint8_t, ndim=1] cell_inds | ||
| cdef np.ndarray[np.uint32_t, ndim=1] cell_inds |
There was a problem hiding this comment.
This is the source of the test failures: the return will now be uint32_t, but fill_level is expecting uint8_t still, so presumably fill_level's types should be updated
There was a problem hiding this comment.
and should file_index_octs_with_ghost_zones be updated too?
There was a problem hiding this comment.
oh and fill_level_with_domain ?
There was a problem hiding this comment.
So I followed the white rabbit and ended up replacing lots of hits. Let's see whether tests pass now!
There was a problem hiding this comment.
Funnily, I think yt never worked with nz > 6 for which nz**3 > 255 which overflows an uint8_t!
There was a problem hiding this comment.
Oh dang, that's somehow not surprising, but should be. I wonder if this is related to some other weird problems I saw in some other work a few years ago.
This will potentially increase the peak memory usage quite a lot, but I don't know that it's going to be a long-lived increase that will make a difference.
There was a problem hiding this comment.
To be fair, the memory increase is on the order of file_inds and a similar amount used to read the data in double precision.
I agree this is slightly wasteful, but in the grand scheme of things, it's probably OK. But I could make it a uint16 if you wanted.
There was a problem hiding this comment.
I definitely do not! I think we're in fine territory.
When allowing larger block sizes, we may easily overflow 8-bit counters. Note that this was *already* overflowing for nz > 6, but it appears it was never tested.
| cdef np.uint8_t[::1] cell_inds | ||
| cdef np.int64_t[::1] oct_inds | ||
|
|
||
| cell_inds = np.full(num_octs*4**3, 8, dtype=np.uint8) |
There was a problem hiding this comment.
So, reading this with a fresh mind, I think this is incorrect again, and never worked for cases where nz ≠ 2. The 4**3 is one oct (harcoded 2×2×2) with one neighbour in each dimension (so 4×4×4).
At this point, I'm starting to wonder how deep I need to go to fix all these hard-coded assumptions.
I could either try to fix them all in this PR or open another one to fix it afterwards. What do you suggest @chrishavlin?
There was a problem hiding this comment.
This is an interesting one, and I'm not sure how it worked before, if indeed it is a problem.
One thing to keep in mind is that there are, occasionally, sloppy references to "cells" when it should be "leaf nodes." So while this may indeed be an error, it's possible that it's actually just marking the selected leaf nodes and not the cells.
There was a problem hiding this comment.
I think I'm the author of this sloppy piece of code, and I'm pretty sure I hard-coded that it only works in the 2×2×2 case.
There was a problem hiding this comment.
I wanted to say that it's not that big a deal, but in looking it over, it does get used in the Stream and ART frontends, so it's possible that this may impede our ability moving forward if we wanted to convert any Block-structured code to using this.
For instance, it could potentially be very beneficial to use the octree machinery for enzo-p (and a student worked on that a couple summers ago). (Maybe he even issued a pull request to make the number of zones generic? I will investigate.)
|
|
||
| cdef np.int64_t[:, :, :, :] cell_inds | ||
|
|
||
| cell_inds = np.full((self.nocts, 2, 2, 2), -1, dtype=np.int64) |
There was a problem hiding this comment.
For reference, here is another place where it is hard-coded that nz=2.
|
@matthewturk and @chrishavlin I've:
|
|
If you think this is good to go in with a followup, I say do it. |
I think that would be the best option (especially if we want to release a new version soon). All the bugs with hardcoded 2×2×2 assumptions have been with us essentially forever; this PR is merely revealing them and it's documented in #5402. Plus, I'm starting to work with datasets that break this assumption explicitly, so I'll eventually fix them! |
|
OK -- so you think this one is good to go in? I reviewed it and it looks like you made the appropriate changes ... going once, going twice? |
I do! |
|
I won't have time to get to this for a while, but don't hold it up for me! if @matthewturk took a look and you're both happy, merge away! |
PR Summary
Currently, yt only supports cubic "zones" for octrees: this means that octree datasets can only contain
(N, N, N)cells in each leaf oct, whereNis controlled with the dataset parameternum_zones.This PR allows passing a tuple of values of
num_zones, increasing flexibility in support of #5390. In practice, this allows having an octree of blocks, where each block contains(N, M, L)cells.This PR should leave the RAMSES, Art and Stream frontends behaviour unchanged.
PR Checklist