
Optimization Cycle I: Loop merge & transient refine #465

Draft
FlorianDeconinck wants to merge 45 commits into NOAA-GFDL:develop from FlorianDeconinck:opt_cycle_I/loop_merge

FlorianDeconinck (Collaborator) commented May 14, 2026

Description

Readying the following Schedule Tree transforms for mainline:

  • CartesianMerge
  • InlineVertical2DWrite
  • CartesianRefineTransients
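For readers unfamiliar with the two ideas in the PR title, the sketch below illustrates loop merging and transient refinement in plain Python. It is a conceptual illustration only: it does not use the Schedule Tree API, the function names `unmerged` and `merged` are hypothetical, and it makes no claim about how the passes listed above are implemented.

```python
def unmerged(a):
    """Two loops over the same range, linked by a full-size temporary."""
    n = len(a)
    tmp = [0.0] * n  # full-size temporary (a "transient")
    out = [0.0] * n
    for i in range(n):
        tmp[i] = a[i] * 2.0
    for i in range(n):
        out[i] = tmp[i] + 1.0
    return out


def merged(a):
    """Same computation after merging the loops.

    Once both statements share one loop, the temporary is only needed
    for the current iteration, so it can shrink to a scalar.
    """
    n = len(a)
    out = [0.0] * n
    for i in range(n):
        t = a[i] * 2.0  # temporary refined from an array to a scalar
        out[i] = t + 1.0
    return out
```

Both functions return the same values; the merged version does one pass over the data and allocates no intermediate array.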

QOL / Tooling:

  • TreeOptimizationStatistics records the before/after counts of maps, for loops, and transients
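A before/after count of this kind could look roughly like the sketch below. This is not the `TreeOptimizationStatistics` API: `NodeCounts`, `count_nodes`, and `report` are hypothetical names, and a flat list of string labels stands in for a real tree walk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCounts:
    """Hypothetical snapshot of the three quantities being tracked."""
    maps: int
    fors: int
    transients: int


def count_nodes(kinds):
    """Count node kinds in a flat list of labels (stand-in for a tree walk)."""
    return NodeCounts(
        maps=kinds.count("map"),
        fors=kinds.count("for"),
        transients=kinds.count("transient"),
    )


def report(before, after):
    """Format a one-line before/after summary."""
    return (
        f"maps {before.maps}->{after.maps}, "
        f"fors {before.fors}->{after.fors}, "
        f"transients {before.transients}->{after.transients}"
    )
```

Taking one snapshot before the optimization pipeline runs and one after makes it easy to see whether a pass actually reduced anything.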

🐞 Regressions / bugs worked around

  • Locals are now non-transient on GPU because of bugs that show up during tree optimization
  • CartesianRefineTransients is not applied on GPU, for the same reason

⚠️ This PR includes an update to temporary branches of gt4py/dace to consolidate all changes needed for the June presentation

How has this been tested?

New tests were added where needed

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (e.g. add new modules to docs/docstrings/)
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules
  • New check tests, if applicable, are included

@FlorianDeconinck FlorianDeconinck requested review from romanc and twicki May 14, 2026 16:06
FlorianDeconinck and others added 26 commits May 14, 2026 15:28
Move pipeline defaults inside the Pipeline itself and have orchestration call default
Mockup of passes required for merging to behave
Use symbols in the replacement directory. Update DaCe to a version
that doesn't re-initialize the symbols. And fix the test failure in
python 3.13.
This has been replaced with the `InlineOffgridConditionals` pass
- Locals are no longer transient on GPU
- RefineTransients is deactivated
romanc and others added 17 commits May 26, 2026 10:23
Also adds infrastructure to override the orchestration pipeline in
tests (used to allow testing `InlineVertical2DWrite`).
The function creates an inconsistent DaceConfig by creating a
config first and then tampering with some properties without
re-evaluating computed properties. In particular, `code_path`,
`do_compile` and distributed caches are potentially out of sync with the
layout information.
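The failure mode described in this commit message (mutating a config after construction so that derived values go stale) can be avoided in general by rebuilding the object whenever an input changes. The sketch below shows that pattern with the standard library only. `Config`, its fields, and the cache path format are hypothetical and are not taken from `DaceConfig`.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Config:
    """Hypothetical config whose `code_path` is derived from its inputs."""
    layout: tuple
    backend: str
    code_path: str = field(init=False)  # computed, never passed in

    def __post_init__(self):
        # frozen=True blocks normal assignment, so set the derived
        # field through object.__setattr__ during construction.
        path = f".cache/{self.backend}_{self.layout[0]}x{self.layout[1]}"
        object.__setattr__(self, "code_path", path)


def with_layout(cfg, layout):
    """Return a new config; __post_init__ re-runs and recomputes code_path."""
    return replace(cfg, layout=layout)
```

Because the dataclass is frozen, code cannot change `layout` in place and leave `code_path` pointing at the old location; `dataclasses.replace` constructs a fresh instance and does not copy `init=False` fields, so the derived value is recomputed.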
UW translate test compares a scalar value (`dotransport`) as part of the
translate test. Doing so trips reporting and this change makes it work
again \o/
Protect `performance_timer` for `time==False` and add external setup
Split Simplify2 pass into a GPU-centric pass with block_size on maps & apply_gpu_xform
Remove useless code - legacy code bleed
Verbose the steps better
