Something in MOM_initialize_tracer_from_Z() adds a lot of time to initialization: MARBL loops through all of its tracers and calls this routine once for each. @klindsay28 had a run where the reported timers were
Total runtime 1 1253.507196 1253.507603 1253.507481 0.000079 1.000 0 0 2559
Ocean Initialization 2 824.114088 825.869618 824.277788 0.272389 0.658 11 0 2559
(Initialize tracer from Z) 41 697.711339 699.077023 698.795238 0.082158 0.557 41 0 2559
I had a similar run with
Total runtime 1 846.249887 846.250246 846.249995 0.000067 1.000 0 0 2559
Ocean Initialization 2 491.731818 492.948469 492.118261 0.437644 0.582 11 0 2559
(Initialize tracer from Z) 37 379.730930 380.510779 380.245332 0.076911 0.449 41 0 2559
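As a rough check on the shares involved, the snippet below takes the average-time values from the two reports above and computes what fraction of initialization and of total runtime is spent in this routine, plus the mean cost per call. It assumes the columns follow the order hits, tmin, tmax, tavg, ..., which the reports themselves do not state; the dictionary layout and the `summarize` helper are made up for illustration.

```python
# tavg values (seconds) copied from the two timer reports above.
# Assumption: columns are hits, tmin, tmax, tavg, ..., so "calls" is the
# first number after each clock name and tavg is the fourth.
runs = {
    "41 tracers": {"total": 1253.507481, "init": 824.277788,
                   "tracer_from_z": 698.795238, "calls": 41},
    "37 tracers": {"total": 846.249995, "init": 492.118261,
                   "tracer_from_z": 380.245332, "calls": 37},
}


def summarize(run):
    """Return the routine's share of init and total time, and seconds per call."""
    return {
        "share_of_init": run["tracer_from_z"] / run["init"],
        "share_of_total": run["tracer_from_z"] / run["total"],
        "init_share_of_total": run["init"] / run["total"],
        "seconds_per_call": run["tracer_from_z"] / run["calls"],
    }


summaries = {name: summarize(run) for name, run in runs.items()}
for name, s in summaries.items():
    print(f"{name}: {s['share_of_init']:.1%} of init, "
          f"{s['share_of_total']:.1%} of total, "
          f"{s['seconds_per_call']:.1f} s per call")
```

On these numbers the routine costs about 17 s per call in the 41-tracer run and about 10 s per call in the 37-tracer run, though the two runs used different code bases and so are not directly comparable.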
So this routine accounts for somewhere between 75% and 85% of initialization time (which is itself 50% to 70% of total runtime for short runs; @klindsay28 ran for 3 days with 41 tracers, I ran for 5 days with a slightly older code base that only had 37 tracers). We were hoping the cost was in the global communication in
call myStats(tr(:,:,k), missing_value, G, k, 'Tracer from ALE()')
but commenting that call out did not affect performance. We are remapping the initial conditions both horizontally (from a uniform 1° grid) and vertically (with ALE). It's not clear which of these steps the routine spends more time in, or whether there are remapping weights or something similar that we could compute once and pass in as an argument rather than recalculating them for each tracer.
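To make the "compute once, pass in as an argument" idea concrete, here is a minimal sketch using 1-D linear interpolation. It is not MOM6's horizontal or ALE remapping, and whether the real routine's weights are actually independent of the tracer is exactly what would need to be confirmed; `build_weights`, `apply_weights`, the grids, and the tracer names are all hypothetical.

```python
from bisect import bisect_right


def build_weights(src_x, dst_x):
    """Precompute (left index, right-hand weight) pairs for linear interpolation.

    src_x must be strictly increasing. Target points outside the source
    range are clamped to the nearest endpoint.
    """
    weights = []
    for x in dst_x:
        # Index of the source interval [src_x[i], src_x[i + 1]] containing x.
        i = min(max(bisect_right(src_x, x) - 1, 0), len(src_x) - 2)
        w = (x - src_x[i]) / (src_x[i + 1] - src_x[i])
        weights.append((i, min(max(w, 0.0), 1.0)))
    return weights


def apply_weights(weights, field):
    """Interpolate one field using weights that depend only on the grids."""
    return [(1.0 - w) * field[i] + w * field[i + 1] for i, w in weights]


src_x = [0.0, 1.0, 2.0, 3.0]   # source grid (hypothetical)
dst_x = [0.5, 1.25, 2.75]      # target grid (hypothetical)
tracers = {
    "tracer_a": [0.0, 10.0, 20.0, 30.0],
    "tracer_b": [5.0, 5.0, 7.0, 9.0],
}

weights = build_weights(src_x, dst_x)  # grid-dependent work, done once
remapped = {name: apply_weights(weights, field) for name, field in tracers.items()}
print(remapped)
```

The point of the split is that everything depending only on the source and target grids lives in `build_weights`, so the per-tracer cost is reduced to `apply_weights`. Any per-tracer work, such as filling missing values, would not benefit from this.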
This is a low priority, and we do not expect to speed up initialization times before the CESM 3.0.0 release, but when we are ready to tackle this, the first step should be to introduce more timers and determine what, exactly, is taking so long.
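The instrumentation pattern for that first step can be sketched as follows: wrap each phase of the per-tracer routine in its own named clock that accumulates across calls, then rank the phases by total time. This is a language-neutral illustration written in Python; in the model itself it would use the model's own clock facility. The phase names and the `PhaseTimers` class are hypothetical, and the sleeps merely stand in for real work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimers:
    """Accumulate wall-clock seconds and hit counts per named phase."""

    def __init__(self):
        self.seconds = defaultdict(float)
        self.hits = defaultdict(int)

    @contextmanager
    def clock(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.seconds[name] += time.perf_counter() - start
            self.hits[name] += 1

    def report(self):
        """Return (phase, seconds) pairs, slowest phase first."""
        return sorted(self.seconds.items(), key=lambda kv: kv[1], reverse=True)


def initialize_tracer(timers, horizontal_step, vertical_step):
    # Each phase gets its own clock so the totals can be compared.
    with timers.clock("horizontal_remap"):
        horizontal_step()
    with timers.clock("vertical_remap"):
        vertical_step()


timers = PhaseTimers()
for _ in range(3):  # stand-in for the loop over tracers
    initialize_tracer(timers,
                      lambda: time.sleep(0.02),    # placeholder workload
                      lambda: time.sleep(0.005))   # placeholder workload

for name, secs in timers.report():
    print(f"{name}: {secs:.3f} s over {timers.hits[name]} calls")
```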