Add tools.impacts: GMT-to-impact prediction toolkit via RIME emulators #479
Wegatriespython wants to merge 19 commits into
[edited] I was wondering if, in the guidelines on how to run existing scenarios with impacts, you could include what the requirements are. I understand that you need a specific scenario (ScenarioMIP BMT) to run impacts in buildings, and other requirements (an existing water module) to add or edit water impacts, right? You also need to have run MAGICC (also FAIR?), which should be very clear upfront. What if a user wanted to run the water module on an existing scenario and directly use this impact implementation? From my understanding, this is not ready to use with the current water CLI commands. It is fine if it will be in another PR. In particular for the water module impacts (impacts.py and cooling_impacts.py), we should think about whether we want to retire previous implementations, like this
Actually, I think it would be nice to get some user feedback before designing prematurely, because that would give valuable insight into how workflows other than this specific project are organised. Maybe you can give a proposed goal and we can see how it would work.
0d0274b to 715b617
715b617 to d124e41
adrivinca
left a comment
Thanks for the impressive work. It might take a bit to digest and polish.
I would like to clarify some doubts about the data, the overall structure and usability of the impacts workflow (also for possible projects beyond sparccle), and the interaction with other modules, like water.
Data:
- SSP1 & SSP2 only for building impacts? (data coefficients files)
- Data/Impact/rime: maybe subdivide by sectoral subfolders?
Conceptual questions:
- Why is build_wet_cooling_constraints implementing the impact differently than for the water cooling system? Is it because the source data is different?
- What is the difference if you implement the thermal cooling impacts on a scenario without water, compared with a scenario with water constraints? My guess is that there are differences, and the second route should also be possible in this framework.
Structure:
- It would be nice to have a complete generic workflow, like the one in the sparccle project (btw you misspelled it as "sparrcle"), that does not live under the project and can be used by any project. Basically, I suggest moving the sparccle workflow to a generic impact workflow where possible and leaving only the sparccle project-specific config.
Documentation:
- please see my previous comment on prerequisites (MAGICC), existing scenarios, etc.
- explain how to run individual impacts: building, power plants, water
- add information on the reporting possibilities for each impact stream
- explain what SSPs are covered and gaps
- based on the suggestion to change the structure, document how to run specific impacts, not just magicc-rime generically.
Then there are some utils that improve or partially replace existing ones. It may be worth @khaeru also having a look, so you can decide whether these utils should coexist or some should be updated:
- tools/impact/temporal.py/sample_to_model_years partially reinvents util/compat/message_data/update_fix_and_inv_cost.py/add_missing_years
- _demand_to_final_energy_iamc calls pint directly; util.convert_units exists
Existing:
util/_convert_units.py:16 — convert_units(data, unit_info, store) is a public utility that does exactly registry.Quantity(factor * data.values, unit_in).to(unit_out).magnitude on a pandas Series, using iam_units.registry.
New code (buildings/impacts.py:449-450):
registry.Quantity(df["value"].to_numpy(), "GWa/year")
.to(_FINAL_ENERGY_UNIT).magnitude
The APIs differ (convert_units is Series-oriented with a dict; the new code is a one-liner on an array), so the existing one can't be dropped in directly. But this is worth noting: the project already has a pint-based unit conversion utility and the new code essentially reimplements the same pint call. If _FINAL_ENERGY_UNIT or the source unit ever needs changing, there are now two divergent patterns to maintain.
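To make the maintenance concern concrete, here is a minimal sketch of the arithmetic behind the quoted pint call, kept in one place so both call sites could share it. The names `gwa_to_ej` and `ej_to_gwa` are hypothetical and not part of the PR. The sketch assumes that _FINAL_ENERGY_UNIT is EJ/yr and that a year is 365.25 days (pint's default definition), and it deliberately avoids importing the registry so that it runs standalone.

```python
from typing import Iterable, List

# Assumption: a Julian year of 365.25 days, pint's default definition of "year".
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # 31_557_600 s
# 1 GWa = 1e9 W * seconds per year, expressed in EJ (1 EJ = 1e18 J).
GWA_TO_EJ = 1e9 * SECONDS_PER_YEAR / 1e18


def gwa_to_ej(values: Iterable[float]) -> List[float]:
    """Convert GWa/year values to EJ/yr (hypothetical helper, not in the PR)."""
    return [v * GWA_TO_EJ for v in values]


def ej_to_gwa(values: Iterable[float]) -> List[float]:
    """Inverse conversion, EJ/yr to GWa/year (hypothetical helper)."""
    return [v / GWA_TO_EJ for v in values]
```

In the real code base the factor would presumably still come from the shared registry; the point is only that a single helper owns the source and target units.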
if not ts.empty:
    scen.check_out(timeseries_only=True)
    try:
        scen.add_timeseries(ts)
Is this approach also safe when the legacy reporting, the water reporting (here), or others are run afterwards? There are cases where this timeseries addition might be removed.
Let's maybe explore the possible use cases for impacts and make sure reporting is consistent.
So for this, besides passing files manually, I think we have to rely on subsequent callers being careful not to remove timeseries wholesale. We can address water reporting's own removal, but in general I think this might be a potential issue.
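For reference, the check-out pattern from the quoted diff can be sketched as below. `FakeScenario` is a stand-in written for this example so the snippet runs without a database; only the method names `check_out`, `add_timeseries`, `commit`, and `discard_changes` mirror the ixmp interface. `persist_timeseries` is a hypothetical helper name, and plain lists stand in for the data frame.

```python
class FakeScenario:
    """Stand-in for an ixmp scenario; records calls instead of using a database."""

    def __init__(self, fail=False):
        self.fail = fail
        self.stored = []
        self.log = []

    def check_out(self, timeseries_only=False):
        self.log.append("check_out")

    def add_timeseries(self, rows):
        if self.fail:
            raise ValueError("simulated write failure")
        self.stored.extend(rows)

    def commit(self, comment):
        self.log.append("commit")

    def discard_changes(self):
        self.log.append("discard")


def persist_timeseries(scen, rows, comment="Add CID timeseries"):
    """Write rows inside a check-out; discard the check-out if the write fails."""
    if not rows:
        return False  # nothing to write, scenario left untouched
    scen.check_out(timeseries_only=True)
    try:
        scen.add_timeseries(rows)
    except Exception:
        scen.discard_changes()
        raise
    scen.commit(comment)
    return True
```

This only guards the write itself. It does not protect against a later reporting step removing the timeseries, which is the open concern in this thread.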
# MESSAGE demand parameter expects GWa
_EJ_TO_GWA = registry("1 EJ").to("GW * year").magnitude


_REFERENCE_SCENARIO = "SSP2"
I think this should not be hard-coded.
Maybe move it to a workflow setup.
So the scenario is no longer hardcoded. However, for gamma (which calibrates between STURM and CHILLED) I only have SSP2 runs, so for now this has to stay hardcoded here until we get more STURM runs.
Without yet diving into the big diff, the summary Adriano wrote does seem to be on-point. Please reach out if I can help discuss how to improve existing utils to support the applications you have in this PR.
Thanks @adrivinca and @khaeru. Follow-up pass to address the comments.
Not done:
Reporting/timeseries: buildings and cooling CID inputs persisted as — |
adrivinca
left a comment
just minor comments on the doc
GmtArray + load_magicc_gmt + persist_gmt_mean in climate; cached open_rime_dataset, clip_gmt, and predict_rime with linearity guard in rime; sample_to_model_years (point/average/interpolate, long-form) in temporal; frame_to_iamc helper in tools/iamc; extract_region_code in util/node (callers in water_for_ppl and water/report migrate from inline split). API doc page tools-impacts.rst.
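As an illustration of the three resampling modes named for sample_to_model_years (point, average, interpolate), here is a minimal sketch on plain dictionaries. It is not the PR's implementation: `resample` and its helpers are hypothetical names, and the averaging window (the years after the previous model year up to and including the labelled year) is an assumption about the period convention.

```python
from statistics import mean


def _point(annual, year, prev_year):
    # Take the value at the model year itself.
    return annual[year]


def _average(annual, year, prev_year):
    # Assumed period: years after the previous model year, up to `year`.
    start = year if prev_year is None else prev_year + 1
    return mean(annual[y] for y in range(start, year + 1) if y in annual)


def _interpolate(annual, year, prev_year):
    # Linear interpolation between the nearest available years.
    if year in annual:
        return annual[year]
    lo = max(y for y in annual if y < year)
    hi = min(y for y in annual if y > year)
    w = (year - lo) / (hi - lo)
    return annual[lo] * (1 - w) + annual[hi] * w


_METHODS = {"point": _point, "average": _average, "interpolate": _interpolate}


def resample(annual, model_years, method="point"):
    """Map {year: value} annual data onto sorted, possibly uneven model years."""
    func = _METHODS[method]
    out = {}
    prev = None
    for year in sorted(model_years):
        out[year] = func(annual, year, prev)
        prev = year
    return out
```

With a linear series `{2020: 0.0, ..., 2040: 20.0}` and model years 2020, 2025, 2030, 2040, the point method picks the labelled years, while the average method returns the mean of each preceding window.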
compute_building_cids and apply_building_cids drive the RIME
EI-vs-GWL emulator into MESSAGE residential/commercial demand,
with theta-calibrated reference scenarios and sector_fractions
from STURM. CID timeseries persist as Final Energy|Residential
and Commercial|{Cooling,Heating} in EJ/yr.
Packaged data: theta_{cool,heat}_SSP{2,3}.csv,
rc_sector_fractions_SSP{2,3}.csv, sturm_floor_area_R12_{resid,comm}.csv,
correction_coefficients_{cool,heat}_SSP2_{resid,comm}.csv,
region_EI_{cool,heat}_gwl_binned.nc.
build_wet_cooling_constraints emits relation_activity rows that bound
freshwater-using thermoelectric variants by per-region capacity-factor
ratios under warming (Li et al. 2025); build_dry_cooling_factors
rescales air-cooled capacity factors (Qin et al. 2023). Wet relations
named wet_cooling_cf_{parent}; CID timeseries persist as Physical
Climate Impact|Thermoelectric Cooling|{Freshwater,Dry}*.
Packaged: r12_thermoelectric_gwl.nc, r12_capacity_gwl_ensemble.nc.
model/water/data/impacts.py: predict_water_rime drives basin-level RIME emulators, with seasonal2step bifurcation when applicable, and expands to MESSAGE basin-region rows. model/water/utils.py: load_basin_mapping (cached), split_basin_macroregion, NAN_BASIN_IDS / N_RIME_BASINS / N_MESSAGE_BASINS constants. Packaged: joint_bifurcation_mapping_CWatM_2step.csv, rime_regionarray_local_temp_CWatM_seasonal2step_window11.nc, rime_regionarray_qr_CWatM_annual_window11.nc.
project/sparrcle wires Phase 1 (water-ix cooling subprocess) and Phase 2 (buildings + cooling CIDs) into a Workflow with one base/cooling/CI_b/CI_p/CI_bp step set per starter. validate_inputs preflights MAGICC outputs, RIME datasets, and per-SSP buildings calibration files so a misconfigured run fails before any clone. CLI: mix-models sparrcle run TARGET (--from / --go).
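The fail-early behaviour described for validate_inputs can be sketched as follows. This is a hedged illustration rather than the PR's code: `preflight` is a hypothetical name, and it takes a flat list of paths, whereas the real function works from the workflow config.

```python
from pathlib import Path
from typing import Iterable


def preflight(paths: Iterable[Path]) -> None:
    """Raise one FileNotFoundError naming every missing path, else return None."""
    missing = [str(p) for p in paths if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError(
            "Missing required inputs:\n" + "\n".join(f"  - {m}" for m in missing)
        )
```

Collecting all missing files before raising means a misconfigured run reports every problem at once instead of one per attempt.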
Drop test_sparrcle.py (trivial click registration check) and test_sparrcle_workflow.py (graph-key assertions with load_config and validate_inputs both monkeypatched away).
Replace the permanently-skipped theta calibration test in test_building_impacts.py with two in-memory tests of prepare_building_demand that run unconditionally in CI:
- test_prepare_building_demand_substitution_arithmetic: verifies the substitution formula new = old*(1-frac) + cid against known inputs
- test_prepare_building_demand_missing_cid_zerofills: verifies that an empty CID frame contributes zero rather than NaN
Add netCDF4 and plotnine as dev dependencies (needed to run the remaining predict_building_ei and package-import paths in tests).
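The substitution formula and the zero-fill behaviour that these two tests check can be sketched on plain dictionaries. `substitute_demand` is a hypothetical name, the (region, year) keys and numbers are made up for illustration, and the real prepare_building_demand presumably operates on data frames.

```python
def substitute_demand(old, frac, cid=None):
    """Apply new = old * (1 - frac) + cid per key; a missing CID counts as zero."""
    cid = cid or {}
    return {
        key: value * (1.0 - frac[key]) + cid.get(key, 0.0)
        for key, value in old.items()
    }


# Illustrative inputs only: one region-year with 25% of demand substituted.
key = ("R12_AFR", 2030)
with_cid = substitute_demand({key: 100.0}, {key: 0.25}, {key: 30.0})
without_cid = substitute_demand({key: 100.0}, {key: 0.25})
```

Here `with_cid[key]` is 100 * 0.75 + 30 = 105.0, and `without_cid[key]` is 75.0 rather than NaN because the absent CID is treated as zero.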
CI was failing on tests that open RIME NetCDF files via xarray (test_ratio_* in test_cooling_impacts.py, test_warming_* in test_building_impacts.py) because no NetCDF backend was installed. The files exist in the repo via LFS; the skip guard only checked existence, not openability.
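A skip guard that checks openability rather than mere existence could look like the sketch below. `can_open` is a hypothetical helper, and the opener is passed in as a callable so the example runs without a NetCDF backend.

```python
from pathlib import Path


def can_open(path, opener):
    """Return True only if `path` exists and `opener(path)` succeeds."""
    path = Path(path)
    if not path.is_file():
        return False
    try:
        opener(path)
    except Exception:
        # Covers a missing backend, an LFS pointer file, or a corrupt file.
        return False
    return True
```

In a test module this could feed `pytest.mark.skipif`, with an opener such as `lambda p: xarray.open_dataset(p).close()`. A guard like this would turn a missing backend into a silent skip, which is a reason to also keep the backend as a declared dev dependency, as this commit does.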
9f8bac0 to 0fc75ae
This PR introduces the tools.impacts library and implements it for SPARCCLE project workflows.
This PR does two things:
A) It introduces the tools.impacts module and adds impacts.py files for the domains buildings, cooling_impacts, and water. Compared to #466, it refreshes the source data files for RIME, replacing downscaled data from Jones et al. 2025 with grid-level computed impacts data based on Li et al. 2025 for wet cooling and Qin et al. 2023 for dry cooling.
B) It implements SPARRCLE project-specific workflows using the Workflow class, covering physical-impact integration into a range of SPARRCLE scenarios.
How to review
Five commits on main:
- 2e5d5c6b1: tools/impacts/ and tools/iamc/ helpers; toolkit only, no domain code.
- 0c106e9f0: model/buildings/impacts.py (full implementation) plus per-SSP theta_*, rc_sector_fractions_*, and correction_coefficients_* calibration data.
- dc20285f5: model/water/data/cooling_impacts.py; wet cooling via relation_activity, dry cooling via capacity_factor.
- 2445a7dbd: model/water/data/impacts.py and basin geometry helpers. Water is opt-in (code path exists; not wired into the active workflow).
- d124e411c: project/sparrcle/{cli.py, scenario_config.yaml, workflow.py}; Workflow-driven graph and mix-models sparrcle run CLI.
Toolkit layer (tools/impacts/, tools/iamc/)
- rime.py: open_rime_dataset(filename) entry. Linearity check gates ensemble-mean reliability.
- climate.py: GmtArray NamedTuple, gmt_ensemble / gmt_expectation, load_magicc_gmt(magicc_dir, n_runs) (reads upstream MAGICC ensembles), persist_gmt_mean(scen, gmt).
- temporal.py: sample_to_model_years resamples annual data to MESSAGE non-uniform timesteps via match-dispatched methods (point / average / forward-fill / interpolate); accepts wide or long input.
- tools/iamc/__init__.py: frame_to_iamc(df, variable, unit, *, region_col), a shared (region, year, value) → IAMC helper used by both buildings and cooling.
Domain layer (model/buildings/, model/water/data/)
- model/buildings/impacts.py: rc_spec / rc_therm demand reshaping under warming via theta transfer functions and SSP-specific correction coefficients (SSP2, SSP3). apply_building_cids writes the reshaped demand into the scenario and persists `Final Energy
- model/water/data/cooling_impacts.py: apply_cooling_cids: wet cooling via wet_cooling_cf_{parent} relation_activity rows on *__ot_fresh / *__cl_fresh variants and dry cooling via capacity_factor derating. IAMC ratios persisted under `Physical Climate Impact
- model/water/data/impacts.py: split_basin_macroregion. Water is a code path only; not wired into the active SPARRCLE workflow.
SPARRCLE project (project/sparrcle/)
- workflow.py: generate(...) builds a Workflow.add_step graph: per-starter base → cooling (Phase 1, subprocess mix-models water-ix cooling) → CI_b / CI_p / CI_bp (Phase 2 CID variants). validate_inputs(config) preflight raises a single FileNotFoundError listing every missing MAGICC output, buildings calibration file, or RIME dataset before any clone is made.
- cli.py: mix-models sparrcle group; run selects a target subgraph, --from truncates, --go executes.
- scenario_config.yaml: _GDP_CI_50_ensemble_* starter rows across SSP1/SSP2/SSP3.
PR checklist
- doc/api/tools-impacts.rst, docstrings, citations)
- doc/whatsnew.rst