
Water/Nexus: Desalination & Historical Activity Fixes #522

Open
Wegatriespython wants to merge 22 commits into iiasa:main from Wegatriespython:wr/pr-b-water-nexus-fixes

@Wegatriespython

Contributor

This PR fixes a set of water-supply constraint and historical-calibration issues in the nexus module that surfaced while validating the data refresh in #513. The model-behaviour changes are collected here:

  • Desalination:
    • Saline extraction capacity (bound_total_capacity_up on extract_salinewater) now defaults to zero in basins with no desalination projection, instead of being left unbounded.
    • Activity floors (bound_activity_lo) for membrane and distillation are clipped proportionally so their combined saline-water demand cannot exceed the shared extraction cap.
    • New-capacity growth is limited (growth_new_capacity_up = 0.10/yr on membrane and distillation) so that capacity expands smoothly. This was set to remove the saw-tooth desalination numbers seen for WEU basins; the value was chosen by eye.
  • Historical supply calibration:
    • New hist_dispatch module seeds historical_activity for freshwater extraction technologies by merit-order dispatch of historical demand (including irrigation) across surface water, renewable groundwater, and fossil groundwater.
    • The seed respects the groundwater sustainability share constraint. The share-constraint definition itself moved to a shared groundwater_share_floor helper to prevent drift between its two places of use.
    • Renewable groundwater gets the same activity growth bound (growth_activity_up) as surface water. Fossil groundwater use is higher than in the previous version of the model, which is realistic given global aquifer depletion; however, this is an incidental result.

How to review

@adrivinca
This PR is based on #513 and can be merged only after it. Review the commits above the data-refresh base.

PR checklist

  • Continuous integration checks all ✅
  • Add or expand tests; coverage checks both ✅
  • Add, expand, or update documentation.
  • Update doc/whatsnew.

Remove the urban/rural connected-disconnected split from the optimizer.
Demand routes the full municipal volume through urban_mw/rural_mw;
connected/unconnected attribution is reconstructed at reporting time
as ACT_t_d * connection_rate.

Validated against the prior split formulation: OBJ delta <0.005%,
electricity price delta <0.01% across SSP2/SSP3.
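
The reporting-time reconstruction above can be sketched as follows. This is only an illustration of `ACT_t_d * connection_rate`; the function name, the dict layout, and the basin labels are hypothetical and not taken from report.py.

```python
def split_by_connection(act, connection_rate):
    """Split total municipal activity per basin into connected/unconnected parts.

    act: {basin: activity volume}
    connection_rate: {basin: share of the volume served by connected systems}
    Both inputs are hypothetical stand-ins for the reported model output.
    """
    result = {}
    for basin, volume in act.items():
        connected = volume * connection_rate[basin]
        result[basin] = {
            "connected": connected,
            # The remainder is attributed to unconnected supply.
            "unconnected": volume - connected,
        }
    return result


example = split_by_connection({"B1": 10.0, "B2": 4.0}, {"B1": 0.8, "B2": 0.5})
```

Because both parts are derived from one activity value, they always add back to the full municipal volume that the optimizer saw.
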
The CLI passes sdgs as the string "baseline" or "SDG"; the previous
`if sdgs` truthiness check sent every baseline call down the SDG path
and returned an empty rates DataFrame. Check explicitly for True or
"SDG".
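
A minimal sketch of why the truthiness check misfired and what an explicit check looks like. The helper name `use_sdg_rates` is hypothetical; only the accepted values (True or "SDG") come from the commit message.

```python
def use_sdg_rates(sdgs) -> bool:
    """Return True only for the explicit SDG selectors."""
    # Any non-empty string is truthy, so `if sdgs:` also accepts "baseline".
    return sdgs is True or sdgs == "SDG"


# The old check: "baseline" is a non-empty string and therefore truthy.
old_check_takes_sdg_path = bool("baseline")
```
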
Switch scaind from 1 (auto-scale) to -1 (no scaling) for both solve
paths. Auto-scaling produced `LP status (5): optimal with unscaled
infeasibilities` on production baselines.

Shared helpers for distributing country- or region-level water inputs
across MESSAGE basins. Loads country-basin overlap, derives stable
basin shares from existing basin-level time series, and joins totals
to shares for downstream use by the per-domain generators that follow.

Refs #535.
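
As a rough illustration of the share-then-join idea, the sketch below normalises basin-level values within each region and applies the resulting shares to a regional total. It assumes shares are simply proportional to an existing basin-level quantity; the real basin_allocation helpers may derive them differently, and all names here are hypothetical.

```python
def basin_shares(basin_values):
    """Normalise values keyed by (region, basin) so each region sums to 1."""
    totals = {}
    for (region, _basin), value in basin_values.items():
        totals[region] = totals.get(region, 0.0) + value
    return {
        key: value / totals[key[0]]
        for key, value in basin_values.items()
        if totals[key[0]] > 0  # skip regions with no basis for a share
    }


def allocate(region_totals, shares):
    """Join regional totals to basin shares."""
    return {
        (region, basin): region_totals[region] * share
        for (region, basin), share in shares.items()
    }


shares = basin_shares({("WEU", "B1"): 30.0, ("WEU", "B2"): 10.0})
allocated = allocate({"WEU": 200.0}, shares)
```
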
Generate the R12 x SSP x (urban, rural) connection-rate baseline CSVs
from "Improved water services" rows in two files under
data/water/demands/drinking_water_access/:
projections_people_UR_income_10_25.csv (with urban/rural split, for
AFR, EEU, FSU, LAM, MEA, PAS, RCPA, SAS, WEU) and
projections_people_merge_countries_10_25(in).csv (no split, for NAM
and CHN, broadcast to both settings). PAO is hard-coded to 0.99.

R12 rates are population-weighted, carried backward to fill early
target years and capped forward at 2090 for 2100/2110, then broadcast
uniformly to basin columns. The basin column order comes from
connection_rate_basins_R12.csv.

SSP2 connection-rate CSVs are regenerated from the same pipeline.
SSP1, SSP3, SSP4, SSP5 connection-rate CSVs are net-new.

Refs #535.
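
The weighting and year-filling steps can be sketched like this. It is one reading of "carried backward" and "capped forward at 2090", not the code in generate_access_rates; function names and the input layout are assumptions.

```python
def weighted_rate(rows):
    """Population-weighted mean rate.

    rows: list of (rate, population) pairs for one region and year.
    """
    total_pop = sum(pop for _rate, pop in rows)
    return sum(rate * pop for rate, pop in rows) / total_pop


def fill_target_years(rates, target_years, last_year=2090):
    """Fill model years from a {year: rate} mapping.

    Years before the first data year reuse the first value (carried
    backward); years after `last_year` reuse the `last_year` value.
    """
    first_year = min(rates)
    filled = {}
    for year in target_years:
        lookup = min(max(year, first_year), last_year)
        filled[year] = rates[lookup]
    return filled


rate = weighted_rate([(0.9, 100.0), (0.5, 300.0)])
filled = fill_target_years({2020: 0.6, 2090: 0.9}, [2010, 2020, 2090, 2100, 2110])
```
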
Regenerate R12 x SSP x {urban, rural, manufacturing} x {withdrawal,
return} demand CSVs from Khan 2022 basin-level withdrawal projections
(/mnt/p/ene.model/NEST/water_demands/Khan2022,
doi:10.1038/s41597-023-02086-2). Withdrawals come directly from the
source; returns are withdrawal x per-basin return/withdrawal ratio,
read from new input files return_ratio_{urban_domestic, rural,
manufacturing}.csv. urban_withdrawal / urban_return combine domestic
and manufacturing.

SSP1, SSP3, SSP4, SSP5 demand CSVs are net-new.

Generator at pre_processing/generate_sectoral_demands.py.

Rename urban_withdrawal2 / urban_return2 -> urban_*_domestic across
R11, R12, ZMB, and old_R11 harmonized demand CSVs.

Refs #535.

Desalination capacity is socio-economic, not climate-conditioned.
The projected potential CSVs (R11, R12, ZMB) now carry an `ssp`
column derived from Marina's country-level source data, and
`add_desalination` filters on `context.ssp` instead of `cfg.RCP`.
SSP1, SSP3, SSP5 are taken directly from the source; SSP2 inherits
SSP1 and SSP4 inherits SSP3 by assignment.

R12 historical capacity is refreshed from the same source. R12 ships
a basin-allocation template
(desalination_basin_allocation_template_R12.csv) used by the new
generator at pre_processing/generate_desalination.py.

`test_infrastructure.py` parametrizations gain an explicit `ssp=SSP2`
so the SSP-keyed filter resolves.

Refs #535.
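
The SSP assignment described above amounts to a small lookup followed by a filter. The sketch below shows that logic on plain dict rows; the row layout and helper names are hypothetical, and only the mapping itself (SSP2 from SSP1, SSP4 from SSP3) comes from the commit message.

```python
# Which source SSP each model SSP is taken from.
SOURCE_SSP = {
    "SSP1": "SSP1",
    "SSP2": "SSP1",  # inherits SSP1 by assignment
    "SSP3": "SSP3",
    "SSP4": "SSP3",  # inherits SSP3 by assignment
    "SSP5": "SSP5",
}


def expand_ssps(rows):
    """Duplicate source rows so every model SSP has its own `ssp` value."""
    expanded = []
    for target, source in SOURCE_SSP.items():
        expanded.extend({**row, "ssp": target} for row in rows if row["ssp"] == source)
    return expanded


def filter_by_ssp(rows, ssp):
    """Keep only the rows for the scenario's SSP."""
    return [row for row in rows if row["ssp"] == ssp]


source_rows = [
    {"ssp": "SSP1", "value": 1.0},
    {"ssp": "SSP3", "value": 3.0},
    {"ssp": "SSP5", "value": 5.0},
]
all_rows = expand_ssps(source_rows)
```
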
R12 hydro availability CSVs (qtot, qr, e-flow, 5-yr and monthly
variants) for 2p6, 7p0, and 8p5 are regenerated from the CWaTM
SSP-keyed percentile pipeline. Source data lives at
/mnt/p/watxene/ISIMIP_postprocessed/data_for_vignesh/message_nexus_input_2026/
(5 GCMs x 3 SSPs of qr_monthly + qtot_daily futures). Generator added
at pre_processing/generate_hydro_availability.py.

Three basins (30|FSU, 51|FSU, 154|FSU) are 100% NaN across all 5 GCMs
in the refreshed source; `filter_basins_by_region` excludes them
unconditionally so the water build never sees zero-availability rows
for them.

`compute_basin_demand_ratio` switches its supply baseline from
qtot/qr_5y_no_climate_low to qtot/qr_5y_2p6_low so the reduced-basin
ranking remains stable across the run's RCP choice. The same function
reads the renamed urban_*_domestic demand file introduced earlier in
this PR.

Refs #535.

The R12 input data shipped earlier in this PR is now produced by the
Python generators under pre_processing/ (generate_access_rates,
generate_sectoral_demands, generate_desalination,
generate_hydro_availability) on top of basin_allocation. The legacy
scripts (desalination.R, generate_water_constraints.R, hydro_agg_basin.py,
hydro_agg_raster.py, hydro_agg_spatial.R) have no remaining role and
are removed.

doc/water/index.rst is updated to drop the entries for the retired
scripts, add entries for basin_allocation and the four new generators,
and remove the stale "Deprecated R Code" section pointing at a removed
directory.

Refs #535.

CWaTM source data (refreshed earlier in this PR) covers 2p6, 7p0, and
8p5 only. The synthetic no_climate baseline has no source equivalent
and no defensible RCP analogue, so it is dropped from the CLI choices
and from the R12 availability data set.

`_RCPS` in cli.py is now ["2p6", "7p0", "8p5"]; the `--rcps` defaults
and help strings for `nexus` and `cooling` flip to 2p6. `Config.RCP`
default flips to "2p6". R12 no_climate availability CSVs are removed.

test_water_data.py drops no_climate from the R11-only legacy exclusion
list so the R11<->R12 parity check still passes. test_build.py picks
7p0 as the build-test RCP in place of the retired 6p0.

R11 and ZMB availability data are untouched; their legacy RCPs (6p0,
no_climate) remain available for R11 regression coverage.

Refs #535.

The cooling-impact dataset
(power_plant_cooling_impact_MESSAGE_*_{2p6,6p0,7p0}.csv) covered only
the legacy RCPs and was gated behind `cfg.RCP == "no_climate"`. With
no_climate now removed and 8p5 added (neither has impact coverage),
the gated branch in `_make_capacity_factor` is unreachable for any
defensible RCP choice. The branch is removed; the function now
returns the dimensionless capacity_factor unchanged. The eleven
cooling-impact CSVs are deleted.

`test_water_for_ppl.py` drops its no_climate parametrizations.

Refs #535.

add_sectoral_demands now resolves ssp = context.ssp.lower() and reads
withdrawal / return CSVs matching {ssp}_regional_*.csv, picking up the
SSP1, SSP3, SSP4, SSP5 demand data shipped earlier in this PR instead
of the SSP2 fallback. Treatment-rate and recycling-rate CSVs are read
from ssp2_regional_*_rate_baseline.csv regardless of the current SSP,
since rate data is not differentiated across SSPs. The monthly path
({ssp}_m_water_demands.csv) is keyed by current SSP the same way.

The in-memory variable strings (urban_withdrawal2_baseline,
urban_return2_baseline) are renamed to urban_*_domestic_baseline to
match the CSV-side rename that landed earlier; the prior strings no
longer matched any variable in the loaded data and were silently
returning empty slices.

Known gap: ssp{n}_m_water_demands.csv exists only for SSP2/ZMB. A
non-SSP2 run with sub-annual time slices will raise FileNotFoundError
on the monthly path. R12 has no monthly file even for SSP2, so the
sub-annual + R12 path was already broken; this commit makes the
failure mode explicit per-SSP rather than introducing it.

Refs #535.

Membrane and distillation both draw on the extract_salinewater_basin
upstream technology, so their per-technology bound_activity_lo floors
could sum above that technology's bound_total_capacity_up and make the LP
infeasible. Scale the floors down proportionally per node-year when their
sum exceeds the cap, and restrict the floors to the model horizon.
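
For one node-year, the proportional clipping can be sketched as below. It assumes the floors and the cap are expressed in the same units (one unit of desalination activity per unit of saline extraction); any input coefficient between the two would have to be applied first. The function name is hypothetical.

```python
def clip_floors(floors, cap):
    """Scale lower bounds down proportionally so their sum fits under `cap`.

    floors: {technology: bound_activity_lo value} for one node-year
    cap: shared upper bound on the upstream extraction technology
    """
    total = sum(floors.values())
    if total <= cap:
        return dict(floors)  # already feasible, leave untouched
    scale = cap / total  # total > cap >= 0, so total is positive here
    return {tech: value * scale for tech, value in floors.items()}


clipped = clip_floors({"membrane": 6.0, "distillation": 4.0}, cap=5.0)
untouched = clip_floors({"membrane": 1.0, "distillation": 2.0}, cap=5.0)
```

Scaling both floors by the same factor keeps the membrane/distillation ratio of the original floors while removing the infeasibility.
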
The projected desalination potential covers only part of the basin set.
Basins without a row left extract_salinewater_basin capacity effectively
unbounded, so reduced-basin runs routed unlimited desalination to them.
Backfill the missing basin-years with a zero capacity bound, so absence of
a projection means no desalination rather than unlimited.
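
The backfill is essentially a default of zero over the full basin-year grid. A minimal sketch, with hypothetical names and a plain dict in place of the model's parameter table:

```python
def backfill_zero(bounds, basins, years):
    """Return a bound for every basin-year, using 0.0 where none is projected.

    bounds: {(basin, year): capacity bound} from the projection data
    """
    return {
        (basin, year): bounds.get((basin, year), 0.0)
        for basin in basins
        for year in years
    }


full = backfill_zero({("B1", 2030): 12.5}, basins=["B1", "B2"], years=[2030, 2040])
```
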
Add growth_new_capacity_up on membrane and distillation to smooth the
vintage-replacement build pattern.

Surface-water extraction carried a growth_activity_up bound with no
historical_activity baseline, which pinned its activity at zero across the
whole horizon. Seed historical_activity for the basin extraction techs
(surface water, groundwater, fossil groundwater) by dispatching historical
freshwater demand against per-basin capacity in ascending operating-cost
order. Demand sums sectoral withdrawals and GLOBIOM irrigation. Surface
water is held below each basin's groundwater sustainability-floor share so
the seed stays consistent with the in-horizon share constraint.
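
A simplified sketch of merit-order dispatch for a single basin and year. The surface-water limit used here, (1 - floor share) x demand, is an assumption made for illustration; the actual formula lives in cap_surfacewater_for_gw_floor and may differ. Technology labels, costs, and the function name are hypothetical.

```python
def dispatch(demand, techs, gw_floor_share=0.0):
    """Fill `demand` from the cheapest technology upwards.

    techs: list of (name, capacity, operating_cost) tuples
    gw_floor_share: minimum groundwater share (assumed form of the floor)
    Returns {name: seeded historical activity}.
    """
    remaining = demand
    activity = {}
    for name, capacity, _cost in sorted(techs, key=lambda tech: tech[2]):
        limit = capacity
        if name == "surface":
            # Assumption: surface water may cover at most the non-groundwater share.
            limit = min(capacity, (1.0 - gw_floor_share) * demand)
        used = min(remaining, limit)
        activity[name] = used
        remaining -= used
    return activity


techs = [("fossil", 1000.0, 2.4), ("surface", 80.0, 1.0), ("groundwater", 30.0, 2.0)]
low = dispatch(100.0, techs, gw_floor_share=0.25)
high = dispatch(200.0, techs, gw_floor_share=0.25)
```

In the low-demand case the floor, not capacity, limits surface water; in the high-demand case capacity binds first and fossil groundwater picks up the residual.
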
The merit-order seed now anchors both surface water and groundwater on
historical_activity. Surface water already carried growth_activity_up =
0.02/yr; extend the same bound to extract_groundwater so the LP cannot
ramp groundwater arbitrarily once seeded while surface water stays held by
its ceiling. extract_gw_fossil is left uncapped as the residual backstop.

This is a dynamic constraint that compounds over the period length, so it
is gated on the cluster build+solve rather than the data-layer tests.
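
To see why the bound compounds, consider the ceiling it implies from one model period to the next. The sketch below uses the simplified form previous x (1 + rate)^years and ignores any initial_activity_up term; it illustrates the arithmetic rather than reproducing the MESSAGE equation.

```python
def growth_ceiling(previous_activity, annual_rate, period_years):
    """Upper limit on next-period activity under a compounding annual growth bound."""
    return previous_activity * (1.0 + annual_rate) ** period_years


five_year = growth_ceiling(100.0, 0.02, 5)
ten_year = growth_ceiling(100.0, 0.02, 10)
```

With 0.02/yr, a 5-year period allows roughly a 10.4% increase and a 10-year period roughly 21.9%, so the effective bound depends on the period structure of the scenario.
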
Keep the merit-order/constraint drift rationale once in the
groundwater_share_floor docstring; shorten the utils.py constants comment
and the hist_dispatch call-site comment to pointers. State the
growth_activity_up share invariant timelessly, drop the stale
'reduced from 50 to 5' trailing comment, and state the surface-water
cap formula once in cap_surfacewater_for_gw_floor.

Replace the arbitrary discouragement-knob costs on extract_gw_fossil (inv_cost
1.5x desalination, a separate fix_cost and var_cost, 5-year lifetime, 5x
electricity intensity) with a flat 20% premium over renewable groundwater on the
dimensions it shares: inv_cost (54.52 * 1.2), electricity multiplier 1.2, and
lifetime parity (20 years). Renewable groundwater carries no fix_cost or
var_cost, so fossil now carries none either; the unused var-cost constant is
removed. The merit-order historical seed ranks fossil above renewable by the
1.2x electricity multiplier alone, leaving the seeded historical_activity
unchanged.
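
The flat premium reduces to a single multiplier applied to the shared dimensions. A small sketch of the arithmetic, with hypothetical constant names; only the values 54.52, 1.2, and 20 years come from the commit message.

```python
GW_INV_COST = 54.52      # renewable groundwater inv_cost
GW_ELEC_MULTIPLIER = 1.0  # renewable groundwater as the reference
GW_LIFETIME = 20          # years
FOSSIL_PREMIUM = 1.2      # flat 20% premium

fossil = {
    "inv_cost": GW_INV_COST * FOSSIL_PREMIUM,
    "elec_multiplier": GW_ELEC_MULTIPLIER * FOSSIL_PREMIUM,
    "lifetime": GW_LIFETIME,  # parity, no premium on lifetime
}

# Merit order by electricity multiplier alone: renewable first, fossil second.
ranking = sorted(
    [("extract_gw_fossil", fossil["elec_multiplier"]),
     ("extract_groundwater", GW_ELEC_MULTIPLIER)],
    key=lambda item: item[1],
)
```
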
@Wegatriespython force-pushed the wr/pr-b-water-nexus-fixes branch from d4573ac to a9fd7f1 on June 10, 2026 13:18
@Wegatriespython added the "safe to test" label (PRs from forks that do not pose security risks) on Jun 10, 2026
@codecov

codecov Bot commented Jun 10, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.0%. Comparing base (b6511ef) to head (a9fd7f1).

Additional details and impacted files
@@           Coverage Diff           @@
##            main    #522     +/-   ##
=======================================
- Coverage   74.2%   74.0%   -0.2%     
=======================================
  Files        320     324      +4     
  Lines      25655   25930    +275     
=======================================
+ Hits       19047   19209    +162     
- Misses      6608    6721    +113     
Files with missing lines Coverage Δ
message_ix_models/model/water/cli.py 33.0% <ø> (ø)
message_ix_models/model/water/config.py 100.0% <ø> (ø)
message_ix_models/model/water/data/__init__.py 70.3% <ø> (+1.1%) ⬆️
message_ix_models/model/water/data/demands.py 69.7% <ø> (-0.9%) ⬇️
...essage_ix_models/model/water/data/hist_dispatch.py 39.4% <ø> (ø)
...ssage_ix_models/model/water/data/infrastructure.py 100.0% <ø> (ø)
...odel/water/data/pre_processing/basin_allocation.py 88.4% <ø> (ø)
...essage_ix_models/model/water/data/water_for_ppl.py 91.2% <ø> (-0.4%) ⬇️
message_ix_models/model/water/data/water_supply.py 76.8% <ø> (-0.2%) ⬇️
message_ix_models/model/water/report.py 15.9% <ø> (-0.8%) ⬇️
... and 8 more
