Feature/ramp by AbdelrahmanKatkat · Pull Request #39 · hotosm/fAIr-models

AbdelrahmanKatkat · 2026-04-14T09:14:16Z

What does this PR do?

Adds the ramp model pack with STAC metadata, Docker stages, README, and ZenML training / evaluation / export pipeline support.
Integrates the RAMP workflow with the fAIr platform conventions for hyperparameters, artifact materializers, MLflow logging, promotion, and inference.

…mantic segmentation model
- Introduced Dockerfile for building the RAMP model environment with GPU and CPU support.
- Added pipeline.py for defining the ZenML pipeline, including preprocessing, training, inference, and postprocessing steps.
- Created README.md to document the model architecture, usage, and data layout.
- Implemented stac-item.json for STAC catalog integration.
- Included smoke tests to validate the Docker runtime and model functionality.
- Updated .gitignore to exclude new data directories.

… loading
- Updated pipeline.py to load hyperparameters from STAC Item JSON, streamlining model configuration.
- Modified training_pipeline to accept a path to the STAC Item, allowing for flexible hyperparameter management.
- Revised README.md to reflect changes in hyperparameter handling and usage of STAC Item.
- Enhanced stac-item.json with additional metadata and structure for better integration.
- Removed outdated CODE_EXPLAINED.md and README.md from tests directory to clean up documentation.
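As a rough illustration of loading hyperparameters from a STAC Item, the sketch below reads the JSON file and merges optional caller overrides on top of the stored defaults. The property key `hyperparameters`, the function name `load_hyperparameters`, and the `overrides` argument are assumptions made for this example; the actual key and loader in pipeline.py are not shown in this conversation.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional, Union

# Hypothetical property key; the real key used in stac-item.json is not shown here.
HYPERPARAMETERS_KEY = "hyperparameters"


def load_hyperparameters(
    stac_item_path: Union[str, Path],
    overrides: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Read default hyperparameters from a STAC Item file and apply overrides."""
    item = json.loads(Path(stac_item_path).read_text(encoding="utf-8"))
    properties = item.get("properties", {})
    if HYPERPARAMETERS_KEY not in properties:
        raise KeyError(f"STAC Item has no '{HYPERPARAMETERS_KEY}' property")
    # Copy so the caller never mutates the parsed item by accident.
    params = dict(properties[HYPERPARAMETERS_KEY])
    params.update(overrides or {})
    return params
```

Keeping the defaults in the STAC Item means a run can be reconfigured by pointing the pipeline at a different item file instead of editing code.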

… and compatibility
- Refactored Dockerfile to streamline the build process, using a base image from GHCR for both CPU and GPU.
- Enhanced pipeline.py to support new model weight loading mechanisms and improved error handling for model paths.
- Updated README.md to reflect changes in framework version and model usage, including new data directory structure.
- Revised stac-item.json to include updated model weights source and additional metadata for better integration.
- Improved smoke tests to validate the new pipeline functionality and ensure compatibility with the latest TensorFlow/Keras versions.
- Added per-file ignores in Ruff for specific linting rules in pipeline.py.

…improvements
- Added functions to resolve local and remote input directories and files, improving flexibility in handling model paths.
- Implemented a zip extraction utility for loading models from compressed files.
- Updated `resolve_model_href` to support both local and remote SavedModel directories, enhancing compatibility with various model formats.
- Modified smoke tests to validate the new `split_dataset` functionality alongside training wrappers, ensuring robust model training and validation.
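A zip extraction utility of the kind mentioned above could look roughly like the following. This is a minimal sketch using only the standard library; the function name `extract_model_zip` and the reuse-if-already-extracted behaviour are assumptions for illustration, not the code added in this PR.

```python
import zipfile
from pathlib import Path
from typing import Union


def extract_model_zip(zip_path: Union[str, Path], cache_dir: Union[str, Path]) -> Path:
    """Extract a model archive into cache_dir/<archive stem> and return that directory."""
    zip_path = Path(zip_path)
    if not zip_path.is_file():
        raise FileNotFoundError(f"Model archive not found: {zip_path}")
    target = Path(cache_dir) / zip_path.stem
    # Assumed caching rule: a non-empty target directory is treated as already extracted.
    if target.is_dir() and any(target.iterdir()):
        return target
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target
```

The returned directory can then be handed to whatever loader expects an unpacked model layout.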

… and dependencies
- Refactored Dockerfile to separate build, runtime, test, and inference stages for better clarity and efficiency.
- Updated `pyproject.toml` to remove unnecessary lint ignores.
- Enhanced `pipeline.py` with improved model resolution and added support for lazy imports.
- Revised README.md for clarity on architecture and usage.
- Added test fixtures for a toy dataset and implemented step tests for the RAMP pipeline.
- Updated STAC item schema to version 1.1.0 and included additional metadata properties.

codecov · 2026-04-21T12:47:55Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.27%. Comparing base (186a987) to head (6e4d2d3).

Additional details and impacted files

@@           Coverage Diff           @@
##           master      #39   +/-   ##
=======================================
  Coverage   97.27%   97.27%           
=======================================
  Files          44       44           
  Lines        3892     3893    +1     
=======================================
+ Hits         3786     3787    +1     
  Misses        106      106

Flag	Coverage Δ
fair	`96.21% <100.00%> (+<0.01%)`	⬆️

…ources
- Refactored model resolution logic in `pipeline.py` for clarity and efficiency.
- Updated `pretrained_source` and `checkpoint` URLs in `stac-item.json` to point to Hugging Face.
- Adjusted metadata properties in `stac-item.json` for consistency and accuracy.

- Simplified the `resolve_model_href` function to focus on .onnx and .zip formats.
- Improved error handling for unsupported model formats and missing files.
- Updated ZIP extraction logic to ensure proper directory creation and cache management.
- Removed deprecated code and comments for better readability.
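The format check and error handling described above can be sketched as a small validation helper. The name `check_model_href` and the exact exception types are assumptions for illustration; the real `resolve_model_href` is not reproduced in this conversation and may behave differently.

```python
from pathlib import Path
from urllib.parse import urlparse

# Formats named in the commit message above.
SUPPORTED_SUFFIXES = (".onnx", ".zip")


def check_model_href(href: str) -> str:
    """Return the lowercase suffix of a supported model href, or raise."""
    parsed = urlparse(href)
    is_remote = parsed.scheme in ("http", "https")
    # For URLs, ignore the query string when reading the file suffix.
    suffix = Path(parsed.path if is_remote else href).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"Unsupported model format '{suffix}' for {href}")
    # Only local paths can be checked for existence without network access.
    if not is_remote and not Path(href).is_file():
        raise FileNotFoundError(f"Model file not found: {href}")
    return suffix
```

Failing early on the suffix keeps later download or extraction steps from running on inputs they cannot handle.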

…ipeline.py
- Consolidated multi-line string definitions into single lines for consistency.
- Removed unnecessary blank lines to enhance code clarity.
- Streamlined parameter retrieval in several functions for better readability.

…TAC item references
- Removed the default baseline URL in `pipeline.py` and raised a ValueError if weights are not provided.
- Updated `pretrained_source` and `checkpoint` URLs in `stac-item.json` to point to the new Hugging Face location.
- Adjusted the `create_toy_data` function to ensure it uses a GeoJSON file for labels, aligning with the expectations in `pipeline.py`.

…peline.py
- Updated type hints for several functions to use Optional and Union for better clarity.
- Introduced a new function `_normalize_to_savedmodel_dir` to streamline model path normalization.
- Improved error handling and readability in model resolution and checkpoint restoration logic.
- Consolidated ZIP handling and model loading processes for better maintainability.

Added a new function `_resolve_labels_geojson` to handle the resolution of dataset labels, supporting both direct file paths and directories containing a single labels file. Updated `_materialize_training_input` to utilize the new label resolution function, enhancing input handling for training datasets.
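The label resolution rule described above (accept a file directly, or a directory that holds exactly one labels file) can be sketched as follows. The function name `resolve_labels_file` and the accepted suffixes `.geojson` and `.json` are assumptions for illustration; the PR's `_resolve_labels_geojson` is not shown here.

```python
from pathlib import Path
from typing import Union

# Assumed label file suffixes for this sketch.
LABEL_SUFFIXES = (".geojson", ".json")


def resolve_labels_file(path: Union[str, Path]) -> Path:
    """Resolve a labels path that is either a file or a directory with one labels file."""
    path = Path(path)
    if path.is_file():
        if path.suffix.lower() not in LABEL_SUFFIXES:
            raise ValueError(f"Not a GeoJSON/JSON labels file: {path}")
        return path
    if path.is_dir():
        candidates = sorted(
            p for p in path.iterdir()
            if p.is_file() and p.suffix.lower() in LABEL_SUFFIXES
        )
        # Refuse to guess when the directory is empty or ambiguous.
        if len(candidates) != 1:
            raise ValueError(
                f"Expected exactly one labels file in {path}, found {len(candidates)}"
            )
        return candidates[0]
    raise FileNotFoundError(f"Labels path does not exist: {path}")
```

Raising on an ambiguous directory avoids silently training on the wrong labels file.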

…th handling

Refactored the `_resolve_labels_geojson` function to simplify label resolution for GeoJSON/JSON files, removing redundant file path checks. Introduced a new helper function `_resolve_model_file_path` to handle the resolution of model file paths from various sources, improving code organization and maintainability. Updated the STAC item JSON to reflect a new citation URL.

Added new labels for training and inference to the kind configuration. Updated the default values for training epochs and batch size in the STAC item JSON, reflecting a more suitable configuration for model training.

…n logging

Added MLflow training context to the model training process for better tracking. Enhanced model evaluation by logging evaluation results for both zero metrics and computed metrics, improving observability of model performance.

Updated the model loading process to support both `.keras` and `.h5` formats, improving flexibility. Adjusted evaluation metrics to remove the "fair:" prefix for consistency. Additionally, modified the STAC item JSON to reduce training epochs and batch size for better initial training performance.
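The metric renaming mentioned above amounts to dropping a `fair:` prefix from metric keys. A minimal sketch, assuming metrics are held in a plain dictionary; the helper name `strip_metric_prefix` and the sample metric names are hypothetical.

```python
from typing import Dict


def strip_metric_prefix(metrics: Dict[str, float], prefix: str = "fair:") -> Dict[str, float]:
    """Return a copy of metrics with the given prefix removed from each key."""
    # str.removeprefix (Python 3.9+) leaves keys without the prefix unchanged.
    return {key.removeprefix(prefix): value for key, value in metrics.items()}
```

For example, `strip_metric_prefix({"fair:iou": 0.5, "loss": 0.1})` yields keys `iou` and `loss`.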

Renamed and refactored functions for clarity and consistency in handling local and remote paths. Updated the model loading process to exclusively support ZIP files containing SavedModel directories, enhancing error handling and simplifying the extraction logic. Adjusted related function calls to reflect these changes, improving overall code maintainability.

…rfile user to root

Modified the STAC item JSON to change the ID and name from "ramp-v1" to "ramp" for better alignment with naming conventions. Updated the Dockerfile to set the user to root, ensuring the k8s ZenML orchestrator can access the in-cluster service account token without authentication issues.

Eliminated the USER root directive from the Dockerfile, as it was no longer needed for the k8s ZenML orchestrator to access the in-cluster service account token. This change simplifies the Dockerfile and enhances security by avoiding unnecessary root privileges.

Enhanced the README.md to provide a clearer overview of the RAMP EfficientNetB0 + U-Net model, including detailed architecture, input/output specifications, pretrained artifacts, and usage instructions for inference and fine-tuning. Added limitations and citation information for better context and usability.

…raining

Introduced a new `sample_fraction` parameter to control the fraction of chip files used during training across multiple models. This allows for faster smoke and CI runs by enabling stride sampling. Updated relevant model pipelines and STAC item JSON files to reflect this change, ensuring consistency in model configurations. Additionally, added unit tests for the new functionality to validate its behavior.
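Stride sampling with a `sample_fraction` parameter can be illustrated as below. This is a sketch under stated assumptions: the function name `sample_by_stride`, the rounding rule for the stride, and the validation of the fraction are hypothetical, since the PR's implementation is not shown in this conversation.

```python
from typing import List, Sequence


def sample_by_stride(files: Sequence[str], sample_fraction: float) -> List[str]:
    """Keep roughly sample_fraction of the files by taking every n-th one."""
    if not 0.0 < sample_fraction <= 1.0:
        raise ValueError("sample_fraction must be in the interval (0, 1]")
    ordered = sorted(files)  # deterministic order so repeated runs pick the same chips
    # Assumed rule: stride is the rounded inverse of the fraction, never below 1.
    stride = max(1, round(1.0 / sample_fraction))
    return ordered[::stride]
```

With eight chips and `sample_fraction=0.25`, the stride is 4, so the first and fifth chips in sorted order are kept. Because the selection is deterministic, smoke and CI runs stay reproducible.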

AbdelrahmanKatkat and others added 7 commits March 1, 2026 23:54

Merge branch 'master' into feature/ramp

0121689

Merge branch 'master' into feature/ramp

bbae718

AbdelrahmanKatkat added 16 commits April 22, 2026 23:35

fix(ramp): update pretrained model source URLs in STAC item

0cdc85b

Merge branch 'master' into feature/ramp

92b29b9

chore(ci): update just version to 1.39.0 in workflows

79314fd

refactor(pipeline): streamline training directory creation and remove unused cache function; update hyperparameters specification in stac-item.json

24b7e95

feat(pipeline): add output materializers for model training and ONNX export steps

4a2956c

feat(pipeline): support Keras model formats in model resolution and training functions

1307685

feat(tests): enhance model training tests to support Keras model serialization and update assertions

4cebc65

refactor(ramp): update Dockerfile for inference stages and modify entrypoint in stac-item.json

1a8f706

feat(pipeline): enhance postprocessing function to validate CRS and streamline GeoJSON conversion

1c7afe5

refactor(ramp): update Dockerfile to use GPU base image and enhance test steps for model training

73137db

Merge branch 'master' into feature/ramp

23ac532

AbdelrahmanKatkat requested a review from kshitijrajsharma May 11, 2026 10:32

kshitijrajsharma reviewed May 16, 2026

Comment thread models/ramp/README.md Outdated

kshitijrajsharma reviewed May 16, 2026

Comment thread models/ramp/README.md Outdated

kshitijrajsharma added 3 commits May 16, 2026 21:34

Merge remote-tracking branch 'origin/master' into feature/ramp

28243f0

Merge remote-tracking branch 'origin/master' into feature/ramp

af807a1

Merge remote-tracking branch 'origin/master' into feature/ramp

95e0692

kshitijrajsharma and others added 21 commits May 17, 2026 00:17

Merge remote-tracking branch 'origin/master' into feature/ramp

82d2158

Merge remote-tracking branch 'origin/master' into feature/ramp

d03c412

Merge branch 'master' into feature/ramp

153c6db

Merge branch 'feature/ramp' of https://github.com/hotosm/fAIr-models into feature/ramp

810cc79

Merge branch 'master' into feature/ramp

95e2632

chore(config): remove training and inference labels from kind configuration

8b3e527

Deleted the training and inference labels from the kind configuration file to streamline the setup and reduce unnecessary complexity.

fix(pipeline): update output materializer label in train_model function

323652a

Changed the output materializer label from "trained_model" to "trained_model_artifact" in the train_model function to better reflect the returned artifact type.

Merge branch 'master' into feature/ramp

dc11407

Merge branch 'master' into feature/ramp

8abd477

Merge branch 'master' into feature/ramp

e8500e0

Merge branch 'master' into feature/ramp

6e4d2d3

Feature/ramp #39
AbdelrahmanKatkat wants to merge 47 commits into master from feature/ramp

2 participants