Feature/ramp#39

Open
AbdelrahmanKatkat wants to merge 47 commits into master from feature/ramp

AbdelrahmanKatkat (Contributor) commented Apr 14, 2026

What does this PR do?

  • Adds the ramp model pack with STAC metadata, Docker stages, a README, and ZenML pipeline support for training, evaluation, and export.
  • Integrates the RAMP workflow with the fAIr platform conventions for hyperparameters, artifact materializers, MLflow logging, promotion, and inference.

AbdelrahmanKatkat and others added 7 commits March 1, 2026 23:54
…mantic segmentation model

- Introduced Dockerfile for building the RAMP model environment with GPU and CPU support.
- Added pipeline.py for defining the ZenML pipeline, including preprocessing, training, inference, and postprocessing steps.
- Created README.md to document the model architecture, usage, and data layout.
- Implemented stac-item.json for STAC catalog integration.
- Included smoke tests to validate the Docker runtime and model functionality.
- Updated .gitignore to exclude new data directories.
… loading

- Updated pipeline.py to load hyperparameters from STAC Item JSON, streamlining model configuration.
- Modified training_pipeline to accept a path to the STAC Item, allowing for flexible hyperparameter management.
- Revised README.md to reflect changes in hyperparameter handling and usage of STAC Item.
- Enhanced stac-item.json with additional metadata and structure for better integration.
- Removed outdated CODE_EXPLAINED.md and README.md from tests directory to clean up documentation.
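
For reviewers who want a concrete picture of the STAC-driven configuration described in this commit, here is a minimal sketch of reading hyperparameters from a STAC Item file. The function name `load_hyperparameters` and the property key `mlm:hyperparameters` are assumptions for illustration; the key actually used in `stac-item.json` may differ.

```python
import json
from pathlib import Path
from typing import Any


def load_hyperparameters(
    stac_item_path: str, key: str = "mlm:hyperparameters"
) -> dict[str, Any]:
    """Return the hyperparameter mapping stored under properties[key].

    Both the function name and the default key are hypothetical; adjust
    the key to whatever the STAC Item in this PR actually uses.
    """
    item = json.loads(Path(stac_item_path).read_text(encoding="utf-8"))
    hyperparameters = item.get("properties", {}).get(key)
    if not isinstance(hyperparameters, dict):
        # Fail loudly instead of silently training with defaults.
        raise ValueError(
            f"{stac_item_path} has no '{key}' mapping under 'properties'"
        )
    return hyperparameters
```

Passing the item path into the pipeline, rather than hard-coding values, keeps the STAC Item as the single place where training defaults are declared.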
… and compatibility

- Refactored Dockerfile to streamline the build process, using a base image from GHCR for both CPU and GPU.
- Enhanced pipeline.py to support new model weight loading mechanisms and improved error handling for model paths.
- Updated README.md to reflect changes in framework version and model usage, including new data directory structure.
- Revised stac-item.json to include updated model weights source and additional metadata for better integration.
- Improved smoke tests to validate the new pipeline functionality and ensure compatibility with the latest TensorFlow/Keras versions.
- Added per-file ignores in Ruff for specific linting rules in pipeline.py.
…improvements

- Added functions to resolve local and remote input directories and files, improving flexibility in handling model paths.
- Implemented a zip extraction utility for loading models from compressed files.
- Updated `resolve_model_href` to support both local and remote SavedModel directories, enhancing compatibility with various model formats.
- Modified smoke tests to validate the new `split_dataset` functionality alongside training wrappers, ensuring robust model training and validation.
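
The ZIP handling described in this commit can be sketched roughly as follows. This is not the code from `pipeline.py`: the helper name `extract_savedmodel_zip` and the cache layout are assumptions. It relies only on the fact that a TensorFlow SavedModel directory contains a `saved_model.pb` file.

```python
import zipfile
from pathlib import Path


def extract_savedmodel_zip(zip_path: str, cache_dir: str) -> Path:
    """Extract zip_path under cache_dir and return the SavedModel directory.

    Hypothetical helper: the archive is unpacked once into
    cache_dir/<archive stem> and reused on later calls.
    """
    archive = Path(zip_path)
    target = Path(cache_dir) / archive.stem
    if not target.exists():
        target.mkdir(parents=True)
        root = target.resolve()
        with zipfile.ZipFile(archive) as zf:
            for member in zf.namelist():
                # Refuse entries that would escape the extraction directory.
                if not (root / member).resolve().is_relative_to(root):
                    raise ValueError(f"unsafe path in archive: {member}")
            zf.extractall(target)
    # A SavedModel directory is identified by its saved_model.pb file.
    matches = sorted(target.rglob("saved_model.pb"))
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one saved_model.pb in {archive}, found {len(matches)}"
        )
    return matches[0].parent
```

Returning the parent of `saved_model.pb` means the archive may wrap the model in a top-level folder or not; the caller gets the loadable directory either way.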
… and dependencies

- Refactored Dockerfile to separate build, runtime, test, and inference stages for better clarity and efficiency.
- Updated `pyproject.toml` to remove unnecessary lint ignores.
- Enhanced `pipeline.py` with improved model resolution and added support for lazy imports.
- Revised README.md for clarity on architecture and usage.
- Added test fixtures for a toy dataset and implemented step tests for the RAMP pipeline.
- Updated STAC item schema to version 1.1.0 and included additional metadata properties.
codecov Bot commented Apr 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.27%. Comparing base (186a987) to head (6e4d2d3).

Additional details and impacted files
@@           Coverage Diff           @@
##           master      #39   +/-   ##
=======================================
  Coverage   97.27%   97.27%           
=======================================
  Files          44       44           
  Lines        3892     3893    +1     
=======================================
+ Hits         3786     3787    +1     
  Misses        106      106           
Flag    Coverage Δ
fair    96.21% <100.00%> (+<0.01%) ⬆️

…ources

- Refactored model resolution logic in `pipeline.py` for clarity and efficiency.
- Updated `pretrained_source` and `checkpoint` URLs in `stac-item.json` to point to Hugging Face.
- Adjusted metadata properties in `stac-item.json` for consistency and accuracy.
- Simplified the `resolve_model_href` function to focus on .onnx and .zip formats.
- Improved error handling for unsupported model formats and missing files.
- Updated ZIP extraction logic to ensure proper directory creation and cache management.
- Removed deprecated code and comments for better readability.
…ipeline.py

- Consolidated multi-line string definitions into single lines for consistency.
- Removed unnecessary blank lines to enhance code clarity.
- Streamlined parameter retrieval in several functions for better readability.
…TAC item references

- Removed the default baseline URL in `pipeline.py` and raised a ValueError if weights are not provided.
- Updated `pretrained_source` and `checkpoint` URLs in `stac-item.json` to point to the new Hugging Face location.
- Adjusted the `create_toy_data` function to ensure it uses a GeoJSON file for labels, aligning with the expectations in `pipeline.py`.
…peline.py

- Updated type hints for several functions to use Optional and Union for better clarity.
- Introduced a new function `_normalize_to_savedmodel_dir` to streamline model path normalization.
- Improved error handling and readability in model resolution and checkpoint restoration logic.
- Consolidated ZIP handling and model loading processes for better maintainability.
… unused cache function; update hyperparameters specification in stac-item.json
Comment thread models/ramp/README.md Outdated
Comment thread models/ramp/README.md Outdated
kshitijrajsharma and others added 21 commits May 17, 2026 00:17
Added a new function `_resolve_labels_geojson` to handle the resolution of dataset labels, supporting both direct file paths and directories containing a single labels file. Updated `_materialize_training_input` to utilize the new label resolution function, enhancing input handling for training datasets.
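
As an illustration of the label resolution described above (a direct file path, or a directory containing a single labels file), a sketch might look like this. The function name `resolve_labels_sketch` is hypothetical and the accepted suffixes are an assumption; the PR's `_resolve_labels_geojson` may behave differently.

```python
from pathlib import Path

# Assumed set of accepted label file extensions.
LABEL_SUFFIXES = {".geojson", ".json"}


def resolve_labels_sketch(labels: str) -> Path:
    """Resolve a labels argument to a single GeoJSON/JSON file path."""
    path = Path(labels)
    if path.is_file():
        # Direct file path: use it as given.
        return path
    if path.is_dir():
        candidates = sorted(
            p for p in path.iterdir()
            if p.is_file() and p.suffix.lower() in LABEL_SUFFIXES
        )
        if len(candidates) == 1:
            return candidates[0]
        # Zero or several candidates is ambiguous, so refuse to guess.
        raise ValueError(
            f"expected exactly one labels file in {path}, found {len(candidates)}"
        )
    raise FileNotFoundError(f"labels path does not exist: {path}")
```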
…th handling

Refactored the `_resolve_labels_geojson` function to simplify label resolution for GeoJSON/JSON files, removing redundant file path checks. Introduced a new helper function `_resolve_model_file_path` to handle the resolution of model file paths from various sources, improving code organization and maintainability. Updated the STAC item JSON to reflect a new citation URL.
Added new labels for training and inference to the kind configuration. Updated the default values for training epochs and batch size in the STAC item JSON, reflecting a more suitable configuration for model training.
…ration

Deleted the training and inference labels from the kind configuration file to streamline the setup and reduce unnecessary complexity.
…n logging

Added MLflow training context to the model training process for better tracking. Enhanced model evaluation by logging evaluation results for both zero metrics and computed metrics, improving observability of model performance.
Changed the output materializer label from "trained_model" to "trained_model_artifact" in the train_model function to better reflect the returned artifact type.
Updated the model loading process to support both `.keras` and `.h5` formats, improving flexibility. Adjusted evaluation metrics to remove the "fair:" prefix for consistency. Additionally, modified the STAC item JSON to reduce training epochs and batch size for better initial training performance.
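
The metric renaming mentioned above amounts to dropping a key prefix before logging. A minimal sketch, assuming metrics are held in a plain dict; the helper name and the metric names in the usage example are hypothetical:

```python
def strip_metric_prefix(
    metrics: dict[str, float], prefix: str = "fair:"
) -> dict[str, float]:
    """Return a copy of metrics with prefix removed from keys that carry it."""
    # str.removeprefix (Python 3.9+) leaves keys without the prefix unchanged.
    return {name.removeprefix(prefix): value for name, value in metrics.items()}


# Usage with made-up metric names:
cleaned = strip_metric_prefix({"fair:iou": 0.8, "accuracy": 0.9})
# cleaned == {"iou": 0.8, "accuracy": 0.9}
```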
Renamed and refactored functions for clarity and consistency in handling local and remote paths. Updated the model loading process to exclusively support ZIP files containing SavedModel directories, enhancing error handling and simplifying the extraction logic. Adjusted related function calls to reflect these changes, improving overall code maintainability.
…rfile user to root

Modified the STAC item JSON to change the ID and name from "ramp-v1" to "ramp" for better alignment with naming conventions. Updated the Dockerfile to set the user to root, ensuring the k8s ZenML orchestrator can access the in-cluster service account token without authentication issues.
Eliminated the USER root directive from the Dockerfile, as it was no longer needed for the k8s ZenML orchestrator to access the in-cluster service account token. This change simplifies the Dockerfile and enhances security by avoiding unnecessary root privileges.
Enhanced the README.md to provide a clearer overview of the RAMP EfficientNetB0 + U-Net model, including detailed architecture, input/output specifications, pretrained artifacts, and usage instructions for inference and fine-tuning. Added limitations and citation information for better context and usability.
…raining

Introduced a new `sample_fraction` parameter to control the fraction of chip files used during training across multiple models. This allows for faster smoke and CI runs by enabling stride sampling. Updated relevant model pipelines and STAC item JSON files to reflect this change, ensuring consistency in model configurations. Additionally, added unit tests for the new functionality to validate its behavior.
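
To make the stride sampling idea concrete, here is one way a `sample_fraction` could select evenly spaced chips. This is a sketch under assumptions: the function name `stride_sample`, the rounding rule, and the validation range are not taken from the PR.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def stride_sample(items: Sequence[T], sample_fraction: float) -> list[T]:
    """Return roughly sample_fraction of items, evenly spaced, in order."""
    if not 0.0 < sample_fraction <= 1.0:
        raise ValueError("sample_fraction must be in (0, 1]")
    if sample_fraction == 1.0 or not items:
        return list(items)
    # Keep at least one item, then pick evenly spaced indices.
    keep = max(1, round(len(items) * sample_fraction))
    step = len(items) / keep
    return [items[int(i * step)] for i in range(keep)]


# Usage: half of ten chips, taken at a stride of two.
chips = [f"chip_{i}.tif" for i in range(10)]
subset = stride_sample(chips, 0.5)
# subset == ["chip_0.tif", "chip_2.tif", "chip_4.tif", "chip_6.tif", "chip_8.tif"]
```

Unlike random sampling, a stride keeps the subset deterministic, which suits smoke tests and CI runs that need reproducible results.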
2 participants