1240 uploader look into processing h5ad in chunks#1291
Draft
adkinsrs wants to merge 5 commits into
(Still a WIP: the RabbitMQ consumer still needs to be tested.)
This pull request removes legacy Python code related to dataset uploading and AnnData file handling, and adds documentation and service configuration for a new AnnData upload consumer. The changes help to streamline the codebase, reduce redundancy, and introduce a new, containerized RabbitMQ consumer for AnnData uploads.
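For readers unfamiliar with the chunked h5ad processing named in the PR title, here is a minimal sketch of the general idea, not the code in this PR. It assumes the `anndata` package is installed; `iter_row_ranges` and `count_rows_in_chunks` are hypothetical helper names chosen for illustration. The file is opened in backed mode so the matrix stays on disk, and only one block of rows is loaded at a time.

```python
from typing import Iterator, Tuple


def iter_row_ranges(n_rows: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) row ranges that cover n_rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_rows, chunk_size):
        yield start, min(start + chunk_size, n_rows)


def count_rows_in_chunks(h5ad_path: str, chunk_size: int = 10_000) -> int:
    """Walk an h5ad file block by block and return the number of rows seen.

    Hypothetical helper: a real consumer would validate or transform each
    block here instead of only counting its rows.
    """
    import anndata  # imported lazily so the range helper has no dependencies

    # backed="r" keeps X on disk; slicing reads only the requested rows.
    adata = anndata.read_h5ad(h5ad_path, backed="r")
    try:
        total = 0
        for start, end in iter_row_ranges(adata.n_obs, chunk_size):
            block = adata.X[start:end]
            total += block.shape[0]
        return total
    finally:
        adata.file.close()
```

The range helper is kept separate from the file access so that the chunk boundaries can be checked without an h5ad file on hand.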
Removal of legacy dataset upload code:
Removed lib/gear/dataarchive.py, which previously handled detection, parsing, and writing of various dataset formats (MEX, 3tab) into AnnData objects.
Removed lib/gear/datasetuploader.py, which included the dataset uploader factory and logic for determining file types and handling tarball contents.
Removed lib/gear/exceluploader.py, which implemented Excel file parsing and conversion to AnnData, including validation and statistics/coloring calculations.
Introduction of new AnnData upload consumer:
Added the anndata_upload_consumer service to the Docker Compose template (docker/docker-compose.yml.template). The service builds from a dedicated Dockerfile and handles AnnData and related file uploads via RabbitMQ.
Documentation updates: