Add pipeline transform order validation #12438

Open
crawfordxx wants to merge 1 commit into open-mmlab:main from crawfordxx:feat-pipeline-transform-order-validation

Summary

Fixes #6106

As reported in the issue, the ordering of data augmentation transforms in training pipelines is error-prone and can silently produce incorrect results (e.g., zero evaluation metrics). This PR adds a validate_pipeline_order() utility that:

  • Warns when transforms appear in a suspicious order (e.g., Normalize before RandomFlip, Pad before Resize)
  • Uses both pair-wise rules and category-based ordering checks
  • Is called automatically in BaseDetDataset.__init__ before the dataset is built
  • Only emits UserWarning messages and never raises exceptions, so existing configs continue to work
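
For readers who want a feel for the pair-wise part, here is a minimal, self-contained sketch. It is not the code in this PR: the `PAIR_RULES` table, the `check_pairwise_order` name, and the exact message wording are assumptions made for illustration, and the pipeline is assumed to be a list of dicts with a string `type` key.

```python
import warnings

# Hypothetical rule table (not taken from the PR): each entry says that
# `earlier` is expected to run before `later`, with a short reason.
PAIR_RULES = [
    ("RandomFlip", "Normalize", "Normalize should be applied after RandomFlip."),
    ("Resize", "Pad", "Pad should be applied after Resize."),
    ("Pad", "Normalize", "Normalize should be applied after Pad."),
]


def check_pairwise_order(pipeline):
    """Warn for every rule whose two transforms appear in reversed order.

    Returns the list of warning messages; never raises.
    """
    # Record the first position of each transform type in the pipeline.
    first_index = {}
    for i, cfg in enumerate(pipeline):
        first_index.setdefault(cfg.get("type"), i)

    messages = []
    for earlier, later, reason in PAIR_RULES:
        if earlier not in first_index or later not in first_index:
            continue  # rule does not apply; unknown transforms are skipped
        i_earlier, i_later = first_index[earlier], first_index[later]
        if i_later < i_earlier:
            msg = (
                f"In the data pipeline, '{earlier}' (index {i_earlier}) "
                f"appears after '{later}' (index {i_later}), which is "
                f"likely incorrect. {reason}"
            )
            warnings.warn(msg, UserWarning)
            messages.append(msg)
    return messages
```

Calling `check_pairwise_order([dict(type='Normalize'), dict(type='RandomFlip')])` under these assumptions would emit one warning and return its message; a pipeline containing only unknown transforms returns an empty list.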

Changes

  • mmdet/datasets/utils.py: Added validate_pipeline_order() with transform category definitions and pair-wise ordering rules
  • mmdet/datasets/base_det_dataset.py: Calls validate_pipeline_order() during dataset initialization
  • mmdet/datasets/__init__.py: Exports the new function
  • tests/test_datasets/test_utils.py: Unit tests covering correct pipelines, the exact problematic pipeline from #6106, and various misordering scenarios

Example warning

When using the problematic pipeline from #6106 (Normalize before Pad):

UserWarning: In the data pipeline, 'Pad' (index 5) appears after 'Normalize' 
(index 4), which is likely incorrect. Normalize should be applied after Pad so 
that padded values are also normalized. Please verify the transform order in 
your config.
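
Because the check only warns, anyone who wants a misordered pipeline to fail fast (for example in CI) could escalate `UserWarning` to an exception with the standard `warnings` filters. The sketch below uses only the Python standard library; `run_strict` and `fake_build` are hypothetical stand-ins for illustration, not part of this PR or of mmdet.

```python
import warnings


def run_strict(build_fn):
    """Call build_fn with UserWarning escalated to an exception.

    The filter change is scoped to this call and restored afterwards.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("error", category=UserWarning)
        return build_fn()


def fake_build():
    # Stand-in for dataset construction; emits a warning similar to the
    # example above.
    warnings.warn(
        "In the data pipeline, 'Pad' (index 5) appears after "
        "'Normalize' (index 4), which is likely incorrect.",
        UserWarning,
    )
    return "dataset"


if __name__ == "__main__":
    try:
        run_strict(fake_build)
    except UserWarning as exc:
        print(f"pipeline check failed: {exc}")
```

Note that this escalates every `UserWarning` raised during the call, not only the ones about pipeline order.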

Test plan

  • Correct standard pipeline produces no warnings
  • Issue #6106 pipeline (Normalize before Pad) triggers warning
  • Normalize before RandomFlip triggers warning
  • Pad before Resize triggers warning
  • Empty pipeline does not crash
  • Unknown custom transforms are silently skipped

Add validate_pipeline_order() that warns when data augmentation
transforms appear in a suspicious order.  For example, placing
Normalize before RandomFlip or Pad before Resize can silently
produce bad training results, as reported in open-mmlab#6106.

The validation is called automatically in BaseDetDataset.__init__
and emits UserWarning messages without raising exceptions, so
existing configs continue to work.

Covered rules:
- Loading transforms must come first
- Spatial augmentation before pixel augmentation
- Pad after Resize and RandomFlip
- Normalize after all augmentations and Pad
- Formatting / packing last
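
The covered rules above can be read as a single category ordering. The following is a rough sketch of that idea, not the implementation in this PR: the category names, their ranks, the transform-to-category mapping, and the `check_category_order` name are all assumptions chosen to mirror the list, and the pipeline is assumed to be a list of dicts with a string `type` key.

```python
import warnings

# Assumed category order, mirroring the rules listed above.
CATEGORY_RANK = {
    "loading": 0,
    "spatial": 1,
    "pixel": 2,
    "pad": 3,
    "normalize": 4,
    "formatting": 5,
}

# Assumed mapping from transform type to category (illustrative only).
TRANSFORM_CATEGORY = {
    "LoadImageFromFile": "loading",
    "LoadAnnotations": "loading",
    "Resize": "spatial",
    "RandomFlip": "spatial",
    "RandomCrop": "spatial",
    "PhotoMetricDistortion": "pixel",
    "Pad": "pad",
    "Normalize": "normalize",
    "PackDetInputs": "formatting",
}


def check_category_order(pipeline):
    """Warn when a transform's category ranks below one already seen."""
    messages = []
    highest = None  # (rank, type name, index) of the highest category so far
    for i, cfg in enumerate(pipeline):
        name = cfg.get("type")
        category = TRANSFORM_CATEGORY.get(name)
        if category is None:
            continue  # unknown custom transforms are silently skipped
        rank = CATEGORY_RANK[category]
        if highest is not None and rank < highest[0]:
            msg = (
                f"In the data pipeline, '{name}' (index {i}) appears after "
                f"'{highest[1]}' (index {highest[2]}), which is likely "
                f"incorrect. Please verify the transform order in your config."
            )
            warnings.warn(msg, UserWarning)
            messages.append(msg)
        else:
            highest = (rank, name, i)
    return messages
```

A single rank table keeps the check linear in pipeline length, at the cost of being coarser than explicit pair-wise rules; combining both, as the PR description says it does, covers cases either approach alone would miss.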