Introduce GatherToLDS MLIR operation

The FX-level `GatherToLDS` op has no MLIR counterpart, so any pipeline path that touches it in MLIR (roundtrip, lowering, analysis) is blocked.

What's needed:

- Define `wave.gather_to_lds` in `WaveOps.td` with appropriate interfaces. It possibly should not inherit `HasWaveIndexMapping`, since it lacks a standard `index` attribute.
- Implement a C++ lowering pattern to `amdgpu.gather_to_lds` in `LowerReadWriteOps.cpp`.
- Verify the op interacts correctly with existing MLIR passes and propagations.
- Add emission (`water_emitter.py`) and import (`fx_emitter.py`) support for MLIR roundtrip.

Currently, both the MXFP4 roundtrip test and `water_e2e_test.py` work around the absence of this op by setting `use_global_to_shared=False` and `schedule=SchedulingType.NONE`. The manual schedule (`gemm_mxfp4_double_buffer.py`) depends on `GatherToLDS` nodes, so `SchedulingType.MANUAL` cannot be used until the op exists in MLIR.
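
The constraint described above can be sketched as a small option-selection helper. This is only an illustration, not code from the repository: `RoundtripOptions`, `roundtrip_options`, and the two-member `SchedulingType` enum below are hypothetical stand-ins, and only the option names `use_global_to_shared` and `schedule` and the values `NONE` and `MANUAL` come from this issue.

```python
from dataclasses import dataclass
from enum import Enum, auto


class SchedulingType(Enum):
    """Hypothetical stand-in for the real enum; only the two members named in this issue."""
    NONE = auto()
    MANUAL = auto()


@dataclass(frozen=True)
class RoundtripOptions:
    """Hypothetical container for the two options mentioned in this issue."""
    use_global_to_shared: bool
    schedule: SchedulingType


def roundtrip_options(gather_to_lds_in_mlir: bool, want_manual: bool = False) -> RoundtripOptions:
    """Pick options for an MLIR roundtrip test.

    Without an MLIR counterpart for GatherToLDS, the test has to avoid
    producing GatherToLDS nodes, which rules out the manual schedule.
    """
    if not gather_to_lds_in_mlir:
        if want_manual:
            raise ValueError(
                "SchedulingType.MANUAL depends on GatherToLDS nodes, "
                "which cannot be round-tripped through MLIR yet"
            )
        # Current workaround: no global-to-shared path, no scheduling.
        return RoundtripOptions(use_global_to_shared=False, schedule=SchedulingType.NONE)
    schedule = SchedulingType.MANUAL if want_manual else SchedulingType.NONE
    return RoundtripOptions(use_global_to_shared=True, schedule=schedule)
```

Once `wave.gather_to_lds` exists, the workaround branch would presumably no longer be needed, and the tests could enable `use_global_to_shared` and the manual schedule directly.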

Introduce GatherToLDS MLIR operation #1290
