Introduce GatherToLDS MLIR operation #1290

@martin-luecke

Description

The FX-level GatherToLDS op has no MLIR counterpart, so any pipeline path that touches it in MLIR (roundtrip, lowering, analysis) is blocked.

What's needed:

  • Define wave.gather_to_lds in WaveOps.td with appropriate interfaces. It possibly should not inherit HasWaveIndexMapping, since it lacks a standard index attribute.
  • Implement a C++ lowering pattern to amdgpu.gather_to_lds in LowerReadWriteOps.cpp.
  • Verify the op interacts correctly with existing MLIR passes and propagations.
  • Add emission (water_emitter.py) and import (fx_emitter.py) support for MLIR roundtrip.

Currently, the MXFP4 roundtrip test works around the absence of this op by setting use_global_to_shared=False and schedule=SchedulingType.NONE, and water_e2e_test.py does the same. The manual schedule (gemm_mxfp4_double_buffer.py) depends on GatherToLDS nodes, so SchedulingType.MANUAL cannot be used until this op exists.
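
To make the constraint concrete, here is a minimal, self-contained sketch of which configurations are affected. It does not use the real project API: SchedulingType is a stand-in enum containing only the two members named above, and CompileOptions and gather_to_lds_blockers are hypothetical names introduced for illustration. It also assumes, as the workaround implies, that use_global_to_shared=True is what produces GatherToLDS nodes.

```python
from dataclasses import dataclass
from enum import Enum, auto


class SchedulingType(Enum):
    # Stand-in for the project's enum; only the members named in this issue.
    NONE = auto()
    MANUAL = auto()


@dataclass(frozen=True)
class CompileOptions:
    # Hypothetical container; field names mirror the options named in this issue.
    use_global_to_shared: bool = False
    schedule: SchedulingType = SchedulingType.NONE


def gather_to_lds_blockers(options: CompileOptions) -> list[str]:
    """Return the settings that would require GatherToLDS support in MLIR."""
    reasons = []
    if options.use_global_to_shared:
        # Assumption: this option is what introduces GatherToLDS nodes.
        reasons.append("use_global_to_shared=True")
    if options.schedule is SchedulingType.MANUAL:
        # The manual schedule depends on GatherToLDS nodes.
        reasons.append("schedule=SchedulingType.MANUAL")
    return reasons


# Configuration the tests currently fall back to.
workaround = CompileOptions(use_global_to_shared=False, schedule=SchedulingType.NONE)
# Configuration that stays blocked until wave.gather_to_lds exists.
desired = CompileOptions(use_global_to_shared=True, schedule=SchedulingType.MANUAL)

print(gather_to_lds_blockers(workaround))
print(gather_to_lds_blockers(desired))
```

An empty list means the configuration can take the MLIR roundtrip path today; any entry names a setting that needs the new op first.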
