[PyTorch] Support scaled + clamped SwiGLU in te.ops and enable fused MXFP8 grouped MLP
#2855