Pull requests: NVIDIA/TransformerEngine
[PyTorch] Relax dimension constraints for using fused grouped MLP
#2856
opened Apr 8, 2026 by
ksivaman
5 of 13 tasks
[PyTorch] Support scaled + clamped SwiGLU in te.ops and enable fused MXFP8 grouped MLP
#2855
opened Apr 8, 2026 by
ksivaman
7 of 13 tasks
Fix zero input shape for bgrad_group_quantize
#2854
opened Apr 8, 2026 by
vthumbe1503
13 tasks
Add cpplint and ruff linter to pre-commit and fix lint violations
#2853
opened Apr 8, 2026 by
pstjohn
Bump transformers from 4.55.0 to 5.0.0rc3 in /docs/examples/te_gemma
dependencies
Pull requests that update a dependency file
python
Pull requests that update python code
#2851
opened Apr 8, 2026 by
dependabot
bot
Bump transformers from 4.57.0 to 5.0.0rc3 in /docs/examples/te_llama
dependencies
python
#2850
opened Apr 8, 2026 by
dependabot
bot
Skip activation kernels when tensor size is zero
bug
Something isn't working
#2848
opened Apr 8, 2026 by
timmoon10
8 of 13 tasks
Add Megatron-FSDP E2E integration test to TE CI/CD (L1).
#2845
opened Apr 7, 2026 by
cspades
3 of 13 tasks
[Core] Report CUDA versions when NVRTC compilation fails
enhancement
New feature or request
#2842
opened Apr 7, 2026 by
timmoon10
8 of 13 tasks
[CUDNN] Update frontend to version 1.22 and add cuDNN 9.20 path for SM arch >100
#2838
opened Apr 5, 2026 by
zmelumian972
2 of 3 tasks
Add grouped unswizzle functionality for MXFP8 scaling factors
#2837
opened Apr 5, 2026 by
int-smart
8 of 13 tasks
[PyTorch] Fix FlashAttention 2 head_dim > 192 on sm103 and other architectures
#2836
opened Apr 4, 2026 by
pedramr
1 task done
Fix JAX extension build with NVTE_UB_WITH_MPI=1
#2835
opened Apr 4, 2026 by
GaetanLepage
2 of 13 tasks
[Pyt][Common] Enabling/Guarding sm120 support (non - attention)
#2833
opened Apr 3, 2026 by
KshitijLakhani
Draft
13 tasks
fix CUDA architectures cmake logic
community-contribution
PRs from external contributor outside the core maintainers, representing community-driven work.
#2832
opened Apr 3, 2026 by
GaetanLepage
2 of 13 tasks
Add capture_time_hooks to make_graphed_callables for non-capturable per-callable hooks
#2831
opened Apr 3, 2026 by
buptzyb
1 of 13 tasks
[Common] Reduced padding kernel compilation time
#2827
opened Apr 2, 2026 by
Oleg-Goncharov
5 of 13 tasks
fix(CP, MLA): CP works fine with MLA in a2a cp_comm_type
community-contribution
#2826
opened Apr 2, 2026 by
zhujian19891203
5 of 13 tasks
fix(CP, FA): the conditional logic in the FA version contains a vulnerability when processing the output of Flash Attn forward pass
#2825
opened Apr 2, 2026 by
zhujian19891203
5 of 13 tasks
Parallel Test Execution to decrease CI run times
#2824
opened Apr 2, 2026 by
sudhakarsingh27
Draft