[AMDGPU] Capping max number of registers for indirect calls by JoshuaGrindstaff · Pull Request #199765 · llvm/llvm-project

JoshuaGrindstaff · 2026-05-26T21:10:09Z

Depends on #199746

Changed the SetMaxReg lambda to cap its result at ST.getMaxNumVectorRegs(F) when calculating the maximum register use for each kernel that contains indirect function calls. Before this change, the calculation took the maximum of the per-function maxima across the whole module, rather than only those of functions within the scope of the indirect calls.

This fixes inflated VGPR/AGPR counts for kernels that have indirect function calls. The inflation has led some kernels with indirect calls to exceed the VGPR/AGPR limit and crash.
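
To make the arithmetic concrete, here is a minimal sketch of the value the capped expression is meant to resolve to. It is plain integer code, not the MCExpr-based implementation in the patch, and `cappedRegCount` and its parameter names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of LLVM: models the resolved value of the
// register-count expression for a kernel that contains an indirect call.
inline int32_t cappedRegCount(int32_t LocalRegs, int32_t ModuleMaxRegs,
                              int32_t MaxAllowedRegs) {
  assert(LocalRegs >= 0 && ModuleMaxRegs >= 0 && MaxAllowedRegs >= 0 &&
         "register counts are non-negative");
  // Old behavior: the larger of the kernel's own count and the module maximum.
  int32_t Uncapped = std::max(LocalRegs, ModuleMaxRegs);
  // New behavior: never report more than the per-function limit.
  return std::min(MaxAllowedRegs, Uncapped);
}
```

With made-up numbers, `cappedRegCount(40, 600, 512)` yields 512 where the uncapped calculation would have reported 600.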

github-actions · 2026-05-26T21:10:28Z

Hello @JoshuaGrindstaff 👋

Thank you for submitting a Pull Request (PR) to the LLVM Project. Since this is your first PR, here are a few useful links covering our main contribution policies and review practices.

All contributions to LLVM must follow our LLVM AI Tool Use Policy. In particular, if you used AI while working on this PR, remember to add a note to the PR description.
The LLVM Code-Review Policy and Practices document contains practical information about the PR process, including how patches are reviewed and accepted, and who can review a PR.
Our LLVM Developer Policy describes our expectations for code quality, commit summaries and contains notes on our CI system.

Please reply to this message to confirm that you have read these policies, especially the LLVM AI Tool Use Policy, and that any AI tool usage has been noted in the PR description.

Frequently asked questions

How do I add reviewers?

This PR will be automatically labeled, and the relevant teams will be notified. For some parts of the project, reviewers may also be added automatically.

You can also add reviewers manually using the Reviewers section on this page. If you cannot use that section, it is probably because you do not have write permissions for the repository. In that case, you can request a review by tagging reviewers in a comment using @ followed by their GitHub username.

What if there are no comments?

If you have not received any comments on your PR after a week, you can request a review by pinging the PR with a comment such as “Ping”. The common courtesy ping rate is once a week. Please remember that you are asking for volunteer time from other developers.

Are any special GitHub settings required to contribute to LLVM?

We only require contributors to have a public email address associated with their GitHub commits, see this section of LLVM Developer Policy for details.

If you have questions, feel free to leave a comment on this PR, or ask on LLVM Discord or LLVM Discourse.

Thank you,
The LLVM Community

JoshuaGrindstaff · 2026-05-26T22:01:59Z

I have read those policies

github-actions · 2026-05-27T09:45:28Z

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:

git-clang-format --diff origin/main HEAD --extensions cpp,h -- llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
index 1786a4b49..6107563cd 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
@@ -14,13 +14,13 @@
 
 #include "AMDGPUMCResourceInfo.h"
 #include "AMDGPUTargetMachine.h"
+#include "GCNSubtarget.h"
 #include "Utils/AMDGPUBaseInfo.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/MC/MCAsmInfo.h"
 #include "llvm/MC/MCContext.h"
 #include "llvm/MC/MCSymbol.h"
 #include "llvm/Target/TargetMachine.h"
-#include "GCNSubtarget.h"
 
 #define DEBUG_TYPE "amdgpu-mc-resource-usage"
 
@@ -304,7 +304,8 @@ void MCResourceInfo::gatherResourceInfo(
   });
 
   const GCNSubtarget &ST = MF.getSubtarget<GCNSubtarget>();
-  auto [MaxAllowedVGPRs, MaxAllowedAGPRs] = ST.getMaxNumVectorRegs(MF.getFunction());
+  auto [MaxAllowedVGPRs, MaxAllowedAGPRs] =
+      ST.getMaxNumVectorRegs(MF.getFunction());
   auto SetMaxReg = [&](MCSymbol *MaxSym, int32_t numRegs,
                        ResourceInfoKind RIK) {
     if (!FRI.HasIndirectCall) {
@@ -315,17 +316,19 @@ void MCResourceInfo::gatherResourceInfo(
       MCSymbol *LocalNumSym = getSymbol(FnSym->getName(), RIK, OutContext);
       const MCExpr *RegExpr = AMDGPUMCExpr::createMax(
           {MCConstantExpr::create(numRegs, OutContext), SymRef}, OutContext);
-      if(RIK == RIK_NumVGPR) {
+      if (RIK == RIK_NumVGPR) {
         RegExpr = AMDGPUMCExpr::createMin(
-          {MCConstantExpr::create(MaxAllowedVGPRs, OutContext),RegExpr},OutContext);
-      }
-      else if (RIK == RIK_NumAGPR) {
+            {MCConstantExpr::create(MaxAllowedVGPRs, OutContext), RegExpr},
+            OutContext);
+      } else if (RIK == RIK_NumAGPR) {
         RegExpr = AMDGPUMCExpr::createMin(
-          {MCConstantExpr::create(MaxAllowedAGPRs, OutContext),RegExpr},OutContext);
+            {MCConstantExpr::create(MaxAllowedAGPRs, OutContext), RegExpr},
+            OutContext);
       }
       LocalNumSym->setVariableValue(RegExpr);
       LLVM_DEBUG(dbgs() << "MCResUse:   " << LocalNumSym->getName()
-                        << ": Indirect callee within, using minimum of module maximum and function maximum\n");
+                        << ": Indirect callee within, using minimum of module "
+                           "maximum and function maximum\n");
     }
   };
 
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h
index 33a8f5d21..369545fa7 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h
@@ -26,8 +26,9 @@ enum class LitModifier { None, Lit, Lit64 };
 ///   - max
 ///   - min
 ///
-/// \note If the 'or'/'max'/'min' operations are provided only a single argument, the
-/// operation will act as a no-op and simply resolve as the provided argument.
+/// \note If the 'or'/'max'/'min' operations are provided only a single
+/// argument, the operation will act as a no-op and simply resolve as the
+/// provided argument.
 ///
 class AMDGPUMCExpr : public MCTargetExpr {
 public:
@@ -87,10 +88,10 @@ public:
                                        MCContext &Ctx) {
     return create(VariantKind::AGVK_Max, Args, Ctx);
   }
-  static const AMDGPUMCExpr *createMin(ArrayRef<const MCExpr *> Args, 
-                                      MCContext &Ctx) {
+  static const AMDGPUMCExpr *createMin(ArrayRef<const MCExpr *> Args,
+                                       MCContext &Ctx) {
     return create(VariantKind::AGVK_Min, Args, Ctx);
-                                      }
+  }
 
   static const AMDGPUMCExpr *createExtraSGPRs(const MCExpr *VCCUsed,
                                               const MCExpr *FlatScrUsed,
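
The \note in AMDGPUMCExpr.h says that a single-argument 'or'/'max'/'min' resolves to its argument. A small sketch of that folding behavior for 'min', over plain integers rather than MCExpr operands, could look like this. `evalMin` is a hypothetical name, and the real evaluation code may be structured differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of an n-ary 'min': start from the first argument and
// fold the rest in. With exactly one argument the loop changes nothing, so
// the operation acts as a no-op.
inline int64_t evalMin(const std::vector<int64_t> &Args) {
  assert(!Args.empty() && "min needs at least one argument");
  int64_t Result = Args.front();
  for (int64_t Arg : Args)
    Result = std::min(Result, Arg);
  return Result;
}
```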

github-actions · 2026-05-27T10:13:51Z

🐧 Linux x64 Test Results

The build failed before running any tests. Click on a failure below to see the details.

lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUMCResourceInfo.cpp.o

FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUMCResourceInfo.cpp.o
sccache /opt/llvm/bin/clang++ -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GLIBCXX_USE_CXX11_ABI=1 -D_GNU_SOURCE -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_EXTENSIVE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/gha/actions-runner/_work/llvm-project/llvm-project/build/lib/Target/AMDGPU -I/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/lib/Target/AMDGPU -I/home/gha/actions-runner/_work/llvm-project/llvm-project/build/include -I/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/include -gmlt -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wno-pass-failed -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden -UNDEBUG -fno-exceptions -funwind-tables -fno-rtti -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUMCResourceInfo.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUMCResourceInfo.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUMCResourceInfo.cpp.o -c /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp:320:35: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
320 |           {MCConstantExpr::create(MaxAllowedVGPRs, OutContext),RegExpr},OutContext);
    |                                   ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp:307:9: note: 'MaxAllowedVGPRs' declared here
307 |   auto [MaxAllowedVGPRs, MaxAllowedAGPRs] = ST.getMaxNumVectorRegs(MF.getFunction());
    |         ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp:324:35: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
324 |           {MCConstantExpr::create(MaxAllowedAGPRs, OutContext),RegExpr},OutContext);
    |                                   ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp:307:26: note: 'MaxAllowedAGPRs' declared here
307 |   auto [MaxAllowedVGPRs, MaxAllowedAGPRs] = ST.getMaxNumVectorRegs(MF.getFunction());
    |                          ^
2 errors generated.

If these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the infrastructure label.
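
The two errors come from a lambda capturing names introduced by a structured binding, which Clang accepts only as a C++20 extension while the build uses -std=c++17 with -Werror. One possible way around this, sketched below, is to unpack the pair into ordinary locals before the lambda. `getMaxNumVectorRegsStub` and `capVGPRs` are hypothetical stand-ins with made-up limits, not LLVM APIs, and the fix eventually chosen for the PR may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for ST.getMaxNumVectorRegs(F); the values are made up.
inline std::pair<unsigned, unsigned> getMaxNumVectorRegsStub() {
  return {512u, 256u};
}

inline int32_t capVGPRs(int32_t NumRegs) {
  // Declaring `auto [MaxAllowedVGPRs, MaxAllowedAGPRs] = ...;` is valid
  // C++17, but referring to those names inside a lambda is what triggers
  // -Wc++20-extensions. An ordinary local can be captured in C++17.
  const std::pair<unsigned, unsigned> Limits = getMaxNumVectorRegsStub();
  const unsigned MaxAllowedVGPRs = Limits.first;

  auto Cap = [&](int32_t Regs) {
    assert(Regs >= 0 && "register counts are non-negative");
    return std::min(static_cast<int32_t>(MaxAllowedVGPRs), Regs);
  };
  return Cap(NumRegs);
}
```

The same pattern would apply to the AGPR limit via `Limits.second`.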

JoshuaGrindstaff added 2 commits May 26, 2026 13:05

Added min operation for AMDGPUMCExprs

51b655b

Changed SetMaxReg calculation to cap at function max

cfd8dc1

JoshuaGrindstaff wants to merge 2 commits into llvm:main from JoshuaGrindstaff:capping-registers-for-indirect-calls
