[AMDGPU] Capping max number of registers for indirect calls#199765

Draft
JoshuaGrindstaff wants to merge 2 commits into llvm:main from JoshuaGrindstaff:capping-registers-for-indirect-calls

@JoshuaGrindstaff JoshuaGrindstaff commented May 26, 2026

Depends on #199746

Changed the SetMaxReg lambda to cap its result at ST.getMaxNumVectorRegs(F) when calculating the maximum register use for each kernel that contains indirect function calls. Before this change, the calculation chose the maximum over all functions' maxima within the module, instead of only those within the scope of the indirect calls.

This fixes an inflated VGPR/AGPR count for kernels that have indirect function calls. The inflation has led some kernels with indirect calls to exceed the VGPR/AGPR limit and crash.
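
As a rough illustration of the capping described above, here is a sketch with plain integers standing in for the symbolic MCExpr nodes. The function name `cappedRegCount`, its parameters, and the numbers in the note below are hypothetical and are not taken from the patch.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the min/max expression built for a function that
// contains an indirect call.
//   localRegs:  registers used by the function itself
//   moduleMax:  maximum over the functions in the module
//   maxAllowed: per-function limit, as ST.getMaxNumVectorRegs(F) would report
int32_t cappedRegCount(int32_t localRegs, int32_t moduleMax,
                       int32_t maxAllowed) {
  // Previous behaviour: max(localRegs, moduleMax).
  // With the cap, the result can no longer exceed maxAllowed.
  return std::min(maxAllowed, std::max(localRegs, moduleMax));
}
```

With an assumed limit of 512, a local count of 40 and a module maximum of 600, this yields 512 instead of 600.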

@github-actions

Hello @JoshuaGrindstaff 👋

Thank you for submitting a Pull Request (PR) to the LLVM Project. Since this is your first PR, here are a few useful links covering our main contribution policies and review practices.

  • All contributions to LLVM must follow our LLVM AI Tool Use Policy. In particular, if you used AI while working on this PR, remember to add a note to the PR description.
  • The LLVM Code-Review Policy and Practices document contains practical information about the PR process, including how patches are reviewed and accepted, and who can review a PR.
  • Our LLVM Developer Policy describes our expectations for code quality, commit summaries and contains notes on our CI system.

Please reply to this message to confirm that you have read these policies, especially the LLVM AI Tool Use Policy, and that any AI tool usage has been noted in the PR description.


Frequently asked questions

How do I add reviewers?

This PR will be automatically labeled, and the relevant teams will be notified. For some parts of the project, reviewers may also be added automatically.

You can also add reviewers manually using the Reviewers section on this page. If you cannot use that section, it is probably because you do not have write permissions for the repository. In that case, you can request a review by tagging reviewers in a comment using @ followed by their GitHub username.

What if there are no comments?

If you have not received any comments on your PR after a week, you can request a review by pinging the PR with a comment such as “Ping”. The common courtesy ping rate is once a week. Please remember that you are asking for volunteer time from other developers.

Are any special GitHub settings required to contribute to LLVM?

We only require contributors to have a public email address associated with their GitHub commits, see this section of LLVM Developer Policy for details.


If you have questions, feel free to leave a comment on this PR, or ask on LLVM Discord or LLVM Discourse.

Thank you,
The LLVM Community

@JoshuaGrindstaff
Author

I have read those policies

@github-actions

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions cpp,h -- llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
index 1786a4b49..6107563cd 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
@@ -14,13 +14,13 @@
 
 #include "AMDGPUMCResourceInfo.h"
 #include "AMDGPUTargetMachine.h"
+#include "GCNSubtarget.h"
 #include "Utils/AMDGPUBaseInfo.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/MC/MCAsmInfo.h"
 #include "llvm/MC/MCContext.h"
 #include "llvm/MC/MCSymbol.h"
 #include "llvm/Target/TargetMachine.h"
-#include "GCNSubtarget.h"
 
 #define DEBUG_TYPE "amdgpu-mc-resource-usage"
 
@@ -304,7 +304,8 @@ void MCResourceInfo::gatherResourceInfo(
   });
 
   const GCNSubtarget &ST = MF.getSubtarget<GCNSubtarget>();
-  auto [MaxAllowedVGPRs, MaxAllowedAGPRs] = ST.getMaxNumVectorRegs(MF.getFunction());
+  auto [MaxAllowedVGPRs, MaxAllowedAGPRs] =
+      ST.getMaxNumVectorRegs(MF.getFunction());
   auto SetMaxReg = [&](MCSymbol *MaxSym, int32_t numRegs,
                        ResourceInfoKind RIK) {
     if (!FRI.HasIndirectCall) {
@@ -315,17 +316,19 @@ void MCResourceInfo::gatherResourceInfo(
       MCSymbol *LocalNumSym = getSymbol(FnSym->getName(), RIK, OutContext);
       const MCExpr *RegExpr = AMDGPUMCExpr::createMax(
           {MCConstantExpr::create(numRegs, OutContext), SymRef}, OutContext);
-      if(RIK == RIK_NumVGPR) {
+      if (RIK == RIK_NumVGPR) {
         RegExpr = AMDGPUMCExpr::createMin(
-          {MCConstantExpr::create(MaxAllowedVGPRs, OutContext),RegExpr},OutContext);
-      }
-      else if (RIK == RIK_NumAGPR) {
+            {MCConstantExpr::create(MaxAllowedVGPRs, OutContext), RegExpr},
+            OutContext);
+      } else if (RIK == RIK_NumAGPR) {
         RegExpr = AMDGPUMCExpr::createMin(
-          {MCConstantExpr::create(MaxAllowedAGPRs, OutContext),RegExpr},OutContext);
+            {MCConstantExpr::create(MaxAllowedAGPRs, OutContext), RegExpr},
+            OutContext);
       }
       LocalNumSym->setVariableValue(RegExpr);
       LLVM_DEBUG(dbgs() << "MCResUse:   " << LocalNumSym->getName()
-                        << ": Indirect callee within, using minimum of module maximum and function maximum\n");
+                        << ": Indirect callee within, using minimum of module "
+                           "maximum and function maximum\n");
     }
   };
 
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h
index 33a8f5d21..369545fa7 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h
@@ -26,8 +26,9 @@ enum class LitModifier { None, Lit, Lit64 };
 ///   - max
 ///   - min
 ///
-/// \note If the 'or'/'max'/'min' operations are provided only a single argument, the
-/// operation will act as a no-op and simply resolve as the provided argument.
+/// \note If the 'or'/'max'/'min' operations are provided only a single
+/// argument, the operation will act as a no-op and simply resolve as the
+/// provided argument.
 ///
 class AMDGPUMCExpr : public MCTargetExpr {
 public:
@@ -87,10 +88,10 @@ public:
                                        MCContext &Ctx) {
     return create(VariantKind::AGVK_Max, Args, Ctx);
   }
-  static const AMDGPUMCExpr *createMin(ArrayRef<const MCExpr *> Args, 
-                                      MCContext &Ctx) {
+  static const AMDGPUMCExpr *createMin(ArrayRef<const MCExpr *> Args,
+                                       MCContext &Ctx) {
     return create(VariantKind::AGVK_Min, Args, Ctx);
-                                      }
+  }
 
   static const AMDGPUMCExpr *createExtraSGPRs(const MCExpr *VCCUsed,
                                               const MCExpr *FlatScrUsed,

@github-actions

🐧 Linux x64 Test Results

The build failed before running any tests. Click on a failure below to see the details.

lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUMCResourceInfo.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUMCResourceInfo.cpp.o
sccache /opt/llvm/bin/clang++ -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GLIBCXX_USE_CXX11_ABI=1 -D_GNU_SOURCE -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_EXTENSIVE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/gha/actions-runner/_work/llvm-project/llvm-project/build/lib/Target/AMDGPU -I/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/lib/Target/AMDGPU -I/home/gha/actions-runner/_work/llvm-project/llvm-project/build/include -I/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/include -gmlt -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wno-pass-failed -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden -UNDEBUG -fno-exceptions -funwind-tables -fno-rtti -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUMCResourceInfo.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUMCResourceInfo.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUMCResourceInfo.cpp.o -c /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp:320:35: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
320 |           {MCConstantExpr::create(MaxAllowedVGPRs, OutContext),RegExpr},OutContext);
|                                   ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp:307:9: note: 'MaxAllowedVGPRs' declared here
307 |   auto [MaxAllowedVGPRs, MaxAllowedAGPRs] = ST.getMaxNumVectorRegs(MF.getFunction());
|         ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp:324:35: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
324 |           {MCConstantExpr::create(MaxAllowedAGPRs, OutContext),RegExpr},OutContext);
|                                   ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp:307:26: note: 'MaxAllowedAGPRs' declared here
307 |   auto [MaxAllowedVGPRs, MaxAllowedAGPRs] = ST.getMaxNumVectorRegs(MF.getFunction());
|                          ^
2 errors generated.

If these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the infrastructure label.
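
The two errors above come from a lambda capturing names introduced by a structured binding, which is not allowed in C++17 (the build uses -std=c++17). One possible way around it, sketched here under assumptions, is to unpack the result into ordinary local variables before the lambda. `getLimits` and `clampToVGPRLimit` are hypothetical stand-ins, and the pair return type is assumed from the two-name binding shown in the log.

```cpp
#include <algorithm>
#include <utility>

// Hypothetical stand-in for ST.getMaxNumVectorRegs(F); the values are made up.
std::pair<unsigned, unsigned> getLimits() { return {512u, 256u}; }

unsigned clampToVGPRLimit(unsigned Requested) {
  // Use named locals instead of `auto [MaxVGPRs, MaxAGPRs] = ...` so the
  // lambda below captures plain variables, which C++17 permits.
  std::pair<unsigned, unsigned> Limits = getLimits();
  unsigned MaxVGPRs = Limits.first;
  unsigned MaxAGPRs = Limits.second;
  (void)MaxAGPRs; // unused in this sketch

  auto Clamp = [&](unsigned N) { return std::min(N, MaxVGPRs); };
  return Clamp(Requested);
}
```

Whether the patch should use this form, `std::tie`, or an init-capture is a style choice for the author and reviewers; the sketch only shows that the capture itself is the part C++17 rejects.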
