[OpenMP] device Xteamr: Clean up template parameters #1246
ro-i wants to merge 1 commit into amd-staging
Remove the wave number and wave size template parameters from the entry points of the device xteam reduction functions. Replace the wave size parameter with a call to `__gpu_num_lanes()`, which is optimized out during compilation. Replace the wave number parameter with the constant `32`, which is a safe upper bound for its current uses: the array size needs to be constant, the maximum number of threads is 1024, and the minimum wave size is 32, so there are at most 1024 / 32 = 32 waves.
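To illustrate the bound, here is a minimal host-side sketch, not the actual device code. All names (`kMaxNumWaves`, `team_sum`, and so on) are hypothetical, and the `wave_size` argument merely stands in for the value that device code would get from `__gpu_num_lanes()`. It only shows why a fixed array of 32 per-wave slots is enough for any wave size of at least 32 and at most 1024 threads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Limits quoted in the PR description (names are hypothetical).
constexpr unsigned kMaxThreadsPerTeam = 1024; // max number of threads
constexpr unsigned kMinWaveSize = 32;         // min wave size
constexpr unsigned kMaxNumWaves = 32;         // constant replacing the template parameter

// With the smallest wave size, 1024 threads form exactly 32 waves.
static_assert(kMaxThreadsPerTeam / kMinWaveSize <= kMaxNumWaves,
              "32 slots must cover the worst case");

// Host-side stand-in for a team-level sum. `wave_size` is passed in here;
// on the device it would come from __gpu_num_lanes() (assumption).
inline int team_sum(const std::vector<int> &thread_vals, unsigned wave_size) {
  assert(wave_size >= kMinWaveSize);
  assert(thread_vals.size() <= kMaxThreadsPerTeam);

  // Fixed-size scratch array: its size no longer depends on the wave size.
  int wave_partials[kMaxNumWaves] = {};
  const std::size_t num_waves =
      (thread_vals.size() + wave_size - 1) / wave_size;
  assert(num_waves <= kMaxNumWaves);

  // Step 1: each "wave" accumulates its own partial sum.
  for (std::size_t i = 0; i < thread_vals.size(); ++i)
    wave_partials[i / wave_size] += thread_vals[i];

  // Step 2: combine the per-wave partial sums.
  int total = 0;
  for (std::size_t w = 0; w < num_waves; ++w)
    total += wave_partials[w];
  return total;
}
```

With a wave size of 64, only 16 of the 32 slots are used, so the constant wastes a little space but never overflows.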
(PSDB is going to fail at least due to the not-yet-adapted smoke-limbo tests, see ROCm/aomp#1895)
Closed in favor of #1691