-
Notifications
You must be signed in to change notification settings - Fork 6.3k
8382713: [VectorAPI] Perform late inlining of failed vector intrinsics #30876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 7 commits
c7e6fce
931d45e
e779a2f
5f46f5b
d18bd2a
8171911
679e444
dc3ffe8
77150d9
ae4a373
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -47,16 +47,16 @@ public static void main(String[] args) { | |
| public int call() { return 1; } | ||
|
|
||
| @Test | ||
| @IR(failOn = {IRNode.CMP_I, IRNode.CMOVE_I}) | ||
| @IR(failOn = {IRNode.CMP_I, IRNode.CMOVE_I}, applyIf = {"IncrementalInlineVector", "false"}) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does it mean that the rule is disabled unless the test is explicitly run with Instead, why don't you explicitly run the test with
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done, I am now passing -XX:-IncrementalInlineVector to test invocation. |
||
| @IR(counts = {IRNode.VECTOR_TEST, "1"}) | ||
| public int branch(long maskLong) { | ||
| var mask = VectorMask.fromLong(ByteVector.SPECIES_PREFERRED, maskLong); | ||
| return mask.allTrue() ? call() : 0; | ||
| } | ||
|
|
||
| @Test | ||
| @IR(failOn = {IRNode.CMP_I}) | ||
| @IR(counts = {IRNode.VECTOR_TEST, "1", IRNode.CMOVE_I, "1"}) | ||
| @IR(failOn = {IRNode.CMP_I}, applyIf = {"IncrementalInlineVector", "false"}) | ||
| @IR(counts = {IRNode.VECTOR_TEST, "1", IRNode.CMOVE_I, "1"}, applyIf = {"IncrementalInlineVector", "false"}) | ||
| public int cmove(long maskLong) { | ||
| var mask = VectorMask.fromLong(ByteVector.SPECIES_PREFERRED, maskLong); | ||
| return mask.allTrue() ? 1 : 0; | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1294,7 +1294,7 @@ public static void testCompareMaskNotDoubleNegative() { | |
| public static void main(String[] args) { | ||
| TestFramework testFramework = new TestFramework(); | ||
| testFramework.setDefaultWarmup(5000) | ||
| .addFlags("--add-modules=jdk.incubator.vector") | ||
| .addFlags("--add-modules=jdk.incubator.vector", "-XX:InlineSmallCode=100000") | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. With @ForceInline over AbstractMask::intoArray test passes with "-ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation" but fails with default option due to difference in inlining With -XX:InlineSmallCode=1000000 it passes with all the configurations.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please, elaborate where it fails. Does
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I investigated if further here is my analysis Adding @ForceInline to AbstractMask::intoArray is desirable for vector intrinsic inlining, but it exposes a pre-existing bug in C2's switch profiling. The bug is in Parse::do_tableswitch() in parse2.cpp: when a mature MDO has all-zero MultiBranchData counts, merge_ranges() marks every arm as never_reached, and jump_switch_ranges() collapses the entire switch to a single unstable_if trap. The parser should treat this as "no useful profile" (fall back to cnt = 1.0F), not "every arm is cold." I confirmed this analysis by passing -XX:-TieredCompilation or -XX:-UseSwitchProfiling — the test passes with either flag. This profiling issue is orthogonal to the vector intrinsic late inlining work and should be addressed in a separate PR. For now, @ForceInline on AbstractMask::intoArray is not added and -XX:InlineSmallCode=1000000 is added to the failing test as a workaround
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for the details. Hm, that doesn't sound right. There's no support for caller-sensitive profiling yet, so each method profile data is stored in a dedicated per-method MDO instance. (There are deoptimization counts which may depend on inlining, but regular branch counts should not be affected.) Anyway, let's continue investigating it separately.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have filed a follow up JBS for this https://bugs.openjdk.org/browse/JDK-8385134 |
||
| .start(); | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't you apply regular inlining analysis here (resides in
Compile::call_generator) to decide whether fallback implementation should be inlined or not?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done