Add post-mono MIR optimizations #156858

Open: cjgillot wants to merge 11 commits into rust-lang:main from cjgillot:post-mono-mir-opts

@cjgillot (Contributor) commented May 23, 2026

This is mostly a rebase of #131650 by @saethlin.

MIR optimizations are limited because they run on polymorphic code: they cannot know every concrete type, nor its layout.

To work around this limitation, @saethlin added a MIR traversal which monomorphizes on the fly (#121421). We also already have a pass (#139088) which is explicitly waiting for post-mono MIR passes to happen.
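To illustrate the limitation described above, here is a minimal sketch (not taken from the PR) of a branch that polymorphic MIR cannot resolve. The function `strategy` and its 8-byte threshold are hypothetical; the point is only that `size_of::<T>()` becomes a known constant once `T` is substituted, so a pass running on monomorphic MIR could in principle discard the dead arm.

```rust
use std::mem::size_of;

/// Hypothetical helper: picks a strategy based on the size of `T`.
/// While `T` is still generic, `size_of::<T>()` is not a known constant,
/// so both arms of the branch have to be kept.
fn strategy<T>() -> &'static str {
    if size_of::<T>() <= 8 {
        "by-value"
    } else {
        "by-reference"
    }
}

fn main() {
    // After monomorphization each instance has a fixed size, so only one
    // arm is reachable per instance.
    println!("{}", strategy::<u32>());
    println!("{}", strategy::<[u64; 4]>());
}
```

For `strategy::<u32>` the condition is `4 <= 8`, and for `strategy::<[u64; 4]>` it is `32 <= 8`, so each instance needs only one of the two arms.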

This PR creates a `build_codegen_mir` query. That query has a peculiar `Steal<Cow<'tcx, Body<'tcx>>>` return type. This allows reusing `optimized_mir` when the body is already monomorphic, and freeing the memory when we do need to clone it. Even with this device, we still have a sizeable max-rss regression.
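The borrow-or-clone idea behind that return type can be sketched with standard-library types only. Everything below is a toy model under stated assumptions: `Body`, `Steal`, and `build_codegen_body` are simplified stand-ins written for this example, not rustc's actual types or the PR's implementation, and the string replacement merely imitates type substitution.

```rust
use std::borrow::Cow;
use std::cell::RefCell;

/// Toy stand-in for a MIR body (hypothetical, not rustc's `Body`).
#[derive(Clone, Debug, PartialEq)]
struct Body {
    stmts: Vec<String>,
    is_polymorphic: bool,
}

/// Toy stand-in for a "steal" cell: the value can be taken out exactly once.
struct Steal<T>(RefCell<Option<T>>);

impl<T> Steal<T> {
    fn new(value: T) -> Self {
        Steal(RefCell::new(Some(value)))
    }

    /// Takes ownership of the value; panics if it was already stolen.
    fn steal(&self) -> T {
        self.0.borrow_mut().take().expect("value already stolen")
    }

    fn is_stolen(&self) -> bool {
        self.0.borrow().is_none()
    }
}

/// Hypothetical analogue of the query: borrow the optimized body when it is
/// already monomorphic, otherwise clone it and substitute the concrete type.
fn build_codegen_body<'a>(optimized: &'a Body, ty: &str) -> Steal<Cow<'a, Body>> {
    if !optimized.is_polymorphic {
        return Steal::new(Cow::Borrowed(optimized));
    }
    let mut body = optimized.clone();
    for stmt in &mut body.stmts {
        // Crude imitation of substituting the generic parameter `T`.
        *stmt = stmt.replace("T", ty);
    }
    body.is_polymorphic = false;
    Steal::new(Cow::Owned(body))
}

fn main() {
    let mono = Body { stmts: vec!["_0 = const 1_u8".to_string()], is_polymorphic: false };
    let poly = Body { stmts: vec!["_0 = size_of::<T>()".to_string()], is_polymorphic: true };

    let shared = build_codegen_body(&mono, "u32");
    let cloned = build_codegen_body(&poly, "u32");

    // The consumer steals the body, so an owned clone is freed as soon as
    // the consumer drops it instead of staying cached.
    let body = cloned.steal();
    assert!(matches!(body, Cow::Owned(_)));
    assert!(cloned.is_stolen());
    assert!(matches!(shared.steal(), Cow::Borrowed(_)));
}
```

In this model the monomorphic case costs no extra memory because the `Cow` only borrows, while the polymorphic case pays for one clone that lives until the stolen value is dropped.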

All this allows removing just-in-time monomorphization from codegen code. Follow-up PRs can try migrating transforms that happen at codegen time to post-mono MIR passes.

@rustbot rustbot added A-attributes Area: Attributes (`#[…]`, `#![…]`) S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-clippy Relevant to the Clippy team. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 23, 2026
@cjgillot (Contributor, Author)

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 23, 2026
rust-bors Bot pushed a commit that referenced this pull request May 23, 2026

rust-bors Bot commented May 23, 2026

☀️ Try build successful (CI)
Build commit: c40ae76 (c40ae76fdfbb0d687aafd24fdcd2354ede04422c, parent: 54333ff079780f803f65dcee30c544050b35f544)

@rust-timer (Collaborator)

Finished benchmarking commit (c40ae76): comparison URL.

Overall result: ❌✅ regressions and improvements - please read:

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf.

Next, please: If you can, justify the regressions found in this try perf run in writing along with @rustbot label: +perf-regression-triaged. If not, fix the regressions and do another perf run. Neutral or positive results will clear the label automatically.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.9% | [0.1%, 3.4%] | 80 |
| Regressions ❌ (secondary) | 0.7% | [0.1%, 3.1%] | 70 |
| Improvements ✅ (primary) | -0.4% | [-0.7%, -0.3%] | 4 |
| Improvements ✅ (secondary) | -0.4% | [-1.4%, -0.0%] | 6 |
| All ❌✅ (primary) | 0.8% | [-0.7%, 3.4%] | 84 |

Max RSS (memory usage)

Results (primary 11.7%, secondary 2.8%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 12.1% | [0.8%, 53.9%] | 48 |
| Regressions ❌ (secondary) | 3.3% | [0.7%, 6.7%] | 18 |
| Improvements ✅ (primary) | -8.1% | [-8.1%, -8.1%] | 1 |
| Improvements ✅ (secondary) | -5.8% | [-5.8%, -5.8%] | 1 |
| All ❌✅ (primary) | 11.7% | [-8.1%, 53.9%] | 49 |

Cycles

Results (primary 2.7%, secondary 2.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.6% | [1.8%, 5.5%] | 15 |
| Regressions ❌ (secondary) | 3.4% | [1.9%, 4.4%] | 10 |
| Improvements ✅ (primary) | -4.1% | [-4.6%, -3.6%] | 2 |
| Improvements ✅ (secondary) | -3.3% | [-4.5%, -2.1%] | 2 |
| All ❌✅ (primary) | 2.7% | [-4.6%, 5.5%] | 17 |

Binary size

Results (primary -0.2%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.0%, 0.6%] | 15 |
| Regressions ❌ (secondary) | 0.3% | [0.0%, 0.5%] | 5 |
| Improvements ✅ (primary) | -0.3% | [-3.0%, -0.0%] | 46 |
| Improvements ✅ (secondary) | -0.1% | [-0.4%, -0.0%] | 34 |
| All ❌✅ (primary) | -0.2% | [-3.0%, 0.6%] | 61 |

Bootstrap: 510.282s -> 521.256s (2.15%)
Artifact size: 400.55 MiB -> 398.41 MiB (-0.53%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels May 23, 2026
@jieyouxu jieyouxu self-assigned this May 23, 2026
@cjgillot cjgillot force-pushed the post-mono-mir-opts branch from 73c6c56 to c885a8d Compare May 23, 2026 22:00
@cjgillot (Contributor, Author)

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 23, 2026
rust-bors Bot pushed a commit that referenced this pull request May 23, 2026

rust-bors Bot commented May 24, 2026

☀️ Try build successful (CI)
Build commit: 0a3713a (0a3713a7df23eb1f82606bf484689d5bf5886931, parent: 54333ff079780f803f65dcee30c544050b35f544)

@rust-timer (Collaborator)

Finished benchmarking commit (0a3713a): comparison URL.

Overall result: ❌✅ regressions and improvements - please read:

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf.

Next, please: If you can, justify the regressions found in this try perf run in writing along with @rustbot label: +perf-regression-triaged. If not, fix the regressions and do another perf run. Neutral or positive results will clear the label automatically.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.0% | [0.2%, 16.5%] | 27 |
| Regressions ❌ (secondary) | 0.3% | [0.1%, 0.7%] | 16 |
| Improvements ✅ (primary) | -0.4% | [-0.8%, -0.1%] | 17 |
| Improvements ✅ (secondary) | -0.4% | [-0.7%, -0.0%] | 8 |
| All ❌✅ (primary) | 0.5% | [-0.8%, 16.5%] | 44 |

Max RSS (memory usage)

Results (primary 7.3%, secondary 1.9%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 8.4% | [1.0%, 36.4%] | 35 |
| Regressions ❌ (secondary) | 2.5% | [0.8%, 4.2%] | 8 |
| Improvements ✅ (primary) | -2.5% | [-7.8%, -0.7%] | 4 |
| Improvements ✅ (secondary) | -2.7% | [-2.7%, -2.7%] | 1 |
| All ❌✅ (primary) | 7.3% | [-7.8%, 36.4%] | 39 |

Cycles

Results (primary 2.0%, secondary 4.8%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 5.4% | [1.7%, 17.4%] | 5 |
| Regressions ❌ (secondary) | 4.8% | [3.5%, 7.0%] | 5 |
| Improvements ✅ (primary) | -3.5% | [-4.6%, -2.3%] | 3 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.0% | [-4.6%, 17.4%] | 8 |

Binary size

Results (primary -0.2%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.0%, 0.6%] | 16 |
| Regressions ❌ (secondary) | 0.3% | [0.0%, 0.5%] | 5 |
| Improvements ✅ (primary) | -0.3% | [-3.0%, -0.0%] | 46 |
| Improvements ✅ (secondary) | -0.1% | [-0.4%, -0.0%] | 34 |
| All ❌✅ (primary) | -0.2% | [-3.0%, 0.6%] | 62 |

Bootstrap: 510.282s -> 514.101s (0.75%)
Artifact size: 400.55 MiB -> 398.40 MiB (-0.54%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 24, 2026

rust-bors Bot commented May 24, 2026

Try build cancelled. Cancelled workflows:

@cjgillot cjgillot force-pushed the post-mono-mir-opts branch from 17bf2a0 to 92dcb6a Compare May 24, 2026 08:49
@cjgillot (Contributor, Author)

@bors try @rust-timer queue

rust-bors Bot pushed a commit that referenced this pull request May 24, 2026

@cjgillot (Contributor, Author)

@bors try cancel

rust-bors Bot commented May 24, 2026

Try build cancelled. Cancelled workflows:

@cjgillot cjgillot force-pushed the post-mono-mir-opts branch from 6c4bb91 to 9aed921 Compare May 24, 2026 15:36
@cjgillot (Contributor, Author)

@bors try @rust-timer queue

rust-bors Bot pushed a commit that referenced this pull request May 24, 2026

rust-bors Bot commented May 25, 2026

☀️ Try build successful (CI)
Build commit: 4bf97ca (4bf97ca594075e17597a8893e14d1c22d4122f60, parent: 609b8c5cefb3932bbaf4497cb7f9195ca8a1eab6)

@rust-timer (Collaborator)

Finished benchmarking commit (4bf97ca): comparison URL.

Overall result: ❌ regressions - please read:

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf.

Next, please: If you can, justify the regressions found in this try perf run in writing along with @rustbot label: +perf-regression-triaged. If not, fix the regressions and do another perf run. Neutral or positive results will clear the label automatically.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.5% | [0.1%, 2.5%] | 60 |
| Regressions ❌ (secondary) | 0.5% | [0.1%, 1.2%] | 24 |
| Improvements ✅ (primary) | -0.3% | [-0.7%, -0.1%] | 8 |
| Improvements ✅ (secondary) | -0.3% | [-0.4%, -0.1%] | 4 |
| All ❌✅ (primary) | 0.4% | [-0.7%, 2.5%] | 68 |

Max RSS (memory usage)

Results (primary 8.1%, secondary 1.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 8.1% | [0.8%, 26.1%] | 41 |
| Regressions ❌ (secondary) | 3.2% | [2.2%, 6.4%] | 6 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -4.3% | [-6.8%, -1.7%] | 2 |
| All ❌✅ (primary) | 8.1% | [0.8%, 26.1%] | 41 |

Cycles

Results (primary 2.8%, secondary -0.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.8% | [2.1%, 3.3%] | 9 |
| Regressions ❌ (secondary) | 2.4% | [2.1%, 2.6%] | 5 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -16.1% | [-16.1%, -16.1%] | 1 |
| All ❌✅ (primary) | 2.8% | [2.1%, 3.3%] | 9 |

Binary size

Results (primary -0.2%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.0%, 0.5%] | 13 |
| Regressions ❌ (secondary) | 0.3% | [0.0%, 0.4%] | 5 |
| Improvements ✅ (primary) | -0.5% | [-3.5%, -0.0%] | 21 |
| Improvements ✅ (secondary) | -0.2% | [-0.6%, -0.0%] | 20 |
| All ❌✅ (primary) | -0.2% | [-3.5%, 0.5%] | 34 |

Bootstrap: 508.631s -> 519.036s (2.05%)
Artifact size: 400.68 MiB -> 400.55 MiB (-0.03%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 25, 2026
@cjgillot cjgillot marked this pull request as ready for review May 25, 2026 07:20
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 25, 2026

rustbot commented May 25, 2026

The Clippy subtree was changed

cc @rust-lang/clippy

Some changes occurred in coverage instrumentation.

cc @Zalathar

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

This PR changes MIR

cc @oli-obk, @RalfJung, @JakobDegen, @vakaras

Some changes occurred in compiler/rustc_attr_parsing

cc @jdonszelmann, @JonathanBrouwer

The Cranelift subtree was changed

cc @bjorn3

Some changes occurred in compiler/rustc_hir/src/attrs

cc @jdonszelmann, @JonathanBrouwer

@rustbot rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label May 25, 2026
@cjgillot (Contributor, Author)

The last perf run halves the max-rss regression; I'm not sure how much more I can do. I welcome ideas.

oli-obk commented May 25, 2026

Some instruction regressions are time improvements.

Most others that I looked at have most of their regression in LLVM, which can be anything like

  • better opts
  • having to untangle things
  • just LTO noise due to changes in cgu size

Pretty sure regex has benchmarks, maybe run those with the two rustcs used in the latest perf run?

@saethlin (Member)

I don't think a few percent of icount is that concerning next to a 25% increase in memory use of optimized builds. I think we've both tried to massage the implementation to get that number down and I'm still shocked by how large it is despite our efforts. Do we have a memory profiling strategy that could identify the cause?

At work I've used strace or perf's syscall tracing to find mmap calls then converted them into the inferno crate's folded stacks format. I'm not sure I've tried that on the compiler, maybe it works?
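As a rough sketch of the last step of that approach, the snippet below aggregates allocation events into folded-stack lines, where each line is a semicolon-separated call stack (outermost frame first) followed by a space and a count. It assumes the `mmap` calls have already been extracted from the trace into `(stack, bytes)` records; the `MmapEvent` type, the `fold_stacks` function, and the frame names are hypothetical and only serve the illustration.

```rust
use std::collections::BTreeMap;

/// One hypothetical mmap event recovered from a syscall trace:
/// the call stack (outermost frame first) and the number of bytes mapped.
struct MmapEvent<'a> {
    stack: Vec<&'a str>,
    bytes: u64,
}

/// Sums mapped bytes per distinct stack and renders one folded line per
/// stack, in the form `frame1;frame2;frame3 <bytes>`.
fn fold_stacks(events: &[MmapEvent<'_>]) -> Vec<String> {
    let mut totals: BTreeMap<String, u64> = BTreeMap::new();
    for event in events {
        *totals.entry(event.stack.join(";")).or_insert(0) += event.bytes;
    }
    totals
        .into_iter()
        .map(|(stack, bytes)| format!("{stack} {bytes}"))
        .collect()
}

fn main() {
    // Frame names below are made up for the example.
    let events = vec![
        MmapEvent { stack: vec!["main", "codegen", "mmap"], bytes: 4096 },
        MmapEvent { stack: vec!["main", "typeck", "mmap"], bytes: 8192 },
        MmapEvent { stack: vec!["main", "codegen", "mmap"], bytes: 4096 },
    ];
    for line in fold_stacks(&events) {
        println!("{line}");
    }
}
```

The resulting lines can then be fed to a flame graph renderer that accepts folded stacks, which would show which call paths account for most of the mapped memory.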

Labels

A-attributes Area: Attributes (`#[…]`, `#![…]`) perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-clippy Relevant to the Clippy team. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

7 participants