
Enable Big/Super Pages for v3d #7285

Open
mairacanal wants to merge 2 commits into raspberrypi:rpi-6.18.y from mairacanal:v3d/downstream/super-pages-512k
Conversation

@mairacanal
Contributor

Considering that the Raspberry Pi is an embedded device with limited memory, memory fragmentation is an important aspect for performance. Using Big/Super Pages has clear benefits when it comes to reducing TLB misses, but also has an impact on memory fragmentation as we need to allocate aligned contiguous memory, increasing compaction pressure and memory waste for small BOs.

As Big/Super Pages only have benefits for larger BOs, create a minimum BO size to use the THP partition. After testing different thresholds, 512KB provides the most balanced results with clear improvements and no significant regressions. This means that Big/Super Pages will only be used for BOs of at least 512KB.
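
For illustration, the size gate described above can be sketched in a few lines. This is a minimal sketch and not the driver code: the function names are hypothetical, and the 64 KiB page size in the waste example is an assumption chosen only to show why small BOs are penalised by aligned allocations.

```python
# Minimal sketch of the size gate from the description; not the v3d code.
# All names are hypothetical. Only the 512KB threshold comes from the PR.
SZ_512K = 512 * 1024


def use_huge_page_mount(bo_size, threshold=SZ_512K):
    """Return True if a BO of bo_size bytes would use the THP partition."""
    return bo_size >= threshold


def alignment_waste(bo_size, page_size):
    """Bytes lost when bo_size is rounded up to a multiple of page_size."""
    aligned = -(-bo_size // page_size) * page_size  # ceiling division
    return aligned - bo_size


for size in (16 * 1024, 256 * 1024, 512 * 1024, 4 * 1024 * 1024):
    print(f"{size:>8} bytes -> THP partition: {use_huge_page_mount(size)}")

# Assumed 64 KiB page size, purely illustrative: a 20 KiB BO wastes 44 KiB.
print(alignment_waste(20 * 1024, 64 * 1024) // 1024, "KiB wasted")
```

The threshold is inclusive, matching "BOs of at least 512KB".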

Here are some benchmark results. Each trace has been run twice to gather the results.

[image: benchmark results]

@popcornmix
Collaborator

I have run various 3D-related benchmarks to check the effect of these settings. 5 runs of each.
THP=1 is TRANSPARENT_HUGEPAGE_MADVISE=y
THP=2 is TRANSPARENT_HUGEPAGE_ALWAYS=y
BIG_V3D=1 is the "Only use Big/Super Pages if BO >= 512KB" commit

The percentages are the change from Default. An ! means statistically significant.
I'm struggling to find a statistically significant benefit.
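
For anyone wanting to reproduce the percentage column, here is a minimal sketch. The percent change is the relative difference of the means. The thread does not say which significance test produced the `!` markers, so the exact two-sided permutation test below is only one possible choice and is an assumption, as are the function names. The sample data is the aqua500 Default and THP=1 runs listed further down.

```python
from itertools import combinations
from statistics import mean


def percent_change(baseline, candidate):
    """Relative change of the candidate mean versus the baseline mean, in %."""
    return 100.0 * (mean(candidate) - mean(baseline)) / mean(baseline)


def permutation_p_value(baseline, candidate):
    """Exact two-sided permutation test on the difference of means.

    Assumption: this is NOT necessarily the test used in the thread.
    It enumerates every way of splitting the pooled runs into two groups
    of the original sizes and counts splits at least as extreme as observed.
    """
    pooled = list(baseline) + list(candidate)
    n = len(baseline)
    m = len(pooled) - n
    total = sum(pooled)
    observed = abs(mean(candidate) - mean(baseline))
    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), n):
        a_sum = sum(pooled[i] for i in idx)
        diff = abs(a_sum / n - (total - a_sum) / m)
        if diff >= observed - 1e-9:  # tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count


# aqua500 runs copied from the results in this comment.
default = [705, 718, 721, 704, 736]
thp1 = [712, 693, 707, 694, 694]

print(f"change: {percent_change(default, thp1):+.1f}%")  # change: -2.3%
print(f"p-value: {permutation_p_value(default, thp1):.3f}")
```

With 5 runs per configuration there are only 252 possible splits, so the smallest achievable two-sided p-value is coarse; more runs give a finer answer.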

Any suggestions for a test that shows a clear benefit?

aqua5000: https://webglsamples.org/aquarium/aquarium.html with 5000 fish
aqua500: https://webglsamples.org/aquarium/aquarium.html with 500 fish
motionmark: https://browserbench.org/MotionMark1.3.1/
speedometer: https://browserbench.org/Speedometer3.1/
youtube: https://www.youtube.com/watch?v=LXb3EKWsInQ&autoplay=1 and extract DrawAndSwap
volumeshader: https://volumeshaderbm.com/start/
glmark2-es2: glmark2-es2-drm from apt
vkmark: vkmark from apt
openarena: openarena timedemo from apt

=== aqua5000 ===
272.0 : Default [288, 255, 254, 280, 283]
272.2 : THP=2 / BIG_V3D=1 [224, 286, 276, 285, 290] +0.1%
274.2 : BIG_V3D=1 [222, 295, 283, 286, 285] +0.8%
276.6 : THP=1 [305, 277, 269, 276, 256] +1.7%
279.4 : THP=2 [279, 279, 283, 278, 278] +2.7%
281.2 : THP=1 / BIG_V3D=1 [264, 303, 286, 264, 289] +3.4%

=== aqua500 ===
700.0 : THP=1 [712, 693, 707, 694, 694] -2.3%!
701.0 : THP=2 [710, 696, 695, 697, 707] -2.2%
703.6 : THP=1 / BIG_V3D=1 [706, 709, 692, 705, 706] -1.8%
710.4 : THP=2 / BIG_V3D=1 [712, 708, 716, 708, 708] -0.9%
711.0 : BIG_V3D=1 [708, 703, 713, 720, 711] -0.8%
716.8 : Default [705, 718, 721, 704, 736]

=== glmark2-es2 ===
665.3 : THP=1 [666, 667, 667, 666, 667, 665, 666, 662, 665, 662] -1.9%!
665.4 : THP=2 [666, 667, 663, 666, 663, 666, 666, 665, 667, 665] -1.9%!
673.0 : THP=2 / BIG_V3D=1 [671, 674, 674, 673, 673] -0.8%!
673.4 : THP=1 / BIG_V3D=1 [673, 674, 673, 673, 674] -0.8%!
675.4 : BIG_V3D=1 [675, 675, 675, 676, 676] -0.5%!
678.5 : Default [679, 679, 679, 679, 678, 679, 678, 678, 678, 678]

=== motionmark ===
24022.0 : THP=2 / BIG_V3D=1 [24163, 23706, 24797, 24051, 23393] -11.9%!
25599.5 : THP=1 [24445, 25943, 24328, 27235, 24412, 23262, 24022, 28012, 28355, 25981] -6.1%
25655.4 : BIG_V3D=1 [29270, 27546, 23687, 23459, 24315] -5.9%
26255.5 : THP=2 [27648, 26508, 23661, 30018, 28348, 26604, 25809, 23730, 26815, 23414] -3.7%
26260.2 : THP=1 / BIG_V3D=1 [22966, 28149, 26818, 27127, 26241] -3.7%
27261.8 : Default [23926, 28065, 26559, 28385, 23741, 30594, 32586, 24177, 24391, 30194]

=== openarena ===
9792.0 : THP=1 [9780, 9760, 9750, 9790, 9810, 9830, 9800, 9790, 9800, 9810] -2.8%!
9801.0 : THP=2 [9820, 9830, 9770, 9780, 9800, 9810, 9780, 9790, 9790, 9840] -2.8%!
10012.0 : THP=2 / BIG_V3D=1 [10040, 10040, 9980, 10010, 9990] -0.7%!
10022.0 : THP=1 / BIG_V3D=1 [10010, 10050, 10030, 10000, 10020] -0.6%!
10046.0 : BIG_V3D=1 [9970, 10070, 10070, 10060, 10060] -0.3%
10079.0 : Default [10080, 10040, 10080, 10080, 10070, 10090, 10100, 10050, 10100, 10100]

=== speedometer ===
501.4 : Default [503, 497, 502, 507, 498]
501.6 : BIG_V3D=1 [495, 503, 503, 502, 505] +0.0%
502.0 : THP=1 [502, 499, 503, 502, 504] +0.1%
502.0 : THP=2 / BIG_V3D=1 [506, 499, 507, 497, 501] +0.1%
503.6 : THP=2 [501, 503, 508, 509, 497] +0.4%
506.8 : THP=1 / BIG_V3D=1 [500, 508, 507, 510, 509] +1.1%

=== vkmark ===
578.8 : THP=1 [579, 579, 579, 579, 578, 578, 579, 579, 579, 579] -4.4%!
578.8 : THP=2 [579, 578, 578, 578, 579, 579, 580, 579, 579, 579] -4.4%!
602.2 : THP=2 / BIG_V3D=1 [602, 602, 603, 602, 602] -0.6%!
602.8 : THP=1 / BIG_V3D=1 [603, 602, 603, 603, 603] -0.5%!
604.0 : BIG_V3D=1 [604, 604, 604, 604, 604] -0.3%!
605.7 : Default [605, 606, 606, 606, 606, 605, 606, 605, 606, 606]

=== volumeshader ===
630.0 : Default [630, 630, 630, 630, 630]
630.0 : BIG_V3D=1 [630, 630, 630, 630, 630] +0.0%
630.0 : THP=1 [630, 630, 630, 630, 630] +0.0%
630.0 : THP=1 / BIG_V3D=1 [630, 630, 630, 630, 630] +0.0%
630.0 : THP=2 [630, 630, 630, 630, 630] +0.0%
630.0 : THP=2 / BIG_V3D=1 [630, 630, 630, 630, 630] +0.0%

=== youtube ===
1718.2 : THP=1 / BIG_V3D=1 [1692, 1745, 1767, 1701, 1686] -2.2%
1747.0 : THP=2 / BIG_V3D=1 [1745, 1686, 1748, 1764, 1792] -0.5%
1756.0 : Default [1770, 1792, 1767, 1721, 1730]
1763.0 : BIG_V3D=1 [1600, 1795, 1783, 1848, 1789] +0.4%
1774.0 : THP=2 [1815, 1783, 1751, 1773, 1748] +1.0%
1776.8 : THP=1 [1712, 1859, 1742, 1776, 1795] +1.2%

@mairacanal
Contributor Author

> I have run various 3D-related benchmarks to check the effect of these settings. 5 runs of each.
> THP=1 is TRANSPARENT_HUGEPAGE_MADVISE=y
> THP=2 is TRANSPARENT_HUGEPAGE_ALWAYS=y
> BIG_V3D=1 is the "Only use Big/Super Pages if BO >= 512KB" commit
>
> The percentages are the change from Default. An ! means statistically significant.
> I'm struggling to find a statistically significant benefit.
>
> Any suggestions for a test that shows a clear benefit?

Thanks for sharing these results. I'm under the impression that using traces might be masking other side-effects that using THP has on the system. I'll investigate it further and run benchmarks without traces.

@popcornmix
Collaborator

I'm testing with RpiOS trixie mesa - are you testing with something closer to master?
Might there be a difference in results observed?

@mairacanal
Contributor Author

mairacanal commented May 26, 2026

Yes, for the results I previously shared, I was using Mesa mainline (commit 3ea293a9d1d).

@popcornmix
Collaborator

I've added GIT=1 to the test matrix, which uses mesa from the head of git (bb15c00).
There are some huge wins just from the mesa switch.
I also think the THP/BIG_V3D gains are visible when combined with GIT=1 for some tests.

Any view on the safety of bumping mesa closer to git master in our trixie repo?
Would you expect any incompatibilities with trixie packages that use mesa?
Are the gains from a few commits that could be cherry-picked onto trixie mesa source?

=== aqua5000 ===
193.0 : Default [199, 182, 198, 188, 198]
193.2 : BIG_V3D=1 [190, 204, 184, 189, 199] +0.1%
194.6 : THP=1 / BIG_V3D=1 [202, 200, 201, 184, 186] +0.8%
199.3 : THP=2 [200, 199, 199] +3.3%
203.0 : THP=1 [196, 199, 207, 202, 211] +5.2%
207.8 : THP=2 / BIG_V3D=1 [222, 222, 193, 201, 201] +7.7%
218.2 : GIT=1 [220, 201, 232, 249, 189] +13.1%
289.6 : THP=2 / GIT=1 / BIG_V3D=1 [254, 294, 319, 326, 255] +50.1%!
299.5 : GIT=1 / BIG_V3D=1 [307, 303, 294, 294] +55.2%!
299.5 : THP=1 / GIT=1 [301, 300, 298, 299] +55.2%!
302.5 : THP=2 / GIT=1 [300, 302, 304, 304] +56.7%!
308.4 : THP=1 / GIT=1 / BIG_V3D=1 [289, 301, 297, 334, 321] +59.8%!

=== aqua500 ===
547.5 : BIG_V3D=1 [546, 546, 548, 550] -1.0%
548.4 : THP=2 / BIG_V3D=1 [538, 550, 545, 560, 549] -0.9%
551.0 : THP=1 / BIG_V3D=1 [549, 546, 557, 552] -0.4%
551.2 : THP=2 [531, 546, 551, 558, 570] -0.4%
553.2 : Default [547, 546, 556, 555, 562]
558.2 : THP=1 [546, 548, 562, 579, 556] +0.9%
657.8 : GIT=1 [657, 656, 660, 658] +18.9%!
758.8 : THP=1 / GIT=1 [756, 766, 764, 749] +37.2%!
762.2 : GIT=1 / BIG_V3D=1 [745, 781, 757, 770, 758] +37.8%!
763.6 : THP=2 / GIT=1 [748, 760, 778, 756, 776] +38.0%!
768.0 : THP=1 / GIT=1 / BIG_V3D=1 [766, 755, 788, 760, 771] +38.8%!
780.4 : THP=2 / GIT=1 / BIG_V3D=1 [751, 799, 779, 804, 769] +41.1%!

=== basemark ===
23720.0 : BIG_V3D=1 [23500, 23600, 24200, 24000, 23300] -0.8%
23920.0 : Default [24600, 23600, 23300, 24100, 24000]
23933.3 : THP=1 [23900, 23900, 24000] +0.1%
24320.0 : THP=2 / BIG_V3D=1 [24300, 24700, 24900, 24100, 23600] +1.7%
24460.0 : THP=1 / BIG_V3D=1 [24800, 24300, 24900, 24300, 24000] +2.3%
24650.0 : THP=2 [24800, 25000, 24600, 24200] +3.1%!
24920.0 : GIT=1 [24900, 24400, 25300, 25200, 24800] +4.2%!
29840.0 : THP=1 / GIT=1 / BIG_V3D=1 [32500, 28300, 29200, 28500, 30700] +24.7%!
30740.0 : THP=2 / GIT=1 [31600, 31300, 31800, 29600, 29400] +28.5%!
30860.0 : THP=2 / GIT=1 / BIG_V3D=1 [31700, 30300, 29600, 31300, 31400] +29.0%!
31325.0 : THP=1 / GIT=1 [31300, 31500, 30700, 31800] +31.0%!
31600.0 : GIT=1 / BIG_V3D=1 [32000, 31400, 31400, 31600] +32.1%!

=== glmark2-es2 ===
708.0 : THP=1 [708, 708, 708, 708, 708] -0.4%!
708.2 : THP=2 [708, 708, 708, 709, 708] -0.4%!
708.2 : THP=2 / BIG_V3D=1 [709, 708, 708, 708, 708] -0.4%!
708.4 : THP=1 / BIG_V3D=1 [709, 708, 709, 708, 708] -0.4%!
710.7 : GIT=1 [710, 710, 711, 711, 711, 711] -0.0%
711.0 : Default [711, 711, 712, 711, 711, 710]
711.0 : BIG_V3D=1 [711, 711, 711, 711, 711] +0.0%
717.4 : THP=2 / GIT=1 / BIG_V3D=1 [716, 718, 717, 718, 718] +0.9%!
717.6 : THP=2 / GIT=1 [718, 717, 717, 718, 718] +0.9%!
718.0 : THP=1 / GIT=1 / BIG_V3D=1 [716, 719, 718, 718, 719] +1.0%!
718.4 : THP=1 / GIT=1 [720, 718, 718, 718, 718] +1.0%!
718.8 : GIT=1 / BIG_V3D=1 [719, 719, 720, 717, 719] +1.1%!

=== jetstream ===
12979.0 : THP=2 [12962, 12965, 12998, 12991] -0.7%!
13032.2 : THP=2 / BIG_V3D=1 [13055, 12956, 13100, 13056, 12994] -0.2%
13048.4 : THP=1 [13050, 13098, 13034, 13075, 12985] -0.1%
13054.8 : THP=1 / BIG_V3D=1 [13025, 13097, 13069, 13087, 12996] -0.1%
13062.2 : GIT=1 [13093, 13061, 13098, 13051, 13008] -0.0%
13063.4 : BIG_V3D=1 [13124, 13016, 13041, 13072, 13064] -0.0%
13064.0 : Default [13047, 13081, 13107, 13044, 13041]
19516.0 : THP=2 / GIT=1 [19865, 19017, 19086, 19414, 20198] +49.4%!
20004.0 : THP=2 / GIT=1 / BIG_V3D=1 [20014, 19985, 20013] +53.1%!
20196.8 : GIT=1 / BIG_V3D=1 [20305, 20158, 20051, 20287, 20183] +54.6%!
20215.8 : THP=1 / GIT=1 [20220, 20238, 20169, 20259, 20193] +54.7%!
20235.8 : THP=1 / GIT=1 / BIG_V3D=1 [20255, 20235, 20215, 20238] +54.9%!

=== jetstream2 ===
8429.6 : THP=1 / GIT=1 [8406, 8357, 8483, 8432, 8470] -0.5%
8430.8 : THP=2 / GIT=1 / BIG_V3D=1 [8353, 8466, 8427, 8429, 8479] -0.5%
8447.4 : THP=2 / GIT=1 [8447, 8467, 8397, 8403, 8523] -0.3%
8462.5 : GIT=1 / BIG_V3D=1 [8448, 8452, 8465, 8485] -0.1%
8465.2 : THP=1 / GIT=1 / BIG_V3D=1 [8489, 8447, 8426, 8479, 8485] -0.1%
8467.0 : THP=2 [8507, 8413, 8469, 8489, 8457] -0.1%
8467.2 : THP=1 / BIG_V3D=1 [8408, 8487, 8418, 8531, 8492] -0.1%
8467.8 : GIT=1 [8575, 8387, 8398, 8501, 8478] -0.1%
8470.6 : BIG_V3D=1 [8489, 8538, 8422, 8544, 8360] -0.0%
8474.0 : Default [8484, 8545, 8432, 8462, 8447]
8478.2 : THP=1 [8484, 8477, 8468, 8484] +0.1%
8479.0 : THP=2 / BIG_V3D=1 [8491, 8467, 8472, 8486] +0.1%

=== motionmark ===
23532.0 : THP=2 / BIG_V3D=1 [20566, 24335, 23606, 26178, 22975] -1.8%
23968.8 : Default [23892, 23929, 23029, 24646, 24348]
24368.4 : BIG_V3D=1 [24717, 23221, 23324, 24770, 25810] +1.7%
24611.8 : THP=2 [24378, 24819, 24352, 24898] +2.7%
25068.4 : GIT=1 [24177, 23702, 25994, 25378, 26091] +4.6%
25158.0 : THP=1 / BIG_V3D=1 [25274, 25135, 25065] +5.0%!
25254.8 : THP=1 [25005, 25626, 25216, 25172] +5.4%!
28886.2 : THP=2 / GIT=1 / BIG_V3D=1 [29531, 26346, 29635, 27540, 31379] +20.5%!
28987.8 : GIT=1 / BIG_V3D=1 [32612, 26444, 29952, 28610, 27321] +20.9%!
29082.5 : THP=2 / GIT=1 [28172, 29061, 29430, 29667] +21.3%!
29214.0 : THP=1 / GIT=1 [29616, 29744, 28049, 28637, 30024] +21.9%!
30751.8 : THP=1 / GIT=1 / BIG_V3D=1 [33157, 29991, 31263, 28282, 31066] +28.3%!

=== octane ===
15724.2 : THP=1 / BIG_V3D=1 [15604, 15635, 15628, 15857, 15897] -1.2%
15741.0 : GIT=1 [16185, 15314, 15417, 15902, 15887] -1.0%
15907.4 : Default [15780, 15878, 16146, 15735, 15998]
15941.6 : THP=1 [16133, 16207, 15803, 16035, 15530] +0.2%
16029.8 : THP=2 / BIG_V3D=1 [15895, 16263, 15810, 16077, 16104] +0.8%
16035.0 : THP=2 [16181, 16026, 16090, 15977, 15901] +0.8%
16069.5 : BIG_V3D=1 [16095, 16012, 15999, 16172] +1.0%
25032.8 : THP=1 / GIT=1 [25007, 24918, 24647, 25201, 25391] +57.4%!
25046.4 : THP=1 / GIT=1 / BIG_V3D=1 [25425, 24571, 24825, 25320, 25091] +57.5%!
25084.6 : THP=2 / GIT=1 [25172, 25170, 24976, 25450, 24655] +57.7%!
25100.4 : GIT=1 / BIG_V3D=1 [24913, 25011, 25395, 24809, 25374] +57.8%!
25299.0 : THP=2 / GIT=1 / BIG_V3D=1 [25235, 25455, 25101, 25342, 25362] +59.0%!

=== openarena ===
10150.0 : THP=2 [10160, 10130, 10150, 10160] -0.6%!
10160.0 : THP=1 [10160, 10150, 10180, 10150] -0.5%!
10164.0 : THP=1 / BIG_V3D=1 [10230, 10120, 10170, 10160, 10140] -0.5%
10166.0 : THP=2 / BIG_V3D=1 [10250, 10150, 10150, 10130, 10150] -0.4%
10210.0 : BIG_V3D=1 [10280, 10170, 10200, 10200, 10200] -0.0%
10211.4 : Default [10290, 10290, 10170, 10170, 10180, 10180, 10200]
10441.4 : GIT=1 [10490, 10500, 10420, 10410, 10430, 10420, 10420] +2.3%!
10457.5 : THP=1 / GIT=1 / BIG_V3D=1 [10450, 10460, 10450, 10470] +2.4%!
10458.0 : THP=1 / GIT=1 [10450, 10450, 10480, 10440, 10470] +2.4%!
10460.0 : THP=2 / GIT=1 / BIG_V3D=1 [10450, 10460, 10480, 10450, 10460] +2.4%!
10470.0 : THP=2 / GIT=1 [10450, 10490, 10470, 10470, 10470] +2.5%!
10496.0 : GIT=1 / BIG_V3D=1 [10510, 10500, 10500, 10490, 10480] +2.8%!

=== speedometer ===
369.2 : GIT=1 [370, 365, 373, 370, 368] -1.2%!
370.6 : THP=2 / BIG_V3D=1 [368, 369, 372, 374, 370] -0.8%!
371.6 : BIG_V3D=1 [372, 370, 372, 375, 369] -0.6%
372.2 : THP=1 [367, 375, 375, 375, 369] -0.4%
373.0 : THP=2 [378, 374, 373, 371, 369] -0.2%
373.4 : THP=1 / BIG_V3D=1 [375, 375, 373, 369, 375] -0.1%
373.8 : Default [375, 373, 374, 373]
530.6 : THP=1 / GIT=1 [530, 534, 533, 528, 528] +42.0%!
532.2 : THP=2 / GIT=1 [532, 533, 533, 531, 532] +42.4%!
532.4 : THP=2 / GIT=1 / BIG_V3D=1 [526, 534, 534, 534, 534] +42.4%!
535.8 : THP=1 / GIT=1 / BIG_V3D=1 [538, 534, 533, 539, 535] +43.4%!
536.6 : GIT=1 / BIG_V3D=1 [542, 535, 538, 539, 529] +43.6%!

=== vkmark ===
617.8 : THP=1 / BIG_V3D=1 [618, 617, 618, 618, 618] -0.6%!
617.8 : THP=2 [618, 618, 618, 617, 618] -0.6%!
618.0 : THP=1 [618, 618, 618, 618, 618] -0.6%!
618.0 : THP=2 / BIG_V3D=1 [618, 619, 618, 618, 617] -0.6%!
619.8 : GIT=1 [619, 620, 620, 620, 620, 620] -0.3%!
621.8 : Default [622, 621, 622, 622, 622, 622]
622.6 : BIG_V3D=1 [623, 623, 623, 622, 622] +0.1%!
627.2 : THP=1 / GIT=1 / BIG_V3D=1 [627, 627, 627, 627, 628] +0.9%!
627.6 : THP=2 / GIT=1 / BIG_V3D=1 [628, 627, 628, 628, 627] +0.9%!
627.8 : THP=1 / GIT=1 [628, 628, 628, 627, 628] +1.0%!
628.2 : GIT=1 / BIG_V3D=1 [629, 628, 628, 628, 628] +1.0%!
628.2 : THP=2 / GIT=1 [629, 628, 628, 628, 628] +1.0%!

=== volumeshader ===
630.0 : Default [630, 630, 630, 630, 630]
630.0 : BIG_V3D=1 [630, 630, 630, 630, 630] +0.0%
630.0 : THP=1 [630, 630, 630, 630, 630] +0.0%
630.0 : THP=1 / BIG_V3D=1 [630, 630, 630, 630, 630] +0.0%
630.0 : THP=2 [630, 630, 630, 630, 630] +0.0%
630.0 : THP=2 / BIG_V3D=1 [630, 630, 630, 630, 630] +0.0%
650.0 : GIT=1 [650, 650, 650, 650, 650] +3.2%!
650.0 : GIT=1 / BIG_V3D=1 [650, 650, 650, 650, 650] +3.2%!
650.0 : THP=1 / GIT=1 [650, 650, 650, 650, 650] +3.2%!
650.0 : THP=1 / GIT=1 / BIG_V3D=1 [650, 650, 650, 650, 650] +3.2%!
650.0 : THP=2 / GIT=1 [650, 650, 650, 650, 650] +3.2%!
650.0 : THP=2 / GIT=1 / BIG_V3D=1 [650, 650, 650, 650, 650] +3.2%!

=== youtube ===
1845.0 : THP=2 / BIG_V3D=1 [1845, 1842, 1848] -1.5%!
1863.3 : THP=1 / BIG_V3D=1 [1855, 1866, 1869] -0.5%
1868.8 : THP=2 [1912, 1832, 1855, 1876, 1869] -0.2%
1872.7 : Default [1880, 1862, 1876]
1873.0 : THP=1 [1894, 1842, 1852, 1901, 1876] +0.0%
1877.8 : BIG_V3D=1 [1862, 1842, 1883, 1894, 1908] +0.3%
1934.4 : GIT=1 [1965, 1938, 1949, 1908, 1912] +3.3%!
2183.5 : THP=1 / GIT=1 [2174, 2169, 2198, 2193] +16.6%!
2214.4 : THP=2 / GIT=1 [2242, 2203, 2198, 2217, 2212] +18.2%!
2233.2 : THP=2 / GIT=1 / BIG_V3D=1 [2257, 2242, 2188, 2252, 2227] +19.3%!
2236.4 : THP=1 / GIT=1 / BIG_V3D=1 [2252, 2242, 2217, 2273, 2198] +19.4%!
2300.0 : GIT=1 / BIG_V3D=1 [2299, 2299, 2294, 2309, 2299] +22.8%!

@mairacanal
Contributor Author

> Any view on the safety of bumping mesa closer to git master in our trixie repo?
> Would you expect any incompatibilities with trixie packages that use mesa?
> Are the gains from a few commits that could be cherry-picked onto trixie mesa source?

I have the impression that bumping to 26.1 should be safe, but I’ll check with the Mesa experts to confirm. I’ll also see whether there is a small set of commits we can cherry-pick, and then I’ll get back to you with a solution.

Considering that the Raspberry Pi is an embedded device with limited
memory, memory fragmentation is an important aspect for performance.
Using Big/Super Pages has clear benefits when it comes to reducing TLB
misses, but also has an impact on memory fragmentation as we need to
allocate aligned contiguous memory, increasing compaction pressure and
memory waste for small BOs.

As Big/Super Pages only have benefits for larger BOs, create a minimum
BO size to use the THP partition. After testing different thresholds,
512KB provides the most balanced results with clear improvements and no
significant regressions. This means that Big/Super Pages will only be
used for BOs of at least 512KB.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
@mairacanal mairacanal force-pushed the v3d/downstream/super-pages-512k branch from 48996f0 to b2316a1 Compare June 5, 2026 10:50
@mairacanal
Contributor Author

@popcornmix, in order to make sure our results are similar and compatible, would you mind sharing your benchmarking scripts?

@popcornmix
Collaborator

popcornmix commented Jun 5, 2026

The browser benchmarks use CDP to launch benchmarks and grab results. I've attached the version I last used
browser-bench.sh

I'd be interested in your scores from any of those (especially the ones I report as having bigger performance gains).

Note: I convert a measure from the output to a score (bigger is better) with:

| benchmark | filter |
| --- | --- |
| aquarium | `awk '/Avg FPS:/ {printf "%.0f\n", $3 * 10}'` |
| jetstream | `awk '/Overall:/ {v=$2} END {printf "%.0f\n", v * 100}'` |
| jetstream2 | `awk '/Overall:/ {v=$2} END {printf "%.0f\n", v * 100}'` |
| motionmark | `awk '/Score:/ {v=$2} END {printf "%.0f\n", v * 100}'` |
| speedometer | `awk '/Score:/ {v=$2} END {printf "%.0f\n", v * 100}'` |
| basemark | `awk '/Score:/ {v=$2} END {printf "%.0f\n", v * 100}'` |
| volumeshader | `awk '/Avg FPS:/ {printf "%.0f\n", $3 * 1000}'` |
| youtube | `awk '/V3D durations/ && match($0, /gl_total=([0-9.]+)/, m) {printf "%.0f\n", (m[1]>0)?1000.0/m[1]:0}'` |
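
For readers less familiar with awk, the filters above fall into three shapes, sketched here in Python. This is a rough equivalent for illustration only: the function names and the sample log lines are hypothetical and not taken from `browser-bench.sh`, and Python's `float()` is stricter than awk's numeric conversion. Note that the three-argument `match()` in the youtube filter is a gawk extension.

```python
import re


def per_line_scores(lines, pattern, field, scale):
    """Like: awk '/pattern/ {printf "%.0f\n", $field * scale}'.

    Emits one score per matching line; awk fields are 1-based.
    """
    return [
        format(float(line.split()[field - 1]) * scale, ".0f")
        for line in lines
        if pattern in line
    ]


def last_match_score(lines, pattern, scale):
    """Like: awk '/pattern/ {v=$2} END {printf "%.0f\n", v * scale}'.

    Only the last matching line contributes to the score.
    """
    value = 0.0
    for line in lines:
        if pattern in line:
            value = float(line.split()[1])
    return format(value * scale, ".0f")


def youtube_scores(lines):
    """Like the youtube filter: score is 1000 / gl_total, or 0 if not positive."""
    scores = []
    for line in lines:
        m = re.search(r"gl_total=([0-9.]+)", line)
        if "V3D durations" in line and m:
            t = float(m.group(1))
            scores.append(format(1000.0 / t if t > 0 else 0, ".0f"))
    return scores


# Hypothetical sample lines, invented for this sketch.
print(per_line_scores(["Avg FPS: 27.2"], "Avg FPS:", 3, 10))
print(last_match_score(["Overall: 120.00", "Overall: 130.64"], "Overall:", 100))
print(youtube_scores(["V3D durations gl_total=0.5 ms"]))
```

Under these filters an aquarium score of 272 corresponds to 27.2 FPS, and the youtube score rises as `gl_total` falls.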

I (with Claude) extended this to also support Firefox (and BiDi). The last run didn't show the significant boosts with the latest mesa (even on chromium), so I am now a bit less confident in the gains observed. They were consistent enough across multiple runs and multiple benchmarks that it was measuring something, but there may be some other variable in play.

The framework that launches them has a lot of dependencies (e.g. switching branches and building/installing kernel and bootloader on our build server, building mesa, adjusting config.txt/cmdline.txt/eeprom-config), so it might be quite a job to get running somewhere else.

@mairacanal
Contributor Author

Thanks for sharing the script! I'll use it to evaluate the possible gains across Mesa commits. About the performance gains, I'd also be a bit skeptical about +20% performance improvements. In our trace collection, there was a consistent, measurable impact on performance, but as you can see it averaged around +2%, with the biggest improvement being Google Maps at +10%.

Having said that, I'll run the benchmark locally and bring the results back to you.
