[Feat] Enable LTO by zhouguangyuan0718 · Pull Request #1622 · goplus/llgo

zhouguangyuan0718 · 2026-02-08T08:01:55Z

No description provided.

gemini-code-assist · 2026-02-08T08:02:09Z

Summary of Changes

Hello @zhouguangyuan0718, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates ThinLTO (Thin Link-Time Optimization) and the merge-functions feature into the default build process. With whole-program visibility at link time, the linker can optimize across modules and merge identical functions, which may reduce the size of the compiled binaries.

Highlights

ThinLTO Enabled by Default: Thin Link-Time Optimization (LTO) is now enabled by default both when linking through the clang driver and when invoking ld.lld directly, allowing for whole-program optimization.
Merge Functions Enabled: The merge-functions feature is now enabled by default, which helps reduce binary size by merging identical functions across the entire program.
Unified LTO Flags: The -funified-lto flag has been added to CCFLAGS, cflags, and ccflags to ensure consistent LTO behavior across compilation and linking stages.


Changelog

internal/crosscompile/crosscompile.go
- Added -flto=thin, -Wl,-mllvm,-enable-merge-functions, and -Wl,-mllvm,-mergefunc-use-aliases to ldflags when clangRoot is present, enabling ThinLTO and function merging.
- Appended -flto=thin and -funified-lto to export.CCFLAGS for clang.
- Introduced a conditional block to apply --lto=thin, -mllvm -enable-merge-functions, -mllvm -mergefunc-use-aliases to ldflags, and -flto=thin, -funified-lto to cflags and ccflags specifically when ld.lld is configured as the linker.

Activity

No specific activity (comments, reviews, or progress updates) has been recorded for this pull request yet.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request enables ThinLTO and function merging by default, which is a great step for optimizing binary size and performance. The implementation correctly adds the necessary flags for both native and cross-compilation builds.

My main feedback is about code duplication. The LTO-related flags are defined in two separate places (use and UseTarget functions). I've left a comment suggesting to refactor this by extracting the flags into package-level variables to improve maintainability. I also pointed out a misleading comment.

gemini-code-assist · 2026-02-08T08:04:59Z

internal/crosscompile/crosscompile.go

+	// For ld.lld linker, also add CPU info to linker flags
+	if config.Linker == "ld.lld" {
+		ldflags = append(ldflags,
+			// Enable ThinLTO, then global DCE will work well,
+			// and "merge-functions" can merge the function in whole program.
+			"--lto=thin",
+			"-mllvm", "-enable-merge-functions",
+			"-mllvm", "-mergefunc-use-aliases")
+		cflags = append(cflags, "-flto=thin", "-funified-lto")
+		ccflags = append(ccflags, "-flto=thin", "-funified-lto")
+	}


The comment on line 509 is misleading. This block adds LTO and function merging flags, not CPU info. Please update the comment for clarity.

Additionally, these LTO flags and the associated comment are duplicated from the use function (see lines 230-234 and 257-258). To improve maintainability and avoid future inconsistencies, I recommend refactoring this by extracting the different sets of flags into package-level variables. This would make the code cleaner and easier to manage.

For example:

// At package level
var (
	ltoCompilerFlags = []string{"-flto=thin", "-funified-lto"}

	ltoLinkerFlagsForClangDriver = []string{
		// ... comment here ...
		"-flto=thin",
		"-Wl,-mllvm,-enable-merge-functions",
		"-Wl,-mllvm,-mergefunc-use-aliases",
	}

	ltoLinkerFlagsForLld = []string{
		// ... comment here ...
		"--lto=thin",
		"-mllvm", "-enable-merge-functions",
		"-mllvm", "-mergefunc-use-aliases",
	}
)

Then you could use ltoLinkerFlagsForLld and ltoCompilerFlags here, and the other variables in the use function.
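As a rough illustration of how the shared flag sets might be consumed, the sketch below appends them inside a small helper. The helper name `withLTO` and the placeholder base flags (`--gc-sections`, `-Oz`) are assumptions made for this example and do not come from the pull request; only the flag values and the two variable names are taken from the suggestion above.

```go
package main

import "fmt"

// Package-level flag sets, named as in the review suggestion.
var (
	ltoCompilerFlags = []string{"-flto=thin", "-funified-lto"}

	ltoLinkerFlagsForLld = []string{
		"--lto=thin",
		"-mllvm", "-enable-merge-functions",
		"-mllvm", "-mergefunc-use-aliases",
	}
)

// withLTO is a hypothetical helper: when the linker is ld.lld it returns
// copies of the three flag lists with the shared LTO flags appended;
// otherwise it returns the inputs unchanged.
func withLTO(linker string, ldflags, cflags, ccflags []string) (ld, c, cc []string) {
	if linker != "ld.lld" {
		return ldflags, cflags, ccflags
	}
	ld = append(append([]string{}, ldflags...), ltoLinkerFlagsForLld...)
	c = append(append([]string{}, cflags...), ltoCompilerFlags...)
	cc = append(append([]string{}, ccflags...), ltoCompilerFlags...)
	return ld, c, cc
}

func main() {
	// The base flags here are placeholders for whatever the caller already has.
	ld, c, cc := withLTO("ld.lld", []string{"--gc-sections"}, nil, []string{"-Oz"})
	fmt.Println("ldflags:", ld)
	fmt.Println("cflags: ", c)
	fmt.Println("ccflags:", cc)
}
```

Because both call sites would read from the same slices, a later change to the flag list only has to be made once.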

Suggested change

-	// For ld.lld linker, also add CPU info to linker flags
+	// For ld.lld linker, add LTO and function merging flags.
 	if config.Linker == "ld.lld" {
 		ldflags = append(ldflags,
 			// Enable ThinLTO, then global DCE will work well,
 			// and "merge-functions" can merge the function in whole program.
 			"--lto=thin",
 			"-mllvm", "-enable-merge-functions",
 			"-mllvm", "-mergefunc-use-aliases")
 		cflags = append(cflags, "-flto=thin", "-funified-lto")
 		ccflags = append(ccflags, "-flto=thin", "-funified-lto")
 	}

xgopilot · 2026-03-02T15:57:07Z

Overall solid work enabling ThinLTO and preserving exported symbols. The llvm.compiler.used mechanism and fitIntSize slice fixes are well-structured and properly tested. A few concerns:

--lto-O0 sets LTO optimization to O0, paying ThinLTO overhead (bitcode files, slower link) without its key benefits (cross-module inlining, IPO). Consider --lto-O2.
out.ll shows @llvm.used but source code creates llvm.compiler.used — these have different semantics.
ELF-specific linker flags (--icf=safe, --lto-O0) used unconditionally on macOS native path where ld64.lld is used.

See inline comments for details.

xgopilot · 2026-03-02T15:57:42Z

_demo/embed/targetsbuild/build.sh

 			"riscv32"
 			"riscv64"
 			"rp2040"
+			"nintendoswitch" # undefined symbol under lto, should not work when no-lto


Nit: the comment says "should not work when no-lto" but the intent seems to be that it doesn't work with LTO (undefined symbol under LTO). Consider rephrasing, e.g.: # undefined symbol under lto, skip for now.

xgopilot · 2026-03-02T15:58:04Z

internal/crosscompile/crosscompile.go

+			"-Wl,--icf=safe",
+			// Enable ThinLTO, Using default lto kind(thinlto).
+			"-Wl,--lto-O0",
 		}


--lto-O0 sets the LTO optimization level to zero — this pays the full cost of ThinLTO (bitcode intermediates, slower link) while disabling the optimizations that make LTO valuable (cross-module inlining, interprocedural constant propagation, global dead code elimination). The only benefit surviving at O0 is that the linker can see all symbols, but ICF and linker-level dead stripping work without LTO as well.

Consider --lto-O2 (common ThinLTO default) to actually realize the link-time optimization benefits.

Also, the comment // Enable ThinLTO, Using default lto kind(thinlto). is misleading here — --lto-O0 doesn't enable ThinLTO (that's done by -flto=thin in CCFLAGS). This flag only controls the optimization level during the LTO link step.

Additionally, -Wl,--icf=safe and -Wl,--lto-O0 are ld.lld (ELF) flags. On macOS, -fuse-ld=lld resolves to ld64.lld (Mach-O linker) which uses different flag syntax. These flags may produce linker warnings or errors on macOS (or be silently ignored due to -Wno-unused-command-line-argument). Consider gating them behind a platform check, similar to the OS-specific branch below at line ~268.
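One possible shape for such a platform gate is sketched below. It takes the reviewer's premise (that the two flags should not be passed on macOS) as given rather than verifying it, and the names `elfLinkerFlags` and `nativeLDFlags` are hypothetical, not identifiers from crosscompile.go.

```go
package main

import (
	"fmt"
	"runtime"
)

// elfLinkerFlags holds the two flags from the diff under review.
var elfLinkerFlags = []string{"-Wl,--icf=safe", "-Wl,--lto-O0"}

// nativeLDFlags is a hypothetical helper: it returns a copy of base and
// appends the ELF-oriented flags on every OS except darwin.
func nativeLDFlags(goos string, base []string) []string {
	out := append([]string{}, base...)
	if goos == "darwin" {
		return out
	}
	return append(out, elfLinkerFlags...)
}

func main() {
	fmt.Println(runtime.GOOS, nativeLDFlags(runtime.GOOS, []string{"-fuse-ld=lld"}))
}
```

Passing the OS as a parameter instead of reading `runtime.GOOS` inside the helper keeps the branch easy to unit-test from any host.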

xgopilot · 2026-03-02T15:58:18Z

internal/crosscompile/crosscompile.go

 			"-Wno-unused-command-line-argument",
+			"-flto=thin",
 		}



In the native compile path, -flto=thin is added only to CCFLAGS but not to CFLAGS. In contrast, the UseTarget path (line ~507) adds -flto=thin to both cflags and ccflags. Without -flto=thin in CFLAGS, object files compiled via that path won't contain LTO bitcode, so ThinLTO can't optimize across those translation units. Is this intentional?

xgopilot · 2026-03-02T15:58:31Z

internal/crosscompile/crosscompile.go


+	if config.Linker == "ld.lld" {
+		// Enable ThinLTO, Using default lto kind(thinlto).
+		ldflags = append(ldflags, "--lto-O0")


Same --lto-O0 concern as the native path — this pays ThinLTO overhead without the optimization benefits. Consider --lto-O2.

xgopilot · 2026-03-02T15:58:52Z

ssa/package.go

+func (p Package) markLLVMUsed(v llvm.Value) {
+	elemTyp := p.Prog.VoidPtr().ll
+	p.llvmUsedValues = append(p.llvmUsedValues, llvm.ConstBitCast(v, elemTyp))
+	if !p.llvmUsed.IsNil() {
+		p.llvmUsed.EraseFromParentAsGlobal()
+	}
+	init := llvm.ConstArray(elemTyp, p.llvmUsedValues)
+	global := llvm.AddGlobal(p.mod, init.Type(), "llvm.compiler.used")
+	global.SetInitializer(init)
+	global.SetLinkage(llvm.AppendingLinkage)
+	global.SetSection("llvm.metadata")
+	p.llvmUsed = global
+}


markLLVMUsed erases and recreates the llvm.compiler.used global each time a symbol is added, resulting in O(n^2) total LLVM IR manipulation for n preserved symbols. Consider deferring the construction to a finalization step — accumulate values in llvmUsedValues during compilation and build the llvm.compiler.used global once before the module is emitted. This simplifies the code and avoids the repeated erase/create cycle.

Also, the field and method names reference llvmUsed but the actual global created is llvm.compiler.used. These are semantically different LLVM intrinsics (llvm.used prevents both compiler and linker removal; llvm.compiler.used only prevents compiler removal). The naming could be clearer — e.g., llvmCompilerUsed / llvmCompilerUsedValues.
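The deferred-construction idea can be modeled without the LLVM bindings. In the sketch below, `value` stands in for `llvm.Value` and the `global` field stands in for the emitted array; all type, field, and method names are illustrative assumptions, and a real implementation would call the LLVM API in `finalize`.

```go
package main

import "fmt"

// value is a stand-in for llvm.Value in this model.
type value string

// pkg models the relevant part of the package state.
type pkg struct {
	compilerUsedValues []value // accumulated during compilation
	global             []value // stand-in for the llvm.compiler.used initializer
	builds             int     // how many times the global was constructed
}

// markCompilerUsed only records the symbol; no global is touched here.
func (p *pkg) markCompilerUsed(v value) {
	p.compilerUsedValues = append(p.compilerUsedValues, v)
}

// finalize builds the global exactly once, before the module is emitted.
func (p *pkg) finalize() {
	if len(p.compilerUsedValues) == 0 {
		return // nothing to preserve, so no global is created
	}
	p.global = append([]value(nil), p.compilerUsedValues...)
	p.builds++
}

func main() {
	p := &pkg{}
	for _, sym := range []value{"foo", "bar", "baz"} {
		p.markCompilerUsed(sym)
	}
	p.finalize()
	fmt.Println(p.global, "built", p.builds, "time(s)")
}
```

With this structure the per-symbol cost is a slice append, and the array is constructed a single time regardless of how many symbols are preserved.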

xgopilot · 2026-03-02T15:59:12Z

ssa/package.go

+		p.llvmUsed.EraseFromParentAsGlobal()
+	}
+	init := llvm.ConstArray(elemTyp, p.llvmUsedValues)
+	global := llvm.AddGlobal(p.mod, init.Type(), "llvm.compiler.used")


The global created here is "llvm.compiler.used", but cl/_testdata/cpkg/out.ll shows @llvm.used (not @llvm.compiler.used). These are semantically distinct LLVM intrinsics:

llvm.used: prevents both compiler and linker from removing symbols

llvm.compiler.used: prevents only the compiler/optimizer from removing, allowing the linker to strip unused symbols

The commit message says "use llvm.compiler.used ... instead of relying on llvm.used merging behavior," and the unit test in ssa_test.go:132 correctly asserts @llvm.compiler.used. But the out.ll reference file contradicts this. Is the out.ll file correctly regenerated? If LLVM is normalizing the name during llgen/gentests output, that would be worth investigating.
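A quick way to check which of the two globals a generated file defines is to scan its text. The sketch below is a hypothetical helper (the name `usedGlobals` and the sample IR line are made up for illustration) that looks only at definition lines, so it will not be confused by `llvm.used` appearing as a substring of `llvm.compiler.used`.

```go
package main

import (
	"fmt"
	"strings"
)

// usedGlobals reports whether the textual IR defines @llvm.used and/or
// @llvm.compiler.used. It matches only lines that start a definition.
func usedGlobals(ir string) (used, compilerUsed bool) {
	for _, line := range strings.Split(ir, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(line, "@llvm.compiler.used ="):
			compilerUsed = true
		case strings.HasPrefix(line, "@llvm.used ="):
			used = true
		}
	}
	return used, compilerUsed
}

func main() {
	// Sample IR written for this example, not copied from out.ll.
	ir := `@llvm.compiler.used = appending global [1 x ptr] [ptr @foo], section "llvm.metadata"`
	used, compilerUsed := usedGlobals(ir)
	fmt.Println("llvm.used:", used, "llvm.compiler.used:", compilerUsed)
}
```

Running a check like this over the regenerated reference file would show directly whether it agrees with the unit test's expectation.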

xgopilot · 2026-03-02T15:59:34Z

internal/build/plan9asm.go

+		if pkgPath == "syscall" && goos == "darwin" && (goarch == "arm64" || goarch == "amd64") &&
+			(strings.HasSuffix(resolved, "RawSyscall") || strings.HasSuffix(resolved, "RawSyscall6")) {
+			continue
+		}
 		keep = append(keep, fn)
 	}


Minor: The darwin RawSyscall/RawSyscall6 filter lacks an explanatory comment, unlike the sibling Linux filter for rawVforkSyscall above (line 254). Consider adding a brief comment explaining why these assembly functions are excluded on darwin (e.g., provided by runtime via go:linkname, or incompatible with LTO).
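One way to give the filter a home for that comment is to pull the condition into a named predicate. The sketch below copies the condition from the diff; the function name `skipDarwinRawSyscall` is hypothetical, and the rationale is deliberately left as a placeholder because the thread does not state it.

```go
package main

import (
	"fmt"
	"strings"
)

// skipDarwinRawSyscall reports whether a resolved assembly function should be
// dropped from the keep list on darwin/arm64 and darwin/amd64.
//
// TODO(author): state the reason these functions are excluded; the review
// thread does not say, so none is asserted here.
func skipDarwinRawSyscall(pkgPath, goos, goarch, resolved string) bool {
	if pkgPath != "syscall" || goos != "darwin" {
		return false
	}
	if goarch != "arm64" && goarch != "amd64" {
		return false
	}
	return strings.HasSuffix(resolved, "RawSyscall") ||
		strings.HasSuffix(resolved, "RawSyscall6")
}

func main() {
	fmt.Println(skipDarwinRawSyscall("syscall", "darwin", "arm64", "syscall.RawSyscall6"))
	fmt.Println(skipDarwinRawSyscall("syscall", "linux", "amd64", "syscall.RawSyscall"))
}
```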

codecov · 2026-03-05T00:46:02Z

Codecov Report

❌ Patch coverage is 94.73684% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.99%. Comparing base (3b3ff41) to head (6d59d79).

Files with missing lines	Patch %	Lines
internal/crosscompile/crosscompile.go	80.00%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1622      +/-   ##
==========================================
- Coverage   92.99%   92.99%   -0.01%     
==========================================
  Files          47       47              
  Lines       13175    13210      +35     
==========================================
+ Hits        12252    12284      +32     
- Misses        737      742       +5     
+ Partials      186      184       -2
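The percentages in the table can be recomputed from the raw counts, which also shows that the headline change is a small fraction of a percentage point. The sketch below assumes coverage is hits divided by total lines, with total lines equal to hits plus misses plus partials, as the table's own figures suggest. The 36-of-38 patch figure is inferred from the reported 94.73684% with 2 missing lines and is not stated in the report.

```go
package main

import "fmt"

// report holds the three line counts shown in the coverage diff.
type report struct{ hits, misses, partials int }

// lines is the total number of tracked lines.
func (r report) lines() int { return r.hits + r.misses + r.partials }

// pct is the coverage percentage: hits over total lines.
func (r report) pct() float64 { return 100 * float64(r.hits) / float64(r.lines()) }

func main() {
	base := report{hits: 12252, misses: 737, partials: 186}
	head := report{hits: 12284, misses: 742, partials: 184}

	fmt.Printf("base: %d lines, %.4f%%\n", base.lines(), base.pct())
	fmt.Printf("head: %d lines, %.4f%%\n", head.lines(), head.pct())
	fmt.Printf("delta: %+.4f percentage points\n", head.pct()-base.pct())

	// Inferred, not reported: 2 missing lines out of 38 changed lines.
	patch := report{hits: 36, misses: 2}
	fmt.Printf("patch: %.5f%%\n", patch.pct())
}
```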


gemini-code-assist bot reviewed Feb 8, 2026


zhouguangyuan0718 force-pushed the main-lto branch 5 times, most recently from 1f80264 to 740d318 Compare February 13, 2026 16:09

zhouguangyuan0718 force-pushed the main-lto branch from 740d318 to 18167cc Compare February 26, 2026 14:23

zhouguangyuan0718 changed the title ~~[Feat] Enable LTO and merge-functions default~~ [Feat] Enable LTO Feb 26, 2026

zhouguangyuan0718 force-pushed the main-lto branch 21 times, most recently from 5d6ef59 to d1cad0b Compare March 2, 2026 11:28

zhouguangyuan0718 force-pushed the main-lto branch 3 times, most recently from 40cd0a9 to 5684f3d Compare March 2, 2026 14:29

zhouguangyuan0718 marked this pull request as ready for review March 2, 2026 15:03

zhouguangyuan0718 force-pushed the main-lto branch 2 times, most recently from b2dd536 to c3c991d Compare March 2, 2026 15:44

xgopilot bot reviewed Mar 2, 2026


zhouguangyuan0718 force-pushed the main-lto branch 4 times, most recently from 592cd90 to 3fa5c27 Compare March 3, 2026 15:03

ssa: preserve export functions with llvm.compiler.used

9c9b9f1

use llvm.compiler.used to preserve exported symbols during LTO Signed-off-by: ZhouGuangyuan <zhouguangyuan.xian@gmail.com>

zhouguangyuan0718 force-pushed the main-lto branch 4 times, most recently from bce6845 to def4427 Compare March 4, 2026 16:26

[Feat] Enable LTO

6d59d79

Signed-off-by: ZhouGuangyuan <zhouguangyuan.xian@gmail.com>

zhouguangyuan0718 force-pushed the main-lto branch from def4427 to 6d59d79 Compare March 4, 2026 23:49

Merge branch 'goplus:main' into main-lto

571d31e

[Feat] Enable LTO #1622
zhouguangyuan0718 wants to merge 3 commits into goplus:main from zhouguangyuan0718:main-lto
