feat(lto): add global DCE phase-1 support for interface dispatch#1752
feat(lto): add global DCE phase-1 support for interface dispatch#1752zhouguangyuan0718 wants to merge 2 commits intogoplus:mainfrom
Conversation
This change newly introduces end-to-end LTO control in llgo. - add a new tri-state -lto option (auto/true/false) - default LTO ON for embedded target builds and OFF for non-target builds - allow explicit user override in both paths - apply LTO decision at crosscompile flag generation time - make monitor follow the same LTO rule path - add/update tests for default and override behavior Signed-off-by: ZhouGuangyuan <zhouguangyuan.xian@gmail.com>
Signed-off-by: ZhouGuangyuan <zhouguangyuan.xian@gmail.com>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces the first phase of global Dead Code Elimination (DCE) for Go programs compiled with Link-Time Optimization (LTO). The primary goal is to enhance optimization by making interface dispatch visible to LLVM's type-based devirtualization, while carefully preserving essential runtime functions. This involves significant changes to how LTO is configured, how interface methods are represented, and how metadata is attached to ABI types, all contributing to more efficient and smaller binaries under LTO. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces Link-Time Optimization (LTO) support to the llgo compiler, enabling more optimized builds, especially for target platforms. It adds a new -lto flag for explicit control over LTO, which defaults to enabled for target builds and disabled otherwise. The changes involve updating build configurations to conditionally apply LTO-specific compiler and linker flags, including passing codegen options via -mllvm to ldflags when LTO is active. Furthermore, the pull request integrates Go Global Dead Code Elimination (DCE) by introducing LLVM intrinsics, module flags for virtual function elimination, and type metadata for method capabilities, ensuring proper symbol preservation during LTO. Test cases have been added to validate the LTO flag's behavior and the correct application of LTO-related build settings. Feedback includes addressing an inconsistency in documentation regarding ThinLTO vs. Full LTO flags and improving the maintainability of magic numbers used for field indices in ssa/abitype.go.
| "-Wl,--icf=safe", | ||
| } | ||
| if enableLTO { | ||
| // Enable ThinLTO, using default lto kind(thinlto). |
There was a problem hiding this comment.
| equalField := baseType.Operand(7) | ||
| if !equalField.IsNull() && equalField.OperandsCount() > 0 { | ||
| equalFn := equalField.Operand(0) | ||
| if !equalFn.IsNull() { | ||
| b.Func.recordFakeUse(equalFn) | ||
| } | ||
| } | ||
| if _, ok := types.Unalias(t).(*types.Map); ok { | ||
| rt := b.Prog.rtNamed(b.Prog.abi.RuntimeName(t)) | ||
| runtimeType := b.Prog.Type(rt, InGo) | ||
| base := peelConstOperand0ToType(init, runtimeType.ll) | ||
| hasherField := base.Operand(4) |
There was a problem hiding this comment.
The field indices 7 (for equalField on line 145) and 4 (for hasherField on line 156) are magic numbers. This is fragile and can lead to silent errors if the runtime._type or runtime.maptype struct definitions change. It would be more maintainable to define these as constants with comments explaining what they represent, for example:
const (
// runtimeTypeEqualIndex is the field index of Equal in runtime._type.
runtimeTypeEqualIndex = 7
// mapTypeHasherIndex is the field index of Hasher in runtime.maptype.
mapTypeHasherIndex = 4
)| for v.Type() != target { | ||
| v = v.Operand(0) | ||
| } | ||
| return v |
There was a problem hiding this comment.
peelConstOperand0ToType has no termination guard. If the type chain doesn't contain target (e.g., due to a runtime struct layout change), this will either loop infinitely or crash via out-of-bounds Operand(0) on a value with zero operands (LLVM C API assertion).
Consider adding a safety check:
func peelConstOperand0ToType(v llvm.Value, target llvm.Type) llvm.Value {
for v.Type() != target {
if v.OperandsCount() == 0 {
panic("peelConstOperand0ToType: reached leaf without finding target type")
}
v = v.Operand(0)
}
return v
}| // Enable ThinLTO, using default lto kind(thinlto). | ||
| export.LDFLAGS = append(export.LDFLAGS, "-Wl,--lto-O1", "-v") | ||
| } | ||
| if clangRoot != "" { |
There was a problem hiding this comment.
Two issues on this line:
-
Debug
-vflag leaked into production: The-vflag causes clang to dump verbose output (full toolchain paths, sysroot, library search dirs, internal linker command lines) for every LTO-enabled build. This looks like a debugging leftover — it's not present in theUseTargetpath (line 511). Should be removed before merge. -
Comment/code mismatch (ThinLTO vs Full LTO): The comment on line 233 says "Enable ThinLTO, using default lto kind(thinlto)" but
-flto=full(line 259) selects full LTO, not thin LTO. These are fundamentally different strategies. Same mismatch exists inUseTargetat line 509. The comments should say "Full LTO" or the flags should use-flto=thin.
| func (p Program) methodCheckedLoad(b llvm.Builder, mod llvm.Module, typedesc llvm.Value, typeID string) llvm.Value { | ||
| mdVal := p.ctx.MetadataAsValue(p.ctx.MDString(typeID)) | ||
| res := llvm.CreateCall(b, p.llvmTypeCheckedLoad(mod).GlobalValueType(), p.llvmTypeCheckedLoad(mod), []llvm.Value{ | ||
| typedesc, |
There was a problem hiding this comment.
p.llvmTypeCheckedLoad(mod) is called twice here (and p.llvmAssume(mod) twice at line 95). Each call does a mod.NamedFunction() CGo lookup. Since this runs once per interface dispatch site, consider caching in a local:
tcl := p.llvmTypeCheckedLoad(mod)
res := llvm.CreateCall(b, tcl.GlobalValueType(), tcl, []llvm.Value{...})Same pattern at line 112 for llvmFakeUse.
| llvm.CreateCall(b, fnTy, asm, []llvm.Value{v}) | ||
| } | ||
|
|
||
| func (fn Function) emitFakeUses(b Builder) { |
There was a problem hiding this comment.
emitFakeUses (which uses llvm.fake.use) is defined but never called — only emitFakeUsesInlineAsm is wired up in EndBuild. If this is intentionally kept as a future alternative, a comment explaining this would help. Otherwise it's dead code that could be removed to avoid confusion about which emission strategy is active.
| hasVArg: hasVArg, | ||
| fakeUses: make([]llvm.Value, 0, 4), | ||
| fakeUseSet: make(map[llvm.Value]struct{}), | ||
| } |
There was a problem hiding this comment.
fakeUses and fakeUseSet are allocated for every function even when enableGoGlobalDCE is false (the common case today). In a large codebase with thousands of functions, the per-function make(map[...]) adds non-trivial overhead. Consider lazy init — leave them nil and allocate on the first recordFakeUse call:
func (p Function) recordFakeUse(v llvm.Value) {
if v.IsNil() { return }
if p.fakeUseSet == nil {
p.fakeUseSet = make(map[llvm.Value]struct{})
}
// ...
}|
|
||
| typeType := b.Prog.Type(b.Prog.rtNamed("Type"), InGo) | ||
| baseType := peelConstOperand0ToType(init, typeType.ll) | ||
| equalField := baseType.Operand(7) |
There was a problem hiding this comment.
Hardcoded operand indices 7 (Equal) and 4 (Hasher at line 156) silently break if the runtime struct layout changes — the Operand() call would either read the wrong field or crash if the struct has fewer fields. Consider adding a brief comment documenting which struct and field each index refers to, and/or an OperandsCount() assertion before access.
|
Well-structured PR — the LTO plumbing through CLI/build/crosscompile is clean, the reflect-safety approach via module flag merging is elegant, and the test coverage is solid. Key items to address before merge:
See inline comments for details. |
Summary
This PR adds phase-1 support for Go global DCE on the LTO path.
The implementation focuses on making non-empty interface dispatch visible to LLVM's type-based devirtualization / virtual function elimination pipeline, while keeping runtime-required function references alive through explicit fake-use handling.
What Changed
LTO control
-ltoCLI/config control with target-aware defaultsInterface method reachability
llvm.type.checked.loadgo.method.<name>:<normalized-signature>b.Load(pfn)pathABI type metadata
!typemetadata on ABI method slots!vcall_visibilitymetadata with linkage-unit visibility"Virtual Function Elim"for LTO buildsRuntime function keepalive
EqualandMapType.Hasherreferences alive through fake-use handlingReflect safety
"Virtual Function Elim"as anllvm.module.flagsentry usingminmerge behavior0Design Notes
Equalis intentionally not modeled as a normal vtable capability idTests
Ran:
GOWORK=/opt/data/00.Code/goplus/go.work GOCACHE=/tmp/llgo-gocache go test ./ssa -run 'TestAddTypeMetadata|TestReflectPackageDisablesVirtualFunctionElim|TestFakeUseValue|TestFakeUseValueInlineAsm|TestEmitFakeUsesAtEntry' -count=1GOWORK=/opt/data/00.Code/goplus/go.work GOCACHE=/tmp/llgo-gocache go test ./internal/build -run 'TestLTOEnabledDefault|TestLTOEnabledExplicitOverride' -count=1GOWORK=/opt/data/00.Code/goplus/go.work GOCACHE=/tmp/llgo-gocache go test ./cl -run 'TestGoGlobalDCEPhase1IR' -count=1Follow-ups
llvm.fake.usevs inline asm fallback)