feat(lto): add global DCE phase-1 support for interface dispatch #1752

Open
zhouguangyuan0718 wants to merge 2 commits into goplus:main from zhouguangyuan0718:main-lto-dce-reflect-19
Conversation

@zhouguangyuan0718
Contributor

Summary

This PR adds phase-1 support for Go global DCE on the LTO path.

The implementation focuses on making non-empty interface dispatch visible to LLVM's type-based devirtualization / virtual function elimination pipeline, while keeping runtime-required function references alive through explicit fake-use handling.

What Changed

LTO control

  • add explicit -lto CLI/config control with target-aware defaults
  • thread the LTO setting through build/crosscompile into SSA
  • enable Go global DCE only when LTO is enabled

Interface method reachability

  • lower non-empty interface method calls with llvm.type.checked.load
  • attach method capability metadata using stable ids of the form:
    • go.method.<name>:<normalized-signature>
  • keep the original call typing behavior by replacing the existing b.Load(pfn) path
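
The capability id scheme above could be sketched as follows. This is only an illustration of the `go.method.<name>:<normalized-signature>` shape, not the code in this PR: `methodID`, `stripNames`, and `exampleSig` are hypothetical helpers, and the normalization shown here (dropping parameter names and qualifying packages by import path) is an assumption about what "normalized" means.

```go
package main

import (
	"fmt"
	"go/token"
	"go/types"
)

// stripNames rebuilds a tuple with anonymous variables so that parameter
// and result names do not influence the printed signature.
func stripNames(t *types.Tuple) *types.Tuple {
	vars := make([]*types.Var, t.Len())
	for i := range vars {
		vars[i] = types.NewVar(token.NoPos, nil, "", t.At(i).Type())
	}
	return types.NewTuple(vars...)
}

// methodID is a hypothetical helper that builds an id of the form
// go.method.<name>:<normalized-signature>. The normalization rules are
// assumed for this sketch, not taken from the PR.
func methodID(name string, sig *types.Signature) string {
	norm := types.NewSignatureType(nil, nil, nil,
		stripNames(sig.Params()), stripNames(sig.Results()), sig.Variadic())
	qualify := func(p *types.Package) string { return p.Path() }
	return "go.method." + name + ":" + types.TypeString(norm, qualify)
}

// exampleSig builds func(<paramName> int) string for demonstration.
func exampleSig(paramName string) *types.Signature {
	params := types.NewTuple(types.NewVar(token.NoPos, nil, paramName, types.Typ[types.Int]))
	results := types.NewTuple(types.NewVar(token.NoPos, nil, "", types.Typ[types.String]))
	return types.NewSignatureType(nil, nil, nil, params, results, false)
}

func main() {
	// Both calls print the same id because parameter names are stripped.
	fmt.Println(methodID("Get", exampleSig("key")))
	fmt.Println(methodID("Get", exampleSig("")))
}
```

The point of a stable id is that an interface call site and every concrete method slot with the same name and signature agree on the same string, regardless of how the parameters were spelled in source.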

ABI type metadata

  • add !type metadata on ABI method slots
  • add !vcall_visibility metadata with linkage-unit visibility
  • add module flag "Virtual Function Elim" for LTO builds

Runtime function keepalive

  • keep Equal and MapType.Hasher references alive through fake-use handling
  • collect fake-use targets per function and emit them at entry
  • add an inline-asm-based fake-use helper so version-specific selection can be done later

Reflect safety

  • emit "Virtual Function Elim" as an llvm.module.flags entry using min merge behavior
  • force the reflect package to emit value 0
  • this allows LTO merging to automatically disable VFE whenever reflect is linked in
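
The effect of min merge behavior can be modeled in a few lines. The sketch below is a plain-Go simulation, not LLVM API code: `moduleFlag` and `mergeMin` are hypothetical names, and the only property taken from the description above is that the merged value is the minimum across linked modules, so a single module emitting 0 (here, reflect) turns the flag off for the whole link.

```go
package main

import "fmt"

// moduleFlag models one module's "Virtual Function Elim" flag value.
type moduleFlag struct {
	Module string
	VFE    uint32
}

// mergeMin returns the smallest flag value across all linked modules,
// mimicking a module flag with min merge behavior. The bool result is
// false when no module carries the flag.
func mergeMin(flags []moduleFlag) (uint32, bool) {
	if len(flags) == 0 {
		return 0, false
	}
	merged := flags[0].VFE
	for _, f := range flags[1:] {
		if f.VFE < merged {
			merged = f.VFE
		}
	}
	return merged, true
}

func main() {
	withoutReflect := []moduleFlag{{"main", 1}, {"fmt", 1}}
	withReflect := append(withoutReflect, moduleFlag{"reflect", 0})

	v, _ := mergeMin(withoutReflect)
	fmt.Println("VFE without reflect:", v)
	v, _ = mergeMin(withReflect)
	fmt.Println("VFE with reflect:", v)
}
```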

Design Notes

  • phase-1 only targets non-empty interface method dispatch
  • Equal is intentionally not modeled as a normal vtable capability id
  • reflect-related disabling is handled through module flag merging instead of front-end callsite tracking
  • fake-use support currently keeps the abstraction separate so backend/version selection can evolve independently

Tests

Ran:

  • GOWORK=/opt/data/00.Code/goplus/go.work GOCACHE=/tmp/llgo-gocache go test ./ssa -run 'TestAddTypeMetadata|TestReflectPackageDisablesVirtualFunctionElim|TestFakeUseValue|TestFakeUseValueInlineAsm|TestEmitFakeUsesAtEntry' -count=1
  • GOWORK=/opt/data/00.Code/goplus/go.work GOCACHE=/tmp/llgo-gocache go test ./internal/build -run 'TestLTOEnabledDefault|TestLTOEnabledExplicitOverride' -count=1
  • GOWORK=/opt/data/00.Code/goplus/go.work GOCACHE=/tmp/llgo-gocache go test ./cl -run 'TestGoGlobalDCEPhase1IR' -count=1

Follow-ups

  • add phase-2 interface precision / finer-grained reachability
  • move reflect-sensitive policy into the LTO plugin where appropriate
  • choose fake-use lowering by LLVM version (llvm.fake.use vs inline asm fallback)

This change newly introduces end-to-end LTO control in llgo.

- add a new tri-state -lto option (auto/true/false)

- default LTO ON for embedded target builds and OFF for non-target builds

- allow explicit user override in both paths

- apply LTO decision at crosscompile flag generation time

- make monitor follow the same LTO rule path

- add/update tests for default and override behavior
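
A minimal sketch of the tri-state decision described in this list, assuming the option accepts exactly `auto`, `true`, and `false`. The names `LTOMode`, `parseLTOMode`, and `ltoEnabled` are hypothetical and chosen for illustration; the actual identifiers in the PR may differ.

```go
package main

import (
	"fmt"
)

// LTOMode is a hypothetical tri-state for the -lto option.
type LTOMode int

const (
	LTOAuto LTOMode = iota // follow the target-aware default
	LTOOn                  // explicit -lto=true
	LTOOff                 // explicit -lto=false
)

// parseLTOMode maps the textual option value to a mode.
func parseLTOMode(s string) (LTOMode, error) {
	switch s {
	case "auto":
		return LTOAuto, nil
	case "true":
		return LTOOn, nil
	case "false":
		return LTOOff, nil
	}
	return LTOAuto, fmt.Errorf("invalid -lto value %q (want auto, true, or false)", s)
}

// ltoEnabled resolves the final decision: an explicit setting always wins,
// and auto means ON for embedded target builds and OFF otherwise.
func ltoEnabled(mode LTOMode, hasTarget bool) bool {
	switch mode {
	case LTOOn:
		return true
	case LTOOff:
		return false
	default:
		return hasTarget
	}
}

func main() {
	for _, s := range []string{"auto", "true", "false"} {
		mode, err := parseLTOMode(s)
		if err != nil {
			panic(err)
		}
		fmt.Printf("-lto=%s target=%v host=%v\n", s, ltoEnabled(mode, true), ltoEnabled(mode, false))
	}
}
```

Resolving the decision in one function keeps the build path and the monitor path on the same rule, which is the property the list above asks for.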

Signed-off-by: ZhouGuangyuan <zhouguangyuan.xian@gmail.com>
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the first phase of global Dead Code Elimination (DCE) for Go programs compiled with Link-Time Optimization (LTO). The primary goal is to enhance optimization by making interface dispatch visible to LLVM's type-based devirtualization, while carefully preserving essential runtime functions. This involves significant changes to how LTO is configured, how interface methods are represented, and how metadata is attached to ABI types, all contributing to more efficient and smaller binaries under LTO.

Highlights

  • Global DCE Phase-1 Support: Introduced initial support for Go global Dead Code Elimination (DCE) within the Link-Time Optimization (LTO) path, specifically targeting non-empty interface dispatch.
  • LTO Control and Configuration: Added explicit command-line interface (-lto) and configuration options for LTO, with target-aware defaults. The LTO setting is now threaded through the build and cross-compilation processes into the SSA representation, enabling global DCE only when LTO is active.
  • Interface Method Reachability: Implemented lowering of non-empty interface method calls using llvm.type.checked.load and attached method capability metadata with stable IDs (e.g., go.method.<name>:<normalized-signature>). This ensures LLVM's type-based devirtualization can effectively eliminate unused virtual functions.
  • ABI Type Metadata and VFE: Added !type metadata on ABI method slots and !vcall_visibility metadata to indicate linkage-unit visibility. A module flag "Virtual Function Elim" is now added for LTO builds, with special handling for the reflect package to automatically disable Virtual Function Elimination (VFE) when linked.
  • Runtime Function Keepalive: Ensured critical runtime functions like Equal and MapType.Hasher are kept alive through explicit fake-use handling. Fake-use targets are collected per function and emitted at entry, utilizing an inline-assembly-based helper for future version-specific selection.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces Link-Time Optimization (LTO) support to the llgo compiler, enabling more optimized builds, especially for target platforms. It adds a new -lto flag for explicit control over LTO, which defaults to enabled for target builds and disabled otherwise. The changes involve updating build configurations to conditionally apply LTO-specific compiler and linker flags, including passing codegen options via -mllvm to ldflags when LTO is active. Furthermore, the pull request integrates Go Global Dead Code Elimination (DCE) by introducing LLVM intrinsics, module flags for virtual function elimination, and type metadata for method capabilities, ensuring proper symbol preservation during LTO. Test cases have been added to validate the LTO flag's behavior and the correct application of LTO-related build settings. Feedback includes addressing an inconsistency in documentation regarding ThinLTO vs. Full LTO flags and improving the maintainability of magic numbers used for field indices in ssa/abitype.go.

"-Wl,--icf=safe",
}
if enableLTO {
// Enable ThinLTO, using default lto kind(thinlto).
medium

The comment mentions enabling ThinLTO, but the compiler flag being added on line 257 is -flto=full, which enables full LTO. This is inconsistent and could be confusing. Please update the comment to reflect that full LTO is being used. A similar issue exists on line 509.

Comment on lines +145 to +156
equalField := baseType.Operand(7)
if !equalField.IsNull() && equalField.OperandsCount() > 0 {
	equalFn := equalField.Operand(0)
	if !equalFn.IsNull() {
		b.Func.recordFakeUse(equalFn)
	}
}
if _, ok := types.Unalias(t).(*types.Map); ok {
	rt := b.Prog.rtNamed(b.Prog.abi.RuntimeName(t))
	runtimeType := b.Prog.Type(rt, InGo)
	base := peelConstOperand0ToType(init, runtimeType.ll)
	hasherField := base.Operand(4)
medium

The field indices 7 (for equalField on line 145) and 4 (for hasherField on line 156) are magic numbers. This is fragile and can lead to silent errors if the runtime._type or runtime.maptype struct definitions change. It would be more maintainable to define these as constants with comments explaining what they represent, for example:

const (
    // runtimeTypeEqualIndex is the field index of Equal in runtime._type.
    runtimeTypeEqualIndex = 7
    // mapTypeHasherIndex is the field index of Hasher in runtime.maptype.
    mapTypeHasherIndex = 4
)

	for v.Type() != target {
		v = v.Operand(0)
	}
	return v
peelConstOperand0ToType has no termination guard. If the type chain doesn't contain target (e.g., due to a runtime struct layout change), this will either loop infinitely or crash via out-of-bounds Operand(0) on a value with zero operands (LLVM C API assertion).

Consider adding a safety check:

func peelConstOperand0ToType(v llvm.Value, target llvm.Type) llvm.Value {
	for v.Type() != target {
		if v.OperandsCount() == 0 {
			panic("peelConstOperand0ToType: reached leaf without finding target type")
		}
		v = v.Operand(0)
	}
	return v
}

// Enable ThinLTO, using default lto kind(thinlto).
export.LDFLAGS = append(export.LDFLAGS, "-Wl,--lto-O1", "-v")
}
if clangRoot != "" {
Two issues on this line:

  1. Debug -v flag leaked into production: The -v flag causes clang to dump verbose output (full toolchain paths, sysroot, library search dirs, internal linker command lines) for every LTO-enabled build. This looks like a debugging leftover — it's not present in the UseTarget path (line 511). Should be removed before merge.

  2. Comment/code mismatch (ThinLTO vs Full LTO): The comment on line 233 says "Enable ThinLTO, using default lto kind(thinlto)" but -flto=full (line 259) selects full LTO, not thin LTO. These are fundamentally different strategies. Same mismatch exists in UseTarget at line 509. The comments should say "Full LTO" or the flags should use -flto=thin.
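
One way to keep the comment and the flag from drifting apart is to derive both from a single constant. The sketch below is an assumption-laden illustration, not the code in crosscompile.go: `ltoKind` and `ltoFlags` are hypothetical names, and the split between compile and link flags is assumed. Only `-flto=thin`, `-flto=full`, and `-Wl,--lto-O1` are taken from the flags discussed here, and the debug `-v` is deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// ltoKind names the LTO strategy; the string value is what clang accepts
// after -flto=.
type ltoKind string

const (
	ltoThin ltoKind = "thin"
	ltoFull ltoKind = "full"
)

// ltoFlags is a hypothetical helper returning compile flags, link flags,
// and a human-readable description that are all derived from one kind, so
// the description cannot contradict the flag that is actually emitted.
func ltoFlags(kind ltoKind) (ccflags, ldflags []string, comment string) {
	ccflags = []string{"-flto=" + string(kind)}
	ldflags = []string{"-Wl,--lto-O1"} // no "-v": verbose output is debug-only
	comment = "Enable LTO, kind: " + string(kind)
	return ccflags, ldflags, comment
}

func main() {
	cc, ld, comment := ltoFlags(ltoFull)
	fmt.Println("//", comment)
	fmt.Println("CCFLAGS:", strings.Join(cc, " "))
	fmt.Println("LDFLAGS:", strings.Join(ld, " "))
}
```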

func (p Program) methodCheckedLoad(b llvm.Builder, mod llvm.Module, typedesc llvm.Value, typeID string) llvm.Value {
	mdVal := p.ctx.MetadataAsValue(p.ctx.MDString(typeID))
	res := llvm.CreateCall(b, p.llvmTypeCheckedLoad(mod).GlobalValueType(), p.llvmTypeCheckedLoad(mod), []llvm.Value{
		typedesc,
p.llvmTypeCheckedLoad(mod) is called twice here (and p.llvmAssume(mod) twice at line 95). Each call does a mod.NamedFunction() CGo lookup. Since this runs once per interface dispatch site, consider caching in a local:

tcl := p.llvmTypeCheckedLoad(mod)
res := llvm.CreateCall(b, tcl.GlobalValueType(), tcl, []llvm.Value{...})

Same pattern at line 112 for llvmFakeUse.

llvm.CreateCall(b, fnTy, asm, []llvm.Value{v})
}

func (fn Function) emitFakeUses(b Builder) {
emitFakeUses (which uses llvm.fake.use) is defined but never called — only emitFakeUsesInlineAsm is wired up in EndBuild. If this is intentionally kept as a future alternative, a comment explaining this would help. Otherwise it's dead code that could be removed to avoid confusion about which emission strategy is active.

hasVArg: hasVArg,
fakeUses: make([]llvm.Value, 0, 4),
fakeUseSet: make(map[llvm.Value]struct{}),
}
fakeUses and fakeUseSet are allocated for every function even when enableGoGlobalDCE is false (the common case today). In a large codebase with thousands of functions, the per-function make(map[...]) adds non-trivial overhead. Consider lazy init — leave them nil and allocate on the first recordFakeUse call:

func (p Function) recordFakeUse(v llvm.Value) {
	if v.IsNil() {
		return
	}
	if p.fakeUseSet == nil {
		p.fakeUseSet = make(map[llvm.Value]struct{})
	}
	// ...
}
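
For reference, a self-contained model of the lazy-init idea, using symbol names instead of `llvm.Value` so it runs without the LLVM bindings. The `function` type here is a hypothetical stand-in for the SSA function type; it only shows that the zero value allocates nothing and that repeated targets are recorded once, in first-seen order.

```go
package main

import "fmt"

// function is a stand-in for the SSA function type. Both fields stay nil
// until the first fake-use target is recorded.
type function struct {
	fakeUses   []string
	fakeUseSet map[string]struct{}
}

// recordFakeUse records sym once, allocating the set on first use.
func (f *function) recordFakeUse(sym string) {
	if sym == "" {
		return
	}
	if f.fakeUseSet == nil {
		f.fakeUseSet = make(map[string]struct{})
	}
	if _, seen := f.fakeUseSet[sym]; seen {
		return
	}
	f.fakeUseSet[sym] = struct{}{}
	f.fakeUses = append(f.fakeUses, sym)
}

func main() {
	var idle function // never records anything, so nothing is allocated
	fmt.Println(idle.fakeUseSet == nil, len(idle.fakeUses))

	var fn function
	for _, sym := range []string{"T.Equal", "MapType.Hasher", "T.Equal"} {
		fn.recordFakeUse(sym)
	}
	fmt.Println(fn.fakeUses)
}
```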


typeType := b.Prog.Type(b.Prog.rtNamed("Type"), InGo)
baseType := peelConstOperand0ToType(init, typeType.ll)
equalField := baseType.Operand(7)
Hardcoded operand indices 7 (Equal) and 4 (Hasher at line 156) silently break if the runtime struct layout changes — the Operand() call would either read the wrong field or crash if the struct has fewer fields. Consider adding a brief comment documenting which struct and field each index refers to, and/or an OperandsCount() assertion before access.
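
A sketch of what the named index plus bounds assertion could look like, modeled on a plain slice so it runs without the LLVM bindings. `operandAt` is a hypothetical helper, and the constant names follow the ones suggested earlier in this review; in the real code the length check would use `OperandsCount()` on the constant struct.

```go
package main

import (
	"fmt"
)

const (
	// runtimeTypeEqualIndex is the field index of Equal in the runtime type struct.
	runtimeTypeEqualIndex = 7
	// mapTypeHasherIndex is the field index of Hasher in the runtime map type struct.
	mapTypeHasherIndex = 4
)

// operandAt returns operands[idx], or an error naming the field when the
// struct has fewer operands than the index assumes.
func operandAt(operands []string, idx int, field string) (string, error) {
	if idx < 0 || idx >= len(operands) {
		return "", fmt.Errorf("%s: index %d out of range for %d operands; runtime layout may have changed",
			field, idx, len(operands))
	}
	return operands[idx], nil
}

func main() {
	// A struct initializer with eight fields; index 7 holds the Equal slot.
	fields := []string{"f0", "f1", "f2", "f3", "f4", "f5", "f6", "equalFn"}
	eq, err := operandAt(fields, runtimeTypeEqualIndex, "Equal")
	fmt.Println(eq, err)

	// A shorter struct reports a clear error instead of crashing.
	_, err = operandAt(fields[:3], mapTypeHasherIndex, "Hasher")
	fmt.Println(err)
}
```

Note that a bounds check only catches a struct that became shorter; a reordered layout with the same field count would still need a comment or a test pinning the index to the field.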

@xgopilot
Contributor

xgopilot bot commented Mar 26, 2026

Well-structured PR — the LTO plumbing through CLI/build/crosscompile is clean, the reflect-safety approach via module flag merging is elegant, and the test coverage is solid. Key items to address before merge:

  • Debug -v flag left in production LDFLAGS (crosscompile.go:234)
  • Comment/code mismatch: comments say "ThinLTO" but -flto=full selects full LTO
  • peelConstOperand0ToType needs a termination guard to prevent compiler crashes
  • emitFakeUses is dead code (only the inline-asm variant is wired up)
  • Unconditional fakeUseSet/fakeUses allocation per function even when DCE is off — consider lazy init

See inline comments for details.
