[Enhancement] Split publish-trace SST miss counters by local-cache vs remote (backport #73087) #73191
Merged
Cherry-pick of 2b6cbed has failed. To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
@mergify[bot]: Backport conflict, please resolve the conflict and resubmit the PR.
… remote (#73087) Signed-off-by: luohaha <18810541851@163.com>
d926ea4 to 1a9dc09
luohaha approved these changes on May 13, 2026
Why I'm doing:
In shared-data publish traces today we have `read_block_miss_cache_cnt` on the persistent-index sstable `MultiGet` path. It tells us the in-memory sstable block cache missed, but not whether the resulting file read was served by the local data cache or went out to the remote object store (S3/OSS). That distinction is what we actually need to diagnose slow PK publishes: a miss that hits local disk is dramatically different from one that hits remote.

What I'm doing:
In `PersistentIndexSstable::multi_get`, snapshot the underlying `RandomAccessFile`'s `NumericStatistics` (`bytes_read_local_disk`, `bytes_read_remote`, `io_count_local_disk`, `io_count_remote`) before and after `Table::MultiGet`, and emit the delta as four new trace counters alongside the existing miss counter:

- `sstable_io_local_disk_bytes`
- `sstable_io_remote_bytes`
- `sstable_io_count_local_disk`
- `sstable_io_count_remote`

Streams without `NumericStatistics` (e.g. plain POSIX in shared-nothing UTs) report all-zero, so the wiring is safe across build flavors.

UT: a `FakeStatsInputStream` wraps the on-disk stream and attributes every successful `read_at_fully` to either the local-disk bucket or the remote bucket; three new cases (`test_multi_get_io_breakdown_local_disk`, `test_multi_get_io_breakdown_remote`, `test_multi_get_io_breakdown_no_stats`) drive the breakdown end-to-end and assert the published trace metrics, including the nullptr-stats branch.

Fixes #issue
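For reviewers who want the shape of the change without opening the diff, the snapshot-and-delta idea can be sketched roughly as below. This is a standalone illustration, not the code in this PR: `IoStats`, `FakeStream`, `delta`, and `measure_io` are hypothetical names invented here, standing in for the real `NumericStatistics`, `FakeStatsInputStream`, and the logic around `Table::MultiGet`.

```cpp
#include <cstdint>

// Hypothetical stand-in for the four stream statistics named in this PR.
struct IoStats {
    int64_t bytes_read_local_disk = 0;
    int64_t bytes_read_remote = 0;
    int64_t io_count_local_disk = 0;
    int64_t io_count_remote = 0;
};

// Field-wise difference (after - before), i.e. the I/O done in between.
inline IoStats delta(const IoStats& after, const IoStats& before) {
    IoStats d;
    d.bytes_read_local_disk = after.bytes_read_local_disk - before.bytes_read_local_disk;
    d.bytes_read_remote = after.bytes_read_remote - before.bytes_read_remote;
    d.io_count_local_disk = after.io_count_local_disk - before.io_count_local_disk;
    d.io_count_remote = after.io_count_remote - before.io_count_remote;
    return d;
}

// Hypothetical fake stream: attributes every read to one bucket, or keeps
// no statistics at all (models the nullptr-stats branch).
class FakeStream {
public:
    enum class Mode { kLocalDisk, kRemote, kNoStats };

    explicit FakeStream(Mode mode) : _mode(mode) {}

    // Pretend to read nbytes and account for it in the configured bucket.
    void read(int64_t nbytes) {
        if (_mode == Mode::kLocalDisk) {
            _stats.bytes_read_local_disk += nbytes;
            _stats.io_count_local_disk += 1;
        } else if (_mode == Mode::kRemote) {
            _stats.bytes_read_remote += nbytes;
            _stats.io_count_remote += 1;
        }
        // kNoStats: the read happens but nothing is recorded.
    }

    // Returns nullptr when the stream exposes no statistics.
    const IoStats* stats() const {
        return _mode == Mode::kNoStats ? nullptr : &_stats;
    }

private:
    Mode _mode;
    IoStats _stats;
};

// Snapshot the stats, run the lookup, snapshot again, and return the delta.
// A stream without statistics yields an all-zero result.
template <typename Fn>
IoStats measure_io(const FakeStream& stream, Fn&& lookup) {
    const IoStats* s = stream.stats();
    const IoStats before = (s != nullptr) ? *s : IoStats();
    lookup();
    s = stream.stats();
    const IoStats after = (s != nullptr) ? *s : IoStats();
    return delta(after, before);
}
```

In this sketch the four fields of the returned `IoStats` would map onto `sstable_io_local_disk_bytes`, `sstable_io_remote_bytes`, `sstable_io_count_local_disk`, and `sstable_io_count_remote`. Taking a delta rather than the absolute values means reads issued on the same stream before the lookup are not attributed to it.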
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check:
This is an automatic backport of pull request [Enhancement] Split publish-trace SST miss counters by local-cache vs remote #73087 done by Mergify.