
PS-11136: non-GTID transactions cause one storage flush per transaction, bypassing size/interval checkpointing #127

Merged
kamil-holubicki merged 1 commit into Percona-Lab:0.2 from kamil-holubicki:PS-11136
May 25, 2026


@kamil-holubicki
Collaborator

https://perconadev.atlassian.net/browse/PS-11136

Problem:
In non-GTID (anonymous transaction) replication mode, PBS flushes its in-memory event buffer to the storage backend on every transaction boundary, ignoring the configured 'checkpoint_size_bytes' and 'checkpoint_interval_seconds' thresholds. For object-store backends this turns into one PUT per transaction.

Cause:
'storage::write_event()' had a fast-path keyed on
'at_transaction_boundary && transaction_gtid.is_empty()' whose intent was "flush regardless of thresholds because this is the file-final ROTATE/STOP event". The condition wasn't tight enough: anonymous transactions also satisfy it (they never populate 'transaction_gtid_'), so every XID terminating an anonymous transaction was misidentified as a file terminator and forced a synchronous flush.
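
To make the misclassification concrete, here is a minimal sketch of the removed predicate. It is not the PBS code: 'event_state' and 'legacy_fast_path()' are hypothetical names, 'std::string' stands in for the real GTID type, and the GTID value is only illustrative. The point is that a file-final ROTATE/STOP event and the XID of an anonymous transaction present the same inputs, so the condition cannot tell them apart.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the state write_event() consulted; not the real PBS type.
struct event_state {
  bool at_transaction_boundary;
  std::string transaction_gtid;  // stays empty for anonymous transactions
};

// The removed fast-path condition, as described in the PR text.
inline bool legacy_fast_path(const event_state& s) {
  return s.at_transaction_boundary && s.transaction_gtid.empty();
}

// Walks through the three cases discussed in the Cause paragraph.
inline void check_examples() {
  const event_state rotate_or_stop{true, ""};  // intended target of the fast-path
  const event_state anonymous_xid{true, ""};   // indistinguishable from the above
  const event_state gtid_xid{true, "3E11FA47-71CA-11E1-9E33-C80AA9429562:23"};

  assert(legacy_fast_path(rotate_or_stop));  // forced flush, as intended
  assert(legacy_fast_path(anonymous_xid));   // forced flush, unintended
  assert(!legacy_fast_path(gtid_xid));       // falls through to threshold checks
}
```

Under these assumptions, every committed anonymous transaction takes the forced-flush branch, which matches the one-flush-per-transaction behavior reported above.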

Solution:
Removed the fast-path. The file-final ROTATE/STOP event is still flushed - just through the already-existing 'storage::close_binlog()' call on the 'process_rotate_or_stop_event()' / artificial-rotate rename paths, which is the natural place for a file-boundary flush. GTID-mode behavior is unchanged.
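
The resulting behavior can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the actual 'storage' class: 'buffered_storage', 'flush_count()' and the explicit 'now_seconds' argument are hypothetical, and it assumes threshold checks happen only at transaction boundaries. 'write_event()' flushes only when a size or interval threshold is crossed, while 'close_binlog()' flushes whatever is still buffered at the file boundary.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of threshold-driven checkpointing; not the PBS storage class.
class buffered_storage {
public:
  buffered_storage(std::size_t checkpoint_size_bytes,
                   std::uint64_t checkpoint_interval_seconds)
      : size_limit_(checkpoint_size_bytes),
        interval_limit_(checkpoint_interval_seconds) {}

  // Buffers an event. A flush happens only at a transaction boundary and only
  // once one of the thresholds is reached; an empty GTID plays no role.
  // now_seconds is assumed to be monotonically non-decreasing.
  void write_event(std::size_t event_size, bool at_transaction_boundary,
                   std::uint64_t now_seconds) {
    buffered_bytes_ += event_size;
    if (!at_transaction_boundary) return;
    const bool size_hit = buffered_bytes_ >= size_limit_;
    const bool time_hit = now_seconds - last_flush_seconds_ >= interval_limit_;
    if (size_hit || time_hit) flush(now_seconds);
  }

  // File-boundary flush: called on the ROTATE/STOP path, regardless of thresholds.
  void close_binlog(std::uint64_t now_seconds) {
    if (buffered_bytes_ != 0) flush(now_seconds);
  }

  std::size_t flush_count() const { return flush_count_; }

private:
  // Stands in for the backend write (for an object store, one PUT).
  void flush(std::uint64_t now_seconds) {
    buffered_bytes_ = 0;
    last_flush_seconds_ = now_seconds;
    ++flush_count_;
  }

  std::size_t size_limit_;
  std::uint64_t interval_limit_;
  std::size_t buffered_bytes_ = 0;
  std::uint64_t last_flush_seconds_ = 0;
  std::size_t flush_count_ = 0;
};
```

In this model, ten 50-byte anonymous transactions written within the interval against a 1000-byte size threshold produce no flush on their own; the single flush comes from 'close_binlog()' when the file ends.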

Comment thread mtr/binlog_streaming/t/binlog_flush.test Outdated
Collaborator

@percona-ysorokin percona-ysorokin left a comment


LGTM

@kamil-holubicki kamil-holubicki merged commit 6f29656 into Percona-Lab:0.2 May 25, 2026
7 checks passed


2 participants