Feature Request
Is your feature request related to a problem? Please describe:
Describe the feature you'd like:
In v1, once the PD /ready interface reports ready, the operator starts evicting leaders from the TiKV store. Before doing so, it retrieves the current TiKV leader count from PD. However, although the loadRegion process has completed on PD at that point, the leader count may not yet be up to date, because region heartbeats need some time to arrive and refresh the information.
As a result, the TiKV region leader count recorded at that moment may be inaccurate, often significantly lower than the actual number. Because this recorded count is used to determine how long to wait for rebalancing, an underestimated value can cause the next TiKV node to restart earlier than intended. That premature restart can then cause an excessive number of leader transfers during the second TiKV restart, which in turn increases CDC lag.
It would be helpful to support a new leader-ready API. This could perhaps be built on the new PD API in tikv/pd#9852.
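To make the intended ordering concrete, here is a minimal sketch of how the operator could gate the leader-count snapshot on a leader-ready signal. It is written in Python only for brevity and is not the operator's actual code. Every name in it (`record_leader_count_when_ready`, `leaders_ready`, `get_leader_count`, the timeout and poll interval) is hypothetical, and the shape of the real leader-ready API in PD is an assumption, since it is not defined in this issue.

```python
import time
from typing import Callable


def record_leader_count_when_ready(
    leaders_ready: Callable[[], bool],      # hypothetical: wraps the proposed leader-ready check
    get_leader_count: Callable[[], int],    # hypothetical: wraps the existing PD leader-count query
    timeout_s: float = 300.0,               # assumed value, not taken from the operator
    poll_interval_s: float = 5.0,           # assumed value, not taken from the operator
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Return the leader count only after the leader-ready check passes.

    Sampling the count before region heartbeats have caught up is what
    produces the underestimated value described above, so the snapshot
    is deferred until leaders_ready() returns True.
    """
    deadline = clock() + timeout_s
    while True:
        if leaders_ready():
            return get_leader_count()
        if clock() >= deadline:
            raise TimeoutError("leader-ready check did not pass before the timeout")
        sleep(poll_interval_s)
```

With a gate like this, eviction would start only after the snapshot is taken, so the value later used while waiting for rebalancing reflects the store's real leader count. Passing `sleep` and `clock` as parameters is just a convenience for testing the loop without real delays.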
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Migration Strategy: