fix(recent-block-cache): Reduce fetch fan out and preserve negative cache entries.#4421
fix(recent-block-cache): Reduce fetch fan out and preserve negative cache entries.#4421metalurgical wants to merge 4 commits into
Conversation
…che entries Use fetch_values instead of join_all to reduce fan out. Preserve negative cache entries instead of removing empty entries.
There was a problem hiding this comment.
Code Review
This pull request refactors the RecentBlockCache by removing the request_sharing dependency and simplifying the fetching logic into a single retry loop within fetch_inner_many. It also corrects a bug in the garbage collection process where negative cache entries (empty results) were being prematurely evicted, and adds a regression test to verify their persistence. Review feedback identified that the new retry implementation introduces unnecessary latency on the final failure, loses original error context by using anyhow::bail!, and violates repository guidelines regarding detailed error reporting for batch fetches.
| for _ in 0..=self.maximum_retries { | ||
| match self.fetcher.fetch_values(keys.clone(), block).await { | ||
| Ok(values) => return Ok(values), | ||
| Err(err) => tracing::warn!("retrying fetch because error: {:?}", err), | ||
| } | ||
| .boxed() | ||
| }); | ||
| shared.await.context("could not fetch liquidity") | ||
| tokio::time::sleep(self.delay_between_retries).await; | ||
| } | ||
| anyhow::bail!("could not fetch liquidity"); |
There was a problem hiding this comment.
The current retry logic in fetch_inner_many has several issues:
- Unnecessary Latency: It performs a
tokio::time::sleepeven after the final retry attempt fails, adding redundant delay to the overall request time. - Error Context Loss: By using
anyhow::bail!, the original RPC error (e.g., timeout, node error) is discarded. Using.context()preserves the root cause for better troubleshooting. - Batch Error Detail: According to repository guidelines for batch fetches, the implementation should explicitly indicate which items failed and provide error details for each, rather than returning a single error for the entire batch. Please ensure the response structure reflects individual item statuses.
let mut keys = keys;
for i in 0..=self.maximum_retries {
let current_keys = if i == self.maximum_retries {
keys
} else {
keys.clone()
};
match self.fetcher.fetch_values(current_keys, block).await {
Ok(values) => return Ok(values),
Err(err) => {
if i == self.maximum_retries {
return Err(anyhow::Error::from(err).context("could not fetch liquidity"));
}
tracing::warn!(?err, "retrying fetch");
tokio::time::sleep(self.delay_between_retries).await;
}
}
}
unreachable!()References
- When fetching a batch of items where individual fetches can fail, do not silently ignore errors. The API response should explicitly indicate which items failed and provide error details for each failure.
There was a problem hiding this comment.
Point 1 and 2 have been corrected.
Point 3 is a trait redesign, Result<Vec<V>> will need to become HashMap<K, Result<Vec<V>> in multiple places for this.
There was a problem hiding this comment.
Code Review
This pull request refactors the RecentBlockCache to implement batch fetching and simplified retry logic, removing the dependency on individual request sharing. It also fixes a bug in the cache's garbage collection logic that caused negative cache entries to be prematurely evicted and adds a regression test. A high-severity recommendation was made to include an early return for empty key sets in fetch_inner_many to avoid unnecessary RPC calls.
Description
Reduce fetch fan out and preserve negative cache entries.
Changes
fetch_values(keys, block)directly instead of expanding them into individual key requestsHow to test
cargo nextest run