fix(deltaproxycache): better handling of failed requests#989
Elia-Renzoni wants to merge 5 commits into
	mts[i] = nts
} else if resp.StatusCode != http.StatusOK {
	errs[i] = tpe.ErrUnexpectedUpstreamResponse
	errTs[i] = el[i]
When are we using these error values? It doesn't look like we are; is that intended? I would expect us to log these somewhere. Previously we were returning errors.Join(errs...).
I removed the use of errors in favor of returning the failed extents in the headers. The errors are still being logged, in particular UnexpectedUpstreamResponse and the unmarshaling errors. The only case that is not currently logged is the error coming from the Fetch function call.
The error from the Fetch function is also logged within the function itself, e.g.:
if err != nil {
	logger.Error("error reading body from http response",
		logging.Pairs{"url": pr.URL.String(), "detail": err.Error()})
	return body, resp, 0, err
}

// TestDeltaProxyCacheRequestPartialHitWithFailedExtents verifies that when
// a partial hit occurs and the upstream request for the missing fragment fails,
// the failed extents are properly tracked and reported in the response header.
func TestDeltaProxyCacheRequestPartialHitWithFailedExtents(t *testing.T) {
I think we need more test cases:
- multi-extent partial failure (some succeed, some fail)
- header round-trip parse test for failed=
- MergeResultHeaderVals with a failed= header on either side
- sharded fetchTimeseries with a single shard failing
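As a rough illustration of the header round-trip case, here is a minimal sketch of such a test's core logic. Everything in it is an assumption made for illustration: the `Extent` type, the `formatFailed` and `parseFailed` helpers, the `failed=[start-end,...]` encoding, and the surrounding header fields are hypothetical and are not taken from this PR.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Extent is a hypothetical stand-in for a time range, using
// non-negative integer timestamps for simplicity.
type Extent struct{ Start, End int64 }

// formatFailed renders extents as "failed=[s-e,s-e]".
// The exact encoding is an assumption, not the PR's format.
func formatFailed(el []Extent) string {
	parts := make([]string, len(el))
	for i, e := range el {
		parts[i] = strconv.FormatInt(e.Start, 10) + "-" + strconv.FormatInt(e.End, 10)
	}
	return "failed=[" + strings.Join(parts, ",") + "]"
}

// parseFailed finds the failed= field in a semicolon-separated header
// value and decodes its extents. It returns nil if the field is absent.
func parseFailed(header string) ([]Extent, error) {
	for _, field := range strings.Split(header, ";") {
		field = strings.TrimSpace(field)
		if !strings.HasPrefix(field, "failed=") {
			continue
		}
		v := strings.TrimPrefix(field, "failed=")
		v = strings.TrimSuffix(strings.TrimPrefix(v, "["), "]")
		if v == "" {
			return nil, nil
		}
		var out []Extent
		for _, p := range strings.Split(v, ",") {
			s, e, ok := strings.Cut(p, "-")
			if !ok {
				return nil, fmt.Errorf("malformed extent %q", p)
			}
			start, err := strconv.ParseInt(s, 10, 64)
			if err != nil {
				return nil, err
			}
			end, err := strconv.ParseInt(e, 10, 64)
			if err != nil {
				return nil, err
			}
			out = append(out, Extent{Start: start, End: end})
		}
		return out, nil
	}
	return nil, nil
}

func main() {
	in := []Extent{{0, 60}, {120, 180}}
	// The leading fields are illustrative only.
	header := "engine=DeltaProxyCache; status=phit; " + formatFailed(in)
	out, err := parseFailed(header)
	fmt.Println(header, out, err)
}
```

A real test would compare the parsed extents against the input with the project's own types and header helpers; the point here is only that encoding and decoding should be exact inverses, including for malformed input.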
here are some failing / panic'ing tests 141a374
Signed-off-by: Elia Renzoni <elia.renzoni03@gmail.com>
2e17b90 to 9065dd2
Description
This update improves how DeltaProxyCache handles and reports failures when serving partial-hit requests that involve multiple upstream extents. With this change, failed extents are tracked explicitly during the fetchExtents() process in a dedicated parallel structure, rather than being aggregated into a single error. This preserves detailed information about which specific extents failed.
fetchExtents() now returns (ExtentList, bool): the list of failed extents and a flag indicating whether a full failure occurred. A full failure (or severe fault) occurs when all of the requests fail; in that case the severeFault flag is set to true, and otherwise it is false.
Related Issue: #291
Type of Change
AI Disclosure