
Recover S3 stream after skip failure #29419

Open
electrum wants to merge 2 commits into master from user/electrum/s3input

Conversation

@electrum
Member

@electrum electrum commented May 11, 2026

Release notes

(x) Release notes are required, with the following suggested text:

## Delta Lake connector
* Retry on S3 failure during forward seek operations. ({issue}`29419`)

## Hive connector
* Retry on S3 failure during forward seek operations. ({issue}`29419`)

## Hudi connector
* Retry on S3 failure during forward seek operations. ({issue}`29419`)

## Iceberg connector
* Retry on S3 failure during forward seek operations. ({issue}`29419`)

## Lakehouse connector
* Retry on S3 failure during forward seek operations. ({issue}`29419`)

Copilot AI left a comment

Pull request overview

This PR updates the S3 filesystem input stream behavior to be more resilient to transient S3 failures when performing forward seeks, which benefits lakehouse connectors that rely on forward seeking while reading object data.

Changes:

  • Retry forward-seek operations by reopening the S3 stream when skip() fails with an IOException.
  • Remove the previous zero-length response workaround tied to an AWS SDK issue.
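
The first change can be pictured with a minimal sketch. This is not the Trino `S3InputStream` code: `RecoveringStream` and its `Opener` interface are hypothetical names, and `Opener.open` stands in for a ranged S3 GET that starts at a given offset. The sketch assumes that a failed or short `skip()` is handled by discarding the stream and reopening at the requested position, and that interruption is not retried.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

// Hypothetical sketch of "reopen after skip failure"; not the actual Trino class.
final class RecoveringStream
{
    // Hypothetical stand-in for a ranged GET beginning at the given offset.
    interface Opener
    {
        InputStream open(long position)
                throws IOException;
    }

    private final Opener opener;
    private InputStream in;         // currently open response stream, or null
    private long streamPosition;    // offset of the next byte `in` would return
    private long nextReadPosition;  // offset the caller wants to read next

    RecoveringStream(Opener opener)
    {
        this.opener = opener;
    }

    void seek(long position)
    {
        // Seeking is lazy: only the target is recorded here.
        nextReadPosition = position;
    }

    int read()
            throws IOException
    {
        seekStream();
        int value = in.read();
        if (value >= 0) {
            streamPosition++;
            nextReadPosition++;
        }
        return value;
    }

    private void seekStream()
            throws IOException
    {
        if (in != null && nextReadPosition == streamPosition) {
            return;
        }
        if (in != null && nextReadPosition > streamPosition) {
            try {
                // Forward seek: try to skip within the open stream.
                streamPosition += in.skip(nextReadPosition - streamPosition);
                if (streamPosition == nextReadPosition) {
                    return;
                }
            }
            catch (InterruptedIOException e) {
                throw e; // interruption is propagated, not retried
            }
            catch (IOException e) {
                // Transient failure (for example a connection reset): reopen below.
            }
        }
        closeQuietly();
        in = opener.open(nextReadPosition);
        streamPosition = nextReadPosition;
    }

    private void closeQuietly()
    {
        if (in != null) {
            try {
                in.close();
            }
            catch (IOException ignored) {
                // The stream is being discarded, so a close failure is not reported.
            }
            in = null;
        }
    }
}
```

In this sketch a backward seek also takes the reopen path, so the recovery case reuses the same code that a normal reposition would.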

Member

@findepi findepi left a comment

Fix PR title

@electrum electrum changed the title from "User/electrum/s3input" to "Recover S3 stream after skip failure" May 11, 2026
@electrum electrum force-pushed the user/electrum/s3input branch from 92cb049 to 396dc34 May 11, 2026 08:52
@electrum
Member Author

I addressed the Copilot comment to handle InterruptedIOException.

@electrum
Member Author

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 11, 2026

✅ Actions performed

Full review triggered.

@coderabbitai

coderabbitai Bot commented May 11, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4747aa84-c9ab-4b65-9a0d-fee1d2c0b552

📥 Commits

Reviewing files that changed from the base of the PR and between 73a7d29 and 396dc34.

📒 Files selected for processing (1)
  • lib/trino-filesystem-s3/src/main/java/io/trino/filesystem/s3/S3InputStream.java

📝 Walkthrough

This PR refactors S3InputStream's forward-seek behavior to inline skip logic with explicit exception handling. The previous doSkip helper method is removed, and its AbortedException-to-InterruptedIOException mapping is moved into the seek method. Additionally, an AWS SDK workaround that replaced zero-length streams with null-input streams during range request opening is removed. The net effect is clearer control flow for skip operations and reduced defensive coding for an edge case.
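
The exception mapping mentioned in the walkthrough can be illustrated as follows. This is a sketch under stated assumptions, not the code in the PR: `SkipMapping`, `skipMappingAbort`, and `AbortedLikeException` are hypothetical names, and `AbortedLikeException` stands in for the AWS SDK's unchecked `AbortedException` so that the example compiles without the SDK on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

// Hypothetical sketch of mapping an SDK abort to InterruptedIOException.
final class SkipMapping
{
    private SkipMapping() {}

    // Hypothetical stand-in for the AWS SDK's unchecked AbortedException.
    static final class AbortedLikeException
            extends RuntimeException
    {
        AbortedLikeException(String message)
        {
            super(message);
        }
    }

    // Skips up to n bytes, translating an abort into a checked
    // InterruptedIOException so callers can tell it apart from a
    // transient failure that is worth retrying.
    static long skipMappingAbort(InputStream in, long n)
            throws IOException
    {
        try {
            return in.skip(n);
        }
        catch (AbortedLikeException e) {
            InterruptedIOException interrupted = new InterruptedIOException("skip aborted");
            interrupted.initCause(e);
            throw interrupted;
        }
    }
}
```

Keeping the mapped exception distinct from other `IOException` types lets a caller that retries on `IOException` rethrow interruption first instead of reopening the stream.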

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.

@electrum electrum enabled auto-merge May 11, 2026 09:02
electrum added 2 commits May 12, 2026 07:56
S3 cloud tests saw connection resets while skipping within an open
response stream. Treat that path like read failures so the next read
can reopen at the requested position.

Older AWS SDK versions returned a checksum byte instead of EOF for
empty S3 objects. The current SDK returns EOF directly, so the local
replacement stream is no longer needed.
@electrum electrum force-pushed the user/electrum/s3input branch from 396dc34 to b9e1530 May 12, 2026 14:56
@electrum electrum added this pull request to the merge queue May 12, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 12, 2026

4 participants