Skip to content

Fix KafkaAppender reporting error after successful retry#4125

Open
SebTardif wants to merge 3 commits into
apache:2.xfrom
SebTardif:fix/kafka-appender-retry-logic
Open

Fix KafkaAppender reporting error after successful retry#4125
SebTardif wants to merge 3 commits into
apache:2.xfrom
SebTardif:fix/kafka-appender-retry-logic

Conversation

@SebTardif
Copy link
Copy Markdown

Fix KafkaAppender retry logic that always reports an error to the error handler even when a retry succeeds.

When retryCount is configured and the initial tryAppend() fails, the retry loop uses break to exit on success. However, break only exits the while loop; execution always reaches the error() call afterward. This causes spurious error notifications for transient Kafka failures that were successfully recovered by a retry.

This change replaces break with return so that a successful retry exits append() without reporting an error. Retry exceptions are now logged at DEBUG level for diagnostics instead of being silently discarded.

Also removes dead code in Builder.getRetryCount() where Integer.valueOf(int) was wrapped in a NumberFormatException catch that can never fire.

The bug was introduced in #315.

Checklist

  • Base your changes on 2.x branch if you are targeting Log4j 2; use main otherwise
  • ./mvnw verify succeeds (the build instructions)
  • Tests are provided

Copy link
Copy Markdown
Contributor

@ramanathan1504 ramanathan1504 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall — great fix for the break/return retry behavior, and thanks for adding DEBUG logging plus cleaning up getRetryCount().

One small suggestion commented to keep the diff minimal: retain the existing while loop shape and just apply the early return + DEBUG logging inside it.

SebTardif added 2 commits May 18, 2026 18:04
When retryCount is configured and the initial tryAppend() fails, the
retry loop uses break to exit on success. However, break only exits the
while loop and execution always reaches the error() call afterward. This
causes spurious error notifications for transient Kafka failures that
were successfully recovered by a retry.

Replace break with return so that a successful retry exits append()
without reporting an error. Retry exceptions are now logged at DEBUG
level for diagnostics instead of being silently discarded.

Also remove dead code in Builder.getRetryCount() where Integer.valueOf(int)
was wrapped in a NumberFormatException catch that can never fire.

The bug was introduced in apache#315.

Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
@SebTardif SebTardif force-pushed the fix/kafka-appender-retry-logic branch from 42bc3fe to 2871852 Compare May 19, 2026 01:04
@ramanathan1504
Copy link
Copy Markdown
Contributor

Hi @SebTardif ,

Thank you so much for your patience and for contributing to this! We really appreciate the time and effort you've put into getting this KafkaAppender issue resolved. The code changes look good and the review is cleared!

Before we can finally merge this, there are just a few quick administrative things we need to address:

  • Avoid Force-Pushing during reviews: I noticed that the recent updates were force-pushed. For future PRs, please just push new commits normally rather than force-pushing. Force-pushing overwrites the Git history, which makes it very difficult for reviewers to track the differences between the old and new changes.
  • Commit Signatures: It looks like your commits are currently signed with a self-signed key. For security and verification purposes, we require commits to be signed off using a valid GPG key. Could you please set up a GPG key, re-sign your commits, and update the PR?
  • Add a Changelog Entry: Please add an entry to our changelog file for this fix. You can use this exact wording:

    "Fixed an issue where KafkaAppender incorrectly reported an error to the error handler despite a successful retry."

Next Steps:
Since you need to re-sign your previous commits and add the changelog, you will actually need to force-push one last time to update this PR. Once you've added the changelog and signed with a GPG key, go ahead and push it up, and we'll get this merged!

Thanks again for your great work on this!

@vy
Copy link
Copy Markdown
Member

vy commented May 19, 2026

Great guidance @ramanathan1504! 💯

  • Commit Signatures: It looks like your commits are currently signed with a self-signed key. For security and verification purposes, we require commits to be signed off using a valid GPG key.

@ramanathan1504, @SebTardif, we've dropped the requirement for commit signatures — see #3989. You can skip that chore.

Adopt reviewer suggestion to use try-with-resources CloseableThreadContext
instead of manual ThreadContext.put/clearMap.
@SebTardif
Copy link
Copy Markdown
Author

Thanks @ramanathan1504 for the review guidance and @vy for clarifying on signatures.

Adopted CloseableThreadContext as suggested. Pushed as a new commit this time (no force-push).

@ramanathan1504 ramanathan1504 enabled auto-merge (squash) May 19, 2026 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

3 participants