Log EMR cluster StateChangeReason on failure #98

Merged
Oguzhan Unlu (oguzhanunlu) merged 2 commits into develop from log-reason on May 22, 2026

Conversation

@oguzhanunlu Oguzhan Unlu (oguzhanunlu) commented May 21, 2026

Member

This PR surfaces the code and message of EMR cluster failures in stdout, so Support no longer has to check customers' accounts manually.

ref: https://snplow.atlassian.net/browse/PDP-2648

Surface Cluster.Status.StateChangeReason.Code and .Message in stdout and
in the returned error whenever a cluster fails to reach WAITING. Previously
the runner only checked the Code internally (to detect bootstrap-failure
retries) and never surfaced the underlying reason, leaving Support blind
to failures like 'On the master instance, application provisioning failed'
or quota errors where no step logs are produced.
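
The behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the runner's actual code: `stateChangeReason`, `clusterStatus`, `formatReason` and `checkClusterReady` are hypothetical local stand-ins that only mirror the shape of `Cluster.Status.StateChangeReason`, and the log and error wording is assumed.

```go
package main

import "fmt"

// stateChangeReason is a hypothetical stand-in mirroring the Code and
// Message fields of Cluster.Status.StateChangeReason. It is not the SDK type.
type stateChangeReason struct {
	Code    string
	Message string
}

// clusterStatus is a hypothetical, trimmed-down view of a cluster's status.
type clusterStatus struct {
	State  string             // e.g. "WAITING" or "TERMINATED_WITH_ERRORS"
	Reason *stateChangeReason // may be nil if no reason was reported
}

// formatReason renders the reason as one log-friendly string.
func formatReason(r *stateChangeReason) string {
	if r == nil {
		return "no StateChangeReason reported"
	}
	return fmt.Sprintf("code=%s message=%q", r.Code, r.Message)
}

// checkClusterReady returns nil when the cluster is WAITING. Otherwise it
// prints the reason to stdout and includes the same reason in the error.
func checkClusterReady(clusterID string, st clusterStatus) error {
	if st.State == "WAITING" {
		return nil
	}
	reason := formatReason(st.Reason)
	fmt.Printf("EMR cluster %s failed to reach WAITING (state %s): %s\n",
		clusterID, st.State, reason)
	return fmt.Errorf("cluster %s in state %s: %s", clusterID, st.State, reason)
}
```

Calling `checkClusterReady` with a non-WAITING status prints one line to stdout and returns an error carrying the same code and message, so the reason is visible even when no step logs exist. In the real runner the status would presumably come from a DescribeCluster response; that wiring is omitted here.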

Snyk Docker scan flagged two High-severity Uncaught Exception advisories
in transitive AWS SDK Go v2 deps. Bumped:
- service/s3 v1.71.1 -> v1.97.3 (SNYK-GOLANG-...-S3-16316411)
- aws/protocol/eventstream v1.6.7 -> v1.7.8 (SNYK-GOLANG-...-EVENTSTREAM-16316402)

go mod tidy pulled along compatible bumps for aws-sdk-go-v2 core, smithy-go,
and a handful of internal/* packages.

@spenes Enes Aldemir (spenes) left a comment

Contributor

LGTM 👍

@oguzhanunlu Oguzhan Unlu (oguzhanunlu) merged commit c226e7c into develop May 22, 2026
8 checks passed
Oguzhan Unlu (oguzhanunlu) added a commit that referenced this pull request May 22, 2026
* Log EMR cluster StateChangeReason on failure

* Bump aws-sdk-go-v2/service/s3 and eventstream for CVE fixes