feat: support ExternalId in AWS AssumeRole for cross-account isolation #7665

Closed
shivkumr wants to merge 8 commits into kedacore:main from shivkumr:feature/aws-external-id

Conversation

@shivkumr shivkumr commented Apr 19, 2026

Add awsExternalId support to AWS TriggerAuthentication, enabling sts:AssumeRole with ExternalId for cross-account confused deputy protection.

Problem

In a shared Kubernetes cluster managed by a platform team, multiple application teams from different AWS accounts deploy their workloads. Each team uses KEDA to autoscale based on AWS resources (e.g., SQS queues) in their own accounts. The KEDA operator runs as a single cluster-wide service with one IAM identity (via IRSA or Pod Identity).

To access cross-account resources, KEDA assumes IAM roles in each team's account. Without ExternalId support, however, there is no way to enforce confused deputy protection: any tenant who discovers another tenant's role ARN could potentially leverage KEDA's shared identity to access that tenant's resources. IAM trust policies that require sts:ExternalId as a condition are the standard mitigation, but KEDA currently has no mechanism to pass an ExternalId during AssumeRole.

What this enables

Each tenant's IAM role can now require a unique ExternalId in its trust policy. The ExternalId is stored in a namespace-scoped Kubernetes Secret (protected by RBAC) and passed to KEDA via TriggerAuthentication. KEDA includes it in the AssumeRole call, and IAM rejects any attempt without the correct ExternalId — providing tenant isolation at the IAM layer.

Key changes

  1. GetAwsAuthorization(): read awsExternalId from authParams in all three identity paths — including the identityOwner: operator path, which previously only set PodIdentityOwner = false without reading any auth parameters
  2. GetAwsConfig(): allow the operator identity path to perform AssumeRole when a role ARN is provided, by refining the early return condition from if !PodIdentityOwner to if !PodIdentityOwner && AwsRoleArn == ""
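
The second change can be illustrated with a minimal sketch. It assumes a trimmed-down AuthorizationMetadata with only the two fields named above; the helper names earlyReturnBefore and earlyReturnAfter are hypothetical and exist only to compare the two guard conditions, they are not functions from the PR.

```go
package main

import "fmt"

// AuthorizationMetadata is a trimmed stand-in for the struct in
// pkg/scalers/aws; only the two fields this sketch needs are included.
type AuthorizationMetadata struct {
	PodIdentityOwner bool
	AwsRoleArn       string
}

// earlyReturnBefore models the old guard (hypothetical helper): any
// non-pod-owner path returned the default config immediately.
func earlyReturnBefore(m AuthorizationMetadata) bool {
	return !m.PodIdentityOwner
}

// earlyReturnAfter models the refined guard (hypothetical helper): the
// default config is returned early only when there is no role ARN to assume.
func earlyReturnAfter(m AuthorizationMetadata) bool {
	return !m.PodIdentityOwner && m.AwsRoleArn == ""
}

func main() {
	// The account ID and role name are placeholders.
	operatorWithRole := AuthorizationMetadata{AwsRoleArn: "arn:aws:iam::111122223333:role/tenant-a"}
	operatorOnly := AuthorizationMetadata{}

	fmt.Println(earlyReturnBefore(operatorWithRole), earlyReturnAfter(operatorWithRole)) // true false
	fmt.Println(earlyReturnBefore(operatorOnly), earlyReturnAfter(operatorOnly))         // true true
}
```

With the refined guard, only the operator path without a role ARN still takes the early return; the operator path with a role ARN falls through to the AssumeRole logic.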

Changes

| File | Change |
| --- | --- |
| pkg/scalers/aws/aws_authorization.go | Added AwsExternalId field to AuthorizationMetadata struct |
| pkg/scalers/aws/aws_common.go | Read awsExternalId from authParams in all 3 identity paths (pod identity, operator, pod); refined early return in GetAwsConfig to support operator + role ARN |
| pkg/scalers/aws/aws_config_cache.go | Include ExternalId in cache key; pass it to AssumeRoleOptions in retrievePodIdentityCredentials |
| pkg/scalers/aws/aws_common_test.go | 7 new unit tests covering all identity paths + cache key isolation |
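
To show why the cache key needs the ExternalId, here is a rough sketch of a hashed key. The authKey type, the cacheKey function, the choice of SHA-256, and the NUL separator are all assumptions made for illustration; the actual key layout in aws_config_cache.go may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// authKey holds the inputs that distinguish one cached config from another.
// (Hypothetical type; the real cache hashes fields of AuthorizationMetadata.)
type authKey struct {
	RoleArn    string
	ExternalID string
	Region     string
}

// cacheKey hashes all distinguishing fields. A NUL separator keeps
// ("ab", "c") and ("a", "bc") from producing the same input string.
func cacheKey(k authKey) string {
	joined := strings.Join([]string{k.RoleArn, k.ExternalID, k.Region}, "\x00")
	sum := sha256.Sum256([]byte(joined))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Placeholder ARN and ExternalId values.
	a := authKey{RoleArn: "arn:aws:iam::111122223333:role/shared", ExternalID: "tenant-a", Region: "eu-west-1"}
	b := a
	b.ExternalID = "tenant-b"

	// Same role ARN, different ExternalId: the keys must not collide,
	// otherwise tenant B could be served tenant A's cached credentials.
	fmt.Println(cacheKey(a) == cacheKey(b)) // false
}
```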

Usage

Users pass awsExternalId via TriggerAuthentication.secretTargetRef, same pattern as awsRoleArn:

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
spec:
  secretTargetRef:
    - parameter: awsRoleArn
      name: keda-aws-creds
      key: AWS_ROLE_ARN
    - parameter: awsExternalId
      name: keda-aws-creds
      key: AWS_EXTERNAL_ID

Backward-compatible: if no awsExternalId is provided, behavior is identical to before.
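
One way to get this backward-compatible behavior is to treat the ExternalId as an optional pointer that is only set when a value was supplied. The sketch below assumes a local assumeRoleInput type and a hypothetical buildAssumeRoleInput helper, loosely modeled on the optional ExternalID field of the SDK's AssumeRoleOptions; it does not call the AWS SDK.

```go
package main

import "fmt"

// assumeRoleInput is a local stand-in for the options passed to AssumeRole.
// (Hypothetical type for illustration only.)
type assumeRoleInput struct {
	RoleArn    string
	ExternalID *string // nil means the parameter is omitted from the call
}

// buildAssumeRoleInput sets ExternalID only when one was provided, so callers
// that never configure awsExternalId produce the same request as before.
func buildAssumeRoleInput(roleArn, externalID string) assumeRoleInput {
	in := assumeRoleInput{RoleArn: roleArn}
	if externalID != "" {
		in.ExternalID = &externalID
	}
	return in
}

func main() {
	// Placeholder ARN and ExternalId values.
	without := buildAssumeRoleInput("arn:aws:iam::111122223333:role/tenant-a", "")
	with := buildAssumeRoleInput("arn:aws:iam::111122223333:role/tenant-a", "tenant-a-id")

	fmt.Println(without.ExternalID == nil) // true
	fmt.Println(*with.ExternalID)          // tenant-a-id
}
```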

Checklist

  • I have verified that my change is according to the deprecations & breaking changes policy
  • Tests have been added (if applicable)
  • Ensure make generate-scalers-schema has been run to update any outdated generated files
  • Changelog has been updated and is aligned with our changelog requirements, only when the change impacts end users
  • A PR is opened to update the documentation on (repo) (if applicable)
  • Commits are signed with Developer Certificate of Origin (DCO - learn more)

Fixes #7662
Fixes #6921
Docs: kedacore/keda-docs#1754

@shivkumr shivkumr requested a review from a team as a code owner April 19, 2026 02:29
@github-actions

Thank you for your contribution! 🙏

Please understand that we will do our best to review your PR and give you feedback as soon as possible, but please bear with us if it takes a little longer than expected.

While you are waiting, make sure to:

  • Add an entry in our changelog in alphabetical order and link related issue
  • Update the documentation, if needed
  • Add unit & e2e tests for your changes
  • GitHub checks are passing
  • Is the DCO check failing? Here is how you can fix DCO issues

Once the initial tests are successful, a KEDA member will ensure that the e2e tests are run. Once the e2e tests have been successfully completed, the PR may be merged at a later date. Please be patient.

Learn more about our contribution guide.

@keda-automation keda-automation requested a review from a team April 19, 2026 02:30
@snyk-io

snyk-io Bot commented Apr 19, 2026

Snyk checks have passed. No issues have been found so far.

| Status | Scan Engine | Critical | High | Medium | Low | Total (0) |
| --- | --- | --- | --- | --- | --- | --- |
|  | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |


@shivkumr
Author

@JorTurFer @rickbrouwer @tangobango5 This adds ExternalId support to AWS AssumeRole for cross-account confused deputy protection. Fixes #7662 and #6921.

I noticed the existing PR #6916 by @tangobango5 for the same feature. A few differences in this PR:

  • ExternalId is passed via secretTargetRef in TriggerAuthentication (no CRD schema changes needed), which aligns with @JorTurFer's feedback to keep it in the shared auth code alongside awsRoleArn
  • ExternalId is included in the cache key (addresses @JorTurFer's last review comment on feat(aws-sqs): Add external ID support for cross-account access #6916)
  • Covers all three identity paths: identityOwner: operator, identityOwner: pod, and PodIdentityProviderAws
  • Fixes the GetAwsConfig early return that prevented the operator path from performing AssumeRole when a role ARN was provided
  • 4 files changed in this PR, no CRD modifications

Happy to collaborate with @tangobango5 or defer to maintainers on which approach to move forward with.

@shivkumr shivkumr force-pushed the feature/aws-external-id branch from c7fdca5 to 7e2a550 Compare April 19, 2026 03:29
@rickbrouwer rickbrouwer added the related-pr This is a PR that is related to another PR. The potential merging may affect the related PR. label Apr 20, 2026
@rickbrouwer
Member

Related with #6916 and #7580

Add ExternalId support to AWS STS AssumeRole calls, enabling secure
cross-account access with confused deputy protection.

Changes:
- Add AwsExternalId field to AuthorizationMetadata struct
- Read awsExternalId from authParams in all identity paths:
  PodIdentityProviderAws, identityOwner=operator, identityOwner=pod
- Pass ExternalId to AssumeRoleOptions in GetAwsConfig and
  retrievePodIdentityCredentials
- Include ExternalId in config cache key to prevent collisions
- Fix early return in GetAwsConfig that skipped AssumeRole for
  operator identity when a role ARN was provided

Fixes kedacore#7662

Signed-off-by: shivkumr <shivkumr@github.com>
Signed-off-by: shivkumr <shivkumr@amazon.com>
…g sort order

Signed-off-by: shivkumr <shivkumr@amazon.com>
Comment thread pkg/scalers/aws/aws_common_test.go Outdated
@rickbrouwer
Member

rickbrouwer commented Apr 29, 2026

/run-e2e aws_identity_external_id_test
Update: You can check the progress here

Signed-off-by: shivkumr <shivkumr@amazon.com>
@keda-automation keda-automation requested a review from a team April 29, 2026 13:55
- Add max 30 iterations to cleanupMessages to prevent infinite loop
- Set VisibilityTimeout=1 so deleted messages don't reappear
- Remove DelaySeconds from queue creation and message sending so
  messages are immediately visible to KEDA

Signed-off-by: shivkumr <shivkumr@amazon.com>
1 second was too short — delete call could race against the timeout.
10 seconds gives enough time for the delete to complete while still
being short enough to not block the cleanup loop.

Signed-off-by: shivkumr <shivkumr@amazon.com>
@shivkumr
Author

shivkumr commented Apr 29, 2026

@rickbrouwer The e2e test was hanging due to three issues in the test code (not the feature code):

  1. Infinite loop in cleanupMessages — SQS visibility timeout caused deleted messages to reappear, looping forever. Fixed with a 30-iteration cap.
  2. Queue DelaySeconds: 60 — messages were invisible for 60 seconds after sending, so KEDA saw an empty queue and never scaled. Fixed to DelaySeconds: 0.
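  3. VisibilityTimeout on receive: the initial value of 1 second was too short, so the delete call could race against the timeout. Raised to 10 seconds, which gives deletes enough time to complete without blocking the cleanup loop. (0287151)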
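
The bounded cleanup loop from the first fix can be sketched roughly as follows. The queue interface and fakeQueue type are hypothetical stand-ins for the SQS client used in the e2e test; only the 30-iteration cap and the 10-second visibility timeout come from the commits above.

```go
package main

import "fmt"

// queue is a hypothetical minimal interface standing in for the SQS client.
type queue interface {
	Receive(visibilityTimeoutSeconds int) []string // returns receipt handles
	Delete(handle string)
}

const (
	maxCleanupIterations     = 30 // cap so the loop cannot run forever
	visibilityTimeoutSeconds = 10 // long enough for deletes to complete
)

// cleanupMessages drains the queue, giving up after maxCleanupIterations.
// It returns how many receive/delete rounds ran and whether the queue emptied.
func cleanupMessages(q queue) (iterations int, drained bool) {
	for i := 0; i < maxCleanupIterations; i++ {
		handles := q.Receive(visibilityTimeoutSeconds)
		if len(handles) == 0 {
			return i, true
		}
		for _, h := range handles {
			q.Delete(h)
		}
	}
	return maxCleanupIterations, false
}

// fakeQueue is an in-memory queue; with reappear set, deletes have no effect,
// which simulates messages that keep coming back.
type fakeQueue struct {
	messages []string
	reappear bool
}

func (f *fakeQueue) Receive(int) []string {
	return append([]string(nil), f.messages...) // copy, so Delete can mutate
}

func (f *fakeQueue) Delete(handle string) {
	if f.reappear {
		return
	}
	for i, m := range f.messages {
		if m == handle {
			f.messages = append(f.messages[:i], f.messages[i+1:]...)
			return
		}
	}
}

func main() {
	fmt.Println(cleanupMessages(&fakeQueue{messages: []string{"a", "b", "c"}}))    // 1 true
	fmt.Println(cleanupMessages(&fakeQueue{messages: []string{"a"}, reappear: true})) // 30 false
}
```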

Could you cancel the current run and re-trigger? Thanks!

@rickbrouwer
Member

rickbrouwer commented Apr 29, 2026

/run-e2e aws_identity_external_id_test
Update: You can check the progress here

@rickbrouwer
Member

Looks clean overall, nice work!

A few questions:

Could the reading of awsExternalId be made consistent across the three identity paths? In case "", "pod": it's nested inside the awsRoleArn != "" switch case, while in the operator and PodIdentityProviderAws paths it's read independently.

Further, would it make sense to add a validation/warning when awsExternalId is set without awsRoleArn? Right now it's silently ignored.

Should the test function names also be renamed from ExternalId to ExternalID, to match the struct field renames you did earlier?

About the unit test, could you add a unit test specifically targeting the GetAwsConfig early-return change (operator + role ARN path)? That's now only covered indirectly via e2e.

During the review I think I also spotted what looks like a pre-existing bug in RemoveCachedEntry.
a.items[awsAuthorization.AwsRoleArn] = cachedEntry uses AwsRoleArn instead of the hashed key. This was not introduced by your PR, but could you check it for me? If I am correct, would you mind including a small fix here (or would you prefer a separate PR to keep the scope clean)?
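
The suspected issue can be reproduced in a simplified model of the cache. Everything below (the auth, cacheEntry, and configCache types, the SHA-256 key, and the add helper) is an assumption made for illustration and omits locking; only the idea of writing back under the hashed key rather than the raw AwsRoleArn comes from the discussion.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// auth is a trimmed stand-in for AuthorizationMetadata.
type auth struct {
	AwsRoleArn       string
	AwsRegion        string
	TriggerUniqueKey string
}

// cacheEntry tracks which triggers still use a cached config.
type cacheEntry struct {
	usages map[string]bool
}

type configCache struct {
	items map[string]cacheEntry
}

// cacheKey is a stand-in hash; the real key derivation may differ.
func cacheKey(a auth) string {
	sum := sha256.Sum256([]byte(a.AwsRoleArn + "\x00" + a.AwsRegion))
	return hex.EncodeToString(sum[:])
}

func (c *configCache) add(a auth) {
	key := cacheKey(a)
	entry, ok := c.items[key]
	if !ok {
		entry = cacheEntry{usages: map[string]bool{}}
	}
	entry.usages[a.TriggerUniqueKey] = true
	c.items[key] = entry
}

func (c *configCache) removeCachedEntry(a auth) {
	key := cacheKey(a)
	entry, ok := c.items[key]
	if !ok {
		return
	}
	delete(entry.usages, a.TriggerUniqueKey)
	if len(entry.usages) == 0 {
		delete(c.items, key)
		return
	}
	// Writing to c.items[a.AwsRoleArn] here would leave a stray entry under
	// the raw ARN that no lookup by hashed key ever finds or removes.
	c.items[key] = entry
}

func main() {
	c := &configCache{items: map[string]cacheEntry{}}
	role := "arn:aws:iam::111122223333:role/tenant-a" // placeholder ARN
	c.add(auth{AwsRoleArn: role, AwsRegion: "eu-west-1", TriggerUniqueKey: "t1"})
	c.add(auth{AwsRoleArn: role, AwsRegion: "eu-west-1", TriggerUniqueKey: "t2"})

	c.removeCachedEntry(auth{AwsRoleArn: role, AwsRegion: "eu-west-1", TriggerUniqueKey: "t1"})
	fmt.Println(len(c.items)) // 1

	c.removeCachedEntry(auth{AwsRoleArn: role, AwsRegion: "eu-west-1", TriggerUniqueKey: "t2"})
	fmt.Println(len(c.items)) // 0
}
```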

And lastly, do you plan to open a corresponding PR on https://github.com/kedacore/keda-docs to document the new awsExternalId parameter?

Comment thread pkg/scalers/aws/aws_config_cache.go
- Nest awsExternalId under awsRoleArn check in all identity paths for consistency
- Add warning log when awsExternalId is set without awsRoleArn
- Rename test functions ExternalId -> ExternalID per Go conventions
- Add unit tests for GetAwsConfig operator + role ARN path
- Fix pre-existing bug in RemoveCachedEntry: use hashed cache key instead
  of raw AwsRoleArn when updating entry with remaining usages

Signed-off-by: shivkumr <shivkumr@amazon.com>
@keda-automation keda-automation requested a review from a team April 29, 2026 19:41
@shivkumr
Author

shivkumr commented Apr 29, 2026

@rickbrouwer Thanks for the thorough review! Addressed all points in 0e486f2:

  1. Consistency - awsExternalId now nested under awsRoleArn check in all three identity paths
  2. Warning - added log warning when awsExternalId is set without awsRoleArn
  3. Test naming - renamed ExternalId -> ExternalID in all test function names
  4. GetAwsConfig unit test - added TestGetAwsConfig_OperatorWithRoleArn and TestGetAwsConfig_OperatorWithoutRoleArn
  5. RemoveCachedEntry bug - happy to fix it in this PR: changed a.items[awsAuthorization.AwsRoleArn] -> a.items[key]
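
Points 1 and 2 together can be sketched as a single parsing step. The readRoleParams helper and the injected warn callback are hypothetical; the parameter names awsRoleArn and awsExternalId come from the PR, and the exact warning text in the real code may differ.

```go
package main

import "fmt"

// readRoleParams reads awsRoleArn and, nested under it, awsExternalId.
// If awsExternalId is set without awsRoleArn it is ignored, and warn is
// called so the misconfiguration is no longer silent.
// (Hypothetical helper; warn stands in for the scaler's logger.)
func readRoleParams(authParams map[string]string, warn func(msg string)) (roleArn, externalID string) {
	roleArn = authParams["awsRoleArn"]
	ext := authParams["awsExternalId"]

	if roleArn != "" {
		return roleArn, ext
	}
	if ext != "" {
		warn("awsExternalId is set without awsRoleArn and will be ignored")
	}
	return "", ""
}

func main() {
	warn := func(msg string) { fmt.Println("WARN:", msg) }

	// Placeholder values for illustration.
	arn, ext := readRoleParams(map[string]string{
		"awsRoleArn":    "arn:aws:iam::111122223333:role/tenant-a",
		"awsExternalId": "tenant-a-id",
	}, warn)
	fmt.Println(arn, ext)

	// ExternalId alone triggers the warning and yields empty values.
	arn, ext = readRoleParams(map[string]string{"awsExternalId": "tenant-a-id"}, warn)
	fmt.Println(arn == "", ext == "")
}
```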

For docs update PR: kedacore/keda-docs#1754

…path

AssumeRoleWithWebIdentity does not accept ExternalId — the OIDC token
authenticates the role assumption. ExternalID only applies to the
AssumeRole fallback.

Signed-off-by: shivkumr <shivkumr@amazon.com>
@shivkumr shivkumr force-pushed the feature/aws-external-id branch from 4a4360d to 0418952 Compare April 29, 2026 21:22
@rickbrouwer
Member

Great! Looks good!
One last point: since you included the fix (thank you for that), it would be good to mention it in the changelog. Could you add an extra entry under 'Fixes' that references this pull request:

- **AWS Scalers**: Fix incorrect cache key used in `RemoveCachedEntry` when an entry still has remaining usages, causing stale cache entries ([#7665](https://github.com/kedacore/keda/pull/7665))

@rickbrouwer
Member

rickbrouwer commented Apr 30, 2026

/run-e2e aws_identity_external_id_test
Update: You can check the progress here

Signed-off-by: shivkumr <shivkumr@amazon.com>
@rickbrouwer
Member

rickbrouwer commented Apr 30, 2026

/run-e2e aws*
Update: You can check the progress here

@rickbrouwer rickbrouwer added the Awaiting/2nd-approval This PR needs one more approval review label Apr 30, 2026
@rickbrouwer
Member

Before this undergoes a second approval, the related PRs must be carefully reviewed. All three will be assessed to determine what can and cannot proceed (for example, if there is a conflict or if they have the same goal).

@JorTurFer
Member

This will be added via #6916
Thanks for your effort. I'd like to apologize that we didn't review this earlier and that there are multiple PRs doing the same thing.

@JorTurFer JorTurFer closed this May 3, 2026
JorTurFer added a commit to kedacore/keda-docs that referenced this pull request May 3, 2026
…1754)

* docs: document awsExternalId parameter for cross-account AssumeRole

Add documentation for the new awsExternalId parameter in TriggerAuthentication,
which enables confused deputy protection for multi-tenant environments where
a shared KEDA operator assumes roles in different AWS accounts.

Relates to kedacore/keda#7665

Signed-off-by: shivkumr <shivkumr@amazon.com>

* Apply suggestions from code review

Co-authored-by: Jorge Turrado Ferrero <Jorge_turrado@hotmail.es>
Signed-off-by: Jorge Turrado Ferrero <Jorge_turrado@hotmail.es>

* Apply suggestion from @JorTurFer

Signed-off-by: Jorge Turrado Ferrero <Jorge_turrado@hotmail.es>

---------

Signed-off-by: shivkumr <shivkumr@amazon.com>
Signed-off-by: Jorge Turrado Ferrero <Jorge_turrado@hotmail.es>
Co-authored-by: Jorge Turrado Ferrero <Jorge_turrado@hotmail.es>
@rickbrouwer rickbrouwer removed the Awaiting/2nd-approval This PR needs one more approval review label May 6, 2026

Labels

related-pr This is a PR that is related to another PR. The potential merging may affect the related PR.

Development

Successfully merging this pull request may close these issues.

Support ExternalId in AWS AssumeRole for cross-account tenant isolation
Support for external ID in AWS SQS Scaler

3 participants