fix: report specific missing security group IDs in EC2NodeClass status #9036

Open
r-aju wants to merge 1 commit into aws:main from r-aju:fix/security-group-missing-id-reporting

r-aju commented Mar 26, 2026

When securityGroupSelectorTerms specifies security group IDs that do not exist in AWS, the controller previously set a generic 'SecurityGroupSelector did not match any SecurityGroups' condition.

This change compares the requested IDs against the IDs returned by the EC2 DescribeSecurityGroups API and explicitly reports which IDs were not found, e.g.:
'Security groups do not exist: [sg-abc123, TBD-testing]'

This makes it significantly easier to diagnose misconfigured or non-existent security group IDs in an EC2NodeClass.
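
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `missingIDs` and `missingMessage` are hypothetical helper names, and the `found` slice stands in for the group IDs that the EC2 DescribeSecurityGroups call returned.

```go
package main

import (
	"fmt"
	"strings"
)

// missingIDs returns the requested IDs that do not appear in found.
// Request order is preserved and repeated IDs are reported once.
// (Hypothetical helper, not taken from the Karpenter code base.)
func missingIDs(requested, found []string) []string {
	present := make(map[string]struct{}, len(found))
	for _, id := range found {
		present[id] = struct{}{}
	}
	seen := make(map[string]struct{})
	var missing []string
	for _, id := range requested {
		if _, ok := present[id]; ok {
			continue // the API returned this ID, so it exists
		}
		if _, dup := seen[id]; dup {
			continue // already reported as missing
		}
		seen[id] = struct{}{}
		missing = append(missing, id)
	}
	return missing
}

// missingMessage formats the status message, or returns "" when
// nothing is missing. (Hypothetical helper.)
func missingMessage(missing []string) string {
	if len(missing) == 0 {
		return ""
	}
	return fmt.Sprintf("Security groups do not exist: [%s]", strings.Join(missing, ", "))
}

func main() {
	// IDs taken from securityGroupSelectorTerms (illustrative values).
	requested := []string{"sg-abc123", "sg-def456", "TBD-testing"}
	// IDs assumed to have been returned by DescribeSecurityGroups.
	found := []string{"sg-def456"}

	fmt.Println(missingMessage(missingIDs(requested, found)))
}
```

Keeping the IDs in request order means the message lines up with the order of the selector terms in the EC2NodeClass spec.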

Fixes #9035

Description
Observed Behavior: An EC2NodeClass can reference security groups that don't exist. Karpenter creates the EC2NodeClass without validating that each referenced security group exists.

Expected Behavior: When an EC2NodeClass is created, Karpenter should validate that all specified security groups actually exist before proceeding. Ideally, it should accept only security group IDs and reject anything else.

Reproduction Steps (Please include YAML):
The EC2NodeClass below is created successfully even though one of its security groups does not exist.

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: worker
spec:
  amiFamily: Bottlerocket
  amiSelectorTerms:
  - id: ami-xxxxxx
  - id: ami-xxxxxx
  associatePublicIPAddress: false
  blockDeviceMappings:
  - deviceName: /dev/xvda
    ebs:
      encrypted: true
      kmsKeyID: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      volumeSize: 4Gi
      volumeType: gp3
  detailedMonitoring: true
  kubelet:
    clusterDNS:
    - xx.xxx.x.xx
    maxPods: 110
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required
  role: karpenter-node
  securityGroupSelectorTerms:
  - id: sg-xxxxxxxxxxxxxx
  - id: TBD-testing  # This SG doesn't exist
  - id: sg-xxxxxxxxxxxxxxxxx
  subnetSelectorTerms:
  - id: subnet-xxxxxxxxxxxxxxxxx
  - id: subnet-xxxxxxxxxxxxxxxxx
  - id: subnet-xxxxxxxxxxxxxxxxx
  tags:
    ENVIRONMENT: prod
    POD: infra
    SERVICE: eks
  userData: |
    [settings]
    [settings.host-containers.control]
    [settings.kernel.sysctl]
    'net.core.netdev_max_backlog' = '30000'

How was this change tested?

Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: #
  • No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
