
Karpenter consolidation replaces the node with exact same node (EC2 instance) type #4826

Closed
badrish-s opened this issue Oct 13, 2023 · 7 comments
Labels
lifecycle/closed, lifecycle/stale, question (Issues that are support related questions)

Comments

@badrish-s
Contributor

badrish-s commented Oct 13, 2023

Description

Observed Behavior:

Karpenter consolidation replaces a node with the exact same node (EC2 instance) type when spec.disruption.consolidationPolicy: WhenUnderutilized is set. Also, eks-node-viewer doesn't show the node to be deleted/replaced as "Cordoned" - at least, that was the behaviour observed during consolidation with earlier versions of Karpenter.

Expected Behavior:

My understanding is that consolidation should kick in under the following situations for OnDemand instance types:

  • Deletes a node – when pods can run on the free capacity of other nodes in the cluster
  • Deletes a node – when the node is empty
  • Replaces a node – when pods can run on a combination of the free capacity of other nodes in the cluster plus a more efficient replacement node

However, I noticed that the "replace node" action happens whenever Karpenter finds the node is underutilized - the node is replaced on a continuous basis and with the exact same node type. In my case a t4g.nano was replaced with another t4g.nano; the replacement node was not more efficient than the original node in any way, it was exactly the same. This behaviour made me think the replacement is happening based on utilization only.

Also, the node to be deleted/replaced should be cordoned first, then drained, and deleted only after its pods are placed onto the new node.
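
(For reference, a node that has been cordoned for disruption would show spec.unschedulable: true in its manifest; a minimal illustrative sketch follows, using the node name from the screenshots further below. This is not output captured from the cluster.)

apiVersion: v1
kind: Node
metadata:
  name: ip-192-168-151-16.us-west-2.compute.internal  # example node from this report
spec:
  # Set by the cordon step before the node is drained; this is the field
  # eks-node-viewer reports as "Cordoned".
  unschedulable: true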

Reproduction Steps:

NodeClass.yaml (karpenter-demo is my Cluster name):

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  creationTimestamp: null
  name: default
spec:
  amiFamily: AL2
  role: KarpenterNodeRole-karpenter-demo
  securityGroupSelectorTerms:
  - tags:
      karpenter.sh/discovery: karpenter-demo
  subnetSelectorTerms:
  - tags:
      karpenter.sh/discovery: karpenter-demo
status: {}

Nodepool.yaml

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  creationTimestamp: null
  name: default
spec:
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: Never
  limits:
    cpu: 10k
  template:
    spec:
      nodeClassRef:
        name: default
      requirements:
      - key: karpenter.sh/capacity-type
        operator: In
        values:
        - on-demand
      resources: {}
status: {}
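
(Not part of the original reproduction: to rule out underutilization-driven replacement while testing, the disruption block could be switched to only consolidate empty nodes. A hedged sketch, assuming v1beta1 accepts consolidateAfter together with WhenEmpty:)

  disruption:
    # Alternative used only to isolate the behaviour, not the config above:
    # remove nodes once they are completely empty, after a short delay.
    consolidationPolicy: WhenEmpty
    consolidateAfter: 30s
    expireAfter: Never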

Deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
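
(For what it's worth, one way to confirm whether consolidation is driving the churn would be to opt the pods out of voluntary disruption via the karpenter.sh/do-not-disrupt pod annotation. A minimal sketch of the pod template change under that assumption, not part of the original reproduction:)

  template:
    metadata:
      labels:
        app: inflate
      annotations:
        # Opts these pods out of Karpenter's voluntary disruption (v1beta1),
        # so the node they land on should not be consolidated away.
        karpenter.sh/do-not-disrupt: "true"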

Screens from eks-node-viewer:

ip-192-168-151-16.us-west-2.compute.internal (t4g.nano) is being consolidated (because it is underutilized?) and replaced with ip-192-168-24-218.us-west-2.compute.internal (again, a t4g.nano).

[Screenshot 2023-10-12 at 6:54:45 PM]

[Screenshot 2023-10-12 at 6:55:55 PM]

[Screenshot 2023-10-12 at 6:56:26 PM]

After some time, ip-192-168-24-218.us-west-2.compute.internal is again replaced with another t4g.nano instance, and the cycle repeats continuously.

Additionally, unlike with earlier versions of Karpenter, eks-node-viewer doesn't show the node to be replaced as "Cordoned". Since the logs were rotating fast, it was hard to check whether pods were being gracefully moved to the new node.

Do I have something misconfigured in the NodePool, NodeClass, or Deployment manifest? Or is this the expected consolidation behaviour in v1beta1 that needs additional configuration to work as expected? If there are no misconfigurations or additional configurations to control this, then this is a potential bug that needs attention.

Versions:

  • Chart Version:
  • Kubernetes Version (kubectl version):
  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@badrish-s added the bug (Something isn't working) label Oct 13, 2023
@sadath-12
Contributor

As far as cordoning and draining of the nodes selected for disruption are concerned, that will be handled once kubernetes-sigs/karpenter#624 is resolved.

@ellistarn
Contributor

You're using an unreleased version of the beta?

@ellistarn
Contributor

Can you provide some logs?

@ellistarn added the question (Issues that are support related questions) label and removed the bug (Something isn't working) label Oct 25, 2023
@njtran
Contributor

njtran commented Oct 25, 2023

@badrish-s does your t4g.nano node have any node memory pressure taints?
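
(For reference, such a taint would appear in the node spec roughly as below; an illustrative sketch only, not output captured from the cluster.)

spec:
  taints:
  # Taint added by the node lifecycle controller when the node reports
  # the MemoryPressure condition.
  - key: node.kubernetes.io/memory-pressure
    effect: NoSchedule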

Contributor

github-actions bot commented Nov 9, 2023

This issue has been inactive for 14 days. StaleBot will close this stale issue after 14 more days of inactivity.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Nov 23, 2023
@badrish-s
Contributor Author

Apologies. I was on vacation and didn't get a chance to respond to this issue earlier. I am back now, and I set up Karpenter v1beta1 fresh (using the instructions) and applied the same set of NodePool, NodeClass, and Deployment manifests. I am unable to reproduce the issue now, i.e. the node is NOT being replaced on a continuous basis with the exact same node type.

I'd like to mention that the KARPENTER VERSION I am currently using is the latest, i.e. v0.32.2 - this was different when I originally tested and reported this issue in October 2023, when I was testing with an internal-only image - v0-2012cf98c2e2e9625e858842c9f2d177efb0c364. I believe I either did something incorrect earlier or the issue has been addressed in the latest version v0.32.2. GitHub Actions has closed this issue due to inactivity, and I will let it remain that way until I see this again (hopefully never). Thanks for looking into this!

@jonathan-innis
Contributor

Sounds good @badrish-s. Glad to hear that the issue appears to be resolved on the latest version!


5 participants