CNI ipamd reconciliation of in-use addresses #3109
Comments
The CNI today only reconciles its datastore with existing pods at startup, never again afterwards. It is possible for ipamd to go out of sync with the kubelet's view of the pods running on the node if ipamd fails or is temporarily unreachable while the CNI plugin is handling the DelNetwork call from the container runtime. In that case the CNI continues to consider the pod's IP allocated and will never free it, since it will never see another DelNetwork for that pod. This eventually causes the CNI to fail to assign IPs to new pods. This change adds a reconcile loop that periodically (once a minute) reconciles the allocated IPs against the existence of the pods' veth devices. If a veth device is not found, the corresponding allocation is freed, making the IP available for reuse. Fixes aws#3109
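For illustration, here is a minimal Go sketch of what such a veth-based reconcile loop could look like. The Datastore interface, its AllocatedIPs/UnassignIP methods, and the per-allocation veth name are hypothetical stand-ins for ipamd's internal datastore; only the netlink lookup reflects how the existence of a host-side veth can actually be checked.

```go
// Package reconciler: a minimal sketch of the periodic reconcile loop
// described above. Datastore and Allocation are hypothetical stand-ins
// for ipamd's internal datastore, not the project's real types.
package reconciler

import (
	"log"
	"time"

	"github.com/vishvananda/netlink"
)

// Allocation pairs an assigned pod IP with the host-side veth created for it.
type Allocation struct {
	IP       string
	VethName string
}

// Datastore is a stand-in for ipamd's IP datastore.
type Datastore interface {
	AllocatedIPs() []Allocation
	UnassignIP(ip string) error
}

// ReconcileLoop frees any allocation whose host-side veth no longer exists,
// running once a minute as the change describes.
func ReconcileLoop(ds Datastore, stop <-chan struct{}) {
	ticker := time.NewTicker(time.Minute)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			for _, alloc := range ds.AllocatedIPs() {
				_, err := netlink.LinkByName(alloc.VethName)
				if err == nil {
					continue // veth still present, pod presumed alive
				}
				// LinkNotFoundError means the pod's veth is gone: the pod
				// was torn down but the DelNetwork was missed or failed.
				if _, ok := err.(netlink.LinkNotFoundError); ok {
					log.Printf("veth %s missing, freeing IP %s", alloc.VethName, alloc.IP)
					if err := ds.UnassignIP(alloc.IP); err != nil {
						log.Printf("failed to free %s: %v", alloc.IP, err)
					}
				}
			}
		case <-stop:
			return
		}
	}
}
```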
Hi @hbhasker, thank you for this report. Could you confirm this by looking at the ipamd.log? If you share the logs at k8s-awscni-triage@amazon.com, we can look over it too.
I did confirm by looking at the JSON for the datastore as well. It clearly had pods in there that had already terminated on the node. I will see if I run into another occurrence of the same and capture more information.
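For anyone wanting to repeat this check, below is a rough sketch of dumping the datastore checkpoint so it can be compared against the pods actually on the node. The file path (/var/run/aws-node/ipam.json) and the JSON field names are assumptions that may vary by CNI version; treat this as illustrative only.

```go
// Hypothetical inspection helper: print the IP allocations recorded in
// ipamd's checkpoint file. Path and JSON field names are assumptions.
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

func main() {
	data, err := os.ReadFile("/var/run/aws-node/ipam.json") // assumed path
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	var checkpoint struct {
		Allocations []struct {
			IPv4        string `json:"ipv4"`        // assumed field name
			ContainerID string `json:"containerID"` // assumed field name
		} `json:"allocations"`
	}
	if err := json.Unmarshal(data, &checkpoint); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%d allocations recorded:\n", len(checkpoint.Allocations))
	for _, a := range checkpoint.Allocations {
		fmt.Printf("  %s -> %s\n", a.IPv4, a.ContainerID)
	}
}
```

The printed count can then be compared against `kubectl get pods --field-selector spec.nodeName=<node> -o wide` (minus host-network pods) to spot stale entries.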
@orsenthil - We will have to check the plugin logs to see if the delete request landed on the CNI, since the kubelet is the source of truth. I don't think we should add more reconcilers; rather, we should check why the event was missed or not received.
What happened:
We noticed that on some hosts the CNI believed 64 IPs were in use by pods that had long since terminated. When we checked, the node had only 59 pods (including ones using host networking), but the CNI clearly thought 64 pods were running and failed to allocate IPs to new pods because all IPs were in use (we set max pods to 64 for the node). We spent some time trying to figure out how this happens; presumably it can occur if the CNI misses the delete event or fails to process it.
I was trying to read the code to see if there is some race where a delete and a create can collide, causing the CNI to incorrectly reject the delete and then proceed to mark the IP as allocated in ipamd. In that case the IP remains in use even though the pod is gone. (It's possible I am misunderstanding what kubelet/cri-o do when a pod is terminated and the CNI fails the DelNetwork request with an error.)
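A hedged illustration of why a failed or rejected DelNetwork can leak the IP: the CNI spec's convention is that DEL should generally complete without error even if some resources are already missing, because the runtime may otherwise give up retrying while the allocation stays pinned. The handler and ipamClient below are hypothetical, not the plugin's actual code.

```go
// Package plugin: a hypothetical DelNetwork handler illustrating idempotent
// DEL handling per the CNI convention; not the aws-vpc-cni plugin's code.
package plugin

import "errors"

// ErrUnknownPod is a stand-in for "ipamd has no record of this pod".
var ErrUnknownPod = errors.New("pod not found in datastore")

type ipamClient interface {
	ReleaseIP(containerID string) error
}

// delNetwork releases the pod's IP. If ipamd no longer knows the pod,
// returning success (rather than an error) lets the runtime finish its
// teardown instead of leaving the allocation stuck forever.
func delNetwork(c ipamClient, containerID string) error {
	if err := c.ReleaseIP(containerID); err != nil {
		if errors.Is(err, ErrUnknownPod) {
			return nil // already released; DEL must be idempotent
		}
		return err
	}
	return nil
}
```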
Mostly I am looking to understand whether this is a known issue. It looks like the CNI does reconcile its database on a restart, but maybe it needs to reconcile periodically to prevent this?
Environment:
- Kubernetes version (`kubectl version`): 1.26
- OS (`cat /etc/os-release`): Amazon Linux 2023
- Kernel (`uname -a`): 6.1