Calico node error - iptables-legacy-save command failed #8831

Open
Farhanec07 opened this issue May 16, 2024 · 6 comments

@Farhanec07

Expected Behavior

Current Behavior

calico-kube-controllers-6fb59668cc-k746l   0/1     CrashLoopBackOff   38 (4m22s ago)   120m
calico-node-lzwd5                          0/1     Running            0                117m
calico-node-m5pdk                          0/1     Running            11 (72s ago)     117m
calico-node-zngzj                          1/1     Running            1 (51m ago)      117m
calico-typha-5f49b8b8c4-hmspz              1/1     Running            0                120m
calico-typha-5f49b8b8c4-kksv4              1/1     Running            0                120m

I observed a panic: calico-node fails to save the iptables rules, which causes the pods to crash.
calico-node pod log:

panic: (*logrus.Entry) 0xc0005ed420
2024-05-15 17:52:12.344 [WARNING][61470] felix/table.go 840: iptables save failed error=exit status 127
2024-05-15 17:52:12.949 [PANIC][61470] felix/table.go 784: iptables-legacy-save command failed after retries ipVersion=0x4 table="raw"
panic: (*logrus.Entry) 0xc000720000

goroutine 172 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc0003ada40, 0x0, {0xc00034d640, 0x31}) ...


2024-05-15 17:52:14.307 [WARNING][61545] felix/table.go 840: iptables save failed error=exit status 127
2024-05-15 17:52:14.307 [WARNING][61545] felix/table.go 778: iptables-legacy-save command failed error=exit status 127 ipVersion=0x4 stderr="" table="raw"
2024-05-15 17:52:14.309 [WARNING][61545] felix/table.go 840: iptables save failed error=exit status 127
2024-05-15 17:52:14.310 [WARNING][61545] felix/table.go 778: iptables-legacy-save command failed error=exit status 127 ipVersion=0x4 stderr="" table="nat"
2024-05-15 17:52:14.312 [WARNING][61545] felix/table.go 840: iptables save failed error=exit status 127
2024-05-15 17:52:14.312 [WARNING][61545] felix/table.go 778: iptables-legacy-save command failed error=exit status 127 ipVersion=0x4 stderr="" table="mangle"
2024-05-15 17:52:14.314 [WARNING][61545] felix/table.go 840: iptables save failed error=exit status 127
2024-05-15 17:52:14.314 [WARNING][61545] felix/table.go 778: iptables-legacy-save command failed error=exit status 127 ipVersion=0x4 stderr="" table="filter"
2024-05-15 17:52:14.478 [INFO][61545] felix/health.go 294: Reporter is not ready: reporting non-ready. name="InternalDataplaneMainLoop"

I checked cni.log and could see only the errors below:

2024-05-13 15:13:19.864 [ERROR][4517] plugin.go 580: Final result of CNI DEL was an error. error=stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/

2024-05-13 15:13:58.392 [WARNING][7280] k8s.go 549: CNI_CONTAINERID does not match WorkloadEndpoint ConainerID, don't delete WEP. ContainerID="1a5404058443532a2fd04878af23e0d33b8be65f497d56f42d2f546310dcbc9b" WorkloadEndpoint=&v3.WorkloadEndpoint{TypeMeta:v1.TypeMeta{Kind:"WorkloadEndpoint", APIVersion:"projectcalico.org/v3"}, ObjectMeta:v1.ObjectMeta{Name:"ip--10--80--187--34.us--west--2.compute.internal-k8s-calico--kube--controllers--6fb59668cc--slzxw-eth0", GenerateName:"calico-kube-controllers-6fb59668cc-", Namespace:"kube-system", SelfLink:"", UID:"1a92bc05-cfae-436d-9ae0-8a5bbfd39918", ResourceVersion:"2657", Generation:0, CreationTimestamp:time.Date(2024, time.May, 13, 15, 9, 21, 0, time.Local), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{"k8s-app":"calico-kube-controllers", "pod-template-hash":"6fb59668cc", "projectcalico.org/namespace":"kube-system", "projectcalico.org/orchestrator":"k8s", "projectcalico.org/serviceaccount":"calico-kube-controllers"}, Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Spec:v3.WorkloadEndpointSpec{Orchestrator:"k8s", Workload:"", Node:"ip-10-80-187-34.us-west-2.compute.internal", ContainerID:"5ce6b20118a30e7fb8568e63265afe2a3f51590d9ddc7e82de3ab11c5d4f52e5", Pod:"calico-kube-controllers-6fb59668cc-slzxw", Endpoint:"eth0", ServiceAccountName:"calico-kube-controllers", IPNetworks:[]string{"100.100.208.65/32"}, IPNATs:[]v3.IPNAT(nil), IPv4Gateway:"", IPv6Gateway:"", Profiles:[]string{"kns.kube-system", "ksa.kube-system.calico-kube-controllers"}, InterfaceName:"cali0551ebff1eb", MAC:"", Ports:[]v3.WorkloadEndpointPort(nil), AllowSpoofedSourcePrefixes:[]string(nil)}}
2024-05-13 15:13:58.392 [INFO][7280] k8s.go 585: Cleaning up netns ContainerID="1a5404058443532a2fd04878af23e0d33b8be65f497d56f42d2f546310dcbc9b"
2024-05-13 15:13:58.392 [INFO][7280] dataplane_linux.go 526: CleanUpNamespace called with no netns name, ignoring. ContainerID="1a5404058443532a2fd04878af23e0d33b8be65f497d56f42d2f546310dcbc9b" iface="eth0" netns=""

2024-05-13 15:13:58.583 [WARNING][7343] ipam_plugin.go 432: Asked to release address but it doesn't exist. Ignoring ContainerID="98c5fab830d6aad1c63b765bc11ad533d7160d9144989f232a9b5716415c804c" HandleID="k8s-pod-network.98c5fab830d6aad1c63b765bc11ad533d7160d9144989f232a9b5716415c804c" Workload="ip--10--80--187--34.us--west--2.compute.internal-k8s-kubed--864bd6d7f--jjfzv-eth0"

When I exec into the calico-node pod, the iptables commands fail to run:

$ kubectl exec -it calico-node-25spv -n kube-system -- /bin/bash

[root@ip-10-80-175-186 /]# iptables-save
iptables-save: symbol lookup error: iptables-save: undefined symbol: xtables_strdup
[root@ip-10-80-175-186 /]# iptables --version
iptables: symbol lookup error: iptables: undefined symbol: xtables_strdup
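
The `undefined symbol: xtables_strdup` message means the dynamic loader resolved a `libxtables` shared object that does not export that function. One way to check which library is being picked up is to list its exported symbols and look for the name. The sketch below is only an illustration: `exports_symbol` is a hypothetical helper, it assumes `nm` output in the usual `address type name` form, and the library path in the usage comment is an assumption. `nm` and `ldd` may not be present in the calico-node image, in which case the library file would need to be copied out and inspected elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the symbol named in $1 appears in
# `nm -D --defined-only` output supplied on stdin.
exports_symbol() {
  awk -v sym="$1" '
    # The symbol name is the last field; tolerate an "@VERSION" suffix.
    { name = $NF; sub(/@.*/, "", name); if (name == sym) found = 1 }
    END { exit(found ? 0 : 1) }
  '
}

# Example use (the library path is an assumption; take the real one from ldd):
#   ldd "$(command -v iptables-save)" | grep xtables
#   nm -D --defined-only /usr/lib64/libxtables.so.12 | exports_symbol xtables_strdup
#   echo "exit status: $?"   # 0 if exported, 1 if missing
```

Reading the `nm` listing from stdin keeps the helper independent of where the library lives, so the same check can be run against a copy of the file taken from the container or from the host.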

Possible Solution

Steps to Reproduce (for bugs)

Context

Your Environment

  • Calico version 3.27.3
  • Orchestrator version (e.g. kubernetes, mesos, rkt): k8s 1.28
  • Operating System and version: linux
  • Link to your project (optional):
@Farhanec07
Author

Please share your thoughts on this. We are currently blocked from upgrading to EKS 1.29 due to this issue.

@tomastigera
Contributor

What Linux distro/version do you use? Does it have (proper) support for iptables?

@Farhanec07
Author

Farhanec07 commented Jun 5, 2024

What Linux distro/version do you use? Does it have (proper) support for iptables?

We create the cluster on the amazon-linux-2-arm64 AMI, which is where we hit the issue above.

@jonathan-hurley

jonathan-hurley commented Jun 14, 2024

The actual AMIs which are in question here are the Optimized EKS ones (such as amazon-eks-arm64-node-1.26 and amazon-eks-arm64-node-1.29).

All versions of these AMIs (even the x86/AMD64 ones) have the same version of iptables (v1.8.4):

rpm -q iptables nftables firewalld
iptables-1.8.4-10.amzn2.1.2.aarch64
package nftables is not installed
package firewalld is not installed

So I don't think this would be related to the version of iptables. The same commands work on much older 1.26 ARM instances (which work with earlier versions of Calico).

# iptables -A INPUT -s 1.2.3.4 -j DROP
# iptables-save
# Generated by iptables-save v1.8.4 on Fri Jun 14 19:59:06 2024
*filter
:INPUT ACCEPT [65:3736]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [34:2728]
-A INPUT -s 1.2.3.4/32 -j DROP
COMMIT
# Completed on Fri Jun 14 19:59:06 2024

@coutinhop
Contributor

@jonathan-hurley the function in question (xtables_strdup()) is present in iptables v1.8.8 (which is what calico v3.27.2+ uses): https://github.com/PKRoma/iptables/blob/v1.8.8/libxtables/xtables.c#L463, but it doesn't seem to be there in the version you mentioned (v1.8.4): https://github.com/PKRoma/iptables/blob/v1.8.4/libxtables/xtables.c

Would it be possible to upgrade iptables to v1.8.8 on your instances? Alternatively, Calico versions before v3.27.2 should be using iptables v1.8.4. Could you try one of those and see whether the issue is resolved? It is not ideal, but it would at least help diagnose this.
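
To make the version comparison above concrete, here is a minimal shell sketch that pulls a version number out of `iptables --version` style output and checks it against the v1.8.8 threshold mentioned in this comment. `extract_version` and `version_ge` are hypothetical helper names, and the sketch assumes a `sort` that supports `-V` (GNU coreutils does). It only compares version strings; it does not establish which library the failing binary actually loads.

```shell
#!/bin/sh
# Hypothetical helpers for comparing iptables version strings.

# Pull "1.8.4" out of text such as "iptables v1.8.4 (legacy)".
extract_version() {
  sed -n 's/.* v\([0-9][0-9.]*\).*/\1/p'
}

# Succeeds if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) to order the two strings.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example use on a node or inside the container:
#   found=$(iptables --version | extract_version)
#   if version_ge "$found" 1.8.8; then
#     echo "iptables $found is at least 1.8.8"
#   else
#     echo "iptables $found is older than 1.8.8"
#   fi
```

Version sort is used rather than plain string comparison so that a release such as 1.8.10 orders after 1.8.8.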

@jonathan-hurley

Amazon EKS optimized images have always used 1.8.4; we do not have the option to change this.

We must use the latest versions of Calico in order to resolve CVEs.
