Connection issue for a multiple zone cluster with calico 3.28.0 #8860
Comments
Sounds like a regression, we'll have a look.
I might have a similar issue and was just about to open a bug report. For me, this is related to VXLAN checksum offloading. @lzhecheng Can you try disabling checksum offloading (
I'm adding the basics of what I am observing here:
I could not observe packet loss when directing traffic at the pod IP itself, only via the K8s iptables rules for NodePort/LoadBalancer, likely because of NAT. The packet loss happened only on the destination node, between the physical interface and the pod interface: I could still see the first SYN packet VXLAN-encapsulated on the physical interface, but only the second SYN showed up on the pod interface (cali*). My test client was running only in hostNetwork; I haven't tested from a pod yet.
We've got an internal repro of a kernel problem: when VXLAN offloading is used and a packet is SNATted multiple times (in this case for node->service->pod traffic), the checksum doesn't get calculated properly. We'll revert ChecksumOffloadBroken to true in the next patch release (3.28.1) while we look into alternative fixes (perhaps finding a way to prevent the double SNAT). I'm not certain that this exactly matches the issue here, but it certainly sounds similar.
I think this is likely a kernel problem with VXLAN checksum offload when there are multiple SNATs (which can happen in this kind of host -> service -> pod connection). We're going to disable this in 3.28.1 and will then look for a way to get the offload back.
@lzhecheng @sfudeus thanks for reporting. The kernel issue was supposed to be fixed, but we were looking for a possible repro. What kernel / Linux distro do you use? Do you use any public cloud?
@tomastigera Pure on-premise on metal for us, currently Flatcar Container Linux 3975.1.1 (beta channel) with kernel 6.6.36-flatcar (likely 3941.1.0 with 6.6.30-flatcar at the time of reporting).
@tomastigera my cluster was created on Azure with CAPZ. Version
Yes, the first packet has a wrong UDP checksum and is therefore dropped by the VXLAN device and not forwarded to the pod. @sfudeus do you observe the issue with eBPF as well? My understanding is that with iptables, the situation is created by a conflict between Calico and kube-proxy rules. With eBPF, there are no kube-proxy rules and packets take a completely different path, so I would expect this not to happen.
@tomastigera IIRC we saw this with non-eBPF only, but I'll recheck. Not sure when I'll get to this though, might only be next week.
It is not an eBPF issue for sure.
This image
@tomastigera I can confirm that I cannot observe the first-SYN issues anymore using
Expected Behavior
A Node can reach a service whose endpoint is on another Node (different zone) immediately.
Current Behavior
A Node cannot reach a service whose endpoint is on another Node (different zone) immediately. The first packet is dropped and the second one works.
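One way to check for the "first packet dropped, second one works" symptom from a node is to time the TCP handshake. The sketch below is not from the original report; it assumes a Linux client, where a lost initial SYN is normally retransmitted after about one second, so a connect that takes roughly a second or more hints at a dropped first SYN. The helper names, the 0.9 s threshold, and the host and port you pass in are all illustrative assumptions.

```python
import socket
import time


def time_connect(host, port, timeout=5.0):
    """Return the seconds needed to complete a TCP handshake to (host, port)."""
    start = time.monotonic()
    # create_connection returns once the three-way handshake has completed.
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return time.monotonic() - start


def looks_like_syn_retransmit(elapsed, threshold=0.9):
    """Heuristic (assumption): a handshake slower than ~1 s suggests the
    first SYN was lost and the kernel retransmitted it."""
    return elapsed >= threshold


def probe(host, port, attempts=5, timeout=5.0):
    """Open several fresh connections and flag the slow ones.

    Each attempt uses a new source port, so each one is a new flow.
    Returns a list of (elapsed_seconds, suspected_retransmit) tuples.
    """
    results = []
    for _ in range(attempts):
        elapsed = time_connect(host, port, timeout)
        results.append((elapsed, looks_like_syn_retransmit(elapsed)))
    return results
```

Run from a node (hostNetwork) against the service's cluster IP or NodePort, for example `probe("<service-ip>", <port>)`, and compare with the same probe against the pod IP directly. A slow handshake is only a hint; a packet capture on the destination node is still needed to confirm where the SYN is dropped.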
Possible Solution
Use calico 3.27.3
Steps to Reproduce (for bugs)
Context
Details here: kubernetes-sigs/cloud-provider-azure#6293
Your Environment