Is your feature request related to a problem? Please describe.
As pointed out in #1370, when we operate `iptables-restore` in its default mode we have the potential to cause race conditions with other iptables tooling that may exist on the host. In its default operation, `iptables-restore` flushes all existing rules and reads in the rules from scratch from the restore file given to it.
Using iptables in this way means that kube-router has the potential to inadvertently change or remove other applications' chains and rules. However, if we use `iptables-restore --noflush` we can choose to, for the most part, operate only on our own rules and chains. Any chain or rule that is not in the iptables restore input is left alone.
This is especially important because upstream netfilter does not make strong promises of backward compatibility with previous versions of the user-space tooling. These two issues together essentially caused #1370 to occur: kube-proxy used a newer version of the user-space tools than kube-router did, so when kube-router ran `iptables-save` it was not actually given all of the appropriate options on the kube-proxy rules. Thus, when it performed its overly broad `iptables-restore` operation, it changed rules that were not related to kube-router.
Describe the solution you'd like
kube-router should rework the logic inside the NPC controller so that it uses `iptables-restore --noflush` and tries as hard as possible to touch only the chains and rules that it is authoritative for. As suggested by the kube-proxy team (thanks to @danwinship), the most efficient way to get from what we have now to what we need would likely be logic along the lines of the following (a short sketch of the resulting restore input appears after this list):
Basically, when you use `--noflush`, there are three possibilities for a chain:

- if you have no `:CHAINNAME` line for the chain, and no `-X CHAINNAME` or `-A CHAINNAME` rules, then the restore doesn't affect that chain at all
- if you have `:CHAINNAME` and then later `-X CHAINNAME` (and no `-A CHAINNAME`), then the chain gets deleted
- if you have `:CHAINNAME` (and no `-X CHAINNAME`), and 0 or more `-A CHAINNAME ...` lines, then the chain gets completely replaced by the rules in the restore input (just like what would have happened in the non-`--noflush` case)
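To make the three cases concrete, here is a minimal, hypothetical sketch (not kube-router's actual NPC code) of building a `--noflush` restore input in Go and piping it to `iptables-restore`. The chain names `KUBE-ROUTER-FORWARD` and `KUBE-ROUTER-STALE` are illustrative assumptions, not the controller's real chains:

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

func main() {
	var buf bytes.Buffer
	buf.WriteString("*filter\n")

	// Case 3 (hypothetical chain): declare the chain and re-add its rules,
	// so its contents are completely replaced by exactly these rules.
	buf.WriteString(":KUBE-ROUTER-FORWARD - [0:0]\n")
	buf.WriteString("-A KUBE-ROUTER-FORWARD -m comment --comment \"example rule\" -j ACCEPT\n")

	// Case 2 (hypothetical chain): declare the chain, then delete it.
	buf.WriteString(":KUBE-ROUTER-STALE - [0:0]\n")
	buf.WriteString("-X KUBE-ROUTER-STALE\n")

	// Case 1: any chain not mentioned in this input (e.g. kube-proxy's
	// chains) is left completely untouched because of --noflush.
	buf.WriteString("COMMIT\n")

	cmd := exec.Command("iptables-restore", "--noflush")
	cmd.Stdin = &buf
	if out, err := cmd.CombinedOutput(); err != nil {
		fmt.Printf("iptables-restore failed: %v: %s\n", err, out)
	}
}
```

Because nothing in this input mentions kube-proxy's chains, they fall under the first case and are left alone, regardless of which user-space iptables version generated them.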