Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework NPC's iptables-restore Logic To Use --noflush #1372

Open
aauren opened this issue Oct 11, 2022 · 0 comments
Open

Rework NPC's iptables-restore Logic To Use --noflush #1372

aauren opened this issue Oct 11, 2022 · 0 comments
Labels
feature override-stale Don't allow automatic management of stale issues / PRs
Milestone

Comments

@aauren
Copy link
Collaborator

aauren commented Oct 11, 2022

Is your feature request related to a problem? Please describe.
As pointed out in #1370 when we operate iptables-restore in it's default mode we have the potential to cause race-conditions with other iptables tooling that may exist on the host. In its default operation iptables-restore flushes all existing rules and reads in rules from scratch from the restore file given to it.

Using iptables in this way means that kube-router has the potentially to inadvertently change or remove other application's chains and rules. However, if we use iptables-restore --noflush we can choose to, for the most part, only operate on our own rules and chains. Any chain / rule that is not in the iptables restore input is left alone.

This is especially important because upstream netfilter does not have strong promises of backward compatibility with previous versions of user-space tooling. These two issues together essentially caused #1370 to occur because kube-proxy used a newer version of the user-space tools than kube-router did, when kube-router used iptables-save it wasn't actually fed all of the appropriate options to the kube-proxy rules. Thus when it did it's overly-broad iptables-restore operation, it changed rules that were not related to kube-router

Describe the solution you'd like
kube-router should rework the logic inside the NPC controller so that it uses iptables-restore --noflush and tries as hard as possible to only touch chains and rules that it is authoritative for. As suggested by the kube-proxy team, (thanks to @danwinship) the most efficient way to get from what we have now to what we need would likely be logic along the lines of:

Basically, when you use --noflush, there are three possibilities for a chain:

  • if you have no :CHAINNAME line for the chain, and no -X CHAINNAME or -A CHAINNAME rules, then the restore doesn't affect that chain at all
  • if you have :CHAINNAME and then later -X CHAINNAME (and no -A CHAINNAME), then the chain gets deleted
  • if you have :CHAINNAME (and no -X CHAINNAME), and 0 or more -A CHAINNAME ... lines, then the chain gets completely replaced by the rules in the restore input (just like what would have happened in the non---noflush case)

Additional context

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature override-stale Don't allow automatic management of stale issues / PRs
Projects
None yet
Development

No branches or pull requests

1 participant