-
Notifications
You must be signed in to change notification settings - Fork 470
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Add Route Reconciliation #1749
base: master
Are you sure you want to change the base?
Conversation
@@ -0,0 +1,45 @@ | |||
package pkg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@twz123 I wanted to get your feedback about this convention within Go.
As I started to get into the refactoring of InjectRoute, I found that I kept getting into circular dependency type situations between the bgp
and routes
packages.
After watching a video from Brian Ketelsen about structuring go programs, as well as several other posts, it seems like pulling up your domain models into the root of the project is a conventional way of working around circular dependencies.
Doing this with the below methods, did work out for me, but I noticed that this architecture is limited. For instance, I couldn't leave the TunnelCreator
interface with EncapType
and EncapPort
because those are declared in the tunnels
package. And for some reason, it didn't feel right to bring those up into this top level package either.
So I worked around it for now, but maybe this is showing that either I should have brought them up, or that maybe I'm embracing the wrong pattern here.
I would be curious for your thoughts.
@@ -1,82 +0,0 @@ | |||
package routes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@twz123 I ended up bringing this back into the routing
package because I realized that it was behaving a bit more like an extension of the NetworkRoutesController, in that, in order to have it process route reconciliation I needed it to work with the routes source of truth, which is BGP. At that point it is no longer strictly related to host routes (although now that I'm thinking about it, it means that maybe host_route_sync.go
isn't the right name for the file any more either.
Anyway, curious about your thoughts here as well.
pkg/routes/linux_routes.go
Outdated
@@ -30,3 +32,109 @@ func DeleteByDestination(destinationSubnet *net.IPNet) error { | |||
} | |||
return nil | |||
} | |||
|
|||
func InjectRoute(subnet *net.IPNet, gw net.IP, ipa utils.NodeIPAware, tn pkg.Tunneler, rs pkg.RouteSyncer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@twz123 - Now that this logic is disentangled with the BGP logic that used to be buried in here I like it a lot better. It feels a lot cleaner and more straight forward to me.
pkg/bgp/path.go
Outdated
"k8s.io/klog/v2" | ||
) | ||
|
||
func PathChanged(path *gobgpapi.Path, pl PeerLister, rs pkg.RouteSyncer, tc pkg.TunnelCleaner) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@twz123 - This feels like the right pattern, because it decouples the routes
package from the bgp
package by making it implementation agnostic. So I think that making all of the arguments to this function interfaces is the right thing to do, but wanted to see if you had any feedback.
c07f1fd
to
482ebec
Compare
type LinuxRouter struct { | ||
NodeIPA pkg.NodeIPAware | ||
Tunneler pkg.Tunneler | ||
RouteSyncer pkg.RouteSyncer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't particularly like the circular dependency here. LinuxRouter
requires pkg.RouteSyncer
so that it can update the state host routes. pkg.RouteSyncer
in turn requires LinuxRouter
so that it can reuse the InjectRoute()
logic when new routes are found during checkCacheAgainstBGP()
.
It feels like there should be a better / cleaner way to do this.
25b6ff4
to
c204b26
Compare
c204b26
to
9bda129
Compare
Add a route reconciliation step to the host_route_sync file that uses the BGP state table as a system of record for the routes that should or should not be synced to the system. In theory, kube-router should be consistently informed of route changes via BGP Path change events that are sent to us from GoBGP. However, in practice it seems that some routes are getting stuck in our state table.
Host route syncing is an important part of kube-router that is run in its own goroutine, so we should be checking it for health the way we do the major controllers to make sure that it never stops functioning. Additionally, as the health check controller has been continuously added on to, we also make some modifications here to make it a bit more robust and scalable.
In addition to added / removed routes, there is also the chance that the gateway has changed and we need to track that as well by fixing the cache state.
We don't want to add paths that are either not the Best path, which is something that we get for free on the BGP Watch method that we have in the NRC, or paths that are currently being withdrawn (even though this should be an edge case).
In a List operation, we get our own routes that we've advertised. This was a problem that we didn't have with BGP Watches. However, this now affects us, so we need to remove IPs that represent ourselves, so that we don't add needless routes.
Adding a boolean to the return values of InjectRoute() allows us to track when a route is actually inserted vs. when it was called. This helps us better represent HostRoutesStaleAddedCounter metric. Without this, the administrator will see a bunch of routes seemingly being added every time we call checkCacheAgainstBGP() when, in fact, most of them are tossed out depending on the user's mesh subnet settings.
9bda129
to
a27e751
Compare
There are a couple of items that have typically ended up in a no-op for us when considering routes to inject. However, now that we have a route map where we track route state, we need this not just to be no-ops, but also update the route state cache as well to ensure that the route doesn't get replaced in the future. When we find tunnels to clean up, we need to not only remove the tunnel and the route to that tunnel, but also remove the route from the state map. When we discover that no route needs to be added to the host because it's not in the same subnet and we weren't supposed to create a tunnel, then we also clean it up and ensure that it isn't in our state as well.
Most of this PR is out-dated. After several attempts to patch it up, I've found that my approach of adding BGP Cache Checking into the route_sync controller directly, just isn't the best way to do what I'm trying to do here. Over time I've seen that it is frought with problems and logic gaps. I'm leaving this PR around for now, while I re-consider my approach here, but in the end, I think that most of the problems can be resolved by adopting commit |
This PR is still very early and is not fully implemented, but I wanted to start to get some feedback going on the direction of this before I got too much further down the road.
The end goal of this PR is to enable route reconciliation within the route sync controller. However, in order to do that, I wanted to break out
injectRoute()
from the NetworkRoutesController.Also, over the years, it has become obvious that the
injectRoute()
function is hard for new users to parse, and there is often a lot of confusion about how it works because the logic is complicated. So this also attempts to split out the logic in inject route into BGP type logic as well as Route type logic.Final fix for: #1738