Skip to content

Background

jfuruness edited this page Jan 19, 2024 · 12 revisions

Home

Tutorial

Beyond the background in the paper, there are two areas you should be aware of. Programming skills and a background in BGP

Programming Skills for extending BGPy

These are some skills below that probably would be pretty useful for extending BGPy to run your own simulations, along with some resources for them, if applicable. If you're a beginner at Python programming, this repo may not be for you, but maybe our BGPy website would be helpful to you (COMING SOON).

  1. Linux/bash (basic terminal commands, since Windows is not supported)
  2. pytest https://www.udemy.com/course/elegant-automation-frameworks-with-python-and-pytest/
  3. Python Environments (best not to be installing globally, especially since you may want to switch back and forth between Python and PyPy)
  4. PyPy https://www.pypy.org/download.html
  5. Inheritance

Programming skills for contributing to BGPy

This is mainly for new hires who come to work for us at UConn who want to contribute directly to BGPy. If you're just looking to use BGPy, and not modify the source code, this list probably isn't relevant to you.

Some areas that might be nice to brush up on:

  1. git/github
  2. Python packaging
  3. ruff, black, tox, github actions, git precommit: https://www.youtube.com/watch?v=DhUpxWjOhME
  4. Mypy
  5. dunder/magic methods (I recommend the first chapter from an O'Reilly book called Fluent Python if you need a good resource for this)
  6. dataclasses (mcoding on youtube has some good videos on this)
  7. Various intermediate Python topics
  8. Basic understanding of matplotlib, collections (specifically defaultdict), pathlib.
  9. Basic understanding of YAML, and check out YAMLable https://smarie.github.io/python-yamlable/
  10. Multiprocessing in Python (the GIL, basic understanding of multiprocessing).

BGP, ROV, Hijacking Background

Then there is a background in BGP, ROV, BGP Hijacking, etc. These papers will probably add some insight/understanding, I recommend reading them along with any other online resources on BGP and how it works:

  1. Starting with Stable Internet Routing Without Global Coordination
  2. Investigating Interdomain Routing Policies in the Wild
  3. Profiling BGP Serial Hijackers: Capturing Persistent Misbehavior in the Global Routing Table
  4. Mind Your Blocks: On the Stealthiness of Malicious BGP Hijacks
  5. RPKI is Coming of Age: A Longitudinal Study of RPKI Deployment and Invalid Route Origins

Beyond that, here is an example that we will go over with some very brief BGP Background, along with some defaults within our simulations:

BGP

Image

We simulate the internet in BGPy as a directed acyclic graph that contains about 70,000 nodes, known as Autonomous Systems, or ASes for short. These ASes route traffic for Internet Service Providers (ISPs), Content Delivery Networks (CDNs), Organizations, etc. They can be identified by their AS Number, or ASN for short. For example, on the image above, the green AS has an ASN of 777.

In BGPy we simulate two main types of connections.

In customer provider connections, customers pay providers for traffic. For example, in the image above, the arrows denote customer provider relationships (with the arrowhead being pointed towards the customers).

There are also peer to peer connections, where traffic flows freely between nodes/ASes. In our example graph, these are denoted by dashed lines.

There are other types of connections, such as sibling connections, but we do not include these within our simulations since there is not good ground truth data on these types of relationships. Additionally, sibling connections are typically not included in prior works for BGP simulations.

These ASes forward traffic and BGP Announcements to one another. While announcements have many attributes (and you can see many of these in the code), you can think of them from a very high level as having:

  • Prefix: This is a block of IP addresses. For example, 1.2.3.0/24 is a prefix. This indicates that the first 24 bits of the prefix are specified, and the remaining 8 bits are unspecified. So for example, 1.2.3.1 to 1.2.3.255 would be contained within 1.2.3/24, but 1.2.4.0 would not be.
  • Origin: The ASN that created the announcement
  • AS Path: The path that the announcement took to reach it's destination. The AS Path has the newest ASes at the front on the left, with the origin on the farthest right. As the announcement travels across the internet topology, each AS will prepend itself to the AS Path.

You can see all of these attributes in the announcements in the image above at each AS. Each AS in the diagram show's it's local RIB, which contains the best announcement for each prefix.

When an AS receives multiple announcements for the same prefix, the AS must choose which announcement is best. In general, as has been used in prior works with BGP Simulations, BGP ASes select announcements for a given prefix based on Valley Free Routing in order to maximize profit. This is also the default policy in BGPy (although you could use another policy if you wish).

For Valley Free Routing, first there is the local preference. This is where ASes prefer announcements from customers (which pay them), over announcements from peers (which are free), over announcements from providers (which they have to pay for). For example, in the image above, AS 8 receives the /16 prefix from both it's peer and it's customer, and it rightly selects the announcement from it's customer. Additionally, for the subprefix, AS 8 receives the /24 prefix announcement from both it's peer and it's provider, and rightly selects the announcement from it's peer over it's provider because the traffic is cheaper.

After local preference comes optimizing for the shortest AS path. If the relationship is the same, the shortest AS Path is chosen, such as with AS 10, which receives the announcement containing the /16 prefix from AS 8 and AS 9, and rightly selects the announcement from AS 9 since it has a shorter AS path back to the origin.

As far as tiebreakers go, in the real world iBGP is then taken into account. There are tiebreakers with MED, BGP ID, etc, but we don't have access to good ground truth data for these metrics, and thus, as is done in prior works, we simple tiebreak by the lowest ASN. You can see an example of this here, in AS 5, which receives the announcement containing the /24 subprefix from both AS 1 and AS 2, and rightly selects the announcement from AS 1 (the lower ASN).

For the export policy (where announcements get sent after receiving them), by default all announcements from customers are sent to everyone. However, announcements from peers and providers are only sent to customers. For example, you can see here that AS 9 receives the announcement containing the /24 subprefix from it's provider, AS 10, and does not forward that to it's second provider, AS 11. Similarly AS 8 receives the announcement containing the /24 subprefix from it's peer, AS 5, and only forwards the announcement to it's customers, AS 4, and does not forward the announcement to it's peers, AS 3.

And again, the valley free routing and export policy are just the defaults in our simulations (as they have been in prior works). They are very straightforward to extend to do things like route leaks and so on that violate the valley free principles, and we have projects under way that are using BGPy in this way. You can also extend it to have different tiebreakers, etc

BGP Hijacking

Now what happens if there are multiple prefixes for the same IP address? The answer is the most specific prefix. A type of attack that exploits this is called a BGP Hijack, and our specific example is of a BGP Subprefix hijack, where the attacker, AS 666, announces a more specific prefix (in this case 1.2.3.0/24) than the victim, AS 777, (which in this case announces 1.2.0.0/16). To see an example of this, let's start at AS 3. If an operator were looking at the control plane data (in other words, the data visible in the local RIB), the routing operator would assume that AS 3 is not hijacked, since it does not have an announcement for the hijack.

However, let's look at the data plane (the traffic flow between ASes). Let's say AS 3 wants to route to 1.2.3.0. 1.2.3.0 is contained within the 1.2.0.0/16 prefix in the local RIB. The next AS in the AS Path of the announcement in the local RIB is AS 8. Therefore, the traffic is forwarded to AS 8 for 1.2.3.0. AS 8 performs the same action. 1.2.3.0 is contained both within 1.2.0.0/16 and 1.2.3.0/24. Since there is a conflict, AS 8 will choose the most specific prefix, 1.2.3.0/24, to forward it's traffic to.

This process repeats itself until the traffic is routed all the way back to the attacker, AS 666.

This is an important point, because while looking at the control plane/local RIB, AS 3 might appear to not be hijacked, however, we must actually trace back the traffic to it's origin on the data plane to actually determine that. This will become important later as we look at how we determine if an AS is hijacked or not. All of the ASes that trace back to the attacker (AS 666) on the data plane are denoted in red. All of the ASes that trace back to the victim (AS 777) on the data plane are denoted in green.

ROV

So how do ASes defend against such attacks? With Route Origin Validation (ROV). The in depth explanation of ROV was included in the reading reference material, so I will just give a high level overview of it here. You can think of ROAs as entries within a database that contain a prefix and it's origin. There are other attributes of ROAs, but we won't discuss them here. ROV basically looks up each announcement it receives in this database of ROAs (which in real life is the RPKI). In the example below, AS 8 deploys ROV. It looks up the attacker's announcement, and finds that the announcement from the attacker is invalid and drops it. This protects AS 3, AS 8, and AS 4, which now all turn green as the result of traffic from the ata plane flowing back to the victim, the legitimate origin of the announcement, which is AS 777. You can see these changes in the top left of the image as well, where the number of ASes that route back to the victim (including the victim) is now 4.

Image

Peer ROV

We will also take a look at a second ROV variant that we see in the wild, Peer ROV. This is the same as ROV, but it only filters announcements coming from peers. In this example, we can see that it fails to filter the hijack coming from it's provider, and therefore fails to protect the other ASes that basic ROV protects.

Image

Background Conclusion

But this comparison between ROV and Peer ROV is just on a small topology that we made up. The rest of this walkthrough is going to walk you through, step by step with code examples, how to run a comparison of PeerROV vs ROV against a subprefix hijack on a full scale AS topology with all 70,000 ASes. You'll be shown all of the metrics that we capture, and how to generate these graphs and display them, such as the one below. We'll also show you how to create the system tests like the one that we've been using throughout this walkthrough, which have been dynamically generated from our system test suite.

Image

The above is an example graph that demonstrates results for a full scale AS topology. As you can see, similar to our smaller example, PeerROV is very ineffective. At 40% of ASes adopting ROV, around 40% of ROV ASes are not hijacked. Compared to 40% of PeerROV ASes adopting, and nearly all PeerROV ASes are hijacked.

Next: AS Graph