-
Notifications
You must be signed in to change notification settings - Fork 87
Sweep and Prune
Broadphase collision systems are one of the core routines of every physic engine. If your broadphase is slow you already have lost, there is no chance to recover from a bad broadphase system by using simple code optimizations.
Generally a physic engine has three collision levels, called:
-
Broadphase
Here you detect which collisions can actually happen and discard collisions which can't. -
Midphase
Maybe you detected a collision between a sphere and some cloth object or your level geometry. The work of the midphase is to detect collisions of the level geometry or the cloth object which can actually happen. (Think of the midphase as a "level2" broadphase) -
Narrowphase
Here you do the actual collision detection. In every physic engine (I'm aware of) you get pairs of bodies which could collide. Then you perform the narrowphase which checks if they actually do and then retrieve collision data like the point of intersection, the collision normal or the penetration depth.
Nearly all broadphase systems use the axis aligned bounding boxes (aabb) of the objects. They can be computed quite fast and checking for intersection is also done in a few lines of code. So, how do we accelerate the "sort out pairs which can't collide because their aabbs don't overlap" process? Let's have a quick few on the naive approach to detect collisions:
for (int i = 0; I < rigidBodies.Count; i++)
{
for (int e = i; e < rigidBodies.Count; e++)
{
if (CheckBoundingBoxForOverlap(i, e))
DetectNarrowPhase(i,e);
}
}
Detect the first body against all others, detect the second body against all others but the first, and so on... Doesn't seem so bad – but it is! If you have a 10x10x10 Cube of touching unit cubes you need to check a few thousand possible colliding pairs in the best case, performing the code above, the narrowphase has to make 0.25 million checks! (Using this brute attack makes sense in small environments with only a few bodies or for testing purposes – the Jitter implementation can be found in "CollisionSystemBrute.cs".) Naturally broadphase is a O(n^2) problem, what can be do about this?
We need some accelerating datastructure. If you search the web for "broadphase collision detection" you find the magic Sweep and Prune (or Sort and Prune, SAP) approach. There are many variants of implementations (two of them are implemented in Jitter) but they have one thing in common: First they sort, then they prune :)
Let's see what this means using the simplest "flavor" of the SAP algorithms, non-persistent single axis SAP:
- Fill a list called "axisList" with all objects in the world. Sort this list on one axis (here the x-axis) by the begin of the bounding box (Sort by Object.BoundingBox.Min.X, With "Min" and "Max" the vectors defining the bounding box). So you know that the most left point of object5 is more left then the most left point on object6 when you look directly at the X-Axis.
- Create a new temporary list called "activeList". You begin on the left of your axisList, adding the first item to the activeList. Now you have a look at the next item in the axisList and compare it with all items currently in the activeList (at the moment just one): If the new item's left is greater then the current activeList-item right, then remove the activeList-item from the activeList – otherwise report a possible collision between the new axisList-item and the current activeList-item. Add the new item itself to the activeList and continue with the next item in the axisList.
We immediately see that this approach is "non-persistent" – the sorting is done every frame from scratch (quicksort performs well) and it just sorts for one axis (here the X-Axis). We don't exploit temporal coherence of the scene (it's likely that a scene doesn't change much in one frame) where does the speed improvement come from? Answer: Collision detection is a search problem. And searching on a presorted list is faster O(nlog(n)). Implementing this easy approach gives you a immense speed improvement compared to the brute force algorithm. When I ported over the Code from JigLib (Created by Danny Chapman) to C# (JigLibX) the original version of JigLib contained a brute force and a very basic grid approach. Jon Watte committed a SAP algorithm to the project which uses exactly the codes described above – it beat the complicated grid approach with ease. So this code perform already quite well, nevertheless if all objects overlap on the X-Axis, the runtime is O(n^2) again. (So, never use the up-Axis, as seperating axis – because of gravity most objects overlap there!)
A more complicated (and faster) approach for doing Sweep and Prune is to do a full three axis check. Two objects collide if and only if all of their three axis overlap (Seperating axis theorem). The persistent-3Axis SAP works in a nutshell like this: We use three lists (for each axis one) and keep the lists sorted using insertionsort. Insertionsort is O(n) on nearly sorted lists – A swap in the insertion sort algorithm means that a object starts/stops overlapping with another body. So you keep one hashtable with overlapping pairs and each frame you do the insertionsort stuff and only get reported the changes which you use to update the hashtable structure. This is how the procedure looks in Jitter:
private void SortAxis(List<SweepPoint> axis)
{
for (int j = 1; j < axis.Count; j++)
{
SweepPoint keyelement = axis[j];
float key = keyelement.Value;
int i = j - 1;
while (i >= 0 && axis[i].Value > key)
{
SweepPoint swapper = axis[i];
if (keyelement.Begin && !swapper.Begin)
{
if (CheckBoundingBoxes(swapper.Body, keyelement.Body))
{
lock (fullOverlaps)
{
fullOverlaps.Add(new BroadphasePair(swapper.Body, keyelement.Body));
}
}
}
if (!keyelement.Begin && swapper.Begin)
{
lock (fullOverlaps)
{
fullOverlaps.Remove(new BroadphasePair(swapper.Body, keyelement.Body));
}
}
axis[i + 1] = swapper;
i = i - 1;
}
axis[i + 1] = keyelement;
}
}
If you are interested in all the details of the typical SAP data structures I recommend to read the excellent article written by Pierre Terdiman which covers nearly everything. If not, just keep in mind that we get informed each frame that "A overlaps B at axis C" or that "A stopped overlapping B at axis C" where C can be the X-, Y- or Z-Axis. When I first implemented this, I was very happy about the cheap results I get and then I made a mistake: I implemented a data structure which remembered how many overlaps each pair has. If the number of overlaps is 3 (which means one overlap on every axis) – the objects collide. This works but is horrible. The number of pairs overlapping on one axis can get very high and so you have to store thousands and thousands of potential overlapping pairs with overlapping count = 1, just to realize that only a few also overlap on the second axis and reach overlap count = 2, and only a very few overlap the third axis getting count = 3. In the end, runtime was O(nlogn) but memory usage was O(n^2) (killing my dictionary structures) – let's mention that I'am not the first doing this fault. The solution of this problem is simple: Don't store anything. If one axis reports an overlap, just do the cheap full boundingbox check – If it is positive add the pair to the fulloverlaps. If one axis reports two bodies seperating then remove the pair from the fulloverlaps. (this is done in the code above). So, memory used for SAP is again only O(n). (Implementation in Jitter can be found under "CollisionSystemPersistentSAP.cs".)
One downside of the persistent Sweep And Prune algorithm is that it suffers from really large worlds with a lot of inactive objects: Imagine a bullet shot through a town with many static objects. There is a swap between each object (even it's many miles away from the bullet) and the bullet itself. This issue can be solved using multiple smaller SAP's which altogether form a grid (broadphase in a broadphase). Another problem I noticed is that ray and bounding box queries can't be done efficient. For raycasting the three sorted lists can be interpreted as a non-uniform voxel grid – so there is a chance to get an ordered ray query. This is in practice rather slow because lot of the voxel grids are empty and it's only a solution for rays which have early hits – just doing a brute force check against each object is often faster. Another problem is that the scene contains one large object (which is often the ground or the level). The endpoints of this object are completely on the left and completely on the right of the lists. Without additional informations stored in the sweep points there is no way to efficiently detect if an object is enclosed by endpoints of a much bigger entity.