Please read and seek to understand the material below. Questions and programming exercises in light yellow will be discussed in class. Please write down enough so that you will be able to participate in the discussion. If you do not understand an exercise, feel free to skip it.

Collision Detection

Often, in games, you want to know if an object is touching another object, if a laser is bouncing off a mirror, if the player character is standing on something, or the player is clicking on a specific part of the UI. These are all problems that involve writing functions to check for collisions between different sorts of geometric objects.

What are other situations where collision/intersection of geometric objects is relevant to gameplay?

In addition to simply checking for intersections, it will also often be useful for a collision routine to report:

So also keep these use cases in mind in the following discussion.

There is no single best algorithm for all collision detection scenarios, but there are some big ideas that can be useful in designing collision handling systems. In this lesson we will discuss a few of these big ideas in three topic areas:

Early Out

The fastest collision check is the one that is never performed. "Early Out" code attempts to bail out of collision checks as quickly as possible -- generally, by having a fast check that can prove items are not colliding.

Simple Checks First

The type of early out you will almost always see in collision code is to perform a low-computational-cost test on a pair of potential colliders before performing a high-cost test. For example, you might do a sphere vs sphere test (costs a few subtracts, multiplies, and a comparison) before doing a convex-vs-convex or mesh-vs-mesh test.

This can even be done in the normal course of an intersection check -- if your code needs to check three different things, it should probably be checking the quickest-to-check one first.

The logic of "always do the fastest check first" has some flaws to be aware of.

Consider collision code that has two checks, a() and b(), both of which must pass for a collision to be reported. Say that a() takes \(t_a\) seconds and passes on fraction \(p_a\) of the pairs being tested, while b() takes \(t_b\) seconds and passes on fraction \(p_b\) of the pairs being tested. Further, the test results are independent, in the sense that the faction of pairs on which both tests pass is \(p_a p_b\) and the fraction of pairs on which neither test passes is \((1-p_a)(1-p_b)\).

Let's suppose that a() is the "faster" test (i.e., \(t_a < t_b\)). Give equations for the expected runtime of return a() && b(); (run the faster check first) and return b() && a(); (run the slower test first):

Is running the slower check first actually faster? If so, when?

In the case of -- say -- doing a sphere-vs-sphere pre-test, the "fast check" is redundant (not required if the "slow check" is performed). How does this change your analysis and result?

The Broad Phase

Even with very fast object-vs-object tests, your code will still need to perform \( \frac{n(n-1)}{2} \) checks given \( n \) objects unless it has some fast way of excluding pairs of objects that can't possibly overlap. This is the job of a broad phase -- code that comes up with a list of pairs of objects to collision check, generally in faster than \( O(n^2) \) time.

Consider a 2D game world containing objects in three boxes, A, B, and C:

three boxes in a 2D world
Objects in boxes

Can objects in box A and box C possibly overlap? Why or why not?

Broad phase checks can be built on a wide range of spatial data structures, but all boil down to the same basic reasoning: if \(A\) is contained in \(X\), and \(B\) is contained in \(Y\), then if \(X\) doesn't intersect \(Y\), \(A\) cannot intersect \(B\). Or, slightly more generally, and written in math:

\[ A \subseteq X \land B \subseteq Y \implies A \cap B \subseteq X \cap Y \]

And, indeed, we can make a similar statement about distances: if \(A\) is contained in \(X\) and \(B\) is contained in \(Y\), then the distance between the closest points in \(A\) and \(B\) is at least the distance between the closest points in \(X\) and \(Y\).

\[ A \subseteq X \land B \subseteq Y \implies \mathrm{dis}(A, B) \ge \mathrm{dis}(X,Y) \]

Intuitively, why are these two statements about \(A\), \(B\), \(X\), and \(Y\) true?

A Shout-Out to Grids

You can use a wide range of spatial data structures in games (binary space partition trees, in particular, turn out to be really beautiful to write code for, and have a long history with games). But when you want structures that are darn easy to write code for and will work decently, the choice is clear: use uniform grids.

So let's go on a quick tour of some of the flavors of uniform grid you might use.

First, here's the vanilla grid. This is the simple, lazy thing that I use whenever checking all pairs of colliders isn't quite fast enough:

#include <glm/gtx/hash.hpp> //(0)
struct Collider {
	//...
	//axis-aligned bounding box:
	glm::vec2 aabb_min;
	glm::vec2 aabb_max;
};
void broad_phase( std::vector< Collider > &colliders, std::function< void(Collider &, Collider &) > const &report_pair ) {
	glm::vec2 constexpr GridSize = glm::vec2(1.0f, 1.0f); //(1)
	std::unordered_multimap< glm::ivec2, Object * > grid;
	grid.reserve(colliders.size()); //(2)
	for (auto &collider : colliders) {
		glm::ivec2 min = glm::ivec2( //(3a)
			std::floor(collider.aabb_min.x / GridSize.x),
			std::floor(collider.aabb_min.y / GridSize.y)
		);
		glm::ivec2 max = glm::ivec2( //(3b)
			std::floor(collider.aabb_max.x / GridSize.x),
			std::floor(collider.aabb_max.y / GridSize.y)
		);
		for (glm::ivec2 cell = min; cell.y <= max.y; ++cell.y) {
			for (cell.x = min.x; cell.x <= max.x; ++cell.x) {
				auto others = grid.equal_range(cell); //(4)
				for (auto other = others.first; other != others.second; ++other) {
					report_pair(*other, collider); //(5)
				}
				grid.emplace_hint(equal_range.first, &collider);
			}
		}
	}
}

Let's take a quick stroll through this code and see what's going on.

Why is the glm/gtx/hash.hpp header (at (0)) so important to include that I put it in the example even though, e.g., I didn't mention unordered_map?

The GridSize variable (at (1)) controls a trade-off; what are the advantages of setting it smaller? larger?

What does grid.reserve (at (2)) do? Why call it? (See: cppreference docs for std::unordered_map)

What is the purpose of min (at (3a)) and max (at (3b))? Why are both min and max computed with std::floor? (Shouldn't max use std::ceil?)

What type will the compiler infer for others (at (4))?

Why is other being dereferenced, while collider is not, in the call to report_pair at (5)?

The biggest mistake I make when writing this code is to use a std::unordered_map instead of a std::unordered_multimap. What goes wrong when I do this?

While this sort of vanilla, hash-based uniform grid works really well as-is on moderately-sized worlds, they can begin to break down in large worlds. You can improve performance with a few simple-to-code tricks.

Temporal Broad Phase

Spatial data structures aren't the only way to avoid doing collision checks.

Consider a 2D game world containing three circular objects A, B, and C at the following positions:

three circles on a grid
Circular objects in a 2D game world.

If the maximum velocity that any object can travel in the game is 1 unit / second, can A and B can possibly intersect in the next 3 seconds of gameplay? Why or why not?

How might you use this idea in a game to avoid performing collision checks between every pair of objects every frame? (Describe at a high level, do not write code.)

Mathematical Ideas

When devising collision checks, it is important to think about collisions in the right way. What is the right way? Well, it depends on the situation.

In this section, I will equip you with some different mathematical ideas which can be useful in building collision checks.

Convexity

You have probably run into the notion of a convex object (or set) before. A set of points \( C \) is convex if every line between two points in the set is itself contained in the set:

\[ \forall a,b\in C, x \in [0,1]: x(b-a)+a \in C \]
a convex set and a non-convex set
A convex set is a set that contains every line between two points in the set. Red lines show that the set on the right is not convex.

It turns out that convexity is exactly the property you want objects to have to make them easy to perform collision and distance checks between. We'll see a bit of why this is below.

Convex objects are so nice to work with that you'll often find yourself splitting more complex objects (e.g., complicated polygons) into simple convex objects (e.g., triangles) in order to perform collision or distance checks. This process is known as convex decomposition, and it is worth looking into (especially) approximate convex decomposition before writing your own algorithms. (It's also generally worth asking your artist if they would be willing to hand-build such a decomposition, since they will do a better job and it's generally pretty easy to do.)

Minkowski Sum

The Minkowski Sum of two sets \(A\) and \(B\) is the set of sums of elements of the sets:

\[ A + B \equiv \{x + y \;|\; x \in A, y \in B\} \]

I'll also use \( A - B \) to denote the set of differences of elements.

There are many notions of adding and subtracting sets. You may be familiar, for instance, with set union \( A \cup B \) and difference \(A \setminus B \). These are dramatically different than the sum \( A + B \) and difference \( A - B \) defined above. Explain the difference.

This is much easier to think about if we draw a picture:

the minkowski sum and difference of two sets
The Minkowski Sum of \(A\) and \(B\) is what you get if you draw a copy of \(B\) offset by every point in \(A\) (or visa-versa).

The Minkowski Sum has some remarkable and useful properties for collision detection, which the following exercise will ask you to reason about.

Sets \(A\) and \(B\) intersect if and only if the Minkowski difference \( A - B \) contains \( 0 \). Explain why.

Points \(a \in A\) and \(b \in B\) are a pair of closest points in \(A\) and \(B\) if and only if \( a - b \) is a closest point to \( 0 \) in the Minkowski difference \( A - B \). Explain why.

The Minkowski Sum (or difference) of two convex sets is, itself, convex. Explain why.

The above three observations form the basis for the GJK algorithm, a general-purpose closest-point / collision algorithm which is powerful and worth knowing about, but is also typically considered very difficult to implement properly.

It is also worth noting that explicitly computing the Minkowski sum or difference of simple objects is entirely feasible, and a good way to write simple collision checks which are straightforward to visualize and debug.

Separating Axis

Perhaps paradoxically, the easiest way to demonstrate that two objects collide is to check all of the ways they might not collide and -- if they don't not collide in any of these ways -- conclude that they must collide.

Your key tool in any collision check between two convex objects is the separating axis theorem, which states that two convex sets do not overlap if and only if there exists a direction along which their projections do not overlap. This direction is called the "separating axis", and there may be one than one.

two convex sets and their separating axis
If two convex sets do not overlap, then there is at least one separating axis (a direction along which their projections do not overlap).

A key observation which you should take a moment to reason about using intuitions from the Minkowski Sum discussion above is that for two non-overlapping convex polytopes, there always exists a separating axis perpendicular to a face of one of the polytopes.

Thus, if you want to write a check for collisions between two convex polygons your code can check all the edges of each polygon to see if its perpendicular works as a separating direction. If it finds a separating direction among the perpendiculars to the edges, then it can report that the polygon's don't intersect; otherwise, it can report that they must intersect. (Though, unfortunately, it won't have computed an intersection point.)

Admittedly, it would generally be just as quick to explicitly compute the intersection by repeated half-plane clipping steps, and the resulting overlap polygon would also allow your code to readily compute a collision location and normal. This is worth thinking about for a moment. (And, really, writing up convex/convex polygon intersection is worth doing. I'd have you write it right here if this weren't already a bit long as a lesson. It's surprisingly enjoyable code to put together.)

A totally incidental claim for which I currently don't recall a satisfying proof is that any arrangement of non-intersecting convex objects may be transformed into any other arrangement of the same objects by a sequence of moves, where each move involves translating a single object along a straight, intersection-free path.

Continuous Collision Detection

When objects are travelling very quickly, you may wish to check for collisions at all points along their trajectory.

While this may at first seem complex, notice that an object \( A \) traveling along linear path \( v_a \) and object \( B \) traveling along linear path \( v_b \) will intersect at time \( t \) such that the Minkowski sum \[ (A + tv_a) - (B + tv_b) = (A - B) + t(v_a - v_b) \] contains zero. In other words, computing the continuous collision time between two objects moving along linear paths can be computed by tracing an appropriate ray into their Minkowski difference.

continuous collision between a circle and triangle
Computing the instant of collision between two moving objects by tracing a ray into their Minkowski difference.

In the picture it looks like the intersection point \( x \) between \( A - B \) and the ray \(v_b - v_a\) is at the eventual "real-world" intersection location between \(A\) and \(B\). Is this a coincidence? Why or why not?

Note that this approach generally must handle rotations separately, either by using rotationally-symmetric collision geometry for the object of interest or by resolving rotations using instant-to-instant checking and doing continuous checks for translation only.

Voronoi Regions

A Voronoi diagram divides space into a set of Voronoi cells (regions), each of which is closest to some specific feature of an object.

The Voronoi diagram is interesting to think about when computing closest-point queries because if you can figure out what Voronoi cell a point is, you know what feature it must be closest to.

Signed Distance Fields

While all the above machinery can be really useful for doing collision or distance checks, sometimes the right answer really is to just compute a look-up table. A signed distance field is just that: a look-up table of how far inside (or outside) of an object every point in some volume of space around that object is.

Say you want to check if a point is inside the object. With a signed distance field this operation takes a single memory access and comparison -- distanceField.at(pt) < 0. For things where low-res collisions are enough -- e.g. GPU particle systems or ray-marched soft shadows -- signed distance fields are a brilliant tool to keep in your back pocket.

Hmm, I said "for low-res collisions." What's the problem with storing high-resolution signed distance fields?

It is often useful to know not just how far from an object's surface a given point is, but also what the closest point on the surface is. One might, therefore, decide to store a "closest-point-on-the-surface field" which stores the \( (x,y,z) \) location of the closest point on the surface of the object. What are advantages and disadvantages of a "closet-point-on-the-surface" field?

Valid Spaces

We generally discuss collision as being about keeping players out of level geometry; that is, testing and rejecting invalid configurations. But, in some cases, it actually makes sense to instead think about keeping players inside the level interior; that is, constraining the world to the space of valid configurations. (An idea akin to generalized coordinates, coordinates that represent a physical system in terms of its degrees of freedom.)

Imagine, for example, that you are building a 3D exploration game where a player walks around a complex world. Because the player can only walk (and not, say, jump), the player can only ever access a limited subset of the space.

the walkmesh from a portion of cubedex3
The hand-authored walkmesh (pink) used in TCHOW llc's "The Cubedex of Flowers and Stone."

If your asset pipeline could pre-compute (or, indeed, you could hand-author) this manifold of positions -- say, as a triangle mesh -- your runtime code could then prevent the player from intersecting the world by keeping the player's position on the valid position mesh. This turns a complex 3D problem (is the player intersecting any object in the world?) into a simple 2D problem (is the player's position still in the same triangle? If not, what neighboring triangle should it be in?).

This is a Walk Mesh approach to collision avoidance (not detection because collisions with level geometry are never checked). It's very convenient for games where you want players to walk (or run) around largely static scenes and you don't want to deal with fancier collision detection schemes.

You will be building a WalkMesh implementation yourself in our next lesson. To get you started, think about a few design questions.

How might your code represent a point on the surface of a triangle mesh?

How might your code convert between points in extrinsic 3D space and the intrinsic 2D triangle-surface space of your representation?

What function(s) might a WalkMesh expose to game code in order to make it easy to control player movement?