A Transformation Hierarchy
Often, in a 3D scene, it is convenient to specify the transformations of scene elements relative to each-other. For instance, items sitting on a table should move along with the table; items in a box should move with the box; and the transformations of the links of a robot arm make sense to write in terms of the transformations of their parent links.
One way to implement this is to think of every object in the scene as having a Transform
that indicates its position relative to a parent Transform
.
Indeed, such a scene graph (transformation hierarchy) is at the core of many 3D modelling packages.
This document develops a basic implementation of such a scene graph in C++ for use in your games. This implementation is included as part of the 15-566-f20-base2 code as Scene.hpp and Scene.cpp.
Transforms and Parents
To start with, we'll need some way of writing down a 3D transformation. In so doing, we will also be (effectively) specifying / limiting the kinds of transformations our code can represent.
Almost all GLSL shader code you read will expect you to supply transformations as mat4
(4-column, 4-row matrices of floating-point numbers) or mat4x3
(4-column, 3-row matrices of floating-point numbers).
It use these matrices to transform 3D points by using matrix-vector multiplication:
//vertex shader from ShowSceneProgram.cpp:
#version 330
uniform mat4 OBJECT_TO_CLIP;
uniform mat4x3 OBJECT_TO_LIGHT;
uniform mat3 NORMAL_TO_LIGHT;
in vec4 Position;
in vec3 Normal
in vec4 Color;
in vec2 TexCoord;
out vec3 position;
out vec3 normal;
out vec4 color;
out vec2 texCoord;
void main() {
gl_Position = OBJECT_TO_CLIP * Position;
position = OBJECT_TO_LIGHT * Position;
normal = NORMAL_TO_LIGHT * Normal;
color = Color;
texCoord = TexCoord;
}
So we could decide that since shaders need mat4x3
values, our Transform
should contain exactly this:
struct PossibleTransform {
glm::mat4x3 local_to_parent;
};
But these will be a real hassle for our game code to deal with. So, instead, let's use a representation that's a bit easier to edit:
struct Transform {
glm::vec3 position;
glm::quat rotation;
glm::vec3 scale;
//NOTE: transform order: position + rotation * scale * pt
//...
};
This is much nicer.
Rotations are now pulled out as a quaternion (see this note on rotations for why that is nicer),
and we can write position
instead of local_to_parent[3]
.
Now that we've got the basic transformation part of the transformation written down, we can finish it up by adding a parent transform (since that's the whole point of a scene hierarchy) and some helper functions to make matrices for our shaders. And we might as well add a name as well since that's good for debugging (and looking stuff up with):
struct Scene {
struct Transform {
//Transform names are useful for debugging and looking up locations in a loaded scene:
std::string name;
//The core function of a transform is to store a transformation in the world:
glm::vec3 position = glm::vec3(0.0f, 0.0f, 0.0f);
glm::quat rotation = glm::quat(1.0f, 0.0f, 0.0f, 0.0f); //n.b. wxyz init order
glm::vec3 scale = glm::vec3(1.0f, 1.0f, 1.0f);
//The transform above may be relative to some parent transform:
Transform *parent = nullptr;
//It is often convenient to construct matrices representing this transformation:
// ..relative to its parent:
glm::mat4x3 make_local_to_parent() const;
glm::mat4x3 make_parent_to_local() const;
// ..relative to the world:
glm::mat4x3 make_local_to_world() const;
glm::mat4x3 make_world_to_local() const;
//since hierarchy is tracked through pointers, copy-constructing a transform is not advised:
Transform(Transform const &) = delete;
//if we delete some constructors, we need to let the compiler know that the default constructor is still okay:
Transform() = default;
};
}
The to_parent
/parent_to
member functions write down matrices that convert from local space to/from the space of the parent transform.
These involve writing down some matrices:
glm::mat4x3 Scene::Transform::make_local_to_parent() const {
glm::mat3 rot = glm::mat3_cast(rotation);
return glm::mat4x3(
rot[0] * scale.x,
rot[1] * scale.y,
rot[2] * scale.z,
position
);
}
glm::mat4x3 Scene::Transform::make_parent_to_local() const {
glm::vec3 inv_scale;
inv_scale.x = (scale.x == 0.0f ? 0.0f : 1.0f / scale.x);
inv_scale.y = (scale.y == 0.0f ? 0.0f : 1.0f / scale.y);
inv_scale.z = (scale.z == 0.0f ? 0.0f : 1.0f / scale.z);
glm::mat3 inv_rot = glm::mat3_cast(glm::inverse(rotation));
inv_rot[0] *= inv_scale;
inv_rot[1] *= inv_scale;
inv_rot[2] *= inv_scale;
return glm::mat4x3(
inv_rot[0],
inv_rot[1],
inv_rot[2],
inv_rot * -position
);
}
How does one implement the to_world
/world_to
member functions? Recursively, of course:
glm::mat4x3 Scene::Transform::make_local_to_world() const {
if (parent) {
return parent->make_local_to_world()
* glm::mat4(make_local_to_parent());
} else {
return make_local_to_parent();
}
}
glm::mat4x3 Scene::Transform::make_local_to_parent() const {
if (parent) {
return make_parent_to_local()
* glm::mat4(parent->make_world_to_local());
} else {
return make_parent_to_local();
}
}
Everyone Gets A Transform
Now that we have a transform structure, let's attach everything else in our 3D scene to one:
struct Scene {
struct Transform { /* see above */ };
struct Camera {
Transform *transform;
//...
};
struct Drawable {
Transform *transform;
//...
};
std::list< Camera > cameras;
std::list< Drawable > drawables;
//...
};
Notice that the Transform
is pointed to rather than contained within the objects.
This means you could have several Drawable
s all with the same transform (and also have a Camera
attached to that transform).
Convenient!
We can now sketch a basic drawing function which uses the transform
member of scene objects to generate appropriate matrices:
void Scene::draw(Scene::Camera const &camera) {
//NOTE: just a sketch; for full details read Scene.cpp
//compute the projection matrix using the supplied camera:
glm::mat4 world_to_clip =
camera.make_projection()
* camera.world_to_local();
for (auto const &drawable : drawables) {
glm::mat4x3 object_to_world =
drawable.transform->make_local_to_world();
glUseProgram(drawable.program);
//OBJECT_TO_CLIP takes vertices from local space to clip space:
if (drawable.OBJECT_TO_CLIP_mat4 != -1U) {
glm::mat4 object_to_clip = world_to_clip * glm::mat4(object_to_world);
glUniformMatrix4fv(drawable.OBJECT_TO_CLIP_mat4, 1, GL_FALSE, glm::value_ptr(object_to_clip));
}
//draw the object's vertices:
glBindVertexArray(drawable.vao);
glDrawArrays(drawable.type, drawable.start, drawable.count);
}
glUseProgram(0);
glBindVertexArray(0);
}
Of course, this isn't perfect, but it's a good start. I address some potential concerns below.
Concern: DAG Cycles
Concern: Inefficiency
A Note on Memory
When you read Scene.hpp you will notice that the Scene
structure maintains std::list
s of various scene objects (Transform
s, etc.).
This is because it is convenient to set things up so that when a Scene
is deallocated, all of the associated objects are also deallocated.
Drawables Represent Pipelines
As a final note, I want to make sure you take a look at the real Scene::Drawable
structure.
This structure writes down everything needed to run the OpenGL pipeline and, well, draw a thing:
struct Drawable {
Drawable(Transform *transform_) : transform(transform_) { assert(transform); }
Transform * transform;
//Contains all the data needed to run the OpenGL pipeline:
struct Pipeline {
GLuint program = 0; //shader program; passed to glUseProgram
//attributes:
GLuint vao = 0; //attrib->buffer mapping; passed to glBindVertexArray
GLenum type = GL_TRIANGLES; //what sort of primitive to draw; passed to glDrawArrays
GLuint start = 0; //first vertex to draw; passed to glDrawArrays
GLuint count = 0; //number of vertices to draw; passed to glDrawArrays
//uniforms:
GLuint OBJECT_TO_CLIP_mat4 = -1U; //uniform location for object to clip space matrix
GLuint OBJECT_TO_LIGHT_mat4x3 = -1U; //uniform location for object to light space (== world space) matrix
GLuint NORMAL_TO_LIGHT_mat3 = -1U; //uniform location for normal to light space (== world space) matrix
std::function< void() > set_uniforms; //(optional) function to set any other useful uniforms
//texture objects to bind for the first TextureCount textures:
enum : uint32_t { TextureCount = 4 };
struct TextureInfo {
GLuint texture = 0;
GLenum target = GL_TEXTURE_2D;
} textures[TextureCount];
} pipeline;
};
First, notice that all of the state needed to run the OpenGL pipeline is wrapped up in a Pipeline
structure.
This is going to become relevant later on when we will want to have objects that draw themselves differently in different "passes" (e.g., they may use a different shader or with different geometry when rendering shadows).
In this case, we'll change pipeline
to pipelines[PassCount]
.
Further, it's quite useful when you want to set up a bunch of objects with the same shader, because you can create a default Scene::Transform::Pipeline
for that shader and assign it to all the objects, only changing the mesh-data-related fields per-object.
Second, notice that -- despite how many lines it takes up -- Drawable
is pretty lightweight -- 80-ish bytes, depending on the size of a function pointer.
You should think of them more as lightweight instances of the heavy geometry stored in MeshBuffer
; it's okay to have hundreds (possibly thousands, though see that note on efficiency) of them in your scene.
This leads to a particular level/scene design strategy where you make a few simple objects and then use many many instances of them throughout the world. This is a good strategy for world-creation productivity, and I recommend it.