In my previous entry, I discussed the requirements of card rendering in Riven X as well as using depth testing to satisfy those requirements. I concluded by mentioning that depth testing had its own problems and that I would further detail my design process later. So let’s get on with it, shall we?
Why depth testing doesn’t cure cancer
Depth testing can be extremely useful in 3D applications, but Riven X is only a simple 2D compositing engine. As such, depth testing isn’t needed to obtain the desired image; it would only serve as an optimization technique. So for depth testing to make sense, it needs to offer a net advantage over other methods, which is far from obvious.
First of all, depth testing has a memory cost. Not only must you maintain a set of z coordinates for all your primitives, you must also instruct OpenGL to maintain a depth buffer as part of its framebuffer. Granted, a given card will rarely contain a large number of primitives, but Riven X will likely need to send large textures to the GPU each frame (such as for movies). Consequently, memory bandwidth to the GPU becomes a precious commodity that needs to be managed tightly.
Secondly, depth testing adds a level of complexity that other more naïve solutions do not have. Specifically, because Riven X is a 2D composite, there is no depth information in a card’s description. As such, it would be necessary to generate z coordinates for each renderable element before depth testing could be used to control display order. This would essentially boil down to a simple quantization problem (m elements to render, n possible depth values). Nothing to worry about but why do more when you can do less? In addition, elements can be enabled and disabled. That implies computing a new depth value whenever an element is enabled, and eventually a complete re-quantization of all enabled elements when you reach the upper bound of the dynamic range of your depth values.
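To make the quantization problem concrete, here is a minimal sketch in plain C of how m elements could be spread over n depth slots. The function name and the even-spacing policy are assumptions made for illustration, not something Riven X does. The free slots left between neighbours are where newly enabled elements could go; once they run out, everything has to be recomputed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: spread m elements evenly over n depth slots.
 * order 0 is the back-most element; higher orders are drawn in front.
 * Returns a depth in [0, 1), where a smaller value is nearer. */
static float depth_for_element(size_t order, size_t m, size_t n)
{
    assert(m > 0 && m <= n && order < m);
    size_t step = n / m;         /* slots reserved per element */
    size_t slot = order * step;  /* slot taken by this element */
    /* Slot 0 maps to the farthest usable depth, just under 1.0. */
    return (float)(n - 1 - slot) / (float)n;
}
```

With 4 elements and 16 slots, each element gets 3 free slots in front of it, which is the headroom available before a complete re-quantization becomes necessary.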
Finally, depth testing doesn’t entirely solve the “what to render” problem. Indeed, depth testing can be used to filter out “too far” objects, but you need to define your “far plane” or “depth clipping value” and initially set the depth value of all your primitives to that cutoff threshold.
The texture problem
If you recall from the previous entry, one of the big advantages of depth testing is the ability to render a large number of primitives that share common state properties in one OpenGL operation (using vertex arrays and the likes of glDrawArrays). This is the typical case of 3D applications, where models are aggregates of various types of primitives (triangles, strips, quads, etc). However, this is not applicable to Riven X because each primitive is essentially used as a billboard on which a unique texture is applied. Typically, a 3D model will be made of a few sub-sections, each of which is made of n primitives and one texture (or a few with multitexture) that will be applied to all the primitives using each primitive’s texture coordinates. If I were to do the same in Riven X, I’d have to dynamically generate a large composite texture with each element’s texture in it. Because each of those sub-regions will be rectangular, it is likely I’d run out of texture space rapidly. And that doesn’t include the fact I’d have to write code to do the composite in an optimal fashion. Thermodynamics wins, solution dismissed, I remain in my lazy state.
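As a rough illustration of why a composite texture fills up quickly, here is a sketch of the simplest packing scheme I can think of, a "shelf" packer, in plain C. The Atlas type, atlas_place and the sizes used below are all hypothetical and are not part of Riven X. A smarter packer would waste less space, but it is also more code to write.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical packer state for one square composite texture. */
typedef struct {
    int size;      /* atlas width and height in texels */
    int cursor_x;  /* next free x on the current shelf */
    int shelf_y;   /* y origin of the current shelf */
    int shelf_h;   /* height of the tallest rectangle on the shelf */
} Atlas;

/* Place a w-by-h rectangle left to right, opening a new shelf when the
 * current one is full. Returns false when the rectangle does not fit. */
static bool atlas_place(Atlas *a, int w, int h, int *out_x, int *out_y)
{
    assert(w > 0 && h > 0);
    if (w > a->size || h > a->size) return false;
    if (a->cursor_x + w > a->size) {  /* no room left on this shelf */
        a->shelf_y += a->shelf_h;
        a->cursor_x = 0;
        a->shelf_h = 0;
    }
    if (a->shelf_y + h > a->size) return false;  /* atlas is full */
    *out_x = a->cursor_x;
    *out_y = a->shelf_y;
    a->cursor_x += w;
    if (h > a->shelf_h) a->shelf_h = h;
    return true;
}
```

Taking a 1024 by 1024 atlas and 608 by 392 rectangles as illustrative sizes, only two rectangles fit before atlas_place fails, leaving well over half of the texture unused.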
A simple array
Riven X uses a simple enabled elements array to drive the card rendering loop, which looks like this:
NSEnumerator *renderListEnumerator = [_renderList objectEnumerator];
id anObject = nil;
Class lastRenderableClass = nil;
while((anObject = [renderListEnumerator nextObject])) {
    if([anObject isKindOfClass:pictureClass]) {
        unsigned short pictureIndex = [(NSNumber *)anObject unsignedShortValue];

        // bind the pictures VBO and arrays only when we must
        if(lastRenderableClass != pictureClass) {
            lastRenderableClass = pictureClass;
            glBindBuffer(GL_ARRAY_BUFFER, _pictureCoordsVBO);
            glVertexPointer(2, GL_FLOAT, 0, BUFFER_OFFSET(0));
            glTexCoordPointer(2, GL_FLOAT, 0, BUFFER_OFFSET(_pictureTexCoordsOffset));
        }

        // bind the picture texture and draw the quad
        // (assuming each picture owns 4 consecutive vertices in the VBO,
        // its first vertex is at pictureIndex * 4)
        glBindTexture(GL_TEXTURE_RECTANGLE_EXT, _pictureTextureObjects[pictureIndex]);
        glDrawArrays(GL_QUADS, pictureIndex * 4, 4);
    }
}
It uses the class of each element to know how to render it (right now, I’m only rendering pictures) and tries to minimize state changes as much as possible. An array is a relatively simple solution: activation is reasonably cheap, but deactivation is not fast because the element has to be found first. It should be noted that deactivation wouldn’t be fast either if I were using depth testing (both are linear), and in both cases this could be fixed with a reference table from element resource ID to element index.
However, sampling indicates Riven X renders a lot more often than it enables or disables elements, so this isn’t a significant problem.
So this is how it works for now, and until I have a more complete application that I can sample in a more realistic manner, an array will do just fine.
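For reference, the lookup table mentioned above could be sketched as follows in plain C. This is not what Riven X does today. It assumes that a card’s elements have a fixed draw order, so every element keeps a permanent slot and enabling or disabling only flips a flag. The type, the function names and the two size limits are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ELEMENTS 64       /* assumed per-card upper bound */
#define MAX_RESOURCE_ID 1024  /* assumed range of resource IDs */
#define NO_INDEX (-1)

typedef struct {
    unsigned short resource_ids[MAX_ELEMENTS];  /* slots in draw order */
    bool enabled[MAX_ELEMENTS];                 /* one flag per slot */
    int index_of[MAX_RESOURCE_ID];              /* resource ID -> slot */
    size_t count;
} RenderList;

static void render_list_init(RenderList *list)
{
    list->count = 0;
    for (size_t i = 0; i < MAX_RESOURCE_ID; i++) list->index_of[i] = NO_INDEX;
}

/* Register an element once, when the card loads, in draw order. */
static void render_list_add(RenderList *list, unsigned short resource_id)
{
    assert(list->count < MAX_ELEMENTS && resource_id < MAX_RESOURCE_ID);
    list->resource_ids[list->count] = resource_id;
    list->enabled[list->count] = false;
    list->index_of[resource_id] = (int)list->count;
    list->count++;
}

/* Constant-time enable or disable; returns false for unknown IDs. */
static bool render_list_set_enabled(RenderList *list,
                                    unsigned short resource_id, bool on)
{
    if (resource_id >= MAX_RESOURCE_ID) return false;
    int slot = list->index_of[resource_id];
    if (slot == NO_INDEX) return false;
    list->enabled[slot] = on;
    return true;
}
```

The trade-off is that the render loop would then walk every slot and skip the disabled ones, so it becomes linear in the total number of elements rather than in the number of enabled ones.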