Following in the tracks of my colleagues (Eternal coding) (David Rousset’s Blog), I am taking the opportunity to tell you the fabulous story of 3D content creation from the designer's and artist's point of view. As you may know, before working for Microsoft, I was deeply involved in the creation of a unique 3D engine called NOVA. My main duty at that time was to provide user feedback. Will the 3D artist use this feature? Will the artist understand the purpose of such an action? How can that item be created without a single line of code? So, as Microsoft launched the new Internet Explorer preview (AKA IE11) yesterday, which can take advantage of 3D WebGL content, let’s take a look at the main concepts of that fantastic world. We will begin today with a generic article.
This article is intended for newcomers. Some parts take shortcuts and may not be perfectly accurate. We will go deeper in the following articles.
In my job, I often speak to old sages who keep saying that performance is the key to a good application. And off we go for hours, speaking of integrated code, of the specifics of C++, of native code, and so on. Today, with very high-performance hardware in every laptop, or even phone, it is tempting to say that every PC or device is powerful enough to let you play down the optimization part and simply leverage that power for very high quality visualization.
But that idea is a mistake. All right, we can show many more triangles. OK, we can easily bring in more particles. Fine, I set everything up for a high-resolution screen. Perfect. But to bring even more, to go further than that, you need to optimize. The right question here is: damn it, I have a very cool 3D engine, exploiting the latest technologies, based on a robust platform, so how do those guys in AAA game productions obtain such awesome graphics? Do you see this or that feature? These are certainly “engine-killer” features, so how do they do that?
The answer lies in a hidden truth: optimization. And not only in the engine’s code. That would be too simple. Performance is tied to the assets. In fact, performance is often an artist's matter. You can think of optimization as a highway where all the cars have a minimum speed. The minimum speed is 30 frames per second. Below that speed, the visualization becomes very tiring and frustrating. No car may drop below that speed. If you have only a few cars, they reach their destination very quickly. As the number of cars increases, they slow down so that everyone can get to their goal. But they still cannot go below the lower speed limit. And if they do (too much traffic), you have to make decisions. Should I remove a bunch of cars? Should I override the low-speed rule?
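To put a number on that minimum speed: at 30 frames per second, everything that happens in one frame has to fit into roughly 33 milliseconds. Here is a minimal Python sketch of that arithmetic. The function names and the sample timings are made up for illustration and do not come from any engine.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Time available to build one frame, in milliseconds."""
    return 1000.0 / target_fps


def fits_budget(frame_cost_ms: float, target_fps: float = 30.0) -> bool:
    """True if a frame that costs frame_cost_ms keeps up with target_fps."""
    return frame_cost_ms <= frame_budget_ms(target_fps)


if __name__ == "__main__":
    print(round(frame_budget_ms(30), 1))  # 33.3
    print(fits_budget(25.0))              # True: still above the speed limit
    print(fits_budget(40.0))              # False: too much traffic
```

When `fits_budget` returns `False`, you are facing the decisions described above: remove some cars, or accept a lower frame rate.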
On the other hand, if you’re a good professional with optimization in mind, all your cars are very fast and they all take the Holiday road at 3000 frames per second. But then you simply miss the target: as an artist, your goal is to display beautiful things, not quick things. The landscape is the major thing, not the cars! If your performance is too high, that probably does not mean that you’re good at optimization. It means that you have not described the 3D world enough. Too slow, and the trip is frustrating and irritating; too quick, and it’s nonsense.
And of course, depending on the platform, the engine will not respond within the same range of performance, so you have to take into account the device your little 3D world will run on.
One last word about that: when you’re looking for performance, do not focus too much on how many polygons you have in your 3D space. This is not that relevant. It may be a good statistical overview, but remember that a highly polygonized object can be twice as quick to display depending on what it is combined with (texture size optimization, effects, skinning…).
The 3D guys around the world are bound in a fellowship, speaking an esoteric language. Don’t be afraid of that. The concepts are quite simple. Really, it is no big deal. Imagine a theater. You’re the only spectator in here (much more like a director, in fact…). Let’s say you are the point of view (the camera). In front of you is the scene. This is where the actors perform. Let’s call them objects. They all have a specific role to play (still, animated, triggers…), but they cannot shine without lights, nor without costumes (shaders). As the camera operator, you can add filters to embellish the cruel reality (post-effect shaders). And since the space surrounding the stage remains in the shadows, we will ignore it and decide that everything that is not in the field of view simply does not exist (the frustum).
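The "what is not in the field of view does not exist" rule can be sketched in a few lines of Python. This is a simplification: a real frustum is a truncated pyramid bounded by six planes, while the sketch below uses a simple view cone with a near and a far limit. The function name, parameters and default values are assumptions chosen for the example.

```python
import math


def is_in_view(camera_pos, camera_forward, point,
               fov_degrees=60.0, near=0.1, far=100.0):
    """Rough visibility test: is `point` inside the camera's view cone?"""
    to_point = [p - c for p, c in zip(point, camera_pos)]
    distance = math.sqrt(sum(v * v for v in to_point))
    forward_length = math.sqrt(sum(v * v for v in camera_forward))
    if distance == 0.0 or forward_length == 0.0:
        return False
    # Distance of the point along the viewing direction.
    depth = sum(t * f for t, f in zip(to_point, camera_forward)) / forward_length
    if depth < near or depth > far:
        return False  # behind the camera, too close, or too far away
    # Cosine of the angle between the viewing direction and the point.
    cos_angle = depth / distance
    return cos_angle >= math.cos(math.radians(fov_degrees) / 2.0)


if __name__ == "__main__":
    camera = (0.0, 0.0, 0.0)
    forward = (0.0, 0.0, 1.0)
    print(is_in_view(camera, forward, (0.0, 0.0, 10.0)))   # True: on stage
    print(is_in_view(camera, forward, (0.0, 0.0, -10.0)))  # False: behind you
    print(is_in_view(camera, forward, (10.0, 0.0, 1.0)))   # False: in the wings
```

Objects that fail this kind of test can be skipped entirely, which is why the engine does not pay for what the camera cannot see.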
All those actors on stage, all those items… Such a mess. To make it work, you have to organize things a little. For instance, the actors should not say their lines at the same time. They should not even all be on the stage at the same time! The rules are pretty simple: performance matters, sure, but as an artist, you want everyone to be heard and seen for their respective qualities! So you have to follow the script and make the most of everyone's role.
Lucky you, you have a stage manager in the crew (your CPU) who takes care of the items on the stage: too many items, and the CPU gets lost and slows down the show. Too few items, and the CPU idles. And yes, think about this process as a “per image” workload. For every image (among the 30 per second minimum), the CPU checks all the items to organize the workload on the scene. One more thing here: your CPU, even if you’re lucky or rich enough to own a high-end one, is not God. Its gifts are much more limited. For instance, in real life, we have only one world, from distant galaxies to atoms made of protons. The CPU cannot be that accurate and must work at a suitable scale.
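The scale problem is easy to see with numbers. Positions are very often stored as 32-bit floating-point values, which only keep about seven significant digits. The Python sketch below rounds a value to 32 bits using the standard `struct` module. The scenario (one unit equals one meter, an object moved by a quarter of a unit) is an assumption made up for the example.

```python
import struct


def as_float32(value: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]


if __name__ == "__main__":
    # Near the origin, a quarter-unit offset survives.
    print(as_float32(1.25))           # 1.25
    # Ten million units away, the same offset is lost in rounding.
    print(as_float32(10_000_000.25))  # 10000000.0
```

This is why a scene is usually built around a sensible unit and kept reasonably close to the origin, instead of modeling galaxies and atoms in the same coordinate system.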
Once the organization is done, once all the items are on stage playing their roles, you must broadcast the result. This is where everything gets tricky. Imagine a 3D world translated into a 2D world (your screen). Now imagine your camera as the GPU (graphics board). To display all the items, the GPU takes over the data prepared by the CPU. But the GPU has its own logic. To decide what hides what, it keeps, for every pixel, the distance of the closest thing drawn so far, and a new pixel only replaces the old one if it is nearer to the camera. This is the Z-buffer (or depth buffer). And the GPU is much more fine-grained than the CPU, in the sense that it deals with the underlying structure of an object (faces, triangles, vertices, edges). This becomes tricky when you add transparency to an object, because a see-through pixel has to be blended with what lies behind it, so transparent items must be rendered from the most distant to the closest. Remember, the CPU considers the object as an item, but the GPU considers vertices as items. If an object is not opaque, which pixel should be displayed first, taking the transparency into account?
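One common way to deal with the transparency question is to draw the opaque objects first and let the depth test sort them out, then draw the transparent objects from the farthest to the closest. Here is a minimal Python sketch of that ordering. The item names and the dictionary fields (`name`, `position`, `opaque`) are hypothetical, and drawing opaque objects front to back is a usual convention rather than a rule of any particular engine.

```python
import math


def render_order(items, camera_pos):
    """Opaque items first (near to far), then transparent ones (far to near)."""
    def distance(item):
        return math.dist(item["position"], camera_pos)

    opaque = sorted((i for i in items if i["opaque"]), key=distance)
    transparent = sorted((i for i in items if not i["opaque"]),
                         key=distance, reverse=True)
    return [i["name"] for i in opaque + transparent]


if __name__ == "__main__":
    scene = [
        {"name": "glass", "position": (0.0, 0.0, 2.0), "opaque": False},
        {"name": "wall", "position": (0.0, 0.0, 10.0), "opaque": True},
        {"name": "window", "position": (0.0, 0.0, 8.0), "opaque": False},
        {"name": "crate", "position": (0.0, 0.0, 5.0), "opaque": True},
    ]
    print(render_order(scene, (0.0, 0.0, 0.0)))
    # ['crate', 'wall', 'window', 'glass']
```

Note that the sort works per object, while the GPU works per triangle and per pixel, so two transparent objects that cross each other can still be blended in the wrong order.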
So, to sum up: the CPU organizes the scene. Then the GPU builds the scene and tessellates the objects (creates faces). Then it lights the triangles with the lights available in the scene, smoothing the angles between them. The item is still naked so far. The next step is to apply a shader to the surface of the objects to render them in a specific way. We will see that later on.
Finally, the result is turned into pixels and sent to the display. This is mainly a process that converts abstract data into concrete images (from 3D to 2D and from math to pixels).
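The "from 3D to 2D" step boils down to a perspective projection: the farther a point is, the closer it lands to the center of the screen. Here is a minimal Python sketch, assuming a camera placed at the origin and looking along the +z axis. The function name, the 60 degree vertical field of view and the 1280x720 screen are assumptions picked for the example.

```python
import math


def project_to_screen(point, fov_degrees=60.0, width=1280, height=720):
    """Project a camera-space point (x, y, z) to pixel coordinates.

    Returns None when the point is behind the camera.
    """
    x, y, z = point
    if z <= 0.0:
        return None
    # Scale factor derived from the vertical field of view.
    f = 1.0 / math.tan(math.radians(fov_degrees) / 2.0)
    aspect = width / height
    # Dividing by z is what makes distant things look smaller.
    ndc_x = (f / aspect) * x / z
    ndc_y = f * y / z
    # Map from the [-1, 1] range to pixels (y grows downward on screen).
    pixel_x = (ndc_x + 1.0) * 0.5 * width
    pixel_y = (1.0 - ndc_y) * 0.5 * height
    return (pixel_x, pixel_y)


if __name__ == "__main__":
    print(project_to_screen((0.0, 0.0, 5.0)))   # (640.0, 360.0): screen center
    print(project_to_screen((1.0, 0.0, 5.0)))   # to the right of the center
    print(project_to_screen((1.0, 0.0, 50.0)))  # same offset, much farther away
```

A real pipeline does this with matrices and for every vertex, but the idea is the same.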
All those constraints obviously lead to a simple thought: try to get the best performance from the CPU and the GPU combined. If you have more power on the CPU side, the CPU will always wait for the GPU to complete its task. This once made development on the PS3 quite tricky because of the Cell processor and its eight processing units, for instance.
A great CPU is fine for organization and for additional computation in your scenes (statistics may come in handy if you have a multi-core CPU). The GPU is perfect for final computation and some extras (like physics…). Don’t forget that a game or a great simulation is much more than that. You have to take care of the AI, the sound, the physics, the network…
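A simple way to picture this balance: if the CPU and the GPU work in parallel on successive frames, the slower of the two sets the pace. The Python sketch below is a deliberately simplified model built on that assumption. The function names and the timings are invented for the example.

```python
def frame_time_ms(cpu_ms: float, gpu_ms: float) -> float:
    """Simplified model: the slower of the two units sets the frame time."""
    return max(cpu_ms, gpu_ms)


def bottleneck(cpu_ms: float, gpu_ms: float) -> str:
    """Name the unit the other one is waiting for."""
    if cpu_ms > gpu_ms:
        return "CPU"
    if gpu_ms > cpu_ms:
        return "GPU"
    return "balanced"


if __name__ == "__main__":
    # A fast CPU paired with a busy GPU: the CPU spends its time waiting.
    print(bottleneck(5.0, 30.0))     # GPU
    print(frame_time_ms(5.0, 30.0))  # 30.0
```

In that example, making the CPU side even faster changes nothing. The time is better spent lightening the GPU work, or giving the idle CPU something useful to do.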
Speaking of the GPU, keep in mind that an image is not computed in one go but in successive passes called draw calls. The fewer draw calls you have, the faster the rendering and the more fps you get.
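A classic way to save draw calls is to group the objects that share the same material so they can be sent to the GPU together. The Python sketch below only counts the groups. It assumes, as a simplification, that one material means one draw call, which real engines only achieve under extra conditions. The object and material names are invented.

```python
from collections import defaultdict


def group_by_material(objects):
    """Group (object_name, material_name) pairs by material."""
    batches = defaultdict(list)
    for name, material in objects:
        batches[material].append(name)
    return dict(batches)


if __name__ == "__main__":
    scene = [
        ("chair_1", "wood"),
        ("chair_2", "wood"),
        ("table", "wood"),
        ("lamp", "metal"),
        ("door_handle", "metal"),
    ]
    batches = group_by_material(scene)
    print(len(scene))    # 5 draw calls if every object is drawn on its own
    print(len(batches))  # 2 draw calls if objects are grouped by material
```

For an artist, this means that reusing a material across many objects is often cheaper than giving each object its own slightly different one.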
Each time you add a feature, you probably add a draw call somewhere in the process. For example, let’s take a closer look at the appearance of the objects. As we have seen, an object is basically drawn by the graphics card this way: first the vertices, then the edges, then it fills the triangles, applies the light shading, and possibly adds textures. Textures are bitmaps or procedural items that change the initial appearance of the object in an additive, multiplicative or subtractive way. A texture can be placed in a specific channel: the diffuse channel is for the general appearance. The ambient channel is useful to improve the lighting (and shadowing). The opacity channel is for transparent or semi-transparent surfaces. The reflection channel is useful if the object reflects its environment. The specular channel is for the relative glossiness of the surface, and so on. Depending on the engine, you can put a texture in each of these channels. The overall result is called a shader, which can be compiled on the fly with recent shader languages. Depending on how brave the engine developer is, you may gain extra shader features, such as Fresnel texture mapping, cube mapping, or per-pixel specular.
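To make the channels less abstract, here is one possible way to combine them for a single pixel, written in Python rather than in a shader language. The formula is an illustrative assumption, not the recipe of any particular engine: diffuse, ambient and light are multiplied, specular is added, and the reflection is mixed in according to a reflectivity factor. All names and values are hypothetical.

```python
def clamp01(value: float) -> float:
    """Keep a color component inside the displayable 0..1 range."""
    return max(0.0, min(1.0, value))


def shade_pixel(diffuse, light, ambient=1.0, specular=0.0,
                reflection=(0.0, 0.0, 0.0), reflectivity=0.0, opacity=1.0):
    """Combine texture channels into one (r, g, b, a) color."""
    color = []
    for d, r in zip(diffuse, reflection):
        lit = d * ambient * light + specular  # multiplicative, then additive
        mixed = lit * (1.0 - reflectivity) + r * reflectivity
        color.append(clamp01(mixed))
    return (color[0], color[1], color[2], clamp01(opacity))


if __name__ == "__main__":
    orange = (0.8, 0.4, 0.2)
    # Diffuse only, under a half-strength light.
    print(shade_pixel(orange, light=0.5))
    # Same pixel, darkened by an ambient map and with a blue sky reflected in it.
    print(shade_pixel(orange, light=0.5, ambient=0.5,
                      reflection=(0.0, 0.0, 1.0), reflectivity=0.3))
```

Each extra channel is one more texture to read and one more operation per pixel, which is where the cost of a rich material comes from.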
Oh, yes, I almost forgot to speak about the two ways to light an object. The first one is often the fastest because its cost depends on the number of vertices that make up the object. You take the vertices, light them according to the light vectors, and then apply the different channels on top. Otherwise, you can light every single pixel of the object. This process has recently become very smooth thanks to powerful new graphics boards. But it implies that the more pixels you have (especially at high resolutions), the more memory you use to gather the final colors of all the pixels, and possibly the slower the render will be…
With diffuse map
With diffuse + ambient map
And finally, on top of that, with a reflection map added
Per-pixel shader, with Fresnel reflections.
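The difference between the two lighting methods is mostly a matter of how many times the same small computation runs. The Python sketch below uses the classic diffuse (Lambert) term, which is the cosine of the angle between the surface normal and the direction to the light. The helper names, the 5,000-vertex object and the full HD pixel count are assumptions for illustration.

```python
import math


def lambert(normal, to_light):
    """Diffuse intensity: 1.0 when facing the light, 0.0 when turned away."""
    normal_length = math.sqrt(sum(n * n for n in normal))
    light_length = math.sqrt(sum(l * l for l in to_light))
    if normal_length == 0.0 or light_length == 0.0:
        return 0.0
    dot = sum(n * l for n, l in zip(normal, to_light))
    return max(0.0, dot / (normal_length * light_length))


def lighting_evaluations(vertex_count, covered_pixels, per_pixel):
    """How many times the lighting is computed for one object, per frame."""
    return covered_pixels if per_pixel else vertex_count


if __name__ == "__main__":
    print(lambert((0.0, 0.0, 1.0), (0.0, 0.0, 1.0)))  # 1.0: facing the light
    print(lambert((0.0, 0.0, 1.0), (1.0, 0.0, 0.0)))  # 0.0: light grazes the surface
    # A 5,000-vertex object that fills a 1920x1080 screen:
    print(lighting_evaluations(5000, 1920 * 1080, per_pixel=False))  # 5000
    print(lighting_evaluations(5000, 1920 * 1080, per_pixel=True))   # 2073600
```

Per-vertex lighting computes the value at each vertex and lets the GPU interpolate in between, while per-pixel lighting repeats it for every pixel the object covers, which looks better and costs more as the resolution grows.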
OK, this completes this first post about 3D from the artist's point of view. The next article will focus on the tooling required to create assets to populate your 3D vision, especially Blender 3D. So stay tuned for more!
A nice WebGL demo, made here in France, for and by the Microsoft French DPE