Shawn Hargreaves Blog
This article is prerecorded. Shawn is away (getting married later today). Replies to comments will be delayed.
Plagiarizing my MIX talk, I thought it would be useful to summarize the performance implications of choosing between the different shader permutations of the five built-in effects in XNA Game Studio 4.0.
To oversimplify, options with fewer shader instructions tend to run faster. But before you rush off to choose the cheapest shaders available, be aware this will make no difference if your game is CPU rather than GPU bound, or if your bottleneck is a different part of the GPU.
There are four main versions of BasicEffect, depending on what lighting options are used:
Version                          Vertex Shader   Pixel Shader
LightingEnabled = false          5               1
One vertex light                 40              1
Three vertex lights              60              1
PreferPerPixelLighting = true    18              50
The "one vertex light" version is used when:
These settings add extra shader instructions to the numbers from the previous table:
Setting                          Vertex Shader   Pixel Shader
TextureEnabled = true            +1              +2
FogEnabled = true                +4              +2
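To make the "base cost plus extras" arithmetic concrete, here is a minimal sketch that adds up the numbers from the two BasicEffect tables above. It is plain Python rather than XNA code, and the dictionary keys and the basic_effect_cost function are names invented for this illustration, not part of any API.

```python
# Instruction counts copied from the two BasicEffect tables above.
# The keys and function name are illustrative only, not XNA identifiers.
BASE = {
    "unlit": (5, 1),                 # LightingEnabled = false
    "one_vertex_light": (40, 1),
    "three_vertex_lights": (60, 1),
    "per_pixel": (18, 50),           # PreferPerPixelLighting = true
}
EXTRAS = {
    "texture": (1, 2),               # TextureEnabled = true
    "fog": (4, 2),                   # FogEnabled = true
}


def basic_effect_cost(lighting, texture=False, fog=False):
    """Return (vertex, pixel) instruction counts for one configuration."""
    vs, ps = BASE[lighting]
    for name, enabled in (("texture", texture), ("fog", fog)):
        if enabled:
            extra_vs, extra_ps = EXTRAS[name]
            vs += extra_vs
            ps += extra_ps
    return vs, ps


print(basic_effect_cost("three_vertex_lights", texture=True, fog=True))  # (65, 5)
print(basic_effect_cost("per_pixel", texture=True))                      # (19, 52)
```

The same lookup-and-add pattern applies to the other effects below. Keep in mind that instruction counts are only a rough guide to actual GPU cost.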
There are three main versions of SkinnedEffect, depending on what lighting options are used:
Version                          Vertex Shader   Pixel Shader
One vertex light                 55              4
Three vertex lights              75              4
PreferPerPixelLighting = true    33              51
The above is with WeightsPerVertex = 1. The default is 4, which matches the default ModelProcessor behavior. Specifying fewer weights will only give correct results if your vertex data matches the requested shader. Neat trick: you can use SkinnedEffect with WeightsPerVertex = 1 to implement shader based instancing.
Setting                          Vertex Shader   Pixel Shader
WeightsPerVertex = 2             +7              +0
WeightsPerVertex = 4             +13             +0
FogEnabled = true                +0              +2
Unlike BasicEffect, SkinnedEffect has no option to disable lighting entirely, or to disable texturing.
There are two main versions of EnvironmentMapEffect, depending on what lighting options are used:
Version                          Vertex Shader   Pixel Shader
One vertex light                 32              6
Three vertex lights              36              6

Setting                          Vertex Shader   Pixel Shader
FresnelFactor != 0               +7              +0
EnvironmentMapSpecular != 0      +0              +2
FogEnabled = true                +0              +2
I will write more about Fresnel and specular sometime later.
This guy, DualTextureEffect, is really simple, with only one basic version:

Version                          Vertex Shader   Pixel Shader
DualTextureEffect                7               6
Plus a single setting that adds extra shader instructions:
Setting                          Vertex Shader   Pixel Shader
FogEnabled = true                +4              +2
Despite its simplicity, DualTextureEffect is the key to great looking and efficient rendering techniques such as lightmaps and detail textures. I will write more about this sometime later, too.
This fellow, AlphaTestEffect, has two main versions, depending on what AlphaFunction is used:

AlphaFunction                    Vertex Shader   Pixel Shader
<, <=, >=, >                     6               6
==, !=                           6               10

Setting                          Vertex Shader   Pixel Shader
FogEnabled = true                +4              +2
Unrelated to this post: Shawn: I need XNA 4.0. Fix the installer so it will install on Server 2008 (says it only supports Vista and Windows 7) :P
I was wondering why the dual texture effect doesn't support any lighting? Was this a time constraint or just a design choice?
Also, I noticed that the 4.0 refresh is out. Do you have anything to say about it, Shawn?
Thanks for the post.
Congratulations on the wedding!
> Neat trick: you can use SkinnedEffect with WeightsPerVertex = 1 to implement shader based instancing.
I would love to see that trick :)
"I would love to see that trick :)"
Just from thinking about it (not from any kind of documentation), I believe this would be the vertex shader instancing implementation. That is, each vertex has an associated instance index, and the vertex data is duplicated for each instance (see the instancing sample for more information). In this case, that instance index == the bone index.
However, this is relatively inefficient, as it requires the vertex list to contain duplicate vertices for ALL instances. The better method is to use hardware instancing where there's only one set of vertex data plus a separate set of instance indices. But I'm guessing SkinnedEffect doesn't support that, as it's a fundamentally different operation than skinning.
> I was wondering why the dual texture effect doesnt support any lighting?
To support every possible permutation of N different shading options is an exponential permutation problem, requiring 2^N different shaders.
We had the resources to create around 50 different shaders, which would have given us room to support just log2(50) options, which is nowhere near enough to support a range of interesting rendering features.
Instead, we chose to pick and choose only the more useful combinations of features, so we could support a larger total set of features without having to create an exponential number of different shaders.
More info in this article: http://blogs.msdn.com/shawnhar/archive/2010/04/28/new-built-in-effects-in-xna-game-studio-4-0.aspx?CommentPosted=true#commentmessage
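The arithmetic in the reply above can be checked with a short sketch. This is plain Python; the function names are made up for the illustration, and the budget of 50 shaders is the approximate figure quoted in the reply.

```python
import math


def shaders_needed(num_options):
    """Every on/off combination of N independent options needs its own shader."""
    return 2 ** num_options


def options_affordable(shader_budget):
    """Largest N for which all 2^N permutations still fit in the budget."""
    return math.floor(math.log2(shader_budget))


print(options_affordable(50))                 # 5
print(shaders_needed(5), shaders_needed(6))   # 32 64
```

With a budget of about 50 shaders, full permutation coverage stops at five on/off options, since a sixth would already require 64 shaders.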
Re: "I need XNA 4.0. Fix the installer so it will install on Server 2008 (says it only supports Vista and Windows 7)"
Never mind, I just realized the 4.0 CTP only includes phone. Hurry up and get the rest of it done, and make sure that installer works on Server 2008 :P
Though I do have the exact question Daniel asked: wouldn't the lack of lighting for DualTextureEffect and AlphaTestEffect be a significant disadvantage for some common use scenarios?
> Just from thinking about it (not from any kind of documentation), I believe this would be the vertex shader instancing implementation.
Exactly. Skinned character rendering and vertex shader instancing are exactly the same thing, just used to achieve a different goal, so the same shader can be used for both.
It is quite trivial to change the shader instancing technique (from the instancing sample on creators.xna.com) to use SkinnedEffect as opposed to the custom shader it uses today.
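To show why the two techniques are the same operation, here is a rough CPU-side sketch of what a one-weight skinning shader computes when the bone index is reused as an instance index. It is plain Python, not the SkinnedEffect shader or the instancing sample: the helper names are invented, and the 3x4 column-vector matrices are a simplification chosen for this illustration.

```python
def translation(tx, ty, tz):
    """Build a 3x4 affine matrix (rotation part is identity) that translates."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz]]


def transform(matrix, point):
    """Apply a 3x4 affine matrix to a 3D point."""
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in matrix)


def replicate(vertices, instance_count):
    """Duplicate the model once per instance, tagging every copied vertex with
    its instance index. That tag plays the role of the single bone index."""
    return [(pos, i) for i in range(instance_count) for pos in vertices]


def skin_one_weight(tagged_vertices, bones):
    """One weight per vertex: each vertex is moved by exactly one matrix."""
    return [transform(bones[index], pos) for pos, index in tagged_vertices]


triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
bones = [translation(0, 0, 0), translation(10, 0, 0)]  # one matrix per instance
out = skin_one_weight(replicate(triangle, 2), bones)
print(out[3])  # first vertex of the second instance: (10.0, 0.0, 0.0)
```

The "bone" array holds one world transform per instance, which is why the number of instances per batch is bounded by the number of matrices the shader can hold.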
> Exactly. Skinned character rendering and vertex shader instancing are exactly the same thing, just used to achieve a different goal, so the same shader can be used for both.
Aren't you very limited in how many instances you can render here? First of all, the vertex buffer needs all the vertex data for all the instances, but most importantly, the instance data has to be set using shader constants, which are limited in size (256?).
> Aren't you very limited in how many instances you can render here?
Yes, exactly the same as with any implementation of the shader instancing algorithm. From the doc that comes with our Instanced Model sample:
There is a limit on how many shader instances can be drawn in a single batch. This comes partly from the limited number of shader constant registers available to hold the instance transform matrices (see the comment and MAX_SHADER_MATRICES constant at the top of InstanceModel.fx) and partly from the limited range of 16-bit index values. If we repeated the model data too many times, our 16-bit indices would overflow. We do not want to use 32-bit indices because they are not universally supported on all graphics cards. The InstancedModelPart class stores the result of combining these two batch size limits in the maxInstances field. If asked to draw more copies than this limit, the DrawShaderInstancing method splits up the request, drawing as many instances as possible in each call to DrawIndexedPrimitives.
Exactly, about 60 instances per draw call. And I want 10000+ :)
> Exactly, about 60 instances per draw call. And I want 10000+ :)
So, on platforms that support it, use true hardware instancing. On platforms that do not support hardware instancing, shader instancing is a great alternative.
Obviously if you want to draw more instances than fit into shader registers, you must split into multiple batches. That's still much faster than doing no instancing at all. Check out e.g. our instancing sample, and you will see that shader instancing and hardware instancing are usually about the same speed in practice.
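The two batch size limits and the splitting described in this thread can be sketched as follows. This is plain Python, not the sample's code: the value 60 for the matrix limit is an assumption taken from the "about 60 instances per draw call" comment, and the function names are hypothetical.

```python
# Assumed value, based on the "about 60 instances" comment above. The real
# limit is whatever MAX_SHADER_MATRICES is set to in the sample's shader.
MAX_SHADER_MATRICES = 60

# 16-bit indices take values 0..65535, so one batch can address at most
# 65536 replicated vertices.
INDEX_RANGE_16_BIT = 65536


def max_instances(vertex_count, max_matrices=MAX_SHADER_MATRICES):
    """Combine the shader constant limit with the 16-bit index limit."""
    if vertex_count <= 0:
        raise ValueError("vertex_count must be positive")
    return min(max_matrices, INDEX_RANGE_16_BIT // vertex_count)


def split_batches(total_instances, per_batch):
    """Return the instance count of each draw call needed to cover the total."""
    batches = []
    remaining = total_instances
    while remaining > 0:
        n = min(remaining, per_batch)
        batches.append(n)
        remaining -= n
    return batches


print(max_instances(100))     # 60: limited by shader constants
print(max_instances(2000))    # 32: limited by 16-bit indices
batches = split_batches(10000, max_instances(100))
print(len(batches), batches[-1])  # 167 40
```

Under these assumptions, 10000 instances of a 100-vertex model take 167 draw calls instead of 10000, which is where the speedup over no instancing comes from.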