Dear Mr Miller, you suck. I'm running in 1080p resolution, and needed to have four simultaneous render targets, and a depth buffer, and I can't do it.
Now, to be fair, I didn't actually receive a message like that, but I could see it happening (and I've gotten some similar). One of the features that we added this last release was the ability to use multiple simultaneous render targets on Xbox (up to four of them). However, this isn't without restriction. All render targets that will be used on the device need to fit within the EDRAM (a 10MB chunk of memory). This includes the depth buffer if it exists. EDRAM isn't very big at all.
Let's take a look at an example. Your Xbox is rendering in wide screen 720p mode with a resolution of 1280x720. You set your game to run at this exact same resolution with a surface format of Color (32bits) and a 32bit depth buffer as well. You also turn on 2x multisampling because you hate jaggies like the rest of us. All of this data needs to fit into EDRAM, so let's see how big we are. Each pixel in the back buffer will be 8 bytes (32bit for color, double it for multisampling). There are 1280x720 pixels, so the back buffer will be taking up approximately 7.3MB of memory. The depth buffer is the same size, so that's another 7.3MB, and we're looking at a total of almost 15MB of data we need to store in our 10MB chunk of memory.
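The arithmetic above can be checked with a few lines of Python. This is only a sketch: the `surface_bytes` helper is a made-up name, not part of any XNA API, and it assumes that "10MB" means 10 * 2**20 bytes and that multisampling simply multiplies the per-pixel storage. Whether the back buffer reads as 7.0MB or 7.4MB depends on whether you count a megabyte as 2**20 or 10**6 bytes; either way the two buffers together don't fit.

```python
# Assumption: the 10MB of EDRAM is 10 * 2**20 bytes.
EDRAM_BYTES = 10 * 1024 * 1024

def surface_bytes(width, height, bytes_per_pixel, samples=1):
    """Bytes needed for one surface (hypothetical helper).

    Multisampling is assumed to store `samples` values per pixel.
    """
    return width * height * bytes_per_pixel * samples

# 1280x720, Color (4 bytes), 2x multisampling
back_buffer = surface_bytes(1280, 720, 4, samples=2)
# 32-bit depth buffer at the same size and sample count
depth_buffer = surface_bytes(1280, 720, 4, samples=2)
total = back_buffer + depth_buffer

print("back buffer bytes:", back_buffer)
print("total bytes:", total)
print("fits in EDRAM:", total <= EDRAM_BYTES)
```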
As you can see, even in this simple scenario we overstepped our bounds. Luckily, we have a mechanism (called "tiling") that allows us to work with this data anyway. We essentially break our large 1280x720 set of pixels into a series of smaller sets of pixels that *do* fit within the memory constraints. In the example above, the back buffer could be broken down into two separate "tiles" of size 640x720 with 2x multisampling, and things would work just fine. With only two possible consumers of this chunk of memory in version one (the back buffer and the depth buffer), you could easily fit the largest possible back buffer sizes within this chunk of memory.
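To see why two 640x720 tiles are enough, here is a rough check in Python. It assumes the depth buffer is tiled to the same dimensions as the color buffer and that both tiles need to sit in EDRAM together; `tile_bytes` is a hypothetical helper for illustration only.

```python
# Assumption: the 10MB of EDRAM is 10 * 2**20 bytes.
EDRAM_BYTES = 10 * 1024 * 1024

def tile_bytes(tile_width, tile_height, bytes_per_pixel, samples):
    """Bytes needed for a single tile of one surface (hypothetical helper)."""
    return tile_width * tile_height * bytes_per_pixel * samples

# Half of the 1280x720 back buffer, Color format, 2x multisampling
color_tile = tile_bytes(640, 720, 4, 2)
# Matching tile of the 32-bit depth buffer (assumed to be the same size)
depth_tile = tile_bytes(640, 720, 4, 2)

fits = color_tile + depth_tile <= EDRAM_BYTES
print("bytes per tile pair:", color_tile + depth_tile, "fits:", fits)
```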
In version two though, we allow up to five consumers of this memory (four simultaneous render targets and the depth buffer). The memory is divided equally among all consumers, so in the maximum case, each render target and buffer has to fit within 2MB of EDRAM. You can imagine if I called 10MB small what I think about 2MB. I'm sure you could just think "Well sure Miller, that's fine, but we could just create hundreds of little tiles and make it all fit."
There are two big issues with this. First, performance would be horrible. Each tile that is rendered actually has your *entire* scene rendered onto it, and then the results of the tiles are glued together. In a complex scene this could add up very quickly. Second (and more importantly) we only allow a maximum of 15 tiles to be created at one time. If your render target or depth buffer cannot fit inside 2MB of memory with 15 tiles or fewer, you will fail.
Knowing that, what size of render targets can you make? What if you have a 2048x2048 render target (surface format of Color) with no multisampling? Would that fit if you had four of them at the same time? A quick and dirty way of figuring it out would be this:
2048 * 2048 == 4M pixels * 4 bytes (color) == 16MB (total amount of memory needed) / 2MB (maximum size) == 8 tiles
So, you'd need 8 tiles for this buffer to render correctly. 8 is less than 15, so you'd be safe. What if you added 2x multisampling though?
2048 * 2048 == 4M pixels * 8 bytes (color+ms) == 32MB (total amount of memory needed) / 2MB (maximum size) == 16 tiles
Nope, 16 tiles is too many, this wouldn't fit. What if you didn't have a depth buffer though? The depth buffer counts as a buffer in the EDRAM, so without it you'd have 2.5MB per surface available:
2048 * 2048 == 4M pixels * 8 bytes (color+ms) == 32MB (total amount of memory needed) / 2.5MB (maximum size) == 12.8, rounded up to 13 tiles
13 tiles, you'd fit again!
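The same quick and dirty estimate can be written as a small Python function. Treat it as a sketch of the arithmetic in this post, not of what the runtime actually does: `estimate_tiles` and its parameters are invented for illustration, the 10MB is taken as 10 * 2**20 bytes, and it ignores the tile rounding that real tiles are subject to.

```python
# Assumptions: 10MB of EDRAM is 10 * 2**20 bytes, split evenly between consumers.
EDRAM_BYTES = 10 * 1024 * 1024
MAX_TILES = 15

def estimate_tiles(width, height, bytes_per_pixel, samples,
                   render_targets, has_depth):
    """Estimate tiles needed per surface (hypothetical helper)."""
    consumers = render_targets + (1 if has_depth else 0)
    surface = width * height * bytes_per_pixel * samples
    # Ceiling of surface / (EDRAM_BYTES / consumers), using integers only
    return -(-surface * consumers // EDRAM_BYTES)

# Four 2048x2048 Color targets plus depth, no multisampling
no_ms = estimate_tiles(2048, 2048, 4, 1, render_targets=4, has_depth=True)
# Same, with 2x multisampling
with_ms = estimate_tiles(2048, 2048, 4, 2, render_targets=4, has_depth=True)
# 2x multisampling, but no depth buffer
no_depth = estimate_tiles(2048, 2048, 4, 2, render_targets=4, has_depth=False)

for label, tiles in [("no ms", no_ms), ("2x ms", with_ms), ("2x ms, no depth", no_depth)]:
    print(label, tiles, "fits" if tiles <= MAX_TILES else "too many tiles")
```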
It's also important to point out that not all tiles are created equal. Tiles aren't calculated based on a memory size, but on a pixel size. Odd shaped (i.e., non-square) render targets will create odd shaped tiles, and the last tile may be larger than needed to account for that. For example, if you have a 1280x720 surface and 3 tiles used in that, you can be assured that you don't have 3 separate tiles each of size 426.67x720. Each tile is rounded up to the next size it can be (normally a multiple of 32). In the case above, that 1280x720 surface would really be three tiles of size 448x720, with the third tile having some "empty" space in it. So if you do your little math above and it comes to exactly 15 tiles, you still may be too big if the tiles aren't the size you'd expect.
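The rounding can be sketched like this. The code assumes the surface is split along its width only and that the alignment is always 32 pixels, which the text above only says is "normally" the case; `tile_width` is a hypothetical helper name.

```python
def tile_width(surface_width, tiles, alignment=32):
    """Width of each tile after rounding up to the alignment (hypothetical helper).

    Assumes the surface is split along its width only.
    """
    raw = -(-surface_width // tiles)          # ceiling of width / tiles
    return -(-raw // alignment) * alignment   # round up to a multiple of alignment

width = tile_width(1280, 3)
# Columns covered by the tiles that fall outside the 1280-wide surface
empty_columns = width * 3 - 1280

print("tile width:", width)
print("empty columns in the last tile:", empty_columns)
```

Because each tile ends up wider than the naive 1280 / 3, it also takes more memory than the simple division suggests, which is why an estimate that lands exactly on the limit may not hold up.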
Do I really expect someone to try to create and use four simultaneous 2048x2048 Color render targets with 4x multisampling and a 32bit depth buffer? Well no, because as you've seen here, it would fail! However, if it does fail, at least this hopefully explains why!
By the way, in reference to my fake first question: you can actually do this just fine, so long as you aren't using 4x multisampling. Even with 4x multisampling, it only just fails to fit with a depth buffer (1920x1080x16 bytes == 31.6MB / 2MB == 15.8, which rounds up to 16 tiles). Four 2x multisampled render targets at 1080p would fit though.
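For the 1080p case, a short loop over the sample counts shows where the limit bites. As before, this is a back-of-the-envelope sketch: it takes 1080p as 1920x1080, treats 10MB as 10 * 2**20 bytes shared evenly by four render targets and a depth buffer, and ignores tile rounding.

```python
# Assumptions: 10MB of EDRAM is 10 * 2**20 bytes; five equal consumers.
EDRAM_BYTES = 10 * 1024 * 1024
MAX_TILES = 15
CONSUMERS = 5  # four render targets plus the depth buffer

results = {}
for samples in (1, 2, 4):
    # 1920x1080, Color format (4 bytes per pixel)
    surface = 1920 * 1080 * 4 * samples
    # Ceiling of surface / (EDRAM_BYTES / CONSUMERS), using integers only
    results[samples] = -(-surface * CONSUMERS // EDRAM_BYTES)

for samples, tiles in results.items():
    verdict = "fits" if tiles <= MAX_TILES else "too many tiles"
    print(f"{samples}x: {tiles} tiles, {verdict}")
```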
Thumbs up for the crystal clear explanation. Thanks! Got my 5.