NVIDIA's GeForce 8800 (G80): GPUs Re-architected for DirectX 10
by Anand Lal Shimpi & Derek Wilson on November 8, 2006 6:01 PM EST
Posted in: GPUs
What is CSAA?
Taking another step forward in antialiasing quality and performance, NVIDIA is introducing Coverage Sample Antialiasing with G80. Coverage Sample AA is an evolutionary step forward in AA technology designed to improve how accurately the hardware is able to determine the area of a pixel covered by any given surface. CSAA can be thought of as extending MSAA. NVIDIA is calling all of their AA modes CSAA, even though the common AA modes (2x, 4x, and now 8x, which NVIDIA calls 8xQ) are performed exactly the same way MSAA would be performed.
To enable modes that more accurately represent each polygon's coverage of a pixel, NVIDIA has introduced an "Enhance the application" option in their driver. This option will allow you to enable a desired MSAA mode in a game (either 4x or 8x) and then "enhance" it by enabling 8x, 16x, or 16xQ CSAA. This will make the 4xAA requested in the game look like 8xAA or 16xAA. Enhancing 8x to 16xQ gives the effect of 16xMSAA without the huge performance impact that would be associated with such a setting.
To understand how it all comes together, let's take a quick look at fragments and the evolution of AA.
We usually refer to fragments as pixels for simplicity's sake (and because Microsoft decided to use the term pixel shader rather than fragment shader in DirectX), but it helps to understand the difference between a pixel and a fragment when talking about AA methods. A pixel is simply a colored dot on the screen (or stored in a frame buffer). The different pieces of data that go into determining the color of a particular pixel are called fragments. For example, if two triangles cover the area of a single pixel, both will be processed as fragments. Texture lookups will be done for each at the pixel center, a color and depth will be determined, and any of this data can be manipulated by a fragment (pixel) shader. Without AA (and ignoring blending, transparency, etc.), only the fragment that is nearest the viewer and covers the pixel center will determine the color of the pixel. Antialiasing techniques are used to make the final pixel color reflect an accurate blend of the colors that cover a pixel.
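As a rough illustration of the no-AA case described above, the following Python sketch picks the nearest fragment that covers the pixel center. The `Fragment` type, the `resolve_no_aa` function, and the half-plane coverage tests are hypothetical names and simplifications invented for this example; they do not describe how any real GPU stores its data.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Color = Tuple[float, float, float]


@dataclass
class Fragment:
    color: Color                            # shaded once, at the pixel center
    depth: float                            # smaller means closer to the viewer
    covers: Callable[[float, float], bool]  # does the triangle cover point (x, y)?


def resolve_no_aa(fragments: Sequence[Fragment],
                  background: Color = (0.0, 0.0, 0.0)) -> Color:
    """Pick the nearest fragment that covers the pixel center (0.5, 0.5)."""
    candidates = [f for f in fragments if f.covers(0.5, 0.5)]
    if not candidates:
        return background
    return min(candidates, key=lambda f: f.depth).color


# Hypothetical pixel: a red triangle whose edge sits at x = 0.6,
# in front of a blue surface that covers the whole pixel.
red = Fragment((1.0, 0.0, 0.0), 0.3, lambda x, y: x < 0.6)
blue = Fragment((0.0, 0.0, 1.0), 0.7, lambda x, y: True)
pixel_color = resolve_no_aa([red, blue])
```

Here the pixel comes out fully red even though the red triangle covers only 60% of it, which is exactly the all-or-nothing behavior that antialiasing is meant to soften.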
A sub-pixel can be thought of as a zoomed-in look at the area a pixel covers; for example, instead of a single pixel, it can be viewed as a 10x10 grid of sub-pixels. Current popular FSAA (full-screen AA) methods use the calculated colors of multiple sub-pixels that fall within the area of a pixel rather than just the pixel center to determine the final color. Super Sample AA takes each of these sub-pixels through the entire pipeline to determine texture and pixel shader output at each location. This is very accurate, but wastes lots of processing power without providing a proportional benefit. This is because sub-pixels that fall on the same surface don't usually end up with very different colors. MSAA looks at only one textured/shaded sample point per fragment. The colors of the sub-pixels on a polygon are the same as the color at the center of the pixel, but each sub-pixel gets its own depth value. When two polygons cover the same pixel, we can end up with different colored sub-pixels. Blending these colors proportionally results in properly antialiased polygon edges.
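The MSAA idea can be sketched in a few lines of Python: each fragment carries one shaded color, coverage and depth are tested at every sub-pixel sample, and the winning colors are averaged. The sample positions in `SAMPLES_4X`, the tuple layout of a fragment, and the `resolve_msaa` name are assumptions made for illustration only, not the pattern or data layout any particular hardware uses.

```python
from typing import Callable, Sequence, Tuple

Color = Tuple[float, float, float]
# (color shaded once per fragment, depth at (x, y), coverage test at (x, y))
Fragment = Tuple[Color,
                 Callable[[float, float], float],
                 Callable[[float, float], bool]]

# Illustrative 4x sample positions inside the unit pixel (an assumption).
SAMPLES_4X = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]


def resolve_msaa(fragments: Sequence[Fragment],
                 samples: Sequence[Tuple[float, float]] = SAMPLES_4X,
                 background: Color = (0.0, 0.0, 0.0)) -> Color:
    """Average, over all samples, the color of the nearest covering fragment."""
    total = [0.0, 0.0, 0.0]
    for x, y in samples:
        best_color, best_depth = background, float("inf")
        for color, depth_at, covers in fragments:
            if covers(x, y):
                depth = depth_at(x, y)   # depth is evaluated per sub-pixel
                if depth < best_depth:
                    best_color, best_depth = color, depth
        for channel in range(3):
            total[channel] += best_color[channel]
    return tuple(t / len(samples) for t in total)


# Hypothetical pixel: a red edge at x = 0.5 in front of a blue surface.
red = ((1.0, 0.0, 0.0), lambda x, y: 0.3, lambda x, y: x < 0.5)
blue = ((0.0, 0.0, 1.0), lambda x, y: 0.7, lambda x, y: True)
edge_color = resolve_msaa([red, blue])
```

With these positions, two of the four samples land on the red triangle, so the pixel resolves to an even mix of red and blue. Super sampling would differ only in that each sample would run the shader again instead of reusing the fragment's single color.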
CSAA extends MSAA by decoupling color and depth values from the positions of the sample points within a pixel. Color values are determined at the pixel center, and color and depth data are stored in a buffer. The extension of this in CSAA comes in that we can look at more sample points in the pixel than we store color/Z data for. Under NVIDIA's 16x CSAA, four color values are stored, but the fragment coverage information for each of 16 sample points is retained. These coverage sample points are able to reference the appropriate color/Z data stored for the polygon that covers them.
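One way to picture the decoupling is a small set of stored colors plus a larger list of coverage samples that each point at one of them. The Python sketch below resolves a pixel from such a structure; the `csaa_resolve` function and its list-of-indices layout are a guess at the concept for illustration, not NVIDIA's actual format.

```python
from typing import List, Optional, Sequence, Tuple

Color = Tuple[float, float, float]


def csaa_resolve(slots: Sequence[Color],
                 coverage_index: Sequence[Optional[int]],
                 background: Color = (0.0, 0.0, 0.0)) -> Color:
    """Weight each stored color by how many coverage samples reference it.

    slots          -- the few colors actually stored (four under 16x CSAA)
    coverage_index -- one entry per coverage sample: an index into `slots`,
                      or None if nothing covers that sample
    """
    total = [0.0, 0.0, 0.0]
    for index in coverage_index:
        color = background if index is None else slots[index]
        for channel in range(3):
            total[channel] += color[channel]
    return tuple(t / len(coverage_index) for t in total)


# Hypothetical pixel: only two of the four slots are in use. A red triangle
# covers 5 of the 16 coverage samples; a blue surface covers the other 11.
slots: List[Color] = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
coverage_index = [0] * 5 + [1] * 11
edge_color = csaa_resolve(slots, coverage_index)
```

The blend weight here is 5/16, a ratio that four coverage samples could only approximate as 1/4 or 2/4, yet only two colors had to be stored to get it.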
While NVIDIA couldn't go into much detail on the technology behind CSAA, we can extrapolate what's going on behind the scenes to make this happen. For each triangle that covers a pixel, each CSAA sample point gets a boolean value that indicates whether or not it is covered by the triangle. Color/Z data for the fragment are stored in a buffer for that pixel. For this whole thing to work, each CSAA sample point must also know which color in the buffer to reference. If we assume position is predefined, the most storage that would be needed for each CSAA point is 4 bits (one boolean coverage value plus 3 bits to index 8 color/Z values), or 8 bytes per pixel for 16 sample points. The color and Z data will be significantly larger than 8 bytes per pixel, especially for floating point color data, so the memory footprint shouldn't be much larger than MSAA.
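The storage estimate above is easy to check with a little arithmetic. In the Python sketch below, the function names are made up for this example, and the 4 bytes of color plus 4 bytes of Z per stored sample are an assumption (a 32-bit color format and a 32-bit depth/stencil format); floating point color formats would make the color/Z side larger still.

```python
def coverage_bits_per_sample(color_slots: int) -> int:
    """One coverage bit plus enough bits to index one of `color_slots` entries."""
    index_bits = (color_slots - 1).bit_length()
    return 1 + index_bits


def coverage_bytes_per_pixel(coverage_samples: int, color_slots: int) -> int:
    """Coverage storage per pixel, rounded up to whole bytes."""
    bits = coverage_samples * coverage_bits_per_sample(color_slots)
    return (bits + 7) // 8


def color_z_bytes_per_pixel(stored_samples: int,
                            bytes_color: int = 4,  # assumed 32-bit color
                            bytes_z: int = 4       # assumed 32-bit depth/stencil
                            ) -> int:
    """Color and Z storage per pixel for the samples actually stored."""
    return stored_samples * (bytes_color + bytes_z)


bits = coverage_bits_per_sample(8)                    # 1 + 3 = 4 bits
coverage = coverage_bytes_per_pixel(16, 8)            # 16 * 4 bits = 8 bytes
csaa_16x = color_z_bytes_per_pixel(4) + coverage      # 32 + 8 = 40 bytes
msaa_4x = color_z_bytes_per_pixel(4)                  # 32 bytes
msaa_16x = color_z_bytes_per_pixel(16)                # 128 bytes
```

Under these assumptions, 16 coverage samples add at most 8 bytes on top of the 32 bytes that four color/Z samples already need, while storing 16 full color/Z samples would take 128 bytes.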
As fragments are sent out of the pixel shader, sub-pixel data is updated based on depth tests, and coverage samples and color/Z data will be updated as necessary. When the scene is ready to be drawn, the coverage sample points and color/Z data will be used to determine the color of a pixel based on each fragment that influenced it.
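Since NVIDIA did not describe the update logic, the following Python sketch is only one guess at how the update and resolve steps could fit together. The `CsaaPixel` class is hypothetical, it keeps a single depth per stored fragment rather than per sample, and it does not model what happens when more fragments arrive than there are color/Z slots.

```python
from typing import List, Optional, Sequence, Tuple

Color = Tuple[float, float, float]


class CsaaPixel:
    """Hypothetical per-pixel state: a few color/Z slots, many coverage samples."""

    def __init__(self, coverage_samples: int = 16, max_slots: int = 4) -> None:
        self.slots: List[Tuple[Color, float]] = []          # (color, depth)
        self.index: List[Optional[int]] = [None] * coverage_samples
        self.max_slots = max_slots

    def add_fragment(self, color: Color, depth: float,
                     covered: Sequence[bool]) -> None:
        """Depth-test the fragment at every coverage sample it covers."""
        wins = [i for i, hit in enumerate(covered)
                if hit and (self.index[i] is None
                            or depth < self.slots[self.index[i]][1])]
        if not wins:
            return                       # hidden everywhere: nothing to store
        if len(self.slots) >= self.max_slots:
            raise RuntimeError("slot overflow is not modeled in this sketch")
        self.slots.append((color, depth))
        for i in wins:
            self.index[i] = len(self.slots) - 1

    def resolve(self, background: Color = (0.0, 0.0, 0.0)) -> Color:
        """Blend stored colors by the number of samples referencing each."""
        total = [0.0, 0.0, 0.0]
        for idx in self.index:
            color = background if idx is None else self.slots[idx][0]
            for channel in range(3):
                total[channel] += color[channel]
        return tuple(t / len(self.index) for t in total)


pixel = CsaaPixel()
pixel.add_fragment((0.0, 0.0, 1.0), 0.7, [True] * 16)               # blue, far
pixel.add_fragment((1.0, 0.0, 0.0), 0.3, [True] * 5 + [False] * 11) # red, near
pixel.add_fragment((0.0, 1.0, 0.0), 0.9, [True] * 16)               # green, hidden
final_color = pixel.resolve()
```

The hidden green fragment never takes up a slot, and the final color weights red and blue by 5 and 11 coverage samples respectively.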
So what are the downsides? We have less depth information inside the pixel, but in most cases this isn't as important as color information. We do need to know depth at different sub-pixel positions in order to handle intersecting polygons, but doing this with a different level of detail than color information shouldn't have a big impact on quality.
The other drawback is that algorithms that require stencil/Z data at sub-pixel locations will not work correctly with CSAA in modes where there are more coverage samples than colors stored. In these cases, like with the stencil shadows used in FEAR, only the coverage samples located where color values are taken are used. This effectively reverts these algorithms to MSAA quality levels. CSAA will still be applied to polygon edges, and stencil algorithms will still work with the decreased level of antialiasing applied.
At a basic level, CSAA can provide more accurate coverage information for a pixel without the storage requirements of MSAA. This not only gives gamers an option to enable higher quality AA, but the option to enable higher quality AA without a large performance impact. While the explanation of how it does this may be overly complex, here's a simple table to help convey what's going on:
111 Comments
JarredWalton - Wednesday, November 8, 2006
The text is basically complete, and minor spelling issues aren't going to change the results. Obviously, proofing 29 pages of article content is going to take some time. We felt our readers would be a lot more interested in getting the content now rather than waiting even longer for me to proof everything. I know the vast majority of readers don't bother to comment on spelling and grammar issues, but my post was to avoid the comments section turning into a bunch of short posts complaining about errors that will be corrected shortly. :)
Iger - Wednesday, November 8, 2006
Pff, of course we would! If I would like to read a novel I would find a book! Results first - proofing later... if ever :) Thanks for the article!
JarredWalton - Wednesday, November 8, 2006
Did I say an hour? Okay, how about I just post here when I'm done reading/editing? :)
JarredWalton - Wednesday, November 8, 2006
Okay, I'm done proofing/editing. If you still see errors, feel free to complain. Like I said, though, try to keep them in this thread.
--Jarred
LuxFestinus - Thursday, November 9, 2006
Pg. 3 under "Unified Shaders" should read as follows:
"Until now, building a GPU with unified shaders would not have *been* desirable, let alone practical, but Shader Model 4.0 lends itself well to this approach."
Good try though. ;)
shabby - Wednesday, November 8, 2006
$600 for the gtx and $450 for the gts is pretty good seeing how much they crammed into the gpu, makes you wonder why the previous gen topped 650 bucks at times.
dcalfine - Wednesday, November 8, 2006
How does the 8800GTX compare to the 7950GX2? Not just in FPS, but also in performance/watt?
dcalfine - Wednesday, November 8, 2006
Ignore ^^^
sorry
Hot card by the way!
neogodless - Wednesday, November 8, 2006
I know you touched on this, but I assume that DirectX 10 is still not available for your testing platform, Windows XP Professional SP2, and additionally no games have been released for that platform. Is this correct? If so...
Will DirectX 10 be made available for Windows XP?
Will you publish a new review once Vista, DirectX 10 and the new games are available?
Can we peek into the future at all now?
JarredWalton - Wednesday, November 8, 2006
DX10 will be Vista only according to Microsoft. What that means according to some game developers is that DX10 support is going to be somewhat slow, and it's also going to be a major headache because for the next 3-4 years they will pretty much be required to have a DX9 rendering solution along with DX10.