NVIDIA's GeForce 8800 (G80): GPUs Re-architected for DirectX 10
by Anand Lal Shimpi & Derek Wilson on November 8, 2006 6:01 PM EST. Posted in GPUs
Branching
In order to talk generally about SPs and their capabilities, all the vertices, primitives, pixel components, etc. to be processed are referred to as threads. This way we can look at each SP as handling its own thread no matter what type of data is being processed. G80 is able to sustain "thousands" of threads at a time, but the actual number of threads that can be active at any given time is not disclosed. While all SPs can handle any type of thread, SPs that share resources must be running the same type of thread at any given time. In this way, each block of 16 SPs can be running one type of shader program on 16 threads. This indicates something about branch granularity as well. For vertex shaders, branch granularity is 16 vertices. For pixel shaders, branch granularity is 32 pixels (arranged in pairs of blocks of 4x4 pixels).
Branch granularity defines how many threads must follow the same path through a branch. When a group of 32 pixel threads all take the same branch, we don't have a problem. If even one thread must take a path that is different from the others, all 32 threads must be evaluated along both paths following the branch. The branch condition then determines which result each individual thread keeps and which it discards. It's easy to see that the optimum granularity is 1 thread, as no unnecessary work would be done. The way resources are allocated and the way instructions are run on grouped SPs currently doesn't allow any finer-grained branching. Here's a chart that addresses branch granularity:
GPU | Branch Granularity |
NVIDIA NV4x | ~1K pixels |
NVIDIA G70 | ~256 pixels |
ATI R580 | 48 pixels |
NVIDIA G80 | 16 vertices, 32 pixels |
Clearly G80 has the advantage here, as it's less likely that smaller groups of pixels will take different directions through a branch. This gives programmers the ability to more easily integrate branching into their code without getting a massive performance hit. If programmers are able to incorporate more branches, shader code can become more general purpose and we will see many more effects make their way into games. Now that G80 has caught up to ATI in terms of potential branch performance, we hope developers will take the reality of more complex code seriously.
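To make the effect of granularity concrete, here is a minimal sketch of a lockstep cost model. It is our own simplification, not NVIDIA's or ATI's actual scheduling: threads are chunked into consecutive groups, a group whose threads all agree pays for one path, and a mixed group pays for both paths on every thread. The function name `branch_work`, the path costs, and the 1024-thread workload are all hypothetical.

```python
from typing import Sequence


def branch_work(takes_a: Sequence[bool], granularity: int,
                cost_a: int, cost_b: int) -> int:
    """Total per-thread work under a simplified lockstep cost model.

    Threads are split into consecutive groups of `granularity`. A group in
    which every thread agrees pays for one path only; a mixed group pays
    for both paths on every thread in the group.
    """
    if granularity < 1:
        raise ValueError("granularity must be at least 1")
    total = 0
    for start in range(0, len(takes_a), granularity):
        group = takes_a[start:start + granularity]
        if all(group):
            per_thread = cost_a           # whole group takes path A
        elif not any(group):
            per_thread = cost_b           # whole group takes path B
        else:
            per_thread = cost_a + cost_b  # divergent: both paths evaluated
        total += per_thread * len(group)
    return total


# Hypothetical workload: 1024 pixel threads, the first 100 take the costly path.
threads = [i < 100 for i in range(1024)]

# Granularities loosely matching the chart above, plus the ideal of 1 thread.
work = {g: branch_work(threads, g, cost_a=10, cost_b=2)
        for g in (1024, 256, 48, 32, 1)}
```

With these made-up costs, a granularity of 32 lands within about 10% of the ideal per-thread cost, while a single 1024-thread group does more than four times the ideal work. Real shaders will differ, since which threads diverge depends on the scene.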
Early-Z, Memory Interface
NVIDIA has added hardware for Early-Z to G80, following their existing Z-Cull hardware, which removes regions of pixels completely occluded by other geometry. Early-Z is a more fine-grained occlusion culling method that looks at the calculated Z value of a fragment before it hits the pixel pipeline. Z-Cull doesn't look at per-fragment Z values, but uses a Z value based on geometry. While Z-Cull can get rid of large blocks of data, it has trouble with surfaces that are only partially occluded and with intersecting surfaces. Looking at individual depth values per pixel helps keep unnecessary fragments from heading down the pipeline only to be thrown out when the ROPs get to them.
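The difference between the two tests can be sketched in a few lines. This is an illustrative model only, assuming smaller Z means nearer to the viewer and treating a "tile" as a short list of stored depths. The function names and depth values are hypothetical and say nothing about how G80 implements either stage.

```python
from typing import List, Sequence


def coarse_cull_rejects(stored_depths: Sequence[float],
                        nearest_geometry_z: float) -> bool:
    """Coarse, Z-Cull-style test using one conservative Z for the geometry.

    The whole tile is dropped only if even the nearest point of the
    incoming geometry lies at or behind every depth already stored.
    """
    return nearest_geometry_z >= max(stored_depths)


def early_z_survivors(stored_depths: Sequence[float],
                      fragment_depths: Sequence[float]) -> List[int]:
    """Fine, Early-Z-style test: keep fragments nearer than the stored depth."""
    return [i for i, (stored, z) in enumerate(zip(stored_depths, fragment_depths))
            if z < stored]


# Hypothetical tile: the right half is already covered by a near surface.
tile = [0.5, 0.5, 0.2, 0.2]
fragments = [0.3, 0.3, 0.3, 0.3]  # incoming surface, partially occluded

tile_rejected = coarse_cull_rejects(tile, min(fragments))
survivors = early_z_survivors(tile, fragments)
```

In this partially occluded case the coarse test cannot reject the tile, so without a per-fragment check all four fragments would head down the pipeline. The fine test keeps only the two that are actually visible.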
The memory interface has been dramatically redesigned to support the access patterns of all of G80's independent stream processors. Given the theme of increasing granularity within G80, it's no surprise that we are now seeing 5 and 6 channels of GDDR rather than the 2 or 4 channels we have been used to for the past few years. The 8800 GTX will have a 384-bit bus (6 x 64-bit channels), while the 8800 GTS will have a 320-bit connection to DRAM (5 x 64-bit channels). We would love to delve further into G80's new memory interface, but NVIDIA isn't discussing the details of this aspect of their hardware.
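The bus widths quoted above follow directly from the channel counts, and the same arithmetic extends to peak bandwidth once a data rate is known. The sketch below shows both. The helper names are ours, and the transfer rate is a placeholder for illustration, not a figure from this page.

```python
def bus_width_bits(channels: int, channel_width_bits: int = 64) -> int:
    """Total bus width is the channel count times the per-channel width."""
    return channels * channel_width_bits


def peak_bandwidth_bytes_per_s(bus_bits: int, transfers_per_s: int) -> int:
    """Theoretical peak: bytes moved per transfer times transfers per second."""
    return bus_bits // 8 * transfers_per_s


gtx_bus = bus_width_bits(6)  # 8800 GTX: 6 x 64-bit channels
gts_bus = bus_width_bits(5)  # 8800 GTS: 5 x 64-bit channels

# Placeholder data rate for illustration only (assumption, not from the article).
EXAMPLE_TRANSFERS_PER_S = 1_800_000_000
example_bandwidth = peak_bandwidth_bytes_per_s(gtx_bus, EXAMPLE_TRANSFERS_PER_S)
```

Here `gtx_bus` and `gts_bus` come out to the 384 and 320 bits given in the text. The bandwidth figure depends entirely on the assumed data rate.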
111 Comments
JarredWalton - Wednesday, November 8, 2006 - link
They did the same thing with the original Halo, porting it (and slowing it down) to DX9. MS seems to think making Halo 2 Vista-only will get people to upgrade to the new OS. [:rolls eyes:]
stmok - Wednesday, November 8, 2006 - link
How else are they gonna get gamers to upgrade to Vista? :)
(by cornering them into adopting Vista, using DirectX 10.0)
It's sad and pathetic at the same time.
DirectX 10.0 should be a "transitional" solution...That is, it covers both XP and Vista. This allows people to gradually upgrade their hardware, and if they wish, to Vista. What MS is doing now, is throwing everyone (developers and consumers) into the deep end, and expecting them to pay for the changes. (I suspect some would be put off by this, while the majority will continue to accept it...Which is unfortunate).
Great article BTW. Interesting to see the high-end stuff...But I doubt I can afford it in this lifetime!
I have two questions!
(1) Any chance of looking at a triple video card setup?
(I saw a presentation slide which had 2 video cards in SLI, while a third showed something else on screen).
(2) Any idea when the GF8600-series comes?
(mainstream market solution).
yyrkoon - Thursday, November 9, 2006 - link
Great, links aren't working?
http://www.gamedev.net/reference/programming/featu...
yyrkoon - Thursday, November 9, 2006 - link
http://www.gamedev.net/reference/programming/featu...
This article was written by a friend of mine back in April after an interview with ATI. Perhaps this will clear some things up.
yyrkoon - Thursday, November 9, 2006 - link
When you break all hardware/software ties to something that has been around for 4-5 years? It's not that easy making it "transitional". From a software perspective, D3D10 is not compatible with XP in the least.
I for one, think this is a step in the right direction.
JarredWalton - Thursday, November 9, 2006 - link
Supposedly all of the changes to the WDDM make porting DX10 back to Windows XP "impossible", although I'm more inclined to think the correct term would be "difficult" and you also have to add in "it doesn't fit with MS marketing protocol". WDDM is quite different in Vista however, so maybe there's some substance to the claims.
cosmotic - Wednesday, November 8, 2006 - link
On page 9:
--Briefly explain what a sub-pixel is in the sentence before--
JarredWalton - Wednesday, November 8, 2006 - link
Due to the size of this article and the amount of time it took to get ready, let me preempt any comments about the spelling and grammar. I am in the process of editing the final document as I read through it, and there are spelling/grammar errors. If they bother you too much, check back in an hour. If you read this an hour from now and you still find errors, then you can respond, though it would be useful to keep all responses in a single thread like this one.
Thanks in advance,
Jarred Walton
Editor
AnandTech.com
xtknight - Thursday, November 16, 2006 - link
On p 12 (gamma corrected AA):
"This causes problems for thing like thin lines."
acejj26 - Wednesday, November 8, 2006 - link
"If DirectX 10 sounds like a great boon to software developers, the fact that DX10 will only be supported in Windows XP is certain to curb enthusiasm. "
I believe this should say "DX10 will only be supported in Windows Vista..."
Not to be rude, but shouldn't the article be edited BEFORE being published??