NVIDIA's GeForce 8800 (G80): GPUs Re-architected for DirectX 10
by Anand Lal Shimpi & Derek Wilson on November 8, 2006 6:01 PM EST. Posted in GPUs
We always get very excited when we see a new GPU architecture come down the pipe from ATI or NVIDIA. For the past few years, we've really just been seeing reworked versions of old parts. NV40 evolved from NV30, G70 was just a step up from NV40, and the same is true with ATI as well. Fundamentally, not much has changed since the introduction of DX9 class hardware. But today, G80 ushers in a new class of GPU architecture that truly surpasses everything currently on the market. Changes like this only come along once every few years, so we will be sure to savor the joy that discovering a new architecture brings, and this one is big.
These massive architecture updates generally coincide with the release of a new version of DirectX, and guess what we've got? Thus we begin today's review not with a discussion of pixel shaders and transistors, but with a look at DirectX and what it will mean for the next generation of graphics hardware, including G80.
DirectX 10
There has been quite a lot of talk about what DirectX 10 will bring to the table, and what we can expect from DX10 class hardware. Well, the hardware is finally here, but much like the situation we saw with the launch of ATI's Radeon 9700 Pro, the hardware precedes the new API. In the meantime, we can only look at how our shiny new hardware performs under DX9. Of course, we will see full DX9 support, encompassing everything we've come to know and love about the current generation of hardware.
Even though we won't yet get to see any of the new features of DX10 and Shader Model 4.0, the performance of G80 will shine through due to its unified shader model. This will allow developers to do more with SM3.0 and DX9 while we all wait for the transition to DX10. In the meantime, we will absolutely be able to talk about what the latest installment of Microsoft's pervasive graphics API will bring to the table.
More Efficient State and Object Management
One of the major performance improvements we will see from DX10 is a reduction in overhead. Under DX9, state changes and draw calls are made quite often and can generate so much overhead that the API becomes the limiting factor in performance. With DX10, we will see the addition of state objects, which hold all of the state information for a given pipeline stage. There are five state objects in DX10: InputLayout (vertex buffer layout), Sampler, Rasterizer, DepthStencil, and Blend. Binding one of these objects changes all of the associated state in a single call, rather than requiring a separate call for each attribute.
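To make the overhead argument concrete, here is a minimal sketch in Python that only counts API calls. It is a toy model, not Direct3D: ToyDevice, BlendState, and the attribute names are hypothetical and exist purely to contrast one call per attribute with one call per state object.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BlendState:
    """Immutable bundle of blend settings (field names are hypothetical)."""
    enabled: bool
    src_factor: str
    dst_factor: str
    op: str


class ToyDevice:
    """Counts API calls only; a stand-in, not a real Direct3D interface."""

    def __init__(self):
        self.calls = 0
        self.blend = {}

    def set_blend_attribute(self, name, value):
        # DX9-style path: every attribute costs one call.
        self.calls += 1
        self.blend[name] = value

    def bind_blend_state(self, state):
        # DX10-style path: one call replaces the whole group of settings.
        self.calls += 1
        self.blend = asdict(state)


def calls_per_attribute(states):
    """Number of calls needed when each attribute is set individually."""
    device = ToyDevice()
    for state in states:
        for name, value in asdict(state).items():
            device.set_blend_attribute(name, value)
    return device.calls


def calls_with_state_objects(states):
    """Number of calls needed when each state group is bound as one object."""
    device = ToyDevice()
    for state in states:
        device.bind_blend_state(state)
    return device.calls


OPAQUE = BlendState(False, "one", "zero", "add")
ALPHA = BlendState(True, "src_alpha", "inv_src_alpha", "add")
FRAME = [OPAQUE, ALPHA, OPAQUE]  # three state switches in one frame
```

With four attributes per group, the three switches in FRAME cost 12 calls on the per-attribute path and 3 on the state-object path. Real savings depend on how many attributes a stage carries and how the driver validates them, which this model does not attempt to capture.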
Constant buffers have also been added to hold data for use in shader programs. Each shader program has access to 16 buffers of 4096 constants. Each buffer can be updated in one function call. This hugely reduces the overhead of managing a lot of input for shader programs to use. Similar to constant buffers, texture arrays are also available in order to allow for much more data to be stored for use with a shader program. 512 equally sized textures can be stored in a texture array, and each shader is allowed 128 texture arrays (as opposed to 16 textures in DX9). The combination of 8Kx8K texture sizes with all this texture storage space will offer a huge boost in texturing ability to DX10 based games and hardware.
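The limits quoted above are easier to appreciate as totals. The short Python sketch below multiplies them out. The buffer, constant, and texture counts come from the text; the 16 bytes per constant (four 32-bit components) and 4 bytes per texel (32-bit color, single mip level) are our assumptions for illustration.

```python
BUFFERS_PER_SHADER = 16        # from the article
CONSTANTS_PER_BUFFER = 4096    # from the article
BYTES_PER_CONSTANT = 16        # assumption: four 32-bit components per constant

TEXTURES_PER_ARRAY = 512       # from the article
ARRAYS_PER_SHADER = 128        # from the article
MAX_TEXTURE_DIM = 8192         # "8Kx8K" from the article
BYTES_PER_TEXEL = 4            # assumption: 32-bit color, single mip level


def constant_bytes_per_shader():
    """Total constant storage visible to one shader program."""
    return BUFFERS_PER_SHADER * CONSTANTS_PER_BUFFER * BYTES_PER_CONSTANT


def addressable_textures_per_shader():
    """Textures reachable through texture arrays by one shader program."""
    return TEXTURES_PER_ARRAY * ARRAYS_PER_SHADER


def largest_texture_bytes():
    """Size of a single maximum-dimension texture under the assumptions above."""
    return MAX_TEXTURE_DIM * MAX_TEXTURE_DIM * BYTES_PER_TEXEL
```

Under these assumptions, a shader can see 1 MiB of constants and address 65,536 textures (versus 16 in DX9), and one 8Kx8K texture occupies 256 MiB. These are addressing limits, not a claim that such a working set would fit in video memory.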
A new construct called a "view" is being introduced in DX10 which will allow a resource to be used as more than one type of thing at the same time. For instance, a pixel shader could render vertex data to a texture, and then a vertex shader could use a view to interpret the data as a vertex buffer. Views will basically give developers the ability to share resources between pipeline stages more easily.
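The idea behind views can be sketched in a few lines of Python: one untyped block of memory, and two lightweight objects that each interpret the same bytes differently. This is only an analogy. Resource, TexelView, and VertexView are hypothetical names, and the layouts (four floats per texel, three floats per position, 16-byte stride) are assumptions chosen for the example.

```python
import struct

TEXEL_FORMAT = "<4f"    # one RGBA texel: four 32-bit floats
VERTEX_FORMAT = "<3f"   # one vertex position: x, y, z
STRIDE = struct.calcsize(TEXEL_FORMAT)  # 16 bytes per element in both views


class Resource:
    """A block of raw memory with no type of its own."""

    def __init__(self, element_count):
        self.data = bytearray(element_count * STRIDE)


class TexelView:
    """Lets a simulated pixel shader write RGBA texels into the resource."""

    def __init__(self, resource):
        self.resource = resource

    def write_texel(self, index, rgba):
        struct.pack_into(TEXEL_FORMAT, self.resource.data, index * STRIDE, *rgba)


class VertexView:
    """Lets a simulated vertex shader read the same bytes as positions."""

    def __init__(self, resource):
        self.resource = resource

    def read_position(self, index):
        # Reads the first three floats of the element; the fourth is ignored.
        return struct.unpack_from(VERTEX_FORMAT, self.resource.data, index * STRIDE)
```

For example, after `TexelView(res).write_texel(1, (1.0, 2.0, 0.5, 1.0))`, calling `VertexView(res).read_position(1)` returns `(1.0, 2.0, 0.5)` without any copy in between. The data never leaves the resource; only the interpretation changes.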
There is also a DrawAuto call which can redraw an object without having to go back out to the CPU. This, combined with predicated rendering, should cut down on the overhead and performance impact of the large numbers of draw calls currently being used in DX9.
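A rough way to picture the benefit is to count how often the CPU has to ask the GPU for a result before it can issue the next draw. The Python sketch below is a toy model under our own assumptions: ToyGPU and its methods are hypothetical, `draw_auto` is merely named after the DrawAuto call mentioned above, and its `predicated` flag is invented for illustration.

```python
class ToyGPU:
    """Tracks CPU readbacks and drawn vertices; not a real graphics API."""

    def __init__(self):
        self.cpu_readbacks = 0
        self.vertices_drawn = 0
        self._stream_out_count = 0
        self._query_visible = False

    def stream_out(self, vertex_count):
        # An earlier pass writes some number of vertices to a buffer.
        self._stream_out_count = vertex_count

    def run_occlusion_query(self, visible):
        # An earlier pass determines whether the object is visible.
        self._query_visible = visible

    def read_query_result(self):
        self.cpu_readbacks += 1  # CPU waits for the GPU's answer
        return self._query_visible

    def read_stream_out_count(self):
        self.cpu_readbacks += 1  # CPU waits for the GPU's answer
        return self._stream_out_count

    def draw(self, vertex_count):
        self.vertices_drawn += vertex_count

    def draw_auto(self, predicated=False):
        # GPU-side decision: skip if the predicate failed, and take the
        # vertex count from the stream-out buffer without a readback.
        if predicated and not self._query_visible:
            return
        self.vertices_drawn += self._stream_out_count


def cpu_driven_frame(vertex_count, visible):
    gpu = ToyGPU()
    gpu.stream_out(vertex_count)
    gpu.run_occlusion_query(visible)
    if gpu.read_query_result():
        gpu.draw(gpu.read_stream_out_count())
    return gpu


def gpu_driven_frame(vertex_count, visible):
    gpu = ToyGPU()
    gpu.stream_out(vertex_count)
    gpu.run_occlusion_query(visible)
    gpu.draw_auto(predicated=True)
    return gpu
```

Both paths draw the same vertices for the same inputs, but the CPU-driven frame needs up to two readbacks per object while the GPU-driven frame needs none. In practice each readback can stall the pipeline, which is where the real cost lies; the model only counts them.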
111 Comments
DerekWilson - Thursday, November 9, 2006 - link
i'm sure there was a lot buried in there ... sorry if it wasn't easy to find.
8800 gtx and gtx are both no louder than 7900 gtx. 1950 xtx still takes the cake for loudest graphics card around by a long shot -- especially after it heats up in a game.
crystal clear - Thursday, November 9, 2006 - link
My comments in Daily Tech on this subject (More "G80" Derivatives in February):
RE: More info would be nice
By crystal clear on 11/8/2006 8:03:43 AM, Rating: 2
If you link Vista, the Santa Rosa platform, and the Core 2 Duo (Merom) CPU lineup (T7300, T7500, T7700 models), then you need a matching graphics card to complete the link.
So a G80 for laptops/notebooks?
The pairing of Intel's Santa Rosa platform with Vista in 2Q 07 is the next big thing for the first tier notebook manufacturers & all they need is a matching G80 for this setup.
Unquote-
Nvidia currently caters to desktop requirements/needs with the new G80 releases; I wonder how the notebook/server versions will be, with Vista of course.
yyrkoon - Thursday, November 9, 2006 - link
Virtual memory is probably a good thing for most cases, but in the graphics arena, this *could* potentially make for sloppy/bad coding practises. Knowing a lot of game devs (some of whom actually work for well known companies), I've heard them from time to time complain about maxing a 16x PCI-E pipe. What I'm trying to say here is that while it would be a good thing to never run out of texture memory, system memory, and definitely the swap disk, cannot hold a candle to the memory bandwidth that most video cards are capable of. End result is that you definitely *will* get a performance hit. All this, and we already know the memory bandwidth capabilities of modern PCs; suffice it to say, the most we'll see from current systems is what? 12-13K GB/s? Even a 7800GS can do roughly 35 GB/s on card. A 7600GT? 22GB/s?
Still I think DirectX 10 is a very good thing, and as I didn't read the whole article, perhaps I missed a little? Reason being, I've been reading about DirectX 10 since April, and a friend of mine was privy to some of this information after an interview with ATI.
http://www.gamedev.net/reference/programming/featu...
saratoga - Thursday, November 9, 2006 - link
I don't know how they threading really works, but it's quite possible VM support is required in order to allow multiple threads to run without stepping all over each other.
saratoga - Thursday, November 9, 2006 - link
Sorry, should read "I don't know how THEIR threading works"
falc0ne - Thursday, November 9, 2006 - link
I don't know what the problem is but I'm really unable to see the images within the latest articles from Anand...Can anyone give me a suggestion? What might be the cause of that?
The thing is I'm really, really interested in these articles and I need to see those images. Thanks
yyrkoon - Thursday, November 9, 2006 - link
Oh, er, then in the options tab of Firefox, (tools->options->content) check the "load images" check box ;)
falc0ne - Thursday, November 9, 2006 - link
well...it would've been simple but I'm afraid it's not that...It might be the AdBlock extension for Firefox, other than that I have nooo ideeea...Well I will use the IE tab option instead and load the pages using IE 7. Thanks anyway:)
yyrkoon - Thursday, November 9, 2006 - link
Checked the exceptions list? I know that Firefox makes it really simple to block images from a site (to a point of being too easy).
JarredWalton - Thursday, November 9, 2006 - link
If you've got AdBlock on Firefox, press Ctrl+Shift+A and you can see what it's blocking. If it blocks the images.anandtech.com stuff, you can then see which RegEx isn't working right and edit that.