NVIDIA's GeForce 7800 GTX Hits The Ground Running
by Derek Wilson on June 22, 2005 9:00 AM EST. Posted in GPUs
Inside The Pipes
The pixel pipe is made up of two vector units and a texture unit that all operate together to facilitate effective shader program execution. There are a couple of mini-ALUs in each shader pipeline that allow operations such as a free fp16 normalize and other specialized features that relate to and assist the two main ALUs.
Even though this block diagram looks slightly different from ones shown during the 6800 launch, NVIDIA has informed us that these mini-ALUs were also present in NV4x hardware. There was much talk when the 6800 launched about the distinct functionality each of the main shader ALUs had. In NV4x, only one ALU had the ability to perform a single-clock MADD (multiply-add). Similarly, only one ALU assisted in texture address operations for the texture unit. Simply having these two distinct ALUs (regardless of their functionality difference) is what was able to push the NV4x so much faster than the NV3x architecture.
In their ongoing research into commonly used shaders (and likely much of their work with shader replacement), NVIDIA discovered that a very high percentage of shader instructions were MADDs. Multiply-add is extremely common in 3D mathematics, as linear algebra, matrix manipulation, and vector calculus are a huge part of graphics. G70 implements MADD on both main shader ALUs. Taking into account the 50% increase in shader pipelines and each pipe's ability to compute twice as many MADD operations per clock, the G70 has the theoretical ability to triple MADD performance over the NV4x architecture (on a clock-for-clock basis).
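The tripling claim is simple multiplication, and a rough sketch makes the arithmetic explicit. The pipeline counts used here (16 for NV4x, 24 for G70) are assumptions chosen to match the 50% increase described above, and the function name is purely illustrative. This is a theoretical peak and says nothing about real shader workloads.

```python
# Theoretical peak MADD issue rate per clock, ignoring clock speed.
# ASSUMPTION: 16 pixel pipes for NV4x and 24 for G70 (a 50% increase);
# these counts are illustrative inputs, not figures taken from the text.

def madds_per_clock(pixel_pipes: int, madd_alus_per_pipe: int) -> int:
    """Each MADD-capable main ALU can issue one MADD per clock."""
    return pixel_pipes * madd_alus_per_pipe

# NV4x: only one of the two main ALUs per pipe can do a single-clock MADD.
nv4x = madds_per_clock(pixel_pipes=16, madd_alus_per_pipe=1)

# G70: both main ALUs per pipe can do a single-clock MADD.
g70 = madds_per_clock(pixel_pipes=24, madd_alus_per_pipe=2)

ratio = g70 / nv4x  # 1.5x the pipes times 2x the MADDs per pipe

print(f"NV4x: {nv4x} MADDs/clock")
print(f"G70:  {g70} MADDs/clock")
print(f"Clock-for-clock ratio: {ratio:.1f}x")
```

The ratio depends only on the 1.5x pipe count and the 2x per-pipe MADD rate, so any pair of pipeline counts with the same 50% relationship gives the same result.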
Of course, we pressed the development team to tell us if both Shader ALUs featured identical functionality. The answer is that they do not. Other than knowing that only one ALU is responsible for assisting the texture hardware, we were unable to extract a detailed answer about how similar the ALUs are. Suffice it to say that they still don't share all features, but that NVIDIA certainly feels that the current setup will allow G70 to extract twice the shader performance for a single fragment over NV4x (depending on the shader of course). We have also learned that the penalty for branching in the pixel shaders is much less than in previous hardware. This may or may not mean that the pipelines are less dependent on following the exact same instruction path, but we really don't have the ability to determine what is going on at that level.
127 Comments
BenSkywalker - Wednesday, June 22, 2005 - link
Derek - I wanted to offer my utmost thanks for the inclusion of 2048x1536 numbers. As one of the fairly sizeable group of owners of a 2070/2141 these numbers are enormously appreciated. As everyone can see 1600x1200x4x16 really doesn't give you an idea of what high resolution performance will be like. As far as the benches getting a bit messed up - it happens. You moved quickly to rectify the situation and all is well now. Thanks again for taking the time to show us how these parts perform at real high end settings.
blckgrffn - Wednesday, June 22, 2005 - link
You're forgiven, by me anyway :) It is also the great editorial staff that makes Anandtech my homepage on every browser on all of my boxes!
Nat
yacoub - Wednesday, June 22, 2005 - link
#72 - Totally agree. Some Rome: Total War benchmarks are much needed - but primarily to see how the game's battle performance with large numbers of troops varies between AMD and Intel more so than NVidia and ATi, considering the game is highly CPU-limited currently in my understanding.
DerekWilson - Wednesday, June 22, 2005 - link
Hi everyone,
Thank you for your comments and feedback.
I would like to personally apologize for the issues that we had with our benchmarks today. It wasn't just one link in the chain that caused the problems we had, but there were many factors that led to the results we had here today.
For those who would like an explanation of what happened to cause certain benchmark numbers not to reflect reality, we offer you the following. Some of our SLI testing was done forcing multi-GPU rendering on for tests where there was no profile. In these cases, the default multi-GPU mode caused a performance hit rather than the increase we are used to seeing. The issue was especially bad in Guild Wars, and the SLI numbers have been removed from the offending graphs. Also, on one or two titles our ATI display settings were improperly configured. Our Windows monitor properties, ATI "Display" tab properties, and refresh rate override settings were mismatched. As a result, rather than push the display at the pixel clock we expected, ATI defaulted to a "safe" mode where the game is run at the resolution requested, but only part of the display is output to the screen. This resulted in abnormally high numbers in some cases at resolutions above 1600x1200.
For those of you who don't care about why the numbers ran the way they did, please understand we are NOT trying to hide behind our explanation as an excuse.
We agree completely that the more important issue is not why bad numbers popped up, but that bad numbers made it into a live article. For this I can only offer my sincerest of apologies. We consider it our utmost responsibility to produce quality work on which people may rely with confidence.
I am proud that our readership demands a quality above and beyond the norm, and I hope that that never changes. Everything in our power will be done to assure that events like this will not happen again.
Again, I do apologize for the erroneous benchmark results that went live this morning. And thank you for requiring that we maintain the utmost integrity.
Thanks,
Derek Wilson
Senior CPU & Graphics Editor
AnandTech.com
Dmitheon - Wednesday, June 22, 2005 - link
I have to say, while I am extremely pleased with nVidia doing a real launch, the product leaves me scratching my head. They priced themselves into an extremely small market, and effectively made their 6800 series the second tier performance cards without really dropping the price on them. I'm not going to get one, but I do wonder how this will affect the company's bottom line.
OrSin - Wednesday, June 22, 2005 - link
I'm not trying to be a butthole, but can we get a benchmark that's an RTS game? I see 10+ game benchmarks and most are FPS; the few that are not might as well be. Those RPGs seem to use a similar type of engine.
stmok - Wednesday, June 22, 2005 - link
To CtK's question: Nope, SLI doesn't work with dual-display. (Last I checked, Nvidia got 2D working, but NO 3D.) Rumours say it's a driver issue, and Nvidia is working on it.
I don't know any more than that. I think I'd rather wait until Nvidia is actually demonstrating SLI with two or more displays before I lay down any money.
yacoub - Wednesday, June 22, 2005 - link
#60 - it's already to the point where it's turning people off to PC gaming, thus damaging the company's own market of buyers. It's just going to move more people to consoles, because even though PC games are often better games and much more customizable and editable, that only means so much and the trade-off versus price to play starts to become too imbalanced to ignore.
jojo4u - Wednesday, June 22, 2005 - link
What about the AF setting? I understand that it was set to 8x when AA was set to 4x?
Rand - Wednesday, June 22, 2005 - link
I have to say I'm rather disappointed in the quality of the article. A number of apparently nonsensical benchmark results, with little to no analysis of most of the results. A complete lack of any low level theoretical performance results, no attempts to measure any improvements in efficiency or what may have caused such improvements.
Temporal AA is only tested on one game with image quality examined in only one scene. Given how dramatically different games and genres utilize alpha textures, you're providing us with an awfully limited perspective of its impact.