ATI Radeon X800 Pro and XT Platinum Edition: R420 Arrives
by Derek Wilson on May 4, 2004 10:28 AM EST - Posted in GPUs
Pixel Shader Performance Tests
ShaderMark v2.0 is a program designed to stress test the shader performance of modern DX9 graphics hardware using Shader Model 2.0 programs written in HLSL, run on a couple of shapes in a scene.
We haven't used ShaderMark in the past because we don't advocate trying to predict the performance of real world game code using a synthetic set of tests designed to push the hardware. Honestly, as we've said before, the only way to determine the performance of a certain program on specific hardware is to run that program on that hardware. As both software and hardware get more complex, the results of any given test become less and less generalizable, and games, graphics hardware, and modern computer systems are some of the most complex entities on earth.
So why are we using ShaderMark, you may ask? There are a couple of reasons. First, this is only meant as a ballpark test. ATI and NVIDIA both have architectures that should be able to push a lot of shader operations through. It is a fact that NV3x had a bit of a handicap when it came to shader performance. A cursory glance at ShaderMark should tell us whether that handicap carries over to the current generation of cards, and whether or not R420 and NV40 are on the same playing field. We don't want to make a direct comparison; we just want to get a feel for the situation. With that in mind, here are the benchmarks.
Shader | Radeon X800 XT PE | Radeon X800 Pro | GeForce 6800 Ultra | GeForce 6800 GT | GeForce FX 5950 U
---|---|---|---|---|---
2 | 310 | 217 | 355 | 314 | 65
3 | 244 | 170 | 213 | 188 | 43
4 | 238 | 165 | - | - | -
5 | 211 | 146 | 162 | 143 | 34
6 | 244 | 169 | 211 | 187 | 43
7 | 277 | 160 | 205 | 182 | 36
8 | 176 | 121 | - | - | -
9 | 157 | 107 | 124 | 110 | 20
10 | 352 | 249 | 448 | 410 | 72
11 | 291 | 206 | 276 | 248 | 54
12 | 220 | 153 | 188 | 167 | 34
13 | 134 | 89 | 133 | 118 | 20
14 | 140 | 106 | 141 | 129 | 29
15 | 195 | 134 | 145 | 128 | 29
16 | 163 | 113 | 149 | 133 | 27
17 | 18 | 13 | 15 | 13 | 3
18 | 159 | 111 | 99 | 89 | 17
19 | 49 | 34 | - | - | -
20 | 78 | 56 | - | - | -
21 | 85 | 61 | - | - | -
22 | 47 | 33 | - | - | -
23 | 49 | 43 | 49 | 46 | -

(A "-" indicates a shader that did not run on that card.)
These benchmarks are run with fp32 on NVIDIA hardware and fp24 on ATI hardware. It isn't really an apples-to-apples comparison, but with some of the shaders used in ShaderMark, partial precision floating point causes error accumulation (since this is a benchmark designed to stress shader performance, this is not surprising).
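This kind of error accumulation is easy to reproduce on the CPU. As a rough sketch (using NumPy's float16 type as a stand-in for the GPU's fp16 partial precision format, which shares the same 10-bit mantissa; this is our illustration, not ShaderMark's code), repeatedly accumulating values in fp16 quickly loses precision that fp32 retains:

```python
import numpy as np

# Sum 4096 ones in fp32 and in fp16. fp16 has a 10-bit mantissa,
# so integers above 2048 are spaced 2 apart: once the running sum
# reaches 2048, adding 1.0 rounds back to 2048 and the sum stalls.
ones = np.ones(4096, dtype=np.float32)

fp32_sum = np.float32(0.0)
fp16_sum = np.float16(0.0)
for v in ones:
    fp32_sum = np.float32(fp32_sum + v)
    fp16_sum = np.float16(fp16_sum + np.float16(v))

print(fp32_sum)  # 4096.0
print(fp16_sum)  # 2048.0 -- accumulated rounding error, not random noise
```

The longer the chain of dependent operations in a shader, the more each rounding step compounds, which is exactly the situation a shader stress test creates.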
ShaderMark v2.0 clearly shows a huge increase in pixel shader performance from NV38 to either flavor of NV40. Even though the results can't really be compared apples to apples (because of the difference in precision), NVIDIA manages to keep up with the ATI hardware fairly well. In fact, the diffuse lighting and environment mapping, shadowed bump mapping, and water color shaders don't show ATI wiping the floor with NVIDIA.
Looking at data collected with version 60.72 of the NVIDIA driver, no frame rates changed, and a visual inspection of the images output by each driver raised no red flags.
We would like to stress again that these are not apples-to-apples numbers, but the relative performance of each GPU indicates that the ATI and NVIDIA architectures are very nearly comparable from a pixel shader standpoint (with each architecture favoring different types of shaders or operations).
In addition to getting a small idea of performance, we can also look deep into the heart of NV40 and see what happens, in terms of performance gains, when we enable partial precision rendering mode. As we have stated before, there were a few image quality issues with the types of shaders ShaderMark runs, but this bit of analysis will stick only to how much work gets done in the same amount of time, without regard to the relative quality of the work.
Shader | GeForce 6800 U PP | GeForce 6800 GT PP | GeForce 6800 U | GeForce 6800 GT
---|---|---|---|---
2 | 413 | 369 | 355 | 314
3 | 320 | 283 | 213 | 188
5 | 250 | 221 | 162 | 143
6 | 300 | 268 | 211 | 187
7 | 285 | 255 | 205 | 182
9 | 159 | 142 | 124 | 110
10 | 432 | 389 | 448 | 410
11 | 288 | 259 | 276 | 248
12 | 258 | 225 | 188 | 167
13 | 175 | 150 | 133 | 118
14 | 167 | 150 | 141 | 129
15 | 195 | 173 | 145 | 128
16 | 180 | 161 | 149 | 133
17 | 21 | 19 | 15 | 13
18 | 155 | 139 | 99 | 89
23 | 49 | 46 | 49 | 46
The most obvious thing to notice is that, overall, partial precision mode increases shader rendering speed. Shaders 2 through 8 are lighting shaders (with 2 being a simple diffuse lighting shader). These lighting shaders (especially the point and spot light shaders) make heavy use of vector normalization. As we are running in partial precision mode, this should translate to a partial precision normalize, which is a "free" operation on NV40: almost any time a partial precision normalize is needed, NV40 can schedule the instruction immediately. This is not the case when dealing with full precision normalization, so the many 50% performance gains coming out of those lighting shaders are probably due to the partial precision normalization hardware built into each shader unit in NV40. The smaller performance gains (which, interestingly, occur on the shaders that have image quality issues) are most likely the result of decreased bandwidth requirements and decreased register pressure: a single internal fp32 register can handle two fp16 values, making scheduling and managing resources much less of a task for the hardware.
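The register pressure point is easy to illustrate outside of shader code. As a rough sketch (nothing NV40-specific here, just the IEEE half-precision format via Python's struct module), two fp16 values fit in the same four bytes as a single fp32 value, which is why storing data at partial precision halves the register footprint:

```python
import struct

# One fp32 value occupies 4 bytes.
one_fp32 = struct.pack('<f', 3.14159)
assert len(one_fp32) == 4

# Two fp16 values ('e' = IEEE 754 half precision) also occupy 4 bytes,
# so storage sized for one fp32 value can hold a pair of fp16 values.
two_fp16 = struct.pack('<2e', 1.5, -2.25)
assert len(two_fp16) == 4

# Both values survive the round trip exactly, since 1.5 and -2.25
# are representable in fp16's 10-bit mantissa.
a, b = struct.unpack('<2e', two_fp16)
print(a, b)  # 1.5 -2.25
```

Halving the storage per value means fewer live registers per shader thread, which is exactly the scheduling relief described above.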
As we work on our image quality analysis of NV40 and R420, we will be paying heavy attention to shader performance in both full and partial precision modes (as we want to look at what gamers will actually be seeing in the real world). We will likely bring ShaderMark back for those tests as well. This is a new benchmark for us, so please bear with us as we get used to its ins and outs.
95 Comments
adntaylor - Tuesday, May 4, 2004 - link
I wish they'd also tested with an nForce3 motherboard. nVidia have managed some very interesting performance enhancements on the AGP to HT tunnel that only work with nVidia graphics cards. That might have pushed the 6800 in front - who knows!

UlricT - Tuesday, May 4, 2004 - link
Hey... Though the review rocks, you guys desperately need an editor for spelling and grammar!

Jeff7181 - Tuesday, May 4, 2004 - link
This pretty much settles it. With the excellent comparison between architectures, and the benchmark scores to prove the advantages and disadvantages of each architecture... my next card will be made by ATI.

NV40 sure has a lot of potential; one might say it's ahead of its time, supporting SM 3.0 and being so programmable. However, with a product cycle of 6 months to a year, being ahead of its time is more of a disadvantage in this case. People don't care what it COULD do... people care what it DOES do... and the R420 seems to do it better. I just hope my venture into the world of ATI doesn't turn into driver hell.
NullSubroutine - Tuesday, May 4, 2004 - link
Im a fan boy for neither company, and objectively I can say the cards are equal. In some games the ATI cards are faster; in other games the Nvidia cards are faster. So which one is better depends on the game you play and the price of the card you are looking for. (Hmm, maybe motherboard companies could make 2 AGP slots...)

About the argument over PS 2.0/3.0...
2.0 cards will be able to play games with 3.0; they may not have full functionality, or they may run it slower. This remains to be seen until games begin to use 3.0. However...
The one thing bad for Nvidia in my eyes is the pixel shader quality that can be seen in Far Cry; whether this is a game or driver glitch is still unknown.
I forgot to add that I like that the ATI cards use less power; I dont want to have to pay for another PSU on top of the already high prices of video cards. I would also like to see a review again a month from now, when newer drivers come out, to see how much things have changed.
l3ored - Tuesday, May 4, 2004 - link
pschhh, did you see the unreal 3 demo? in the video i saw, it looked like it ran at about 5fps. imagine running halo on a gfx 5200. however you could run it if you were to turn off halo's PS 2 effects. i think that's how it's going to be with unreal 3

Slaanesh - Tuesday, May 4, 2004 - link
Since PS 3.0 is not supported by the X800 hardware, does this mean that those extremely impressive graphical features shown in the Unreal 3 tech demo (NV40 launch) and the soon-to-be-released good-looking PS 3.0 Far Cry update are both NOT playable on the X800?? This would be a huge disadvantage for ATi, since a lot of the upcoming top games will support PS 3.0!

l3ored - Tuesday, May 4, 2004 - link
i agree phiro, personally i think im gonna get the one that hits $200 first (may be a while)

Phiro - Tuesday, May 4, 2004 - link
Hearing about the 6850 and the other Emergency-Extreme-Whatever 6800 variants that are floating about irritates me greatly. Nvidia, you are losing your way!

Instead of spending all that time, effort, and $$ just to try to take the "speed champ" title, make your shit that much cheaper instead! If your 6800 Ultra was $425 instead of $500, that would give you a hell of a lot more market share and $$ than a stupid Emergency Edition of your top end cards... We laugh at Intel for doing it, and now you're doing it too, come fricking on...
gordon151 - Tuesday, May 4, 2004 - link
#14, I think it has more to do with the fact that those OpenGL benchmarks are based on a single engine that was never fast on ATI hardware to begin with.

araczynski - Tuesday, May 4, 2004 - link
12: personally i think the TNT line was better than the Voodoo line. I think they bought them out only to get rid of the competition, which was rather stupid, because i think they would have died out sooner or later anyway since nvidia was just better. I would guess that perhaps they bought them out cuz that gave them patent rights and they wouldn't have to worry about being sued for probably copying some of the technology :)