AMD's Radeon HD 6990: The New Single Card King
by Ryan Smith on March 8, 2011 12:01 AM EST
Crysis: Warhead
Kicking things off as always is Crysis: Warhead, still one of the toughest games in our benchmark suite. Even 3 years since the release of the original Crysis, “but can it run Crysis?” is still an important question, and for 3 years the answer was “no.” However as we’ll see the 6990 changes that: full Enthusiast settings at a playable framerate is finally in the grasp of a single card.
It should come as no surprise that with the 6990, AMD has hit a few different important marks on Crysis for a single card thanks to the card’s near-6970CF performance. As far as our traditional 2560 benchmark goes, the 6990 cracks 60fps, meaning we can finally play Crysis at a perfectly smooth framerate at 2560 with our tweaked settings on what is more or less a single video card. Perhaps more importantly however, performance is to the point where Crysis in full enthusiast mode is now a practical benchmark. Thanks in large part to the extra VRAM here, the 6990 tops the 5970 by nearly 30%, coming in at 42.8fps. This is still a bit low for a completely smooth framerate, but it is in fact playable, which is more than we can say for the 5970.
Overall Crysis does a good job setting the stage here for most of our benchmark suite: the performance of the card is consistently between the 6950CF and 6970CF, hovering much closer to the former. Compared to NVIDIA’s offerings the 6990 is solidly between the GTX 580 and GTX 580SLI, owing to the fact that NVIDIA doesn’t have a comparable card. The GTX 580SLI is faster, but the 580 is also still the fastest single-GPU card on the market, meaning it commands a significant price premium.
Overclocking to uber mode, however, shows only minimal gains: the theoretical maximum gain is only 6% and the real-world benefit is less, so uber mode alone will never have a big payoff.
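As a rough sketch of where a ceiling of about 6% comes from, the snippet below compares the two core clocks. The 830MHz default and 880MHz uber mode figures are assumptions here, since this section does not state them, and the calculation assumes performance scales at best linearly with core clock.

```python
# Assumed core clocks (not stated in this section of the article).
DEFAULT_CORE_MHZ = 830  # assumed default-mode core clock
UBER_CORE_MHZ = 880     # assumed uber-mode core clock


def max_gain_percent(base_mhz: float, overclocked_mhz: float) -> float:
    """Upper bound on the speedup, assuming perfect scaling with core clock."""
    return (overclocked_mhz / base_mhz - 1.0) * 100.0


gain = max_gain_percent(DEFAULT_CORE_MHZ, UBER_CORE_MHZ)
print(f"Theoretical maximum gain: {gain:.1f}%")
```

Real-world gains land below this bound because memory clocks are unchanged and not every frame is limited by the GPU core.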
As far as minimum framerates are concerned the story is similar. For some reason the 6990 underperforms the 6950CF here by a frame or two per second, which given the 6990’s mostly superior specs leads us to believe that it’s a limitation of PCIe bus bandwidth. Meanwhile we can clearly see the benefits of more than 1GB of VRAM per GPU here: the 6990 walks all over the 5970.
130 Comments
smookyolo - Tuesday, March 8, 2011 - link
My 470 still beats this at compute tasks. Hehehe.
And damn, this card is noisy.
RussianSensation - Tuesday, March 8, 2011 - link
Not even close, unless you are talking about outdated distributed computing projects like Folding@Home code. Try any of the modern DC projects like Collatz Conjecture, MilkyWay@home, etc. and a single HD4850 will smoke a GTX580. This is because Fermi cards are limited in double precision to 1/8th of their single-precision performance.
In other words, an HD6990 which has 5,100 Gflops of single-precision performance will have 1,275 Gflops of double-precision performance (since AMD allows for 1/4th of its SP). In comparison, the GTX470 has 1,089 Gflops of SP performance which only translates into 136 Gflops in DP. Therefore, a single HD6990 is 9.4x faster in modern computational GPGPU tasks.
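The arithmetic in the comment above can be reproduced with a short sketch. It uses only the single-precision figures and DP:SP ratios quoted in that comment; these are theoretical peaks, not measured results, and the variable and function names are illustrative.

```python
# Theoretical single-precision peaks as quoted in the comment (Gflops).
HD6990_SP_GFLOPS = 5100.0
GTX470_SP_GFLOPS = 1089.0

# Double-precision rate as a fraction of single precision, per the comment.
AMD_DP_RATIO = 1 / 4
FERMI_GEFORCE_DP_RATIO = 1 / 8


def dp_gflops(sp_gflops: float, dp_ratio: float) -> float:
    """Theoretical double-precision throughput from the SP peak and DP:SP ratio."""
    return sp_gflops * dp_ratio


hd6990_dp = dp_gflops(HD6990_SP_GFLOPS, AMD_DP_RATIO)
gtx470_dp = dp_gflops(GTX470_SP_GFLOPS, FERMI_GEFORCE_DP_RATIO)
speedup = hd6990_dp / gtx470_dp

print(f"HD 6990 DP: {hd6990_dp:.0f} Gflops")
print(f"GTX 470 DP: {gtx470_dp:.0f} Gflops")
print(f"Theoretical DP advantage: {speedup:.1f}x")
```

The ratio holds only for workloads that reach peak throughput on both architectures, which is the caveat the replies below raise.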
palladium - Tuesday, March 8, 2011 - link
Those are just theoretical performance numbers. Not all programs *even newer ones* can effectively extract ILP from AMD's VLIW4 architecture. Those that can will no doubt be faster; others that can't would be slower. As far as I'm aware lots of programs still prefer nV's scalar arch but that might change with time.
MrSpadge - Tuesday, March 8, 2011 - link
Well.. if you can only use 1 of 4 VLIW units in DP then you don't need any ILP. Just keep the threads in flight and it's almost like nVidia's scalar architecture, just with everything else being different ;)
MrS
IanCutress - Tuesday, March 8, 2011 - link
It all depends on the driver and compiler implementation, and the guy/gal coding it. If you code the same but the compilers are generations apart, then the compiler with the higher generation wins out. If you've had more experience with CUDA based OpenCL, then your NVIDIA OpenCL implementation will outperform your ATI Stream implementation. Pick your card for its purpose. My homebrew stuff works great on NVIDIA, but I only code for NVIDIA - same thing for big league compute directions.
stx53550 - Tuesday, March 15, 2011 - link
off yourself idiot
m.amitava - Tuesday, March 8, 2011 - link
".....Cayman’s better power management, leading to a TDP of 37W"- is it honestly THAT good? :P
m.amitava - Tuesday, March 8, 2011 - link
oops...re-read...that was idle TDP !!
MamiyaOtaru - Tuesday, March 8, 2011 - link
my old 7900gt used 48 at load
D:
Don't like the direction this is going. In GPUs it's hard to see any performance advances that don't come with equivalent increases in power usage, unlike what Core 2 was compared to Pentium4.
Shadowmaster625 - Tuesday, March 8, 2011 - link
Are you kidding? I have a 7900GTX I don't even use, because it fried my only spare large power supply. A 5670 is twice as fast and consumes next to nothing.