DX10 for the Masses: NVIDIA 8600 and 8500 Series Launch
by Derek Wilson on April 17, 2007 9:00 AM EST
Posted in GPUs
The New Face of PureVideo HD
The processing requirements of the highest quality HD-DVD and Blu-ray content are non-trivial. Current midrange CPUs struggle to keep up without assistance, and older hardware simply cannot perform the task adequately. AMD and NVIDIA have been stepping in with GPU-assisted video decode acceleration. With G84, NVIDIA takes this to another level, moving well beyond simply accelerating bits and pieces of the process.
The new PureVideo hardware, VP2, is capable of offloading the entire decode process for HD-DVD and Blu-ray movies. With NVIDIA saying that 100% of the H.264 video decode process can be offloaded at up to 40 Mbit/sec on mainstream hardware, the average user will now be able to enjoy HD content on their PC (when prices on HD-DVD and Blu-ray drives fall, of course). There will still be some CPU involvement: the player still needs to run, AACS does have some overhead, and the CPU is responsible for I/O management.
This is quite a large change, even from the previous version of PureVideo. One of the most processing-intensive tasks is decoding the entropy-encoded bitstream. Entropy encoding is a method of coding that creates variable-length symbols, where the length of a symbol shrinks as the probability of encountering it grows. In other words, patterns that occur often are represented by short symbols when encoded, while less probable patterns get longer symbols. NVIDIA's BSP (bitstream processor) handles this decoding.
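To make the "frequent patterns get short symbols" idea concrete, here is a minimal sketch in Python. It builds a classic Huffman code, which is a much simpler entropy coder than the CABAC and CAVLC schemes H.264 actually uses, and it says nothing about how NVIDIA's BSP works internally. The function name `huffman_code_lengths` and the sample symbol counts are made up for illustration.

```python
import heapq


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code.

    `freqs` maps each symbol to how often it occurs. This is an
    illustrative stand-in for entropy coding in general, not the
    CABAC/CAVLC coders used by H.264.
    """
    if len(freqs) == 1:
        # A lone symbol still needs one bit to be written out.
        return {sym: 1 for sym in freqs}

    # Heap entries are (total weight, unique tie-breaker, {symbol: depth}).
    # The tie-breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol inside
        # them moves one level deeper, i.e. gains one bit.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]


# Hypothetical symbol counts: "a" is common, "d" and "e" are rare.
counts = {"a": 8, "b": 4, "c": 2, "d": 1, "e": 1}
lengths = huffman_code_lengths(counts)
total_bits = sum(counts[s] * lengths[s] for s in counts)

print(lengths)     # a gets 1 bit, b 2 bits, c 3 bits, d and e 4 bits each
print(total_bits)  # 30 bits, versus 48 bits at a fixed 3 bits per symbol
```

The catch for a decoder is that it cannot know where one symbol ends until it has decoded the bits before it, so the work is inherently serial. That is a plausible reason why this stage is costly on a CPU and a natural candidate for dedicated hardware.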
Just adding the decoding of CABAC and CAVLC bitstreams (the two types of entropy encoding supported by H.264) would have helped quite a bit, but G84 also accelerates the inverse transform step. After the bitstream is processed, the data must go through an inverse transform to recover the video stream, which must then have motion compensation and deblocking performed on it. This is a bit of an oversimplification, but 100% of the process is 100% no matter how we slice it. Here's a look at the breakdown and how CPU involvement has changed between VP1 and VP2.
We have a copy of WinDVD that supports the new hardware acceleration, and we are planning a follow-up article to investigate the real-world impact of this change. As we mentioned, even though all video decoding is accelerated on the GPU, other tasks like I/O must still be handled by the CPU. We are also interested in finding videos of more than 40 Mbit/sec to push the capabilities of the hardware and see what happens, and in discovering the cheapest, slowest processor that can effectively play back full-bandwidth HD content when paired with G84 hardware.
It is important to emphasize that HDCP is supported over dual-link DVI, allowing 8600 and 8500 hardware to play HDCP-protected content at its full resolution on any monitor capable of displaying 1920x1080. Pairing one of these cards with a Dell 30" monitor might not make sense for gamers, but for those who need maximal 2D desktop space and video playback, the 8600 GT or GTS would be a terrific option.
While it would be nice to have this hardware in NVIDIA's higher-end offerings, this technology arguably makes more sense in mainstream parts. High-end, expensive graphics cards are usually paired with high-end, expensive CPUs and lots of RAM. The decode assistance that these higher-end cards offer is more than enough to enable a high-end CPU to handle the hardest-hitting HD videos. With mainstream graphics hardware providing a huge amount of decode assistance, the lower-end CPUs that people pair with this hardware will benefit greatly.
60 Comments
erwos - Tuesday, April 17, 2007 - link
</font>I'm wondering if I can fix the disappearing text problem.
PrinceGaz - Tuesday, April 17, 2007 - link
Please remove or edit my above post to remove the (H) bit which caused a problem. I'd do it myself but we have no edit facility.
JarredWalton - Tuesday, April 17, 2007 - link
That should hopefully fix it - you just need to turn off highlighting using {/h} (with brackets instead of braces).
defter - Tuesday, April 17, 2007 - link
You need to take into account that the 7900GS will soon be discontinued, and the X1900 series will face the same fate as soon as ATI releases RV630 cards.
Cards based on previous high-end products like the 7900 and X1900 are great for consumers, but bad for ATI/NVIDIA since they have large die sizes and a 256-bit memory bus (= high board manufacturing costs).
hubajube - Tuesday, April 17, 2007 - link
I wouldn't replace my 7800GT with these but it would be fantastic for a HTPC.
PICBoy - Tuesday, April 17, 2007 - link
I think a lot of people are waiting really badly to see some DX10 benchmarks, because that's what makes G80 and G84 special.
If the 8600 GTS can't run Crysis at AT LEAST 45 FPS at 1280x1024 with full details and a moderate 4xAA then it's not worth it in my own humble opinion.
Same for the 8800 GTS 320MB, if it can't run Crysis at 60 FPS with 1280x1024 with full details and full 16xCSAA then it sucks...
BTW 8800 GTS 320MB gets near double the performance at 50% higher price and when 4xAA is enabled a little over double. Think about that everyone ;-)
Staples - Tuesday, April 17, 2007 - link
My reaction too. Do you play PC games? Very few games can be run at 60fps with full detail even with top of the line hardware. I expect the 8600GTS to get about 20fps in Crysis.
PICBoy - Tuesday, April 17, 2007 - link
The only games that I don't see getting that many fps at 1280x1024 with current mainstream hardware (7900GS) are Black & White 2, Oblivion and of course Rainbow Six Vegas. The rest of the games get 60 or more, except for Splinter Cell which gets 52, but that's almost 60 to me. Only 3 games, gentlemen, and I'm taking this info from Anandtech. If $200 can't get me decent performance at good quality in DX10 then I don't think it's worth it and an XFX 7900GS XXX would rock!
DerekWilson - Wednesday, April 18, 2007 - link
The issue is still one of the direction the industry is going. Games are going to get more graphically intense in the future, and different techniques will scale better on different hardware.
Rainbow Six: Vegas is very important, as it is an Unreal Engine 3 game -- and Epic usually does very well with licensing their engine ... It's possible many games could be based on this same code in the future, though we can't say for certain.
It's not only a question of DX10, but future DX9 games as well -- how they will be implemented, and whether more shader intensive DX9 code lends itself better to the G8x architecture or not.
gramboh - Tuesday, April 17, 2007 - link
Are you joking? 8800GTS 320 in Crysis with max details and 16x AA at 60+FPS?
I'm not expecting more than 40fps on my system at 1920x1200 less-than-max-details no aa/af (E6600 3.4GHz, 2GB ram, 8800GTS 640MB at 600/1900)