Badaboom: A Full Test of Elemental's GPU Accelerated H.264 Transcoder
by Anand Lal Shimpi on August 18, 2008 12:00 AM EST
Posted in: GPUs
What about Performance?
If you can find a source file that Badaboom will accept and transcode fine, the process is pretty quick.
The first test I ran was to convert Chapter 3 of Bad Boys on DVD to a 5Mbps VBR .mp4 file using the Xbox 360 profile. I upscaled the video to 1280 x 720.
The Core 2 Quad Q6600 completed the test in 245 seconds using the x264 codec, outputting a file that was similar in size and quality to what Badaboom managed (the x264 file was a bit smaller, 109MB vs. 116MB, and its quality a bit better).
The entry level and midrange 8/9 series GPUs couldn’t do much better. The GeForce 9500 GT was actually slower, as were the 8500 GT and the 8600 GTS. The GeForce 8800 GT changed things though: at 103 seconds, it encoded the test in less than half the time. NVIDIA’s fastest, the GeForce GTX 280, managed it in just over 60 seconds.
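As a rough sanity check on the numbers above, here is a minimal Python sketch of the speedup arithmetic. It uses only the two exact times quoted in the text (245 seconds for the Q6600 with x264, 103 seconds for the 8800 GT); the function and constant names are hypothetical and chosen purely for illustration.

```python
def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    if baseline_seconds <= 0 or accelerated_seconds <= 0:
        raise ValueError("encode times must be positive")
    return baseline_seconds / accelerated_seconds


# Encode times quoted in the text for the 5Mbps Xbox 360 test.
Q6600_X264_SECONDS = 245.0
GEFORCE_8800_GT_SECONDS = 103.0

ratio = speedup(Q6600_X264_SECONDS, GEFORCE_8800_GT_SECONDS)
time_fraction = GEFORCE_8800_GT_SECONDS / Q6600_X264_SECONDS

print(f"8800 GT vs. Q6600: {ratio:.2f}x faster "
      f"({time_fraction:.0%} of the CPU encode time)")
```

With these inputs the 8800 GT comes out at roughly 2.4x the speed of the CPU encode, which matches the "less than half the time" description.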
Next I tried outputting a lower resolution file for use on an iPhone, encoded at 1.5Mbps. Despite the default resolution being 480 x 320, the actual output resolution was 480 x 272:
The output file was obviously smaller at 35MB, and the time to transcode went down significantly. Now our Q6600 took 36.5 seconds and the 8800 GT’s advantage was cut down; it ended up being only about 7 seconds faster (or about 28%). The GTX 280 still pulled ahead, processing the encode in just under 19 seconds.
What this chart shows is that the load on the GPU varies, much as it does in 3D games, depending on what we're doing. Just as higher resolutions tend to be more GPU bound than CPU bound, it would seem that smaller, simpler content at lower transcoding bitrates doesn't show as big an advantage. The benefit of GPU accelerated transcoding is clear, but the performance gains will vary depending on the load.
For the final test I repeated the iPhone conversion but instead of only converting Chapter 3 of the DVD I selected the first ten chapters:
In the 5Mbps Xbox 360 test, the GeForce GTX 280 ended up being around 3.5x faster than the Core 2 Quad Q9450. In the single-chapter iPhone test the advantage was reduced to 2.1x, and here the gap grows slightly to 2.2x, still not quite as high as in the original test. It looks like 2 - 4x the speed of a reasonably fast quad-core CPU is what we can expect from Badaboom if you use NVIDIA's fastest GPU.
If you look at a more reasonably priced GPU, the 9800 GTX ends up being around 2 - 3x faster than the same quad-core CPU. The value of the entry level GPUs isn't that great unless you've got a dual-core CPU; otherwise, quad-core chips will be able to encode faster and with better quality.
Next up I wanted to see how fast of a CPU we needed to keep the GeForce GTX 280 fed in its most CPU-bound test, the single-chapter iPhone conversion:
This graph should make NVIDIA pretty happy: you only really need a Core 2 Duo E4500 to keep the GTX 280 fed, resulting in performance better than any quad-core Intel CPU can offer. The upside to GPU accelerated video transcoding is huge; we just need a better app to deliver it.
38 Comments
Staples - Monday, August 18, 2008
From the intro:
Medical imaging and scientific analysis benefitted tremendously from GPU acceleration, but it's rare that you are a gamer with a $400 GPU is going to be searching for oil deposits in his/her spare time on the same machine.
Dobs - Tuesday, August 19, 2008
Perhaps you can help me understand what Medical Imaging has to do with searching for oil deposits?
Staples - Monday, August 18, 2008
Or maybe that should be: a typical gamer
Probably the latter.
Doormat - Monday, August 18, 2008
"I want a CUDA enabled version of x264"
Amen to that. Plus possibly a WPF version of Handbrake to make it look more elegant. I could care less about video preview.
Also, does BadaBoom support reading from ISOs or do I have to mount with DaemonTools?
I have a Q9450 OC'd to 3.2GHz, so I'm pretty happy with my x264 performance. My iPhone movies are usually done in about 3x realtime (90 minute movie in 30 minutes) at 700-900kbit/s, and the PS3/360 movies are done a little bit quicker (since there is no resizing going on, just transcoding).
Anand Lal Shimpi - Tuesday, August 19, 2008
Badaboom doesn't support reading from ISOs, you have to mount with DT.
-A
Manabu - Monday, August 18, 2008
>> "I want a CUDA enabled version of x264"
It was already tried: http://forum.doom9.org/showthread.php?t=139158
Dark Shikari (x264 developer) said:
"Given my experience so far in trying to port the motion search to CUDA, and Avail's hiring of a contractor to attempt to do so, I'd put the quote for porting the whole encoder somewhere on the level of a few million dollars... if you can even find people willing and able to do it."
"GPU encoding has a lot of potential, but it has a lot of weaknesses too. Its a bit like programming for a Cell or an FPGA, except exponentially more of a nightmare."
EvilBob - Monday, August 18, 2008
page 6 appears to have the wrong figure - according to the text, it should show energy use information, but the table currently rendering shows the badaboom regular v. pro comparison.
sideshow23bob - Monday, August 18, 2008
Isn't the product name Badaboom maybe a Fifth Element reference considering the company has the name Elemental in its name. Just a guess. If that's the case it's slightly cooler.