Badaboom: A Full Test of Elemental's GPU Accelerated H.264 Transcoder
by Anand Lal Shimpi on August 18, 2008 12:00 AM EST- Posted in
- GPUs
What about Performance?
If you can find a source file that Badaboom will accept and transcode fine, the process is pretty quick.
The first test I ran was to convert Chapter 3 of Bad Boys on DVD to a 5Mbps VBR .mp4 file using the Xbox 360 profile. I upscaled the video to 1280 x 720.
The Core 2 Quad Q6600 completed the test in 245 seconds using the x264 codec, outputting a file similar in size and quality to what Badaboom managed (the x264 file was a bit smaller, 109MB vs. 116MB, and its quality a bit better).
The entry-level and midrange 8/9 series GPUs couldn’t do much better. The GeForce 9500 GT was actually slower, as were the 8500 GT and the 8600 GTS. The GeForce 8800 GT changed things, though: at 103 seconds, it encoded the test in less than half the time. NVIDIA’s fastest, the GeForce GTX 280, managed it in just over 60 seconds.
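To make the "less than half the time" claim concrete, here is a minimal sketch of the arithmetic using only the two exact timings quoted above (245 seconds for the Q6600, 103 seconds for the 8800 GT). The function and variable names are made up for illustration. The GTX 280 is left out because its time is only given as "just over 60 seconds"; the comment shows the upper bound that figure implies.

```python
# Timings quoted in the text, in seconds (720p Xbox 360 profile test).
CPU_Q6600_SECONDS = 245.0
GPU_8800GT_SECONDS = 103.0


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate finished than the baseline."""
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / candidate_seconds


ratio_8800gt = speedup(CPU_Q6600_SECONDS, GPU_8800GT_SECONDS)
print(f"GeForce 8800 GT vs. Q6600: {ratio_8800gt:.2f}x")

# "Just over 60 seconds" for the GTX 280 means its speedup over the Q6600
# is somewhat below this bound; the exact figure is not given in the text.
gtx280_upper_bound = speedup(CPU_Q6600_SECONDS, 60.0)
print(f"GeForce GTX 280 vs. Q6600: under {gtx280_upper_bound:.2f}x")
```

Any ratio above 2.0 corresponds to finishing in less than half the baseline time, which is the case for the 8800 GT here.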
Next I tried outputting a lower resolution file for use on an iPhone, encoded at 1.5Mbps. Despite the default resolution being 480 x 320 the actual output resolution was 480 x 272:
The output file was obviously smaller at 35MB, and the time to transcode went down significantly. Now our Q6600 took 36.5 seconds and the 8800 GT’s advantage was cut down; it ended up being only about 7 seconds faster (or about 28%). The GTX 280 still pulled ahead, processing the encode in just under 19 seconds.
What this chart shows is that the load on the GPU varies, much as it does in 3D games, depending on what we're doing. Just as higher resolutions tend to be more GPU bound than CPU bound, it would seem that smaller, simpler content at lower transcoding bitrates doesn't show as big an advantage. The benefit of GPU accelerated transcoding is clear, but the performance gains will vary depending on the load.
For the final test I repeated the iPhone conversion but instead of only converting Chapter 3 of the DVD I selected the first ten chapters:
In the 5Mbps Xbox 360 test the GeForce GTX 280 ended up being around 3.5x faster than the Core 2 Quad Q9450. In the single-chapter iPhone test the advantage was reduced to 2.1x, and here we find that the gap grows slightly to 2.2x, still not quite as high as in the original test. It's looking like a range of 2 - 4x the speed of a reasonably fast quad-core CPU is what we can expect from Badaboom if you use NVIDIA's fastest GPU.
If you look at a more reasonably priced GPU, the 9800 GTX ends up being around 2 - 3x faster than the same quad-core CPU. The value of the entry-level GPUs isn't that great unless you've got a dual-core CPU; otherwise, quad-core chips will be able to encode faster and with better quality.
Next up I wanted to see how fast of a CPU we needed to keep the GeForce GTX 280 fed in its most CPU-bound test, the single-chapter iPhone conversion:
This graph should make NVIDIA pretty happy: you only really need a Core 2 Duo E4500 to keep the GTX 280 fed, resulting in performance better than any quad-core Intel CPU can offer. The upside to GPU accelerated video transcode is huge; we just need a better app to deliver it.
38 Comments
JarredWalton - Monday, August 18, 2008
Wait... did you just talk about clean code and Cyberlink with a straight face!? I think every new version of PowerDVD gets worse, and I've had way too many difficulties with Blu-ray playback and their software (especially the OEM bundled version). Still, maybe they'll get it right with the ATI transcoding. And maybe I'll win the lottery.... :-)
prodystopian - Monday, August 18, 2008
Mike Lowry: ...It's a Limited Edition.
Marcus Burnett: You d*mn right it's limited. No cup holder, no back seat...
Yes, I registered to post this.
Anand Lal Shimpi - Tuesday, August 19, 2008
ahh, I love that movie. Too bad the sequel was such a letdown.
10 points to you my friend :)
-A
Manabu - Monday, August 18, 2008
You used too slow profiles. According to the handbrake site (http://trac.handbrake.fr/wiki/BuiltInPresets), the Blind profile should be 4 times faster than the iPhone profile used here. Then, a quad-core leaves the GTX 280 smoking behind. The quality should be then comparable.
Further discussion of this new encoder, including by x264 developers: http://forum.doom9.org/showthread.php?t=136847
mongoosesRawesome - Tuesday, August 19, 2008
Anand compares an x264 setting that is higher quality than badaboom's. He should have stopped right there, but instead he publishes numbers that show that badaboom is faster.
You can't compare speeds if they aren't of similar quality!
thebackwash - Monday, August 18, 2008
I must admit I never understood the consumer desire for anything more than reasonable multimedia encoding times. If I buy a new movie, and want to rip it to my computer, I only have to do it *once.* To some, any speedup they can get is well worth the price, but I honestly don't care how long it takes, as long as it's less than, say overnight, or even overnight plus whatever extra time it needs until I get home from work the next day.
I understand the desire for faster computation of tasks that involve a lot of user interaction: games, web browsing, office applications, and basically the whole lot of interactive GUI-driven programs, but I never saw the draw of blazingly fast set-it-and-forget-it type computations. I can leave the computer on overnight to perform a task if need be. I personally care about quality, and whether the file can be played back in real time on the target platform. File size is important, too, but with 1TB hard drives coming in at about $125, that has started to matter a lot less.
While I *do* understand why this could provide enormous benefit to professionals working with video, any consumer of DVD movies or amateur videographer should be more than happy with what we have now. I don't see the outcry for faster word processors, and that's because computers perform that function well enough to be usable by consumers or amateurs for whom time is *not* money when it comes to using their computers.
I must admit though, I can take a chill pill and leave the computer for days at a time, as long as my RSS reader catches the daily web updates, so I might not be the average reader of tech sites.
(Once it took my old iBook *ten* days to compile KDE 3.5 from source!)
icrf - Monday, August 18, 2008
Well, transcoding to a master high quality copy for long term storage, maybe. But when you want to take those with you on a portable device, you have to transcode. I'm not a fan of having multiple copies of things, despite the cost of hard drives, so I'd much rather a way to speedily convert that for me.
My problem is I want to convert in bulk, which means either a nice job manager in your GUI, or a documented CLI for the app.
LTG - Monday, August 18, 2008
Does it support 2-pass encoding?
Does it encode uncompressed AVI?
Did they say if Main profile is coming, or if it's stuck like that?
Anand Lal Shimpi - Tuesday, August 19, 2008
There are no options to control the number of passes the encoder does; this is simply a single-pass transcode that can happen faster than real time depending on your GPU.
Depending on the format of the video stream it may be able to support it.
Elemental is considering adding Main profile support to Badaboom, but right now it's reserved for the Premiere plugin.
-A
erikejw - Monday, August 18, 2008
Good article otherwise.
If you're gonna sit all day and code 100 movies or whatever, this is the appropriate way to calculate energy consumption.
If not, you have to include the extra seconds when your computer sits idle and the CPU transcode finishes.
This is how they do it when they calculate server energy consumption.
It is not like the computer instantly goes down to 0 W when the coding is done.