Apple Makes the Switch: iMac G5 vs. iMac Core Duo
by Anand Lal Shimpi on January 30, 2006 11:26 PM EST - Posted in Mac
Next up, we'll look at floating point performance.
Flops, programmed by Al Aburto, is a very floating-point intensive benchmark. Analyses show that this benchmark contains:
70% floating point instructions;
only 4% branches; and
only 34% of instructions are memory instructions.
Note that some of those 70% FP instructions are also memory instructions. Benchmarking with Flops is not real world, but isolates the FPU power.
Al Aburto, about Flops:
"Flops.c is a 'C' program which attempts to estimate your systems floating-point 'MFLOPS' rating for the FADD, FSUB, FMUL, and FDIV operations based on specific 'instruction mixes' (see table below). The program provides an estimate of PEAK MFLOPS performance by making maximal use of register variables with minimal interaction with main memory. The execution loops are all small so that they will fit in any cache."
Flops shows the maximum double precision power that the core has, by making sure that the program fits in the L1-cache. Flops consists of 8 tests, and each test has a different, but well known instruction mix. The most frequently used instructions are FADD (addition), FSUB (subtraction) and FMUL (multiplication).
MOD | FADD | FSUB | FMUL | FDIV | iMac G5 1.9GHz (MFLOPS) | iMac Core Duo 1.83GHz (MFLOPS) |
1 | 50% | 0% | 43% | 7% | 705 | 876 |
2 | 43% | 29% | 14% | 14% | 490 | 366 |
3 | 35% | 12% | 53% | 0% | 2213 | 1216 |
4 | 47% | 0% | 53% | 0% | 1349 | 1178 |
5 | 45% | 0% | 52% | 3% | 868 | 1109 |
6 | 45% | 0% | 55% | 0% | 1509 | 1291 |
7 | 25% | 25% | 25% | 25% | 341 | 235 |
8 | 43% | 0% | 57% | 0% | 1440 | 1264 |
Average | | | | | 1114 | 942 |
One of the G5's strengths is in its floating point performance, and here, we see an example of that as it holds an 18% performance advantage over the Core Duo on average. This does complicate the performance scene, as the move to Core Duo isn't necessarily going to be a clean victory for Apple today.
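As a quick check on the arithmetic, the averages and the 18% figure can be reproduced from the per-module scores in the Flops table. The snippet below is only an illustrative sketch: the numbers are copied from the table, while the names `results` and `mean` are made up for this example and are not part of Flops.

```python
# MFLOPS scores copied from the Flops table:
# module -> (iMac G5 1.9GHz, iMac Core Duo 1.83GHz)
results = {
    1: (705, 876),
    2: (490, 366),
    3: (2213, 1216),
    4: (1349, 1178),
    5: (868, 1109),
    6: (1509, 1291),
    7: (341, 235),
    8: (1440, 1264),
}


def mean(values):
    """Plain arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


g5_avg = mean(g5 for g5, _ in results.values())
core_duo_avg = mean(cd for _, cd in results.values())

# Relative advantage of the G5 average over the Core Duo average, in percent.
advantage = (g5_avg / core_duo_avg - 1) * 100

# Modules where the Core Duo is actually ahead of the G5.
core_duo_wins = [mod for mod, (g5, cd) in results.items() if cd > g5]

print(f"G5 average:       {g5_avg:.0f}")
print(f"Core Duo average: {core_duo_avg:.0f}")
print(f"G5 advantage:     {advantage:.0f}%")
print(f"Core Duo ahead in modules: {core_duo_wins}")
```

The per-module view is worth a glance: the Core Duo leads in modules 1 and 5, so the G5's lead is an average over the eight instruction mixes rather than a sweep.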
The last architectural performance test was the Queens benchmark, which does a great job of measuring the performance of a CPU's branch predictor.
To test the branch prediction, we used the benchmark "Queens". Queens is a very well-known problem where you have to place n chess Queens on an n x n board. The catch is that no Queen may be able to attack another. The exhaustive search for a placement in which no two Queens attack each other is the algorithm behind this benchmark, and it contains some very branch-intensive code.
Queens has about:
23% branches
45% memory instructions
No FP operations
On a PIII, the Branch misprediction rate is up to 19%! (Typical: 9%) Queens runs perfectly in the L1-cache.
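To make the "branch intensive" claim concrete, here is a minimal sketch of an exhaustive n-Queens search. It is not the source of the Queens benchmark used in this article; the function name `count_solutions` and the choice to count every solution (rather than stop at the first) are assumptions made for illustration. The point is the inner `if`: whether a square is attacked depends entirely on the earlier placements, which is the kind of data-dependent branch a predictor struggles with.

```python
def count_solutions(n):
    """Count all placements of n non-attacking queens on an n x n board."""
    cols = [False] * n                  # column already holds a queen
    diag_down = [False] * (2 * n - 1)   # diagonal indexed by row - col + n - 1
    diag_up = [False] * (2 * n - 1)     # anti-diagonal indexed by row + col

    def place(row):
        # Every row filled without conflict: one complete solution.
        if row == n:
            return 1
        total = 0
        for col in range(n):
            d1 = row - col + n - 1
            d2 = row + col
            # Data-dependent branch: is this square attacked by an earlier queen?
            if cols[col] or diag_down[d1] or diag_up[d2]:
                continue
            # Place the queen, recurse into the next row, then backtrack.
            cols[col] = diag_down[d1] = diag_up[d2] = True
            total += place(row + 1)
            cols[col] = diag_down[d1] = diag_up[d2] = False
        return total

    return place(0)


print(count_solutions(8))  # prints 92
```

The example uses an 8 x 8 board to keep the run short; in pure Python, exhaustively counting solutions for larger boards takes far longer. The working set is three small arrays, which is consistent with the remark that Queens runs entirely out of the L1 cache.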
As Johan mentioned in his article, it seemed as if a good branch predictor was very important to the chip's designers. The necessity for a good branch predictor is also evident when you look at how long it takes the G5 to access main memory. For this test, we looked at Queens performance with 16 queens on the chessboard:
The G5 completely dominates the Core Duo here. With a relatively short pipeline, not as much attention is usually paid to branch prediction as on a chip with a longer pipe.
35 Comments
Illissius - Tuesday, January 31, 2006 - link
Compared to native applications, obviously, it's less than ideal; on the other hand, compared to, say, PearPC, it's pretty amazing. (I don't have any data and haven't tried it myself, but from what I've heard I'd suspect it runs at 5%-ish performance; compared to that, 30-70% is a minor miracle.)
I know it won't interest the end user any whether it could've been even worse, but wanted to point it out, nonetheless ;).
yacoub - Tuesday, January 31, 2006 - link
I wonder how it compares in game- oh, right, Mac. Hehehe ;)
DrZoidberg - Tuesday, January 31, 2006 - link
there is one very popular game on mac. World of warcraft.... could anandtech pls include a benchie comparing mac with intel core duo vs g5 in wow? It would be interesting to see if apple switching to intel means macs are better at games (or not).
fitten - Tuesday, January 31, 2006 - link
Is the Universal Binary out for WoW yet?
Cusqueno - Tuesday, January 31, 2006 - link
I have a 20" iMac Core Duo and with the default 512 RAM it was bad performance. About 5-10 fps in IronForge and 20-25 elsewhere. When I upgraded to 2 GB RAM it has improved greatly, maybe 10 - 20 in IF and 30 - 40 on the road. I guess this is due to Rosetta using lots of RAM.
As of last night, there was no Universal binary. But today is patch/reboot day so might be pushed when I get off work. It is supposed to be included with version 1.9.3 according to the WoW forums.
fitten - Thursday, February 2, 2006 - link
That's pretty awesome considering that you're running WoW in emulation (Rosetta).
vortmax - Tuesday, January 31, 2006 - link
Seeing that Rosetta is needed for all MS and Adobe apps. and since using Rosetta seems to take lots of memory, it would be nice to see how it runs with 1gb. Also, some benchmarks from Photoshop would be nice :)
Thanks Anand!
Lifted - Tuesday, January 31, 2006 - link
"... but those are the ones we want to measure anyways so they have to be there."
Eug - Tuesday, January 31, 2006 - link
Does turning off one core turn off half the cache?
ie. Is it really Yonah Core Solo, or is it Yonah Celeron M?
maconlysource - Wednesday, February 1, 2006 - link
Where did you get the toolbar single proc- dual proc utility.
I installed the developer pkg on my Intel iMac but can't find it?
Can you email me it?
Thanks.
Pete.