AMD's 65nm Preview Part 2 - The Plot Thickens (Updated with Information from AMD)
by Anand Lal Shimpi on December 21, 2006 12:12 AM EST
Posted in: CPUs
Brisbane Performance Issues Demystified: Higher Latencies to Blame
As you'll remember from Part 1, for some reason, our 65nm Athlon 64 X2 5000+ performed slower than our 90nm part. We had contacted AMD before publication of the article but didn't receive a response until after we were well underway with Part 2. AMD's explanation for the reduced performance? Higher memory latencies.
We wanted to find out exactly how much higher, so we turned to CPU-Z's latency benchmark for a quick indication of how things had changed.
CPU | CPU-Z Latency (8192KB, 128-byte) |
AMD Athlon 64 X2 5000+ (65nm) | 122 cycles (46.92 ns) |
AMD Athlon 64 X2 5000+ (90nm) | 121 cycles (46.54 ns) |
A single-cycle increase in memory access latency, or roughly 0.4ns, is a slight increase but not enough to cause the sort of performance deltas we saw in Quake 4 and Half-Life 2. Something else was amiss. Luckily, another metric reported by CPU-Z's latency test helped us understand the cause of the poor performance: L2 cache access latency.
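To make the cycle-to-nanosecond arithmetic in the table concrete, here is a minimal sketch. The 2.6GHz core clock is an assumption: the article does not state it, but it is consistent with the table, since 122 cycles divided by 46.92 ns is about 2.6 cycles per nanosecond. The function name `cycles_to_ns` is purely illustrative.

```python
# Assumed core clock for the Athlon 64 X2 5000+ (not stated in the article;
# inferred from the table, where 122 cycles / 46.92 ns is about 2.6 GHz).
CLOCK_GHZ = 2.6


def cycles_to_ns(cycles, clock_ghz=CLOCK_GHZ):
    """Convert a latency in core clock cycles to nanoseconds."""
    # One cycle lasts 1 / clock_ghz nanoseconds.
    return cycles / clock_ghz


latency_65nm = cycles_to_ns(122)  # 65nm part, memory latency from the table
latency_90nm = cycles_to_ns(121)  # 90nm part, memory latency from the table
delta_ns = latency_65nm - latency_90nm

print(f"65nm: {latency_65nm:.2f} ns")
print(f"90nm: {latency_90nm:.2f} ns")
print(f"delta: {delta_ns:.2f} ns")
```

Under that assumed clock, the one-cycle difference works out to a little under 0.4ns, which matches the figure quoted above.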
CPU | CPU-Z L2 Cache Latency | ScienceMark 2.0 L2 Cache Latency |
AMD Athlon 64 X2 5000+ (65nm) | 20 cycles | 20 cycles |
AMD Athlon 64 X2 5000+ (90nm) | 12 cycles | 12 cycles |
Updated - 1/5/07: Although AMD previously did not mention any issues with our findings, we were contacted today and informed that the latency information both ScienceMark and CPU-Z produced is incorrect. The Brisbane core's L2 latency should be 14 cycles, up from 12 cycles and not 20 cycles. This would help explain the relatively low impact on application performance that we've seen across the board. We are still waiting to hear back from AMD on a handful of other issues regarding Brisbane and will update you as soon as we have more information.
The original K8 core, in both 130nm and 90nm flavors, had a 12-cycle L2 cache. With Brisbane, as reported by both CPU-Z and ScienceMark, 65nm K8 now has a 20-cycle L2 cache. Generally speaking you move to a higher latency cache if you're planning on introducing a larger cache size, but a quick glance at AMD's roadmaps doesn't show anything larger than a 1MB L2 per core for the next year. The argument for higher clock speeds isn't valid either as the highest clock speed on AMD's roadmaps thus far is only 3.2GHz.
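For a sense of scale, the L2 figures discussed above can be compared with a short sketch. It uses only the cycle counts from the article (12 for the original K8, 20 as measured by CPU-Z and ScienceMark, 14 per AMD's later correction). The 2.6GHz clock is again an assumption inferred from the memory latency table, and the names used here are hypothetical.

```python
# Assumed core clock (inferred from the memory latency table, not stated outright).
CLOCK_GHZ = 2.6

# L2 latencies in cycles, as given in the article.
BASELINE_CYCLES = 12  # original K8, 130nm and 90nm
L2_CYCLES = {
    "Brisbane (CPU-Z / ScienceMark)": 20,
    "Brisbane (per AMD's correction)": 14,
}


def relative_increase(new, base):
    """Percentage increase of new over base."""
    return (new - base) / base * 100.0


for label, cycles in L2_CYCLES.items():
    pct = relative_increase(cycles, BASELINE_CYCLES)
    extra_ns = (cycles - BASELINE_CYCLES) / CLOCK_GHZ
    print(f"{label}: {cycles} cycles, +{pct:.1f}% vs. 12 cycles, +{extra_ns:.2f} ns")
```

The gap between the two Brisbane figures is large in relative terms: roughly a two-thirds increase if the tools are right, versus about one sixth if AMD's number is.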
Luckily the performance impact of the higher latency L2 cache isn't noticeable in all applications, thanks to the K8's on-die memory controller, but make no mistake - the new core is slower. We couldn't figure out why AMD made the change and with most of our key AMD contacts on vacation due to the holidays, we still have no official response on the matter. Rest assured that if/when we learn more we will let you know.
Updated: AMD has given us the official confirmation that L2 cache latencies have increased, and that it purposefully did so in order to allow for the possibility of moving to larger cache sizes in future parts. AMD stressed that this wasn't a pre-announcement of larger cache parts to come, but rather a preparation should the need be there to move to a vastly larger L2. Thankfully the performance delta isn't huge, at least in the benchmarks that we saw, so AMD's decision isn't too painful - especially as it comes with the benefit of a cooler running core that draws less power; ideally we'd like the best of all worlds but we'll take what we can get. Note that none of AMD's current roadmaps show any larger L2 parts (other than the usual 2x1MB offerings), which tells us one of two things: either AMD has some larger L2 parts that it's planning on releasing or AMD is being completely honest with the public in saying that the larger L2 parts will only be released if necessary.
52 Comments
MartinT - Thursday, December 21, 2006 - link
AMD - Best CPU at doing nothing.
This seems to be AMD's new mantra, no wonder given how hopelessly behind in performance and performance/Watt they are.
mino - Thursday, December 21, 2006 - link
Nicely said.
Or better:
CPU using the least power while doing nothing...
DigitalFreak - Thursday, December 21, 2006 - link
LOL
Beenthere - Thursday, December 21, 2006 - link
I doubt many PC enthusiasts place much importance on CPU power consumption. If they did Intel would never have sold any P4 chips. With video cards drawing 200+ watts per card, a 65 nano AMD chip is a sweet piece.
From my perspective, these are the first AMD 65 nano chips and like most process drops there is little performance gain just in lowering the nano size. AMD has a lot in the pipeline and as it arrives I suspect PC enthusiasts will be quite satisfied with both the CPU options and performance.
It should be pretty obvious that 99.9% of the market doesn't need faster CPUs, dual cores, quad cores, etc. until we get a decent O/S that can use these CPU features and full 64-bit function. How friggin long will we have to wait for quality 64-bit software to arrive? That is something that would help PC performance significantly, yet we've been waiting two years and the software folks have delivered almost nothing.
Sh0ckwave - Thursday, December 21, 2006 - link
You're right, enthusiasts don't care about power consumption at all. We care about performance and overclocking ability.
The average user does not need a faster CPU.
Why doesn't Anandtech write articles for enthusiasts anymore?
mino - Thursday, December 21, 2006 - link
Also, many enthusiasts work at IT depts making decisions about what architecture to go for.
I mean, for 100s/1000s PCs deployment... And believe me, there, power IS taken into account.
Final Hamlet - Thursday, December 21, 2006 - link
Quote: I doubt many PC enthusiasts place much importance on CPU power consumption. If they did Intel would never have sold any P4 chips.
That is where you are wrong. Say it after me: Million-dollar-marketing-campaign.
Not the best product wins, but the best advertised.
Think back to P4-times: Some average I-know-that-I-have-to-press-the-big-button-to-make-my-compie-start-Joe would enter a big (online) store like DELL where his only choice was a P4 - end of selection.
Asked why he should buy it he would receive something like this: It has 3 REAL GHz, other manufacturers have _only_ about 2GHz. And then he would buy.
PS I'm no AMD-fanboy. One has to clearly admit that Intel did a marvellous job with its Core2. Only reason to buy is aforementioned power consumption in idle (my PC is idle 90% of the time) and the nice low price.
Too strange. If you read hardware sites you could come to the conclusion that there are no single core CPUs anymore.
feelingshorter - Thursday, December 21, 2006 - link
Looking at those benchmarks, I think Intel won based on per/watt performance. AMD had lower watt usage but also lower performance. Given that a cpu can work harder, then be idle, I see per watt performance as the most important thing. I would have expected AMD to do better, but they did not come through.
mino - Thursday, December 21, 2006 - link
No offense, but the moment one takes into account the fact of the average PC spending >90% of time at idle, well, C2D eats X2's dust.
From an energy efficiency perspective, of course.
Accord99 - Thursday, December 21, 2006 - link
Only if the C2D gets paired with a hotter chipset. The P965 motherboards tend to use 10-20W less on idle and load.