AMD's 65nm Preview Part 2 - The Plot Thickens (Updated with Information from AMD)
by Anand Lal Shimpi on December 21, 2006 12:12 AM EST - Posted in CPUs
Brisbane Performance Issues Demystified: Higher Latencies to Blame
As you'll remember from Part 1, for some reason, our 65nm Athlon 64 X2 5000+ performed slower than our 90nm part. We had contacted AMD before publication of the article but didn't receive a response until after we were well underway with Part 2. AMD's explanation for the reduced performance? Higher memory latencies.
We wanted to investigate exactly how much higher, thus we turned to CPU-Z's latency benchmark to give us a quick indication of how things had changed.
CPU | CPU-Z Latency (8192KB, 128-byte) |
AMD Athlon 64 X2 5000+ (65nm) | 122 cycles (46.92 ns) |
AMD Athlon 64 X2 5000+ (90nm) | 121 cycles (46.54 ns) |
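The nanosecond figures in the table follow from dividing the cycle count by the core clock. As a minimal sketch, assuming the Athlon 64 X2 5000+ runs at 2.6GHz (the table does not state the clock, and the names below are purely illustrative):

```python
# Assumption: both 5000+ parts run at a 2.6 GHz core clock.
CLOCK_GHZ = 2.6


def cycles_to_ns(cycles, clock_ghz=CLOCK_GHZ):
    """Convert a latency in core clock cycles to nanoseconds."""
    # One cycle lasts 1 / clock_ghz nanoseconds when the clock is in GHz.
    return cycles / clock_ghz


# Cycle counts taken from the CPU-Z table above.
for label, cycles in (("65nm", 122), ("90nm", 121)):
    print(f"{label}: {cycles} cycles = {cycles_to_ns(cycles):.2f} ns")

# Difference between the two parts, in nanoseconds.
delta_ns = cycles_to_ns(122) - cycles_to_ns(121)
print(f"delta: {delta_ns:.2f} ns")
```

Under that assumption, one extra cycle costs a little under 0.4ns, which matches the gap between the two rows.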
A single-cycle increase in memory access latency, or roughly 0.4ns, is slight and not enough to cause the sort of performance deltas we saw in Quake 4 and Half Life 2; something else was amiss. Luckily, another metric reported by CPU-Z's latency test helped us understand the cause of the poor performance: L2 cache access latency.
CPU | CPU-Z L2 Cache Latency | ScienceMark 2.0 L2 Cache Latency |
AMD Athlon 64 X2 5000+ (65nm) | 20 cycles | 20 cycles |
AMD Athlon 64 X2 5000+ (90nm) | 12 cycles | 12 cycles |
Updated - 1/5/07: Although AMD previously did not mention any issues with our findings, we were contacted today and informed that the latency information both ScienceMark and CPU-Z produced is incorrect. The Brisbane core's L2 latency should be 14 cycles (up from 12 cycles), not 20 cycles. This would help explain the relatively low impact on application performance that we've seen across the board. We are still waiting to hear back from AMD on a handful of other issues regarding Brisbane and will update you as soon as we have more information.
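To put the competing L2 figures in perspective, the rough sketch below expresses each cycle count as a time and as a relative increase over the 12-cycle baseline. It assumes a 2.6GHz core clock for the 5000+, and the helper names are hypothetical:

```python
# Assumption: 2.6 GHz core clock for the Athlon 64 X2 5000+.
CLOCK_GHZ = 2.6
BASELINE_CYCLES = 12  # L2 latency of the 90nm part


def cycles_to_ns(cycles, clock_ghz=CLOCK_GHZ):
    """Convert a latency in core clock cycles to nanoseconds."""
    return cycles / clock_ghz


def percent_increase(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


# The two candidate figures for Brisbane's L2 latency.
candidates = (
    ("reported by CPU-Z and ScienceMark", 20),
    ("corrected figure from AMD", 14),
)

for label, cycles in candidates:
    ns = cycles_to_ns(cycles)
    pct = percent_increase(BASELINE_CYCLES, cycles)
    print(f"{label}: {cycles} cycles, {ns:.2f} ns, +{pct:.1f}% vs 12 cycles")
```

The gap between the two candidates is large: going from 12 to 20 cycles is an increase of about two thirds, while 12 to 14 is about one sixth, which is more in line with the modest slowdowns seen in the benchmarks.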
The original K8 core, in both 130nm and 90nm flavors, had a 12-cycle L2 cache. With Brisbane, as reported by both CPU-Z and ScienceMark, 65nm K8 now has a 20-cycle L2 cache. Generally speaking you move to a higher latency cache if you're planning on introducing a larger cache size, but a quick glance at AMD's roadmaps doesn't show anything larger than a 1MB L2 per core for the next year. The argument for higher clock speeds isn't valid either as the highest clock speed on AMD's roadmaps thus far is only 3.2GHz.
Luckily the performance impact of the higher latency L2 cache isn't noticeable in all applications, thanks to the K8's on-die memory controller, but make no mistake - the new core is slower. We couldn't figure out why AMD made the change and with most of our key AMD contacts on vacation due to the holidays, we still have no official response on the matter. Rest assured that if/when we learn more we will let you know.
Updated: AMD has given us the official confirmation that L2 cache latencies have increased, and that it purposefully did so in order to allow for the possibility of moving to larger cache sizes in future parts. AMD stressed that this wasn't a pre-announcement of larger cache parts to come, but rather a preparation should the need be there to move to a vastly larger L2. Thankfully the performance delta isn't huge, at least in the benchmarks that we saw, so AMD's decision isn't too painful - especially as it comes with the benefit of a cooler running core that draws less power; ideally we'd like the best of all worlds but we'll take what we can get. Note that none of AMD's current roadmaps show any larger L2 parts (other than the usual 2x1MB offerings), which tells us one of two things: either AMD has some larger L2 parts that it's planning on releasing or AMD is being completely honest with the public in saying that the larger L2 parts will only be released if necessary.
52 Comments
Schugy - Thursday, December 21, 2006
Being able to sell more chips is not an argument for consumers but for AMD. Brisbane is not like Prescott - AMD has done a good job. Further development is needed, but the first 65nm units are running and are the basis for new architectures with increased transistor count.
Yoshi911 - Thursday, December 21, 2006
Hey all, I know that Socket 939 is obsolete now but I think it'd be awesome if they'd make a 939 65nm core. I still have my Opteron 144 at 3.1GHz on my Lanparty board and would love to see a core I could update to before the next-gen AMD architecture makes it out.
Anyone know if this is a possibility?
Spoelie - Thursday, December 21, 2006
Get a 165 for $150, overclock it to at least 2.8GHz and you have FX-62-like performance.
That's the best thing you will ever get on Socket 939 I'm afraid, now and in the future.
peldor - Thursday, December 21, 2006
Practically, it's never gonna happen. The market wouldn't be worth the effort.
OcHungry - Thursday, December 21, 2006
I don’t understand why anyone or any review expects stellar overclocking or performance from these 65nm parts.
Did AMD promise any? No. AMD promised a transition to 65nm, and on time. That’s what we all should expect, and we should appreciate the successful transition.
Do you remember the first batch of Intel's 65nm Core 2s? It was not as good as what you see today. Frankly I think AMD did much better in 65nm than Intel back then, and this first release is giving Core 2 Duo's matured chip a run for the money. After all, the review here clearly shows AMD is on track with 65nm performance per watt and energy consumption. Don’t forget it's still the K8 architecture competing with the latest and greatest of Intel's.
IntelUser2000 - Thursday, December 21, 2006
Which first batches?? The ones XS has been receiving far before the official Core 2 Duo release?? What's the OC that AT got??
http://www.anandtech.com/cpuchipsets/showdoc.aspx?...
X6800 went from 2.93GHz to 3.6GHz with default voltage. On a very good air cooler and voltage increased, it reached 4.0GHz.
E6700: 2.667GHz to 3.4 default, 3.9 highest
E6600: 4.0GHz highest
X6800 stock cooler highest: 3.4GHz
Tomshardware: X6800 to 3.46GHz
Xbitlabs: X6800 to 3.4GHz, 3.6GHz with +voltage
Brisbane 5000+
2.6GHz to 2.925GHz, on stock cooler, 1.475V.
It's not that bad for Brisbane IMO. It seems more like an architectural limitation than process or thermal limitation. Core 2 Duo still has ways to go and roadmaps sort of reflect it. Though the increase in L2 access latencies may mean it was done to increase the clock speed potential.
peldor - Thursday, December 21, 2006
Going to 65nm shouldn't move you backwards in performance though. There's no excuse for that from the consumer's POV unless the price also goes down (certainly a possibility if yields are good).
ydoucensor - Thursday, December 21, 2006
Could the increase in latencies have something to do with "trusted" computing and the need for attestation?
fitten - Thursday, December 21, 2006
Pure speculation, but the L2 latency increase may be a result of work going into the three-level cache controller logic getting ready for K8L or whatever it's going to be.
mino - Thursday, December 21, 2006
My thoughts exactly.