Duron and Athlon

I won't bother going into details of the early Athlon and Duron processors. They were great in their day, but they're getting to be rather long in the tooth. If there is a strong demand for more details on these processors, I will add them at a later point, but for now I simply recommend that you bite the bullet and upgrade.

For those interested in some historical information, here are a few more tidbits. The early Argon, Pluto and Orion Athlon chips had L2 cache chips contained within the Slot A cartridge. This cache could run at 1/2, 2/5, or 1/3 of the core clock speed - the faster the core, the lower the ratio. This led to situations where, for example, a 700 MHz Athlon with 350 MHz L2 would outperform the more expensive 750/300, or the 850/340 would beat the 900/300 due to the slower cache. Generally speaking, performance comparisons between the Athlon and Pentium III chips of the day were neck-and-neck affairs, with each side winning some benchmarks. Athlon had better x87 floating point performance, while Intel generally won out with features like MMX and SSE - at least in applications that were properly optimized.
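The cache figures quoted above follow directly from the divider. As a rough sketch, the snippet below recomputes them; the pairing of each divider with each core speed is inferred from the 700/350, 750/300, 850/340 and 900/300 examples, and the function name is purely illustrative.

```python
from fractions import Fraction


def l2_clock_mhz(core_mhz: int, divider: Fraction) -> int:
    """Return the off-die L2 cache clock for a given core clock and divider."""
    return round(core_mhz * divider)


# Dividers inferred from the examples in the text (an assumption, not a spec sheet).
examples = [
    (700, Fraction(1, 2)),
    (750, Fraction(2, 5)),
    (850, Fraction(2, 5)),
    (900, Fraction(1, 3)),
]

for core, divider in examples:
    print(f"{core} MHz core x {divider} = {l2_clock_mhz(core, divider)} MHz L2")
```

The faster part in each pair ends up with the slower cache, which is why the cheaper chip could come out ahead.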

The Socket A processors switched to an integrated full-speed L2 cache, but the cache was half as large. The increased speed and reduced latency, however, more than made up for the decrease in cache size. For a time, AMD was actually able to surpass Intel in raw performance. The Athlon Thunderbird eventually reached 1.4 GHz, while the Pentium III tried for 1.13 GHz and failed. Later versions of the Pentium III, dubbed Tualatin, would eventually reach 1.4 GHz, but those only came after the introduction of the Pentium 4. During this period, the Athlon was the chip for gaming systems.

One other item worth noting is that all of the Athlon and Duron systems used the EV6 bus protocol acquired from DEC/Alpha. This was a double-pumped system bus, which improved performance relative to older buses like that used in P6 motherboards. The bus speeds listed in the charts are the base bus speed, which is multiplied by the CPU multiplier to arrive at the final CPU speed. However, due to the double-pumping, many motherboards list the bus speed as the doubled value. The actual performance increase gained from doubling the bandwidth is not as large as some might expect, but it probably accounts for somewhere between 5 and 15 percent of the total performance of the architecture, depending on the application.
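To make the relationship between the listed bus speed, the multiplier and the "doubled" value concrete, here is a small sketch. The function names are made up for illustration; the sample values are the Athlon XP 3200+ (200 MHz x 11.0) and a 133.3 MHz part at 12.5X from the charts.

```python
def core_clock_mhz(base_bus_mhz: float, multiplier: float) -> float:
    """Core clock = base (listed) bus speed x CPU multiplier."""
    return base_bus_mhz * multiplier


def effective_bus_rate(base_bus_mhz: float, pumps: int = 2) -> float:
    """Data rate a motherboard may advertise for a double-pumped bus."""
    return base_bus_mhz * pumps


# Athlon XP 3200+: 200 MHz base bus, 11.0X multiplier.
print(round(core_clock_mhz(200, 11.0)))      # core clock in MHz
print(round(effective_bus_rate(200)))        # advertised "doubled" bus value

# The charts round 133 1/3 MHz to 133.3, so use the exact value here.
print(round(core_clock_mhz(400 / 3, 12.5)))  # core clock in MHz
```

Note that the multiplier always applies to the base value, never to the doubled one.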

The Athlon 64 and Opteron processors, meanwhile, have switched to a HyperTransport bus running at 800 MHz on the early chips and 1 GHz on socket 939 chips. The main benefit of the HT bus is that it doesn't require as many traces (wires), so it makes motherboard layouts somewhat easier to design. This also allows for multiple high-speed bus connections when used in SMP systems without resorting to designs with more layers.

Athlon XP and Sempron Processors

Athlon XP (Desktop) & Sempron (Desktop Value)
Model             Clock (MHz)  Core            L2 (KB)  Bus (MHz)***  Multiplier
Athlon XP 1500+   1333         Palomino        256      133.3         10.0X
Athlon XP 1600+   1400         Palomino        256      133.3         10.5X
Athlon XP 1700+   1467         Palomino/TBA    256      133.3         11.0X
Athlon XP 1800+   1533         Palomino/TBA    256      133.3         11.5X
Sempron 2200+     1500         Thoroughbred B  256      166.7         9.0X
Athlon XP 1900+   1600         Palomino/TBA    256      133.3         12.0X
Athlon XP 2000+   1667         Palomino/TBA    256      133.3         12.5X
Athlon XP 2000+   1667         Thorton         256      133.3         12.5X
Athlon XP 2000+   1533         Barton          512      133.3         11.5X
Athlon XP 2100+   1733         Palomino/TBA    256      133.3         13.0X
Sempron 2400+     1667         Thoroughbred B  256      166.7         10.0X
Athlon XP 2200+   1800         TBA/TBB         256      133.3         13.5X
Athlon XP 2200+   1800         Thorton         256      133.3         13.5X
Sempron 2500+     1750         Thoroughbred B  256      166.7         10.5X
Athlon XP 2200+   1667         Barton          512      133.3         12.5X
Sempron 2600+     1833         Thoroughbred B  256      166.7         11.0X
Athlon XP 2400+   2000         Thoroughbred B  256      133.3         15.0X
Athlon XP 2400+   2000         Thorton         256      133.3         15.0X
Athlon XP 2400+   1800         Barton          512      133.3         13.5X
Athlon XP 2500+   1867         Barton          512      133.3         14.0X
Sempron 2800+     2000         Thoroughbred B  256      166.7         12.0X
Athlon XP 2600+   2133         Thoroughbred B  256      133.3         16.0X
Athlon XP 2500+   1833         Barton          512      166.7         11.0X
Athlon XP 2600+   2083         Thoroughbred B  256      166.7         12.5X
Athlon XP 2600+   2000         Barton          512      133.3         15.0X
Athlon XP 2600+   1917         Barton          512      166.7         11.5X
Athlon XP 2700+   2167         Thoroughbred B  256      166.7         13.0X
Athlon XP 2800+   2250         Thoroughbred B  256      166.7         13.5X
Athlon XP 2800+   2083         Barton          512      166.7         12.5X
Athlon XP 3000+   2167         Barton          512      166.7         13.0X
Athlon XP 3000+   2100         Barton          512      200           10.5X
Athlon XP 3200+   2200         Barton          512      200           11.0X
*** All system buses for Athlon XP, Sempron, Athlon 64, and Opteron are "double pumped", so their data rate is twice the bus speed. The multiplier is based off the listed speed.

Many of the processors listed in the charts were not commonly available, so they may not be well known. Some of these parts were shipped to OEMs who had special requirements, for example they might want to use cheaper PC2100 RAM with a Barton core. Some of the listed chips might also have been mobile parts which were mistakenly listed in the wrong table. However, the majority of these chips actually do exist in various PCs. Note also that some parts were likely to be seen more in overseas markets than in the US. If you are sure that a part is incorrect or doesn't exist, feel free to post a comment or send an email.

Athlon XP tweaked some of the finer details of the Athlon architecture to improve performance. Since XP was also going up against Pentium 4 instead of Pentium III, AMD (re)introduced model numbers and began their "clock speed isn't everything" campaign. According to AMD, the XP line was rated in terms of performance relative to the Thunderbird core, but few people actually believe that. It was almost surely market driven, as the Pentium 4 was scaling rapidly in clock speed, and the Athlon cores couldn't possibly keep up in raw MHz. And of course, AMD is correct that clock speed isn't everything - average instructions executed per clock (IPC) multiplied by clock speed would give you the real instruction throughput. Unfortunately, coming up with a precise measurement of IPC is virtually impossible - it varies depending on the code executed. Still, clock-for-clock, Athlons are definitely faster than P4 chips, and the PR ratings were relatively accurate, at least in the beginning.
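The "IPC multiplied by clock speed" argument can be written out as a short sketch. The IPC figures below are invented purely for illustration (as noted above, real IPC varies with the code being run), and the function names are hypothetical.

```python
def instructions_per_second(ipc: float, clock_mhz: float) -> float:
    """Throughput = average instructions per clock x clocks per second."""
    return ipc * clock_mhz * 1e6


def equivalent_clock_mhz(ipc_a: float, clock_a_mhz: float, ipc_b: float) -> float:
    """Clock speed chip B needs in order to match chip A's throughput."""
    return ipc_a * clock_a_mhz / ipc_b


# Illustrative only: chip A averages 1.25 IPC at 2000 MHz, chip B averages 1.0 IPC.
needed = equivalent_clock_mhz(1.25, 2000, 1.0)
print(f"Chip B needs {needed:.0f} MHz to match chip A at 2000 MHz")
```

Under these made-up numbers, a lower-clocked chip with higher IPC matches a higher-clocked one, which is the idea a model number is meant to convey.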

As the "processor wars" continued, both companies released tweaked designs. Thoroughbred was a process shrink that brought higher clock speeds, but not as high as initially desired. A reworked Thoroughbred B core - which added an extra layer to the core, among other things - helped raise the clock limit a bit more and allowed Athlon XP to eventually reach 2250 MHz. Note that Thoroughbred B cores can often overclock to 2.3 to 2.4 GHz with sufficient cooling, while the A versions are often limited to ~2.1 GHz.

After Thoroughbred, AMD added more cache with the Barton core, and readjusted their model numbers accordingly, since more cache brought more performance. This was really where the model numbers started to become suspect, though, since Intel had also added more cache and increased bus speeds without "adjusting" any model numbers. The 2500+, 2600+ and 2800+ tended to struggle a bit in keeping up with their Intel counterparts, but the real problem came when Intel released the 200 MHz (800 FSB) "C" version of their Pentium 4. The jump to 3200+ with the 200 MHz FSB really only kept the Athlon XP competitive with the P4 2.8C in overall performance comparisons. Of course, here the model names were a stroke of genius, as many people simply assumed that a 3200+ really was the equivalent of the 3.2C.

Athlon XP-Mobile Processors

Athlon XP-M (Mobile)
Model              Clock (MHz)  Core          L2 (KB)  Bus (MHz)***  Multiplier
Athlon XP-M 850    850          Palomino      256      100           8.5X
Athlon XP-M 900    900          Palomino      256      100           9.0X
Athlon XP-M 950    950          Palomino      256      100           9.5X
Athlon XP-M 1000   1000         Palomino      256      100           10.0X
Athlon XP-M 1100   1100         Palomino      256      100           11.0X
Athlon XP-M 1200   1200         Palomino      256      100           12.0X
Athlon XP-M 1400+  1200         Thoroughbred  256      133.3         9.0X
Athlon XP-M 1500+  1300         Palomino      256      100           13.0X
Athlon XP-M 1600+  1400         Palomino      256      100           14.0X
Athlon XP-M 1500+  1333         Thoroughbred  256      133.3         10.0X
Athlon XP-M 1600+  1400         Thoroughbred  256      133.3         10.5X
Athlon XP-M 1700+  1467         Thoroughbred  256      133.3         11.0X
Athlon XP-M 1800+  1533         Thoroughbred  256      133.3         11.5X
Athlon XP-M 1900+  1600         Thoroughbred  256      133.3         12.0X
Athlon XP-M 1900+  1467         Barton        512      133.3         11.0X
Athlon XP-M 2000+  1667         Thoroughbred  256      133.3         12.5X
Athlon XP-M 2000+  1533         Barton        512      133.3         11.5X
Athlon XP-M 2100+  1600         Barton        512      133.3         12.0X
Athlon XP-M 2200+  1800         Thoroughbred  256      133.3         13.5X
Athlon XP-M 2200+  1667         Barton        512      133.3         12.5X
Athlon XP-M 2400+  1800         Barton        512      133.3         13.5X
Athlon XP-M 2500+  1867         Barton        512      133.3         14.0X
Athlon XP-M 2600+  2000         Barton        512      133.3         15.0X
Athlon XP-M 2800+  2133         Barton        512      133.3         16.0X
*** All system buses for Athlon XP, Sempron, Athlon 64, and Opteron are "double pumped", so their data rate is twice the bus speed. The multiplier is based off the listed speed.

There's not really a whole lot to say about the Mobile AMD processors. They are identical to their desktop counterparts, except they run on lower voltages and can run at reduced clock speeds to save power. Later on, the Athlon XP-M processors gained tremendous popularity due to their unlocked multipliers, which allowed them to overclock very well, as you could keep the bus speed close to the standard 200 MHz.

There are also some OEM parts in the Mobile Athlon market which use a different socket than the standard 462-pin Socket A. For the Athlon XP, there is a 563-pin version, and for the Athlon 64 there is a 638-pin version. Further details on these parts are, at present, lacking.

Athlon 64 and Opteron Processors

Athlon 64 & "Performance" Sempron
Model            Clock (MHz)  Core          L2 (KB)  Bus (MHz)***  Multiplier  Socket
Sempron 3100+    1800         Paris*        256      200           9.0X        754
Athlon 64 2800+  1800         Clawhammer    512      200           9.0X        754
Athlon 64 2800+  1800         Newcastle     512      200           9.0X        754
Athlon 64 3000+  2000         Clawhammer    512      200           10.0X       754
Athlon 64 3000+  2000         Newcastle     512      200           10.0X       754
Athlon 64 3200+  2000         Clawhammer    1024     200           10.0X       754
Athlon 64 3200+  2200         Newcastle     512      200           11.0X       754
Athlon 64 3400+  2200         Clawhammer    1024     200           11.0X       754
Athlon 64 3400+  2400         Newcastle     512      200           12.0X       754
Athlon 64 3500+  2200         Newcastle     512      200           11.0X       939
Athlon 64 3700+  2400         Clawhammer    1024     200           12.0X       754
Athlon 64 FX-51  2200         Sledgehammer  1024     200           11.0X       940
Athlon 64 3700+  2600         Newcastle     512      200           13.0X       754
Athlon 64 3800+  2400         Newcastle     512      200           12.0X       939
Athlon 64 FX-53  2400         Sledgehammer  1024     200           12.0X       940
Athlon 64 FX-53  2400         Sledgehammer  1024     200           12.0X       939

Opteron**
Model        Clock (MHz)  Core          L2 (KB)  Bus (MHz)***  Multiplier
Opteron x40  1400         Sledgehammer  1024     200           7.0X
Opteron x42  1600         Sledgehammer  1024     200           8.0X
Opteron x44  1800         Sledgehammer  1024     200           9.0X
Opteron x46  2000         Sledgehammer  1024     200           10.0X
Opteron x48  2200         Sledgehammer  1024     200           11.0X
Opteron x50  2400         Sledgehammer  1024     200           12.0X
* The Paris core does not support 64-bit computing. It is included with the Athlon 64 because of the socket and because the integrated memory controller puts it ahead of the Athlon XP in performance.
** All Opterons are available in 1xx, 2xx, and 8xx versions. x=1 is for single processor systems, x=2 is for up to dual processor systems, and x=8 is for up to octal processor systems.
*** All system buses for Athlon XP, Sempron, Athlon 64, and Opteron are "double pumped", so their data rate is twice the bus speed. The multiplier is based off the listed speed.
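The Opteron numbering scheme described in the second footnote can be decoded mechanically. The sketch below is only an illustration built from the chart and footnote above; the function and dictionary names are made up, and it covers only the models listed here.

```python
from typing import Tuple

# First digit: maximum number of processors in the system (per the ** footnote).
MAX_PROCESSORS = {1: 1, 2: 2, 8: 8}

# Last two digits: clock speed in MHz, taken from the Opteron chart above.
CLOCK_MHZ = {40: 1400, 42: 1600, 44: 1800, 46: 2000, 48: 2200, 50: 2400}


def decode_opteron(model: int) -> Tuple[int, int]:
    """Return (max processors, clock in MHz) for a model such as 248."""
    series, speed_code = divmod(model, 100)
    if series not in MAX_PROCESSORS or speed_code not in CLOCK_MHZ:
        raise ValueError(f"Model {model} is not in the chart")
    return MAX_PROCESSORS[series], CLOCK_MHZ[speed_code]


for model in (140, 248, 850):
    cpus, mhz = decode_opteron(model)
    print(f"Opteron {model}: up to {cpus} processor(s), {mhz} MHz")
```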

With the Athlon 64, as the name suggests, AMD added support for 64-bit addresses and integers. This was done by widening the pathways and registers, but it wasn't a radical redesign of the core Athlon architecture. The pipeline was lengthened to 12/17 stages (integer/floating point), SSE2 support was added, and the system bus was replaced with a HyperTransport bus. The longer pipelines allow it to scale to somewhat higher clock speeds, and the HyperTransport buses - there are three in the Opteron - allow for better SMP, but the core remains essentially the same. The addition of x86-64 support has garnered a lot of attention, but so far it's mostly marketing hype. It has the potential to improve performance once 64-bit software support arrives, but that potential has not yet been realized in the mainstream market. The scientific and academic community, however, has greeted the introduction of affordable 64-bit processing with open arms. Most consumers, meanwhile, are stuck waiting for Windows XP-64.

The reason for the superior performance of the Athlon 64 - in current 32-bit code as well as, to a lesser extent, in 64-bit code - lies mostly with the integrated memory controller, which dramatically reduces memory latencies. In effect, it helps to turn system RAM into a very large but still relatively slow L3 cache. Because the controller runs with the core, memory latencies also continue to drop as clock speeds increase. Memory latencies on the Athlon XP were roughly 81 ns at 3200+ speeds, and the P4 3.2C was around 77 ns. Meanwhile, the Athlon 64 3400+ comes in at an astonishingly low 48 ns. Those latency figures are getting somewhat close to L3 cache values - for example, the L3 cache in a 3.06 GHz Xeon is about 10 ns. RAM on the Athlon 64 is still nearly five times slower than that cache, but it is also about 1.6 times as fast as RAM on a P4 system.
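Working the quoted latency figures through gives the ratios directly. This is just arithmetic on the numbers in the paragraph above; the dictionary keys and helper name are illustrative.

```python
# Latency figures in nanoseconds, as quoted in the text.
latency_ns = {
    "Athlon XP 3200+ RAM": 81,
    "Pentium 4 3.2C RAM": 77,
    "Athlon 64 3400+ RAM": 48,
    "Xeon 3.06 GHz L3 cache": 10,
}


def times_slower(slow: str, fast: str) -> float:
    """How many times longer the first access takes than the second."""
    return latency_ns[slow] / latency_ns[fast]


print(f"{times_slower('Athlon 64 3400+ RAM', 'Xeon 3.06 GHz L3 cache'):.1f}x")
print(f"{times_slower('Pentium 4 3.2C RAM', 'Athlon 64 3400+ RAM'):.1f}x")
print(f"{times_slower('Athlon XP 3200+ RAM', 'Athlon 64 3400+ RAM'):.1f}x")
```

So Athlon 64 system RAM takes roughly 4.8 times as long as the Xeon's L3 cache, while P4 and Athlon XP memory accesses take about 1.6 and 1.7 times as long as the Athlon 64's.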

No better example of this can be found than the newly introduced Paris core, a.k.a. Sempron 3100+. At 1.8 GHz, it is substantially slower than the fastest Athlon XP in core speed, and yet in typical use it outperforms even the Athlon XP 3200+. This from a part that has half as much cache as the Barton and Newcastle cores! The only area where it fails to keep up is in tasks that generally fit within the L1/L2 cache of the CPU, i.e. certain encoding tasks. In that case, the lack of raw clockspeed is a hindrance.

Of course, reduced latency isn't the entire story of the Athlon 64. In 64-bit mode, the number of usable registers for both integer and floating point operations has been doubled. Depending on the code being run, this could potentially bring 10 to 20 percent more performance. Certain applications that make heavy use of 64-bit integers, such as cryptography and encoding tools, can also benefit from the added 64-bit support. However, MMX and SSE have provided alternative means of improving 64-bit integer performance for many years now - they just require more programming effort to realize.

74 Comments

  • JarredWalton - Friday, August 27, 2004

    Regarding pipeline lengths on Intel products, there are numerous sources that state the P6 core was a 12-stage design. Perhaps the integer pipeline was shorter and the FP was longer? I don't know for sure, but the majority of information I have read says P6 (PPro, P2, P3, Cel, Cel-2) were all the same core and were all 12 stages. Here's a link to one of the more authoritative CPU information guys that I have read, Jon "Hannibal" Stokes:

    http://arstechnica.com/cpu/004/pentium-1/pentium-1...
    http://arstechnica.com/cpu/004/pentium-2/pentium-2...

    Those contain a history of the Pentium architecture. Unless you can provide a more definitive source for pipeline lengths, I tend to believe Hannibal. I also heard at the time the original P4 launched that it had "as few as 20 and as many as 28 stages, depending on the instruction being executed and other factors." Something like that. Most people stuck with the "20 stage" figure, but it has become increasingly clear that it was not a straight 20-stage design.
  • IntelUser2000 - Friday, August 27, 2004

    Another correction: the article states a 12-stage pipeline for P6 cores? No, it's 10. I don't know why some people say P6 cores and related processors have 12-stage pipelines (the exception being PM, because they ARE a different architecture, just not as radical as P4), when it's 10!!!
  • IntelUser2000 - Friday, August 27, 2004

    First, some corrections.

    mostlyprudent, P4 Willamette is only available up to 2000. It is actually available from 1300-2000. Anything over 2000 is a Northwood core, which has 512KB L2 cache and uses a 0.13 micron process.

    Second, why doesn't anybody seem to notice the pipeline numbers for Prescott on Page 6?

    "The Prescott further extended the NetBurst pipeline to 23 stages in addition to the 8 fetch/decode stages. For whatever reason, Intel generally describes the pipeline of the Prescott as 31 stages while only calling the earlier design a 20 stage pipeline."

    What the hell? Is it actually true? Can the writer, Jarred Walton, please answer this question? Did you just get the facts wrong or is it true that Prescott does have 23 stage pipelines?
  • FlameDeer - Tuesday, August 24, 2004

    Thanks Jarred, very good article! Very useful and helpful processor performance comparison, much better than Intel "BMW" naming! :)

    Some small correction at page 3 Intel Cheat Sheet table:
    Entry no.3 Mendocino is 250nm, 154mm2 only
    Entry no.7 Deschutes Bus Speed is 66 MHz
  • JarredWalton - Tuesday, August 24, 2004

    #36 - I suppose I should have been consistent with the bus speeds. Intel's really is quad-pumped and AMD's really is double-pumped. Somehow along the way I redid the Intel side to have the quad-pumped bus speed and I didn't redo the AMD side. The Netburst architecture likely benefits a little more from the increased bus speed, but AMD certainly benefits as well. I'll include that in my updated version later this week. (My left wrist needs a rest. I don't want to risk carpal tunnel syndrome.)

    On the HyperTransport side of things, I really don't regard the HT bus speed as being that important. The old style bus (Athlon Socket A) was a 64-bit 400 MHz bus (200 MHz double-pumped - at least on the 3200+) while HyperTransport is a 16-bit 800 MHz bus. I think that's right, anyway. So 16-bit * 800 MHz (bidirectional) is the same as 64-bit * 400 MHz (unidirectional). Bleh. Whatever the case, I'm pretty sure the HT bus doesn't really make for the A64 being faster. It helps out tremendously in the Opteron with multiple processors, but that's different.
  • johnsonx - Tuesday, August 24, 2004

    to #38

    There are two Thoroughbred B AXP 2600's: 133/266FSB at 2133 MHz (multiplier 16), and 166/333FSB at 2083 MHz (multiplier 12.5). Yours sounds like a 166/333FSB model.

  • mrmorris - Tuesday, August 24, 2004

    #15
    My 2600+ AMD XP runs at 2083 MHz and it's a Thoroughbred-B!
  • magratton - Monday, August 23, 2004

    #34 - Sweet. The article made me remember all those years, and that post gave me a great chuckle. Peace! Being an avid comments reader (though not so much a contributor) it is good to finally put a name to a.. well.. a name. Peace!
  • mlittl3 - Monday, August 23, 2004

    Jarred

    Don't mean to be persistent but I was wondering what your thoughts about the bus speed listings were.

    Should AMD Athlon processors be listed with bus speeds like 100, 133, 166, 200 MHz or should it be 200, 266, 333, 400 MHz? Likewise for the AMD Athlon 64, FX, Opteron. They use HyperTransport running anywhere from 600 to 1000 MHz and don't advertise a bus speed since the memory controller is integrated (even though everyone knows it's 200 MHz x multiplier).

    If the current listed speeds are the way it should be written, what about the Intel bus speeds of 400, 533, 800 and 1066 MHz? These really are 100, 133, 200 and 266 MHz when calculating the actual processor speed.

    Do the Intel quad-pumped bus speeds really reflect the actual bus speed whereas the AMD double-pumped bus speeds do not?

    Just wanted to be clear. Thanks. Can't wait for the GPU cheat sheet.

    Mark
  • JarredWalton - Monday, August 23, 2004

    Umm... crap, sort of let the cat out of the bag there. If the "JW" at the end of the other name didn't clue you in, it should be blatantly obvious who I am now. (Although only people that read the news and article comments are likely to have seen the name.)
