AMD CPU Roadmap Update

2004 is coming to a close, and it has been a pretty exciting year for AMD. While the Athlon 64 launched in 2003, 2004 was the year it really came into its own. Prices dropped to more affordable levels, and performance improved to the point where even Intel's best, the Pentium 4 Extreme Edition, cannot score a victory over the top AMD processors. In fact, AMD has several processors that currently beat Intel's top chips in the majority of applications. With such a successful year now completed, do we have more of the same to look forward to in 2005? That's difficult to say: hindsight is always 20-20, but predicting the future is far more difficult.

The buzz for the coming year is all centered around multi-core processors - mostly dual core, but we may see quad core chips in the enterprise segment, and the future will almost certainly bring chips with more than two cores in a package. There are other areas besides the high-end multi-core arena, and we haven't seen the end of increasing clock speeds yet. Besides the high-end enthusiast/workstation/server market, we have the mainstream and value markets. The high-end may be good for bragging rights, but the vast majority of chips sold are in the value and mainstream markets. Here, then, is the latest look at what AMD is planning in each of these areas, starting with the performance processors.


AMD Desktop Athlon 64 Roadmap
Processor | Clock Speed | Socket | Launch Date | End of Line
Athlon >= FX-57 | ??? | Socket 939 | Q1'06 | -
Athlon FX-57 | ??? | Socket 939 | Q3'05 | -
Athlon FX-55 | 2.6 GHz, 1MB L2 | Socket 939 | Now | -
Athlon FX-53 | 2.4 GHz, 1MB L2 | Socket 939 | Now | -
Athlon FX-53 | 2.4 GHz, 1MB L2 | Socket 940 | Now | -
Athlon FX-51 | 2.2 GHz, 1MB L2 | Socket 940 | Now | -
Athlon 64 >=4200+ | ??? | Socket 939 | Q3'05 | -
Athlon 64 4000+ | 2.4 GHz | Socket 939 | Now | -
Athlon 64 3800+ | 2.4 GHz | Socket 939 | Now | -
Athlon 64 3700+ | ??? | Socket 939 | Q2'05 | -
Athlon 64 3700+ | 2.4 GHz, 1MB L2 | Socket 754 | Now | Q4'05
Athlon 64 3500+ | 2.2 GHz, 90 nm | Socket 939 | Now | -
Athlon 64 3500+ | 2.2 GHz | Socket 939 | Now | -
Athlon 64 3400+ | 2.4 GHz, 512K L2 | Socket 754 | Now | Q4'05
Athlon 64 3400+ | 2.2 GHz, 1MB L2 | Socket 754 | Now | Q3'05
Athlon 64 3200+ | 2.2 GHz, 512K L2 | Socket 754 | Now | Q4'05
Athlon 64 3200+ | 2.0 GHz, 90 nm | Socket 939 | Now | -
Athlon 64 3200+ | 2.0 GHz, 1MB L2 | Socket 754 | Now | Q3'05
Athlon 64 3000+ | 2.0 GHz, 512K L2 | Socket 754 | Now | Q4'05
Athlon 64 3000+ | 1.8 GHz, 90 nm | Socket 939 | Now | Q3'05
Athlon 64 2800+ | 1.8 GHz | Socket 754 | Now | Q1'05

If you compare that with our last AMD roadmap, you'll notice that there are very few changes. In fact, the only really new information is the appearance of a 3700+ socket 939 part. We would assume that this uses the new Venice core, which should include 512K L2 cache and SSE3 support. Unfortunately, precise features and clock speeds remain an unknown at present for many of the upcoming chips. The FX-57 could be a dual core chip, or else it could be a 2.8 GHz single core chip - we really can't say which yet.

In addition to adding support for SSE3, there are rumors that AMD will begin supporting DDR2 with their next processor revisions. At present that remains wild speculation. A new memory type would require new motherboards at the very least, and probably a new CPU socket as well. (Having two CPUs that share the same socket and yet support different memory types - due to the integrated memory controller - would create a lot of confusion among customers. We find it hard to imagine that AMD would take such an approach.) DDR2 also has increased latencies relative to DDR RAM, and the additional bandwidth offered does not seem to benefit Athlon 64 very much. All things considered, then, we assume that AMD will continue using only DDR memory and socket 939 for at least the first half of 2005.

You might note that we have added an "EOL" column for the processors. This is only for the desktop versions of the chips, and it indicates AMD's plans for when to phase out each model. In the past, we might see processors drop well below the $100 mark before they were discontinued, but now that AMD has closed the performance gap with Intel, they are halting the shipment of their "performance" processors once they drop below about $120. This raises the average sale price of AMD's processors, and that's a good thing as a more profitable AMD is a more competitive AMD. As we mentioned in the last update, all of the socket 754 chips are scheduled for EOL by Q3'05; beyond that, Athlon 64 will only be available on socket 939. Take these EOL dates with a grain of salt, however, as mobile variants will continue to be sold and will work in most - if not all - desktop boards. Below $120, AMD has their "value" offerings, and they pick up in performance basically where the more expensive processors leave off.


AMD Desktop Sempron Roadmap
Processor | Clock Speed | Socket | Launch Date | End of Line
Sempron >= 3500+ | ??? | Socket 754 | Q1'06 | -
Sempron 3400+ | ??? | Socket 939 | Q3'05 | -
Sempron 3400+ | ??? | Socket 754 | Q4'05 | -
Sempron 3300+ | ??? | Socket 754 | Q2'05 | -
Sempron 3200+ | ??? | Socket 939 | Q1'05 | -
Sempron 3200+ | ??? | Socket 754 | Q1'05 | -
Sempron 3100+ | 1.8 GHz | Socket 754 | Now | -
Sempron 3000+ | ??? | Socket 939 | Q1'05 | -
Sempron 3000+ | ??? | Socket 754 | Q1'05 | -
Sempron 2800+ | ??? | Socket 754 | Q1'05 | Q1'06
Sempron 2600+ | ??? | Socket 754 | Q1'05 | Q4'05
Sempron 3000+ | 2.0 GHz, 512K | Socket A | Now | Q3'05
Sempron 2800+ | 2.0 GHz | Socket A | Now | Q3'05
Sempron 2600+ | 1.83 GHz | Socket A | Now | Q3'05
Sempron 2500+ | 1.75 GHz | Socket A | Now | Q3'05
Sempron 2400+ | 1.67 GHz | Socket A | Now | Q1'05
Sempron 2300+ | 1.58 GHz | Socket A | Now | Q1'05
Sempron 2200+ | 1.5 GHz | Socket A | Now | Q1'05

There are a few new additions to the value lineup since our last look, including the arrival of Sempron chips for socket 939. Clock speeds, again, remain somewhat unknown. However, given the apparent lack of working .5X multipliers for Athlon 64 motherboards, we would guess that they will come in 200 MHz increments from the current chips. Feel free to fill in the blanks. Sempron variants featuring SSE3 support should be arriving via the Palermo chips in early 2005, but where the current Paris ends and the new Palermo begins is anyone's guess. The newer parts will use a 90 nm process, so that should make spotting them somewhat easier. Overlapping model numbers continue to create some confusion, so pay close attention to the details when ordering any of these chips. With the value lineup transitioning to socket 754 - and AMD's roadmap makes this the clear intention - the days of socket A systems are numbered. The platform will also continue to see support in the way of mobile variants, but the desktop Athlon XP and Sempron chips are fading away fast. If you already own a system that uses the platform, performance is still more than acceptable in all but the most demanding of applications, but we wouldn't advise anyone looking for a new system to use socket A.

If our guess on the clock speeds is correct, we'll actually see the new chips launching with clock speeds as low as 1.2 GHz. That seems awfully low, and the parts will have a very short lifespan given the EOL looming just a few quarters after the launch. Power requirements would be very low at those speeds, however, making them an interesting prospect for mobile and embedded devices. Whether the low initial clock speeds are just AMD being cautious with a new design or if they're protecting the higher end markets is difficult to say - it's probably a little of both. The socket 939 Sempron parts will most likely start at 1.8 GHz and scale up from there in 200 MHz increments, so it looks to be more of a customer (i.e. OEM) demand than anything else. The lower performing socket 754 parts will take over the vacated positions of socket A Semprons, and that will let AMD continue to cater to the value conscious consumer without cannibalizing sales of the performance parts.


AMD Server/Workstation Roadmap
Processor | Clock Speed | Socket | Launch Date | End of Line
Opteron x50 | 2.4 GHz, 1MB L2 | Socket 940 | Now | -
Opteron x48 | 2.2 GHz, 1MB L2 | Socket 940 | Now | -
Opteron x46 | 2.0 GHz, 1MB L2 | Socket 940 | Now | -
Opteron x44 | 1.8 GHz, 1MB L2 | Socket 940 | Now | -
Opteron x42 | 1.6 GHz, 1MB L2 | Socket 940 | Now | -
Opteron x40 | 1.4 GHz, 1MB L2 | Socket 940 | Now | -

We don't have any new additions for the Opteron lineup yet, although the Venus, Troy, and Athens parts are supposed to be in the works. These are the server variants of the San Diego Athlon FX part. They should come with 1 MB L2 cache and SSE3 support and will be fabbed on the new 90 nm SOI process - strained silicon may also be used, although that isn't entirely clear. Rumors say that the x52 parts may arrive as early as Q1 2005, with clock speeds matching the 2.6 GHz of the FX-55. However, AMD may simply choose to skip these parts and head straight to their dual core models.

Speaking of dual core, Denmark, Italy, Egypt, and Toledo chips are also on the horizon for 2005, and the latest roadmap outlines AMD's plans for the introduction of these processors. Scheduled to arrive in the performance segment in the middle of 2005, dual core will initially benefit server and workstation applications the most. AMD even acknowledges this with the statement that gaming will be "best served with a maximum frequency, single core solution until 2006." The first half of 2005 will see the model numbering finalized, and sampling and demonstration of dual core processors and platforms will begin in earnest. Planned clock speeds remain unknown for the time being.

The EOL for the Opteron processors remains blank, and this is what you would expect for any processor competing for the enterprise server market. Typically, such chips will be manufactured as long as there is a demand for them from the server manufacturers. There are also lower voltage versions of the earlier x40 chips available, which can be useful in blade servers. These chips might be around for years to come, although likely in smaller quantities as demand decreases.

Returning to an earlier subject, while current Athlon 64 processors do not seem to benefit much from increased memory bandwidth, that could change once we shift to dual core processors. The industry will eventually move to fully embrace DDR2 memory, just like the eventual shift from SDRAM to DDR (and RDRAM) several years back. If we are going to see DDR2 support from AMD any time soon, a separate revision of their dual core offerings would seem to make a lot of sense. That's just speculation on our part, but by 2006 we should see DDR2 prices drop to DDR levels and possibly even lower, and we should also see a shift towards the use of 1 GB and 2 GB DIMMs. That would be a good time for AMD to transition to a new CPU and RAM platform, we think.

Final Thoughts

If we take a look at the bigger picture, what we see for 2005 is that things will remain largely static with a few notable exceptions. Socket A is going to disappear, which comes as little surprise. Socket 754 becomes the new value platform and socket 939 fills in as the mainstream and performance platform - and even adds a couple of value options later in the year. Meanwhile, socket 940 will continue as the platform for workstations and servers. The notable exceptions are that we'll see the introduction of dual core processors, and we'll also see a shift to the 90 nm SOI process for AMD. The maximum clock speed of any single chip planned for 2005 appears to be 2.8 GHz at present, although there is a slight chance that we could see a 3.0 GHz chip at the end of the year. What this means is that those of you who went out and splurged on an FX-55 are going to be very close to the maximum performance available for games for the next year.

Of course, things may always change as the months roll by. Intel is certainly not sitting idle, watching AMD increase market share and performance. We'll take a look at the Intel side of things in the near future, and if Intel can execute properly on several key items, we might actually see AMD forced to accelerate the launch of faster processors. With initial overclocking of the FX-55 on 130 nm SOI with strained silicon reaching roughly 3.0 GHz, AMD may actually have more of a performance cushion than ever before. Fierce competition between AMD and Intel is almost certain to muck with these "best laid plans."
31 Comments

  • Pannenkoek - Thursday, December 30, 2004 - link

    Perfect example why multithreading can be tricky, and locking the data used is necessary: we both used the same comment to reply to, and we're out of sync. ;-)

    Damn it, I was just happy I finished my long reply. My reply to you is coming Jarred. Don't post! :p
  • Pannenkoek - Thursday, December 30, 2004 - link

    PrinceXizor, with all due respect, you have no clue about multithreading, it seems. :p
    I'm not a programmer either, but I know about programming. ;)

    1. If there are different processes, you are already multithreading. But for the rest you are right. However, the whole point of modules is that they don't share a lot of data and that is why it should be easy.
    Accounting for the overlap is, apart from creating and killing threads, what a multithreading API like POSIX Threads is for.

    2. This mysterious entity is the kernel of the operating system. Multithreading is one of the purposes of an OS, and you don't have to worry about the extra latency in your case. Especially not if different threads can run on different processors. If you are running different programs simultaneously, the OS is giving the CPU continually to a different program for a short time. Haven't seen you fret over that. ;)

    3. The compilation of source code into binary is not relevant for multithreading as everything which even comes close to hardware communication is far outside of your program. Programs are oblivious to the number of processors or cores, until they ask the OS. Compilation is not affected by multithreading.

    4. The reason why Hyperthreading hardly makes a difference is due to hardware, not software. A Hyperthreading processor is only virtually a dual processor system, and not dual core. Running two processes simultaneously is only partially supported by the hardware, and more a matter of utilizing unused transistors by trying to run an extra process.
    It depends on the situation whether you get any significant gain at all, and was never supposed to be a 100% increase. With dual core you really get twice the processing power, but not the memory bandwidth. (Of course, games can be video bound. ;)

    I think you are also a bit confused about SIMD (Single Instruction, Multiple Data) and SSE (Streaming SIMD Extensions), which are newer Intel CPU instructions, judging from your point 3.

    Multithreading overhead can be neglected in general, but it depends on how much data needs to be synchronized between different threads and how often. In a game it could be done mostly just once per frame, and if it has a modular design (which it should, really) hardly any data needs to be protected. If output of one module is used as the input of another module (from physics to graphics engine e.g.), one frame delay makes it possible to put those modules on different cores.
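    To make that one-frame-delay idea concrete, here is a rough sketch in Python (a real engine would be C++, and all names here are invented for illustration): the physics thread produces the state for frame N while the graphics thread consumes frame N-1, and a bounded queue is the only synchronized hand-off between the two modules.

```python
import threading
import queue

# The one-frame-delay pipeline: physics produces frame N while graphics
# consumes frame N-1. The size-1 queue is the only shared, protected data.
frames = queue.Queue(maxsize=1)

def physics(num_frames):
    state = 0
    for n in range(num_frames):
        state += 1                  # stand-in for real simulation work
        frames.put((n, state))      # hand the finished frame to graphics
    frames.put(None)                # sentinel: no more frames

def graphics(rendered):
    while True:
        frame = frames.get()
        if frame is None:
            break
        rendered.append(frame)      # stand-in for actual rendering

rendered = []
p = threading.Thread(target=physics, args=(3,))
g = threading.Thread(target=graphics, args=(rendered,))
p.start(); g.start()
p.join(); g.join()
print(rendered)                     # -> [(0, 1), (1, 2), (2, 3)]
```

    Because each module only touches the queue at the frame boundary, almost none of their internal data needs locking - which is the point being made above.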

    Of course, there is no guarantee that a game developer has sensible programmers, but I stand by my former comment.
  • JarredWalton - Thursday, December 30, 2004 - link

    17 & 18: Actually, 18 is relatively accurate from my experience. The term for what you need to have in order to get multiple threads to interact properly is called a "semaphore", I believe. Basically, you need a gateway so that certain segments of code can *only* be accessed by *one* thread at a time; otherwise, you get out of synch.
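    A minimal sketch of that gateway idea, using Python's threading.Lock (the binary-semaphore case; the variable names are made up for the example): each thread must acquire the lock before touching the shared world state, so a reader can never observe a half-finished update.

```python
import threading

# Mutual exclusion around shared state: only one thread at a time may
# enter the section that reads or writes the world.
world_lock = threading.Lock()
world = {"x": 0}

def update_physics(steps):
    for _ in range(steps):
        with world_lock:            # physics owns the world while updating
            world["x"] += 1

def read_for_ai(samples, out):
    for _ in range(samples):
        with world_lock:            # AI never sees a half-updated world
            out.append(world["x"])

snapshots = []
t1 = threading.Thread(target=update_physics, args=(1000,))
t2 = threading.Thread(target=read_for_ai, args=(10, snapshots))
t1.start(); t2.start()
t1.join(); t2.join()
assert world["x"] == 1000           # no updates were lost
```

    The cost is that the two threads serialize whenever they contend for the lock - which is exactly the overhead discussed below.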

    Imagine the physics engine, which updates the locations of objects in the world. Let's say you put that in a thread. Then we have another thread handling player movement, one for input, maybe one for network communications, graphics rendering of course, artificial intelligence... there's a ton of things which sound like they *could* be moved into different threads, right?

    Consider this, though: in order for the AI to react properly to a given situation, it has to know the current state of the world. You can't have the AI thread examining the world and trying to figure out what to do while the physics thread is in the process of updating the location of objects in the world. The physics and graphics threads will also overlap: you can't have the physics thread moving objects around while the graphics are in the process of rendering to the screen.

    The type of application that usually benefits most from highly parallel designs is something where you have chunks of data to be processed that are *entirely* separate from each other - no overlap in shared state. With games, you could look at the graphics pipeline for lots of parallelism, but the graphics cards are already handling most of that now anyway. I don't think the physics calculations take nearly as much time as a lot of people assume (although that could be wrong).

    Anyway, I believe the basic process of most 3D games these days goes something like this:

    1) Analyze inputs and adjust variables as appropriate (i.e. a gun begins to fire, player begins to slow down/turn, etc.)
    2) Run AI routines to determine how the AI characters are going to behave (similar to player inputs) and adjust variables as appropriate.
    3) Run physics routines to update the state of all the objects in the world - position, angle, health, etc.
    4) Render the current state of the world in the graphics engine.

    A multi-threaded approach might do something like the following:

    1) Synchronize AI and player input threads and have both update the global variables appropriately.
    a) Certain variables are going to be accessed frequently by both threads, so put semaphore logic in place to keep them from writing/reading incorrect values.
    2) Run physics threads that update the state of the world. Each thread can handle a portion of the objects so that the physics calculations are done faster. Ideally, you would be able to vary the number of physics threads from 1 to n, where n is the number of processor cores that the system has available.
    a) Add additional logic to double-check areas where there is overlap - some objects are going to need to affect both threads.
    3) Render the current state of the world to the graphics card, using multiple threads. Again, the ability to have 1 to n rendering threads would be ideal.
    a) Hopefully, the graphics card drivers are capable of handling multi-threaded input efficiently!
    b) You would also want some sort of optimization in the way the objects are sent to the graphics card.
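    Step 2 above could be sketched roughly like this in Python (the data layout and names are invented for illustration): the world's objects are split among n worker threads, and joining them all acts as the barrier that must complete before rendering may begin.

```python
import threading

# Split the physics work among n threads; each object belongs to exactly
# one chunk, so the workers need no locking between them.
def physics_step(objects, n_threads):
    def worker(chunk):
        for obj in chunk:
            obj["pos"] += obj["vel"]    # stand-in for a real integration step

    chunks = [objects[i::n_threads] for i in range(n_threads)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:                   # barrier: rendering must not start
        t.join()                        # until every object is updated

world = [{"pos": 0.0, "vel": 1.0} for _ in range(8)]
physics_step(world, n_threads=2)
print([obj["pos"] for obj in world])    # every object advanced by one step
```

    Step 2a - objects that interact across chunks - is exactly what this sketch leaves out, and that is where the extra logic (and the overhead) comes in.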

    That's a *VERY* rough description as to how game logic might be coded. You would need to analyze the game code thoroughly to make sure you focus on optimizing the right areas - if the physics and AI only takes 10% of the total CPU time, it's probably not worth wasting effort on this area!

    The algorithms that divide up the work between threads need to also be efficient enough that they don't end up wiping out any gains. That's a real key point. Imagine it takes one thread 10 milliseconds to render all of the current world state, and that if we can divide up the work into two threads each thread can get everything done in 5 milliseconds. If the task that divides the work (and the synchronization issues) takes 4 milliseconds on its own, then you've nearly broken even and you've just spent a lot of effort for almost no performance gain.
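    That break-even arithmetic, as a tiny sketch (the formula is a deliberate simplification that assumes a perfectly even split of the work):

```python
# Parallel wall time = slowest thread's share plus the fixed cost of
# splitting the work and synchronizing the results.
def parallel_frame_time(single_ms, n_threads, overhead_ms):
    return single_ms / n_threads + overhead_ms

single = 10.0                           # one thread renders in 10 ms
two_way = parallel_frame_time(10.0, 2, 4.0)
print(two_way)                          # 9.0 ms: barely better than 10 ms
```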

    I believe that John Carmack encountered some of these issues on Quake 3. It had some alpha/beta level SMP support, but I never did hear of an instance where the SMP-enabled version ran substantially faster than the non-SMP version. I think he ended up halting work on the feature because it just didn't seem to be worthwhile.

    Now, my disclaimer: I *am* a programmer, but I haven't done any serious game programming work. I've also been doing less programming in the past two years as I've moved on to different work. My thoughts on some of this might be wrong, but logic seems to be on my side. If writing multi-threaded software is so "easy" (as some people seem to claim), then why is it that the vast majority of software is single-threaded? Even the best multi-threaded applications often end up with a 25 to 50% speed improvement over a single CPU, and finding truly independent tasks in a lot of applications - particularly games - can be rather difficult. Maybe it's just that the programmers haven't been trained/taught to look for such opportunities? We can only hope that's the case....
  • PrinceXizor - Thursday, December 30, 2004 - link

    I'll preface this by stating that I am not a programmer.

    However, I would think that multi-threaded applications are not nearly so simple, because of a few reasons.

    1. Even modular components are not totally modular. Data, variables, processes are shared to varying degrees so this overlap has to be accounted for.

    2. Something has to know what is going on. This knowledgable entity most likely introduces latencies into the system. These latencies could overshadow the initial performance gain from "ported" single threaded code.

    3. Translation. Please see the initial disclaimer especially for this item. It seems to me that we need to keep in mind that the code that is written is not what is run. Compiled code obviously differs from written code, and the compilers would have to take into account a multi-thread environment as well. We currently rely on compilers to heavily optimize compiled code (just think of the arguments over which compiler to use for an "apples-to-apples" comparison of Linux application speed to Windows application speed), so it certainly makes a difference. And, if the compilers are doing a poor job of optimizing for multi-threaded applications, then spending the extra time to write code for it would seem a waste of time to many companies.

    4. Hyperthreading. While not exactly dual core, the simple fact that even now, hyperthreading does not accelerate a vast majority of computer tasks seems to indicate that the programming intricacies are not so simple. Only the most highly parallel operations see great performance gains from hyperthreading, which seems (to me anyway) to lend credence to point number three.

    Of course, this is all the opinion of a non-programmer :)

    You may fire at will!

    P-X
  • Pannenkoek - Thursday, December 30, 2004 - link

    Jarred, while I agree with you for arbitrary programs, we're talking about games here, which have large orthogonal, independent components. Every programmer can dig himself into a hole too deep to easily climb out of by bad design and code, but assuming that a game is somehow modular, it should be relatively trivial, as I said, to kick one CPU intensive part of the engine onto a separate thread. There is no paradigm shift required, unless the programmer hasn't made the switch yet from spaghetti code to modular design.

    Game developers are spending insane efforts in optimizing their engine for all kind of different hardware to squeeze out every drop of extra performance. Now, don't tell me that if AMD released a dual core processor next week Valve and Id wouldn't be flogging their programmers to make their game run 100% faster.
  • BARK - Wednesday, December 29, 2004 - link

    Is there any hope for socket 940 fx users? I thought AMD would support us more than this!
    2.2 to 2.4, that's not much of a bump.
  • JarredWalton - Wednesday, December 29, 2004 - link

    14 - The whole problem with getting games to utilize more than one processor is the same as getting *any* application to make use of more than one processor: you need to rework some fundamental aspects of the program. While it sounds trivial at a high level, getting multiple threads to work together effectively and efficiently is actually a rather tricky problem. It's not impossible, but it is more difficult than writing single-threaded code.

    What's needed is a paradigm shift (ugh - I hate terms like that, even when they're correctly used); the programmers need to step back and modify some of their core development processes. That's often a painful experience, and I think most software developers right now are looking at the problem and saying that there's no real benefit to changing the code yet. Once dual-core and multi-core setups become common, then they will *have* to change, but right now fewer than 1% of gamers (or computer users) have SMP setups, and only something like 40% have HyperThreading.
  • Pannenkoek - Tuesday, December 28, 2004 - link

    I don't understand why AMD thinks that games wouldn't benefit from dual core. It should be relatively trivial to implement threading into a game with its independent CPU intensive components: physics engine, AI and graphics, as there are still games which utilize the CPU a lot for rendering.

    Old games will run fine even on one core, while new games will be patched. Action and reaction, stimulus and response. The question is, who does the pushing?
  • miketheidiot - Tuesday, December 21, 2004 - link

    #11 - Moore's law is about the number of transistors, not performance. With dual core it should not be too difficult to keep up with that.
  • coldpower27 - Tuesday, December 21, 2004 - link

    I have a theory on what the Semprons are likely to be.

    Sempron 3400+ Palermo 2.0GHZ/DC/256KB S939
    Sempron 3200+ Palermo 1.8GHZ/DC/256KB S939
    Sempron 3000+ Palermo 1.6GHZ/DC/256KB S939

    Sempron 3400+ Palermo 2.2GHZ/SC/128KB S754
    Sempron 3300+ Palermo 2.0GHZ/SC/256KB S754
    Sempron 3200+ Palermo 2.0GHZ/SC/128KB S754
    Sempron 3100+ Palermo 1.8GHZ/SC/256KB S754
    Sempron 3000+ Palermo 1.8GHZ/SC/128KB S754
    Sempron 2800+ Palermo 1.6GHZ/SC/256KB S754
    Sempron 2600+ Palermo 1.6GHZ/SC/128KB S754

    I don't believe in having any K8 processors below 1.6GHZ for the Sempron; performance would just be quite low.

    The S754 data has been extrapolated from the fact that we already have Sempron K8 Mobile 2600+, 2800+, and 3000+.
