Bandwidth and Memory Scaling

AMD integrates the memory controller into its processors, while Intel continues to place it in the motherboard chipset. Although an on-chip memory controller is theoretically superior, Intel keeps improving memory bandwidth with each new chipset. You have already seen in "DDR3 vs. DDR2" and "Intel P35 Memory Performance: A Closer Look" that the new P35 improves memory bandwidth by 16% to 18% over the P965 at the same memory speed and timings.

Kingston KHX11000D3LLK2 is the first DDR3 to offer lower latency, and this should further improve memory bandwidth. We compared Standard (Buffered) bandwidth on the P965 running DDR2, the new P35 running DDR2, and the new P35 running Kingston DDR3-1375.

Standard (Buffered) Sandra XI.SP2 Memory Bandwidth - 2.66GHz
Each cell lists bandwidth in MB/s, followed by the timings and voltage used.

Memory                                  800                     1066                    1333                    1520 (380x7)
Kingston DDR3-1333 KHX11000D3LLK2       6341 (5-4-3-10 1.75V)   6736 (6-5-5-12 1.7V)    6928 (7-7-6-15 1.7V)    7329 (8-8-8-22 1.8V)
Corsair DDR3-1066 CM3X1024-1066C7       6156 (6-6-6-15 1.5V)    6613 (7-7-7-20 1.5V)    6757 (9-9-9-25 1.5V)    -
DDR2 - P35 Corsair Dominator            6456 (3-3-3-9 2.25V)    6811 (4-4-3-11 2.3V)    -                       -
DDR2 - P965 (10x266) Corsair Dominator  5531 (3-3-3-9 2.25V)    5782 (4-4-3-11 2.3V)    -                       -
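
The "(380x7)" and "(10x266)" labels in the table pair a front-side bus clock with a CPU multiplier. The short sketch below works through that arithmetic, assuming the nominal 266MHz and 380MHz bus clocks printed in the table; the function name is illustrative only.

```python
def cpu_clock_mhz(fsb_mhz: int, multiplier: int) -> int:
    """CPU core clock in MHz from the bus clock and the CPU multiplier."""
    return fsb_mhz * multiplier


# Nominal values taken from the table labels.
stock_clock = cpu_clock_mhz(266, 10)       # the "10x266" setting
raised_fsb_clock = cpu_clock_mhz(380, 7)   # the "380x7" setting

# Ratio of the DDR3-1520 memory speed to the 380MHz bus clock.
memory_to_fsb_ratio = 1520 / 380

print(stock_clock, raised_fsb_clock, memory_to_fsb_ratio)
```

Both settings work out to the same nominal 2.66GHz, so the 1520 column raises the bus and memory speed while holding the CPU clock constant.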

At DDR3-800 the Kingston manages stable 5-4-3-10 timings at 1.75V. Based on current motherboards and JEDEC standards, the fastest available DDR3 timings are 5-3-3, so the Kingston DDR3-1375 is very close to the theoretical limit at DDR3-800. With the improved timings, the Kingston is about 3% faster than DDR3 at 6-6-6 timings. Fast DDR2 on the P35 delivers the highest buffered bandwidth, but the difference between P35 DDR2 at 3-3-3 and P35 DDR3 at 5-4-3 is less than 2%. All P35 results, even those at the slower 6-6-6 timings, exhibit higher bandwidth than the P965 with DDR2 at 3-3-3.

At both 800 and 1066, Kingston DDR3-1375 bandwidth is very close to that of fast DDR2 on the P35. Again, all P35 results, even with slower DDR3, are faster than the P965 with fast DDR2 memory. At 1333 the 7-7-6-15 timings improve bandwidth by around 2.5% over the Corsair at 9-9-9-25, and the lower latency allows the Kingston to run as fast as DDR3-1520 at 8-8-8-22 timings. However, the best bandwidth was achieved at faster timings and a slightly slower speed. As shown above, the Kingston managed 1500 speed at 7-7-7-15 timings, where standard buffered bandwidth is almost 7500 MB/s.
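
The percentages quoted for the buffered results can be rechecked directly from the table. The following is a minimal sketch of that arithmetic; the dictionary labels and the helper name `pct_gain` are shorthand chosen here, not anything from Sandra itself.

```python
# Buffered Sandra bandwidth in MB/s, copied from the buffered table.
# Keys are (configuration, memory speed); labels are shorthand for this sketch.
buffered = {
    ("Kingston DDR3 on P35", 800): 6341,
    ("Corsair DDR3 on P35", 800): 6156,
    ("DDR2 on P35", 800): 6456,
    ("DDR2 on P965", 800): 5531,
    ("DDR2 on P35", 1066): 6811,
    ("DDR2 on P965", 1066): 5782,
    ("Kingston DDR3 on P35", 1333): 6928,
    ("Corsair DDR3 on P35", 1333): 6757,
}


def pct_gain(new_mbs: float, base_mbs: float) -> float:
    """Percentage by which new_mbs exceeds base_mbs."""
    return (new_mbs - base_mbs) / base_mbs * 100.0


# Kingston 5-4-3 vs. Corsair 6-6-6 at DDR3-800 (quoted as "about 3%").
low_latency_gain = pct_gain(buffered[("Kingston DDR3 on P35", 800)],
                            buffered[("Corsair DDR3 on P35", 800)])

# P35 DDR2 3-3-3 vs. Kingston DDR3 5-4-3 at 800 (quoted as "less than 2%").
ddr2_lead = pct_gain(buffered[("DDR2 on P35", 800)],
                     buffered[("Kingston DDR3 on P35", 800)])

# Kingston 7-7-6 vs. Corsair 9-9-9 at 1333 (quoted as "around 2.5%").
gain_1333 = pct_gain(buffered[("Kingston DDR3 on P35", 1333)],
                     buffered[("Corsair DDR3 on P35", 1333)])

# P35 vs. P965 with the same DDR2 (quoted as "16% to 18%").
chipset_gain_800 = pct_gain(buffered[("DDR2 on P35", 800)],
                            buffered[("DDR2 on P965", 800)])
chipset_gain_1066 = pct_gain(buffered[("DDR2 on P35", 1066)],
                             buffered[("DDR2 on P965", 1066)])

for label, value in [
    ("Kingston vs. Corsair DDR3 at 800", low_latency_gain),
    ("P35 DDR2 vs. Kingston DDR3 at 800", ddr2_lead),
    ("Kingston vs. Corsair DDR3 at 1333", gain_1333),
    ("P35 vs. P965 DDR2 at 800", chipset_gain_800),
    ("P35 vs. P965 DDR2 at 1066", chipset_gain_1066),
]:
    print(f"{label}: {value:.1f}%")
```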

We also test memory with buffering schemes like MMX, SSE, SSE2, SSE3, etc., turned off. While these features do appear to improve bandwidth, the unbuffered bandwidth tends to correlate better with actual gaming and application performance. Unbuffered performance does not always follow the patterns of buffered memory performance.

Unbuffered Sandra XI.SP2 Memory Bandwidth - 2.66GHz
Each cell lists bandwidth in MB/s, followed by the timings and voltage used.

Memory                                  800                     1066                    1333                    1520 (380x7)
Kingston DDR3-1333 KHX11000D3LLK2       4411 (5-4-3-10 1.75V)   4761 (6-5-5-12 1.7V)    4936 (7-7-6-15 1.7V)    5172 (8-8-8-22 1.8V)
Corsair DDR3-1066 CM3X1024-1066C7       4098 (6-6-6-15 1.5V)    4547 (7-7-7-20 1.5V)    4702 (9-9-9-25 1.5V)    -
DDR2 - P35 Corsair Dominator            4536 (3-3-3-9 2.25V)    4926 (4-4-3-11 2.3V)    -                       -
DDR2 - P965 (10x266) Corsair Dominator  4226 (3-3-3-9 2.25V)    4608 (4-4-3-11 2.3V)    -                       -

Unbuffered results show the same basic pattern as buffered results in this case, although the domination of P35 in bandwidth performance is not as pervasive. At 800 and 1066 speeds, the best bandwidth is with fast DDR2 on the P35 chipset, next is this Kingston DDR3-1375, then fast DDR2 on P965, and last is slower DDR3. Unbuffered bandwidth is a good mirror of real-world performance, and this is what we expect in gaming tests. It is interesting that the lower latency Kingston has now passed DDR2 on the P965 and is nearly equal in unbuffered bandwidth to fast DDR2 on P35.
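
The ordering described for the unbuffered results can be reproduced by sorting the table values. This is only a small sketch; the configuration labels and helper names are shorthand introduced here.

```python
# Unbuffered Sandra bandwidth in MB/s at the two overlap speeds,
# copied from the unbuffered table. Labels are shorthand for this sketch.
unbuffered = {
    800: {
        "DDR2 on P35": 4536,
        "Kingston DDR3": 4411,
        "DDR2 on P965": 4226,
        "Corsair DDR3": 4098,
    },
    1066: {
        "DDR2 on P35": 4926,
        "Kingston DDR3": 4761,
        "DDR2 on P965": 4608,
        "Corsair DDR3": 4547,
    },
}


def rank_by_bandwidth(results: dict) -> list:
    """Configuration names ordered from highest to lowest bandwidth."""
    return sorted(results, key=results.get, reverse=True)


def kingston_gap_pct(results: dict) -> float:
    """How far the Kingston DDR3 trails DDR2 on P35, in percent."""
    best = results["DDR2 on P35"]
    return (best - results["Kingston DDR3"]) / best * 100.0


for speed, results in unbuffered.items():
    order = " > ".join(rank_by_bandwidth(results))
    print(f"{speed}: {order} (Kingston trails by {kingston_gap_pct(results):.1f}%)")
```

At both speeds the ranking matches the order given in the text, and the Kingston stays within a few percent of fast DDR2 on the P35.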

Of course, DDR2 could not reach the 1333 speed, so the higher speeds of 1333 and 1500+ are the domain of DDR3, and here the Kingston memory shows its true capabilities. Lower latency DDR3 appears to be able to close any gaps that might exist in the overlap speeds of 800 and 1066.


42 Comments


  • Wesley Fink - Thursday, May 24, 2007

    We ran a complete test suite at DDR3-1500 7-7-7-15. Not surprisingly ALL of the results were a bit higher than those reported at 1520 9-8-8-22.

    As a result we will be replacing the 1520 results on all performance charts with the higher 1500 7-7-7 results. Give us about 15 minutes to complete the update. Enjoy!
  • photoguy99 - Thursday, May 24, 2007

    It would be a good accomplishment for Barcelona to come out and surpass Core2 performance that wowed the world last year.

    But how many of these can Barcelona beat:
    1) Original Core2 Quad at 2.66GHz (probably what they were aiming for)
    2) Add P35 chipset for 5-10% performance increase
    3) Add DDR3 at 1333MHz or higher with low latencies for 5-10% increase
    4) Add Penryn core for 5-10% performance increase at same clock speed
    5) Penryn releases at 3.2 GHz, add another 10% increase

    When is the pain gonna stop for AMD?

    It seems by this fall the Intel platform is going to be a lot faster than the original Core2 or Core2 quad releases.
  • defter - Friday, May 25, 2007

    quote:

    5) Penryn releases at 3.2 GHz, add another 10% increase


    Since Intel has already demonstrated air-cooled 3.33GHz Penryn based quad cores, and desktop Penryn based CPUs will use 1333MHz FSB and support half multipliers, I guess that desktop Penryn based quad core CPUs can be launched at least at 3.33-3.5GHz if necessary.
  • TA152H - Thursday, May 24, 2007

    OK, this post really irritates me.

    You think AMD started design on the Barcelona last year? How else could you possibly say they were aiming for the 2.66 Core 2 before it was even released if this wasn't true? Good grief, think!

    The P35 most certainly does NOT add 5-10% application performance. Maybe in specific applications you will see something like this, but overall, it's not that high.

    DDR3 at 1333 isn't adding much of anything right now. 5-10%???? Where are you getting these numbers from? In fact, in every gaming benchmark they ran, it was either slower or the same as the DDR2-1066. 5-10% my ass.

    Penryn numbers are also made up, it would be extremely optimistic for 5-10% increase in IPC for most applications. Maybe a few will, but broadly, it's probably not true, and absolutely speculative.

    Hmmmm, going from 3.0 GHz they have out now, to 3.2 GHz is 10%? I think it's more like 6.67%.

    In short, all your assumptions are, at best, speculative or, at worst, just wrong.

    Will DDR3 timings go down? Of course, but so will DDR2 since that's the dominant memory. Considering the changes to the Barcelona memory controller, I think you can expect a pretty substantial improvement there, but we won't know until we see it. A lot of stuff we won't know until we see it.

    The big thing that bothers me is AMD still has not fully implemented memory disambiguation, and while the scheduling of loads is improved to P6 levels, I'm not sure if it's enough. I'm also not crazy about their substantial x87 implementation, as it's a deprecated technology and more and more becoming dead weight. It's not even part of x86-64.

    So, I'm not saying Barcelona will be better or worse, we'll see soon enough, but the reasons you give are, at best, specious, and at worst pure nonsense.

  • yacoub - Thursday, May 24, 2007

    I would guess they would aim for 20-25% improvement over last year's Core 2 Duo, so somewhere around 3-4 of your 5 should be the level of Barcelona performance if it works out. In that case, since I don't think you'll see all 5 of those combined this year, especially at a competitive price point, I think Barcelona still has a chance. =)
  • Anonymous Freak - Thursday, May 24, 2007

    One of my big gripes with the DDR3 reviews so far, which were the same when DDR2 first came out, is the direct comparison of same-bus-speed results. Of *COURSE* DDR3 at 800 MHz will be slower than DDR2 at 800 MHz. As this review shows, even the best DDR3 timings are slower than the best DDR2 timings.

    But, that's not what DDR3 is designed to do. It's designed to have higher latency in exchange for significantly higher bus speeds, as this test shows. You should be comparing the DDR3-1333 results with the DDR2-800 or 1066 results.

    Just as when DDR2 came out, it had much higher latency than DDR1, but faster bus speeds. Try comparing a top of the line DDR2 rig to a top of the line DDR1 rig now. (Say AMD AM2 vs. 939.) The faster bus speed of the DDR2 rig will just blow away the DDR1 rig, regardless of how good the DDR1 timings are. The same will be true with DDR3. Faster timings will come, as will faster bus speeds. The two will cause DDR3 to completely dominate even the fastest overclocked DDR2. Just look at this review, we have fast, but *within spec* DDR3 performing the same as the ultimate in overclocked DDR2. Just wait until we have the equivalent ultra-high-end DDR3 running at a *fully within spec* 1600 MHz with 5-3-3 timings; and we'll probably see overclocked settings even higher.
  • lopri - Friday, May 25, 2007

    I'm afraid that your assertion is not quite the reality. AM2 CPU's memory controller has never been up to the level of Socket 939 CPU's. Under the same configuration sans memory, Socket 939 rig will always win over Socket AM2 rig.
  • takumsawsherman - Thursday, May 24, 2007

    I doubt you will actually see a significant difference between DDR and DDR2 running on otherwise similar chipsets. It wasn't very difficult to find 2-2-2-5-1 or 2-2-2-6 latencies with DDR memory. Even now, I am finding it hard to consistently source DDR2 for a reasonable price that has a reasonably low latency. But if you were to take 2-2-2-5-1 DDR and 3-4-3-9 DDR2 module pairs and run them with similar chipsets, with the same processors, you may in fact get some victories for DDR in your benchmarks.

    Bandwidth isn't everything. For some tasks, latency is far more important. Therefore, it is vitally important for someone to actually test real world scenarios and publish results. That way, people can save their money for an upgrade that might have a chance at improving their performance.
  • bobsmith1492 - Thursday, May 24, 2007

    Don't forget... latency is not just the CAS number; it is a function of the clock speed and the number of cycles of latency. The overall latency time is the important part. DDRII 800MHz at CAS3 will have better latency than DDRI 400MHz at CAS2 (if either of those exist even...)
  • Chunga29 - Thursday, May 24, 2007

    Those both exist as unofficial RAM speeds, though the DDR is harder to find these days.
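
The point raised in the comments, that real latency depends on both the cycle count and the clock speed, can be checked with a short calculation. This is a rough sketch, assuming the first timing number is the CAS latency in clock cycles and that the memory clock runs at half the DDR data rate; the function name is made up for illustration.

```python
def cas_latency_ns(data_rate_mts: float, cas_cycles: float) -> float:
    """Absolute CAS latency in nanoseconds.

    Assumes a DDR-style interface: two transfers per clock, so the
    clock in MHz is half the rated data rate.
    """
    clock_mhz = data_rate_mts / 2.0
    return cas_cycles * 1000.0 / clock_mhz


# The two cases from the comment thread.
ddr2_800_cas3 = cas_latency_ns(800, 3)
ddr_400_cas2 = cas_latency_ns(400, 2)

# The Kingston DDR3 at 800 from the tables, which uses CAS 5.
ddr3_800_cas5 = cas_latency_ns(800, 5)

print(f"DDR2-800 CAS3: {ddr2_800_cas3} ns")
print(f"DDR-400 CAS2:  {ddr_400_cas2} ns")
print(f"DDR3-800 CAS5: {ddr3_800_cas5} ns")
```

Under these assumptions DDR2-800 at CAS3 comes out ahead of DDR-400 at CAS2 in absolute terms, even though its cycle count is higher. This looks at CAS latency alone and ignores the other timings.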
