Bandwidth and Memory Scaling
AMD builds the memory controller into its processors, while Intel still places it in the motherboard chipset. An on-chip memory controller is theoretically superior, yet Intel keeps improving memory bandwidth with each new chipset. You have already seen in DDR3 vs. DDR2 and Intel P35 Memory Performance: A Closer Look that the new P35 improves memory bandwidth by 16% to 18% over the P965 at the same speed and memory timings.
Kingston KHX11000D3LLK2 is the first DDR3 kit to offer lower latency, and this should further improve memory bandwidth. We compared standard (buffered) bandwidth on the P965 running DDR2, the new P35 running DDR2, and the new P35 running Kingston DDR3-1375.
Standard (Buffered) Sandra XI.SP2 Memory Bandwidth (MB/s) - 2.66GHz
Memory | 800 | 1066 | 1333 | 1520 (380x7) |
Kingston DDR3-1333 KHX11000D3LLK2 | 6341 (5-4-3-10, 1.75V) | 6736 (6-5-5-12, 1.7V) | 6928 (7-7-6-15, 1.7V) | 7329 (8-8-8-22, 1.8V) |
Corsair DDR3-1066 CM3X1024-1066C7 | 6156 (6-6-6-15, 1.5V) | 6613 (7-7-7-20, 1.5V) | 6757 (9-9-9-25, 1.5V) | - |
DDR2 - P35 Corsair Dominator | 6456 (3-3-3-9, 2.25V) | 6811 (4-4-3-11, 2.3V) | - | - |
DDR2 - P965 (10x266) Corsair Dominator | 5531 (3-3-3-9, 2.25V) | 5782 (4-4-3-11, 2.3V) | - | - |
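The headings in the table above pair a memory speed with an FSB and multiplier setting. As a rough sketch of that arithmetic, the Python below multiplies out the CPU clock and scales the memory speed with the FSB. The function names are ours, and the 4:1 memory-to-FSB relationship at the 1333 setting is an assumption inferred from 1333 / 333, not something the BIOS reports.

```python
# Sketch of the clock arithmetic implied by the table headings.
# Function names are illustrative; they do not come from any tool used in the review.

def cpu_clock_mhz(fsb_mhz, multiplier):
    """CPU core clock = FSB base clock x CPU multiplier."""
    return fsb_mhz * multiplier

def memory_speed(fsb_mhz, mem_per_fsb):
    """Effective memory speed = FSB base clock x memory-to-FSB factor."""
    return fsb_mhz * mem_per_fsb

# ASSUMPTION: the "1333" memory setting runs memory at 4x the FSB base clock
# (inferred from 1333 / 333); other memory settings use other factors.
MEM_PER_FSB_AT_1333_SETTING = 4

print(cpu_clock_mhz(380, 7))    # 2660, the "380x7" column
print(cpu_clock_mhz(266, 10))   # 2660, the P965 "10x266" row
print(memory_speed(380, MEM_PER_FSB_AT_1333_SETTING))  # 1520
```

Under these assumptions both configurations land on the same 2.66GHz CPU clock, so the columns differ in FSB and memory speed rather than core clock.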
At DDR3-800 the Kingston manages stable 5-4-3-10 timings at 1.75V. Based on current motherboards and JEDEC standards, the fastest available DDR3 timings are 5-3-3, so the Kingston DDR3-1375 is very close to the theoretical limit at DDR3-800. With the improved timings, the Kingston is about 3% faster than DDR3 at 6-6-6 timings. Fast DDR2 on the P35 delivers the highest buffered bandwidth, but the difference between P35 DDR2 at 3-3-3 and P35 DDR3 at 5-4-3 is less than 2%. All P35 results, even those at the slower 6-6-6 timings, show higher bandwidth than the P965 with DDR2 at 3-3-3.
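The percentages quoted for the 800 speed can be rechecked from the buffered table. The snippet below is a minimal sketch of that arithmetic; the helper name and dictionary labels are ours, and the numbers are the table's MB/s results.

```python
# Recompute the buffered-bandwidth comparisons at the 800 speed.
# Values are the Sandra buffered results (MB/s) from the table.

def pct_faster(a, b):
    """Percentage by which result a exceeds result b."""
    return (a - b) / b * 100.0

buffered_800 = {
    "Kingston DDR3 5-4-3-10 (P35)": 6341,
    "Corsair DDR3 6-6-6-15 (P35)": 6156,
    "DDR2 3-3-3-9 (P35)": 6456,
    "DDR2 3-3-3-9 (P965)": 5531,
}

kingston = buffered_800["Kingston DDR3 5-4-3-10 (P35)"]
corsair_ddr3 = buffered_800["Corsair DDR3 6-6-6-15 (P35)"]
ddr2_p35 = buffered_800["DDR2 3-3-3-9 (P35)"]
ddr2_p965 = buffered_800["DDR2 3-3-3-9 (P965)"]

print(round(pct_faster(kingston, corsair_ddr3), 1))  # 3.0
print(round(pct_faster(ddr2_p35, kingston), 1))      # 1.8
print(round(pct_faster(ddr2_p35, ddr2_p965), 1))     # 16.7

# Every P35 result at this speed should beat the P965 result.
p35_results = [v for k, v in buffered_800.items() if "(P35)" in k]
print(all(v > ddr2_p965 for v in p35_results))       # True
```

The last percentage is the chipset-to-chipset gap with identical DDR2 and timings, which falls inside the 16% to 18% range cited earlier.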
At both 800 and 1066, Kingston DDR3-1375 bandwidth is very close to that of fast DDR2 on the P35. Again, all P35 results, even with slower DDR3, are faster than P965 with fast DDR2 memory. At 1333 the Kingston's 7-7-6-15 timings improve bandwidth by around 2.5% over DDR3 at 9-9-9-25, and the lower latency allows the Kingston to run as fast as DDR3-1520 at 8-8-8-22 timings. However, the best bandwidth was achieved at faster timings and a slightly slower speed. As shown above, the Kingston managed 1500 speed at 7-7-7-15 timings, where standard buffered bandwidth is almost 7500 MB/s.
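One way to see why tighter timings at a slightly lower speed can win is to convert CAS latency from clock cycles to nanoseconds. The sketch below uses the standard relationship (one memory clock lasts 2000 / data rate nanoseconds, since DDR transfers twice per clock). The function name is ours, and absolute CAS latency is only one of several factors behind the measured bandwidth.

```python
# Convert CAS latency from cycles to nanoseconds for the settings discussed.
# DDR memory transfers twice per clock, so clock period (ns) = 2000 / data rate.

def cas_latency_ns(cas_cycles, data_rate):
    """Absolute CAS latency in ns for a given CL and DDR data rate (e.g. 800, 1333)."""
    return cas_cycles * 2000.0 / data_rate

settings = [
    ("DDR2-800  CL3", 3, 800),
    ("DDR3-800  CL5", 5, 800),
    ("DDR3-1333 CL7", 7, 1333),
    ("DDR3-1500 CL7", 7, 1500),
    ("DDR3-1520 CL8", 8, 1520),
]

for label, cl, rate in settings:
    print(f"{label}: {cas_latency_ns(cl, rate):.2f} ns")
```

By this measure DDR3-1500 at CL7 has a lower absolute latency than DDR3-1520 at CL8, which is consistent with the 1500 result coming out ahead.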
We also test memory with buffering schemes like MMX, SSE, SSE2, SSE3, etc. turned off. While these features do raise measured bandwidth, unbuffered bandwidth tends to correlate better with actual gaming and application performance. Unbuffered performance does not always follow the patterns of buffered memory performance.
Unbuffered Sandra XI.SP2 Memory Bandwidth (MB/s) - 2.66GHz
Memory | 800 | 1066 | 1333 | 1520 (380x7) |
Kingston DDR3-1333 KHX11000D3LLK2 | 4411 (5-4-3-10, 1.75V) | 4761 (6-5-5-12, 1.7V) | 4936 (7-7-6-15, 1.7V) | 5172 (8-8-8-22, 1.8V) |
Corsair DDR3-1066 CM3X1024-1066C7 | 4098 (6-6-6-15, 1.5V) | 4547 (7-7-7-20, 1.5V) | 4702 (9-9-9-25, 1.5V) | - |
DDR2 - P35 Corsair Dominator | 4536 (3-3-3-9, 2.25V) | 4926 (4-4-3-11, 2.3V) | - | - |
DDR2 - P965 (10x266) Corsair Dominator | 4226 (3-3-3-9, 2.25V) | 4608 (4-4-3-11, 2.3V) | - | - |
Unbuffered results show the same basic pattern as buffered results in this case, although the P35 does not dominate as thoroughly. At 800 and 1066 speeds, the best bandwidth comes from fast DDR2 on the P35 chipset, followed by this Kingston DDR3-1375, then fast DDR2 on P965, and last the slower DDR3. Unbuffered bandwidth is a good mirror of real-world performance, so this is the order we expect in gaming tests. It is interesting that the lower latency Kingston has now passed DDR2 on the P965 and is nearly the equal of fast DDR2 on P35 in unbuffered bandwidth.
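The ordering described here can be confirmed by sorting the unbuffered results at the two overlap speeds. This is a small sketch using the table's MB/s values; the dictionary labels and function name are ours.

```python
# Rank the four configurations by unbuffered bandwidth at 800 and 1066.
# Values are the Sandra unbuffered results (MB/s) from the table.

unbuffered = {
    "DDR2 on P35": {800: 4536, 1066: 4926},
    "Kingston DDR3 on P35": {800: 4411, 1066: 4761},
    "DDR2 on P965": {800: 4226, 1066: 4608},
    "Corsair DDR3 on P35": {800: 4098, 1066: 4547},
}

def ranking(speed):
    """Configuration names sorted from highest to lowest bandwidth at a speed."""
    return sorted(unbuffered, key=lambda name: unbuffered[name][speed], reverse=True)

for speed in (800, 1066):
    print(speed, ranking(speed))
```

Both speeds produce the same order: DDR2 on P35, Kingston DDR3, DDR2 on P965, then the slower Corsair DDR3.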
Of course DDR2 could not reach the 1333 speed, so 1333 and 1500+ are the domain of DDR3, and here the Kingston memory shows its true capabilities. Lower latency DDR3 appears able to close any gaps that might exist at the overlap speeds of 800 and 1066.
42 Comments
goinginstyle - Thursday, May 24, 2007
How did you arrive at the 1520 DDR3 memory speed? Was it an FSB increase from 8x333 or a memory ratio change? Do you have any overclocked DDR2 memory scores on the P965? It would be interesting to compare overclocked DDR2 to DDR3.
Wesley Fink - Thursday, May 24, 2007
You can look back at the Corsair Dominator memory review where we ran benchmarks at the highest overclock we could achieve. The review is at http://www.anandtech.com/memory/showdoc.aspx?i=291... There are also overclocked test scores that can be compared in any of our more recent DDR2 reviews.
Wesley Fink - Thursday, May 24, 2007
From the 1333 memory setting we overclocked to 380x8, or 3.04GHz. At that OC, with a base 1333 memory setting, the memory speed is 1520.
One reader pointed out that 7x380 is also 2.66, which is our test frequency at other speeds. That is correct and it is an intriguing idea to also run all benchmarks at the 380x7 speed. We'll consider it for a comparison in an upcoming review.
goinginstyle - Thursday, May 24, 2007
So it is very possible that the improvements in scores came from the increase in CPU speed and not the memory, or is it a combination of both? How close can you get to 1333 memory speed at 8x380, so we know how much of the improvement comes from CPU speed rather than the increase in memory speed?
That is what has been confusing to me. Why not run at 7x380 to keep the CPU at the same speed so we can see how much performance is gained by running the memory higher? The one flaw is that the increase in FSB speed would alter the scores if the app responds to CPU throughput improvements. I would suppose that would be minimal in the game testing, but it would throw off the Sandra scores. Do high memory speeds at high latencies beat stock memory speeds at low latencies?
The article yesterday mentioned 1T command rates. Did you try 1T to see what happened with the Kingston memory? You used to report Everest scores and I was wondering if those scores are available, or maybe Memtest if you use it. I think it would be interesting to see latency numbers in the article.
Wesley Fink - Thursday, May 24, 2007
Our standard procedure has been to test to the highest available memory setting, in this case 1333, and then overclock as far as we can go using this base memory setting. It is just a fortunate accident that 1520 was the top OC here (and it still wasn't the fastest result; 1500 7-7-7 was faster), which also works out to 7x380, or the same 2.66 used in the other memory speed tests. It would not likely hit that exact number again in future DDR3 reviews.
yuchai - Thursday, May 24, 2007
The 1520 speed is probably achieved by a 380 x 7 = 2660 configuration, so processor speed remains constant while the RAM runs at 1520 speeds.
That said, I'm surprised at the big improvement from 1333 to 1520, especially compared to the relatively small difference between 1333 and 1066.
goinginstyle - Thursday, May 24, 2007
If that is the case, then how do we know how much the FSB increased the score or how much the memory affected the results? I still think it is important to show overclocked DDR2 if they are going to show overclocked DDR3.
Chunga29 - Thursday, May 24, 2007
I wish that you were correct, but looking at the tables, at least one says "8x380" - page 4. So it's not apples to apples. The text never talks about how fast the 1520 RAM speed is, likely because that's partly due to a 14% CPU overclock.
While we're at it, where are the numbers for P965 with 1333 FSB? We've seen overclocking results on P965 with bus speeds as high as 2000+, so don't give us any excuses about it not being possible. Using ratios, you can come somewhat close to DDR2-800 and DDR2-1066, and if you're throwing in overclocked DDR3 scores anyway.... At least let us see what DDR2 can achieve on P965 with a decent effort. Sure, it's out of official spec, but then DDR2-800 with 3-3-3 timings isn't JEDEC spec either.
Wesley Fink - Friday, May 25, 2007
The 7x380 and 8x380 results are in a comment below and will be added to the OC section in a table.
As for the P965, it was not designed to run 1333 processors or DDR3 memory, so there is no 1333 CPU ratio available or any memory ratio above 1066. While it is true you can run a 25% overclock at 1333 FSB, the memory is also overclocked 25% from whatever ratio you selected. Even if you OC and select ratios to get close to 1333, you will be running different memory straps on the P35 and P965, which definitely impacts results. It is very difficult to fairly compare P965 to P35 at speeds above 1066.
At 1333 FSB the DDR2 memory is OC'ed from the 1066 base to 1333, and we don't have a single stick of DDR2 that is stable at 1333. An 800 speed base on P965 at 1333 would be DDR2-1000, which should be compared to what on the P35? Try to select OC values on your P965 board to see what we are talking about here.
You are correct that it is not impossible to come up with something somewhat close in a P965 test; it is just that everything on the P965 would be overclocked while the P35 would be running in spec. We can always compare an overclocked P965 to a spec part, but that is more like justification for a P965 purchase than a revealing comparison.
We will likely run some more P965 tests just to answer questions here, but we will only be including overlap speeds, where comparisons can be fairly made, in future reviews. There are also a multitude of P965 OC results in reviews out there for those that are interested.
Zaitsev - Thursday, May 24, 2007
I noticed this as well. It just seems odd because the jump from 1066->1333 is 267MHz, while 1333->1520 is 187MHz. In Far Cry and Quake 4 that translated into 10.91 and 8 more frames per sec. respectively. Did I miss something in the article or can someone explain why a smaller increase in MHz yielded a larger improvement?
Oh, I see now that the processor is overclocked.