Bandwidth and Memory Scaling
One of the surprises in comparing DDR2 performance on AM2 and Core 2 Duo was the much better memory bandwidth found on the AM2 platform courtesy of the on-chip memory controller. Unfortunately, this did not translate into significant performance improvements compared to a similar AMD processor running DDR. At that point we concluded that Core 2 Duo was not particularly bandwidth sensitive, since it made very good use of the memory bandwidth available.
In our earlier review we were really comparing the DDR2 memory controller on AM2 to the 975X chipset memory controller, since Intel continues to place the memory controller in the chipset. We have speculated since then whether an improved memory controller in a socket 775 chipset would bring with it improved performance.
P965 brought very minor changes, mainly in the straps and overclocking ability of the memory. The NVIDIA 680i/670/650 chipsets actually show decreased buffered bandwidth, but unbuffered bandwidth is about the same as P965. This reinforced the notion that memory bandwidth didn't matter much with Core 2.
To begin our investigation into DDR3 performance, we compared Standard or Buffered bandwidth on the P965 running DDR2, the new P35 running DDR2, and the new P35 running DDR3. As you can see, the results are very interesting.
Standard (Buffered) Sandra XI.SP2 Memory Bandwidth (MB/s) - 2.66GHz
Memory Speed | P965 ASUS P5B Dlx | P35 DDR2 ASUS P5K Dlx | P35 DDR3 ASUS P5K3 Dlx
DDR2-800 3-3-3-9 | 5531 | 6456 | -
DDR2-800 5/6-6-6-15 / DDR3-800 6-6-6-15 | 5207 | 6143 | 6156
DDR2-1067 4-4-3-11 | 5782 | 6811 | -
DDR2-1067 5/6-6-6-15 | 5712 | 6621 | -
DDR3-1067 7-7-7-20 | - | - | 6613
DDR3-1333 9-9-9-25 | - | - | 6757
While the purpose of this review was to compare DDR3 and DDR2 performance, something completely different emerged from the memory bandwidth tests. Namely, the memory controller on the P35 is definitely an improvement over the P965 memory controller. This is evident whether the P35 is running DDR2 or DDR3 memory.
In those cases where we can run the same or nearly the same timings, as at 800 memory speed, DDR2 and DDR3 results are virtually identical. At 1067, DDR3 at its current slow timings of 7-7-7-20 performs just as well as DDR2 running at 6-6-6-15. The superior DDR2-1067 timings of 4-4-3-11 still provide the best bandwidth at that speed. Of course, DDR3 is currently alone at the 1333 memory speed, but even with the current slow 9-9-9-25 timings it performs nearly as well as DDR2-1067 at 4-4-3-11 timings.
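To put numbers on "virtually identical", the buffered P35 scores from the table above can be compared directly. The following is a minimal Python sketch rather than part of our test suite: the dictionary simply restates the table values, and the helper name `percent_diff` and the choice of pairs are our own for illustration.

```python
# Buffered Sandra XI.SP2 scores on the P35 boards, restated from the table above.
P35_BUFFERED = {
    "DDR2-800 5/6-6-6-15": 6143,
    "DDR3-800 6-6-6-15": 6156,
    "DDR2-1067 4-4-3-11": 6811,
    "DDR2-1067 5/6-6-6-15": 6621,
    "DDR3-1067 7-7-7-20": 6613,
    "DDR3-1333 9-9-9-25": 6757,
}


def percent_diff(value, baseline):
    """Return how far `value` sits above (+) or below (-) `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


# Each pair is (DDR3 setting, DDR2 setting it is compared against in the text).
PAIRS = [
    ("DDR3-800 6-6-6-15", "DDR2-800 5/6-6-6-15"),
    ("DDR3-1067 7-7-7-20", "DDR2-1067 5/6-6-6-15"),
    ("DDR3-1333 9-9-9-25", "DDR2-1067 4-4-3-11"),
]

for ddr3, ddr2 in PAIRS:
    diff = percent_diff(P35_BUFFERED[ddr3], P35_BUFFERED[ddr2])
    print(f"{ddr3} vs {ddr2}: {diff:+.1f}%")
```

All three gaps come out under 1% in either direction, which is why we treat these configurations as equivalent in buffered bandwidth.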
We normally also test memory with buffering schemes like MMX, SSE, SSE2, SSE3, etc. turned off. While these features do provide apparent improved bandwidth, we have found the unbuffered bandwidth to correlate better with real-world application performance. Unbuffered performance does not always follow the patterns of buffered memory performance.
Unbuffered Sandra XI.SP2 Memory Bandwidth (MB/s) - 2.66GHz
Memory Speed | P965 ASUS P5B Dlx | P35 DDR2 ASUS P5K Dlx | P35 DDR3 ASUS P5K3 Dlx
DDR2-800 3-3-3-9 | 4226 | 4536 | -
DDR2-800 5/6-6-6-15 / DDR3-800 6-6-6-15 | 3668 | 3975 | 4098
DDR2-1067 4-4-3-11 | 4608 | 4926 | -
DDR2-1067 5/6-6-6-15 | 4389 | 4557 | -
DDR3-1067 7-7-7-20 | - | - | 4547
DDR3-1333 9-9-9-25 | - | - | 4702
Unbuffered results show the same basic pattern as buffered results in this case. At 800 memory speed with the same slow timings, DDR3 is clearly the best performer, with DDR2 on the P35 about 3% behind and DDR2 on the P965 about 10% behind (3668 versus 4098). DDR2 is still faster at the better timings available with current DDR2 memory.
In Standard/Buffered memory bandwidth, the P35 (Bearlake) chipset is providing a 16% to 18% improvement in memory bandwidth compared to the P965. This is a significant improvement. The Unbuffered improvement is smaller, in the range of 4% to 8%. These bandwidth improvements may or may not translate into improved system performance. We will examine that in the SuperPi and Gaming benchmarks.
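The chipset improvement ranges quoted above can be reproduced from the two tables. This is a minimal Python sketch under the assumption that only the four DDR2 rows with results on both chipsets are compared; the names `gain_percent` and `gain_range` are our own for illustration.

```python
# Sandra XI.SP2 scores restated from the two tables above as (P965, P35 DDR2).
BUFFERED = {
    "DDR2-800 3-3-3-9": (5531, 6456),
    "DDR2-800 5/6-6-6-15": (5207, 6143),
    "DDR2-1067 4-4-3-11": (5782, 6811),
    "DDR2-1067 5/6-6-6-15": (5712, 6621),
}
UNBUFFERED = {
    "DDR2-800 3-3-3-9": (4226, 4536),
    "DDR2-800 5/6-6-6-15": (3668, 3975),
    "DDR2-1067 4-4-3-11": (4608, 4926),
    "DDR2-1067 5/6-6-6-15": (4389, 4557),
}


def gain_percent(p965, p35):
    """Percentage by which the P35 score exceeds the P965 score."""
    return (p35 - p965) / p965 * 100.0


def gain_range(results):
    """Smallest and largest P35-over-P965 gain across all memory settings."""
    gains = [gain_percent(p965, p35) for p965, p35 in results.values()]
    return min(gains), max(gains)


for label, results in (("Buffered", BUFFERED), ("Unbuffered", UNBUFFERED)):
    low, high = gain_range(results)
    print(f"{label}: +{low:.1f}% to +{high:.1f}%")
```

Rounded to whole percentages, the buffered range matches the 16% to 18% figure and the unbuffered range matches 4% to 8%. The smallest gain in both tables comes from the DDR2-1067 5/6-6-6-15 row.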
45 Comments
Final Hamlet - Tuesday, May 15, 2007
Oh look! It's Bicycle Repair Man!
just4U - Wednesday, May 16, 2007
I don't really know what to think about DDR3, or even DDR2.
When I jumped from PC2700 memory (CAS 3, I think) up to PC3200 memory at CAS 2, I noticed an increase in the overall speed of my computer.
When I jumped to DDR2 PC5300 I noticed no speed increase. When I jumped to PC6400 I noticed no speed increase. When I overclock my memory I notice no...
You get the picture.
I have a question. DDR3 with tight timings: will we actually notice it over DDR2? Or will it be one of those things where only a benchmark notices anything?
Starglider - Friday, May 18, 2007
Depends on what applications you're running. But based on your recent experience, probably not.
defter - Tuesday, May 15, 2007
The article says that the P965 board was running at 1066MHz FSB, and then the writer is surprised when P35 + DDR2 running at 1333MHz FSB is slightly faster??? Then in the conclusion it's said that the reason for the increased performance is the chipset instead of the higher FSB???
If you want to make a P965 vs P35+DDR2 comparison, why not use the same FSB.....
Wesley Fink - Tuesday, May 15, 2007
The P965 was running at 10x266 or 2.66GHz. The P35 boards ran at 8x333 or 2.66GHz. 1333 is not a natural CPU ratio on the P965, so when you choose 1333 there is no way to also choose DDR2-800. At 1333 the closest you can get (at a different strap) is 833. We believe the way we tested was as close to apples to apples as we could design. Remember we are looking at MEMORY PERFORMANCE, and the FSB should not matter as long as memory speed and timings are set the same on all test boards, which we definitely ensured.
Despite that, we know that Sandra XI results can be influenced a small amount by CPU speed and possibly FSB. To make sure our results were still as close to apples to apples as possible we did run 10x266 on all boards and compared results to our test setup. The MEMORY PERFORMANCE results were virtually the same as we have reported.
We do agree that were we testing CPU performance, the differing bus speeds at the same resultant CPU speed could make some difference.
IntelUser2000 - Wednesday, May 16, 2007
Defter is right. You guys were just testing the performance increase from using a 1333 FSB instead of a 1066 FSB. Memory bandwidth tests like Sandra will show HUGE differences. Remember, the CPU will not benefit from faster memory when the FSB isn't faster. It's different from AMD, where HyperTransport is only used for northbridge to southbridge communications (which yields a 0% improvement) on the PC.
Wesley Fink - Wednesday, May 16, 2007
P35 is a combination of an increase in bus speed to 1333 and an improved memory controller. We are working on a followup article to appear in a few days that shows the breakdown of the bus speed contribution and the memory controller contribution.
That does not change the fact that memory bandwidth for P35 is improved 16% to 18% over P965, but we have run additional tests to show the individual impact of the bus speed increase and the memory controller improvements.
In approaching testing it is not possible to run 1333 FSB on the P965/975X and also run memory speeds like 800 and 1066 as we would like. On P35, if 1066 FSB is selected then 1333 is not available as a memory option. We can, however, run 1066 FSB on P35 to roughly determine the memory controller contribution compared to P965, and then compare those results to 1333 equivalents (8x333 instead of 10x266) to see the additional impact of the 1333 bus on memory performance. Those results will be reported as soon as testing is complete.
defter - Wednesday, May 16, 2007
Why not report results at 1066MHz FSB then?
Intel's CPUs are FSB limited since memory traffic goes through FSB. For example, 1066MHz FSB can transfer 8.5GB/s while dual channel DDR2-800 can provide 12.8GB/s. Thus, increasing the FSB also affects memory performance.
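The figures quoted here follow from the usual peak-bandwidth arithmetic: transfer rate times bus width times number of channels. A minimal Python sketch, assuming a 64-bit FSB data path, 64-bit memory channels, and 1GB counted as 10^9 bytes; the function name `peak_bandwidth_gb_s` is hypothetical, and these are theoretical ceilings rather than measured results.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, bus_width_bits=64, channels=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 10^9 bytes).

    mega_transfers_per_s: effective transfer rate, e.g. 1066 for a 1066 FSB
                          or 800 for DDR2-800.
    bus_width_bits:       data path width (assumed 64 bits for both the FSB
                          and a single memory channel).
    channels:             number of channels operating in parallel.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000.0


fsb_1066 = peak_bandwidth_gb_s(1066)
fsb_1333 = peak_bandwidth_gb_s(1333)
ddr2_800_dual = peak_bandwidth_gb_s(800, channels=2)

print(f"1066 FSB:              {fsb_1066:.1f} GB/s")
print(f"1333 FSB:              {fsb_1333:.1f} GB/s")
print(f"Dual-channel DDR2-800: {ddr2_800_dual:.1f} GB/s")
```

Under these assumptions, dual-channel DDR2-800 can theoretically supply more than either bus speed can carry, which is the basis for the argument that raising the FSB also affects memory results.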
vaystrem - Tuesday, May 15, 2007
I'm curious:
A) Why did you choose Far Cry?
B) Why didn't you include more game tests?
I think it would be particularly interesting to see how a game like Supreme Commander or Company of Heroes performs. In strategy games you have the stress from the graphical component and a heavy AI load, which may take better advantage of all that bandwidth than a simple FPS.
Wesley Fink - Tuesday, May 15, 2007
This was a comparison of DDR3 and DDR2, not a launch review for P35. You will see that when the performance embargo lifts on May 21st. Far Cry is part of our standard memory test suite and we are very familiar with how it behaves with variations in memory bandwidth and timings. That is why it was chosen.
You will see test results with many more games when we review the chipset on May 21st and the new P35 motherboards on June 4th.
Consider this a preview, and all the advance info we could give you at this point. The NDA for DDR3 memory has lifted, but the performance NDA for P35 is in place until the 21st.