Memory Latency Impact on Performance
We just looked at the impact of memory bandwidth on performance, but what about latency? Let's start by adjusting the CAS latency from our default of 2 clocks up to 3 clocks. Almost all DDR400 these days is CAS 2 memory, but older memory may have a higher CAS latency, or you may have to increase your CAS latency when overclocking to gain more memory bandwidth. So what kind of performance hit is there when going from CAS 2 to CAS 3?
| | at_canals_08 | at_coast_05 | at_coast_12 | at_prison_05 | at_c17_12 |
|---|---|---|---|---|---|
| Tcl = 2 | 116.12 | 140.43 | 123.37 | 113.69 | 83.15 |
| Tcl = 3 | 115.52 | 137.07 | 121.91 | 113.37 | 79.92 |
At worst, CAS 2 memory is about 4% faster than CAS 3 memory (83.15 vs. 79.92 in at_c17_12, our most CPU intensive test). While 4% alone isn't anything major, combine that with a number of other performance tweaks and they can definitely begin to add up.
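The gap can be checked directly against the table. The following is a minimal Python sketch rather than anything from the original test setup; the names `cas2`, `cas3`, and `pct_faster` are illustrative, and the scores are copied from the Tcl results above.

```python
# Scores copied from the Tcl table, keyed by demo name.
cas2 = {"at_canals_08": 116.12, "at_coast_05": 140.43, "at_coast_12": 123.37,
        "at_prison_05": 113.69, "at_c17_12": 83.15}
cas3 = {"at_canals_08": 115.52, "at_coast_05": 137.07, "at_coast_12": 121.91,
        "at_prison_05": 113.37, "at_c17_12": 79.92}


def pct_faster(fast, slow):
    """Return how much faster `fast` is than `slow`, in percent."""
    return (fast / slow - 1.0) * 100.0


# Per-demo advantage of CAS 2 over CAS 3.
advantage = {demo: pct_faster(cas2[demo], cas3[demo]) for demo in cas2}

for demo, pct in advantage.items():
    print(f"{demo}: CAS 2 is {pct:.2f}% faster than CAS 3")
```

Run as written, at_c17_12 shows the largest gap, with the other four demos all under 3%.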
Now let's keep Tcl (CAS latency) fixed at 2 clocks and raise Trcd from 2 up to 6 clocks:
| | at_canals_08 | at_coast_05 | at_coast_12 | at_prison_05 | at_c17_12 |
|---|---|---|---|---|---|
| Trcd = 2 | 116.12 | 140.43 | 123.37 | 113.69 | 83.15 |
| Trcd = 3 | 115.71 | 136.99 | 122.46 | 113.08 | 79.97 |
| Trcd = 4 | 113.92 | 134.42 | 120.87 | 112.38 | 79.83 |
| Trcd = 5 | 113.42 | 131.82 | 119.34 | 114.79 | 79.12 |
| Trcd = 6 | 113.23 | 128.26 | 117.56 | 111.15 | 77.4 |
For the most part we saw only small changes when adjusting Trcd. The clearest exception is at_coast_05, which fell from 140.43 at a Trcd of 2 to 128.26 at a Trcd of 6, a drop of nearly 9%.
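To see which demos are most sensitive to Trcd, the results can be ranked by how far each score falls between the lowest and highest setting. This is a rough Python sketch; `trcd_scores`, `pct_drop`, and `ranked` are illustrative names, and the numbers are taken from the Trcd results above.

```python
demos = ["at_canals_08", "at_coast_05", "at_coast_12", "at_prison_05", "at_c17_12"]

# One row per Trcd setting, in the same demo order as `demos`.
trcd_scores = {
    2: [116.12, 140.43, 123.37, 113.69, 83.15],
    3: [115.71, 136.99, 122.46, 113.08, 79.97],
    4: [113.92, 134.42, 120.87, 112.38, 79.83],
    5: [113.42, 131.82, 119.34, 114.79, 79.12],
    6: [113.23, 128.26, 117.56, 111.15, 77.4],
}


def pct_drop(baseline, score):
    """Return the percentage lost going from `baseline` to `score`."""
    return (1.0 - score / baseline) * 100.0


# Drop from Trcd = 2 to Trcd = 6 for each demo.
drop_at_6 = {
    demo: pct_drop(trcd_scores[2][i], trcd_scores[6][i])
    for i, demo in enumerate(demos)
}

# Most sensitive demo first.
ranked = sorted(drop_at_6.items(), key=lambda kv: kv[1], reverse=True)

for demo, drop in ranked:
    print(f"{demo}: {drop:.2f}% slower at Trcd = 6")
```

Ranking by the endpoints alone ignores the intermediate settings, so it is only a first look; at_prison_05, for example, does not fall steadily across the sweep.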
Next we'll look at adjusting Trp:
| | at_canals_08 | at_coast_05 | at_coast_12 | at_prison_05 | at_c17_12 |
|---|---|---|---|---|---|
| Trp = 2 | 116.12 | 140.43 | 123.37 | 113.69 | 83.15 |
| Trp = 3 | 115.6 | 139.24 | 123.13 | 116.35 | 82.09 |
| Trp = 4 | 115.85 | 138.88 | 122.98 | 113.16 | 82.05 |
| Trp = 5 | 114.84 | 138 | 122.65 | 112 | 80.98 |
| Trp = 6 | 114.5 | 136.95 | 121.96 | 115.61 | 80.95 |
Here we see very little impact on performance.
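One way to put a number on "very little" is to look at the spread between the best and worst score for each demo across the whole Trp sweep. The sketch below does that in Python; `trp_scores`, `by_demo`, and `spread` are illustrative names, and expressing the spread relative to the Trp = 2 score is our choice of baseline, not something the tests prescribe.

```python
demos = ["at_canals_08", "at_coast_05", "at_coast_12", "at_prison_05", "at_c17_12"]

# One row per Trp setting, in the same demo order as `demos`.
trp_scores = {
    2: [116.12, 140.43, 123.37, 113.69, 83.15],
    3: [115.6, 139.24, 123.13, 116.35, 82.09],
    4: [115.85, 138.88, 122.98, 113.16, 82.05],
    5: [114.84, 138, 122.65, 112, 80.98],
    6: [114.5, 136.95, 121.96, 115.61, 80.95],
}

# Transpose so each demo maps to its scores across the sweep (Trp = 2 first).
by_demo = {
    demo: [row[i] for row in trp_scores.values()]
    for i, demo in enumerate(demos)
}

# Spread = (best - worst) as a percentage of the Trp = 2 score.
spread = {
    demo: (max(scores) - min(scores)) / scores[0] * 100.0
    for demo, scores in by_demo.items()
}

for demo, pct in spread.items():
    print(f"{demo}: best-to-worst spread of {pct:.2f}%")
```

Every demo stays within a 4% spread. The at_prison_05 scores move up and down rather than falling steadily, which may mean run-to-run variation is of similar size to the Trp effect in that test.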
Putting them all together, we can see what the overall impact of using fast DDR400, higher latency DDR400 and extremely high latency DDR400 will be:
| | at_canals_08 | at_coast_05 | at_coast_12 | at_prison_05 | at_c17_12 |
|---|---|---|---|---|---|
| 2-2-2-10 | 116.12 | 140.43 | 123.37 | 113.69 | 83.15 |
| 3-3-3-10 | 114.47 | 134.11 | 120.64 | 112.62 | 80.56 |
| 3-6-6-10 | 110.74 | 123.76 | 114.75 | 112.17 | 73.8 |
Our standard 2-2-2-10 memory does actually offer reasonable performance benefits in Half Life 2 compared to DDR400 with higher timings such as 3-3-3-10 or the unrealistically high 3-6-6-10.
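For a single summary figure per configuration, each score can be normalized to the 2-2-2-10 result and averaged across the five demos. This is only a sketch: the unweighted arithmetic mean is an assumption on our part (other weightings are just as defensible), and `configs`, `baseline`, and `mean_rel` are illustrative names.

```python
from statistics import mean

# Scores per timing configuration, in the order:
# at_canals_08, at_coast_05, at_coast_12, at_prison_05, at_c17_12
configs = {
    "2-2-2-10": [116.12, 140.43, 123.37, 113.69, 83.15],
    "3-3-3-10": [114.47, 134.11, 120.64, 112.62, 80.56],
    "3-6-6-10": [110.74, 123.76, 114.75, 112.17, 73.8],
}

baseline = configs["2-2-2-10"]

# Mean performance relative to 2-2-2-10 (1.0 means identical on average).
mean_rel = {
    name: mean(score / base for score, base in zip(scores, baseline))
    for name, scores in configs.items()
}

for name, rel in mean_rel.items():
    print(f"{name}: {rel * 100:.1f}% of 2-2-2-10 performance on average")
```

On this measure 3-3-3-10 lands between 97% and 98% of the 2-2-2-10 result, and 3-6-6-10 at roughly 93%.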
First and foremost, Half Life 2 does appear to be rather dependent on memory bandwidth, but it is also quite appreciative of low latency memory. If you're wondering whether being able to run memory at low timings and high clock speeds is important, when it comes to Half Life 2 performance, it is.
68 Comments
Roooooooooooooooooot - Wednesday, January 26, 2005
Ommmmm ... yes, a very nice article :)
More than any article I've seen, this one made the point about the power of AMD processors for 3D gaming.
"Megahertz Shmegahertz" could have been the title. The 3.8 Pentium running neck and neck with a 2 GHz AMD CPU. Now I understand !!
One place I worked had hundreds of Dell Workstations. They gave us dual Xeon ultra-SCSI jobbies. It may not sound like much, but two 1 GHz Xeons with an 18 Gig U160 (or was it U320?) SCSI HDD and an ATI FireGL graphics card was what I had.
I would love to see an article about corporate CAD machines, AMD vs. Intel with various scales of video cards.
bamacre - Wednesday, January 26, 2005
1,000,001 demands for A XP benchies, how about one for high-end Northwood P4's ?? Please?
AkumaX - Wednesday, January 26, 2005
I one MEELIONTH the motion, i wish there were a XP barton benchmark somewhere in there, not just w/ the XP3200 (2.2ghz) but also w/ 2.3ghz and 2.4ghz (since most of us appear to also be running o/ced mobiles :P)
michael2k - Wednesday, January 26, 2005
What did you expect? People were demanding the HL2 CPU article in the Mac threads... and lo and behold, the next day, Anand has posted the HL2 CPU article.
You can either get something now, or you can get something finished... very rarely can you get both :)
Crassus - Wednesday, January 26, 2005
I was also quite surprised that the article still appeared. I'm glad it did, but I think it falls short of AnandTech's high standard:
1. For comparison, at least two AXP (two to see how it scales) should be in the test field
2. As previously mentioned, Processor/speed/cache/socket. There is more than just one Athlon 64 3000+
3. Including CAS 2.5 would have been nice as this seems to be the default for people using mainstream DDR3200 RAM
4. What's the deal with the Athlon 3500+ in diagram 3 on page 2?
Something else bothered me:
Quote: "If you are stuck with one of those older but still well-performing GPUs, don't bother upgrading your CPU unless it's something slower than a 2.4GHz Pentium 4 - you'd be much better served by waiting and upgrading to dual core later on."
Common wisdom seemed to be that especially games don't take advantage of multi-threading. Do you have any new information that upcoming games are geared more towards multiple CPUs/cores/HT?
quanta - Wednesday, January 26, 2005
The test didn't show the impact of using partial precision vs full precision on NVIDIA cards. As some people have mentioned[1], Half-Life 2 doesn't need full 32-bit precision to run smoothly. In effect, NVIDIA card is running in speed crippled by the game's designers.
[1] http://3dgpu.com/archives/2004/12/01/boost-perform...
DavidHull - Wednesday, January 26, 2005
I second the need for SLI configurations to be included, as many reviewers have found them to be extremely limited by the CPU.
AtaStrumf - Wednesday, January 26, 2005
Interactive 3D charts in flash. Khm,... can it be done?
AtaStrumf - Wednesday, January 26, 2005
I was starting to think this article had been binned, but fortunately it wasn't.
First of all I agree with the need for an AXP 3200+ in the charts. It's still a very, very common CPU!
Secondly this is only an OK article by Anand's standards. The first thing that really bothered me was how the CPU's are marked by some very long but also very useless names, like
AMD Athlon 64 3400+ (2.4 GHz)
Intel Pentium 4 570 (3.8 GHz)
This takes a lot of room on the charts but still tells me nothing about Cache size or Socket type. I suggest names like:
A64 3400+/S754/512kB/2.4GHz (it's shorter and says a lot more)
Same thing for Intel: P4 570/Socket/Cache...
And how in the hell did you come up with CAS3? Most DDR 400 RAM (excluding OEMs) is CAS 2.5 and not 3 or 2. I appreciate the memory tests very much though, I just regret the very basic mistake in the underlying assumptions.
I understand it was a low priority, seriously delayed article, but I just can't shake the feeling it could have been so much more.
One of these days I'm gonna have to take some time and put together a demo of how data is properly presented.
Questar - Wednesday, January 26, 2005
"Next let’s take a look at at_coast_05, another very GPU limited test that has a good deal of NPC interaction as well as GPU limiting elements:"
How the hell could this be GPU limited if the difference from top to bottom of the graph is > 50%?