The Dark Knight: Intel's Core i7
by Anand Lal Shimpi & Gary Key on November 3, 2008 12:00 AM EST - Posted in CPUs
Multiple Clock Domains
Functionally, there are some basic differences between Nehalem and previous Intel architectures. The Front Side Bus is gone, replaced with Intel's QuickPath Interconnect (QPI), which is similar to AMD's HyperTransport. The QPI implementation on the first Nehalem is a 25.6GB/s interface, which matches up perfectly with the 25.6GB/s of memory bandwidth Nehalem has.
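To see where the two 25.6GB/s figures could come from, here is a minimal sketch of the arithmetic. The inputs are assumptions rather than numbers quoted above: a 6.4GT/s QPI link carrying 2 bytes per transfer in each direction, and three channels of DDR3-1066 memory, each 8 bytes wide.

```python
# Back-of-the-envelope bandwidth check. All default values are assumptions,
# not figures stated in the article text.

def qpi_bandwidth_gbs(transfers_gt_s: float = 6.4,
                      bytes_per_transfer: int = 2,
                      directions: int = 2) -> float:
    """Aggregate QPI bandwidth in GB/s (both directions combined)."""
    return transfers_gt_s * bytes_per_transfer * directions


def memory_bandwidth_gbs(channels: int = 3,
                         mega_transfers_s: float = 3200 / 3,  # DDR3-1066
                         bytes_per_transfer: int = 8) -> float:
    """Peak memory bandwidth in GB/s across all channels."""
    return channels * mega_transfers_s * bytes_per_transfer / 1000


if __name__ == "__main__":
    print(f"QPI:    {qpi_bandwidth_gbs():.1f} GB/s")
    print(f"Memory: {memory_bandwidth_gbs():.1f} GB/s")
```

Under those assumptions both functions land on the same 25.6GB/s figure. Note that the QPI number counts traffic in both directions at once.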
The CPU operates on a multiplier of the QPI source clock (BCLK), which in this case is 133MHz. The top bin Nehalem runs at 3.2GHz, or 133MHz x 24. The L3 cache and memory controller operate on a separate clock frequency called the un-core clock. This frequency is currently 20x the BCLK, or 2.66GHz.
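The multiplier arithmetic can be sketched in a few lines. This is only an illustration: the function name is made up, and the base clock is assumed to be 133.33MHz (400/3), of which the 133MHz quoted above is the rounded value.

```python
# Clock domains derived from one base clock. BCLK_MHZ is an assumption:
# the article's "133MHz" is taken to be the rounded form of 133.33MHz.
BCLK_MHZ = 400 / 3

CORE_MULTIPLIER = 24    # top bin core multiplier from the text
UNCORE_MULTIPLIER = 20  # L3 cache and memory controller multiplier


def domain_clock_mhz(multiplier: int, bclk_mhz: float = BCLK_MHZ) -> float:
    """Clock of a domain, in MHz, given its multiplier of the base clock."""
    return multiplier * bclk_mhz


core_mhz = domain_clock_mhz(CORE_MULTIPLIER)
uncore_mhz = domain_clock_mhz(UNCORE_MULTIPLIER)

if __name__ == "__main__":
    print(f"core:    {core_mhz / 1000:.2f} GHz")
    print(f"un-core: {uncore_mhz / 1000:.2f} GHz")
```

With a flat 133MHz the core would work out to 3192MHz, which is why the fractional base clock is assumed here. The un-core result is roughly 2667MHz, which the text truncates to 2.66GHz.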
This is all very similar to AMD's Phenom, but where the two differ is in how they handle power management. While AMD will allow individual cores to request different clock speeds, Nehalem attempts to run all of its cores at the same frequency; if one core is idle then it's simply power gated and the core is effectively turned off. I explain this in greater detail here but the end result is that we don't have the strange performance issues that sometimes appear with AMD's Cool'n'Quiet enabled. While we have to turn off CnQ to get repeatable results in some of our benchmarks (in some cases we'll see a 50% performance hit with CnQ enabled), Intel's EIST seems to be fine when turned on and does not concern us.
My Concern
Looking at Nehalem's microarchitecture, one thing becomes very clear: this is a CPU designed to address Intel's shortcomings in the server space. There's nothing inherently wrong with that, but it's a different approach from the one Intel took with Conroe. With Conroe, Intel took a mobile architecture and, following the philosophy that what was good for mobile in terms of power efficiency and performance per watt would also be good for the desktop, created its current microarchitecture.
This was in stark contrast to how microprocessor development used to work: chips would be designed for the server/workstation/high end desktop market and then trickle down to mainstream users and the mobile space. Conroe changed all of that, and it's a good part of why Intel's Core 2 architecture makes such a great desktop and mobile processor.
Power obviously also matters in servers, but not to the same extent as in notebooks. Needless to say, Conroe did well in the server market, but it lacked some key features, and that allowed AMD to hang onto market share.
Nehalem started out as an architecture that addressed these enterprise shortcomings head on. The on-die memory controller, Hyper-Threading, larger TLBs, improved virtualization performance, a restructured cache hierarchy and the new 2nd level branch predictor will all be very important to making Intel more competitive in the enterprise space, but at what cost to desktop power consumption and performance?
Intel promises better energy efficiency for the desktop; we'll be the judge of that...
I'm stating the concern up front because that's what I had in mind when I approached today's Nehalem review. Everyone has high expectations for Nehalem, but it hasn't been that long since Intel dropped Prescott on us. What I want to find out is whether Intel has stayed true to its mission of keeping power in check or if we've simply regressed with Nehalem.
The only hope I had for Nehalem was that it was the first high performance desktop core that implemented Intel's new 2:1 performance:power ratio rule. Under this rule, which was also used by the Atom design team, every feature that made its way into Nehalem had to increase performance by 2% for every 1% increase in power consumption; otherwise it wasn't allowed in the design. In the past Intel used a general 1:1 ratio between power and performance, but with Nehalem the standards were much higher. We'll find out if Intel was all talk in a moment, but let's take a look at Nehalem's biggest weakness first.
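As a rough illustration of how such a rule filters features, here is a minimal sketch. The function name, the handling of features that cost no extra power, and the sample numbers in the usage note are all hypothetical; Intel's actual evaluation process isn't described here.

```python
# Illustrative gate for a performance:power ratio rule. The names and the
# zero-power-cost handling are assumptions made for this sketch.

def passes_ratio_rule(perf_gain_pct: float,
                      power_increase_pct: float,
                      required_ratio: float = 2.0) -> bool:
    """True if the feature gains at least `required_ratio` percent of
    performance for every 1 percent of added power consumption."""
    if power_increase_pct <= 0:
        # Assumption: a feature with no power cost is accepted as long
        # as it does not reduce performance.
        return perf_gain_pct >= 0
    return perf_gain_pct >= required_ratio * power_increase_pct


if __name__ == "__main__":
    # Made-up example features: (performance gain %, power increase %).
    candidates = {"feature A": (4.0, 2.0), "feature B": (3.0, 2.0)}
    for name, (perf, power) in candidates.items():
        verdict = "kept" if passes_ratio_rule(perf, power) else "rejected"
        print(f"{name}: {verdict}")
```

With the invented numbers above, feature A (4% for 2%) just clears the 2:1 bar and feature B (3% for 2%) is rejected. Passing `required_ratio=1.0` models the older 1:1 guideline, under which both would have been kept.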
73 Comments
npp - Tuesday, November 4, 2008 - link
Well, the funny thing is THG got it all messed up, again - they posted a large "CRIPPLED OVERCLOCKING" article yesterday, and today I saw a kind of apology from them - they seem to have overlooked a simple BIOS switch that prevents the load through the CPU from rising above 100A. Having a month to prepare the launch article, they didn't even bother to tweak the BIOS a bit. That's why I'm not taking their articles seriously, not because they are biased towards Intel or AMD - they are simply not up to the standards (especially those here @anandtech).
gvaley - Tuesday, November 4, 2008 - link
Now give us those 64-bit benchmarks. We already knew that Core i7 would be faster than Core 2; we even knew how much faster.
Now, it was expected that 64-bit performance would be better on Core i7 than on Core 2. Is that true? Draw a parallel between the following:
Performance jump from 32- to 64-bit on Core 2
vs.
Performance jump from 32- to 64-bit on Core i7
vs.
Performance jump from 32- to 64-bit on Phenom
badboy4dee - Tuesday, November 4, 2008 - link
And what are those numbers on the charts there? Are they frames per second? Higher is better then, if that's what they are. Charts need more detail or explanation to them, dude!
TSM
MarchTheMonth - Tuesday, November 4, 2008 - link
I don't believe I saw this anywhere else, but are the mounting spots for the cooler on the mobo the same as on LGA 775, i.e. can we use (non-Intel) coolers that exist now for the new socket?
marc1000 - Tuesday, November 4, 2008 - link
No, the new socket is different. The holes are 80mm apart; on socket 775 they were 72mm apart.
Agitated - Tuesday, November 4, 2008 - link
Any info on whether these parts provide an improvement on virtualized workloads or maybe what the various vm companies have planned for optimizing their current software for nehalem?
yyrkoon - Tuesday, November 4, 2008 - link
Either I am not reading things correctly, or the 130W TDP does not look promising for the end user such as myself that requires/wants a low powered high performance CPU.
The future in my book is using less power, not more, and Intel does not right now seem to be going in this direction. To top things off, the performance increase does not seem to be enough to justify this power increase.
Being completely off grid(100% solar / wind power), there seem to be very few options . . . I would like to see this change. Right now as it stands, sticking with the older architecture seems to make more sense.
3DoubleD - Tuesday, November 4, 2008 - link
130W TDP isn't much worse than previous generations of quad core processors, which were ~100W TDP. Also, TDP isn't a measure of power usage, but of the required thermal dissipation of a system to maintain an operating temperature below a set value (e.g. Tjmax). So if Tjmax is lower for i7 processors than it is for past quad cores, it may use the same amount of power, but have a higher TDP requirement. The article indicates that power draw has increased, but usually with a large increase in performance. Page 9 of the article has determined that this chip has a greater performance/watt than its predecessors by a significant margin.
If you are looking for something that is extremely low power, you shouldn't be looking at a quad core processor. Go buy a laptop (or an EeePC-type laptop with an Atom processor). Intel has kept true to its promise of a 2% performance increase for every 1% power increase (i.e. a higher performance per watt value).
Also, you would probably save more power overall if you just hibernate your computer when you aren't using it.
Comdrpopnfresh - Monday, November 3, 2008 - link
Do differing cores have access to another's L2? Is it directly, through QPI, or through L3?
Also, is the L2 inclusive in the L3; does the L3 contain the L2 data?
xipo - Monday, November 3, 2008 - link
I know games are not the strong area of Nehalem, but there are 2 games I'd like to see tested: Unreal T. 3 and Half Life 2 E2, just to know how Nehalem handles those 2 engines ;D