Intel Core i7 920 940 965 Review & Overclocking - PAGE 3
William Henning - Sunday, November 2nd, 2008

Core i7 Architecture - Continued

Nehalem also greatly improves the memory architecture, massively increasing the bandwidth available to the cores, not only from the caches but also from external main memory. Careful attention was paid to reducing latency at every level of the memory hierarchy: the L1, L2 and L3 caches as well as the new on-chip triple-channel memory controller.

Intel considers the Nehalem to be divisible into two areas - the "Core" area, consisting of a number (currently four) of processor cores with individual L1 and L2 caches, and the "Uncore" area, consisting of a shared L3 cache, an Integrated Memory Controller (currently with three channels), a number of Quick Path Interconnects, and a Power&Clock section.

For servers and desktops in 2008-2009, Intel intends to differentiate its offerings by varying the:

  • number of cores
  • number of memory channels
  • number of QPI links
  • size of caches
  • type of memory supported
  • power management
  • integrated graphics

Intel is increasing cache performance by moving to per-core, low-latency L1 and L2 caches backed by a single shared L3 cache. The L3 cache is inclusive; that is, anything present in a core's L1 or L2 cache must be present in the L3 cache as well.

This presents some advantages. Intel has added a "present in core's L2 cache" bit to each cache line in the L3, which significantly reduces cache snooping and cache coherency traffic: if the requested data is not available in the L3 cache, it is guaranteed not to be in the L1/L2 caches of any core. If the bit shows that the L3 cache line is present in another core, that core must be snooped to see if it has modified the cache line. The relatively small sizes of the L1 and L2 caches allow them to be built with very low latency and help minimize cache coherency checks. They also allow for reducing the size of the L3 cache to as little as 1MB in a four core processor.
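The snoop-filtering idea can be sketched in a few lines of Python. This is only an illustrative model, not Intel's implementation: the `L3Line` class, the `cores_to_snoop` function and the dictionary standing in for the L3 are hypothetical names, and the sketch assumes one presence bit per core on each L3 line.

```python
from dataclasses import dataclass, field

NUM_CORES = 4  # four cores, as in the current Nehalem parts


@dataclass
class L3Line:
    # Assumed layout: one presence bit per core, True if that core
    # may hold a copy of this line in its private L1/L2 caches.
    present_in_core: list = field(default_factory=lambda: [False] * NUM_CORES)


def cores_to_snoop(l3, address, requesting_core):
    """Return the cores that must be snooped when `requesting_core` reads `address`.

    `l3` is a dict mapping line addresses to L3Line objects (a stand-in
    for the real cache arrays).
    """
    line = l3.get(address)
    if line is None:
        # Miss in an inclusive L3: no core's L1/L2 can hold the line,
        # so no snoops are needed and the request goes to memory.
        return []
    # Hit: only cores whose presence bit is set need to be checked
    # for a modified copy.
    return [core for core, present in enumerate(line.present_in_core)
            if present and core != requesting_core]


l3 = {
    0x1000: L3Line([False, True, False, False]),  # line also cached by core 1
    0x2000: L3Line(),                             # line held only in the L3
}
```

With this toy cache, a read from core 0 to `0x3000` (an L3 miss) or to `0x2000` (no presence bits set) requires no snoops at all, while a read to `0x1000` snoops only core 1 rather than every other core.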

Nehalem significantly increases the scalability of multi-processor systems by placing the memory controller on the processor: adding a socket also adds another three memory channels that may be populated, so the memory bandwidth available to a server grows with the number of processors.

Currently Nehalem officially supports up to DDR3-1333, but as you will see, we were able to exceed that in our tests. Having the memory controller - with a potential peak 32GB/sec bandwidth - on the processor allows for hitherto unseen (on Intel platforms) low memory latencies. Nehalem will also support both RDIMM and UDIMM memory.
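The 32GB/sec peak figure follows from simple arithmetic. The sketch below assumes the standard 64-bit (8 byte) DDR3 channel width, which the article does not state, and uses a hypothetical helper name, `peak_bandwidth_gb_s`; the `sockets` parameter models the way each added processor brings its own three channels.

```python
def peak_bandwidth_gb_s(transfers_mt_s, channels=3, bus_bytes=8, sockets=1):
    """Theoretical peak memory bandwidth in GB/sec (1 GB = 10**9 bytes).

    transfers_mt_s: memory data rate in mega-transfers per second,
                    e.g. 1333 for DDR3-1333.
    channels:       memory channels per socket (three on Nehalem).
    bus_bytes:      bytes moved per transfer per channel (assumed 8,
                    i.e. a 64-bit channel).
    sockets:        number of populated processor sockets.
    """
    return transfers_mt_s * 1e6 * bus_bytes * channels * sockets / 1e9


single_socket = peak_bandwidth_gb_s(1333)            # three channels of DDR3-1333
dual_socket = peak_bandwidth_gb_s(1333, sockets=2)   # two sockets, six channels
```

For DDR3-1333 this comes to roughly 10.7GB/sec per channel and just under 32GB/sec for three channels, in line with the peak figure quoted above. These are theoretical peaks; measured bandwidth will be lower.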

Using the Quick Path Interconnect, Intel adds the NUMA (Non-Uniform Memory Access) capability needed to access the memory attached to other processors in the system. The memory local to any processor socket will always be faster to access than memory attached to another processor; however, QPI will make non-local memory available at data rates comparable to, and in some cases faster than, those of current Intel FSB designs.

The combination of the triple-channel memory controller and QPI is likely to erode the advantage AMD currently enjoys in multi-socket servers, giving Intel inroads into the only market where it arguably still runs second to AMD.

The improved virtualization support not only reduces the time cost of entering and leaving a virtual machine, it also reduces the number of such transitions. Extended page tables translate guest physical addresses to host physical addresses in hardware, removing the #1 cause of having to leave the virtual machine and allowing virtual guests full control over their own page tables. A virtual processor ID also helps reduce the frequency of TLB entry invalidations.
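The two-stage translation can be illustrated with a deliberately simplified Python model. Real page tables are multi-level structures walked by hardware; here each table is a flat dictionary from page number to frame number, and all names (`translate`, `guest_virtual_to_host_physical`, the sample tables) are hypothetical. A 4KB page size is assumed.

```python
PAGE_SIZE = 4096  # assumed 4KB pages


def translate(page_table, address):
    """Map an address through a flat {page number: frame number} table."""
    page, offset = divmod(address, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset


def guest_virtual_to_host_physical(guest_page_table, extended_page_table,
                                   guest_virtual):
    # Stage 1: the guest's own page table, fully under guest control,
    # maps guest virtual addresses to guest physical addresses.
    guest_physical = translate(guest_page_table, guest_virtual)
    # Stage 2: the extended page table, owned by the hypervisor, maps
    # guest physical addresses to host physical addresses.
    return translate(extended_page_table, guest_physical)


guest_page_table = {0: 5}      # guest virtual page 0 -> guest physical page 5
extended_page_table = {5: 9}   # guest physical page 5 -> host physical page 9
host_physical = guest_virtual_to_host_physical(
    guest_page_table, extended_page_table, 0x123)
```

Because the second stage is applied on top of whatever the guest's table says, the guest can edit `guest_page_table` freely without the hypervisor having to intercept each change, which is the source of the reduced transition count.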

Intel is also updating its optimizing compilers, and so is Microsoft - the new Visual Studio 2008 will fully support SSE4.2.

 

 


