Moving the memory controller into the processor changes the general logic of benchmarking computers. The main factor is the disappearance of the usual mediator between the processor and memory: the Northbridge. On the one hand, testing becomes easier. CPU performance no longer depends on a given chipset, and usually not on the motherboard in general, as the latter simply becomes a board that unites the other components. Of course, certain chipset controllers still affect the performance of the disk system or peripheral interfaces, but the CPU itself is free of their influence. On the other hand, CPU tests grow more complex, as performance may change quite unpredictably depending on the memory configuration.
This happens because the memory controller is now an integral part of the processor, so it can be affected by the processor's other units, and it can affect them as well. For example, who cared about the power consumption or heat output of a chipset (provided it came with sufficient cooling)? Nobody did. It did not matter whether it consumed more or less: its performance did not depend on this parameter, and the CPU was physically separate from it. But now all these extra watts and degrees are added to the CPU's own, which may bring it closer to the throttling threshold and reduce the speed of the computational units.
Besides, latencies have become more important. Access times have always affected the resulting performance, but this effect used to be largely masked by the complex memory access procedure: memory requests from the processor passed through several stages, each one adding its own latency. An integrated memory controller removes these stages, significantly reducing the overall latency. As a result, the intrinsic latencies of the memory modules start to play an important role, as do the latencies of the memory controller itself: there are no mediators, total access times are much shorter, so each nanosecond counts.
So, when the role of one component is simplified, the other components get more complicated to test.
None of this is much of a revelation: AMD processors have been using an integrated memory controller for over six years, so we have already accumulated enough information on the issue. However, Intel processors hold a much bigger share of the market (which means more users), so this change in memory operation has become important for those users only now, with the rollout of popular processors with an integrated memory controller. The LGA1366 family, with its tiny share of the market (much smaller than that of the various Athlons and Phenoms), couldn't possibly be such a turning point. LGA1156 is. So we are not going to postpone our analysis of memory operation on this platform.
LGA1156 and LGA1366: differences in memory subsystems
If you take a closer look at motherboards for both platforms, you will see practically no noticeable differences: both camps are equipped with four or six DDR3 memory slots. In fact, they use them differently. On LGA1366 systems, all six slots can be populated with any modules (up to 4GB each), for a total of up to 24GB. But don't expect such miracles from the lower platform, even if you have a motherboard with six slots (like our Gigabyte P55-UD6). LGA1156 supports three memory modules per channel, but under a very strict condition: the total number of memory banks cannot exceed eight. That is, if you install two-bank modules, you can fill only four slots out of six. You can use all six slots only with single-bank modules. In our case, all memory modules in our lab (even the 1GB ones) used two banks. So what's the conclusion? For one, don't chase after six memory slots: you will most likely use no more than four anyway. Besides, the maximum memory capacity does not depend on the number of slots installed, because single-bank modules have half the capacity of two-bank modules. So the maximum capacity for this platform is 16GB so far. And if you use inexpensive 2GB modules, the maximum capacity will be only 8GB.
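The arithmetic behind these limits can be sketched in a few lines of Python. This is only a back-of-the-envelope model: the function names and the six-slot default are ours, introduced for illustration, while the eight-bank limit and the module sizes are the figures quoted above. It does not describe how the controller actually enumerates modules.

```python
# Back-of-the-envelope model of the LGA1156 bank limit described above.
# All names are illustrative; only the numbers come from the article.

BANK_LIMIT = 8  # total banks the controller supports (from the text)
SLOTS = 6       # slots on a six-slot board such as the one mentioned above


def usable_modules(banks_per_module: int, slots: int = SLOTS,
                   bank_limit: int = BANK_LIMIT) -> int:
    """How many identical modules fit before slots or banks run out."""
    return min(slots, bank_limit // banks_per_module)


def max_capacity_gb(module_gb: int, banks_per_module: int,
                    slots: int = SLOTS, bank_limit: int = BANK_LIMIT) -> int:
    """Total capacity reachable with identical modules."""
    return usable_modules(banks_per_module, slots, bank_limit) * module_gb


# (description, module size in GB, banks per module)
configs = [
    ("4GB two-bank modules", 4, 2),
    ("2GB two-bank modules", 2, 2),
    ("2GB single-bank modules (hypothetical)", 2, 1),
]

for label, size_gb, banks in configs:
    count = usable_modules(banks)
    total = max_capacity_gb(size_gb, banks)
    print(f"{label}: {count} modules, {total} GB")
```

With two-bank modules the sketch fills four slots and gives 16GB or 8GB, matching the figures above. Six single-bank 2GB modules would fill every slot but reach only 12GB, which is why the extra slots do not raise the ceiling.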
That's 1.5 times less than the 12GB or 24GB available on the LGA1366 platform, a ratio dictated by the difference between triple- and dual-channel controllers. But don't jump to conclusions. In fact, the integrated controller in LGA1366 processors supports 18 memory banks, not 12. There is one restriction, though: only DDR3-800 memory is supported in this case. If only two modules per channel are installed (which is exactly what owners of X58-based motherboards have to do, as manufacturers are in no hurry to install nine slots there), we get DDR3-1066 or even DDR3-1333. DDR3-1333 and probably even faster modules (1600 or higher) will work only when a single module is installed per channel.
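To see what this trade-off between capacity and frequency means in raw numbers, here is a rough sketch of theoretical peak bandwidth. It assumes the standard 64-bit (8-byte) DDR3 channel and nominal transfer rates. The function name and the lookup table are ours, and the table takes the conservative DDR3-1066 figure for two modules per channel. These are paper figures only, not measured results.

```python
# Theoretical peak bandwidth for the configurations discussed above.
# Names are illustrative; the data rates per population come from the text.

BYTES_PER_TRANSFER = 8  # a DDR3 channel is 64 bits wide


def peak_bandwidth_gbs(channels: int, data_rate_mts: int) -> float:
    """Theoretical peak in GB/s (10^9 bytes/s) for identical channels."""
    return channels * BYTES_PER_TRANSFER * data_rate_mts / 1000


# LGA1366: modules per channel -> DDR3 data rate, as quoted in the text
# (two modules per channel taken at the conservative DDR3-1066 figure).
LGA1366_RATE_BY_MODULES_PER_CHANNEL = {3: 800, 2: 1066, 1: 1333}

for modules, rate in sorted(LGA1366_RATE_BY_MODULES_PER_CHANNEL.items()):
    bandwidth = peak_bandwidth_gbs(3, rate)
    print(f"LGA1366, {modules} per channel: DDR3-{rate}, {bandwidth:.1f} GB/s")

print(f"LGA1156, dual-channel DDR3-1333: {peak_bandwidth_gbs(2, 1333):.1f} GB/s")
```

On paper, a fully populated triple-channel system limited to DDR3-800 has less peak bandwidth than a dual-channel DDR3-1333 one, so the extra channel alone does not guarantee an advantage.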
So we can conclude that the memory controller in LGA1366 processors is very complex: it supports twice as much memory as its new desktop competitor. As for LGA1156, it has fewer channels and a simpler structure. And by the way, it may even work faster. This happens quite often. For example, engines of heavy-duty dump trucks often have the same horsepower as sports car engines, but the former convert all that power into carrying capacity, so the trucks cannot compete even with a budget compact in terms of speed. The same applies to processors for different platforms. So let's run our tests to see what exactly is going on.
We've selected the Core i7 860 as the main test object. This processor has a nominal core clock rate of 2.8 GHz and an Uncore frequency of 2.4 GHz. It officially supports DDR3-1333 memory (even higher frequencies are supported unofficially). We mostly used our standard test procedure. The detailed test results are published here.
Testbed configurations differ mostly in memory. The reference testbed, with which we have already tested processors for this socket, uses two modules from the Kingston KVR1333D3N9K3/6G triple-channel kit operating at 1333 MHz with 9-9-9-24 timings. However, 4GB is a bare minimum for a high-speed computer these days, so we have also experimented with memory capacity.
We've also added a couple of modules from the Walton Chaintech kit so that we get 8GB DDR3-1333 with the same timings.
We also decided to test a slow high-capacity configuration: the Chaintech modules plus a couple of 1GB modules from Apacer (6GB in total). Memory frequency in this case was set to 1066 MHz with 9-8-8-20 timings, for a fairer comparison with LGA1366 systems. That's because we had already tested the Core i7 920 with 4GB and 6GB of Kingston KVR1333D3N9K3/6G modules (1066 MHz, 8-8-8-19), that is, two and three modules in dual- and triple-channel modes. We took those very results for this article.
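Since timings are given in clock cycles, they are hard to compare across different frequencies. The sketch below converts the timings of the three configurations into nanoseconds. It assumes the figures are listed in the conventional CL-tRCD-tRP-tRAS order and that "1333 MHz" and "1066 MHz" are data rates, so the actual memory clock is half of that. The function name and labels are ours.

```python
# Convert memory timings from clock cycles to nanoseconds.
# Assumes the conventional CL-tRCD-tRP-tRAS order; names are illustrative.

TIMING_NAMES = ("CL", "tRCD", "tRP", "tRAS")


def cycles_to_ns(cycles: int, data_rate_mts: int) -> float:
    """Duration of `cycles` memory clocks at the given DDR data rate."""
    clock_mhz = data_rate_mts / 2  # DDR: two transfers per clock
    return cycles * 1000 / clock_mhz


# (label, data rate, timings) for the configurations named in the text
CONFIGS = [
    ("LGA1156, DDR3-1333, 9-9-9-24", 1333, (9, 9, 9, 24)),
    ("LGA1156, DDR3-1066, 9-8-8-20", 1066, (9, 8, 8, 20)),
    ("LGA1366, DDR3-1066, 8-8-8-19", 1066, (8, 8, 8, 19)),
]

for label, rate, timings in CONFIGS:
    parts = ", ".join(
        f"{name} {cycles_to_ns(value, rate):.1f} ns"
        for name, value in zip(TIMING_NAMES, timings)
    )
    print(f"{label}: {parts}")
```

In absolute terms, CL9 at DDR3-1333 works out to about 13.5 ns, which is shorter than CL8 at DDR3-1066 (about 15 ns). A larger number in the timing string does not necessarily mean slower memory.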