How CPU Features Affect CPU Performance, Part 5
Conclusions
Performance growth curves with and without TB and HT do not differ much. However, the small slump at 3 cores (6 threads) is more noticeable with TB and HT enabled.
Diagrams with performance gains show the same situation in more vivid colors: even if you once dreamed of a 3-core processor from Intel, you won't anymore. Without detailed documentation on the CPU structure (which no one will give us, as it is Intel's trade secret), it's hard to speculate why programs that are quite tolerant of three cores in AMD processors respond so negatively to three cores from Intel. We could have assumed that cores are not disabled correctly when Hyper-Threading is enabled, but the performance artifacts also appear without HT, and in that case there seems to be nothing that could go haywire: we have only three physical cores. Besides, a mistake in the test procedure would have affected all tests in the same way, whereas in our case some programs deal with three Intel cores quite well.
Frankly speaking, we cannot say that we understand what's going on with Core i7. For example, the problem may lie in shared L3 cache algorithms that were not designed for an odd number of cores. This hypothesis is as good as any. However, we could confirm or refute it only by looking at these algorithms, which is unfortunately impossible.
On the whole, Phenom II X4 scales better than Core i7 as the number of cores increases. However, the difference is not big enough to bury Intel, especially as the per-MHz performance of a single Intel core is much higher than that of AMD's flagship. Some of you may recall the beginning of the article, where we explained why comparing scalability with this test procedure works against Intel. To that we can answer: true, the conditions are not quite equal, but this is natural. Each AMD core is equipped with a relatively large L2 cache of 512 KB, so these cores depend less on L3. This is shown by the test results of AMD Athlon II X4, which has no L3 at all. So it really is a tad easier for AMD to scale its processors by the number of cores, because an Intel core has only half as much L2 cache. Thus, when a new core is added, it's desirable to enlarge L3 (and in Core i7 it's already the largest among all existing x86 desktop processors).
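To make "scalability" concrete, here is a minimal sketch of how speedup and per-core efficiency could be derived from benchmark scores measured at different core counts. The function name `scaling_efficiency` and the scores are hypothetical placeholders, not the measurements or the method used in this article.

```python
def scaling_efficiency(scores):
    """Map active core count -> (speedup vs. 1 core, efficiency per core).

    `scores` maps core count -> benchmark score (higher is better)
    and must contain an entry for 1 core, used as the baseline.
    """
    base = scores[1]
    return {n: (s / base, s / base / n) for n, s in sorted(scores.items())}


# Made-up placeholder scores for illustration only (NOT results from this article).
example = {1: 100.0, 2: 190.0, 3: 240.0, 4: 340.0}

for cores, (speedup, eff) in scaling_efficiency(example).items():
    print(f"{cores} cores: speedup {speedup:.2f}x, efficiency {eff:.0%}")
```

With numbers of this shape, a slump like the one discussed above would show up as an efficiency value at three cores that is lower than at both two and four cores.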
On the other hand, take another look at the last diagram. Is this millimeter of a difference actually worth discussing?