We continue our series of articles that analyze the performance of modern CPUs in real applications and examine the effect of individual processor features. This article looks at the number of cores in Intel Core i7. In other words, we repeat the first part of this series, but with a different processor.
Testbeds
The testbed remains the same as in the previous two articles devoted to Intel Core i7:
- CPU: Intel Core i7 950
- Cooler: ASUS Triton 81
- Motherboard: ASUS P6T SE (Intel X58)
- Memory: 3 x 2-GB Corsair DDR3-1800 in DDR3-1600 mode
- Graphics card: Palit GeForce GTX 275
- PSU: Cooler Master Real Power M1000
As with AMD Phenom II X4 940, we used the BIOS to disable CPU cores and obtain "dual-core" and "single-core" versions. Unfortunately, the BIOS does not allow disabling just one core in Core i7, so a "triple-core" configuration cannot be obtained this way. Fortunately, there is another way to limit the number of cores: through the operating system. The Boot -> Advanced options tab in MSCONFIG lets you specify directly the number of processors used by the system. This is, of course, a more complex (and consequently less reliable) method than disabling cores in the BIOS. Still, we decided to run these tests and then use the results to judge how trustworthy they are. We did get strange results, but not strange enough to conclude that this OS function works incorrectly.
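A quick way to confirm that a core limit has actually taken effect is to ask the operating system how many logical processors it reports. The snippet below is only a minimal sketch of such a sanity check and was not part of our test procedure; the helper name `visible_logical_cpus` is hypothetical.

```python
import os


def visible_logical_cpus():
    """Return the number of logical processors the OS reports (0 if unknown)."""
    count = os.cpu_count()  # may be None when the count cannot be determined
    return count if count is not None else 0


if __name__ == "__main__":
    print("Logical processors visible to the OS:", visible_logical_cpus())
```

With Hyper-Threading enabled, the reported number counts logical processors rather than physical cores.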
Tests
The first diagram is a traditional one. It contains the scores of each "processor" calculated according to our test method (in the form of a graph). The second diagram is a graph in which each curve shows the performance gain from added processor cores for each application in a given software group. We start with a single-core system, whose performance is taken as 100% (so all curves start from the same point). This diagram makes it possible to track the behavior of individual programs, which may be very useful in the scope of this article. Finally, there is a table with test results (for each application separately). Starting from the "2 cores" column, it includes not only absolute test results but also percentage values. What are they? They reflect the performance gain of a given system relative to the previous one. This is very important: relative to the previous one, not to the original single-core system. Thus, 12% in the 3-core column means that the triple-core system is 12% faster than the dual-core system in a given application. The same figure in the 2-core column would mean the dual-core system's advantage over the single-core system.
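The step-by-step percentages can be sketched in a few lines of Python. The exact formula is an assumption on our part: for "higher is better" tests we divide the new result by the previous one, for "lower is better" tests we divide the previous result by the new one, and then round to whole percent. The function name `step_gains` is hypothetical.

```python
def step_gains(results, higher_is_better):
    """Percentage gain of each configuration over the previous one.

    results          -- raw scores ordered by growing core count
    higher_is_better -- True for "up arrow" tests, False for "down arrow" tests
    """
    gains = []
    for previous, current in zip(results, results[1:]):
        # Assumed formula: a speed ratio, inverted for time-like results.
        ratio = current / previous if higher_is_better else previous / current
        gains.append(round((ratio - 1) * 100))
    return gains


# 3ds max (higher is better) and Lightwave (lower is better),
# raw results without Turbo Boost and Hyper-Threading.
print(step_gains([16.36, 16.45, 15.55, 15.42], higher_is_better=True))
print(step_gains([14.52, 12.85, 13.69, 14.29], higher_is_better=False))
```

For these two rows the sketch gives 1, -5, -1 and 13, -6, -4, which matches the percentages in the table.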
We also ran each test twice: with Turbo Boost and Hyper-Threading disabled (this mode should theoretically provide a stricter comparison of Phenom II X4 and Core i7) and with Turbo Boost and Hyper-Threading enabled (this mode is closer to reality).
As usual, for curious readers we publish a link to a Microsoft Excel spreadsheet that contains all test results in the most detailed form. It also includes two additional tabs, "Compare #1" and "Compare #2", to facilitate the analysis. Just like the tables in the article, they compare the four systems in percentage terms. The difference is simple: Compare #1 provides percentage values calculated exactly as in the tables of this article, relative to the previous system. Compare #2 compares all systems with the reference single-core system (as on the second diagram, only in text form).
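The difference between the two tabs is easy to see in code. Below is a minimal sketch of the "Compare #2" style of calculation, where every configuration is expressed as a percentage of the first (reference) one. The function name `percent_of_baseline` and the rounding to whole percent are our assumptions, not the spreadsheet's actual formulas.

```python
def percent_of_baseline(results, higher_is_better):
    """Express every result as a percentage of the first (reference) system."""
    baseline = results[0]
    percents = []
    for value in results:
        # Invert the ratio for time-like tests, where lower results are better.
        ratio = value / baseline if higher_is_better else baseline / value
        percents.append(round(ratio * 100))
    return percents


# Lightwave (lower is better), without Turbo Boost and Hyper-Threading.
print(percent_of_baseline([14.52, 12.85, 13.69, 14.29], higher_is_better=False))
```

For Lightwave this yields 100, 113, 106, 102: the 4-core system is still slightly ahead of the single-core one, even though both of its last two steps relative to the previous system are negative.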
3D visualization
Without Turbo Boost and Hyper-Threading
| | 1 core | 2 cores | 3 cores | 4 cores |
|---|---|---|---|---|
| 3ds max ↑* | 16.36 | 16.45 (1%) | 15.55 (-5%) | 15.42 (-1%) |
| Lightwave ↓ | 14.52 | 12.85 (13%) | 13.69 (-6%) | 14.29 (-4%) |
| Maya ↑ | 3.59 | 4.08 (14%) | 4.01 (-2%) | 3.85 (-4%) |
| SolidWorks ↓ | 54.39 | 52.64 (3%) | 53.31 (-1%) | 55.33 (-4%) |
| Pro/ENGINEER ↓ | 1065 | 998 (7%) | 1029 (-3%) | 1024 (0%) |
| UGS NX ↑ | 2.99 | 2.84 (-5%) | 2.97 (5%) | 3.06 (3%) |
| Group Score ↑ | 132 | 139 (5%) | 136 (-2%) | 134 (-1%) |
* The up arrow (↑) marks tests where higher results are better; the down arrow (↓) marks tests where lower results are better.
Unlike Phenom II X4, Core i7 already shows a slowdown (in UGS NX) when going from one core to two. The situation gets worse with three cores: only two applications slowed down on Phenom II X4, whereas on Core i7 performance suffered in practically all applications. With four cores, Phenom II X4 gained everywhere, while Core i7 merely shortened its list of negatively responding programs by one. So strategically, things look bad. Tactically they are less bad: there are no serious performance drops. However, running Core i7 with 1, 2, 3 and 4 cores without Turbo Boost is a synthetic scenario. Let's see how the same processor performs in its natural mode.
With Turbo Boost and Hyper-Threading
| | 2 cores | 4 cores | 6 cores | 8 cores |
|---|---|---|---|---|
| 3ds max ↑ | 16.93 | 16.74 (-1%) | 15.81 (-6%) | 18.01 (14%) |
| Lightwave ↓ | 13.7 | 12.91 (6%) | 13.68 (-6%) | 12.57 (9%) |
| Maya ↑ | 4.17 | 4.29 (3%) | 3.94 (-8%) | 4.58 (16%) |
| SolidWorks ↓ | 53.81 | 49.2 (9%) | 53.63 (-8%) | 50.74 (6%) |
| Pro/ENGINEER ↓ | 1056 | 1007 (5%) | 1019 (-1%) | 1020 (0%) |
| UGS NX ↑ | 3.14 | 2.96 (-6%) | 3.11 (5%) | 3.05 (-2%) |
| Group Score ↑ | 139 | 143 (3%) | 137 (-4%) | 147 (7%) |
Results seem to be much better: there are fewer red (negative) results. On the other hand, the reds are fewer only thanks to the 8-core column, and elsewhere the situation got even worse: with four logical cores (two physical cores + Hyper-Threading), two applications already respond negatively. Moreover, the 3(6)-core configuration is negative across the board, with a single exception.
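The "red" counts are easy to verify. The sketch below copies the per-application percentages from the two tables above (Group Score excluded) and counts the negative entries in each column; the variable and function names are hypothetical.

```python
# Step gains in percent, copied from the two tables above (Group Score excluded).
gains_without_tb_ht = {
    "3ds max": [1, -5, -1],
    "Lightwave": [13, -6, -4],
    "Maya": [14, -2, -4],
    "SolidWorks": [3, -1, -4],
    "Pro/ENGINEER": [7, -3, 0],
    "UGS NX": [-5, 5, 3],
}
gains_with_tb_ht = {
    "3ds max": [-1, -6, 14],
    "Lightwave": [6, -6, 9],
    "Maya": [3, -8, 16],
    "SolidWorks": [9, -8, 6],
    "Pro/ENGINEER": [5, -1, 0],
    "UGS NX": [-6, 5, -2],
}


def negatives_per_step(gains):
    """Count applications with a negative gain in each column of a table."""
    columns = zip(*gains.values())
    return [sum(1 for gain in column if gain < 0) for column in columns]


print(negatives_per_step(gains_without_tb_ht))  # columns: 2, 3, 4 cores
print(negatives_per_step(gains_with_tb_ht))     # columns: 4, 6, 8 logical cores
```

This gives 1, 5, 4 negative results without Turbo Boost and Hyper-Threading and 2, 5, 1 with them: 10 versus 8 in total, with the whole difference coming from the last column.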
Compared to AMD Phenom II X4
Comparing absolute results would be entirely wrong, so we compare percentage gains, taking the single-core system as 100% in both cases. Phenom II X4 demonstrates a steeper gain and, more importantly, more stable and predictable results: it practically lacks the 3-core drop. On the other hand, Core i7 somehow managed to gain 9-16% with 8 cores (4 physical cores + HT), which Phenom II X4 cannot match: its maximum performance gain with four cores is +5%.
It's time to formulate a first hypothesis about how Core i7 and Phenom II X4 scale as the number of cores grows. From the standpoint of experimental purity, our tests are not perfect: when we change the number of cores, we do not change the size of the shared L3 cache present in both processors. It is worth mentioning here that Intel and AMD processors have different cache architectures. AMD uses an exclusive cache, that is, no data is duplicated across cache levels. Intel uses a non-exclusive cache, that is, data can be duplicated.
With an exclusive cache, the total size of cached data is the sum of the cache sizes of all levels. With inclusive and non-exclusive caches, the total size of cached data equals the size of the largest cache (L3 in Core i7). Thus, when processors are tested the way we do it, by enabling more and more cores in the same processor, the total size of cached data in Phenom II grows linearly with each enabled core, up to the full L2 capacity. In Core i7 it does not grow linearly, and apparently reaches a smaller size. This hypothesis may partly explain the phenomenal scalability of Phenom II X4 versus Core i7. If it is true, we must keep in mind that what we see here is neither an advantage of Phenom II X4 nor a disadvantage of Core i7, but a shortcoming of our tests. Unfortunately, it cannot be eliminated technically.
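The two cache models can be illustrated with a simplified calculation. This is only a sketch of the hypothesis, not a measurement: it ignores L1, treats the non-exclusive case as strictly inclusive, and assumes per-core L2 sizes of 512 KB (Phenom II X4 940) and 256 KB (Core i7 950) with shared L3 sizes of 6 MB and 8 MB. These figures come from the processors' specifications rather than from our tests, and the function name `effective_cache_kb` is hypothetical.

```python
def effective_cache_kb(active_cores, l2_per_core_kb, l3_shared_kb, exclusive):
    """Upper bound on unique cached data under a simplified two-level model.

    exclusive=True  -- no duplication: the shared L3 plus every active core's L2
    exclusive=False -- inclusive model: everything also lives in the largest cache
    """
    if exclusive:
        return l3_shared_kb + active_cores * l2_per_core_kb
    return max(l3_shared_kb, l2_per_core_kb)


for cores in (1, 2, 3, 4):
    # Assumed sizes: Phenom II X4 940 (512 KB L2 per core, 6 MB L3),
    # Core i7 950 (256 KB L2 per core, 8 MB L3).
    phenom = effective_cache_kb(cores, 512, 6144, exclusive=True)
    core_i7 = effective_cache_kb(cores, 256, 8192, exclusive=False)
    print(f"{cores} core(s): Phenom II {phenom} KB, Core i7 {core_i7} KB")
```

Under these assumptions the Phenom II figure rises from 6656 KB to 8192 KB as cores are enabled, while the strictly inclusive Core i7 figure stays at 8192 KB. A real non-exclusive cache would fall somewhere in between, which is why this is an illustration of the argument rather than a proof.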