How CPU Features Affect CPU Performance
As promised, we continue our series of articles on how various characteristics affect the speed of processors based on modern architectures. In the previous article we analyzed how CPU performance depends on the number of cores. This article deals with memory parameters, both those determined by the integrated memory controller in AMD Phenom II and those that depend on the memory modules. This is also where we say good-bye to Phenom II: we have found no other significant characteristics in this processor, and, most importantly, none that can be analyzed with performance tests.
Testbed configurations
We used the same testbed as in the previous article and also reused some "generic" results from it: those of the Phenom II X4 940 Black Edition in the nominal mode, with all its cores enabled and with the testbed's standard memory, 6 GB (2 x (2+1)) of DDR2-800 at 4-4-4-12.
Tests
We used our standard test procedure (v4.0, 2009) and examined three possible scenarios:
- Drastically degraded memory performance. To simulate this situation, we simply halve the operating frequency of our DDR2-800 memory modules. Memory bandwidth is thus halved, while latency doubles (at the same timings). This "rough" comparison has little to do with reality, but in this case we actually wanted to see the maximum effect, especially since our previous attempt to observe a performance drop from reduced memory bandwidth suggested that bandwidth has to be cut drastically to produce even a small effect.
- Switching between the modes of the memory controller integrated into Phenom II. There are only two of them: ganged and unganged*. Some of you may remember that we have already analyzed this performance aspect. However, we did it with the first Phenom, whose clock rate was lower, and applications have been updated since then, so we decided that it wouldn't hurt to repeat our tests. This situation is quite realistic, especially if you like to mess with BIOS options.
* Note that the Phenom II memory controller has another mode -- single-channel -- in which all memory modules are connected to one channel of the controller while the second channel is idle. However, as our article has a practical bias, we didn't run our tests in the single-channel mode, because users would hardly choose this "inconvenient", artificially limited mode.
- Switching to "worse" (higher) timings. This matters to everyone shopping for memory modules: lowering a parameter such as CAS latency by one can cost significantly more, and people want to know whether it's worth it. The reason we chose these particular timings (6-6-6-18 versus the nominal 4-4-4-12) is very simple: these were the worst timings allowed by the BIOS.
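To put rough numbers on the three scenarios, here is a minimal sketch that converts a DDR2 data rate and CAS latency into latency in nanoseconds and theoretical peak bandwidth. It assumes the usual 64-bit (8-byte) memory channel and two channels; the function name `ddr_figures` and its parameters are ours, not part of the test procedure, and the bandwidth is a theoretical peak rather than a measured value.

```python
def ddr_figures(data_rate_mts, cas_cycles, bus_bytes=8, channels=2):
    """Return (CAS latency in ns, theoretical peak bandwidth in MB/s).

    Assumptions (not from the article): 64-bit channels, dual-channel setup.
    """
    clock_mhz = data_rate_mts / 2       # DDR: two transfers per clock cycle
    cycle_ns = 1000.0 / clock_mhz       # duration of one memory clock cycle
    cas_ns = cas_cycles * cycle_ns      # CAS latency expressed in time
    peak_mb_s = data_rate_mts * bus_bytes * channels
    return cas_ns, peak_mb_s


if __name__ == "__main__":
    scenarios = [
        ("DDR2-800 4-4-4-12", 800, 4),
        ("DDR2-800 6-6-6-18", 800, 6),
        ("DDR2-400 4-4-4-12", 400, 4),
    ]
    for label, rate, cas in scenarios:
        cas_ns, peak = ddr_figures(rate, cas)
        print(f"{label}: CAS {cas_ns:.1f} ns, peak {peak} MB/s")
```

Under these assumptions, halving the frequency doubles CAS latency from 10 ns to 20 ns and halves peak bandwidth, while the 6-6-6-18 timings raise CAS latency to 15 ns and leave bandwidth unchanged.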
As always, detailed results of all tests are published in this Microsoft Excel spreadsheet. In addition, the diagrams in this article (just like in the previous one) are supplemented with tables containing detailed results for individual applications and percentage values relative to the reference system. The reference system is, of course, our testbed operating in the nominal mode; its results are given in the second column. We highlighted the most illustrative results for readers who don't like to analyze tables full of numbers.
3D visualization
| Test | DDR2-800 4-4-4-12 unganged | DDR2-800 4-4-4-12 ganged | vs. ref. | DDR2-800 6-6-6-18 unganged | vs. ref. | DDR2-400 4-4-4-12 unganged | vs. ref. |
|---|---|---|---|---|---|---|---|
| 3ds max ↑* | 12.83 | 11.91 | -7% | 12.81 | 0% | 12.32 | -4% |
| Maya ↑ | 3.18 | 3.26 | +3% | 3.17 | 0% | 2.4 | -25% |
| Lightwave ↓ | 18.94 | 19.32 | -2% | 19.5 | -3% | 21.04 | -10% |
| SolidWorks ↓ | 61.99 | 62.57 | -1% | 62.56 | -1% | 65.75 | -6% |
| Pro/ENGINEER ↓ | 1223 | 1209 | +1% | 1240 | -1% | 1467 | -17% |
| UGS NX ↑ | 2.17 | 2.23 | +3% | 2.12 | -2% | 1.84 | -15% |
* The up arrow (↑) marks tests where higher results are better; the down arrow (↓) marks tests where lower results are better.
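The article does not spell out how the percentage columns are derived, so the following is only a sketch under an assumption: the score is the ratio to the reference result for ↑ tests and the inverse ratio for ↓ tests, rounded to a whole percent. The function name `relative_percent` is ours. The sample values are taken from the 3D visualization table.

```python
def relative_percent(reference, result, higher_is_better):
    """Percentage difference from the reference system (assumed formula).

    For "higher is better" tests the ratio is result / reference;
    for "lower is better" tests it is inverted, so that a slower
    (larger) result still yields a negative percentage.
    """
    if higher_is_better:
        ratio = result / reference
    else:
        ratio = reference / result
    return round((ratio - 1.0) * 100)


if __name__ == "__main__":
    # (test, reference, result, higher_is_better) from the table above
    samples = [
        ("3ds max, ganged", 12.83, 11.91, True),       # -7
        ("Maya, DDR2-400", 3.18, 2.4, True),           # -25
        ("Lightwave, DDR2-400", 18.94, 21.04, False),  # -10
        ("Pro/ENGINEER, DDR2-400", 1223, 1467, False), # -17
    ]
    for name, ref, res, up in samples:
        print(f"{name}: {relative_percent(ref, res, up):+d}%")
```

For these rows the assumed formula gives the same figures as the table, which suggests (but does not prove) that the percentages were computed this way.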
We will treat a difference of 1-2% as measurement error, at least when we have no reason to think otherwise. The simplest case is increased latency: practically no application reacted to it, and even the -3% in Lightwave does not look very convincing: "Is that it? Is that all low-latency memory can offer?" Switching between controller modes is more interesting: 3ds max responded negatively to the ganged mode. The unganged mode is traditionally considered preferable for multi-threaded applications and for several programs running simultaneously, while the ganged mode is considered better for single-threaded applications that use almost all resources on their own. However, the overall diagram shows that the ganged mode generally has the same effect as high timings: performance drops by 1 point relative to the reference system -- less than 1%.
Halving memory bandwidth left no application unaffected. One of them stands out from the rest, however: Maya, whose visualization performance dropped by a quarter (!). As we are going to see below, it's one of the most noticeable results in all our tests. The situation with 3ds max is curious: its performance drop from the halved memory bandwidth is smaller than its drop from the ganged mode. The average performance drop is about 15%, which is far from insignificant.
3D rendering
| Test | DDR2-800 4-4-4-12 unganged | DDR2-800 4-4-4-12 ganged | vs. ref. | DDR2-800 6-6-6-18 unganged | vs. ref. | DDR2-400 4-4-4-12 unganged | vs. ref. |
|---|---|---|---|---|---|---|---|
| 3ds max ↑ | 12.89 | 12.89 | 0% | 12.8 | -1% | 12.19 | -5% |
| Maya ↓ | 0:03:01 | 0:02:58 | +2% | 0:03:02 | -1% | 0:03:02 | -1% |
| Lightwave ↓ | 109.21 | 108.79 | 0% | 110.64 | -1% | 111.31 | -2% |
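The Maya render results are durations in h:mm:ss form, so a shorter time is the better result. As a small sketch (the helper names `to_seconds` and `time_score_percent` are ours, and the inverse-ratio formula is an assumption rather than the documented test procedure), the times can be converted to seconds and compared with the reference like this:

```python
def to_seconds(hms):
    """Convert an 'h:mm:ss' string such as '0:03:01' to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def time_score_percent(reference_hms, result_hms):
    """Percentage vs. the reference for a timed test (lower is better).

    Assumed formula: reference time divided by measured time.
    """
    ratio = to_seconds(reference_hms) / to_seconds(result_hms)
    return round((ratio - 1.0) * 100)


if __name__ == "__main__":
    print(time_score_percent("0:03:01", "0:02:58"))  # ganged: 2
    print(time_score_percent("0:03:01", "0:03:02"))  # 6-6-6-18 and DDR2-400: -1
```

A three-second improvement on a three-minute render is within the 1-2% range, which is why such differences are hard to tell apart from measurement error.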
As we have mentioned many times, rendering is sensitive to few parameters other than the number and computing power of processor cores. Even a modest cache is enough to reach acceptable performance (unless it's really tiny, of course). It was reasonable to expect render engines to depend even less on memory, and we were right. Interestingly, 3ds max rendering does not respond to the ganged mode, even though rendering can use all four cores better than the interactive part does. Later on we'll come across situations where the official positioning of the ganged and unganged modes contradicts the real results.