Solving a sparse symmetric set of linear equations
We've finally come to the most "devastating" test for Phenom X4. There are two apparent facts:
1) With the native ACML, Phenom X4 operating at 2.5 GHz is four times slower (!) than Phenom X3 operating at 2.4 GHz.
2) Intel MKL speeds up this test on Phenom X4 by a factor of 3.7.
The third fact is less noticeable: Phenom X4 9850 is slower than Phenom X3 8750 even with the Intel library. In other words, Intel MKL masks the Phenom X4 problem to some extent, but it does not solve it.
The graph only confirms our assumption: this is the third time we have seen ACML lose performance when we switch from three to four cores. The sparse test simply reveals this effect best.
2D representation of Bernstein polynomial approximants
The effect of using Intel MKL is practically non-existent in both cases. It's a very strange graph, as both ACML and Intel MKL demonstrate the same tendency: an abrupt performance gain from the second core, and a gradual performance drop as more cores are added.
Modeling a flexible membrane in 3D
There is practically no performance difference between ACML and Intel MKL when the processors operate in their nominal modes (with all cores enabled). But the graph again shows something strange. Firstly, Phenom X3 and Phenom X4 once more perform differently when they use the same number of cores: Phenom X3 hardly slows down when the second core is enabled (with either library), while Phenom X4 loses noticeable performance in the same situation. Secondly, regardless of the library or the processor, the single-core configurations are still the fastest.
Conclusions
The chart and the graph use the geometric mean of the results obtained in all tests. We noticed two things. Firstly, Intel MKL has a positive effect not only on Phenom X4 but also on Phenom X3: to a much lesser degree, of course, but it is still a 7% performance gain. Secondly, when ACML is replaced with Intel MKL, Phenom X4 speeds up by almost 40% (!). We therefore assume that the ACML shipped with MATLAB is incompatible with the latest AMD processors.
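For readers who want to reproduce this kind of summary from their own runs, here is a minimal sketch of how a geometric mean and a relative gain can be computed. The function names and the scores are hypothetical and are not our measured results; we also assume that each test yields a positive score where higher is better.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def relative_gain_percent(baseline_scores, new_scores):
    """Percent change of the geometric-mean score relative to the baseline."""
    ratio = geometric_mean(new_scores) / geometric_mean(baseline_scores)
    return (ratio - 1.0) * 100.0


# Hypothetical per-test scores (higher is better), NOT the article's data.
acml_scores = [1.0, 2.0, 4.0]
mkl_scores = [2.0, 2.0, 4.0]

if __name__ == "__main__":
    gain = relative_gain_percent(acml_scores, mkl_scores)
    print(f"Geometric-mean gain: {gain:.1f}%")
```

The geometric mean suits this purpose because a single test with a very large speedup (such as the sparse test) cannot dominate the total the way it would in an arithmetic mean.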
We'd also like to mention the strange behavior of both ACML and Intel MKL when we enable the fourth core of Phenom X4 in the FFT (fast Fourier transform) test. Frankly speaking, we have grown used to ACML's eccentricity with Phenom, but seeing the Intel library react the same way to the fourth Phenom core concerns us. However, it would be presumptuous to speculate about what happens inside this library without examining its source code.
P.S. The most obvious idea is to modify our test procedure. But that's actually the last thing we should do. Having read this article, you may replace ACML with Intel MKL at your own risk (we have tested performance only). However, we cannot recommend using Intel software with AMD processors: that's not our field of expertise. Besides, such a combination would raise questions about fair competition between the two manufacturers. A fixed ACML.DLL would be the perfect solution here. Unfortunately, the latest version we have found on AMD's official website (4.1.0) does not work correctly with MATLAB: it crashes when MATLAB's built-in benchmark is launched.