Four months have passed since we tested 64-bit compilers on the AMD64 platform. Today we continue those tests to see what has changed since then, especially because the 32-bit product from Intel has often outperformed the 64-bit compilers. The main event is the release of a 64-bit compiler from Intel. Of course, it was created primarily for the company's own CPUs supporting the EM64T technology, but it also works fine with AMD64.
We used the following compilers:
We'll briefly introduce our participants for those who are not keeping up with our publications:
A standard compiler for Linux systems, gcc remains the most popular compiler for non-commercial use. We used the compiler version from the SuSE package (to be more exact, from its update), because the newest version at the time of our tests (we tried 3.4.2) did not provide a considerable performance gain in the SPEC CPU2000 tests.
PGI is the first 64-bit commercial compiler for AMD64. The PGI Workstation 5.2 package includes C, C++, Fortran, and Fortran90 compilers (as well as a debugger and a profiler) and supports OpenMP and MPI. Since the first release, it has gone through several versions, and the current one is 5.2-2. Note that new builds of PGI are in fact released almost every day, but unfortunately the developer does not always increase the version number, so you can tell how fresh a release is only by the date of the installation package file. The Fortran compiler from this package is also used in commercial applications. In particular, it was used to compile the 64-bit version of LS-DYNA for the AMD64 platform.
The PathScale EKO Compiler Suite appeared relatively recently: the first version (1.0) was released in spring this year. Interestingly, this product was developed for the AMD64 platform from the very start. The package includes C, C++, and Fortran 77/90/95 compilers. It lacks debuggers and other utilities. The compiler works only in 64-bit versions of Linux (it is claimed to support RedHat, Fedora, and SuSE). Version 1.3 was released in late August. The developer is trying to draw attention to the quality of the product (the speed of the compiled code) by running contests along the lines of "You'll Win if Your Code Runs 10% Faster". Besides, the web site provides multiple test results in various applications (they often mention a "64bit Commercial Compiler" :) ). As this compiler has no separate 32-bit version (though it has an option to compile 32-bit code, which is in fact used in the peak metrics) and the company kindly includes a full config file for SPEC CPU2000 in the package, we additionally obtained peak metrics for it. Note that these results are hors concours, because we use only base metrics for the other compilers.
Intel compilers have always demonstrated high-quality code optimization both in synthetic tests and in real life. The largest processor manufacturer has managed to become a serious competitor to purely software companies. One can of course object that Intel has always known exactly how its own hardware works, but considerable investments in R&D also played their role. We have been testing Intel compilers starting from version 5.0, and each new version demonstrates considerable performance gains of the compiled code.
It is largely thanks to these compilers that Intel Pentium 4 processors demonstrate high results in resource-intensive tasks, provided the developer was not too lazy to use a compiler from Intel :). The wide popularity of SIMD can surely be attributed to them as well.
Interestingly, Intel compilers demonstrate excellent speed on other processors as well :). However, starting from version 8.0 the company introduced a CPU vendor check, although the optimization options (including vectorization and SIMD) for "generic" processors remained. By the way, only recently have 64-bit compilers for AMD64 managed to outscore the 32-bit Intel compiler in SPECfp_base2000, while in SPECint_base2000 it is still the leader.
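To make the idea of a CPU check more concrete, here is a minimal sketch of how a program could pick a code path from the processor's vendor string. It is only an illustration under stated assumptions, not a description of what the Intel compiler or its runtime actually does: the function names `cpu_vendor` and `choose_code_path` and the dispatch policy are hypothetical. The only facts it relies on are the vendor strings `GenuineIntel` and `AuthenticAMD` and the `vendor_id` field that Linux exposes in `/proc/cpuinfo`.

```python
def cpu_vendor(cpuinfo_text):
    """Return the first vendor_id value found in /proc/cpuinfo-style text, or None."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "vendor_id":
            return value.strip()
    return None


def choose_code_path(vendor):
    """Hypothetical policy: vendor-tuned code for Intel, generic code otherwise."""
    return "vendor-tuned" if vendor == "GenuineIntel" else "generic"


# Sample text in the /proc/cpuinfo format; the values are illustrative only.
sample = "processor\t: 0\nvendor_id\t: AuthenticAMD\nmodel name\t: AMD Opteron\n"
print(choose_code_path(cpu_vendor(sample)))  # prints "generic"
```

On a real Linux system the text could be read from `/proc/cpuinfo`; the point is only that the "generic" path still exists and can still be well optimized.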
Everybody has been looking forward to the release of compilers for EM64T, Intel's 64-bit extension of IA32. Since the processors with EM64T were introduced this summer, the company couldn't leave them without software support, and the corresponding compiler was released as early as this autumn. This special version of the compiler can generate code both for Intel CPUs with EM64T (Prescott core, SSE3) and for processors from other companies compatible with the 64-bit mode (but without SSE3 and some fine-tuning options for memory operations). The latter surely means AMD Athlon 64/Opteron :).
We used the following optimization keys in our tests:
As before, these test results should be referred to as "estimated" according to the SPEC terminology, because not all compilers managed to complete the full set of tests (gcc does not have a compiler for Fortran 90, and Intel/EM64T didn't manage to compile 252.eon). However, all other formal requirements have been satisfied.
In SPECint_base2000 all compilers behave in a similar way, except for PGI, whose record-low result in 252.eon prevents it from getting a decent integral mark (whether it is worth "digging deep" into the individual subtests or better to restrict oneself to the overall marks is up to our readers to decide). If you want to find a leader, the integral mark points to the 32-bit Intel compiler (remember that the peak result of Pathscale is hors concours). Its serious competitor is Pathscale: in individual tests the difference varies from -36% to +28%, while its integral mark is lower by just 1.5%.
On the whole, gcc and PGI are not that bad, and in some tasks they demonstrate good speed. Intel/EM64T is (so far?) outscored by its predecessor and for now can be considered only a potentially interesting compiler.
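The effect of one very low subtest on the integral mark can be illustrated with a short sketch. SPEC CPU2000 overall metrics are the geometric mean of the per-test ratios, so a single poor result lowers the whole mark noticeably. The ratios below are hypothetical round numbers chosen for illustration, not our measured results, and the function name `overall_mark` is our own.

```python
import math


def overall_mark(ratios):
    """Geometric mean of per-test ratios (all values must be positive)."""
    if not ratios:
        raise ValueError("need at least one ratio")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical ratios for twelve subtests, NOT measured results.
even = [1000.0] * 12                  # every subtest scores the same
one_low = [1000.0] * 11 + [250.0]     # one subtest is four times slower

print(round(overall_mark(even)))      # prints 1000
print(round(overall_mark(one_low)))   # prints 891
```

In this made-up case a fourfold slowdown in one of twelve subtests costs about 11% of the overall mark, which is more than the gap between the leaders here.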
So, this time the bottom line for CINT2000 will be as follows: if speed is important to you, you should test all the above-mentioned compilers with your application. No doubt one of them will considerably raise the code execution speed of your application.
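Such a comparison is easy to automate. Below is a minimal sketch of a timing harness, assuming you have already built your application with each compiler and have one runnable command line per build. The function names and the build labels are hypothetical, and the demo commands are placeholders that only start the Python interpreter, so substitute your own binaries and arguments.

```python
import subprocess
import sys
import time


def best_time(cmd, runs=3):
    """Return the fastest wall-clock time in seconds over several runs of cmd."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return min(times)


def rank_builds(builds, runs=3):
    """builds maps a label to the command line of one build; fastest comes first."""
    results = {label: best_time(cmd, runs) for label, cmd in builds.items()}
    return sorted(results.items(), key=lambda item: item[1])


# Placeholder commands: replace them with the binaries produced by each compiler.
demo_builds = {
    "build-a": [sys.executable, "-c", "sum(range(10**5))"],
    "build-b": [sys.executable, "-c", "sum(range(10**6))"],
}
for label, seconds in rank_builds(demo_builds, runs=2):
    print(f"{label}: {seconds:.3f} s")
```

Taking the minimum over several runs reduces the influence of background activity; for serious measurements use your real workload and an otherwise idle machine.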
From the compatibility point of view, reliable compilation of twelve different applications (with a single exception) leaves no cause for doubt about the quality of the reviewed products.
In the CFP2000 tests Pathscale is still the leader. Second place is taken by the new product from Intel, which is outscored by its "brother" in only a single test. Its integral mark rose by 7.4%, but that was not enough to take the lead: it is still 3% short of first place.
PGI is 5.6% behind the leader in SPECfp_base2000, but in several tests the results vary from -28% to +21%.
Judging from the results, you shouldn't urgently switch to another compiler. However, as in CINT2000, there is a point in trying other compilers for calculation tasks, the spread in execution speed of different subtests being fortunately rather wide.
Besides the tests on the AMD64 platform, we also managed to take some readings on Intel Xeon/Nocona. We used the same versions of the operating system from SuSE. Note that we installed the initial releases, dated April 2004. Of course, we updated the OS after the installation, and we had no problems with its operability. It should be noted that the computer was not heavily loaded (2 x Intel Xeon 3.0 GHz (Nocona), Supermicro X6DA8-G2 (Intel E7525), 2x512 MB DDR2-400 SDRAM, and a Western Digital WD360 HDD (SATA)), so, to our mind, you shouldn't take these results as proof that "everything is 100% working!", but the fact of compatibility is undoubtedly positive.
This system was used for the SPEC CPU2000 tests with gcc, PGI, and Intel compilers. Unfortunately we had no time to test Pathscale, but we'll try to make up for it in the next material :).
Optimization keys and other settings are similar to those listed above for AMD64 (of course, for IC we used the -xP key instead of -xW). Note that the table does not contain the ic81e.xP results for the 252.eon, 253.perlbmk, 254.gap, and 255.vortex tests. Most likely, there will never be 252.eon results (the test uses the old method for managing streams, which will probably not be supported by new compiler versions), while the other three tests will probably be handled by new releases.
There is practically no point in considering the absolute results in the light of the eternal Intel vs AMD dispute – we did not use the fastest processor and the DDR2 usage is rather a negative point so far.
As you can see in the results, the most preferable compiler for Xeon/Nocona is the one from Intel. It could have been assumed even before the tests, though :). But the fact that one of the first 64-bit versions is quite operable is certainly pleasing.
Note that the code obtained using gcc and PGI was working on the new Intel processor without any shaman rituals. It is very nice and gives hope that other software, already ported to AMD64, will operate on EM64T without any complications.
It's interesting to compare the effect of the 64-bit transition on different platforms. This comparison is certainly conditional, since the choice of processors, platforms, and compiler options is far from unambiguous. That's why we recommend that you hold back far-reaching conclusions and treat these figures as an additional piece of information about 64 vs 32, Intel vs AMD, gcc vs IC, etc. You cannot possibly equalize all the parameters anyway, so these figures are what we have to work with. The following table contains percentage values of the changes caused by the transition from 32-bit to 64-bit software.
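For clarity, the percentage change is computed from the 32-bit and 64-bit scores of the same test on the same machine. The sketch below shows the arithmetic, assuming SPEC-style scores where higher is better; the test names and numbers are hypothetical and are not taken from our results.

```python
def percent_change(score32, score64):
    """Change in percent when moving from the 32-bit score to the 64-bit one."""
    if score32 <= 0:
        raise ValueError("the 32-bit score must be positive")
    return (score64 / score32 - 1.0) * 100.0


# Hypothetical (32-bit, 64-bit) scores, NOT measured results.
scores = {
    "test-a": (800.0, 1000.0),
    "test-b": (1000.0, 900.0),
}
for name, (s32, s64) in scores.items():
    print(f"{name}: {percent_change(s32, s64):+.1f}%")
# prints "test-a: +25.0%" and "test-b: -10.0%"
```

A positive value means the 64-bit build is faster; a negative value means the transition slowed that test down.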
From these figures you can see that the gcc behavior is the same on different processors – considerable gains and drops (if there are any) are almost always demonstrated on both platforms. So the effect of transition to the off-the-shelf 64-bit Linux will not depend on what 64-bit version you choose.
The situation with Intel compilers is more interesting. First of all, note the considerable drop in many CINT2000 tests on both platforms. Let's hope that these issues will be fixed in new compiler versions. The effect is sometimes "a tad more positive" for AMD. As for CFP2000, almost +16% in the integral mark looks quite good. On AMD the effect is worse, but there is nothing to be done here :(. We'll just have to use other compilers.
PGI performed quite well on the Intel processor in the 64-bit mode. Alas, this combination cannot be recommended for calculation tasks. Though it should be noted that the compiler may be "corrected" with the advance of EM64T processors. CFP2000 tests of the product from Portland Group on the AMD processors demonstrated performance gains in most tasks.
The appearance of a new competitor on the market of 64-bit compilers for the AMD64/EM64T platforms has livened up a market that was beginning to stagnate. Of course, when working on the AMD platform, Intel 8.1/EM64T does not unveil the full CPU potential. But this fact does not prevent it from taking second place after Pathscale in SPECfp_base2000 on AMD Athlon 64. It does worse in the SPECint_base2000 tests, where the new product from Intel is unfortunately outscored even by its 32-bit counterpart.
As for the 64-bit version of the processor from Intel, the first tests demonstrated that the existing 64-bit software for AMD64 works fine on the new competing processor. A full set of compilers and their compatibility with AMD64 are particularly pleasing. Thus, porting software to EM64T will most likely come down to testing that the software operates correctly on the new core from Intel.
Kirill Kochetkov (firstname.lastname@example.org),
October 6, 2004