iXBT Labs - Computer Hardware in Detail

SPEC CPU2000. Part 27: Intel Core 2 Extreme X6800, Intel C++/Fortran 9.1 Compilers

November 13, 2006




This may be the last article in our series analyzing various platforms in SPEC CPU2000, because SPEC has announced the long-awaited SPEC CPU2006. Nevertheless, the article still has a right to exist, and it is devoted to the top processor recently presented by Intel: the Intel Core 2 Extreme X6800.

Unfortunately, this processor is not very "extreme" this time (compared to earlier Extreme processors from Intel). Leaving aside minor details (like the unlocked multiplier), the only difference between this processor and the previous "non-extreme" top model, the Intel Core 2 Duo E6700, is a higher clock rate: 2.93 GHz, which is 266 MHz (one multiplier step) higher. The extreme processor does not get a higher FSB clock (333 MHz) to distinguish it from the other Core 2 desktop processors. Thus, this analysis actually comes down to the question "what can an additional 266 MHz give the Conroe core?", to be answered by SPEC CPU2000 tests.

SPEC CPU2000 tasks were compiled with the following compilers:

  • Intel(R) C++ Compiler for 32-bit applications, Version 9.1 Build 20060323Z Package ID: W_CC_P_9.1.022
  • Intel(R) Fortran Compiler for 32-bit applications, Version 9.1 Build 20060323Z Package ID: W_FC_C_9.1.024

In all cases (for every optimization option) we used the same compiler switches, namely two-pass compilation with profile-guided optimization (PGO):

  • PASS1_CFLAGS= -Qipo -O3 -Qprof_gen
  • PASS2_CFLAGS= -Qipo -O3 -Qprof_use
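As an illustration of how such a set of builds could be described, here is a minimal sketch that composes the two PGO flag lines for each optimization variant tested in this article. The helper names are hypothetical, and the way the processor-specific -Qx switch is appended to both passes is an assumption: the article does not show the actual SPEC config file.

```python
# Fixed PGO flags quoted in the article.
BASE_FLAGS = {
    "PASS1_CFLAGS": "-Qipo -O3 -Qprof_gen",
    "PASS2_CFLAGS": "-Qipo -O3 -Qprof_use",
}

# Processor-specific switches compared in the article; None stands for "No Opt.".
VARIANTS = [None, "-QxK", "-QxW", "-QxN", "-QxB", "-QxP", "-QxT"]


def flags_for(variant):
    """Return both pass flag strings for one variant.

    Assumption: the -Qx switch is simply appended to both passes.
    """
    suffix = f" {variant}" if variant else ""
    return {name: value + suffix for name, value in BASE_FLAGS.items()}


def render(variant):
    """Format the flags in the same 'NAME= value' style as the list above."""
    return "\n".join(f"{name}= {value}" for name, value in flags_for(variant).items())


if __name__ == "__main__":
    for variant in VARIANTS:
        print(f"# {variant or 'No Opt.'}")
        print(render(variant))
```

Keeping the PGO flags in one place and varying only the -Qx switch mirrors the article's statement that the switches were the same for every optimization option.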

Intel Core 2 Extreme X6800

As usual, we first analyze SPEC CPU2000 performance in pure form, that is, in absolute values for every available optimization, including the new option for Intel Core 2 processors. We use the usual single-threaded method of running the tests (base metrics).

Task               No Opt.   -QxK   -QxW   -QxN   -QxB   -QxP   -QxT
164.gzip              1618   1797   1808   1795   1798   1802   1803
175.vpr                  -   2143   2273   2263   2261   2322   2242
176.gcc                  -   3333   3360   3375   3378   3391   3385
181.mcf               3961   3669   3672   5084   5113   5093   5102
186.crafty            2337   2356   2707   2714   2690   2677   2704
197.parser            1654   1652   1617   1647   1647   1655   1653
252.eon               2965   3255   3744   3775   3566   3766      -
253.perlbmk           3281   3260   3227   3307   3325   3261   3302
254.gap               2869   2869   3054   3066   3048   3052   3050
255.vortex            4776   4767   4806   4771   4801   4856   4872
256.bzip2             2472   2349   2317   2340   2338   2332   2341
300.twolf             2622   3090   3197   3175   3175   3334   3342
SPECint_base2000      2711   2757   2854   2943   2930   2960   2895

The SPECint tests gave us a surprise: the non-optimized 175.vpr and 176.gcc builds, which usually fail to run, are now joined by 252.eon compiled with -QxT, the option specific to Core 2 processors. This task did not behave like that on the Core 2 Duo E6700 that took part in our previous analysis.

Here is how the optimizations rank by SPECint_base2000: No Opt. < -QxK < -QxW < -QxT < -QxB < -QxN < -QxP. Compared to the Core 2 Duo E6700 results, -QxT moved down in this ranking and settled between -QxW and -QxB. This is probably because 252.eon dropped out of the -QxT run, which lowered the overall score (this task contributes a lot to it). Comparing individual results of integer tasks shows that -QxT (the native optimization for Core 2 processors) is no worse than, or even better than, the absolute leader in most of them. The leader is -QxP, the optimization for Intel Pentium 4/D, Core Solo/Duo, and compatible Intel processors with SSE3 support.
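To see why a missing 252.eon result pulls the overall score down, recall that SPECint_base2000 is the geometric mean of the per-task scores. The sketch below recomputes that mean from the -QxP and -QxT columns of the SPECint table above; the function and variable names are illustrative only, and skipping a missing task is just one way to model the incomplete -QxT run.

```python
import math

TASKS = ["164.gzip", "175.vpr", "176.gcc", "181.mcf", "186.crafty", "197.parser",
         "252.eon", "253.perlbmk", "254.gap", "255.vortex", "256.bzip2", "300.twolf"]

# Per-task base scores copied from the SPECint table above.
QXP = [1802, 2322, 3391, 5093, 2677, 1655, 3766, 3261, 3052, 4856, 2332, 3334]
# 252.eon did not run with -QxT, so its slot is None.
QXT = [1803, 2242, 3385, 5102, 2704, 1653, None, 3302, 3050, 4872, 2341, 3342]


def geomean(scores):
    """Geometric mean of the scores that are present (None entries are skipped)."""
    valid = [s for s in scores if s is not None]
    return math.exp(sum(math.log(s) for s in valid) / len(valid))


def without(task, scores):
    """Copy of a score column with one task blanked out."""
    return [None if name == task else s for name, s in zip(TASKS, scores)]


if __name__ == "__main__":
    print(f"-QxP, all 12 tasks:     {geomean(QXP):.0f}")
    print(f"-QxP, without 252.eon:  {geomean(without('252.eon', QXP)):.0f}")
    print(f"-QxT, without 252.eon:  {geomean(QXT):.0f}")
```

Because the 252.eon score is well above the mean, dropping it lowers the -QxP average noticeably, which is consistent with the explanation given for the lower -QxT total.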

Task               No Opt.   -QxK   -QxW   -QxN   -QxB   -QxP   -QxT
168.wupwise           3838   3660   3943   4527   4283   4604   4598
171.swim              2625   2999   3071   3071   3070   3066   3047
172.mgrid             1431   1783   1857   1868   1834   1870   1868
173.applu             1566   1662   1697   2224   2103   2234   2230
177.mesa              1925   2699   2833   2827   2479   2847   2678
178.galgel            2748   5092   6402   7057   6363   7076   7068
179.art               8242   9153   9301   9286   9229   8438   8442
183.equake            2723   2680   2728   2714   2693   3094   3089
187.facerec           2399   2991   3005   2985   2968   3038   3032
188.ammp              1944   1949   2098   2079   1998   2095   1991
189.lucas             2537   2495   2941   2887   2535   2898   2891
191.fma3d             1725   1727   2187   2208   1924   2185   2214
200.sixtrack           769    746   1163   1144    726   1136   1159
301.apsi              1713   1741   1834   1846   1840   1853   1833
SPECfp_base2000       2223   2493   2764   2858   2633   2875   2853

The floating-point SPECfp 2000 tasks offered no surprises. The optimizations rank as follows by SPECfp_base2000: No Opt. < -QxK < -QxB < -QxW < -QxT < -QxN < -QxP, the same order as on the Core 2 Duo E6700.
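The two rankings quoted in this section can be reproduced directly from the summary rows of the tables. A minimal sketch follows; the dictionary and function names are made up for illustration, and the numbers are the SPECint_base2000 and SPECfp_base2000 rows as published above.

```python
SPECINT_BASE2000 = {"No Opt.": 2711, "-QxK": 2757, "-QxW": 2854, "-QxN": 2943,
                    "-QxB": 2930, "-QxP": 2960, "-QxT": 2895}
SPECFP_BASE2000 = {"No Opt.": 2223, "-QxK": 2493, "-QxW": 2764, "-QxN": 2858,
                   "-QxB": 2633, "-QxP": 2875, "-QxT": 2853}


def ranking(scores):
    """Option names sorted from the lowest to the highest overall score."""
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    print("SPECint:", " < ".join(ranking(SPECINT_BASE2000)))
    print("SPECfp: ", " < ".join(ranking(SPECFP_BASE2000)))
```

Note that the -QxT entry in the integer row is the incomplete result without 252.eon, so its position in the integer ranking should be read with that caveat.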

Comparison with Intel Core 2 Duo E6700

Let's proceed to the next stage of our analysis: comparing the results with the previous leader, the Core 2 Duo E6700. Remember that we are actually comparing the same Conroe cores at different clock rates, 2.93 GHz and 2.66 GHz respectively. The core revisions also differ: the Core 2 Duo E6700 was an engineering sample with the earlier B0 revision, versus B1 here.

SPECint 2000. The Core 2 Extreme X6800 outperforms the Core 2 Duo E6700 in all integer tasks, though by varying amounts. The smallest gain is in 181.mcf (which, according to our previous tests, is sensitive to memory bandwidth): just 0.2% in non-optimized code and 4.7%-4.9% with optimizations for modern processors. The largest gain is in 256.bzip2, reaching 13.9% with -QxB. Note that the maximum gain expected from the core clock alone is 2.93/2.66 = 1.10 times, that is, approximately 10%. The excess may be the effect of some changes in the newer B1 core revision, or it may be a measurement error. In any case, the overall gain in integer tasks (SPECint_base2000) amounts to 8.5%-9.6% (not counting the invalid -QxT result, which lacks 252.eon), so it falls within the 10% dictated by the clock difference.
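The comparison against the clock-rate ceiling can be written out as a short calculation. The sketch below uses only the clock rates and percentage gains quoted in this section; the function names and the "share of clock gain" ratio are illustrative, not a metric used by SPEC.

```python
X6800_GHZ = 2.93
E6700_GHZ = 2.66


def clock_gain(new_ghz, old_ghz):
    """Relative gain expected if performance scaled perfectly with core clock."""
    return new_ghz / old_ghz - 1.0


def share_of_clock_gain(observed_gain, expected_gain):
    """Observed gain as a fraction of the ideal gain; above 1.0 exceeds pure clock scaling."""
    return observed_gain / expected_gain


EXPECTED = clock_gain(X6800_GHZ, E6700_GHZ)

if __name__ == "__main__":
    print(f"expected from clock alone: {EXPECTED:.1%}")
    # 8.5% and 9.6% bound the SPECint_base2000 gain; 13.9% is 256.bzip2 with -QxB.
    for observed in (0.085, 0.096, 0.139):
        print(f"{observed:.1%} observed -> {share_of_clock_gain(observed, EXPECTED):.2f} of clock gain")
```

A ratio above 1.0, as for 256.bzip2, is what prompts the remark about a possible core revision effect or measurement error.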

As usual, the floating-point tests paint a less homogeneous picture. We see a stable drop in 171.swim performance across all optimizations (from -4.8% to -6.0%), which is rather difficult to explain (perhaps it has to do with the differences in core revisions, this time not to the credit of the newer B1), and a large spread in values; for example, the gain in 178.galgel ranges from 9.0% to 16.3%. There are also a number of tasks that gain little on the new extreme processor, for example 173.applu, 183.equake, and 189.lucas. Strange as it may seem, the gains in the average SPECfp_base2000 score fall within a narrow interval, 5.5%-6.0%, which is smaller than the gains in integer tasks. This quite possibly has to do with the greater memory bandwidth demands of the floating-point SPEC CPU2000 tasks; memory bandwidth is identical on both test platforms (the peak bandwidth of dual-channel DDR2-800 is in practice limited by the throughput of the 266 MHz FSB to 8.53 GB/sec).
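The 8.53 GB/sec figure can be checked with simple arithmetic. The sketch below assumes a quad-pumped, 64-bit front-side bus and 64-bit DDR2 channels; these widths are general properties of the platform rather than something stated in this article, and the function names are illustrative.

```python
def fsb_bandwidth_gb_s(base_clock_mhz, transfers_per_clock=4, bus_width_bytes=8):
    """Peak FSB throughput in GB/s (assumes a quad-pumped, 64-bit bus)."""
    return base_clock_mhz * 1e6 * transfers_per_clock * bus_width_bytes / 1e9


def ddr2_bandwidth_gb_s(mega_transfers_per_s, channels=2, bus_width_bytes=8):
    """Peak memory throughput in GB/s (assumes 64-bit channels)."""
    return mega_transfers_per_s * 1e6 * channels * bus_width_bytes / 1e9


# The nominal "266 MHz" base clock is 266.67 MHz (800/3).
FSB_GB_S = fsb_bandwidth_gb_s(800 / 3)
MEM_GB_S = ddr2_bandwidth_gb_s(800)  # dual-channel DDR2-800

if __name__ == "__main__":
    print(f"FSB peak:      {FSB_GB_S:.2f} GB/s")
    print(f"DDR2-800 peak: {MEM_GB_S:.2f} GB/s")
    print(f"effective cap: {min(FSB_GB_S, MEM_GB_S):.2f} GB/s")
```

Under these assumptions the bus, not the memory, sets the ceiling, and that ceiling does not change between the two processors.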

Efficiency of dual cores

And finally, by analogy with our previous analyses of dual-core processors, let's evaluate the efficiency of running two SPEC CPU2000 instances, using the rate metrics. The single-instance result in the same metric is taken as the reference point.

The efficiency of running two instances of integer tasks is very high in practically all cases except 181.mcf. According to our previous results, this task shows poor "parallel" efficiency on other dual-core processors as well, including the Intel Pentium Extreme Edition, Intel Core Duo, and Intel Core 2 Duo. We have already assumed that this low efficiency comes from the reduction of L2 cache available to each core (in this case, from 4 MB to 2 MB), while the task is demanding of cache/memory bandwidth. Judging by this task, by all the other SPECint 2000 tasks, and by the average SPECint_rate2000, the efficiency of running two instances on the Core 2 Extreme X6800 is a tad lower than on the Core 2 Duo E6700. For example, according to the average results, the gain from running two instances is 76-78% on the Core 2 Extreme, while it was 78-82% on the Core 2 Duo.

The general picture for two instances of floating-point tasks versus a single instance on the Core 2 Extreme X6800 looks qualitatively the same as on the Core 2 Duo E6700. As in the integer tests, the differences are quantitative, and the Core 2 Extreme loses again. According to the average SPECfp_rate2000 results, the gain from running two instances amounts to 47-56%, which is a tad lower than on the Core 2 Duo (54-63%).
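For reference, the two-instance gain quoted in this section is simply the ratio of the two rate results minus one. The sketch below shows that calculation; the function names are illustrative, and the sample rate values are hypothetical placeholders, not measurements from this article.

```python
def two_instance_gain(rate_one, rate_two):
    """Relative throughput gain of two instances over a single instance."""
    return rate_two / rate_one - 1.0


def parallel_efficiency(rate_one, rate_two, instances=2):
    """Fraction of ideal linear scaling that was achieved."""
    return rate_two / (instances * rate_one)


if __name__ == "__main__":
    # Hypothetical rate scores chosen only to illustrate the formula.
    rate_one, rate_two = 30.0, 53.1
    print(f"gain:       {two_instance_gain(rate_one, rate_two):.0%}")
    print(f"efficiency: {parallel_efficiency(rate_one, rate_two):.1%}")
```

A gain of 100% would mean the second core doubles throughput; the 76-78% and 47-56% averages reported above show how far shared resources such as L2 cache and memory bandwidth hold that back.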

Conclusion

The results obtained in this article are quite natural. Raising the Conroe core's clock frequency from 2.66 GHz to 2.93 GHz (approximately 10%) generally yields a roughly proportional performance gain in SPEC CPU2000: from 8.5% to 9.6% for integer tasks and from 5.5% to 6.0% for floating-point tasks, which depend more on memory bandwidth than on CPU clock. At the same time, the efficiency of running tasks in parallel on the higher-clocked extreme version of the Conroe core is a tad lower than on the previously reviewed "non-extreme" processor (a lower-clocked, earlier revision of the core). The gain from running two instances amounts to 76-78% for integer tasks and 47-56% for floating-point tasks.



Dmitri Besedin (dmitri_b@ixbt.com)
November 13, 2006


Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.