This time we will examine how the SPEC CPU2000 test runs on different Intel Pentium 4 platforms. We used only Intel chipsets (three of them), because adding VIA and SiS solutions would make the test too complicated. We are also going to focus on the peculiarities of the SPEC CPU2000 applications so that we can examine the performance of different systems.
A processor with an unlocked multiplier lets us find out how the SPEC CPU2000 results depend on the processor frequency. We will also raise the FSB frequency from 100 to 110 and 120 MHz to find out the impact of the memory. Although the FSB speed affects more than just the memory, we have no other way to examine this dependence. It makes no sense to adjust latency, as we don't know exactly how the applications work with the memory, and the tests didn't reveal any applications that perform better with a lower latency. So, when we speak of memory speed, we mean bandwidth rather than any other parameter.
For the processor to remain stable, a higher FSB frequency must be accompanied by a lower CPU multiplier, which is why the tests were carried out at 10x. Maybe that is not quite fair to a processor that can currently run at over 2.0 GHz, but overclocking is not the right exercise either :). Besides, the boards differ in their capabilities, and the 10x110 and 10x120 combinations were available on all of them and worked flawlessly.
Test system configurations:
The other components (Seagate Barracuda ATA III HDD, 40 GBytes and GeForce 2 Pro video card) didn't affect the results (we also tried the IBM DPTA drive and the ATi RagePro PCI video card). As for the memory size, in Part 2 we saw that it was enough for testing uni-processor systems.
SPEC CPU2000 configuration:
A single run on the fastest configuration (Pentium 4 2.0 GHz, RDRAM) takes more than 2 hours, and testing each system three times might take more than a week. But our aim is to find out the peculiarities of SPEC CPU2000 on different systems, which is why we ran each test only once, and the results obtained are not official. Good repeatability of the results justifies this approach - with the official three-run procedure the difference was 1% for CINT2000 and 0.5% for CFP2000. The Intel vs AMD contest, however, will be tested in full.
Let me remind you once again that the SPEC CPU2000 test is synthetic and meant to estimate the overall performance of systems on a wide range of applications.
i850 chipset and RDRAM
Here we are going to carry out the largest number of tests - we will find out how the results depend on the speed of the CPU and of the memory.
So, let's start with the CPU frequency. The system ran at 1.0 to 2.0 GHz. The Y-axis shows SPECint_base2000 and SPECfp_base2000, which express the operating speed in % relative to the base system. The more the better:
Nothing unexpected: most of the tests speed up as the processor runs faster. Only 171.swim and 179.art show almost no gain. Below you can see the gain (%) obtained for a 10% increase in clock speed:
Note that the applications work with large volumes of data that do not fit into the CPU cache, which is why the gain falls as the multiplier increases.
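The "gain per 10% growth of the clock speed" figure can be sketched as follows. This is only an illustration: it assumes the gain is scaled linearly to a 10% frequency step, which the article does not state explicitly, and the function name and the scores are hypothetical rather than measured values.

```python
def gain_per_10_percent(score_low, score_high, freq_low, freq_high):
    """Score gain (%) scaled to a 10% increase in frequency.

    Assumption: the gain is scaled linearly with the relative frequency step.
    """
    score_gain = (score_high / score_low - 1.0) * 100.0  # gain in %
    freq_gain = (freq_high / freq_low - 1.0) * 100.0      # frequency step in %
    return score_gain / freq_gain * 10.0


# Hypothetical scores at 1000 and 1200 MHz, not taken from the diagrams.
cpu_bound = gain_per_10_percent(500.0, 600.0, 1000.0, 1200.0)
memory_bound = gain_per_10_percent(500.0, 510.0, 1000.0, 1200.0)

print(f"CPU-bound test:    {cpu_bound:.1f}% per 10% of clock")
print(f"memory-bound test: {memory_bound:.1f}% per 10% of clock")
```

A test that scales perfectly with the clock would show a value close to 10, while a test limited by the memory would stay close to 0.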
Tests such as 164.gzip, 186.crafty, 252.eon, 177.mesa and 200.sixtrack benefit most from the frequency growth. Other applications (171.swim, 179.art and 181.mcf) hardly depend on the CPU speed at all; here the bottleneck is the memory subsystem. In the previous part we found out that these applications have different memory requirements - on average, 190 MBytes for 171.swim and 4 MBytes for 179.art. Note also that 164.gzip and 256.bzip2 behave differently although both are compression tests.
Now let's accelerate the system by raising the FSB frequency while keeping the CPU frequency the same. Our system turned out to be stable at 110 and 120 MHz. This approach is not pure memory acceleration, because many other devices are involved, but modern synchronous chipsets offer no way to speed up only the memory. The Y-axis shows growth (%) per 10% frequency increase:
The applications that were highly dependent on the processor's speed are not very sensitive to the memory's speed, and vice versa. The negative results of 164.gzip and 252.eon are due to the inaccuracy of the clock generator. For example, if the nominal 100 MHz is in fact 100.5 MHz and the nominal 110 MHz is actually 109.5 MHz, the processor clock at 10x110 is about 1% lower than at 11x100.
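The clock generator argument is easy to check with a few lines of arithmetic. The sketch below assumes a hypothetical error of 0.5 MHz on each nominal FSB setting; the function names are illustrative. It shows that two nominally identical 1100 MHz configurations can end up about 1% apart, and that the sign of the difference depends on the direction of the errors.

```python
def cpu_clock_mhz(multiplier, actual_fsb_mhz):
    """Real CPU clock for a given multiplier and real FSB frequency."""
    return multiplier * actual_fsb_mhz


def relative_difference_percent(clock_a, clock_b):
    """How far clock_a is from clock_b, in percent of clock_b."""
    return (clock_a - clock_b) / clock_b * 100.0


# Hypothetical case 1: nominal 100 MHz runs high, nominal 110 MHz runs low.
slow_case = relative_difference_percent(
    cpu_clock_mhz(10, 109.5), cpu_clock_mhz(11, 100.5)
)

# Hypothetical case 2: the opposite deviations.
fast_case = relative_difference_percent(
    cpu_clock_mhz(10, 110.5), cpu_clock_mhz(11, 99.5)
)

print(f"10x110 vs 11x100, case 1: {slow_case:+.2f}%")
print(f"10x110 vs 11x100, case 2: {fast_case:+.2f}%")
```

In the first case the 10x110 configuration really runs slower than 11x100, which is enough to turn a near-zero gain of a CPU-bound test into a small negative one.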
These diagrams complement the previous pair - if we add the corresponding columns together, we get a row of roughly equal height, close to 10. Let's check it by comparing 10x100 with 10x110 and 10x100 with 10x120:
You may object that increase of the FSB accelerates almost all PC components, but the i845D chipset is in the same situation where AGP/PCI buses can't be synchronized from a separate 66 MHz power supply unit even when the FSB speeds up. That is why the test results do not depend on anything but a processor, a chipset and memory.
Well, the Pentium 4 + RDRAM architecture is really well balanced (from SPEC CPU2000's standpoint), because many of the tests run faster as the processor speeds up (and, therefore, the memory doesn't limit the performance much).
So, the first attempts to run SPEC CPU2000 on one of the most efficient systems allow me to draw the following conclusions:
i845D chipset and DDR SDRAM
Of course, the i850 chipset with RDRAM is the optimal solution for the Pentium 4, but DDR SDRAM platforms for it are also very popular today. We will omit some diagrams of SPEC CPU2000 results vs. processor and memory speeds, as they are very similar to those of the i850/RDRAM (just take into account the lower speed of the DDR SDRAM).
Instead, we will compare the result of CPU acceleration for the RDRAM and DDR SDRAM based systems. The Y-axis indicates growth (%) per 10% frequency increase of the processor:
As expected, the more sensitive an application is to the memory, the smaller the effect of speeding up the CPU. And the DDR SDRAM based system gets lower scores. The i845D will gain more than the i850 from memory overclocking, as the memory is a bottleneck for the DDR chipset.
Now let's compare the SPEC CPU2000 results of the i850 and i845D systems. I assume the i850 will perform better, especially in the tests that depend heavily on the memory speed, such as 171.swim, 179.art and 181.mcf, while the results of 164.gzip, 186.crafty, 252.eon, 253.perlbmk, 177.mesa and 200.sixtrack won't differ much.
The diagrams show the gap between the DDR SDRAM and RDRAM based systems (%):
I was right. In some applications DDR SDRAM does not worsen the results, while in others the performance drop can be as high as 25% for integer arithmetic and 35% for floating-point arithmetic. Note that 35% can also be obtained from the formula (3.2-2.1)/3.2*100, which shows how far DDR SDRAM falls behind RDRAM in throughput; that is why the 171.swim and 179.art tests can be used to estimate the performance of the memory subsystem in different PCs.
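The formula above can be written as a small helper. This is a minimal sketch: the function and constant names are illustrative, and the bandwidth figures are assumed to be peak values in GBytes/s (the text gives the numbers without units).

```python
def bandwidth_gap_percent(fast_gb_s, slow_gb_s):
    """How far the slower memory falls behind the faster one, in percent."""
    return (fast_gb_s - slow_gb_s) / fast_gb_s * 100.0


# Peak bandwidth figures used in the text (units assumed to be GBytes/s).
RDRAM_GB_S = 3.2
DDR_SDRAM_GB_S = 2.1

ddr_gap = bandwidth_gap_percent(RDRAM_GB_S, DDR_SDRAM_GB_S)
print(f"DDR SDRAM vs RDRAM: {ddr_gap:.1f}% lower peak bandwidth")
```

The same helper works for any pair of memory types; a test whose slowdown tracks this number is limited almost entirely by memory bandwidth.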
Nevertheless, the overall scores are not bad: the CINT2000 and CFP2000 results of the i845D system were 7% and 15% lower, respectively, than those of the PC with RDRAM.
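One reason the overall drop is much smaller than the worst per-test drop is that SPEC CPU2000 composite scores are geometric means of the per-benchmark ratios, so a few memory-bound tests are averaged with many that barely change. The sketch below uses made-up per-test ratios (slow system relative to fast system), not the measured results.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical relative speeds of five tests; not measured data.
relative_speeds = [1.00, 1.00, 0.95, 0.90, 0.65]

overall = geometric_mean(relative_speeds)
overall_drop = (1.0 - overall) * 100.0
worst_drop = (1.0 - min(relative_speeds)) * 100.0

print(f"worst single-test drop: {worst_drop:.0f}%")
print(f"overall drop:           {overall_drop:.1f}%")
```

Because the ratio of two geometric means equals the geometric mean of the per-test ratios, the same calculation applies directly to the composite scores of two systems.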
i845 chipset and SDRAM
Frankly speaking, after the i845D I didn't want to examine the configuration with SDRAM. First of all, it was clear that its results would be much lower than those of the i850 with RDRAM; secondly, using SDRAM with the Pentium 4 makes sense only if you already have a lot of memory of this type, as the different types do not differ much in price today. Nevertheless, the tests were carried out.
Let me compare the effect of memory overclocking for the i845 and i845D. As PC133 SDRAM has half the bandwidth of PC2100 DDR SDRAM and is the main bottleneck, overclocking it should be more beneficial:
Well, it is. And the effect is more considerable in the CFP2000 applications.
Now let me compare performance of the systems with PC133 SDRAM and RDRAM. The diagrams show the gap of the i845/SDRAM (%):
Now the difference is greater than between the i845D and i850. More applications now notice that the memory has changed, and only three do not care - 186.crafty, 252.eon and 200.sixtrack. And again, 171.swim and 179.art can measure pure memory speed - (3.2-1.06)/3.2*100 ~ 67%, which is close to the gap between the i845+PC133 and i850+PC800.
So, today we examined three different platforms for the Intel Pentium 4 in the SPEC CPU2000 test. Unfortunately, the test is synthetic, and you should be careful when generalizing its results to real applications. Nevertheless, they can give us a general idea of the performance of different Intel Pentium 4 platforms in computational applications:
Such large differences between subtests let us estimate the performance drop for different memory types used with the Pentium 4. Out of 26 applications, only 6 do not slow down with the slowest memory, SDRAM. 10 applications also work perfectly with DDR SDRAM, while 16 tests slow down by 5 to 35% even with DDR SDRAM. There are also applications (171.swim and 179.art) whose speed depends directly on the memory bandwidth.
Although the SPEC CPU2000 test is synthetic, some of its tests can be useful for estimating the performance of particular PC components:
Their results agree very well with the theory.
Next time we will take a look at SPEC CPU2000 on other platforms built around the AMD Athlon XP processor and VIA, AMD and SiS chipsets for