It is always worth measuring a PC's performance. But before doing so, you have to establish the criteria for comparison with other systems. Take HDD size, for example. At first sight, the bigger the disc, the better. But other important factors should be taken into account as well: the size/price, size/noise, size/power consumption and size/performance ratios, etc. Moreover, maybe you don't need a 180 GByte disc, and 10 GBytes is sufficient. The most frequently used test packages often answer the question of which computer is better, but they don't take the user's requirements into account. An ideal test would let everyone choose a computer according to the applications he uses. But it is impossible to write a test that would satisfy everybody.

That is why there are other approaches to speed estimation: first, we measure speed in real applications, and second, we carry out low-level hardware tests. In the first case, the scripts used may differ from how a particular user works. But, as a rule, each script is based on an analysis of the work of many people, so its results are suitable for a lot of users. In the second case, the code of the test program differs from that of any particular application, which is why the results are often far from reality. But this approach lets us estimate the potential efficiency of the system even in applications that have not been developed yet. For this purpose, the behavior of applications of a certain class is studied and a template for a synthetic test is built from it.

There is also one more test type - a so-called 100% synthetic test. When examining the peak memory transfer rate we obtain, for example, 1600 MBps. What does it mean? Taken on its own - nothing. Neither testers nor readers like such tests, but we shouldn't reject them. The correctness of tests is a wide theme for discussion. The subject of today's review is a bit different.
In the first part I will describe tests of systems based on the Intel Pentium III and the AMD Athlon / Athlon MP; the second part will also include the Intel Pentium 4. Today we won't argue about which system is better and the like; we will just give you some figures and talk a bit about them. Moreover, we will consider separate applications - no integral estimates will be given. I hope this will help us say whether one system or another suits a certain application.

AMD Athlon MP processor and AMD 760 MP chipset

The new AMD CPU doesn't differ much from its predecessor in appearance. The internal differences are few (in fact, there is only one in the table below):
The company has developed a new chipset for the new processor, the AMD-760MP. It is the first chipset for SMP platforms based on AMD processors. Here are its characteristics:
The chipset uses the classic architecture of north and south bridges linked by a 133 MBps PCI bus. Although the AMD-762 chip supports the 66 MHz PCI bus, there are currently no south bridges working at that speed. The chipset differs from the AMD-760 only in its SMP and 64-bit PCI bus support. The south bridge is the same (AMD-766).

The first motherboard on the AMD-760MP is the Tyan Thunder K7 S2462. The platform is primarily intended for the market of servers and high-end workstations. The motherboard therefore has many features of a server board: an integrated SCSI controller, network and video. The DIMM slots, positioned at an angle of 45 degrees to the board, will cause no problems when installing it in 1U/2U rack-mount server cases. Technical characteristics:
The board is of high quality. It ships with a manual, a bracket with the second COM port, and FDD and ATA/100 cables. During installation we faced some problems with the fans and with the FDD and power connectors. Fans for Athlon processors have to be large. We therefore used Thermaltake Mini Copper Orb coolers, but they were quite difficult to install because the capacitors were too close. One of the DIMM modules next to the first processor also hinders their installation. The FDD port is very close to the lower edge of the board, i.e. far from the floppy drive in a standard case, and its cable gets tangled with the SCSI ones. The main power supply connector is located in the center and is thus difficult to reach. But it is possible that such a position of the FDD and power connectors suits rack-mount servers much better.

All integrated controllers (video, SCSI, two network ones) can be disabled with jumpers. Besides, there are 4 jumpers onboard which set the FSB frequency; they were set to 90 MHz by default, which is why the Athlon 1 GHz at first worked at only 676 MHz. The situation improved after we set them to 133 MHz.

Another problem occurred during installation of the operating system. I tried to install the system by booting from a CD, but it turned out that the BIOS does not consider the ATAPI DVD-ROM drive a standard CD-ROM device. A 48x CD-ROM from Philips didn't help either. The BIOS agreed to work only with the Samsung SCR-2432.

There are not many settings in the Phoenix BIOS: the standard ones (configuration of integrated ports and IDE devices), the boot device order, and PCI bus parameters. Nothing that could help to overclock the processor can be found here.

The company also plans a light version of the S2462 board - the MP S2460. It will lack SCSI, network and video, so the price will be much lower.

Tests and programs

Dual-processor platforms are usually meant for server applications.
Unfortunately, we are not currently able to run server/network tests. That is why we will test the board for its suitability for high-performance workstations.

Before starting the tests it is important to configure the system. Luckily, today's motherboard production technology has reached the level where boards from different manufacturers built on the same components show very close results. Comparing boards on different platforms is much more difficult. They have different tuning options (for example, those of RDRAM and DDR RAM). That is why we left all settings at default in all configurations. This approach allows us to compare the speed of the base systems.

Setting up the software is much easier. Before installation we deleted the system partition on the hard disc. After that, we installed Windows 2000 from a CD, created one NTFS partition, and installed the motherboard drivers, Service Pack 2, and the video card and network controller drivers.

Another aspect worth pointing out is the optimization of the tests for the additional instruction sets of different processors. It may greatly affect the result. But an application must first ascertain that the processor has such a set, and only then can it use SIMD instructions. It is very useful when you can control this process both automatically and manually. You might remember that a couple of years ago programs detected the Intel Pentium by its support for the CPUID instruction. This caused problems because there were other processors with the same support which weren't fully compatible with the Pentium. That is why now, although MMX is supported by both Intel and AMD processors, it is not always used on the latter.

SIMD instructions in brief

SIMD (Single Instruction Multiple Data) instructions operate on a number of arguments at the same time. They are primarily used in multimedia applications, which apply the same operations to large data streams.
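To make the idea of "one instruction, several arguments" more tangible, here is a small sketch in Python. It is not real SIMD code: it only mimics what a single packed-add instruction does. The layout (one 64-bit word split into four independent 16-bit lanes) and the function names pack, unpack and packed_add are assumptions chosen for this illustration.

```python
# Emulation of a packed 16-bit addition on a 64-bit word.
# Illustration only: a real SIMD instruction does all lanes in one step.

LANES = 4                      # four lanes per 64-bit word
LANE_BITS = 16                 # each lane is 16 bits wide
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit integers into one 64-bit integer (lane 0 is lowest)."""
    if len(values) != LANES:
        raise ValueError("expected exactly four values")
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit integer back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add(a, b):
    """Add lane by lane with wraparound; a carry never crosses into the next lane."""
    return pack([(x + y) & LANE_MASK for x, y in zip(unpack(a), unpack(b))])


a = pack([1, 2, 3, 65535])
b = pack([10, 20, 30, 1])
print(unpack(packed_add(a, b)))  # [11, 22, 33, 0]
```

The last lane wraps around to 0 instead of carrying into its neighbour, which is the property that distinguishes a packed addition from an ordinary 64-bit one.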
SIMD instructions were first used in programs that process digital multimedia data. Such applications were easy to optimize - it was only necessary to rewrite the most time-consuming part of the code to get a considerable speed gain. Of course, such instruction sets can also be used in scientific computing programs, but code optimization is more difficult there. The first instruction sets, however, could operate only on integer data.

The first additional instructions were released by Intel in 1997 together with the Intel Pentium MMX. The MMX specification defines new 64-bit data types:
and 57 new instructions that operate on new 64-bit registers of the processor. Later, in 1999, the SSE set that came with the Intel Pentium III added 70 new instructions and the ability to work with single-precision floating-point data. They were mostly intended for streaming data, hence the name SSE - Streaming SIMD Extensions. In addition, eight 128-bit registers were added. The Pentium 4 supports the SSE2 set, which can perform packed operations on double-precision data. The number of instructions has grown to 144.

AMD has also introduced its own SIMD technology called 3DNow! (in the AMD K6-2 processor). Apart from support for all MMX extensions, it includes 21 new operations on single-precision floating-point data. They use the same eight additional 64-bit registers as the Intel Pentium MMX. In the AMD Athlon the company implemented another 24 instructions and named the set Enhanced 3DNow! It included 19 instructions that improve MMX integer arithmetic and add stream data movement capabilities, and 5 digital signal processing (DSP) instructions for applications like software modems, Dolby Digital and MP3 processing. The Athlon MP acquired SSE support, and the resulting combination of Enhanced 3DNow! and SSE was named 3DNow! Professional. Here you can see which SIMD instructions are supported by which processors:
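The feature sets named in this section can also be written down as a small lookup, which makes it easier to see why an application should check for the instruction set itself rather than for the processor vendor. The Python sketch below is only an illustration under stated assumptions: the CPU_FEATURES table is filled in from the text above (not read from real hardware), and the code paths and the functions pick_by_features and pick_by_vendor are hypothetical.

```python
# Feature sets as described in the text above (not detected from hardware).
CPU_FEATURES = {
    "Pentium III": {"MMX", "SSE"},
    "Pentium 4": {"MMX", "SSE", "SSE2"},
    "Athlon": {"MMX", "3DNow!", "Enhanced 3DNow!"},
    "Athlon MP": {"MMX", "3DNow!", "Enhanced 3DNow!", "SSE"},
}

# Hypothetical code paths of an application, fastest first.
CODE_PATHS = ["SSE2", "SSE", "MMX"]


def pick_by_features(features):
    """Choose the fastest code path that the reported features allow."""
    for path in CODE_PATHS:
        if path in features:
            return path
    return "plain C"


def pick_by_vendor(vendor, features):
    """Flawed variant: uses SIMD code paths only on one vendor's CPUs."""
    if vendor != "GenuineIntel":
        return "plain C"
    return pick_by_features(features)


for name, features in CPU_FEATURES.items():
    print(name, "->", pick_by_features(features))
```

With these tables a feature check gives the Athlon MP the SSE path, while pick_by_vendor("AuthenticAMD", ...) drops the same processor to the plain C path even though it supports MMX and SSE.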
Testing SMP systems has its own peculiarities. The point is that applications can gain from a multiprocessor system only if they were developed with this in mind. All other applications don't care whether the system is dual-processor or not (their speed won't increase). They can still benefit from the second processor, however, if you want to run two applications at the same time, for example, an archiver and an MP3 player. That is why we will describe the peculiarities of each test.

We used a lot of test programs. First of all, SYSmark 2000 (patch 5) and Intel Pentium 4 Application Launcher 2.1 (hereafter P4AL). They are very similar to each other and differ only in the set of applications. In essence, each program runs a certain script. Of course, it may differ greatly from what you do, but it allows estimating the operating speed of a great number of various PCs. The program measures the script execution time (or fps) and expresses the result as a percentage of that of the base system. Unfortunately, it is unclear what the SYSmark 2000 scripts are doing. For the P4AL there is a detailed description of each application. Besides, it gives information on the base system configuration:
But you can see that the test is basically meant for the Pentium 4, which is why there can be problems in other configurations. And we did face them. For example, one of the tests required specifically a GeForce2 based card (in all other tests we used a GeForce3 one). The Intel Pentium 4 could have worked perfectly well with the GeForce3, but we had to keep the GeForce2. The P4AL uses simpler scripts than SYSmark 2000: for example, only the conversion of audio/video files, clip playback, etc. SYSmark 2001 wasn't used since it gives only overall figures, and it is impossible to say which application affected the result. The ZD Winstone 2001 tests were rejected for the same reason.

Secondly, we used the SPECviewperf package; its results are given only for uni-processor configurations since it doesn't support SMP. Its results are not highly informative, though, because even with a 1.2 GHz processor the system is limited by the video card.

Finally, we used Quake 3 Arena. This game adequately estimates the efficiency of modern systems and supports SMP.

3DStudio MAX 3.0, Windows Media Encoder 7 and the Flask MPEG encoder with DivX 3.11 were used as additional tests.

Configurations

The following processors will be tested in uni- and dual-processor configurations in the first part of the review:
The Pentium III works with a 133 MHz FSB while all Athlon processors operate at 266 MHz. We are going to compare the processors in pairs: Pentium III 1 GHz vs. Athlon 1 GHz and Athlon 1.2 GHz vs. Athlon MP 1.2 GHz. There are many pros and cons to comparing processors running at the same frequency, because their internal architectures, FSB frequencies, processor buses and prices are different. Nevertheless, you can single out just the processors that interest you most.

For the Intel Pentium III based system we used the Tyan Thunder HEsl (S2567) motherboard. The AMD based system used the dual-processor Tyan Thunder K7 (S2462) board. Almost all other components are the same:
The test configurations still differ in that they use different SCSI controllers. But a quick SYSmark 2000 run on one system with different controllers (we used an external controller on the QLogic Ultra160 chip) showed that the difference in the results doesn't exceed 2%, so we can neglect it. The boards feature different network chips, from Intel and 3Com, but they were disabled in the OS during the tests. Different memory technologies are quite difficult to compare today, so we have to put up with this difference. Moreover, DDR RAM is not very beneficial for the Intel PIII. But the Intel processors on the S2567 platform use ServerWorks chips, whose memory technology is similar to DDR RAM. The system components were chosen so that only the processors or the motherboards could be bottlenecks for the applications.

The tests were carried out under Windows 2000 Pro, SP2. For the AMD based board we used the chipset drivers (they can be found on AMD's site). For the ServerWorks board there are simply no drivers, which is why I doubt that the AGP performed at its best. I think it worked in PCI mode. But most of the tests we conducted depend on the CPU's computational power rather than on the video card. Later we will show where and how the AGP speed has an effect (the SPECviewperf and Quake3 tests). For the video card we used the v12.40 drivers, VSync was off, and the display mode was 1280x1024x32. For the P4AL tests we used the Inno3D GeForce2 GTS video card, while for the rest it was the MSI MS-8822 GeForce3.

Test results

The first part is the SYSmark 2000 test. Among the applications used by SYSmark 2000, only Windows Media Encoder 4.0 fully supports SMP. That is why its results will be given for the SMP systems as well.

In the Office Productivity test, the higher the CPU frequency, the higher the score. The comparison of the architectures shows that the AMD Athlon beats the Intel Pentium III working at the same clock speed by 8 - 18%.
The Athlon MP outscores the plain Athlon, but not considerably (1 - 4%). The applications of this test are not optimized for the SIMD instructions of the participants (or are optimized only for MMX).

In the next test nothing has changed except for the Intel Pentium III, which has gone far ahead in Adobe Photoshop 5.5. Photoshop 5.5 uses a set of plug-ins with Pentium III support - FastCore.8BX, MMXCore.8BX, Wind.8BF and LightingEffects.8BF. In this test the applications are well optimized for MMX and SSE. SSE is also used on the Athlon MP, which is why its result is higher than that of the usual Athlon. It is unclear why Adobe didn't supply its products with a full description of the processors and instructions supported.

Version 4 of Windows Media Encoder and its audio/video codecs use only the MMX technology. And since it is supported by all the processors, the results depend only on the CPU frequency. SMP support helps to reach higher results when working with two processors. Besides, the Athlon MP leads its predecessor by 10%.

The next test, Intel Pentium 4 Application Launcher 2.1, uses the following applications:
Intel has divided them into two groups:
Let's start with consumer applications. The upper line stands for one processor, the lower one means SMP. Each test was conducted three times.
This test encodes one large WAV file (more than 500 MBytes) into MP3 at 128 Kbps with the eJay MP3 Plus Encoder. According to the test, the encoder can use all the advantages of SMP systems. But no instructions other than MMX are used. The data processing speed rises as the CPU frequency grows.

This test uses a demo of the Incoming Forces game. Processing complex surfaces and fractal objects requires a lot of processor power. According to the results, the game doesn't support SMP systems, but it can use SSE, and only on Intel CPUs. Why the results dropped when we replaced the Athlon 1 GHz with the 1.2 GHz one is not clear.

This is an MPEG2 player that consists of a single file. It can use 4 codec versions: a "pure C implementation" and versions compiled for MMX, SSE and SSE2. We played a 20 MByte MPEG2 file (704x480, NTSC, 29.97 fps, 5,000,000 bps) and measured fps. The SSE processors behave very strangely. The same goes for the dual-processor Athlon 1.2 GHz.

Adobe Premiere 5.1 is used for creating and editing video. The Ligos LSX plug-in allows us to write the final video in the MPEG2 format. The script uses a 320x240 AVI fragment from the Intel MMX(TM) technology presentation. It is encoded into MPEG2 (640x480, VBR video, 384k audio, motion estimation 16, with I, B and P frames). Besides, one of the dynamic libraries from Ligos is used, chosen according to the processor type. The test shows that SSE allows the Pentium III 1 GHz to catch up with the Athlon 1.2 GHz. The Athlon MP leads either because of SSE or because of a better MMX implementation. The test ran correctly only on the faster processors; in all other cases there were errors. But since the test was very simple, I carried out the remaining runs with a stop-watch and normalized their results to the base system.
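Normalizing a hand-timed result to the base system can be sketched as follows. This is a minimal illustration that assumes the score is a plain ratio expressed in percent; the exact formula used by the test package is not documented here, and the function name relative_score and the numbers are made up for the example.

```python
def relative_score(measured, base, higher_is_better):
    """Return a result as a percentage of the base system's result.

    For times (lower is better) the ratio is inverted, so a faster
    system always scores above 100.
    """
    if measured <= 0 or base <= 0:
        raise ValueError("results must be positive")
    ratio = measured / base if higher_is_better else base / measured
    return 100.0 * ratio


# Made-up numbers, for illustration only.
print(relative_score(measured=80.0, base=100.0, higher_is_better=False))  # 125.0
print(relative_score(measured=45.0, base=60.0, higher_is_better=True))    # 75.0
```

Inverting the ratio for time-based tests keeps both kinds of result on the same scale, so encoding times and fps figures can be read the same way.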
Ulead VideoStudio lets you capture video clips from different sources, edit them using digital effects and save the result to a file in MPEG1/MPEG2 or another format. The script used opens a project that consists of four video clips and creates a final file in the MPEG2 format. It uses almost the same Ligos MPEG library as Adobe Premiere does. That is why the results are so close, though the Pentium III has achieved better scores. I think that the Athlon MP uses only the MMX version of the library despite its SSE support.

The last three applications of the Intel package use Ligos code. This company is obviously Intel's partner in promoting SSE and SSE2.

Now let's take a look at the business applications.
MagniTrax is used for viewing images, including special 3D images in the HoloGrafix format. Its main distinguishing feature is 3D file manipulation using a proprietary technology that tracks the user's head with a video camera. Without SSE this program looks dull. The result shows that the Athlon MP has excellent SSE support.

Dragon NaturallySpeaking Preferred 4.0 is a speech recognition application. The script plays a WAV file with Dragon's PlayWave utility and converts the sound into text. The test time defines the efficiency. The participants take the same places as in the SYSmark 2000 test, which uses the same application.

Windows Media Encoder is used for encoding video and audio data into the streaming WMV/WMA format. The source data can either be standard files in the AVI, WAV and MP3 formats or be captured in real time from a sound card or a video capture card. The final file can be viewed with Windows Media Player or sent as a stream from a special server. The source file is a 30-second 320x240 AVI fragment from the Intel MMX(TM) technology presentation. The file was encoded with the following parameters: 720x480, 30 FPS, 10 sec/iframe, crispness=50, MPEG4 V3 high bandwidth video codec, Windows Media V7 Audio 44 kHz stereo audio codec. The file conversion time is measured. Windows Media Encoder 7.0 is the second application of the Intel Pentium 4 Application Launcher set that supports SMP. Unlike v.4.0 included in SYSmark 2000, this one supports SSE, which is why the results are much better.

SPECviewperf shows that the video card doesn't cope with its task, and the lack of AGP support for the ServerWorks chipset has also affected the results. Here they are:

So, the first thing that attracts our attention is the Intel Pentium III system, which falls too far behind.
In fact, ever since we first tested the Pentium 4 it had seemed strange to us to see 15.11 in ProCDRS-03; the Pentium 4 1.7 GHz got the same result. At first sight it was the video card that acted as the bottleneck. We checked this by replacing the video card with the Inno3D GeForce2 GTS, but no great changes were noticed. The cause could lie in the AGP throughput. So we conducted additional tests on the Thunder K7 board coupled with the Athlon MP 1.2 GHz. The results show that the speed falls only if the AGP is limited to 1x, while the 2x and 4x results are almost the same. Then I tested the video card again at different GF3 core frequencies (I changed them by 10%), and the results confirmed that it was indeed the video card that was to blame. The second conclusion is that the Thunder HEsl S2567 system is limited by the AGP bus speed: with the AGP bus set to 1x, the Athlon MP 1.2 GHz performed very close to the Pentium III 1 GHz - 51.42 in AWadvs-04 and 3.8 in Light-04. The illustrations of all additional SPECviewperf tests and some solutions will be given in the second part of the review.

Now for Quake3 Arena. I used three demos with different quality settings in 4 resolutions (the results are given for the lowest resolution, 640x480). The demo002 shows higher fps than q3crush and quaver (the results will be illustrated). For the other two demos the pictures are similar:

I don't know exactly whether Quake3 supports SIMD. I think it uses only MMX. Here the Athlon also beats the Pentium III (35% against 17% in a dual-processor configuration, 1 GHz CPUs). The Athlon MP edges out its predecessor working at the same frequency. The Intel Pentium III results are also lowered by the ServerWorks chipset.

The additional tests are:
The source file, toronto.mpg in the MPEG1 format (29 MBytes) from the ATi video clip set, was encoded with the standard WME 7.0 profiles at 64, 128 and 256 Kbps (profiles 3-5 in a freshly installed WME). We measured the conversion time, so the lower the figure, the better:

Flask 0.594 and the DivX 3.11alpha codec were used for encoding into DivX. The source VOB file, DolbyDigitalBroadway.vob (27 MBytes), is taken from the Dolby demo set. The VOB file was converted with the DivX Fast Motion codec. All settings were left at default. We measured the conversion time:

SSE is not used here, which is why the Pentium III takes a lower place.

The last two tests are run in 3DStudio MAX. It uses SMP for frame rendering and is not optimized for any processor at standard settings. In the first test we measured the time needed to render the first frame of the architecture.max scene from the SPECapc for 3D Studio MAX R3 package. The result is here (62 KBytes, JPEG). The scene consists of 29 objects, 7 lights, 398229 vertices and 606794 faces; all its files take almost 45 MBytes. The less the time, the better:

The AMD Athlon and Athlon MP at 1.2 GHz lead in this test. The Pentium III is outscored by the Athlon working at the same frequency. With simpler scenes the positions are the same.

The second test measures (in fps) the drawing speed of the standard 4views.max scene in four windows. The Pentium III lags behind here as well. The Athlon MP has almost a 40% lead over the plain Athlon in a uni-processor configuration. But the SMP system on the Intel processors beats the Athlon 1 GHz system.

Conclusion

In applications which are not optimized for MMX, SSE and 3DNow! the AMD Athlon performs better than the Intel Pentium III. In some tests the Athlon MP also beats the plain Athlon. It seems that the architecture was improved, though it is very difficult to find any differences when reading the descriptions on the AMD site.
If your favorite program is able to use MMX and/or SSE, the benefit can be quite considerable (the Photoshop, VideoStudio and WME7 tests prove it). Unfortunately, we couldn't estimate the efficiency of AMD's own SIMD instructions - 3DNow! and Enhanced 3DNow! - though the MMX and SSE implementation in the Athlon MP is excellent: in Photoshop 5.5 (SYSmark 2000) the Athlon MP gains 40% over the plain Athlon, mainly thanks to SSE. By the way, the lsxprem.dll library contains the strings "AuthenticAMD" and "lsxpremk7.dll", but the latter file itself still seems to be absent. That is why there are two possibilities: either AMD will actively promote Enhanced 3DNow!, or it will simply implement SSE2 :). Unfortunately, there are no tests which can show the efficiency of the different SIMD instructions, but we keep searching. In the second part you will see the Intel Pentium 4, which will compete against the latest AMD models.
Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.