Part 2: Graphics Cards and Synthetic Benchmarks
We've tested four 3870 X2 cards from the above-mentioned companies (AMD partners) in our test lab. Three of them are copies of the reference design and hardly differ from it, except for the higher operating frequencies of the MSI card. In other words, HIS, MSI and TUL have nothing to do with manufacturing these cards; they just buy finished products from AMD. MSI then selected cards with GPUs capable of operating at 860 MHz to offer them as OC editions.
These cards don't have counterparts, so we have nothing to compare them to. However, we'll risk a comparison with a single-GPU graphics card based on the same 3870 GPU. Developers had to use the back side of the board for memory chips (half of the memory chips had to be moved from the front to the back). These chips are covered with a metal plate on the back side of the board for cooling. On the other hand, the GeCube card works fine without such a plate, although it has the same memory chips operating at an even higher frequency.
GeCube designed its own PCB. This solution has its pros and cons. Pro: the card is shorter than the reference model, so it is more likely to fit inside your PC case. Con: the layout of the external power connectors. On the reference card they face down (when the card is installed in the motherboard), so it's easy to plug power cables into them. On the GeCube card they face the rear, so the power cables effectively make the card a tad longer.
The card from GeCube has another peculiarity: four DVIs instead of the two on the reference card. Moreover, the non-standard BIOS in this card does not let the driver enable CrossFire by default (in the reference card CrossFire cannot be disabled). So users can disable CrossFire and get what are effectively two cards, each with two DVIs, and plug four monitors into the card. The flexibility of this card is praiseworthy.
As for the reference cards from HIS, PowerColor, and MSI, their PCBs are as long as that of the 8800 GTX (270 mm). Take this into account if you want to buy one of these cards.
The cards have TV-Out with a unique jack. You will need special bundled adapters to output video to a TV-set via S-Video or RCA. You can read about the TV-Out in more detail here.
Analog monitors with d-Sub (VGA) interface are connected with special DVI-to-d-Sub adapters. The bundle also includes DVI-to-HDMI adapters (these graphics cards can transfer video and audio data to HDMI receivers), so there should be no problems with such monitors. Maximum resolutions and frequencies:
As for MPEG2 playback features (DVD-Video), we analyzed this issue in 2002. Little has changed since then. CPU load during video playback on modern graphics cards does not exceed 25%.
HDTV and other trendy video features. You can read one review here.
These cards require additional power, so each card is bundled with Molex-to-6-pin adapters, although all modern PSUs offer such cables. The reference cards have TWO power connectors, one of which is an 8-pin connector. Don't let it confuse you: a regular 6-pin cable will be sufficient. The two additional pins are responsible for overclocking via the drivers.
Now about the cooling systems.
We monitored temperatures using RivaTuner (written by A.Nikolaychuk AKA Unwinder). Here are the results:
HIS RADEON HD 3870 X2 2x512MB PCI-E
|HIS RADEON HD 3870 X2 2x512MB PCI-E|
|The box contains a User's Manual, a CD with drivers, a component output adapter, a DVI-to-VGA adapter, a DVI-to-HDMI adapter, an external power adapter, and a CrossFire bridge. Most curiously, HIS continues to bundle bonuses with its graphics cards, in this case a screwdriver with interchangeable heads. The screwdriver has a built-in level and a flashlight to illuminate the screw you are working on. It's a nice bonus for users who assemble or upgrade computers on their own. Unfortunately, there is no 6-pin-to-8-pin adapter, although one of the on-board power connectors has 8 pins.|
|GeCube RADEON HD 3870 X2 X-Turbo Dual 2x512MB PCI-E|
|The bundle includes: User's Manual, CD with drivers, component output adapter, DVI-to-VGA adapter, DVI-to-HDMI adapter, external power adapter, CrossFire bridge. It's a small bundle, nothing like the luxurious-looking card and box.|
|PowerColor RADEON HD 3870 X2 2x512MB PCI-E|
|User's Manual, CD with drivers, component output adapter, two DVI-to-VGA adapters, DVI-to-HDMI adapter, external power adapter, CrossFire bridge.|
|MSI R3870X2-T2D1Q-OC (RADEON HD 3870 X2) 2x512MB PCI-E|
|The same bundle, plus (VERY IMPORTANT!) a 6-pin-to-8-pin adapter. Thus, the MSI card can be connected so that the AMD drivers expose overclocking options.|
|HIS RADEON HD 3870 X2 2x512MB PCI-E|
We have to scold HIS here. The company has great experience in packaging, but in this case it stuffed a huge card into a thin box that bulges when you close it. The design is tasteless as well. The box gives the impression of a cheap card inside, even of a no-name product made in China.
Bundled components are secured in a plastic section inside.
|GeCube RADEON HD 3870 X2 X-Turbo Dual 2x512MB PCI-E|
Designers from GeCube did a great job. It's an excellent box with a window to show off the card, which looks gorgeous.
Bundled components are arranged into cardboard sections inside.
|PowerColor RADEON HD 3870 X2 2x512MB PCI-E|
Designers still prefer vertically oriented boxes. It's actually a jacket with a white cardboard box inside. The box contains all bundled components arranged into cardboard sections in a pile of cardboard padding. I don't understand why so much cardboard is wasted when a plastic tray would do.
The box has a stylish design.
|MSI R3870X2-T2D1Q-OC (RADEON HD 3870 X2) 2x512MB PCI-E|
Here we see the famous huge bag that MSI uses for all its expensive cards. However, the bag is half-empty, because the bundle is small (we remember the time when such boxes held 11 CDs with software and various bonuses).
The graphics card is secured inside a foamed polyurethane box.
VSync is disabled.
Our synthetic benchmarks can be downloaded here:
Synthetic tests were run with the following graphics cards:
We've selected these solutions to compare with the new dual-GPU card from AMD for the following reasons. The RADEON HD 3870 is a full single-GPU counterpart that differs only in the operating frequencies of the GPU and memory. The latest GeForce 8800 GTS (G92) is used for two reasons: firstly, it is close to the HD 3870 X2 in price; secondly, it's one of the fastest solutions from the competitor.
We should warn you that the analysis is going to be boring, because nothing has changed from the architectural point of view. The GPU is the same; this card simply carries two of them. And we are already familiar with CrossFire. Besides, the results are easy to predict, as CrossFire always uses the AFR mode. The frame rate will almost double in most cases, which is far from the situation in real games: synthetic tests are simple, and their frames do not depend on previous results (render targets). So we expect a doubled frame rate in almost all cases.
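The expectation of a near-doubled frame rate can be illustrated with a toy model of AFR. This is only a sketch under simplifying assumptions: every frame costs the same, frames are fully independent, and there is no CPU or bus overhead. The function name and the 20 ms frame time are hypothetical and are not measurements from this review.

```python
def afr_frame_rate(frame_ms, gpus, frames=1000):
    """Toy AFR model: frame i is rendered by GPU (i % gpus).

    Assumes identical frame cost and no dependency between frames.
    Returns the average frame rate in frames per second.
    """
    gpu_busy_until = [0.0] * gpus      # time (ms) at which each GPU becomes free
    last_finish = 0.0
    for i in range(frames):
        g = i % gpus                   # alternate frames between GPUs
        gpu_busy_until[g] += frame_ms  # this GPU renders one more frame
        last_finish = max(last_finish, gpu_busy_until[g])
    return frames / last_finish * 1000.0


single = afr_frame_rate(20.0, gpus=1)  # hypothetical 20 ms per frame
dual = afr_frame_rate(20.0, gpus=2)
print(f"single GPU: {single:.1f} FPS, AFR on two GPUs: {dual:.1f} FPS")
```

In this idealized model the two-GPU result is exactly twice the single-GPU one. Any dependency of a frame on the previous one (for example, a reused render target) would force the GPUs to wait for each other and reduce the gain.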
This test determines peak texel rate in FFP mode for different numbers of textures applied to a pixel:
With a single texture, performance of all solutions is limited by memory bandwidth and by the number of ROPs. With many textures per pixel, the situation gets better. The GeForce 8800 GTS and the HD 3870 X2 come close to each other, although the dual-GPU monster from AMD is still outperformed by the single-GPU solution from NVIDIA. That's the effect of the relatively small number of TMUs in the AMD R6xx architecture... There is no twofold difference between the HD 3870 X2 and the HD 3870 in any test, even though the difference between them grows with the number of textures. Let's have a look at the fill rate test:
Let's analyze a couple of extreme geometry tests. The first test uses the simplest vertex shader that shows maximum triangle throughput:
Our GPUs execute this test in various modes with similar efficiency. Peak performance in FFP, VS 1.1, and VS 2.0 does not differ much; only the G92 is a tad faster in the FFP mode. We cannot say much about these results, except that AMD GPUs traditionally process geometry faster than NVIDIA GPUs. The card based on two RV670 GPUs becomes the evident leader in geometry performance, as nothing stops it from demonstrating a doubled frame rate in this test.
We removed two intermediate geometry tests with a single light source, because they don't show anything interesting. So we proceed straight to the most complex geometry task with three light sources, including static and dynamic branches:
To sum up the geometry tests: the HD 3870 X2 has the same GPUs and uses CrossFire to double the frame rate, so we get the expected results, twice as high as those of the single-GPU HD 3870. But don't forget that the situation may change considerably in real applications...
The first group of pixel shaders to be reviewed here is too simple for modern GPUs. It includes various versions of pixel programs of relatively low complexity: 1.1, 1.4, and 2.0.
The twofold difference between the HD 3870 X2 and the HD 3870 is preserved; it is smaller only in the illumination tests. Let's have a look at results in more complex pixel programs of intermediate versions:
In the second compute intensive test, AMD solutions are already ahead. The GeForce 8800 GTS is a tad slower than the regular HD 3870, to say nothing of the X2 card, which is traditionally twice as fast as the HD 3870. So, the second task suits the AMD architecture with many unified processors.
These tests of DirectX 9 pixel shaders are even more complex and are divided into two categories. We'll start with the easier shaders, SM 2.0:
There are two modifications of these shaders: arithmetic intensive and texture sampling intensive. Let's analyze the arithmetic intensive modifications first, as they are more promising from the point of view of future applications:
AMD solutions were traditionally faster in the second Parallax Mapping test. But NVIDIA G92 solutions with improved TMUs (parallax mapping requires an additional texture lookup) changed the situation, so the new GeForce 8800 GTS outperforms the HD 3870. However, the twin RV670 represented by the HD 3870 X2 demonstrates twice as many frames per second in this test as the single-GPU solution. Let's analyze texturing intensive modifications of the same tests:
Let's have a look at results of another two pixel shader tests—SM 3.0. They are the most complex of all our tests for Direct3D 9 pixel shaders. The tests load ALUs and texture units heavily. Both shader programs are complex, long, and include a lot of branches:
The RADEON HD 3870 X2 is still twice as fast as its single-GPU modification, but it has very little advantage over the GeForce 8800 GTS. What will happen when NVIDIA launches a similar card with two G92 GPUs? No need to guess: AMD will be defeated again, provided SLI and CrossFire scale equally well.
New RightMark3D 2.0 includes two old PS 3.0 tests for Direct3D 9, rewritten for DirectX 10, and two brand new tests. The first two tests can now enable self-shadowing and shader supersampling, which increases the GPU load.
These tests measure efficiency of executing looped pixel shaders with a lot of texture lookups (up to several hundreds of lookups per pixel in the heaviest mode!) and a relatively low ALU load. In other words, they measure a texture sampling rate and branching efficiency in a pixel shader.
The first pixel shader test is the Fur test. With the lowest settings, it uses 15-30 texture lookups from bump maps and two lookups from the main texture. The High Effect Detail mode increases the number of lookups to 40-80. With shader supersampling enabled, the number of lookups grows to 60-120. The High mode with SSAA is the heaviest: 160-320 lookups from a bump map.
Let's see what happens in modes without supersampling. They are relatively simple, and the ratio of results in the Low and High modes should be similar.
All results in the High mode are approximately 1.5 times lower than in the Low mode. Procedural fur tests with a lot of texture lookups traditionally show a great advantage of NVIDIA over AMD. None of the RADEONs can compete with the GeForce 8800 GTS; even two GPUs are of no help here. This defeat cannot be explained even theoretically. Perhaps the problem lies in bugs in the Direct3D 10 driver. CrossFire itself is doing great in Direct3D 10: the HD 3870 X2 is twice as fast as the single-GPU solution.
By the way, judging by previous reviews, performance in this test depends not only on the number and speed of TMUs; rendering speed is also limited by the fill rate and memory bandwidth. Let's have a look at the results in this test with shader supersampling enabled, which quadruples the load. Perhaps it will change the situation:
Theoretically, supersampling quadruples the load, but performance drops more in NVIDIA solutions than in AMD cards, so the gap narrows. Still, only the GeForce 8800 GTS copes with tests of such complexity; the other solutions demonstrate very low results. Nothing changes between the HD 3870 X2 and the HD 3870: the dual-GPU card is again twice as fast owing to the AFR CrossFire mode.
The second test that measures efficiency of executing complex looped pixel shaders with many texture lookups is called Steep Parallax Mapping. With low settings it uses 10-50 texture lookups from a bump map and three lookups from main textures. The heavy mode with self-shadowing doubles the number of texture lookups, and supersampling quadruples this number. The most complex test mode with supersampling and self-shadowing uses 80-400 texture lookups, that is eight times as many as in the low mode. Let's analyze simple modes without supersampling first:
This test is more interesting from the practical point of view. Various parallax mapping methods have been used in games for a long time. Heavy modifications, such as our steep parallax mapping, are used in the latest games, e.g. in Crysis and Lost Planet. Besides supersampling, this test also lets you enable self-shadowing, which doubles the GPU load (High mode).
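The lookup counts quoted for Steep Parallax Mapping follow from two multiplicative modifiers, which a few lines of arithmetic make explicit. This is a minimal sketch: the base range and the factors are taken from the description of the test, while the constant and function names are illustrative only. The three lookups from the main textures are left out, since the text does not say how the modifiers affect them.

```python
LOW_BUMP_LOOKUPS = (10, 50)   # bump-map lookups per pixel at low settings
SELF_SHADOWING_FACTOR = 2     # self-shadowing doubles the number of lookups
SUPERSAMPLING_FACTOR = 4      # supersampling quadruples the number of lookups


def bump_lookups(self_shadowing=False, supersampling=False):
    """Return the (min, max) bump-map lookups per pixel for a test mode."""
    factor = 1
    if self_shadowing:
        factor *= SELF_SHADOWING_FACTOR
    if supersampling:
        factor *= SUPERSAMPLING_FACTOR
    low, high = LOW_BUMP_LOOKUPS
    return low * factor, high * factor


for shadow in (False, True):
    for ssaa in (False, True):
        print(f"self-shadowing={shadow}, supersampling={ssaa}: "
              f"{bump_lookups(shadow, ssaa)}")
```

The heaviest combination gives an eightfold multiplier, which matches the 80-400 range stated for the mode with both options enabled.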
Although AMD solutions used to be strong in Direct3D 9 tests of parallax mapping, they fail to keep up with the GeForce 8800 GTS in the updated DX10 test without supersampling. Even the HD 3870 X2 is outperformed by the NVIDIA card here. Besides, self-shadowing causes a bigger performance drop in AMD products: more than twofold versus 1.5 times in NVIDIA solutions.
The X2 card is again twice as fast as the regular HD 3870. In other words, CrossFire AFR works fine in our tests. Let's see what supersampling changes, as it slowed down NVIDIA cards much more than AMD solutions in the previous test.
That's one more heavy task for GPUs, with two options enabled: supersampling and self-shadowing. The GPU load grows almost eightfold, so the performance drop is enormous. The performance difference between our cards is generally preserved, and supersampling has the same effect as in the previous case: AMD cards improve their results relative to the NVIDIA solution. However, even the HD 3870 X2 is still outperformed by the GeForce 8800 GTS 512MB. As for the comparison between the HD 3870 X2 and the HD 3870, we have nothing to add: the dual-GPU card retains its traditional twofold advantage.
The next pair of pixel shader tests contains very few texture lookups to minimize the effect of TMUs on performance. They use a lot of arithmetic operations, so they measure the arithmetic performance of GPUs, that is, how fast they execute arithmetic instructions in pixel shaders.
The first computing test is called Mineral. It's a complex procedural texturing test, which uses only two texture lookups and 65 sin and cos instructions.
We've already noted in our synthetic Direct3D 9 tests that the latest architecture from AMD often performs better than NVIDIA's architecture in compute-intensive tasks. Although the RADEON HD 3870 is still slower than the best solution based on the G92 in this test, the dual-GPU card doubles its frame rate and significantly outperforms one of the fastest cards from NVIDIA in FPS.
The second shader test, Fire, is even harder on ALUs. It contains only a single texture lookup, while the number of sin/cos instructions is doubled to 130. Let's see what changes as the load grows:
AMD cards failed this test in all previous articles, demonstrating very low results, which indicated an apparent bug in the drivers. Judging by today's test results, the bug has finally been fixed (almost a year after it was reported!), and now the HD 3870 performs on a par with the 512 MB modification of the GeForce 8800 GTS. The RADEON HD 3870 X2 is twice as fast.
RightMark3D 2.0 includes two geometry shader tests. The first one, Galaxy, is similar to point sprites from previous Direct3D versions. It animates a particle system on the GPU, and a geometry shader creates four vertices from each particle. Similar algorithms should be used in future DirectX 10 games.
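To make the "four vertices from each particle" step concrete, here is a CPU-side sketch of the kind of expansion a geometry shader performs for point sprites. It is not the shader used by RightMark3D, whose source is not shown here; the function name, the `right` and `up` camera axes, and the sample values are assumptions made for illustration.

```python
def expand_particle(center, half_size,
                    right=(1.0, 0.0, 0.0), up=(0.0, 1.0, 0.0)):
    """Expand one particle into the four corners of a camera-facing quad.

    `right` and `up` are the (assumed) camera axes in world space.
    Corners are returned in triangle-strip order, so the quad is drawn
    as two triangles: (v0, v1, v2) and (v1, v2, v3).
    """
    corners = []
    for sx, sy in ((-1, -1), (1, -1), (-1, 1), (1, 1)):
        corners.append(tuple(
            c + half_size * (sx * r + sy * u)
            for c, r, u in zip(center, right, up)
        ))
    return corners


# Hypothetical particle at depth 5 with a half-size of 0.5 units.
quad = expand_particle((0.0, 0.0, 5.0), 0.5)
for vertex in quad:
    print(vertex)
```

Doing this on the GPU means that only one point per particle has to be stored and animated, and the fourfold vertex amplification happens at draw time.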
Changing the balance in geometry tests does not affect the rendering results: the image is always identical, only scene processing methods differ. The GS load value determines which shader does the work, vertex or geometry. The amount of work is always the same.
Let's analyze the first modification of Galaxy with vertex computing for three levels of geometric complexity:
The correlation of results with different complexity levels of the scene is almost the same, only absolute values are different. Performance corresponds to the number of points, FPS is halved each step. Only the dual-GPU card from AMD can compete with the GeForce 8800 GTS in this test. It's twice as fast as its single-GPU modification based on the RV670, which in its turn is slower than the NVIDIA card. However, it's not a hard task for modern graphics cards. Our previous tests demonstrate that performance is not limited by shader ALUs here, the task is limited more by memory bandwidth than by GPU. Perhaps the situation will change, when some work is moved to a geometry shader.
There are no significant changes in this case. All graphics cards demonstrate practically the same results, when GS load changes (responsible for moving some computations into a geometry shader). The GeForce 8800 GTS still outperforms the HD 3870, while the HD 3870 X2 is twice as fast as the single-GPU card. Perhaps, it's the effect of different clock rates and a measurement error. We'll see what will happen in the next test...
Hyperlight is the second geometry test that uses several techniques: instancing, stream output, buffer load. It employs dynamic generation of geometry by rendering into two buffers, as well as a new Direct3D 10 feature—stream output. The first shader generates ray directions, their speed and growth vectors. These data are stored in a buffer, which is used by the second shader for rendering. Each ray point is used to generate 14 vertices in a circle, up to a million output points.
The new type of shader programs is used to generate rays. If "GS load" is set to "Heavy", it is also used for rendering. That is, in Balanced mode geometry shaders are used only to generate and grow rays, and output is handled by instancing. In Heavy mode the geometry shader also outputs the data. Let's analyze the easy mode first:
Relative results in various modes correspond to the load: performance scales well in all cases. It's close to theoretical parameters, according to which, each next level of Polygon count must be twice as slow. Performance of the NVIDIA card in this test is again much higher than that of both cards from AMD with any geometry complexity. A single-GPU card from AMD is outperformed by the GeForce 8800 GTS 512MB by more than twofold, and the dual-GPU card cannot even get close to it.
It's the first test in our review where the RADEON HD 3870 X2 outperforms the single-GPU card by less than twofold: the difference is just 1.5 times. This is a synthetic test where even the CrossFire mode most favorable for FPS does not yield a twofold performance gain. It will be worse in game tests...
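A 1.5-fold gain from two GPUs can also be expressed as scaling efficiency, which is a convenient way to compare multi-GPU results across tests. The helper below is a minimal sketch; the function names and the sample frame rates are hypothetical, and only the 1.5 ratio comes from the result described here.

```python
def speedup(multi_gpu_fps, single_gpu_fps):
    """Ratio of multi-GPU to single-GPU frame rate."""
    return multi_gpu_fps / single_gpu_fps


def scaling_efficiency(speedup_factor, gpus=2):
    """Fraction of the ideal (linear) gain that was actually achieved."""
    return speedup_factor / gpus


# Hypothetical frame rates chosen only to reproduce the 1.5x ratio.
ratio = speedup(90.0, 60.0)
print(f"speedup: {ratio:.2f}x, efficiency: {scaling_efficiency(ratio):.0%}")
```

A perfect twofold gain corresponds to 100% efficiency, while the 1.5-fold result in this test corresponds to 75%.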
Results may change in the next test, which uses geometry shaders more actively. It will be also interesting to compare results obtained in Balanced and Heavy modes.
The correlation of performance results has changed very much. AMD GPUs execute more complex geometry shaders more efficiently than the NVIDIA GPU. However, the GeForce 8800 GTS 512MB performs almost on a par with the RADEON HD 3870, while the gap used to be much larger. The HD 3870 X2 significantly outperforms both competitors, although the performance difference does not reach twofold.
As for the comparison of results in different modes, the new GeForce 8800 GTS in Balanced mode demonstrates better results than the RADEON HD 3870 X2 in Heavy mode. Keep in mind that the image does not differ between these modes. That is, AMD solutions perform better in the second mode (using the geometry shader for output instead of instancing), while NVIDIA prefers the first one. However, when we compare performance results in the best modes, the GeForce 8800 GTS is a tad faster than the new dual-GPU card from AMD.
Vertex Texture Fetch tests measure the speed of many vertex texture fetches. These tests are similar, and the correlation of their results in Earth and Waves tests must be also similar. Both tests use displacement mapping based on texture lookups. The only major difference is that the Waves test uses conditional branches, while the Earth test does not.
Let's analyze the first test (Earth) in Effect detail Low mode:
Results in different modes again demonstrate a similar picture with relative performance. Judging by our previous reviews, results of this test are heavily affected by memory bandwidth. The easier the mode, the stronger the effect on performance.
The GeForce 8800 GTS outperforms the RADEON HD 3870, but the dual-GPU card shoots forward. The difference between AMD cards reaches twofold, but only in heavy modes. Memory bandwidth is insufficient in the easiest mode. The difference in the average mode also fails to reach twofold. Let's have a look at results of this test with more texture lookups:
The situation hasn't changed much, the RADEON HD 3870 X2 is still ahead. It's up to twice as fast as the HD 3870, as the task grows more complex. The NVIDIA GeForce 8800 GTS is somewhere in between.
Let's have a look at results of the second vertex texture fetch test. The Waves test executes fewer texture lookups, but it uses conditional branches. The number of bilinear texture lookups in this case reaches 14 (Effect detail Low) or 24 (Effect detail High) per each vertex. Geometry complexity changes just like in the previous test.
The situation in the Waves test is slightly different from previous results. Both HD 3800 cards look good. The single-GPU product performs on a par with the GeForce 8800 GTS, being faster in the Low mode and slower in the High mode. The HD 3870 X2 is twice as fast, so it becomes a leader owing to CrossFire. Let's analyze the second mode:
There are almost no changes. But as the test grows more complex, results of a single-GPU HD 3870 get evidently better than those of the GeForce 8800 GTS. It wins in all modes, not just in the Low mode. The other conclusions still hold true—performance in the Low mode is a tad limited by memory bandwidth, TMUs and ROPs play a more important role in the High mode. AMD drivers seem to be optimized, because NVIDIA solutions used to cope with vertex texture fetch tests better than AMD cards. Now the situation is much better. Especially as the X2 card is again more than twice as fast as the regular HD 3870.
There is nothing new or interesting for our conclusions. The RADEON HD 3870 X2 is based on two GPUs that have already been reviewed. They have changed little compared to the R600, and all architectural strengths and weaknesses remain the same. The dual-GPU solution is notable for high computing performance, especially in modern and future applications with many complex shaders of all types. The weakest link here is the relatively small number of texturing units, which prevents R6xx-based cards from demonstrating higher performance in tests that depend heavily on texturing speed.
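The texturing limitation can be put into numbers with the usual back-of-the-envelope estimate: peak texel rate is the number of texture units times the core clock. The sketch below assumes commonly published specifications (16 TMUs at 775 MHz for the HD 3870, two GPUs with 16 TMUs each at 825 MHz for the HD 3870 X2, and 64 TMUs at 650 MHz for the GeForce 8800 GTS 512MB). These figures do not come from this article, so treat them as assumptions, and note that the estimate ignores memory bandwidth and filtering modes.

```python
def peak_texel_rate_gtexels(tmus_per_gpu, core_mhz, gpus=1):
    """Theoretical peak texel rate in GTexels/s: TMUs x clock x GPU count."""
    return tmus_per_gpu * core_mhz * gpus / 1000.0


# ASSUMED specifications (not taken from this review):
# (TMUs per GPU, core clock in MHz, number of GPUs)
ASSUMED_SPECS = {
    "RADEON HD 3870": (16, 775, 1),
    "RADEON HD 3870 X2": (16, 825, 2),
    "GeForce 8800 GTS 512MB": (64, 650, 1),
}

for name, (tmus, mhz, gpus) in ASSUMED_SPECS.items():
    rate = peak_texel_rate_gtexels(tmus, mhz, gpus)
    print(f"{name}: {rate:.1f} GTexels/s")
```

Under these assumptions even two RV670 GPUs together stay below a single G92 in theoretical texel rate, which is consistent with the texturing-heavy results above.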
As for the performance gain of the HD 3870 X2 relative to the single-GPU modification, all results can be explained by the AFR (Alternate Frame Rendering) CrossFire mode. As frames do not depend on each other in our tests, the frame rate grows approximately twofold, except for several tests where the card suffers from insufficient memory bandwidth. Keep in mind, however, that it's not easy to double performance in real games even in the AFR mode, and AFR has its own problem: higher latency than a genuine twofold performance gain would give. In other words, in many cases 60 FPS provided by CrossFire or SLI will be only as playable as 30 FPS on a single-GPU card.
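One way to read the latency remark: in AFR each GPU still needs its full render time for every frame it produces, so the time between starting a frame and displaying it does not shrink even though frames arrive twice as often. The sketch below is a rough model that assumes ideal AFR scaling and ignores driver queuing and CPU time; the function names are illustrative.

```python
def frame_interval_ms(fps):
    """Time between consecutive displayed frames."""
    return 1000.0 / fps


def per_frame_render_time_ms(fps, gpus=1):
    """Time one GPU spends on a single frame under ideal AFR.

    With `gpus` GPUs alternating frames, each GPU delivers only fps / gpus
    frames per second, so each frame takes `gpus` frame intervals to render.
    """
    return gpus * 1000.0 / fps


afr = per_frame_render_time_ms(60.0, gpus=2)     # 60 FPS from two GPUs in AFR
single = per_frame_render_time_ms(30.0, gpus=1)  # 30 FPS from one GPU
print(f"frame interval at 60 FPS: {frame_interval_ms(60.0):.1f} ms")
print(f"render time per frame, AFR 60 FPS: {afr:.1f} ms")
print(f"render time per frame, single GPU 30 FPS: {single:.1f} ms")
```

In this model the per-frame render time of a 60 FPS AFR setup equals that of a 30 FPS single-GPU card, which is why the higher frame rate does not automatically feel more responsive.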
On the whole, having analyzed the synthetic test results, we should admit that the RADEON HD 3870 X2 is quite a fast solution. Under certain conditions it can even compete with more expensive graphics cards. It outperforms the single-GPU GeForce 8800 GTS 512MB (with a similar price tag) in many cases. But don't forget that when you choose between a single-GPU and a dual-GPU graphics card, you must take into account the above-mentioned problems of CrossFire/SLI systems. Besides, you should judge by the results demonstrated in real games, not in synthetic or semi-synthetic tests like 3DMark.
That's why it's vital to pay attention to the next part of the article devoted to tests of the new dual-GPU card from AMD in modern games. These results must be much more interesting than synthetic results from this part, because it's not that easy to double performance in games.