Part 1: Theory and architecture
In the 2007 summary we mentioned that AMD was going to build multi-GPU CrossFire solutions for the high-end range. That is exactly what happened: the company has announced the RADEON HD 3870 X2, its first graphics card based on two RV670 GPUs. It seems the era of dual-GPU solutions is about to begin, since NVIDIA is also preparing a similar product based on two G92 processors.
This enthusiasm for multi-GPU products based on SLI and CrossFire is a bit disappointing. It is certainly convenient for a manufacturer to cover several price ranges with products that differ only in the number of identical chips. Still, a single-GPU product will always have some advantages, such as being faster in all applications, not just those optimized for multi-GPU configurations. Besides, it does not duplicate units in every chip, so it consumes less power and emits less heat. Most importantly, it is free of the rendering latencies typical of the main SLI/CrossFire modes. We hope manufacturers won't forget about single-GPU top-end solutions, as multi-chip products are only a temporary way out.
In fact, this theory section is very short, since the HD 3870 X2 is just a pair of RV670 GPUs sharing the same PCB along with memory and circuitry. This dual-GPU solution operates in hardware CrossFire mode. The PCI Express lanes and the corresponding bridge are onboard, so the card doesn't use the corresponding motherboard resources. Compared with a pair of HD 3870 cards, it differs only in GPU and memory clock rates.
The AMD R6xx unified architecture itself was announced back in May 2007. Mid-end and low-end graphics cards based on R6xx were rolled out in the summer, while the latest mid-end solutions based on RV670 were released by AMD late in the fall. The key difference between RV670 and the previous top-end R600 was the 55nm process technology, which reduced the chip cost, an important factor for inexpensive products. Besides, the new AMD GPUs stood out for supporting the updated DirectX 10.1 and PCI Express 2.0, and for an improved hardware video processing unit.
Before you start reading this article, we traditionally recommend that you familiarize yourself with the basics describing various aspects of modern graphics cards and the peculiarities of NVIDIA and AMD (formerly ATI) architectures.
These materials were rather accurate in predicting the current state of GPU architecture. Many assumptions proved to be correct. For more information about AMD R6xx unified architecture please see these articles:
RADEON HD 3870 X2 graphics accelerator
- Codenamed R680 (2 x RV670)
- 55nm process technology
- 2 x 666 million transistors
- Unified architecture with an array of shared processors for vertex, pixel and other data
- Hardware DirectX 10.1, including the new Shader Model 4.1, geometry generation and intermediate shader data output (stream output)
- Dual 256-bit memory bus: 2 x four 64-bit controllers connected by a ring bus
- 825 MHz core clock
- 640 (2 x 320) scalar floating-point ALUs (integer and floating-point, IEEE 754 FP32 precision)
- 2 x 4 extended texture units supporting FP16 and FP32 textures
- 2 x 32 texture addressing units (see details in the basic article)
- 2 x 80 texture fetch units (see details in the basic article)
- 2 x 16 bilinear filtering units that filter FP16 textures at full speed and support trilinear and anisotropic filtering for all texture formats
- Dynamic branching in pixel and vertex shaders
- 2 x 16 ROPs supporting AA with programmable fetch of over 16 samples per pixel, FP16/FP32 frame buffers included; peak performance of up to 16 samples per clock (32 samples per clock in Z only mode)
- Up to 8 MRT (multiple render targets)
- Dual RAMDAC, 2 x Dual Link DVI, HDMI, HDTV
RADEON HD 3870 X2 graphics card specs
- 825 MHz core clock
- 640 (2 x 320) unified processors
- 2 x 16 texture units, 2 x 16 blending units
- 1800 MHz effective memory clock (900 MHz x 2, double data rate)
- 2 x 512MB GDDR3 memory
- 2 x 57.6 GB/s memory throughput
- Peak theoretical fillrate of 2 x 13.2 gigapixels per second
- Theoretical texture fetch of 2 x 13.2 gigatexels per second
- CrossFireX interface
- PCI Express 1.1 bus (see below)
- 2 x DVI-I Dual Link, up to 2560x1600
- TV-Out, HDTV-Out, HDCP support, HDMI adapter
- MSRP of $449
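The theoretical figures in the list above follow from the clock rates and unit counts. Below is a minimal sketch of that arithmetic, assuming the usual back-of-the-envelope conventions (one pixel or texel per unit per clock, 1 GB = 10^9 bytes). The function and constant names are ours and purely illustrative.

```python
# Per-GPU figures copied from the spec lists above. The formulas are the
# common rule-of-thumb conventions, not an official AMD method.
CORE_CLOCK_MHZ = 825
EFFECTIVE_MEMORY_CLOCK_MHZ = 1800  # 900 MHz x 2 (double data rate)
MEMORY_BUS_WIDTH_BITS = 256
ROPS_PER_GPU = 16
TEXTURE_UNITS_PER_GPU = 16


def memory_bandwidth_gb_s(bus_width_bits, effective_clock_mhz):
    """Peak memory bandwidth of one GPU in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_clock_mhz / 1000


def peak_rate_giga_per_s(units, core_clock_mhz):
    """Peak rate in giga-operations/s, assuming one operation per unit per clock."""
    return units * core_clock_mhz / 1000


bandwidth = memory_bandwidth_gb_s(MEMORY_BUS_WIDTH_BITS, EFFECTIVE_MEMORY_CLOCK_MHZ)
fillrate = peak_rate_giga_per_s(ROPS_PER_GPU, CORE_CLOCK_MHZ)
texel_rate = peak_rate_giga_per_s(TEXTURE_UNITS_PER_GPU, CORE_CLOCK_MHZ)

print(f"Memory bandwidth: 2 x {bandwidth:.1f} GB/s")
print(f"Pixel fillrate:   2 x {fillrate:.1f} gigapixels/s")
print(f"Texture fetch:    2 x {texel_rate:.1f} gigatexels/s")
```

The results match the 2 x 57.6 GB/s and 2 x 13.2 giga-operations figures quoted in the specs. The "2 x" is kept separate on purpose: each GPU has its own memory and units, so the per-GPU numbers do not simply add up in every rendering mode.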
It seems that it was precisely the 55nm process technology that made dual-GPU products with such power consumption possible. RV670 is significantly smaller than R600 and consumes half as much power at similar performance. By the way, as you remember, AMD changed its card naming scheme starting with the RADEON HD 3870/3850. The "X2" suffix in RADEON HD 3870 X2 naturally stands for two GPUs with their memory.
There's nothing new to add to the previous review of the RV670 chip, since it has remained the same. We'll just briefly repeat that RV670 doesn't differ much from R600, having the same number of units (ALU, ROP, TMU). The only important difference is a 256-bit memory bus instead of a 512-bit one.
Of course, the RADEON HD 3870 X2, like its single-GPU sibling, fully supports the as yet unreleased DirectX 10.1 API, the new and improved features of which we described in the previous article. AMD is also very proud that its dual-GPU solution has 640 streaming processors and is the first single card to reach a peak performance of 1 teraflop in some operations. It's not clear how this teraflop can be useful to a casual gamer who uses a graphics card just as intended, especially considering that CrossFire has obvious disadvantages compared with top-end single-GPU cards...
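The teraflop figure can be reproduced with simple arithmetic. The sketch below assumes that each streaming processor completes one multiply-add per clock and that a multiply-add is counted as two floating-point operations. This is the common convention for peak figures, but it is our assumption rather than something stated in AMD's announcement, and the names in the code are illustrative.

```python
STREAM_PROCESSORS = 640      # 2 x 320, from the spec list
CORE_CLOCK_MHZ = 825         # from the spec list
FLOPS_PER_ALU_PER_CLOCK = 2  # assumption: one multiply-add counted as two operations


def peak_gflops(alus, clock_mhz, flops_per_clock):
    """Theoretical peak throughput in GFLOPS for the given ALU count and clock."""
    return alus * clock_mhz * flops_per_clock / 1000


total = peak_gflops(STREAM_PROCESSORS, CORE_CLOCK_MHZ, FLOPS_PER_ALU_PER_CLOCK)
per_gpu = peak_gflops(STREAM_PROCESSORS // 2, CORE_CLOCK_MHZ, FLOPS_PER_ALU_PER_CLOCK)

print(f"Peak per GPU:  {per_gpu:.0f} GFLOPS")
print(f"Peak per card: {total:.0f} GFLOPS ({total / 1000:.3f} TFLOPS)")
```

Under these assumptions the card lands just above 1 TFLOPS. This is a peak value that requires every ALU on both GPUs to issue a multiply-add on every clock, which real game shaders rarely do.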
Another interesting matter is this card's PCI Express 2.0 support. Though it's one of the most important innovations of the RV670 GPU, the product we review today doesn't offer any throughput improvement. To interconnect the two GPUs, this modification of the RADEON HD 3870 X2 uses a special PLX PEX 8548 PCI Express bridge that provides 48 lanes, but only of the PCI-E 1.1 type. This chip, measuring 37.5 x 37.5 mm, adds another 5W to the power consumption of the two GPUs and memory. By the way, AMD plans to integrate such PCI Express bridges into future GPUs to simplify circuitry and reduce cost.
So, while the GPUs themselves support PCI Express 2.0, the card doesn't provide the corresponding throughput improvement, since it actually uses a PCI Express 1.1 interface. In real life this throughput difference is barely noticeable. On the other hand, it could still help a CrossFire configuration, whose GPUs exchange large volumes of data over the bus. We'll try to examine the difference between 2.0 and 1.1 in more detail, but it will hardly exceed a few percent.
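To put the 1.1 versus 2.0 difference into numbers, here is a minimal sketch based on the published per-lane signalling rates (2.5 GT/s for PCI Express 1.x, 5 GT/s for 2.0) and 8b/10b line encoding. It ignores packet and protocol overhead, and it assumes an x16 link. The function and constant names are illustrative.

```python
# Raw signalling rate per lane, in megatransfers per second.
TRANSFER_RATE_MT_S = {"1.1": 2500, "2.0": 5000}


def link_bandwidth_gb_s(generation, lanes=16):
    """Peak one-direction bandwidth of a PCI Express link in GB/s.

    Accounts only for 8b/10b encoding (8 payload bits per 10 line bits);
    packet headers and flow control are ignored.
    """
    raw_mbit_s = TRANSFER_RATE_MT_S[generation]
    payload_mbit_s = raw_mbit_s * 8 / 10
    mb_s_per_lane = payload_mbit_s / 8
    return mb_s_per_lane * lanes / 1000


for generation in ("1.1", "2.0"):
    print(f"PCI Express {generation} x16: "
          f"{link_bandwidth_gb_s(generation):.1f} GB/s per direction")
```

The theoretical link bandwidth doubles from 4 GB/s to 8 GB/s per direction. That doubling only turns into higher frame rates when the bus is the actual bottleneck, which is why the practical difference is expected to stay within a few percent.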
Note that, although it already features two GPUs, the HD 3870 X2 can work with another such card on a single motherboard thanks to the ATI CrossFireX innovations. RV670-based solutions were announced as the first products capable of working in configurations with four single-GPU or two dual-GPU graphics cards.
Manufacturers see these multi-GPU solutions as one of the easiest ways to increase performance (noticeable primarily in benchmarks). Though both major market players claim that the speedup reaches 80-90%, it actually shows up only at high resolutions and mostly in benchmarks, not in all games.
It's good that the HD 3870 X2 still supports ATI PowerPlay. As you remember, this is a dynamic power management technology based on a special control circuit that monitors GPU load and adjusts core and memory clock rates, as well as voltage and other parameters, to optimize power consumption and heat emission. Without it, a dual-GPU card would consume much more power in 2D mode.
So, we have examined the peculiarities of the new dual-GPU RADEON HD 3870 X2 card and are ready to proceed to the practical part, which will compare its performance with that of single-GPU products from AMD and NVIDIA. As usual, let's first have a look at synthetic benchmarks that reveal the pros and cons of such solutions.