New Dual-GPU Challenger of the 3D Throne
On the example of
Gigabyte GeForce 9800 GX2 2x512MB PCI-E,
XFX GeForce 9800 GX2 2x512MB PCI-E
Part 1: Theory and architecture
The time has come for the long-awaited announcement of the dual-GPU NVIDIA GeForce 9800 GX2, which unites the power of two G92 GPUs in a single card. Unlike AMD, which relies on multi-GPU CrossFire solutions for the top price range, NVIDIA does not seem to be giving up on a single-GPU future for its graphics cards: the company still plans to manufacture fast single-GPU solutions. But the card we are reviewing today should outperform the AMD RADEON HD 3870 X2, which has been the fastest card since this January.
NVIDIA has extensive experience in designing and manufacturing multi-GPU cards.
The company introduced SLI long before AMD came up with
CrossFire. Besides, NVIDIA has already offered a modern dual-GPU solution, the
GeForce 7950 GX2. That
card was based on two G71 GPUs operating at reduced frequencies in SLI
mode. And now the company has announced another card based on two
G92 GPUs. We'll review it today. The time of dual-GPU solutions has
come again (after Quantum3D Obsidian, ATI Rage Fury MAXX, GeForce
I repeat that this passion for multi-GPU products is bad news. Even though it's convenient to design products for various price ranges using different numbers of identical GPUs, single-GPU solutions will always have advantages: they will be faster in all applications, not only in those optimized for multi-GPU configurations. Besides, they do not contain redundant units in each GPU, and they offer better power consumption and heat dissipation characteristics. Single-GPU solutions also avoid the frame latency problems typical of Alternate Frame Rendering (AFR).
Manufacturers regard multi-GPU configurations as one of the simplest ways to increase performance, first of all in benchmarks. The efficiency of SLI/CF technologies in benchmarks reaches 80-90%, but this performance gain is demonstrated only in high resolutions and only in some benchmarks. In our opinion, such products should be launched only as a temporary reinforcement of market positions until the rollout of the next GPU generation. They should not replace single-GPU High-End graphics cards.
The theoretical part about GeForce 9800 GX2 will be very short. This card is simply based on two G92 GPUs installed on two connected PCBs with memory and other components. The dual-GPU system works in SLI mode implemented at the hardware level. PCI Express lanes and the corresponding bridge are embedded into the PCB, so the card does not take these resources from the motherboard. The only difference between this solution and a system with two GeForce 8800 GTS 512MB cards is in GPU and memory frequencies.
Before you read this article, you may want to study the baseline theoretical articles - DX.Update, DX Next, and Longhorn. They describe various aspects of modern graphics cards and architectural peculiarities of products from NVIDIA and AMD.
These articles predicted the current situation with GPU architectures, and many assumptions about future solutions were confirmed. Detailed information about the unified architecture of NVIDIA G8x/G9x solutions can be found in the following articles:
As we mentioned in previous articles, G9x GPUs are based on the GeForce 8 (G8x) architecture and enjoy all its advantages: unified shader architecture, full support for the DirectX 10 API, high-quality anisotropic filtering, and CSAA with up to sixteen samples. The new GPUs feature improved units (TMU, ROP, PureVideo HD) and the 65-nm fabrication process, which made it possible to reduce costs. Let's analyze the characteristics of the new dual-G92 card:
GeForce 9800 GX2
- Code name: 2 x G92-450
- Fabrication process: 65 nm
- 2 x 754 million transistors
- Unified architecture with an array of common processors for streaming processing of vertices and pixels, as well as other data
- Hardware support for DirectX 10, including Shader Model 4.0, geometry generation, and stream output
- Double 256-bit memory bus, four independent 64-bit controllers
- Core clock: 600 MHz
- ALUs operate at more than double the core frequency (1.5 GHz)
- 2 x 128 scalar floating-point ALUs (integer and floating-point formats, support for FP32 according to IEEE 754, MAD+MUL without penalties)
- 2 x 64 texture address units, support for FP16 and FP32 components in textures
- 2 x 64 bilinear filtering units (like in the G84 and the G86, no free trilinear filtering and more effective anisotropic filtering)
- Dynamic branching in pixel and vertex shaders
- 2 x 4 wide ROPs (16 pixels) supporting antialiasing with up to 16 samples per pixel, including FP16 or FP32 frame buffer. Each unit consists of an array of flexibly configurable ALUs and is responsible for Z generation and comparison, MSAA, blending. Peak performance of the entire subsystem is up to 128 MSAA samples (+ 128 Z) per cycle, in Z only mode - 256 samples per cycle
- Multiple render targets (up to 8 buffers)
- All interfaces (2 x RAMDAC, 2 x Dual DVI, HDMI, HDTV) are integrated into the chip
Reference GeForce 9800 GX2 Specifications
- Core clock: 600 MHz
- Frequency of unified processors: 1500 MHz
- Unified processors: 2 x 128
- 2 x 64 texture units, 2 x 16 blending units
- Effective memory frequency: 2.0 GHz (2*1000 MHz)
- Memory type: GDDR3
- Memory: 2 x 512 MB
- Memory bandwidth: 2 x 64.0 GB/sec
- Maximum theoretical fillrate: 2 x 9.6 gigapixels per second
- Theoretical texture sampling rate: up to 2 x 38.4 gigatexels per second
- 2 x DVI-I Dual Link, 2560x1600 video output
- HDMI with HDCP support
- SLI connector
- PCI Express 2.0
- TV-Out, HDTV-Out
- Power consumption: up to 197 W
- Dual-slot design
- Recommended price: $599-$649
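The bandwidth, fillrate, and texture rate figures in this list follow directly from the clocks and unit counts given above. Here is a minimal Python sketch of that arithmetic for one GPU; the function and parameter names are ours, chosen only for illustration.

```python
def derived_rates(core_mhz, rops, tmus, bus_bits, mem_effective_mhz):
    """Return per-GPU (bandwidth GB/s, fillrate Gpixel/s, texel rate Gtexel/s)."""
    # Bus width in bytes times effective memory clock gives MB/s; divide by 1000 for GB/s.
    bandwidth_gb_s = bus_bits / 8 * mem_effective_mhz / 1000
    # One pixel per ROP per core clock.
    fill_gpix_s = rops * core_mhz / 1000
    # One bilinear texel per texture unit per core clock.
    texel_gtex_s = tmus * core_mhz / 1000
    return bandwidth_gb_s, fill_gpix_s, texel_gtex_s


# Values from the specification list: 600 MHz core, 16 ROPs (4 wide ROP units x 4),
# 64 texture units, 256-bit bus, 2000 MHz effective memory clock.
bw, fill, tex = derived_rates(600, 16, 64, 256, 2000)
print(bw, fill, tex)  # 64.0 9.6 38.4
```

The whole card doubles each of these numbers, which is where the "2 x" prefixes in the list come from.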
Nothing new or interesting. As you can see from the specifications, the dual-G92 card differs from the single-GPU GeForce 8800 GTS 512MB in GPU frequencies on the whole and in shader frequencies in particular. Memory bandwidth of this solution is similar to that of an SLI system with two GeForce 8800 GTS 512MB cards. Memory size per GPU hasn't changed either. This is more interesting, as the cards are designed for different price ranges.
According to our recent tests, 512 MB of video memory per GPU is sufficient for most modern games. But judging by the latest games, such as Crysis, this memory size may still be insufficient for top graphics cards operating in high resolutions. In some cases GeForce 9800 GX2 may even be outperformed by GeForce 8800 Ultra with 768 MB of video memory. On the other hand, a G92-based graphics card cannot accommodate exactly that memory size: it may come only with 512 MB or 1024 MB. And 2 x 1024 MB of fast GDDR3 memory is too expensive for NVIDIA. Perhaps that is why the company decided to equip GeForce 9800 GX2 with 512 MB per GPU, a golden mean for modern games.
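Why a 256-bit bus leads to 512 MB or 1024 MB, while a 384-bit bus gives 768 MB, can be illustrated with a small Python sketch. It assumes 32-bit-wide GDDR3 chips with 64 MB or 128 MB capacity each, one chip per 32 bits of bus width; these chip parameters are our assumptions for illustration, not figures from NVIDIA.

```python
def memory_sizes_mb(bus_bits, chip_width_bits=32, chip_capacities_mb=(64, 128)):
    """List the total memory sizes available for a given bus width.

    Assumes one memory chip per chip_width_bits of bus width (hypothetical
    defaults: 32-bit chips holding 64 MB or 128 MB each).
    """
    chips = bus_bits // chip_width_bits
    return [chips * capacity for capacity in chip_capacities_mb]


print(memory_sizes_mb(256))  # [512, 1024]  (G92, 256-bit bus)
print(memory_sizes_mb(384))  # [768, 1536]  (G80, 384-bit bus)
```

Under these assumptions, 768 MB simply does not appear among the options for a 256-bit bus.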
NVIDIA notes that GeForce 9800 GX2 uses a patented dual-PCB design offering optimal acoustic and other properties. This special design implies installing each of two GPUs on its own PCB, which gives the following advantages:
- Each GPU dissipates heat and warms only its own PCB, unlike the single-PCB design, where both GPUs are installed on the same PCB and heat it together, which may require reducing GPU frequencies.
- Two PCBs allow better layouts, which is important for video memory operating at high frequencies.
- The patented cooler works for both GPUs, unlike GeForce 7950 GX2, which uses two cooling devices.
It's said that GeForce 9800 GX2 will work on any motherboard supporting PCI Express. The motherboard does not have to support SLI. This should be verified in practice, because incompatibilities are always possible.
It must also be noted that GeForce 9800 GX2 requires two PCI-E power cables: one with a 6-pin connector and one with an 8-pin connector. The graphics card won't work if you plug in only one cable or use two 6-pin connectors. PSU requirements are correspondingly high: a single card requires at least a 580-W power supply unit, and two GeForce 9800 GX2 cards need a modern 850-W PSU.
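For reference, here is a small Python sketch of the nominal power budget for different cable combinations. The limits used (75 W from the slot, 75 W per 6-pin connector, 150 W per 8-pin connector) are the nominal PCI Express figures as we understand them; the function name is ours.

```python
SLOT_W = 75        # nominal power available from a PCI-E x16 slot
SIX_PIN_W = 75     # nominal limit of a 6-pin PCI-E power connector
EIGHT_PIN_W = 150  # nominal limit of an 8-pin PCI-E power connector


def power_budget_w(connectors):
    """Sum the nominal power available from the slot plus the listed connectors."""
    limits = {"6-pin": SIX_PIN_W, "8-pin": EIGHT_PIN_W}
    return SLOT_W + sum(limits[c] for c in connectors)


print(power_budget_w(["6-pin", "8-pin"]))  # 300
print(power_budget_w(["6-pin", "6-pin"]))  # 225
print(power_budget_w(["6-pin"]))           # 150
```

The required 6-pin plus 8-pin pair gives a nominal 300 W against the card's peak consumption of up to 197 W. The sketch only does the arithmetic; it does not model why the card refuses to start with other cabling.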
We have nothing new to add. G92 GPUs remain the same, and they were described in detail in the corresponding reviews. The NVIDIA G9x architecture was announced last autumn. Considering that it's just a slightly modified G8x architecture, it effectively dates back to 2006. The main difference between G92 and the old G80 is the 65-nm fabrication process used in the former. It reduces the costs of complex GPUs and decreases their power consumption and heat release. They have the same number of ALUs and TMUs. Another significant difference of the new GPU is its 256-bit bus versus the 384-bit one.
As we wrote in articles about GeForce 8800 GT and 8800 GTS 512MB, G92 is a modified G80 GPU, moved to a new fabrication process with some architectural changes: fewer ROPs and improved TMUs, as well as a new compression algorithm in ROPs (15% more efficient). These changes have been described in detail in our previous articles.
GeForce 9800 GX2 is a dual-GPU SLI system. But Quad SLI technology allows joining two such cards on a motherboard supporting SLI, that is, a motherboard based on NVIDIA nForce 680i, 780i, or 780a.
The first implementations of Quad SLI used a hybrid SLI mode: Split Frame Rendering (SFR) and Alternate Frame Rendering (AFR) simultaneously. In most modern games, which use complex shaders, multi-pass rendering, and complex post processing, the SFR mode is noticeably less efficient than AFR. So it was decided to implement pure AFR, in which four frames are processed simultaneously. In this case the frame rate scales almost linearly with geometry, texture, and shader performance. Besides, there are fewer problems with compatibility.
Quad SLI systems get a great advantage in games where performance is limited by the fill rate, such as FEAR. For example, a system with two GeForce 9800 GX2 cards is almost twice as fast in this game as a single card. The situation is a tad worse in other applications, where the average performance gain from the second GX2 reaches about 40-50%. However, a ~70% performance gain in Crysis is not bad at all.
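The difference between the two modes can be illustrated with a simplified Python sketch: AFR hands whole frames to GPUs in turn, while SFR cuts every frame into bands. This is only a toy model with names of our own choosing. We assume a plain round-robin order and equal-height bands, whereas a real driver balances the load dynamically.

```python
def afr_assignment(num_frames, num_gpus):
    """AFR: frame i is rendered entirely by GPU i modulo num_gpus (round-robin)."""
    return [frame % num_gpus for frame in range(num_frames)]


def sfr_bands(height, num_gpus):
    """SFR: split one frame into contiguous row ranges [start, end), one per GPU."""
    base, extra = divmod(height, num_gpus)
    bands, start = [], 0
    for gpu in range(num_gpus):
        rows = base + (1 if gpu < extra else 0)  # spread leftover rows over the first GPUs
        bands.append((start, start + rows))
        start += rows
    return bands


print(afr_assignment(8, 4))  # [0, 1, 2, 3, 0, 1, 2, 3]
print(sfr_bands(1600, 4))    # [(0, 400), (400, 800), (800, 1200), (1200, 1600)]
```

In the AFR case each GPU works on its own frame independently, which is why four frames can be in flight at once in a quad-GPU system.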
It's high time to speak about drawbacks. We've already mentioned latencies introduced by AFR (to be more exact, they are not reduced even though FPS grows). They are practically unnoticeable in a dual-GPU system. However, you may already notice them in a quad-GPU system, because FPS grows while the latencies stay the same. Games seem to run smoother than on a single-GPU system, but they are almost as uncomfortable to play if a single GPU fails to provide an average frame rate of at least 30 fps in the same conditions. Another minor problem of Quad SLI is that it's supported only in Windows Vista.
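The latency effect can be shown with an idealized Python model. It assumes perfect AFR scaling and a constant render time per frame, which real games do not deliver; the function name and the 20 fps figure are ours.

```python
def afr_model(single_gpu_fps, num_gpus):
    """Idealized AFR: throughput scales with GPU count, per-frame render time does not.

    Returns (resulting fps, time in ms that each individual frame takes to render).
    """
    frame_time_ms = 1000.0 / single_gpu_fps  # each GPU still needs this long per frame
    return single_gpu_fps * num_gpus, frame_time_ms


# Hypothetical case: one GPU manages only 20 fps in a heavy scene.
print(afr_model(20, 1))  # (20, 50.0)
print(afr_model(20, 4))  # (80, 50.0)
```

In this model the quad-GPU system reports 80 fps, yet every frame still takes 50 ms to render, the same as on a single GPU running at 20 fps. That is why a high FPS counter does not guarantee comfortable response to input.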
GeForce 9800 GX2 partially supports the Hybrid SLI technology, which includes two main features: HybridPower and GeForce Boost. The dual-GPU card we're reviewing today supports HybridPower. This technology automatically switches the GPU between an external graphics card (GeForce 9800 GX2, in our case) and the GeForce core integrated into the chipset (the motherboard must support HybridPower, of course) depending on the load.
The diagram shows the two modes in which an SLI system can work with HybridPower. The first mode (top) is used for 3D applications that actively use all features of GeForce 9800 GX2. The second mode is intended for everyday usage, for example hardware-assisted video playback. In this case, the system uses resources of the integrated graphics core, while the installed graphics cards may be disabled so that they consume no power at all!
When HybridPower is enabled, video output from external graphics cards is routed to the integrated graphics core and fed to the video-out on the rear panel of the motherboard. So you can use integrated and discrete graphics with the same physical connector. In everyday applications HybridPower disables discrete graphics, conserving power and reducing noise generated by GPU coolers. But when you need the entire 3D capacity of the installed graphics cards, they are powered up and start rendering frames.
In terms of video playback, nothing has changed since graphics cards based on a single G92 GPU. However, all software improvements in PureVideo HD, which appeared in new driver versions starting from ForceWare 174 (added for GeForce 9600 GT), also work well in GeForce 9800 GX2. The latest innovations in PureVideo HD include dual-stream decoding and dynamic adjustment of contrast and saturation.
Another interesting new feature in the latest version of PureVideo HD is support for Aero interface in Windows Vista during hardware accelerated video playback in a window. This was not possible in previous versions. We covered it all in the 9600 GT review.
Support for external interfaces
Support for external interfaces in GeForce 9800 GX2 differs little from that of previous solutions based on the same GPUs. The separate NVIO chip that handled external interfaces on GeForce 8800 cards (except for those based on G92) has been integrated into the G92 itself.
Reference GeForce 9800 GX2 cards come with two Dual Link DVIs (one on each PCB) supporting HDCP, as well as one HDMI with HDCP support. In fact, HDMI and DisplayPort connectors are not mandatory, as they can be replaced with adapters from DVI to HDMI or DisplayPort.
To output audio via HDMI, the card under review is equipped with an SPDIF-In for connecting an audio source. For this purpose, SPDIF-Out on the motherboard is connected with a bundled cable to SPDIF-In in the upper part of the graphics card. Interestingly, not all test samples have an optical audio input. It seems that the decision to include it is left to NVIDIA partners.
So, we've briefly reviewed the theoretical peculiarities of the new dual-GPU GeForce 9800 GX2. Practical parts of the review with synthetic and game tests will follow. They will show how much faster the dual-GPU card from NVIDIA is than single-GPU modifications, as well as the dual-GPU solution from AMD. We'll find out its strengths and weaknesses.