iXBT Labs - Computer Hardware in Detail

Testing NVIDIA SLI (GeForce 6800/6600 PCI-E).
Part 1 – i7525 Platform

December 23, 2004








Contents

  1. Introduction, video cards' features
  2. Testbed configurations, benchmarks
  3. Test results: Serious Sam : The Second Encounter
  4. Test results: Code Creatures
  5. Test results: Unreal Tournament 2004
  6. Test results: Unreal II: The Awakening
  7. Test results: RightMark 3D
  8. Test results: TR:AoD
  9. Test results: FarCry
  10. Test results: Call Of Duty
  11. Test results: HALO: Combat Evolved
  12. Test results: Half-Life2(beta)
  13. Test results: Splinter Cell
  14. Test results: Quake3 Arena
  15. Test results: Return to Castle Wolfenstein
  16. Test results: DOOM III
  17. Test results: 3DMark05 Game1 (with FPS in time graph)
  18. Test results: 3DMark05 Game2 (with FPS in time graph)
  19. Test results: 3DMark05 Game3 (with FPS in time graph)
  20. Test results: 3DMark05 MARKS
  21. Conclusions



Today's review is a particularly interesting one. Quite a while ago, when platforms with PCI-Express support were only starting to appear, NVIDIA announced that its High-End video cards, and later its Middle-End cards as well (GeForce 6600GT), could work in pairs. Since all patents and trademarks of the former 3dfx belong to NVIDIA, there was no need to invent a name for the new technology, and the company chose a well-known abbreviation: SLI. But this is not the Scanline Interleaving that we saw in a combo of two Voodoo2 cards. Here SLI stands for Scalable Link Interface.




SLI for Voodoo2 distributed the work between the two cards PER LINE: odd lines were processed by the first video card and even lines by the second. This was hardcoded in the driver. The processed lines were transferred via the VGA-LOOP cable and assembled in the buffer of the primary video card, and the cards were synchronized by a special SLI adapter that looked like a piece of a floppy cable.
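The per-line scheme is easy to picture in a few lines of Python. This is only an illustrative sketch: the function names are ours, lines are assumed to be numbered from 1, and nothing here reflects the real 3dfx driver code.

```python
def scanline_owner(line_number):
    """Return 0 for the first card, 1 for the second (lines numbered from 1)."""
    # Odd lines go to the first card, even lines to the second.
    return 0 if line_number % 2 == 1 else 1


def split_scanlines(height):
    """List the scanlines each card renders for a frame of the given height."""
    first = [y for y in range(1, height + 1) if scanline_owner(y) == 0]
    second = [y for y in range(1, height + 1) if scanline_owner(y) == 1]
    return first, second


first_card, second_card = split_scanlines(8)
print(first_card)   # [1, 3, 5, 7]
print(second_card)  # [2, 4, 6, 8]
```

Each card always gets exactly half of the lines (give or take one), so the split is fixed and cannot adapt to where the heavy parts of the scene are.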

NVIDIA SLI operates differently. As in the latest multi-chip solutions from 3dfx, the pair is formed by a Primary (Master) card and a Secondary (Slave) card. A special buffer is allocated in the master card to assemble the image. We have already written much about the operating methods of multi-chip solutions, so we shall not repeat ourselves.

Note that the corresponding chips (NV45, NV43) contain a special unit that is responsible for SLI and works with the shared resources of the two video cards. The data is exchanged via the bus and synchronized via a special SLI connector plugged into both cards (see the photos below).

Theoretically, SLI can operate in two modes:

  • Divide the screen between the GPUs, either in half (the limiting case) or in proportions that vary with the application and the driver. One thing is clear: each card works with its own lines of the frame (the mode used by 3dfx Voodoo5 and XGI Volari DUO).
  • Delegate an entire frame to each video card (and thus to each GPU) and interleave frames: even frames are processed by one card and odd frames by the other (the RAGE MAXX mode).

The second mode is not available so far (to be more exact, it cannot be enabled), so let's treat SLI as dividing a single frame between the video cards.
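To make the difference between the two modes concrete, here is a minimal Python sketch. The function names and the `first_share` parameter are our own assumptions for illustration; the article does not describe how the driver actually picks the split.

```python
def split_frame(height, first_share=0.5):
    """Split-frame mode: return the row ranges for GPU 0 and GPU 1.

    first_share is the fraction of rows given to GPU 0
    (0.5 is the limiting 'in half' case).
    """
    boundary = round(height * first_share)
    return range(0, boundary), range(boundary, height)


def alternate_frame_owner(frame_index):
    """Alternate-frame mode: even frames go to GPU 0, odd frames to GPU 1."""
    return frame_index % 2


top, bottom = split_frame(768)
print(len(top), len(bottom))                          # 384 384
print([alternate_frame_owner(i) for i in range(4)])   # [0, 1, 0, 1]
```

In the first mode both GPUs cooperate on every frame, so the result must be assembled on the master card; in the second mode each GPU produces complete frames on its own.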

At present SLI is officially supported by NVIDIA GeForce 6800 GT/Ultra PCI-E (NV45) and GeForce 6600GT (NV43). However, as noted above, the SLI unit sits inside the chip, so all video cards based on NV45/43 should theoretically support this mode. The lack of a MIO port for an SLI connector on the 6600/6200 PCB is thus a purely marketing move, and we can reasonably assume that cheap cards (up to $100) with SLI support may appear later.

We have a lot of questions about how SLI handles rendering resources in present-day 3D environments. In the old days of Voodoo everything was more or less clear: the operations were interleaved, that is, simply divided in half (a trivial thing), and that was all. Geometry and lighting calculations (in the case of vertex lighting) were left to the CPU.

Now everything is different. Take shader operations, which (roughly speaking) are used to process object surfaces. What about objects located on the borderline between the areas of responsibility of the two GPUs? Should both processors calculate the same shader and do redundant work? Or should one GPU calculate the entire object and pass half of the data to the second one? The latter is not physically possible. So there will probably be some performance losses from the overhead on objects that straddle the border between the areas of the first and the second GPU.

It gets worse: how will adaptive anisotropy work in cases when more data is required than is available in one GPU? This remains a mystery so far, and we are going to investigate the matter further. One thing we know for sure: geometry calculations and texture loading are duplicated by both cards.

Today we are interested in practice: what is in all of this for us?

In practice one will have problems with availability. Not of the cards or SLI connectors (though the latter are obviously in short supply so far), but of platforms with two PCI-Express slots in an x16/x4 (or x8/x8) layout. Motherboard manufacturers have not yet released such motherboards based on i9xx for desktop and gaming systems, so for now you can see SLI in action only in server solutions based on i75xx. That's why SLI will bring no joy to the Pentium4 camp until an x16+x4 or x8+x8 layout with two full-fledged video slots is designed for it. VIA or SiS will probably announce revisions of their solutions with SLI support for P4, but we know nothing about this so far. Having obtained the license for the P4 bus, NVIDIA is obviously preparing a corresponding nForce with SLI support for this platform as well, as nForce4 has already been officially released for AMD Athlon64/FX/Opteron.

So the choice is either nForce4 SLI with a latest-generation AMD processor, or an almost-server platform based on i75xx with one or two Intel Xeon CPUs. It's clear that the first solution will be much cheaper.

Will SLI be an additional AMD trump card for promoting its solutions among Hardcore Gamers? It's hard to say so far, because nForce4 is not yet properly available (only early samples exist). Besides, as I have already said, SLI connectors are not on sale yet and are not shipped with video cards. We have information that they will not be shipped with video cards later either: only vendors selling motherboards based on nForce4 SLI will include such adapters in the bundle.

Thus, we are approaching the chapter devoted to testing the video cards. As the title of the article makes clear, we shall test only the Intel platform today. Readers should understand that this platform is not designed for games: even a 3.4 GHz Xeon with a powerful motherboard performed worse than the Pentium4 3.2GHz (i875P) system we used previously, because registered DDR333 memory is much slower than the familiar DDR400. Nevertheless we got very interesting results.

I should note that the main SLI requirement for today is the complete IDENTITY of the video cards forming the SLI pair. I tested several GeForce 6600GT cards from various manufacturers (Leadtek, Palit, Gainward and Gigabyte), and only the cards from Leadtek and Palit managed to work together: they were correctly detected by the driver and organized into SLI. In the other cases the driver reported that one of the two video cards was not SLI compatible.

To keep the experiment clean, we took pairs of reference cards, which are certainly identical within each pair.

Video Cards

We used the following PAIRS of video cards:

NVIDIA GeForce 6800 GT PCI-E (NV45)

  • Interface: PCI-E x16
  • Memory: 256 MB GDDR3 SDRAM in 8 Samsung chips on the front side of the PCB
  • Memory access time: 2.0ns, which corresponds to 500 (1000) MHz, at which the memory operates
  • GPU frequency: 350 MHz
  • Memory bus: 256-bit
  • Pixel pipelines x texture units: 16x1
  • Vertex pipelines: 6

NVIDIA GeForce 6600GT

  • Interface: PCI-Express x16
  • Memory: 128 MB GDDR3 SDRAM in 4 Samsung chips on the front side of the PCB
  • Memory access time: 2.0ns, which corresponds to 500 (1000) MHz, at which the memory operates
  • GPU frequency: 500 MHz
  • Memory bus: 128-bit
  • Pixel pipelines x texture units: 8x1
  • Vertex pipelines: 3
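The memory figures quoted above can be checked with simple arithmetic. The sketch below converts the 2.0ns access time into a clock frequency and estimates the theoretical peak memory bandwidth of each card. The function names are ours, the bandwidth is a paper figure (1 GB taken as 10^9 bytes) and not a measurement, and the doubling from 500 to 1000 MHz assumes two transfers per clock.

```python
def clock_mhz_from_access_time(ns):
    """Clock frequency (MHz) that corresponds to a memory access time in ns."""
    return 1000.0 / ns


def peak_bandwidth_gb_s(effective_mhz, bus_width_bits):
    """Theoretical peak bandwidth in GB/s for an effective data rate and bus width."""
    return effective_mhz * 1e6 * bus_width_bits / 8 / 1e9


clock = clock_mhz_from_access_time(2.0)   # 500.0 MHz
effective = 2 * clock                     # 1000.0 MHz, two transfers per clock

print(peak_bandwidth_gb_s(effective, 256))  # 32.0  (GeForce 6800 GT, 256-bit bus)
print(peak_bandwidth_gb_s(effective, 128))  # 16.0  (GeForce 6600GT, 128-bit bus)
```

So at the same memory clock, the 6600GT has half the theoretical memory bandwidth of the 6800 GT purely because of the narrower bus.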


Installation and Drivers

Testbed configurations:




  • Xeon 3400 MHz based computer
    • Intel Xeon 3400 MHz processor (L2=1024K)
    • SuperMicro X6DA8-G motherboard based on i7525
    • 2 GB DDR SDRAM 333MHz Registered ECC



    • WD Caviar SE WD1600JD 160GB SATA HDD

  • Operating system – Windows XP SP2; DirectX 9.0c
  • Monitor: Mitsubishi Diamond Pro 2070sb (21").
  • NVIDIA drivers v66.81/66.93.



VSync is disabled.

A pair of GeForce 6800GT PCI-E video cards




A pair of GeForce 6600GT video cards




Adapter-synchronizer:




Note: do not be surprised to see different CPU fans in the photos. The standard fan was so noisy that it caused headaches (a heavy high-frequency whistle), so we had to remove it and temporarily install a regular fan from a power supply unit. It cooled the copper heatsink successfully and without any noise, and the heatsink stayed almost cool (why put this noisy 7000-rpm fan on such a heatsink?).

To conclude this chapter: when SLI is detected, a new tab appears in the driver settings with an option to enable SLI and an option to show the load of each GPU (you can see it clearly on the screenshot of the driver settings). The load indicator is a horizontal stripe in the lower third of the screen, overlaid on top of all applications, with two GPU load indicators: the left one is for the first GPU and the right one is for the second. So you can see how the load is distributed between them. Deviations from 50/50 were rare, however, and never exceeded 40/60 or 60/40.
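We do not know how the driver decides on the load distribution. Purely as an illustration of what such balancing could look like, here is a hypothetical Python sketch: it estimates how fast each GPU processed its share of the previous frame and picks the share that would make both finish at the same time. The function and its parameters are our assumptions, not NVIDIA's algorithm.

```python
def rebalance(first_share, first_ms, second_ms):
    """Suggest a new share of the frame for GPU 0.

    first_share: fraction of the frame GPU 0 rendered last time (0 < share < 1)
    first_ms, second_ms: time each GPU spent on its part (must be positive)
    """
    # Work done per millisecond by each GPU on the previous frame.
    speed_first = first_share / first_ms
    speed_second = (1.0 - first_share) / second_ms
    # Give each GPU a share proportional to its measured speed.
    return speed_first / (speed_first + speed_second)


# GPU 0 needed 12 ms for its half, GPU 1 only 8 ms for the other half,
# so GPU 0 should get a smaller part of the next frame.
print(round(rebalance(0.5, 12.0, 8.0), 2))  # 0.4
```

A 40/60 split like the one in this example is at the edge of what we observed on the load indicator.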

The GeForce 6800 Ultra results were obtained by overclocking each GeForce 6800GT video card separately to the operating frequencies of the GeForce 6800 Ultra (BEFORE SLI WAS ENABLED!).

Test results: performance comparison

We used the following test applications:

  • Serious Sam : The Second Encounter v.1.05 (Croteam/GodGames) – OpenGL, multitexturing, ixbt0703-demo, test settings: quality, S3TC OFF

  • Quake3 Arena v.1.17 (id Software/Activision) – OpenGL, multitexturing, ixbt0703-demo, all test settings to maximum: detailing level – High, texture details – #4, S3TC OFF, curved surfaces are strongly smoothed using variables r_subdivisions «1» and r_lodCurveError «30000» (note that by default r_lodCurveError «250» !), the configurations can be downloaded here

  • Code Creatures Benchmark Pro (CodeCult) – gaming test demonstrating how the video card works with DirectX 8.1, Shaders, HW T&L.

  • Unreal Tournament 2004 v.3323 (Digital Extremes/Epic Games) – Direct3D, Vertex Shaders, Hardware T&L, Dot3, cube texturing, High quality

  • Unreal II: The Awakening (Legend Ent./Epic Games) – Direct3D, Vertex Shaders, Hardware T&L, Dot3, cube texturing, default quality

  • RightMark 3D v.0.4 (one of game scenes) – DirectX 8.1, Dot3, cube texturing, shadow buffers, vertex and pixel shaders (1.1, 1.4).

  • Tomb Raider: Angel of Darkness v.49 (Core Design/Eidos Interactive) – DirectX 9.0, Paris5_4 demo. The tests were conducted with the quality set to maximum; only Depth of Field PS20 was disabled.

  • HALO: Combat Evolved (Microsoft) – Direct3D, Vertex/Pixel Shaders 1.1/2.0, Hardware T&L, maximum quality

  • Half-Life2 (Valve/Sierra) – DirectX 9.0, demo (ixbt07. The tests were carried out with enabled anisotropic filtering as well as in heavy mode with AA and anisotropy.

  • Tom Clancy's Splinter Cell v.1.2b (UbiSoft) – Direct3D, Vertex/Pixel Shaders 1.1/2.0, Hardware T&L, Very High settings; demo 1_1_2_Tbilisi

  • Call of Duty (MultiPlayer) (Infinity Ward/Activision) – OpenGL, multitexturing, ixbt0104demo, test settings – maximum, S3TC ON

  • FarCry 1.3 (Crytek/UbiSoft), DirectX 9.0, multitexturing, demo01 (research) (the game was started with the -DEVMODE option), test settings – Very High.

  • Return to Castle Wolfenstein (MultiPlayer) (id Software/Activision) – OpenGL, multitexturing, ixbt0703-demo, test settings – all to maximum, S3TC OFF

  • DOOM III (id Software/Activision) – OpenGL, multitexturing, test settings – High Quality (ANIS8x)

  • 3DMark05 (FutureMark) – DirectX 9.0, multitexturing, test settings – trilinear

If you want to get the demo benchmarks that we use, contact me by e-mail.



Serious Sam : The Second Encounter



Diagrams with results

  • 6800Ultra SLI versus 6800Ultra – with AA+AF we can see a gain of up to 36%, which is not surprising. As I have already said, this platform seriously limits the video cards, and even AA+AF in this game does not load such cards completely.
  • 6800GT SLI versus 6800GT – the same picture, but the gains are a tad higher.
  • 6600GT SLI versus 6600GT – the same picture.

+40-50% is quite good on the whole. In any case, previously known SLI or two-chip solutions did not provide even such gains.
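For reference, the gains quoted throughout this article are relative FPS increases of the SLI pair over a single card. A minimal Python sketch of the calculation follows; the FPS values in it are made-up examples, not readings from our diagrams.

```python
def sli_gain_percent(single_fps, sli_fps):
    """Relative performance gain of an SLI pair over a single card, in percent."""
    return (sli_fps / single_fps - 1.0) * 100.0


# Hypothetical readings: 50 FPS on one card, 68 FPS on the pair.
print(round(sli_gain_percent(50.0, 68.0), 1))   # 36.0
# A pair that nearly doubles the frame rate: 50 FPS -> 97.5 FPS.
print(round(sli_gain_percent(50.0, 97.5), 1))   # 95.0
```

A gain of 100% would mean that the second card doubles the frame rate, which is the theoretical ceiling for a pair.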

Code Creatures



Diagrams with results

  • 6800Ultra SLI versus 6800Ultra – the gain is up to 78% with AA+AF! Without AA+AF everything depends on the system again. Still, this test depends more on the GPU, and thus we see SLI operating almost at full power.
  • 6800GT SLI versus 6800GT – up to 95% gain!!! That is almost twofold! An excellent test for SLI! :) Evidently, almost no resources are wasted on borderline objects falling into the jurisdiction of both processors.
  • 6600GT SLI versus 6600GT – the card is weaker, so one would expect less influence from the system and a higher SLI effect, yet nothing of the kind happens. Let's blame it on the driver.



Unreal Tournament 2004



Diagrams with results

  • 6800Ultra SLI versus 6800Ultra – excellent results with AA+AF again! Up to 82% performance gain!
  • 6800GT SLI versus 6800GT – the same picture.
  • 6600GT SLI versus 6600GT – slightly lower results.



Unreal II: The Awakening



Diagrams with results

  • 6800Ultra SLI versus 6800Ultra – this test obviously depends on system resources, and so the performance gains are quite moderate.
  • 6800GT SLI versus 6800GT – the same picture.
  • 6600GT SLI versus 6600GT – now everything falls into place: the card is weaker, so the influence of system resources is lower and the SLI yield is higher. It provides up to 78% performance gain!



RightMark 3D



Diagrams with results

The results are similar to the previous ones.



TR:AoD, Paris5_4 DEMO



Diagrams with results

  • 6800Ultra SLI versus 6800Ultra – this test makes full use of the video cards and depends little on system resources, so the results are obvious: up to 96% performance gain! "Cooooool..."
  • 6800GT SLI versus 6800GT – the same picture.
  • 6600GT SLI versus 6600GT – almost the same picture.

I can't believe it... Wow! I repeat that geometry calculations and texture loading are carried out by both cards (duplication). But anyway, perfect sharing of shader calculations among the cards yields an impressive effect.

FarCry, demo01



Diagrams with results

The same picture! Only with the 6800 Ultra are the SLI gains a tad lower because of system resource limits. But anyway, a 90% performance gain in Far Cry is a worthy result! However, the i7525 platform is expensive. :)



Call of Duty, ixbt04



Diagrams with results

  • 6800Ultra SLI versus 6800Ultra – system resources limit the performance of the video cards even with AA+AF, so the gains are not that high.
  • 6800GT SLI versus 6800GT – the same picture.
  • 6600GT SLI versus 6600GT – everything is all right here: the gain is very high with AA+AF, when little depends on system resources with such video cards.



HALO: Combat Evolved



Diagrams with results

The same picture: the weaker the card, the higher the SLI effect.
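Why would a weaker card gain more from SLI? A toy model helps: assume each frame takes as long as the slower of two stages, the system (CPU and memory) and the GPU, and that a second GPU halves the GPU time perfectly. The Python sketch below uses made-up timings and ignores SLI overhead, so it only illustrates the trend, not our measured results.

```python
def fps(cpu_ms, gpu_ms, gpus=1):
    """Frames per second when a frame costs max(system time, GPU time / GPU count)."""
    frame_ms = max(cpu_ms, gpu_ms / gpus)
    return 1000.0 / frame_ms


def modeled_sli_gain_percent(cpu_ms, gpu_ms):
    """Gain of two GPUs over one in this idealized model, in percent."""
    return (fps(cpu_ms, gpu_ms, 2) / fps(cpu_ms, gpu_ms, 1) - 1.0) * 100.0


# The system needs 10 ms per frame in both cases (hypothetical numbers).
print(round(modeled_sli_gain_percent(10.0, 12.0), 1))  # 20.0   fast card: system-limited
print(round(modeled_sli_gain_percent(10.0, 25.0), 1))  # 100.0  slow card: GPU-limited
```

With the fast card the pair immediately runs into the 10 ms system limit, while the slow card stays GPU-limited even when doubled, so it shows the full effect.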



Half-Life2 (beta): ixbt07 demo



Diagrams with results

The same picture. I guess no comments are needed.



Splinter Cell



Diagrams with results

  • 6800Ultra SLI versus 6800Ultra – there are almost no gains; everything depends on the system.
  • 6800GT SLI versus 6800GT – the SLI effect is more prominent, but the CPU influence is still too large.
  • 6600GT SLI versus 6600GT – here the performance gain is at its maximum.



Quake III Arena



Diagrams with results

No comments needed.



Return to Castle Wolfenstein (Multiplayer)



Diagrams with results

The same picture.



DOOM III



Diagrams with results

Excellent results for all the cards! It goes without saying that we mean the AA+AF mode, though with the 6600GT the SLI effect is impressive even without it.



3DMark05: Game1



Diagrams with results




  • 6800Ultra SLI versus 6800Ultra – up to 62% performance gain from SLI!
  • 6800GT SLI versus 6800GT – even up to 68%!
  • 6600GT SLI versus 6600GT – up to 79% here. There is no need for the AA+AF mode, because the FPS is rather low and the video cards are heavily loaded even without it.

Pay attention to the FPS-over-time graph. You can clearly see the places where everything depends on system resources and thus SLI has no effect.

3DMark05: Game2



Diagrams with results




The same as in the previous test.



3DMark05: Game3



Diagrams with results




The same as in the previous test.

Pay close attention to the FPS-over-time graph. Note the strong FPS oscillations due to SLI in this test, especially in the first and middle parts. Many objects here fall into the jurisdiction of both GPUs, and the driver switches between the processors very quickly, which results in momentary FPS drops. This may well be due to untuned drivers that leave one GPU periodically idle.

3DMark05: MARKS



Diagrams with results

Do you need more comments? :) I think you don't.

Conclusions

So, I guess it's perfectly clear that the SLI technology is effective. It provides excellent performance gains wherever system resources do not impose severe limits. All this happens even though one PCI-Express slot operates as x16 and the second only as x4 (there are almost no other options; only x8+x8 is theoretically possible).
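To put the x16 and x4 slots into perspective, here is a rough Python estimate of their theoretical bandwidth. It assumes first-generation PCI Express signaling (2.5 GT/s per lane with 8b/10b encoding), gives figures per direction, and ignores protocol overhead, so real throughput is lower.

```python
GEN1_TRANSFERS_PER_S = 2_500_000_000  # assumed: 2.5 GT/s per lane (PCI Express 1.x)


def lane_bandwidth_mb_s():
    """Theoretical payload bandwidth of one lane, one direction, in MB/s."""
    payload_bits_per_s = GEN1_TRANSFERS_PER_S * 8 // 10  # 8b/10b: 8 data bits per 10 sent
    return payload_bits_per_s // 8 // 1_000_000


def link_bandwidth_mb_s(lanes):
    """Theoretical payload bandwidth of a link with the given number of lanes."""
    return lanes * lane_bandwidth_mb_s()


for lanes in (16, 8, 4):
    print(f"x{lanes}: {link_bandwidth_mb_s(lanes)} MB/s")
# x16: 4000 MB/s
# x8: 2000 MB/s
# x4: 1000 MB/s
```

On paper the second card in the x4 slot gets a quarter of the bandwidth of the first one, yet in our tests this did not prevent gains of up to 90% and more.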

But there is also a fly in the ointment so far. During our tests we sometimes saw artifacts (ripples), which, by the way, clearly reveal the half-frame split between the GPUs. Color distortions also occurred occasionally. So NVIDIA still has some issues to fix in the drivers.

I have also tested SLI with a mixed pair, GeForce 6600GT + GeForce 6800GT. It doesn't work. To be more exact, it works with glitches and problems.

On the whole one can say that SLI has made its debut. In the second part of this material we'll analyze test results for nForce4 SLI, and only then shall we draw our final conclusions about how it fares on a more gaming-oriented platform. Untuned nForce drivers may have their effect there, though, because Intel drivers, especially system ones, are well known for their stability and fine tuning.

I think that if the market offers more motherboards with two PCI-E slots, SLI will definitely be successful, even stunningly successful.

In our 3Digest you can find more detailed comparisons of various video cards.



Theoretical materials and reviews of video cards, which concern functional properties of the GPU ATI RADEON X800 (R420)/X850 (R480)/X800XL (R430)/X700 (RV410) and NVIDIA GeForce 6800 (NV40/45)/6600 (NV43)

PART 2: nForce4 SLI based tests






Andrey Vorobiev (anvakams@ixbt.com)

December 21, 2004




Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.