iXBT Labs - Computer Hardware in Detail

Brief account of FarCry v.1.2 tests
and of the first Shader 3.0 implementation








Contents

  1. Introduction, video cards' features
  2. Testbed configurations
  3. Test results: Tests based on third party demo-scripts
  4. Test results: Tests based on the demo-scripts included into Patch 1.2
  5. Test results: Quality
  6. Conclusions



FarCry is a well-known game. It owes its popularity not only to its gameplay and other merits, but also to the fact that practically every tester uses it to benchmark the latest video cards.

It all started with technology demos showing off new GeForce features. Crytek really did come out with solutions that were progressive for their time. The first version of the "dinosaur isle" (the X-Isle tech demo) was released before 2001 and used Dot3 Bumpmapping, Realtime Reflections and Refractions, and Curved Surfaces. At that time GeForce2 Pro and GTS were the most powerful video cards.




This demo did not let you roam the isle; it only featured a scripted fly-around. Each dinosaur demonstrated its own rendering technique. The water was made with procedural textures, and the grass was sparse. But it looked really miraculous for 2000!

The second version, named X-ISLE: Dinosaurs Island TechDemo, was released right after the announcement of GeForce3 in 2001. It was aimed at demonstrating technologies specific to that video card: Texture Shaders used for water rippling, bump-mapping and sun reflection; Environmental Bump-Mapping on Sun-Reflection; a High Polygon vertex-program skinned T-Rex (25,000 polygons).




The engine in this program was completely reworked and oriented at hundreds of thousands of polygons per frame (just imagine the performance GeForce3 displayed). The highlight was the thoroughly designed models of prehistoric animals rendered with Shaders 1.1.

Yes, I remember as if it were yesterday, how I used to admire this program. I liked to roam the isle (this version already allowed it), to look at indolent, seemingly harmless animals, to admire brontosaurus' height, to go up to the mountains and get a bird's-eye view of the whole isle with archaeopteryx flying by. Majestic music wonderfully supplemented this scene, and even now I'm very disappointed that this demo had not been developed into a game.

There was no news for a long time after that, though at the end of 2001 the engine developers mentioned two games to be released with the CryEngine.

The first signs that Crytek was not dead (figuratively speaking, of course) and that a graphics bomb (the FarCry game) was under development appeared in 2002. As development progressed, more and more screenshots regularly appeared on the Internet, which made 3D-graphics fans roll their eyes in delight.

And then the miracle happened. A radically reworked engine from Crytek, luxurious levels, excellent physics, etc. did their part and raised the game to the peak of popularity. It goes without saying that a powerful video accelerator is required to use the entire graphics potential; even the new NV40/R420 cannot boast of the fastest speeds without AA and anisotropy. But on the other hand, testers and 3D-accelerator fans have obtained a new tool to benchmark performance.

I have mentioned NV40/R420 above. We have reviewed their features many times, so you can browse our theoretical materials and video card reviews covering the functional properties of the ATI RADEON X800 (R420) and NVIDIA GeForce 6800 (NV40) GPUs.

So, FarCry and the new accelerators. Until now all testers and gamers have used Patch 1.1 for this game, which fixed a lot of bugs and optimized performance. But while it considerably raised the performance of the GeForce FX 59xx series, it also partially lowered the quality (we have already written about this issue).

The release of Patch 1.2 is on the horizon now. What can we expect from it? I think this section is hardly the place to answer that question; it is the prerogative of gaming resources or sections. But from the NV40 point of view, this version will bring the first real use of Shaders 3.0. Yes, Patch 1.2 employs this new technology. Looking ahead, I'll tell you that it does so only for the sake of performance: the quality component has not been touched (except for some bugfixes).

We decided to publish this material just to see what Patch 1.2 brings to GeForce 6800. In this article I shall not review all modern video cards and their performance in FarCry 1.2; we shall content ourselves with GeForce 6800 Ultra and X800 XT only. After all, we have 3DiGest, where we review both performance (when the patch is officially released, we shall test all of our 35-37 cards with it) and quality (all existing glitches will certainly be published in that section).

Video Cards


[Photos: ATI RADEON X800 XT and NVIDIA GeForce 6800 Ultra]














Remember that both cards have 256 MB of GDDR3 memory with 1.6 ns access time; the memory runs at 550 (1100) MHz on NV40 and at 575 (1150) MHz on R420. Chip clocks: NV40 - 400 and 450 MHz, R420 - 525 MHz. Both GPUs contain 16 pixel and 6 vertex pipelines. NV40 also supports Shaders 3.0, while R420 supports 3Dc, a proprietary technology for compressing normal maps. That's it in brief.

Installation and Drivers

Testbed configurations:

  • Athlon 64 3200+ based computer:

    • AMD Athlon 64 3200+ (L2=1024K) CPU.

    • ASUS K8V SE Deluxe (VIA K8T800) mainboard.

    • 1 GB DDR SDRAM PC3200.

    • Seagate Barracuda 7200.7 80GB SATA HDD.

  • Operating system - Windows XP SP2beta(!); DirectX 9.0c.

  • Monitors - ViewSonic P810 (21") and Mitsubishi Diamond Pro 2070sb (21").

  • ATI drivers v6.458 (CATALYST 4.7beta); NVIDIA v61.45.

VSync is disabled.

I emphasize that in our tests we used Microsoft Service Pack 2 (RC0) for WinXP, which already includes the new DirectX 9.0c. It is required for Shaders 3.0.

Test Results

Test results: Tests based on third party demo-scripts

We used the following test applications:

  • FarCry 1.2 (Crytek/UbiSoft), DirectX 9.0, multitexturing, (the game is started with the key -DEVMODE), test settings - Very High.

Note that the patch includes premade demo-scripts, which can be started either from the console or from the command prompt. Here is a sample BAT-file:

echo Running Research...
Bin32\FarCry.exe -DEVMODE "#demo_num_runs=1" "#demo_quit=1" "map research" "demo research" "r_sm30path 1"
echo Running Regulator...
Bin32\FarCry.exe -DEVMODE "#demo_num_runs=1" "#demo_quit=1" "map regulator" "demo regulator" "r_sm30path 1"
echo Running Training...
Bin32\FarCry.exe -DEVMODE "#demo_num_runs=1" "#demo_quit=1" "map training" "demo training" "r_sm30path 1"
echo Running Volcano...
Bin32\FarCry.exe -DEVMODE "#demo_num_runs=1" "#demo_quit=1" "map volcano" "demo volcano" "r_sm30path 1"

Using the variables #r_Width=1600 and #r_Height=1200 you can set the resolution from the command prompt. The use of Shaders 3.0 is controlled by the r_sm30path variable: when it is set to "1", SM 3.0 is enabled; when it is set to "0", it is disabled and rendering falls back to Shaders 2.0 at most.
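To compare the two rendering paths at several resolutions, the command lines can be generated rather than typed by hand. The following is only a sketch: the helper names are hypothetical, and placing the #r_Width and #r_Height variables next to the other "#" variables is an assumption. All switches themselves are taken from the BAT-file above.

```python
# Sketch: build FarCry demo command lines like those in the BAT-file above.
# Only switches that appear in the article are used; the helper names and
# the position of the resolution variables are assumptions.

def farcry_demo_args(level, sm30=True, width=None, height=None,
                     exe=r"Bin32\FarCry.exe"):
    """Return the argument list for one demo run on the given level."""
    args = [exe, "-DEVMODE", "#demo_num_runs=1", "#demo_quit=1"]
    # Resolution is optional; the article sets it via #r_Width / #r_Height.
    if width is not None and height is not None:
        args += [f"#r_Width={width}", f"#r_Height={height}"]
    args += [f"map {level}", f"demo {level}",
             f"r_sm30path {1 if sm30 else 0}"]
    return args


def as_bat_line(args):
    """Quote every argument after the executable and -DEVMODE, as the BAT-file does."""
    head, tail = args[:2], args[2:]
    return " ".join(head + [f'"{a}"' for a in tail])


if __name__ == "__main__":
    for level in ("research", "regulator", "training", "volcano"):
        for sm30 in (True, False):
            print(as_bat_line(farcry_demo_args(level, sm30, 1600, 1200)))
```

Running each level once with r_sm30path set to 1 and once set to 0 gives the pair of figures needed for a Shaders 3.0 versus Shaders 2.0 comparison.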

This batch file starts prerecorded demo-scripts on different levels. We shall return to them later. In this section we used demos that we recorded ourselves on several levels, in order to show the diversity of results and to assess the situation more objectively.

Readers should understand that Shaders 3.0 cannot completely replace the second version; in this case they serve as an optimization. In particular, one of their main components is dynamic branching. As everybody knows, a shader is a code snippet, a program written in a special language similar to assembler. A developer must make a shader universal, so that it works with various accelerators. During shader execution a situation may arise when there is no need to run the shader to the end and perform useless operations, or when it makes sense to stop the shader and pass on to the next one, etc. That is, the shader must perform a jump (a branch) that is not known beforehand. This feature is specific to Shaders 3.0 only.

The question is: was it thanks to this feature that Crytek programmers managed to optimize NV4x operation in a number of scenes? Let's see.
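The idea of dynamic branching can be illustrated with a toy model. This is CPU-side Python, not GPU code, and the instruction costs are invented for illustration; it only shows why skipping useless work per pixel can pay off.

```python
# Toy CPU-side model of dynamic branching; not GPU code.
# The costs below are made-up numbers for illustration only.

CHEAP_COST = 2       # hypothetical cost of the test that decides the branch
EXPENSIVE_COST = 20  # hypothetical cost of the full lighting computation


def cost_without_branching(pixels_lit):
    """Every pixel pays for the full computation, whether it is lit or not."""
    return sum(CHEAP_COST + EXPENSIVE_COST for _ in pixels_lit)


def cost_with_dynamic_branching(pixels_lit):
    """Pixels that fail the run-time test skip the expensive part."""
    total = 0
    for lit in pixels_lit:
        total += CHEAP_COST
        if lit:  # decided per pixel at run time, not known beforehand
            total += EXPENSIVE_COST
    return total


if __name__ == "__main__":
    pixels = [True, False, False, True, False, False, False, False]
    print(cost_without_branching(pixels))       # 176
    print(cost_with_dynamic_branching(pixels))  # 56
```

Real hardware processes pixels in groups, so the actual saving depends on how coherent the branch is across neighbouring pixels; the model ignores that.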

First of all, let's run our demo01, which has been used in our tests for a long time (Research level).









Evidently, there are almost no Shaders 3.0 in the open space on this level, and the performance gain is hardly visible. X800 XT is still a leader in heavy modes with AA and AF.

Let's run this demo in an episode set in an enclosed space (with a generator inside a cave). The realistic surfaces of the cave, floor, and other objects were rendered using complex Shaders 2.0... And now Shaders 3.0 are also used.


   






Yes, the performance gain is considerable. But in heavy modes the leader, X800 XT, still beats NVIDIA's creation.

These examples demonstrated that, on the whole, we shouldn't expect much performance gain from NV40 in the game, since Shaders 3.0 are used selectively for the time being.

Let's look at the performance on the other levels.

Control




   






Regulator




   






Volcano




   






And a concluding comparative table:




It's evident that the use of Shaders 3.0 yielded the most dividends in heavy modes with AA and AF. But X800 XT is not conquered (except for cases without AA and AF). The status quo is preserved. But still, if you take into account that Shaders 3.0 support was added by two programmers in two weeks, it's quite remarkable.

Test results: Tests based on the demo-scripts included into the patch

I have also run the demos that were included into the patch and were recommended by NVIDIA for benchmarking. I shall not cite all the figures, just a comparative table as it is the most illustrative:




You must have already noticed the striking difference between these results and the previous ones. Yes, these scripts route only around objects optimized with SM 3.0. The performance gain is really considerable, and X800 XT is not the leader in such places. Thus we can conclude that if the game relied on v2.0 shaders more often and the 3.0 optimizations were applicable throughout the game (with positive results, of course), the overall performance gain would be very high, and X800 XT would surely fall back to second place in FarCry performance.

Test results: Quality

As I have already mentioned, the introduction of Shaders 3.0 does not bring any graphics innovations (improvements, new effects) at this stage. HDR support for NV40 and new visual effects will come in the next Patch 1.3. We can already get a rough idea of how it will look (source: NVNews.net forum):


   


   

It would be fair to mention that Patch 1.3 will also include the 3Dc technology, but it is still a question what it will bring from the visualization point of view.

And for now we can note a comforting fact: the problem of low NVIDIA quality in FarCry (in several scenes you could see pixelization under changing lighting) is resolved in v1.2 even without using Shaders 3.0.

[Screenshot comparison: RADEON X800 XT, GeForce 6800 Ultra with Shaders 2.0, GeForce 6800 Ultra with Shaders 3.0]










Conclusions

Thus, the tests seem to have confirmed that Shaders 3.0 are not an empty phrase: they are for real. Of course, we should make allowance for the beta version of the software used for testing; it may be improved in the future.

This game uses Shaders 3.0 in places with a heavy load on the accelerator, which helped raise the minimum performance values for NV40. As for average performance, you cannot say that 6800 Ultra triumphed over its competitor, because in heavy modes X800 XT is generally faster. We cannot take as representative the figures obtained in fly-arounds near a few selected objects.

But those figures are not useless either. Results from the demos included into the patch demonstrate what can be achieved if the optimization is introduced everywhere (where it yields performance gains, naturally; the tests also show that performance losses are possible).

When Patch 1.3 is out, we shall pay attention to the quality component (that is visual effects) as well. And now let's note one more time that NV40 has its problems with quality in FarCry resolved.

The question remains: is it really Shaders 3.0 that cause the performance gain, or is it something else? Let's have a look at the research of Alexei Barkovoi.

ADDENDUM from Alexei Barkovoi:

Performance gain at what cost?

Having obtained a performance gain exceeding 20% in the demo-scripts included into the patch, we decided to investigate its causes. According to NVIDIA and to the answers of Cevat Yerli (Crytek) in an interview with the FiringSquad web site, Patch 1.2 uses improved object lighting and geometry instancing on video cards supporting Shaders 3.0.

The new version of FarCry can apply up to four light sources to an object in a single pass. Compare this with video cards supporting only Shaders 2.0, where each light source requires its own rendering pass. This improvement in the game engine reduces the requirements for both memory bandwidth and vertex shader unit performance. For example, in the Research demo (where the maximum gain was obtained - see above) the activation of Shaders 3.0 reduces the number of processed triangles by 40% (as confirmed by our tests).
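The link between lighting passes and processed triangles can be sketched with simple arithmetic. The scene below is invented for illustration and is not a measurement from the game; the model counts lighting passes only and assumes every object is lit by at least one light.

```python
import math


def lighting_passes(num_lights, lights_per_pass):
    """Rendering passes needed to apply num_lights (>= 1) to one object."""
    return math.ceil(num_lights / lights_per_pass)


def triangles_submitted(objects, lights_per_pass):
    """objects: list of (triangle_count, num_lights) pairs (hypothetical scene).

    Each lighting pass re-submits the object's geometry, so its triangle
    count is multiplied by the number of passes.
    """
    return sum(tris * lighting_passes(lights, lights_per_pass)
               for tris, lights in objects)


if __name__ == "__main__":
    scene = [(1000, 1), (1000, 2), (500, 4)]  # made-up objects
    per_light = triangles_submitted(scene, lights_per_pass=1)  # one light per pass
    batched = triangles_submitted(scene, lights_per_pass=4)    # up to four per pass
    print(per_light, batched)  # 5000 2500
```

Objects lit by a single light gain nothing from batching, which is why the overall reduction depends on how many multi-light objects a scene contains.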

The next improvement - geometry instancing - is now mostly used for rendering open spaces. This technology makes it possible to render many identical objects with a single DirectX call instead of rendering each object individually. Geometry instancing reduces CPU usage (by both DirectX and the video card driver) and raises GPU efficiency (less pipeline idle time between rendering different objects). In FarCry geometry instancing is used for rendering trees, grass, and other flora.

It turned out that both improvements can be controlled separately. In addition to the "r_sm30path" key used in the batch files from NVIDIA, there are the keys "r_noPS30" and "r_GeomInstancing". Their names make it clear that r_noPS30 disables Pixel Shaders 3.0 and r_GeomInstancing activates geometry instancing.
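The saving in draw calls can be shown with a small counting model. It is only a sketch: the scene and mesh names are hypothetical, and the model assumes that all objects sharing a mesh can be submitted in one instanced call.

```python
from collections import Counter


def draw_calls_individual(scene):
    """One draw call per object."""
    return len(scene)


def draw_calls_instanced(scene):
    """One draw call per distinct mesh; identical objects share a call."""
    return len(Counter(mesh for mesh, _position in scene))


if __name__ == "__main__":
    # Hypothetical scene: (mesh name, position) pairs.
    scene = [("palm", (0, 0)), ("palm", (5, 2)), ("palm", (9, 1)),
             ("grass", (1, 1)), ("grass", (2, 3)), ("rock", (4, 4))]
    print(draw_calls_individual(scene))  # 6
    print(draw_calls_instanced(scene))   # 3
```

The fewer calls the CPU has to issue, the less time DirectX and the driver spend per frame, which matches the reasoning above.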

Let's look at the results demonstrated by GeForce 6800 Ultra with the improvements activated separately. The diagram displays (top-down) performance at: full activation of Shaders 3.0, using only Pixel Shaders 3.0, using only geometry instancing, using the "old" rendering technique with Shaders 2.0 and lower.




The diagram shows that geometry instancing brings no gain in FarCry and that the main performance gain comes from the new single-pass lighting implementation with Pixel Shaders 3.0. Interestingly, although the performance with the "r_noPS30" key differs from the performance with the "r_sm30path" key, it is always lower than the latter.

So, the maximum gain in the game is due to the new lighting technique. Let's see what pixel shaders are used in the game. To do this, we used 3DAnalyzer to save all the shaders used by the game on GeForce 6800 in the Research demo with the r_sm30path mode activated. This demo was selected because it demonstrated the maximum gain after the activation of this rendering mode. The complete listing of Pixel Shaders 3.0 contains almost two thousand lines. But to my surprise, a cursory examination of the resulting shaders did not reveal a single distinctive feature of Pixel Shaders 3.0: neither dynamic nor static branching, nor dynamic or static loops, etc. Likewise, all the shaders seem to have a code length that meets the specification for Pixel Shaders 2.0 (and is guaranteed to comply with the limitations of Shaders 2.0a and 2.0b).

So what is the basis for the statements about branching in shaders - quote: "more lighting can be accomplished with flow control and branching so we can encode for example 4 lights in one pass"? The only fragment of HLSL/Cg shader source code we found that could (after compilation) take advantage of the Shaders 3.0 features is shown below:

   FLOAT fAttenFunction = 4.f/16.f;
   for (int i=0; i<_NUM_LIGHTS; i++)
   {
     // Calculate attenuation
     FLOAT atten = 1;
     if (aLType[i] != 0)
     {
       FLOAT dist = length(IN.lightVec[i].xyz) * AttenInfo[i];
       atten = tex2D(attenMap, float2(dist, fAttenFunction));
     }

     …………

     FLOAT3 dif = decalColor.xyz * NdotL * ( FLOAT3 )Diffuse[i].xyz * atten * filterColor.xyz;
     dif.xyz = HDREncode(dif.xyz);
     vFinalDif += dif.xyz;
   }

This code uses a loop to calculate object lighting from several light sources. Their number is specified by _NUM_LIGHTS. However, the FarCry engine precompiles all possible shader permutations into byte-code. Thus, when this shader is compiled, _NUM_LIGHTS is a constant, and static branching depending on the number of lights is not used. As a result, the game uses, as needed, a multitude of different shaders compiled from several "mega-shaders". So FarCry 1.2 does not use the features of Shaders 3.0, though it has the capability to do so.
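The effect of treating the light count as a compile-time constant can be mimicked in a few lines. This is a conceptual sketch only: it does not reproduce the FarCry shader compiler, and light_contribution is a hypothetical placeholder for the body of the loop shown above.

```python
def specialize(num_lights):
    """Emit pseudo-shader lines for a fixed light count (loop fully unrolled)."""
    lines = []
    for i in range(num_lights):  # runs at "compile time", not per pixel
        lines.append(f"vFinalDif += light_contribution({i});")
    return lines


def precompile_variants(max_lights):
    """Build one shader variant per possible light count, keyed by that count."""
    return {n: specialize(n) for n in range(1, max_lights + 1)}


if __name__ == "__main__":
    variants = precompile_variants(4)
    for n, body in variants.items():
        print(f"variant for {n} light(s): {len(body)} unrolled statement(s)")
```

Each variant contains straight-line code with no loop or branch left in it, which is consistent with the saved shaders showing no flow-control instructions.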

Conclusion

The FarCry 1.2 example demonstrated that Shaders 3.0 are real and effective, and that many games with a flexible engine will be able to support them. The results are somewhat ambiguous, though: on the one hand, we witnessed performance gains in several places; on the other hand, it seems that the improvement that brought the considerable rendering gain (several light sources in one pass) could have been implemented with Shaders 2.x (possibly even 2.0). All performance gains in the above tests are in fact due to the reduced number of lighting passes; that is, the current implementation gains nothing from the features specific to Shaders 3.0. Thus I would like to wish that developers not only promote the latest technologies, but also remember to optimize game performance on existing video cards.

Patch 1.3 is ahead; it is announced to support rendering with HDR effects. Because of the game's requirements (support for floating-point colour blending in the framebuffer), it will be compatible only with GeForce 6800. Let's hope that the next patch will use the new features offered by Shaders 3.0 more effectively.

Will the technology of Shaders 3.0 be actively introduced and used? In my opinion, yes, it will, but not fast. It's a pity that ATI and NVIDIA have again drifted apart technologically: Shaders 2.0 had just started to bring them together, and the developers had sighed in relief. And then they drifted apart again like ships at sea. But we shall keep an eye on the development of 3Dc, and we shall see what dividends this technology brings. For now we finish this article with a wish that the developers not stop at this point and demonstrate 3.0 more vividly and effectively (today's demonstration is mostly a developers' trick). We are looking forward to Patch 1.3, which should demonstrate another NV40 feature that remains hidden for the time being. Perhaps the programmers will not limit their efforts to the quality issue and will continue to apply 3.0 with positive results, achieving a more considerable overall performance gain in the game.

Andrew Vorobiev (anvakams@ixbt.com)
Alexei Barkovoi (clootie@ixbt.com)

12 July, 2004


Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.