FarCry is a well-known game. It owes its popularity not only to its gameplay and other qualities, but also to the fact that almost every tester uses it to benchmark the latest video cards.
It all started with technology demos showing off new GeForce features. Crytek came out with solutions that were genuinely advanced for the time. The first version of the «dinosaur isle» (the X-Isle tech demo) was released before 2001 and used Dot3 bump mapping, real-time reflections and refractions, and curved surfaces. At that time the GeForce2 Pro and GTS were the most powerful video cards.
This demo did not let you roam the isle; it only featured a scripted fly-around. Each dinosaur showcased its own rendering technique. The water was made with procedural textures, and the grass was sparse. But it looked truly miraculous for 2000!
The second version, named X-ISLE: Dinosaurs Island TechDemo, was released right after the announcement of the GeForce3 in 2001. It was aimed at demonstrating technologies specific to that video card: Texture Shaders used for water rippling, bump mapping, and sun reflection; Environmental Bump-Mapping on the sun reflection; and a high-polygon, vertex-program-skinned T-Rex (25,000 polygons).
The engine in this program was completely reworked and designed for hundreds of thousands of polygons per frame (just imagine that kind of performance from the GeForce3). The focus was on building carefully designed models of prehistoric animals using Shaders 1.1.
Yes, I remember as if it were yesterday how I used to admire this program. I liked to roam the isle (this version already allowed it), to look at the indolent, seemingly harmless animals, to admire the height of the brontosaurus, to climb the mountains and get a bird's-eye view of the whole isle with an archaeopteryx flying by. Majestic music complemented the scene wonderfully, and even now I'm very disappointed that this demo was never developed into a game.
There was no news for a long time after that, though at the end of 2001 the engine developers mentioned two games to be released with the CryEngine.
The first signs that Crytek was not dead (figuratively speaking, of course) and that a graphics bombshell, the FarCry game, was under development appeared in 2002. As development progressed, more and more screenshots appeared on the Internet, making 3D-graphics fans roll their eyes in delight.
And then the miracle happened. A radically reworked engine from Crytek, luxurious levels, excellent in-game physics and so on did their part and raised the game to the peak of popularity. It goes without saying that a powerful video accelerator is required to use the entire graphics potential; even the new NV40/R420 cannot boast of the fastest speeds without AA and anisotropy. On the other hand, testers and 3D-accelerator fans have obtained a new tool for benchmarking performance.
I have mentioned NV40/R420 above. We have reviewed their features many times, so you can browse our articles on the subject.
Theoretical materials and reviews of video cards, which concern functional properties of the GPU ATI RADEON X800 (R420) and NVIDIA GeForce 6800 (NV40):
So, FarCry and the new accelerators. Until now all testers and gamers have used Patch 1.1 for this game, which fixed a lot of bugs and optimized performance. But along with a considerable rise in performance on the GeForce FX 59xx series, it partially lowered image quality (we have already written about this issue).
The release of Patch 1.2 is now on the horizon. What can we expect from it? Answering that question in full is hardly appropriate in this section; it is the prerogative of gaming resources. But from the NV40's point of view, this version brings the first real use of Shaders 3.0. Yes, Patch 1.2 employs this new technology. Jumping ahead, I'll tell you that it does so purely for performance: the quality component has not been touched (except for some bug fixes).
We decided to publish this material precisely to see what Patch 1.2 brings to the GeForce 6800. In this article I shall not plunge into reviewing all modern video cards and their performance in FarCry 1.2; we shall content ourselves with the GeForce 6800 Ultra and the X800 XT. For everything else we have 3DiGest, where we review performance (when the patch is officially released, we shall test all of our 35-37 cards with it) as well as quality (all existing glitches will certainly be published there).
Remember that both cards have 256 MB of GDDR3 memory with a 1.6 ns access time; the memory runs at 550 (1100) MHz on the NV40 and at 575 (1150) MHz on the R420. Chip clocks are 400 and 450 MHz for the NV40 and 525 MHz for the R420. Both GPUs contain 16 pixel and 6 vertex pipelines. The NV40 also supports Shaders 3.0, while the R420 supports 3Dc, a proprietary technology for compressing normal maps. That's it in brief.
VSync is disabled.
I emphasize that in our tests we used Microsoft Service Pack 2 (RC0) for WinXP, which already includes the new DX 9.0c. It is required for Shaders 3.0.
We used the following test applications:
Note that the patch includes premade demo scripts, which can be started either from the console or from the command prompt. Here is a sample BAT-file:
This batch file starts prerecorded demo scripts on different levels. We shall return to them later. In this section we used demos recorded specially for testing on several levels, to show the diversity of results and to assess the situation more objectively.
Readers should understand that Shaders 3.0 do not replace version 2.0 completely; in this case they play an optimization role. In particular, one of their main components is the dynamic branching capability. As everybody knows, a shader is a code snippet, a program written in a special language similar to assembler. A developer must make a shader universal, so that it works with various accelerators. During shader execution a situation may arise where, for some reason, there is no need to process the shader to the end and perform useless operations, or where it makes sense to stop this shader and pass on to the next one. That is, a jump or branch has to be performed that could not be known beforehand. This feature is specific to Shaders 3.0. The question is: was it thanks to this feature that Crytek programmers managed to optimize NV4x operation in a number of scenes? Let's see.
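To make the idea of dynamic branching more concrete, here is a minimal CPU-side sketch in Python. It is not shader code and is not taken from FarCry; the `Light` structure, the falloff formula and both function names are assumptions made purely for illustration. The first function evaluates every light in full and masks out the useless results, roughly the way a shader without dynamic flow control has to work, while the second skips a light as soon as it finds out, per pixel and at run time, that the light cannot reach it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Light:
    """Hypothetical point light; all fields are illustrative assumptions."""
    x: float
    y: float
    radius: float
    intensity: float


def shade_no_branching(px: float, py: float, lights: List[Light]) -> Tuple[float, int]:
    """Evaluate every light in full, then mask out the results that were not needed."""
    total, evaluations = 0.0, 0
    for light in lights:
        d2 = (px - light.x) ** 2 + (py - light.y) ** 2
        contribution = light.intensity / (1.0 + d2)  # always computed
        evaluations += 1
        mask = 1.0 if d2 <= light.radius ** 2 else 0.0
        total += contribution * mask
    return total, evaluations


def shade_dynamic_branching(px: float, py: float, lights: List[Light]) -> Tuple[float, int]:
    """Skip a light with a data-dependent jump when it cannot reach the pixel."""
    total, evaluations = 0.0, 0
    for light in lights:
        d2 = (px - light.x) ** 2 + (py - light.y) ** 2
        if d2 > light.radius ** 2:
            continue  # branch decided at run time, per pixel
        total += light.intensity / (1.0 + d2)
        evaluations += 1
    return total, evaluations
```

Both functions return the same colour value; only the count of full light evaluations differs, and that saved work is what dynamic branching is supposed to buy.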
First of all, let's run our demo01, which has been used in our tests for a long time (Research level).
Evidently, Shaders 3.0 are hardly used in the open spaces of this level, and the performance gain is barely visible. The X800 XT is still the leader in heavy modes with AA and AF.
Let's run a demo of an episode in an enclosed space (a cave with a generator inside). The realistic surfaces of the cave, floor, and other objects were rendered with complex Shaders 2.0, and now Shaders 3.0 are used as well.
Yes, the performance gain is considerable. But in heavy modes the X800 XT, still the leader, beats NVIDIA's creation.
These examples demonstrated that, on the whole, we shouldn't expect much performance gain from NV40 in the game, since Shaders 3.0 are used selectively for the time being.
Let's look at the performance on the other levels.
And a concluding comparative table:
It's evident that Shaders 3.0 paid the largest dividends in heavy modes with AA and AF. But the X800 XT is not conquered (except in cases without AA and AF). The status quo is preserved. But still, if you take into account that Shaders 3.0 support was introduced by two programmers in two weeks, the result is quite remarkable.
Test results: Tests based on the demo scripts included in the patch
I have also run the demos that are included in the patch and recommended by NVIDIA for benchmarking. I shall not cite all the figures, just a comparative table, as it is the most illustrative:
You have probably already noticed the striking difference between these results and the previous ones. Yes, the scripts follow routes only around objects optimized with SM 3.0. And the performance gain there is really considerable! The X800 XT is no longer the leader in such places. Thus we can conclude that if the game used v2.0 shaders heavily throughout and the 3.0 optimizations were applicable all over the game (with positive results, of course), the overall performance gain would be very high, and the X800 XT would surely fall back to second place in FarCry performance.
Test results: Quality
As I have already mentioned, the introduction of Shaders 3.0 is not aimed at any graphics innovations (improvements, new effects) at this stage. HDR support for the NV40 and new visual effects will come in the next patch, 1.3. We can already get a glimpse of roughly how it will look (source: NVNews.net forum):
It would be fair to mention that Patch 1.3 will also include the 3Dc technology, but it is still a question what it will bring from the visualization point of view.
For now we can mention a comforting fact: the problem of low NVIDIA quality in FarCry (in several scenes you could see pixelation under changing lighting) is resolved in v1.2 even without using Shaders 3.0.
Thus the tests seem to have confirmed that Shaders 3.0 are not an empty phrase, that they are for real. Of course, we should allow for the fact that the software used for testing is a beta version and may be improved in the future.
This game uses 3.0 in places with a heavy load on the accelerator. This helped raise the minimum performance values for the NV40. As for average performance, you cannot say that the 6800 Ultra scored a triumph over its competitor, because in heavy modes the X800 XT generally has higher performance. We cannot treat the figures obtained in fly-arounds near a few selected objects as representative.
But those figures are not useless either. Results from the demos included in the patch demonstrate what could be achieved by applying the optimization everywhere (provided, naturally, that it yields gains; the tests also showed that performance losses are possible).
When Patch 1.3 is out, we shall pay attention to the quality component (that is, visual effects) as well. For now, let us note once more that the NV40's quality problems in FarCry are resolved.
The question remains: is it really Shaders 3.0 that cause the performance gain, or is it something else? Let's have a look at the research by Alexei Barkovoi.
ADDENDUM from Alexei Barkovoi:
Performance gain at what cost?
Having obtained a performance gain exceeding 20% in the demo scripts included in the patch, we decided to investigate the reasons for it. According to NVIDIA and to Cevat Yerli (Crytek) in his interview with the FiringSquad web site, Patch 1.2 uses improved object lighting and geometry instancing on video cards that support Shaders 3.0.
The new version of FarCry can apply up to four light sources to an object in a single pass. Compare this with video cards supporting only Shaders 2.0, where each light source requires its own rendering pass. Such an improvement in the game engine reduces the demands on both memory bandwidth and the vertex shader unit. For example, in the Research demo (where the maximum gain is obtained, see above) activating Shaders 3.0 reduces the number of processed triangles by 40% (confirmed by our tests).
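The arithmetic behind this saving can be sketched in a few lines of Python. This is only a simplified model under stated assumptions: every object is lit by at least one light, each pass re-submits the whole object's geometry, and nothing else (base passes, shadows and so on) is counted. The function names and the example scene are hypothetical and are not taken from the game.

```python
import math
from typing import Callable, List, Tuple

MAX_LIGHTS_PER_PASS_SM3 = 4  # from the article: up to four lights per pass


def passes_sm2(num_lights: int) -> int:
    """One rendering pass per light source, as described for Shaders 2.0 cards."""
    return num_lights


def passes_sm3(num_lights: int) -> int:
    """Lights are grouped, up to four per pass."""
    return math.ceil(num_lights / MAX_LIGHTS_PER_PASS_SM3)


def triangles_processed(scene: List[Tuple[int, int]],
                        passes_fn: Callable[[int], int]) -> int:
    """Sum triangles over all passes; scene items are (triangles, lights) pairs."""
    return sum(triangles * passes_fn(lights) for triangles, lights in scene)


# Hypothetical scene, for illustration only: (triangle count, number of lights).
example_scene = [(1000, 1), (500, 3), (200, 4)]
multi_pass = triangles_processed(example_scene, passes_sm2)
single_pass = triangles_processed(example_scene, passes_sm3)
print(multi_pass, single_pass)
```

The more objects are lit by several lights at once, the larger the reduction; objects lit by a single light gain nothing, which fits the observation that the benefit depends heavily on the scene.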
The next improvement, geometry instancing, is now mostly used for rendering open spaces. This technology makes it possible to render many identical objects with a single DirectX call instead of rendering each object individually. Geometry instancing reduces CPU usage (by both DirectX and the video card driver) and raises GPU efficiency (less pipeline downtime between rendering different objects). In FarCry geometry instancing is used for rendering trees, grass, and other flora.
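The effect on the number of draw calls can be illustrated with a small Python sketch. It only counts calls and does not touch any real graphics API; the mesh names and function names are hypothetical, and it assumes that all objects sharing a mesh can be submitted as one instanced batch.

```python
from collections import Counter
from typing import Dict, List


def draw_calls_individual(objects: List[str]) -> int:
    """Without instancing: one draw call per object."""
    return len(objects)


def instance_batches(objects: List[str]) -> Dict[str, int]:
    """Group identical meshes; each group becomes one instanced draw call."""
    return dict(Counter(objects))


def draw_calls_instanced(objects: List[str]) -> int:
    """With instancing: one draw call per distinct mesh."""
    return len(instance_batches(objects))


# Hypothetical outdoor scene: each entry names the mesh an object uses.
example_objects = ["palm"] * 3 + ["grass"] * 5 + ["rock"]
print(draw_calls_individual(example_objects), draw_calls_instanced(example_objects))
```

Fewer calls mean less CPU time spent in the API and driver, but the GPU still has to draw every instance, so instancing helps only where the CPU side is the bottleneck.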
It turned out that the two improvements can be controlled separately. In addition to the «r_sm30path» key used in the batch files from NVIDIA, there are the keys «r_noPS30» and «r_GeomInstancing». As their names suggest, r_noPS30 disables Pixel Shaders 3.0 and r_GeomInstancing activates geometry instancing.
Let's look at the results demonstrated by GeForce 6800 Ultra with the improvements activated separately. The diagram displays (top-down) performance at: full activation of Shaders 3.0, using only Pixel Shaders 3.0, using only geometry instancing, using the "old" rendering technique with Shaders 2.0 and lower.
The diagram shows that geometry instancing brings no gain in FarCry and that the main performance gain is due to the new one-pass lighting implementation with Pixel Shaders 3.0. Interestingly, although performance with the "r_noPS30" key differs from performance with the "r_sm30path" key, it is always lower than the latter.
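A comparison like this boils down to computing the relative gain of each mode over the old Shaders 2.0 path. The Python sketch below shows the calculation; the frame rates in it are placeholders invented for illustration and are NOT measurements from this article, and the function names are likewise hypothetical.

```python
from typing import Dict


def relative_gain(fps: float, baseline_fps: float) -> float:
    """Gain of fps over baseline_fps, in percent."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return (fps - baseline_fps) / baseline_fps * 100.0


def gains_over_baseline(results: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Relative gain of every mode except the baseline itself."""
    base = results[baseline]
    return {mode: relative_gain(fps, base)
            for mode, fps in results.items() if mode != baseline}


# Placeholder values only; they do not come from the diagram.
placeholder_fps = {
    "full Shaders 3.0": 60.0,
    "Pixel Shaders 3.0 only": 58.0,
    "geometry instancing only": 50.0,
    "Shaders 2.0 and lower": 50.0,
}

for mode, gain in gains_over_baseline(placeholder_fps, "Shaders 2.0 and lower").items():
    print(f"{mode}: {gain:+.1f}%")
```

Switching one feature on at a time against a common baseline is what makes it possible to attribute the gain to a particular improvement.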
So, the maximum gain in the game comes from the new lighting technique. Let's see which pixel shaders the game uses. To do this, we used 3DAnalyzer to save all the shaders used by the game on a GeForce 6800 in the Research demo with the r_sm30path mode activated. This demo was selected because it showed the maximum gain after this rendering mode was activated. The complete listing of Pixel Shaders 3.0 contains almost two thousand lines. But to my surprise, a cursory examination of the resulting shaders did not reveal a single distinctive feature of Pixel Shaders 3.0: neither dynamic nor static branching, nor dynamic or static loops, etc. Likewise, all the shaders seem to have a code length that meets the specification for Pixel Shaders 2.0 (and is guaranteed to comply with the limits for Shaders 2.0a and 2.0b).
So what is the basis for the statements about branching in shaders, for example the quote: "more lighting can be accomplished with flow control and branching so we can encode for example 4 lights in one pass"? The only fragment of the HLSL/Cg shader source code we found that could (after compilation) take advantage of Shaders 3.0 features is shown below:
This code uses a loop to calculate object lighting from several light sources. Their number is specified in the _NUM_LIGHTS variable. But the FarCry engine precompiles all possible shader permutations into byte-code. Thus, when this shader is compiled, _NUM_LIGHTS is a constant, and static branching depending on the number of lights is not used. As a result, the game uses, as needed, a large number of different shaders compiled from several "mega-shaders". And so FarCry 1.2 does not use the features of Shaders 3.0, though it has such capabilities.
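The precompilation scheme described above can be mimicked with a short Python sketch. It does not reproduce the game's shader source or its compiler; the pseudo-shader text, the `ambient` term and the function names are assumptions for illustration only. The point is that once the light count is fixed at build time, the loop can be fully unrolled, and the resulting variant contains neither a loop nor a branch.

```python
from typing import Dict

MAX_LIGHTS = 4  # from the article: up to four lights in one pass


def unroll_light_loop(num_lights: int) -> str:
    """Return pseudo-shader text with the light loop unrolled for a fixed count."""
    if not 1 <= num_lights <= MAX_LIGHTS:
        raise ValueError("num_lights must be between 1 and MAX_LIGHTS")
    lines = ["color = ambient"]  # hypothetical starting term
    for i in range(num_lights):  # this loop runs at build time, not in the shader
        lines.append(f"color += shade(light[{i}])")
    return "\n".join(lines)


def precompile_permutations() -> Dict[int, str]:
    """Build one flat shader variant per possible light count."""
    return {n: unroll_light_loop(n) for n in range(1, MAX_LIGHTS + 1)}


permutations = precompile_permutations()
print(permutations[3])
```

At run time the engine would simply pick the variant that matches the number of lights on an object, which is why no flow control instructions need to appear in the compiled shaders.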
The FarCry 1.2 example demonstrated that Shaders 3.0 are real and effective, and that many games will be able to support them provided their engines are flexible. The results obtained are somewhat ambiguous: on the one hand, we witnessed a performance gain in several places; on the other hand, it seems that the improvement that brought the considerable rendering gain (several light sources in one pass) could have been implemented with Shaders 2.x (possibly even 2.0). Thus I would like the developers not only to promote the latest technologies, but also to remember to optimize game performance on existing video cards. After all, every performance gain in the above tests is actually due to the reduction in lighting passes. That is, the current implementation owes none of its merits to features specific to Shaders 3.0.
Patch 1.3 is ahead; it has been announced to support rendering with HDR effects. Because of its requirements (support for floating-point colour blending in the framebuffer) this mode will be compatible only with the GeForce 6800. Let's hope that the next patch will use the new features offered by Shaders 3.0 more effectively.
Will Shaders 3.0 be actively introduced and used? In my opinion yes, but not quickly. It's a pity that ATI and NVIDIA have again drifted apart technologically: Shaders 2.0 had just started to bring them together and developers had sighed in relief, and now they have drifted apart again like ships at sea. But we shall keep an eye on the development of 3Dc and see what dividends this technology brings. For now we finish this article with a wish that the developers not stop at this point and demonstrate 3.0 more vividly and effectively (today's demonstration is mostly a developers' trick). We are looking forward to Patch 1.3, which should demonstrate another NV40 feature that is hidden for the time being. Perhaps programmers will not stop their efforts on the quality issue and will continue to apply 3.0 with positive results, achieving a more considerable overall performance gain in the game.
Alexei Barkovoi (firstname.lastname@example.org)
12 July, 2004