iXBT Labs - Computer Hardware in Detail

NVIDIA NV20 GPU
in-depth features analysis
Updated



General

It is no secret that on February 27, at the Intel Developer Forum (IDF), NVIDIA is to officially announce its new GPU, known as the NV20. Everyone wants to know what is new in this graphics processor, so we are going to lift the veil of secrecy. Note that we give no performance measurements, since this engineering sample is still rather raw and the drivers are not finished. But we can already discuss the functional features of the NV20 in full.

First comes the NV20 specification (we list only the most interesting parameters so as not to repeat everything the GeForce2 already supports).

NV20 specification

  • Technology: 0.15 micron
  • 60 million transistors
  • Graphics core frequency: 200+ MHz (later "Ultra" and "Pro" variants are possible, e.g. at 250 MHz)
  • Rendering pixel pipelines: 4
  • Texture units per rendering pipeline: 2
  • 4 textures per pixel
  • Memory interface: 128 bit
  • Supported memory: DDR SDRAM/SGRAM
  • At release, the first NV20-based cards will carry memory running at 250 MHz (500 MHz effective)
  • Peak bandwidth of the memory bus at 250 MHz: 8 GBytes/sec
  • Supported local memory size: up to 128 MBytes (the first cards, like our sample, will have 64 MBytes)
  • RAMDAC: 350 MHz
  • Max resolution: 2048x1536@75 Hz
  • Integrated TMDS transmitter for connecting digital monitors at resolutions up to 1600x1200
  • External bus interface: full support for AGP 2x/4x (including SBA, DME and Fast Writes) and PCI 2.2 (including bus mastering)
  • Hardware T&L: effective performance of 40+ million triangles per second (our sample scores slightly below 40 million triangles per second in a special synthetic test)
  • Full hardware support for all MS DirectX 8.0 and OpenGL 1.2 features
  • Full support for hardware DX8 vertex shaders, version 1.1
  • Full support for hardware DX8 pixel shaders, version 1.0
  • Support for volume textures
  • Support for cube environment mapping
  • Support for projective textures
  • Support for hardware tessellation of smooth surfaces (rectangular and triangular patches)
  • Hardware support for the following types of bump mapping: Emboss, Dot Product3 and EMBM (at last!)
  • Support for S3TC and all 5 DXTC texture compression methods
  • Support for clipping primitives against arbitrary user-defined planes
  • Support for FSAA (with OGMS multisampling)
  • Support for memory bandwidth saving based on a compressed Z-buffer
  • Support for HSR (early Z test)
  • Support for textures up to 4096x4096 @ 32 bit
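
The peak bandwidth figure in the list follows directly from the bus width and the effective DDR transfer rate. Below is a minimal sketch of that arithmetic; the function name is ours and is used only for illustration.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, effective_mhz: float) -> float:
    """Peak memory bandwidth in decimal GBytes/sec."""
    bytes_per_transfer = bus_width_bits / 8   # 128 bit -> 16 bytes per transfer
    transfers_per_sec = effective_mhz * 1e6   # DDR at 250 MHz -> 500 MHz effective
    return bytes_per_transfer * transfers_per_sec / 1e9


print(peak_bandwidth_gb_s(128, 500))  # 8.0
```

The same function can be used to see what a faster memory clock would give on the same 128-bit bus.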

Even a fleeting glance shows that the NV20 is a completely new architecture with full hardware support for DirectX 8.0. I would even say that the DX8 support comes just in time, and it will evidently serve as the basis for promoting the NV20 and the cards built on it.

NVIDIA did not go for a direct increase in raw power (e.g. more rendering pipelines). On the contrary, the engineers did their best to implement a new architecture with full hardware support for DX8 without a considerable increase in the number of transistors. Add to this the 0.15 micron process, which makes it possible to fit around 60 million transistors on the chip without making the die gigantic. A quick transition to the NV20 is helped by NVIDIA's very good relations with vendors and its reliable channels for supplying memory together with the GPU. The reference card design for the NV20 differs little from the PCB design for the GeForce2: the memory chips have moved a few millimeters closer to the graphics core, and the position of the power regulators has changed a bit. DDR memory is available in volume, so there is nothing to worry about.

The core frequency of the NV20 and its number of pixel pipelines and texture units match those of the GeForce2 series chips. This means that the peak pixel fillrate of an NV20-based card will equal that of GeForce2-based cards, but that is no tragedy. The real problem of every GPU since the GeForce 256 lies not in the peak fillrate but in the effective (real) fillrate, which depends directly on local memory bandwidth. It is local memory that has become the bottleneck of all modern graphics accelerators. So I think NVIDIA was right to improve the efficiency of the architecture rather than simply add raw power.

Another interesting feature of the NV20 is its 4 textures per pixel, which will help reveal the potential of programmable shaders and lift 3D realism to a completely new level. But with memory bandwidth as the limit, the chip developers did not spend transistors on 4 texture units per pixel pipeline. They took another route: the results of the two texture units can be kept and combined with the results obtained on the following clock. This means that with 3 or 4 textures the peak performance is halved. In real applications, where everything is limited by memory bandwidth, the penalty will not be so severe.
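
The halving described above can be sketched as a small calculation. This is only an illustration under the assumptions stated in the text (2 texture units per pipeline, one extra clock for the 3rd and 4th textures); the function name and the 200 MHz example clock are ours.

```python
import math


def peak_pixel_fillrate_mpix(core_mhz, pipelines, textures_per_pixel,
                             texture_units_per_pipeline=2):
    """Peak fillrate in million pixels/sec when extra textures cost extra clocks."""
    # Each clock a pipeline can apply as many textures as it has texture units.
    clocks_per_pixel = max(1, math.ceil(textures_per_pixel / texture_units_per_pipeline))
    return core_mhz * pipelines / clocks_per_pixel


for textures in range(1, 5):
    print(textures, peak_pixel_fillrate_mpix(200, 4, textures))
```

With 1 or 2 textures the sketch gives the full rate, with 3 or 4 textures exactly half of it; real figures will be lower whenever memory bandwidth is the limit.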

As for video memory, there is nothing new. The final drivers for the NV20 will probably soften the effects of the limited memory bus bandwidth by enabling special schemes, e.g. deferred texturing, a hierarchical Z-buffer etc. It remains only to get final samples of NV20-based cards and the release version of the drivers and check this in practice.

Let's digress a bit and recall the Xbox. Its official specs sound amazing: a pixel fillrate of 4000 million pixels/sec! The NV2A graphics core used in the Xbox will run at around 250 MHz. Even with 8 pixel pipelines we would not get this figure (250*8=2000 million pixels/sec). And 8 pipelines are unlikely anyway: the Xbox memory bandwidth, around 6.4 GBytes/sec in the current specification, would not be sufficient, so there are again 4. Moreover, the NV2A is known to have 2 texture units per pixel pipeline. So where does this figure come from? In fact, it refers to the 4x multisample mode. There, one computed color value is written to 4 sample positions, so the fillrate counted in samples is 4 times higher than on cards without hardware multisampling (250*4*4=4000). This is how Microsoft and NVIDIA advertise their new products.
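
The arithmetic behind the two figures in this paragraph can be written out as follows. It is a sketch only; the function name is ours, and the clock and pipeline counts are the estimates given in the text, not confirmed specifications.

```python
def sample_fillrate_mpix(core_mhz, pipelines, samples_per_pixel=1):
    """Fillrate in million samples/sec: every computed color covers several samples."""
    return core_mhz * pipelines * samples_per_pixel


print(sample_fillrate_mpix(250, 8))      # 2000: even 8 pipelines fall short
print(sample_fillrate_mpix(250, 4, 4))   # 4000: 4 pipelines in 4x multisample mode
```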

As you know, without new and/or expensive memory technologies the effective fillrate can be raised only by modified rendering algorithms that reduce the bandwidth requirements. Future games will clearly rely more on T&L and hardware hidden surface removal than on their own efficient algorithms. Besides, the growing complexity of models will increase overdraw. The NV20 implements simple hardware support for HSR (an early Z test). Its efficiency depends greatly on the scene sent to the accelerator. If the primitives are sorted front to back, in order of increasing distance from the viewer, far fewer hidden pixels have to be computed and the effective fillrate can rise several times. The NV20 also has Z-buffer compression. According to our information, the algorithm differs somewhat from a hierarchical Z-buffer, but the compressed buffer is divided into tiles that are read, written, processed and cached on the chip as whole units. We can only wait for final samples of NV20 cards and the release drivers to check how effective these features are in real games.
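
Why the submission order matters for an early Z test can be shown with a toy model of a single screen position. This is a simplified sketch, not a description of the NV20 hardware; the function name and the depth values are ours.

```python
def shaded_pixels(depths, early_z=True):
    """Count how many overlapping fragments get shaded at one screen position.

    depths lists the depth of each primitive in submission order
    (a smaller value is closer to the viewer).
    """
    nearest = float("inf")
    shaded = 0
    for depth in depths:
        visible = depth < nearest
        if visible:
            nearest = depth
        # Without early Z every fragment is shaded first and rejected afterwards.
        if visible or not early_z:
            shaded += 1
    return shaded


layers = [1.0, 2.0, 3.0, 4.0]                  # four primitives covering one pixel
print(shaded_pixels(layers))                    # front to back: 1
print(shaded_pixels(layers[::-1]))              # back to front: 4
print(shaded_pixels(layers, early_z=False))     # no early Z: 4
```

With front-to-back order only the nearest fragment is shaded; with back-to-front order the early test rejects nothing and the cost is the same as without it.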

Besides, it is doubtful that Microsoft would opt for exotic and therefore expensive memory types. Where does the figure come from, then? It can also be read as an effective fillrate: the NV2A (and probably the NV20) will have one or more algorithms that optimize rendering by not drawing invisible pixels. This is likely to be some combination of a hierarchical Z-buffer and one more method of hidden surface removal. There are sound grounds for counting on such algorithms, since without new and/or expensive memory technologies only modified shading algorithms that reduce bandwidth requirements can raise the effective fillrate.

So, here is how the claimed 4000 million pixels/sec for the Xbox can be derived this way. The overdraw factor of many modern games is 2-4 or higher. For example, while moving around it reaches 3 in Quake3 and 6 (!) in Unreal. But remember that everything depends on how efficiently hidden primitives are removed. The technique used in the NV2A is unlikely to discard 100% of hidden pixels. So, for 4 pipelines and 60% efficiency we get an overdraw of around 6. This is the most likely figure, and it served as the basis for calculating the Xbox specs. Clearly, for a fixed overdraw the number of pipelines needed depends directly on the efficiency of the hidden pixel removal algorithms! A good algorithm will need only 4 pixel pipelines. Future games written with the Xbox and DX8 in mind will rely more on T&L and hardware hidden surface removal than on their own efficient algorithms. Besides, the growing complexity of models will raise overdraw further. Nevertheless, remember that 4000 million pixels/sec is a relative figure that can be reached only in theory. Meanwhile, the question of whether these tricks are implemented in the NV20, and how effective they are, remains open.

Now for the HW T&L unit. Its performance has increased considerably compared with the GeForce2, and the NV20 undoubtedly uses an efficient vertex cache scheme and a mechanism for transforming and executing lists of vertices and primitives stored in local memory, without any access to the main system memory. These measures minimize (and in the second case remove completely) the limits imposed by AGP bus bandwidth. The more powerful T&L will give an almost twofold gain in games. It will be especially noticeable in games written for the DX 8.0 API, where we will get high detail and shader-based effects. This also applies to games being prepared for the Xbox launch, which will run well on the NV20 and on later accelerators of the DirectX 8.0 generation. Old games will run at practically the same speed as on the GeForce2 family. Some fps growth will be seen in all games that can use HW T&L, e.g. Quake3, but only at low resolutions.

NVIDIA is counting on new effects and features to make people buy new NV20-based cards. And the Xbox will help with that. It will prompt games of a new visual level to appear, and PC users will reach for them by buying accelerators that are practically identical to the Xbox as far as games are concerned. It is a real chance to sell not only a huge number of NV2A chips but also new, expensive NV20-based cards.

For enthusiasts

Now for the internal potential of the NV20 (and consequently of the NV2A). The new features of the NV20 are closely tied to DX8. Here is a comparison table:

 
Parameter NVIDIA NV20 
driver of the 7 series
Radeon DDR 
driver of the 5 series
NVIDIA GeForce2 Ultra 
driver of the 7 series
Max Texture Size
4,096*4,096
2,048*2,048
2,048*2,048
Max Textures Count
4
3
2
Point Sprites
hardware, 
scalable (64x)
hardware(?), 
non-scalable
emulation, non-scalable
Max Texture Stages
8
Max Simultaneous Lights
Max Clip Planes
8
6
0
Max Vertex Blend Matrices
4
2
Max Primitive Count / Max Vertex Index
16,777,215
65,535
Vertex Shader Version
1.1
0.0
Max Vertex Shader Const
96
0
Pixel Shader Version
1.0
0.0
Quintic/RT Patches
Y/Y
N/N
W Depth Values
Yes
Stencil Buffer
Anisotropy Filtering
Cubic Texturing
Volume Texturing
Yes
No
Z test
No
Yes
Shade modes
Color and Specular Gouraud, Alpha Gouraud blend, Fog Gouraud
MultiSample (MS) Types
2x, 4x MS
with OGMS AA
No
No
SuperSample (SS) Types
No
OGSS AA
OGSS AA
Z/MIPMAP Bias
Y/Y
Vertex/Table/Range Fog
Y/Y(Emulation?)/Y
Y/Y/Y
Y/Emulation/Y
W/Z Fog
Y/Y
Bandwidth Saving
Early Z test / Compressed Z
Hierarchical optimized Z
No

If you haven't stopped reading

So, the comments:

  • Max Texture Size - no comments.
  • Max Textures Count - the maximum number of textures used to form a single pixel. In DX8 it is the maximum number of textures that can be used simultaneously in different texture stages of the pipeline.
  • Point Sprites - support for point sprites intended for fast rendering of particles. For more detailed info see our DirectX 8.0 FAQ. Scalable sprites can change their size, non-scalable ones are always shown 1:1. Without special hardware support the sprites are emulated with two small triangles, which cancels out their potential gain (high rendering speed).
  • Max Texture Stages - the pipeline length, i.e. the number of operations that can be used to form the final pixel color from the source parameters (colors of the sampled texels, transparency and lighting values of the vertices interpolated across the triangle surface, bump parameters etc.). With shader support, the number of texture stages can define the length of a shader program, or at least of a program executed efficiently, without stalls. Each pipeline stall (where the architecture allows one at all) reduces performance: one stall and shading is twice as slow, two and it is three times as slow, etc. That is why all first-wave shader accelerators will limit themselves to shaders no longer than the number of texture stages. We will discuss shaders a bit later; for now, here is a table of the operations possible at each stage:
 
  • NVIDIA NV20 (driver of the 7 series): DISABLE, SELECTARG1, SELECTARG2, MODULATE, MODULATE2X, MODULATE4X, ADD, ADDSIGNED, ADDSIGNED2X, SUBTRACT, ADDSMOOTH, BLENDDIFFUSEALPHA, BLENDTEXTUREALPHA, BLENDFACTORALPHA, BLENDTEXTUREALPHAPM, BLENDCURRENTALPHA, PREMODULATE, MODULATEALPHA_ADDCOLOR, MODULATECOLOR_ADDALPHA, MODULATEINVALPHA_ADDCOLOR, MODULATEINVCOLOR_ADDALPHA, BUMPENVMAP, BUMPENVMAPLUMINANCE, DOTPRODUCT3, MULTIPLYADD, LERP
  • Radeon DDR (driver of the 5 series): DISABLE, SELECTARG1, SELECTARG2, MODULATE, MODULATE2X, MODULATE4X, ADD, ADDSIGNED, ADDSIGNED2X, SUBTRACT, BLENDDIFFUSEALPHA, BLENDTEXTUREALPHA, BLENDFACTORALPHA, BLENDTEXTUREALPHAPM, BLENDCURRENTALPHA, MODULATEALPHA_ADDCOLOR, MODULATECOLOR_ADDALPHA, MODULATEINVALPHA_ADDCOLOR, MODULATEINVCOLOR_ADDALPHA, BUMPENVMAP, DOTPRODUCT3, MULTIPLYADD, LERP
  • NVIDIA GeForce2 Ultra (driver of the 7 series): DISABLE, SELECTARG1, SELECTARG2, MODULATE, MODULATE2X, MODULATE4X, ADD, ADDSIGNED, ADDSIGNED2X, SUBTRACT, ADDSMOOTH, BLENDDIFFUSEALPHA, BLENDTEXTUREALPHA, BLENDFACTORALPHA, BLENDTEXTUREALPHAPM, BLENDCURRENTALPHA, PREMODULATE, MODULATEALPHA_ADDCOLOR, MODULATECOLOR_ADDALPHA, MODULATEINVALPHA_ADDCOLOR, MODULATEINVCOLOR_ADDALPHA, DOTPRODUCT3

All these operations are available when a stage processes a color or an alpha value (besides, a stage can be used for interpolating the corresponding parameter or loading it from a texture, vertices etc.)

So, what these operations do:

  • DISABLE - switches off processing (output of results) from the current pipeline stage onwards.
  • SELECTARG1 (or 2) - the result of the stage is one of its input parameters, unmodified
  • MODULATE - the result is the product of the input parameters. Out=In1*In2
  • MODULATE2X (or 4X) - the same, plus scaling, Out=(In1*In2)*2 or *4, correspondingly
  • ADD - addition Out=In1+In2
  • ADDSIGNED - signed addition Out=In1+In2-0.5
  • ADDSIGNED2X - signed addition with scaling Out=(In1+In2-0.5)*2
  • SUBTRACT - Out=In1-In2
  • ADDSMOOTH - smooth addition Out=In1+In2*(1-In1)
  • BLENDDIFFUSEALPHA, BLENDTEXTUREALPHA, BLENDFACTORALPHA, BLENDCURRENTALPHA - alpha blending of the parameters using one of 4 possible Alpha values: CURRENT taken from the previous stage, DIFFUSE taken from the vertices and interpolated across the triangle surface, a value from the TEXTURE, or the constant FACTOR. Correspondingly, Out=In1*Alpha+In2*(1-Alpha)
  • BLENDTEXTUREALPHAPM - a special type of alpha blending for premultiplied alpha, the Alpha value is taken from the texture. Out=In1+In2*(1-Alpha)
  • PREMODULATE - modulates the result of the current stage with the result of the following one (e.g. used for creating flashes).
  • MODULATEALPHA_ADDCOLOR - modulates the color of the second parameter with the alpha of the first. Out=In1RGB+In2RGB*In1Alpha
  • MODULATECOLOR_ADDALPHA - multiplies the colors and adds alpha Out=In1RGB*In2RGB+In1Alpha
  • MODULATEINVALPHA_ADDCOLOR, MODULATEINVCOLOR_ADDALPHA - the same as the two previous ones, correspondingly, but with the modulating term inverted: 1-In1Alpha instead of In1Alpha in the first, 1-In1RGB instead of In1RGB in the second
  • BUMPENVMAP - per-pixel EMBM, the result of the following stage serves as the environment map. See below for the description of texture formats; there are special formats for storing height/offset maps that define the relief.
  • BUMPENVMAPLUMINANCE - the same, but also taking into account a luminance factor stored in the bump texture.
  • DOTPRODUCT3 - the most honest type of per-pixel bump mapping. In fact, a signed dot product of two vectors whose components are stored in the RGB of the input parameters. In1R*In2R+In1G*In2G+In1B*In2B
  • MULTIPLYADD - popular operation Out=In1+In2*In3
  • LERP - linear interpolation Out=(In1)*In2+(1-In1)*In3
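
The formulas in the list above translate almost directly into code. The sketch below applies a few of them to single channel values in the 0..1 range; the helper names are ours, and clamping the stage output to 0..1 is an assumption of this sketch rather than something stated in the list.

```python
def clamp(x: float) -> float:
    """Saturate to [0, 1] (assumed behaviour of a stage output in this sketch)."""
    return max(0.0, min(1.0, x))


# Two-argument operations, written as in the list above.
STAGE_OPS = {
    "SELECTARG1":  lambda a, b: a,
    "SELECTARG2":  lambda a, b: b,
    "MODULATE":    lambda a, b: a * b,
    "MODULATE2X":  lambda a, b: a * b * 2,
    "MODULATE4X":  lambda a, b: a * b * 4,
    "ADD":         lambda a, b: a + b,
    "ADDSIGNED":   lambda a, b: a + b - 0.5,
    "ADDSIGNED2X": lambda a, b: (a + b - 0.5) * 2,
    "SUBTRACT":    lambda a, b: a - b,
    "ADDSMOOTH":   lambda a, b: a + b * (1 - a),
}


def apply_stage(op: str, in1: float, in2: float) -> float:
    """Evaluate one two-argument stage operation on a single channel."""
    return clamp(STAGE_OPS[op](in1, in2))


def blend_alpha(in1: float, in2: float, alpha: float) -> float:
    """BLEND...ALPHA family: only the source of alpha differs between them."""
    return clamp(in1 * alpha + in2 * (1.0 - alpha))


def lerp(in1: float, in2: float, in3: float) -> float:
    """LERP as defined above: In1 selects between In2 and In3."""
    return clamp(in1 * in2 + (1.0 - in1) * in3)


print(apply_stage("MODULATE", 0.5, 0.5))   # 0.25
print(apply_stage("ADDSMOOTH", 0.5, 0.5))  # 0.75
print(lerp(0.5, 1.0, 0.0))                 # 0.5
```

Chaining several such calls, with the output of one used as an input of the next, imitates a multi-stage pipeline configuration.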

This mechanism can be used to program many effects with various numbers of textures. But the fact that different accelerators support different sets of operations greatly undermines it as a means of controlling effects (the NV20 can do everything, the Radeon does well, and the GeForce2 lags far behind them). In any case, pixel shaders are a more flexible and convenient tool.

We continue with comments on the table:

  • Max Simultaneous Lights - the maximum number of light sources processed in hardware. 8 is already standard. But judging by tests of how performance drops as light sources are added, the implementations differ. Let's see what hardware lighting and geometry capabilities our cards provide:
 
  • NVIDIA NV20 (driver of the 7 series): DIRECTIONALLIGHTS, LOCALVIEWER, MATERIALSOURCE7, POSITIONALLIGHTS, TEXGEN
  • Radeon DDR (driver of the 5 series): DIRECTIONALLIGHTS, LOCALVIEWER, MATERIALSOURCE7, POSITIONALLIGHTS, TEXGEN
  • NVIDIA GeForce2 Ultra (driver of the 7 series): DIRECTIONALLIGHTS, LOCALVIEWER, POSITIONALLIGHTS, TEXGEN

Comments:

  • DIRECTIONALLIGHTS - support for infinitely distant light sources, which are defined only by a direction
  • POSITIONALLIGHTS - support for point and spot light sources
  • LOCALVIEWER - support for lighting calculations with a local viewer (highlights computed relative to the actual camera position)
  • MATERIALSOURCE7 - the source of the material colors can be selected, e.g. taken from the vertices of a primitive
  • TEXGEN - hardware generation of texture coordinates

OK, let's return to the main table:

  • Max Clip Planes - the number of user-defined planes for clipping primitives. A plane is defined by 4 coefficients (A, B, C, D); points for which Ax + By + Cz + Dw >= 0 (w being the fourth, homogeneous coordinate) lie on the visible side of the plane, and geometry on the other side is clipped and not rendered. The GeForce2 clearly falls behind here. The NV20 shows an interesting redundancy: even an arbitrary box-shaped region needs only 6 planes.
  • Max Vertex Blend Matrices - the maximum number of matrices applied simultaneously to a vertex in multi-matrix coordinate blending. In DX7, matrix blending (single-skin) with 4 matrices was possible. Unfortunately, the GeForce/GeForce2 support only two. In DX8 you can use a set of up to 256 matrices selected by index, with a limit of 4 matrices per vertex. But the current drivers of the NV20, Radeon and GeForce2 do not support such indexing.
  • Max Primitive Count / Max Vertex Index - all these accelerators can fetch, transform, light and render lists of primitives, greatly offloading the CPU. This parameter defines the maximum size of a list of primitives or vertices.
  • Vertex Shader Version - for a start comes an illustration from the DX8 SDK documentation:

Here you can see a black box: version 1.1 on the NV20, which is missing in the Radeon and GeForce2. In fact, the Radeon and GeForce2 have a vertex ALU that can execute shaders, though it is not fully compatible with the final standard. You could call these shaders version 0.5. By the way, on the Radeon they can be switched on with a special registry key, but then most of the samples from the DX8 SDK glitch, and only some shaders run as they should, because the compatibility is only partial. Be that as it may, Microsoft did not introduce a notion of "version 0.5" shaders, so we credit vertex shaders only to the NV20. A shader works with constants (the next line of the table, Max Vertex Shader Const, gives their number); the NV20 has 96 of them, along with 16 input registers and 12 temporary registers. As the shader program executes (its length is limited to 128 instructions on the NV20, though this may differ on other chips), it processes these data and produces 4 sets of texture coordinates, two color values for the vertex and the resulting vertex coordinates. A pixel shader, or a user-selected configuration of texture stages, then works with these data while the primitive is rendered:

I'm not going to touch on the performance of this unit. I just want to note that it is very easy to build a parallel unit that processes several vertices simultaneously, all running the same shader program. I think that is how it works here.

By the way, a shader has many constants at its disposal (96 on the NV20). Nothing prevents us from writing a shader that implements blending with an arbitrary degree of flexibility, e.g. using 96/4 = 24 matrices. But remember the limit on the number of shader instructions. Taking it into account, we get about 20 matrices per vertex. Though for a single skin this is of little use.
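
The two limits mentioned here can be put side by side in a short calculation. This is a rough sketch: the function names are ours, and the cost of 6 instructions per blended matrix is a purely illustrative guess, not a measured or documented figure.

```python
def palette_size(constant_registers: int, registers_per_matrix: int = 4) -> int:
    """How many 4x4 matrices fit into the constant registers (4 registers each)."""
    return constant_registers // registers_per_matrix


def matrices_within_budget(instruction_limit: int, instructions_per_matrix: int) -> int:
    """How many matrices one vertex can blend before the program gets too long."""
    return instruction_limit // instructions_per_matrix


print(palette_size(96))                # 24 matrices fit into 96 constants
print(matrices_within_budget(128, 6))  # 21 with a hypothetical cost of 6 instructions each
```

Whichever limit is lower decides how many matrices a single vertex can really use.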

Again comes the main table:

  • Pixel Shader Version - this is the second type of shaders. If you enable pixel shaders on the Radeon, the picture is more optimistic this time. What it has in hardware seems very close to version 1.0, and many DX8 samples work well. For those who want to experiment, here are the registry keys for the Radeon:
    HKEY_LOCAL_MACHINE\SOFTWARE\ATI_Technologies\Driver\0000\atidxhal
    (instead of 0000 it can be 0001 etc.)

    string VertexShaderVersion = "10"
    (nearly all samples that use it glitch)

    string PixelShaderVersion = "10"
    (it works better, you can see nearly all examples running in hardware)

    string PureDevice = "1" - enables "PURE device" mode
    In this mode (the fastest) the accelerator stores, transforms and executes lists of primitives and vertices in local memory.

The place of pixel shaders in the overall rendering picture:

Simple but tasteful. A shader is evaluated at every shaded point of a triangle, so it must execute as fast as possible. It has access to 8 constants, two colors (interpolated across the surface of the primitive), the texture stages and two temporary registers. Its task is to compute the resulting color of the pixel. A large number of operations are available, similar to those described above for texture stages but more flexible. Operations can process several components in parallel. Their maximum number equals the number of texture stages (only 8 on the NV20). A pixel shader is really a configuration of pipeline stages, and it executes efficiently, producing one result per clock. And again, nothing prevents several identically configured pipelines from working in parallel in the hardware.

Well, again the main table:

  • Quintic/RT Patches - hardware support for tessellation of smooth surfaces! Two new high-order primitives have been added: rectangular and triangular patches. DX8 has two corresponding calls for rendering them, and you can control the level of detail when the smooth patch surfaces are divided into triangles. It seems that the NV20 performs tessellation in hardware. Either the result of tessellation is kept in the local memory of the accelerator as a list of triangle primitives, or the triangle parameters are generated dynamically and go straight to rendering without being saved anywhere. In the second case, the work of the programmer becomes simpler and the memory load is lower.
  • W Depth Values - support for an alternative format of depth values (W format).
  • Stencil Buffer - support for a stencil buffer. The accelerators can perform the following operations on stencil buffer values:
 
  • NVIDIA NV20 (driver of the 7 series): KEEP, ZERO, REPLACE, INCRSAT, DECRSAT, INVERT, INCR, DECR
  • Radeon DDR (driver of the 5 series): KEEP, ZERO, REPLACE, INCRSAT, INVERT, DECR
  • NVIDIA GeForce2 Ultra (driver of the 7 series): KEEP, ZERO, REPLACE, INCRSAT, DECRSAT, INVERT, INCR, DECR

Detailed info on operations:

  • KEEP - do not change the value in the buffer
  • ZERO - set 0 in all rendered pixels of the primitive
  • REPLACE - write a specified value
  • INCRSAT - increase by one. Once the maximum is reached, the value no longer changes.
  • DECRSAT - decrease by one. Once 0 is reached, the value no longer changes.
  • INVERT - invert all bits of the value
  • INCR, DECR - the same as INCRSAT and DECRSAT but with wraparound. For example, when the maximum is reached the value is set to 0
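
The difference between the saturating and the wrapping variants is easy to see in code. Below is a small sketch that models the operations on an 8-bit stencil value; the function name and the `ref` parameter (the value written by REPLACE) are our own naming.

```python
def stencil_op(op: str, value: int, ref: int = 0, bits: int = 8) -> int:
    """Apply one stencil operation to a stencil value of the given bit width."""
    max_value = (1 << bits) - 1
    if op == "KEEP":
        return value
    if op == "ZERO":
        return 0
    if op == "REPLACE":
        return ref
    if op == "INCRSAT":
        return min(value + 1, max_value)   # stops at the maximum
    if op == "DECRSAT":
        return max(value - 1, 0)           # stops at zero
    if op == "INVERT":
        return ~value & max_value          # flip every bit
    if op == "INCR":
        return (value + 1) & max_value     # wraps from the maximum to 0
    if op == "DECR":
        return (value - 1) & max_value     # wraps from 0 to the maximum
    raise ValueError(f"unknown stencil operation: {op}")


print(stencil_op("INCRSAT", 255))  # 255
print(stencil_op("INCR", 255))     # 0
print(stencil_op("DECR", 0))       # 255
```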

Now comes the main table again:

  • Anisotropy Filtering - it seems to be implemented in the NV20 the same way as in the GeForce2, and it looks worse than on the Radeon. A bit later we will give a table with all available filtering modes for the different texture types.
  • Cubic Texturing - hardware support of cubic texturing (cube environment mapping).
  • Volume Texturing - hardware support for volume texturing.

Here comes a table with possible filtering modes for three possible texture types, with usage of MIP levels and without:

 
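Filtering modes for NVIDIA NV20 (driver of the 7 series), Radeon DDR (driver of the 5 series) and NVIDIA GeForce2 Ultra (driver of the 7 series):

  • Standard Texture Filters
    Min/Mag: Point, Linear, Anisotropic on all three cards
    MIPMAP: Point, Linear on all three cards
  • Cube Texture Filters
    Min/Mag: NV20 - Point, Linear, Anisotropic; Radeon DDR - Point, Linear; GeForce2 Ultra - Point, Linear, Anisotropic
    MIPMAP: NV20 - Point, Linear; Radeon DDR - No; GeForce2 Ultra - Point, Linear
  • Volume Texture Filters
    Min/Mag: NV20 - Point, Linear, Anisotropic; Radeon DDR - Point, Linear; GeForce2 Ultra - No
    MIPMAP: NV20 - Point, Linear; Radeon DDR - No; GeForce2 Ultra - No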

Here the NV20 takes the lead...

Let's return to the parameters:

  • Z test - the ability to report, after a primitive has been rendered, whether at least one of its pixels is visible
  • Shade modes - here all three cards are equal.
  • MultiSample Types - the available multisampling modes. Only the NV20 has them.
  • SSAA (Super Sample Anti-Aliasing) Types - possible modes of full screen AA.
  • Z/MIPMAP Bias - possibility of biasing of MIP level or a depth.
  • Vertex/Table/Range Fog - supported types of fog.
  • W/Z Fog - possible depth formats for the fog.
  • Bandwidth Saving - methods of saving of memory bandwidth.

Look at a table with texturing parameters:

 
  • NVIDIA NV20 (driver of the 7 series): PERSPECTIVE, POW2, ALPHA, ALPHAPALETTE, PROJECTED, CUBEMAP, VOLUMEMAP, MIPMAP, MIPVOLUMEMAP, MIPCUBEMAP, CUBEMAP_POW2, VOLUMEMAP_POW2
  • Radeon DDR (driver of the 5 series): PERSPECTIVE, POW2, ALPHA, PROJECTED, CUBEMAP, VOLUMEMAP, MIPMAP, CUBEMAP_POW2, VOLUMEMAP_POW2
  • NVIDIA GeForce2 Ultra (driver of the 7 series): PERSPECTIVE, POW2, ALPHA, PROJECTED, CUBEMAP, MIPMAP, MIPCUBEMAP, CUBEMAP_POW2

Comments:

  • PERSPECTIVE - hardware perspective correction.
  • POW2, CUBEMAP_POW2, VOLUMEMAP_POW2 - textures of the corresponding type must have dimensions that are powers of 2.
  • MIPMAP - support for mipmapping of standard textures
  • CUBEMAP, VOLUMEMAP - support for cube environment maps and volume textures correspondingly
  • MIPVOLUMEMAP, MIPCUBEMAP - the same, plus mipmap levels. The NV20 excels here again.
  • ALPHA - support for an alpha channel in a texture
  • ALPHAPALETTE - support for an alpha channel in a palette

And finally, the supported formats for frame buffers, depth buffers and the different types of textures:

 
Cards: NVIDIA NV20 (driver of the 7 series), Radeon DDR (driver of the 5 series), NVIDIA GeForce2 Ultra (driver of the 7 series)

  • Depth/Stencil Formats
    NV20: D24S8, D16, D24X8
    Radeon DDR: D32, D24S8, D16, D24X8
    GeForce2 Ultra: D24X8, D24S8, D16 (standard/lockable)
  • Render Target Formats
    NV20: A8R8G8B8, X8R8G8B8, R5G6B5, X1R5G5B5
    Radeon DDR: A8R8G8B8, X8R8G8B8, R5G6B5, A1R5G5B5, A4R4G4B4, R3G3B2
    GeForce2 Ultra: A8R8G8B8, X8R8G8B8, R5G6B5, X1R5G5B5
  • Texture Formats
    NV20: A8R8G8B8, X8R8G8B8, R5G6B5, X1R5G5B5, A1R5G5B5, A4R4G4B4, P8, V8U8, L6V5U5, X8L8V8U8, Q8W8V8U8, DXT1, DXT2, DXT3, DXT4, DXT5
    Radeon DDR: A8R8G8B8, X8R8G8B8, R5G6B5, X1R5G5B5, A1R5G5B5, A4R4G4B4, R3G3B2, V8U8, DXT1, DXT2, DXT3, DXT4, DXT5
    GeForce2 Ultra: A8R8G8B8, X8R8G8B8, R5G6B5, X1R5G5B5, A1R5G5B5, A4R4G4B4, P8, DXT1, DXT2, DXT3, DXT4, DXT5
  • Cube Texture Formats
    NV20: A8R8G8B8, X8R8G8B8, R5G6B5, X1R5G5B5, A1R5G5B5, A4R4G4B4, P8, V8U8, L6V5U5, X8L8V8U8, Q8W8V8U8, DXT1, DXT2, DXT3, DXT4, DXT5
    Radeon DDR: A8R8G8B8, X8R8G8B8, R5G6B5, X1R5G5B5, A1R5G5B5, A4R4G4B4, R3G3B2, DXT1, DXT2, DXT3, DXT4, DXT5
    GeForce2 Ultra: A8R8G8B8, X8R8G8B8, R5G6B5, X1R5G5B5, A1R5G5B5, A4R4G4B4, P8, DXT1, DXT2, DXT3, DXT4, DXT5
  • Volume Texture Formats
    NV20: A8R8G8B8, X8R8G8B8, R5G6B5, X1R5G5B5, A1R5G5B5, A4R4G4B4, P8, R5G6B5
    Radeon DDR: A8R8G8B8, X8R8G8B8, R5G6B5, X1R5G5B5, A1R5G5B5, A4R4G4B4, R3G3B2, DXT1, DXT2, DXT3, DXT4, DXT5
    GeForce2 Ultra: -

Comments on texture and buffer formats:

  • The letters mean the type of data stored (e.g. R/G/B - color components). A digit after a letter - number of bits (R8G8B8 True color)
  • RGB - color components.
  • D - depth in the Z or W format
  • A - alpha channel, i.e. transparency
  • X - unused value
  • DXT1..5 - compressed textures (by the corresponding method)
  • QWUV - relief parameters (BumpMap)
  • L - luminance
  • P - index in a palette
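
The naming scheme described above is regular enough to be decoded mechanically. Here is a minimal sketch that splits an uncompressed format name into channels and sums the bits; the function names are ours, and the DXT formats are deliberately rejected because their names do not follow the letter-plus-bits pattern.

```python
import re


def parse_format(name: str):
    """Split a name such as 'A8R8G8B8' into (channel letter, bit count) pairs."""
    if name.startswith("DXT"):
        raise ValueError("compressed DXT formats do not encode per-channel bits")
    return [(letter, int(bits)) for letter, bits in re.findall(r"([A-Z])(\d+)", name)]


def bits_per_pixel(name: str) -> int:
    """Total size of one pixel (or texel) in bits."""
    return sum(bits for _, bits in parse_format(name))


print(parse_format("A8R8G8B8"))   # [('A', 8), ('R', 8), ('G', 8), ('B', 8)]
print(bits_per_pixel("R5G6B5"))   # 16
print(bits_per_pixel("D24S8"))    # 32
```

For example, a 4096x4096 texture in a 32-bit format takes 4096 * 4096 * bits_per_pixel("A8R8G8B8") / 8 bytes, i.e. 64 MBytes without mipmap levels.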

Well, in terms of features the NV20 has turned out well (unlike the GeForce2). NVIDIA confirms its standing as a technological leader.

How it will do for speed is not yet clear; much will depend on memory and on programmers. I want to note the tighter integration of the API (DX8) with the hardware and the wide range of technological innovations. Together with the Xbox, the NV2x series is clearly capable of provoking a real revolution in special effects in games. The main thing is that game developers manage to create new games or rework current ones; it is good that the upcoming Xbox guarantees the availability of games designed for the DX8 API features.

The first NV20-based cards are expected from ASUS, Leadtek, Elsa and Hercules in March, with GigaByte, MSI and many others following in April-May.

At the very beginning of sales, NV20-based cards with 64 MBytes will cost around $450-500, but within a month the price will gradually start to come down. Interestingly, the roadmaps of many respected companies list only cards with 128 MBytes of memory.


Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.