NVIDIA NV20 GPU in-depth feature analysis

General

It is no secret that on February 27, at the Intel Developer Forum (IDF), NVIDIA is due to officially announce its new GPU, known as the NV20. Everyone wants to know what is new in this graphics processor, so we are going to lift the veil of secrecy. Note that we will not give performance results, since our engineering sample is still rather raw and the drivers are unfinished. The functional features of the NV20, however, we can discuss in full today.

First comes the NV20 specification (we list only the most interesting parameters, so as not to repeat everything already supported by the GeForce2).

NV20 specification

Technology: 0.15 micron
60 million transistors
Graphics core frequency: 200+ MHz (later "Ultra" and "Pro" variants are possible, e.g. at 250 MHz)
Rendering pixel pipelines: 4
Texture blocks per rendering pipeline: 2
4 textures per pixel
Memory interface: 128 bit
Supported memory: DDR SDRAM/SGRAM
At the time of release the first NV20-based cards will be equipped with memory running at 250 (500) MHz
Peak bandwidth of the memory bus at 250 MHz: 8 GBytes/sec
Supported local memory size: up to 128 MBytes (the first cards, as our sample, will have 64 MBytes)
RAMDAC: 350 MHz
Max resolution: 2048x1536@75 Hz
Integrated TMDS transmitter allows connecting digital monitors and supports resolution up to 1600x1200
External bus interface: full support of AGP x2/x4 (including SBA, DME and Fast Writes) and PCI 2.2 (including Bus mastering).
Hardware T&L: effective performance of 40+ million triangles per second (our sample delivers slightly under 40 million triangles per second in a special synthetic test)
Full hardware support for all MS DirectX 8.0 and OpenGL 1.2 features
Full support for hardware VertexShaders DX8, 1.1 version
Full support for hardware PixelShaders DX8, 1.0 version
Support for volume textures
Support for Cube environment mapping
Support for projective textures
Support for hardware tessellation of smooth surfaces (rectangular and triangular patches)
Hardware support for relief texturing of the following types: Embossing, Dot Product3 and EMBM (at last!)
Support for S3TC and all 5 DXTC texture compression methods
Support for clipping primitives against arbitrary user-defined planes
Support for FSAA (with OGMS MultiSampling)
Support for memory bandwidth saving based on compressed Z-buffer
Support for HSR (early Z test)
Support for textures up to 4096x4096 @ 32 bit
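
The peak memory bandwidth quoted in the specification follows from simple arithmetic on the bus width and the memory clock. The sketch below reproduces it; the function name is ours and purely illustrative, and it assumes DDR memory transfers data twice per clock.

```python
def peak_bandwidth_bytes_per_sec(bus_width_bits, clock_mhz, transfers_per_clock=2):
    """Peak bus bandwidth; transfers_per_clock=2 models DDR memory."""
    bytes_per_transfer = bus_width_bits // 8
    return bytes_per_transfer * clock_mhz * 1_000_000 * transfers_per_clock


# 128-bit bus, 250 MHz DDR (500 MHz effective), as in the specification
nv20_bandwidth = peak_bandwidth_bytes_per_sec(128, 250)
print(nv20_bandwidth / 1e9, "GBytes/sec")
```

With the figures from the list this gives the same 8 GBytes/sec as the specification.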

Even a fleeting glance is enough to conclude that the NV20 has a completely new architecture providing full hardware support for DirectX 8.0. I would even say that DX8 support arrives just in time: it looks set to be the basis for promoting the NV20 and the cards built on it.

NVIDIA did not take the route of a brute-force power increase (e.g. adding rendering pipelines). Instead, the engineers did their best to implement a new architecture with full hardware support for DX8 without a considerable increase in the transistor count. On top of that comes the 0.15 micron process, which makes it possible to fit around 60 million transistors on the chip without making the die gigantic. A quick transition to the NV20 is backed by NVIDIA's very good relations with vendors and its reliable channels for supplying memory together with the GPU. The reference card design for the NV20 does not differ much from the GeForce2 PCB design: the memory chips are moved several mm closer to the graphics core, and the position of the power regulators has changed a little. DDR memory is available in volume, so there is nothing to worry about on that front.

The NV20's core frequency and its number of pixel pipelines and texture units match those of the GeForce2 series chips. This means the theoretical pixel fillrate of an NV20-based card will equal that of GeForce2-based cards, but this is no tragedy. The problem of every GPU since the GeForce 256 lies not in the theoretical fillrate but in the effective (real) fillrate, which is directly tied to local memory bandwidth. It is local memory that has become the bottleneck of all modern graphics accelerators. So I think NVIDIA was right to improve the efficiency of the architecture rather than simply add raw power.

Another interesting feature of the NV20 is 4 textures per pixel, which will help unlock the potential of programmable shaders and lift 3D realism to a completely new level. But being limited by memory bandwidth, the chip designers did not spend transistors on 4 texture units per pixel pipeline. They took another route: the results of the two texture units can be kept and combined with the results obtained on the following clock. This means that with 3 or 4 textures the peak performance is halved. In real applications, however, where everything is limited by memory bandwidth, the penalty will not be so severe.

As for the video memory, there is nothing new. Final drivers for the NV20 will probably soften the undesirable effects of the limited memory bus bandwidth by enabling special schemes, e.g. deferred texturing, hierarchical Z-buffer etc. All that remains is to get final samples of NV20-based cards and release drivers and check this in practice.
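
To illustrate the difference between theoretical and effective fillrate, here is a rough sketch. The per-pixel memory traffic of 12 bytes (a 32-bit color write plus a 32-bit Z read and a 32-bit Z write, with texture fetches ignored) is our own assumption for illustration, not a measured or published figure, and the function names are hypothetical.

```python
def theoretical_fillrate(pipelines, core_mhz):
    """Pixels per second if every pipeline outputs one pixel per clock."""
    return pipelines * core_mhz * 1_000_000


def bandwidth_limited_fillrate(bandwidth_bytes_per_sec, bytes_per_pixel):
    """Pixels per second that the memory bus alone can sustain."""
    return bandwidth_bytes_per_sec / bytes_per_pixel


def effective_fillrate(pipelines, core_mhz, bandwidth_bytes_per_sec, bytes_per_pixel):
    """The lower of the two limits wins."""
    return min(theoretical_fillrate(pipelines, core_mhz),
               bandwidth_limited_fillrate(bandwidth_bytes_per_sec, bytes_per_pixel))


# Assumption: 4 bytes color write + 4 bytes Z read + 4 bytes Z write per pixel
ASSUMED_BYTES_PER_PIXEL = 12

peak = theoretical_fillrate(4, 200)
real = effective_fillrate(4, 200, 8_000_000_000, ASSUMED_BYTES_PER_PIXEL)
print(peak // 1_000_000, "Mpix/s theoretical")
print(round(real / 1_000_000), "Mpix/s at most under the assumed memory traffic")
```

Even with this optimistic traffic estimate the memory bus, not the pixel pipelines, sets the limit; adding texture reads would lower the figure further.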
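
The cost of the texture scheme described above can be sketched as follows, assuming each pipeline applies up to two textures per clock and needs an extra clock for every further pair. The function names are illustrative only.

```python
import math


def clocks_per_pixel(textures, units_per_pipeline=2):
    """Clocks one pipeline spends on a pixel with the given texture count."""
    return max(1, math.ceil(textures / units_per_pipeline))


def relative_peak_performance(textures, units_per_pipeline=2):
    """Peak pixel throughput relative to the single-clock case."""
    return 1 / clocks_per_pixel(textures, units_per_pipeline)


for count in (1, 2, 3, 4):
    print(count, "textures:", clocks_per_pixel(count), "clock(s)")
```

One or two textures take a single clock, while three or four take two, which is the halving of peak performance mentioned above.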

Let's digress a bit to recall the XBox. The official XBox specs sound amazing: a pixel fillrate of 4000 million pix/sec! The NV2A graphics core used in the XBox will run at around 250 MHz. Even with 8 pixel pipelines we would not get this figure (250*8=2000 million pixels). And 8 pipelines are unlikely: the XBox memory bandwidth would not be sufficient (in the current specification it is around 6.4 GBytes/sec), so there are again 4. Moreover, the NV2A is known to have 2 texture units per pixel pipeline. So where does this figure come from? In fact, we are talking about the 4x Multi-Sample mode. There, 4 sample positions correspond to one calculated color value, so the fillrate counted in samples is 4 times higher than on cards without hardware Multi-Sample support. This is how Microsoft and NVIDIA advertise their new products.

The effective fillrate can be raised without new and/or expensive memory technologies only by using modified rendering algorithms that reduce the bandwidth requirements. Future games will obviously rely more on T&L and hardware hidden surface removal than on their own efficient algorithms. Besides, the increased complexity of models will raise the OverDraw. The NV20 implements simple hardware support for HSR (an early Z test). Its efficiency greatly depends on the scene sent to the accelerator. If the primitives are sorted front to back, i.e. in order of increasing distance from the viewer, the need to calculate hidden pixels drops considerably and the effective fillrate rises several times. In addition, the NV20 is equipped with Z-buffer compression. According to our information, the algorithm differs somewhat from a hierarchical Z: the compressed buffer is divided into tiles, which are read, written, processed and cached on the chip as a whole. We only have to wait for final samples of NV20-based cards and release drivers to check the effectiveness of these features in real games.

Besides, it is doubtful that MS would opt for exotic and therefore expensive memory types. Where does this figure come from, then? It actually concerns the effective fillrate: the NV2A (and probably the NV20) will come with one or more algorithms that optimize rendering by not drawing invisible pixels. It is likely to be some combination of a hierarchical Z-buffer and one more method of hidden surface removal. There are good grounds for counting on such modified algorithms: without new and/or expensive memory technologies, they are the only way to raise the effective fillrate.

So, let us now show where the claimed 4000 million pix/sec for the XBox comes from. The OverDraw factor of many modern games is in the range of 2-4 or higher. For example, while moving it reaches 3 in Quake3 and 6 (!) in Unreal. But remember that everything depends on how effectively hidden primitives are removed. The technique used in the NV2A is unlikely to discard 100% of hidden pixels. So, for 4 pipelines and 60% effectiveness we get an OverDraw of around 6. This figure is the most likely one, and it served as the basis for calculating the XBox specs. Clearly, the number of pipelines needed (for a fixed OverDraw) depends directly on the effectiveness of the hidden pixel removal algorithms! A good algorithm will need only 4 pixel pipelines. Future games created with the XBox and DX8 in mind will rely more on T&L and hardware hidden surface removal than on their own efficient algorithms. Besides, the increased complexity of models will raise the OverDraw further. Nevertheless, remember that 4000 million pix/sec is a relative figure that can be reached only in theory. Meanwhile, the question of whether these tricks are implemented in the NV20, and how effective they are, remains open.

Now for the HW T&L unit. Its performance has increased considerably compared to the GeForce2, and the NV20 undoubtedly uses an effective vertex cache scheme as well as a mechanism for transforming and executing lists of vertices and primitives stored in local memory, without accessing main system memory at all. These measures minimize (and in the second case completely avoid) the limitations imposed by AGP bus bandwidth. The more powerful T&L will give almost a twofold gain in games. This will be especially noticeable in games written for the DX 8.0 API, where we will get high detail and shader-based effects. The same applies to games being prepared for the XBox launch: they will run well on the NV20 and on subsequent accelerators of the DX 8.0 generation. Old games will run at practically the same speed as on the GeForce2 family. Some fps growth will be seen in all games that can use HW T&L, e.g. Quake3, but only at low resolutions.

NVIDIA is counting on new effects and features to make people buy new NV20-based cards, and the XBox will help with that. It will prompt the appearance of games on a new visual level, and PC users will reach for them by buying accelerators that are practically identical to the XBox (as far as games are concerned). It is a real chance to sell not only a huge number of NV2A chips but also new, expensive NV20-based cards.
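
The arithmetic behind the advertised figure can be written out explicitly. This is only a sketch of the reasoning above: the function names are ours, and counting multisample positions as "pixels" is the interpretation given in this article, not an official formula.

```python
def pixel_rate_mpix(pipelines, core_mhz):
    """Millions of pixels per second, one pixel per pipeline per clock."""
    return pipelines * core_mhz


def sample_rate_msamples(pipelines, core_mhz, samples_per_pixel):
    """Millions of multisample positions covered per second."""
    return pixel_rate_mpix(pipelines, core_mhz) * samples_per_pixel


print(pixel_rate_mpix(8, 250))          # the hypothetical 8-pipeline case
print(pixel_rate_mpix(4, 250))          # the likely 4-pipeline NV2A
print(sample_rate_msamples(4, 250, 4))  # the same chip counted in 4x MS samples
```

Only the last line reaches 4000, which matches the figure quoted for the XBox.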
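
Why the submission order matters for an early Z test can be shown with a toy model of a single screen pixel covered by several primitives. This is a simplified sketch, not a description of the NV20 hardware: it assumes a smaller depth value means closer to the viewer and that a fragment is shaded only if it passes the depth test at the moment it arrives.

```python
def shaded_fragments(depths):
    """Count fragments that pass an early Z test, in submission order."""
    nearest = float("inf")
    shaded = 0
    for z in depths:
        if z < nearest:   # closer than everything drawn so far: must be shaded
            nearest = z
            shaded += 1
    return shaded


# Six primitives overlap one pixel (OverDraw of 6)
layers = [0.9, 0.5, 0.7, 0.2, 0.4, 0.6]

print(shaded_fragments(sorted(layers)))                # front to back
print(shaded_fragments(sorted(layers, reverse=True)))  # back to front
print(shaded_fragments(layers))                        # arbitrary order
```

Sorted front to back, only one of the six fragments is shaded; sorted back to front, all six are, and the early Z test saves nothing.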

For enthusiasts

Now for the internal potential of the NV20 (and consequently the NV2A). The new capabilities of the NV20 are closely tied to DX8. Here is a comparison table:

Parameter	NVIDIA NV20 driver of the 7 series	Radeon DDR driver of the 5 series	NVIDIA GeForce2 Ultra driver of the 7 series
Max Texture Size	4,096*4,096	2,048*2,048	2,048*2,048
Max Textures Count	4	3	2
Point Sprites	hardware, scalable (64x)	hardware(?), non-scalable	emulation, non-scalable
Max Texture Stages	8
Max Simultaneous Lights	8
Max Clip Planes	8	6	0
Max Vertex Blend Matrices	4		2
Max Primitive Count / Max Vertex Index	16,777,215	65,535
Vertex Shader Version	1.1	0.0
Max Vertex Shader Const	96	0
Pixel Shader Version	1.0	0.0
Quintic/RT Patches	Y/Y	N/N
W Depth Values	Yes
Stencil Buffer
Anisotropy Filtering
Cubic Texturing
Volume Texturing	Yes		No
Z test	No	Yes	No
Shade modes	Color and Specular Gouraud, Alpha Gouraud blend, Fog Gouraud
MultiSample (MS) Types	2x, 4x MS with OGMS AA	No	No
SuperSample (SS) Types	No	OGSS AA	OGSS AA
Z/MIPMAP Bias	Y/Y
Vertex/Table/Range Fog	Y/Y(Emulation?)/Y	Y/Y/Y	Y/Emulation/Y
W/Z Fog	Y/Y
Bandwidth Saving	Early Z test / Compressed Z	Hierarchical optimized Z	No

If you haven't stopped reading

So, the comments:

Max Texture Size - no comments.
Max Textures Count - the maximum number of textures used in forming a single pixel. In DX8 it is the maximum number of textures that can be used simultaneously in different texture stages of the pipeline.
Point Sprites - support for point sprites, intended for fast rendering of particles. For more detailed info see our DirectX 8.0 FAQ. Scalable sprites can change their size; non-scalable ones are always drawn 1:1. Without special hardware support, sprites are emulated with two small triangles, which cancels out their potential gain (high rendering speed).
Max Texture Stages - the pipeline length, i.e. the number of operations that can be used to form the color of the resulting pixel from the source parameters (color of the sampled texels, transparency and lighting values of the vertices interpolated across the triangle surface, relief parameters etc.). With shader support, the number of texture stages can define the length of a shader program, or at least the length of a program executed efficiently, without stalls. Each pipeline stall (if the architecture allows one at all) reduces performance: one stall and shading runs twice as slowly, two stalls - three times as slowly, etc. That is why all first-wave shader accelerators will limit themselves to shaders no longer than the number of texture stages. We will discuss shaders a bit later; for now, here is a table of the operations possible at each stage:

NVIDIA NV20 driver of the 7 series	Radeon DDR driver of the 5 series	NVIDIA GeForce2 Ultra driver of the 7 series
DISABLE SELECTARG1 SELECTARG2 MODULATE MODULATE2X MODULATE4X ADD ADDSIGNED ADDSIGNED2X SUBTRACT ADDSMOOTH BLENDDIFFUSEALPHA BLENDTEXTUREALPHA BLENDFACTORALPHA BLENDTEXTUREALPHAPM BLENDCURRENTALPHA PREMODULATE MODULATEALPHA_ADDCOLOR MODULATECOLOR_ADDALPHA MODULATEINVALPHA_ADDCOLOR MODULATEINVCOLOR_ADDALPHA BUMPENVMAP BUMPENVMAPLUMINANCE DOTPRODUCT3 MULTIPLYADD LERP	DISABLE SELECTARG1 SELECTARG2 MODULATE MODULATE2X MODULATE4X ADD ADDSIGNED ADDSIGNED2X SUBTRACT BLENDDIFFUSEALPHA BLENDTEXTUREALPHA BLENDFACTORALPHA BLENDTEXTUREALPHAPM BLENDCURRENTALPHA MODULATEALPHA_ADDCOLOR MODULATECOLOR_ADDALPHA MODULATEINVALPHA_ADDCOLOR MODULATEINVCOLOR_ADDALPHA BUMPENVMAP DOTPRODUCT3 MULTIPLYADD LERP	DISABLE SELECTARG1 SELECTARG2 MODULATE MODULATE2X MODULATE4X ADD ADDSIGNED ADDSIGNED2X SUBTRACT ADDSMOOTH BLENDDIFFUSEALPHA BLENDTEXTUREALPHA BLENDFACTORALPHA BLENDTEXTUREALPHAPM BLENDCURRENTALPHA PREMODULATE MODULATEALPHA_ADDCOLOR MODULATECOLOR_ADDALPHA MODULATEINVALPHA_ADDCOLOR MODULATEINVCOLOR_ADDALPHA DOTPRODUCT3
All these operations are available when a stage processes a color or alpha value (besides, a stage can be used for interpolating or loading the corresponding parameter from a texture, vertices etc.)

So, what do these operations do:

DISABLE - disables processing (output of results) from the current pipeline stage onwards.
SELECTARG1 (or 2) - the result of this stage is one of its input parameters, unmodified
MODULATE - the result is the product of the input parameters. Out=In1*In2
MODULATE2X (or 4X) - the same, plus scaling, Out=(In1*In2)*2 or *4 respectively
ADD - addition Out=In1+In2
ADDSIGNED - signed addition Out=In1+In2-0.5
ADDSIGNED2X - signed addition with scaling Out=(In1+In2-0.5)*2
SUBTRACT - Out=In1-In2
ADDSMOOTH - smoothed addition Out=In1+In2*(1-In1)
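BLENDDIFFUSEALPHA, BLENDTEXTUREALPHA, BLENDFACTORALPHA, BLENDCURRENTALPHA - alpha blending of the input parameters, with the Alpha value taken from the source named in the operation: DIFFUSE, TEXTURE, FACTOR or CURRENT. Out=In1*Alpha+In2*(1-Alpha)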

BLENDTEXTUREALPHAPM - a special type of alpha blending; the Alpha value is taken from the texture. Out=In1+In2*(1-Alpha)
PREMODULATE - modulates the result of the current stage with the result of the following one (used e.g. for creating flashes).
MODULATEALPHA_ADDCOLOR - modulates the color of the second parameter with the alpha of the first. Out=In1RGB+In2RGB*In1Alpha
MODULATECOLOR_ADDALPHA - multiplies the colors and adds alpha. Out=In1RGB*In2RGB+In1Alpha
MODULATEINVALPHA_ADDCOLOR, MODULATEINVCOLOR_ADDALPHA - the same as the two previous operations, but with the first multiplier inverted. Out=In1RGB+In2RGB*(1-In1Alpha) and Out=(1-In1RGB)*In2RGB+In1Alpha respectively
BUMPENVMAP - per-pixel EMBM; the result of the following stage serves as the environment map. See the description of texture formats further on: there are special formats for storing height/offset maps that define the relief.
BUMPENVMAPLUMINANCE - the same, but also taking into account a luminance factor stored in the relief texture.
DOTPRODUCT3 - the most honest per-pixel relief type. In essence, a signed dot product of two vectors whose components are stored in the RGB of the input parameters. Out=In1R*In2R+In1G*In2G+In1B*In2B
MULTIPLYADD - a popular operation. Out=In1+In2*In3
LERP - linear interpolation. Out=(In1)*In2+(1-In1)*In3
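
The formulas listed above are easy to try out on single channel values. The sketch below does so in plain Python; it assumes channel values in the 0..1 range that are clamped after every stage, and all names (`clamp`, `apply_op`, `STAGE_OPS` and so on) are our own illustrative choices rather than parts of any API.

```python
def clamp(x):
    """Keep a channel value inside the 0..1 range."""
    return max(0.0, min(1.0, x))


# Two-argument operations, written exactly as in the list above
STAGE_OPS = {
    "SELECTARG1":  lambda in1, in2: in1,
    "SELECTARG2":  lambda in1, in2: in2,
    "MODULATE":    lambda in1, in2: in1 * in2,
    "MODULATE2X":  lambda in1, in2: in1 * in2 * 2,
    "MODULATE4X":  lambda in1, in2: in1 * in2 * 4,
    "ADD":         lambda in1, in2: in1 + in2,
    "ADDSIGNED":   lambda in1, in2: in1 + in2 - 0.5,
    "ADDSIGNED2X": lambda in1, in2: (in1 + in2 - 0.5) * 2,
    "SUBTRACT":    lambda in1, in2: in1 - in2,
    "ADDSMOOTH":   lambda in1, in2: in1 + in2 * (1 - in1),
}


def apply_op(name, in1, in2):
    """Run one stage operation on a single channel and clamp the result."""
    return clamp(STAGE_OPS[name](in1, in2))


def multiply_add(in1, in2, in3):
    """MULTIPLYADD: Out=In1+In2*In3"""
    return clamp(in1 + in2 * in3)


def lerp(in1, in2, in3):
    """LERP: Out=In1*In2+(1-In1)*In3"""
    return clamp(in1 * in2 + (1 - in1) * in3)


print(apply_op("MODULATE", 0.5, 0.5))
print(apply_op("ADDSMOOTH", 0.5, 0.5))
print(apply_op("ADD", 0.75, 0.75))  # saturates at the top of the range
```

Chaining several `apply_op` calls, each taking the previous result as one input, mimics a sequence of texture stages.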

On the basis of this mechanism one can program many effects using different numbers of textures. But the fact that different accelerators support different sets of operations greatly undermines this mechanism of effect control (the NV20 can do everything, the Radeon does well, and the GeForce2 lags far behind them). In any case, pixel shaders are a more flexible and convenient tool.

We continue with comments on the table:

Max Simultaneous Lights - the maximum number of light sources processed in hardware. 8 is already standard. But the implementations differ, judging by test results showing how performance drops as light sources are added. Let's see what hardware lighting and geometry capabilities our cards provide:

NVIDIA NV20 driver of the 7 series	Radeon DDR driver of the 5 series	NVIDIA GeForce2 Ultra driver of the 7 series
DIRECTIONALLIGHTS LOCALVIEWER MATERIALSOURCE7 POSITIONALLIGHTS TEXGEN	DIRECTIONALLIGHTS LOCALVIEWER MATERIALSOURCE7 POSITIONALLIGHTS TEXGEN	DIRECTIONALLIGHTS LOCALVIEWER POSITIONALLIGHTS TEXGEN

Comments:

DIRECTIONALLIGHTS - support for infinitely distant light sources, which are defined only by direction
POSITIONALLIGHTS - support for point and spot light sources
LOCALVIEWER - support for a local viewer, i.e. lighting calculated relative to the actual position of the viewer
MATERIALSOURCE7 - the source of the material colors can be selected, e.g. the vertices of a primitive
TEXGEN - hardware generation of texture coordinates

OK, let's return to the main table:

Max Clip Planes - the number of user-defined planes for clipping primitives. A plane is defined by 4 coefficients (A, B, C, D); a point with coordinates (x, y, z, w), where w is the fourth (homogeneous) coordinate, is kept if Ax + By + Cz + Dw >= 0, and otherwise it is clipped and does not proceed to rendering. The GeForce2 clearly falls short here. The NV20, interestingly, has some redundancy: even an arbitrary box-shaped region can be defined with 6 planes.
Max Vertex Blend Matrices - the maximum number of matrices simultaneously applied to a vertex during multi-matrix coordinate blending. In DX7, matrix blending (single-skin) with 4 matrices was possible. Unfortunately, the GeForce/GeForce2 support only two. In DX8 you can use a set of up to 256 matrices, with a limit of 4 matrices per vertex, chosen by index. But the current drivers of the NV20, Radeon and GeForce2 do not support such indexing.
Max Primitive Count / Max Vertex Index - all these accelerators can interpret, transform, light and render lists of primitives, offloading the CPU tremendously. This parameter defines the maximum size of a list of primitives or vertices.
Vertex Shader Version - for a start comes an illustration from the DX8 SDK documentation:

Here you can see a black box - version 1.1 on the NV20, which the Radeon and GeForce2 lack. In fact, the Radeon and GeForce2 do have a Vertex ALU that can interpret shaders, but it is not fully compatible with the final standard. You could call these shaders version 0.5. By the way, on the Radeon they can be switched on with a special registry key, but then most of the DX8 SDK samples will glitch, and only some shaders will execute as they should, because compatibility is only partial. Be that as it may, Microsoft did not introduce the notion of "version 0.5" shaders, so we will credit vertex shader support only to the NV20. So, a shader works with constants (the next line of the table, Max Vertex Shader Const, gives their number); the NV20 has 96 of them, along with 16 input registers and 8 variable (temporary) registers. While the shader program runs (its length is limited to 128 operations on the NV20, though this differs on other chips), the data are processed, and on that basis 4 sets of texture coordinates, two color values for the vertex and the resulting vertex coordinates are produced. A pixel shader, or a user-chosen configuration of texture stages, then works with these data while rendering a primitive:

I am not going to touch on the performance of this unit. I just want to note that it is very easy to build a parallel unit that processes several vertices simultaneously, all executing the same shader program. I think that is how it works here.

By the way, a shader has many constants available (96 on the NV20). Nothing prevents us from writing a shader that implements blending with an arbitrary degree of flexibility, e.g. using 96/4 matrices. But remember the limit on the number of shader operations. Taking it into account, we get ~20 matrices per vertex, although for a single skin that is of little use.
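
To make the matrix blending idea concrete, here is a small sketch in plain Python rather than shader code. It assumes each matrix is a full 4x4 occupying 4 constant registers, uses a column-vector convention, and expects weights that sum to 1; all names are hypothetical and chosen only for illustration.

```python
def transform(matrix, vertex):
    """Multiply a 4x4 matrix (list of rows) by a 4-component column vector."""
    return [sum(matrix[row][col] * vertex[col] for col in range(4))
            for row in range(4)]


def blend_vertex(vertex, matrices, weights):
    """Weighted sum of the vertex transformed by each matrix."""
    if len(matrices) != len(weights):
        raise ValueError("one weight per matrix is required")
    blended = [0.0, 0.0, 0.0, 0.0]
    for matrix, weight in zip(matrices, weights):
        moved = transform(matrix, vertex)
        for i in range(4):
            blended[i] += weight * moved[i]
    return blended


def matrices_that_fit(constant_registers=96, registers_per_matrix=4):
    """How many matrices the constant storage can hold (assumed 4 registers each)."""
    return constant_registers // registers_per_matrix


IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
SHIFT_X_BY_2 = [[1, 0, 0, 2], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

print(blend_vertex([1, 0, 0, 1], [IDENTITY, SHIFT_X_BY_2], [0.5, 0.5]))
print(matrices_that_fit())
```

A vertex blended equally between "stay in place" and "move 2 units along X" ends up moved by 1 unit, which is exactly what skinning at a joint relies on. The constant storage alone would allow 24 such matrices; the operation limit mentioned above is what reduces the practical figure.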

Again comes the main table:

Pixel Shader Version - this is the second type of shader. If you enable pixel shaders on the Radeon, the situation looks more optimistic this time. What it has in hardware seems to be very close to version 1.0, and many DX8 samples work well. For those who want to experiment, here are the registry keys for the Radeon:
HKEY_LOCAL_MACHINE\SOFTWARE\ATI_Technologies\Driver\0000\atidxhal
(instead of 0000 it can be 0001 etc.)

string VertexShaderVersion = "10"
(nearly all samples that use it glitch)

string PixelShaderVersion = "10"
(this works better; nearly all examples run in hardware)

string PureDevice = "1" - enables "PURE device" mode
In this (the fastest) mode the accelerator stores, transforms and executes lists of primitives and vertices in local memory.

The place of pixel shaders in the overall shading picture:

Simple but tasteful. A shader is evaluated at every shaded point of a triangle, so it must execute as fast as possible. There are 8 constants, two colors (interpolated across the surface of the primitive) and texture stages we can interact with. There are two temporary registers. The task is to calculate the resulting color of the pixel. A huge number of operations are available; they are similar to those described above for texture stage calculations, but more flexible. The operations can work in several directions at once. Their maximum number equals the number of texture stages (only 8 on the NV20). A pixel shader is essentially a configuration of pipeline stages, and it executes efficiently, producing one result per clock. And again, nothing prevents several identically configured pipelines from coexisting in hardware.

Well, again the main table:

Quintic/RT Patches - hardware support for tessellation of smooth surfaces! Two new high-order primitives have been added: rectangular and triangular patches. DX8 has two corresponding calls for rendering them, and you can control the level of detail when smooth patch surfaces are divided into triangles. It seems that the NV20 performs tessellation in hardware. Either the result of tessellation is stored in the accelerator's local memory as a list of triangle primitives, or the triangle parameters are generated dynamically and go straight to rendering without being saved anywhere. In the second case, the programmer's work becomes simpler and the memory load decreases.
W Depth Values - support for an alternative depth value format (W format).
Stencil Buffer - support for the stencil buffer. The accelerators can perform the following operations on stencil buffer values:

NVIDIA NV20 driver of the 7 series	Radeon DDR driver of the 5 series	NVIDIA GeForce2 Ultra driver of the 7 series
KEEP ZERO REPLACE INCRSAT DECRSAT INVERT INCR DECR	KEEP ZERO REPLACE INCRSAT INVERT DECR	KEEP ZERO REPLACE INCRSAT DECRSAT INVERT INCR DECR

Detailed info on operations:

KEEP - do not change the value in the buffer
ZERO - set 0 in all rendered pixels of the primitive
REPLACE - write a specified value
INCRSAT - increase by one. If the maximum has been reached, the value is not changed.
DECRSAT - decrease by one. On reaching 0 the value is not changed.
INCR, DECR - the same but with wraparound. For example, when the maximum is reached the value is set to 0
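
The difference between the saturating and wrapping variants is easiest to see in code. The sketch below models the operations from the table for an 8-bit stencil value (as in the D24S8 format); the function name and the `ref` parameter for the value written by REPLACE are our own naming, and INVERT is modeled as a bitwise inversion.

```python
STENCIL_MAX = 255  # assumption: 8-bit stencil values


def stencil_op(op, value, ref=0):
    """Return the new stencil value after applying one operation."""
    if op == "KEEP":
        return value
    if op == "ZERO":
        return 0
    if op == "REPLACE":
        return ref
    if op == "INCRSAT":
        return min(value + 1, STENCIL_MAX)
    if op == "DECRSAT":
        return max(value - 1, 0)
    if op == "INVERT":
        return ~value & STENCIL_MAX
    if op == "INCR":
        return (value + 1) & STENCIL_MAX  # wraps from the maximum to 0
    if op == "DECR":
        return (value - 1) & STENCIL_MAX  # wraps from 0 to the maximum
    raise ValueError("unknown stencil operation: " + op)


print(stencil_op("INCRSAT", 255), stencil_op("INCR", 255))
print(stencil_op("DECRSAT", 0), stencil_op("DECR", 0))
```

At the limits of the range the saturating operations leave the value in place, while INCR and DECR wrap around to the opposite end.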

Now comes the main table again:

Anisotropy Filtering - it seems that in the NV20 it is implemented the same way as in the GeForce2, and it looks worse than the Radeon's. A bit later we will give a table of all available filtering modes for the different texture types.
Cubic Texturing - hardware support for cubic texturing (cube environment mapping).
Volume Texturing - hardware support for volume texturing.

Here is a table of the available filtering modes for the three texture types, with and without MIP levels:

Filtering	NVIDIA NV20 driver of the 7 series	Radeon DDR driver of the 5 series	NVIDIA GeForce2 Ultra driver of the 7 series
Standard Texture Filters
Min/Mag	Point, Linear, Anisotropic
MIPMAP	Point, Linear
Cube Texture Filters
Min/Mag	Point, Linear, Anisotropic	Point, Linear	Point, Linear, Anisotropic
MIPMAP	Point, Linear	No	Point, Linear
Volume Texture Filters
Min/Mag	Point, Linear, Anisotropic	Point, Linear	No
MIPMAP	Point, Linear	No	No

Here the NV20 takes the lead...

Let's return to the parameters:

Z test - the ability to report, after a primitive has been rendered, whether at least one of its pixels is visible
Shade modes - here all three cards are equal.
MultiSample Types - available multisampling modes. Only on the NV20.
SSAA (Super Sample Anti-Aliasing) Types - available modes of full-screen AA.
Z/MIPMAP Bias - the ability to bias the MIP level or the depth.
Vertex/Table/Range Fog - supported types of fog.
W/Z Fog - available depth formats for the fog.
Bandwidth Saving - methods of saving memory bandwidth.

Here is a table of texturing parameters:

NVIDIA NV20 driver of the 7 series	Radeon DDR driver of the 5 series	NVIDIA GeForce2 Ultra driver of the 7 series
PERSPECTIVE POW2 ALPHA ALPHAPALETTE PROJECTED CUBEMAP VOLUMEMAP MIPMAP MIPVOLUMEMAP MIPCUBEMAP CUBEMAP_POW2 VOLUMEMAP_POW2	PERSPECTIVE POW2 ALPHA PROJECTED CUBEMAP VOLUMEMAP MIPMAP CUBEMAP_POW2 VOLUMEMAP_POW2	PERSPECTIVE POW2 ALPHA PROJECTED CUBEMAP MIPMAP MIPCUBEMAP CUBEMAP_POW2

Comments:

PERSPECTIVE - hardware perspective correction.
POW2, CUBEMAP_POW2, VOLUMEMAP_POW2 - textures of the corresponding type must have dimensions that are powers of 2.
MIPMAP - support for mipmap texturing of standard textures
CUBEMAP, VOLUMEMAP - support for cube environment maps and volume textures respectively
MIPVOLUMEMAP, MIPCUBEMAP - the same, plus mipmap levels. The NV20 excels here again.
ALPHA - support for an alpha channel in a texture
ALPHAPALETTE - support for an alpha channel in a palette

And finally, the available formats for frame buffers, depth buffers and the different texture types:

NVIDIA NV20 driver of the 7 series	Radeon DDR driver of the 5 series	NVIDIA GeForce2 Ultra driver of the 7 series
Depth/Stencil Formats
D24S8 D16 D24X8	D32 D24S8 D16 D24X8	D24X8 D24S8 D16 (standard/lockable)
Render Target Formats
A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5	A8R8G8B8 X8R8G8B8 R5G6B5 A1R5G5B5 A4R4G4B4 R3G3B2	A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5
Texture Formats
A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 P8 V8U8 L6V5U5 X8L8V8U8 Q8W8V8U8 DXT1 DXT2 DXT3 DXT4 DXT5	A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 R3G3B2 V8U8 DXT1 DXT2 DXT3 DXT4 DXT5	A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 P8 DXT1 DXT2 DXT3 DXT4 DXT5
Cube Texture Formats
A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 P8 V8U8 L6V5U5 X8L8V8U8 Q8W8V8U8 DXT1 DXT2 DXT3 DXT4 DXT5	A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 R3G3B2 DXT1 DXT2 DXT3 DXT4 DXT5	A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 P8 DXT1 DXT2 DXT3 DXT4 DXT5
Volume Texture Formats
A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 P8 R5G6B5	A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 R3G3B2 DXT1 DXT2 DXT3 DXT4 DXT5	-

Comments on texture and buffer formats:

The letters denote the type of data stored (e.g. R/G/B - color components). The digit after a letter is the number of bits (R8G8B8 - True color)
RGB - color components
D - depth in the Z or W format
A - alpha channel, i.e. transparency
X - unused bits
DXT1..5 - compressed textures (by the corresponding method)
QWUV - relief parameters (BumpMap)
L - luminance
P - index in a palette
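
Since the format names encode their own layout, the size of a texel (and of a whole texture) can be derived from the name alone. The helper below is a sketch with hypothetical function names; it handles only uncompressed formats built from letter-plus-bit-count pairs and deliberately rejects the DXT names, whose digit is a method number rather than a bit count.

```python
import re


def bits_per_texel(fmt):
    """Sum the bit counts in a format name such as 'A8R8G8B8' or 'D24S8'."""
    if not re.fullmatch(r"(?:[A-Z]\d+)+", fmt):
        raise ValueError("not a letter/bit-count format name: " + fmt)
    return sum(int(bits) for bits in re.findall(r"\d+", fmt))


def texture_bytes(width, height, fmt):
    """Memory taken by one uncompressed texture level."""
    return width * height * bits_per_texel(fmt) // 8


print(bits_per_texel("A8R8G8B8"), bits_per_texel("R5G6B5"), bits_per_texel("D24S8"))
print(texture_bytes(4096, 4096, "A8R8G8B8") // (1024 * 1024), "MBytes")
```

The second line shows that a single 4096x4096 texture at 32 bits per texel takes 64 MBytes, i.e. as much as the entire local memory of the first NV20 cards.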

Well, the NV20 has turned out well in terms of capabilities (unlike the GeForce2). NVIDIA confirms its status as a technological leader.

It is not yet clear how it will fare on speed; much will depend on memory and on programmers. I want to note the tighter integration of the API (DX8) with the hardware and the wide range of technological innovations. It is obvious that, together with the XBox, the NV2x series is capable of triggering a real revolution in special effects in games. The main thing is for game developers to manage to create new games or rework current ones; it is good that the upcoming XBox guarantees the availability of games written for the DX8 API's capabilities.

The first NV20-based cards are expected to be released by ASUS, Leadtek, Elsa and Hercules in March, followed in April-May by GigaByte, MSI and many others.

At the very start of sales, NV20-based cards with 64 MBytes will cost around $450-500, but a month later the price will gradually start to come down. Interestingly, the roadmaps of many respected companies list only cards with 128 MBytes of memory.
