iXBT Labs - Computer Hardware in Detail

NVIDIA GeForce4 Ti 4400
and GeForce4 Ti 4600 (NV25) Review



Many have long awaited the NV25 chip: as an echo of 3dfx's deals, as a competitor to the ATI RADEON 8500, and as the second, optimized and enriched incarnation of the NV20. Let me get straight to the heart of the matter...

Attention! Before reading the review you should turn to the previous articles on NVIDIA GeForce3 (NV20) and ATI Radeon 8500 (R200).

Product line

The GeForce 4 line is based on two chips - NV17 and NV25, the latter being today's main hero:

  • GeForce 4 Ti4600 - NV25 300 MHz core, 128 MBytes 325(650) MHz 128-bit DDR memory.
  • GeForce 4 Ti4400 - NV25 275 MHz core, 128 MBytes 275(550) MHz 128-bit DDR memory.
  • One more junior card on the NV25 will be announced later.
  • GeForce 4 MX460 - 300 MHz core, 64 MBytes 275(550) MHz 128-bit DDR memory.
  • GeForce 4 MX440 - 270 MHz core, 64 MBytes 200(400) MHz 128-bit DDR memory.
  • GeForce 4 MX420 - 200 MHz core, 64 MBytes 166 MHz 128-bit DDR memory.
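The memory figures in this list translate directly into theoretical peak bandwidth. Below is a minimal sketch of that arithmetic, assuming the usual formula (memory clock x 2 transfers per clock for DDR x bus width in bytes); the function and dictionary names are ours and purely illustrative. The MX420 is left out because the list gives no doubled (effective) rate for it.

```python
# Peak memory bandwidth = memory clock x transfers per clock x bus width in bytes.
# All names here are illustrative; the clocks come from the product list above.
BUS_WIDTH_BITS = 128  # every card in the list uses a 128-bit bus

# Physical memory clocks in MHz (DDR performs two transfers per clock).
DDR_CLOCK_MHZ = {
    "GeForce 4 Ti4600": 325,
    "GeForce 4 Ti4400": 275,
    "GeForce 4 MX460": 275,
    "GeForce 4 MX440": 200,
}


def peak_bandwidth_gb_s(clock_mhz, bus_width_bits=BUS_WIDTH_BITS, transfers_per_clock=2):
    """Theoretical peak bandwidth in GB/s, with 1 GB taken as 10**9 bytes."""
    bytes_per_transfer = bus_width_bits // 8
    return clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


if __name__ == "__main__":
    for card, clock in DDR_CLOCK_MHZ.items():
        print(f"{card}: {peak_bandwidth_gb_s(clock):.1f} GB/s")
```

These are upper bounds only; real throughput depends on the memory controller and the bandwidth-saving techniques listed in the notes.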

Note:

  1. The GeForce 3 line will be quickly replaced with the GeForce 4 one.
  2. NV17 doesn't and won't support pixel and vertex shaders.
  3. NV17 will have a hardware MPEG2 decoder and a dynamic power management system; the NV25 has neither.
  4. NV17 has only two fill pipelines, while NV25 has four.
  5. NV25 has a superscalar (dual) T&L unit, NV17 has a single one.
  6. NV17 and NV25 have similar memory controllers (a 2-channel one in the NV17 and a 4-channel one in the NV25).
  7. Both chips have the same set of techniques for increasing the effective memory bandwidth (Z buffer compression and fast Z clear, MSAA, HSR).
  8. NV17 has 2 integrated LCD-panel controllers.
  9. Both chips have two independent RAMDACs, CRTC controllers, and integrated TV-Out and DVI interfaces.

This pretty monster will help promote and advertise NV25-based products by demonstrating advanced soft illumination, skeletal animation, hair and fur made with vertex shaders, and per-pixel relief:

Theory

NV25

 

Main architectural innovations of the NV25 (vs. NV20)

  1. 2 independent CRTC controllers. Flexible support for all possible modes, including output of two frame buffers, independent in resolution and content, to any available signal receivers.
  2. 2 normal 350 MHz RAMDACs integrated in the chip (with a 10bit palette).
  3. Integrated TV-Out.
  4. Integrated TMDS transmitter (for DVI interface).
  5. 2 units that interpret and execute vertex shaders. They promise a considerable increase in the processing speed of scenes with complicated geometry. The units can't execute different shader microcode, so the only advantage of processing two vertices simultaneously is higher performance.
  6. Improved fill pipelines provide hardware support of pixel shaders up to v1.3 inclusive.
  7. According to NVIDIA, the effective fillrate in the MSAA modes has increased, and the 2x AA and Quincunx AA modes will now cause a much smaller performance drop. Quincunx AA is improved (the sample positions are shifted). A new AA method has appeared - 4xS.
  8. Improved separate caching system (4 separate caches for geometry, frame buffer and Z-buffer).
  9. Improved lossless compression (1:4) and fast Z clear.
  10. Improved hidden surface removal algorithm (Z Cull HSR).

Later we will check all these declared advantages of the new chip.

The above changes are evolutionary rather than revolutionary as compared with NVIDIA's previous product (NV20). But it is typical of NVIDIA to first release a product carrying a great deal of new technologies and then its improved (optimized) variant. Just take TNT and TNT2, GF256 and GF2, and now GF3 and GF4. Experience shows that the second variant usually meets with great success.

Performance characteristics

First of all, a bit of explanation:

  1. An accelerator can't be examined in isolation from its drivers. The chip's capabilities depend on how the drivers support applications written for the two main APIs. Many characteristics in the table can depend on the drivers and may hold, first of all, only for a particular version. Moreover, some features can be available even if the drivers do not report them (for example, clipping planes in D3D on NVIDIA cards). We will consider such features absent, as a correctly written application mustn't try to use options the driver doesn't report.
  2. Most of the data relates to Direct3D, and in OpenGL these parameters can differ. There are several reasons for this; first of all, it should be noted that this gaming API is closer to the accelerator's hardware. Besides, the capabilities of modern accelerators depend heavily on the D3D specification.
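The rule from the first point (an application must rely only on what the driver reports) can be sketched as follows. This is a hypothetical illustration, not real Direct3D code: the `ReportedCaps` class and its field names are our own assumptions, loosely modeled on the kind of capability values a driver reports, and the numbers are those listed in the summary table of this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedCaps:
    """Hypothetical snapshot of driver-reported capabilities (not a real D3D structure)."""
    max_user_clip_planes: int
    max_textures_per_pass: int


def clip_planes_to_use(caps, wanted):
    """Use only as many clipping planes as the driver reports, never more."""
    return min(wanted, caps.max_user_clip_planes)


# Values as listed in the summary table of this review.
nv25 = ReportedCaps(max_user_clip_planes=0, max_textures_per_pass=4)
r200 = ReportedCaps(max_user_clip_planes=6, max_textures_per_pass=6)

if __name__ == "__main__":
    for name, caps in (("NV25", nv25), ("R200", r200)):
        print(name, "clip planes usable:", clip_planes_to_use(caps, wanted=2))
```

With the reported value of 0, a well-behaved application would fall back to another technique on the NV25 even though the hardware path may happen to work.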

And now take a look at the summary table of the key characteristics of the chips and cards tested today. Keep in mind that in the coming quarter the ATI RADEON 8500 will be the main competitor of NV25-based cards (GeForce 4 Ti 4600 and Ti 4400), because the release of the RADEON 8500XT has been postponed and the first R300-based products won't appear very soon.

Card | GeForce3 Ti 500 | RADEON 8500 | GeForce 4 Ti 4600 (GeForce 4 Ti 4400)

Chip, revision, driver version
Chip | NV20 | R200 | NV25
Revision | A5 | A23 | A03
Driver version | 27.30 | 6.018 | 27.30

Main parameters
Pipelines | 4 | 4 | 4
Texture blocks | 2 | 2 | 2
Textures at a pass | 4 | 6 | 4
Core frequency, MHz | 240 | 275 | 300 (275)
Fill rate (million pixels) | 960 | 1100 | 1200 (1100)
Fill rate (million texels) | 1920 | 2200 | 2400
RAMDAC, MHz | 350 | 400 (+ external 240) | 350*2

Local memory parameters
Memory frequency, MHz | 250 | 275 | 325 (275)
Memory bus, bits | 128 (DDR) | 128 (DDR) | 128 (DDR)
Technology, micron | 0.15 | 0.15 | 0.15
Memory size, MB | 64 | 64 | 128
Memory speed, ns | 3.8 | 3.6 | 2.8 (3.6)
OpenGL version | 1.3 | 1.3 | 1.3
DirectX version | 8.1 | 8.1 | 8.1
GDI+ acceleration | Yes | Yes | Yes

Pixel pipeline
Pixel shaders | 1.0, 1.1 | 1.0..1.4 | 1.0..1.3
Range of calculated color values | -1.0..+1.0 | -8.0..+8.0 | -1.0..+1.0
Texture stages | 4 | 8 | 4
Combination stage | 8 | 8 | 8
Multisampling | 2,3,4 samples | No | 2,3,4 samples
Clipping planes | 0 | 6 | 0

Vertex shader
Vertex shaders | 1.0, 1.1 | 1.0, 1.1 | 1.0, 1.1
Vertex streams | 16 | 8 | 16
Constants of vertex shader | 96 | 192 | 96
Matrices for blending (max.) | 4 | 4 | 4
Indexed blending | No | Up to 57 matrices | No
Light sources | 8 | 8 | 8
N-Patches | No | Yes | No
RT-Patches | No | No | No
Primitives | 1048575 | 65536 | 1048575
Vertices | 1048575 | 16777215 | 1048575

Other parameters
Pure Device | Yes | Yes | Yes
Sprite scaling up to | 64 | 256 | 8192
3D textures | Yes (with anisotropy) | Yes (without MIPMAP) | Yes (with anisotropy)
Reflection mapping | Yes (with anisotropy) | Yes (without MIPMAP) | Yes (with anisotropy)
Anisotropic filtering | Yes | Yes | Yes
Anisotropy degree up to | 2,3,4 bi/trilinear sampling | 2,3 bilinear sampling in a line | 2,3,4 bi/trilinear sampling
Fog | FOGVERTEX FOGRANGE FOGTABLE | FOGVERTEX FOGRANGE | FOGVERTEX FOGRANGE FOGTABLE

Frame buffer
Rendering buffer formats | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 | A8R8G8B8 X8R8G8B8 R5G6B5 A1R5G5B5 A4R4G4B4 R3G3B2 | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5
Z-buffer formats | D32 D24S8 D16 D24X8 | D32 D24S8 D16 D24X8 | D32 D24S8 D16 D24X8

Texture formats
Maximum texture size (maximum repeat) | 4096x4096(8192) | 2048x2048(2048) | 4096x4096(8192)
2D texture formats | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 P8 V8U8 L6V5U5 X8L8V8U8 DXT1 DXT2 DXT3 DXT4 DXT5 D24S8 D16 D24X8 | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 R3G3B2 L8 A8L8 V8U8 L6V5U5 X8L8V8U8 Q8W8V8U8 V16U16 W11V11U10 DXT1 DXT2 DXT3 DXT4 DXT5 | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 P8 V8U8 L6V5U5 X8L8V8U8 DXT1 DXT2 DXT3 DXT4 DXT5 D24S8 D16 D24X8
3D texture formats | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 P8 | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 R3G3B2 L8 A8L8 Q8W8V8U8 W11V11U10 DXT1 DXT2 DXT3 DXT4 DXT5 | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 P8
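The fill rate rows of the table follow from the core clock and the pipeline layout. Here is a minimal sketch of that arithmetic, assuming the peak figures are simply core clock x pipelines (pixels) and that value x texture blocks per pipeline (texels); the function name is ours. Note that the texel figure for the Ti 4400 is derived here and is not given in the table.

```python
def fill_rates(core_mhz, pipelines=4, texture_blocks=2):
    """Return (million pixels/s, million texels/s) as theoretical peaks."""
    pixels = core_mhz * pipelines      # one pixel per pipeline per clock
    texels = pixels * texture_blocks   # each pipeline has this many texture blocks
    return pixels, texels


# Core clocks in MHz from the table; all three chips have 4 pipelines x 2 texture blocks.
CORE_MHZ = {
    "GeForce3 Ti 500": 240,
    "RADEON 8500": 275,
    "GeForce 4 Ti 4600": 300,
    "GeForce 4 Ti 4400": 275,
}

if __name__ == "__main__":
    for card, mhz in CORE_MHZ.items():
        pixels, texels = fill_rates(mhz)
        print(f"{card}: {pixels} Mpixels/s, {texels} Mtexels/s")
```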

Comments:

  1. The GeForce4 Ti4600 has higher core and memory clock speeds than the RADEON 8500, while the GeForce4 Ti4400 has the same core and memory clock speeds as the RADEON 8500.
  2. At last NVIDIA products have dual-monitor support and, unlike the R200, both normal 350 MHz RAMDACs are integrated into the NV25 chip.
  3. The NV25's RAMDACs run at a lower frequency than the primary RAMDAC of the R200 (350 against 400 MHz).
  4. The internal architecture of the NV25 is close to that of the NV20 and R200: 4 fill pipelines with two texture blocks each. However, the R200 can accumulate the results of their operation twice, which allows combining up to 6 textures in a pass; the NV25 is limited to 4. So far, no application gets a noticeable gain from 6 textures in a pass, though the next Doom is going to be one.
  5. Again, NVIDIA doesn't support 1.4 pixel shaders (see the R200 review for details) with their more flexible mechanism of dependent texture sampling. Shaders are, in fact, translated into settings of the sampling and combination pipelines; the number of stages in the texture sampling pipeline remains the same - 4 in the NV25/NV20 against 8 in the R200; some slight changes in the combination pipeline make it possible to support 1.2 and 1.3 shaders at the hardware level. Their difference from 1.1 shaders lies not in more flexible dependent sampling, but in the use and modification of Z values and other useful options.
  6. Combination pipelines of all chips have 8 stages and support all specified DirectX 8.1 operations.
  7. The current NV25 drivers do not expose a larger number of vertex shader constants (96 against 192 on the R200) or of vertex shader instructions (128). It seems that, apart from the second T&L unit (which is also a vertex shader interpreter), no other qualitative changes were made in the pipeline.
  8. The NV25 memory now successfully works at the same frequency as the R200's, with the same rated access time. However, this doesn't mean the same efficiency, as the R200 and the NV20/NV25 approach memory operation differently. The NV25 prefers smaller blocks and an effective 4-channel crossbar controller, while the R200 uses larger blocks and intensive combined caching. Which approach is more viable in modern tests and applications we will see later.
  9. All the cards have normal DirectX 8.1 and OpenGL 1.3 drivers. ATI's OpenGL driver is considered less efficient than NVIDIA's. But the gap is narrowing, and at present it depends on how the OpenGL application works with geometry and whether it uses index buffers - the R200 itself is less efficient in delivering geometry via AGP than the NV20/NV25.
  10. The current drivers of the NV20 and NV25 report that there are no clipping planes, though in our tests they work excellently. The reason is that NVIDIA uses a special pixel shader to implement clipping planes; it occupies most of the slots of the combination pipeline, so an application becomes unable to use its own pixel shader and some other resources. This doesn't comply with the DirectX standard, which is why clipping planes were disabled at the level of the reported capabilities.
  11. Again, the NV25 doesn't support N-Patches at the hardware level.
  12. The drivers of the NV20 and NV25 do not support hardware tessellation of smooth surfaces (HOS based on RT-Patches). When a card doesn't support N-Patches at the hardware level, the API tries to emulate them using RT-Patches, which makes N-Patches very slow. NVIDIA thus had to disable RT-Patches so that games supporting N-Patches wouldn't be too slow.
  13. Like the NV20, the NV25 doesn't support indexed matrix blending, as shaders allow any matrix blending scheme to be organized flexibly.
  14. Multisampling hasn't changed since the NV20 - the same 2..4 samples, which the R200 is not capable of.
  15. The implementation of anisotropy in the NV25/NV20 differs from that in the R200, and each approach has its advantages and disadvantages.
  16. The range of pixel shader values of the NV25 is still -1.0 to 1.0 - the higher precision of the R200 has gone unanswered.
  17. All cards support a standard set of texture formats, though the R200 supports some more formats for additional shader data (normal and displacement maps) with increased component precision (11 and 16 bit - V16U16, W11V11U10); the NV25 and NV20 make it possible to use textures in the Z-buffer formats (D32, D24S8, D16, D24X8) needed to implement Depth Buffer Shadows algorithms. The way the DirectX drivers expose this algorithm, which is peculiar to NVIDIA products, to applications is nonstandard.
  18. The NV25 doesn't allow compressing 3D textures. This is a serious drawback of either the drivers or the chip. At the same time, NVIDIA's OpenGL drivers have their own 3D texture compression format.
  19. NV25 supports all types of fog, like the NV20 does.
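Since the three chips stop at different pixel shader versions (1.1, 1.4 and 1.3 according to the table), an application that ships several shader variants has to pick one per card. A minimal sketch of such a fallback follows; the function name and the list of shipped variants are hypothetical, and real code would read the supported version from the driver rather than from a hard-coded dictionary.

```python
# Highest pixel shader version per chip, as listed in the summary table.
MAX_PIXEL_SHADER = {"NV20": (1, 1), "R200": (1, 4), "NV25": (1, 3)}


def pick_shader_variant(available, supported_max):
    """Return the highest available (major, minor) variant not above supported_max.

    Returns None if every available variant needs a newer version,
    in which case the application has to use a fixed-function path.
    """
    usable = [version for version in available if version <= supported_max]
    return max(usable) if usable else None


# Hypothetical set of shader variants an application might ship.
SHIPPED_VARIANTS = [(1, 1), (1, 3), (1, 4)]

if __name__ == "__main__":
    for chip, supported in MAX_PIXEL_SHADER.items():
        print(chip, "->", pick_shader_variant(SHIPPED_VARIANTS, supported))
```

Versions are compared as tuples, so (1, 3) sorts below (1, 4) without any string parsing.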

Here is a complete list of OpenGL extensions supported by the NV25 in the current drivers:

  • GL_VENDOR: NVIDIA Corporation
  • GL_RENDERER: GeForce4 Ti 4400/AGP/SSE2
  • GL_VERSION: 1.3.1
  • GL_EXTENSIONS:
    • GL_ARB_imaging
    • GL_ARB_multisample
    • GL_ARB_multitexture
    • GL_ARB_texture_border_clamp
    • GL_ARB_texture_compression
    • GL_ARB_texture_cube_map
    • GL_ARB_texture_env_add
    • GL_ARB_texture_env_combine
    • GL_ARB_texture_env_dot3
    • GL_ARB_transpose_matrix
    • GL_S3_s3tc
    • GL_EXT_abgr
    • GL_EXT_bgra
    • GL_EXT_blend_color
    • GL_EXT_blend_minmax
    • GL_EXT_blend_subtract
    • GL_EXT_compiled_vertex_array
    • GL_EXT_draw_range_elements
    • GL_EXT_fog_coord
    • GL_EXT_multi_draw_arrays
    • GL_EXT_packed_pixels
    • GL_EXT_paletted_texture
    • GL_EXT_point_parameters
    • GL_EXT_rescale_normal
    • GL_EXT_secondary_color
    • GL_EXT_separate_specular_color
    • GL_EXT_shared_texture_palette
    • GL_EXT_stencil_wrap
    • GL_EXT_texture3D
    • GL_EXT_texture_compression_s3tc
    • GL_EXT_texture_edge_clamp
    • GL_EXT_texture_env_add
    • GL_EXT_texture_env_combine
    • GL_EXT_texture_env_dot3
    • GL_EXT_texture_cube_map
    • GL_EXT_texture_filter_anisotropic
    • GL_EXT_texture_lod
    • GL_EXT_texture_lod_bias
    • GL_EXT_texture_object
    • GL_EXT_vertex_array
    • GL_EXT_vertex_weighting
    • GL_HP_occlusion_test
    • GL_IBM_texture_mirrored_repeat
    • GL_KTX_buffer_region
    • GL_NV_blend_square
    • GL_NV_copy_depth_to_color
    • GL_NV_evaluators
    • GL_NV_fence
    • GL_NV_fog_distance
    • GL_NV_light_max_exponent
    • GL_NV_multisample_filter_hint
    • GL_NV_occlusion_query
    • GL_NV_packed_depth_stencil
    • GL_NV_point_sprite
    • GL_NV_register_combiners
    • GL_NV_register_combiners2
    • GL_NV_texgen_reflection
    • GL_NV_texture_compression_vtc
    • GL_NV_texture_env_combine4
    • GL_NV_texture_rectangle
    • GL_NV_texture_shader
    • GL_NV_texture_shader2
    • GL_NV_texture_shader3
    • GL_NV_vertex_array_range
    • GL_NV_vertex_array_range2
    • GL_NV_vertex_program
    • GL_NV_vertex_program1_1
    • GL_SGIS_generate_mipmap
    • GL_SGIS_multitexture
    • GL_SGIS_texture_lod
    • GL_SGIX_depth_texture
    • GL_SGIX_shadow
    • GL_WIN_swap_hint
    • WGL_EXT_swap_control

The same list in the latest drivers of the R200:

  • GL_VENDOR: ATI Technologies Inc.
  • GL_RENDERER: Radeon 8500 DDR x86/SSE2
  • GL_VERSION: 1.3.2475 WinXP Release
  • GL_EXTENSIONS:
    • GL_ARB_multitexture
    • GL_ARB_texture_border_clamp
    • GL_ARB_texture_compression
    • GL_ARB_texture_cube_map
    • GL_ARB_texture_env_add
    • GL_ARB_texture_env_combine
    • GL_ARB_texture_env_crossbar
    • GL_ARB_texture_env_dot3
    • GL_ARB_transpose_matrix
    • GL_ARB_vertex_blend
    • GL_ARB_window_pos
    • GL_S3_s3tc
    • GL_ATI_element_array
    • GL_ATI_envmap_bumpmap
    • GL_ATI_fragment_shader
    • GL_ATI_map_object_buffer
    • GL_ATI_pn_triangles
    • GL_ATI_texture_mirror_once
    • GL_ATI_vertex_array_object
    • GL_ATI_vertex_streams
    • GL_ATIX_texture_env_combine3
    • GL_ATIX_texture_env_route
    • GL_ATIX_vertex_shader_output_point_size
    • GL_EXT_abgr
    • GL_EXT_bgra
    • GL_EXT_blend_color
    • GL_EXT_blend_func_separate
    • GL_EXT_blend_minmax
    • GL_EXT_blend_subtract
    • GL_EXT_clip_volume_hint
    • GL_EXT_compiled_vertex_array
    • GL_EXT_draw_range_elements
    • GL_EXT_fog_coord
    • GL_EXT_packed_pixels
    • GL_EXT_point_parameters
    • GL_ARB_point_parameters
    • GL_EXT_rescale_normal
    • GL_EXT_secondary_color
    • GL_EXT_separate_specular_color
    • GL_EXT_stencil_wrap
    • GL_EXT_texgen_reflection
    • GL_EXT_texture_env_add
    • GL_EXT_texture3D
    • GL_EXT_texture_compression_s3tc
    • GL_EXT_texture_cube_map
    • GL_EXT_texture_edge_clamp
    • GL_EXT_texture_env_combine
    • GL_EXT_texture_env_dot3
    • GL_EXT_texture_lod_bias
    • GL_EXT_texture_filter_anisotropic
    • GL_EXT_texture_object
    • GL_EXT_vertex_array
    • GL_EXT_vertex_shader
    • GL_KTX_buffer_region
    • GL_NV_texgen_reflection
    • GL_NV_blend_square
    • GL_SGI_texture_edge_clamp
    • GL_SGIS_texture_border_clamp
    • GL_SGIS_texture_lod
    • GL_SGIS_generate_mipmap
    • GL_SGIS_multitexture
    • GL_WIN_swap_hint
    • WGL_EXT_extensions_string
    • WGL_EXT_swap_control

Most of the NV25's extensions remain standard, which points to NVIDIA's stronger influence on OpenGL. Now let's turn to the video cards based on the two NV25 versions: GeForce4 Ti 4400 and 4600.

[ Part II ]


Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.