iXBT Labs - Computer Hardware in Detail


ATI RADEON 9700 Pro 128MB Video Card Review









Before you start, we recommend reading the analytical article on the architecture and specifications of the RADEON 9700 (R300).

CONTENTS

  1. General information
  2. Theoretical aspects of implementation of anti-aliasing and anisotropic filtering
  3. Peculiarities of the ATI RADEON 9700 Pro 128MB video card 
  4. Test system configuration and drivers' settings 
  5. Test results: briefly on 2D, extreme tests from DirectX 8.1 SDK and synthetic (DirectX 9.0 based) tests
  6. Test results: 3DMark2001 SE synthetic tests 
  7. Test results: 3DMark2001 SE game tests 
  8. Test results: Quake3 ARENA 
  9. Test results: Serious Sam: The Second Encounter 
  10. Test results: Return to Castle Wolfenstein 
  11. Test results: Code Creatures DEMO 
  12. Test results: Comanche4 DEMO 
  13. Test results: Unreal Tournament 2003 DEMO 
  14. Test results: AquaMark 
  15. Test results: RightMark 3D 
  16. 3D quality: Anisotropic filtering
  17. 3D quality: Anti-aliasing
  18. Overall 3D quality
  19. Conclusion

In this review we do not cover the architecture or specs of the RADEON 9700 chip (also known as the R300); instead, we examine its capabilities and performance. The current line of R300-based cards looks as follows: 

  • RADEON 9700 PRO - 325 MHz chip, 128 MB 310 MHz (DDR 620) 256 bit local memory; 
  • RADEON 9700 - 300 MHz chip, 128 MB 300 MHz (DDR 600) 256 bit local memory; 
  • RADEON 9500 - the chip cut to 4 pipelines, 128bit local memory; 
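The memory figures in the list above translate directly into peak memory bandwidth. Here is a minimal sketch in Python, assuming the usual formula (effective transfer rate multiplied by bus width in bytes) and decimal gigabytes. The function name is ours, and the RADEON 9500 is omitted because the list gives no memory clock for it.

```python
def peak_bandwidth_gb_s(effective_mhz: float, bus_bits: int) -> float:
    """Peak memory bandwidth in GB/s (10^9 bytes per second)."""
    bytes_per_transfer = bus_bits / 8
    return effective_mhz * 1e6 * bytes_per_transfer / 1e9


# (effective DDR rate in MHz, bus width in bits), taken from the list above
cards = {
    "RADEON 9700 PRO": (620, 256),
    "RADEON 9700": (600, 256),
}

for name, (mhz, bits) in cards.items():
    print(f"{name}: {peak_bandwidth_gb_s(mhz, bits):.2f} GB/s")
```

This is a theoretical ceiling only; the tests later in the review show how much of it real workloads can use.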
We are testing the top card of the line, which is also the only one available on the market today: the RADEON 9700 PRO. This is the first time we have used DirectX 9 (beta 2) for testing chips. Before the practical tests, look at the capabilities the card currently reports under DX9; our rating of each value is given in parentheses: 
  • Texture size - up to 2048x2048 (standard) 
  • Non-square textures supported (good) 
  • Light sources (max.) - 8 (standard) 
  • Texture fetch stages - 8 (excellent) 
  • Combination stages - 8 (standard) 
  • Clipping surfaces - 6 (excellent) 
  • Sprite size (max.) - 256 (excellent) 
  • Primitives for one call (max.) - 65535 (questionable) 
  • Vertex buffer size - 16777215 (excellent) 
  • Vertex streams (max.) - 16 (excellent) 
  • Vertex shaders version (max.) - 1.1 (bad) 
  • Constants of vertex shader - 256 (excellent) 
  • Pixel shaders version (max.) - 1.4 (bad) 
  • Pixel shader (max. value) - 3.40282E+038 (the highest value for the floating point F32 format, excellent) 
  • Multisampling modes: no, 2, 4, 6 samples (only in the X8R8G8B8 mode, in the frame buffer mode with alpha channel A8R8G8B8 it's not available). 
  • Render target format (good): 
    • D3DFMT_A8R8G8B8 
    • D3DFMT_X8R8G8B8 
    • D3DFMT_R5G6B5 
    • D3DFMT_A1R5G5B5 
    • D3DFMT_A4R4G4B4 
  • Depth buffer formats (good): 
    • D3DFMT_D16_LOCKABLE 
    • D3DFMT_D24S8 
    • D3DFMT_D24X8 
    • D3DFMT_D16 
  • Texture formats (strange): 
    • D3DFMT_A8R8G8B8 
    • D3DFMT_X8R8G8B8 
    • D3DFMT_R5G6B5 
    • D3DFMT_X1R5G5B5 
    • D3DFMT_A1R5G5B5 
    • D3DFMT_A4R4G4B4 
    • D3DFMT_R3G3B2 
    • D3DFMT_L8 
    • D3DFMT_V8U8 
    • D3DFMT_L6V5U5 
    • D3DFMT_X8L8V8U8 
    • D3DFMT_Q8W8V8U8 
    • D3DFMT_V16U16 
    • D3DFMT_UYVY 
    • D3DFMT_YUY2 
  • Cube texturing formats (OK): 
    • D3DFMT_A8R8G8B8 
    • D3DFMT_X8R8G8B8 
    • D3DFMT_R5G6B5 
    • D3DFMT_X1R5G5B5 
    • D3DFMT_A1R5G5B5 
    • D3DFMT_A4R4G4B4 
    • D3DFMT_R3G3B2 
    • D3DFMT_L8 
    • D3DFMT_UYVY 
    • D3DFMT_YUY2 
  • Volume texture formats (OK): 
    • D3DFMT_A8R8G8B8 
    • D3DFMT_X8R8G8B8 
    • D3DFMT_R5G6B5 
    • D3DFMT_X1R5G5B5 
    • D3DFMT_A1R5G5B5 
    • D3DFMT_A4R4G4B4 
    • D3DFMT_R3G3B2 
    • D3DFMT_L8 
    • D3DFMT_UYVY 
    • D3DFMT_YUY2 
  • Filtering modes for single texturing (excellent): 
    • D3DPTFILTERCAPS_MINFPOINT 
    • D3DPTFILTERCAPS_MINFLINEAR 
    • D3DPTFILTERCAPS_MINFANISOTROPIC 
    • D3DPTFILTERCAPS_MIPFPOINT 
    • D3DPTFILTERCAPS_MIPFLINEAR 
    • D3DPTFILTERCAPS_MAGFPOINT 
    • D3DPTFILTERCAPS_MAGFLINEAR 
    • D3DPTFILTERCAPS_MAGFANISOTROPIC 
  • Filtering modes for cube texturing (good): 
    • D3DPTFILTERCAPS_MINFPOINT 
    • D3DPTFILTERCAPS_MINFLINEAR 
    • D3DPTFILTERCAPS_MIPFPOINT 
    • D3DPTFILTERCAPS_MIPFLINEAR 
    • D3DPTFILTERCAPS_MAGFPOINT 
    • D3DPTFILTERCAPS_MAGFLINEAR 
  • Filtering modes for volume textures (good): 
    • D3DPTFILTERCAPS_MINFPOINT 
    • D3DPTFILTERCAPS_MINFLINEAR 
    • D3DPTFILTERCAPS_MIPFPOINT 
    • D3DPTFILTERCAPS_MIPFLINEAR 
    • D3DPTFILTERCAPS_MAGFPOINT 
    • D3DPTFILTERCAPS_MAGFLINEAR 
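
The pixel shader maximum value in the list above is reported as 3.40282E+038. A quick way to confirm that this is the largest finite IEEE 754 single-precision value is sketched below with Python's standard `struct` module (the variable names are ours):

```python
import struct

# Largest finite float32: sign bit 0, exponent 0xFE, mantissa all ones.
F32_MAX_BITS = 0x7F7FFFFF

# Reinterpret the 32-bit pattern as a little-endian single-precision float.
f32_max = struct.unpack("<f", struct.pack("<I", F32_MAX_BITS))[0]

print(f"{f32_max:.5E}")
```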

You might notice that there are no traces of DX9 here. The reason is that the currently available drivers support only the old DDI8 (the DirectX 8 Device Driver Interface) and cannot expose any capabilities beyond DX8. Only when DDI9 drivers appear (ATI will likely release them only after Microsoft stops changing the interface and gives its permission) will we be able to test the new capabilities of the chip; for now we have only the old ones. Although DX9 works with such drivers without speed losses (the application test results match those under DX8 within measurement error), we are deprived of the most interesting capabilities of the RADEON 9700 PRO: version 2.0 pixel and vertex shaders and floating-point formats for textures and the frame buffer. On the other hand, nothing prevents us from evaluating performance, the implementation of AA and anisotropic filtering, and certain characteristics of the chip such as its fillrate and the performance of its geometry unit. 

Because of the early driver (or the peculiar way DX9 operates on top of DDI8), the list of supported texture formats lacks any compression formats. 

Here is a list of the currently available OpenGL extensions and the OpenGL ICD version for each card (one column per card):

ATI, Radeon 9700 x86/SSE2, version 1.3.3259 | NVIDIA, GeForce4 Ti 4600/AGP/SSE2, version 1.3.1 | ATI, Radeon 8500 DDR x86/SSE2, version 1.3.2475
GL_ARB_depth_texture GL_ARB_imaging GL_ARB_multitexture
GL_ARB_multitexture GL_ARB_multisample GL_ARB_texture_border_clamp
GL_ARB_point_parameters GL_ARB_multitexture GL_ARB_texture_compression
GL_ARB_shadow GL_ARB_texture_border_clamp GL_ARB_texture_cube_map
GL_ARB_shadow_ambient GL_ARB_texture_compression GL_ARB_texture_env_add
GL_ARB_texture_border_clamp GL_ARB_texture_cube_map GL_ARB_texture_env_combine
GL_ARB_texture_compression GL_ARB_texture_env_add GL_ARB_texture_env_crossbar
GL_ARB_texture_cube_map GL_ARB_texture_env_combine GL_ARB_texture_env_dot3
GL_ARB_texture_env_add GL_ARB_texture_env_dot3 GL_ARB_transpose_matrix
GL_ARB_texture_env_combine GL_ARB_transpose_matrix GL_ARB_vertex_blend
GL_ARB_texture_env_crossbar GL_S3_s3tc GL_ARB_window_pos
GL_ARB_texture_env_dot3 GL_EXT_abgr GL_S3_s3tc
GL_ARB_transpose_matrix GL_EXT_bgra GL_ATI_element_array
GL_ARB_vertex_blend GL_EXT_blend_color GL_ATI_envmap_bumpmap
GL_ARB_vertex_program GL_EXT_blend_minmax GL_ATI_fragment_shader
GL_ARB_window_pos GL_EXT_blend_subtract GL_ATI_map_object_buffer
GL_S3_s3tc GL_EXT_compiled_vertex_array GL_ATI_pn_triangles
GL_ATI_element_array GL_EXT_separate_specular_color GL_ATI_texture_mirror_once
GL_ATI_envmap_bumpmap GL_EXT_fog_coord GL_ATI_vertex_array_object
GL_ATI_fragment_shader GL_EXT_multi_draw_arrays GL_ATI_vertex_streams
GL_ATI_map_object_buffer GL_EXT_packed_pixels GL_ATIX_texture_env_combine3
GL_ATI_separate_stencil GL_EXT_paletted_texture GL_ATIX_texture_env_route
GL_ATI_texture_mirror_once GL_EXT_point_parameters GL_ATIX_vertex_shader_output_point_size
GL_ATI_vertex_array_object GL_EXT_rescale_normal GL_EXT_abgr
GL_ATI_vertex_streams GL_EXT_clip_volume_hint GL_EXT_bgra
GL_ATIX_texture_env_route GL_EXT_draw_range_elements GL_EXT_blend_color
GL_ATIX_vertex_shader_output_point_size GL_EXT_shared_texture_palette GL_EXT_blend_func_separate
GL_EXT_abgr GL_EXT_stencil_wrap GL_EXT_blend_minmax
GL_EXT_bgra GL_EXT_texture3D GL_EXT_blend_subtract
GL_EXT_blend_color GL_EXT_texture_compression_s3tc GL_EXT_clip_volume_hint
GL_EXT_blend_func_separate GL_EXT_texture_edge_clamp GL_EXT_compiled_vertex_array
GL_EXT_blend_minmax GL_EXT_texture_env_add GL_EXT_draw_range_elements
GL_EXT_blend_subtract GL_EXT_texture_env_combine GL_EXT_fog_coord
GL_EXT_clip_volume_hint GL_EXT_texture_env_dot3 GL_EXT_packed_pixels
GL_EXT_compiled_vertex_array GL_EXT_texture_cube_map GL_EXT_point_parameters
GL_EXT_draw_range_elements GL_EXT_texture_filter_anisotropic GL_ARB_point_parameters
GL_EXT_fog_coord GL_EXT_texture_lod GL_EXT_rescale_normal
GL_EXT_packed_pixels GL_EXT_texture_lod_bias GL_EXT_secondary_color
GL_EXT_point_parameters GL_EXT_texture_object GL_EXT_separate_specular_color
GL_EXT_rescale_normal GL_EXT_vertex_array GL_EXT_stencil_wrap
GL_EXT_secondary_color GL_EXT_vertex_weighting GL_EXT_texgen_reflection
GL_EXT_separate_specular_color GL_HP_occlusion_test GL_EXT_texture_env_add
GL_EXT_stencil_wrap GL_IBM_texture_mirrored_repeat GL_EXT_texture3D
GL_EXT_texgen_reflection GL_KTX_buffer_region GL_EXT_texture_compression_s3tc
GL_EXT_texture_env_add GL_NV_blend_square GL_EXT_texture_cube_map
GL_EXT_texture3D GL_NV_copy_depth_to_color GL_EXT_texture_edge_clamp
GL_EXT_texture_compression_s3tc GL_NV_evaluators GL_EXT_texture_env_combine
GL_EXT_texture_cube_map GL_NV_fence GL_EXT_texture_env_dot3
GL_EXT_texture_edge_clamp GL_NV_fog_distance GL_EXT_texture_lod_bias
GL_EXT_texture_env_combine GL_NV_light_max_exponent GL_EXT_texture_filter_anisotropic
GL_EXT_texture_env_dot3 GL_NV_multisample_filter_hint GL_EXT_texture_object
GL_EXT_texture_filter_anisotropic GL_NV_occlusion_query GL_EXT_vertex_array
GL_EXT_texture_lod_bias GL_NV_packed_depth_stencil GL_EXT_vertex_shader
GL_EXT_texture_object GL_NV_point_sprite GL_KTX_buffer_region
GL_EXT_vertex_array GL_NV_register_combiners GL_NV_texgen_reflection
GL_EXT_vertex_shader GL_NV_register_combiners2 GL_NV_blend_square
GL_HP_occlusion_test GL_NV_texgen_reflection GL_SGI_texture_edge_clamp
GL_KTX_buffer_region GL_NV_texture_compression_vtc GL_SGIS_texture_border_clamp
GL_NV_texgen_reflection GL_NV_texture_env_combine4 GL_SGIS_texture_lod
GL_NV_blend_square GL_NV_texture_rectangle GL_SGIS_generate_mipmap
GL_SGI_texture_edge_clamp GL_NV_texture_shader GL_SGIS_multitexture
GL_SGIS_texture_border_clamp GL_NV_texture_shader2 GL_WIN_swap_hint
GL_SGIS_texture_lod GL_NV_texture_shader3 WGL_EXT_extensions_string
GL_SGIS_generate_mipmap GL_NV_vertex_array_range WGL_EXT_swap_control
GL_SGIS_multitexture GL_NV_vertex_array_range2 -
GL_WIN_swap_hint GL_NV_vertex_program -
WGL_EXT_extensions_string GL_NV_vertex_program1_1 -
WGL_EXT_swap_control GL_SGIS_generate_mipmap -
GL_ARB_multisample GL_SGIS_multitexture -
- GL_SGIS_texture_lod -
- GL_SGIX_depth_texture -
- GL_SGIX_shadow -
- GL_WIN_swap_hint -
- WGL_EXT_swap_control -


 

Theoretical aspects of implementation of anti-aliasing and anisotropic filtering

Unlike in the previous chip, the RIP-mapping-based anisotropy has been corrected in the R300: it now works correctly irrespective of the angle of inclination of a plane relative to the Z axis. The earlier problem was not a flaw of RIP mapping itself but a peculiarity of its implementation in ATI's previous chips. However, this more correct implementation in the R300 costs more performance. 

AA has changed as well. As before, one of the pseudorandom sample patterns with two, four or six samples is chosen (this is the main difference from NVIDIA, where the pattern is always the same, though it depends on the AA method). But now the samples are computed using the multisampling method, as in NVIDIA's chips. The fill rate inside polygons is expected to rise greatly, while the situation on their edges remains the same; however, edges of transparent polygons will be processed incorrectly. Well, that is the cost of the increased AA speed. Later we will examine the speed and quality of the AA in detail. 
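
The trade-off described above can be illustrated with a toy model in Python. It is a sketch of coverage-based multisampling as it is commonly described, not ATI's actual hardware logic, and the six sample positions are made up for the example. Supersampling runs the shader at every sample; multisampling runs it once per pixel and only tests geometric coverage per sample.

```python
# Hypothetical 6-sample pattern inside one pixel (coordinates in 0..1).
SAMPLES = [(0.1, 0.6), (0.3, 0.2), (0.45, 0.9),
           (0.6, 0.4), (0.8, 0.75), (0.95, 0.1)]


def supersample(samples, inside, shade):
    """Supersampling: run the shader at every covered sample position."""
    return sum(shade(x, y) for x, y in samples if inside(x, y)) / len(samples)


def multisample(samples, inside, shade):
    """Multisampling: shade once at the pixel centre, weight by coverage."""
    covered = sum(1 for x, y in samples if inside(x, y))
    return shade(0.5, 0.5) * covered / len(samples)


def left_half(x, y):
    """A polygon whose geometric edge crosses the pixel at x = 0.5."""
    return x < 0.5


def whole_pixel(x, y):
    """A polygon that covers the whole pixel."""
    return True


def opaque_white(x, y):
    return 1.0


def cutout_texture(x, y):
    """A transparent texture whose cut-out edge lies at x = 0.5."""
    return 1.0 if x < 0.5 else 0.0


# Case 1: a geometric edge. Both methods smooth it equally.
geo_ss = supersample(SAMPLES, left_half, opaque_white)
geo_ms = multisample(SAMPLES, left_half, opaque_white)

# Case 2: an edge that exists only inside the texture. Multisampling
# sees full coverage and a single shader result, so it cannot smooth it.
tex_ss = supersample(SAMPLES, whole_pixel, cutout_texture)
tex_ms = multisample(SAMPLES, whole_pixel, cutout_texture)

print(geo_ss, geo_ms, tex_ss, tex_ms)
```

In this model the multisampled pixel costs one shader evaluation instead of six, which is where the expected fill rate gain comes from, while the second case shows why edges of transparent polygons are left untouched.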

And now let's take a look at the card. 

Card

The card has an AGP 4x/8x interface and 128 MB of DDR SDRAM (8 chips located on both sides of the PCB). 
 







 
The card comes with Samsung K4D26323RA-GC2A memory in BGA packaging. The highest rated frequency of these chips is 350 (700) MHz, which indicates that the access time is 2.8 ns rather than 2.2 ns, as some earlier reviews stated. By default the memory works at 310 (620) MHz.
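
The relation between rated frequency and access time used above is simply the clock period. A minimal sketch, with helper names of our own choosing:

```python
def cycle_time_ns(clock_mhz: float) -> float:
    """Clock period in nanoseconds for a given clock in MHz."""
    return 1000.0 / clock_mhz


def rated_clock_mhz(cycle_ns: float) -> float:
    """Clock in MHz that corresponds to a given cycle time in nanoseconds."""
    return 1000.0 / cycle_ns


print(f"350 MHz -> {cycle_time_ns(350):.2f} ns")     # rated limit of these chips
print(f"310 MHz -> {cycle_time_ns(310):.2f} ns")     # default memory clock
print(f"2.2 ns  -> {rated_clock_mhz(2.2):.0f} MHz")  # what 2.2 ns chips would allow
```

A 350 MHz rating corresponds to a period of about 2.86 ns, which the text quotes as 2.8 ns, whereas 2.2 ns chips would be rated for roughly 455 MHz.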


The memory chips in the new BGA package have become quite popular among video card makers, and we won't focus on their advantages once again. 
 

ATI RADEON 9700 Pro 128MB






At first glance the card looks ordinary. Certainly, a 256-bit high-speed bus makes the PCB more complicated, but while the cards from Matrox and 3Dlabs were entirely shielded, here only the left part is screened: 




And the main part of the PCB containing the memory and the chip is not protected: 




The card works only with an external power supply. It comes with an adapter/splitter that connects the card to a standard power supply connector: 




The set of interface connectors is standard: VGA, DVI and TV-out (S-Video). There is an adapter for connecting TV-out via RCA. 

The right part of the PCB is very similar to the RADEON 8500, especially in positions of the memory chips. However, the chip has a huge heatsink which is not typical of ATI :-). 







But it's necessary here, as the VPU heats up greatly (because of the large number of transistors running at 325 MHz and the .15-micron fab process). By the way, look at the processor's package: 



Well, the FCPGA packaging familiar from CPUs, with an exposed flip-chip die, has reached graphics chips. In the Matrox Parhelia review I was surprised at the chip's dimensions, though it had a metal cover of approximately the same size on top; here the cover is absent. There are a great many pins here (well, it's a 256-bit bus). 

That's all about the card itself. Note that in the second part of the RADEON 9700 Pro review we will examine the card's operation in the dual-monitor and TV-out modes (this time we have to leave them aside as we are short of time). The second part will also shed light on operation in the AGP 8x mode. 

The card is supplied with: 




two CDs with software (drivers, MMC 7.8 etc.) and marketing materials, a quite small S-Video-to-RCA adapter (without a cable), and a DVI-to-d-Sub adapter. 

And here is the box the RADEON 9700 PRO based cards will be shipping in: 




Look at the right-hand corner. As you might know, at Quakecon 2002 such cards were demonstrated in special system units. As for the promised fancy bundles priced at $450 (a RADEON 9700 Pro, souvenirs, DOOM III promised free of charge, and a remote control from ATI), they were not on sale there and could only be ordered. 

Overclocking

When we started testing at the rated frequencies, we thought that overclocking wouldn't be possible at such high temperatures. We turned out to be wrong! The latest version (3.21) of PowerStrip is able to work with the RADEON 9700. 
ATI RADEON 9700 Pro 128MB  325/620 -> 350/700 MHz 
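
To put the overclocking figures above in relative terms, here is a small sketch (the helper name is ours; the clocks are the ones quoted in the line above):

```python
def gain_percent(stock: float, overclocked: float) -> float:
    """Relative clock increase in percent."""
    return (overclocked - stock) / stock * 100.0


core_gain = gain_percent(325, 350)    # core clock, MHz
memory_gain = gain_percent(620, 700)  # effective DDR memory clock, MHz

print(f"core: +{core_gain:.1f}%, memory: +{memory_gain:.1f}%")
```

The memory is raised proportionally more than the core, so any speed gain from this overclock should be most visible where the card is limited by memory bandwidth.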

 




Frankly speaking, the chip is able to work at 370 MHz, but we noticed no gains relative to 350 MHz in the standard modes (without AA and/or anisotropy). A bit later we ran the card at 370 MHz again, this time under maximum load, and made sure that it operated stably. That is why the diagrams below show these results as bars with a brickwork pattern. Although the processor is quite sophisticated and runs rather hot, there is some overclocking potential. The FCPGA packaging helps a lot here. Besides, ATI selects the best chips for samples. Note that 

     
    • when overclocking you must provide additional cooling for the card (first of all, for its memory):
       



    • overclocking depends on the particular sample, and you shouldn't generalize the results of one card to all video cards of this brand or series. Overclocking results are not guaranteed characteristics of a video card. 

Test system and drivers

Testbed: 
  • Pentium 4 based system (Socket 478): 
    • Intel Pentium 4 2200 (L2=512K); 
    • ASUS P4T-E (i850); 
    • 512 MB RDRAM PC800; 
    • Quantum FB AS 20GB; 
    • Windows XP. 
The test system was coupled with ViewSonic P810 (21") and ViewSonic P817 (21") monitors. The test system based on the AMD Athlon XP will be used in the second part, where we will also estimate operation in the AGP8x mode (on the VIA KT400). 

In the tests we used ATI's driver v6.143 (this driver is meant only for the RADEON 9*** series; cards of previous generations are not supported, at least not yet). VSync was off, and texture compression was off. 

The following cards are used for comparison: 

  • ASUS V8460Ultra (GeForce4 Ti 4600, 300/325 (650) MHz, 128 MB, driver 30.82); 
  • Matrox Parhelia (220/275 (550) MHz, 128 MB, driver 2.31); 
  • Gigabyte MAYA AP128DG-H RADEON 8500 Deluxe (275/275 (550) MHz, 128 MB, driver 6.118). 

Drivers' settings

















Only the DirectX 8 drivers have been released! The DirectX 9.0 drivers are expected only in October. The control settings are almost standard for the entire new CATALYST driver series, except for anti-aliasing (SmoothVision II) and anisotropy. As you can see, besides the level you can choose a mode of anisotropic filtering: performance or quality. Later we will see how they differ (in short, in whether trilinear filtering can operate together with anisotropy). 

The AA is finally put in order. Instead of vague performance/quality modes combined with AA levels, we have three modes: 2x, 4x and 6x. This spares users from agonizing over the choice. Below we will take a look at the most interesting modes, 4x and 6x. 

Test results

2D graphics

Despite the high frequency and complexity of the card, the 2D quality is superb! Interestingly, the colors are richer than on the RADEON 8500 (I switched the monitor over to that card to compare). 

But remember that 2D quality depends on a given sample, as well as on a card/monitor tandem (first of all, quality of a given monitor and cable). 

3D graphics, MS DirectX 8.1 SDK - extreme tests

This time we wanted to test the extreme characteristics using prototypes of the synthetic tests developed within the framework of the open graphics benchmark RightMark 3D. But as the basic DX9 capabilities are not supported by the drivers, we used two old tests from the DX 8.1 SDK and left some of the new DX9 tests aside. 

So, for testing various extreme characteristics of the chips we used modified (for better handling and control) examples from the latest official version of the DirectX SDK (8.1). 

EMBM 

In this test we measure the performance drop caused by environment mapping and EMBM (Environment Mapped Bump Mapping). We also estimate the fillrate with single texturing. The test was run at 1280x1024 because that resolution is optimal for extreme testing of modern cards: 



Look at the red bars of the RADEON 9700 PRO. The fillrate falls considerably when EMBM is enabled. In the EMBM mode this drop puts the RADEON 9700 PRO (aka R300), with its 256-bit memory interface, on the same level as the previous generation of chips! While in the pure texturing mode the card has a great advantage over its competitors, environment mapping makes that advantage much smaller. The other cards do not lose performance because they can engage their second texture unit, which the R300 lacks. In the EMBM mode three textures are combined. Besides, with EMBM one texture is sampled at coordinates derived from the values read from another, and this approach hits the pixel pipelines of the R300 hard. 
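
The dependent texture read mentioned above can be sketched as follows. This is only an illustration: textures are modelled as plain Python functions, the names are ours, and the 2x2 matrix follows the general form of Direct3D's bump environment matrix, with the exact index convention being an assumption.

```python
def embm_lookup(bump_map, env_map, bump_uv, env_uv, matrix):
    """Environment lookup whose coordinates depend on a bump-map fetch."""
    # First fetch: the bump map returns a (du, dv) perturbation.
    du, dv = bump_map(*bump_uv)

    # Transform the perturbation by the 2x2 bump environment matrix.
    (m00, m01), (m10, m11) = matrix
    u = env_uv[0] + du * m00 + dv * m10
    v = env_uv[1] + du * m01 + dv * m11

    # Second fetch: it cannot start until the first one has returned.
    return env_map(u, v)


def flat_bump(u, v):
    """Toy bump map with a constant perturbation everywhere."""
    return (0.1, -0.2)


def show_coords(u, v):
    """Toy environment map that returns the coordinates it was sampled at."""
    return (u, v)


sampled_at = embm_lookup(flat_bump, show_coords,
                         bump_uv=(0.25, 0.25), env_uv=(0.5, 0.5),
                         matrix=((0.5, 0.0), (0.0, 0.5)))
print(sampled_at)
```

The point of the sketch is the ordering: the second fetch needs the result of the first, so the two cannot be issued independently.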

Performance of Pixel Shaders 1.0

We again used a modified MFCPixelShader example, measuring the performance of the card at a high resolution as it processed 5 shaders of different complexity with bilinear-filtered textures: 



Well, on the simplest shader the R300 is far ahead thanks to its 8 pipelines. But as the shaders become more sophisticated, its speed falls faster than that of its competitors, and on the last shader it is merely level with the NV25 (!). The situation is very close to that of the P10 and will probably be typical of all modern accelerators able to process large pixel shaders. However, if we compare the R200 and the R300, the progress is more than twofold. 

Such performance drop is caused by two factors: 

  • As there is only one texture unit per pipeline, operation gets slower with each new texture rather than with each new pair of textures, as in the other chips. 
  • All the other chips have pixel stages and execute 1.x shaders at two to four times the rate of one instruction per clock, while the R300 executes shaders one instruction at a time, though on 8 pipelines simultaneously. We mentioned this in the analytical article on the R300 and in the P10 review, and this time our analysis is confirmed by the synthetic tests. 
Besides, the NV25 and the P512 are equipped with at least 4 stages per pixel pipeline, while the R200 comes with two. On the other hand, its 4 texture units do not help the P512 much because of its low clock speed, whereas the NV25 is well balanced from the standpoint of pixel shaders 1.0. 

Remember that the R300 will be competing against the NV30, whose performance is still unclear; we know only its approximate clock speed, that it has two texture units per pipeline, and that it executes shaders instruction by instruction (like the R300). 

3D graphics, MS DirectX 9 SDK (beta 2) - synthetic tests

For other extreme characteristics of the chip we used prototypes of our new synthetic DX9 tests created within the RightMark 3D project. 

GPU Speed





This test measures the peak triangle throughput of an accelerator, using different types and numbers of light sources and different lighting methods. At present there are 7 lighting models: 
  1. Constant (ambient lighting) 
  2. Diffuse (1 point source) 
  3. Diffuse (2 point sources) 
  4. Diffuse (3 point sources) 
  5. Diffuse + Specular (1 point source) 
  6. Diffuse + Specular (2 point sources) 
  7. Diffuse + Specular (3 point sources) 
And 4 operating modes: 
  1. Traditional TCL (Fixed-Function Pipeline) 
  2. Vertex shaders 1.1 
  3. Vertex shaders 1.1 and pixel shaders 1.1 
  4. Vertex shaders 2.0 and pixel shaders 2.0 
Later the test will be extended with several typical animation and geometry transformation tasks. 

The test minimizes dependence on all factors except geometry performance and triangle transfer speed (delivery of parameters to the shaders). It renders a lot of small, detailed models whose triangles are very small (comparable to a pixel) in order to eliminate dependence on HSR or shading. 

Here are results of the traditional TCL both in the hardware mode and in the software vertex processing: 




The R300 is again ahead in the simplest task: 106M transformed vertices per second (!). In more complicated tasks it is on a par with the NV25 and the P512, which is not a brilliant result for a chip of the new generation. In the software transformation mode the NV25, which supports FastWrites, takes the lead, while the R300 is no better than the R200 and the P512. 

But this was a test of the old TCL. The previous chips, even those without a dedicated fixed-function TCL unit, supported its efficient emulation. Probably that wasn't a goal for the developers of the R300. Let's see how transformation and lighting based on vertex shaders look: 




Well, our assumptions about TCL emulation are confirmed: the R300 shines when executing shaders. The new chip outscores the NV25 twofold and the R200 threefold. The R300's results match the data obtained in the TCL mode, which means that it has no special, improved emulation of the fixed pipeline. So, the R300 shows a gain befitting a new generation. 

In the software mode the NV25 beats the others. Software emulation is rather slow on more or less complicated shaders. But it's hard to imagine a real application that draws 30 or 40 million triangles, which is why in real tasks the emulation remains acceptable. 

And now let's see how the test depends on a resolution: 




Resolution doesn't affect the test, but the complexity level of the models has a noticeable effect. 

Point Sprites





This test measures the speed of rendering point sprites. The test always uses semitransparent sprites, since most real effects based on particle systems require transparency and blending. There are two modes available: with the sprites lit by light sources and without lighting. The size of the rendered sprites can be adjusted. 



Without lighting and with small sprites (up to and including 4 pixels) the R300 loses to the R200 (!), not to mention the NV25. But it takes the lead as the sprites grow, thanks to the 256-bit memory bus. The bigger the sprite, the more frame buffer accesses blending requires. 

With lighting enabled the general dependence looks the same, but now the difference is not so striking, especially on small sprites. It seems that transformation and lighting are the limiting factors. As the size grows, the R300 pulls further ahead, again thanks to its considerable advantage in frame buffer speed. Besides, 9M particles is not a very large figure: particle-based effects are actually limited by blending, not by the geometry performance of modern chips. 

Texturing Rate

This test takes an integrated approach to estimating texturing rate, allowing arbitrary changes to the number of textures used in a pass and to their size, format and filtering method. With one texture we can measure the pixel fillrate; using all the textures and changing the filtering method we can measure the filtering speed (i.e. the performance of the texture units). Besides, we can assess the algorithm of MIP level selection: 



And quality of implementation of any filtering type: 



The test displays several big polygons with a wide range of depth values. This makes it possible to assess visually the chosen MIP levels as well as the implementation of anisotropic filtering. To cover all possible angles of inclination and rotation of the planes, the "tunnel" (or rather pyramid) rotates around the Z axis, and its apex moves in a circle in the plane parallel to the screen. That is why the triangles it consists of turn evenly and can be seen at different angles. 

First of all, let's see how the fillrate depends on the number of textures (bilinear filtering): 




While the old cards give the expected results, the R300's measured performance is much lower than its potential maximum. Even the R200 outscores it! But there is an explanation: 
  1. The R300's 8 texture units at 325 MHz are comparable in peak bilinear filtering performance to the 8 units at 300 MHz of the NV25 or the 8 units at 275 MHz of the R200. 
  2. The R300 has no combination stages. Our test combines all the textures, and evidently on the R300 the stage settings are emulated by an equivalent pixel shader. On the new pixel pipeline architecture of the R300 each texture sample takes one instruction, and each combination takes one more. So the R300 turns out to be inferior to the old stage-based pipelines, just as with the fixed-function TCL! 
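
The second point can be turned into a crude back-of-the-envelope model. The sketch below is only an illustration under strong assumptions that are ours, not measurements: every pipeline retires one shader instruction per clock, each texture costs one sample plus one combine instruction on the R300 (as described above), stage-based chips process as many textures per clock as they have texture units per pipeline, and nothing else limits the rate. The 4 pipelines with 2 texture units each assumed for the NV25 and the R200 match the 8 units mentioned in the first point.

```python
import math


def shader_emulated_rate(pipes: int, clock_mhz: float, n_textures: int) -> float:
    """Peak Mpixels/s when stages are emulated by a pixel shader."""
    instructions = 2 * n_textures  # one sample + one combine per texture
    return pipes * clock_mhz / instructions


def stage_based_rate(pipes: int, tmus_per_pipe: int,
                     clock_mhz: float, n_textures: int) -> float:
    """Peak Mpixels/s for a classic stage-based pipeline."""
    clocks = math.ceil(n_textures / tmus_per_pipe)
    return pipes * clock_mhz / clocks


for n in (2, 4, 8):
    r300 = shader_emulated_rate(8, 325, n)
    nv25 = stage_based_rate(4, 2, 300, n)
    r200 = stage_based_rate(4, 2, 275, n)
    print(f"{n} textures: R300 {r300:.0f}, NV25 {nv25:.0f}, R200 {r200:.0f} Mpixels/s")
```

Even this simplified model reproduces the ordering seen on the diagram, with the R200 ahead of the R300 once several textures are combined.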
To check this we modified the test (SPECIAL TEST on the diagram) so that the textures overwrite one another instead of being combined. The results of the other cards remained the same, since the same number of stages is used, while the R300's score grew linearly until it exceeded its own theoretical limit. How could that happen? The answer supports our hypothesis. The stage settings were translated by the drivers (or by DX9) into an equivalent pixel shader. When this shader was compiled it was optimized, and all independent texture samples that were never used afterwards were removed. As a result, only the last texture was applied, and our test calculated the texture fillrate incorrectly, "thinking" that all the textures had been sampled.

Well, again we face new peculiarities of the new generation of chips. Although the new architecture is more flexible, it is also less efficient in traditional simple tasks. We should wait for the new DDI9 drivers and the NV30 to make final conclusions. For now we can be satisfied at least with the fact that in real applications the speed depends not only on the performance of the texture units but also on the total texture volume (which wasn't great here: one 256x256 texture easily fits into the cache); therefore the R300, equipped with a wider memory bus, will get a decent advantage. This will be shown later in the tests with real applications. And now we turn to trilinear filtering which, according to ATI, comes for free on the R300: 
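
Going back to the modified fill-rate test described above: the effect can be reproduced with a textbook dead-code elimination pass. The sketch below is not ATI's driver code. The instruction format and register names are hypothetical, and the pass is a plain backward liveness scan that keeps only the instructions whose results are eventually used.

```python
def eliminate_dead_code(program, outputs):
    """Drop instructions whose results never reach an output.

    program: list of (dest, op, sources) tuples in execution order.
    outputs: names of registers that must survive.
    """
    live = set(outputs)
    kept = []
    for dest, op, sources in reversed(program):
        if dest in live:
            live.discard(dest)    # this instruction defines the live value
            live.update(sources)  # so its inputs become live in turn
            kept.append((dest, op, sources))
    kept.reverse()
    return kept


# Original test: every texture is sampled and combined with the result so far.
combine = [
    ("r0", "tex", ("t0",)),
    ("r1", "tex", ("t1",)),
    ("r0", "mul", ("r0", "r1")),
    ("r1", "tex", ("t2",)),
    ("r0", "mul", ("r0", "r1")),
    ("out", "mov", ("r0",)),
]

# Modified test: each texture simply overwrites the previous one.
overwrite = [
    ("r0", "tex", ("t0",)),
    ("r0", "tex", ("t1",)),
    ("r0", "tex", ("t2",)),
    ("out", "mov", ("r0",)),
]

combine_opt = eliminate_dead_code(combine, {"out"})
overwrite_opt = eliminate_dead_code(overwrite, {"out"})

print(len(combine_opt), "instructions kept in the combining shader")
print(len(overwrite_opt), "instructions kept in the overwriting shader")
```

In the overwriting variant only the last fetch survives, so a benchmark that multiplies the frame rate by the nominal number of textures overstates the texel rate, which is the kind of result the modified test produced.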



Well, trilinear filtering is indeed almost free, though the same is true of all the other cards. The layout remains the same. Now let's see what we get with different filtering types: 



With anisotropic filtering the R300 thrives - NVIDIA's solutions of the previous generation were always inferior in this respect. But this time anisotropy is not as cheap for the R300 as it was for the R200, especially when it is used together with trilinear filtering. The reason is that ATI has finally corrected the problem in its RIP-mapping-based implementation of anisotropy. Now it copes with any angle of rotation of a textured surface around the Z axis (!). This brought some performance drop relative to the R200 (though not a severalfold one). On the other hand, the results of the NV25 and the P512, which take the classical approach to anisotropy, are still very low. The only exception is the lowest degree, where both the NV25 and especially the P512 (thanks to its 4 texture units per pipeline) are able to compete with the RADEON family. ATI can be proud of its results: the problems are corrected and the speed drop is not significant. But we still don't know what the NV30 will bring. Isn't an anisotropic filtering speed merely comparable to that of the R200 a risky step? ATI was severely limited by the .15 fab process and could neither add second texture units nor raise the core frequency. At the same time the chip is based on the new pixel pipeline architecture, which is less efficient than the old one on simple tasks - this is the price of more flexible programming. Foreseeing such problems with this process technology, Matrox did not attempt complete DX9 compatibility on the .15 process. The engineers at ATI took the risk. Time will tell whether it was worth it. 

And now let's see how the test depends on a texture size (we'll take 4 textures in a pass as all today's chips are able to use such quantity): 




As you can see, the dependence is not great, and the texture of 256x256 can be considered a rational solution for most tests. Now let's look at different resolutions (again with 4 textures): 



Starting from 1280x1024 the dependence is almost invisible, which is what a good test requires. Dependence on the texture format won't be tested, as these drivers do not support compression formats and a 16-bit texture is rendered at the same rate (note that we use just one texture, which fits into the cache). 

3D graphics, 3DMark2001 SE - synthetic tests

All measurements in all 3D tests were carried out in 32-bit color. 

Fillrate

 

 




The theoretical maximum for this test is 880 M pixels/sec for the Parhelia, 1100 for the RADEON 8500, 1200 for the Ti 4600 and 2600 for the R300. The results obtained are in perfect harmony with the theory, and thanks to its 8 pixel pipelines (and the 256-bit bus) the R300 takes a twofold lead. But which modern application uses just one texture? Let's look at multitexturing: 



The peak values for this test are 3520 (1760) M texels/sec for the Parhelia (the value in parentheses is for 4 pipelines with 2 texture units each), 2200 for the RADEON 8500, 2400 for the Ti 4600 and 2600 for the RADEON 9700 PRO. In multitexturing, how well balanced the chip is plays a greater role. This time the peak scores are achieved by the Ti 4600 and the RADEON 9700 PRO (the latter only in high resolutions, for which it is recommended ;) ). The R300 is not a clear leader even in theory, because it has just one texture unit per pipeline and a core frequency comparable to the others. The data obtained agree with the results of our new synthetic DX9 test. 
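
The theoretical figures quoted for the two fillrate tests follow from the core clock, the number of pixel pipelines and the number of texture units per pipeline. A minimal sketch that reproduces them is given below; the clocks come from the card lists earlier in the review, while the pipeline and texture unit counts are the ones implied by the quoted peaks.

```python
# name: (core clock in MHz, pixel pipelines, texture units per pipeline)
CHIPS = {
    "Parhelia": (220, 4, 4),
    "RADEON 8500": (275, 4, 2),
    "GeForce4 Ti 4600": (300, 4, 2),
    "RADEON 9700 PRO": (325, 8, 1),
}


def pixel_fillrate(clock_mhz: int, pipes: int) -> int:
    """Peak single-texture fillrate in Mpixels/s."""
    return clock_mhz * pipes


def texel_fillrate(clock_mhz: int, pipes: int, tmus_per_pipe: int) -> int:
    """Peak multitexturing fillrate in Mtexels/s."""
    return clock_mhz * pipes * tmus_per_pipe


for name, (clock, pipes, tmus) in CHIPS.items():
    print(f"{name}: {pixel_fillrate(clock, pipes)} Mpixels/s, "
          f"{texel_fillrate(clock, pipes, tmus)} Mtexels/s")

# The Parhelia value given in parentheses: 4 pipelines with 2 units each.
print("Parhelia with 2 units per pipeline:", texel_fillrate(220, 4, 2), "Mtexels/s")
```

The R300 doubles the pixel rate of its rivals but gains nothing further in texel rate, which is why its theoretical multitexturing lead is so small.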

Scene with a lot of polygons

Focus on the minimal resolution where dependence on shading is almost lacking: 



With one light source the R300 is the absolute leader. The results coincide with those we got in our own DX9 test, but 3DMark2001 doesn't get as close to the physical limit of the chip as the future GPU Speed test from RightMark 3D. The P512 is a clear outsider despite its 4 vertex pipelines. Interestingly, in our GPU Speed test the P512 performs much better: it scores 1.5 times higher, matching the R200. It seems that 3DMark2001 touches a sore spot of the chip which slows down the vertex unit. Or the drivers and DX are to blame: our test is written and compiled against the DX9 interface, while 3DMark2001 SE was compiled against DX8.1. 



In case of 8 light sources the general layout doesn't change, but the difference between the cards gets smaller, again coinciding with the results obtained earlier. Besides, the R200 and the P512 swap places: the Parhelia scores better, because its performance falls more slowly than that of the RADEON 8500 as the number of light sources increases. The R300 is the leader, and the R200 becomes the outsider. 

Bump mapping

Look at the results of a synthetic EMBM scene: 



Unlike in our old test from the DX8.1 SDK, the R300 performs better here! It seems that this test depends more on the frame buffer write speed, which is higher for the R300. And now the DP3 bump mapping: 



The same situation. 

Vertex shaders

 

 




The results of our synthetic GPU Speed test are confirmed again. The lead of the R300 in 3DMark2001 is not as great, but it still sets the chip apart from the other participants: the R300 is the undoubted leader in operation with vertex shaders, geometry and transformation. At least where the fixed TCL is concerned. 

Pixel shader

Taking into account the fact that the too low resolutions are limited by the geometry and too high ones - by the memory throughput, let's take a look at 1024x768 and 1280x1024: 



The R300 is ahead again. But its performance falls faster as the resolution grows, because of the instruction-per-clock implementation of vertex shaders. Let's see what happens in the more complex Advanced Pixel Shader test. 



The lead of the R300 grows thanks to the optimization of the chip's pixel pipelines for more flexible and longer pixel shaders. Clearly, it's too early to draw any conclusions until complete support of DX9 and pixel shaders 2.0 is provided in the drivers. 

Sprites





The R300 is ahead, but the gap is smaller. The Parhelia suffers in this test as it doesn't have a special hardware acceleration to output point sprites. Without blending the sprite's performance isn't of great value. Well, this test demonstrates once again advantages of the 256-bit bus and 8 fill pipelines. 
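
To put the 256-bit bus into numbers: peak memory bandwidth is the bus width in bytes multiplied by the effective (DDR) transfer rate. A short Python sketch follows. The R300 figures come from the card list at the beginning of the article; the 128-bit bus and 650 MHz effective memory clock used for the Ti 4600 in the usage note are our assumption, not something stated in this section.

```python
def memory_bandwidth_gb_s(bus_width_bits, effective_mhz):
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_width_bits: width of the local memory bus in bits.
    effective_mhz:  effective transfer rate, i.e. twice the real clock for DDR.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_mhz * 1e6 / 1e9


if __name__ == "__main__":
    # RADEON 9700 PRO and RADEON 9700: values from the card list above.
    print("RADEON 9700 PRO:", memory_bandwidth_gb_s(256, 620), "GB/s")
    print("RADEON 9700:    ", memory_bandwidth_gb_s(256, 600), "GB/s")
    # GeForce4 Ti 4600: assumed 128-bit bus at 650 MHz effective.
    print("Ti 4600 (assumed):", memory_bandwidth_gb_s(128, 650), "GB/s")
```

Under these assumptions the R300 cards have nearly twice the theoretical bandwidth of a 128-bit card at a similar memory clock, which is consistent with their behavior in fill-limited tests.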

So, let's sum it up. In the synthetic tests the R300 card shows mixed results. In geometry processing it takes a firm lead, while in the shading tests everything depends on the task. If anything, we can complain about just one texture unit per pipeline. Compared with the other cards, the R300 certainly leaves the impression of a strong leader developed for the complex tasks of future applications. On the other hand, we must wait for proper DX9 drivers to draw final conclusions on the synthetic tests. Besides, we were not able to test the new texture and frame buffer formats or the second version of the pixel and vertex shaders. And the NV30, the main competitor of the R300, hasn't been tested here yet. 

Before we turn to the games, note that the tests were carried out not only at the normal and overclocked frequencies but also at frequencies reduced to 300/600 MHz, to estimate the performance of the slowest cards of this series (see the frequency range of the cards from ATI's partners). 

3D graphics, 3DMark2001 - games tests

3DMark2001, 3DMARKS




The RADEON 9700 Pro outpaces the GeForce4 Ti 4600 by 17 to 39% in these benchmarks. It's not that bad taking into account a strong limiting effect of the central processor. 
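
All the percentages in the game tests below are relative gains of the RADEON 9700 over another card. A minimal Python sketch of how such a figure is read; the frame rates in it are hypothetical and serve only to illustrate the formula, they are not measurements from this review.

```python
def gain_percent(fps_new, fps_reference):
    """Relative gain of one card over a reference card, in percent.

    A positive value means the new card is faster, a negative value
    means it falls behind, and 100 means it is exactly twice as fast.
    """
    return (fps_new / fps_reference - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical frame rates, for illustration only.
    print(round(gain_percent(126.0, 100.0), 1))  # faster card
    print(round(gain_percent(82.2, 100.0), 1))   # slower card
    print(round(gain_percent(200.0, 100.0), 1))  # twice as fast
```

Note that a gain above 100% means more than double the frame rate, which is why the AA and anisotropy figures look so large next to the plain 1600x1200x32 ones.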
 

3DMark2001, Game1 Low details






Test characteristics: 
  • Rendered triangles per frame (min/avg/max): 19773/33753/143422 
  • Rendered textures per frame with 16 bit textures (min/avg/max): 7.5/8.8/16.5 MB 
  • Rendered textures per frame with 32 bit textures (min/avg/max): 15.1/17.7/30.3 MB 
  • Rendered textures per frame with texture compression (min/avg/max): 10.7/12.2/21.0 MB 
In 1600x1200x32 the RADEON 9700 outscores the Ti 4600 by 26%, the Parhelia by 123%, and the RADEON 8500 by 57.2%. 

When the AA is enabled it performs better than the Ti 4600 (AA4x) by 121%, than the Parhelia (FAA16x) by 94% and it surpasses the RADEON 8500 (AA4xP) by 265%. 

The anisotropic filtering (quality mode) brings the following numbers: the gain over the Ti 4600 (ANIS 8) is 60% and over the RADEON 8500 (ANIS 16) is 19%. 

When the AA and anisotropy are both switched on the new Canadian product has the following gain (in 1280x1024): 75% over Ti 4600, 44% over the Parhelia and 163% over the RADEON 8500. 
 

3DMark2001, Game2 Low details








Test characteristics: 
  • Rendered triangles per frame (min/avg/max): 46159/51440/147828 
  • Rendered textures per frame with 16 bit textures (min/avg/max): 8.0/8.8/10.1 MB 
  • Rendered textures per frame with 32 bit textures (min/avg/max): 15.6/17.2/19.8 MB 
  • Rendered textures per frame with texture compression (min/avg/max): 9.3/10.9/13.5 MB 
In 1600x1200x32 the RADEON 9700 has the following lead: 38% over the Ti 4600, 136% over the Parhelia and 79% over the RADEON 8500.

With the AA the gain over the Ti 4600 (AA4x) is 171%, over the Parhelia (FAA16x) it's 104%, and over the RADEON 8500 (AA4xP) it makes 311%. 

With the anisotropic filtering (quality mode) the gain over the Ti 4600 (ANIS 8) is 195%, over the RADEON 8500 (ANIS 16) it makes 69%. 

When the AA and anisotropy are both switched on the new Canadian product has the following gain (in 1280x1024): 181% over Ti 4600, 105% over Parhelia and 283% over RADEON 8500. 
 
 

3DMark2001, Game3 Low details








Test characteristics: 
  • Rendered triangles per frame (min/avg/max): 16681/21746/39890 
  • Rendered textures per frame with 16 bit textures (min/avg/max): 2.8/4.1/4.7 MB 
  • Rendered textures per frame with 32 bit textures (min/avg/max): 5.7/8.2/9.4 MB 
  • Rendered textures per frame with texture compression (min/avg/max): 5.0/7.2/8.4 MB 
In 1600x1200x32 the RADEON 9700 has the following lead: 29% over the Ti 4600, 114% over the Parhelia and 61.5% over the RADEON 8500.

With the AA the gain over the Ti 4600 (AA4x) is 188%, over Parhelia (FAA16x) - 76%, and over RADEON 8500 (AA4xP) - 338%. 

With the anisotropic filtering (quality mode) the gain over the Ti 4600 (ANIS 8) is 153%, and over the RADEON 8500 (ANIS 16) - 74%. 

When the AA and anisotropy are both switched on the new Canadian product has the following gain (in 1280x1024): 192% over Ti 4600, 65% over Parhelia, and 297% over RADEON 8500. 
 
 

3DMark2001, Game4








Test characteristics: 
  • Rendered triangles per frame (min/avg/max): 55601/81714/180938 
  • Rendered textures per frame with 16 bit textures (min/avg/max): 14.9/17.4/20.7 MB 
  • Rendered textures per frame with 32 bit textures (min/avg/max): 28.4/33.5/40.0 MB 
  • Rendered textures per frame with texture compression (min/avg/max): 28.4/33.5/40.0 MB 
In 1600x1200x32 the RADEON 9700 has the following lead: 101% over the Ti 4600, 207% over the Parhelia, and 113% over the RADEON 8500. 

With the AA the gain over the Ti 4600 (AA4x) is 146%, over the Parhelia (FAA16x) - 133%, and over the RADEON 8500 (AA4xP) - 276%. 

With the anisotropic filtering (quality mode) the gain over the Ti 4600 (ANIS 8) is 90%, and over the RADEON 8500 (ANIS 16) - 51%. 

When the AA and anisotropy are both switched on the new Canadian product has the following gain (in 1280x1024): 116% over the Ti 4600, 155% over the Parhelia, and 187% over the RADEON 8500. 

In the 3DMark2001 SE benchmarks the new ATI product gains the most in the tough modes with AA and anisotropy. This is to be expected: under a light load the speed is limited by the CPU (and by the platform, despite 2.2 GHz). The 256-bit memory bus has a positive effect on the overall speed with AA. As for anisotropy, the performance doesn't fall drastically in the quality mode. We will speak about it in depth later. 
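
Why the gain grows with the load can be shown with a deliberately simplified model: the frame time is set by whichever of the CPU and the video card takes longer. The Python sketch below assumes that the two work fully in parallel, and all timings in it are hypothetical numbers picked for illustration, not measurements.

```python
def fps(cpu_ms, gpu_ms):
    """Frame rate when the slower of the CPU and the GPU sets the frame time.

    cpu_ms: time the CPU (and platform) needs to prepare one frame.
    gpu_ms: time the video card needs to render one frame.
    """
    return 1000.0 / max(cpu_ms, gpu_ms)


if __name__ == "__main__":
    cpu_ms = 8.0  # hypothetical, identical for both cards

    # Light load (no AA, no anisotropy): both cards wait for the CPU.
    fast_light, slow_light = fps(cpu_ms, 4.0), fps(cpu_ms, 6.0)
    # Heavy load (AA + anisotropy): the video card becomes the bottleneck.
    fast_heavy, slow_heavy = fps(cpu_ms, 10.0), fps(cpu_ms, 25.0)

    print("light load:", fast_light, "vs", slow_light, "fps")
    print("heavy load:", fast_heavy, "vs", slow_heavy, "fps")
```

With these made-up timings both cards deliver the same frame rate under the light load, although one of them is 1.5 times faster at rendering, while under the heavy load the whole difference between the cards becomes visible.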

3D graphics, game tests

For the performance estimation we used: 
  • Return to Castle Wolfenstein (MultiPlayer) (id Software/Activision) - OpenGL, multitexturing, Checkpoint-demo, test settings - maximum, S3TC OFF, the configurations can be downloaded from here 
  • Serious Sam: The Second Encounter v.1.05 (Croteam/GodGames) - OpenGL, multitexturing, Grand Cathedral demo, test settings: quality, S3TC OFF 
  • Quake3 Arena v.1.17 (id Software/Activision) - OpenGL, multitexturing, Quaver, test settings - maximum: detailing level - High, texture detailing level - #4, S3TC OFF, smoothness of curves is much increased through variables r_subdivisions "1" and r_lodCurveError "30000" (at default r_lodCurveError is 250 !), the configurations can be downloaded from here 
  • Comanche4 Benchmark Demo (NovaLogic) - Direct3D, Shaders, Hardware T&L, Dot3, cube texturing, highest quality
  • Unreal Tournament 2003 Demo v.927 (Digital Extremes/Epic Games) - Direct3D, Vertex Shaders, Hardware T&L, Dot3, cube texturing, default quality
  • Code Creatures Benchmark Pro (CodeCult) is a game that demonstrates operation of cards in the DirectX 8.1, Shaders, HW T&L. 
  • AquaMark (Massive Development) is a game that demonstrates operation of cards in the DirectX 8.1, Shaders, HW T&L. 
  • RightMark Video Analyzer v.0.4 (Philip Gerasimov) - DirectX 8.1, Dot3, cube texturing, shadow buffers, vertex and pixel shaders (1.1, 1.4). 

Quake3 Arena, Quaver







In 1600x1200x32 the RADEON 9700 has the following lead: 38% over the Ti 4600, 195% over the Parhelia, and 66% over the RADEON 8500. 

With the AA the gain over the Ti 4600 (AA4x) is 113%, over Parhelia (FAA16x) - 137%, and over RADEON 8500 (AA4xP) - 327%. 

With the anisotropic filtering (quality mode) the gain over the Ti 4600 (ANIS 8) is 94%, and over the RADEON 8500 (ANIS 16) - 49%. 

When the AA and anisotropy are both switched on the new Canadian product has the following gain (in 1280x1024): 146% over Ti 4600, 169% over Parhelia, and 429% over RADEON 8500. 

Without AA and anisotropy the speed is limited by the platform and CPU, but when these functions are enabled, the RADEON 9700 goes far ahead. Surely, the OpenGL driver is far from perfect, though it's a trifle compared with what we will see in the next test. 

Serious Sam: The Second Encounter, Grand Cathedral

Here are screenshots of the settings: 







Here are the results: 





In 1600x1200x32 the RADEON 9700 shows the following: -17.8% against the Ti 4600 (that is, it falls behind), 47% over the Parhelia, and 27% over the RADEON 8500. 

With the AA the gain over the Ti 4600 (AA4x) is 26%, and against the Parhelia (FAA16x) it is -4% (a loss). We don't take the scores of the RADEON 8500 (AA4xP) into account because it has a clear bug in this test with the AA: such a small gap is impossible. We ran the test several times with the same results, and ignored them. 

With the anisotropic filtering (quality mode) the gain over the Ti 4600 (ANIS 8) is 0%, and over the RADEON 8500 (ANIS 16) it is also 0%. 

When the AA and anisotropy are both switched on the new Canadian product has the following gain (in 1280x1024): 0% over the Ti 4600, -23% against the Parhelia (a loss), and no results for the RADEON 8500. 

Well, it's either the ATI's driver that works terribly in this game, or there is something else that prevents the RADEON 9700 from showing its might. I pin my hopes on the new drivers or at least on an explanation from ATI. 

Return to Castle Wolfenstein (Multiplayer), Checkpoint






In 1600x1200x32 the RADEON 9700 has the following lead: 24% over the Ti 4600, 397% over the Parhelia, and 22% over the RADEON 8500. 

With the AA the gain over the Ti 4600 (AA4x) is 131%, over Parhelia (FAA16x) - 378%, and over RADEON 8500 (AA4xP) - 262%. 

With the anisotropic filtering (quality mode) the gain over the Ti 4600 (ANIS 8) is 116%, and over the RADEON 8500 (ANIS 16) - 26%. 

When the AA and anisotropy are both switched on the new Canadian product has the following gain (in 1280x1024): 177% over Ti 4600, 400% over Parhelia, and 315% over RADEON 8500. 

In this test (which depends more on the platform than Quake3 does) the new solution shows brilliant scores when the load is heavy. The low speed of the Parhelia is probably due to the same factors as in the case of the RADEON 9700 in the previous test: bugs in software or incompatibility (a patch is probably needed).

Code Creatures

This test is based on the CodeCult's engine which is used for several games. 
 












The engine uses almost all modern capabilities of video cards of the latest generation. And the demo program based on this engine contains quite tough scenes as to the texture size, geometry and effects used. 




In 1600x1200x32 the RADEON 9700 has the following lead: 31% over the Ti 4600, 110% over the Parhelia, and 320% over the RADEON 8500. 

As you can see, the test is quite tough even for the super-accelerator, that is why it makes no sense to run the card with the AA and/or anisotropy. 

Comanche4 DEMO







In 1600x1200x32 the RADEON 9700 has the following lead: 0% over the Ti 4600, 88% over the Parhelia, and 34% over the RADEON 8500. 

With the AA the gain over the Ti 4600 (AA4x) is 58%, over Parhelia (FAA16x) - 113%, and over RADEON 8500 (AA4xP) - 149%. 

With the anisotropic filtering (quality mode) the gain over the Ti 4600 (ANIS 8) is 35%, and over the RADEON 8500 (ANIS 16) - 23%. 

When the AA and anisotropy are both switched on the new Canadian product has the following gain (in 1280x1024): 37% over Ti 4600, 100% over Parhelia, and 100% over RADEON 8500. 

The CPU strongly affects this test, and the load on the cards is very high, which is why the overall performance is low at the maximum quality level. So even a 50% gain is a great achievement. 

Unreal Tournament 2003 DEMO b.927






In 1600x1200x32 the RADEON 9700 has the following lead: 0% over the Ti 4600, 54% over the Parhelia, and 230% over the RADEON 8500. 

With the AA the gain over the Ti 4600 (AA4x) is not accounted as AA4x doesn't work here for some reason, over Parhelia (FAA16x) - 25%, and over RADEON 8500 (AA4xP) - 364%. 

With the anisotropic filtering (quality mode) the gain over the Ti 4600 (ANIS 8) is 12%, and over the RADEON 8500 (ANIS 16) - 76%. 

When the AA and anisotropy are both switched on the new Canadian product has the following gain (in 1280x1024): no results for the Ti 4600, -7% over Parhelia, and 165% over RADEON 8500. 

This test puzzles many, but we decided to keep it here until the final DEMO version is released. We already wrote in our 3Digest that the performance of ATI's cards in it was inexplicable. We are inclined to think that ATI's drivers are to blame here. 

AquaMark






In 1600x1200x32 the RADEON 9700 has the following lead: 4% over the Ti 4600, 75% over the Parhelia, and 76% over the RADEON 8500. 

With the AA the gain over the Ti 4600 (AA4x) is 101%, over Parhelia (FAA16x) - 66%, and over RADEON 8500 (AA4xP) - 169%. 

With the anisotropic filtering (quality mode) the gain over the Ti 4600 (ANIS 8) is 278%, and over the RADEON 8500 (ANIS 16) - 38%. 

When the AA and anisotropy are both switched on the new Canadian product has the following gain (in 1280x1024): 157% over the Ti 4600, 40% over Parhelia, and 142% over RADEON 8500. 

This test was released in 2001, when the GeForce3 based cards had just appeared, and at that time it was very demanding. Even the current leader doesn't impress with its speed here, though the performance is tenfold better. Note that the speed of the GeForce4 falls dramatically when anisotropy is enabled (this filtering type processes all textures without distinction, and this test uses a lot of semitransparent textures). 

RightMark 3D





In 1600x1200x32 the RADEON 9700 has the following lead: 48% over the Ti 4600, 86% over the Parhelia, and 211% over the RADEON 8500. 

This test is also very demanding. It lets us estimate not only whether modern accelerators can operate with the DX8.1 functions enabled but also how fast they work with them. 

3D quality

ANISOTROPIC FILTERING

In our 3Digest you can find information about this function: its operation and what it's for. 

We know that different VPU makers implement this function in their own way. And speed characteristics of anisotropy, say from ATI and from NVIDIA, are very different. Only the output quality is similar. 

Till recently this postulate was true (at least for implementation of this function from ATI). Now it's different. Now the Canadian developers give you a choice: either pure anisotropy or together with trilinear filtering. 

What do the "Performance/Quality" switches mean, which are located in the section related to the anisotropic filtering? Let's take a look at the Fillrate test from the future RightMark 3D. We will enable trilinear filtering (the usual mode is on the left, and with MIP levels shading on the right): 




Works perfectly. Now we enable anisotropy in the Performance mode: 



The trilinear filtering disappears, though the anisotropy is of excellent quality. Let's compare it with the same filtering on the RADEON 8500; we purposely placed the walls in the tunnel at 45 degrees :-). On the left is a screenshot that shows the tunnel exactly in this position. And on the right it is turned by 40 degrees. 
 

RADEON 8500




GeForce4                                                        RADEON 9700



Have you noticed the difference? :-) And now look at the screenshots of the RADEON 9700 above to make sure that the problem of angles close to 45 degrees is solved. 
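
Why angles close to 45 degrees are the worst case for a RIP-mapping-style approach can be shown with a toy geometric model. A scheme that can stretch the filter only along the texture's own u and v axes sees just the axis-aligned extents of the pixel footprint, and for a footprint elongated along the diagonal these extents are equal, so the anisotropy goes unnoticed. The Python sketch below is only this geometric illustration under our own simplifying assumptions; it is not the actual hardware algorithm of ATI or NVIDIA, and all function names in it are hypothetical.

```python
import math


def footprint(ratio, angle_deg):
    """Major and minor axis vectors (in texels) of a pixel footprint that is
    `ratio` times longer than wide and rotated by `angle_deg` in texture space."""
    a = math.radians(angle_deg)
    major = (ratio * math.cos(a), ratio * math.sin(a))
    minor = (-math.sin(a), math.cos(a))
    return major, minor


def true_ratio(major, minor):
    """Real degree of anisotropy: length of the long axis over the short one."""
    return math.hypot(*major) / math.hypot(*minor)


def axis_aligned_ratio(major, minor):
    """Anisotropy visible to a scheme that only compares the footprint's
    extents along u and v (a simplified stand-in for RIP mapping)."""
    extent_u = abs(major[0]) + abs(minor[0])
    extent_v = abs(major[1]) + abs(minor[1])
    return max(extent_u, extent_v) / min(extent_u, extent_v)


if __name__ == "__main__":
    for angle in (0, 15, 30, 45):
        major, minor = footprint(8, angle)
        print(angle,
              round(true_ratio(major, minor), 2),
              round(axis_aligned_ratio(major, minor), 2))
```

In this model the real ratio stays at 8 for any angle, while the axis-aligned estimate equals 8 only at 0 degrees and collapses to about 1 at 45 degrees, which corresponds to the blurred surfaces in the RADEON 8500 screenshots.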

Now let's return to the difference in operation of this "Performance/Quality" function. We saw above that although the anisotropy quality grew and doesn't depend on angles anymore, in the Performance mode it is still unable to coexist with the trilinear filtering. Now the Quality mode: 




Both filtering types work! This applies to all Direct3D applications. As it turned out, when anisotropy is forced this way, the filtering modes in games obey the drivers irrespective of what is set in the games themselves, except for 3DMark2001, where the trilinear filtering seems to be switched off completely. 

It's OK in the OpenGL as well. Let's test it in the Quake3. Performance mode: 




The anisotropy works, the trilinear doesn't (irrespective of whether it's enabled in the game). Now Quality mode: 



Well, it's a really pleasant outcome for ATI's fans. The test results given earlier show that at ANIS 16 and with almost complete anisotropy the performance drop is not as great as on the GeForce4 Ti. But so far it works only in a very expensive High-End card. This will probably be left untouched in the RADEON 9500, but we have nothing to say yet about either the performance or the price of that card (it could be $180-200 as well as $250-290). 

At the end of the anisotropy section look at some screenshots: 

Screenshots: RADEON 8500 vs. RADEON 9700 at ANIS 0 and ANIS 16 in 3DMark2001 Game 1, 3DMark2001 Game 2 and Serious Sam: TSE.




ANTI-ALIASING (AA)

In the very beginning we studied some aspects of operation of AA for the RADEON 9700, the performance was estimated in the previous section and here we are going to look at quality. 

Screenshots (Example 1 and Example 2): No AA, AA 4x and AA 6x in 3DMark2001 Game 1, Game 3 and Game 4 and in Serious Sam: TSE.

  1. There is almost no difference between the 4x and 6x modes. So why pay more? 
  2. As expected, MSAA doesn't work on transparent textures (look at the leaves in Game4). 
  3. On the whole, the AA quality is very good and matches that of the GeForce4 Ti. In Direct3D with AA the LOD BIAS shifts to negative values, which is why the images get sharper. 
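
The effect of a negative LOD BIAS mentioned in the last item can be illustrated with the textbook MIP level selection formula: the level is the base-2 logarithm of the number of texels covered by a pixel along one axis, plus the bias. A minimal Python sketch; the bias value of -0.5 and the limit of 10 MIP levels are hypothetical, as we don't know the exact shift the driver applies.

```python
import math


def mip_level(texels_per_pixel, lod_bias=0.0, max_level=10):
    """Textbook MIP level selection.

    texels_per_pixel: how many texels one screen pixel covers along one axis.
    lod_bias:         added to the computed level; negative values select a
                      more detailed (sharper) level.
    The result is clamped to the range [0, max_level]; level 0 is the
    full-resolution texture.
    """
    level = math.log2(texels_per_pixel) + lod_bias
    return min(max(level, 0.0), float(max_level))


if __name__ == "__main__":
    # Hypothetical bias of -0.5, for illustration only.
    for footprint in (1.0, 2.0, 4.0, 8.0):
        print(footprint, mip_level(footprint), mip_level(footprint, lod_bias=-0.5))
```

A lower level number means a larger, more detailed texture, so the same surface looks sharper; with AA enabled the additional detail is less likely to turn into texture shimmering.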

Overall 3D quality

We have run just a few games, but some artifacts have already shown up. First of all, we found them in some games released in 2000-2001 (RealMYST, Sacrifice). The latest games have none. The details will be available in our 3Digest; the gallery of screenshots will get more pictures from the RADEON 9700 (Morrowind has no artifacts :-) ). 

Conclusion

Our cinema hall has still one free seat meant for the NV30. Let's wait for this last spectator to draw the overall conclusion on the accelerators of the latest generation. And at this moment we have the following to say. 
  1. In general, ATI managed to make a good card in spite of such a sophisticated chip and its high operating temperature (without additional cooling the chip's temperature can reach 85 degrees (!), and the PCB heats up to 68-70). But this doesn't affect the stability of the card (we ran 3DMark2001 for 6 hours without a break). 
  2. The RADEON 9700 is absolutely the king today. But the cards will start shipping only in September; besides, NVIDIA will soon release its NV30, which might become the leader this autumn. It will probably not kill the RADEON 9700, but it may well push ATI off the throne or make it cut prices. 
  3. As far as operation of the card is concerned, we are currently unable to evaluate the DirectX 9 functions because DX9 is not released yet, and not even a beta version of the DX9 drivers is available. That is why we will continue the RADEON 9700 review quite soon; that will be the third part. In the second part we will deal with video functions, TV-out and AGP8x (today we took measurements at AGP4x). 
  4. It should be noted that when the ATI RADEON 9700 Pro is installed into the Soltek 75DRV5 (VIA KT333) mainboard, only the AGP 1x and 2x (!) modes are available for some reason (maybe because of the BIOS). The 4x mode doesn't work, though the board can enable it with other cards. As the 2x mode is obviously unsupported by the video card (!), any 3D application causes hang-ups. 
  5. The software still has a long way to go; in particular, in the driver panel one must switch some settings 2-3 times to make them work as required. Well, ATI's programmers have something to work on (although the 6.143 version is certified). 
  6. Although the 3D speed of the card is quite low in the synthetic tests, the performance is good enough. The 8x1 configuration will definitely tell in future fillrate-hungry games. But even now I can see that such a card is not worth using without AA and/or anisotropy! So if you are ready to pay $400 for this super accelerator, you should start learning the basics of 3D graphics in order to adjust the functions affecting 3D quality correctly. 
  7. We have already mentioned the downsides (low fillrate because of 8x1, high prices, no availability on the market, artifacts in some games). The advantages, apart from the speed, include the improved anisotropic filtering and the modern AA features (high quality at a moderate speed cost). 
I hope these cards will soon reach the shelves and the prices will go down swiftly (this will make the prices of the GeForce4 Ti 4600 cards fall as well). And we congratulate ATI's fans on the new 3D king in the gaming sector (maybe not for long, but ATI did manage to outperform the leader). 

Once again I must say that the current performance of the card may change dramatically after the release of the DirectX 9 and the respective drivers, that is why the subject is still open. Besides, soon we will publish the second part concerning aspects of operation of the RADEON 9700 with DVD, TV-out and some tests on the KT400 (AGP8x). 
 

Andrey Vorobiev (anvakams@ixbt.com)
Alexander Medvedev (unclesam@ixbt.com)



Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.