ATI RADEON 9700 Pro 128MB Video Card Review
For a start we recommend that you read the analytical article dealing with
the architecture and specifications of the RADEON 9700 (R300).
CONTENTS
-
General information
-
Theoretical aspects of implementation of anti-aliasing and anisotropic filtering
-
Peculiarities of the ATI RADEON 9700 Pro 128MB video card
-
Test system configuration and drivers' settings
-
Test results: briefly on 2D, extreme tests from DirectX 8.1 SDK and synthetic (DirectX 9.0 based) tests
-
Test results: 3DMark2001 SE synthetic tests
-
Test results: 3DMark2001 SE game tests
-
Test results: Quake3 ARENA
-
Test results: Serious Sam: The Second Encounter
-
Test results: Return to Castle Wolfenstein
-
Test results: Code Creatures DEMO
-
Test results: Comanche4 DEMO
-
Test results: Unreal Tournament 2003 DEMO
-
Test results: AquaMark
-
Test results: RightMark 3D
-
3D quality: Anisotropic filtering
-
3D quality: Anti-aliasing
-
Overall 3D quality
-
Conclusion
In this review we are not going to touch on the architecture or specs of
the RADEON 9700 chip (also known as R300); instead, we will examine its
capabilities and performance. The current line of R300 based cards is as
follows:
-
RADEON 9700 PRO - 325 MHz chip, 128 MB 310 MHz (DDR 620) 256 bit local
memory;
-
RADEON 9700 - 300 MHz chip, 128 MB 300 MHz (DDR 600) 256 bit local memory;
-
RADEON 9500 - the chip cut to 4 pipelines, 128bit local memory;
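The clock and bus figures in the list above translate directly into peak memory bandwidth. The following is a minimal sketch of that arithmetic, assuming the usual definition (effective transfer rate times bus width in bytes); the function name is ours and only the numbers come from the list.

```python
def memory_bandwidth_gb_s(clock_mhz, bus_bits, ddr=True):
    """Peak local-memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    # DDR memory transfers data twice per clock cycle.
    transfers_per_second = clock_mhz * 1e6 * (2 if ddr else 1)
    bytes_per_transfer = bus_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# (memory clock in MHz, bus width in bits), taken from the list above
cards = {
    "RADEON 9700 PRO": (310, 256),
    "RADEON 9700": (300, 256),
}

for name, (clock, bus) in cards.items():
    print(f"{name}: {memory_bandwidth_gb_s(clock, bus):.2f} GB/s")
```

This is a theoretical peak; real throughput depends on access patterns and the memory controller.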
We are testing the senior card of the line, which is also the only one
available on the market today - the RADEON 9700 PRO. It's the first time we
have used DirectX 9 (beta 2) for testing chips. Before the practical tests,
look at the DX9 capabilities currently supported by the card; in parentheses
we rate the respective values:
-
Texture size - up to 2048x2048 (standard)
-
Non-square textures supported (good)
-
Light sources (max.) - 8 (standard)
-
Texture fetch stages - 8 (excellent)
-
Combination stages - 8 (standard)
-
Clipping surfaces - 6 (excellent)
-
Sprite size (max.) - 256 (excellent)
-
Primitives for one call (max.) - 65535 (questionable)
-
Vertex buffer size - 16777215 (excellent)
-
Vertex streams (max.) - 16 (excellent)
-
Vertex shaders version (max.) - 1.1 (bad)
-
Constants of vertex shader - 256 (excellent)
-
Pixel shaders version (max.) - 1.4 (bad)
-
Pixel shader (max. value) - 3.40282E+038 (the highest value for the floating
point F32 format, excellent)
-
Multisampling modes: no, 2, 4, 6 samples (only in the X8R8G8B8 mode, in
the frame buffer mode with alpha channel A8R8G8B8 it's not available).
-
Render target format (good):
-
D3DFMT_A8R8G8B8
-
D3DFMT_X8R8G8B8
-
D3DFMT_R5G6B5
-
D3DFMT_A1R5G5B5
-
D3DFMT_A4R4G4B4
-
Depth buffer formats (good):
-
D3DFMT_D16_LOCKABLE
-
D3DFMT_D24S8
-
D3DFMT_D24X8
-
D3DFMT_D16
-
Texture formats (strange):
-
D3DFMT_A8R8G8B8
-
D3DFMT_X8R8G8B8
-
D3DFMT_R5G6B5
-
D3DFMT_X1R5G5B5
-
D3DFMT_A1R5G5B5
-
D3DFMT_A4R4G4B4
-
D3DFMT_R3G3B2
-
D3DFMT_L8
-
D3DFMT_V8U8
-
D3DFMT_L6V5U5
-
D3DFMT_X8L8V8U8
-
D3DFMT_Q8W8V8U8
-
D3DFMT_V16U16
-
D3DFMT_UYVY
-
D3DFMT_YUY2
-
Cube texturing formats (OK):
-
D3DFMT_A8R8G8B8
-
D3DFMT_X8R8G8B8
-
D3DFMT_R5G6B5
-
D3DFMT_X1R5G5B5
-
D3DFMT_A1R5G5B5
-
D3DFMT_A4R4G4B4
-
D3DFMT_R3G3B2
-
D3DFMT_L8
-
D3DFMT_UYVY
-
D3DFMT_YUY2
-
Volume texture formats (OK):
-
D3DFMT_A8R8G8B8
-
D3DFMT_X8R8G8B8
-
D3DFMT_R5G6B5
-
D3DFMT_X1R5G5B5
-
D3DFMT_A1R5G5B5
-
D3DFMT_A4R4G4B4
-
D3DFMT_R3G3B2
-
D3DFMT_L8
-
D3DFMT_UYVY
-
D3DFMT_YUY2
Filtering modes for single texturing (excellent):
-
D3DPTFILTERCAPS_MINFPOINT
-
D3DPTFILTERCAPS_MINFLINEAR
-
D3DPTFILTERCAPS_MINFANISOTROPIC
-
D3DPTFILTERCAPS_MIPFPOINT
-
D3DPTFILTERCAPS_MIPFLINEAR
-
D3DPTFILTERCAPS_MAGFPOINT
-
D3DPTFILTERCAPS_MAGFLINEAR
-
D3DPTFILTERCAPS_MAGFANISOTROPIC
-
Filtering modes for cube texturing (good):
-
D3DPTFILTERCAPS_MINFPOINT
-
D3DPTFILTERCAPS_MINFLINEAR
-
D3DPTFILTERCAPS_MIPFPOINT
-
D3DPTFILTERCAPS_MIPFLINEAR
-
D3DPTFILTERCAPS_MAGFPOINT
-
D3DPTFILTERCAPS_MAGFLINEAR
-
Filtering modes for volume textures (good):
-
D3DPTFILTERCAPS_MINFPOINT
-
D3DPTFILTERCAPS_MINFLINEAR
-
D3DPTFILTERCAPS_MIPFPOINT
-
D3DPTFILTERCAPS_MIPFLINEAR
-
D3DPTFILTERCAPS_MAGFPOINT
-
D3DPTFILTERCAPS_MAGFLINEAR
You might notice that there are no traces of DX9 here. The reason is that
the currently available drivers support only the old DDI8 (Device Driver
Interface 8) and can't expose any capabilities beyond DX8. Only when DDI9
drivers appear (ATI will likely make them available only when Microsoft
stops making changes to the interface and gives its permission) will we be
able to test the new capabilities of the chip; for now we have just the old
ones. Although DX9 works with such drivers without speed losses (the test
results of the applications do not differ from DX8 within the measurement
error), we are deprived of the most interesting capabilities of the RADEON
9700 PRO - version 2.0 pixel and vertex shaders and floating-point formats
for textures and the frame buffer. On the other hand, nothing prevents us
from estimating the performance and implementation of AA and anisotropic
filtering, as well as certain characteristics of the chip such as its
fillrate and the performance of its geometry unit.
Because of the early driver (or the peculiar operation of DX9 with DDI8),
the list of supported textures lacks any compression formats.
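The shader versions in the capability list above (1.1 for vertex shaders, 1.4 for pixel shaders) are reported by Direct3D as packed integers. Below is a small sketch of how such values can be decoded and checked against the 2.0 level expected of a DX9 part. The packing mirrors the D3DVS_VERSION and D3DPS_VERSION macros of the DirectX SDK headers; the Python helper names are our own and purely illustrative.

```python
def d3dvs_version(major, minor):
    """Pack a vertex shader version the way the D3DVS_VERSION macro does."""
    return 0xFFFE0000 | (major << 8) | minor


def d3dps_version(major, minor):
    """Pack a pixel shader version the way the D3DPS_VERSION macro does."""
    return 0xFFFF0000 | (major << 8) | minor


def decode_shader_version(caps_value):
    """Return (kind, major, minor) for a packed shader version value."""
    kind = {0xFFFE: "vertex", 0xFFFF: "pixel"}.get(caps_value >> 16, "unknown")
    major = (caps_value >> 8) & 0xFF
    minor = caps_value & 0xFF
    return kind, major, minor


def supports_shader_model_2(caps_value):
    """True if the packed version is at least 2.0."""
    _, major, _ = decode_shader_version(caps_value)
    return major >= 2


# Values matching what the DDI8 driver reports in the list above.
reported_vs = d3dvs_version(1, 1)
reported_ps = d3dps_version(1, 4)

for value in (reported_vs, reported_ps):
    kind, major, minor = decode_shader_version(value)
    status = "DX9 level" if supports_shader_model_2(value) else "DX8 level"
    print(f"{kind} shaders {major}.{minor}: {status}")
```

With the values from the list, both lines report the DX8 level, which matches the observation that no DX9 capabilities are exposed yet.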
Here is a list of currently available OpenGL extensions and the OpenGL
ICD version:
ATI, Radeon 9700 x86/SSE2, version 1.3.3259 | NVIDIA, GeForce4 Ti 4600/AGP/SSE2, version 1.3.1 | ATI, Radeon 8500 DDR x86/SSE2, version 1.3.2475
GL_ARB_depth_texture | GL_ARB_imaging | GL_ARB_multitexture
GL_ARB_multitexture | GL_ARB_multisample | GL_ARB_texture_border_clamp
GL_ARB_point_parameters | GL_ARB_multitexture | GL_ARB_texture_compression
GL_ARB_shadow | GL_ARB_texture_border_clamp | GL_ARB_texture_cube_map
GL_ARB_shadow_ambient | GL_ARB_texture_compression | GL_ARB_texture_env_add
GL_ARB_texture_border_clamp | GL_ARB_texture_cube_map | GL_ARB_texture_env_combine
GL_ARB_texture_compression | GL_ARB_texture_env_add | GL_ARB_texture_env_crossbar
GL_ARB_texture_cube_map | GL_ARB_texture_env_combine | GL_ARB_texture_env_dot3
GL_ARB_texture_env_add | GL_ARB_texture_env_dot3 | GL_ARB_transpose_matrix
GL_ARB_texture_env_combine | GL_ARB_transpose_matrix | GL_ARB_vertex_blend
GL_ARB_texture_env_crossbar | GL_S3_s3tc | GL_ARB_window_pos
GL_ARB_texture_env_dot3 | GL_EXT_abgr | GL_S3_s3tc
GL_ARB_transpose_matrix | GL_EXT_bgra | GL_ATI_element_array
GL_ARB_vertex_blend | GL_EXT_blend_color | GL_ATI_envmap_bumpmap
GL_ARB_vertex_program | GL_EXT_blend_minmax | GL_ATI_fragment_shader
GL_ARB_window_pos | GL_EXT_blend_subtract | GL_ATI_map_object_buffer
GL_S3_s3tc | GL_EXT_compiled_vertex_array | GL_ATI_pn_triangles
GL_ATI_element_array | GL_EXT_separate_specular_color | GL_ATI_texture_mirror_once
GL_ATI_envmap_bumpmap | GL_EXT_fog_coord | GL_ATI_vertex_array_object
GL_ATI_fragment_shader | GL_EXT_multi_draw_arrays | GL_ATI_vertex_streams
GL_ATI_map_object_buffer | GL_EXT_packed_pixels | GL_ATIX_texture_env_combine3
GL_ATI_separate_stencil | GL_EXT_paletted_texture | GL_ATIX_texture_env_route
GL_ATI_texture_mirror_once | GL_EXT_point_parameters | GL_ATIX_vertex_shader_output_point_size
GL_ATI_vertex_array_object | GL_EXT_rescale_normal | GL_EXT_abgr
GL_ATI_vertex_streams | GL_EXT_clip_volume_hint | GL_EXT_bgra
GL_ATIX_texture_env_route | GL_EXT_draw_range_elements | GL_EXT_blend_color
GL_ATIX_vertex_shader_output_point_size | GL_EXT_shared_texture_palette | GL_EXT_blend_func_separate
GL_EXT_abgr | GL_EXT_stencil_wrap | GL_EXT_blend_minmax
GL_EXT_bgra | GL_EXT_texture3D | GL_EXT_blend_subtract
GL_EXT_blend_color | GL_EXT_texture_compression_s3tc | GL_EXT_clip_volume_hint
GL_EXT_blend_func_separate | GL_EXT_texture_edge_clamp | GL_EXT_compiled_vertex_array
GL_EXT_blend_minmax | GL_EXT_texture_env_add | GL_EXT_draw_range_elements
GL_EXT_blend_subtract | GL_EXT_texture_env_combine | GL_EXT_fog_coord
GL_EXT_clip_volume_hint | GL_EXT_texture_env_dot3 | GL_EXT_packed_pixels
GL_EXT_compiled_vertex_array | GL_EXT_texture_cube_map | GL_EXT_point_parameters
GL_EXT_draw_range_elements | GL_EXT_texture_filter_anisotropic | GL_ARB_point_parameters
GL_EXT_fog_coord | GL_EXT_texture_lod | GL_EXT_rescale_normal
GL_EXT_packed_pixels | GL_EXT_texture_lod_bias | GL_EXT_secondary_color
GL_EXT_point_parameters | GL_EXT_texture_object | GL_EXT_separate_specular_color
GL_EXT_rescale_normal | GL_EXT_vertex_array | GL_EXT_stencil_wrap
GL_EXT_secondary_color | GL_EXT_vertex_weighting | GL_EXT_texgen_reflection
GL_EXT_separate_specular_color | GL_HP_occlusion_test | GL_EXT_texture_env_add
GL_EXT_stencil_wrap | GL_IBM_texture_mirrored_repeat | GL_EXT_texture3D
GL_EXT_texgen_reflection | GL_KTX_buffer_region | GL_EXT_texture_compression_s3tc
GL_EXT_texture_env_add | GL_NV_blend_square | GL_EXT_texture_cube_map
GL_EXT_texture3D | GL_NV_copy_depth_to_color | GL_EXT_texture_edge_clamp
GL_EXT_texture_compression_s3tc | GL_NV_evaluators | GL_EXT_texture_env_combine
GL_EXT_texture_cube_map | GL_NV_fence | GL_EXT_texture_env_dot3
GL_EXT_texture_edge_clamp | GL_NV_fog_distance | GL_EXT_texture_lod_bias
GL_EXT_texture_env_combine | GL_NV_light_max_exponent | GL_EXT_texture_filter_anisotropic
GL_EXT_texture_env_dot3 | GL_NV_multisample_filter_hint | GL_EXT_texture_object
GL_EXT_texture_filter_anisotropic | GL_NV_occlusion_query | GL_EXT_vertex_array
GL_EXT_texture_lod_bias | GL_NV_packed_depth_stencil | GL_EXT_vertex_shader
GL_EXT_texture_object | GL_NV_point_sprite | GL_KTX_buffer_region
GL_EXT_vertex_array | GL_NV_register_combiners | GL_NV_texgen_reflection
GL_EXT_vertex_shader | GL_NV_register_combiners2 | GL_NV_blend_square
GL_HP_occlusion_test | GL_NV_texgen_reflection | GL_SGI_texture_edge_clamp
GL_KTX_buffer_region | GL_NV_texture_compression_vtc | GL_SGIS_texture_border_clamp
GL_NV_texgen_reflection | GL_NV_texture_env_combine4 | GL_SGIS_texture_lod
GL_NV_blend_square | GL_NV_texture_rectangle | GL_SGIS_generate_mipmap
GL_SGI_texture_edge_clamp | GL_NV_texture_shader | GL_SGIS_multitexture
GL_SGIS_texture_border_clamp | GL_NV_texture_shader2 | GL_WIN_swap_hint
GL_SGIS_texture_lod | GL_NV_texture_shader3 | WGL_EXT_extensions_string
GL_SGIS_generate_mipmap | GL_NV_vertex_array_range | WGL_EXT_swap_control
GL_SGIS_multitexture | GL_NV_vertex_array_range2 | -
GL_WIN_swap_hint | GL_NV_vertex_program | -
WGL_EXT_extensions_string | GL_NV_vertex_program1_1 | -
WGL_EXT_swap_control | GL_SGIS_generate_mipmap | -
GL_ARB_multisample | GL_SGIS_multitexture | -
- | GL_SGIS_texture_lod | -
- | GL_SGIX_depth_texture | -
- | GL_SGIX_shadow | -
- | GL_WIN_swap_hint | -
- | WGL_EXT_swap_control | -
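Extension lists like the ones above are easier to compare as sets. Here is a minimal sketch, assuming the lists are available as space-separated strings (the form in which an OpenGL driver returns them); the two strings below are only a small hand-picked subset of the RADEON 9700 and RADEON 8500 columns, and the helper name is ours.

```python
def parse_extensions(extension_string):
    """Split a space-separated extension string into a set of names."""
    return set(extension_string.split())


# Small subsets of the lists above, chosen for illustration only.
radeon_9700 = parse_extensions(
    "GL_ARB_depth_texture GL_ARB_multitexture GL_ARB_shadow "
    "GL_ARB_vertex_blend GL_ARB_vertex_program "
    "GL_ATI_fragment_shader GL_ATI_separate_stencil"
)
radeon_8500 = parse_extensions(
    "GL_ARB_multitexture GL_ARB_vertex_blend "
    "GL_ATI_fragment_shader GL_ATI_pn_triangles"
)

only_9700 = radeon_9700 - radeon_8500   # new in the R300 driver
only_8500 = radeon_8500 - radeon_9700   # missing from the R300 driver
common = radeon_9700 & radeon_8500

print("Only on RADEON 9700:", sorted(only_9700))
print("Only on RADEON 8500:", sorted(only_8500))
print("Common:", sorted(common))
```

Running the same comparison on the complete columns would give the full picture; the subset here only demonstrates the method.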
Theoretical aspects of implementation of anti-aliasing and anisotropic filtering
Contrary to the previous chip, the RIP mapping based anisotropy has been corrected
in the R300 - irrespective of the angle of inclination of a plane relative to the
Z axis, the anisotropy works without bugs. The fault wasn't in RIP mapping itself;
it was a peculiarity of its implementation in ATI's previous chips. However, the
more correct implementation in the R300 costs more in performance.
AA has changed as well. As before, one of the pseudorandom sample patterns
with two, four or six samples is chosen (this is the main difference from NVIDIA,
where the pattern is always the same, though it depends on the AA method). But now
the samples are taken according to the multisampling method, as in NVIDIA's
chips. The fillrate of polygons is expected to rise greatly and the situation on
their edges will remain the same, but edges of transparent polygons will be processed
incorrectly. Well, that's the cost of the increased AA speed. Later we will examine
the speed and quality of the AA in detail.
And now let's take a look at the card.
Card
The card has an AGP 4x/8x interface and 128 MB of DDR SDRAM
(8 chips located on both sides of the PCB).
The card comes with Samsung K4D26323RA-GC2A memory in BGA packages. The highest
rated frequency of this memory is 350 (700) MHz, which indicates that the access
time is 2.8 ns rather than 2.2 ns as was stated earlier in some reviews. By
default the memory works at 310 (620) MHz.
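The link between access time and rated clock mentioned above is a simple reciprocal. Here is a small sketch of the conversion, assuming the access time equals one clock period; the function name is ours.

```python
def max_clock_mhz(access_time_ns):
    """Highest clock (MHz) for a memory chip with the given cycle time in ns."""
    # One period of access_time_ns nanoseconds corresponds to 1000 / ns MHz.
    return 1000.0 / access_time_ns


for access_time in (2.8, 2.2):
    clock = max_clock_mhz(access_time)
    print(f"{access_time} ns -> about {clock:.0f} MHz ({2 * clock:.0f} MHz DDR)")
```

A 2.8 ns part comes out at roughly 357 MHz, in line with the 350 MHz rating, while a 2.2 ns part would be rated at around 455 MHz.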
The memory chips in the new BGA package have become quite popular among video
card makers, and we won't focus on their advantages once again.
ATI RADEON 9700 Pro 128MB
At first glance the card looks ordinary. Certainly, a 256-bit high-speed bus
makes the PCB more complicated, but while the cards from Matrox and 3Dlabs were
entirely shielded, here only the left part is screened:
And the main part of the PCB containing the memory and the chip is not protected:
The card works only with an external power supply. It comes with an adapter/splitter
that connects the card to a standard lead of the power supply unit:
The set of interface connectors is standard: VGA, DVI and TV-out (S-Video).
There is an adapter for connecting TV-out via RCA.
The right part of the PCB is very similar to the RADEON 8500, especially in
positions of the memory chips. However, the chip has a huge heatsink which is
not typical of ATI :-).
But it's necessary here as the VPU heats up greatly (because of a large number
of transistors at 325 MHz and the .15-micron fab process). By the way, look at
the package's form of the processor:
Well, the FCPGA packaging, with an exposed flip-chip die, has reached
graphics chips. In the Matrox Parhelia Review
I was surprised at the dimensions of the chip, though it had a metal cover of
approximately the same size on top, and here the cover is absent. There is a great
number of pins here (well, it's a 256-bit bus).
That's all about the card itself. Note that in the second part of the RADEON
9700 Pro review we will examine peculiarities of the card's operation in the
dual-monitor and TV-out modes (this time we have to leave them aside as we are short
of time). Besides, the second part will also shed light on operation with AGP 8x.
The card is supplied with:
two CDs with software (drivers, MMC 7.8 etc.) and marketing materials, a quite
small S-Video-to-RCA adapter (without a cable), and a DVI-to-d-Sub adapter.
And here is the box the RADEON 9700 PRO based cards will be shipping in:
Look at the right-hand corner. As you might know, at Quakecon'2002 such cards
were demonstrated in special system units. The promised fancy boxes priced at
$450 (a RADEON 9700 Pro, souvenirs, DOOM III promised free of charge, and a
remote control from ATI) were not on sale there and could only be ordered.
Overclocking
When we started testing at the rated frequencies, we thought that overclocking
wouldn't be possible at such high temperatures. We turned out to be wrong! The
latest version (3.21) of PowerStrip is able to work with the RADEON 9700.
ATI RADEON 9700 Pro 128MB: 325/620 -> 350/700 MHz
Frankly speaking, the chip is able to work at 370 MHz, but we noticed
no gains relative to 350 MHz in the standard modes (without AA and/or anisotropy).
A bit later we ran the card again at 370 MHz, this time under the
maximum load, and made sure that it operated stably. That is why the diagrams
below show these results as "brickwork" bars. Although the processor
is quite sophisticated and runs rather hot, there is
some overclocking potential. The FCPGA packaging helps a lot here. Besides,
ATI selects the best chips for samples. Note that
-
in the course of overclocking you must provide additional cooling, in particular
for the card (first of all, for its memory);
-
overclocking depends on the particular sample, and you shouldn't generalize
the results of one card to all video cards of this brand or series. The
overclocking results are not guaranteed characteristics of a video
card.
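To put the overclocking figures above into perspective, the relative gains can be computed directly. This is a plain arithmetic sketch using only the frequencies quoted in this section; the function name is ours.

```python
def overclock_gain_percent(default_mhz, overclocked_mhz):
    """Relative frequency increase in percent."""
    return (overclocked_mhz - default_mhz) / default_mhz * 100.0


core_gain = overclock_gain_percent(325, 350)      # core, stable setting
memory_gain = overclock_gain_percent(620, 700)    # memory, effective DDR clock
core_gain_370 = overclock_gain_percent(325, 370)  # core, maximum we tried

print(f"Core 325 -> 350 MHz: +{core_gain:.1f}%")
print(f"Memory 620 -> 700 MHz: +{memory_gain:.1f}%")
print(f"Core 325 -> 370 MHz: +{core_gain_370:.1f}%")
```

The memory gets the larger relative boost, which is one reason why extra cooling for the memory chips matters.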
Test system and drivers
Testbed:
-
Pentium 4 based system (Socket 478):
-
Intel Pentium 4 2200 (L2=512K);
-
ASUS P4T-E (i850);
-
512 MB RDRAM PC800;
-
Quantum FB AS 20GB;
-
Windows XP.
The test system was coupled with ViewSonic P810 (21") and
ViewSonic P817 (21") monitors. The test system based on the AMD Athlon XP
will be used in the second part, where we will also estimate operation
in the AGP 8x mode (on the VIA KT400).
In the tests we used ATI's drivers v6.143 (this driver is meant only
for the RADEON 9*** series; cards of the previous generation are not supported,
at least not yet). VSync was off, and texture compression was off.
The following cards are used for comparison:
-
ASUS V8460Ultra (GeForce4 Ti 4600, 300/325 (650) MHz, 128 MB, driver 30.82);
-
Matrox Parhelia (220/275 (550) MHz, 128 MB, driver 2.31);
-
Gigabyte MAYA AP128DG-H RADEON 8500 Deluxe (275/275 (550) MHz, 128 MB,
driver 6.118).
Drivers' settings
Only the DirectX 8 drivers have been released! The DirectX 9.0 drivers
are expected only in October. The control settings are almost standard
for the entire new CATALYST driver series, except for anti-aliasing (SmoothVision
II) and anisotropy. As you can see, apart from the level you can choose a mode
of anisotropic filtering: performance or quality. Later we will see how
they differ (in short, in whether trilinear filtering can operate
together with anisotropy).
The AA is finally put in order. Instead of vague performance/quality
modes combined with AA levels, we have three modes: 2x, 4x and 6x. This spares
users from agonizing over a choice. Below we will take a look at the most
interesting modes - 4x and 6x.
Test results
2D graphics
Despite the high frequency and complexity of the card, the 2D quality is
superb! Interestingly, the colors are richer than on the RADEON 8500 (I switched
the monitor to it for comparison).
But remember that 2D quality depends on the particular sample, as well as on
the card/monitor tandem (first of all, on the quality of the monitor and cable).
3D graphics, MS DirectX 8.1 SDK - extreme tests
This time we wanted to test the extreme characteristics using prototypes
of the synthetic tests developed within the framework of the open graphics
benchmark RightMark 3D. But as the basic DX9 capabilities are not supported
by the drivers, we used two old tests from the DX 8.1 SDK and left some of the
new DX9 tests aside.
So, for testing various extreme characteristics of the chips we used
examples from the latest official version of the DirectX SDK (8.1), modified
for better handling and control.
EMBM
In this test we measure the performance drop caused by environment mapping
and EMBM (Environment Mapped Bump Mapping). We also estimate the fillrate
of single texturing. The test was set to 1280x1024 because this resolution is
optimal for extreme testing of modern cards:
Look at the red bars of the RADEON 9700 PRO. Well, the fillrate falls sharply
when EMBM is enabled. This drop puts the RADEON 9700 PRO (aka
R300) with its 256-bit memory interface on the same level as the previous
generation of chips (in the EMBM mode)! While in the pure texture mode
the card has a great advantage over its competitors, environment mapping
makes the advantage much smaller. The other cards do not lose performance this
way, as they can engage their second texture unit, which the R300 lacks.
In the EMBM mode 3 textures are combined. Besides, with EMBM
one texture is sampled according to the values obtained from another,
and such an approach hits the pixel pipelines of the R300 hard.
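The dependent read described above (one texture sampled according to values fetched from another) can be sketched in a few lines. This is only an illustration of the principle on tiny textures stored as Python lists: the point sampling, the wrap addressing and the layout of the 2x2 perturbation matrix are our assumptions, not a description of how any particular chip or API implements EMBM.

```python
import math


def sample(texture, u, v):
    """Point-sample a texture (list of rows) with wrap-around addressing."""
    height = len(texture)
    width = len(texture[0])
    x = math.floor(u * width) % width
    y = math.floor(v * height) % height
    return texture[y][x]


def embm_lookup(bump_map, env_map, u, v, matrix):
    """Fetch (du, dv) from the bump map, then use it to offset the env map read."""
    du, dv = sample(bump_map, u, v)            # first, independent fetch
    (m00, m01), (m10, m11) = matrix
    u2 = u + m00 * du + m10 * dv               # perturbed coordinates
    v2 = v + m01 * du + m11 * dv
    return sample(env_map, u2, v2)             # second fetch depends on the first


identity = ((1.0, 0.0), (0.0, 1.0))
env_map = [["a", "b"],
           ["c", "d"]]
flat_bump = [[(0.0, 0.0), (0.0, 0.0)],
             [(0.0, 0.0), (0.0, 0.0)]]
shifted_bump = [[(0.5, 0.0), (0.5, 0.0)],
                [(0.5, 0.0), (0.5, 0.0)]]

print(embm_lookup(flat_bump, env_map, 0.25, 0.25, identity))     # unperturbed
print(embm_lookup(shifted_bump, env_map, 0.25, 0.25, identity))  # shifted by du
```

The second fetch cannot start until the first one has returned its data, which is why this technique is sensitive to how the pixel pipeline schedules texture reads.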
Performance of Pixel Shaders 1.0
We again used a modified MFCPixelShader example, measuring the
performance of the cards in a high resolution while processing 5 shaders
of different complexity with bilinear-filtered textures:
Well, on the simplest shader the R300 is far ahead thanks to its 8 pipelines. But
as the shaders become more sophisticated, its speed falls faster than that
of its competitors, and at the last stage it is merely on a par with the NV25(!). The
situation is very close to that of the P10, and it will probably be typical of all
modern accelerators able to process large pixel shaders. However, if we compare the
R200 and the R300, the progress is more than twofold.
Such a performance drop is caused by two factors:
-
As there is only one texture unit per pipeline, operation gets slower with each
new texture rather than with each new pair of textures as in the other chips.
-
All the other chips have pixel stages and execute 1.X shaders two or
four times faster than one instruction per clock, while the R300 works through
shaders instruction by instruction, though on 8 pipelines simultaneously.
We mentioned this in the analytical article on the R300 and in the review
of the P10, and this time our analytical ideas are confirmed by the synthetic
tests.
Besides, the NV25 and the P512 are equipped with at least 4 stages per
pixel pipeline, while the R200 comes with two. On the other hand, 4 texture
units do not help the P512 a lot because of a low clock speed, and the
NV25 is well balanced from the standpoint of implementation of pixel shaders
1.0.
Remember that the R300 will be competing against the NV30 whose performance
is quite vague yet; we just know its approximate clock speed, that it has
two texture units per pipeline and works with shaders instruction by instruction
(like in the R300).
3D graphics, MS DirectX 9 SDK (beta 2) - synthetic tests
For other extreme characteristics of the chip we used prototypes of our
new synthetic DX9 tests created within the RightMark 3D project.
GPU Speed
This test measures an extreme throughput of an accelerator with triangles,
using different types and a different number of light sources and ways
of lighting. At present there are 7 lighting models:
-
Constant (ambient lighting)
-
Diffuse (1 point source)
-
Diffuse (2 point sources)
-
Diffuse (3 point sources)
-
Diffuse + Specular (1 point source)
-
Diffuse + Specular (2 point sources)
-
Diffuse + Specular (3 point sources)
And 4 operating modes:
-
Traditional TCL (Fixed-Function Pipeline)
-
Vertex shaders 1.1
-
Vertex shaders 1.1 and pixel shaders 1.1
-
Vertex shaders 2.0 and pixel shaders 2.0
Later the test will be extended with several typical tasks of animation
and geometry transformation.
The test minimizes dependence on all factors except the geometrical
performance and a triangle transfer speed (parameters delivery into shaders).
It renders a lot of small and detailed models whose triangles are very
small (comparable to a pixel) in order to eliminate dependence on HSR or
shading.
Here are results of the traditional TCL both in the hardware mode and
in the software vertex processing:
The R300 is again ahead in the simplest task: 106M transformed vertices
per second (!). In more complicated tasks it is on a par with the NV25
and P512, which is not a brilliant result for a chip of the new generation.
In the software transformation mode the NV25, which supports FastWrites, takes
the lead, while the R300 is no better than the R200 and P512.
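The peak figure above can be related to the core clock to see how many clocks the chip spends per vertex. This is a back-of-the-envelope sketch using the 106M vertices per second result and the 325 MHz core frequency; the function name is ours.

```python
def clocks_per_vertex(core_mhz, vertices_millions_per_second):
    """Average number of core clocks spent per transformed vertex."""
    return core_mhz / vertices_millions_per_second


r300 = clocks_per_vertex(325, 106)
print(f"R300: about {r300:.2f} clocks per vertex in the simplest task")
```

Roughly three clocks per vertex across the whole chip is the measured throughput, not a statement about how the vertex units are organized internally.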
But this was a test of the old TCL. The previous chips, even those without
a dedicated fixed TCL unit, supported its effective emulation. Probably
this wasn't a goal for the developers of the R300. Let's see how transformation
and lighting based on vertex shaders look:
Well, our suppositions concerning the emulation of the TCL have come true - the
R300 shines in execution of shaders. The new chip outscores the NV25 twofold,
and the R200 threefold. The results of the R300 coincide with the data
obtained in the TCL mode, which means that it lacks any specially improved
emulation. So, the R300 shows a gain befitting the new generation.
In the software mode the NV25 aces the others. The software emulation
is rather slow on more or less complicated shaders. But it's hard to imagine
a real application able to draw 30 or 40 million triangles, which is
why in real tasks the emulation remains acceptable.
And now let's see how the test depends on a resolution:
Resolution doesn't affect the test, but a complexity level of models has
a decent effect.
Point Sprites
This test measures the speed of rendering point sprites. The test always
uses semitransparent sprites, as most real effects based on particle
systems require transparency and blending. There are two modes available:
with and without sprite lighting from light sources. The size of the
rendered sprites is adjustable.
Without lighting and with small sprites (up to 4 pixels inclusive) the
R300 loses to the R200 (!), not to mention the NV25. But it takes a leading
position as the sprites grow up thanks to the 256-bit memory bus. The bigger
the sprite, the more frequent the frame buffer addressing is during blending.
When the lighting is enabled the general dependence looks the same,
but now the difference is not so striking, especially on small sprites.
It seems that the transformation and lighting are the limiting factors.
As the size becomes greater, the R300 goes further ahead, again thanks
to a considerable advantage in speed of operation with the frame buffer.
Besides, 9M particles is not a very great figure - effects based on particles
are actually limited by blending, not a geometrical performance of the
modern chips.
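The claim that bigger sprites load the frame buffer more can be illustrated with a rough traffic estimate. The sketch below assumes square sprites, a 32-bit frame buffer, and one read plus one write per blended pixel with no caching or compression; these assumptions and the sprite rate of one million per second are ours and serve only to show how the traffic scales.

```python
def blend_traffic_gb_s(sprites_per_second, sprite_size_px, bytes_per_pixel=4):
    """Estimated frame buffer traffic (GB/s) for alpha-blended square sprites."""
    pixels_per_sprite = sprite_size_px ** 2
    # Blending reads the destination pixel and writes the result back.
    bytes_per_sprite = pixels_per_sprite * bytes_per_pixel * 2
    return sprites_per_second * bytes_per_sprite / 1e9


# Hypothetical rate of one million sprites per second at several sizes.
for size in (1, 4, 16, 64):
    traffic = blend_traffic_gb_s(1_000_000, size)
    print(f"{size:>2} px sprites: {traffic:.3f} GB/s")
```

The traffic grows with the square of the sprite size, so at large sizes memory bandwidth rather than geometry throughput becomes the limit, which agrees with the behaviour described above.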
Texturing Rate
This test takes an integrated approach to estimating the texturing rate,
arbitrarily varying the number of textures used in a pass, their size, format
and the filtering method. With one texture we can measure the pixel fillrate;
using all textures and changing the filtering method we can measure the filtering
speed (i.e. the performance of the texture units). Besides, we can evaluate the
algorithm of MIP level determination:
And the quality of implementation of any filtering type:
The test displays several big polygons with a wide range of depth values.
It allows estimating visually chosen MIP levels, as well as realization
of anisotropic filtering. To test all possible angles of inclination and
plane turning angles the "tunnel" (or rather pyramid) rotates around the
Z axis, and its vertex circles in the plane parallel to the screen. That
is why triangles it consists of turn evenly and can be seen at different
angles.
First of all, let's see how the fillrate depends on the number of textures
(bilinear filtering):
While the old cards give expected results, the R300's performance obtained
is much lower than its potential maximum. Even the R200 outscores it! But
there is the following explanation:
-
8 texture units at 325 MHz are comparable in the extreme performance of
the bilinear filtering to 8 units at 300 MHz of the NV25 or to 8 units
working at 275 MHz of the R200.
-
The R300 has no combination stages. Our test gathers all textures together,
and evidently in the R300 the stage settings are emulated by the respective
pixel shader. On the new architecture of pixel pipelines of the R300 each
sampling takes one instruction, and each combination takes one more. And
the R300 turns out to be inferior to the old stage-based pipelines, like
in case of the fixed TCL!
To check this we modified the test (SPECIAL TEST on the diagram), making the textures
"wipe" each other out instead of being combined. The results of the other cards
remained the same - the same number of stages is used - while the R300's result
kept growing linearly until it exceeded its own theoretical limit.
How could that happen? The answer confirms our hypothesis. The stage settings
were translated by the drivers (or DX9) into a respective pixel shader.
In the course of compilation this shader was optimized, and all independent
texture samples that were not used further were excluded. As a result, only the
last texture was applied, and our test calculated the texture fillrate
incorrectly, "thinking" that all textures were sampled. Well, again we face
new peculiarities of the new generation of chips. Although it's more flexible,
it's also less efficient in traditional simple tasks. We should wait
for the new DDI9 drivers and the NV30 to draw final conclusions. For now
we can be satisfied at least with the fact that in real applications the
speed depends not only on the texture units' performance but also on the total
texture volume (which wasn't great here - one 256x256 texture
easily fits into the cache); therefore, the R300 with its wider memory
bus will get a decent advantage. This will be shown later, in the tests
in real applications. Now we turn to trilinear filtering which, according
to ATI, comes for free on the R300:
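The optimization that distorted the SPECIAL TEST result, where texture samples that are never used get removed from the generated shader, is ordinary dead code elimination. Below is a minimal sketch of the idea on a made-up instruction format of (destination, operation, sources) tuples; it does not reflect the real shader bytecode or the actual driver compiler.

```python
def eliminate_dead_code(program, output="out"):
    """Drop instructions whose results never reach the output register."""
    live = {output}              # registers whose current value is still needed
    kept = []
    for dst, op, srcs in reversed(program):
        if dst in live:
            kept.append((dst, op, srcs))
            live.discard(dst)    # this write defines dst, so earlier writes are dead
            live.update(srcs)    # unless dst is also read by this instruction
    kept.reverse()
    return kept


# Normal test: three textures are sampled and multiplied together.
combine_program = [
    ("out", "tex0", ()),
    ("r1", "tex1", ()),
    ("out", "mul", ("out", "r1")),
    ("r2", "tex2", ()),
    ("out", "mul", ("out", "r2")),
]

# Modified test: every texture simply overwrites the previous result.
overwrite_program = [
    ("out", "tex0", ()),
    ("out", "tex1", ()),
    ("out", "tex2", ()),
]

print(len(eliminate_dead_code(combine_program)))   # all samples are needed
print(eliminate_dead_code(overwrite_program))      # only the last sample survives
```

In the combining variant every sample feeds the final color, so nothing can be removed. In the overwriting variant only the last sample is kept, which would explain a measured "fillrate" above the theoretical limit.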
Well, the trilinear filtering is really almost free, though the same is true
of all the other cards. The layout remains the same. Now let's see what we have
in case of different filtering types:
In anisotropic filtering the R300 thrives - NVIDIA's solutions
of the previous generation were always inferior in this respect. But this
time the anisotropy is not as cheap for the R300 as it was for the R200,
especially when it's used together with trilinear filtering. The reason
is that ATI has finally corrected the problem in its implementation of
RIP mapping based anisotropy. Now it copes with any turning angle
of a textured surface around the Z axis (!). This brought some performance
drop (though not a severalfold one) relative to the R200. On the other hand,
the results of the NV25 and P512, which take the classical approach to
anisotropy, are still very low. The only exception is the lowest
degree, where both the NV25 and especially the P512 (thanks to 4 texture units
per pipeline) are able to compete against the RADEON family. ATI can boast
of its results - the problems are corrected and the speed drop is not heavy.
But we still don't know what the NV30 is coming with. Isn't an anisotropic
filtering speed merely comparable to that of the R200 a reckless step?
ATI was much limited by the .15 fab process and couldn't provide second
texture units or lift the core's frequency. At the same time the chip
is based on the new architecture of pixel pipelines, which is less efficient
than the old one on simple tasks - this is the cost of the more
flexible programming. Matrox foresaw such problems with the technological
process and didn't go for complete DX9 compatibility on the
.15 process. The engineers from ATI took the risk. Time will show whether
it was worth it.
And now let's see how the test depends on a texture size (we'll take
4 textures in a pass as all today's chips are able to use such quantity):
As you can see, the dependence is not great, and the texture of 256x256
can be considered a rational solution for most tests. Now let's look at
different resolutions (again with 4 textures):
Starting from 1280x1024, the dependence is almost invisible, which is required
from a good test. A dependence on a texture format won't be tested as these
drivers do not support compression formats and a 16-bit texture will be
rendered at the same rate (note that we use just one texture which goes
into the cache).
3D graphics, 3DMark2001 SE - synthetic tests
All measurements in all 3D tests were carried out in 32-bit color.
Fillrate
The theoretical maximum for this test makes 880 M pixels/sec for the Parhelia,
1100 for RADEON 8500, 1200 for Ti 4600 and 2600 for R300. The results obtained
are in perfect harmony with the theory, and thanks to 8 fill pipelines
(and to the 256bit bus) the R300 takes a double lead. But which modern
application uses just one texture? Let's estimate multitexturing:
The peak values for this test are 3520 (1760) M texels/sec for the Parhelia
(in the parentheses you can see a value for 4 pipelines with 2 texture
units on each), 2200 - for RADEON 8500, 2400 for Ti 4600 and 2600 for RADEON
9700 PRO. In multitexturing the degree of the chip's balance plays a greater
role. This time the peak scores are achieved by the Ti 4600 and RADEON
9700 PRO (only in high resolution, for which it's recommended ;) ). The
R300 is not a strong leader, even theoretically because of just one texture
unit and the core's frequency comparable to the others. The data obtained
correspond to the results of our new synthetic test of the DX9.
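The theoretical maximums quoted for the two fillrate tests follow from the core clock, the number of pixel pipelines and the number of texture units per pipeline. Here is a short sketch of that calculation. The clocks are those of the tested cards; the pipeline and texture unit counts for the GeForce4 Ti 4600 and RADEON 8500 are inferred from the peak figures in the text, and the function names are ours.

```python
def pixel_fillrate(core_mhz, pipelines):
    """Peak pixel fillrate in M pixels/sec."""
    return core_mhz * pipelines


def texel_fillrate(core_mhz, pipelines, tmus_per_pipeline):
    """Peak texel fillrate in M texels/sec."""
    return core_mhz * pipelines * tmus_per_pipeline


# name: (core clock in MHz, pixel pipelines, texture units per pipeline)
chips = {
    "Parhelia": (220, 4, 4),
    "RADEON 8500": (275, 4, 2),
    "GeForce4 Ti 4600": (300, 4, 2),
    "RADEON 9700 PRO": (325, 8, 1),
}

for name, (clock, pipes, tmus) in chips.items():
    print(f"{name}: {pixel_fillrate(clock, pipes)} M pixels/sec, "
          f"{texel_fillrate(clock, pipes, tmus)} M texels/sec")
```

The R300 doubles the pixel rate of its rivals but has almost the same texel rate as the Ti 4600, which is why its lead shrinks in multitexturing.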
Scene with a lot of polygons
Focus on the minimal resolution, where dependence on shading is almost absent:
With one light source the R300 is the absolute leader. The results coincide
with those we got in our own DX9 test, but 3DMark2001 doesn't get as close
to the physical limit of the chip as the GPU Speed test from the forthcoming
RightMark 3D does. The P512 is a clear outsider despite its 4 vertex
pipelines. Interestingly, in our GPU Speed test the P512 performs
much better: its results are 1.5 times higher and match those of the
R200. It seems that 3DMark2001 hits a sore spot of the chip that
slows down the vertex unit. Or the drivers and DX are to blame: our test
is written and compiled against the DX9 interface, while 3DMark2001 SE
was compiled against DX8.1.
With 8 light sources the general layout doesn't change, but the difference
between the cards gets smaller, again coinciding with the results obtained
earlier. Besides, the R200 and the P512 swap positions: the Parhelia scores
better, as its performance falls more slowly than that of the RADEON 8500
when the number of light sources increases. The R300 is the leader, and
the R200 becomes the outsider.
Bump mapping
Look at the results of a synthetic EMBM scene:
Unlike in our old test from the DX8.1 SDK, the R300 performs better here!
It seems that this test depends more on frame buffer write speed, which is
higher for the R300. And now the DP3 bump mapping:
The same situation.
Vertex shaders
The results of our synthetic GPU Speed test are confirmed again. The R300's
lead in 3DMark2001 is not as great, but it still sets the chip apart
from the other participants: the R300 is the undoubted leader in vertex
shaders, geometry and transformation. At least, until it concerns
the fixed TCL.
Pixel shader
Given that very low resolutions are limited by geometry and very high
ones by memory throughput, let's take a look at 1024x768 and 1280x1024:
The R300 is again ahead. But its performance falls faster as the resolution
grows, because of instruction-per-clock implementation of vertex shaders.
Let's see what happens in the more complex Advanced Pixel Shader test.
The R300's lead grows because the chip's pixel pipelines are optimized
for more flexible and longer pixel shaders. Clearly, it's too early to
draw any conclusions until complete support for DX9 and pixel shaders 2.0
is provided in the drivers.
Sprites
The R300 is ahead, but the gap is smaller. The Parhelia suffers in this
test as it has no dedicated hardware acceleration for point sprites.
Without blending, sprite performance isn't of great value.
Well, this test demonstrates once again the advantages of the 256-bit bus
and 8 fill pipelines.
So, let's sum up. In the synthetic tests the R300 card gives a mixed picture.
In geometry processing it takes a firm lead, and in the shading tests
everything depends on the task. At the very least we can complain about
just one texture unit per pipeline. Compared with the other cards, the
R300 certainly leaves the impression of a strong leader developed for the
complex tasks of future applications. On the other hand, we must wait for
proper DX9 drivers to make final conclusions on the synthetic tests. Besides,
we were not able to test the new texture and frame buffer formats or
the second version of the pixel and vertex shaders. Finally, the NV30,
the main competitor of the R300, hasn't been tested here yet.
Before we turn to the games, I must note that the tests were carried
out not only at the normal and overclocked frequencies but also at
frequencies reduced to 300/600 MHz, to estimate the performance of the
slowest cards of this series (see the frequency range of the cards from
ATI's partners).
3D graphics, 3DMark2001 - games tests
3DMark2001, 3DMARKS
The RADEON 9700 Pro outpaces the GeForce4 Ti 4600 by 17 to 39% in these
benchmarks. That's not bad, taking into account the strong limiting effect
of the central processor.
3DMark2001, Game1 Low details
Test characteristics:
-
Rendered triangles per frame (min/avg/max): 19773/33753/143422
-
Rendered textures per frame with 16 bit textures (min/avg/max): 7.5/8.8/16.5
MB
-
Rendered textures per frame with 32 bit textures (min/avg/max): 15.1/17.7/30.3
MB
-
Rendered textures per frame with texture compression (min/avg/max): 10.7/12.2/21.0
MB
In 1600x1200x32 the RADEON 9700 outscores the Ti 4600 by 26%, the Parhelia
by 123%, and the RADEON 8500 by 57.2%.
With AA enabled it outperforms the Ti 4600 (AA4x) by 121%, the Parhelia
(FAA16x) by 94%, and the RADEON 8500 (AA4xP) by 265%.
With anisotropic filtering (quality mode) the gain over the Ti 4600
(ANIS 8) is 60%, and over the RADEON 8500 (ANIS 16) it is 19%.
With both AA and anisotropy switched on, the new Canadian product gains
(in 1280x1024) 75% over the Ti 4600, 44% over the Parhelia,
and 163% over the RADEON 8500.
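All the percentages in the game tests are relative gains of the RADEON 9700
over a reference card. A minimal sketch of that calculation is shown below;
the frame rates in it are hypothetical and are not measurements from this
review.

```python
def gain_percent(fps_new, fps_ref):
    """Relative advantage of fps_new over fps_ref, in percent.

    A negative result means the new card falls behind the reference.
    """
    if fps_ref <= 0:
        raise ValueError("reference frame rate must be positive")
    return (fps_new / fps_ref - 1.0) * 100.0


# Hypothetical frame rates, for illustration only.
print(f"{gain_percent(126.0, 100.0):.1f}%")  # new card is faster
print(f"{gain_percent(82.2, 100.0):.1f}%")   # new card is slower
```

Note that a gain of 100% means twice the frame rate, so figures above 200%
correspond to a more than threefold difference.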
3DMark2001, Game2 Low details
Test characteristics:
-
Rendered triangles per frame (min/avg/max): 46159/51440/147828
-
Rendered textures per frame with 16 bit textures (min/avg/max): 8.0/8.8/10.1
MB
-
Rendered textures per frame with 32 bit textures (min/avg/max): 15.6/17.2/19.8
MB
-
Rendered textures per frame with texture compression (min/avg/max): 9.3/10.9/13.5
MB
In 1600x1200x32 the RADEON 9700 leads the Ti 4600 by 38%, the Parhelia
by 136%, and the RADEON 8500 by 79%.
With AA the gain over the Ti 4600 (AA4x) is 171%, over the Parhelia
(FAA16x) it is 104%, and over the RADEON 8500 (AA4xP) it is 311%.
With anisotropic filtering (quality mode) the gain over the Ti 4600
(ANIS 8) is 195%, and over the RADEON 8500 (ANIS 16) it is 69%.
With both AA and anisotropy switched on, the new Canadian product gains
(in 1280x1024) 181% over the Ti 4600, 105% over the Parhelia,
and 283% over the RADEON 8500.
3DMark2001, Game3 Low details
Test characteristics:
-
Rendered triangles per frame (min/avg/max): 16681/21746/39890
-
Rendered textures per frame with 16 bit textures (min/avg/max): 2.8/4.1/4.7
MB
-
Rendered textures per frame with 32 bit textures (min/avg/max): 5.7/8.2/9.4
MB
-
Rendered textures per frame with texture compression (min/avg/max): 5.0/7.2/8.4
MB
In 1600x1200x32 the RADEON 9700 leads the Ti 4600 by 29%, the Parhelia
by 114%, and the RADEON 8500 by 61.5%.
With AA the gain over the Ti 4600 (AA4x) is 188%, over the Parhelia
(FAA16x) it is 76%, and over the RADEON 8500 (AA4xP) it is 338%.
With anisotropic filtering (quality mode) the gain over the Ti 4600
(ANIS 8) is 153%, and over the RADEON 8500 (ANIS 16) it is 74%.
With both AA and anisotropy switched on, the new Canadian product gains
(in 1280x1024) 192% over the Ti 4600, 65% over the Parhelia,
and 297% over the RADEON 8500.
3DMark2001, Game4
Test characteristics:
-
Rendered triangles per frame (min/avg/max): 55601/81714/180938
-
Rendered textures per frame with 16 bit textures (min/avg/max): 14.9/17.4/20.7
MB
-
Rendered textures per frame with 32 bit textures (min/avg/max): 28.4/33.5/40.0
MB
-
Rendered textures per frame with texture compression (min/avg/max): 28.4/33.5/40.0
MB
In 1600x1200x32 the RADEON 9700 leads the Ti 4600 by 101%, the Parhelia
by 207%, and the RADEON 8500 by 113%.
With AA the gain over the Ti 4600 (AA4x) is 146%, over the Parhelia
(FAA16x) it is 133%, and over the RADEON 8500 (AA4xP) it is 276%.
With anisotropic filtering (quality mode) the gain over the Ti 4600
(ANIS 8) is 90%, and over the RADEON 8500 (ANIS 16) it is 51%.
With both AA and anisotropy switched on, the new Canadian product gains
(in 1280x1024) 116% over the Ti 4600, 155% over the Parhelia,
and 187% over the RADEON 8500.
In the 3DMark2001 SE benchmarks the new ATI product gains the most
in the tough AA and anisotropy modes. This is to be expected, because
at low load the performance is limited by the CPU (and by the platform,
despite its 2.2 GHz). The 256-bit memory bus has a positive effect
on the overall speed with AA. As for anisotropy, performance doesn't
fall drastically in the quality mode. We will discuss this in depth
later.
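The reasoning about the CPU limit can be sketched with a very simple model
in which the delivered frame rate is the lower of what the platform and the
video card can each sustain. All the numbers below are hypothetical and only
illustrate why the gap between cards opens up under heavy load; they are not
measurements from this review.

```python
def frame_rate(cpu_limit_fps, gpu_limit_fps):
    """The slower of the two stages sets the delivered frame rate."""
    return min(cpu_limit_fps, gpu_limit_fps)


def gain_percent(fps_new, fps_ref):
    """Relative advantage of fps_new over fps_ref, in percent."""
    return (fps_new / fps_ref - 1.0) * 100.0


CPU_LIMIT = 150.0  # hypothetical platform ceiling, fps

# Hypothetical GPU-side limits (fps) for a faster and a slower card.
light_load = {"fast": 300.0, "slow": 200.0}  # no AA, no anisotropy
heavy_load = {"fast": 90.0, "slow": 40.0}    # AA plus anisotropy

for name, load in (("light", light_load), ("heavy", heavy_load)):
    fast = frame_rate(CPU_LIMIT, load["fast"])
    slow = frame_rate(CPU_LIMIT, load["slow"])
    print(f"{name} load: {fast:.0f} vs {slow:.0f} fps, gain {gain_percent(fast, slow):.0f}%")
```

Under the light load both cards hit the same platform ceiling and the gain
is zero; under the heavy load the cards themselves become the limit and the
full difference between them shows.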
3D graphics, game tests
For the performance estimation we used:
-
Return to Castle Wolfenstein (MultiPlayer) (id Software/Activision) - OpenGL,
multitexturing, Checkpoint-demo,
test settings - maximum, S3TC OFF, the configurations can be downloaded
from here
-
Serious Sam: The Second Encounter v.1.05 (Croteam/GodGames) - OpenGL, multitexturing,
Grand Cathedral demo, test settings: quality, S3TC OFF
-
Quake3 Arena v.1.17 (id Software/Activision) - OpenGL, multitexturing,
Quaver,
test settings - maximum: detail level - High, texture detail
level - #4, S3TC OFF, curve smoothness
is greatly increased through the variables r_subdivisions "1"
and r_lodCurveError "30000" (the default r_lodCurveError is
250!), the configurations can be downloaded from
here
-
Comanche4 Benchmark Demo (NovaLogic) - Direct3D, Shaders, Hardware T&L,
Dot3, cube texturing, highest quality
-
Unreal Tournament 2003 Demo v.927 (Digital Extremes/Epic Games) - Direct3D,
Vertex Shaders, Hardware T&L, Dot3, cube texturing, default quality
-
Code Creatures Benchmark Pro (CodeCult) is a game that demonstrates operation
of cards in the DirectX 8.1, Shaders, HW T&L.
-
AquaMark (Massive Development) is a game that demonstrates operation of
cards in the DirectX 8.1, Shaders, HW T&L.
-
RightMark Video Analyzer v.0.4 (Philip
Gerasimov) - DirectX 8.1, Dot3, cube texturing, shadow buffers, vertex
and pixel shaders (1.1, 1.4).
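For readers who want to reproduce the Quake3 curve settings listed above,
the two variables could be written into a config file as "seta" lines. The
sketch below only renders those two lines; the helper name is hypothetical,
and this is not the configuration file offered for download, which contains
more settings.

```python
# Console variables and values taken from the test settings above.
CVARS = {
    "r_subdivisions": "1",
    "r_lodCurveError": "30000",
}


def render_cfg(cvars):
    """Render cvars as Quake3 console 'seta' lines (hypothetical helper)."""
    return "".join(f'seta {name} "{value}"\n' for name, value in cvars.items())


print(render_cfg(CVARS), end="")
```

The resulting lines can be pasted into the console or saved to a .cfg file
and run with the exec command.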
Quake3 Arena, Quaver
In 1600x1200x32 the RADEON 9700 leads the Ti 4600 by 38%, the Parhelia
by 195%, and the RADEON 8500 by 66%.
With AA the gain over the Ti 4600 (AA4x) is 113%, over the Parhelia
(FAA16x) it is 137%, and over the RADEON 8500 (AA4xP) it is 327%.
With anisotropic filtering (quality mode) the gain over the Ti 4600
(ANIS 8) is 94%, and over the RADEON 8500 (ANIS 16) it is 49%.
With both AA and anisotropy switched on, the new Canadian product gains
(in 1280x1024) 146% over the Ti 4600, 169% over the Parhelia,
and 429% over the RADEON 8500.
Without AA and anisotropy the speed is limited by the platform and CPU,
but when these functions are enabled, the RADEON 9700 goes far ahead.
Admittedly, the OpenGL driver is far from perfect, though that is a trifle
compared with what we will see in the next test.
Serious Sam: The Second Encounter, Grand Cathedral
Here are screenshots of the settings:
Here are the results:
In 1600x1200x32 the RADEON 9700 falls 17.8% behind the Ti 4600, while
leading the Parhelia by 47% and the RADEON 8500 by 27%.
With AA the gain over the Ti 4600 (AA4x) is 26%, while against the Parhelia
(FAA16x) the card loses by 4%. We don't take the scores of the RADEON 8500
(AA4xP) into account because it has a clear bug in this test with AA. Such
a small gap is impossible; we ran the test several times, the results
were the same, and we discarded them.
With anisotropic filtering (quality mode) the gain over the Ti 4600
(ANIS 8) is 0%, and over the RADEON 8500 (ANIS 16) it is 0%.
With both AA and anisotropy switched on, the new Canadian product shows
(in 1280x1024) 0% over the Ti 4600 and -23% against the Parhelia;
there are no results for the RADEON 8500.
Well, either ATI's driver works terribly in this game,
or something else prevents the RADEON 9700 from showing its
might. I pin my hopes on new drivers, or at least on an explanation
from ATI.
Return to Castle Wolfenstein (Multiplayer), Checkpoint
In 1600x1200x32 the RADEON 9700 leads the Ti 4600 by 24%, the Parhelia
by 397%, and the RADEON 8500 by 22%.
With AA the gain over the Ti 4600 (AA4x) is 131%, over the Parhelia
(FAA16x) it is 378%, and over the RADEON 8500 (AA4xP) it is 262%.
With anisotropic filtering (quality mode) the gain over the Ti 4600
(ANIS 8) is 116%, and over the RADEON 8500 (ANIS 16) it is 26%.
With both AA and anisotropy switched on, the new Canadian product gains
(in 1280x1024) 177% over the Ti 4600, 400% over the Parhelia,
and 315% over the RADEON 8500.
In this test (which depends more on the platform than Quake3 does)
the new solution shows brilliant scores under heavy load. The very
low speed of the Parhelia is probably due to the same kind of factors
that held back the RADEON 9700 in the previous test: software bugs or
incompatibility (a patch is probably needed).
Code Creatures
This test is based on CodeCult's engine, which is used in several games.
The engine uses almost all the modern capabilities of the latest
generation of video cards, and the demo based on it contains scenes that
are quite tough in terms of texture size, geometry and effects.
In 1600x1200x32 the RADEON 9700 leads the Ti 4600 by 31%, the Parhelia
by 110%, and the RADEON 8500 by 320%.
As you can see, the test is quite tough even for the super-accelerator,
which is why it makes no sense to run the card with AA and/or anisotropy.
Comanche4 DEMO
In 1600x1200x32 the RADEON 9700 is level with the Ti 4600 (0%) and leads
the Parhelia by 88% and the RADEON 8500 by 34%.
With AA the gain over the Ti 4600 (AA4x) is 58%, over the Parhelia (FAA16x)
it is 113%, and over the RADEON 8500 (AA4xP) it is 149%.
With anisotropic filtering (quality mode) the gain over the Ti 4600
(ANIS 8) is 35%, and over the RADEON 8500 (ANIS 16) it is 23%.
With both AA and anisotropy switched on, the new Canadian product gains
(in 1280x1024) 37% over the Ti 4600, 100% over the Parhelia,
and 100% over the RADEON 8500.
The processor strongly affects this test, and the load on the cards is
very high, which is why overall performance is low at the maximum quality
level. So even a 50% gain is a great achievement.
Unreal Tournament 2003 DEMO b.927
In 1600x1200x32 the RADEON 9700 is level with the Ti 4600 (0%) and leads
the Parhelia by 54% and the RADEON 8500 by 230%.
With AA the gain over the Ti 4600 (AA4x) is not counted, as AA4x
doesn't work here for some reason; over the Parhelia (FAA16x) it is 25%,
and over the RADEON 8500 (AA4xP) it is 364%.
With anisotropic filtering (quality mode) the gain over the Ti 4600
(ANIS 8) is 12%, and over the RADEON 8500 (ANIS 16) it is 76%.
With both AA and anisotropy switched on, the new Canadian product shows
(in 1280x1024) -7% against the Parhelia and 165% over the RADEON 8500;
there are no results for the Ti 4600.
This test puzzles many, but we decided to keep it here until the final
DEMO version is released. We already wrote in our 3Digest
that the performance of ATI's cards in it was inexplicable. We are
inclined to think that ATI's drivers are to blame here.
AquaMark
In 1600x1200x32 the RADEON 9700 leads the Ti 4600 by 4%, the Parhelia
by 75%, and the RADEON 8500 by 76%.
With AA the gain over the Ti 4600 (AA4x) is 101%, over the Parhelia
(FAA16x) it is 66%, and over the RADEON 8500 (AA4xP) it is 169%.
With anisotropic filtering (quality mode) the gain over the Ti 4600
(ANIS 8) is 278%, and over the RADEON 8500 (ANIS 16) it is 38%.
With both AA and anisotropy switched on, the new Canadian product gains
(in 1280x1024) 157% over the Ti 4600, 40% over the Parhelia,
and 142% over the RADEON 8500.
This test was released in 2001, when GeForce3 based cards had just appeared,
and at that time it was very demanding. Even the current leader doesn't
impress with its speed here, though the performance is tenfold better. Note
that the speed of the GeForce4 falls dramatically when anisotropy
is enabled (this filtering type processes all textures without distinction,
and this test uses a lot of semitransparent textures).
RightMark 3D
In 1600x1200x32 the RADEON 9700 leads the Ti 4600 by 48%, the Parhelia
by 86%, and the RADEON 8500 by 211%.
This test is also very demanding; it lets us estimate not only whether
modern accelerators work with the DX8.1 functions enabled but also how
fast they work with them.
3D quality
ANISOTROPIC FILTERING
In our 3Digest
you can find information about this function: how it works and what it's
for.
We know that different VPU makers implement this function in their own
ways, and the speed characteristics of anisotropy from, say, ATI and NVIDIA
are very different. Only the output quality is similar.
Until recently this held true (at least for ATI's implementation of
this function). Now it's different: the Canadian developers give you
a choice of either pure anisotropy or anisotropy together with trilinear
filtering.
What do the "Performance/Quality" switches in the anisotropic filtering
section mean? Let's take a look at
the Fillrate test from the forthcoming RightMark 3D. We will enable trilinear
filtering (the usual mode is on the left, and the one with MIP levels
shaded is on the right):
It works perfectly. Now we enable anisotropy in the Performance mode:
The trilinear filtering disappears, though the anisotropy is of excellent
quality. Let's compare it with the same filtering on the RADEON 8500; we
purposely placed the walls of the tunnel at 45 degrees :-). The screenshot
on the left shows the tunnel in exactly this position, and on the right it
is turned by 40 degrees.
RADEON 8500
GeForce4
RADEON 9700
Have you noticed the difference? :-) And now look at the screenshots of
the RADEON 9700 above to make sure that the problem of angles close to
45 degrees is solved.
Now let's return to the difference between the "Performance" and "Quality"
modes. We saw above that although the anisotropy quality has improved and
no longer depends on angles, in the Performance mode it still cannot
coexist with trilinear filtering. Now the Quality mode:
Both filtering types work! This applies to all Direct3D applications. As
it turned out, when anisotropy is forced this way, the filtering
modes in games obey the drivers irrespective of the in-game settings,
except in 3DMark2001, where trilinear filtering seems to be switched off
completely.
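The reason shaded MIP levels make the presence or absence of trilinear
filtering so easy to see can be shown with a small sketch. Without trilinear
filtering each pixel samples a single MIP level, so the shaded levels form
sharp bands; with it, the two neighbouring levels are blended and the bands
fade into each other. The functions below are a simplified model under that
assumption, not a description of how any of the tested chips computes the
level of detail.

```python
import math


def mip_lod(texels_per_pixel, max_level=8):
    """LOD as log2 of the texel footprint of one pixel, clamped to the MIP chain."""
    lod = math.log2(max(texels_per_pixel, 1.0))
    return min(lod, float(max_level))


def single_level(lod):
    """No trilinear filtering: only the nearest MIP level is sampled."""
    return math.floor(lod + 0.5)


def trilinear_weights(lod, max_level=8):
    """Trilinear filtering: blend the two MIP levels around the LOD."""
    lower = math.floor(lod)
    upper = min(lower + 1, max_level)
    frac = lod - lower
    return (lower, 1.0 - frac), (upper, frac)


for footprint in (2.0, 2.4, 2.8, 3.4, 4.0):
    lod = mip_lod(footprint)
    print(f"footprint {footprint}: single level {single_level(lod)}, "
          f"blend {trilinear_weights(lod)}")
```

As the footprint grows, the single-level result jumps from one level to the
next, while the blend weights change gradually.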
It's OK in OpenGL as well. Let's test it in Quake3. Performance
mode:
The anisotropy works, the trilinear filtering doesn't (irrespective of
whether it's enabled in the game). Now the Quality mode:
Well, it's a really pleasant outcome for ATI's fans. The test results given
earlier show that at ANIS 16, with almost complete anisotropy,
the performance drop is not as great as on the GeForce4 Ti.
But for now this works only in a very expensive High-End card. It will
probably be left untouched in the RADEON 9500, but we can say nothing yet
about either the performance or the price of that card (it could be
$180-200 or $250-290).
To close the anisotropy section, look at some screenshots:
Screenshots: RADEON 8500 vs. RADEON 9700 at ANIS 0 and ANIS 16 in
3DMark2001 Game 1, 3DMark2001 Game 2 and Serious Sam: TSE.
ANTI-ALIASING (AA)
At the very beginning we studied some aspects of AA operation on the
RADEON 9700; the performance was estimated in the previous section, and
here we are going to look at quality.
Screenshots: Example 1 and Example 2 with No AA, AA 4x and AA 6x in
3DMark2001 Game 1, 3DMark2001 Game 3, 3DMark2001 Game 4 and
Serious Sam: TSE.
-
There is almost no difference between the 4x and 6x modes. So why pay more?
-
As expected, MSAA doesn't work where transparent textures are used (look
at the leaves in Game4).
-
On the whole, the AA quality is very good and matches that of the
GeForce4 Ti. In Direct3D with AA the LOD BIAS shifts to
negative, which is why the images get sharper.
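The effect of a negative LOD BIAS mentioned in the last point can be
illustrated as follows: the bias is added to the computed level of detail,
so a negative value makes the chip pick a more detailed MIP level than it
otherwise would. This is a simplified sketch; the bias of -1.0 is a
hypothetical value, as the amount actually applied by the driver is not
known to us.

```python
import math


def selected_mip(texels_per_pixel, lod_bias=0.0, levels=9):
    """Nearest MIP level for a texel footprint; level 0 is the most detailed."""
    lod = math.log2(max(texels_per_pixel, 1.0)) + lod_bias
    level = math.floor(lod + 0.5)
    return min(max(level, 0), levels - 1)


# Hypothetical bias of -1.0; the driver's real value is not stated.
print("no bias:      level", selected_mip(8.0))
print("bias of -1.0: level", selected_mip(8.0, -1.0))
```

A more detailed level gives a sharper picture, at the risk of some texture
shimmering in motion.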
Overall 3D quality
We have run just a few games, but some artifacts have already shown up.
We found them primarily in some games released in 2000-2001 (RealMYST,
Sacrifice). The latest games do not have them. The details will be
available in our 3Digest; the screenshot gallery will get more pictures
from the RADEON 9700 (Morrowind has no artifacts :-) ).
Conclusion
Our cinema hall still has one free seat reserved for the NV30. Let's wait
for this last spectator before drawing the overall conclusion on the
latest generation of accelerators. For the moment we have the following
to say.
-
In general, ATI managed to make a good card in spite of such a sophisticated
chip and a high operating temperature (without additional cooling the chip's
temperature can reach 85 degrees (!), and the PCB heats up to 68-70). But
this doesn't affect the stability of the card (we ran 3DMark2001
for 6 hours without a break).
-
The RADEON 9700 is absolutely the king today. But the cards will start
shipping only in September; besides, NVIDIA will soon release its NV30,
which might become the leader this autumn. It will probably not kill the
RADEON 9700, but it may push ATI off the throne or force it to
cut prices.
-
As far as operation of the card is concerned, we are currently unable to
evaluate DirectX 9 functions because DX9 has not been released yet, nor
even a beta version of the DX9 drivers. That is why we will continue the
RADEON 9700 review quite soon; that will be the third part. In the second
part we will deal with video functions, TV-out and AGP8x (today we took
measurements at AGP4x).
-
It should be noted that when the ATI RADEON 9700 Pro is installed in
the Soltek 75DRV5 (VIA KT333) mainboard, only the AGP 1x and 2x (!) modes
are available for some reason (maybe because of the BIOS). The 4x mode
doesn't work, though the board can enable it with other cards. As the 2x
mode is evidently not supported by the video card (!), any 3D application
hangs.
-
The software still has a long way to go; in particular, in the drivers'
panel some settings must be switched 2-3 times before they take effect.
Well, ATI's programmers have something to work on (although version 6.143
is certified).
-
Although the 3D speed of the card is quite low in the synthetic tests,
the performance is good enough. The 8x1 configuration will definitely tell
in future fillrate-hungry games. But I can already see that such a card is
not worth using without AA and/or anisotropy! So if you are ready
to pay $400 for this super accelerator, you should start learning the
basics of 3D graphics to adjust the functions affecting 3D quality correctly.
-
We have already mentioned the downsides (low fillrate because of 8x1, high
prices, no availability on the market, artifacts in some games); the
advantages, apart from speed, include the improved anisotropic filtering
and modern AA features (high quality at a moderate speed cost).
I hope these cards will reach the shelves soon and the prices will go
down swiftly (this will make the prices of GeForce4 Ti 4600 cards fall
as well). And we congratulate ATI's fans on the new 3D king in
the gaming sector (maybe just for a short time, but ATI managed to
outperform the leader).
Once again I must say that the current performance of the card may change
dramatically after the release of DirectX 9 and the respective drivers,
so the subject is still open. Besides, we will soon publish the
second part, covering the operation of the RADEON 9700 with DVD,
TV-out and some tests on the KT400 (AGP8x).