Today we start climbing the RADEON 8500
mountain. The expedition promises to be tough and risky,
so while we are still at the foot we should study the theory
involved. If you are a beginner, you had better first take a look
at our previous reviews.
All participants should also study the scheme outlined in
the first R200
preview.
Technical data
Here is a comparison table of the chips' general characteristics
and of the capabilities exposed by the current DirectX
8.1 drivers:
Who is who |
Card | GeForce3 Ti 500 | RADEON 8500 |
Chip | NV20 | R200 |
Chip revision | A05 | A13 |
Basic parameters |
Pipelines | 4 | 4 |
Texture units | 2 | 2 |
Textures per pass | 4 | 6 |
Core frequency, MHz | 240 | 275 |
Fillrate (million pixels) | 960 | 1100 |
Fillrate (million texels) | 1920 | 2200 |
RAMDAC, MHz | 350 | 400 (+ external 240) |
Memory |
Memory frequency, MHz | 250 | 275 |
Memory bus, bit | 128 (DDR) | 128 (DDR) |
Technology, micron | 0.15 | 0.15 |
Memory size, MB | 64 | 64 |
Memory speed, ns | 3.5 | 3.6 |
API |
OpenGL version | 1.3 | 1.2 (1.3?) |
DirectX version | 8.1 | 8.1 |
GDI+ | Yes | Yes |
Pixel pipeline |
Pixel shaders | 1.0, 1.1 | 1.0, 1.1, 1.4 |
Maximum color value in pixel shader registers | 1.0 | 8.0 |
Texture sampling stages | 4 | 8 |
Texture combination stages | 8 | 8 |
Vertex pipeline |
Vertex shaders | 1.0, 1.1 | 1.0, 1.1 |
Vertex streams | 16 | 8 |
Vertex shader constants | 96 | 192 |
Other |
Texture size (max.) | 2048X2048 (4096X4096?) | 2048X2048 |
Matrices for blending (max.) | 4 | 4 |
Indexed blending | No | up to 57 matrices |
Sprite scaling up to | 64 | 64 |
Light sources | 8 | 8 |
Clipping planes | 0 (6?) | 6 |
Pure Device | Yes | Yes |
N-Patches | No | Yes |
RT-Patches | No (!) | No |
Multisampling | 2, 3, 4 | No |
3D textures | Yes | Yes (without MIPMAP) |
Environment maps | Yes | Yes (without MIPMAP) |
Anisotropic filtering | Yes (without MIPMAP) | Yes (without MIPMAP) |
Anisotropy degree up to | 8 | 16 |
Fog | FOGVERTEX FOGRANGE FOGTABLE | FOGVERTEX FOGRANGE |
Frame buffer |
Rendering buffer formats | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 | A8R8G8B8 X8R8G8B8 R5G6B5 A1R5G5B5 A4R4G4B4 R3G3B2 |
Z-buffer formats | D32 D24S8 D16 D24X8 | D32 D24S8 D16 D24X8 |
Texture formats |
2D texture formats | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 DXT1 DXT2 DXT3 DXT4 DXT5 V8U8 L6V5U5 X8L8V8U8 Q8W8V8U8 P8 D32 D24S8 D16 D24X8 | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 DXT1 DXT2 DXT3 DXT4 DXT5 V8U8 L6V5U5 X8L8V8U8 Q8W8V8U8 L8 R3G3B2 A8L8 V16U16 W11V11U10 |
3D texture formats | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 P8 | A8R8G8B8 X8R8G8B8 R5G6B5 X1R5G5B5 A1R5G5B5 A4R4G4B4 R3G3B2 L8 A8L8 DXT1 DXT2 DXT3 DXT4 DXT5 |
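The fillrate figures in the table follow directly from the pipeline count, the number of texture units and the core clock. The sketch below reproduces that arithmetic. The function names are ours, and the peak memory bandwidth is a figure we derive from the bus width and memory clock (it is not listed in the table), counting 1 GB as 1000 MB.

```python
def pixel_fillrate(pipelines, core_mhz):
    """Peak pixel fillrate in million pixels per second."""
    return pipelines * core_mhz


def texel_fillrate(pipelines, texture_units, core_mhz):
    """Peak texel fillrate in million texels per second."""
    return pipelines * texture_units * core_mhz


def memory_bandwidth_gb_s(bus_bits, memory_mhz, ddr=True):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1000 MB)."""
    transfers_per_clock = 2 if ddr else 1  # DDR moves data on both clock edges
    bytes_per_transfer = bus_bits // 8
    return bytes_per_transfer * memory_mhz * transfers_per_clock / 1000.0


# Input values are taken from the comparison table.
CHIPS = {
    "GeForce3 Ti 500": {"pipelines": 4, "tmus": 2, "core": 240, "mem": 250, "bus": 128},
    "RADEON 8500": {"pipelines": 4, "tmus": 2, "core": 275, "mem": 275, "bus": 128},
}

for name, c in CHIPS.items():
    print(
        name,
        pixel_fillrate(c["pipelines"], c["core"]),
        texel_fillrate(c["pipelines"], c["tmus"], c["core"]),
        memory_bandwidth_gb_s(c["bus"], c["mem"]),
    )
```

These are theoretical peaks only; real throughput depends on the memory controller and caching discussed in the text.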
Let me draw your attention to the following points:
- The RADEON 8500 core and memory run at higher clock speeds;
- The primary RAMDAC of the R200 has a higher maximum frequency
(400 MHz), and there is a secondary, external RAMDAC working at
240 MHz, which makes it possible to send a VGA signal to two displays;
- The internal architecture of the chip is similar to that of the
NV20: 4 pixel pipelines with two texture units on each. But
this time their results can be accumulated twice, so we can combine
up to 6 textures in a pass. Of course, this costs at least two penalty
cycles for 6 textures and one for 4 (as on the
NV20);
- Pixel shaders 1.4 are supported (see the R200
preview for details), along with a more flexible texture sampling
mechanism. Since the shaders are translated into settings of the sampling
and combination pipelines, the texture sampling pipeline has been
extended to 8 stages and given wider capabilities;
- The combination pipelines of both chips have up to 8 stages and
support all operations declared in DirectX 8.1;
- The number of vertex shader constants has grown to
192 against 96 on the NV20. This allows more complex
blending and vertex processing algorithms;
- Despite a slightly slower access time (3.6 ns against 3.5 ns on the
GeForce3 Ti 500), the RADEON 8500 memory works successfully at
a higher frequency, even without any heatsinks. The R200 and NV20
take different approaches to memory access. While the NV20
prefers smaller blocks and an efficient crossbar controller, the
R200 uses larger blocks and intensive caching. Later we will
see which approach is more viable in modern tests, but for now it should
be noted that the R200 is gentler on its memory, and this may increase
its overclocking potential;
- The R200 outdoes the NV20 in the set of implemented DirectX 8.1
features, but the NV20 has an excellent OpenGL 1.3 driver. The
current OpenGL driver of the R200 corresponds to v1.2 (ATI says
it already has an alpha version of an OpenGL 1.3 driver for XP) and is
not as efficient as NVIDIA's;
- The R200 can set 6 arbitrary clipping planes, while the NV20
lacks this capability (in current drivers);
- The R200 supports N-Patches in hardware, the NV20
does not;
- The current NV20 drivers no longer support hardware tessellation
of smooth surfaces (RT-Patches);
- The R200 can perform indexed matrix blending using a palette
of 57 matrices (4 can be enabled at a time). But this feature doesn't
seem so significant now that vertex shaders are available. Shaders
can implement any blending scheme with any number of
matrices. But is blending with shaders as efficient as the hardware
one? Later we will try to answer this question;
- The R200 doesn't support Multisampling (!);
- The R200 doesn't support mip-mapping (and, therefore, trilinear
filtering) for environment and 3D textures;
- Neither accelerator allows simultaneous mip-mapping
and anisotropic filtering in DirectX, i.e. it is impossible
to enable trilinear and anisotropic filtering together. On the other
hand, the NV20 supports this mode in OpenGL, while DirectX
applications as a rule do not use anisotropic filtering;
- The maximum anisotropy degree of the R200 is twice as high;
- Pixel shaders of the R200 can operate on values exceeding 1.0
(i.e. 255), namely from 0 to 8.0. This is the OverBright approach.
Complex calculations get an additional reserve of range,
i.e. you can accumulate intermediate values, for
example, for a more accurate rendering of bright lighting;
- Texture formats are almost the same. The R200 has
several exotic formats for passing additional data to shaders (normal
and bump maps) with increased component precision
(11 and 16 bits: W11V11U10, V16U16), while the NV20 can use textures
in Z-buffer formats (D32, D24S8, D16, D24X8), which are necessary
for implementing algorithms of the Shadow Buffer class;
- All texture compression formats are supported, but while the
R200 can also compress 3D textures into the same formats,
the NV20 cannot! Taking into account the considerable size
of 3D textures, we consider this a serious drawback of the drivers
or of the chip. In OpenGL, NVIDIA successfully uses its own
3D texture compression format;
- In DirectX the R200 supports all kinds of fog except the table
one.
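To illustrate the point about pixel shader values above 1.0, here is a simplified sketch of why extra range helps when several terms are accumulated. The per-step clamping model, the light contributions and the final scale factor are all hypothetical and chosen only for illustration. They do not describe the actual register behavior of either chip.

```python
def accumulate(contributions, max_value):
    """Add contributions one by one, clamping the running sum after each step."""
    total = 0.0
    for value in contributions:
        total = min(total + value, max_value)
    return total


# Hypothetical light contributions for one pixel and a hypothetical
# final modulation (for example, a dark texture multiplied in at the end).
lights = [0.7, 0.6, 0.5]
scale = 0.5

# Range limited to 1.0: the sum saturates early and the detail is lost.
clamped_result = accumulate(lights, 1.0) * scale

# Range up to 8.0: the full sum survives until the final scale,
# and only the output color is clamped to the displayable 1.0.
overbright_result = min(accumulate(lights, 8.0) * scale, 1.0)

print(clamped_result, overbright_result)
```

With the wider range the bright area keeps its relative intensity after the final multiplication instead of being flattened at the intermediate step.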
Here is a list of OpenGL extensions supported
by current drivers of the Radeon 8500:
GL_VENDOR: ATI Technologies Inc.
GL_RENDERER: Radeon 8500 DDR x86/SSE
GL_VERSION: 1.2.2357 Win9x Release
GL_EXTENSIONS:
GL_ARB_multitexture
GL_ARB_texture_border_clamp
GL_ARB_texture_compression
GL_ARB_texture_cube_map
GL_ARB_texture_env_add
GL_EXT_texture_env_add
GL_ARB_texture_env_combine
GL_ARB_texture_env_crossbar
GL_ARB_texture_env_dot3
GL_ARB_transpose_matrix
GL_ARB_vertex_blend
GL_S3_s3tc
GL_ATI_element_array
GL_ATI_envmap_bumpmap
GL_ATI_fragment_shader
GL_ATI_pn_triangles
GL_ATI_texture_mirror_once
GL_ATI_vertex_array_object
GL_EXT_vertex_shader
GL_ATI_vertex_streams
GL_ATIX_texture_env_combine3
GL_ATIX_texture_env_route
GL_ATIX_vertex_shader_output_point_size
GL_EXT_abgr
GL_EXT_bgra
GL_EXT_blend_color
GL_EXT_blend_func_separate
GL_EXT_blend_minmax
GL_EXT_blend_subtract
GL_EXT_clip_volume_hint
GL_EXT_compiled_vertex_array
GL_EXT_draw_range_elements
GL_EXT_fog_coord
GL_EXT_packed_pixels
GL_EXT_point_parameters
GL_ARB_point_parameters
GL_EXT_rescale_normal
GL_EXT_secondary_color
GL_EXT_separate_specular_color
GL_EXT_stencil_wrap
GL_EXT_texgen_reflection
GL_EXT_texture3D
GL_EXT_texture_compression_s3tc
GL_EXT_texture_cub
GL_MAX_TEXTURE_SIZE: 1024
GL_MAX_ACTIVE_TEXTURES_ARB: 6
The current situation differs from the one we witnessed
a year ago when the R100 was released. The technological advantage
is obvious, but it is not as great as it was between the R100 and NV15
(GeForce2). Still, with a similar pipeline configuration, the chip
should be more efficient as it operates at a higher clock speed.
Well, it's time to start climbing our mountain.
First of all, let's take a gander at the ATI RADEON 8500 video card.
Card
ATI's top gaming card bears the
same name as the graphics processor.
The card has an AGP 2x/4x interface and 64 MBytes of DDR
SDRAM in 8 chips located on the front and back sides of the
PCB. The layout is, in fact, very close to that of the RADEON 64
MBytes DDR.
Hynix (formerly Hyundai Semiconductor) produces the memory
chips with 3.6 ns access time, which corresponds to 277 (554) MHz.
The memory operates at 275 (550) MHz, yet the chips
have no heatsinks. Their absence is a peculiarity of the whole
RADEON 7500/8500 series. While NVIDIA cards require mandatory cooling
of their memory chips, which at the same time work at a frequency
well below the rated one (take, for example, 230 MHz with
an access time of 3.8 ns, which corresponds to 263 MHz), the ATI
engineers equipped the chip with an excellent "ecological" memory
controller.
The ATI RADEON 7500 card, which we
studied some time ago, also has fast 4 ns memory. According
to some sources, the memory controller of the RADEON 7500 is the
same as in the RADEON 8500. And although the memory works at 230
MHz, the same controller allows doing without any cooling (the memory
chips of GeForce2 Ultra cards, for example, warm up considerably).
Moreover, the memory of the RADEON 7500 has a greater overclocking
potential than that of the GeForce2 Ultra.
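The relation between access time and rated frequency used above is simply the reciprocal: a chip with a cycle time of t nanoseconds is rated for 1000 / t MHz. A small sketch of that conversion follows. The function name is ours, and the entries list only the access times and working clocks quoted in the text.

```python
def rated_mhz(access_time_ns):
    """Rated clock in MHz for a memory chip with the given access time in ns."""
    return 1000.0 / access_time_ns


# (label, access time in ns, actual working clock in MHz), as quoted in the text.
MEMORY = [
    ("RADEON 8500", 3.6, 275),
    ("NVIDIA card from the example", 3.8, 230),
    ("RADEON 7500", 4.0, 230),
]

for label, access_ns, actual_mhz in MEMORY:
    rated = rated_mhz(access_ns)
    share = 100.0 * actual_mhz / rated
    print(f"{label}: rated {rated:.1f} MHz, works at {actual_mhz} MHz ({share:.0f}% of rated)")
```

The article quotes the rated figures without the fractional part (277 and 263 MHz); the DDR numbers in parentheses are these values doubled.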
Now let me compare the design of both cards: RADEON
7500 (on the top) and RADEON 8500 (on the bottom):
First of all, the DVI and VGA connectors are positioned differently
on the two cards. Besides, the RADEON 8500 has a RAGE THEATER
chip which is in charge of VIVO (Video In Video Out). Here
only the Video Out is enabled. I think ATI declined to equip
the 3D accelerator with the full set of VIVO multimedia functions in order to
strictly separate the capabilities of the All-In-Wonder series from those of other cards
(otherwise an All-In-Wonder would differ from a usual video card
with VIVO only in its TV tuner, and it would be unprofitable to produce
the former). Now the complete VIVO capabilities are available
only on All-In-Wonder cards.
The cooler of the Radeon 8500 processor
is glued on rather than mounted on the PCB. ATI cards have
no holes for attaching coolers, so if the fan fails
you have to break it off the chip (which is dangerous,
as you can damage the chip and the card), find new thermal grease and
clean the surface before installing a new heatsink and fan, risking
erasing what is written on the GPU:
When we removed the cooler, the dense layer of glue
let us make out only "RADEON 8500". The revision, however,
is known, as PowerStrip reported the A13 stepping.
As you know, the RADEON 8500 is equipped with a
RAGE THEATER coprocessor which controls multimedia functions. The
card has complete multimonitor support, i.e. it offers all the features
of the 7500, including a TV-out that works excellently and independently of
the monitor. But while the older model implements it through the
GPU, in the RADEON 8500 it is the RAGE THEATER that controls the
TV-out.
Most video streams are recorded in interlaced
mode: the even lines and the odd lines form two separate fields.
On monitors and TV screens with interlacing support the first
field is drawn first, and the second field follows in the second pass. But
interlacing is not used on modern monitors, which is why the stream has to be
deinterlaced.
There are two methods of doing it: Bob and Weave.
In the first case two frames are built: one from the odd lines
and the other from the even ones. Each line is copied twice.
This approach is good for video with intensive movement.
Here is an example:
The other approach, Weave, is suitable for static
frames. Here the lines of the two fields are interleaved, which gives one frame with
twice the vertical resolution. Here is an example:
ATI offers its own method of per-pixel deinterlacing
with a much higher image quality:
The quality of the TV-out and video playback
of the RADEON 8500 is among the best.
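The two basic methods described above, Bob and Weave, can be sketched in a few lines. This is a minimal illustration that treats a frame as a list of rows. The function names are ours, and it says nothing about how ATI's per-pixel method or any real driver is implemented.

```python
def split_fields(frame):
    """Split a frame (a list of rows) into its even-line and odd-line fields."""
    return frame[0::2], frame[1::2]


def bob(field):
    """Bob: build a full-height frame from one field by copying each line twice."""
    frame = []
    for row in field:
        frame.append(row)
        frame.append(row)
    return frame


def weave(even_field, odd_field):
    """Weave: interleave the lines of two fields into a single frame."""
    frame = []
    for even_row, odd_row in zip(even_field, odd_field):
        frame.append(even_row)
        frame.append(odd_row)
    return frame


# A tiny 4-line "frame"; each string stands for one row of pixels.
source = ["e0", "o0", "e1", "o1"]
even, odd = split_fields(source)

print(bob(even))         # one field stretched to full height
print(bob(odd))          # the other field stretched to full height
print(weave(even, odd))  # both fields merged back into one frame
```

Bob yields two output frames per input frame at half the vertical detail, which suits motion. Weave keeps full detail but shows combing when the two fields were captured at different moments.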
Like the RADEON 7500, the 8500 card can display
an image on two screens since the chip has two integrated CRTC modules
(and a transmitter for digital monitors). The secondary RAMDAC (it
is between the GPU and the RAGE THEATER on the photo) turned out
to be an external 10-bit chip working at a maximum of 240 MHz. Though
this is not much, the frequency is still enough to obtain 1600X1200
at 100 Hz on the second monitor. All the operational peculiarities
of the RADEON 7500, including the HydraVision technology, also
apply to the 8500 (you can read the RADEON
7500 review for details on displaying an image on two screens).
The video card ships in both OEM and Retail
packages. In the box you can find:
- User Manual;
- CD with drivers and utilities;
- CDs with games and demo-products;
- S-Video-to-RCA adapter;
- DVI-to-VGA adapter;
- S-Video, RCA cables.
Overclocking
With additional cooling, this card worked flawlessly at 310/295 (590)
MHz (core/memory). I think ATI's new chip has excellent
potential. The memory didn't speed up much. Nevertheless, taking
into account that the card is well balanced, an increase of the
GPU frequency is preferable.
Installation and drivers
Test system:
- Athlon based system:
  - AMD Athlon 1400 MHz;
  - Chaintech 7KJD (AMD760);
  - 512 MBytes DDR SDRAM PC2100;
  - IBM DTLA HDD, 45 GBytes;
  - OS Windows 98 SE.
- Pentium 4 based system:
  - Intel Pentium 4 1500 MHz;
  - ASUS P4T (i850);
  - 512 MBytes RDRAM PC800;
  - Quantum FB AS HDD, 20 GBytes;
  - OS Windows 98 SE.
The test systems were also equipped with ViewSonic
P810 (21") and ViewSonic P817 (21") monitors.
For the tests we used the ATI 7.191 drivers. For
comparison we used the results of NVIDIA
GPU based cards obtained with the NVIDIA 21.85 drivers. The competitors
are:
- NVIDIA GeForce3 Ti 500 (240/250 (500) MHz, 64 MBytes DDR);
- NVIDIA GeForce3 Ti 200 (175/200 (400) MHz, 64 MBytes DDR);
- ABIT Siluro GF3 VIO (GeForce3, 200/230 (460) MHz, 64 MBytes
DDR).
VSync was disabled in the drivers of all cards.
Attention! Such powerful video
cards are meant for operation in 32-bit color. We omit the
analysis of the results obtained in 16-bit color, as the latest generation
(GeForce3/RADEON 8500) has given up on it.
[ Part II
]