New Solutions from ATI (R200 family)

By Alexander Medvedev

NVIDIA is the undisputed leader. As you know, the company has always developed good drivers, sold its chips successfully, released its products on time and never chased doubtful technologies. Well,

Are there any other competitors?

First of all, here are several facts about the new ATI chips that have recently been released. These chips look like ATI's last chance to climb back up. After that, either NVIDIA will sink ATI with the help of the NV25, or ATI will gain ground.

R200 family

First, let's look at the general picture:

So, the first chips, the R200 and RV200, have already been officially released. Half a year later a new mainstream chip, the RV250, will appear together with the R300 family. It will have all the capabilities of the R200.

By the way, the diagram shows that the two companies divide the market into sectors differently. The chips ATI labels "performance" currently compete against NVIDIA's "mainstream" sector (GeForce2 MX). By splitting the GeForce2 MX into three subsectors, NVIDIA considerably extended the concept of "mainstream", and this had a positive effect on sales volume. Let's clarify the terms: let "performance" mean the sector containing everything that is neither mainstream nor professional, and let's forget the term "enthusiast", since it is not a market sector at all.

Here is the table of characteristics available today:

Codename	R100	NV20	R200	RV200	RV250
Name	Radeon	GeForce3	Radeon 8XXX	Radeon 7XXX	?
Technology, micron	0.18	0.15	0.15	0.15	0.15 (0.13?)
Core frequency, MHz	145-183 (1)	200 (3)	250	275	300
Memory frequency, MHz	145-183 (2)	240 (3)	275	230	250-300
Memory bus	128	128	128	128/64	128
Memory type	SDR/DDR	SDR/DDR	SDR/DDR	SDR/DDR	SDR/DDR
RAMDAC	350 MHz	350 MHz	400 MHz	2 * 350 MHz	2 * 350 MHz
Integrated TV Out	Yes	No	Yes	Yes	Yes
DirectX version	7+ (4)	8	8.1	7+ (4)	8.1
Pixel pipelines	2	4	4	2	4
Texture units	3	2	2	3	2
Textures per transfer	3	4 (5)	6 (5)	3	6 (5)
Shading (Mpixel/sec)	366	800	1000	460	1200
Shading (Mtexel/sec), max	1098	1600	2000	1380	2400
Pixel Shaders	No (6)	1.0-1.3	1.0-1.4	No(6)	1.0-1.4
Vertex Shaders	No	1.1	1.1	No	1.1
3D textures	Yes	No (7)	Yes	Yes	Yes
HOS	No	RT-Patches	N-Patches (Truform)	No	N-Patches (Truform)
Texture compression	Yes	Yes	Yes	Yes	Yes
Z compression	Hierarchical tile Z buffer (HyperZ)	Tile Z buffer	Hierarchical tile Z buffer (HyperZ II)	Hierarchical tile Z buffer (HyperZ)	Hierarchical tile Z buffer (HyperZ)
HSR	Hierarchical tile Z buffer (HyperZ)	Tile Z buffer	Hierarchical tile Z buffer (HyperZ II)	Hierarchical tile Z buffer (HyperZ)	Hierarchical tile Z buffer (HyperZ)
FSAA	SSAA	MSAA	Adaptive AA	SSAA	SSAA

(1) - this parameter is not fixed for ATI chips and can change. The first Radeon ran at 166 MHz on OEM cards, 183 MHz on Retail cards and 145 MHz on the cheap LE. Later, when an optimized version of the chip appeared, even OEM cards started working at 183 MHz. For the new chips we give the launch specifications.

(2) - by default the memory frequency matches the core frequency, but in practice another value can be set (with the help of tweakers).

(3) - the recommended frequency is given; card manufacturers can change it, though not by much.

(4) - some DirectX 8 features are supported, but not shaders.

(5) - the texture units can be used up to three times in one pass, which brings the number of simultaneously enabled textures to 6, but at the cost of 2 additional delay cycles.

(6) - non-standard shaders, the so-called "v0.5".

(7) - at present, 3D texture support for the GeForce3 is locked in DirectX but is available in OpenGL.
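The two shading rows of the table appear to follow directly from the core clock, the number of pixel pipelines and the number of texture units. The sketch below reproduces that arithmetic in Python, assuming one pixel per pipeline per clock and one texel per texture unit per clock; the function names are illustrative only. The RV200 column is left out because its listed 460/1380 figures correspond to a 230 MHz clock rather than the 275 MHz core frequency in the table.

```python
# Peak fill rates, assuming one pixel per pipeline per clock and
# one texel per texture unit per pipeline per clock.
CHIPS = {
    # name: (core MHz, pixel pipelines, texture units per pipeline)
    "R100": (183, 2, 3),
    "NV20": (200, 4, 2),
    "R200": (250, 4, 2),
    "RV250": (300, 4, 2),
}


def pixel_fill_rate(core_mhz, pipelines):
    """Peak pixel fill rate in Mpixel/sec."""
    return core_mhz * pipelines


def texel_fill_rate(core_mhz, pipelines, texture_units):
    """Peak texel fill rate in Mtexel/sec."""
    return pixel_fill_rate(core_mhz, pipelines) * texture_units


for name, (mhz, pipes, tmus) in CHIPS.items():
    print(name, pixel_fill_rate(mhz, pipes), texel_fill_rate(mhz, pipes, tmus))
```

These are theoretical peaks; real applications rarely keep every pipeline and texture unit busy on every clock.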

R200

This is the successor to the first Radeon. The chip and the card based on it (the 64 MB Radeon 8500) are aimed at the "performance" market sector.

It should be noted that, despite speed close to that of professional cards from NVIDIA and 3Dlabs, cards based on this chip cannot be counted as professional products. The reason is that they lack the required driver certification for specific professional packages or specific hardware. But one of ATI's R200-based cards may soon enter the professional market, because the Canadians have recently acquired everything connected with the professional FireGL family. Both the professional drivers and the trademark will remain the same. But you shouldn't expect a certified professional card in the near future.

So, the chip is built on the 0.15 micron process and is very close to the GeForce3 (NV20) in complexity, power consumption and heat dissipation. It is even a bit more complex: about 60M transistors. The cards will be equipped with a large, solid cooler very similar to that of the GeForce3. The tests show that the R200 can boast not only a series of technological developments but sometimes even higher performance than the GeForce3. Let's look at the block diagram of the R200:

The first thing that attracts attention is the HydraVision technology, known since the Radeon VE (codename RV100). Although there is no second integrated RAMDAC, there are two independent display controllers that can drive different resolutions and refresh rates to:

Integrated 400 MHz RAMDAC
Integrated DVI interface
Integrated TV-Out signal coder
Additional external digital interface (for instance, for realization of one more DVI interface with the help of an external chip)
Additional external RAMDAC (for the second CRT monitor support).

So, the following independent configurations of outputs are available:

CRT+CRT (external RAMDAC required)
CRT+DVI
DVI+DVI (external DVI interface required)
CRT+TV
DVI+TV

And with the picture of one output duplicated on the TV:

CRT+CRT+TV (external RAMDAC required)
CRT+DVI+TV
DVI+DVI+TV (external DVI interface required)
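The two lists above follow a simple rule: one CRT output, one DVI output and the TV encoder are integrated, so a second CRT or a second DVI needs an external part, and the TV can additionally mirror any pair that does not already include it. A minimal Python sketch of that rule is shown below; the names `OUTPUTS`, `EXTERNAL_PART`, `dual_configs` and `tv_clone_configs` are illustrative and do not come from ATI documentation.

```python
from itertools import combinations_with_replacement

OUTPUTS = ("CRT", "DVI", "TV")
# One of each output is integrated; a second CRT or DVI needs an external part.
EXTERNAL_PART = {"CRT": "external RAMDAC", "DVI": "external DVI interface"}


def dual_configs():
    """Independent two-output configurations and the extra part each needs."""
    configs = []
    for a, b in combinations_with_replacement(OUTPUTS, 2):
        if a == b == "TV":
            continue  # there is only one TV encoder
        extra = EXTERNAL_PART[a] if a == b else None
        configs.append(((a, b), extra))
    return configs


def tv_clone_configs():
    """Dual configurations without a TV, with the TV added as a duplicate."""
    return [((a, b, "TV"), extra)
            for (a, b), extra in dual_configs()
            if "TV" not in (a, b)]


for outputs, extra in dual_configs() + tv_clone_configs():
    note = f" ({extra} required)" if extra else ""
    print("+".join(outputs) + note)
```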

Output modes can be switched dynamically, and nothing prevents a manufacturer from developing a card that, like the Radeon VE, can use two CRT monitors, a DVI monitor and a TV-out in different configurations. Besides, software-level dual-head support has been improved: independent mode configuration and independent vertical refresh rate settings are now available under Windows NT/2000/XP. You can also now run Direct3D applications on two monitors at the same time.

The chip contains a hardware HDTV decoder and hardware support (iDCT) for decoding MPEG-like algorithms. Software support for MPEG4 may appear very soon, while DivX support will hardly be developed.

There is complete hardware support for GDI, including the new functions needed for optimal 2D acceleration in Windows XP: AlphaBlt, AlphaCursor and GradientFill.

Now for the architectural news. The most interesting item is Truform, a hardware implementation of the DirectX 8 N-patches technology. I have already written about this technology. Unlike the GeForce3, which emulates N-patches with hardware-supported RT-patches (such emulation is less efficient than a purely software implementation of N-patches), the R200 implements everything in hardware. How effective this implementation is, the tests will show. According to preliminary data, at a medium detail level enabling N-patches doesn't slow down modern games but greatly improves the visual quality of the models. By the way, a new version of Quake I that can use N-patches has already appeared on the Net. Even such a simple adaptation of the game (which doesn't take the right angles of models into account) looks pretty. Besides, CounterStrike will soon receive Truform support as well.

The second important feature is support for vertex and pixel shaders. Here ATI is promoting a new shader version, 1.4. By the way, ATI calls this version 2.0 (maybe to differentiate it from NVIDIA's), while Microsoft prefers 1.4 in DirectX 8.1. Let's clarify the difference between the shader versions:
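To give an idea of what N-patches compute, here is a minimal Python sketch of the published curved PN-triangle construction, which builds a cubic Bezier patch from nothing but the three vertex positions and normals of a flat triangle. It assumes Truform follows that published scheme; it says nothing about how the R200 hardware actually evaluates it, and all function names are illustrative.

```python
from math import factorial, sqrt


def normalize(v):
    length = sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def combo(*terms):
    """Weighted sum of 3D points given as (weight, point) pairs."""
    return tuple(sum(w * p[i] for w, p in terms) for i in range(3))


def pn_control_points(P, N):
    """Cubic Bezier control net from 3 corner positions P and unit normals N."""
    def edge(i, j):
        # Take the point one third along edge i->j and project it
        # into the tangent plane at corner i.
        w = dot(combo((1, P[j]), (-1, P[i])), N[i])
        return combo((2 / 3, P[i]), (1 / 3, P[j]), (-w / 3, N[i]))

    b = {"300": P[0], "030": P[1], "003": P[2],
         "210": edge(0, 1), "120": edge(1, 0),
         "021": edge(1, 2), "012": edge(2, 1),
         "102": edge(2, 0), "201": edge(0, 2)}
    E = combo(*[(1 / 6, b[k]) for k in ("210", "120", "021", "012", "102", "201")])
    V = combo(*[(1 / 3, p) for p in P])
    b["111"] = combo((1.5, E), (-0.5, V))  # centre point: E + (E - V) / 2
    return b


def pn_point(b, u, v):
    """Evaluate the patch at barycentric coordinates (w, u, v), w = 1 - u - v."""
    w = 1 - u - v
    terms = []
    for key, cp in b.items():
        i, j, k = (int(c) for c in key)
        coeff = factorial(3) // (factorial(i) * factorial(j) * factorial(k))
        terms.append((coeff * w**i * u**j * v**k, cp))
    return combo(*terms)
```

With identical normals at all three corners the patch stays flat, while normals that lean away from each other make the centre of the patch bulge outwards. This is why the technique rounds off models without any change to the source geometry.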

Shader version	1.0 - 1.3	1.4
Textures combined	4	6
Texture addressing operations	No	8
Combination operations	8	8
Shader instructions (max)	12	22

In fact, the R200 offers not only the ability to use up to 6 textures simultaneously (of course, at the cost of one delay cycle for two extra textures and two delay cycles for four extra textures), but also flexible programming of which data is fetched from the textures and from where. Earlier we had only a fixed set of texture addressing modes; now we can define these modes ourselves. This will allow us to create more realistic surface lighting and really complex procedural textures. As a rule, though, more or less solid software support appears only for the capabilities of mainstream cards, i.e. only the release of a GeForce3 MX or the RV250 (in the case of shaders 1.4) can popularize these technologies. But we may hope to enjoy pixel shaders 1.4 in Doom III. The developers say that rendering an image on the R200 will require two or three times fewer passes than on the GeForce3.

But NVIDIA is making every attempt to maintain its position. At its request, two additional standards, shaders v1.2 and v1.3, were included in the DirectX 8.1 specification; they expose minor capabilities specific to the GeForce3 family. This step allowed NVIDIA to announce that its products are completely compatible with DirectX 8.1 shaders. Moreover, NVIDIA has probably already integrated a combination pipeline compatible with shaders 1.4 into its next chip, codenamed NV25. It seems that the Radeon 8500's speed advantage is what pushed NVIDIA to release the fourth generation of Detonator drivers (20.XX), which raise GeForce3 performance in high resolutions by 10% and more through better placement of data in local memory and other minor optimizations. The total shading speed in 32-bit color has grown considerably in high resolutions, allowing the GeForce3 to outscore the ATI chip, which runs at a higher frequency! The R200 drivers are still far from perfect, and the final version of the chip has not been released yet, so it is too early to speak of an unquestioned victory for the GeForce3. But if we take into account that the GeForce3 appeared half a year earlier and is cheaper, ATI's prospects look dim.

But I can state that the NV25 will have neither N-patches support nor adaptive AA; these will appear only in the NV30, which should come out right after the R300 and RV250.
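The cost of the extra textures can be put into numbers. The sketch below is a simplified model built only on the figures quoted in this article: two texture units per pipeline, one delay cycle per extra pair of textures, and 4 or 6 textures per pass depending on the shader version. The function names are illustrative, and counting texture layers alone does not capture everything that determines the number of passes in a real engine.

```python
from math import ceil

TEXTURE_UNITS = 2  # per pipeline, from the characteristics table
MAX_TEXTURES = {"1.0-1.3": 4, "1.4": 6}  # textures combined per pass


def passes_needed(texture_layers, shader_version):
    """Rendering passes needed to apply the given number of texture layers."""
    return ceil(texture_layers / MAX_TEXTURES[shader_version])


def cycles_per_pixel(textures_in_pass):
    """One cycle for up to 2 textures, plus one delay cycle per extra pair."""
    return max(1, ceil(textures_in_pass / TEXTURE_UNITS))


def effective_pixel_rate(core_mhz, pipelines, textures_in_pass):
    """Peak Mpixel/sec once delay cycles are taken into account."""
    return core_mhz * pipelines / cycles_per_pixel(textures_in_pass)


for n in (2, 4, 6):
    print(n, "textures:", cycles_per_pixel(n), "cycle(s),",
          effective_pixel_rate(250, 4, n), "Mpixel/sec on the R200")
```

Under this model six texture layers fit into a single pass with shaders 1.4 but need two passes with shaders 1.0-1.3, at the price of a lower per-pass pixel rate.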

The next new feature is AAA (Adaptive Anti-Aliasing). The technology dynamically determines (during shading) how many source samples should be calculated for each resulting pixel. It is very close to the stochastic antialiasing widely used in photorealistic graphics. First, there is a set of AA masks which emulate random stochastic sampling; second, the mask size depends on the part of the image being rendered. But the more polygons a scene uses, the less advantageous this technology is. Besides, it requires some reworking of applications. But with visual quality equal to or better than Quincunx, the performance drop must be smaller than or at least equal to that of Quincunx. The NV30 will also have a technology similar to AAA.
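Why more polygons make adaptive AA less attractive can be seen from a rough cost model. The Python sketch below is purely hypothetical: it assumes that pixels on polygon edges get a full sample mask while interior pixels get a single sample, and the parameter names and default sample counts are illustrative, not ATI's figures.

```python
def average_samples(edge_fraction, edge_samples=4, interior_samples=1):
    """Average samples per pixel when only edge pixels get the full mask."""
    if not 0.0 <= edge_fraction <= 1.0:
        raise ValueError("edge_fraction must be between 0 and 1")
    return edge_fraction * edge_samples + (1 - edge_fraction) * interior_samples


def saving_vs_uniform(edge_fraction, edge_samples=4):
    """Share of sampling work saved compared with sampling every pixel fully."""
    return 1 - average_samples(edge_fraction, edge_samples) / edge_samples


for fraction in (0.05, 0.25, 0.75):
    print(fraction, average_samples(fraction), saving_vs_uniform(fraction))
```

As the polygon count grows, a larger share of the pixels lies on some edge, `edge_fraction` approaches 1 and the saving over uniform sampling shrinks towards zero.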

An excellent Z buffer compression technology based on the hierarchical idea (HyperZ) increases effective memory bandwidth by 20%. Besides, the hierarchical Z buffer allows the frame buffer to be cleared quickly and enables efficient HSR (more efficient than that of the GeForce3), especially when the scene is drawn in a definite order (front to back). In fact, the improvement comes from tiles half the size, which work optimally with modern and future applications.
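The benefit of front-to-back ordering can be illustrated with a toy hierarchical Z buffer. The Python sketch below is a simplified model, not ATI's implementation: each tile stores the farthest depth it contains, and a surface lying behind that depth is rejected for the whole tile without any per-pixel test. The class name, the 8-pixel tile size and the full-screen test quads are assumptions made for the example.

```python
class HierarchicalZ:
    """Toy Z buffer with one 'farthest depth' value per tile."""

    def __init__(self, width, height, tile=8):
        self.width, self.height, self.tile = width, height, tile
        self.depth = [[1.0] * width for _ in range(height)]  # 1.0 = far plane
        self.tile_max = {}  # farthest depth currently stored in each tile
        self.pixel_tests = 0

    def draw_fullscreen_quad(self, z):
        """Draw a screen-sized surface at constant depth z (smaller = closer)."""
        for ty in range(0, self.height, self.tile):
            for tx in range(0, self.width, self.tile):
                if z >= self.tile_max.get((tx, ty), 1.0):
                    continue  # hidden behind everything in the tile: skip it
                farthest = 0.0
                for y in range(ty, min(ty + self.tile, self.height)):
                    for x in range(tx, min(tx + self.tile, self.width)):
                        self.pixel_tests += 1
                        if z < self.depth[y][x]:
                            self.depth[y][x] = z
                        farthest = max(farthest, self.depth[y][x])
                self.tile_max[(tx, ty)] = farthest


def count_pixel_tests(depths, size=32):
    zbuf = HierarchicalZ(size, size)
    for z in depths:
        zbuf.draw_fullscreen_quad(z)
    return zbuf.pixel_tests


layers = [0.2, 0.4, 0.6, 0.8]
front_to_back = count_pixel_tests(layers)
back_to_front = count_pixel_tests(reversed(layers))
print(front_to_back, back_to_front)
```

Drawn front to back, only the first layer reaches the per-pixel stage; drawn back to front, every layer does, so the tile-level test saves nothing.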

The last issue we'd like to touch on is the memory bus used in the R200. In fact, it has the same memory controller as the Radeon: a 128-bit bus and a wider (256 bits against 128) cache exchange line. While NVIDIA reduces this parameter by using a 4-channel crossbar architecture, ATI optimizes the chip for larger data transfer packets. Both approaches aim to match the internal structure of the chip as closely as possible. It was exactly the crossbar controller that let the new NVIDIA drivers boost performance so significantly by optimizing the data layout in memory. The large internal caches of the R200 reduce this advantage (the R200 has separate caches for vertices, textures and the frame buffer; besides, there is a special hierarchical cache for the hierarchical Z-buffer). But the R200 caches pay off mostly at low and medium resolutions.

The starting price of the base card (Radeon 8500) is $400. It can compete only if the price drops by at least $100 and well-optimized drivers let the R200 perform on par with the GeForce3 in all applications. By the way, the R200 performs very well in 3DMark2001: it outshines the GF3 at least twofold in the number of triangles and sprites and in shader performance. But this advantage will matter only when applications able to use such capabilities become popular, i.e. closer to the release of the R300. In fact, today the combination of the R200 and its drivers is weak, especially considering the high clock speeds of the memory and the chip.
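For reference, the raw bandwidth of such a bus is easy to estimate. The sketch below assumes DDR memory (two transfers per clock) at the 275 MHz listed for the R200 in the characteristics table, and applies the 20% HyperZ gain quoted earlier as a simple multiplier; the function name is illustrative.

```python
def memory_bandwidth_gb_s(bus_bits, clock_mhz, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s (transfers_per_clock=2 assumes DDR)."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9


raw = memory_bandwidth_gb_s(128, 275)  # about 8.8 GB/s
effective = raw * 1.20                 # about 10.6 GB/s with the quoted gain
print(round(raw, 2), round(effective, 2))
```

With SDR memory on the same bus, `transfers_per_clock=1` halves the raw figure.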

RV200

The greatest sales volumes and, therefore, domination of the whole market are possible only in the mainstream sector. Here we have two models from ATI's new series of chips: the RV200 (now) and the RV250 (spring 2002). In fact, the RV200 is just an improved version of the first Radeon, built on the 0.15 micron process and equipped with the more effective memory controller from the R200 and excellent dual-head support. Let's look at its block diagram:

The 0.15 micron process will help both cut the production cost and increase the clock speed. This chip is able to outscore the GeForce2 MX family both in performance and in feature set (but, unfortunately, it is more expensive).

The second 350 MHz RAMDAC integrated into the chip pleases me. There is no need to worry about the restrictions of an external RAMDAC, especially if you have two decent 19" or larger monitors. Again we have an integrated TV-out, and again there are a lot of dual configurations and new possibilities (software support for the new version of HydraVision) for controlling several desktops. It is still unclear what typical memory configurations will be used for cards on this chip.

The starting price of the base card with 64 MB of DDR memory is $180; of course, it will fall rapidly, especially when other manufacturers launch RV200-based cards. Real competition with NVIDIA is possible only at a price below $140.

RV250

In fact, it is the same R200, but with two integrated 350 MHz RAMDACs. Interestingly, the clock speed is planned to be higher, 300 MHz, which suggests that the chip is based on the 0.13 micron process. Only such an approach would ensure a low production cost and allow the core frequency to be lifted to 300 MHz. The roadmap says that the RV250 will be based on the 0.15 micron process. But what to do, then, with the dissipated heat?

R300

This chip will serve as the base for a new generation of ATI solutions. Apparently, it will offer hardware decoding and encoding(!) of MPEG1/2/4 and DV(!). There are also 8 pixel pipelines with 4 texture units each (up to 8 textures per pass and, therefore, a new technology for saving effective memory bandwidth).

But it will be a problem to implement this chip on the 0.15 micron process (at a core frequency of 350 MHz and a memory frequency of 400 MHz the chip will turn into an oven).

It has two RAMDACs, an integrated TV-Out and twice the triangle throughput of the R200: 150M per second against 75M. Experience shows, however, that the figures given by ATI should be divided by at least 4 and those from NVIDIA by at least 2. The final judgment on this chip will be possible only with the final DirectX 9 specification.

Conclusion

Well, no technological advantage and no excellent chip will make up for poor marketing and ineffective drivers. I hope ATI will arrive at the same conclusion before the R200 is launched. At present the chip looks interesting but useless.
