Summer 2002 has passed its peak and is heading toward its decline. However, everything is relative, and the decline of one thing is the rise of another. The beginning of autumn is a promising time for new 3D accelerators. But why autumn if it is only the middle of summer now? Simply because announcements please but don't warm :). If a company announces a super-chip in summer that is going to beat everything, it doesn't mean the chip will be on the market tomorrow. One has to wait a couple of months before the solution can be bought, and at a fabulous price at that (relative to the promised one).
At the same time, the budget sector is the widest one: prices for video cards there do not exceed $100. This is where the greatest profits are made. NVIDIA has followed the right marketing policy for quite a long time already, producing wide product lines that range from the cheapest accelerators to High-End solutions (for example, of the GeForce4 Ti class). Last fall ATI also offered a choice between the RADEON 7500 (as a budget solution) and the 8500 (as a High-End one), plus the earlier RADEON 7000/7200 cards, which are widely available on the shelves. This approach, together with the new policy of collaboration with some manufacturers, allowed the Canadian company to put pressure on NVIDIA, which had felt so free before. However, the RADEON 7500 is just an accelerated version of the original RADEON with dual-monitor support. It took great price cuts for those cards to start getting popular (taking into account that ATI was known as a company that always had problems with drivers). The pressure from NVIDIA was very strong, and it was reinforced by the budget GeForce4 MX solutions. Today we can see how rapidly prices are falling for the GeForce4 Ti 4200, which is now fighting against ATI's former High-End solution, the RADEON 8500.
However, NVIDIA made a big mistake by cutting the capabilities of the GeForce4 MX so much that these cards became just accelerated versions of the GeForce2 without any DirectX 8.1 support. Today's market already offers games that use the shader technologies the GeForce4 MX lacks, which is why the release of the RV250 may turn out to be a terrible blow to the GeForce4 MX. ATI brings shader technologies to the sub-$100 video card sector! It means that game developers will no longer be afraid of creating games that use shaders (especially pixel shaders), knowing that the market of video cards supporting DirectX 8.1 has grown by a great margin. And the consequences of that growth may be the last nail in the coffin of the GeForce4 MX.
We, users and testers, must welcome this move by ATI, and if the flow of games supporting shaders grows in a geometric progression, we should thank the Canadian company. NVIDIA can respond with one more price reduction for the Ti 4200. But this is just an assumption.
So, you must understand that the RV250 (RADEON 9000) replaces the RV200 (RADEON 7500), not the RADEON 8500! Thus ATI holds back the landslide of prices for the latter, leaving room for the RADEON 9000, which, priced at about $100, will easily kill the RADEON 7500. So the RADEON 9000 will sit at the bottom (along with the previous solutions down to the RADEON 7000), the RADEON 8500 will be in the middle, and the R300 (RADEON 9700) will get the crown. I think the latter will bring in a whole line, which is why the RADEON 8500 will gradually vanish as well. But for the time being it will take the center of ATI's new card lineup.
As you know, we already have a proven procedure for testing High-End solutions. It does not suit middle and low-end ones, however, and ATI is now releasing two products that sit above and below the current flagship, the RADEON 8500. By the way, why is the card numbered 9000 if it is weaker than the RADEON 8500?
Line of products
We are used to the fact that ATI releases more than one card at a time. Last time we had the RADEON 8500 and 8500LE, and now:
Note, first, that we do not touch upon the features of ATI's High-End R300 (RADEON 9700) here; second, we did not have much time for a thorough examination of the RV250 (RADEON 9000), which is why the review consists of 2 parts. In this part we will take a look at the RV250 and analyze the performance and capabilities of a 64 MBytes card based on the RADEON 9000 Pro. The next part will deal with a 128 MBytes card on the same GPU.
As we can see, ATI follows NVIDIA in naming chips (the NV17, though weaker than the NV20 (GeForce3), was named GeForce4 MX). Numbers are assigned not according to power or capabilities but in chronological order. Certainly, some marketing tricks are also involved: a card with a greater number will be considered more advanced and will thus attract more attention from inexperienced users. Besides, the price of the new solution will also be interesting.
Although ATI positions the RADEON 9000 as a revolutionary solution in comparison with the RADEON 7500, we can see that it is just a light version of the RADEON 8500 with some functions added. This holds only from the standpoint of capabilities, not of production technology, because judging by the size of the cover (and the layout) the RV250 does correspond to the RADEON 7500.
It seems that they took the RADEON 8500, cut down the texture units, used an old 128-bit memory controller instead of two 64-bit ones, added a second RAMDAC and TV-out, and named the result RADEON 9000. I also doubt that HyperZ II is fully supported, but I'm not sure about it.
This is actually an optimal way to make a budget card out of a High-End one. Anyway, all the interesting functions are left unchanged in the RV250 (full DX 8.1 support).
Today we have a preproduction RV250-based card produced by Gigabyte.
The card has an AGP 2x/4x interface and 64 MBytes of DDR SDRAM located in 8 chips on both sides of the PCB.
When we tested the card we didn't have a reference card to compare it with, which is why we used a photo from ATI to show that Gigabyte's card follows the reference design:
The design of the RADEON 9000 is relatively simple. The layout is about as complex as that of the RADEON 7500, though it differs much from the latter.
The card has three standard connectors: DVI, VGA and TV-out. Unfortunately, we didn't have enough time to test the TV-out. We'll do it a bit later.
This GPU is smaller than the RADEON 8500. If we take off the cooler we will see the processor:
Although its frequencies are typical of the RADEON 7500, the complexity of the chip causes problems with its thermal conditions, which is why its cover is metallic. The cooler has a usual shape.
The latest version (3.20) of PowerStrip is able to work with the RV250, so overclocking is possible. We managed to raise the frequencies to 300/310 (620) MHz without additional cooling.
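As a rough illustration of what these numbers mean, the sketch below converts the memory clock to its effective DDR rate and estimates the theoretical peak memory bandwidth. It assumes the 128-bit memory bus mentioned earlier in the article; the function names are ours, and the result is a theoretical ceiling, not a measured figure.

```python
def ddr_effective_mhz(memory_clock_mhz):
    """DDR memory transfers data on both clock edges, so the rate doubles."""
    return memory_clock_mhz * 2


def peak_bandwidth_gb_s(memory_clock_mhz, bus_width_bits):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    transfers_per_second = ddr_effective_mhz(memory_clock_mhz) * 1_000_000
    bytes_per_transfer = bus_width_bits // 8
    return transfers_per_second * bytes_per_transfer / 1e9


# Overclocked memory clock from the text; the bus width is an assumption
# based on the 128-bit controller mentioned in the article.
memory_mhz = 310
print(ddr_effective_mhz(memory_mhz))         # 620
print(peak_bandwidth_gb_s(memory_mhz, 128))  # 9.92
```

The 620 in parentheses in the text is this doubled DDR figure.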
Test system and drivers
The test systems were coupled with ViewSonic P810 (21") and ViewSonic P817 (21") monitors.
In the tests we used ATI Catalyst drivers 6.118. VSync was off.
For comparison we used the following cards:
The settings of the Catalyst driver differ considerably from the panels we are used to:
In the tests we show the operation of the card at the highest possible quality (without additional functions such as anisotropy). Note that in Direct3D the maximum texture detail is the default.
Since we tested a preproduction sample, our estimation of 2D quality is tentative. In general, there are some flaws at high resolutions; in particular, at 1280x1024@100 Hz the image is blurry.
However, 2D quality estimation is subjective. Such cards are produced by a lot of companies, so their quality will depend on the particular sample, as well as on the card/monitor tandem (first of all, on the quality of the monitor and cable).
3D graphics, MS DirectX 8.1 SDK - extreme tests
To test various extreme characteristics of the chips we used examples from the latest version of the DirectX SDK (8.1, release), modified for better convenience and control.
This test determines the real maximum throughput of an accelerator in terms of triangles. For this purpose it uses several simultaneously displayed models, each consisting of 50,000 triangles. There is no texturing. The dimensions are minimal: each triangle takes just one pixel. It must be noted that the results of this test are unachievable in real applications, where triangles are much larger and textures and lighting are used. The results are given for 3 rendering methods: Optimized (a model optimized for the best output speed, with the size of the chip's internal vertex cache taken into account), Unoptimized (the original model), and Strip (the unoptimized model displayed as one Triangle Strip):
With the optimized model, when the memory subsystem has a minimal effect, we can see that the weaker Hardware T&L unit of the RADEON 9000 affects the scores. When one Strip is used, the RADEON 9000 runs almost on a par with the 8500. This suggests that the queue-type geometry caches of the RV250 are good, while its random-access cache is weaker (smaller) than that of the RADEON 8500. Well, this is a budget version, and it makes sense to save transistors mostly on the cache.
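To give an idea of why the output method and the vertex cache matter, here is a minimal sketch. It counts the vertices submitted for one model as a plain (non-indexed) triangle list versus a single strip, and it models a queue-type (FIFO) vertex cache for an indexed triangle list. All function names are hypothetical, the cache sizes are arbitrary illustrative values rather than real parameters of the RV250 or the RADEON 8500, and the real SDK sample may submit its data differently.

```python
from collections import deque


def vertices_submitted(triangle_count, mode):
    """Vertices sent for a non-indexed list or for one triangle strip."""
    if mode == "list":
        return 3 * triangle_count
    if mode == "strip":
        return triangle_count + 2 if triangle_count > 0 else 0
    raise ValueError("mode must be 'list' or 'strip'")


def count_vertex_transforms(indices, cache_size):
    """Count vertex transforms needed with a FIFO cache of the given size."""
    cache = deque(maxlen=cache_size)  # the oldest entry is evicted first
    transforms = 0
    for index in indices:
        if index not in cache:
            transforms += 1
            cache.append(index)
    return transforms


def grid_triangle_list(columns, rows):
    """Indexed triangle list for a grid of columns x rows quads."""
    stride = columns + 1
    indices = []
    for r in range(rows):
        for c in range(columns):
            top_left = r * stride + c
            bottom_left = top_left + stride
            indices += [top_left, bottom_left, top_left + 1,
                        top_left + 1, bottom_left, bottom_left + 1]
    return indices


print(vertices_submitted(50_000, "list"))   # 150000
print(vertices_submitted(50_000, "strip"))  # 50002

mesh = grid_triangle_list(2, 1)
print(count_vertex_transforms(mesh, 1))  # 10
print(count_vertex_transforms(mesh, 2))  # 6
```

In this toy model a larger cache lets each vertex be transformed only once, which is the idea behind optimizing a model for the size of the on-chip vertex cache.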
Vertex shader unit performance
This test determines the maximum performance of the vertex shader unit. It uses a complex shader that deals with both type-transformation and geometric functions. The test is run at the minimal resolution in order to minimize the effect of shading:
With vertex shaders the performance is equivalent to that of the RADEON 8500. Besides, we have high scores in software emulation as well. It means that the mechanism of transferring geometry data from the processor to the accelerator was improved (this is usually a weak point of ATI as compared with NVIDIA). Probably FastWrites or something similar is supported at last.
Vertex matrix blending
This T&L feature is used for realistic animation and model skinning. We tested blending with two matrices both in the "hardware" version and with a vertex shader that implements the same function. Besides, we obtained results in the software T&L emulation mode:
The sad situation with all aspects of HW T&L operation doesn't change. The scores are again lower than those of the predecessors.
In this test we measure the performance drop caused by Environment mapping and EMBM (Environment Mapped Bump Mapping). We set 1280x1024 because it is at this resolution that the difference between cards and between texturing modes is the most discernible:
The tested card handles EMBM better than the RADEON 8500; in all other cases the two are equal.
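For readers unfamiliar with two-matrix blending, a minimal sketch of the operation follows. It assumes 3x4 affine matrices in row-major order and a single weight per vertex; the function names are ours, and this is only an illustration of the math, not the code of the SDK sample we ran.

```python
def transform(matrix, point):
    """Apply a 3x4 affine matrix (row-major) to a 3D point."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3]
                 for row in matrix)


def blend_two_matrices(point, matrix_a, matrix_b, weight_a):
    """Blend the point transformed by two matrices: w*A*p + (1 - w)*B*p."""
    a = transform(matrix_a, point)
    b = transform(matrix_b, point)
    return tuple(weight_a * pa + (1.0 - weight_a) * pb
                 for pa, pb in zip(a, b))


identity = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
shift_x = [(1, 0, 0, 2), (0, 1, 0, 0), (0, 0, 1, 0)]  # move by 2 along X

# A vertex near a joint that follows the shifted "bone" by 75%.
print(blend_two_matrices((1.0, 1.0, 1.0), identity, shift_x, 0.25))
# (2.5, 1.0, 1.0)
```

Each skinned vertex costs two matrix transforms plus the blend, which is why this test loads the geometry unit more than plain T&L does.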
Pixel Shader performance
Again we used a modified MFCPixelShader example, measuring the performance of the cards at a high resolution while they executed 5 shaders of different complexity on bilinear-filtered textures:
The pixel pipelines of the new chip have an advantage in executing pixel shaders; moreover, it is greater than the frequency difference, which means that the pixel pipeline was optimized. The advantage grows with the complexity of the shader, so it is obvious that the number of execution units for stages was increased.
So, let's draw the first intermediate conclusion. Despite having half the texture units and a geometry unit that is almost twice as slow in tasks of the old fixed T&L, the budget chip demonstrates excellent results precisely in new applications that use shaders. Together with the low price this can be a crucial factor for a customer; ATI has judged quite correctly what can be saved on and what can't, giving preference to modern and future applications.
3D graphics, 3DMark2001 SE - synthetic tests
All measurements in all 3D tests were done in 32-bit color.
In the Single Texturing mode the single-channel memory controller and smaller caches have an effect. In the case of multitexturing there is only one texture unit per pipeline. The performance drop relative to the RADEON 8500 is almost, but not exactly, twofold, thanks to the ability to accumulate 6 textures per pass. However, this now costs 5 extra clocks instead of the 2 needed on the RADEON 8500.
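The clock counts above follow from simple arithmetic, sketched below under a deliberately simplified model: each texture unit applies one texture per clock, and filtering and memory limits are ignored. The one-unit and two-unit pipeline configurations correspond to the RADEON 9000 and RADEON 8500 as described in this article; the function names are hypothetical.

```python
import math


def clocks_per_pixel(texture_count, tmus_per_pipeline):
    """Clocks one pipeline needs to apply all textures to a pixel."""
    return math.ceil(texture_count / tmus_per_pipeline)


def extra_clocks(texture_count, tmus_per_pipeline):
    """Clocks spent beyond the first one (the loop-back cost)."""
    return clocks_per_pixel(texture_count, tmus_per_pipeline) - 1


# 6 textures per pass on one texture unit per pipeline vs. two.
print(clocks_per_pixel(6, 1), extra_clocks(6, 1))  # 6 5
print(clocks_per_pixel(6, 2), extra_clocks(6, 2))  # 3 2
```

In this model the pipeline with a single texture unit needs twice as many clocks per pixel, which agrees with the roughly twofold gap seen in multitexturing.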
Scene with a large number of polygons
In this test you should pay more attention to the minimal resolution, where the fillrate has almost no effect:
Again we can notice traces of the weaker HW T&L unit of the RADEON 9000. It is almost half as fast as that of the RADEON 8500. But that's acceptable given the compromise between the old T&L unit and the new shader-based one. And even so, the performance of the lighting subunit is higher than that of the RADEON 7500.
Look at the result of the synthetic EMBM scene:
The advantage of the RADEON 9000 comes not only from a higher clock speed but also from the new processor's optimized handling of EMBM. And now the DP3:
As expected, the results are almost identical because of the similar pipeline configuration.
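For reference, the per-pixel operation behind DP3 (dot product) bump mapping can be sketched as follows. The code assumes the usual encoding of a normal in an 8-bit RGB texel and a light vector that is already normalized; the function names are ours, and real hardware performs this in the pixel pipeline, not in software like this.

```python
def decode_normal(rgb):
    """Map 8-bit channels (0..255) to vector components in [-1, 1]."""
    return tuple(channel / 127.5 - 1.0 for channel in rgb)


def dp3_lighting(normal_rgb, light_direction):
    """Diffuse intensity: clamped dot product of normal and light vector."""
    normal = decode_normal(normal_rgb)
    dot = sum(n * l for n, l in zip(normal, light_direction))
    return max(0.0, min(1.0, dot))


# A texel whose normal points straight out of the surface (+Z).
print(dp3_lighting((128, 128, 255), (0.0, 0.0, 1.0)))   # 1.0
print(dp3_lighting((128, 128, 255), (0.0, 0.0, -1.0)))  # 0.0
```

The cost per pixel is one normal map fetch plus one dot product, so chips with a similar pipeline structure should behave alike in this test.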
The vertex shaders are processed slower than on the RADEON 8500, but not twice as slow. It seems that the developers saved on T&L emulation on purpose: on the RADEON 8500 it was implemented with some dedicated means, and now the load rests entirely on the shoulders of a usual vertex shader. Further on we will see that the processing of shaders typical of applications hasn't suffered much.
The RADEON 9000 outscores its rival, the RADEON 8500, though only by the difference in core frequencies. In general, the situation is not bad: in the complicated task of pixel shaders the reduced number of texture units is made up for by the higher chip frequency. Besides, the RV250 stomps the GeForce4 MX into the ground, because the latter can't work with pixel shaders at all.
The situation is very similar, and the lag of the RADEON 9000 is caused by a more complicated scene and by multitexturing, the cost of which is higher here.
This test depends mostly on geometry performance and, after that, on the ability to shade sprites using an optimized method. The latter is implemented excellently, unlike the former, where the card falls behind almost twofold. The overall performance is 1.5 times lower than that of the RADEON 8500. However, if you look at the RADEON 7500, which emulates sprites with two triangles, you will see that the progress is considerable.
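To show what emulating a sprite with two triangles involves, here is a minimal sketch that expands one screen-space point into a quad. The corner order and the function name are our own choices for illustration; an actual driver does this internally and in its own way.

```python
def sprite_to_triangles(center, size):
    """Expand a square point sprite into two screen-space triangles."""
    cx, cy = center
    half = size / 2.0
    top_left = (cx - half, cy + half)
    top_right = (cx + half, cy + half)
    bottom_left = (cx - half, cy - half)
    bottom_right = (cx + half, cy - half)
    return [(top_left, bottom_left, top_right),
            (top_right, bottom_left, bottom_right)]


triangles = sprite_to_triangles((0.0, 0.0), 2.0)
print(len(triangles))                          # 2
print(sum(len(tri) for tri in triangles))      # 6
```

Without indexing, one point thus turns into six vertices (four with indexing), which helps explain why emulated sprites put so much more load on the geometry stage than native ones.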
So, one more intermediate conclusion. It doesn't differ much from the one we drew at the end of the extreme SDK tests. There is a weaker HW T&L unit but a much stronger new shader unit (first of all, for pixel shaders). With new games actively using these techniques, the advantage of the RADEON 9000 will grow. However, the chip has only 4 texture units instead of 8, which affects the overall efficiency, though for a card priced at $100 this is excusable.
3D graphics, 3DMark2001 - game tests
As expected, the RADEON 9000 is between the RADEON 8500 and 7500.
3DMark2001, Game1 Low details
The situation is close to the previous one. Well, Game1 and the overall 3DMark score often show proportional results.
When anisotropy is activated, the speed is lower than that of the RADEON 8500, though the gap is generally smaller. We will look at the quality of this filtering in Part 2.
3DMark2001, Game2 Low details
This test always shows the full power of accelerators :-). The one with better frequencies, pipelines etc. will be a winner. It is interesting that even 300/620 MHz of the RADEON 9000 didn't let it outdo the RADEON
3DMark2001, Game3 Low details
In this test the RADEON 9000 comes very close to its senior brother. Its performance drop caused by anisotropy is much smaller than that of the competitors, though the card has only 4 texture units. Perhaps RIP-mapping makes up for it, or the developers have found a way to optimize the implementation of the filtering. We will return to this issue in the next part.
Thanks to faster pixel shader execution, the RADEON 9000 has caught up with its senior sibling, although the test is quite complicated and the card can't boast about its texture units.
3D graphics, game tests
For estimation of 3D performance in games we used the following tests:
Quake3 Arena, Quaver
This test, which lacks complicated geometry, allowed the card to catch up with the RADEON 8500 and even outscore it when overclocked. But the fact that it has only 4 texture units, together with the game's inability to work with more than two textures, resulted in a great slowdown of the RADEON 9000 with anisotropy enabled.
Serious Sam: The Second Encounter, Grand Cathedral
Here are screenshots with the settings:
So, the results:
This game can apply 4 textures, so the capabilities of the RADEON 9000 are put to better use; however, the lack of texture units (only one per pipeline) affected the performance when anisotropic filtering is used.
Return to Castle Wolfenstein (Multiplayer), Checkpoint
While the overall performance of the RADEON 9000 hasn't changed (it came close to its senior brother), with anisotropy activated the speed falls faster as the resolution grows. This may, however, be an effect of the CPU, on which the card depends a lot, especially at low resolutions.
This is a difficult test, so the bottlenecks come to light easily. However, the RADEON 9000 performs nicely. But it's not clear why the performance of the RADEON 8500/9000 drops so much with anisotropic filtering.
In short, there are no complaints; it is the same as with the RADEON 8500. The details on anisotropic filtering quality will be given in the next part of the RV250 review.
I think that if the RADEON 9000 becomes widely available on the market (unlike the RADEON 8500, which was in short supply for half a year), the company has a chance to retire the RADEON 7500, and the GeForce4 MX will have to make room or fall in price. The RADEON 8500 can hold its price because there is a product below it with all modern technologies implemented. Hence the conclusion that ATI is trying to line up a well-composed and logical chain of products that will possess almost all modern technologies.