Direct3D 9: Pixel Shaders Tests
The first group of pixel shaders to be reviewed here is too simple for modern GPUs. It includes various versions of pixel programs of relatively low complexity: 1.1, 1.4, and 2.0.
These tests are too easy for modern architectures, even for Low-End solutions. They fail to reveal the true GPU power, but they help evaluate the balance between texture lookups and arithmetic operations in a new architecture. Performance in simple tests is limited by TMUs, and the HD 4670 card we review today is based on RV730, which has improved texturing capabilities. Indeed, HD 4670 outperforms its rivals in almost all tests; it's just a bit slower than HD 3870 in one test. It is more than twice as fast as the NVIDIA solutions.
Comparing RV670 and RV730, we can clearly see which tests are limited by arithmetic operations and which by texture lookups. For example, all procedural texturing tests benefit from the higher texturing performance of RV730. Lighting tests, on the other hand, load ALUs with arithmetic operations, so RV670 and RV730 demonstrate close results at the same frequency. Let's have a look at results in more complex pixel programs of intermediate versions:
On the whole, the situation is similar to the previous test: there is an apparent difference between the two tests, one of which loads texture units, while the other loads ALUs. The procedural water test (which depends heavily on texturing performance) uses dependent texture lookups of high nesting depth, so the cards line up according to their texel rates. RV730 is almost three times as fast as HD 3870 here, and almost four times as fast as NVIDIA cards!
The second test (arithmetic-intensive) apparently favors all AMD architectures with lots of arithmetic units. In this test the new Low-End solution from AMD performs on a par with RADEON HD 3870 operating at the same frequency, which fully agrees with the theory. Architectural changes in RV7xx are apparently successful: they've raised ALU efficiency and accelerated texturing.
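The reasoning used in this section (a test is limited either by texture lookups or by arithmetic) can be sketched as a simple bottleneck model. This is only an illustration, not how the benchmark measures anything: the `Gpu` class, the `frame_cost` function, the unit counts, and the per-pixel operation counts are all hypothetical values chosen for the example, not figures taken from this review.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Gpu:
    name: str
    tmus: int         # texture units (hypothetical count)
    alus: int         # arithmetic lanes (hypothetical count)
    clock_mhz: float  # core clock, same for both cards in this example


def frame_cost(gpu: Gpu, tex_per_pixel: int, alu_per_pixel: int) -> Tuple[float, str]:
    """Return (relative time per pixel, limiting unit) for one shader.

    Assumes texture and ALU work overlap perfectly, so the slower
    of the two determines the result.
    """
    tex_time = tex_per_pixel / (gpu.tmus * gpu.clock_mhz)
    alu_time = alu_per_pixel / (gpu.alus * gpu.clock_mhz)
    if tex_time >= alu_time:
        return tex_time, "texture"
    return alu_time, "alu"


# Two hypothetical cards that differ only in the number of texture units.
card_a = Gpu("A", tmus=16, alus=320, clock_mhz=750)
card_b = Gpu("B", tmus=32, alus=320, clock_mhz=750)

# Hypothetical shaders: (texture lookups, ALU operations) per pixel.
shaders = {"procedural texture": (8, 40), "lighting": (2, 200)}

for shader_name, (tex, alu) in shaders.items():
    time_a, limit_a = frame_cost(card_a, tex, alu)
    time_b, _ = frame_cost(card_b, tex, alu)
    print(f"{shader_name}: limited by {limit_a} on A, B is {time_a / time_b:.2f}x as fast")
```

Under these assumptions, doubling the texture units doubles the speed of the texture-heavy shader and leaves the ALU-heavy one unchanged, which is the pattern used above to tell the two kinds of tests apart.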
Direct3D 9: New Pixel Shaders Tests
These tests of DirectX 9 pixel shaders are even more complex. They are divided into two categories. We'll start with the easier shaders, SM 2.0:
- Parallax Mapping is a texturing method used in many modern games
- Frozen Glass is a complex procedural texture that visualizes frozen glass with adjustable parameters
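For readers unfamiliar with the first technique, here is a minimal sketch of the texture coordinate shift at the heart of basic parallax mapping (the variant with offset limiting). It is written in Python for illustration only and is not the shader used in the test; the function name `parallax_uv` and the `scale` and `bias` defaults are assumptions.

```python
import math


def parallax_uv(uv, view_ts, height, scale=0.04, bias=-0.02):
    """Shift texture coordinates along the tangent-space view vector.

    uv      -- original (u, v) texture coordinates
    view_ts -- view vector in tangent space, pointing from surface to eye
    height  -- value sampled from the height map at uv, in [0, 1]
    """
    vx, vy, vz = view_ts
    length = math.sqrt(vx * vx + vy * vy + vz * vz)
    vx, vy = vx / length, vy / length  # only x and y are needed after normalizing
    offset = height * scale + bias     # signed displacement (scale and bias are illustrative)
    return (uv[0] + offset * vx, uv[1] + offset * vy)


# Looking straight down at the surface gives no shift;
# a grazing view shifts the lookup along the view direction.
print(parallax_uv((0.5, 0.5), (0.0, 0.0, 1.0), 1.0))
print(parallax_uv((0.5, 0.5), (1.0, 0.0, 1.0), 1.0))
```

The shader costs one extra height map lookup plus a few arithmetic operations per pixel, which is why this kind of test is sensitive to both texturing and ALU speed.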
There are two modifications of these shaders: arithmetic-intensive and texture sampling intensive. Let's analyze the arithmetic-intensive modifications first, as they are more promising from the point of view of future applications:
These universal tests depend on the speed of both ALUs and texturing, so it's the overall GPU balance that matters here. Performance of graphics cards in the Frozen Glass test is limited not only by arithmetic speed, but also by texel rate, so the old RADEONs demonstrated weak results. But our product under review is way faster than all the other cards; even HD 3870 is slower by 80%. This card just has no competitors: NVIDIA products are 2.5-3 times slower.
AMD cards are usually even faster in the second test, Parallax Mapping. However, there is not much difference from HD 3870 here. Perhaps ALUs play a more important role than in the previous test. But HD 4670 is still the fastest solution, more than twice as fast as the competing GeForce cards. TMU improvements significantly raised results of RV730 versus RV670: the difference at the same frequency amounts to 40%. Let's analyze results obtained in the texture sampling intensive tests, where RV730 may demonstrate even more interesting results:
That's right, RADEON HD 4670 breaks away from the other cards even further in the first test. The situation has changed a little: performance is apparently limited by the speed of texture units. RV730 demonstrates better results in both tests, outperforming RV670 by a factor of 2.5 in the first test and by one third in the second test. NVIDIA cards improved their results, but RV730 is still too fast for them.
Let's have a look at results of two more pixel shader tests, this time SM 3.0. They are the most complex of all our tests for Direct3D 9 pixel shaders. The tests load ALUs and texture units heavily. Both shader programs are complex, long, and include a lot of branches:
- Steep Parallax Mapping is a much heavier modification of parallax mapping
- Fur is a procedural shader that visualizes fur
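To show why the first of these shaders is so much heavier and why branching matters, here is a minimal sketch of the layer-stepping loop commonly used for steep parallax mapping. It is a Python illustration under stated assumptions, not the code of the test: `steep_parallax_uv`, `depth_at`, `depth_scale`, and `layers` are hypothetical names and values, and the view vector is assumed to be in tangent space with a positive z component.

```python
def steep_parallax_uv(uv, view_ts, depth_at, depth_scale=0.1, layers=16):
    """March along the view ray through depth layers until it hits the surface.

    uv       -- starting (u, v) texture coordinates
    view_ts  -- tangent-space view vector from surface to eye, with z > 0
    depth_at -- callable (u, v) -> depth in [0, 1], standing in for a depth map lookup
    """
    vx, vy, vz = view_ts
    layer_depth = 1.0 / layers
    # Texture coordinate shift per layer along the projected view direction.
    du = vx / vz * depth_scale / layers
    dv = vy / vz * depth_scale / layers

    u, v = uv
    current_layer = 0.0
    for _ in range(layers):
        # One depth lookup and one data-dependent branch per step.
        if current_layer >= depth_at(u, v):
            break
        u -= du
        v -= dv
        current_layer += layer_depth
    return u, v


# A constant depth of 0.5 stands in for a real depth map in this example.
print(steep_parallax_uv((0.5, 0.5), (1.0, 0.0, 1.0), lambda u, v: 0.5))
```

Each step performs a dependent texture lookup followed by a branch, so a shader of this kind stresses texture units, ALUs, and dynamic branching all at once.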
Older AMD solutions used to be much slower than NVIDIA cards in these tests, although this does not apply to cheap GeForce models, which also show low results here. But the new RV7xx architecture offered huge performance gains in PS 3.0 tests. RADEON HD 4670 is in the lead now, and it is 1.7-2 times as fast as HD 3870 at the same frequency.
As for the other contenders, the new card from AMD is 3.6-3.7 times as fast as GeForce 9500 GT and three times as fast as GeForce 8600 GTS. The previous AMD solution from the same price range demonstrates similar results. The overhauled architecture from AMD offers excellent results, which can be explained by a significantly increased number of execution units, improved architecture, and higher efficiency of utilizing available resources.