Direct3D 9: Pixel Shaders
The first group of pixel shaders to be reviewed here is too simple for modern GPUs. It includes various versions of pixel programs of relatively low complexity: 1.1, 1.4, and 2.0.
We have noted many times that these tests fail to reveal the true power of a GPU, although they help evaluate the balance between texture lookups and arithmetic operations. These days, performance in such simple tests is limited mostly by the speed of the texturing units, and that is exactly what we see here: the results are similar to those of the texturing speed tests.
As there have been no major architectural changes since the RV7xx, the new HD 5700 cards perform similarly to the HD 4870. There is a difference in one test, however: the most arithmetic-intensive one, with three light sources. Arithmetic performance matters more there, so the HD 5750 is slightly outperformed by the other AMD cards of the same performance level.
The new mid-range solutions from AMD compete well with both cards from NVIDIA. The HD 5770 easily leaves the GTX 260 behind, and the HD 5750 performs on a par with the GTS 250. Considering the expected advantage in arithmetic tests, that is a good step toward overall victory. The HD 5870 is actually not that much faster here. Let's have a look at the results in more complex pixel programs of intermediate versions:
These results are even more interesting. Performance is apparently limited mostly by ALUs. The procedural water test (which depends on texturing performance, among other things) uses dependent texture lookups of high nesting depth, so the cards line up according to their texel and fill rates (if we ignore the difference between vendors). The Cypress-based card demonstrates top results, being twice as fast as the top Juniper-based card, which agrees with the theoretical values.
The HD 5770 is faster than the GTX 260, and the HD 5750 performs on a par with the GTS 250. So this situation resembles the texturing test. Interestingly, the GTX 260 is outperformed by all contenders here, even by the GTS 250.
The second, arithmetic-intensive test apparently favors the AMD architecture with its large number of arithmetic units. The difference between AMD's new mid-range solution and its top solution is a tad smaller than twofold; the HD 5870 probably cannot reveal its full potential in these conditions. Still, these results agree with the theory that the HD 5770 is on a par with the HD 4870 and that the HD 5750 is outperformed by both of them. Both HD 5700 cards are also still faster than their competitors from NVIDIA.
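The "theory" mentioned above can be approximated with a simple peak-throughput estimate. The sketch below is only an illustration and not part of our test suite: it assumes the publicly listed stream processor counts and core clocks of the reference cards (these figures are not taken from the test results), and it counts one multiply-add, that is, two floating-point operations, per stream processor per clock.

```python
# Rough estimate of peak arithmetic throughput (illustrative sketch).
# ASSUMPTION: unit counts and clocks are public reference specifications,
# not values measured in this review.
CARDS = {
    # name: (stream processors, core clock in MHz)
    "HD 5870": (1600, 850),
    "HD 5770": (800, 850),
    "HD 5750": (720, 700),
    "HD 4870": (800, 750),
}


def peak_gflops(stream_processors, core_mhz):
    """Peak GFLOPS, counting one multiply-add (2 FLOPs) per unit per clock."""
    return stream_processors * core_mhz * 2 / 1000.0


def ratio(card_a, card_b):
    """How many times higher the peak throughput of card_a is than card_b."""
    return peak_gflops(*CARDS[card_a]) / peak_gflops(*CARDS[card_b])


if __name__ == "__main__":
    for name, spec in CARDS.items():
        print(f"{name}: {peak_gflops(*spec):.0f} GFLOPS (theoretical peak)")
    print("HD 5870 / HD 5770:", ratio("HD 5870", "HD 5770"))
```

Under these assumptions the HD 5870 has twice the peak throughput of the HD 5770, while the HD 4870 lands between the HD 5750 and the HD 5770. Real shaders never reach these peaks, so only the ratios are meaningful.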
Direct3D 9: New Pixel Shaders
These tests of DirectX 9 pixel shaders are even more complex, and they are divided into two categories. We'll start with the easier shaders -- SM 2.0:
- Parallax Mapping -- a texturing method used in many modern games.
- Frozen Glass -- a complex procedural texture that visualizes frozen glass with adjustable parameters.
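For readers unfamiliar with the technique, basic parallax mapping shifts the texture coordinates along the view direction by an amount proportional to a height-map sample. The following is a minimal CPU-side sketch in Python, not the shader used in our test; the function name and the `scale` and `bias` parameters are illustrative assumptions.

```python
def parallax_offset(uv, view_dir_ts, height, scale=0.04, bias=-0.02):
    """Return texture coordinates shifted by basic parallax mapping.

    uv          -- (u, v) original texture coordinates
    view_dir_ts -- (x, y, z) normalized view vector in tangent space
    height      -- height-map sample in [0, 1] taken at uv
    scale, bias -- hypothetical tuning constants (not from the test shader)
    """
    # Convert the height sample into a signed displacement.
    h = height * scale + bias
    vx, vy, _ = view_dir_ts
    # Shift the lookup along the projection of the view vector.
    return (uv[0] + vx * h, uv[1] + vy * h)


if __name__ == "__main__":
    # Looking straight down at the surface produces no shift.
    print(parallax_offset((0.5, 0.5), (0.0, 0.0, 1.0), 1.0))
    # An oblique view shifts the lookup along the view direction.
    print(parallax_offset((0.5, 0.5), (0.6, 0.0, 0.8), 1.0))
```

The cost per pixel is one extra height-map lookup plus a few arithmetic instructions, which is why the technique fits comfortably into SM 2.0.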
There are two modifications of these shaders: arithmetic-intensive and texture-sampling-intensive. Let's analyze the arithmetic-intensive modifications first, as they are more promising from the point of view of future applications:
These universal tests depend on both ALU and texturing speed, so it's the overall GPU balance that matters here. Performance in the Frozen Glass test is limited not only by arithmetic speed, but also by texel rate. The situation resembles what we have seen above, only the NVIDIA cards are a tad stronger here. Once again, the top HD 5870 is less than twice as fast as the HD 5770, which performs on a par with the HD 4870. That is enough to outperform the GTX 260, while the HD 5750 and GTS 250 demonstrate similar results in this test.
Results in the second test, Parallax Mapping, depend on ALUs but mostly on memory bandwidth, so the HD 4870 noticeably outperforms both cards from the new series. That's the weak spot of the new solutions -- the 128-bit memory bus. But given the weaker results of the GTS 250, both new cards still demonstrate higher performance than their direct competitors from NVIDIA. Let's analyze the results obtained in the texture-sampling-intensive tests to make sure our conclusions are correct:
The situation is similar, but the new cards cope with texture lookups a tad better than the HD 4870, and both models demonstrate very good results. The faster of the two ranks almost on a par with the HD 4870, but it is still outperformed in the Parallax Mapping test because of its lower memory bandwidth. Performance here is limited by TMU speed to a higher degree, as the GTS 250 defeats the GTX 260 in these tests. That only brings the GTS 250 level with the HD 5750, however, and the HD 5770 is noticeably faster than the GTX 260.
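The memory bandwidth gap mentioned above is easy to estimate from the bus width and the effective memory data rate. The sketch below assumes the reference memory configurations of the three AMD cards; apart from the 128-bit bus of the HD 5700 series, these figures are assumptions based on public specifications rather than values from our tests.

```python
# Theoretical memory bandwidth estimate (illustrative sketch).
# ASSUMPTION: bus widths and effective data rates are reference
# specifications, not measurements from this review.
MEMORY = {
    # name: (bus width in bits, effective data rate in MT/s)
    "HD 4870": (256, 3600),
    "HD 5770": (128, 4800),
    "HD 5750": (128, 4600),
}


def bandwidth_gb_s(bus_bits, rate_mt_s):
    """Bytes per transfer times transfers per second, in GB/s."""
    return bus_bits / 8 * rate_mt_s / 1000.0


if __name__ == "__main__":
    for name, spec in MEMORY.items():
        print(f"{name}: {bandwidth_gb_s(*spec):.1f} GB/s")
```

With these numbers the HD 4870 has roughly one and a half times the bandwidth of either HD 5700 card, since the faster memory on the newer cards only partly makes up for the narrower bus.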
Let's have a look at the results of two more pixel shader tests -- SM 3.0. They are the most complex of all our tests for Direct3D 9 pixel shaders. The tests load ALUs and texture units heavily. Both shader programs are complex, long, and include a lot of branches:
- Steep Parallax Mapping -- A much heavier modification of Parallax Mapping.
- Fur -- A procedural shader that visualizes fur.
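To show why Steep Parallax Mapping is so much heavier than the basic variant, here is a minimal CPU-side sketch of the general idea in Python: instead of a single offset, the shader marches a ray through several depth layers and stops at the first layer that lies below the height field. This is not the shader from our test; the function name, the `sample_height` callback, and the `scale` and `layers` parameters are hypothetical.

```python
def steep_parallax_uv(uv, view_dir_ts, sample_height, scale=0.05, layers=16):
    """March a view ray through depth layers and return the hit coordinates.

    uv            -- (u, v) starting texture coordinates
    view_dir_ts   -- (x, y, z) normalized tangent-space vector pointing from
                     the surface to the eye (ASSUMPTION: z > 0)
    sample_height -- hypothetical callable (u, v) -> height in [0, 1]
    scale, layers -- hypothetical tuning constants
    """
    vx, vy, vz = view_dir_ts
    layer_depth = 1.0 / layers
    # Texture coordinate shift per layer as the ray descends into the surface.
    du = -vx / vz * scale / layers
    dv = -vy / vz * scale / layers

    u, v = uv
    ray_depth = 0.0
    for _ in range(layers):
        # Depth of the height field under the current sample point.
        surface_depth = 1.0 - sample_height(u, v)
        if ray_depth >= surface_depth:
            break  # the ray has reached the surface
        u += du
        v += dv
        ray_depth += layer_depth
    return (u, v)


if __name__ == "__main__":
    flat_top = lambda u, v: 1.0     # surface at the top: no displacement
    flat_bottom = lambda u, v: 0.0  # surface at maximum depth
    print(steep_parallax_uv((0.5, 0.5), (0.6, 0.0, 0.8), flat_top))
    print(steep_parallax_uv((0.5, 0.5), (0.6, 0.0, 0.8), flat_bottom))
```

Each layer costs a texture lookup and a data-dependent branch, so the per-pixel work depends on the content of the height map; loops with early exits of this kind are what SM 3.0 dynamic branching makes possible.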
Neither PS 3.0 test is limited by memory bandwidth or fill rate; they are purely arithmetic. This gives the new solutions an opportunity to show their full potential. The strongest contender, the HD 5870, shoots far ahead, being twice as fast as the HD 5770 in full accordance with the theory. Interestingly, the results of the HD 4870 fall in between those of the two HD 5700 cards.
So the new R8xx GPU behaves just like its predecessors and demonstrates very high performance in PS 3.0 tests, especially in comparison with the NVIDIA cards, which are sometimes outperformed by a factor of more than 1.5. It's a very good result, which suggests strong results in the other arithmetic tests.