Direct3D 10: PS 4.0 Tests (texturing, loops)
The new RightMark3D 2.0 includes two old PS 3.0 tests (Direct3D 9), rewritten for DirectX 10, and two brand new tests. The first two tests can now enable self-shadowing and shader supersampling, both of which increase the load on GPUs.
These tests measure how efficiently a GPU executes looped pixel shaders with a lot of texture lookups (up to several hundred lookups per pixel in the heaviest mode!) and a relatively low ALU load. In other words, they measure texture sampling rate and branching efficiency in a pixel shader.
The first pixel shader test is Fur. At the lowest settings it uses 15-30 texture lookups from bump maps and two lookups from the main texture. The High Effect Detail mode increases the number of lookups to 40-80. Enabling shader supersampling in the Low mode raises the count to 60-120. The heaviest mode is High with SSAA: 160-320 lookups from a bump map.
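The lookup counts quoted for the Fur test follow from a simple scaling rule, which can be checked with a few lines of Python. This is only an arithmetic sketch: the constant and function names are made up for illustration, and the factor of four is the theoretical supersampling cost stated in the text, not a measured value.

```python
# Bump-map lookups per pixel quoted in the text for the Fur test: (min, max).
FUR_LOW = (15, 30)
FUR_HIGH = (40, 80)

# Assumption taken from the text: shader supersampling quadruples the load.
SSAA_FACTOR = 4


def with_supersampling(lookups, factor=SSAA_FACTOR):
    """Scale a (min, max) lookup range by the supersampling factor."""
    low, high = lookups
    return (low * factor, high * factor)


if __name__ == "__main__":
    for name, base in (("Low", FUR_LOW), ("High", FUR_HIGH)):
        print(name, base, "-> with SSAA:", with_supersampling(base))
```

The scaled ranges match the 60-120 and 160-320 figures given for the supersampled modes.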
Let's start with the modes without supersampling. They are relatively simple, and the ratio between Low and High results should be similar across cards.
Performance in this test depends not only on the number and speed of TMUs, but also on fill rate and memory bandwidth. Results in the High mode are 1.5 times lower than in the Low mode. Finally, we have at least one AMD solution that performs on a par with competing NVIDIA cards in the procedural Fur tests for Direct3D 10 with lots of texture lookups. Theoretically, the other cards from AMD shouldn't show such low results.
RADEON HD 4670 performs on a par with the old GeForce 8600 GTS and even outperforms the new GeForce 9500 GT. The other cards from AMD are slower. HD 4670 is more than 1.5 times as fast as HD 3870 at the same frequency, and RADEON HD 3650 is a full four times slower than the company's new Low-End solution. Now let's look at the results with shader supersampling enabled, which quadruples the load. Perhaps it will change the situation, and memory bandwidth and fill rate will have a weaker effect:
Theoretically, supersampling quadruples the load. This time the advantage of NVIDIA cards is gone: they are even outperformed by HD 3870 (although it's wrong to compare them). As the shader grows more complex and the GPU load increases, the gap between HD 4670 and HD 3870 becomes almost twofold. Both GeForces are also at least twice as slow as the new graphics card in this test. AMD is apparently mending its ways: even our D3D10 package has almost no weak spots left for its solutions.
The second test that measures how efficiently GPUs execute complex looped pixel shaders with many texture lookups is called Steep Parallax Mapping. At low settings it uses 10-50 texture lookups from a bump map and three lookups from main textures. The heavy mode with self-shadowing doubles the number of texture lookups, and supersampling quadruples it. The most complex mode, with both supersampling and self-shadowing, uses 80-400 texture lookups, eight times as many as the low mode. Let's analyze the simple modes without supersampling first:
This test is even more interesting from a practical point of view. Various parallax mapping methods have long been used in games. Heavier variants, such as our steep parallax mapping, are already used in some projects, e.g. Crysis and Lost Planet. Along with supersampling, our test can enable self-shadowing, which doubles the GPU load (High mode).
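To show why the number of lookups per pixel in this kind of shader is a range rather than a fixed value, here is a minimal CPU-side sketch of a generic steep parallax mapping loop in Python. It is not the RightMark3D shader: the procedural `sample_height` function, the `height_scale` value and the default of 50 steps are all assumptions chosen for illustration.

```python
import math


def sample_height(u, v):
    """Hypothetical height map in [0, 1]; stands in for one bump-map lookup."""
    return 0.5 + 0.5 * math.sin(8.0 * u) * math.cos(8.0 * v)


def steep_parallax(u, v, view_dir, num_steps=50, height_scale=0.1):
    """March the view ray through the height field in equal depth slices.

    view_dir is a tangent-space vector (x, y, z) pointing from the surface
    toward the viewer, with z > 0. Returns the shifted (u, v) and the number
    of height lookups that were performed.
    """
    vx, vy, vz = view_dir
    step_depth = 1.0 / num_steps
    # Texture-coordinate shift per slice; it grows at grazing angles (small vz).
    du = -vx / vz * height_scale * step_depth
    dv = -vy / vz * height_scale * step_depth

    ray_depth = 0.0
    lookups = 0
    for _ in range(num_steps):
        surface_depth = 1.0 - sample_height(u, v)
        lookups += 1
        if ray_depth >= surface_depth:
            break  # the ray is at or below the surface: exit the loop early
        u += du
        v += dv
        ray_depth += step_depth
    return (u, v), lookups


if __name__ == "__main__":
    for view in ((0.0, 0.0, 1.0), (0.7, 0.0, 0.3)):
        uv, n = steep_parallax(0.25, 0.25, view)
        print("view", view, "-> uv", uv, "lookups", n)
```

The early exit is what makes the lookup count data-dependent, so a GPU's branching efficiency matters as well as its texture sampling rate. A self-shadowing pass would typically repeat a similar march toward the light source, which is consistent with the doubled lookup count described above.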
The situation is similar to what we saw in the previous test. RADEON HD 4670 is the best in the updated D3D10 version of the test without supersampling. Both GeForces are left behind, as is HD 3870 operating at reduced frequencies. Texturing performance is apparently important in this test. Curiously enough, self-shadowing causes a bigger performance drop on AMD products than on NVIDIA solutions.
Once again RADEON HD 4670 beats older AMD cards by a wide margin, including cards from other price ranges. The performance difference between HD 4670 and HD 3870 at the same frequencies reaches 45-65%. Let's see what supersampling changes. In the previous test, the performance drop from supersampling was bigger on NVIDIA cards.
Supersampling and self-shadowing together increase the load on graphics cards almost eightfold, causing a great performance drop. The balance between the cards shifts as well. Supersampling has a similar effect here as in the previous test: AMD cards improve their results relative to NVIDIA solutions. RADEON HD 4670 is still noticeably faster than both NVIDIA GeForces, and it's almost 1.5 times as fast as HD 3870 at the same frequency. RADEON HD 3650, based on the old architecture, is left far behind.
Direct3D 10: PS 4.0 Tests (computing)
The next pair of pixel shader tests uses a minimum of texture lookups to reduce the effect of TMU performance. They consist mostly of arithmetic operations, so they measure the arithmetic performance of GPUs, that is, how fast they execute arithmetic instructions in pixel shaders.
The first computing test is called Mineral. It's a complex procedural texturing test, which uses only two texture lookups and 65 sin and cos instructions.
In analyses of our synthetic test results we have always noted that modern AMD architectures perform better in complex arithmetic tasks than competing products from NVIDIA. The situation has shifted over time. While RADEON HD 3650 only slightly outperformed GeForce 9500 GT, the new HD 4670 tears its competitor to pieces, demonstrating excellent performance worthy of higher-class graphics cards.
The graphics card based on the new RV730 is 2.7-3.1 times as fast as its direct competitors from NVIDIA and HD 3650 of the previous generation. Interestingly, at identical clock rates HD 4670 is a tad faster (10%) than HD 3870, which has just as many streaming processors. It looks like this test benefits from the minor changes introduced in RV7xx to raise the arithmetic efficiency of ALUs.
The second shader test, Fire, is even harder on ALUs. It contains only a single texture lookup, while the number of sin/cos instructions is doubled to 130. Let's see what changes as the load grows:
Rendering speed in this test is also limited almost solely by shader performance, so the results are similar, although the performance differences between the cards have grown a little. RADEON HD 4670 is 3-3.5 times as fast as both GeForces and HD 3650 in this test. Here, too, the new graphics card outperforms HD 3870 operating at the same frequency, now by almost 12%. So, as far as arithmetic performance is concerned, the new solution from AMD is an obvious leader.
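To put the instruction mix of the two computing tests side by side, the per-pixel figures quoted for Mineral and Fire can be turned into a ratio of trigonometric instructions to texture lookups. This is a rough illustration using only the numbers from the text; the dictionary and function names are hypothetical, and the ratio ignores any other instructions the shaders may contain.

```python
# Per-pixel figures quoted in the text: (texture lookups, sin/cos instructions).
TESTS = {
    "Mineral": (2, 65),
    "Fire": (1, 130),
}


def trig_per_lookup(lookups, trig_ops):
    """Number of sin/cos instructions executed per texture lookup."""
    return trig_ops / lookups


if __name__ == "__main__":
    for name, (lookups, trig_ops) in TESTS.items():
        ratio = trig_per_lookup(lookups, trig_ops)
        print(f"{name}: {ratio:.1f} sin/cos per texture lookup")
```

By this measure Fire is four times as ALU-heavy per lookup as Mineral, which fits the observation that its results depend even more on shader performance.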