DX10: PS 4.0 Tests (texturing, loops)
The new RightMark3D 2.0 includes two older PS 3.0 tests (Direct3D 9), rewritten for DirectX 10, and two brand-new tests. The first two tests can now enable self-shadowing and shader supersampling, both of which increase the load on GPUs.
These tests measure the efficiency of executing looped pixel shaders with a lot of texture lookups (up to several hundred lookups per pixel in the heaviest mode!) and a relatively low ALU load. In other words, they measure texture sampling rate and branching efficiency in a pixel shader.
The first pixel shader test is the Fur test. At the lowest settings, it uses 15-30 texture lookups from bump maps and two lookups from the main texture. The High Effect Detail mode increases the number of lookups to 40-80. When shader supersampling is enabled in the low mode, the number of lookups grows to 60-120. The heaviest mode is High with SSAA: 160-320 lookups from a bump map.
Let's see what happens in the modes without supersampling. They are relatively simple, and the ratio between the cards should be similar in the Low and High modes.
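The lookup figures above follow a simple pattern, which can be checked with a few lines of arithmetic. This is only a sketch: the ranges are the ones quoted in the text, while the factor of four for supersampling is inferred from those figures (and from the "quadruples the load" remark), not taken from the shader source. The names `LOW_DETAIL`, `HIGH_DETAIL`, `SSAA_FACTOR` and `with_supersampling` are ours.

```python
# Bump-map lookups per pixel in the Fur test, as quoted in the article.
LOW_DETAIL = (15, 30)    # lowest settings
HIGH_DETAIL = (40, 80)   # High Effect Detail
SSAA_FACTOR = 4          # assumption: shader supersampling runs the shader 4x per pixel


def with_supersampling(lookups, factor=SSAA_FACTOR):
    """Scale a (min, max) lookup range by the supersampling factor."""
    lo, hi = lookups
    return (lo * factor, hi * factor)


modes = {
    "Low": LOW_DETAIL,
    "High": HIGH_DETAIL,
    "Low + SSAA": with_supersampling(LOW_DETAIL),
    "High + SSAA": with_supersampling(HIGH_DETAIL),
}

for name, (lo, hi) in modes.items():
    print(f"{name:12s} {lo}-{hi} bump-map lookups per pixel")
```

The two lookups from the main texture are left out here, since the text does not say how they scale with supersampling.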
Performance in the Fur test depends not only on the number and speed of TMUs, but also on the fill rate and memory bandwidth. AMD drivers have finally gotten rid of the bugs that affected our tests! For now, nothing is left of NVIDIA's huge advantage over AMD in the procedural Fur tests for Direct3D 10 with lots of texture lookups.
I cannot say that the HD 4870 X2 is much stronger than both NVIDIA cards, but it is at least on the same performance level. There is no need to look at the other two cards from AMD: their results will improve when the driver is updated. Well, let's have a look at the results in this test with shader supersampling enabled, which quadruples the load. Perhaps it will change the situation, and memory bandwidth and fill rate will have a weaker effect:
In theory, supersampling quadruples the load, and this time the overwhelming advantage of NVIDIA cards has disappeared as well. The new dual-GPU card from AMD is even a little faster than the GeForce GTX 280. As the shader grows more complex and the GPU load increases, the performance difference between the cards stays about the same. It's good that AMD has improved its results in this test at last.
The second test that measures efficiency of executing complex looped pixel shaders with many texture lookups is called Steep Parallax Mapping. With low settings it uses 10-50 texture lookups from a bump map and three lookups from main textures. The heavy mode with self-shadowing doubles the number of texture lookups, and supersampling quadruples this number. The most complex test mode with supersampling and self-shadowing uses 80-400 texture lookups, that is eight times as many as in the low mode. Let's analyze simple modes without supersampling first:
This test is even more interesting from the practical point of view. Various parallax mapping methods have been used in games for a long time. Heavy modifications, such as our steep parallax mapping, are already used in some projects, e.g. in Crysis and Lost Planet. Along with supersampling, our test can enable self-shadowing, which doubles the GPU load (High mode).
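To show why this kind of shader stresses texture sampling and branching, here is a rough CPU-side sketch of a steep parallax mapping ray march. It is not the RightMark3D shader: `sample_depth` is a made-up stand-in for a bump-map fetch, the 10 and 50 step limits are simply borrowed from the lookup range quoted above, and tying the step count to the view angle is an assumption about how such a range could arise.

```python
import math

MIN_STEPS, MAX_STEPS = 10, 50  # assumption: mirrors the 10-50 range in the text


def sample_depth(u, v):
    """Hypothetical stand-in for a bump-map texture fetch; returns depth in [0, 1]."""
    return 0.5 + 0.5 * math.sin(8.0 * u) * math.cos(8.0 * v)


def steep_parallax(u, v, view, height_scale=0.1):
    """March a tangent-space view ray through the depth field.

    view is a normalized tangent-space vector (x, y, z) pointing from the
    surface to the eye, with z > 0. Returns (u, v, lookups).
    """
    vx, vy, vz = view
    # More steps at grazing angles, fewer when looking straight down.
    steps = round(MAX_STEPS + (MIN_STEPS - MAX_STEPS) * vz)
    layer = 1.0 / steps
    # Texture-coordinate offset per layer, moving away from the eye.
    du = -vx / vz * height_scale * layer
    dv = -vy / vz * height_scale * layer

    ray_depth = 0.0
    lookups = 0
    for _ in range(steps):
        lookups += 1  # one bump-map fetch per loop iteration
        if sample_depth(u, v) <= ray_depth:
            break     # the ray is now below the surface: dynamic branch out
        u += du
        v += dv
        ray_depth += layer
    return u, v, lookups


# Looking straight down versus at a grazing angle.
print(steep_parallax(0.3, 0.7, (0.0, 0.0, 1.0))[2])
print(steep_parallax(0.3, 0.7, (0.8, 0.0, 0.6))[2])
```

Each loop iteration costs one texture lookup and one data-dependent branch, and almost no arithmetic. A self-shadowing pass would presumably repeat a similar march toward the light source, which would be consistent with the doubled lookup count in the High mode.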
The AMD card again improves its results with the new drivers! These were software problems after all, as we expected. It had seemed strange that AMD solutions performed well in the Direct3D 9 parallax mapping tests but lost too much in the updated D3D10 test without supersampling. Besides, self-shadowing used to cause a big performance drop in AMD products. The situation is much better now. The RADEON HD 4870 X2 is outperformed by the new top card from NVIDIA only in the High mode, and the AMD card is faster in the Low mode. Let's see what supersampling will change. In the previous test, the performance drop from supersampling was bigger in NVIDIA cards.
Together, supersampling and self-shadowing increase the load on graphics cards almost eightfold, causing a great performance drop. Now the balance of power is different. Supersampling has a similar effect here: AMD cards improve their results relative to NVIDIA solutions. The new HD 4870 X2 is simply faster than the other cards in both modes. That looks closer to the real situation than the results obtained with previous driver versions.