DX10: PS 4.0 Tests (computing)
The next two pixel shader tests use a minimum of texture lookups to reduce the effect of TMU performance. They consist mostly of arithmetic operations, so they measure the arithmetic performance of GPUs, that is, how fast they execute arithmetic instructions in pixel shaders.
The first computing test is called Mineral. It's a complex procedural texturing test, which uses only two texture lookups and 65 sin and cos instructions.
We always note in the analysis of our synthetic test results that modern AMD solutions perform better in complex arithmetic tasks than the competing products from NVIDIA. But that much... OK, it's a dual-GPU solution, and it cannot be compared to a single-GPU card directly. But still, RADEON HD 4870 X2 demonstrates brilliant results in the Mineral test. The top card from AMD based on two RV770 chips is more than twice as fast as the prev-gen card based on two RV670 chips, which is close to the difference in the number and frequency of streaming processors. The new card is also 2.5 times as fast as its direct competitor (GeForce GTX 280) and GeForce 9800 GX2.
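As a rough sanity check on the remark about "the number and frequency of streaming processors", the theoretical ratio can be estimated in a few lines. This is only a sketch: the unit counts and clocks below (800 stream processors at 750 MHz per RV770 on HD 4870 X2, 320 at 825 MHz per RV670 on HD 3870 X2) come from the publicly listed specifications, not from this article, and the function name is purely illustrative.

```python
def peak_alu_rate(stream_processors: int, clock_mhz: float, gpus: int = 1) -> float:
    """Relative peak ALU rate in arbitrary units: units x clock x GPU count."""
    return stream_processors * clock_mhz * gpus

# Assumed figures from public spec sheets (not stated in the article).
hd4870x2 = peak_alu_rate(800, 750, gpus=2)
hd3870x2 = peak_alu_rate(320, 825, gpus=2)

ratio = hd4870x2 / hd3870x2
print(f"{ratio:.2f}")  # 2.27
```

A theoretical ratio of about 2.27 agrees with a measured gap of "more than twice", assuming the test really is ALU-bound.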
The second shader test, Fire, is even harder on the ALUs. It contains only a single texture lookup, while the number of sin/cos instructions is doubled to 130. Let's see what changes as the load grows:
Rendering speed in this test is limited solely by shader performance as well. This test favors AMD architectures, which is quite noticeable after the recent bugfix in AMD drivers. Solutions from this company enjoy an overwhelming advantage again. RADEON HD 4870 is 70% faster than GeForce GTX 280, but the dual-GPU HD 4870 X2 just "rips it to pieces", being three times as fast! It's a very high result: the dual-RV770 card demonstrates brilliant computing capacity. By the way, the dual-GPU card was less than twice as fast as the single-GPU card in both arithmetic tests.
DX10: Geometry Shader Tests
RightMark3D 2.0 includes two geometry shader tests. The first one, Galaxy, is similar to point sprites from previous Direct3D versions. It animates a system of particles on the GPU: a geometry shader creates four vertices from each point, forming a particle. Similar algorithms should be used in future DirectX 10 games.
Changing the balance in geometry tests does not affect the rendered result: the image is always identical, and only the scene processing methods differ. The GS load value determines which shader does the work -- vertex or geometry. The total amount of work is always the same.
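The expansion of one point into a four-vertex particle can be sketched on the CPU as follows. This is not the test's actual shader code, which the article does not show. The function name, the default `right`/`up` camera axes, and the triangle-strip vertex order are all assumptions made for illustration.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

def expand_point_to_quad(center: Vec3, size: float,
                         right: Vec3 = (1.0, 0.0, 0.0),
                         up: Vec3 = (0.0, 1.0, 0.0)) -> List[Vec3]:
    """Return the four corners of a quad around `center`, in triangle-strip order.

    `right` and `up` stand in for the camera axes a real geometry shader
    would use to keep the particle facing the viewer (assumed here).
    """
    h = size / 2.0
    # Corner offsets along (right, up): bottom-left, top-left, bottom-right, top-right.
    offsets = [(-h, -h), (-h, h), (h, -h), (h, h)]
    return [
        tuple(c + dx * r + dy * u for c, r, u in zip(center, right, up))
        for dx, dy in offsets
    ]

quad = expand_point_to_quad((0.0, 0.0, 0.0), 2.0)
print(len(quad))  # 4
```

Doing this per point in a geometry shader means the application submits one vertex per particle instead of four.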
Let's analyze the first modification of Galaxy with vertex processing for three levels of geometric complexity:
Performance ratios are roughly the same at different levels of geometric complexity. Performance scales with the number of points: FPS is halved at each step. This is not a hard task for modern graphics cards. Apparently, performance in this test is not limited by streaming processors alone; memory bandwidth and fill rate are limiting factors as well.
This test demonstrates interesting results as well. Results of several solutions are very close to each other: HD 3870 X2, HD 4870, and GTX 280. AFR lets HD 4870 X2 shoot forward, almost doubling results in each mode. We'll see what happens when some work is moved to a geometry shader. The situation may become even more interesting.
No, the difference between these tests is small, and not much has changed. HD 3870 X2, the AMD card based on the old RV670 chips, has gone up slightly, but HD 4870 X2 is still the leader. Dual-GPU cards easily demonstrate high results in this test, as the multi-GPU rendering algorithm AFR effectively doubles FPS. By the way, graphics cards from NVIDIA demonstrate identical results with various GS load values, which are responsible for moving some of the load to the geometry shader. Let's see what will change in the next test, which generates a heavier load on geometry shaders...
Hyperlight is the second geometry test that uses several techniques: instancing, stream output, buffer load. It employs dynamic generation of geometry by rendering into two buffers, as well as a new Direct3D 10 feature -- stream output. The first shader generates ray directions, their speed and growth vectors. These data are stored in a buffer, which is used by the second shader for rendering. Each ray point is used to generate 14 vertices in a circle, up to a million output points.
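The step that turns each ray point into 14 vertices in a circle can be illustrated with a small sketch. The article does not describe how the circle is oriented or sized, so the choice of the XY plane, the `radius` parameter, and the function name are assumptions. A real implementation would presumably orient the ring perpendicular to the ray direction.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

def ring_vertices(center: Vec3, radius: float, count: int = 14) -> List[Vec3]:
    """Place `count` vertices evenly on a circle in the XY plane around `center`."""
    cx, cy, cz = center
    step = 2.0 * math.pi / count  # angle between neighbouring vertices
    return [
        (cx + radius * math.cos(i * step),
         cy + radius * math.sin(i * step),
         cz)
        for i in range(count)
    ]

ring = ring_vertices((0.0, 0.0, 0.0), 1.0)
print(len(ring))  # 14
```

With 14 output vertices per input point, the geometry stage amplifies its input considerably, which is why this test loads geometry shaders more heavily than Galaxy.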
The new type of shader program is used to generate rays. If "GS load" is set to "Heavy", it's also used for rendering. That is, in Balanced mode geometry shaders are used only to generate and grow rays, while output is handled by instancing. In Heavy mode, the geometry shader handles output as well. Let's analyze the easy mode first:
Relative results in various modes correspond to the load: performance scales well in all cases. It's close to the theoretical expectation that each next level of Polygon count must be twice as slow. This time, dual-GPU rendering is not as effective as in the previous case. So GeForce GTX 280 gets close to RADEON HD 4870 X2 (especially in heavy modes), even though the latter is still our leader in all modes.
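The theoretical expectation mentioned here, with FPS halving at each Polygon count level, gives a simple way to check how well a card scales. The sketch below uses hypothetical FPS values, not measurements from this review, and the helper names are illustrative.

```python
def ideal_fps(base_fps: float, level: int) -> float:
    """FPS expected at `level` if every level doubles the work (level 0 = base)."""
    return base_fps / (2 ** level)

def scaling_efficiency(measured_fps: float, base_fps: float, level: int) -> float:
    """Ratio of measured to ideal FPS: 1.0 means exact halving per level."""
    return measured_fps / ideal_fps(base_fps, level)

# Hypothetical numbers for illustration only.
base = 120.0
print(ideal_fps(base, 2))  # 30.0
efficiency = scaling_efficiency(33.0, base, 2)
```

A value above 1.0 means the card slows down less than the doubling of work would suggest, which usually points to some other bottleneck at the lower levels.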
Judging by the rather modest results of HD 3870 X2, the new cards seem to benefit from improved texturing features. However, results must change on the next diagram for the test that actively uses geometry shaders. It will also be interesting to compare test results obtained in Balanced and Heavy modes.
Strange as it may seem, the overall picture has barely changed. All architectures, except for the old G9x, improve their results, which is no surprise, as both RV770 and GT200 feature some optimizations to improve execution of geometry shaders. Now RADEON HD 4870 catches up with GeForce GTX 280, and the dual-GPU card is much faster. The previous generation of AMD chips performs much worse in this test: just have a look at the dual-GPU card.
As for the comparison of results obtained in different modes, everything is as usual. The graphics cards from AMD improve their results as they switch from instancing to a geometry shader, while old NVIDIA cards get slower. Keep in mind that the image does not differ visually between these modes.