Direct3D 10: Geometry Shaders
RightMark3D 2.0 includes two geometry shader tests. The first one, Galaxy, is similar to the point sprites of previous Direct3D versions. It animates a particle system on the GPU: a geometry shader creates four vertices from each point, forming a particle. Similar algorithms are likely to be used in future DirectX 10 games.
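To illustrate the idea of turning one point into a four-vertex particle, here is a minimal CPU-side sketch in Python. It is not the RightMark3D shader code. The function name `expand_point_to_quad`, the `half_size` parameter, and the fixed `right`/`up` axes are assumptions made for this illustration (a real geometry shader would typically take these axes from the camera).

```python
def expand_point_to_quad(center, half_size,
                         right=(1.0, 0.0, 0.0), up=(0.0, 1.0, 0.0)):
    """Return four corner vertices for one particle, in triangle-strip order.

    `right` and `up` are assumed billboard axes (hypothetical defaults).
    """
    cx, cy, cz = center
    corners = []
    # Corner order: bottom-left, top-left, bottom-right, top-right.
    for sx, sy in ((-1, -1), (-1, 1), (1, -1), (1, 1)):
        corners.append((
            cx + half_size * (sx * right[0] + sy * up[0]),
            cy + half_size * (sx * right[1] + sy * up[1]),
            cz + half_size * (sx * right[2] + sy * up[2]),
        ))
    return corners


# One input point becomes four output vertices.
quad = expand_point_to_quad((0.0, 0.0, 5.0), 0.5)
```

The point of doing this in a geometry shader is that only one vertex per particle has to be stored and animated; the other three are produced on the fly.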
Changing the balance in the geometry tests does not affect the rendering results: the image is always identical, and only the scene processing method differs. The GS load value determines which shader does the work, vertex or geometry. The total amount of work is always the same.
Let's analyze the first modification of Galaxy with vertex processing for three levels of geometric complexity:
Performance ratios between the contenders are roughly the same at every level of geometry complexity. Performance tracks the number of points: FPS is halved at each step. This is not a hard task for modern graphics cards. Performance in this test is not limited by streaming processors, and the bottleneck is something other than memory bandwidth.
It's difficult to draw any conclusions when all solutions demonstrate such close results: the difference between the best and the worst solutions is less than 1.5-fold. Our previous tests show that only dual-GPU graphics cards gain here, thanks to AFR, and demonstrate almost twice as high results. Perhaps the situation will change when some of the work is moved to the geometry shader.
Alas, the test results barely change as the load grows. None of our graphics cards responds noticeably to changes in the GS load value, which moves part of the load to the geometry shader, and they all demonstrate similar results. Let's see what changes in the next test, which generates a heavier load on geometry shaders.
Hyperlight is the second geometry test. It uses several techniques: instancing, stream output, buffer load. It generates geometry dynamically by rendering into two buffers and uses a new Direct3D 10 feature, stream output. The first shader generates ray directions, speeds, and growth vectors. This data is stored in a buffer, which the second shader uses for rendering. Each ray point is used to generate 14 vertices in a circle, up to a million output points.
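The step of generating 14 vertices in a circle around each ray point can be sketched on the CPU as follows. This is only an illustration in Python, not the test's actual shader. The function name `ring_vertices`, the `radius` parameter, and the choice of the XY plane for the circle are assumptions (the real test may well orient the circle relative to the ray direction).

```python
import math


def ring_vertices(center, radius, count=14):
    """Return `count` vertices evenly spaced on a circle around `center`.

    The circle lies in the XY plane here, which is an assumption
    made for this sketch.
    """
    cx, cy, cz = center
    step = 2.0 * math.pi / count
    return [
        (cx + radius * math.cos(i * step),
         cy + radius * math.sin(i * step),
         cz)
        for i in range(count)
    ]


# One ray point expands into 14 output vertices.
ring = ring_vertices((1.0, 2.0, 3.0), 0.25)
```

With this kind of amplification, the geometry written by the first pass stays compact, and the vertex count only grows at the rendering stage.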
Geometry shaders, the new type of shader program, are used to generate the rays. If "GS load" is set to "Heavy", they are also used for rendering. That is, in Balanced mode geometry shaders only generate and grow the rays, and output is handled by instancing; in Heavy mode the geometry shader performs the output as well. Let's analyze the easy mode first:
Again, relative results in the various modes correspond to the load: performance scales well in all cases. It is close to the theoretical expectation that each next Polygon count level should be twice as slow.
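The "twice as slow per level" expectation is easy to check against any set of results. The Python sketch below computes the slowdown between consecutive complexity levels; a value near 2.0 means the card scales as the theory predicts. The function name `slowdown_per_level` is hypothetical, and the FPS values in the example are made-up placeholders, not measurements from this review.

```python
def slowdown_per_level(fps_by_level):
    """Return the FPS ratio between each pair of consecutive levels.

    `fps_by_level` lists FPS from the lowest to the highest complexity.
    A ratio of 2.0 means FPS was exactly halved at that step.
    """
    return [
        lower / higher
        for lower, higher in zip(fps_by_level, fps_by_level[1:])
    ]


# Placeholder numbers for illustration only (NOT measured results).
example_fps = [120.0, 60.0, 30.0]
ratios = slowdown_per_level(example_fps)
```

Ratios noticeably below 2.0 would suggest that some fixed cost, not the polygon count, dominates at the lower levels.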
In other respects, however, performance is apparently limited by video memory bandwidth rather than by the GPUs themselves. For example, the HD 4870 is significantly faster this time. The RADEON HD 5870 is again almost twice as fast as the HD 5770, which can also be explained by its higher memory bandwidth. This time the HD 5770 competes with the GTX 260, while the GTS 250 is faster than both of them, to say nothing of the HD 5750. That happens because performance is limited by memory bandwidth and fill rate.
The results should change on the next diagram, for the test that actively uses geometry shaders. It will also be interesting to compare the results obtained in Balanced and Heavy modes.
This time the figures are higher. That is true of most graphics cards, but not of the GeForce GTS 250, which is the only card to slow down in the test that actively uses geometry shaders. As a result, it drops to last place and cannot compete with the HD 5750.
The GTX 260 improves its position and significantly outperforms its competitor, the new RADEON HD 5770. Still, this is only one synthetic test. Besides, geometry shaders are unfortunately not widely used in games.