Direct3D 10: geometry shaders
RightMark3D 2.0 includes two geometry shader tests. The first one, Galaxy, uses a technique similar to point sprites from previous Direct3D versions. It animates a particle system on the GPU: a geometry shader creates four vertices from each point, forming a particle. Similar algorithms should be used in future DirectX 10 games.
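To make the point-to-particle expansion more concrete, here is a minimal CPU-side sketch in Python of what such a geometry shader does for each input point. It is not the RightMark3D shader: the function name, the `right` and `up` billboard axes and the triangle-strip vertex order are our assumptions for illustration only.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def expand_point_to_quad(
    center: Sequence[float],
    half_size: float,
    right: Sequence[float] = (1, 0, 0),  # assumed camera "right" axis
    up: Sequence[float] = (0, 1, 0),     # assumed camera "up" axis
) -> List[Vec3]:
    """Emit the 4 corners of a camera-facing quad around one point.

    Corners are listed in triangle-strip order, so the 4 vertices
    form 2 triangles, much like a geometry shader would output them.
    """
    corner_signs = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    quad = []
    for sx, sy in corner_signs:
        # Offset the center along the right/up axes by the particle half-size.
        vertex = tuple(
            c + half_size * (sx * r + sy * u)
            for c, r, u in zip(center, right, up)
        )
        quad.append(vertex)
    return quad


# Every input point becomes 4 output vertices.
particle = expand_point_to_quad((0, 0, 0), 1.0)
```

On a GPU the same expansion runs once per point inside the geometry shader, so the vertex buffer only needs to store one position per particle.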
Changing the balance in the geometry tests does not affect the rendered result: the image is always identical, and only the scene processing method differs. The "GS load" setting determines which shader does the work, vertex or geometry. The total amount of work is always the same.
Let's analyze the first variant of Galaxy, with the calculations done in the vertex shader, at three levels of geometric complexity:
The relative standings of the cards are almost the same at all scene complexity levels. Performance corresponds to the number of points: FPS is halved with each step. It's not a hard task for modern graphics cards. Apparently, performance in this test is not limited that much by streaming processors. The task is also limited by memory bandwidth and fill rate, although to a lesser degree.
This is where the new GPU shows its real power. In all modes GeForce GTX 480 performs on par with the competing dual-GPU product, outperforming both Radeon HD 5870 and GeForce GTX 295 by about 1.5 times. An excellent result. As we expected, GF100 handles geometry shaders very well, about 2.5 times faster than GT200. Let's see what changes if we move some of the math to the geometry shader.
Well, the numbers remained almost the same. None of the cards were bothered much by the change in GS load. Now let's see what happens in the next test, which puts a heavy load on geometry shaders.
Hyperlight is the second geometry test. It uses several techniques: instancing, stream output, buffer load. It employs dynamic generation of geometry by rendering into two buffers, as well as a new Direct3D 10 feature, stream output. The first shader generates ray directions, their speed and growth vectors. This data is stored in a buffer, which the second shader uses for rendering. Each ray point is used to generate 14 vertices in a circle, up to a million output points.
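The per-point amplification described above can be sketched on the CPU as follows. This is only an illustration in Python, not the Hyperlight shader: we assume that the 14 vertices are evenly spaced and that the circle lies in the XY plane around each ray point, and the names `ring_vertices` and `expand_ray` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]

VERTICES_PER_POINT = 14  # the figure given for Hyperlight above


def ring_vertices(
    center: Sequence[float],
    radius: float,
    count: int = VERTICES_PER_POINT,
) -> List[Vec3]:
    """Return `count` vertices evenly spaced on a circle around `center`.

    Assumption: the circle lies in the XY plane; the real test may
    orient it along the ray direction instead.
    """
    cx, cy, cz = center
    step = 2.0 * math.pi / count
    return [
        (cx + radius * math.cos(i * step), cy + radius * math.sin(i * step), cz)
        for i in range(count)
    ]


def expand_ray(points: Sequence[Sequence[float]], radius: float) -> List[Vec3]:
    """Amplify every ray point into a ring of vertices."""
    output: List[Vec3] = []
    for point in points:
        output.extend(ring_vertices(point, radius))
    return output


# Two ray points turn into 2 * 14 = 28 output vertices.
ray = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
vertices = expand_ray(ray, radius=0.5)
```

The 14-fold amplification is what makes the output stage so heavy in this test, whichever shader type ends up producing the vertices.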
The new shader type is used to generate rays. If "GS load" is set to "Heavy", it is also used for rendering. That is, in Balanced mode geometry shaders are used only to generate and grow rays, and the output is left to instancing. In Heavy mode the geometry shader outputs the data as well. Let's analyze the easy mode first:
Both dual-GPU cards performed like regular single-GPU cards. Perhaps this test isn't very compatible with the Alternate Frame Rendering (AFR) mode. The other relative results correspond to the load. Performance scales pretty well in all cases and stays close to the theoretical values, according to which each new polygon count level should be less than 2 times slower.
In the heavy-load mode of this test, GeForce GTX 480 only slightly outperforms Radeon HD 5870, but under the light load the difference is more noticeable. Compared to GeForce GTX 285, GeForce GTX 480 is about 2 times faster.
The results may change on the next diagram, for the mode that uses geometry shaders more actively. It will also be interesting to compare the results obtained in the Balanced and Heavy modes.
Once again GF100 shows surprising results in geometry processing and geometry shader performance. This is what all those changes to the graphics pipeline were introduced for. Actually, geometry shader performance was also nicely boosted in both GT200 and RV870. But GF100 just tears the rivals to shreds, figuratively speaking.
In this test GeForce GTX 480 is about 2 times faster than Radeon HD 5870 and up to 2.75 times faster than GeForce GTX 285. NVIDIA really did its best to improve the new architecture in terms of geometry processing. We wonder what will happen in the tessellation tests, which should demonstrate an even bigger difference.
Direct3D 10: vertex shaders texture fetch
The vertex texture fetch tests measure the speed of a large number of texture fetches from a vertex shader. The two tests are essentially similar, so the relative results of the cards in Earth and Waves should be similar as well. Both tests use displacement mapping based on texture fetches. The only major difference is that the Waves test uses conditional branches, while the Earth test does not.
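For readers unfamiliar with the technique, displacement mapping in a vertex shader boils down to a filtered texture lookup followed by a shift along the vertex normal. The Python sketch below illustrates that with a single bilinear fetch per vertex. It is not the Earth or Waves shader: the height map as a nested list, the clamped addressing, the simplified texel mapping and the names `bilinear_sample` and `displace_vertex` are all our assumptions.

```python
from typing import Sequence, Tuple


def bilinear_sample(tex: Sequence[Sequence[float]], u: float, v: float) -> float:
    """Bilinearly filtered lookup into a single-channel height map.

    Assumptions: u and v are in [0, 1], addressing is clamped, and
    u=0 / u=1 map exactly onto the first / last texel (a simplification
    of the Direct3D texel-center convention).
    """
    height, width = len(tex), len(tex[0])
    x = min(max(u, 0.0), 1.0) * (width - 1)
    y = min(max(v, 0.0), 1.0) * (height - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    # Blend the four neighbouring texels: first along x, then along y.
    top = tex[y0][x0] * (1 - fx) + tex[y0][x1] * fx
    bottom = tex[y1][x0] * (1 - fx) + tex[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def displace_vertex(
    position: Sequence[float],
    normal: Sequence[float],
    tex: Sequence[Sequence[float]],
    uv: Tuple[float, float],
    scale: float = 1.0,
) -> Tuple[float, ...]:
    """Move a vertex along its normal by the sampled height."""
    h = bilinear_sample(tex, uv[0], uv[1])
    return tuple(p + n * h * scale for p, n in zip(position, normal))


height_map = [[0.0, 1.0],
              [2.0, 3.0]]
moved = displace_vertex((0, 0, 0), (0, 1, 0), height_map, (0.5, 0.5), scale=2.0)
```

The tests perform many such fetches per vertex, which is why texturing performance and caching matter here even though the work happens in the vertex stage.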
Let's analyze the first test, Earth, in the Effect Detail Low mode:
According to previous examinations, the results of this test are affected by both texturing performance and memory bandwidth. However, the graphics cards perform quite similarly. GeForce GTX 480 is as fast as GeForce GTX 295, a bit better than Radeon HD 5870, but also a bit worse than Radeon HD 5970. These results are somewhat strange. Let's see what changes if we increase the number of texture fetches.
The results have become a bit worse, except for those of GeForce GTX 480. The latter performed almost the same, thanks to its more efficient TMUs and, especially, the cache.
Let's have a look at the results of the second vertex texture fetch test. The Waves test executes fewer texture lookups, but it uses conditional branches. The number of bilinear texture lookups in this case reaches 14 (Effect Detail Low) or 24 (Effect Detail High) per vertex. The complexity of geometry changes just like in the previous test.
It's interesting that the Waves results are different from those we've seen on the previous diagrams. The AMD graphics cards have slightly improved their standing. GeForce GTX 480 performs similarly to Radeon HD 5870 and GeForce GTX 295, losing a bit in the heavy-load mode. Now let's look at the second variant of this test.
No considerable changes this time either, though GeForce GTX 480 has improved its results relative to the AMD graphics cards as the test complexity has grown. It has a slight advantage over Radeon HD 5870, and it also beats GeForce GTX 295, except in the light-load test.