Direct3D 10: geometry shader tests
RightMark3D 2.0 includes two geometry shader tests. The first one, Galaxy, is similar to point sprites from previous Direct3D versions. It animates a particle system on the GPU: a geometry shader creates four vertices from each point, forming a particle. Similar algorithms should be used in future DirectX 10 games.
A change of balance in geometry tests does not affect rendering results: the image is always identical, and only the scene processing methods differ. The GS load value determines which shader does the work, vertex or geometry. The amount of work is always the same.
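To illustrate what expanding a point into a particle means, here is a minimal CPU-side sketch in Python of the work a geometry shader does per point. It is not the RightMark3D shader: the function name, the `right` and `up` camera axes, and the triangle-strip corner order are all our assumptions for illustration.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Vertex = Tuple[Vec3, Tuple[float, float]]  # (position, texture coordinates)


def expand_point_to_quad(center: Vec3, right: Vec3, up: Vec3,
                         half_size: float) -> List[Vertex]:
    """Turn one particle position into four billboard vertices.

    `right` and `up` are assumed to be unit camera axes, so the quad
    always faces the viewer. Vertices come out in triangle-strip order.
    """
    # Corner signs along the right/up axes (hypothetical ordering).
    corners = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
    quad = []
    for sx, sy in corners:
        pos = tuple(c + half_size * (sx * r + sy * u)
                    for c, r, u in zip(center, right, up))
        # Map corner signs to [0, 1] texture coordinates (v points down).
        uv = ((sx + 1.0) / 2.0, (1.0 - sy) / 2.0)
        quad.append((pos, uv))
    return quad
```

With the point at the origin, camera axes along X and Y, and a half size of 0.5, this yields a unit quad centered on the point. Each input point costs four output vertices, which is why the load grows linearly with the number of particles.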
Let's analyze the first modification of Galaxy with vertex processing for three levels of geometric complexity:
This graph looks more like what we saw in D3D9 tests than in our previous D3D10 tests. The correlation of results at different scene complexity levels is almost the same. Performance corresponds to the number of points: FPS is halved at each step. It's not a hard task for modern graphics cards. Apparently, performance in this test is not limited much by streaming processors. The task is also limited by memory bandwidth and fill rate, although to a lesser degree. And AFR effectively increases the speed of multi-GPU solutions.
GeForce GTX 285 demonstrates a tad higher result than GTX 280; the difference between them amounts to about 7%. That is, the task is restricted by the fill rate rather than by memory bandwidth or the speed of stream processors. The dual-GPU RADEONs take the lead: the cheaper one outperforms the card under review by a third, and the more expensive one by 75-80% (owing to the higher memory bandwidth of HD 4870 X2). Perhaps the situation will change when some work is moved to a geometry shader.
The differences between these test versions are insignificant. All graphics cards from NVIDIA demonstrate almost the same results with various GS load values, which are responsible for moving some of the load to the geometry shader. Only the results of AMD cards have grown a little, especially those of RADEON HD 4850 X2 with new drivers. GeForce GTX 285 is still lagging behind, and the gap is even a tad bigger. Let's see what will change in the next test, which generates a heavier load on geometry shaders.
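The reasoning behind the 7% figure can be sketched as a comparison of clock ratios. The clocks below are not taken from this section: they are the commonly published reference frequencies of the two cards, entered here as assumptions, and the function names are hypothetical.

```python
# Assumed reference clocks in MHz (not stated in this section).
GTX_280 = {"core": 602.0, "shader": 1296.0, "memory": 1107.0}
GTX_285 = {"core": 648.0, "shader": 1476.0, "memory": 1242.0}


def clock_gains(old: dict, new: dict) -> dict:
    """Relative frequency gain of `new` over `old` for each clock domain."""
    return {name: new[name] / old[name] - 1.0 for name in old}


def closest_match(measured_gain: float, gains: dict) -> str:
    """Name of the clock domain whose gain is nearest to the measured one."""
    return min(gains, key=lambda name: abs(gains[name] - measured_gain))


gains = clock_gains(GTX_280, GTX_285)
likely_limit = closest_match(0.07, gains)  # 0.07 is the ~7% gap from the text
```

Under these assumed clocks, the core frequency (which sets the fill rate) grows by less than 8%, while shader and memory frequencies grow by more than 12%. A measured gain of about 7% therefore points to the core domain rather than to stream processors or memory bandwidth.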
Hyperlight, the second geometry test, uses several techniques: instancing, stream output, buffer load. It employs dynamic generation of geometry by rendering into two buffers, as well as a new Direct3D 10 feature -- stream output. The first shader generates ray directions, their speed and growth vectors. These data are stored in a buffer, which is used by the second shader for rendering. Each ray point is used to generate 14 vertices in a circle, up to a million output points.
The new type of shader programs is used to generate rays. If "GS load" is set to "Heavy", it's also used for rendering. That is, in Balanced mode, geometry shaders are used only to generate and grow rays, and output is handled by instancing. In Heavy mode, the geometry shader outputs the data as well. Let's analyze the easy mode first:
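The step of generating 14 vertices in a circle around each ray point can be sketched as follows. This is only an illustration in Python, not the test's shader code: the function name and the `axis_u`/`axis_v` basis vectors spanning the plane of the circle are our assumptions.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def ring_vertices(center: Vec3, axis_u: Vec3, axis_v: Vec3,
                  radius: float, count: int = 14) -> List[Vec3]:
    """Place `count` vertices evenly on a circle around `center`.

    `axis_u` and `axis_v` are assumed to be perpendicular unit vectors
    lying in the plane of the circle (for a ray, the plane across it).
    """
    verts = []
    for i in range(count):
        angle = 2.0 * math.pi * i / count
        c, s = math.cos(angle), math.sin(angle)
        verts.append(tuple(p + radius * (c * u + s * v)
                           for p, u, v in zip(center, axis_u, axis_v)))
    return verts
```

Every ray point multiplies into 14 output vertices, so the amount of generated geometry grows much faster than the number of rays, which is what makes this test heavier on geometry shaders than Galaxy.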
Relative results in various modes correspond to the load: performance scales well in all cases. It's close to the theoretical expectation that each next level of Polygon count must be twice as slow. It's strange that HD 4850 X2 failed this test, just as GeForce 9800 GX2 did before. That's probably the fault of a bug in the new drivers, as HD 4870 X2 is doing fine.
However, HD 4850 X2 had no chance to outperform GeForce GTX 285 in this test, because even HD 4870 X2 demonstrated only a tad higher result. The performance difference between NVIDIA solutions still corresponds to the difference in GPU frequencies. Results may change on the next diagram for the test that actively uses geometry shaders. It will also be interesting to compare test results obtained in Balanced and Heavy modes.
But we don't see any changes again. Old architectures used to slow down significantly in this test. But GT200 and RV770 had this problem solved, and now they cope with this task efficiently. Don't forget that GeForce GTX 285 performs almost on a par with HD 4870 X2, which has two GPUs. That is, the new card from NVIDIA would definitely be faster than HD 4850 X2 in the Hyperlight test even if the latter hadn't been set back by bugs in the drivers.
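The theoretical expectation mentioned above is easy to express in code. The sketch below is ours, with hypothetical function names; it takes no numbers from the graphs, so any frame rates passed to it are the reader's own.

```python
from typing import List


def ideal_fps(base_fps: float, level: int) -> float:
    """Expected frame rate if each Polygon count level doubles the work."""
    return base_fps / (2 ** level)


def step_ratios(measured_fps: List[float]) -> List[float]:
    """Ratio of each level's frame rate to the previous level's.

    Values near 0.5 mean the card scales as theory predicts; values
    well above 0.5 suggest the easier level was limited by something
    other than geometry processing.
    """
    return [measured_fps[i + 1] / measured_fps[i]
            for i in range(len(measured_fps) - 1)]
```

For example, a card that delivers 100, 50 and 25 FPS at three consecutive levels gives step ratios of 0.5 and 0.5, which is ideal scaling.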
Direct3D 10: vertex texture fetch rate
Vertex Texture Fetch tests measure the speed of many vertex texture fetches. These tests are essentially similar, and the correlation of their results in Earth and Waves tests must also be similar. Both tests use displacement mapping based on texture fetches. The only major difference is that the Waves test uses conditional branches, while the Earth test does not.
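To show what displacement mapping based on texture fetches involves per vertex, here is a minimal Python sketch of one bilinear height lookup followed by a shift along the normal. It is an illustration only, not the code of the Earth or Waves tests: the function names, the list-of-lists texture layout, the clamp addressing and the `scale` parameter are our assumptions.

```python
from typing import List, Sequence, Tuple


def bilinear_fetch(tex: List[List[float]], u: float, v: float) -> float:
    """Sample a single-channel texture at (u, v) in [0, 1] with bilinear filtering."""
    height, width = len(tex), len(tex[0])
    # Clamp addressing: coordinates outside [0, 1] stick to the edge.
    x = min(max(u, 0.0), 1.0) * (width - 1)
    y = min(max(v, 0.0), 1.0) * (height - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    # Blend the four neighbouring texels: first along X, then along Y.
    top = tex[y0][x0] * (1.0 - fx) + tex[y0][x1] * fx
    bottom = tex[y1][x0] * (1.0 - fx) + tex[y1][x1] * fx
    return top * (1.0 - fy) + bottom * fy


def displace(position: Sequence[float], normal: Sequence[float],
             uv: Tuple[float, float], tex: List[List[float]],
             scale: float) -> Tuple[float, ...]:
    """Move a vertex along its normal by the fetched height times `scale`."""
    h = bilinear_fetch(tex, uv[0], uv[1])
    return tuple(p + n * h * scale for p, n in zip(position, normal))
```

A single bilinear lookup reads four texels, so a vertex shader that performs many such lookups per vertex puts a load on memory bandwidth and texture units that grows with both the lookup count and the geometry complexity.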
Let's analyze the first test (Earth) in Effect detail Low mode:
Judging by our previous reviews, this test is heavily affected by memory bandwidth -- the easier the mode, the stronger its effect on performance. We can see it well in the easy mode, where test results are close to the difference in memory bandwidth. Fill rate kicks in under more complex conditions, and AFR mode helps AMD cards crush their competition.
GeForce GTX 285 is even a tad slower than GTX 280. It has to do with some changes and optimizations in the drivers, as we'll see below. It defeats its competitors in the easy mode. But in complex modes the laurels go to dual-GPU cards from AMD, which demonstrate twofold performance gains versus single-GPU modifications. Let's have a look at results of this test with more texture lookups:
The situation hasn't changed much. GTX 280 and GTX 285 got closer to their dual-GPU rivals in the Low mode. But they still lead in the High mode. There is something else on this graph -- GTX 285 has a bigger advantage over GTX 280 than theoretically possible, and HD 4850 X2 with newer drivers outperforms its more expensive modification in two heavy modes. This is a clear sign of optimizations in drivers for complex modes, because the situation was different on the previous graph.
Let's take a look at results of the second vertex texture fetch test. The Waves test executes fewer texture lookups, but it uses conditional branches. The number of bilinear texture lookups in this case reaches 14 (Effect detail Low) or 24 (Effect detail High) per vertex. Geometry complexity changes just like in the previous test.
The Waves test favors AMD products, and the difference between single-GPU GeForce GTX 200 cards and dual-GPU RADEONs has grown. Performance in this test depends not on TMUs, but on memory bandwidth and fill rate, which are effectively doubled in AFR mode. GeForce GTX 285 does not outscore GTX 280 in two modes out of three, which can also be explained by optimizations for heavy tasks, one of which will be shown on our next graph.
Compared to HD 4850 X2, the new card from NVIDIA leads only in the Low mode. In the other two modes it's defeated by the dual-GPU RADEON. Let's analyze the second modification of the test:
There are not many changes here. As the test grows more complex, results of dual-GPU RADEON HD 4800 cards grow a tad better relative to NVIDIA performance, because the latter suffer from a bigger performance drop. Even though performance in the Low mode is still limited by memory bandwidth, GeForce GTX 285 is already slower than both cards from AMD, which benefit from their two cores. TMU efficiency and two cores start to play a more important role in High modes, so GeForce GTX 285 and GTX 280 are more than twice as slow as AMD cards.
Conclusions on the synthetic tests
Synthetic tests of GeForce GTX 285 and other products from both competitors show us that the new product from NVIDIA is slightly more powerful than GTX 280. It was to be expected, because the only change that can affect performance is the increased frequencies.
If we take a look at the numbers alone, GTX 285 demonstrated lower frame rates in most of our tests compared to HD 4850 X2. However, GTX 285 is the fastest single-GPU solution, and it shows worthy results even versus competing dual-GPU cards (especially in Direct3D 10 tests), which suffer from their own problems to be covered in a separate article.
Along with the increased frequencies that provide a little performance gain, NVIDIA paid special attention to power consumption, energy efficiency, and cost reduction. GeForce GTX 285 demonstrates a high performance level. And owing to the new process technology and simplified design, it consumes relatively little power. Besides, it won't be as expensive as the previous model, GeForce GTX 280, was at the beginning of its life cycle. And this is more important for users than results in synthetic tests.
The next part of our article contains tests of the new solution from NVIDIA in modern games. These results should reveal the real performance ratios between GeForce GTX 285 and its competitors from AMD. Results of our game tests may not match our synthetic conclusions, because comparing single- and dual-GPU cards is not a simple matter. We can assume that GeForce GTX 285 and HD 4850 X2 will have similar average frame rates in games, even though the difference in our synthetic tests was as large as twofold at times. Synthetic tests should be treated with caution.