Synthetic tests

Testbed:
Our synthetic benchmarks:
Tested graphics cards:
Let's explain the selection of graphics cards. ATI Radeon HD 5830 is the closest rival in terms of price, and ATI Radeon HD 5770 is a close competitor as well. NVIDIA GeForce GTX 480 is the fastest single-GPU graphics card, based on the top-class GPU of the same generation, and NVIDIA GeForce GTX 465 is a close counterpart based on GF100. Comparing with these two will expose the changes in the GF104 architecture.

Direct3D 9: pixel filling

This test determines peak texel rate in FFP mode for different numbers of textures applied to a pixel:

Yet again we can see that the test is a bit obsolete. Graphics cards, at least NVIDIA's, show results far below the theoretical maximum. We'll check those in the Vantage test later. According to the results, GTX 460 fetches up to 32 texels per clock from 32-bit textures in the bilinear filtering mode. This is far from the theoretical 56 filtered texels. As a result, GTX 460 is outperformed by every competitor except GTX 465 when many textures per pixel are involved. Theoretically, GTX 460 should have higher texture performance than HD 5770 and almost catch up with GTX 480. Well, not in this test.

The difference between GTX 460 and GTX 465 is interesting. The latter wins when few textures are used and bandwidth limitations have a larger impact. The former catches up and even wins when 4-8 textures are involved. But that's still far from the real capabilities of the new GPU. Take a look at the fillrate test.

The fillrate test demonstrates the same situation, this time with the number of pixels written to the frame buffer taken into account. ATI products still lead the way, having more TMUs and being more efficient. Even HD 5770 performs on a par with GTX 480. It's a pity that GTX 460 loses so much with 0-3 mapped textures. Performance in such modes is clearly bottlenecked by bandwidth, but GTX 460 has more of it than HD 5770, so bandwidth cannot be the reason.
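To put the texel rates quoted in the pixel filling test into numbers, here is a minimal sketch in Python. The 32 measured and 56 theoretical texels per clock come from the results above. The 675 MHz core clock is our assumption (the reference GTX 460 clock, which this section does not state), and the function names are purely illustrative.

```python
# Illustrative helpers; the names are hypothetical and not part of any benchmark tool.

def texel_efficiency(measured_per_clock: float, peak_per_clock: float) -> float:
    """Fraction of the theoretical per-clock texel rate actually reached."""
    return measured_per_clock / peak_per_clock


def texel_rate_gtexels(texels_per_clock: float, core_clock_mhz: float) -> float:
    """Convert texels per clock into gigatexels per second for a given core clock."""
    return texels_per_clock * core_clock_mhz / 1000.0


MEASURED_TEXELS_PER_CLOCK = 32   # from the test results quoted in the text
PEAK_TEXELS_PER_CLOCK = 56       # theoretical figure quoted in the text
ASSUMED_CORE_CLOCK_MHZ = 675.0   # ASSUMPTION: reference core clock, not given in this section

if __name__ == "__main__":
    eff = texel_efficiency(MEASURED_TEXELS_PER_CLOCK, PEAK_TEXELS_PER_CLOCK)
    measured = texel_rate_gtexels(MEASURED_TEXELS_PER_CLOCK, ASSUMED_CORE_CLOCK_MHZ)
    peak = texel_rate_gtexels(PEAK_TEXELS_PER_CLOCK, ASSUMED_CORE_CLOCK_MHZ)
    print(f"efficiency: {eff:.1%}")
    print(f"measured: {measured:.1f} GTexels/s, peak: {peak:.1f} GTexels/s")
```

With these inputs the card reaches roughly 57% of its theoretical per-clock texturing rate in this test, whatever the actual clock is.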
Direct3D 9: PS 1.1, 1.4, 2.0, 2.a

The first group of pixel shaders to be reviewed here is too simple for modern GPUs. It includes various versions of pixel programs of relatively low complexity: 1.1, 1.4, and 2.0. These tests are very easy for modern architectures, so they cannot demonstrate all of their capabilities. Still, they help assess the balance between texture fetches and math computing, especially when there's a new architecture to examine. So it will be interesting to compare GF104 with GF100.

In these tests performance is primarily bottlenecked by the TMUs, with their efficiency and texture caching in real applications taken into account. Let's see what effect the architectural changes have. The new GTX 460 outperforms GTX 465 in the three simple tests, but GTX 465 outruns the newcomer in the lighting tests. Either GTX 460 is bottlenecked by effective fillrate, or the execution units have different efficiency (perhaps texturing is as inefficient here as it was in the previous test). GTX 460 performs a bit worse than HD 5770, not to mention the more powerful HD 5830. Let's take a look at more complex pixel programs.

And again there's an interesting difference between GTX 460 and GTX 465. The Water test depends strongly on texturing performance: it uses dependent texture fetches with deep nesting, so the graphics cards are always ranked by texturing performance, adjusted for TMU efficiency. Here GTX 460 performs well, outrunning both GTX 465 and HD 5770 and almost catching up with GTX 480. HD 5830 is still the leader, though.

The results of the second test differ. GTX 460 loses again. The test is more computing-intensive and has always favored the ATI architecture, which has more ALUs. For some reason GTX 460 lags behind GTX 465, though theoretically its math capabilities should be a bit better. Perhaps ALU efficiency is reduced in this test because there are more ALUs per Stream Multiprocessor.
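The remark that GTX 460 should theoretically have slightly better math capabilities than GTX 465 can be checked with simple arithmetic. The sketch below is an illustration only. The SM counts, ALUs per SM and shader clocks are reference specifications assumed here, not figures taken from this section, and peak throughput is counted as one multiply-add (2 FLOPs) per ALU per clock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuSpec:
    """Hypothetical container for the few specs this estimate needs."""
    name: str
    sm_count: int            # number of Stream Multiprocessors
    alus_per_sm: int         # scalar ALUs per Stream Multiprocessor
    shader_clock_mhz: float  # ALU (shader domain) clock

    @property
    def alu_count(self) -> int:
        return self.sm_count * self.alus_per_sm

    def peak_gflops(self) -> float:
        # One multiply-add per ALU per clock counts as 2 FLOPs.
        return self.alu_count * 2 * self.shader_clock_mhz / 1000.0


# ASSUMPTION: reference specifications, not stated in this section of the article.
GTX_460 = GpuSpec("GeForce GTX 460 (GF104)", sm_count=7, alus_per_sm=48, shader_clock_mhz=1350.0)
GTX_465 = GpuSpec("GeForce GTX 465 (GF100)", sm_count=11, alus_per_sm=32, shader_clock_mhz=1215.0)

if __name__ == "__main__":
    for gpu in (GTX_460, GTX_465):
        print(f"{gpu.name}: {gpu.alu_count} ALUs, {gpu.peak_gflops():.1f} GFLOPS peak")
    ratio = GTX_460.peak_gflops() / GTX_465.peak_gflops()
    print(f"GTX 460 / GTX 465 peak ratio: {ratio:.3f}")
```

Under these assumptions the peak advantage of GTX 460 is about 6%, so a loss to GTX 465 in a math-heavy test would be consistent with lower ALU utilization rather than with lower raw throughput.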
Direct3D 9: PS 2.0

These tests of DirectX 9 pixel shaders are even more complex. They are divided into two categories. We'll start with the easier SM 2.0 shaders.
There are two modifications of these shaders: arithmetic-intensive and texture sampling-intensive. Let's analyze the arithmetic-intensive modifications first, as they are more promising from the point of view of future applications.

These tests are universal: they depend on both ALU performance and texturing speed, so balance is the key. As you can see, the Frozen Glass results are similar to those of Cook-Torrance, and the new GTX 460 still lags behind GTX 465. Both ATI solutions are faster still. The results of Parallax Mapping are also similar, but this time HD 5830 yields to GTX 480. GTX 460 loses yet again, lagging behind GTX 465 to a similar extent. Let's see what happens next.

Games are usually more diverse than synthetic tests, and texturing is not the only thing they use. Now let's examine the same tests modified so that texture fetches are preferred to computing. GTX 460 is bound to do better.

Well, it did better. It still loses to both ATI cards, though. But it now outperforms GTX 465, especially in Frozen Glass, which depends more on TMU performance. GTX 460 is starting to catch up with the much more expensive GTX 480: in these tests performance is bottlenecked by TMUs, and GF100 doesn't have enough of those.

But those were old tasks aimed at texturing, and not very complex ones at that. Let's have a look at the results of two more pixel shader tests, this time SM 3.0. These are the most complex of all our tests for Direct3D 9 pixel shaders. They load ALUs and texture units heavily. Both shader programs are complex, long, and include a lot of branches.
Compared with ATI cards, NVIDIA performs quite well. Both PS 3.0 tests are very complex. They don't depend on bandwidth or texturing; they are pure math, but with lots of jumps and branches. And it seems GF104 does a fine job in this case. GTX 460 outperforms HD 5770 and performs on a par with HD 5830. Unfortunately, it loses to GTX 465 in the Steep Parallax Mapping test. The reason is not quite clear. It's either the lack of bandwidth or the increased number of ALUs per Stream Multiprocessor.
Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.