Architecture and features
From an architectural point of view, RV790 does not differ from RV770: it has simply been reworked to reach higher clock rates. But it's not quite that simple. As for technical characteristics, the first things that catch the eye are the transistor count, which has grown by three million, and the much larger die area of RV790 versus RV770.
RV770 did not need many modifications, as it was already a very good, well-balanced chip. Given AMD's dual-GPU strategy for the High-End segment, it made no sense to increase the number of execution units without switching to the 40nm process technology, which is not ready for complex GPUs yet.
So it was decided to make the GPU work at higher frequencies. To that end, engineers overhauled the structure of RV770, changing its inner circuits and outputs. They also added a so-called decap ring along the perimeter of the GPU. It consists of decoupling (filtering) capacitors that help keep signals clean by reducing crosstalk. These changes led to a small increase in the number of transistors and enlarged the surface area of the chip.
As a result, RV790 really does work better at higher frequencies than RV770, aided by a slightly increased voltage. But AMD still failed to deliver the long-awaited 1 GHz with air cooling. The reference clock rate is higher by 100 MHz, and partners are now allowed to manufacture overclocked models operating at 900 MHz or higher. With any luck, overclockers can make them run at 1 GHz, but not all cards will reach this level without a different cooling system, a higher voltage, etc.
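To put these clock rates in perspective, the short Python sketch below computes the relative uplift over RV770. The 750 MHz baseline is the reference clock of the RV770-based RADEON HD 4870. It is not stated in this section, so treat it as an assumption, as is the helper name uplift_percent.

```python
# Reference engine clocks in MHz. The 750 MHz baseline (RADEON HD 4870, RV770)
# is an assumption taken from public specs, not from this article.
RV770_CLOCK_MHZ = 750
RV790_CLOCK_MHZ = RV770_CLOCK_MHZ + 100  # "higher by 100 MHz"


def uplift_percent(base_mhz, new_mhz):
    """Relative clock increase of new_mhz over base_mhz, in percent."""
    return (new_mhz - base_mhz) / base_mhz * 100.0


for label, clock in [("reference RV790", RV790_CLOCK_MHZ),
                     ("partner overclock", 900),
                     ("1 GHz target", 1000)]:
    gain = uplift_percent(RV770_CLOCK_MHZ, clock)
    print(f"{label}: {clock} MHz, +{gain:.1f}% vs RV770")
```

Under that assumption, the reference clock is roughly 13% higher, while the 1 GHz mark would require about a third more frequency than RV770.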
In other respects, we have nothing to add about the architecture: RV790 is little different from RV770, it's just a tad faster. Now we'll describe some peculiarities of modern solutions from AMD.
DirectX 10.1 support
We cannot say that new games are quick to adopt DirectX 10.1. A lot of time has passed since the appearance of RADEON HD 3800, but there are still few titles that support it. On the other hand, we can definitely see positive dynamics: the number of such applications is constantly increasing. Let's take a look at the most interesting ones.
As we all know, the first game to support DirectX 10.1 was STALKER: Clear Sky or, to be more exact, one of its patches. We have already analyzed the quality changes in this patch. The main improvements are high-quality antialiasing for objects with alpha transparency and a new high-quality filtering level for shadow maps. There is also a small performance gain (up to 10%) when the application switches from DX10 to DX10.1.
But Clear Sky is not a new game. Several games supporting DirectX 10.1 (developed in cooperation with AMD) have appeared recently. One of them is Tom Clancy's H.A.W.X. from Ubisoft. It uses high-quality Screen Space Ambient Occlusion (SSAO) and Gaussian Blur of shadow maps. Neither algorithm requires DX10.1, but both get significant performance gains from some of the features in this API version. For example, according to the built-in benchmark in 1920x1080, DX 10.1 brings a 20% performance gain. The image below shows frames from the benchmark with instant FPS values.
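For readers unfamiliar with the technique, here is a minimal CPU-side sketch of a Gaussian blur applied to a small grid of values standing in for a shadow map. It uses the common separable form (a horizontal pass followed by a vertical pass). This is an assumption for illustration: the article does not describe how the game implements the blur or which DX10.1 features it uses, and all names and the sample data are hypothetical.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Blur every row with the 1D kernel, clamping reads at the edges."""
    radius = len(kernel) // 2
    width = len(image[0])
    result = []
    for row in image:
        result.append([
            sum(k * row[min(max(x + i - radius, 0), width - 1)]
                for i, k in enumerate(kernel))
            for x in range(width)
        ])
    return result


def transpose(image):
    """Swap rows and columns so the row blur can be reused vertically."""
    return [list(column) for column in zip(*image)]


def separable_blur(image, radius=1, sigma=1.0):
    """Horizontal pass, then vertical pass with the same 1D kernel."""
    kernel = gaussian_kernel(radius, sigma)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


# Hypothetical 4x4 shadow mask with a hard edge: 1.0 = lit, 0.0 = in shadow.
shadow_map = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]

for row in separable_blur(shadow_map):
    print(" ".join(f"{value:.2f}" for value in row))
```

The hard lit/shadowed edge turns into a gradual transition, which is the soft-shadow look the blur is used for. On a GPU the same weighted sums are computed per pixel in a shader, so the cost is dominated by texture fetches.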
Another game supporting DX10.1 is Stormrise from SEGA. It's one of the first games to use the latest top-quality SSAO algorithm. Besides, Shader Model 4.1 accelerates Gaussian Blur for shadow maps. Together, these features accelerate the application by up to 20-25% (versus DirectX 10) with graphics cards from AMD.
We want to mention another game here: BattleForge from Electronic Arts. DX 10.1 provided the game with an accelerated high-quality SSAO algorithm as well as antialiasing of semitransparent objects, as in Clear Sky. Like HAWX, this game contains a built-in benchmark, which shows a 20-25% performance gain with RADEON HD 4800 graphics cards and DirectX 10.1.
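A percentage gain in FPS does not translate into the same percentage of rendering time saved. The small Python sketch below converts the gains quoted above (10%, 20% and 25%) into frame-time reductions. The 60 FPS baseline and the function names are hypothetical and serve only as an illustration.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def frame_time_saving_percent(fps_gain_percent):
    """Frame-time reduction (in percent) implied by a given FPS gain."""
    speedup = 1.0 + fps_gain_percent / 100.0
    return (1.0 - 1.0 / speedup) * 100.0


BASE_FPS = 60.0  # hypothetical DX10 result, not a measured value

for gain in (10, 20, 25):
    new_fps = BASE_FPS * (1.0 + gain / 100.0)
    print(f"+{gain}% FPS: {frame_time_ms(BASE_FPS):.2f} ms -> "
          f"{frame_time_ms(new_fps):.2f} ms per frame "
          f"({frame_time_saving_percent(gain):.1f}% less time)")
```

So a 25% FPS gain means each frame takes a fifth less time to render, and a 20% gain means about a sixth less.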
We must also say a few words about ATI Stream, AMD's technology for using GPUs for computing. Stream is like any other similar technology, and it can be used effectively in a lot of applications: audio and video processing, entertainment, graphics applications, special physics effects, artificial intelligence, and many others.
It must be noted that AMD prefers to develop open standards, such as OpenCL. They are not the property of any vendor, they support various GPUs and CPUs, and they are cross-platform and free. However, AMD does not offer public support for OpenCL so far. ATI Stream SDK 2.0 (the current version is 1.4) will be released closer to the middle of the year.
One of the popular applications of GPU computing is physics. As AMD manufactures both GPUs and CPUs, it has to mention that CPUs can also compute physics well, especially multi-core CPUs handling simple effects, and especially when performance is limited by the GPU.
However, the competitor uses GPUs for computing physics in grand style. According to AMD, GPUs cope well with lots of complex physics effects, offloading CPUs where possible. The company also mentions the option of using idle GPUs in CrossFire configurations and X2 solutions for this purpose.
In order to keep up with the GPU physics trend, AMD ported a part of the Havok physics engine to OpenCL. It's one of the most popular physics engines for games: over two hundred titles use it. So far, AMD has demonstrated only two GPU-assisted Havok effects: Cloth and Destruction. That's how it looks in demos, accelerated by ATI Stream.
Unfortunately, these are only two features from Havok, even if the most interesting ones. And we don't know whether we'll see games that support even these Havok effects at all. It's Intel that owns Havok now, and this company does not yet manufacture GPUs that can accelerate physics effects. As for NVIDIA, along with the above-mentioned effects this company offers GPU-assisted particles, soft body simulation, etc. Moreover, lots of these features are already supported by games, which gives NVIDIA a certain market advantage. And even if future games do support GPU-assisted computing via Havok/OpenCL, they will work with GPUs from all manufacturers, not just AMD's.
Windows 7 readiness
For lack of really interesting innovations in their new solutions, both manufacturers mention their readiness for the upcoming Microsoft Windows 7. Both AMD and NVIDIA tell us about their close cooperation with Microsoft and how well their drivers support this operating system. They show diagrams where competing graphics cards demonstrate much worse results than their own products.
We shall not write about 3D performance in Windows 7 now. This operating system has not been released yet, and its drivers are still in beta testing. But we can write about some features of the new operating system. It will be the first OS to support computing with both CPUs and GPUs. As for OS features that have to do with graphics cards, Windows 7 has an updated driver model, WDDM 1.1, that accelerates 2D and 3D rendering and offers improved video memory management and higher performance.
The Aero interface in the new system supports DirectX 10, plus useful APIs to accelerate 2D graphics: Direct2D and DirectWrite. But the most important GPU-related feature is DirectX Compute, that is, compute shaders in DirectX 11. The new version of this graphics API should introduce lots of long-awaited important changes. For example, wide support for compute shaders will help offload CPUs in some well-parallelized applications. This process is currently hampered by the lack of open APIs.
According to AMD, CATALYST for Windows 7 is almost ready, and it's even faster than the driver for Windows Vista. It will ship with the operating system from the start and support all WDDM 1.1 features, and it will be unified for Vista and Windows 7. This brilliant future awaits us soon.
That's the end of the theoretical part, as we already know everything about the RV7xx. Now we'll evaluate performance of the new RV790-based solution in comparison with other cards from AMD and competing cards from NVIDIA.