iXBT Labs - Computer Hardware in Detail






RADEON R(V)8XX Reference

Specifications, architecture, technologies, etc.

December 11, 2009


DirectX 11 support

DirectX 11 is the new version of Microsoft's graphics API. It runs on Windows 7 and on Windows Vista (after an update delivered via Windows Update). Its many new features fall into two groups: performance improvements and image quality improvements.

Performance improvements include multithreaded rendering and the new features of DirectCompute. Image quality improvements include tessellation, order-independent rendering of transparent polygons, complex postprocessing, and new shadow filtering features. Running physics computations and AI algorithms on GPUs via DirectCompute also looks very promising for games.

DirectX 11 supports in software all previous levels of hardware, starting from DirectX 10. Old DX10 GPUs can support only some DirectX 11 features, but several functions of the new API can still make life easier for game developers. However, full-fledged DirectX 11 GPUs, such as the RV870, are required to reveal the full potential of this API version. Only such GPUs support DirectCompute11 and improved multithreaded rendering. Multithreading will work on old GPUs if new drivers support it, but performance may be lower than on DX11 GPUs.

Shader Model 5.0

Shader Model 5 offers a new set of instructions with more flexible access to data and more convenience for developers. It's a unified instruction set, the same for shaders of all types: Vertex, Hull, Domain, Geometry, Pixel, and Compute. It uses an object-oriented programming model, and functions and routines in shader code make graphics applications easier to develop.

Here are some of the new instructions and features in Shader Model 5.0:

  • SV_Coverage gives pixel shaders information about sample coverage; it's used to detect polygon edges in certain antialiasing algorithms.
  • Gather fetches four samples with a single instruction; it's used in shadow filtering and ambient occlusion algorithms.
  • Instructions for converting data between 32-bit and 16-bit floating-point formats, which simplify programming in some cases.
  • Bit operations that accelerate data compression and decompression.
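To make the list above more concrete, here is a minimal CPU-side sketch in Python of what these operations compute. It is only an emulation for illustration, not shader code: the function names (gather_2x2, lit_fraction and so on) are hypothetical, and the order in which the four gathered texels are returned is chosen for readability and is not claimed to match the hardware.

```python
import struct


def gather_2x2(texture, x, y):
    """Return the four texels of the 2x2 footprint whose top-left corner is (x, y).

    Coordinates are clamped at the texture edge. The row-major order used here
    is an assumption made for readability, not the hardware component order.
    """
    height, width = len(texture), len(texture[0])
    x1 = min(x + 1, width - 1)
    y1 = min(y + 1, height - 1)
    return (texture[y][x], texture[y][x1], texture[y1][x], texture[y1][x1])


def lit_fraction(shadow_depths, receiver_depth):
    """Simple shadow filter: share of gathered depth samples that do not occlude the receiver."""
    lit = sum(1 for depth in shadow_depths if receiver_depth <= depth)
    return lit / len(shadow_depths)


def f32_to_f16_bits(value):
    """Convert a float to IEEE 754 half precision and return the raw 16 bits."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def f16_bits_to_f32(bits):
    """Expand raw half-precision bits back to a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def count_bits(value):
    """Number of set bits in the low 32 bits of value."""
    return bin(value & 0xFFFFFFFF).count("1")


def first_bit_high(value):
    """Index of the highest set bit in the low 32 bits, or -1 if none is set."""
    return (value & 0xFFFFFFFF).bit_length() - 1
```

For example, lit_fraction(gather_2x2(shadow_map, x, y), depth) filters a shadow edge from four depth samples obtained in one step, which is the kind of use the Gather item refers to.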


One of the most important features of the new graphics API is DirectCompute, which gives access to general-purpose GPU computing (ATI Stream Technology). This feature is especially important because the DirectX API is an industry standard, so it is sure to be widely used.

There are several levels of hardware support: DirectCompute10 for DirectX 10.0 GPUs, and DirectCompute10.1 and DirectCompute11 for DirectX 10.1 and DirectX 11 GPUs, respectively. DirectCompute can be used for image processing and filtering, rendering of semitransparent surfaces without preliminary sorting (Order Independent Transparency), shadow rendering, physics effects, artificial intelligence algorithms, and ray tracing.

DirectCompute11, which the RV870 supports, provides more features than DirectCompute10. Here are some of them:

  • 3D Thread Dispatch -- several 2D arrays of threads can be replaced with one 3D array.
  • The maximum number of threads per group has been increased from 768 to 1024, so it's possible to execute 33% more threads simultaneously.
  • Memory per group of threads has been increased from 16 KB to 32 KB; this memory is used to exchange data between threads.
  • Access to shared memory is improved: instead of each thread writing only into its own 256-byte region, threads can now read and write anywhere in the 32 KB area.
  • Atomic operations let threads update shared memory locations safely, which significantly facilitates porting algorithms from CPU to GPU.
  • Double precision computing, which is required by some general-purpose computing algorithms.
  • Gather4 -- fetching from video memory up to four times as fast (in certain conditions).
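The thread limits and the 3D dispatch mentioned above can be illustrated with a little arithmetic. The sketch below is Python run on the CPU, not DirectCompute code; it assumes that a global thread ID is formed as group index times group size plus the thread's index inside its group, and all function names are hypothetical.

```python
from itertools import product

MAX_THREADS_DC10 = 768   # per-group limit quoted for DirectCompute10
MAX_THREADS_DC11 = 1024  # per-group limit quoted for DirectCompute11


def thread_increase_percent(old_limit, new_limit):
    """Relative growth of the thread limit, in percent."""
    return (new_limit - old_limit) / old_limit * 100.0


def dispatch_thread_ids(group_counts, group_size):
    """Yield the global (x, y, z) ID of every thread in a 3D dispatch.

    group_counts: number of thread groups along x, y and z.
    group_size:   number of threads per group along x, y and z.
    """
    groups_x, groups_y, groups_z = group_counts
    size_x, size_y, size_z = group_size
    if size_x * size_y * size_z > MAX_THREADS_DC11:
        raise ValueError("thread group exceeds the 1024-thread limit")
    for gz, gy, gx in product(range(groups_z), range(groups_y), range(groups_x)):
        for tz, ty, tx in product(range(size_z), range(size_y), range(size_x)):
            # Global ID = group index * group size + index inside the group.
            yield (gx * size_x + tx, gy * size_y + ty, gz * size_z + tz)
```

With this indexing, a volume such as a 3D grid of fluid cells can be covered by a single dispatch, where a 2D-only dispatch would need one call per slice.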


New shader types have been added for more convenient tessellation in DirectX 11: Hull and Domain Shaders. Hardware-assisted tessellation in DX11 allows the use of a wide range of algorithms and methods: Catmull-Clark subdivision, Bezier patches and N-patches, displacement mapping, and adaptive tessellation (dynamic level of detail).

We've touched upon tessellation many times. In brief, it produces detailed models at a light GPU load. Tessellation (breaking a model into smaller triangles) is mostly used for ground and water surfaces, and sometimes for character models. It can be seen in the next STALKER game, which shows smooth, rounded surfaces (though not very good textures).
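Adaptive tessellation (dynamic level of detail) can be sketched as choosing a subdivision factor from the distance to the camera. The Python snippet below is only an illustration under stated assumptions: the near and far distances are hypothetical tuning values, the factor falls off linearly between them, the upper bound of 64 is the maximum tessellation factor in Direct3D 11, and the triangle count assumes a scheme that splits every triangle into four at each level.

```python
MAX_TESS_FACTOR = 64.0  # upper bound on tessellation factors in Direct3D 11


def tess_factor(distance, near=5.0, far=100.0):
    """Pick a tessellation factor from the camera distance (linear falloff).

    near and far are hypothetical tuning values, not taken from any game.
    """
    if distance <= near:
        return MAX_TESS_FACTOR
    if distance >= far:
        return 1.0
    t = (distance - near) / (far - near)
    return MAX_TESS_FACTOR + t * (1.0 - MAX_TESS_FACTOR)


def triangles_after_subdivision(base_triangles, levels):
    """Triangle count if each level splits every triangle into four."""
    return base_triangles * 4 ** levels
```

A coarse 100-triangle mesh stored in memory grows to 6400 triangles after three such levels, which shows why generating the detail on the GPU is cheaper than storing and transferring the dense mesh.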

Order Independent Transparency (OIT)

Order Independent Transparency means drawing semitransparent polygons without sorting them first, which makes rendering of overlapping semitransparent objects (shadow, fire, water, glass, etc.) more efficient. AMD has demonstrated this effect.

Strictly speaking, this is not a new feature. Semitransparent surfaces still have to be sorted for correct output, because blending depends on draw order. DirectCompute11 merely makes such rendering easier by sorting at the pixel level in a single pass. Atomic operations and append buffers are used for this.
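The idea can be sketched on the CPU as follows: fragments are appended to a per-pixel list in whatever order they are drawn, and each pixel's list is then sorted by depth and blended back to front. This Python sketch is a conceptual illustration only; the plain dictionary stands in for the GPU append buffer, larger depth is assumed to mean farther from the camera, and all names are hypothetical.

```python
from collections import defaultdict


def append_fragment(buffer, pixel, depth, rgb, alpha):
    """Store a fragment for a pixel; submission order does not matter."""
    buffer[pixel].append((depth, rgb, alpha))


def resolve_pixel(fragments, background):
    """Sort one pixel's fragments far-to-near and alpha-blend them over the background."""
    color = background
    for _depth, rgb, alpha in sorted(fragments, key=lambda f: f[0], reverse=True):
        color = tuple(alpha * src + (1.0 - alpha) * dst for src, dst in zip(rgb, color))
    return color


def resolve_image(buffer, background):
    """Resolve every pixel that received at least one fragment."""
    return {pixel: resolve_pixel(frags, background) for pixel, frags in buffer.items()}


def new_fragment_buffer():
    """Per-pixel fragment lists; a stand-in for the GPU append buffer."""
    return defaultdict(list)
```

Because the sort happens per pixel after all geometry has been submitted, the application no longer has to sort whole objects before drawing them.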


DirectCompute can accelerate image postprocessing and allow more complex effects. There are many kinds of post filters: depth of field, motion blur, edge detection, antialiasing, sharpening, etc.

Postprocessing requires data about neighboring pixels. DirectCompute makes complex post filters much easier to use, increasing their performance and improving image quality. For example, constant-time filter spreading imitates the optical depth of field effect; this new technique was developed by AMD together with the University of California, Berkeley. It does not require an alpha buffer, and the code makes use of shared memory. As a result, there are fewer artifacts (like halos and sharp silhouettes) and processing is faster than with the usual pixel shader methods.
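Why shared memory helps filters that read neighboring pixels can be shown with a simple one-dimensional box blur. This is not the depth of field technique mentioned above; it is a hypothetical Python emulation in which each "thread group" first copies its tile plus a border of radius pixels into a local list (standing in for shared memory) and then filters from that copy only.

```python
def blur_naive(pixels, radius):
    """Box blur where every output pixel reads its whole window from global memory."""
    n = len(pixels)
    out = []
    for i in range(n):
        window = [pixels[min(max(j, 0), n - 1)] for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def blur_tiled(pixels, radius, tile_size=16):
    """Same blur, but each tile is loaded once into a local 'shared' list."""
    n = len(pixels)
    out = [0.0] * n
    for start in range(0, n, tile_size):
        end = min(start + tile_size, n)
        # Cooperative load: the tile plus a border of `radius` pixels on each side.
        shared = [pixels[min(max(j, 0), n - 1)] for j in range(start - radius, end + radius)]
        for i in range(start, end):
            first = i - start  # index in `shared` of the window's leftmost pixel
            window = shared[first:first + 2 * radius + 1]
            out[i] = sum(window) / len(window)
    return out


def global_reads_naive(n, radius):
    """Global memory reads of the naive version."""
    return n * (2 * radius + 1)


def global_reads_tiled(n, radius, tile_size):
    """Global memory reads of the tiled version: each tile loads itself plus two borders."""
    tiles = -(-n // tile_size)  # ceiling division
    return n + tiles * 2 * radius
```

Both functions return the same image, but the tiled version touches global memory far less often, and the saving grows with the filter radius.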

Postprocessing with DirectCompute can also improve shadow rendering algorithms, including ambient occlusion (AMD calls it HDAO -- High Definition Ambient Occlusion). We have already described this algorithm: it's a global lighting (shadowing) model used in 3D graphics to make images look more realistic by calculating light intensity on the surface.

DirectCompute11 also helps render more realistic shadows, whose edges become blurrier as the shadow falls farther from the object casting it (that is, half-shadows look more realistic). AMD illustrates this with screenshots from the upcoming STALKER: Call of Pripyat.
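One common way to estimate how wide the blurred edge should be is the similar-triangles estimate used in percentage-closer soft shadows. The article does not say which formula AMD or the game uses, so the Python sketch below is an assumption made purely for illustration: depths are measured from the light, and light_size is the width of an area light.

```python
def penumbra_width(receiver_depth, blocker_depth, light_size):
    """Estimate the half-shadow width from similar triangles.

    receiver_depth: distance from the light to the surface receiving the shadow.
    blocker_depth:  distance from the light to the object casting the shadow.
    light_size:     width of the (assumed) area light.
    """
    if blocker_depth <= 0.0:
        raise ValueError("blocker_depth must be positive")
    if receiver_depth <= blocker_depth:
        return 0.0  # the receiver is not behind the blocker, so no penumbra
    return (receiver_depth - blocker_depth) * light_size / blocker_depth
```

The width is zero where the shadow touches its caster and grows with the gap between caster and receiver, which matches the behavior described above.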

Improvements and additional formats of texture compression

DirectX 11 can compress 16-bit HDR textures at a ratio of up to 6:1. This will come in handy in modern games, which often use such formats. Besides, this version of the graphics API offers improved texture compression quality (as shown by a better signal-to-noise ratio, SNR) and reduced pixelization artifacts in textures.
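The 6:1 figure can be checked with simple arithmetic. The sketch below assumes a three-channel texture with 16 bits per channel that is compressed in 4x4 pixel blocks of 16 bytes each, as in the BC6H block format; these parameters are assumptions for the calculation, since the article itself states only the ratio.

```python
def compression_ratio(channels, bits_per_channel, block_width, block_height, block_bytes):
    """Ratio of uncompressed to compressed size for one block of pixels."""
    uncompressed_bits = channels * bits_per_channel * block_width * block_height
    return uncompressed_bits / (block_bytes * 8)


def texture_bytes_uncompressed(width, height, channels, bits_per_channel):
    """Size of an uncompressed texture in bytes."""
    return width * height * channels * bits_per_channel // 8


def texture_bytes_compressed(width, height, block_width, block_height, block_bytes):
    """Size of a block-compressed texture in bytes (dimensions rounded up to whole blocks)."""
    blocks_x = -(-width // block_width)
    blocks_y = -(-height // block_height)
    return blocks_x * blocks_y * block_bytes
```

Under these assumptions a 1024x1024 HDR texture shrinks from 6 MB to 1 MB.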

Multithreaded rendering

This is one of the long-awaited improvements to the DirectX API; game consoles have offered it for a long time. Now the application, the DirectX runtime, and the driver can each run in a separate thread, and tasks such as loading textures or compiling shaders can also be started in parallel threads.

This innovation will help eliminate CPU bottlenecks when there are many draw calls: some of them can finally be offloaded to another thread, which can run on a different CPU core than the main rendering thread. Don't confuse this with multithreaded game code!
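The pattern can be sketched as follows: worker threads record lists of draw commands for parts of the scene, and the main rendering thread then submits those lists in a fixed order. This Python sketch shows only the structure and does not use the DirectX API; the function names are hypothetical, and in CPython the threads would not actually speed up pure-Python work.

```python
from concurrent.futures import ThreadPoolExecutor


def record_command_list(object_ids):
    """Worker thread: record draw calls for a slice of the scene without executing them."""
    return [("draw", object_id) for object_id in object_ids]


def render_frame(scene_objects, workers=4):
    """Record command lists in parallel, then submit them in order from the main thread."""
    chunk = max(1, -(-len(scene_objects) // workers))  # ceiling division
    chunks = [scene_objects[i:i + chunk] for i in range(0, len(scene_objects), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in submission order, so the draw order is preserved.
        command_lists = list(pool.map(record_command_list, chunks))
    submitted = []
    for command_list in command_lists:
        submitted.extend(command_list)  # stand-in for executing the list on the GPU
    return submitted
```

The expensive part, building the draw calls, is spread over several cores, while the order in which the GPU receives them stays the same as in single-threaded rendering.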


Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.