iXBT Labs - Computer Hardware in Detail

Problems of Testing 3D Performance with FRAPS, Part 2

More analysis of 3D benchmarking problems.

May 27, 2008



Conclusions

Let's sum up another round of FRAPS research. Running just four tests on each computer without processing the results is too crude: the measurement error may reach 10-15%, which is unacceptable. Even though the frame rate in many games is relatively stable and the graphs show similar peaks, the difference in average FPS values is sometimes too large. It makes no sense to compare graphics card performance with such tests, as the performance difference between some cards may itself be less than 10%. Even if you run 4-10 tests, follow the test procedure carefully, and process the results by discarding anomalous readings, you cannot be certain of the resulting figures.

To say nothing of the huge difference between the highest and the lowest FPS results. Games generate slightly different situations each time; you cannot get identical 3D scenes. For example, the frame rate in many racing games is affected by many factors: changing weather, the time of day, traffic density (if there is any), the varying behavior of computer opponents, and the player's skill. Each run presents the tester with a different situation, so the frame rate varies accordingly. The only solution here is to run multiple tests and discard anomalous results.

As for modern first- and third-person shooters, the situation is even worse. It is even harder to reproduce the original gaming situation there: each attempt to play through a level brings something new. Enemies in modern games have some AI and use complex behavior scripts, so they act differently each time. If testers use the above-mentioned method of loading a saved game and playing through part of a level to measure performance, it's safe to say the run cannot be repeated identically several times if the test is at all long.

Another option is to turn around on the spot without going anywhere, or simply to read the instantaneous FPS value right after loading a saved game. Such a test makes some sense, though it is much worse than proper tests in games with convenient built-in benchmarks. Still, it's better than nothing.

So here are the main conclusions, which agree with the bottom line of the previous article. It is possible to benchmark graphics card performance in 3D games with FRAPS, but with the following reservations:

  • This method works with games that don't have built-in benchmarking tools but do allow recording and playing back demos. In this case the measurement error may be close to that of games with built-in benchmarks.
  • FRAPS can also be used with games that can play back identical animations on the game engine. But these tests won't reflect gaming performance; they merely show the performance of rendering scripted scenes. That said, the method works well for interactive games that consist mostly of such scenes.
  • If a game has no such features, it must be analyzed thoroughly to determine whether it can be used for benchmarking at all. Play through a level a couple of dozen times to find the maximum measurement error; if it falls within 3%, the game can be used for benchmarking with FRAPS.
  • But even in all the above cases, FRAPS results should be treated with caution. To obtain reliable results, a test should be repeated at least 4-5 times, and the results processed afterwards: analyze each pass thoroughly, find and discard anomalous values, and average the rest. Only then can you count on reliable figures.
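The repeat-discard-average procedure described above can be sketched in code. This is a minimal illustration, not a FRAPS feature: the 10% outlier threshold and the sample FPS values are assumptions chosen for the example.

```python
def summarize_runs(fps_runs, outlier_threshold=0.10):
    """Average FPS over repeated runs, discarding anomalous readings.

    Returns (mean of kept runs, discarded runs, relative spread of kept runs).
    The 10% deviation-from-median threshold is an illustrative assumption.
    """
    if len(fps_runs) < 4:
        raise ValueError("repeat the test at least 4-5 times")
    median = sorted(fps_runs)[len(fps_runs) // 2]
    # Discard runs deviating from the median by more than the threshold.
    kept = [f for f in fps_runs if abs(f - median) / median <= outlier_threshold]
    discarded = [f for f in fps_runs if abs(f - median) / median > outlier_threshold]
    mean = sum(kept) / len(kept)
    # Relative spread of the kept runs; above ~3% the game is a poor
    # candidate for FRAPS benchmarking, per the criterion above.
    spread = (max(kept) - min(kept)) / mean
    return mean, discarded, spread

# Five hypothetical passes through the same level; the fourth is anomalous.
mean, dropped, spread = summarize_runs([61.2, 60.8, 59.9, 74.5, 60.4])
# The 74.5 FPS run is discarded; the rest average to about 60.6 FPS.
```

The same helper doubles as the suitability check from the list above: run it over a couple of dozen passes and reject the game if the spread exceeds 3%.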

All of the above makes FRAPS testing very time consuming. Add the lack of automation (apart from limiting a test to a certain length), which drags the tests out even further, plus a significant error even with multiple repetitions, and the credibility of this method of measuring 3D performance is too low for i3DSpeed.

FRAPS tests in games that do not allow demo playback may contain too many tester mistakes, and the spread of results may be too big: 9-10% is an unacceptable value. Of course, we can think of ways to soften the human factor and the differences between game scenes. For example, in a racing game we can choose a circular (or even straight) track without traffic or opponents; in an FPS we can find a spot without enemies and walk in a straight line for a short time. But will such results reflect real gaming? Such synthetic tests are no better than 3DMark.

Once again we call on game developers to heed the word of people interested in 3D graphics and high-tech games. If modern games had convenient tools to benchmark performance, and those tools could be automated, testers would have far fewer problems, and this article wouldn't have been necessary.

As for our own tests, we shall stick to applications with built-in benchmarks. The methods described in this article can be used with caution in individual cases, repeating the tests many times and thoroughly verifying all results for validity. If you use such tests, you must publish a detailed test procedure.





Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.