Introduction
This article continues the analysis we began in the previous article, which you may want to read first. The problems of testing 3D performance have not gone away over the past two years. The main one is that not all games have built-in benchmarks, to say nothing of convenient tools, simple automation, and full output of the average, minimum, and maximum frame rates.
From the point of view of benchmarking, there are three types of games: (1) games without built-in benchmarks; (2) games that can display the instantaneous FPS or that have a test which can hardly be automated; and (3) games with all the features needed to measure average/minimum/maximum FPS and to record custom demos.
The first group includes such popular games as BioShock, DiRT, Need for Speed: ProStreet, Test Drive: Unlimited, and many others; these games are the majority. The second group comprises World in Conflict, whose benchmark cannot be automated easily (you have to write complex scripts in special applications, and you can neither start the test from the command line nor record custom demos), as well as a lot of games that display only the instantaneous frame rate. The best group for reviewers includes Crysis, Call of Juarez, and S.T.A.L.K.E.R. These games let you record user demos (the first one also includes high-quality standard demos) and display minimum, maximum, and average FPS values.
Games with built-in benchmarks are far rarer than games without them, so many reviewers test such 3D applications any way they can, for example with third-party utilities that measure FPS (minimum, maximum, and average). FRAPS is the most popular tool of this kind.
These utilities are the only way to benchmark performance in applications without integrated features, which is the obvious advantage of this method. In the previous article we analyzed its drawbacks, which may render such tests useless. In this article we continue to examine whether it makes sense to use FRAPS to measure performance in 3D games that have no built-in benchmarks, and what measurement errors this method involves.
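To make it clear what the average, minimum, and maximum FPS values mean, here is a minimal Python sketch that derives them from a list of frame timestamps. It is only an illustration: the function name fps_summary, the input format (cumulative timestamps in milliseconds), and the choice of one-second buckets for the minimum and maximum are our assumptions, not a description of how FRAPS works internally.

```python
import math


def fps_summary(timestamps_ms):
    """Return (average, minimum, maximum) FPS.

    timestamps_ms: cumulative frame timestamps in milliseconds
    (hypothetical input format chosen for this sketch).
    """
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two frame timestamps")
    start = timestamps_ms[0]
    duration_s = (timestamps_ms[-1] - start) / 1000.0
    if duration_s <= 0:
        raise ValueError("timestamps must increase")

    # Average FPS: rendered frame intervals divided by elapsed time.
    average = (len(timestamps_ms) - 1) / duration_s

    # Count the frames completed within each full one-second bucket.
    full_seconds = int(duration_s)
    counts = [0] * full_seconds
    for t in timestamps_ms[1:]:
        bucket = math.ceil((t - start) / 1000.0) - 1
        if 0 <= bucket < full_seconds:
            counts[bucket] += 1

    # A run shorter than one second has no full bucket to compare.
    if not counts:
        return average, average, average
    return average, min(counts), max(counts)
```

For a run with one second at 50 FPS followed by one second at 100 FPS, this sketch reports an average of 75 FPS, a minimum of 50, and a maximum of 100. Note that the average depends on the whole measured interval, which is why the moments when the measurement starts and stops matter so much.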
Theory
We covered the theoretical issues in the first article, so we advise you to read it. We'll repeat only the major points here. One of the main drawbacks of this method is the human factor: a tester is limited by reaction time, attention, and so on, and cannot press the start/stop button with absolute precision time after time. Yet an average FPS value of acceptable precision requires practically identical start/stop cycles for different cards in each test.
Besides, automation plays a very important role if you test a lot of graphics cards under different conditions. If you start and stop tests manually at different resolutions, with different settings, and with different cards, you have to be extra careful the whole time. Your inevitable mistakes will add to the measurement error, raising it to an unacceptable level (above the usual 2-3% typical of benchmarks in games of the third type).
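The effect of imprecise start/stop timing can be illustrated with a small Python sketch. The trace below is made up for the example (it is not measured data), and the helper names window_average and relative_deviation_percent are hypothetical. We assume the frame rate is sampled at fixed intervals and that the tester starts and stops the measurement one sample late.

```python
def window_average(fps_samples, start, stop):
    """Average FPS over the samples in [start, stop)."""
    window = fps_samples[start:stop]
    if not window:
        raise ValueError("empty measurement window")
    return sum(window) / len(window)


def relative_deviation_percent(reference, measured):
    """Deviation of a measured value from a reference, in percent."""
    return abs(measured - reference) / reference * 100.0


# Hypothetical FPS trace sampled at fixed intervals (not real data):
# a heavy scene at the start, a light one at the end.
trace = [30, 30, 60, 60, 60, 60, 90, 90]

reference = window_average(trace, 1, 7)  # intended start/stop points
late = window_average(trace, 2, 8)       # tester reacts one sample late
deviation = relative_deviation_percent(reference, late)
```

In this contrived case the intended window averages 60 FPS, the late one averages 70 FPS, and the deviation is about 16.7%. Real traces are rarely that uneven, but the sketch shows why a shifted window matters most when the load changes near the start or the end of the run.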
Now we shall continue our analysis with some modern games, evaluating the tester's errors and their effect on the results. This time we'll focus on games that do not let you record and play back user demos, because where they do, tests with FRAPS make a certain amount of sense, as we found in the previous article. When these features are not available, some testers use the following trick: they load a saved game or a certain game level and simply play through it several times with various software and hardware settings. They believe that the measurement error of the average FPS will be acceptable in this case. Let's check.
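One simple way to judge such repeated playthroughs is to compare how far the individual runs stray from their mean. The Python sketch below assumes this particular metric (the largest deviation from the mean, in percent) and a 3% threshold taken from the upper end of the 2-3% figure mentioned above. The function name and the sample values are hypothetical and are not our results.

```python
from statistics import mean


def max_deviation_percent(run_averages):
    """Largest deviation of any run from the mean of all runs, in percent."""
    if not run_averages:
        raise ValueError("need at least one run")
    center = mean(run_averages)
    return max(abs(value - center) for value in run_averages) / center * 100.0


# Hypothetical average FPS values from three manual playthroughs
# of the same level with identical settings (not measured data).
runs = [58.0, 60.0, 62.0]

spread = max_deviation_percent(runs)
acceptable = spread <= 3.0  # assumed threshold, see the lead-in
```

With these sample numbers the spread is about 3.3%, so the runs would not pass the assumed threshold even though they differ by only a couple of frames per second.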
Testbed configuration and settings
- Processor: AMD Athlon 64 X2 4600+ Socket 939
- Motherboard: Foxconn WinFast NF4SK8AA-8KRS (NVIDIA nForce4 SLI)
- RAM: 2048 MB of DDR SDRAM PC3200
- Graphics card: NVIDIA GeForce 8800 GT 512 MB
- HDD: Seagate Barracuda 7200.7 120 GB SATA
- Operating system: Microsoft Windows Vista Home Premium
- Video driver: NVIDIA ForceWare 169.21
We used only one video mode: the popular and resource-intensive resolution of 1600x1200 (or the widescreen 1680x1050, which yields similar results) with 4x MSAA and 16x anisotropic filtering. All features were enabled in the game options; nothing was changed in the video driver control panel.
Our set of game tests consists of modern titles without built-in benchmarks. We gave preference to new games of various genres that feature interesting 3D techniques. Here is the list of games used in our tests: BioShock, Call of Duty 4, DiRT, Need for Speed: ProStreet, Oblivion, and Test Drive: Unlimited. We also used the latest version of FRAPS.