Tech-Hounds.com

Because gamers play games, not benchmarks




What makes a good game benchmark?

Built-in benchmarking tool or replay feature
To serve as a benchmark, a game must have either a built-in benchmarking tool or a replay feature. Typically the benchmarking tool comes in the form of a 'timedemo'. A timedemo is a replay of recorded gameplay, the difference being that it is played back as fast as the system can render it. The number of frames rendered is then divided by the time the timedemo took to complete, and the result is the average fps for that timedemo. If the game does not feature a benchmarking tool or timedemo, we must use the replay feature together with an external tool to compute the results we want. For this, we're using the FRAPS software utility. Bear in mind that external software does have some impact on performance, but we're confident that its influence is negligible.
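The average fps arithmetic described above can be sketched in a few lines of Python. The function name and the sample numbers are our own illustration and are not taken from any game's timedemo output.

```python
def average_fps(frame_count: int, elapsed_seconds: float) -> float:
    """Average frames per second for a single timedemo run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return frame_count / elapsed_seconds


# Hypothetical run: 3,600 frames rendered in 30 seconds.
print(average_fps(3600, 30.0))  # 120.0
```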
Benchmarks must reflect or represent gameplay
Of course, having a timedemo or replay is only the beginning. Since we want to see what performance will be like in real gaming situations, the timedemo or replay must also reflect gameplay scenarios and be as close as possible to actual gameplay. In our opinion, a timedemo or replay does not represent gameplay if it does not use the player's perspective, does not feature gaming situations (for example, it runs through empty levels or maps), or does not use effects commonly seen throughout the game.
Benchmarks must be repeatable and produce repeatable, similar results
Timedemos and replays must also be repeatable, and repeated runs should give the same results or, at the very least, results that are very close to each other. By repeating timedemos and replays, we can be sure that the results are valid and that no other influencing factors have come into play. This way, when we change the settings or switch platforms or products, any differences in results will be caused only by that change or switch.
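One way to check repeatability is to look at how far apart the repeated runs are relative to their mean. The sketch below is only an illustration: the function names, the sample results and the 5 % spread limit are assumptions on our part, not a fixed rule.

```python
from statistics import mean


def relative_spread(results):
    """Gap between the best and worst run, as a percentage of the mean."""
    return (max(results) - min(results)) / mean(results) * 100.0


def is_repeatable(results, max_spread_percent=5.0):
    """True if at least two runs exist and they stay within the spread limit.

    The 5 % default is an assumed limit chosen for illustration.
    """
    return len(results) >= 2 and relative_spread(results) <= max_spread_percent


# Hypothetical average-fps results from three runs of the same timedemo.
runs = [121.0, 120.0, 122.0]
print(f"spread: {relative_spread(runs):.2f} %, repeatable: {is_repeatable(runs)}")
```

If the check fails, the runs would be repeated before any comparison between settings or products is made.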

Variances and Significance

Of course, even repeated runs using the same settings and the same platform or product may vary. These differences are often caused by external factors such as loading data from the hard drive into memory. If this is the case, we will repeat the runs until the differences are minimal and not significant. An example would be a 3 fps difference between results of 120 and 123 fps: that is a 2.5 % difference, which is still within a 5 % margin and hardly noticeable during gameplay. The same 3 fps difference will be quite significant if the results are 25 and 28 fps (a 12 % difference).
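The two examples above can be worked through in code. This is a minimal sketch: the function names are hypothetical, and the difference is measured against the lower of the two results, which is how the percentages quoted above come out.

```python
def percent_difference(baseline_fps: float, other_fps: float) -> float:
    """Absolute difference between two results, as a percentage of the baseline."""
    return abs(other_fps - baseline_fps) / baseline_fps * 100.0


def is_significant(baseline_fps: float, other_fps: float,
                   margin_percent: float = 5.0) -> bool:
    """True when the difference exceeds the margin (5 % by default)."""
    return percent_difference(baseline_fps, other_fps) > margin_percent


for low, high in [(120, 123), (25, 28)]:
    diff = percent_difference(low, high)
    print(f"{low} vs {high} fps: {diff:.1f} % difference, "
          f"significant: {is_significant(low, high)}")
```

The same 3 fps gap falls inside the margin at 120 fps and well outside it at 25 fps, which is why we look at the relative difference and not the raw fps figure.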

Bugs, Optimizations and Different Rendering Methods

As with any other software, games and benchmarks often have bugs. To minimize the impact of bugs, we update the game binaries with the most recent final update from the game's developer or publisher. Results obtained with different binaries should not be compared without first checking whether the patch contains performance fixes or other changes affecting performance. We also update graphics card and motherboard drivers, which are often updated monthly. Drivers, graphics card drivers in particular, can also bring performance fixes. This means benchmark results obtained with different drivers should not be compared 'as is' without a proper evaluation of the performance impact.

Unlike game updates, graphics card drivers may contain performance enhancements and optimizations for a specific application or game. While it can be argued that they enhance gameplay through faster frame rates, they complicate the benchmark and evaluation process, since the driver may override or alter some settings or sacrifice image quality. Illegitimate optimizations may also inflate benchmark results and thus invalidate them, even more so if these optimizations are active only during benchmarks and not during gameplay. We have taken steps to ensure our benchmark results are valid. Image quality tests, both static (i.e. screenshots) and dynamic (playing the game and looking for artifacts, errors or differences), are done to ensure the image quality is the same or at least very close. We also compare the frame rates in benchmark results to the real-world gaming experience. If we discover any anomalies during testing, we will note them along with any relevant information. Graphical settings and features are activated through the game options when possible, and through the driver's settings panel when the game doesn't support them.

Unlike the developers of pure synthetic benchmarks, game developers may also choose to optimize their game for a specific product or platform, either right out of the box or through an update. We consider these developer optimizations acceptable, since at the very least the developers have assured through their own internal testing that image quality and gameplay have not deviated far from what they should be. A note will be given in this case as well.

The games and benchmarks

As time goes by, newer games may also be included. We're already thinking of using Splinter Cell Chaos Theory, but are still examining it to see whether it fits our needs. Other candidates we're considering are Call of Duty 2 and Dungeon Siege 2, once they're finally available.
