Tech-Hounds.com

Because gamers play games, not benchmarks




Experiences with the Core 2 Duo E6300

Moving to a new platform is never easy. In most cases, it involves a completely new install of all software and new drivers, plus coping with a number of quirks and odd behaviors. Even with a chipset from a leading manufacturer such as Intel, our experience with Core 2 Duo and the Intel P965 chipset has not been all bells and whistles. Granted, some, if not most, of the problems we faced can be blamed on different manufacturers' implementations.

That does not mean moving to a Core 2 Duo platform offers no advantages. On the contrary. Performance wise, Core 2 Duo has proven itself to be a much stronger performer than AMD's current offerings, more so when performance per watt is factored into the equation. Add a competitive price to the mix and it's only natural that gamers and hardware enthusiasts quickly moved in droves to Core 2 Duo platforms. However, for some of us, the icing on the cake is definitely the overclocking potential of these Core 2 Duo processors. It wouldn't be too much of a stretch to say Intel has returned to its Pentium II / Celeron days.

Who could forget Intel's famous Celeron 300A processor? First marketed as a low-end, cut-down version of Intel's Pentium II processor, it quickly became a staple of overclockers everywhere thanks to its overclocking potential and how easy it was to overclock. You could raise the FSB (Front Side Bus) to 100 MHz, sometimes without changing the processor's voltage. At that speed, you got a very affordable platform offering the performance of a 450 MHz Pentium II at a fraction of the cost. Intel hadn't even bothered to disable the processor's multiprocessing capabilities. With ABIT's famed BP6, one could build a low cost, dual processor setup with two Celeron 300A processors, whether for a workstation, a server or just fooling around with dual processor systems. The introduction of the 133 MHz FSB only propelled the performance of these processors even further. VIA's 694x chipsets were particularly interesting, allowing users to raise the FSB up to or near 133 MHz, then maximize bandwidth by having the memory clocked at 166 MHz. For some, it became a much more viable alternative than Intel's ill-fated 820 chipset with Rambus and ALi's Magik chipset with DDR-SDRAM.

Of the processors available, the Core 2 Duo E6300 and the recently launched E4300 are probably the most sought after for their overclocking potential and price. Unfortunately, we have been unable to get our hands on the E4300 just yet, but we've been toying with our E6300 processor for some time now. Today we're going to share what we found.

Setting Up

Before moving on to the technical stuff, we'd first like to discuss the reasons - why overclock your processor, or any other component for that matter? Well, everything else being the same, running at a higher clock means the processor completes more cycles per second, and more cycles mean more data gets crunched, which of course translates to higher performance. However, please keep in mind that overclocking means you're running above the manufacturer's specification, so any damage that might occur (and very likely will if you're not careful and don't take the necessary precautions) is not covered by the manufacturer's warranty.

So much for the disclaimer, now on to the other stuff.

Besides the obvious performance benefits (and the sheer fun and occasional frustration) of any overclocking endeavor, with this article we wanted to see which factors influence performance in a Core 2 Duo setup. Running at high clocks alone does not guarantee high performance - we also need to make sure the processor gets enough data to process. For that reason, we are especially interested in learning the impact of the FSB, synchronous / asynchronous FSB and memory settings, and DDR2-SDRAM timings on performance. All these factors contribute greatly to achieving optimal performance from your processor.

Our test setup
Intel Core 2 Duo E6300 socket LGA-775
2 x 512 MB A-DATA Vitesta 5-5-5-18 PC6400 DDR2-SDRAM
2 x 1024 MB Kingston KHX 5-5-5-16 (at PC6400) PC8500 DDR2-SDRAM
Gigabyte Radeon X1950 Pro 256 MB graphics card
Gigabyte P965-DS3P Intel P965 motherboard
Maxtor DiamondMaxPlus9 80 GBs Serial ATA 8 MB buffer
LiteOn 1673S DVD-RW
Tagan TG530-U15 530 watts ATX/BTX power supply

Settings:
Core: 1.225 volt
DDR: 1.9 volt (default motherboard voltage)
PCI-E clock: 100 MHz
Memory Timings: SPD unless otherwise specified

FSB, Multipliers and Voltage Settings

Unlike previous processors, there are two tools we can use to push Core 2 Duo processors to the limit. The first is of course the FSB or Front Side Bus. Both Intel's 975X and P965 chipsets offer the possibility of using much higher FSBs than the officially specified 266 MHz (1066 MHz effective) for Core 2 Duo processors. There are at least three major FSBs of choice if you want to run synchronously with the various DDR2-SDRAM standards: 333 (for DDR2-667), 400 (DDR2-800) and 533 MHz (DDR2-1066). Most motherboards offer the option to raise the FSB in 1 MHz increments, so you have a very wide selection of FSBs to choose from. The second tool is something that has been missing for quite some time from Intel's processors - multiplier control. At last, Intel has chosen to follow in AMD's footsteps by allowing downward multiplier control on Core 2 Duo processors. This allows us to experiment with much higher FSBs than before. For example, instead of topping out at 7 x 343 MHz to achieve a processor clock of 2.4 GHz, we can also use 6 x 400 MHz. Traditionally, only Engineering Sample (ES) versions of Pentium III and Pentium 4 processors allowed multiplier changes.
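To make the multiplier and FSB arithmetic above concrete, here is a minimal sketch. The function name is our own and purely illustrative; it only encodes the relationship that the core clock is the multiplier times the FSB.

```python
def core_clock_mhz(multiplier: int, fsb_mhz: float) -> float:
    """Core clock is the multiplier times the (real, not quad-pumped) FSB."""
    return multiplier * fsb_mhz


# Two ways to land at roughly 2.4 GHz, as described in the text.
for multiplier, fsb in [(7, 343), (6, 400)]:
    clock = core_clock_mhz(multiplier, fsb)
    print(f"{multiplier} x {fsb} MHz = {clock:.0f} MHz")
```

Both combinations end up within 1 MHz of each other, which is what lets us compare a lower and a higher FSB at practically the same processor clock.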

Of course, these are not the only requisites for getting a high, stable overclock. Voltage control is just as important. Thankfully, the performance per watt oriented design and Intel's excellent fabrication process have allowed Core 2 Duo processors to run at very low voltages and much, much lower temperatures than Pentium 4 processors. It's not strange to find Core 2 Duo processors running just fine at much lower voltages than officially specified. Take the Core 2 Duo E6300 - our sample runs smoothly at 1.1 volts (at 1.866 GHz) although the official specification clearly states 1.35 volts for normal operation. Low voltages usually mean not only much more overclocking headroom, but also much lower thermal dissipation. In layman's terms, that means less heat and thus less hassle in cooling down the processor. Throughout the whole process of testing for this article, we found the stock cooler does quite an adequate job of cooling a 2.8 GHz Core 2 Duo E6300 at 1.225 volts (though temperatures did get a little too high for our taste at full load for prolonged periods). You will probably notice the shot below shows 1.2 volts - that's no mistake. This is confirmed by both SpeedFan and Gigabyte's own V-Tuner. It would seem the Gigabyte P965-DS3P has a 0.025 volt deficit, hence the difference.



Some might ask why thermal dissipation and temperatures are so important, both in general use and in overclocked setups. Well, for one, lower temperatures mean a longer lifespan. They also help us keep thermal protection features from kicking in, which is something of a nuisance if you want to get the most performance out of your processor. Intel first introduced its processor thermal management technology with Pentium 4 'Willamette' processors. Should the processor reach a temperature that's deemed unsafe by the engineers at Intel, it will automatically scale down the clock and run slower to keep its temperature from getting any higher. It's quite effective in minimizing the risk of thermal damage or, in simpler terms, of the processor burning itself up. Tom's Hardware once documented this feature, quite dramatically we might add.

Now, back to the FSB or Front Side Bus. It is generally accepted that the higher the FSB you use, the higher the performance you'll get. That's because, assuming timings and latencies are constant, you are effectively pushing more data to and from the component you're overclocking. Even if the processor can't process all the data, you will still experience some gains because 'real' latency - idle cycles - is lower with higher FSBs, since each cycle takes up less time. However, some more adventurous overclockers have actually reported lower performance at higher FSBs with Core 2 Duo processors. Of course, we're curious to see if this is true and, if there is a point of diminishing returns, at what clock it occurs and whether there is something we can do about it.

Memory Related Factors

Another area we want to examine is further down the pipe - memory. By default, Core 2 Duo processors, even the highest clocked Core 2 Extreme X6800, only run at a 266 MHz FSB (1066 MHz effective). To alleviate bandwidth penalties, Intel decided to use larger caches with these processors. Conroe processors - E6600, E6700 and X6800 - come with 4 MB of cache while Allendale processors - E6300 and E6400 - come with 2 MB of cache, half that of their much more expensive siblings. On the chipset, you can use asynchronous memory clocks - for example, if you're using the official FSB of 266 MHz, you can run your DDR2 modules at 333, 400 or 533 MHz (translating into DDR2-667, 800 and 1066, respectively). Now, there are two problems we can see with this picture. First, judging from past experience, running asynchronous memory clocks is not all it's cracked up to be. Memory controllers have to be specifically tuned for asynchronous settings and traditionally Intel chipsets work best in synchronous modes. Second, memory timings usually get 'looser' with higher frequencies - latencies are higher at higher clocks. Despite higher raw bandwidth, we might not see much improvement in performance with asynchronous modes.
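As a rough illustration of how the memory clocks above map to DDR2 ratings and theoretical peak bandwidth, here is a small sketch. It assumes a single 64-bit (8 byte) memory channel and uses the rounded clocks quoted in the text, so 333 MHz prints as 666 rather than the nominal DDR2-667. These are theoretical ceilings, not the figures a benchmark will report.

```python
BUS_WIDTH_BYTES = 8  # assumption: one 64-bit DDR2 channel


def ddr2_rating(memory_clock_mhz: int) -> int:
    """DDR transfers data twice per clock, so the rating is twice the clock."""
    return 2 * memory_clock_mhz


def peak_bandwidth_mb_s(memory_clock_mhz: int) -> int:
    """Theoretical peak per channel: transfers per second times bytes per transfer."""
    return ddr2_rating(memory_clock_mhz) * BUS_WIDTH_BYTES


for clock in (333, 400, 533):
    print(f"{clock} MHz -> DDR2-{ddr2_rating(clock)}, "
          f"peak {peak_bandwidth_mb_s(clock)} MB/s per channel")
```

The 400 MHz case works out to 6400 MB/s, which is where the PC6400 label on DDR2-800 modules comes from.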

Most articles on the matter have also stuck to the official FSB spec - 266 MHz. We are particularly interested to see if such behavior is also present at higher FSBs, which fits perfectly with our overclocking experiment. If there is an FSB cap on Intel's chipsets, then which method should we use to increase memory bandwidth: higher memory clocks with asynchronous settings, tighter memory timings with synchronous settings, or a little bit of both? What kind of gains can we expect from each? This is especially interesting, since many memory manufacturers have been touting DDR2-1066 and higher clocked memory modules for some time now. Some even tout low latencies, but is there really a definite need for them?

Low Risk, Low Hassle Overclocking

Overclocking, voltage and cooling are closely related to one another. If you overclock a component, chances are it will produce more heat than if you don't overclock it. That means you have to pay attention to cooling solutions - sometimes just adding a fan is enough, sometimes you have to use more exotic methods like water cooling or thermo-electric solutions. Raising the voltage adds another problem - more voltage means more watts, and more watts mean more heat, which of course means you need a more effective cooling solution to keep temperatures low (or low enough). For the purpose of this article, we decided NOT to raise any voltage settings - be it processor, memory or chipset. In fact, the Core 2 Duo E6300 we're using is actually running at a lower voltage than specified - mostly to keep a balance between stability and temperature. By not raising the voltage, the overclocks we reached are pretty tame - 2.8 GHz with stock air cooling - but it's also hassle free, easy and less risky than if we were to raise any voltages. No doubt if we were to raise the voltage and use more effective cooling methods we could've reached higher clocks, but that's not what we're interested in for this article.
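The link between voltage, clock and heat can be sketched with the common first-order rule that dynamic power scales with voltage squared times frequency. This is a textbook approximation that ignores leakage and everything else on the board, and the function below is our own illustration, not a measurement from our test system.

```python
def relative_dynamic_power(v_new: float, f_new: float,
                           v_ref: float, f_ref: float) -> float:
    """Dynamic power relative to a reference point, using P ~ V^2 * f.

    First-order approximation only: leakage current and temperature
    effects are ignored.
    """
    return (v_new / v_ref) ** 2 * (f_new / f_ref)


# Our overclock (1.225 V at 2800 MHz) against the stock clock at the
# officially specified 1.35 V and 1866 MHz.
ratio = relative_dynamic_power(1.225, 2800, 1.35, 1866)
print(f"Estimated dynamic power relative to stock: {ratio:.2f}x")
```

Because voltage enters squared, undervolting claws back a good part of what the higher clock costs, which is one reason we chose not to raise any voltages.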

Preliminary Tests

Since we're going to examine very specific areas, we have to use very isolated tests - that means resorting to synthetic benchmarks. In this case, we're going to use two benchmarks - SuperPi 1.1 (no mod) and Sciencemark 2.0's Membench. SuperPi is a very processor / bandwidth sensitive benchmark, which makes it ideal for doing preliminary tests. Although it's a very good benchmark, SuperPi test results are influenced by several components - processor (clock) and memory (timing and bandwidth). That's why we need a second benchmark. Sciencemark's Membench results will give us a much clearer view of bandwidth and latency so we can determine which one has the bigger influence on performance. Keep in mind, these are only preliminary test results, which we hope will help us find the combination of FSB and memory settings that offers the best increase in performance.

Here is the SPD data from the A-DATA Vitesta modules


Synchronous vs Asynchronous Mode

First off, let's take a look at asynchronous test results from Sciencemark's Membench. We kept the processor running at its default clock of 7 x 266 MHz or 1866 MHz, then ran tests under three settings - with the memory running at 533 MHz (synchronous), 667 and 800 MHz (asynchronous). For this test, we've decided to let the motherboard apply SPD timing values, meaning memory timings are different for each memory clock.


Mode               Sync           Async          Async
Memory clock       533 MHz        667 MHz        800 MHz
Bandwidth          4609.8 MB/s    5016.16 MB/s   5079 MB/s
Latency (cycles)
4 byte stride      3              3              3
16 byte stride     7              7              6
64 byte stride     28             26             25
256 byte stride    99             89             83
512 byte stride    112            102            98

Compared to Sync 533 MHz
Memory clock       Async 667 MHz  Async 800 MHz
Bandwidth          8.82%          10.18%
Latency (cycles)
4 byte stride      0              0
16 byte stride     0              -1
64 byte stride     -2             -3
256 byte stride    -10            -16
512 byte stride    -10            -14
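For readers who want to check the percentages in the comparison table, here is a minimal sketch of how they can be derived from the raw Membench bandwidth figures. The helper name is our own; the numbers are the ones from the table above.

```python
def percent_gain(baseline: float, value: float) -> float:
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


sync_533 = 4609.8  # MB/s, synchronous 533 MHz baseline
async_results = {
    "Async 667 MHz": 5016.16,
    "Async 800 MHz": 5079.0,
}

for label, bandwidth in async_results.items():
    gain = percent_gain(sync_533, bandwidth)
    print(f"{label}: {gain:.2f}% over Sync 533 MHz")
```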

Next, we raise the FSB to 333 MHz, which is supposedly the new official FSB for the newer Core 2 architecture based Intel processors and Intel Bearlake chipsets.


Mode               Sync           Async
Memory clock       667 MHz        833 MHz
Bandwidth          5295.66 MB/s   5649.92 MB/s
Latency (cycles)
4 byte stride      3              3
16 byte stride     8              8
64 byte stride     34             31
256 byte stride    120            104
512 byte stride    138            122

Compared to Sync 667 MHz
Memory clock       Async 833 MHz
Bandwidth          6.69%
Latency (cycles)
4 byte stride      0
16 byte stride     0
64 byte stride     -3
256 byte stride    -16
512 byte stride    -16

Notice that bandwidth gains are relatively insignificant with asynchronous memory settings (mostly below 10 percent, staying around 400 MB/s), even though latencies are slightly lower. Below you can see the results of the same test, but this time using synchronous settings. Keep in mind that this means raising the FSB to keep it in sync with memory speeds (FSB 266 - memory 533 MHz, FSB 333 - memory 667 MHz and FSB 400 - memory 800 MHz).


Mode               Sync           Sync           Sync
Memory clock       533 MHz        667 MHz        800 MHz
Bandwidth          4609.8 MB/s    5295.66 MB/s   6262.48 MB/s
Latency (cycles)
4 byte stride      3              3              3
16 byte stride     7              8              9
64 byte stride     28             34             35
256 byte stride    99             120            125
512 byte stride    112            138            143

Compared to Sync 533 MHz
Memory clock       Sync 667 MHz   Sync 800 MHz
Bandwidth          14.88%         35.85%
Latency (cycles)
4 byte stride      0              0
16 byte stride     1              2
64 byte stride     6              7
256 byte stride    21             26
512 byte stride    26             31

Notice how things have changed - bandwidth gains are significant and much more noticeable (roughly 15 percent and up). Assuming memory clocks and timings are the same between synchronous and asynchronous settings, the extra boost in bandwidth has to come from somewhere else - we believe from the inner workings of the chipset, or rather the memory controller. You can see the table below for a much clearer comparison.


Mode               Async          Sync           Async          Sync
Memory clock       667 MHz        667 MHz        800 MHz        800 MHz
Bandwidth          5016.16 MB/s   5295.66 MB/s   5079 MB/s      6262.48 MB/s
Latency (cycles)
4 byte stride      3              3              3              3
16 byte stride     7              8              6              9
64 byte stride     26             34             25             35
256 byte stride    89             120            83             125
512 byte stride    102            138            98             143

Sync compared to Async at the same memory clock
Memory clock       667 MHz        800 MHz
Bandwidth          5.57%          23.30%
Latency (cycles)
4 byte stride      0              0
16 byte stride     1              3
64 byte stride     8              10
256 byte stride    31             42
512 byte stride    36             45

Just for confirmation, let's take a peek at CPU-Z's memory timing dump.

667 MHz

 

800 MHz

 

Well, that single CAS latency difference may play a part in the results between synchronous and asynchronous 667 MHz, but there's no other explanation for the lack of any timing difference between synchronous and asynchronous 800 MHz. Conclusion: memory timings are the same between synchronous and asynchronous settings. Obviously, Intel chipsets, in this particular case the Intel P965, are still designed with synchronous rather than asynchronous memory settings in mind - exactly like their predecessors. Wait! You might say these results are caused by raising the processor's FSB, but remember, we're not looking at performance numbers here, only at memory (subsystem) bandwidth and latencies.

Here's an explanation:

By raising the FSB, we raise the clock at which the chipset is operating. If internal chipset / memory controller latencies and memory timings were constant, we should have seen less time spent on idle cycles - we didn't. We saw higher latencies in synchronous mode than in asynchronous mode at the same memory clock (and timing). On top of that, we also saw higher FSBs come with higher latencies than lower FSBs (266 MHz vs 333 MHz vs 400 MHz) - latencies that are inside the chipset / memory controller, not on the memory modules. Later we will see why this is important.

FSB and Multiplier Mix

With multiplier control available, there's another new 'wrinkle' to keep in mind when setting your processor clock - do you want to push the FSB as high as it can go, or go for a higher multiplier first? The rule of thumb is to raise the FSB first, since theoretically you're pushing more data to and from the processor. Is that still true for these processors? Let's take a closer look. We set the processor to run at 2.4 GHz, then ran tests under two settings - first with 7 x 343 MHz and then with 6 x 400 MHz. For this test, we've included results with fixed memory timings as well, mostly to see the influence of timing in ideal conditions (synchronous mode).


FSB                343 MHz SPD    343 MHz        400 MHz        400 MHz
Bandwidth          5429.04 MB/s   5466.46 MB/s   5913.74 MB/s   5913.74 MB/s
Latency (cycles)
4 byte stride      3              3              3              3
16 byte stride     8              8              9              9
64 byte stride     34             34             34             34
256 byte stride    121            120            125            125
512 byte stride    138            137            143            143

Comparison         343 MHz vs     400 MHz vs     400 MHz vs
                   343 MHz SPD    343 MHz SPD    343 MHz
Bandwidth          0.69%          8.93%          8.18%
Latency (cycles)
4 byte stride      0              0              0
16 byte stride     0              1              1
64 byte stride     0              0              0
256 byte stride    -1             4              5
512 byte stride    -1             5              6

And here are CPU-Z's memory timing dumps.

343 MHz

 

400 MHz



Apparently, a higher FSB is still the way to go. You still gain some bandwidth with higher FSBs, despite the slightly higher latencies we believe are 'inside' the chipset. Assuming everything else is constant (processor clock, memory timing), raising the FSB by 57 MHz in synchronous mode allows us to enjoy pretty much the same benefit as using asynchronous settings (say 343 MHz FSB and 1029 MHz memory). In fact, if we look back at the synchronous mode test results, we can see the increase from raising the FSB is actually higher, since the gain you get from asynchronous mode decreases with even higher FSBs.

Now, is that kind of gain (less than 10 percent) noticeable at all in real life? We've kept everything else pretty much constant (processor clock, memory timings) but we don't really know for sure. We think SuperPi test results can offer a glimpse of what the answer might be. We chose the 8M digits option, which seems to be a pretty good compromise between speed and quality. Here are the results:


FSB          343 MHz SPD   343 MHz   400 MHz
Iteration    seconds       seconds   seconds
1            13            13        13
2            25            25        24
3            37            37        36
4            49            49        47
5            61            61        59
6            73            73        70
7            85            85        82
8            96            97        93
9            108           108       105
10           120           120       116
11           132           132       128
12           144           144       139
13           156           156       151
14           168           168       162
15           180           180       174
16           191           192       185
17           203           204       197
18           215           215       208
19           227           227       219
20           238           239       231
21           250           250       241
22           260           260       251

The easiest way to spot the difference is to compare values in the last row - a 9 second difference between running synchronously at 400 MHz and 343 MHz. However, it is interesting to note that there are some very small differences between the two sets of 343 MHz values. The SPD values are lower on the 8th, 16th, 17th and 20th iterations. Rounding error perhaps? Or caused by that single tRAS cycle? Overall, it doesn't really matter. That 9 second difference translates to a mere 3 percent performance increase. We're doubtful it will show up in real life conditions such as game benchmarks and gameplay testing sessions. But we will conduct game benchmarks just to be sure.
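The percentage quoted above can be worked out from the final SuperPi times. The sketch below uses our own helper name and simply expresses the time saved as a share of the slower run.

```python
def time_saved_percent(slower_seconds: float, faster_seconds: float) -> float:
    """Time saved by the faster run, as a percentage of the slower run."""
    return (slower_seconds - faster_seconds) / slower_seconds * 100.0


# Final (22nd iteration) SuperPi 8M times at 2.4 GHz.
fsb_343 = 260  # seconds, 7 x 343 MHz
fsb_400 = 251  # seconds, 6 x 400 MHz

saved = time_saved_percent(fsb_343, fsb_400)
print(f"{fsb_343 - fsb_400} seconds saved, about {saved:.1f}% of the run")
```

Whichever way you round it, the gain sits between 3 and 4 percent, far less than the 8 to 9 percent bandwidth increase Membench reported for the same change.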

Now, let's look back at the asynchronous vs synchronous results. If under ideal conditions (synchronous mode) an 8 percent gain in bandwidth nets us only a 3 percent increase in performance, it's very likely we'll see the same, perhaps less, with asynchronous settings. What if we raise the bar and use higher multipliers? A higher clocked processor is more likely to be bandwidth starved than a lower clocked one. Let's test that assumption. First - the FSB. This time we're going to run the processor slightly faster at 2.8 GHz - first with 7 x 400 MHz and then with 6 x 466 MHz. For brevity, we use the same memory timings on both runs.


FSB                400 MHz        466 MHz
Bandwidth          6229.61 MB/s   6783.84 MB/s
Latency (cycles)
4 byte stride      3              3
16 byte stride     9              9
64 byte stride     36             34
256 byte stride    128            128
512 byte stride    146            146

Compared to 400 MHz
FSB                466 MHz
Bandwidth          8.90%
Latency (cycles)
4 byte stride      0
16 byte stride     0
64 byte stride     -2
256 byte stride    0
512 byte stride    0

Interestingly enough, we're seeing practically the same bandwidth increase as we did from 343 MHz to 400 MHz - around 8 to 9 percent. That seems to be the average gain for every 57 to 66 MHz increase in FSB. It's important to note that this time we're seeing practically no difference in latency between the two settings. It's very likely chipset timings didn't change between these two settings, unlike what we saw earlier with 343 and 400 MHz. Now, let's take a look at SuperPi test results.


FSB          400 MHz   466 MHz
Iteration    seconds   seconds
1            12        11
2            22        21
3            32        31
4            42        41
5            53        51
6            63        61
7            73        71
8            84        81
9            94        91
10           104       101
11           115       111
12           125       121
13           135       131
14           146       141
15           156       151
16           166       161
17           176       171
18           187       181
19           197       191
20           207       200
21           217       210
22           226       218

The 400 MHz jump from 2.4 GHz to 2.8 GHz allows us to shave off 25 to 34 seconds. However, that's not really what we're interested in. If you compare values in the last row, you'll see that we managed to shave off 8 seconds by using a slightly higher FSB (66 MHz higher). Still, the gain in performance is only about 3 percent - similar, if not identical, to what we saw earlier with 343 and 400 MHz FSB.

Though it would have been much more fun to raise the FSB to 533 MHz, we were unable to get that far. Remember, we have not raised the voltage for any of the components and we're still using the processor's stock air cooling. Running at 533 MHz FSB means we would have to compare results at 3.2 GHz, using either 6 x 533 MHz or 7 x 457 MHz, but 3.2 GHz is not really attainable using stock air cooling. At 466 MHz, the Intel P965 chipset on our Gigabyte P965-DS3P sample was already running quite hot, and it was not entirely rock stable without a much more effective cooling solution. At that setting we were unable to run stable long enough to conduct our game benchmarks.

Looks like 400 MHz is as far as we can take the FSB and still run rock stable. Now, let's examine the memory. We've managed to get our hands on some DDR2-1066 modules - Kingston's KHX-8500 2 GB kit. Swapping them in, we ran the same tests under two settings - synchronous with tighter timings (4-4-4-9) and asynchronous with SPD timings at DDR2-1000 (running the memory asynchronously at 4:5). Here is the SPD data from these modules:



For your information, Kingston officially rates these memory modules as PC8500 at 2.2 volts. Remember, we're running them throughout these tests at 1.9 volts.
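The 4:5 ratio mentioned above follows directly from the two clocks involved. Here is a small sketch using Python's standard `fractions` module; the function names are our own and purely illustrative.

```python
from fractions import Fraction


def fsb_memory_ratio(fsb_mhz: int, memory_clock_mhz: int) -> Fraction:
    """FSB to memory clock ratio, reduced to lowest terms."""
    return Fraction(fsb_mhz, memory_clock_mhz)


def is_synchronous(fsb_mhz: int, memory_clock_mhz: int) -> bool:
    """Synchronous means FSB and memory run at the same real clock (1:1)."""
    return fsb_memory_ratio(fsb_mhz, memory_clock_mhz) == 1


# 400 MHz FSB with DDR2-800 (400 MHz) and with DDR2-1000 (500 MHz).
for memory_clock in (400, 500):
    ratio = fsb_memory_ratio(400, memory_clock)
    mode = "synchronous" if is_synchronous(400, memory_clock) else "asynchronous"
    print(f"FSB 400 / memory {memory_clock} MHz -> "
          f"{ratio.numerator}:{ratio.denominator} ({mode})")
```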

Here are the results:


Mode               Sync              Async
Setting            400 MHz 4-4-4-9   400 MHz 5-7-7-20
Bandwidth          6450.78 MB/s      6719.65 MB/s
Latency (cycles)
4 byte stride      3                 3
16 byte stride     8                 8
64 byte stride     34                31
256 byte stride    120               110
512 byte stride    137               127

Compared to 400 MHz 4-4-4-9
Setting            Async 400 MHz 5-7-7-20
Bandwidth          4.17%
Latency (cycles)
4 byte stride      0
16 byte stride     0
64 byte stride     -3
256 byte stride    -10
512 byte stride    -10

Next, the SuperPi results under both settings.


Mode         Sync              Async
Setting      400 MHz 4-4-4-9   400 MHz 5-7-7-20
Iteration    seconds           seconds
1            11                11
2            21                21
3            31                31
4            42                42
5            52                52
6            62                62
7            72                72
8            82                82
9            92                92
10           102               102
11           112               112
12           122               122
13           132               132
14           142               142
15           152               152
16           162               162
17           172               172
18           182               182
19           192               192
20           202               202
21           212               212
22           220               220

Now, this is particularly interesting - very interesting indeed. From Sciencemark Membench's results we can see that asynchronous settings at 400 MHz FSB with 1000 MHz memory offer an advantage in both bandwidth and latency compared to running synchronously at 400 MHz with tighter timings. However, those two advantages didn't translate into a real world increase in performance - the overall SuperPi 8M test results on both settings, hell, even the time taken for each iteration, are the same. Now compare these SuperPi results with our previous 400 and 466 MHz results, taken with much more 'relaxed' timings - '5-5-5-18'. The difference: 6 seconds faster than 400 MHz and 2 seconds slower than 466 MHz. These are so small you will not notice them in real life.

Confused yet? What does it all mean? What's the point of these tests?

If you're lucky enough to get a more overclockable processor or employ a much more effective cooling solution, the performance bottleneck at very high FSBs on a Core 2 Duo platform is not the DDR2 modules or the processor itself but rather the chipset or, more precisely, the memory controller. The Intel P965 chipset was designed with synchronous settings in mind and it looks like the ceiling for this chipset was set at 400 MHz. It makes do in asynchronous modes by lowering latencies (no doubt to offset the high latencies of higher clocked memory modules). At least that's what we think.

Why?

We saw a trend in synchronous mode: with every 66 MHz jump from 266 MHz to 400 MHz, overall latency increases. We did not see a similar situation between 400 and 466 MHz. At 400 MHz, internal chipset latencies are already being stretched to the limit. At higher clocks the chipset simply cannot use any 'looser' internal timings to compensate and transfer data from the memory to the processor fast enough - limiting us from gaining any increase in performance. Look closely at the bandwidth numbers from the 466 MHz FSB and the 400 MHz with DDR2-1000 modules test results to see what we mean - they are practically the same. The reason we're seeing higher performance from the higher clocked FSB is that 'real' latency, the actual time spent on idle cycles, went down with higher clocks.

Latencies in terms of cycles stayed constant, but the chipset / memory controller is running at a higher clock. Thus latencies in time spent (ns) are lower - more data went through.
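The point about 'real' latency can be illustrated with a simple cycles to nanoseconds conversion. The cycle count below is a made-up example value, and we assume the cycles are counted at the clock passed to the function; it is meant only to show the direction of the effect, not to reproduce our Membench figures.

```python
def cycles_to_ns(cycles: float, clock_mhz: float) -> float:
    """Time taken by a number of cycles at a given clock, in nanoseconds."""
    return cycles * 1000.0 / clock_mhz


# Hypothetical: the same 10 cycle latency at two chipset clocks.
example_cycles = 10
for clock in (400, 466):
    print(f"{example_cycles} cycles at {clock} MHz = "
          f"{cycles_to_ns(example_cycles, clock):.2f} ns")
```

With the cycle count held constant, the higher clock always yields the shorter wait, which is the effect we believe explains the small gain at 466 MHz.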

However, since the chipset is not able to loosen its internal timings any further past 400 MHz, it's only a matter of time before we hit a MHz brick wall should we continue to increase the FSB. This would explain why some overclockers reported a decrease in performance with FSBs higher than 400 MHz - the chipset's internal latency was most likely 'loosened' (either by Intel or by motherboard manufacturers) at higher clocks - a much needed compromise to allow stability above 400 MHz.

Performance

Although we did get some useful information from the preliminary tests, we'd still like to see what those settings translate to in a more applicable situation - games. So we ran our usual game benchmarks with the same setup under several settings: the default 266 MHz FSB, the slightly higher 343 MHz and 400 MHz FSBs at 2.4 GHz and last, 400 MHz FSB at 2.8 GHz. These tests also provide some other valuable information - we can see which games are bandwidth and / or clock sensitive. Another way of looking at it is this: will games be able to use the additional memory bandwidth? These results are the averages of three runs.

Our test setup
Intel Core 2 Duo E6300 socket LGA-775
2 x 512 MB A-DATA Vitesta 5-5-5-18 PC6400 DDR2-SDRAM
Gigabyte Radeon X1950 Pro 256 MB graphics card
Gigabyte P965-DS3P Intel P965 motherboard
Maxtor DiamondMaxPlus9 80 GBs Serial ATA 8 MB buffer
LiteOn 1673S DVD-RW
Tagan TG530-U15 530 watts ATX/BTX power supply

Windows XP Professional with Service Pack 2 installed
ATI Catalyst 7.1 reference driver
Intel Chipset Software Installation Utility 8.1.0.1006
DirectX 9.0c
All games used for benchmarks have been updated to their latest, final builds.

The results:

Call of Duty - Dawnville, 1024 x 768
Frame rate (fps)   7x266 MHz FSB   7x343 MHz FSB   6x400 MHz FSB   7x400 MHz FSB
Min                97              120             122.67          140.67
Avg                257.01          308.59          316.7           348.12
Max                600.67          637.67          645.67          720

Gain               7x343 vs 7x266  6x400 vs 7x266  7x400 vs 7x266  6x400 vs 7x343
Min                23.71%          26.46%          45.02%          2.75%
Avg                20.07%          23.23%          35.45%          3.15%
Max                6.16%           7.49%           19.87%          1.33%

The table above provides us with a much clearer view of the results. By moving from 7 x 266 MHz to 7 x 400 MHz, we effectively raise both the FSB and the real processor clock by 50 percent, yet we're only seeing a 35 percent increase overall. It would seem Call of Duty, or at least the Dawnville demo, doesn't respond all that well to increases in bandwidth. A higher clock is still the reigning performance factor, allowing increases in minimum and average frame rates. If we compare the results we got from 6 x 400 MHz and 7 x 343 MHz FSB, we can see this benchmark falls in line with our performance increase expectations based on SuperPi results - around 3 percent.
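For those who want to reproduce the gain figures, here is a minimal sketch using the Call of Duty minimum frame rates from the table above. The helper name is our own. Judging from the numbers, the last column of the gain table appears to be the difference between the 6x400 and 7x343 gains, both measured against the 7x266 baseline, and the sketch follows that reading.

```python
def percent_gain(baseline: float, value: float) -> float:
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Call of Duty - Dawnville, minimum fps per setting (from the table above).
min_fps = {"7x266": 97.0, "7x343": 120.0, "6x400": 122.67, "7x400": 140.67}

baseline = min_fps["7x266"]
gains = {
    setting: percent_gain(baseline, fps)
    for setting, fps in min_fps.items()
    if setting != "7x266"
}

# Effect of the higher FSB at (nearly) the same 2.4 GHz clock, expressed
# in percentage points of the 7x266 baseline.
fsb_only = gains["6x400"] - gains["7x343"]

for setting, gain in gains.items():
    print(f"{setting} vs 7x266: {gain:.2f}%")
print(f"6x400 vs 7x343: {fsb_only:.2f} points")
```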

Homeworld 2 - Vaygr Bomber Strike, 1024 x 768
Frame rate (fps)   7x266 MHz FSB   7x343 MHz FSB   6x400 MHz FSB   7x400 MHz FSB
Min                77.67           129.33          137.33          170.67
Avg                295.36          353.25          370.58          412.83
Max                485.67          569.33          604.33          665

Gain               7x343 vs 7x266  6x400 vs 7x266  7x400 vs 7x266  6x400 vs 7x343
Min                66.52%          76.82%          119.74%         10.30%
Avg                19.60%          25.47%          39.77%          5.87%
Max                17.23%          24.43%          36.93%          7.21%

There's a nasty bug in Catalyst 7.1 that causes very low frame rates when shadows are enabled in this game. To get any meaningful results, we had to turn off shadows during testing for this article. Looking at the results, it's clear that minimum frame rates in this benchmark are heavily system related. Look at the increase we got when we overclocked the system - close to 120 percent at 2.8 GHz. Average frame rate gains are not as spectacular, but still significant (39 percent). What about bandwidth increases due to higher FSBs? Homeworld 2 seems to respond very well to bandwidth increases, at almost twice the rate we saw with SuperPi. In low frame rate situations, it's even more responsive (10 percent), one proof that higher FSBs and higher bandwidth can offer something significant for the gameplay experience in this game.

Nascar 2003 - Custom Replay, 1024 x 768
Frame rate (fps)   7x266 MHz FSB   7x343 MHz FSB   6x400 MHz FSB   7x400 MHz FSB
Min                37.67           49              51              58
Avg                52.01           65.3            66.79           76.23
Max                91              112             112.67          130.33

Gain               7x343 vs 7x266  6x400 vs 7x266  7x400 vs 7x266  6x400 vs 7x343
Min                30.09%          35.40%          53.98%          5.31%
Avg                25.56%          28.42%          46.57%          2.86%
Max                23.08%          23.81%          43.22%          0.73%

From the results, it's obvious this game is very system limited. Look at the increase from 2.4 to 2.8 GHz - we gain almost twice the increase (46.57 percent compared to 25.56 and 28.42 percent). Increases in minimum, average and maximum frame rates stay pretty close with each clock increase (23 to 35 percent at 2.4 GHz and 43 to 53 percent at 2.8 GHz). So, are the gains clock or bandwidth related? On average frame rates, the gains are lower than what we saw with SuperPi. More interesting is the difference in minimum fps - 5.31 percent. It's obvious low frame rate situations are also bandwidth related, but processor clock is still the main performance factor here.

Rome Total War - Custom Battle, 1024 x 768
Setting           Min fps   Avg fps   Max fps
7 x 266 MHz FSB   17        23.64     27
7 x 343 MHz FSB   21        28.33     32.33
6 x 400 MHz FSB   23.67     29.53     33.67
7 x 400 MHz FSB   26        32.93     38.33

Increase   7x343 to 7x266   6x400 to 7x266   7x400 to 7x266   6x400 to 7x343
Min        23.53%           39.22%           52.94%           15.69%
Avg        19.86%           24.93%           39.32%           5.07%
Max        19.75%           24.69%           41.98%           4.94%

This benchmark has slightly larger variations, but it's still a good example of how demanding RTS games really are. The game is heavily system limited, and we gain the most in minimum frame rates when we overclock the processor. It also responds very well to increases in bandwidth from higher FSBs, though there are still limits. Look at the increase in minimum fps when we change the FSB from 343 MHz to 400 MHz - more than 15 percent. Increases in average frame rates are not as dramatic, but still higher than what we saw with SuperPi.

Full Spectrum Warrior - Custom Replay, 1024 x 768
Setting           Min fps   Avg fps   Max fps
7 x 266 MHz FSB   91        121.41    159
7 x 343 MHz FSB   110       138.94    181
6 x 400 MHz FSB   105.67    141.16    186.67
7 x 400 MHz FSB   95.33     146.67    207

Increase   7x343 to 7x266   6x400 to 7x266   7x400 to 7x266   6x400 to 7x343
Min        20.88%           16.12%           4.76%            -4.76%
Avg        14.44%           16.26%           20.80%           1.83%
Max        13.84%           17.40%           30.19%           3.56%

Now, this is one game that defies the norm. Though overall we're seeing performance rise with higher clocks, the gains in minimum frame rates actually shrink with higher FSBs and clocks. One likely explanation is that we're nearing the crossover point between system and graphics performance boundaries - the graphics card we're using has become the bottleneck. In this case, we need to look at maximum fps, where the graphics card clearly still has some room. The numbers here make much more sense - we see a 30 percent increase when we overclock the Core 2 Duo E6300 to 2.8 GHz with a 400 MHz FSB. Bandwidth wise, this game seems to behave pretty much like Call of Duty - performance increases are largely due to clock, not bandwidth.

One strange thing to note: look at the actual frame rate numbers - we're actually seeing lower minimum frame rates at 2.8 GHz than at 2.4 GHz.

Dungeon Siege - Greilyn Beach, 1024 x 768
Setting           Min fps   Avg fps   Max fps
7 x 266 MHz FSB   35.67     74.55     179.33
7 x 343 MHz FSB   43        89.86     202
6 x 400 MHz FSB   45        94.41     213.33
7 x 400 MHz FSB   48.33     103.72    236

Increase   7x343 to 7x266   6x400 to 7x266   7x400 to 7x266   6x400 to 7x343
Min        20.56%           26.17%           35.51%           5.61%
Avg        20.53%           26.63%           39.11%           6.10%
Max        12.64%           18.96%           31.60%           6.32%

Like its previous incarnations, Dungeon Siege is very system limited. Clock speed rules here, though the game can still appreciate additional bandwidth from higher FSBs. The performance gained by moving from 2.4 GHz to 2.8 GHz is significant, be it from 7 x 343 MHz or 6 x 400 MHz. Overall, we are getting closer to the theoretical increase of 50 percent, but something else is holding us back and it's not bandwidth. Though larger than what we saw with SuperPi, the performance increase due to bandwidth is not significant.
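The theoretical increase of 50 percent comes from the clock arithmetic: core clock is multiplier times FSB, so going from 7 x 266 MHz to 7 x 400 MHz raises the clock by half. The short Python sketch below works this out and compares it with the Dungeon Siege gains at 7 x 400 MHz. It assumes the nominal 266 MHz FSB is really 266.67 MHz, and the "scaling efficiency" ratio is our own illustrative measure, not something measured in the tests.

```python
def core_clock_mhz(multiplier, fsb_mhz):
    """Core clock is simply multiplier times FSB clock."""
    return multiplier * fsb_mhz


# Assumption: the nominal 266 MHz FSB is taken as 266.67 MHz.
stock = core_clock_mhz(7, 266.67)
overclocked = core_clock_mhz(7, 400)

theoretical_pct = (overclocked / stock - 1.0) * 100.0

# Dungeon Siege gains at 7 x 400 MHz over 7 x 266 MHz, from the table above.
observed_pct = {"min": 35.51, "avg": 39.11, "max": 31.60}

# Share of the theoretical clock gain that shows up as frame rate
# (an illustrative ratio, 1.0 would mean perfect scaling).
efficiency = {
    name: round(gain / theoretical_pct, 2) for name, gain in observed_pct.items()
}

print(f"theoretical gain: {theoretical_pct:.1f}%")
print(efficiency)
```

A ratio well below 1.0 suggests something other than the processor clock is limiting the game at that point.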

SW: KOTOR - Endar Spire, 1024 x 768
Setting           Min fps   Avg fps   Max fps
7 x 266 MHz FSB   39        67.99     82.67
7 x 343 MHz FSB   39        82.18     98.33
6 x 400 MHz FSB   41.67     82.5      102
7 x 400 MHz FSB   41.33     87.02     110

Increase   7x343 to 7x266   6x400 to 7x266   7x400 to 7x266   6x400 to 7x343
Min        0.00%            6.84%            5.98%            6.84%
Avg        20.87%           21.34%           27.98%           0.47%
Max        18.95%           23.39%           33.06%           4.44%

The results here are interesting. See how using a 400 MHz FSB, be it at 2.4 GHz or 2.8 GHz, allows us to have higher minimum frame rates than either the 343 MHz or the official 266 MHz FSB. Interesting, but not really significant - the differences are so small you wouldn't notice them in real life. Raw clock speed is still king - increases in average frame rates are around 21 percent at 2.4 GHz and 27.98 percent at 2.8 GHz.

Richard Burns Rally - Harwood Forest, 1024 x 768
Setting           Min fps   Avg fps   Max fps
7 x 266 MHz FSB   126.33    177.73    248
7 x 343 MHz FSB   163       221.91    309.67
6 x 400 MHz FSB   171       234.33    331
7 x 400 MHz FSB   183.67    266.98    376

Increase   7x343 to 7x266   6x400 to 7x266   7x400 to 7x266   6x400 to 7x343
Min        29.02%           35.36%           45.38%           6.33%
Avg        24.86%           31.85%           50.22%           6.99%
Max        24.87%           33.47%           51.61%           8.60%

Wow. This game really responds well to clock increases. Look at how the gains in minimum, average and maximum frame rates stay pretty close to each other. In fact, they are still increasing with higher clocks. One really interesting thing to focus on is that we reached the theoretical performance increase of 50 percent here. However, this game doesn't respond all that well to bandwidth increases. While the increases due to the higher FSB are twice what we saw with SuperPi, they are hardly significant.

F1 Career Challenge - Custom Replay, 1024 x 768
Setting           Min fps   Avg fps   Max fps
7 x 266 MHz FSB   30.67     77        120
7 x 343 MHz FSB   33        86.25     136
6 x 400 MHz FSB   36.33     92.67     144.67
7 x 400 MHz FSB   38        99.3      157

Increase   7x343 to 7x266   6x400 to 7x266   7x400 to 7x266   6x400 to 7x343
Min        7.61%            18.48%           23.91%           10.87%
Avg        12.02%           20.36%           28.96%           8.34%
Max        13.33%           20.56%           30.83%           7.22%

Unfortunately, it looks like AMD's / ATI's current Catalyst drivers, even as far back as the Catalyst 6.11 we used for our Intel P965 motherboard round up, have a performance limiting bug with this game. However, since we're not comparing graphics cards and we use the same drivers and settings throughout, we think the results are still valid for comparing processor and system performance. Notice how minimum frame rates increase with the higher FSB - almost 11 percent. That's significant to some extent, particularly when we see it is at least as much as what we get by going from 2.4 GHz to 2.8 GHz at the same 400 MHz FSB. This game responds very well to bandwidth increases.

LockOn - F15 Demo1, 1024 x 768
Setting           Min fps   Avg fps   Max fps
7 x 266 MHz FSB   32.33     83.07     228.33
7 x 343 MHz FSB   43        103.38    284.67
6 x 400 MHz FSB   42.67     105.93    292
7 x 400 MHz FSB   45.67     120.19    330.33

Increase   7x343 to 7x266   6x400 to 7x266   7x400 to 7x266   6x400 to 7x343
Min        32.99%           31.96%           41.24%           -1.03%
Avg        24.44%           27.52%           44.68%           3.08%
Max        24.67%           27.88%           44.67%           3.21%

From the numbers in the table above, we can see this game is again more influenced by higher processor clocks than by higher FSB clocks or bandwidth. There's practically no difference in minimum frame rates between running at 7 x 343 MHz and 6 x 400 MHz, though there's a 3 percent difference in average and maximum fps. A 3 percent difference is nothing to write home about. On the bright side, look at the performance increase we gained with a 2.8 GHz overclock - more than 40 percent across the board. That is something to write home about.

Brothers In Arms - Chapter 1, 1024 x 768
Setting           Min fps   Avg fps   Max fps
7 x 266 MHz FSB   50.33     76.57     113.67
7 x 343 MHz FSB   53.33     82.49     120
6 x 400 MHz FSB   56.33     86.62     126.67
7 x 400 MHz FSB   56        89.18     130

Increase   7x343 to 7x266   6x400 to 7x266   7x400 to 7x266   6x400 to 7x343
Min        5.96%            11.92%           11.26%           5.96%
Avg        7.74%            13.12%           16.47%           5.38%
Max        5.57%            11.44%           14.37%           5.87%

Traditionally, Brothers in Arms has been very graphics bound, so it's not surprising to see such small increases with higher processor clocks. It is interesting to note that a simple 57 MHz increase in FSB accounts for half of the gain we see at 6 x 400 MHz, compared to 7 x 343 MHz at the same clock. Clearly, the additional memory bandwidth from a higher FSB is more appreciated by the game than higher processor clocks. Too bad the increase isn't really significant, at around 6 percent per 57 / 66 MHz.

Conclusion:

Of the games we tested today, only Homeworld 2, Rome Total War and F1 Career Challenge showed significant increases in frame rates from the higher FSB. It's worth noting that the increases are in minimum frame rates, which should help the gameplay experience tremendously. Most other games exhibit less significant increases, around 3 percent on average. However, most of these games, except for Brothers In Arms, enjoy a good boost from higher processor clocks, around 12 percent on average from 2.4 GHz to 2.8 GHz with the same 400 MHz FSB.
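To show how those three games stand out, here is a small Python sketch that filters the minimum fps gains from the "6x400 to 7x343" column. It only covers the ten games whose increase tables appear above, and the 10 point cutoff for "significant" is our own assumption for illustration, not a threshold used during testing.

```python
# Minimum fps gain from the higher FSB at the same 2.4 GHz clock,
# in percentage points, taken from the "6x400 to 7x343" column.
fsb_min_gain_points = {
    "Homeworld 2": 10.30,
    "Nascar 2003": 5.31,
    "Rome Total War": 15.69,
    "Full Spectrum Warrior": -4.76,
    "Dungeon Siege": 5.61,
    "SW: KOTOR": 6.84,
    "Richard Burns Rally": 6.33,
    "F1 Career Challenge": 10.87,
    "LockOn": -1.03,
    "Brothers In Arms": 5.96,
}

# Assumption: treat a gain of 10 points or more as "significant".
THRESHOLD = 10.0


def bandwidth_sensitive(gains, threshold=THRESHOLD):
    """Return the games at or above the threshold, largest gain first."""
    picked = [game for game, gain in gains.items() if gain >= threshold]
    return sorted(picked, key=lambda game: gains[game], reverse=True)


print(bandwidth_sensitive(fsb_min_gain_points))
```

With this cutoff, the list contains Rome Total War, F1 Career Challenge and Homeworld 2, in that order.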

If we want optimal performance, it looks like we have to settle for a 400 MHz FSB for the time being, at least until a new chipset with better tolerance for high FSBs comes along. But don't take it too hard. At 400 MHz, there are still some options left to improve performance - either use asynchronous memory settings, such as DDR2-1000 or DDR2-1066 modules, or use tighter timings at synchronous settings. However, looking at the performance results, there's really no significant increase in performance to be gained. The only tool left to us that still offers a considerable performance increase at this point is multiplier control, which makes processors with very high multipliers, such as the E4300, very appealing. True, there's the X6800, but the E4300 costs peanuts compared to the X6800 and seems to have the same overclocking potential as the rest of the Core based processors.

However, it's unlikely we can push the multiplier as high as we like without experiencing diminishing returns. Why? Because with higher processor clocks, the need to supply more data - bandwidth is the keyword here - to keep those execution units filled becomes increasingly important. Since running higher FSBs is not an option, we again must choose between running in asynchronous mode and / or using tighter timings. That may help a little, but even together they do not offer much of a bandwidth boost. But of course, that never stops overclockers from trying. After all, that's actually the fun part of overclocking.
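A rough back-of-the-envelope sketch in Python illustrates the point. It assumes a quad-pumped, 64-bit wide FSB, so peak bandwidth is FSB clock x 4 x 8 bytes, and it uses multipliers of 9 and 11 as the stock values for the E4300 and X6800. None of these figures come from our tests, and peak bus bandwidth is only a ceiling, not what the memory subsystem actually delivers.

```python
FSB_TRANSFERS_PER_CLOCK = 4  # assumption: quad-pumped bus
FSB_WIDTH_BYTES = 8          # assumption: 64-bit data bus


def fsb_peak_mb_s(fsb_mhz):
    """Theoretical peak FSB bandwidth in MB/s."""
    return fsb_mhz * FSB_TRANSFERS_PER_CLOCK * FSB_WIDTH_BYTES


def bytes_per_core_cycle(multiplier, fsb_mhz):
    """Peak FSB bytes available per processor clock cycle."""
    core_mhz = multiplier * fsb_mhz
    return fsb_peak_mb_s(fsb_mhz) / core_mhz


FSB_MHZ = 400  # the FSB ceiling we settled on
for multiplier in (6, 7, 9, 11):
    core_ghz = multiplier * FSB_MHZ / 1000
    share = bytes_per_core_cycle(multiplier, FSB_MHZ)
    print(f"{multiplier} x {FSB_MHZ} MHz = {core_ghz:.1f} GHz, "
          f"{share:.2f} bytes per core cycle")
```

With the FSB fixed, every step up in multiplier adds clock speed but leaves the same bus bandwidth to feed it, so the bytes available per processor cycle keep dropping.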
