Last week we presented our first low-level test results for DDR2-533 at an FSB frequency of 266 MHz, which is the proper operating frequency for this memory type. Those tests used the new 3.46 GHz Intel Pentium 4 Extreme Edition processor, which is actually based on the good old Northwood core. The memory system results turned out to be not that impressive, because the Northwood architecture considerably restrains the memory's potential. We therefore concluded that the full potential of DDR2-533 in dual channel mode would only be unveiled with the release of Pentium 4 processors on the Prescott core (which is much more efficient at memory operations) supporting a 266 MHz FSB.
The new revision of Prescott processors will probably not be released in the near future, but why not experiment with the existing processors designed for a 200 MHz FSB? Especially since our lab got hold of a very interesting mainboard, the ECS PF21 Extreme based on the Intel 925XE chipset, which, unlike the Intel D925XECV2, is equipped with overclocking functions. In particular, it allows the FSB frequency to be adjusted over a very wide range, from 200 to 510(!) MHz inclusive. So, let's proceed to our tests!
In our tests we decided to use an unlocked sample of the 3.6 GHz Intel Pentium 4 Prescott, which supports FSB frequency multipliers from 14 to 18 inclusive. Mainboard BIOS settings for the first series of tests: FSB frequency – 200 MHz, memory operating mode – DDR2-533, timings – SPD data (4-4-4-11). To test the memory in synchronous mode, the FSB frequency was set to 266 MHz and the CPU voltage was raised to 1.3875 V. It's important to note that in this case the memory operating mode was set to DDR2-400, which actually means a DRAM:FSB frequency ratio of 1:1, so the memory operated at 266 MHz. The DDR2-533 mode, by contrast, means a DRAM:FSB frequency ratio of 4:3, which would inevitably have resulted in memory failure. Besides, for this case we set the PCI and PCI Express buses to asynchronous mode to ensure their operation at the standard frequencies of 33.3 and 100 MHz respectively.
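The relationship between the BIOS memory mode and the actual DRAM clock can be checked with a few lines of arithmetic. This is a minimal sketch: the 1:1 and 4:3 ratios come from the text above, while the names (`dram_clock_mhz`, `MODES`) and the exact nominal value of 800/3 MHz for the "266 MHz" bus are our assumptions for illustration.

```python
from fractions import Fraction

# DRAM:FSB ratios selected by each BIOS memory mode (as stated in the text).
MODES = {"DDR2-400": Fraction(1, 1), "DDR2-533": Fraction(4, 3)}

FSB_STOCK = Fraction(200)     # MHz, first testbed
FSB_OC = Fraction(800, 3)     # MHz, assumed exact value of the "266 MHz" bus


def dram_clock_mhz(fsb_mhz, ratio):
    """Return the DRAM clock in MHz for an FSB clock and a DRAM:FSB ratio."""
    return float(fsb_mhz * ratio)


for fsb in (FSB_STOCK, FSB_OC):
    for mode, ratio in MODES.items():
        clock = dram_clock_mhz(fsb, ratio)
        # DDR2 transfers data twice per clock, hence the factor of 2.
        print(f"FSB {float(fsb):6.1f} MHz, BIOS mode {mode}: "
              f"DRAM clock {clock:6.1f} MHz (effectively DDR2-{round(clock * 2)})")
```

With the 266 MHz bus, the "DDR2-400" setting therefore runs the modules at their rated DDR2-533 speed, while the "DDR2-533" setting would push them to roughly 356 MHz, far beyond their rating.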
Real memory bandwidth
As usual, low-level memory characteristics (memory bandwidth and latency) were measured in RightMark
Average real memory read bandwidth on the first platform (200 MHz FSB) is 4963 MB/sec, which is approximately 77.5% of the theoretical maximum (set by the maximum FSB throughput of 6.4 GB/sec) and 22% higher than the previous result obtained with Pentium 4 Extreme Edition. The move to the 266 MHz FSB, which lifts this limit on the maximum memory bandwidth, is accompanied by a noticeable gain in the average real memory read bandwidth, up to 5958 MB/sec. The relative gain is rather large: 20%, bearing in mind that the maximum possible gain set by the frequency ratio is 33.3%. As for the average real memory write bandwidth, both the values and the gain from switching to the 266 MHz bus (12.7%) are rather small. Nevertheless, you shouldn't expect any better, given the significant impact of CPU cache write peculiarities on the test results. Besides, you shouldn't forget that these are average values. They reflect the real memory bandwidth only indirectly, being limited by many other factors.
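The percentages quoted for the average read bandwidth can be reproduced as follows. This is only a sketch of the arithmetic: the measured values are taken from the text, the peak throughput assumes the Pentium 4 bus is 64 bits wide and transfers data four times per clock (which yields the 6.4 GB/sec figure above), and the function names are ours.

```python
def fsb_peak_mb_s(fsb_mhz, transfers_per_clock=4, bus_bytes=8):
    """Theoretical FSB throughput in MB/sec (1 MB = 10**6 bytes)."""
    return fsb_mhz * transfers_per_clock * bus_bytes


def percent_gain(old, new):
    """Relative change from old to new, in percent."""
    return (new / old - 1) * 100


AVG_READ_200 = 4963   # MB/sec, measured at 200 MHz FSB
AVG_READ_266 = 5958   # MB/sec, measured at 266 MHz FSB

efficiency = AVG_READ_200 / fsb_peak_mb_s(200) * 100
read_gain = percent_gain(AVG_READ_200, AVG_READ_266)
max_gain = percent_gain(200, 800 / 3)   # 266 MHz taken as 800/3 MHz

print(f"FSB peak at 200 MHz:   {fsb_peak_mb_s(200)} MB/sec")
print(f"Average read vs peak:  {efficiency:.1f}%")
print(f"Average read gain:     {read_gain:.1f}%")
print(f"Maximum possible gain: {max_gain:.1f}%")
```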
Maximum real DDR2-533 memory bandwidth,
Intel Pentium 4 Prescott, 266 MHz FSB
Let's proceed to the analysis of maximum memory bandwidth values. The use of a processor based on the Prescott core has, as expected, dotted all the i's. Maximum real memory read bandwidth on the first testbed is 6412 MB/sec, rigidly limited by the FSB throughput. The corresponding value on the second testbed is 8247 MB/sec(!). It hardly needs comment: a 28.6% relative gain, with the value itself reaching 96.6% of the theoretical maximum (8.53 GB/sec). So DDR2-533 read efficiency in dual channel mode, given correctly organized read operations (as implemented in the Prescott core), is quite high!
The potential of the DDR2-533 memory bandwidth also shows in the maximum real memory write bandwidth, which, according to our numerous tests, is rigidly limited to 2/3 of the theoretical FSB throughput. This parameter gains 32.8% when the FSB frequency is increased from 200 to 266 MHz. Note that an almost identical result was obtained a week ago, when we tested the same memory with the same chipset but with a different processor, the Pentium 4 Extreme Edition.
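To put the maximum figures in perspective, the sketch below computes the 2/3 write ceiling for both bus speeds and shows how much of the 33.3% frequency increase each operation actually realizes. The read values and the 32.8% write gain are taken from the text; the measured write values themselves are not quoted here, so only the ceilings are derived. The function names and the quad-pumped, 8-byte bus model are our assumptions.

```python
def fsb_peak_mb_s(fsb_mhz):
    """Theoretical throughput of a quad-pumped, 8-byte-wide FSB in MB/sec."""
    return fsb_mhz * 4 * 8


def write_ceiling_mb_s(fsb_mhz):
    """Write limit of 2/3 of the FSB throughput, as observed in the tests."""
    return fsb_peak_mb_s(fsb_mhz) * 2 / 3


FSB_200, FSB_266 = 200, 800 / 3         # MHz; 266 MHz taken as 800/3
MAX_READ_200, MAX_READ_266 = 6412, 8247  # MB/sec, measured
WRITE_GAIN = 32.8                        # percent, measured

freq_gain = (FSB_266 / FSB_200 - 1) * 100
read_gain = (MAX_READ_266 / MAX_READ_200 - 1) * 100
read_eff = MAX_READ_266 / fsb_peak_mb_s(FSB_266) * 100

print(f"Write ceiling at 200 MHz: {write_ceiling_mb_s(FSB_200):.0f} MB/sec")
print(f"Write ceiling at 266 MHz: {write_ceiling_mb_s(FSB_266):.0f} MB/sec")
print(f"Max read gain: {read_gain:.1f}% ({read_gain / freq_gain:.0%} of the frequency gain)")
print(f"Max write gain: {WRITE_GAIN:.1f}% ({WRITE_GAIN / freq_gain:.0%} of the frequency gain)")
print(f"Max read at 266 MHz vs peak: {read_eff:.1f}%")
```

Under these assumptions the write bandwidth scales almost perfectly with bus frequency, while reads realize somewhat less of the increase because they are already close to the theoretical maximum.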
The procedures for measuring latency on Pentium 4 processors were devised, justified, and described in detail earlier, so we shall only outline them here: the latency test walks a relatively large memory block (16 MB) in pseudo-random (as well as random) mode at 128-byte steps (the "effective" L2 cache line size, a consequence of the hardware prefetch of an adjacent line from memory into cache in all walk modes).
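The difference between the two walk modes can be illustrated by generating the access order for each. This is a hedged sketch, not the benchmark's actual code: the 16 MB block and 128-byte step come from the text, but the 4 KB page size and the interpretation of "pseudo-random" as "pages visited in order, lines within each page visited randomly" are our assumptions, and all names are hypothetical. Python is far too slow to measure latency itself, so the code only builds and compares the patterns.

```python
import random

BLOCK_BYTES = 16 * 1024 * 1024  # block size from the text
STEP_BYTES = 128                # "effective" L2 line size from the text
PAGE_BYTES = 4096               # assumption: 4 KB memory pages


def random_walk_offsets(seed=0):
    """Visit every 128-byte line of the block once, in fully random order."""
    offsets = list(range(0, BLOCK_BYTES, STEP_BYTES))
    random.Random(seed).shuffle(offsets)
    return offsets


def pseudo_random_walk_offsets(seed=0):
    """Visit pages sequentially, shuffling the lines inside each page (assumed)."""
    rng = random.Random(seed)
    offsets = []
    for page_start in range(0, BLOCK_BYTES, PAGE_BYTES):
        in_page = list(range(page_start, page_start + PAGE_BYTES, STEP_BYTES))
        rng.shuffle(in_page)
        offsets.extend(in_page)
    return offsets


def page_switches(offsets):
    """Count how often consecutive accesses land on different pages."""
    pages = [off // PAGE_BYTES for off in offsets]
    return sum(1 for a, b in zip(pages, pages[1:]) if a != b)


print("lines per walk:          ", len(random_walk_offsets()))
print("page switches (pseudo):  ", page_switches(pseudo_random_walk_offsets()))
print("page switches (random):  ", page_switches(random_walk_offsets()))
```

Under these assumptions the fully random walk changes pages on almost every access, which is consistent with the remark that D-TLB misses weigh heavily on random walk results.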
DDR2-533 latency (pseudo-random and random walk),
Intel Pentium 4 Prescott, 266 MHz FSB
Average latency of the pseudo-random memory walk (obtained without offloading the bus by inserting dummy instructions) on the first testbed, where the memory operates in asynchronous mode, is 48.3 ns, which is almost 40% lower than the value obtained with Pentium 4 XE. As the bus is gradually offloaded, the spread of latency values is only 7 ns (from 47.5 to 54.3 ns; with Pentium 4 XE the spread is a full 40 ns!). Switching the memory to synchronous mode (Testbed #2) has quite a positive effect on the latency: it drops by approximately 7 ns in all cases. Random access walk shows a still stronger reduction of memory latency (by 11-12 ns), with the spread of values remaining at 17 ns. Note, however, that random access latency values are much less interesting from the point of view of real memory latency, because processor D-TLB misses considerably influence the test results.
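For reference, the pseudo-random walk figures above translate into relative terms as follows. This is a small sketch using only values quoted in the text; the variable names are ours.

```python
AVG_LATENCY_ASYNC_NS = 48.3          # average latency, first testbed
REDUCTION_NS = 7.0                   # approximate reduction in synchronous mode
MIN_LATENCY_NS, MAX_LATENCY_NS = 47.5, 54.3  # range while offloading the bus

reduction_pct = REDUCTION_NS / AVG_LATENCY_ASYNC_NS * 100
spread_ns = MAX_LATENCY_NS - MIN_LATENCY_NS

print(f"Latency reduction in synchronous mode: {reduction_pct:.1f}%")
print(f"Latency spread on the first testbed:   {spread_ns:.1f} ns")
```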
Our assumption put forward a week ago proved absolutely true. As soon as we took a processor core that is more advanced at memory operations (implementation of the BIU, hardware and software prefetch), everything fell into place. The FSB frequency increase from 200 MHz to 266 MHz (by 33.3%) is accompanied by a comparable gain in maximum real memory bandwidth (28.6% in read operations and 32.8% in write operations). An extra bonus is the noticeable reduction of memory latency (by 7 ns, approximately 14.5% of the latency value), which is definitely caused by the switch from asynchronous to synchronous memory operating mode. So from now on we can state: the use of DDR2-533 memory in dual channel mode together with Intel Pentium 4 processors on the Prescott core at an FSB frequency of 266 MHz is completely justified. Though the future revision of Pentium 4 Prescott processors is expected to increase the L2 cache to 2 MB, the results of our tests can be considered final (you can apply them to future CPU models as well), because L2 cache capacity has practically no impact on low-level memory characteristics. We only have to wait for the official announcement and deliveries of Pentium 4 Prescott processors supporting the 266 MHz FSB.
Dmitry Besedin (firstname.lastname@example.org)
November 12, 2004