Dual Channel DDR2-800 Memory on AMD Athlon 64 X2 "AM2". Part 2: AMD Athlon 64 FX-62 Processor
Our first review of the integrated memory controller in the new revision of AMD Athlon 64 X2/FX processors, designed for DDR2 memory (from DDR2-400 to DDR2-800), was published quite recently and showed mediocre memory system results on the new AMD platform. We assumed this was due partly to the narrow 64-bit bidirectional bus between the L1 and L2 D-Caches (its theoretical bandwidth depends directly on the core clock) and partly to the core clock itself. We demonstrated that the bandwidth of modern high-speed dual-channel DDR2 memory is comparable to the data transfer rates inside the processor. Within the current architecture (still K8, even if with dual cores), the latter can be increased only by raising the core clock. This improves the performance of both cache levels of the processor, L1 and L2. An additional bonus is an equal (n-fold) increase in the memory controller frequency, which should have a good effect on the last memory level, the memory itself.
Today we run our tests with a different processor: the Athlon 64 X2 4000+ (2.0 GHz core) used in the previous review is replaced with the new FX-series Athlon 64 FX-62 (2.8 GHz, 1.4 times higher), while the other platform components remain nearly the same. So let's see whether the 40% increase in the core clock (the only real difference between the Athlon 64 X2 4000+ and the FX-62) brings a comparable gain of about 40% in the main property of the memory system, its maximum real bandwidth.
Testbed configurations
Testbed 1
- CPU: AMD Athlon 64 X2 4000+ (Core Revision F), Socket AM2
- Chipset: NVIDIA nForce 570 SLI
- Motherboard: MSI K9N SLI Platinum
- Memory: 2x1024 MB Corsair XMS2 PRO PC2-6400 DDR2-800 (5-5-5-12)
Testbed 2
- CPU: AMD Athlon 64 FX-62 (Core Revision F), Socket AM2
- Chipset: NVIDIA nForce 570 SLI
- Motherboard: MSI K9N SLI Platinum
- Memory: 2x1024 MB Kingston HyperX PC2-6400 DDR2-800 (4-4-4-12)
Test Results
We decided to test dual-channel DDR2-800 on the new AMD Athlon 64 FX-62 platform (Testbed 2) in the two fastest modes: DDR2-667 (the expected real memory frequency is 311 MHz, that is, 622 MHz in DDR terms) and DDR2-800 (the expected real memory frequency matches the nominal value, 400 MHz). Unlike the previous review, each of these modes is tested with two timing schemes: 5-5-5-12 (identical to the previous tests) and 4-4-4-12 (the nominal scheme for Kingston DDR2-800). The table below also contains the results of the previous tests of Corsair DDR2-800 in DDR2-667 and DDR2-800 modes with the Athlon 64 X2 4000+ (Testbed 1).
| Parameter / Mode | Testbed 1, DDR2-667 | Testbed 1, DDR2-800 | Testbed 2, DDR2-667 | Testbed 2, DDR2-667 | Testbed 2, DDR2-800 | Testbed 2, DDR2-800 |
|---|---|---|---|---|---|---|
| Timings | 5-5-5-12 | 5-5-5-12 | 5-5-5-12 | 4-4-4-12 | 5-5-5-12 | 4-4-4-12 |
| Theoretical memory bandwidth, MB/s | 10667 | 12800 | 9955* | 9955* | 12800 | 12800 |
| Average memory read bandwidth, MB/s | 3368 | 3590 | 3693 | 3883 | 4137 | 4393 |
| Average memory write bandwidth, MB/s | 2759 | 2909 | 3202 | 3311 | 3514 | 3760 |
| Max. memory read bandwidth, MB/s | 6590 (61.8 %) | 6819 (53.3 %) | 7405 (74.4 %) | 7876 (79.1 %) | 8382 (65.5 %) | 8777 (68.6 %) |
| Max. memory write bandwidth, MB/s | 5758 (54.0 %) | 5790 (45.2 %) | 7874 (79.1 %) | 7999 (80.4 %) | 8039 (62.8 %) | 8070 (63.0 %) |
| Minimum pseudo-random access latency, ns | 31.8 | 28.8 | 30.7 | 28.6 | 27.4 | 24.7 |
| Maximum pseudo-random access latency, ns | 35.1 | 31.9 | 34.0 | 31.9 | 30.4 | 27.6 |
| Minimum random access latency**, ns | 96.3 | 85.3 | 91.6 | 88.0 | 76.2 | 74.2 |
| Maximum random access latency**, ns | 99.5 | 88.5 | 97.3 | 91.7 | 78.8 | 76.4 |
* real memory frequency: 311 MHz (2800/9)
** 32 MB block size
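The theoretical bandwidth figures in the table can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: it assumes that the memory clock is derived from the core clock through an integer divider rounded up (which is consistent with the 2800/9 value in the footnote), and that each of the two channels is 64 bits wide. The function names are ours and do not come from any test tool.

```python
import math
from fractions import Fraction


def k8_memory_clock_mhz(core_mhz, target_mem_mhz):
    """Real memory clock, ASSUMING an integer core-clock divider rounded up."""
    divider = math.ceil(Fraction(core_mhz) / Fraction(target_mem_mhz))
    return Fraction(core_mhz) / divider


def dual_channel_bandwidth_mb_s(mem_mhz):
    """Theoretical bandwidth of dual-channel DDR memory in MB/s."""
    # 2 transfers per clock (DDR) x 8 bytes per 64-bit channel x 2 channels
    return float(mem_mhz * 2 * 8 * 2)


DDR2_667 = Fraction(1000, 3)  # nominal 333.33 MHz real clock
DDR2_800 = Fraction(400)      # nominal 400 MHz real clock

for name, core in (("Athlon 64 X2 4000+", 2000), ("Athlon 64 FX-62", 2800)):
    for mode, target in (("DDR2-667", DDR2_667), ("DDR2-800", DDR2_800)):
        clock = k8_memory_clock_mhz(core, target)
        bandwidth = dual_channel_bandwidth_mb_s(clock)
        print(f"{name}, {mode}: {float(clock):.1f} MHz, {bandwidth:.1f} MB/s")
```

Under these assumptions, the 2.0 GHz core divides evenly into both memory modes, while the 2.8 GHz core in DDR2-667 mode ends up with a divider of 9, a real clock of about 311.1 MHz and about 9955.6 MB/s, which the table lists as 9955.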
It is plain to see that a higher core clock really improves memory performance. Strange as it may seem, the maximum real memory bandwidth in DDR2-667 mode is demonstrated when writing data into memory (we saw a similar situation in the previous review in such low modes as DDR2-400, but not in DDR2-667): it reaches approximately 7.9 GB/s. Interestingly, the tighter timing scheme (4-4-4-12 versus 5-5-5-12) results in a noticeably higher memory read bandwidth (up from 7.4 to 7.9 GB/s). According to our multiple reviews of DDR2 memory, there is almost no such effect on the Intel platform, so the differences in the memory controllers and memory system architecture are obvious.
So, we've managed to reach the maximum real memory bandwidth in DDR2-667 mode with the Athlon 64 FX-62: nearly 80% of the theoretical bandwidth of DDR2 memory operating at 311 MHz. Compared with the 2 GHz core of the Athlon 64 X2 4000+, the maximum memory bandwidth is approximately 1.17 times higher (the core clock ratio is 1.4). Thus, increasing the core clock evidently contributes to increasing the real memory bandwidth. But the proportionality factor between the increase in memory bandwidth and the increase in the core clock is obviously less than one (to be more exact, it's 1.17/1.4 = 0.84). Besides, 80% of the DDR2-667 potential, though quite impressive, is still worse than the results demonstrated long ago by the Intel platform with a 266 MHz FSB, which easily reaches a real memory bandwidth practically identical to the theoretical FSB bandwidth (8.53 GB/s).
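The two figures used in this comparison are easy to check. The sketch below recomputes the proportionality factor from the ratios quoted in the text, and the theoretical bandwidth of a 266 MHz FSB under the assumption that the bus is 64 bits wide and quad-pumped (four transfers per clock). The function names are hypothetical and serve only as an illustration.

```python
def scaling_factor(bandwidth_ratio, clock_ratio):
    """How much of the core clock gain shows up as a memory bandwidth gain."""
    return bandwidth_ratio / clock_ratio


def fsb_bandwidth_gb_s(fsb_mhz, transfers_per_clock=4, bus_bytes=8):
    """Theoretical FSB bandwidth; quad-pumped 64-bit bus is an ASSUMPTION."""
    return fsb_mhz * transfers_per_clock * bus_bytes / 1000


clock_ratio = 2.8 / 2.0  # FX-62 versus X2 4000+ core clock
print(f"proportionality factor: {scaling_factor(1.17, clock_ratio):.2f}")
print(f"266 MHz FSB bandwidth: {fsb_bandwidth_gb_s(800 / 3):.2f} GB/s")
```

A factor of 1.0 would mean that memory bandwidth scales in step with the core clock, so anything below that indicates that part of the clock gain is lost elsewhere in the memory path.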
Let's proceed to DDR2-800 mode. On the one hand, the potential is realized to a much lesser degree (considering how large this potential is). The maximum real memory bandwidth is approximately 8.4-8.8 GB/s for reading and 7.9-8.0 GB/s for writing (reading is ahead of writing again: on this platform, the point at which reading overtakes writing is shifted toward higher memory frequencies). That's just 63-69% of the theoretical memory bandwidth (12.8 GB/s), and not much higher than the real memory bandwidth on the Intel platform (8.53 GB/s). So we cannot speak of an evident bandwidth advantage of the new integrated AMD controller over the traditional bus-based memory architecture of the Intel platform. On the other hand, in this case the memory bandwidth gain relative to the first platform reviewed is more noticeable: it is 1.28 times as high, which is closer to the 1.4-fold increase in the core clock. Just a trifle, but it's still nice.
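For reference, the percentages for DDR2-800 mode can be recomputed from the maximum bandwidth values in the table for Testbed 2. This is a minimal sketch; the dictionary layout and the function name are our own choices for illustration.

```python
THEORETICAL_MB_S = 12800  # dual-channel DDR2-800

# Maximum real bandwidth on Testbed 2 in DDR2-800 mode, MB/s (from the table)
max_bandwidth_mb_s = {
    ("read", "5-5-5-12"): 8382,
    ("read", "4-4-4-12"): 8777,
    ("write", "5-5-5-12"): 8039,
    ("write", "4-4-4-12"): 8070,
}


def efficiency_percent(real_mb_s, theoretical_mb_s):
    """Real bandwidth as a percentage of the theoretical maximum."""
    return 100.0 * real_mb_s / theoretical_mb_s


for (operation, timings), real in max_bandwidth_mb_s.items():
    share = efficiency_percent(real, THEORETICAL_MB_S)
    print(f"{operation:5s} {timings}: {share:.1f} %")
```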
And finally, a few words about random and pseudo-random memory access latencies. As the table shows, in DDR2-667 mode with identical timings they differ little from the results of the previous review. Reducing the timings to 4-4-4-12 results in lower latencies in all cases (especially pronounced in random access), which agrees with the results of multiple DDR2 tests on the Intel platform. At the same time, random access latencies in DDR2-800 mode are much lower to begin with (with 5-5-5-12 timings), and their further reduction with 4-4-4-12 timings is quite insignificant. Thus, the clock increase in the memory controller (in this case the key role is played by the clock of the memory controller, not of the core) contributes to the reduction of memory access latencies, which can be considered an additional advantage. As a result, the advantage of the integrated AMD controller over the chipset-based one from Intel (in absolute latency terms) becomes even more noticeable.
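To put numbers on these observations, the relative changes in minimum random access latency on Testbed 2 can be computed directly from the table. The sketch below is only an illustration; the helper name is ours.

```python
# Minimum random access latency on Testbed 2, ns (from the table)
min_random_latency_ns = {
    ("DDR2-667", "5-5-5-12"): 91.6,
    ("DDR2-667", "4-4-4-12"): 88.0,
    ("DDR2-800", "5-5-5-12"): 76.2,
    ("DDR2-800", "4-4-4-12"): 74.2,
}


def reduction_percent(before_ns, after_ns):
    """Relative latency reduction, in percent of the initial value."""
    return 100.0 * (before_ns - after_ns) / before_ns


lat = min_random_latency_ns
tighter_667 = reduction_percent(lat["DDR2-667", "5-5-5-12"], lat["DDR2-667", "4-4-4-12"])
tighter_800 = reduction_percent(lat["DDR2-800", "5-5-5-12"], lat["DDR2-800", "4-4-4-12"])
faster_mode = reduction_percent(lat["DDR2-667", "5-5-5-12"], lat["DDR2-800", "5-5-5-12"])

print(f"4-4-4-12 instead of 5-5-5-12, DDR2-667: {tighter_667:.1f} % lower")
print(f"4-4-4-12 instead of 5-5-5-12, DDR2-800: {tighter_800:.1f} % lower")
print(f"DDR2-800 instead of DDR2-667, 5-5-5-12: {faster_mode:.1f} % lower")
```

With these values, switching from DDR2-667 to DDR2-800 mode lowers the latency several times more than tightening the timings does in either mode.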
Conclusion
Today's analysis of dual-channel DDR2-800 memory with the Athlon 64 FX-62, compared with the previous analysis with the Athlon 64 X2 4000+, reveals that realizing the real potential (bandwidth) of DDR2 memory on this platform depends heavily on the core clock (and the memory controller frequency). At the same time, the relative gain in memory bandwidth is less pronounced than the relative gain in the core clock. Absolute memory bandwidth values for the 2.8 GHz core are not that impressive either: they only slightly exceed the level reached long ago by the Intel platform with a 266 MHz FSB (8.53 GB/s). They are still very far from the theoretical maximum of 12.8 GB/s for dual-channel DDR2-800 memory, to say nothing of higher-clocked DDR2 and especially of the future DDR3. Thus, the potential of high-speed DDR2 memory on the AMD platform is not fully realized yet. A radical way out of this situation may be an updated core architecture. For now, we can only hope that two cores are better than one, that is, that the high performance potential of DDR2 memory may really come in handy when memory is used intensively by both cores simultaneously (but such tasks are extremely rare in real life). We'll try to get back to this topic in our future reviews.