nForce2: single-channel mode vs. dual-channel mode

Some time ago we published a review of NVIDIA's chipset in which we touched on some common prejudices, but once the nForce2 hit the stores the number of rumors rose sharply. That is why we decided to investigate and clear things up, although attentive readers will find nothing new here. The article is also written in deliberately simple language and avoids terms such as memory banks.

The truth about the nForce2 chipset

The first and most curious question often raised in forums is: will video cards based on ATI chips work on NVIDIA's chipset? The only question that could be more pressing is whether Intel processors will work on AMD chipsets (or vice versa); nobody is skeptical about VIA chipsets anymore. There is not much to comment on: ATI graphics cards work flawlessly on mainboards based on NVIDIA chipsets, just as NVIDIA graphics cards do on mainboards based on ATI chipsets (speed, however, is another story...).

The next question concerns the dual-channel memory controller of the nForce and nForce2. Note that dual-channel mode uses ordinary memory modules, and the available memory equals the total size of the installed modules, not half of it (this holds for all dual-channel chipsets).

What is needed for the dual-channel mode of the nForce and nForce2? At least two memory modules, one of which must go into the slot that is set apart from the other two: DIMM0 on boards whose remaining slots are labeled DIMM1/2, or DIMM1 on boards whose remaining slots are labeled DIMM2/3. If the modules are installed incorrectly, the second memory controller might not operate. The modules can differ (and so can their manufacturers), but their speed ratings should be the same, otherwise the system will run at the speed of the slower one (for example, at DDR266 with one DDR266 and one DDR400 module). Three modules can be used as well, but remember that dual-channel access covers only an amount of memory equal to twice the size of the first module (the one in the lowest-numbered slot), and only if the first module is no larger than the second and third modules combined. Obviously, there are two optimal cases: identical modules in the first and second slots, or three modules with the first equal in size to the other two combined.
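The installation rules above can be condensed into a few lines of Python. This is only a sketch of the rules as stated in this article, not of the chipset's actual logic: `Module` and `describe_config` are hypothetical names, and the case of a first module larger than the other two combined is an assumption made for illustration (the dual-channel region is then taken as twice the smaller side).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Module:
    size_mb: int   # module capacity in MB
    ddr_rate: int  # DDR rating, e.g. 266, 333 or 400


def describe_config(first: Module, others: List[Module]) -> dict:
    """Summarize a layout; `first` sits in the lowest-numbered slot."""
    modules = [first] + others
    total_mb = sum(m.size_mb for m in modules)
    # All modules run at the speed of the slowest one.
    ddr_rate = min(m.ddr_rate for m in modules)
    others_mb = sum(m.size_mb for m in others)
    # Rule from the text: the dual-channel region is twice the first module,
    # provided the first module does not exceed the others combined.
    # ASSUMPTION: otherwise it is twice the combined size of the others.
    dual_mb = 2 * min(first.size_mb, others_mb)
    return {
        "total_mb": total_mb,
        "ddr_rate": ddr_rate,
        "dual_channel_mb": dual_mb,
        "single_channel_mb": total_mb - dual_mb,
    }


if __name__ == "__main__":
    # One DDR266 and one DDR400 module of 256 MB each.
    print(describe_config(Module(256, 266), [Module(256, 400)]))
    # Three modules, the first equal to the sum of the other two.
    print(describe_config(Module(512, 333), [Module(256, 333), Module(256, 333)]))
```

In both sample layouts all of the memory falls into the dual-channel region; with three equal modules, one module's worth would be left to single-channel access.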

Now comes the most interesting part. Phrases like "dual-channel nForce2" and "fastest Socket A chipset nForce2" are heard quite often. Both are absolutely correct, but that does not mean one follows from the other. As you know, the EV6 FSB used by Socket A is limited to about 2100 MB/s at 133 MHz and about 2700 MB/s at 166 MHz. Data cannot reach the processor faster than the bus delivers it, so the processor does not need memory faster than DDR266 in the first case and DDR333 in the second. These figures are already reached in single-channel mode, so is the dual-channel mode, with its excess throughput, needed at all?
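The bus figures quoted above follow from simple arithmetic. A minimal sketch, assuming a 64-bit data path that transfers twice per clock for both the EV6 FSB and a single DDR channel (the function name and its parameters are illustrative only):

```python
def peak_bandwidth_mb_s(clock_mhz: float, width_bits: int = 64,
                        transfers_per_clock: int = 2,
                        channels: int = 1) -> float:
    """Theoretical peak throughput in MB/s (1 MB = 10**6 bytes)."""
    return clock_mhz * transfers_per_clock * (width_bits / 8) * channels


if __name__ == "__main__":
    # 400/3 and 500/3 are the exact clocks behind "133 MHz" and "166 MHz".
    for name, clock in (("133 MHz", 400 / 3), ("166 MHz", 500 / 3)):
        single = peak_bandwidth_mb_s(clock)
        dual = peak_bandwidth_mb_s(clock, channels=2)
        print(f"{name}: FSB or one DDR channel {single:.0f} MB/s, "
              f"two DDR channels {dual:.0f} MB/s")
```

The exact values come out near 2133 and 2667 MB/s, which are usually rounded to 2100 and 2700. A second channel doubles the figure on the memory side only, while the FSB stays where it was.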

In theory it can help when random memory access prevails (dual-channel operation can reduce the latency of the next random access, but this rarely happens in real life) and when devices other than the processor read from or write to memory. The two most important kinds are video cards and IDE DMA devices (plus devices on the CSA bus in the Intel 875P/865x chipsets).

But we cannot measure the pure benefit of the dual-channel memory controller, because NVIDIA uses one more technology in its chipsets, namely DASP. This unit, which in effect serves as a buffer for the memory controller, has advanced data prediction and prefetch capabilities and can bring some gain on the nForce and especially on the nForce2, which carries a redesigned and improved version.

So today we will try to find out how the single-channel and dual-channel modes of the nForce2 differ, i.e. how much this factor affects speed and how much is contributed by the others, including DASP.

Performance

Testbed:

Processor: AMD Athlon XP 2700+ (13x166 MHz = 2167 MHz), Socket 462
Mainboards:

MSI K7N2G-LISR (BIOS 1.1) on NVIDIA nForce2-GT
ABIT AT7-MAX2 (BIOS DC) on VIA KT400

Memory: 2x256 MB PC3200(DDR400) DDR SDRAM DIMM Winbond, CL 2.5 (used as DDR333 with CL 2)
External video cards:

Palit Daytona GeForce4 Ti 4600
Inno3D GeForce4 MX 440

Hard Drive: IBM IC35L040AVER07-0, 7200 rpm

Software:

OS and drivers:

Windows XP Professional SP1
DirectX 8.1b
NVIDIA nForce UDP 2.03
VIA 4-in-1 4.45
NVIDIA Detonator XP 40.72 (VSync=Off)

Test applications:

Cachemem 2.4MMX
Wstream
VirtualDub 1.4.10 + DivX codec 5.02 Pro
WinAce 2.2
SPECviewperf 7.0
MadOnion 3DMark2001 SE build 330
Gray Matter Studios & Nerve Software Return to Castle Wolfenstein v1.1
Croteam/GodGames Serious Sam: The Second Encounter v1.07

We ran the processor on the 166 MHz bus, with the memory working at the same frequency on both chipsets at minimal timings (remember that DDR400 with CL 2.5 lowers the performance of the KT400). Since the dual-channel memory controller was originally intended to speed up the chipset's integrated graphics, we tested that mode as well. We also compared the integrated graphics of the nForce2-G (marked IGP on the diagrams) in games against an equivalent external video card based on the GeForce4 MX 440. We are not going to compare the nForce2 and the KT400 directly; the figures for VIA's chipset are given only for reference.

Test results

In the straightforward Cachemem read and write tests the nForce2 results almost coincide in all modes, except for single-channel mode with integrated graphics; in Wstream copying, however, dual-channel mode gains about 20%. This advantage comes precisely from the higher memory throughput of dual-channel mode, but will it carry over to real applications? The KT400 is slower than the single-channel nForce2 in all of these tests except copying, where the two are on a par.

MPEG4 video encoding and archiving with a 4 MB dictionary are the best tests for the memory/processor tandem. The pattern is similar: single-channel mode with integrated video yields rather poor results, while dual-channel mode gives a gain of just 2-3%, and the remaining bandwidth reserve is enough to offset the load of the integrated video core.

We can see that dual-channel support does not really accelerate processor tasks, and the advantage of the nForce2 over the KT400 in these tests is explained by other architectural features of NVIDIA's chipset. Dual-channel mode suits the integrated graphics of the nForce2-G best, preventing a performance drop relative to the discrete chipset version (or an external video card).

A professional benchmark package for 3D accelerators lets us test one more type of interaction between components. The nForce2 IGP has no chance of catching up with the GeForce4 Ti 4600, since it is held back by its own 3D performance, and the effect of dual-channel support here resembles the synthetic memory tests, i.e. it does not carry over to real life.

Now, to understand the results, let's look at what the SPECviewperf tests do. They all make heavy use of the video card's 3D functions, while the processor only has to compute basic and scene parameters quickly and push large volumes of data through AGP to the video accelerator. This differs greatly from games, where the processor has many other things to do and its load cannot be directly compared with the video card's. For example, at low resolutions and low graphics quality a game is limited by the CPU while the video card renders frames quickly; at high resolutions and high image quality the video accelerator takes its time drawing each frame while the CPU finishes its part much sooner. Obviously, somewhere in between lies a situation similar to SPECviewperf, which we are now going to analyze.

Dual-channel mode now looks much better than single-channel mode in three tests out of six. In the other tests all the systems with the GeForce4 Ti 4600 are about equal. The lead of the dual-channel nForce2 is not huge, but 16%, 19% and 25% (in dx-07, proe-01 and drv-08 respectively) is considerable, and the largest of these even exceeds what the synthetic Wstream test showed. Probably this is a case of ideal balance between the processor and the video accelerator, when both do their work at the same speed, and the extra memory throughput lets the memory controller saturate the FSB while simultaneously transferring data via AGP to the video card. By the way, in the test with the largest difference (drv-08) the KT400 outscores the single-channel nForce2 for the first time, and if we compare the dual-channel nForce2 with the KT400 there, the difference is 15-19%, as in the other two tests. So the single-channel result there may be an artifact of the NVIDIA chipset's performance.

The games hold nothing interesting or unexpected. With the external card, in every combination of resolution and graphics settings the advantage of dual-channel mode does not reach even 4%; the gain is largest at the lightest settings, i.e. where the processor/memory tandem can be accelerated. Otherwise, two channels give the integrated graphics a big boost, though even then the IGP is no match for the external GeForce4 MX 440.

Now let's bring the main data together in one table (please tell us what you think of this way of presenting the data):

Dual-channel mode vs. single-channel mode, gain in %	With the external video card (GF4 Ti 4600)	With the integrated graphics core (IGP)
memory reading	+0.4	+4.4
memory writing	+2.3	+15.5
memory copying	+20.9	+31.5
MPEG4	+2.2	+6.7
WinAce	+2.9	+8.4
drv-08	+24.7	+55.0
dx-07	+15.9	+65.8
3dsmax-01	+1.8	+13.5
light-05	+4.6	+45.0
proe-01	+19.0	+24.6
3DMark2001	+1.7	+65.0
RtCW(Fast)	+3.5	+42.4
RtCW(High)	+1.1	+76.1
SS2(Speed)	+2.8	+40.6
SS2(Normal)	+1.8	+67.1
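The table can also be summarized programmatically. The sketch below copies the figures from the table as they stand and reports the median gain per column plus the tests where the external-card gain is large; the 10% cutoff and the names `GAINS` and `summarize` are arbitrary choices made for illustration.

```python
from statistics import median

# (test, gain with GF4 Ti 4600, gain with IGP), in percent, from the table.
GAINS = [
    ("memory read", 0.4, 4.4),
    ("memory write", 2.3, 15.5),
    ("memory copy", 20.9, 31.5),
    ("MPEG4", 2.2, 6.7),
    ("WinAce", 2.9, 8.4),
    ("drv-08", 24.7, 55.0),
    ("dx-07", 15.9, 65.8),
    ("3dsmax-01", 1.8, 13.5),
    ("light-05", 4.6, 45.0),
    ("proe-01", 19.0, 24.6),
    ("3DMark2001", 1.7, 65.0),
    ("RtCW(Fast)", 3.5, 42.4),
    ("RtCW(High)", 1.1, 76.1),
    ("SS2(Speed)", 2.8, 40.6),
    ("SS2(Normal)", 1.8, 67.1),
]


def summarize(rows, big_gain=10.0):
    """Median gain per column and the tests with a large external-card gain."""
    return {
        "median_external": median(r[1] for r in rows),
        "median_igp": median(r[2] for r in rows),
        # big_gain is an illustrative threshold, not a value from the article.
        "big_external": [r[0] for r in rows if r[1] > big_gain],
    }


if __name__ == "__main__":
    for key, value in summarize(GAINS).items():
        print(key, value)
```

With these figures the median gain is 2.8% with the external card and 40.6% with the IGP, and only the copy test and three SPECviewperf tests exceed the cutoff with the external card.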

Conclusion

First of all, the dual-channel mode of the nForce2 always has a positive effect, and since it costs practically nothing (it needs two smaller memory modules instead of one big one), a performance gain of 2-4% is quite achievable.

Secondly, dual-channel support is not the weapon that won NVIDIA the chipset battle. The VIA KT400, an excellent chipset, loses to the single-channel nForce2 under the same conditions just as it does to the dual-channel one (with a single exception), so the Californian company owes its victory to other components of the system logic, including the unique DASP.

If you are going to use the integrated graphics of the nForce2-G/T chipset, you should definitely use the dual-channel mode, since otherwise the speed penalty can range from 10% to 70%. Also remember that the graphics core of the nForce2-G is said to correspond to the GF4 MX440-8x chip, but in practice the lower clock speeds of the IGP core and of the system memory yield much worse results in 3D graphics tests.

Dmitry Mayorov (destrax@ixbt.com)
Sergei Pikalov (peek@ixbt.com)
