[an error occurred while processing this directive]

Performance and Power Consumption Management Functions in Intel Pentium 4 and Intel Xeon. Part 2: New Processors, New Technologies


Enhanced Halt State technology

According to the available documentation, Enhanced Halt State (C1E) is an operating mode of a physical processor with low power consumption, which is activated when both logical processors go to sleep (the HLT or MWAIT instruction) if this technology is enabled in BIOS. So what's the difference between this low power mode and the "usual" Halt State (C1)? In this case a processor can dynamically reduce the motherboard frequency multiplier (FID) and voltage (VID) and restore to maximum performance state (nominal FID/VID) when necessary. This can be done absolutely automatically (without operating system interference). One can say that C1E is a compromise between the old ODCM technology, which automatically decreases effective CPU clock by modulating it in the halt state as demonstrated, and the new server Enhanced SpeedStep (DBS) technology, which can change the effective CPU clock and voltage, but does it "on demand" instead of doing it automatically.

Well, it's time to see the new technology in action. For this purpose we'll use RMClock, which has obtained additional features since the previous article. In particular, it has specific settings for power management functions in Intel Pentium 4, Xeon and Pentium M processors, so that we can now do without extra utilities, such as CPUMSR.

But we have to precede the discussion of results with one important qualification – the processor in our test lab turned out an engineering sample (as reported by Intel Processor ID Utility). In this connection we can expect that the behaviour of production samples may differ from that demonstrated by our processor in today's tests.

The second important issue – C1E detection is implemented in the hit-and-miss fashion so far. Thus it is of a presumptive character (nevertheless, it functionality is field-proven).

Let's proceed to the main tabbed pages of the application – that's how they look in a couple of minutes of operation when the CPU is not loaded. On the left picture you can see that the processor supports TM1, TM2, ODCM and C1E. Only TM2 and C1E are currently enabled (i.e. by default). The C1E effect is already noticeable – just compare the current FID and VID values of the processor with the nominal ones. The first one is at minimum, the second is somewhere between minimum and maximum. You can see the FID/VID history on the right picture – at minimum CPU load, FID remains on the constantly low level (14x), VID varies within a wide range, its average value being about 60% of the nominal one. In fact, VID changes may fail to reflect the real changes of CPU voltage, because VID is just a voltage value requested by a processor, but a motherboard is actually free to do anything with this request (change nothing, first of all). Nevertheless, we monitored real CPU voltage changes in our tests as well (using Hardware Monitor from Intel Desktop Utilities, which relies on readings of motherboard sensors). It was approximately 0.1V lower than VID values.

So, everything is great with FID and VID, however we cannot miss another important issue – CPU clock as such. Both the actual CPU clock (measured by TSC) and the throttled CPU clock sort of remain... at the constantly high level. To be more exact, the throttled CPU clock is within 3590...3600 MHz, which is just 10 MHz lower than the nominal clock. There is actually nothing surprising in this seemingly unusual phenomenon. Considering the C1E nature (it's activated only when both logical processors go into the C1/HALT state), there is absolutely no opportunity to see it in action. Because even the most precise methods for measuring CPU clock will inevitably make a processor switch from sleep mode (C1) into operating mode (C0), which restores full CPU clock.

Let's try to disable C1E on the CPU settings tabbed page.

The result is illustrative – disabled C1E immediately results in maximum FID and VID set as defaults. After that they are kept at the constant level irregardless of the CPU load. And the effective CPU clock becomes stable at the level of its actual (nominal) clock – 3600 MHz.

The next experiment – we restore C1E and apply variable load to the processor, imitated by a simple simulator.

FID/VID curves prove that the processor can switch fast between minimal/maximal P-State Profiles. To all appearances there are only two FID states (initial and final), while VID changes may undergo many intermediate states (at 0.0125 V steps).

Thus, the Enhanced Halt State technology is quite a promising innovation, which reduces the power consumed by a processor in the halt state much more effectively compared to the regular C1 mode (HALT). The introduction of this technology may raise a question: if there is a completely automatic C1E (which is also used in new Xeon Nocona processors, by the way), why do we need Enhanced SpeedStep for servers (DBS), which requires "manual" (program) control? The answer to this question is very simple: C1E is really a completely automatic technology, it can reduce power consumption of a processor only at full system halt and restore the system to full performance at the least CPU load. And DBS can force down the power consumption of a processor in normal operation, including considerable system load (if the controlling software decides that there is no need in full capacity of the server).


Dmitri Besedin (dmitri_b@ixbt.com)
February 10, 2005

[an error occurred while processing this directive]