Athlon for the Faithful
The "contemporary history" of AMD, despite the company's founding in 1969, began for most people 30 years later, in 1999. That was when the first AMD Athlon processor was released - a CPU that turned our idea of the x86 alternative inside out.
The main advantage of the new CPU was its "omnivorousness". Thanks to its relatively short pipeline and its orientation toward executing as many instructions per clock as possible, Athlon (for simplicity's sake we shall use the name for both K7 and K8, unless a specific modification is named) can execute "dirty" code at high speed. This means, first of all, that you don't have to rework old programs, written with no idea of CPU-specific optimization, to achieve satisfactory performance. In fact the matter goes deeper: strictly speaking, the point is not only old software. Any coder knows that even the tenth version of a serious, long-lived package often contains code fragments written for the first version. This code works and is bug-free - why change it? The only problem is that it was written for a 486 or a Pentium MMX… Well, AMD processors execute such programs easily and quickly, sparing developers the extra headache.
Besides, there are tasks that are inherently poor fits for CPUs with long pipelines: databases with random, unpredictable access patterns, for example (you cannot really predict users' queries!). Generally, the more interactivity, the more attractive a short pipeline becomes - and interactivity also means computer games, by the way! Software with early forms of artificial intelligence also prefers short pipelines.
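The price of unpredictability is easy to model. Below is a toy sketch - purely an illustration, not anything from CPU documentation - of the simplest possible "remember the last outcome" branch predictor. On sorted data the branch `x < 50` flips only once and is predicted almost perfectly; on random data it is a coin flip, and on a real CPU every miss costs a pipeline flush, which is exactly why a long pipeline suffers more.

```python
import random

def mispredictions(outcomes):
    """Count the misses of a trivial 'predict the last outcome' predictor."""
    predicted = True
    misses = 0
    for taken in outcomes:
        if taken != predicted:
            misses += 1
        predicted = taken  # the predictor just remembers what happened last
    return misses

random.seed(42)  # fixed seed so the run is reproducible
data = [random.randint(0, 99) for _ in range(10_000)]

# Predictable: on sorted data the branch "x < 50" flips exactly once
sorted_branches = [x < 50 for x in sorted(data)]
# Unpredictable: on random data the same branch is essentially a coin flip
random_branches = [x < 50 for x in data]

print(mispredictions(sorted_branches))  # a single miss
print(mispredictions(random_branches))  # roughly half of the 10,000 branches
```

The model forecasts nearly every branch on the sorted (predictable) stream and fails about half the time on the random one - the situation described above for databases with random access.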
But that's not all, because "nonlinear code" is often the fault of neither programmers nor the task itself, but of compilers. Writing a good optimizing compiler is very difficult, and an ordinary compiler can make a mess "out of the blue": an elementary operation in a high-level language gets translated into assembly code that raises your eyebrows - "What a mess! I couldn't have done that on purpose…".
Nor should you forget that Athlon was a major market event. For the first time in many years there appeared a powerful, fast, state-of-the-art x86-compatible CPU developed outside Intel. I emphasize - not merely manufactured by another company, but designed by it! Moreover, a whole new alternative platform was born: unlike previous AMD processors, Athlon worked with different chipsets and was installed into a different socket (at first physically compatible with Intel's, though AMD later abandoned that). Until then, system-logic developers had to cater to a single CPU manufacturer, and that manufacturer could choose the platform's development strategy wholesale, if not dictate terms outright. With the appearance of Athlon the situation changed, and the positive impact showed at once: despite Intel's cool attitude to DDR (it preferred to promote RDRAM), DDR entered the market and consolidated its grip solely thanks to this alternative. Intel later indirectly acknowledged that AMD was right on this issue - today DDR memory is the de facto standard for both platforms. Besides, the price war brought a considerable drop in the prices of high-clocked CPUs and, incidentally, a faster pace of clock-speed growth. If you wish, compare the release rates of new CPUs in, say, 1995-1999 with the present ones.
But AMD never rested on its laurels: the new K8 core (the Athlon 64 and Opteron CPUs) came to replace Athlon XP (K7). Interestingly, the general course of "modernizing x86" - designing processors that are technically progressive yet good at executing old applications - became even more apparent in K8. On one hand, Athlon 64 and especially Opteron have all the attributes of CPUs for large, serious systems: an embedded memory controller, 64-bit addressing, and up to three independent HyperTransport links, which allow these processors to connect not only to the chipset but also to one another, forming something like a computing system with NUMA architecture. On the other hand, they are fully compatible with the classic IA32 instruction set, they are also aimed at desktops, and they are quite affordable (especially the low-end models). In fact, the K8 core is a little engineering miracle: an architecture powerful enough to build multiprocessor systems, servers, even supercomputers, and at the same time quite able to compete on price with ordinary desktop solutions. Incidentally, this case also indirectly confirms that AMD chose the right development path against its main rival: the 64-bit EM64T extensions, support for which is planned in the server variants of Intel chips (and later, perhaps, the desktop ones), are in fact a variation of x86-64 (now AMD64) - an instruction set developed by AMD.
Summing up: the AMD processors (Athlon, Athlon XP, Athlon 64, Opteron) are an example of an extremely successful "rejuvenation" of x86. Simply and unpretentiously, without posing extra problems to developers, AMD managed to design state-of-the-art CPUs that possess all the functionality required today while remaining backward compatible with old software.
Catechism for a True NetBurst Fan
A Frenchman: I don't understand how you Swiss can fight
for money. We French fight only for honour and glory!
A Swiss: Everybody fights for what he lacks…
In fact, whatever people say, Pentium 4 is a very interesting processor. What makes it interesting is that its traditionally central units - the computing units - are rather simple. Actually the entire processor is very simple; otherwise it couldn't operate at such frequencies. The essence of the NetBurst architecture (in the author's estimation - he is not part of the development team, so he must content himself with more-or-less plausible guesses) is the organization of a kind of "computer inside a computer". A very large part of the Pentium 4 is a complex system of cooperating units whose job is to supply x86 instructions to the decoder, and internal core instructions to the simple but high-frequency execution units, at the proper rate. Forgive the heresy, but to Pentium 4 the x86 code itself is nothing more than "some external instruction architecture", and (a bolder guess still) swapping x86 for something else would not be difficult at all.
Look at the Pentium 4 block diagram and you will see that only five blocks deal with x86 instructions: the bus interface unit (BIU), the L2 cache, the internal bus to the decoder/prefetcher, and the decoder and prefetcher themselves. One could say AMD looks about the same… But there is one principal difference: the K7/K8 decoder is part of the pipeline, so an instruction (even one already sitting in L1 cache) is decoded every time it is executed. In the NetBurst architecture it is decoded only once, because the L1 cache holds internal core instructions rather than x86 instructions. Everything else on the diagram, except the ALU/FPU, has nothing to do either with x86 code or with the execution of instructions. It is one big "improver" that provides an uninterrupted flow of instructions. If we step back from CPUs for a moment and look at the modern computer system as a whole, you will notice that the Pentium 4 and Athlon ideologies resemble two perpetually competing approaches to bus design: Athlon is a typical "parallel bus", Pentium 4 a "serial bus". I don't want to draw far-reaching conclusions from this daring analogy, but the comparison is amusing in itself... especially given the current fashion for serial interfaces :).
I am not sure I have managed to get the idea across, so let me state it plainly:
NetBurst is the first processor architecture of its kind in which the main attention is paid not to executing code but to delivering it to the execution units as fast as possible. High performance in those units themselves is comparatively easy to achieve through clock speed.
And here you have to understand the main difference between Intel's approach to the NetBurst core and AMD's approach to the K7 core (which later served as the basis for K8). Start with the fact that the Pentium 4 project began when Intel's supremacy on the market was absolute and nobody took AMD seriously. In such a situation a manufacturer's priorities predictably shift: it is interested not so much in designing a progressive new architecture as in minimizing expenses and maximizing returns. To put it bluntly: "we are the leaders anyway, so let's think about how to double our millions" :). Even then Intel was the absolute leader technologically: many plants, progressive process nodes, well-tuned manufacturing. It would have been silly not to take advantage of all that. And Intel seized the opportunity, designing a processor intended for rapid clock-frequency growth. The idea most likely came from engineers and technologists, not marketing specialists! Any engineer will tell you that if a problem can be solved in several ways, and one of the solutions is already supported by the existing base, that is usually the one to choose - it's easier and cheaper. The idea was simple and elegant: design an architecture aimed at the technology in which we are far ahead of everyone else. We can get 0.18-micron chips to 2 GHz, and AMD most likely won't catch up for a long time - so let it be 2 GHz!
Thus, Pentium 4 was a processor designed not only to be fast but also to extract maximum profit at minimum expense. The marketing people, by the way, had a hard time when it turned out that Pentium 4 Willamette would compete not with the weak K6 or a slightly modified version of it, but with the quick and powerful Athlon.
But it would be naive to think Intel put all its eggs in one basket and made Pentium 4 purely a "profit-making" processor premised on the absence of worthy competitors. Look closely at this company's products and you will see that its processors (like its other devices) always have a second layer, and sometimes a third and a fourth. In NetBurst's case, the second layer is its original orientation toward optimized software. It's no secret that under ideal conditions (the extended SSE and SSE2 instruction sets, "smooth" code without many hard-to-predict jumps) Pentium 4 is consistently faster than its competitors. The central paradox of this processor is that while it sometimes runs slowly, it is almost always possible to rework the code so that Pentium 4 comes out fastest. I stress: not because the other processors slow down on such code, but because Pentium 4 speeds up. In this sense NetBurst anticipated the performance crisis, which we shall consider later, in the concluding part of this article. A program running on a processor of this architecture can almost always be accelerated considerably through additional optimization - provided, obviously, that the software developer wants to do it. Here again the technical solution is deeply intertwined with marketing strategy: Intel was evidently confident it could persuade the software companies. And it was not wrong: there are more than enough programs optimized for Pentium 4.
But NetBurst has a third layer as well: its orientation toward high-clocked memory. The logic is transparent: we have already established that the only serious threat to the performance of a CPU of this architecture is slow delivery of instructions to the execution units. The "improver" is supposed to prevent this, but ideals are unreachable - sometimes the branch prediction system and the prefetcher fail, and the CPU falls into a stupor and starts flushing the pipeline. In that situation, the faster a portion of "correct" data and/or code can be read from memory over the CPU bus, the sooner operation resumes. Now recall that the Pentium 4 bus was the fastest of any x86 CPU; only relatively recently has the dual-channel memory controller of the AMD Athlon 64 / Opteron caught up with the competitor in memory speed (and it was a hard-won victory - AMD had to embed the controller into the CPU itself). So the bus is fine - it was faster than the competitor's from the start - and the memory's turn comes next. What does this mean in practice? First, that Pentium 4 will always benefit more from faster RAM than its competitors do. Broadly speaking, fast RAM is more critical for this CPU.
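A back-of-the-envelope check of that "caught up" claim, using the commonly cited figures for the 2004-era parts (treat the numbers as an illustration):

```python
def bandwidth_gbs(transfers_per_sec, bus_width_bytes):
    """Peak bandwidth in GB/s for a bus doing one transfer per tick."""
    return transfers_per_sec * bus_width_bytes / 1e9

# Pentium 4 "800 MHz" FSB: a 200 MHz clock, quad-pumped, 64 bits wide
print(bandwidth_gbs(800e6, 8))   # 6.4 GB/s
# Athlon 64 on-die controller with dual-channel DDR400: 2 x 64 bits at 400 MT/s
print(bandwidth_gbs(400e6, 16))  # 6.4 GB/s - parity, just as the text says
```

These are peak figures, of course; the on-die controller's real advantage was latency, which is exactly why AMD paid the price of embedding it.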
And of course we cannot ignore another technology that appeared together with the NetBurst architecture and is a logical development of its underlying ideas - Hyper-Threading. In essence, it is another variant of the "improver", only at a higher level. Remember how we established that when the branch predictor and/or prefetcher fails, the CPU falls into a stupor for a while? How do you fight that, if the predictor and prefetcher cannot be improved right now? Exactly: feed the processor another stream of code - it may prove more "convenient", and the CPU can make progress on the second instruction stream while struggling with the first. In fact, Hyper-Threading is a first hint at CMP (chip-level multiprocessing). There is actually less difference between CMP and Hyper-Threading than it may seem: in pure theory, any two devices can be replaced by a single device operating at double speed and processing the corresponding state (buffers, queues, registers, instructions, etc.) in turns. The question is which is simpler and cheaper to implement - two devices operating at "unit" speed, or one device at double speed. We may surmise that if Intel goes for CMP processors, it will by no means always choose the first option…
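The stall-hiding idea can be caricatured with a tiny cycle counter. This is a deliberately crude model (one issue slot, a fixed 10-cycle "memory miss", no real scheduling), so the numbers illustrate the principle only:

```python
from collections import deque

def run(threads, stall_cost=10):
    """Toy cycle counter: one instruction may issue per cycle; a thread
    that hits a 'miss' cannot issue again for stall_cost cycles."""
    queues = [deque(t) for t in threads]
    ready_at = [0] * len(threads)  # cycle at which each thread may issue again
    cycle = 0
    while any(queues):
        for i, q in enumerate(queues):
            if q and ready_at[i] <= cycle:
                op = q.popleft()
                if op == "miss":
                    ready_at[i] = cycle + stall_cost  # thread waits for memory
                break  # only one issue slot per cycle
        cycle += 1
    return cycle

stream = ["add", "miss", "add", "add", "miss", "add"] * 50

alone = run([stream])                 # one thread: stalls leave the core idle
paired = run([stream, list(stream)])  # two threads: work fills the stall cycles
print(alone, paired)
```

Run alone, the stream spends most of its cycles waiting on misses; with a second thread sharing the core, those idle cycles get filled, and two full workloads finish in barely more time than one did by itself.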
To sum up: the NetBurst architecture, while retaining compatibility with old software, makes it possible to develop new, much faster programs. It is a near-ideal embodiment of a shrewd marketing approach to CPU design.
Intel vs. AMD ;-)
You speak as if you know everything beforehand. Do you see visions?
No. I just see clearly.
(V. Sverzhin, "Unicorn Law")
The author hopes that Intel fans didn't take "Athlon for the Faithful" to heart, and that AMD fans weren't spitting too often while reading "Catechism for a True NetBurst Fan". In theory, both groups should have burst out laughing by the time they reached the "Catechism…". At least that was the plan :). It's an old journalistic trick: take one position, then suddenly switch sides. But I didn't do it for fun. The fact is, both sections of this article are true - that is the state of affairs as seen by the manufacturers. AMD simply played one card, and Intel - another. The AMD approach can be called more cautious and traditional, the Intel approach innovative but risky. Either way, both companies obviously had their reasons, and quite objective ones. Today's realities (Pentium 4 and Athlon XP/64 alike are manufactured, sold, and have their fans) testify that nobody actually made a mistake: both architectures are in demand. So what now? Nothing… We have two interesting conceptual designs, one good at some tasks under some conditions, the other good at other tasks under different conditions. We don't all drive the same model of car, do we? Even at the same price, some people will prefer a Mercedes, others a BMW, a VAZ, or a Volga. Perhaps that's a sign that an ideal car is impossible. What a fresh thought… so why do we expect from processors what we long ago stopped expecting from other devices?
So who is the winner? Attentive readers have probably already grasped the author's point: nobody, because the question makes no sense. Still, we can try to formulate a more-or-less correct answer to this ill-posed question. It would go roughly like this: "AMD is definitely a winner, but Intel is not a loser". Amusing casuistry :), but it describes the situation best. Really: Intel could hardly have planned to occupy the entire podium (all three places) alone forever - that would have been naive to expect. A strong new rival with a good, fast processor able to compete with your own designs… I suspect that after the ten years since the launch of the first clone, Intel had grown tired of playing "UFO: Enemy Unknown" with the world, and looked at the AMD Athlon with a certain relief :). And naturally, the newcomer's architecture was exactly what it had to be - showing good results precisely where the Intel architecture "felt awkward". It couldn't have been otherwise! Think about it: under different circumstances, Athlon would never have been noticed. A clone cannot be better than the original, so the only thing that can oppose an interesting, powerful, original product is another interesting, powerful, original product that does not resemble its rival. That is what happened. If not AMD, some other company would have done it. Nature abhors a vacuum; it's actually strange how long it took to fill this one…
The present situation looks more like a stalemate: both manufacturers have already squeezed everything they can out of their processors (and, more broadly, out of their architectural concepts). Have you noticed that nothing specific is heard from Intel about a Pentium 4 successor, nor anything detailed about the mysterious K9 core from AMD (and that from a company famous for its love of informing the press about its plans!)? For now we are witnessing a rise in clock frequencies, and a rather sluggish one at that. Hopes for the 90nm process die hard, as all decent hopes do, but die they do: yes, it's a good way to cut chip cost or pack in more transistors, but the promised reductions in heat and gains in clock speed are very "vague". You can already see it in Prescott, and there is no reason to expect AMD to fare much better. It may do better, but not so much better as to repeat the effect of the transition from 180nm to 130nm.
However, forgive my pragmatism and "stark subjectivity": having no idea of the technical characteristics or release dates of the new products, I am quite sure the status quo will most likely be preserved. AMD has honestly won second place on the market; Intel has managed to retain first. To change the situation radically, AMD would have to design a processor essentially superior to the competing CPUs, because the rival has far more money, plants, and glory, while AMD has just a fast CPU. Even if it's a little faster overall, that's not decisive. Intel has been making x86 CPUs since 1978. The first original processor (not a clone) that managed to match its products appeared 22 years later. All honour to AMD, but today more people know and buy Intel processors, and that's quite understandable. A "tiny bit" is not a reason to switch manufacturers for a person who has used the old manufacturer's products all his life. If you doubt this scenario, try founding a soda company (very tasty and high quality!) and out-spitting Coca-Cola :).
Speaking of Coca-Cola: any experienced businessman knows that large sales are "made" by sellers, not by customers. They are the first people you must interest in your product if you want to sell it at mass scale. Emphasis on price and popularity among customers is for aggressive newcomers, nothing more. Sellers have their own motives: expensive products are more profitable to sell. AMD, by the way, has already understood this and is flatly abandoning its "inexpensive alternative" image. From a marketing point of view this shows the company has gained a certain status - otherwise it wouldn't mind the reputation for cheap solutions. But from the customer's point of view it means the price gap between Intel and AMD CPUs will shrink - the better AMD is doing, the faster. Hello, fans…
Another reason for my hysterical laughter is the reasoning of some ardent fans about an "imminent change of leader". These gentlemen have no idea of the processor market and its scale. I don't have exact figures for x86, but in general Intel has 11 plants producing about 150 million (!!!) CPUs a year (x86 processors are manufactured at six of them, if memory serves). AMD at present has one plant, producing Athlon XP, Athlon 64, and Opteron, plus another under construction. Pardon the black humour, but if Intel stopped supplying the market with Pentium 4 and Celeron tomorrow, the majority of customers would be left without computers - including many AMD fans :).
On the other hand, I don't dispute the supposition that if AMD manages to manufacture CPUs (roughly comparable to Intel's in performance - give or take 5-10% doesn't really matter) in sufficient volumes, its market share will most probably grow. Up to a certain limit, of course. My absolutely subjective figure (based on intuition rather than exact calculation) for the maximum share is 20-25%, that is, 1/5 to 1/4 of the market. And even that will become reality only if Intel makes every possible small mistake while AMD makes none (I don't actually expect gross mistakes from either side).
As for comparing pe... (sorry - "megaloflops") in the top processor sector, that contest will surely carry on, but what does it have to do with market position?! It's just like sports: in football (as we have all recently witnessed) even Greece can become first in Europe (even though Greece is smaller than the Moscow region…). At present Intel and AMD show different attitudes to this production athletics: AMD is "aggressively competitive", Intel rather "placidly meditative". But in time both will take the same attitude. They'll come to an agreement in the end…
Tendencies and forecast
"I still don't understand anything," confessed Shah.
"Well," inquired Sean ironically, "did you understand more before?"
"I didn't understand less in the past."
(A. Ulanov, "One hero, two hero")
Megahertz aren't what they used to be…
You can easily notice that the "N-times performance rise every X years" trend has been glitching seriously of late. To the eggheads and aesthetes who like to mention, appropriately and not, something they've overheard about Moore's law (which is in fact not a law at all - but that's another story…), let me point out that the issue here is not device complexity but performance growth. Stuffing a chip with transistors and raising its performance in real programs are "two big differences" (as they say in Odessa). And there are no preconditions for the slowdown in CPU performance growth to reverse. At least, the author knows of none.
But nature abhors a vacuum, and the amount of CPU-hungry software is not shrinking. So it is logical to suppose that the throne of the "speed kings" will not remain vacant. Who will take it? Most likely programmers - in cooperation with CPU designers, of course, only in a different key: the officers will trade places with the privates. The old approach to performance gain was simple and dull: we have abstract code (any code) and we must execute it as fast as possible. A classic "brute force" method - and it no longer works… So the first role will go to "cross optimization": optimizing programs for processors on one hand, and processors for programs on the other. Soap operas like "MMX - 3DNow! - SSE - SSE2 - SSE3" (and, strange as it may seem, x86-64) illustrate the main trend: programmers are constantly discontented with CPU resources.
Of course you can say that programmers are always discontented :). But the history of the x86 architecture counts six (!) attempts to somehow compensate for the i386 legacy, which shows, joking aside, that the problem really exists. SSE3 is good proof of it. We shall not discuss its necessity, speed impact, and so on; what interests us is different: look closely and you will see that this instruction set was obviously made to order. It's not that the egghead core designers gathered, decided something was missing, and produced another Universal Extended Instruction Set Fit For Everything - not at all! SSE3 clearly reflects programmers' requests: "make it easy for us to develop this and that". Work in this direction will presumably continue, and in exactly that manner. In fact, the appearance of x86-64 can also be explained by programmers' orders (why not?): working with large amounts of memory through 32-bit addressing is inconvenient, even when extended by a "perverse" method (PAE). At the least, programmers are more optimistic about 64-bit addressing in x86 and the extended register set. To sum up: gone are the days when CPU designers could tell programmers to "eat what you're given".
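The arithmetic behind that inconvenience is simple. PAE widens only the physical address to 36 bits, while each process still lives in a 32-bit virtual space - which is exactly why it feels like a workaround - whereas x86-64 simply makes the address itself wider:

```python
def addressable_gib(address_bits):
    """How many GiB an address of the given width can reach."""
    return 2 ** address_bits // 2 ** 30

print(addressable_gib(32))  # 4 - the classic 4 GB ceiling of 32-bit addressing
print(addressable_gib(36))  # 64 - PAE's wider *physical* address
print(addressable_gib(64))  # 17179869184 - a flat 64-bit address space
```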
On the other hand, we see the classic rake the automobile industry once stepped on: having created wonderful, powerful, fast cars, manufacturers were surprised to find there were few places where those cars could drive at full speed - cars move on roads, they do not fly. If megabytes of code and data are kilometers of roads, then programs are the roads CPUs drive on. Programs must be optimized - levelled, if you like - or we'll keep driving to the grocery store the old way, in off-road vehicles with eight-wheel drive and 200-bhp engines. In fact, we live in the declining era of "raising engine power". Yes, the sluggish race for gigahertz and peak teraflops will last a while longer, but victory communiqués like "now 200 MHz more!" no longer excite the old optimism - especially when it all results in another measly 2-3% gain in real applications. Poorly optimized programs are obviously easier to write than well-optimized ones, so an "external push" is needed before programmers obligingly agree to write good ones. The slowdown in raw, "dull" peak CPU performance growth will be that push. Thus, in taking on the right to dictate their will to hardware designers (or to "come out with suggestions", if you prefer), programmers in turn undertake to use hardware features to the full.
Upgrade: Impossible 2
The second tendency is that modern computer systems are becoming less and less upgradable. Then again, your humble servant long ago concluded that there is no such thing as an upgrade, so there is nothing to be surprised at :). Come to think of it, upgradability of mass products is characteristic only of young industries. TV sets and microwave ovens, cars and tractors, washing machines and gas cookers - none of them can be upgraded, and nobody protests. Modifying a finished device (object) to obtain new features has always been the lot of enthusiasts or… people of scanty means. Who today would think of altering a dress? Only someone keeping it as a souvenir, or someone without money for a new one… High-end audio equipment can, conditionally, be called "upgradable" too, but note that it is extremely expensive to begin with. Yet we got used to upgrading standard, mass-market computers.
It's funny: the non-upgradability tendency sneaked up unnoticed, and its first signs were not socket and slot changes. The class of devices that consistently embodies the idea "what you buy is what you work with until you throw it away" includes, first of all, notebooks and popular handhelds (pocket computers). Of course their poor upgradability was known from the very beginning (which is why nobody spotted the trick in time), but the fact is that such devices have lately become far more numerous. A person who works on a notebook (many large companies now give employees notebooks instead of desktops) will not even think of "installing something more up-to-date into the computer".
The mobile devices were followed by the "desktops of tomorrow" - miniature barebone kits, nice and shiny, resembling consumer video or audio equipment, and… also poorly upgradable. You certainly cannot replace the mainboard, the cooling system (often unique as well) will not cope with just any processor, the video card is limited in height or the space it may occupy, and so on. The market carries more and more devices that cannot be upgraded; that means they are in demand, and thus fewer people are bothered that individual computer components cannot be replaced.
Of course, most of the moaning on this subject comes from people who like to assemble their computers themselves. But that's just it: the share of this customer category keeps shrinking relative to the whole mass of buyers. Not that they are becoming fewer in absolute numbers - they are simply dissolving in the crowd of novices. In the past, home computers were affordable only to enthusiasts and specialists; today a home "comp" is found in almost any well-off family. These people have no desire to assemble, configure, or modify anything - they are used to home appliances. Upgrading is becoming the province of enthusiasts, and it is gradually getting more expensive. Sad but true. Further proof: manufacturers are changing their attitude to computer upgrades too. Look around - almost no one emphasizes hardware interchangeability any more. On the contrary, the question "And can I...?" is almost always met with a shrug (perhaps even a deliberate one) and a melancholy answer: "I don't know… What for?".
New Extensive Era
So how will CPU performance be raised - or will it stop being raised at all? The answer lies on the surface, in news and press releases: I'm almost 100% sure both companies will follow the path of integrating several cores into one CPU (CMP). It's not decent to praise oneself, but this path fits well with the performance problems described above: CMP does any good only with additional software optimization - the software must, at minimum, be made multithreaded. On the other hand, designing a multi-core processor should not be very difficult, since its only difference from a single-core one is a doubled number of the corresponding units. The core itself may change, of course, but I doubt the changes will be essential: if Pentium 4 or K8 could have been improved that way, why not do it now, before the switch to CMP? Now the silence of both manufacturers about future CPUs becomes understandable: there is nothing to announce… "The same thing, only two of it" - that's the description of the "next revolution in CPU design" :).
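That "multithreaded at minimum" requirement can be sketched in a few lines. A hypothetical example (note that the CPython GIL prevents a real speedup for pure-Python arithmetic, so this shows the structure a program needs, not a benchmark):

```python
import threading

def parallel_sum(data, n_workers=2):
    """Split the work so each core (worker) gets its own slice of the data."""
    results = [0] * n_workers

    def worker(i):
        results[i] = sum(data[i::n_workers])  # every n_workers-th element

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)

print(parallel_sum(list(range(1000))))  # 499500, same answer as plain sum()
```

Each worker takes its own slice of the data; only code partitioned this way lets a second core contribute anything at all.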
There is, however, one small intrigue: with AMD everything is clear (they will put two K8 cores on one chip and call it a new processor), but rumour has it that, as for the Pentium 4 successor, NetBurst is dead and we'll soon be happy to have a Banias/Dothan heir on the desktop. This, despite its apparent obviousness, deserves special comment. I DON'T BELIEVE IT. For several reasons. Firstly, the Pentium M core was designed specifically for mobile processors. And if Intel chose not to rework the desktop core but to design a dedicated one, it was surely because the priorities were different: the primary task was an economical, "cool" processor. Do you believe that was achieved without any architectural cost to performance? I don't. And why would Intel want a desktop processor (with a serious competitor like AMD around!) that is slowed down at the architectural level? That's the first, most obvious reason.
Secondly, remember what was said above about NetBurst's orientation toward well-optimized programs. Programs will have to be optimized for CMP anyway, otherwise the second core brings no performance gain. That's one side. On the other, we have an architecture that "shines" only after software optimization. Yes, the optimizations are of a different sort, but once programmers are made to rework their code, they will hardly implement only multithreading and refuse the other accompanying improvements; the important thing is to get the ball rolling ;). So the new wave of reworked programs could make Pentium 4 shine even in software where it previously felt uncomfortable because of unoptimized code. It would be silly to pass up that chance. The same reason from another angle: a great deal of software is optimized for Pentium 4 already. By abandoning NetBurst, Intel risks losing in its traditionally winning areas.
And thirdly: the notorious 64 bits. It's a widely known secret that Prescott and Nocona (the latest Intel cores for desktops and servers) support EM64T (aka x86-64, aka AMD64). This secret will quite obviously be "revealed" soon: AMD has skimmed too much cream off the market by telling everyone at every turn that it has 64 bits for x86 and Intel doesn't, so it's in Intel's interest to close the gap as soon as possible. Given the specifics of the Pentium M and its design, and the supposition that this processor is, to all appearances, based on a modified Pentium III core, the author makes a reasonable guess that Pentium M does not support 64-bit instructions even in dormant form. It's only logical: one way or another, a 64-bit processor needs more transistors than a 32-bit one, while Pentium M is designed for low power consumption and has no use for extra transistors. Besides, the main 64-bit bonus is support for large amounts of memory, and I have yet to meet a notebook with even 4 GB of RAM (how long would such a miracle run on batteries, I wonder - an hour?…).
To sum up: the arguments against a Pentium M-like core in Intel's desktop line are quite weighty, so despite the clamour on this subject in the non-technical computer press, the author considers such a turn of events unlikely.
We can, of course, concoct another question: will two cores in future Intel processors mean the end of Hyper-Threading? The answer is elementary: clearly not. Why drop it? The technology is already in place and requires a minimum of transistors. Besides, two physical cores should eliminate the performance drop occasionally seen with Hyper-Threading enabled on a single CPU (a rare but real phenomenon).
That's about it. x86 will endure for some time yet, owing to 64-bit extensions, chip-level multiprocessing, and the sluggish rise in CPU clocks. What awaits us once those resources are depleted is the more interesting question.
Quad-core processors, or the surrender of x86? Here my imagination fails me…
Vladimir Rybnikov (firstname.lastname@example.org)
July 29, 2004
Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.