In the beginning - page 3

In 1981, you typed a command to run a program. It ran. You waited. Then you typed a command to run another, and it ran, and you waited. It was a paradigm based on single entities called "programs".

But we needed to run *lots* of programs. And we needed to control many devices too: disks, screens, modems, all at the same time. And we had just one processor to do all that. At the time, processors were very expensive, and

*** IT WAS MORE COST-EFFECTIVE TO SHARE THE PROCESSOR THAN TO ADD MORE PROCESSORS. ***

In order to share the processor, we invented things like "time slicing" and "context switching" and "memory mapping", giving each program a short slice of CPU time and a small section of memory, in rapid round-robin fashion. Then we got "multi-threading", and more complex memory management, and more complex caching. And to better share large functional units like the floating-point multipliers, we invented instruction look-ahead and out-of-order instruction queues, and a host of related things. And size and complexity and power usage went through the roof.

A huge number of the transistors in modern processors are now dedicated to these multi-program coordination circuits -- circuits whose entire purpose is to enable and streamline sharing the processor. None of these things run any one program better or faster. Any program would run faster if all other programs and threads running on the same processor were stopped, and all the hardware that enables other programs and threads were removed.

So why not return to those early days, when processors were simple, and then go the route of using many processors, each running one program?
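
To put rough numbers on the round-robin sharing described above, here is a minimal toy sketch. It is not a model of any real operating system or processor: the function name `round_robin`, the program names, the run times, the slice length and the switch cost are all invented for illustration, and it ignores caches, I/O and priorities.

```python
from collections import deque


def round_robin(run_times, time_slice, switch_cost):
    """Simulate round-robin time slicing on a single processor.

    run_times:   dict of program name -> CPU time it needs (hypothetical units)
    time_slice:  longest stretch a program runs before being preempted
    switch_cost: time lost whenever the processor changes programs
    Returns a dict of program name -> time at which it finished.
    """
    queue = deque(run_times.items())
    clock = 0
    finish = {}
    while queue:
        name, remaining = queue.popleft()
        ran = min(time_slice, remaining)
        clock += ran
        remaining -= ran
        if remaining > 0:
            queue.append((name, remaining))  # back of the line
        else:
            finish[name] = clock
        # Pay the context-switch cost only when a different program runs next.
        if queue and queue[0][0] != name:
            clock += switch_cost
    return finish


# Made-up workload: three programs sharing one processor.
shared = round_robin({"editor": 3, "compiler": 5, "modem": 2},
                     time_slice=2, switch_cost=1)
# The same program with the processor to itself.
alone = round_robin({"compiler": 5}, time_slice=2, switch_cost=1)
print(shared)
print(alone)
```

With these made-up numbers the compiler needs 5 time units on a processor of its own but does not finish until time 14 on the shared one: 10 units of useful work across all three programs plus 4 units spent switching between them.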