Title:CPU Performance
Date:1996??
Self Righteous:0
Opinionated:0
Simply true:9


Question: What's the difference between MHz and MIPS, and which is a good indicator of performance?

The quick answer is that neither is a good indicator of performance.

MHz stands for 'Megahertz' (million cycles per second), which tells you how fast the clock ticks for a given processor. It doesn't tell you what is happening during those cycles. MIPS stands for 'millions of instructions per second,' but once again, you don't know what those instructions are doing. (Some argue that MIPS should stand for 'Meaningless Indication of Processor Speed'.)
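To see how loosely the two are related (made-up numbers, purely for illustration - MIPS is just MHz times the average number of instructions completed per cycle):

  100 MHz at 1 instruction/cycle   = 100 MIPS
  100 MHz at 1/4 instruction/cycle =  25 MIPS

Same clock, a 4x difference in MIPS, and still not a word about what any of those instructions accomplish.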

MIPS by itself isn't a good indicator of performance; it's just how quickly a computer *could* execute instructions. And who even cares? What are these instructions doing? Which would you rather have: a processor that runs at 1GHz but has only two instructions:

  inc ra		add one to register a
  b   ra,lbl		branch to label if ra == 0

(Btw, such a processor isn't that far from being Turing complete, which means that it could do everything your nifty little workstation can - but that's an aside..)

<silly>
Or a processor that runs at 10MHz but includes the following instruction among many others:

  word			do Microsoft Word (tm)

</silly>
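Just to show that first machine isn't a complete joke, here's a rough sketch in C (every detail of it invented for illustration) of it grinding through a counting program:

  #include <stdio.h>

  /* A sketch (all invented for illustration) of the two-instruction
   * machine above: 'inc' adds one to a register, 'b' branches to a
   * label when the register is zero. With wraparound arithmetic
   * that's enough to count, loop, and - with enough patience -
   * compute quite a lot. */

  enum op { INC, BZ };
  struct instr { enum op op; int reg; int target; };

  int main(void) {
      unsigned char reg[2] = { 0, 0 };
      struct instr program[] = {
          { INC, 0, 0 },  /* 0: ra += 1                   */
          { BZ,  0, 3 },  /* 1: if ra == 0 goto 3 (halt)  */
          { BZ,  1, 0 },  /* 2: rb is always 0: goto 0    */
      };
      int pc = 0;
      long executed = 0;
      while (pc < 3) {
          struct instr i = program[pc++];
          if (i.op == INC)
              reg[i.reg]++;            /* 8-bit registers wrap around */
          else if (reg[i.reg] == 0)
              pc = i.target;
          executed++;
      }
      printf("halted after %ld instructions\n", executed);
      return 0;
  }

At 1GHz and one instruction per cycle that run finishes in under a microsecond - the point being that neither the clock rate nor the instruction count says anything about how much useful work each step does.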

Even the above example doesn't answer your question, because you don't know how many cycles each instruction needs. Usually it's something like one instruction per cycle, but that's been changing over time. Besides, do you even use Word? Maybe you only run a stupid counting program, in which case the first processor is better. So the real question isn't how many MIPS a processor does, or what clock speed it runs at, or even what instructions it offers. The real question of performance is:

How fast will this processor, in a given system, run the software that I run, with the workloads that I give it?

Over the last decade we've been getting better at reducing this incredibly complex question to a number. In the early days it was clock speed, back when there weren't many instruction sets, there wasn't much being done inside the system to speed things up, and memory wasn't so much slower than the processor. For a short while it was MIPS. That was just a stupid mistake, probably thought up by someone in marketing; it probably helped show off the gains of a slightly slower processor that ran more instructions at once, something that mattered once pipelining and superscalar execution were invented.

Soon after, we came up with toy benchmarks, such as the 'Towers Of Hanoi' problem, figuring these would be a good indicator of how fast a system was. That's better than MIPS or MHz in the sense that you know a system is faster *at least* at this silly problem. Who knows, maybe all you do with a computer is run Towers Of Hanoi. With MIPS and MHz you can have a processor with better numbers that is nonetheless slower than another processor in every case; with a toy benchmark, at least you know it runs Towers Of Hanoi faster. Maybe this is some comfort :) (Towers Of Hanoi, btw, is just a small puzzle that involves three posts and moving a stack of different-sized disks from one post to another. It's not important, just an example of one of the many toy benchmarks that were being used.)
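For the curious, the whole 'benchmark' is about this big. This is just a sketch of the classic recursion, not anybody's official benchmark source:

  #include <stdio.h>

  /* The classic Towers Of Hanoi kernel. Moving n disks takes
   * 2^n - 1 moves; a toy benchmark just times how long a machine
   * takes to recurse through them all. */

  static long moves = 0;

  void hanoi(int n, char from, char to, char via)
  {
      if (n == 0)
          return;
      hanoi(n - 1, from, via, to);  /* park n-1 disks on the spare post */
      moves++;                      /* move disk n from 'from' to 'to'  */
      hanoi(n - 1, via, to, from);  /* stack the n-1 disks back on top  */
  }

  int main(void)
  {
      hanoi(20, 'a', 'c', 'b');
      printf("%ld moves\n", moves);  /* 2^20 - 1 = 1048575 */
      return 0;
  }

Notice that the entire working set here is a few words of stack, which is exactly the problem described next.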

So toy benchmarks were lacking. For one thing, the problems were so small that they couldn't distinguish between normal hardware and beefed-up hardware meant to handle big loads, because the benchmark didn't exercise all the resources. A good example of this is cache size. If the Towers Of Hanoi program only uses a few KB of memory, then a 16KB cache and a 4MB cache are going to look exactly the same in the numbers, though obviously the 4MB cache will run real-world programs better.
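You can watch this effect yourself with a loop like the following rough sketch (sizes, stride and counts all made up; real memory benchmarks are much more careful about timer resolution and about compilers optimizing the loop away):

  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>

  /* Walk arrays of growing size, touching one cache line per access.
   * The time per access jumps each time the working set spills out of
   * a cache level; below that size, a bigger cache buys you nothing. */

  int main(void)
  {
      for (size_t size = 4 * 1024; size <= 8 * 1024 * 1024; size *= 2) {
          size_t n = size / sizeof(int);
          int *a = calloc(n, sizeof(int));
          long accesses = 20 * 1000 * 1000;
          volatile int sink = 0;           /* keeps the reads honest */
          size_t i = 0;
          clock_t start = clock();
          for (long j = 0; j < accesses; j++) {
              sink += a[i];
              i += 64 / sizeof(int);       /* stride one 64-byte line */
              if (i >= n)
                  i = 0;
          }
          double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
          printf("%5zu KB: %.2f ns/access\n",
                 size / 1024, secs * 1e9 / accesses);
          free(a);
      }
      return 0;
  }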

We realized how silly all this was getting, and back in '88 SPEC (the Standard Performance Evaluation Corporation) was formed. The SPEC benchmarks attempted to solve the toy-benchmark problem with larger benchmarks that were either real-world programs or a subset thereof. For example, one of the SPEC95 benchmarks was a subset of a gcc compilation. They were also smart enough to report separate numbers for integer and floating-point performance: some people don't use FP at all, and some chips have very different capabilities for integer and FP math.

You can see how unrelated the clock speed (MHz) is to the SPEC performance times by studying the SPEC table [local, compressed]

Even so, SPEC testing only gives you two numbers, reduced from the runtimes of all the benchmarks (via a geometric mean of each benchmark's runtime ratio against a reference machine). What if you care more about one kind of test than another? What if you do, or don't do, lots of compilations - how should the gcc benchmark be weighed in? Or what if you use your computer in a way that's totally unrelated to SPEC? Another problem with SPEC is that the benchmarks quickly get dated. We've gone through SPEC89 and SPEC92, we're currently on SPEC95, and SPEC98 is in the works. This isn't just a matter of getting the newest and best; the problem is that the hardware gets so good that the old SPEC starts to look like a toy problem. A good example is a system with very large caches. If the cache is big enough, the SPEC benchmarks start to fit entirely in cache. That's good for the SPEC numbers up to a point, but as the caches keep growing and the benchmark stays the same, the benchmark no longer uses all of the cache, so it won't show the performance boost that a real-world application would get from that extra cache.
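Btw, that geometric mean works like this (made-up ratios): if a machine runs three benchmarks 2, 4, and 8 times faster than the reference machine, its composite score is

  (2.0 * 4.0 * 8.0)^(1/3) = 4.0

The nice property of the geometric mean is that comparing two machines gives the same answer no matter which reference machine the ratios were measured against, which an ordinary average can't promise.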

So what's the answer? If you're a casual PC buyer, it probably doesn't matter: you're probably nailed to a specific platform delivered by one supplier, so performance isn't really the issue; you know what the best is, and the question is what you can afford. If you're a casual workstation buyer, you can consider SPEC numbers; they may be all you get for comparing workstations. If you're a serious workstation/server buyer, you often work with the company to see how fast the system runs *your* applications; most processor labs do specific tuning for important applications, even to the point of adding hardware support that mostly helps only certain applications.

But there is no true answer in the form of a number. The only true performance equation isn't useful. It looks like:

performance = cycles/sec * instructions/cycle / num instr in program

The first term is MHz. The first two multiplied together are MIPS. The third term is silly: you can't really measure the number of instructions a program will execute, since it usually depends on user input, and if you knew every input you'd ever feed the computer and how long each would take, you'd probably already know the output and wouldn't need the computer, eh? The other terms aren't really calculable either. How many instructions per cycle? That depends on the instructions themselves and on whether they're in cache, and these days, with out-of-order execution becoming prevalent, an instruction's latency depends on the other instructions in flight around it.
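Plugging in made-up numbers, just to show how the units fall out:

  200 MHz * 0.75 instructions/cycle / 30 million instructions
    = 150 MIPS / 30 million instructions
    = 5 runs/sec, or 200 ms per run

All three numbers are fiction for any real workload, which is exactly the problem.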

Another factor we've pretty much skimmed over is the system itself. You could take the fastest processor in the world, put it in a system with slow memory, no cache, no graphics support, and slow I/O, and it would be slower than a PC. Most CPUs these days spend lots of time sitting around waiting on memory, I/O, and graphics. Speaking of graphics, there's a good example of hardware that can make a huge difference in application performance but won't really show up in your SPEC numbers. There are some graphics benchmarks, but they are mostly toys themselves.

So the only real answer is to run the programs you use on it. Of course, this only indicates the performance of a given run of a specific piece of software under a specific usage, but it's likely better than any other number you'll get.

Regardless, none of this really matters. CPU performance is becoming a non-issue: while CPUs and computers are speeding up exponentially, humans are not. There is a CPU speed that is actually good enough. Once the computer can process and generate every kind of information (sound, video, etc.) faster than we can take it in, we won't generally find ourselves waiting for our computers. Soon you'll find that people aren't even discussing clock speeds anymore, though I'm sure that for a while the computer and software vendors will find ways to waste cycles just to convince us to buy newer hardware (see Microsoft for reference). But this won't last forever, and new issues will arise in computer design.

