I worked for Intel's Microprocessor Research Labs (from 2001 to 2006) doing microarchitecture research.

Someone requested an AMA from an EE with CPU design experience, so here I am! :)

(1) How much of today's CPU architecture is basically the same as the first CPUs ever designed?

The overall pipeline of a modern CPU would be familiar to a mainframe architect from the 1960s. The biggest development since then has been the branch predictor, credited to Jim Smith in 1981.

Of course, everything is bigger now :)
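For a concrete flavor of the idea, here is a minimal sketch of a two-bit saturating-counter predictor in Python (illustrative only - real predictors index much larger tables by branch address and history, and modern ones are far more elaborate):

    # Minimal two-bit saturating-counter branch predictor (a sketch, not any
    # shipping design). Counter values 0,1 = predict not taken; 2,3 = taken.
    # Correct outcomes reinforce the counter, wrong ones decay it.
    class TwoBitPredictor:
        def __init__(self, table_bits=10):
            self.mask = (1 << table_bits) - 1
            self.counters = [1] * (1 << table_bits)   # start "weakly not taken"

        def predict(self, pc):
            return self.counters[pc & self.mask] >= 2    # True = predict taken

        def update(self, pc, taken):
            i = pc & self.mask
            if taken:
                self.counters[i] = min(3, self.counters[i] + 1)
            else:
                self.counters[i] = max(0, self.counters[i] - 1)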

(2) Are components like logic gates still individually laid out during planning, or are past component designs stored in some database to be called upon for use in new designs? If so, how often do they have to be revised to suit the needs of new processors?

Intel is not like most design houses. There is (or, at least, was) still much done by hand. Most places use libraries of circuits, with tools combining them to satisfy a high-level specification written in a hardware description language (HDL).

These libraries must be re-evaluated for every new process (45 nm to 32 nm, etc.). Depending on how much labor you are willing to spend, you end up with many different implementations that trade off size and speed.

(3) With such intricate circuits, how do you keep track of what effect an individual design element will have on the overall operation of the CPU?

There are many levels of abstraction to help deal with the complexity. An architect has a very high-level view, looking at pipeline stages. A logic designer is concerned with his own block: its specific inputs and outputs, and the function it has to perform. A circuit designer converts a specific block of HDL into transistors.

Software handles the overall simulation (although only small pieces can be simulated at high detail).
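For a flavor of what an architecture-level model deals in (as opposed to gate-level detail), here is a rough back-of-the-envelope sketch; the miss rates and penalties are invented for illustration, not taken from any real design:

    # Toy architecture-level performance model: cycles per instruction (CPI)
    # built up from a base CPI plus stall penalties. All numbers are made up.
    def cpi(base_cpi, branch_mpki, branch_penalty, l2_mpki, l2_penalty):
        # mpki = misses per thousand instructions
        return (base_cpi
                + branch_mpki / 1000.0 * branch_penalty
                + l2_mpki / 1000.0 * l2_penalty)

    print(cpi(base_cpi=0.5, branch_mpki=5, branch_penalty=15,
              l2_mpki=2, l2_penalty=200))   # 0.975 cycles per instruction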

(4) As I understand it, modern computers are still using essentially the same BASIC developed in the 60s converted to machine code to execute instructions. Do you think the future progression of CPU technology will require going back to the beginning and designing processors that can utilize a new, and perhaps more efficient, high-level programming language (if such a thing could exist)?

BASIC is actually a high-level programming language! (From a CPU point of view).

It's true that modern assembly language is similar to that of the original machines (actually, even a CISC like x86 is simpler than many that used to exist - think VAX).

It's unlikely that this will change.

There is a lot of work required to make a new instruction set, and there is usually little to gain. Remember, anything that is done in software will be much harder to do in hardware. A Python interpreter is a complex piece of code; you wouldn't want to try to do it in hardware.

(5) What do you think will happen as transistors approach the single-atom level and become subject to the effects of quantum mechanics?

Quantum effects have been a problem for a long time. When they start to dominate, that will be the end of transistors as we know them. We will have to move to rod-based computing (like Tinkertoys, except done with carbon nanotubes), or spintronics, or something even wilder.

I believe economics will limit us before physics does. A modern fab costs a lot of money, and it is becoming harder to charge a premium price for CPUs.


Feel free to ask for more on any of these subjects or anything else.


Proof submitted to mods.

Comments: 1243 • Responses: 56

biqqie180 karma

What's the best way to apply thermal paste to a CPU?

I always feel like I'm not doing it right and shit's gonna melt sooner or later.

eabrek335 karma

No idea :)

I only built one computer by hand (in college). That was when you didn't need paste :)

crazy_loop172 karma

This might be a stupid question, but why do the transistors always need to get smaller to make the chip faster? Can't you just use the same 22 nm size but actually make the processor bigger? I understand this can't go on forever, as the chip can't be 1 meter in size, but surely a CPU could easily be tripled in size to fit a shitload more transistors inside of it? Why is this not the case?

EDIT: I'm really just asking because I know we are nearing the limit when quantum theory will start to interfere with the way the chip works. So instead of going smaller with the transistor I was thinking of just scaling up the physical CPU dimensions.

eabrek73 karma

For a while, die size was increasing even as feature size decreased. (IIRC, the Pentium 4 was the first Intel product that was the same size as its predecessor.)

Increasing die size drives up cost. Also, the most critical portions of the CPU are usually about as big as they can get (we use up a lot of die space with cache).

rudib125 karma

What will be the dominant CPU instruction sets in the next few decades? Are we stuck with X64 and ARM (A64) or is there anything else on the horizon? Perhaps something new for GPUs?

eabrek184 karma

One or both of x64 and ARM will be around a long time (probably 20 years, at least. It's hard to look past 10; and 30 or 40 is very hard).

GPUs are able to hide their instruction sets behind the driver. This allows them to change more frequently. I haven't looked at Nvidia or ATI, but the Intel GPU's instruction set is very much like a normal CPU's.

I would expect that to remain for a while. It won't be until we've had a generation or two of people living with no hardware improvements (due to the end of silicon scaling) that people will be ready to try something radically new (necessity and invention :)

stasinop42 karma

I just want to make sure I understand you right. Are you saying that soon we will reach a plateau and computational power will completely cease to increase for a period of time? That seems rather disheartening.

Guvante62 karma

It depends on what you mean by computation power.

They will always be able to add more computation power; the question is whether they can do that while keeping the price the same.

Basically, right now you can wait 18 months and get an improvement in performance for "free". He is theorizing that this free boost will go away. You will still be able to wait 18 months and have another option in the $/performance matrix, but not one that is 100% better in every case.

Scarfall13 karma

What do you mean "for free"?

eabrek43 karma

Right, a $1000 CPU today is some combination of higher performance and lower power than the one from a year ago.

Eventually, new CPUs will be the same power and performance (and either cheaper, or even the same price). At that point, people are going to start looking for different ways of doing things.

narwhalpolis16 karma

How long until we reach that point?

eabrek48 karma

Whenever Moore's Law ends. It's always 5 to 15 years out, depending on where the lithography guys are in their research.

IndexPlusPlus3 karma

Would ARM ever be viable in a desktop scenario?

eabrek9 karma

There have been ARM desktops in the past (and probably are for niche applications today).

The problem is (always) software. It would run Android, or Linux ARM. Are there applications that you want to run in a desktop setting? Are there enough people like that to justify a market?

Kisolya91 karma

How do you become a CPU researcher? Can you detail your education and skill development?

eabrek127 karma

The most common way is to get a PhD in something related to computer architecture.

My own path is more convoluted, as I did not get a PhD :)

KeytarVillain66 karma

Care to elaborate? What degree do you have? How did you become a researcher without a PhD?

eabrek97 karma

I have an MS in ECE.

I started out writing software models for a chipset development team. From there, I got a job in the research group writing processor models. Then I was doing research with the rest of the team.

Our team was intentionally mixing PhDs with non-PhDs (with relevant industry experience).

jcf135 karma

If you don't mind me asking, where did you get your BS and MS?

eabrek62 karma

Carnegie Mellon

BeastKiller4506 karma

How many of the people you worked with had their PhD? Was it something you noticed them having before they got their job, or did they earn it while they were researchers?

eabrek13 karma

Our team was intentionally mixing PhDs (representing academia) with industry people - so it was about half.

Of course, our team was two people for a while, and maxed out at six :)

Sexual_tomato77 karma

How do you guys arrive at the different transistor sizes? Do you pick 40 nm and 32 nm arbitrarily, or is that influenced by outside factors? Why?

Also, how do you predict heat generation and dissipation? I'm sure you guys have it down to a science, but are there any white papers I can go read on the subject?

I'm a mech E, so I'm familiar with heat transfer; just wondering what your take on the details is.

eabrek120 karma

It's driven from two sides:

  1. Moore's Law (which is what we use to forecast our performance target)

  2. Lithography research and development (who are the real miracle workers).

So, as architects, we sit down and say, "Eight years from now, we will be at one-quarter feature size - what does our design look like?" Then, hopefully, the lithography guys deliver something close :)
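The arithmetic behind "one-quarter feature size in eight years" is just the classic scaling rules of thumb (roughly a 0.7x linear shrink every two years; these are generic textbook numbers, not Intel-internal figures):

    # Rule-of-thumb scaling arithmetic (illustrative, not a real roadmap)
    years = 8
    node_cadence = 2            # years per process node
    linear_shrink = 0.7         # linear shrink per node
    nodes = years // node_cadence
    print(linear_shrink ** nodes)              # ~0.24 -> about one quarter
    print(1 / (linear_shrink ** nodes) ** 2)   # ~17x the transistors per area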

ten2436 karma

If you're using Moore's Law as a performance target, then is Moore's law actually true, or are you guys just really good at hitting your targets? :)

eabrek32 karma

I remember one presentation where another team's forecast assumed that the future competing part from IBM would have less performance than IBM's parts at that time :)

We were like, "Is your strategy to assume IBM is stupid and incompetent?"

StarGalaxy72 karma

What's the impact of having several competitors (e.g. Intel, AMD) doing research in the same direction? Are resources wasted because people work on developing the same things? Or do companies actually benefit from each other's research?

eabrek129 karma

There is some amount of waste due to repeated effort. Of course, everyone does things a little differently, so you get different solutions in the marketplace.

The majority of research is openly shared (via publications and patents). Patents are cross-licensed to head off a "mutually assured destruction" patent war.

anubis11953 karma

In your opinion, which is more important: faster clock cycle or more efficient operation flow?

eabrek88 karma

My boss used to print out a big graph: MHz on the x-axis and SpecInt on the y-axis.

You can then clearly see the "brainiacs" (high IPC) and the "speed demons" (high clock).

The best designs were the ones that fell in the middle. You can really see it when you plot the isocurves - the shortest distance to higher performance is to drive up the plot at a 45-degree angle.
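One way to see the 45-degree point: if you model "design effort" as a fixed step length in (log clock, log IPC) space, splitting the step evenly beats spending it all on one axis. A toy calculation (the effort model here is invented purely for illustration):

    import math

    # Performance ~ clock x IPC, so log(perf) = log(clock) + log(IPC).
    # For a fixed step of length r in log space, moving at 45 degrees adds
    # r/sqrt(2) to each axis, i.e. r*sqrt(2) total to log(perf), versus
    # only r if the whole step goes into clock (or IPC) alone.
    effort = 0.3
    all_in_clock = math.exp(effort)                     # ~1.35x performance
    balanced     = math.exp(2 * effort / math.sqrt(2))  # ~1.53x performance
    print(all_in_clock, balanced)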

KlausKoe40 karma

ELI5: microcode updates.

Does it really come with Windows updates?

Is there flash/EPROM memory on the CPU?

Could a virus modify it?

BTW: thx 4 AMA

eabrek57 karma

The microcode (ucode) storage contains a read-only part, and a writable part. The ucode update changes the writable portion (which can override the read-only portion).

These updates are provided by the manufacturer, and signed by them. A virus could apply a patch (if it could insert itself into the proper part of the boot sequence - you can't apply them all the time), but it would have to be a valid patch. It would be cryptographically hard for a virus writer to make his own patch.

eabrek26 karma

And, yes, it is the operating system which applies these patches.
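Conceptually, the loader-side check works something like the sketch below. This is only an illustration of "only accept blobs the vendor signed": real microcode updates use an asymmetric (public-key) signature and a vendor-defined format, so the HMAC here is just a runnable stand-in, and the key and blob are invented.

    import hashlib, hmac

    # Stand-in for the vendor's real public-key signature scheme.
    VENDOR_KEY = b"known-only-to-the-manufacturer"

    def sign(blob):                        # done by the manufacturer
        return hmac.new(VENDOR_KEY, blob, hashlib.sha256).digest()

    def loader_accepts(blob, signature):   # done by the CPU's patch loader
        expected = hmac.new(VENDOR_KEY, blob, hashlib.sha256).digest()
        return hmac.compare_digest(expected, signature)

    patch = b"\x0f\x1e\x2d\x3c"            # some made-up opaque ucode blob
    assert loader_accepts(patch, sign(patch))             # genuine: accepted
    assert not loader_accepts(patch + b"!", sign(patch))  # tampered: rejected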

BOVINE_FETCHER37 karma

Would you rather battle one horse-sized Turing machine or 100 Turing-complete ducks?

eabrek60 karma

I'd go for one horse-size. Debugging parallel programs is a pain.

YoYoDingDongYo27 karma

Speaking from a company culture point of view, do you think it's plausible/likely that Intel weakened RdRand for the NSA, as alleged?

eabrek56 karma

It seems unlikely to me. Intel has a culture where intellectual excellence dominates (almost to a fault). Also, the company was burned very badly by CPUID, and usually tries to avoid anything that might lead to a PR fiasco.

mainhaxor20 karma

How was Intel burned by CPUID? Can someone elaborate?

eabrek29 karma

Here's an old Slashdot link.

BadgerBadgerDK25 karma

Hi! There will always be fanbois (I myself love AMD) - what's your view on their APU? Is this where mobile phones are heading (just at even lower power)? I'm still amazed at my Fusion-powered netbook, even if it's "old".

eabrek25 karma

I was surprised that Intel actually productized this before AMD (Intel is a juggernaut - unstoppable, but not famous for being nimble).

It's really a natural progression of miniaturization. Things get smaller, so more stuff gets integrated together.

blub000024 karma

Thanks for doing this. I have several questions:

  • Which do you prefer: VHDL or Verilog?
  • How much time is spent simulating the hardware vs. designing it (also do you use custom simulation software)?
  • How often do you actually have to go down to the gate level to fix timing issues and other stuff?
  • Are all of the standard cells you use designed in-house?
  • How do you prefer to design your state machines (Moore, Mealy, latched Mealy)?

Thanks for answering.

eabrek27 karma

I used Verilog in college and preferred that for a long time.

However, on my latest job, I have used VHDL - I like how formal it is. Verilog lets you get away with murder and create some really ugly, unreadable stuff. VHDL encourages you to make clear interfaces and supports different implementations easily.

That's why I now prefer VHDL.


We were looking at the architecture level, so most of our time was spent writing (and debugging) software. We'd submit a batch of simulations to the job pool, then work on the latest oddity, or spend time brainstorming a new arrangement that might work better.


We almost never got to the gate level in our work.


Yes, Intel did everything in-house, from design to manufacturing. Other design houses use a standard cell library from their fab partner.


I always preferred Moore :) Not that I worked at that level very often.

catnipd23 karma

Thanks for doing this AMA. Couple of questions: 1) How big a team (rough estimate) is required to develop a CPU (based on an existing instruction set, e.g. ARM or something simpler) from scratch? 2) Do you think FPGAs will become part of consumer-grade computers in the near future, kinda like GPUs did?

eabrek35 karma

  1. It really varies. There are college courses where you build most of a CPU with two or three people (you'd need maybe 15 to 20 to make it a product). Intel has teams of hundreds :)

  2. I like FPGAs, but I doubt they will ever become widely deployed. They pay ~20x overhead, so any algorithm that is a good fit for them becomes a new instruction in the next CPU generation. The reprogrammability is only a feature in highly constrained (i.e. niche) environments.

catnipd22 karma

Thank you for your answers. One more question (in light of recent paranoia), speculative and probably more related to workflow. Suppose there is a backdoor in modern Intel/AMD CPUs, e.g. some undocumented instruction allowing entry to real mode from protected mode. Inspecting a 22 nm chip itself is nigh impossible for a third party. But do you think it would be possible to conceal such a backdoor from the majority of the team working on the chip design?

eabrek35 karma

It would be very hard.

A simple scheme would be to implement something entirely in microcode (that would minimize the impact on the rest of the design). However, it would probably have a big impact on validation, which is a big part of the design. Things are very complicated already, and new modes multiply the complexity.

pseudosciense22 karma

What's your favorite instruction set?

eabrek29 karma

I'm partial to x86 (I like to be weird). I also spent a lot of time on Itanium, and have a love/hate relationship with it.

gruisinger14 karma

From a purely technical standpoint I can appreciate the Itanium architecture. But I have never understood what its purpose was from a marketing standpoint. As a long-time VAX and Alpha guy, I had high hopes for Itanium, but just like Alpha, it went nowhere outside of VMS and HP-UX (and even there I'd say its adoption was less than expected). Unlike Alpha, though, it really had to take hold elsewhere in order to succeed, which of course did not happen. I see Wikipedia says it was the fourth most deployed microprocessor architecture as of 2008, but it was being beaten out by SPARC at that time. I wonder where it is now.

I still use Alpha processors daily, and adore them. I was not happy when Compaq killed the Alpha line. Alpha was ahead of its time in some respects. Do you think the Alpha architecture could have continued to evolve and been competitive with other 64-bit processors, or had it become too dated to be saved?

eabrek11 karma

If you're really cynical, the marketing purpose was to create a new set of patents which were held by HP and Intel (i.e. not AMD). It also scared all the RISC guys into quitting :)

If you're less cynical, it was a solution to a problem noticed in the early 1990's (and fixed in 1996...)

san11249116 karma

What is your opinion on quantum computing? Which kind of processor do you think will take over after silicon computing has been exhausted?

eabrek22 karma

Quantum computing is interesting - in terms of what it might do to cryptography (i.e. destroy everything we've built to be secure :)

I believe the limit to silicon will be economics more than physics - so anything replacing it would need to be cheaper. Maybe something using DNA-like chemistry, or printable carbon nanotubes?

sloti16 karma

I have an exam about computer/processor architecture next week. Any tips? ;)

rui27831 karma

The control unit connects to the datapath, and together they make stuff happen. Also, you don't need subtract units, just some A + not(B) + 1!
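(A quick sanity check of that identity, as a small sketch in 8-bit arithmetic:)

    # A - B == A + NOT(B) + 1 when everything wraps modulo 2^8
    MASK = 0xFF
    for a, b in [(7, 3), (3, 7), (200, 55)]:
        sub = (a - b) & MASK
        add = (a + (~b & MASK) + 1) & MASK
        assert sub == add
        print(a, b, hex(sub), hex(add))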

eabrek22 karma

That made me laugh :)

eabrek18 karma

What textbook are you using?

sloti13 karma

Just the script our prof wrote. But I think it's based upon "Computer Architecture" (Morgan Kaufmann) and "Computer System Architecture" (Prentice Hall).

eabrek19 karma

It might be overkill, but a good discipline for me was working through pipeline diagrams (marking the cycle in which each instruction in the program enters each pipeline stage).

It was the hardest problem on my Superscalar Processor final! :)
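For the simplest case (a classic 5-stage in-order pipeline with no stalls), the diagram is just a staircase: instruction i enters stage s in cycle i + s. A small sketch that prints one - real exam problems add hazards, stalls, and forwarding:

    STAGES = ["IF", "ID", "EX", "MEM", "WB"]
    instructions = ["add", "lw", "sub", "beq"]

    # Each row is one instruction; columns are cycles.
    for i, inst in enumerate(instructions):
        row = ["    "] * i + [f"{s:4}" for s in STAGES]
        print(f"{inst:5}" + " ".join(row))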

coghex15 karma

Will we ever have gallium arsenide processors (like the Cray-3)?

eabrek17 karma

Unlikely. There is so much expertise in silicon that any alternative needs to be a lot better.

Also, the limits to silicon are largely economic, so any replacement needs to be cheaper also.

technicianx14 karma

Is x86 entirely virtualized on the silicon these days? From what I understand, the silicon doesn't actually correspond to the 80x86 design anymore, and is instead designed to effectively emulate x86 instructions - am I correct, or is this wrong entirely? I just cannot imagine that these CPUs are still handling 32-bit x86 instructions in the physical silicon. It's got to be virtualized by now, right?

Neikius12 karma

You are correct; it is called microcode. http://en.wikipedia.org/wiki/Microcode

Maybe OP can tell us more about the specifics, though.

eabrek15 karma

Microcode (or ucode) covers a lot of ideas, actually.

For example, the back end of the machine executes uops, but there is a 1:1 mapping for most x86 instructions. And where it's not 1:1, it's usually 2:1 (counting fused uops).

On the gripping hand, all the wacky mode changes and weird x86-isms are implemented using ucode programs.
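As a made-up illustration of that decode step (these particular mappings are invented for the example, not Intel's actual uop breakdowns):

    # Most x86 instructions decode to one internal uop; a few crack into two.
    DECODE = {
        "add eax, ebx":   ["uop_add"],                      # 1:1
        "mov eax, [mem]": ["uop_load"],                     # 1:1
        "add eax, [mem]": ["uop_load", "uop_add"],          # 2:1 (often fused)
        "push eax":       ["uop_store", "uop_update_esp"],  # 2:1
    }
    for inst, uops in DECODE.items():
        print(f"{inst:16} -> {uops}")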

0x000111114 karma

How exactly are the silicon gates aligned so precisely during processor manufacturing? It astonishes me that such tiny circuits, which are probably not all in the same alignment, can be put in place.

eabrek22 karma

The gates are built using "masks" (sort of like big photo-negatives, if you know what those are). It is the mask which must be aligned at each step.

Part of the development of each feature node is designing machines which can be very finely rotated until they line up (the mask is bigger than the wafer - 12 inches or more). There is a mark on the wafer which is lined up with the corresponding mark on the mask.

bulldozers11 karma

Thank you for doing this AMA! I have a few questions.

What advice do you have for a high school junior who is looking to go into computer engineering?

Is there anything I can do now to better prepare myself for a job in that field? Also, what types of higher level math do you use daily?

Edited for easier reading

eabrek13 karma

Day-to-day for me was (and is) programming. I rarely use anything above geometry.

But, for college, you'll need a lot of math. Also, programming (I'd recommend skills in a "systems" language (C, C++, or D), and a scripting language (Python)).

fluffer3136 karma

Did D really become a thing? I stopped being a professional C++ dev nearly 10 years ago (now on C#)

eabrek9 karma

D is not currently a thing. I hope it will become a thing :)

test_alpha11 karma

Hey, cool. What kinds of things did you work on at Intel? What do you do now?

If you feel like answering something a bit more involved, could you give a critique or opinion of the current state of some thing that interests you? e.g., Intel's latest CPUs; ARM CPUs; ARM64 ISA; birdwatching...

eabrek23 karma

I started in chipset design, working on software models for a server/workstation chipset.

From there, I was hired into a research group, starting out writing software models for a speculative out-of-order Itanium.

Then our group switched to x86, and I was working on next generation x86.

Now I work on software applications (similar day-to-day, hacking big codebases :)


I'm really disappointed with the way things worked out. I hoped someone would build bigger and more powerful machines, rather than just reducing power.

sdmike2110 karma

I love hardware (you could say that I have a hard-on for it :P), so it is super cool that you are doing this! That said, what processor technology are you excited about that most people may not have heard of? Oh, and AMD or Intel?

eabrek11 karma

I think 3d stacking will help prolong Moore's law. Not many people seem to know about it.

Intel has more resources for QA (and development), but there is a price premium. I have one AMD laptop (because it was cheap), and one Intel.

IHaveNoIdentity10 karma

Have you heard of the ootbcomp guys and their MILL CPU architecture? If so what do you think of their design and what do you believe will be their biggest challenge in terms of the implementation of software and hardware?

eabrek16 karma

It's really interesting. I keep telling myself I'll set aside time to figure it out, but never do :)

That said, (and having worked on Itanium), I'm skeptical. Our findings were that architecture doesn't matter. I could see some things being a little better, but it shouldn't be possible to get 10x on everyday code...

OldButStillFat10 karma

I believe that the next "evolutionary step" will be a mechanical, or bio-mechanical, being. What are your thoughts on this subject?

eabrek24 karma

There is work being done on using DNA (or DNA-like chemicals) for computing. I don't know much about it, but it is apparently good for massively parallel problems (i.e. things supercomputers do today).

It does not seem as good at serial problems (things desktop computers do). But maybe that will change.

I agree there is a lot of potential for development in biological engineering.

notyourhand10 karma

Favorite branch prediction algorithm?

eabrek18 karma

We worked with a guy who had a neural net. Pretty cool, if not entirely practical.

[deleted]4 karma

Hardware implemented? Sounds like a very cool application of a neural network.

eabrek6 karma

Yes, a tree of adders with weights (a perceptron).
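A rough sketch of the idea, in the spirit of the perceptron predictors from the literature (a single global-history perceptron; real proposals keep a table of them indexed by branch address, and the sizes here are arbitrary):

    HIST_LEN, THRESHOLD = 8, 16
    weights = [0] * (HIST_LEN + 1)      # weights[0] is the bias
    history = [1] * HIST_LEN            # +1 = taken, -1 = not taken

    def output():
        # Weighted sum of recent outcomes; >= 0 means "predict taken"
        return weights[0] + sum(w * h for w, h in zip(weights[1:], history))

    def update(outcome):                # outcome: +1 taken, -1 not taken
        global history
        y = output()
        mispredicted = (y >= 0) != (outcome == 1)
        if mispredicted or abs(y) <= THRESHOLD:
            weights[0] += outcome
            for i in range(HIST_LEN):
                weights[i + 1] += outcome * history[i]
        history = history[1:] + [outcome]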

detox10110 karma

Whatever happened with the "cube" CPU, or the 3D CPU? Since the only way CPUs can go faster is by shrinking them, there has to be a limit on how small a CPU can be, and then where do we go from there?

I think this was from about 10 years ago, and there were design flaws on the prototype where it had trouble with upper-level processing?

Am I getting this wrong and just remembering something else?

eabrek20 karma

I never heard of a cube (besides Game Cube :), but 3d manufacturing is a very interesting field of research.

The spec for 3d stacked DRAM was released last year.

We should see 3d stacked CPUs soon.

Nicksaurus10 karma

Isn't there a bit of an overheating risk with stacking CPUs in 3D?

eabrek9 karma

That's always the first thing people say :)

However, it's really no different than any other process step (you get twice the number of transistors in the same area)

throwaway1310728 karma

Would suddenly having room for double/triple/more transistors mean an immediate performance boost, or would those dies have to be redesigned to make those transistors useful? I run a 4930K @ 4.5GHz/1.32v, it boggles my mind to think we can still do so much better :D

Your field is incredible and you guys are some modern magicians, thanks for your work and time!

eabrek5 karma

Double the transistors means double the heat :)

A lot of research and development is needed to make use of the transistors in the best way.

Tasty139 karma

Hi, thanks for the AMA! I was curious what your opinion is on the future of graphene in the CPU world.

eabrek10 karma

If it lives up to all its promises, it could extend Moore's Law for a long time (several generations - maybe 10 years).

Irish_Subotai8 karma

[deleted]

eabrek7 karma

We were looking at features to include in Sandy Bridge.

Kataclysm7 karma

Do you remember the old Cyrix CPUs? What's your opinion on some of the CPU makers from the olden days that didn't survive?

Haxornator3 karma

I am also curious about Cyrix. It appeared during the Pentium 1 craze, then disappeared shortly thereafter. I got my first computer at that time and luckily got a P133 with a sweet 8 MB of RAM. It ran Doom and OMF like a champ! But I really am curious if Cyrix was any good, because the price point then was amazing.

eabrek7 karma

It comes about from only needing a double handful of guys to make a simple processor.

However, you need hundreds (or thousands) to make the most of the process technology and have new designs coming out every year...

misterfalone6 karma

Hi, technologically impaired PC gamer here. Does choosing an Intel i3 over an Intel i7 really affect gaming significantly? Usually when I'm looking at PC or notebook specifications, the first and the last thing I look at is the VGA card. Hope the answer will change my view :|

eabrek11 karma

Really high performance games are targeted at the latest graphics cards.

However, for older games, most machines should be OK. The i3 is priced as low as possible, and can have much lower performance. Usually an i5 or low-end i7 has good performance at a reasonable price.

Bat_turd6 karma

What is the most powerful processor in the world right now? What are they used for? :)

eabrek7 karma

I haven't been keeping up to date, but I would guess it's a POWER (from IBM). They have a huge power budget.

m08inthem085 karma

How cool is this! What are your thoughts on the Z-80 using only a 4-bit ALU? I think this is amazing!

Link: http://www.righto.com/2013/09/the-z-80-has-4-bit-alu-heres-how-it.html

eabrek12 karma

Did you know the Pentium 4 did something similar? The first generation did 32-bit operations 16 bits at a time, while the next generation did 64-bit operations 32 bits at a time.
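The trick is just ripple-carrying between the halves. A toy illustration of a 32-bit add done as two 16-bit adds (roughly what a narrow ALU does over successive half-cycles):

    def add32_via_16(a, b):
        lo = (a & 0xFFFF) + (b & 0xFFFF)          # low half first
        carry = lo >> 16
        hi = (a >> 16) + (b >> 16) + carry        # high half plus the carry
        return ((hi << 16) | (lo & 0xFFFF)) & 0xFFFFFFFF

    assert add32_via_16(0x0001FFFF, 0x00000001) == 0x00020000
    assert add32_via_16(0xFFFFFFFF, 0x00000001) == 0x00000000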

gokrix5 karma

What is "uncore"?

eabrek5 karma

It can refer to anything on the die that is not a processor core.

I think the latest marketing use is to capture things like the size of the L3, the number of high-speed links, and maybe graphics or other co-processors.

gilbertsmith4 karma

How do you feel about LGA processors? Do you think it's better for Intel or better for customers?

eabrek7 karma

I haven't followed the whole issue, but Intel is usually actually trying to do what is best for customers. Not many customers upgrade CPUs - they usually buy new machines. So the emphasis is on making new machines that are cheap and have good performance.