Scott Aaronson on Computational Complexity Theory and Quantum Computers

by Y Combinator6/29/2018

Scott Aaronson is the David J. Bruton Centennial Professor of Computer Science at The University of Texas at Austin, and director of its Quantum Information Center. Before teaching at UT, he taught Electrical Engineering and Computer Science at MIT. His research interests center around the capabilities and limits of quantum computers, and computational complexity theory more generally.

If you’ve listened to our other episodes about quantum computers and are curious to learn more, check out Scott’s book Quantum Computing Since Democritus.

And if you want to read Scott’s blog you can find that at


0:10 – Why do quantum computers not solve hard search problems instantaneously by trying all the possible solutions at once?

6:40 – How did quantum computers end up in the business use case category?

12:30 – Generating random numbers with a quantum computer.

24:53 – How does the partial destruction work?

30:28 – P versus NP

37:21 – The holographic principle

47:52 – Anagh asks, how can we channel AI growth but not weaponize it?

54:08 – Micheal Burge asks, is anyone keeping track of the smallest N such that Busy Beaver of N is independent of ZF set theory?

01:01:59 – Blogging and social media

01:07:27 – Advice for nerds in general or people who want careers in science


Google Play


Craig Cannon [00:00:00] – Hey, how’s it going. This is Craig Cannon, and you’re listening to Y Combinator’s podcast. Today’s episode is with Scott Aaronson. Scott’s the David J. Bruton Centennial Professor of Computer Science at The University of Texas Austin. He’s also the director of their Quantum Information Center. Before, Scott taught at UT, he taught electrical engineering and CS at MIT. If you’ve checked out our other episodes about quantum and want to learn more, I’d recommend checking out Scott’s book, which is called Quantum Computing Since Democritus. He’s also been blogging for over 10 years and you can find all those posts at All right, here we go. Today, we have Scott Aaronson, UT professor, CS, and also a blogger at Shtetl-Optimized, and on the top of your blog, you have something that says, “If you take just one piece of information away from this blog, quantum computers would not solve hard search problems instantaneously by simply trying all the possible solutions at once.” Why not?

Scott Aaronson [00:01:02] – Great question. Glad you asked. I’ve been researching quantum computing, working in this area for about 20 years. I’ve been blogging about it for 15 years. The single most common misconception, right, which you find repeated in almost every popular article about the subject that is written says, well, a classical computer is made of bits, and so it can just try each possible solution one by one, but a quantum computer is made of qubits, which can be zero and one at the same time, and this means that if you have 100 qubits, that the quantum computer can explore two to the hundredth power states simultaneously, and then it can just try all the possible answers at once. Well, that is gesturing towards something in the vicinity of the truth, but it’s also very seriously misleading. It leads people to think that quantum computers would have capabilities that actually we don’t think that they would have. This is not even controversial within this field, right. We all know this, but it’s very hard to get the message out. I’ve been trying. Here’s the situation. The central thing that quantum mechanics says about the world is that to each possible state of a physical system, each possible way that it could be when you measure it, you have to assign a number called an amplitude, and amplitudes are related to probabilities. A larger amplitude means you’re more likely to see that outcome, but amplitudes are different from probabilities.

Scott Aaronson [00:02:56] – Unlike a probability, an amplitude can be negative. In fact, it can even be a complex number and so all the sort of magic of quantum mechanics, anything you’ve ever heard about the weirdness of the quantum world–

Craig Cannon [00:03:14] – Spooky.

Scott Aaronson [00:03:15] – Yeah, the spookiness, it all boils down in the end to the way that these amplitudes work differently from probabilities. In particular, probabilities could sort of add up positively, right. The more ways that something could happen, right, that just keeps increasing the probability that it happens, but amplitudes can, as we say, interfere destructively and cancel each other out. If a photon could reach a certain point, for example, in one way with a positive amplitude and another way with a negative amplitude, those two things can cancel so that you never see the photon there at all, as in the famous double-slit experiment. If you won’t take this on my authority, you can take it on Richard Feynman’s. He used to say that everything in quantum mechanics boils down to these minus signs.

Craig Cannon [00:04:07] – Yeah, absolutely.

Scott Aaronson [00:04:09] – A quantum computer is just a device that could maintain a state that is, as we say, a superposition, right, with an amplitude for every possible configuration of the bits. Indeed, if you had a computer with a hundred quantum bits, qubits as we call them, that is two to 100th power amplitudes that are needed to maintain that computers state. It’s actually very easy to create, as we say, an equal superposition over all the possible states, which you could think of as every possible key for a cryptographic code. Every possible solution to your optimization problem. That’s what the popular articles are trying to talk about. All of that is true. The problem is, well for a computer to be useful at some point you’ve got to look at the thing. At some point you’ve got to measure and get an answer out. The rules of quantum mechanics are very specific about how these amplitudes turn into ordinary probabilities that you’ll see something. The rule is, the probability of some outcome is just the squared absolute value of it’s amplitude, that’s the rule. It sounds a little technical but that’s one of the most basic rules of the universe. It’s probably worth remembering.

Craig Cannon [00:05:33] – If you saw it on paper it would be.

Scott Aaronson [00:05:34] – Yeah that’s right, that’s right. In particular, one thing that means is if I just created an equal superposition over all the possible answers to some problem and then I measure it, not having done anything else, then all I’m going to see will be a random answer. Now if all I wanted was a random answer I could’ve just flipped a coin a few times, saved billions of dollars building this device. The entire hope of getting a speed advantage from a quantum computer is to exploit the way that amplitudes work differently. It’s to try choreograph a pattern of interference, where for each wrong answer to your computational problem, some of the paths leading there have positive amplitudes, some have negative amplitude, so they cancel each other out. Where as for the right answer you want all the contributions to it to reinforce each other. The tricky part is you’ve got to figure out how to do that despite not knowing in advance which answer is the right one.

Craig Cannon [00:06:37] – Right, in addition to the error correction.

Scott Aaronson [00:06:39] – That’s right, oh yeah, yeah, all of that too. Though those are all the engineering difficulties. Right now I’m talking about even if you had a perfect quantum computer. It’s still not obvious how to get a speed advantage for it because you’ve got to choreograph this interference pattern. Nature gives you this really bizarre hammer, and it was not until the 1990s that people really started figuring out what nails you could hit with this hammer.

Craig Cannon [00:07:04] – Well, this is a question I had for you. Quantum computers seem to be in this technology category rather than the science category. So we did a podcast with Rana Adhikari from LIGO and that is squarely put in the science category. How did quantum computers end up in this like business use case category?

Scott Aaronson [00:07:24] – That’s a super interesting question. Indeed, often when magazines and newspapers are writing about quantum computing they put a technology reporter on it. And then, they want to know about, well, you know, how is this going to impact the finance industry in the next five years. Let’s just see if the universe, let’s prove that the universe has this capability at all. How about we do that as the first step?

Craig Cannon [00:07:50] – It’s an important understanding, because like it’s not there…

Scott Aaronson [00:07:53] – That’s right, that’s right. Part of what makes it exciting is that this is fundamental science, right. I mean, not in the sense that we’re overthrowing any of the known laws of physics. In fact, all we’re doing is, we’re in some sense, we’re taking completely seriously quantum mechanics as it was written down in 1926. It hasn’t changed at all since that time. But, what we’re doing, what we’re trying to do is to test it in an entirely new regime. Where really, it has never been tested in this regime of, let’s say, universal quantum computation, or you know, quantum error correction. The math seems very unequivocal that this could be done. Right, but this is really building something that has never been built before. There are skeptics. I talk to them a lot, they come on my blog often, who say this will never work. This is so absurd, that you could build this quantum computer that there has to be some new law of physics that will prevent it.

Craig Cannon [00:09:04] – But there seems to be equal amounts of crackpot scientist who come on your blog and figure out the P vs. NP problem.

Scott Aaronson [00:09:11] – I get every kind of original thinker on my blog. But, that’s one of the joys and pitfalls of blogging. In particular there are people, including very serious and well known computer scientists and physicists who say, well look, if not a new law of physics, it must be that we just don’t understand quantum mechanics well enough. That the error is going to inherently kill you. You will not be able to scale this. Maybe you could build a small quantum computer but nature will prevent you from scaling it up. In the 90s that actually seemed like a pretty plausible view, even to many of the experts in this field. What changed everything for most of us in the 90s was the discovery of quantum error correction and quantum fault tolerance. The upshot of which was, well if you want to build a scalable quantum computer you don’t need to get perfect qubits that are perfectly isolated from their environment. Which of course, would be a physical absurdity. You merely need to get them ridiculously well isolated from their environment. And you know, way better than we can do. But in the minds of most experts it reduced it to merely a staggeringly hard engineering problem. Now, what I’d like to say is that if it turned out that there’s some deep reason why this could never be done, and if the attempt to build quantum computers were to lead instead to the discovery of that new impossibility principle, well then that’s awesome.

Scott Aaronson [00:11:07] – That’s like Nobel Prizes for whoever discovers it and compared to that, the idea that you can build a quantum computer is the more boring possibility. That’s the more conservative option. But, we’re, as I said, testing quantum mechanics in this new regime and we want to know the truth whatever it is. There is fundamental science here. And to me that’s really why I got into this. I like to say, for me the number one application of a quantum computer is not breaking cryptographic codes, it’s not optimization problems, it’s not simulating quantum physics, it’s just proving the skeptics wrong.

Craig Cannon [00:11:51] – What happened in your childhood to cause that?

Scott Aaronson [00:11:54] – A lot happened, but we don’t have to go into that. It is sort of seeing whether nature actually has this computational ability beneath the surface. Now of course, what made a lot of the world interested in it, is that it actually could have some applications, right. Maybe the most important application that we know about is just giving us this new way to simulate nature. Simulate physics and chemistry, and maybe discover new drugs, discover new materials, right. That’s the application that Richard Feynman and the others had in mind when they first proposed the idea of quantum computing in the 1980s.

Craig Cannon [00:12:36] – But before we started recording you were talking about what you’ve been working on for the past year. Which is potentially relevant because many of these drug finding applications might need, say a million qubits?

Scott Aaronson [00:12:47] – We might already be able to start getting some of these with some hundreds of qubits. People are going to try.

Craig Cannon [00:12:53] – But we’re not even at 50 yet?

Scott Aaronson [00:12:55] – Uh, that’s right, that’s right. Not 50 that we have good enough control over, certainly.

Craig Cannon [00:13:01] – Right, so what was the application that you were working on?

Scott Aaronson [00:13:04] – Oh, okay, so I have a new idea that I’ve been working on for the past four months or so. There’s been independent work by others pursuing related ideas. But this, as far as I can see, may be the first application of quantum computing that people could actually be able to realize with near term devices, with 50 or 60 or 70 qubits. This application is to generate cryptographically secure random bits. Okay, so for example, if you have these proof-of-stake cryptocurrencies, you need a huge public source of random bits to sort of run this lottery to decide who is allowed to add a new block to the blockchain. For all sorts of other cryptographic applications you need public random bits. Of course, to decide which precincts to audit in an election for example. For cryptography you also need secret random bits. Now, for secret random bits you would need to own the quantum computer yourself. You wouldn’t want to download them over the internet. But, so now, there are many websites already that will give you public randomness. There’s one called Where you can just get random bits to order.

Craig Cannon [00:14:35] – Allegedly random.

Scott Aaronson [00:14:36] – Yeah right, yes, yes, allegedly random, thank you, right. NIST runs something called the Randomness Beacon. Where every minute they release 512 new random bits to the world, right.

Craig Cannon [00:14:48] – Oh, okay.

Scott Aaronson [00:14:49] – Which are partly generated using quantum mechanical processes. You could say if you believe quantum mechanics, right, which you should, then it’s very easy to get random bits. Right, randomness is kind of baked in to quantum mechanics, famously right. You could just keep measuring some photons you know, measure the polarization, outcomes will be random. Right, or just get some radioactive material from Kim Jong or whatever. Put a Geiger counter, right, the decays will be random. But, if you were to get these random bits from the internet then the question is, how do you know how they were generated. How do you know that the hardware wasn’t back-doored by someone? In fact, NIST did have a standard for pseudo-random bits which we learned a few years ago, because of the Snowden documents, was back-doored by, most likely, by the NSA. There is a big issue of how can you trust randomness. How can you prove to someone that the bits were actually generated randomly. It seems almost like a philosophical impossibility.

Craig Cannon [00:16:09] – How does yours work then?

Scott Aaronson [00:16:10] – There are ways to do this with quantum mechanics. Over the past 10 years people discovered that one way to do it is using the Bell inequality. Which means if you had two entangled particles that were far away and you measure them and you see that they have a certain type of correlation that could not have been produced classically. This is the famous Bell inequality, it’s the thing that disproves Einstein and shows that there are no secret local hidden variables that control what’s going on. That, sort of, quantum entanglement is a real thing in the universe. But, for many decades people said, well, this is conceptually a breakthrough that Bell made. Of course, it’s completely useless. Of course, you don’t actually need to create these correlations between faraway particles. But, what people realized a decade ago is actually the fact that you’ve created those correlations also means that there has to be some randomness in the bits that are produced. Because the only way to create those correlations without using true randomness would be using communication. But, if we put the things far enough away that a signal could not have gotten from on to the other even traveling at the speed of light, then we can have some kind of assurance that there is randomness there. But, now, the one thing is you’ve got to believe that these devices were really separated far enough. Which again, if it’s over the Internet, how would you know? The new realization is that you can get guaranteed randomness with a single device.

Scott Aaronson [00:17:57] – At least under some cryptographic assumptions. As long as that single device is able to do quantum computations that are hard to simulate with a classical computer. Basically you would imagine that Google, let’s say, has some 70 qubit quantum computer as indeed they are working to build right now, I just visited their lab a month ago. They’re working on it. I don’t know when it’s going to be ready. But, then you could, with your classical computer, submit challenges to the quantum computer that basically say just run this quantum circuit. Which is a pretty random looking, pretty messy arbitrary quantum circuit. But just run this quantum circuit and then it will lead to some probability distribution over output strings. In this case strings of 70 bits. Then it just gives me a sample from this distribution. I’ll just keep sending it challenges of that kind. One after another. Each time I demand that it send me back the sample from this distribution in a very short time. Like let’s say half a second.

Scott Aaronson [00:19:12] – And then I take these samples, and now if Google did the right thing then these samples have lots of randomness in them. But, the more interesting part is that under a cryptographic assumption, if I can check that these samples were actually correlated with distributions that they’re supposed be.

Craig Cannon [00:19:33] – In other words, one shows up 10% of the time.

Scott Aaronson [00:19:36] – All the outcomes are pretty unlikely. Because they’re all like, on the order of two to the minus 70 probability of occurring. But, not exactly, right. Some of them are like twice two to the minus 70. Some are half two to the minus 70. I can check that the heavier outcomes are more likely to occur. I can do some statistical test to check whether Google is plausibly sampling from this distribution. Then what we can do is we can mathematically prove that, if you’ll assume that some problem is hard for a quantum computer that looks like it should be hard, then it would follow that even with a quantum computer, the only way that Google could be quickly generating samples that pass this test is to generate them truly randomly. There’s no secretly deterministic way to do it without them spending a huge amount of computational power, more than we believe they have.

Craig Cannon [00:20:38] – When you were testing this out or did you test it out, with tons of computers?

Scott Aaronson [00:20:42] – It has not been tested out with real quantum computers yet. I mean, the apparatus that I use is a pen, paper, and I did use Maple a little bit to do some numerical optimizations.

Craig Cannon [00:21:03] – Not the sexiest thing, okay.

Scott Aaronson [00:21:06] – I’m a theoretical computer scientist. But, Google is hoping to move forward and test this out. And actually demonstrate it once they have the device. I mean, of course it could also be simulated with a classical computer. One could code something up. But, one thing that’s exciting about this is that it looks like pretty much as soon as you have a 60 or 70 qubit quantum computer that can sample distributions that are hard to sample with a classical computer, you can pretty much get this application.

Craig Cannon [00:21:46] – Wow, okay, cool.

Scott Aaronson [00:21:47] – It’s sort of designed for near-term quantum computers and in fact, even if you had many more qubits, we couldn’t really make use of them for this application. Because the verification, if I have N qubits is going to take two to the N time with my classical computer. Which means that with a thousand qubits, it might be working fine and yet we could never verify it.

Craig Cannon [00:22:14] – Man, what other projects did you clear from the cache on your sabbatical?

Scott Aaronson [00:22:20] – Well, not many. I was on sabbatical Tel Aviv for a year. I came with a long list of old papers that need to be written up. And I wrote up almost none of them. I just started new projects. And put the old ones even further onto the back burner. Which is often the way it goes unfortunately. But actually, I did write a paper this year about a new procedure for measuring quantum states. Which I call, shadow tomography. I had a more boring name for it and then a physicist friend.

Craig Cannon [00:23:07] – I appreciate it, it got punched up.

Scott Aaronson [00:23:09] – Yeah it came up. Physicists are much better than computer scientists at naming things. It’s called shadow tomography. And what it is, so measurement in quantum mechanics is an inherently destructive process. It, famously, it collapses the wave function, it collapses your state and you only get one chance. The problem that shadow tomography is trying to solve is, let’s say I have a small number of copies of a quantum state. But I want to know the behavior of that quantum state on a large number of measurements. So, a much, much, larger number of different measurements then the number of copies that I have. Maybe even an exponentially larger number. Let’s say, for simplicity, that each measurement has just two possible outcomes, yes or no. But I want to know for each measurement what is approximately the probability that it would return yes, applied to this state. Of course, if I have enough copies I could just measure each copy, you know, with a different measurement. But I don’t have enough copies, right. And again, if I have enough copies then I could just fully learn my state, right. Measure each copy and then eventually by collecting enough statistics I could write down in my classical computer a full description of the quantum state. But I don’t have enough copies for that either.

Craig Cannon [00:24:37] – Where are these assumptions coming from, that you don’t have enough copies?

Scott Aaronson [00:24:39] – Oh, well I’m just telling you that because this makes the question interesting.

Craig Cannon [00:24:45] – Alright thanks.

Scott Aaronson [00:24:46] – If we do have enough copies then we do one of those simpler things. But I’m asking what happens if we don’t. Maybe it’s very expensive with your hardware to create new copies of the state. What shadow tomography is, is it’s a way to, sort of, take these states and measure them very very carefully so that you can keep reusing them over and over.

Craig Cannon [00:25:14] – Without destroying them.

Scott Aaronson [00:25:15] – Without destroying them, right. Damaging them only slightly, each time. And learn the answers to each of these yes or no questions. Which again, there could be exponentially more of than there are copies of the state.

Craig Cannon [00:25:28] – How does the partial destruction work?

Scott Aaronson [00:25:31] – It has to do with the way that measurement works in quantum mechanics. If you measure your state in the ‘wrong’ basis then the state is destroyed. For example, if I have an electron that’s very spread out in position and ask it what’s its position, then I now force it to make up its mind. It’s localized now to one place and then that destroys all the other information that was in that superposition over positions. On the other hand, if I ask that electron for its momentum, well then its momentum might have been much more determinate. If I ask a question where given knowledge of the state, someone could have almost perfectly predicted the answer to that question, then the state only needs to be damaged slightly by the measurement. We know that not all measurements are destructive. For example, if I read a book and I see what words are written in it, my friend can read the book too. That’s a non-destructive measurement. But, even in the quantum realm, like, if I’m careful to measure a state in a basis where it’s already been localized then that’s not going to damage it by very much. Now the challenge, of course, is I have these copies and I don’t know in which basis they’re localized. But, I can do something, and this measurement procedure that I designed takes a very long time to carry out. I’m not promising you that it’s fast. But it makes very very careful reuse

Scott Aaronson [00:27:17] – of these same states over and over again. This has various implications; it solves various theoretical questions that I cared about. And by the way, I first conjectured that this was not possible. And this is the way that research often happens for me. I tried to rule it out, I was unable to rule it out. And then eventually I figured out why.

Craig Cannon [00:27:39] – Was it just brute force pen and paper? Or did you have a conversation that sparked it?

Scott Aaronson [00:27:44] – Well I mean, I had taught a mini course in Barbados a few years ago. Where I raised this as a question and yeah, I don’t see how to do this, maybe it’s not possible. Then I just thought about it more. Oh yeah and then of course I was building on early work that I and others had done. You never start completely from scratch. You kind of know the tools that were used to solve related problems. But anyway, then this year we’ve carried it further in joint work with Guy Rothblum from the Weizmann Institute. And, what happened was I gave a talk about this work and Guy said, this idea of measuring a quantum state very gently and not damaging it, this sounds a lot like what I work on. Guy is a classical computer scientist who works in a field called differential privacy. Some listeners may have heard of this. This is actually used… I don’t know if Facebook uses it, but some websites use this. It’s a way that you can do data mining on a database of a whole bunch of sensitive user data. Could be medical records, could be all the users’ personal data. But, you can do it in a way that mathematically guarantees, in some sense, that you’re not going to be revealing too much information about any individual user. In the sense that, if any individual user were to drop out of the database or change their data, then that would have only a very small probabilistic effect on the outputs of your algorithm. And the way that you achieve differential privacy is often things like, I may ask

Scott Aaronson [00:29:40] – how many of the users have colon cancer or something. But then I’ll add some random noise to the result. And the data with a little bit of random noise is still perfectly good for doing medical statistics, but now it’s like even if I knew everyone else’s data, I still can’t determine whether a particular person has colon cancer or not. And you do the same kinds of things to get gentle measurements of quantum states. So Guy said, there seems to be a connection here. And I said, come on. You can, you know, relate anything to anything else, right. It’s probably just an analogy. But then, we sat down and in fact there is a connection. There’s a precise mathematical connection between these two problems. You can prove it. It goes in both directions. And then we were actually able to use it. We can take work that’s been done in differential privacy by people who don’t know anything about quantum mechanics, right, just purely classical CS, and use it to get better procedures for shadow tomography.

Craig Cannon [00:30:49] – That’s really cool. Is that online?

Scott Aaronson [00:30:51] – Uh, not yet, it’s another thing on my stack to write up.

Craig Cannon [00:30:55] – It’s like another sabbatical away.

Scott Aaronson [00:30:56] – Yeah, well you know. We will try to write it up this summer.

Craig Cannon [00:31:01] – Alright, cool. Moving to one of the more poorly named CS problems that we talked about over email. The P versus NP problem. I heard you describe this because it can sound really complicated. But I heard you describe it once as, for every efficiently checkable problem, is it efficiently solvable. And I thought that was a good way to describe it.

Scott Aaronson [00:31:23] – Yeah, well, that’s just the standard way to say what this problem means.

Craig Cannon [00:31:28] – But why does it matter, I guess is the question.

Scott Aaronson [00:31:30] – It’s a strong contender for the most important unsolved problem in math of this century. NP stands for non-deterministic polynomial time. As I said, we’re not as good at naming things as the physicists. But it’s all the problems where if I told you the answer you could check it efficiently. What we mean by efficiently in computer science is, by an algorithm that uses a number of steps that scales, at most, like the size of the problem raised to some fixed power. We call that a polynomial time algorithm. That’s kind of our rough and ready criterion. It doesn’t always correspond in practice to efficient, but it’s a pretty good stand-in.

Craig Cannon [00:32:18] – It’s like a ball park.

Scott Aaronson [00:32:20] – Yeah, exactly, it’s ball park. If we can’t even answer this, than we’re not going to answer the more refined questions either! Now, P is all the problems that are efficiently solvable. Right, they actually have an algorithm that will find the answer in a polynomial number of steps. So a good example of an NP problem would be factoring. I give you an enormous number and I ask, what are its prime factors. That problem happens to underlie much of modern cryptography. A good example of a problem in P is, if I just give you a number and I ask you whether it’s prime or not but not to find the factors, then that actually has a fast algorithm. It was only proven to be in P 16 years ago. We had a big breakthrough. Okay, so that’s an illustration of how it can be very very non-obvious to figure out which problems have these efficient algorithms and which don’t. As soon as anyone, as soon as a layperson understands the P versus NP question I think most of them would say, well of course there’s not going to be an efficient way to solve every problem that’s efficiently checkable. Why are you even asking that? Like a jigsaw puzzle, obviously it’s a lot easier to look at a jigsaw puzzle that your friend did and say, oh yeah, good job, looks like you finished it, than to actually do it yourself. Same with a Sudoku puzzle. Same with breaking a cryptographic code. Same with solving some optimization problem like, optimizing an airline schedule that might involve solving some satisfying

Scott Aaronson [00:34:02] – some enormous number of constraints. Or as many of them as you can when they conflict with each other.

Craig Cannon [00:34:09] – Right, but it’s not proven not to be possible.

Scott Aaronson [00:34:11] – That’s right, that’s right. No one has ruled out that a fast such algorithm could exist. Essentially it’s very very hard to prove a negative. Occasionally we can do it. But it tends to be much harder. If something is possible you just have to show how to do it. But to prove that something is not possible you have to, in some sense, understand the space of all the possible ways that it could have been done. Give some general argument why none of them could work. Right, so sometimes we can do that. It’s not like we’ve made no progress. But we’re a long long way from being able to prove things like P not equal to NP. I like to say that if were physicists we would have just declared P not equal to NP to be a law of nature. We would of just been done with it.

Craig Cannon [00:35:02] – Yeah, Feynman would have declared it.

Scott Aaronson [00:35:03] – Yeah, yeah, that’s right, right. Well, like the Second Law of Thermodynamics. We would have given ourselves Nobel Prizes for our discovery of the law. And later if it turned out that, oh, actually P equals NP, there is a fast way to solve all these problems, well then we could award ourselves more Nobel Prizes for the law’s overthrow! But, you know, one thing you learn in an interdisciplinary subject like quantum computing is that there are differences in terminology and culture between fields, right? What the physicists call a law we call a conjecture.

Craig Cannon [00:35:38] – It’s increasingly hard to draw the line in between the two as well. In like CS, math, physics.

Scott Aaronson [00:35:45] – Oh yeah, I make fun of my brothers and sisters in physics all the time but only because I love them, of course. But, in fact, large parts of physics and CS have been coming together in the last decades. Partly, statistical physics, made this very very deep connection between spin glasses and condensed matter physics and combinatorial optimization problems. You can understand what many algorithms are doing using physical analogies. Then of course quantum computing was this enormous intersection where suddenly these fields were just thrust together and they had to quickly learn each other’s terminology and frame of reference. That’s a good part of what I’ve been doing, just helping to translate. I give colloquia in physics departments where they just want to know, like, what are P and NP. What are the basics of undergraduate level computer science. But, what’s cool is that I can talk to string theorists, let’s say, and they know this staggering tower of knowledge that I don’t know. That I’m only at the lowest foothills of. And yet suddenly they too need to know about computer science. And so…

Craig Cannon [00:37:13] – They have to respect you.

Scott Aaronson [00:37:14] – Well, you know, we have something to talk about.

Craig Cannon [00:37:18] – We did a podcast with Leonard Susskind that’s not out yet.

Scott Aaronson [00:37:22] – He’s the perfect example of someone who is been pushing this intersection. Maybe even more aggressively than I’ve been. Every time I talk to him I’m like, slow down Lenny. You know, computer science is not quite the future of all of physics. And he’s like, “It absolutely is.”

Craig Cannon [00:37:42] – We didn’t even get that far. But I do have a question related to his work. He talks about this holographic principle. How does that relate to the firewall paradox? I couldn’t quite grasp the two together.

Scott Aaronson [00:37:56] – This is a big discussion. Okay, the holographic principle is this general phenomenon where often you write down a physical theory in some number of dimensions, involving let’s say a three-dimensional space and then it turns out to be dual in some sense to a completely different looking physical theory, which is defined on the boundary of that space. Yes, in a different number of spacetime dimensions, one fewer. And the first theory, the one that’s in the bulk, as they say, in the interior, involves gravity. It’s a quantum theory of gravity. Where it can have things like black holes that form and evaporate. Whereas the theory that’s on the boundary is a pretty ordinary quantum field theory. Meaning it has no gravity. It has a flat spacetime. These two theories look totally different and you go, what do you even mean in saying that they’re the same thing?

Craig Cannon [00:39:07] – Like literally the same information.

Scott Aaronson [00:39:08] – Well you mean that there is a one-to-one mapping between states of the first theory and states of the second theory. And this mapping is non-local. I could have a little particle over here inside of the bulk and yet in the boundary theory that would correspond to some enormous smeared out thing. The mapping between the bulk theory and the boundary theory, in recent years people realized that it’s literally an example of one these quantum error-correcting codes that I told you about before. These are the same things that one would need in building a quantum computer. The whole point of an error-correcting code, is that you take like a local one bit and you smear it out. You represent it with a large number of bits. This is also what happens in a hologram. Hence the name, holographic principle. There’s this smeared-out representation of everything that’s happening in the interior, which is defined on the boundary. This provides the most precise definition that the string theorists are able to give of what they mean by quantum gravity. They’ll say, well what we really mean by quantum gravity is you define this theory on the boundary, which they more or less know how to do and then somehow there’s this thing in the bulk that’s dual to that. So again, different culture different standards.

Craig Cannon [00:40:43] – Yeah for sure.

Scott Aaronson [00:40:44] – Yeah they don’t even have a rigorous independent definition of this bulk theory. But, what they can do is in various special cases they can calculate things in the bulk theory. Then they can calculate the same thing in the boundary theory. In every single case where they can do the calculation in both places, they get the same answer. This is what leads them to say there must be this duality.

Craig Cannon [00:41:09] – They’re like “good enough.”

Scott Aaronson [00:41:09] – Yeah, good enough for them.

Craig Cannon [00:41:11] – Yeah, and so how does that relate to the firewall?

Scott Aaronson [00:41:13] – The firewall paradox is sort of like a modern refinement of Stephen Hawking’s original black hole information paradox from the 1970s.

Craig Cannon [00:41:29] – Like Hawking radiation?

Scott Aaronson [00:41:31] – Shortly after he discovered Hawking radiation in 1975, Hawking wrote a paper that posed the information paradox or puzzle of black holes. Which is basically just the question, how does information ever get out of a black hole? Why does it have to get out? Well, if we believe that quantum mechanics describes everything in the universe. Quantum mechanics, except possibly when a measurement is made, okay, if you believe in the Many-Worlds Interpretation…

Craig Cannon [00:42:07] – Something you observe you observe anything.

Scott Aaronson [00:42:08] – Yeah exactly. Well, you know, if you believe the Many-Worlds theory then even a measurement is just another ordinary physical process, but where you split into multiple branches. But let’s leave that aside. Any isolated physical system is supposed to evolve in a completely reversible way. It may be very hard to reverse in practice. You know, it’s a lot easier to scramble an egg than to unscramble it. But, in the view of physics since the 19th century that’s merely because our universe started in a very special state. Right, with a very low entropy. My friend Sean Carroll likes to say that every time you cook an egg you’re doing an experiment in cosmology. You’re proving that the Big Bang was in a special state. Right. But, in principle everything is supposed to be reversible. So in particular if I drop an encyclopedia into a black hole, then the information, what was written on the pages cannot be deleted from the universe. It has to still be there. Then the question is, well where does it go. You could say maybe when it hits the singularity it goes into some other bubble universe. People thought about that for a while. But, a popular point of view nowadays is that ultimately the information does come out. It comes out in the Hawking radiation. Which for a black hole that was the mass of our sun it would take a mere 10 to the 67 years for this to happen. Okay, but you know eventually it would.

Craig Cannon [00:43:51] – We got time.

Scott Aaronson [00:43:52] – Yeah, that’s right, you know, if you have a long enough grant you could wait, you could see this. Eventually it would come out. Of course in a very scrambled form. If I burn a book, physics tells us that the information is still there in the smoke and ash. It’s not very accessible anymore but in principle it’s still there. The idea is that a black hole is just another example of this. Okay, but there’s a big puzzle because the information, if you were floating next to the encyclopedia you would just see it go right past the event horizon of the black hole. You’d see it go all the way down into the singularity and then, you know … it doesn’t seem like it’s ever coming out. How does it get into the Hawking radiation in order to come out? This was such an acute puzzle that it forced people like Lenny Susskind and Gerard ‘t Hooft in the 1990s to this view called black hole complementarity. Which basically says that there are two different ways to look at the same situation. For an observer who’s outside the black hole or for an observer who is jumping into it with the encyclopedia. And the idea is, from the point of view of the first observer the information never even makes it past the event horizon. It just sort of gets pancaked.

Craig Cannon [00:45:22] – Right it’s like a fly hitting a windshield, it’s flat on the…

Scott Aaronson [00:45:24] – I mean first of all because of relativistic time dilation, you’re never going to see anything fall into the black hole. It’ll just get slower and slower as it nears it. And you’ll never actually see anything go in. The idea is from the outside observer’s point of view you could treat the interior of the black hole as not even existing at all. It’s just like some weird and different and scrambled way to rewrite what’s happening on the event horizon of the black hole. This is another example of one of these holographic dualities, where there’s two different ways to look at the same physical situation. There’s the interior point of view and there’s the point of view where it’s all on the event horizon. But then there are all sorts of puzzles about reconciling these two different points of view as you could imagine. The firewall paradox was a particular technical puzzle about how to reconcile these two different points of view. If we had another 20 minutes I could go into it but it might take too long. But in the meantime, the other thing people do is that they use this bulk/boundary correspondence as sort of laboratory. So they say, we have a spacetime where on the boundary we can sort of calculate what’s going on. Now let’s act inside of the bulk of that spacetime, let’s form a black hole. Now let’s try to answer all these enormous conceptual questions about what’s going on inside of a black hole by translating them into questions about what is happening

Scott Aaronson [00:47:05] – in the boundary theory. Now meaning the boundary of spacetime, not the boundary of the black hole.

Craig Cannon [00:47:10] – Right.

Scott Aaronson [00:47:11] – But that’s proven very difficult because in some sense what the theory wants is to just answer questions about what is observable by some hypothetical observer who’s far away from all the action. Who can just send in some particles that will all hit each other and then some other particles come out. This is a point of view that physicists like to take a lot. You know, all of existence is like a giant particle collider. You just smash some things into each other, then you look at the debris that comes out on the other end. But if you’re asking, what is the experience of someone who jumps into a black hole, then that’s inherently not that kind of a question. It’s not a question about the observer at infinity. It’s a question about, you know, someone who’s very much in the action.

Craig Cannon [00:48:04] – Yeah, yeah, yeah, like Alice sends Bob into black hole.

Scott Aaronson [00:48:07] – Exactly, and these boundary pictures, just, don’t seem very good yet at addressing that kind of question.

Craig Cannon [00:48:14] – Okay, so let’s move on to another unanswered question. You got a bunch of AI related questions from the internet. And it seems that people want you to opine about AGI. Let’s go with one them.

Scott Aaronson [00:48:30] – Yeah sure.

Craig Cannon [00:48:31] – Anagh asks, how can we channel AI growth but not weaponize it. In a sense, it seems like they’re assuming AGI happens. What do you think?

Scott Aaronson [00:48:44] – There will be many social issues that we’ll have to deal with around AI, or already are having to deal with even long before we reach the era when AI is near the level of human intelligence. I mean, you know, we’re obviously going to have to worry about self-driving cars and all the benefits and also the disruption and issues that those are going to bring. AI for data-mining and all of the implications that it has for privacy. Or, a deep net denies your loan application but then no human can explain why your application was turned down. These are things that lots of people are thinking about and the good thing is that we can try things out in the real world. We don’t normally think of ethics and morality as experimental sciences. But very often people have moral intuitions about something that are really bad until they’ve been checked by experience. And so we’re going to have to sort of, and we’ll have the opportunity to, refine our ethical intuitions about all these issues by seeing the ways that AI actually gets deployed. I don’t think I’m going to shock the world if I say I hope that we’ll find ways to use it for good and not for evil. I have many friends, especially here in the Bay Area, who I see every time I come to visit here, who are very very interested in what happens after that. When AI actually reaches the level of human intelligence or exceeds it. Clearly whenever that happens

Scott Aaronson [00:50:42] – then we’re living in a completely different kind of world. Think of like the woolly mammoths: once the hominids start making their spears and their bows and arrows, then life is not the same anymore. A lot of my friends in this community are very interested in the question, how can we ensure that once an AI gets created that is at a or beyond human level, that it sort of shares our values. That it doesn’t just say, “Okay my goal was to make paperclips so I’m going to just destroy the whole earth, because that’s more raw material for paperclips.” That it will say, “No, these humans created me, I should revere them as my great, although slightly dimwitted, ancestors. I should let them stay in a nice utopia, or something. Even while I go off and prove P is not equal to NP, or do whatever else interests me.” My point of view is, if civilization survives for long enough, eventually we’re going to have to deal with these kinds of questions. The human brain is the product of all these weird evolutionary pressures, including the width of the birth canal and how much food was available in the ancestral environment and all this stuff. There’s no reason to believe that we are near the limits of intelligence that are allowed by the laws of physics. Eventually, sure, it could be possible to produce beings that are much more intelligent than we are. We may have to eventually worry about that. Now, I have to confess, that personally

Scott Aaronson [00:52:43] – when I think about the future of civilization, let’s say, the next 20 years, the next 50 years. I tend to worry less about super-intelligence then I do about super-stupidity. I tend to worry about us killing ourselves off by catastrophic climate change, or by nuclear war. Or just that we’re all regressing into fascism, just giving up on liberal democracy. Of course, we’ve seen many distressing signs all over the world that there’s this kind of backsliding right now. I like to say that my biggest hope is that civilization should only last long enough that being destroyed by superintelligent robots becomes our biggest threat! Right, let that be our worst problem.

Craig Cannon [00:53:36] – Right, of course. It’s this silly mental game where it assumes we’ve learned nothing along the way and it just happens overnight.

Scott Aaronson [00:53:46] – Well look, I wouldn’t go that far. It’s good to have some people thinking about these things. Just like there should be people thinking about how could we prevent a catastrophic asteroid impact. Or how could we prevent bio-terror. They’ll probably discover various interesting things along the way that will have implication for the world of today. That usually happens when people let their minds roam freely over the far future. I’m happy to have people think about this. But as practice for solving the problem of AI alignment, let’s see if we can solve global warming first.

Craig Cannon [00:54:40] – Yeah we’ll see how that goes.

Scott Aaronson [00:54:43] – Yeah see how it goes.

Craig Cannon [00:54:43] – See how it goes man. Alright, let’s do another Twitter question. Micheal Burge asks, is anyone keeping track of the smallest N such that Busy Beaver of N is independent of ZF set theory. He mentions, I recall there was some activity after the 2016 article. I assume that was on your blog.

Scott Aaronson [00:55:04] – Yes it was.

Craig Cannon [00:55:04] – And I’m wondering if 1,919 states is still the record.

Scott Aaronson [00:55:09] – Let me back up and explain what he’s talking about. So the Busy Beaver numbers are, well they’re one of my favorite sequences of numbers since I was a teenager. The Nth Busy Beaver number, you can think of it as the largest finite number of things that could be done by any computer program that’s N bits long. We rule out programs that just go into an infinite loop. But we say, as long as your program has to eventually halt, then what is the most number of things that it could do before halting? If this program is say N bits long and it’s run a blank input. Of course this could depend on the programming language a bit, but let’s just take the original programming language, Turing machines, defined by Alan Turing in 1936. And so then the Nth Busy Beaver number is defined as the largest number of steps that can be taken by any Turing machine with N states before it halts. The amazing thing about this function is that it increases more rapidly than any function that could be calculated by any computer program. This is provable. It is a ridiculously quickly growing function. Like, the first four values of the Busy Beaver function are known. They’re like 1, 6, 21 and 107. The fifth one is already not known but it’s at least 47 million. Then the sixth is at least 10 to the 36,000 power, and the seventh you already need a stack of exponentials to express. If you’re ever in a contest to name the biggest number and you just say Busy Beaver of 100, if your opponent doesn’t know about computability theory

Scott Aaronson [00:57:15] – you will destroy them. Okay, but now another fascinating thing about this Busy Beaver sequence, besides the fact that it grows so rapidly, is that in some sense, it encodes all of mathematics. For example, if I wanted to know is the Riemann hypothesis true. Well there’s some Turing machine with some number of states that tests the Riemann hypothesis: that halts if and only if it finds a counterexample to it. Then if I knew Busy Beaver for that number of states, then I would just have to run the appropriate machine for that number of steps, see if it halts, and that would answer the Riemann hypothesis. It’s not as if it’s a huge surprise that this function grows uncomputably rapidly. Because it has, you know, so many secrets of the universe encoded into it. And furthermore, one can prove that the axioms of set theory can only determine finitely many values of this function. Beyond a certain point the standard rules of mathematics cannot even prove what are the values of the function. You know, it has some values, because every Turing machine either halts or it doesn’t halt. And yet in some sense, we can never know the values, beyond the first few. A few years ago I had a Masters student when I was still at MIT, named Adam Yedidia, and I gave him a thesis project to try to determine what is a concrete bound on the number of states where this Busy Beaver function just goes off the cliff into unknowablity. We may not be able to determine exactly where it happens. But at least we can say does it happen by at most

Scott Aaronson [00:59:16] – ten thousand states or by at most a hundred thousand states. What Adam did is that he designed a Turing machine with about 8000 states that does something that’s equivalent to just trying out all the possible theorems of set theory one after the other and halting if it ever finds a contradiction. What does that mean? Well, because of Godel’s incompleteness theorem, it means that set theory can never prove that this machine runs forever. If set theory is consistent then the machine does run forever. But if set theory was able to prove that, then set theory would be proving its own consistency. That’s a no-no. That’s exactly what Godel’s second incompleteness theorem says it can’t do without being inconsistent.

Craig Cannon [01:00:13] – I think I got it.

Scott Aaronson [01:00:15] – The way to remember it is anyone who brags about themselves probably has nothing to brag about. If your theory is bragging about its own consistency, it means it’s inconsistent. A theory could believe it’s inconsistent while being consistent, that’s possible, but not the other way around.

Craig Cannon [01:00:36] – But not the other way around, okay.

Scott Aaronson [01:00:37] – It can’t believe it’s consistent. He designed an 8000 state machine. There was a lot of software engineering that went in. You had to compile a program down to Turing machine, and keep very careful control over the number of states. Adam and I wrote a paper about this that I put I up on my blog. And then what’s cool is that a lot of hobbyists were able to look at this say, you know, maybe they could improve on it. In particular there’s guy Stefan O’Rear. And he got it down to a less than 2000-state machine. And I believe that most recently he’s gotten it down to under 800 states.

Craig Cannon [01:01:18] – Wow.

Scott Aaronson [01:01:19] – In any case, he hasn’t written a paper about it, but all of his code is available on GitHub if anyone wants to look at it, or even try to improve over what he did. I suspect that there may even be a machine with ten states, that would already exceed the ability of set theory to know what it does.

Craig Cannon [01:01:38] – Why do you suspect that?

Scott Aaronson [01:01:39] – Already with five states there are machines that whose behavior seems to hinge on some weird number theory and no one has yet understood them. And we know how rapidly this Busy Beaver function grows. The truth is that we don’t know. It’s somewhere between five and 800 or so that this thing goes off the cliff.

Craig Cannon [01:02:03] – Cool, I actually do have a question about your blog. Grom what I can tell you’re basically inactive on social media.

Scott Aaronson [01:02:11] – Well, I do not have a Twitter account. That is not an accident.

Craig Cannon [01:02:15] – Okay, yeah, that’s what I figured. Despite that, or in spite of that, you’ve been blogging for 10, 15 years.

Scott Aaronson [01:02:23] – Since 2005. And I guest blogged on some other blogs before that. But, I mean, blogs used to be considered social media.

Craig Cannon [01:02:33] – That’s true.

Scott Aaronson [01:02:34] – Yeah, I feel like a dinosaur. Back in my day we just had blogs and we liked it. My friend Sarah Constantin had a post, I thought a very insightful post about this recently. Where she was making the point that blogs are very much in keeping with the original promise of the internet. The original idea that it was going to be a space where people would discuss things. Where they could spell out an argument by composing some paragraphs of text. That they’d set out what they think and why. Take responsibility for what they said. Put a name to it. Other people would then respond to it, give counterarguments. It’s very much a continuation of the culture of say, Usenet in the 80s and 90s. Since then we seem to have moved away from that toward a model of communication on the Internet that’s a lot more like what offline communication used to be. I’ve described Twitter as sort of the world’s biggest high school. That doesn’t mean it’s all bad, in fact I have wonderful friends who use Twitter to do very worthy and great things. I like to tell them that they’re sort of like, they bear the same relationship to Twitter as the 10 righteous men would’ve bore to Sodom and Gomorrah. But unfortunately it’s not a medium that’s designed for spelling out an argument. Or for explaining carefully where you’re coming from. It is almost designed for ganging up on people. For forming these kind of outrage mobs.

Scott Aaronson [01:04:38] – Which indeed we see that it is susceptible to these repeated outrage explosions. I’m not blaming one political side. I think we can see plenty of examples on both ends of the political spectrum of Twitter being used for what I think of as really nasty purposes. Likewise with Tumblr and Instagram. It’s not always nastiness. But they’re just sort of, they’re designed for a sharing a photo and people click ‘like’ on it. It’s a lot of social signaling, it’s a lot of building up one’s popularity, one’s presence. Not discourse. They’re not really designed for carefully clarifying, well what is it that we really disagree about, where are we coming from, and that’s really what interests me. I don’t always succeed. But that’s kind of what I try to do on my blog. The problem comes when we try to have that kind of conversation on the blog, a careful conversation where anyone is welcomed to contribute, but they have to play by the ground rules of, have some empathy, try to understand where other people are coming from. When if people come into that from the culture of outrage mobs where they say, let’s just look for the most inflammatory sentence ripped out of context, that we can just put all over Twitter to say, look at these blithering idiots. Then it really becomes scary. It becomes much harder to have that kind of discourse where you’re trying to understand the other side.

Craig Cannon [01:06:32] – You’ve been a victim of this before.

Scott Aaronson [01:06:37] – You could say so.

Craig Cannon [01:06:38] – Yeah for whatever.

Scott Aaronson [01:06:39] – I mean a lot of people who try to do this have also been.

Craig Cannon [01:06:42] – For sure.

Scott Aaronson [01:06:43] – You know, in fact, a lot of people have had it much worse than I have.

Craig Cannon [01:06:46] – Yeah absolutely. Were you on Twitter at one point and these…

Scott Aaronson [01:06:50] – No. No, it just never really tempted me.

Craig Cannon [01:06:54] – Interesting.

Scott Aaronson [01:06:55] – I mean, if I have something to say, I mean, sometimes I just put little updates on the ends of my blog posts that are kind of like tweets.

Craig Cannon [01:07:05] – Do you have a favorite post?

Scott Aaronson [01:07:08] – I had these posts critiquing integrated information theory. Which is a proposed theory of consciousness by people like Giulio Tononi. I was explaining why I don’t think this theory of consciousness works, why it doesn’t solve the problem it’s supposed to solve. But, what was great about this post was that Tononi himself, got involved in the discussion. David Chalmers, the philosopher of consciousness, got involved in the comments section. And so we kind of had this Plato’s academy going. Just in my blog comments section. I feel like we were actually able to make progress on this major issue. That’s not all the time. Sometimes I write a post that’s just some stupid joke or procrastination. But sometimes when I have something that I want to get out there it’s nice to have a forum.

Craig Cannon [01:08:21] – Yeah, that’s great. You suggested this question so I’m might as well ask you. Advice for young people. You kind of are all across the world. You’re potentially licensing ideas to companies but you’re within academia. And you’re also kind of a CS science communicator, so you’re across many realms. What is your advice for nerds in general or people who want careers in science.

Scott Aaronson [01:08:52] – Well, first of all if you are currently in high school. I hope you’re having a good experience. If you are that’s awesome, then take advantage of it. If you’re not, realize that things will get better. Because this is a Y Combinator podcast, I should mention that one of the most influential essays that I ever read was Paul Graham’s “Why Nerds are Unpopular.” It has an enormous amount of insight.

Craig Cannon [01:09:33] – That’s the beginning of hackers and painters.

Scott Aaronson [01:09:35] – Yes, yes. Buy his book, but if you don’t want to buy it he’s also got it on his website. The basic argument that he develops there is that teenager-hood is sort of a creation of the modern world. It used to be that once people would pass through puberty, either they would go off and get married, or they would apprentice themselves to some craftsman or maybe they’d be working in the fields or whatever. But in any case they would not be in this environment of high school. Which is sort of an artificial environment that we’ve created because we don’t know what else to do with these people. Maybe there’s some teaching of them that goes on. Although if you look at how much knowledge the average high school graduate possesses it can’t have been that much.

Craig Cannon [01:10:35] – No, retaining not so much.

Scott Aaronson [01:10:37] – That’s right that’s right. What you do get a lot of is popularity contests that can be sort of based on nothing. And yet if you want to do well in it, then you sort of have to devote almost all of your time to this.

Craig Cannon [01:10:53] – Right, and so that’s the core of the essay.

Scott Aaronson [01:10:55] – Yes, right, right. A nerd in Graham’s telling is someone who is in the environment but who’s already thinking about the issues that matter in the wider world.

Craig Cannon [01:11:05] – Yeah, and he says basically like, they care more about being smart than being popular.

Scott Aaronson [01:11:10] – Yeah, and he says like, I get that it’s very hard to accept that that is your priority. Because it seems like you would give anything. You would even accept a lowering of 30 IQ points or something just to not be in the situation that you’re in. Except, if someone actually gave you that choice would you actually take it? But realize that there’s a wider world of people who are going to appreciate things that really matter. You can try to get to that world sooner depending on your circumstances. I actually left high school early. I got a GED from New York State when I was 15 and I went to a place called Clarkson School, in upstate New York. Which is a program for high school students who can take courses at Clarkson University and then apply to college from there. Almost every college rejected me – I’d had kind of a bizarre trajectory – but Cornell was nice enough to accept me.

Craig Cannon [01:12:30] – How old were you?

Scott Aaronson [01:12:31] – Oh, I was 16 when I started there.

Craig Cannon [01:12:32] – You were 16 when you started at Cornell.

Scott Aaronson [01:12:34] – Yeah, and since I already had one year, then I spent three years at Cornell. Then I went to Berkeley for grad school. I was lucky to be able to leave high school a little bit earlier. My parents supported me, once it became clear that this was what I wanted to do. They warned me, well, look this is going to make your social life really, really difficult. Which turned out to be 100% true. But, I remember telling them at the time, look, my social life already stinks! I was lucky to have some a few very very good friends in high school. Some of them are still my wonderful friends. But, it was only after I’d been a post-doc for a while that I started finally figuring out how to drive a car, how to ask someone on a date.

Craig Cannon [01:13:38] – Oh man.

Scott Aaronson [01:13:39] – I sort of did things in a weird order.

Craig Cannon [01:13:41] – When you wrote about, you’ve written about depression a little bit on your blog somewhat. Was that during this period or was that after?

Scott Aaronson [01:13:50] – Yeah, it was pretty much during this period. But, even starting before I had skipped any grades. That’s the thing. I felt like I was already in such a constricted environment that at least I could be learning CS and math. At least I could be in an environment where people cared about the intellectual things that I cared about. But really, once you get into that environment, you’re not the only one. Eventually you’ll be able to … the great thing about the modern world is that people can sort themselves. You can find a group of friends who will care about the things that you care about.

Craig Cannon [01:14:35] – Yeah, so in other words put yourself out there and try things. Yeah, cool. All right Scott, well thank you so much for coming in.

Scott Aaronson [01:14:44] – Yeah of course, thank you.

Craig Cannon [01:14:46] – All right thanks for listening. as always you can find the transcript and the video at if you have as second it would be awesome to give us a rating and review where ever you find your podcasts. See ya next time.


  • Y Combinator

    Y Combinator created a new model for funding early stage startups. Twice a year we invest a small amount of money ($150k) in a large number of startups (recently 200). The startups move to Silicon