Hello reddit! We’re Greg Corrado and Chris Olah, deep learning researchers at Google. Deep learning can help software understand your voice, understand images, translate between languages, and all sorts of other things! Our friends Nat & Lo made an “explain like I’m five”-inspired video about our field for you!


More about us:

Greg is a senior research scientist at Google and was one of the co-founders of Google’s deep learning research group, along with Jeff Dean and Andrew Ng. Before entering industry research, Greg did his PhD in neuroscience at Stanford.

Chris is a wandering machine learning researcher, presently hosted by Google. He contributed to DeepDream, producing the class visualizations. He writes a blog about neural networks (http://colah.github.io/).

Ask us anything!

Proof: http://googleresearch.blogspot.com/2015/09/a-beginners-guide-to-deep-neural.html http://imgur.com/a/2A1sY

(colah is Chris' personal account, but Greg will be posting under it in this thread as well.)

Hi all! Thanks for all your great questions. We're feeling a bit burnt out -- it's Friday evening -- and are going to sign off soon. Chris may log in over the weekend to answer a few more questions.

Comments: 198 • Responses: 33

ryanocerous12331 karma

What do you think is the most exciting/useful application for Deep Learning technology? Do you think it's one step closer to achieving AI?

colah35 karma

Greg: Deep learning is tearing through “machine perception” applications -- those tasks that involve seeing, hearing, and understanding sensory input. I suspect the next big opportunities are in the area of understanding sequences, relationships, concepts, language and possibly even programs + algorithms themselves. As far as achieving AI, what we have to date might be one baby step closer to AI, but what deep networks are able to do today is still remedial by any meaningful measure of intelligence.

bellerophonvschimere18 karma

Which language do you use to code your deep learning algorithms? C++, Python, Go?

colah39 karma

Greg: The core libraries we use at Google are written in C++ for speed, but we often use Python as a convenient configuration language for constructing and training networks. Look for more about our tools in the coming months.

xoher12311 karma

Can you elaborate on the intersection of Natural Language Processing and Deep Learning?

colah20 karma

Chris: Great question! I wrote an entire blog post on it! http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/

Kaixhin11 karma

Growing computational resources are often cited as a major reason for the resurgence of neural networks. Google has created cutting-edge models in supervised learning (Going Deeper with Convolutions), unsupervised learning (Building High-level Features Using Large Scale Unsupervised Learning) and reinforcement learning (Massively Parallel Methods for Deep Reinforcement Learning) by utilising their vast resources. In what ways/areas do you think small research labs can be competitive when it comes to furthering deep learning research?

colah9 karma

Greg: I happen to do my research from within Google (and I have to admit that having the compute resources is reeeally nice) but I feel it’s important for Google to help smaller labs whenever possible. Google (and other companies) fund a lot of this research outside their four walls. Our team also collaborates with people in the academic community, hosts visiting academics, and participates openly in the external research community. I personally am also very supportive of the idea of more open source tools in machine learning.

Chris: I’d like to add that smaller labs continue to publish lots of lovely, interesting, innovative papers.

_korbendallas_10 karma

Has there been any output that surprised you? (other than bugs, heh, heh)

colah21 karma

Chris: I was really really surprised by DeepDream. I think everyone was.


When I saw the “dream” images Alex made, I was shocked that they were produced by a neural network, let alone by such a simple procedure. (Naively, it seems like the procedure Alex does should just cause the image to explode.)

I was also really surprised when my own experiments with visualizing “what does a neural network think X looks like?” started producing unexpected additions to the object. Barbells have muscular arms lifting them. Balance beams have legs sprouting off the top. Lemons have knives cutting through them. In retrospect, it isn’t that surprising that the networks learned strange understandings of what these objects are, since they only have example images to learn from, but… still very surprising at the time.

jellyberg3 karma

Those dream images are absolutely amazing - almost creepy. It'd be cool if users could generate them on a website (or just view more examples, if generating one for each page load would be too intensive). Maybe a "daily neural network dreams" site? I'd love to grab them as wallpapers!

Have you got any other plans to improve Google products and services with neural networks? Or any personal pipe dreams you want to share with us?

Thanks for running such a high quality AMA, shame it hasn't yet got the attention you guys deserve.

colah10 karma

It'd be cool if users could generate them on a website

You can! We open sourced the code, and then lots of people made web interfaces for it.

There's a subreddit, r/deepdream/, dedicated to this stuff. People have made some really beautiful images and videos. One video I really like: https://www.youtube.com/watch?v=DgPaCWJL7XI

tr4s10 karma

How would you estimate my chances of getting a job related to deepnets with just a CS degree, a little professional machine learning experience and no academic connections?

colah16 karma

Greg: I’d suggest building something -- a demo, a tool, something cool -- to show that you have the technical knowledge and creativity to do the work.

wintersolitude9 karma

Hi guys. Can you say something about how you tune the hyperparameters of your deep learning models?

edit: And one more: what do you think will be the next big thing in machine learning?

colah11 karma

Greg: Tuning hyperparameters is a pain in many machine learning systems, and deep learning is no exception. We use a combination of intuition, experience, good luck, grid search, automated tuning, and espresso.

As for the next big thing, see our answers to the “What do you think is the most exciting/useful application” and “Where do you think the biggest challenges are going to be” questions.

Chris: Having lots of computers makes tuning hyperparameters slightly less painful, though.
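To make the grid-search part of that mix concrete, here's a toy sketch (not Google's actual tooling; `val_loss` is a hypothetical stand-in for "train a model with these hyperparameters and report validation loss"):

```python
from itertools import product

# Hypothetical stand-in for a full training run: returns validation loss
# for a given learning rate and batch size.
def val_loss(lr, batch_size):
    return (lr - 0.01) ** 2 + 0.001 * abs(batch_size - 64)

# Grid search: evaluate every combination of candidate values, keep the best.
grid = list(product([0.001, 0.01, 0.1], [32, 64, 128]))
best = min(grid, key=lambda params: val_loss(*params))
print(best)  # (0.01, 64)
```

In practice each `val_loss` call is an expensive training run, which is why having lots of computers (to evaluate grid points in parallel) makes this less painful.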

simonkamronn8 karma

Supervised learning has dominated deep learning but most real world applications consist mostly of unlabelled data from sensory input (i.e. temporal). How do you see deep learning will evolve, if it will, to meet that challenge?

colah15 karma

Greg: Great question! There will always be more unlabeled data than labeled data. What to do?

Back in 2012 we did our fun “cat paper,” which was largely about scaling deep learning across lots of computers, but also about unsupervised learning. We wanted to know if a neural net could discover the concept of cats (among many other things) just from exposure to lots of frames of YouTube videos, without having been trained on any images labeled “cat.” This was an open scientific question, and it turned out to work better than we’d expected.

That said, most real world applications we’ve come across run best on labeled data -- supervised training. We’re still really interested in unsupervised learning, but it’s hard to make work well in practice. In the long run, machine learning will probably end up with some mixture where you have occasionally labeled data mixed in with lots more unlabeled data, and the system handles both -- that’s pretty similar to how people learn, where, say, a toddler sees a bunch of schoolbuses, and only occasionally will an adult say, “Look, a schoolbus!” and the kid has to handle the generalization on his/her own.

xoher1237 karma

How should a beginner get started on Neural Networks?

colah13 karma

Greg: I really like Yoshua et al’s forthcoming book: http://www.iro.umontreal.ca/~bengioy/dlbook/

Chris: Yoshua/Aaron/Ian’s book is great. I’d also mention Michael Nielsen’s excellent book (http://neuralnetworksanddeeplearning.com/) -- it’s less comprehensive than Yoshua’s, but more approachable. Also, at the risk of being a bit self-promotional, my blog is used as supplementary reading in grad courses on deep learning: http://colah.github.io/

veril3587 karma


colah21 karma

Greg: Their potential? About –70 mV. (Sorry, I’m an ex-neuroscientist, so I can’t resist the pun.)

But in all seriousness, the human brain is obviously an amazing learning machine, and it also happens to be a spiking neural network. That said, I used to do research in artificial spiking neural networks and didn’t have a lot of success. I think the problem is that we haven’t yet worked out the math to be able to do ML effectively on spiking models. Ultimately, most ML today comes down to calculus and calculating derivatives... and as spiking models aren’t generally differentiable, our usual techniques don’t work. We also don’t know for sure why it is that the human brain uses spikes. Is it for energetic reasons, is it for better information transmission along axons, is it for some deep computational reason? There are a lot of great hypotheses, but it’s an undecided question in neuroscience and so a hard thing to build an engineered ML system around.

BigxMac7 karma

What math prerequisites would you say are recommended for anyone wanting to start getting into machine learning eventually?

colah12 karma

Greg: The core mathematics you need to grok to do ML are basic linear algebra and basic multivariate calculus -- like the first third of any college textbook on each topic. That’s what you’d want to know to implement and use modern ML. Researchers in the area (like Chris or I) tend to know a lot more math, but it’s hardly required. (Oh, and Chris knows way more math than I do.)

bellerophonvschimere2 karma

"Researchers in the area (like Chris or I) tend to know a lot more math"

Could you give us more details please?

Kaixhin4 karma

Probability, statistics, and optimisation for starters.

colah10 karma

Chris: I think there’s also a lot to be said for people having a diverse background.

I’ve found connections between neural networks and all sorts of things: abstract algebra, topology, type theory, etc… (It’s pretty unclear to me how deep or important these connections are, but there’s something.) My point isn’t that every machine learning researcher should learn these in particular. Instead, I think that it’s really valuable for different people to know different things, so they can bring different tools to bear.

A mentor of mine observed: Feynman didn’t know complex analysis for part of his career, which was the approach everyone used to solve integrals. Instead, he knew these fractional calculus tricks. Apparently Feynman attributed some of his results to the fact that he was trying to solve problems with very different tools than everyone else was using. [Sorry this anecdote is so vague. I only know it second-hand, and I have no background in physics.]

Greg: I was educated in physics, and learned to apply math as a descriptive language. I really enjoyed that, and I think it’s helped me in the long run. I’d say probability theory and basic differential equations are pretty important, but also go for applied texts on statistical pattern recognition or ML books that teach you math along the way. Consider the books by Kevin Murphy or Chris Bishop. I happen to also know a bit of differential geometry, and I can say without question that that’s not helpful. ;)

Chris: I don’t know anywhere in ML that differential geometry is useful, but I wouldn’t be confident there isn’t such a place.

Greg: Dude, if there’s ever going to be a quantum gravity description of deep learning, it will totally come in handy.

kil0khan2 karma

Greg do you have any tips for a fresh physics PhD looking to get into machine learning/AI research?

colah8 karma

Greg: With a Physics PhD, you might be able to get picked up by a smaller startup who needs to bolster their ML staff. Regardless, I’d suggest getting into some of the books recommended here, brushing up on your Python or similar programming languages, and maybe trying out some of the open source ML libraries that are out there.

Christian [ex-physicist colleague]: I think it really depends on what you worked on for your degree - without some solid coding chops it may be very difficult to land a ML job right out of the gate. For example, someone who has extensive experience with simulation and probability and dealing with very complex data sets may be able to extend their skillset towards ML, while an experimental physicist (or observational astrophysicist) will find it to be more of a challenge.

bellerophonvschimere7 karma

Will Deep learning still be relevant with quantum computing ? ie qubit instead of bit

colah16 karma

Greg: I was in nuclear physics once upon a time, so I know a few things. Quantum computing is still very much in its infancy, and personally I think it will be decades before it’s a deployable technology. As for whether deep learning would still be relevant were QC to succeed: definitely. There are entire classes of deep models (e.g. deep Boltzmann machines) that would love to compute probabilistically via qubits rather than via sampling, as they do now.

euFalaHoje4 karma


I have always seen Google as one of the dreamy places to work at. Also, the work that I do is very related to Machine Learning and Deep Learning. I'll be applying!

As for the question: I noticed that Google Photos app is very smart in recognizing description of the photos. For example, I can search "kid with a dog" and it will show such a pic if it is in my library. Did you use Deep Learning to teach this to the app? How was it trained?

colah8 karma

Greg: Yes indeed. Our team helped develop “Inception,” a much deeper neural net that won an image recognition contest in late 2014. We baked a version of that into the Google Photos launch in spring 2015. One thing that’s fun about working at Google is your research can make its way into a product in a really short turnaround.

lk054 karma

One criticism of deep learning is that by using massive amounts of data, the network has effectively memorized all possible inputs. How do you counter that? Have you found ways to gain insight from the models you learn?

colah8 karma

Chris: Ha! I wish deep nets could memorize all possible inputs. Sadly, there’s a problem called the curse of dimensionality. For high-dimensional inputs, there’s such a vast, exponentially large space of possible inputs, that you’ll never be able to fill it.

Worse, for problems like vision, distance between data points says very little about whether they’re similar, so nearest neighbor-like things don’t work very well. The power of neural networks is that they can bend the space of data so that nearby points are similar, allowing us to generalize better between data points.

That said, if we train large neural networks on small data sets, we can run into problems with them memorizing their inputs. This is called “overfitting” and causes them to behave poorly on test data.
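To get a feel for why the curse of dimensionality makes "memorizing all possible inputs" hopeless, here's a tiny illustrative calculation (my own numbers, not from the thread): just covering the input space with a coarse grid requires exponentially many cells.

```python
# Cells needed to cover the unit cube [0, 1]^dim at a coarse resolution
# of `per_axis` bins along each axis.
def cells_needed(dim, per_axis=10):
    return per_axis ** dim

for dim in (1, 2, 10, 100):
    print(dim, cells_needed(dim))
# At 100 dimensions you'd need 10**100 cells -- vastly more cells than
# you could ever have training examples, so lookup-table memorization
# can't work; the model has to generalize.
```

Even a small 32x32 RGB image lives in a 3,072-dimensional space, so real vision inputs are far beyond any table.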

xoher1234 karma

Is it possible to intern at the Google Deep Learning Research Group?

colah14 karma

Chris: We love interns! I’ve done three internships on Google’s main deep learning research team in Mountain View. People also do internships on other ML research teams.

Some of the world’s best ML groups are at Google, so it’s very exciting to intern here. For certain teams, most interns are grad students who do research in the same area, but there are other teams where undergrad interns are common.

MyNatureIsMe3 karma

NNs are all about finding structure in data. Categories (in Category theory) are all about finding structure in mathematical concepts. I've already seen this blog post: http://colah.github.io/posts/2015-09-NN-Types-FP/ which shows off nice ideas of what might once be. Are there active projects going this path yet? - That is, ones trying to implement a Neural Network in Category Theory? You could then possibly even do weird stuff like "2-NN"s which would be "neural networks of neural networks". I have no idea what those would be but I'd guess they could basically reason about neural network architectures, finding ways in which those are interrelated and also being able to solve for/find "optimal" NNs for a given problem for a suitable definition of optimality.

colah3 karma

Chris: I was the author of that post, so it’s something I think about. One of many things.

I have actually tried to implement a “2-NN” by inputting and outputting weights. I got some very simple things to work, but haven’t tried hard. However, it isn’t really the right thing, because you’re tied to a particular architecture of neural network for your input and output. I have some vague ideas for better approaches, but nothing concrete.

Take a look at our comment on dynamic neural networks. I think it’s connected.

iguar3 karma

Where do you think the biggest challenges are going to be in the next few years in the development and applications of Deep Learning? Is it the algorithms or the hardware that runs them?

colah17 karma

Greg: Right now most deep neural networks compute in a very static way, applying the same mathematical operations in the same order for every input -- regardless of how easy or difficult the problem. I think this is a huge shortcoming, and therefore a huge opportunity for improvement. We are just beginning to design networks that respond dynamically to inputs: recurrent networks, attention networks, memory networks, neural Turing machines. My hope is that this will be a radical new direction in deep learning and machine learning more generally. So, clearly there’s a lot of algorithmic work to do, but if we’ve learned anything about neural nets in the last ten years, it’s that you’d better have fast compute infrastructure or you might as well go home. ;)

Chris: I’m also very excited about dynamic neural networks! Relatedly, I’m really excited about the idea of having more “structured” representations. Right now, vectors are kind of the lingua franca of neural networks. Convolutional neural nets pass tensors, though, not just vectors. And recurrent neural nets (RNNs) pass lists of vectors. You can think of these as big vectors with metadata. That makes me wonder what other kinds of metadata we can add…

I’d also note that these are what Greg and I are personally excited about. There’s vast open territory around us, and lots of researchers are excited about different directions!

voladoddi3 karma

Basic questions (for someone who works with NNs day in and day out) -

1) When you say "tune the parameters" of a deep NN, what do you mean? More specifically, what parameters are you referring to? Isn't a NN a kind of black box?

2) How do you decide which parameters need to be tuned? For example, with statistical tests?

colah9 karma

Greg: It’s easy to get tangled up in terminology here! Neural networks (and other machine learning systems) have internal parameters which are modified through the course of learning, and capture everything that the system has learned so far. These “parameters” number in the millions or billions and we don’t fiddle with them manually at all.

What we tune is a much smaller set of numbers called “hyper-parameters” or “meta-parameters” which are knobs that control the behavior of the learning algorithm. For example, the classic hyper-parameter is the so-called “learning rate.” The learning rate is a scale factor that controls how big a step we take when modifying the internal state of the neural network after looking at a piece of data. If we choose a learning rate that’s too small, the network just kinda sits there. If we choose one that’s too large, the internal state just bounces around and eventually diverges (goes off into infinity). So we have to find a learning rate that’s not too fast but not too slow, hence the term “tuning.” It can be a real annoyance, and there aren’t great statistical methods to guide us.
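The too-small / too-large behavior is easy to see even on a toy problem. Here's a minimal sketch (an illustrative one-parameter example, not a real network) that runs plain gradient descent on f(w) = w² with three different learning rates:

```python
def gradient_descent(lr, steps=100):
    """Minimize f(w) = w**2 starting from w = 1.0; return final |w|."""
    w = 1.0
    for _ in range(steps):
        grad = 2 * w       # derivative of w**2
        w -= lr * grad     # step size is scaled by the learning rate
    return abs(w)

print(gradient_descent(0.0001))  # too small: w barely moves from 1.0
print(gradient_descent(0.1))     # about right: w converges toward 0
print(gradient_descent(1.5))     # too large: each step overshoots and w diverges
```

With lr = 1.5 each update multiplies w by (1 - 2·1.5) = -2, so the iterate doubles in magnitude every step: exactly the "bounces around and diverges" failure mode.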

slickric3 karma

Hi Chris, big fan of your blog! I was wondering when you plan on releasing the followup post to Groups and Group convolutions?

How does researching at Google compare with research in a university?

colah6 karma

Chris: Hi -- I’m glad you like my blog!

I don’t have any plans for a follow-up to the group convolutions post in the near future. I have a bunch of stuff I want to write, but my blog posts typically take 50-200 hours to write. They’re a big investment. And Gens & Domingos’ Deep Symmetry Networks paper beat me to the main idea I was building up to, although I think I still have a few things to add.

I’ll leave comparing university and Google to someone who’s spent more time at a university. :P

Greg: Well, as it took me 8 years to complete my PhD, I guess I’m qualified to answer that one. ;) Research in industry and academia is definitely different in a number of ways. I think academia encourages a researcher to take a stand or argue for a particular hypothesis… defending it, advocating for it. In industry, I tend to feel more part of a team that’s trying to achieve a particular objective all together. In industry, we care less about making an argument, and more about what works. Both approaches clearly have their place. It’s hard to make long term progress on basic science without the academic approach, but it’s hard to build an awesome search engine without the industrial approach.

bellerophonvschimere3 karma

How did you meet Nat and Lo ?

colah4 karma

Lo: We wanted to know more about machine learning, so started searching around and seeing who we might talk with. I found Greg’s name via the Google Research blog, he co-authored a paper about YouTube and cats (http://research.google.com/archive/unsupervised_icml2012.html) so naturally that caught my eye. And Nat and I contacted Chris after reading the DeepDream blog post earlier this summer (http://goo.gl/7fn5rG) and found out he worked with Greg.

jellyberg6 karma

What do you do in the 80% of the time you're not doing your 20% project?

colah2 karma

Nat: We both work in a group at Google called the Creative Lab. It’s a mix of writers, designers, coders, filmmakers, producers, and other people that make videos, websites, apps, experiments, and other stuff. (Nat comes from a writing background. Lo, a production background.) We’ve worked on projects like Maker Camp (https://www.youtube.com/watch?v=Lcd0Pv2eCgk) and this documentary about voice search (https://www.youtube.com/watch?v=yxxRAHVtafI) together. Lo is also really passionate about libraries and made this film: https://www.youtube.com/watch?v=tWbgQLjXPIk. I also worked on a few Android animations like this one: https://www.youtube.com/watch?v=rDPopoBL698.

This project started as a 20% project, but (at least for the moment) we are getting to spend a lot more of our time on it thanks to all the people that are watching. It’s really fun for us to do it, so thank you!

Itakecookies3 karma

As a high-schooler learning coding and highly interested in machine-learning and the work you guys have been doing, what can I do to prepare/help me on the way? Thanks!

colah8 karma

Chris: I think generally learning about math and CS is a really good idea for you.

When I was in high school, I benefited a lot from interacting with people at the University of Toronto (though not ML -- I was more interested in pure math back then).

Do you have any local researchers you could reach out to? I think a lot of people are really happy to talk to an excited high school student. This is especially true if you: (1) try to read one of their papers before meeting them, and (2) reach out to a post-doc or a senior PhD student -- unlike a famous professor, their time isn't highly in demand, and they probably have almost as deep an understanding of the subject.

Google_Your_Question3 karma

To what extent does your work contribute to the ongoing development of Google Now, and are there any plans to increase the integration moving forward?

colah4 karma

Greg: If you mean the Google app (Google Now is the predictive technology within it), there’s a lot of deep learning research that’s improved the speech recognition in the app -- our speech team in NYC just posted about how some new techniques they’ve developed in deep learning made it faster and more accurate: http://googleresearch.blogspot.com/2015/09/google-voice-search-faster-and-more.html

On the predictive side, we’re in the super-early stages of applying deep learning to Google Now. Exciting stuff, but still a bit early to say more.

wizard003 karma

Is it necessary to have a grad or PhD degree to join Google's deep learning research group? It feels like most people in the machine learning industry have a degree higher than a bachelor's.

(I am an undergraduate student who is very interested in machine learning and computer vision, and I was wondering if I need to pursue a higher education in order to get into the machine learning industry.)

colah8 karma

Chris: Well, I don’t have a PhD. In fact, I don’t even have an undergrad degree. I’m a weird case, but it’s certainly possible. I think the most important thing is your understanding of machine learning.

Greg: Chris is in fact weird. About 95% of the people on our team have PhDs in one thing or another… some CS, some math; mine’s neuroscience. I do think going to school, maybe even just for a master’s, really helps a lot, adding both depth and breadth to your skill set.

Oddity_Odyssey2 karma

  1. Does Google have nice food?
  2. Does Google have a hammock club?

colah3 karma

There are some hammocks next to the batting cage. (Seriously.) We also have “conference bikes,” these crazy octopus-looking 8-seaters that teams can ride while doing meetings together. Hmm, perhaps an idea for 2016: “conference hammocks.”

Stereogravy2 karma

Can you translate into Klingon?

colah24 karma

Ghu'vam pagh wIlo'bogh 'ach pagh batlh wIghaj! (translated by Bing)

lecherous_hump1 karma

Serious question: assuming we survive long enough for the singularity to happen, how long do you think it will take?

colah9 karma

Chris: Predicting the future is really really hard. Generally, I prefer to stick to easy problems, like trying to understand how neural networks transform 10,000 dimensional vector spaces.

I often hear people -- especially people who aren’t machine learning researchers -- make bold, confident predictions about the future. I have no idea how one could be so confident, so I’m pretty dubious about these predictions.

HMPoweredMan1 karma


colah9 karma

Greg: Meow!

Chris: Ribbet!