Word2Vec with Adrian Colyer

Machines understand the world through mathematical representations. In order to train a machine learning model, we need to describe everything in terms of numbers.  Images, words, and sounds are too abstract for a computer. But a series of numbers is a representation that we can all agree on, whether we are a computer or a human.

In recent shows, we have explored how to train machine learning models to understand images and video. Today, we explore words. You might be thinking: “Isn’t a word easy to understand? Can’t you just take the dictionary definition?” A dictionary definition does not capture the richness of a word. Dictionaries do not give you a way to measure similarity between one word and all other words in a given language.

Word2vec is a system for defining words in terms of the words that appear close to that word. For example, the sentence “Howard is sitting in a Starbucks cafe drinking a cup of coffee” gives an obvious indication that the words “cafe,” “cup,” and “coffee” are all related. With enough sentences like that, we can start to understand the entire language.
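To make the idea of "words that appear close to that word" concrete, here is a minimal Python sketch that collects (center word, nearby word) pairs from the example sentence. It only illustrates the context-window step; it does not train any vectors. The function name `context_pairs`, the window size of 2, and the lowercase whitespace tokenization are assumptions chosen for illustration, not details from the episode.

```python
from collections import Counter


def context_pairs(tokens, window=2):
    """Return (center, context) pairs for words within `window` positions of each other.

    `window` is a hypothetical parameter for this sketch: a larger value
    treats words that are farther apart as related.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


sentence = "Howard is sitting in a Starbucks cafe drinking a cup of coffee"
tokens = sentence.lower().split()  # naive tokenization, assumed for simplicity

pairs = context_pairs(tokens, window=2)
counts = Counter(pairs)  # over many sentences, these counts show which words co-occur

for center, context in pairs:
    if center == "cup":
        print(center, "->", context)
```

With a window of 2, "cup" is paired with "coffee", while "howard" and "coffee" are never paired because they sit at opposite ends of the sentence. Pairs like these, gathered from a large body of text, are the kind of raw material a system such as word2vec learns from.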

Adrian Colyer is a venture capitalist with Accel, and blogs about technical topics such as word2vec. We talked about word2vec specifically, and the deep learning space more generally. We also explored how the rapidly improving tools around deep learning are changing the venture investment landscape.

If you like this episode, we have done many other shows about machine learning with guests like Matt Zeiler, the founder of Clarifai, and Francois Chollet, the creator of Keras. You can check out our back catalog by downloading the Software Engineering Daily app for iOS, where you can listen to all of our old episodes and easily discover new topics that might interest you. You can upvote the episodes you like and get recommendations based on your listening history. With 600 episodes, it is hard to find the episodes that appeal to you, and we hope the app helps with that.

Question of the Week: What is your favorite continuous delivery or continuous integration tool? Email jeff@softwareengineeringdaily.com and a winner will be chosen at random to receive a Software Engineering Daily hoodie. 


To build the kinds of things developers want to build today, they need better tools. That’s why Amazon Web Services built Amazon Aurora, a relational database engine that’s compatible with MySQL and PostgreSQL and provides up to five times the performance of standard MySQL on the same hardware, at a tenth of the cost. Amazon Aurora from AWS can scale up to millions of transactions per minute, automatically grow your storage up to 64 terabytes, and replicate data to three different Availability Zones. And you don’t have to manage a thing. There are no upfront charges and no commitments; you only pay for what you use. Check it out at aurora.aws.

Toptal is the best place to find reasonably priced, extremely talented software engineers to build your projects from scratch or scale your workforce. Get a free pair of Apple Airpods when you use Toptal.com/sedaily to work with an engineer for at least 20 hours.

Cloudflare runs 10% of the Internet, providing performance and security to millions of websites. Many of you probably already use Cloudflare on your sites. We’re not talking about using Cloudflare today, though; we’re here to talk about building on top of it. If you’re a developer, you can build apps that can be installed by the millions of sites that rely on Cloudflare. You can even sell your apps; they can make you money every month. Visit cloudflare.com/sedaily to watch how you can build and deploy an app in less than 3 minutes.

Software Weekly

Subscribe to Software Weekly, a curated weekly newsletter featuring the best and newest from the software engineering community.