The easiest way to train a computer to recognize a picture of a cat is to show the computer a million labeled images of cats. The easiest way to train a computer to recognize a stop sign is to show the computer a million labeled stop signs.
Supervised machine learning systems require labeled data. Today, most of that labeling needs to be done by humans. When a large tech company decides to “build a machine learning model,” that often requires a massive amount of effort to get labeled data.
Hundreds of thousands of knowledge workers around the world earn their income from labeling tasks. An example task might be “label all of the pedestrians in this intersection.” You receive a picture of a crowded intersection, and your task is to circle all the pedestrians. You have now created a piece of labeled data.
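To make "a piece of labeled data" concrete, here is a minimal sketch of what the pedestrian task above might produce. Everything in it is an assumption made for illustration: the `LabelingTask` and `Circle` names, the file name, and the coordinates are all made up, and this is not the format of Scale or any other real labeling service.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Circle:
    """One circled region in the image: center and radius, in pixels."""
    x: float
    y: float
    radius: float


@dataclass
class LabelingTask:
    """A hypothetical labeling task: an image, an instruction, and the label to apply."""
    image_path: str
    instruction: str
    label: str
    regions: List[Circle] = field(default_factory=list)

    def circle(self, x: float, y: float, radius: float) -> None:
        """Record one region that the human worker circled."""
        self.regions.append(Circle(x, y, radius))

    def to_labeled_example(self) -> Dict:
        """Turn the finished task into one labeled training example."""
        return {
            "image": self.image_path,
            "label": self.label,
            "regions": [(c.x, c.y, c.radius) for c in self.regions],
        }


# A worker receives the task and circles two pedestrians (made-up values).
task = LabelingTask(
    image_path="intersection_001.jpg",
    instruction="Label all of the pedestrians in this intersection.",
    label="pedestrian",
)
task.circle(120.0, 340.0, 25.0)
task.circle(410.0, 310.0, 22.0)

example = task.to_labeled_example()
```

In this sketch, `example` is the piece of labeled data: the image, the label, and where in the image the label applies. A supervised model would be trained on many such examples.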
Scale API is a company that turns API requests into human tasks. Their most recent release is an API for labeling data that has been generated from sensors. As self-driving cars emerge onto our streets, the sensors on these cars generate LIDAR, radar, and camera data. The cars will interpret that data in real time using their machine learning models, and then they will send that data to the cloud so that the data can be processed offline to improve the machine learning models of every car on the road.
The first step in that processing pipeline is the labeling, which is the focus of today’s conversation. Alexandr Wang is the CEO of Scale, and he joins the show to discuss self-driving cars, labeling, and the company he co-founded.
A few notes before we get started. We just launched the Software Daily job board. To check it out, go to softwaredaily.com/jobs. You can post jobs, you can apply for jobs, and it’s all free. If you are looking to hire, or looking for a job, I recommend checking it out. And if you are looking for an internship, you can use the job board to apply for an internship at Software Engineering Daily.
Also, Meetups for Software Engineering Daily are being planned! Go to softwareengineeringdaily.com/meetup if you want to register for an upcoming Meetup. In March, I’ll be visiting Datadog in New York and Hubspot in Boston, and in April I’ll be at Telesign in LA.
Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.
QCon.ai is a software conference for full-stack developers looking to uncover the real-world patterns, practices, and use cases for applying artificial intelligence/machine learning in engineering. Come to QCon.ai in San Francisco from April 9th to 11th, 2018, and see talks from companies like Instacart, Uber, Coinbase, and Stripe. These companies have built and deployed state-of-the-art machine learning models, and they come to QCon to share their developments. The keynote speaker at QCon.ai is Matt Ranney, a Sr. Staff Engineer at Uber ATG (the autonomous driving unit at Uber). He’s an amazing speaker, and he was on SE Daily in the past if you want a preview of what he is like. I have been to QCon three times and it is a fantastic conference. What I love about QCon is the high bar for quality: quality in terms of speakers, content, and peer sharing, as well as the food and general atmosphere. QCon is one of my favorite conferences, and if you haven’t been to a QCon before, make QCon.ai your first. Register for QCon.ai and use promo code SEDAILY for $100 off your ticket.
Digital Ocean is a reliable, easy-to-use cloud provider. More and more people are finding out about Digital Ocean and realizing that Digital Ocean is perfect for their application workloads. This year, Digital Ocean is making that even easier with new node types: a $15 flexible droplet that can mix and match different configurations of CPU and RAM to get the perfect amount of resources for your application. There are also CPU-optimized droplets, perfect for highly active frontend servers or CI/CD workloads. Running on the cloud can get expensive, which is why Digital Ocean makes it easy to choose the right size instance. The prices on standard instances have gone down too. You can check out all their new deals by going to do.co/sedaily. As a bonus to our listeners, you will get $100 in credit over 60 days. Use the credit for hosting or infrastructure, including load balancers, object storage, and computation. Get your free $100 credit at do.co/sedaily. Thanks to Digital Ocean for being a sponsor of Software Engineering Daily.
When you’re building an application, you need it to be fast, secure, and always evolving. With Kubernetes Engine on Google Cloud Platform, developers can deploy fully managed containerized apps quickly and easily. Google has been running production workloads in containers for over 15 years. We build the best of what we learn into Kubernetes, the industry-leading open source container orchestrator. Kubernetes Engine combines automatic scaling, updates, and reliable, self-healing infrastructure with open-source flexibility to cut down development cycles and speed up time to market. To learn more about Kubernetes Engine, visit g.co/getgke.