Mapillary: Computer Vision Crowdsourcing with Peter Neubauer

Mapillary is a platform for gathering photos taken by smartphones and using that data to build a 3D model of the world. Mapillary’s model of the world includes labeled objects such as traffic signs, trees, humans, and buildings. This 3D model can be explored much like you can explore Google Street View.

The data set that underlies Mapillary is crowdsourced from volunteer users who are taking pictures from different vantage points. These smartphone photos are uploaded to Mapillary, queued, and processed to constantly update and refine the Mapillary model.

Mapillary processes high volumes of photos from around the world. Each image needs to be fit correctly into Mapillary’s model of the world, like a puzzle piece sliding into place. The image needs to be segmented into the entities within it, and those entities need to be put through an object recognition algorithm. When two pictures conflict, that conflict needs to be resolved.
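To make the queue, recognize, and resolve steps concrete, here is a minimal Python sketch. It is not Mapillary's implementation: the `Detection` type, the grid-cell key, and the "highest confidence wins" rule are all assumptions chosen for illustration, and the `detect` callable stands in for whatever segmentation and object recognition the real system runs.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One recognized object in a photo (hypothetical shape)."""
    label: str
    lat: float
    lon: float
    confidence: float


def cell(detection, precision=4):
    """Bucket a detection into a coarse grid cell by rounding its coordinates."""
    return (round(detection.lat, precision), round(detection.lon, precision))


def merge(model, detection):
    """Assumed conflict rule: within a cell, keep the most confident detection."""
    key = cell(detection)
    current = model.get(key)
    if current is None or detection.confidence > current.confidence:
        model[key] = detection


def process(queue, detect):
    """Drain the upload queue, run detection on each photo, merge into the model."""
    model = {}
    while queue:
        photo = queue.popleft()
        for detection in detect(photo):
            merge(model, detection)
    return model
```

In this sketch, two photos that disagree about the object at the same location produce a single entry in the model, and the lower-confidence label is discarded. A production system would need a far richer notion of "same location" and of how to weigh conflicting evidence.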

Mapillary is full of interesting engineering problems. The high volume of images and the level of processing have created the need for a unique sequence of indexing, queueing, and distributed processing using Apache Storm. In addition to processing all of this data and building a 3D model, Mapillary serves an API for querying the geolocations of traffic signs, road conditions, and bus stops.
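As a rough illustration of what a geolocation query over recognized objects involves, the sketch below filters objects by kind and bounding box. The `MapObject` type and `query_bbox` function are hypothetical names invented here. They are not part of the Mapillary API, and the sketch ignores bounding boxes that cross the antimeridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapObject:
    """A recognized object with a position (hypothetical shape)."""
    kind: str
    lat: float
    lon: float


def query_bbox(objects, kind, south, west, north, east):
    """Return objects of the given kind whose position lies inside the box."""
    return [
        obj
        for obj in objects
        if obj.kind == kind
        and south <= obj.lat <= north
        and west <= obj.lon <= east
    ]
```

A linear scan like this is only workable for small data sets. At the volumes described above, a service would more plausibly rely on a spatial index, which is part of why indexing appears in the processing sequence.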

Peter Neubauer is the co-founder of Mapillary, and is also a co-founder of Neo Technology, the company behind Neo4j. Peter is a world-class engineer and he joins the show to give a detailed overview of the technology behind Mapillary, from ingesting the photos to running data engineering jobs to serving the API.


Show Notes


Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.


QCon San Francisco 2018 features 18 editorial tracks with 140+ speakers from places like Uber, Google, Dropbox, Slack, Twitter, and more. At QCon, we create a platform for senior software engineers, team leads, architects, and leaders working at innovator and early adopter companies to share their stories. It goes to the heart of who we are. We simply prefer practitioners over evangelists in the speakers we bring to the conference. SED listeners can save $100 off the price of a ticket using the promo code SED100.

Digital Ocean is the easiest cloud platform to run and scale your application. Try it out today and get a free $100 credit–go to Digital Ocean is a complete cloud platform to help developers and teams save time when running and scaling their applications.

OpenShift is a Kubernetes platform from Red Hat. OpenShift takes the Kubernetes container orchestration system and adds features that let you build software more quickly. OpenShift includes service discovery, CI/CD, built-in monitoring and health management, and scalability. With OpenShift, you avoid getting locked into any particular cloud provider. Check out OpenShift from Red Hat, by going to

GoCD is a continuous delivery tool created by ThoughtWorks. It’s great to see the continued progress on GoCD with the new Kubernetes integrations–and you can check it out for yourself at

Software Weekly

Subscribe to Software Weekly, a curated weekly newsletter featuring the best and newest from the software engineering community.