Stripe Machine Learning Infrastructure with Rob Story and Kelley Rivoire

Machine learning allows software to improve as that software consumes more data.

Machine learning is a tool that every software engineer wants to be able to use. Because machine learning is so broadly applicable, software companies want to make the tools more accessible to the developers across the organization.

There are many steps that an engineer must go through to use machine learning, and each additional step reduces the chances that the engineer will actually get their model into production.

An engineer who wants to build machine learning into their application needs access to data sets. They need to join those data sets and load them onto a machine (or multiple machines) where their model can be trained. Once the model is trained, it needs to be tested on additional data to ensure quality. If the initial model quality is insufficient, the engineer might need to tweak the training parameters.
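To make those steps concrete, here is a minimal sketch in plain Python of the join, train, test, and tweak loop. Everything in it is an assumption made for illustration: the toy data, the one-parameter threshold "model", the 0.95 quality bar, and the helper names `join`, `train`, and `accuracy`. It is not how Stripe or Railyard implements any of these steps.

```python
# Hypothetical toy data: two "data sets" keyed by a shared record id.
amounts = {i: float(i) for i in range(100)}              # id -> feature
labels = {i: (1 if i >= 60 else 0) for i in range(100)}  # id -> label


def join(features, targets):
    """Inner-join two dicts on their keys, returning (feature, label) rows."""
    shared = sorted(features.keys() & targets.keys())
    return [(features[k], targets[k]) for k in shared]


def accuracy(rows, threshold):
    """Fraction of rows where predicting 1 for x >= threshold matches the label."""
    correct = sum(1 for x, y in rows if (1 if x >= threshold else 0) == y)
    return correct / len(rows)


def train(rows, candidate_thresholds):
    """'Train' by picking the candidate threshold with the best training accuracy."""
    return max(candidate_thresholds, key=lambda t: accuracy(rows, t))


rows = join(amounts, labels)

# Deterministic split: every fifth row is held out for testing.
holdout_rows = rows[::5]
train_rows = [row for i, row in enumerate(rows) if i % 5 != 0]

# First attempt: a coarse grid of training parameters.
threshold = train(train_rows, range(0, 100, 25))
holdout_acc = accuracy(holdout_rows, threshold)

# Quality is insufficient, so tweak the parameters (a finer grid) and retrain.
if holdout_acc < 0.95:
    threshold = train(train_rows, range(0, 100, 5))
    holdout_acc = accuracy(holdout_rows, threshold)

print(threshold, holdout_acc)
```

In this toy setup the coarse grid cannot represent the true cutoff, so the held-out accuracy falls below the bar and the finer grid is tried. In a real project each of these steps usually involves separate tooling, which is the friction described above.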

Once a model is accurate enough, the engineer needs to deploy that model. After deployment, the model might need to be updated with new data later on. If the model is processing sensitive or financially relevant data, a provenance process might be necessary to allow for an audit trail of decisions that have been made by the model.
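One way to picture an audit trail of model decisions is an append-only log in which each record names the model version, fingerprints the training data and the input, and links to the previous record. The sketch below is only an illustration under those assumptions; the `AuditLog` class, its field names, and the hash-chaining scheme are hypothetical and are not taken from Stripe or Railyard.

```python
import hashlib
import json


def fingerprint(obj):
    """Stable SHA-256 hex digest of a JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class AuditLog:
    """Append-only log; each record stores the hash of the record before it."""

    def __init__(self):
        self.records = []

    def append(self, model_version, training_data_hash, features, decision):
        prev = self.records[-1]["record_hash"] if self.records else None
        body = {
            "model_version": model_version,
            "training_data_hash": training_data_hash,
            "input_hash": fingerprint(features),
            "decision": decision,
            "prev_hash": prev,
        }
        record = dict(body, record_hash=fingerprint(body))
        self.records.append(record)
        return record

    def verify(self):
        """Return True if no record has been altered or reordered."""
        prev = None
        for record in self.records:
            body = {k: v for k, v in record.items() if k != "record_hash"}
            if record["prev_hash"] != prev:
                return False
            if fingerprint(body) != record["record_hash"]:
                return False
            prev = record["record_hash"]
        return True


# Hypothetical usage: two decisions made by model version "v1".
log = AuditLog()
data_hash = fingerprint([[10.0, 0], [75.0, 1]])
log.append("v1", data_hash, {"amount": 80.0}, "flag")
log.append("v1", data_hash, {"amount": 5.0}, "allow")
print(log.verify())
```

Because each record's hash covers the previous record's hash, changing any stored decision after the fact causes `verify()` to return False.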

Rob Story and Kelley Rivoire are engineers working on machine learning infrastructure at Stripe. After recognizing the difficulties that engineers faced in creating and deploying machine learning models, Stripe engineers built out Railyard, an API for machine learning workloads within the company.

Rob and Kelley join the show to discuss data engineering and machine learning at Stripe, and their work on Railyard.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.


Sponsors

Pantheon makes it easier to manage your WordPress and Drupal websites, with scalable infrastructure, a fast CDN, and security features such as disaster recovery. Pantheon gives you automated workflows for managing dev, test, and production deployments, and Pantheon provides easy integrations with GitHub, CircleCI, JIRA, and more. If you have a WordPress or a Drupal website, check out pantheon.io/sedaily.

G2i is a hiring platform run by engineers that matches you with React, React Native, GraphQL, and mobile engineers you can trust. Whether you are a new company building your first product or an established company that wants additional engineering help, G2i has the talent you need to accomplish your goals. Go to softwareengineeringdaily.com/g2i.

DigitalOcean offers a simple, developer-friendly cloud platform. It’s optimized to make managing and scaling apps easy with an intuitive API, multiple storage options, integrated firewalls, load balancers, and more. With predictable pricing, flexible configurations, and world-class customer support, you’ll get access to all the infrastructure services you need to grow. Get started on DigitalOcean for free at do.co/sedaily.

FindCollabs is a place for finding collaborators and building projects. FindCollabs can be used to manage hackathons and creative projects. Check it out at FindCollabs.com.

Software Weekly

Subscribe to Software Weekly, a curated weekly newsletter featuring the best and newest from the software engineering community.