Serverless Research with Ion Stoica
The Berkeley AMPLab was a research lab where Apache Spark and Apache Mesos were both created. In the last five years, the Mesos and Spark projects have changed the way infrastructure is managed and improved the tools for data science.
Because of its proximity to Silicon Valley, Berkeley has become a university where fundamental research is blended with a sense of industry applications. Students and professors move between business and academia, finding problems in industry and bringing them into the lab where they can be studied without the day-to-day pressures of a corporation.
This makes Berkeley the perfect place for research around “serverless”.
Serverless computing abstracts away the notion of a server, allowing developers to work at a higher level and be less concerned about the problems inherent in servers, such as failing instances and unpredictable network connections.
With serverless functions-as-a-service, such as AWS Lambda, the cloud provider makes guarantees around the execution of serverless code. With serverless backend services, the cloud provider makes guarantees around the reliability of a database or queueing system.
The cloud provider still operates servers to power this functionality, but the user is not exposed to those servers.
Today’s show centers on serverless functions-as-a-service. This is a new paradigm of computing, and there are many open questions. How can the servers for our functions be quickly provisioned? How can we parallelize batch jobs into functions-as-a-service? How can large numbers of serverless functions communicate with each other reliably to coordinate?
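To make the batch-job question concrete, here is a minimal sketch of the general idea: split a job into independent chunks, run one stateless function per chunk, and combine the partial results. It is not how PyWren or AWS Lambda is implemented. The thread pool is only a local stand-in for a fleet of function invocations, and the names `word_count`, `split_batch`, and `run_batch` are hypothetical, chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(chunk: str) -> int:
    """A stateless 'function': its result depends only on its input."""
    return len(chunk.split())


def split_batch(items, chunk_size):
    """Split a batch into independent chunks, one per invocation."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def run_batch(lines, chunk_size=2, max_workers=4):
    """Map word_count over the chunks, then reduce the partial counts."""
    chunks = ["\n".join(group) for group in split_batch(lines, chunk_size)]
    # Assumption: the thread pool stands in for remote function invocations.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(word_count, chunks))
    return sum(partials)


lines = [
    "serverless computing abstracts away servers",
    "functions run on demand",
    "the provider manages capacity",
]
total = run_batch(lines)
```

Because each call to `word_count` shares no state with the others, the chunks could in principle run anywhere. The harder problems raised above, fast provisioning and reliable coordination between functions, are exactly what this local sketch leaves out.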
In production applications, functions-as-a-service are mostly used for “event-driven” applications. But the potential for functions-as-a-service is much larger.
Ion Stoica is a professor of computer science at Berkeley, where he leads the RISELab. He is a co-founder of Conviva Networks and Databricks, the company that grew out of the research on Apache Spark. Ion now serves as executive chairman of Databricks. Ion joins the show to describe why serverless computing is exciting, the open research problems, and the solutions that researchers at the RISELab are exploring.
- Occupy the Cloud: Distributed Computing for the 99%
- RISELab at UC Berkeley – REAL-TIME INTELLIGENT SECURE EXECUTION
- pywren — run your python code on thousands of cores – pywren
- IEEE Cloud Serverless Workshop July 2018 Jonas
- GitHub – Vaishaal/numpywren: Serverless Scientific Computing
- Serverless for Data Scientists| Mike Lee Williams @ PyBay2018 – YouTube
- Serverless for data scientists
- Serverless Big Data Analytics at Traveloka (Cloud Next ’18) – YouTube
- With PyWren, AWS Lambda Finds an Unexpected Market in Scientific Computing – The New Stack
Transcript provided by We Edit Podcasts.