Kafka Design Patterns with Gwen Shapira

Kafka is at the center of modern streaming systems. Kafka serves as a database, a pubsub system, a buffer, and a data recovery tool. It’s an extremely flexible tool, and that flexibility has led to its use as a platform for a wide variety of data-intensive applications.

Today’s guest is Gwen Shapira, a product manager at Confluent. Confluent is a company that was started by the creators of Apache Kafka: Jay Kreps, Neha Narkhede, and Jun Rao, who have all been on the show in previous episodes. In those shows, we discussed the inner workings of Kafka. This episode is more about practical use cases and design patterns.

Gwen explores a few use cases. The first is reconciling data between different servers. A massive, international, multi-user game like World of Warcraft needs to keep its users in sync even though those users are pinging different server locations, and Kafka can help reconcile data between those servers. This discussion reminded me of the awesome show we did with Yan Cui about scalable multiplayer games.

Other examples we discussed include log management, data enrichment, and large-scale analytics. It was a great conversation and I think you will enjoy it as well.

Software Engineering Daily is also accepting summer internship applications. If you are interested in working with us full-time this summer on the Software Engineering Daily open source project, send an application to internships@softwareengineeringdaily.com. We’d love to hear from you.

If you haven’t seen what we are building, check out softwaredaily.com, or download the Software Engineering Daily app for iOS or Android. These apps have all 650 of our episodes in a searchable format, with recommendations, categories, related links, and discussions around the episodes. It’s all free and open source. If you are interested in getting involved in our open source community, we have lots of people working on the project, and we do our best to be friendly and inviting to newcomers looking for their first open source project. You can find that project at Github.com/softwareengineeringdaily.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.

Sponsors


Your enterprise produces lots of data, but you aren’t capturing as much as you would like. You aren’t storing it in the right place, and you don’t have the proper tools to run complex queries against your data. MapR is a converged data platform that runs across any cloud. MapR provides storage, analytics, and machine learning engines. Use the MapR operational database and event streams to capture your data. Use the MapR analytics and machine learning engines to analyze your data, in batch or interactively, across any cloud, on premises, or at the edge. MapR’s technology is trusted by major companies like Audi, which uses MapR for its connected vehicles. MapR also powers Aadhaar, the world’s largest biometric system. To learn more about how MapR can solve problems for your enterprise, go to softwareengineeringdaily.com/mapr to find whitepapers, videos, and ebooks. Whether you are an oil company like Anadarko, a major FinTech provider like Kabbage, or a business in any other vertical, MapR can leverage the high volumes of data produced within your company. Go to softwareengineeringdaily.com/mapr and find out how MapR can help your business take full advantage of its data.



LiveRamp is one of the fastest growing companies in data connectivity in the Bay Area, and they are looking for senior-level talent to join their team. LiveRamp helps the world’s largest brands activate their data to improve customer interactions on any channel or device. The infrastructure is at a tremendous scale: a 500-billion node identity graph generated from over a thousand data sources, running on an 85 PB Hadoop cluster, and application servers that process over 20 billion HTTP requests per day. The LiveRamp team thrives on mind-bending technical challenges. LiveRamp members value entrepreneurship, humility, and constant personal growth. If this sounds like a fit for you, check out softwareengineeringdaily.com/liveramp.



When you’re building an application, you need it to be fast, secure, and always evolving. With Kubernetes Engine on Google Cloud Platform, developers can deploy fully-managed containerized apps quickly and easily. Google has been running production workloads in containers for over 15 years. We build the best of what we learn into Kubernetes, the industry-leading open source container orchestrator. Kubernetes Engine combines automatic scaling, updates, and reliable, self-healing infrastructure with open-source flexibility to cut down development cycles and speed up time to market. To learn more about Kubernetes Engine, visit g.co/getgke.