High Volume Event Processing with John-Daniel Trask

A popular software application serves billions of user requests, and those requests can be for many different things. Each one needs to be routed to the correct destination, load balanced across different instances of a service, and queued for processing. Processing a request might require generating a detailed response to the user, writing to a database, or creating a new file on a file system.

As a software product grows in popularity, it will need to scale these different parts of infrastructure at different rates. You may not need to grow your database cluster at the same pace that you grow the number of load balancers at the front of your infrastructure. Your users might start making 70% of their requests to one specific part of your application, and you might need to scale up the services that power that portion of the infrastructure.

Today’s episode is a case study of a high-volume application: a monitoring platform called Raygun.

Raygun’s software runs on client applications and delivers monitoring data and crash reports back to Raygun’s servers. If I have a podcast player application on my iPhone that runs the Raygun software, and that application crashes, Raygun takes a snapshot of the system state and reports that information along with the exception. The developer of that podcast player application can then see the full picture of what was going on on the user’s device, along with the exception that triggered the crash.

Throughout the day, applications all around the world are crashing and sending requests to Raygun’s servers. Even when crashes are not occurring, Raygun is receiving monitoring and health data from those applications. Raygun’s infrastructure routes those different types of requests to different services, queues them up, and writes the data to multiple storage layers: Elasticsearch, a relational SQL database, and a custom file server built on top of S3.
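To make the route, queue, and write flow described above concrete, here is a minimal sketch in Python. It is not Raygun's implementation: the event types ("crash" and "health"), the `Router` and `InMemorySink` classes, and the mapping of event types to storage layers are all hypothetical, and the in-memory sinks only stand in for a search index, a SQL database, and a file store.

```python
from collections import defaultdict, deque


class InMemorySink:
    """Hypothetical stand-in for one storage layer (search index, SQL, file store)."""

    def __init__(self, name):
        self.name = name
        self.records = []

    def write(self, event):
        self.records.append(event)


class Router:
    """Routes each event to a per-type queue, then fans it out to that type's sinks."""

    def __init__(self, sinks_by_type):
        self.sinks_by_type = sinks_by_type  # event type -> list of sinks
        self.queues = defaultdict(deque)    # event type -> pending events

    def enqueue(self, event):
        kind = event["type"]
        if kind not in self.sinks_by_type:
            raise ValueError(f"unknown event type: {kind}")
        self.queues[kind].append(event)

    def drain(self, kind):
        """Process every queued event of one type; return how many were written."""
        processed = 0
        queue = self.queues[kind]
        while queue:
            event = queue.popleft()
            for sink in self.sinks_by_type[kind]:
                sink.write(event)
            processed += 1
        return processed


# Assumed mapping: crash reports go to all three layers, health data only to search.
search = InMemorySink("search")
sql = InMemorySink("sql")
files = InMemorySink("files")
router = Router({"crash": [search, sql, files], "health": [search]})

router.enqueue({"type": "crash", "message": "example exception"})
router.enqueue({"type": "health", "cpu": 0.4})
crash_count = router.drain("crash")
health_count = router.drain("health")
```

Keeping one queue per event type is what lets each kind of work be drained, and therefore scaled, independently of the others.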

John-Daniel Trask is the CEO of Raygun and he joins the show to describe the end-to-end architecture of Raygun’s request processing and storage system. We also explore specific refactoring changes that were made to save costs at the worker layer of the architecture. This is a useful memory management strategy for anyone working in a garbage-collected language. If you would like to see diagrams that explain the architecture and other technical decisions, the show notes have a video that explains what we talk about in this show. Full disclosure: Raygun is a sponsor of Software Engineering Daily.


Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.


TransferWise makes it cheaper and easier to send money to other countries. It’s a simple mission, but it’s important. TransferWise is looking for engineers to join their team. We have reported on TransferWise in past episodes, and I love the company, because they make international payments more efficient. If you are a Java developer, a full-stack engineer, a product manager, or a data analyst, check out transferwise.com/jobs. Last year, TransferWise’s VP of engineering Harsh Sinha came on Software Engineering Daily to discuss how TransferWise works, and it was a fascinating discussion. Every month, tens of thousands of people send about 1 billion dollars, in 45 currencies, to 64 countries on TransferWise. Along the way, there are many engineering challenges, so there’s plenty of opportunity to make your mark, and learn about the evolving industry of financial technology. TransferWise is built by self-sufficient, autonomous teams, and each team picks the problems they want to solve. There’s no micro-management. No one telling you what to do. Find an autonomous, challenging, rewarding job by going to transferwise.com/jobs. TransferWise has several open roles in engineering, and has offices in London, New York, Tampa, Tallinn, Cherkassy, Budapest, and Singapore, among other places. Find out more at transferwise.com/jobs.

Dice helps you accelerate your tech career. Whether you’re actively looking for a job or need insights to grow in your role, Dice has the resources you need. Dice’s mobile app is the fastest and easiest way to get ahead. Search thousands of tech jobs – from software engineering to UI/UX to product management. Discover your worth with Dice’s Salary Predictor based on your unique skill set. Uncover new opportunities with Dice’s new career pathing tool, which can give you insights about the best types of roles to transition to – and the skills you’ll need to get there. Manage your tech career and download the Dice Careers app on Android or iOS today. To check out Dice and support Software Engineering Daily, go to Dice.com/sedaily. Thanks to Dice for being a sponsor of Software Engineering Daily.

Simplify continuous delivery with GoCD, the on-premise, open source, continuous delivery tool by ThoughtWorks. With GoCD, you can easily model complex deployment workflows using pipelines and visualize them end-to-end with the Value Stream Map. You get complete visibility into and control of your company’s deployments. At gocd.org/sedaily, find out how to bring continuous delivery to your teams. Say goodbye to deployment panic and hello to consistent, predictable deliveries. Visit gocd.org/sedaily to learn more about GoCD. Commercial support and enterprise add-ons, including disaster recovery, are available.

