Tag Distributed Systems

Prometheus with Julius Volz

http://traffic.libsyn.com/sedaily/Prometheus_Edited.mp3

Prometheus is an open-source monitoring tool built at SoundCloud. It can be used to produce detailed time-series data about a distributed architecture. Prometheus is based on the monitoring system inside Google’s infrastructure, called Borgmon. Julius Volz is the creator of Prometheus, and he joins the show to explain why he built Prometheus and how it differs from previous monitoring tools. Prometheus is

Continue reading…

Peter Bailis on the Data Community’s Identity Crisis

http://traffic.libsyn.com/sedaily/database_crisis_edited_fixed.mp3

Breakthroughs in modern data research tend to come from companies like Google, Facebook, and Amazon, with projects like MapReduce, Cassandra, and Dynamo. Twenty years ago, these types of breakthroughs would have come from academia, which leads today’s guest Peter Bailis to ask: is the academic data community having an identity crisis? Peter is an assistant professor at Stanford University, where he

Continue reading…

Death and Distributed Systems with Pieter Hintjens

http://traffic.libsyn.com/sedaily/Zeromq_Edited.mp3

Pieter Hintjens grew up writing software by himself. The act of writing code brought him great pleasure, but the isolated creative process disconnected him from the rest of the world. As his life progressed he became involved in open source communities, and he discovered a passion for human interaction. Open source software succeeds or fails on the strength of the community. One story

Continue reading…

Elixir and Erlang with Jose Valim

http://traffic.libsyn.com/sedaily/Elixir_Edited_2.mp3

“Functional programming is about making the complex parts of your system explicit.” Elixir is a programming language built on top of the Erlang virtual machine. Elixir supports metaprogramming and polymorphism, and its web framework, Phoenix, has drawn positive comparisons to Ruby on Rails. Jose Valim is today’s guest. He built Elixir to augment a language that he loved: Erlang. On Software Engineering Daily,

Continue reading…

Search as a Service with Julien Lemoine

http://traffic.libsyn.com/sedaily/Algolia_Edited_2.mp3

“You need to build more things yourself to be highly available, but one of the very good consequences of being bare metal is that the prices are very low compared to what you could get on the cloud provider.” Engineers who want to add search to their application usually deploy Elasticsearch, or write their own search engine that uses TF-IDF. These solutions work

Continue reading…

Managing a CDN with Carl Gustas

http://traffic.libsyn.com/sedaily/CDN_Edited.mp3

“We’re not always in control of other people’s networks.” CDN stands for content delivery network. A content delivery network is a system of distributed servers that delivers web pages and other web content. Without CDNs, the internet would be much slower, because CDNs function as a caching layer for most web resources. Carl Gustas is an engineer at CacheFly, a popular content delivery

Continue reading…

Logging and NoOps with Christian Beedgen

http://traffic.libsyn.com/sedaily/Sumo_Edited.mp3

“You write the code, but you don’t run it? That’s just preposterous.” Software applications are constantly generating logs. These logs are necessary to understand how an application is functioning, and logs are key to debugging. As applications have gotten more complex, logging infrastructure has become complex as well. Storing and managing all of our log data is such a big task that several

Continue reading…

Automating Infrastructure at HashiCorp with Mitchell Hashimoto

http://traffic.libsyn.com/sedaily/Hashi_Edited_2.mp3

“SaaS, whether we want it or not, in enterprise technology or in our data centers, is coming.” Application delivery has become more complex as software architectures have moved into the cloud. Data center infrastructure has turned into code to be manipulated, and software engineering teams are adjusting their strategies. HashiCorp is a company that builds open-source software for application development and deployment. Mitchell

Continue reading…

Stream Processing at Uber with Danny Yuan

http://traffic.libsyn.com/sedaily/uber_danny_edited.mp3

“Be aggressive in vision, but conservative in operation.” Uber is a transportation company with a high volume of temporal and spatial data, constantly being collected from the devices of its users. At any given time, the engineers and data scientists at Uber need to be able to query the system and understand what is going on with drivers and riders. The unique real-time engineering

Continue reading…

Bootstrapping a SaaS for Developers with Itai Lahan

http://traffic.libsyn.com/sedaily/cloudinary_Edited.mp3

“It’s an amazing era for software developers – we have all this amazing infrastructure behind the scenes that we can build upon.” Ten years ago, building a highly scalable image delivery service would require millions of dollars in upfront costs, and hours of work configuring hardware server infrastructure. Today, it is possible to bootstrap this type of service with minimal investment. Today’s episode

Continue reading…

Alluxio and Memory-centric Distributed Storage with Haoyuan Li

http://traffic.libsyn.com/sedaily/Alluxio_Edited.mp3

“It’s not really about removing disk from the picture per se – it’s more like saying, ‘how do we leverage more and more resources from DRAM?’” Memory is king. The costs of memory and disk capacity are both decreasing every year, but only the throughput of memory is increasing exponentially. This trend is driving opportunity in the space of big data processing. Alluxio

Continue reading…

OpenStack and the Future of Cloud Computing with John Purrier

http://traffic.libsyn.com/sedaily/Openstack_Edited.mp3

“Why do we need any open source versions of proprietary implementations? I would argue that first of all, it’s just good for industry and the ecosystem.” Cloud service providers like Amazon, Google, and Microsoft provide both infrastructure as a service and platform as a service. Infrastructure as a service gives developers access to virtual machines, servers, and network infrastructure. Platform as a service

Continue reading…

Microservices, Distributed Teams, and Conferences with Juan Pablo Buriticá

http://traffic.libsyn.com/sedaily/Ride_Edited.mp3

“With any system, whether it’s an organization or a biological system, or an information system, communication is always going to be a challenge. And how pieces of the system communicate will have a direct impact on how effective or efficient that organism or organization is.” In today’s episode, Ben Halpern interviews Juan Buritica, VP of Engineering at Ride. They

Continue reading…

Robots in the Warehouse with Akash Gupta

http://traffic.libsyn.com/sedaily/greyorange_final.mp3

“Our major teams are led by two people which include one person who’s very strong in their own field, and then there’s a person who has a very good understanding of the cross-functional nature of product development.” GreyOrange Robotics builds warehouse-automation robots that help logistics and ecommerce companies deliver more quickly. Today’s episode features Akash Gupta, the CTO of GreyOrange. Questions Is

Continue reading…

Developer Analytics with Calvin French-Owen

http://traffic.libsyn.com/sedaily/Segment_Edited.mp3

“It’s sort of like the old joke in computer science – what do you do when you have a problem? Well, add a layer of abstraction.” Today’s guest is Calvin French-Owen, the CTO of Segment, a tool that companies use to aggregate their analytics into one place. As Segment has scaled, the company has had to restructure its entire technical architecture. Microservices, containers,

Continue reading…

FiloDB with Evan Chan

http://traffic.libsyn.com/sedaily/Filodb_Edited.mp3

“The world is becoming more and more interactive, and people want answers right away, so you’re seeing the rise of stream processing and real-time.” Big data is yesterday; fast data is now. FiloDB is a reactive columnar OLAP database that is built on Cassandra and Spark. Today’s guest is Evan Chan, creator of FiloDB. In our discussion today, we talk about the use cases

Continue reading…

What are the differences between Druid and AWS Redshift?

From Eric Tschetter’s answer via Quora: The difference you are asking about though is ParAccel vs. Druid.  ParAccel is the software that Amazon is licensing for RedShift. Aside from just potential differences in performance, there are some functional differences (these are all based on a cursory understanding of what ParAccel does, I’ve read what I could find on it, but a lot of my understanding is extracted from interpretations of marketing

Continue reading…

Cassandra with Tim Berglund

http://traffic.libsyn.com/sedaily/Cassandra_Edited.mp3

“There isn’t any central node in Cassandra. Every node is a peer, there is no master – there is no single point of failure.” Apache Cassandra can serve as both the real-time data store for online transactional applications and the read-intensive database for data warehousing operations. In order to combine these two use cases into a single database, Apache Cassandra required

Continue reading…

Hadoop: Past, Present and Future with Mike Cafarella

http://traffic.libsyn.com/sedaily/Hadoop_2_Edited.mp3

“HDFS is going to be a cockroach – I don’t think it’s ever going away.” Hadoop was created in 2003. In the early years, Hadoop provided large-scale data processing with MapReduce, and distributed fault-tolerant storage with the Hadoop Distributed File System. Over the last decade, Hadoop has evolved rapidly, with the support of a big open-source community. Today’s guest is Mike Cafarella,

Continue reading…

Data Engineering at Airbnb with Maxime Beauchemin

http://traffic.libsyn.com/sedaily/Airbnb_Edited.mp3

“One big transformation we’re seeing right now is the slow agonizing death of MapReduce.” When a company gets big enough, there is so much data to be processed that an entire data engineering team becomes responsible for managing this data and making it available to other teams. Airbnb is one such company. Max Beauchemin works on the data engineering team at Airbnb, where

Continue reading…