Demystifying Stream Processing with Neha Narkhede

“Systems are giving up correctness for latency, and I’m arguing that stream processing systems have to be designed to allow the user to pick the tradeoffs that the application needs.”

Applications that need to process large volumes of data are increasingly adopting the stream processing paradigm. Apache Kafka is an open-source distributed publish-subscribe messaging system built to support streaming data processing.

Neha Narkhede is one of the creators of Apache Kafka, which she built to address engineering challenges while working at LinkedIn. She is also the co-founder of Confluent, which builds enterprise products around Kafka.

Questions

  • Why was Kafka originally developed?
  • Before Kafka, how did LinkedIn maintain consistency between these different services?
  • What is the purpose of a broker in Kafka?
  • How did you end up with the log structure?
  • How does Kafka do garbage collection on messages?
  • Could you describe Kafka’s persistence model in more detail?
  • Is the batch versus streaming discussion a false dichotomy?

Links

Sponsors

Hired.com is the job marketplace for software engineers. Go to hired.com/softwareengineeringdaily to get a $2000 bonus upon landing a job through Hired.

DigitalOcean is the simplest cloud hosting provider. Use promo code SEDAILY for $10 in free credit.
