Tag: Storm

Stream Processing with Satish Mittal

“We still need to see in the long run how much of community and industry adoption is there. Because at the end of the day, these are the single two most important things which define and determine the success of any platform.”

Continue reading…

Databases: Fundamental Answers

Database Week began with a set of fundamental questions. What is a database?

Every interviewee during Database Week has given a different answer to the question of "What is a database?" (SE Daily, @software_daily, August 21, 2015)

One definition: “an application component for storing and retrieving data”. All of the different database companies have this functionality. But the similarities end there.

RethinkDB pushes data to the application
MemSQL is a faster, proprietary version

Continue reading…

Transactions and Analytics with VoltDB’s Ryan Betts

http://traffic.libsyn.com/sedaily/voltdb_rbetts.mp3

Streaming pipelines and in-memory analytics are difficult to support with old database systems. VoltDB provides streaming analytics with transactions.

Questions:
How does VoltDB exemplify Michael Stonebraker’s thesis that one size does not fit all?
What is the difference between OLTP and Streaming?
How does VoltDB serve the common Zookeeper-Kafka-Storm-Cassandra stack?
What trends and requirements among OLTP and OLAP systems are changing most

Continue reading…

Streaming SQL with PipelineDB CEO Derek Nelson

http://traffic.libsyn.com/sedaily/pipelinedb_derek.mp3

PipelineDB is a streaming SQL database. Derek Nelson is the CEO of PipelineDB.

Questions:
What are continuous views?
Why is PipelineDB a good fit for the Kafka+Storm+HBase-type architecture?
How does PipelineDB affect the application tier or the browser tier?
What are the latency guarantees for how long it takes raw data streams to be converted into the refined queries provided by a continuous view?

Continue reading…

Big Data: Fundamental Answers

Fundamental questions as big as data itself loomed at the beginning of Big Data Week. Some answers: How do customers of multiple managed big data companies deal with the heterogeneity? Confluent provides Kafka, Rocana provides ops, Databricks gives you data science, Cloudera and Hortonworks give you everything else. Each company has a proprietary layer meshed with open-source software. Generally, the more proprietary software you are running, the more you will need

Continue reading…

Apache Kafka with Guozhang Wang

http://traffic.libsyn.com/sedaily/guozhang_kafka.mp3

Apache Kafka is a publish-subscribe messaging system rethought as a distributed commit log. Kafka serves as the central repository for data streams in a distributed system. Guozhang Wang is an engineer at Confluent, which offers a stream data platform built using Kafka.

Questions include:
What is a central repository for data streams?
How does Kafka improve transportation between systems?
How does Kafka allow for richer

Continue reading…

Cloudera Chief Technologist Eli Collins Discusses Streaming, Batch, Business, and Open-Source

http://traffic.libsyn.com/sedaily/eli_cloudera.mp3

Cloudera allows enterprises to leverage their data through its Hadoop platform. Eli Collins is the Chief Technologist at Cloudera.

Topics include:
changes to Hadoop since Cloudera’s founding
Cloudera’s usage of Spark, Docker, and other open-source technologies
how enterprises use batch and streaming together
Cloudera’s open-source policy
Should Frito Lay open source its chip-making abilities?
how collaboration occurs between big, competing companies
the growth of increasingly

Continue reading…