Tag: Big Data

Teaching Data Science with Vik Paruchuri

There is a need for more data scientists to make sense of the vast amounts of data we produce and store. Dataquest is an in-browser platform for learning data science that is tackling this problem.

Vik Paruchuri is the founder of Dataquest. He was previously a machine learning engineer at edX and, before that, a U.S. diplomat.

Continue reading…

Security and Privacy with Bruce Schneier

“What we learn again and again is that security is less about what you think of, and more about what you didn’t think of.”

Bruce Schneier is a security researcher and author of Data and Goliath.

Continue reading…

Bitcoin with Andreas Antonopoulos

http://traffic.libsyn.com/sedaily/bitcoin_andreas.mp3

Bitcoin’s cultural implications inform the engineering opportunities and constraints.

Andreas Antonopoulos is a bitcoin researcher, journalist, and evangelist.

Questions: What are the taboo topics within the bitcoin community? What do you think of when people say “we know bitcoin is the first real cryptocurrency, but the big question is whether it will be the last”? Were grey markets previously underserved before bitcoin? What is

Continue reading…

Big Data: Fundamental Answers

Fundamental questions as big as data itself loomed at the beginning of Big Data Week. Some answers:

How do customers of multiple managed big data companies deal with the heterogeneity?

Confluent provides Kafka, Rocana provides ops, Databricks gives you data science, Cloudera and Hortonworks give you everything else. Each company has a proprietary layer meshed with open-source software. Generally, the more proprietary software you are running, the more you will need

Continue reading…

Why Has The Number of Database Products Exploded?

The high number of database products is a result of the amount of data we generate.

Yad Faeq via Quora

You’ve hinted at the term long tail for databases, which leads to a very interesting discussion. Chris Anderson explains the long tail in the entertainment industry in this talk; the same basis may apply to technology and specifically data. Here are just a few applications of data that I

Continue reading…

Hortonworks Data Platform with Venkatesh Seetharam

http://traffic.libsyn.com/sedaily/venkatesth_hortonworks.mp3

Hortonworks Data Platform is a managed Hadoop architecture for enterprises.

Venkatesh Seetharam is a software engineer at Hortonworks. He has worked on several Apache projects, including Hadoop, Falcon, and Atlas.

Questions include: Will Hadoop ever be so big we will have to start over from scratch? What is the YARN data operating system? How are customers of Hortonworks dealing with numerous managed Big Data

Continue reading…

Apache Spark Creator Matei Zaharia Interview

http://traffic.libsyn.com/sedaily/matei_spark.mp3

Apache Spark is a fast and general engine for big data processing.

Matei Zaharia created Spark, and is the co-founder of Databricks, a company using Spark to power data science.

Questions: What was the motivation behind creating Spark? How much faster is a Spark job than a Hadoop job? What is the relationship between streaming and batch processing? Is Spark’s core advantage over Storm

Continue reading…
