Data Engineering with Tobias Macey
The Hadoop ecosystem provided every company with the tools to store and query large amounts of data at a low cost. Since 2005, that ecosystem has expanded with more and more open source applications for data infrastructure.
Data infrastructure includes databases, data lakes, distributed queues, data warehouses, query engines, web applications, on-prem software, closed source, open source, cloud platforms, and software-as-a-service. There is so much software that it is difficult to keep track of everything in the data engineering ecosystem.
Tobias Macey is the host of the Data Engineering Podcast, a show about technologies and practices in the world of data engineering. Tobias joins today’s episode as a guest to give his perspective on the evolving landscape of data engineering and the trends he is seeing on the Data Engineering Podcast.
- New SEDaily app for iOS and for Android. It includes all 1000 of our old episodes, as well as related links, greatest hits, and topics. You can comment on episodes and have discussions with other members of the community. I’ll be commenting on each episode, so if you hear an episode that you have some commentary on, jump onto the app, or on SoftwareDaily.com to share your thoughts. And you can become a paid subscriber for ad-free episodes at softwareengineeringdaily.com/subscribe. Altalogy is the company that has been developing much of the software for the newest app, and if you are looking for a company to help you with your mobile and web development, I recommend checking them out.
- FindCollabs is a place to find collaborators and build projects. FindCollabs is the company I am building, and we are having an online hackathon with $2500 in prizes. If you are working on a project, or you are looking for other programmers to build a project or start a company with, check out FindCollabs. I’ve been interviewing people from some of these projects on the FindCollabs podcast, so if you want to learn more about the community you can hear that podcast.
- Upcoming conferences I’m attending: Datadog Dash, July 16th and 17th in NYC, and Open Core Summit, September 19th and 20th in San Francisco.
- We are hiring two interns for software engineering and business development! If you are interested in either position, send an email with your resume to email@example.com with “Internship” in the subject line.
Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.
With Triplebyte, you do one online interview, and then you get to go straight to final interviews at hundreds of companies (from tech giants like Dropbox to exciting startups). It’s like the Common App for software engineers. No resume needed. Apply now at triplebyte.com/sedaily. If you take a job through Triplebyte, you’ll get a $1000 signing bonus.
SpringOne Platform is a conference for learning about building scalable web applications. SpringOne Platform is organized by Pivotal, the company that has helped build open source technologies such as Spring and Cloud Foundry. SpringOne Platform 2019 is in Austin, Texas October 7-10, and you can get $200 off your pass by going to softwareengineeringdaily.com/spring, and using promo code S1P200_SED.
With MongoDB Atlas, you can take advantage of MongoDB’s flexible document data model as a fully automated cloud service. MongoDB Atlas handles all the costly database operations and admin tasks that you’d rather not spend time on, like security, high availability, data recovery, monitoring, and elastic scaling. Try MongoDB Atlas for free today! Visit mongodb.com/se to learn more.
GoCD is a continuous delivery tool from ThoughtWorks. If you have heard about continuous delivery, but you don’t know what it looks like in action, try the GoCD test drive at gocd.org/sedaily. GoCD’s test drive will set up example pipelines for you to see how GoCD manages your continuous delivery workflows. Visualize your deployment pipelines and understand which tests are passing and which tests are failing.