Meltano: ELT for DataOps with Douwe Maan
ELT is a process for copying data from a source system into a target system. It stands for “Extract, Load, Transform”: a copy of the data is first extracted from the source, then loaded in raw form into a target system such as a data warehouse, and finally transformed there into a format that things like modern cloud applications can use.
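As a rough illustration of that ordering, here is a minimal sketch in Python that uses an in-memory SQLite database as a stand-in for the warehouse. The source records, table names, and function names are all hypothetical; this is not how Meltano itself is implemented, only the extract, load, then transform sequence in miniature.

```python
import sqlite3

# Hypothetical source records; in practice these would come from an API or database.
SOURCE_ROWS = [
    {"id": 1, "email": "Ada@Example.com", "amount_cents": 1250},
    {"id": 2, "email": "bob@example.com", "amount_cents": 300},
]


def extract():
    """Extract: copy the records out of the source unchanged."""
    return [dict(row) for row in SOURCE_ROWS]


def load(conn, rows):
    """Load: write the raw copy into the target without reshaping it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(id INTEGER, email TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:id, :email, :amount_cents)", rows
    )


def transform(conn):
    """Transform: build a cleaned table inside the target, using SQL."""
    conn.execute("DROP TABLE IF EXISTS orders")
    conn.execute(
        "CREATE TABLE orders AS "
        "SELECT id, lower(email) AS email, amount_cents / 100.0 AS amount "
        "FROM raw_orders"
    )


def run_elt():
    """Run the three steps in order and return the transformed rows."""
    conn = sqlite3.connect(":memory:")
    load(conn, extract())
    transform(conn)
    return conn.execute(
        "SELECT id, email, amount FROM orders ORDER BY id"
    ).fetchall()
```

The point of the ordering is that the raw copy lands in the target first, so the transformation can be rerun or changed later without extracting from the source again.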
Meltano provides open-source software for managing ELT pipelines that is self-hosted, CLI-first, debuggable, and extensible. A Meltano project manages your Singer tap and target configurations, making it easy to select which entities and attributes to extract. These pipelines track their own incremental replication state, so each run can pick up where the previous one left off. Once your raw data is in the target system, Meltano helps you transform it into a usable format. These pipelines can run on a schedule and be handed to supported orchestrators like Apache Airflow.
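The idea of incremental replication state can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `bookmark` key, the `updated_at` field, and the function name are hypothetical and do not reflect Meltano's or Singer's actual state format.

```python
def extract_incremental(records, state, key="updated_at"):
    """Return records newer than the saved bookmark, plus the updated state.

    `state` is a plain dict standing in for persisted pipeline state
    (hypothetical format). ISO-8601 timestamps compare correctly as strings.
    """
    bookmark = state.get("bookmark")
    new_records = [r for r in records if bookmark is None or r[key] > bookmark]
    if new_records:
        # Advance the bookmark to the newest record seen in this run.
        state = {**state, "bookmark": max(r[key] for r in new_records)}
    return new_records, state


# Hypothetical source data for two consecutive runs.
source = [
    {"id": 1, "updated_at": "2021-05-01T00:00:00Z"},
    {"id": 2, "updated_at": "2021-05-02T00:00:00Z"},
]
first_batch, state = extract_incremental(source, {})

# A new record appears before the second run.
source.append({"id": 3, "updated_at": "2021-05-03T00:00:00Z"})
second_batch, state = extract_incremental(source, state)
```

In this sketch the first run extracts both existing records, and the second run extracts only the record added afterwards, because the saved bookmark filters out everything already replicated.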
In this episode we talk to Douwe Maan, founder and CEO of Meltano, about their product-market fit and delivery plans.
Transcript provided by We Edit Podcasts.