Diffbot Infrastructure with Mike Tung

Diffbot is a knowledge graph that allows developers to interface with the unstructured web as if it were a structured database. In today’s show, Diffbot CEO Mike Tung returns for a second discussion about how he has built Diffbot and how Diffbot is used.

The web has many different entities: web pages, topics, people, stories, articles, companies, and much more. Humans use a search engine to find answers to their questions within web pages. Machines need to find answers to these kinds of questions as well, but a machine is not sophisticated enough to figure out answers from an unstructured web page.

Diffbot brings structure to those web pages and exposes them through an API that developers can build on. To create this system cost-efficiently, Diffbot runs its own data centers, where web scraping, machine learning, and API infrastructure all come together to power the Diffbot application.

Mike joins me for an interview about creating Diffbot, as well as his strategy for running the business.

Sponsorship inquiries: sponsor@softwareengineeringdaily.com

Check out our active projects:

  • We are hiring a head of growth. If you like Software Engineering Daily and consider yourself competent in sales, marketing, and strategy, send me an email: jeff@softwareengineeringdaily.com
  • FindCollabs is a place to build open source software.
  • The SEDaily app for iOS and Android includes all 1000 of our old episodes, as well as related links, greatest hits, and topics. Subscribe for ad-free episodes.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.


Sponsors

PagerDuty helps your company’s digital operations run more smoothly. PagerDuty helps you intelligently pinpoint issues like outages, as well as capitalize on opportunities, empowering teams to take the right, real-time action. To see how companies like GE, Vodafone, Box, and American Eagle Outfitters rely on PagerDuty to continuously improve their digital operations, visit PagerDuty.com.

Cruise is a San Francisco-based company building a fully electric self-driving car service. Cruise is a place where you can build on your existing skills while developing new skills and experiences that are pioneering the future of industry. There are opportunities for backend engineers, frontend developers, machine learning programmers, and many more positions. At Cruise you will be surrounded by talented, driven engineers, all while helping make cities safer and cleaner. Apply to work at Cruise by going to getcruise.com/careers.

Vettery is an online hiring marketplace that connects highly qualified workers with top companies. Vettery keeps the quality of workers and companies on the platform high because it vets both. Check out vettery.com/sedaily, and get a $300 sign-up bonus if you accept a job through Vettery.

MongoDB is the most popular document-based database built for modern application developers and the cloud era. Try MongoDB today with Atlas, the global cloud database service that runs on AWS, Azure, and Google Cloud. Configure, deploy, and connect to your database in just a few minutes. Check it out at mongodb.com/atlas.

Software Weekly

Subscribe to Software Weekly, a curated weekly newsletter featuring the best and newest from the software engineering community.