Diffbot Infrastructure with Mike Tung

Diffbot is a knowledge graph that allows developers to interface with the unstructured web as if it was a structured database. In today’s show, Diffbot CEO Mike Tung returns for a second discussion about how he has built Diffbot and how Diffbot is used.

The web has many different entities. Web pages, topics, people, stories, articles, companies, and much more. Humans use a search engine to find answers to their questions within web pages. Machines need to find answers to these kinds of questions as well, but a machine is not sophisticated enough to figure out answers from an unstructured web page.

Diffbot brings structure to those webpages, and gives them an API interface for developers to build on top of. In order to create this system in a cost-efficient manner, Diffbot runs its own data centers, where web scraping, machine learning, and API infrastructure are all used to build the Diffbot application.

Mike joins me for an interview about creating Diffbot, as well as his strategy for running the business.

Sponsorship inquiries: sponsor@softwareengineeringdaily.com

Check out our active projects:

  • We are hiring a head of growth. If you like Software Engineering Daily and consider yourself competent in sales, marketing, and strategy, send me an email: jeff@softwareengineeringdaily.com
  • FindCollabs is a place to build open source software.
  • The SEDaily app for iOS and Android includes all 1000 of our old episodes, as well as related links, greatest hits, and topics. Subscribe for ad-free episodes.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.


Software Weekly

Software Weekly

Subscribe to Software Weekly, a curated weekly newsletter featuring the best and newest from the software engineering community.