Diffbot: Knowledge Graph API with Mike Tung
Google Search allows humans to find and access information across the web. A human enters an unstructured query into the search box, the search engine provides several links as a result, and the human clicks on one of those links. That link brings up a web page, which is a set of unstructured data. Humans can read and understand news articles, videos, and Wikipedia pages.
Google Search solves the problem of organizing and distributing all of the unstructured data across the web, for humans to consume. Diffbot is a company with a goal of solving a related, but distinctly different problem: how to derive structure from the unstructured web, understand relationships within that structure, and allow machines to utilize those relationships through APIs.
Mike Tung is the founder of Diffbot. He joins the show to talk about the last decade that he has spent building artificial intelligence applications, from his research at Stanford to a mature, widely used product in Diffbot. I have built a few applications with Diffbot, and I encourage anyone who is a tinkerer or prototype builder to play around with it. It’s an API for accessing web pages as structured data.
Diffbot crawls the entire web, parsing websites, using NLP and NLU to comprehend those pages, and using probabilistic estimations to draw relationships between entities. It’s an ambitious product, and Mike has been working on it for a long time. I enjoyed our conversation.
- Knowledge Graph Search Widget
- Knowledge Graph, AI Web Data Extraction and Crawling | Diffbot
- Knowledge Graph | Diffbot
- Automatic APIs: automatically extract content from web pages
- Web Crawler API: Crawlbot
- Diffblog | Structured Data Blog
- Introducing the Diffbot Knowledge Graph | Diffblog
- Diffbot launches AI-powered knowledge graph of 1 trillion facts about people, places, and things | VentureBeat
- Web crawler – Wikipedia
- Natural Language Processing (NLP) Techniques for Extracting Information | Search Technologies
We recently launched a new podcast: Fintech Daily! Fintech Daily is about payments, cryptocurrencies, trading, and the intersection between finance and technology. You can find it on fintechdaily.co or Apple and Google podcasts. We are looking for other hosts who want to participate. If you are interested in becoming a host, send us an email: firstname.lastname@example.org
Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.