Similarity Search with Jeff Johnson

Querying a search index for objects similar to a given object is a common problem. A user who has just read a great news article might want to read articles similar to it. A user who has just taken a picture of a dog might want to search for dog photos similar to it. In both of these cases, the query object is turned into a vector and compared to the vectors representing the objects in the search index.
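
To make the vector comparison concrete, here is a minimal sketch of a brute-force similarity search. Everything in it is an assumption for illustration: the three-dimensional vectors, the item names, the helper functions `cosine_similarity` and `most_similar`, and the choice of cosine similarity as the comparison. It is not taken from the episode. A production system would use learned embeddings with far more dimensions and an index that avoids scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, index, k=2):
    """Score every indexed vector against the query and return the top-k names."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Hypothetical embeddings: the names and values are made up for this sketch.
INDEX = {
    "dog_photo_1": [0.9, 0.1, 0.0],
    "dog_photo_2": [0.8, 0.2, 0.1],
    "news_article": [0.0, 0.1, 0.9],
}

query = [1.0, 0.0, 0.0]  # stand-in for the vector of a freshly taken dog photo
print(most_similar(query, INDEX, k=2))
```

The scan compares the query against every stored vector, so its cost grows linearly with the size of the index. That is fine for three items and impractical for billions.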

Facebook contains a lot of news articles and a lot of dog pictures. How do you index and query all that information efficiently? Much of that data is unlabeled. How can you use deep learning to classify entities and add more richness to the vectors?

Jeff Johnson is a research engineer at Facebook. He joins the show to discuss how similarity search works at scale, including how to represent that data and the tradeoffs of this kind of search engine across speed, memory usage, and accuracy.
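
One way to see the memory-versus-accuracy side of that tradeoff is to compress each vector component into a single byte. The sketch below is an assumption for illustration, not necessarily a technique covered in the episode: it uses simple 8-bit scalar quantization, hypothetical helpers `quantize` and `dequantize`, and made-up values in the range 0 to 1.

```python
from array import array


def quantize(vec, lo=0.0, hi=1.0):
    """Map each float in [lo, hi] to one of 256 levels stored in a single byte."""
    scale = 255.0 / (hi - lo)
    return array("B", (round((min(max(x, lo), hi) - lo) * scale) for x in vec))


def dequantize(codes, lo=0.0, hi=1.0):
    """Recover approximate floats from the one-byte codes."""
    step = (hi - lo) / 255.0
    return [lo + c * step for c in codes]


# Made-up vector stored as 32-bit floats.
original = array("f", [0.12, 0.5, 0.87, 0.33])
codes = quantize(original)
restored = dequantize(codes)

bytes_before = original.itemsize * len(original)  # 32-bit floats
bytes_after = codes.itemsize * len(codes)         # one byte per component
max_error = max(abs(a - b) for a, b in zip(original, restored))

print(bytes_before, bytes_after, max_error)
```

The compressed vector takes a quarter of the space, and each restored component can be off by up to half a quantization step. Smaller vectors are also cheaper to scan, which is how the same choice touches speed.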

Notes: Jeff’s blog post about similarity search

Sponsors


Spring Framework gives developers an environment for building cloud-native projects. On December 4th-7th, SpringOne Platform is coming to San Francisco. SpringOne Platform is a conference where developers congregate to explore the latest technologies in the Spring ecosystem and beyond. Speakers at SpringOne Platform include Eric Brewer (who formulated the CAP theorem), Vaughn Vernon (who writes extensively about Domain-Driven Design), and many thought leaders in the Spring ecosystem. SpringOne Platform is the premier conference for those who build, deploy, and run cloud-native software. Software Engineering Daily listeners can sign up with the discount code SEDaily100 and receive $100 off a SpringOne Platform conference pass. I will also be at SpringOne reporting on developments in the cloud-native ecosystem. Join me December 4th-7th at the SpringOne Platform conference, and use discount code SEDaily100 for $100 off your conference pass.


GrammaTech CodeSonar helps development teams improve code quality with static analysis. It helps flag issues early in the development process, allowing developers to release better code faster. CodeSonar can easily be integrated into any development process. CodeSonar performs advanced static analysis of C, C++, Java, and even raw binary code. CodeSonar performs unique dataflow and symbolic execution analysis to aggressively scan for problems in your code. Just like battleships use sonar to detect objects deep underwater, engineers use CodeSonar to detect subtle problems deep within their code. Go to go.grammatech.com/sedaily to get your free 30-day trial, exclusively for Software Engineering Daily listeners, and unleash the power of advanced static analysis.


Bugsnag makes troubleshooting errors more enjoyable and less time-consuming. For example, when an error occurs, your team can get notified via Slack, see diagnostic information on the error, and identify the developer who committed the code. Bugsnag’s integration with Jira and other collaboration tools makes it easy to assign and track bugs as they are being fixed, so development teams can iterate faster and improve software quality. Airbnb, Lyft, and Shopify all use Bugsnag to monitor application errors. There is a special offer for Software Engineering Daily listeners: try all features free for 60 days at https://www.bugsnag.com/sedaily and get up and running in three minutes.


Flip the traditional job search and let Indeed Prime work for you while you’re busy with other engineering work, or coding your side project. Upload your resume and in one click, gain immediate exposure to companies like Facebook, Uber, and Dropbox. Interested employers will reach out to you within one week with salary, position, and equity up front. Don’t let applying for jobs become a full-time job. With Indeed Prime, jobs come to you. The average software developer gets 5 employer contacts and an average salary offer of $125,000. Indeed Prime is 100% free for candidates – no strings attached. Sign up now at indeed.com/sedaily.

  • Paul Nogas

    Can you post the link to the blog post mentioned in this podcast?

    • Jeff Meyerson

      Done! Sorry about that