Video Search with Rasty Turek

Searching through all of the videos on the Internet is not a simple problem.

To search through all the videos, you need a search index, and to build that index you need a web crawler. But video files are large, and storing every actual video file would cost far too much money. To build an index in a cost-efficient manner, you need a way of storing information about a video without storing the entire video itself.
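One common way to store information about a video without keeping the video is a perceptual fingerprint. The sketch below is an illustration of that general idea, not a description of how Pex works. It assumes each sampled frame has already been decoded and shrunk to an 8x8 grid of grayscale values (0 to 255), and the function names are made up for this example.

```python
def average_hash(frame):
    """Return a 64-bit integer fingerprint for one 8x8 grayscale frame.

    Each bit is 1 if the pixel is brighter than the frame's mean, else 0.
    (Assumption: the frame was decoded and downscaled elsewhere.)
    """
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def fingerprint_video(frames):
    """Fingerprint a video as a list of per-frame hashes."""
    return [average_hash(f) for f in frames]


# A frame that is dark on the left and bright on the right.
original = [[0] * 4 + [200] * 4 for _ in range(8)]
# The same frame after a uniform brightness change.
brighter = [[p + 10 for p in row] for row in original]
# A frame with the halves swapped.
flipped = [[200] * 4 + [0] * 4 for _ in range(8)]

same = hamming_distance(average_hash(original), average_hash(brighter))
different = hamming_distance(average_hash(original), average_hash(flipped))
```

Each hash is 8 bytes. If you sampled one frame per second, an hour of video would reduce to 3,600 hashes, or about 28 KB, which is the kind of saving that makes an index affordable. A production system would need something far more robust to cropping, re-encoding, and audio changes than this toy hash.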

You might be thinking, “Hasn’t Google already solved video search? Why are we even talking about this?” Google has solved some aspects of video search, but a different set of challenges is being tackled by a video search company called Pex.

To explain what Pex is building, we should first explain the set of problems they are trying to tackle.

Videos across the internet are consumed on a variety of platforms such as YouTube, Instagram, Facebook, and Vimeo. These videos are sliced up, bootlegged, and repurposed from one platform to another. For content creators who earn their living from their hosted video streams, this can be a nightmare.

Imagine you are a musician, and you make lots of money from music videos. You upload your cool new video to YouTube, and it instantly gets bootlegged by other users and shared across the internet in hundreds of different places. When people watch the stolen versions of your video, you are not getting compensated. If you could locate all of those stolen videos, you could ask for them to be taken down, or claim them so that you are paid for the views.

And here is the engineering problem: how can you find all those re-posted videos? By crawling the web and building a search index for every video on it.
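To make the idea of a video search index concrete, here is a minimal sketch of an inverted index keyed by frame fingerprints. It is an assumption-laden toy, not Pex's design: the class name, the `min_shared` threshold, and the integer fingerprints are all hypothetical, and it only matches fingerprints that are exactly equal.

```python
from collections import defaultdict


class FingerprintIndex:
    """Toy inverted index: frame fingerprint -> set of video ids."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, video_id, frame_hashes):
        """Record every frame fingerprint of a crawled video."""
        for h in frame_hashes:
            self._postings[h].add(video_id)

    def find_matches(self, frame_hashes, min_shared=2):
        """Return ids of indexed videos sharing at least `min_shared`
        distinct fingerprints with the query, in sorted order."""
        counts = defaultdict(int)
        for h in set(frame_hashes):
            for video_id in self._postings.get(h, ()):
                counts[video_id] += 1
        return sorted(v for v, n in counts.items() if n >= min_shared)


index = FingerprintIndex()
index.add("original-music-video", [1, 2, 3, 4])
index.add("unrelated-clip", [9, 10, 11])

# A re-post that reuses two frames of the original plus one new frame.
repost_matches = index.find_matches([2, 3, 99])
```

Because the lookup is by fingerprint rather than by title or uploader, a sliced-up copy can still be traced back to its source as long as enough frames survive. A real system would also have to tolerate fingerprints that are close but not identical, which this sketch does not attempt.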

Rasty Turek is the CEO of Pex, and in this episode he describes how to build a system that crawls the Internet and indexes videos. It’s a large-scale engineering challenge, and there are lots of tradeoffs to be made between financial cost, speed, accuracy, and engineering complexity.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.

Sponsors

Video is complex, and figuring out how to optimize the delivery of video is not easy–especially since there is both mobile and desktop, and mobile users might not have as much bandwidth as desktop users. Check out mux.com — after you’ve signed up, mention Software Engineering Daily for a $50 credit. And if you are an engineer who is looking for work, you can also apply for a job at mux.com.

Hired is a career marketplace that intelligently matches tech talent with the world’s most innovative companies. We combine cutting-edge technology with unbiased career coaching so both talent and employers can find the right fit, faster. We are on a mission to find everyone a job they love. Go to hired.com/sedaily, and get $600 free, if you find a job through Hired.

Stack Overflow for Teams is a private, secure home for your team’s questions and answers. Try it today, with your first 14 days free. Go to s.tk/daily.