We talk about using what we learn about User 1 to influence what we serve to User 2 in terms of search results, suggestions, recommendations, or playlists. But the truth is that almost everyone is User 2. Almost everyone is also User 1. Whoa, that’s confusing. What do we mean?
Almost everyone receives results that are not simply freshly-fallen digital snow. Most results are in some way influenced by previous users. That “influencer” (who we generally refer to as User 1) may be the same actual person as User 2 (rare, what we call “autobiographical continuity”) or (more commonly) may be a member of a cohort of users similar to User 2.
The truth is that, in most cases, someone has already searched for what you’re searching for, or performed a very similar query, or built up a similar history in a media library (preferring cat videos to dog videos, say) that indicates what you (as User 2) might prefer. Most websites featuring internal search (or Google Analytics-based search), a photo library, a video library, or a music library have enough data on previous users to serve reasonable results and reasonable suggestions.
The temporal aspect of deciding which result to serve – and then serving it – is where our technology comes in. Our technologies use the amount of time a user interacts with a piece of media (written, video, audio, or otherwise) to locate, refine, and serve search results to subsequent users. Basically, our technology uses how User 1 spends their time in your app, on your site, or in your library to help out User 2. And that makes things better for everyone because everyone – even User 1 – is eventually User 2 and gets the benefit of this.
This effect is particularly noticeable in the case of video libraries. Despite our best efforts to put traditional machine learning, black box deep learning, and automatic categorization tools to work on this problem, the best gauge of whether videos are high-quality, relevant, and interesting to humans is still… humans. It’s possible to have a turking “call center” of humans watching videos and figuring out whether or not they’re interesting, but that’s cumbersome and expensive. The key insight we found and exploited back in 2009 and 2010 was that existing users already generate that data.
By observing (and doing some cool math related to) the amount of time a person watches a video, we can see which parts of the video library are most interesting. This creates reliable, actionable data that the library’s owner can use to match future users (the “User 2s”) to content, while adding no lag, inconvenience, surveys, or “upvote” mechanisms to the experience of the first user (the “User 1”).
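To make the idea concrete, here is a minimal sketch of how watch time alone could rank a library. It is not our actual math: the `WatchEvent` record, the `engagement_scores` and `rank_videos` functions, and the smoothing parameters are all hypothetical names and values chosen for illustration. It assumes each viewing is logged as seconds watched plus the video’s length, scores each video by the average fraction watched, and pulls videos with only a few views toward a neutral score so a single enthusiastic viewer doesn’t dominate.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WatchEvent:
    """One viewing by a "User 1" (hypothetical log record)."""
    video_id: str
    seconds_watched: float
    video_length: float  # total length of the video, in seconds


def engagement_scores(events, prior_fraction=0.5, prior_weight=5.0):
    """Score each video by its smoothed average fraction watched.

    prior_fraction and prior_weight are illustrative assumptions: they act
    like prior_weight imaginary viewings that each watched prior_fraction
    of the video, which dampens scores based on very few real viewings.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event in events:
        if event.video_length <= 0:
            continue  # skip malformed records
        fraction = event.seconds_watched / event.video_length
        fraction = min(max(fraction, 0.0), 1.0)  # clamp to [0, 1]
        totals[event.video_id] += fraction
        counts[event.video_id] += 1
    return {
        video_id: (totals[video_id] + prior_fraction * prior_weight)
        / (counts[video_id] + prior_weight)
        for video_id in counts
    }


def rank_videos(events):
    """Return video ids ordered from most to least engaging."""
    scores = engagement_scores(events)
    return sorted(scores, key=lambda video_id: scores[video_id], reverse=True)


if __name__ == "__main__":
    log = [
        WatchEvent("cats", 90, 100),
        WatchEvent("cats", 80, 100),
        WatchEvent("dogs", 10, 100),
    ]
    print(rank_videos(log))
```

Notice that nothing here asks User 1 to do anything extra: the only input is the viewing log the library already produces, and the ranking is what gets served to User 2.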
So what data are your current users generating that you’re ignoring? Let’s talk!