The Million Dollar Programming Prize
…
A second area of collaborative-filtering research we pursued involves what are known as latent-factor models. These score both a given movie and a given viewer according to a set of factors, themselves inferred from patterns in the ratings given to all the movies by all the viewers [see illustration, “The Latent-Factor Approach”]. Factors for movies may measure comedy versus drama, action versus romance, and orientation to children versus orientation to adults. Because the factors are determined automatically by algorithms, they may correspond to hard-to-describe concepts such as quirkiness, or they may not be interpretable by humans at all.
…
The model may use 20 to 40 such factors to locate each movie and viewer in a multidimensional space. It then predicts a viewer’s rating of a movie according to the movie’s score on the dimensions that person cares about most. We can put these judgments in quantitative terms by taking the dot (or scalar) product of the locations of the viewer and the movie.
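To make the dot-product step concrete, here is a minimal Python sketch. The factor values and the names `movie_factors`, `viewer_factors`, and `predict_affinity` are hypothetical, invented for illustration, and only three factors are used instead of the 20 to 40 the article mentions. A production system would probably add further terms on top of the raw dot product, which the excerpt does not describe.

```python
# Hypothetical factor scores; the numbers are made up for illustration.
# Each position is one latent factor, for example comedy versus drama.
movie_factors = [0.9, -0.4, 0.2]
viewer_factors = [1.5, 0.5, -1.0]


def predict_affinity(viewer, movie):
    """Return the dot product of a viewer's and a movie's factor vectors."""
    if len(viewer) != len(movie):
        raise ValueError("viewer and movie must have the same number of factors")
    # Multiply matching factors and add up the results.
    return sum(v * m for v, m in zip(viewer, movie))


score = predict_affinity(viewer_factors, movie_factors)
print(score)  # roughly 0.95
```

A large positive contribution comes from any factor where the viewer's preference and the movie's score are both strong and share a sign, which is how the model weights the dimensions a person cares about most.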
…
We found that most nearest-neighbor techniques work best on 50 or fewer neighbors, which means these methods can’t exploit all the information a viewer’s ratings may contain. Latent-factor models have the opposite weakness: They are bad at detecting strong associations among a few closely related films, such as The Lord of the Rings trilogy (2001–2003).
Because these two methods are complementary, we combined them, using many versions of each in what machine-learning experts call an ensemble approach. This allowed us to build systems that were simple and therefore easy to code and fast to run.
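One simple way to combine complementary predictors is a weighted average of their outputs. The Python sketch below illustrates that general idea only. The function `blend`, the equal weights, and the two sample predictions are assumptions for illustration; the excerpt does not say how the team actually combined its models.

```python
def blend(predictions, weights):
    """Weighted average of several models' predicted ratings for one viewer-movie pair."""
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(p * w for p, w in zip(predictions, weights)) / total


# Hypothetical outputs from two models for the same viewer and movie.
neighbor_prediction = 4.0  # from a nearest-neighbor model
factor_prediction = 3.0    # from a latent-factor model

blended = blend([neighbor_prediction, factor_prediction], [0.5, 0.5])
print(blended)  # 3.5
```

In practice the weights would be chosen from held-out ratings, not fixed by hand, so that each model counts for more where it tends to be more accurate.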
Interesting article. See some other posts on challenge prizes.



