Recommender Systems

Recommender systems are a subclass of information filtering systems that seek to predict the 'rating' or 'preference' that a user would give to an item (such as music, books, or movies) or social element (e.g. people or groups) they have not yet considered.

Recommendation is a form of personalisation related to instance-based learning, since it relies on a similarity function.

Examples include the recommendations made by Amazon and eBay.

# Content-Based Recommendation

Users are recommended items that are similar to past choices. The idea comes from information retrieval and requires a profile of the content/description of items.

Notation: c = a user, s = an item.

(1)
$$u(c,s) = score(profile(c), content(s))$$

For example, using cosine similarity as the score:

(2)
\begin{align} u(c,s) = cosineSimilarity(\vec{w_c}, \vec{w_s}) = \frac{\vec{w_c}\cdot\vec{w_s}}{||\vec{w_c}||\times||\vec{w_s}||} \end{align}

$\vec{w_c}$ is a vector summarising c's past choices, and $\vec{w_s}$ is a vector of the terms describing s.
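The content-based score can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the item names, the three-term vocabulary, the term weights, and the choice of building profile(c) as the mean of past item vectors are all assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector scores zero
    return dot / (norm_a * norm_b)

def build_profile(chosen_item_vectors):
    """Assumed profile(c): the mean of the term vectors of c's past choices."""
    n = len(chosen_item_vectors)
    return [sum(column) / n for column in zip(*chosen_item_vectors)]

def rank_items(profile, items):
    """Return item names sorted by u(c, s), highest score first."""
    scores = {name: cosine_similarity(profile, vec) for name, vec in items.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical term weights over the vocabulary (space, robot, romance).
items = {
    "film_a": [1.0, 1.0, 0.0],
    "film_b": [0.0, 0.0, 1.0],
    "film_c": [1.0, 0.0, 1.0],
}
# Hypothetical term vectors of the items the user chose in the past.
past_choices = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]

profile = build_profile(past_choices)
ranking = rank_items(profile, items)
```

Here the user's history is dominated by the "space" and "robot" terms, so film_a ranks first and the purely "romance" film_b ranks last.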

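Advantages:

• Well-understood techniques from information retrieval
• Can extract latent features from text analysis (determine underlying themes)

Disadvantages:

• Can over-specialise (recommendations do not branch out beyond past choices)
• New users have no past choices, so there is no profile to match against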

# Collaborative-Based Recommendation

Users are recommended items that users with similar tastes have chosen.

The two main methods are memory-based and model-based collaborative filtering (CF).

## Memory-Based CF

(3)
\begin{align} r_{c,s} = aggregate_{c' \in C} r_{c', s} \end{align}

Where c is the target user, c' ranges over the set C of other users who have rated s, and $r_{c,s}$ is the rating of item s by user c.

A weighted sum can be used as the aggregation:

(4)
\begin{align} r_{c,s} = k\sum_{c' \in C} similarity(c, c') \times r_{c',s} \end{align}

Where k is a normalising factor and the similarity function can be correlation, cosine similarity, item-based similarity, etc.
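The weighted-sum aggregation can be sketched as follows. The rating data and user names are hypothetical. The sketch assumes cosine similarity computed over the items both users have rated, and takes k as one over the sum of the absolute similarities, which is one common choice of normalising factor rather than the only one.

```python
import math

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"i1": 5, "i2": 3, "i3": 4},
    "bob":   {"i1": 4, "i2": 2, "i3": 5, "i4": 4},
    "carol": {"i1": 1, "i2": 5, "i4": 2},
}

def similarity(c, c_other):
    """Cosine similarity over the items both users have rated."""
    shared = set(ratings[c]) & set(ratings[c_other])
    if not shared:
        return 0.0
    dot = sum(ratings[c][s] * ratings[c_other][s] for s in shared)
    norm_c = math.sqrt(sum(ratings[c][s] ** 2 for s in shared))
    norm_o = math.sqrt(sum(ratings[c_other][s] ** 2 for s in shared))
    return dot / (norm_c * norm_o)

def predict(c, s):
    """Weighted-sum estimate of r_{c,s} from the other users who rated s."""
    neighbours = [o for o in ratings if o != c and s in ratings[o]]
    weights = {o: similarity(c, o) for o in neighbours}
    total = sum(abs(w) for w in weights.values())
    if total == 0:
        return None  # no usable neighbours, so no prediction
    k = 1.0 / total  # normalising factor (assumed form)
    return k * sum(weights[o] * ratings[o][s] for o in neighbours)

prediction = predict("alice", "i4")
```

Because alice's ratings are closer to bob's than to carol's, the estimate for i4 is pulled towards bob's rating of 4 rather than carol's rating of 2.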

## Model-Based CF

Whereas memory-based CF works like a nearest-neighbour method over the stored ratings, model-based CF uses other ML methods to build a model from the database examples and then uses that model to predict the rating.
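As one illustration of learning a model from the rating database, the sketch below fits a small matrix factorisation by stochastic gradient descent. Matrix factorisation is a swapped-in example chosen here for illustration and is not a method named in these notes. The data, the number of factors, the learning rate, and the regularisation strength are all assumptions.

```python
import random

# Hypothetical observed ratings as (user index, item index, rating) triples.
observed = [
    (0, 0, 5), (0, 1, 3), (0, 2, 4),
    (1, 0, 4), (1, 1, 2), (1, 2, 5), (1, 3, 4),
    (2, 0, 1), (2, 1, 5), (2, 3, 2),
]

def predict(P, Q, c, s):
    """Predicted rating: dot product of user factors and item factors."""
    return sum(p * q for p, q in zip(P[c], Q[s]))

def mse(P, Q, data):
    """Mean squared error of the model on the given ratings."""
    return sum((r - predict(P, Q, c, s)) ** 2 for c, s, r in data) / len(data)

def train(data, n_users, n_items, n_factors=2, lr=0.02, reg=0.02,
          epochs=500, seed=0):
    """Fit user factors P and item factors Q by SGD on the squared error."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in range(n_items)]
    initial_error = mse(P, Q, data)
    for _ in range(epochs):
        for c, s, r in data:
            err = r - predict(P, Q, c, s)
            for f in range(n_factors):
                p, q = P[c][f], Q[s][f]
                # Gradient step with L2 regularisation on both factors.
                P[c][f] += lr * (err * q - reg * p)
                Q[s][f] += lr * (err * p - reg * q)
    return P, Q, initial_error

P, Q, mse_before = train(observed, n_users=3, n_items=4)
mse_after = mse(P, Q, observed)
missing = predict(P, Q, 0, 3)  # estimate for a rating absent from the data
```

Once trained, the model can estimate any user-item pair, including pairs with no stored rating, without scanning the other users at prediction time.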

Advantages of collaborative filtering:

• Works well in practice
• Doesn't require content descriptions

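Disadvantages of collaborative filtering:

• New user problem remains: a new user has not yet developed a 'taste' in the data
• New item problem: an item must be rated before it can be recommended
• Grey sheep: users whose tastes do not consistently match any group of users
• Black sheep: users whose tastes are so individual that few or no similar users exist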

## Hybrid Recommender Systems

The key idea is to combine memory-based and model-based approaches:

• "Cold-start" (new user) problem: provide a default model that can predict before there is any user activity
• "Sparsity" problem: use the model to predict the missing values

Learning these models may be difficult or expensive.
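A minimal sketch of the combination, under stated assumptions: the default model is simply the item's mean rating (falling back to the global mean), and the memory-based part uses a hypothetical similarity of 1 / (1 + mean absolute rating gap on shared items). The data and user names are invented for the example.

```python
# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"i1": 5, "i2": 3},
    "bob":   {"i1": 4, "i2": 2, "i3": 4},
    "carol": {"i1": 2, "i2": 5, "i3": 1},
}

def default_model(s):
    """Assumed default model: mean rating of item s, else the global mean."""
    item_ratings = [r[s] for r in ratings.values() if s in r]
    if item_ratings:
        return sum(item_ratings) / len(item_ratings)
    all_ratings = [v for r in ratings.values() for v in r.values()]
    return sum(all_ratings) / len(all_ratings)

def similarity(c, other):
    """Assumed similarity: 1 / (1 + mean absolute rating gap on shared items)."""
    shared = set(ratings[c]) & set(ratings[other])
    if not shared:
        return 0.0
    gap = sum(abs(ratings[c][s] - ratings[other][s]) for s in shared) / len(shared)
    return 1.0 / (1.0 + gap)

def hybrid_predict(c, s):
    """Memory-based estimate, falling back to the default model when needed."""
    if not ratings.get(c):
        return default_model(s)  # cold start: the user has no activity yet
    neighbours = [o for o in ratings if o != c and s in ratings[o]]
    weights = {o: similarity(c, o) for o in neighbours}
    total = sum(weights.values())
    if total == 0:
        return default_model(s)  # sparsity: no neighbour has rated s
    return sum(weights[o] * ratings[o][s] for o in neighbours) / total

new_user_estimate = hybrid_predict("dave", "i3")    # unseen user
known_user_estimate = hybrid_predict("alice", "i3")  # neighbour-based
unrated_item_estimate = hybrid_predict("alice", "i9")  # nobody rated i9
```

The fallback order is a design choice for this sketch; a learned model could replace `default_model` without changing the rest of the logic.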

page revision: 1, last edited: 16 Apr 2012 15:19