Bias In ML

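Every concept learning algorithm has a bias. The bias is what lets the learner find the target function reasonably accurately and efficiently from limited data. A learner with no bias at all has no rational basis for classifying instances it has not seen, so bias-free learning is impossible in any useful sense.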

Inductive Bias

By 'bias' we mean "inductive bias": the set of assertions that the learner uses to predict outputs for inputs that it has not encountered. One decent explanation describes it as "the assumptions that must be added to the observed data to transform the algorithm's outputs into logical deductions".


So the inductive bias is a minimal set of assumptions such that, for any target concept c and its training examples:

For every instance in X, the assertions, the training examples and the instance itself together logically entail the classification that the learner gives to that instance.

\begin{align} \forall x_i \in X [(B \wedge D_c \wedge x_i) \vdash L(x_i, D_c)] \end{align}

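(Where B is the set of assertions, L is the concept learning algorithm, X is the set of instances, D_c is the set of training examples for c, and L(x_i, D_c) is the classification that L assigns to x_i after training on D_c. The symbol \vdash means "deductively entails".)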
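
As a concrete instance of the definition (a standard textbook example rather than something stated in these notes), take a version-space learner that only classifies an instance when every hypothesis consistent with the training data agrees on it. Writing H for its hypothesis space (a symbol introduced here for illustration), its inductive bias reduces to the single assertion that the target concept is in H:

\begin{align} B = \{ c \in H \} \end{align}

Given that assumption, every classification the learner commits to follows deductively from the training examples.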

An Unbiased Learner

Consider a target concept where an instance is positive if it matches either of:
{Sunny, Warm, Normal, ?, ?, ?} or {?, ?, ?, ?, ?, Change}

Disjunctions are awkward to handle, so the hypothesis space only allows conjunctions of attribute constraints. Such a space cannot represent this concept, so the learner cannot learn it.
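
A minimal sketch of why the conjunctive representation fails on this case. The attribute order (Sky, AirTemp, Humidity, Wind, Water, Forecast), the concrete attribute values and the helper names `matches` and `generalise` are assumptions made for illustration; the notes only give the two six-slot patterns.

```python
# Assumed attribute order: (Sky, AirTemp, Humidity, Wind, Water, Forecast).

def matches(hypothesis, instance):
    """A conjunctive hypothesis covers an instance when every non-'?' slot agrees."""
    return all(h == "?" or h == v for h, v in zip(hypothesis, instance))

def generalise(hypothesis, instance):
    """Minimally generalise a conjunctive hypothesis so that it covers the instance."""
    return tuple(h if h == v else "?" for h, v in zip(hypothesis, instance))

# The two patterns from the text; the target concept is their disjunction.
h1 = ("Sunny", "Warm", "Normal", "?", "?", "?")
h2 = ("?", "?", "?", "?", "?", "Change")

def target(instance):
    return matches(h1, instance) or matches(h2, instance)

positives = [
    ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),  # matches h1 only
    ("Rainy", "Cold", "High", "Strong", "Cool", "Change"),  # matches h2 only
]
negative = ("Rainy", "Cold", "High", "Strong", "Warm", "Same")  # matches neither

# Most specific conjunction covering both positives: start from the first
# positive example and generalise just enough to cover the second.
h = positives[0]
for p in positives[1:]:
    h = generalise(h, p)

print(h)                     # the only surviving constraint is on Wind
print(target(negative))      # False: the target concept rejects this instance
print(matches(h, negative))  # True: the conjunction wrongly accepts it
```

The most specific conjunction that covers both positive examples is already too general: it also covers an instance that the target concept rejects, so no conjunctive hypothesis is consistent with this data.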

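If we did allow disjunctions (and negations), the hypothesis space could represent any boolean combination of instances, but the learner would never be able to generalise. The hypotheses consistent with the training data would agree only on the examples already seen; for any unseen instance, exactly half of them would classify it as positive and half as negative.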
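
The failure to generalise can be checked directly on a toy problem. The sketch below assumes a tiny instance space of three boolean attributes and a hypothetical set of three training examples; every subset of the instance space counts as a hypothesis, which is what allowing arbitrary boolean combinations amounts to.

```python
from itertools import combinations, product

# Toy instance space: 3 boolean attributes, so 8 instances.
instances = list(product([0, 1], repeat=3))

def powerset(items):
    """Yield every subset of items as a frozenset."""
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

# Unrestricted hypothesis space: each hypothesis is the set of instances
# it labels positive, so there are 2**8 = 256 hypotheses.
hypotheses = list(powerset(instances))

# Hypothetical training data: instance -> label.
training = {(0, 0, 0): True, (1, 1, 1): True, (0, 1, 0): False}

# Version space: all hypotheses consistent with every training example.
version_space = [
    h for h in hypotheses
    if all((x in h) == label for x, label in training.items())
]

# For each unseen instance, count the consistent hypotheses that call it positive.
unseen = [x for x in instances if x not in training]
votes = {x: sum(x in h for h in version_space) for x in unseen}

for x, positive_votes in votes.items():
    print(x, positive_votes, "of", len(version_space))
```

Each unseen instance is labelled positive by exactly half of the consistent hypotheses, so the training data gives the learner no grounds to prefer either label.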

Inductive Bias vs Deductive Bias

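An inductive learner leaves its bias implicit in the algorithm. The equivalent deductive system is explicitly supplied with the bias as a set of assertions, alongside the training examples and the new instance, and derives the classification by logical deduction.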

Bias in Hypothesis Space Search

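A preference bias (or search bias) is a preference for some hypotheses over others, rather than a restriction on which hypotheses can be considered. A restriction bias (or language bias) limits the hypothesis space itself.

So a preference bias gives an incomplete search of a complete (unrestricted) hypothesis space, whereas a restriction bias gives a complete search (if the answer is in the space, it will be found) of an incomplete hypothesis space, as a learner that disallows disjunctions does. Neither kind of learner is unbiased.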
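
The two kinds of search can be contrasted on a toy problem. This is only a sketch under stated assumptions: two boolean attributes, hypothetical XOR-style training data, and an invented preference ordering (hypotheses that label fewer instances positive are tried first). It is not any particular published algorithm.

```python
from itertools import combinations, product

instances = list(product([0, 1], repeat=2))  # 4 instances

# Hypothetical training data that no single conjunction can express.
training = {(0, 1): True, (1, 0): True, (0, 0): False}

def consistent(predict):
    """True when predict agrees with every training label."""
    return all(predict(x) == label for x, label in training.items())

def covers(h, x):
    """Conjunctive hypothesis h covers x when every non-'?' slot agrees."""
    return all(a == "?" or a == b for a, b in zip(h, x))

# Restriction bias: an incomplete space (conjunctions only), searched completely.
conjunctions = list(product([0, 1, "?"], repeat=2))
restricted = [h for h in conjunctions if consistent(lambda x, h=h: covers(h, x))]

# Preference bias: a complete space (every subset of instances), searched in
# preference order and stopped at the first consistent hypothesis.
def preferred():
    for size in range(len(instances) + 1):
        for subset in combinations(instances, size):
            positives = set(subset)
            if consistent(lambda x, s=positives: x in s):
                return positives
    return None

print(restricted)   # no conjunction fits the data
print(preferred())  # first consistent subset in the preference order
```

The restricted learner examines every hypothesis it can express and finds none that fits. The preference learner can express a fitting hypothesis, but it commits to the first one it reaches and never examines the other consistent hypothesis, which would label the unseen instance (1, 1) positive.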