# Hypothesis Space Search (ID3)

(That is, an algorithm that searches the space of possible decision trees, hill-climbing on information gain.)
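The hill-climbing objective is information gain: the drop in entropy from splitting on an attribute. A minimal sketch in Python (the function names and the dict-of-examples representation are my own, not from the source):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attr, label):
    """Entropy reduction from splitting `examples` (a list of dicts) on `attr`."""
    base = entropy([e[label] for e in examples])
    remainder = 0.0
    for value in {e[attr] for e in examples}:
        subset = [e[label] for e in examples if e[attr] == value]
        # Weight each branch's entropy by the fraction of examples it receives.
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder
```

A split that separates the classes perfectly (e.g. "windy" always predicting "play") yields a gain equal to the full entropy of the labels.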

ID3 = Iterative Dichotomiser 3.

## Algorithm

- The hypothesis space is complete (it contains every finite discrete-valued function).
- So the target function is in there somewhere (every such function is represented by at least one decision tree).

- Maintains only a single current hypothesis as it searches
- No backtracking
- Hence it can get stuck at locally optimal solutions (hill climbing on information gain finds local, not necessarily global, optima)

- Makes statistically based search choices (each split is chosen using all the training examples at that node)
- Hence robust to noisy data

- Has an inductive bias
- Prefers short trees
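The properties above can be seen in a minimal, self-contained sketch of ID3 (assumed names; examples are dicts mapping attribute names to values): the search is greedy on information gain, commits to each split without backtracking, and returns a single tree.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(examples, attr, target):
    """Entropy reduction from splitting on `attr`."""
    base = entropy([e[target] for e in examples])
    remainder = 0.0
    for value in {e[attr] for e in examples}:
        subset = [e[target] for e in examples if e[attr] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder

def id3(examples, attrs, target):
    """Greedy decision-tree induction; returns a label or a nested dict."""
    labels = [e[target] for e in examples]
    if len(set(labels)) == 1:          # pure node: stop
        return labels[0]
    if not attrs:                       # no attributes left: majority vote
        return Counter(labels).most_common(1)[0][0]
    # Hill-climbing step: pick the single best attribute by information
    # gain, and never revisit the choice (no backtracking).
    best = max(attrs, key=lambda a: info_gain(examples, a, target))
    tree = {best: {}}
    for value in {e[best] for e in examples}:
        subset = [e for e in examples if e[best] == value]
        tree[best][value] = id3(subset, [a for a in attrs if a != best], target)
    return tree
```

Because each split is chosen using the statistics of all examples at that node rather than any single example, a few mislabelled examples rarely change which attribute wins.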

## Inductive Bias

ID3's hypothesis space search is biased: it prefers short trees, and trees that place high-information-gain attributes near the root.

This fits with Occam's razor: the shortest hypothesis that fits the data is likely the most general (hence the most informative, hence the most preferable).

### Occam's Razor

Benefits of picking the shortest valid hypothesis:

- There are fewer short hypotheses than long ones
- So a short hypothesis that fits the data is unlikely to do so by coincidence, whereas with so many long hypotheses, some will fit merely by chance

Disadvantages of picking the shortest valid hypothesis:

- There are many ways to define small sets of hypotheses (e.g. the set of all trees with exactly 74902384092384 nodes is also a small set). So what is special about the set of small-sized hypotheses in particular?

page revision: 2, last edited: 12 Mar 2012 11:10