Hypothesis Space Search (ML)

Hypothesis Space Search (ID3)

(That is, ID3 can be viewed as an algorithm that searches the space of possible decision trees, hill climbing with information gain as its heuristic.)

ID3 = Iterative Dichotomiser 3.

Algorithm

• Complete hypothesis space (it contains all finite discrete-valued functions)
• The target function must be in there somewhere (every such function is represented by at least one tree)
• Outputs a single hypothesis
• No backtracking
• Hence it can get stuck at locally optimal solutions (it greedily maximises information gain, so these are local optima)
• Statistically-based search choices (each split uses all the remaining training examples)
• Hence robust to noisy data
• Inductive bias
• Prefers short trees
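The greedy, no-backtracking search above can be sketched in a few lines. This is a minimal illustration, not ID3 in full (no pruning, no handling of missing values); the data format (a list of attribute-value dicts plus a parallel label list) is an assumption for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, labels, attr):
    """Reduction in entropy from splitting on `attr`."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    remainder = sum(len(ys) / len(labels) * entropy(ys)
                    for ys in groups.values())
    return entropy(labels) - remainder

def id3(rows, labels, attrs):
    """Greedy tree builder: at each node, commit to the attribute with the
    highest information gain and never revisit the choice (no backtracking)."""
    if len(set(labels)) == 1:
        return labels[0]                             # pure leaf
    if not attrs:
        return Counter(labels).most_common(1)[0][0]  # majority-vote leaf
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    tree = {"attr": best, "branches": {}}
    for value in {row[best] for row in rows}:
        sub = [(r, y) for r, y in zip(rows, labels) if r[best] == value]
        srows, slabels = zip(*sub)
        tree["branches"][value] = id3(list(srows), list(slabels),
                                      [a for a in attrs if a != best])
    return tree
```

Because the split chosen at each node is final, a poor early split cannot be undone later, which is exactly how the search can end up at a locally optimal tree.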

Inductive Bias

ID3's search is biased: it prefers shorter trees, and among those, trees that place high-information-gain attributes near the root.

This fits with Occam's razor: the shortest hypothesis that fits the data is likely to be the most general, and hence the most likely to perform well on unseen examples.

Occam's Razor

Benefits to picking shortest valid hypothesis:

• Fewer short hypotheses than long ones
• A short hypothesis that fits the data is unlikely to do so by coincidence, whereas a long one has enough flexibility to fit by chance
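The first bullet can be made concrete with a rough count (an illustrative setup, assuming boolean classification over n binary attributes): there are 2^(2^n) distinct boolean functions in total, but far fewer can be described by a depth-1 "decision stump".

```python
# Assumed setup for illustration: boolean labels over n binary attributes.
n = 3
# Every possible labelling of the 2**n input combinations is a distinct function.
total_functions = 2 ** (2 ** n)   # 2^(2^3) = 256
# A depth-1 stump: pick one attribute, then a label for each of its two
# branches (counting duplicates such as constant functions).
stump_descriptions = n * 2 * 2    # 12
print(total_functions, stump_descriptions)
```

So a hypothesis that is short (here, a stump) and still fits the data is drawn from a much smaller pool, which is why its fit is less likely to be coincidental.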

Disadvantages to picking shortest valid hypothesis:

• There are many ways to define a small set of hypotheses (for example, "all trees with exactly 74902384092384 nodes" is also a small set). So what is special about the set of short hypotheses in particular?
page revision: 2, last edited: 12 Mar 2012 11:10