Symbolic vs Numeric Representations
So far we've looked at symbolic (discrete) representations of data and hypotheses, but many tasks are more naturally framed as predicting numeric values.
In a symbolic representation, machine learning takes the form of a hypothesis space search - represented using a formal hypothesis language (Trees, Rules, Logic, etc).
However, for numeric representations, machine learning takes the form of a function space "search" - represented using mathematical models (Linear Equations, Neural Networks, etc).
Methods for Numeric Representation
Some methods we use for this are:
- Linear Regression (from statistics) - the process of computing an expression that predicts a numeric quantity (from data we have).
- Perceptron (from machine learning) - a biologically-inspired linear prediction method (artificial neural network).
- Multi-Layer Neural Networks - learning non-linear predictors, based on hidden nodes between the input and output
- Regression Trees - each leaf predicts a numeric quantity (the average value of training instances that reach the leaf), and each internal node can test discrete or continuous attributes.
- Model Trees - a regression tree with linear regression models at the leaf nodes. These can fit non-axis-orthogonal slopes, and apply a smoothing operation at the internal nodes to approximate continuous functions.
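To make the perceptron concrete, here is a minimal sketch of one as a plain Python function. The names `train_perceptron` and `predict` are our own for illustration; it uses the classic update rule of nudging the weights toward misclassified examples.

```python
# Minimal perceptron sketch (illustrative toy code, not from any library).
# It learns a linear decision boundary: a weighted sum of inputs plus a
# bias, thresholded at zero.

def predict(w, b, x):
    """Fire (1) if the weighted sum of inputs exceeds the threshold."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train_perceptron(data, epochs=20, lr=1.0):
    """data: list of (inputs, target) pairs with targets in {0, 1}."""
    n = len(data[0][0])
    w = [0.0] * n          # one weight per input attribute
    b = 0.0                # bias (threshold) term
    for _ in range(epochs):
        for x, t in data:
            err = t - predict(w, b, x)
            # Misclassified: shift weights toward (or away from) x
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# The AND function is linearly separable, so the perceptron converges
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(data)
```

Because the perceptron's decision boundary is a linear sum of weighted inputs, it can only learn linearly separable concepts - which is exactly the limitation the multi-layer networks above address with hidden nodes.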
Regression
Regression is the process of determining the weights of the regression equation - a linear sum of attribute values, each multiplied by its weight, that predicts a numeric quantity.
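For a single attribute, the least-squares weights have a closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the mean point. A small sketch (the function name `fit_line` is our own):

```python
# Simple (one-attribute) linear regression via the closed-form
# least-squares solution. Illustrative sketch, not a library API.

def fit_line(xs, ys):
    """Return (w0, w1) so that w0 + w1*x minimises squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by variance of x
    w1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: the fitted line passes through (mean_x, mean_y)
    w0 = mean_y - w1 * mean_x
    return w0, w1

# Data generated by y = 2x + 1, so the fit recovers those weights
w0, w1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

With more than one attribute the same idea generalises: one weight per attribute, chosen to minimise the squared error of the linear sum over the training data.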