2018-09-25

## Supervised Learning Framework (JWHT 2, HTF 2)

Training set: a set of labeled examples of the form

$$\langle x_1,\,x_2,\,\dots,\,x_p,\,y\rangle,$$

where the $$x_j$$ are feature values and $$y$$ is the output.

• Task: Given a new $$x_1,\,x_2,\,\dots,\,x_p$$, predict $$y$$

What to learn: A function $$h:\mathcal{X}_1 \times \mathcal{X}_2 \times \cdots \times \mathcal{X}_p \rightarrow \mathcal{Y}$$, which maps the features into the output domain

• Goal: Make accurate future predictions (on unseen data)
• In Reintroduction to Statistics, we saw how this goal is formalized in terms of generalization error

## Types of Supervised Learning

• Problems are categorized by the type of output domain

• If $${{\cal Y}}=\mathbb{R}$$,
the problem is called regression

• If $${{\cal Y}}$$ is a finite discrete set,
the problem is called classification

• If $${{\cal Y}}$$ has 2 elements,
the problem is called binary classification
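The categorization above depends only on the output domain, which the following minimal sketch makes explicit. The function name `problem_type` and the convention of writing $${{\cal Y}}=\mathbb{R}$$ as the string `"real"` are assumptions made for illustration, not part of the course material.

```python
def problem_type(output_domain):
    """Name the supervised learning problem from its output domain Y.

    Assumed convention (hypothetical): the string "real" stands for Y = R,
    and a Python set stands for a finite discrete Y.
    """
    if output_domain == "real":
        return "regression"
    if isinstance(output_domain, (set, frozenset)):
        # Binary classification is the special case of a two-element Y.
        if len(output_domain) == 2:
            return "binary classification"
        return "classification"
    raise ValueError("unsupported output domain")


house_price = problem_type("real")
spam_filter = problem_type({"spam", "ham"})
digit_reader = problem_type(set(range(10)))
```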

## Supervised learning problem

• Given a data set of $$n$$ labeled examples, $$D \in ({{\cal X}}\times {{\cal Y}})^n$$, find a function $$h : {{\cal X}}\rightarrow {{\cal Y}}$$ such that $$h({\bf x})$$ is a “good predictor” for the value of $$y$$.

• $$h$$ is called a predictive model or hypothesis

• Assumption: Dataset $$D$$ is drawn from the same distribution that we will use to evaluate generalization error
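One way to make “good predictor” concrete is to measure average squared error on fresh pairs drawn from the same distribution as $$D$$. The sketch below assumes a made-up distribution ($$y = 2x$$ plus Gaussian noise) and two hypothetical candidate hypotheses, `h_good` and `h_bad`. None of these come from the course material, and squared error is only one possible choice of error measure.

```python
import random

random.seed(0)  # fixed seed so repeated runs draw the same samples


def sample(n):
    """Draw n pairs (x, y) from one fixed, made-up distribution."""
    data = []
    for _ in range(n):
        x = random.uniform(0.0, 1.0)
        y = 2.0 * x + random.gauss(0.0, 0.1)
        data.append((x, y))
    return data


def mean_squared_error(h, data):
    """Average squared difference between h(x) and y over the data."""
    return sum((h(x) - y) ** 2 for x, y in data) / len(data)


def h_good(x):
    return 2.0 * x  # hypothetical candidate close to the true relationship


def h_bad(x):
    return 0.0  # hypothetical candidate that ignores the input


D_train = sample(50)    # plays the role of the data set D
D_test = sample(1000)   # unseen pairs from the same distribution

# Error on unseen data is an estimate of generalization error.
good_error = mean_squared_error(h_good, D_test)
bad_error = mean_squared_error(h_bad, D_test)
```

If `D_test` came from a different distribution than `D_train`, the error measured on it would no longer tell us how a model chosen using `D_train` is expected to behave, which is why the assumption matters.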

## Solving a supervised learning problem: optimisation-based approach

1. Decide what the input-output pairs are.

2. Decide how to encode inputs and outputs.

This defines the input space $${{\cal X}}$$, and the output space $${{\cal Y}}$$.

3. Choose a class of models/hypotheses $${{\cal H}}$$.

4. Choose an error function (cost function) that defines the best model in the class according to the training data.

5. Choose an algorithm for searching through the space of models to find the best one.

This approach is taken by many techniques, from Ordinary Least Squares (OLS) regression to Deep Learning.
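The five steps can be sketched for the simplest case, OLS with a single real-valued feature. The toy data, the function names, and the choice of mean squared error are assumptions made for illustration. The closed-form solution used in step 5 is the standard one for simple linear regression.

```python
# Steps 1-2: input-output pairs, encoded as real numbers (X = R, Y = R).
train = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


# Step 3: hypothesis class H = { h(x) = w0 + w1 * x }.
def make_hypothesis(w0, w1):
    return lambda x: w0 + w1 * x


# Step 4: error function, here mean squared error on the training data.
def mse(h, data):
    return sum((h(x) - y) ** 2 for x, y in data) / len(data)


# Step 5: search algorithm. For this class and error function the
# minimiser is available in closed form, so no iterative search is needed.
def fit_ols(data):
    n = len(data)
    x_bar = sum(x for x, _ in data) / n
    y_bar = sum(y for _, y in data) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in data)
    sxx = sum((x - x_bar) ** 2 for x, _ in data)
    w1 = sxy / sxx            # slope
    w0 = y_bar - w1 * x_bar   # intercept
    return w0, w1


w0, w1 = fit_ols(train)
h = make_hypothesis(w0, w1)
training_error = mse(h, train)
```

For richer hypothesis classes, such as deep networks, steps 3 and 4 keep the same shape, but step 5 typically becomes an iterative search because no closed form is available.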