Here are some review notes on Machine Learning.
There are several ways to measure the performance of a machine learning model. One is to use a loss function $\ell(f(x), y)$ to measure how close a prediction $f(x)$ is to the ground truth $y$ on a single example. The second does not look only at the prediction for one test point, but at all possible inputs drawn from the underlying data distribution $P_{XY}$: the expected loss over this distribution is called the risk, $R(f) = \mathbb{E}_{(X,Y)\sim P_{XY}}[\ell(f(X), Y)]$.
For a regression problem, if the loss function is the squared loss $\ell(f(x), y) = (f(x) - y)^2$, then the risk is the mean squared error (MSE).
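As a quick illustration, here is a minimal Python sketch (the function names and numbers are my own, chosen for illustration) that computes the squared loss on a single example and the empirical MSE over a small set of predictions:

```python
import numpy as np

def squared_loss(prediction, target):
    # squared loss on a single example
    return (prediction - target) ** 2

def empirical_mse(predictions, targets):
    # average squared loss over a set of examples
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return np.mean((predictions - targets) ** 2)

preds = [2.5, 0.0, 2.1]
truth = [3.0, -0.5, 2.0]
print(squared_loss(preds[0], truth[0]))  # loss on one example: 0.25
print(empirical_mse(preds, truth))       # average over the set: 0.17
```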
Bayes optimal rule
Ideal goal: to construct the prediction rule that minimizes the risk, $f^* = \arg\min_{f} R(f)$.
The best possible performance, the Bayes risk, is $R^* = R(f^*) = \inf_{f} R(f)$.
This optimal rule is not computable in practice because it depends on the unknown data distribution $P_{XY}$.
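For regression with squared loss, a standard decomposition shows what the Bayes optimal rule looks like. Conditioning on $X = x$ and minimizing over a candidate prediction $c$,

$$\mathbb{E}\big[(Y - c)^2 \mid X = x\big] = \mathbb{E}\big[(Y - \mathbb{E}[Y \mid X = x])^2 \mid X = x\big] + \big(\mathbb{E}[Y \mid X = x] - c\big)^2,$$

which is minimized at $c = \mathbb{E}[Y \mid X = x]$. Hence $f^*(x) = \mathbb{E}[Y \mid X = x]$, and the Bayes risk is the expected conditional variance $\mathbb{E}[\mathrm{Var}(Y \mid X)]$.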
Training process
When we talk about the performance of a learning algorithm, we are talking about how well the algorithm does on average for a test example drawn at random from $P_{XY}$, which is the risk $R(f)$. In practice, however, we only observe a set of training examples and labels $\{(x_i, y_i)\}_{i=1}^{n}$ drawn from that distribution.
The ideal goal of a learning problem is the Bayes optimal rule. The practical goal is: given the training set $\{(x_i, y_i)\}_{i=1}^{n}$, find a rule $\hat{f}$ that minimizes the empirical risk $\hat{R}_n(f) = \frac{1}{n}\sum_{i=1}^{n} \ell(f(x_i), y_i)$.
This minimizer $\hat{f} = \arg\min_{f} \hat{R}_n(f)$ is called the empirical risk minimizer (ERM). By the law of large numbers, for any fixed rule $f$ the empirical risk converges to the true risk: $\hat{R}_n(f) \to R(f)$ as $n \to \infty$.
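The convergence can be seen numerically. Below is a minimal Python sketch under assumptions I am adding for illustration: data generated as $Y = 2X + \varepsilon$ with noise standard deviation 0.5, and a fixed rule $f(x) = 2x$, so the true risk under squared loss is the noise variance 0.25. The empirical risk on $n$ samples should approach 0.25 as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # a fixed prediction rule, chosen by hand for illustration
    return 2.0 * x

def empirical_risk(n):
    # draw n fresh (x, y) pairs and average the squared loss of f
    x = rng.uniform(0.0, 1.0, size=n)
    y = 2.0 * x + rng.normal(0.0, 0.5, size=n)  # noise std 0.5 -> variance 0.25
    return np.mean((f(x) - y) ** 2)

for n in (100, 10_000, 1_000_000):
    print(n, empirical_risk(n))  # approaches the true risk 0.25 as n grows
```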