## Measuring Error

There are several ways to quantify the difference between a regression model's predictions and the actual data.

### MAE

📘 Wikipedia: MAE (Mean Absolute Error)

Mean Absolute Error (**MAE**) is the average of the absolute differences between the model predictions and the true (actual) values.

**MAE** measures the average magnitude of the errors made by the regression model, ignoring their direction.

**MAE** is calculated by following these steps:

- Calculate the residual of every data point
- Take the absolute value of each residual (to get rid of the sign)
- Calculate the average of the absolute residuals
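The steps above can be sketched in plain Python. This is a minimal illustration, not a library implementation: the function name `mean_absolute_error` and the sample values are assumptions made for the example.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute residuals (illustrative helper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Step 1: residual of every data point
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    # Step 2: absolute value, to get rid of the sign
    abs_residuals = [abs(r) for r in residuals]
    # Step 3: average of the absolute residuals
    return sum(abs_residuals) / len(abs_residuals)


# Made-up sample data: residuals are 1, 0, -2, -1
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]
mae = mean_absolute_error(y_true, y_pred)
print(mae)  # 1.0
```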

If **MAE** is zero, this indicates that the model predictions are perfect.

In **MAE**, error increases in a proportional (linear) fashion: a residual twice as large contributes twice as much to the metric.

### MSE

📘 Wikipedia: MSE (Mean Squared Error)

Mean Squared Error (**MSE**) is very similar to the Mean Absolute Error (MAE), but instead of absolute values it averages the squares of the differences between the model predictions and the true values.

**MSE** values are generally larger than **MAE** values when residuals exceed 1 in magnitude, since the residuals are squared; residuals smaller than 1 shrink when squared. **MSE** is also expressed in squared units of the output, so the two values are not directly comparable.

When the data contains outliers, **MSE** grows much faster than **MAE**.

In **MSE**, error increases in a quadratic fashion, while in **MAE** it increases in a proportional fashion.

Because the error is squared, large prediction errors are penalized heavily in **MSE**.

**MSE** is calculated by following these steps:

- Calculate the residual for every data point
- Calculate the squared value of the residuals
- Calculate the average of the squared residuals
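A minimal sketch of these steps in plain Python follows. The function name `mean_squared_error` and the sample values are assumptions made for illustration; the second call shows how a single large residual dominates the result.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared residuals (illustrative helper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Step 1: residual of every data point
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    # Step 2: square each residual
    squared = [r ** 2 for r in residuals]
    # Step 3: average of the squared residuals
    return sum(squared) / len(squared)


# Made-up sample data: residuals are 1, 0, -2, -1
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]
mse = mean_squared_error(y_true, y_pred)
print(mse)  # 1.5

# Same data, but the last prediction is off by 10 (an outlier)
y_pred_outlier = [2.0, 5.0, 4.0, 17.0]
mse_outlier = mean_squared_error(y_true, y_pred_outlier)
print(mse_outlier)  # 26.25
```

For the same two sets of predictions, the average absolute residual goes from 1.0 to 3.25, while the MSE goes from 1.5 to 26.25.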

### RMSE

📘 Wikipedia: Root-mean-square deviation

Root Mean Square Error (**RMSE**) is the square root of the **MSE**. It represents the standard deviation of the residuals (the differences between the model predictions and the true values) when the residuals average to zero.

**RMSE** is easier to interpret than **MSE** because its units match the units of the output.

**RMSE** provides an estimate of how widely the residuals are dispersed.

**RMSE** is calculated by following these steps:

- Calculate the residual for every data point
- Calculate the squared value of the residuals
- Calculate the average of the squared residuals
- Obtain the square root of the result
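The four steps above might look like this in plain Python. The function name `root_mean_squared_error` and the sample values are assumptions chosen so that the arithmetic is easy to follow.

```python
import math


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared residual (illustrative helper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Steps 1 and 2: residual of every data point, then its square
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    # Step 3: average of the squared residuals (this is the MSE)
    mse = sum(squared) / len(squared)
    # Step 4: square root, which brings the units back to those of y
    return math.sqrt(mse)


# Made-up sample data: residuals are 3, -4, 0, 0
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [7.0, 24.0, 30.0, 40.0]
rmse = root_mean_squared_error(y_true, y_pred)
print(rmse)  # 2.5, since (9 + 16 + 0 + 0) / 4 = 6.25
```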

### MAPE

📘 Wikipedia: Mean absolute percentage error

Mean Absolute Percentage Error (**MAPE**) is the equivalent of **MAE** expressed as a percentage of the true values.

**MAE** values can range from 0 to infinity and are expressed in the units of the output, which makes a result hard to interpret without knowing the scale of the data.

Because **MAPE** reports the error as a percentage, it overcomes this limitation of **MAE**.

**MAPE** is undefined when a true value is zero, and becomes very large when true values are close to zero, since it divides by the true value.
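A small sketch of MAPE in plain Python, to make the division explicit. The function name `mean_absolute_percentage_error`, the choice to raise a `ValueError` on a zero true value, and the sample data are all assumptions of this example; other implementations handle zeros differently.

```python
def mean_absolute_percentage_error(y_true, y_pred):
    """Average absolute residual relative to the true value, in percent."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumption of this sketch: refuse zero true values instead of dividing by zero
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


# Made-up sample data: relative errors are 10%, 10%, 0%, 0%
y_true = [100.0, 200.0, 50.0, 400.0]
y_pred = [110.0, 180.0, 50.0, 400.0]
mape = mean_absolute_percentage_error(y_true, y_pred)
print(mape)  # about 5.0 (percent)
```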

### MPE

📘 Wikipedia: Mean percentage error

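**MPE** is similar to **MAPE** but without the absolute value operation, so positive and negative errors can cancel each other out.

**MPE** is useful for showing whether the model tends to over-predict or under-predict, that is, how positive errors compare with negative ones.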
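The sketch below illustrates how the sign survives in MPE. It assumes the convention error = (true - predicted) / true, so a negative result means the model over-predicts on average; the function name `mean_percentage_error` and the sample data are hypothetical.

```python
def mean_percentage_error(y_true, y_pred):
    """Average signed residual relative to the true value, in percent."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MPE is undefined when a true value is zero")
    # No abs() here: the sign of each error is kept
    ratios = [(t - p) / t for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


y_true = [100.0, 200.0, 50.0, 400.0]

# Errors of -10% and +10% cancel each other out
balanced = mean_percentage_error(y_true, [110.0, 180.0, 50.0, 400.0])
print(balanced)  # 0.0

# Every prediction is too high, so the result is negative
biased = mean_percentage_error(y_true, [110.0, 220.0, 55.0, 440.0])
print(biased)  # about -10.0 (percent)
```

A result near zero therefore does not mean the predictions are accurate, only that over- and under-predictions roughly balance.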

## Regression Metrics

### R Squared

📘 Wikipedia: Coefficient of determination

**R-squared**, also known as the coefficient of determination, represents the proportion of the variance of y that is explained by the independent variables in the model.

For example, in a model that predicts insurance cost from the age of the applicant, R-squared = 0.80 means that 80% of the variance in insurance cost is explained by the age of the applicant.

R-squared provides insight into goodness of fit.

It gives a measure of how well unseen samples are likely to be predicted by the model, through the proportion of explained variance.

The maximum R-squared value is 1. The value can be negative when the model fits worse than a constant model.

A constant model that always predicts the expected value of y, disregarding the input features, has an R-squared score of 0.
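As a rough illustration, R-squared can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample data are assumptions made for this sketch.

```python
def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot (illustrative helper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Variation left unexplained by the model
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total variation of y around its mean
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0, 5.0]

perfect = r_squared(y_true, [1.0, 2.0, 3.0, 4.0, 5.0])
print(perfect)  # 1.0

# Constant model that always predicts the mean of y
constant = r_squared(y_true, [3.0, 3.0, 3.0, 3.0, 3.0])
print(constant)  # 0.0

# Predictions worse than the constant model give a negative score
reversed_fit = r_squared(y_true, [5.0, 4.0, 3.0, 2.0, 1.0])
print(reversed_fit)  # -3.0
```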

### Adjusted R Squared

📘 Wikipedia: Adjusted R Squared

One limitation of R Squared is that it increases (or at least does not decrease) when independent variables are added to the model, which is misleading since some added variables might be useless, with minimal significance.

Adjusted R Squared overcomes this issue by adding a penalty when we add independent variables that do not improve the model.

Adjusted R Squared is a modified version of R Squared that takes into account the number of predictors in the model.

If useless predictors are added to the model, Adjusted R Squared will decrease.

If useful predictors are added to the model, Adjusted R Squared will increase.
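The penalty can be seen in the commonly used formula 1 - (1 - R²)(n - 1) / (n - p - 1), where n is the number of samples and p the number of predictors. The sketch below assumes that formula; the function name `adjusted_r_squared` and the R-squared values are hypothetical numbers picked for illustration, not results from a fitted model.

```python
def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R-squared from R-squared, sample count and predictor count."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


n = 21  # hypothetical number of samples

# Baseline: one predictor, R-squared of 0.80
base = adjusted_r_squared(0.80, n, 1)
print(base)  # about 0.789

# A useless second predictor barely moves R-squared (0.80 -> 0.801),
# so the penalty outweighs the gain and the adjusted value drops
useless = adjusted_r_squared(0.801, n, 2)
print(useless)  # about 0.779

# A useful second predictor lifts R-squared to 0.90,
# so the adjusted value rises despite the penalty
useful = adjusted_r_squared(0.90, n, 2)
print(useful)  # about 0.889
```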