Regression Metrics & KPIs
Measuring Error
There are multiple ways to measure the difference between a regression model's predictions and the actual data.
MAE
📘 Wikipedia: MAE (Mean Absolute Error)
Mean Absolute Error (MAE) is obtained by averaging the absolute differences between the model predictions and the true (actual) values.
MAE is a measure of the average magnitude of the error generated by the regression model, expressed in the same units as the output.
MAE is calculated by following these steps:
- Calculate the residual of every data point
- Calculate the absolute value of each residual (to get rid of the sign)
- Calculate the average of the absolute residuals
If MAE is zero, this indicates that the model predictions are perfect.
In MAE, each residual contributes in proportion to its size: doubling a residual doubles its contribution to the error.
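The MAE steps above can be sketched in plain Python. The values in `y_true` and `y_pred` are made up for illustration, and the function name is a hypothetical helper, not part of any library.

```python
# Illustrative values only; they do not come from a real dataset.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute residuals (hypothetical helper)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]  # step 1: residuals
    abs_residuals = [abs(r) for r in residuals]          # step 2: drop the sign
    return sum(abs_residuals) / len(abs_residuals)       # step 3: average


mae = mean_absolute_error(y_true, y_pred)
print(mae)  # 0.875
```

A perfect model gives an MAE of 0, for example `mean_absolute_error([1.0, 2.0], [1.0, 2.0])`.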
MSE
📘 Wikipedia: MSE (Mean Squared Error)
Mean Squared Error (MSE) is very similar to the Mean Absolute Error (MAE), but instead of absolute values it averages the squares of the differences between the model predictions and the true values.
MSE values are often larger than MAE values because the residuals are squared, although residuals smaller than 1 shrink when squared.
When the data contains outliers, MSE becomes much larger than MAE.
In MSE, the error grows quadratically with the residual, while in MAE it grows proportionally.
Because the error is squared, large prediction errors are penalized heavily.
MSE is calculated by following these steps:
- Calculate the residual for every data point
- Calculate the squared value of the residuals
- Calculate the average of the squared residuals
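As a rough sketch of the MSE steps above, and of how a single outlier affects MSE far more than MAE, the following uses made-up values. Both function names are hypothetical helpers.

```python
# Illustrative values only; they do not come from a real dataset.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
y_pred_outlier = [2.5, 5.0, 4.0, 17.0]  # last prediction is far off


def mean_squared_error(y_true, y_pred):
    """Average of the squared residuals (hypothetical helper)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]  # step 1: residuals
    squared = [r ** 2 for r in residuals]                # step 2: square them
    return sum(squared) / len(squared)                   # step 3: average


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute residuals, included for comparison."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


mse = mean_squared_error(y_true, y_pred)
mse_outlier = mean_squared_error(y_true, y_pred_outlier)
mae_outlier = mean_absolute_error(y_true, y_pred_outlier)

print(mse)          # 1.3125
print(mae_outlier)  # 3.125
print(mse_outlier)  # 26.0625
```

With the outlier, MAE rises from 0.875 to 3.125 while MSE rises from 1.3125 to 26.0625, which is the quadratic penalty in action.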
RMSE
📘 Wikipedia: Root-mean-square deviation
Root Mean Square Error (RMSE) can be read as the standard deviation of the residuals, i.e. the differences between the model predictions and the true values. The two coincide when the residuals average to zero.
RMSE is easier to interpret than MSE because its units match the units of the output.
RMSE provides an estimate of how widely the residuals are dispersed.
RMSE is calculated by following these steps:
- Calculate the residual for every data point
- Calculate the squared value of the residuals
- Calculate the average of the squared residuals
- Obtain the square root of the result
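The RMSE steps above can be sketched as follows, using only the standard library. The data values are made up and the function name is a hypothetical helper.

```python
import math

# Illustrative values only; they do not come from a real dataset.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared residual (hypothetical helper)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]  # step 1: residuals
    squared = [r ** 2 for r in residuals]                # step 2: square them
    mean_squared = sum(squared) / len(squared)           # step 3: average
    return math.sqrt(mean_squared)                       # step 4: square root


rmse = root_mean_squared_error(y_true, y_pred)
print(round(rmse, 4))  # 1.1456
```

The result is in the same units as `y_true`, unlike the MSE of 1.3125, which is in squared units.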
MAPE
📘 Wikipedia: Mean absolute percentage error
Mean Absolute Percentage Error (MAPE) expresses the error as a percentage of the true values.
MAE values can range from 0 to infinity and depend on the scale of the data, which makes them difficult to interpret relative to the true values.
MAPE is the equivalent of MAE but reports the error as a percentage, which makes it easier to compare across scales.
MAPE is undefined when a true value is zero, and becomes very large when true values are close to zero, since each residual is divided by the true value.
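A minimal sketch of MAPE, assuming the common definition that divides each absolute residual by the absolute true value and reports the average as a percentage. The data values are made up, the function name is a hypothetical helper, and raising an error on zero true values is just one possible way to handle that case.

```python
# Illustrative values only; they do not come from a real dataset.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]


def mean_absolute_percentage_error(y_true, y_pred):
    """Average absolute residual relative to the true value, in percent."""
    if any(t == 0 for t in y_true):
        # Division by zero: MAPE is undefined for this data.
        raise ValueError("MAPE is undefined when a true value is zero")
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


mape = mean_absolute_percentage_error(y_true, y_pred)
print(round(mape, 2))  # 32.74
```

Passing a `y_true` that contains 0.0 raises `ValueError` in this sketch instead of silently returning infinity.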
MPE
📘 Wikipedia: Mean percentage error
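Mean Percentage Error (MPE) is similar to MAPE but without the absolute value operation.
Because positive and negative errors can cancel out, MPE gives insight into the balance between positive and negative errors, i.e. whether the model tends to overpredict or underpredict.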
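A minimal sketch of MPE, assuming the sign convention (true - predicted) / true. Some sources use the opposite sign, so treat the direction of the result as convention dependent. The data values are made up and the function name is a hypothetical helper.

```python
# Illustrative values only; they do not come from a real dataset.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]


def mean_percentage_error(y_true, y_pred):
    """Average signed residual relative to the true value, in percent.

    Assumed convention: (true - predicted) / true, so a negative result
    means the predictions are, on average, above the true values.
    """
    ratios = [(t - p) / t for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


mpe = mean_percentage_error(y_true, y_pred)
print(round(mpe, 2))  # -24.4
```

For the same data the MAPE is about 32.74%, so part of the error cancels out once the signs are kept, and the negative MPE points to overprediction under this convention.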
Regression Metrics
R Squared
📘 Wikipedia: Coefficient of determination
R-squared, also known as the coefficient of determination, represents the proportion of the variance of y that is explained by the independent variables in the model.
For example, if R-squared = 0.80 for a model that predicts insurance cost from the applicant's age, then 80% of the variance in insurance cost is explained by age.
R-squared provides insight into the goodness of fit.
It gives a measure of how well unseen samples are likely to be predicted by the model, through the proportion of explained variance.
The maximum R-squared value is 1. It can be negative when the model fits worse than a constant prediction of the mean.
A constant model that always predicts the expected value of y, disregarding the input features, will have an R-squared score of 0.
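As a sketch, R-squared can be computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The data values are made up and the function name is a hypothetical helper.

```python
# Illustrative values only; they do not come from a real dataset.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot (hypothetical helper)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total variation
    return 1.0 - ss_res / ss_tot


r2 = r_squared(y_true, y_pred)
print(round(r2, 4))  # 0.6441

# A constant model that always predicts the mean of y scores exactly 0.
mean_y = sum(y_true) / len(y_true)
r2_constant = r_squared(y_true, [mean_y] * len(y_true))
print(r2_constant)  # 0.0
```

Note that this sketch divides by zero if every value in `y_true` is identical, since the total sum of squares is then 0.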
Adjusted R Square
📘 Wikipedia: Adjusted R Squared
One limitation of R Squared is that, for a least-squares fit, it does not decrease when independent variables are added to the model, which is misleading since some added variables might be useless, with minimal significance.
Adjusted R Squared overcomes this issue by adding a penalty for independent variables that do not improve the model.
Adjusted R Squared is a modified version of R Squared that takes into account the number of predictors in the model.
If useless predictors are added to the model, Adjusted R Squared will decrease.
If useful predictors are added to the model, Adjusted R Squared will increase.
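A minimal sketch of the penalty, assuming the commonly used formula 1 - (1 - R²)(n - 1) / (n - p - 1), where n is the number of samples and p the number of predictors. The notes above do not state a formula, so this choice, the function name, and the numbers below are illustrative assumptions.

```python
def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R-squared via the common formula (assumed here):
    1 - (1 - R^2) * (n - 1) / (n - p - 1)
    """
    if n_samples <= n_predictors + 1:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Same R-squared and sample size, different numbers of predictors
# (made-up numbers for illustration).
adj_few = adjusted_r_squared(0.80, n_samples=50, n_predictors=3)
adj_many = adjusted_r_squared(0.80, n_samples=50, n_predictors=10)

print(round(adj_few, 4))   # 0.787
print(round(adj_many, 4))  # 0.7487
```

If seven extra predictors leave R Squared at 0.80, Adjusted R Squared drops from about 0.787 to about 0.749, which is the penalty for predictors that do not improve the fit.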