MAD vs RMSE vs MAE vs MSLE vs R²: When to use which?

Marcel Pinheiro 11/07/2020 BIG DATA

These metrics give you different insights into your model's errors. If $y$ is your target, $p$ your prediction, and $e = p - y$ the errors:

Mean Error: $ME = \mathrm{mean}(e)$

In (-∞,∞), the closer to 0 the better.
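Measures additive bias in the error. Unbiased estimates should have the same mean as your target, so ME should be close to 0. If it is positive, your predictions overestimate the target on average; if it is negative, they underestimate it. Because positive and negative errors cancel, an ME near 0 says nothing about how large the individual errors are.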
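
As a minimal sketch of how the sign of ME reads as bias, the snippet below computes it with the standard library on made-up numbers. The helper name `mean_error` and the data are illustrative assumptions, not part of the original answer.

```python
from statistics import mean

def mean_error(y, p):
    """Mean error: the average of e = p - y (signed, so errors can cancel)."""
    return mean(pi - yi for yi, pi in zip(y, p))

y = [10.0, 12.0, 14.0, 16.0]        # made-up targets
p_over = [11.0, 13.0, 15.0, 17.0]   # every prediction is 1 too high
p_mixed = [12.0, 10.0, 16.0, 14.0]  # errors +2, -2, +2, -2 cancel out

me_over = mean_error(y, p_over)     # 1.0: systematic overestimation
me_mixed = mean_error(y, p_mixed)   # 0.0: unbiased, yet every prediction is off by 2
```

The second case is why ME is best read together with a magnitude metric.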

Root Mean Squared Error: $RMSE = \sqrt{\mathrm{mean}(e^{2})}$.

In [0,∞), the smaller the better.
Measures the root of the mean squared magnitude of the errors. The square root is taken so that the units of the error match the units of the target. This measure gives more weight to large deviations such as outliers, since squaring makes large differences larger and small differences (smaller than 1) smaller.
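
To make the weighting concrete, here is a small sketch on made-up data: two sets of predictions with the same total absolute error, one spread evenly and one concentrated in a single large miss. The function name `rmse` and the numbers are assumptions for illustration.

```python
from math import sqrt
from statistics import mean

def rmse(y, p):
    """Root mean squared error of predictions p against targets y."""
    return sqrt(mean((pi - yi) ** 2 for yi, pi in zip(y, p)))

y = [0.0, 0.0, 0.0, 0.0]        # made-up targets
p_even = [2.0, 2.0, 2.0, 2.0]   # four errors of 2
p_spiky = [0.0, 0.0, 0.0, 8.0]  # one error of 8, same total absolute error

rmse_even = rmse(y, p_even)     # sqrt(mean([4, 4, 4, 4]))  = 2.0
rmse_spiky = rmse(y, p_spiky)   # sqrt(mean([0, 0, 0, 64])) = 4.0
```

The single large miss doubles the RMSE even though the summed absolute error is unchanged.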

Mean Absolute Error: $MAE = \mathrm{mean}(|e|)$.

In [0,∞), the smaller the better.
Measures the absolute magnitude of the errors, and its units are the same as the units of the target. It makes for more easily interpretable errors and gives less weight to outliers. However, a model with a good $MAE$ can still have occasional very large errors.
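
The last point can be checked with a short sketch: two hypothetical models with identical MAE but very different worst-case errors. The helper names `mae` and `max_abs_error` and the data are made up for this illustration.

```python
from statistics import mean

def mae(y, p):
    """Mean absolute error of predictions p against targets y."""
    return mean(abs(pi - yi) for yi, pi in zip(y, p))

def max_abs_error(y, p):
    """Largest single absolute error, to expose what the mean hides."""
    return max(abs(pi - yi) for yi, pi in zip(y, p))

y = [100.0, 100.0, 100.0, 100.0, 100.0]    # made-up targets
p_a = [101.0, 101.0, 101.0, 101.0, 101.0]  # always off by 1
p_b = [100.0, 100.0, 100.0, 100.0, 105.0]  # exact four times, off by 5 once

mae_a, mae_b = mae(y, p_a), mae(y, p_b)                        # 1.0 and 1.0
worst_a, worst_b = max_abs_error(y, p_a), max_abs_error(y, p_b)  # 1.0 and 5.0
```

Reporting the maximum error, or a high percentile of $|e|$, next to MAE is one way to catch this.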

(Root) Mean Squared Log Error: $MSLE = \mathrm{mean}((\log(p + 1) - \log(y + 1))^{2})$, and $RMSLE = \sqrt{MSLE}$ for the root version.

In [0,∞), the smaller the better.
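This is useful when dealing with right-skewed targets, since the log transform makes the target closer to normally distributed. It requires $y$ and $p$ to be greater than -1 so that the logarithms are defined. In practice it is usually achieved by training on the transformed target $\hat{y} = \log(y + 1)$ with an ordinary squared-error loss and then mapping predictions back to the original scale with $y = e^{\hat{y}} - 1$.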
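
A rough sketch of both ideas, using `math.log1p` and `math.expm1` for the $\log(y + 1)$ transform and its inverse. The function names `msle` and `rmsle` and the numbers are illustrative assumptions. The example shows that MSLE reacts to relative rather than absolute error.

```python
from math import expm1, log1p, sqrt
from statistics import mean

def msle(y, p):
    """Mean squared log error; needs y > -1 and p > -1."""
    return mean((log1p(pi) - log1p(yi)) ** 2 for yi, pi in zip(y, p))

def rmsle(y, p):
    """Root of the mean squared log error."""
    return sqrt(msle(y, p))

# Predicting double the target, at two very different scales.
small = msle([10.0], [20.0])      # about 0.42
large = msle([1000.0], [2000.0])  # about 0.48
# The squared errors are 100 and 1,000,000, but the MSLE values are close.

# Target transform and its inverse, as used when training on log(y + 1).
y = 1000.0
y_transformed = log1p(y)           # log(y + 1)
y_back = expm1(y_transformed)      # e^(y_transformed) - 1, recovers y up to rounding
```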

Median Absolute Deviation: $MAD = \mathrm{median}(|e - \mathrm{median}(e)|)$.

In [0,∞), the smaller the better.
This is a spread metric similar to the standard deviation but meant to be more robust to outliers. Instead of taking the mean of squared deviations as the standard deviation does, MAD takes the median of absolute deviations, so a few extreme errors barely move it.
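
A small sketch of that robustness, comparing MAD with the population standard deviation from `statistics.pstdev` on made-up errors. The helper name `mad` is an assumption for illustration.

```python
from statistics import median, pstdev

def mad(e):
    """Median absolute deviation of the errors around their median."""
    m = median(e)
    return median(abs(ei - m) for ei in e)

clean = [-2.0, -1.0, 0.0, 1.0, 2.0]      # made-up errors
spiked = [-2.0, -1.0, 0.0, 1.0, 100.0]   # same, with one outlier

mad_clean = mad(clean)     # 1.0
mad_spiked = mad(spiked)   # 1.0: the outlier does not move it
sd_clean = pstdev(clean)   # about 1.41
sd_spiked = pstdev(spiked) # about 40.2
```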

R², coefficient of determination: $R^{2} = 1 - \frac{\mathrm{sum}(e^{2})}{\mathrm{sum}((y - \mathrm{mean}(y))^{2})}$.

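In (−∞,1], the closer to 1 the better.
Measures the share of the target's variability that your model captures, relative to the natural variability of the target. A value of 0 is what you get by always predicting the mean of the target, and negative values mean the model does worse than that.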
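
The range can be illustrated with a short sketch based on the standard definition $R^{2} = 1 - SS_{res} / SS_{tot}$, where $SS_{res}$ is the sum of squared errors and $SS_{tot}$ the sum of squared deviations of the target from its mean. The function name `r2` and the data are made up.

```python
from statistics import mean

def r2(y, p):
    """Coefficient of determination; undefined when y is constant (SS_tot = 0)."""
    y_bar = mean(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, p))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0, 5.0]            # made-up targets

r2_perfect = r2(y, y)                          # 1.0: every prediction is exact
r2_mean = r2(y, [3.0] * 5)                     # 0.0: always predicting the mean
r2_bad = r2(y, [5.0, 4.0, 3.0, 2.0, 1.0])      # -3.0: worse than the mean
```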

In practice I usually use a combination of $ME$, $R^{2}$ and one of the following: $RMSE$ if there are no outliers in the data, $MAE$ if I have a large dataset and there may be outliers, or $RMSLE$ if the target is right skewed.

This link offers a very nice overview on the matter: http://www.cawcr.gov.au/projects/verification/#Methods_for_foreasts_of_continuous_variables

ORIGINAL: https://datascience.stackexchange.com/questions/42760/mad-vs-rmse-vs-mae-vs-msle-vs-r%C2%B2-when-to-use-which
