Overfitting and Underfitting

If our model is performing poorly, then the first thing we should check is whether the model is overfitting or underfitting.

  • Overfitting occurs when the model fits the training data too closely. In other words, it learns all the details of the specific data we have provided, including noise. The problem appears when new data is introduced: because that data does not share those particular details or noise, the model predicts it poorly.

  • Underfitting, on the other hand, occurs when the model fails to capture the underlying pattern even in the training data. A model that cannot fit the data it was trained on cannot generalize to new data either.

A standard way to diagnose either of these unwanted behaviors is to introduce a validation set: a portion of the data that is held out from training and used only to evaluate the model.
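
As a rough illustration of how a validation set exposes both problems, here is a minimal sketch using NumPy. Everything specific in it is an assumption made for the example and not part of the lesson: the synthetic sine-wave data, the noise level, the even train/validation split, and the polynomial degrees 1, 3, and 15. It fits each polynomial on the training points only and reports the mean squared error (MSE) on both sets.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): noisy samples of a sine wave.
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)

# Hold out half of the points as a validation set; the model never trains on them.
idx = rng.permutation(len(x))
train_idx, val_idx = idx[:20], idx[20:]


def mse(model, xs, ys):
    """Mean squared error of the model's predictions on (xs, ys)."""
    return float(np.mean((model(xs) - ys) ** 2))


def fit_and_score(degree):
    """Fit a polynomial on the training set; return (train MSE, validation MSE)."""
    model = Polynomial.fit(x[train_idx], y[train_idx], degree)
    train_error = mse(model, x[train_idx], y[train_idx])
    val_error = mse(model, x[val_idx], y[val_idx])
    return train_error, val_error


# Degree 1 is too simple for a sine wave, degree 15 is flexible enough
# to chase the noise in only 20 training points.
results = {degree: fit_and_score(degree) for degree in (1, 3, 15)}

for degree, (train_error, val_error) in results.items():
    print(f"degree {degree:2d}: train MSE = {train_error:.4f}, "
          f"validation MSE = {val_error:.4f}")
```

When reading the output, an error that is high on both sets suggests underfitting, while a training error that is much lower than the validation error suggests overfitting. The exact numbers depend on the random seed and on the split, so treat them as indicative only.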