Underfitting occurs when a machine learning model is too simple to capture the underlying patterns in the data, so it performs poorly on both the training data and unseen data. An underfitted model has high bias and low variance: its predictions are systematically off, yet they change little when the training set changes.
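The link between bias, variance, and error can be made concrete with the standard bias-variance decomposition of expected squared error. The notation here is an assumption for illustration and does not come from the text above: f is the true function, \hat{f} is the learned model (random because the training set is random), and \sigma^2 is the variance of the irreducible noise in y.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \sigma^2
```

In an underfitted model the first term dominates: the variance term is small, but the squared bias keeps the total error high.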
Causes of Underfitting:
- Model Simplicity: Using a model that is too simple for the data, such as linear regression for non-linear data, can lead to underfitting.
- Insufficient Features: Not including enough relevant features in the model can prevent it from capturing the necessary patterns.
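- Inadequate Training: Stopping training too early, before the model has converged, leaves it unable to fit even the training data. Too little or unrepresentative training data can also keep the model from learning the pattern, although a small dataset more often leads to overfitting than to underfitting.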
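The first cause above can be illustrated with a minimal sketch that fits a straight line to data following a quadratic pattern. The toy dataset and the helper names `fit_line` and `mean_squared_error` are assumptions made for this example, not part of any particular library.

```python
# Toy data with a purely quadratic relationship (an assumption for illustration).
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_error(ys, predictions):
    """Average squared difference between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


intercept, slope = fit_line(xs, ys)
predictions = [intercept + slope * x for x in xs]
train_mse = mean_squared_error(ys, predictions)

print(f"slope={slope:.3f}, intercept={intercept:.3f}, training MSE={train_mse:.3f}")
```

On this symmetric dataset the best straight line is flat at the mean of y, so the error stays high even on the data the model was trained on, which is the signature of underfitting.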
Consequences of Underfitting:
- Poor Performance: The model will have high error rates on both training and test datasets.
- Inability to Generalize: Because the model has not learned the underlying pattern even on the training data, its predictions on unseen data are unreliable.
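These consequences suggest a simple diagnostic: compare the training error with the test error. The sketch below is a rough heuristic, not a standard procedure. The function name `diagnose_fit`, the `target_error` and `gap_tolerance` thresholds, and the error values passed in are all hypothetical and would need to be chosen for the problem at hand.

```python
def diagnose_fit(train_error, test_error, target_error, gap_tolerance=0.1):
    """Classify a model's fit from its errors.

    Rough heuristic: every threshold here is an illustrative assumption.
    """
    if train_error > target_error:
        # The model cannot even fit the data it was trained on.
        return "underfitting"
    if test_error - train_error > gap_tolerance:
        # Low training error but a much higher test error.
        return "overfitting"
    return "reasonable fit"


# Made-up error values: high error on both sets points to underfitting.
print(diagnose_fit(train_error=0.35, test_error=0.37, target_error=0.10))
```

The key signal is the training error itself: when it is already above what the task requires, a small gap between training and test error is not reassuring.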
Addressing Underfitting:
- Increase Model Complexity: Use more complex models that can capture the data’s patterns, such as polynomial regression or decision trees.
- Add Relevant Features: Incorporate additional relevant features that can provide more information to the model.
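- Improve Training: Train for longer (more epochs or iterations) so the model can converge. More training data helps when the existing data is too sparse or unrepresentative to reveal the pattern, but it will not fix a model that is too simple.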
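The first remedy above can be sketched by fitting polynomials of increasing degree to the same data and comparing training error. This is a minimal pure-Python illustration under stated assumptions: the toy quadratic dataset and the helpers `fit_polynomial` and `training_mse` are invented for the example, and the fit uses the normal equations, which is adequate for tiny problems but not numerically robust for real ones.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (X^T X) w = X^T y."""
    n = degree + 1
    # Augmented matrix [X^T X | X^T y] for the monomial basis 1, x, x^2, ...
    A = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    # Coefficients ordered from the constant term upward.
    return [A[i][n] / A[i][i] for i in range(n)]


def training_mse(xs, ys, coeffs):
    """Mean squared error of the polynomial on the data it was fitted to."""
    preds = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Toy data with a quadratic relationship (an assumption for illustration).
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

mse_linear = training_mse(xs, ys, fit_polynomial(xs, ys, degree=1))
mse_quadratic = training_mse(xs, ys, fit_polynomial(xs, ys, degree=2))

print(f"degree 1 training MSE: {mse_linear:.3f}")
print(f"degree 2 training MSE: {mse_quadratic:.3f}")
```

Raising the degree from 1 to 2 is equivalent to adding x squared as a feature, so the same sketch also illustrates the second remedy. On noisy real data the degree should be chosen with a held-out validation set, since pushing complexity too far trades underfitting for overfitting.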