
Model pruning is a technique in deep learning that removes less significant parameters—such as weights, neurons, or filters—from a trained neural network. The goal is to reduce the model's size and computational cost while keeping its accuracy as close as possible to that of the original network. This makes pruning particularly valuable when deploying models on resource-constrained devices, where it cuts memory footprint and inference time.

Pruning can be applied at several granularities (illustrated in the code sketch after this list), including:

  • Weight Pruning: Eliminating individual weights based on certain criteria, such as their magnitude.
  • Neuron Pruning: Removing entire neurons that contribute minimally to the output.
  • Filter Pruning: Discarding less important filters in convolutional layers, which can significantly reduce model complexity.
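
As a concrete illustration, the sketch below applies all three granularities using PyTorch's built-in `torch.nn.utils.prune` utilities. The two layers are hypothetical stand-ins for parts of a trained model; the `prune` calls themselves reflect the actual PyTorch API.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical stand-in for layers of an already-trained model.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
linear = nn.Linear(in_features=128, out_features=10)

# Weight pruning (unstructured): zero the 30% of individual connections
# in the linear layer with the smallest L1 magnitude.
prune.l1_unstructured(linear, name="weight", amount=0.3)

# Neuron pruning: structurally zero the 20% of rows of the weight matrix
# with the smallest L2 norm; each row feeds one output neuron. PyTorch
# combines this mask with the unstructured mask applied above.
prune.ln_structured(linear, name="weight", amount=0.2, n=2, dim=0)

# Filter pruning: zero the 25% of convolutional filters with the smallest
# L2 norm along the output-channel dimension (dim=0).
prune.ln_structured(conv, name="weight", amount=0.25, n=2, dim=0)

# Pruning is realized as a mask over the original weights; check sparsity.
sparsity = (linear.weight == 0).float().mean().item()
print(f"linear layer sparsity: {sparsity:.1%}")
```

Note that these calls zero out parameters via a mask rather than shrinking the tensors, which is why unstructured and structured pruning can be layered on the same parameter.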

Implementing pruning requires balancing the trade-off between model size and accuracy: the more aggressively parameters are removed, the greater the risk of degrading predictions. Post-pruning fine-tuning is therefore often necessary to recover the performance lost when parameters are removed. Frameworks such as TensorFlow (via the Model Optimization Toolkit) and PyTorch (via `torch.nn.utils.prune`) offer tools that facilitate this workflow, as sketched below.
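
The following is a minimal sketch of that prune-then-fine-tune workflow in PyTorch. The helper name, the pruning amount, and the training-loop details are illustrative assumptions; `prune.l1_unstructured`, `prune.remove`, and the mask-based fine-tuning behavior are part of PyTorch's actual `torch.nn.utils.prune` API.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical workflow: `model` is a trained network and `train_loader`
# yields (inputs, labels) batches; both are assumed to exist already.
def prune_and_finetune(model, train_loader, amount=0.2, epochs=2, lr=1e-4):
    # 1. Prune the lowest-magnitude weights in every linear layer.
    for module in model.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=amount)

    # 2. Fine-tune briefly: the pruning masks hold removed weights at zero
    #    while the surviving weights adjust to recover lost accuracy.
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for inputs, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), labels)
            loss.backward()
            optimizer.step()

    # 3. Make the pruning permanent: drop the masks and bake the zeros
    #    into the weight tensors themselves.
    for module in model.modules():
        if isinstance(module, nn.Linear):
            prune.remove(module, "weight")
    return model
```

One caveat worth knowing: `prune.remove` makes the zeros permanent but does not shrink the tensors, so realizing actual speedups typically requires sparse-aware kernels or, for structured pruning, physically removing the zeroed filters or neurons.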
