Quantization is a process that reduces the precision of a model’s parameters and computations, typically by converting them from higher bit-width representations like 32-bit floating point to lower bit-width formats such as 8-bit integers. This reduction decreases both the computational and memory demands of running inference, enabling more efficient deployment of models on resource-constrained devices without significantly compromising accuracy.
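The float-to-integer mapping described above can be sketched with a minimal affine (scale and zero-point) quantization scheme. This is an illustrative example, not the implementation of any particular framework; the function names and sample weights are invented for the sketch.

```python
# Minimal sketch of affine (asymmetric) 8-bit quantization.
# Illustrative only: names and sample values are not from any specific library.

def quantize(values, num_bits=8):
    """Map float values to unsigned integers via a scale and zero-point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard against constant input
    zero_point = round(qmin - lo / scale)
    q = [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q_values, scale, zero_point):
    """Approximately recover the original floats from the integers."""
    return [(q - zero_point) * scale for q in q_values]

weights = [-1.5, -0.3, 0.0, 0.7, 2.1]          # pretend FP32 parameters
q, s, z = quantize(weights)                     # 8-bit integer codes
recovered = dequantize(q, s, z)                 # floats, within one scale step
```

Storing the integer codes plus one scale and zero-point per tensor is what yields the memory savings; the dequantized values differ from the originals by at most one quantization step, which is why accuracy typically degrades only slightly.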