Quantized Compressive K-Means

IEEE Signal Processing Letters

Abstract: The recent framework of compressive statistical learning proposes to design tractable learning algorithms that use only a heavily compressed representation, or sketch, of massive datasets. Compressive K-Means (CKM) is such a method: it aims at estimating the centroids of data clusters from pooled, nonlinear, and random signatures of the learning examples. While this approach significantly reduces computational time on very large datasets, its digital implementation wastes acquisition resources because the learning examples are compressed only after the sensing stage. The present work generalizes the CKM sketching procedure to a large class of periodic nonlinearities, including hardware-friendly implementations that compressively acquire entire datasets. This idea is exemplified in a quantized CKM procedure, a variant of CKM that leverages 1-bit universal quantization (i.e., retaining the least significant bit of a standard uniform quantizer) as the periodic sketch nonlinearity. Switching to this resource-efficient signature (standard in most acquisition schemes) has almost no impact on the clustering performance, as illustrated by numerical experiments.
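
The sketching step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the 1-bit universal quantizer can be modeled as a square wave of period 2*pi, that the random projections use Gaussian frequencies, and that a uniform random dither is added before the nonlinearity. The names `universal_1bit`, `sketch`, `Omega`, and `xi`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
import numpy as np

def universal_1bit(t):
    """Assumed model of 1-bit universal quantization: a square wave of
    period 2*pi, equal to +1 where cos(t) >= 0 and -1 elsewhere."""
    return np.where(np.cos(t) >= 0.0, 1.0, -1.0)

def sketch(X, Omega, xi, nonlinearity):
    """Pool (average over examples) a periodic nonlinearity applied to
    random projections of the rows of X, plus a dither xi."""
    return nonlinearity(X @ Omega + xi).mean(axis=0)

rng = np.random.default_rng(0)
n, d, m = 1000, 2, 50  # examples, dimension, sketch size (illustrative values)

# Toy dataset: two well-separated Gaussian clusters.
X = np.concatenate([
    rng.normal(-2.0, 0.3, size=(n // 2, d)),
    rng.normal(+2.0, 0.3, size=(n // 2, d)),
])

Omega = rng.normal(0.0, 1.0, size=(d, m))   # random frequencies (assumed Gaussian)
xi = rng.uniform(0.0, 2.0 * np.pi, size=m)  # random dither (assumed uniform)

z_q = sketch(X, Omega, xi, universal_1bit)  # quantized sketch: one bit per entry per example
z_cos = sketch(X, Omega, xi, np.cos)        # smooth periodic signature, for comparison
```

Each example contributes a vector of `m` values in {-1, +1}, so the dataset is summarized by `m` averages regardless of `n`. Estimating the centroids from `z_q` is a separate step that this sketch does not cover.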