Abstract: The compressive learning framework reduces the computational cost of training on large-scale datasets. In a sketching phase, the data is first compressed to a lightweight sketch vector, obtained by mapping the data samples through a well-chosen feature map, and averaging those contributions. In a learning phase, the desired model parameters are then extracted from this sketch by solving an optimization problem, which also involves a feature map. When the feature map is identical during the sketching and learning phases, formal statistical guarantees (excess risk bounds) have been proven.
However, the desirable properties of the feature map are different during sketching and learning (such as quantized outputs, and differentiability, respectively). We thus study the relaxation where this map is allowed to be different for each phase. First, we prove that the existing guarantees carry over to this asymmetric scheme, up to a controlled error term, provided some Limited Projected Distortion (LPD) property holds. We then instantiate this framework to the setting of quantized sketches, by proving that the LPD indeed holds for binary sketch contributions. Finally, we further validate the approach with numerical simulations, including a large-scale application in audio event classification.
This is a joint work with Vincent Schellekens (CEA-List, France)