Thursday, November 9, 2017 - 4:30pm
Event Calendar Category
LIDS & Stats Tea
Speaker Name
Tal Wagner
Affiliation
CSAIL
Building and Room Number
LIDS Lounge
Abstract
Distance-preserving compression (or quantization) is a widely used tool for speeding up machine learning and data mining tasks such as near-neighbor search. The goal is to compress a large, high-dimensional dataset so that the distances between its elements are approximately preserved. Current algorithms are heuristic in nature and unsuitable for high-precision regimes. I will present a new algorithm that achieves both state-of-the-art performance on standard benchmark datasets and provably near-optimal theoretical compression bounds.
Based on joint work with Piotr Indyk and Ilya Razenshteyn.
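To illustrate the basic idea of distance-preserving compression (this is only a minimal sketch using uniform scalar quantization, not the speaker's algorithm; the dataset, bit budget, and parameters are assumptions chosen for the example), the snippet below compresses each coordinate to a few bits and then checks how well pairwise Euclidean distances survive the compression.

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 128))        # 1000 points in 128 dimensions (synthetic example data)

bits = 4                                    # bits per coordinate (assumed budget)
levels = 2 ** bits
lo, hi = X.min(), X.max()
step = (hi - lo) / (levels - 1)

codes = np.round((X - lo) / step).astype(np.uint8)   # compressed representation
X_hat = codes * step + lo                            # reconstructed (decompressed) points

# Compare a few pairwise distances before and after compression.
pairs = rng.integers(0, len(X), size=(5, 2))
for i, j in pairs:
    d_true = np.linalg.norm(X[i] - X[j])
    d_approx = np.linalg.norm(X_hat[i] - X_hat[j])
    print(f"true {d_true:.3f}  approx {d_approx:.3f}  rel. err {abs(d_true - d_approx) / d_true:.4f}")

In this toy setting, increasing the bit budget shrinks the relative error at the cost of a larger compressed representation; the talk concerns algorithms with much better trade-offs and provable guarantees.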