Book
Real-time Data De-duplication using Locality-sensitive Hashing powered by Storm and Riak (Berlin Buzzwords 2014)
Oliver Schulte Machine Learning 726 Nonparametric Methods: Nearest Neighbors.
Sketching, Sampling, and other Sublinear Algorithms 1 (Lecture by Alex Andoni)
3 - Finding similar items
Pairwise Sequence Alignment BMI/CS 576 Colin Dewey Fall 2010.
CS345A: Data Mining on the Web Course Introduction Issues in Data Mining Bonferroni’s Principle.
AUTOMATIC ANNOTATION OF GEO-INFORMATION IN PANORAMIC STREET VIEW BY IMAGE RETRIEVAL Ming Chen, Yueting Zhuang, Fei Wu College of Computer Science, Zhejiang.
Why Not Grab a Free Lunch? Mining Large Corpora for Parallel Sentences to Improve Translation Modeling Ferhan Ture and Jimmy Lin University of Maryland,
Finding Similar Items
Minimal Loss Hashing for Compact Binary Codes
Mining of Massive Datasets