Jacob Kaplan-Moss

📌 Geeking with Greg: Clever method of near duplicate detection

A slick algorithm that “fingerprints” text using chains of words that follow stop words.
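
To make the idea concrete, here is a rough Python sketch of how such a fingerprint might work. It is not the code from the linked post: the stop-word list, the `chain_length` parameter, the function names, and the use of Jaccard similarity to compare fingerprints are all my own assumptions for illustration.

```python
import re

# Assumed, deliberately tiny stop-word list; a real one would be much larger.
STOP_WORDS = {"a", "an", "the", "is", "of", "to", "and", "in"}


def fingerprint(text, chain_length=2):
    """Return the set of (stop word, following words...) chains found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    chains = set()
    for i, word in enumerate(words):
        if word not in STOP_WORDS:
            continue
        # Take the next `chain_length` non-stop words after this stop word.
        following = [w for w in words[i + 1:] if w not in STOP_WORDS][:chain_length]
        if len(following) == chain_length:
            chains.add((word, *following))
    return chains


def similarity(text_a, text_b):
    """Jaccard similarity of the two fingerprints (an assumed comparison metric)."""
    a, b = fingerprint(text_a), fingerprint(text_b)
    union = a | b
    if not union:
        return 0.0  # No chains in either text, so there is nothing to compare.
    return len(a & b) / len(union)


article = "The quick brown fox is a lazy dog"
with_nav = "Home | News | The quick brown fox is a lazy dog"
print(similarity(article, with_nav))
```

In this sketch, navigation text such as "Home | News" contains no stop words, so it adds no chains and the two pages get the same fingerprint.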