CUbRIK Research at CIKM 2012: Efficient Jaccard-based Diversity Analysis of Large Document Collections