-
total lines in shingles.congress.sorted = 64300
-
total lines after single pass over file to remove singletons = 35940
-
there are 6000 possible pairs before reduction
-
there are 600 pairs after reducing to only pairs from different files
-
there are 720313 pairs to check
-
there are 231971 unique pairs
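
The counts above can be read as stages of a candidate-pair pipeline. The following is a minimal sketch of those stages, not the code that produced this log. It assumes each record in the sorted shingle file is a (shingle, file_id) pair; that layout is a guess, since the file's contents are not shown here. The name candidate_pairs is hypothetical.

```python
from itertools import combinations, groupby


def candidate_pairs(records):
    """Sketch of the stages reported in the log.

    records: iterable of (shingle, file_id) tuples, already sorted by
    shingle. This record layout is an assumption, not taken from the log.
    """
    kept = []   # records left after removing singleton shingles
    pairs = []  # every cross-file pair, duplicates included
    # Because the input is sorted, one pass with groupby is enough.
    for _, group in groupby(records, key=lambda r: r[0]):
        group = list(group)
        if len(group) < 2:
            continue  # singleton shingle: it cannot match anything
        kept.extend(group)
        files = [file_id for _, file_id in group]
        for a, b in combinations(files, 2):
            if a != b:  # keep only pairs drawn from different files
                pairs.append((min(a, b), max(a, b)))
    # The set collapses the same file pair reached through several shingles.
    return kept, pairs, set(pairs)


records = [
    ("aa", "f1"), ("aa", "f2"),
    ("bb", "f1"),
    ("cc", "f1"), ("cc", "f1"), ("cc", "f2"),
    ("dd", "f3"),
]
kept, pairs, unique = candidate_pairs(records)
print("total lines after removing singletons =", len(kept))
print("pairs to check =", len(pairs))
print("unique pairs =", len(unique))
```

In this sketch, "pairs to check" counts a file pair once per co-occurrence of a shared shingle, while "unique pairs" counts each file pair once, which is one way the second figure can be smaller than the first.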