Computes pairwise cosine similarity over the embeddings of the entire dataset to detect which tweets are similar. Outputs a list of tweet groups, where tweets are grouped together according to the threshold defined in config.yaml.
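
The grouping step can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name `group_similar`, the toy embeddings, and the choice to merge tweets transitively (connected components over all pairs at or above the threshold) are assumptions. The threshold is passed in directly here; in the real script it would come from config.yaml.

```python
import numpy as np


def group_similar(embeddings: np.ndarray, threshold: float) -> list[list[int]]:
    """Group row indices whose cosine similarity is >= threshold.

    Hypothetical helper: groups are formed transitively, which is an
    assumption about how the threshold is applied.
    """
    # Normalise rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T  # full n x n pairwise similarity matrix

    # Union-find over row indices to merge similar pairs into groups.
    parent = list(range(len(embeddings)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Only the strict upper triangle is needed: sim is symmetric and the
    # diagonal (self-similarity) is not informative.
    rows, cols = np.nonzero(np.triu(sim >= threshold, k=1))
    for i, j in zip(rows.tolist(), cols.tolist()):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    groups: dict[int, list[int]] = {}
    for idx in range(len(embeddings)):
        groups.setdefault(find(idx), []).append(idx)

    # Report only groups that contain more than one tweet.
    return [members for members in groups.values() if len(members) > 1]


# Toy data: the first two embeddings point in nearly the same direction.
tweets = ["good morning", "good morning!", "stock prices fell"]
embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
groups = group_similar(embeddings, threshold=0.9)
print([[tweets[i] for i in g] for g in groups])
```

Note that the full similarity matrix needs memory quadratic in the number of tweets, so a large dataset may need to be processed in chunks.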