Uttej's Contrivance: social

Social media platforms lead the internet in user base and average time spent, and they generate a significant amount of data every day. This makes them a go-to place for understanding people's reviews, trends, and opinions.

A regular search for a popular topic returns an abundance of information, so it is impractical to go through the results manually to understand the trends. This thesis discusses techniques for intra-topic clustering of such social media data and examines how social media noise increases the redundancy of search results.

Our goal is to reduce the amount of redundant information an end user must review from a regular social media search. The research proposes clustering models based on two string similarity measures: Jaccard word-token similarity and T-information distance. Evaluation parameters are introduced, and the models are evaluated by clustering a set of current and historical topics to determine which techniques are most effective.
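To make the Jaccard word-token idea concrete, here is a minimal sketch of how near-duplicate posts on one topic could be grouped. It is an illustration, not the thesis's actual model: the whitespace tokenizer, the greedy single-pass grouping, the 0.5 threshold, and the function names are all assumptions chosen for brevity. The T-information distance is not shown.

```python
def tokenize(text):
    """Lowercase the text and split on whitespace into a set of word tokens.
    (Assumed tokenizer; real social media text likely needs more cleanup.)"""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard word-token similarity: |A intersect B| / |A union B|."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 1.0  # two empty strings are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def greedy_cluster(posts, threshold=0.5):
    """Assign each post to the first cluster whose first member is similar
    enough; otherwise start a new cluster. (Hypothetical grouping strategy.)"""
    clusters = []
    for post in posts:
        for cluster in clusters:
            if jaccard(post, cluster[0]) >= threshold:
                cluster.append(post)
                break
        else:
            clusters.append([post])
    return clusters


posts = [
    "new phone launch today",
    "New phone launch today !",
    "weather is nice",
]
clusters = greedy_cluster(posts)
```

With this input, the first two posts share four of five distinct tokens and land in the same cluster, so an end user would only need to read one representative of that group.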

Full thesis text here: Uttej's thesis


Wednesday, December 9, 2020

Intra-topic clustering for social media
