Filtering users accounts for enhancing the results of social media mining tasks

Funding Sponsor

American University in Cairo

Author's Department

Computer Science & Engineering Department

Document Type

Research Article

Publication Title

Advances in Intelligent Systems and Computing

Publication Date





© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020.

Filtering out illegitimate Twitter accounts before running online social media mining tasks reduces noise and thus improves the quality of the outcomes of those tasks. Developing a supervised machine learning classifier for this purpose requires a large annotated dataset. While building the annotation guidelines, the rules were found suitable for developing an unsupervised rule-based classification program. Despite its high accuracy, however, the rule-based program was not time-efficient. The rule-based program was therefore used to create a massive annotated dataset for training a supervised machine learning classifier, which proved fast and matched the performance of the unsupervised classifier, with an F-score of 92%. The impact of removing the illegitimate accounts on an influential-user identification program developed by the authors was then investigated. Precision improved slightly, but the improvement was not statistically significant, which indicates that the influential-user program had not been erroneously identifying spam accounts as influential.
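The pipeline described in the abstract (a slow rule-based program labels accounts, and a fast supervised classifier is then trained on those labels) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two rules in `rule_label`, the feature names `link_ratio`, `followers`, and `following`, the synthetic accounts, and the use of a simple perceptron as the supervised model. The score is computed as F1, which is one common reading of "F-score".

```python
import random


def rule_label(acct):
    """Hypothetical stand-in for the rule-based program.

    Returns 1 for an illegitimate account, 0 for a legitimate one.
    The two rules below are illustrative assumptions only.
    """
    mostly_links = acct["link_ratio"] > 0.8
    few_followers = acct["followers"] < 0.1 * acct["following"]
    return 1 if (mostly_links and few_followers) else 0


def features(acct):
    """Map an account to a small numeric vector (bias term first)."""
    ratio = acct["followers"] / (acct["following"] + 1)
    return [1.0, acct["link_ratio"], min(ratio, 5.0) / 5.0]


def predict(weights, x):
    """Linear threshold prediction: 1 if the weighted sum is positive."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0


def train_perceptron(rows, labels, epochs=20):
    """Classic perceptron updates on the rule-generated labels."""
    weights = [0.0] * len(rows[0])
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            if predict(weights, x) != y:
                sign = 1 if y == 1 else -1
                weights = [w + sign * xi for w, xi in zip(weights, x)]
    return weights


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Synthetic accounts; a real study would use collected Twitter data.
rng = random.Random(0)
accounts = [
    {
        "link_ratio": rng.random(),
        "followers": rng.randint(0, 1000),
        "following": rng.randint(1, 1000),
    }
    for _ in range(2000)
]

# Step 1: the rule-based program produces the annotated dataset.
labels = [rule_label(a) for a in accounts]
rows = [features(a) for a in accounts]

# Step 2: train the supervised model on part of it, evaluate on the rest.
split = 1500
weights = train_perceptron(rows[:split], labels[:split])
preds = [predict(weights, x) for x in rows[split:]]
score = f1_score(labels[split:], preds)
print(f"F1 against rule-generated labels: {score:.2f}")
```

Note that a score obtained this way measures agreement with the rule-based labels rather than with human annotation, so it says how closely the fast model reproduces the slow one.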

First Page


Last Page

