Abstract
Social media has become our new reality: many people wake up and, before getting out of bed, check their social media. People now rarely read newspapers, watch TV news, or listen to radio broadcasts. In recent years, a great deal of fake news has circulated on social media, with people believing it and spreading it further without checking its credibility. Fake news has affected several domains, as seen in the US elections of 2016 and 2020 and in false information about Covid-19 treatments, its status in various countries, vaccines, and other topics. A rumor can affect the stock market dramatically and cause a major economic crisis. Some websites fight this by relying on experts to fact-check news, and they keep their sites updated with the fact-checked items and how real or fake each one is. However, manual fact-checking requires considerable labor and time and cannot keep pace with the speed at which news spreads. Therefore, the need for automatic fact-checking tools is becoming more urgent. In social media, the text of a post is not the only factor in spreading news. Users’ engagement through reposting, liking, commenting, and replying affects the spread of a post. The credibility of the user also affects how people in the same circle react to the post. The way news propagates through the network over time and across circles is also an important factor. Much research has tackled the problem by considering mainly the textual content of posts, while other work has focused on user features. This research aims to develop a comprehensive machine learning framework for detecting fake news on social media by incorporating multiple modalities of data. It proposes a multimodal approach that integrates content-based, user-based, and propagation-based approaches.
Specifically, it examines how contextual information can enhance the detection of fake social posts and the extent to which integrating news articles can further improve performance. Large language models are used for the textual representation of social posts and news articles, and deep neural networks are used to capture the contextual features of the text and classify each post as fake or real. This integration yielded the TChecker model, which achieved an F1 score of 0.93 compared to 0.91 for state-of-the-art models that integrate both social posts and news articles. The investigation also examines the impact of social post metrics, such as retweets, replies, and likes, and of user features, such as follower count and account age, on the performance of fake news detection models. Feeding those features to the TChecker model yields the TChecker+ model, which achieved an F1 score of 0.94. Furthermore, the study assesses how the spread of news through social networks influences the identification of fake news and how combining propagation features with textual features can improve detection. Ultimately, the research seeks the most effective way to integrate these approaches in order to improve the reliability of fake news detection on social media platforms, and it proposes the Multimodal model for this purpose. The results show that the Multimodal model outperforms stand-alone models that rely on one or two modes of features. The Multimodal model achieved an F1 score of 0.96 compared to 0.91 for the state-of-the-art model that integrates the textual content of the post and its social context.
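For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1 score is computed for a binary fake/real classifier, as the harmonic mean of precision and recall. The function name, the label encoding (1 = fake, 0 = real), and the toy labels are illustrative assumptions; the abstract does not state which averaging scheme or evaluation code the dissertation uses.

```python
def f1_score(true_labels, predicted_labels, positive=1):
    """F1 for the `positive` class: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: F1 is conventionally 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (1 = fake, 0 = real); not data from the dissertation.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

In this toy example there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F1 score is 0.75.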
School
School of Sciences and Engineering
Department
Computer Science & Engineering Department
Degree Name
PhD in Applied Sciences
Graduation Date
Summer 6-12-2024
Submission Date
5-28-2024
First Advisor
Ahmed Rafea
Second Advisor
Hossam Sharara
Committee Member 1
Mostafa Youssef
Committee Member 2
Nourhan Sakr
Committee Member 3
Aly Fahmy
Extent
87 p.
Document Type
Doctoral Dissertation
Institutional Review Board (IRB) Approval
Not necessary for this item
Recommended Citation
APA Citation
GabAllah, N. A. (2024). Machine Learning Multimodal Framework for Fake News Detection and Mitigation [Doctoral Dissertation, the American University in Cairo]. AUC Knowledge Fountain.
https://fount.aucegypt.edu/etds/2349
MLA Citation
GabAllah, Nada A. Machine Learning Multimodal Framework for Fake News Detection and Mitigation. 2024. American University in Cairo, Doctoral Dissertation. AUC Knowledge Fountain.
https://fount.aucegypt.edu/etds/2349