Document Details
Document Type: Thesis
Document Title: APPLYING WHALE OPTIMIZATION AND GENETIC ALGORITHMS FOR SPAM DETECTION IN TWITTER تطبيق خوارزمية تحسين الحيتان و الخوارزمية الجينية لكشف المحتوى الغير مرغوب فيه في تويتر
Subject: Faculty of Computing and Information Technology
Document Language: Arabic
Abstract: Over the past 10 years, Online Social Networks (OSNs) have grown in popularity. Social media sites such as Facebook, Twitter, Instagram, Snapchat, Pinterest, and LinkedIn make it easier to disseminate information through OSNs, but they also facilitate the spread of spam and threaten user information and privacy. Many research studies have addressed spam detection in OSNs. In these studies, the researchers rely on classification processes that use a large number of features, which lengthens execution time. This thesis proposes a new model for detecting spam on Twitter, evaluated on two datasets in different languages (English and Arabic). The proposed model uses an optimized classification method to detect spam tweets in both datasets. Three algorithms are used for classification: Naïve Bayes, Logistic Regression, and Stochastic Gradient Descent. Optimization is then performed in two separate experiments: the first applies the Whale Optimization Algorithm (WOA), and the second applies the Genetic Algorithm (GA). The experiments yielded the following results. For the English dataset, Naïve Bayes was the best classifier both before and after optimization; its accuracy improved from 93.1% to 95.3%, and the number of required features decreased from 20,000 to 3,000. For the Arabic dataset, Logistic Regression was the best classifier before and after optimization; its accuracy improved from 89.5% to 91.1%, and the number of required features decreased from 6,689 to 2,400. For both datasets, the WOA improved classification accuracy and reduced the number of required features.
Thus, we can conclude that optimizing the classification process with the WOA improves the classification model, leading to faster real-time prediction and a shorter execution time. Using the GA, by contrast, did not improve the results of the classification model in the experiments conducted in this research.
Supervisor: Dr. Ghada Amoudi
Thesis Type: Master Thesis
Publishing Year: 1442 AH / 2020 AD
Added Date: Saturday, December 12, 2020

Researchers
فاطمة محمد القحطاني | Alqahtani, Fatimah Mohammed | Researcher | Master | |
|
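The abstract describes wrapping a classifier in a Whale Optimization Algorithm search that shrinks the feature set. The following is a minimal sketch of that idea, not the thesis's implementation. It uses the textbook WOA update rules (encircling, random exploration, spiral), and several details are assumptions made for illustration: positions live in [0, 1] and a feature is kept when its coordinate exceeds 0.5, the spiral constant is 1, and `toy_fitness` is a hypothetical stand-in for classifier accuracy on a validation set.

```python
import math
import random


def woa_feature_selection(fitness, n_features, n_whales=10, n_iters=30, seed=0):
    """Search for a feature mask that maximizes `fitness` with a basic WOA loop."""
    rng = random.Random(seed)

    def to_mask(position):
        # Assumed binarization: keep a feature when its coordinate exceeds 0.5.
        return [x > 0.5 for x in position]

    def clip(x):
        return min(1.0, max(0.0, x))

    whales = [[rng.random() for _ in range(n_features)] for _ in range(n_whales)]
    scores = [fitness(to_mask(w)) for w in whales]
    best_idx = max(range(n_whales), key=scores.__getitem__)
    best_pos, best_score = list(whales[best_idx]), scores[best_idx]

    for t in range(n_iters):
        a = 2.0 * (1.0 - t / n_iters)  # decreases linearly from 2 towards 0
        for i, whale in enumerate(whales):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best whale (|A| < 1) or explore around a random one.
                target = best_pos if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [clip(tj - A * abs(C * tj - xj)) for tj, xj in zip(target, whale)]
            else:
                # Spiral move towards the best whale (spiral constant assumed to be 1).
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(l) * math.cos(2.0 * math.pi * l)
                new = [clip(abs(bj - xj) * spiral + bj) for bj, xj in zip(best_pos, whale)]
            whales[i] = new
            score = fitness(to_mask(new))
            if score > best_score:
                # The best solution is updated as soon as a better one is found.
                best_pos, best_score = list(new), score

    return to_mask(best_pos), best_score


def toy_fitness(mask):
    # Hypothetical stand-in for classifier accuracy: features 0-4 are "useful",
    # and every selected feature costs a small penalty.
    return sum(mask[:5]) - 0.1 * sum(mask)


mask, score = woa_feature_selection(toy_fitness, n_features=20)
print(sum(mask), "features kept, fitness", round(score, 2))
```

In a setup like the one the abstract describes, `fitness` would instead train one of the three classifiers on the selected columns and return its validation accuracy, optionally with a penalty on the number of features kept.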