A Comprehensive Pre-processing Approach for High-Performance Classification of Twitter Data with several Machine Learning Algorithms

Producing an average of five hundred million tweets per day, Twitter has become one of the most comprehensive sources of data for researchers. Previous research on Twitter data has addressed tasks such as sentiment analysis. However, little work has been done on classifying tweets into categories so that they can be distributed according to user preferences. In this research, we first constructed four broad classes: politics, sports, crime, and natural. Next, we applied our proposed preprocessing model to the raw Twitter dataset. We then applied several machine learning techniques (Random Forest, K-Nearest Neighbors, Naive Bayes, Logistic Regression, Decision Tree, and Support Vector Machine) to classify the Twitter data. Finally, we compared the outcomes with and without preprocessing in terms of sensitivity, specificity, and accuracy. We found that the proposed preprocessing model improved the performance of all the machine learning classifiers.
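
The three evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not say how they were computed for the four-class problem, so the one-vs-rest treatment below is an assumption, and the function name `one_vs_rest_metrics` and the toy labels are hypothetical, not data from the paper.

```python
# Hypothetical illustration: per-class sensitivity, specificity and accuracy,
# treating one class as "positive" and the other three as "negative".
CLASSES = ["politics", "sports", "crime", "natural"]


def one_vs_rest_metrics(y_true, y_pred, positive):
    """Return sensitivity, specificity and accuracy for one class vs. the rest."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # correctly labelled as the positive class
        elif truth != positive and pred == positive:
            fp += 1  # another class wrongly labelled as positive
        elif truth == positive and pred != positive:
            fn += 1  # positive class missed
        else:
            tn += 1  # correctly labelled as not positive
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / len(y_true) if y_true else 0.0,
    }


# Toy labels, invented for this example only.
y_true = ["politics", "sports", "crime", "natural", "sports", "crime"]
y_pred = ["politics", "sports", "sports", "natural", "sports", "crime"]

results = {c: one_vs_rest_metrics(y_true, y_pred, c) for c in CLASSES}
for name, scores in results.items():
    print(name, scores)
```

Running the same function on predictions made with and without preprocessing would give the kind of side-by-side comparison the abstract describes.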

Twitter Data Classification, Novel Preprocessing Model, Random Forest, K-Nearest Neighbors, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine

Ananya Sarker, Md. Rabiul Islam, Azmain Yakin Srizon

Rajshahi University of Engineering and Technology, Rajshahi, Bangladesh

IEEE Region 10 Symposium

Dhaka, Bangladesh

2020 IEEE Region 10 Symposium

630-633

2020