Paper type:
Journal article
First author:
Alam, Saqib
Corresponding author:
Alam, S (reprint author), Dalian Univ Technol, Dept Elect Informat & Elect Engn, Black Bldg,Linggong Rd 2, Dalian 116024, Peoples R China.
Co-author:
Yao, Nianmin
Publication date:
2019-09-01
Journal:
COMPUTATIONAL AND MATHEMATICAL ORGANIZATION THEORY
Indexed in:
SCIE, SSCI
Document type:
J
Volume:
25
Issue:
3
Page range:
319-335
ISSN:
1381-298X
Keywords:
Preprocessing; Machine learning; Sentiment analysis; Word2Vec
Abstract:
Big data and its related technologies have recently become active areas of research. A huge amount of data is generated every minute, including unstructured data, which is now a topic of interest for researchers. Considerable research is under way in text analytics and text preprocessing. In this paper, we study the impact of different preprocessing steps on the accuracy of three machine learning algorithms for sentiment analysis. We applied different text preprocessing techniques and measured their effect on sentiment classification accuracy using three well-known classifiers: Naive Bayes (NB), maximum entropy (MaxE), and support vector machines (SVM). We calculated the accuracy of each algorithm before and after applying the preprocessing steps. The results showed that the accuracy of the NB algorithm improved significantly after preprocessing, while the SVM algorithm showed a slight improvement. Interestingly, the MaxE algorithm showed no improvement in accuracy. Our work is a comparative study: after preprocessing, the accuracy of the NB algorithm was again significantly higher than that of the other algorithms, followed by MaxE and SVM. This work shows that text preprocessing affects the accuracy of machine learning algorithms, and that the NB algorithm in particular benefits significantly from it.
Translated work:
No