Frontiers in Signal Processing

Spam Comment Recognition Based on Wide & Deep Learning

PP. 30-36 Pub. Date: January 5, 2020

DOI: 10.22606/fsp.2020.41005

Author(s)

Meiling Fu
Key Laboratory of Electronic and Information Engineering, Southwest Minzu University, State Ethnic Affairs Commission, Chengdu, Sichuan 610000, China
Daji Ergu*
Key Laboratory of Electronic and Information Engineering, Southwest Minzu University, State Ethnic Affairs Commission, Chengdu, Sichuan 610000, China

Abstract

The flood of spam comments on e-commerce platforms distorts consumers' purchasing decisions and greatly damages their interests. In spam comment recognition, models have usually taken the explicit discrete features of spam comments as input. This paper combines the implicit semantic features of spam comments with their explicit discrete features to identify spam comments. First, the SMOTE oversampling method is used to balance the positive and negative sample sets. Then the Wide & Deep model, a recommender system model proposed by Google, is improved and applied to one of the public datasets for spam comment recognition and to commodity datasets collected from one of the biggest e-commerce platforms in China. The experimental results show that the improved algorithm achieves good results on both the gold-standard opinion spam datasets and the commodity datasets.
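The two building blocks named above can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify how the model was improved. The sketch assumes the textbook form of SMOTE (interpolating between a minority sample and one of its nearest minority neighbours) and a Wide & Deep prediction whose wide part is a linear model over explicit discrete features and whose deep part is a single ReLU layer over dense semantic features. The function names `smote` and `wide_deep_prob`, their parameters, and the toy inputs are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def wide_deep_prob(x_wide, x_deep, w_wide, W_hidden, w_deep, bias):
    """Joint prediction: sigmoid(wide logit + deep logit + bias)."""
    # Wide part: linear model over explicit discrete features.
    wide_logit = sum(w * x for w, x in zip(w_wide, x_wide))
    # Deep part (assumption): one ReLU hidden layer over dense features.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, x_deep)))
              for row in W_hidden]
    deep_logit = sum(w * h for w, h in zip(w_deep, hidden))
    return 1.0 / (1.0 + math.exp(-(wide_logit + deep_logit + bias)))


# Toy usage: oversample three minority points, then score one comment.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote(minority, n_new=5)
prob = wide_deep_prob(
    x_wide=[1.0, 0.0], x_deep=[1.0],
    w_wide=[2.0, -1.0], W_hidden=[[1.0], [-1.0]],
    w_deep=[1.0, 1.0], bias=-3.0,
)
```

In practice the two logits are trained jointly, so the wide weights and the deep network receive gradients from the same loss; the sketch shows only the forward computation.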

Keywords

Wide & deep, spam comment, SMOTE, recognition.
