
Smokers' sentiment towards cigarette consumption: an evaluation by integrating multi-level features

Abstract: To collect feedback from cigarette consumers and understand what drives their consumption behavior, a large-scale multi-sentiment dataset of consumer evaluations was constructed, a sentiment analysis model that integrates multi-level features, ECBHA, was proposed, and an intelligent system for mining typical opinions of cigarette consumers was built on this model. ECBHA uses the pre-trained model ERNIE (Enhanced Representation through Knowledge Integration) to obtain dynamic word vectors that carry contextual information, extracts local and global features with a convolutional neural network (CNN) and a bi-directional long short-term memory network (BiLSTM), and applies a hierarchical attention network (HAN) to extract the features most relevant to sentiment judgment at both the word level and the sentence level. Experiments on the cigarette consumer evaluation dataset showed that ECBHA outperformed nine machine learning and deep learning baselines, including support vector machine (SVM), multinomial logistic regression (LG), and text convolutional neural network (TextCNN), on all reported metrics. Its overall classification accuracy was 85.29%, and its F1 scores for positive, neutral, and negative sentiment were 90.51%, 67.96%, and 86.21%; relative to ERNIE, the best-performing baseline, these four values were higher by 3.28, 2.02, 5.32, and 4.77 percentage points, respectively. The opinion mining system built on ECBHA supports cigarette product portraits and comparison of product sentiment analysis results, helping manufacturers quickly grasp consumer attitudes towards cigarette products and supporting product research and development as well as precision marketing.
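The word-level and sentence-level attention described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the ECBHA implementation: the abstract gives no layer details, so the scoring rule used here (a dot product between `tanh` of each vector and a context vector, followed by a softmax) is an assumption modelled on the common formulation of hierarchical attention, with the learned projection omitted. The function names `attention_pool` and `hierarchical_pool` and all vectors are hypothetical; in the real model the inputs would be the CNN/BiLSTM features computed from ERNIE embeddings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Collapse a list of vectors into one vector with attention weights.

    Assumed scoring rule: score_i = tanh(v_i) . context. A trained model
    would apply a learned projection before tanh; it is omitted here.
    """
    scores = [
        sum(math.tanh(x) * c for x, c in zip(vec, context))
        for vec in vectors
    ]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, vectors))
        for i in range(dim)
    ]
    return pooled, weights


def hierarchical_pool(document, word_context, sentence_context):
    """Pool words into sentence vectors, then sentences into a document vector.

    `document` is a list of sentences; each sentence is a list of word vectors.
    Returns the document vector and the sentence-level attention weights.
    """
    sentence_vectors = [
        attention_pool(words, word_context)[0] for words in document
    ]
    return attention_pool(sentence_vectors, sentence_context)


# Toy input: two sentences with 2-dimensional word features (made-up values).
toy_document = [
    [[2.0, 0.0], [0.0, 0.0]],
    [[0.5, 0.5]],
]
doc_vector, sentence_weights = hierarchical_pool(
    toy_document, word_context=[1.0, 0.0], sentence_context=[1.0, 0.0]
)
```

The resulting document vector would then feed a classifier over the positive, neutral, and negative classes. The two-stage design lets the model weight informative words within a sentence separately from informative sentences within a review.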

     
