A Topic Model Based on Weighted Word Co-Occurrence Matrix and User Topic Relationships


Ziqi Xu1,2, Bo Cheng1,2, Kang Yang3, Lili Zhong3, Yan Tang3; 1Shenzhen University, China; 2Beijing University of Posts and Telecommunications, China; 3Ping An Bank Co., Ltd., China


Intelligent customer service has been widely adopted across industries, so understanding customer intent more accurately and extracting key information have become active research topics. However, customer service dialog texts are short, specialized, and sparse, which leads to poor performance from traditional topic extraction methods. To address these characteristics, this paper proposes WCMUT-HDP, a topic model based on a weighted word co-occurrence matrix and user topic relations. The model introduces a semantically weighted word co-occurrence matrix to mine the statistical and semantic features of customer service texts and to improve clustering. To exploit the structure of customer service dialogues, it also incorporates the temporal and author attributes of each dialog into topic recognition, which helps extract the user's intention accurately. Because WCMUT-HDP is based on the Dirichlet process, the number of topics does not need to be specified in advance, which saves the time otherwise spent on parameter experiments and evaluation. Experimental results show that WCMUT-HDP can effectively identify the topics of customer service conversations and that the extracted topics accurately reflect the user's conversational intent.
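
As a rough illustration of the idea of a semantically weighted word co-occurrence matrix, the sketch below accumulates a similarity weight for each word pair that appears together in a short text, where a plain co-occurrence matrix would add a raw count of 1. This is not the WCMUT-HDP formulation: the abstract does not state the weighting scheme, so the function names `weighted_cooccurrence` and `toy_similarity`, the similarity rule, and the sample dialog tokens are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def weighted_cooccurrence(docs, similarity):
    """Build a symmetric, similarity-weighted word co-occurrence matrix.

    docs: list of token lists, one per short text (hypothetical input).
    similarity: assumed function (w1, w2) -> weight, e.g. in [0, 1].
    Returns a dict keyed by alphabetically ordered word pairs.
    """
    matrix = defaultdict(float)
    for tokens in docs:
        # Each unordered pair of distinct words counts once per text.
        for w1, w2 in combinations(sorted(set(tokens)), 2):
            matrix[(w1, w2)] += similarity(w1, w2)
    return dict(matrix)


def toy_similarity(w1, w2):
    # Placeholder for a real semantic similarity (e.g. embedding cosine).
    # Assumption: words sharing a first letter are treated as "similar".
    return 1.0 if w1[0] == w2[0] else 0.5


# Hypothetical tokenized customer service utterances.
docs = [["card", "credit", "limit"], ["card", "limit", "raise"]]
cooc = weighted_cooccurrence(docs, toy_similarity)
```

In practice, the placeholder similarity would be replaced by a measure learned from the corpus, and the resulting matrix would feed the clustering or topic inference step rather than being used on its own.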


Customer Service Text, Word Co-occurrence Matrix, Hierarchical Dirichlet Process, Topic Model

Volume 13, Number 9