CFDP 2176

Attribute Sentiment Scoring With Online Text Reviews: Accounting for Language Structure and Attribute Self-Selection


Publication Date: May 2019

Pages: 55


The authors address two novel and significant challenges in using online text reviews to obtain attribute-level ratings. First, they introduce the problem of inferring attribute-level sentiment from text data to the marketing literature and develop a deep learning model to address it. While extant bag-of-words topic models are fairly good at attribute discovery based on the frequency of word or phrase occurrences, associating sentiments with attributes requires exploiting the spatial and sequential structure of language. Second, they illustrate how to correct for attribute self-selection (reviewers choose the subset of attributes to write about) in metrics of attribute-level restaurant performance. Using reviews for empirical illustration, they find that a hybrid deep learning (CNN-LSTM) model, in which the CNN and LSTM exploit the spatial and sequential structure of language respectively, provides the best performance in accuracy, training speed, and training data size requirements. The model does particularly well on the “hard” sentiment classification problems. Further, accounting for attribute self-selection significantly impacts sentiment scores, especially on attributes that are frequently missing.


Keywords: Text mining, Natural language processing (NLP), Convolutional neural networks (CNN), Long short-term memory (LSTM) networks, Deep learning, Lexicons, Endogeneity, Self-selection, Online reviews, Online ratings, Customer satisfaction

JEL Classification Codes: M1, M3, C8, C5

See CFDP Version(s): CFDP 2176R