Using online reviews for evaluating quality
Photo by Michele Bitetto on Unsplash.
Marco Brambilla, Using Online Reviews for Evaluating the Quality of Cultural Tourism, Towards Data Science, 3 December 2021
The Experience of Italian Museums
In the cultural tourism field, there has been an increasing interest in adopting data-driven approaches that are aimed at measuring the quality of service through online reviews. Online reviews have long represented a valuable source for data analysis in the tourism field, but these data sources have been mostly studied in terms of the numerical ratings offered by the review platforms.
Top-down vs. bottom-up data-driven quality assessment
In a recent article (available in full open access), we explored whether social media and online review platforms can be a good source for the quantitative evaluation of the service quality of cultural venues such as museums and theaters. The paper analyses online reviews automatically, comparing two automated approaches to determine which is better suited to assessing quality dimensions. The analysis covers user-generated reviews of the top 100 Italian museums.
Specifically, we compare two approaches:
- a ‘top-down’ approach based on supervised classification, with classes derived from the strategic choices defined in policy makers’ guidelines at the national level; and
- a ‘bottom-up’ approach that is based on an unsupervised topic model of the online words of reviewers.
We compare the resulting museum quality dimensions, showing that the ‘bottom-up’ approach reveals additional quality dimensions compared with those obtained through the ‘top-down’ approach.
The misalignment of the results of the ‘top-down’ strategic studies and ‘bottom-up’ data-driven approaches highlights how data science can offer an important contribution to decision making in cultural tourism.
Questions on quality dimensions
The study addresses three questions:
- Which museum quality dimensions are identified following a ‘top-down’ approach from a strategic, centralized perspective?
- Which museum quality dimensions are identified following a ‘bottom-up’ data-driven approach on online reviews?
- How are the quality dimensions in the two approaches different?
In the ‘top-down’ approach, a predefined set of dimensions is defined by the decision maker (i.e., the policy maker at the national level), and we use a keyword-based classifier to analyse the expected dimensions in the text of online reviews.
In the ‘bottom-up’ approach, the latent quality dimensions are derived directly from the textual description of the visitors’ experiences by relying on unsupervised topic analysis, without imposing a predefined set of quality dimensions.
Quality dimensions chosen by the decision maker can be very different from the aspects perceived as quality dimensions by museum visitors.
Dataset preparation and exploration
Data from online reviews were collected from the TripAdvisor pages of the top 100 Italian public museums, as identified by the Italian Ministry of Cultural Heritage and Activities and Tourism. Once collected, the online reviews were enriched through a language detection phase. Finally, both analysis approaches were applied to the same dataset of 14,250 Italian reviews.
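The article does not say which language detector was used. As a rough illustration of the filtering step only, the sketch below keeps reviews that look Italian by using a naive stopword-overlap heuristic as a stand-in for a proper language detection library. The stopword lists and the `guess_language` helper are hypothetical.

```python
import re

# Hypothetical, deliberately tiny stopword lists; a real pipeline would use
# a dedicated language detection library instead of this heuristic.
STOPWORDS = {
    "it": {"il", "la", "di", "che", "e", "è", "un", "una", "per", "non",
           "molto", "con", "del", "della", "sono"},
    "en": {"the", "and", "is", "of", "to", "was", "very", "for", "with",
           "it", "this", "are", "were"},
}


def guess_language(text):
    """Return the language whose stopwords occur most often, or 'unknown'."""
    tokens = re.findall(r"\w+", text.lower())
    counts = {
        lang: sum(tok in words for tok in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "unknown"


reviews = [
    "Il museo è molto bello e la visita è stata piacevole",
    "The museum is beautiful and the staff was kind",
]
# Keep only the reviews guessed to be Italian.
italian_reviews = [r for r in reviews if guess_language(r) == "it"]
print(italian_reviews)
```

A heuristic like this is only reliable on reasonably long texts; very short reviews may contain no stopwords at all and end up as "unknown".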
A top-down approach relies on a decision maker’s subjective choice of which quality measures of the service or product to consider.
In our empirical context, the decision maker is represented by the policy maker “Italian Ministry of Cultural Heritage and Tourism”, which in 2018 introduced a set of quality standards for public museums. Based on these guidelines and on interviews with policy makers, we identified five quality dimensions that follow the ‘top-down’ perspective: Ticketing and Welcoming, Space, Comfort, Activities, and Communication. Each of these dimensions was treated as a class in a classification problem over user reviews. The ‘top-down’ approach allowed us to tag each review as descriptive of one of those five dimensions.
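To make the keyword-based variant of this tagging concrete, here is a minimal sketch. The five dimension names and the "Other Aspects" fallback come from the study, but the Italian keyword lists and the `tag_review` function are invented for illustration; the actual lists derived from the ministry's standards are not reproduced here.

```python
import re

# Hypothetical keyword lists (Italian, since the dataset is Italian reviews).
DIMENSION_KEYWORDS = {
    "Ticketing and Welcoming": {"biglietto", "coda", "prenotazione", "personale"},
    "Space": {"sala", "edificio", "giardino", "percorso"},
    "Comfort": {"bagni", "panchine", "affollato", "caldo"},
    "Activities": {"laboratorio", "mostra", "guida", "evento"},
    "Communication": {"cartelli", "didascalie", "audioguida", "sito"},
}
OTHER = "Other Aspects"


def tag_review(text):
    """Tag a review with the dimension whose keywords it mentions most."""
    tokens = re.findall(r"\w+", text.lower())
    hits = {
        dimension: sum(tok in keywords for tok in tokens)
        for dimension, keywords in DIMENSION_KEYWORDS.items()
    }
    # Ties go to the dimension listed first (an arbitrary choice).
    best = max(hits, key=hits.get)
    # Reviews that mention no keyword at all fall outside the five dimensions.
    return best if hits[best] > 0 else OTHER


print(tag_review("Coda lunghissima per il biglietto"))
print(tag_review("Capolavori straordinari"))
```

Exact-match keywords miss inflected forms (for example plural nouns), so a practical tagger would likely add stemming or lemmatisation.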
Classification was implemented both as a machine learning classification problem and as keyword-based tagging (i.e., each dimension was associated with a set of keywords expected to be representative of that dimension). In particular, we compared the keyword-based classifier with a Bidirectional Encoder Representations from Transformers (BERT) language model. Performance was checked against a manually tagged set of 1,000 reviews. The keyword-based method obtained an average accuracy of 80% and recall of 50% across the five classes (Table 2), while the BERT method obtained 88.2% accuracy and 58% recall. Both results suffered from the highly unbalanced distribution of the data across the classes.
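The article does not spell out how the scores were averaged. One plausible reading, assumed in the sketch below, is that accuracy and recall are computed per class (one class against the rest) and then averaged across classes. The toy labels are invented; they only illustrate why accuracy can stay high while recall drops when classes are unbalanced.

```python
def per_class_scores(y_true, y_pred, labels):
    """Return {label: (accuracy, recall)} computed one-vs-rest."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        tn = len(pairs) - tp - fn - fp
        accuracy = (tp + tn) / len(pairs)
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        scores[label] = (accuracy, recall)
    return scores


def macro_average(scores):
    """Average accuracy and recall over classes, weighting classes equally."""
    n = len(scores)
    mean_accuracy = sum(acc for acc, _ in scores.values()) / n
    mean_recall = sum(rec for _, rec in scores.values()) / n
    return mean_accuracy, mean_recall


# Toy, unbalanced example: 8 "Space" reviews and only 2 "Comfort" reviews.
y_true = ["Space"] * 8 + ["Comfort"] * 2
y_pred = ["Space"] * 8 + ["Comfort", "Space"]  # one rare-class review missed

scores = per_class_scores(y_true, y_pred, ["Space", "Comfort"])
mean_accuracy, mean_recall = macro_average(scores)
print(scores)
print(mean_accuracy, mean_recall)
```

In the toy data a single missed review halves the recall of the rare class, while its one-vs-rest accuracy barely moves because true negatives dominate.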
The ‘bottom-up’ approach exploits online reviews without any predefined expectation regarding the dimensions of the visit; instead, it derives the latent dimensions of the experience directly from the reviewers’ words.
From an analytical perspective, the ‘bottom-up’ approach was implemented through unsupervised topic modelling, namely LDA (Latent Dirichlet Allocation), tuned over a range of up to 30 topics. The best ‘bottom-up’ model we selected identifies 13 latent dimensions in the review texts.
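For readers unfamiliar with LDA, the sketch below fits a tiny model with a collapsed Gibbs sampler written in plain Python. It is not the implementation used in the paper (the article does not name a library), and the toy corpus, the hyperparameters `alpha` and `beta`, and the function names are all assumptions. Selecting the number of topics over a range, as done in the study, is not shown.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA on tokenised docs with collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per doc/topic
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # tokens per topic
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token, then resample its topic given all others.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed per-document topic probabilities.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return theta, topic_word


def top_words(topic_word, n=3):
    """List the most frequent words assigned to each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0]
            for counts in topic_word]


# Toy, already-tokenised reviews (invented for illustration).
docs = [
    "quadri affreschi sculture capolavori quadri".split(),
    "biglietto coda personale biglietto prenotazione".split(),
    "affreschi capolavori sculture quadri".split(),
    "coda personale prenotazione biglietto".split(),
]
theta, topic_word = fit_lda(docs, num_topics=2)
for d, probs in enumerate(theta):
    print(d, [round(p, 2) for p in probs])
print(top_words(topic_word))
```

Each row of `theta` is a probability distribution over topics for one review, which is the quantity behind the "average probability" figures reported in the results.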
For better interpretability, we manually grouped the resulting latent topics of the discussion into three main ‘bottom-up’ dimensions of museum quality, which we interpreted as Museum Cultural Heritage, Personal Experience and Museum Services.
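Once topics are grouped, a review's probability for each grouped dimension can be obtained by summing the probabilities of the topics in that group. The sketch below assumes this simple summation; which of the 13 topics belongs to which dimension is not listed in this article, so the `DIMENSION_TOPICS` assignment and the example probabilities are hypothetical.

```python
# Hypothetical assignment of the 13 topic indices to the three dimensions.
DIMENSION_TOPICS = {
    "Museum Cultural Heritage": [0, 1, 2, 3, 4],
    "Personal Experience": [5, 6, 7, 8],
    "Museum Services": [9, 10, 11, 12],
}


def group_topics(topic_probs):
    """Sum a review's topic probabilities within each grouped dimension."""
    return {
        dimension: sum(topic_probs[t] for t in topics)
        for dimension, topics in DIMENSION_TOPICS.items()
    }


# Invented topic distribution for one review (13 values summing to 1).
review_topics = [0.20, 0.10, 0.05, 0.05, 0.10,
                 0.10, 0.05, 0.05, 0.05,
                 0.10, 0.05, 0.05, 0.05]
grouped = group_topics(review_topics)
print(grouped)
```

Because every topic belongs to exactly one group, the grouped values still sum to one and can be read as a distribution over the three dimensions.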
Results and comparison
For detailed results I refer the reader to the full paper; here I summarize the main findings:
- The ‘top-down’ approach (based on a set of keywords defined from the standards issued by the policy maker) resulted in 63% of online reviews not fitting into any of the predefined quality dimensions;
- The ‘bottom-up’ data-driven approach overcomes this limitation by searching for the aspects of interest in the reviewers’ own words, without assuming in advance how many quality dimensions a museum has or what they are;
- In particular, when analyzing reviews tagged as Other Aspects, the ‘bottom-up’ analysis recognises Museum Services with a 21% average probability. This means that the policy maker would also be able to detect further aspects related to service quality through the ‘bottom-up’ approach;
- Based on topic analysis, we see that museum reviews discuss a museum’s cultural heritage aspects (46% average probability) and personal experiences (31% average probability) more than the services offered by the museum (23% average probability);
- This means that most of the visitors’ attention is focused not on services but on the actual content quality of the museum.
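The "average probability" figures in the findings above can be read as means of per-review dimension probabilities, computed either over all reviews or over the subset carrying a given ‘top-down’ tag. The sketch below assumes that reading; the three records and the `average_dimensions` helper are invented and do not reproduce the study's numbers.

```python
from statistics import mean

# Invented records: each review has a 'top-down' tag and 'bottom-up'
# probabilities for the three grouped dimensions.
reviews = [
    {"tag": "Other Aspects",
     "dims": {"Museum Cultural Heritage": 0.6,
              "Personal Experience": 0.3,
              "Museum Services": 0.1}},
    {"tag": "Other Aspects",
     "dims": {"Museum Cultural Heritage": 0.5,
              "Personal Experience": 0.2,
              "Museum Services": 0.3}},
    {"tag": "Comfort",
     "dims": {"Museum Cultural Heritage": 0.2,
              "Personal Experience": 0.2,
              "Museum Services": 0.6}},
]


def average_dimensions(records, tag=None):
    """Mean probability per dimension, optionally for one 'top-down' tag."""
    selected = [r for r in records if tag is None or r["tag"] == tag]
    if not selected:
        return {}
    dimensions = selected[0]["dims"].keys()
    return {d: mean(r["dims"][d] for r in selected) for d in dimensions}


overall = average_dimensions(reviews)
other_only = average_dimensions(reviews, tag="Other Aspects")
print(overall)
print(other_only)
```

Restricting the average to reviews tagged Other Aspects is what shows how much service-related content the keyword-based tagging leaves unclassified.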
Among the various quantitative findings of the study, I think the most important point is that the aspects considered as quality dimensions by the decision maker can differ greatly from those perceived as quality dimensions by museum visitors. Using a ‘top-down’ approach within the specific setting of museums, most of the reviews (63%) do not relate to the museum service quality dimensions defined by the policy maker. This is because museum visitors cherish quality dimensions beyond those of museum services (23%), placing more emphasis on cultural heritage (46%) and personal experiences (31%).
You can find out more about this analysis by reading the full article published online as open-access. The full reference to the paper is:
Agostino, D.; Brambilla, M.; Pavanetto, S.; Riva, P. The Contribution of Online Reviews for Quality Evaluation of Cultural Tourism Offers: The Experience of Italian Museums. Sustainability 2021, 13, 13340. https://doi.org/10.3390/su132313340