EXTRACTION OF TOURIST ATTENTION POINTS FROM LOW-RATED REVIEWS AND CLASSIFICATION BY VIEWPOINT - IHCI 2021

ISBN: 978-989-8704-32-0 © 2021

     EXTRACTION OF TOURIST ATTENTION POINTS
   FROM LOW-RATED REVIEWS AND CLASSIFICATION
                 BY VIEWPOINT

                                         Junichi Fukumoto and Kazuki Ito
                        College of Information Science and Engineering, Ritsumeikan University
                                   1-1-1 Noji-higashi, Kusatsu, Shiga 525-8577 Japan

ABSTRACT
Many Internet sites for tourists host large numbers of positive and negative word-of-mouth reviews posted by
tourists. Negative information can serve as attention points for sightseeing, helping first-time visitors avoid
repeating the same mistakes. The purpose of this research is to extract such negative information from
word-of-mouth reviews as tourist attention points and to classify them so that they are easy to understand. In the
experiments, we successfully extracted attention points from actual tourist reviews and classified them by the
target of each point.

KEYWORDS
Tourism Information, Attention Points, Low-Rated Reviews, Classification, Dependency Relation

1. INTRODUCTION
There are various Internet sites for tourists, and many word-of-mouth reviews are posted on them. When people
plan to visit a sightseeing spot for the first time, they refer to the reviews posted on these sites to obtain
useful information and select destinations. Information posted on the Internet has been used to improve
sightseeing satisfaction and to recommend appropriate places to visit and stay (Dinçer, 2017) (Han, 2020)
(Ogawa, 2014). Reviews contain various positive and negative opinions, such as recommendations of a location,
expressions of dissatisfaction, and stories of failed experiences. The important point is that negative
information can be used as “attention points for sightseeing” to prevent the same mistakes. Some approaches
use negative information for product improvement (Kurihara, 2014) (Ohmori, 2012). In tourism, for example, a
reader of the review "There is no toilet in the castle and it is a little unsuitable for children because of the
stairs" knows to use a toilet before entering the castle. Now that a huge number of reviews has been posted,
many of them include tourist attention points. However, reading all such reviews to find the attention points
in negative reviews is difficult and time-consuming.
    The purpose of this research is to extract such negative information from the huge amount of
word-of-mouth as tourist attention points and present it to the user in an easy-to-understand manner. Based on
our preliminary survey of reviews, we focus on low-rated reviews of 1 to 3 stars, which contain many of the
contributors’ dissatisfactions and complaints. We extract attention points from sentences that include negative
expressions and classify the extracted attention points based on the viewpoint of the negative evaluation.
    In the following sections, we present our proposed method, which consists of a method for extracting
attention points and a method for classifying the extracted attention points. In the experiment, we show sample
extraction and classification results using tourist reviews of Wakamatsu Castle, and we discuss the extraction
results and some problems in our current method.

International Conferences Computer Graphics, Visualization, Computer Vision and Image Processing 2021;
                                                                                      Connected Smart Cities 2021;
                                          and Big Data Analytics, Data Mining and Computational Intelligence 2021

2. PROPOSED METHOD

2.1 Extraction of Attention Points from Negative Reviews
From the tourist site “Jalan net”, we obtained 3814 reviews of 1 to 3 stars related to the castle by web
scraping. To select negative reviews effectively, we used the sentiment analysis module of the Watson NLU tool
to classify the reviews as positive, neutral, or negative. As a result, we obtained 1260 negative reviews. To
extract attention points for tourists, we used negative expressions and the dependency structure of negative
review sentences.
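
The selection step described above can be sketched as follows. This is only an illustration: the `Review` type, the function names, and the `toy_sentiment` stand-in are assumptions made for the example, and the stand-in does not reproduce the Watson NLU interface. Any classifier that returns "positive", "neutral", or "negative" could be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Review:
    text: str
    stars: int  # rating given by the reviewer (1 to 5)


def select_negative_reviews(
    reviews: Iterable[Review],
    sentiment_of: Callable[[str], str],
    max_stars: int = 3,
) -> List[Review]:
    """Keep low-rated reviews whose document-level sentiment is negative."""
    low_rated = [r for r in reviews if 1 <= r.stars <= max_stars]
    return [r for r in low_rated if sentiment_of(r.text) == "negative"]


def toy_sentiment(text: str) -> str:
    # Hypothetical stand-in for a sentiment service (NOT the Watson NLU API).
    if "残念" in text or "汚い" in text:
        return "negative"
    if "良かった" in text:
        return "positive"
    return "neutral"


reviews = [
    Review("天守閣が工事中で残念", 2),  # low-rated and negative: kept
    Review("眺めが良かった", 3),        # low-rated but positive: dropped
    Review("トイレが汚い", 5),          # negative but high-rated: dropped
]
negative = select_negative_reviews(reviews, toy_sentiment)
```

Filtering on the star rating first keeps the number of calls to the sentiment classifier small, which matters if that classifier is a remote service.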
    First, all negative review documents are divided into sentences at Japanese sentence separators such as
punctuation marks. Next, the sentences are morphologically analyzed with the Japanese morphological analyzer
MeCab and the NEologd dictionary, and syntactically analyzed with the Japanese dependency analyzer CaboCha.
We extract attention points using the dependency relations of the syntactic structure according to the
following patterns.

  Pattern 1: extract the phrases that depend on a negative expression, and the phrases that depend on those phrases
  Pattern 2: extract the phrases on which a negative expression depends, and the phrases that depend on those phrases
  Pattern 3: if there is a negative-expression sentence, extract the next sentence
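
One possible reading of the three patterns is sketched below on hand-written dependency structures. The `Chunk` type, the function names, and the toy parses are assumptions for illustration and are not CaboCha output. Pattern 2 in particular is interpreted here as "the phrase the negative expression depends on, plus the other phrases depending on that phrase". Adverbial phrases are skipped, as the method specifies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    text: str
    head: int                  # index of the chunk this one depends on, -1 for the root
    is_adverbial: bool = False


def modifiers_of(chunks: List[Chunk], target: int) -> List[int]:
    """Indices of non-adverbial chunks that depend directly on `target`."""
    return [i for i, c in enumerate(chunks)
            if c.head == target and not c.is_adverbial]


def pattern1(chunks: List[Chunk], neg: int) -> List[int]:
    """Chunks depending on the negative chunk, plus chunks depending on those."""
    first = modifiers_of(chunks, neg)
    second = [j for i in first for j in modifiers_of(chunks, i)]
    return sorted(set(first + second))


def pattern2(chunks: List[Chunk], neg: int) -> List[int]:
    """The chunk the negative chunk depends on, plus its other modifiers."""
    head = chunks[neg].head
    if head < 0:
        return []
    others = [i for i in modifiers_of(chunks, head) if i != neg]
    return sorted(set([head] + others))


def pattern3(sentences: List[str], neg_sentence: int) -> Optional[str]:
    """The sentence that follows the negative-expression sentence, if any."""
    nxt = neg_sentence + 1
    return sentences[nxt] if nxt < len(sentences) else None


# Toy parse: 階段が -> 多くて -> 辛かった, with an adverb とても -> 辛かった
chunks = [
    Chunk("階段が", 1),
    Chunk("多くて", 3),
    Chunk("とても", 3, is_adverbial=True),
    Chunk("辛かった", -1),   # contains the negative expression 辛い
]
# Toy parse: 狭い -> 階段を -> 登った
chunks2 = [
    Chunk("狭い", 1),        # negative expression
    Chunk("階段を", 2),
    Chunk("登った", -1),
]

p1 = [chunks[i].text for i in pattern1(chunks, 3)]
p2 = [chunks2[i].text for i in pattern2(chunks2, 0)]
```

In the first toy parse, pattern 1 collects 階段が and 多くて and leaves out the adverb; in the second, pattern 2 collects 階段を, the phrase that 狭い modifies.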

    In patterns 1 and 2, the elements related to a negative expression are extracted using the dependency
structure; adverb phrases, however, are not extracted. Pattern 3 covers the case in which an unpleasant
situation is described and only a negative impression is added after it. This type of description is often
used in blogs. The Japanese negative expressions we prepared are shown in Table 1, with English translations
in parentheses.
                                  Table 1. A list of Japanese negative expressions

  残念(unfortunate), 小さい(small), 大変(hard), 狭い(narrow), がっかり(disappointed), 暑い(hot),
  断念(give up), キツイ(hard), 諦め(give up), しんどい(hard), 苦労(hardship), 足りない(not enough),
  後悔(regret), 悪い(bad), 邪魔(disturbing), 厳しい(severe), 退屈(boring), 重い(heavy), 嫌悪(disgust),
  微妙(negative mood), 汚い(dirty), 不便(inconvenient), 怖い(scary), 興ざめ(disappointed), 辛い(hard),
  うるさい(noisy), 疲労(fatigue), 危ない(dangerous), 不満(dissatisfied), 難しい(difficult),
  注意(attention), 遠い(far), こじんまり(small), 退屈(boring), 苦労(hard), 混雑(crowded)
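
As a rough sketch of how sentences might be split and checked against this list: the separator set, the function names, and the small lexicon subset below are assumptions for illustration. The base forms are written by hand here; in the described pipeline they would come from the morphological analyzer, which is what lets an inflected form such as 汚く match the entry 汚い.

```python
import re
from typing import List, Sequence

# A small subset of Table 1, for illustration only.
NEGATIVE_EXPRESSIONS = {"残念", "大変", "狭い", "悪い", "汚い", "辛い"}

# Assumed separators: Japanese and ASCII sentence-final punctuation.
_SENTENCE = re.compile(r"[^。！？!?]+[。！？!?]*")


def split_sentences(text: str) -> List[str]:
    """Split review text into sentences, keeping the final punctuation."""
    found = (m.group().strip() for m in _SENTENCE.finditer(text))
    return [s for s in found if s]


def negative_expressions_in(base_forms: Sequence[str]) -> List[str]:
    """Return the base forms that appear in the negative-expression list."""
    return [b for b in base_forms if b in NEGATIVE_EXPRESSIONS]


sentences = split_sentences("天守閣は狭い。トイレが汚い！残念")
# Hand-written base forms for the fragment トイレが汚く
hits = negative_expressions_in(["トイレ", "が", "汚い"])
```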

2.2 Classification of Attention Points
Many kinds of attention points are extracted from negative reviews. We classify the attention points by the
viewpoint of their description, which makes them easier to check. An effective basis for classification is
the word most closely related to the negative expression in an attention point. To choose this related word,
we use the dependency analysis of the attention point description.
    For extraction pattern 1 above, a noun or proper noun in the phrase that depends on the negative
expression becomes the classification clue. If that phrase contains no noun, the phrase that depends on it is
checked for a noun or proper noun, and this is repeated until one is found; the extracted noun or proper noun
becomes the classification clue. All classification clue words are extracted from pattern 1 sentences, because
pattern 2 and 3 attention points have no such related words. Pattern 2 and 3 attention points are instead
classified according to which words from this list of clues they contain. We apply a stop word list of
Japanese one-character words, such as “中 (middle)”, “上 (upper)”, and “事 (thing)”, to the extracted
classification clues to exclude meaningless clues.
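
A minimal sketch of the stop-word filtering and of the clue-based grouping of pattern 2 and 3 attention points is given below. The function names are hypothetical, and two details are assumptions rather than statements of the method: every single-character clue is treated as a stop word, and a point is assigned to a clue when the clue occurs in it as a substring.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def filter_clues(clues: Iterable[str]) -> List[str]:
    """Drop single-character clues (e.g. 中, 上, 事) and duplicates."""
    seen = set()
    kept = []
    for clue in clues:
        if len(clue) > 1 and clue not in seen:
            seen.add(clue)
            kept.append(clue)
    return kept


def group_by_clue(points: Iterable[str], clues: Iterable[str]) -> Dict[str, List[str]]:
    """Assign each attention point to every clue word it contains."""
    clue_list = list(clues)
    groups: Dict[str, List[str]] = defaultdict(list)
    for point in points:
        for clue in clue_list:
            if clue in point:
                groups[clue].append(point)
    return dict(groups)


clues = filter_clues(["階段", "中", "トイレ", "天守閣", "事", "階段"])
points = [
    "トイレが汚くて",
    "天守閣への入場が大変",
    "お城の中にトイレが無いので幼年には少し不向き",
]
groups = group_by_clue(points, clues)
```

A point that mentions several clue words lands in several groups, which matches the way one review can raise more than one concern.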


    Figure 1 shows an example of the dependency analysis of an extracted attention point; note that Japanese
word order differs from that of the English translation. Word-level translations are shown in brackets below
the words, and the full translation is at the bottom of the figure. In this example, the phrase “大変です
(hard)” is the negative expression. The phrase “行くのが (go to)” modifies the negative expression but is
not a noun. The phrase “天守閣が (the castle tower)” modifies the phrase “行くのが (go to)” and is a
noun, so it becomes the classification clue.

                            Figure 1. Example of dependency analysis of attention point
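
The clue search in this example can be sketched as a walk over the dependency structure, starting at the negative expression and moving outward through its modifiers until a noun is found. The `Chunk` type and `find_clue` are hypothetical names, and the parse below is written by hand from the three phrases mentioned in the text; the full sentence in the figure may contain more phrases.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    text: str
    head: int                   # index of the chunk this one modifies, -1 for the root
    noun: Optional[str] = None  # noun or proper noun in the chunk, if any


def find_clue(chunks: List[Chunk], neg: int) -> Optional[str]:
    """Return the first noun found among the modifiers of chunk `neg`,
    checking direct modifiers first and then modifiers of modifiers."""
    frontier = [i for i, c in enumerate(chunks) if c.head == neg]
    visited = set()
    while frontier:
        next_frontier = []
        for i in frontier:
            if i in visited:
                continue
            visited.add(i)
            if chunks[i].noun:
                return chunks[i].noun
            next_frontier.extend(j for j, c in enumerate(chunks) if c.head == i)
        frontier = next_frontier
    return None


# 天守閣が -> 行くのが -> 大変です
chunks = [
    Chunk("天守閣が", 1, noun="天守閣"),
    Chunk("行くのが", 2),    # modifies the negative expression, but has no noun
    Chunk("大変です", -1),   # negative expression
]
clue = find_clue(chunks, 2)
```

Here the direct modifier 行くのが yields no noun, so the search moves one step further and returns 天守閣.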

3. EXPERIMENTS
We used 259 reviews of Wakamatsu Castle for the experiment. Some of the extraction results are shown in
Table 2, with English translations in parentheses.

                                 Table 2. Example of extraction of attention point
Negative expressions   Attention points
悪い (bad)             ライトアップしてましたが天気が悪かった (It was lit up, but the weather was bad.)
難しい (difficult)     登りは階段なので足が不自由な方は難しい (The way up is by stairs, so it is difficult
                       for people with disabilities.)
大変 (hard)            子供も段差が大変 (The steps are also hard for children.)
悪い (bad)             全体的にみれば悪い場所ではないので商売っ気が少々強すぎる気がして、
                       私には合わない場所でした (Overall, it's not a bad place, so I felt that the
                       commercialism was a little too strong, and it was a place that didn't suit me.)
汚い (dirty)           トイレが汚く (The toilet is dirty.)
残念 (disappointing)   こちらで残念だったのは入場券を購入する際に何も言わないと茶室の入場券付の
                       チケットがくる事、その茶室に行くとまた別料金で抹茶和菓子が如何か聞かれる事、
                       城の出口に向かうと売店を通るようになる事が興ざめ (What was disappointing here
                       was that if you don't say anything when purchasing the admission ticket, you get a
                       ticket that includes admission to the tea room; that when you go to the tea room, you
                       are asked whether you would like matcha and Japanese sweets for another extra charge;
                       and that heading to the castle exit takes you through the shop, which was a letdown.)

    To extract classification clues, we applied our method to the extracted attention points. Of the
extracted classification clues, 10 single-character clues were excluded by the stop word rule. Sample
classification results for the clue words “階段 (stairs)”, “トイレ (toilet)”, and “天守閣 (castle tower)”
are shown below.


clue word: “階段 (stairs)”
登りは階段なので足が不自由な方は難しい (The way up is by stairs, so it is difficult for people with
disabilities.)
階段は小さい子は抱っこしなきゃいけないし、狭いしきつい (The stairs are narrow and tough, and small
children have to be carried.)
階段しかないので、抱っこして上まであがるのが大変 (Since there are only stairs, it is difficult to hold
children and climb up.)
ご年配の方には階段を昇るは大変 (It is difficult for elderly people to climb the stairs.)
階段が多くて当日は筋肉痛も有ったので辛かった (It was painful because there were many stairs and I
had muscle pain on the day.)
2 回目の鶴ヶ城!修学旅行生で混んで居るけど学べましたまた桜の時期に行きたいです。年配の人
には階段キツイ (My second visit to Tsuruga Castle! It was crowded with students on school trips, but I
learned a lot. I want to go again in the cherry blossom season. The stairs are hard for older people.)

clue word: “トイレ (toilet)”
トイレが汚くて (The toilet is dirty.)
お城の中を見ていて、子供がトイレに行きたくなり、中になく外まで行かないとないので大変で
す (It's hard because, while we were looking around inside the castle, my child wanted to go to the toilet,
and there is none inside, so we had to go all the way outside.)
お城の中にトイレが無いので幼年には少し不向きそれだけのために階段の上り下り歩いて入口ま
で戻るのは辛い (Since there is no toilet in the castle, it is a little unsuitable for small children; it is
hard to walk up and down the stairs and back to the entrance just for that.)

clue word: “天守閣 (castle tower)”
ただ連休中だったので天守閣の回りは人で溢れて一周回るのが一苦労 (However, since it was a
consecutive holiday, it was difficult to go around the castle tower because it was full of people.)
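時間があれば、若松城周辺の公園等はじめ見所がたくさんあり楽しみがあります、強いて言えば、天守閣の
窓が、普通の窓であるのが残念 (If you have time, there are lots of things to see, such as the parks around
Aizuwakamatsu Castle, and you can have fun. If I had to name one thing, it is a pity that the windows of the
castle tower are ordinary windows.)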
最上階天守閣は・・・思ったより狭い (The castle tower on the top floor is ... narrower than I expected.)
だけど、お城の中は年表やら何藩がどうだとかのボードばかりで、当時の物の展示が少なく、天
守閣からの眺めは良かったですが、それ以外は残念 (However, in the castle, there are only boards such
as the chronological table and what kind of clan it is, there are few exhibits of things at that time, and the
view from the castle tower was good, but other than that, it is disappointing.)
天守閣への入場が大変 (Admission to the castle tower is difficult.)
天守閣も工事中で外の景色が見れず残念 (It's a pity that the castle tower is also under construction and I
couldn't see the outside scenery.)
期待をして行ったが、天守閣の修理中で(中は入れる)外観および天守閣から風景を見ることが
できず残念 (I went with high expectations, but unfortunately the castle tower was under repair (the inside
can be entered), and I couldn't see the exterior or the scenery from the castle tower.)
残念ながら…屋根の瓦のふきかえ工事にあたってしまいお城が見れませんでしたが、逆に工事の
方がめずらしいので天守閣などに使う瓦に名前が刻める寄付的なコトもあり…記念にどうでしょ
う? (Unfortunately, ... I couldn't see the castle because my visit fell during the re-tiling of the roof, but
on the other hand, the construction is a rare opportunity, and you can make a donation to have your name
engraved on the tiles used for the castle tower, etc. ... how about it as a memento?)
雨天で天守閣からの眺めは厚い雲で覆われて山々がと見えず、残念 (Unfortunately, the view from the
castle tower was covered with thick clouds and I couldn’t see the mountains due to rain.)


4. DISCUSSIONS
As for the extraction of attention points, in the 1 to 3-star negative reviews, sentences with negative
expressions frequently co-occurred with attention points. When a sentence consists only of a negative
expression, extracting the adjacent sentence also works well for capturing the attention point. When a
sentence uses negation together with a negative expression, however, the extracted attention point is
actually a positive one. An attention point that appears with a negative expression is extracted, but other
attention points are not. Moreover, a long sentence that includes a negative expression may contain several
attention points and some positive points. In such cases it is impossible to extract all the attention points,
and a positive point might be extracted by mistake. It is also necessary to make the extracted negative
points more compact.
    As for choosing classification clues, we choose a noun or proper noun related to a negative expression
using the syntactic analysis result. This strategy works well because such nouns are closely related to the
negative expression and are therefore appropriate as classification clues. In addition, classification by
clue words made it easier to understand what users were dissatisfied with. To exclude meaningless clue words,
we used a stop word list of Japanese single-character words, but in some cases important words were deleted.
The rule needs to be improved, for example by using word frequencies in the review documents.
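
The negation problem noted above could be approached with a simple surface check, sketched below. This is not part of the method described in this paper: the cue list, the window size, and the function name are assumptions, and a check based on the dependency structure would likely be more reliable than a character window.

```python
import re

# Hypothetical negation cues; this list is illustrative, not exhaustive.
NEGATION = re.compile(r"ない|なかった|ありません")


def is_negated(sentence: str, expression_stem: str, window: int = 8) -> bool:
    """Return True if a negation cue appears within `window` characters
    after the first occurrence of the negative expression's stem."""
    pos = sentence.find(expression_stem)
    if pos < 0:
        return False
    start = pos + len(expression_stem)
    tail = sentence[start:start + window]
    return NEGATION.search(tail) is not None


negated = is_negated("全体的にみれば悪い場所ではないので", "悪い")
plain = is_negated("トイレが汚くて", "汚")
```

With this check, 悪い場所ではない ("not a bad place") is flagged as negated and could be excluded from the attention points, while トイレが汚くて is kept.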

5. CONCLUSION
In this paper, we focused on low-rated reviews and extracted tourist attention points from sentences that
include negative expressions using dependency relations. The extracted tourist attention points were
classified using clue words related to the negative expressions. The extraction and classification of tourist
attention points reached a level sufficient to help tourists understand them. However, the stop word
information and the handling of negated sentences need to be improved. The method also needs to be applied to
more reviews from other domains, such as other sightseeing spots and product evaluations.

ACKNOWLEDGEMENT
The authors would like to express their sincere thanks to the anonymous referees for the useful comments.

REFERENCES
Dinçer, M. D. and Alrawadieh, Z., 2017, Negative Word of Mouse in the Hotel Industry: A Content Analysis of Online
   Reviews on Luxury Hotels in Jordan, Journal of Hospitality Marketing & Management, Vol. 26, No. 8, pp. 785-804.
Han, K. and Kitayama, D., 2020, An Association Method of Tourist Spots Using User Reviews for Advancing
   Explainability, IPSJ Transactions on database, Vol. 13, No. 1, pp. 1-7.
Kurihara, K. and Shimada, K., 2014, Trouble Information Extraction from Twitter based on Bootstrap Method, Proc. of
   the 21th annual meeting of the ANLP, Japan, pp. 341-344. (in Japanese)
Ohmori, N. and Mori, T., 2012, Automatic Extraction of Words Representing Industrial Products and Their Parts:
   Classification Methods According to “Word Tangibility”, The IEICE transactions on information and systems 95(3),
   pp. 697-706. (in Japanese)
Ogawa, K., Sugimoto, Y., et al., 2014, Basic design of a sightseeing recommendation system using Characteristic Words,
   IPSJ SIG on DPS, 2014-DPS-159 (14), pp. 1-6. (in Japanese)
