Cheating in Ranking Systems

Lihi Dery∗        Dror Hermel†        Artyom Jelnov‡
May 23, 2019

Abstract

Consider an application (an app) sold on an on-line platform (e.g., Google Play). The app pays a commission fee and is thereafter offered for sale on the platform. The ability to sell the application depends on its customer ranking; therefore, developers may have an incentive to promote their application's ranking in a dishonest manner. One way to do this is by faking positive customer reviews. However, the platform is able to detect dishonest behavior (cheating) with some probability, and then proceeds to decide whether to ban the application. We provide an analysis and find the equilibrium behaviors of both the application's developers (cheat or not) and the platform (setting of the commission fee). We provide initial insights into how the platform's detection accuracy affects the incentives of the app developers.

Acknowledgment. We wish to thank Christopher Thomas Ryan, Yair Tauman, Richard Zeckhauser and the anonymous reviewers for their helpful suggestions.

∗ Ariel University, Israel, lihid@ariel.ac.il
† Ryerson University, Canada, and Ariel University, Israel, drorhe@ariel.ac.il
‡ Ariel University, Israel, artyomj@ariel.ac.il

1     Introduction

Various systems allow users to rate items. Using these ratings, the systems are then able
to present a ranked list of items. Strategic agents may attempt to manipulate these ranked
recommendations in order to increase their personal utility. However, these manipulations
are costly. Furthermore, such manipulation attempts can be identified by inspection, which
is also costly.
    Consider, for example, an application (an app) and an on-line platform (e.g., the Apple
App Store). The app may buy fake ratings, which translate into a higher ranking on the
App Store – a measure many users look for when searching for a new app to download. The
negative impact generated in this scenario is manifold. On the end-user side, money is spent
on an app that does not generate much positive utility. On the developer side, those who
cheat gain profits in the short run, while honest developers suffer a short-term monetary loss
(as the apps they develop are ranked lower).
    For the platform there is reputation loss (which can be associated with lost revenues), as
users may be overly cautious before downloading new apps. Therefore, it is customary for
the platform to use some mechanism to detect and remove cheating apps.
    In this paper, we develop a model that studies the interaction between a platform and an
application. The platform collects a fee from applications that want to use it. We study two
cases: when the fee is exogenously given and when the platform sets the fee. We show
that, when the cheating detection technology is imperfect, the application will cheat with
positive probability. We analyze how the quality of the detection algorithm and app rankings
affect the incentive to cheat. Furthermore, we analyze the platform’s decision as to what
commission fee to impose.
    To the best of our knowledge, this is the first attempt to apply game theory to ranking
systems in practice.

We begin with some background (section 2). We then present our theoretical model
(section 3) and conclude with a discussion of our main findings and future research directions
(section 4).

2     Background

We begin by surveying related work (section 2.1), proceed to provide some intuition as to
why rankings matter enough that people are willing to manipulate ratings in order to receive
a high ranking (section 2.2), and then survey methods for manipulation detection (section
2.3).

2.1    Related Work

We suggest a new focal point for addressing cheating in ranking systems – an approach
related to the well-established inspection games literature (cf. Avenhaus et al., 2002, for a
survey). The substantial difference is that in inspection games one of the players decides
whether to commit some violation and another player, the inspector, decides whether to
perform a costly test to detect this violation, whereas in our model a noisy alert of the
violation is sent automatically.
    The notion of an automatically sent signal based on the action of one of the players
appears in the literature in different contexts: industrial espionage (Barrachina et al., 2014),
international conflicts (Jelnov et al., 2017), and sports (Berentsen, 2002; Kirstein, 2014) to
name a few. In our case, we identify the favorable and adverse effects that arise when the
platform attempts to deter cheaters using an imperfect detection mechanism.
    Our paper is related to the economic law enforcement literature, which goes back to
Becker (1968) and is surveyed in Polinsky and Shavell (2007). In our setting, we have an
enforcer (the platform) and a potential violator (the application). We study a specific kind of
violation: cheating in reviewer ratings. In our setting, this violation depends on the initial
rating. Moreover, the enforcer and the potential violator may have a common interest,
because the application pays a commission fee to the platform.
    The work of Darby and Karni (1973) resembles our topic in the sense that in their paper
a violation is wrong information given by a service supplier to a consumer. They study the
existence of this kind of violation in a free market, and discuss how government intervention
can reduce it. Darby and Karni (1973) do not model strategic behavior by the government.
In our paper, a platform, not a government, enforces honest behavior on an application, and
we incorporate strategic considerations of the platform.
    The literature on economics of tort law, which can be traced to Landes and Posner (1984),
relates to our paper as the application may cause damage to the platform. However, the
tort law literature discusses how to cause one party to take care and prevent accidents which
damage another party. Compensation for damage is the most common tool in tort law. In
our case, no compensation is paid to the platform.

2.2     Why Rankings Matter

It has been shown that a website’s rank, not just its relevance, strongly and significantly
affects the likelihood of a click (Glick et al., 2014).
    As of March 2017, Google had 2.8 million apps available through its Android platform,
and Apple had 2.2 million apps available on its iTunes App Store.¹ With such massive
numbers, users interested in discovering apps rely on rated listings, known as “top charts”,
such as “top free games”, “top free apps”, etc. Furthermore, a study by Carare (2012)
indicates that users are willing to pay $4.50 or more for an app that is top ranked as compared
to the same app that is unranked, as people in general tend to disproportionately select
products that are ranked at the top (Smith and Brynjolfsson, 2001; Cabral and Natividad,
2016). We further emphasize the monetary effects of a higher app ranking by referring the
reader to Lee and Raghu (2014), who claim that one of the keys to a successful app is top
rank status.

   ¹ Statista: The Statistics Portal, https://www.statista.com/statistics/276623/number-of-apps-available-in-leading-app-stores/
   Reviews play a critical role in online commerce (Mauri and Minazzi, 2013). For example,
hotel reviews on websites that customers perceive as credible influence purchase behaviors
(see Casalo et al., 2015). Mayzlin et al. (2014) show that competing products can self-
promote by faking positive reviews for themselves, or negative reviews for competitors. For
analysis of the impact of reputation in e-commerce, see also Resnick and Zeckhauser (2002),
who show that positive reviews of previous online transactions can predict good transactions
in the future. Thus, the monetary gain from a highly ranked app is an incentive for app
developers to boost their app rankings on the charts. In competing over reputation and
higher ranking, product managers might be tempted to engage in manipulative behavior
(Gössling et al., 2016).
   Unfortunately, in the app development context, some developers choose to boost their
rankings in a deceptive manner by paying fraudulent ranking services. If not addressed, these
deceptions harm the entire app ecosystem – developers, users and platform owners. For the
developers, fraudulent ranking leads to unfair competition, which might discourage honest
developers. The users might be led to install misbehaving or malicious apps, or they might
be dissatisfied with platform app ratings and stop trusting the app charts. Finally, the
platform’s reputation might be compromised.

2.3    Manipulation Detection

Detection of deceptions is a top priority for platforms, and is performed by detecting suspi-
cious app patterns, user patterns, or both. A closely related line of work focuses on malware
detection (Burguera et al., 2011; Narudin et al., 2016; Seneviratne et al., 2017). However,
our focus is on ranking fraud, where the manipulating app is not necessarily malware.
   Manipulating apps exhibit a different review pattern when compared with honest apps.
Some algorithms for fake review detection focus on textual analysis of fake reviews, finding
certain language constructs that are often used in fake reviews (e.g., Ott et al., 2011; Hu et al.,
2012; Banerjee et al., 2017), while others (Schuckert et al., 2016) point out the contradictions
between overall (e.g. a hotel) and detailed rating of the same product (e.g. specifics such as
cleaning or location), which can expose fraudulent ratings.
   Manipulated app rankings are likely to generate drastic ranking increases or decreases
in a short time, or show strong rating deviations (see Zhu et al., 2013). Heydari et al.
(2016) show that the time interval in which the rating is given is also a measure of review
trustworthiness. The detection can be based solely on ratings, by analyzing rating shifts,
under the assumption that most ratings are honest (Akoglu et al., 2013; Savage et al., 2015).
In addition, manipulative users have a different review pattern. As the cost of setting up a
valid account is quite high (e.g., to rate an app on Google Play requires a Google account, a
mobile phone registered with the account, and the installed app), manipulators reuse their
account and rate many apps over a short time frame. Manipulators may rate up to 500 apps
a day, rating them all with 5 stars.
   Manipulative users are usually part of a well-organized crowdsourcing system that per-
forms malicious tasks. These are nicknamed “crowdturfing” systems, and their unique fea-
tures have been mapped (see Wang et al., 2012). A recent study by Chen et al. (2017)
focused on identifying app clusters that are co-promoted by collusive attackers. The identi-
fication is based on unusual changes in rating patterns, measuring feature similarity in apps
and applying machine learning techniques. Ye and Akoglu (2015) show that network infor-
mation can also be employed. The baseline assumption is that an honest set of reviews for a
product (or app) is formed by independent reviewer actions with various levels of activities
and reviews. Therefore, a non-manipulative app should have reviews with various levels of
network centrality. Furthermore, correlated review activities can be combined with linguistic
and behavioral review signals (Rahman et al., 2017).
   In the next section, we present a formal model for the interaction of two agents, an
application and a platform. Our model assumes that some (or all) of the above manipulation
detection capabilities are available to the platform.

3     Model

We consider two models: one with an exogenous fee and one with an endogenous fee, i.e., a
fee set by the platform.

3.1    Preliminaries

We consider two risk-neutral agents: an application (A) and a platform (P ). We focus only
on cases when the application decides to enter the platform.
    A rating for each application is calculated periodically. The rating represents the opinion
of the users and is observed both by the application and the platform. Upon entrance, at
stage t0 each application receives an initial rating of r0 = 0. At stages t1 and t2 the application
obtains ratings r1, r2 ∈ [0, 1], respectively. For simplicity, we denote r1 as r. We focus on
the last stage, t2.
    Naturally, for the application, a higher rating results in higher visibility on the platform,
which translates into more profits. We assume the gross profit is proportional to the application's
current rating r2, with a commission fee f (f ≥ 0) paid to the platform as a fraction of it. Thus the
application is left with a revenue of γr2(1 − f), where γ > 0 is the proportionality coefficient.
    In order to increase the rating r2 , the application may decide to cheat (c) (e.g., by adding
fake ratings). If the application does not cheat (ĉ) it still has a probability l(r) to obtain the
highest rating r2 = 1. However, with probability 1−l(r) the rating is r2 = r. The probability
l(r) (0 < l(r) < 1) increases in r, namely, the higher the rating r, the more probable it is
that the application will reach r2 = 1.
    The platform has an imperfect algorithm that enables it to detect applications that
might be cheating (see Section 2.3 for more details on such algorithms). Indeed, no algorithm
or technology is 100% error free, and the algorithm in use might overlook some cheating
applications as well as label honest applications as cheaters.
   At stage t2 , a rating of r2 = 1 triggers an automatic noisy alert sent to the platform.
The alert s means that the application is suspected of cheating (ŝ means the opposite alert).
When the platform receives s it is required to choose whether to ban (b) or not ban (b̂) the
application. The ban decision is equivalent to setting the application’s rating to r2 = 0.
Note that this implies a different penalty cost for different applications; an application with
a higher rating at stage t1 has more to lose from a ban than an application with a lower
rating.
   Let α(r) be the probability of a type-I error, namely, the probability that s is sent when A
does not cheat (and obtains the rating r2 = 1), and let β(r) be the probability of a type-II error,
namely, the probability that s is not sent when A cheats.
   We consider α and β to be commonly known. When a platform chooses a cheating detection
algorithm, the algorithm is tested, as part of the acceptance testing performed when integrating
it, on different scenarios where the results are known, and α and β can then be estimated.
Application developers can estimate these parameters in a similar manner.
   We assume that β(r) weakly increases in r. Namely, an increase from a high rating r to
r2 = 1 is less detectable than an increase from a low rating to r2 = 1.
   The platform’s utility consists of three factors: the revenue from the commission fee that
the application pays (γr2 f ), the cost of non-detection (denoted w), and the cost of false
accusation (denoted v). The two latter costs can be interpreted as a loss of the platform’s
reputation, which translates into loss of user confidence in the platform, leading to a decrease
in purchases and thus in the platform’s revenues.
   Consequently, if the application cheats and is not banned, P 's utility is γr2 f − w, w > 0.
However, if the application does not cheat and is not banned, P obtains γr2 f + v, v > 0.
If the application is banned, the platform's revenue is 0. The game, which we denote by G,
and the player utilities are defined in Figure 3.1.

Figure 3.1: Description of the game. Each pair of utilities represents application and platform
utilities, respectively.
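
To make the payoff structure concrete, the following minimal Python sketch (not part of the paper; the function and parameter names are ours) encodes the expected utilities described above for pure strategies of both players. It assumes exactly the structure stated in the text: cheating yields r2 = 1; an honest application reaches r2 = 1 with probability l(r) and otherwise keeps r2 = r; the alert s arrives with probability 1 − β(r) after a cheat and with probability α(r) after an honest rating of 1; and a ban sets the rating, and hence both utilities, to zero.

```python
def expected_payoffs(cheat, ban_on_alert, r, f, gamma, alpha, beta, l, v, w):
    """Expected (application, platform) utilities for pure strategies.

    cheat:        True if A plays c, False if A plays c-hat.
    ban_on_alert: True if P plays b after the alert s, False if P plays b-hat.
    """
    def terminal(r2, cheated, banned):
        # A banned application leaves both players with zero, as stated in the text.
        if banned:
            return 0.0, 0.0
        u_app = gamma * r2 * (1 - f)
        u_platform = gamma * r2 * f + (-w if cheated else v)
        return u_app, u_platform

    if cheat:
        # Cheating yields r2 = 1; the alert s is sent with probability 1 - beta(r).
        branches = [(1 - beta, terminal(1.0, True, ban_on_alert)),
                    (beta,     terminal(1.0, True, False))]
    else:
        # Honest play: r2 = 1 with probability l(r) (false alert w.p. alpha(r)),
        # otherwise r2 = r and no alert is triggered.
        branches = [(l * alpha,       terminal(1.0, False, ban_on_alert)),
                    (l * (1 - alpha), terminal(1.0, False, False)),
                    (1 - l,           terminal(r,   False, False))]
    u_a = sum(p * ua for p, (ua, _) in branches)
    u_p = sum(p * up for p, (_, up) in branches)
    return u_a, u_p
```

For instance, with cheat=True and ban_on_alert=True the application's expected utility evaluates to γ(1 − f)β(r), the expression used in the proof of Proposition 1.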

   If r = 1, the highest rating is guaranteed to the application. Trivially, in this case there
is an equilibrium where the application does not cheat, and the platform does not ban it.
We assume hereafter r < 1.

3.2    Exogenous fee

We now describe our results for an exogenous fee. The platform receives a signal
s. If α(r) = 0 (i.e., a false alert is impossible), the platform bans the application upon
receiving s. In the other cases:

  1. If the platform’s revenue from the commission fee is higher than the cost of non-
      detection (i.e. the reputation loss) then it will not ban the application even if it
      suspects cheating. Consequently, the application will surely cheat.

  2. If β(r) is high then the application is encouraged to cheat, since it is likely that cheating
      will not be detected.

  3. If none of the former occurs, then the application will cheat with some positive prob-
     ability, but not with certainty.

   Formally, the following proposition characterizes an equilibrium of the game.

Proposition 1. Let α(r) > 0.

  1. If w < γf , then in the unique equilibrium of G the application cheats with certainty,
     and with certainty, the platform does not ban the application.

  2. If l(r) − α(r)l(r) + r − rl(r) < β(r), then in the unique equilibrium of G the application
     cheats with certainty, and following an alert s, the platform bans with certainty.

  3. If w > γf and l(r) − α(r)l(r) + r − rl(r) > β(r), then the application cheats with a
     probability Pc , 0 < Pc < 1, and following an alert s the platform bans with probability
     Pb , 0 < Pb < 1.

   All proofs appear in the appendix.
   Note that for w = γf or l(r) − α(r)l(r) + r − rl(r) = β(r) the equilibrium may not be
unique.
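
The case distinction in Proposition 1 can be expressed compactly. The sketch below is only an illustration (the function name and return format are ours): it classifies which part of the proposition applies for given parameters and, in the mixed case of part 3, evaluates the equilibrium probabilities Pc and Pb from equations (2) and (3) in the appendix.

```python
def equilibrium(r, f, gamma, alpha, beta, l, v, w):
    # threshold stands for l(r) - a(r)l(r) + r - r*l(r), compared against beta(r).
    threshold = l - alpha * l + r - r * l
    if w < gamma * f:
        return {"part": 1, "Pc": 1.0, "Pb": 0.0}      # cheat surely, never ban
    if threshold < beta:
        return {"part": 2, "Pc": 1.0, "Pb": 1.0}      # cheat surely, ban after alert s
    if w > gamma * f and threshold > beta:
        pc = (alpha * l * (gamma * f + v)) / (
            alpha * l * (gamma * f + v) + (1 - beta) * (w - gamma * f))   # equation (2)
        pb = ((1 - r) * (1 - l)) / (1 - beta - alpha * l)                 # equation (3)
        return {"part": 3, "Pc": pc, "Pb": pb}        # mixed strategies
    return {"part": None, "Pc": None, "Pb": None}     # boundary case: may not be unique

# Example: the parameter values of Figure 3.2 with an (arbitrary) exogenous fee f = 0.5.
print(equilibrium(r=0.6, f=0.5, gamma=1.0, alpha=0.1, beta=0.1, l=0.6, v=9.0, w=3.0))
```
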
   We now focus on part (3), where cheating occurs with some probability 0 < Pc < 1. In
this case, an increase in α(r), β(r), v, or f , or a decrease in w, results in more cheating.
   The application compensates for an increase in the commission fee (f ) by cheating, which
may lead to a higher rating and thus an increase in profit. An increase in the false accusation
cost (v) means that the platform is more reluctant to ban an application, which also leads
to more cheating.
   An increase in the probability of false accusation α(r) means that even an honest appli-
cation is more likely to be (mistakenly) banned for cheating, thus the application has less
incentive to act honestly. Consequently, the probability the platform will ban applications
increases in α(r). An increase in the probability of non-detection of cheating β(r) as well as
a decrease in non-detection cost (w) will lead to more cheating. Formally, and as a direct
result of part 3 in Proposition 1:

Corollary 1. Let w > γf and l(r) − α(r)l(r) + r − rl(r) > β(r). Then in the equilibrium
of G:

   1. The probability that the application cheats increases in: f , v, α(r), and β(r), and
         decreases in w.

   2. The probability that P bans A, following alert s, increases in β(r) and in α(r).

    When α(r) is independent of r, and cheating occurs with some probability 0 < Pc < 1,
the closer the application's rating is to 1, the more likely it is to cheat. The intuition
behind this is that applications with a rating close to 1 are less suspected of cheating, and
hence less likely to be detected, so they can cheat more freely.

Corollary 2. Suppose α(r) is constant, α(r) ≡ α. Let w > γf and l(r)−α(r)l(r)+r−rl(r) >
β(r). Then in the equilibrium of G the probability that A cheats increases in r.

    Corollary 2, that cheating increases in r, is not very surprising for a strictly increasing β(r),
since this means that fewer alerts s are sent when r is close to 1. However, it is somewhat
surprising that this claim holds even in cases where β(r) is constant – that is, even when
the probability of the alert is constant, cheating increases in r.
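
A quick numerical illustration of Corollary 2 under assumed functional forms (the paper leaves l(r) and β(r) general): with constant α and β and the assumed increasing form l(r) = r, the cheating probability from equation (2) rises with r, even though the alert probability does not depend on r.

```python
# Illustrative parameter values only; l(r) = r is an assumption made for this check.
gamma, f, alpha, beta, v, w = 1.0, 0.5, 0.1, 0.1, 9.0, 3.0
for r in (0.2, 0.4, 0.6, 0.8):
    l = r  # assumed increasing form of l(r)
    pc = (alpha * l * (gamma * f + v)) / (
        alpha * l * (gamma * f + v) + (1 - beta) * (w - gamma * f))   # equation (2)
    print(f"r = {r:.1f}  ->  Pc = {pc:.3f}")
# Pc increases monotonically with r, as Corollary 2 states, despite beta being constant.
```
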

3.3        Endogenous fee

In section 3.2 the fee f is exogenous. However, in reality the fee is set by the platform.
Consider a model where at an initial stage t0 the platform chooses a fee 0 ≤ f ≤ 1, and then
the game proceeds as defined in Figure 3.1.²

   ² The platform maximizes its expected utility. The technicalities are characterized in Proposition 2 in the
appendix.

An increase in the commission fee affects the platform's utility in two ways: (1) positively,
through an increase in revenue, and (2) negatively, through an increase in cheating, which
reduces the platform's utility (a direct consequence of Corollary 1).
   As Figure 3.2 illustrates, the platform may maximize its expected utility for a fee lower
than 1, and the platform collects a higher fee when the alert it obtains is more precise (a
lower β(r)).
   These results are not general. As can be seen in Figure 3.3, when the cost of non-detection
is set to w = 4 instead of w = 3, the platform maximizes its expected utility for a fee of
f = 1.

Figure 3.2: Result for various β(r) with γ = 1, r = 0.6, α(r) = 0.1, l(r) = 0.6, v = 9, w = 3

Figure 3.3: Result for w = 4 with γ = 1, r = 0.6, α(r) = 0.1, β(r) = 0.1, l(r) = 0.6, v = 9
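
The qualitative pattern reported for Figures 3.2 and 3.3 can be checked with a simple grid search over the fee. The sketch below is only illustrative (the paper does not describe how the figures were produced); it evaluates the platform's expected utility from part 3 of Proposition 2, which applies for every f in [0, 1] under these parameter values, and reports the utility-maximizing fee.

```python
def platform_utility(f, r, gamma, alpha, beta, l, v, w):
    # Part 3 of Proposition 2: (1 - Pc)[gamma*f*((1 - l)r + l) + v] + Pc*(gamma*f - w),
    # with Pc taken from equation (2).
    pc = (alpha * l * (gamma * f + v)) / (
        alpha * l * (gamma * f + v) + (1 - beta) * (w - gamma * f))
    return (1 - pc) * (gamma * f * ((1 - l) * r + l) + v) + pc * (gamma * f - w)

r, gamma, alpha, l, v = 0.6, 1.0, 0.1, 0.6, 9.0
for beta, w in [(0.1, 3.0), (0.5, 3.0), (0.1, 4.0)]:
    fees = [i / 1000 for i in range(1001)]                 # grid over [0, 1]
    best = max(fees, key=lambda f: platform_utility(f, r, gamma, alpha, beta, l, v, w))
    print(f"beta = {beta}, w = {w}  ->  utility-maximizing fee ~ {best:.2f}")
```

Running this sweep gives an interior optimal fee (below 1) for β(r) = 0.1 and w = 3, a lower fee for the noisier alert β(r) = 0.5, and a corner solution f = 1 for w = 4, in line with the discussion above.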

4     Conclusions

In this paper, we provide a novel stylized framework to study the interaction between an
app sales platform (e.g., Apple’s app store or Google Play) and an app developer who may
be tempted to cheat in order to increase their app ranking. Our framework captures some
of this interesting interaction, and the consequential equilibrium analysis gives rise to some
important implications.
    Our most significant finding is that a higher fee leads to more cheating. Consequently,
even a monopolistic platform may choose not to impose a high fee.
    Furthermore, we found that precise alert signals decrease cheating; when the cheating
detection algorithm is a good one (i.e. α and β are low) less cheating occurs. Thus, we
conclude that if the platform has a good manipulation detection algorithm then it should
make this known (i.e. publicize its α and β), since the application developers will refrain
from cheating if they know that there is a high chance they will be caught.
    Numerically, we show that a lower non-detection rate allows the platform to request
a higher fee. This result gives the platform yet another reason to invest in acquiring or
developing a good manipulation detection algorithm.
    We focused on the commission fee the product has to pay the platform and considered
other costs, such as promotion costs and the cost of creating fake reviews, to be negligible. We
assumed that the cost of a fake review is sufficiently low, and the reward from an undetected
cheat sufficiently high, so that there is an incentive to cheat. Note that platforms have various
tools to protect against manipulations and to make it more difficult to create a fake review.
For example, they can require reviewers to pass a CAPTCHA verification, or verify that a
reviewer really consumed the product. Still, sophisticated manipulators can bypass these barriers.
   Utilities in our model are exogenous. We assume that the platform is interested in
its reputation, and that cheating harms the platform’s reputation. In a future extended
model, platform competition can be considered, where more than one platform competes
for customers and customers abandon a platform if they are dissatisfied with its cheating
prevention level (for platform competition in a non-cheating environment see, for example,
Halaburda and Yehezkel, 2016).
   Most importantly, we provide initial insights into how the platform's detection accuracy
affects the incentives of the app developers. Understanding these interactions, and the
resulting equilibria, provides an ample foundation for addressing future points of interest. For
example, it can be used to better understand how to put in place mechanisms that align
incentives, and it provides a benchmark framework for future empirical work.
   Our findings and conclusions are relevant to other types of e-commerce as well, and can
be of interest in any scenario that involves a product that pays a fee to be rated and ranked
on an online platform. Other examples of such systems include online vendor sites such
as Amazon and eBay, and hotel booking sites.

References

Akoglu, L., Chandy, R., and Faloutsos, C. (2013). Opinion fraud detection in online reviews
  by network effects. Proceedings of the Seventh International AAAI Conference on Weblogs
  and Social Media.

Avenhaus, R., Von Stengel, B., and Zamir, S. (2002). Inspection games. In R. J. Aumann
  and S. Hart (eds), Handbook of game theory with economic applications, 3:1947–1987,
  North–Holland, Amsterdam.

Banerjee, S., Chua, A. Y., and Kim, J.-J. (2017). Don’t be deceived: Using linguistic
  analysis to learn how to discern online review authenticity. Journal of the Association for
  Information Science and Technology.

Barrachina, A., Tauman, Y., and Urbano, A. (2014). Entry and espionage with noisy signals.
  Games and economic behavior, 83:127–146.

Becker, G. S. (1968). Crime and punishment: An economic approach. In The economic
  dimensions of crime, pages 13–68. Palgrave Macmillan, London.

Berentsen, A. (2002). The economics of doping. European Journal of Political Economy,
  18(1):109–127.

Burguera, I., Zurutuza, U., and Nadjm-Tehrani, S. (2011). Crowdroid: behavior-based
  malware detection system for android. In Proceedings of the 1st ACM workshop on Security
  and privacy in smartphones and mobile devices, pages 15–26. ACM.

Cabral, L. and Natividad, G. (2016). Box-office demand: The importance of being #1. The
  Journal of Industrial Economics, 64(2):277–294.

Carare, O. (2012). The impact of bestseller rank on demand: Evidence from the app market.
  International Economic Review, 53(3):717–742.

Casalo, L. V., Flavian, C., Guinaliu, M., and Ekinci, Y. (2015). Do online hotel rating
  schemes influence booking behaviors? International Journal of Hospitality Management,
  49:28–36.

Chen, H., He, D., Zhu, S., and Yang, J. (2017). Toward detecting collusive ranking manipula-
  tion attackers in mobile app markets. In Proceedings of the 2017 ACM on Asia Conference
  on Computer and Communications Security, ASIA CCS ’17, pages 58–70, New York, NY,
  USA. ACM.

Darby, M. R. and Karni, E. (1973). Free competition and the optimal amount of fraud. The
  Journal of law and economics, 16(1):67–88.

Glick, M., Richards, G., Sapozhnikov, M., and Seabright, P. (2014). How does ranking affect
  user choice in online search? Review of Industrial Organization, 45(2):99–119.

Gössling, S., Hall, C. M., and Andersson, A.-C. (2016). The manager’s dilemma: a con-
  ceptualization of online review manipulation strategies. Current Issues in Tourism, pages
  1–20.

Halaburda, H. and Yehezkel, Y. (2016). The role of coordination bias in platform competi-
  tion. Journal of Economics & Management Strategy, 25(2):274–312.

Heydari, A., Tavakoli, M., and Salim, N. (2016). Detection of fake opinions using time series.
  Expert Systems with Applications, 58:83–92.

Hu, N., Bose, I., Koh, N. S., and Liu, L. (2012). Manipulation of online reviews: An analysis
  of ratings, readability, and sentiments. Decision Support Systems, 52(3):674–684.

Jelnov, A., Tauman, Y., and Zeckhauser, R. (2017). Attacking the unknown weapons of a
  potential bomb builder: The impact of intelligence on the strategic interaction. Games
  and Economic Behavior, 104:177–189.

Kirstein, R. (2014). Doping, the inspection game, and Bayesian enforcement. Journal of
  Sports Economics, 15(4):385–409.

Landes, W. M. and Posner, R. A. (1984). Tort law as a regulatory regime for catastrophic
  personal injuries. The Journal of Legal Studies, 13(3):417–434.

Lee, G. and Raghu, T. S. (2014). Determinants of mobile apps’ success: evidence from the
  App Store market. Journal of Management Information Systems, 31(2):133–170.

Mauri, A. G. and Minazzi, R. (2013). Web reviews influence on expectations and purchasing
  intentions of hotel potential customers. International Journal of Hospitality Management,
  34:99–107.

Mayzlin, D., Dover, Y., and Chevalier, J. (2014). Promotional reviews: An empirical investi-
  gation of online review manipulation. The American Economic Review, 104(8):2421–2455.

Narudin, F. A., Feizollah, A., Anuar, N. B., and Gani, A. (2016). Evaluation of machine
  learning classifiers for mobile malware detection. Soft Comput., 20(1):343–357.

Ott, M., Choi, Y., Cardie, C., and Hancock, J. T. (2011). Finding deceptive opinion spam
  by any stretch of the imagination. In Proceedings of the 49th Annual Meeting of the As-
  sociation for Computational Linguistics: Human Language Technologies-Volume 1, pages
  309–319. Association for Computational Linguistics.

Polinsky, A. M. and Shavell, S. (2007). The theory of public enforcement of law. Handbook
  of law and economics, 1:403–454.

Rahman, M., Rahman, M., Carbunar, B., and Chau, D. H. (2017). Search rank fraud and
  malware detection in Google Play. IEEE Transactions on Knowledge and Data Engineer-
  ing, 29(6):1329–1342.

Resnick, P. and Zeckhauser, R. (2002). Trust among strangers in internet transactions:
  Empirical analysis of ebay’s reputation system. In The Economics of the Internet and
  E-commerce, pages 127–157. Emerald Group Publishing Limited.

Savage, D., Zhang, X., Yu, X., Chou, P., and Wang, Q. (2015). Detection of opinion spam
  based on anomalous rating deviation. Expert Systems with Applications, 42(22):8650–8657.

Schuckert, M., Liu, X., and Law, R. (2016). Insights into suspicious online ratings: direct
  evidence from TripAdvisor. Asia Pacific Journal of Tourism Research, 21(3):259–272.

Seneviratne, S., Seneviratne, A., Kaafar, M. A., Mahanti, A., and Mohapatra, P. (2017).
  Spam mobile apps: Characteristics, detection, and in the wild analysis. ACM Transactions
  on the Web (TWEB), 11(1):4.

Smith, M. D. and Brynjolfsson, E. (2001). Consumer decision-making at an internet shopbot:
  Brand still matters. The Journal of Industrial Economics, 49(4):541–558.

Wang, G., Wilson, C., Zhao, X., Zhu, Y., Mohanlal, M., Zheng, H., and Zhao, B. Y. (2012).
  Serf and turf: crowdturfing for fun and profit. In Proceedings of the 21st international
  conference on World Wide Web, pages 679–688. ACM.

Ye, J. and Akoglu, L. (2015). Discovering opinion spammer groups by network footprints.
  In ECML/PKDD (1), pages 267–282.

Zhu, H., Xiong, H., Ge, Y., and Chen, E. (2013). Ranking fraud detection for mobile apps:
  A holistic view. In Proceedings of the 22nd ACM international conference on Information
  & Knowledge Management, pages 619–628. ACM.

Appendix

Proof of Proposition 1. Consider first an equilibrium where P chooses pure b̂. Then c is the
superior action for A. By Figure 3.1, the expected utility of P in this case is γf − w. If P
chooses b, its payoff is 0, and the platform prefers b̂ to b for w < γf .
   Next, suppose that P , following s, chooses pure b. Observe that for α(r) > 0, pure ĉ is not
an equilibrium in this case. To the contrary, if A chooses pure ĉ, with positive probability it
obtains the rating 1 and a false signal s is sent to P ; therefore, b is not the best reply of the
platform.

If A chooses c, and P , following s, chooses pure b, A’s expected utility is γ(1 − f )β(r). If
A does not cheat, and P , following s, chooses pure b, A’s expected utility is γ(1 − f )[r(1 −
l(r)) + l(r)(1 − α(r))]. Thus, A prefers c to ĉ for β(r) > r − rl(r) + l(r) − α(r)l(r).
   Let A choose c with probability Pc . Given alert s, let P (c|s) be the belief of P that A
cheats:

        P (c|s) = Pc (1 − β(r)) / [Pc (1 − β(r)) + (1 − Pc )l(r)α(r)].        (1)

   Following rating 1 of A and alert s, P is indifferent between b and b̂ for

                               γf − wP (c|s) + v(1 − P (c|s)) = 0,

and by (1), this is equivalent to

        Pc = α(r)l(r)(γf + v) / [α(r)l(r)(γf + v) + (1 − β(r))(w − γf )].        (2)

By (2), 0 < Pc < 1 for w > γf .
   Let Pb be the probability with which P bans the application, following alert s. A is
indifferent between c and ĉ for

           γ(1 − f )[1 − (1 − β(r))Pb ] = γ(1 − f )[(1 − l(r))r + l(r)(1 − α(r)Pb )],

namely,
        Pb = (1 − r)(1 − l(r)) / [1 − β(r) − α(r)l(r)].        (3)

By (3), 0 < Pb < 1 for l(r) − α(r)l(r) + r − rl(r) > β(r).
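
As a sanity check on the derivation (an illustrative sketch with arbitrary parameter values), one can plug (2) and (3) back into the two indifference conditions: the platform's gap between b̂ and b after an alert, and the application's payoff gap between c and ĉ. Both gaps should vanish up to rounding.

```python
# Arbitrary parameter values satisfying the conditions of part 3 of Proposition 1.
r, f, gamma, alpha, beta, l, v, w = 0.6, 0.5, 1.0, 0.1, 0.1, 0.6, 9.0, 3.0

pc = (alpha * l * (gamma * f + v)) / (
    alpha * l * (gamma * f + v) + (1 - beta) * (w - gamma * f))   # equation (2)
pb = ((1 - r) * (1 - l)) / (1 - beta - alpha * l)                 # equation (3)

# Platform's posterior that A cheated given the alert s, equation (1):
p_c_given_s = pc * (1 - beta) / (pc * (1 - beta) + (1 - pc) * l * alpha)
platform_gap = gamma * f - w * p_c_given_s + v * (1 - p_c_given_s)

# Application's payoffs from c and c-hat when P bans with probability pb after s:
u_cheat = gamma * (1 - f) * (1 - (1 - beta) * pb)
u_honest = gamma * (1 - f) * ((1 - l) * r + l * (1 - alpha * pb))

print(platform_gap, u_cheat - u_honest)   # both are ~0 up to floating-point rounding
```
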

Proof of Corollary 1. Since the conditions of part 3 of Proposition 1 hold, the probabilities of
cheating and of banning are given by (2) and (3), respectively. The results follow directly from
(2) and (3).

Proof of Corollary 2. Since the conditions of part 3 of Proposition 1 hold, the probability of
cheating is given by (2). The result follows directly from (2), together with ∂l(r)/∂r > 0 and
∂β(r)/∂r ≥ 0.

Proposition 2.

  1. If w < γf , then in equilibrium the expected utility of P is γf − w.

  2. If l(r) − α(r)l(r) + r − rl(r) < β(r), then in equilibrium the expected utility of P is
     β(r)(γf − w).

  3. If w > γf and l(r) − α(r)l(r) + r − rl(r) > β(r), then in equilibrium the expected
     utility of P is (1 − Pc )[γf [(1 − l(r))r + l(r)] + v] + Pc (γf − w), where Pc is given by (2).

Proof. Directly from Proposition 1 and Figure 3.1.
