The Probabilistic Final Standing Calculator: a fair stochastic tool to handle abruptly stopped football seasons


Hans Van Eetvelde^1, Lars Magnus Hvattum^2, and Christophe Ley^3

^1 Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
^2 Faculty of Logistics, Molde University College, Norway
^3 Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium

arXiv:2101.10597v1 [stat.AP] 26 Jan 2021

Abstract

The COVID-19 pandemic has left its marks in the sports world, forcing the full-stop of
all sports-related activities in the first half of 2020. Football leagues were suddenly stopped
and each country was hesitating between a relaunch of the competition and a premature
ending. Some opted for the latter option, and took as the final standing of the season the
ranking from the moment the competition got interrupted. This decision has been perceived
as unfair, especially by those teams who had remaining matches against easier opponents.
In this paper, we introduce a tool to calculate in a fairer way the final standings of domestic
leagues that have to stop prematurely: our Probabilistic Final Standing Calculator (PFSC).
It is based on a stochastic model taking into account the results of the matches played
and simulating the remaining matches, yielding the probabilities for the various possible
final rankings. We have compared our PFSC with state-of-the-art prediction models, using
previous seasons which we pretend to stop at different points in time. We illustrate our PFSC
by showing how a probabilistic ranking of the French Ligue 1 in the stopped 2019-2020 season
could have led to alternative, potentially fairer, decisions on the final standing.

Key words: Bivariate Poisson, Plus-minus rating, Prediction, Ranking, (Tournament) Rank Probability Score

1    Introduction

The COVID-19 pandemic has left its mark on the world of sports, forcing nearly all sports-related
activities around the globe to a full stop in February-March 2020. National and international
sports leagues were abruptly halted, with dramatic consequences, especially in the most popular
sport, football (or soccer). Indeed, the loss of regular income from ticket sales, television
money and merchandising around live matches meant that a large majority of professional clubs
were no longer able to pay their players and other employees (Sky Sports, 2020; kicker, 2020).
Since professional football has become such a big business, this also affected other people
whose main income was related to football matches (BBC Sport, 2020). And, last but not least,
the fans were longing for their favourite sport to resume.
However, the very intense and uncertain times made it a tough choice for decision-makers to take
the risk of letting competitions go on, especially as it would be without stadium attendance for
the first weeks or possibly months. Consequently, some leagues never resumed. The Dutch
Eredivisie was the first to declare that it would not be continued, on April 25 2020, followed by
the French Ligue 1 on April 28 and the Belgian Jupiler Pro League on May 15. The German
Bundesliga was the first professional league to get restarted, followed by the English Premier
League, Spanish Primera Division and Italian Serie A.

The leagues that decided to not resume playing were facing a tough question: how to evaluate
the current season? Should it simply be declared void, should the current ranking be taken as
final ranking, or should only a part of the season be evaluated? Any of these decisions would have an
enormous financial impact given the huge amount of money at stake. Which team would be
declared champion? Which teams would be allowed to represent their country at the European
level and hence earn a lot of money through these international games, especially in the Cham-
pions League? Which teams would be relegated and, as a potential consequence, be forced to
dismiss several employees? Moreover, the broadcasting revenue is allocated on the basis of final
standings. Different countries reacted differently: the Eredivisie had neither a champion nor
relegated teams, but the teams qualified for the European competitions were decided based on the
ranking of March 8. The Ligue 1 and the Jupiler Pro League, on the other hand, also declared a
champion and relegated teams on the basis of the ranking at the moment their seasons were
stopped. However, the Jupiler Pro League had to reverse this decision and the relegation
was nullified. The Ligue 1 based its final standing on the ratio of points earned per game, since
not all teams had played the same number of games when the season was stopped. Obviously,
several teams were not pleased by such decisions, considering them to be unfair (Holroyd, 2020)
because, inevitably, this favoured those teams that still had to play the strongest opponents in
the remaining matches over those that were looking forward to a rather light end-of-season.

This naturally raises the question of how to find a more balanced, scientifically sound way to
evaluate the final standing of abruptly ended seasons. It also represents a bigger challenge than
evaluating in a fair way an abruptly stopped single game, where for instance cricket has adopted
a mathematical formula known as the Duckworth-Lewis-Stern method (Duckworth and Lewis,
1998). The literature on how a stopped season should be evaluated in the most objective way
consists largely of proposals made after the outbreak of COVID-19. The first and third authors of
this paper used the current-strength based rankings by Ley et al. (2019), a statistical model
based on the bivariate Poisson distribution and weighted maximum likelihood, to provide all
possible final standings of the Belgian Jupiler Pro League together with their associated
probabilities. Their results were summarized in a mainstream newspaper (Het
Laatste Nieuws, 2020). Guyon (2020) proposed an Elo-based approach applied to the English
Premier League. Lambers and Spieksma (2020) suggested an eigenvector-based approach, while
Csató (2020) discussed general criteria that a fair ranking should fulfil in such a situation and
proposed the generalized row sum method. Recently, Gorgi et al. (2020) used a statistical model
to determine a ranking based on the expected total number of points per team.

In this paper, we investigate the extent to which a relatively simple stochastic model can serve
the purpose of producing fair final standings for prematurely stopped round-robin type football
competitions. Our original approach goes as follows. We construct a stochastic soccer model that
is fitted on the matches already played and is then used to simulate the remainder of the
competition a large number of times, thus yielding for every team the probability of reaching
each possible final
rank. This output is much richer in terms of information than giving only the most likely or the
expected ranking. This also explains the terminology for our model, namely Probabilistic Final
Standing Calculator, which we abbreviate as PFSC. In order to assess its predictive strength, we
compare our PFSC with two benchmark prediction models, namely the best performing model
of Ley et al. (2019), that uses a similar stochastic model to estimate the current strength of a
team based on its matches played in the last two years, and the plus-minus-ratings approach
of Pantuso and Hvattum (2020) that is based on the strengths of the players making up the
teams. For each model, the probabilistic final standing of a not yet ended season is obtained by
simulating the remaining matches 100,000 times, which gives us for every team the probabilities
of reaching each possible place in the final standing. It is not appropriate to compare the
predictions of these models on the 2019-2020 competitions which were resumed after the break,
since those matches were played under different circumstances, including the absence of fans. It
has been shown (Fischer and Haucap, 2020) that these changed conditions could influence team
performances, by lowering the effect of the home advantage. Therefore we rather compare the
three models on the basis of the three preceding seasons of the five most important European
football leagues (England, Spain, Germany, Italy and France), which we stopped artificially after
every match day. Our evaluation of the models’ performance is done in two ways: by means of
the Rank Probability Score (RPS) (Epstein, 1969) and the Tournament Rank Probability Score
(TRPS) (Ekstrøm et al., 2020), see Section 2.4 for their definition. From this comparison, we
can see at which point in time the PFSC is able to catch up with the two high-performing but
more complicated prediction models. The reader may now wonder why we do not use any of
these more elaborate models as PFSC; the reason is that we wish to propose a handy tool that
sports professionals can indeed use without the need of too long computation time or large data
collections. In the same vein, we will also make the PFSC freely available in the R software
environment (R Core Team, 2020).

The remainder of the paper is organized as follows. In Section 2 we describe our PFSC along with
the two alternative models, as well as the two model performance evaluation metrics. Section 3
then presents the results of this broad comparison. In Section 4, we illustrate the advantages of
our PFSC by analyzing the French Ligue 1 season 2019-2020 and discussing how fairer decisions
could be obtained on the basis of our PFSC. We conclude the paper with final comments in
Section 5.

2     Methods

In this section, we start by explaining the PFSC (Section 2.1) and the two benchmark models
(Sections 2.2 and 2.3), followed by a description of the two evaluation measures for comparison
(Section 2.4). In what follows, we suppose that a total of n teams compete in a round-robin
type tournament of M matches.

2.1   The PFSC: a bivariate Poisson-based model

For modelling football matches, the PFSC will make use of the bivariate Poisson distribution.
Building on the original idea of Maher (1982) to model football match outcomes via Poisson
distributions, the bivariate Poisson distribution has been popularized by Karlis and Ntzoufras
(2003). Let Yijm stand for the number of goals scored by team i against team j (i, j ∈ {1, ..., n})
in match m (where m ∈ {1, ..., M }) and let λijm ≥ 0 resp. λjim ≥ 0 be the expected number
of goals for team i resp. j in this match. The joint probability function of the home and away
score is then given by the bivariate Poisson probability mass function

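\[
P(Y_{ijm} = x, Y_{jim} = y) = \frac{\lambda_{ijm}^{x} \lambda_{jim}^{y}}{x!\,y!} \exp\bigl(-(\lambda_{ijm} + \lambda_{jim} + \lambda_C)\bigr) \sum_{k=0}^{\min(x,y)} \binom{x}{k} \binom{y}{k} k! \left(\frac{\lambda_C}{\lambda_{ijm} \lambda_{jim}}\right)^{k},
\]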

where λC ≥ 0 is a covariance parameter representing the interaction between both teams. This
parameter is kept constant over all matches, as suggested in Ley et al. (2019), who mentioned
that models where this parameter depends on the teams at play perform worse. Note that λC = 0
yields the Independent Poisson model. The expected goals λijm are expressed in terms of the
strengths of team i and team j, which we denote ri and rj , respectively, in the following way:
log(λijm ) = β0 + (ri − rj ) + h · I(team i playing at home), where h is a real-valued parameter
representing the home effect and is only added if team i plays at home, and β0 is a real-valued
intercept indicating the expected number of goals e^{β0} if both teams are equally strong and play
on a neutral ground. The strengths r1 , . . . , rn can take both positive and negative real values
and are subject to the identification constraint Σ_{i=1}^{n} ri = 0. Over a period of M matches (which
are assumed to be independent), this leads to the likelihood function

\[
L = \prod_{m=1}^{M} P(Y_{ijm} = y_{ijm}, Y_{jim} = y_{jim}), \tag{1}
\]

where yijm and yjim stand for the actual number of goals scored by teams i and j in match
m. The unknown values of the strength parameters r1 , . . . , rn are then computed numerically as
maximum likelihood estimates, that is, in such a way that they best fit a set of observed match
results.
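
To make the quantities above concrete, the following Python sketch evaluates the bivariate Poisson probability mass function, the expected goals and the log of the likelihood (1) for a given set of parameters. It is only an illustration and not the R implementation of the PFSC: the function names and the match tuple layout (home team, away team, home goals, away goals) are assumptions, and the numerical maximization over the parameters is left out.

```python
import math

def bivariate_poisson_pmf(x, y, lam1, lam2, lam_c):
    """P(Y1 = x, Y2 = y) under the bivariate Poisson pmf of Section 2.1.

    lam1 and lam2 must be strictly positive here because the ratio
    lam_c / (lam1 * lam2) is formed explicitly.
    """
    base = (lam1 ** x) * (lam2 ** y) / (math.factorial(x) * math.factorial(y))
    base *= math.exp(-(lam1 + lam2 + lam_c))
    ratio = lam_c / (lam1 * lam2)
    total = sum(
        math.comb(x, k) * math.comb(y, k) * math.factorial(k) * ratio ** k
        for k in range(min(x, y) + 1)
    )
    return base * total

def expected_goals(beta0, r_i, r_j, h, at_home):
    """log(lambda) = beta0 + (r_i - r_j) + h * I(team i plays at home)."""
    return math.exp(beta0 + (r_i - r_j) + (h if at_home else 0.0))

def log_likelihood(matches, strengths, beta0, h, lam_c):
    """Log of the likelihood (1); matches are (home, away, home_goals, away_goals)."""
    ll = 0.0
    for home, away, goals_home, goals_away in matches:
        lam_home = expected_goals(beta0, strengths[home], strengths[away], h, True)
        lam_away = expected_goals(beta0, strengths[away], strengths[home], h, False)
        ll += math.log(
            bivariate_poisson_pmf(goals_home, goals_away, lam_home, lam_away, lam_c)
        )
    return ll
```

Setting `lam_c = 0` leaves only the k = 0 term of the sum, so the function returns the product of two independent Poisson probabilities, in line with the remark on the Independent Poisson model. A maximum likelihood fit would maximize `log_likelihood` over the strengths (subject to the sum-to-zero constraint), β0, h and λC with a numerical optimizer.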

Ley et al. (2019) established that the bivariate Poisson model and its Independent counterpart
are the best-performing maximum likelihood based models for predicting football matches. We
refer the interested reader to Section 2 of Ley et al. (2019) for more variants of the bivariate
Poisson model, as well as for Bradley-Terry type models where the outcome (win/draw/loss) is
modelled directly instead of as a function of the goals scored.

Using the bivariate Poisson model in the final standing prediction works as follows. The param-
eters λC , β0 , h and the strength parameters r1 , ..., rn are estimated using the matches played
so far in the current season. Next, these parameters are used to simulate the remaining matches
100,000 times, by sampling the number of goals for each team in each match from the
corresponding bivariate Poisson distribution. For each simulated end of season, a final standing
is created based on the played and simulated matches, taking into account the specific rules of
the leagues. The probabilistic final standing is then calculated by averaging the results over
all 100,000 simulations, giving each team a probability to reach every possible rank. Note that
Gorgi et al. (2020) also used the bivariate Poisson distribution as their statistical model, but
they only calculate expected ranks and not the complete probabilistic picture as we do.
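
The simulation step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the parameters are taken as given rather than fitted, scores are drawn with the standard trivariate reduction for the bivariate Poisson distribution (each score is an independent Poisson count plus a shared Poisson(λC) count), and ties in the table are broken by points, then goal difference, then goals scored, then at random, which is a simplification of the league-specific rules. All function and argument names are hypothetical.

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates of football scores."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def sample_score(lam1, lam2, lam_c, rng):
    """Trivariate reduction: a shared Poisson(lam_c) count is added to both scores,
    so the mean simulated score of a team is its own rate plus lam_c."""
    shared = sample_poisson(lam_c, rng)
    return sample_poisson(lam1, rng) + shared, sample_poisson(lam2, rng) + shared

def rank_probabilities(teams, played, remaining, strengths, beta0, h, lam_c,
                       n_sim=10_000, seed=0):
    """Estimate, for every team, the probability of each final rank.

    played: list of (home, away, home_goals, away_goals); remaining: list of (home, away).
    Returns {team: [P(rank 1), P(rank 2), ...]}.
    """
    rng = random.Random(seed)
    counts = {t: [0] * len(teams) for t in teams}
    for _ in range(n_sim):
        results = list(played)
        for home, away in remaining:
            lam_home = math.exp(beta0 + strengths[home] - strengths[away] + h)
            lam_away = math.exp(beta0 + strengths[away] - strengths[home])
            goals_home, goals_away = sample_score(lam_home, lam_away, lam_c, rng)
            results.append((home, away, goals_home, goals_away))
        table = {t: [0, 0, 0] for t in teams}  # points, goal difference, goals scored
        for home, away, gh, ga in results:
            table[home][1] += gh - ga
            table[away][1] += ga - gh
            table[home][2] += gh
            table[away][2] += ga
            if gh > ga:
                table[home][0] += 3
            elif gh < ga:
                table[away][0] += 3
            else:
                table[home][0] += 1
                table[away][0] += 1
        # Assumed tie-break: points, goal difference, goals scored, then random.
        order = sorted(teams, key=lambda t: (tuple(table[t]), rng.random()), reverse=True)
        for rank, team in enumerate(order):
            counts[team][rank] += 1
    return {t: [c / n_sim for c in counts[t]] for t in teams}
```

Each row of the returned dictionary sums to one, as does each rank position across teams. The default of 10,000 simulations is chosen only to keep the sketch fast; the paper uses 100,000.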

This model is relatively simple compared to the benchmark models that we describe below, but
it has some nice properties that make it perfectly suited for determining the final standing of
a prematurely stopped competition. First, the PFSC only takes into account match results, so
data requirements are modest. Second, the PFSC only takes into account matches of the current
season, so there is no bias towards teams that performed well in the previous season(s). Third, each
played game has the same weight in the estimation of the team strengths. These three properties
make this method a fair way to evaluate an unfinished football season. On top of this, the code
for the model can easily be executed in a short time.

2.2   Current-strength based team ratings

The first benchmark model is an extension of the previous model. The idea of Ley et al. (2019)
was to use a weighted maximum likelihood, where the weight is a time depreciation factor
wtime,m > 0 for match m, resulting in

\[
L = \prod_{m=1}^{M} \bigl(P(Y_{ijm} = y_{ijm}, Y_{jim} = y_{jim})\bigr)^{w_{time,m}}.
\]

The exponentially decreasing time decay function is defined as follows: a match played xm days
back gets a weight of
\[
w_{time,m}(x_m) = \left(\frac{1}{2}\right)^{\frac{x_m}{\text{Half period}}}.
\]
In other words, a match played Half period days ago only contributes half as much as a match
played today and a match played 3×Half period days ago contributes 12.5% of a match played
today. This weighting scheme gives more importance to recent matches and leads to a so-called
current-strength ranking based on the estimated strength parameters of the teams.

Another difference is that this model uses two years of past matches to estimate the team
strengths. The half period is set to 390 days, as this was found to be the optimal half period by
Ley et al. (2019) when evaluated on 10 seasons of the Premier League. The predicted probabilities
for each rank in the final standing are obtained in the same way as in the PFSC.
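
The time depreciation can be illustrated with a few lines of Python. The function names are hypothetical; the default half period of 390 days is the value quoted above. Taking logarithms of the weighted likelihood turns the exponent into a multiplicative weight on each match's log-probability, which is what the second function computes.

```python
def time_weight(days_ago, half_period=390.0):
    """Weight (1/2)^(days_ago / half_period) of a match played days_ago days back."""
    return 0.5 ** (days_ago / half_period)

def weighted_log_likelihood(log_probs, days_ago, half_period=390.0):
    """Sum of w_time,m * log P(match m), i.e. the log of the weighted likelihood."""
    return sum(time_weight(x, half_period) * lp for lp, x in zip(log_probs, days_ago))

print(time_weight(0))      # match played today
print(time_weight(390))    # one half period ago
print(time_weight(1170))   # three half periods ago
```

With these inputs the three printed weights are 1.0, 0.5 and 0.125, matching the 12.5% figure mentioned in the text.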

2.3   Plus-minus ratings

Plus-minus ratings, the second benchmark model, are based on the idea of distributing credit for
the performance of a team onto the players of the team. We consider the variant of plus-minus
proposed by Pantuso and Hvattum (2020). Each football match is partitioned into segments of
time, with new segments starting whenever a player is sent off or a substitution is made. For
each segment, the set of players appearing on the pitch does not change, and a goal difference
is observed from the perspective of the home team, equal to the number of goals scored by the
home team during the segment minus the number of goals scored by the away team. The main
principle of the plus-minus ratings considered is to find ratings such that the sum of the player
ratings of the home team minus the sum of the player ratings of the away team is as close as
possible to the observed goal difference.

Let S be the set of segments, Ph(s) respectively Pa(s) the set of players on the pitch for the home
respectively away team during segment s ∈ S. Denote by g(s) the goal difference in the segment
as seen from the perspective of the home team. If a real-valued parameter βj is used to denote
the rating of player j, the identification of ratings can be expressed as minimizing
                                                                      2
                              X         X             X
                                             βj −             βj − g(s) ,
                              s∈S   j∈Ph(s)          j∈Pa(s)

the squared difference between observed goal differences and goal differences implied by the
ratings of players involved. To derive more reasonable player ratings, Pantuso and Hvattum
(2020) considered a range of additional factors, which we also consider here: 1) Segments have
different durations, so the ratings in each segment are scaled to correspond with the length
of the segment. 2) The home team has a natural advantage, which is added using a separate
parameter. 3) Some segments have missing players, either due to players being sent off by
the referee or due to injuries happening after all allowed substitutions have been made. These
situations are represented using additional variables corresponding to the missing players, while
remaining player ratings are scaled so that their sum corresponds to an average rating for a full
team. 4) The player ratings are not assumed to be constant over the whole data set, but rather
to follow a curve that is a function of the age of players. This curve is modelled as a piece-wise
linear function which is estimated together with the ratings by introducing corresponding age
adjustment variables. 5) Each segment is further weighted by factors that depend on the recency
of the segment and the game state. A complete mathematical formulation of the plus-minus
rating system was provided by Pantuso and Hvattum (2020).
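
As a small illustration of the basic principle only, the Python sketch below evaluates the unadjusted least-squares objective for a given set of player ratings. It deliberately omits the five refinements listed above (segment duration, home advantage, missing players, age curves and weighting), and the data layout and names are assumptions made for the example.

```python
def plus_minus_objective(segments, ratings):
    """Sum over segments of (home ratings - away ratings - observed goal difference)^2.

    segments: list of (home_players, away_players, goal_difference), where the goal
    difference is seen from the perspective of the home team.
    ratings: dict mapping player name to rating beta_j.
    """
    total = 0.0
    for home_players, away_players, goal_diff in segments:
        implied = (sum(ratings[p] for p in home_players)
                   - sum(ratings[p] for p in away_players))
        total += (implied - goal_diff) ** 2
    return total

# Toy data: two segments of the same match, with one home substitution in between.
segments = [
    (["a", "b"], ["c", "d"], 1),
    (["a", "e"], ["c", "d"], 0),
]
flat = {p: 0.0 for p in "abcde"}
fitted = {"a": 0.5, "b": 0.5, "c": 0.0, "d": 0.0, "e": -0.5}
print(plus_minus_objective(segments, flat))    # all ratings zero
print(plus_minus_objective(segments, fitted))  # ratings that reproduce both segments
```

In this toy example the all-zero ratings give an objective of 1.0, while the second set of ratings reproduces both observed goal differences and gives 0.0. Finding the ratings amounts to minimizing this function, which in the full model is done together with the additional variables described above.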

To move from plus-minus player ratings to match predictions, an ordered logit regression model
is used. This model derives probabilities for a home win, a draw, and an away win based on a
single value associated with each match: the sum of the ratings of the players in the starting
line-up for the home team, minus the sum of the ratings of the players in the starting line-up for
the away team, plus the home field advantage of the corresponding competition.

As with the previous benchmark, the remaining matches of a league are simulated. However,
some slight differences can be observed. Following Sæbø and Hvattum (2019), the starting line-
ups of the teams are also simulated, based on the players available in the squads. Each player
has a 10 % chance of being injured or otherwise inadmissible for a given match. Subject to these
random unavailable players, the best possible starting line-up is found, consisting of exactly one
goalkeeper, and at least three defenders, three midfielders, and one forward. Based on this,
probabilities of a home win, draw and away win are derived using the ordered logit regression
model. Since this does not provide a goal difference, but just a result, the simulation further
assumes that losing teams score 0 goals and drawing teams score one goal, whereas the number
of goals for winning teams is selected at random from 1 to 3.
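
The conversion from outcome probabilities to a simulated score can be sketched as follows. The outcome probabilities are taken as inputs (in the paper they come from the ordered logit regression model), the function name is hypothetical, and drawing the winner's goals uniformly from {1, 2, 3} is an assumption, since the text only states that the number is selected at random from 1 to 3.

```python
import random

def simulate_score(p_home, p_draw, p_away, rng):
    """Draw a result from (p_home, p_draw, p_away) and map it to a score.

    Losing teams score 0, drawing teams score 1 each, and the winner scores a
    number of goals drawn uniformly from 1 to 3 (uniformity is an assumption).
    """
    u = rng.random()
    if u < p_home:
        return rng.randint(1, 3), 0
    if u < p_home + p_draw:
        return 1, 1
    return 0, rng.randint(1, 3)

rng = random.Random(42)
home_goals, away_goals = simulate_score(0.45, 0.28, 0.27, rng)
```

The resulting scores can then be fed into the same league-table construction as in the other two models.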

2.4   Metrics to evaluate and compare the three models

We have provided three proposals for predicting the final standings of abruptly stopped football
seasons. These are evaluated by predicting, for several completed seasons from different top
leagues, the remaining matches after artificially stopping each season after every match day. The
evaluation of their predictive abilities is done at two levels: single match outcomes and final
season standings. For the former, we use the Rank Probability Score as metric, for the latter the
Tournament Rank Probability Score.

The Rank Probability Score (RPS) is a proper scoring rule that preserves the ordering of the
ranks and places a smaller penalty on predictions that are closer to the observed data than
predictions that are further away from the observed data (Epstein, 1969; Gneiting and Raftery,
2007; Constantinou and Fenton, 2012). The RPS is defined for a single match as
                                                        2
                                         R   r
                                      1 X X
                              RP S =           (oj − xj )
                                     R−1
                                             r=1    j=1

where R is the number of possible outcomes, oj the empirical probability of outcome j (which is
either 1 or 0), and xj the forecasted probability of outcome j. The smaller the RPS, the better
the prediction. The RPS is similar to the Brier score, but measures the accuracy of a prediction
differently when there are more than two ordered categories, by using the cumulative predictions
in order to be sensitive to the distance. Let us give some further intuition about this metric. Let
1 stand for home win, 2 for draw and 3 for away win, so obviously R = 3. The formula of the
RPS can be simplified to

\[
\frac{1}{2}\left[(o_1 - x_1)^2 + (o_1 + o_2 - x_1 - x_2)^2 + (1 - 1)^2\right] = \frac{1}{2}\left[(o_1 - x_1)^2 + (o_3 - x_3)^2\right],
\]

which shows, for instance, that when the home team wins, a predicted draw is penalized less
severely than a predicted away win.
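
The RPS definition translates directly into a short Python function. The function name and the encoding of the observed outcome as a zero-based index (0 for home win, 1 for draw, 2 for away win) are choices made for this sketch.

```python
def rank_probability_score(forecast, outcome_index):
    """RPS of a forecast over ordered outcomes, given the index of the observed one."""
    n_outcomes = len(forecast)
    cum_forecast = 0.0
    cum_observed = 0.0
    total = 0.0
    for j in range(n_outcomes):
        cum_forecast += forecast[j]
        cum_observed += 1.0 if j == outcome_index else 0.0
        total += (cum_observed - cum_forecast) ** 2
    return total / (n_outcomes - 1)

# Observed home win, scored against three deterministic forecasts.
print(rank_probability_score([1.0, 0.0, 0.0], 0))  # forecast: home win
print(rank_probability_score([0.0, 1.0, 0.0], 0))  # forecast: draw
print(rank_probability_score([0.0, 0.0, 1.0], 0))  # forecast: away win
```

For an observed home win, the three forecasts score 0.0, 0.5 and 1.0 respectively, which reproduces the ordering discussed above: the predicted draw is penalized half as much as the predicted away win.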

Ekstrøm et al. (2020) extended the RPS to final tournament or league standings, and conse-
quently termed it TRPS for Tournament RPS. The idea is very similar to the RPS, as the TRPS
compares the cumulative prediction Xrt that team t will reach at least rank r (with lower values
of r signifying a better ranking) to the corresponding empirical cumulative probability Ort . The
latter also only attains two different values: a column t in Ort is 0 for all ranks better than
the one which team t obtained in the tournament, and 1 from that rank onwards. Consequently, the
TRPS is defined as

\[
TRPS = \frac{1}{T} \sum_{t=1}^{T} \frac{1}{R-1} \sum_{r=1}^{R-1} (O_{rt} - X_{rt})^2,
\]

where T is the number of teams and R is the total number of possible ranks in a tournament or
league. A perfect prediction will yield a TRPS of 0 while the TRPS increases when the prediction
worsens. The TRPS is a proper scoring rule, is very flexible and handles partial rankings. It retains
for league predictions the desirable properties of the RPS, and as such assigns lower values to
predictions that are almost right than to predictions that are clearly wrong.
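
A direct Python transcription of the TRPS formula is sketched below. It assumes that the prediction is supplied as per-team rank probabilities (such as those produced by a simulation), from which the cumulative values Xrt are accumulated, and it reads the empirical value Ort as 1 for every rank r at or beyond the rank the team actually obtained. The function name and data layout are hypothetical.

```python
def tournament_rps(rank_probs, final_ranks):
    """TRPS for a league.

    rank_probs: {team: [P(rank 1), ..., P(rank R)]}, predicted rank probabilities.
    final_ranks: {team: actual final rank, 1-based}.
    """
    teams = list(rank_probs)
    n_ranks = len(rank_probs[teams[0]])
    total = 0.0
    for team in teams:
        cumulative = 0.0  # X_rt: predicted probability of finishing at rank r or better
        team_sum = 0.0
        for r in range(1, n_ranks):  # r = 1, ..., R - 1
            cumulative += rank_probs[team][r - 1]
            observed = 1.0 if final_ranks[team] <= r else 0.0  # O_rt
            team_sum += (observed - cumulative) ** 2
        total += team_sum / (n_ranks - 1)
    return total / len(teams)

final = {"A": 1, "B": 2}
perfect = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
uninformed = {"A": [0.5, 0.5], "B": [0.5, 0.5]}
print(tournament_rps(perfect, final))
print(tournament_rps(uninformed, final))
```

In this two-team example the perfect prediction scores 0.0 and the uninformed one 0.25, consistent with the statement that the TRPS is 0 for a perfect prediction and grows as the prediction worsens.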

3    Results

The PFSC, the current-strength team ratings and the plus-minus player ratings are evaluated
in terms of correctly predicting match outcomes and the final league table. The evaluation is
conducted on the top leagues of England, France, Germany, Italy, and Spain, for the 2016-2017,
2017-2018 and 2018-2019 seasons. Each season and league is halted after each match day, and
the outcomes of the remaining matches as well as the final league tables are predicted.

Figure 1 shows the mean RPS of the remaining matches, given the current match day. The figure
illustrates that the performances of the current-strength team ratings and the plus-minus player
ratings are similar throughout. The PFSC is a much simpler model, and only uses data from
the current season. Therefore, its performance is relatively poor at the beginning of the season.
However, in most cases the PFSC converges towards the performance of the benchmark methods
after around 10 match days, and in all but one of fifteen league-seasons it has caught up after
25 match days. The exception is the 2017-2018 season of the Italian Serie A, where the PFSC
leads to worse predictions than the benchmarks throughout.

When nearing the end of the season, the RPS is calculated over few matches. Therefore, the
mean RPS behaves erratically and sometimes increases sharply. This is because a single upset
in one of the final rounds can have a large effect on the calculated RPS values. However, as the
plots in Figure 1 show, all three methods follow each other closely, indicating that the results
that are difficult to predict are equally hard for all methods.

In Figure 2 the mean TRPS is shown for each league and season. As the final table becomes
easier to predict the more matches have been played, the TRPS converges to zero. Therefore,
the difference in performance among the three methods also converges to zero. However, the
PFSC has a similar prediction quality as the benchmark methods already somewhere between
10 and 25 match days into the season. Even for the Italian Serie A in 2017-2018, the TRPS is
similar for all methods after 30 match days, although the RPS indicated that the predictions of
the PFSC are worse for the remaining matches in that particular season. As for the RPS, the
performances of the current-strength team ratings and the plus-minus player ratings are very
similar.

4    Discussion

We have shown in the previous section that our simple PFSC is comparable in terms of predictive
performance at match and final standing levels to the two benchmark models that are more
computationally demanding, more time-consuming and require more input data. This establishes
the PFSC as a very good candidate for obtaining fair final standings.

We will now illustrate how our PFSC can be used in practice by decision-makers to reach fairer
decisions on how an abruptly stopped season should be evaluated. To this end, we will show how
the French Ligue 1 could have been settled after it was abruptly stopped in the 2019-2020 season.
The match results were downloaded from the website https://www.football-data.co.uk/.
The code for the PFSC applied on the present case study is available in the online supplementary

                                                10
Method        PFSC   current−strength team ratings   plus−minus player ratings

                              2016                               2017                                    2018

                  0.24

                                                                                                                     England
                  0.20

                  0.16

                  0.24

                                                                                                                     France
                  0.20

                  0.16

                  0.24
       Mean RPS

                                                                                                                     Germany
                  0.20

                  0.16

                  0.24
                                                                                                                     Italy

                  0.20

                  0.16

                  0.24
                                                                                                                     Spain

                  0.20

                  0.16

                         10   20       30             10          20         30               10         20     30
                                                      number of matchdays

Figure 1: Mean RPS values calculated after each match day for five leagues and three seasons.

                                                               11
[Figure 2 plot. Methods compared: PFSC, current-strength team ratings, plus-minus player ratings. Panel columns: seasons 2016, 2017, 2018; panel rows: England, France, Germany, Italy, Spain. y-axis: mean TRPS; x-axis: number of matchdays.]

Figure 2: Mean TRPS values calculated after each match day for five leagues and three seasons.

material. The official ranking of the Ligue 1 is given in Table 1.

Table 1: The official ranking of the French Ligue 1 in the 2019-2020 season. The order of the
teams is determined by the number of points earned per match.

      Team          Points   Win   Draw   Loss   Goals   Goals against   Goal difference   Matches   Points per match
  1   PSG               68    22      2      3      75              24                51        27               2.52
  2   Marseille         56    16      8      4      41              29                12        28               2.00
  3   Rennes            50    15      5      8      38              24                14        28               1.79
  4   Lille             49    15      4      9      35              27                 8        28               1.75
  5   Nice              41    11      8      9      41              38                 3        28               1.46
  6   Reims             41    10     11      7      26              21                 5        28               1.46
  7   Lyon              40    11      7     10      42              27                15        28               1.43
  8   Montpellier       40    11      7     10      35              34                 1        28               1.43
  9   Monaco            40    11      7     10      44              44                 0        28               1.43
 10   Strasbourg        38    11      5     11      32              32                 0        27               1.41
 11   Angers            39    11      6     11      28              33                -5        28               1.39
 12   Bordeaux          37     9     10      9      40              34                 6        28               1.32
 13   Nantes            37    11      4     13      28              31                -3        28               1.32
 14   Brest             34     8     10     10      34              37                -3        28               1.21
 15   Metz              34     8     10     10      27              35                -8        28               1.21
 16   Dijon             30     7      9     12      27              37               -10        28               1.07
 17   St. Etienne       30     8      6     14      29              45               -16        28               1.07
 18   Nîmes             27     7      6     15      29              44               -15        28               0.96
 19   Amiens            23     4     11     13      31              50               -19        28               0.82
 20   Toulouse          13     3      4     21      22              58               -36        28               0.46

4.1    Evaluating the French Ligue 1 2019-2020 season

When the competition was stopped, every team had played at least 27 matches. From the
findings of the previous section, we know that after this number of match days our PFSC is
a competitive model for predicting the remaining matches and the final ranking. Based on the
played matches, the team strengths were estimated; they are reported in
Table 2. Paris Saint-Germain (PSG) is rated as by far the strongest team
in the league. Surprisingly, Olympique Lyon comes out as the second-strongest team, despite
standing only 7th in the official ranking at that time. This could indicate that Lyon
had some bad luck during the season. Indeed, their match results show that
almost all of their defeats were by a single goal. Only PSG, at home,
managed to beat Lyon by a margin of two goals. At the bottom of the table, we find that
Toulouse was the weakest team in the league, followed by Amiens, St. Etienne and Nîmes. This
agrees with the official ranking, up to a slightly different ordering of the teams.

Table 2: The estimated ratings $r_i$, $i = 1, \ldots, 20$, of the teams in the French Ligue 1, based on the
played matches in the 2019-2020 season, obtained via the bivariate Poisson model in our PFSC.

                                   Teams          Estimated strengths
                               1   PSG                           0.85
                               2   Lyon                          0.28
                               3   Rennes                        0.24
                               4   Marseille                     0.19
                               5   Lille                         0.15
                               6   Bordeaux                      0.15
                               7   Reims                         0.08
                               8   Nice                          0.03
                               9   Montpellier                   0.02
                              10   Monaco                        0.01
                              11   Strasbourg                   −0.01
                              12   Nantes                       −0.02
                              13   Brest                        −0.09
                              14   Angers                       −0.10
                              15   Metz                         −0.16
                              16   Dijon                        −0.17
                              17   Nîmes                        −0.27
                              18   St. Etienne                  −0.29
                              19   Amiens                       −0.32
                              20   Toulouse                     −0.59

Using these strengths, we simulated the remainder of the season 100,000 times. By
taking the mean over these simulations, we calculated the probability of each team
reaching each possible position; the results are summarized in Table 3. PSG would win
the league with approximately 100% probability, thanks to their big lead and high
team strength. Marseille had a 77% chance of keeping the second position, with some
chance of finishing third (18%) or even fourth (5%). Furthermore, Lyon, thanks
to their high estimated strength, were most likely to be ranked fifth (29.5%).
Their frustration with the decision officially taken by the Ligue 1 is thus
understandable (Holroyd, 2020). At the bottom of the standing, Toulouse was doomed
to be relegated, with only a 0.1% chance of not ending in 19th or 20th place.
Amiens still had about a 32% chance of staying in the first division.
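
As a rough illustration of the simulation step described above, the following Python sketch estimates rank probabilities for a toy four-team league. It is a minimal sketch under stated assumptions, not the implementation used in the paper: goals are drawn from independent Poisson distributions rather than the bivariate Poisson model, the parameters `base_rate` and `home_effect`, the toy strengths, points and fixtures are all hypothetical, and ties on points are broken at random.

```python
import math
import random
from collections import defaultdict


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_rank_probabilities(points, strengths, fixtures, n_sims=10_000,
                                base_rate=0.2, home_effect=0.3, seed=1):
    """Estimate P(team ends at rank k) by simulating the remaining fixtures.

    base_rate and home_effect are hypothetical log-scale parameters.
    """
    rng = random.Random(seed)
    teams = list(points)
    counts = {t: defaultdict(int) for t in teams}
    for _ in range(n_sims):
        total = dict(points)  # start every run from the current standing
        for home, away in fixtures:
            diff = strengths[home] - strengths[away]
            home_goals = poisson(math.exp(base_rate + home_effect + diff), rng)
            away_goals = poisson(math.exp(base_rate - diff), rng)
            if home_goals > away_goals:
                total[home] += 3
            elif home_goals < away_goals:
                total[away] += 3
            else:
                total[home] += 1
                total[away] += 1
        # Order by points; a random key breaks ties (a simplification).
        order = sorted(teams, key=lambda t: (total[t], rng.random()), reverse=True)
        for rank, team in enumerate(order, start=1):
            counts[team][rank] += 1
    return {t: {k: counts[t][k] / n_sims for k in range(1, len(teams) + 1)}
            for t in teams}


# Hypothetical toy league: current points, strengths and remaining fixtures.
points = {"A": 10, "B": 9, "C": 4, "D": 1}
strengths = {"A": 0.5, "B": 0.1, "C": -0.1, "D": -0.5}
fixtures = [("A", "B"), ("C", "D"), ("B", "C"), ("D", "A")]
probs = simulate_rank_probabilities(points, strengths, fixtures)
```

Each row of `probs` plays the role of one row of the probabilistic final standing: it sums to one over the ranks, and ranks that a team can no longer reach receive probability zero.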

Table 3: The probabilistic final standing (in percentages) of the Ligue 1 in the 2019-2020 season,
according to our PFSC method. Probabilities are rounded to the nearest percent.

                1     2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17   18   19   20
        PSG     100
   Marseille          77   18   5
     Rennes           12   41   36   8    2    1
        Lille         11   37   39   9    3    1
       Lyon                3    10   30   18   13   9    6    4    3    2    1
      Reims                     3    12   15   15   13   12   10   8    6    4    2
 Montpellier                    3    11   13   13   12   12   10   9    7    5    3    1
  Bordeaux                      2    7    11   12   12   12   11   10   9    7    4    1
        Nice                    1    7    10   12   12   12   12   11   9    7    4    1
 Strasbourg                     1    6    9    11   11   12   12   11   10   9    5    2    1
     Monaco                     1    4    7    9    10   12   12   13   12   10   6    3    1
      Nantes                    1    4    7    8    10   11   12   12   12   11   7    3    1
     Angers                          2    4    5    7    9    11   13   15   15   11   5    2    1
       Metz                                         1    1    2    4    7    11   21   22   16   10   4
       Brest                                        1    1    2    4    6    10   19   23   18   10   5    1
       Dijon                                                       1    2    4    10   16   22   23   15   5
 St. Etienne                                                       1    1    3    7    14   22   25   20   7
      Nîmes                                                                  1    3    6    12   22   36   18
     Amiens                                                                            1    3    8    20   66   2
   Toulouse                                                                                                2    98

Now, how could this table be used by decision-makers to handle the discontinued season? One
has to decide which team will become the champion, which teams will play in the Champions
League and Europa League, and which teams will be relegated to the second division.

For the first question, some leagues have by now introduced a rule stating that if enough matches
have been played, the current leader of the season is declared champion. However, this
does not take into account the gap between the first and the second in the standing. We would
recommend changing the rule so that a team can only be declared champion if it has
more than a C% chance of winning the league according to the PFSC (C could, e.g., be 80, but this
decision of course has to be made by the leagues). For our example, there is little doubt: PSG
was expected to win the league with an estimated probability of 100%, so they should be
considered the champions of the Ligue 1. A similar strategy can be adopted to decide which
teams should be relegated to the second division.

For participation in the Champions League and Europa League, the leagues need a determined
final standing instead of a probabilistic final standing. We will next show how to obtain a
determined final standing from our PFSC, and how the PFSC can help to determine
financial compensations.

4.2   Determined final standing and financial compensations

First, we construct a determined final standing by ordering the teams by their expected rank,
computed from the rank probabilities. This results in the standing shown in Table 4. In the
example of the French league, PSG gets the direct ticket for the Champions League, while
Marseille and Rennes get the tickets for the CL qualification rounds. Lille and Lyon would have
received the tickets for the group stage of the Europa League and Reims the ticket for the
qualifications. This shows that Nice, fifth in the official ranking (Table 1) but only ninth in
Table 4, was one of the teams that benefited from the decision of the French league to halt
the season.

However, transforming our probabilistic standing into a determined standing causes a number of
teams to be (dis)advantaged. For example, Table 4 shows that the expected rank of
Rennes is 3.55, which is the third-best expected rank. Being assigned rank 3, which is better
than 3.55, is therefore an advantageous outcome for Rennes. Lille, on the other hand, has an
expected rank of 3.62, which is only the fourth-best expected rank. Lille is therefore at a
disadvantage when assigned rank 4, which is worse than 3.62.
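
The expected-rank calculation can be sketched in a few lines of Python. This is only an illustration: the inputs are the rounded percentages of Table 3 for the top four teams, and the helper `expected_rank` is a hypothetical name, not part of the authors' code.

```python
# Rounded rank probabilities (in percent) copied from Table 3.
rank_percent = {
    "PSG": {1: 100},
    "Marseille": {2: 77, 3: 18, 4: 5},
    "Rennes": {2: 12, 3: 41, 4: 36, 5: 8, 6: 2, 7: 1},
    "Lille": {2: 11, 3: 37, 4: 39, 5: 9, 6: 3, 7: 1},
}


def expected_rank(percent_by_rank):
    """Probability-weighted mean rank.

    Dividing by the row total renormalises rows whose rounded
    percentages do not sum to exactly 100.
    """
    total = sum(percent_by_rank.values())
    return sum(rank * pct for rank, pct in percent_by_rank.items()) / total


expected = {team: expected_rank(p) for team, p in rank_percent.items()}

# Sorting by expected rank gives the determined standing.
standing = sorted(expected, key=expected.get)
for position, team in enumerate(standing, start=1):
    print(position, team, round(expected[team], 2))
```

With these rounded inputs, Marseille obtains 2.28 as in Table 4, while Rennes and Lille obtain about 3.50 and 3.59 rather than 3.55 and 3.62; the small gap is presumably due to the rounding in Table 3.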

This issue could be solved by using a compensation fund. Assume that the expected profit (in
particular prize money for the league placement, and starting and prize money from the Champions
League and Europa League) of a team ending in rank $i$ is equal to $P_i$. The expected profit for, e.g.,
Marseille would be $0.77 P_2 + 0.18 P_3 + 0.05 P_4$. In the determined ranking, they end second, so
they will receive $P_2$. This is actually too much, since they had no chance of ranking higher
than second, but a reasonable chance of finishing third or even fourth. To compensate for
this, they should hand over

$$P_2 - (0.77 P_2 + 0.18 P_3 + 0.05 P_4) = 0.18 (P_2 - P_3) + 0.05 (P_2 - P_4)$$

to the compensation fund. This fund will then be used to compensate teams that are disadvantaged
by the establishment of a determined ranking. There will still be the difficulty of estimating the
expected profit from reaching a certain rank (e.g., a team reaching the Europa League will have
further merchandising advantages, besides the profit mentioned above, compared to the team
classified just outside of these ranks), but we believe that this tool could be very useful for
decision-makers in determining which teams have received an advantage or disadvantage from
an early stop of the league, and how to compensate for this.
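
To make the compensation rule concrete, here is a minimal Python sketch for the Marseille example. The payoff values are purely hypothetical placeholders (the paper does not give figures for the rank profits), and the function name `compensation` is likewise an assumption made for illustration.

```python
def compensation(assigned_rank, rank_probs, payoff):
    """Amount a team pays into the fund (a negative value means it receives money).

    Computed as the payoff of the assigned rank minus the expected payoff
    under the probabilistic final standing.
    """
    expected_payoff = sum(p * payoff[rank] for rank, p in rank_probs.items())
    return payoff[assigned_rank] - expected_payoff


# Hypothetical profits per rank, e.g. in millions of euros; not real figures.
payoff = {2: 30.0, 3: 20.0, 4: 15.0}

# Marseille's rank probabilities from Table 3.
marseille_probs = {2: 0.77, 3: 0.18, 4: 0.05}

to_fund = compensation(2, marseille_probs, payoff)

# The same amount written as in the text: 0.18 (P2 - P3) + 0.05 (P2 - P4).
shortcut = 0.18 * (payoff[2] - payoff[3]) + 0.05 * (payoff[2] - payoff[4])
```

Under these hypothetical payoffs Marseille would contribute 2.55 to the fund. If rank probabilities and payoffs are used consistently for all teams, the contributions and receipts should balance up to rounding, since each rank is assigned to exactly one team.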

Table 4: Determined final standing, using the PFSC probabilities. This standing could be used
to decide which teams will play in the Champions League and Europa League.

                              Rank   Team           Expected rank
                                 1   PSG                     1.00
                                 2   Marseille                2.28
                                 3   Rennes                  3.55
                                 4   Lille                    3.62
                                 5   Lyon                    6.52
                                 6   Reims                   8.17
                                 7   Montpellier             8.51
                                 8   Bordeaux                9.11
                                 9   Nice                    9.20
                                10   Strasbourg              9.58
                                11   Monaco                 10.06
                                12   Nantes                 10.17
                                13   Angers                 11.08
                                14   Metz                   14.38
                                15   Brest                  14.53
                                16   Dijon                  16.00
                                17   St. Etienne            16.40
                                18   Nîmes                  17.34
                                19   Amiens                 18.52
                                20   Toulouse               19.98

5    Conclusion

In this paper we proposed a novel tool, the Probabilistic Final Standing Calculator (PFSC), to determine
the most likely outcome of an abruptly stopped football season. Unlike other recent proposals, we
provide probabilities for every team to reach each possible rank, which is more informative than
only providing the single most likely or expected final ranking. We have shown the strength of
our PFSC by comparing it to two benchmark models that are based on much more information:
our PFSC exhibits similar performance except when a season is stopped
extremely early, which is more a theoretical than a practical concern (a season
stopped after fewer than a third of the games have been played would certainly be declared void). Our
evaluation has been done at both the match level (via the RPS) and the final-standing level (via
the TRPS).

We have shown on the concrete example of the 2019-2020 season of the French Ligue 1 how our
PFSC can be used, including for a fair division of prize money. We hope that our PFSC will
help decision-makers in the future to reach fairer decisions that will not lead to the same level
of dissatisfaction and controversy that could be observed in various countries in the 2019-2020
season. The idea of the PFSC can also be applied, up to minor modifications, to several other
types of sports tournaments.

We conclude this paper with a historical remark. The problem treated here, namely finding a
fair way to determine final league standings if one cannot continue playing, goes back to the
very roots of probability theory. The French writer Antoine Gombaud (1607-1684), famously
known as Chevalier de Méré, was a gambler interested in the “problem of the points”: if two
players play a pre-determined series of games, say 13, and they get interrupted at a score of,
say, 5-2, and cannot resume the games, how should the stake be divided among them? The
Chevalier de Méré posed this problem around 1654 to the mathematical community, and the two
famous mathematicians Blaise Pascal (1623-1662) and Pierre de Fermat (1607-1665) accepted
the challenge. They addressed the problem in an exchange of letters that established the basis
of our modern probability theory (Devlin, 2010).

6    Declarations

Funding: The authors have no funding to report.
Conflicts of interest/Competing interests: The authors have no conflict of interest to declare.
Availability of data and material: Data needed for the application
Code availability: The code for the application is available in a supplementary file.

References
BBC Sport. Arsenal plan to make 55 staff redundant because of coronavirus pandemic.
  https://www.bbc.co.uk/sport/football/53668802, 2020. Accessed 14 December 2020.

A.C. Constantinou and N.E. Fenton. Solving the problem of inadequate scoring rules for assessing
  probabilistic football forecast models. Journal of Quantitative Analysis in Sports, 8(1), 2012.

L. Csató. Coronavirus and sports leagues: how to obtain a fair ranking if the season cannot
  resume? https://arxiv.org/abs/2005.02280, 2020. Accessed 23 December 2020.

K. Devlin. The Unfinished Game: Pascal, Fermat, and the Seventeenth-Century Letter that Made
  the World Modern. Basic Books, 2010.

F.C. Duckworth and A.J. Lewis. A fair method for resetting the target in interrupted one-day
  cricket matches. Journal of the Operational Research Society, 49:220–227, 1998.

C.T. Ekstrøm, H. Van Eetvelde, C. Ley, and U. Brefeld. Evaluating one-shot tournament
  predictions. Journal of Sports Analytics, in press, 2020.

E.S. Epstein. A scoring system for probability forecasts of ranked categories. Journal of Applied
  Meteorology, 8(6):985–987, 1969.

K. Fischer and J. Haucap. Does crowd support drive the home advantage in professional soccer?
  Evidence from German ghost games during the COVID-19 pandemic.
  http://hdl.handle.net/10419/222278, 2020. Accessed 23 December 2020.

T. Gneiting and A.E. Raftery. Strictly proper scoring rules, prediction, and estimation. Journal
  of the American Statistical Association, 102(477):359–378, 2007.

P. Gorgi, S.J. Koopman, and R. Lit. Estimation of final standings in football competitions with
  premature ending: the case of COVID-19. https://papers.tinbergen.nl/20070.pdf, 2020.
  Accessed 23 December 2020.

J. Guyon. The model to determine Premier League standings.
  https://www.thetimes.co.uk/article/the-model-to-determine-premier-league-standings-ttt8tnldd,
  2020. ISSN 0140-0460. Accessed 14 December 2020.

Het Laatste Nieuws. Wat als corona geen spelbreker was? Statistici: Club kampioen, Europese
  tickets volgens stand na speeldag 29 en geen Europees voor Anderlecht.
  https://www.hln.be/belgisch-voetbal/wat-als-corona-geen-spelbreker-was-statistici-club-kampioen-europese-tickets-volgens-stand-
  2020. Accessed 14 December 2020.

M. Holroyd. Lyon among French football teams upset over league position decision.
  https://www.euronews.com/2020/05/01/french-ligue-1-abandoned-clubs-to-consider-appealing-decision,
  May 2020. Accessed 14 December 2020.

D. Karlis and I. Ntzoufras. Analysis of sports data by using bivariate Poisson models. Journal
  of the Royal Statistical Society: Series D (The Statistician), 52:381–393, 2003.

kicker. 13 der 36 Profiklubs droht Insolvenz - noch in dieser Saison.
  https://www.kicker.de/13_der_36_profiklubs_droht_insolvenz_noch_in_dieser_saison-773385/artikel,
  2020. Accessed 14 December 2020.

R. Lambers and F. Spieksma. True rankings.
  https://www.euro-online.org/websites/orinsports/wp-content/uploads/sites/10/2020/05/TrueRanking.pdf,
  2020. Accessed 14 December 2020.

C. Ley, T. Van de Wiele, and H. Van Eetvelde. Ranking soccer teams on the basis of their
  current strength: A comparison of maximum likelihood approaches. Statistical Modelling, 19:
  55–73, 2019.

M.J. Maher. Modelling association football scores. Statistica Neerlandica, 36:109–118, 1982.

G. Pantuso and L.M. Hvattum. Maximizing performance with an eye on the finances: a chance-
  constrained model for football transfer market decisions. TOP, 2020. Forthcoming.

R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for
  Statistical Computing, Vienna, Austria, 2020. URL https://www.R-project.org/.

O.D. Sæbø and L.M. Hvattum. Modelling the financial contribution of soccer players to their
  clubs. Journal of Sports Analytics, 5:23–34, 2019.

Sky Sports. Coronavirus: Football clubs at risk of going bust without government aid, open letter warns.
  https://www.skysports.com/football/news/11095/12083809/coronavirus-football-clubs-at-risk-of-going-bust-without-government-aid-open-letter-warns,
  2020. Accessed 14 December 2020.
