Labor-right Protecting Dispatch of Meal Delivery Platforms

Page created by Travis Contreras
 
CONTINUE READING
Labor-right Protecting Dispatch of Meal Delivery Platforms
Labor-right Protecting Dispatch of Meal Delivery Platforms
 Wentao Weng and Yang Yu

 Abstract— The boom in the meal delivery industry brings dispatch focuses mainly on timely delivery for attracting
 growing concern about the labor rights of riders. Current consumers while profit-driven dispatch pursues minimizing
 dispatch policies of meal-delivery platforms focus mainly on the population of riders for cost saving[4], [5]. Consequently,
 satisfying consumers or minimizing the number of riders for
 cost savings. There are few discussions on improving the the current dispatch algorithms limit the number of riders
 working conditions of riders by algorithm design. The lack of and continuously shorten the time window of deliveries
 concerns on labor rights in mechanism and dispatch design has to attract more consumers by quickly delivering foods.
arXiv:2109.14156v1 [cs.PF] 29 Sep 2021

 resulted in a very large time waste for riders and their risky Both mechanisms, user-centric and profit-driven dispatch,
 driving. In this research, we propose a queuing-model-based contribute to overload of riders. The above two problems
 framework to discuss optimal dispatch policy with the goal of
 labor rights protection. We apply our framework to develop an of the current food-delivery models together drive riders to
 algorithm minimizing the waiting time of food delivery riders overspeed.
 with guaranteed user experience. Our framework also allows In this research, we propose a queuing-model-based frame-
 us to manifest the value of restaurants’ data about their offline- work of designing dispatch policies from the perspective of
 order numbers on improving the benefits of riders. protecting the traffic safty of riders in meal delivery platforms.
 In contrast to user-centric dispatch or profit-driven dispatch, in
 I. INTRODUCTION
 this research, we develop a framework that carefully models
 A. Background the working flow of riders and seeks to mitigate the stress of
 Labor rights in meal delivery have become one of the the delivery-time limit that forces riders to drive at excessive
 top public concerns in platform economy. During the last speeds. Our rider-centric framework enables us to analyze
 decade, the meal-delivery industry has been flourishing due the pathway of protecting rider safety. According to our
 to the penetration of smart mobile phones. However, a framework, we develop a labor-rights-protection dispatch
 growing number of traffic violations and accidents due to algorithm to control the waiting time of riders in restaurants.
 risky driving of riders have attracted people to discuss the Our algorithm can achieve the optimal dispatch strategy
 relationship between platform algorithms and labor rights. that minimizes the waiting time of riders with guaranteed
 Current statistics reveal that meal delivery riders contribute to consumer-acceptable order waiting time. We also apply our
 a significant share of traffic violations. In 2017, approximately framework to explore how restaurant information can further
 one food delivery rider was reported to be injured or killed improve the labor rights of riders with a refined platform
 every 2.5 days in Shanghai, China[1]. The emerging food- dispatch policy.
 delivery platform also contributes to a nonnegligible fraction Our rider-centric framework further allows us to compre-
 of traffic accidents in Korea [2]. In the U.S., food delivery hensively capture the impacts from all stakeholders, including
 riders are viewed as one of the most dangerous jobs[3]. restaurants, consumers, platforms, and riders themselves. We
 It is necessary to develop and analyze dispatch algorithms found that information about the number of restaurant offline
 of meal-delivery platforms from the labor-right protecting orders is critical for managing rider waiting time for orders.
 perspective. However, the current algorithmic models for Therefore, our framework decomposes the meal-delivery task
 meal delivery are either user-centric or platform profit-driven into two processes: an information disclosure process and a
 and ignore the perspective of protecting riders from traffic dispatch process. The information disclosure process captures
 risks. The absence of the perspective of labor rights of riders the impact of restaurants sharing their information, while
 causes two problems. First, the current dispatch models the dispatch process captures the impact of the patience
 simplify the procedure of the working flow of riders by of consumers and decisions of platforms. We found that
 merging riders’ picking-up order period and delivery period restaurants can help with labor rights protection by sharing
 into one period. This oversimplification causes the current their private information about the number of their offline
 models to ignore the potential of improving rider traffic safety orders. This conclusion reveals the tight connection between
 by managing the rider waiting time during the period of private information and labor rights in platform economy.
 picking up orders. Consequently, user-centric or profit-driven
 B. Related work
 dispatch results in overwaiting riders. Second, the user-centric
 The growth of meal delivery platforms opens a wide
 W. Weng is with the Department of Electrical Engineering and Computer range of research areas. Due to the unique process structure
 Science, Massachusetts Institute of Technology wweng@mit.edu. Y. Yu of meal deliveries, queuing models are frequently used to
 is with the Institute for Interdisciplinary Information Sciences, Tsinghua
 University yangyu1@mail.tsinghua.edu.This work was done understand profit brought by meal delivery platforms, or
 when W.Weng was a senior undergraduate at Tsinghua University. broadly on-demand platforms [6], [7], [8], [9], [10]. The
Labor-right Protecting Dispatch of Meal Delivery Platforms
current user-centric or profit-driven dispatch adopts mainly attitudes of customers toward late orders and familiarity
a single-sided queueing model focusing on timely service of riders with roads into optimization models and show
of the randomly arriving orders by the least size of rider by simulations that data can improve dispatch efficiency.
population. In particular, [8] studies the benefit of delivery Nevertheless, to the best of our knowledge, there is yet no
platforms to restaurants by analyzing an associated queuing work discussing the impact of restaurant data and providing
model and proposing new contracts. Then, in [9], the authors theoretical insights into how restaurant data can help improve
propose a queuing model with a pricing scheme to study dispatch efficiency and labor welfare.
interactions between platforms and customers. Their model
also discusses the impact of a finite rider pool on the profit C. Labor rights of riders in food-delivery platforms
of a platform. In addition, the model of [10] analyzes the
benefit of order batching via queues with batch arrivals. Labor rights can be divided into two categories: 1) working
 From a technical perspective, our queueing model belongs benefits such as minimum wages and reasonable working
to the class of two-sided queues, whose study can be dated hours; 2) the safety of working conditions[30]. Most current
back to the 1990s [11]. In the original model, there are relative discussions in the context of platform economy focus
arrivals of positive jobs to one single queue. There are also on the first economic aspect of labor rights [31], [27], [32],
negative jobs that can cancel one positive job waiting in the while the working safety of riders is absent from the current
queue. However, when there is no positive job, the negative literature about platform economy.
job will directly leave. The sojourn time of positive jobs has Here, our work provides a first step to consider the driving
been derived [12], and the model has also been extended to safety of riders in platform economy by minimizing the
more general service time [13] and arrival processes [14]. waiting time of riders. We focus on the waiting time of riders
The stationary distribution is known to have a product form in restaurants for two reasons. First, minimizing the waste
in queueing networks with negative arrivals [15], [16]. Our of time during the order-picking-up process allows riders to
model, however, is different from previous studies in the sense have sufficient time to deliver the order. Second, in contrast
that negative arrivals, i.e., riders, can wait in another queue to user-centric or profit-driven dispatch, minimizing the
for future orders. The arrival process of riders can depend on rider’s waiting time in restaurants can prevent the algorithm
the queue length of orders instead of a time-homogeneous from continuously shortening the available time window of
process. deliveries. We believe that our work could offer a new angle
 In addition to queuing models, there are optimization for the control community to think about platform economy.
and reinforcement learning studies aiming at developing
online optimization strategies for efficient real-time meal D. Contributions
deliveries. A stream of work models rider dispatch problems
by optimizations with the objective of maximizing the number We summarize our three contributions below.
of timely orders or minimizing the averaged delays [17], • Framework of modeling labor-right protecting prob-
[18], [19]. In addition, studies of meal deliveries when a lems in meal-delivery platforms. In contrast to the
customer’s order can include food from multiple restaurants literature, we developed a model framework addressing
is studied in [20]. Randomness in meal delivery problems is labor rights and specifically labor safety in platform
considered in the optimization framework in [21], [22]. The economy. As riders play the roles linking all other players
model in [23] uses deep inverse reinforcement learning to on the platform, the labor-right protecting model has to
learn a better dispatch policy. In addition to providing efficient comprehensively model interactions between consumers,
dispatch, customer satisfaction is another core concern of restaurants, riders, and the platform.
current dispatch studies. Some recent studies discussed • Optimal dispatch algorithm for labor rights protec-
reliable estimation of arrival time by using supervised learning tion. Based on the above framework, we propose an
on deliveries [24], [25]. There also exist works pursuing efficient dispatch algorithm and rigorously prove that
improving customer’s satisfaction by designing reasonable it minimizes the time waste of riders with satisfactory
range of customers to whom each restaurant can expose [26]. user experience.
 Although the labor rights of riders have attracted sufficient • Value of restaurant data on labor rights protection.
social attention, there are only limited studies on the topics, Finally, our analysis discovers the sharp value of the
most of which focus on the impacts of wage design on private information of restaurants in improving waiting
welfare distribution. For instance, [27], [28] examine how time of riders. Such a result reveals a new perspective
wages in platforms impact customers and laborers but have on the role of restaurants in protecting labor welfare.
not discussed how the process of dispatching may impact
labor welfare. To the best of our knowledge, there is still a II. M ODEL SETTING
lack of systematic discussion about improving the dispatch for
mitigating the time-limit stress on riders, which forces riders In this section, we provide the detailed setting of an online
to drive riskily during delivery orders. We also found that meal delivery platform. Particularly, the environment enrolls
discussion about the impacts of information on labor rights four elements: a platform, a restaurant, arrivals of food orders
is rare. [29], [22] integrate the estimation of travel times, and delivery riders.
A. Model of a Meal-Delivery Process restaurant’s information disclosure scheme and 2) dispatch
 process of the platform.
 In online meal delivery service, an order needs to go
 1) Information Disclosure: When a user submits an order
through the following process until its corresponding customer
 to the platform, he/she can reveal his/her patience time T ∗ ,
can receive the food. First, immediately after being placed
 that is, how much time he/she can endure for extra order
online (via apps or websites), the order joins into the
 delays due to deliveries. Users’ patience time is crucial for
processing queue at the restaurant. We assume that the
 riders since in practice, riders would get low tips or even be
restaurant prepares orders in the processing queue one by one
 disqualified for late deliveries [34]. We assume that all orders
in a first come first serve manner, and it can only prepare one
 will fully disclose the same fixed value T ∗ . The platform has
order at one time. After an order is done, the order enters
 to design algorithms satisfying this constraint. We formalize
a new waiting queue and waits there until a rider picks it
 this decision problem in Section III.
up. In the meantime, the platform will consistently dispatch
riders to the restaurant at possibly time-varying rates. In From the view of the restaurant, when Q2 (t) ≥ 0, its value
our setting, we assume that one arriving rider will take the is only visible to the restaurant, not the platform, since the
earliest prepared order from the waiting queue. If there are platform has no information of whether one order is finished
no prepared orders, riders will also wait in a queue. When or not. Therefore, the restaurant could decide whether to reveal
one order is ready, the earliest arriving rider will take this such information to the platform. To model the restaurant’s
order and leave. willingness to share its private data with the platform, we
 design the following information disclosure mechanism. In
 We assume that orders arrive in a Poisson process with
 particular, the restaurant sets a nonnegative integer M as a
rate 1, and its service time is an exponential distribution of
 threshold that is also known by the platform. For every time t,
rate µ independent of the arrival process. The assumption
 the restaurant maintains a signal θ(t) visible to the platform,
of a Poisson arrival process and exponential service time
 given by
follows the vast study of restaurant service systems [9], [33],
[10]. Note that the assumption of a unit-rate arrival process is
 (
 0, if Q2 (t) ≤ M ;
without loss of generality because we can always normalize θ(t) = (1)
 Q2 (t) − M, if Q2 (t) > M.
the time scale. Denote the food processing queue as Q1 ,
and the waiting queue of prepared food as Q2 . Then, Q1 We allow θ(t) ≡ 0, and write it as M = ∞. Such a screening
itself is an M/M/1 queue. When one order leaves Q1 , it policy is motivated by the fact that restaurants would like to
immediately joins Q2 to wait for riders. Let Q1 (t), Q2 (t) be hide the number of waiting prepared orders when it is small.
the number of orders in Q1 , Q2 at time t, respectively. When In this way, the platform may send excessive riders, which
there is no waiting order but a waiting riders, we would write can help reduce the waiting time of orders. The threshold
Q2 (t) = −a. In this sense, riders are like negative arrivals M measures how much information is being shared. When
that can cancel an order from the queue. Figure 1 provides M = 0, the restaurant shares all of its private information.
an illustration of the whole process. However, if M = ∞, the restaurant would provide no data
 to the platform.
B. Information Disclosure Dispatch Process 2) Dispatch process: Finally, we define the dispatch
 Based on knowledge of the order arrival rate, the platform process of the platform. The dispatch policy is assumed
must design corresponding dispatch algorithms to send riders. to be a time-varying Poisson arrival process of riders whose
In addition, the platform has two sources of information, rates change according to θ(t) provided by the restaurant.
one from users and the other from the restaurant, both of Extending the dispatch policy to more general processes may
which may play a role in the dispatch process. Specifically, entail more technical challenges, which exceeds the focus of
we divide the decision process into two parts: 1) users’ and this paper.
 In particular, the platform first sets a list of dispatch rate
 parameters ~λ = (λ0 , λ1 , · · · ) where λi ≤ Λ and the upper
 bound Λ measures the maximal availability of riders near the
 Riders Orders restaurant. After choosing ~λ, the platform will send riders to
 Platform
 the restaurant in a Poisson process whose rate is given by
 λθ(t) . That is, conditioned on the event that the restaurant
 signal is equal to a value i, riders will arrive to the restaurant
 in a Poisson process of rate λi . Furthermore, if there are
 already d riders waiting at the restaurant, i.e., Q2 (t) = −d,
 the platform will not send any riders even when it is going
 to do so. We call d the buffer length. For ease of exposition,
 we assume a rider will arrive at the restaurant immediately
 when he/she is summoned by the platform. In addition, we
 assume that the arrival process of riders is independent of
 Fig. 1: Illustration of Meal Delivery Services the order arrival process and service time of orders.
0,-d 1,-d 2,-d 3,-d ... III. L ABOR -R IGHT P ROTECTING D ISPATCH P ROBLEM
 ... AND A NALYSIS

 ...

 ...

 ...
 Based on the mathematical model of a meal delivery
 0,-1 1,-1 2,-1 3,-1 ... platform, we propose the labor-right protecting dispatch
 problem in this section. The goal is to protect laborers while
 guaranteeing customer experience. In particular, this section
 0,0 1,0 2,0 3,0 ... includes two parts. In the first part, we formulate the labor
 rights-protecting dispatch problem. In the second part, we
 analyze the characteristics of the model, which enables us to
 design algorithms for the dispatch problem in Section IV.
 0,1 1,1 2,1 3,1 ...
 ... A. Problem Formulation
 ...
 ...
 ...

 To formalize the experience guarantee, we first notice that
 Fig. 2: The Markov chain with state (q1 , q2 ) the expected order waiting time is at least the mean response
 time in an M/M/1 queue of arrival rate 1 and service rate
 1
 µ. Then, we know E [To ] is at least µ−1 [35].
 Proposition 1: The averaged order waiting time E [To ] is
 We note here that ~λ, d and Λ are tightly connected to two 1
 at least µ−1 .
crucial parts in today’s meal deliveries: postponed dispatch Based on this lower bound on expected order waiting time,
[21] and delivery scope design [26]. Specifically, in postponed we can formalize the meaning of patient time T ∗ in the
dispatch, the platform will not immediately dispatch a rider sense that we require E [To ] ≤ µ−11
 + T ∗ since E [To ] − µ−1
 1

to restaurants when an order arrives so that they will not measures the extra delays brought by deliveries. Then, after
arrive with orders being unprepared. Our arrival rates ~λ can the restaurant reveals its threshold M , the platform needs to
thus be seen as how fast the platform will send out a rider. find the best ~λ, d to minimize the averaged rider waiting time
Furthermore, when there are already too many riders (d in while satisfying requirements of users.
our model) waiting for food, the platform won’t send any Based on the above discussion, we propose the following
rider. In addition, the platform in general will divide a whole optimization framework for the labor-right protecting dispatch
city into multiple regions and deploy different numbers of problem.
riders in each area. The number of riders in one region will
then decide the magnitude of the parameter Λ.
 minimize E~λ,d,M [rider waiting time Tr ]
 To ensure that the platform’s capacity is sufficient to serve ~
 λ,d
incoming orders and that the system is stable, we impose the subject to E~λ,d,M [order waiting time To ] (2)
following assumptions on dispatch rates of riders. 1
 Assumption 1: Rates µ, Λ, λ0 are larger than 1, and λi > 0 ≤ + user patience time T ∗ .
 µ−1
for all i > 0. In addition,
 P∞ itQsatisfies that there exists a constant
 i Note that our optimization problem also belongs to the vast
C < ∞, such that i=1 j=1 λ1i = C.
 domain of constrained Markov decision processes (CMDPs)
 Under the above assumption, the system state can be [36]. The goal of CMDP is to obtain the optimal state-
the pair of (Q1 (t), Q2 (t)), which forms a continuous-time dependent policy of a certain objective while satisfying the
Markov chain. Fig. 2 illustrates a snapshot of this Markov given constraints. The optimal policy can be complicated
chain. Furthermore, the system is positively recurrent and and even randomized. However, in the context of our paper,
enjoys a unique stationary distribution based on Assumption we not only want to solve the optimization problem (2)
1 [35]. To measure customer experience and labor rights, this but also seek to explicitly understand the trade-off between
paper focuses on the mean order waiting time E [To ] and the the objective "rider waiting time" and the constraint "order
mean rider waiting time E [Tr ] in the system. To be more waiting time" and how the decisions and the trade-off could
specific, the waiting time of an order is measured as the change in response to the volume of information provided by
time difference between its arrival at the restaurant and its the restaurant. Therefore, the policy space in our optimization
time fetched by a rider. The mean order waiting time is the problems is kept as simple as possible to discuss the above
time-averaged waiting time among all orders. Similarly, the information-related decisions and does not involve the general
waiting time of one rider is the time between his/her arrival space of randomized policies. We believe a new framework
at the restaurant and the time he/she fetches one order. The is needed for such discussions and thus leave it as a future
mean rider waiting time is then the time-averaged waiting study.
time of all riders.
 Note that the two expectations are determined by the B. Model Analysis
platform’s setting of ~λ, d, M . Therefore, when we want to To obtain a deeper understanding of the problem, we first
stress such dependence, they are written as E~λ,d,M [To ] and study the stationary distribution of the system in the general
E~λ,d,M [Tr ]. setting where the restaurant allows a threshold M , and the
platform sets the associated dispatch rate ~λ and the buffer 3) Analysis for M = ∞: Restricting our scope to M =
length d. Based on the distribution, we observe a decoupling ∞, based on Lemma 1, we can obtain expressions of E [To ]
phenomenon between Q1 and Q2 . Finally, we restrict our and E [Tr ] via Little’s Law[35]. Formally, we have this lemma.
scope to the case of M = ∞ and obtain close-form formulas Lemma 3: For fixed λ0 > 1 and d, let ρ = λ10 . It satisfies
for E [To ] and E [Tr ]. that
 1) Stationary Distribution of the General Model: Recall 1 ρd+1
 Eλ0 ,d [To ] = + , (6a)
that the state of the system is given by (Q1 (t), Q2 (t)), namely, µ−1 1−ρ
the number of orders waiting for preparation and the number
of prepared orders waiting for riders. If Q2 (t) < 0, it is the ρ − ρd+1
 Eλ0 ,d [Tr ] = d − . (6b)
negative of waiting riders. Let (Q1 , Q2 ) be the corresponding 1−ρ
random variables distributed with the stationary distribution IV. L ABOR - RIGHT PROTECTING ALGORITHM
of (Q1 (t), Q2 (t)). Let πq1 ,q2 be the stationary distribution
 Since it is generally nontrivial to solve (2), in the first
of the event (Q1 = q1 , Q2 = q2 ). Fix the threshold M , the
 part of this section, we start our analysis by assuming that
dispatch rates ~λ = (λ0 , λ1 , · · · ), and the buffer level d, and
 there is no restaurant information, i.e., M = ∞. Under
assume they satisfy the assumption 1. The state space of the
 this assumption, we propose Algorithm 1 that provides the
system would be N × ({−d, d + 1, · · · , 1} ∪ N).
 procedure to find the optimal λ0 , d. Second, when there are
 Define ρ = λ10 . The following lemma shows thatπq1 ,q2 restaurant data, we design Algorithm 2 to improve policies
exhibits a product-form property similar to the property in based on new information.
queueing networks with negative arrivals [15]. The proof of
this lemma can be found in the appendix. A. Algorithm and its Optimality when No Restaurant Data
 Lemma 1: It holds that for q1 ≥ 0, q2 ≥ −d, When there is no restaurant information, the platform needs
     q1 only to seek the optimal design of λ0 , d. The closed-form
 1 1
 C 1 − ρq2 , if q2 ≤ M ; expressions of E [To ] and E [Tr ] shown in the Lemma 3
 
 
  1
 
  µ µ provides the basis of the following algorithm to solve (2).
 πq1 ,q2 =   q1 −M
 q2Y
 Algorithm 1 (Optimal Dispatch Algorithm): The
 
  1 1 M 1
 
  C1 1 − ρ , otherwise, algorithm takes T ∗ and Λ as inputs and outputs two
 µ µ λi
 
 
 i=1
 parameters λ0 , d as follows.
 Λ
where • If 1−Λ ≤ T , simply returns d = 0, λ0 = Λ;
 • Otherwise, for every ρ ∈ [0, 1), define D(ρ) as the
 1−ρ D(ρ)+1
 C1 = P∞ Qi . (3) smallest nonnegative integer such that ρ 1−ρ ≤ T ∗ ;
 ρ−d − ρM +1+ (1 − ρ)ρM i=1 j=1 λ1i and for every nonnegative integer n, define γ(n) as the
 2) Queue Decoupling: Observe that by Lemma 1, we maximum real number in (0, 1) such that γ(n)
 n+1
 ∗
 1−γ(n) ≤ T .
can indeed decouple Q2 from Q1 . If we consider only the 1 1
 • Let d1 = D( Λ ). The algorithm returns d1 , γ(d ) .
distribution of events Q2 = q for q ≥ −d, the stationary 1
 The next theorem supports the optimality of Algorithm 1.
distribution of Q2 is equivalent to the stationary distribution
of the following Markov chain. In the equivalent Markov Theorem 1: Under the two parameters λ0 , d found by
chain, the set of the state is {−d, · · · , −1} ∪ N. For each Algorithm 1, and assuming M = ∞, it satisfies E [To ] =
 1 ∗
state q ≥ −d, it transitions to state q + 1 with rate 1 and µ−1 + T , and E [Tr ] is minimized.
to q − 1 with rate λmax(q−M,0) if q > −d. The stationary Let’s look at a proof sketch of Theorem 1. A formal proof
distribution of this Markov chain, denoted by νq , is exactly is provided in the appendix. The goal is to minimize E [Tr ]
given by while satisfying the delay constraint in (2). By Lemma 3, we
 d+1
 want to find λ0 , d such that d − ρ−ρ 1−ρ is minimized and
 C ρq2 , if q2 ≤ M ; ρd+1
 ≤ T ∗ . The intuition of Algorithm 1 is to choose the
 
  1
  1−ρ
 q2Y−M smallest λ0 satisfying the constraint in (2). Note that E [Tr ]
 νq = M 1 (4)
 C1 ρ
  , otherwise, is a decreasing function of ρ when d is fixed. Therefore,
 λi
 i=1 after finding the smallest d, we would increase the value of
 ρ (or equivalently decrease λ0 ) until the constraint in (2) is
where ρ, C1 are the same as those in Lemma 1.
 violated.
 In addition, let Q∗ be the random variable corresponding
to the state in this Markov chain. Then, by equivalence, we B. Policy Improvement using Restaurant Information
have the following property connecting E [To ] , E [Tr ] with
 In this section, we explore how to improve platform
Q∗ . Due to space limit, we omit its proof.
 policies using restaurant data. Particularly, the restaurant
 Lemma 2: It holds that provides more information to the platform by reducing its
 1 threshold from M to m, where m < M . Our results highlight
 E [To ] = + E [max(Q∗ , 0)] that restaurant data are of particular importance to help the
 µ−1 (5)
 E [Tr ] = −E [min(Q∗ , 0)] . platform make better decisions.
Formally, let us assume that the platform has a policy given This result shows that the sum of T ∗ and E [Tr ] is lower
 by when the threshold is M . Then, algorithm 2 outputs a bounded by a value determined by the platform capacity Λ.
 new policy for a given threshold m, m < M that is arguably As a result, when Λ is fixed, no matter how the platform acts,
 better than the original policy. Algorithm 2 is given below. it is impossible for both T ∗ and E [Tr ] to become negligible
 Algorithm 2 (Policy Improvement with Restaurant’s Data): simultaneously. Nevertheless, the lower bound decreases when
The algorithm takes ~λ = (λ0 , · · · ), d, M, m and Λ as T ∗ or Λ increases, thus implying that if customers allow a
 inputs and outputs the dispatch rates and buffer level, longer delivery time or if the platform is able to recruit more
~τ = (τ0 , · · · ), d1 , of the new policy as follows. Define riders, then laborers can enjoy a lower average waiting time.
 ρ = 1/λ0 .
 ρ−ρM −m+1 P∞ Qi−M VI. N UMERICAL C ASE
 • Set C = 1−ρ + ρM −m i=M +1 j=1 λ1j .
 1 In this section, we aim to illustrate the importance of
 • Set τ0 = λ0 , τ1 = 1 , and τi = Λ for all i > 1.
 (1− Λ )C the labor rights protection problem, the dilemma between
 • Let d1 = d. Output (~ τ , d1 ). customer experience and rider waiting time, and the value
 Our next result shows that using Algorithm 2, the platform of restaurant information using simulations. Particularly, we
 can always find a policy with better averaged order waiting assume that λ0 = 1, and µ = Λ = 1.5. It is a reasonable
 time and the same averaged rider waiting time based on any assumption because in practice, the restaurant and the platform
 policy of a larger threshold. usually only recruit nearly sufficient numbers of riders to
 Theorem 2: Suppose that the platform has a policy with reduce cost.
 parameters ~λ = (λ0 , · · · ), d and the restaurant’s threshold
 is M , M > 0. If M reduces to m, m < M , then using A. Importance of labor rights protection and Trade-offs
Algorithm 2, the platform can construct a new set of dispatch To study the impact of the labor rights protection prob-
 rates ~τ , such that lem, we consider three settings of patient time, T ∗ =
 E~τ ,d,m [To ] ≤ E~λ,d,M [To ] , E~τ ,d,m [Tr ] = E~λ,d,M [Tr ] , (7) 0.01, 0.05, 0.1. In addition, we assume that there is no
 restaurant information, so M = ∞. Then, for λ0 ∈ (1, Λ),
where the inequality becomes equal only if we plot the curve of Eλ0 ,d [Tr ], where d is the minimum
 
 ∞ i−M
  1
 integer such that Eλ0 ,d [Tr ] − µ−1 ≤ T ∗ . The result is shown
 Λ − 1  ρ − ρM −m+1 X Y 1 in Fig. 3. As we can see, for a fixed T ∗ , although all the
 ρ= + ρM −m .
 Λ 1−ρ λj
 i=M +1 j=1
 Remark 1: Note that if M = ∞ in Theorem 2, the
inequality in (7) becomes equal if and only if λ0 = Λ, that is,
the platform is already dispatching riders at its full capacity.
The proof of Theorem 2 can be found in the appendix.
 V. O BLIGATION OF DIFFERENT STAKEHOLDERS
 In this section, we discuss obligations in protecting the
labor rights of the three stakeholders: the platform, users, and Fig. 3: Averaged Rider waiting time under Different Dispatch
the restaurant. Rates Given that Customer Experience is satisfied
 First, for restaurant information, Theorem 2 tells us that
whenever there are data from the restaurant side, the platform patience time is satisfied, the averaged rider waiting time
can improve the experience of customers while maintaining varies greatly. A clever setting of λ0 can reduce the waiting
the same averaged rider waiting time. In this sense, restaurant time by a factor of 4. In addition, when T ∗ increases, the
data help to improve labor rights by making the labor more averaged rider waiting time decreases. It characterizes the
efficient. We will further quantify the effect of queueing data trade-off between customer experience and labor rights.
of the restaurant in Section VI.
 Second, we desire to understand how the customer’s B. Value of Restaurant’s Data
requirement of experience and the platform’s ability to recruit To highlight the value of the data from the restaurant,
riders may affect the averaged rider waiting time. For ease of we assume that M could be 0, 10, ∞. In this case, we are
exposition, we assume that M = ∞ in this part. Furthermore, able to differentiate different levels of restaurant information.
suppose that the platform is still trying to solve the labor-right For the platform’s policy, we fix d = 0 and then allow
protecting dispatch problem introduced in (2). λ0 to vary from 1 to Λ to be the policy for M = ∞.
 In this case, we show that there is a fundamental lower Then, for other M , we employ Algorithm 2 to generate new
bound of E [Tr ] given that T ∗ is low. The proof is deferred policies. We plot the value of E~λ,d,M [To ] after the policy
to the appendix. improvement as shown in Fig. 4. As we can tell from Fig. 4,
 Theorem 3 (Dilemma between customers and riders): restaurant data and algorithm 2 can indeed greatly improve
 ρ̂4 ∗
Suppose that ρ̂ : = Λ1 ≥ 0.3 and 1− ρ̂ ≥ T and M = ∞. the customer experience, decreasing E [To ] by a factor of two
 ln(4/3) 1 ∗
Then, it holds E [Tr ] ≥ 4 1−ρ̂ − T , for all feasible d when M = 0 and λ0 = 1.2. Specifically, we can observe that
and λ0 satisfying the constraint in (2). the improvement is greater when λ0 or M is smaller.
[11] E. Gelenbe, P. Glynn, and K. Sigman, “Queues with negative arrivals,”
 Journal of applied probability, vol. 28, no. 1, pp. 245–250, 1991.
 [12] P. G. Harrison and E. Pitel, “Sojourn times in single-server queues by
 negative customers,” Journal of Applied Probability, vol. 30, no. 4, pp.
 943–963, 1993.
 [13] ——, “The m/g/1 queue with negative customers,” Advances in Applied
 Probability, vol. 28, no. 2, pp. 540–566, 1996.
 [14] Q.-L. Li and Y. Q. Zhao, “A map/g/1 queue with negative customers,”
 Queueing Systems, vol. 47, no. 1, pp. 5–43, 2004.
Fig. 4: Averaged Order Waiting Time After Using Restaurant [15] W. Henderson, “Queueing networks with negative customers and
 negative queue lengths,” Journal of Applied Probability, pp. 931–942,
Data 1993.
 [16] E. Gelenbe, “Product-form queueing networks with negative and
 positive customers,” Journal of applied probability, vol. 28, no. 3,
 pp. 656–663, 1991.
 VII. CONCLUSIONS [17] B. Yildiz and M. Savelsbergh, “Provably high-quality solutions for the
 In this paper, we formulated a queueing model for meal meal delivery routing problem,” Transportation Science, vol. 53, no. 5,
 pp. 1372–1388, 2019.
delivery platforms that incorporated four key stakeholders: the [18] D. Reyes, A. Erera, M. Savelsbergh, S. Sahasrabudhe, and R. O’Neil,
platform, the restaurant, riders, and customers. In particular, “The meal delivery routing problem,” Optimization Online, 2018.
we proposed the labor rights protecting problem, whose goal is [19] M. Joshi, A. Singh, S. Ranu, A. Bagchi, P. Karia, and P. Kala, “Batching
 and matching for food delivery in dynamic road networks,” arXiv
to reduce waiting time of riders while maintaining satisfactory preprint arXiv:2008.12905, 2020.
customer experience. When there is no restaurant information, [20] Z. Steever, M. Karwan, and C. Murray, “Dynamic courier routing for
we designed an efficient algorithm to find the optimal dispatch a food delivery service,” Computers & Operations Research, vol. 107,
 pp. 173–188, 2019.
parameters for the platform. When the restaurant is willing [21] M. W. Ulmer, B. W. Thomas, A. M. Campbell, and N. Woyak, “The
to share its private data, we further characterized a policy restaurant meal delivery problem: Dynamic pickup and delivery with
improvement method that takes any policy as an input and deadlines and random ready times,” Transportation Science, 2020.
returns a refined policy based on restaurant data. The refined [22] S. Liu, L. He, and Z.-J. Max Shen, “On-time last-mile delivery: Order
 assignment with travel-time predictors,” Management Science, 2020.
policy provably improves customer experience, while labor [23] S. Liu, H. Jiang, S. Chen, J. Ye, R. He, and Z. Sun, “Integrating
rights did not get hurt. Our results also indicated that there dijkstra’s algorithm into deep inverse reinforcement learning for food
was a trade-off between customer experience and labor rights delivery route planning,” Transportation Research Part E: Logistics
 and Transportation Review, vol. 142, p. 102070, 2020.
in the sense that they cannot be satisfied simultaneously. Our [24] L. Zhu, W. Yu, K. Zhou, X. Wang, W. Feng, P. Wang, N. Chen, and
simulations confirmed the theoretical results. Specifically, they P. Lee, “Order fulfillment cycle time estimation for on-demand food
revealed that an optimal labor-rights protecting algorithm delivery,” in Proceedings of the 26th ACM SIGKDD International
 Conference on Knowledge Discovery & Data Mining, 2020, pp. 2571–
could dramatically reduce the waiting time of riders, and 2580.
restaurant data could considerably reduce the order waiting [25] F. D. Hildebrandt and M. W. Ulmer, “Supervised learning for arrival
time, especially when the restaurant would like to share more time estimations in restaurant meal delivery,” Working paper, Tech.
 Rep., 2020.
information. [26] X. Ding, R. Zhang, Z. Mao, K. Xing, F. Du, X. Liu, G. Wei, F. Yin,
 R. He, and Z. Sun, “Delivery scope: A new way of restaurant retrieval
 R EFERENCES for on-demand food delivery service,” in Proceedings of the 26th ACM
 SIGKDD International Conference on Knowledge Discovery & Data
 [1] “Takeaway riders become a high-risk profession? two takeaway
 Mining, 2020, pp. 3026–3034.
 platforms: set flexible time,” https://www.tellerreport.com/life/
 [27] S. Benjaafar, J.-Y. Ding, G. Kong, and T. Taylor, “Labor welfare in
 2020-09-10-takeaway-riders-become-a-high-risk-profession--two-
 on-demand service platforms,” Available at SSRN 3103447, 2020.
 takeaway-platforms--set-flexible-time.BJexgaC4DEP.html, accessed:
 2021-09-06. [28] S. Wu, S. Xiao, and S. Benjaafar, “Two-sided competition between
 [2] J. H. Byun, B. Y. Jeong, and M. H. Park, “Characteristics of motorcycle on-demand service platforms,” Available at SSRN 3525971, 2020.
 crashes of food delivery workers,” Journal of the Ergonomics Society [29] W. Mao, L. Ming, Y. Rong, C. S. Tang, and H. Zheng, “Faster
 of Korea, vol. 36, no. 2, pp. 157–168, 2017. deliveries and smarter order assignments for an on-demand meal
 [3] “Top 25 most dangerous jobs in the united states,” https:// delivery platform,” Available at SSRN 3469015, 2019.
 advisorsmith.com/data/most-dangerous-jobs/, accessed: 2021-09-06. [30] “Labor rights,” https://en.wikipedia.org/wiki/Labor_rights, accessed:
 [4] “Food delivery: Drivers take the risks. platforms reap the re- 2021-09-06.
 wards.” https://technode.com/2019/12/04/food-delivery-drivers-take- [31] V. De Stefano, “The rise of the just-in-time workforce: On-demand
 the-risks-platforms-reap-the-rewards/, accessed: 2021-09-06. work, crowdwork, and labor protection in the gig-economy,” Comp.
 [5] N. Christie and H. Ward, “The health and safety risks for people who Lab. L. & Pol’y J., vol. 37, p. 471, 2015.
 drive for work in the gig economy,” Journal of Transport & Health, [32] J. A. Parrott and M. Reich, “An earnings standard for new york city’s
 vol. 13, pp. 115–127, 2019. app-based drivers,” New York: The New School: Center for New York
 [6] J. Bai, K. C. So, C. S. Tang, X. Chen, and H. Wang, “Coordinating City Affairs, 2018.
 supply and demand on an on-demand service platform with impatient [33] J. Hwang, L. Gao, and W. Jang, “Joint demand and capacity manage-
 customers,” Manufacturing & Service Operations Management, vol. 21, ment in a restaurant system,” European journal of operational research,
 no. 3, pp. 556–570, 2019. vol. 207, no. 1, pp. 465–472, 2010.
 [7] T. A. Taylor, “On-demand service platforms,” Manufacturing & Service [34] “Lateness based violations explained,” https://help.doordash.com/
 Operations Management, vol. 20, no. 4, pp. 704–720, 2018. dashers/s/article/Lateness-Based-Deactivations-Explained?language=
 [8] P. Feldman, A. E. Frazelle, and R. Swinney, “Can delivery platforms en_US, accessed: 2021-09-06.
 benefit restaurants?” Available at SSRN 3258739, 2019. [35] M. Harchol-Balter, Performance modeling and design of computer
 [9] M. Chen, M. Hu, and J. Wang, “Food delivery service and restaurant: systems: queueing theory in action. Cambridge University Press,
 Friend or foe?” Available at SSRN 3469971, 2019. 2013.
[10] M. Chen and M. Hu, “Courier dispatch in on-demand delivery,” [36] E. Altman, Constrained Markov decision processes. CRC Press, 1999,
 Available at SSRN, 2020. vol. 7.
A PPENDIX Simplifying Lemma 1 by setting M = ∞. We have πq1 ,q2 =
 (1 − ρ)ρd (1 − µ1 ) µ1q ρq2 for q1 ≥ 0, q2 ≥ −d. Then
A. Proof of Lemma 1 1

 P∞ P∞ ∞ ∞
 Proof: Notice that q2 =−d πq1 ,q2 = 1. We
 q1 =0
 X X
can then prove the result by checking that πq1 ,q2 satisfies E [No ] = q1 πq1 ,q2 + q2 πq1 ,q2
 q1 ,q2 =0 q1 ,q2 =0
the balance equation of the Markov chain because the ∞
steady state distribution is the unique solution to the balance 1 X
 = + (1 − ρ)ρd+j j
equation given that it is normalized[35]. For any x, denote µ − 1 q =0
 2
x+ = max(0, x). In the considered Markov chain, for a state
 1 ρd+1
(q1 , q2 ), its transitions are given as follows: = + .
  µ−1 1−ρ
 (q1 + 1, q2 ), of rate 1;
 On the other hand, define E [Nr ] be the expected waiting
 
 
(q1 , q2 ) → (q1 − 1, q2 + 1), of rate µ if q1 > 0; riders. Again, by Little’s Law, E [Nr ] = λ̂E [Tr ], where λ̂ is
 
 (q1 , q2 − 1), of rate λ(q2 −M )+ if q2 > −d. the "true" dispatch rate of riders. That is,
 

 ∞
 !
For any two state Q, P , denote the rate from Q to P by rQ,P . X
Fix a state Q = (q1 , q2 ). Here we only check the balance λ̂ = λ0 1 − πq1 ,−d = λ0 ρ = 1.
equation for state (q1 , q2 ) satisfying q1 > 0, q2 > −d. The q1 =0

same strategy can verify other cases similarly. The balance Then for E [Nr ], it holds
equation of this state is that
 X X ∞
 X −1
 X d
 X
 inflow : = πP rP,Q = πQ rQ,P = : outflow. E [Nr ] = −q2 πq1 ,q2 = (1 − ρ)ρd jρj
 P P q1 =0 q2 =−d j=1
 d
We need to consider three cases. First, suppose that q2 < M . 1−ρ
 =d− ,
Then by the form of π, it holds λ0 − 1
 1−ρd
 inflow = πq1 −1,q2 + πq1 ,q2 +1 λ0 + πq1 +1,q2 −1 µ and thus E [Tr ] = d − λ0 −1 .
 !
 1 1 1
 =C + q1 q2 + C. Formal Proof of Theorem 1
 µq1 −1 λq02 µ λ0 µq1 λ0q2 −1
 We first have the following lemma demonstrating the
 = πq1 ,q2 (1 + λ0 + µ) = outflow.
 optimality of choosing smallest feasible d.
Then if q2 = M , it satisfies that Recall the function D(ρ) and the function γ(n) from
 Algorithm 1 defined for a fixed T ∗ . Let dl = max(1, D( Λ1 )).
 inflow = πq1 −1,q2 + πq1 ,q2 +1 λ1 + πq1 +1,q2 −1 µ For every d ≥ dl , define h(d) = E γ(d)
 ! 1
 ,d [Tr ].
 1 1 1 Lemma 4: It holds that h(d) is a strictly increasing func-
 =C + q1 q2 +
 µq1 −1 λq02 µ λ0 µ 1 λ0q2 −1
 q tion when d ≥ dl .
 Proof: Let ρl be the smallest real number such that
 = πq1 ,q2 (1 + λ0 + µ) = outflow. ρ2l ∗ 1 1
 1−ρl ≥ T . And if ρl < Λ , set it to be Λ . We can then see
Finally, for q2 > M , that dl = D(ρl ). Now for every ρ ≥ ρl , define x(ρ) such
 x(ρ)+1 ∗
 that ρ 1−ρ = T ∗ . Then x(ρ) + 1 = ln T +ln(1−ρ)
 ln ρ ≥ 1. Note
 inflow
 that x(ρ) may not be integer.
 = πq1 −1,q2 + πq1 ,q2 +1 λq2 +1−M + πq1 +1,q2 −1 µ We claim that for every d ≥ dl , it holds x(γ(d)) = d.
 −M
 q2Y q2 −M
 Y−1
 !!
 1 1 1 1 That is, x is like a reverse function of γ, but it extends the
 =C + q M + definitional domain. We prove this claim by contradiction.
 µq1 −1 λq02 µ λ0
 1 λi λi
 i=1 i=1 Suppose it is not. It must hold that x(γ(d)) < d, and
 = πq1 ,q2 (1 + λq2 −M + µ) = outflow, (γ(d))d+1 ∗
 1−γ(d) < T . However, in this case, we can find a ρ > γ(d)
 d+1
 ρ
which completes the proof. such that 1−ρ) ≤ T ∗ , which contradicts the fact that γ(d)
 is the largest such possible ρ. As a result, we must have
B. Proof of Lemma 3 x(γ(d)) = d.
 x(ρ)+1

 Proof: Define E [No ] be the expected number of unsent For ρ ≥ ρl , define g(ρ) : = x(ρ) − ρ−ρ1−ρ . Then
orders. By Little’s Law, E [To ] = E [No ] because orders’ to prove that h(d) is strictly increasing, it is sufficient to
arrival rate is 1. It holds that prove that g(ρ) is strictly increasing since h(d) = g(γ(d)).
 x(ρ)+1

 ∞ X
 ∞ Furthermore, observe that for ρ ≥ ρl , ρ 1−ρ = T ∗ , which
 is a∗constant. It is enough to prove the function G(ρ) : =
 X
 E [No ] = (q1 + q2 )πq1 ,q2 . ln T +ln(1−ρ) ρ
 q1 =0 q2 =0 ln ρ − 1−ρ − 1 is strictly increasing.
To prove it, take derivative of G, and we can get could notice that
  
 M ∞ i−M
 −1 ln T ∗ + ln(1 − ρ) 1 1
 G0 (ρ) =
 X X Y
 − − πA (m)  ρi−m + ρM −m = 1,
 (1 − ρ) ln ρ ρ(ln ρ)2 (1 − ρ)2 λj
 i=−d i=M +1 j=1
 −1 (x(ρ) + 1) ln ρ 1
 = − − (8) and
 (1 − ρ) ln ρ ρ(ln ρ)2 (1 − ρ)2
 −1 1 1 m ∞  i−m−1 !
 ≥ − − ,
 X
 i−m
 X 1
 (1 − ρ) ln ρ ρ ln ρ (1 − ρ)2 πB (m) ρ + τ1 = 1.
 i=m+1
 Λ
 i=−d
where the last inequality is because x(ρ)+1 ≥ 1 and ln(ρ) < Then by the definition of τ1 , it holds that πA (m) = πB (m).
0. As a result, by Lemma 2, it leads to the fact that
 −1 1 1
 It remains to show that G1 (ρ) = (1−ρ) ln ρ − ρ ln ρ − (1−ρ)2 −1
is positive in (0, 1). Simplifying the terms gives G1 (ρ) = X
 −ρ+ρ ln ρ+1 E~τ ,d,m [Tr ] = (−i)ρi−m πA (m) = E~λ,d,M [Tr ] .
−(1−ρ)2 ρ ln ρ . For the function G2 (ρ) = −ρ + ρ ln ρ + 1, i=−d
its derivative is ln ρ, which is negative in (0, 1). Therefore,
the function G2 (ρ) is a strictly decreasing function in (0, 1). 2) Optimization Framework and Greedy Improvement:
Since G2 (1) = 0, we know both G2 (ρ) is always positive It remains to show that the mean number of waiting orders
in (0, 1), and so is G1 (ρ). Consequently, G(ρ) is a strictly decreases. For Markov chain A, define aq = ππAA(m)(q)
 for
increasing function, which completes the proof. q > 0. Similarly, define bq for Markov chain B. Indeed,
 aq−1
We can now finish the proof of Theorem 1. aq = λmax(q−M,0) . Then by Lemma 2, we have
 Λ
 Proof: In Algorithm 1, we first check whether 1−Λ ≤ ∞
 X
 ∗
T . If it is, it means it is feasible to set d = 0, and λ0 = Λ. E~λ,d,M [To ] − E~τ ,d,m [To ] = πA (m) q (aq − bq ) .
It is optimal because EΛ,0 [Tr ] = 0. q=1
 Otherwise, we have to set d ≥ 1. Notice that when d is
fixed, EΛ,0 [To ] is a decreasing function, while EΛ,0 [Tr ] is an
 Furthermore,
 P∞ we can P∞observe that by the definition of τ1 , it
 holds q=1 aq = q=1 bq < ∞. Denote this constant by H.
increasing function. Therefore, suppose we want to select d To show that E~λ,d,M [To ] ≥ E~τ ,d,m [To ], it suffices to prove
one by one. We can first check whether the pair of (λ0 = Λ, d) {bq } is the solution to the following optimization problem:
can satisfy the constraint in (2). If it does, then we decrease
 ∞
the value of λ0 to obtain lower averaged rider waiting time. X
 minimize i · xi
The above discussion claims that feasible d must be at least ~
 x=x1 ,···
 i=1
D( Λ1 ). And for each feasible d, the optimal choice is to ∞
 1 X
choose λ0 = γ(d) . The averaged rider waiting time given subject to xi = H
by this parameter setting is exactly h(d). Now according i=1 (9)
to Lemma 4, h(d) is an increasing function. Therefore, the xi 1
 ≥ , ∀i > 1
optimal solution is to set d = D( Λ1 ), and λ0 = γ(d) 1
 . xi−1 Λ
 1
 x1 ≥ .
D. Proof of Theorem 2 Λ
 We now provide the proof of Theorem 2. The intuition The requirement of xxi−1i
 ≥ Λ1 comes from the fact that the
is that, by Little’s Law, it suffices to prove that after using dispatch rate cannot exceed Λ.
the new policy (~τ , d, m), the mean number of waiting riders Greedily, given a feasible setting of ~x in 9, the best strategy
remains the same, but the mean number of waiting orders is to keep xi large for small i, and keep xi small for large i.
decreases. Particularly, we claim that the solution must satisfy xxi+1i
 = Λ1
 Then the proof starts by coupling the Markov chain of the for all i ≥ 1. Otherwise, we can find the smallest i with
 xi+1 1
threshold m and that of the threshold M , M > m. After that, xi > Λ . It always improves the objective by decreasing
we propose an optimization problem to minimize the mean xi+1 and increasing xi . Then we can solve out x1 , and it
number of waiting orders, and use a greedy policy to solve it. turns out that the produced solution is indeed {bq }, which
Algorithm 2 corresponds to the final outcome of this policy, completes the proof.
and thus makes an improvement.
 E. Proof of Theorem 3
 1) System Coupling: To enable analysis, recall the dis-
cussion in Section III-B.2, we only need to consider the Fix λ0 and d satisfying the constraint in (2). By the
equivalent Markov chain of Q2 to calculate the mean numbers assumption in Theorem 3, it holds that λ10 ≥ 0.3 and d ≥ 3.
of waiting orders and riders. Denote the new Markov chain of Then to show that there is a trade-off between customer
(~λ, d, M ) as A, and that of (~τ , d, m) as B. Denote their steady- experience and labor right, one key observation is that, by
state distributions by πA (q), πB (q). We can now couple the Lemma 3,
two Markov chains on state m. Recall that ρ = λ10 . By 1 2ρd+1 − ρ
solving balance equations with state m as the initial start, we E [Tr ] + E [To ] − =d+ , (10)
 µ−1 1−ρ
d+1
where ρ = λ10 . Let f (ρ) = 2ρ 1−ρ−ρ . We then consider two
cases.
 First, when ρd+1 ≥ 43 , it holds f (ρ) ≥ 2(1−ρ)
 1
 because
 1 1
ρ < 1. Therefore, E [Tr ] + E [To ] − µ−1 ≥ 2(1−ρ) .
 Second, when ρd+1 < 34 , we claim that E [Tr ] + E [To ] −
 1 d
µ−1 ≥ 2 . To see why it is true, we only need to show
 d
 −1
that f (ρ) = ρ 2ρ1−ρ ≥ − ρd
 2 . For x ∈ [0, 1), Define u(x) =
 2x−1
1−x1/d
 . We claim that u(x) ≥ − d2 .
 Proof: If not, then it holds that
 2x − 1 d d d
 1/d
 ≤ − ⇐⇒ 0 ≤ − + x1/d − 2x + 1.
 1−x 2 2 2
Define v(x) = − d2 + d2 x1/d − 2x + 1. Take derivative of
v(x), we know v 0 (x) = 12 x1/d−1 − 2. Since d ≥ 3, v 0 (x) is
a decreasing function. In addition, limx→0+ v 0 (x) > 0, and
v 0 (1) < 0. Therefore, the value x̄ such that v 0 (x̄) = 0 is the
 −d
maximum point of v(x). Set v 0 (x̄) = 0. It holds x̄ = 4 d−1 .
Therefore, the maximum value of h(x) for x ∈ (0, 1) would
 1
be h(x̄) = − d2 + d2 4 d−1 −2x̄+1 ≤ −2·4−1.5 +1 < 0. It thus
contradicts the assumption that there is some x satisfying
h(x) ≥ 0, which completes the proof.
 1
 As a result, it holds E [Tr ] + E [To ] − µ−1 ≥ d2 ≥ ln(4/3)
 −2 ln ρ
since ρd ≤ 34 . Furthermore, by assumption, ρ ≥ 0.3, meaning
that − ln ρ ≤ 2(1 − ρ). Therefore, summarizing the above
discussions, we have
  
 ln(4/3) 1 ln(4/3)
E [Tr ] ≥ − E [To ] − ) ≥ − T ∗,
 4(1 − ρ) µ−1 4(1 − ρ̂)
 (11)
where ρ̂ = Λ1 .
You can also read