# Spatial Competition in the French Supermarket Industry


Stéphane Turolla†

INRA UMR SMART – Rennes

First version: February 2008. This version: May 2010

**Abstract**

This paper develops a structural model of spatial competition to analyze the intensity of competition among large grocery stores at the geographical market level. The model is estimated for a metropolitan area in the South of France and uses a cross-sectional household survey containing detailed information on the stores visited for the main food product categories. Using estimates of demand parameters and assuming a particular pricing rule, we recover both stores' marginal costs and margins. The results indicate that, on the whole, retailers exert significant local monopoly power due to strong differentiation forces, especially for the hypermarket format. We then perform counterfactual policy simulations based on proposals formulated by the Competition Authority that aim to restore effective competition in this industry. We show that imposing a hypermarket divestiture on a dominant retailer is always beneficial to consumers, whatever the identity of the purchaser.

Keywords: Spatial competition, Structural model, Discrete choice model, Differentiated products, Supermarket industry

JEL Classification: C35, L13, L81

∗ We are grateful to Aurélie Bonein, Stéphane Caprice, François Gardes, Marc Ivaldi, Jean-Louis Monino, Vincent Réquillart and several participants at the JMA 2007 conference, the Journées doctorales de l'ADRES 2008, the AFIO-INRA seminar in Toulouse and the ESEM 2008 conference in Milano. We are most grateful to the CCI of Montpellier for financial assistance and data accessibility. All errors are my own.

† Address: INRA UMR SMART, 4 Allée Adolphe Bobierre, CS 61103, F-35011 Rennes Cedex (France). Email: stephane.turolla@rennes.inra.fr

## 1 Introduction

Over the last fifteen years, prices for a wide range of food products have risen significantly in France, while they have remained stable in the Eurozone and have even decreased in some countries (e.g. Germany, Netherlands).¹ This inflationary trend has led the French government to commission a series of investigations into the pricing practices and market power of French retailers.² These inquiries unanimously concluded that the decline in price competition results from the passing of the Galland Law and the Raffarin Law (both enacted in 1996). These laws were promoted in order to counterbalance the increasing power of retail chains over both manufacturers and small independent stores. The Galland Law was intended to prevent retailers from engaging in below-cost pricing by clearly defining the below-cost selling threshold. Unfortunately, instead of restoring a fair negotiation framework, the Galland Law shifted the bargaining process from "upfront margins" to "hidden margins", at the expense of final prices. The mechanism by which this regulation relaxed intra-brand competition has been widely documented in the theoretical literature (see for instance Allain and Chambolle, 2009). Further, Biscourp, Boutin, and Vergé (2008) confirmed empirically the price-raising effect of the Galland Law. At the same time, concerned with protecting small independent stores from the entry of German mass discounters (i.e. Aldi, Lidl), the legislator toughened the entry regulation through the passing of the Raffarin Law. The administrative authorization, a prerequisite for the granting of a building permit, was extended to stores with sales areas over 300 m² (1,500 m² under the previous regulation). As a result, important barriers to entry were established that secured the rents of incumbents by protecting them from potential entrants and, consequently, softened both upstream and downstream competition.

¹ The average price of food products corrected for inflation increased by 5.9% in France over the 1996-2009 period (source: Eurostat, IPCH).

² For a recent overview of expert reports on these issues, see Commission Attali (2008), Commission Hagelsteen (2008) and Rapport Charié (2009).

Nowadays, it is widely acknowledged that the entry into force of these regulations has reinforced retailers' market power. According to the producer association ILEC, retailers' average gross margin rose by 49.1% from 1998 to 2004. As a consequence, since the mid-2000s, the French government has sought to restore effective competition in grocery retailing. This first led to the definition of a new net invoice price that aims to transfer part of the "hidden margins" to consumers. Although the intention was laudable, in practice it resulted in only a slight decrease in retail prices. This rigidity is mainly explained by the concentrated market structure of the downstream sector, which prevents fierce price competition. In 2009, the five largest retailers had a combined share of 75.6%, placing France second in Europe. Recognizing that lower retail prices cannot occur without enhancing price competition in the downstream market, France has set about amending its retail planning regulation in order to stimulate the entry of new competitors into trading areas. Hence, in October 2007, the French Minister of Economy asked the Competition Authority (CA) to issue an opinion on the entry regulation (Competition Authority, 2007). The CA has stressed that, beyond the degree of concentration at the national level, attention should be paid to retailers' local monopoly power. The presumption is strong that retailers exert significant local market power that distorts price competition. In addition, the CA has suggested several lines of inquiry likely to move market structures toward increased competition.

The primary goal of this paper is to assess empirically the extent of retailers' market power in a typical local area and to uncover its drivers. To that end, we develop and estimate a structural model of demand among spatially differentiated grocery stores that accounts for consumers' preferences over store characteristics and geographic proximity. We resort to a mixed logit model, rather than a logit or nested logit model, so as to capture consumers' unobserved heterogeneity and give an accurate appraisal of substitution patterns. The estimated demand parameters are then used to compute retailers' margins under alternative pricing rules. With these results in hand, we are able to perform counterfactual experiments and quantify the effects on retail prices and consumer welfare of the hypothetical measures considered by the CA. In doing so, we provide valuable insights into the competitive landscape of this industry.

Research on supermarket competition is abundant. However, studies usually rely on observed purchasing decisions for a limited number of product categories or players in the market (see, for instance, Richards and Hamilton, 2006, and Richards, 2007, respectively), which limits the scope of their conclusions. One exception is the empirical study of Smith (2004) on the UK supermarket industry. Using a household panel survey, Smith investigates the extent of retailers' market power derived from multi-store ownership under the particular assumption that retail chains adopt a zone pricing strategy at the regional level.
The shopping patterns are estimated using a discrete-continuous choice model in which consumers choose where to shop and how much to spend. Once the parameters are estimated, he runs several experiments to evaluate the price response of stores under different ownership structures (demergers/mergers). Following the methodology of Smith (2004), Dubois and Jódar-Rosell (2008) extend his analysis to the case of France. In addition to the pricing strategy, these authors consider that retailers can adjust the ratio of private labels to national brands they offer in order to maximize their profits.

In this paper, we exploit a unique database that is not subject to the limitations noted above. Besides, we differ from the studies of Smith (2004) and Dubois and Jódar-Rosell (2008) owing to the nature of our data, which allows us to examine more precisely (from a geographical point of view) the features of spatial competition among grocery stores within a trading area. Concretely, consumers are asked in our survey which stores they visit for each of eight food product categories covering a large part of food sales. The richness of the data allows us to observe significant heterogeneity among consumers in the product categories purchased in large grocery stores. We build on this observation to model the consumer's store choice decision as a two-stage process: (1) whether or not to buy a particular food product category in a large grocery store and, conditional on the resulting shopping basket, (2) which store to visit. We argue that considering an individual shopping basket is better suited to capturing the competitive effects of the category-specific pricing strategies adopted by retailers.

The model is estimated for a metropolitan area in the South of France that is representative of the high level of concentration encountered across French territory. Using a unique cross-section survey of 1,654 households and a database of store characteristics, we find that, on the whole, large grocery stores exert substantial local monopoly power. Among store formats, hypermarkets appear to be the most profitable (at the median) due to strong differentiation forces. However, the within-format heterogeneity observed in our results indicates that the local competitive environment (i.e. spatial competition) accounts for much of the extent of market power. Our simulation results support the CA's proposal to require store divestitures in order to limit the anti-competitive effects ensuing from an abuse of dominant position. We show that imposing a hypermarket divestiture on a dominant retailer is always beneficial to consumers, whatever the identity of the purchaser.

This paper is related to the growing literature in empirical industrial organization devoted to the estimation of structural models of competition for differentiated products (see Ackerberg, Benkard, Berry, and Pakes, 2007, for a review). Several recent papers extend the methodology proposed by Berry (1994) and Berry, Levinsohn, and Pakes (1995) to analyze retail markets where firms compete in a 'Hotelling-type' model. For example, Manuszak (2001) and Thomadsen (2005) evaluate empirically the competitive effects of mergers in the Hawaiian gasoline market and the US fast food industry, respectively. Davis (2006) carefully investigates product positioning and the effect of distance on rivals in the US movie theater industry, as does Thomadsen (2007) for the US fast food industry.
McManus (2007) also examines the efficiency of product design under nonlinear pricing in the specialty coffee market. In the spirit of these papers, Chiou (2009) evaluates consumers' preferences for buying DVDs at Wal-Mart, compared to other mass merchants and alternative retail channels.

In a broader sense, this paper is also related to empirical studies focusing on the contract distortions and price-raising effects observed after the introduction of the Galland Law. By explicitly specifying a model of vertical relationships with nonlinear pricing between retailers and manufacturers, Bonnet and Dubois (2010) have shown that manufacturers in the bottled water industry used resale price maintenance to the detriment of retail prices. Using CPI data, Biscourp, Boutin, and Vergé (2008) have also stressed that the passing of the Galland Law favored a decrease in intra-brand competition across the whole range of food products.

The remainder of the paper is organized as follows. First, we briefly describe the market structure of the French supermarket industry and compute some concentration indicators to highlight the high level of concentration at the geographical market level (Section 2). Section 3 provides an overview of the data used for the estimation. The empirical model used to determine households' store choice is then specified in Section 4, together with the pricing equations that allow us to back out stores' margins. We then present the estimation method in Section 5 and discuss the assumptions required to identify the demand parameters. Section 6 presents the estimates of both the demand parameters and stores' margins, and also reports the results of the robustness tests performed. We discuss the impact of some counterfactual policy simulations on retail prices and consumer welfare in Section 7. Finally, we conclude in Section 8 and outline some refinements for future research.

## 2 The French supermarket industry

In 2008 the French food retail industry had sales revenues of €196.8 billion and represented 594,000 jobs. Since its expansion began at the end of the 1950s, the supermarket industry has been one of the most dynamic sectors of the French economy. Over the years, this sector has become the favorite distribution channel of the French and today accounts for 70% of sales in the food retail market and 20% of sales of non-food items.

One of the most striking features of this success is the small number of players who share it. The French grocery retailing industry is dominated by six firms that together held 84% of the market in 2009: Carrefour (24%), Leclerc (17%), Intermarché (13%), Auchan (11%), Casino (10%) and Système U (9%). This concentrated market structure is not specific to the French market, since we observe the same tendency toward concentration in a large majority of European countries. Several reasons explain this. First, Maican and Orth (2009) have shown that the productivity gains that accompanied the entry of large grocery stores fostered the exit of less productive firms (including a substantial number of small independent stores). In parallel, incumbents have influenced the contestability of the market through the nature of the competition they engage in. According to Ellickson (2006), the explosion of product variety and store size has significantly increased endogenous sunk costs, reducing the threat of entry by new competitors.³ It is also widely documented that large entrants used their buyer power to increase their market dominance (Inderst and Mazzarotto, 2009).
In addition to these market mechanisms, European countries have adopted entry regulations that established important barriers to entry favoring incumbents, even if their nature differs substantially across countries. This last effect is more pronounced for France (see Boylaud and Nicoletti, 2001).

³ Assuming that consumers value store size as a 'vertical' characteristic, Ellickson (2006) demonstrates with the help of a structural approach that "escalating investments in variety enhancing distribution systems yield a natural oligopoly of high quality firms".

In its first opinion on the French grocery retailing industry, the CA did not consider the market structure of the downstream sector to be harmful to competition, in contrast to the tariff practices encountered in the upstream sector (see Competition Authority, 1997). Indeed, no retailer has a dominant position nationally. However, market configurations differ noticeably from one geographical market to another, which suggests that retailers may hold strong positions in a number of trading areas. Therefore, it seems more relevant to conduct such an analysis for geographical markets.

As an illustrative example, we compute some simple statistics summarizing concentration at the market level. Similar to the approach used by Barros, Brito, and de Lucena (2006) and Biscourp, Boutin, and Vergé (2008), we assume that a given store competes with rivals located within a radius of 10 km. Since we have no information other than the ZIP code of stores, we assume that stores are positioned at the center of their respective city. We limit our analysis to the 500 largest cities, so a relevant market consists of one of these cities surrounded by the neighboring cities located less than 10 km away.⁴

Table 1: Market structure for the 500 largest French cities

| | Fascia: Nb | Fascia: Mkt Sh. 1 (%) | Fascia: HHI | Firm: Nb | Firm: Mkt Sh. 1 (%) | Firm: HHI |
|---|---|---|---|---|---|---|
| Q1 | 15.42 | 27.88 | 1664.31 | 8.84 | 33.47 | 2189.05 |
| Median | 18.08 | 24.63 | 1436.48 | 9.67 | 31.16 | 2024.85 |
| Q3 | 22.60 | 23.68 | 1372.29 | 10.30 | 30.84 | 1999.82 |
| Total | 25.11 | 23.80 | 1389.49 | 10.32 | 32.12 | 2093.41 |
| Montpellier | 18.00 | 19.88 | 1087.82 | 10.00 | 34.60 | 2340.05 |

Notes: Descriptive statistics are reported for the first quarter of 2000, using population data from the 1999 census. The database covers all hypermarkets (selling area over 2,500 m²), supermarkets (selling area between 400 and 2,500 m²) and hard discount stores. In total, we count 46 fascias and 14 firms. Averages of the statistics are reported. Source: author's calculations.

Table 1 reports the number of fascias and retailers per market, the market share of the leader (Mkt Sh. 1) and a measure of concentration given by the Herfindahl-Hirschman Index (HHI) based on selling areas (again computed for both fascias and retailers). We detail the results by quartile of the population distribution of the cities. It appears that the extent of market concentration is more pronounced at the market level. For a significant number of markets, we observe that the market leader has a market share higher than 32%.
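The concentration statistics of Table 1 can be illustrated with a minimal sketch. It assumes, as stated in the text, that market shares are based on selling areas and that the HHI is the sum of squared market shares expressed in percent. The three stores, their names and their areas are purely hypothetical and are not taken from the paper's database.

```python
from collections import defaultdict


def shares_by(stores, key):
    """Aggregate selling areas (m2) by `key` and return market shares in percent."""
    totals = defaultdict(float)
    for store in stores:
        totals[store[key]] += store["area_m2"]
    market_total = sum(totals.values())
    return {name: 100.0 * area / market_total for name, area in totals.items()}


def hhi(shares_pct):
    """Herfindahl-Hirschman Index: sum of squared market shares (in percent)."""
    return sum(share ** 2 for share in shares_pct.values())


# Hypothetical local market: three stores located within the same 10 km radius.
stores = [
    {"fascia": "Fascia A", "firm": "Firm 1", "area_m2": 6000},
    {"fascia": "Fascia B", "firm": "Firm 1", "area_m2": 2000},
    {"fascia": "Fascia C", "firm": "Firm 2", "area_m2": 2000},
]

fascia_shares = shares_by(stores, "fascia")
firm_shares = shares_by(stores, "firm")

hhi_fascia = hhi(fascia_shares)               # concentration across store brands
hhi_firm = hhi(firm_shares)                   # concentration across owners
leader_share_firm = max(firm_shares.values())  # market share of the leading firm

print(hhi_fascia, hhi_firm, leader_share_firm)
```

Because one firm may operate several fascias, aggregating at the firm level can only raise the index, which is consistent with the firm-level HHI exceeding the fascia-level HHI in every row of Table 1.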
Also, the HHI suggests that, under a standard interpretation, a majority of trading areas are concentrated at the fascia level and highly concentrated at the firm level.⁵ According to several recent empirical studies of the European food retailing sector, this local concentration is not costless for consumers, since a clear positive relationship has been found between market concentration and food prices (see Barros, Brito, and de Lucena, 2006; Biscourp, Boutin, and Vergé, 2008). Besides, even though it has been demonstrated that the Galland Law implicitly introduced an industry-wide price floor, in practice stores operating under the same fascia charge different prices. A recent survey conducted by the consumer association UFC-Que Choisir reveals that prices may vary by up to 20% between two hypermarkets of the same retailer, depending on the competition encountered.⁶

⁴ This distance corresponds to a 12-minute drive at an average driving speed of 50 km/h.

⁵ Formally, the HHI is defined as the sum of the squares of all market shares in the market. According to the 2004 EU Merger Guidelines, an HHI over 2000 indicates a highly concentrated market.

⁶ UFC-Que Choisir (26/12/2007).

Beyond market structure, price competition is also strongly distorted by the various differentiation strategies carried out by retailers. When choosing which stores to visit, a consumer accounts for a variety of factors other than price. For instance, private labels, product range, product quality (e.g. freshness) and consumer services are important components of the consumer's decision-making process that relax price competition. One factor, however, seems to have attracted more attention because of its leading role in a grocery store's success: location. Highlighted by the aphorism "location, location, location", spatial positioning appears to be a major differentiation force in grocery retailing, as in other retail industries. According to a survey by the French national institute for statistics and economic studies (INSEE), 67% of households consider accessibility to be their primary criterion for choosing their shopping destination. Due to the importance consumers attach to distance traveled, store location plays a prominent role in competition among retailers.⁷

⁷ The importance of store location has led the UK Competition Commission and the CA to scrutinize the presumed anti-competitive practice of land banking by retailers (see Competition Commission, 2008, and the inquiry started on its own initiative by the CA in February 2010).

Taken together, market structure and spatial differentiation appear to be key elements in investigating whether retailers exert local monopoly power to dampen price competition.

## 3 Data

### 3.1 Presentation and descriptive statistics

We use an original database that surveys the store choices of households dwelling in a metropolitan area in the South of France, for several food and non-food categories. The area of study is the French administrative aire urbaine of Montpellier, covering a total population of 459,916.⁸ According to Table 1, it is representative of the concentrated market structure observed in other geographical markets. The survey was conducted jointly by the Chamber of Commerce of Montpellier and the Department of Economics of University Montpellier I during the year 2000.⁹ It follows the quota sampling methodology to create a sample representative of the geographical, age and socio-economic group composition of the population of interest. The data were collected at the household level. A total of 1,654 households were asked which stores they visited for each of 49 product categories. In the following, we restrict our analysis to the eight most purchased food categories (out of twelve recorded) due to the computational burden of the model. One appeal of the database stems from the richness of the information collected at the category level. For instance, we know, for each category, all stores patronized by the household regardless of the distribution channel (e.g. specialized store, retail store, market place) and the corresponding frequency of purchase (in intervals of 25%, plus an epsilon option). Besides, the survey gives extensive information on household characteristics, such as the age and socio-economic group of the head of household, household income, the number of persons per household and the location of their residence, among others.

⁸ Source: population census INSEE 1999.

⁹ See LSA n° 1563 for further details.

We supplement this database with information on store characteristics obtained from the Atlas de la distribution, a national survey of French outlets, and from an in situ survey. We obtain store characteristics such as fascia, location, store size, number of employees and the existence of a gas station, among others.

In order to determine the distances traveled by consumers to visit stores, we geocoded store addresses in a geographical information system, as well as census tracts and Montpellier's block groups. Thus we are able to compute Euclidean distances between each household and the stores belonging to its choice set by assuming that households live at the centroid of their block group or census tract, respectively.¹⁰

¹⁰ By specifying a single-address model, we argue that the household's residence corresponds mainly to the point of departure of the shopping trip. Multi-address models are more of a concern for markets where purchases are motivated by impulsive behavior or immediate need (see for instance the study of Houde (2010) for the gasoline market).

Since we are focusing on the supermarket industry, we aggregate outlets other than hypermarkets, supermarkets, hard discounts and large convenience stores into a single outside option. This leads to a total of 80 + 1 alternatives. Also, we limit our analysis to each household's primary shopping destination (see the discussion below). Since some outlets are only visited for top-up shopping, this reduces the household's potential choice set to a lower, but still large, number of 62 + 1 alternatives. Nonetheless, the area of study sprawls over approximately 1,500 km², which suggests that only a subset of the alternatives is effectively considered by each household. In order to account for a more realistic choice set for each household, we follow the methodology applied by the European Commission and the CA in previous investigations and include in a household's choice set the outlets located within a radius of 20 km around its residence.¹¹ Starting from this choice set, we restrict the number of potential stores visited by allowing only one store per fascia (the nearest to the household's residence), except in the case where two stores of the same fascia are distant by up to one kilometer. This last condition is imposed to account for the lack of precision of the shorter computed distances owing to the positioning of households at the centroid of their block group or census tract. Note that one limitation of our definition of households' choice sets is that we exclude de facto purchases made in outlets far from the household's residence, which typically may arise for consumers living in a small peripheral rural town but working in the metropolitan city. We count 208 households in this case. Fortunately, this concerns a small part of our sample (≈12.5%), which we remove in the rest of the study. To sum up, after eliminating these households, the database used to conduct our analysis corresponds to a cross-section survey of 1,446 households. We present some summary statistics of store and household characteristics in Table 2.

Table 2: Summary statistics

Store data

| Variable | N | Units | Mean | SD | Min | Max |
|---|---|---|---|---|---|---|
| Hypermarket | 62 | Binary | 0.19 | 0.40 | 0 | 1 |
| Supermarket | 62 | Binary | 0.42 | 0.50 | 0 | 1 |
| Hard discount | 62 | Binary | 0.31 | 0.46 | 0 | 1 |
| Convenience store | 62 | Binary | 0.08 | 0.27 | 0 | 1 |
| Surface | 62 | m² | 2167.14 | 2503.56 | 450.00 | 11799.94 |
| Parking slots/m² | 62 | Nb./m² | 0.14 | 0.09 | 0 | 0.55 |
| Cash registers/m² | 62 | Nb.*100/m² | 0.66 | 0.20 | 0.33 | 1.25 |

| Count | |
|---|---|
| # Stores | 62 |
| # Hypermarkets | 12 |
| # Supermarkets | 26 |
| # Hard discounts | 19 |
| # Convenience stores | 5 |

Household data

| Variable | % of households |
|---|---|
| Age group 1 | 11.76 |
| Age group 2 | 11.27 |
| Age group 3 | 19.23 |
| Age group 4 | 22.96 |
| Age group 5 | 15.21 |
| Age group 6 | 19.57 |
| Credit card holder | 81.74 |
| Living in a house | 56.64 |
| Montpellier | 52.42 |
| Rural town | 28.77 |

Number of households: 1446

Notes: There are 6 age groups (20 to 24, 25 to 29, 30 to 39, 40 to 49, 50 to 59, ≥ 60). Source: author's calculations.

### 3.2 The price index

The database in our possession contains a rich set of information about household and store characteristics. Nevertheless, we do not observe the prices paid by households for the items that make up their shopping basket, let alone the entire set of prices across all stores in their respective choice sets. This makes it difficult to infer correctly the drivers of consumers' store choice. To solve this problem, we first run a survey of item prices for a subset of stores in the area of study.
Thereafter, we follow recent studies facing the same issue (see, for instance, Chiou, 2009) and estimate a price index for each category for the non-surveyed stores. More precisely, we collect the prices of 91 national brand products and first-price products in 27 stores in the area of study.¹² The choice of including national brand products in our price survey rests on their availability in almost all stores (except hard discount stores) and reflects the need to work with homogeneous varieties (e.g. a 400 g jar of Nutella hazelnut spread). However, in order to construct price indices that encompass all store formats, specifically hard discounters, the selection of first-price products appears unavoidable. Then, for a store $j$ and a category $c$ composed of $k = 1, \dots, K$ items, the price index is computed according to the following expression:

$$\bar{p}_{j,c} = \frac{\sum_{k=1}^{K} p_{k,j,c}}{K} \tag{1}$$

¹¹ See, for example, decisions in cases No. IV\M.1085 Promodes/Catteau, No. COMP/M.1221 Rewe/Meinl or No. COMP/M.1684 Carrefour/Promodès. A 20 km radius corresponds approximately to 20 to 30 minutes of driving, depending on the assumed average driving speed.

¹² A list of the selected products is available upon request.
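As a minimal sketch of Eq. (1), the snippet below averages the item prices of one category within each store and then rescales the result so that a chosen base store equals 100, which is how the paper normalizes its indices. The store labels and prices are hypothetical, and the helper names `price_index` and `rebase` are illustrative only.

```python
def price_index(item_prices):
    """Eq. (1): arithmetic mean of the K item prices of a category in a store."""
    if not item_prices:
        raise ValueError("a category needs at least one priced item")
    return sum(item_prices) / len(item_prices)


def rebase(indices, base_store):
    """Express every store's index relative to the base store (base = 100)."""
    base_value = indices[base_store]
    return {store: 100.0 * value / base_value for store, value in indices.items()}


# Hypothetical prices (in euros) of the same three items in two stores.
category_prices = {
    "store_A": [2.0, 4.0, 6.0],
    "store_B": [1.5, 3.0, 4.5],
}

raw_index = {store: price_index(p) for store, p in category_prices.items()}
rebased_index = rebase(raw_index, base_store="store_A")

print(raw_index, rebased_index)
```

Using an unweighted mean treats every item in the category equally, so the index is only comparable across stores when the same items are priced everywhere, which is why the text insists on homogeneous varieties.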

Table 3: Hedonic regression of log-price (SUR model)

| Variable | Fruits & vegetables | Meat | Cooked meat | Cheese | Other dairy product | Grocery item | Alcoholic drink | Soft drink |
|---|---|---|---|---|---|---|---|---|
| Constant | 4.6550*** | 4.6673*** | 4.4256*** | 4.4959*** | 4.5845*** | 4.5683*** | 4.5427*** | 4.5580*** |
| | (0.1713) | (0.0776) | (0.0758) | (0.0514) | (0.0520) | (0.0522) | (0.0259) | (0.1025) |
| Hypermarket | -0.0601 | -0.0278 | 0.0078 | -0.0307* | -0.0206 | -0.0206 | -0.0174** | -0.0667** |
| | (0.0531) | (0.0241) | (0.0235) | (0.0159) | (0.0161) | (0.0162) | (0.0080) | (0.0318) |
| Variety index | 0.0240 | -0.1606** | 0.2084*** | 0.1893*** | 0.0080 | 0.1427*** | 0.1078*** | -0.0508 |
| | (0.1619) | (0.0734) | (0.0716) | (0.0486) | (0.0491) | (0.0494) | (0.0245) | (0.0969) |
| # rivals | | | | | | | | |

The store with the largest turnover in the metropolitan area (Carrefour #2) is chosen as the base (100) of the price indices. Besides, the sample of surveyed stores was constructed so as to be representative of store formats, fascias and locations. The price survey was carried out over three days in order to avoid seasonal variations, especially for the fruits and vegetables category. Thereafter, price indices for non-surveyed stores are estimated by means of a hedonic price regression. But instead of running separate regressions for each category, we argue that unobserved heterogeneity in the pricing decision may be correlated across categories. For example, items of different categories share identical transportation and logistics costs. Consequently, we assume that this unobserved heterogeneity follows a multivariate normal distribution and specify a seemingly unrelated regression (SUR) equations model. The log of the price of the selected basket of items is regressed on a set of retailer fixed effects and variables describing both the competitive environment and demand. Table 3 reports the estimates.¹³ The correlation matrix of the residuals is reported in the appendix. It should be pointed out that the calculation of price indices for hard discount stores follows a separate procedure. Since hard discounters set uniform prices across their stores, a hedonic regression is not necessary. Instead, we start from our price survey and simply use Eq. (1) to compute the average price of their shopping basket for each category. Finally, to ensure a correct computation of the indices, we remove the national brand products from the price index of the base outlet (i.e. Carrefour #2) in order to compare similar products.

¹³ It is worth noting that the Breusch-Pagan LM test for error independence supports the use of a SUR specification ($\chi^2(28) = 202.62$; $p = 0.0000$).

## 4 The empirical framework

In this section, we first specify our demand model, then derive the pricing equations from the pricing game assumed to be played by French retailers. We then back out stores' marginal costs and, from these, their price-cost margins.

Given our data, several modeling approaches could be used to estimate households' preferences over store choice. The most intuitive consists in using the information at the category level and specifying a model that explains the household's store choice conditional on a category. The central point of this approach is the model's ability to handle the multiple choices of a given household. In this vein, we could refer to the multiple discrete choice model proposed by Dubé (2005) in the context of multiple purchases of carbonated soft drinks. Another way would consist in estimating a 'logit-type' model in which household choices are correlated across both category purchase occasions and stores, which implies a large set of parameters in our context. However, we observe no variation among categories (except in price), which prevents us from fully adopting one of these models and estimating households' preferences on the basis of their category purchases. Instead, we propose a more parsimonious model in which households are assumed to purchase a bundle of goods belonging to those categories. Consequently, households choose their primary shopping destination among retail stores, conditional on a bundle of goods (i.e. a shopping basket). This implies aggregating households' store choices over the whole set of product categories in order to determine their primary shopping destination for food products as a whole.¹⁴ In doing so, we limit our analysis to competition among stores and leave aside the issues of multi-stop shopping and between-category complementarities.

Nonetheless, it seems realistic to assume that, beyond retail store choices, between-household heterogeneity is also observable in the type of retail channel visited for each product category. A simple look at descriptive statistics on the frequency of purchase in the supermarket channel confirms this fact (see Table 4). The pattern is more pronounced for fresh products.

Table 4: Supermarket channel choice by category (in %)

| Category | Mean | (S.D.) |
|---|---|---|
| Fruits & vegetables | 0.4682 | (0.4992) |
| Meat | 0.6065 | (0.4887) |
| Cooked meat | 0.5712 | (0.4951) |
| Cheese | 0.7906 | (0.4041) |
| Other dairy product | 0.8755 | (0.3302) |
| Grocery item | 0.8416 | (0.3652) |
| Alcoholic drink | 0.7040 | (0.4566) |
| Soft drink | 0.8831 | (0.3214) |

Notes: S.D. corresponds to standard deviation. The number of observations is 1446. Source: author's calculations.

Therefore we argue that competition does not necessarily take place over the entire product range sold in a store; it depends on the household that visits it. Depending on their habits, households may or may not pay attention to some product categories, which could either strengthen or weaken competition accordingly. For instance, some consumers may prefer to buy perishable products in specialized stores due to a higher marginal utility for quality, which shortens their shopping list for retail stores compared to large-basket shoppers. Hence, the within-household variation observed among categories individualizes the price of the consumer's shopping basket.
A well-documented consequence of this behavior is that retailers price discriminate regardless of product type and market demand configuration (Giulietti and Waterson, 1997; Walsh and Whelan, 1999), and also adopt different kinds of pricing strategies (e.g. EDLP or HiLo) in order to attract a larger share of consumers.

In order to account for heterogeneity among consumers in terms of the product categories purchased in the supermarket distribution channel, we propose a two-stage model. In the first stage, we estimate the probability that a household buys a given category in a large grocery store. We then use these probabilities to weight the price of the corresponding category, so that households pay more attention to the prices of the categories they usually buy in large grocery stores. Our formulation of a weighted average price of the shopping basket is close to the one adopted by Briesch, Chintagunta, and Fox (2009). Nonetheless, we differ from these authors in at least two respects. First, in our model, the household's choice relates to the decision to make its purchases in the supermarket distribution channel, whereas the choice in Briesch, Chintagunta, and Fox (2009) encompasses all types of retail channels. Second, we adopt a modelling approach that accounts for the potential correlation between a household's choices, which Briesch, Chintagunta, and Fox (2009) do not.

14 In the following, we determine a household's primary shopping destination by computing a weighted average of visits for each store of its choice set. To account for the relative importance of each category in the aggregate decision, we compute from the TNS Worldpanel survey the share of expenditures for each category and weight the binary decisions accordingly (see the appendix for further details).

4.1 Demand model: retail channel choice

For each household $h$ ($h = 1, \dots, H$), we observe its decision to buy a category $c$ ($c_h = 1, \dots, C_h$) in a retail store, across a set of $C$ categories. Following the discrete choice literature, its purchase incidence can be represented by a vector $i_h = (i_{h1}, i_{h2}, \dots, i_{hC})$ of binary dependent variables. We estimate the probability $\Pr(i_{hc})$ of a single decision through a system of simultaneous probit equations. We denote by $i^*_{hc}$ the underlying latent variable associated with the $c$-th category. The link between the purchase incidence and the latent variable is expressed as follows:

$$i_{hc} = \begin{cases} 1 & \text{if } i^*_{hc} > 0 \\ 0 & \text{otherwise} \end{cases}$$

These latent variables are defined by a linear combination of a set of explanatory variables and an error term. In matrix notation, the system can be written as follows:

$$I^* = X\beta + \varepsilon \tag{2}$$

where $X = (x_1, \dots, x_p)$ is a $C \times p$ matrix of $p$ explanatory variables, $\beta = (\beta_1, \dots, \beta_p)$ is the corresponding vector of parameters and $\varepsilon$ is a $C \times 1$ vector of error terms that accounts for unobservable heterogeneity. We assume that the choice of a large grocery store for a category purchase is explained both by the household's characteristics (age groups, house, card, work) and by its surrounding retail environment (montpellier, # hypermarket 10km). However, we believe that the household's decision of whether or not to buy the $c$-th category in the supermarket retail channel may not be independent of its decisions with respect to other categories.
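The latent-variable link behind Eq.(2) can be illustrated for a single category. The sketch below is only a minimal illustration under strong assumptions: it treats one equation in isolation with a standard normal error (so it ignores any correlation across categories), and the covariate values and coefficients are hypothetical numbers chosen for the example, not estimates from this paper.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def purchase_incidence(latent_value):
    """Observed binary decision: 1 if the latent variable is positive, else 0."""
    return 1 if latent_value > 0 else 0


def purchase_probability(x, beta):
    """Pr(i_hc = 1) = Pr(x'beta + eps > 0) = Phi(x'beta) when eps ~ N(0, 1)."""
    index = sum(x_k * b_k for x_k, b_k in zip(x, beta))
    return std_normal_cdf(index)


# Hypothetical household: constant, lives in a house, holds no loyalty card,
# two hypermarkets within 10 km. The coefficients are made up for illustration.
x = [1.0, 1.0, 0.0, 2.0]
beta = [0.2, -0.3, 0.4, 0.15]
prob = purchase_probability(x, beta)
print(round(prob, 4))
```

In this univariate case the probability has a closed form; once the error terms of the different categories are allowed to be correlated, the joint probability of a combination of choices no longer does.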
A household's choices may be related because of cross-effects that reflect complementarities among categories, but also because of unobservable factors such as shopping cost considerations (e.g. search costs, travel costs) or quality-seeking behavior that operate at the retail channel level. This assumption is corroborated by the tetrachoric correlation coefficients computed between all the categories (see Table 5). All estimated correlations are positive and significant. Moreover, the magnitude of the estimates for certain pairs of categories, for instance {meat, cooked meat} or {cheese, other dairy products}, suggests a strong complementarity effect that leads households to purchase certain kinds of items jointly in the same retail channel. Like Chib, Seetharaman, and Strijnev (2004), we argue that households' purchase decisions at the category level should be modelled jointly whenever possible.15

15 Nonetheless, we differ from the marketing literature that addresses multicategory purchasing behavior in the level at which the co-incidence occurs, i.e. the retail channel.

Table 5: Tetrachoric correlation matrix of food categories

| | Fruits & vegetables | Meat | Cooked meat | Cheese | Other dairy product | Grocery item | Alcoholic drink | Soft drink |
|---|---|---|---|---|---|---|---|---|
| Fruits & vegetables | 1.0000 | | | | | | | |
| Meat | 0.7586* | 1.0000 | | | | | | |
| Cooked meat | 0.6507* | 0.9289* | 1.0000 | | | | | |
| Cheese | 0.6515* | 0.7529* | 0.7408* | 1.0000 | | | | |
| Other dairy product | 0.6943* | 0.7792* | 0.7789* | 0.9335* | 1.0000 | | | |
| Grocery item | 0.5760* | 0.6618* | 0.6783* | 0.7613* | 0.9012* | 1.0000 | | |
| Alcoholic drink | 0.3322* | 0.4128* | 0.4626* | 0.4931* | 0.6223* | 0.6121* | 1.0000 | |
| Soft drink | 0.5052* | 0.6005* | 0.5762* | 0.7834* | 0.8892* | 0.8263* | 0.7238* | 1.0000 |

Note: * significance at the 1% level. Source: author's calculations.

Therefore, to control for possible correlations arising from unobservable factors, we assume that the error terms of the latent equations follow a multivariate normal distribution, $\varepsilon \sim N(0, \Sigma)$, where $\Sigma = \{\rho_{jk}\}$ is the correlation matrix obtained from the Cholesky decomposition of the covariance matrix of the errors, $\Sigma = L e e' L'$, where $e$ are independent standard normal random variables and $L$ is the lower triangular matrix with diagonal elements equal to unity:

$$\Sigma = \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1C} \\ \rho_{21} & 1 & \cdots & \rho_{2C} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{C1} & \rho_{C2} & \cdots & 1 \end{pmatrix}$$

As a result, the outcome of the $C$ different choices of household $h$ is now specified through a Multivariate Probit model (Chib and Greenberg, 1998; MVP hereafter). The probability of the corresponding combination of choices, conditional on the parameters $\beta$ and $\Sigma$, is given by:

$$\Pr(I_h = i_h \mid \beta, \Sigma) = \Phi_C(x\beta_1, \dots, x\beta_C)$$

where $\Phi_C(\cdot)$ denotes the $C$-variate standard normal distribution. The results are reported in the appendix.

4.2 Demand model: household's store choice

The second part of the demand model is more standard in the literature on structural models of demand (Berry, 1994; Berry, Levinsohn, and Pakes, 1995; Nevo, 2001). Given the discrete nature of the household's decision, we follow the standard random utility approach and specify a discrete choice model to assess the drivers of households' store choice. Households' preferences are assumed to differ because of their place of residence as well as observed and unobserved heterogeneity in their taste for store characteristics. To allow for this flexibility, we define a random coefficients logit model (or mixed logit), which yields more realistic substitution patterns than a simple 'logit-type' model.
The mixed logit model yields flexible estimates of own- and cross-price elasticities because it avoids the problematic independence of irrelevant alternatives (IIA) assumption embedded in discrete choice models where heterogeneity is captured solely through the idiosyncratic term.16

16 See Train (2003) for further insights.

We assume that a household $h$ chooses its primary shopping destination according to the highest utility rule, either by patronizing one of the stores $j$ ($j = 1, \dots, J$) included in its choice set $J_h$ or by choosing an outside option $j = 0$. Recall that a household's choice set is defined as the closest store of each fascia located within a radius of 20 km around its residence. Thus, following the typical notation for discrete choice models of demand, the indirect utility that a household $h$, who resides in location $L_h$, gets from visiting store $j \in J_h$ located in $L_j$ is:

$$U_{hj} = \alpha_0 \tilde{p}_{hj} + \sum_{g=2}^{6} \alpha_g \tilde{p}_{hj} d^{age}_{hg} + \delta\left(D(L_h, L_j); \lambda_h\right) + \sum_{m} \gamma_m D(L_h, L_j) z_{hm} + \sum_{n} \phi_n x_{jn} + \sum_{s}\sum_{q} \varphi_{sq} v_q d^{format}_{s} + \xi_f + \varepsilon_{hj} \tag{3}$$

where $\tilde{p}_{hj}$ is the household-specific price of the shopping basket, $d^{age}_{hg}$ is a dummy variable equal to one if household $h$ is in age group $g$, $\delta(\cdot)$ is a parametric function of the distance between $L_h$ and $L_j$ (i.e. $D(L_h, L_j)$), known up to the parameter $\lambda_h$, which is assumed to vary by household, $z_{hm}$ are $M$ observed household characteristics and $x_{jn}$ are $N$ observed store characteristics. As for the distance parameter, we assume that the parameter on store size varies by household. We also include dummy variables $d^{format}_{s}$ for the $S$ store formats ($s$ = hypermarket, supermarket, hard discount store, convenience store), interacted with $Q$ variables, denoted by $v$, that represent a mix of household and store characteristics. Finally, $\xi_f$ is an index of fascia attributes unobserved by the econometrician and $\varepsilon_{hj}$ is the idiosyncratic term, assumed i.i.d. according to a type I extreme value distribution.

Price sensitivity is assumed to vary across six age groups of the household head (with the youngest taken as the reference). Thus, the coefficient $\alpha_0$ corresponds to the marginal utility of price of a 'representative' household, and deviations from this mean depend on the coefficients of the interactions of price with household characteristics. Recall that, following the first stage of our model, the price variable is defined as the sum over the eight categories of the category purchase probability $\Pr(I_{hc})$ multiplied by the corresponding price index $\tilde{p}_{j,c}$:

$$\tilde{p}_{hj} = \sum_{c=1}^{C} \Pr(I_{hc})\, \tilde{p}_{j,c} \tag{4}$$

Similarly, we allow the coefficient of distance $\lambda_h$ to vary by household. But instead of introducing heterogeneity through household categories, we specify a random coefficient on distance, which is better suited to capturing the diversity of household locations.
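The household-specific basket price of Eq.(4) is a probability-weighted sum of category price indices. The following sketch illustrates the arithmetic only; the purchase probabilities and price indices are hypothetical values, and the function name `basket_price` is introduced here for the example.

```python
def basket_price(purchase_probs, price_indices):
    """Eq.(4): sum over categories of Pr(I_hc) times the store's price index."""
    if len(purchase_probs) != len(price_indices):
        raise ValueError("one probability and one price index per category")
    return sum(pr * p for pr, p in zip(purchase_probs, price_indices))


# Eight categories, in the order of Table 4 (fruits & vegetables, ..., soft drink).
# Hypothetical price indices of one store.
store_price_indices = [1.10, 1.05, 1.00, 0.95, 1.00, 0.98, 1.02, 0.97]

# Hypothetical first-stage probabilities for two households.
large_basket_shopper = [0.95, 0.95, 0.90, 0.95, 0.98, 0.97, 0.90, 0.98]
fresh_elsewhere_shopper = [0.10, 0.20, 0.25, 0.80, 0.90, 0.90, 0.70, 0.90]

p_large = basket_price(large_basket_shopper, store_price_indices)
p_fresh = basket_price(fresh_elsewhere_shopper, store_price_indices)
print(round(p_large, 3), round(p_fresh, 3))
```

Two households facing the same store therefore face different basket prices: the household that buys its fresh products elsewhere puts little weight on the store's fresh-product price indices.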
We denote by $\omega$ the unobserved household characteristics behind this random coefficient. Specifically, we assume that the coefficient of distance is normally distributed and independent of the idiosyncratic term $\varepsilon$. Again, we interact the distance with observed household characteristics (e.g. the number of cars, the type of residence and whether the household's residence is in a rural town).

The set of store characteristics $x_{jn}$ includes the number of parking slots and the number of cash registers, both per square meter, as well as store size, to which we attach a random coefficient assumed to follow a normal distribution. In addition, we supplement these variables by interacting a store's format with counts of rivals (by store type) within radii of 0.5 km and 2 km, to account for the competitive environment in its vicinity. Unobserved store characteristics (such as shelf display or assortment) are assumed to be captured by the fixed effects $\xi_f$. We argue that these unobserved characteristics essentially reflect national strategies enacted by retailers for their fascia. These common shocks are thus captured by fixed effects set at the fascia level. As usual, we assume that households value these unobserved characteristics identically.

As with the "outside good" in classical demand models, households may decide to visit retail channels other than large grocery stores (e.g. small convenience store, specialized store, market place) or not to purchase those food categories at all; this is summarized by the outside option $j = 0$. Without additional information on the characteristics of this alternative, we normalize the characteristics of the outside option to zero, i.e. $\tilde{p}_{h0} = D(L_h, L_0) = x_{0n} = \xi_0 = 0$. According to the highest utility rule, household $h$ visits store $j$ with probability:

$$P_{hj} = \int_{A_{hj}} dF(\varepsilon_h)\, dF(\omega_h)$$

with $A_{hj} = \{(\varepsilon_h, \omega_h) \mid U_{hj} > U_{hl};\ l \neq j\}$, where $F(\cdot)$ denotes the distribution function.

4.3 Supply side: The pricing equation

We now describe the pricing rule that retailers follow. We take as given that stores compete in prices and set their prices simultaneously, conditional on their characteristics, which are assumed to be chosen prior to this decision (e.g. location, store size, quality, etc.). The resulting prices are thus an equilibrium of a Nash-Bertrand game. By deriving the pricing equations from the first-order conditions of the profit maximization problem, we are able to recover stores' marginal costs and consequently compute their price-cost margins.

We assume that store managers seek to maintain the price competitiveness of their store across all product categories. This suggests that store managers think in terms of the price positioning of the shopping basket and do not practice category management.
Moreover, an important feature of the French market is that two types of pricing behavior coexist depending on store format: (i) hypermarket and supermarket prices are set by the store manager on the basis of local competition, whereas (ii) hard discount chains use national pricing. Consider the problem of a retailer $R$ that sets uniform prices in the set $J_R$ of its stores. The profits of retailer $R$ are:

$$\Pi_R = \sum_{j \in J_R} (p_j - c_j)\, M s_j(p) - C_j \tag{5}$$

where $c_j$ denotes the constant marginal cost of selling a unit of the shopping basket in store $j$, $M$ is the size of the market, $s_j(p)$ is the market share of $j$ and $C_j$ is a fixed cost. Assuming the existence of a pure-strategy Nash equilibrium in prices, the first-order condition for a typical store $j$ is:

$$s_j(p) + \sum_{l} T_R(l, j)\, (p_l - c_l)\, \frac{\partial s_l(p)}{\partial p_j} = 0 \tag{6}$$

where $T_R$ corresponds to the retailer's ownership matrix, with general element $T_R(j, l)$ equal to one when both stores $l$ and $j$ belong to the same retailer and zero otherwise. This gives us a system of $J_R$ equations. Note that the second term on the left-hand side of the equation reduces to a single element if prices are set by store managers. Define $\Delta_R$ as the retailer's response matrix with element $(j, l) = \partial s_j(p)/\partial p_l$. The retailer's price-cost margins can then be expressed in matrix notation by stacking the first-order conditions and rearranging terms:

$$(p - c) = -\left[T \otimes \Delta(p)\right]^{-1} s(p) \tag{7}$$

where $\otimes$ corresponds to the Kronecker product. It follows that the estimated marginal costs of stores depend exclusively on the parameters of the demand system and on the market conduct assumption:

$$\hat{c} = p + \left[T \otimes \Delta(p)\right]^{-1} s(p) \tag{8}$$

It is worth noting that manufacturers are absent from this scenario. As a result, the store marginal costs estimated from Eq.(8) include both manufacturers' prices and manufacturers' margins. Depending on the vertical pricing model (linear, two-part tariff, etc.), the distortion between the estimated and the true marginal cost of a store could be sizeable. However, if manufacturers offer a two-part tariff contract and set their prices equal to their marginal costs, the double marginalization problem vanishes and the estimated marginal cost of the store coincides with its true value. Since our model is defined at the shopping basket level and does not refer explicitly to a set of manufacturers, we do not specify a vertical relationship. Thus, when discussing the results, we have to keep in mind that a gap might exist between estimated and true store margins, depending on the vertical contracts adopted by the parties.17

5 Identification and estimation strategy

The demand parameters expressed in Eq.(2) and Eq.(3) are estimated by simulated maximum likelihood (SML).
We denote by $\theta^{MVP} = \{\beta, \rho\}$ and $\theta^{MXL} = \{\alpha, \lambda, \gamma, \phi, \varphi\}$ the sets of demand parameters corresponding to the multivariate probit model and the mixed logit model, respectively. Conditional on $\theta^{MVP}$, the log-likelihood function of the combination of category purchase incidences may be written as:

$$L\left(I; \theta^{MVP}\right) = \sum_{h} \log \Phi_{C_h}\left(X; \theta^{MVP}\right) \tag{9}$$

The multivariate probability $\Phi_{C_h}(\cdot)$ does not have a closed-form expression because it involves high-order multivariate normal integrals. A standard approach consists in approximating its value by simulation. To that end, we employ the so-called Geweke-Hajivassiliou-Keane (GHK) simulator and draw $R$ values from an upper-truncated standard normal distribution.

17 Although resale price maintenance (RPM) is illegal per se in France, it is well documented that the adoption of the Galland Law has indirectly promoted this practice. Bonnet and Dubois (2010) have thus shown that manufacturers in the retail market for bottled water use nonlinear pricing contracts with RPM. Furthermore, their estimates of other nonlinear pricing contracts suggest that manufacturers' margins account for a limited part of retailers' marginal cost.

In the same way, the log-likelihood of store choice conditional on $\theta^{MXL}$ is given by:

$$L\left(Y; \theta^{MXL}, I\right) = \sum_{h} \sum_{j=1}^{J_h} \mathbb{1}\{Y_h = j\} \log s_{hj}\left(\theta^{MXL}, I\right) \tag{10}$$

where $Y$ is the vector of store choices and $s_{hj}$ is the probability that household $h$ chooses store $j$ as its primary shopping destination. Since we specify a mixed logit model, the latter is defined as:

$$s_{hj}\left(\theta_h^{MXL}, I\right) = \frac{e^{V_{hj}\left(\theta_h^{MXL}, I\right)}}{\sum_{l=1}^{J_h} e^{V_{hl}\left(\theta_h^{MXL}, I\right)}} \tag{11}$$

Unfortunately, this closed-form expression is conditional on $\theta_h^{MXL}$. Since we do not know the true value of $\theta_h^{MXL}$, we need to integrate Eq.(11) over all possible values of $\theta_h^{MXL}$. The unconditional store choice probability is then approximated by numerical simulation:

$$\tilde{s}_{hj} = \frac{1}{R} \sum_{r=1}^{R} s_{hj}\left(\theta_h^{MXL,r}, I\right) \tag{12}$$

As explained above, we rely on simulation to evaluate the probability terms in both parts of the model. These simulated probabilities are unbiased by construction, and their variance diminishes as the number of draws rises. Instead of using random draws, we follow recent advances in simulation methods and generate 100 Halton draws. Note that we keep the same set of draws for each iteration.18

At this point, two estimation strategies are conceivable: full information maximum likelihood (FIML) or a two-step method. The choice between them reflects the trade-off between an efficiency gain and the computational burden required to achieve it. Indeed, the joint estimation of the log-likelihood functions (see Eq.(9) and Eq.(10)) gives the true standard errors of the estimates, whereas a sequential estimation introduces a measurement error into the estimates of the second model.
Hence, adopting the two-step approach would bias the variance-covariance matrix of the mixed logit, since the price of the shopping basket is computed from the estimated probabilities of the category purchase incidences. Nonetheless, the main concern of the empirical model is to provide accurate estimates of the substitution effects, which follow from the demand parameters. Efficiency of the estimates then appears as a second-order concern. In addition, the computation time needed to estimate the multivariate probit for eight categories is itself sizeable. As a result, we adopt a two-stage approach and adjust the standard errors of the mixed nested logit using the correction method proposed by Murphy and Topel (1985).

18 A consensus exists in the literature regarding the superiority of Halton draws over random draws (see Bhat, 2001; Train, 2003; Chiou and Walker, 2007, among others). For a given number of draws, Halton draws achieve greater efficiency and coverage since successive Halton draws are negatively correlated.
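The simulation step of Eq.(11) and Eq.(12) can be sketched in a few lines. This is a simplified illustration, not the estimation code used in the paper: it assumes a single random coefficient (on distance), a base-2 Halton sequence mapped to normal draws through the inverse normal CDF, and it treats the outside option as one more alternative with utility zero. The mean of the distance coefficient is the estimate reported in Section 6.1; the standard deviation, the distances and the other utility terms are hypothetical.

```python
import math
from statistics import NormalDist


def halton(index, base=2):
    """index-th point (index >= 1) of the Halton sequence in the given base."""
    value, fraction = 0.0, 1.0 / base
    while index > 0:
        value += fraction * (index % base)
        index //= base
        fraction /= base
    return value


def logit_shares(utilities):
    """Conditional logit probabilities as in Eq.(11), computed stably."""
    top = max(utilities)
    exps = [math.exp(v - top) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def simulated_shares(distances, other_utility, mean_lam, sd_lam, n_draws=100):
    """Eq.(12): average the conditional shares over draws of the distance coefficient."""
    normal = NormalDist()
    totals = [0.0] * len(distances)
    for r in range(1, n_draws + 1):
        lam = mean_lam + sd_lam * normal.inv_cdf(halton(r))  # normal draw
        v = [u + lam * d for u, d in zip(other_utility, distances)]
        totals = [t + s for t, s in zip(totals, logit_shares(v))]
    return [t / n_draws for t in totals]


# Alternative 0 is the outside option (distance and utility normalised to zero).
distances = [0.0, 1.0, 3.0]        # hypothetical distances
other_utility = [0.0, 1.5, 1.5]    # hypothetical non-distance utility terms
shares = simulated_shares(distances, other_utility, mean_lam=-2.1065, sd_lam=1.0)
print([round(s, 4) for s in shares])
```

Because the same Halton points are reused on every call, the simulated probabilities are a deterministic function of the parameters, which is the property that keeping the same set of draws for each iteration is meant to secure.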

The demand parameters are identified through several sources of variation. First, each choice occasion differs from the others because of the heterogeneity observed in household characteristics. This allows the identification of the parameters $\{\alpha, \gamma\}$. Further, for a given choice occasion, a household faces a set of stores whose characteristics differ. Hence, the average valuation of store characteristics identifies $\phi$ and, for the same reason, the unobserved characteristics of each fascia $\xi_f$. Note that by specifying the fixed effects at the fascia level rather than at the store level, we avoid the identification problem that may arise for the parameters on store characteristics, since store dummy variables would be strongly correlated with observed store characteristics. Beyond this, the main source of variation among choice occasions comes from the heterogeneity in the spatial distribution of households' residences and stores, which induces different choice sets across households. As a result, we observe different distributions of distance among households dwelling in separate block groups. This permits the identification of the parameter $\lambda_h$. Finally, since the price of the shopping basket is specific to a household and varies across alternatives within a given choice set, there is sufficient variation to identify the parameters associated with $\tilde{p}_{hj}$.

The literature on discrete choice models of demand has stressed many times that an endogeneity bias may arise between prices and unobserved characteristics (Berry, 1994; Berry, Levinsohn, and Pakes, 1995). If store managers set their prices by taking into account what the econometrician cannot observe, then the price variable is correlated with these unobserved characteristics and the price parameter would be biased upward. In our model, the fixed effects $\xi_f$ are introduced in order to control for unobserved fascia characteristics. Since we have cross-section data, we are not concerned by time-varying unobserved quality at the fascia level.
In our study, the occurrence of an endogeneity problem then hinges on the existence of unobserved store characteristics that enter the price-setting decision of store managers. More precisely, the likelihood of the endogeneity bias depends on the extent to which stores' unobserved attributes deviate from the mean of the fascia. In order to control for this bias, we could introduce store-specific dummy variables in the indirect utility function and, in a second step, regress these parameters on store characteristics, similar to the approach followed by Goolsbee and Petrin (2004). However, we exclude this possibility because of the number of dummy variables needed relative to the number of observations in our database. The second method for controlling the endogeneity bias, developed by Petrin and Train (2010) and known as the "control function" approach, is also inapplicable. Its principle consists in regressing the price variable on all exogenous factors and then including the residuals obtained in the utility specification along with the price variable. Yet, since our price variable is estimated from our SUR model, we cannot use this two-stage error correction method. Thus, to ensure that our price estimate is not biased, we check for the absence of several outcomes described in the literature as symptoms of endogeneity bias (see section Robustness below).
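Once the demand parameters are estimated, Eq.(8) turns them into marginal costs. The sketch below illustrates that step on a toy market. It rests on several assumptions that are not taken from the paper: the product of the ownership matrix and the response matrix is applied element by element (the usual practice in this literature), share derivatives come from a plain logit with a hypothetical price coefficient rather than from the mixed logit, and all shares and prices are invented for the example.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def logit_share_derivatives(shares, alpha):
    """ds[j][l] = d s_j / d p_l for a plain logit with price coefficient alpha."""
    n = len(shares)
    return [[alpha * shares[j] * ((1.0 if j == l else 0.0) - shares[l])
             for l in range(n)] for j in range(n)]


def price_cost_margins(shares, ds, ownership):
    """Stack the first-order conditions of Eq.(6) and solve for (p - c)."""
    n = len(shares)
    # Row j holds T(l, j) * d s_l / d p_j for every store l.
    omega = [[ownership[l][j] * ds[l][j] for l in range(n)] for j in range(n)]
    return solve_linear(omega, [-s for s in shares])


shares = [0.3, 0.2, 0.1]   # hypothetical market shares
prices = [2.0, 2.1, 1.8]   # hypothetical basket prices
ds = logit_share_derivatives(shares, alpha=-2.0)  # hypothetical price coefficient

separate = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # each store sets its own price
joint = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]     # stores 0 and 1 share an owner

margins_separate = price_cost_margins(shares, ds, separate)
margins_joint = price_cost_margins(shares, ds, joint)
costs_separate = [p - m for p, m in zip(prices, margins_separate)]
print([round(m, 4) for m in margins_separate])
print([round(m, 4) for m in margins_joint])
```

In this toy market, putting stores 0 and 1 under a common owner raises their implied margins, because each one internalizes the customers it diverts to the other; the same mechanism is what a divestiture simulation changes through the ownership matrix.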

6 Results

6.1 Mixed logit demand model

Estimation results are reported in Table 6. Recall that the parameter estimates must be interpreted relative to the outside option. Almost all the coefficients are both statistically and economically significant. Overall, we note that shopping patterns differ significantly across households, store formats and areas of residence.

As expected, households derive disutility from price and distance. Specifically, households between 30 and 49 years old (age groups 3 and 4) appear less sensitive to price than the youngest households. The utility specification in Eq.(3) allows the marginal valuation of distance to vary with observable and unobservable household characteristics. We thus observe that the high disutility of traveling, revealed by the estimated mean of the distance coefficient distribution (mean = -2.1065), is reinforced for people living in a house or in a rural town. Conversely, the more cars a household owns, the lower its sensitivity to distance. Nonetheless, the statistical significance of the estimated standard deviation of the random coefficient on distance reveals that, beyond these interaction terms, unobserved heterogeneity exists among households regarding the willingness to travel.

As with the distance coefficient, we allow the store size parameter to vary by household. On average, households value the log of the selling area of a store positively (mean = 1.5820), although substantial heterogeneity around this mean is observed (S.D. = 2.6664). Moreover, households seem to pay greater attention to the waiting time at cash registers, as suggested by the estimated parameter on this variable.

We introduce several interaction terms with store formats in order to capture the variety of shopping patterns across formats. We note that the willingness of single-person households to visit a large grocery store is lower whatever the format.
Moreover, living in a rural town raises the marginal valuation of the supermarket format. In addition, for a given store we count the number of rivals within radii of 0.5 km and 2 km by format, and interact these variables with the store's own format. The interest of introducing these variables is twofold. First, we investigate the nature of competition among store formats; second, we control for the endogeneity bias discussed above by accounting for elements that may influence the price setting of store managers. Interestingly, the willingness to choose a hard discount store increases with the number of hypermarkets and supermarkets located within a radius of 0.5 km, whereas the effect is the opposite if we extend the radius to 2 km. This may suggest that hard discount stores take advantage of the store traffic generated by large grocery stores in their immediate surroundings. By contrast, hypermarkets suffer from this close competition, as suggested by the estimated parameter (-2.8008). However, when we extend the radius to 2 km, the effects are reversed, revealing that the complementarity effect at play between hypermarkets and hard discount stores may turn into a substitutability effect depending on the distance between them.

One advantage that a mixed logit model has over a simple logit model is that it provides accurate estimates of substitution patterns, since cross-price elasticities vary across competing alternatives. We determine the elasticities of the market share
