Mass gatherings contributed to early COVID‐19 mortality: Evidence from US sports

Social distancing is important to slow the community spread of infectious disease, but it creates enormous economic and social cost. Thus, it is important to quantify the benefits of different measures. We study the ban of mass gatherings, an intervention with comparably low cost. We exploit exogenous variation in the number of National Basketball Association and National Hockey League games, which arises due to the leagues' predetermined schedules, and the sudden suspension of the 2019–2020 seasons. We find that, among clusters of counties that are adjacent to sports venues, each additional mass gathering increased the cumulative number of COVID‐19 deaths by 10.3%.

extended periods of time. A temporary mass gathering ban is relatively cheap and easy to implement compared to, for example, school and workplace closures. In response to the spread of SARS-CoV-2, the pathogen leading to COVID-19, several prominent events have been canceled or postponed, even before widespread quarantine measures were enacted (McCloskey et al., 2020). These include religious, cultural, and sporting events. While most major sports leagues in the US have performed without live attendance or at significantly reduced capacity, the debate over the necessity of such restrictions gathered fresh momentum when the Texas Rangers franchise in Major League Baseball decided to stage their first game of the 2021 season in front of a full-capacity crowd of 40,300. This announcement was met with sharp criticism by experts and politicians, as it came during a phase when infection rates in the US were on the rise once again. Recent outbreaks have led US major sports leagues to quarantine a significant number of players, while the National Basketball Association (NBA) and National Hockey League (NHL) were forced to postpone a series of games. We quantify how NBA and NHL games have contributed to the early spread of COVID-19 in the United States. Both leagues play exclusively in indoor venues, which present a high-risk setting for infectious disease transmission. Before play was suspended on March 12, up to 12 games per league with an average audience of about 18,000 people were held per day. We analyze how much games held between March 1 and March 11 have contributed to the community spread of COVID-19 in counties surrounding NBA and NHL venues. Since the game schedules were determined long before the first COVID-19 case became public, their spatial and temporal distribution should be unrelated to the initial spread of COVID-19 in the US.
In fact, we can show that game schedules are not correlated with observable county characteristics and that ticket sales did not systematically change until the NBA and NHL suspended play. While we do not have data on actual game attendance, we note that any no-show behavior would imply that our estimates are conservative and that the true effect on COVID-19 mortality is perhaps much higher. Our results suggest that each additional indoor mass gathering between March 1 and 11 in the form of an NBA or NHL game increased the cumulative number of COVID-19 cases (on April 30, 2020) in counties that either host sports venues or are adjacent to venue-hosting counties by at least 269 per million population (p < 0.01), or 9.2% of the average case rate among counties in our sample, and the number of COVID-19 deaths per million by 15 (p < 0.01), or 10.3% of the average death rate. These effects are larger in densely populated areas and in colder regions. A placebo check in which we reassign venues to large US cities that currently do not host sports teams confirms our results. We conclude that banning indoor mass gatherings has an enormous potential to save lives. This is especially important given that such measures are relatively easy and cheap to implement.
Our results contribute to the literature evaluating the role of mass gatherings in the spread of infectious disease, and the benefits of social distancing more generally. In a recent survey, Nunan and Brassey (2020) conclude that the impact of mass gatherings on COVID-19 is still poorly understood. Some evidence originates from outdoor and partly non-randomly scheduled events. Studying outdoor soccer matches in the UK in March 2020, Olczak et al. (2020) find that one additional such event increases the number of COVID-19 deaths per million by 20. Focusing on the reopening of soccer stadiums to reduced crowds during the second COVID-19 wave in Germany in late summer/autumn, Fischer (2022) finds moderate effects of an additional game on regional infection rates. These findings are not surprising, as hygiene concepts had been in place across the board. In addition, reopening was certainly endogenous, as the pandemic situation was well monitored in Germany during that time. This had repercussions for the potential COVID-19 spread, as the scheduling of games and, in particular, spectator allowances were adapted according to regional pandemic indicators. In a closely related study, Carlin et al. (2021) take a similar empirical approach and arrive at conclusions comparable to ours.
Another notable contribution is Mangrum and Niekamp (2020), who present evidence that college student travel contributed to the spread of COVID-19. Their estimates show that counties with more early spring break students had higher confirmed case growth rates than counties with fewer early spring break students. A very particular type of mass gathering is considered by Dave et al. (2021), who analyze the effects of the January 6, 2021 US Capitol riots on the spread of COVID-19. They find no significant contribution of this event to the pandemic.
For other infectious diseases there is more evidence, but mostly in the form of retrospective observational studies (Hoang & Gautret, 2018; Karami et al., 2019; Rainey et al., 2016). The best available evidence suggests that multiple-day events with crowded communal accommodations are most strongly associated with increased risk of infection (Nunan & Brassey, 2020). Two papers relate sport events to local seasonal influenza mortality. Stoecker et al. (2016) show that National Football League team appearances in the Super Bowl caused an 18% increase in flu mortality in the population over age 65, and Cardazzi et al. (2020) find that US cities that come to host a professional sports team experience an increase in local influenza mortality by an estimated 4%-24%. Our estimates contribute to the growing design-based evidence on the impact of mass gatherings on the community spread of COVID-19.
Other non-pharmaceutical interventions (NPIs) have received more attention in the context of COVID-19. These studies differ with respect to outcomes, interventions, and geographic coverage. Gupta  that aim at fostering social distancing have affected people's mobility. The authors proxy mobility with cell signal data, and find shelter-in-place orders (SIPOs) to have the largest mobility-reducing impact. Two studies examine the impact of SIPOs on COVID-19 cases and deaths. Dave, Friedson, Matsuzawa & Sabia (2020) exploit variation in SIPOs across time and all US states. Their results suggest that approximately 3 weeks following the adoption of a SIPO, cumulative COVID-19 cases fell by 44%. Friedson et al. (2020) focus on California, which was the first state to enact a SIPO. Using a synthetic control design, they find that California's SIPO reduced cases by 125.5 per 100,000 population and deaths by 1661. Similar evidence is provided for Texas (Dave et al., 2020b). All these papers use difference-in-differences designs, of which Goodman-Bacon and Marcus (2020) provide a critical account in the context of evaluating NPIs. Glaeser et al. (2020) investigate the effect of SIPO regulations on mobility and COVID-19 infections for five large US cities. They estimate a decrease in infections of 19% for every 10 percentage point fall in mobility.
Extensive work has focused on previous pandemics. However, the majority of these studies are descriptive in nature. Studying the 1918 influenza pandemic, Bootsma and Ferguson (2007), Hatchett et al. (2007), and Markel et al. (2007), for example, find a strong correlation between excess mortality and how early public health measures were enacted in US cities. It is difficult to infer causality from these results, since NPIs are not exogenous and may be enacted in response to preexisting trends in death rates. Barro (2020) attempts to account for this endogeneity by using the distance to army ports in Boston as instrumental variables for NPI introduction. He argues that, because the influenza spread from Boston to other US cities, the farther away cities are from Boston, the more time they had to react and implement NPIs. Barro finds no effect on overall deaths, but that the ratio of peak to average deaths decreased (i.e., a flatter curve). Chapelle (2020) finds a similar pattern using a difference-in-differences model exploiting differences in the timing of NPI introduction. He claims that the lack of herd immunity in subsequent years offset the initial reduction in deaths during the peak of the pandemic, which led to an overall zero effect on deaths. Adda (2016) investigates the effects of school closures on the transmission of different viral diseases, and finds a significant reduction in disease prevalence. However, he concludes that school closures, as well as other policies reducing inter-personal contacts, are not cost-effective. For recent influenza waves, there is some suggestive evidence that school closures (e.g., Earn, 2012; Wheeler et al., 2010) and workplace social distancing (e.g., Ahmed et al., 2018; Miyaki et al., 2011) may be associated with lower disease transmission. However, this literature consists mostly of small case studies on scheduled school closures (e.g., during holidays) or single firms. Viner et al. (2020) conclude that school closures were largely ineffective in controlling past Coronavirus outbreaks (i.e., SARS and MERS).
The costs of school and workplace closures are massive. Fuchs-Schündeln et al. (2020) argue that school closures associated with COVID-19 lead to substantial and permanent welfare losses for children, which can only partly be offset by parental reactions. These long-lasting effects are confirmed by Agostinelli et al. (2020). Sadique et al. (2008) estimate that school closures in the United Kingdom could cost up to £1.2 billion per week.
In the early stages of COVID-19, Alexander and Karger (2020) find that people already traveled 9% less and made 13% fewer visits to non-essential businesses. Their preliminary evidence suggests that consumer spending for over 1 million small US businesses may be reduced by 40%. In a recent survey, respondents reported average wealth losses due to COVID-19 of about $33,000 (Coibion et al., 2020). However, Greenstone and Nigam (2020) estimate that even a moderate form of social distancing (i.e., isolation of suspect cases and their family members and social distancing of the elderly) beginning in March 2020 can reduce COVID-19 fatalities by almost 1.8 million over the following 6 months, amounting to economic benefits of almost $8 trillion. Similarly, Thunström et al. (2020) estimate the potential benefits of social distancing at around $5.2 trillion.
The remainder of the paper is structured as follows. Section 2 describes our data sources. In Section 3, we present our estimation strategy. Sections 4 and 5 report the main results and a heterogeneity analysis. Section 6 presents results from different placebo tests. Section 7 reports different sensitivity checks. Section 8 provides concluding comments. We relegate additional figures and tables to the Appendix.

2 | DATA
Our estimation sample consists of 38 counties that host either an NBA or an NHL venue, or both, and all their 204 neighboring counties. We call a venue county together with its adjacent counties a "perimeter" (see Figure 1). Perimeters can overlap for counties that are adjacent to multiple venue counties. For all counties, we collect information on COVID-19 cases and deaths. In Appendix Figure A.1, we show the number of cases (panel a) and deaths (panel b) per million population measured on March 13 (indicated by the left scatter) and on April 30 (the right scatter) for each venue county, grouped by state, in our data. Additionally, we compute the average number of cases and deaths across each set of neighboring counties. The highest increases are in Essex County, NJ; Orleans Parish, LA; and Suffolk County, MA.
To generate our treatment variable, we use information on NBA and NHL games played between March 1 and March 11. During this time span, 78 NBA games (on average about 7 per day) and 57 NHL games (on average 5 per day) were played in US venues. Both leagues suspended all remaining games for the 2019/2020 season indefinitely on March 12. The NBA canceled two games right before tip-off on March 11: Utah Jazz at Oklahoma City Thunder, where Utah player Rudy Gobert tested positive for COVID-19 prior to the game, and New Orleans Pelicans at Sacramento Kings, due to a suspected infection involving a referee who was part of the officiating crew in a Utah Jazz game earlier the same week. We do not include these two games in our analysis.
To generate covariates, we collect county-level data from different sources. First, we obtain data on population by age, sex, and ethnicity from the 2016 US census provided by the Surveillance, Epidemiology, and End Results (SEER) Program, US State and County Population Data. Second, we use a set of county-level socioeconomic and healthcare-related variables linked to COVID spread provided by Killeen et al. (2020). These include the 2018 unemployment rate, the share of population with a high school degree, the number of housing units per million population, the median household income, the number of hospitals, ICU beds, physician assistants, and nurses per million population. Third, we draw data on the number of large (international) airports in a county from the International Air Transport Association. Fourth, we use 2020 survey information from Morning Consult to construct the share of Black fans for each NBA and NHL team. To stratify our analysis, we use, among others, information on population density, climate, and the timing of SIPOs. The information on county land area is collected from the US Census Bureau. Historical climate data at the county level, including data on April temperatures, are collected from the National Centers for Environmental Information. Finally, information on the introduction of SIPOs on the state level is taken from Dave, Friedson, Matsuzawa & Sabia (2020). Descriptive statistics for all variables used in our empirical analysis by county type are presented in Table 1.

3 | ESTIMATION STRATEGY
In our econometric analysis, we aim to explain the cumulative number of COVID-19 infections and deaths in a given county c located in state s, affected by venue v. The sample comprises counties that host an NBA or NHL venue and those adjacent to a venue-hosting county. For each venue v, we construct perimeters, which are sets of counties comprising both the venue county itself and all its adjacent counties.
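The perimeter construction can be illustrated with a minimal sketch. The county labels, the adjacency map, and the function names below are hypothetical and serve illustration only. The sketch assumes that a county lying in several perimeters accumulates the games of every venue whose perimeter contains it.

```python
from collections import defaultdict


def build_perimeters(venue_counties, adjacency):
    """Map each venue county to its perimeter: itself plus its adjacent counties."""
    return {v: {v} | set(adjacency.get(v, ())) for v in venue_counties}


def county_exposure(perimeters, games_by_venue):
    """Sum the games of every venue whose perimeter contains the county."""
    exposure = defaultdict(int)
    for venue, counties in perimeters.items():
        for county in counties:
            exposure[county] += games_by_venue.get(venue, 0)
    return dict(exposure)


# Toy example with made-up labels: county "C" borders both venue counties.
adjacency = {"A": ["B", "C"], "D": ["C", "E"]}
games_by_venue = {"A": 3, "D": 5}

perimeters = build_perimeters(["A", "D"], adjacency)
exposure = county_exposure(perimeters, games_by_venue)
print(exposure)
```

In this toy example, county "C" is exposed to the games of both venues, while "B" and "E" are exposed only to the games of their single neighboring venue.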

FIGURE 1 National Basketball Association (NBA) and National Hockey League (NHL) venues and adjacent counties in the United States. This map provides an overview of the counties we use in our analysis. Counties where venues are located are marked with red dots. The light-gray shaded counties are adjacent to either an NBA or an NHL venue; the dark-gray shaded counties are in the perimeter of both an NBA and an NHL venue.

The dependent variable, $\text{deaths}_{c(v,s)}$, is defined as the cumulative number of COVID-19 deaths in county c (measured on April 30, 2020) per one million population. The average death rate in venue and adjacent counties is 147.9 with a standard deviation of 262.5 (see Table 1). The explanatory variable of primary interest, $\text{games}_{c(v,s)}$, measures the cumulative number of games (NBA and NHL) at venue v between March 1 and 11. Starting from March 12, both leagues suspended their seasons and all games were canceled. On average, there were 3.5 games, with considerable variation to exploit: The number of games varies between 0 and 16, with a standard deviation of 2.69.
We note two important things. First, since our treatment is time-invariant, we construct our sample as a cross-section. Second, we assume that venue counties and counties adjacent to venue counties are affected in the same manner by a sports game. If an adjacent county is in the perimeter of multiple venue counties, we add up the number of games the adjacent county is exposed to. This set-up translates to the following dose-response model:
$$\text{deaths}_{c(v,s)} = \beta \cdot \text{games}_{c(v,s)} + X_c'\delta + \gamma_s + \varepsilon_{c(v,s)}, \qquad (1)$$
where $X_c$ are county-level controls with coefficient vector $\delta$, and $\gamma_s$ are state fixed effects. Our county-level controls comprise population density, the sex-race-age distribution, and a set of sociodemographic and healthcare-related variables possibly correlated with COVID-19 spread that are listed in Section 2. We refrain from including venue fixed effects, as we observe 75 counties which are affected by events held at more than one venue. We estimate model (1) with county population weights. Our main parameter of interest is $\beta$, which captures the impact of an additional mass gathering due to an NBA or NHL game on the cumulative number of COVID-19 deaths. Given that the game schedules were determined long before the first COVID-19 case became public, there should be no correlation between $\text{games}_{c(v,s)}$ and the error term $\varepsilon_{c(v,s)}$. This identifying assumption is supported by the fact that the number of games does not correlate with observed county characteristics (see Appendix Table A.1).
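To make the specification described above concrete, the following is a minimal sketch of a population-weighted least-squares regression of an outcome on the number of games, county controls, and state dummies. It runs on synthetic data only. The function name, the simulated variables, and the true effect of 15 are assumptions for illustration and do not reproduce our data or estimates.

```python
import numpy as np


def wls_games_effect(y, games, controls, state, pop):
    """Return the population-weighted least-squares coefficient on games.

    Regressors: games, county controls, and a full set of state dummies
    (the dummies absorb the intercept, i.e., state fixed effects).
    """
    y = np.asarray(y, dtype=float)
    games = np.asarray(games, dtype=float)
    controls = np.asarray(controls, dtype=float).reshape(len(y), -1)
    state = np.asarray(state)
    dummies = (state[:, None] == np.unique(state)[None, :]).astype(float)
    Z = np.column_stack([games, controls, dummies])
    # WLS equals OLS on rows scaled by the square root of the weights.
    w = np.sqrt(np.asarray(pop, dtype=float))
    coef, *_ = np.linalg.lstsq(Z * w[:, None], y * w, rcond=None)
    return coef[0]


# Synthetic data (NOT the paper's data); the true effect is set to 15.
rng = np.random.default_rng(0)
n = 242
state = rng.integers(0, 10, size=n)
games = rng.integers(0, 8, size=n)
controls = rng.normal(size=(n, 2))
pop = rng.uniform(1e4, 1e6, size=n)
state_effect = rng.normal(scale=20.0, size=10)
y = (15.0 * games + controls @ np.array([5.0, -3.0])
     + state_effect[state] + rng.normal(size=n))

beta_hat = wls_games_effect(y, games, controls, state, pop)
print(round(beta_hat, 2))
```

The sketch returns only the point estimate. Standard errors and the full control set are deliberately left out.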
There are four potential issues with regard to the interpretation of our estimate. First, there could be anticipation effects, in the sense that people may have increasingly refrained from visiting games prior to the lockdown. This would lead to an attenuation bias, which implies that our estimate is a lower bound of the actual effect. Unfortunately we do not have data on game attendance, but we can show that ticket sales did not systematically change before suspension of play (see Appendix Figure A.2). Second, our results can only speak to US counties that either host or are adjacent to a sports venue. These are primarily urban regions, where disease may spread more rapidly than in less populated areas. Also, there may be spillovers to counties not in the data that we cannot account for. Although we know little about whether fans travel inter-state for sports games, secondary infections may cause the virus to spread beyond our perimeters. This again implies that our findings are conservative estimates. Third, for now we assume that an additional game has the same effect on each county. This may be a strong assumption, because there are substantial disparities in land area and population density across counties. To account for this heterogeneity, we provide separate estimates for low and high population density areas in Section 5. Fourth, there may be differences in testing availability we cannot account for. This is the reason why we consider both cases and deaths as outcomes, where the latter should be less affected by access to tests. Also, we believe that most of the variation in testing is captured by our state fixed effects.

4 | MAIN ESTIMATION RESULTS
Our estimation results are summarized in Tables 2 and 3. We estimate the model in Equation (1) both on the cumulative number of COVID-19 cases (Table 2) and deaths (Table 3) per million population. We find a significant positive effect of the number of mass gatherings on both of these outcomes. Our preferred estimates (column 5) indicate that each additional mass gathering between March 1 and 11 increased cases by 269 per million and deaths by approximately 15 per million population. These are substantial effects. Compared to the average case and death rates across the counties in the data, our estimates correspond to increases of 9.2% and 10.3% per game, respectively. Both estimates are statistically significant at the 1% level. These findings are robust across different specifications. In column (1), we show the unconditional relationship between cases/deaths and games. In column (2), we introduce state fixed effects. In column (3), we additionally include a binary indicator capturing whether the county hosts a venue or is an adjacent county, the population density, and the shares of non-Whites, people above 60 years of age, and females in the population. In column (4), we alternatively use the full interacted sex-race-age distribution, defined as a set of 16 variables capturing the share of sex g, of race h, and in age-group i in the population, where h is White or non-White, and i is 0-19, 20-39, 40-59, or 60+. Column (5) additionally controls for a set of sociodemographic and healthcare-related variables that may be related to COVID-19 spread. This is our preferred specification. When analyzing deaths in Table 3, we additionally control for the number of confirmed cases by March 13 in columns (3) to (5). Our covariates do not have causal interpretations, hence we refrain from interpreting them.
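The conversion from per-million effects to percentage effects can be traced with a short calculation. The sketch below uses the April 30 point estimates quoted in the text (15.195 deaths and 269.131 cases per million) and the average death rate of 147.9. The average case rate is not quoted in the text, so the value computed here is only the rate implied by the reported 9.2%, not a reported statistic.

```python
# Point estimates for April 30 as quoted in the text (per million population).
beta_deaths = 15.195
beta_cases = 269.131

# Average death rate in venue and adjacent counties (Table 1).
mean_death_rate = 147.9

# Relative effect per game, in percent of the average death rate.
rel_deaths = 100 * beta_deaths / mean_death_rate

# Implied (not reported) average case rate, backed out from the 9.2% figure.
rel_cases_reported = 9.2
implied_mean_case_rate = beta_cases / (rel_cases_reported / 100)

print(f"deaths: {rel_deaths:.1f}% per game")
print(f"implied mean case rate: {implied_mean_case_rate:.0f} per million")
```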
In Figure 2, we provide an overview on the dynamics underlying these effects. The horizontal axis measures time from March 1 to April 30. The black squares capture the cumulative number of games (NBA plus NHL) before the leagues suspended play on March 12, which is indicated by the red line. The hollow circles measure the estimated effect of an additional game on the cumulative number of COVID-19 cases (Panel A) and deaths (Panel B) on each day between March 13 and April 30. Each estimate comes from a separate regression, with the dependent variable being measured on different days. The right-most estimate is our baseline. A priori, we expect effects to be strongest around 3 weeks after the shutdown. This is precisely what we find. The effect of games starts to pick up around March 19 and increases at a decreasing rate since then. This is true for both cases and deaths. Furthermore, we see that cases respond sooner than deaths, which makes sense given the natural lag between diagnosis and death. In terms of magnitudes, estimates for COVID-19 deaths (cases) range between 0.002 (0.367) on March 13 and 15.195 (269.131) on April 30.
If we split our treatment variable and count NBA and NHL games separately, we find that games in both leagues positively affect COVID-19 spread (Table A.2 in the Appendix). While the effect on cases (column 1) is statistically insignificant for NHL games, the point estimate is indeed economically significant at over 200. When looking at deaths, we find large positive effects for both leagues.

5 | HETEROGENEITY
So far, we have established that mass gatherings in early March increased COVID-19 deaths in counties surrounding NBA and NHL venues by 10.3% per game. Additionally, we are interested in whether there is heterogeneity in these effects by county characteristics. In Figure 3, we therefore stratify our sample by population density, ethnic composition (measured by the share of Black people in the population), average temperature, and policy responsiveness (i.e., when SIPOs were first introduced). We split each variable at its sample median and repeat our regressions from above. In Panel (a) we consider cases, in Panel (b) deaths. Due to the smaller sample sizes, we report 90% confidence intervals. Effects are stronger in areas with high population density. This is to be expected; where people live close to one another, the risk of transmission is greatly increased. In less densely populated areas, the effect of mass gatherings is close to zero and statistically insignificant. This is true for both cases and deaths. We also find different effects if we split the sample by the share of Black people in the population. The estimated effects for deaths are driven by counties with a low share of Black people. This is surprising, given that early reports in the medical literature suggest that Black people tend to be affected more strongly by COVID-19 than other ethnic groups (e.g., Yancy, 2020). In terms of temperature, we find that colder areas clearly drive our effects. In counties with below-median temperatures, the effect on deaths is almost five times as high as in the baseline. This is in line with the idea that the virus replicates more easily in lower-temperature conditions, while higher temperatures, and in particular sunlight, may offer protection against infection (Slusky & Zeckhauser, 2020). However, the literature has not yet reached consensus on whether this is indeed the case for COVID-19. While some early reports from China document a negative correlation between temperature and COVID-19 spread (e.g., Wang et al., 2020), others find no such (or even a positive) connection (e.g., Ma et al., 2020; Yao et al., 2020). Note that, because we consider indoor mass gatherings, temperature plays a lesser role for primary infections, but it may be important in determining the extent of secondary and subsequent infections. While the effect for deaths is somewhat larger for counties that adopted SIPOs late, we do not find pronounced effect heterogeneity by how early states enacted SIPOs. We assume that this is because SIPO timing measures two countervailing effects: while SIPOs reduce disease transmission, early-adopter states may also be more strongly hit by the pandemic.

FIGURE 3 Treatment effect heterogeneity. We replicate our results in sub-samples defined by the median of the respective stratification variable. Estimates by county-level population density are based on a split along the median of the population density distribution of all 242 counties in our data. The county-level share of African American population was calculated using 2016 US census data provided by the National Bureau of Economic Research. The third sample split is based on the maximum temperature in April for the 20 most recent years, 1998-2019. Low indicates counties below the median of this long-term temperature measure, high indicates counties above the median. Finally, we split along the median number of days statewide shelter-in-place order (SIPO) regulations were in place by April 30.

6 | PLACEBO TESTS
If our results indeed measure the effect of sports games on COVID-19 deaths, we should not find systematic effects on deaths in counties that are not exposed to sports games. Therefore, we suggest an in-space placebo test using continental US counties outside our sample to validate our results. We think of this as randomly reassigning venues to counties not in our data (i.e., the white-colored counties in Figure 1), and repeating our estimations from above. To construct this hypothetical scenario, we identify counties that are similar in observables to our venue counties. Essentially, we approach this by estimating conditional propensities to host a venue for each county, and then looking for nearest-neighbor pairs in terms of these propensity scores. We then construct perimeters around each placebo county, similar to our main sample. This gives us a sample of 38 placebo venue counties, and 148 adjacent counties. The results of this exercise are summarized in Figure 4. To show how estimates change over time, we repeat our rolling window estimates from before. We find robust zero effects on both cases and deaths.
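The matching step can be illustrated with a minimal sketch of nearest-neighbor matching on propensity scores. The scores below are made up, and the sketch assumes greedy one-to-one matching without replacement. The text does not specify these details, so this should be read as one plausible variant rather than our exact procedure.

```python
import numpy as np


def nearest_neighbor_match(scores_venue, scores_pool):
    """Greedy 1:1 matching without replacement on the propensity score.

    Returns, for each venue county, the index of the matched pool county.
    """
    scores_pool = np.asarray(scores_pool, dtype=float)
    available = np.ones(len(scores_pool), dtype=bool)
    matches = []
    for s in scores_venue:
        dist = np.abs(scores_pool - s)
        dist[~available] = np.inf  # already-matched counties cannot be reused
        j = int(np.argmin(dist))
        available[j] = False
        matches.append(j)
    return matches


# Hypothetical propensity scores (assumed to be estimated beforehand).
venue_scores = [0.90, 0.60, 0.75]
pool_scores = [0.10, 0.58, 0.88, 0.70, 0.30]

placebo_idx = nearest_neighbor_match(venue_scores, pool_scores)
print(placebo_idx)
```

Greedy matching depends on the order in which venue counties are processed. An optimal-matching routine would avoid this at the cost of more code.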
In a similar vein, we randomly reassign the number of games across the counties in our sample 500 times and plot the resulting treatment effects ordered by their magnitude. This is shown in Appendix Figure A.3. As expected, our baseline findings indicated by the blue triangles are among the largest estimates of this permutation exercise. For cases (panel a), the true estimate ranks first and for deaths (panel b) it ranks fourth across permutations.
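A permutation exercise of this kind can be sketched as follows. The data are synthetic, and the regression is reduced to a bivariate slope without controls, fixed effects, or weights, so the sketch shows only the mechanics of reshuffling the treatment and ranking the observed estimate. All names are illustrative.

```python
import numpy as np


def slope(y, x):
    """OLS slope of y on x with an intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def permutation_rank(y, games, n_perm=500, seed=0):
    """Reshuffle games across counties and rank the observed estimate."""
    rng = np.random.default_rng(seed)
    observed = slope(y, games)
    placebo = np.array(
        [slope(y, rng.permutation(games)) for _ in range(n_perm)]
    )
    # Rank 1 means no permuted estimate exceeds the observed one.
    rank = 1 + int(np.sum(placebo > observed))
    return observed, placebo, rank


# Synthetic data (NOT the paper's data) with a true effect of 15.
rng = np.random.default_rng(1)
games = rng.integers(0, 8, size=242)
deaths = 15.0 * games + rng.normal(scale=20.0, size=242)

observed, placebo, rank = permutation_rank(deaths, games)
print(round(observed, 2), rank)
```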
In a final placebo test, we control for away games in our estimations. Although fans may still gather to watch the games, this variable should have less of an influence on COVID-19 transmissions in the home venue county and its perimeter. This is confirmed by Appendix Table A.3, where we add the number of a team's away games between March 1 and March 11 as a control variable to our main specification. We find that the home game coefficients remain practically unchanged, while the point estimate on away games is statistically insignificant.

7 | SENSITIVITY CHECKS
To assess the sensitivity of our findings, we replicate our main analysis using different estimation samples and techniques, weights, outcome definitions, and inference methods. First, we modify our estimation sample by omitting each state once and estimating effects for all remaining states. This exercise is in the spirit of jackknife resampling; it should reveal whether individual states have enough leverage to drive our estimates. We obtain 31 different leave-one-out estimation samples; the estimates based on those samples are summarized in Appendix Figure A.4. Panel A shows the estimated coefficients for the cumulative number of COVID-19 cases on April 30. All estimated coefficients are statistically significant and have overlapping 95% confidence intervals. Panel B shows the estimated coefficients for the cumulative number of COVID-19 deaths. With one exception, all coefficients are statistically significant, and confidence intervals are again overlapping.
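The leave-one-state-out exercise can be sketched in a few lines. The data are again synthetic and the regression is reduced to a bivariate slope, so the state labels, sample size, and true effect of 15 are assumptions for illustration only.

```python
import numpy as np


def slope(y, x):
    """OLS slope of y on x with an intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def leave_one_state_out(y, games, state):
    """Re-estimate the slope once per state, dropping that state's counties."""
    y, games, state = map(np.asarray, (y, games, state))
    estimates = {}
    for s in np.unique(state):
        keep = state != s
        estimates[str(s)] = slope(y[keep], games[keep])
    return estimates


# Synthetic data (NOT the paper's data) with a true effect of 15.
rng = np.random.default_rng(2)
state = rng.choice(["CA", "NY", "TX", "FL"], size=200)
games = rng.integers(0, 8, size=200)
deaths = 15.0 * games + rng.normal(scale=20.0, size=200)

loo = leave_one_state_out(deaths, games, state)
for s, b in loo.items():
    print(f"without {s}: {b:.2f}")
```

A state whose omission moves the estimate far from the others would stand out in this output, which is the leverage check described above.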
We want to point out two observations. First, omitting California leads to wider confidence intervals. This can be explained by the fact that a substantial number of counties in our sample are located in California (26 counties or 10.74%). If we reduce the sample in such a substantial manner, our estimate is still positive but not statistically significant at the 5% level. Second, omitting New York reduces the estimated effect size from 15 deaths (269 cases) to 7.6 deaths (109 cases). The estimated coefficients are still statistically significant at the 10% level, and the corresponding confidence intervals overlap with those resulting from the other restricted samples. Overall, we conclude that our findings are not driven by any single state.
Second, we repeat our analysis without population-weighting the regressions. For the cumulative number of COVID-19 cases, we obtain larger coefficients with larger standard errors (see Panel A.1 in Appendix Figure A.5). For the cumulative number of COVID-19 deaths, the coefficients hardly change, but standard errors increase somewhat (see Panel A.2 in Appendix Figure A.5). For both outcomes, the overall pattern is comparable to our weighted baseline estimates. Third, we present our baseline estimates with clustered standard errors at the venue level. In our baseline estimation, we refrained from clustering, since we have a rather low number of clusters and some clusters have only few observations (see Figure 1; Cameron & Miller, 2015). It is reassuring that our estimates on the cumulative number of COVID-19 cases and deaths remain highly statistically significant if we use clustered standard errors at the venue level (see Panels B.1 and B.2 in Appendix Figure A.5).
Fourth, to use the dynamic nature of our outcome data, we construct an alternative data set where cumulative cases and deaths are measured over time in a panel covering 30 days from April 1-30, with the treatment variable being held fixed (i.e., games between March 1 and 11). Tables A.4 (cases) and A.5 (deaths) in the Appendix present these results. In column 1, we estimate the effect of the cumulative number of games from March 1 to 11 on daily cases and deaths between April 1 and 30. We estimate an increase of over 200 confirmed COVID-19 cases and 9 deaths for each additional game. Columns 2 through 4 split the outcome window in three parts: April 1-10 (column 2), April 11-20 (column 3), and April 21-30 (column 4). Here we see that the estimated effects increase over time, similar to our static model. In column 5, we use the number of cumulative games lagged by 1 month as the treatment variable and estimate effects for the first 11 days of April. Since the lagged number of games is time-varying, we can estimate these models with county fixed effects. The results confirm the positive effect of the cumulative number of games on COVID-19 cases and deaths. We can also use this data set to estimate event studies where the number of games is interacted with dummy variables indicating every day between April 1 and April 30 (Figure A.6 in the Appendix). This leads to very similar effect patterns as in Figure 2, where we estimate separate regressions for every day.
Finally, we log-transform our outcome variables and reestimate our models in Table A.6 in the Appendix. We find that, across specifications, coefficients are positive and statistically significant. Our most conservative estimates would imply an 11.2% increase in infections (panel A, column 5) and a 13.7% increase in deaths (panel B, column 5). Overall, we are confident that our estimates indeed measure a causal effect of indoor mass gatherings on COVID-19 mortality.
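As a reading aid for log-outcome specifications, the sketch below shows the standard conversion from a coefficient in log points to an implied percent change. The coefficient value is hypothetical and is not taken from Table A.6. The text does not state which conversion underlies the percentages reported above, nor whether the transformation is log(y) or a variant such as log(1 + y), so this is an illustration under the assumption of a plain log(y) outcome.

```python
import math


def pct_change_from_log_coef(beta: float) -> float:
    """Exact percent change in the outcome implied by a log-linear coefficient."""
    return (math.exp(beta) - 1.0) * 100.0


beta = 0.10  # hypothetical coefficient on the number of games (log points)

approx = 100.0 * beta                   # common small-coefficient approximation
exact = pct_change_from_log_coef(beta)  # exact conversion, slightly larger

print(f"approximate: {approx:.2f}%, exact: {exact:.2f}%")
```

For small coefficients the two numbers are close, and the gap widens as the coefficient grows.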

POLICY CONCLUSIONS
As argued by Adda (2016), it is crucial to understand which measures are able to effectively contain a new virus pandemic at a particular point in time. During the COVID-19 pandemic, mass gatherings have been an important consideration in terms of possible safety protocols and NPIs. In particular, sports leagues worldwide have contemplated when to stop operations, restrict attendance, or adapt COVID-19 safety protocols.
In this paper, we present estimates for the impact of mass gatherings in the form of NBA or NHL games on the community spread of COVID-19 in the US. We find that one additional game increased the cumulative number of COVID-19 deaths in urban areas and surrounding counties by 10.3%. Multiplying this by the average number of games in affected counties (3.47) yields a 36% increase, which amounts to around half of the difference in death rates between affected counties and the matched placebo counties used in our robustness check, which neither host nor are adjacent to sports venues. We conclude that banning mass gatherings is an effective NPI to slow the spread of COVID-19.
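The 36% figure follows from a simple product of the two numbers quoted above. The short sketch below reproduces that arithmetic; the variable names are illustrative, and the linear scaling of the per-game effect is the assumption implied by the multiplication in the text.

```python
# Figures quoted in the text.
effect_per_game = 0.103  # 10.3% increase in cumulative deaths per additional game
avg_games = 3.47         # average number of games in affected counties

# Assumption: the per-game effect scales linearly with the number of games.
total_effect = effect_per_game * avg_games

print(f"{total_effect:.1%}")  # prints 35.7%, i.e., roughly 36%
```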
Since our results stem from a time when wearing face masks was still uncommon, we interpret our results as upper-bound estimates of the benefits of canceling games in future pandemics if mask mandates and other hygiene measures are enacted by sports venues. Also, holding events without live audiences may help to mitigate disease transmission while minimizing the welfare losses that come with canceling sports games. Our results show that teams' away games do not seem to affect COVID-19 spread, which suggests that fans meeting to watch games at home were not an important driver of the pandemic. Finally, our results indicate the consequences of reopening indoor sports venues at full capacity without elaborate hygiene concepts or sufficient vaccination coverage, even when infection and death rates are low.
While the benefits of banning mass gatherings are tremendous, the costs are likely low. Estimates in NBA circles suggest, for example, that each game yields an average $1.2 million in gate revenue. 21 This figure comprises all gameday revenue, including tickets and concessions, but excludes revenues from TV and sponsoring deals and the resulting consumer surplus. 22 The latter two components might not be lost if games are played without an audience. More importantly, however, the opportunity costs of banning sports games are likely much lower than those of other NPIs. This is supported by a recent upswing in TV ratings for the NBA and NHL. 23 While it is inevitable that some jobs may be lost in the process, we believe that the resulting human capital loss is orders of magnitude smaller than what we would expect from, for example, school or workplace closures. We cannot rule out, though, that sports audiences might change their consumer behavior and turn to other mass events. Consequently, to be effective, any NPI directed at (indoor) mass gatherings should account for potential substitution effects.
A major limitation of our study is that we cannot speak to outdoor mass gatherings. Overwhelming evidence suggests that COVID-19 is primarily transmitted via aerosols (Bourouiba, 2020; Morawska & Cao, 2020), which implies that indoor gatherings carry a particularly high risk of infection. Outdoor events, especially when certain safety protocols (minimum physical distance, mandatory masks, etc.) are established, may be less risky. There is early evidence, for example, that the 2020 Black Lives Matter protests in the US did not lead to an increase in COVID-19 cases and deaths (Dave et al., 2020a).

ACKNOWLEDGMENTS
For helpful comments we thank the editor Gary Wagner and two anonymous referees, Jochen Güntner, Brad Humphreys, Dominik Schreyer, Carl Singleton, Martin Watzinger, and participants at the Reading Online Sport Economics Seminar (ROSES). The usual disclaimer applies. Financial support from the Christian Doppler Laboratory "Aging, Health and the Labor Market" is gratefully acknowledged.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.

ENDNOTES
1 The WHO describes a mass gathering as "a planned or spontaneous event where the number of people attending could strain the planning and response resources of the community or country hosting the event. The Olympic Games, The Hajj, and other major sporting, religious, and cultural events are all examples of a mass gathering." accessed June 9, 2020.
22 Other factors that we cannot account for and that would likely increase the cost of banning mass gatherings include, among others, effects on the revenues of local restaurants and bars.
23 For the NBA, see https://www.forbes.com/sites/shlomosprung/2021/01/21/nba-tv-ratings-on-tnt-espn-abc-up-34-from-last-year-pernielsen/?sh=69378f964ddc; for the NHL, see https://theathletic.com/2352601/2021/01/29/nhl-tv-ratings-social-media/.