Eastern Economic Journal, 2014, 40, (518–534) © 2014 EEA 0094-5056/14 www.palgrave-journals.com/eej/
Understanding Price Movements in Point-Spread Betting Markets: Evidence from NCAA Basketball
Brad R. Humphreys (a), Rodney J. Paul (b) and Andrew Weinbach (c)
(a) Department of Economics, University of Alberta, 8-14 HM Tory, Edmonton, Alberta T6G 2H4, Canada.
(b) Department of Sport Management, 810 Nottingham Road, Syracuse, NY 13224, USA.
(c) Department of Accounting, Finance and Economics, E. Craig Wall College of Business Administration, Coastal Carolina University, P.O. Box 261954, Conway, SC 29528-6054, USA.
We analyze point-spread changes and betting volume for National Collegiate Athletic Association Division I men’s basketball games using survival analysis techniques. Estimates based on point-spread changes from 2,548 games at the Hilton sports book and 1,681 games at the Mirage sports book in the 2007 season indicate that observed changes in point spreads are not related to imbalances in bet volumes. Sports books do not appear to change point spreads to induce equal volumes of bets on either side of propositions; the “balanced book” model of sports book behavior does not describe observed point-spread changes in this market.
Eastern Economic Journal (2014) 40, 518–534. doi:10.1057/eej.2013.10; published online 4 March 2013
Keywords: sports gambling; hazard model; balanced book model
JEL: L83; G12; C14
INTRODUCTION
We analyze the relationship between changes in point spreads on National Collegiate Athletic Association (NCAA) Division I men’s basketball games and the volume of bets placed on the home and visiting team using survival analysis. Little previous research has examined changes in point spreads over time. Survival analysis techniques are well suited to the analysis of price changes because they exploit both the direction and timing of price changes, both of which are important in betting markets where the outcome of the wager has a well-defined end point and relatively little time elapses between the opening of trading and the revelation of the outcome. Analyzing the relationship between movements in point spreads and betting volume can extend our understanding of the behavior of sports books, the suppliers of bets in sports betting markets. In addition, new insight into the relationship between changes in point spreads and bet volumes can be generalized to other related financial markets, such as equities, furthering our understanding of the functioning of these markets.
We find that observed changes in point spreads set by the Mirage and Hilton sports books in Las Vegas on several thousand regular season NCAA Division I men’s basketball games are unrelated to the volume of bets placed on the home and visiting teams in these games. Point spreads on games with unbalanced betting on either side (home win or visitor win) do not appear to change more often than point spreads on games with balanced betting. The most commonly discussed model of sports book behavior in the literature, the “balanced book” model, assumes that sports books set point spreads in order to balance the volume of bets on either side of propositions in sports betting markets. According to the “balanced book” model,
Brad R. Humphreys et al Price Movements in Betting Markets
changes in point spreads should be more likely to occur in games where the early betting is unbalanced, as the sports book attempts to induce later bettors to bet on the other side of the proposition by adjusting the point spread. We present evidence that is inconsistent with this model; observed point-spread changes do not appear to be related to imbalances in betting volumes on college basketball games. Instead, sports books are more likely to change the point spread on games with relatively large opening point spreads. Games with large point spreads are played between teams with relatively unequal abilities. The outcome, in terms of the score difference, in games played between teams of unequal abilities may be more difficult to forecast than the outcome in games played between teams with relatively equal abilities. In this case, observed changes in point spreads can be interpreted as attempts to improve the forecast of the point difference in these games, and not to alter the volume of bets made on each side of the proposition.
Evidence exists that unbalanced bet volume is associated with higher bookmaker profits. Humphreys [2010] showed that the larger the volume of bets placed on the home team in National Basketball Association (NBA) games, the less likely a bet on the home team was to win, controlling for the point spread on the game using an IV approach to correct for potential endogeneity of the point spread, suggesting that bookmakers earn higher profits on games with unbalanced betting volume. Humphreys [2011] showed that bookmakers earned a higher profit by accepting unbalanced betting on National Football League (NFL) games than they would have if the betting volume had been balanced on each game, using financial simulations based on the actual point spreads and game outcomes from a large number of games played in the NFL.
Prices in financial markets change frequently.
Because of the frequent nature of price changes in financial markets, survival analysis has been used to analyze changes in stock prices and the time between trades in order to better understand the determinants of these changes. In the sports betting literature, a number of previous studies have analyzed the difference between the opening line and the closing line in betting markets for NFL games [Gandar et al. 1988; Avery and Chevalier 1999], the NBA [Gandar et al. 2000; Gandar et al. 2002b; Colquitt et al. 2007], and NCAA football [Dare et al. 2005; Durham et al. 2005; Durham and Perry 2008]. Durham and Santhanakrishnan [2010] analyzed the effect of changes in information on changes in point spreads. In general, the analysis of changes in point spreads either assesses opening and closing lines as forecasts of actual game outcomes or estimates reduced form regression models explaining variation in the change in the point spread from the opening to the closing of the market. In both approaches, the empirical analysis proceeds from the assumption that observed changes in point spreads reflect attempts by sports books to balance betting volume on either side in games. The literature emphasizes the idea that either private information on the part of bettors, or bettor biases in the form of beliefs in “hot hand” effects or the simple desire to bet on a specific team to win or lose a game at any point spread, leads to imbalances in betting volume on specific games. In general, the literature shows that closing lines forecast game outcomes better than opening lines, and changes in point spreads from the opening line to the closing line are not well explained by public information such as poll standings, experts’ opinions about game outcomes, or recent performance. These results lead researchers to conclude that significant private information exists in sports betting markets, or that bettors are “noise traders” who exploit short-term volatility in the market.
Nearly every paper in this literature explicitly assumes that the balanced book model provides the underlying rationale for observed point-spread changes.
THE ECONOMIC BASIS FOR SPORTS BOOK PRICE SETTING
Sports books are profit-maximizing firms that have clearly defined outputs and prices and make decisions in an environment of uncertainty, factors that would appear to generate theoretical interest in economics. Yet, there has been surprisingly little theoretical research on sports book behavior. The most commonly discussed basis for sports book behavior in the literature is what we call the balanced book model. A formal version of this model has never been written down, perhaps because it is so simple. In the balanced book model, firms set the price of a bet so that an equal amount is wagered on either side of the bet. As sports books charge a commission on losing bets, the balanced book model generates a sure profit for the sports book no matter what the outcome of the game is.
Suppose that the total dollars bet on a game is H, the fraction of the total dollars bet on the favorite is f, and the fraction of the total dollars bet on the underdog is (1 − f). Let v be the commission or “vig” charged by the bookmaker. In the point-spread betting market, commission is only collected on losing bets. A point-spread bet is a wager of US$(1 + v) to win $1. The balanced book model assumes that sports books set the point spread to induce an equal amount of wagers on the favorite and underdog. In this case f = 0.5; no matter which team wins, the sports book collects $H(1 + v)f from the losers and owes only $Hf to the winners, leaving a sure profit of $Hvf = $Hv/2.
Levitt [2004] derived a useful expression for understanding the behavior of bookmakers based on this concept. This expression shows that the expected value of a sports book’s profit per unit bet, R, is a function of the probability (p) that a bet on the favored team wins, the commission (v) charged by the sports book and the fraction (f) of the total dollars bet on the favorite
\[ E(R) = [(1 - p)f + p(1 - f)](1 + v) - [(1 - p)(1 - f) + pf]. \tag{1} \]
The first term on the right hand side represents the amount of money kept by the sports book on losing bets and the second term represents the amount of money returned to the winners. Levitt [2004] treated p as a choice variable of the sports book, implying that p is a function of the point spread. In particular, if p is the probability that a bet on the favored team wins, then the lower the point spread, the larger is p, other things being equal. If team A is stronger than team B, then the probability that a bet on team A wins will be larger if team A is favored by 5 points than if favored by 10 points in the same game. Equation (1) can be simplified to
\[ E(R) = (2 + v)(f + p - 2pf) - 1 \tag{2} \]
which shows that expected profits are an increasing function of the commission. The balanced book model is a special case of Equation (2), where either the probability of the favored team winning is 0.5 or the fraction of the total dollars bet on each team is equal (f = 0.5). In either case, the expected profits are v/2, and the bookmaker is indifferent to the outcome of the game, as profits are identical no matter who wins. Note that we do not attribute the balanced book model to Levitt [2004], but simply discuss the balanced book model in the context of Levitt’s expression for the expected profits earned per unit bet by a sports book; the balanced book concept was discussed as far back as Lawrence [1950]. The balanced book model predicts that sports books attempt to balance the betting volume evenly on each side of a bet, and set p, or the point spread, which is
assumed to affect p, optimally to achieve this outcome. In order to explore the implications of the balanced book model, Levitt [2004] assumed that the fraction of money bet on the favorite is a function of the probability that a bet on the favorite wins, f = g(p), g′ > 0, where the fraction of money bet on the favorite increases with the probability that a bet on the favorite wins. Again, in this context, the probability that a bet on the favorite wins is in turn a function of the point spread set by the sports book. Given this relationship between f and p, the balanced book hypothesis requires that bettors’ preferences are such that half the bets will be placed on the favorite when the probability that a bet on the favorite wins is 50 percent [g(p = 0.5) = 0.5]. In this case, the sports book sets the point spread to make p = 0.5, the betting is balanced on each side, and the expected profit earned by the sports book is the commission. Levitt [2004] also discussed the case where bettors’ preferences result in more than half the money being bet on the favorite when the probability that a bet on that team will win is 50 percent [g(p = 0.5) > 0.5]; in other words, the situation when some subset of the bettors in a market prefer to bet on favorites at even odds rather than simply looking for potentially profitable bets at the posted odds. In this case, the sports book can earn higher profits by setting the point spread in a way that attracts more than half of the money to be bet on the favorite. Under the unbalanced book model, sports books take positions on bets in that they allow the dollars bet to be unequally distributed across the two teams.
The existence of unbalanced volume in sports betting markets is gaining considerable empirical support in the literature. Levitt’s [2004] experiment in betting on NFL games showed strong evidence of unbalanced book outcomes. Paul and Weinbach [2007, 2008] present evidence of unbalanced betting on NFL and NBA games.
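As a numerical check on Equations (1) and (2), the short Python sketch below evaluates both forms of the expected profit per unit bet. It is only an illustration: the commission v = 0.10 and the values of p and f are assumed for the example and are not taken from the data, and the function names are hypothetical. Writing L = f + p − 2pf for the share of dollars on the losing side, the share on the winning side is 1 − L, which is why Equation (1) collapses to (2 + v)L − 1.

```python
def expected_profit(p: float, f: float, v: float) -> float:
    """Equation (1): expected sports book profit per unit bet.

    p: probability that a bet on the favorite wins
    f: fraction of dollars bet on the favorite
    v: commission ("vig") collected on losing bets
    """
    kept_from_losers = ((1 - p) * f + p * (1 - f)) * (1 + v)
    paid_to_winners = (1 - p) * (1 - f) + p * f
    return kept_from_losers - paid_to_winners


def expected_profit_simplified(p: float, f: float, v: float) -> float:
    """Equation (2): the algebraically simplified form of Equation (1)."""
    return (2 + v) * (f + p - 2 * p * f) - 1


v = 0.10  # assumed commission, for illustration only

# Balanced book: with f = 0.5 the expected profit is v/2 whatever p is.
balanced = [expected_profit(p, 0.5, v) for p in (0.40, 0.50, 0.60)]

# Unbalanced book with 60 percent of the dollars on the favorite.
fair_line = expected_profit(0.50, 0.60, v)    # p = 0.5 still yields v/2
shaded_line = expected_profit(0.48, 0.60, v)  # favorite bets win less often

print(balanced, fair_line, shaded_line)
```

With these assumed numbers, the book that takes 60 percent of the money on the favorite and shades the line so that favorite bets win slightly less than half the time earns more than v/2, which is the mechanism behind the unbalanced book case described above.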
Two distinct behavioral explanations for unbalanced betting volume have been identified in the literature. In one explanation, sports books with unbalanced betting volume exploit known bettor preferences to “shade” the point spread in the direction of the bias. In this case, the sports book uses its knowledge of the market to exploit bettors who simply want to bet on a specific team no matter what point spread is set by setting inflated prices in the direction of bettor bias. This strategy results in the less popular side of the proposition (typically big underdogs, home underdogs, and on the total score to be below the over/under “total”) winning more often than implied by efficiency, possibly generating opportunities for profits. This outcome was demonstrated to exist in point-spread betting on NFL games by Levitt [2004], where bets on home underdogs earned positive returns. We refer to this explanation for the relationship between point spreads and bet volume as “pricing to exploit bettor bias.” An alternative explanation for unbalanced betting volume is that point spreads set by sports books are unbiased forecasts of game outcomes and are not set to balance the volume of bets. In this case, the sports book accepts an unbalanced book without attempting to exploit known bettor preferences or influence betting volume. Unbiased forecasts will result in expected win percentages of 50 percent for all basic wagering strategies, and the sports book will earn a return greater than or equal to the commission on losing bets in the long run. Given the repeated nature of gambling (because of consumption benefits or bettor addiction), a bookmaker may not be concerned with game-to-game or day-to-day returns, but focuses on long-run earnings from commissions. 
The setting of point spreads as unbiased forecasts of game outcomes by bookmakers eliminates incentives for informed bettors to enter into the market to exploit a pricing or to exploit bettor bias (for example, setting
point spreads higher when the home team is favored). We refer to this explanation for the relationship between point spreads and betting volume as “pricing as an unbiased forecast of game outcomes.”
Unbalanced betting volume could also be generated by competition among bookmakers, coupled with the presence of both informed bettors who seek out profitable betting opportunities and uninformed bettors who like to bet for or against specific teams at any price. Consider a market with two bookmakers that contains both informed bettors looking for profitable betting opportunities at given prices and uninformed bettors who prefer to bet on the home team when the home team is favored at any price. Suppose that both bookmakers set the initial point spread at the home team minus 7 points, and this is an unbiased forecast of the game outcome. Both bookmakers initially have an unbalanced book because the informed bettors do not see this as a profitable betting opportunity, or bet evenly on each team, but early uninformed bettors bet heavily on the home favorite. If one of the bookmakers increases the point spread to attempt to balance the betting volume, inducing more uninformed bettors to place bets with this bookmaker, the other bookmaker has unbalanced betting volume in the other direction. Assessing the explanatory power of this situation would require bet volume data for individual bookmakers; we lack access to such data.
Either pricing as an unbiased forecast or pricing to exploit bettor bias explains the results found in market efficiency studies of sports gambling. Furthermore, pricing as an unbiased forecast even in the presence of unbalanced betting volume is consistent with the well-established efficiency of sports betting markets. Tests of the efficiency of sports betting markets are based on the hypothesis that point spreads are unbiased predictors of actual game outcomes [Sauer 1998].
The presence of unbalanced betting is consistent with the idea that some bettors prefer to bet on favorites even when the point spread is an unbiased predictor of the actual outcome of the game. In general, the assumption of market efficiency cannot be rejected in a large number of studies analyzing betting odds and game outcomes in many different sports leagues over many seasons. In several subsamples, however, market efficiency has consistently been rejected.1 Bettors tend to prefer to wager on the best teams, resulting in the overbetting of the biggest favorites and road favorites. In addition, in over-under betting markets, bettors prefer to wager on the over in games with relatively high totals. In these situations, the bookmaker exercises discretion in its pricing, setting the price too high, resulting in underdogs and unders winning more than 50 percent of the time. In some cases, this results in simple strategies earning returns that overcome the sports book commission.
In this article, we analyze changes in point spreads on college basketball games using a hazard model in the context of these competing explanations for unbalanced betting. Hazard models are frequently used to analyze the dynamics of price changes in financial markets [Liu and Maheu 2010; Dunne and Mu 2010]. Sports books frequently change the point spread on games. On the basis of the balanced book model, changes in point spreads should be related to imbalances in the betting on the game; if too much money is bet on the favored team, the balanced book model predicts that a sports book will try to induce more bettors to place bets on the underdog. One way to do this is to change the point spread. The presence of unbalanced betting volume suggests that bookmakers could earn higher profits from the imbalance, either because of shading to exploit biased bettors or pricing as an unbiased forecast.
If the balanced book model describes sports books’ behavior, then observed changes in point spreads should be explained by imbalances in betting
volume; if bookmakers shade point spreads to exploit biased bettors, or price as an unbiased forecast, then factors other than the volume of bets on each side of a game will explain observed changes in point spreads.
DATA DESCRIPTION
The data come from two sources. The first data set contains point spreads set by the Mirage and Hilton sports books in Las Vegas for regular season and conference tournament NCAA men’s college basketball games in the 2007–2008 season. Data on changes in point spreads were collected for 2,672 games from G & J Update Sports Information Services, which monitors and records instant updates on posted point spreads and over/under lines from a number of Las Vegas and offshore sports books. G & J posts all changes in point spreads and totals with individual time stamps, which were collected daily to construct the data set. The sample contains 16,165 records. Each record represents a specific change in the point spread or over/under for a regular season game made by the Mirage or Hilton sports book. About 65 percent of the records are from the Hilton and about 35 percent are from the Mirage. The sample is not evenly divided between the sports books because the Hilton sets point spreads and over/under lines for more games.
The second data set contains information on individual game outcomes and betting percentages, the fraction of bets placed on the home and visiting team in each game, for each game in the sample. These data were purchased from Sports Insights, an online sports betting information service. Sports Insights has agreements to obtain betting volume data from BetUS, Carib Sports, Sportbet, and Sportsbook.com, four large online sports books. The percentage of bets made on each side represents an average across the four participating sports books. The betting volume is not available for all sports books and is not available for each game in the data set. However, as a benchmark, for 35 college basketball games with posted point spreads played on a Wednesday night in February, BetUS, Carib Sports, and Skybook reported more than 247,000 wagers, with an average of roughly 7,200 wagers per game.
In general, most wagers reflect point-spread wagers, followed by wagers on totals, followed by money line and parlay bets. We matched the point spread and over/under movement data obtained from G & J Update Sports Information Services with the betting volume data obtained from Sports Insights to construct the final analysis data set. By combining these two data sources, we explicitly assume that the bet volumes at BetUS, Carib Sports, Sportbet, and Sportsbook.com are sufficiently similar to the bet volumes at the Hilton and Mirage sports books. The opening and closing lines reported by G & J Update Sports Information Services for the four online sports books are generally quite close to the opening and closing lines for the Hilton. The opening and closing lines for the Mirage are slightly lower than the opening and closing lines reported by G & J Update Sports Information Services, and the closing line reported by the Mirage is slightly lower than the closing line reported by the Hilton. However, in all cases, these differences are extremely small. The difference between the closing lines for the four online sports books and the Hilton is 0.02 points. The difference in opening lines is 0.01 points. These tiny differences may arise because G & J Update Sports Information Services continuously monitors the web pages of the Hilton and Mirage, while the opening and closing lines reported by Sports Insights are sometimes the “early” and “late” lines, missing initial or last minute line
moves. The fact that the reported opening and closing lines are very similar indicates that our assumption that the bet volumes are very similar is a good assumption. We cannot explicitly test the assumption because we lack bet volume data from the Vegas sports books.

Data analysis
We use survival analysis to analyze changes in the point spread. Each record in the data shows the current point spread or over/under at a point in time. Survival analysis assigns meaning to the passage of time because time is a proxy for changes in other unobservable factors that are correlated with the passage of time, such as the revelation of new information. This makes it important to appropriately define analysis time because the measurement of time needs to be an appropriate proxy. Defining analysis time has two aspects:
1. Ensuring that whenever two subjects have equal time values, the risk they face would be the same if they also have the same covariates.
2. Deciding which particular point in time should be labeled t = 0, marking the onset of risk.
In this context, risk refers to the “risk” that a sports book changes the point spread. When is the onset of risk? In this case, the onset of risk begins when the line is posted, so we measure time from the instant that each point spread for each game was posted. When do two games have the same risk? In this context, this means when do the lines on two different games have the same forces acting on them? The underlying forces include bets made on each side over the course of betting, new information about the games, and the objective of the sports books. If bettors are relatively homogenous and both the sports books have the same objectives, then this would seem to be true in our sample. We assume that the two sports books move their lines for some reason, and both attract similar bettors, implying that analysis time begins when each sports book posts the first point spread on each game and that the risks are identical across games and across the two sports books.
METHODS
We apply standard survival time analysis techniques to changes in point spreads on NCAA Division I college basketball games. The goal is to determine if any relationship exists between the final volume of bets placed on the home and visiting team and observed changes in point spreads. As these techniques are not commonly used to analyze sports betting, we begin with a brief introduction to survival time analysis. Suppose that T is a non-negative random variable that represents the amount of time elapsed between the time a sports book first sets a point spread and the time that the sports book changes the point spread. f(t) is T’s probability density function and F(t) = Pr(T ≤ t) is T’s cumulative distribution function. Survival time analysis is based on T’s survivor function S(t)
\[ S(t) = 1 - F(t) = \Pr(T > t) \tag{3} \]
which is simply the probability that the point spread has not been changed by time t. The survivor function has a value of 1 at t = 0 and decreases toward zero as t increases. Another important concept in survival time analysis is the hazard function
h(t), which is also called the conditional failure rate. The hazard function is the instantaneous rate of point-spread changes, in other words the probability that a sports book changes the point spread over a given interval, conditional on the point spread not being changed before that interval. The hazard function is
\[ h(t) = f(t)/S(t). \tag{4} \]
Hazard functions take values between zero (no chance of the point spread being changed during the interval) and infinity (the point spread will be changed with certainty during the interval). The hazard function can have many shapes; it can change over time. The key is that the underlying process of sports books taking bets from bettors on each bet, the effects of these cumulative bets on the sports book’s position, and the objective of the sports book all influence the shape of the hazard function. In the jargon of survival time analysis, the effect of these factors is called “risk.” Although we cannot hope to quantify the underlying risk, that is, the underlying process that generates incentives to change point spreads, changes in point spreads over time can be observed, and these changes can be used to learn something about the underlying process. The survivor function and the hazard function lie at the heart of survival time analysis. The hazard function provides a natural way to interpret the process that generates observed changes in point spreads. The survivor function is a simple measure of the probability of no change in the point spread. Finally, to complete the link between the hazard function and the probability density function, define the cumulative hazard function
\[ H(t) = \int_0^t h(u)\,du = -\ln[S(t)]. \tag{5} \]
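The second equality in Equation (5) follows directly from the definitions in Equations (3) and (4). A short derivation, using only the facts that the density is the derivative of the cumulative distribution function and that S(0) = 1, might run as follows:

```latex
\begin{align*}
h(t) &= \frac{f(t)}{S(t)}
      = \frac{-\,dS(t)/dt}{S(t)}
      = -\frac{d}{dt}\ln S(t), \\
H(t) &= \int_0^t h(u)\,du
      = -\bigl[\ln S(t) - \ln S(0)\bigr]
      = -\ln S(t).
\end{align*}
```

The first line uses f(t) = dF(t)/dt = −dS(t)/dt; the second integrates it from 0 to t.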
The cumulative hazard function reflects the total amount of risk (the cumulative factors that lead sports books to change their point spreads) that has accumulated up until time t. The cumulative hazard function also has a natural interpretation beyond a summation. For any given hazard function, the cumulative hazard function tells us the number of times we would expect a sports book to change the point spread over a given period of time. Given these three functions, the relationship between the survivor function, the hazard function, and the probability density function for T is straightforward:
\begin{align*}
S(t) &= \exp\{-H(t)\} \\
F(t) &= 1 - \exp\{-H(t)\} \tag{6} \\
f(t) &= h(t)\exp\{-H(t)\}.
\end{align*}
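To make the relationships in Equations (3) through (6) concrete, the following Python sketch evaluates them for one specific distribution. The Weibull form and its shape and scale values are assumptions chosen purely for illustration; nothing in the analysis here imposes this distribution. The sketch checks numerically that integrating the hazard reproduces H(t) and that integrating the density reproduces F(t).

```python
import math

SHAPE = 1.5   # assumed Weibull shape; > 1 means the hazard rises over time
SCALE = 10.0  # assumed Weibull scale, in hours


def hazard(t: float) -> float:
    """h(t) for the assumed Weibull distribution."""
    return (SHAPE / SCALE) * (t / SCALE) ** (SHAPE - 1)


def cumulative_hazard(t: float) -> float:
    """H(t): the integral of h(u) from 0 to t, in closed form."""
    return (t / SCALE) ** SHAPE


def survivor(t: float) -> float:
    """S(t) = exp{-H(t)}."""
    return math.exp(-cumulative_hazard(t))


def cdf(t: float) -> float:
    """F(t) = 1 - exp{-H(t)}."""
    return 1.0 - survivor(t)


def density(t: float) -> float:
    """f(t) = h(t) exp{-H(t)}."""
    return hazard(t) * survivor(t)


def integrate(func, a: float, b: float, steps: int = 20_000) -> float:
    """Midpoint-rule integral, used only to check the closed forms."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width


t = 6.0  # hours since the point spread was posted
print(f"S = {survivor(t):.4f}, h = {hazard(t):.4f}, H = {cumulative_hazard(t):.4f}")
print(f"integral of h: {integrate(hazard, 0.0, t):.4f}")
print(f"integral of f: {integrate(density, 0.0, t):.4f} vs F = {cdf(t):.4f}")
```

Any other hazard shape could be substituted for the Weibull; the identities linking h, H, S, F, and f hold regardless of the functional form.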
Nonparametric analysis of point-spread changes
Preliminary analysis of survival time data cannot be done using standard univariate data analysis tools such as means, medians, and variances because of the censoring and truncation present in survival time data. Descriptive statistics in survival time analysis are based on nonparametric estimates of the survivor function and hazard function. Table 1 shows some basic information about the sample. The Hilton posts sides and totals for more games than the Mirage. The sample contains more observations than point-spread changes because it also contains changes in the over/under line and the typical practice is to first post the point spread and then post the over/under line later in the day. Although the Hilton posts point spreads and over/under lines for more games, the Mirage changes its posted point spread more often than the Hilton.

Table 1  Basic duration statistics

Book      Observations   Line changes   Games
Hilton          10,432          3,466   2,546
Mirage           5,733          3,242   1,675

          Point-spread changes per game          Time at risk (hours)
Book      Mean   Median   Minimum   Maximum      Mean    Median   Minimum   Maximum
Hilton    1.36      1.0         0        13     11.64     11.48      2.53     52.71
Mirage    1.94      2.0         0        16     10.33      9.80      2.67     35.57

In order to explain why the data set contains so many more observations than games or spread moves, consider a small part of the data set for the game played between the University of California at Berkeley (California) and Washington State in Pullman, Washington on January 31, 2008. The relevant records and fields are shown below.

Visiting team   Home team          Date               Time       Book     Spread
California      Washington State   January 31, 2008   08:12:46   Hilton      9.0
California      Washington State   January 31, 2008   08:34:37   Hilton      9.5
California      Washington State   January 31, 2008   14:38:17   Hilton     10.5
California      Washington State   January 31, 2008   11:01:04   Mirage      9.5
California      Washington State   January 31, 2008   14:37:18   Mirage     10.5
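The five records above can be turned into survival-time data along the following lines. This Python sketch is an illustration under stated assumptions, not the procedure used to build the analysis data set: it treats each spell as the time between consecutive postings by the same book, it ignores the final spell for each book because the tip-off time (the censoring point) is not part of the excerpt, and the function names are hypothetical. It then computes a Kaplan-Meier estimate, one standard nonparametric estimate of the survivor function.

```python
from datetime import datetime

# The five records shown above: (book, time posted, spread).
RECORDS = [
    ("Hilton", "08:12:46", 9.0),
    ("Hilton", "08:34:37", 9.5),
    ("Hilton", "14:38:17", 10.5),
    ("Mirage", "11:01:04", 9.5),
    ("Mirage", "14:37:18", 10.5),
]


def completed_spells(records):
    """Hours between consecutive postings by the same book."""
    by_book = {}
    for book, clock, _spread in records:
        by_book.setdefault(book, []).append(datetime.strptime(clock, "%H:%M:%S"))
    spells = []
    for book, stamps in by_book.items():
        stamps.sort()
        for earlier, later in zip(stamps, stamps[1:]):
            spells.append((book, (later - earlier).total_seconds() / 3600.0))
    return spells


def kaplan_meier(durations, observed):
    """Kaplan-Meier survivor estimate as a list of (time, S(time)) steps.

    observed[i] is True when spell i ended in a point-spread change and
    False when it was censored (for example by tip-off).
    """
    event_times = sorted({d for d, e in zip(durations, observed) if e})
    surv, steps = 1.0, []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if e and d == t)
        surv *= 1.0 - events / at_risk
        steps.append((t, surv))
    return steps


spells = completed_spells(RECORDS)
hours = [h for _book, h in spells]
km = kaplan_meier(hours, [True] * len(hours))  # all three spells ended in a change
for t, s in km:
    print(f"{t:6.3f} h  S = {s:.3f}")
```

With only three completed spells the estimate is a three-step function; in a full sample the censored spells that end at tip-off would enter through the `observed` flags and reduce the number at risk without counting as changes.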
The Hilton posted its first point spread, Washington State as a 9-point favorite, at 8:12 am on January 31. Twenty-two minutes later, at 8:34 am, the Hilton increased the point spread to 9.5 points. The Mirage posted its first point spread, Washington State as a 9.5-point favorite, at 11:01 am. That afternoon, at about 2:38 pm, both the Mirage and the Hilton increased the point spread to 10.5 points. For the Hilton, analysis time begins at 8:12 am; for the Mirage, analysis time begins at 11:01 am. There are two point-spread changes for the Hilton, from 9 points to 9.5 points and from 9.5 points to 10.5 points; there is only one point-spread change for the Mirage. Incidentally, California won this game 69–64, so all bets on the underdog paid off. The time at risk for this game is the time that elapsed between 8:12 am and tip-off for the Hilton and between 11:01 am and tip-off for the Mirage.

Semiparametric analysis of point-spread changes
The semiparametric analysis of survival time data is analogous to regression analysis. Semiparametric analysis examines the relationship between a vector of covariates, either time varying or time invariant but varying across games, and observed changes in point spreads. The most common semiparametric empirical approach for analyzing the effect of observable covariates and unobservable factors on observed changes in survival time data is the Cox proportional hazards model. This regression model relates the hazard rate for the jth subject to some baseline hazard function,
Brad R. Humphreys et al Price Movements in Betting Markets
527 Table 2
Summary statistics, game variables
Variable
Mean
Standard deviation
Minimum
Maximum
Difference in bets Absolute value (opening line) Percentage of bet on home team Percentage of bet on favorite Percentage of bet on home favorite Percentage of bet on road favorite Percentage of home favorites Percentage of road favorites
32.99 7.96 53.14 62.39 60.53 69.31 74.05 23.80
20.189 5.927 19.083 14.850 14.618 11.914 0.438 0.426
0 0 7 8 8 21 0 0
86 34.5 92 93 92 93 1 1
h0(t), where all the factors that affect the hazard rate are equal to zero, and a vector of covariates. Formally, the Cox proportional hazards regression model is
$$h(t \mid x_j) = h_0(t) \exp(x_j \beta_x) \tag{7}$$
where $h(t \mid x_j)$ is the hazard rate for college basketball game $j$, $x_j$ is a vector of explanatory variables that affect the hazard rate, and $\beta_x$ is a vector of unknown parameters to be estimated. The Cox model is frequently used because it is computationally feasible and because it does not require any parameterization of the baseline hazard rate, $h_0(t)$. As the baseline hazard rate is not parameterized, the estimator is semiparametric. The vector of unknown regression parameters, $\beta_x$, is estimated using maximum likelihood. We estimate a random effects Cox model (called a “shared frailty” model in the survival time literature). Frailty models assume that there is an unobservable positive factor, $\alpha_i$, that affects the hazard rate for different groups in the sample. In this case, “groups” are defined as individual games; frailty models account for within-group correlation in hazard rates from unobservable sources. These game-specific unobservable factors include any information about games that is available to sports books but not observable by econometricians, such as injuries or past betting patterns. Formally, the model estimated is
$$h(t \mid x_j) = h_0(t)\, \alpha_i \exp(x_j \beta_x) \tag{8}$$
where $\alpha_i$ is a random variable with mean 1 and variance $\theta$ that captures the random effect. Note that the covariates in Equation (8) vary only by game. We do not have access to variables that vary hour by hour, as the observed point-spread changes do. The game characteristic variables come from the Sports Insights web site and are described above. Table 2 contains summary statistics for the game level variables used as covariates in the Cox proportional hazard model. The average opening line was just under 8 points. The difference in bets variable is the absolute value of the difference in the percentage of bets placed on the two teams in each game in the sample. The mean value of 33 for this variable implies that the average book on any game was unbalanced, with about 66 percent of the bets placed on one team and about 33 percent of the bets placed on the other team. From the betting percent variables, home teams, favored teams, and especially teams favored on the road attracted a majority of the bets placed in games. Although most home teams are favored, the home team was the underdog in 23.8 percent of the games, and the visiting team was bet heavily in these games. Table 3 reports estimated hazard ratios, P-values for the null hypothesis that the hazard ratio is equal to 1, and other summary statistics for several different specifications of Equation (8).

Table 3 Estimated hazard ratios and P-values, Cox proportional hazards model with random effects (P-values in parentheses)

Variable                                Model 1          Model 2          Model 3          Model 4
Mirage                                  1.5821 (0.00)    1.5800 (0.00)    1.5713 (0.00)    1.5805 (0.00)
Difference in bet volume                1.0002 (0.865)   0.9999 (0.917)   1.0003 (0.752)   0.9993 (0.514)
Absolute value (opening line)           0.9796 (0.016)   0.9836 (0.058)   0.9885 (0.196)   0.9818 (0.035)
Absolute value (opening line) squared   1.0008 (0.013)   1.0007 (0.028)   1.0006 (0.079)   1.0008 (0.017)
Percentage of bet on home team          -                0.9967 (0.003)   -                -
Percentage of bet on favorite           -                1.0009 (0.604)   -                -
Percentage of bet on home favorite      -                -                0.9975 (0.000)   -
Percentage of bet on road favorite      -                -                -                1.0018 (0.009)
Wald statistic                          233.28           240.08           241.31           234.62
Log likelihood                          52,592           52,580           52,574           52,583
Games                                   4,217            4,217            4,217            4,217
Line changes                            6,707            6,707            6,707            6,707

P-values of 0.00 mean the null is rejected at better than 0.1 percent.

The reported hazard ratios are the ratio of the hazard associated with a one-unit increase in the covariate to the baseline hazard, $h_0(t)$. A hazard ratio greater than 1 indicates that an increase in that covariate is associated with a higher likelihood of a point-spread change; a hazard ratio less than 1 indicates that an increase in that covariate is associated with a lower likelihood of a point-spread change. We are looking for evidence that observed changes in point spreads are consistent with either the balanced book model or the two alternative explanations for bookmaker behavior. Model 1 on Table 3 is a baseline model for generating such evidence. The baseline model contains a dummy variable for point spreads offered by the Mirage sports book because the nonparametric analysis indicates that the Mirage changes its point spread more often than the Hilton. The baseline model also contains the absolute value of the opening line (the first point spread posted by each sports book) and this value squared. These variables capture any tendency to change large point spreads. The larger the absolute value of the point spread, the greater the disparity between the perceived strength of the two teams. We expect that the relationship between the size of the opening line and changes in the point spread might be non-linear because sports books might have less of an incentive to change smaller lines. Small point spreads (particularly around two or three) carry a higher likelihood of being “middled” (where the score differential of a game falls between posted lines, leaving the potential for bettors on both sides to win and opening the sports book to considerable risk) or “sided” (where the score differential of a game falls on one of the posted lines and one group of bettors wins while the others' bets are a push), so sports books may be less inclined to change these point spreads.
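To make the reading of the hazard ratios in Table 3 concrete, the following sketch converts the Model 1 hazard ratios into percentage changes in the hazard of a point-spread change. It relies only on the proportional hazards form of Equation (8), under which a one-unit increase in a covariate multiplies the hazard by the reported ratio regardless of the baseline hazard. The dictionary and the helper name `percent_change` are assumptions made for this illustration.

```python
import math

# Hazard ratios as reported for Model 1 in Table 3.
MODEL_1 = {
    "Mirage": 1.5821,
    "Difference in bet volume": 1.0002,
    "Absolute value (opening line)": 0.9796,
    "Absolute value (opening line) squared": 1.0008,
}


def percent_change(hazard_ratio, units=1.0):
    """Percent change in the hazard for a `units` increase in one covariate."""
    beta = math.log(hazard_ratio)  # coefficient implied by the hazard ratio
    return 100.0 * (math.exp(beta * units) - 1.0)


for name, ratio in MODEL_1.items():
    print(f"{name}: {percent_change(ratio):+.2f}% per unit")

# A 10-point increase in the bet imbalance, holding the other covariates fixed.
ten_point_imbalance = percent_change(MODEL_1["Difference in bet volume"], units=10)
print(f"10-point imbalance: {ten_point_imbalance:+.2f}%")
```

Read this way, the Mirage dummy corresponds to a hazard roughly 58 percent higher than at the Hilton, while even a 10-point increase in the bet imbalance moves the hazard by only a fraction of a percent.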
In Sports Book Management: A Guide for the Legal Bookmaker, Roxborough and Rhoden [1998] note that in games with large point spreads, full point movements are the norm in the sports gambling industry. Roxborough and Rhoden [1998] also note that basketball is similar to football in this respect. With respect to football, Roxborough and Rhoden [1998] suggest “When football point spreads are in double digits, the house may wish to move the line by a full point on each limit bet. When the point spread is that large, the game is unlikely to fall on the numbers. Thus, movement of large spreads by only a half point will not effectively correct unbalanced action.”2 This practice also means that games with larger point spreads tend to have larger overall line movements.
The key variable in the baseline model is the difference in bets placed on the two teams. This variable reflects how unbalanced the book is on each game. The balanced book model predicts that sports books try to attract an equal amount of betting on each side of a bet; the larger the difference in betting volume, the larger the observed difference in bets. The balanced book model predicts that observed moves in point spreads should be related to unbalanced betting volume, as the point spread is an important instrument available to sports books to influence the amount of money bet on each team. The estimated parameter on the difference in bets variable is our primary test for evidence supporting the balanced book model in our data.
Column 1 on Table 3 contains the results for the baseline model. The estimated hazard ratio on the Mirage dummy variable is greater than one and significant, indicating that the Mirage is more likely to change its point spread than the Hilton, conditional on the other observed variables in the regression model. The parameters on the absolute value of the opening line variables are both statistically different from one. The estimated parameter on the absolute value of the opening line is less than one and the estimated parameter on the squared term is greater than one.
These estimated parameters imply that sports books are less likely to change small point spreads and more likely to change large point spreads, other things being equal. As noted above, this may be because of the risk involved in “middles” and “sides” and the practice of sports books of moving large point spreads by a full point, rather than a half point. Alternatively, these parameter estimates may reflect bookmakers pricing as an unbiased forecast, as the more unequal the ability of the teams, the larger is the potential for the bookmaker to make a mistake when setting the initial point spread on the game. On the basis of these estimated parameters, the overall hazard ratio is equal to one at an opening line of 12.5 points, which is above the average opening line of 7.5 in the sample.
The estimated hazard ratio on the difference in bets variable is the key result. It is statistically indistinguishable from one (1.0002, with a P-value of 0.865, in Model 1), indicating that the difference in bets has no effect on the likelihood that a sports book changes the point spread, other things equal. Sports books are no more likely to have changed the point spread on a game where 75 percent of the dollars bet were on one team and 25 percent of the dollars bet were on the other than they were on a game where half of the bets were placed on each team. The balanced book model predicts that sports books want to attract an equal amount of dollars bet on each team. The observed changes in point spreads in our data are not consistent with this idea; point-spread changes are not associated with unbalanced betting on college basketball games.
The alternatives to the balanced book model are that bookmakers set point spreads to exploit biased bettors or price as an unbiased forecast of the game outcome. In either case, sports books might take positions on games when the betting is not equal on either side of a bet. Model specifications 2–4 on Table 3 explore the idea that observed point-spread changes are associated with other factors.
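The non-linear effect of the opening line can be traced out from the two Model 1 hazard ratios in Table 3. The sketch below is only an approximation: it uses the rounded published ratios (0.9796 and 1.0008), and the function and variable names are assumptions made for this illustration. It computes the combined hazard ratio relative to an opening line of zero, the opening line at which the marginal effect changes sign, and the opening line at which the combined ratio returns to one.

```python
import math

# Rounded Model 1 hazard ratios from Table 3.
HR_LINEAR = 0.9796   # absolute value of the opening line
HR_SQUARED = 1.0008  # absolute value of the opening line, squared

B1 = math.log(HR_LINEAR)   # implied coefficient on the linear term
B2 = math.log(HR_SQUARED)  # implied coefficient on the squared term


def combined_hazard_ratio(line):
    """Hazard ratio relative to a game with an opening line of zero."""
    return math.exp(B1 * line + B2 * line ** 2)


# Opening line at which the marginal effect of a larger spread changes sign.
turning_point = -B1 / (2.0 * B2)
# Opening line at which the combined ratio is back to one.
back_to_one = -B1 / B2

print(round(turning_point, 1), round(back_to_one, 1))
for line in (3, 8, 13, 20, 30):
    print(line, round(combined_hazard_ratio(line), 3))
```

With these rounded inputs the combined ratio is lowest at an opening line of roughly 13 points, in the neighborhood of the 12.5-point figure quoted in the text; the rounding of the published estimates limits how closely such a calculation can match the authors' own.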
Model 2 includes two additional explanatory variables, the fraction of bets placed on the home team and the fraction of bets placed on the favored team. As Model 2 still contains the difference in bets variable, this specification holds the effects of an unbalanced book on the likelihood of a change in the point spread constant, so these alternative specifications nest the balanced book model. The parameter estimates from model specification 2 indicate that observed changes in point spreads are not related to the fraction of the bets placed on the favored team in that game, but the point spreads are less likely to change when the fraction of bets on the home team increases. Model specifications 3 and 4 explore the possibility that heterogeneity in the type of favorite drives the insignificant estimated parameter on the fraction of bets placed on the favorite variable in Model 2. Model 3 includes the fraction of bets on home favorites and Model 4 includes the fraction of bets on road favorites (“home dogs”) as explanatory variables. The results from these two model specifications indicate that observed point spreads are less likely to change as the fraction of bets on favored home teams increases, and more likely to move as the fraction of bets against home underdogs increases. The estimated parameter on the home favorite variable in Model 3 is consistent with the estimated parameter on the home team variable in Model 2. Given that road favorites, in this sample and over longer timeframes, have been shown to win more than 50 percent of the time, these price movements toward road favorites (against home underdogs) may reflect sports book reaction to the actions of expected informed bettors wagering on road favorites. The results from Models 2–4 do not support the balanced book model and also suggest that other factors are not associated with observed changes in point spreads. The results for Model 4 on Table 3 imply that sports books do not change the point spread on games with heavy betting on the home team.
As noted above, one explanation for unbalanced betting volume is that sports books “shade” their posted lines in order to exploit known patterns in bettors’ preferences. Other evidence for this explanation exists. Strumpf [2002] found evidence that illegal bookmakers in New York City offered worse odds to customers known to always bet on the home team, the New York Yankees. The observation that sports books do not change their point spreads when the number of bets placed on the home team increases could be consistent with point-spread shading, which would show up in the game results. Table 4 presents the results of a number of alternative betting simulations we undertook to determine if “shading” of the point spread occurs in this sample. If sports books “shade” their point spreads toward home teams or favorites, simple betting strategies, such as wagering on the road team or the underdog, should win more often than implied by efficiency. In these simulations, we calculate the percentage of bets that would have won for a number of simple betting strategies, such as “always bet on the favorite,” and test whether each strategy resulted in a “fair bet” (probability of a win is 50 percent), and whether it would have earned a profit (following the strategy led to wins more than 52.4 percent of the time). Results of all simulations are compared with the null hypothesis of a fair bet (win percentage of 50 percent) and, where necessary, tested against the null of no profitability (win percentage of 52.4 percent, the percentage needed to break even given the sports book commission) using a log-likelihood ratio test. The test statistic has a $\chi^2$ distribution with one degree of freedom. Critical values for this test are 2.706 for the 10 percent significance level, 3.841 for the 5 percent level and 6.635 for the 1 percent level. Details of this test can be found in Even and Noble [1992].
Table 4 Log-likelihood ratio tests of “Shading” in NCAA basketball

Bet wins:          Home team   Road team   Road team win (%)   LL ratio: Fair bet   LL ratio: No profits
All games          1,687       1,720       50.48               0.3196               N/A

Bet wins:          Favorite    Underdog    Favorite win (%)    LL ratio: Fair bet   LL ratio: No profits
All favorites
  All 12+          372         382         49.34               0.1326               N/A
  All 10+          524         515         50.43               0.0780               N/A
  All games        1,744       1,673       51.04               1.4754               N/A
Home favorites
  Home 12+         337         357         48.56               0.5764               N/A
  Home 10+         465         470         49.73               0.0267               N/A
  All H-Favorites  1,305       1,291       50.27               0.0755               N/A
Road favorites
  Road 12+         35          25          58.33               1.6745               0.8578
  Road 10+         59          45          56.73               1.8903               0.7924
  All R-Favorites  439         382         53.47               3.9606*              0.3924

* Statistically significant at the 5 percent level.
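The test statistics in Table 4 can be approximately reproduced with a short calculation. The sketch below assumes the standard binomial likelihood-ratio form, twice the gap between the log likelihood at the observed win rate and at the hypothesized win rate, and it assumes a break-even rate of 11/21 (bettors risking 11 to win 10), which rounds to the 52.4 percent quoted in the text. The function name `lr_statistic` and both assumptions are illustrative; Even and Noble [1992] should be consulted for the exact procedure.

```python
import math


def lr_statistic(wins, losses, null_prob):
    """Likelihood-ratio statistic for H0: win probability equals null_prob."""
    n = wins + losses
    p_hat = wins / n  # observed win rate, the unrestricted estimate
    unrestricted = wins * math.log(p_hat) + losses * math.log(1.0 - p_hat)
    restricted = wins * math.log(null_prob) + losses * math.log(1.0 - null_prob)
    return 2.0 * (unrestricted - restricted)


FAIR_BET = 0.5
BREAK_EVEN = 11.0 / 21.0    # assumption: risk 11 to win 10, about 52.4 percent
CRITICAL_5_PERCENT = 3.841  # chi-square, one degree of freedom

# Bets on all road favorites from Table 4: 439 wins, 382 losses.
fair = lr_statistic(439, 382, FAIR_BET)
no_profit = lr_statistic(439, 382, BREAK_EVEN)

print(f"fair bet: {fair:.4f}, rejected at 5%: {fair > CRITICAL_5_PERCENT}")
print(f"no profits: {no_profit:.4f}, rejected at 5%: {no_profit > CRITICAL_5_PERCENT}")
```

Under these assumptions the road-favorite row comes out close to the values reported in Table 4: the fair-bet null is rejected at the 5 percent level and the no-profits null is not.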
The home/road results are presented first on Table 4, with favorite/underdog results posted in the same panel following the home/road outcomes. The results by favorite/underdog status are calculated for various point-spread thresholds for big favorites as identified by Wolfers [2006] and Paul and Weinbach [2008]. There is no evidence of shading of the point spread toward the home team in these betting simulations. Bets on road teams win slightly more often (50.48 percent) than bets on home teams, but this result is not statistically significant against the null of a fair bet. Local illegal sports books may shade the point spread toward the local favorite, as noted in Strumpf [2002], but this appears to be a result of price discrimination in local betting markets and not a strategy used by large online or Vegas sports books. The lack of relationship between changes in point spreads and the fraction of bets placed on the favored team in Model 2 on Table 3 is puzzling because of the existing evidence that bettors prefer to bet on winners — that is, favored teams. If sports books know that bettors prefer to wager on favorites, the unbalanced book model indicates that higher profits can be earned by shading point spreads to exploit this preference. Table 4 contains no evidence of “shading” of point spreads toward favorites. In the overall sample and for the subset of home favorites, bets on both favorites and underdogs win almost exactly 50 percent of the time. The only case where the null of a fair bet is rejected (but the null of no profits is not) is for the subset of games with road favorites. This is consistent with the results for Model 4 on Table 3, which indicate that sports books are more likely to move lines when the betting volume is heavy on a road favorite. Unlike other sports, however, in college basketball the road favorite covers the point spread more than half of the time. 
These results, including the result of road favorites winning more often than implied by efficiency, are consistent with findings of past studies on college basketball betting markets. Paul and Weinbach [2008] found the overall market for NCAA basketball to be efficient, with win percentages of favorites and underdogs hovering around 50 percent for the sample as a whole and for various thresholds. The only rejections of the null hypothesis of a fair bet found were for the simple wagering strategies of betting on all underdogs of 20 or more points (big favorites overbet) and for betting on all road favorites of 10 or more points (home underdogs are overbet). This result differs from previous findings from betting on the NFL, where bets on home underdogs have been shown to win more often than implied by efficiency. The difference could be attributable to differences in NFL and NCAA basketball bettors, or because of institutional differences in the two sports.
DISCUSSION AND CONCLUSIONS
The results of the semiparametric analysis of point-spread changes indicate that observed changes in point spreads are not related to differences in overall betting volume on games. The point spread on a game with 80 percent of the bets on the favorite and 20 percent of the bets on the underdog is not more likely to change than the point spread on a game with 51 percent of the bets on the favorite and 49 percent of the bets on the underdog. Point-spread movements are associated with games between unequally matched teams, where the opening line is large, and with heavier betting on the visiting team when the visiting team is favored to win the game. This suggests that balancing the betting evenly on each side is not the reason for observed changes in point spreads on college basketball games. In addition, the results of the betting simulations are not consistent with line shading. Sports books do not appear to set point spreads on college basketball games to take advantage of the well-known bettor preference for bets on the stronger favored team. The only exception to this occurs in the small number of games where the road team is favored. In this case, the null of a fair bet can be rejected, indicating that bets on the road favorite win more than 50 percent of the time, but the null of no profits cannot be rejected, so there is no evidence that these bets win more than the 52.4 percent of the time needed to earn a profit. The results have several important implications. First, previous research relied heavily on the balanced book model of sports book behavior to motivate empirical research. All of the existing research on changes in point spreads assumes that observed changes in point spreads reflect attempts by sports books to balance the volume of bets after the opening line did not achieve this goal.
These studies did not make use of betting volume data; they assumed that observed changes in point spreads must reflect betting imbalances, and looked for factors that might lead to betting imbalances. Our results do not support this story. Imbalanced betting is relatively common in this market, and observed line changes are not related to these imbalances. This makes the interpretation of the empirical results in this literature problematic. Observed changes in point spreads do not have as much to do with betting volumes as was previously thought. As economists now have access to betting volume data, the relationship between betting volume and past performance, expert opinion, and winning streaks needs to be examined, to determine how bettor behavior relates to these factors. Second, the usefulness of the balanced book model of sports book behavior must be evaluated. The emerging empirical evidence about betting volumes strongly indicates that balanced betting on either side of a proposition is not widespread in sports betting markets. Sports books routinely take large positions on bets; sports books do not set point spreads to balance betting on individual games and do not move point spreads to induce additional bets on the other side of unbalanced bets.
This evidence indicates that the balanced book model does not explain much of observed sports book behavior. The most likely alternative model of sports book behavior is one where point spreads are set as optimal forecasts of game outcomes. In such a model, sports books would be willing to take a position on some games, winning some bets and losing others in these cases, while offering a broad “portfolio” of bets to bettors and relying on the law of large numbers to earn a profit in the long run across many different bets on many different games. The advantage of this model is that it appears to be consistent with the existing evidence of efficiency in sports betting markets in all cases except games between teams of unequal ability, when one team is a large favorite. Clearly, the motivation of changing point spreads in this context will differ from the motivation for changing point spreads in the balanced book model.
Notes
1. Evidence of inflated betting lines for big favorites has been identified across betting markets for virtually all college and professional team sports. Some examples include Major League Baseball [Woodland and Woodland 1994; Gandar et al. 2002a], professional hockey [Woodland and Woodland 2001; Gandar et al. 2004], professional football [Dare and MacDonald 1996; Levitt 2004], college football [Paul et al. 2003], professional basketball [Paul and Weinbach 2005a], and college basketball [Paul and Weinbach 2005b].
2. Roxborough and Rhoden [1998] also note at various points in their book that balanced betting action may be the ideal, as this situation is riskless for the sports book, but that it is not the normal situation, as the majority of games are typically unbalanced.
References
Avery, Christopher, and Judith Chevalier. 1999. Identifying Investor Sentiment from Price Paths: The Case of Football Betting. Journal of Business, 72(4): 493–521.
Colquitt, L. Lee, Norman H. Godwin, and Rebecca T. Shortridge. 2007. The Effects of Uncertainty on Market Prices: Evidence from Coaching Changes in the NBA. Journal of Business and Finance Accounting, 34(5): 861–871.
Dare, William H., and S. Scott MacDonald. 1996. A Generalized Model for Testing the Home and Favorite Team Advantage in Point Spread Markets. Journal of Financial Economics, 40(2): 295–318.
Dare, William H., John M. Gandar, Richard A. Zuber, and Robert M. Pavlik. 2005. In Search of the Source of Informed Trader Information in the College Football Betting Market. Applied Financial Economics, 15(3): 143–152.
Dunne, Timothy, and Xiaoyi Mu. 2010. Investment Spikes and Uncertainty in the Petroleum Refining Industry. Journal of Industrial Economics, 58(1): 190–213.
Durham, Gregory R., Michael G. Hertzel, and J. Spencer Martin. 2005. The Market Impact of Trends and Sequences in Performance: New Evidence. Journal of Finance, 60(5): 2551–2569.
Durham, Gregory R., and Tod Perry. 2008. The Impact of Sentiment on Point Spreads in the College Football Wagering Market. Journal of Prediction Markets, 2(1): 1–27.
Durham, Gregory H., and Mukunthian Santhanakrishnan. 2010. Point-spread Wagering Markets’ Analogue to Realized Return in Financial Markets. Journal of Sports Economics, 12(4): 1–13.
Even, William E., and Nicholas R. Noble. 1992. Testing Efficiency in Gambling Markets. Applied Economics, 24(1): 85–88.
Gandar, John M., Richard A. Zuber, and R. Stafford Johnson. 2004. A Reexamination of the Efficiency of the Betting Market on National Hockey League Games. Journal of Sports Economics, 5(2): 152–168.
Gandar, John M., Richard A. Zuber, Thomas O’Brien, and Ben Russo. 1988. Testing Rationality in the Point Spread Betting Market. Journal of Finance, 43(4): 995–1008.
Gandar, John M., Richard A. Zuber, and William H. Dare. 2000. The Search for Informed Traders in the Totals Betting Market for National Basketball Association Games. Journal of Sports Economics, 1(2): 177–186.
Gandar, John M., Richard A. Zuber, R. Stafford Johnson, and William H. Dare. 2002a. Re-examining the Betting Market on Major League Baseball Games: Is There a Reverse Favorite-Longshot Bias? Applied Economics, 34(10): 1309–1317.
Gandar, John M., William H. Dare, Craig R. Brow, and Richard A. Zuber. 2002b. Informed Traders and Price Variations in the Betting Market for Professional Basketball Games. Journal of Finance, 53(1): 385–401.
Humphreys, Brad R. 2010. Point spread Shading and Behavioral Biases in NBA Betting Markets. Rivista di Diritto ed Economia dello Sport, 6(1): 13–26.
_______. 2011. The Financial Consequences of Unbalanced Betting on NFL Games. International Journal of Sport Finance, 6(1): 60–71.
Lawrence, Louis A. 1950. Bookmaking. Annals of the American Academy of Political and Social Science, 269(1): 46–54.
Levitt, Steven D. 2004. Why are Gambling Markets Organized so Differently from Financial Markets? The Economic Journal, 114(495): 223–246.
Liu, Chun, and John M. Maheu. 2010. Intraday Dynamics of Volatility and Duration: Evidence from the Chinese Stock Market, Working Papers, Toronto, Canada: Department of Economics, University of Toronto.
Paul, Rodney J., and Andrew P. Weinbach. 2005a. Market Efficiency and NCAA College Basketball Gambling. Journal of Economics and Finance, 29(3): 403–408.
_______. 2005b. Bettor Misperceptions in the NBA: The Overbetting of Large Favorites and the “Hot Hand”. Journal of Sports Economics, 6(4): 390–400.
_______. 2007. Does Sportsbook.com set Point spreads to Maximize Profits? Tests of the Levitt Model of Sportsbook Behavior. Journal of Prediction Markets, 1(3): 209–218.
_______. 2008. Price Setting in the NBA Gambling Market: Tests of the Levitt Model of Sportsbook Behavior. International Journal of Sports Finance, 3(3): 2–18.
Paul, Rodney J., Andrew P. Weinbach, and Chris J. Weinbach. 2003. Fair Bets and Profitability in College Football Gambling. Journal of Economics and Finance, 27(2): 236–242.
Roxborough, R. Michael, and Mike Rhoden. 1998. Sports Book Management: A Guide for the Legal Bookmaker. Las Vegas: Las Vegas Sports Consultants.
Sauer, Raymond D. 1998. The Economics of Wagering Markets. Journal of Economic Literature, 36(4): 2021–2064.
Strumpf, Koleman. 2002. Illegal Sports Bookmakers, Mimeo, University of North Carolina, Department of Economics.
Wolfers, Justin. 2006. Pointshaving: Corruption in NCAA Basketball. American Economic Review, 96(2): 279–283.
Woodland, Linda M., and Bill M. Woodland. 1994. Market Efficiency and the Favorite Longshot Bias: The Baseball Betting Market. Journal of Finance, 49(1): 269–280.
_______. 2001. Market Efficiency and Profitable Wagering in the National Hockey League: Can Bettors Score on Longshots? Southern Economic Journal, 67(4): 983–995.