Sports Analytics: How Athletics and Mathematics Are Merging

The data revolution has swept over the world of sports, and it is time for teams to embrace the change in order to succeed.

By Vivian Teo

The crowd in the Oakland Coliseum watched, spellbound, at the crack of the bat. As the ball soared toward the right field seats, fans erupted into a frenzy. The Oakland Athletics’ Scott Hatteberg, who three years earlier had ruptured a nerve in his elbow, produced a miracle off the Kansas City Royals’ Jason Grimsley, giving the Athletics their record-breaking 20th consecutive win in 2002.

The win streak would not have been possible without the Athletics’ General Manager Billy Beane’s innovative approach to baseball. Faced with a team on a shoestring budget, Beane looked for new ways to make the most of the money available to him. With the help of Harvard graduate Paul DePodesta, Beane used overlooked statistical metrics to identify inexpensive players who had great potential. Applying this “Moneyball” strategy of sabermetrics, Beane pushed the Athletics to the top of the AL West and made Major League Baseball history.

The data revolution in sports was officially underway after the Athletics’ 2002 season. Beane’s successful alternative approach contradicted scouting techniques that had been used for decades, yet it reverberated throughout MLB and other sports.

Since Beane’s record-breaking season, the use of analytics in baseball has evolved alongside rapidly advancing technology. Using high-resolution cameras in games, teams have collected previously inaccessible data points, including a pitcher’s arm angle, the spin rate on a ball, the exit velocity of a home run, and more. Services like Statcast provide teams with this data, tracking details as specific as the angle and height of a pitch’s release point. Data collection in baseball keeps improving as MLB adopts new technologies. PITCHf/x gathered velocity and movement information for each pitch, and the league has since turned to Hawk-Eye cameras, also used in video assistant referee systems in soccer, to track the ball. Such systems not only help data analysis teams but also transform the fan experience, giving viewers access to much more specific details, like where each pitch crosses the strike zone.

The new data collection systems in baseball have led to new metrics as well. Statisticians have created ERA+, which refines the standard earned run average by adjusting for the pitcher’s ballpark and the league average. Other measurements include weighted runs created plus (wRC+), a comprehensive offensive metric, and wins above replacement (WAR), which estimates how many more wins a player produces for a team than a replacement-level player, such as a readily available minor leaguer, would.
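
To make the ERA+ idea concrete, here is a minimal sketch in Python. It assumes a commonly cited simplified form of the metric, in which the league ERA is scaled by a park factor and divided by the pitcher's ERA; published versions differ in their details. The pitcher, the league ERA, and the park factor below are hypothetical numbers chosen for illustration.

```python
def era(earned_runs: float, innings_pitched: float) -> float:
    """Standard earned run average: earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings_pitched


def era_plus(pitcher_era: float, league_era: float, park_factor: float = 1.0) -> float:
    """Simplified ERA+ (an assumption, not an official formula).

    100 means league average and higher is better. park_factor is an
    assumed multiplier: above 1.0 for a hitter-friendly park.
    """
    return 100.0 * league_era * park_factor / pitcher_era


# Hypothetical pitcher: 60 earned runs in 180 innings, league ERA of 4.00.
pitcher_era = era(60, 180)                        # 3.00
neutral_park = era_plus(pitcher_era, 4.00)        # about 133
hitter_park = era_plus(pitcher_era, 4.00, 1.05)   # about 140

print(round(pitcher_era, 2), round(neutral_park), round(hitter_park))
```

The same 3.00 ERA earns a higher ERA+ in the hitter-friendly park, which is the point of the adjustment: a pitcher is not penalized for where his home games are played.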

In soccer, clubs around the world have introduced data analysis teams to the Beautiful Game, seeking to optimize scouting, understand fan behavior, and improve squads as a whole. Clubs have turned to analytics companies like Smartodds, a betting consultancy that collects data from various games and sells its analyses to gamblers. Smartodds is not only useful for gambling, however. Squads like Brentford FC have used Smartodds data in their scouting to make the most of their budgets, much like the “Moneyball” approach. Matthew Benham, owner of both Smartodds and Brentford FC, extended his mathematical modeling strategy by becoming the majority shareholder of FC Midtjylland, a Danish club, where he used his analysis to find prospects undervalued on overlooked statistics. Benham was able to identify players from the third German division who were capable of performing in the Bundesliga and cost far less than other prospects. As a result, only one year after Benham arrived, Midtjylland topped the Danish Superliga as champions for the first time in the 2014-2015 season. The club went on to win the championship again in the 2017-2018 and 2019-2020 seasons.

Statisticians like Benham have been finding innovative ways to extract data from soccer matches. Rather than relying on simple counting stats like goals and assists, analysts now use measurements such as expected goals (xG), expected assists (xA), expected assists per 90 minutes, total shot rate, and expected points (xP) to describe a player’s or a team’s potential statistically. Using the data gathered by Smartodds, Benham created a “justice” league table that ranks Midtjylland by its underlying stats, especially xP, rather than by pure wins and losses, which are influenced by luck.
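
One way to see how expected points can strip luck out of a result is the short Python sketch below. It is not Benham's or Smartodds' model, which is not public; it assumes the common textbook simplification that each side's goals follow an independent Poisson distribution whose mean is that side's expected goals. The function names and the match figures are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def expected_points(xg_for: float, xg_against: float, max_goals: int = 10) -> float:
    """Expected league points (3 for a win, 1 for a draw) from two xG totals.

    Assumes independent Poisson scoring and ignores scorelines above
    max_goals, whose probability is negligible for realistic xG values.
    """
    p_win = 0.0
    p_draw = 0.0
    for goals_for in range(max_goals + 1):
        for goals_against in range(max_goals + 1):
            p = poisson_pmf(goals_for, xg_for) * poisson_pmf(goals_against, xg_against)
            if goals_for > goals_against:
                p_win += p
            elif goals_for == goals_against:
                p_draw += p
    return 3.0 * p_win + 1.0 * p_draw


# Hypothetical match: one side created 1.8 xG and conceded 0.9 xG.
print(round(expected_points(1.8, 0.9), 2))
print(round(expected_points(0.9, 1.8), 2))
```

Summing these per-match values over a season gives a table based on chance quality rather than final scores, so a team that lost 1-0 despite creating the better chances still earns most of the expected points from that match.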

In the NBA, Daryl Morey has been the primary analytics trailblazer. Morey developed a “Moreyball” philosophy that heavily favors three-pointers and layups over mid-range shots. The effects are obvious: the mid-range shot is no longer common in today’s game. Morey has also pioneered the tracking of players’ performance through data, cofounding the MIT Sloan Sports Analytics Conference to bring together experts from the field. Startups like RSPCT, whose software was used in the 2018 All-Star Three-Point Contest to show the exact placement of shots, have garnered investments from stars like Dwyane Wade and attention from fans. Devices from companies such as Kinexon, which makes wearable trackers, give coaches a better sense of how players position themselves and perform on the court.
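
The arithmetic behind “Moreyball” fits in a few lines of Python. The sketch below multiplies a shot's point value by the chance of making it. The make rates are hypothetical round numbers chosen for illustration, not league data.

```python
def expected_points_per_shot(shot_value: int, make_rate: float) -> float:
    """Average points scored per attempt for a given shot type."""
    return shot_value * make_rate


# Hypothetical make rates (assumptions, not measured NBA figures).
shots = {
    "layup": (2, 0.60),
    "mid-range jumper": (2, 0.40),
    "three-pointer": (3, 0.36),
}

for name, (value, rate) in shots.items():
    points = expected_points_per_shot(value, rate)
    print(f"{name}: {points:.2f} points per attempt")
```

Under these assumed rates, the three-pointer is worth more per attempt than the mid-range jumper even though it goes in less often, and the layup is worth the most of all.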

In professional football, many National Football League (NFL) franchises have formed data analytics teams of their own. NFL teams have already adjusted their playbooks according to findings from collected data. Coaches have noticed that in many situations, it is better to risk “going for it” on fourth down than to punt the ball to the opposing team. Over the past few seasons, the number of fourth-down punts has gradually declined.
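
The fourth-down reasoning comes down to comparing expected values, as the Python sketch below suggests. Every number in it is hypothetical. Real models estimate conversion chances and expected points from years of play-by-play data, and the values change with field position, distance, and game situation.

```python
def go_for_it_value(p_convert: float, value_if_converted: float, value_if_failed: float) -> float:
    """Expected points from attempting the fourth-down conversion."""
    return p_convert * value_if_converted + (1.0 - p_convert) * value_if_failed


def break_even_probability(value_if_converted: float, value_if_failed: float, value_of_punt: float) -> float:
    """Conversion chance at which going for it and punting are equally good."""
    return (value_of_punt - value_if_failed) / (value_if_converted - value_if_failed)


# Hypothetical expected-point values for one fourth-and-short situation.
p_convert = 0.60           # assumed chance of gaining the first down
value_if_converted = 2.5   # assumed expected points after converting
value_if_failed = -1.5     # assumed expected points after turning the ball over
value_of_punt = 0.2        # assumed expected points after punting

go_value = go_for_it_value(p_convert, value_if_converted, value_if_failed)
threshold = break_even_probability(value_if_converted, value_if_failed, value_of_punt)
decision = "go for it" if go_value > value_of_punt else "punt"

print(f"go for it: {go_value:.2f}, punt: {value_of_punt:.2f}, decision: {decision}")
print(f"break-even conversion chance: {threshold:.3f}")
```

With these assumed inputs, the attempt is worth more than the punt as long as the offense converts more often than the break-even rate.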

Companies like Zebra Technologies have installed radio frequency identification (RFID) in NFL stadiums to track players’ movement. Next Gen Stats, the NFL’s platform powered by Amazon Web Services, uses the data provided by Zebra Technologies, which can help teams pinpoint which players would fit best on a roster. Next Gen Stats also provides detailed information on the fastest sacks, longest tackles, fastest ball carriers, improbable completions, and other gameplay specifics.

The sports analytics market is expanding quickly across leagues and is expected to reach a value of nearly $4 billion by 2022. Many teams across different sports have recognized the importance of collecting and interpreting data and have started analytics departments to optimize their performance. As in an arms race, franchises are quietly scrambling for outside information to get an edge over other teams. While mathematics and sports have long been considered complete opposites, the efforts of Beane, Benham, Morey, and others have brought the fields closer together than ever. The math team and the football team are finally converging, and the use of sports analytics has become a defining mark of success for many teams. Soon, what separates powerhouse teams from struggling ones will be the extent to which a franchise can embrace and use data.