An Evaluation of Cy Young Award Selection Using Machine Learning

Image: Patrick Semansky/Associated Press

Every year, members of the Baseball Writers’ Association of America (BBWAA) vote on one of baseball’s most prestigious awards: the Cy Young Award, given annually to the best pitcher in each of the National and American Leagues.

Unlike player evaluation in many other sports, evaluating baseball pitchers is far from cut and dried. Fans and analysts use a wide variety of conventional statistics, which fuels significant debate about which of them should matter most when selecting award winners.

Cy Young voters tend to be guarded about which statistics they take into account. As a result, there is little information about which statistics truly matter and how their importance has changed over time.

To try to answer this, I turned to machine learning.

This article will focus on my high-level results and analysis, with some methodology sprinkled throughout. If you are interested in the lower-level analysis, check out my GitHub repository.

As a quick overview of the methodology, I separated the voting results into different “baskets” by decade (e.g., 1960s, 1970s). Then, I trained a separate machine learning model (a gradient boosting regressor) for each decade, with various statistics (e.g., wins, losses, saves, strikeouts) as features and Cy Young vote share as the dependent variable.

Next, I recorded the feature importances measured for each decade to see how important each statistic was in predicting Cy Young vote share. A model’s feature importances quantify, as proportions, how much each feature contributes to predicting the dependent variable, in this case Cy Young vote share.
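
The per-decade workflow described above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: it assumes scikit-learn's GradientBoostingRegressor with default settings, uses randomly generated data in place of real voting results, and the feature list and the helper name importances_by_decade are made up for the example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical feature list; the real project may use a different set.
FEATURES = ["wins", "losses", "saves", "strikeouts", "era_plus"]


def importances_by_decade(years, X, y, random_state=0):
    """Fit one model per decade basket and return {decade: {feature: importance}}."""
    decades = (np.asarray(years) // 10) * 10  # e.g. 1968 -> 1960
    results = {}
    for decade in np.unique(decades):
        mask = decades == decade
        model = GradientBoostingRegressor(random_state=random_state)
        model.fit(X[mask], y[mask])
        # scikit-learn normalizes these impurity-based importances to sum to 1.
        results[int(decade)] = dict(zip(FEATURES, model.feature_importances_))
    return results


# Synthetic stand-in data (NOT real voting results): 400 pitcher seasons, 1960-1979.
rng = np.random.default_rng(0)
n = 400
years = rng.integers(1960, 1980, size=n)
X = np.column_stack([
    rng.integers(5, 26, size=n),    # wins
    rng.integers(2, 20, size=n),    # losses
    rng.integers(0, 40, size=n),    # saves
    rng.integers(80, 300, size=n),  # strikeouts
    rng.normal(115, 20, size=n),    # era_plus
]).astype(float)
# Assumption for the demo: vote share is driven mostly by wins, plus noise.
y = np.clip(0.03 * X[:, 0] + 0.001 * X[:, 4] + rng.normal(0, 0.02, n), 0, 1)

results = importances_by_decade(years, X, y)
for decade, importances in sorted(results.items()):
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    print(f"{decade}s:", ", ".join(f"{name}={value:.2f}" for name, value in ranked))
```

Because the synthetic vote share is built mostly from wins, wins should come out as the top feature in both decades. With real data, comparing these per-decade proportions is what shows how the weight of each statistic shifts over time. Note that impurity-based importances describe what the model relies on to predict vote share, which is not the same as proving what voters consciously weigh.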

Disclaimer: We cannot be sure which statistics Cy Young voters actually value; this project only draws inferences from the available voting data.

For the sake of this article, I’ll focus on the results for two variables: wins and ERA/ERA+. However, if you are interested in seeing how the importance of various other statistics has changed over time according to the model, check out my interactive Tableau visualization.

Wins Are Still King (But Not for Long?) 

When it comes to evaluating pitchers, the win statistic has recently become a highly contentious topic. 

For much of baseball history, pitcher wins were the primary measure of success on the mound. In recent years, however, the statistic’s merits as a tool for evaluating talent have been widely debated, largely because wins depend heavily on a pitcher’s run support.

In recent years, the baseball public has come around to the idea of wins losing their importance. Does the data suggest, however, that this has subsequently impacted Cy Young Award selection?

Based on the data, the discussion reflects a real trend: the pitcher win is losing its importance. Despite this, it is still far and away the most important statistic in predicting Cy Young Award selections.

However, this pattern may not last that much longer.

In 2018, then-New York Mets ace Jacob deGrom tossed 217 innings of 1.70-ERA baseball. Despite this, deGrom only had a 10–9 record, due to the Mets’ severe lack of run support. Nonetheless, the lanky righty easily waltzed to the Cy Young Award, earning 99% of the vote share and 29-of-30 first-place votes.

In earlier decades, deGrom’s Cy Young case would have been significantly harder to make, but the tide appears to be turning, as the recent trend in the data suggests.

ERA on the Rise? 

In most baseball discussions, ERA is one of the key measures considered when evaluating pitcher performance. However, ERA wasn’t always the most important pitching stat, and its significance to the Cy Young Award has changed greatly over time.

For the model, I opted to use ERA+ as a proxy for ERA. ERA+ is essentially ERA adjusted for era, ballpark, and more, and scaled so that 100 is league average and higher values are better. Ultimately, choosing ERA+ over ERA lets us see how pitchers performed relative to other pitchers in their own era.
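
As a rough illustration of the scaling, a commonly cited simplified form of ERA+ is 100 × (league ERA × park factor) / pitcher ERA. The sketch below uses that simplification; published ERA+ values rely on more detailed league and park adjustments, and the league ERA and park factor used here are hypothetical placeholders rather than real figures. The 41 earned runs are simply the number implied by a 1.70 ERA over 217 innings.

```python
def era(earned_runs: float, innings_pitched: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings_pitched


def era_plus(pitcher_era: float, league_era: float, park_factor: float = 1.0) -> float:
    """Simplified ERA+ (an approximation, not the exact published formula).

    park_factor > 1.0 means a hitter-friendly home park, which raises ERA+.
    """
    return 100.0 * league_era * park_factor / pitcher_era


# 41 earned runs over 217 innings works out to roughly a 1.70 ERA.
pitcher_era = era(41, 217)

# Hypothetical league ERA and a neutral park, for illustration only.
example = era_plus(pitcher_era, league_era=4.00, park_factor=1.0)
print(f"ERA = {pitcher_era:.2f}, simplified ERA+ = {example:.0f}")
```

The key property is the direction of the scale: a pitcher whose ERA matches the league's gets exactly 100, and halving the ERA doubles the ERA+.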

Historically, there have been significant shifts in the importance of ERA to the Cy Young Award. Interestingly, ERA’s importance appears to have peaked during the 1960s and 1970s, faded in the late 20th century, and only now begun to make a resurgence.

However, I don’t see the current trend for ERA continuing. With the sabermetric revolution in baseball, I believe fans and analysts will start incorporating more advanced statistics into their discourse, and those statistics will eventually seep into Cy Young discussions.

Final Thoughts

This analysis of Cy Young voting is far from finished, but it provides an interesting data-driven perspective on how Cy Young Award criteria have seemingly changed over time.

Again, if you are interested in an interactive visualization covering more statistics, check out the Tableau visualization (linked below). Additionally, for a more technical analysis, check out my GitHub (rp57).




1 reply

  1. Great analysis. I agree that fans and analysts will introduce new statistics to evaluate players, rather than relying on ERA
