Logistic Regression Applied to Baseball Data

1 oliverc1622 1 7/22/2025, 3:30:50 PM runningonnumbers.com ↗

Comments (1)

oliverc1622 · 7h ago
While messing around with the eye-catching visuals on Baseball Savant, I noticed a dichotomous pattern among batters: ideal attack angle rate splits fairly cleanly by hard-hit outcome.

The distribution of Ideal Attack Angle Rate is different for hard hits vs. non-hard hits.

I then trained a logistic regression model on that signal. The resulting S-curve fits the data well, correctly classifying most outcomes. The model's coefficient corresponds to an odds ratio of 8.244, which we get by exponentiating the coefficient (e^β): for every one standard deviation increase in a player’s ideal attack angle rate, the odds of them hitting the ball hard multiply by approximately 8.244. This is a large effect, indicating that this feature is a strong predictor of hard-hit outcomes. The intercept of 0.0900 suggests that for a player with an average ideal attack angle rate, the odds of hitting the ball hard are e^0.0900 ≈ 1.094 to 1, or a 52.2% chance.
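For anyone who wants to check the arithmetic, here is a minimal sketch that starts from the two reported numbers (odds ratio 8.244, intercept 0.0900) and works back to the implied coefficient and the baseline probability. It assumes the feature was standardized, so z = 0 is an average player and z = 1 is one standard deviation above average. The helper name `hard_hit_probability` is made up for illustration.

```python
import math

# Values reported in the comment above.
odds_ratio = 8.244
intercept = 0.0900

# The coefficient is the natural log of the odds ratio (odds_ratio = e^coef).
coef = math.log(odds_ratio)

def hard_hit_probability(z):
    """Predicted probability for a standardized ideal attack angle rate z.

    Assumes z = 0 is the league-average rate and z = 1 is one SD above it.
    """
    log_odds = intercept + coef * z
    return 1.0 / (1.0 + math.exp(-log_odds))

# Odds and probability for an average player (z = 0).
baseline_odds = math.exp(intercept)
baseline_prob = hard_hit_probability(0.0)

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

# Moving one SD up multiplies the odds by the odds ratio.
ratio_check = odds(hard_hit_probability(1.0)) / odds(hard_hit_probability(0.0))

print(f"coef = {coef:.4f}")
print(f"baseline odds = {baseline_odds:.3f}, baseline prob = {baseline_prob:.3f}")
print(f"odds multiplier per SD = {ratio_check:.3f}")
```

The odds multiplier is constant per standard deviation, but the change in probability is not: it depends on where along the S-curve the player starts.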

Data acquired from Baseball Savant. I used scikit-learn to train my logistic regression model.
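A rough sketch of what that kind of fit looks like in scikit-learn, not my exact code. The data here is synthetic (randomly generated stand-in values, not Baseball Savant numbers), and the variable names, sample size, and binary hard-hit label are assumptions for illustration. Note that `LogisticRegression` applies L2 regularization by default, which shrinks the coefficient slightly.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# SYNTHETIC stand-in data: one row per batter (hypothetical, not real stats).
n = 500
ideal_attack_angle_rate = rng.normal(loc=0.5, scale=0.1, size=n)

# Hypothetical relationship used only to generate labels for the demo.
true_log_odds = 10.0 * (ideal_attack_angle_rate - 0.5)
hard_hit = rng.random(n) < 1.0 / (1.0 + np.exp(-true_log_odds))

# Standardize so the coefficient reads as "per one standard deviation".
X = StandardScaler().fit_transform(ideal_attack_angle_rate.reshape(-1, 1))
y = hard_hit.astype(int)

model = LogisticRegression()  # default settings include L2 regularization
model.fit(X, y)

coef = model.coef_[0, 0]
intercept = model.intercept_[0]

# Odds ratio per SD, and predicted probability for an average batter (z = 0).
odds_ratio = np.exp(coef)
baseline_prob = 1.0 / (1.0 + np.exp(-intercept))

print(f"odds ratio per SD: {odds_ratio:.3f}")
print(f"probability at average rate: {baseline_prob:.3f}")
```

Standardizing first is what makes the intercept interpretable as the log-odds for an average player.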