
Hey everyone, I’m an NBA fan and Python dev, and I recently built DeepShot, a machine learning model that predicts NBA game outcomes with about 71% accuracy based on historical stats and rolling performance metrics (EWMA).

It features:

- Real NBA data from Basketball Reference
- Exponentially Weighted Moving Averages to track momentum
- Interactive NiceGUI interface with team comparison and predictions
- Full Python stack and open-source (MIT license)

Here’s the GitHub repo: https://github.com/saccofrancesco/deepshot

And if you like it, here’s my Buy Me a Coffee: buymeacoffee.com/saccofrancesco
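For anyone unfamiliar with the EWMA features mentioned above, here is a minimal sketch of the idea in plain Python. It is not taken from the DeepShot repo: the function name `ewma_before_each_game`, the `alpha` value, and the point totals are all made up for illustration. The one design choice worth noting is that each game's feature only uses games played before it, so the model never sees the result it is trying to predict.

```python
def ewma_before_each_game(values, alpha=0.3):
    """For each game, return the EWMA of all EARLIER games (None for the first).

    Hypothetical helper for illustration; higher alpha weights recent games more.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    features = []
    average = None
    for value in values:
        # Record what was known before tip-off, then fold in this game's value.
        features.append(average)
        if average is None:
            average = float(value)
        else:
            average = alpha * value + (1 - alpha) * average
    return features


points = [100, 110, 90, 120]  # made-up point totals for one team
print(ewma_before_each_game(points, alpha=0.5))  # [None, 100.0, 105.0, 97.5]
```

With pandas, the same recursion is what `Series.ewm(alpha=..., adjust=False).mean()` computes; shifting the result by one row would give the "before each game" version.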

Would love any feedback — especially from folks who’ve built sports models or worked on real-time stat tools. Also open to ideas on where to take this next (player-level modeling? betting advice dashboard?).

Thanks!

Comments (2)

Reubend · 11h ago

Hey Francesco, this is very cool, and I'm sure this was a fun project to work on.

If you're interested in improving the performance here, using a method like TrueSkill would likely yield much better predictions than the XGBoost classifier you're using now. It provides a robust method for modelling the game at the player level, so that the model can change its predictions as different players get swapped out. As you can imagine, roster changes make a huge impact on overall team performance, and the sample size of NBA data here isn't really big enough for gradient boosting to be effective when the teams themselves change. Bayesian methods are nice for this sort of thing.
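To make the player-level idea a bit more concrete, here is a rough sketch of a TrueSkill-style update for two teams, written with only the standard library. It is a simplification and an assumption on my part, not the full algorithm: it ignores draws, partial playing time, and skill drift over time, and the starting values (mu = 25, sigma = 25/3, beta = 25/6) are just the commonly used defaults. The function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
BETA = 25 / 6  # assumed per-player performance noise


def win_probability(team_a, team_b, beta=BETA):
    """P(team_a beats team_b). Each team is a list of (mu, sigma) tuples."""
    delta = sum(mu for mu, _ in team_a) - sum(mu for mu, _ in team_b)
    variance = sum(sigma**2 + beta**2 for _, sigma in team_a + team_b)
    return STD_NORMAL.cdf(delta / math.sqrt(variance))


def update(winners, losers, beta=BETA):
    """Return updated (winners, losers) ratings after one game, no draws."""
    c2 = sum(sigma**2 + beta**2 for _, sigma in winners + losers)
    c = math.sqrt(c2)
    t = (sum(mu for mu, _ in winners) - sum(mu for mu, _ in losers)) / c
    v = STD_NORMAL.pdf(t) / STD_NORMAL.cdf(t)  # size of the mean shift
    w = v * (v + t)                            # how much uncertainty shrinks

    def shift(team, sign):
        return [
            (mu + sign * sigma**2 / c * v,
             sigma * math.sqrt(1 - sigma**2 / c2 * w))
            for mu, sigma in team
        ]

    return shift(winners, +1), shift(losers, -1)


# Two five-player rosters of unknown (default-rated) players.
home = [(25.0, 25 / 3)] * 5
away = [(25.0, 25 / 3)] * 5
print(win_probability(home, away))  # even matchup before any games
home, away = update(home, away)     # home wins
print(win_probability(home, away))  # now favours home
```

Because ratings live on players rather than teams, swapping someone in or out of the roster list changes the predicted win probability directly, which is the property that matters for trades and injuries.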

In terms of where to take things next, it could also be cool to see some kind of "what if" scenario generator. How would the Dallas Mavericks' probability of winning have changed if they hadn't traded away Luka Dončić? How would the Indiana Pacers' chances of winning the season change if they weren't playing the Knicks in the Eastern Conference Finals?

saccofrancesco · 7h ago

Hi! Thanks so much for your comment and for suggesting some really thoughtful ideas for the project — I really appreciate it.

At the beginning, I also considered the idea of gathering individual player data and assembling team profiles based on active rosters for each game. That way, team strength could be evaluated more accurately based on who actually played, rather than relying on aggregate team stats.
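As a rough illustration of what "assembling team profiles based on active rosters" could look like, here is a small sketch that averages per-player stats weighted by minutes played in a given game. The data layout, the function name `team_profile`, and the numbers are all hypothetical; real box-score data would need cleaning first.

```python
def team_profile(player_stats, minutes_played):
    """Minutes-weighted average of each stat over the players who actually played.

    player_stats:   {player: {stat_name: value}}  (hypothetical layout)
    minutes_played: {player: minutes in this game}
    """
    total_minutes = sum(minutes_played.values())
    if total_minutes <= 0:
        raise ValueError("no minutes played")
    profile = {}
    for name, minutes in minutes_played.items():
        weight = minutes / total_minutes
        for stat, value in player_stats[name].items():
            profile[stat] = profile.get(stat, 0.0) + weight * value
    return profile


# Made-up numbers: player "c" is on the roster but did not play.
stats = {
    "a": {"pts_per36": 20.0},
    "b": {"pts_per36": 10.0},
    "c": {"pts_per36": 30.0},
}
print(team_profile(stats, {"a": 30, "b": 10}))  # {'pts_per36': 17.5}
```

Players who sat out simply do not appear in `minutes_played`, so the profile reflects who was actually on the floor.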

I completely understand your point about using a method like TrueSkill to model team performance more dynamically — based on the presence or absence of specific players and the impact each one has on the team's overall performance. It’s a compelling approach and definitely something that would make predictions much more responsive to roster changes.

The main challenge, though, is the data itself. Even getting reliable game-level data for all teams from the 2000–01 season through to 2024–25 was already quite complex. So when it comes to going a level deeper — pulling individual player data, lineups, or starting rosters for every single game — it becomes difficult to know where to start. These data sources are often scattered, inconsistent, or hidden behind APIs that may have usage limits or costs. There’s also the issue of computational load and the sheer scale of the data, especially when you're working solo, as I currently am.

That’s actually part of why I’m sharing the project publicly — to see if others might be interested, just like you, and maybe even want to contribute. Sometimes just having another perspective helps catch something I may have overlooked.

Thanks again for your suggestions — I’ll definitely explore them further during the NBA off-season and hopefully come back with a more refined version of the project for the next season.