Martin Eastwood

@martineastwood.co.uk

Somewhere in the middle of a Venn diagram of machine learning and football / soccer. http://www.pena.lt/y/blog.html

1,781
Followers
119
Following
59
Posts
23.11.2023
Joined

Latest posts by Martin Eastwood @martineastwood.co.uk

A Python code snippet demonstrating the new 'create_dixon_coles_grid' function in the penaltyblog library. The code shows how to initialize a probability grid using expected goals (lambdas) for the home and away teams, and then calculate home win probabilities, over/under 2.5 goal totals, and Asian Handicaps directly from that grid.

🚀 penaltyblog v1.9.0 is live! 🐍⚽️

New in this version:
✅ create_dixon_coles_grid(): Use lambdas from external ML models.
✅ goal_expectancy_extended(): Infer rho/lambdas from odds.
🛠️ Improved quarter lines (2.25/2.75) logic for totals market.
⚡️ More optimisations to FootballProbabilityGrid

28.02.2026 17:12 👍 8 🔁 0 💬 0 📌 0
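
To make the idea of a probability grid built from externally supplied expected goals concrete, here is a rough pure-Python sketch. It does not call penaltyblog: the names `dixon_coles_grid` and `market_probs`, the `rho` value, and the `max_goals` cutoff are all assumptions made for illustration, and the lambdas are made-up numbers rather than output from any real model.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals under a Poisson rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def dc_tau(h: int, a: int, lam_home: float, lam_away: float, rho: float) -> float:
    """Dixon-Coles low-score adjustment; only scores up to 1-1 are changed."""
    if h == 0 and a == 0:
        return 1.0 - lam_home * lam_away * rho
    if h == 0 and a == 1:
        return 1.0 + lam_home * rho
    if h == 1 and a == 0:
        return 1.0 + lam_away * rho
    if h == 1 and a == 1:
        return 1.0 - rho
    return 1.0


def dixon_coles_grid(lam_home, lam_away, rho=0.0, max_goals=10):
    """Hypothetical helper: grid[h][a] = P(home scores h, away scores a)."""
    grid = [
        [
            poisson_pmf(h, lam_home)
            * poisson_pmf(a, lam_away)
            * dc_tau(h, a, lam_home, lam_away, rho)
            for a in range(max_goals + 1)
        ]
        for h in range(max_goals + 1)
    ]
    total = sum(map(sum, grid))  # renormalise after truncating at max_goals
    return [[p / total for p in row] for row in grid]


def market_probs(grid):
    """Sum grid cells into 1X2 and over 2.5 goals probabilities."""
    cells = [(h, a, p) for h, row in enumerate(grid) for a, p in enumerate(row)]
    return {
        "home_win": sum(p for h, a, p in cells if h > a),
        "draw": sum(p for h, a, p in cells if h == a),
        "away_win": sum(p for h, a, p in cells if h < a),
        "over_2_5": sum(p for h, a, p in cells if h + a > 2),
    }


# Made-up lambdas, e.g. from some external expected-goals model
print(market_probs(dixon_coles_grid(1.6, 1.1, rho=-0.05)))
```

Once the grid exists, every market is a sum over a different set of cells, which is why one grid can serve 1X2, totals, and handicap questions alike.
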
GitHub - martineastwood/penaltyblog: ⚽ High-performance football analytics toolkit: build data pipelines, scrape data, model matches, rank teams, and bet smarter | Powered by pena.lt/y/blog 🚀

Everything you need to get started is right here:

pip install penaltyblog --upgrade

💻 Repo: github.com/martineastwo...
📖 Docs: penaltyblog.readthedocs.io
📦 PyPI: pypi.org/project/pena...

08.01.2026 20:03 👍 2 🔁 0 💬 0 📌 0
A technical MCMC Trace Plot with two horizontal subplots labeled "attack_Arsenal" and "home_advantage." Each plot shows four overlapping colored lines (Chains 0 to 3) creating a dense, horizontal "fuzzy caterpillar" shape. This visualization indicates that the Bayesian model has successfully converged and is stable across 2,000 iterations.

Why build a custom MCMC sampler from scratch?

In my latest blog post, I dig into the "why" and the "how". From the decision to drop heavy dependencies like Stan for this use case, to using Cython for speed, and how to interpret those "fuzzy caterpillar" plots.

Read it here: pena.lt/y/2026/01/08...

08.01.2026 20:03 👍 2 🔁 0 💬 1 📌 0
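
For readers who have not written a sampler before, the core loop is short. The sketch below is a toy random-walk Metropolis sampler in pure Python, not the Cython sampler described in the blog post: the model (a single Poisson scoring rate with a Normal prior on its log), the step size, the burn-in length, and the goal counts are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(log_rate, goals):
    """Poisson log-likelihood plus a Normal(0, 1) prior on the log-rate (toy model)."""
    rate = math.exp(log_rate)
    log_lik = sum(g * log_rate - rate for g in goals)  # constant log(g!) dropped
    log_prior = -0.5 * log_rate**2
    return log_lik + log_prior


def metropolis(goals, n_samples=2000, step=0.3, start=0.0, seed=0):
    """Random-walk Metropolis: propose a nearby value, accept with prob min(1, ratio)."""
    rng = random.Random(seed)
    current, current_lp = start, log_posterior(start, goals)
    trace = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, goals)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        trace.append(current)  # a rejected proposal repeats the current value
    return trace


goals = [2, 1, 0, 3, 1, 2, 1, 0, 2, 1]  # made-up goals per match
starts = [-1.0, 0.0, 1.0, 2.0]  # four chains from deliberately different starts
chains = [metropolis(goals, start=s, seed=i) for i, s in enumerate(starts)]

burn_in = 500
chain_means = [
    sum(math.exp(x) for x in chain[burn_in:]) / (len(chain) - burn_in)
    for chain in chains
]
print([round(m, 2) for m in chain_means])
```

Plotting each chain's trace on one axis gives the "fuzzy caterpillar" picture: chains that start far apart but end up overlapping around the same value suggest the sampler has settled.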

What can you do with v1.8.0?

✅ Bayesian Dixon-Coles: Quantify uncertainty, not just point estimates
✅ Hierarchical Models: Learn league-wide variance to handle small sample sizes
✅ Built-in trace plots: Ensure model convergence
✅ Familiar API: Works seamlessly with existing scrapers and weights

08.01.2026 20:01 👍 1 🔁 0 💬 1 📌 0
A dark-themed code editor window showing Python code using the penaltyblog library. It demonstrates initializing a HierarchicalBayesianGoalModel with home/away goals and teams, fitting the model using MCMC sampling with parameters like 2000 samples and 4 chains, and finally predicting match probabilities for an Arsenal vs. Manchester City fixture

Bayesian goal models are back in penaltyblog v1.8.0 - and this time, they’re dependency-free! ⚽️📈

08.01.2026 20:00 👍 13 🔁 3 💬 1 📌 0

⚽ penaltyblog v1.7.1 is now available:

Added params_array and param_indices functions to goal models to make it easier to work with the model's parameters.

Thank you to Sebastian Velandia for this contribution!

pip install --upgrade penaltyblog

29.12.2025 12:23 👍 1 🔁 0 💬 0 📌 0

📝 A couple of important notes:

You'll need your own Opta subscription/credentials to use this (I provide the tools, not the data!) 🔑

As this is a v1 release of the integration, there are likely some rough edges. If you run into any, let me know - I'm happy to work with you to improve it. 🤝

09.12.2025 19:06 👍 0 🔁 0 💬 0 📌 0
Code example showing how the penaltyblog Python package can now integrate with the Stats Perform Opta API to download data directly.

⚽ penaltyblog v1.7.0 is now available: Direct integration with the Opta (Stats Perform) API.

You can now stream matches & events lazily without downloading the JSON first. Includes helpers for human-readable filtering (no more memorising IDs).

pip install --upgrade penaltyblog

09.12.2025 19:05 👍 6 🔁 0 💬 1 📌 0

Have been doing Advent of Code in Nim and Kotlin this year and enjoying learning both. I can see Nim becoming one of my favourite languages!

04.12.2025 21:30 👍 1 🔁 0 💬 0 📌 0

Thanks Nils, it’s a good point and is on my todo list to dig further into how long it takes for something like FSAA to stabilise to something useful early on in a player’s career. Even with a fairly wide HDI at that stage, there are still potential benefits to its use.

24.10.2025 12:05 👍 1 🔁 0 💬 0 📌 0

Not yet, but good idea!

20.10.2025 17:02 👍 1 🔁 0 💬 0 📌 0
Shrinkage, Uncertainty, and Son Heung-min: Using Bayesian Methods to Identify Finishing Ability
Why most finishing metrics are flawed and how a Bayesian approach gives us a truer picture of a player's finishing ability...

New article: "Shrinkage, Uncertainty, and Son Heung-min: Using Bayesian Methods to Identify Finishing Ability"

It discusses a Bayesian hierarchical approach to quantifying player finishing ability, with credible intervals to express uncertainty.

pena.lt/y/2025/10/01...

20.10.2025 16:34 👍 22 🔁 5 💬 3 📌 7
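
As a small illustration of what shrinkage and credible intervals mean here, the sketch below uses a Beta-Binomial model of shot conversion. This is a deliberate simplification and not the article's model: a true hierarchical approach learns the prior from all players, whereas here `prior_rate` and `prior_strength` are fixed by hand, and the function name and both players' numbers are invented.

```python
import random


def shrunk_finishing(goals, shots, prior_rate=0.10, prior_strength=200,
                     draws=10000, seed=0):
    """Posterior mean and 95% credible interval for conversion rate (toy model).

    prior_rate and prior_strength are assumed values: the prior acts like
    `prior_strength` extra shots converted at `prior_rate`.
    """
    alpha = prior_rate * prior_strength + goals
    beta = (1.0 - prior_rate) * prior_strength + (shots - goals)
    posterior_mean = alpha / (alpha + beta)

    # Approximate the interval by sampling from the Beta posterior
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    lower = samples[int(0.025 * draws)]
    upper = samples[int(0.975 * draws) - 1]
    return posterior_mean, (lower, upper)


# Two hypothetical players: a hot streak on few shots vs. a long track record
for name, goals, shots in [("Small sample", 5, 20), ("Large sample", 100, 800)]:
    mean, (lo, hi) = shrunk_finishing(goals, shots)
    print(f"{name}: raw={goals / shots:.3f} shrunk={mean:.3f} 95% CI=({lo:.3f}, {hi:.3f})")
```

The player with 20 shots has the higher raw rate but is pulled much harder towards the prior and keeps a wider interval, which is the behaviour the article's title refers to.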

🎉 penaltyblog v1.6.1 is out!

✨ What's new:

- Python 3.14 support
- scipy 1.16+ compatibility
- Better numerical stability for Negative Binomial model
- New Colab notebook for implied probabilities example

pip install --upgrade penaltyblog

17.10.2025 18:54 👍 4 🔁 3 💬 0 📌 0

Absolutely. Also, people started sharing big (for then anyway) pre-trained networks that made it much easier to get started. I built many models in my day job back then by fine-tuning BERT and ImageNet-pretrained networks that I would have struggled to train from scratch without investing massively in compute.

07.10.2025 20:18 👍 0 🔁 0 💬 0 📌 0

We can also split Massey Ratings into attack & defence:

🔴 LFC: Best attack in the league paired with a mid-table defence
⚪️ ARS: Elite at both ends. They have the #1 defence and the #3 attack
🌳 Forest: A disaster at both ends of the pitch

07.10.2025 19:55 👍 10 🔁 1 💬 0 📌 0
Massey Ratings for the English Premier League

Here's how the Premier League table really looks according to Massey Ratings, which account for strength of schedule.

📈 Arsenal (+1.5) & Man City (+1.3) are clearly the strongest overall
😬 Man Utd (-0.2) rank in the bottom half
📉 West Ham & Forest (-1.2) are the worst teams by far

07.10.2025 19:54 👍 14 🔁 4 💬 3 📌 2
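
For anyone curious how a rating can account for strength of schedule, here is a minimal pure-Python sketch of the basic Massey method: ratings are chosen so that rating differences best match observed goal differences in a least-squares sense. It is not penaltyblog's implementation, it ignores home advantage and the attack/defence split, and the three teams and their results are invented.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def massey_ratings(matches):
    """matches: iterable of (home, away, home_goals, away_goals) tuples."""
    teams = sorted({team for match in matches for team in match[:2]})
    idx = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    m = [[0.0] * n for _ in range(n)]  # Massey matrix
    p = [0.0] * n                      # total goal difference per team
    for home, away, home_goals, away_goals in matches:
        i, j = idx[home], idx[away]
        m[i][i] += 1
        m[j][j] += 1
        m[i][j] -= 1
        m[j][i] -= 1
        p[i] += home_goals - away_goals
        p[j] += away_goals - home_goals
    # The matrix is singular, so replace one row with "ratings sum to zero"
    m[-1] = [1.0] * n
    p[-1] = 0.0
    return dict(zip(teams, solve(m, p)))


# Invented results for three hypothetical teams
results = [("A", "B", 2, 0), ("B", "C", 1, 0), ("A", "C", 3, 0)]
for team, rating in sorted(massey_ratings(results).items(), key=lambda kv: -kv[1]):
    print(f"{team}: {rating:+.2f}")
```

A rating of +1.5 can then be read roughly as "1.5 goals better than an average team per match", after adjusting for who each side has played.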

Thanks to everyone who suggested features and reported issues. Your input shapes the package's development.

Questions or feedback welcome at pena.lt/y/contact

Install: pip install penaltyblog
GitHub: github.com/martineastwo...

23.09.2025 19:47 👍 0 🔁 0 💬 0 📌 0
penaltyblog: Football Data & Modelling Made Easy — penaltyblog documentation

📚 Interactive Colab notebooks are available in the docs - experiment with real examples without any local setup.

I'll be steadily expanding these over the coming weeks to cover all functionality in the package.

Docs: penaltyblog.readthedocs.io

23.09.2025 19:46 👍 0 🔁 0 💬 1 📌 0

🔧 Improved implied odds module:

- New logarithmic overround removal method for better accuracy
- Structured results instead of raw arrays
- Better handling of edge cases

Making it easier to work with bookmaker probabilities in your analyses.

23.09.2025 19:46 👍 0 🔁 0 💬 1 📌 0
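
For context on what overround removal does, here is a small pure-Python sketch of two common approaches: simple multiplicative normalisation and the power method. Neither is claimed to be the logarithmic method mentioned above or penaltyblog's code; the function names, the bisection bounds, and the odds are assumptions for illustration.

```python
def multiplicative(odds):
    """Divide each raw implied probability by their total."""
    raw = [1.0 / o for o in odds]
    total = sum(raw)
    return [p / total for p in raw]


def power(odds, tol=1e-12):
    """Find k so that sum(p_i ** k) == 1, then return p_i ** k.

    Assumes the book has an overround (raw probabilities sum to more than 1),
    so k lies above 1; the upper bound of 10 is an arbitrary assumption.
    """
    raw = [1.0 / o for o in odds]
    lo, hi = 1.0, 10.0
    while hi - lo > tol:  # bisection: the sum decreases as k increases
        mid = (lo + hi) / 2
        if sum(p**mid for p in raw) > 1.0:
            lo = mid
        else:
            hi = mid
    k = (lo + hi) / 2
    return [p**k for p in raw]


odds = [2.10, 3.40, 3.60]  # made-up home / draw / away decimal odds
print("overround:", round(sum(1 / o for o in odds) - 1, 4))
print("multiplicative:", [round(p, 4) for p in multiplicative(odds)])
print("power:", [round(p, 4) for p in power(odds)])
```

The two methods disagree on how the margin is spread: the power method removes proportionally more from longer-priced outcomes, so the favourite ends up with a slightly higher probability than under plain normalisation.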

💰 Expanded betting utilities:

- Kelly Criterion for multiple outcomes
- Arbitrage opportunity detection
- Value bet identification
- Hedge bet calculations
- Odds format conversion (decimal/fractional/American)

All functions now return structured outputs for easier integration.

23.09.2025 19:45 👍 0 🔁 0 💬 1 📌 0
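
The arithmetic behind a few of these utilities fits in a handful of lines. The sketch below covers only the single-bet Kelly fraction (not the multiple-outcome version listed above), a value-bet check, and odds conversion; the function names are hypothetical and are not penaltyblog's API.

```python
from fractions import Fraction


def kelly_fraction(prob, decimal_odds):
    """Single-bet Kelly stake as a fraction of bankroll; 0 when there is no edge."""
    edge = prob * decimal_odds - 1.0
    return max(0.0, edge / (decimal_odds - 1.0))


def is_value_bet(prob, decimal_odds):
    """A bet has positive expected value when prob * odds exceeds 1."""
    return prob * decimal_odds > 1.0


def decimal_to_american(decimal_odds):
    """Decimal odds of 2.0 or more map to positive moneylines, shorter odds to negative."""
    if decimal_odds >= 2.0:
        return round((decimal_odds - 1.0) * 100)
    return round(-100 / (decimal_odds - 1.0))


def decimal_to_fractional(decimal_odds, max_denominator=100):
    """Approximate the profit part of the odds as a simple fraction."""
    return Fraction(decimal_odds - 1.0).limit_denominator(max_denominator)


# Hypothetical example: we estimate a 50% chance and the bookmaker offers 2.20
print(is_value_bet(0.50, 2.20))
print(round(kelly_fraction(0.50, 2.20), 4))
print(decimal_to_american(2.50), decimal_to_american(1.50))
print(decimal_to_fractional(2.50))
```

In practice many bettors stake only a fraction of the Kelly amount, since the formula assumes the probability estimate is exactly right.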

⚽ penaltyblog v1.6.0 is now available!

📊 MatchFlow updates:

- SQL-style joins for nested football data (left, right, outer, inner, anti)
- Cloud storage support: read/write directly to AWS S3, Google Cloud Storage, and Azure Blob
- Automatic type inference for join keys

pip install penaltyblog

23.09.2025 19:45 👍 1 🔁 0 💬 1 📌 0
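
To show what the join types mean for record-style football data, here is a small sketch using plain lists of dicts. It is not the MatchFlow API: the `join` function, its arguments, and the sample records are invented, and only the inner, left, and anti variants are covered.

```python
def join(left, right, on, how="inner"):
    """Join two lists of dicts on a shared key (hypothetical helper)."""
    if how not in {"inner", "left", "anti"}:
        raise ValueError(f"unsupported join type: {how}")

    # Index the right-hand records by key so each lookup is cheap
    index = {}
    for row in right:
        index.setdefault(row[on], []).append(row)

    out = []
    for row in left:
        matches = index.get(row[on], [])
        if how == "anti":
            if not matches:  # keep only left rows with no partner
                out.append(dict(row))
        elif matches:
            for match in matches:
                out.append({**match, **row})  # left fields win on name clashes
        elif how == "left":
            out.append(dict(row))  # unmatched left rows are kept as-is
    return out


# Invented sample data
matches = [{"match_id": 1, "home": "Arsenal"}, {"match_id": 2, "home": "Leeds"}]
events = [
    {"match_id": 1, "type": "shot"},
    {"match_id": 1, "type": "pass"},
    {"match_id": 3, "type": "shot"},
]

print(join(events, matches, on="match_id", how="inner"))
print(join(events, matches, on="match_id", how="left"))
print(join(events, matches, on="match_id", how="anti"))
```

An anti join is handy for data quality checks, for example finding events whose match record is missing.
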
Penaltyblog v1.5.0: Faster Models, Smarter Queries, and a Sharper Edge
v1.5.0 delivers interactive charts, faster models, upgraded football probability grid, and a powerful Flow query language - all designed to make your analysis sharper and quicker...

🚀 New article on my blog walking through the latest updates to the penaltyblog Python package for football (soccer) analytics & betting

✅ New interactive pitch plots
✅ 5-10× faster goal models
✅ New Flow query DSL

👉 pena.lt/y/2025/08/14...

18.08.2025 18:07 👍 5 🔁 0 💬 0 📌 0
penaltyblog: Football Data & Modelling Made Easy — penaltyblog documentation

📚 Docs: penaltyblog.readthedocs.io/en/latest/in...
💻 GitHub: github.com/martineastwo...
🐍 pip install penaltyblog

Feedback welcome, let me know what you build!

15.08.2025 19:04 👍 1 🔁 0 💬 0 📌 0
Image of code showing examples of penaltyblog Flow's new DSL

🔍 New: Flow Query DSL

Filter datasets with safe, Pythonic expressions:
- AST-parsed (no eval)
- Variables, regex, dates
- Access nested fields

15.08.2025 18:59 👍 0 🔁 0 💬 1 📌 0
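
To illustrate why AST parsing is safer than `eval`, here is a minimal sketch of an expression filter built on Python's standard `ast` module. It is not the Flow DSL: `make_filter`, the whitelist of supported syntax, and the sample records are assumptions, and features such as variables, regex, and dates are left out.

```python
import ast
import operator

# Only these comparison operators are allowed
_COMPARE = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Lt: operator.lt, ast.LtE: operator.le,
}


def _evaluate(node, record):
    """Walk the parsed tree, allowing only a small whitelist of node types."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body, record)
    if isinstance(node, ast.BoolOp) and isinstance(node.op, (ast.And, ast.Or)):
        combine = all if isinstance(node.op, ast.And) else any
        # Generator keeps short-circuit behaviour, like Python's own and/or
        return combine(_evaluate(v, record) for v in node.values)
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        op = _COMPARE.get(type(node.ops[0]))
        if op is None:
            raise ValueError("unsupported comparison operator")
        return op(_evaluate(node.left, record),
                  _evaluate(node.comparators[0], record))
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return record[node.id]  # bare names are top-level fields
    if isinstance(node, ast.Attribute):
        return _evaluate(node.value, record)[node.attr]  # a.b reads a nested field
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


def make_filter(expression):
    """Hypothetical helper: compile an expression string into a predicate."""
    tree = ast.parse(expression, mode="eval")
    return lambda record: bool(_evaluate(tree, record))


# Invented sample events
events = [
    {"type": "Shot", "shot": {"xg": 0.45}},
    {"type": "Pass"},
    {"type": "Shot", "shot": {"xg": 0.05}},
]
good_chance = make_filter('type == "Shot" and shot.xg > 0.3')
print([e for e in events if good_chance(e)])
```

Because function calls, imports, and anything else outside the whitelist raise an error instead of running, a user-supplied filter string cannot execute arbitrary code.
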
Image of code demonstrating the features in penaltyblog's goal models

📈 Goal models are now 5-10× faster

- Cython-powered analytical gradients for speed + stability
- Fine-tune with minimizer_options:

15.08.2025 18:57 👍 1 🔁 0 💬 1 📌 0
GIF demonstrating the interactive tooltips in penaltyblog's new Pitch plotting API

⚽ New: Pitch Plotting API

Build interactive football visualisations with:
- Multiple layouts & themes
- Scatter, heatmaps, arrows, comets
- Custom hover tooltips

15.08.2025 18:55 👍 0 🔁 0 💬 1 📌 0

🚀 penaltyblog v1.5.0 is here!

✅ Interactive pitch plots
✅ 5-10× faster goal models
✅ New Flow query DSL

What’s new 👇

15.08.2025 18:53 👍 4 🔁 2 💬 1 📌 0
How Accurate Are Soccer Odds? A Data Dive into 250 Million Betting Lines
A data-driven deep dive into how accurately bookmakers price global soccer markets...

📊 New article on my blog: How Accurate Are Soccer Odds? A Data Dive into 250 Million Betting Lines

🔍 How sharp are different bookmakers?
📈 How accurate are bookmakers' odds?
🎯 Are the odds well-calibrated?

➡️ pena.lt/y/2025/07/16...

16.07.2025 18:19 👍 16 🔁 4 💬 0 📌 0
MatchFlow 1.4.0: Optimizing, Visualizing, and Validating your Data Pipelines
MatchFlow just got smarter, friendlier, and more powerful for optimizing your pipelines, visualizing your data flow, and keeping your data clean...

There's also a new blog post here that explains more about the latest updates:

pena.lt/y/2025/06/10...

19.06.2025 19:40 👍 3 🔁 0 💬 0 📌 0
GitHub - martineastwood/penaltyblog: ⚽ High-performance football analytics toolkit: build data pipelines, scrape data, model matches, rank teams, and bet smarter | Powered by pena.lt/y/blog 🚀

🚀 penaltyblog v1.4.0 is out!

Now includes a query plan optimiser for smarter Flow pipelines:

• Optional FlowOptimizer for smart rewrites (optimize=True)
• New .plot_plan() for pipeline viz
• .with_schema() for field validation
• Rolling- and time-based summaries

github.com/martineastwo...

19.06.2025 19:39 👍 4 🔁 0 💬 1 📌 0