
Defensive Quality: Convex Hull Approach Refined

How I Improved My Model to Better Evaluate Defensive Quality Across Leagues

As I foreshadowed in my last post, I've been working again on my project to quantify defensive quality in players.

To improve the process, I’ve optimized my code, making the necessary calculations much faster and allowing me to process my entire database more easily. I’ve also refined the underlying concept. If you're unfamiliar with my previous approach, the old version involved determining the convex hull of a player's actions in a given game, then identifying opponent actions that occurred within that convex hull. This process was repeated for every game in a season, with the values then adjusted for the number of actions that occurred in the area to determine the best defenders.
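The core geometric step can be sketched in a few lines. This is a minimal illustration, not my actual pipeline: it assumes `scipy` is available and that action locations are already extracted as (x, y) arrays, and it uses a Delaunay triangulation of the hull vertices for the point-in-hull test.

```python
import numpy as np
from scipy.spatial import ConvexHull, Delaunay

def opponent_actions_in_hull(player_xy, opponent_xy):
    """Boolean mask: which opponent actions fall inside the player's hull."""
    hull = ConvexHull(player_xy)              # hull of the player's action locations
    tri = Delaunay(player_xy[hull.vertices])  # triangulate the hull for inclusion tests
    return tri.find_simplex(opponent_xy) >= 0 # find_simplex returns -1 outside the hull

# toy example: a square "hull" and two opponent actions, one in, one out
player_xy = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
opponent_xy = np.array([[5.0, 5.0], [20.0, 20.0]])
mask = opponent_actions_in_hull(player_xy, opponent_xy)  # [True, False]
```

In practice this runs once per player per game, with the resulting masks aggregated over a season.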

While the initial version produced promising results, it was difficult to replicate and was limited to just two Premier League seasons. With my recent improvements, I believe I’ve addressed those issues and achieved a more reliable outcome. Below, I’ll explain the changes I made, why I think they’ve been successful, and why I’ll introduce the new metric into my player radars.

For those who want to skip the technical details, here’s the final ranking of center-backs from the 2023/24 Premier League season based on my updated model:

Given the inherent limitations of using event data rather than tracking data, I think this list holds up well. I actually prefer the lists for some other league-position combinations, but you'll have to trust me on those.

The scatterplots you’ll see are based on data from the 2020/21 to 2023/24 seasons for the Championship, Premier League, Liga Portugal, Serie A, and Bundesliga, as well as from 2020 to 2024 (excluding 2021) for the Brasileirao. I chose these leagues because I feel they provide enough variation in strength and style of play, while also having enough transfer activity between them. That way, we can track how values change season after season while accounting for both factors.

To create the convex hulls, I’ve selected the following actions: 'pass', 'receival', 'dribble', 'out', 'take_on', 'shot', 'offside', 'bad_touch', 'cross', 'goal', 'interception', 'clearance', 'tackle', 'foul'. Meanwhile, the opponent actions used for ratings are: 'pass', 'receival', 'dribble', 'out', 'take_on', 'shot', 'offside', 'bad_touch', 'cross', 'goal'.

These choices ensure that the convex hulls are built using a wide enough range of player actions to be meaningful while excluding set-piece scenarios. Additionally, the opponent actions considered are all in-possession events that a defender would be expected to contest.
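Applying these filters is straightforward with a tabular event frame. A minimal sketch, assuming a hypothetical pandas DataFrame with a `type` column (the column name and toy rows are my own, not from the actual database):

```python
import pandas as pd

# action types used to build each player's convex hull
HULL_ACTIONS = {'pass', 'receival', 'dribble', 'out', 'take_on', 'shot', 'offside',
                'bad_touch', 'cross', 'goal', 'interception', 'clearance', 'tackle', 'foul'}
# in-possession opponent actions that feed the ratings
OPPONENT_ACTIONS = {'pass', 'receival', 'dribble', 'out', 'take_on', 'shot',
                    'offside', 'bad_touch', 'cross', 'goal'}

# toy event frame; real data would hold one row per atomic action with coordinates
events = pd.DataFrame({
    'type': ['pass', 'tackle', 'throw_in', 'shot'],
    'x': [30.0, 42.5, 0.0, 88.0],
    'y': [20.0, 35.1, 10.0, 40.0],
})

hull_events = events[events['type'].isin(HULL_ACTIONS)]  # 'throw_in' is dropped
```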

I also compare two versions of the model: one using convex hulls based on the entire pitch and another using only actions performed in the first two-thirds of the pitch — similar to what I tested in my last post. Finally, for both versions, I evaluate three different metrics:

  1. Overall value of opponent actions

  2. Value from successful opponent actions (according to our definition)

  3. Value from unsuccessful opponent actions (according to our definition)

These values are calculated using the atomic VAEP model, which I’ve referenced multiple times in previous posts.
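Once each opponent action inside a hull carries a VAEP value and a success flag, the three slicings are just conditional sums. A sketch with made-up numbers (the column names and the values are illustrative, not real model output):

```python
import pandas as pd

# toy opponent actions inside a defender's hull: an atomic-VAEP value per action
# plus a success flag, per the definition used in the post
opp = pd.DataFrame({
    'vaep_value': [0.05, -0.02, 0.10, 0.01],
    'successful': [True, False, True, False],
})

overall = opp['vaep_value'].sum()                              # metric 1
from_successful = opp.loc[opp['successful'], 'vaep_value'].sum()    # metric 2
from_unsuccessful = opp.loc[~opp['successful'], 'vaep_value'].sum() # metric 3
```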

To cut to the chase: the final model I settled on — used to generate the rankings above — is the one based on convex hulls from the first two-thirds of the pitch, considering all actions performed in that area.

Why Did I Choose the First Two-Thirds Model?

Simply put, it produces better lists, especially when evaluating both the Premier League and Serie A — leagues I’m more familiar with — across all positions in my dataset. At the end of the day, this is my decision, and my authorial judgment has to carry weight here.

From a conceptual standpoint, limiting convex hulls to the first two-thirds of the pitch focuses on areas closer to a team's own goal, making them a more accurate representation of true defensive zones. These areas are more likely to capture mid/low block situations rather than pressing/gegen-pressing actions, which are already well covered by existing metrics if you have access to that type of data. All of this reinforces my belief that this is the right approach.
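The zone restriction itself is a one-line filter. A sketch under assumed conventions (a 105 m pitch with the team's own goal at x = 0, so the "first two-thirds" is everything up to x = 70; the actual data may use different coordinates):

```python
PITCH_LENGTH = 105.0  # assumed pitch length in metres, own goal at x = 0

def in_first_two_thirds(x):
    """True when an action happens in the two-thirds closest to the team's own goal."""
    return x <= 2 * PITCH_LENGTH / 3  # x <= 70 under these assumptions

defensive_zone = in_first_two_thirds(30.0)   # True: deep in the defensive two-thirds
attacking_zone = in_first_two_thirds(90.0)   # False: final third, excluded from hulls
```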

Why Prioritize All Actions Over Other Slicing Methods?

This choice comes down to interpretability and model strength. Among the three possible slicing methods, the metric based on prevented actions (unsuccessful opponent actions) is the weakest.

Here’s the possession-adjusted, whole-pitch ranking based on that method:

While there are some solid names on the list, there are also some very questionable ones. Judging by the names at the top, this metric appears to be heavily influenced by opportunity — how often opponents play the ball into your area — which in turn determines how many actions you actually get to stop. The presence of good center-backs like Thiago Silva and Botman suggests it also rewards how aggressive and effective you are at disrupting play.

From a repeatability standpoint, the prevented actions metric is the least reliable among all three possession-adjusted models.

Whole Pitch Model Repeatability:

Two-Thirds Model Repeatability:

As for the choice between all actions and only non-prevented opponent actions, it ultimately came down to list quality and risk aversion. The rankings are very similar, and the difference in repeatability is not huge. Therefore, I opted for the version with the highest season-to-season correlation in the two-thirds model.

Two-Thirds Model – Non-Prevented Actions:

Two-Thirds Model – All Actions:

Whole Pitch Model – All Actions:

The numbers are comparable and getting stronger, so I ultimately settled on the middle ground — favoring the two-thirds model over the whole pitch, based on both conceptual reasoning and results (list quality + correlation values).
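For what it's worth, the repeatability check behind those numbers boils down to correlating a player's metric in one season against the next. A minimal sketch with invented values (real data would pair each player's consecutive-season values across the whole sample):

```python
import pandas as pd

# toy per-player metric values in two consecutive seasons
seasons = pd.DataFrame({
    'player': ['A', 'B', 'C', 'D'],
    'metric_s1': [0.80, 0.50, 0.30, 0.90],
    'metric_s2': [0.70, 0.40, 0.35, 0.85],
})

# Pearson correlation between seasons: higher means more repeatable
r = seasons['metric_s1'].corr(seasons['metric_s2'])
```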

Addressing Possession Adjustment in Defensive Metrics

If you’ve followed my work closely, you know I’ve previously raised concerns about possession-adjusting defensive metrics. To account for this, I also tested a minutes-adjusted (per 98) version of all the models instead of possession adjustment. However, since my approach is based on in-possession opponent actions, the minutes-adjusted version performed worse across all metrics.
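To make the two adjustments concrete, here is one simple linear convention for each. This is a hedged sketch: possession adjustment can be done several ways (some implementations use a sigmoid rather than a linear rescale), and I am not claiming this is the exact formula used in the post.

```python
def possession_adjust(value, opp_possession_share, baseline=0.5):
    """Rescale a defensive total to a hypothetical 50/50 possession game
    (one simple linear convention, not necessarily the one used here)."""
    return value * baseline / opp_possession_share

def per_98(value, minutes_played):
    """Rescale a total to a per-98-minutes rate, as referenced in the post."""
    return value * 98.0 / minutes_played

adjusted = possession_adjust(2.0, 0.5)  # 50% opponent possession: unchanged, 2.0
rate = per_98(2.0, 196.0)               # two full 98-minute matches: 1.0 per 98
```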

For reference, here’s the correlation for the whole pitch all-actions metric using per 98 adjustment:

Wrapping Up

I’m really happy with the progress. While I still need to process my full database, I now know which model and metric to prioritize. In about a week or two, I should be able to implement this in my radars and regular analysis.

This also feels like a major addition. Around mid-January, I read How to Win the Premier League by Ian Graham (Liverpool’s former Head of Research & Development). On pages 150-152, he mentions that a similar model was good enough for Liverpool’s recruitment before they fully developed tracking-based defensive metrics. That’s a huge validation of this work.

So yeah — I’m satisfied. See you soon!