Improving Player Position Clustering

An Evolution in My Football Data Analysis

I'd like to share how I've refined my clustering methodology for player positions, resulting in more accurate and meaningful positional groupings.

The Evolution of My Approach

My second version keeps the core of the original method but adds one crucial improvement: it works directly with the numerical data generated by the heatmap function, as shown in the visualization below.

By feeding the raw numerical values into the clustering algorithm, I can identify which players truly resemble each other positionally. This removes one of the main issues with the previous version, where players with very different roles but similarly low activity levels (some goalkeepers and strikers, for example) would occasionally end up in the same cluster.
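To make the idea concrete, here is a minimal sketch rather than the actual pipeline. It assumes each heatmap is a small grid of action counts per pitch zone, that each grid is rescaled to sum to 1 so that only the spatial distribution matters, and that a plain k-means stands in for the clustering algorithm, which this post does not specify. The player names, grid size, and starting centroids are all hypothetical.

```python
from math import dist

def normalize(heatmap):
    """Flatten a 2D grid of action counts and rescale it to sum to 1."""
    flat = [float(v) for row in heatmap for v in row]
    total = sum(flat)
    return [v / total for v in flat] if total else flat

def kmeans(vectors, centroids, iterations=10):
    """Plain Lloyd's algorithm with explicit starting centroids (an assumption)."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(v, centroids[k]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical 1x3 heatmaps: counts in (own third, middle third, final third).
players = {
    "goalkeeper_low_minutes": [[9, 1, 0]],
    "goalkeeper_regular": [[90, 10, 0]],
    "striker_low_minutes": [[0, 2, 8]],
    "striker_regular": [[0, 30, 70]],
}

vectors = [normalize(h) for h in players.values()]
# Start from the first and last player so the run is deterministic.
labels = kmeans(vectors, centroids=[vectors[0], vectors[-1]])

for name, label in zip(players, labels):
    print(name, "-> cluster", label)
```

In this toy data the low-minute goalkeeper lands with the regular goalkeeper and the low-minute striker with the regular striker, because the comparison is made on where the actions happen and not on how many there are.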

Benefits of the New Methodology

The refined version delivers several improvements:

  1. Better positional differentiation: Players are now grouped more accurately according to their actual field positions

  2. Handling of low-minute players: Even players with limited playing time are properly clustered if they operate in similar areas of the pitch

  3. Manual cluster interpretation: I've implemented a process to examine each cluster's composition and assign meaningful position identifiers
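The third point can be supported with a small helper. The sketch below is illustrative only: it assumes each player record already carries a nominal position label (for instance from a data provider) alongside the cluster label, and it reports the most common position in each cluster and its share. The records and the function name `summarize_clusters` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (player, team, nominal position, cluster label).
assignments = [
    ("Player A", "Team X", "GK", 0),
    ("Player B", "Team Y", "GK", 0),
    ("Player C", "Team X", "ST", 1),
    ("Player D", "Team Z", "ST", 1),
    ("Player E", "Team Y", "AM", 1),
]

def summarize_clusters(rows):
    """Return {cluster: (most common position, share of the cluster)}."""
    by_cluster = defaultdict(Counter)
    for _player, _team, position, cluster in rows:
        by_cluster[cluster][position] += 1
    summary = {}
    for cluster, counts in by_cluster.items():
        position, n = counts.most_common(1)[0]
        summary[cluster] = (position, n / sum(counts.values()))
    return summary

summary = summarize_clusters(assignments)
for cluster, (position, share) in sorted(summary.items()):
    print(f"cluster {cluster}: mostly {position} ({share:.0%})")
```

A cluster with a low dominant share is a natural candidate for closer manual inspection, since it probably mixes roles or describes a hybrid position.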

The Human Element

This process can be time-consuming, since there are often 40+ clusters that each need careful attention, but it adds significant value. By analyzing team-weighted heatmaps and focusing on well-known teams within each cluster, I can better interpret what each grouping represents in football terms.

This manual review process allows me to translate algorithmic findings into practical football knowledge, bridging the gap between data science and on-pitch reality.

Impact

What might seem like a minor methodological improvement has made a substantial difference to player comparisons. More accurate groupings enable more meaningful comparisons and, potentially, better recruitment decisions.

Context: I completed my master's in anthropology in March and am now building my portfolio for the football data analysis industry. While the Italian football job market presents challenges compared to the more specialized English market, I'm hopeful this work demonstrates my capabilities to potential employers in football clubs, data companies, consultancies, and betting organizations.