K-Means Clustering: Redefining Central Midfield Classifications
I included only matches in which a player started as a central midfielder, and only players who had played at least 900 minutes in that role (the equivalent of 10 full matches).
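As a rough illustration of that filter, here is a minimal sketch in plain Python. The record layout (`player`, `position`, `started`, `minutes`), the `"CM"` position code, and the sample players are all hypothetical stand-ins, not the actual data source or schema used for this analysis.

```python
from collections import defaultdict

MIN_MINUTES = 900  # 10 full matches of 90 minutes


def eligible_players(appearances, min_minutes=MIN_MINUTES):
    """Sum minutes from starts at central midfield and keep players over the threshold."""
    totals = defaultdict(int)
    for row in appearances:
        # Keep only matches the player started as a central midfielder.
        if row["started"] and row["position"] == "CM":
            totals[row["player"]] += row["minutes"]
    return {player: mins for player, mins in totals.items() if mins >= min_minutes}


def make_rows(player, position, started, n_matches, minutes=90):
    """Build made-up appearance records for the example."""
    return [
        {"player": player, "position": position, "started": started, "minutes": minutes}
        for _ in range(n_matches)
    ]


# Hypothetical sample data.
appearances = (
    make_rows("Player A", "CM", True, 10)     # 900 minutes as a starting CM
    + make_rows("Player B", "CM", True, 9)    # 810 minutes as a starting CM
    + make_rows("Player B", "CM", False, 3)   # substitute appearances do not count
    + make_rows("Player C", "CB", True, 12)   # starts at another position do not count
)

eligible = eligible_players(appearances)
print(eligible)
```

Only the first player clears the 900-minute bar here, because substitute appearances and starts in other positions are dropped before the minutes are totalled.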
From there, I chose 25 features on which to measure the players. The features range from physical attributes (e.g. height, weight), to positional information (e.g. standard deviation of vertical movement), to passing, defense, and shooting metrics. I have tried to include features that cover the majority of a player’s actions during a match... After I had my 25 features, I needed to decide how many player classifications to define. There is no single standard method for determining K, the number of clusters, when working with clustering algorithms.
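One common heuristic for choosing K is the "elbow" approach: run k-means for a range of K values and look for the point where the within-cluster sum of squares stops dropping quickly. The sketch below is only an illustration of that idea, not necessarily the method used here. It assumes NumPy, uses randomly generated stand-in data in place of the real 25-feature player table, and standardizes the features first (an assumption on my part, since height and pass counts sit on very different scales). The function names `standardize` and `kmeans_inertia` are made up for this example.

```python
import numpy as np


def standardize(X):
    """Scale each feature to zero mean and unit variance."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    return (X - X.mean(axis=0)) / std


def kmeans_inertia(X, k, n_iter=100, seed=0):
    """Run plain Lloyd's k-means once; return the within-cluster sum of squares."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid where it is
                new_centroids[j] = members.mean(axis=0)
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float(dists.min(axis=1).sum())


# Synthetic stand-in data: 150 "players" x 25 features drawn around 3 centres.
rng = np.random.default_rng(42)
centres = rng.normal(0.0, 5.0, size=(3, 25))
X = np.vstack([c + rng.normal(0.0, 1.0, size=(50, 25)) for c in centres])
X_scaled = standardize(X)

# Within-cluster sum of squares for each candidate K.
inertias = {k: kmeans_inertia(X_scaled, k) for k in range(1, 9)}
for k, value in inertias.items():
    print(k, round(value, 1))
```

Plotting these values against K and picking the bend in the curve is a judgment call rather than a rule, which is part of why choosing K has no standard answer. A single run per K with a fixed seed keeps the sketch short; in practice several random restarts per K give a more stable curve.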