Compared to the American sports of basketball, football, hockey, and baseball, the mainstream world of soccer has often rejected advanced statistics. In the last couple of years, however, a metric called Expected Goals (xG) has emerged to challenge soccer’s traditional belief that the only statistic that matters is the number of goals. xG now dominates discussion about soccer, from the professional analysis on the BBC’s “Match of the Day” to conversations between fans in the pub or on Twitter. Because xG was the first advanced metric to reach mainstream soccer, fans often wrongly treat it as the ‘holy grail’ of soccer statistics.
Because soccer is such a low-scoring game, many believe the final score often does not reflect how well a team played; xG attempts to quantify a team’s performance by how many goals it ‘deserved.’ The calculation is simple in principle: each shot in a game is assigned a probability of ending up in the goal, and these probabilities are added together to yield the team’s xG at full time. Models calculate these probabilities differently, but the factors used include the position from which the shot was taken, the part of the body with which the shot was taken, and the position of the goalkeeper when the shot was taken. Most models give special scenarios like set pieces a constant value: a penalty, for example, is commonly given an xG of 0.76.
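The summation described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular provider's model: the Shot class, the team names, and the per-shot probabilities are all hypothetical, and only the penalty value of 0.76 comes from the text.

```python
from dataclasses import dataclass

PENALTY_XG = 0.76  # constant commonly assigned to penalties, as quoted in the text


@dataclass
class Shot:
    team: str
    probability: float  # hypothetical model estimate that this shot is scored
    is_penalty: bool = False


def shot_xg(shot: Shot) -> float:
    """Return the xG of one shot; penalties get the fixed constant."""
    return PENALTY_XG if shot.is_penalty else shot.probability


def team_xg(shots: list[Shot], team: str) -> float:
    """Sum the per-shot probabilities for one team to get its match xG."""
    return sum(shot_xg(s) for s in shots if s.team == team)


# Hypothetical shot list for a single match (values are made up).
shots = [
    Shot("Home", 0.05),
    Shot("Home", 0.31),
    Shot("Home", 0.0, is_penalty=True),  # probability ignored for penalties
    Shot("Away", 0.12),
    Shot("Away", 0.08),
]

home_xg = team_xg(shots, "Home")
away_xg = team_xg(shots, "Away")
print(f"Home xG: {home_xg:.2f}, Away xG: {away_xg:.2f}")
```

In a real model the per-shot probability would itself be estimated from factors such as shot position, body part, and goalkeeper position; here it is simply supplied as a number.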
xG is an excellent predictor of long-term success and a useful tool for analyzing long-term performance. When applied over long stretches of time, a gap between a team’s actual goals and its xG may indicate clinical finishing, unsustainable success, or a little bit of both. FiveThirtyEight, a data journalism organization, uses xG to calculate its Soccer Power Index (SPI), which is essentially a rating of a team’s recent strength that FiveThirtyEight uses for forecasting match results. The use of xG helps to minimize the role of ‘luck,’ which comes into play because of soccer’s low-scoring nature. During a run of form in which Arsenal F.C. underperformed its xG, Arsenal manager Mikel Arteta made the analogy that “football is not like basketball. In basketball, you shoot 50 times and the opponent does it once and you win every single game. It doesn’t work in football like that. You can do it the opposite way around and lose 1-0.”
The problem comes when xG is used to analyze the performance of a team in a single match or a short run of matches. For example, after every match, fans use xG to debate whether the winning team ‘deserved’ the win. The Twitter account @xGPhilosophy posts the xG of each team after every Premier League match, sparking debate about the potential involvement of ‘luck’ in the match’s outcome.
Using xG to analyze long-term performances works because the factors that a model does not account for (perhaps the height of the ball or the position of the defenders when the ball is struck) tend to average out over many shots, as the Law of Large Numbers suggests. In the small sample of data from a single match, these factors do not average out and may introduce substantial errors into the xG figures. A single match of soccer has only around 20 shots between the two teams, which is far too small a sample for xG to be reliable.
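The sample-size point can be illustrated with a rough calculation of how far actual goals can be expected to stray from xG. This is only a sketch under simplifying assumptions that are not in the original argument: every shot is treated as independent with the same scoring probability of 0.10, a team takes 10 shots per match (about half of the roughly 20 mentioned above), and a season lasts 38 matches.

```python
import math


def goal_spread(shot_probs: list[float]) -> tuple[float, float]:
    """Return (xG, standard deviation of goals), assuming independent shots."""
    xg = sum(shot_probs)
    # Each shot is a Bernoulli trial, so its variance is p * (1 - p).
    variance = sum(p * (1 - p) for p in shot_probs)
    return xg, math.sqrt(variance)


SHOTS_PER_MATCH = 10    # assumed: about half of the ~20 shots in a match
SHOT_PROB = 0.10        # assumed, purely illustrative
MATCHES_IN_SEASON = 38  # assumed season length

match_xg, match_sd = goal_spread([SHOT_PROB] * SHOTS_PER_MATCH)
season_xg, season_sd = goal_spread(
    [SHOT_PROB] * SHOTS_PER_MATCH * MATCHES_IN_SEASON
)

# Spread of goals relative to xG: large for one match, small for a season.
match_rel = match_sd / match_xg
season_rel = season_sd / season_xg
print(f"Single match: xG {match_xg:.1f}, relative spread {match_rel:.0%}")
print(f"Full season:  xG {season_xg:.1f}, relative spread {season_rel:.0%}")
```

Under these assumptions, the typical gap between goals and xG is about as large as the xG itself in a single match (roughly 95 percent) but shrinks to roughly 15 percent over a season, because the relative spread falls with the square root of the number of shots.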
The problem of small sample sizes is worsened when xG is translated into other metrics such as Expected Assists (xA), Expected Goals Chain (xGC), and Expected Goals Buildup (xGBuildup) to evaluate an individual player’s performance. One match is not sufficient to justify using these metrics to evaluate individuals, as the data for a single player is an even smaller sample than the data for every player on the field over the whole match.
Moreover, specific situations or tactical decisions in a match may influence xG significantly. For example, a team may choose to sit back and defend after scoring one goal, increasing the opponent’s xG and decreasing its own. In this situation, the teams’ xG figures do not indicate that the opponent played ‘better’; rather, they reflect a deliberate tactical decision by the leading team. If we look at xG over a longer period, these tactical factors are likely to be averaged out and made relatively insignificant.
The metric of Expected Goals is mainstream soccer’s first step into advanced statistics. In a low-scoring sport like soccer, advanced statistics are especially important, and xG presents soccer with the perfect gateway to look at the sport through more than the number of goals and shots. Yet its limitations mean that xG should be viewed only alongside other statistics and the match itself. xG does not reveal the tactical decisions and particular shot circumstances that can only be seen by the eye. xG does not and should not replace anything; it is just another tool that fans can use to draw conclusions about long-term performance.