xG, xA and a battle for hearts and minds – a friendly debate between a fan and The Athletic’s data expert

Like most of life’s better ideas, the genesis of this article was in the pub.

It came via a discussion about the merits or otherwise of Tottenham Hotspur’s football team. You know the drill: they’re great to watch, can beat the best and lose to the worst. Spurs gonna Spurs, etc.

The pub-based chat, though, became particularly heated when the topic turned to expected goals (xG).

“Spurs’ underlying xG numbers are pretty good, so I’d expect them to climb the table,” The Athletic opined.

“Mate, seriously do not talk to me about xG, that’s just fancy journalism bulls*** that you lot use to make a point.” Enter this article’s protagonist, Dominic Townsend.

Dom is a Wycombe Wanderers fan, middle class, educated to degree level and a likeable, right-minded fellow. He absolutely cannot stand xG and its use in modern mainstream media.

What followed in The Salisbury, a cracking traditional boozer in Haringey, north London, was a wine-fuelled debate for the ages, arguing the toss over the merits or otherwise of xG and other data terms.

At its conclusion, Dom was angry/drunk enough to accept the challenge of repeating his views to an international audience via the medium of the internet and agreed to continue the xG debate with a proper data expert.

Enter The Athletic’s very own Duncan Alexander, an ex-Opta OG in xG. Duncan is also a Wycombe fan — but, incredibly, the pair have never met.

So what is xG? Why do we use it? What other data terms might cross over into the mainstream? And what about xA, PSxG, PPDA, field tilt, duels and progressive passes and carries?

Dom v Duncan. Let battle commence.


Expected goals (xG)

Dominic Townsend: “I should probably start by saying I actually work for a data company, so despite Tim’s protestations I’m definitely not anti-data. Expected goals, I’ve had explained to me a couple of times and I broadly get what it is. I think ‘possible goals’ is a better way of putting it.

“I also think the xG figure should start at a hundred, and then every time you miss a chance it takes a certain percentage off. I don’t understand how you could have more possible/expected goals than you actually score, because there’s illogicality there.”

Duncan Alexander: “Well, not really. But yeah.”

Townsend: “And it isn’t used in the right way on television. It gets flashed up on (BBC’s) Match of the Day and everyone is expected — pun intended — to know what it means, but I’m not sure everyone does. I don’t hear many other people talking about xG in a pub.”

Alexander: “Maybe you’re going to the wrong pubs.”

Townsend: “Haha, well maybe I’m not!”

Alexander: “I think it has crossed into common parlance with younger generations.”

Townsend: “You’re calling me old, basically.”

Alexander: “Well, you’re younger than me, but yes. I think you make some really good points. The name… I worked with the person who basically invented the model, or borrowed it from ice hockey. He’ll admit as well that the name isn’t great — it makes people think it’s predicting stuff that’s happening in the future.

“It’s a measure of chance quality. To your point about how you can have more expected goals than actual goals in a game… the easiest example to use is a penalty, which has a 0.78 xG rating because historically 78 per cent of penalties are scored.

“And then a shot from the ‘D’ just outside the box probably has a 0.04 xG rating because four per cent of those shots go in.”
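As a rough illustration of how a match xG figure can exceed the goals actually scored, the sketch below simply adds up per-shot probabilities. The 0.78 and 0.04 values are the ones quoted above; the match itself, the function name `total_xg` and the scoreline are hypothetical.

```python
def total_xg(shot_probabilities):
    """Sum per-shot scoring probabilities into a single xG figure."""
    return sum(shot_probabilities)


# Hypothetical match: one penalty (0.78) and five shots from the 'D' (0.04 each).
shots = [0.78] + [0.04] * 5
match_xg = total_xg(shots)

# Hypothetical outcome: every one of those chances was missed.
goals_scored = 0

print(f"xG {match_xg:.2f}, goals {goals_scored}")
```

Here the side finishes with roughly one expected goal and none on the scoreboard, which is a statement about the quality of the chances rather than a prediction.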

Townsend: “OK, so that makes sense.”

Alexander: “There’s an awful lot of data behind it — probably more than half a million shots are in the database.

“It’s at its best when judging a team over a longer period. If you look at rolling xG over a large amount of games you can see where a team’s good and bad form kicks in. It’s OK in a one-off game scenario, and sort of OK in a one-off shot.

The xG rolling average can show a team’s form over time
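A rolling average of this kind could be computed along the lines of the sketch below. The window size, the function name `rolling_mean` and the per-match figures are assumptions made for illustration, not data from any real team.

```python
def rolling_mean(values, window):
    """Return the mean of each run of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical per-match xG for one team, oldest match first.
match_xg = [0.5, 1.5, 1.0, 2.0, 3.0]

# Three-match rolling average: one value per complete window.
form = rolling_mean(match_xg, 3)
print(form)  # [1.0, 1.5, 2.0]
```

A longer window smooths out one-off games, which fits the point that the measure is most useful over a large number of matches.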

“The other thing people sometimes get the wrong end of the stick on is; ‘How can you measure Lionel Messi and Che Adams (editor: apologies to Che Adams) on the same expected goals basis, because they’re obviously different players in terms of skill and quality?’.

“But the point is you’re rating everyone above or below the average. Messi has outscored his xG pretty much every season he’s played football. Son Heung-min at Spurs, Harry Kane; these players consistently overperform.”

Townsend: “So, hang on. Is xG a worldwide average based on overall stats and positions of shots, or is it a team stat against the average? I guess what I’m trying to say is: if you’re saying Messi, Kane and Son outperform xG, is that because it’s a worldwide average?”

Alexander: “Essentially yes, although it’s not worldwide. Opta, the most common model, is taken from 10 or 11 leagues.”

Townsend: “OK, so how about if Liverpool and Southampton play each other and have the same xG of 2.10… but it’s not the same xG, is it, because Liverpool’s players would be expected to score more goals because they have better players?”

Alexander: “Well that’s where the name doesn’t quite fit and possible goals would work better.”

Townsend: “I think that’s the problem, you’ve just explained it to me and I’ve followed what you said but I still have a question mark around certain aspects.

“And I guarantee you most pundits that use xG numbers in their analysis have no idea and wouldn’t be able to explain what you just did.”

Alexander: “I won’t mention who it was, but I was once on a Premier League programme with an ex-player talking about xG. He didn’t understand its usage, but after 45 minutes of me going through the semi-famous Manchester United win at Arsenal, when Arsenal battered them on xG but lost 3-1, he got it. He had a moment of realisation.

“So yeah, you’re right. But it has only existed in the professional game for just over a decade and it wasn’t until 2017 that it was pushed to the media. So certain old dogs just won’t have learned the new tricks.

“I remember looking at Twitter the night Match of the Day first flashed up xG alongside shots, yellow cards and, yeah, it didn’t go down well. Seven years later, there are clearly still some sceptics…”

Townsend: “Haha, yes. OK, just clarify something for me. Paul Gascoigne at Euro 96, that famous chance against Germany when the cross comes in and he slides in to score, but doesn’t actually touch the ball. It’s a huge chance for a goal, but not a shot, so is xG registered for that?”


Gascoigne’s chance for England against Germany at Euro ’96 (ITV SPORT)

Alexander: “That’s a very good point. Opta have a metric called ‘Big Chance’ and that Gazza one would go down as one and get an xG value. It’s a niche example for data people: how can you get xG from a shot you didn’t have?”

Townsend: “So it does count?”

Alexander: “Yeah, but they’re rare. A better example is probably the Alejandro Garnacho overhead kick at Goodison Park last season. That had maybe a six per cent xG because his body was in a good position in the box, you’d expect to score a decent amount of goals from there, but what he actually did with his back to goal was a one-in-10,000 thing… xG didn’t take into account he was facing the wrong way and acrobatically flinging himself.

“That’s why doing individual shot xG isn’t the perfect use. Where it really has powerful value is looking at how a team does over a season or multiple seasons.”

Townsend: “The thing is — and obviously The Athletic is leading the charge here — it’s appealing to the way Americans talk about sport. I think that’s probably fine. But there are certain things that don’t translate as well. The Gascoigne one, let’s say wingers hit the byline and cross just over a striker’s head 20 times in a game, does that count? Because it’s a goal opportunity but the striker may have mistimed their run.”

Alexander: “Well, that’s where we bring in our good friend expected assists (xA).”


Expected assists (xA)

Alexander: “How do you credit players for good passes that are not taken advantage of? Or how do you credit players who consistently do things that go unrecognised?

“The Steven Gerrard figure, Hollywood-balling it across the pitch, or playing through balls, they get a lot of credit, but what about the player consistently making those passes that open up space? That’s the logic behind xA. You award a value to each pass, given how the sequence of play then progresses.

“Rather than just looking at each bit of the game as an individual pass, cross, header, etc, you start linking everything together and assigning value.

“A difficult pass at the start of a chain, because it moves the opposition around, might lead to a good chance. So it’s a way of rewarding those players further back. How do you value Rodri, etc?”


How easy is it to give a value to Manchester City midfielder Rodri? (Nick Potts/PA Images via Getty Images)

Townsend: “That’s a really interesting one. As you’ll know, having supported a lower league club, there are certain players who really stand out in games for being a class above League One/League Two level, but that class might not be properly reflected in their goals/assists numbers because their team-mates are crap.

“When every pundit and journalist is asking what a team actually needs, xA can help fill in those gaps.”

Alexander: “Data is not there to try to solve football. Or to turn football into a spreadsheet. What it’s there for is to explain stuff and add context to things we all know.”

Townsend: “This is where I come back to pundits, because I feel it’s their responsibility to use the incredible amount of data available to them to back up their points.

“Some of them, you can tell, were good players and think they’re above data because it’s all (points to forehead) up here.”

Alexander: “That’s definitely true, but also if you’re on Match of the Day, that’s for a general football audience. My mum watches Match of the Day… no offence to my mum, I’ve told her about xG.

“You do get some people who know about data in football who think they know better than an ex-pro, but in reality, they don’t because they’ve never played at that level.

“Conversely, a lot of ex-pros don’t think they need to know about data because they scored a hat-trick at Mansfield once.

“In the spirit of open-mindedness, whatever your position or knowledge base or history, everyone can enhance their understanding and knowledge of football at any point.”


Post-shot expected goals (PSxG)

Alexander: “Right, so xG is where the shot is taken from, but then if that shot is on target you can measure how likely it is that it would have been a goal. I.e., if you shoot at the top or bottom corner, it’s got a higher chance of being scored than if you hit it in the middle of the goal.

“The most common usage is for goalkeepers. If they’ve saved 10 shots on target but all those shots were in the middle of the goal where they’re probably stood, then that’s valued less than if the shots were aimed at the top corner.

“It’s essentially an evolution and an add-on to the main xG model.”
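One way to picture the goalkeeper use case is to compare the total post-shot xG a keeper has faced with the goals they actually conceded. The sketch below does that for two hypothetical keepers who each saved all 10 shots on target. The per-shot values and the name `goals_prevented` are assumptions for illustration, not figures or terminology from any particular model.

```python
def goals_prevented(psxg_values, goals_conceded):
    """Post-shot xG faced minus goals conceded (higher is better)."""
    return sum(psxg_values) - goals_conceded


# Hypothetical keeper A: 10 shots straight down the middle, each easy to save.
central_shots = [0.125] * 10
# Hypothetical keeper B: 10 shots aimed at the top corner, each hard to save.
corner_shots = [0.5] * 10

keeper_a = goals_prevented(central_shots, goals_conceded=0)
keeper_b = goals_prevented(corner_shots, goals_conceded=0)

print(keeper_a, keeper_b)  # 1.25 5.0
```

Both keepers made 10 saves, but on these made-up numbers the second one dealt with far more difficult shots.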

Townsend: “See, again, I like that as a metric but the name is terrible. Post-shot?”

Alexander: “Yeah, to be honest, it can be xGOT (expected goals on target).”

Townsend: “We really need to talk about the whole naming thing. At what point did the whole media industry get in a room and go: ‘Right, you know football has existed for 100-odd years and everyone is fine with shots, saves, assists, tackles, closing down…’.”

Alexander: “Well, you say that…”

Townsend: “Basically we’re going to get to change all these terms with weird naming conventions and we’re not going to tell anyone about it.”

Alexander: “Yeah, that’s my fault, basically. But to be honest that happens in football all the time. If you go back and look at the reports of John Barnes’ home debut for Liverpool in 1987, it says he scored one and made one, which he did. Now, we’d say he got a goal and an assist, but ‘assists’ weren’t around then, that terminology came from American sport in the 1990s.”


The term ‘assist’ was not used in football in the 1980s when John Barnes was starring for Liverpool (Simon Bruty/Allsport)

“People would have valued an assist in the 1890s, there just wasn’t the terminology. In the same way xG would have been understood in a different way… someone runs through on goal in 1905 for The Wednesday and they miss the shot, people would have said, ‘He should have scored that’. Essentially that’s still xG, but they were wearing more hats.”

Townsend: “So for PSxG, the concept works well but again it’s about presenting it in a certain way. So the keeper is in the form of their life and this is why, here’s the data which backs it up.”

Alexander: “Any kind of acronym, it’s not particularly helpful to start throwing letters around, speaking of which…”


PPDA (passes per defensive action)

Alexander: “This is an interesting one because it’s pretty basic and measures a team’s press.

“If you’ve got a PPDA of 10, the opposition are having 10 passes for every defensive action you’re doing against them.”

Townsend: “What’s a defensive action?”

Alexander: “A tackle, a challenge, an aerial duel, any kind of defensive interaction from the offensive players.”


Fulham being pressed by Chelsea (BBC Match of the Day)

Townsend: “So, tackle one to tackle two, there are 10 passes occurring?”

Alexander: “Yes, and in some games that might be 25… the higher the figure, the less you are pressing or being intense against that team, because you’re allowing them to have 25 passes before you tackle or press them.

“Again, over a longer period of time, such as a season, the lowest PPDA figures belong to the teams that press the most and the highest figures to the ones who sit off and are passive or reactive.

“It’s pretty simple. Just two numbers divided by each other. And it correlates pretty well with what happens on the pitch.”
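The "two numbers divided by each other" can be written out as a minimal sketch. The match totals below are hypothetical, and exactly which events count as a defensive action is left to whoever supplies the data.

```python
def ppda(opposition_passes, defensive_actions):
    """Passes allowed to the opposition per defensive action made."""
    if defensive_actions <= 0:
        raise ValueError("need at least one defensive action")
    return opposition_passes / defensive_actions


# Hypothetical match totals for two different approaches.
pressing_side = ppda(opposition_passes=300, defensive_actions=30)
passive_side = ppda(opposition_passes=300, defensive_actions=12)

print(pressing_side, passive_side)  # 10.0 25.0
```

The lower figure belongs to the side that engages more often, matching the idea that a low PPDA indicates a more intense press.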

Townsend: “I like that a lot. That should be used more commonly, you hear so much about whether teams press high or not.”

Alexander: “Of all the stats that get explained, that’s the one people go; ‘Oh yeah, fair enough, good, I like it’.”

Townsend: “Good, I like it.”


Duels

Townsend: “This sounds quite medieval.”

Alexander: “Duels have an interesting backstory. When Opta was in its infancy, it started working on English football and in England we like tackles. But in Germany, they value a metric called duels so when Opta started working with the Bundesliga, German clients asked them why Opta didn’t have duels data.

“Essentially, any time two players from opposite teams come together it can be classed as a duel. So, it’s a composite metric of tackles, aerial challenges, dribbles, etc.

“Aerial duels are a particularly good metric and high numbers tend to directly relate to players who are good in the air, like Chris Wood.”


Chris Wood, good in the air (Nathan Stirk/Getty Images)

Townsend: “So it’s the number of duels and then the percentage of duels won? And that’s based on the duels you’ve won against everyone, not just your battles against a certain player?”

Alexander: “Yeah, that’s right.”
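The count-plus-percentage reading Dom describes might look like the sketch below. The list of outcomes is hypothetical, with each entry marking one duel against any opponent as won (`True`) or lost (`False`).

```python
def duel_summary(outcomes):
    """Return (duels contested, percentage won) for a list of booleans."""
    contested = len(outcomes)
    won = sum(outcomes)  # True counts as 1, False as 0
    percentage = 100 * won / contested if contested else 0.0
    return contested, percentage


# Hypothetical player: four duels across a match, three of them won.
contested, percentage = duel_summary([True, True, False, True])
print(contested, percentage)  # 4 75.0
```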

Townsend: “That’s absolutely reasonable. The word has been in medieval parlance for centuries. Well done the Germans.”


Field Tilt

Alexander: “Field tilt is a slightly improved version of possession: it’s the proportion of touches a team has in the final third.

“You ignore the middle bit of the pitch, you just look at the defensive and attacking thirds and you say; ‘If Man City have got a field tilt of 82 per cent, they’re having loads of the ball in the attacking third’.

“If a team averages a 70 per cent field tilt over a season, you know they’ve been possession-heavy on the front foot.”
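As a sketch of the calculation, one common reading (assumed here) is that field tilt is a team's share of all the final-third touches in a match, counting both sides' attacking thirds and ignoring the middle of the pitch. The touch counts below are hypothetical.

```python
def field_tilt(own_final_third_touches, opp_final_third_touches):
    """Team's percentage share of both sides' final-third touches."""
    total = own_final_third_touches + opp_final_third_touches
    if total == 0:
        raise ValueError("no final-third touches recorded")
    return 100 * own_final_third_touches / total


# Hypothetical match: 164 touches in the opposition's third, 36 conceded.
tilt = field_tilt(164, 36)
print(tilt)  # 82.0
```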

Townsend: “Again, I get it, but again I’m going to sound like a broken record because I’ve got an issue with the name. It just makes me think of a sloping pitch.”

Alexander: “Just imagine yourself in a pub shouting; ‘Come on, we’ve got to turn this field tilt around’.”

The Athletic’s matchday snapshot from December’s Manchester Derby

Townsend: “Can you see the traditional list of shots, fouls, corners evolving over time so that in 10 years we’ve got PPDA and field tilt being used in the mainstream media?

“Because when I’m in the States, they have the most unbelievable rolling sports coverage and they go over every single play in detail. All of this parlance that’s coming into football feels influenced by the States.

“Certain sports do lend themselves more naturally to stats — stop-start games like American Football, basketball, cricket and rugby, for example.

“I think the difficulty over here is that, broadly speaking, you’ve got Match of the Day which clearly doesn’t have enough time. Sky Sports’ Monday Night Football is the only long-form mainstream show that uses data well.

“And to be honest xG is actually one of the most complicated things we’ve explained today but it’s used the most, whereas some of the others like PPDA which are easier to understand and really helpful, don’t get used.”

Alexander: “Yeah… I can’t argue with some of that. I do think as time goes on, people in their 20s now and younger are using those terms more. Terminologies become outdated and we’re in a permanent cycle of ‘In my day’.

“I’m pretty sure when we’re old, we’ll say to some kid on a hoverboard: ‘In my day we called it xG’.”


Progressive passes and carries

Alexander: “Right, so this is related to metrics like xA. Data for a specific pass doesn’t tell you that much, but if you then ascribe value to a good pass, or a cross-field pass that opens up the pitch, or a pass that takes out four opposition players, progressive passes is a way of applying value to that.

“It’s the same with carries. Rather than just dribbles, which is going past a player, with carries you can measure distance. So players who have carries over 10 yards and then end with a key pass or a shot, it’s a better way of valuing what a player does.”

Townsend: “The very obvious comparison there would be yards carried in American Football? I think that makes a lot of sense. And progressive passes, does that mean forward passes?”

Alexander: “At The Athletic, we count a completed pass as ‘progressive’ if it’s at least 10 metres long and moves the ball at least 25 per cent of the remaining distance to goal…”
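The two thresholds in that definition can be checked with a little geometry. The sketch below is an illustration only: the coordinate system (metres on a 105 x 68 pitch, attacking towards the goal centred at `(105, 34)`), the function name `is_progressive` and the example passes are all assumptions, and the pass is taken to be completed already.

```python
import math

# Assumed centre of the goal being attacked, in metres on a 105 x 68 pitch.
GOAL = (105.0, 34.0)


def is_progressive(start, end, goal=GOAL):
    """True if a completed pass is at least 10 m long and cuts the
    remaining distance to goal by at least 25 per cent."""
    length = math.dist(start, end)
    before = math.dist(start, goal)
    after = math.dist(end, goal)
    if before == 0:
        return False
    return length >= 10.0 and (before - after) >= 0.25 * before


# Hypothetical passes.
forward = is_progressive((50, 34), (70, 34))   # 20 m, gains 20 of 55 m
sideways = is_progressive((50, 10), (50, 58))  # long, but no closer to goal
short = is_progressive((90, 34), (95, 34))     # big gain, but only 5 m long

print(forward, sideways, short)  # True False False
```

On this reading a progressive pass is not simply a forward pass: it has to be long enough and has to make a meaningful dent in the distance left to goal.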

Townsend: “I like the idea of being able to judge someone via these metrics… it takes the fun out of saying; ‘My player’s better than yours, mate’. But as someone who likes to win arguments, having more data is probably quite handy.

“Basically, where we’ve got to with all this is 1) I agree with pretty much everything you’ve said but 2) Everyone needs a Duncan in the pub with you to explain all this.”

Alexander: “I’m happy to go to any pub in any town and offer that service.”

Townsend: “Just make sure you call it ‘possible goals’.”

(Top photos: Getty Images; design by Demetrius Robinson)
