Are the Reds on the Verge of Footballing AI Frontiers?
Part two of Liverpool’s role in the data revolution in football
Part one looked at the historical context of how the Reds’ data department evolved: how John Henry tried to buy the world-leading statistical analysis firm Decision Technology but, when that failed because of its agreement with Tottenham, instead brought in Michael Edwards from Spurs, who then tempted Ian Graham (one of the leading consultants and model-building experts at the company) to Anfield to help build a new area within the club focused on advanced analytics.
Prior to Liverpool’s Champions League victory over the aforementioned Tottenham, this long read was published in The New York Times. It outlined the role Ian Graham had, not only in recruitment but in the various areas where data is used to extract ‘edges’ or marginal gains.
More than other major clubs, Liverpool incorporates data analysis into the decisions it makes, from the corporate to the tactical. How much that has contributed to its recent performance is itself hard to measure. But whatever the outcome of the final, the club’s ascent has already started to make number-crunching acceptable, even fashionable, in England and beyond. As more clubs contemplate employing analysts without soccer-playing backgrounds to try to gain a competitive edge, Liverpool’s season has served as something of a referendum on the practice.
This brings us on to the future of the Reds’ use of advanced data and analytics. During the time when Graham was setting up the department - alongside Edwards - they employed a particle physicist who had been working at CERN, trying to find the Higgs boson. That’s not the latest world-class Hungarian midfielder whose dad made him wear tight boots as a kid; it’s - and I glazed over reading this, so don’t feel bad - something a bit more important than that:
You and everything around you are made of particles. But when the universe began, no particles had mass; they all sped around at the speed of light. Stars, planets and life could only emerge because particles gained their mass from a fundamental field associated with the Higgs boson. The existence of this mass-giving field was confirmed in 2012, when the Higgs boson particle was discovered at CERN.
This was Will Spearman, and after the panic-stricken discussions that followed the departures of Edwards, Ward and Graham himself, our Chicago-born American took over the vacated role, becoming Liverpool’s Director of Research. But let’s rewind a bit, because how does a particle physicist from the US become interested in football? Well, he first took a job at Hudl, another data company that eventually went on to buy Wyscout. Not only that, but he started giving presentations at the Opta Pro Forum in London and the Sloan Sports Analytics Conference in Boston.
Just A Bit Of “Fun”
At the Opta Pro Forum, his focus was on a theory called ‘pitch control’ - again, not the easiest to understand but worth a go.
Looking back on that presentation, he recalls:
“I was extremely nervous because I was in a room with people from Liverpool, PSG, Barcelona and whatnot — people who know more about football than I could ever possibly know, I was getting up there showing stuff that I’d largely just worked on for fun for a week or so.” His slide deck emphasised that he was a data scientist, “not a football expert!”
For more context, have a look at the presentation itself and tell me if you could come up with this stuff… in a week “for fun”.
No, me neither. But one particular line stands out for me:
“How much does their presence improve their team’s chances of gaining possession of the ball above the average.”
The model’s aim was to look for “controlling players”, and remember this information was all in the public arena. Clubs hate this information being public. Who can gain an edge when everyone knows the answers? That soon changed, and Spearman was convinced to join the club in 2018. But not before he published another paper at the Sloan conference, entitled ‘Beyond Expected Goals’. This was before Match Of The Day or other media outlets had even started using xG, while Spearman was already moving on to the next ‘frontier’.
Forget Expected Goals
As an aside, my other ‘boss’ Gags Tandon - who started Anfield Index, as well as the pressing data that I took over from him when he no longer had enough time - was once on the same flight to the US as Michael Edwards. In a fairly long conversation, one of the things Edwards confirmed was that the club do not use expected goals behind the scenes, because of the flaws inherent in its assumptions. It’s better than total shots, shots on target and big chances, but the metric still has weaknesses, and what the club themselves have is far more advanced. Remember that when large sweeping statements are made using xG - and remind me not to do it either!
Anyway, back to Spearman’s research, before we move behind the paywall and I explain why we should be excited about the Reds’ future. The abstract is below, and it is packed with high-level analysis (again, in the public arena).
Abstract: Many models have been constructed to quantify the quality of shots in soccer. In this paper, we evaluate the quality of off-ball positioning, preceding shots, that could lead to goals. For example, consider a tall unmarked center forward positioned at the far post during a corner kick. Sometimes the cross comes in and the center forward heads it in effortlessly, other times the cross flies over his head.
Another example is of a winger, played onside, while making a run in past the defensive line. Sometimes the through-ball arrives; other times the winger must break off their run because a teammate has failed to deliver a timely pass. In both circumstances, the attacking player has created an opportunity even if they never received the ball.
In this paper, we construct a probabilistic physics-based model that uses spatio-temporal player tracking data to quantify such off-ball scoring opportunities (OBSO). This model can be used to highlight which, if any, players are likely to score at any point during the match and where on the pitch their scoring is likely to come from.
We show how this model can be used in three key ways: 1) to identify and analyze important opportunities during a match 2) to assist opposition analysis by highlighting the regions of the pitch where specific players or teams are more likely to create off-ball scoring opportunities 3) to automate talent identification by finding the players across an entire league that are most proficient at creating off-ball scoring opportunities.
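For anyone who wants to see the shape of the idea, here is a toy sketch in Python. To be clear, this is not Spearman’s model or anything the club uses: it simply assumes that the opportunity at each spot on the pitch is the product of three made-up ingredients (how likely the ball is to go there, how likely the attacking team is to get there first, and how likely a shot from there is to go in). The pitch size, the single running speed for every player, the formulas and every function name are my own assumptions for illustration only.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
PITCH_LENGTH, PITCH_WIDTH = 105.0, 68.0  # metres
GOAL = (105.0, 34.0)                     # centre of the goal being attacked
PLAYER_SPEED = 5.0                       # m/s, every player assumed equally fast


def control_prob(target, attackers, defenders, sharpness=1.0):
    """Chance the attacking side wins the ball at `target` (toy logistic
    on how much earlier their nearest player arrives than the nearest defender)."""
    t_att = min(math.dist(p, target) for p in attackers) / PLAYER_SPEED
    t_def = min(math.dist(p, target) for p in defenders) / PLAYER_SPEED
    return 1.0 / (1.0 + math.exp(-sharpness * (t_def - t_att)))


def transition_weight(target, ball, spread=15.0):
    """Unnormalised chance the next pass goes to `target` (toy Gaussian
    falloff with distance from the ball)."""
    d = math.dist(ball, target)
    return math.exp(-(d * d) / (2.0 * spread * spread))


def score_prob(target, scale=12.0):
    """Chance of scoring from `target` (toy exponential decay with distance to goal)."""
    return math.exp(-math.dist(target, GOAL) / scale)


def obso_surface(ball, attackers, defenders, nx=21, ny=17):
    """Map each grid-cell centre to its share of the off-ball scoring opportunity."""
    cells = [((i + 0.5) * PITCH_LENGTH / nx, (j + 0.5) * PITCH_WIDTH / ny)
             for i in range(nx) for j in range(ny)]
    weights = [transition_weight(c, ball) for c in cells]
    total_weight = sum(weights)  # normalise so the pass destinations sum to 1
    return {c: (w / total_weight)
               * control_prob(c, attackers, defenders)
               * score_prob(c)
            for c, w in zip(cells, weights)}


# Hypothetical snapshot: ball carrier on the edge of the final third,
# one teammate drifting unmarked towards the far post.
ball = (80.0, 34.0)
attackers = [(80.0, 34.0), (98.0, 45.0)]
defenders = [(95.0, 34.0), (90.0, 20.0)]

surface = obso_surface(ball, attackers, defenders)
hot_spot = max(surface, key=surface.get)
print("Total opportunity:", round(sum(surface.values()), 4))
print("Most dangerous cell centre:", hot_spot)
```

Move the second attacker back into midfield and the total drops, which is the whole point of the abstract: the run has value even if the pass never comes.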
The rest of this article is for subscribers only, and includes information - that will have to stay anonymous to protect the source - relating to some of the analysis the club is currently doing, without - of course - being specific.