Quantifying Luton’s Attacking and Defensive Performance (so far)

 

“Never too high, never too low.” – a line that will be fondly familiar to Luton fans, often preached by John Still during his spell at the Luton helm. 8 games into #OperationPissTheLeague this 17/18 season, we’ve already experienced the highs of an 8-2 victory, the Marek Stech injury-time penalty save at Mansfield to rescue a point, and the 98th minute winner at Wycombe. Alongside that, we’ve also endured the lows of conceding an injury-time winner to Barnet (as well as the overall performance that day) and being down to 10 men after half an hour against Swindon, who promptly played us off the park for the remaining 60 minutes. Throw those moments together with the results so far (currently sitting 4th in League Two on 14 points) and the general feeling among fans seems to be one of hitting-but-not-exceeding expectations. It could be worse, it could be better. The best is, hopefully, yet to come.

The purpose of this piece is to apply a bit of analysis to the season and get under the hood of the performances Luton have been putting in. Hopefully you’ll have read my previous two posts on this blog and will therefore have a feel for what’s coming up in this piece in terms of the analytical concepts I’ll be applying to Luton based on the data I’ve been collecting this season. If you’re familiar with Expected Goals then feel free to skip the next paragraph and move on to the main course. If not, then allow me to quickly explain the concept:

Expected Goals is simply a method used to quantify the quality of chances created and conceded by a team throughout a game. It is calculated by taking a shot and looking at historical data (collected by companies such as my employer, Stratagem) to calculate the conversion percentage of shots from the past that had the same characteristics as the new shot. The basic theory is that 1) the closer the shot is to goal, the more likely it’ll go in, 2) a shot from a central location is more dangerous than a shot from a wide location, and 3) shots with feet are more dangerous than headers from the same distance. As an example, imagine Harry Kane has a header on goal from just outside the 6 yard box following a cross from Danny Rose. You’d look back at all headers from just outside the 6 yard box following a cross, to find it has a conversion percentage of 15%*. This would give Harry Kane’s header an Expected Goal value of 0.15. If you’d like to read more about why Expected Goals is useful information for teams and players, I recommend giving this by @OneShortCorner a read as it explains in simple terms what xG’s useful for and how to interpret it.

*I pulled this number out of thin air.
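For anyone who prefers code to prose, the “look back at similar shots” idea can be sketched in a few lines of Python. This is only an illustration of the concept and not the model behind the numbers in this piece: the `Shot` fields, the invented `history` list and the `expected_goal_value` function are all assumptions made up for the example, and the 15% it produces is just as thin-air as the figure above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    zone: str       # hypothetical location bucket, e.g. "edge_of_six_yard_box"
    body_part: str  # "head" or "foot"
    assist: str     # e.g. "cross"
    goal: bool      # did the shot go in?


def expected_goal_value(history, zone, body_part, assist):
    """Conversion rate of past shots that share the same characteristics."""
    similar = [
        s for s in history
        if (s.zone, s.body_part, s.assist) == (zone, body_part, assist)
    ]
    if not similar:
        raise ValueError("no comparable shots in the historical data")
    return sum(s.goal for s in similar) / len(similar)


# Invented history: 20 headers from a cross at the edge of the six-yard box, 3 scored.
history = (
    [Shot("edge_of_six_yard_box", "head", "cross", True)] * 3
    + [Shot("edge_of_six_yard_box", "head", "cross", False)] * 17
)

xg = expected_goal_value(history, "edge_of_six_yard_box", "head", "cross")
print(f"xG = {xg:.2f}")
```

A real model would use far more shots and far more characteristics, but the principle is the same: the xG value is the share of comparable past shots that ended up in the net.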

On we go…

 

The Team

 

Part 1: Expected Goals and Expected Points

 


 

In the first section of this piece, I want to talk about the 8 matches Luton have played so far and the underlying performances the team has been putting in. What we have here is a table of Luton’s results so far. The bold, black number represents the goals scored by each side in the match, but the number I hope to be of most interest is the one underneath in orange, italic font. These represent the Expected Goal (xG) scores by each side in each game, as per my data (I’ve already written about the 8-2 Yeovil game here).

These numbers can help to tell you a little story about what happened in each match, and draw some early conclusions prior to doing more detailed analysis. For example, the Barnet game was very nearly a non-event, Colchester at home we were comfortable and deserved winners, and Lincoln was probably a fair draw. I don’t really want to go too much into these though as, rather than looking at individual match performance, I’d rather look at the overall performance over the opening 8 games as I think there’s more insight to be gained by doing so.

We’ll come onto that shortly, but first I want to establish another metric which was touched on briefly in the “8-2 Yeovil” piece – Expected Points. Expected Points tell us how many points Luton deserved to pick up from each match based on the chances created by both sides, and are another good way of using xG values to illustrate whether Luton’s results have been in line with performances or whether they’ve been getting lucky/unlucky so far. Expected Points are calculated by simulating the shots taken by each side in a game thousands of times, leaving us with the percentage of times we could expect a team to win, lose, or draw a particular match based on the xG scores by each side. I’ll use the example you’ll have seen before to recap – the Yeovil match where the xG scores were Luton 3.07, Yeovil 1.48. We include Yeovil’s penalty (xG value of 0.78) in the simulation even though you’ll notice it is listed as “+1pen” in the match table. I’ll explain why penalties are listed separately later; in this instance it’s included because it was a shot that contributed towards Yeovil’s likelihood of scoring during the game and therefore impacted Yeovil’s chances of gaining a result.

The results from those simulations were as follows:

Luton Win (74.72%), Draw (15.77%), Yeovil Win (9.51%)

We calculate Luton’s Expected Points from the game simply by multiplying Luton’s win percentage by 3 (the amount of points Luton would receive for winning the game) and multiplying Luton’s draw percentage by 1 (the amount of points Luton would receive for drawing the game) and add the two together, as follows:

(0.7472*3)+(0.1577*1) = 2.3993, or 2.4 when rounded

So, Luton gain an Expected Points score of 2.4 for the match. This doesn’t tell us that much on its own, but over the course of a season and a period of games this will have more value, as we can use it to see whether Luton have been getting the results their performances deserved, as mentioned before. This piece is all about the season so far so let’s take a look at the Expected Points scores for the other matches:
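To make the “simulate the shots thousands of times” step concrete, here is a minimal Python sketch. It assumes each shot is an independent coin-flip weighted by its xG value, which is a simplification and not necessarily how the simulator used for this piece works internally. The two shot lists are hypothetical (the real per-shot values aren’t published here), and `simulate_match` and `expected_points` are names invented for the example.

```python
import random


def simulate_match(shots_a, shots_b, n_sims=10_000, seed=0):
    """Replay a match n_sims times; return (win, draw, loss) rates for team A."""
    rng = random.Random(seed)
    wins = draws = losses = 0
    for _ in range(n_sims):
        # Each shot goes in with probability equal to its xG value.
        goals_a = sum(rng.random() < xg for xg in shots_a)
        goals_b = sum(rng.random() < xg for xg in shots_b)
        if goals_a > goals_b:
            wins += 1
        elif goals_a == goals_b:
            draws += 1
        else:
            losses += 1
    return wins / n_sims, draws / n_sims, losses / n_sims


def expected_points(p_win, p_draw):
    """3 points for a win, 1 for a draw, weighted by how often each happens."""
    return 3 * p_win + 1 * p_draw


# Hypothetical shot lists, NOT the real chances from any Luton match.
home_shots = [0.35, 0.20, 0.10, 0.08, 0.05]
away_shots = [0.78, 0.06, 0.04]  # a penalty plus two speculative efforts

p_win, p_draw, p_loss = simulate_match(home_shots, away_shots)
print(f"simulated xP for the made-up match: {expected_points(p_win, p_draw):.2f}")

# Plugging in the Yeovil percentages quoted above reproduces the 2.4.
print(round(expected_points(0.7472, 0.1577), 2))
```

Fixing the random seed keeps the made-up example repeatable; with a different seed the simulated percentages will wobble slightly, which is why the simulation is run thousands of times in the first place.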

xP Table

I appreciate this is boringly similar to the xG table above but I don’t feel the individual numbers are that important. I want to instead look at how Expected Points compare with the Actual Points gained so far and see where we’re at:

Actual Points (8 games): 14

Expected Points (8 games): 13.36

So, a near-perfect hit which, in simple terms, means we’ve pretty much gotten what we’ve deserved for the performances we’ve put in. The slight exception is the Barnet game, where there is definitely an argument to be had that we were unlucky to lose the game (though an equally strong argument that we didn’t deserve to win the game!) so no one had any complaints when Barnet’s Jack Taylor had the temerity to rudely awake everyone from their afternoon nap by curling home inside the far post from 20 yards for a 92nd minute winner.

This is actually a great case for why we track Expected Points though – this stuff *should* even itself out over the season. For every match we lose where we “deserved” at least a point, there’ll be a match we win where we could consider ourselves a little fortunate to do so.

Part 2: Under The Hood Performances

 

Let’s get more stuck in to the numbers that power the observations made so far – Luton’s Expected Goals totals. Here are three tables:

-First, the sum total of all of Luton’s Expected Goals and the sum total of Luton’s opponents’ Expected Goals, not including penalties.**

xG Total

 

-Second, the Expected Goals totals averaged over 8 games.

xG Per Game

 

-Third, the total number of shots, shots on target (SOT) and the average xG value per shot as recorded in my dataset.

xG Shot Stats

 

Conclusions: These are pretty good. It’s only 8 games (very small sample) but, for a team that’s harbouring promotion ambitions, it goes without saying that you want to see your team creating better-quality chances more often than your opponents, and that’s exactly what Luton have done so far. This is displayed by the superior difference in xG totals so far, but also in the xG per Shot statistic. It’s fairly self-explanatory but, for clarity, it’s the average xG value of the shots taken by Luton and their opponents, calculated by dividing the xG Sum Total by the total number of shots. This is another feather in Luton’s cap as what it tells us is that Luton are taking a good quality of shot when they do decide to pull the trigger, whilst limiting their opponents to taking shots of lesser quality.

I quite like the metaphor Ted Knutson used in his piece on Brentford’s start to the season when he compared a team’s xG per shot to rolling a die. Looking at it in this way, Luton are rolling an ~8-sided die every time they shoot, whereas they’re currently forcing their opponents to roll an ~11-sided die. This is to their advantage and is even more favourable when we can see that Luton are also shooting/rolling that 8-sided die more often than their opposition are shooting/rolling their 11-sided die in their games so far. Given we’ve also had a tough schedule with away games at Mansfield, Lincoln, and Wycombe, games where even the most optimistic fan would probably not expect Luton to dominate in shot quality and quantity, this is not bad at all. So early signs are encouraging and it’ll be interesting to see how these averages hold out over a larger sample size and a few more games. Certainly, if we are to maintain these standards throughout the season, we should go close, but my gut feeling suggests these numbers may need improving on slightly at both the attacking and defensive ends of the pitch if we’re to really push on into league-winning form.
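The die comparison is just the xG per shot figure turned upside down: a shot worth 0.125 xG goes in about once every 8 attempts, so it behaves like an 8-sided die. A small Python sketch of that arithmetic follows. The totals below are hypothetical numbers picked only so the results land on the 8- and 11-sided dice mentioned above; they are not the figures from the tables, and the function names are invented for the example.

```python
def xg_per_shot(total_xg, shots):
    """Average chance quality: xG sum total divided by number of shots."""
    return total_xg / shots


def die_sides(per_shot):
    """Number of sides on the 'die' a team rolls each time it shoots."""
    return 1 / per_shot


# Hypothetical totals, chosen only to reproduce the dice in the text.
team = xg_per_shot(total_xg=12.5, shots=100)
opponents = xg_per_shot(total_xg=8.0, shots=88)

print(f"team:      {team:.3f} xG per shot, ~{die_sides(team):.0f}-sided die")
print(f"opponents: {opponents:.3f} xG per shot, ~{die_sides(opponents):.0f}-sided die")
```

Fewer sides is better for the team shooting, and rolling the smaller die more often than the opposition compounds the advantage.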

**The reason penalties are excluded from these totals is that creating quality chances and preventing your opponents from creating quality chances is a repeatable skill – whereas winning penalties is not. And that’s ultimately what we’re trying to track here: a gauge on how Luton are likely to perform going forward based on things we know to be repeatable.**

 

The Players

 

Town defender Jack Stacey

Part 3: Player xG

 

It goes without saying that a team’s expected goals total is made up of each of its players’ own personal contributions to that total, so that’s exactly what we’re going to look at next. Again I must stress that 8 games is a small sample size, with very few players completing more than 5 games’ worth of minutes so far this season – these will be much more interesting and insightful once the dataset has beefed up a bit. I’ve filtered out players that have played fewer than 200 minutes so far this season from these lovely graphs below because it is absolutely, certainly too early to draw any kind of conclusions about them from a statistical point of view.

First up, let’s take a look at the sum totals for each of the players so far:

 

xG Totals

 

Let’s start with the obvious: Danny Hylton and James Collins have been getting on the end of Luton’s best chances overwhelmingly so compared to their team mates. It’s no big problem and if anything is quite intuitive to be relying on your strikers to be getting on the end of your best chances, so this is fine. What’s interesting is that Hylton’s played ~100 minutes less than Collins and is already ahead of his striking colleague in xG, but we’ll go into this more in a paragraph or two.

The 2nd thing that should be immediately obvious, to Luton fans at the very least, is the player nicely settled in at number 5 on this list and amongst Andy Shinnie, Olly Lee, Pelly-Ruddock Mpanzu and Luke Berry – players who are more readily considered to be the supporting cast in an attacking sense. Yes, Dan Potts has so far contributed the 5th-highest amount to Luton’s xG total this season. This was something that I noticed when collecting this data after each match: Potts was getting on the end of an unusually large number of set pieces, certainly more than conventional wisdom would tell you he should be getting on – I can’t be the only one who, before the start of this season, wouldn’t have put Potts up there with the biggest threats from set pieces in the team. It’s credit to him though and definitely something to keep an eye on, though it must be stressed that nearly half of his current 0.88 xG total came from his goal against Colchester, by far the best chance he’s gotten on the end of to date. The rest of his chances certainly haven’t been as clear-cut but it’s undoubtedly a positive that he’s been posing any kind of threat in the opposition penalty area, especially from set pieces. Fingers crossed he can add to his goal tally before teams start marking him more tightly.

One of the problems with looking at xG totals is that they’re unfairly weighted towards players who have played more minutes – naturally they’ve had more time on the pitch to get more scoring opportunities. We want to see who’s contributing the most for their time on the pitch and we do this by adjusting the data into a “per 90 minutes” number, i.e. what the player’s average contribution would be for every 90 minutes they spend on the pitch. It levels the playing field for those players who haven’t played as many minutes as some of their teammates. Feast your eyes:

 

xG_90
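For the curious, the per-90 adjustment and the 200-minute cut-off amount to very little code. The Python sketch below uses three hypothetical players with invented totals and minutes (they are not taken from the chart), and `per_90` and `MIN_MINUTES` are names chosen purely for illustration.

```python
MIN_MINUTES = 200  # players below this are left off the chart


def per_90(total, minutes):
    """Scale a season total to what it would be over a full 90 minutes."""
    return total * 90 / minutes


# Hypothetical (xG total, minutes played) pairs, NOT the real squad data.
players = {
    "Striker A": (3.0, 540),
    "Midfielder B": (0.8, 720),
    "Substitute C": (0.4, 30),
}

qualified = {
    name: per_90(xg, minutes)
    for name, (xg, minutes) in players.items()
    if minutes >= MIN_MINUTES
}

for name, value in sorted(qualified.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2f} xG/90")
```

The cut-off matters because the scaling exaggerates tiny samples: the made-up substitute here would post 1.20 xG/90 from half an hour of football, which says very little about what he would do over a season.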

 

Sadly Elliot Lee and his 1.3 xG/90 miss out on this one on the basis he’s played a meagre 30 minutes of football this season (1 shot, 1 goal is a sterling contribution for half an hour’s play, though). Let’s concentrate on what we do have, and Hylton and Collins’ numbers continue to be encouraging. Hylton’s currently taken 12 shots from open play, giving him an xG p/shot of 0.24. He’s essentially averaging a couple of 1-in-4 shots every game so far which is decent enough and should see him able to add to his current goal tally of 1 on a regular basis.

James Collins is the more interesting player for me to talk about right now though. He was considered to be a very good finisher before he came to Luton and has certainly added weight to that argument since he’s been here. However, 11 shots / 2.65 xG / 0.38 xGp90 doesn’t really scream 6 goals to me, which is what he has scored so far. We can look at the likelihood of Collins scoring 6 goals from the shots he’s taken using Danny Page’s Longterm Expected Goals Simulator. See below:

James Collins

The Simulator believes Collins should more likely have scored 2 or 3 goals from the chances that have fallen to him so far, with it actually being slightly more likely that he would’ve scored 0 goals than 6! This is far from a criticism of Collins as it shows he’s finished his chances very well, but the point is that we cannot expect him to keep scoring at the same rate if his (perfectly good) xG/90 numbers continue as they are. As you’ll be tiring of hearing me say now, 8 games is a small sample and Collins is far from the first player to experience a hot run of finishing. To put it into context, if Collins continues to receive the chances he’s currently receiving at the same rate as he’s so far received them, then you’d be pretty confident that he could be getting 15-20 goals in a season which is exactly what you want from him. All is well.
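The shape of that result can be reproduced with a rough Python sketch. This is not the Simulator’s code: it calculates the goal distribution exactly instead of simulating it, it treats every shot as independent, and it assumes all 11 shots were worth the same (2.65 xG split evenly), which is not true of the real chances. `goal_distribution` is a name invented for the example.

```python
def goal_distribution(shot_xgs):
    """Exact probability of scoring 0, 1, 2, ... goals from independent shots."""
    dist = [1.0]  # before any shot: certain to have 0 goals
    for xg in shot_xgs:
        new = [0.0] * (len(dist) + 1)
        for goals, p in enumerate(dist):
            new[goals] += p * (1 - xg)  # this shot is missed
            new[goals + 1] += p * xg    # this shot is scored
        dist = new
    return dist


# Assumption: 11 equal-value shots summing to 2.65 xG. The real shots vary.
shots = [2.65 / 11] * 11
dist = goal_distribution(shots)

for goals in range(7):
    print(f"{goals} goals: {dist[goals]:.1%}")
```

Even under these simplified assumptions the picture matches the one described above: 2 or 3 goals are the most likely outcomes, and a blank is more likely than a six-goal haul.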

Just for further illustration, I’ve giffed up a couple of his more impressive finishes that came from what my model terms to be low-quality chances: his hat-trick goal vs Yeovil and his beauty against Colchester.

 


 

Part 4: Player xA

 

The last section of this piece will focus on the creative forces of the side and for that I’ll need to introduce the Expected Assists (xA) metric. Every time a shot is recorded along with its Expected Goal value, I’ve been tracking the player making the assist (Primary Assist) and the player making the pass to the Primary assister (Secondary Assist), with each player credited a value for their part in creating the chance. Rather than just saying how many Key Passes (passes before the shot) each player has made, we’re assigning a quality value to each of these passes. In the same way not all shots are equal, not all shot assists are equal either. Now, as with the Expected Goals model, I can’t/won’t go into how exactly the values are credited to each of these players as it’s not dissimilar to the methods used at Stratagem, but let’s not let that get in the way of more data-driven insight! I present to you the xA totals:

 

xA Totals

 

Would you just look at that. TWO defenders in the top-two places, no fewer than THREE defenders in the top 4. Let’s start our analysis at the top of the pile.

Jack Stacey is a player I’m really growing to like and, with each passing performance, I’m becoming more and more convinced that he must run to and from matches like a young schoolboy – his engine hasn’t once seemed to be exhausted by his incessant patrolling of the right flank for 90 minutes every game. Incidentally, this fact hasn’t been lost on Luton fans and is providing one of the big talking points amongst the fan-base right now as we currently have a home-grown, England U20 international (an extremely rare commodity at League Two level) right-back also in the squad in the shape of James Justin. The fact that there is even a debate as to whether Justin should come back into the side when fit again shows how well Stacey has done – most fans, in particular Luton’s, want to see home-grown players in the starting XI and especially one whom the club did very well to keep hold of in the summer amid interest from higher up the English football pyramid. Stacey, however, must be giving Nathan Jones an almighty headache in this position. His attacking output has been excellent so far as he’s always offering an option on the right flank, particularly necessary because of Jones’ formation of choice – the narrow 4-4-2 diamond – and it’s showing up in the data that his final ball isn’t too bad either.

Alan Sheehan is another weapon in this team as his set-piece deliveries are up there with the best in this division (as mentioned earlier, Dan Potts has gobbled up a lot of those quality Sheehan deliveries from corners). And this is all from a centre-back. It’s fair to say that if he was 3 inches taller and had a higher top speed, he’d have had a much more sustained period at a higher level of football and would almost definitely still be playing there now. Despite his suggested physical limitations at this level (where coming across an extremely quick or extremely strong striker is an almost-weekly occurrence), he is actually a fairly important cog in creating further goalscoring opportunities for a team that is posting pretty good defensive numbers as it is with him in it.

Again though, we know the totals are slightly slanted in Stacey and Sheehan’s favour as, collectively, they’ve missed a grand total of 0 minutes this season. Let’s look at the p90 numbers:

 

xA_90

 

So there you go then: a level playing field and Stacey – the right back – is still our biggest creative contributor, which shows how he is excelling in the demands placed on him in that right back position of the 4-4-2 diamond. Andrew Shinnie, the “tip” of that diamond and largely expected to be the biggest creative force in the team, is just behind him. In my opinion, it’s definitely a good sign that there’s no outstanding creator the team is reliant on; if one of these players were to get injured, their creative output shouldn’t be too sorely missed with it spread so widely amongst the team.

We’re nearing the end now but here’s one last graphic to sum up everything that’s been said above in the Player section:

 

Player xG_90 vs xA_90

 

There’s no new insight to gain here – players towards the top of the graph have been the greater creators so far, players towards the right of the graph have been the greater goal-threats so far, with the + marking you see embedded on the graph showing the average for both. So players above the line are above-average creators in the team, players to the right of the line are the above-average goal-threats. This is decent viewing at this stage, but will only become more valuable once the sample size has grown and players have played more minutes, drops or improvements in form have occurred, and we can more confidently say that this is a player’s average contribution to the team.

 

Any readers who are in slight doubt of the use of expected goals and what it means – the video below should help reinforce the point that no chance is ever a 100%, nailed-on, my-gran-could-have-scored-that, certainty.  It makes sense to reward teams for creating chances that are as close to 100% in probability as possible even when the chance is missed, and that’s what Expected Goals does. I genuinely am grateful if you’ve made it this far and I’d be glad to listen to any feedback or questions you may have regarding anything you have read in this piece, even if you just want to open up a discussion about something you’ve read above, I’m all ears! Leave a comment or find me on twitter at @olivermpw_. Thanks for reading.

COYH

 

 

 

To calculate Expected Points, I’ve once again been using Danny Page‘s excellent Match Expected Goal Simulator.


A story of how Luton have already achieved 1/3rd of a Leicester this season.

Before we get started, I thought it’d be best to make absolutely sure that anyone reading is up to speed with the concepts mentioned previously and in this forthcoming piece of writing. I’m not sure the best use of the blog is to provide yet another explanation of Expected Goals (xG) and its uses; other people have already done so in a far more articulate and interesting way than I would manage. If you want to take 5 minutes just to increase your understanding, then I’d highly recommend OneShortCorner‘s set of posts that is essentially Football Analytics for Dummies, his 4th piece in the series being the one that covers xG. For a deeper dive into what xG actually tells us and how to interpret it, this piece by Danny Page (more from him later) is by far the best post I’ve read on explaining what an xG number actually means. Once you’re feeling comfortable with what we’re talking about here, please read on…

In my debut post on this blog, I spoke about the project I’ll be running this season; self-collecting xG numbers for every Luton Town league match this season. The plan is to write about the findings occasionally when I have an interesting idea or when there’s an interesting match to write about. It’s important to note now before we dig in to those collected numbers – if you were hoping to read a full methodology of how I collect my data and run my xG model, then I’m sorry to disappoint you but this is secret sauce. I’m collecting the data in a very similar way to how I would in my work for StrataBet so to reveal anything about how these numbers are produced would be infringing on what is really StrataBet’s intellectual property and the essence of their business. Needless to say, it will have to remain under wraps as I like my job there and would like to keep it.

Enough waffling, I present you the pièce de résistance of what this blog is for: the numbers. Luton have so far played two games in League Two – Yeovil (H) and Barnet (A). I was initially going to write about both matches in this piece, but the longer it went on the more obvious it became that I should be focusing squarely on the opening day match against Yeovil which you may or may not be aware finished in an 8-2 victory for Luton, a typically slow start to the season. Getting stuck in straight away, this is actually a perfect case study for what xG can be useful for – intuitively have you ever watched a game, including Luton vs Yeovil (if you haven’t, here’s the goals which are worth watching just for Alan McCormack’s thunderb*****d), and felt that the final score should’ve been 8-2? It goes without saying that this is a freak result, so based on the chances created by both sides, what would be the likelihood of that occurrence? To answer that question, we need to sum up all of the xG numbers for each individual chance from the game and it looks a little something like this:

 

xG Chart
It may surprise you to learn that I’m somewhat new to dataviz.

Note: Yeovil’s total includes a penalty, which accounts for 0.78 of their xG total as penalties are converted ~78% of the time. Their Open Play xG was 0.7.

Now we have the numbers, what can we use them to tell us? For that we need to refer back to Danny Page and his article mentioned at the beginning of this piece. In the article, he states:

 “In my opinion, Win Percentage would be the most reasonable determination if you’re tweeting out expected goal scores. If you’re including a picture in your report, graphing the goal difference will show the variance in possible results, and allows you to display the probability of each result.”

Sound advice indeed, so using his Match Expected Goals Simulator, we can show what the Result Percentage would’ve been for both sides, the Points Per Game both sides would typically win based on the game, the probabilities of each Goal Difference occurring, and the likelihood of each side scoring a certain number of goals.

 


Team A = Luton (Red). Team B = Yeovil (Blue).

Some explanation:

Result Percentage is calculated from simulating the game 10,000 times, with each result grouped together and then expressed as a percentage. Luton won 74.72% of the sims.

Points Per Game is calculated by multiplying the Win Percentage by 3 (the points available for a win), multiplying the Draw Percentage by 1 (the points available for a draw) and adding them together.

(Win % * 3) + (Draw %) = PPG

In this case, (3*0.7472) + 0.1577 = 2.399

This is something I’ll be keeping track of throughout the season, and plan to chart Luton’s actual points won against their expected Points further into the season to gauge whether they’ve been picking up the results they “should” have.

Goals Scored: ideally you’d be able to see the exact percentages for each unit, because then you’d be able to see that there was just a 0.17% chance that Luton would score 8 goals, based on the chances they created. The best way to articulate why it was so unlikely is to watch the goals and look at Olly Lee’s goal (00:16), Alan McCormack’s goal (00:35), and James Collins’ hat-trick goal (01:38). Pause the game just as they’re taking the shot, and think about how often you’d estimate a shot from those positions would result in a goal. Would it be 1 in 5? 1 in 10? 1 in 20? As it happens, in this instance Luton scored 3 goals from these chances as a result of some excellent finishing. xG doesn’t measure finishing skill, however; it measures Chance Creation, and based on those chances it is extremely unlikely that Luton should have scored 3 times from them, which obviously contributes to the low probability that Luton would, in fact, score 8.

Goal Difference quite clearly shows a lot of colour for “-minus” goal differences, i.e. Luton wins. For me, I think the main takeaway is that there was a 51.11% chance Luton would win by 2 goals or more, a better-than-coinflip chance.

If you’ve made it this far, hopefully your patience will be rewarded because I think the most exciting takeaway is yet to come. Thanks to the above Goals Scored feature on Danny Page’s Expected Goals Simulator, we now have percentage probabilities for the likelihood that Yeovil would score exactly 2 goals, and for the likelihood that Luton would score exactly 8 goals.

So what?

So we can calculate what the chances are the game would’ve finished as an 8-2 win for Luton!

The % chance of Yeovil scoring exactly 2 goals was 37.49%

The % chance of Luton scoring exactly 8 goals was 0.17%

Multiplying these two together gives you a 0.064% chance of the match finishing 8-2, or a 1569/1 shot (or just under 1/3rd of a Leicester).
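The arithmetic behind that line, as a short Python sketch. It assumes the two teams’ goal tallies are independent of each other (which is what multiplying the two probabilities implies) and takes “a Leicester” to mean the 5000/1 widely quoted for their 2015/16 title win. The variable names are just for illustration.

```python
p_yeovil_two = 0.3749   # chance Yeovil score exactly 2
p_luton_eight = 0.0017  # chance Luton score exactly 8

# Assumes the two tallies are independent of each other.
p_scoreline = p_yeovil_two * p_luton_eight
one_in = 1 / p_scoreline

LEICESTER_ODDS = 5000  # assumption: the 5000/1 title odds from 2015/16

print(f"chance of 8-2: {p_scoreline:.3%}")
print(f"roughly 1 in {one_in:.0f}")
print(f"{one_in / LEICESTER_ODDS:.2f} of a Leicester")
```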

Thanks for reading and constructive feedback is definitely welcome – feel free to hit me up on Twitter or in the comments.

Introduction to: Luton Town FC & Advanced Stats

I must be the 30,000th person to add their voice to the football blogging rat race, but the aim of this endeavour is to hopefully provide something different to what is currently out there. A quick story to help illustrate what to expect from this page and the motivation behind starting it…

In my current job role at StrataBet, I watch matches from a variety of leagues worldwide and collect data for their database, most of which revolves around the chances created by each team. This data is used to generate ratings for each league StrataBet covers and to inform trading decisions.

Using these skills, I’m going to be collecting similar data on all matches played this season by my supported team, Luton Town, and will be posting statistical insights drawn from this data for both teams in each match. This is a personal project and is not affiliated with StrataBet so it is not to be seen as official content from them – I’ve wanted to see more depth to the data available for League Two and Luton for a while now rather than just shot or shot on target numbers so this is merely something to bridge that gap, posting content that I would want to see from other sources if it were possible.

Statistical insight you say?…

Yep, so if you’re reading and you’re not too familiar with the current in-vogue metrics, then you might want to buff up on Expected Goals (xG) and Expected Assists (xA), one of which made its debut on Match of the Day this weekend. With each future post on here I’ll provide in-depth explanations of any metric or idea used with the data, but to briefly explain: Expected Goals is a measurement of Chance Quality, Expected Assists is a measurement of Creation Quality. Using this data should provide reasonable insight into how Luton have done against their League Two opponents and which players have been particularly good (or bad) contributors to that, in an attacking sense.

Right now I’m not too sure how often content will come out, and whether it will be in tweet or blog form. I have a couple of ideas of what to produce with the data once the season is a few matches old and there is some depth to the data so keep your eyes peeled on here and on my Twitter account for any announcements (strong word) about upcoming posts.

Thanks for reading and fingers crossed I’ll see you here again next time…