How far ahead of the average batsman of his era is Steven Smith?

Smith is clear of the average Test batsman in the top seven by over 30 runs

The year 2018 saw remarkably tough conditions for batting in Test cricket, with West Indies, South Africa and England hosting teams on devilish pitches that aided fast bowlers. Batsmen playing in the top seven averaged a measly 31.46 in 2018. Since 1946, only three other years have returned a lower average. This rose to 34.68 in 2019 and then to 36 in the ten Tests played in 2020, but Test-match batting in general has been difficult in the past few years. Apart from adverse conditions for batting, the idea that shorter formats have made batsmen less disciplined has been proposed as an explanation for this.

Investigating the exact reasons for this noticeable fall in batting numbers requires nuanced analysis of multiple factors, which is beyond the scope of this piece. Here, I will break Test batting up into different phases, and analyse which players have outperformed the average batsman and by how much. After all, Steven Smith averaging over 60 in an era where batting is hard should be put in context, vis-à-vis someone averaging the same in batting-friendly times.

We will go backwards from 2020, looking over 16 eras of four years each, ending with 1957. Four years make one touring cycle in Test cricket, so a player's performance over that period is likely to cover a variety of conditions over a big enough sample size. Also, before 1957, there are not enough players with decent sample sizes over any four-year period after the Second World War.
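The era boundaries follow mechanically from that description. Here is a minimal sketch of how they could be generated; the function and constant names are my own and purely illustrative.

```python
# Build 16 four-year eras, walking backwards from 2020.
# Names here (build_eras, LAST_YEAR, ...) are illustrative, not from any library.
LAST_YEAR = 2020
ERA_LENGTH = 4
NUM_ERAS = 16


def build_eras(last_year=LAST_YEAR, era_length=ERA_LENGTH, num_eras=NUM_ERAS):
    """Return (start_year, end_year) tuples, most recent era first."""
    eras = []
    for i in range(num_eras):
        end = last_year - i * era_length
        start = end - era_length + 1
        eras.append((start, end))
    return eras


eras = build_eras()
print(eras[0])   # most recent era: (2017, 2020)
print(eras[-1])  # earliest era: (1957, 1960)
```

Counting back this way gives the 2017-2020, 2013-2016 and 2009-2012 cycles referred to later in the piece.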

We will consider players batting in the top seven batting positions only. To begin, let us look at the averages by phase:

Averages have fluctuated around the 30-run mark through modern Test history, but the 2017-2020 number, at 34.07, is the third lowest since 1957. Before that, conditions were batting-friendly in the 2000s, with averages hovering in the late 30s.

Although the performance of the average batsman has diminished recently, how do the top players compare to him in each era? Have the elite players maintained their high averages in difficult batting eras? I consider all batsmen with at least 20 innings in an era, and take the top five by batting average, comparing them with the average of all players in that era.
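As a rough sketch of that selection step, the snippet below filters one era's records by innings and ranks the rest by average. The player records are made up for illustration, and the cutoff is taken as 20 or more innings, the reading used elsewhere in this piece.

```python
from typing import NamedTuple


class BatsmanEra(NamedTuple):
    """One batsman's totals within a single four-year era (hypothetical data)."""
    name: str
    innings: int
    runs: int
    dismissals: int

    @property
    def average(self) -> float:
        # Batting average is runs per dismissal, not runs per innings.
        return self.runs / self.dismissals


MIN_INNINGS = 20  # assumed cutoff: 20 or more innings in the era


def top_by_average(records, n=5, min_innings=MIN_INNINGS):
    """Return the n qualifying batsmen with the highest averages."""
    qualified = [
        r for r in records
        if r.innings >= min_innings and r.dismissals > 0
    ]
    return sorted(qualified, key=lambda r: r.average, reverse=True)[:n]


# Made-up records, only to exercise the function.
records = [
    BatsmanEra("Player A", 40, 2400, 36),
    BatsmanEra("Player B", 12, 900, 10),   # high average, too few innings
    BatsmanEra("Player C", 30, 1350, 28),
    BatsmanEra("Player D", 25, 1000, 24),
    BatsmanEra("Player E", 22, 700, 21),
    BatsmanEra("Player F", 35, 1600, 32),
    BatsmanEra("Player G", 20, 1100, 19),  # exactly on the cutoff
]

for batsman in top_by_average(records):
    print(f"{batsman.name}: {batsman.average:.2f}")
```

Player B has the best raw average in this toy data but is left out, which is the point of the innings cutoff: a short purple patch should not count as an era-defining performance.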

The most prominent takeaway: the top batsmen in the last 20 years have mostly averaged over 60, although the average player's performance has not risen past the high 30s. The modern standard for an elite player is a 60 average over a four-year cycle, as opposed to a figure that was in the mid-50s earlier.

Looking at the last two bars, the overall batting average has gone down from 38 to 34 between the last two eras, and the average of the top five has fallen almost in parallel: from 61 to 58.

Since the 2009-12 period, batting averages have fallen for the average player as well as for the elite batsman.

How far are the top players in each era from the average batsman of that period? To quantify this rigorously, I will use a number called the z-score, which tells us exactly this.

Consider the distribution of averages in the last era (2017-2020) below, which takes into account batsmen who have played at least 20 innings. This "distribution" of averages effectively shows the probability of a player's average falling in a given bracket. For instance, high averages, which are naturally less probable, have very low counts, whereas it's highly probable that a player averages in the 35-40 run region.

This distribution can be described by two numbers. The "mean" is the mean batting average of all the players who have batted 20 or more times, and the "width" is the standard deviation of the collection of all these batting averages. Note that the "mean" here is 35.7 (as opposed to 34.07, which was the average of all innings), because now we only consider players with enough innings under their belts. This mean of 35.7 is the average of the averages of the 69 batsmen who make the cut (and not the average calculated by adding all the runs and dividing by their total dismissals).
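The difference between the two kinds of average is easy to see on a toy example. The figures below are invented, and using the population standard deviation (`pstdev`) for the "width" is my assumption; the sample version (`stdev`) would be an equally reasonable reading.

```python
from statistics import mean, pstdev

# Hypothetical (runs, dismissals) totals for three qualifying batsmen.
players = {
    "Player A": (1200, 20),
    "Player B": (900, 30),
    "Player C": (400, 10),
}

# Each player's own batting average: runs divided by dismissals.
player_averages = [runs / outs for runs, outs in players.values()]

# Pooled average: add up all the runs, divide by all the dismissals.
total_runs = sum(runs for runs, _ in players.values())
total_outs = sum(outs for _, outs in players.values())
pooled_average = total_runs / total_outs

# Mean of averages: every player counts equally, however often he batted.
mean_of_averages = mean(player_averages)

# "Width" of the distribution (assumption: population standard deviation).
width = pstdev(player_averages)

print(f"pooled average:   {pooled_average:.2f}")
print(f"mean of averages: {mean_of_averages:.2f}")
print(f"width:            {width:.2f}")
```

The two averages differ here because the pooled figure weights each player by his dismissals, while the mean of averages gives every player one vote.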

Notice that this distribution of averages makes the shape of a bell curve (which is plotted in blue). The peak of the curve is at 35.7. In this era, the short bar (representing one player) in the 65-70 average bracket is Smith, with an average of 67.3. That puts him about 31.5 runs ahead of the average player in this era.

However, the width of the distribution matters as well. Consider the two distributions in the graph below, from two different eras, which show the chances of a player having a given batting average.

Although they both peak at 40 runs, the grey curve is wider. Consider two players, one averaging 60 in the blue era, and the other averaging the same in the grey era. Both are 20 runs higher than the average, but the feat of achieving a 60 average is much rarer in the blue era. The z-score rewards this by factoring in the width of the distribution of averages in an era. (For the mathematically inclined, the "width" is the standard deviation of the bell curve.)
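To put rough numbers on "much rarer", here is a sketch that treats both eras as exact bell curves centred on 40. The widths of 8 and 12 runs are invented for illustration, since the graph does not state them, and real distributions of averages are only approximately normal.

```python
from statistics import NormalDist

ERA_MEAN = 40.0   # both curves peak at 40 runs
TARGET = 60.0     # the batting average we are asking about

# Assumed widths (standard deviations); these are not taken from the graph.
blue_era = NormalDist(mu=ERA_MEAN, sigma=8.0)    # narrower curve
grey_era = NormalDist(mu=ERA_MEAN, sigma=12.0)   # wider curve


def chance_of_at_least(era, average):
    """Probability that a player's average is at or above the given value."""
    return 1.0 - era.cdf(average)


p_blue = chance_of_at_least(blue_era, TARGET)
p_grey = chance_of_at_least(grey_era, TARGET)

# The same 20-run lead is worth more standard deviations in the narrow era.
z_blue = (TARGET - blue_era.mean) / blue_era.stdev
z_grey = (TARGET - grey_era.mean) / grey_era.stdev

print(f"blue era: z = {z_blue:.2f}, chance = {p_blue:.4f}")
print(f"grey era: z = {z_grey:.2f}, chance = {p_grey:.4f}")
```

Under these assumed widths, a 60 average is several times less likely in the narrow era than in the wide one, even though the lead over the mean is 20 runs in both.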

The z-score is defined as z = Distance From The Average / Width Of The Distribution

Going back to Smith in 2017-2020, he is 31.5 runs ahead of the average batsman, and the width of that distribution is 9.7 runs, so his z-score for this era is 31.5 / 9.7 = 3.25.
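The calculation is short enough to write out directly. This is a minimal sketch using the rounded figures quoted in this piece (an average of 67.3, a mean of 35.7 and a width of 9.7); the function name is my own. Because those inputs are rounded, the result can differ from the quoted 3.25 in the second decimal place.

```python
def z_score(player_average, era_mean, era_width):
    """Distance from the era's mean, measured in units of the era's width."""
    if era_width <= 0:
        raise ValueError("era_width must be positive")
    return (player_average - era_mean) / era_width


# Rounded figures for Smith in 2017-2020, as quoted in the text.
smith_z = z_score(player_average=67.3, era_mean=35.7, era_width=9.7)
print(f"Smith, 2017-2020: z = {smith_z:.2f}")
```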

The z-score tells us the distance of a player from the average batsman, factoring in the difficulty of scoring high averages in a given era.

Who are the top scorers in each era considering this metric?

Remarkably, the two players most frequently in contention for the title of the best Test allrounder feature twice each on this list. Garry Sobers averaged 71 in two distinct four-year cycles, with z-scores of 2.53 and 2.35. Jacques Kallis averaged slightly lower but with high z-scores of 2.2 in both eras he topped.

Imran Khan is the other allrounder on the list, just making the cut with 20 innings from 1989 to the end of his career, a period in which he scored two hundreds and seven fifties.

A z-score of 3 has been breached just four times: by Dilip Vengsarkar (who has the highest z-score, of 3.33), Steve Waugh, Sachin Tendulkar, and most recently Smith since 2017.

Looking at the table of the top three players by z-score in each phase below, we see the toppers are usually a fair distance ahead of the second-ranked batsman. The exceptions are Sobers and Graeme Pollock close together in the four years from 1965, Zaheer Abbas and Clive Lloyd almost neck and neck from 1981 to 1984, and Smith hot on the heels of Kumar Sangakkara from 2013 to 2016.

Elite batsmen are mostly at a z-score of 2 to 2.5 in any era, with a score of 3 or greater being a rarity.

We can use these z-scores to evaluate long careers by considering the ease of batting in each four-year phase a player has played in, since the z-score inherently accounts for the run-scoring probabilities of each era. For instance, Tendulkar has played in six different phases, and had a very positive z-score in five out of those six, showing remarkable consistency in performance over a very long career.

We can average these z-scores over all phases to get a career z-score for Tendulkar. This will accomplish the task of scaling his run-scoring by the difficulty of run-scoring in those eras to present how far ahead he was of his peers overall.

We will average the z-scores proportionally, considering the number of innings played in each era. So, if Tendulkar has played 40 innings in a phase where he has a z-score of 2.0, and 60 innings in the next phase, with a z-score of 1.0, his overall z-score will be ( 2 * 40 + 1 * 60 ) / 100 = 1.40. We can do this for all batsmen over their careers. Here is the table of the best z-scores over entire careers. We consider players who have played in two or more phases, to ensure we consider sufficiently long careers.
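The innings-weighted average described above can be sketched as follows. The function name and the input layout, a list of (z-score, innings) pairs, are my own choices for illustration; the figures are the hypothetical Tendulkar example from the text, not his actual record.

```python
def career_z_score(phases):
    """Innings-weighted mean of per-phase z-scores.

    phases: iterable of (z_score, innings) pairs, one per four-year phase.
    """
    phases = list(phases)
    total_innings = sum(innings for _, innings in phases)
    if total_innings == 0:
        raise ValueError("at least one innings is required")
    weighted_sum = sum(z * innings for z, innings in phases)
    return weighted_sum / total_innings


# The worked example from the text: 40 innings at z = 2.0, then 60 at z = 1.0.
example_phases = [(2.0, 40), (1.0, 60)]
career = career_z_score(example_phases)
print(f"career z-score: {career:.2f}")  # 1.40
```

Weighting by innings means a phase in which a player batted only a handful of times cannot swing his career figure as much as a full four-year cycle does.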

When we look at the z-scores of batsmen with long careers - of four phases or more - this is how they are ranked.