The success of DRS is evident from the fact that its original goal of “removing howlers” has been achieved so completely that we no longer even talk about it. Our focus has moved to finer issues such as the margin of error and whether a ball clipping the bails should be given out. For teams, DRS has gone from a last resort to a tactical tool, used to “take a chance” or “slow the game down” in addition to its primary purpose of overturning visibly wrong calls.
So much so that a captain is now expected to understand and improve his DRS success rate, a metric frequently used to evaluate captains. And this is what I would like to explore: how should we measure a captain’s DRS performance? Counting successful challenges is one way, and the most widely used. Nothing is simpler than the percentage of reviews a captain gets right. On that basis, this is how captains stack up: Sarfaraz Ahmed is the best while Tim Paine and Chandimal are the worst. This is no different from what we already know.
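For reference, the vanilla success rate is nothing more than the following. This is a minimal sketch; the function name and the numbers in the example are made up for illustration and are not taken from any captain's record.

```python
def success_rate(successful: int, failed: int) -> float:
    """Percentage of reviews taken that overturned the on-field decision."""
    total = successful + failed
    if total == 0:
        raise ValueError("success rate is undefined when no reviews were taken")
    return 100.0 * successful / total


# Hypothetical numbers, for illustration only (not real captaincy data).
print(round(success_rate(successful=12, failed=28), 1))  # 30.0
```

Note that the denominator contains only reviews that were actually taken, which is exactly why this number says nothing about reviews a captain should have taken but did not.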
Even with its shortcomings, this worked fine until 2016, when a review that did not succeed was simply lost. From 2017, the rules were tweaked so that a review is retained on umpire’s call, incentivising captains to review more. In this scenario, Success Rate provides only a partial picture: it is blind to type II errors, that is, bad decisions that were never reviewed. This mistake is as bad as, if not worse than, burning a review. A captain with a high success rate could even be missing more reviews than they call correctly. Unfortunately, there is no data available on reviews missed, so we must look for substitutes. A captain may fail to review a bad decision for two reasons: being overly cautious, or having run out of reviews. The best-known instance is Tim Paine’s inability to review against Ben Stokes in that innings. That moment resulted in a domino effect.
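The effect of the 2017 tweak on a captain's stock of reviews can be sketched as below. This is a simplified model under my own assumptions: the outcome labels, the function name and the default allowance of two reviews are illustrative, and the sequence in the example is invented.

```python
def reviews_left(outcomes, allowance=2, retain_umpires_call=True):
    """Count the reviews remaining after a sequence of review outcomes.

    Each outcome is "overturned", "umpires_call" or "struck_down".
    retain_umpires_call=False models the pre-2017 rule, where any
    review that did not succeed was lost.
    """
    left = allowance
    for outcome in outcomes:
        if left == 0:
            break  # nothing left to review with
        if outcome == "overturned":
            continue  # a successful review is always kept
        if outcome == "umpires_call" and retain_umpires_call:
            continue  # from 2017: retained on umpire's call
        left -= 1  # unsuccessful review is lost
    return left


# Invented sequence, for illustration only.
attempts = ["umpires_call", "struck_down", "umpires_call"]
print(reviews_left(attempts, retain_umpires_call=False))  # 0
print(reviews_left(attempts, retain_umpires_call=True))   # 1
```

The same three reviews leave the captain empty-handed under the old rule and with one in hand under the new one, even though the success rate is 0% in both cases.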
A good surrogate for type II error is the frequency with which a captain reviews. The more reviews taken, the lower the chance of missing a bad decision. So, just as a batsman is supposed to score without losing his wicket, a captain should review as much as possible without losing reviews, guarding against both kinds of failure. There is a caveat, however: at the end of the innings, a captain left with one review is as well off as a captain left with two. Both are better off than one left with none.
So what is the optimal metric to measure DRS performance? The metric should be defined by the end goal, which is to have at least one review available by the end of the innings, WHILE reviewing frequently. To elaborate further, a captain with a lower success rate but retaining one review by the end of the innings is better than one with a higher success rate but no reviews left. This is what a metric should capture.
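One possible way to tabulate the two ingredients named above is sketched here. The text does not fix an exact formula, so treat this as my illustration rather than the definitive metric: the `InningsRecord` type, the `drs_summary` function and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InningsRecord:
    reviews_taken: int  # reviews the captain called for in the innings
    reviews_left: int   # reviews still in hand when the innings ended


def drs_summary(innings):
    """Return (reviews taken per innings, share of innings ending with a review left)."""
    n = len(innings)
    if n == 0:
        raise ValueError("need at least one innings")
    reviews_per_innings = sum(i.reviews_taken for i in innings) / n
    # One review left counts the same as two: only "more than zero" matters.
    retained_share = sum(1 for i in innings if i.reviews_left > 0) / n
    return reviews_per_innings, retained_share


# Hypothetical captains, for illustration only.
aggressive = [InningsRecord(3, 1), InningsRecord(4, 1),
              InningsRecord(2, 2), InningsRecord(3, 0)]
cautious = [InningsRecord(1, 2), InningsRecord(0, 2),
            InningsRecord(1, 1), InningsRecord(1, 2)]

print(drs_summary(aggressive))  # (3.0, 0.75)
print(drs_summary(cautious))    # (0.75, 1.0)
```

Keeping the two numbers separate, rather than collapsing them into one score, makes both failure modes visible: a high retained share with very few reviews per innings points to excess caution, while many reviews with a low retained share points to burning them.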
When we apply it to Test captains, the result is somewhat different from a vanilla success rate table. Chandimal reviews frequently and gets it wrong. But so does Sarfaraz, who either nails it or loses the review. And then there’s Jason Holder, who reviews aggressively but often manages to retain the review. Even though his success rate is low, he is the best of the lot. Joe Root and Virat Kohli are not far behind.
*Reviews available/Inns has been adjusted for recent matches that allow 3 reviews/Inns. This leads to >2 reviews/Inns in some cases.
Tim Paine, contrary to reputation, does not burn as many reviews. Though, if we look closely, Paine, along with Williamson and Faf, faces a different kind of problem. All three err on the side of caution and do not review nearly as often, perhaps leaving some fair shouts on the table.
To conclude, Success Rate is a weak indicator of a captain’s DRS performance because it does not account for reviews not taken. Additionally, it does not matter if a review is not upheld as long as it is retained. So, a good captain is one who neither misses a review worth taking nor loses the ability (Reviews left>0) to review when needed. By this measure, Holder is (was) the best, with an optimum mix of aggression and precision, while some of his (former) counterparts are either right but too cautious or aggressive but wrong.
Note: I received the review numbers (successful/failed) as a screenshot. I am happy to credit whoever extracted them.