An analytics quandary when comparing catchers for the Nats!

Photo by Sol Tucker for TalkNats

There was a distinct difference in wins last year depending on which catcher started the game. The Washington Nationals had a .349 winning percentage in Riley Adams’ 43 starts. Contrast that with the .388 winning percentage in Keibert Ruiz’s 98 starts; projected over a full season, that gap is worth about six more wins. But if you go to cERA (catcher’s ERA), Adams had a distinct advantage at 4.54 versus Ruiz’s 5.11. Granted, cERA means very little in most cases because of the small sample sizes and the matchups within the battery.
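The “six more wins” figure is just the gap in winning percentage stretched over a 162-game schedule. Here is a quick sketch of that arithmetic. The function name is made up for illustration, and it assumes each catcher’s winning percentage would hold over a full season, which is a big assumption.

```python
SEASON_GAMES = 162  # length of an MLB regular season

def extra_wins(pct_a, pct_b, games=SEASON_GAMES):
    """Wins gained over `games` by winning at pct_a instead of pct_b."""
    return (pct_a - pct_b) * games

# Winning percentages quoted in the post
ruiz_pct, adams_pct = 0.388, 0.349

# The gap of .039 over 162 games works out to roughly six wins
print(round(extra_wins(ruiz_pct, adams_pct), 1))
```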

But drilling down further on these statistical differences makes matters curious. Yes, Ruiz was the better hitter with a 90 wRC+ versus 56 for Adams. Given that cERA, you might expect the team to win more with Adams, but the team scored only 3.5 runs per game with Adams in the lineup versus 4.02 runs per game with Ruiz in the lineup. That half-run edge in cERA was basically cancelled out by the half-run deficit in runs scored per game. Obviously there are other factors in the team scoring more runs when Ruiz starts, because he is not a one-man wrecking crew with a bat.

This is the perfect assignment for the Nationals’ analytics staff to drill down on and try to find out why these numbers were so skewed. But I wanted to see if any patterns in that cERA stood out.

So I asked DonH if he could do something with this. He generated the following tables but wanted to make sure to let folks know about two issues with these tables:

  • The Game Day data is apparently not updated to reflect changes in earned vs. unearned runs after a game ends
  • We did not try to factor in the impact of late game replacements of the catcher (mostly Ruiz replacing Adams). The impact of that should be minimal.

Here are the overall numbers from Game Day data before changes were made for unearned runs:

Since these numbers can be influenced by outliers, a standard analytic technique is to use trimmed means (and Catcher ERA is a mean). Trimmed means exclude the top/bottom X%. This table excludes the top/bottom 5%.

And this table excludes the top/bottom 10%.

While trimming the cERA brought the numbers for Ruiz and Adams closer, these results don’t support the hypothesis of a few bad games as the source of the difference.
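For anyone who wants to try the trimming idea at home, here is a minimal sketch in Python. It is not DonH’s actual method, which the post does not spell out. It assumes each game is a pair of (earned runs, innings caught), ranks games by their single-game ERA, drops the best and worst X% of games, and recomputes cERA on what is left. The function names and the ten sample games are invented for illustration.

```python
def cera(games):
    """Catcher ERA over a list of (earned_runs, innings) games: 9 * ER / IP."""
    total_er = sum(er for er, _ in games)
    total_ip = sum(ip for _, ip in games)
    return 9 * total_er / total_ip

def trimmed_cera(games, trim):
    """cERA after dropping the best and worst `trim` fraction of games.

    Games are ranked by single-game ERA; innings are assumed to be > 0.
    """
    ranked = sorted(games, key=lambda g: 9 * g[0] / g[1])
    k = int(len(ranked) * trim)          # games to drop from each end
    kept = ranked[k:len(ranked) - k]
    return cera(kept)

# Invented sample: ten 9-inning games, one of them a 15-run blowout
sample = [(0, 9.0), (1, 9.0), (2, 9.0), (3, 9.0), (4, 9.0),
          (5, 9.0), (6, 9.0), (7, 9.0), (8, 9.0), (15, 9.0)]

print(cera(sample))                # untrimmed
print(trimmed_cera(sample, 0.10))  # drops the shutout and the blowout
```

If one or two disaster games were driving a catcher’s cERA, the trimmed number would fall well below the untrimmed one for that catcher and not for the other.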

Next, let’s drill down on opponent. We know that the Nats were worse against the NL East.

I don’t see an obvious pattern here. But returning to the question of the NL East, let’s collapse this table to NL East Yes/No.

Wow, a dramatic turnabout. Adams is much better against the NL East than against the rest of MLB, while Ruiz is worse. But this does not help with the question of why.

Next, let us drill down by starting pitcher and catcher, and this is where it gets interesting.

Looking at this, perhaps we need to revisit the overall table and exclude Joan Adon, who was consistently awful as a starter, and this almost has the two catchers as equal on cERA.

It brought the numbers a little closer, but still does not explain everything. The obvious question is: what about Patrick Corbin? This overall table excludes Corbin.

Still not much impact on the difference. So what happens if we exclude both Corbin and Adon?

Look at that. Closer, but who knows.
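The “exclude a starter and recompute” step above can be sketched like this. The rows are made up (generic catchers X and Y, starters A, B and C) and are not the Nationals’ numbers; the function name is hypothetical, and innings are assumed to be true decimals (outs divided by 3), not box-score notation like 6.1.

```python
from collections import defaultdict

def cera_by_catcher(rows, exclude_starters=()):
    """cERA per catcher from (catcher, starter, earned_runs, innings) rows,
    skipping any game started by a pitcher in `exclude_starters`."""
    er = defaultdict(float)
    ip = defaultdict(float)
    for catcher, starter, earned_runs, innings in rows:
        if starter in exclude_starters:
            continue
        er[catcher] += earned_runs
        ip[catcher] += innings
    return {c: 9 * er[c] / ip[c] for c in er if ip[c] > 0}

# Invented rows: catcher X is stuck with one rough outing from starter B
rows = [
    ("X", "A", 2, 6.0),
    ("X", "B", 6, 3.0),
    ("Y", "A", 3, 6.0),
    ("Y", "C", 1, 6.0),
]

print(cera_by_catcher(rows))                          # all starters
print(cera_by_catcher(rows, exclude_starters={"B"}))  # starter B removed
```

In this toy data the whole gap between the two catchers comes from a single starter, which is the kind of effect the Adon and Corbin exclusions were checking for.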

Now the drill-down by month.

And finally, the drill-down by days of rest.

The bottom line is that it appears there is something going on. Maybe it is a bunch of factors; maybe it is random. We also know that some pitchers prefer a certain catcher, and some managers dislike the “personal catcher” assignment. This is all open to debate. Having more data might help: for example, how often are the pitchers shaking off Ruiz vs. Adams? This is the perfect assignment for the Nats’ beefed-up analytics department to tackle. At the very least, there appears to be something there as to which catcher might work best with a given pitcher. But the bottom line is that the team wins more with Ruiz as the starter in the lineup.
