
How Good is Masyn Winn (a fWAR vs. bWAR dialogue)

By J. P. Hill Aug 28, 2024 | 8:00 AM
Photo by David Berding/Getty Images

Using Masyn Winn to discuss the differences in bWAR and fWAR and what they mean for player analysis.

Yesterday while scrolling through Twitter or X or whatever crazy thing Elon Musk is calling that hellscape these days, I noticed something that shocked me. SHOCKED ME!

That’s what Twitter or X or whatever is all about. Check out this SHOCKING picture of [insert whatever political topic you don’t care about at all here.]

This one, though, didn’t leave me scrolling past in agitated frustration.

It was a post about MVP candidates and it included a list of the NL leaders in WAR – Wins Above Replacement – from Baseball-Reference. There on the list, high on the list, crazy high on the list, SHOCKINGLY high on the list, was Masyn Winn.

“Masyn Winn?” I thought. “In the top 6 in the league in WAR? Really? No. No way.”

I was shocked, I tell you. SHOCKED!

I would post the Twitter or X Tweet or whatever you call it. (A Tweet from Twitter is a Tweet but now that it’s X, is it an X from X, a post from X, or a Tweet from X? I don’t even know.) But I was on my phone at the time and it wasn’t from an account I follow. It showed up on my feed. I glanced at it. I scrolled past while my mind processed what I saw, and then closed the app (and flushed the toilet) before I thought about writing about it. Now I can’t find that Tweet/X/whatever again.

So, I can’t give credit to whoever it was that gave me this idea. But if you know who you are, then this post (and the toilet imagery) is for you!

Back to my point. Has Masyn Winn had such a good rookie year that he’s on the verge of MVP territory?

By Baseball-Reference’s way of calculating WAR, it’s true. He has a 4.4 bWAR – the “b” stands for Baseball-Reference’s version of WAR – on the season. Even though at least one game has passed since I first saw that Tweet, the rankings have not changed. I checked them on Tuesday. Winn ranks 6th in the NL in bWAR among position players. He’s tied with De La Cruz of Cincinnati and ahead of Betts from the Dodgers.

That’s a pretty exciting group of players to be lumped in with. Especially for a 22-year-old, defense-first rookie with untapped offensive upside. If Winn is already ranked this highly with his current level of production, what kind of levels can he reach when he finds his power stroke? What will he be when he matures as a defender?

I wouldn’t blame anyone if they looked at that bWAR number and immediately concluded that Winn was a player who will not only contend for the NL MVP award on a perennial basis but probably Winn one one of these years. (Yes, that Winn one one was entirely intentional.)

Astute readers, though, have already found the flaw in the logic. It’s a problem inherent in the stats themselves.

This whole argument rests on one statistic from one place. bWAR, Baseball-Reference’s Wins Above Replacement.

bWAR! What is it good for? The answer, as Edwin Starr screamed into the void in 1970, is absolutely nothing!

WAR is a useful catch-all performance statistic. It’s useful to compare players at different positions and somewhat useful to compare players across eras. It’s also a great way to start inane arguments on Twitter or X or whatever it’s called these days.

One of the problems with WAR is that, as a stat made up from other stats, it’s only as useful as the stats that make it up. And that becomes a problem when we consider Masyn Winn’s current or future MVP candidacy.

What is WAR? Baseball-Reference defines it this way:

“WAR attempts to measure a player’s value – expressed in wins – over that which would have been contributed by a fictional “replacement-level player” (essentially a AAA-quality player who can be readily acquired by a team at any time for the league’s minimum salary) in the same amount of playing time.”

bWAR combines all of the various things that take place on a baseball field – “batting, baserunning, double play avoidance, and fielding” – and adds in a positional adjustment (i.e. it’s much harder to play SS than 1b) to find a player’s value relative to other players.

How does Winn’s performance fit within the framework of bWAR? What makes him one of the best players in the NL by this metric?

Winn’s offense is slightly above average. Baseball-Reference uses OPS+, a statistic that rescales a player’s OPS (On-base% + slugging%) so that 100 is league average and every point above 100 is one percent better than average. Winn’s OPS+ is 108. Solid. Not awesome. Pretty good for shortstops, who tend to be worse hitters than, say, first basemen.
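For anyone who wants to see how that "scale of 100" works, here is a rough Python sketch. It uses the commonly cited simplified form of OPS+ and leaves out the park adjustment that Baseball-Reference applies, so treat it as an approximation. The function name and the batting lines are made up for illustration; they are not Winn’s numbers.

```python
def ops_plus(obp, slg, league_obp, league_slg):
    """Simplified OPS+ (no park adjustment): 100 means league average."""
    return 100.0 * (obp / league_obp + slg / league_slg - 1.0)


# Hypothetical league averages and a hypothetical hitter, not real 2024 data.
LEAGUE_OBP = 0.320
LEAGUE_SLG = 0.400

average_hitter = ops_plus(0.320, 0.400, LEAGUE_OBP, LEAGUE_SLG)
better_hitter = ops_plus(0.336, 0.420, LEAGUE_OBP, LEAGUE_SLG)

print(round(average_hitter))  # 100
print(round(better_hitter))   # 110
```

A hitter who is five percent better than the league in both on-base and slugging lands around 110, which is roughly the neighborhood Winn’s 108 lives in.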

He’s a speedy runner but doesn’t have a particularly good stolen base rate. He is a net negative in steals but an extreme positive in taking extra bases. That makes him above average in terms of baserunning runs.

How does a player who is just above average in offense and baserunning land one good game away from being in the top 5 in the NL in bWAR?

It all boils down to defense. Winn is +13 in DRS – defensive runs saved. He leads all shortstops in the statistic. Through BR’s calculations, his +13 DRS makes him second in the NL in defensive WAR. (That accounts for 2.1 of his 4.4 total bWAR.)
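To put that parenthetical in perspective, here is a quick back-of-the-envelope check in Python. It uses only the two figures quoted above; the variable names are mine, not Baseball-Reference’s.

```python
# Figures quoted in this post (Baseball-Reference, late August 2024).
total_bwar = 4.4
defensive_war = 2.1

# Share of Winn's total bWAR credited to the defensive component.
defensive_share = defensive_war / total_bwar

print(f"{defensive_share:.1%}")  # 47.7%
```

Nearly half of his total value, by this model, comes from the glove.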

This makes sense. It’s not going to be any surprise to anyone who has watched Winn this season that the core of his production is tied to his defensive ability. He looks like a dynamic fielder in terms of his glove, his range, and his arm. He’s dynamic.

But how dynamic is he? Is he REALLY the best defensive SS in the league? At age 22?

Here’s where WAR gets confusing. Various baseball statistic sites have their version of WAR or something comparable. This is why you will frequently see bWAR and fWAR cited for a player. bWAR is from Baseball-Reference. fWAR is from Fangraphs. While they are similar, they are not the same.

Winn’s performance this season helps illustrate the difference between the two models of WAR calculation. Winn is at 4.4 bWAR at Baseball-Reference – top 6 in the NL. Over at Fangraphs, Winn is at just 2.9 fWAR, good for 17th in the NL in total production.

(For context, that 2.9 is just slightly ahead of Nolan Arenado’s 2.7 fWAR and 23rd ranking in the NL.)

I’ll explain the differences between fWAR and bWAR in a moment. First, I just want to ask you a “feel” question related to WAR because I think it will help you understand the argument I’m going to make at the end of this post.

After watching Masyn Winn perform this season, does he “feel” like he’s the 6th best position player in the NL?

Does he “feel” like he could be the 17th best player in the NL?

You really can’t judge statistics by feel but when two similar statistics say very different things, feel can help us determine which one to trust and perhaps encourage us to better understand both statistics.

So, why is Winn’s bWAR so different from his fWAR?

Fangraphs calculates their fWAR statistic with the same kind of logic as Baseball-Reference. Here’s what Fangraphs says about fWAR:

fWAR – “A comprehensive statistic that estimates the number of wins a player has been worth to his team compared to a freely available player such as a minor league free agent. You can learn exactly how we calculated it here.”

They even publicize their formula if someone wants to do the math today: WAR = (Batting Runs + Base Running Runs + Fielding Runs + Positional Adjustment + League Adjustment + Replacement Runs) / (Runs Per Win)
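If you do want to do the math, the formula above translates into a few lines of Python. This is only a sketch: the function and argument names are mine, every input below is invented for illustration (these are not Winn’s actual Fangraphs components), and the 10 runs per win is just a common rule of thumb, not the exact value Fangraphs uses in a given season.

```python
def war(batting, baserunning, fielding, positional, league, replacement,
        runs_per_win):
    """Sum the run components, then convert runs to wins."""
    total_runs = (batting + baserunning + fielding
                  + positional + league + replacement)
    return total_runs / runs_per_win


# Hypothetical run values shared by both calculations below.
shared = dict(batting=4.0, baserunning=2.0, positional=5.0,
              league=1.0, replacement=14.0, runs_per_win=10.0)

# Same player, two different (made-up) opinions about his fielding.
war_modest_glove = war(fielding=3.0, **shared)
war_elite_glove = war(fielding=12.0, **shared)

print(round(war_modest_glove, 1))                    # 2.9
print(round(war_elite_glove, 1))                     # 3.8
print(round(war_elite_glove - war_modest_glove, 1))  # 0.9
```

Change nothing but the fielding input and the answer moves by almost a full win. Keep that in mind for what follows.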

Even though Fangraphs and Baseball-Reference take the same kind of approach to calculating WAR, they use a very different set of statistics to determine their offensive, baserunning, and defensive run values. That matters. A lot.

I’ll offer a little “master of the obvious” to you. 2+3 does not equal 2+5. Duh.

That’s what we’re talking about here. Two statistics can share a name – WAR – and even try to answer the same question, but if the statistics use different numbers they WILL result in a different answer to the question.

When WAR values between Baseball-Reference (bWAR) and Fangraphs (fWAR) are different, it is because there is a significant difference in one of the metrics that the two stats use. The metric that is different is almost always defense.

The article for Fangraphs’ fWAR linked above is from 2014. Back then, Fangraphs used a stat called UZR – Ultimate Zone Rating – to calculate their defensive run values. UZR was a proprietary defensive statistical model that tried to improve on an older system, DRS – Defensive Runs Saved.

This is an oversimplification, but both UZR (used by Fangraphs) and DRS (used by Baseball-Reference) were based on fielding zones and assigned a +/- rating based on the number of plays a defender made inside or outside of certain zones. That method of defensive evaluation has some recognized flaws that were, for a long time, simply unavoidable. Particularly when shifting became so prominent. Still, DRS and UZR, though different, were better than using fielding percentages, errors, or more common defensive stats.

Last decade, when the league started to install complex Statcast systems throughout MLB stadiums, statisticians found that they were able to look beyond a simple zone to evaluate a fielder’s defensive ability. The sage sabermetricians – baseball math nerds – at Baseball Savant (like our friend Tom Tango!) developed a stat called OAA – Outs Above Average – and were able to assign DRS/UZR-like run values to fielders based on their OAA. These stats used raw speed, location, distance, etc. to measure what a defender did regardless of where they did it on the field. Zone didn’t matter anymore. Only the probability of making the play based on the distance the fielder had to travel, ball velocity, etc.

(An easy way to understand this is when the SS makes a play on the 2b side of the bag. It looks impressive based on a shortstop’s normal defensive zone. A play on the other side of the bag is out of the SS’s normal zone. +1 DRS or UZR. But where the SS started matters, doesn’t it? If the player was shifted over by the 2b bag before the pitch was thrown, they might make the play out of their normal zone, but it’s a high-probability play. +0 OAA despite what might look like an impressive play live on TV. However, if the SS was at normal SS depth and still somehow managed to range past 2b to make an out? Low probability = high reward. +1 OAA.)
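Here is a simplified sketch of that probability idea in Python. As I understand it, an OAA-style model credits a fielder with one minus the out probability when he makes the play and debits him the out probability when he doesn’t, so the "+0" and "+1" above are really rounded versions of fractional credits. The function name is mine and the probabilities are invented, not Statcast’s actual estimates.

```python
def play_credit(out_probability, made_play):
    """Credit for one fielding chance under a simplified OAA-style model.

    A made play earns (1 - p); a missed play costs p, where p is the
    estimated chance that an average fielder turns the ball into an out.
    """
    if not 0.0 <= out_probability <= 1.0:
        raise ValueError("out_probability must be between 0 and 1")
    if made_play:
        return 1.0 - out_probability
    return -out_probability


# Made-up probabilities for the two shortstop scenarios described above.
shifted_play = play_credit(0.95, made_play=True)  # SS already shaded toward 2b
ranging_play = play_credit(0.10, made_play=True)  # SS started at normal depth

print(round(shifted_play, 2))  # 0.05
print(round(ranging_play, 2))  # 0.9
```

The same play on the same patch of dirt is worth almost nothing in one case and almost a full out in the other. It also cuts the other way: boot the 95% play and the model takes away nearly a full out.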

Fangraphs, which has always been more aggressive in integrating new technology and statistical models into their site, decided a few years ago to stop using UZR as their primary fielding metric and, instead, adopted OAA and OAA run values in their fWAR calculations back through 2016.

This wasn’t a meaningless rabbit trail. It all ties in with Masyn Winn’s production this season. Can you see how?

Baseball-Reference’s bWAR statistic still uses DRS as its defensive model. That’s not bad. From what I understand, DRS has received an overhaul since it was first implemented. It has its uses, but DRS simply isn’t the quality of statistic that OAA is.

Fangraphs’ fWAR statistic uses OAA run values as its defensive model, basing the defensive component of WAR on the most advanced and accurate defensive statistical system we have available.

One of these systems, DRS, says that Masyn Winn is the best shortstop in baseball.

The other? OAA? Winn doesn’t rate out as well.

OAA has Winn at +4. He is elite when he ranges in (+7) on the ball but below average moving toward 3b (-4). Before anyone makes this claim, no, this doesn’t have anything to do with Arenado’s range next to him. Winn does not get docked for plays Arenado makes. And if Winn slowed up on plays he expected Arenado to make and then wasn’t there to make the plays that Arenado missed (which doesn’t happen based on the eye test), then Winn would deserve the negative numbers.

Winn’s just a young player who needs to get better at moving to his right defensively. It’s ok to admit that and it does fit with what I’ve seen and what scouting reports have indicated about him.

So, which stat do we believe? DRS? Or OAA?

That will go a long way in determining how you “feel” about Winn’s bWAR vs. his fWAR.

Here’s another point to consider. Winn’s OAA has changed significantly over time. Back on May 22nd, I wrote on Twitter/X/Whatever that Winn had a -3 OAA. That felt wrong to me and I expected it to start moving in the other direction sometime soon.

It has. Since then, Winn has been a +7 defender by OAA. That’s elite-level improvement over 3 months.

Which is what DRS has been saying the whole time.

Maybe all of that leaves your head spinning. The same statistic is really two different statistics. Defensive values can radically change WAR values. It feels complex. It is complex.

But the end result leaves us in a better place.

Back to the original question. Is Winn an MVP-caliber player, as bWAR implies? Or is he just pretty good, like fWAR suggests?

fWAR feels right to me. Right now, Winn has a baseline of non-defensive production that is above average. He’s good on the basepaths. That’s an exceptional foundation for a 22-year-old rookie.

Defensively, though, his OAA feels about right for this season. He routinely flashes elite potential and has had stretches of play this season that might be among the best in the league at his position. But he could be more consistent with the glove.

Based on his exceptional DRS number, Winn’s bWAR gives us an idea of what Winn’s value would be like if he establishes himself as one of the best defenders in the league. He has the skillset. It’s something that could happen. It might be happening now.

All it’s going to take for Winn to be a top 6-caliber producer in the NL is for his current defensive performance to continue.

That’s remarkable. That’s exciting. That’s, to borrow the word I used at the beginning of this post, SHOCKING!

Based on what we’ve seen this season – actual statistics – it’s not a stretch at all to suggest that Winn is already one of the better shortstops in the NL and has the capacity to be the best in the NL as he matures.

I look forward to seeing that happen. And I think it will.