Fantasy baseball: Lessons learned from points-league mocks

Todd Zola, Apr 24, 2020, 01:10 PM ET

Recently, I conducted a mock draft experiment and a follow-up shadow drafting exercise based on standard rotisserie scoring. Now it's time to focus on points leagues. For this study, I participated in five standard ESPN points league mocks, drafting each from the six-hole. The objective is determining the optimal manner to set up the rankings cheat sheet.

One of the advantages points leagues have over the rotisserie format is players can be compared via a single unit: points. In rotisserie, it's all about balancing categorical contributions. On the other hand, 5x5 scoring is 5x5 scoring, regardless of the host site. Perhaps the biggest disadvantage of points leagues is the lack of a universal scoring system. This makes it difficult to disseminate advice since everything revolves around the points driving the relative ranking of one player to another. Fantasy pundits have a feel for how they rank players in rotisserie, but in order to properly address a points league question, they need to "run the numbers."

The key for points league success is deriving the best cheat sheet since we all possess a cognitive bias to trust our rankings. That is, the less we need to stray from our rankings, the better. Obviously, drafting entails a lot more than "taking one from the top." It's necessary to read the room, knowing when to adjust to the market and flow of the draft. That said, the further you need to go down your list to find the most favorable pick, the less certitude it holds.

The principal component of rankings is understanding the difference between raw points and useful points. The lowest-scoring player on an active roster provides no useful points since everyone in the league has those points. As such, to deduce the number of useful points each player contributes, the points from the worst player are subtracted from everyone in the pool.

Since most fantasy leagues have position requirements, the described adjustment needs to be carried out for each position. However, the inclusion of corner infield, middle infield, utility and batters eligible at multiple positions clouds the hitting adjustment. Not to mention, most leagues allow for in-season management. The useful points correction really isn't done via a player, but rather a roster spot. The lowest contribution from a roster spot is subtracted from the contribution of roster spots dedicated to the same position. This point will be addressed later in a shadow drafting procedure, using the five mocks as the foundation.

The first step in determining the positional adjustment level is computing the expected points based on the player's projection. Pitching is straightforward: subtract the expected points of the lowest-ranked draft-worthy pitcher from every pitcher. In standard ESPN formats, there are 10 teams, each starting nine arms. Ergo, the points from the 90th-highest-scoring pitcher are subtracted from every hurler.
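As a minimal sketch of that subtraction, assume projections are already available as a name-to-points mapping. The function name, player labels and point values below are hypothetical, and the toy league is far smaller than the 90 pitching spots in a standard ESPN league.

```python
def useful_points(projected, roster_spots):
    """Subtract the points of the last draft-worthy player from everyone in the pool."""
    ranked = sorted(projected.values(), reverse=True)
    # The player filling the final roster spot sets the useful points level.
    replacement_level = ranked[roster_spots - 1]
    return {name: pts - replacement_level for name, pts in projected.items()}


# Hypothetical projections for a toy league: 2 teams x 2 pitchers = 4 spots.
# A standard ESPN league would use roster_spots=90 (10 teams x 9 pitchers).
pitchers = {"P1": 520, "P2": 470, "P3": 410, "P4": 380, "P5": 350}
adjusted = useful_points(pitchers, roster_spots=4)
print(adjusted)
```

In this toy pool, the fourth pitcher lands at exactly zero useful points and anyone below him goes negative, which is the signal he isn't draft-worthy.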

As suggested, hitting is more complicated. Here is the adjustment process for standard 10-team ESPN leagues.

1. Calculate the projected points for all hitters and rank accordingly.

2. The following positional requirements are needed for all active rosters:

10 catchers (C)
10 first basemen (1B)
10 second basemen (2B)
10 third basemen (3B)
10 shortstops (SS)
10 corner infielders (CI)
10 middle infielders (MI)
50 outfielders (OF)
10 utility men (UT)

The draft-worthy hitting pool should therefore contain 130 players. Count how many players of each position with eligibility at just one spot are present in the top-130 ranked hitters.

3. Chances are, the pool is lacking ample backstops. Consider all the catchers eligible at other positions to be C, then subtract the projected points of the 10th-highest-ranked C from everyone at the position, dipping into those ranked below 130 to identify the useful level.

4. There are likely 10 players with single eligibility at each of the four infield spots. If not, say there aren't enough 2B-only players, take enough from the 2B/SS set to fill the 2B position. Do this for all four spots.

5. Start with any leftover single-eligibility infielders and group them into their associated CI and MI spots. There are likely fewer than 10 at each. Add the remaining 2B/SS to MI and 1B/3B to CI. If either MI or CI is still short, assign the necessary openings to players with cross-eligibility (but not OF). That is, if another MI is required, use someone with either 2B or SS and either 1B or 3B. If it's necessary to add someone with OF eligibility to bring either total to 10, do so. If either MI or CI is still short (very doubtful in today's landscape), dip into the players past 130.

6. Count the number of players with OF eligibility, including those with multiple eligibility not already assigned another position. If there aren't ample with OF eligibility in the top-130, reach below to bring the total to 50.

7. Assign the 10 highest-ranked remaining players to be UT. Because at the very least, it was necessary to dip below for catchers, there are more than 10 from the top 130 to choose from.

8. If it was only necessary to reach down for catchers, the hitting pool can be split into catchers and non-catchers. The useful points level for the non-catchers is the points total of the 120th-highest-scoring non-catcher. This points total is subtracted from all non-catchers.

9. Any position dipping below the top 130 requires its own replacement level. The four infield positions will all have adequate eligible players, but MI, CI or OF could be scarce. Several years ago, MI was indeed scarce, requiring three batting pools: C, MI and CI plus OF.

10. Re-rank the list after the corresponding adjustments. At minimum, there should be three players projected to score zero useful points: catcher, non-catcher and pitcher.
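The simplest outcome of the steps above, where only catchers require reaching below the cutoff, can be sketched as follows. This is an illustration under stated assumptions rather than the full assignment procedure: it skips the CI, MI, OF and UT bookkeeping, treats anyone with C eligibility as a catcher (as in step 3), and uses hypothetical names, point totals and a toy league of two catcher spots and four non-catcher spots instead of 10 and 120.

```python
from dataclasses import dataclass


@dataclass
class Hitter:
    name: str
    positions: frozenset
    points: int


def replacement_level(points, spots):
    """Points of the player filling the last roster spot in a pool."""
    return sorted(points, reverse=True)[spots - 1]


def adjust_hitters(hitters, catcher_spots, other_spots):
    """Split hitters into catchers and non-catchers, adjust each pool, re-rank."""
    catchers = [h for h in hitters if "C" in h.positions]
    others = [h for h in hitters if "C" not in h.positions]
    c_level = replacement_level([h.points for h in catchers], catcher_spots)
    o_level = replacement_level([h.points for h in others], other_spots)
    useful = {h.name: h.points - c_level for h in catchers}
    useful.update({h.name: h.points - o_level for h in others})
    # Re-rank on useful points instead of raw points.
    return sorted(useful.items(), key=lambda item: item[1], reverse=True)


# Hypothetical pool; C3 is C/1B eligible and is counted as a catcher.
pool = [
    Hitter("C1", frozenset({"C"}), 300),
    Hitter("C2", frozenset({"C"}), 250),
    Hitter("C3", frozenset({"C", "1B"}), 240),
    Hitter("H1", frozenset({"OF"}), 450),
    Hitter("H2", frozenset({"1B"}), 400),
    Hitter("H3", frozenset({"SS"}), 380),
    Hitter("H4", frozenset({"OF"}), 330),
    Hitter("H5", frozenset({"2B"}), 320),
]
ranked = adjust_hitters(pool, catcher_spots=2, other_spots=4)
for name, pts in ranked:
    print(name, pts)
```

In this toy pool, C1 has fewer raw points than H4 (300 versus 330) but more useful points (50 versus 0), which is how catchers jump up an adjusted list. One catcher and one non-catcher each land on zero, matching the check in step 10.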

Based on ESPN projections, there were exactly 50 players with OF eligibility (even when considering those legal at more than one spot) in the top 130. As such, I decided to break outfield into its own pool since others may slot an IF/OF eligible player into an infield spot. Adjusting the OF useful points pushed them up the rankings list, helping to ensure five are drafted from the top 130.

As with the rotisserie mock study, specific rules were set for the five points league mocks.

In each draft, the top-ranked player must be drafted, so long as he can be legally placed on the active, 22-man roster.
Two of the mocks used an unadjusted rankings list, ordered by raw points.
Two of the mocks used a list adjusted for catcher, non-catcher and pitcher.
The last mock used the adjusted rankings but straying from the order was permitted.
Each mock would be from the same starting point, 1.06.
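The "top-ranked player who can be legally placed" rule can be sketched as a simple scan down the cheat sheet. The slot map below is an assumption based on the standard ESPN hitting and pitching slots, and the function names, players and open-slot counts are hypothetical.

```python
# Assumed mapping from a player's eligible position to the active slots he can fill.
SLOT_ELIGIBILITY = {
    "C": ["C", "UT"],
    "1B": ["1B", "CI", "UT"],
    "3B": ["3B", "CI", "UT"],
    "2B": ["2B", "MI", "UT"],
    "SS": ["SS", "MI", "UT"],
    "OF": ["OF", "UT"],
    "P": ["P"],
}


def legal_slots(positions, open_slots):
    """Return the open active slots a player with these positions could fill."""
    slots = []
    for pos in positions:
        for slot in SLOT_ELIGIBILITY[pos]:
            if open_slots.get(slot, 0) > 0 and slot not in slots:
                slots.append(slot)
    return slots


def top_legal_player(rankings, open_slots):
    """Walk the cheat sheet in order and return the first player who fits."""
    for name, positions in rankings:
        if legal_slots(positions, open_slots):
            return name
    return None


# Hypothetical late-draft state: 1B, CI and UT are full; one OF and two P remain.
open_slots = {"1B": 0, "CI": 0, "UT": 0, "OF": 1, "P": 2}
rankings = [
    ("First baseman A", ["1B"]),
    ("Pitcher B", ["P"]),
    ("Outfielder C", ["OF"]),
]
print(top_legal_player(rankings, open_slots))
```

Here the top-ranked first baseman has nowhere to go, so the rule slides down to the pitcher. The sketch only checks legality; choosing which open slot to use for a multi-position player is left out.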

Much like with the 5x5 mocks, comparing my total points to those of my competitors is moot. Even with a poorly constructed draft list, my total will be inherently high since it's biased by the ESPN rankings, which were used to set my rankings. Comparing them against each other has some value, even though the flow of each mock was different.

Intuitively, I expected the totals from the pair of unadjusted cheat sheets to be lowest, with the highest score coming from the adjusted list and freedom to read the room. The results appear in the first table at the end of this article.

Obviously, the results didn't go as expected, but that makes for a better experiment. Confirming intuition is never a bad thing, but being forced to think through the results is a more fruitful endeavor.

The sample size is clearly too small to draw credible conclusions. The flow of each draft dictated each pick and, anecdotally, I was taken aback by how often the highest remaining player on the board fit into an open roster spot. I wasn't forced to slide down the rankings until the 19th pick. This was completely fortuitous.

Another factor is ESPN scoring is designed so the difference in useful points level across positions is small. That is, the listing of players using the adjusted and unadjusted lists wasn't as shuffled as it would be using other scoring systems, with catchers jumping up the adjusted list. Please note, outfielders are lumped in with the other non-catchers for this portion of the experiment.

As such, contending the non-adjusted lists were superior to the adjusted ones isn't valid, even though they were the two highest scoring totals. More mocks are needed to test this hypothesis.

Despite the small sample, accruing the fewest points in the adjusted mock with freedom to pick anyone is bothersome, if not humbling. It's easy to attribute it to draft flow and call it a day, but that's poor science.

Earlier, a cognitive bias to adhere to rankings was mentioned. What if the low total was a result of a poor draft list?

I've already alluded to a pair of deficiencies in the adjusted rankings. They don't account for streaming or the outfielder tweak. The latter has already been described so let's discuss the former.

In-season roster management by taking advantage of favorable matchups is integral to winning. Activating hitters in a potentially more productive scenario is helpful, but streaming pitching is obligatory. Successful streaming increases the projected points of the 90th pitching roster spot thus lowering the useful points supplied by each pitcher. The result is a rankings list with pitching being pushed down. At the end of the day, they contribute the same number of points to the team total, but where they fall on the cheat sheet influences when they'll be drafted. The better they're situated relative to hitters, the greater the chance of optimizing the final total.

Technically, an adjustment for the useful points for the hitting spots needs to be included. However, the actual adjustment will be the difference between pitcher and hitter so this difference can be estimated and applied just to pitchers, facilitating the process.

The worst pitchers on everyone's roster will likely be replaced by a hurler scheduled for two starts. If he's available, he's not very good, so expectations should be tempered. However, in theory, someone with favorable matchups is deployed, likely to tally between 15 and 25 points for the week. According to the data, the pitcher replaced should score about 10 points, so the difference is an added 5 to 15 points, less the contribution from streaming batters.
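The arithmetic behind that range is small enough to spell out. In this sketch the function and parameter names are hypothetical, and the hitter streaming contribution defaults to zero only because the article does not put a number on it.

```python
def net_streaming_gain(streamer_points, replaced_points, hitter_streaming_gain=0):
    """Weekly points added by streaming a pitcher, net of streaming gains by hitters."""
    return streamer_points - replaced_points - hitter_streaming_gain


# A streamer scoring 15 to 25 points replaces an arm expected to score about 10.
low = net_streaming_gain(15, 10)
high = net_streaming_gain(25, 10)
print(low, high)
```

Any points gained by streaming batters shrink the net edge, which is why only the difference between the pitching and hitting gains needs to be estimated.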

What follows are the results from a shadow drafting experiment, using the five mocks as the basis. Three rankings lists were generated for each of the mocks. Each incorporated the outfield adjustment, then a different pitching alteration. The useful points level for pitching was:

1. Unchanged
2. Increased by 10 points
3. Increased by 20 points

This decreased the projected useful points for pitchers, forcing them down the draft list. There's an ancillary benefit to this in terms of roster construction. Pushing pitching down probably manifests in taking lower-ranked pitchers and higher-ranked batters. This is beneficial since the streaming detailed earlier replaces a lesser arm, increasing the points added to the roster spot since the expectation is lower.
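To illustrate how raising the pitching useful points level reshuffles a cheat sheet, here is a minimal sketch. The function name, player labels and useful points values are hypothetical; only the three tested levels (0, 10 and 20) come from the experiment.

```python
def rerank(hitters, pitchers, pitching_bump):
    """Merge hitters and pitchers into one list after raising the pitching level.

    Both inputs map name -> useful points. Raising the pitching useful points
    level by `pitching_bump` lowers every pitcher's useful points by that amount.
    """
    combined = dict(hitters)
    combined.update({name: pts - pitching_bump for name, pts in pitchers.items()})
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical useful points before any pitching alteration.
hitters = {"H1": 120, "H2": 95, "H3": 80}
pitchers = {"P1": 135, "P2": 102, "P3": 85}

for bump in (0, 10, 20):
    print(bump, rerank(hitters, pitchers, bump))
```

With no bump the list alternates pitcher and hitter starting with P1. At 10 points H2 moves ahead of P2, and at 20 points H1 overtakes P1 for the top spot, so batters slide up without any change to their own projections.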

Here are the rules for the 15 shadow mocks:

Only players on the board at the time of my pick are eligible.
Freedom to take anyone, not strictly one from the top.
Shuffle up the order and do the mocks in several sittings to prevent using knowledge of the draft to favor picks.
Do not track points, and do not redo drafts with a lower-than-desired points total.

The rules are designed to get an honest appraisal of the results. It doesn't do me any good to repeat shadowing the mocks to get a more impressive series of totals. The idea is to become a better player and pass that knowledge along. Fudging results is counterproductive if that's not how I would have truly drafted.

With that as a backdrop, the totals for the 15 shadow drafts appear in the second table at the end of this article, alongside the initial mock totals.

The good news is the totals from the shadow drafts were largely better, suggesting the rankings lists were improved. However, discerning which pitcher adjustment was optimal yields inconsistent results.
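One way to summarize the comparison is to tabulate the reported totals directly. The sketch below uses the published shadow draft and initial mock totals; the variable names and the choice of summary statistics are mine, not part of the original study.

```python
# Initial mock totals and shadow draft totals by pitching adjustment (0, 10, 20).
initial = {"Mock 1": 8334, "Mock 2": 8184, "Mock 3": 8248, "Mock 4": 8191, "Mock 5": 8117}
shadow = {
    "Mock 1": {0: 8329, 10: 8304, 20: 8175},
    "Mock 2": {0: 8205, 10: 8190, 20: 8111},
    "Mock 3": {0: 8301, 10: 8245, 20: 8257},
    "Mock 4": {0: 8224, 10: 8228, 20: 8233},
    "Mock 5": {0: 8176, 10: 8163, 20: 8147},
}

# How many of the 15 shadow drafts beat the corresponding initial mock?
improved = sum(
    total > initial[mock] for mock, row in shadow.items() for total in row.values()
)

# Which adjustment produced the best total within each mock?
best = {mock: max(row, key=row.get) for mock, row in shadow.items()}

# Combined total across the five mocks for each adjustment level.
totals = {adj: sum(row[adj] for row in shadow.values()) for adj in (0, 10, 20)}

print(improved, "of 15 shadow drafts beat the initial mock")
print(best)
print({adj: total / len(shadow) for adj, total in totals.items()})
```

Ten of the 15 shadow drafts beat their initial mock. No adjustment was best in four of the five mocks while Mock 4 favored 20 points, and the five-mock average falls from 8247.0 at zero to 8226.0 at 10 and 8184.6 at 20.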

In full disclosure, I've been making this pitching adjustment for years in points leagues, utilizing mock drafts and ADP (average draft position) to help set the level. Humbly, the results have been positive.

The problem is, the described adjustment really doesn't put more points on the roster, at least not directly. Its purpose is paving the way for more in-season points. The effect may not be witnessed in the projected draft total.

Still, while the trend within each mock was different, the adjustments were beneficial, likely beyond simply pushing outfielders up, though that helped. Then I had an epiphany.

This statement is going to ruffle some feathers, but not everyone sets a proper rankings list. This goes beyond the simple useful points adjustment because, as stated, ESPN scoring doesn't require a significant change. Some draft intuitively, or perhaps with a rotisserie influence, and don't bother to run the numbers. That work is done for you in the ESPN draft room, which calculates the FPTS (fantasy points) for each player and allows sorting by that column.

The best example is Ronald Acuna Jr. In standard 5x5 scoring, he's arguably the top fantasy performer; in fact, he's my first overall pick. However, his high strikeout rate is detrimental in ESPN scoring, dropping the Braves outfielder to late second/early third round production. In the five mocks, Acuna Jr. was selected 2nd, 3rd, 8th, 3rd and 4th overall.

While it's great many are misranking players, it's moot unless your roster construction avails openings to take advantage of the better players left on the board. The problem is, there is no way of gauging the preparation level of the room. Long-time points players may dispute this and contend their league knows what it's doing, and they are probably correct. However, it's been my experience every draft features some members with poor rankings.

Anecdotally, other than something like the Acuna snafu, the main mistake in points leagues is waiting too long for pitching. This makes sense in rotisserie scoring since two of the categories are ratio stats which can be managed. Points are akin to a counting stat, needing to be bullied for hitters and pitchers.

If the room is waiting on pitching, and your rankings already favor arms, following the rankings verbatim results in filling most of the pitching slots early. By the time hitters populate the top of your ranks, the quality is lacking, in effect giving back the advantage gained with early arms.

The way to combat this is pushing pitching down the rankings while the batters slide up. Does this sound familiar? This is precisely what occurred with the experiment conducted earlier. However, instead of making the adjustment based on accounting for streaming, it's a way to adjust for market ranking inefficiencies. Sure, you still need to read the room, but having more batters organically be atop your cheat sheet in earlier rounds helps overcome the bias of feeling uncomfortable when required to jump way down the rankings as dictated by the draft flow.

While the flow of each draft isn't known, the adjustment most likely to benefit can be determined and applied, leaving more manageable subjective decisions. Based on the shadow drafts, a 20-point adjustment forces pitching too low, resulting in taking too few arms early. In fact, of the three tested levels, no adjustment is best, though testing a couple of spots in between zero and 10 is needed to unequivocally say no adjustment is needed.

At this point, it's necessary to point out a couple of important considerations. I've conducted similar research on formats with different scoring systems and the result has always been some level of adjustment is needed. That said, the rankings from these other scoring systems feature a higher level of pitching domination. Granted, the top four spots in my ESPN adjusted rankings are pitchers, but the delta between them and the batters is smaller than in other systems. Plus, overall, pitchers are ranked higher in those systems than in ESPN. It usually requires between a 10- and 20-point adjustment to optimize rankings. Until this write-up, I attributed it to streaming, but now I know it's a market adjustment, not unlike deciding on the optimal hitting versus pitching split in rotisserie auctions.

The other factor is I humbly may be better able to overcome a rankings bias and know intuitively when to snake down my rankings to choose the most-worthy option. That said, before straining my arm patting myself on the back, even though I tried to mitigate knowledge of the mock results when shadow drafting, I'm sure I was making "better" picks subconsciously and not getting an honest appraisal of the lists. I tried to make each with the mindset of "Do what you would have done if this were a real draft," but I can't guarantee I was truthful each pick.

The important lesson is these are my results, derived from my projections for this format of an ESPN league. Your projections could be different, or the scoring or number of teams may not be the same. This all influences the market, perhaps leading to the need to adjust pitching points.

The message transcending all formats is a properly constructed rankings list is crucial. The number of sub-pools should be determined with the corresponding useful points adjustment. From there, the rankings should be battle-tested against the market to discern if any further changes are needed to offer the optimum rankings and cheat sheet.

Mock draft results (total points)
Type	Total
Mock 1 non-Adjusted	8334
Mock 2 Adjusted	8184
Mock 3 non-Adjusted	8248
Mock 4 Adjusted	8191
Mock 5 Adjusted with freedom	8117

Shadow draft results by pitching adjustment, with the initial mock total for comparison
Mock	0 points	10 points	20 points	Initial
Mock 1	8329	8304	8175	8334
Mock 2	8205	8190	8111	8184
Mock 3	8301	8245	8257	8248
Mock 4	8224	8228	8233	8191
Mock 5	8176	8163	8147	8117