If you have any questions about the code here, feel free to reach out to me on Twitter or on Reddit.

Shameless Plug Section

If you like Fantasy Football and have an interest in learning how to code, check out our Ultimate Guide on Learning Python with Fantasy Football Online Course. Here is a link to purchase for 15% off. The course includes 15 chapters of material, 14 hours of video, hundreds of data sets, lifetime updates, and a Slack channel invite to join the Fantasy Football with Python community.

Draft Model 2020

In this post, I want to show you how you can build a draft model for 2020 using Python, Pandas, and some web scraping in no more than a couple hundred lines of code. I'll provide the source code in a GitHub repo and you guys can make pull requests if you like.

I know I said in the last post that I'd be working on other stuff for part 5 of the series, but I couldn't resist, considering draft season is close and I haven't posted for the intermediate series in so long. (Oops - sorry guys)

A couple things, first.

1 - We are going to be coming up with a draft model for a snake draft. I'm not super familiar with auctions and so we won't be focusing on them too much. Sorry.

2 - In order to get through this thing in one swoop, I'm going to gloss over some of the in-depth code explanations I usually give in my other posts. If you've made it this far, you should be able to keep up though. If not, Stack Overflow is your friend!

3 - We are going to be calculating something known as value over replacement for each player in the draft pool, and then sorting them in descending order. This will be the basis of our ranking model. To begin this post, we'll talk about what value over replacement is and why it's actually really effective at ranking players for the draft. If I decide to make this thing a series, we'll also include ADP data in our final DataFrame and look for gaps between ADP rank and VOR rank (the point of this is that we'll be looking for bargains/steals).

4 - This draft model works for the standard, half-ppr, and ppr formats. I'll be working in PPR as that's my main league format, but this code here can easily be extended to the other formats.

5 - We are going to be scraping the data from FantasyPros. Thanks FantasyPros (even though we didn't ask)! FantasyPros provides us two things that we need for our model - ADP data and projections. Moreover, the projection data hosted on FantasyPros is a combination of 4 different sources.

That's all the background info you need for now. Let's talk about VOR, what it is, and why you should use it.

Value Over Replacement

	A	B
QB	20	25
RB	19	11
WR	15	14

Let's take this simple example of two teams facing off, which we'll call team A and team B. There are only two teams in this fictional league and only 6 players in the draft pool - no other teams, no other players. Each team has one spot for QB, RB, and WR. The DataFrame here shows us how many points each position scored for each team. Let's do two things: first, let's see which team won, and then let's look at the point differential for each position.

So, A won with a win margin of 4 points.

	A	B	A - B
QB	20	25	-5
RB	19	11	8
WR	15	14	1

So, here we have altered our original df to include the score differential. We can see that A's RB provided the largest score differential for team A.

We also calculated the max for each row, and we can see that B's QB topped the list.
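Here's a minimal sketch of those three calculations (team totals, the A - B differential, and the max for each row). It uses plain dictionaries instead of a DataFrame so it runs without any dependencies; the variable names are mine, not the ones from the original notebook.

```python
# Points scored by each starter in the two-team toy league (from the table above).
scores = {
    "QB": {"A": 20, "B": 25},
    "RB": {"A": 19, "B": 11},
    "WR": {"A": 15, "B": 14},
}

# Team totals decide the winner.
total_a = sum(row["A"] for row in scores.values())
total_b = sum(row["B"] for row in scores.values())
margin = total_a - total_b

# Per-position differential (A - B) and the best raw score at each position.
differential = {pos: row["A"] - row["B"] for pos, row in scores.items()}
row_max = {pos: max(row.values()) for pos, row in scores.items()}

print(total_a, total_b, margin)  # 54 50 4
print(differential)              # {'QB': -5, 'RB': 8, 'WR': 1}
print(row_max)                   # {'QB': 25, 'RB': 19, 'WR': 15}
```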

This is at the heart of VOR. B's QB scored the most points, but was he the most valuable player?

Not exactly. The most valuable player was actually A's RB, as he provided the largest differential.

You see, you have a limited number of spots on your starting roster to play your players. And your opponent also has a limited number of starting roster spots to play their players. And so the goal of fantasy football is not to maximize how many points you'll score, but to maximize your scoring differential at each position in relation to your opponent.

If the goal of FF was to maximize your points scored, then we'd all be picking QBs early in the draft. But we're all smart enough so that this doesn't happen. Why? Because there are few QBs that provide a large enough differential compared to their peers to justify drafting them so high. In contrast, RBs are more spread out, and so if you miss out on a stud RB in the early rounds, you may be hard pressed to find a RB in later rounds who can provide that same differential. In short, RBs have higher positional value, and as we'll see, higher replacement values. For QBs, you can just pick up Matt Stafford in the 10th round, and you would've only been slightly better off if you went Russell Wilson in the 7th. The same can't be said if you don't draft Mark Ingram in the 7th round and go for Damien Williams in the 10th.

You can think of these scoring differentials as our value over replacement numbers. Each player's value is the differential they can provide over a typical replacement player. In the universe we constructed, though, we only have 2 teams and 6 players. And thus, calculating VOR is as simple as the following:
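A rough sketch of that calculation, assuming each player's "replacement" in this tiny universe is just the other team's starter at the same position. The labels like "A's RB" and the variable names are made up for illustration.

```python
# Points scored by each starter in the two-team toy league.
scores = {
    "QB": {"A": 20, "B": 25},
    "RB": {"A": 19, "B": 11},
    "WR": {"A": 15, "B": 14},
}

# With only two teams, the replacement for any player is the opposing starter
# at the same position, so VOR is own points minus the opponent's points.
vor = {}
for pos, row in scores.items():
    vor[f"A's {pos}"] = row["A"] - row["B"]
    vor[f"B's {pos}"] = row["B"] - row["A"]

# Sort by VOR, highest first, to get the draft order.
ranking = sorted(vor.items(), key=lambda item: item[1], reverse=True)
for rank, (player, value) in enumerate(ranking, start=1):
    print(rank, player, value)
```

The first three lines printed are A's RB (8), B's QB (5), and A's WR (1).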

And there's our ranking model (for this universe)! If you were to go back in time and redraft, you would want to pick A's RB 1st, B's QB 2nd, and A's WR 3rd. The point here is that you wouldn't go B's QB 1st, even though he scored the highest points.

Extending this to a 12 team league with a 196 player draft pool and a large waiver wire makes calculating value over replacement much more difficult, and thus we need to rely on estimates of replacement value instead.

What we have to do is find a "replacement player" for each position in the draft pool - a player whose projected points represent the average positional value at that position. Then, with respect to each player's position (this is important: you want to compare each player's projected points to their own position's replacement value), subtract out the replacement value you got from your replacement player. The value you're left with is each player's value over the typical replacement player, or for short, their value over replacement.

There are multiple ways people do this, but I've found the most reliable method is to do the following:

1 - Look at ADP for the upcoming draft year and look at pick #100.

2 - Starting from pick 100, go backwards and look for the last WR, RB, QB, and TE picked thus far. These players are your replacement players.

Other methods include using "man games" (which, in my opinion, is a bit too convoluted for my taste, although it is an interesting idea), picking the average starter, and picking the worst average starter. I've found the "point in draft" method I laid out above works the best. The decision to use pick #100 is relatively arbitrary, but it's what most FF-experts use, so we'll roll with it. You can also use several cutoff points in a range of, say, [75, 125], average the replacement values you get from them, and use that as your model. I actually don't think that's a bad idea, but we'll just be using 100 for the sake of brevity. You can adjust the cutoff point based on your league size. 100 works well for leagues in the 10-12 team range, but if you have a 16 team league for example, maybe move that up to 115.
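We won't use it in this post, but the "several cutoff points" idea could look something like the sketch below. The draft board is made up and tiny, so the cutoffs run from 8 to 12 instead of 75 to 125. The function names are hypothetical.

```python
# Made-up draft board in ADP order: (position, projected points).
board = [
    ("RB", 300.0), ("WR", 280.0), ("RB", 260.0), ("QB", 350.0),
    ("WR", 240.0), ("TE", 200.0), ("RB", 220.0), ("QB", 320.0),
    ("WR", 210.0), ("TE", 170.0), ("RB", 190.0), ("QB", 300.0),
]

def replacement_values_at(board, cutoff):
    """Projected points of the last player taken at each position by `cutoff`."""
    values = {}
    for pos, points in board[:cutoff]:
        values[pos] = points  # later picks overwrite earlier ones
    return values

def averaged_replacement_values(board, cutoffs):
    """Average each position's replacement value over several cutoff picks."""
    totals, counts = {}, {}
    for cutoff in cutoffs:
        for pos, points in replacement_values_at(board, cutoff).items():
            totals[pos] = totals.get(pos, 0.0) + points
            counts[pos] = counts.get(pos, 0) + 1
    return {pos: totals[pos] / counts[pos] for pos in totals}

averaged = averaged_replacement_values(board, cutoffs=range(8, 13))
print(averaged)
```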

Hopefully, that all makes sense.

With the theory out of the way, let's code. Our first step is to find our replacement players. We'll find this using ADP data provided by FantasyPros. We have to scrape this data. Like I said, we'll be working with PPR data (ADP obviously changes for each format), but this can easily be extended to half PPR and standard.

We wrote a little function called make_adp_df that makes a request to the URL stored in the BASE_URL variable. If you inspect element on the page, you'll find the data we need is hidden in a table tag with an id of 'data'. If you want to use a different format, go to that URL and toggle the drop down list to your league format. The URL will change, and that will be the URL you will use in your function.
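To give a feel for the parsing step without hitting the site, here's an offline sketch that pulls the rows out of a table with an id of 'data' using only the standard library. The HTML string, the column names, and the "RB1"-style position labels are all made-up assumptions about the page. The parse_adp function is a hypothetical stand-in, not the post's make_adp_df, and the live request to BASE_URL is left out on purpose.

```python
from html.parser import HTMLParser

# Made-up stand-in for the page source; the real markup may differ.
SAMPLE_HTML = """
<table id="other"><tr><td>ignore me</td></tr></table>
<table id="data">
  <tr><th>Player</th><th>POS</th><th>AVG</th></tr>
  <tr><td>Player One</td><td>RB1</td><td>1.5</td></tr>
  <tr><td>Player Two</td><td>WR1</td><td>4.0</td></tr>
  <tr><td>Player Three</td><td>QB1</td><td>2.5</td></tr>
</table>
"""

class DataTableParser(HTMLParser):
    """Collects the cell text of every row inside <table id="data">."""

    def __init__(self):
        super().__init__()
        self.in_table = False
        self.in_cell = False
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("id") == "data":
            self.in_table = True
        elif self.in_table and tag == "tr":
            self.rows.append([])
        elif self.in_table and tag in ("td", "th"):
            self.in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False
        elif tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        if self.in_table and self.in_cell:
            self.rows[-1][-1] += data.strip()

def parse_adp(html):
    """Return the table rows as dicts, sorted by average draft position."""
    parser = DataTableParser()
    parser.feed(html)
    header, *body = parser.rows
    records = [dict(zip(header, row)) for row in body]
    for record in records:
        record["POS"] = record["POS"][:2]   # "RB1" -> "RB"
        record["AVG"] = float(record["AVG"])
    return sorted(records, key=lambda r: r["AVG"])

adp = parse_adp(SAMPLE_HTML)
for row in adp:
    print(row)
```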

I included print statements along the way to visualize the changes I was making to the dataframe as we made them. What we are left with is that df under 'Final output', which is all we need.

What we need now is to cut off our df at 100, and find the last RB, QB, TE, and WR chosen up to that point (on average), and append them to a dictionary we'll call replacement_players.

That was pretty easy actually. All we did here was continuously update our dictionary until we reached the end of our loop. The last player at each position is the one that stays in our replacement_players dictionary.

Now that we have our replacement players, we have to get projection data. We're going to scrape PPR projection data from FantasyPros, and then replace the player name values in our dictionary with their projected points.
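In case the loop is hard to picture, here's a minimal sketch of the idea on a made-up eight-player board with a cutoff of 6 instead of 100. The player names and the function name are hypothetical.

```python
# Made-up draft board in ADP order: (player, position).
adp = [
    ("RB Alpha", "RB"), ("WR Alpha", "WR"), ("RB Bravo", "RB"), ("QB Alpha", "QB"),
    ("TE Alpha", "TE"), ("WR Bravo", "WR"), ("RB Charlie", "RB"), ("QB Bravo", "QB"),
]

def find_replacement_players(adp, cutoff=100):
    """Return the last player drafted at each position within the first `cutoff` picks."""
    replacement_players = {"RB": None, "WR": None, "QB": None, "TE": None}
    for player, pos in adp[:cutoff]:
        if pos in replacement_players:
            # Keep overwriting, so the last pick at each position is what survives.
            replacement_players[pos] = player
    return replacement_players

replacement_players = find_replacement_players(adp, cutoff=6)
print(replacement_players)
# {'RB': 'RB Bravo', 'WR': 'WR Bravo', 'QB': 'QB Alpha', 'TE': 'TE Alpha'}
```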

	PLAYER	POS	FPTS
0	Christian McCaffrey	RB	366.5
0	Lamar Jackson	QB	355.4
1	Patrick Mahomes	QB	342.0
0	Michael Thomas	WR	326.4
2	Dak Prescott	QB	314.1

Cool, so now we have a dataframe with projected player points straight from FantasyPros. I've added comments throughout the code so we can move on with life faster (at least I can).

So we have replacement players, we have projected points, what's left to do now is calculate our replacement values for each position from our replacement_players dictionary, and then calculate a new column for our final df called VOR, and sort that table in descending order.
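The swap from player names to projected points might look like the sketch below. I'm assuming the replacement players are the four guys sitting at a VOR of 0.0 in the final table further down; that's an inference from the table, not something stated outright. The projected points are copied from that same table.

```python
# Player -> (position, projected PPR points), copied from the final table.
projections = {
    "Christian McCaffrey": ("RB", 366.5),
    "Kerryon Johnson": ("RB", 152.0),
    "Aaron Rodgers": ("QB", 273.9),
    "John Brown": ("WR", 165.1),
    "Jared Cook": ("TE", 146.7),
}

# Assumed replacement players (the rows with a VOR of 0.0 in the final table).
replacement_players = {
    "RB": "Kerryon Johnson",
    "QB": "Aaron Rodgers",
    "WR": "John Brown",
    "TE": "Jared Cook",
}

# Replace each player name with that player's projected points.
replacement_values = {
    pos: projections[name][1] for pos, name in replacement_players.items()
}
print(replacement_values)
# {'RB': 152.0, 'QB': 273.9, 'WR': 165.1, 'TE': 146.7}
```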

So now we have our replacement values, based on the df we just built above. To reiterate, these replacement values are what's going to be subtracted from each player's projected FPTS, WITH RESPECT TO THEIR POSITION. I can't emphasize that enough. The real value, pun not intended, in a value over replacement model is the ability to compare players at different positions with different projected points. How do you weigh a QB against a RB? The answer to that question is not trivial and is usually left to intuition. But this is Fantasy Football Data Pros, damnit. To hell with intuition.
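Here's a small sketch of that position-aware subtraction, using five players and the replacement values implied by the tables in this post. It's plain Python rather than the DataFrame version, and the rounding to one decimal is my choice to keep float noise out of the output.

```python
# Replacement values by position (implied by the rows with VOR 0.0 in the final table).
replacement_values = {"RB": 152.0, "QB": 273.9, "WR": 165.1, "TE": 146.7}

# (player, position, projected points), copied from the projection table.
players = [
    ("Christian McCaffrey", "RB", 366.5),
    ("Lamar Jackson", "QB", 355.4),
    ("Patrick Mahomes", "QB", 342.0),
    ("Michael Thomas", "WR", 326.4),
    ("Dak Prescott", "QB", 314.1),
]

# Subtract the replacement value for the player's OWN position.
vor = {
    name: round(fpts - replacement_values[pos], 1)
    for name, pos, fpts in players
}
for name, value in vor.items():
    print(name, value)
```

With a DataFrame, the same idea could be written as something like `df['FPTS'] - df['POS'].map(replacement_values)`, though the line used in the original code may differ.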

	PLAYER	POS	FPTS	VOR
0	Christian McCaffrey	RB	366.5	214.5
0	Lamar Jackson	QB	355.4	81.5
1	Patrick Mahomes	QB	342.0	68.1
0	Michael Thomas	WR	326.4	161.3
2	Dak Prescott	QB	314.1	40.2

And in one line of code, we've done it! Let's sort our dataframe on VOR and look at our top ranked players.
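One thing to notice in the table below is the fractional ranks like 60.5 and 95.5. Those look like ties sharing the average of their positions, which is what pandas' rank does by default. Here's a plain-Python sketch of that tie handling on six VOR values taken from the bottom of the table; the average_rank helper is hypothetical.

```python
def average_rank(values):
    """Rank values in descending order; tied values share the mean of their positions."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for position, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]

# VOR values for Hurst, Cook, Johnson, Brown, Rodgers, and Samuel.
vor_values = [0.3, 0.0, 0.0, 0.0, 0.0, -1.1]
print(average_rank(vor_values))  # [1.0, 3.5, 3.5, 3.5, 3.5, 6.0]
```

The four players tied at 0.0 occupy positions 2 through 5 in this small sample, so they each get 3.5, the same way they share 95.5 in the full table.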

	PLAYER	POS	FPTS	VOR	VALUERANK
0	Christian McCaffrey	RB	366.5	214.5	1.0
0	Michael Thomas	WR	326.4	161.3	2.0
1	Saquon Barkley	RB	305.9	153.9	3.0
2	Ezekiel Elliott	RB	299.3	147.3	4.0
5	Alvin Kamara	RB	294.9	142.9	5.0
4	Dalvin Cook	RB	289.4	137.4	6.0
1	Davante Adams	WR	289.2	124.1	7.0
2	Julio Jones	WR	279.7	114.6	8.0
0	Travis Kelce	TE	254.2	107.5	9.0
3	Derrick Henry	RB	259.1	107.1	10.0
7	Clyde Edwards-Helaire	RB	249.6	97.6	11.0
8	Miles Sanders	RB	249.5	97.5	12.0
14	Austin Ekeler	RB	248.4	96.4	13.0
5	DeAndre Hopkins	WR	260.0	94.9	14.0
12	Kenyan Drake	RB	245.6	93.6	15.0
3	Tyreek Hill	WR	257.3	92.2	16.0
10	Aaron Jones	RB	240.9	88.9	17.0
1	George Kittle	TE	233.5	86.8	18.0
9	Joe Mixon	RB	237.5	85.5	19.0
4	Chris Godwin	WR	250.4	85.3	20.0
0	Lamar Jackson	QB	355.4	81.5	21.0
6	Nick Chubb	RB	227.4	75.4	22.0
8	D.J. Moore	WR	238.8	73.7	23.0
12	Robert Woods	WR	237.8	72.7	24.0
6	Mike Evans	WR	237.6	72.5	25.0
18	Le'Veon Bell	RB	223.9	71.9	26.0
11	Josh Jacobs	RB	223.6	71.6	27.0
17	Leonard Fournette	RB	222.2	70.2	28.0
7	Kenny Golladay	WR	234.3	69.2	29.0
15	Cooper Kupp	WR	233.8	68.7	30.0
1	Patrick Mahomes	QB	342.0	68.1	31.0
16	Allen Robinson	WR	232.7	67.6	32.0
13	Chris Carson	RB	217.5	65.5	33.0
15	Todd Gurley	RB	217.1	65.1	34.0
9	Adam Thielen	WR	230.0	64.9	35.0
11	Amari Cooper	WR	229.1	64.0	36.0
14	Calvin Ridley	WR	228.4	63.3	37.0
3	Zach Ertz	TE	209.8	63.1	38.0
19	Keenan Allen	WR	226.7	61.6	39.0
16	David Johnson	RB	213.3	61.3	40.0
13	Tyler Lockett	WR	223.8	58.7	41.0
22	JuJu Smith-Schuster	WR	221.5	56.4	42.0
17	Odell Beckham Jr.	WR	220.9	55.8	43.0
19	James Conner	RB	206.3	54.3	44.0
20	Melvin Gordon	RB	205.9	53.9	45.0
10	A.J. Brown	WR	216.7	51.6	46.0
18	DeVante Parker	WR	214.5	49.4	47.0
20	Courtland Sutton	WR	212.7	47.6	48.0
21	Terry McLaurin	WR	211.7	46.6	49.0
24	T.Y. Hilton	WR	209.3	44.2	50.0
4	Darren Waller	TE	190.2	43.5	51.0
2	Mark Andrews	TE	189.9	43.2	52.0
27	Jarvis Landry	WR	205.9	40.8	53.0
2	Dak Prescott	QB	314.1	40.2	54.0
3	Deshaun Watson	QB	312.1	38.2	55.0
23	D.K. Metcalf	WR	203.2	38.1	56.0
26	Stefon Diggs	WR	201.6	36.5	57.0
28	A.J. Green	WR	201.5	36.4	58.0
30	D.J. Chark	WR	201.4	36.3	59.0
25	Michael Gallup	WR	199.5	34.4	60.5
32	Tyler Boyd	WR	199.5	34.4	60.5
33	Julian Edelman	WR	198.0	32.9	62.0
29	Marquise Brown	WR	196.1	31.0	63.0
4	Russell Wilson	QB	304.1	30.2	64.0
25	Devin Singletary	RB	180.7	28.7	65.0
31	Marvin Jones	WR	188.8	23.7	66.0
24	Ronald Jones II	RB	175.4	23.4	68.0
36	Jamison Crowder	WR	188.5	23.4	68.0
29	Kareem Hunt	RB	175.4	23.4	68.0
5	Evan Engram	TE	168.0	21.3	70.0
5	Kyler Murray	QB	294.9	21.0	71.0
21	Mark Ingram II	RB	172.0	20.0	72.0
22	David Montgomery	RB	169.1	17.1	73.0
37	Diontae Johnson	WR	182.1	17.0	74.0
6	Josh Allen	QB	290.9	17.0	75.0
38	Tarik Cohen	RB	168.7	16.7	76.0
23	Jonathan Taylor	RB	168.1	16.1	77.5
38	Christian Kirk	WR	181.2	16.1	77.5
35	James White	RB	167.2	15.2	79.0
41	Sterling Shepard	WR	179.9	14.8	80.0
8	Tyler Higbee	TE	161.0	14.3	81.0
27	D'Andre Swift	RB	165.4	13.4	82.0
7	Matt Ryan	QB	286.7	12.8	83.0
34	Will Fuller	WR	177.1	12.0	84.0
6	Hunter Henry	TE	158.6	11.9	85.0
35	Brandin Cooks	WR	176.9	11.8	86.0
39	Preston Williams	WR	174.3	9.2	87.0
44	Golden Tate	WR	174.0	8.9	88.0
26	Raheem Mostert	RB	159.3	7.3	89.0
28	Cam Akers	RB	156.5	4.5	90.0
8	Tom Brady	QB	278.0	4.1	91.0
9	Drew Brees	QB	275.9	2.0	92.0
11	Hayden Hurst	TE	147.0	0.3	93.0
7	Jared Cook	TE	146.7	0.0	95.5
30	Kerryon Johnson	RB	152.0	0.0	95.5
40	John Brown	WR	165.1	0.0	95.5
10	Aaron Rodgers	QB	273.9	0.0	95.5
42	Deebo Samuel	WR	164.0	-1.1	98.0
11	Carson Wentz	QB	270.3	-3.6	99.0
13	Mike Gesicki	TE	142.6	-4.1	100.0

You now have a draft model completely built in less than (I think) 100 lines of Python. This would've taken me like 4 hours in Excel, and I can only imagine the INDEX and MATCH formulas I'd have to use (I'm getting a headache just thinking about it).

I'll leave it up to you to interpret the results. I ran this same model through a FantasyPros mock draft and got a score of 93, for whatever that's worth.

In the next post, I think we'll come back to joining tables (like I promised in part 4) and join ADP data and this model here. We'll then look for gaps in ADP and our ranking model and try to find those players who are sleepers, and those players who are overvalued.

Thanks for reading, you guys are awesome.
