
Building a VOR Draft Model with Python

In this part of the intermediate series, we are going to be doing our yearly draft prep with our VOR model. If you don't know what value over replacement is, it's basically a model that shows which players are most valuable and can be used as a rankings list. I've written about VOR at length in other parts of this blog, but I'll cover it briefly here.

Different positions have different mean fantasy outputs. An RB who scores 350 points is much more valuable than a QB who scores 350 points, since the mean fantasy output for QBs as a whole is much higher. In a value over replacement model, value is defined as how much output a player delivers over a typical replacement player at their own position. On this site, we use the "last player drafted at a certain draft spot" method of selecting typical replacement players at each position. Essentially, we look at ADP (average draft position) and select a cutoff point (we use 100). At the 100th pick, we ask, "Who were the last RB, QB, WR, and TE chosen in the draft?" and these are our replacement players. Their projected fantasy output is the replacement value we use for each position. A player's value over replacement is their projected fantasy output minus their position's replacement value. We then sort these VORs in descending order to get our ranking model.
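To make the definition concrete, here is a minimal sketch of the arithmetic in plain Python. The PPR projections and replacement values are the ones that appear in the output tables later in this post; the function name `value_over_replacement` is just an illustrative choice.

```python
# Replacement values (PPR) for two positions, taken from the table later in this post.
replacement_value = {"RB": 161.02, "QB": 286.624}

# (player, position, projected PPR points), also taken from the final ranking table.
players = [
    ("Christian McCaffrey", "RB", 397.66),
    ("Josh Allen", "QB", 378.194),
]


def value_over_replacement(points, position):
    """Projected points minus the replacement value at the player's position."""
    return points - replacement_value[position]


vor = {name: value_over_replacement(points, pos) for name, pos, points in players}

for name, value in vor.items():
    print(f"{name}: {value:.2f}")
```

The two players are projected within about 20 points of each other, but the RB ends up with well over twice the VOR because the replacement-level QB scores so much more than the replacement-level RB.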

The basic idea of VOR is that we want to avoid drafting players who are being taken above value and snatch up players who are currently being drafted below value. We are net neutral on players who are being drafted at value. We do this by comparing our ranking model against ADP to find underdrafted and overdrafted players.

In this notebook, we will be using FantasyPros projections to help us establish the value we assign to each player. The projections are as important as the actual value calculation here. If your projections are crap, then your VOR model is also going to be crap, no way around that. I choose to use FantasyPros every year because their projections are an aggregation of multiple projection sources, and historically, crowdsourced projections have been the most accurate.

However, you can change these projections to whatever source you'd like. You can even use your own projections. The process of generating a VOR model from a list of projections is all the same.

The Code Behind the Model

Open up a new Jupyter notebook and type in the first cell below. As always, here, we're simply importing libraries.
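A first cell along these lines should be enough for everything that follows. The exact import list is an assumption, based on the libraries this post names later (pandas, requests, and BeautifulSoup).

```python
import pandas as pd            # DataFrames for the projection and ADP tables
import requests                # HTTP requests for the ADP page
from bs4 import BeautifulSoup  # HTML parsing for the scraped page

# Optional: show every column when displaying wide projection tables.
pd.set_option("display.max_columns", None)
```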

Here, we import some projection data I've personally hosted on GitHub. I update this CSV file semi-regularly, but there is no guarantee that I will have it updated in the time between some major development and your draft. If you want to update the projections yourself, head to the repo where this data is hosted and read the instructions in the README.md.

Here, we load in the data from the repo above and do some basic cleanup. Note that if you manually update the data, you will load the CSV file from your local machine instead of passing the remote URL to the read_csv function.

I've added comments to each line, briefly explaining what each line is doing.
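As a rough sketch of that loading step: the function below accepts anything `pd.read_csv` accepts (a remote URL, a local path, or a file-like object). The cleanup shown here, dropping a leftover index column and trimming whitespace from player names, is an assumption about what "basic cleanup" involves, and the inline sample stands in for the hosted CSV so the cell runs offline.

```python
import io

import pandas as pd

# Two rows in the same layout as the projections table, used in place of the remote file.
# The leading unnamed index column and the padded player name are illustrative assumptions.
SAMPLE_CSV = """,Player,Team,POS,RUSH_ATT,RUSH_YD,RUSH_TD,REC,REC_YD,REC_TD,FL,PASS_ATT,CMP,PASS_YD,PASS_TD,INTS
0,Christian McCaffrey,CAR,RB,297.1,1301.6,11.9,96.8,785.0,4.3,2.5,0.0,0.0,0.0,0.0,0.0
1, Dalvin Cook ,MIN,RB,317.5,1491.7,13.0,54.6,472.3,2.0,2.9,0.0,0.0,0.0,0.0,0.0
"""


def load_projections(source):
    """Read a projections CSV from a URL, local path, or file-like object."""
    df = pd.read_csv(source)
    # Drop index columns that were written out when the CSV was saved.
    df = df.loc[:, ~df.columns.str.startswith("Unnamed")].copy()
    # Trim stray whitespace so player names merge cleanly with ADP data later.
    df["Player"] = df["Player"].str.strip()
    return df


df = load_projections(io.StringIO(SAMPLE_CSV))
print(df.head())
```

To use your own file, pass its path instead, for example `load_projections("projections.csv")` (a hypothetical filename).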

Player Team POS RUSH_ATT RUSH_YD RUSH_TD REC REC_YD REC_TD FL PASS_ATT CMP PASS_YD PASS_TD INTS
1 Christian McCaffrey CAR RB 297.1 1301.6 11.9 96.8 785.0 4.3 2.5 0.0 0.0 0.0 0.0 0.0
2 Dalvin Cook MIN RB 317.5 1491.7 13.0 54.6 472.3 2.0 2.9 0.0 0.0 0.0 0.0 0.0
3 Derrick Henry TEN RB 341.0 1669.9 14.2 24.1 179.8 1.1 2.7 0.0 0.0 0.0 0.0 0.0
4 Alvin Kamara NO RB 206.9 941.4 9.7 80.2 685.9 3.5 1.9 0.0 0.0 0.0 0.0 0.0
5 Nick Chubb CLE RB 273.5 1371.7 11.1 28.6 225.2 1.0 1.9 0.0 0.0 0.0 0.0 0.0

These projections are simply stat projections. We need to create a column for projected fantasy points. I've added the code you'll need to create a column for each of the three most common scoring formats:
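Here is one way the scoring columns could be computed. The rushing, receiving, and fumble weights reproduce the STANDARD, HALF_PPR, and PPR numbers in the table that follows; the passing weights (0.04 per yard, 4 per TD, -2 per interception) are an assumption based on common league settings, so adjust them to match your league.

```python
import pandas as pd

# Points per unit of each stat in standard scoring.
STANDARD_WEIGHTS = {
    "RUSH_YD": 0.1,
    "RUSH_TD": 6,
    "REC_YD": 0.1,
    "REC_TD": 6,
    "FL": -2,
    "PASS_YD": 0.04,  # assumed
    "PASS_TD": 4,     # assumed
    "INTS": -2,       # assumed
}


def add_scoring_columns(df):
    """Return a copy of df with STANDARD, HALF_PPR, and PPR columns added."""
    out = df.copy()
    out["STANDARD"] = sum(out[col] * weight for col, weight in STANDARD_WEIGHTS.items())
    out["HALF_PPR"] = out["STANDARD"] + 0.5 * out["REC"]  # half a point per reception
    out["PPR"] = out["STANDARD"] + out["REC"]             # a full point per reception
    return out


# Two rows copied from the projections table above.
sample = pd.DataFrame({
    "Player": ["Christian McCaffrey", "Dalvin Cook"],
    "RUSH_YD": [1301.6, 1491.7], "RUSH_TD": [11.9, 13.0],
    "REC": [96.8, 54.6], "REC_YD": [785.0, 472.3], "REC_TD": [4.3, 2.0],
    "FL": [2.5, 2.9],
    "PASS_YD": [0.0, 0.0], "PASS_TD": [0.0, 0.0], "INTS": [0.0, 0.0],
})

scored = add_scoring_columns(sample)
print(scored[["Player", "STANDARD", "HALF_PPR", "PPR"]].round(2))
```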

Player Team POS RUSH_ATT RUSH_YD RUSH_TD REC REC_YD REC_TD FL PASS_ATT CMP PASS_YD PASS_TD INTS STANDARD HALF_PPR PPR
1 Christian McCaffrey CAR RB 297.1 1301.6 11.9 96.8 785.0 4.3 2.5 0.0 0.0 0.0 0.0 0.0 300.86 349.26 397.66
2 Dalvin Cook MIN RB 317.5 1491.7 13.0 54.6 472.3 2.0 2.9 0.0 0.0 0.0 0.0 0.0 280.60 307.90 335.20
3 Derrick Henry TEN RB 341.0 1669.9 14.2 24.1 179.8 1.1 2.7 0.0 0.0 0.0 0.0 0.0 271.37 283.42 295.47
4 Alvin Kamara NO RB 206.9 941.4 9.7 80.2 685.9 3.5 1.9 0.0 0.0 0.0 0.0 0.0 238.13 278.23 318.33
5 Nick Chubb CLE RB 273.5 1371.7 11.1 28.6 225.2 1.0 1.9 0.0 0.0 0.0 0.0 0.0 228.49 242.79 257.09

Now that we have our projections loaded, cleaned, and prepared, we can move on to grabbing ADP data. ADP data, unlike the projection data from FantasyPros, can be scraped.

Below is how we scrape the data for PPR scoring. If you'd like to change this to standard or half-PPR, head to the link below that's assigned to ADP_URL and change the scoring format. The URL should change as well. This will be the URL you will use for ADP_URL, instead of the one below.

Again, I've added comments explaining what each line is doing, but if you've been following my content for a while, you'll recognize that the web scraping process I use is pretty much the same for every web page. Make an HTTP request via the requests library, pass the HTML content into BeautifulSoup, find the table we're looking for (via some unique HTML attribute, like an id), and pass the string representation of that table into pandas. Then we format the data, because it will usually be a bit messy. You can use other libraries, such as urllib or another scraping library, but this is what I've found to be the most straightforward method of web scraping in Python.
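The general pattern can be sketched as follows. This is not the exact cell from the notebook: the table id `adp-table`, the column layout, and the sample HTML are all hypothetical stand-ins so the example runs without hitting a live page, and the rows are collected by hand rather than handing the table string to pandas, which avoids needing an extra HTML parser installed.

```python
import pandas as pd
from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its HTML as text."""
    import requests  # imported here so the parsing code below also works offline

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text


def parse_adp_table(html, table_id):
    """Find the table with the given id and return it as a DataFrame sorted by ADP."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        raise ValueError(f"no table with id {table_id!r} found")
    # Collect the text of every header and data cell, row by row.
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find_all("tr")
    ]
    header, *body = rows
    df = pd.DataFrame(body, columns=header)
    df["AVG"] = pd.to_numeric(df["AVG"])
    # Keep the frame in draft order; the replacement-player step depends on it.
    df = df.sort_values("AVG", kind="stable").reset_index(drop=True)
    df["ADP_RANK"] = df["AVG"].rank(method="first")
    return df


# Hypothetical page fragment, deliberately out of order.
SAMPLE_HTML = """
<table id="adp-table">
  <tr><th>Player</th><th>Team</th><th>POS</th><th>AVG</th></tr>
  <tr><td>Joe Burrow</td><td>CIN</td><td>QB</td><td>104.3</td></tr>
  <tr><td>Jalen Hurts</td><td>PHI</td><td>QB</td><td>99.8</td></tr>
  <tr><td>David Johnson</td><td>HOU</td><td>RB</td><td>100.5</td></tr>
</table>
"""

adp_sample = parse_adp_table(SAMPLE_HTML, "adp-table")
print(adp_sample)
```

Against a live page you would call `parse_adp_table(fetch_html(ADP_URL), ...)` with whatever id the real table uses. If lxml is installed, `pd.read_html(io.StringIO(str(table)))` is the one-line alternative to building the rows by hand.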

Player Team POS AVG ADP_RANK
85 Logan Thomas WAS TE 91.0 86.0
86 D.J. Chark JAC WR 91.3 87.0
87 Brandin Cooks HOU WR 93.8 88.0
88 Damien Harris NE RB 94.7 89.0
89 Trey Sermon SF RB 95.5 90.0
90 Tyler Boyd CIN WR 96.0 91.0
91 Deebo Samuel SF WR 96.3 92.0
92 Robert Tonyan GB TE 96.3 93.0
93 Zack Moss BUF RB 96.8 94.0
94 Antonio Brown TB WR 98.7 95.0
95 Ronald Jones TB RB 99.5 96.0
96 Jalen Hurts PHI QB 99.8 97.0
97 David Johnson HOU RB 100.5 98.0
98 Will Fuller MIA WR 102.8 99.0
99 Joe Burrow CIN QB 104.3 100.0

Below is the "heaviest" bit of code we'll be writing for this project. Essentially, we're implementing the process of finding a typical replacement player for each position I detailed above.

We cut off our ADP df at 100 and iterate through it, continuously updating the replacement_players dictionary we instantiate in the first line of the cell. Naturally, this dictionary is going to end up containing the replacement players at each position. This is because the loop runs until it reaches the end of the cut-off list, and if, for example, the next row in adp_df is an RB and RB is already in the dictionary, Python will simply replace the value for RB with the new RB. As long as our adp_df is in draft order, and this is important, we will naturally end up with the last player drafted at each position by pick #100. These players are our replacement players.
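A minimal sketch of that loop is below. The eight rows are the tail of the ADP table shown above (ranks 93 through 100); the function name `find_replacement_players` is illustrative, and the real notebook may slice the DataFrame differently.

```python
import pandas as pd

# Last eight picks before the cutoff, copied from the ADP table above.
adp_df = pd.DataFrame({
    "Player": ["Robert Tonyan", "Zack Moss", "Antonio Brown", "Ronald Jones",
               "Jalen Hurts", "David Johnson", "Will Fuller", "Joe Burrow"],
    "POS": ["TE", "RB", "WR", "RB", "QB", "RB", "WR", "QB"],
    "ADP_RANK": [93.0, 94.0, 95.0, 96.0, 97.0, 98.0, 99.0, 100.0],
})


def find_replacement_players(adp_df, cutoff=100):
    """Return {position: last player drafted at that position by the cutoff pick}."""
    replacement_players = {}
    # Sort defensively: the overwrite trick only works if rows are in draft order.
    drafted = adp_df.sort_values("ADP_RANK")
    drafted = drafted[drafted["ADP_RANK"] <= cutoff]
    for _, row in drafted.iterrows():
        # A later pick at the same position overwrites the earlier one.
        replacement_players[row["POS"]] = row["Player"]
    return replacement_players


replacement_players = find_replacement_players(adp_df)
print(replacement_players)
```

Zack Moss and Ronald Jones are each stored under "RB" for a moment, but David Johnson at pick 98 is the last RB before the cutoff, so he is the one left in the dictionary.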

We then convert our dictionary into a DataFrame (you can think of a DataFrame as a dictionary of columns) and merge that data with our projection data to find the replacement VALUES for each position. Essentially, we find each of these replacement players' projected values. We then filter out unnecessary columns, rename the PPR column, and end up with a table of replacement values for each position.
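The conversion and merge could look roughly like this. The four projected PPR values are inferred by pairing the last player at each position in the ADP table with the replacement values shown below, so treat them as illustrative rather than as rows quoted from the projection file.

```python
import pandas as pd

# Last player drafted at each position by pick 100, per the ADP table above.
replacement_players = {
    "RB": "David Johnson",
    "WR": "Will Fuller",
    "TE": "Robert Tonyan",
    "QB": "Joe Burrow",
}

# Projected PPR points for those players (inferred, see the note above).
projections = pd.DataFrame({
    "Player": ["Joe Burrow", "Will Fuller", "David Johnson", "Robert Tonyan"],
    "PPR": [286.624, 181.93, 161.02, 155.32],
})


def build_replacement_values(replacement_players, projections, scoring_col="PPR"):
    """Return a table with one row per position and its replacement value."""
    replacement_df = pd.DataFrame({
        "POS": list(replacement_players.keys()),
        "Player": list(replacement_players.values()),
    })
    # Look up each replacement player's projected points.
    merged = replacement_df.merge(
        projections[["Player", scoring_col]], on="Player", how="left"
    )
    # Keep only what later steps need and give the points column a clearer name.
    return merged[["POS", scoring_col]].rename(columns={scoring_col: "REPLACEMENT_VALUE"})


replacement_values = build_replacement_values(replacement_players, projections)
print(replacement_values)
```

Passing `scoring_col="STANDARD"` or `scoring_col="HALF_PPR"` would produce replacement values for the other formats, provided the ADP data was pulled for the same format.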

POS REPLACEMENT_VALUE
0 RB 161.020
1 WR 181.930
2 TE 155.320
3 QB 286.624

We can now merge this table with our projection data and find the value over replacement for each player, and we're done! We sort our table in descending order of VOR, and this is our ranking model. Now, I talked about how part of the value of a VOR model is that we can evaluate which players are being drafted at, below, or above value using ADP. We didn't do that here; I'm leaving it for the next part of the intermediate series. I hope to have that part up before most of your drafts start.
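As a sketch of this final step, the function below merges replacement values onto projections by position, subtracts, and sorts. The four players and their PPR projections are taken from the ranking table that follows, and the replacement values from the table above; the function name `rank_by_vor` is illustrative.

```python
import pandas as pd

# A few projected PPR totals, deliberately not in VOR order.
projections = pd.DataFrame({
    "Player": ["Josh Allen", "Travis Kelce", "Christian McCaffrey", "Davante Adams"],
    "POS": ["QB", "TE", "RB", "WR"],
    "Team": ["BUF", "KC", "CAR", "GB"],
    "PPR": [378.194, 311.92, 397.66, 343.95],
})

replacement_values = pd.DataFrame({
    "POS": ["RB", "WR", "TE", "QB"],
    "REPLACEMENT_VALUE": [161.02, 181.93, 155.32, 286.624],
})


def rank_by_vor(projections, replacement_values, scoring_col="PPR"):
    """Attach each position's replacement value, compute VOR, and sort descending."""
    merged = projections.merge(replacement_values, on="POS", how="left")
    merged["VOR"] = merged[scoring_col] - merged["REPLACEMENT_VALUE"]
    ranked = merged.sort_values("VOR", ascending=False).reset_index(drop=True)
    return ranked[["Player", "POS", "Team", scoring_col, "VOR"]]


rankings = rank_by_vor(projections, replacement_values)
print(rankings.round(3))
```

Josh Allen has the second-highest raw projection of the four but lands last here, which is exactly the positional adjustment VOR is meant to capture.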

Player POS Team PPR VOR
0 Christian McCaffrey RB CAR 397.660 236.640
1 Dalvin Cook RB MIN 335.200 174.180
2 Davante Adams WR GB 343.950 162.020
3 Alvin Kamara RB NO 318.330 157.310
4 Travis Kelce TE KC 311.920 156.600
5 Tyreek Hill WR KC 329.250 147.320
6 Derrick Henry RB TEN 295.470 134.450
7 Stefon Diggs WR BUF 316.210 134.280
8 DeAndre Hopkins WR ARI 312.570 130.640
9 Austin Ekeler RB LAC 285.630 124.610
10 Saquon Barkley RB NYG 281.990 120.970
11 Ezekiel Elliott RB DAL 278.620 117.600
12 Aaron Jones RB GB 276.660 115.640
13 Calvin Ridley WR ATL 296.430 114.500
14 Jonathan Taylor RB IND 268.790 107.770
15 Joe Mixon RB CIN 265.350 104.330
16 Najee Harris RB PIT 261.400 100.380
17 Darren Waller TE LV 255.570 100.250
18 George Kittle TE SF 255.120 99.800
19 Keenan Allen WR LAC 279.860 97.930
20 Justin Jefferson WR MIN 278.610 96.680
21 Nick Chubb RB CLE 257.090 96.070
22 D.K. Metcalf WR SEA 273.990 92.060
23 Josh Allen QB BUF 378.194 91.570
24 Patrick Mahomes QB KC 376.032 89.408
25 Antonio Gibson RB WAS 249.570 88.550
26 A.J. Brown WR TEN 269.080 87.150
27 D'Andre Swift RB DET 245.290 84.270
28 Clyde Edwards-Helaire RB KC 242.830 81.810
29 Allen Robinson WR CHI 261.990 80.060