Learn how to use Python to build a 2021 draft model.
If you like Fantasy Football and have an interest in learning how to code, check out our Ultimate Guide on Learning Python with Fantasy Football Online Course. Here is a link to purchase for 15% off. The course includes 15 chapters of material, 14 hours of video, hundreds of data sets, lifetime updates, and a Slack channel invite to join the Fantasy Football with Python community.
In this part of the intermediate series, we are going to be doing our yearly draft prep with our VOR model. If you don't know what value over replacement is, it's basically a model that shows which players are most valuable and can be used as a rankings list. I've written about VOR at length in other parts of this blog, but I'll cover it briefly here.
Different positions have different mean fantasy outputs. An RB that scores 350 points is much more valuable than a QB that scores 350 points, since the mean fantasy point output for QBs as a whole is much higher. In a value over replacement model, value is defined as how much output a player delivers over a typical replacement player at their own position. On this site, we use the "last player drafted at a certain draft spot" method of selecting typical replacement players at each position. Essentially, we look at ADP (average draft position) and select a cutoff point (we use 100). At the 100th pick, we ask, "Who were the last RB, QB, WR, and TE chosen in the draft?" and these are our replacement players. Their projected fantasy output is the replacement value we use for each position. A player's value over replacement is their projected fantasy output minus their position's replacement value. We then sort the list of VORs in descending order, and that gives us our ranking model.
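To make the arithmetic concrete, here is a minimal sketch of the subtraction described above. The replacement values are the PPR ones from the replacement value table later in this post; the two 350-point players are hypothetical and exist only to mirror the RB-versus-QB comparison.

```python
# Replacement values by position (PPR), copied from the replacement value table in this post
replacement_values = {"RB": 161.02, "WR": 181.93, "TE": 155.32, "QB": 286.624}

# Two hypothetical players with identical projections, for illustration only
projections = [
    {"Player": "Hypothetical QB", "POS": "QB", "PPR": 350.0},
    {"Player": "Hypothetical RB", "POS": "RB", "PPR": 350.0},
]


def value_over_replacement(player, replacement_values):
    """Projected points minus the replacement value at the player's position."""
    return player["PPR"] - replacement_values[player["POS"]]


# Rank by VOR, highest first
ranked = sorted(
    projections,
    key=lambda p: value_over_replacement(p, replacement_values),
    reverse=True,
)

for p in ranked:
    print(p["Player"], round(value_over_replacement(p, replacement_values), 3))
```

Both players are projected for the same 350 points, but the RB lands roughly 189 points above replacement while the QB lands roughly 63 above, so the RB ranks first.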
The basic idea of VOR is that we want to avoid drafting players that are being taken above value, and snatch up players that are currently being drafted below value. We are net neutral on players that are being drafted at value. We do this by comparing our ranking model against ADP to find underdrafted and overdrafted players.
In this notebook, we will be using FantasyPros projections to help us establish the value we assign to each player. The projections are as important as the actual value calculation here. If your projections are crap, then your VOR model is also going to be crap, no way around that. I choose FantasyPros every year because their projections are an aggregation of multiple projection sources, and historically, crowdsourced projections have tended to be the most accurate.
However, you can change these projections to whatever source you'd like. You can even use your own projections. The process of generating a VOR model from a list of projections is all the same.
Open up a new Jupyter notebook and type in the first cell below. As always, here, we're simply importing libraries.
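The import cell is not shown here, so the following is a reconstruction based on the libraries named later in this post (pandas, requests, and BeautifulSoup). Treat the exact set of imports and the display option as assumptions.

```python
import pandas as pd            # DataFrames for the projection and ADP data
import requests                # HTTP requests for the ADP page
from bs4 import BeautifulSoup  # HTML parsing for the scraped page

# Optional (an assumption, not from the original cell): show every column
# when a wide projections table is displayed in the notebook
pd.set_option("display.max_columns", None)
```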
Here, we import some projection data I've personally hosted on GitHub. I update this CSV file semi-regularly, but there is no guarantee that I will have it updated in the time between some major development and your draft. If you want to update the projections yourself, head to the repo where this data is hosted, and read the instructions in the README.md.
Here, we load in the data from the repo above and do some basic cleanup. Note that if you manually update the data, you will load the CSV file from your local machine instead of passing the remote URL to the read_csv function.
I've added comments to each line, briefly explaining what each line is doing.
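The loading cell and its URL are not reproduced here, so this is only a sketch of what a load-and-clean step could look like. The `load_projections` helper, the specific cleanup steps, and the leading unnamed index column in the sample are assumptions; the sample rows stand in for the hosted CSV so the block runs offline.

```python
from io import StringIO

import pandas as pd

# Stand-in for the hosted CSV: two rows copied from the projections table in this post.
# The leading unnamed index column is an assumption about how the file was saved.
SAMPLE_CSV = StringIO(
    ",Player,Team,POS,RUSH_ATT,RUSH_YD,RUSH_TD,REC,REC_YD,REC_TD,FL,PASS_ATT,CMP,PASS_YD,PASS_TD,INTS\n"
    "1,Christian McCaffrey,CAR,RB,297.1,1301.6,11.9,96.8,785.0,4.3,2.5,0.0,0.0,0.0,0.0,0.0\n"
    "2,Dalvin Cook,MIN,RB,317.5,1491.7,13.0,54.6,472.3,2.0,2.9,0.0,0.0,0.0,0.0,0.0\n"
)


def load_projections(source):
    """Load projections from a URL, a local path, or a file-like object."""
    df = pd.read_csv(source)
    # Drop any stray index column left behind by an earlier DataFrame.to_csv call
    df = df.loc[:, ~df.columns.str.startswith("Unnamed")].copy()
    # Trim whitespace so player names match cleanly when merging later
    df["Player"] = df["Player"].str.strip()
    # Treat missing stat projections as zero
    stat_cols = df.columns.difference(["Player", "Team", "POS"])
    df[stat_cols] = df[stat_cols].fillna(0)
    return df


df = load_projections(SAMPLE_CSV)
print(df.head())
```

To use your own data, pass the path to your local CSV (or the remote URL) to `load_projections` in place of `SAMPLE_CSV`.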
| | Player | Team | POS | RUSH_ATT | RUSH_YD | RUSH_TD | REC | REC_YD | REC_TD | FL | PASS_ATT | CMP | PASS_YD | PASS_TD | INTS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Christian McCaffrey | CAR | RB | 297.1 | 1301.6 | 11.9 | 96.8 | 785.0 | 4.3 | 2.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
2 | Dalvin Cook | MIN | RB | 317.5 | 1491.7 | 13.0 | 54.6 | 472.3 | 2.0 | 2.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
3 | Derrick Henry | TEN | RB | 341.0 | 1669.9 | 14.2 | 24.1 | 179.8 | 1.1 | 2.7 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
4 | Alvin Kamara | NO | RB | 206.9 | 941.4 | 9.7 | 80.2 | 685.9 | 3.5 | 1.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
5 | Nick Chubb | CLE | RB | 273.5 | 1371.7 | 11.1 | 28.6 | 225.2 | 1.0 | 1.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
These projections are simply stat projections. We need to create a column for projected fantasy points. I've added the code you'll need to create a column for each of the three most common scoring formats:
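The scoring cell is not shown, so here is a sketch of how the three columns could be computed. The rushing, receiving, and fumble weights reproduce the STANDARD, HALF_PPR, and PPR values in the table for the two sample rows; the passing weights (0.04 per yard, 4 per touchdown, -2 per interception) are common defaults and are an assumption here, since the sample rows have no passing stats to check them against.

```python
import pandas as pd

# Two rows of the stat projections from this post (only the columns that score points)
df = pd.DataFrame({
    "Player": ["Christian McCaffrey", "Dalvin Cook"],
    "POS": ["RB", "RB"],
    "RUSH_YD": [1301.6, 1491.7],
    "RUSH_TD": [11.9, 13.0],
    "REC": [96.8, 54.6],
    "REC_YD": [785.0, 472.3],
    "REC_TD": [4.3, 2.0],
    "FL": [2.5, 2.9],
    "PASS_YD": [0.0, 0.0],
    "PASS_TD": [0.0, 0.0],
    "INTS": [0.0, 0.0],
})

# Points per unit of each stat. Receptions are handled separately below.
SCORING = {
    "RUSH_YD": 0.1, "RUSH_TD": 6,
    "REC_YD": 0.1, "REC_TD": 6,
    "FL": -2,
    # Assumed passing weights; change these to match your league
    "PASS_YD": 0.04, "PASS_TD": 4, "INTS": -2,
}

# Standard scoring: weighted sum of the stat columns, no points per reception
df["STANDARD"] = sum(df[col] * weight for col, weight in SCORING.items())
# Half-PPR and PPR add 0.5 and 1 point per reception respectively
df["HALF_PPR"] = df["STANDARD"] + 0.5 * df["REC"]
df["PPR"] = df["STANDARD"] + 1.0 * df["REC"]

print(df[["Player", "STANDARD", "HALF_PPR", "PPR"]].round(2))
```

Keeping the weights in one dictionary means a league with different settings only needs to edit `SCORING` and the reception multipliers.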
| | Player | Team | POS | RUSH_ATT | RUSH_YD | RUSH_TD | REC | REC_YD | REC_TD | FL | PASS_ATT | CMP | PASS_YD | PASS_TD | INTS | STANDARD | HALF_PPR | PPR |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Christian McCaffrey | CAR | RB | 297.1 | 1301.6 | 11.9 | 96.8 | 785.0 | 4.3 | 2.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 300.86 | 349.26 | 397.66 |
2 | Dalvin Cook | MIN | RB | 317.5 | 1491.7 | 13.0 | 54.6 | 472.3 | 2.0 | 2.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 280.60 | 307.90 | 335.20 |
3 | Derrick Henry | TEN | RB | 341.0 | 1669.9 | 14.2 | 24.1 | 179.8 | 1.1 | 2.7 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 271.37 | 283.42 | 295.47 |
4 | Alvin Kamara | NO | RB | 206.9 | 941.4 | 9.7 | 80.2 | 685.9 | 3.5 | 1.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 238.13 | 278.23 | 318.33 |
5 | Nick Chubb | CLE | RB | 273.5 | 1371.7 | 11.1 | 28.6 | 225.2 | 1.0 | 1.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 228.49 | 242.79 | 257.09 |
Now that we have our projections loaded, cleaned, and prepared, we can move on to grabbing ADP data. ADP data, unlike projection data from FantasyPros, can be scraped.
Below is how we scrape the data for PPR scoring. If you'd like to change this to standard or half-PPR, head to the link below that's assigned to ADP_URL and change the scoring format. The URL should change as well. This will be the URL you will use for ADP_URL, instead of the one below.
Again, I've added comments explaining what each line is doing, but if you've been following my content for a while, you'll recognize that the web scraping process I use is pretty much the same for every web page. Make an HTTP request via the requests library, pass that HTML content into BeautifulSoup, find the table on the page we are looking for (via some unique HTML attribute, like an id), and pass the string representation of that table into pandas. Then we format the data, because usually it will be a bit messy. You can use different libraries such as urllib or another scraping library, but this is what I've found to be the most straightforward method of web scraping in Python.
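The scraping cell is not reproduced here, so the sketch below only illustrates the request, parse, find-table, and format steps just described. The function names, the `table_id` parameter, and the raw column layout (a `POS` column carrying a positional rank suffix) are assumptions; pass in whatever URL and table id the real page uses. The positional suffixes in the demo rows are placeholders, not real ranks.

```python
from io import StringIO

import pandas as pd


def fetch_adp_table(url, table_id):
    """Request a page, locate one table by its id, and hand it to pandas."""
    # Imported here so format_adp below can be used without the scraping dependencies
    import requests
    from bs4 import BeautifulSoup

    res = requests.get(url, timeout=30)
    res.raise_for_status()
    soup = BeautifulSoup(res.text, "html.parser")
    table = soup.find("table", {"id": table_id})
    if table is None:
        raise ValueError(f"no table with id {table_id!r} found at {url}")
    # pd.read_html needs an HTML parser such as lxml installed
    return pd.read_html(StringIO(str(table)))[0]


def format_adp(raw):
    """Tidy a raw ADP table into Player, Team, POS, AVG, and ADP_RANK columns."""
    adp = raw[["Player", "Team", "POS", "AVG"]].copy()
    # Strip a trailing positional rank, e.g. "RB12" -> "RB"
    adp["POS"] = adp["POS"].str.replace(r"\d+$", "", regex=True)
    adp["AVG"] = adp["AVG"].astype(float)
    # Order by average pick; a stable sort keeps tied players in their original order
    adp = adp.sort_values("AVG", kind="stable").reset_index(drop=True)
    # method="first" breaks ties by order of appearance
    adp["ADP_RANK"] = adp["AVG"].rank(method="first")
    return adp


# Offline demo: four players from the ADP table in this post, deliberately out of order.
# The digits after each position are placeholders.
raw = pd.DataFrame({
    "Player": ["Jalen Hurts", "Deebo Samuel", "Robert Tonyan", "David Johnson"],
    "Team": ["PHI", "SF", "GB", "HOU"],
    "POS": ["QB1", "WR1", "TE1", "RB1"],
    "AVG": [99.8, 96.3, 96.3, 100.5],
})
adp_df = format_adp(raw)
print(adp_df)
```

Splitting the fetch from the formatting keeps the cleanup step easy to rerun on a saved copy of the table if the page layout changes.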
| | Player | Team | POS | AVG | ADP_RANK |
---|---|---|---|---|---|
85 | Logan Thomas | WAS | TE | 91.0 | 86.0 |
86 | D.J. Chark | JAC | WR | 91.3 | 87.0 |
87 | Brandin Cooks | HOU | WR | 93.8 | 88.0 |
88 | Damien Harris | NE | RB | 94.7 | 89.0 |
89 | Trey Sermon | SF | RB | 95.5 | 90.0 |
90 | Tyler Boyd | CIN | WR | 96.0 | 91.0 |
91 | Deebo Samuel | SF | WR | 96.3 | 92.0 |
92 | Robert Tonyan | GB | TE | 96.3 | 93.0 |
93 | Zack Moss | BUF | RB | 96.8 | 94.0 |
94 | Antonio Brown | TB | WR | 98.7 | 95.0 |
95 | Ronald Jones | TB | RB | 99.5 | 96.0 |
96 | Jalen Hurts | PHI | QB | 99.8 | 97.0 |
97 | David Johnson | HOU | RB | 100.5 | 98.0 |
98 | Will Fuller | MIA | WR | 102.8 | 99.0 |
99 | Joe Burrow | CIN | QB | 104.3 | 100.0 |
Below is the "heaviest" bit of code we'll be writing for this project. Essentially, we're implementing the process of finding a typical replacement player for each position I detailed above.
We cut off our ADP df at 100 and iterate through it, continuously updating the replacement_players dictionary we instantiate in the first line of the cell. Naturally, this dictionary is going to end up containing the replacement players at each position. This is because the loop will run until it, well, reaches the end of the loop, and if, for example, the next row in adp_df is an RB and RB is already included in the dictionary, Python will simply replace the value for RB with the new RB. As long as our adp_df is in order (and this is important), we will naturally end up with the last player drafted at each position by pick #100. These players are our replacement players.
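Here is a minimal sketch of that loop. To keep it self-contained, the frame below holds only the last 15 of the first 100 ADP rows from the table above (players and positions only), which is enough to determine the final value for each position; in the notebook, `adp_df` would be the full scraped table.

```python
import pandas as pd

# Picks 86 through 100 from the ADP table, already in draft order
adp_df = pd.DataFrame(
    [
        ("Logan Thomas", "TE"), ("D.J. Chark", "WR"), ("Brandin Cooks", "WR"),
        ("Damien Harris", "RB"), ("Trey Sermon", "RB"), ("Tyler Boyd", "WR"),
        ("Deebo Samuel", "WR"), ("Robert Tonyan", "TE"), ("Zack Moss", "RB"),
        ("Antonio Brown", "WR"), ("Ronald Jones", "RB"), ("Jalen Hurts", "QB"),
        ("David Johnson", "RB"), ("Will Fuller", "WR"), ("Joe Burrow", "QB"),
    ],
    columns=["Player", "POS"],
)

# One slot per position; each later pick at a position overwrites the earlier one
replacement_players = {"RB": None, "QB": None, "WR": None, "TE": None}

# Only the first 100 picks count toward the cutoff
for _, row in adp_df[:100].iterrows():
    if row["POS"] in replacement_players:
        replacement_players[row["POS"]] = row["Player"]

print(replacement_players)
```

Because each assignment overwrites the previous one, the dictionary ends the loop holding David Johnson, Joe Burrow, Will Fuller, and Robert Tonyan, the last player at each position inside the cutoff.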
We then convert our dictionary into a DataFrame (you can think of a DataFrame as an abstraction over a dictionary of columns), and merge that data with our projection data to find the replacement VALUES for each position. Essentially, we look up each of these replacement players' projected points. We then filter out the unnecessary columns, rename the PPR column, and end up with a table of replacement values for each position.
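A sketch of that conversion and merge follows. The replacement players are the last RB, QB, WR, and TE inside pick 100 in the ADP table above. Their PPR projections are inferred by pairing those players with the replacement value table in this post, so treat the pairing as an assumption; the McCaffrey row is included only to show that the merge discards non-replacement players.

```python
import pandas as pd

# Last player drafted at each position inside the top 100 (from the ADP table)
replacement_players = {
    "RB": "David Johnson",
    "QB": "Joe Burrow",
    "WR": "Will Fuller",
    "TE": "Robert Tonyan",
}

# Small stand-in for the projections DataFrame.
# PPR values for the four replacement players are inferred, not read from the projections file.
df = pd.DataFrame({
    "Player": ["Christian McCaffrey", "David Johnson", "Joe Burrow", "Will Fuller", "Robert Tonyan"],
    "POS": ["RB", "RB", "QB", "WR", "TE"],
    "PPR": [397.66, 161.02, 286.624, 181.93, 155.32],
})

# Dictionary -> DataFrame with one row per position
replacement_df = pd.DataFrame({
    "POS": list(replacement_players.keys()),
    "Player": list(replacement_players.values()),
})

# Look up each replacement player's projection, then keep only position and value
replacement_values = (
    replacement_df
    .merge(df[["Player", "POS", "PPR"]], on=["Player", "POS"], how="left")
    [["POS", "PPR"]]
    .rename(columns={"PPR": "REPLACEMENT_VALUE"})
)

print(replacement_values)
```

Merging on both `Player` and `POS` guards against two players who share a name but play different positions.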
| | POS | REPLACEMENT_VALUE |
---|---|---|
0 | RB | 161.020 |
1 | WR | 181.930 |
2 | TE | 155.320 |
3 | QB | 286.624 |
We can now simply merge this table with our projection data, find the value over replacement for each player, and we're done! We sort our table in descending order, and this is our ranking model. Now, I talked about how part of the value of a VOR model is that we can evaluate which players are being drafted at, below, or above value using ADP. We didn't do that here; I'm leaving it for the next part of the intermediate series. I hope to have that part of the series up before the majority of your drafts start.
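As a sketch of this final step, the block below merges the replacement values onto a handful of projected players, subtracts, and sorts. The five players and their PPR projections are taken from the ranking table in this post and are listed out of order on purpose; in the notebook, `df` would be the full projections DataFrame.

```python
import pandas as pd

# A few projected players, deliberately unsorted
df = pd.DataFrame({
    "Player": ["Travis Kelce", "Josh Allen", "Christian McCaffrey", "Davante Adams", "Dalvin Cook"],
    "POS": ["TE", "QB", "RB", "WR", "RB"],
    "Team": ["KC", "BUF", "CAR", "GB", "MIN"],
    "PPR": [311.92, 378.194, 397.66, 343.95, 335.20],
})

# Replacement values from the table earlier in this post
replacement_values = pd.DataFrame({
    "POS": ["RB", "WR", "TE", "QB"],
    "REPLACEMENT_VALUE": [161.02, 181.93, 155.32, 286.624],
})

# Attach each player's positional replacement value, then subtract
vor_df = df.merge(replacement_values, on="POS", how="left")
vor_df["VOR"] = vor_df["PPR"] - vor_df["REPLACEMENT_VALUE"]

# Highest VOR first; this ordering is the ranking model
vor_df = (
    vor_df
    .sort_values("VOR", ascending=False)
    .reset_index(drop=True)
    [["Player", "POS", "Team", "PPR", "VOR"]]
)

print(vor_df.round(3))
```

Note that Josh Allen has the second-highest raw projection of the five but the lowest VOR, which is exactly the positional adjustment the model is meant to capture.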
| | Player | POS | Team | PPR | VOR |
---|---|---|---|---|---|
0 | Christian McCaffrey | RB | CAR | 397.660 | 236.640 |
1 | Dalvin Cook | RB | MIN | 335.200 | 174.180 |
2 | Davante Adams | WR | GB | 343.950 | 162.020 |
3 | Alvin Kamara | RB | NO | 318.330 | 157.310 |
4 | Travis Kelce | TE | KC | 311.920 | 156.600 |
5 | Tyreek Hill | WR | KC | 329.250 | 147.320 |
6 | Derrick Henry | RB | TEN | 295.470 | 134.450 |
7 | Stefon Diggs | WR | BUF | 316.210 | 134.280 |
8 | DeAndre Hopkins | WR | ARI | 312.570 | 130.640 |
9 | Austin Ekeler | RB | LAC | 285.630 | 124.610 |
10 | Saquon Barkley | RB | NYG | 281.990 | 120.970 |
11 | Ezekiel Elliott | RB | DAL | 278.620 | 117.600 |
12 | Aaron Jones | RB | GB | 276.660 | 115.640 |
13 | Calvin Ridley | WR | ATL | 296.430 | 114.500 |
14 | Jonathan Taylor | RB | IND | 268.790 | 107.770 |
15 | Joe Mixon | RB | CIN | 265.350 | 104.330 |
16 | Najee Harris | RB | PIT | 261.400 | 100.380 |
17 | Darren Waller | TE | LV | 255.570 | 100.250 |
18 | George Kittle | TE | SF | 255.120 | 99.800 |
19 | Keenan Allen | WR | LAC | 279.860 | 97.930 |
20 | Justin Jefferson | WR | MIN | 278.610 | 96.680 |
21 | Nick Chubb | RB | CLE | 257.090 | 96.070 |
22 | D.K. Metcalf | WR | SEA | 273.990 | 92.060 |
23 | Josh Allen | QB | BUF | 378.194 | 91.570 |
24 | Patrick Mahomes | QB | KC | 376.032 | 89.408 |
25 | Antonio Gibson | RB | WAS | 249.570 | 88.550 |
26 | A.J. Brown | WR | TEN | 269.080 | 87.150 |
27 | D'Andre Swift | RB | DET | 245.290 | 84.270 |
28 | Clyde Edwards-Helaire | RB | KC | 242.830 | 81.810 |
29 | Allen Robinson | WR | CHI | 261.990 | 80.060 |