Before we get started in R, download the SFB9 data from the Google Sheet created by Josh Hornsby. Export it as a CSV, which should be named “#SFB9 Data - sfb_data.csv” by default, and save it in your R working directory. If the name differs, make sure the filename in the second block of code matches yours.
Next, we will load the required packages (‘dplyr’ for manipulating the data, ‘ggplot2’ to handle the plotting). Make sure to install these first if they are not already installed.
### Load required packages
library(dplyr)
library(ggplot2)
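If you are not sure whether the packages are installed, a small check along these lines can help. This is only a sketch and not part of the original workflow: `missing_packages` is a hypothetical helper name, and the commented-out `install.packages` call assumes you have internet access to a CRAN mirror.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("dplyr", "ggplot2"))
to_install

# Uncomment to install anything that is missing (needs internet access).
# if (length(to_install) > 0) install.packages(to_install)
```

An empty result means both packages are already available and the `library()` calls above should work.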
Now, we will read the data from the csv into a data frame, using ‘colClasses’ to define the column types (this part is not necessary, but is useful to avoid problems that can arise due to a column of numbers being read in as characters). Then, we’ll preview the top of the data frame.
### Read in CSV
sfb9_data <- read.csv(file="#SFB9 Data - sfb_data.csv",
                      colClasses=c("character", "character", "integer", "integer", "integer",
                                   "character", "character", "character", "integer", "character",
                                   "character", "character", "integer", "integer"))
### Preview the data frame
sfb9_data %>% head
   lg_name div_name rd pk ovr     p_time p_com         fran_at fran_num p_pos p_tm              p_full n_pos
1 Shooters  Doomguy  1  1   1 1562598605  <NA>  @Matt_Gajewski 10733002    RB  DAL     Ezekiel Elliott     1
2 Shooters  Doomguy  1  2   2 1562598605   PDL   @df_patterson 10733008    RB  NYG      Saquon Barkley     2
3 Shooters  Doomguy  1  3   3 1562598927  <NA>      @othmanfcb 10733003    RB  CAR Christian McCaffrey     3
4 Shooters  Doomguy  1  4   4 1562599176  <NA>       @BKenny26 10733009    RB   NO        Alvin Kamara     4
5 Shooters  Doomguy  1  5   5 1562599467  <NA>     @PacingPete 10733011    RB  LAC       Melvin Gordon     5
6 Shooters  Doomguy  1  6   6 1562599467   PDL @DynastyInfidel 10733006    RB  ARI       David Johnson     6
  f_pos
1     1
2     1
3     1
4     1
5     1
6     1
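To see why defining ‘colClasses’ can be worth the extra typing, here is a small sketch using a made-up three-row CSV passed in as text. The stray "n/a" value is invented purely for illustration and does not come from the SFB9 sheet.

```r
# Hypothetical sample; the "n/a" entry is invented to show the failure mode.
sample_csv <- "lg_name,ovr\nShooters,1\nShooters,2\nShooters,n/a"

# Without colClasses, one bad value turns the whole column into text.
loose <- read.csv(text = sample_csv, stringsAsFactors = FALSE)
class(loose$ovr)

# With colClasses, the same input fails at read time instead of
# causing confusing results later (e.g. mean() on a character column).
strict <- tryCatch(
  read.csv(text = sample_csv, colClasses = c("character", "integer")),
  error = function(e) e
)
inherits(strict, "error")
```

The trade-off is that a strict read stops on the first unexpected value, which is usually what we want when the next step is arithmetic on that column.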
Let’s take a glance at the column names.
### List out column names
names(sfb9_data)
[1] "lg_name" "div_name" "rd" "pk" "ovr" "p_time" "p_com" "fran_at" "fran_num"
[10] "p_pos" "p_tm" "p_full" "n_pos" "f_pos"
Then, we can edit the column names to be a bit more clear.
### Rename columns
names(sfb9_data) <- c("League", "Division", "Round", "Pick", "Overall",
                      "Draft_Time", "Predraft", "Team_Name", "Team_Id", "Position",
                      "NFL_Team", "Player", "League_Pos_Rank", "Team_Pos_Rank")
### List out new column names
names(sfb9_data)
[1] "League" "Division" "Round" "Pick" "Overall"
[6] "Draft_Time" "Predraft" "Team_Name" "Team_Id" "Position"
[11] "NFL_Team" "Player" "League_Pos_Rank" "Team_Pos_Rank"
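Assigning to `names()` relies on the columns arriving in exactly the order shown. If you would rather rename by name, dplyr’s `rename()` is one alternative. The sketch below uses a toy one-row data frame with only three of the columns, so treat it as an illustration rather than a drop-in replacement.

```r
library(dplyr)

# Toy stand-in for the first three columns of the raw data.
raw <- data.frame(lg_name = "Shooters", div_name = "Doomguy", rd = 1L)

# rename() takes new_name = old_name pairs, so column order no longer matters.
renamed <- raw %>%
  rename(League = lg_name, Division = div_name, Round = rd)

names(renamed)
```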
Next, we will calculate ADP (average draft position). We create a new data frame based on ‘sfb9_data’ that filters out NA values (Timer Expired or similar non-player picks), groups by Player/Position/NFL_Team, calculates a few new columns, and sorts by True_ADP.
Count is the number of times a player was drafted. ADP is the standard calculation (the mean overall pick across the drafts in which the player was selected), while True_ADP also accounts for the drafts in which the player went undrafted by counting each of those as pick 265, one past the final pick of a 264-pick draft. ADP & True_ADP will match for anyone who was drafted in all 100 drafts, but look at where Eric Tomlinson falls if you sort by one vs. the other for a good example of why I use True_ADP.
### Calculate ADP
sfb9_adp <- sfb9_data %>%
  filter(!is.na(Player)) %>%
  group_by(Player, Position, NFL_Team) %>%
  summarize(Count=n(),
            ADP=mean(Overall),
            True_ADP=(mean(Overall)*n() + (100-n())*265)/100) %>%
  arrange(True_ADP)
### Preview the ADP data frame
sfb9_adp %>% head
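To make the difference between ADP and True_ADP concrete, here is the same arithmetic pulled out into a small function and applied to made-up picks. `true_adp` is a hypothetical helper name, and the two players are invented; only the constants (100 drafts, pick 265 for undrafted) come from the code above.

```r
# Hypothetical helper mirroring the True_ADP expression in summarize().
# n_drafts = 100 and undrafted_pick = 265 are the constants used above.
true_adp <- function(picks, n_drafts = 100, undrafted_pick = 265) {
  n <- length(picks)
  (mean(picks) * n + (n_drafts - n) * undrafted_pick) / n_drafts
}

# Made-up player taken in only 2 of 100 drafts, at picks 250 and 260.
late_flier <- c(250, 260)
mean(late_flier)      # standard ADP: 255
true_adp(late_flier)  # True_ADP: (510 + 98 * 265) / 100 = 264.8

# Made-up player taken at pick 10 in every draft: both measures agree.
true_adp(rep(10, 100))  # 10
```

Plain ADP would rank the rarely drafted player ahead of anyone with an ADP above 255, while True_ADP pushes that player close to the undrafted value.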
Now that we have ADP data, we will join that to the granular draft data.
### Join ADP data to granular draft data
sfb9_drafts_adp <- left_join(sfb9_data, sfb9_adp)
Joining, by = c("Position", "NFL_Team", "Player")
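The message above shows that dplyr picked the join columns for us because they share names in both data frames. If you prefer to state them yourself, `left_join()` accepts a `by` argument. The sketch below uses toy data frames (`picks` and `adp` are stand-in names with made-up rows) rather than the real tables.

```r
library(dplyr)

# Toy stand-ins for the pick-level data and the ADP summary.
picks <- data.frame(
  Player   = c("Patrick Mahomes", "Patrick Mahomes", "Matt Ryan"),
  Position = "QB",
  NFL_Team = c("KC", "KC", "ATL"),
  Overall  = c(10L, 12L, 46L),
  stringsAsFactors = FALSE
)
adp <- picks %>%
  group_by(Player, Position, NFL_Team) %>%
  summarize(Count = n(), ADP = mean(Overall)) %>%
  ungroup()

# Naming the key columns keeps the join from silently changing
# if another shared column is added to either table later.
picks_adp <- left_join(picks, adp, by = c("Player", "Position", "NFL_Team"))
picks_adp
```

Every row of `picks` is kept, and each one gains the Count and ADP of its player.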
Here, we are breaking out the overall draft data into separate data frames for each position.
### Create position-specific data frames
qb_adp <- sfb9_drafts_adp %>% filter(Position=="QB")
rb_adp <- sfb9_drafts_adp %>% filter(Position=="RB")
wr_adp <- sfb9_drafts_adp %>% filter(Position=="WR")
te_adp <- sfb9_drafts_adp %>% filter(Position=="TE")
### Preview the QB data frame
qb_adp %>% head
    League Division Round Pick Overall Draft_Time Predraft       Team_Name  Team_Id Position NFL_Team
1 Shooters  Doomguy     1   10      10 1562604664     <NA>        @sberg82 10733010       QB       KC
2 Shooters  Doomguy     2    6      18 1562607574     <NA>     @Eliotcrist 10733005       QB      HOU
3 Shooters  Doomguy     3    9      33 1562627564     <NA>     @JeneBramel 10733001       QB      IND
4 Shooters  Doomguy     3   11      35 1562633371     <NA>       @IDP_Dude 10733007       QB       GB
5 Shooters  Doomguy     4    7      43 1562676015     <NA> @DynastyInfidel 10733006       QB      CLE
6 Shooters  Doomguy     4   10      46 1562689928     <NA>      @othmanfcb 10733003       QB      ATL
           Player League_Pos_Rank Team_Pos_Rank Count   ADP True_ADP
1 Patrick Mahomes               1             1   100 10.75    10.75
2  Deshaun Watson               2             1   100 22.01    22.01
3     Andrew Luck               3             1   100 25.77    25.77
4   Aaron Rodgers               4             1   100 29.35    29.35
5  Baker Mayfield               5             1   100 35.29    35.29
6       Matt Ryan               6             1   100 40.61    40.61
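If four separate `filter()` calls feel repetitive, base R’s `split()` can produce all of the position-specific data frames in one step. This is an alternative sketch on a toy data frame (`drafts` is a stand-in with made-up rows), not what the rest of this walkthrough uses.

```r
# Toy stand-in for sfb9_drafts_adp.
drafts <- data.frame(
  Player   = c("Patrick Mahomes", "Saquon Barkley", "Matt Ryan"),
  Position = c("QB", "RB", "QB"),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per position.
by_position <- split(drafts, drafts$Position)

names(by_position)
by_position[["QB"]]
```

With the real data, `by_position[["QB"]]` would hold the same rows as `qb_adp`.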
Here we plot each pick’s Overall draft position against the player’s True_ADP for the qb_adp data frame, using geom_point to produce a scatter plot with our specified color & alpha (transparency).
### Define QB Plot
qb_plot <- ggplot(qb_adp, aes(x=True_ADP, y=Overall)) +
  geom_point(alpha=.5, color="#C77CFF")
### Show QB Plot
qb_plot
Now, let’s add a title & edit the axis labels (which will make more sense after our next block of code).
### Add title & edit axis labels
qb_plot <- ggplot(qb_adp, aes(x=True_ADP, y=Overall)) +
  geom_point(alpha=.5, color="#C77CFF") +
  labs(title="\nVisualizing QB Tiers in #SFB9", subtitle="") +
  xlab("\nADP (Round)\n") +
  ylab("\nDraft Round\n")
### Show Plot
qb_plot
To match our axis labels and make it easier to see where in the draft we are, we’re going to edit the axis scales to be based on draft round instead of overall pick. In each scale, we define the breaks (where the tick labels go) as a sequence of picks near the middle of each 12-pick round (6, 18, ..., 258), and define the labels as the numbers 1 through 22.
### Edit axis ticks
qb_plot <- ggplot(qb_adp, aes(x=True_ADP, y=Overall)) +
  geom_point(alpha=.5, color="#C77CFF") +
  labs(title="\nVisualizing QB Tiers in #SFB9", subtitle="") +
  xlab("\nADP (Round)\n") +
  ylab("\nDraft Round\n") +
  scale_x_continuous(breaks=seq(6, 260, 12), labels=seq(1, 22, 1)) +
  scale_y_continuous(breaks=seq(6, 260, 12), labels=seq(1, 22, 1))
### Show QB Plot
qb_plot
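The break and label sequences only line up if they have the same length, so it is worth checking the arithmetic. The sketch below assumes 12 picks per round (inferred from the step of 12 in `seq()`), and `pick_to_round` is a hypothetical helper added for illustration.

```r
picks_per_round <- 12   # assumption: inferred from the step used in seq()
n_rounds <- 22

# The breaks used in the scales: one tick per round, 6 picks into each round.
round_breaks <- seq(6, 260, picks_per_round)
length(round_breaks)   # 22, one per label in seq(1, 22, 1)
range(round_breaks)    # 6 258

# Hypothetical helper: the round that an overall pick falls in.
pick_to_round <- function(overall) ceiling(overall / picks_per_round)
pick_to_round(c(1, 12, 13, 264))   # 1 1 2 22
```

If the two sequences had different lengths, the scale would refuse to draw, so this is a quick sanity check before changing the number of rounds.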
Now, we’ll use ‘coord_cartesian’ to zoom into the range we’d like to focus on. This can be used to look at just the first x rounds, but here we’ll just trim the excess space on the ends. (This took a bit of trial & error: I had to specify limits of 12-252 to actually get 0-264, because by default ggplot2 pads a continuous axis by 5% of the specified range on each end, and 5% of 240 is 12.)
### Edit display settings to zoom/focus on range
qb_plot <- ggplot(qb_adp, aes(x=True_ADP, y=Overall)) +
  geom_point(alpha=.5, color="#C77CFF") +
  labs(title="\nVisualizing QB Tiers in #SFB9", subtitle="") +
  xlab("\nADP (Round)\n") +
  ylab("\nDraft Round\n") +
  scale_x_continuous(breaks=seq(6, 260, 12), labels=seq(1, 22, 1)) +
  scale_y_continuous(breaks=seq(6, 260, 12), labels=seq(1, 22, 1)) +
  coord_cartesian(xlim=c(12, 252), ylim=c(12, 252))
### Show QB Plot
qb_plot
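The extra space on each end comes from ggplot2’s default axis padding, which for continuous scales is 5% of the requested range. The sketch below works through that arithmetic for the limits used here, then shows one possible alternative: asking for 0-264 directly with `expand=FALSE`. The `zoom` object is a name introduced for this example.

```r
library(ggplot2)

# Default padding for continuous axes: 5% of the range on each end.
limits <- c(12, 252)
padding <- 0.05 * diff(limits)
displayed <- c(limits[1] - padding, limits[2] + padding)
displayed   # 0 264

# Alternative sketch: request the displayed range directly and turn padding off.
zoom <- coord_cartesian(xlim = c(0, 264), ylim = c(0, 264), expand = FALSE)
```

Using `zoom` in place of the `coord_cartesian(xlim=c(12, 252), ylim=c(12, 252))` line should show the same 0-264 window without the trial & error, though points sitting exactly on the edge may be drawn slightly differently.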