by Guangming Lang

Updated October 4, 2018

In this post, we’ll look at how to make effective heatmaps using ezplot. We’ll use a dataset of NBA players’ statistics from flowingdata.com. Make sure you first install ezplot by running the command devtools::install_github("gmlang/ezplot").

library(ezplot)
library(dplyr)
library(tidyr)

Let’s get the data. Notice that we pass the URL directly to read.csv(), so there is no need to download the file first.

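# stringsAsFactors = TRUE matches the pre-4.0 default, so Name is read as a factor
nba = read.csv("http://datasets.flowingdata.com/ppg2008.csv", stringsAsFactors = TRUE)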
# examine the variables
str(nba)
## 'data.frame':	50 obs. of  21 variables:
##  $ Name: Factor w/ 50 levels "Al Harrington ",..: 21 31 29 19 15 27 28 2 13 9 ...
##  $ G   : int  79 81 82 81 67 74 51 50 78 66 ...
##  $ MIN : num  38.6 37.7 36.2 37.7 36.2 39 38.2 36.6 38.5 34.5 ...
##  $ PTS : num  30.2 28.4 26.8 25.9 25.8 25.3 24.6 23.1 22.8 22.8 ...
##  $ FGM : num  10.8 9.7 9.8 9.6 8.5 8.9 6.7 9.7 8.1 8.1 ...
##  $ FGA : num  22 19.9 20.9 20 19.1 18.8 15.9 19.5 16.1 18.3 ...
##  $ FGP : num  0.491 0.489 0.467 0.479 0.447 0.476 0.42 0.497 0.503 0.443 ...
##  $ FTM : num  7.5 7.3 5.9 6 6 6.1 9 3.7 5.8 5.6 ...
##  $ FTA : num  9.8 9.4 6.9 6.7 6.9 7.1 10.3 5 6.7 7.1 ...
##  $ FTP : num  0.765 0.78 0.856 0.89 0.878 0.863 0.867 0.738 0.868 0.793 ...
##  $ X3PM: num  1.1 1.6 1.4 0.8 2.7 1.3 2.3 0 0.8 1 ...
##  $ X3PA: num  3.5 4.7 4.1 2.1 6.7 3.1 5.4 0.1 2.3 2.6 ...
##  $ X3PP: num  0.317 0.344 0.351 0.359 0.404 0.422 0.415 0 0.364 0.371 ...
##  $ ORB : num  1.1 1.3 1.1 1.1 0.7 1 0.6 3.4 0.9 1.6 ...
##  $ DRB : num  3.9 6.3 4.1 7.3 4.4 5.5 3 7.5 4.7 5.2 ...
##  $ TRB : num  5 7.6 5.2 8.4 5.1 6.5 3.6 11 5.5 6.8 ...
##  $ AST : num  7.5 7.2 4.9 2.4 2.7 2.8 2.7 1.6 11 3.4 ...
##  $ STL : num  2.2 1.7 1.5 0.8 1 1.3 1.2 0.8 2.8 1.1 ...
##  $ BLK : num  1.3 1.1 0.5 0.8 1.4 0.7 0.2 1.7 0.1 0.4 ...
##  $ TO  : num  3.4 3 2.6 1.9 2.5 3 2.9 1.8 3 3 ...
##  $ PF  : num  2.3 1.7 2.3 2.2 3.1 1.8 2.3 2.8 2.7 3 ...
# look at the first 5 rows and 8 columns
nba[1:5, 1:8]
##             Name  G  MIN  PTS  FGM  FGA   FGP FTM
## 1   Dwyane Wade  79 38.6 30.2 10.8 22.0 0.491 7.5
## 2  LeBron James  81 37.7 28.4  9.7 19.9 0.489 7.3
## 3   Kobe Bryant  82 36.2 26.8  9.8 20.9 0.467 5.9
## 4 Dirk Nowitzki  81 37.7 25.9  9.6 20.0 0.479 6.0
## 5 Danny Granger  67 36.2 25.8  8.5 19.1 0.447 6.0

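The variable Name holds the players’ names. read.csv() has read it in as a factor (the default before R 4.0; newer versions need stringsAsFactors = TRUE), with levels ordered alphabetically. Reorder its levels by points scored, so that the heatmap lists the players by scoring rather than by alphabet.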

nba$Name = with(nba, reorder(Name, PTS))
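
To see what reorder() does to the factor levels, here is a minimal sketch on toy data. The players and point values are made up for illustration and are not from the NBA dataset.

```r
# Toy data (hypothetical players, not from the NBA dataset)
toy = data.frame(Name = factor(c("Ann", "Bob", "Cy")),
                 PTS  = c(10, 30, 20))

levels(toy$Name)    # alphabetical: Ann, Bob, Cy

# reorder() sorts the levels by the value of PTS, in ascending order
toy$Name = with(toy, reorder(Name, PTS))

levels(toy$Name)    # by points: Ann, Cy, Bob
```

Only the level order changes. The rows of the data frame stay where they are.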

The other variables are various performance statistics. Before we can make a heatmap, we need to put the data in long format. In other words, we need to gather the names of the statistics into one column and their values into another, so that each row holds one player-statistic pair.

nba_m = nba %>% gather(stats, val, -Name)
head(nba_m)
##             Name stats val
## 1   Dwyane Wade      G  79
## 2  LeBron James      G  81
## 3   Kobe Bryant      G  82
## 4 Dirk Nowitzki      G  81
## 5 Danny Granger      G  67
## 6  Kevin Durant      G  74
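
If you are on tidyr 1.0.0 or later, pivot_longer() is the newer interface for the same reshaping. Below is a minimal sketch on a toy data frame; the names and values are made up for illustration.

```r
library(tidyr)

# Toy wide data (hypothetical values, not from the NBA dataset)
toy_wide = data.frame(Name = c("Ann", "Bob"),
                      G    = c(80, 70),
                      PTS  = c(25.1, 18.3))

# One row per player-statistic pair: every column except Name is
# stacked into a name column (stats) and a value column (val)
toy_long = pivot_longer(toy_wide, cols = -Name,
                        names_to = "stats", values_to = "val")
toy_long
```

The rows come out grouped by player rather than by statistic, which makes no difference to the heatmap.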

We also want to scale the values of each performance statistic so that they lie between 0 and 1. Grouping by stats first means each statistic is scaled against its own minimum and maximum.

dat = nba_m %>% group_by(stats) %>% mutate(val_scaled = scales::rescale(val))
head(dat)
## # A tibble: 6 x 4
## # Groups:   stats [1]
##   Name             stats   val val_scaled
##   <fct>            <chr> <dbl>      <dbl>
## 1 "Dwyane Wade "   G        79      0.947
## 2 "LeBron James "  G        81      0.982
## 3 "Kobe Bryant "   G        82      1    
## 4 "Dirk Nowitzki " G        81      0.982
## 5 "Danny Granger " G        67      0.737
## 6 "Kevin Durant "  G        74      0.860
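
With its default arguments, scales::rescale() applies min-max scaling: the smallest value maps to 0, the largest to 1, and everything else falls proportionally in between. Here is a hand-written sketch of that formula in base R. The helper rescale01() and the sample values are hypothetical, for illustration only.

```r
# Min-max scaling written out by hand (base R only).
# rescale01 is a hypothetical helper, not part of scales or ezplot.
rescale01 = function(x) {
  rng = range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

g = c(25, 67, 82)   # hypothetical games-played values
rescale01(g)        # the smallest value maps to 0, the largest to 1
```

This sketch divides by zero when every value in a column is the same, so it is meant for reading, not as a replacement for scales::rescale().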

With all this data prep work done, we’re ready to make a heatmap. This is super easy with ezplot.

plt = mk_heatmap(dat)
p = plt("stats", "Name", "val_scaled")
rotate_axis_text(p, 90, vjust_x = 0.5)
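
For readers who want to see the moving parts of a heatmap, here is a rough sketch that uses ggplot2 directly. It is not ezplot’s implementation, only an assumption about how a comparable plot can be built; the toy data frame and its values are made up.

```r
library(ggplot2)

# Toy long-format data (hypothetical values already scaled to [0, 1])
toy = data.frame(
  Name       = factor(rep(c("Ann", "Bob", "Cy"), times = 2)),
  stats      = rep(c("AST", "PTS"), each = 3),
  val_scaled = c(1, 0, 0.25, 0, 0.5, 1)
)

# One tile per player-statistic pair, colored by the scaled value
p_toy = ggplot(toy, aes(x = stats, y = Name, fill = val_scaled)) +
  geom_tile(color = "white") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
p_toy
```

The order of the tiles along the y-axis follows the factor levels of Name, which is why reordering the levels earlier matters.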

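Now it’s your turn. Make a heatmap using the unscaled values, and compare it with the scaled version. You will see they are very different. The scaled version is the one to use: the statistics are measured on very different scales (games played run into the dozens, while shooting percentages stay below 1), so only the scaled values can share one color scale.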

This is the last post in the ezplot how-to series. If you’ve enjoyed it, tell your friends about it. If you want to learn more about how to use ezplot, you can get my book here.