Trying new table libraries

code
analysis
Author

Matthew Waite

Published

March 21, 2021

I’ve recently shifted my Sports Data Analysis and Visualization class from creating static graphics that get exported as a png to creating blog posts. With that change, the chapter on tables is no longer terribly useful or relevant. Exporting a visually interesting table to a png was not easy, so the chapter has a lot of code and content dedicated to that.

It can go away now. The question is what should replace it.

Time for a table throwdown.

The criteria will be:

  1. Ease of use. The students in this class are beginners and have very minimal web design and HTML skills.
  2. Bang for buck on code. What can I do in the fewest lines?
  3. End result. What looks good here?
  4. Under active development. Don’t want to get left with obsolete code.
  5. Works seamlessly with the Tidyverse.
  6. Provides tools to add all the pieces I think storytelling visuals require, such as headlines, chatter, credit lines and a link to the data source.

The contestants: gt, Reactable, and Kable with KableExtra.

I considered Formattable, which has some really great features, but it lacks the ability to add a headline, chatter, source and credit lines. So it’s out (though it can be used in combination with Kable).

Let’s load and go.

Code
library(gt)
library(reactable)
library(htmltools)
library(kableExtra)
library(tidyverse)

# Keep the teams with the most wins. top_n() keeps ties, so this can return more than five rows.
stats <- read_csv("http://mattwaite.github.io/sportsdatafiles/stats21.csv") %>%
  arrange(desc(OverallWins)) %>%
  select(School, OverallWins, OverallSRS, OverallSOS) %>%
  top_n(5, wt = OverallWins)

I’m writing this during the NCAA basketball tournament so I’ll use some data from Sports Reference’s college basketball site. We’ll look at wins, team rating and strength of schedule.
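One detail in the loading code is worth a quick illustration: top_n() keeps ties, so asking for five rows can return more than five. Here is a minimal sketch with made-up win totals (the school names and numbers are placeholders, not the real data) that mirrors a four-way tie at 26 wins. The slice_max() call is an alternative I'm suggesting, not something the code above uses.

```r
library(dplyr)

# Made-up win totals for illustration only; the tie at 26 is deliberate.
wins <- tibble(
  School = c("A", "B", "C", "D", "E", "F", "G", "H"),
  OverallWins = c(31, 28, 28, 26, 26, 26, 26, 25)
)

# top_n() keeps every row tied with the fifth-best value.
top_five <- wins %>% top_n(5, wt = OverallWins)
nrow(top_five)

# slice_max() with with_ties = FALSE returns exactly n rows.
strict_five <- wins %>% slice_max(OverallWins, n = 5, with_ties = FALSE)
nrow(strict_five)
```

For a table about who got left out, keeping the ties is arguably the right call, since cutting at exactly five would drop teams with the same number of wins.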

My intention is to make a table that asks a question as the headline: Did Belmont get snubbed by the selection committee? It needs to have some explanation below the headline – chatter, in news graphics parlance. It’s going to display some basic data, use color to highlight highs and lows, and will have credit and source info at the bottom.

NOTE: There is a lot more you can do with these libraries that is very very cool. But I’m working with beginners, so I’m keeping it simple.

First up: gt

I haven’t used gt before now, but it’s from the folks at RStudio, so that sets the bar pretty high. It almost certainly means it plugs into the Tidyverse seamlessly, which is an absolute must for my class.

The first thing I’m going to do with each library is just see what you get out of the box with no changes.

Code
stats %>% gt()
School        OverallWins   OverallSRS   OverallSOS
Gonzaga                31        27.20         5.92
Baylor                 28        24.83         7.40
Houston                28        21.66         5.37
Alabama                26        19.58        10.01
Belmont                26         3.46        -8.71
Drake                  26         9.42        -0.48
Loyola (IL)            26        15.12         1.37

A solid start. I’m going to use examples from Tom Mock’s excellent gt Cookbook and the documentation’s intro post to make this table.

I’m not going to walk through all the steps here. I’m just going to try to implement a table that highlights differences, adds a headline, chatter, a source line and a credit line, and does some basic styling to remove excess lines and improve readability.

Code
stats %>%
  gt() %>%
  # Shorter, reader-friendly column labels
  cols_label(
    OverallWins = "Wins",
    OverallSRS = "Rating",
    OverallSOS = "Sched. Strength"
  ) %>%
  # Headline and chatter
  tab_header(
    title = "Did Belmont get snubbed?",
    subtitle = "They tied Gonzaga winning the most games in college basketball, but those wins weren't very convincing. Drake and Loyola, both tournament teams, had better wins against better opponents."
  ) %>%
  tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) %>%
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  ) %>%
  # Credit line and source line
  tab_source_note(
    source_note = "By Matt Waite"
  ) %>%
  tab_source_note(
    source_note = md("Source: [Sports Reference](https://www.sports-reference.com/cbb/seasons/2021-school-stats.html)")
  ) %>%
  # Bold school names
  tab_style(
    style = cell_text(color = "black", weight = "bold"),
    locations = cells_body(
      columns = c(School)
    )
  ) %>%
  # Strength of schedule: red and bold when negative, green when positive
  tab_style(
    style = cell_text(color = "red", weight = "bold"),
    locations = cells_body(
      columns = c(OverallSOS),
      rows = OverallSOS < 0
    )
  ) %>%
  tab_style(
    style = cell_text(color = "green", weight = "normal"),
    locations = cells_body(
      columns = c(OverallSOS),
      rows = OverallSOS > 0
    )
  ) %>%
  # Rating: same color rules
  tab_style(
    style = cell_text(color = "red", weight = "bold"),
    locations = cells_body(
      columns = c(OverallSRS),
      rows = OverallSRS < 0
    )
  ) %>%
  tab_style(
    style = cell_text(color = "green", weight = "normal"),
    locations = cells_body(
      columns = c(OverallSRS),
      rows = OverallSRS > 0
    )
  ) %>%
  # Stripe the rows, drop the default lines, then rule off the column labels
  opt_row_striping() %>%
  opt_table_lines("none") %>%
  tab_style(
    style = cell_borders(sides = c("top", "bottom"),
                         color = "grey", weight = px(1)),
    locations = cells_column_labels(everything())
  )
Did Belmont get snubbed?
They tied Gonzaga winning the most games in college basketball, but those wins weren't very convincing. Drake and Loyola, both tournament teams, had better wins against better opponents.
School        Wins   Rating   Sched. Strength
Gonzaga         31    27.20              5.92
Baylor          28    24.83              7.40
Houston         28    21.66              5.37
Alabama         26    19.58             10.01
Belmont         26     3.46             -8.71
Drake           26     9.42             -0.48
Loyola (IL)     26    15.12              1.37
By Matt Waite
Source: Sports Reference

Upside: That looks good. Really good. It has all the parts. It’s clean.

Downside: That’s a lot of repetitive code. I can see the reasoning for separate style blocks for each individual thing – to give maximum control over it – but that starts to pile up quickly. It’s not difficult code. It’s quite simple, really. But to a beginner, that’s a lot. It’s going to take a bit to ramp up with this.
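One way to cut down on the repetition, sketched here as an assumption rather than anything the gt documentation prescribes, is to wrap the paired red/green style blocks in a small function. The helper name style_by_sign and the demo data are made up for illustration. The helper takes the column's values as a plain vector and passes logical vectors to rows, which should match what the expressions like OverallSOS < 0 evaluate to in the code above.

```r
library(gt)
library(dplyr)

# Made-up ratings for illustration only; not the real season data.
demo <- tibble(
  School = c("Team A", "Team B", "Team C"),
  Rating = c(12.5, -3.2, 4.1),
  Schedule = c(6.3, -8.7, -0.5)
)

# Hypothetical helper (not part of gt): red and bold for negative values,
# green for positive values, applied to one body column.
style_by_sign <- function(gt_tbl, values, column) {
  is_negative <- values < 0
  is_positive <- values > 0
  gt_tbl %>%
    tab_style(
      style = cell_text(color = "red", weight = "bold"),
      locations = cells_body(columns = all_of(column), rows = is_negative)
    ) %>%
    tab_style(
      style = cell_text(color = "green", weight = "normal"),
      locations = cells_body(columns = all_of(column), rows = is_positive)
    )
}

# Two calls replace four separate tab_style() blocks.
styled <- demo %>%
  gt() %>%
  style_by_sign(demo$Rating, "Rating") %>%
  style_by_sign(demo$Schedule, "Schedule")
```

Writing functions is its own hurdle for beginners, so this trades one kind of complexity for another. It may be more useful as something to hand students than something to ask them to write.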

Next: Reactable.

To be fair to Reactable here, it’s not exactly a perfect fit. It’s made for interactive tables online, so I’m expecting some fussing here to make this work.

First we look at it without any styling.

Code
stats %>% reactable()

Better spacing than gt right out of the box, but otherwise similar.

Now to try to mimic the table I made in gt. I’m borrowing from the documentation’s examples post and some other demos.

Code
tbl <- stats %>%
  reactable(
    striped = TRUE,
    columns = list(
      OverallSOS = colDef(
        name = "Sched. Strength",
        # Green for positive, bold red for negative, grey for zero.
        # "normal" is the CSS keyword for regular font weight.
        style = function(value) {
          if (value > 0) {
            color <- "#008000"
            weight <- "normal"
          } else if (value < 0) {
            color <- "#e00000"
            weight <- "bold"
          } else {
            color <- "#777"
            weight <- "normal"
          }
          list(color = color, fontWeight = weight)
        }
      ),
      OverallSRS = colDef(
        name = "Rating",
        # Same color rules as strength of schedule
        style = function(value) {
          if (value > 0) {
            color <- "#008000"
            weight <- "normal"
          } else if (value < 0) {
            color <- "#e00000"
            weight <- "bold"
          } else {
            color <- "#777"
            weight <- "normal"
          }
          list(color = color, fontWeight = weight)
        }
      ),
      OverallWins = colDef(
        name = "Wins"
      ),
      School = colDef(
        style = list(fontWeight = "bold")
      )
    )
  )

div(class = "snub",
  div(class = "table-header",
    h2(class = "table-title", "Did Belmont get snubbed?"),
    "They tied Gonzaga winning the most games in college basketball, but those wins weren't very convincing. Drake and Loyola, both tournament teams, had better wins against better opponents."
  ),
  tbl,
  "Source: Sports Reference. By Matt Waite"
)

Did Belmont get snubbed?

They tied Gonzaga winning the most games in college basketball, but those wins weren't very convincing. Drake and Loyola, both tournament teams, had better wins against better opponents.
Source: Sports Reference. By Matt Waite

Things I like: The column definitions make a lot more sense to me, putting them all in the same place like that.

Things I do not like: I do not like that you need some HTML knowledge and another library – htmltools – to add a headline, chatter, and the source/credit line at the bottom. I can see why a developer would want that – the possibilities are limitless! – but that’s a luge track to paralysis for a beginner. One step on that ice and goodbye.
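On the column definitions: the two style functions in the Reactable code above are identical, so they could be defined once and reused. This is a minimal sketch under that assumption. The name sign_style and the demo data are made up for illustration, and the colors are the ones used above.

```r
library(reactable)

# Hypothetical helper: returns a CSS style list based on the sign of a value.
sign_style <- function(value) {
  if (value > 0) {
    list(color = "#008000", fontWeight = "normal")
  } else if (value < 0) {
    list(color = "#e00000", fontWeight = "bold")
  } else {
    list(color = "#777", fontWeight = "normal")
  }
}

# Made-up values for illustration only; not the real season data.
demo <- data.frame(
  School = c("Team A", "Team B"),
  Rating = c(12.5, -3.2),
  Schedule = c(6.3, -8.7)
)

# The same function is passed to both numeric columns.
tbl <- reactable(
  demo,
  striped = TRUE,
  columns = list(
    Rating = colDef(style = sign_style),
    Schedule = colDef(name = "Sched. Strength", style = sign_style),
    School = colDef(style = list(fontWeight = "bold"))
  )
)
```

This shortens the column definitions, but it does nothing about the headline, chatter and credit line still needing htmltools.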

Kable and KableExtra

I started using Kable recently, but I’m not totally sold on it. I’m going to use this as a comparison. I’ve used the documentation’s excellent examples for this.

Code
stats %>% kable() %>% kable_styling()
School        OverallWins   OverallSRS   OverallSOS
Gonzaga                31        27.20         5.92
Baylor                 28        24.83         7.40
Houston                28        21.66         5.37
Alabama                26        19.58        10.01
Belmont                26         3.46        -8.71
Drake                  26         9.42        -0.48
Loyola (IL)            26        15.12         1.37

Again, not bad. But the limitations of Kable are going to become clear quickly. There’s no easy way to add a headline, no way to add the chatter, and the way to add source and credit lines is to use the footnote, which adds an unnecessary “Note:” label at the bottom. There are some options for coloring text, but the ability to assign colors to specific cells is limited (and involves editing the data beforehand).

This is an example of a bad use case for this library.

Code
stats %>% 
  kable(caption = "Did Belmont get snubbed?", col.names = c("School", "Wins", "Rating", "Sched. Strength")) %>% 
  kable_styling() %>% 
  column_spec(1, bold = TRUE) %>%
  # One color per row: pass the whole column so the lengths match
  column_spec(4, color = spec_color(stats$OverallSOS, option = "D")) %>%
  footnote(general = "Source: Sports Reference | By Matt Waite")
Did Belmont get snubbed?
School        Wins   Rating   Sched. Strength
Gonzaga         31    27.20              5.92
Baylor          28    24.83              7.40
Houston         28    21.66              5.37
Alabama         26    19.58             10.01
Belmont         26     3.46             -8.71
Drake           26     9.42             -0.48
Loyola (IL)     26    15.12              1.37
Note:
Source: Sports Reference | By Matt Waite
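The point about coloring specific cells by editing the data beforehand might look like the sketch below. It uses kableExtra's cell_spec() to wrap each value in styled HTML before the table is built, which is why escape = FALSE is needed. The demo data are made up for illustration, and the red/green rule is borrowed from the gt example rather than from the Kable code above.

```r
library(dplyr)
library(kableExtra)

# Made-up values for illustration only; not the real season data.
demo <- tibble(
  School = c("Team A", "Team B"),
  Schedule = c(6.3, -8.7)
)

# The column is rewritten as HTML strings, so it is no longer numeric.
colored <- demo %>%
  mutate(
    Schedule = cell_spec(
      Schedule,
      format = "html",
      color = ifelse(Schedule < 0, "red", "green")
    )
  )

# escape = FALSE keeps the HTML from being printed as literal text.
colored %>%
  kable(format = "html", escape = FALSE) %>%
  kable_styling()
```

The catch is the one noted above: once the values are strings of HTML, anything downstream that expects numbers, such as sorting or rounding, has to happen before this step.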

The verdict

For my use case, which is specific, I’m really liking the output of gt. It’s not as lean and direct as I might want, but none of the libraries were. The output is clean and checks all the boxes for what I need.

Now to rewrite that chapter.