library(tidyverse)
library(gt)
26 Tables
But not a table. A table with features.
Sometimes, the best way to show your data is with a table – simple rows and columns. It allows a reader to compare whatever they want to compare a little easier than a graph where you’ve chosen what to highlight. The folks that made R Studio and the tidyverse have a neat package called gt
.
For this assignment, we’ll need gt
so go over to the console and run:
install.packages("gt")
So what does all of these libraries do? Let’s gather a few and use data of every game in the last 5 years.
For this walkthrough:
Load libraries.
And the data.
<- read_csv("data/logs1520.csv") logs
Rows: 68617 Columns: 44
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (6): HomeAway, Opponent, W_L, Team, Conference, season
dbl (36): X1, Game, TeamScore, OpponentScore, TeamFG, TeamFGA, TeamFGPCT, T...
lgl (1): Blank
date (1): Date
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Let’s ask this question: Fred Ball is supposed to be play fast and shoot threes – a pro-style offense. How much did Nebraska change in that regard from Tim Miles? In other words, which college basketball team saw the greatest increase in three point attempts last season as a percentage of shots? The simplest way to calculate that is by percent change.
We’ve got a little work to do, putting together ideas we’ve used before. What we need to end up with is some data that looks like this:
Team | 2018-2019 season threes | 2019-2020 season threes | pct change
To get that, we’ll need to do some filtering to get the right seasons, some grouping and summarizing to get the right number, some pivoting to get it organized correctly so we can mutate the percent change.
<- logs |>
threechange filter(season == "2018-2019" | season == "2019-2020") |>
group_by(Team, Conference, season) |>
summarise(Total3PA = sum(Team3PA)) |>
pivot_wider(names_from=season, values_from = Total3PA) |>
mutate(PercentChange = (`2019-2020`-`2018-2019`)/`2018-2019`) |>
arrange(desc(PercentChange)) |>
ungroup() |>
top_n(10) # just want a top 10 list
`summarise()` has grouped output by 'Team', 'Conference'. You can override
using the `.groups` argument.
Selecting by PercentChange
We’ve output tables to the screen a thousand times in this class with head
, but gt
makes them look decent with very little code.
|> gt() threechange
Team | Conference | 2018-2019 | 2019-2020 | PercentChange |
---|---|---|---|---|
Mississippi Valley State Delta Devils | SWAC | 554 | 837 | 0.5108303 |
Valparaiso Crusaders | MVC | 585 | 843 | 0.4410256 |
Ball State Cardinals | MAC | 621 | 842 | 0.3558776 |
San Jose State Spartans | MWC | 641 | 861 | 0.3432137 |
Alabama Crimson Tide | SEC | 718 | 957 | 0.3328691 |
Minnesota Golden Gophers | Big Ten | 603 | 762 | 0.2636816 |
Georgia Southern Eagles | Sun Belt | 631 | 792 | 0.2551506 |
Tennessee Tech Golden Eagles | OVC | 620 | 776 | 0.2516129 |
San Francisco Dons | WCC | 728 | 899 | 0.2348901 |
McNeese State Cowboys | Southland | 547 | 675 | 0.2340037 |
So there you have it. Mississippi Valley State changed their team so much they took 51 percent more threes last season from the season before. Where did Nebraska come out? Isn’t Fred Ball supposed to be a lot of threes? We ranked 111th in college basketball in terms of change from the season before. Believe it or not, Nebraska took four fewer threes in the first season of Fred Ball than the last season of Tim Miles.
gt
has a mountain of customization options. The good news is that it works in a very familiar pattern. We’ll start with fixing headers. What we have isn’t bad, but PercentChange isn’t good either. Let’s fix that.
|>
threechange gt() |>
cols_label(
PercentChange = "Percent Change"
)
Team | Conference | 2018-2019 | 2019-2020 | Percent Change |
---|---|---|---|---|
Mississippi Valley State Delta Devils | SWAC | 554 | 837 | 0.5108303 |
Valparaiso Crusaders | MVC | 585 | 843 | 0.4410256 |
Ball State Cardinals | MAC | 621 | 842 | 0.3558776 |
San Jose State Spartans | MWC | 641 | 861 | 0.3432137 |
Alabama Crimson Tide | SEC | 718 | 957 | 0.3328691 |
Minnesota Golden Gophers | Big Ten | 603 | 762 | 0.2636816 |
Georgia Southern Eagles | Sun Belt | 631 | 792 | 0.2551506 |
Tennessee Tech Golden Eagles | OVC | 620 | 776 | 0.2516129 |
San Francisco Dons | WCC | 728 | 899 | 0.2348901 |
McNeese State Cowboys | Southland | 547 | 675 | 0.2340037 |
Better. Note the pattern: Actual header name = “What we want to see”. So if we wanted to change Team to School, we’d do this: Team = "School"
inside the cols_label
bits.
Now we can start working with styling. The truth is most of your code in tables is going to be dedicated to styling specific things. The first thing we need: A headline and some chatter. They’re required parts of a graphic, so they’re a good place to start. We do that with tab_header
|>
threechange gt() |>
cols_label(
PercentChange = "Percent Change"
|>
) tab_header(
title = "Does Hoiberg's offense push threes more than Miles?",
subtitle = "Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense."
)
Does Hoiberg's offense push threes more than Miles? | ||||
Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense. | ||||
Team | Conference | 2018-2019 | 2019-2020 | Percent Change |
---|---|---|---|---|
Mississippi Valley State Delta Devils | SWAC | 554 | 837 | 0.5108303 |
Valparaiso Crusaders | MVC | 585 | 843 | 0.4410256 |
Ball State Cardinals | MAC | 621 | 842 | 0.3558776 |
San Jose State Spartans | MWC | 641 | 861 | 0.3432137 |
Alabama Crimson Tide | SEC | 718 | 957 | 0.3328691 |
Minnesota Golden Gophers | Big Ten | 603 | 762 | 0.2636816 |
Georgia Southern Eagles | Sun Belt | 631 | 792 | 0.2551506 |
Tennessee Tech Golden Eagles | OVC | 620 | 776 | 0.2516129 |
San Francisco Dons | WCC | 728 | 899 | 0.2348901 |
McNeese State Cowboys | Southland | 547 | 675 | 0.2340037 |
We have a headline and some chatter, but … gross. Centered? The extra lines? No real difference in font weight? We can do better. We can style individual elements using tab_style
. First, let’s make the main headline – the title
– bold and left aligned.
|>
threechange gt() |>
cols_label(
PercentChange = "Percent Change"
|>
) tab_header(
title = "Does Hoiberg's offense push threes more than Miles?",
subtitle = "Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense."
|> tab_style(
) style = cell_text(color = "black", weight = "bold", align = "left"),
locations = cells_title("title")
)
Does Hoiberg's offense push threes more than Miles? | ||||
Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense. | ||||
Team | Conference | 2018-2019 | 2019-2020 | Percent Change |
---|---|---|---|---|
Mississippi Valley State Delta Devils | SWAC | 554 | 837 | 0.5108303 |
Valparaiso Crusaders | MVC | 585 | 843 | 0.4410256 |
Ball State Cardinals | MAC | 621 | 842 | 0.3558776 |
San Jose State Spartans | MWC | 641 | 861 | 0.3432137 |
Alabama Crimson Tide | SEC | 718 | 957 | 0.3328691 |
Minnesota Golden Gophers | Big Ten | 603 | 762 | 0.2636816 |
Georgia Southern Eagles | Sun Belt | 631 | 792 | 0.2551506 |
Tennessee Tech Golden Eagles | OVC | 620 | 776 | 0.2516129 |
San Francisco Dons | WCC | 728 | 899 | 0.2348901 |
McNeese State Cowboys | Southland | 547 | 675 | 0.2340037 |
It’s hard to see here, but the chatter below is also centered (it doesn’t look like it because it fills the space). We can left align that too, but leave it normal weight (i.e. not bold).
|>
threechange gt() |>
cols_label(
PercentChange = "Percent Change"
|>
) tab_header(
title = "Does Hoiberg's offense push threes more than Miles?",
subtitle = "Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense."
|> tab_style(
) style = cell_text(color = "black", weight = "bold", align = "left"),
locations = cells_title("title")
|> tab_style(
) style = cell_text(color = "black", align = "left"),
locations = cells_title("subtitle")
)
Does Hoiberg's offense push threes more than Miles? | ||||
Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense. | ||||
Team | Conference | 2018-2019 | 2019-2020 | Percent Change |
---|---|---|---|---|
Mississippi Valley State Delta Devils | SWAC | 554 | 837 | 0.5108303 |
Valparaiso Crusaders | MVC | 585 | 843 | 0.4410256 |
Ball State Cardinals | MAC | 621 | 842 | 0.3558776 |
San Jose State Spartans | MWC | 641 | 861 | 0.3432137 |
Alabama Crimson Tide | SEC | 718 | 957 | 0.3328691 |
Minnesota Golden Gophers | Big Ten | 603 | 762 | 0.2636816 |
Georgia Southern Eagles | Sun Belt | 631 | 792 | 0.2551506 |
Tennessee Tech Golden Eagles | OVC | 620 | 776 | 0.2516129 |
San Francisco Dons | WCC | 728 | 899 | 0.2348901 |
McNeese State Cowboys | Southland | 547 | 675 | 0.2340037 |
The next item on the required elements list: Source and credit lines. In gt
, those are called tab_source_notes
and we can add them like this:
|>
threechange gt() |>
cols_label(
PercentChange = "Percent Change"
|>
) tab_header(
title = "Does Hoiberg's offense push threes more than Miles?",
subtitle = "Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense."
|> tab_style(
) style = cell_text(color = "black", weight = "bold", align = "left"),
locations = cells_title("title")
|> tab_style(
) style = cell_text(color = "black", align = "left"),
locations = cells_title("subtitle")
|>
) tab_source_note(
source_note = md("**By:** Matt Waite | **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
)
Does Hoiberg's offense push threes more than Miles? | ||||
Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense. | ||||
Team | Conference | 2018-2019 | 2019-2020 | Percent Change |
---|---|---|---|---|
Mississippi Valley State Delta Devils | SWAC | 554 | 837 | 0.5108303 |
Valparaiso Crusaders | MVC | 585 | 843 | 0.4410256 |
Ball State Cardinals | MAC | 621 | 842 | 0.3558776 |
San Jose State Spartans | MWC | 641 | 861 | 0.3432137 |
Alabama Crimson Tide | SEC | 718 | 957 | 0.3328691 |
Minnesota Golden Gophers | Big Ten | 603 | 762 | 0.2636816 |
Georgia Southern Eagles | Sun Belt | 631 | 792 | 0.2551506 |
Tennessee Tech Golden Eagles | OVC | 620 | 776 | 0.2516129 |
San Francisco Dons | WCC | 728 | 899 | 0.2348901 |
McNeese State Cowboys | Southland | 547 | 675 | 0.2340037 |
By: Matt Waite | Source: Sports Reference |
We can do a lot with tab_style
. For instance, we can make the headers bold and reduce the size a bit to reduce font congestion in the area.
|>
threechange gt() |>
cols_label(
PercentChange = "Percent Change"
|>
) tab_header(
title = "Does Hoiberg's offense push threes more than Miles?",
subtitle = "Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense."
|>
) tab_style(
style = cell_text(color = "black", weight = "bold", align = "left"),
locations = cells_title("title")
|>
) tab_style(
style = cell_text(color = "black", align = "left"),
locations = cells_title("subtitle")
|>
) tab_source_note(
source_note = md("**By:** Matt Waite | **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
|>
) tab_style(
locations = cells_column_labels(columns = everything()),
style = list(
cell_borders(sides = "bottom", weight = px(3)),
cell_text(weight = "bold", size=12)
) )
Does Hoiberg's offense push threes more than Miles? | ||||
Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense. | ||||
Team | Conference | 2018-2019 | 2019-2020 | Percent Change |
---|---|---|---|---|
Mississippi Valley State Delta Devils | SWAC | 554 | 837 | 0.5108303 |
Valparaiso Crusaders | MVC | 585 | 843 | 0.4410256 |
Ball State Cardinals | MAC | 621 | 842 | 0.3558776 |
San Jose State Spartans | MWC | 641 | 861 | 0.3432137 |
Alabama Crimson Tide | SEC | 718 | 957 | 0.3328691 |
Minnesota Golden Gophers | Big Ten | 603 | 762 | 0.2636816 |
Georgia Southern Eagles | Sun Belt | 631 | 792 | 0.2551506 |
Tennessee Tech Golden Eagles | OVC | 620 | 776 | 0.2516129 |
San Francisco Dons | WCC | 728 | 899 | 0.2348901 |
McNeese State Cowboys | Southland | 547 | 675 | 0.2340037 |
By: Matt Waite | Source: Sports Reference |
Next up: There’s a lot of lines in this that don’t need to be there. gt
has some tools to get rid of them easily and add in some other readability improvements.
|>
threechange gt() |>
cols_label(
PercentChange = "Percent Change"
|>
) tab_header(
title = "Does Hoiberg's offense push threes more than Miles?",
subtitle = "Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense."
|>
) tab_source_note(
source_note = md("**By:** Matt Waite | **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
|>
) tab_style(
style = cell_text(color = "black", weight = "bold", align = "left"),
locations = cells_title("title")
|>
) tab_style(
style = cell_text(color = "black", align = "left"),
locations = cells_title("subtitle")
|>
) tab_style(
locations = cells_column_labels(columns = everything()),
style = list(
cell_borders(sides = "bottom", weight = px(3)),
cell_text(weight = "bold", size=12)
)|>
) opt_row_striping() |>
opt_table_lines("none")
Does Hoiberg's offense push threes more than Miles? | ||||
Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense. | ||||
Team | Conference | 2018-2019 | 2019-2020 | Percent Change |
---|---|---|---|---|
Mississippi Valley State Delta Devils | SWAC | 554 | 837 | 0.5108303 |
Valparaiso Crusaders | MVC | 585 | 843 | 0.4410256 |
Ball State Cardinals | MAC | 621 | 842 | 0.3558776 |
San Jose State Spartans | MWC | 641 | 861 | 0.3432137 |
Alabama Crimson Tide | SEC | 718 | 957 | 0.3328691 |
Minnesota Golden Gophers | Big Ten | 603 | 762 | 0.2636816 |
Georgia Southern Eagles | Sun Belt | 631 | 792 | 0.2551506 |
Tennessee Tech Golden Eagles | OVC | 620 | 776 | 0.2516129 |
San Francisco Dons | WCC | 728 | 899 | 0.2348901 |
McNeese State Cowboys | Southland | 547 | 675 | 0.2340037 |
By: Matt Waite | Source: Sports Reference |
We’re in pretty good shape here, but look closer. What else makes this table sub-par? How about the formatting of the percent change? We can fix that with a formatter.
|>
threechange gt() |>
cols_label(
PercentChange = "Percent Change"
|>
) tab_header(
title = "Does Hoiberg's offense push threes more than Miles?",
subtitle = "Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense."
|>
) tab_source_note(
source_note = md("**By:** Matt Waite | **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
|>
) tab_style(
style = cell_text(color = "black", weight = "bold", align = "left"),
locations = cells_title("title")
|>
) tab_style(
style = cell_text(color = "black", align = "left"),
locations = cells_title("subtitle")
|>
) tab_style(
locations = cells_column_labels(columns = everything()),
style = list(
cell_borders(sides = "bottom", weight = px(3)),
cell_text(weight = "bold", size=12)
)|>
) opt_row_striping() |>
opt_table_lines("none") |>
fmt_percent(
columns = c(PercentChange),
decimals = 1
)
Does Hoiberg's offense push threes more than Miles? | ||||
Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense. | ||||
Team | Conference | 2018-2019 | 2019-2020 | Percent Change |
---|---|---|---|---|
Mississippi Valley State Delta Devils | SWAC | 554 | 837 | 51.1% |
Valparaiso Crusaders | MVC | 585 | 843 | 44.1% |
Ball State Cardinals | MAC | 621 | 842 | 35.6% |
San Jose State Spartans | MWC | 641 | 861 | 34.3% |
Alabama Crimson Tide | SEC | 718 | 957 | 33.3% |
Minnesota Golden Gophers | Big Ten | 603 | 762 | 26.4% |
Georgia Southern Eagles | Sun Belt | 631 | 792 | 25.5% |
Tennessee Tech Golden Eagles | OVC | 620 | 776 | 25.2% |
San Francisco Dons | WCC | 728 | 899 | 23.5% |
McNeese State Cowboys | Southland | 547 | 675 | 23.4% |
By: Matt Waite | Source: Sports Reference |
Throughout the semester, we’ve been using color and other signals to highlight things. Let’s pretend we’re doing a project on Minnesota. Note they’re the only Big Ten team on this list. With a little tab_style
magic, we can change individual rows and add color. The last tab_style
block here will first pass off the styles we want to use – we’re going to make the rows maroon and the text gold – and then for locations we specify where with a simple filter. What that means is that any rows we can address with logic – all rows with a value greater than X, for example – we can change the styling.
|>
threechange gt() |>
cols_label(
PercentChange = "Percent Change"
|>
) tab_header(
title = "Does Hoiberg's offense push threes more than Miles?",
subtitle = "Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense."
|>
) tab_source_note(
source_note = md("**By:** Matt Waite | **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
|>
) tab_style(
style = cell_text(color = "black", weight = "bold", align = "left"),
locations = cells_title("title")
|>
) tab_style(
style = cell_text(color = "black", align = "left"),
locations = cells_title("subtitle")
|>
) tab_style(
locations = cells_column_labels(columns = everything()),
style = list(
cell_borders(sides = "bottom", weight = px(3)),
cell_text(weight = "bold", size=12)
)|>
) opt_row_striping() |>
opt_table_lines("none") |>
fmt_percent(
columns = c(PercentChange),
decimals = 1
|>
) tab_style(
style = list(
cell_fill(color = "maroon"),
cell_text(color = "gold")
),locations = cells_body(
rows = Team == "Minnesota Golden Gophers")
)
Does Hoiberg's offense push threes more than Miles? | ||||
Nebraska wasn't in the top 100 of teams shooting more threes. These 10 teams completely changed their offense. | ||||
Team | Conference | 2018-2019 | 2019-2020 | Percent Change |
---|---|---|---|---|
Mississippi Valley State Delta Devils | SWAC | 554 | 837 | 51.1% |
Valparaiso Crusaders | MVC | 585 | 843 | 44.1% |
Ball State Cardinals | MAC | 621 | 842 | 35.6% |
San Jose State Spartans | MWC | 641 | 861 | 34.3% |
Alabama Crimson Tide | SEC | 718 | 957 | 33.3% |
Minnesota Golden Gophers | Big Ten | 603 | 762 | 26.4% |
Georgia Southern Eagles | Sun Belt | 631 | 792 | 25.5% |
Tennessee Tech Golden Eagles | OVC | 620 | 776 | 25.2% |
San Francisco Dons | WCC | 728 | 899 | 23.5% |
McNeese State Cowboys | Southland | 547 | 675 | 23.4% |
By: Matt Waite | Source: Sports Reference |
Two things here:
- Dear God that color scheme is awful, which is fitting for a school that worships a lawn-wrecking varmint.
- We’ve arrived where we want to be: We’ve created a clear table that allows a reader to compare schools at will while also using color to draw attention to the thing we want to draw attention to. We’ve kept it simple so the color has impact.