R Fundamentals & Statistics
Lessons 1–12 · Learn to work with data in R
The Basics
Learn the very basics of R: using it as a calculator, creating variables, and understanding data types. The examples draw on real Nebraska baseball statistics.
Data Basics
Learn fundamental data terminology — rows, columns, data types, and how to load a CSV. The vocabulary you'll use throughout the course.
Aggregates, Part 1
Learn how to take lots of individual records and total them into summaries — counting, summing, and grouping sports data with dplyr.
Aggregates, Part 2
Go deeper into grouping and summarizing — calculating means, medians, and multiple statistics at once across different categories.
Mutating Data
Learn how to create new columns from existing ones — calculating percentages, per-game rates, and other derived statistics.
Filters
Learn how to narrow in on what's important and remove what isn't — filtering rows by conditions to focus your analysis.
Transforming Data
Reshape data between wide and long formats — a critical skill for preparing data for visualization and analysis.
Significance Tests
Learn how to tell whether differences are meaningful or just noise. Covers t-tests and what statistical significance actually means for sports stories.
Correlations & Regressions
Learn how to measure how strongly two numbers are related and build a simple predictive model, a fundamental tool for data-driven sports stories.
Residuals
Learn how to use residuals to find who is better or worse than a model predicts — a powerful technique for identifying over- and underperformers.
Z-Scores
Learn how to compare athletes and teams across different leagues, seasons, and contexts by standardizing to z-scores.
Using Sports Data Packages
Learn to use R packages that pull sports data directly — NFL, NBA, college football — without downloading any CSV files manually.
Data Visualization with ggplot2
Lessons 13–27 · Build charts that tell stories
Bar Charts
Start turning data into graphics. Learn the grammar of ggplot2 and build your first bar chart from sports data.
Stacked Bar Charts
Add nuance to a bar chart by showing composition — how a total breaks down into parts — with stacked and filled bar charts.
Waffle Charts
Make a chart that shows both magnitude and part-to-whole composition using the waffle package — intuitive for general audiences.
Line Charts
Show change over time with line charts — the workhorse of sports trend analysis.
Step Charts
Show change over time where values hold steady until a discrete event — useful for cumulative totals and season progressions.
Slope Charts
Show how values changed between two specific points in time — great for before/after comparisons and season-over-season shifts.
Scatterplots
Show the relationship between two numeric variables. A staple for exploring whether two stats are actually connected.
Bubble Charts
Add a third dimension to a scatterplot by encoding a value in point size — showing relationships along with magnitude.
Beeswarm Plots
Make scatterplots grouped by category along a number line — ideal for comparing distributions across teams or conferences.
Bump Charts
Show how rankings change over time — a combination of step chart and connected dot plot that reveals movement in standings.
Dumbbell Charts
Show the difference between two values for multiple subjects — comparing teams, players, or time periods side by side.
Tables
Make data tables that are actually readable — formatted, styled, and ready for publication using the gt package.
Faceting
Make many small charts from one dataset with a single line of code — small multiples for comparing across groups.
Arranging Plots
Combine multiple charts into a single graphic using patchwork — build multi-panel figures for stories with several angles.
Encircling Points
Use shapes as a storytelling device — draw attention to specific data points by highlighting or encircling them.
Advanced Topics
Lessons 28–37 · Publishing, scraping, and modeling
Blogging with Quarto & GitHub Pages
Build a public portfolio of data stories using Quarto documents and GitHub Pages — the same tools working journalists use.
Text Cleaning
Clear out junk that data providers give you — inconsistent capitalization, extra whitespace, encoding issues — using stringr.
Annotations & Headlines
Add context directly to charts with text annotations, reference lines, and headlines that guide readers to the story in the data.
Color
Learn how limited, intentional use of color draws attention and focuses your story — and how overuse destroys meaning.
Finishing Touches
Add the final visual flourishes — proper axis labels, captions, themes, and font choices — that make a chart publication-ready.
Web Scraping with rvest
Learn to scrape data tables from websites using rvest — pull stats from anywhere on the web into a data frame in R.
Multiple Regression
Predict an outcome using multiple input variables — a more realistic model for complex sports questions than simple regression.
Clustering
Group similar teams or players together using k-means clustering — let math find the natural categories in your data.
Simulations
Use simulation to figure out whether a result was just bad luck or something genuinely unusual, a powerful tool for putting sports outcomes in context.
Joins
Combine two separate datasets into one by matching on a common field — essential for bringing together stats from different sources.
Reference Guides
Lessons 38–40 · Tools to have at your side
Getting CSVs from Sports Reference
Learn how to export any table from Sports Reference websites as a CSV and import it cleanly into R. Practical and immediately useful.
Project Checklist
A checklist for data-driven story projects — what to check before you publish, common mistakes to avoid, and what makes a strong data story.
Tutorial Index
An index of every function and concept covered across all 40 tutorials. Use it when you remember learning something but can't recall which lesson covered it.
Ready to start?
Install R, RStudio, and the tutorials package — then open the Tutorial tab and start Lesson 1. The whole setup takes about 15 minutes.
Installation Guide →