# Ordinal regression in R: part 1

A theoretical and applied walkthrough of ordinal regression. Part 1: the frequentist approach with ordinal.

R
regression
ordinal
frequentist statistics
Author

Taylor Dunn

Published

March 15, 2020

R setup
library(tidyverse)
library(dunnr)
library(gt)
library(broom)
library(patchwork)

extrafont::loadfonts(device = "win", quiet = TRUE)
theme_set(theme_td())
set_geom_fonts()
set_palette()

wine_red <- "#58181F"
update_geom_defaults("point", list(color = wine_red))
update_geom_defaults("line", list(color = wine_red))

The purpose of this post is to learn more about ordinal regression models (a.k.a. cumulative link, proportional odds, ordered logit models, etc.) and practice their implementation in R. This is part 1, where I’ll be taking the frequentist approach via the ordinal package. There are other options, like MASS::polr, but two features in particular drew me to ordinal: (1) it allows for random effects, and (2) it has broom::tidy methods available.

Particularly, I’ll be following along with

• this excellent primer which includes theory and application, and
• this vignette which is a tutorial on incorporating random effects.

## Setup

Import ordinal, and the included data set wine:

library(ordinal)
data(wine)
wine <- as_tibble(wine)
glimpse(wine)
Rows: 72
Columns: 6
$response <dbl> 36, 48, 47, 67, 77, 60, 83, 90, 17, 22, 14, 50, 30, 51, 90, 7…$ rating   <ord> 2, 3, 3, 4, 4, 4, 5, 5, 1, 2, 1, 3, 2, 3, 5, 4, 2, 3, 3, 2, 5…
$temp <fct> cold, cold, cold, cold, warm, warm, warm, warm, cold, cold, c…$ contact  <fct> no, no, yes, yes, no, no, yes, yes, no, no, yes, yes, no, no,…
$bottle <fct> 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5…$ judge    <fct> 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3…

wine is a data set from Randall (1989) of wine bitterness ratings from multiple judges. The variables are as follows:

• Outcome:
• response: wine bitterness rating on a 0-100 scale
• rating: ordered factor with 5 levels (grouped version of response) with 1 = “least bitter” and 5 = “most bitter”
• Treatment factors:
• temp: temperature during wine production (cold and warm)
• contact: contact between juice and skins during wine production (no and yes)
• Random effects
• bottle with 8 levels
• judge with 9 levels

Relationship between response and rating:

wine %>%
ggplot(aes(y = rating, x = response)) +
geom_boxplot(width = 0.5) +
geom_jitter(alpha = 0.5)

Note that there is no overlap between the levels.

There are 72 total observations with the following ratings distribution by treatment and random effects:

wine %>%
transmute(temp, contact, bottle, judge, rating = as.numeric(rating)) %>%
pivot_wider(names_from = judge, values_from = rating) %>%
gt() %>%
tab_spanner(columns = 1:9, label = "judge") %>%
data_color(
columns = 1:9,
colors = scales::col_numeric(
palette = c("white", wine_red), domain = c(1, 5)
)
)
temp contact bottle judge
1 2 3 4 5 6 7 8 9
cold no 1 2 1 2 3 2 3 1 2 1
cold no 2 3 2 3 2 3 2 1 2 2
cold yes 3 3 1 3 3 4 3 2 2 3
cold yes 4 4 3 2 2 3 2 2 3 2
warm no 5 4 2 5 3 3 2 2 3 3
warm no 6 4 3 5 2 3 4 3 3 2
warm yes 7 5 5 4 5 3 5 2 3 4
warm yes 8 5 4 4 3 3 4 3 4 4

So each bottle had a particular temp and contact (2 bottles for each of the 4 combinations), and each judge rated the bitterness each bottle.

Before modeling, can we see a clear effect of temp and contact?

wine %>%
count(contact, rating, temp) %>%
mutate(temp = fct_rev(temp)) %>%
ggplot(aes(x = temp, y = rating, color = temp)) +
geom_point(aes(group = temp, size = n)) +
facet_wrap(~contact, scales = "free_x",
labeller = labeller(contact = label_both)) +
scale_size(breaks = c(1, 2, 4, 6, 8)) +
add_facet_borders()

At a glance, it looks like the temp = warm and contact = yes is associated with higher ratings.

## Conclusion

Thanks to detailed documentation, fitting cumulative link (mixed) models is very easy with ordinal. In this post, we first learned the theoretical basis for these models, then worked through examples using wine bitterness ratings from multiple judges.

In the next post, I’ll explore the Bayesian approach to ordinal regression with the brms package.

## Reproducibility

Session info
 setting  value
version  R version 4.2.1 (2022-06-23 ucrt)
os       Windows 10 x64 (build 19044)
system   x86_64, mingw32
ui       RTerm
language (EN)
collate  English_Canada.utf8
ctype    English_Canada.utf8
tz       America/Curacao
date     2022-10-27
pandoc   2.18 @ C:/Program Files/RStudio/bin/quarto/bin/tools/ (via rmarkdown)
Git repository
Local:    main C:/Users/tdunn/Documents/tdunn-quarto
Remote:   main @ origin (https://github.com/taylordunn/tdunn-quarto.git)
Head:     [4eb5bf2] 2022-10-26: Added font import to style sheet

## References

Randall, J. H. 1989. The Analysis of Sensory Data by Generalized Linear Model.” Biometrical Journal 31 (7): 781–93. https://doi.org/10.1002/bimj.4710310703.

## Citation

BibTeX citation:
@online{dunn2020,
author = {Taylor Dunn},
title = {Ordinal Regression in {R:} Part 1},
date = {2020-03-15},
url = {https://tdunn.ca/posts/2020-03-15-ordinal-regression-in-r-part-1},
langid = {en}
}

For attribution, please cite this work as:
Taylor Dunn. 2020. “Ordinal Regression in R: Part 1.” March 15, 2020. https://tdunn.ca/posts/2020-03-15-ordinal-regression-in-r-part-1.