The Holiday-Volatility Paradox

In small markets, more holidays correlate with higher traffic volatility (r = +0.48). In larger markets, the relationship flips — more holidays predict lower volatility. Remake of TidyTuesday 2024 · Week 52

30DayChartChallenge
Data Visualization
R Programming
2026
In small markets, more holidays correlate with higher traffic volatility (r = +0.48). In larger markets, the relationship flips — more holidays predict lower volatility. A faceted scatter plot with a sign-flip summary panel, built with ggplot2 in R as part of the #30DayChartChallenge 2026 — Day 17: Remake.
Author

Steven Ponce

Published

April 17, 2026

Figure 1: A two-part data visualization titled “The Holiday-Volatility Paradox.” The left side shows four scatter plots arranged in a 2×2 grid, each showing the relationship between the average number of holidays per month (x-axis) and the coefficient of variation in air traffic (y-axis) for Small, Medium, Large, and Very Large markets. A red linear trend line in the Small Market panel shows a positive correlation (r = +0.48), while blue trend lines in the Medium (r = −0.22), Large (r = −0.17), and Very Large (r = −0.07) panels show negative correlations. A dashed reference line marks the industry median volatility. The right side shows a summary slope chart titled “The Sign Flip,” plotting the four correlation values across market sizes. Built in R/ggplot2 as a remake of TidyTuesday 2024 Week 52.
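
The sign flip in the figure comes down to computing Pearson's r separately within each market-size group. Below is a minimal sketch of that idea in base R. The `toy` data frame and its values are made up for illustration and are not the TidyTuesday data; only the `"%+.2f"` label style and the positive/negative split mirror the code further down.

```r
# Toy data: two groups whose holiday/volatility relationship runs in opposite directions.
toy <- data.frame(
  size_category = rep(c("Small", "Large"), each = 5),
  avg_holidays  = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
  cv            = c(0.10, 0.18, 0.22, 0.31, 0.35, 0.40, 0.33, 0.30, 0.21, 0.15)
)

# Pearson correlation within each group (split() orders groups alphabetically).
r_by_group <- sapply(
  split(toy, toy$size_category),
  function(d) cor(d$avg_holidays, d$cv)
)

# Signed labels in the "%+.2f" style, plus the direction used for colour encoding.
r_labels  <- sprintf("r = %+.2f", r_by_group)
direction <- ifelse(r_by_group > 0, "positive", "negative")

print(data.frame(r = round(r_by_group, 2), label = r_labels, direction = direction))
```

With real data the group sizes are small, so each r should be read as a descriptive summary rather than a tested effect.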

Steps to Create this Graphic

1. Load Packages & Setup

```{r}
#| label: load
#| warning: false
#| message: false      
#| results: "hide"     

## 1. LOAD PACKAGES & SETUP ----
suppressPackageStartupMessages({
pacman::p_load(
  tidyverse, ggtext, showtext, patchwork,
  janitor, scales, glue, ggrepel
  )
})

### |- figure size ----
camcorder::gg_record(
  dir    = here::here("temp_plots"),
  device = "png",
  width  = 12,
  height = 10,
  units  = "in",
  dpi    = 320
)

# Source utility functions
suppressMessages(source(here::here("R/utils/fonts.R")))
source(here::here("R/utils/social_icons.R"))
source(here::here("R/utils/image_utils.R"))
source(here::here("R/themes/base_theme.R"))
```

2. Read in the Data

```{r}
#| label: read
#| include: true
#| eval: true
#| warning: false

tt <- tidytuesdayR::tt_load(2024, week = 52)

global_holidays_raw <- tt$global_holidays |> clean_names()
monthly_passengers_raw  <- tt$monthly_passengers |> clean_names()
rm(tt)
```

3. Examine the Data

```{r}
#| label: examine
#| include: true
#| eval: true
#| results: 'hide'
#| warning: false

glimpse(global_holidays_raw)
glimpse(monthly_passengers_raw)
```

4. Tidy Data

```{r}
#| label: tidy
#| warning: false

monthly_passengers_clean <- monthly_passengers_raw |>
  mutate(
    date  = ymd(paste(year, month, "01", sep = "-")),
    total_passengers = coalesce(total, total_os)
  )

monthly_holidays_clean <- global_holidays_raw |>
  mutate(
    year  = year(date),
    month = month(date)
  ) |>
  group_by(iso3, year, month) |>
  summarise(
    holiday_count = n(),
    public_holidays = sum(type == "Public holiday"),
    .groups = "drop"
  )

combined_data <- monthly_passengers_clean |>
  left_join(monthly_holidays_clean, by = c("iso3", "year", "month"))

# Housekeeping
rm(global_holidays_raw, monthly_passengers_raw, monthly_holidays_clean, monthly_passengers_clean)
gc()

# Volatility summary by country ----
volatility_df <- combined_data |>
  group_by(iso3) |>
  summarise(
    mean_traffic = mean(total_passengers, na.rm = TRUE),
    sd_traffic = sd(total_passengers, na.rm = TRUE),
    cv = sd_traffic / mean_traffic,
    avg_holidays = mean(holiday_count, na.rm = TRUE),
    total_observations = n(),
    traffic_size = sum(total_passengers, na.rm = TRUE),
    .groups = "drop"
  ) |>
  filter(
    complete.cases(cv, avg_holidays),
    total_observations >= 12,
    cv >= 0,
    cv <= quantile(cv, 0.95, na.rm = TRUE)
  ) |>
  mutate(
    size_category = cut(
      traffic_size,
      breaks = quantile(traffic_size, probs = seq(0, 1, 0.25), na.rm = TRUE),
      labels = c("Small", "Medium", "Large", "Very Large"),
      include.lowest = TRUE
    )
  )

# Per-facet correlation values ----
cor_labels <- volatility_df |>
  group_by(size_category) |>
  summarise(
    r = cor(avg_holidays, cv, use = "complete.obs"),
    n_country = n(),
    .groups = "drop"
  ) |>
  mutate(
    # Direction drives color encoding in strip + summary panel
    direction = if_else(r > 0, "positive", "negative"),
    label = glue("r = {sprintf('%+.2f', r)}\nn = {n_country}")
  )

# Identify notable outliers per facet (top/bottom CV, only if genuinely extreme) ----
outlier_threshold <- 0.15 

outliers_df <- volatility_df |>
  left_join(
    volatility_df |>
      group_by(size_category) |>
      summarise(med_cv = median(cv), .groups = "drop"),
    by = "size_category"
  ) |>
  group_by(size_category) |>
  filter(
    abs(cv - med_cv) >= outlier_threshold,
    cv == max(cv) | cv == min(cv)
  ) |>
  ungroup()
```
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1517212 81.1    2463462 131.6  2463462 131.6
Vcells 2835047 21.7    8388608  64.0  7785807  59.5
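
The two quantities that drive the chart are the coefficient of variation (standard deviation divided by mean) and a quartile-based market-size bin. The sketch below reproduces both on simulated data in base R. The country codes `C1` to `C8` and the Poisson-generated passenger counts are hypothetical stand-ins, not values from the dataset.

```r
# Hypothetical monthly passenger counts for eight made-up countries (not real data).
set.seed(42)
toy <- data.frame(
  iso3 = rep(paste0("C", 1:8), each = 12),
  total_passengers = rpois(
    8 * 12,
    lambda = rep(c(50, 100, 200, 400, 800, 1600, 3200, 6400), each = 12)
  )
)

# Per-country summary: coefficient of variation (sd / mean) and total traffic.
by_country <- do.call(rbind, lapply(split(toy, toy$iso3), function(d) {
  data.frame(
    iso3         = d$iso3[1],
    cv           = sd(d$total_passengers) / mean(d$total_passengers),
    traffic_size = sum(d$total_passengers)
  )
}))

# Quartile bins on total traffic, labelled like the facets in the chart.
by_country$size_category <- cut(
  by_country$traffic_size,
  breaks = quantile(by_country$traffic_size, probs = seq(0, 1, 0.25)),
  labels = c("Small", "Medium", "Large", "Very Large"),
  include.lowest = TRUE
)

print(by_country)
```

Because the bins are quartiles of the filtered sample, each category holds roughly the same number of countries. One detail of the pipeline above is worth keeping in mind: months with no matching holiday row come through the left join as `NA` and are dropped by `na.rm = TRUE` rather than counted as zero holidays.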

5. Visualization Parameters

```{r}
#| label: params
#| include: true
#| warning: false

### |- plot aesthetics ----
colors <- get_theme_colors(
  palette = list(
    positive  = "#C0392B",   
    negative  = "#2980B9",  
    neutral   = "#7F8C8D",  
    point     = "#34495E",  
    highlight = "#E67E22",   
    bg        = "#F8F9FA"
  )
)

### |- titles and caption ----
title_text <- str_glue(
  "The Holiday-Volatility Paradox"
)

subtitle_text <- str_glue(
  "In small markets, more holidays correlate with **higher** traffic volatility (r = +0.48).<br>
    In larger markets, the relationship **flips** — more holidays predict **lower** volatility.<br>
    <span style='color:#C0392B;'>● Positive correlation</span> &nbsp;&nbsp;
    <span style='color:#2980B9;'>● Negative correlation</span> &nbsp;&nbsp;
    *Remake of TidyTuesday 2024 · Week 52*"
)

caption_text <- create_dcc_caption(
  dcc_year    = 2026,
  dcc_day     = 17,
  source_text = "Global Holidays and Travel · TidyTuesday 2024 Week 52"
)

### |- fonts ----
setup_fonts()
fonts <- get_font_families()

### |- base theme ----
base_theme <- create_base_theme(colors)

weekly_theme <- extend_weekly_theme(
  base_theme,
  theme(
    # strip labels
    strip.text = element_text(
      family = fonts$text, size = 11, face = "bold",
      margin = margin(b = 6)
    ),
    # axes
    axis.title = element_text(family = fonts$text, size = 9, color = "gray50"),
    axis.text = element_text(family = fonts$text, size = 8, color = "gray40"),
    axis.ticks = element_blank(),
    # grid — horizontal only, very faint
    panel.grid.major.y = element_line(color = "gray90", linewidth = 0.3),
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank(),
    # panel spacing — more breathing room
    panel.spacing.x = unit(2.0, "lines"),
    panel.spacing.y = unit(1.8, "lines"),
    # plot margins
    plot.margin = margin(t = 20, r = 20, b = 10, l = 20)
  )
)

theme_set(weekly_theme)
```

6. Plot

```{r}
#| label: plot
#| warning: false

### |- helper: strip color per facet ----
strip_labels <- cor_labels |>
  mutate(
    strip_color = if_else(direction == "positive",
                          colors$palette$positive,
                          colors$palette$negative
    ),
    strip_text = glue("{size_category} Market"),
    x_pos = -Inf,
    y_pos = Inf
  )

### |- main scatter plot (p_main) ----
p_main <- ggplot(volatility_df, aes(x = avg_holidays, y = cv)) +
  
  # Geoms
  geom_hline(
    yintercept = median(volatility_df$cv),
    linetype = "dashed",
    color = colors$palette$neutral,
    linewidth  = 0.4,
    alpha = 0.6
  ) +
  geom_point(
    color = colors$palette$point,
    size = 2.8,
    alpha = 0.75,
    shape = 16
  ) +
  geom_smooth(
    aes(color = size_category),
    method = "lm",
    formula = y ~ x,
    linewidth = 1,
    se = TRUE,
    alpha = 0.12
  ) +
  geom_text_repel(
    data = outliers_df,
    aes(label = iso3),
    size = 2.8,
    color = colors$palette$highlight,
    fontface = "bold",
    max.overlaps = 3,
    box.padding = 0.5,
    segment.color = colors$palette$neutral,
    segment.alpha = 0.5,
    segment.size = 0.3,
    seed = 123
  ) +
  geom_text(
    # One-row data frame so the label is drawn once, not once per country
    data = volatility_df |> filter(size_category == "Small") |> distinct(size_category),
    x = Inf, y = median(volatility_df$cv) + 0.025,
    label = "industry median",
    size = 2.5, color = colors$palette$neutral,
    hjust = 1.05, vjust = -0.3, fontface = "italic",
    inherit.aes = FALSE
  ) +
  geom_text(
    data = cor_labels,
    aes(
      x = Inf,
      y = 0.62,
      label = label,
      color = direction
    ),
    size = 3,
    hjust = 1.1,
    vjust = 1,
    fontface = "bold",
    inherit.aes = FALSE
  ) +
  geom_label(
    data = strip_labels,
    aes(
      x = x_pos,
      y = y_pos,
      label = strip_text,
      fill = direction
    ),
    hjust = -0.05,
    vjust = 1.3,
    size = 3.2,
    fontface = "bold",
    color = "white",
    label.size = 0,
    label.padding = unit(0.25, "lines"),
    inherit.aes = FALSE
  ) +
  # Scales
  # Single colour scale covering both mappings: trend lines (size_category)
  # and correlation labels (direction). A second scale_color_manual() call
  # would replace the first, so all values live in one scale.
  scale_color_manual(
    values = c(
      "positive" = colors$palette$positive,
      "negative" = colors$palette$negative,
      "Small" = colors$palette$positive,
      "Medium" = colors$palette$negative,
      "Large" = colors$palette$negative,
      "Very Large" = colors$palette$negative
    )
  ) +
  scale_fill_manual(
    values = c(
      "positive" = colors$palette$positive,
      "negative" = colors$palette$negative
    )
  ) +
  scale_y_continuous(
    breaks = seq(0, 0.75, by = 0.25),
    limits = c(0, 0.75), # CV cannot be negative — floor at 0
    labels = percent_format(accuracy = 1)
  ) +
  scale_x_continuous(
    breaks = seq(2, 8, by = 2),
    expand = expansion(mult = c(0.05, 0.1))
  ) +
  
  # Facet ---
  facet_wrap(
    ~size_category,
    nrow   = 2,
    scales = "free_x"
  ) +
  
  # Labs
  labs(
    x = "Average Number of Holidays per Month",
    y = "Coefficient of Variation in Traffic"
  ) +
  
  # Theme
  theme(
    plot.title = element_text(
      size = rel(2),
      family = fonts$title,
      face = "bold",
      color = colors$title,
      lineheight  = 1.1,
      margin = margin(t = 5, b = 5)
    ),
    plot.subtitle = element_markdown(
      size = rel(0.95),
      family = fonts$text,
      color = colors$subtitle,
      lineheight = 1.4,
      margin = margin(t = 5, b = 15)
    ),
    strip.text = element_blank()
  )

### |- summary panel: r values across market sizes (p_summary) ----

# Standalone panel 
p_summary <- cor_labels |>
  mutate(
    size_category = factor(size_category, levels = c("Small", "Medium", "Large", "Very Large")),
    x_num  = as.numeric(size_category),
    vjust_val = if_else(r > 0, -1.2, 2.0)
  ) |>
  ggplot(aes(x = x_num, y = r, color = direction)) +
  
  # Annotate
  annotate(
    "rect",
    xmin = 0.5, xmax = 4.5,
    ymin = 0,   ymax = 0.55,
    fill  = colors$palette$positive,
    alpha = 0.04
  ) +
  
  # Geoms
  geom_hline(
    yintercept = 0,
    linetype = "dashed",
    color = colors$palette$neutral,
    linewidth  = 0.5
  ) +
  geom_line(color = "#BBBBBB", linewidth = 0.8) +
  geom_point(size = 6) +
  geom_text(
    aes(label = sprintf("%+.2f", r), vjust = vjust_val),
    size = 3.2,
    fontface = "bold"
  ) +
  annotate(
    "text",
    x = 2.5, y = 0.62,
    label = "← positive          negative →",
    size = 2.6,
    color = colors$palette$neutral,
    family = 'sans',
    fontface = "italic",
    hjust = 0.5
  ) +
  
  # Scales
  scale_color_manual(
    values = c(
      "positive" = colors$palette$positive,
      "negative" = colors$palette$negative
    ),
    guide = "none"
  ) +
  scale_x_continuous(
    breaks = 1:4,
    labels = c("Small", "Medium", "Large", "Very\nLarge"),
    expand = expansion(add = 0.5) 
  ) +
  scale_y_continuous(
    limits = c(-0.45, 0.70),
    breaks = seq(-0.4, 0.6, by = 0.2),
    labels = function(x) sprintf("%+.1f", x)
  ) +
  # Labs
  labs(
    x = "Market Size",
    y = "Correlation (r)",
    title = "The Sign Flip"
  ) +
  # Theme
  theme(
    plot.margin = margin(t = 10, r = 18, b = 10, l = 18),
    plot.title = element_text(
      size = rel(1.1),
      family  = fonts$text,
      face = "bold",
      color = colors$title,
      margin = margin(b = 10)
    ),
    axis.text.x = element_text(
      size = rel(0.78), color = "#555555",
      margin = margin(t = 6)
    ),
    axis.text.y = element_text(size = rel(0.78), color = "#555555"),
    axis.title.x = element_text(
      size = rel(0.85), color = "#444444",
      margin = margin(t = 8)
    ),
    axis.title.y = element_text(
      size = rel(0.85), color = "#444444",
      margin = margin(r = 8)
    ),
    panel.grid.major.x = element_blank(),
    panel.grid.major.y = element_line(color = "gray88", linewidth = 0.2),
    panel.grid.minor = element_blank(),
    axis.ticks = element_blank()
  )

# Combined plots
p_right <- p_summary / plot_spacer() +
  plot_layout(heights = c(1, 0.20)) 

# Left = evidence (2x2 scatter), Right = conclusion 
p_final <- 
  (p_main | p_right) +
  plot_layout(widths = c(2.4, 1.1)) +
  plot_annotation(
    title = title_text,
    subtitle = subtitle_text,
    caption = caption_text,
    theme = theme(
      plot.title = element_text(
        size = rel(2),
        family = fonts$title,
        face = "bold",
        color = colors$title,
        lineheight = 1.1,
        margin = margin(t = 5, b = 5)
      ),
      plot.subtitle = element_markdown(
        size = rel(0.95),
        family = 'sans',
        color = colors$subtitle,
        lineheight = 1.4,
        margin = margin(t = 5, b = 15)
      ),
      plot.caption = element_markdown(
        family = fonts$caption,
        size = rel(0.65),
        color = colors$caption,
        linewidth = 1.3,
        hjust = 0,
        margin = margin(t = 15)
      ),
      plot.margin = margin(15, 15, 10, 15)
    )
  )
```
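
The composition step is easier to see in isolation. The following is a minimal sketch of the same patchwork layout, using placeholder plots built from the built-in `mtcars` data (an assumption for illustration; the real panels are `p_main` and `p_summary` above). The width and height ratios are copied from the code above.

```r
library(ggplot2)
library(patchwork)

# Placeholder panels standing in for the scatter grid and the summary chart.
p_left  <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + facet_wrap(~cyl, nrow = 2)
p_small <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot()

# Stack the summary panel over an empty spacer so it does not stretch to full height.
p_right <- p_small / plot_spacer() + plot_layout(heights = c(1, 0.2))

# Wide evidence panel on the left, narrow summary on the right.
p_combined <- (p_left | p_right) +
  plot_layout(widths = c(2.4, 1.1)) +
  plot_annotation(title = "Placeholder title")
```

Keeping the summary as its own plot, rather than a fifth facet, lets it use a different coordinate system (correlation on the y-axis) without affecting the scatter panels.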

7. Save

```{r}
#| label: save
#| warning: false

### |-  plot image ----  
save_plot_patchwork(
  p_final, 
  type = "30daychartchallenge", 
  year = 2026, 
  day = 17, 
  width = 12, 
  height = 10
  )
```

8. Session Info

R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 26100)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] here_1.0.2      ggrepel_0.9.8   glue_1.8.0      scales_1.4.0   
 [5] janitor_2.2.1   patchwork_1.3.2 showtext_0.9-7  showtextdb_3.0 
 [9] sysfonts_0.8.9  ggtext_0.1.2    lubridate_1.9.5 forcats_1.0.1  
[13] stringr_1.6.0   dplyr_1.2.0     purrr_1.2.1     readr_2.2.0    
[17] tidyr_1.3.2     tibble_3.2.1    ggplot2_4.0.2   tidyverse_2.0.0

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.1   farver_2.1.2       S7_0.2.0           fastmap_1.2.0     
 [5] gh_1.4.1           pacman_0.5.1       digest_0.6.39      timechange_0.4.0  
 [9] lifecycle_1.0.5    rsvg_2.6.2         magrittr_2.0.3     compiler_4.3.1    
[13] rlang_1.1.7        tools_4.3.1        yaml_2.3.12        knitr_1.51        
[17] htmlwidgets_1.6.4  bit_4.6.0          curl_7.0.0         xml2_1.5.2        
[21] camcorder_0.1.0    RColorBrewer_1.1-3 tidytuesdayR_1.2.1 withr_3.0.2       
[25] grid_4.3.1         gitcreds_0.1.2     cli_3.6.5          rmarkdown_2.30    
[29] crayon_1.5.3       generics_0.1.4     otel_0.2.0         rstudioapi_0.18.0 
[33] tzdb_0.5.0         commonmark_2.0.0   splines_4.3.1      parallel_4.3.1    
[37] ggplotify_0.1.3    vctrs_0.7.1        yulab.utils_0.2.4  Matrix_1.5-4.1    
[41] jsonlite_2.0.0     litedown_0.9       gridGraphics_0.5-1 hms_1.1.4         
[45] bit64_4.6.0-1      systemfonts_1.3.2  magick_2.8.6       gifski_1.32.0-2   
[49] codetools_0.2-19   stringi_1.8.7      gtable_0.3.6       pillar_1.11.1     
[53] rappdirs_0.3.4     htmltools_0.5.9    R6_2.6.1           httr2_1.2.2       
[57] rprojroot_2.1.1    vroom_1.7.0        evaluate_1.0.5     lattice_0.21-8    
[61] markdown_2.0       gridtext_0.1.6     snakecase_0.11.1   Rcpp_1.1.1        
[65] svglite_2.1.3      nlme_3.1-162       mgcv_1.8-42        xfun_0.56         
[69] fs_1.6.7           pkgconfig_2.0.3   

9. GitHub Repository


The complete code for this analysis is available in 30dcc_2026_17.qmd.

For the full repository, click here.

10. References

  1. Data Sources:
    • TidyTuesday. (2024). Global Holidays and Travel — Week 52 [Dataset]. https://github.com/rfordatascience/tidytuesday/tree/main/data/2024/2024-12-24
    • WorldPop Hub. Monthly passenger traffic data [Dataset]. https://www.worldpop.org/
    • Original submission: Ponce, S. (2024). TidyTuesday 2024 Week 52. https://github.com/poncest/tidytuesday/tree/main/2024/Week_52
  2. Remake Methodology:
    • Switched from LOESS to linear smooth (method = "lm") for analytical honesty with sparse panel data
    • Removed size-encoded points to focus on a single visual story: correlation sign change across market size
    • Restructured layout from single faceted plot to patchwork composition (evidence left, synthesis right) to eliminate coordinate system bleed
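
One reason a linear smooth pairs well with a correlation label is that the least-squares slope always has the same sign as Pearson's r, since slope = r × sd(y) / sd(x). The sketch below checks that identity on a small made-up sample and fits a LOESS curve alongside for contrast. The `x` and `y` values are illustrative assumptions, not points from the chart.

```r
# Small made-up sample standing in for one sparse facet (values are illustrative only).
x <- c(2.1, 2.8, 3.5, 4.0, 4.6, 5.3, 6.1, 7.0)
y <- c(0.42, 0.35, 0.39, 0.30, 0.33, 0.24, 0.28, 0.20)

# Straight-line fit: one slope, whose sign matches the sign of Pearson's r.
fit_lm <- lm(y ~ x)
slope  <- unname(coef(fit_lm)["x"])
r      <- cor(x, y)

# Identity behind that claim: slope = r * sd(y) / sd(x).
slope_from_r <- r * sd(y) / sd(x)

# LOESS for comparison: a local fit with no single slope to report.
fit_loess   <- loess(y ~ x, span = 0.75)
loess_curve <- predict(fit_loess, newdata = data.frame(x = x))

print(c(slope = slope, r = r, slope_from_r = slope_from_r))
print(round(loess_curve, 3))
```

With only a handful of countries per panel, a LOESS curve can bend toward individual points, whereas the straight line shows exactly the relationship that r summarizes.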

11. Custom Functions Documentation

📦 Custom Helper Functions

This analysis uses custom functions from my personal module library for efficiency and consistency across projects.

Functions Used:

  • fonts.R: setup_fonts(), get_font_families() - Font management with showtext
  • social_icons.R: create_social_caption() - Generates formatted social media captions
  • image_utils.R: save_plot() - Consistent plot saving with naming conventions
  • base_theme.R: create_base_theme(), extend_weekly_theme(), get_theme_colors() - Custom ggplot2 themes

Why custom functions?
These utilities standardize theming, fonts, and output across all my data visualizations. The core analysis (data tidying and visualization logic) uses only the tidyverse and the other publicly available packages loaded in step 1.

Source Code:
View all custom functions → GitHub: R/utils


Citation

BibTeX citation:
@online{ponce2026,
  author = {Ponce, Steven},
  title = {The {Holiday-Volatility} {Paradox}},
  date = {2026-04-17},
  url = {https://stevenponce.netlify.app/data_visualizations/30DayChartChallenge/2026/30dcc_2026_17.html},
  langid = {en}
}
For attribution, please cite this work as:
Ponce, Steven. 2026. “The Holiday-Volatility Paradox.” April 17, 2026. https://stevenponce.netlify.app/data_visualizations/30DayChartChallenge/2026/30dcc_2026_17.html.
Source Code
---
title: "The Holiday-Volatility Paradox"
subtitle: "In small markets, more holidays correlate with higher traffic volatility (r = +0.48). In larger markets, the relationship flips — more holidays predict lower volatility. Remake of TidyTuesday 2024 · Week 52"
description: "In small markets, more holidays correlate with higher traffic volatility (r = +0.48). In larger markets, the relationship flips — more holidays predict lower volatility. A faceted scatter plot with a sign-flip summary panel, built with ggplot2 in R as part of the #30DayChartChallenge 2026 — Day 17: Remake."
date: "2026-04-17" 
author:
  - name: "Steven Ponce"
    url: "https://stevenponce.netlify.app"
citation:
  url: "https://stevenponce.netlify.app/data_visualizations/30DayChartChallenge/2026/30dcc_2026_17.html"
categories: ["30DayChartChallenge", "Data Visualization", "R Programming", "2026"]
tags: [
  "30DayChartChallenge",
  "Relationships",
  "Remake",
  "Scatter Plot",
  "Small Multiples",
  "Patchwork",
  "Correlation",
  "Air Traffic",
  "Holiday Patterns",
  "Market Size",
  "ggplot2",
  "ggrepel",
  "TidyTuesday"
]
image: "thumbnails/30dcc_2026_17.png"
format:
  html:
    toc: true
    toc-depth: 5
    code-link: true
    code-fold: true
    code-tools: true
    code-summary: "Show code"
    self-contained: true
    theme: 
      light: [flatly, assets/styling/custom_styles.scss]
      dark: [darkly, assets/styling/custom_styles_dark.scss]
editor_options: 
  chunk_output_type: inline
execute: 
  freeze: true                                                  
  cache: true                                                   
  error: false
  message: false
  warning: false
  eval: true
---

![A two-part data visualization titled "The Holiday-Volatility Paradox." The left side shows four scatter plots arranged in a 2×2 grid, each showing the relationship between the average number of holidays per month (x-axis) and the coefficient of variation in air traffic (y-axis) for Small, Medium, Large, and Very Large markets. A red linear trend line in the Small Market panel shows a positive correlation (r = +0.48), while blue trend lines in the Medium (r = −0.22), Large (r = −0.17), and Very Large (r = −0.07) panels show negative correlations. A dashed reference line marks the industry median volatility. The right side shows a summary slope chart titled "The Sign Flip," plotting the four correlation values across market sizes and built in R/ggplot2 as a remake of TidyTuesday 2024 Week 52.](30dcc_2026_17.png){#fig-1}

### [**Steps to Create this Graphic**]{.mark}

#### [1. Load Packages & Setup]{.smallcaps}

```{r}
#| label: load
#| warning: false
#| message: false      
#| results: "hide"     

## 1. LOAD PACKAGES & SETUP ----
suppressPackageStartupMessages({
pacman::p_load(
  tidyverse, ggtext, showtext, patchwork,
  janitor, scales, glue, ggrepel
  )
})

### |- figure size ----
camcorder::gg_record(
  dir    = here::here("temp_plots"),
  device = "png",
  width  = 12,
  height = 10,
  units  = "in",
  dpi    = 320
)

# Source utility functions
suppressMessages(source(here::here("R/utils/fonts.R")))
source(here::here("R/utils/social_icons.R"))
source(here::here("R/utils/image_utils.R"))
source(here::here("R/themes/base_theme.R"))
```

#### [2. Read in the Data]{.smallcaps}

```{r}
#| label: read
#| include: true
#| eval: true
#| warning: false

tt <- tidytuesdayR::tt_load(2024, week = 52)

global_holidays_raw <- tt$global_holidays |> clean_names()
monthly_passengers_raw  <- tt$monthly_passengers |> clean_names()
rm(tt)
```

#### [3. Examine the Data]{.smallcaps}

```{r}
#| label: examine
#| include: true
#| eval: true
#| results: 'hide'
#| warning: false

glimpse(global_holidays_raw)
glimpse(monthly_passengers_raw)

```

#### [4. Tidy Data]{.smallcaps}

```{r}
#| label: tidy
#| warning: false

monthly_passengers_clean <- monthly_passengers_raw |>
  mutate(
    date  = ymd(paste(year, month, "01", sep = "-")),
    total_passengers = coalesce(total, total_os)
  )

monthly_holidays_clean <- global_holidays_raw |>
  mutate(
    year  = year(date),
    month = month(date)
  ) |>
  group_by(iso3, year, month) |>
  summarise(
    holiday_count = n(),
    public_holidays = sum(type == "Public holiday"),
    .groups = "drop"
  )

combined_data <- monthly_passengers_clean |>
  left_join(monthly_holidays_clean, by = c("iso3", "year", "month"))

# Housekeeping
rm(global_holidays_raw, monthly_passengers_raw, monthly_holidays_clean, monthly_passengers_clean)
gc()

# Volatility summary by country ----
volatility_df <- combined_data |>
  group_by(iso3) |>
  summarise(
    mean_traffic = mean(total_passengers, na.rm = TRUE),
    sd_traffic = sd(total_passengers, na.rm = TRUE),
    cv = sd_traffic / mean_traffic,
    avg_holidays = mean(holiday_count, na.rm = TRUE),
    total_observations = n(),
    traffic_size = sum(total_passengers, na.rm = TRUE),
    .groups = "drop"
  ) |>
  filter(
    complete.cases(cv, avg_holidays),
    total_observations >= 12,
    cv >= 0,
    cv <= quantile(cv, 0.95, na.rm = TRUE)
  ) |>
  mutate(
    size_category = cut(
      traffic_size,
      breaks = quantile(traffic_size, probs = seq(0, 1, 0.25), na.rm = TRUE),
      labels = c("Small", "Medium", "Large", "Very Large"),
      include.lowest = TRUE
    )
  )

# Per-facet correlation values ----
cor_labels <- volatility_df |>
  group_by(size_category) |>
  summarise(
    r = cor(avg_holidays, cv, use = "complete.obs"),
    n_country = n(),
    .groups = "drop"
  ) |>
  mutate(
    # Direction drives color encoding in strip + summary panel
    direction = if_else(r > 0, "positive", "negative"),
    label = glue("r = {sprintf('%+.2f', r)}\nn = {n_country}")
  )

# Identify notable outliers per facet (top/bottom CV, only if genuinely extreme) ----
outlier_threshold <- 0.15 

outliers_df <- volatility_df |>
  left_join(
    volatility_df |>
      group_by(size_category) |>
      summarise(med_cv = median(cv), .groups = "drop"),
    by = "size_category"
  ) |>
  group_by(size_category) |>
  filter(
    abs(cv - med_cv) >= outlier_threshold,
    cv == max(cv) | cv == min(cv)
  ) |>
  ungroup()
```


#### [5. Visualization Parameters]{.smallcaps}

```{r}
#| label: params
#| include: true
#| warning: false

### |- plot aesthetics ----
colors <- get_theme_colors(
  palette = list(
    positive  = "#C0392B",   
    negative  = "#2980B9",  
    neutral   = "#7F8C8D",  
    point     = "#34495E",  
    highlight = "#E67E22",   
    bg        = "#F8F9FA"
  )
)

### |- titles and caption ----
title_text <- str_glue(
  "The Holiday-Volatility Paradox"
)

subtitle_text <- str_glue(
  "In small markets, more holidays correlate with **higher** traffic volatility (r = +0.48).<br>
    In larger markets, the relationship **flips** — more holidays predict **lower** volatility.<br>
    <span style='color:#C0392B;'>● Positive correlation</span> &nbsp;&nbsp;
    <span style='color:#2980B9;'>● Negative correlation</span> &nbsp;&nbsp;
    *Remake of TidyTuesday 2024 · Week 52*"
)

caption_text <- create_dcc_caption(
  dcc_year    = 2026,
  dcc_day     = 17,
  source_text = "Global Holidays and Travel · TidyTuesday 2024 Week 52"
)

### |- fonts ----
setup_fonts()
fonts <- get_font_families()

### |- base theme ----
base_theme <- create_base_theme(colors)

weekly_theme <- extend_weekly_theme(
  base_theme,
  theme(
    # strip labels
    strip.text = element_text(
      family = fonts$text, size = 11, face = "bold",
      margin = margin(b = 6)
    ),
    # axes
    axis.title = element_text(family = fonts$text, size = 9, color = "gray50"),
    axis.text = element_text(family = fonts$text, size = 8, color = "gray40"),
    axis.ticks = element_blank(),
    # grid — horizontal only, very faint
    panel.grid.major.y = element_line(color = "gray90", linewidth = 0.3),
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank(),
    # panel spacing — more breathing room
    panel.spacing.x = unit(2.0, "lines"),
    panel.spacing.y = unit(1.8, "lines"),
    # plot margins
    plot.margin = margin(t = 20, r = 20, b = 10, l = 20)
  )
)

theme_set(weekly_theme)
```

#### [6. Plot]{.smallcaps}

```{r}
#| label: plot
#| warning: false

### |- helper: strip color per facet ----
strip_labels <- cor_labels |>
  mutate(
    strip_color = if_else(direction == "positive",
                          colors$palette$positive,
                          colors$palette$negative
    ),
    strip_text = glue("{size_category} Market"),
    x_pos = -Inf,
    y_pos = Inf
  )

### |- main scatter plot (p_main) ----
p_main <- ggplot(volatility_df, aes(x = avg_holidays, y = cv)) +
  
  # Geoms
  geom_hline(
    yintercept = median(volatility_df$cv),
    linetype = "dashed",
    color = colors$palette$neutral,
    linewidth  = 0.4,
    alpha = 0.6
  ) +
  geom_point(
    color = colors$palette$point,
    size = 2.8,
    alpha = 0.75,
    shape = 16
  ) +
  geom_smooth(
    aes(color = size_category),
    method = "lm",
    formula = y ~ x,
    linewidth = 1,
    se = TRUE,
    alpha = 0.12
  ) +
  geom_text_repel(
    data = outliers_df,
    aes(label = iso3),
    size = 2.8,
    color = colors$palette$highlight,
    fontface = "bold",
    max.overlaps = 3,
    box.padding = 0.5,
    segment.color = colors$palette$neutral,
    segment.alpha = 0.5,
    segment.size = 0.3,
    seed = 123
  ) +
  geom_text(
    # One-row data frame so the label is drawn once, not once per country
    data = volatility_df |> filter(size_category == "Small") |> distinct(size_category),
    x = Inf, y = median(volatility_df$cv) + 0.025,
    label = "industry median",
    size = 2.5, color = colors$palette$neutral,
    hjust = 1.05, vjust = -0.3, fontface = "italic",
    inherit.aes = FALSE
  ) +
  geom_text(
    data = cor_labels,
    aes(
      x = Inf,
      y = 0.62,
      label = label,
      color = direction
    ),
    size = 3,
    hjust = 1.1,
    vjust = 1,
    fontface = "bold",
    inherit.aes = FALSE
  ) +
  geom_label(
    data = strip_labels,
    aes(
      x = x_pos,
      y = y_pos,
      label = strip_text,
      fill = direction
    ),
    hjust = -0.05,
    vjust = 1.3,
    size = 3.2,
    fontface = "bold",
    color = "white",
    label.size = 0,
    label.padding = unit(0.25, "lines"),
    inherit.aes = FALSE
  ) +
  # Scales
  # Single colour scale covering both mappings: trend lines (size_category)
  # and correlation labels (direction). A second scale_color_manual() call
  # would replace the first, so all values live in one scale.
  scale_color_manual(
    values = c(
      "positive" = colors$palette$positive,
      "negative" = colors$palette$negative,
      "Small" = colors$palette$positive,
      "Medium" = colors$palette$negative,
      "Large" = colors$palette$negative,
      "Very Large" = colors$palette$negative
    )
  ) +
  scale_fill_manual(
    values = c(
      "positive" = colors$palette$positive,
      "negative" = colors$palette$negative
    )
  ) +
  scale_y_continuous(
    breaks = seq(0, 0.75, by = 0.25),
    limits = c(0, 0.75), # CV cannot be negative — floor at 0
    labels = percent_format(accuracy = 1)
  ) +
  scale_x_continuous(
    breaks = seq(2, 8, by = 2),
    expand = expansion(mult = c(0.05, 0.1))
  ) +
  
  # Facet ---
  facet_wrap(
    ~size_category,
    nrow   = 2,
    scales = "free_x"
  ) +
  
  # Labs
  labs(
    x = "Average Number of Holidays per Month",
    y = "Coefficient of Variation in Traffic"
  ) +
  
  # Theme
  theme(
    plot.title = element_text(
      size = rel(2),
      family = fonts$title,
      face = "bold",
      color = colors$title,
      lineheight  = 1.1,
      margin = margin(t = 5, b = 5)
    ),
    plot.subtitle = element_markdown(
      size = rel(0.95),
      family = fonts$text,
      color = colors$subtitle,
      lineheight = 1.4,
      margin = margin(t = 5, b = 15)
    ),
    strip.text = element_blank()
  )

### |- summary panel: r values across market sizes (p_summary) ----

# Standalone panel 
p_summary <- cor_labels |>
  mutate(
    size_category = factor(size_category, levels = c("Small", "Medium", "Large", "Very Large")),
    x_num  = as.numeric(size_category),
    vjust_val = if_else(r > 0, -1.2, 2.0)
  ) |>
  ggplot(aes(x = x_num, y = r, color = direction)) +
  
  # Annotate
  annotate(
    "rect",
    xmin = 0.5, xmax = 4.5,
    ymin = 0,   ymax = 0.55,
    fill  = colors$palette$positive,
    alpha = 0.04
  ) +
  
  # Geoms
  geom_hline(
    yintercept = 0,
    linetype = "dashed",
    color = colors$palette$neutral,
    linewidth  = 0.5
  ) +
  geom_line(color = "#BBBBBB", linewidth = 0.8) +
  geom_point(size = 6) +
  geom_text(
    aes(label = sprintf("%+.2f", r), vjust = vjust_val),
    size = 3.2,
    fontface = "bold"
  ) +
  annotate(
    "text",
    x = 2.5, y = 0.62,
    label = "← positive          negative →",
    size = 2.6,
    color = colors$palette$neutral,
    family = 'sans',
    fontface = "italic",
    hjust = 0.5
  ) +
  
  # Scales
  scale_color_manual(
    values = c(
      "positive" = colors$palette$positive,
      "negative" = colors$palette$negative
    ),
    guide = "none"
  ) +
  scale_x_continuous(
    breaks = 1:4,
    labels = c("Small", "Medium", "Large", "Very\nLarge"),
    expand = expansion(add = 0.5) 
  ) +
  scale_y_continuous(
    limits = c(-0.45, 0.70),
    breaks = seq(-0.4, 0.6, by = 0.2),
    labels = function(x) sprintf("%+.1f", x)
  ) +
  # Labs
  labs(
    x = "Market Size",
    y = "Correlation (r)",
    title = "The Sign Flip"
  ) +
  # Theme
  theme(
    plot.margin = margin(t = 10, r = 18, b = 10, l = 18),
    plot.title = element_text(
      size = rel(1.1),
      family  = fonts$text,
      face = "bold",
      color = colors$title,
      margin = margin(b = 10)
    ),
    axis.text.x = element_text(
      size = rel(0.78), color = "#555555",
      margin = margin(t = 6)
    ),
    axis.text.y = element_text(size = rel(0.78), color = "#555555"),
    axis.title.x = element_text(
      size = rel(0.85), color = "#444444",
      margin = margin(t = 8)
    ),
    axis.title.y = element_text(
      size = rel(0.85), color = "#444444",
      margin = margin(r = 8)
    ),
    panel.grid.major.x = element_blank(),
    panel.grid.major.y = element_line(color = "gray88", linewidth = 0.2),
    panel.grid.minor = element_blank(),
    axis.ticks = element_blank()
  )

# Combined plots
p_right <- p_summary / plot_spacer() +
  plot_layout(heights = c(1, 0.20)) 

# Left = evidence (2x2 scatter), Right = conclusion 
p_final <- 
  (p_main | p_right) +
  plot_layout(widths = c(2.4, 1.1)) +
  plot_annotation(
    title = title_text,
    subtitle = subtitle_text,
    caption = caption_text,
    theme = theme(
      plot.title = element_text(
        size = rel(2),
        family = fonts$title,
        face = "bold",
        color = colors$title,
        lineheight = 1.1,
        margin = margin(t = 5, b = 5)
      ),
      plot.subtitle = element_markdown(
        size = rel(0.95),
        family = 'sans',
        color = colors$subtitle,
        lineheight = 1.4,
        margin = margin(t = 5, b = 15)
      ),
      plot.caption = element_markdown(
        family = fonts$caption,
        size = rel(0.65),
        color = colors$caption,
        lineheight = 1.3,
        hjust = 0,
        margin = margin(t = 15)
      ),
      plot.margin = margin(15, 15, 10, 15)
    )
  )

```
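
The summary panel's labels rest on two small pieces of logic: signed number formatting and sign-dependent label placement. The sketch below reproduces them in base R using the r values reported in the figure description. The data frame name `cor_sketch` is hypothetical, and deriving `direction` from the sign of `r` is an assumption about how `cor_labels` was built.

```{r}
#| label: sketch-signed-labels

# r values as reported in the figure description
cor_sketch <- data.frame(
  size_category = c("Small", "Medium", "Large", "Very Large"),
  r = c(0.48, -0.22, -0.17, -0.07)
)

# Assumption: `direction` is derived from the sign of r
cor_sketch$direction <- ifelse(cor_sketch$r > 0, "positive", "negative")

# "%+.2f" always prints the sign, so positive values get an explicit "+"
cor_sketch$label <- sprintf("%+.2f", cor_sketch$r)

# Negative vjust pushes text above the point; values above 1 push it below
cor_sketch$vjust_val <- ifelse(cor_sketch$r > 0, -1.2, 2.0)

cor_sketch
```

Placing positive labels above and negative labels below keeps each label on the side of its point that faces away from the zero line.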

#### [7. Save]{.smallcaps}

```{r}
#| label: save
#| warning: false

### |-  plot image ----  
save_plot_patchwork(
  p_final, 
  type = "30daychartchallenge", 
  year = 2026, 
  day = 17, 
  width = 12, 
  height = 10
  )
```
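
`save_plot_patchwork()` is a custom helper, so the chunk above will not run without the author's utilities. For readers who want to reproduce the export, here is a rough stand-in built on `ggplot2::ggsave()`. The function name `save_plot_sketch()`, the file-naming pattern, the output directory, and the `dpi` value are all assumptions and may not match the real helper.

```{r}
#| label: sketch-save-helper
#| warning: false

library(ggplot2)

# Hypothetical stand-in for save_plot_patchwork(); the real helper lives in
# the author's utilities and may differ in naming, paths, and defaults.
save_plot_sketch <- function(plot, type, year, day,
                             width = 12, height = 10, dir = tempdir()) {
  # Assumed naming convention: <type>_<year>_<day>.png
  file <- file.path(dir, sprintf("%s_%d_%02d.png", type, year, day))
  ggsave(
    filename = file, plot = plot,
    width = width, height = height, units = "in", dpi = 320
  )
  invisible(file)
}

# Usage with a placeholder plot (swap in p_final in the real workflow)
p_demo <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
out <- save_plot_sketch(p_demo, type = "30daychartchallenge", year = 2026, day = 17)
```

Patchwork objects can be passed to `ggsave()` in the same way as a single ggplot.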

#### [8. Session Info]{.smallcaps}

::: {.callout-tip collapse="true"}
##### Expand for Session Info

```{r, echo = FALSE}
#| eval: true
#| warning: false

sessionInfo()
```
:::

#### [9. GitHub Repository]{.smallcaps} 

::: {.callout-tip collapse="true"}
##### Expand for GitHub Repo

The complete code for this analysis is available in [`30dcc_2026_17.qmd`](https://github.com/poncest/personal-website/blob/master/data_visualizations/TidyTuesday/2026/30dcc_2026_17.qmd).

For the full repository, [click here](https://github.com/poncest/personal-website/).
:::


#### [10. References]{.smallcaps}

::: {.callout-tip collapse="true"}
##### Expand for References

1. **Data Sources:**
   - TidyTuesday. (2024). *Global Holidays and Travel — Week 52* [Dataset]. 
     https://github.com/rfordatascience/tidytuesday/tree/main/data/2024/2024-12-24
   - WorldPop Hub. *Monthly passenger traffic data* [Dataset]. 
     https://www.worldpop.org/
   - Original submission: Ponce, S. (2024). *TidyTuesday 2024 Week 52*. 
     https://github.com/poncest/tidytuesday/tree/main/2024/Week_52

2. **Remake Methodology:**
   - Switched from LOESS to linear smooth (`method = "lm"`) for analytical honesty with sparse panel data
   - Removed size-encoded points to focus on a single visual story: correlation sign change across market size
   - Restructured layout from a single faceted plot to a patchwork composition (evidence left, synthesis right) so the summary panel keeps its own axes instead of sharing the scatter panels' coordinate system
:::
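
The move to a linear smooth pairs naturally with Pearson's r: in a simple regression the least-squares slope equals r scaled by the ratio of standard deviations, so the two always share a sign. The sketch below checks this on made-up numbers (not the TidyTuesday data) using base R only. The column names `avg_holidays` and `cv` are assumptions chosen for illustration.

```{r}
#| label: sketch-sign-flip
#| warning: false

# Toy data for illustration only (NOT the TidyTuesday data)
toy <- data.frame(
  size_category = rep(c("Small", "Large"), each = 5),
  avg_holidays  = rep(2:6, times = 2),
  cv            = c(0.10, 0.14, 0.13, 0.20, 0.22,  # rises with holidays
                    0.30, 0.27, 0.28, 0.22, 0.21)  # falls with holidays
)

# Pearson r and least-squares slope within each market size
fit_by_size <- lapply(split(toy, toy$size_category), function(d) {
  c(
    r     = cor(d$avg_holidays, d$cv),
    slope = unname(coef(lm(cv ~ avg_holidays, data = d))["avg_holidays"])
  )
})

# One row per market size, columns `r` and `slope`
fit_tbl <- do.call(rbind, fit_by_size)
fit_tbl
```

With only a handful of points per group, a flexible smoother such as LOESS can bend toward individual observations, whereas the straight-line fit reports a single direction that matches the sign of r.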


#### [11. Custom Functions Documentation]{.smallcaps}

::: {.callout-note collapse="true"}
##### 📦 Custom Helper Functions

This analysis uses custom functions from my personal module library for efficiency and consistency across projects.

**Functions Used:**

-   **`fonts.R`**: `setup_fonts()`, `get_font_families()` - Font management with showtext
-   **`social_icons.R`**: `create_social_caption()` - Generates formatted social media captions
-   **`image_utils.R`**: `save_plot_patchwork()` - Consistent plot saving with naming conventions
-   **`base_theme.R`**: `create_base_theme()`, `extend_weekly_theme()`, `get_theme_colors()` - Custom ggplot2 themes

**Why custom functions?**\
These utilities standardize theming, fonts, and output across all my data visualizations. The core analysis (data tidying and visualization logic) relies on standard tidyverse packages plus a few ggplot2 extensions (ggtext, patchwork).

**Source Code:**\
View all custom functions → [GitHub: R/utils](https://github.com/poncest/personal-website/tree/master/R)
:::

© 2024 Steven Ponce
