---
title: "More Holidays Associated with Lower Air Traffic Volatility in Larger Markets"
subtitle: "Higher holiday frequency correlates with reduced traffic volatility, especially in larger markets. Lower CV values indicate more stable traffic patterns"
description: "Analysis of global holiday patterns and air traffic volatility reveals that larger markets with more frequent holidays tend to experience more stable passenger traffic patterns. Small markets show positive correlation (r=0.48) while larger markets show progressively decreasing negative correlations, suggesting holidays may help stabilize air traffic in developed markets."
author: "Steven Ponce"
date: "2024-12-09"
categories: [TidyTuesday, Data Visualization, R Programming]
tags: [ggplot2, air-travel, holidays, time-series, correlation-analysis, market-analysis, traffic-patterns, data-visualization, tidyverse, aviation]
image: "thumbnails/tt_2024_52.png"
format:
html:
toc: true
toc-depth: 5
code-link: true
code-fold: true
code-tools: true
code-summary: "Show code"
self-contained: true
theme:
light: [flatly, assets/styling/custom_styles.scss]
dark: [darkly, assets/styling/custom_styles_dark.scss]
editor_options:
chunk_output_type: inline
execute:
freeze: true
cache: true
error: false
message: false
warning: false
eval: true
filters:
- social-share
share:
permalink: "https://stevenponce.netlify.app/data_visualizations/TidyTuesday/2024/tt_2024_52.html"
description: "Analysis shows larger markets with more holidays have more stable air traffic patterns suggesting holidays help regulate passenger flow in developed markets"
twitter: true
linkedin: true
email: true
facebook: false
reddit: false
stumble: false
tumblr: false
mastodon: true
bsky: true
---
![Scatter plot panels show the relationship between average monthly holidays and air traffic volatility (CV) across different market sizes (Small, Medium, Large, Very Large). Larger markets show lower volatility (CV) with more holidays. Data points are sized by annual passenger traffic, with industry median volatility and trend lines highlighted. Source: WorldPop Hub.](tt_2024_52.png) {#fig-1}
### <mark> **Steps to Create this Graphic** </mark>
#### 1. Load Packages & Setup
```{r}
#| label: load
#| warning: false
#| message: false
#| results: "hide"
## 1. LOAD PACKAGES & SETUP ----
suppressPackageStartupMessages ({
pacman:: p_load (
tidyverse, # Easily Install and Load the 'Tidyverse'
ggtext, # Improved Text Rendering Support for 'ggplot2'
showtext, # Using Fonts More Easily in R Graphs
janitor, # Simple Tools for Examining and Cleaning Dirty Data
skimr, # Compact and Flexible Summaries of Data
scales, # Scale Functions for Visualization
glue, # Interpreted String Literals
here # A Simpler Way to Find Your Files
)
})
# Source utility functions
suppressMessages (source (here:: here ("R/utils/fonts.R" )))
source (here:: here ("R/utils/social_icons.R" ))
source (here:: here ("R/utils/image_utils.R" ))
source (here:: here ("R/themes/base_theme.R" ))
```
#### 2. Read in the Data
```{r}
#| label: read
#| include: true
#| eval: true
#| warning: false
# tt <- tidytuesdayR::tt_load(2024, week = 50)
#
# parfumo_data_raw <- tt$parfumo_data |> clean_names()
#
# tidytuesdayR::readme(tt)
# rm(tt)
# Option 2: Read directly from GitHub
global_holidays_raw <- readr:: read_csv (
'https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-12-24/global_holidays.csv' ) |>
clean_names ()
monthly_passengers_raw <- readr:: read_csv (
'https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-12-24/monthly_passengers.csv' ) |>
clean_names ()
```
#### 3. Examine the Data
```{r}
#| label: examine
#| include: true
#| eval: true
#| results: 'hide'
#| warning: false
glimpse (global_holidays_raw)
skim (global_holidays_raw)
glimpse (monthly_passengers_raw)
skim (monthly_passengers_raw)
```
#### 4. Tidy Data
```{r}
#| label: tidy
#| warning: false
### |- tidy data ----
## Clean and join the datasets ----
monthly_passengers_clean <- monthly_passengers_raw |>
mutate (
date = ymd (paste (year, month, "01" , sep = "-" )),
total_passengers = coalesce (total, total_os) # Use total_os when total is NA
)
monthly_holidays_clean <- global_holidays_raw |>
mutate (
year = year (date),
month = month (date)
) |>
group_by (iso3, year, month) |>
summarise (
holiday_count = n (),
public_holidays = sum (type == "Public holiday" ),
.groups = "drop"
)
combined_data <- monthly_passengers_clean |>
left_join (monthly_holidays_clean, by = c ("iso3" , "year" , "month" ))
# Housekeeping
rm (global_holidays_raw, monthly_passengers_raw, monthly_holidays_clean, monthly_passengers_clean)
## data plot ---
volatility_df <- combined_data |>
# Calculate summary statistics by country
group_by (iso3) |>
summarise (
mean_traffic = mean (total_passengers, na.rm = TRUE ),
sd_traffic = sd (total_passengers, na.rm = TRUE ),
cv = sd_traffic / mean_traffic,
avg_holidays = mean (holiday_count, na.rm = TRUE ),
total_observations = n (),
traffic_size = sum (total_passengers, na.rm = TRUE ),
.groups = "drop"
) |>
# Remove NA, infinite, or outlier values
filter (
complete.cases (cv, avg_holidays),
total_observations >= 12 , # At least one year of data
cv >= 0 , # Ensure no negative coefficients of variation
cv <= quantile (cv, 0.95 , na.rm = TRUE ) # Remove extreme outliers
) |>
# Add size categories for visualization
mutate (
size_category = cut (
traffic_size,
breaks = quantile (traffic_size, probs = seq (0 , 1 , 0.25 ), na.rm = TRUE ),
labels = c ("Small" , "Medium" , "Large" , "Very Large" ),
include.lowest = TRUE
)
)
```
#### 5. Visualization Parameters
```{r}
#| label: params
#| include: true
#| warning: false
### |- plot aesthetics ----
# Get base colors with custom palette
colors <- get_theme_colors (palette = c ("#4B79B7" , "#F8F9FA" , "#2C3E50" , "#34495E" , "#7F8C8D" ))
### |- titles and caption ----
title_text <- str_glue ("More Holidays Associated with Lower Air Traffic Volatility \n in Larger Markets" )
subtitle_text <- str_glue ("Higher holiday frequency correlates with reduced traffic volatility, especially in larger markets<br>
Lower CV values indicate more stable traffic patterns<br><br>
**Coefficient of Variation in Traffic**" )
# Create caption
caption_text <- create_social_caption (
tt_year = 2024 ,
tt_week = 52 ,
source_text = "WorldPop Hub"
)
### |- fonts ----
setup_fonts ()
fonts <- get_font_families ()
### |- plot theme ----
# Start with base theme
base_theme <- create_base_theme (colors)
# Add weekly-specific theme elements
weekly_theme <- extend_weekly_theme (
base_theme,
theme (
# Weekly-specific modifications
axis.line.x = element_line (color = "#252525" , linewidth = .2 ),
panel.spacing.x = unit (2 , 'lines' ),
panel.spacing.y = unit (1 , 'lines' ),
panel.grid.major.x = element_blank (),
panel.grid.major.y = element_line (color = alpha (colors$ palette[5 ], 0.2 ), linewidth = 0.2 ),
panel.grid.minor = element_blank (),
strip.text = element_textbox (
size = rel (0.9 ),
face = 'bold' ,
color = colors$ palette[3 ],
fill = alpha (colors$ palette[1 ], 0.1 ),
box.color = alpha (colors$ palette[1 ], 0.5 ),
halign = 0.5 ,
linetype = 1 ,
r = unit (3 , "pt" ),
width = unit (1 , "npc" ),
padding = margin (5 , 10 , 5 , 10 ),
margin = margin (b = 10 )
),
legend.margin = margin (- 25 , 5 , 0 , 0 ), # align the legend with the y-axis label
legend.position = "top" ,
legend.title = element_text (size = rel (0.7 )),
legend.text = element_text (size = rel (0.6 )),
legend.justification.top = "right" ,
)
)
# Set theme
theme_set (weekly_theme)
```
#### 6. Plot
```{r}
#| label: plot
#| warning: false
### |- Plot ----
p <- ggplot (volatility_df, aes (x = avg_holidays, y = cv)) +
# Add reference line for overall median
geom_hline (
yintercept = median (volatility_df$ cv),
linetype = "dashed" ,
color = "gray50" ,
alpha = 0.3
) +
# Add points
geom_point (
aes (
size = traffic_size,
alpha = cv # Vary transparency by CV
),
color = colors$ palette[1 ]
) +
# Add trend line
geom_smooth (
color = colors$ palette[3 ],
method = "loess" ,
linewidth = 1 ,
se = TRUE
) +
# Add labels for extreme points (cv)
ggrepel:: geom_text_repel (
data = volatility_df |>
group_by (size_category) |>
filter (cv == max (cv) | cv == min (cv)),
aes (label = iso3),
size = 3 ,
color = colors$ palette[4 ],
max.overlaps = 2 ,
box.padding = 0.5 ,
segment.color = colors$ palette[5 ],
segment.alpha = 0.5
) +
# Add single annotation for the median line
geom_text (
data = volatility_df |> filter (size_category == "Small" ),
x = 9 ,
y = median (volatility_df$ cv) + 0.02 ,
label = "Industry median volatility" ,
size = 3 ,
color = "gray50" ,
hjust = 1 ,
vjust = - 0.5
) +
# Add correlation annotation in each facet
geom_text (
data = volatility_df |>
group_by (size_category) |>
summarise (
cor = cor (avg_holidays, cv),
.groups = "drop"
),
aes (x = 8 , y = 0.65 ,
label = sprintf ("r = %.2f" , cor)),
size = 3 ,
hjust = 1
) +
# Scales
scale_y_continuous (
breaks = seq (0 , 1 , by = .25 ),
limits = c (- .25 , .75 ),
labels = percent_format ()
) +
scale_x_continuous (
breaks = seq (2 , 8 , by = 2 ),
limits = c (1 , 9 ),
expand = expansion (mult = c (0.02 , 0.08 ))
) +
scale_size_continuous (
range = c (2 , 8 ),
labels = scales:: label_number (scale = 1e-6 , suffix = "M" )
) +
scale_alpha_continuous (
range = c (0.4 , 0.8 ),
guide = "none"
) +
# Labs
labs (
x = "Average Number of Holidays per Month" ,
y = NULL ,
size = "Annual Passenger Traffic (M)" ,
color = "Market Size" ,
title = title_text,
subtitle = subtitle_text,
caption = caption_text
) +
# Facets
facet_wrap (
~ size_category,
labeller = as_labeller (function (x) paste (x, "Market" )),
scales = "fixed"
) +
# Theme
theme (
plot.title = element_text (
size = rel (2 ),
family = fonts$ title,
face = "bold" ,
color = colors$ title,
lineheight = 1.1 ,
margin = margin (t = 5 , b = 5 )
),
plot.subtitle = element_markdown (
size = rel (1 ),
family = fonts$ subtitle,
color = colors$ subtitle,
lineheight = 1.1 ,
margin = margin (t = 5 , b = 5 )
),
plot.caption = element_markdown (
family = fonts$ caption,
size = rel (0.65 ),
color = colors$ caption,
hjust = 0.5 ,
margin = margin (t = 10 )
)
)
```
#### 7. Save
```{r}
#| label: save
#| warning: false
### |- plot image ----
save_plot (p, type = "tidytuesday" ,
year = 2024 , week = 52 , width = 10 , height = 10 )
```
#### 8. Session Info
::: {.callout-tip collapse="true"}
##### Expand for Session Info
```{r, echo = FALSE}
#| eval: true
#| warning: false
sessionInfo()
```
:::
#### 9. GitHub Repository
::: {.callout-tip collapse="true"}
##### Expand for GitHub Repo
The complete code for this analysis is available in [ `tt_2024_52.qmd` ](https://github.com/poncest/personal-website/blob/master/data_visualizations/TidyTuesday/2024/tt_2024_52.qmd) .
For the full repository, [ click here ](https://github.com/poncest/personal-website/) .
:::