Most exoplanets are still a blur

Over 5,000 worlds confirmed, yet uncertainty in mass and radius blurs the boundary between rocky worlds, sub-Neptunes, and gas giants. Color shows inferred class; error bars reveal how much we still don’t know.

30DayChartChallenge
Data Visualization
R Programming
2026
A log-log scatter plot of over 5,000 confirmed exoplanets plotting mass against radius, with asymmetric measurement error bars wherever both upper and lower uncertainties are reported. Points are colored by inferred compositional class (rocky, sub-Neptune, or gas giant) using radius thresholds of roughly 1.6 Earth radii, near the Fulton gap, and 4.0 Earth radii. The overlapping fog of error bars across these classification boundaries is the central argument: thousands of worlds have been confirmed, yet uncertainty in their fundamental properties still blurs the line between what they are.
Author

Steven Ponce

Published

April 25, 2026

Figure 1: Scatter plot on a dark background titled “Most exoplanets are still a blur.” Log-log axes show planet mass in Earth masses (x) versus planet radius in Earth radii (y) for over 5,000 confirmed exoplanets. Points are colored by inferred compositional class: orange for rocky worlds, blue for sub-Neptunes, and violet for gas giants, following a broad diagonal trend from lower-left to upper-right. Thin error bars extend from each point in both directions, creating a fog of overlapping uncertainty across the chart. Two dashed horizontal lines mark the approximate class boundaries at 1.6 and 4.0 Earth radii, labeled “Boundaries are not discrete.” Earth and Jupiter are marked as reference points. An annotation reads: “Uncertainty spans entire compositional classes for many planets.” The chart argues that despite thousands of confirmed detections, measurement uncertainty in mass and radius often prevents confident classification.

Steps to Create this Graphic

1. Load Packages & Setup

```{r}
#| label: load
#| warning: false
#| message: false      
#| results: "hide"     

## 1. LOAD PACKAGES & SETUP ----
suppressPackageStartupMessages({
pacman::p_load(
  tidyverse, ggtext, showtext, 
  janitor, scales, glue, here
  )
})

### |- figure size ----
camcorder::gg_record(
  dir    = here::here("temp_plots"),
  device = "png",
  width  = 10,
  height = 8,
  units  = "in",
  dpi    = 320
)

# Source utility functions
suppressMessages(source(here::here("R/utils/fonts.R")))
source(here::here("R/utils/social_icons.R"))
source(here::here("R/utils/image_utils.R"))
source(here::here("R/themes/base_theme.R"))
```

2. Read in the Data

```{r}
#| label: read
#| include: true
#| eval: true
#| warning: false

cache_path <- here("data/30DayChartChallenge/2026/exoplanets_pscomppars.csv")

exo_raw <- read_csv(cache_path, show_col_types = FALSE, na = c("", "NA"))

# --- Rebuild cache (uncomment if re-downloading from NASA Exoplanet Archive) ---
# Requires: httr2
# Source: NASA Exoplanet Archive — Planetary Systems Composite Parameters (pscomppars)
# TAP endpoint: https://exoplanetarchive.ipac.caltech.edu/TAP/sync
# Columns used:
#   pl_name         : planet name
#   pl_rade         : radius (Earth radii)
#   pl_radeerr1/2   : upper/lower radius uncertainty; err2 stored as negative by NASA
#   pl_bmasse       : best mass estimate (Earth masses); may be M*sin(i) (pl_msinie) when no true mass is available
#   pl_bmasseerr1/2 : upper/lower mass uncertainty
#   discoverymethod : detection technique
#   disc_year       : year of discovery
#
# query <- paste0(
#   "SELECT pl_name, pl_rade, pl_radeerr1, pl_radeerr2, ",
#   "pl_bmasse, pl_bmasseerr1, pl_bmasseerr2, ",
#   "discoverymethod, disc_year ",
#   "FROM pscomppars ",
#   "WHERE pl_rade IS NOT NULL ",
#   "AND pl_bmasse IS NOT NULL"
# )
# url <- paste0(
#   "https://exoplanetarchive.ipac.caltech.edu/TAP/sync?",
#   "query=", URLencode(query, reserved = TRUE),
#   "&format=csv"
# )
# resp    <- request(url) |> req_timeout(60) |> req_perform()
# exo_raw <- resp_body_string(resp) |> read_csv(show_col_types = FALSE, na = c("", "NA"))
# write_csv(exo_raw, cache_path)
# message("Cached to: ", cache_path)
```
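The commented-out rebuild step assembles an ADQL query and URL-encodes it into a TAP `sync` request. As a minimal sketch of just the URL-building part (no network call), the helper below uses only base R. `build_tap_url()` is a hypothetical name introduced for illustration; it is not part of the original code or of any package.

```r
# Hypothetical helper (not in the original post): build a TAP sync URL
# for the NASA Exoplanet Archive from a vector of column names.
build_tap_url <- function(columns,
                          table = "pscomppars",
                          where = c("pl_rade IS NOT NULL", "pl_bmasse IS NOT NULL"),
                          endpoint = "https://exoplanetarchive.ipac.caltech.edu/TAP/sync") {
  query <- paste0(
    "SELECT ", paste(columns, collapse = ", "),
    " FROM ", table,
    " WHERE ", paste(where, collapse = " AND ")
  )
  # reserved = TRUE percent-encodes spaces, commas, and other reserved characters
  paste0(endpoint, "?query=", utils::URLencode(query, reserved = TRUE), "&format=csv")
}

tap_url <- build_tap_url(c("pl_name", "pl_rade", "pl_bmasse"))
print(tap_url)
```

The resulting string could then be passed to an HTTP client such as httr2, as in the commented block above.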

3. Examine the Data

```{r}
#| label: examine
#| include: true
#| eval: true
#| results: 'hide'
#| warning: false

glimpse(exo_raw)
cat("Detection methods:\n")
print(table(exo_raw$discoverymethod, useNA = "always"))
```
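Before tidying, it can help to check how many rows actually carry both upper and lower uncertainties, since only those rows can get error bars. The following is a minimal base-R sketch on a made-up three-row table; the values are illustrative and are not archive data.

```r
# Toy stand-in for exo_raw; values are invented for illustration only
toy_raw <- data.frame(
  pl_name       = c("toy b", "toy c", "toy d"),
  pl_radeerr1   = c(0.10, NA, 0.30),
  pl_radeerr2   = c(-0.08, NA, -0.25),
  pl_bmasseerr1 = c(1.2, 0.5, NA),
  pl_bmasseerr2 = c(-1.0, -0.4, NA)
)

err_cols <- c("pl_radeerr1", "pl_radeerr2", "pl_bmasseerr1", "pl_bmasseerr2")

# Share of missing values per uncertainty column
na_share <- colMeans(is.na(toy_raw[err_cols]))
print(round(na_share, 2))

# Rows with all four uncertainties present (eligible for error bars)
n_complete <- sum(stats::complete.cases(toy_raw[err_cols]))
print(n_complete)
```

On the real table, the same two calls would run against `exo_raw` with the column names unchanged.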

4. Tidy Data

```{r}
#| label: tidy
#| warning: false

### |- clean, filter, classify ----
exo <- exo_raw |>
  mutate(across(
    c(
      pl_rade, pl_radeerr1, pl_radeerr2,
      pl_bmasse, pl_bmasseerr1, pl_bmasseerr2, disc_year
    ),
    as.numeric
  )) |>
  filter(!is.na(pl_rade), !is.na(pl_bmasse), pl_rade > 0, pl_bmasse > 0) |>
  # Remove extreme outliers (> ~3 Jupiter radii or > ~19 Jupiter masses)
  filter(pl_rade < 35, pl_bmasse < 6000) |>
  mutate(
    # NASA stores err2 as negative (signed lower bound) — take abs() for both
    rad_err_up = abs(pl_radeerr1),
    rad_err_dn = abs(pl_radeerr2),
    mass_err_up = abs(pl_bmasseerr1),
    mass_err_dn = abs(pl_bmasseerr2),
    has_rad_err = !is.na(rad_err_up) & !is.na(rad_err_dn),
    has_mass_err = !is.na(mass_err_up) & !is.na(mass_err_dn),
    # Compositional class using Fulton gap (~1.6 Re) + gas giant threshold (~4 Re)
    # These are inferred boundaries — the uncertainty cloud blurs them
    comp_class = case_when(
      pl_rade <= 1.6 ~ "Rocky",
      pl_rade <= 4.0 ~ "Sub-Neptune",
      TRUE ~ "Gas giant"
    ),
    comp_class = factor(comp_class, levels = c("Rocky", "Sub-Neptune", "Gas giant"))
  )

### |- subset with both error bars ----
exo_err <- exo |>
  filter(has_rad_err, has_mass_err) |>
  # Cap runaway bars on log scale
  mutate(
    rad_err_up  = pmin(rad_err_up, pl_rade * 2.0),
    rad_err_dn  = pmin(rad_err_dn, pl_rade * 0.85),
    mass_err_up = pmin(mass_err_up, pl_bmasse * 5.0),
    mass_err_dn = pmin(mass_err_dn, pl_bmasse * 0.85)
  )
```
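The cap on the lower error matters because a log axis cannot show zero or negative values: if the lower uncertainty is at least as large as the value itself, `value - err_dn` is non-positive and the end of the bar is undefined. A minimal sketch of that logic with invented numbers, using the same 0.85 factor as the tidy step:

```r
# Invented radius values and lower uncertainties (Earth radii)
radius <- c(1.0, 2.5, 10.0)
err_dn <- c(0.2, 3.0, 9.9)   # the second uncertainty exceeds the radius itself

# Uncapped lower bounds: the second is negative, so it has no place on a log axis
lower_raw <- radius - err_dn

# Cap the lower error at 85% of the value, as in the tidy step
err_dn_capped <- pmin(err_dn, radius * 0.85)
lower_capped  <- radius - err_dn_capped

print(data.frame(radius, lower_raw, lower_capped))
```

With this cap, a lower bound never falls below 15% of the plotted value. The trade-off is that capped bars understate the reported uncertainty for the most poorly constrained planets.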

5. Visualization Parameters

```{r}
#| label: params
#| include: true
#| warning: false

### |- plot aesthetics ----
colors <- get_theme_colors(
  palette = list(
    bg           = "#0A0F1A",
    rocky        = "#E8A87C",   
    subneptune   = "#7EB8D4",   
    gasgiant     = "#C4A4D4", 
    grid         = "#161E2E",
    text_main    = "#E8E8E8",
    text_sub     = "#8A94A0",
    annotation   = "#C0C8D0",
    earth_fill   = "#3366AA",
    earth_ring   = "#AACCFF",
    jupiter_fill = "#CC8833",
    jupiter_ring = "#FFD8A0",
    zone_label   = "#3D4A5A"
  )
)

col_bg <- colors$palette$bg
col_rocky <- colors$palette$rocky
col_subneptune <- colors$palette$subneptune
col_gasgiant <- colors$palette$gasgiant
col_grid <- colors$palette$grid
col_text_main <- colors$palette$text_main
col_text_sub <- colors$palette$text_sub
col_annotation <- colors$palette$annotation
col_earth_fill <- colors$palette$earth_fill
col_earth_ring <- colors$palette$earth_ring
col_jup_fill <- colors$palette$jupiter_fill
col_jup_ring <- colors$palette$jupiter_ring
col_zone <- colors$palette$zone_label

comp_colors <- c(
  "Rocky"       = col_rocky,
  "Sub-Neptune" = col_subneptune,
  "Gas giant"   = col_gasgiant
)

### |- titles and caption ----
title_text    <- "Most exoplanets are still a blur"

subtitle_text <- paste0(
  "Over 5,000 worlds confirmed, yet uncertainty in mass and radius ",
  "blurs the boundary between rocky worlds,<br>",
  "sub-Neptunes, and gas giants. ",
  "Color shows inferred class; error bars reveal how much we still don't know."
)

caption_text <- create_dcc_caption(
  dcc_year    = 2026,
  dcc_day     = 25,
  source_text = "NASA Exoplanet Archive — Planetary Systems Composite Parameters (pscomppars)"
)

### |- fonts ----
setup_fonts()
fonts <- get_font_families()

base_theme   <- create_base_theme(colors)

weekly_theme <- extend_weekly_theme(
  base_theme,
  theme(
    plot.background = element_rect(fill = col_bg, color = NA),
    panel.background = element_rect(fill = col_bg, color = NA),
    panel.grid.major = element_line(color = col_grid, linewidth = 0.35),
    panel.grid.minor = element_blank(),
    axis.text = element_text(color = col_text_sub, family = fonts$text, size = 9),
    axis.title = element_text(color = col_text_sub, family = fonts$text, size = 10),
    axis.ticks = element_blank(),
    plot.title = element_markdown(
      color = col_text_main, family = fonts$title,
      size = 22, face = "bold", hjust = 0, margin = margin(b = 6)
    ),
    plot.subtitle = element_markdown(
      color = col_text_sub, family = fonts$text,
      size = 10, lineheight = 1.5, hjust = 0, margin = margin(b = 20)
    ),
    plot.caption = element_markdown(
      color = col_text_sub, family = fonts$text,
      size = 7.5, hjust = 1, margin = margin(t = 14)
    ),
    plot.margin = margin(t = 20, r = 24, b = 12, l = 20),
    legend.position = "bottom",
    legend.background = element_rect(fill = col_bg, color = NA),
    legend.key = element_rect(fill = col_bg, color = NA),
    legend.text = element_text(color = col_text_sub, family = fonts$text, size = 8.5),
    legend.title = element_text(
      color = col_text_sub, family = fonts$text, size = 9, face = "bold"
    ),
    legend.key.width = unit(1.2, "lines"),
    legend.spacing.x = unit(0.4, "cm")
  )
)

theme_set(weekly_theme)
```
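The helper functions sourced from `R/utils` and `R/themes` are not shown on this page. For readers who want to run the palette code in isolation, a minimal stand-in consistent with how `colors$palette$...` is accessed above could look like the sketch below. It is an assumption about the return shape, not the author's implementation, and `get_theme_colors_standin()` is a hypothetical name.

```r
# Hypothetical stand-in, NOT the author's get_theme_colors():
# assumes the helper returns a list with a `palette` element.
get_theme_colors_standin <- function(palette = list()) {
  stopifnot(is.list(palette))
  list(palette = palette)
}

colors <- get_theme_colors_standin(
  palette = list(
    bg         = "#0A0F1A",
    rocky      = "#E8A87C",
    subneptune = "#7EB8D4",
    gasgiant   = "#C4A4D4"
  )
)

# Named vector keyed by factor level, the shape scale_color_manual(values = ...) accepts
comp_colors <- c(
  "Rocky"       = colors$palette$rocky,
  "Sub-Neptune" = colors$palette$subneptune,
  "Gas giant"   = colors$palette$gasgiant
)
print(comp_colors)
```

Naming the vector by factor level ties each color to its class explicitly, so the mapping does not depend on the order of the levels.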

6. Plot

```{r}
#| label: plot
#| warning: false

### |- main plot ----
p <- ggplot() +

  # Geoms
  geom_errorbar(
    data = exo_err,
    aes(
      x = pl_bmasse,
      ymin = pl_rade - rad_err_dn,
      ymax = pl_rade + rad_err_up,
      color = comp_class
    ),
    linewidth = 0.15, alpha = 0.12, width = 0
  ) +
  geom_errorbarh(
    data = exo_err,
    aes(
      y = pl_rade,
      xmin = pl_bmasse - mass_err_dn,
      xmax = pl_bmasse + mass_err_up,
      color = comp_class
    ),
    linewidth = 0.15, alpha = 0.12, height = 0
  ) +
  geom_point(
    data = exo,
    aes(x = pl_bmasse, y = pl_rade, color = comp_class),
    size = 0.85, alpha = 0.55, shape = 16
  ) +

  # --- Compositional boundary lines (subtle dashed) ---
  # Fulton gap: ~1.6 Re separates rocky from sub-Neptune
  geom_hline(yintercept = 1.6, color = col_zone, linewidth = 0.3, linetype = "dashed") +
  # Gas giant threshold: ~4 Re
  geom_hline(yintercept = 4.0, color = col_zone, linewidth = 0.3, linetype = "dashed") +

  # Annotate
  annotate("text",
    x = 0.38, y = 1.1,
    label = "Rocky", color = col_rocky, alpha = 0.55,
    family = fonts$text, size = 2.8, hjust = 0, fontface = "italic"
  ) +
  annotate("text",
    x = 0.38, y = 2.3,
    label = "Sub-Neptune", color = col_subneptune, alpha = 0.55,
    family = fonts$text, size = 2.8, hjust = 0, fontface = "italic"
  ) +
  annotate("text",
    x = 0.38, y = 7.0,
    label = "Gas giant", color = col_gasgiant, alpha = 0.55,
    family = fonts$text, size = 2.8, hjust = 0, fontface = "italic"
  ) +
  annotate("text",
    x = 10, y = 0.42,
    label = "Uncertainty spans entire\ncompositional classes\nfor many planets",
    color = col_annotation, family = fonts$text,
    size = 2.9, hjust = 0, lineheight = 1.3
  ) +
  annotate("segment",
    x = 20, xend = 25, y = 0.58, yend = 1.05,
    color = col_annotation, linewidth = 0.35, alpha = 0.6,
    arrow = arrow(length = unit(0.12, "cm"), type = "open")
  ) +
  annotate("point",
    x = 1, y = 1,
    color = col_earth_ring, size = 3.2,
    shape = 21, fill = col_earth_fill, stroke = 0.9
  ) +
  annotate("text",
    x = 1.5, y = 1.03,
    label = "Earth", color = col_earth_ring,
    family = fonts$text, size = 2.7, hjust = 0
  ) +
  annotate("text",
    x = 1.5, y = 0.88,
    label = "Well-constrained benchmark",
    color = col_text_sub, family = fonts$text,
    size = 2.3, hjust = 0, fontface = "italic"
  ) +
  annotate("point",
    x = 317.8, y = 11.2,
    color = col_jup_ring, size = 3.8,
    shape = 21, fill = col_jup_fill, stroke = 0.9
  ) +
  annotate("text",
    x = 430, y = 11.5,
    label = "Jupiter", color = col_jup_ring,
    family = fonts$text, size = 2.7, hjust = 0
  ) +
  annotate("text",
    x = 430, y = 9.5,
    label = "Even well-studied giants\nshow wide measurement ranges",
    color = col_text_sub, family = fonts$text,
    size = 2.3, hjust = 0, fontface = "italic", lineheight = 1.25
  ) +
  annotate("text",
    x = 3200, y = 1.72,
    label = "Boundaries are not discrete",
    color = col_zone, family = fonts$text,
    size = 2.2, hjust = 1, fontface = "italic"
  ) +

  # Scales
  scale_x_log10(
    name   = "Planet mass (Earth masses)",
    breaks = c(0.5, 1, 3, 10, 30, 100, 300, 1000, 3000),
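    # accuracy = 1 would round the 0.5 break to "0"; keep one decimal and drop trailing zeros
    labels = label_number(accuracy = 0.1, big.mark = ",", drop0trailing = TRUE),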
    expand = expansion(mult = c(0.04, 0.06))
  ) +
  scale_y_log10(
    name   = "Planet radius (Earth radii)",
    breaks = c(0.5, 1, 2, 4, 8, 15, 25),
    labels = label_number(accuracy = 0.1),
    expand = expansion(mult = c(0.04, 0.06))
  ) +
  scale_color_manual(
    name = "Inferred composition",
    values = comp_colors,
    guide = guide_legend(
      nrow         = 1,
      override.aes = list(size = 3.5, alpha = 0.9)
    )
  ) +
  coord_cartesian(clip = "off") +

  # Labs
  labs(
    title    = title_text,
    subtitle = subtitle_text,
    caption  = caption_text
  )
```
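The annotation states that uncertainty spans class boundaries for many planets. One way to check such a claim, sketched here on invented values rather than the archive data, is to classify the lower and upper ends of each radius interval and flag planets whose two ends land in different classes. `classify_radius()` is a hypothetical helper that mirrors the 1.6 and 4.0 Earth-radius thresholds from the tidy step.

```r
# Hypothetical helper mirroring the thresholds in the tidy step
classify_radius <- function(r) {
  cut(r,
      breaks = c(-Inf, 1.6, 4.0, Inf),
      labels = c("Rocky", "Sub-Neptune", "Gas giant"),
      right  = TRUE)   # intervals closed on the right, matching `<=`
}

# Invented example planets (Earth radii)
toy <- data.frame(
  radius = c(1.0, 1.5, 3.8, 11.0),
  err_dn = c(0.1, 0.3, 0.5, 1.0),
  err_up = c(0.1, 0.4, 0.6, 1.0)
)

toy$class_lo  <- classify_radius(toy$radius - toy$err_dn)
toy$class_hi  <- classify_radius(toy$radius + toy$err_up)
toy$ambiguous <- toy$class_lo != toy$class_hi

print(toy)
print(mean(toy$ambiguous))  # share of planets whose interval crosses a boundary
```

On the real data, the same idea would apply to `exo_err` using `pl_rade`, `rad_err_dn`, and `rad_err_up`, ideally with the uncapped uncertainties so the share is not affected by the plotting cap.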

7. Save

```{r}
#| label: save
#| warning: false

### |-  plot image ----  
save_plot(
  p,
  type = "30daychartchallenge",
  year = 2026,
  day = 25,
  width = 10,
  height = 8
  )
```

8. Session Info

Expand for Session Info
R version 4.5.3 (2026-03-11 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] here_1.0.2      glue_1.8.0      scales_1.4.0    janitor_2.2.1  
 [5] showtext_0.9-8  showtextdb_3.0  sysfonts_0.8.9  ggtext_0.1.2   
 [9] lubridate_1.9.5 forcats_1.0.1   stringr_1.6.0   dplyr_1.2.1    
[13] purrr_1.2.2     readr_2.2.0     tidyr_1.3.2     tibble_3.3.1   
[17] ggplot2_4.0.2   tidyverse_2.0.0

loaded via a namespace (and not attached):
 [1] gtable_0.3.6       xfun_0.57          htmlwidgets_1.6.4  tzdb_0.5.0        
 [5] vctrs_0.7.3        tools_4.5.3        generics_0.1.4     curl_7.0.0        
 [9] parallel_4.5.3     gifski_1.32.0-2    pacman_0.5.1       pkgconfig_2.0.3   
[13] RColorBrewer_1.1-3 S7_0.2.1           lifecycle_1.0.5    compiler_4.5.3    
[17] farver_2.1.2       textshaping_1.0.5  codetools_0.2-20   snakecase_0.11.1  
[21] litedown_0.9       htmltools_0.5.9    yaml_2.3.12        pillar_1.11.1     
[25] crayon_1.5.3       camcorder_0.1.0    magick_2.9.1       commonmark_2.0.0  
[29] tidyselect_1.2.1   digest_0.6.39      stringi_1.8.7      rsvg_2.7.0        
[33] rprojroot_2.1.1    fastmap_1.2.0      grid_4.5.3         cli_3.6.6         
[37] magrittr_2.0.5     withr_3.0.2        bit64_4.6.0-1      timechange_0.4.0  
[41] rmarkdown_2.31     bit_4.6.0          otel_0.2.0         ragg_1.5.2        
[45] hms_1.1.4          evaluate_1.0.5     knitr_1.51         markdown_2.0      
[49] rlang_1.2.0        gridtext_0.1.6     Rcpp_1.1.1         xml2_1.5.2        
[53] svglite_2.2.2      rstudioapi_0.18.0  vroom_1.7.1        jsonlite_2.0.0    
[57] R6_2.6.1           systemfonts_1.3.2 

9. GitHub Repository

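Expand for GitHub Repo

The complete code for this analysis is available in 30dcc_2026_25.qmd (https://github.com/poncest/personal-website/blob/master/data_visualizations/TidyTuesday/2026/30dcc_2026_25.qmd).

For the full repository, see https://github.com/poncest/personal-website/.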

10. References

Expand for References
  1. Data Sources:
    • NASA Exoplanet Archive — Planetary Systems Composite Parameters (pscomppars). California Institute of Technology, on behalf of NASA. Retrieved from: https://exoplanetarchive.ipac.caltech.edu/TAP/sync
  2. Chart Inspiration:
    • Fulton, B. J., et al. (2017). The California-Kepler Survey III: A Gap in the Radius Distribution of Small Planets. The Astronomical Journal, 154(3), 109. https://doi.org/10.3847/1538-3881/aa80eb

11. Custom Functions Documentation

📦 Custom Helper Functions

This analysis uses custom functions from my personal module library for efficiency and consistency across projects.

Functions Used:

  • fonts.R: setup_fonts(), get_font_families() - Font management with showtext
  • social_icons.R: create_social_caption() - Generates formatted social media captions
  • image_utils.R: save_plot() - Consistent plot saving with naming conventions
  • base_theme.R: create_base_theme(), extend_weekly_theme(), get_theme_colors() - Custom ggplot2 themes

Why custom functions?
These utilities standardize theming, fonts, and output across all my data visualizations. The core analysis (data tidying and visualization logic) uses only standard tidyverse packages.

Source Code:
View all custom functions → GitHub: R/utils (https://github.com/poncest/personal-website/tree/master/R)


Citation

BibTeX citation:
@online{ponce2026,
  author = {Ponce, Steven},
  title = {Most Exoplanets Are Still a Blur},
  date = {2026-04-25},
  url = {https://stevenponce.netlify.app/data_visualizations/30DayChartChallenge/2026/30dcc_2026_25.html},
  langid = {en}
}
For attribution, please cite this work as:
Ponce, Steven. 2026. “Most Exoplanets Are Still a Blur.” April 25, 2026. https://stevenponce.netlify.app/data_visualizations/30DayChartChallenge/2026/30dcc_2026_25.html.

© 2024 Steven Ponce
