Quick Hit: A Different (Diminutive) Look At Distributions With {ggeconodist}

Despite being a full-on denizen of all things digital I receive a fair number of dead-tree print magazines as there’s nothing quite like seeing an amazing, large, full-color print data-driven visualization up close and personal. I also like supporting data journalism through the subscriptions since without cash we will only have insane, extreme left/right-wing perspectives out there.

One of these publications is The Economist (I’d subscribe to the Financial Times as well for the non-liberal perspective but I don’t need another mortgage payment right now). The graphics folks at The Economist are Top Notch™


and a great source of inspiration to “do better” when cranking out visuals.

After reading a recent issue, one of their visualization styles stuck in my head. Specifically, the one from this story on the costs of a mammogram. I’ve put it below:

Essentially it’s a boxplot with outliers removed along with significantly different aesthetics than the one we’re all used to seeing. I would not use this for exploratory data analysis or working with other data science team members when poking at a problem but I really like the idea of making “distributions” easier to consume for general audiences and believe The Economist graphics folks have done a superb job focusing on the fundamentals (both statistical and aesthetic).

There are ways to hack something like those out manually in {ggplot2} but it would be nice to just be able to swap out something for geom_boxplot() when deciding to go to “production” with a distribution chart. Thus begat {ggeconodist}.

Since this is just a “quick hit” post we’ll avoid some interim blathering to note that we can use {ggplot2} (and a touch of {grid}) to make the following:

with just a tiny bit of R code:

library(ggeconodist) ggplot(mammogram_costs, aes(x = city)) + geom_econodist( aes(ymin = tenth, median = median, ymax = ninetieth), stat = "identity" ) + scale_y_continuous(expand = c(0,0), position = "right", limits = range(0, 800)) + coord_flip() + labs( x = NULL, y = NULL, title = "Mammoscams", subtitle = "United States, prices for a mammogram*\nBy metro area, 2016, $", caption = "*For three large insurance companies\nSource: Health Care Cost Institute" ) + theme_econodist() -> gg grid.newpage()
left_align(gg, c("subtitle", "title", "caption")) %>% add_econodist_legend(econodist_legend_grob(), below = "subtitle") %>% grid.draw()


A future post (and, even — perhaps — a new screen sharing video) will describe what went into making this new geom, stat, and theme (along with some info on how I managed to reproduce data for the vis since none was provided).

In the interim, hit up the CINC page on {ggeconodist} to learn more about the package. You may want to take a quick look at {hrbrthemes} since it might have some helpers for using all the required theming components.

So kick the tyres, file issues/PRs, and be on the lookout for the director’s cut of the making of {ggeconodist}.

To leave a comment for the author, please follow the link and comment on their blog: R – rud.is.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more…

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook

Leave a Comment