How to choose a bivariate color palette?

Bivariate color palettes are the product of combining two separate color palettes. They are usually represented by a square with rows (one color palette) and columns (the second color palette). You can read more about how they are made in the blog post “Bivariate Choropleth Maps: A How-to Guide” by Joshua Stevens.

The main role of bivariate color palettes is to present the values of two variables simultaneously. For example, the map below uses a bivariate palette to represent both GDP per capita and life expectancy for countries in Africa.

The code to create this map is in the tmap issue tracker. Other examples of bivariate maps can be found in the “Bivariate Mapping with ggplot2” vignette and the “Bivariate maps with ggplot2 and sf” blog post.

The above map has one issue, though. As pointed out by Frederico R Ramos, it is not suitable for people with color vision deficiencies, who cannot distinguish between some of its colors and therefore cannot read the map correctly. So the main question is: how do we choose a proper bivariate color palette?

Bivariate palettes

The pals R package has a dozen or so bivariate color palettes.

Twelve of these palettes are presented below.

library(pals)
par(mfrow = c(3, 4), mar = c(1, 1, 2, 1))
bivcol(arc.bluepink)
bivcol(brewer.divdiv)
bivcol(brewer.divseq)
bivcol(brewer.qualseq)
bivcol(brewer.seqseq1)
bivcol(brewer.seqseq2)
bivcol(census.blueyellow)
bivcol(stevens.bluered)
bivcol(stevens.greenblue)
bivcol(stevens.pinkblue)
bivcol(stevens.pinkgreen)
bivcol(stevens.purplegold)

Palettes’ properties

Now, we can use the colorblindcheck package to decide if the selected color palette is colorblind-friendly or not.

The main function in this package is palette_check(), which creates summary statistics comparing the original input palette and simulations of three main color vision deficiencies. Let’s use it on two color palettes: arc.bluepink() and brewer.seqseq2().

colorblindcheck::palette_check(arc.bluepink(), plot = TRUE, bivariate = TRUE)

##           name  n tolerance ncp ndcp  min_dist mean_dist max_dist
## 1       normal 16  7.135562 120  120 7.1355623  27.72463 53.76783
## 2 deuteranopia 16  7.135562 120  100 0.3450842  19.79323 52.46731
## 3   protanopia 16  7.135562 120   96 0.0000000  20.08030 50.20137
## 4   tritanopia 16  7.135562 120  120 7.9914570  31.48801 71.57927

The visual inspection of arc.bluepink() suggests that this palette is not suitable for people with color vision deficiencies, namely deuteranopia and protanopia. In deuteranopia and protanopia simulations, it is almost impossible to distinguish some colors. This problem is also confirmed by the summary statistics, where the minimal distance between colors of the original palette is about 7, while it is only about 0.345 for deuteranopia and 0 (no difference at all) for protanopia.
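To find out which colors collapse into each other, one option is to look at the pairwise distances directly. The sketch below assumes that the palette_dist() function from colorblindcheck accepts cvd = "pro" to simulate protanopia and returns a square matrix of pairwise color distances; please check the package documentation if your version behaves differently.

```r
library(pals)
library(colorblindcheck)

pal <- arc.bluepink()

# Pairwise color distances under simulated protanopia
# (assumption: cvd = "pro" selects protanopia in palette_dist())
d <- palette_dist(pal, cvd = "pro")

# Keep each pair only once and ignore the diagonal
d[lower.tri(d, diag = TRUE)] <- NA

# Positions of the closest pair(s) of colors in the palette
closest <- which(d == min(d, na.rm = TRUE), arr.ind = TRUE)
closest

# Original hex codes of the first such pair
pal[closest[1, ]]
```

The two returned hex codes are the colors that are hardest to tell apart under the simulation, which makes it easier to locate the problematic cells in the palette square.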

colorblindcheck::palette_check(brewer.seqseq2(), plot = TRUE, bivariate = TRUE)

##           name n tolerance ncp ndcp min_dist mean_dist max_dist
## 1       normal 9  13.21133  36   36 13.21133  39.99288 94.59810
## 2 deuteranopia 9  13.21133  36   34 10.99234  40.33172 94.22020
## 3   protanopia 9  13.21133  36   34 10.53062  38.99158 94.59810
## 4   tritanopia 9  13.21133  36   36 13.66888  39.60803 94.48661

On the other hand, the inspection of brewer.seqseq2() indicates that it is possible to differentiate between all of the colors in this palette, both in the original colors and in the simulations of color vision deficiencies: the minimal distance stays above 10 in every row of the summary. You can see more examples of colorblindcheck in action at https://nowosad.github.io/colorblindcheck.

Colorblind-friendly palettes

Using the above function, I tested all of the bivariate color palettes from pals. I visualized all of the palettes and decided to keep only the ones for which the minimal distance between colors was above 6.
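A minimal sketch of the numeric part of this screening is shown below. It assumes that the cutoff of 6 is applied to the smallest min_dist value across the original palette and the three simulations; the names bivariate_palettes, smallest_distance, and kept are made up for this illustration. It does not replace the visual inspection, and its result may differ between package versions.

```r
library(pals)
library(colorblindcheck)

# The twelve bivariate palettes shown earlier, as a named list of functions
bivariate_palettes <- list(
  arc.bluepink       = arc.bluepink,
  brewer.divdiv      = brewer.divdiv,
  brewer.divseq      = brewer.divseq,
  brewer.qualseq     = brewer.qualseq,
  brewer.seqseq1     = brewer.seqseq1,
  brewer.seqseq2     = brewer.seqseq2,
  census.blueyellow  = census.blueyellow,
  stevens.bluered    = stevens.bluered,
  stevens.greenblue  = stevens.greenblue,
  stevens.pinkblue   = stevens.pinkblue,
  stevens.pinkgreen  = stevens.pinkgreen,
  stevens.purplegold = stevens.purplegold
)

# For each palette, take the smallest min_dist over the four rows
# (normal vision, deuteranopia, protanopia, tritanopia)
smallest_distance <- vapply(
  bivariate_palettes,
  function(pal) min(palette_check(pal())$min_dist),
  numeric(1)
)

# Names of the palettes whose smallest distance exceeds the cutoff
kept <- names(smallest_distance)[smallest_distance > 6]
kept
```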

This left four palettes: brewer.divseq, brewer.seqseq2, stevens.greenblue, and stevens.purplegold. You can see the comparison between them and simulations of color vision deficiencies below.

colorblindcheck::palette_check(brewer.divseq(), plot = TRUE, bivariate = TRUE)

##           name n tolerance ncp ndcp min_dist mean_dist max_dist
## 1       normal 9  9.237516  36   36 9.237516  38.32933 87.90123
## 2 deuteranopia 9  9.237516  36   36 9.267188  39.85751 90.88415
## 3   protanopia 9  9.237516  36   36 9.237516  40.79861 86.08385
## 4   tritanopia 9  9.237516  36   35 6.777558  32.82160 83.10774

colorblindcheck::palette_check(brewer.seqseq2(), plot = TRUE, bivariate = TRUE)

##           name n tolerance ncp ndcp min_dist mean_dist max_dist
## 1       normal 9  13.21133  36   36 13.21133  39.99288 94.59810
## 2 deuteranopia 9  13.21133  36   34 10.99234  40.33172 94.22020
## 3   protanopia 9  13.21133  36   34 10.53062  38.99158 94.59810
## 4   tritanopia 9  13.21133  36   36 13.66888  39.60803 94.48661

colorblindcheck::palette_check(stevens.greenblue(), plot = TRUE, bivariate = TRUE)

##           name n tolerance ncp ndcp min_dist mean_dist max_dist
## 1       normal 9   9.29651  36   36 9.296510  26.34666 50.19184
## 2 deuteranopia 9   9.29651  36   33 7.238684  24.60856 51.19105
## 3   protanopia 9   9.29651  36   35 7.693015  24.51814 47.10098
## 4   tritanopia 9   9.29651  36   29 6.154169  20.06474 50.20386

colorblindcheck::palette_check(stevens.purplegold(), plot = TRUE, bivariate = TRUE)

##           name n tolerance ncp ndcp min_dist mean_dist max_dist
## 1       normal 9  11.97625  36   36 11.97625  30.13646 53.56032
## 2 deuteranopia 9  11.97625  36   35 10.57857  27.58839 46.59557
## 3   protanopia 9  11.97625  36   34 11.48625  29.32017 50.36899
## 4   tritanopia 9  11.97625  36   28  6.31650  20.96426 49.27898

Summary

Four palettes from the pals package (brewer.divseq, brewer.seqseq2, stevens.greenblue, and stevens.purplegold) seem to be the most suitable for bivariate visualizations.

All of them are suitable for people with color vision deficiencies. It is important to note that brewer.divseq is made of a sequential (from bottom to top) and a diverging (from left to right) palette. Therefore, its use should be limited to cases where we want to present one variable going from high to low (or vice versa) and one variable that has values around a central neutral point. brewer.seqseq2, stevens.greenblue, and stevens.purplegold, on the other hand, consist of two sequential palettes and, thus, should be used to present two variables with values going from high to low (or vice versa).
