# Color Theory Basics For Data Visualization

## The Color Wheel

The visible spectrum is commonly divided into six main colors: red, orange, yellow, green, blue, and violet. The color wheel arranges the primary and secondary colors (with or without tertiary colors) around a circle. This arrangement illustrates the relationships between them: each secondary color sits between the two primary colors that combine to make it, and so on.

## Web Color Models

RGB (Red, Green, Blue)

RGB is an additive color model used for screens, in which red, green, and blue are the primary colors. A wide range of colors can be created by mixing red, green, and blue light in different proportions. Each channel is expressed as a number from 0, for none of that color, to 255, for full intensity of that color. Black is rgb(0,0,0) and white is rgb(255,255,255).

HEX is a different way to express an RGB color, most often used for the web. HEX codes begin with a hash sign (#), followed by six hexadecimal digits (0-9 and A-F): two each for red, green, and blue. Black is #000000 and white is #FFFFFF.
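To make the link between the two notations concrete, here is a minimal Python sketch that converts between them. The function names `rgb_to_hex` and `hex_to_rgb` are illustrative, not part of any particular library, and the sketch assumes the six-digit `#RRGGBB` form.

```python
def rgb_to_hex(r, g, b):
    """Convert 0-255 RGB channel values to a #RRGGBB string."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("each channel must be between 0 and 255")
    # Each channel becomes two uppercase hexadecimal digits.
    return "#{:02X}{:02X}{:02X}".format(r, g, b)


def hex_to_rgb(code):
    """Convert a #RRGGBB string to an (r, g, b) tuple of 0-255 integers."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits")
    # Read the digits in pairs: red, then green, then blue.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(rgb_to_hex(255, 255, 255))  # #FFFFFF
print(hex_to_rgb("#000000"))      # (0, 0, 0)
```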

Both RGB and HEX codes may contain a fourth component, called alpha, which encodes the degree of transparency or opacity.
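As a rough illustration of the alpha component, the sketch below reads an eight-digit HEX code. It assumes the CSS-style `#RRGGBBAA` layout, where the last two digits hold alpha, and reports alpha on a 0 to 1 scale (0 fully transparent, 1 fully opaque). The function name `hex_to_rgba` is hypothetical.

```python
def hex_to_rgba(code):
    """Convert #RRGGBB or #RRGGBBAA to (r, g, b, alpha), alpha in 0-1."""
    digits = code.lstrip("#")
    if len(digits) == 6:
        digits += "FF"  # no alpha given: treat the color as fully opaque
    if len(digits) != 8:
        raise ValueError("expected six or eight hexadecimal digits")
    r, g, b, a = (int(digits[i:i + 2], 16) for i in (0, 2, 4, 6))
    # Alpha is stored as 0-255 in the code; rescale it to 0-1.
    return r, g, b, round(a / 255, 2)


print(hex_to_rgba("#FF000080"))  # (255, 0, 0, 0.5)
```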

## Color Palettes

Determining the color palette for your data requires you to think about what you are trying to show.

### Categorical Data

#### Qualitative Palettes

For categorical data, the goal is to use a series of distinctive colors, spread around the color wheel. These are called qualitative palettes.
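One simple way to spread colors around the wheel is to space hues evenly while holding lightness and saturation fixed. The sketch below does this with Python's standard `colorsys` module; the function name `qualitative_palette` and the default lightness and saturation values are assumptions chosen for illustration, and a curated palette will often be easier to tell apart than evenly spaced hues.

```python
import colorsys


def qualitative_palette(n, lightness=0.5, saturation=0.65):
    """Return n HEX colors with hues evenly spaced around the color wheel."""
    colors = []
    for i in range(n):
        hue = i / n  # colorsys expects hue as a fraction of a full turn (0-1)
        r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
        # colorsys returns floats in 0-1; scale them to 0-255 integers.
        colors.append("#{:02X}{:02X}{:02X}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors


for color in qualitative_palette(6):
    print(color)
```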

### Continuous Data

#### Sequential Palettes

When using color to encode continuous data, it usually makes sense to use increasing intensity or saturation of color to indicate larger values. These are called sequential color palettes.
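A sequential palette can be sketched as a ramp from a light color to a darker, more intense one. The example below interpolates linearly between two RGB colors; the function name `sequential_palette` and the example colors are assumptions for illustration. Straight-line interpolation in RGB is not perceptually even, so a ready-made palette may give better results in practice.

```python
def sequential_palette(light, dark, steps):
    """Return `steps` RGB tuples running from `light` to `dark`."""
    if steps < 2:
        raise ValueError("a ramp needs at least two steps")
    palette = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at the light end, 1.0 at the dark end
        palette.append(tuple(
            round(lo + (hi - lo) * t) for lo, hi in zip(light, dark)))
    return palette


# Larger data values would map to colors later in the list.
for rgb in sequential_palette((250, 250, 250), (0, 50, 100), 5):
    print(rgb)
```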

#### Diverging Palettes

In some circumstances, you may have continuous data that has positive and negative values, or which highlights deviation from a central value. Here, you should use a diverging color palette, which will have two colors reasonably well separated on the color wheel as its end points.

Avoid The Muddy Middle

The classic mistake when setting a diverging color palette is to make the central point a mix between the two colors at each end, which results in a “muddy middle” where the colors are hard to distinguish. Instead, a diverging palette should look like two sequential palettes stuck end-to-end, passing through a neutral color such as gray or white in the middle.
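The difference between the two approaches can be sketched in a few lines of Python. Here `diverging_palette` joins two ramps at a neutral midpoint, while a single direct ramp between the end colors produces the blended middle described above. The function names, the blue and red end colors, and the light gray midpoint are all assumptions chosen for illustration.

```python
def ramp(start, end, steps):
    """Linearly interpolate between two RGB tuples, endpoints included."""
    return [
        tuple(round(a + (b - a) * i / (steps - 1)) for a, b in zip(start, end))
        for i in range(steps)
    ]


def diverging_palette(low, high, neutral=(245, 245, 245), steps_per_side=4):
    """Join two sequential ramps that meet at a neutral middle color."""
    left = ramp(low, neutral, steps_per_side + 1)
    right = ramp(neutral, high, steps_per_side + 1)
    return left + right[1:]  # skip the second copy of the neutral color


BLUE = (33, 102, 172)  # example end color for low values
RED = (178, 24, 43)    # example end color for high values

palette = diverging_palette(BLUE, RED)
muddy = ramp(BLUE, RED, len(palette))  # direct blend, for comparison

middle = len(palette) // 2
print("neutral middle:", palette[middle])
print("muddy middle:  ", muddy[middle])
```

Printing the two middle values side by side shows the contrast: the joined ramps pass through the neutral gray, while the direct blend lands on a mix of the two end colors.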

Here’s an example, showing votes in the 2016 U.S. presidential election by county. Purple is deliberately used here to emphasize the broad middle of “Purple America.” Usually, though, a map like this should clearly distinguish whether a county leans Democratic or Republican, and a diverging palette with a neutral middle color would be much more effective for that.

## Color Resources

Choosing the right colors for a visualization can be daunting. Luckily, there are resources to help in the selection process.

ColorBrewer is a diagnostic tool to test color schemes, focused on map design.

https://colorbrewer2.org/

Viz Palette is a tool designed to test the viability of colors in visualizations.

https://projects.susielu.com/viz-palette