Intro to cleaning data


Understanding how to clean data is an important skill every reporter needs. Demographic, financial and other data is available at the city, county, state and national levels in the United States.

But understanding how to take a large data file and distill it into a usable form can be daunting.

In this tutorial, you'll learn how spreadsheets work, a basic data-cleaning workflow and how to use formulas and functions to clean data. This is a general tutorial that doesn't delve deeply into any one program. We'll use Microsoft Excel, but most of the same techniques work in Google Spreadsheets and other programs.

I will expand on this tutorial as time permits. If there is something you would like to see included, please leave a suggestion in the comments.

This tutorial does not cover analysis. That can be done more effectively in Google Refine. A tutorial on that is in the works, but until it's available, check out this tutorial from ProPublica.

Filed under: Reporting, Data Visualization