I've got some data. – Now what?






Slides from my talk to the 2015 FOIA Fest on what to think about when starting a data project.

On GitHub: chagan/foia-fest-data

I've got some data.

Now what?

Chris Hagan | Web Producer, Data | WBEZ
@chrishagan | chagan@wbez.org
Slides at chagan.github.io/foia-fest-data
FOIA examples at github.com/chagan/foia-fest-data

You may think your data look like this

But really, it's this

What does that mean?

  • Come with questions
  • Know their bias
    • Who collected this? For what?
    • How sure are they?
  • Can't rely only on the data

More likely, your data are this

Quick bath

  • Take a min, max, sum and average
  • Sort and scan
  • Missing values
  • Change things around
    • Text to columns
    • Convert to numbers, dates, text as needed
    • Pivot tables
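
For anyone working outside a spreadsheet, here is a minimal sketch of the same quick bath in Python, using only the standard library. The sample data, the `amount` column and the `quick_bath` helper are all made up for illustration; it covers the summary numbers and the missing-value check, not pivot tables.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for a real records response.
SAMPLE_CSV = """name,amount
Alpha,100
Bravo,
Charlie,250
Delta,n/a
Echo,50
"""

def quick_bath(rows, column):
    """Summarize one column and list the rows whose value is missing or not a number."""
    values = []
    problems = []
    for line_number, row in enumerate(rows, start=2):  # line 1 is the header
        raw = (row.get(column) or "").strip()
        try:
            # Convert text to a number, dropping thousands separators first.
            values.append(float(raw.replace(",", "")))
        except ValueError:
            problems.append((line_number, raw))
    if not values:
        return None, problems
    summary = {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "mean": statistics.mean(values),
    }
    return summary, problems

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary, problems = quick_bath(rows, "amount")
print(summary)
print(problems)  # file line numbers and raw text of the values that need a closer look
```

The problem rows are reported with their line numbers rather than dropped, so you can go back and ask what a blank or an "n/a" actually means.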

Data smells

  • Talk to the people who collect the data
    • Get the documents behind the data
  • Check previous years
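  • Excel row limits (a row count that lands exactly on one suggests a truncated export):
    • Excel 2003 and earlier (.xls): 65,536
    • Excel 2007 and later (.xlsx): 1,048,576
  • Round numbers
  • Null Island (coordinates at latitude 0, longitude 0)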
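
Some of these smells can be checked automatically. The sketch below is one possible way to do it, not a standard tool: the function names are invented, the choice to also flag a count one short of the limit (to allow for a header row) is an assumption, and so is the default of treating multiples of 100 as "round".

```python
# Worksheet row limits from the list above.
EXCEL_ROW_LIMITS = {
    65_536: "Excel 2003 and earlier (.xls)",
    1_048_576: "Excel 2007 and later (.xlsx)",
}

def row_limit_smell(row_count):
    """Return a warning if the row count sits at an Excel limit (with or without a header row)."""
    for limit, version in EXCEL_ROW_LIMITS.items():
        if row_count in (limit, limit - 1):
            return f"{row_count} rows matches the {version} limit; the export may be truncated"
    return None

def share_of_round_numbers(values, multiple=100):
    """Fraction of values that are exact multiples of `multiple` (assumed threshold)."""
    if not values:
        return 0.0
    return sum(1 for v in values if v % multiple == 0) / len(values)

def null_island_rows(points):
    """Indexes of (latitude, longitude) pairs sitting exactly at 0, 0."""
    return [i for i, (lat, lon) in enumerate(points) if lat == 0 and lon == 0]

print(row_limit_smell(65_536))
print(share_of_round_numbers([100, 200, 333, 400]))
print(null_island_rows([(41.88, -87.63), (0.0, 0.0)]))
```

A hit from any of these is a reason to call the agency, not proof of a problem. Some data really are round, and a file can legitimately have 65,536 rows.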

Make your life easier

  • Create a data dictionary
  • Make a copy of the original
  • Don't make changes in a cell, create a new column
  • Track your changes
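
The last three habits can also be scripted. This is a rough sketch under assumptions, not a prescribed workflow: the file names, the `salary` column, the `changes.log` file and both helper functions are hypothetical.

```python
import csv
import shutil
import tempfile
from datetime import datetime
from pathlib import Path

def make_working_copy(original, workdir):
    """Copy the original file and return the path of the copy; the original is never edited."""
    working = Path(workdir) / f"working_{Path(original).name}"
    shutil.copy2(original, working)
    return working

def add_clean_column(path, source, target, clean, log_path):
    """Write cleaned values into a new column, keep the source column as received, log the change."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames) + [target]
        rows = [dict(row, **{target: clean(row[source])}) for row in reader]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{datetime.now().isoformat()} added {target} from {source} in {Path(path).name}\n")

def demo():
    """Run the workflow on a throwaway file and return what ended up on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "salaries.csv"
        original.write_text('name,salary\nAlpha,"$1,200"\n', encoding="utf-8")
        log_path = Path(tmp) / "changes.log"

        working = make_working_copy(original, tmp)
        strip_currency = lambda text: text.replace("$", "").replace(",", "")
        add_clean_column(working, "salary", "salary_clean", strip_currency, log_path)

        with open(working, newline="", encoding="utf-8") as f:
            working_rows = [dict(row) for row in csv.DictReader(f)]
        return original.read_text(encoding="utf-8"), working_rows, log_path.read_text(encoding="utf-8")

original_text, working_rows, log_text = demo()
print(original_text)
print(working_rows)
print(log_text)
```

Keeping the raw `salary` text next to `salary_clean` means any cleaning mistake stays visible, and the log gives you a record to walk an editor through later.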

Resources