Given:
- life is short
- I am lazy
- You should not lie
- Humans are intelligent (w/ caveats ;-)
- ... and not all of them are working at Eurocontrol
...it follows
- I'll (procrastinate on boring stuff and only) work on useful/fun projects
- Automation saves me from repeating boring and/or forgotten tasks
- I'll be open to letting others criticize/scrutinize/learn
- ...and I'll learn back from them
- I'll strive to produce truthful explanations/visualizations
The Axioms (IMHO)
- Value of data --> visualization
- Visualization --> WWW
- Make data available
- no Web: then you do not exist, i.e. EC/PRB/PRU
- no boring stuff: enough of it, do better.
- truthful: no evil
- visualization: human perception & best practices!
- data availability!
The Plan (Jan 2015)
- Generate a (static) website for the PRU
- Version control it all
- Automate!
- static: no need of server, no authentication, no hacks!
- version control: done by systems not humans, i.e. naming convention in folders...
- automation: the only way to scale
Now one year and a half later
Editing
- easy, i.e. textual (ASCII, no HTML): separate content from style
- nice Math (via MathJax): \[f(x)=\sum_{n=0}^\infty\frac{f^{(n)}(a)}{n!}(x-a)^n\]
- bibliography: cite and style
- templates for different kinds of pages (Definitions, list of ANSPs, RNs)
Markdown
No need to edit in HTML: we (mainly) use Markdown (from Pandoc)
## Methodology
[Horizontal en-route flight efficiency methodology](/r/m/hfe_pi.html)
is fully consistent with the Single European Sky (SES)
Performance Scheme [see {% cite pru-hfe-pi --file aviation %}].
## Column naming and types
### HFE data
{:.metatable}
| Column name | Src | Label | Column description | Example |
|-------------|-----|-----------|-----------------------|---------|
| YEAR | NM | YEAR | Reference year | 2014 |
| MONTH_NUM | NM | MONTH_NUM | Month (numeric) | 9 |
| MONTH_MON | NM | MONTH_MON | Month (3-letter code) | JAN |
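A metadata table like the one above could also be generated from a small column definition instead of being typed by hand. The following is only a sketch under assumptions: the `COLUMNS` list and the `metadata_table` helper are hypothetical names, not part of the actual site scripts; the rows are copied from the example above.

```python
# Hypothetical column definitions, copied from the HFE example table.
HEADER = ("Column name", "Src", "Label", "Column description", "Example")
COLUMNS = [
    ("YEAR", "NM", "YEAR", "Reference year", "2014"),
    ("MONTH_NUM", "NM", "MONTH_NUM", "Month (numeric)", "9"),
    ("MONTH_MON", "NM", "MONTH_MON", "Month (3-letter code)", "JAN"),
]


def metadata_table(rows, header=HEADER):
    """Render rows as a Markdown pipe table preceded by the {:.metatable} class line."""
    # Width of each column = longest cell in that column, header included.
    widths = [max(len(r[i]) for r in [header, *rows]) for i in range(len(header))]

    def fmt(cells):
        return "| " + " | ".join(c.ljust(w) for c, w in zip(cells, widths)) + " |"

    rule = "|" + "|".join("-" * (w + 2) for w in widths) + "|"
    return "\n".join(["{:.metatable}", fmt(header), rule, *(fmt(r) for r in rows)])


table = metadata_table(COLUMNS)
print(table)
```

Keeping the definitions in one place means the same list could feed both the web page and any other output that needs the column descriptions.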
Generation
- from DB queries to website: scripts
- Jekyll: MD -> HTML
- Pandoc: MD -> PDF
- some from Rmarkdown/[knitr] in the near future
- But we NEED MORE to scale: for example checks on data consistency
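As an illustration of the kind of data consistency check meant here, a minimal sketch could verify that `MONTH_NUM` and `MONTH_MON` agree in every row of a CSV export. The column names come from the HFE table shown earlier; the `check_rows` function and the sample data are assumptions made up for this example, not the checks actually in use.

```python
import csv
import io

MONTH_CODES = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
               "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def check_rows(rows):
    """Return a list of (line_number, message) for inconsistent rows."""
    problems = []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the CSV header
        try:
            int(row["YEAR"])  # only checks that YEAR is numeric
            month = int(row["MONTH_NUM"])
        except (KeyError, TypeError, ValueError):
            problems.append((line_no, "YEAR/MONTH_NUM missing or not numeric"))
            continue
        if not 1 <= month <= 12:
            problems.append((line_no, f"MONTH_NUM out of range: {month}"))
            continue
        expected = MONTH_CODES[month - 1]
        if row.get("MONTH_MON") != expected:
            problems.append(
                (line_no, f"MONTH_MON {row.get('MONTH_MON')!r} should be {expected!r}")
            )
    return problems


# Made-up sample: the second and third data rows are deliberately wrong.
SAMPLE = """YEAR,MONTH_NUM,MONTH_MON
2014,9,SEP
2014,9,JAN
2014,13,JAN
"""

problems = check_rows(csv.DictReader(io.StringIO(SAMPLE)))
for line_no, message in problems:
    print(f"line {line_no}: {message}")
```

A check like this could run in the generation scripts before Jekyll builds the pages, so that a bad export stops the build instead of reaching the website.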
DB
- new schema for production: PRUPROD
- use current ones for development (PRUDEV) and testing (PRUTEST)
- version control [PL]SQL code, i.e. which code was used to produce which indicators
- version control the DB used for prod: regulatory repository
Data
- clarify dimensions
- improve the Meta part of it: definitions, methodology
- add more data and (web) API (see ICAO iSTARS)
- generate the spreadsheets if CSV files/API are not enough
- Metadata is to be transparent and to avoid confusion, i.e. define what you name/use (delay, trajectory, FIR)
- the API is to make the data available: remember we are not the only smart ones around
More Viz
- more Studies/Articles w/ interactivity (see NYT, WP)
- more thinking of what is worth plotting
- more graphs in Graphs
- one year old experiment click here
- a recent one w/ STATFOR click here
Wild thoughts
- personally I am not interested in BI or industrial-like dashboards
- I know that little is used of our NMIR
Just my own
- PRR live in the website and PDF generated from the source in git repo
- add Jupyter notebooks to the website for case studies
We want you!
- Share knowledge (or lack of)
- Learn from and know each other
- Discover internal and external datasets
- criticize & propose alternatives
- signal things you saw and would like to see implemented in our site. For example NYT, Bloomberg (1, 2), WP, ProPublica, The Guardian, Financial Times ... have fantastic infographics
We hear you!
- emails with questions, proposals are a good start
- you are always welcome to come and chat (but bring your coffee)
- present at the next Show & Tell
References and Inspirations
Tools
- Google Charts cannot be run offline
- GCharts make your life difficult if you want to load data locally, i.e. CSV instead of Google Spreadsheets
Social
- D3.js: Mike Bostock, Ian Johnson, Elijah Meeks, Nadieh Bremer, Susie Lu, Christophe Viau
- Viz gurus: Alberto Cairo, Stephen Few, Edward Tufte, Enrico Bertini, Maarten Lambrechts, Jonathan Corum, Jeffrey Heer ...
- Twitter: big inspiration from #d3js #dataviz
- Statistics: Hadley Wickham
- Conferences: Eyeo Festival, OpenVis Conference
- Awards: Malofiej, Data Journalism Awards
- Newspapers: NYT, The Guardian, Bloomberg, The Washington Post, Pro Publica, National Public Radio ...
Books
Yes, you still have to study!
- Tufte, Edward
- Cairo, Alberto
- Few, Stephen
Automation 1
xkcd 1319 and explanation
Title text: 'Automating' comes from the roots 'auto-' meaning 'self-', and 'mating', meaning 'screwing'.
Automation 2
xkcd 1205 and explanation
Title text: Don't forget the time you spend finding the chart to look up what you save. And the time spent reading this reminder about the time spent. And the time trying to figure out if either of those actually make sense. Remember, every second counts toward your life total, including these right now.
Correlation
xkcd 552 and explanation
Title text: Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'.
Convincing
xkcd 833 and explanation
Title text: And if you labeled your axes, I could tell you exactly how MUCH better.
Data Science Show & Tell
Enrico Spinielli
June 9, 2016
Live slides available at https://espinielli.github.com/showandtell
PDF and source of slides available at https://github.com/espinielli/showandtell