On GitHub: emhart/open_sci_industry
Complete transparency in the scientific process
(open science workflows Hampton et al 2014)
Crisis in public confidence
Combat high profile retractions
"The debunkers could do their debunking only because of a bit of luck: Data they needed happened to be available not from its original source, but through another researcher who had posted it to meet a journal’s open-data policies. (fivethirtyeight.com)"
Journals care.
"the major hurdle to overcome when trying to convince others that we should strive for Open Science: it is a major pain in the ass and is really expensive, in terms of both the money and amount of time required.
We need to stop telling people 'You should' and get better at telling people 'Here’s how'" - Emilio Bruna, UF, editor of Biotropica
A stack is a complete group of components that work together to achieve a goal.
An open science stack is all the tools you need to produce open science
“Open data and content can be freely used, modified, and shared by anyone for any purpose” - Open Knowledge Foundation
Your data can be used long after you're gone
(Figure 1D - Vines et al 2014)
Increased citation (9%)
(Figure 2 - Piwowar and Vision 2013)
(dataone.org)
What makes a format open?
Open
Closed
Some metadata standards
"To anyone who wants to photocopy, bind, and give a copy of the book to their loved one — more power to them. He/She will likely be disappointed that you’re so cheap, though." - Randall Munroe (xkcd)
Your most open choice, public domain!
Choose a Creative Commons license that fits your comfort level
No license does not mean your data is open!
Ideally:
Some suggestions
For more suggestions:
Wolkovich et al. 2012
Open standards facilitate government and industry sharing
Relies on the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM) format
Sharing happens between companies
Sharing between AstraZeneca and Sanofi
Sharing between 23andMe and Pfizer, and between 23andMe and Genentech
"Although the issue of irreproducible data has been discussed between scientists for decades, it has recently received greater attention as the costs of drug development have increased along with the number of late-stage clinical-trial failures and the demand for more effective therapies." (doi:10.1038/483531a)
Data science project workflow
"It is possible to achieve some measure of traditional success while being open. Grants; publications; tenure. 'nuff said." - C. Titus Brown, UC Davis http://bit.ly/ossohsu @emhrt_