Talk about technical and ethical issues related to providing access to a Ferguson social media collection. Interested in hearing about different approaches people come up with since this is a relatively new area for archival research and practice.
the existence, preservation and availability of archives, documents, records in our society are very much determined by the distribution of wealth and power. That is, the most powerful, the richest elements in society have the greatest capacity to find documents, preserve them, and decide what is or is not available to the public. This means government, business and the military are dominant.
Zeynep Tufekci (UNC iSchool) muses on the ways that algorithms
shape the media and our experience. Twitter and Black Twitter
provide an unprecedented view into a community that has been
denied a voice in mainstream media.
% twarc search ferguson > tweets.json
% twarc filter ferguson > stream.json
13,480,00 tweets
August 10, 2014 - August 27, 2014
8.4 compressed
Technical accessibility issues. How do you provide access to this
content?
Not just 140 characters.
- time
- hashtags
- geo coordinates
- places
- embedded media
- retweet
- reply to
- user
- profile
- avatar
- follower count
- Twitter API
Anatomy of a tweet. Twitter's documentation is quite good.
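A minimal sketch of pulling the fields listed above out of one tweet, assuming the JSON layout of Twitter's v1.1 REST API (`created_at`, `entities`, `user`, and so on). The `summarize_tweet` helper and the sample tweet are made up for illustration; they are not part of twarc or of the collection.

```python
def summarize_tweet(tweet):
    """Return the commonly used fields of a v1.1-style tweet dict."""
    entities = tweet.get("entities", {})
    user = tweet.get("user", {})
    return {
        "time": tweet.get("created_at"),
        "hashtags": [h["text"] for h in entities.get("hashtags", [])],
        # "coordinates" and "place" are null when the tweet is not geotagged
        "coordinates": (tweet.get("coordinates") or {}).get("coordinates"),
        "place": (tweet.get("place") or {}).get("full_name"),
        "media": [m["media_url"] for m in entities.get("media", [])],
        # a retweet carries the original tweet under "retweeted_status"
        "is_retweet": "retweeted_status" in tweet,
        "reply_to": tweet.get("in_reply_to_status_id_str"),
        "user": user.get("screen_name"),
        "avatar": user.get("profile_image_url"),
        "followers": user.get("followers_count"),
    }


# Hypothetical sample, trimmed to the keys used above.
sample = {
    "id_str": "1",
    "created_at": "Sun Aug 10 12:00:00 +0000 2014",
    "text": "example text #Ferguson",
    "entities": {"hashtags": [{"text": "Ferguson"}], "urls": []},
    "coordinates": None,
    "place": None,
    "in_reply_to_status_id_str": None,
    "user": {
        "screen_name": "example_user",
        "followers_count": 42,
        "profile_image_url": "http://example.com/avatar.png",
    },
}

summary = summarize_tweet(sample)
print(summary["hashtags"], summary["user"], summary["followers"])
```

The point for access: a single tweet is a nested record, so any access system has to decide which of these fields to index, display, or withhold.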
417,972 Unique Unshortened URLs
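One way a count like this could be started, sketched under assumptions: the collection is line-oriented JSON (one tweet per line) and each tweet carries `entities.urls[].expanded_url` as in the v1.1 API. Note that `expanded_url` only undoes Twitter's own t.co shortening; a link may still point at another shortener, so fully unshortening means following redirects, which is not shown here. `unique_expanded_urls` and the sample lines are hypothetical.

```python
import json


def unique_expanded_urls(lines):
    """Collect distinct expanded URLs from line-oriented tweet JSON."""
    urls = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        for u in tweet.get("entities", {}).get("urls", []):
            expanded = u.get("expanded_url")
            if expanded:
                urls.add(expanded)
    return urls


# Made-up sample lines standing in for a file such as tweets.json.
sample_lines = [
    json.dumps({"id_str": "1", "entities": {"urls": [
        {"url": "http://t.co/a", "expanded_url": "http://example.com/story"}]}}),
    json.dumps({"id_str": "2", "entities": {"urls": [
        {"url": "http://t.co/b", "expanded_url": "http://example.com/story"},
        {"url": "http://t.co/c", "expanded_url": "http://example.org/photo"}]}}),
    json.dumps({"id_str": "3", "entities": {"urls": []}}),
]

found = unique_expanded_urls(sample_lines)
print(len(found))
```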
December 3, 2014
Random imagery from tweets used as a backdrop for the townhall meeting.
Organizing Meeting
Tweet Collection
Ethics, Rights and Data Management
Basic Navigation and Analysis
Advanced Analytical Techniques
Ferguson
Indictment Decision of Darren Wilson
November 11 - December 8, 2014
15,080,078 tweets
Ferguson
Department of Justice Report
February 25 - March 21, 2015
2,033,898 tweets
#WalterScott
April 1 - April 13, 2015
846,602 tweets
#FreddieGray
April 15, 2015 - Present
2,887,755 tweets
#BaltimoreUprising
April 25, 2015 - Present
2,157,853 tweets
Twitter likes to embellish its socially progressive image, but
its Terms of Service unsurprisingly and necessarily reflect
its business interests. Gnip as data reseller, purchased by
Twitter. Shutting off DataSift.
Explain how only IDs can be shared, and how you can
rehydrate data.
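A rough sketch of the ID-only sharing idea, assuming line-oriented tweet JSON with an `id_str` key per tweet. "Dehydrating" reduces each tweet to its ID so only the ID list is published; "rehydrating" means someone else sends those IDs back to the Twitter API to fetch whatever tweets still exist. The `dehydrate` and `batches` helpers below are illustrative names, and the batch size of 100 assumes the per-request limit of the v1.1 `statuses/lookup` endpoint. twarc ships its own `dehydrate` and `hydrate` commands for this; the code only shows the idea and makes no API calls.

```python
import json


def dehydrate(lines):
    """Yield only the tweet ID from each line of tweet JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)["id_str"]


def batches(ids, size=100):
    """Group IDs into lists of at most `size` for lookup requests."""
    batch = []
    for tweet_id in ids:
        batch.append(tweet_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly short, batch


# Made-up sample lines standing in for a file such as tweets.json.
sample_lines = [
    json.dumps({"id_str": "101", "text": "first"}),
    json.dumps({"id_str": "102", "text": "second"}),
    json.dumps({"id_str": "103", "text": "third"}),
]

ids = list(dehydrate(sample_lines))
print("\n".join(ids))                # what would go into a shareable ids file
print(list(batches(ids, size=2)))    # how a rehydrator might chunk requests
```

Deleted or protected tweets do not come back on rehydration, so the shared ID list respects later decisions by the people who wrote the tweets.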