Whither the Pageview Apocalypse?





hh-berlin-2013

Slides for Hacks/Hackers Berlin, October 30, 2013

On GitHub: abelsonlive/hh-berlin-2013


Brian Abelson | @brianabelson
Knight-Mozilla OpenNews Fellow, The New York Times
Hacks/Hackers - Berlin - October 30, 2013
Slides: bit.ly/hh-berlin-2013

"Pageviews are dead"

Remind you of anything?

Pageviews Above Replacement (un-juking the stats)

  • What if we could control for promotion when judging performance?
  • From July to August, I collected data on the promotion and performance of over 21,000 articles published on nytimes.com
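The slides do not give a formula for "Pageviews Above Replacement". One plausible reading, offered here as an assumption, is the gap between an article's actual pageviews and the pageviews a model would predict from its promotion and metadata alone. A minimal sketch with made-up numbers (the function name, slugs, and figures are all hypothetical):

```python
# Assumption: "pageviews above replacement" is read here as
# actual pageviews minus model-predicted pageviews. The talk does not
# spell out the formula, and every number below is invented.

def pageviews_above_replacement(actual, predicted):
    """Residual: how far an article over- or under-performed relative
    to what its promotion and metadata alone would suggest."""
    return actual - predicted

articles = [
    {"slug": "heavily-promoted", "actual": 120_000, "predicted": 150_000},
    {"slug": "barely-promoted", "actual": 40_000, "predicted": 15_000},
]

for article in articles:
    article["par"] = pageviews_above_replacement(
        article["actual"], article["predicted"]
    )

# Rank by over-performance rather than by raw pageviews.
ranked = sorted(articles, key=lambda a: a["par"], reverse=True)
for article in ranked:
    print(article["slug"], article["par"])
```

Under this reading, the lightly promoted article ranks first even though it drew fewer raw pageviews, which is the sense in which the stats get "un-juked".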

Data sources

  • Promotional Data:
      • ~200 NYT-related Twitter accounts
      • ~20 NYT-related Facebook accounts
      • ~20 section fronts
      • One homepage
      • One paper
  • Metadata:
      • Article type: (video, slideshow, interactive, article, blogpost)
      • Section: (US, World, Art, etc.)
  • Performance Data:
      • Pageviews and social media activity for each article

Predicting pageviews

  • Sum each article's pageviews over 7 days on the site
  • Use promotional features and article metadata to predict this number
  • Random Forests (for a numeric target like pageviews, the average of a bunch of decision trees; the mode is used for classification)
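To make the idea concrete, here is a toy stand-in for a regression forest, written in plain Python. It is not the model used in the talk: each "tree" is a single-split stump fit on a bootstrap sample with one randomly chosen feature, and the forest averages the stumps' predictions. The two features (hours on the homepage, number of NYT tweets) and all numbers are invented for illustration.

```python
import random
from statistics import mean

def fit_stump(rows, targets, feature):
    """Fit a one-split regression tree on a single feature: pick the
    threshold with the lowest squared error, predict the mean per side."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        if not left or not right:
            continue
        lm, rm = mean(left), mean(right)
        err = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or err < best[0]:
            best = (err, threshold, lm, rm)
    if best is None:  # no valid split: fall back to the overall mean
        m = mean(targets)
        return feature, float("inf"), m, m
    return feature, best[1], best[2], best[3]

def fit_forest(rows, targets, n_trees=50, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feature = rng.randrange(len(rows[0]))           # random feature per tree
        forest.append(
            fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feature)
        )
    return forest

def predict(forest, row):
    # Regression forests average the individual trees' predictions.
    return mean(lm if row[f] <= th else rm for f, th, lm, rm in forest)

# Invented data: (hours on homepage, number of NYT tweets) -> 7-day pageviews.
rows = [(0, 0), (0, 1), (1, 2), (2, 3), (6, 5), (8, 8), (10, 9), (12, 12)]
targets = [1000, 1500, 3000, 4000, 20000, 26000, 30000, 41000]

forest = fit_forest(rows, targets)
low = predict(forest, (0, 0))
high = predict(forest, (12, 12))
print(round(low), round(high))
```

In practice one would likely reach for an existing implementation such as scikit-learn's `RandomForestRegressor` rather than hand-rolling the trees; the sketch only shows the bootstrap-and-average mechanics.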

Variable importance

  • Time on all section fronts
  • Number of unique section fronts
  • Was the article in the paper?
  • Number of NYT Twitter followers reached
  • Time on homepage
  • Number of NYT tweets
  • Is the article from Reuters?
  • Is the article from the AP?
  • Max rank on homepage
  • Word count
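The slides do not say how variable importance was measured. One common, model-agnostic option is permutation importance: shuffle one feature's column and see how much the prediction error grows. A minimal sketch, assuming a hypothetical `predict` function and invented data with two features (hours on section fronts, word count in hundreds):

```python
import random

def mse(predict, rows, targets):
    """Mean squared error of a prediction function over a dataset."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Average increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(predict, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with feature j replaced by a shuffled value.
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(predict, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical model: pageviews depend heavily on section-front time
# and only slightly on word count. Not the model from the talk.
def predict(row):
    hours_on_section_fronts, word_count_hundreds = row
    return 2000 * hours_on_section_fronts + 10 * word_count_hundreds

rows = [(1, 5), (4, 12), (9, 8), (15, 20), (22, 6), (30, 15)]
targets = [predict(r) for r in rows]  # invented, noise-free targets

importances = permutation_importance(predict, rows, targets)
print(importances)
```

Features whose shuffling hurts the error most rank highest, which gives an ordering of the same kind as the list above.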

So what?

  • Placing promotional data alongside pageviews gives us a better understanding of what the metric actually means.
  • (NYT) Pageviews are actually fairly predictable (my model explains 90% of the variance)
  • Incorporating this approach in your newsroom should be fairly painless (open-source library on the way!)
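"Variance explained" is usually reported as the coefficient of determination, R². Assuming that is the statistic meant here, it can be computed from actual and predicted pageviews as follows (the numbers are made up and much smaller than real traffic):

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` accounted for by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total
    return 1 - ss_res / ss_tot

# Invented example values.
actual = [1000, 2000, 3000, 4000]
predicted = [1200, 1800, 3300, 3900]

r2 = r_squared(actual, predicted)
print(round(r2, 3))
```

A value of 0.9 would mean the model's errors carry only a tenth of the variance seen in the raw pageview counts.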

Danke!

@brianabelson | brianabelson.com | OpenNews