Gary J. Murakami, Ph.D.
Lead Engineer & Ruby Evangelist
Tuesday, September 25, 2012
$ brew update
$ brew install mongodb
$ mongod

$ gem install mongo
$ gem install bson_ext

Gemfile
gem 'mongo'
gem 'bson_ext'
IDE includes support for Mongoid and MongoDB
require 'mongo'
require 'httpclient'
require 'json'

screen_name = ARGV[0] || 'garymurakami'
friends_ids_uri = "https://api.twitter.com/1/friends/ids.json?cursor=-1&screen_name=#{screen_name}"
friend_ids = JSON.parse(HTTPClient.get(friends_ids_uri).body)['ids']

a few lines of Ruby get and parse JSON data from the web - Twitter API
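To see the parsing step in isolation, without a network call, the sketch below runs JSON.parse on a hard-coded response body. The body is a made-up sample that assumes the response is a JSON object with an `ids` array, as the code above implies; the ID values and the extra cursor field are hypothetical, not real API output.

```ruby
require 'json'

# Hypothetical response body (assumption): an object carrying an "ids" array.
# The numeric IDs and the cursor field are invented for illustration.
sample_body = '{"ids": [101, 202, 303], "next_cursor": 0}'

parsed = JSON.parse(sample_body) # JSON object -> Ruby Hash with String keys
friend_ids = parsed['ids']       # JSON array  -> Ruby Array of Integers

puts friend_ids.inspect
puts friend_ids.length
```

The result is plain Ruby data (Hash, Array, Integer), which is why it can be handed to the driver later without any mapping layer.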
connection = Mongo::Connection.new
db = connection['twitter']
collection = db['users']

friend_ids.each_slice(100) do |ids| # best practices
  users_lookup_uri = "https://api.twitter.com/1/users/lookup.json?user_id=#{ids.join(',')}"
  response = HTTPClient.get(users_lookup_uri)
  docs = JSON.parse(response.body) # high-level objects - Array of Hashes
  docs.each{|doc| doc['_id'] = doc['id']} # user supplied _id
  collection.insert(docs, :safe => true) # no schema! - bulk insert - best practices
end
puts "users:#{collection.count}"
a small number of lines load JSON data from the web into a collection
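The two pure-Ruby steps in that loop, batching the IDs and supplying `_id`, can be tried without a database. This is a minimal sketch under stated assumptions: the ID range and the two sample documents are hypothetical stand-ins for `friend_ids` and for the parsed lookup response.

```ruby
# Hypothetical IDs standing in for friend_ids; 250 values yield three batches.
friend_ids = (1..250).to_a

# each_slice(100) groups the IDs into batches of at most 100,
# so each batch becomes one lookup request and one bulk insert.
batches = friend_ids.each_slice(100).to_a
query_values = batches.map { |ids| ids.join(',') } # comma-separated user_id values

# Hypothetical documents, shaped like an Array of Hashes from JSON.parse.
docs = [
  { 'id' => 1, 'screen_name' => 'alice' },
  { 'id' => 2, 'screen_name' => 'bob' }
]

# Copy the application's own id into _id so it serves as the primary key.
docs.each { |doc| doc['_id'] = doc['id'] }

puts batches.map(&:length).inspect
puts docs.first.inspect
```

Reusing the Twitter ID as `_id` means inserting the same user twice is rejected as a duplicate key rather than stored twice.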
users = Mongo::Connection.new['twitter']['users']
users.find({}, :sort => { followers_count: -1 }, :limit => 10).each do |doc|
  puts doc.values_at('followers_count', 'screen_name').join("\t")
end
a simple query with sort and limit
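For comparison, here is what that sort and limit mean when written in plain Ruby over an in-memory Array. It is only an illustration of the semantics, not of how the server executes the query; the four sample documents reuse counts from the output shown in this talk, trimmed to two fields.

```ruby
# Sample documents (counts taken from the talk's output, other fields omitted).
docs = [
  { 'screen_name' => 'oscon',   'followers_count' => 9074 },
  { 'screen_name' => 'Dropbox', 'followers_count' => 1409213 },
  { 'screen_name' => 'spf13',   'followers_count' => 7302 },
  { 'screen_name' => 'MongoDB', 'followers_count' => 25092 }
]

# Sort descending by followers_count (the -1 in the query), then keep the top 2.
top = docs.sort_by { |doc| -doc['followers_count'] }.first(2)

top.each do |doc|
  puts doc.values_at('followers_count', 'screen_name').join("\t")
end
```

With the query form, the sorting and truncation happen on the server, so only the requested documents travel to the client.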
1409213	Dropbox
25092	MongoDB
9074	oscon
8148	railsconf
7302	spf13
4374	10gen
2259	kchodorow
1925	rit
1295	dmerr
1148	aaronheckmann
require 'mongo'
require 'httpclient'
require 'json'

db = Mongo::Connection.new['twitter']
users = db['users']
tweets = db['tweets']
tweets.ensure_index('user.id') # dot notation to specify subfield

users.find({}, {fields: {id: true, screen_name: true, since_id: true}}).each do |user|
  twitter_user_timeline_uri = "https://api.twitter.com/1/statuses/user_timeline.json?user_id=#{user['id']}&count=200&include_rts=true&contributor_details=true"
  twitter_user_timeline_uri += "&since_id=#{user['since_id']}" if user['since_id'] # '&' separates query parameters
  response = HTTPClient.get(twitter_user_timeline_uri)
  docs = JSON.parse(response.body) # high-level objects
  next if docs.empty? # nothing new for this user
  docs.each{|doc| doc['_id'] = doc['id']} # user supplied _id
  tweets.insert(docs, :continue_on_error => true) # bulk insert
  newest_id = docs.map{|doc| doc['id']}.max # highest tweet id fetched
  users.update({_id: user['_id']}, {'$set' => {'since_id' => newest_id}})
  puts tweets.count(:query => {'user.id' => user['id']})
end
puts "tweets:#{tweets.count}"

a small number of lines load and record tweets
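The incremental part of that loader depends on two details: `since_id` has to be a separate query parameter, and the value recorded for the next run should be the highest tweet ID seen. A minimal sketch of one way to isolate both follows; the helper names `timeline_uri` and `newest_id` are hypothetical, and only the endpoint URL and parameter names come from the code above.

```ruby
require 'uri'

# Hypothetical helper: build the timeline URL, adding since_id only when known.
def timeline_uri(user_id, since_id = nil)
  params = { 'user_id' => user_id, 'count' => 200 }
  params['since_id'] = since_id if since_id
  # encode_www_form joins the pairs with '&' and escapes the values.
  'https://api.twitter.com/1/statuses/user_timeline.json?' + URI.encode_www_form(params)
end

# Hypothetical helper: highest id among fetched documents, or nil when empty.
def newest_id(docs)
  docs.map { |doc| doc['id'] }.max
end

puts timeline_uri(42)
puts timeline_uri(42, 99)
puts newest_id([{ 'id' => 5 }, { 'id' => 9 }, { 'id' => 7 }]).inspect
```

Taking the maximum rather than relying on list position keeps the bookkeeping correct regardless of the order in which the documents arrive.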
tweets = Mongo::Connection.new['twitter']['tweets']
screen_name = 'MongoDB'
tweets.find({'user.screen_name' => screen_name}, :sort => { retweet_count: -1 }, :limit => 10).each do |doc|
  puts doc.values_at('retweet_count', 'text').join("\t")
end
a simple query with a query selector, sort and limit
172	#MongoDB v2.2 released http://t.co/sN6Rzc7D
77	#mongoDB2dot2 officially released, w/ Advanced Aggregation Framework, Multi-Data Center Deployment + 600 new features http://t.co/DMSdWGwN
59	Announcing free online MongoDB classes. http://t.co/RIoAb7l7
30	RT @jmikola: Introducing mongoqp: a frontend for #MongoDB's query profiler https://t.co/ROCSzs6W
29	Sign up for free online MongoDB courses starting in October http://t.co/Im68Q4NM
24	1,000+ have signed up for free MongoDB training since the courses were announced today http://t.co/oINxfH1t
22	How Disney built a big data platform on a startup budget using MongoDB. http://t.co/8qW5yc7T
21	#mongoDB2dot2, available for download. http://t.co/kcGhUDhI
19	MongoDB Sharding Visualizer http://t.co/onIG08jv another product of #10genLabs
16	MongoDB Java Driver 2.9.0 released http://t.co/UDvZoAYV
{"BSON": ["awesome", 5.05, 1986]}
→
"\x31\x00\x00\x00
 \x04BSON\x00
 &\x00\x00\x00
 \x020\x00\x08\x00\x00\x00awesome\x00
 \x011\x00333333\x14@
 \x102\x00\xc2\x07\x00\x00
 \x00
 \x00"
base types extended to match modern programming languages
transfer over wire - Mongo Wire Protocol
fast - no parsing, no server-side data repackaging to disk
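To make the byte layout concrete, here is a minimal sketch of an encoder that reproduces the example document byte for byte. It is an illustration written for this purpose, not the driver's implementation: it handles only doubles, strings, embedded documents, arrays, and integers, and the method names `encode_element` and `encode_document` are hypothetical.

```ruby
# Encode one key/value pair as: type byte, key as C string, then the payload.
def encode_element(key, value)
  name = key.to_s.b + "\x00".b
  case value
  when Float # 0x01: 8-byte little-endian double
    "\x01".b + name + [value].pack('E')
  when String # 0x02: int32 length (including the trailing NUL), bytes, NUL
    bytes = value.b
    "\x02".b + name + [bytes.bytesize + 1].pack('l<') + bytes + "\x00".b
  when Hash # 0x03: embedded document
    "\x03".b + name + encode_document(value)
  when Array # 0x04: a document whose keys are "0", "1", "2", ...
    indexed = value.each_with_index.map { |v, i| [i.to_s, v] }.to_h
    "\x04".b + name + encode_document(indexed)
  when Integer # 0x10: int32 when it fits, otherwise 0x12: int64
    if value >= -2**31 && value < 2**31
      "\x10".b + name + [value].pack('l<')
    else
      "\x12".b + name + [value].pack('q<')
    end
  else
    raise ArgumentError, "unsupported type in this sketch: #{value.class}"
  end
end

# A document is: int32 total size, the elements, then a terminating NUL.
# The size counts itself (4 bytes) and the terminator (1 byte).
def encode_document(hash)
  body = hash.map { |k, v| encode_element(k, v) }.join.b
  [body.bytesize + 5].pack('l<') + body + "\x00".b
end

bytes = encode_document({ 'BSON' => ['awesome', 5.05, 1986] })
puts bytes.bytesize
puts bytes.inspect
```

Every element carries its type and every document and string carries its length up front, which is what lets a reader skip over fields without parsing text.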
multiple database server nodes
one primary member, many secondary members
Ruby on Rails, Sinatra, etc.
class User
  include MongoMapper::Document
  key :name, String
  key :age, Integer
  many :hobbies
end

class Hobby
  include MongoMapper::EmbeddedDocument
  key :name, String
  key :started, Time
end

user = User.new(:name => 'Brandon')
user.hobbies.build(:name => 'Programming', :started => 10.years.ago)
user.save!
User.where(:name => 'Brandon').first

class Artist
  include Mongoid::Document
  field :name, type: String
  embeds_many :instruments
end

class Instrument
  include Mongoid::Document
  field :name, type: String
  embedded_in :artist
end

syd = Artist.where(name: "Syd Vicious").between(age: 18..25).first
syd.instruments.create(name: "Bass")
syd.with(database: "bands", session: "backup").save!

session = Moped::Session.new([ "127.0.0.1:27017" ])
session.use "echo_test"
session.with(safe: true) do |safe|
  safe[:artists].insert(name: "Syd Vicious")
end
session[:artists].find(name: "Syd Vicious").update(:$push => { instruments: { name: "Bass" }})
a MongoDB driver for Ruby which exposes a simple, elegant, and fast API
"Lightroom 3 Catalog.lrcat"
Objective: Write about best practices for using MongoDB with Ruby. Post your blog posts in the comments section of this blog post by October 10 to be entered to win.
http://blog.10gen.com/post/32158163299/mongodb-and-ruby-blogging-contest
Google: mongodb and ruby blogging contest
{
  name: "Gary J. Murakami, Ph.D.",
  title: "Lead Engineer and Ruby Evangelist",
  company: "10gen (the MongoDB company)",
  phone: "1-866-237-8815 x8015",
  mobile: "1-908-787-6621",
  email: "gary.murakami@10gen.com",
  im: "gjmurakami (AIM)",
  twitter: "@GaryMurakami",
  blog: "grayghostwriter.blogspot.com",
  website: "www.nobell.org",
  linkedin: "www.linkedin.com/pub/gary-murakami/1/36a/327",
  facebook: "facebook.com/gary.j.murakami"
}