bamboo - real time data analysis
- modi research group | earth institute | columbia
- real world development
- data gathering
- data analysis
- open source! github: modilabs/bamboo
use cases:
- nigeria mdg info system
- egyptian election
- meteor client
egyptian election analysis
it's a web service, after all
# import dataset
curl -X POST -F csv_file=@election_results_8pm_9pm.csv \
http://bamboo.io/datasets
# define calculation
curl -X POST -d "name=voter_turnout&formula=vote_count/population" \
http://bamboo.io/datasets/12432112424/calculations
# query result
curl http://bamboo.io/datasets/12432112424/info?select=voter_turnout
# later that day
curl -X POST -F csv_file=@election_results_9pm_10pm.csv \
http://bamboo.io/datasets
# query updated results easily
curl http://bamboo.io/datasets/12432112424/info?select=voter_turnout
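the same calls can be scripted. a minimal python sketch, assuming only the endpoints and parameters shown in the curl commands above; the helper names `calculation_request` and `info_request` are hypothetical and not part of bamboo or pybamboo. the requests are built but not sent, so nothing here depends on the service being reachable.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://bamboo.io"  # host taken from the curl examples


def calculation_request(dataset_id, name, formula):
    """Build (but do not send) the POST that defines a calculation."""
    body = urlencode({"name": name, "formula": formula}).encode("ascii")
    url = f"{BASE_URL}/datasets/{dataset_id}/calculations"
    return Request(url, data=body, method="POST")


def info_request(dataset_id, select):
    """Build (but do not send) the GET that queries one calculated column."""
    query = urlencode({"select": select})
    return Request(f"{BASE_URL}/datasets/{dataset_id}/info?{query}")


# Dataset id copied from the curl examples; a real id comes back from the import call.
calc = calculation_request("12432112424", "voter_turnout", "vote_count/population")
info = info_request("12432112424", "voter_turnout")

print(calc.get_method(), calc.full_url, calc.data)
print(info.get_method(), info.full_url)
# To actually send a request: urllib.request.urlopen(calc)
```

note that `urlencode` percent-encodes the `/` in the formula, which curl's `-d` does not do on its own.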
pybamboo
bamboo.js
stack
- python
- pandas - stats, analysis
- celery - tasks
- cherrypy - webserver
- mongo - database (?)
- pyparsing - formula
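to give a feel for what the formula layer has to do, here is a small sketch that evaluates a formula such as `vote_count/population` against one row. it is an assumption-laden stand-in: it uses python's standard `ast` module instead of pyparsing, supports only `+ - * /`, and the `evaluate` function and the sample row are made up for illustration. bamboo's actual grammar is not shown in these slides.

```python
import ast
import operator

# Arithmetic operators this sketch allows inside a formula.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(formula, row):
    """Evaluate an arithmetic formula against one row (a dict of column values)."""
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Name):
            return row[node.id]  # column reference
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value  # numeric literal
        # Anything else (function calls, attribute access, ...) is rejected.
        raise ValueError(f"unsupported formula element: {ast.dump(node)}")

    return walk(ast.parse(formula, mode="eval").body)


# Illustrative row, not real election data.
row = {"vote_count": 450, "population": 1000}
turnout = evaluate("vote_count/population", row)
print(turnout)
```

walking a parsed tree with a whitelist of node types keeps user-supplied formulas from running arbitrary code, which is the main reason a web service would parse formulas rather than call `eval`.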
What Does it Do?
- manage datasets: give it a csv
- calculations
- newCol = numGirls / population
- aggregation
- sum()
- max()
- mean()
- ratio()
- update
- merge datasets
- join datasets by column
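the operations listed above map naturally onto pandas, which the stack already uses. a rough sketch with made-up rows: the column names `district` and `region` and all values are illustrative, and reading `ratio()` as a ratio of sums is an assumption. this shows the pandas idioms, not bamboo's internal implementation.

```python
import pandas as pd

# Made-up rows standing in for an uploaded csv.
schools = pd.DataFrame({
    "district": ["north", "north", "south"],
    "numGirls": [120, 80, 150],
    "population": [400, 200, 500],
})

# Calculation: derive a new column from existing ones.
schools["newCol"] = schools["numGirls"] / schools["population"]

# Aggregation: sum, max and mean per district, plus a ratio of sums (assumed meaning).
by_district = schools.groupby("district").agg(
    girls_sum=("numGirls", "sum"),
    girls_max=("numGirls", "max"),
    girls_mean=("numGirls", "mean"),
    population_sum=("population", "sum"),
)
by_district["girls_ratio"] = by_district["girls_sum"] / by_district["population_sum"]

# Update: append newly arrived rows, then recompute the calculation.
new_rows = pd.DataFrame({"district": ["south"], "numGirls": [50], "population": [100]})
schools = pd.concat([schools, new_rows], ignore_index=True)
schools["newCol"] = schools["numGirls"] / schools["population"]

# Join: attach columns from a second dataset by a shared column.
regions = pd.DataFrame({"district": ["north", "south"], "region": ["A", "B"]})
joined = schools.merge(regions, on="district", how="left")

print(by_district)
print(joined)
```

in this sketch the calculation is recomputed by hand after the update; the point of the service is that stored calculations are refreshed for you when new rows arrive.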
obstacles & future
- obstacles
- mongo???
- if not, what?
- row v. column stores
- scidb
- we are on a budget
- future
- make the world suck less with data!
bamboo