Introduction to Big Data
with practical use-cases
@dhilipsiva
@dhilipsiva
- Full-Stack & DevOps Engineer @Appknox
- Big Data, Machine Learning & IoT Enthusiast
- Open-Source Fanatic & GitHub Addict
- Father
Data & History
Let's discuss the history of data
Surprise, Surprise!!!
- Each card is about 100 Bytes
- 62500 Cards
- 5.96 MB (approx. 3 floppy disks)
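The arithmetic behind the figures above can be checked with a short sketch. It assumes the slide's round numbers (about 100 bytes per card, 62,500 cards) and takes 1 MB to mean 1024 * 1024 bytes; the constant and function names are illustrative only.

```python
# Assumed figures taken from the slide: ~100 bytes per card, 62,500 cards.
BYTES_PER_CARD = 100
CARD_COUNT = 62_500


def total_megabytes(cards: int, bytes_per_card: int) -> float:
    """Return the total capacity in MB, assuming 1 MB = 1024 * 1024 bytes."""
    return cards * bytes_per_card / (1024 * 1024)


total_mb = total_megabytes(CARD_COUNT, BYTES_PER_CARD)
print(f"{total_mb:.2f} MB")  # prints "5.96 MB"
```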
Data Tomorrow
- Data on Bacteria
- Data on DNA
- Qubits
- And so much more
Computers - Back then
10 years down the line
- Mechanical
- The Electronics
Computers - Today
- Moore's law
- Already hitting a roadblock
Inputs - today
What happens in a minute
[doubled in the pic]
Keyboards, mobiles, and cameras alone give us text, audio, pictures, and videos.
Big Data
And that is Big Data :)
But Wait
What about the Future?
Census
- First, there was census
- Then, came computers
Hadoop - to the rescue
But first - History
Hadoop & Co.
- Hadoop
- MapReduce
- Ambari: A web-based tool for provisioning, managing and monitoring Apache Hadoop clusters
- Avro: A data serialization system
- HBase: A column-oriented data store
- Hive: A data warehouse infrastructure that provides data summarization and ad hoc querying.
- Pig: A high-level data-flow language and execution framework for parallel computation.
- Spark: A fast, general compute engine for ETL (Extract, Transform, Load), machine learning, stream processing, and graph computation.
- ZooKeeper: A high-performance coordination service for distributed applications.
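To make the MapReduce entry in the list above more concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps applied to a word count. It is plain Python, not Hadoop's API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions made for illustration. On a real cluster each phase would run in parallel across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain the three phases together (hypothetical helper)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


# Illustrative input, not taken from the talk.
counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

Because each map call looks at one line only and each reduce call looks at one key only, the work can be split across machines, which is the property Hadoop exploits at scale.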
Applications: Google
Webpage indexing
Face detection
Personalized Ads
Plenty of others
Applications: Facebook
Malware detection
Spam detection
Finding Faces
And much more
Applications: Twitter
Trending Posts
Analytics
Applications: Others
Appknox
Nest
Cancer Research
Rice DNA
Applications: Future
IoT - Health, Home Appliances, Weather