nosql-is-a-lie-lnug-2016



On Github ukmadlz / nosql-is-a-lie-lnug-2016

NoSQL is a lie

Legal Disclaimer

  • © IBM Corporation 2015. All Rights Reserved.
  • The information contained in this publication is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information contained in this publication, it is provided AS IS without warranty of any kind, express or implied. In addition, this information is based on IBM’s current product plans and strategy, which are subject to change by IBM without notice. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this publication or any other materials. Nothing contained in this publication is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software.
  • References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and/or capabilities referenced in this presentation may change at any time at IBM’s sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth or other results.

Who am I?

Mike Elsmore

Developer Advocate

mike.elsmore@uk.ibm.com

< rant >

NoSQL

Catch All

try {
    SQL
} catch (Exception $e) {
    Must be NoSQL
}

Not SQL

It's a nasty backronym

SQL on NoSQL

SPARQL

CQL (Cassandra Query Language)

Couchbase SQL

Schemaless

Yes, they'll accept anything…but

Schema

Because how else do you know what you’re getting out?

No NoSQL Experts

Over 5 Primary Types

So many distinct types of databases

"X" Expert

</ rant >

Enter CAP Theorem

Consistency, Availability and Partition Tolerance

Consistent

In = Out

Available

Partition Tolerant

Knows where to look

Pick two?

Uncertainty principle

Why is this important?

History

Consistent & Available

Ignoring partition tolerance by keeping everything in the same place

Distributed Systems and Databases

Need to know which machine a given piece of data is on
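
One simple way for a distributed database to know which machine holds a piece of data is to hash the key and map the hash to a node. The sketch below is an illustration only: the node names and the `node_for` helper are hypothetical, and plain modulo hashing is used for brevity.

```python
import hashlib

# Hypothetical node names; a real cluster would discover its members dynamically.
NODES = ["node-a", "node-b", "node-c"]

def node_for(key, nodes=NODES):
    """Map a key to a node using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap around the node list.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]

# The same key always routes to the same node, so any client can find the data.
owner = node_for("user:42")
```

Modulo hashing moves most keys whenever the node list changes, which is why systems such as Cassandra and Riak use consistent hashing variants instead.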

Partition Tolerance & ___________

Design Decision

The reason why most NoSQL Databases are either AP or CP
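
The AP versus CP design decision can be sketched with a toy two-replica cluster. This is a simplified model under stated assumptions, not how any particular database implements it: the `Cluster` and `Replica` classes and the `partitioned` flag are hypothetical names for illustration.

```python
class Replica:
    def __init__(self):
        self.data = {}

class Cluster:
    """Two replicas joined by a toy network link that can be cut."""

    def __init__(self, mode):
        assert mode in ("CP", "AP")
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False  # True simulates a network partition

    def write(self, replica, key, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # CP: refuse the write rather than let the replicas disagree.
                raise RuntimeError("unavailable: cannot reach the other replica")
            # AP: accept the write locally; replicas diverge until reconciled.
            replica.data[key] = value
            return
        # Healthy network: both replicas receive the write.
        replica.data[key] = value
        other.data[key] = value
```

During a partition the CP cluster stays consistent but rejects writes, while the AP cluster keeps accepting writes and has to reconcile the replicas later.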

Other Database Types

Object, Tabular, Tuple, Triple/Quad store (RDF), Multimodel, etc.

Key Value Datastores

Popular Key-Value Datastores

Redis

Memcached

Riak

What is a Key-Value Datastore?

- A data storage paradigm designed for storing, retrieving, and managing associative arrays, a data structure more commonly known today as a dictionary or hash
- Dictionaries contain a collection of objects, or records, which in turn have many different fields within them, each containing data
- Usually an AP system

Why use a Key-Value store?

- The simple model makes them simple to use and powerful

Use Cases
- Session stores
- Fast search lookups
- Queues
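
The session-store use case can be sketched as a dictionary whose entries expire. This is a minimal in-memory illustration, not the API of Redis, Memcached, or Riak: the `SessionStore` class and its `clock` parameter are hypothetical.

```python
import time

class SessionStore:
    """In-memory key-value store whose entries expire after a fixed time to live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._data = {}      # key -> (expires_at, value)

    def put(self, key, value):
        self._data[key] = (self._clock() + self._ttl, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and behave as if it was never stored.
            del self._data[key]
            return None
        return value

sessions = SessionStore(ttl_seconds=1800)
sessions.put("session:abc123", {"user": "mike", "cart": ["book"]})
```

Every operation is a lookup by key, which is what keeps the model simple and fast.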

Document Datastores

Popular Document Datastores

Cloudant

CouchDB

MongoDB

RethinkDB

What is a Document Datastore?

- Considered a subclass of key-value datastores
- Relies on the document to provide the metadata used to optimise and build further queries
- Uses techniques like MapReduce to query
- Uses search systems like Apache Lucene for advanced querying
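
A MapReduce-style query over documents can be sketched in a few lines. This is an in-memory illustration loosely modelled on view-style queries, not the API of any of the datastores listed: the sample documents and the `run_view`, `map_doc`, and `reduce_values` names are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents; note that they do not all share the same fields.
docs = [
    {"_id": "1", "type": "talk", "city": "London", "attendees": 80},
    {"_id": "2", "type": "talk", "city": "Leeds", "attendees": 35},
    {"_id": "3", "type": "talk", "city": "London", "attendees": 120},
    {"_id": "4", "type": "speaker", "name": "Example Speaker"},
]

def map_doc(doc):
    """Map step: emit (key, value) pairs for the documents we care about."""
    if doc.get("type") == "talk":
        yield doc["city"], doc["attendees"]

def reduce_values(values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)

def run_view(documents, map_fn, reduce_fn):
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}

totals = run_view(docs, map_doc, reduce_values)
```

The map function relies on fields inside each document (`type`, `city`) to decide what to emit, which is the sense in which the document itself supplies the metadata for the query.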

Why use a Document Datastore?

- Operational datastore
- Flexibility in changing the data model whilst presenting the same responses
- The majority are designed with AP in mind
- Once you model around eventual consistency, you're able to handle a lot of reads and writes

Column Datastores

- Wide Column Datastores

Popular Column Datastores

Cassandra

HBase

Accumulo

What is a Column Datastore?

- Does use tables, rows and columns for the storage model
- Kinda relational
- However, the names and format of columns can change between rows
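
The idea that column names can differ from row to row can be sketched with nested dictionaries. This is a simplified in-memory model, not how Cassandra, HBase, or Accumulo store data: the `put`, `get_row`, and `column_slice` helpers and the sensor data are hypothetical.

```python
# row key -> {column name: value}; each row carries its own set of columns.
table = {}

def put(row_key, column, value):
    table.setdefault(row_key, {})[column] = value

def get_row(row_key):
    return dict(table.get(row_key, {}))

def column_slice(row_key, start, end):
    """Return (column, value) pairs with start <= column < end, in column order."""
    row = table.get(row_key, {})
    return [(col, row[col]) for col in sorted(row) if start <= col < end]

# One row uses timestamps as column names, another uses descriptive names.
put("sensor-1", "2016-01-27T10:00", 21.5)
put("sensor-1", "2016-01-27T10:01", 21.7)
put("sensor-1", "2016-01-27T11:00", 22.4)
put("sensor-2", "location", "roof")

readings = column_slice("sensor-1", "2016-01-27T10:00", "2016-01-27T11:00")
```

Using sortable timestamps as column names turns a range of columns within one row into a time window, which is one common way this model is applied to time-ordered data.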

Why use a Column Datastore?

- Can use it for operational storage
- But due to how you model it, Relational DBs
- Is amazing for time series
- Massive distribution
- Network failure

Graph Datastores

Popular Graph Datastores

TitanDB

Neo4j

Giraph

What are Graph Datastores?

- Well, it's graph structures: nodes -> edges
- Take advantage of distributed computing to cope with X million+ relationships
- Relies more on the relationships than the metadata

Why use Graph Datastores?

- You can do the same in SQL, if the traversal is fixed
- Allows for complex iterative and cyclical queries

Best use cases
- Social graphs (Erdős)
- Recommendation engines
- Fraud detection
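
A traversal whose depth is not known in advance can be sketched as a breadth-first search, here computing an Erdős-number style distance. This is an in-memory illustration rather than a query in any graph database: the co-authorship data and the `hops` function are hypothetical.

```python
from collections import deque

# Hypothetical co-authorship graph as an adjacency list (edges go both ways).
edges = {
    "Erdos": {"Alice", "Bob"},
    "Alice": {"Erdos", "Carol"},
    "Bob": {"Erdos"},
    "Carol": {"Alice", "Dave"},
    "Dave": {"Carol"},
    "Eve": set(),
}

def hops(graph, start, goal):
    """Breadth-first search: number of edges on a shortest path, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)  # guards against cycles in the graph
                queue.append((neighbour, dist + 1))
    return None

dave_number = hops(edges, "Dave", "Erdos")
```

The search keeps following relationships until it reaches the goal or runs out of nodes, so the number of hops is discovered rather than fixed up front.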
© IBM Corporation 2016. All Rights Reserved.