Apache Cassandra and JON

NoSQL in Real Life

Jiří Kremser <jkremser@redhat.com>

#rhq on Freenode

2014-04-02

Outline

Overview of JON
Cassandra Essentials
JON + Cassandra = ♥
Demo


Overview of JON

Framework for
- Monitoring
- Alerting
- Management of servers and applications
Central Server(s)
Agents on managed machines
Plugins do the actual work

Architecture

Web UI

Why NoSQL?

Multiple servers HA setup
Many agents can report to one server
One agent can collect many metrics
Everything went to a single PostgreSQL instance => bottleneck

♥ Cassandra ♥

Latest release: 2.0.6 / March 10, 2014

Licence: Apache License v2.0

RDBMS Style

CREATE TABLE person (
  id uuid PRIMARY KEY,
  name varchar(255),
  homepage varchar(255)
);

CREATE TABLE person_email (
  user_id uuid REFERENCES person,
  email varchar(255)
);

SELECT * FROM person JOIN person_email ON person.id = person_email.user_id;

C* Style

CREATE TABLE person (
  id uuid,
  name text,
  homepage text,
  emails set<text>,
  PRIMARY KEY (id)
);

UPDATE person SET emails = emails + {'a@b.c', 'foo@bar.baz'} 
WHERE id = 550e8400-e29b-41d4-a716-44665544a000;

C* Data Model

partitioned row store with tunable consistency
one row key identifies sorted set of columns (conceptually: "Map<rowKey, SortedSet<column>>")
column family ~ table
row key ~ partition key
denormalization => Cassandra tables are not related.
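
The conceptual "Map<rowKey, SortedSet<column>>" model can be sketched in plain Java. This is only an illustration of the shape of the data, not Cassandra's storage code; the class name RowStoreSketch and its methods are hypothetical, and columns are modelled as a sorted name -> value map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration of the conceptual data model, not real Cassandra code.
final class RowStoreSketch {
    // row key -> columns of that row, kept sorted by column name
    private final Map<String, NavigableMap<String, String>> rows = new HashMap<>();

    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Returns up to `limit` columns of one row whose names are >= `from`, in name order.
    NavigableMap<String, String> slice(String rowKey, String from, int limit) {
        NavigableMap<String, String> result = new TreeMap<>();
        NavigableMap<String, String> columns = rows.get(rowKey);
        if (columns == null) {
            return result; // unknown row key: empty slice
        }
        for (Map.Entry<String, String> e : columns.tailMap(from, true).entrySet()) {
            if (result.size() >= limit) {
                break;
            }
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

One lookup by row key finds the whole row; a slice then walks the already sorted columns, which is why range reads inside a single row are cheap.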

Interfacing with C*

Thrift vs CQL

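import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;
import org.apache.cassandra.thrift.*;
import org.apache.thrift.protocol.*;
import org.apache.thrift.transport.*;

// open a framed Thrift connection to a local node
TTransport tr = new TFramedTransport(new TSocket("localhost", 9160));
TProtocol proto = new TBinaryProtocol(tr);
Cassandra.Client client = new Cassandra.Client(proto);
tr.open();

client.set_keyspace("MyKeyspace");
ColumnParent parent = new ColumnParent("person");

// read up to 10 columns of the row with row key equal to "1"
// (empty start and finish mean an unbounded slice)
SlicePredicate predicate = new SlicePredicate();
SliceRange sliceRange = new SliceRange(ByteBuffer.wrap("".getBytes(StandardCharsets.UTF_8)), ByteBuffer.wrap("".getBytes(StandardCharsets.UTF_8)), false, 10);
predicate.setSlice_range(sliceRange);

ByteBuffer rowKey = ByteBuffer.wrap("1".getBytes(StandardCharsets.UTF_8));
List<ColumnOrSuperColumn> results = client.get_slice(rowKey, parent, predicate, ConsistencyLevel.ONE);
for (ColumnOrSuperColumn result : results) {
  Column column = result.column;
  // decode name and value as UTF-8; duplicate() leaves the original buffer position untouched
  System.out.println(StandardCharsets.UTF_8.decode(column.name.duplicate())
      + " -> " + StandardCharsets.UTF_8.decode(column.value.duplicate()));
}
tr.close();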

Writes

Logging data in the commit log
Writing data to the memtable
Flushing data from the memtable - sort, sequential I/O, purge records in the commit log
Storing data on disk in SSTables
Compaction - merge sort, data consolidation, removing records with tombstones, etc.

[source: DataStax documentation]
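
The write path steps can be mimicked with in-memory collections. The sketch below is a toy model under several assumptions: the class WritePathSketch and all of its methods are hypothetical, a null value stands in for a tombstone, compaction uses putAll instead of a streaming merge sort, and tombstones are dropped immediately rather than after a grace period.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.TreeMap;

// Toy model of commit log -> memtable -> SSTable -> compaction; not real Cassandra code.
final class WritePathSketch {
    private final List<String> commitLog = new ArrayList<>();                 // append-only log
    private TreeMap<String, String> memtable = new TreeMap<>();               // sorted, in memory
    private final List<TreeMap<String, String>> sstables = new ArrayList<>(); // never modified after flush

    void write(String key, String value) {
        commitLog.add(key + "=" + value); // step 1: sequential append
        memtable.put(key, value);         // step 2: update the memtable
    }

    void delete(String key) {
        commitLog.add(key + " deleted");
        memtable.put(key, null);          // a null value marks a tombstone in this sketch
    }

    // Steps 3 and 4: the sorted memtable becomes a new SSTable, the commit log is purged.
    void flush() {
        if (memtable.isEmpty()) {
            return;
        }
        sstables.add(memtable);
        memtable = new TreeMap<>();
        commitLog.clear();
    }

    // Step 5: merge all SSTables (newer entries overwrite older ones) and drop tombstones.
    void compact() {
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> table : sstables) {
            merged.putAll(table); // iterated oldest first, so the newest value wins
        }
        merged.values().removeIf(Objects::isNull);
        sstables.clear();
        if (!merged.isEmpty()) {
            sstables.add(merged);
        }
    }

    // Reads check the memtable first, then SSTables from newest to oldest.
    String read(String key) {
        if (memtable.containsKey(key)) {
            return memtable.get(key);
        }
        for (int i = sstables.size() - 1; i >= 0; i--) {
            if (sstables.get(i).containsKey(key)) {
                return sstables.get(i).get(key);
            }
        }
        return null;
    }

    int sstableCount() { return sstables.size(); }
    int commitLogSize() { return commitLog.size(); }
}
```

Note that no existing SSTable is ever modified in place: updates and deletes only add newer entries, and the cleanup is deferred to compaction.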

Writes Compared

Postgres, MariaDB, Oracle, MongoDB, Membase, BerkeleyDB, etc. use in-place modifications
... but their flushes are _random_ writes
C* just appends all the operations to the commit log and memtables
the flush is sequential I/O and can be done in O(n)

Reads

[source: DataStax documentation]

CAP Theorem

aka Brewer's "theorem"

pick two: Consistency, Availability, Partition tolerance

Tunable Consistency

eventually consistent by default
Read and Write Consistency
Linearizable consistency since 2.0, via the Paxos protocol (lightweight transactions)
last write wins vs. vector clocks vs. Paxos (Leslie Lamport)
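
"Last write wins" can be sketched as a reconciliation of timestamped cells returned by different replicas. The names LastWriteWins, Cell and reconcile are hypothetical; breaking timestamp ties by comparing values is an assumption made here so that the result does not depend on the order of the replica answers.

```java
import java.util.List;

// Minimal sketch of last-write-wins conflict resolution; not Cassandra's implementation.
final class LastWriteWins {
    static final class Cell {
        final String value;
        final long timestamp; // write timestamp supplied by the writer

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // The cell with the higher timestamp wins; ties fall back to comparing values.
    static Cell reconcile(Cell a, Cell b) {
        if (a.timestamp != b.timestamp) {
            return a.timestamp > b.timestamp ? a : b;
        }
        return a.value.compareTo(b.value) >= 0 ? a : b;
    }

    // Reconciles the answers of several replicas for the same column.
    static Cell reconcile(List<Cell> replicaAnswers) {
        Cell winner = replicaAnswers.get(0);
        for (Cell cell : replicaAnswers) {
            winner = reconcile(winner, cell);
        }
        return winner;
    }
}
```

Unlike vector clocks, this keeps no causal history: concurrent writes are never surfaced as conflicts, the later timestamp silently replaces the earlier one.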

Consistency Levels

can differ for writes and reads
should depend on your Replication Factor
ONE, TWO, THREE
ALL
{LOCAL_|EACH_}QUORUM
SERIAL - Lightweight transactions
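
Why the consistency level should depend on the replication factor is easiest to see with a little arithmetic: a quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to touch at least one replica that holds the latest acknowledged write when W + R > RF. The helper class QuorumMath below is hypothetical and only encodes these two formulas.

```java
// Arithmetic behind QUORUM and read/write overlap; a sketch, not driver or server code.
final class QuorumMath {
    // Number of replicas that form a quorum for the given replication factor.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when every read set must intersect every write set (W + R > RF).
    static boolean overlaps(int writeReplicas, int readReplicas, int replicationFactor) {
        return writeReplicas + readReplicas > replicationFactor;
    }
}
```

With RF = 3, QUORUM writes plus QUORUM reads give 2 + 2 > 3, so reads see the latest write; ONE plus ONE gives 1 + 1 <= 3, which is only eventually consistent.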

Cluster Architecture

Token Ring, hashes of partition keys (MD5 with RandomPartitioner; Murmur3Partitioner is the default since C* 1.2)
Consistent Hashing
Virtual nodes, since C* 1.2
Support for datacenters - Snitches
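
A token ring with consistent hashing and virtual nodes can be sketched with a sorted map from token to node. Assumptions in this sketch: the class TokenRingSketch is hypothetical, tokens are MD5 hashes interpreted as non-negative integers, and virtual node tokens are derived by hashing "node#i", whereas Cassandra assigns vnode tokens randomly.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Sketch of a consistent-hashing ring with virtual nodes; not Cassandra's partitioner.
final class TokenRingSketch {
    private final TreeMap<BigInteger, String> ring = new TreeMap<>(); // token -> node name

    // MD5 hash of the key, interpreted as a non-negative integer token.
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md5.digest(key.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Places `vnodes` tokens for the node on the ring.
    void addNode(String node, int vnodes) {
        for (int i = 0; i < vnodes; i++) {
            ring.put(token(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        ring.values().removeIf(node::equals);
    }

    // The owner is the node holding the first token at or after the key's token,
    // wrapping around to the smallest token at the end of the ring.
    String ownerOf(String partitionKey) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("empty ring");
        }
        Map.Entry<BigInteger, String> entry = ring.ceilingEntry(token(partitionKey));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }
}
```

Removing a node only reassigns the keys that node owned; all other keys keep their owner, which is the point of consistent hashing. Many small token ranges per node spread that reassigned load over the remaining nodes.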

JON + Cassandra

C* is a perfect fit for time series data
TTL feature - continuous forgetting
write throughput is enormous (300k metrics per minute with 2 nodes = 600k writes/minute)
we call it RHQ Storage Node

INSERT INTO x(..) VALUES (..) USING TTL 300;
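
The "continuous forgetting" effect of USING TTL can be imitated in a few lines. This is a sketch only: the class TtlStoreSketch is hypothetical, the current time is passed in explicitly to keep the example deterministic, and expired entries are simply hidden on read, whereas Cassandra turns them into tombstones that compaction removes later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalDouble;

// Toy model of per-write TTL for metric values; not Cassandra's implementation.
final class TtlStoreSketch {
    private static final class Entry {
        final double value;
        final long expiresAtSeconds;

        Entry(double value, long expiresAtSeconds) {
            this.value = value;
            this.expiresAtSeconds = expiresAtSeconds;
        }
    }

    private final Map<String, Entry> cells = new HashMap<>();

    // Rough equivalent of: INSERT ... USING TTL ttlSeconds, executed at time nowSeconds.
    void insert(String key, double value, long nowSeconds, int ttlSeconds) {
        cells.put(key, new Entry(value, nowSeconds + ttlSeconds));
    }

    // Returns the value only while its TTL has not elapsed.
    OptionalDouble read(String key, long nowSeconds) {
        Entry entry = cells.get(key);
        if (entry == null || nowSeconds >= entry.expiresAtSeconds) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(entry.value);
    }
}
```

For time series this means old raw metrics disappear on their own schedule, with no periodic DELETE job.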

JON + Storage Node

Foreman infrastructure [link]
Disk space benchmark [link]
Monitoring the status of the cluster
Alerting when disk space runs low or an operation fails
At least 1 Storage Node has to be installed
rhqctl install installs server, agent and storage node

Admin UI

Demo

Talk is cheap ;)

That's all, folks

This presentation

http://goo.gl/zs82jh
