Cassandra Intro





On Github beckje01 / tcjug-2014-cassandra-intro

@beckje01

http://bit.ly/tcjug-cassandra

Agenda

  • whoami
  • What is Cassandra
  • Glossary
  • Cassandra Current State
  • Connection Options
  • Data Modeling
  • Anti Patterns
  • Links

whoami

Jeff Beck

beckje01 on GitHub and Twitter

Tech Lead at ReachLocal

What is Cassandra

  • Highly available, distributed, tunable consistency
  • Master-less replication
  • Redundancy (replication factor) configurable per keyspace
  • Cluster can span DCs
  • Flexible schema

Consistency

Great talk on Cassandra consistency, Eventual Consistency != Hopeful Consistency

Video Slides

Glossary

  • CAP Theorem - Consistency, Availability, and Partition tolerance
  • Cluster, DataCenter, Rack
  • Partitioner
  • Thrift
  • Quorum, Local Quorum
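
The quorum terms above come down to simple arithmetic. A minimal sketch, assuming the usual definition of a quorum as a majority of replicas (floor(RF / 2) + 1); the class and method names are illustrative only and are not part of Cassandra or any driver.

```java
// Illustrative helper; QuorumMath is a hypothetical name, not a Cassandra API.
final class QuorumMath {

    // Number of replicas that must respond for a QUORUM read or write.
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        return replicationFactor / 2 + 1; // integer division gives floor(RF / 2)
    }

    // True when every read replica set must share at least one node with
    // every write replica set, so a read sees the latest acknowledged write.
    static boolean overlaps(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println("RF=" + rf + " quorum=" + q);
        System.out.println("QUORUM/QUORUM overlaps: " + overlaps(q, q, rf));
        System.out.println("ONE/ONE overlaps: " + overlaps(1, 1, rf));
    }
}
```

Local Quorum applies the same formula to the replication factor of the coordinator's own data center, so it avoids waiting on replicas in remote DCs.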

Glossary

  • DSE
  • Node Discovery
  • Load Balancing
  • Replication Strategy

Cassandra Today

  • Cassandra 2.0
  • CQL3
  • Java Native Driver
  • Murmur3Partitioner
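
To make the partitioner idea concrete: the partitioner hashes a partition key to a token, and the token decides which node owns the row. The sketch below is a toy ring, not Cassandra's implementation. It assumes one token per node and uses String.hashCode() as a stand-in for Murmur3; all names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy token ring; names are hypothetical and the hash is a stand-in, not Murmur3.
final class ToyTokenRing {
    // Sorted map of token -> node that owns the range ending at that token.
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(int token, String node) {
        ring.put(token, node);
    }

    // Stand-in for the partitioner: turn a partition key into a token.
    static int tokenFor(String partitionKey) {
        return partitionKey.hashCode();
    }

    // The owner is the first node whose token is >= the key's token;
    // past the highest token the ring wraps around to the lowest one.
    String ownerOf(String partitionKey) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes in ring");
        }
        Map.Entry<Integer, String> entry = ring.ceilingEntry(tokenFor(partitionKey));
        if (entry == null) {
            entry = ring.firstEntry();
        }
        return entry.getValue();
    }
}
```

Because the hash scatters keys across the ring, rows spread evenly over nodes, which is what an order-preserving partitioner gives up.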

Cassandra Common Setup

  • Astyanax
  • C* 1.2
  • Wide Column Design

Connection Options

  • Astyanax
  • Hector
  • CQL|Java (JDBC)
  • 4-5 less popular ones

Astyanax

Thrift based with varying compatibility. Also supports some CQL3 via Thrift.

See Astyanax Cassandra Compatibility

Hector

Not as popular anymore. Based on Thrift; provides connection pooling, etc.

CQL - JDBC Based Solutions

Avoid these. They are currently not cluster-aware, so a single node failure can cause problems, or you have to follow the bad practice of putting a load balancer in front of the cluster.

4-5 More Clients

There are many other clients and ORMs built around Java; check them all out here.

Java Native Driver

Based around CQL3 and a new binary protocol. Supports node discovery, load balancing and failover. Encourages CQL3 based data design.
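
To illustrate what client-side load balancing and failover mean, here is a toy round-robin policy that skips nodes marked as down. It is only a sketch of the idea, not the driver's actual policy classes; every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Toy client-side policy; hypothetical names, not the driver's implementation.
final class ToyRoundRobin {
    private final List<String> nodes;
    private final Set<String> down = ConcurrentHashMap.newKeySet();
    private final AtomicInteger next = new AtomicInteger();

    ToyRoundRobin(List<String> nodes) {
        this.nodes = new ArrayList<>(nodes); // defensive copy of the known nodes
    }

    void markDown(String node) {
        down.add(node);
    }

    void markUp(String node) {
        down.remove(node);
    }

    // Pick the next live node in rotation, skipping nodes marked down.
    String pick() {
        for (int i = 0; i < nodes.size(); i++) {
            // floorMod keeps the index valid even if the counter overflows
            int index = Math.floorMod(next.getAndIncrement(), nodes.size());
            String candidate = nodes.get(index);
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no live nodes");
    }
}
```

Because the client itself knows every node, losing one node only removes it from the rotation, and no load balancer is needed in front of the cluster.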

Java Native Driver

As of the first release, CQL prepared statements are claimed to be 10% faster than Thrift. You have to model your data more like a traditional DB. The driver is new and comes in two flavors.

Java Native Driver 1.x

Currently the generally available client; the docs are good and help you get started. You don't get all the great async work that has been done in 2.x.

Java Native Driver 2.x

The upgrade has many breaking changes and requires Cassandra 2.0 for the full feature set; many features, such as result set paging, will throw an exception if used against C* 1.2. The current release is 2.0.1, and it has nice async support with futures.

State of Astyanax

Netflix announced that they will be updating Astyanax to support both the new binary protocol and Thrift. Read here. There is even a very early beta of Astyanax over the Java driver.

Data Modeling in Cassandra

Model what you want to query for, not the data.

This part is hard; there will be mistakes.

Hints

  • Duplicate Data
  • Good Compound Primary Key
  • Hire a Consultant
  • Careful around adding an index
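
The first two hints can be sketched with an in-memory stand-in for two query-specific tables. This is a toy model, not driver code: each map plays the role of a table whose partition key is the map key and whose clustering key is kept sorted inside the partition. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// In-memory stand-in for two query-specific tables; all names are hypothetical.
final class ToyPersonStore {
    // "Table" 1: partition key = lastName, clustering key = firstName (sorted).
    private final Map<String, TreeSet<String>> byLastName = new HashMap<>();
    // "Table" 2: partition key = age, clustering key = "lastName,firstName" (sorted).
    private final Map<Integer, TreeSet<String>> byAge = new HashMap<>();

    // One logical insert duplicates the data into every table that serves a query.
    void insert(String firstName, String lastName, int age) {
        byLastName.computeIfAbsent(lastName, k -> new TreeSet<>()).add(firstName);
        byAge.computeIfAbsent(age, k -> new TreeSet<>()).add(lastName + "," + firstName);
    }

    // Query 1: everyone with a given last name, read from a single partition.
    List<String> firstNamesFor(String lastName) {
        List<String> result = new ArrayList<>();
        TreeSet<String> partition = byLastName.get(lastName);
        if (partition != null) {
            result.addAll(partition);
        }
        return result;
    }

    // Query 2: everyone of a given age, also a single-partition read.
    List<String> peopleAged(int age) {
        List<String> result = new ArrayList<>();
        TreeSet<String> partition = byAge.get(age);
        if (partition != null) {
            result.addAll(partition);
        }
        return result;
    }
}
```

Each query reads exactly one partition, at the cost of writing the same person twice. That is the trade the "Duplicate Data" hint is pointing at.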

Anti Patterns

  • Read Before Write
  • Load Balancer
  • Excessive Heap Space
  • Use of order-preserving partitioner

Slides

Links

Extras

CREATE KEYSPACE Simple
        WITH REPLICATION = {'class' : 'SimpleStrategy',
                            'replication_factor': 1};

CREATE TABLE Person (
    firstName varchar,
    lastName varchar,
    age int,
    PRIMARY KEY (lastName, firstName)
);

INSERT INTO Person (firstName, lastName, age) VALUES ('Jeff', 'Beck', 30);
INSERT INTO Person (firstName, lastName, age) VALUES ('Bob', 'Beck', 60);

CREATE TABLE alist (
    page varchar,
    visits list<timestamp>,
    PRIMARY KEY (page)
);

Append Item to list

UPDATE alist SET visits = visits + ['2014-04-14 19:34:20-0500'] WHERE page = '/fake';

Update.Assignments updateAssignments = QueryBuilder.update(keyspaceName, table).with();
for (PersistentProperty prop : persistentEntity.getPersistentProperties()) {
    updateAssignments = updateAssignments.and(
        QueryBuilder.set(prop.getName(), convertToCassandraType(entry.get(prop.getName()))));
}

Statement update = updateAssignments.where(QueryBuilder.eq("id", UUID.fromString(uuid.toString())));

session.execute(update);

      // Copy the blob column into a byte array, then deserialize it.
      ByteBuffer bb = row.getBytes(columnName);
      byte[] result = new byte[bb.remaining()];
      bb.get(result);
      // try-with-resources closes the stream even if readObject() throws
      try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(result))) {
          o = ois.readObject();
      }
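
For completeness, here is a standard-library-only sketch of the full round trip behind the snippet above: serializing an object into a ByteBuffer (the form a blob column value takes) and reading it back. BlobCodec and its methods are hypothetical names; only the java.io and java.nio calls are real APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;

// Hypothetical helper for Java-serialized blobs; not part of any driver.
final class BlobCodec {

    // Serialize a value into a buffer suitable for a blob column.
    static ByteBuffer toBlob(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return ByteBuffer.wrap(bytes.toByteArray());
    }

    // Deserialize a value from a blob buffer.
    static Object fromBlob(ByteBuffer blob) throws IOException, ClassNotFoundException {
        // duplicate() shares the bytes but has its own position,
        // so the caller's buffer can be read again afterwards.
        ByteBuffer copy = blob.duplicate();
        byte[] raw = new byte[copy.remaining()];
        copy.get(raw);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        }
    }
}
```

Java serialization ties the stored bytes to the class definition, so a format that is stable across versions may be a safer choice for long-lived data.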

Local Cassandra Meetup

http://www.meetup.com/Minneapolis-St-Paul-Cassandra-Meetup/
http://bit.ly/C-MSP