An Infinispan Tale

How split clusters can get back together happily

Tristan Tarrant / @tristantarrant

About me

Infinispan Project Lead

Open Source hacker since 1993

@ Red Hat since 2011

Agenda

What is Infinispan ?
How does Infinispan approach clustering ?
How does Infinispan handle cluster partitions ?

What is Infinispan ?

In-Memory Java Key/Value store (with Scala bits)
Local + Clustered
Elastic + Highly scalable
Transactional + Strongly consistent
Distributed under the ASL 2.0
Supported product: Red Hat JBoss Data Grid

What can Infinispan do ?

Optional persistence:File + LevelDB + JDBC + JPA + REST
Embedded + Client/Server
Remote Protocols: HotRod + REST + Memcached + WebSocket
Distributed execution framework
Indexed and Indexless queries
Events: local + clustered + remote
Security: ACL, Encryption, Audit

Does Infinispan scale ?

Yes.

Largest cluster in production:

320 nodes, 3000 caches, 20TB RAM

Largest cluster formed:

1000 nodes

Infinispan clustering

Transport

Based on JGroups
Multiprotocol: UDP + TCP
Discovery: Multicast + Unicast + S3 + Google + etc
Security
Geographical replication

Infinispan clustering

Cluster Topology

JGroup views
Represents group membership
Each node has a unique address
Ordered, first is coordinator
View changes when nodes join + leave
View Id

Infinispan clustering

Consistent Hashing

Key hashed using MurmurHash3 algorithm
Hash space divided into segments
Segments assigned to nodes
Key > Segment > Owners
Primary and Backup owners

Consistent Hashing Visualization

Step 1: Empty cluster

Consistent Hashing Visualization

Step 2: Adding one entry

Consistent Hashing Visualization

Step 3: Primary and Backup

Consistent Hashing Visualization

Step 4: Another one

Consistent Hashing Visualization

Step 5: You get the idea

Consistent Hashing Visualization

Step 6: Here's one I made earlier

Infinispan clustering

Controlling Consistent Hashing

Physical topology aware (site/rack/machine)
Grouping
Key Affinity service
Capacity Factor

Infinispan clustering

State Transfer

Elasticity
Nodes added / removed
We still have at least one owner
Rebalancing: moving segments to satisfy distribution

Rebalancing Visualization

Step 1: My cluster full of data

Rebalancing Visualization

Step 2: He's dead, Jim

Rebalancing Visualization

Step 3: A healed cluster

Disaster:

What if multiple nodes fail at once ?

CAP: The Theorem

Formulated by Eric Brewer in 1998
Consistency
Availability
Partitioning
Can only satisfy two properties at once

CAP: The Combinations

Consistency + Availability: The "Ideal World"
Partitioning + Availability: "I bend so I don't break"
Partitioning + Consistency: "Don't corrupt my data"

Partition handling strategies

Prefer availability

<distributed-cache>
    <partition-handling enabled="false"/>
</distributed-cache>
ConfigurationBuilder dcc =new ConfigurationBuilder();
dcc.clustering().partitionHandling().enabled(false);

Prefer consistency

<distributed-cache>
    <partition-handling enabled="true"/>
</distributed-cache>
ConfigurationBuilder dcc =new ConfigurationBuilder();
dcc.clustering().partitionHandling().enabled(true);

Split detection

Lost >= number of owners
Ensure a stable topology
Check segment ownership
Mark partition as available / degraded / unavailable
Send PartitionStatusChangedEvent to listeners

Cluster partitioning

Case 1: No data loss

Cluster partitioning

Case 2: Lost data

Merging split clusters

Split clusters see each other again
Ensure a stable topology
Automatic: new state based on partition state
- One unavailable partition > UNAVAILABLE
- One available partition > attempt merge
- All partitions are degraded > attempt merge
Manual
- Data was lost
- Custom listener on Merge event
- It's up to YOU

Future work

Automatic state reconciliation
Eventual consistency !!!

Thanks / Q&A

http://infinispan.org

@infinispan + @tristantarrant

https://github.com/infinispan

https://github.com/tristantarrant/infinispan-presentation-splitbrain

An Infinispan Tale – How split clusters can get back together happily

tristantarrant

An Infinispan Tale – How split clusters can get back together happily

0 1 (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/platform.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

infinispan-presentation-splitbrain

An Infinispan Tale

How split clusters can get back together happily

About me

Agenda

What is Infinispan ?

What can Infinispan do ?

Does Infinispan scale ?

Infinispan clustering

Transport

Infinispan clustering

Cluster Topology

Infinispan clustering

Consistent Hashing

Consistent Hashing Visualization

Step 1: Empty cluster

Consistent Hashing Visualization

Step 2: Adding one entry

Consistent Hashing Visualization

Step 3: Primary and Backup

Consistent Hashing Visualization

Step 4: Another one

Consistent Hashing Visualization

Step 5: You get the idea

Consistent Hashing Visualization

Step 6: Here's one I made earlier

Infinispan clustering

Controlling Consistent Hashing

Infinispan clustering

State Transfer

Rebalancing Visualization

Step 1: My cluster full of data

Rebalancing Visualization

Step 2: He's dead, Jim

Rebalancing Visualization

Step 3: A healed cluster

Disaster:

What if multiple nodes fail at once ?

CAP: The Theorem

CAP: The Combinations

Partition handling strategies

Split detection

Cluster partitioning

Case 1: No data loss

Cluster partitioning

Case 2: Lost data

Merging split clusters

Future work

Thanks / Q&A

0 1