Apache Cassandra, part 1 – principles, data model

Pros
- Good balance between functionality and usability.
- Powerful tool support.
- SQL has a feature-rich syntax.
- Set of widely accepted standards.
- Consistency.

Scalability
- RDBMS were mainstream for decades, until scalability requirements increased dramatically.
- The complexity of the data structures being processed also increased dramatically.

Scaling
Two ways to achieve scalability:
- Vertical scaling
- Horizontal scaling

Cons
- Cost of distributed transactions.
- No availability support. Two DBs with 99.9% availability each give a combined availability of about 100% - 2 * (100% - 99.9%) = 99.8% (about 86 min of downtime per month, versus about 43 min for a single DB).
- Additional synchronization overhead.
- As slow as the slowest DB node plus network latency.
- 2PC is a blocking protocol.
- It is possible to lock resources forever.
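The availability arithmetic above can be checked with a short sketch. It assumes a 30-day month and that a distributed transaction needs both databases to be up at the same time; the function names are illustrative only.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: a 30-day month


def serial_availability(*availabilities):
    """Availability of a system that needs every component to be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def downtime_minutes(availability):
    """Expected downtime per month for a given availability (0..1)."""
    return (1.0 - availability) * MINUTES_PER_MONTH


single = 0.999
both = serial_availability(single, single)
print(f"one DB:  {single:.3%}, {downtime_minutes(single):.1f} min/month")
print(f"two DBs: {both:.3%}, {downtime_minutes(both):.1f} min/month")
```

The exact product (0.999 * 0.999) is very close to the linear approximation 100% - 2 * 0.1% used on the slide.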

Cons
- Usage of master-slave replication.
- Makes the write side (master) a performance bottleneck and requires additional CPU/IO resources.
- There is no partition tolerance.

Sharding
- Feature sharding
- Hash code sharding
- Lookup table: the node that contains the lookup table is a performance bottleneck and a single point of failure.

Feature sharding
DB instances are divided by DB function.

Hash code sharding
Data is divided among DB instances by hash code ranges.
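A minimal sketch of hash code sharding, assuming MD5 as the hash function and equal-sized hash ranges per shard. Both choices and the function name `shard_for` are illustrative assumptions, not something the slides specify.

```python
import hashlib

HASH_SPACE = 2**128  # MD5 produces 128-bit values


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard by the hash range its hash code falls into."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # The last shard also absorbs the remainder of the integer division.
    return min(value // range_size, shard_count - 1)


for user in ("alice", "bob", "carol"):
    print(user, "-> shard", shard_for(user, 4))
```

Because the shard depends only on the hash code, related domain objects can land on different instances, which is why this scheme balances load well but gives no control at the domain logic level.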

Sharding consistency
For efficient sharding, data should be eventually consistent.

Feature vs. hash code sharding
- Feature sharding allows consistency to be tuned at the granularity of the domain logic, but load may not be well balanced.
- Hash code sharding gives good load balancing but does not allow consistency at the domain logic level.

Cassandra sharding
- Cassandra uses hash code load balancing.
- Cassandra is a better fit for reporting than for business logic processing.
- Cassandra + Hadoop == OLAP server with high performance and availability.

II. Apache Cassandra. Overview

Cassandra
- Amazon Dynamo (architecture)
  - DHT
  - Eventual consistency
  - Tunable trade-offs, consistency
- Google BigTable (data model)
  - Values are structured and indexed

Distributed and decentralized
- No master/slave nodes (server symmetry)
- No single point of failure

DHT
A distributed hash table (DHT) is a class of decentralized distributed system that provides a lookup service similar to a hash table: (key, value) pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key.

DHT
- Keyspace
- Keyspace partitioning
- Overlay network

Keyspace
An abstract keyspace, such as the set of 128-bit or 160-bit strings. A keyspace partitioning scheme splits ownership of this keyspace among the participating nodes.

Keyspace partitioning
- Keyspace distance function δ(k_1, k_2)
- A node with ID i_x owns all the keys k_m for which i_x is the closest ID, measured according to δ(k_m, i_x).

Keyspace partitioning
Imagine mapping the range from 0 to 2^128 onto a circle, so that the values wrap around.

Keyspace partitioning
Consider what happens if node C is removed.

Keyspace partitioning
Consider what happens if node D is added.
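The effect of removing or adding a node can be tried out with a small ring sketch. It assumes MD5-derived 128-bit tokens and the rule "a key belongs to the first node token at or after its hash, wrapping around the circle". The class `Ring` and the node names are illustrative, not Cassandra's actual implementation.

```python
import bisect
import hashlib


def token(name: str) -> int:
    """128-bit position on the ring (MD5 is an assumption of this sketch)."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


class Ring:
    def __init__(self, nodes=()):
        self._tokens = []  # node tokens, kept sorted
        self._owners = {}  # token -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        t = token(node)
        bisect.insort(self._tokens, t)
        self._owners[t] = node

    def remove(self, node):
        t = token(node)
        self._tokens.remove(t)
        del self._owners[t]

    def owner(self, key):
        # First node token >= key hash; index wraps to 0 past the last token.
        i = bisect.bisect_left(self._tokens, token(key))
        return self._owners[self._tokens[i % len(self._tokens)]]


ring = Ring(["A", "B", "C"])
keys = [f"key{i}" for i in range(1000)]
before = {k: ring.owner(k) for k in keys}
ring.remove("C")
after = {k: ring.owner(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys changed owner after removing C")
```

Only the keys that C owned move to a neighbor; keys owned by A and B stay where they were, which is the point of partitioning the keyspace this way.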

Overlay network
- For any key k, each node either has a node ID that owns k or has a link to a node whose node ID is closer to k.
- Greedy algorithm (not necessarily globally optimal): at each step, forward the message to the neighbor whose ID is closest to k.
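The greedy forwarding rule can be sketched as follows. The sketch assumes a tiny 8-bit ID space, circular distance as δ, and a link table in which every node knows its two ring neighbors plus one node across the ring. All of these are illustrative assumptions; real DHTs choose their links differently.

```python
ID_SPACE = 2**8  # small ID space for readability (the slides mention 128 or 160 bits)


def distance(a, b):
    """Circular distance between two IDs on the ring."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def build_links(node_ids):
    """Each node links to its ring predecessor, successor and one far node."""
    ids = sorted(node_ids)
    n = len(ids)
    return {
        node: {ids[(i - 1) % n], ids[(i + 1) % n], ids[(i + n // 2) % n]}
        for i, node in enumerate(ids)
    }


def route(links, start, k):
    """Forward greedily toward key k; returns the list of visited nodes."""
    path = [start]
    current = start
    while True:
        best = min(links[current], key=lambda n: distance(n, k))
        if distance(best, k) >= distance(current, k):
            return path  # no neighbor is closer: current node owns k
        current = best
        path.append(current)


links = build_links([10, 60, 120, 200, 250])
print(route(links, start=120, k=5))
```

Each hop strictly reduces the distance to k, so the loop terminates; with ring-neighbor links present, the node where it stops is one of the nodes closest to k.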

Elastic scalability
Adding or removing a node doesn't require reconfiguring Cassandra, changing application queries, or restarting the system.

High availability and fault tolerance
- Cassandra picks A and P from CAP
- Eventual consistency

Tunable consistency
- Replication factor (number of copies of each piece of data)
- Consistency level (number of replicas to access on every read/write operation)

Quorum consistency level
- R = N/2 + 1 (integer division)
- W = N/2 + 1 (integer division)
- R + W > N
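A short sketch of why R + W > N matters: when the condition holds, every possible set of R read replicas shares at least one replica with every possible set of W written replicas. The function names are illustrative; N is the replication factor.

```python
from itertools import combinations


def quorum(n: int) -> int:
    """Quorum size for replication factor n: N/2 + 1 with integer division."""
    return n // 2 + 1


def always_intersect(n: int, r: int, w: int) -> bool:
    """True if every read set of size r overlaps every write set of size w."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


for n in (3, 4, 5):
    q = quorum(n)
    print(f"N={n}: quorum={q}, R+W>N: {q + q > n}, overlap: {always_intersect(n, q, q)}")
```

With N = 3, R = 1 and W = 2 the sum equals N, and a read can hit the one replica that the write skipped.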

Hybrid orientation
- Column orientation
  - columns aren't fixed
  - columns can be sorted
  - columns can be queried for a certain range
- Row orientation
  - each row is uniquely identifiable by key
  - rows group columns and super columns

Schema-free
- You don't have to define columns when you create the data model.
- You think of the queries you will use and then organize the data around them.

High performance
Reading and writing with 50 GB of data:
- Cassandra: write 0.12 ms, read 15 ms
- MySQL: write 300 ms, read 350 ms

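Relational data model
[Diagram: a Database containing Table1 and Table2]

Cassandra data model
[Diagram: a Keyspace containing a Column Family with two rows. RowKey1: Column1 = Value1, Column2 = Value2, Column3 = Value3. RowKey2: Column1 = Value1, Column4 = Value4.]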

Keyspace
- A keyspace is close to a relational database.
- Basic attributes:
  - replication factor
  - replica placement strategy
  - column families (tables in the relational model)
- It is possible to create several keyspaces per application (for example, if you need a different replica placement strategy or replication factor).

Column family
- Container for a collection of rows
- A column family is close to a table in the relational data model
[Diagram: a Column Family row. RowKey: Column1 = Value1, Column2 = Value2, Column3 = Value3.]

Column family vs. Table
- The store represents a four-dimensional hash map: [Keyspace][ColumnFamily][Key][Column]
- The columns are not strictly defined in a column family, and you can freely add any column to any row at any time.
- A column family can hold columns or super columns (collections of subcolumns).
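The four-dimensional hash map view can be mimicked with nested dictionaries. This is only a mental model, not Cassandra's storage format; the keyspace, column family, row and column names below are made up for illustration.

```python
from collections import defaultdict


def nested(depth):
    """Dictionary that auto-creates inner dictionaries down to `depth` levels."""
    if depth == 1:
        return {}
    return defaultdict(lambda: nested(depth - 1))


# store[keyspace][column_family][row_key][column_name] = value
store = nested(4)
store["Blog"]["Users"]["alice"]["email"] = "alice@example.com"
store["Blog"]["Users"]["alice"]["city"] = "Kyiv"
# Another row in the same column family may carry a different set of columns.
store["Blog"]["Users"]["bob"]["twitter"] = "@bob"

for row_key, columns in store["Blog"]["Users"].items():
    print(row_key, sorted(columns))
```

Each row is its own map of columns, so no row is forced to share a column set with its neighbors.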

Column family vs. Table
- A column family has a comparator attribute which indicates how columns will be sorted in query results (as long, bytes, UTF8, etc.).
- Each column family is stored in a separate file on disk, so it's useful to keep related columns in the same column family.

Column
Basic unit of data structure
Column
- name: byte[]
- value: byte[]
- clock: long

Skinny and wide rows
- Wide rows: a huge number of columns and only a few rows (used to store lists of things)
- Skinny rows: a small number of columns and many different rows (close to the relational model)

Disadvantages of wide rows
- They work poorly with the row cache.
- If you have many rows and many columns, you end up with larger indexes (~40 GB of data and a 10 GB index).

Column sorting
- Column sorting is typically important only with the wide model.
- The comparator is an attribute of a column family that specifies how column names will be compared for sort order.

Comparator types
Cassandra has the following predefined types:
- AsciiType
- BytesType
- LexicalUUIDType
- IntegerType
- LongType
- TimeUUIDType
- UTF8Type
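A small sketch of what a comparator changes in a wide row: the same column names sort differently when compared as text and when compared as numbers, and a range query returns a contiguous slice of the sorted columns. The `WideRow` class is a hypothetical in-memory model, and plain Python key functions stand in for comparators such as UTF8Type and LongType.

```python
import bisect


class WideRow:
    """One row whose columns are kept sorted by a comparator key function."""

    def __init__(self, comparator=lambda name: name):
        self._comparator = comparator
        self._sort_keys = []  # comparator(name) for each column, kept sorted
        self._columns = []    # (name, value) pairs in the same order

    def insert(self, name, value):
        sort_key = self._comparator(name)
        i = bisect.bisect_left(self._sort_keys, sort_key)
        if i < len(self._sort_keys) and self._sort_keys[i] == sort_key:
            self._columns[i] = (name, value)  # existing column: overwrite
        else:
            self._sort_keys.insert(i, sort_key)
            self._columns.insert(i, (name, value))

    def slice(self, start, finish):
        """Columns whose names fall in [start, finish] under the comparator."""
        lo = bisect.bisect_left(self._sort_keys, self._comparator(start))
        hi = bisect.bisect_right(self._sort_keys, self._comparator(finish))
        return self._columns[lo:hi]


as_text = WideRow()                # compares names as strings
as_long = WideRow(comparator=int)  # compares names as integers
for name in ("10", "9", "100"):
    as_text.insert(name, "v" + name)
    as_long.insert(name, "v" + name)

print([n for n, _ in as_text.slice("0", "z")])
print([n for n, _ in as_long.slice("0", "1000")])
print([n for n, _ in as_long.slice("9", "10")])
```

The comparator therefore has to match the queries you plan to run, since range slices only make sense in the comparator's order.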

Super column
- Stores a map of subcolumns
- Cannot store a map of super columns (only one level deep)
Super column
- name: byte[]
- cols: Map<byte[], Column>

Five-dimensional hash: [Keyspace][ColumnFamily][Key][SuperColumn][SubColumn]

Super column
Sometimes it is useful to use composite keys instead of super columns:
- Need for more than one level of depth
- Performance issues

Super column family
Column families:
- Standard (default)
  - Can combine columns and super columns
- Super
  - Stricter schema constraints
  - Can store only super columns
  - A subcomparator can be specified for subcolumns

Note that
There are no joins in Cassandra, so you can
- join data on the client side
- create a denormalized second column family

TTL column type
- A TTL column is a column whose value expires after a given period of time.
- Useful for storing session tokens.
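The expiry behavior can be illustrated with a toy in-memory column. This is a sketch of the idea only: the `TtlColumn` class and its `now` parameter (used to make the example deterministic) are hypothetical and say nothing about how Cassandra implements expiry internally.

```python
import time


class TtlColumn:
    """Column whose value is no longer readable once its TTL has elapsed."""

    def __init__(self, name, value, ttl_seconds, now=None):
        self.name = name
        self.value = value
        created = time.time() if now is None else now
        self.expires_at = created + ttl_seconds

    def read(self, now=None):
        current = time.time() if now is None else now
        return self.value if current < self.expires_at else None


# A session token that should disappear after 30 minutes.
session = TtlColumn("session_token", "abc123", ttl_seconds=30 * 60, now=0)
print(session.read(now=10 * 60))  # still within the TTL
print(session.read(now=31 * 60))  # past the TTL
```

The application never has to delete the token explicitly; it simply stops being returned.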

Counter column
- In an eventually consistent environment, old versions of column values are overwritten by new ones, but counters should be cumulative.
- Counter columns are intended to support increment/decrement operations in an eventually consistent environment without losing any of them.

CounterColumn internals
CounterColumn structure:
name ...
[ (replicaId1, counter1, logical clock1),
  (replicaId2, counter2, logical clock2),
  ...
  (replicaId3, counter3, logical clock3) ]

CounterColumn write - before
UPDATE CounterCF SET count_me = count_me + 2 WHERE key = 'counter1'
[ (A, 10, 2), (B, 3, 4), (C, 6, 7) ]

CounterColumn write - after
A is leader
[ (A, 10 + 2, 2 + 1), (B, 3, 4), (C, 6, 7) ]
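The before/after example can be replayed with a small sketch. Shards are modeled as (replica ID, counter, logical clock) tuples as on the slides. The `merge` rule (keep the shard with the higher clock for each replica) and the rule that the counter value is the sum of all shards are assumptions of this sketch, not something the slides state.

```python
def increment(shards, leader, delta):
    """Apply an increment on the leader's shard and bump its logical clock."""
    return [
        (replica, count + delta, clock + 1) if replica == leader
        else (replica, count, clock)
        for replica, count, clock in shards
    ]


def merge(left, right):
    """Reconcile two versions: per replica, keep the shard with the higher clock."""
    newest = {}
    for replica, count, clock in list(left) + list(right):
        if replica not in newest or clock > newest[replica][1]:
            newest[replica] = (count, clock)
    return [(r, count, clock) for r, (count, clock) in sorted(newest.items())]


def total(shards):
    """Counter value as seen by a client: the sum over all replica shards."""
    return sum(count for _, count, _ in shards)


before = [("A", 10, 2), ("B", 3, 4), ("C", 6, 7)]
after = increment(before, leader="A", delta=2)
print(after, "total:", total(after))
print(merge(before, after) == after)
```

Because each replica only ever rewrites its own shard, a stale copy arriving later cannot undo an increment made elsewhere.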
