Unit 3: Cassandra
Apache Cassandra - An Introduction, Features of Cassandra, CQL Data types, CQLSH,
Keyspaces, CRUD (Create, Read, Update and Delete) Operations, Collections, Using a
Counter, Time to Live (TTL), Alter Commands, Import and Export, Querying System
Tables, Practice Examples
Unit -3 -Features of Cassandra, CQL Data types,  CQLSH, Keyspaces
What is Apache Cassandra?
• Apache Cassandra is an open-source, distributed, and decentralized
storage system (database) for managing very large amounts of structured data
spread out across the world.
• It provides highly available service with no single point of failure.
• Listed below are some of the notable points of Apache Cassandra −
• It is scalable, fault-tolerant, and tunably consistent.
• It is a wide-column store, often described as a column-oriented database.
• Its distribution design is based on Amazon’s Dynamo and its data model on
Google’s Bigtable.
• Created at Facebook, it differs sharply from relational database management
systems.
• Cassandra implements a Dynamo-style replication model with no single point
of failure, but adds a more powerful “column family” data model.
• Cassandra is used by some of the biggest companies, such as Facebook,
Twitter, Cisco, Rackspace, eBay, Netflix, and more.
NoSQL Database
• A NoSQL database (sometimes read as “Not Only SQL”) is a database
that provides a mechanism to store and retrieve data other than the tabular
relations used in relational databases.
• These databases are typically schema-free, support easy replication, have simple APIs,
are eventually consistent, and can handle huge amounts of data.
• The primary objective of a NoSQL database is to have
• simplicity of design,
• horizontal scaling, and
• finer control over availability.
• NoSQL databases use different data structures compared to relational databases.
• This makes some operations faster in NoSQL.
• The suitability of a given NoSQL database depends on the problem it must solve.
• Besides Cassandra, we have the following NoSQL databases that
are quite popular −
• Apache HBase −
• HBase is an open-source, non-relational, distributed database modeled after
Google’s Bigtable and written in Java.
• It is developed as part of the Apache Hadoop project and runs on top of HDFS,
providing Bigtable-like capabilities for Hadoop.
• MongoDB −
• MongoDB is a cross-platform document-oriented database system that
avoids using the traditional table-based relational database structure in favor
of JSON-like documents with dynamic schemas making the integration of
data in certain types of applications easier and faster.
Features of Cassandra
•Cassandra has become popular because of its technical
features.
•Elastic scalability − Cassandra is highly scalable; it allows you to add more hardware to
accommodate more customers and more data as required.
•Always on architecture − Cassandra has no single point of failure and it is
continuously available for business-critical applications that cannot afford a failure.
•Fast linear-scale performance − Cassandra is linearly scalable, i.e., it increases
your throughput as you increase the number of nodes in the cluster. Therefore it
maintains a quick response time.
•Flexible data storage − Cassandra accommodates structured, semi-structured,
and unstructured data. It
can dynamically accommodate changes to your data structures according to
your need.
•Easy data distribution − Cassandra provides the flexibility to distribute data where
you need by replicating data across multiple data centers.
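•Transaction support − Cassandra does not offer full ACID (Atomicity, Consistency,
Isolation, and Durability) transactions the way relational databases do. Writes are
atomic, isolated, and durable at the level of a single partition, consistency is
tunable, and lightweight transactions provide compare-and-set operations.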
•Fast writes − Cassandra was designed to run on cheap commodity hardware. It
performs blazingly fast writes and can store hundreds of terabytes of data, without
sacrificing the read efficiency.
APPLICATIONS
a. Cassandra Storage
• One of the major applications of Cassandra is storage.
• Its broad data model lets users store many kinds of data.
• This data is distributed across the nodes of a Cassandra cluster.
• Cisco WebEx, InWorldz, Formspring, and OpenX are some of the companies using Cassandra for storage.
b. Back-end development applications
• Users can also use Cassandra as the back end of their applications.
• Many software applications have a front end and a back end.
• Cassandra provides a platform for developing the back end and serves as its database.
• Talentica Software uses Cassandra as a back end for analytics.
c. Cassandra Monitoring
• Many applications track user activity at large scale.
• Developers can use Cassandra to monitor this user activity.
• The activity can be tracked along different parameters, such as media, art, or music.
• CERN, Cloudkick, and other organizations use Cassandra for monitoring.
d. Time-series-based applications
• Time-series-based applications record data in real time.
• Examples include hits on a web browser, traffic-light data, and GPS location tracking data.
• These applications require write-heavy systems.
• Cassandra is well suited to these kinds of applications.
e. Cassandra Analytics
• Cassandra provides a platform to analyse data collected from various sources.
• These sources may include social media, product feedback catalogues, retail inputs and lookups.
• Developers can use Cassandra to retrieve and analyse this data.
• Ooyala uses Cassandra for analytics applications.
f. Cassandra Messaging
• Nowadays, people use messaging services all the time.
• This creates a need for a platform to manage the message data.
• Cassandra can serve as the database platform for messaging providers.
Cassandra Architecture
• Cassandra is designed with hardware failure in mind.
• It therefore has built-in mechanisms to tolerate such
failures.
• Its nodes are logically arranged in a ring.
• All nodes are peers: there are no master or slave nodes.
• It replicates data across several homogeneous
nodes of the cluster.
• Nodes exchange state information with one another
every second.
• A sequentially written commit log on each node
captures write activity to ensure data durability.
• The data is then indexed and written to an in-memory structure called the memtable.
• Once the memtable is full, its data is written to disk in an SSTable
data file.
• All data is partitioned and replicated to other nodes
automatically.
• Through a process known as compaction, Cassandra
periodically merges SSTables and removes outdated data.
• A client can send read and write requests to any node in the
cluster.
What is Cassandra Architecture?
Storage Components
Key Terms Of Cassandra Architecture
a. Cassandra Nodes
• A node is the basic unit of Cassandra.
• Data is stored on nodes (computers/servers).
b. Cassandra Data Center
• A Cassandra data center is a collection of related Cassandra nodes.
• More generally, a data center is a centralized place that houses the computing and networking
systems serving an organization’s information technology needs.
c. Cassandra Rack
• A rack is a unit that holds multiple servers stacked on top of one another.
• A node is a single server in a rack.
d. Cassandra Cluster
• A cluster is made up of one or more data centers.
• It can span multiple physical locations.
e. Cassandra Commit log
• Every write operation is first recorded in the commit log to ensure the durability of the data.
• After the data has been flushed to an SSTable, the corresponding commit log data can be archived, deleted, or recycled.
• It serves as a crash recovery mechanism.
f. MemTables
• A memtable is an in-memory structure to which data is written during inserts, updates, and
deletions.
• Data is written to the memtable after it has been written to the commit log.
• When a memtable is full, it is flushed to disk as an SSTable.
g. SSTables
• SSTables are the immutable data files to which Cassandra periodically writes
memtables.
• They are written sequentially and never modified in place; new data goes into
new SSTables, which keeps disk writes sequential.
h. Data Replication
• If one of the nodes in a data center goes down, part
of the information could be lost.
• To avoid this, Cassandra keeps copies of the data on several
nodes. This is called replication.
• This ensures fault tolerance and reliability.
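The idea of placing copies of a row on several nodes of the ring can be sketched in a few lines of Python. This is a minimal illustration, not Cassandra's implementation: the helper names `token` and `replicas_for`, the node names, and the small 16-bit token space are all assumptions made for the example (Cassandra uses its own partitioner and much larger token ranges). The placement rule shown, taking the owning node and then the next nodes clockwise, resembles what a simple, rack-unaware strategy does.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a ring position (hypothetical 16-bit token space)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:2], "big")


def replicas_for(key, nodes, replication_factor):
    """Return the nodes that store `key`: the owner plus the next nodes clockwise."""
    ring = sorted((token(name), name) for name in nodes)
    tokens = [t for t, _ in ring]
    # The owner is the first node whose token is >= the key's token (wrapping around).
    start = bisect.bisect_left(tokens, token(key)) % len(ring)
    # There cannot be more replicas than there are nodes.
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


nodes = ["node-a", "node-b", "node-c", "node-d"]   # hypothetical node names
print(replicas_for("user:42", nodes, 3))           # three distinct nodes hold the row
```

If one of the returned nodes goes down, the other two still hold the row, which is the fault tolerance described above.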
Cassandra Query Language
Users can access Cassandra through its nodes using Cassandra Query Language (CQL). CQL
treats the database (keyspace) as a container of tables. Programmers work with CQL through cqlsh, an
interactive prompt, or through separate application language drivers.
Clients can approach any of the nodes for their read-write operations. That node (the coordinator) acts as
a proxy between the client and the nodes holding the data.
Write Operations
Every write is first captured by the commit log on the node. The data
is then stored in the memtable. Whenever the memtable is full, data is
written to an SSTable data file. All writes are automatically partitioned and replicated
throughout the cluster. Cassandra periodically consolidates the SSTables, discarding
unnecessary data.
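The write path described above can be modelled with a small Python class. This is a toy sketch under stated assumptions, not how Cassandra is implemented: the class name `Node`, the `memtable_limit` threshold counted in entries, and clearing the whole commit log on flush are simplifications chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of one node's write path (not Cassandra's real implementation)."""
    memtable_limit: int = 2                          # hypothetical flush threshold (entries)
    commit_log: list = field(default_factory=list)   # append-only durability log
    memtable: dict = field(default_factory=dict)     # in memory, latest value per key
    sstables: list = field(default_factory=list)     # immutable files, oldest first

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. record the write for crash recovery
        self.memtable[key] = value             # 2. apply it in memory
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                       # 3. memtable full: write it to disk

    def flush(self):
        # An SSTable is written once, sorted by key, and never modified afterwards.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable.clear()
        self.commit_log.clear()                # flushed data no longer needs the log

    def compact(self):
        # Merge all SSTables; values from newer files replace older ones.
        merged = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [dict(sorted(merged.items()))]


node = Node()
node.write("user:1", "alice")
node.write("user:2", "bob")         # second entry fills the memtable and triggers a flush
node.write("user:1", "alice_v2")
node.write("user:3", "carol")       # second flush
node.compact()                      # two SSTables become one; the old "alice" is dropped
print(node.sstables)
```

The sketch keeps only the ordering that matters here: commit log first, memtable second, SSTable on flush, and compaction to discard superseded values.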
Read Operations
During read operations, Cassandra gets values from the memtable and checks the Bloom filter
to find the appropriate SSTable that holds the required data.
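A Bloom filter answers "might this SSTable contain the key?" without reading the file: it can return false positives but never false negatives. The following Python sketch shows how such a filter lets a read skip SSTables. The class and function names, the filter size, and the number of hash functions are assumptions for illustration, and the read is simplified: it returns the first match instead of merging versions by timestamp as a real read would.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size=64, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, key):
        # Derive several bit positions from the key by salting the hash input.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    def __init__(self, rows):
        self.rows = dict(rows)          # treated as immutable once written
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)


def read(key, memtable, sstables):
    """Check the memtable first, then SSTables from newest to oldest."""
    if key in memtable:
        return memtable[key]
    for table in reversed(sstables):
        # Only look inside an SSTable when its filter says the key may be there.
        if table.bloom.might_contain(key) and key in table.rows:
            return table.rows[key]
    return None


sstables = [SSTable({"a": 1}), SSTable({"a": 2, "b": 3})]   # oldest first
print(read("a", {}, sstables))
```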
What is Cassandra Keyspace?
• In the Cassandra data model, a keyspace is a container for
data.
• It has several attributes. The basic ones are:
• a. Replication Factor
• The number of copies of each piece of data, in other words the number of nodes in the
cluster that hold a copy of it.
• b. Replica Placement Strategy
• The available strategies are
• simple strategy (rack-unaware strategy),
• old network topology strategy (rack-aware strategy),
• network topology strategy (datacenter-shard strategy).
• c. Cassandra Column Families
• A column family in Cassandra is a collection of rows, each of which contains ordered columns.
Column families represent the structure of the stored data and are
contained in a keyspace.
• A keyspace usually contains one or more column families.
• Each row is in turn a collection of columns.
• The column is the basic unit of the data structure in Cassandra.
• Each column stores three values:
• the key (column name), the value, and a timestamp.
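The timestamp stored with each column is what allows different versions of the same column to be reconciled: the most recent write wins. A minimal Python sketch of that idea follows. The names `Column` and `reconcile` are hypothetical, and tie-breaking for equal timestamps is simplified here (the first argument is kept).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str        # column name (key)
    value: str       # stored value
    timestamp: int   # write time, e.g. microseconds since the epoch


def reconcile(a, b):
    """Pick the winning version of the same column: the higher timestamp wins."""
    if a.name != b.name:
        raise ValueError("can only reconcile two versions of the same column")
    return a if a.timestamp >= b.timestamp else b


old = Column("email", "alice@example.com", timestamp=1_000)
new = Column("email", "alice@example.org", timestamp=2_000)
print(reconcile(old, new).value)   # the later write is kept
```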
CQL Data Type
CQLSH
• cqlsh: the CQL shell
• cqlsh is a command line shell for interacting with Cassandra through CQL (the
Cassandra Query Language).
• It is shipped with every Cassandra package, and can be found in the bin/
directory alongside the cassandra executable.
• cqlsh utilizes the Python native protocol driver, and connects to the single node
specified on the command line.
Cqlsh Commands
Cqlsh has a few commands that allow users to interact with it.
• HELP − Displays help topics for all cqlsh commands.
• CAPTURE − Captures the output of a command and adds it to a file.
• CONSISTENCY − Shows the current consistency level, or sets a new consistency level.
• COPY − Copies data to and from Cassandra.
• DESCRIBE − Describes the current cluster of Cassandra and its objects.
• EXPAND − Expands the output of a query vertically.
• EXIT − Using this command, you can terminate cqlsh.
• PAGING − Enables or disables query paging.
• SHOW − Displays the details of current cqlsh session such as Cassandra version, host, or
data type assumptions.
• SOURCE − Executes a file that contains CQL statements.
• TRACING − Enables or disables request tracing.
CQL Data Definition Commands
• CREATE KEYSPACE − Creates a KeySpace in Cassandra.
• USE − Connects to a created KeySpace.
• ALTER KEYSPACE − Changes the properties of a KeySpace.
• DROP KEYSPACE − Removes a KeySpace
• CREATE TABLE − Creates a table in a KeySpace.
• ALTER TABLE − Modifies the column properties of a table.
• DROP TABLE − Removes a table.
• TRUNCATE − Removes all the data from a table.
• CREATE INDEX − Defines a new index on a single column of a
table.
• DROP INDEX − Deletes a named index.
CQL Data Manipulation Commands
• INSERT − Adds columns for a row in a table.
• UPDATE − Updates a column of a row.
• DELETE − Deletes data from a table.
• BATCH − Executes multiple DML statements at once.
CQL Clauses
• SELECT − Reads data from a table.
• WHERE − Used along with SELECT to read
specific data.
• ORDER BY − Used along with SELECT to read
data in a specific order.
KEYSPACES
Within a keyspace, tables can be defined.
[Figure: a keyspace containing several tables]
•CREATE KEYSPACE <keyspace_name> WITH replication =
{'class': '<strategy name>', 'replication_factor': <number of replicas>};
•CREATE KEYSPACE <keyspace_name> WITH replication =
{'class': '<strategy name>', 'replication_factor': <number of replicas>}
AND durable_writes = <boolean value>;
•The CREATE KEYSPACE statement has two properties:
replication and durable_writes.
Creating a Keyspace using Cqlsh
• A keyspace in Cassandra is a namespace that defines data replication
on nodes.
• A cluster typically contains one keyspace per application.
• Given below is the syntax for creating a keyspace using the statement
CREATE KEYSPACE.
• CREATE KEYSPACE <identifier> WITH <properties>
Replication
• The replication option specifies the replica placement strategy and the number of
replicas wanted. The following replica placement strategies are available:
• SimpleStrategy − Specifies a simple replication factor for the cluster.
• NetworkTopologyStrategy − Lets you set the replication factor for each data center independently.
• OldNetworkTopologyStrategy − A legacy replication strategy.
• The durable_writes option tells Cassandra whether to use the commit log for updates on the
current keyspace. It is optional and defaults to true.
•Given below is an example of creating a KeySpace.
•Here we are creating a KeySpace named DATABASE1. We are using
the first replica placement strategy, i.e., SimpleStrategy, and we are
setting the replication factor to 1 replica.
cqlsh> CREATE KEYSPACE DATABASE1 WITH replication
={'class':'SimpleStrategy', 'replication_factor' : 1};
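When keyspaces are created from a script rather than typed into cqlsh, the statement can be assembled programmatically. The Python helper below is a hypothetical sketch (the function name `create_keyspace_cql` and its parameters are not part of any Cassandra tool); it only builds the CQL text, following the syntax shown above, and leaves executing it to cqlsh or a driver.

```python
import re


def create_keyspace_cql(name, strategy="SimpleStrategy", replication_factor=1,
                        datacenters=None, durable_writes=True):
    """Build a CREATE KEYSPACE statement as a string (hypothetical helper)."""
    # Accept only plain unquoted identifiers to avoid building unsafe statements.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        raise ValueError(f"unsupported keyspace name: {name!r}")
    options = {"class": strategy}
    if strategy == "NetworkTopologyStrategy":
        # One replica count per data center, e.g. {"datacenter1": 3}.
        if not datacenters:
            raise ValueError("NetworkTopologyStrategy needs per-datacenter counts")
        options.update(datacenters)
    else:
        options["replication_factor"] = replication_factor
    pairs = ", ".join(
        f"'{key}': '{value}'" if isinstance(value, str) else f"'{key}': {value}"
        for key, value in options.items()
    )
    return (f"CREATE KEYSPACE {name} WITH replication = {{{pairs}}} "
            f"AND durable_writes = {str(durable_writes).lower()};")


print(create_keyspace_cql("database1", replication_factor=1))
print(create_keyspace_cql("test", "NetworkTopologyStrategy",
                          datacenters={"datacenter1": 3}, durable_writes=False))
```

The printed statements can be pasted into cqlsh or saved to a file and run with the SOURCE command.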
Verification
•You can verify whether the keyspace was created using the
DESCRIBE command.
•If you use this command over keyspaces, it will display all the
keyspaces created as shown below. Unquoted keyspace names are stored in lowercase.
•cqlsh> DESCRIBE keyspaces;
database1 system system_traces
Durable_writes
•By default, the durable_writes property of a keyspace is set to true;
however, it can be set to false. Avoid setting it to false on a keyspace that uses
SimpleStrategy, since writes that bypass the commit log can be lost.
Example
•Given below is the example demonstrating the usage of
durable writes property.
•cqlsh> CREATE KEYSPACE test
... WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3 }
... AND DURABLE_WRITES = false;
Verification
•You can verify whether the durable_writes property of test
KeySpace was set to false by querying the System Keyspace.
This query gives you all the KeySpaces along with their
properties.
•cqlsh> SELECT * FROM system_schema.keyspaces;
Using a Keyspace
•You can switch to an existing KeySpace using the keyword USE. Its
syntax is as follows −
•Syntax: USE <identifier>
Example
•In the following example, we are using the KeySpace
DATABASE1.
•cqlsh> USE DATABASE1;
•cqlsh:database1>
Altering a KeySpace
• ALTER KEYSPACE can be used to alter properties such as the number of
replicas and the durable_writes of a KeySpace. Given below is the syntax of
this command.
Syntax
ALTER KEYSPACE <identifier> WITH <properties>
i.e.
ALTER KEYSPACE <keyspace_name> WITH replication = {'class': '<strategy name>',
'replication_factor': <number of replicas>};
The properties of ALTER KEYSPACE are the same as those of CREATE KEYSPACE:
replication and durable_writes.
Example
•Here we are altering a KeySpace named DATABASE1.
•We are changing the replication factor from 1 to 3.
•cqlsh> ALTER KEYSPACE DATABASE1 WITH replication =
{'class':'SimpleStrategy', 'replication_factor' : 3};
•The following statement alters the test KeySpace and sets durable_writes back to true.
•cqlsh> ALTER KEYSPACE test WITH REPLICATION = {'class' :
'NetworkTopologyStrategy', 'datacenter1' : 3} AND
DURABLE_WRITES = true;
Dropping a Keyspace
• You can drop a KeySpace using the command DROP KEYSPACE.
Given below is the syntax for dropping a KeySpace.
Syntax
DROP KEYSPACE <identifier>
i.e.
DROP KEYSPACE <keyspace_name>
Example
cqlsh> DROP KEYSPACE DATABASE1;
CRUD Operation
Ad

More Related Content

Similar to Unit -3 -Features of Cassandra, CQL Data types, CQLSH, Keyspaces (20)

Appache Cassandra
Appache Cassandra  Appache Cassandra
Appache Cassandra
nehabsairam
 
Cassandra Learning
Cassandra LearningCassandra Learning
Cassandra Learning
Ehsan Javanmard
 
Big Data_Architecture.pptx
Big Data_Architecture.pptxBig Data_Architecture.pptx
Big Data_Architecture.pptx
betalab
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra
Nikiforos Botis
 
cassandra.pptx
cassandra.pptxcassandra.pptx
cassandra.pptx
BRINDHA256909
 
Why Cassandra?
Why Cassandra?Why Cassandra?
Why Cassandra?
Tayfun Sevimli
 
Column db dol
Column db dolColumn db dol
Column db dol
poojabi
 
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMCASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
IJCI JOURNAL
 
Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting data
Chen Robert
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overview
PritamKathar
 
Cassndra (4).pptx
Cassndra (4).pptxCassndra (4).pptx
Cassndra (4).pptx
NikhilAmauriya
 
Apache Cassandra overview
Apache Cassandra overviewApache Cassandra overview
Apache Cassandra overview
ElifTech
 
Dsm project-h base-cassandra
Dsm project-h base-cassandraDsm project-h base-cassandra
Dsm project-h base-cassandra
Shantanu Deshpande
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
Arunit Gupta
 
5266732.ppt
5266732.ppt5266732.ppt
5266732.ppt
hothyfa
 
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
raghdooosh
 
No sq lv1_0
No sq lv1_0No sq lv1_0
No sq lv1_0
Tuan Luong
 
cassandra_presentation_final
cassandra_presentation_finalcassandra_presentation_final
cassandra_presentation_final
SergioBruno21
 
Data Storage Management
Data Storage ManagementData Storage Management
Data Storage Management
Nisheet Mahajan
 
Using Apache Cassandra and Apache Kafka to Scale Next Gen Applications
Using Apache Cassandra and Apache Kafka to Scale Next Gen ApplicationsUsing Apache Cassandra and Apache Kafka to Scale Next Gen Applications
Using Apache Cassandra and Apache Kafka to Scale Next Gen Applications
Data Con LA
 
Appache Cassandra
Appache Cassandra  Appache Cassandra
Appache Cassandra
nehabsairam
 
Big Data_Architecture.pptx
Big Data_Architecture.pptxBig Data_Architecture.pptx
Big Data_Architecture.pptx
betalab
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra
Nikiforos Botis
 
Column db dol
Column db dolColumn db dol
Column db dol
poojabi
 
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMCASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
IJCI JOURNAL
 
Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting data
Chen Robert
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overview
PritamKathar
 
Apache Cassandra overview
Apache Cassandra overviewApache Cassandra overview
Apache Cassandra overview
ElifTech
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
Arunit Gupta
 
5266732.ppt
5266732.ppt5266732.ppt
5266732.ppt
hothyfa
 
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
raghdooosh
 
cassandra_presentation_final
cassandra_presentation_finalcassandra_presentation_final
cassandra_presentation_final
SergioBruno21
 
Using Apache Cassandra and Apache Kafka to Scale Next Gen Applications
Using Apache Cassandra and Apache Kafka to Scale Next Gen ApplicationsUsing Apache Cassandra and Apache Kafka to Scale Next Gen Applications
Using Apache Cassandra and Apache Kafka to Scale Next Gen Applications
Data Con LA
 

Recently uploaded (20)

Drugs in Anaesthesia and Intensive Care,.pdf
Drugs in Anaesthesia and Intensive Care,.pdfDrugs in Anaesthesia and Intensive Care,.pdf
Drugs in Anaesthesia and Intensive Care,.pdf
crewot855
 
03#UNTAGGED. Generosity in architecture.
03#UNTAGGED. Generosity in architecture.03#UNTAGGED. Generosity in architecture.
03#UNTAGGED. Generosity in architecture.
MCH
 
puzzle Irregular Verbs- Simple Past Tense
puzzle Irregular Verbs- Simple Past Tensepuzzle Irregular Verbs- Simple Past Tense
puzzle Irregular Verbs- Simple Past Tense
OlgaLeonorTorresSnch
 
Ajanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of HistoryAjanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of History
Virag Sontakke
 
What is the Philosophy of Statistics? (and how I was drawn to it)
What is the Philosophy of Statistics? (and how I was drawn to it)What is the Philosophy of Statistics? (and how I was drawn to it)
What is the Philosophy of Statistics? (and how I was drawn to it)
jemille6
 
LDMMIA Reiki Yoga S5 Daily Living Workshop
LDMMIA Reiki Yoga S5 Daily Living WorkshopLDMMIA Reiki Yoga S5 Daily Living Workshop
LDMMIA Reiki Yoga S5 Daily Living Workshop
LDM Mia eStudios
 
How to Add Customer Note in Odoo 18 POS - Odoo Slides
How to Add Customer Note in Odoo 18 POS - Odoo SlidesHow to Add Customer Note in Odoo 18 POS - Odoo Slides
How to Add Customer Note in Odoo 18 POS - Odoo Slides
Celine George
 
Overview Well-Being and Creative Careers
Overview Well-Being and Creative CareersOverview Well-Being and Creative Careers
Overview Well-Being and Creative Careers
University of Amsterdam
 
Kenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 CohortKenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 Cohort
EducationNC
 
TERMINOLOGIES,GRIEF PROCESS AND LOSS AMD ITS TYPES .pptx
TERMINOLOGIES,GRIEF PROCESS AND LOSS AMD ITS TYPES .pptxTERMINOLOGIES,GRIEF PROCESS AND LOSS AMD ITS TYPES .pptx
TERMINOLOGIES,GRIEF PROCESS AND LOSS AMD ITS TYPES .pptx
PoojaSen20
 
How to Configure Public Holidays & Mandatory Days in Odoo 18
How to Configure Public Holidays & Mandatory Days in Odoo 18How to Configure Public Holidays & Mandatory Days in Odoo 18
How to Configure Public Holidays & Mandatory Days in Odoo 18
Celine George
 
Myasthenia gravis (Neuromuscular disorder)
Myasthenia gravis (Neuromuscular disorder)Myasthenia gravis (Neuromuscular disorder)
Myasthenia gravis (Neuromuscular disorder)
Mohamed Rizk Khodair
 
How to Create Kanban View in Odoo 18 - Odoo Slides
How to Create Kanban View in Odoo 18 - Odoo SlidesHow to Create Kanban View in Odoo 18 - Odoo Slides
How to Create Kanban View in Odoo 18 - Odoo Slides
Celine George
 
Kumushini_Thennakoon_CAPWIC_slides_.pptx
Kumushini_Thennakoon_CAPWIC_slides_.pptxKumushini_Thennakoon_CAPWIC_slides_.pptx
Kumushini_Thennakoon_CAPWIC_slides_.pptx
kumushiniodu
 
MEDICAL BIOLOGY MCQS BY. DR NASIR MUSTAFA
MEDICAL BIOLOGY MCQS  BY. DR NASIR MUSTAFAMEDICAL BIOLOGY MCQS  BY. DR NASIR MUSTAFA
MEDICAL BIOLOGY MCQS BY. DR NASIR MUSTAFA
Dr. Nasir Mustafa
 
APGAR SCORE BY sweety Tamanna Mahapatra MSc Pediatric
APGAR SCORE  BY sweety Tamanna Mahapatra MSc PediatricAPGAR SCORE  BY sweety Tamanna Mahapatra MSc Pediatric
APGAR SCORE BY sweety Tamanna Mahapatra MSc Pediatric
SweetytamannaMohapat
 
Drive Supporter Growth from Awareness to Advocacy with TechSoup Marketing Ser...
Drive Supporter Growth from Awareness to Advocacy with TechSoup Marketing Ser...Drive Supporter Growth from Awareness to Advocacy with TechSoup Marketing Ser...
Drive Supporter Growth from Awareness to Advocacy with TechSoup Marketing Ser...
TechSoup
 
spinal cord disorders (Myelopathies and radiculoapthies)
spinal cord disorders (Myelopathies and radiculoapthies)spinal cord disorders (Myelopathies and radiculoapthies)
spinal cord disorders (Myelopathies and radiculoapthies)
Mohamed Rizk Khodair
 
Lecture 2 CLASSIFICATION OF PHYLUM ARTHROPODA UPTO CLASSES & POSITION OF_1.pptx
Lecture 2 CLASSIFICATION OF PHYLUM ARTHROPODA UPTO CLASSES & POSITION OF_1.pptxLecture 2 CLASSIFICATION OF PHYLUM ARTHROPODA UPTO CLASSES & POSITION OF_1.pptx
Lecture 2 CLASSIFICATION OF PHYLUM ARTHROPODA UPTO CLASSES & POSITION OF_1.pptx
Arshad Shaikh
 
Chemotherapy of Malignancy -Anticancer.pptx
Chemotherapy of Malignancy -Anticancer.pptxChemotherapy of Malignancy -Anticancer.pptx
Chemotherapy of Malignancy -Anticancer.pptx
Mayuri Chavan
 
Drugs in Anaesthesia and Intensive Care,.pdf
Drugs in Anaesthesia and Intensive Care,.pdfDrugs in Anaesthesia and Intensive Care,.pdf
Drugs in Anaesthesia and Intensive Care,.pdf
crewot855
 
03#UNTAGGED. Generosity in architecture.
03#UNTAGGED. Generosity in architecture.03#UNTAGGED. Generosity in architecture.
03#UNTAGGED. Generosity in architecture.
MCH
 
puzzle Irregular Verbs- Simple Past Tense
puzzle Irregular Verbs- Simple Past Tensepuzzle Irregular Verbs- Simple Past Tense
puzzle Irregular Verbs- Simple Past Tense
OlgaLeonorTorresSnch
 
Ajanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of HistoryAjanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of History
Virag Sontakke
 
What is the Philosophy of Statistics? (and how I was drawn to it)
What is the Philosophy of Statistics? (and how I was drawn to it)What is the Philosophy of Statistics? (and how I was drawn to it)
What is the Philosophy of Statistics? (and how I was drawn to it)
jemille6
 
LDMMIA Reiki Yoga S5 Daily Living Workshop
LDMMIA Reiki Yoga S5 Daily Living WorkshopLDMMIA Reiki Yoga S5 Daily Living Workshop
LDMMIA Reiki Yoga S5 Daily Living Workshop
LDM Mia eStudios
 
How to Add Customer Note in Odoo 18 POS - Odoo Slides
How to Add Customer Note in Odoo 18 POS - Odoo SlidesHow to Add Customer Note in Odoo 18 POS - Odoo Slides
How to Add Customer Note in Odoo 18 POS - Odoo Slides
Celine George
 
Overview Well-Being and Creative Careers
Overview Well-Being and Creative CareersOverview Well-Being and Creative Careers
Overview Well-Being and Creative Careers
University of Amsterdam
 
Kenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 CohortKenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 Cohort
EducationNC
 
TERMINOLOGIES,GRIEF PROCESS AND LOSS AMD ITS TYPES .pptx
TERMINOLOGIES,GRIEF PROCESS AND LOSS AMD ITS TYPES .pptxTERMINOLOGIES,GRIEF PROCESS AND LOSS AMD ITS TYPES .pptx
TERMINOLOGIES,GRIEF PROCESS AND LOSS AMD ITS TYPES .pptx
PoojaSen20
 
How to Configure Public Holidays & Mandatory Days in Odoo 18
How to Configure Public Holidays & Mandatory Days in Odoo 18How to Configure Public Holidays & Mandatory Days in Odoo 18
How to Configure Public Holidays & Mandatory Days in Odoo 18
Celine George
 
Myasthenia gravis (Neuromuscular disorder)
Myasthenia gravis (Neuromuscular disorder)Myasthenia gravis (Neuromuscular disorder)
Myasthenia gravis (Neuromuscular disorder)
Mohamed Rizk Khodair
 
How to Create Kanban View in Odoo 18 - Odoo Slides
How to Create Kanban View in Odoo 18 - Odoo SlidesHow to Create Kanban View in Odoo 18 - Odoo Slides
How to Create Kanban View in Odoo 18 - Odoo Slides
Celine George
 
Kumushini_Thennakoon_CAPWIC_slides_.pptx
Kumushini_Thennakoon_CAPWIC_slides_.pptxKumushini_Thennakoon_CAPWIC_slides_.pptx
Kumushini_Thennakoon_CAPWIC_slides_.pptx
kumushiniodu
 
MEDICAL BIOLOGY MCQS BY. DR NASIR MUSTAFA
MEDICAL BIOLOGY MCQS  BY. DR NASIR MUSTAFAMEDICAL BIOLOGY MCQS  BY. DR NASIR MUSTAFA
MEDICAL BIOLOGY MCQS BY. DR NASIR MUSTAFA
Dr. Nasir Mustafa
 
APGAR SCORE BY sweety Tamanna Mahapatra MSc Pediatric
APGAR SCORE  BY sweety Tamanna Mahapatra MSc PediatricAPGAR SCORE  BY sweety Tamanna Mahapatra MSc Pediatric
APGAR SCORE BY sweety Tamanna Mahapatra MSc Pediatric
SweetytamannaMohapat
 
Drive Supporter Growth from Awareness to Advocacy with TechSoup Marketing Ser...
Drive Supporter Growth from Awareness to Advocacy with TechSoup Marketing Ser...Drive Supporter Growth from Awareness to Advocacy with TechSoup Marketing Ser...
Drive Supporter Growth from Awareness to Advocacy with TechSoup Marketing Ser...
TechSoup
 
spinal cord disorders (Myelopathies and radiculoapthies)
spinal cord disorders (Myelopathies and radiculoapthies)spinal cord disorders (Myelopathies and radiculoapthies)
spinal cord disorders (Myelopathies and radiculoapthies)
Mohamed Rizk Khodair
 
Lecture 2 CLASSIFICATION OF PHYLUM ARTHROPODA UPTO CLASSES & POSITION OF_1.pptx
Lecture 2 CLASSIFICATION OF PHYLUM ARTHROPODA UPTO CLASSES & POSITION OF_1.pptxLecture 2 CLASSIFICATION OF PHYLUM ARTHROPODA UPTO CLASSES & POSITION OF_1.pptx
Lecture 2 CLASSIFICATION OF PHYLUM ARTHROPODA UPTO CLASSES & POSITION OF_1.pptx
Arshad Shaikh
 
Chemotherapy of Malignancy -Anticancer.pptx
Chemotherapy of Malignancy -Anticancer.pptxChemotherapy of Malignancy -Anticancer.pptx
Chemotherapy of Malignancy -Anticancer.pptx
Mayuri Chavan
 
Ad

Unit -3 -Features of Cassandra, CQL Data types, CQLSH, Keyspaces

  • 1. Unit -3 Cassandra Cassandra – Apache Cassandra - An Introduction, Features of Cassandra, CQL Data types, CQLSH, Keyspaces, CRUD (Create, Read, Update and Delete) Operations, Collections, Using a Counter, Time to Live (TTL), Alter Commands, Import and Export, Querying System Tables, Practice Examples
  • 3. What is Apache Cassandra? • Apache Cassandra is an opensource,distributed and decentralized/distributed storage system (database),for managing very large amounts of structured data spread out across the world. • It provides highly available service with no single point of failure. • Listed below are some of the notable points of Apache Cassandra − • It is scalable, fault-tolerant, and consistent. • It is a column-oriented database. • Its distribution design is basedon Amazon’s Dynamo and its data model on Google’s Bigtable. • Created at Facebook, it differs sharply from relational database management systems. • Cassandra implements a Dynamo-style replication model with no single point of failure, but adds a more powerful “column family” data model. • Cassandra is being used by some of the biggest companies such as Facebook, Twitter, Cisco, Rackspace, ebay, Twitter, Netflix, and more.
  • 5. NoSQL Database • A NoSQL database (sometimes called Not Only SQL) is a database that provides a mechanism to store and retrieve data other than the tabular relations used in relational databases. • These databases are schema-free, support easy replication, have a simple API, are eventually consistent, and can handle huge amounts of data. • The primary objectives of a NoSQL database are • simplicity of design, • horizontal scaling, and • finer control over availability. • NoSQL databases use different data structures from relational databases. • This makes some operations faster in NoSQL. • The suitability of a given NoSQL database depends on the problem it must solve.
  • 7. • Besides Cassandra, the following NoSQL databases are also quite popular − • Apache HBase − • HBase is an open-source, non-relational, distributed database modeled after Google’s Bigtable and written in Java. • It is developed as part of the Apache Hadoop project and runs on top of HDFS, providing Bigtable-like capabilities for Hadoop. • MongoDB − • MongoDB is a cross-platform, document-oriented database system. It avoids the traditional table-based relational structure in favor of JSON-like documents with dynamic schemas, which makes integrating data in certain types of applications easier and faster.
  • 9. Features of Cassandra • Cassandra has become popular because of the following technical features. • Elastic scalability − Cassandra is highly scalable; you can add more hardware to accommodate more customers and more data as required. • Always-on architecture − Cassandra has no single point of failure and is continuously available for business-critical applications that cannot afford a failure. • Fast linear-scale performance − Cassandra is linearly scalable, i.e., throughput increases as you increase the number of nodes in the cluster, so response times stay quick. • Flexible data storage − Cassandra accommodates structured, semi-structured, and unstructured data. It can dynamically accommodate changes to your data structures as your needs change. • Easy data distribution − Cassandra lets you distribute data where you need it by replicating data across multiple data centers. • Transaction support − Cassandra does not provide full relational-style ACID transactions. Writes are atomic and isolated at the level of a single partition, durability comes from the commit log, consistency is tunable per operation, and lightweight transactions offer compare-and-set semantics. • Fast writes − Cassandra was designed to run on cheap commodity hardware. It performs very fast writes and can store hundreds of terabytes of data without sacrificing read efficiency.
  • 12. a. Cassandra Storage • One of the major applications of Cassandra is storage. • Its flexible data model lets the user store many kinds of data. • This data is stored across the nodes of the cluster. Cisco WebEx, InWorldz, Formspring, and OpenX are some companies using Cassandra for storage. b. Back-end development applications • Users can also use Cassandra as the back end of their applications. • Many applications have a front end and a back end. • Cassandra serves as the database layer of the back end and can hold very large volumes of data. • Talentica Software uses it as a back end for analytics. c. Cassandra Monitoring • Many applications are driven by large-scale user activity. • Developers can use Cassandra to record and monitor this activity. • The activity can be tracked along different parameters, such as media, art, or music. CERN, Cloudkick, and other companies use Cassandra for monitoring. d. Time-series-based applications • Time-series-based applications record data as it arrives in real time. • Examples include hits on a website, traffic light data, and GPS location tracking data. • These applications are write-heavy. • Cassandra is well suited to these kinds of applications. e. Cassandra Analytics • Cassandra provides a platform to analyse data collected from various sources. • These sources may include social media, product feedback catalogues, retail inputs, and lookups. • Developers can use Cassandra to retrieve and analyse this data. • Ooyala uses Cassandra for analytics applications. f. Cassandra Messaging • People use messaging services all the time. • This creates a need for a platform to manage the message data. • Cassandra can act as the database for messaging providers.
  • 14. What is Cassandra Architecture? • Cassandra is designed on the assumption that hardware will fail. • It therefore has contingency mechanisms so that a failed node does not cause data loss or downtime. • It has a ring-type structure, i.e., its nodes are logically arranged like a ring. • It has no master or slave nodes; all nodes are peers. • It makes replicas of data on several homogeneous nodes of the cluster. • Nodes exchange state information with one another every second. • A sequentially written commit log on each node captures write activity to ensure data durability. • The data is then indexed and written to the memtable. • Once the memtable is full, its data is written to disk as an SSTable data file. • All data is partitioned and replicated to other nodes automatically. • Through a process known as compaction, Cassandra periodically merges SSTables and removes outdated data. • A client can send a read or write request to any node in the cluster.
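The write path described above (commit log, then memtable, then SSTable, with periodic compaction) can be sketched in a few lines of Python. This is a toy model for illustration only, not Cassandra's implementation: the class name `Node`, the `memtable_limit` parameter, and the use of plain dictionaries as stand-ins for on-disk files are all assumptions made for this sketch.

```python
class Node:
    """Toy model of one node's write path (not Cassandra's real code)."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []   # sequential log, written first for durability
        self.memtable = {}     # in-memory buffer of recent writes
        self.sstables = []     # immutable "files", modelled as sorted dicts
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. append to the commit log
        self.memtable[key] = value            # 2. update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 3. memtable full: flush to disk

    def flush(self):
        # Freeze the memtable, sorted by key, as a new SSTable.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs the log

    def compact(self):
        # Merge all SSTables into one; newer tables override older values.
        merged = {}
        for table in self.sstables:  # oldest first
            merged.update(table)
        self.sstables = [dict(sorted(merged.items()))]

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest first
            if key in table:
                return table[key]
        return None
```

Writing the same key twice across two flushes leaves two versions on disk until `compact()` merges them, which is why reads consult the newest SSTable first.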
  • 16. Key Terms of Cassandra Architecture a. Cassandra Node • The node is the basic unit of Cassandra. • Data is stored on these units (each a computer or server). b. Cassandra Data Center • A Cassandra data center is a collection of related Cassandra nodes. • More generally, a data center is a centralized place that houses the computer and networking systems serving an organization’s information technology needs. c. Cassandra Rack • A rack is a unit that holds multiple servers stacked on top of one another. • A node is a single server in a rack. d. Cassandra Cluster • One or more data centers form a Cassandra cluster. • A cluster can span several physical locations. e. Cassandra Commit Log • Every write operation is first recorded in the commit log to ensure the durability of the data. • Once its data has been flushed to an SSTable, the corresponding part of the commit log can be archived, deleted, or recycled. • It serves as a crash recovery mechanism.
  • 17. f. Memtables • A memtable is an in-memory structure that buffers writes, including updates and deletions. • Data is written to the memtable after it has been written to the commit log. • When a memtable is full, it is flushed to disk as an SSTable. g. SSTables • SSTables are the data files on disk to which Cassandra periodically flushes memtables. • They are immutable: once written, an SSTable is never modified, and new data goes into new SSTables, which keeps disk writes sequential. h. Data Replication • If data lived on only one node and that node went down, part of the information would be lost. • To overcome this limitation, Cassandra keeps replicas of the data on several nodes. This is called replication. • It ensures fault tolerance and reliability.
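As a rough illustration of replication on a ring, the sketch below places each node at a hashed position and stores a key on the first node at or after the key's position plus the next nodes clockwise. It is a simplification in the spirit of the simple placement strategy, under stated assumptions: the function names `token` and `replicas`, the node names, and the use of MD5 for positions are choices made for this example, not Cassandra's actual partitioner.

```python
import hashlib
from bisect import bisect_left


def token(value: str) -> int:
    """Map a string to a position on the ring (illustrative hash choice)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def replicas(key, nodes, replication_factor):
    """Return the nodes that would hold copies of `key`."""
    ring = sorted(nodes, key=token)          # nodes ordered around the ring
    tokens = [token(n) for n in ring]
    # First node whose token is >= the key's token, wrapping around.
    start = bisect_left(tokens, token(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]


nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names
owners = replicas("user:42", nodes, replication_factor=3)
print(owners)
```

If the first owner goes down, the remaining owners still hold copies, which is the fault tolerance the slide refers to.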
  • 18. Cassandra Query Language • Users access Cassandra through its nodes using the Cassandra Query Language (CQL). • CQL treats the database (keyspace) as a container of tables. • Programmers work with CQL either through cqlsh, an interactive prompt, or through application language drivers. • Clients can approach any node for their read and write operations. That node (the coordinator) acts as a proxy between the client and the nodes holding the data. Write Operations • Every write is first captured by the commit log on the node. • The data is then stored in the memtable. • Whenever the memtable is full, its data is written to an SSTable data file. • All writes are automatically partitioned and replicated throughout the cluster. • Cassandra periodically consolidates the SSTables, discarding unnecessary data. Read Operations • During a read, Cassandra gets values from the memtable and checks the Bloom filter to find the SSTables that may hold the required data.
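The role of the Bloom filter in the read path can be sketched as follows. A Bloom filter can answer "definitely not present" or "possibly present", so a read can skip SSTables that cannot contain the key. This is a minimal sketch under assumptions: the class names, the filter size, the number of hash functions, and the `disk_reads` counter are all illustrative and do not reflect Cassandra's internal structures.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    """Immutable table with its own filter; dict stands in for a disk file."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.filter = BloomFilter()
        for key in self.rows:
            self.filter.add(key)
        self.disk_reads = 0  # counts how often the "file" was actually read

    def get(self, key):
        if not self.filter.might_contain(key):
            return None           # definitely absent: skip the disk read
        self.disk_reads += 1
        return self.rows.get(key)


def read(key, memtable, sstables):
    """Check the memtable first, then SSTables from newest to oldest."""
    if key in memtable:
        return memtable[key]
    for table in reversed(sstables):
        value = table.get(key)
        if value is not None:
            return value
    return None
```

The filter never hides a key that is present; at worst it causes an unnecessary disk read for a key that is absent.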
  • 19. What is a Cassandra Keyspace? • In the Cassandra data model, a keyspace is a container for data. • It has several attributes; the basic ones are: • a. Replication Factor • The number of copies of each piece of data, in other words the number of nodes in the cluster that hold a replica of it. • b. Replica Placement Strategy • The available strategies are • simple strategy (rack-unaware strategy), • old network topology strategy (rack-aware strategy), • network topology strategy (datacenter-shard strategy). • c. Cassandra Column Families • A column family in Cassandra is a collection of rows, each containing ordered columns. Column families represent the structure of the stored data and are contained in a keyspace. • A keyspace normally holds at least one column family.
  • 20. • Each row in a column family is in turn a collection of columns. • The column is the basic unit of the data structure in Cassandra. • A column stores three values: • the key or column name, the value, and a timestamp.
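The column triple (name, value, timestamp) can be written down as a small Python type. The timestamp is what lets replicas agree on the most recent value when two versions of the same column meet. This is a hedged sketch: the names `Column` and `reconcile` are invented for this example, and the tie-breaking rule is simplified compared with what a real cluster does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    value: str
    timestamp: int  # write time supplied with the write, e.g. microseconds


def reconcile(a: Column, b: Column) -> Column:
    """Keep the version with the newer timestamp (ties simplified: keep a)."""
    return a if a.timestamp >= b.timestamp else b


older = Column("email", "old@example.com", timestamp=100)
newer = Column("email", "new@example.com", timestamp=200)
print(reconcile(older, newer).value)
```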
  • 22. CQLSH • cqlsh: the CQL shell • cqlsh is a command line shell for interacting with Cassandra through CQL (the Cassandra Query Language). • It is shipped with every Cassandra package, and can be found in the bin/ directory alongside the cassandra executable. • cqlsh utilizes the Python native protocol driver, and connects to the single node specified on the command line.
  • 24. Cqlsh Commands Cqlsh has a few commands that allow users to interact with it. • HELP − Displays help topics for all cqlsh commands. • CAPTURE − Captures the output of a command and adds it to a file. • CONSISTENCY − Shows the current consistency level, or sets a new consistency level. • COPY − Copies data to and from Cassandra. • DESCRIBE − Describes the current cluster of Cassandra and its objects. • EXPAND − Expands the output of a query vertically. • EXIT − Using this command, you can terminate cqlsh. • PAGING − Enables or disables query paging. • SHOW − Displays the details of current cqlsh session such as Cassandra version, host, or data type assumptions. • SOURCE − Executes a file that contains CQL statements. • TRACING − Enables or disables request tracing.
  • 25. CQL Data Definition Commands • CREATE KEYSPACE − Creates a KeySpace in Cassandra. • USE − Connects to a created KeySpace. • ALTER KEYSPACE − Changes the properties of a KeySpace. • DROP KEYSPACE − Removes a KeySpace • CREATE TABLE − Creates a table in a KeySpace. • ALTER TABLE − Modifies the column properties of a table. • DROP TABLE − Removes a table. • TRUNCATE − Removes all the data from a table. • CREATE INDEX − Defines a new index on a single column of a table. • DROP INDEX − Deletes a named index.
  • 26. CQL Data Manipulation Commands • INSERT − Adds columns for a row in a table. • UPDATE − Updates a column of a row. • DELETE − Deletes data from a table. • BATCH − Executes multiple DML statements at once. CQL Query Statement and Clauses • SELECT − Reads data from a table. • WHERE − Used along with SELECT to read specific data. • ORDER BY − Used along with SELECT to read specific data in a specific order.
  • 27. Keyspaces • Tables are defined within a keyspace. [Figure: a keyspace containing three tables]
  • 29. Creating a Keyspace using Cqlsh • A keyspace in Cassandra is a namespace that defines data replication on nodes. • A cluster typically contains one keyspace per application. • Given below is the syntax for creating a keyspace using the statement CREATE KEYSPACE. • CREATE KEYSPACE <identifier> WITH <properties> • i.e. CREATE KEYSPACE <keyspace name> WITH replication = {'class': '<strategy name>', 'replication_factor': <number of replicas>}; • CREATE KEYSPACE <keyspace name> WITH replication = {'class': '<strategy name>', 'replication_factor': <number of replicas>} AND durable_writes = <boolean value>; • The CREATE KEYSPACE statement has two properties: replication and durable_writes.
  • 30. Replication • The replication option specifies the replica placement strategy and the number of replicas wanted. The replica placement strategies are listed below as strategy name − description. • SimpleStrategy − Specifies a simple replication factor for the cluster. • NetworkTopologyStrategy − Using this option, you can set the replication factor for each data center independently. • OldNetworkTopologyStrategy − This is a legacy replication strategy. Durable_writes • Using this option, you can instruct Cassandra whether to use the commit log for updates on the current keyspace. This option is not mandatory and is set to true by default.
  • 31. • Given below is an example of creating a keyspace. • Here we create a keyspace named DATABASE1. We use the first replica placement strategy, i.e., SimpleStrategy, and set the replication factor to 1 replica. cqlsh> CREATE KEYSPACE DATABASE1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
  • 32. Verification • You can verify whether the keyspace was created using the DESCRIBE command. • Used on keyspaces, it displays all the keyspaces in the cluster, as shown below. Unquoted names are stored in lowercase, so DATABASE1 is listed as database1. • cqlsh> DESCRIBE keyspaces; database1 system system_traces
  • 33. Durable_writes • By default, the durable_writes property of a keyspace is set to true; however, it can be set to false. Do not set it to false on a keyspace that uses SimpleStrategy. Example • Given below is an example demonstrating the usage of the durable_writes property. • cqlsh> CREATE KEYSPACE test ... WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3 } ... AND DURABLE_WRITES = false;
  • 34. Verification •You can verify whether the durable_writes property of test KeySpace was set to false by querying the System Keyspace. This query gives you all the KeySpaces along with their properties. •cqlsh> SELECT * FROM system_schema.keyspaces;
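To make the shape of the two keyspace properties concrete, the sketch below assembles a CREATE KEYSPACE statement as a string. The helper `create_keyspace_cql` is hypothetical and exists only for this illustration; it does no escaping or validation of names, so it should not be used with untrusted input, and it does not talk to a cluster.

```python
def create_keyspace_cql(name, strategy, replication, durable_writes=True):
    """Build a CREATE KEYSPACE statement (illustrative helper, no escaping).

    For SimpleStrategy, `replication` is a single replica count.
    For NetworkTopologyStrategy, it is a dict of data center name -> count.
    """
    if strategy == "SimpleStrategy":
        options = {"replication_factor": int(replication)}
    elif strategy == "NetworkTopologyStrategy":
        options = {dc: int(count) for dc, count in replication.items()}
    else:
        raise ValueError(f"unsupported strategy in this sketch: {strategy}")

    parts = [f"'class': '{strategy}'"]
    parts += [f"'{key}': {count}" for key, count in options.items()]
    replication_map = "{" + ", ".join(parts) + "}"
    flag = "true" if durable_writes else "false"
    return (
        f"CREATE KEYSPACE {name} WITH replication = {replication_map} "
        f"AND durable_writes = {flag};"
    )


print(create_keyspace_cql("test", "NetworkTopologyStrategy",
                          {"datacenter1": 3}, durable_writes=False))
```

The printed statement matches the durable_writes example above and can be pasted into cqlsh.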
  • 35. Using a Keyspace • You can switch to an existing keyspace using the keyword USE. Its syntax is as follows − • Syntax: USE <identifier>
  • 36. Example • In the following example, we use the keyspace DATABASE1. The prompt shows the name in lowercase because unquoted names are stored that way. • cqlsh> USE DATABASE1; • cqlsh:database1>
  • 37. Altering a Keyspace • ALTER KEYSPACE can be used to alter properties such as the number of replicas and the durable_writes setting of a keyspace. Given below is the syntax of this command. Syntax • ALTER KEYSPACE <identifier> WITH <properties> • i.e. ALTER KEYSPACE <keyspace name> WITH replication = {'class': '<strategy name>', 'replication_factor': <number of replicas>}; • The properties of ALTER KEYSPACE are the same as those of CREATE KEYSPACE: replication and durable_writes.
  • 38. Example • Here we alter a keyspace named DATABASE1. • We change the replication factor from 1 to 3, keeping the strategy it was created with. • cqlsh> ALTER KEYSPACE DATABASE1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}; • A second example, on the test keyspace, turns durable writes back on: cqlsh> ALTER KEYSPACE test WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3} AND DURABLE_WRITES = true;
  • 39. Dropping a Keyspace • You can drop a keyspace using the command DROP KEYSPACE. Given below is the syntax for dropping a keyspace. Syntax • DROP KEYSPACE <identifier> • i.e. DROP KEYSPACE <keyspace name> Example • cqlsh> DROP KEYSPACE DATABASE1;