NETEZZA
What is Bigdata Netezza?
An introduction to Netezza, Vijaya Chandrika J, 05/25/15
Netezza Architecture
• Netezza uses a proprietary architecture called Asymmetric Massively Parallel Processing (AMPP).
• AMPP builds on Massively Parallel Processing (MPP), in which nothing (CPU, memory, storage) is shared.
• MPP is achieved through an array of S-Blades, each of which is a server running its own operating system and connected to disks.
• The Netezza architecture has one unique hardware component, the Database Accelerator card, which is attached to the S-Blades.
Hardware components of the Netezza appliance
The diagram on this slide is a high-level logical schematic of the components in the Netezza appliance.
• The appliance runs a Linux OS.
• Each S-Blade has 8 processor cores and 16 GB of RAM.
• Each processor in the S-Blade is connected to disks in a disk array through a Database Accelerator card, which uses FPGA technology.
What are S-Blades
• S-Blades are also called Snippet Blades; together they form the Snippet Processing Array (SPA).
• The S-Blade is a specialized processing board that combines the CPU processing power of a blade server with the query analysis intelligence of the Netezza Database Accelerator card.
• The Netezza Database Accelerator card contains the FPGA query engines, memory, and I/O for processing the data from the disks where user data is stored.
Figure labels: 1 - S-Blade, 2 - Database Accelerator card
How it works? An example
• Assumptions: Consider a data warehouse for a large retail firm in which one table stores the details of all 10 million customers. Assume the table has 25 columns and each row is 250 bytes long.
• Query: A user asks the application for the Customer Id, Name and State of customers who joined the organization in a particular period, sorted by state and name.
High level steps
• In Netezza the 10 million customer records are stored in compressed form, spread fairly equally across all the disks in the disk arrays connected to the snippet processors in the S-Blades.
• The Database Accelerator card in the snippet processor uncompresses the data, which includes all the columns in the table. It then removes the unwanted columns, which in this case is 22 columns, i.e. 220 of the 250 bytes, applies the WHERE clause to remove the unwanted rows, and passes the small amount of remaining data to the CPU in the snippet processor. In traditional databases all these steps are performed in the CPU.
• The CPU in the snippet processor performs tasks such as aggregation, sum and sort on the data from the Database Accelerator card and passes the result to the host through the network.
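The steps above can be sketched in Python as a toy model. This is only an illustration of the data flow, not Netezza code: the `Customer` fields, the function names and the two-slice sample data are all assumptions made for the example, and the byte arithmetic assumes every column is the same width (10 bytes), as the slide's 220-of-250 figure implies.

```python
import heapq
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:
    customer_id: int
    name: str
    state: str
    joined: date
    # The remaining columns of the 25-column row are omitted in this sketch.


def accelerator_step(rows, start, end):
    """Model of the accelerator card: apply the WHERE clause and keep 3 columns."""
    return [
        (r.customer_id, r.name, r.state)
        for r in rows
        if start <= r.joined <= end
    ]


def snippet_cpu_step(projected):
    """Model of the snippet CPU: sort the reduced rows by state, then name."""
    return sorted(projected, key=lambda t: (t[2], t[1]))


def host_merge(per_slice_results):
    """Model of the host: merge the already-sorted per-slice results."""
    return list(heapq.merge(*per_slice_results, key=lambda t: (t[2], t[1])))


# Hypothetical sample data: two data slices holding two customers each.
slices = [
    [Customer(1, "Asha", "TX", date(2014, 3, 1)),
     Customer(2, "Bob", "CA", date(2012, 1, 5))],
    [Customer(3, "Cara", "CA", date(2014, 6, 9)),
     Customer(4, "Dev", "NY", date(2014, 7, 2))],
]
result = host_merge(
    snippet_cpu_step(accelerator_step(s, date(2014, 1, 1), date(2014, 12, 31)))
    for s in slices
)

# Data volume for the slide's example, assuming equal-width columns.
ROWS, ROW_BYTES, COLUMNS, KEPT_COLUMNS = 10_000_000, 250, 25, 3
bytes_scanned = ROWS * ROW_BYTES
bytes_after_projection = ROWS * (ROW_BYTES // COLUMNS) * KEPT_COLUMNS
```

Under these assumptions the projection alone cuts the data handed to the CPU from 2.5 GB to 300 MB, before the WHERE clause removes any rows.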
The key takeaways
• Netezza can process large volumes of data in parallel; the key is to make sure the data is distributed appropriately so that the massively parallel processing can be leveraged.
• Design so that most of the processing happens in the snippet processors, with minimal communication between snippet processors and minimal data sent to the host.
Netezza Tools
• NzAdmin: a GUI-based administration tool.
• The tool has a system view, which provides a visual snapshot of the state of the appliance, including issues with any hardware components. Its second view is the database view, which lists all the databases and the objects in them, the users and groups currently defined, active sessions, query history and any backup history. The database view also provides options for database administration tasks such as creating and managing databases, database objects, users and groups.
NZSQL
• “nzsql” is the second commonly used tool.
• The “nzsql” command invokes the SQL command interpreter, through which all Netezza-supported SQL statements can be executed.
• nzsql -d testdb -u testuser -pw password
• This command connects to the database “testdb” as the user “testuser” and creates an “nzsql” session, after which the user can execute SQL statements against the database. As with all Netezza commands, “nzsql” has the “-h” help option, which displays details about the usage of the command.
System Objects
• The appliance comes preconfigured with the following 3 user ids, which can’t be modified or deleted from the system. They are used to perform all administration tasks and hence should be available to only a restricted number of users.
• root: The super user for the host system on the appliance, with all the access of a super user on any Linux system.
• nz: The Netezza system administrator Linux account, used to run the host software on Linux.
• admin: The default Netezza SQL database administrator user, which can perform all database-related tasks against all the databases in the appliance.
Create Table
-- Employee table with audit columns; rows are distributed randomly across data slices.
create table employee (
    emp_id integer not null,
    first_name varchar(25) not null,
    last_name varchar(25) not null,
    sex char(1),
    dept_id integer not null,
    created_dt timestamp not null,
    created_by char(8) not null,
    updated_dt timestamp not null,
    updated_by char(8) not null,
    -- Primary key on the employee id.
    constraint pk_employee primary key (emp_id),
    -- Each employee must reference a row in the department table.
    constraint fk_employee foreign key (dept_id) references department(dept_id)
        on update restrict on delete restrict
) distribute on random;
The statement will look familiar except for the “distribute on” clause. There are also no storage-related details, such as the tablespace in which the table is to be created or buffer pool settings; these are handled by the Netezza appliance.
Netezza vs traditional dbs
• For performance reasons, Netezza doesn’t enforce constraints such as primary keys or foreign keys when data is inserted or loaded into tables. It is up to the application to make sure that the data being loaded satisfies these constraints. Even though Netezza does not enforce them, defining the constraints gives the query optimizer additional hints for generating efficient snippet execution code, which in turn helps performance.
• Column length can be modified only for columns defined as varchar.
• If a table is renamed, the views that reference it stop working.
• If a table is referenced by a stored procedure, adding or dropping a column is not permitted. The stored procedure must be dropped first, the column added or dropped, and then the stored procedure recreated.
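Because constraint checking is left to the application, a load job might validate a batch before inserting it. The following is a minimal Python sketch of such a check for the employee table's primary and foreign keys; the function name, the dictionary layout of the batch and the sample values are assumptions for illustration only.

```python
def find_constraint_violations(employees, department_ids):
    """Return (duplicate emp_ids, emp_ids whose dept_id has no department)."""
    seen = set()
    duplicates = []
    orphans = []
    for emp in employees:
        # Primary key check: emp_id must be unique within the batch.
        if emp["emp_id"] in seen:
            duplicates.append(emp["emp_id"])
        seen.add(emp["emp_id"])
        # Foreign key check: dept_id must exist in the department table.
        if emp["dept_id"] not in department_ids:
            orphans.append(emp["emp_id"])
    return duplicates, orphans


# Hypothetical batch about to be loaded.
batch = [
    {"emp_id": 1, "dept_id": 10},
    {"emp_id": 2, "dept_id": 20},
    {"emp_id": 2, "dept_id": 10},  # repeats emp_id 2
    {"emp_id": 3, "dept_id": 99},  # department 99 is not defined
]
duplicates, orphans = find_constraint_violations(batch, {10, 20})
```

A real load would also need to compare the batch against rows already in the table, which this sketch leaves out.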
Netezza vs traditional dbs - MV
• Only one table can be specified in the FROM clause of the create statement for a materialized view (MV).
• The SELECT in the create statement for an MV cannot have a WHERE clause.
• The columns in the projection list must be columns from the base table; expressions are not allowed.
• External, temporary, system or clustered base tables can’t be used as the base table for a materialized view.
Netezza vs traditional dbs - Sequence
• The following sample sequence creation statement can be used to populate the id column in the employee table.
create sequence seq_emp_id as integer start with 1 increment by 1
minvalue 1 no maxvalue no cycle;
• Since no max value is specified, the sequence can go up to the largest value of the sequence type, which in this case is 2,147,483,647 for the 4-byte integer type.
• The system is forced to flush cached sequence values in situations such as a system stop, a system or SPU crash, or some alter sequence statements. This creates gaps in the numbers generated by a sequence.
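The two points above can be illustrated with a short Python sketch. The type sizes listed are an assumption about the integer types a sequence can use (1, 2, 4 and 8 bytes), and `CachedSequence` with its `cache_size` parameter is a hypothetical toy model of value caching, not Netezza's actual mechanism.

```python
def max_value(num_bytes):
    """Largest value of a signed integer type that is num_bytes wide."""
    return 2 ** (8 * num_bytes - 1) - 1


# Assumed storage sizes (in bytes) of the integer types.
SEQUENCE_TYPES = {"byteint": 1, "smallint": 2, "integer": 4, "bigint": 8}
limits = {name: max_value(size) for name, size in SEQUENCE_TYPES.items()}


class CachedSequence:
    """Toy sequence that hands out values from a cached block."""

    def __init__(self, start=1, increment=1, cache_size=5):
        self.increment = increment
        self.cache_size = cache_size
        self._next_block_start = start
        self._cache = []

    def _refill(self):
        # Reserve the next block of values and remember where the following one starts.
        first = self._next_block_start
        self._cache = [first + i * self.increment for i in range(self.cache_size)]
        self._next_block_start = first + self.cache_size * self.increment

    def next_value(self):
        if not self._cache:
            self._refill()
        return self._cache.pop(0)

    def flush(self):
        # Unused cached values are discarded, which leaves a gap.
        self._cache = []


seq = CachedSequence()
values = [seq.next_value(), seq.next_value()]
seq.flush()  # e.g. a system stop or crash
values.append(seq.next_value())
```

In this model the values 3 to 5 were cached but never used, so the next value after the flush starts a new block.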
Netezza Storage
• Each disk in the appliance is divided into primary, mirror and temp (swap) partitions. The primary partition stores user data such as database tables. The mirror partition stores a copy of the primary partition of another disk so that it can be used if that disk fails. The temp/swap partition holds data temporarily, for example when the appliance redistributes data while processing queries. The logical representation of the data saved in the primary partition of each disk is called a data slice. When users create database tables and load data into them, the data is distributed across the available data slices. The logical representation of data slices is called the data partition.
Netezza Storage - Diagram
Data Organization
• When users create tables and store data in them, the data is stored in disk extents, the minimum unit of storage allocated on disk. Netezza distributes the data in extents across all the available data slices based on the distribution key specified at table creation. A user can specify up to four columns for data distribution, specify that the data be distributed randomly, or specify no distribution at all when creating the table.
• When the user selects random distribution, the appliance uses a round-robin algorithm to distribute the data uniformly across all the available data slices.
• The key is to make sure that the data for a table is uniformly distributed across all the data slices so that there is no data skew. With data spread across the data slices, all the SPUs in the system can take part in processing any query, which improves performance.
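To make the idea of skew concrete, here is a small Python sketch comparing round-robin distribution with hash distribution on a low-cardinality and a high-cardinality key. CRC32 is only a stand-in for whatever hash function the appliance uses, and the slice count, key values and the skew measure are assumptions chosen for the example.

```python
import zlib
from collections import Counter
from itertools import cycle


def hash_distribute(keys, num_slices):
    """Count rows per slice when each key is hashed to a slice (CRC32 stand-in)."""
    return Counter(zlib.crc32(str(k).encode()) % num_slices for k in keys)


def round_robin_distribute(num_rows, num_slices):
    """Count rows per slice when rows are assigned to slices in turn."""
    slices = cycle(range(num_slices))
    return Counter(next(slices) for _ in range(num_rows))


def skew(counts, num_slices):
    """Fullest slice divided by the average slice; 1.0 means perfectly even."""
    total = sum(counts.values())
    return max(counts.values()) / (total / num_slices)


NUM_SLICES = 8
random_skew = skew(round_robin_distribute(1000, NUM_SLICES), NUM_SLICES)

# A two-valued column can occupy at most two of the eight slices.
low_cardinality_skew = skew(hash_distribute(["M", "F"] * 500, NUM_SLICES), NUM_SLICES)

# A unique id spreads rows over all slices far more evenly.
high_cardinality_skew = skew(hash_distribute(range(1000), NUM_SLICES), NUM_SLICES)
```

With the low-cardinality key most slices stay empty, so only one or two SPUs would do all the work for that table.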
Netezza Transactions
• By default Netezza SQL statements are executed in auto-commit mode, i.e. the changes made by a SQL statement take effect immediately after the statement completes, as if the transaction were complete.
• If there are multiple related SQL statements that must all fail if any one of them fails, the user can use the BEGIN, COMMIT and ROLLBACK transaction control statements to control a transaction involving multiple statements. All SQL statements between a BEGIN statement and a COMMIT or ROLLBACK statement are treated as part of a single transaction.
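The same pattern can be tried out without an appliance. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in database to show auto-commit versus an explicit BEGIN / COMMIT / ROLLBACK block; the `account` table and the `transfer` function are hypothetical, and SQLite's behaviour is not a statement about Netezza internals.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so each
# statement takes effect as soon as it completes unless BEGIN is issued.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("create table account (id integer, balance integer)")
conn.execute("insert into account values (1, 100)")  # committed immediately
conn.execute("insert into account values (2, 0)")    # committed immediately


def transfer(conn, amount):
    """Move amount from account 1 to account 2 as a single transaction."""
    conn.execute("begin")
    try:
        conn.execute(
            "update account set balance = balance - ? where id = 1", (amount,)
        )
        balance = conn.execute(
            "select balance from account where id = 1"
        ).fetchone()[0]
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "update account set balance = balance + ? where id = 2", (amount,)
        )
        conn.execute("commit")
        return True
    except ValueError:
        # Undo the first update so neither statement takes effect.
        conn.execute("rollback")
        return False


first = transfer(conn, 60)   # both updates are committed together
second = transfer(conn, 60)  # the debit is rolled back
balances = conn.execute("select id, balance from account order by id").fetchall()
```

The second transfer leaves the balances unchanged because the debit and the credit are treated as one unit.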
Alternative to redo logs in Netezza
• Netezza doesn’t use redo logs; all changes are made on the storage where user data is stored, which also helps performance.
• Netezza maintains three additional hidden columns (createxid, deletexid and row id) per table row. They store the id of the transaction that created the row, the id of the transaction that deleted the row, and a unique row id assigned to the row by the system.
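One way to picture the hidden columns is the toy Python model below. It is a deliberately simplified sketch: the class names, the use of 0 to mean "not deleted", and the visibility rule are assumptions for illustration, and the real rules also depend on transaction state, which is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class StoredRow:
    rowid: int      # unique id assigned by the system
    createxid: int  # transaction that created the row
    deletexid: int  # transaction that deleted the row; 0 means not deleted (assumed)
    data: dict


class ToyTable:
    def __init__(self):
        self.rows = []
        self._next_rowid = 1

    def insert(self, xid, data):
        self.rows.append(StoredRow(self._next_rowid, xid, 0, data))
        self._next_rowid += 1

    def delete(self, xid, predicate):
        # The row is stamped with the deleting transaction id, not removed.
        for row in self.rows:
            if row.deletexid == 0 and predicate(row.data):
                row.deletexid = xid

    def visible(self):
        """Rows that no transaction has marked as deleted."""
        return [row.data for row in self.rows if row.deletexid == 0]


table = ToyTable()
table.insert(xid=10, data={"emp_id": 1})
table.insert(xid=11, data={"emp_id": 2})
table.delete(xid=12, predicate=lambda d: d["emp_id"] == 1)
visible_rows = table.visible()
```

In this model the deleted row is still stored with its deletexid set, which is how the change can be recorded in place on the data storage.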
Best Practices
• Define all constraints and relationships between objects. Even though Netezza doesn’t enforce them, other than the not null constraint, the query optimizer still uses these details to come up with an efficient query execution plan.
• If the data for a column is known to have a fixed length, use char(x) instead of varchar(x). Varchar(x) uses additional storage, which becomes significant when dealing with terabytes of data, and it also affects query processing since additional data needs to be pulled in from disk.
• Use NOT NULL wherever the data permits. This improves performance because the appliance does not have to check for null values, and it reduces storage usage.
Best Practices (continued)
• Distribute on columns that have high cardinality and that are often used in joins. It is best to distribute fact and dimension tables on the same column. This reduces data redistribution during queries and improves performance.
• Create a materialized view on a small set of columns from a large table when those columns are often used by user queries.
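The benefit of distributing fact and dimension tables on the same column can be shown with a small Python sketch. The table contents, the slice count and the CRC32 hash are assumptions for the example; the point is only that equal key values land on the same slice when both tables use the same distribution column.

```python
import zlib

NUM_SLICES = 4


def slice_for(key):
    """Map a distribution key value to a slice (CRC32 is a stand-in hash)."""
    return zlib.crc32(str(key).encode()) % NUM_SLICES


def distribute(rows, key_column):
    """Place each row on the slice chosen by its distribution key."""
    slices = [[] for _ in range(NUM_SLICES)]
    for row in rows:
        slices[slice_for(row[key_column])].append(row)
    return slices


def rows_needing_redistribution(fact_slices, dim_slices, join_column):
    """Count fact rows whose matching dimension row is on a different slice."""
    dim_location = {
        row[join_column]: i for i, rows in enumerate(dim_slices) for row in rows
    }
    return sum(
        1
        for i, rows in enumerate(fact_slices)
        for row in rows
        if dim_location[row[join_column]] != i
    )


# Hypothetical dimension (customers) and fact (sales) tables.
customers = [{"customer_id": c} for c in range(100)]
sales = [{"sale_id": s, "customer_id": s % 100} for s in range(1000)]

dim_slices = distribute(customers, "customer_id")
colocated = rows_needing_redistribution(
    distribute(sales, "customer_id"), dim_slices, "customer_id"
)
mismatched = rows_needing_redistribution(
    distribute(sales, "sale_id"), dim_slices, "customer_id"
)
```

When both tables are distributed on `customer_id`, every join can be completed within a slice; distributing the fact table on `sale_id` instead forces rows to move between slices before the join.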
Questions?
Thank you
Do not let staffing shortages and limited fiscal view hamper your cause
Fexle Services Pvt. Ltd.
 
Beyond the code. Complexity - 2025.05 - SwiftCraft
Beyond the code. Complexity - 2025.05 - SwiftCraftBeyond the code. Complexity - 2025.05 - SwiftCraft
Beyond the code. Complexity - 2025.05 - SwiftCraft
Dmitrii Ivanov
 
Artificial hand using embedded system.pptx
Artificial hand using embedded system.pptxArtificial hand using embedded system.pptx
Artificial hand using embedded system.pptx
bhoomigowda12345
 
Top Magento Hyvä Theme Features That Make It Ideal for E-commerce.pdf
Top Magento Hyvä Theme Features That Make It Ideal for E-commerce.pdfTop Magento Hyvä Theme Features That Make It Ideal for E-commerce.pdf
Top Magento Hyvä Theme Features That Make It Ideal for E-commerce.pdf
evrigsolution
 
Memory Management and Leaks in Postgres from pgext.day 2025
Memory Management and Leaks in Postgres from pgext.day 2025Memory Management and Leaks in Postgres from pgext.day 2025
Memory Management and Leaks in Postgres from pgext.day 2025
Phil Eaton
 
From Vibe Coding to Vibe Testing - Complete PowerPoint Presentation
From Vibe Coding to Vibe Testing - Complete PowerPoint PresentationFrom Vibe Coding to Vibe Testing - Complete PowerPoint Presentation
From Vibe Coding to Vibe Testing - Complete PowerPoint Presentation
Shay Ginsbourg
 
sequencediagrams.pptx software Engineering
sequencediagrams.pptx software Engineeringsequencediagrams.pptx software Engineering
sequencediagrams.pptx software Engineering
aashrithakondapalli8
 
Robotic Process Automation (RPA) Software Development Services.pptx
Robotic Process Automation (RPA) Software Development Services.pptxRobotic Process Automation (RPA) Software Development Services.pptx
Robotic Process Automation (RPA) Software Development Services.pptx
julia smits
 
How I solved production issues with OpenTelemetry
How I solved production issues with OpenTelemetryHow I solved production issues with OpenTelemetry
How I solved production issues with OpenTelemetry
Cees Bos
 
AEM User Group DACH - 2025 Inaugural Meeting
AEM User Group DACH - 2025 Inaugural MeetingAEM User Group DACH - 2025 Inaugural Meeting
AEM User Group DACH - 2025 Inaugural Meeting
jennaf3
 
Digital Twins Software Service in Belfast
Digital Twins Software Service in BelfastDigital Twins Software Service in Belfast
Digital Twins Software Service in Belfast
julia smits
 
[gbgcpp] Let's get comfortable with concepts
[gbgcpp] Let's get comfortable with concepts[gbgcpp] Let's get comfortable with concepts
[gbgcpp] Let's get comfortable with concepts
Dimitrios Platis
 
Why Tapitag Ranks Among the Best Digital Business Card Providers
Why Tapitag Ranks Among the Best Digital Business Card ProvidersWhy Tapitag Ranks Among the Best Digital Business Card Providers
Why Tapitag Ranks Among the Best Digital Business Card Providers
Tapitag
 
Buy vs. Build: Unlocking the right path for your training tech
Buy vs. Build: Unlocking the right path for your training techBuy vs. Build: Unlocking the right path for your training tech
Buy vs. Build: Unlocking the right path for your training tech
Rustici Software
 
How to Install and Activate ListGrabber Plugin
How to Install and Activate ListGrabber PluginHow to Install and Activate ListGrabber Plugin
How to Install and Activate ListGrabber Plugin
eGrabber
 
Surviving a Downturn Making Smarter Portfolio Decisions with OnePlan - Webina...
Surviving a Downturn Making Smarter Portfolio Decisions with OnePlan - Webina...Surviving a Downturn Making Smarter Portfolio Decisions with OnePlan - Webina...
Surviving a Downturn Making Smarter Portfolio Decisions with OnePlan - Webina...
OnePlan Solutions
 
Autodesk Inventor Crack (2025) Latest
Autodesk Inventor    Crack (2025) LatestAutodesk Inventor    Crack (2025) Latest
Autodesk Inventor Crack (2025) Latest
Google
 
Deploying & Testing Agentforce - End-to-end with Copado - Ewenb Clark
Deploying & Testing Agentforce - End-to-end with Copado - Ewenb ClarkDeploying & Testing Agentforce - End-to-end with Copado - Ewenb Clark
Deploying & Testing Agentforce - End-to-end with Copado - Ewenb Clark
Peter Caitens
 
A Comprehensive Guide to CRM Software Benefits for Every Business Stage
A Comprehensive Guide to CRM Software Benefits for Every Business StageA Comprehensive Guide to CRM Software Benefits for Every Business Stage
A Comprehensive Guide to CRM Software Benefits for Every Business Stage
SynapseIndia
 
Do not let staffing shortages and limited fiscal view hamper your cause
Do not let staffing shortages and limited fiscal view hamper your causeDo not let staffing shortages and limited fiscal view hamper your cause
Do not let staffing shortages and limited fiscal view hamper your cause
Fexle Services Pvt. Ltd.
 
Beyond the code. Complexity - 2025.05 - SwiftCraft
Beyond the code. Complexity - 2025.05 - SwiftCraftBeyond the code. Complexity - 2025.05 - SwiftCraft
Beyond the code. Complexity - 2025.05 - SwiftCraft
Dmitrii Ivanov
 
Artificial hand using embedded system.pptx
Artificial hand using embedded system.pptxArtificial hand using embedded system.pptx
Artificial hand using embedded system.pptx
bhoomigowda12345
 

An Introduction to Netezza

  • 1. NETEZZA: What is Bigdata Netezza? An introduction to Netezza, by Vijaya Chandrika J
  • 2. Netezza Architecture. Netezza uses a proprietary architecture called Asymmetric Massively Parallel Processing (AMPP). AMPP builds on Massively Parallel Processing (MPP), in which nothing (CPU, memory, storage) is shared. MPP is achieved through an array of S-Blades, each of which is a server in its own right, running its own operating system and connected to disks. The architecture has one unique hardware component, the Database Accelerator card, which is attached to the S-Blades.
  • 3. Hardware components of Netezza (slide dated 05/25/15). The diagram on this slide is a high-level logical schematic of the components in the Netezza appliance. The appliance runs a Linux OS. Each S-Blade has 8 processor cores and 16 GB of RAM. Each processor in the S-Blade is connected to disks in a disk array through a Database Accelerator card, which uses FPGA technology.
  • 4. What are S-Blades? S-Blades are also called Snippet Blades; the array of S-Blades is the Snippet Processing Array (SPA). The S-Blade is a specialized processing board that combines the CPU processing power of a blade server with the query analysis intelligence of the Netezza Database Accelerator card. The Database Accelerator card contains the FPGA query engines, memory, and I/O for processing data from the disks where user data is stored. In the slide's image, 1 marks the S-Blade and 2 marks the accelerator card.
  • 5. How it works: an example. Assumptions: a data warehouse for a large retail firm has a table that stores the details of all of its 10 million customers. The table has 25 columns, and each table row is 250 bytes long. Query: a user asks the application for the Customer Id, Name and State of customers who joined the organization in a particular period, sorted by state and name.
  • 6. High level steps. In Netezza the 10 million customer records are stored in compressed form, spread fairly equally across all the disks in the disk arrays connected to the snippet processors in the S-Blades. The Database Accelerator card in the snippet processor uncompresses the data, which at that point includes every column in the table. It then removes the unwanted columns (in this case 22 columns, i.e. 220 of the 250 bytes per row), applies the where clause to remove the unwanted rows, and passes the small amount of remaining data to the CPU in the snippet processor. In traditional databases all of these steps are performed in the CPU. The CPU in the snippet processor performs tasks such as aggregation, sum and sort on the data from the Database Accelerator card and passes the result to the host over the network.
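The data reduction described in the high level steps can be checked with a little arithmetic. The sketch below uses only the figures from the example (10 million rows, 250 bytes per row, 220 bytes dropped per row); the function name is made up for illustration, and the effect of compression and of the where clause is left out because the slides give no numbers for them.

```python
# Figures from the example slides: 10 million customers, 250-byte rows,
# 22 unwanted columns (220 bytes) removed per row by the accelerator card.
ROWS = 10_000_000
ROW_BYTES = 250
DROPPED_BYTES = 220


def projection_savings(rows, row_bytes, dropped_bytes):
    """Return (bytes scanned, bytes left after dropping unwanted columns)."""
    scanned = rows * row_bytes
    kept = rows * (row_bytes - dropped_bytes)
    return scanned, kept


scanned, kept = projection_savings(ROWS, ROW_BYTES, DROPPED_BYTES)
print(f"uncompressed table size : {scanned / 1e9:.1f} GB")
print(f"after column projection : {kept / 1e6:.0f} MB "
      f"({kept / scanned:.0%} of the original)")
```

Under these assumptions the column projection alone leaves 300 MB of the 2.5 GB table, before the where clause removes any rows.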
  • 7. The key takeaways. Netezza can process large volumes of data in parallel; the key is to distribute the data appropriately so that queries can exploit the massively parallel processing. Design so that most of the processing happens in the snippet processors, and minimize both the communication between snippet processors and the data sent to the host.
  • 8. Netezza Tools. NzAdmin is a GUI-based administration tool. Its system view provides a visual snapshot of the state of the appliance, including issues with any hardware components. Its database view lists all the databases and the objects in them, the users and groups currently defined, active sessions, query history and any backup history. The database view also provides options for database administration tasks such as creating and managing databases, database objects, users and groups.
  • 11. NZSQL. “nzsql” is the second most commonly used tool. The “nzsql” command invokes the SQL command interpreter, through which all Netezza-supported SQL statements can be executed. For example: nzsql -d testdb -u testuser -pw password. This command connects to the database “testdb” as the user “testuser” and opens an “nzsql” session, after which the user can execute SQL statements against the database. As with all Netezza commands, “nzsql” has a “-h” help option that displays details about the usage of the command.
  • 12. System Objects. The appliance comes preconfigured with the following 3 user ids, which can’t be modified or deleted from the system. They are used to perform all the administration tasks and hence should be available to a restricted number of users. root: the super user for the host system on the appliance, with the same access as a super user on any Linux system. nz: the Netezza system administrator Linux account, used to run the host software on Linux. admin: the default Netezza SQL database administrator user, which can perform all database-related tasks against all the databases in the appliance.
  • 13. Create Table. create table employee ( emp_id integer not null, first_name varchar(25) not null, last_name varchar(25) not null, sex char(1), dept_id integer not null, created_dt timestamp not null, created_by char(8) not null, updated_dt timestamp not null, updated_by char(8) not null, constraint pk_employee primary key (emp_id), constraint fk_employee foreign key (dept_id) references department (dept_id) on update restrict on delete restrict ) distribute on random; The statement will look familiar except for the “distribute on” clause. There are also no storage-related details, such as the tablespace on which the table is created or any bufferpool settings; these are handled by the Netezza appliance.
  • 14. Netezza vs traditional dbs. For performance reasons, Netezza doesn’t enforce constraints such as primary keys or foreign keys when data is inserted or loaded into tables. It is up to the application to make sure the data being loaded satisfies these constraints. Even though Netezza does not enforce them, defining the constraints gives the query optimizer additional hints for generating efficient snippet execution code, which in turn helps performance. Column length can be modified only for columns defined as varchar. If a table is renamed, the views attached to it stop working. If a table is referenced by a stored procedure, adding or dropping a column is not permitted: the stored procedure must be dropped first, the column added or dropped, and the stored procedure then recreated.
  • 15. Netezza vs traditional dbs - MV. Only one table can be specified in the FROM clause of the create statement for a materialized view (MV). The select statement in the create statement cannot contain a where clause. The columns in the projection list must be columns of the base table, not expressions. External, temporary, system or clustered base tables can’t be used as the base table for a materialized view.
  • 16. Netezza vs traditional dbs - Sequence. The following sample sequence creation statement can be used to populate the id column in the employee table: create sequence seq_emp_id as integer start with 1 increment by 1 minvalue 1 no maxvalue no cycle; Since no max value is specified, the sequence can hold values up to the largest value of the sequence type, which for the integer type is 2,147,483,647. The system is forced to flush cached sequence values in situations such as a system stop, a system or SPU crash, or some alter sequence statements, and this creates gaps in the numbers generated by the sequence.
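The gaps caused by flushing cached sequence values can be illustrated with a toy model. This is a sketch only: the `CachedSequence` class and the cache size of 5 are hypothetical and say nothing about how Netezza sizes or manages its caches. The constant assumes a 4-byte signed integer type.

```python
INT4_MAX = 2**31 - 1  # largest value of a 4-byte signed integer


class CachedSequence:
    """Toy sequence that hands out values from a pre-allocated cache.

    The cache size is an illustrative assumption, not a Netezza setting.
    """

    def __init__(self, start=1, increment=1, cache_size=5):
        self.next_uncached = start
        self.increment = increment
        self.cache_size = cache_size
        self.cache = []

    def next_value(self):
        if not self.cache:
            # Reserve a whole block of values at once.
            self.cache = [self.next_uncached + i * self.increment
                          for i in range(self.cache_size)]
            self.next_uncached += self.cache_size * self.increment
        return self.cache.pop(0)

    def flush(self):
        # Models a restart or crash: reserved but unused values are lost.
        self.cache.clear()


seq = CachedSequence()
before = [seq.next_value() for _ in range(3)]
seq.flush()
after = [seq.next_value() for _ in range(2)]
print(before, after)
```

In this model the values reserved before the flush but never handed out are skipped, so applications should not rely on sequence values being contiguous.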
  • 17. Netezza Storage. Each disk in the appliance is partitioned into primary, mirror and temp (swap) partitions. The primary partition of each disk stores user data such as database tables. The mirror partition stores a copy of the primary partition of another disk, so that it can be used in the event of a disk failure. The temp/swap partition stores data temporarily, for example when the appliance redistributes data while processing queries. The logical representation of the data saved in the primary partition of each disk is called the data slice. When users create database tables and load data into them, the data is distributed across the available data slices. The logical representation of data slices is called the data partition.
  • 18. Netezza Storage - Diagram
  • 19. Data Organization. When users create tables and store data in them, the data is stored in disk extents, the minimum unit of storage allocated on disk. Netezza distributes the data in extents across all the available data slices based on the distribution key specified when the table is created. A user can specify up to four columns for data distribution, can ask for the data to be distributed randomly, or can specify nothing at all during table creation. When the user selects random distribution, the appliance uses a round robin algorithm to distribute the data uniformly across all the available data slices. The key is to make sure that the data for a table is uniformly distributed across all the data slices so that there is no data skew. With the data spread across the data slices, all the SPUs in the system can be used to process any query, which in turn improves performance.
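The effect of the distribution choice on skew can be sketched with a small simulation. Everything here is an assumption made for illustration: the slides do not say how many data slices an appliance has or which hash function Netezza uses, so the sketch picks 8 slices and uses SHA-256 as a stand-in hash.

```python
import hashlib
from collections import Counter

NUM_SLICES = 8  # hypothetical number of data slices


def slice_for(key, num_slices=NUM_SLICES):
    """Map a distribution key to a data slice (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_slices


def distribute_on(keys, num_slices=NUM_SLICES):
    """Row count per slice when rows are placed by hashing a key column."""
    counts = Counter(slice_for(k, num_slices) for k in keys)
    return [counts.get(s, 0) for s in range(num_slices)]


def distribute_random(n_rows, num_slices=NUM_SLICES):
    """Row count per slice for round robin placement."""
    counts = [0] * num_slices
    for i in range(n_rows):
        counts[i % num_slices] += 1
    return counts


def skew(counts):
    """Difference between the fullest and the emptiest slice."""
    return max(counts) - min(counts)


N_ROWS = 100_000
by_customer_id = distribute_on(range(N_ROWS))  # high cardinality key
by_state = distribute_on(("CA", "NY", "TX")[i % 3] for i in range(N_ROWS))
round_robin = distribute_random(N_ROWS)

for name, counts in [("customer_id", by_customer_id),
                     ("state", by_state),
                     ("random", round_robin)]:
    print(f"{name:12s} skew={skew(counts):6d} rows per slice={counts}")
```

A key with only three distinct values can occupy at most three slices, so the remaining slices (and the SPUs behind them) sit idle, while a high cardinality key or round robin placement keeps every slice busy.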
  • 20. Netezza Transactions. By default, Netezza SQL statements are executed in auto-commit mode, i.e. the changes made by a SQL statement take effect immediately after the statement completes, as if the transaction were complete. If several related SQL statements must all fail when any one of them fails, the user can use the BEGIN, COMMIT and ROLLBACK transaction control statements to group them. All SQL statements between a BEGIN statement and a COMMIT or ROLLBACK statement are treated as part of a single transaction.
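The difference between auto-commit and an explicit BEGIN ... COMMIT/ROLLBACK block can be tried out without an appliance. The sketch below uses Python's built-in SQLite module purely as a stand-in, since it accepts the same three transaction control statements; it is not Netezza, and the `account` table and `transfer` function are made up for the example.

```python
import sqlite3

# isolation_level=None puts the connection in auto-commit mode, so each
# statement takes effect immediately unless we issue BEGIN ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "create table account (id integer primary key, balance integer not null)")
conn.execute("insert into account values (1, 100)")  # auto-committed
conn.execute("insert into account values (2, 0)")    # auto-committed


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates succeed or neither does."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "update account set balance = balance - ? where id = ?",
            (amount, src))
        remaining = conn.execute(
            "select balance from account where id = ?", (src,)).fetchone()[0]
        if remaining < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "update account set balance = balance + ? where id = ?",
            (amount, dst))
        conn.execute("COMMIT")
        return True
    except ValueError:
        conn.execute("ROLLBACK")  # undo the first update as well
        return False


first = transfer(conn, 1, 2, 60)   # succeeds
second = transfer(conn, 1, 2, 60)  # would overdraw, so it is rolled back
balances = conn.execute(
    "select id, balance from account order by id").fetchall()
print(first, second, balances)
```

The second transfer leaves both balances untouched because the debit that had already run inside the transaction is undone by the ROLLBACK.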
  • 21. Alternative to redo logs in Netezza. Netezza doesn’t use redo logs; all changes are made directly on the storage where user data is stored, which also helps performance. Netezza maintains three additional hidden columns per table row (createxid, deletexid and row id), which store the id of the transaction that created the row, the id of the transaction that deleted the row, and a unique row id assigned to the row by the system.
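One way to picture how the three hidden columns can stand in for a log is a toy table in which a delete only stamps the row instead of removing it. This is a simplified sketch, not Netezza's implementation: the class names are invented, a deletexid of 0 is assumed to mean "not deleted", and the visibility rule ignores whether the transactions involved have committed.

```python
from dataclasses import dataclass


@dataclass
class RowVersion:
    rowid: int       # unique id assigned by the system
    createxid: int   # transaction that created the row
    deletexid: int   # transaction that deleted the row (0 = not deleted here)
    payload: str


class ToyTable:
    """Toy table that records changes in hidden columns instead of a log."""

    def __init__(self):
        self.rows = []
        self.next_rowid = 1

    def insert(self, xid, payload):
        row = RowVersion(self.next_rowid, xid, 0, payload)
        self.next_rowid += 1
        self.rows.append(row)
        return row.rowid

    def delete(self, xid, rowid):
        # The row stays on storage; it is only marked with the deleting xid.
        for row in self.rows:
            if row.rowid == rowid and row.deletexid == 0:
                row.deletexid = xid

    def visible(self, as_of_xid):
        """Payloads visible to a reader whose transaction id is as_of_xid."""
        return [r.payload for r in self.rows
                if r.createxid <= as_of_xid
                and (r.deletexid == 0 or r.deletexid > as_of_xid)]


table = ToyTable()
table.insert(10, "alice")
bob = table.insert(11, "bob")
table.delete(12, bob)

print(table.visible(11))  # reader from before the delete
print(table.visible(12))  # reader from the deleting transaction onward
print(len(table.rows))    # deleted row is still physically stored
```

Because the deleted row is still on storage with its deletexid set, an earlier reader keeps seeing it and the change can be undone by clearing the stamp, which is the role a log would otherwise play in this model.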
  • 22. Best Practices. Define all constraints and relationships between objects. Even though Netezza doesn’t enforce them (other than the not null constraint), the query optimizer still uses them to come up with an efficient query execution plan. If the data in a column is known to have a fixed length, use char(x) instead of varchar(x). Varchar(x) uses additional storage, which becomes significant when dealing with terabytes of data, and it also slows query processing because additional data has to be pulled in from disk. Use NOT NULL wherever the data permits. This improves performance, because the appliance does not have to check for nulls, and it reduces storage usage.
  • 23. Best Practices (continued). Distribute on columns that have high cardinality and are often used in joins. It is best to distribute a fact table and its dimension table on the same column; this reduces data redistribution during queries and improves performance. Create a materialized view on a small set of columns from a large table when those columns are often used by user queries.
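The advice to distribute fact and dimension tables on the same column can be sketched by counting how many fact rows would sit on a different data slice from the dimension row they join to. The table shapes, the 4 slices and the SHA-256 hash are all assumptions made for illustration; they are not Netezza internals.

```python
import hashlib

NUM_SLICES = 4  # hypothetical number of data slices


def slice_for(value, num_slices=NUM_SLICES):
    """Map a distribution key to a data slice (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_slices


# Hypothetical fact table: (order_id, customer_id) pairs. The customer
# dimension is assumed to be distributed on customer_id.
orders = [(order_id, 1 + order_id % 100) for order_id in range(1000)]


def rows_to_redistribute(orders, fact_key_index):
    """Count fact rows stored on a different slice from their customer row."""
    moved = 0
    for order in orders:
        fact_slice = slice_for(order[fact_key_index])
        dim_slice = slice_for(order[1])  # where the matching customer lives
        if fact_slice != dim_slice:
            moved += 1
    return moved


same_key = rows_to_redistribute(orders, fact_key_index=1)   # on customer_id
other_key = rows_to_redistribute(orders, fact_key_index=0)  # on order_id
print(f"fact distributed on customer_id: {same_key} rows must move")
print(f"fact distributed on order_id   : {other_key} rows must move")
```

When both tables hash the same column, matching rows always land on the same slice and the join can run locally; with a different fact key, most rows have to be shipped between slices before they can be joined.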
  • 24. Questions?