SlideShare a Scribd company logo
Hadoop
and Big Data Security
Kevin T. Smith, 11/14/2013
Ksmith <AT> Novetta . COM
Big Data Security – Why Should We Care?
New Challenges related to Data Management, Security, and Privacy
As data growth is explosive, so is the complexity of our IT environments
Many organizations required to enforce access control & privacy restrictions on
data sets (HIPAA, Privacy Laws) – or face steep penalties & fines
Organizations are increasingly required to enforce access control to their data
scientists based on Need-to-Know, User Authorization levels, and what data they
are allowed to see – especially in Healthcare, Finance, and Government
Organizations struggling to understand what data they can release

Mismanagement of Data Sets -- Costly..
AOL Research “Data Valdez” Incident
• CNNMoney - “101 Dumbest Moments in Business”
• $5 Million Settlement , plus $100 to each member of AOL between 3/2006-5/2006,
+ $50 to each member who believed their data was in the released data; Fired
employees, CTO Resignation

The Netflix Contest Anonymized Data Set Incident
• Class-Action Lawsuit, $9 Million Settlement

Massachusetts Hospital Record Incident

Cyber Security Attacks are on the Rise
Ponemon Institute – the Average Cost of a Data Breach in the U.S. is 5.4 Million
dollars*
Playstation (2011) – Experts predict costs between 2.2 and 2.4 Billion
* (Breach Study: Global Analysis, May 2013)
A (Brief) History of Hadoop Security
Hadoop developed without Security in Mind
Originally No Security model
No authentication of users or services
Anyone could submit arbitrary code to be executed
Later authorization added, but any user could
impersonate other users with command-line switch

In 2009, Yahoo! focused on Hadoop
Authentication, and did a Hadoop Redesign, But…
Resulting Security Model is Complex
Security Configuration is complex & Easy to Mess Up
No Data at Rest Encryption
Kerberos-Centric
Limited Authorization Capabilities

Things are Changing, But Slowly..

It is important to
understand how
Hadoop Security is
Currently Implemented
& Configured
It is important to
understand how to
meet your
organization’s security
requirements
Hadoop Security Data Flow
Distributed Security
is a Challenge
Since the .20.20x
distributions of
Hadoop, much of
the model is
Kerberos Centric ,
as you see to the
right
Model is quite
complex, as you will
see on the next
slide
Token Delegation &
Hadoop Security Flow
Token

Used For

Kerberos TGT

Kerberos initial authentication to KDC.

Kerberos service
ticket

Kerberos initial authentication between users,
client processes, and services.

Delegation
token

Token issued by the NameNode to the client,
used by the client or any services working on
the client’s behalf to authenticate them to the
NameNode.

Block Access
token

Token issued by the NameNode after
validating authorization to a particular block
of data, based on a shared secret with the
DataNode. Clients (and services working on
the client’s behalf) use the Block Access
token to request blocks from the DataNode.

Job token

This is issued by the JobTracker to
TaskTrackers. Tasks communicating with
TaskTrackers for a particular job use this
token to prove they are associated with the
job.
Some Vendor Activity in Hadoop Security
Seems to be a New One Every Week!

Cloudera Sentry – Fine Grained Access Control for Apache Hive & Cloudera Impala
IBM InfoSphere Optim Data Masking – Optim Data Masking provides “Deidentification” of data by obfuscating corporate secrets, Guardium provides
monitoring & auditing
Intel’s Secure Hadoop Distribution – Encryption in transit & at rest, Granular access
control with HBase
DataStax Enterprise – Encryption in Transit & at Rest (using Cassandra for storage)
DataGuise for Hadoop – Detects & protects sensitive data, setting access
permission, masking or encrypting data, authorization based access
Knox Gateway (Hortonworks) – Perimeter security, integration with IDAM
environments, manage security across multiple clusters – now an Apache Project
Protegrity – Big Data Protector provides Encryption & tokenization, Enterprise
Security Administrator provides central policy, key mgmt, auditing, reporting
Sqrrl – Builds on Apache Accumulo’s security capabilities for Hadoop
Zettaset Secure Orchestrator – security wrapper around Hadoop
Apache Accumulo
• Cell-Level Access Control via visibility
• By default, uses its own db for
users & credentials
• Can be extended in code to use other
Identity & Access Management
Infrastructure
Project Rhino
Intel launched this open source effort to improve security capabilities of Hadoop &
contributed code to Apache in early 2013.
Encrypted Data at Rest - JIRA Tasks HADOOP-9331 (Hadoop Crypto Codec
Framework and Crypto Codec Implementation) and MAPREDUCE-5025 (Key
Distribution and Management for Supporting Crypto Codec in MapReduce) .
ZOOKEEPER-1688 will provide the ability for transparent encryption of snapshots
and commit logs on disk, protecting against the leakage of sensitive information
from files at rest.
Token-Based Authentication & Unified Authorization Framework - JIRA
TasksHADOOP-9392 (Token-Based Authentication and Single Sign-On) and HADOOP9466(Unified Authorization Framework)
Improved Security in HBase - The JIRA Task HBASE-6222 (Add Per-KeyValue
Security) adds cell-level authorization to HBase – something that Apache Accumulo
has but HBase does not. HBASE-7544 builds on the encryption framework being
developed, extending it to HBase, providing transparent table encryption.
What’s the Best Guidance Now?
Identify and Understand the Sensitivity Levels of Your Data
Are there access control policies associated with your data?

Understand the Impact of the Release of Your Data
Netflix example – Could someone couple your data with open source data to
gain new (and unintended) insight?

Develop Policies & Procedures relating to Security & Privacy of Your Data
Sets
Data Ingest
Access Control within Your Organization
Cleansing/Sanitization/Destruction
Auditing
Monitoring Procedures
Incident Response

Develop a Technical Security Approach that Complements Hadoop Security
Questions?
Ksmith <AT> Novetta.COM
Ad

More Related Content

What's hot (18)

Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect...
Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect...Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect...
Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect...
BigDataEverywhere
 
Fighting cyber fraud with hadoop v2
Fighting cyber fraud with hadoop v2Fighting cyber fraud with hadoop v2
Fighting cyber fraud with hadoop v2
Niel Dunnage
 
Data Governance and Management in Cloud pak nam
Data Governance and Management in Cloud pak namData Governance and Management in Cloud pak nam
Data Governance and Management in Cloud pak nam
PT Datacomm Diangraha
 
Solving the Really Big Tech Problems with IoT
 Solving the Really Big Tech Problems with IoT Solving the Really Big Tech Problems with IoT
Solving the Really Big Tech Problems with IoT
Eric Kavanagh
 
Lessons Learned from Building and Running MHN, the World's Largest Crowdsourc...
Lessons Learned from Building and Running MHN, the World's Largest Crowdsourc...Lessons Learned from Building and Running MHN, the World's Largest Crowdsourc...
Lessons Learned from Building and Running MHN, the World's Largest Crowdsourc...
Jason Trost
 
Modern Honey Network at Bay Area Open Source Security Hackers
Modern Honey Network at Bay Area Open Source Security HackersModern Honey Network at Bay Area Open Source Security Hackers
Modern Honey Network at Bay Area Open Source Security Hackers
Jason Trost
 
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
DataWorks Summit/Hadoop Summit
 
GTB Data Leakage Prevention Use Cases 2014
GTB Data Leakage Prevention Use Cases 2014GTB Data Leakage Prevention Use Cases 2014
GTB Data Leakage Prevention Use Cases 2014
Ravindran Vasu
 
IRJET- Mutual Key Oversight Procedure for Cloud Security and Distribution of ...
IRJET- Mutual Key Oversight Procedure for Cloud Security and Distribution of ...IRJET- Mutual Key Oversight Procedure for Cloud Security and Distribution of ...
IRJET- Mutual Key Oversight Procedure for Cloud Security and Distribution of ...
IRJET Journal
 
Web Enabled DDS - London Connext DDS Conference
Web Enabled DDS - London Connext DDS ConferenceWeb Enabled DDS - London Connext DDS Conference
Web Enabled DDS - London Connext DDS Conference
Gerardo Pardo-Castellote
 
Anomali Detect 2016 - Borderless Threat Intelligence
Anomali Detect 2016 - Borderless Threat IntelligenceAnomali Detect 2016 - Borderless Threat Intelligence
Anomali Detect 2016 - Borderless Threat Intelligence
Jason Trost
 
SANS CTI Summit 2016 Borderless Threat Intelligence
SANS CTI Summit 2016 Borderless Threat IntelligenceSANS CTI Summit 2016 Borderless Threat Intelligence
SANS CTI Summit 2016 Borderless Threat Intelligence
Jason Trost
 
IRJET- Multi-Owner Keyword Search over Cloud with Cryptography
IRJET- Multi-Owner Keyword Search over Cloud with CryptographyIRJET- Multi-Owner Keyword Search over Cloud with Cryptography
IRJET- Multi-Owner Keyword Search over Cloud with Cryptography
IRJET Journal
 
Accelerated Migrations with Nuix
Accelerated Migrations with NuixAccelerated Migrations with Nuix
Accelerated Migrations with Nuix
Carey Bandler
 
Balancing data democratization with comprehensive information governance: bui...
Balancing data democratization with comprehensive information governance: bui...Balancing data democratization with comprehensive information governance: bui...
Balancing data democratization with comprehensive information governance: bui...
DataWorks Summit
 
Data Driven Security in SSAS
Data Driven Security in SSASData Driven Security in SSAS
Data Driven Security in SSAS
Mike Duffy
 
Data Leakage Prevention
Data Leakage PreventionData Leakage Prevention
Data Leakage Prevention
Microsoft TechNet - Belgium and Luxembourg
 
Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...
Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...
Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...
DataWorks Summit
 
Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect...
Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect...Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect...
Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect...
BigDataEverywhere
 
Fighting cyber fraud with hadoop v2
Fighting cyber fraud with hadoop v2Fighting cyber fraud with hadoop v2
Fighting cyber fraud with hadoop v2
Niel Dunnage
 
Data Governance and Management in Cloud pak nam
Data Governance and Management in Cloud pak namData Governance and Management in Cloud pak nam
Data Governance and Management in Cloud pak nam
PT Datacomm Diangraha
 
Solving the Really Big Tech Problems with IoT
 Solving the Really Big Tech Problems with IoT Solving the Really Big Tech Problems with IoT
Solving the Really Big Tech Problems with IoT
Eric Kavanagh
 
Lessons Learned from Building and Running MHN, the World's Largest Crowdsourc...
Lessons Learned from Building and Running MHN, the World's Largest Crowdsourc...Lessons Learned from Building and Running MHN, the World's Largest Crowdsourc...
Lessons Learned from Building and Running MHN, the World's Largest Crowdsourc...
Jason Trost
 
Modern Honey Network at Bay Area Open Source Security Hackers
Modern Honey Network at Bay Area Open Source Security HackersModern Honey Network at Bay Area Open Source Security Hackers
Modern Honey Network at Bay Area Open Source Security Hackers
Jason Trost
 
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
DataWorks Summit/Hadoop Summit
 
GTB Data Leakage Prevention Use Cases 2014
GTB Data Leakage Prevention Use Cases 2014GTB Data Leakage Prevention Use Cases 2014
GTB Data Leakage Prevention Use Cases 2014
Ravindran Vasu
 
IRJET- Mutual Key Oversight Procedure for Cloud Security and Distribution of ...
IRJET- Mutual Key Oversight Procedure for Cloud Security and Distribution of ...IRJET- Mutual Key Oversight Procedure for Cloud Security and Distribution of ...
IRJET- Mutual Key Oversight Procedure for Cloud Security and Distribution of ...
IRJET Journal
 
Web Enabled DDS - London Connext DDS Conference
Web Enabled DDS - London Connext DDS ConferenceWeb Enabled DDS - London Connext DDS Conference
Web Enabled DDS - London Connext DDS Conference
Gerardo Pardo-Castellote
 
Anomali Detect 2016 - Borderless Threat Intelligence
Anomali Detect 2016 - Borderless Threat IntelligenceAnomali Detect 2016 - Borderless Threat Intelligence
Anomali Detect 2016 - Borderless Threat Intelligence
Jason Trost
 
SANS CTI Summit 2016 Borderless Threat Intelligence
SANS CTI Summit 2016 Borderless Threat IntelligenceSANS CTI Summit 2016 Borderless Threat Intelligence
SANS CTI Summit 2016 Borderless Threat Intelligence
Jason Trost
 
IRJET- Multi-Owner Keyword Search over Cloud with Cryptography
IRJET- Multi-Owner Keyword Search over Cloud with CryptographyIRJET- Multi-Owner Keyword Search over Cloud with Cryptography
IRJET- Multi-Owner Keyword Search over Cloud with Cryptography
IRJET Journal
 
Accelerated Migrations with Nuix
Accelerated Migrations with NuixAccelerated Migrations with Nuix
Accelerated Migrations with Nuix
Carey Bandler
 
Balancing data democratization with comprehensive information governance: bui...
Balancing data democratization with comprehensive information governance: bui...Balancing data democratization with comprehensive information governance: bui...
Balancing data democratization with comprehensive information governance: bui...
DataWorks Summit
 
Data Driven Security in SSAS
Data Driven Security in SSASData Driven Security in SSAS
Data Driven Security in SSAS
Mike Duffy
 
Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...
Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...
Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...
DataWorks Summit
 

Viewers also liked (20)

2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_security
Adam Muise
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
Biju Nair
 
Big Data Security with Hadoop
Big Data Security with HadoopBig Data Security with Hadoop
Big Data Security with Hadoop
Cloudera, Inc.
 
Big Data Security and Governance
Big Data Security and GovernanceBig Data Security and Governance
Big Data Security and Governance
DataWorks Summit/Hadoop Summit
 
2008-01-22 Red Hat (Security) Roadmap Presentation
2008-01-22 Red Hat (Security) Roadmap Presentation2008-01-22 Red Hat (Security) Roadmap Presentation
2008-01-22 Red Hat (Security) Roadmap Presentation
Shawn Wells
 
Processing massive amount of data with Map Reduce using Apache Hadoop - Indi...
Processing massive amount of data with Map Reduce using Apache Hadoop  - Indi...Processing massive amount of data with Map Reduce using Apache Hadoop  - Indi...
Processing massive amount of data with Map Reduce using Apache Hadoop - Indi...
IndicThreads
 
Big security for big data
Big security for big dataBig security for big data
Big security for big data
Ari Elias-Bachrach
 
Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]
Lu Wei
 
Hadoop Security Preview
Hadoop Security PreviewHadoop Security Preview
Hadoop Security Preview
Hadoop User Group
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013
Jonathan Seidman
 
Introduction to Hadoop - The Essentials
Introduction to Hadoop - The EssentialsIntroduction to Hadoop - The Essentials
Introduction to Hadoop - The Essentials
Fadi Yousuf
 
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduceBIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
Mahantesh Angadi
 
Is Your Hadoop Environment Secure?
Is Your Hadoop Environment Secure?Is Your Hadoop Environment Secure?
Is Your Hadoop Environment Secure?
Datameer
 
Big Data Security with HP ArcSight
Big Data Security with HP ArcSightBig Data Security with HP ArcSight
Big Data Security with HP ArcSight
Sridhar Karnam
 
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Cloudera, Inc.
 
Hadoop Security Now and Future
Hadoop Security Now and FutureHadoop Security Now and Future
Hadoop Security Now and Future
tcloudcomputing-tw
 
The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014
Cloudera, Inc.
 
Advanced Security In Hadoop Cluster
Advanced Security In Hadoop ClusterAdvanced Security In Hadoop Cluster
Advanced Security In Hadoop Cluster
Edureka!
 
LPWA-Open for Business. It’s time to execute
LPWA-Open for Business. It’s time to executeLPWA-Open for Business. It’s time to execute
LPWA-Open for Business. It’s time to execute
Telefónica IoT
 
Big Data, Security Intelligence, (And Why I Hate This Title)
Big Data, Security Intelligence, (And Why I Hate This Title) Big Data, Security Intelligence, (And Why I Hate This Title)
Big Data, Security Intelligence, (And Why I Hate This Title)
Coastal Pet Products, Inc.
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_security
Adam Muise
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
Biju Nair
 
Big Data Security with Hadoop
Big Data Security with HadoopBig Data Security with Hadoop
Big Data Security with Hadoop
Cloudera, Inc.
 
2008-01-22 Red Hat (Security) Roadmap Presentation
2008-01-22 Red Hat (Security) Roadmap Presentation2008-01-22 Red Hat (Security) Roadmap Presentation
2008-01-22 Red Hat (Security) Roadmap Presentation
Shawn Wells
 
Processing massive amount of data with Map Reduce using Apache Hadoop - Indi...
Processing massive amount of data with Map Reduce using Apache Hadoop  - Indi...Processing massive amount of data with Map Reduce using Apache Hadoop  - Indi...
Processing massive amount of data with Map Reduce using Apache Hadoop - Indi...
IndicThreads
 
Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]
Lu Wei
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013
Jonathan Seidman
 
Introduction to Hadoop - The Essentials
Introduction to Hadoop - The EssentialsIntroduction to Hadoop - The Essentials
Introduction to Hadoop - The Essentials
Fadi Yousuf
 
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduceBIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
Mahantesh Angadi
 
Is Your Hadoop Environment Secure?
Is Your Hadoop Environment Secure?Is Your Hadoop Environment Secure?
Is Your Hadoop Environment Secure?
Datameer
 
Big Data Security with HP ArcSight
Big Data Security with HP ArcSightBig Data Security with HP ArcSight
Big Data Security with HP ArcSight
Sridhar Karnam
 
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Cloudera, Inc.
 
Hadoop Security Now and Future
Hadoop Security Now and FutureHadoop Security Now and Future
Hadoop Security Now and Future
tcloudcomputing-tw
 
The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014
Cloudera, Inc.
 
Advanced Security In Hadoop Cluster
Advanced Security In Hadoop ClusterAdvanced Security In Hadoop Cluster
Advanced Security In Hadoop Cluster
Edureka!
 
LPWA-Open for Business. It’s time to execute
LPWA-Open for Business. It’s time to executeLPWA-Open for Business. It’s time to execute
LPWA-Open for Business. It’s time to execute
Telefónica IoT
 
Big Data, Security Intelligence, (And Why I Hate This Title)
Big Data, Security Intelligence, (And Why I Hate This Title) Big Data, Security Intelligence, (And Why I Hate This Title)
Big Data, Security Intelligence, (And Why I Hate This Title)
Coastal Pet Products, Inc.
 
Ad

Similar to Hadoop and Big Data Security (20)

Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
sahil lalwani
 
Cryptographie avancée et Logical Data Fabric : Accélérez le partage et la mig...
Cryptographie avancée et Logical Data Fabric : Accélérez le partage et la mig...Cryptographie avancée et Logical Data Fabric : Accélérez le partage et la mig...
Cryptographie avancée et Logical Data Fabric : Accélérez le partage et la mig...
Denodo
 
A robust and verifiable threshold multi authority access control system in pu...
A robust and verifiable threshold multi authority access control system in pu...A robust and verifiable threshold multi authority access control system in pu...
A robust and verifiable threshold multi authority access control system in pu...
IJARIIT
 
A Trusted TPA Model, to Improve Security & Reliability for Cloud Storage
A Trusted TPA Model, to Improve Security & Reliability for Cloud StorageA Trusted TPA Model, to Improve Security & Reliability for Cloud Storage
A Trusted TPA Model, to Improve Security & Reliability for Cloud Storage
IRJET Journal
 
Zero Trust 20211105
Zero Trust 20211105 Zero Trust 20211105
Zero Trust 20211105
Thomas Treml
 
Achieving Secure, sclable and finegrained Cloud computing report
Achieving Secure, sclable and finegrained Cloud computing reportAchieving Secure, sclable and finegrained Cloud computing report
Achieving Secure, sclable and finegrained Cloud computing report
Kiran Girase
 
IRJET- A Novel and Secure Approach to Control and Access Data in Cloud St...
IRJET-  	  A Novel and Secure Approach to Control and Access Data in Cloud St...IRJET-  	  A Novel and Secure Approach to Control and Access Data in Cloud St...
IRJET- A Novel and Secure Approach to Control and Access Data in Cloud St...
IRJET Journal
 
Simplifying Data Governance and Security with a Logical Data Fabric (ASEAN)
Simplifying Data Governance and Security with a Logical Data Fabric (ASEAN)Simplifying Data Governance and Security with a Logical Data Fabric (ASEAN)
Simplifying Data Governance and Security with a Logical Data Fabric (ASEAN)
Denodo
 
Proxy-Oriented Data Uploading & Monitoring Remote Data Integrity in Public Cloud
Proxy-Oriented Data Uploading & Monitoring Remote Data Integrity in Public CloudProxy-Oriented Data Uploading & Monitoring Remote Data Integrity in Public Cloud
Proxy-Oriented Data Uploading & Monitoring Remote Data Integrity in Public Cloud
IRJET Journal
 
Final PPT after cla after class (1).pptx
Final PPT after cla after class (1).pptxFinal PPT after cla after class (1).pptx
Final PPT after cla after class (1).pptx
nandan543979
 
Trusted db a trusted hardware based database with privacy and data confidenti...
Trusted db a trusted hardware based database with privacy and data confidenti...Trusted db a trusted hardware based database with privacy and data confidenti...
Trusted db a trusted hardware based database with privacy and data confidenti...
LeMeniz Infotech
 
A New Mode to Ensure Security in Cloud Computing Services
A New Mode to Ensure Security in Cloud Computing ServicesA New Mode to Ensure Security in Cloud Computing Services
A New Mode to Ensure Security in Cloud Computing Services
Mahmuda Rahman
 
Fighting cyber fraud with hadoop
Fighting cyber fraud with hadoopFighting cyber fraud with hadoop
Fighting cyber fraud with hadoop
Niel Dunnage
 
Preparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity RenaissancePreparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity Renaissance
Cloudera, Inc.
 
Implementing an improved security for collin’s database and telecommuters
Implementing an improved security for collin’s database and telecommutersImplementing an improved security for collin’s database and telecommuters
Implementing an improved security for collin’s database and telecommuters
Rishabh Gupta
 
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
Editor IJCATR
 
Rik Ferguson
Rik FergusonRik Ferguson
Rik Ferguson
CloudExpoEurope
 
Privacy Preserving in Authentication Protocol for Shared Authority Based Clou...
Privacy Preserving in Authentication Protocol for Shared Authority Based Clou...Privacy Preserving in Authentication Protocol for Shared Authority Based Clou...
Privacy Preserving in Authentication Protocol for Shared Authority Based Clou...
IRJET Journal
 
0th PPT - BLOCKCHAIN-CBE (1).ppt
0th PPT - BLOCKCHAIN-CBE (1).ppt0th PPT - BLOCKCHAIN-CBE (1).ppt
0th PPT - BLOCKCHAIN-CBE (1).ppt
VarioTechnology
 
Compliance in the Cloud
Compliance in the CloudCompliance in the Cloud
Compliance in the Cloud
RapidScale
 
Cryptographie avancée et Logical Data Fabric : Accélérez le partage et la mig...
Cryptographie avancée et Logical Data Fabric : Accélérez le partage et la mig...Cryptographie avancée et Logical Data Fabric : Accélérez le partage et la mig...
Cryptographie avancée et Logical Data Fabric : Accélérez le partage et la mig...
Denodo
 
A robust and verifiable threshold multi authority access control system in pu...
A robust and verifiable threshold multi authority access control system in pu...A robust and verifiable threshold multi authority access control system in pu...
A robust and verifiable threshold multi authority access control system in pu...
IJARIIT
 
A Trusted TPA Model, to Improve Security & Reliability for Cloud Storage
A Trusted TPA Model, to Improve Security & Reliability for Cloud StorageA Trusted TPA Model, to Improve Security & Reliability for Cloud Storage
A Trusted TPA Model, to Improve Security & Reliability for Cloud Storage
IRJET Journal
 
Zero Trust 20211105
Zero Trust 20211105 Zero Trust 20211105
Zero Trust 20211105
Thomas Treml
 
Achieving Secure, sclable and finegrained Cloud computing report
Achieving Secure, sclable and finegrained Cloud computing reportAchieving Secure, sclable and finegrained Cloud computing report
Achieving Secure, sclable and finegrained Cloud computing report
Kiran Girase
 
IRJET- A Novel and Secure Approach to Control and Access Data in Cloud St...
IRJET-  	  A Novel and Secure Approach to Control and Access Data in Cloud St...IRJET-  	  A Novel and Secure Approach to Control and Access Data in Cloud St...
IRJET- A Novel and Secure Approach to Control and Access Data in Cloud St...
IRJET Journal
 
Simplifying Data Governance and Security with a Logical Data Fabric (ASEAN)
Simplifying Data Governance and Security with a Logical Data Fabric (ASEAN)Simplifying Data Governance and Security with a Logical Data Fabric (ASEAN)
Simplifying Data Governance and Security with a Logical Data Fabric (ASEAN)
Denodo
 
Proxy-Oriented Data Uploading & Monitoring Remote Data Integrity in Public Cloud
Proxy-Oriented Data Uploading & Monitoring Remote Data Integrity in Public CloudProxy-Oriented Data Uploading & Monitoring Remote Data Integrity in Public Cloud
Proxy-Oriented Data Uploading & Monitoring Remote Data Integrity in Public Cloud
IRJET Journal
 
Final PPT after cla after class (1).pptx
Final PPT after cla after class (1).pptxFinal PPT after cla after class (1).pptx
Final PPT after cla after class (1).pptx
nandan543979
 
Trusted db a trusted hardware based database with privacy and data confidenti...
Trusted db a trusted hardware based database with privacy and data confidenti...Trusted db a trusted hardware based database with privacy and data confidenti...
Trusted db a trusted hardware based database with privacy and data confidenti...
LeMeniz Infotech
 
A New Mode to Ensure Security in Cloud Computing Services
A New Mode to Ensure Security in Cloud Computing ServicesA New Mode to Ensure Security in Cloud Computing Services
A New Mode to Ensure Security in Cloud Computing Services
Mahmuda Rahman
 
Fighting cyber fraud with hadoop
Fighting cyber fraud with hadoopFighting cyber fraud with hadoop
Fighting cyber fraud with hadoop
Niel Dunnage
 
Preparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity RenaissancePreparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity Renaissance
Cloudera, Inc.
 
Implementing an improved security for collin’s database and telecommuters
Implementing an improved security for collin’s database and telecommutersImplementing an improved security for collin’s database and telecommuters
Implementing an improved security for collin’s database and telecommuters
Rishabh Gupta
 
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
Editor IJCATR
 
Privacy Preserving in Authentication Protocol for Shared Authority Based Clou...
Privacy Preserving in Authentication Protocol for Shared Authority Based Clou...Privacy Preserving in Authentication Protocol for Shared Authority Based Clou...
Privacy Preserving in Authentication Protocol for Shared Authority Based Clou...
IRJET Journal
 
0th PPT - BLOCKCHAIN-CBE (1).ppt
0th PPT - BLOCKCHAIN-CBE (1).ppt0th PPT - BLOCKCHAIN-CBE (1).ppt
0th PPT - BLOCKCHAIN-CBE (1).ppt
VarioTechnology
 
Compliance in the Cloud
Compliance in the CloudCompliance in the Cloud
Compliance in the Cloud
RapidScale
 
Ad

More from Chicago Hadoop Users Group (19)

Kinetica master chug_9.12
Kinetica master chug_9.12Kinetica master chug_9.12
Kinetica master chug_9.12
Chicago Hadoop Users Group
 
Chug dl presentation
Chug dl presentationChug dl presentation
Chug dl presentation
Chicago Hadoop Users Group
 
Yahoo compares Storm and Spark
Yahoo compares Storm and SparkYahoo compares Storm and Spark
Yahoo compares Storm and Spark
Chicago Hadoop Users Group
 
Using Apache Drill
Using Apache DrillUsing Apache Drill
Using Apache Drill
Chicago Hadoop Users Group
 
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...
Chicago Hadoop Users Group
 
Meet Spark
Meet SparkMeet Spark
Meet Spark
Chicago Hadoop Users Group
 
Choosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your BusinessChoosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your Business
Chicago Hadoop Users Group
 
An Overview of Ambari
An Overview of AmbariAn Overview of Ambari
An Overview of Ambari
Chicago Hadoop Users Group
 
Introduction to MapReduce
Introduction to MapReduceIntroduction to MapReduce
Introduction to MapReduce
Chicago Hadoop Users Group
 
Advanced Oozie
Advanced OozieAdvanced Oozie
Advanced Oozie
Chicago Hadoop Users Group
 
Scalding for Hadoop
Scalding for HadoopScalding for Hadoop
Scalding for Hadoop
Chicago Hadoop Users Group
 
Financial Data Analytics with Hadoop
Financial Data Analytics with HadoopFinancial Data Analytics with Hadoop
Financial Data Analytics with Hadoop
Chicago Hadoop Users Group
 
Everything you wanted to know, but were afraid to ask about Oozie
Everything you wanted to know, but were afraid to ask about OozieEverything you wanted to know, but were afraid to ask about Oozie
Everything you wanted to know, but were afraid to ask about Oozie
Chicago Hadoop Users Group
 
An Introduction to Impala – Low Latency Queries for Apache Hadoop
An Introduction to Impala – Low Latency Queries for Apache HadoopAn Introduction to Impala – Low Latency Queries for Apache Hadoop
An Introduction to Impala – Low Latency Queries for Apache Hadoop
Chicago Hadoop Users Group
 
HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917
Chicago Hadoop Users Group
 
Map Reduce v2 and YARN - CHUG - 20120604
Map Reduce v2 and YARN - CHUG - 20120604Map Reduce v2 and YARN - CHUG - 20120604
Map Reduce v2 and YARN - CHUG - 20120604
Chicago Hadoop Users Group
 
Hadoop in a Windows Shop - CHUG - 20120416
Hadoop in a Windows Shop - CHUG - 20120416Hadoop in a Windows Shop - CHUG - 20120416
Hadoop in a Windows Shop - CHUG - 20120416
Chicago Hadoop Users Group
 
Running R on Hadoop - CHUG - 20120815
Running R on Hadoop - CHUG - 20120815Running R on Hadoop - CHUG - 20120815
Running R on Hadoop - CHUG - 20120815
Chicago Hadoop Users Group
 
Avro - More Than Just a Serialization Framework - CHUG - 20120416
Avro - More Than Just a Serialization Framework - CHUG - 20120416Avro - More Than Just a Serialization Framework - CHUG - 20120416
Avro - More Than Just a Serialization Framework - CHUG - 20120416
Chicago Hadoop Users Group
 
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...
Chicago Hadoop Users Group
 
Choosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your BusinessChoosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your Business
Chicago Hadoop Users Group
 
Everything you wanted to know, but were afraid to ask about Oozie
Everything you wanted to know, but were afraid to ask about OozieEverything you wanted to know, but were afraid to ask about Oozie
Everything you wanted to know, but were afraid to ask about Oozie
Chicago Hadoop Users Group
 
An Introduction to Impala – Low Latency Queries for Apache Hadoop
An Introduction to Impala – Low Latency Queries for Apache HadoopAn Introduction to Impala – Low Latency Queries for Apache Hadoop
An Introduction to Impala – Low Latency Queries for Apache Hadoop
Chicago Hadoop Users Group
 
HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917
Chicago Hadoop Users Group
 
Avro - More Than Just a Serialization Framework - CHUG - 20120416
Avro - More Than Just a Serialization Framework - CHUG - 20120416Avro - More Than Just a Serialization Framework - CHUG - 20120416
Avro - More Than Just a Serialization Framework - CHUG - 20120416
Chicago Hadoop Users Group
 

Recently uploaded (20)

May Patch Tuesday
May Patch TuesdayMay Patch Tuesday
May Patch Tuesday
Ivanti
 
OpenAI Just Announced Codex: A cloud engineering agent that excels in handlin...
OpenAI Just Announced Codex: A cloud engineering agent that excels in handlin...OpenAI Just Announced Codex: A cloud engineering agent that excels in handlin...
OpenAI Just Announced Codex: A cloud engineering agent that excels in handlin...
SOFTTECHHUB
 
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptxDevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
Justin Reock
 
machines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdfmachines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdf
AmirStern2
 
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier VroomAI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
UXPA Boston
 
Sustainable_Development_Goals_INDIANWraa
Sustainable_Development_Goals_INDIANWraaSustainable_Development_Goals_INDIANWraa
Sustainable_Development_Goals_INDIANWraa
03ANMOLCHAURASIYA
 
IT488 Wireless Sensor Networks_Information Technology
IT488 Wireless Sensor Networks_Information TechnologyIT488 Wireless Sensor Networks_Information Technology
IT488 Wireless Sensor Networks_Information Technology
SHEHABALYAMANI
 
Distributionally Robust Statistical Verification with Imprecise Neural Networks
Distributionally Robust Statistical Verification with Imprecise Neural NetworksDistributionally Robust Statistical Verification with Imprecise Neural Networks
Distributionally Robust Statistical Verification with Imprecise Neural Networks
Ivan Ruchkin
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
Understanding SEO in the Age of AI.pdf
Understanding SEO in the Age of AI.pdfUnderstanding SEO in the Age of AI.pdf
Understanding SEO in the Age of AI.pdf
Fulcrum Concepts, LLC
 
Config 2025 presentation recap covering both days
Config 2025 presentation recap covering both daysConfig 2025 presentation recap covering both days
Config 2025 presentation recap covering both days
TrishAntoni1
 
Secondary Storage for a microcontroller system
Secondary Storage for a microcontroller systemSecondary Storage for a microcontroller system
Secondary Storage for a microcontroller system
fizarcse
 
Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)
Kaya Weers
 
Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...
Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...
Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...
Vasileios Komianos
 
UiPath AgentHack - Build the AI agents of tomorrow_Enablement 1.pptx
UiPath AgentHack - Build the AI agents of tomorrow_Enablement 1.pptxUiPath AgentHack - Build the AI agents of tomorrow_Enablement 1.pptx
UiPath AgentHack - Build the AI agents of tomorrow_Enablement 1.pptx
anabulhac
 
React Native for Business Solutions: Building Scalable Apps for Success
React Native for Business Solutions: Building Scalable Apps for SuccessReact Native for Business Solutions: Building Scalable Apps for Success
React Native for Business Solutions: Building Scalable Apps for Success
Amelia Swank
 
Artificial_Intelligence_in_Everyday_Life.pptx
Artificial_Intelligence_in_Everyday_Life.pptxArtificial_Intelligence_in_Everyday_Life.pptx
Artificial_Intelligence_in_Everyday_Life.pptx
03ANMOLCHAURASIYA
 
In-App Guidance_ Save Enterprises Millions in Training & IT Costs.pptx
In-App Guidance_ Save Enterprises Millions in Training & IT Costs.pptxIn-App Guidance_ Save Enterprises Millions in Training & IT Costs.pptx
In-App Guidance_ Save Enterprises Millions in Training & IT Costs.pptx
aptyai
 
How Top Companies Benefit from Outsourcing
How Top Companies Benefit from OutsourcingHow Top Companies Benefit from Outsourcing
How Top Companies Benefit from Outsourcing
Nascenture
 
Dark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanizationDark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanization
Jakub Šimek
 
May Patch Tuesday
May Patch TuesdayMay Patch Tuesday
May Patch Tuesday
Ivanti
 
OpenAI Just Announced Codex: A cloud engineering agent that excels in handlin...
OpenAI Just Announced Codex: A cloud engineering agent that excels in handlin...OpenAI Just Announced Codex: A cloud engineering agent that excels in handlin...
OpenAI Just Announced Codex: A cloud engineering agent that excels in handlin...
SOFTTECHHUB
 
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptxDevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
Justin Reock
 
machines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdfmachines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdf
AmirStern2
 
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier VroomAI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
UXPA Boston
 
Sustainable_Development_Goals_INDIANWraa
Sustainable_Development_Goals_INDIANWraaSustainable_Development_Goals_INDIANWraa
Sustainable_Development_Goals_INDIANWraa
03ANMOLCHAURASIYA
 
IT488 Wireless Sensor Networks_Information Technology
IT488 Wireless Sensor Networks_Information TechnologyIT488 Wireless Sensor Networks_Information Technology
IT488 Wireless Sensor Networks_Information Technology
SHEHABALYAMANI
 
Distributionally Robust Statistical Verification with Imprecise Neural Networks
Distributionally Robust Statistical Verification with Imprecise Neural NetworksDistributionally Robust Statistical Verification with Imprecise Neural Networks
Distributionally Robust Statistical Verification with Imprecise Neural Networks
Ivan Ruchkin
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
Understanding SEO in the Age of AI.pdf
Understanding SEO in the Age of AI.pdfUnderstanding SEO in the Age of AI.pdf
Understanding SEO in the Age of AI.pdf
Fulcrum Concepts, LLC
 
Config 2025 presentation recap covering both days
Config 2025 presentation recap covering both daysConfig 2025 presentation recap covering both days
Config 2025 presentation recap covering both days
TrishAntoni1
 
Secondary Storage for a microcontroller system
Secondary Storage for a microcontroller systemSecondary Storage for a microcontroller system
Secondary Storage for a microcontroller system
fizarcse
 
Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)
Kaya Weers
 
Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...
Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...
Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...
Vasileios Komianos
 
UiPath AgentHack - Build the AI agents of tomorrow_Enablement 1.pptx
UiPath AgentHack - Build the AI agents of tomorrow_Enablement 1.pptxUiPath AgentHack - Build the AI agents of tomorrow_Enablement 1.pptx
UiPath AgentHack - Build the AI agents of tomorrow_Enablement 1.pptx
anabulhac
 
React Native for Business Solutions: Building Scalable Apps for Success
React Native for Business Solutions: Building Scalable Apps for SuccessReact Native for Business Solutions: Building Scalable Apps for Success
React Native for Business Solutions: Building Scalable Apps for Success
Amelia Swank
 
Artificial_Intelligence_in_Everyday_Life.pptx
Artificial_Intelligence_in_Everyday_Life.pptxArtificial_Intelligence_in_Everyday_Life.pptx
Artificial_Intelligence_in_Everyday_Life.pptx
03ANMOLCHAURASIYA
 
In-App Guidance_ Save Enterprises Millions in Training & IT Costs.pptx
In-App Guidance_ Save Enterprises Millions in Training & IT Costs.pptxIn-App Guidance_ Save Enterprises Millions in Training & IT Costs.pptx
In-App Guidance_ Save Enterprises Millions in Training & IT Costs.pptx
aptyai
 
How Top Companies Benefit from Outsourcing
How Top Companies Benefit from OutsourcingHow Top Companies Benefit from Outsourcing
How Top Companies Benefit from Outsourcing
Nascenture
 
Dark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanizationDark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanization
Jakub Šimek
 

Hadoop and Big Data Security

  • 1. Hadoop and Big Data Security Kevin T. Smith, 11/14/2013 Ksmith <AT> Novetta . COM
  • 2. Big Data Security – Why Should We Care? New Challenges related to Data Management, Security, and Privacy As data growth is explosive, so is the complexity of our IT environments Many organizations required to enforce access control & privacy restrictions on data sets (HIPAA, Privacy Laws) – or face steep penalties & fines Organizations are increasingly required to enforce access control to their data scientists based on Need-to-Know, User Authorization levels, and what data they are allowed to see – especially in Healthcare, Finance, and Government Organizations struggling to understand what data they can release Mismanagement of Data Sets -- Costly.. AOL Research “Data Valdez” Incident • CNNMoney - “101 Dumbest Moments in Business” • $5 Million Settlement , plus $100 to each member of AOL between 3/2006-5/2006, + $50 to each member who believed their data was in the released data; Fired employees, CTO Resignation The Netflix Contest Anonymized Data Set Incident • Class-Action Lawsuit, $9 Million Settlement Massachusetts Hospital Record Incident Cyber Security Attacks are on the Rise Ponemon Institute – the Average Cost of a Data Breach in the U.S. is 5.4 Million dollars* Playstation (2011) – Experts predict costs between 2.2 and 2.4 Billion * (Breach Study: Global Analysis, May 2013)
  • 3. A (Brief) History of Hadoop Security Hadoop developed without Security in Mind Originally No Security model No authentication of users or services Anyone could submit arbitrary code to be executed Later authorization added, but any user could impersonate other users with command-line switch In 2009, Yahoo! focused on Hadoop Authentication, and did a Hadoop Redesign, But… Resulting Security Model is Complex Security Configuration is complex & Easy to Mess Up No Data at Rest Encryption Kerberos-Centric Limited Authorization Capabilities Things are Changing, But Slowly.. It is important to understand how Hadoop Security is Currently Implemented & Configured It is important to understand how to meet your organization’s security requirements
  • 4. Hadoop Security Data Flow Distributed Security is a Challenge Since the .20.20x distributions of Hadoop, much of the model is Kerberos Centric , as you see to the right Model is quite complex, as you will see on the next slide
  • 5. Token Delegation & Hadoop Security Flow Token Used For Kerberos TGT Kerberos initial authentication to KDC. Kerberos service ticket Kerberos initial authentication between users, client processes, and services. Delegation token Token issued by the NameNode to the client, used by the client or any services working on the client’s behalf to authenticate them to the NameNode. Block Access token Token issued by the NameNode after validating authorization to a particular block of data, based on a shared secret with the DataNode. Clients (and services working on the client’s behalf) use the Block Access token to request blocks from the DataNode. Job token This is issued by the JobTracker to TaskTrackers. Tasks communicating with TaskTrackers for a particular job use this token to prove they are associated with the job.
  • 6. Some Vendor Activity in Hadoop Security Seems to be a New One Every Week! Cloudera Sentry – Fine Grained Access Control for Apache Hive & Cloudera Impala IBM InfoSphere Optim Data Masking – Optim Data Masking provides “Deidentification” of data by obfuscating corporate secrets, Guardium provides monitoring & auditing Intel’s Secure Hadoop Distribution – Encryption in transit & at rest, Granular access control with HBase DataStax Enterprise – Encryption in Transit & at Rest (using Cassandra for storage) DataGuise for Hadoop – Detects & protects sensitive data, setting access permission, masking or encrypting data, authorization based access Knox Gateway (Hortonworks) – Perimeter security, integration with IDAM environments, manage security across multiple clusters – now an Apache Project Protegrity – Big Data Protector provides Encryption & tokenization, Enterprise Security Administrator provides central policy, key mgmt, auditing, reporting Sqrrl – Builds on Apache Accumulo’s security capabilities for Hadoop Zettaset Secure Orchestrator – security wrapper around Hadoop
  • 7. Apache Accumulo • Cell-Level Access Control via visibility • By default, uses its own db for users & credentials • Can be extended in code to use other Identity & Access Management Infrastructure
  • 8. Project Rhino Intel launched this open source effort to improve security capabilities of Hadoop & contributed code to Apache in early 2013. Encrypted Data at Rest - JIRA Tasks HADOOP-9331 (Hadoop Crypto Codec Framework and Crypto Codec Implementation) and MAPREDUCE-5025 (Key Distribution and Management for Supporting Crypto Codec in MapReduce) . ZOOKEEPER-1688 will provide the ability for transparent encryption of snapshots and commit logs on disk, protecting against the leakage of sensitive information from files at rest. Token-Based Authentication & Unified Authorization Framework - JIRA TasksHADOOP-9392 (Token-Based Authentication and Single Sign-On) and HADOOP9466(Unified Authorization Framework) Improved Security in HBase - The JIRA Task HBASE-6222 (Add Per-KeyValue Security) adds cell-level authorization to HBase – something that Apache Accumulo has but HBase does not. HBASE-7544 builds on the encryption framework being developed, extending it to HBase, providing transparent table encryption.
  • 9. What’s the Best Guidance Now? Identify and Understand the Sensitivity Levels of Your Data Are there access control policies associated with your data? Understand the Impact of the Release of Your Data Netflix example – Could someone couple your data with open source data to gain new (and unintended) insight? Develop Policies & Procedures relating to Security & Privacy of Your Data Sets Data Ingest Access Control within Your Organization Cleansing/Sanitization/Destruction Auditing Monitoring Procedures Incident Response Develop a Technical Security Approach that Complements Hadoop Security
  翻译: