Hands-On Hadoop Tutorial Chris Sosa Wolfgang Richter May 23, 2008
General Information: Hadoop uses HDFS, a distributed file system based on GFS, as its shared filesystem. The HDFS architecture divides files into large chunks (~64 MB) distributed across data servers. HDFS has a global namespace.
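As a rough illustration of the chunking described above, the sketch below computes how a file of a given size would be cut into fixed-size chunks. The function name chunk_layout and the 200 MB example are hypothetical; only the ~64 MB chunk size comes from the slide, and this does not model replication or how HDFS actually places chunks on data servers.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # ~64 MB, the chunk size quoted on the slide


def chunk_layout(file_size, chunk_size=CHUNK_SIZE):
    """Return (offset, length) pairs covering a file of file_size bytes.

    Hypothetical helper for illustration only; it is not part of Hadoop.
    """
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    layout = []
    offset = 0
    while offset < file_size:
        # The last chunk may be shorter than chunk_size.
        length = min(chunk_size, file_size - offset)
        layout.append((offset, length))
        offset += length
    return layout


if __name__ == "__main__":
    # A hypothetical 200 MB file.
    for offset, length in chunk_layout(200 * 1024 * 1024):
        print(offset, length)
```

Under these assumptions, a file that is not an exact multiple of the chunk size simply ends with one shorter chunk.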
General Information (cont’d): A script is provided for your convenience. Run source /localtmp/hadoop/setupVars from centurion064. This lets you type just command instead of {somePath}/command. Go to http://www.cs.virginia.edu/~cbs6n/hadoop for web access; these slides and more information are also available there. Once you use the DFS (put something in it), relative paths are resolved from /usr/{your user id}. E.g., if your id is tb28, your “home dir” is /usr/tb28.
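The relative-path rule above can be sketched as follows. This is only a model of the behaviour the slide describes, not Hadoop code: the function resolve_dfs_path and its home_root parameter are hypothetical names, and the /usr root is taken from the slide as written.

```python
import posixpath


def resolve_dfs_path(path, user_id, home_root="/usr"):
    """Resolve a DFS path the way the slide describes relative paths.

    Absolute paths are returned unchanged (after normalisation); relative
    paths are interpreted from home_root/user_id. Hypothetical helper.
    """
    if posixpath.isabs(path):
        return posixpath.normpath(path)
    return posixpath.normpath(posixpath.join(home_root, user_id, path))


if __name__ == "__main__":
    print(resolve_dfs_path("input/data.txt", "tb28"))
    print(resolve_dfs_path("/tmp/shared.txt", "tb28"))
```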
Master Node: Hadoop is currently configured with centurion064 as the master node. The master node keeps track of the namespace and metadata about items, and keeps track of MapReduce jobs in the system.
Slave Nodes: centurion064 also acts as a slave node. Slave nodes manage blocks of data sent from the master node; in GFS terms, these are the chunkservers. Currently centurion060 is also a slave node.
Hadoop Paths: Hadoop is locally “installed” on each machine. The installed location is /localtmp/hadoop/hadoop-0.15.3. Slave nodes store their data in /localtmp/hadoop/hadoop-dfs (this is created automatically by the DFS). /localtmp/hadoop is owned by group gbg (someone in this group, or a CS admin, must administer it). Files are divided into 64 MB chunks (this is configurable).
Starting / Stopping Hadoop: For the purposes of this tutorial, we assume you have run the setupVars script from earlier. start-all.sh starts all slave nodes and the master node. stop-all.sh stops all slave nodes and the master node.
Using HDFS (1/2)
hadoop dfs
    [-ls <path>]
    [-du <path>]
    [-cp <src> <dst>]
    [-rm <path>]
    [-put <localsrc> <dst>]
    [-copyFromLocal <localsrc> <dst>]
    [-moveFromLocal <localsrc> <dst>]
    [-get [-crc] <src> <localdst>]
    [-cat <src>]
    [-copyToLocal [-crc] <src> <localdst>]
    [-moveToLocal [-crc] <src> <localdst>]
    [-mkdir <path>]
    [-touchz <path>]
    [-test -[ezd] <path>]
    [-stat [format] <path>]
    [-help [cmd]]
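For scripting, one possible approach is to build these command lines programmatically. The sketch below only constructs the argument list and does not run anything, so it works without a Hadoop installation. The helper name dfs_argv and the file names are hypothetical; the set of subcommands is copied from the usage text above and nothing beyond it is assumed.

```python
import shlex

# Subcommands copied from the usage text above; no others are assumed.
KNOWN_SUBCOMMANDS = {
    "ls", "du", "cp", "rm", "put", "copyFromLocal", "moveFromLocal",
    "get", "cat", "copyToLocal", "moveToLocal", "mkdir", "touchz",
    "test", "stat", "help",
}


def dfs_argv(subcommand, *args):
    """Build the argument list for `hadoop dfs -<subcommand> <args...>`.

    Hypothetical helper: it only assembles the list and never runs it.
    """
    if subcommand not in KNOWN_SUBCOMMANDS:
        raise ValueError("unknown dfs subcommand: " + subcommand)
    return ["hadoop", "dfs", "-" + subcommand, *args]


if __name__ == "__main__":
    # Hypothetical file names, shown as a shell-quoted command line.
    argv = dfs_argv("put", "notes.txt", "notes.txt")
    print(" ".join(shlex.quote(part) for part in argv))
```

The resulting list could be passed to subprocess.run on a machine where the hadoop command is on the PATH, for example after sourcing setupVars.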
Using HDFS (2/2): Want to reformat? Easy: hadoop namenode -format. Most commands follow the same pattern: hadoop “some command” options. If you just type hadoop, you get a list of all possible commands (including undocumented ones, hooray).
To Add Another Slave: This adds another data node / job execution site to the pool. Hadoop dynamically uses the filesystem underneath it; if more space is available on the HDD, HDFS will try to use it when it needs to. Modify the slaves file in centurion064:/localtmp/hadoop/hadoop-0.15.3/conf. Copy the code installation dir to newMachine:/localtmp/hadoop/hadoop-0.15.3 (it is very small). Restart Hadoop.
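The "modify the slaves file" step could be scripted roughly as below, assuming the slaves file lists one hostname per line. The function add_slave is a hypothetical helper, and centurion061 is a made-up hostname used only for the demo. Copying the installation directory and restarting Hadoop are not covered here.

```python
import os
import tempfile
from pathlib import Path


def add_slave(slaves_file, hostname):
    """Append hostname to the slaves file unless it is already listed.

    Assumes one hostname per line. Returns True if the file was changed.
    Hypothetical helper, not part of Hadoop.
    """
    path = Path(slaves_file)
    text = path.read_text() if path.exists() else ""
    hosts = [line.strip() for line in text.splitlines() if line.strip()]
    if hostname in hosts:
        return False
    # Make sure the new entry starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + hostname + "\n")
    return True


if __name__ == "__main__":
    # Demo on a throwaway copy rather than the real conf/slaves file.
    demo = os.path.join(tempfile.mkdtemp(), "slaves")
    Path(demo).write_text("centurion064\ncenturion060\n")
    add_slave(demo, "centurion061")  # hypothetical new machine
    print(Path(demo).read_text())
```

Making the helper idempotent means it can be rerun safely if the later copy or restart steps fail.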
Configure Hadoop: You can configure Hadoop in {$installation dir}/conf. Use hadoop-default.xml for global settings and hadoop-site.xml for site-specific settings (which override the global ones).
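The override rule can be sketched as a simple merge in which site-specific values replace global ones. The example assumes the usual Hadoop layout of property elements with name and value children inside a configuration element. The property names dfs.block.size and dfs.replication and all the values shown are illustrative assumptions, not taken from the slides, and the functions are hypothetical helpers rather than Hadoop's own loader.

```python
import xml.etree.ElementTree as ET


def parse_properties(xml_text):
    """Read <property><name>/<value> pairs from a configuration document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def effective_config(default_xml, site_xml):
    """Merge the two files: site-specific values override global ones."""
    merged = parse_properties(default_xml)
    merged.update(parse_properties(site_xml))
    return merged


# Illustrative stand-ins for hadoop-default.xml and hadoop-site.xml.
DEFAULT_XML = """<configuration>
  <property><name>dfs.block.size</name><value>67108864</value></property>
  <property><name>dfs.replication</name><value>3</value></property>
</configuration>"""

SITE_XML = """<configuration>
  <property><name>dfs.block.size</name><value>134217728</value></property>
</configuration>"""

if __name__ == "__main__":
    for key, value in sorted(effective_config(DEFAULT_XML, SITE_XML).items()):
        print(key, "=", value)
```

In this sketch, a property set only in the global file keeps its global value, while a property set in both files takes the site-specific one.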
That’s it for Configuration!
Real-time Access
Hadoop Tutorial

  • 1. Hands-On Hadoop Tutorial. Chris Sosa, Wolfgang Richter. May 23, 2008.
  • 2. General Information: Hadoop uses HDFS, a distributed file system based on GFS, as its shared filesystem. The HDFS architecture divides files into large chunks (~64 MB) distributed across data servers. HDFS has a global namespace.
  • 3. General Information (cont’d): A script is provided for your convenience. Run source /localtmp/hadoop/setupVars from centurion064; it lets you type just command instead of {somePath}/command. Go to http://www.cs.virginia.edu/~cbs6n/hadoop for web access; these slides and more information are also available there. Once you use the DFS (put something in it), relative paths are resolved from /usr/{your user id}. E.g., if your id is tb28, your “home dir” is /usr/tb28.
  • 4. Master Node: Hadoop is currently configured with centurion064 as the master node. The master node keeps track of the namespace and metadata about items, and of the MapReduce jobs in the system.
  • 5. Slave Nodes: centurion064 also acts as a slave node. Slave nodes manage blocks of data sent from the master node; in GFS terms, these are the chunkservers. Currently centurion060 is also a slave node.
  • 6. Hadoop Paths: Hadoop is locally “installed” on each machine. The installed location is /localtmp/hadoop/hadoop-0.15.3. Slave nodes store their data in /localtmp/hadoop/hadoop-dfs (this is automatically created by the DFS). /localtmp/hadoop is owned by group gbg (someone in this group, or a CS admin, must administer it). Files are divided into 64 MB chunks (this is configurable).
  • 7. Starting / Stopping Hadoop: For the purposes of this tutorial, we assume you have run the setupVars script from earlier. start-all.sh starts all slave nodes and the master node; stop-all.sh stops all slave nodes and the master node.
  • 8. Using HDFS (1/2): hadoop dfs [-ls <path>] [-du <path>] [-cp <src> <dst>] [-rm <path>] [-put <localsrc> <dst>] [-copyFromLocal <localsrc> <dst>] [-moveFromLocal <localsrc> <dst>] [-get [-crc] <src> <localdst>] [-cat <src>] [-copyToLocal [-crc] <src> <localdst>] [-moveToLocal [-crc] <src> <localdst>] [-mkdir <path>] [-touchz <path>] [-test -[ezd] <path>] [-stat [format] <path>] [-help [cmd]]
  • 9. Using HDFS (2/2): Want to reformat? Easy: hadoop namenode -format. Most commands follow the same pattern: hadoop “some command” options. If you just type hadoop, you get all possible commands (including undocumented ones, hooray).
  • 10. To Add Another Slave: This adds another data node / job execution site to the pool. Hadoop dynamically uses the filesystem underneath it; if more space is available on the HDD, HDFS will try to use it when it needs to. Steps: modify the slaves file in centurion064:/localtmp/hadoop/hadoop-0.15.3/conf; copy the code installation dir (very small) to newMachine:/localtmp/hadoop/hadoop-0.15.3; restart Hadoop.
  • 11. Configure Hadoop: Configuration lives in {$installation dir}/conf. hadoop-default.xml holds the global settings; hadoop-site.xml holds site-specific settings (overrides global).
  • 12. That’s it for Configuration!
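
The hadoop dfs subcommands listed on slide 8 can be strung together into a first session. What follows is a minimal sketch, not the authors' own script: it assumes setupVars has been sourced so that hadoop is on the PATH, and the local file name notes.txt, the DFS directory input, and the helper functions run and hdfs_session are hypothetical names chosen only for illustration. By default it prints the commands instead of running them (DRY_RUN=1), so it can be read and tried without a running cluster.

```shell
#!/bin/sh
# Sketch of a first HDFS session using subcommands from slide 8.
# Assumption: setupVars has been sourced, so `hadoop` is on the PATH.
# DRY_RUN=1 (the default here) only prints each command.

DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# hdfs_session: copy one local file into the DFS, inspect it, copy it back.
# $1 is a local file name (hypothetical; any readable file would do).
hdfs_session() {
    localfile="$1"
    name="$(basename "$localfile")"

    # Relative DFS paths such as "input" resolve under the user's DFS home dir.
    run hadoop dfs -mkdir input
    run hadoop dfs -put "$localfile" input
    run hadoop dfs -ls input
    run hadoop dfs -cat "input/$name"
    run hadoop dfs -get "input/$name" "$localfile.copy"
}

hdfs_session notes.txt
```

Setting DRY_RUN=0 before running the script executes the same commands against the cluster; the relative path input then resolves under the home directory described on slide 3.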