HADOOP DISTRIBUTED FILE
SYSTEM AND MAPREDUCE
BY
Eadara Harsha Siva Sai
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
CONTENTS
1. INTRODUCTION TO HADOOP
2. HADOOP ARCHITECTURE
3. HADOOP DISTRIBUTED FILE SYSTEM (HDFS)
4. MAPREDUCE
INTRODUCTION TO HADOOP
What is Hadoop?
• Hadoop is an open-source framework for storing and
processing big data sets using distributed computing.
• Doug Cutting created Hadoop together with Mike Cafarella;
Cutting later joined Cloudera.
• Designed to answer the question: “How can we
process big data with reasonable cost and
time?”
INTRODUCTION TO HADOOP
• In a traditional non-distributed architecture,
data is stored on one central server, and every
client program accesses that server to read the data.
• The non-distributed model has a few issues. You
mostly scale vertically, by adding more CPU,
more storage, and so on to the single server.
• This architecture is also unreliable: if the
main server fails, you have to restore the data
from a backup. It is also slow when accessing
very large data sets.
INTRODUCTION TO HADOOP
In a Hadoop distributed architecture:
• Every server offers local computation and
storage. When you run a query against a large data set,
every server executes the query on its own machine against
its local portion of the data. Finally, the result sets
from all of these servers are consolidated.
• You don’t need a powerful server. Several less
expensive commodity servers can serve as individual Hadoop nodes.
If a node fails, Hadoop can still return the data set
correctly, because it replicates and distributes the data
across multiple nodes.
• Hadoop is written in Java, so it can run on any platform that has a Java runtime.
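The idea of running a query on each server's local data and then consolidating the results can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the server names, the sample records and the helper functions `run_local_query` and `consolidate` are hypothetical and are not part of Hadoop.

```python
from collections import Counter

# Hypothetical cluster: each "server" holds only its local slice of the data.
local_data = {
    "server-1": ["error", "ok", "ok"],
    "server-2": ["ok", "error", "error"],
    "server-3": ["ok"],
}

def run_local_query(records):
    """Count status values using only the records stored on one server."""
    return Counter(records)

def consolidate(partial_results):
    """Merge the per-server results into one final result set."""
    total = Counter()
    for partial in partial_results:
        total.update(partial)
    return total

# Each server answers from its own data; only the small partial results move.
partials = [run_local_query(records) for records in local_data.values()]
result = consolidate(partials)
print(dict(result))
```

The point of the design is that the large data set never leaves the servers that store it; only the per-server summaries are combined.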
HADOOP ARCHITECTURE
The Hadoop architecture consists of:
1. NameNode
2. Secondary NameNode
3. JobTracker
4. DataNode
5. TaskTracker
HADOOP ARCHITECTURE (diagram slide)
HADOOP DISTRIBUTED FILE SYSTEM (HDFS)
• HDFS is the part of Hadoop that distributes data across the
DataNodes. A typical HDFS block size is 64 MB (the Hadoop 1.x
default; later versions default to 128 MB).
• There is one NameNode, which manages the file system
metadata. Files are divided into 64 MB blocks.
• The NameNode decides which DataNode each block is sent to,
and it also tells that DataNode to replicate the block to
another two nodes.
• After storing the data, the DataNode reports how much
space it has available.
• Every 3 seconds, each DataNode sends a heartbeat to the
NameNode. If a DataNode stops sending heartbeats, the
NameNode first treats it as stale (after 30 seconds by default)
and, if no heartbeat arrives within the configured timeout
(roughly 10 minutes by default), declares the DataNode dead.
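As a rough illustration of the block splitting and replication described above, the following Python sketch divides a file into 64 MB blocks and assigns each block to three nodes. The file size, the node names and the round-robin placement are assumptions made for the example; real HDFS uses its own placement policy.

```python
BLOCK_SIZE_MB = 64      # block size quoted in the slides
REPLICATION = 3         # one copy plus replicas on another two nodes

def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the size of each block; the last block may be smaller."""
    full, rest = divmod(file_size_mb, block_size_mb)
    return [block_size_mb] * full + ([rest] if rest else [])

def place_replicas(num_blocks, data_nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin.

    Round-robin is an illustrative stand-in, not HDFS's real placement policy.
    """
    if replication > len(data_nodes):
        raise ValueError("not enough data nodes for the requested replication")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            data_nodes[(block_id + i) % len(data_nodes)]
            for i in range(replication)
        ]
    return placement

blocks = split_into_blocks(200)          # a hypothetical 200 MB file
nodes = ["dn1", "dn2", "dn3", "dn4"]     # hypothetical data node names
metadata = place_replicas(len(blocks), nodes)
print(blocks)
print(metadata)
```

The `metadata` dictionary plays the role of the name node's block map: it records where each block lives without holding any of the data itself.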
HADOOP DISTRIBUTED FILE SYSTEM (HDFS) (diagram slide)
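The heartbeat check that the name node performs on its data nodes can be sketched as follows. This is a minimal simulation, not Hadoop code: the `HeartbeatMonitor` class and the node names are hypothetical, time is passed in explicitly instead of being read from a clock, and the 30-second timeout is an illustrative value (the real timeout is configurable).

```python
HEARTBEAT_INTERVAL_S = 3   # heartbeat interval quoted in the slides
DEAD_TIMEOUT_S = 30        # illustrative timeout; configurable in real clusters

class HeartbeatMonitor:
    """Tracks the last heartbeat time reported by each data node."""

    def __init__(self, timeout_s=DEAD_TIMEOUT_S):
        self.timeout_s = timeout_s
        self.last_seen = {}

    def record_heartbeat(self, node, now):
        """Remember when this node last reported in."""
        self.last_seen[node] = now

    def dead_nodes(self, now):
        """Return nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )

monitor = HeartbeatMonitor()
monitor.record_heartbeat("dn1", now=0)
monitor.record_heartbeat("dn2", now=0)
# dn1 keeps sending a heartbeat every 3 seconds; dn2 goes silent.
for t in range(HEARTBEAT_INTERVAL_S, 40, HEARTBEAT_INTERVAL_S):
    monitor.record_heartbeat("dn1", now=t)
print(monitor.dead_nodes(now=40))
```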
MAPREDUCE
• MapReduce is the processing model Hadoop uses to compute
results from the data stored in HDFS. The following steps
describe MapReduce in the Hadoop framework.
• When a client wants a result from the stored data, the
client writes a program and submits it to the JobTracker.
• The JobTracker asks the NameNode whether metadata exists
for this data. If it does, the NameNode sends the
metadata.
• The JobTracker then instructs the DataNodes that hold the
data to process it locally.
• After processing, each node's output is reported to the JobTracker.
The JobTracker then sends the collected outputs to another
DataNode, which produces the final output.
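The map-then-reduce flow above can be illustrated with the classic word-count example. This is a single-process sketch in plain Python, not the Hadoop API: the function names `map_phase`, `shuffle` and `reduce_phase` and the input splits are assumptions chosen for the illustration.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

# Hypothetical input splits, one per data node.
splits = ["big data big cluster", "data node data"]

mapped = [pair for line in splits for pair in map_phase(line)]
final = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(final)
```

In a real cluster the map calls run in parallel on the nodes that store each split, and only the grouped intermediate pairs travel to the node that performs the reduce step.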
MAPREDUCE
• After receiving the final output, the JobTracker
sends it to the client.
• If a DataNode fails while processing the data, the
JobTracker assigns the work to another DataNode that
holds a replica of the same data.
• After receiving the outputs from the DataNodes, the
JobTracker checks which DataNode currently has the least work
and sends the outputs to that node to produce the final output.
• The following diagram illustrates MapReduce.
MAPREDUCE (diagram slide)
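The failure handling described for MapReduce, where work is reassigned to another node that holds a replica, can be sketched as a simple retry loop. The functions `run_task` and `run_with_failover`, the node names and the block number are hypothetical; the sketch only simulates a failed node and does not call any Hadoop API.

```python
def run_task(node, block_id, failed_nodes):
    """Pretend to process a block on a node; fail if the node is down."""
    if node in failed_nodes:
        raise ConnectionError(f"{node} is not responding")
    return f"block {block_id} processed on {node}"

def run_with_failover(block_id, replica_nodes, failed_nodes):
    """Try each node holding a replica of the block until one succeeds."""
    for node in replica_nodes:
        try:
            return run_task(node, block_id, failed_nodes)
        except ConnectionError:
            continue  # reschedule on the next node that holds a replica
    raise RuntimeError(f"no live replica for block {block_id}")

# Hypothetical placement: block 7 is stored on three nodes, and dn1 has failed.
outcome = run_with_failover(7, ["dn1", "dn2", "dn3"], failed_nodes={"dn1"})
print(outcome)
```

Because every block is stored on several nodes, losing one node costs only the time needed to rerun its tasks elsewhere.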
ADVANTAGES AND DISADVANTAGES
ADVANTAGES:
1. Cost effective
2. Flexible
3. Fast
4. Resilient to failure
DISADVANTAGES:
1. Security concerns
2. Not fit for small data
3. Potential stability issues
CONCLUSION
• Facebook, Google, Amazon, Flipkart and others
use Hadoop.
• Hadoop solves many problems in
storing data in the cloud. Because Hadoop is
open source, it is free to use, and it can run on
any operating system.
ANY QUESTIONS?
THANK YOU