Integrating LVM with Hadoop and providing Elasticity to Data Node Storage


What is LVM?

LVM (Logical Volume Manager) is a tool for logical volume management, which includes allocating disks and striping, mirroring, and resizing logical volumes. With LVM, a hard drive or set of hard drives is allocated to one or more physical volumes. LVM physical volumes can also be placed on other block devices, which might span two or more disks.

The physical volumes are combined into volume groups, with the exception of the /boot partition. The /boot partition cannot be on a logical volume because the boot loader cannot read it. If the root (/) partition is on a logical volume, create a separate /boot partition which is not part of a volume group. Since a single physical volume cannot span multiple drives, to span more than one drive, create one or more physical volumes per drive and add them all to the same volume group.


The volume groups can be divided into logical volumes, which are assigned mount points, such as /home and /, and file system types, such as ext2 or ext3. When "partitions" reach their full capacity, free space from the volume group can be added to the logical volume to increase the size of the partition. When a new hard drive is added to the system, it can be added to the volume group, and the partitions that are logical volumes can be increased in size.
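
As a quick sketch of that lifecycle in commands (device names and sizes here are illustrative, not from this task; the same flow is walked through step by step below):

pvcreate /dev/sdb /dev/sdc              # turn raw disks into physical volumes
vgcreate myvg /dev/sdb /dev/sdc         # pool them into a volume group
lvcreate --size 5G --name mylv myvg     # carve out a logical "partition"
mkfs.ext4 /dev/myvg/mylv                # put a filesystem on it
mount /dev/myvg/mylv /data              # mount it like any partition
lvextend --size +3G /dev/myvg/mylv      # grow it later from free VG space
resize2fs /dev/myvg/mylv                # grow the filesystem to match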


🌀 TASK DESCRIPTION :

🔅7.1: Elasticity Task -- a) Integrating LVM with Hadoop and providing Elasticity to DataNode Storage


First, set up the Hadoop cluster and start the instance which is configured as the DataNode. Then follow these steps:

Step 1 : Installing the LVM software on the DataNode that will be integrated with LVM

Command : yum install lvm2


Installation is done.
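
To confirm the install, standard checks (not shown in the screenshots) are:

Command : rpm -q lvm2
Command : lvm version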


Step 2 : Launching 2 EBS volumes in the same availability zone in which the instance is running


Both volume1 and volume2 are currently in the "available" state.


Step 3 : Attaching volumes to instance


Volumes /dev/sdf and /dev/sdg are attached to the instance named "instance".
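
The same attachment can also be done from the AWS CLI; the volume and instance IDs below are placeholders:

aws ec2 attach-volume --volume-id vol-xxxxxxxx --instance-id i-xxxxxxxx --device /dev/sdf
aws ec2 attach-volume --volume-id vol-yyyyyyyy --instance-id i-xxxxxxxx --device /dev/sdg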

This can also be verified by connecting to the instance and running the "fdisk -l" command in the terminal.


Disk /dev/xvdf of size 5 GB and disk /dev/xvdg of size 6 GB have been added to the instance. (Inside the instance, the /dev/sdf and /dev/sdg attachments show up as /dev/xvdf and /dev/xvdg.)


Step 4 : Creation of physical volumes

Converting both the EBS volumes to physical volumes using the command "pvcreate".
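
Based on the device names seen in the fdisk output above, the commands would be:

Command : pvcreate /dev/xvdf
Command : pvcreate /dev/xvdg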


We can see the complete information about each physical volume created with the command "pvdisplay".


Step 5 : Creation of volume group

Creating a volume group from the two physical volumes previously created, with the command "vgcreate". Here "myvg" is the name of the volume group.
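
With the names used in this task, that is:

Command : vgcreate myvg /dev/xvdf /dev/xvdg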


With the command "vgdisplay", we can see the complete information of the created volume group. The total size of the volume group is 10.99 GB, combining the size of both the physical volumes i.e. physical volume /dev/xvdf of size 5 GB and physical volume /dev/xvdg of size 6 GB.


Step 6 : Creation of logical volume

After successfully creating the volume group, we now create a logical volume of the required size. Initially we create a logical volume of size 5 GB; later we will increase it as needed.

Command : lvcreate --size <lv_size> --name <lv_name> <vg_name>
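
Substituting the values used in this task:

Command : lvcreate --size 5G --name mylv myvg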


A logical volume named "mylv" of size 5 GB has been created. This can be seen using the command "lvdisplay".


Now, running the "vgdisplay" command shows that the volume group has 5.99 GB of free space left, as 5 GB has been allocated to the logical volume.


Step 7 : Formatting the logical volume


While formatting, we have to give the full path of the logical volume.
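
The filesystem type is not stated in the screenshots, but since Step 13 uses resize2fs it must be an ext-family filesystem; assuming ext4, the command would be:

Command : mkfs.ext4 /dev/myvg/mylv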


Step 8 : Mounting the volume to the DataNode folder


The logical volume is successfully mounted on the folder "/slave", and the total size of this folder is now 4.9 GB, of which 4.6 GB is available; the rest is consumed by filesystem metadata and reserved blocks.
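
The mount itself (device path and mount point as used in this task) would be:

Command : mount /dev/myvg/mylv /slave
Command : df -h /slave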


Step 9 : Starting Hadoop NameNode services

Starting the instance which is configured as the NameNode.


Starting the NameNode services:

Command : hadoop-daemon.sh start namenode



Step 10 : Starting Hadoop DataNode services

Command : hadoop-daemon.sh start datanode
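
On each machine, you can confirm the daemon is up with jps (part of the JDK); it should list NameNode on the master and DataNode on the slave:

Command : jps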



Step 11 : Checking the storage contributed by the DataNode to the Hadoop cluster

Command : hadoop dfsadmin -report


Since the DataNode folder is mounted on the logical volume of size 4.9 GB, it contributes 4.86 GB to the Hadoop cluster. We can also extend or reduce this contributed capacity by extending or reducing the size of the logical volume on which the DataNode folder is mounted.
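
Note that reducing is the riskier direction: the filesystem must be shrunk before (or together with) the logical volume, or data will be lost. lvreduce can do both in one step with its --resizefs option, though shrinking an ext filesystem also requires unmounting it first; a sketch:

Command : lvreduce --resizefs --size 5G /dev/myvg/mylv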


Step 12 : Extending the size of the logical volume

Command : lvextend --size <size_to_extend> <volume_path>
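
To grow mylv from 5 GB to 8 GB, the relative form adds to the current size (the absolute form "--size 8G" is equivalent here; the screenshot may show either):

Command : lvextend --size +3G /dev/myvg/mylv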


Now, the total size of the logical volume has become 8 GB. This can be verified using the command "lvdisplay".


But the total size of the DataNode folder is still 4.9 GB; it has not been extended yet. The filesystem only covers the original 5 GB, so we need to resize it over the extended part of the volume so that the new space can also be used to store data.



Step 13 : Resizing the filesystem over the extended volume part

Command : resize2fs <volume_path>
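
With the path used in this task (resize2fs grows an ext filesystem online, without reformatting or unmounting):

Command : resize2fs /dev/myvg/mylv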


After resizing, the total size of the DataNode folder "/slave" has also extended to 7.9 GB.


Step 14 : Confirming the extended storage contributed by the DataNode to the Hadoop cluster

Command : hadoop dfsadmin -report


Now the DataNode is contributing 7.81 GB, whereas it was previously contributing 4.86 GB.



Thank You for reading..!!
