Integrating LVM with Hadoop and providing Elasticity to Data Node Storage

Shivam Upadhyay
Jun 5, 2021

Hello friends!! Today I would like to share with you the steps to integrate Hadoop with LVM so that the storage of a Hadoop DataNode becomes elastic. So let's get started,

  • Firstly, we will add a 20 GB hard disk to the Linux OS (which is configured as a DataNode), as shown in the image

You can see in the image below how the 20 GB hard disk is added to the OS

After that, we will configure our Hadoop cluster by editing “/etc/hadoop/core-site.xml” and “/etc/hadoop/hdfs-site.xml”, as shown in the pictures below

Configuring core-site.xml file
Configuring hdfs-site.xml file
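
Since the pictures are not reproduced here, this is a rough sketch of what the two files typically look like for a Hadoop 1.x setup like this one. The NameNode IP and port below are placeholders, and “/dn1” is the folder we will create and mount later in this article.

core-site.xml (on the DataNode, pointing at the NameNode):

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://NameNode-IP:9001</value>
  </property>
</configuration>

hdfs-site.xml (telling the DataNode which directory to store its blocks in):

<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/dn1</value>
  </property>
</configuration>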

Now, if we run the following command, we can see that no DataNode has been connected to the NameNode yet, as shown in the picture below

hadoop dfsadmin -report

Now we will go to the DataNode and look at the attached hard disk by running the following command, as shown in the picture

fdisk -l

Here we can see in the picture above that the hard disk has been attached as “/dev/sdb”. Now we will create a partition on the disk by running the following command and entering the options shown in the picture

fdisk /dev/sdb
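
Since the picture is not reproduced here, these are the keystrokes typically entered at the interactive fdisk prompt to create a single primary partition covering the whole disk:

n    (create a new partition)
p    (make it a primary partition)
1    (partition number 1)
     (press Enter twice to accept the default first and last sectors, i.e. the whole 20 GB disk)
w    (write the partition table to disk and exit)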

Now we will create a physical volume (PV), a volume group (VG), and a logical volume (LV), and then display the LV by running the following commands, as shown in the picture

pvcreate /dev/sdb1
vgcreate myvg /dev/sdb1
lvcreate --size 19G --name mylv myvg
lvdisplay myvg/mylv

Now we will format the LV with the ext4 file system and then mount it on a new folder by running the following commands, as shown in the image

mkfs.ext4 /dev/myvg/mylv
mkdir /dn1
mount /dev/myvg/mylv /dn1

The folder we mounted, i.e. “/dn1”, is the one that will provide storage to the Hadoop DataNode. We can check whether the folder is mounted by running the following command

df -h
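
As an optional extra not shown in the original pictures: a mount done this way will not survive a reboot. A one-line sketch to make it permanent, assuming the same device and folder names as above:

echo "/dev/myvg/mylv /dn1 ext4 defaults 0 0" >> /etc/fstab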

Now we will start the DataNode by running the following command; it starts properly, as shown in the picture

hadoop-daemon.sh start datanode

Now, if we run the “hadoop dfsadmin -report” command again, we can see that the DataNode has been connected to the NameNode, as shown in the picture
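
We can also double-check that the daemon is up with the jps command, which lists the running Hadoop Java processes; a “DataNode” entry should appear in its output:

jps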

Now we will shrink the DataNode's storage: unmount the folder, shrink the file system, and then reduce the LV by running the commands shown in the picture

After running the following commands and remounting, the “df -h” command shows that the size has been reduced from 19 GB to 5 GB, as shown in the picture below

umount /dn1
e2fsck -f /dev/mapper/myvg-mylv           # mandatory integrity check; resize2fs refuses to shrink without it
resize2fs /dev/mapper/myvg-mylv 5G        # shrink the file system first...
lvreduce --size 5G /dev/mapper/myvg-mylv  # ...then the LV, so no data is truncated
mount /dev/mapper/myvg-mylv /dn1          # remount so the DataNode sees the new size
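
As a side note, LVM can also do the middle two steps for us: the --resizefs (-r) flag of lvreduce shrinks the file system (running the required fsck itself) together with the LV. A one-line sketch with the same names as above:

lvreduce --resizefs --size 5G /dev/myvg/mylv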

Now we can see that the DataNode size has changed to 5 GB by running the “hadoop dfsadmin -report” command.

Now we can also see the VG details by running the following command, as shown in the picture

vgdisplay myvg

Similarly, we can extend the size to 15 GB by running the commands sketched below.
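
The picture is not reproduced here, so as a sketch, assuming the same VG/LV names as above: growing is simpler than shrinking, because ext4 can be extended online and no unmount is needed.

lvextend --size 15G /dev/myvg/mylv   # grow the LV to 15 GB
resize2fs /dev/myvg/mylv             # grow the file system to fill the LV, while still mounted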

And now, if we look at the size of the DataNode again, we can see that it has changed to 15 GB, as shown in the image below.

This was all from my side.

Thank You for reading!!:)

Do connect with me on LinkedIn: https://www.linkedin.com/in/shivam-prasad-upadhyay/
