In this post, I integrate LVM with Hadoop to make DataNode storage elastic.
Before we get started, let's cover some basics.
What is LVM?
Logical Volume Management (LVM) creates a layer of abstraction over physical storage, allowing you to create logical storage volumes. With LVM in place, you no longer have to worry about physical disk sizes: the hardware is hidden from the software, so storage can be resized and moved without stopping applications and, when growing a volume, without unmounting file systems. You can think of logical volumes as dynamic partitions.
For example, if you are running out of disk space on your server, you can just add another disk and extend the logical volume on the fly.
Below are some advantages of using logical volumes over using physical storage directly:
Resize storage pools: You can extend or reduce a logical volume without reformatting the disks (shrinking usually requires unmounting the file system first, depending on the file system).
Flexible storage capacity: You can add more space at any time by attaching more disks and adding them to the pool of physical storage.
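As a rough illustration of growing the pool, the sketch below adds one more disk to an existing volume group. The device name /dev/xvdh and the `run` helper are assumptions made for this example (vg-01 is the group name used later in this walkthrough). `run` only prints each command unless APPLY=1 is set, so nothing is changed by accident.

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

NEW_DISK=/dev/xvdh   # assumed device name of the newly attached disk
VG=vg-01             # volume group to grow

run pvcreate "$NEW_DISK"        # initialize the disk as a physical volume
run vgextend "$VG" "$NEW_DISK"  # add it to the volume group's pool
run vgs "$VG"                   # summary view; VSize and VFree should grow
```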
What is Hadoop?
Hadoop is an open-source, Java-based framework that supports the storage and processing of extremely large data sets in a distributed computing environment. It is a project of the Apache Software Foundation.
Some Basic Terminology:
NameNode: Also called the master node, it is the main component of a Hadoop cluster. It stores metadata such as block locations, which is needed for file read and write operations.
DataNode: Also known as a slave node, it contributes its own storage to the cluster and reports to the master node. It is where the files are actually stored. There can be many DataNodes in one cluster.
So let's get started.
Step 1: Attach new hard disks. Here, we have attached two hard disks.
Note: Here we are using the AWS cloud, where two instances have been launched. You can also use local systems to follow along.
After attaching the volumes, we can check whether the disks are attached using the fdisk -l command.
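A minimal sketch of this check follows. On AWS, the new volumes often show up as /dev/xvdf and /dev/xvdg or as /dev/nvme*n1 devices, but the names vary by instance type, so treat any device name here as an assumption. The `run` helper is a hypothetical convenience that only prints each command unless APPLY=1 is set.

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

run fdisk -l   # list all disks and their partition tables (needs root)
run lsblk      # tree view; new disks appear with no partitions or mount points
```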
Step 2: Convert the hard disks to Physical Volumes (PVs)
The pvcreate command initializes these disks so that they can become part of a volume group.
We can use the pvdisplay command to view information about the PVs.
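This step might look roughly like the sketch below. The device names /dev/xvdf and /dev/xvdg are assumptions; substitute whatever Step 1 reported on your system. The `run` helper is hypothetical and only prints each command unless APPLY=1 is set.

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

DISK1=/dev/xvdf   # assumed name of the first attached disk
DISK2=/dev/xvdg   # assumed name of the second attached disk

run pvcreate "$DISK1" "$DISK2"    # write LVM metadata to both disks
run pvdisplay "$DISK1" "$DISK2"   # show size and allocation of each PV
```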
Step 3: Create a volume group
Physical volumes are combined into volume groups (VGs). A volume group forms a pool of disk space out of which logical volumes can be allocated.
We can view information about the volume group using the vgdisplay command.
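A sketch of this step, using the group name vg-01 that appears in this walkthrough. The device names are assumptions carried over from a typical AWS setup, and the `run` helper is hypothetical: it only prints each command unless APPLY=1 is set.

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

VG=vg-01          # volume group name
DISK1=/dev/xvdf   # assumed physical volume
DISK2=/dev/xvdg   # assumed physical volume

run vgcreate "$VG" "$DISK1" "$DISK2"   # pool both PVs into one volume group
run vgdisplay "$VG"                    # total size, free space, PV count
```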
Step 4: Create a Logical Volume
A volume group is divided into logical volumes. If you created vg-01 earlier, you can now create logical volumes from that VG.
We can view information about the LV using the lvdisplay command.
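The sketch below carves a 4 GiB logical volume out of vg-01, matching the starting size reported at the end of this walkthrough. The volume name lv-01 is an assumption, and the `run` helper is hypothetical: it only prints each command unless APPLY=1 is set.

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

VG=vg-01   # volume group created in the previous step
LV=lv-01   # assumed logical volume name
SIZE=4G    # initial size; LVM reads G as GiB

run lvcreate --size "$SIZE" --name "$LV" "$VG"   # allocate the LV from the pool
run lvdisplay "/dev/$VG/$LV"                     # show LV path, size, status
```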
Step 5: Format the Logical Volume/partition
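Formatting could be sketched as follows. The choice of ext4 is an assumption, made because the later resize step uses resize2fs, which works on ext2/ext3/ext4 file systems. The LV path assumes a volume named lv-01 in vg-01, and the hypothetical `run` helper only prints each command unless APPLY=1 is set. Note that mkfs destroys any data already on the volume.

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

LV_PATH=/dev/vg-01/lv-01   # assumed path of the logical volume

run mkfs.ext4 "$LV_PATH"   # create an ext4 file system (erases existing data)
```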
Step 6: Mount the partition
Step 7: To increase the size of the LV/partition, use the lvextend command.
Step 8: Now, grow the file system over the space newly added to the LV using the resize2fs command. This does not reformat the volume, so the existing data is kept.
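A sketch of the mount step. The mount point /dn is an assumption; it should be the directory your DataNode uses for storage (the one configured as dfs.datanode.data.dir in hdfs-site.xml on recent Hadoop versions). The hypothetical `run` helper only prints each command unless APPLY=1 is set.

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

LV_PATH=/dev/vg-01/lv-01   # assumed path of the logical volume
MOUNT_POINT=/dn            # assumed DataNode storage directory

run mkdir -p "$MOUNT_POINT"           # create the directory if it is missing
run mount "$LV_PATH" "$MOUNT_POINT"   # attach the file system to the directory
```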
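To go from 4 GiB to the 12 GiB reported at the end of this walkthrough, the volume needs 8 GiB more. A sketch, assuming the LV is /dev/vg-01/lv-01 and the volume group has at least 8 GiB free; the hypothetical `run` helper only prints each command unless APPLY=1 is set.

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

LV_PATH=/dev/vg-01/lv-01   # assumed path of the logical volume
EXTRA=+8G                  # leading + means "grow by", not "grow to"

run lvextend --size "$EXTRA" "$LV_PATH"   # works while the volume is mounted
```

At this point only the logical volume is larger; the file system on it still has its old size until it is resized.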
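A sketch of the resize, assuming an ext4 file system on /dev/vg-01/lv-01. With no size argument, resize2fs grows the file system to fill the whole volume, and for ext4 this can be done while it is mounted. The hypothetical `run` helper only prints each command unless APPLY=1 is set.

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

LV_PATH=/dev/vg-01/lv-01   # assumed path of the logical volume

run resize2fs "$LV_PATH"   # grow the file system to the full LV size
```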
Step 9: Check the size of the storage
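The check can be done both at the operating-system level and from Hadoop's point of view. In this sketch the mount point /dn is an assumption, and the hypothetical `run` helper only prints each command unless APPLY=1 is set.

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

MOUNT_POINT=/dn   # assumed DataNode storage directory

run df -h "$MOUNT_POINT"     # file system size as the OS sees it
run hdfs dfsadmin -report    # configured capacity each DataNode reports
```

The size reported by df is usually slightly below the LV size because the file system reserves some space for its own metadata.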
We have increased the storage of the DataNode from 4 GiB to 12 GiB on the fly.