How to Replicate Storage Across Servers using GlusterFS on CentOS 7

GlusterFS is a scalable network filesystem. Using common off-the-shelf hardware, you can create large, distributed storage solutions for media streaming, data analysis, and other data- and bandwidth-intensive tasks. If you want to distribute a storage pool across many servers, GlusterFS is a well-tested solution. This guide will walk you through setting up a GlusterFS cluster on CentOS 7.

Getting Started

To complete this guide, you will need the following:
• Three nodes (cloud servers or dedicated servers) running CentOS 7.

The first node is a web server connected to both the WAN and the LAN. The second and third nodes are LAN-only storage servers. When finished, these servers will share access to a common storage pool.

Tutorial

This tutorial will guide you through setting up a shared document root for a web server. Even so, the principles are the same for setting up any other scenario where you wish to share storage between Linux boxes.

Our network setup for this example will look as follows:
• web1: 10.0.0.47
• gluster1: 10.0.0.48
• gluster2: 10.0.0.49

Edit the hosts file on all three servers so that each hostname resolves to the address listed above.

nano /etc/hosts
10.0.0.47 web1
10.0.0.48 gluster1
10.0.0.49 gluster2
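
If you would rather script this step than edit the file by hand, the following is a minimal sketch of an idempotent helper. The function name add_host_entry is hypothetical (it is not part of CentOS or GlusterFS), and the optional third argument exists only so the function can be tried on a scratch file first.

```shell
#!/bin/sh
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is already mapped.
# Usage: add_host_entry IP NAME [FILE]   (FILE defaults to /etc/hosts)
add_host_entry() {
    ip="$1"
    name="$2"
    file="${3:-/etc/hosts}"
    # Scan non-comment lines for NAME as a whole field after the address column.
    if awk -v n="$name" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$file"; then
        echo "$name already present in $file"
    else
        printf '%s %s\n' "$ip" "$name" >> "$file"
        echo "added $name to $file"
    fi
}

# Example, run as root on each server:
#   add_host_entry 10.0.0.48 gluster1
#   add_host_entry 10.0.0.49 gluster2
```

Running the helper twice with the same arguments leaves a single entry, so it is safe to include in a provisioning script.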

Create the directory that will hold the storage bricks. Do this on both storage nodes, gluster1 and gluster2.

mkdir /data

Next, install and enable the GlusterFS package repository on each node in the cluster.

cd /etc/yum.repos.d/
wget http://download.gluster.org/pub/gluster/glusterfs/LATEST/CentOS/glusterfs-epel.repo
yum -y update

Then install the GlusterFS server package on each node, and enable the service so that it starts on boot.

yum install epel-release -y
yum install glusterfs-server -y
systemctl enable glusterd.service
systemctl start glusterd.service

Now peer gluster1 with gluster2. Run the following command on gluster1.

gluster peer probe gluster2
peer probe: success.

Let’s check the status of the trusted storage pool to see how the operation went.

gluster peer status
Number of Peers: 1
Hostname: gluster2
Uuid: 00b483de-66a5-4a74-bb49-e06718c83035
State: Peer in Cluster (Connected)
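
To check the peer state from a script rather than by eye, one option is to count the peers that report the Connected state. This is only a sketch: the function names are made up for illustration, and it assumes the output format shown above.

```shell
#!/bin/sh
# Count peers reported as connected in `gluster peer status` output read from stdin.
count_connected_peers() {
    grep -c -F 'State: Peer in Cluster (Connected)'
}

# Succeed only if the number of connected peers equals the expected number.
peers_ok() {
    expected="$1"
    connected=$(count_connected_peers)
    [ "$connected" -eq "$expected" ]
}

# Example on gluster1 (one peer expected in this two-node pool):
#   gluster peer status | peers_ok 1 && echo "trusted pool looks healthy"
```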

Now we must create the brick directory on each node in the cluster.

Gluster1:
mkdir /data/brick1

Gluster2:
mkdir /data/brick2

On gluster1, create the storage volume with two-way replication.

gluster volume create glustervol1 replica 2 transport tcp gluster1:/data/brick1 gluster2:/data/brick2 force

The “force” option at the end is required only when the bricks reside on the same disk as the operating system (/dev/sda, for instance). If the bricks live on a dedicated second disk (/dev/sdb, for instance), you can omit it.

volume create: glustervol1: success: please start the volume to access data
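
If you are unsure whether a brick directory sits on the system disk, a quick test is to ask df which filesystem holds it. The sketch below is an approximation under that assumption, not the check GlusterFS itself performs, and on_root_fs is a hypothetical name. It only inspects the node it runs on, so repeat it on gluster2 for /data/brick2.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DIR lives on the filesystem mounted at "/".
on_root_fs() {
    # df -P prints a header line, then one line whose sixth field is the mount point.
    [ "$(df -P "$1" | awk 'NR == 2 { print $6 }')" = "/" ]
}

# Example on gluster1:
#   if on_root_fs /data/brick1; then
#       echo "brick is on the root filesystem; volume create will need force"
#   else
#       echo "brick is on a separate filesystem; force can be omitted"
#   fi
```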

Start the storage volume on gluster1.

gluster volume start glustervol1
volume start: glustervol1: success

The servers on the local network need access to the storage volume. On gluster1, allow the local IP range.

gluster volume set glustervol1 auth.allow "10.0.0.*"
volume set: success

Check the status to see how things are progressing.

gluster volume info
Volume Name: glustervol1
Type: Replicate
Volume ID: c0525f18-103f-45e2-b5e3-ced9ae0b072f
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gluster1:/data/brick1
Brick2: gluster2:/data/brick2
Options Reconfigured:
auth.allow: 10.0.0.*
performance.readdir-ahead: on
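
When automating the setup, you may want to read a single field from this output instead of scanning it manually. The following sketch assumes the "Key: value" layout shown above; volume_field is a hypothetical helper name.

```shell
#!/bin/sh
# Print the value of the first "KEY: value" line in `gluster volume info` output read from stdin.
volume_field() {
    awk -F': ' -v k="$1" '$1 == k { print $2; exit }'
}

# Example on gluster1:
#   status=$(gluster volume info glustervol1 | volume_field Status)
#   [ "$status" = "Started" ] || echo "glustervol1 is not started" >&2
```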

Fine-tune the GlusterFS volume on gluster1 as shown below:

gluster volume set glustervol1 performance.write-behind off
gluster volume set glustervol1 performance.io-thread-count 64
gluster volume set glustervol1 performance.cache-size 1073741824
gluster volume set glustervol1 network.ping-timeout "5"
gluster volume set glustervol1 performance.write-behind-window-size 524288
gluster volume set glustervol1 performance.cache-refresh-timeout 1
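
The same six settings can be applied in one pass from a list, which makes them easier to review or reuse on another volume. This is a sketch rather than a recommended tool: apply_tuning and the GLUSTER variable are assumptions introduced here, with GLUSTER allowing a dry run that prints the commands instead of executing them.

```shell
#!/bin/sh
# Apply the tuning options from this guide to the named volume.
# Set GLUSTER="echo gluster" to print the commands without running them.
apply_tuning() {
    vol="$1"
    while read -r key value; do
        [ -n "$key" ] || continue
        # stdin is redirected so the command cannot swallow the option list.
        ${GLUSTER:-gluster} volume set "$vol" "$key" "$value" < /dev/null || return 1
    done <<'EOF'
performance.write-behind off
performance.io-thread-count 64
performance.cache-size 1073741824
network.ping-timeout 5
performance.write-behind-window-size 524288
performance.cache-refresh-timeout 1
EOF
}

# Example on gluster1:
#   GLUSTER="echo gluster" apply_tuning glustervol1   # dry run
#   apply_tuning glustervol1                          # apply for real
```

The function stops at the first option that fails, so a typo in one setting does not go unnoticed among the others.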

Great, let’s set up the web server to serve up files from the cluster.

Install the web server package on web1, enabling it to start on boot.

yum install httpd -y
systemctl enable httpd.service
systemctl start httpd.service

Next, mount the GlusterFS volume on /var/www/html so the web server can access it.

mount.glusterfs gluster1:/glustervol1 /var/www/html/

Confirm that the mount succeeded.

mount | grep glusterfs
gluster1:/glustervol1 on /var/www/html type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

Add the mount entry to /etc/fstab so the volume is mounted automatically at boot. Open the file and append the line shown below.

nano /etc/fstab

gluster1:/glustervol1 /var/www/html glusterfs defaults,_netdev,direct-io-mode=disable 0 0
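
If the fstab change is applied by a script, it helps to avoid adding the same mount point twice. A minimal sketch follows, assuming the standard fstab layout in which the second field is the mount point; ensure_fstab_entry is a hypothetical name, and the optional second argument lets you test against a copy of the file.

```shell
#!/bin/sh
# Hypothetical helper: append ENTRY to an fstab file unless its mount point is already listed.
# Usage: ensure_fstab_entry ENTRY [FILE]   (FILE defaults to /etc/fstab)
ensure_fstab_entry() {
    entry="$1"
    fstab="${2:-/etc/fstab}"
    # The second field of an fstab line is the mount point.
    mount_point=$(printf '%s\n' "$entry" | awk '{ print $2 }')
    if awk -v mp="$mount_point" '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$fstab"; then
        echo "an entry for $mount_point already exists in $fstab"
    else
        printf '%s\n' "$entry" >> "$fstab"
        echo "added entry for $mount_point to $fstab"
    fi
}

# Example on web1, run as root:
#   ensure_fstab_entry 'gluster1:/glustervol1 /var/www/html glusterfs defaults,_netdev,direct-io-mode=disable 0 0'
```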

Let’s see if replication is working. Create a file on web1 in the document root.

cd /var/www/html
touch index.html

Examine the brick directory on both storage nodes to confirm that the file exists.

Gluster1:
ls /data/brick1
index.html

Gluster2:
ls /data/brick2
index.html
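
With more than a handful of files, comparing listings by eye becomes impractical. One rough approach is to checksum a sorted listing on each node and compare the two results. The sketch below compares file names only, not file contents, and brick_listing is a hypothetical helper. It skips the .glusterfs directory that GlusterFS keeps at the top of each brick for its own metadata.

```shell
#!/bin/sh
# Print the entries of a brick directory, one per line and sorted,
# leaving out GlusterFS's internal .glusterfs directory.
brick_listing() {
    ls -A "$1" | grep -v -x -F '.glusterfs' | sort
}

# Example: run one command on each node and compare the printed checksums.
#   gluster1:  brick_listing /data/brick1 | cksum
#   gluster2:  brick_listing /data/brick2 | cksum
```

Matching checksums indicate that both bricks hold the same set of top-level names.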

Conclusion

You now have a robust storage system that spans multiple servers. While this example is targeted at a web server, it is easy to adapt it for other circumstances that might benefit from a shared storage pool. If this guide was helpful to you, kindly share it with others who may also be interested.