[Gluster-users] New to Gluster. Having trouble with server replacement.

Krist van Besien krist.vanbesien at gmail.com
Tue Nov 12 10:19:50 UTC 2013


Hello all,

I'm new to Gluster. In order to gain some knowledge and test a few
things, I decided to install it on three servers and play around with
it a bit.

My setup:
Three servers: dc1-09, dc2-09 and dc2-10, all running RHEL 6.4 and
Gluster 3.4.0 (from RHS 2.1).
Each server has three disks, mounted at /mnt/raid1, /mnt/raid2 and /mnt/raid3.

I created a distributed/replicated volume, test1, with two replicas.
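
For reference, the pool and the volume were set up in the standard
way, roughly like this (reconstructed from the brick order shown
below, so not an exact copy of the commands):

# build the trusted pool from one node (dc1-09 here)
gluster peer probe dc2-09
gluster peer probe dc2-10

# create and start the 3 x 2 distributed-replicated volume
gluster volume create test1 replica 2 \
    dc1-09:/mnt/raid1/test1 dc2-09:/mnt/raid2/test1 \
    dc2-09:/mnt/raid1/test1 dc2-10:/mnt/raid2/test1 \
    dc2-10:/mnt/raid1/test1 dc1-09:/mnt/raid2/test1
gluster volume start test1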

[root at dc2-10 ~]# gluster volume info test1

Volume Name: test1
Type: Distributed-Replicate
Volume ID: 59049b52-9e25-4cc9-bebd-fb3587948900
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: dc1-09:/mnt/raid1/test1
Brick2: dc2-09:/mnt/raid2/test1
Brick3: dc2-09:/mnt/raid1/test1
Brick4: dc2-10:/mnt/raid2/test1
Brick5: dc2-10:/mnt/raid1/test1
Brick6: dc1-09:/mnt/raid2/test1


I mounted this volume on a fourth Unix server and started a small
script that just keeps writing small files to it, in order to have
some activity.
Then I shut down one of the servers, started it again, shut down
another, and so on. Gluster had no problem keeping the files
available throughout.
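
The client side was nothing special; the idea was simply this
(hostname and paths here are illustrative, not the exact script):

# on the fourth machine, mount via the native client
mount -t glusterfs dc1-09:/test1 /mnt/test1

# keep writing small files so there is always some activity
i=0
while true; do
    date > /mnt/test1/file-$i
    i=$((i+1))
    sleep 1
done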

Then I decided to nuke one server and completely reinitialise it.
After reinstalling the OS and Gluster I had some trouble getting the
server back into the pool.
I followed two hints I found on the internet: I added the old UUID
back into glusterd.info, and made sure the correct
trusted.glusterfs.volume-id xattr was set on all bricks.
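
Concretely, the steps on the reinstalled node were roughly the
following (a reconstructed sketch; <old-uuid-of-this-node> is the
UUID the node had before the reinstall, which the surviving peers
still list under /var/lib/glusterd/peers/, and the hex value is the
Volume ID from "gluster volume info" with the dashes removed):

service glusterd stop

# give the node back its old identity
echo "UUID=<old-uuid-of-this-node>" > /var/lib/glusterd/glusterd.info

# re-apply the volume id xattr on the local bricks
setfattr -n trusted.glusterfs.volume-id \
    -v 0x59049b529e254cc9bebdfb3587948900 /mnt/raid1/test1
setfattr -n trusted.glusterfs.volume-id \
    -v 0x59049b529e254cc9bebdfb3587948900 /mnt/raid2/test1

# (the expected value can be double-checked on a surviving node with
#  getfattr -n trusted.glusterfs.volume-id -e hex /mnt/raid1/test1)

service glusterd start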

Now the new server is storing data again, but things still look a
bit odd: I don't get consistent output from gluster volume status on
the three servers.

gluster volume info test1 gives me the same output everywhere.
However, the output of gluster volume status differs from host to host:

[root at dc1-09 glusterd]# gluster volume status test1
Status of volume: test1
Gluster process                               Port    Online  Pid
------------------------------------------------------------------------------
Brick dc1-09:/mnt/raid1/test1                 49154   Y       10496
Brick dc2-09:/mnt/raid2/test1                 49152   Y       7574
Brick dc2-09:/mnt/raid1/test1                 49153   Y       7581
Brick dc1-09:/mnt/raid2/test1                 49155   Y       10502
NFS Server on localhost                       2049    Y       1039
Self-heal Daemon on localhost                 N/A     Y       1046
NFS Server on dc2-09                          2049    Y       12397
Self-heal Daemon on dc2-09                    N/A     Y       12444

There are no active volume tasks


[root at dc2-10 /]# gluster volume status test1
Status of volume: test1
Gluster process                               Port    Online  Pid
------------------------------------------------------------------------------
Brick dc2-09:/mnt/raid2/test1                 49152   Y       7574
Brick dc2-09:/mnt/raid1/test1                 49153   Y       7581
Brick dc2-10:/mnt/raid2/test1                 49152   Y       9037
Brick dc2-10:/mnt/raid1/test1                 49153   Y       9049
NFS Server on localhost                       2049    Y       14266
Self-heal Daemon on localhost                 N/A     Y       14281
NFS Server on 172.16.1.21                     2049    Y       12397
Self-heal Daemon on 172.16.1.21               N/A     Y       12444

There are no active volume tasks

[root at dc2-09 mnt]# gluster volume status test1
Status of volume: test1
Gluster process                               Port    Online  Pid
------------------------------------------------------------------------------
Brick dc1-09:/mnt/raid1/test1                 49154   Y       10496
Brick dc2-09:/mnt/raid2/test1                 49152   Y       7574
Brick dc2-09:/mnt/raid1/test1                 49153   Y       7581
Brick dc2-10:/mnt/raid2/test1                 49152   Y       9037
Brick dc2-10:/mnt/raid1/test1                 49153   Y       9049
Brick dc1-09:/mnt/raid2/test1                 49155   Y       10502
NFS Server on localhost                       2049    Y       12397
Self-heal Daemon on localhost                 N/A     Y       12444
NFS Server on dc2-10                          2049    Y       14266
Self-heal Daemon on dc2-10                    N/A     Y       14281
NFS Server on dc1-09                          2049    Y       1039
Self-heal Daemon on dc1-09                    N/A     Y       1046

There are no active volume tasks

Why would the output of status be different on the three hosts? Is
this normal, or is there still something wrong? If so, how do I fix
this?


Krist


krist.vanbesien at gmail.com
krist at vanbesien.org
Bern, Switzerland


