[Gluster-users] Peer Rejected(Connected) and Self heal daemon is not running causing split brain

Kaamesh Kamalaaharan kaamesh at novocraft.com
Fri Feb 27 07:44:06 UTC 2015


Hi everyone,

I managed to fix my problem. It was a stale gluster process that was still
holding the ports. I manually killed all the gluster processes and removed
the corresponding /var/run files before starting a new instance of gluster,
and it worked fine.
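
In case it helps anyone hitting the same symptom, something like the
following should show which process is holding a gluster port (49152 here
is just an example; the actual brick ports are listed by gluster volume
status):

# show the PID listening on a suspect port
ss -tlnp | grep 49152

# or, if lsof is installed
lsof -i :49152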

What I did was:

# stop the gluster service
service glusterfs-server stop

# kill any leftover gluster processes (note that 'grep gl' is a very broad
# pattern, so check that nothing unrelated on the box matches it)
kill -9 `ps -ef | grep gl | grep -v grep | awk '{print $2}'`

# remove the stale state under /var/lib/glusterd (this keeps glusterd.info)
rm -r /var/lib/glusterd/geo-replication /var/lib/glusterd/glustershd \
      /var/lib/glusterd/groups /var/lib/glusterd/hooks /var/lib/glusterd/nfs \
      /var/lib/glusterd/options /var/lib/glusterd/peers /var/lib/glusterd/quotad \
      /var/lib/glusterd/vols

# start the service, re-probe the other peer, then restart
service glusterfs-server start
gluster peer probe gfs1
service glusterfs-server restart

# pull the volume configuration back from gfs1
gluster volume sync gfs1 all
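
After the sync, a quick way to verify that the recovery took (gfsvolume is
the volume name in my setup) is something like:

# the peer should now show State: Peer in Cluster (Connected)
gluster peer status

# both bricks and the self-heal daemons should show Online = Y
gluster volume status

# trigger a full heal and check that no split-brain entries remain
gluster volume heal gfsvolume full
gluster volume heal gfsvolume info split-brain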





Thank You Kindly,
Kaamesh
Bioinformatician
Novocraft Technologies Sdn Bhd
C-23A-05, 3 Two Square, Section 19, 46300 Petaling Jaya
Selangor Darul Ehsan
Malaysia
Mobile: +60176562635
Ph: +60379600541
Fax: +60379600540

On Fri, Feb 27, 2015 at 8:51 AM, Kaamesh Kamalaaharan
<kaamesh at novocraft.com> wrote:

> Hi Atin,
>
> I tried flushing the iptables rules, and this time I managed to get the
> peer into the cluster. However, the self-heal daemon is still offline and
> I am unable to bring it back online on gfs2. Running a heal on either
> server gives a successful output, but when I check the heal info I am
> getting many split-brain errors on gfs2.
>
> Thank You Kindly,
> Kaamesh
>
>
> On Thu, Feb 26, 2015 at 5:40 PM, Atin Mukherjee <amukherj at redhat.com>
> wrote:
>
>> Could you check the N/W firewall settings? Flush the iptables rules using
>> iptables -F and retry.
>>
>> ~Atin
>>
>> On 02/26/2015 02:55 PM, Kaamesh Kamalaaharan wrote:
>> > Hi guys,
>> >
>> > I managed to get gluster running, but I am having a couple of issues
>> > with my setup: 1) my peer status is Rejected but Connected, and 2) my
>> > self-heal daemon is not running on one server and I am getting
>> > split-brain files. My setup is two gluster servers (gfs1 and gfs2) in
>> > replicate, each with a brick.
>> >
>> > 1) My peer status does not go into Peer in Cluster. Running a peer
>> > status command gives me State: Peer Rejected (Connected). At this
>> > point, the brick on gfs2 does not go online and I get this output:
>> >
>> >
>> > # gluster volume status
>> >
>> > Status of volume: gfsvolume
>> > Gluster process                              Port    Online  Pid
>> > ------------------------------------------------------------------------------
>> > Brick gfs1:/export/sda/brick                 49153   Y       15025
>> > NFS Server on localhost                      2049    Y       15039
>> > Self-heal Daemon on localhost                N/A     Y       15044
>> >
>> > Task Status of Volume gfsvolume
>> > ------------------------------------------------------------------------------
>> > There are no active volume tasks
>> >
>> >
>> >
>> > I have followed the method used in one of the threads and performed the
>> > following:
>> >
>> >    a) stop glusterd
>> >    b) rm all files in /var/lib/glusterd/ except for glusterd.info
>> >    c) start glusterd, probe gfs1 from gfs2, and run peer status, which
>> >       gives me:
>> >
>> >
>> > # gluster peer status
>> >
>> > Number of Peers: 1
>> >
>> >
>> > Hostname: gfs1
>> >
>> > Uuid: 49acc9c2-4809-4da5-a6f0-6a3d48314070
>> >
>> > State: Sent and Received peer request (Connected)
>> >
>> >
>> > The same thread mentioned that changing the status of the peer in
>> > /var/lib/glusterd/peers/{UUID} from status=5 to status=3 fixes this,
>> > and on restart of gfs1 the peer status goes to:
>> >
>> > # gluster peer status
>> >
>> > Number of Peers: 1
>> >
>> >
>> > Hostname: gfs1
>> >
>> > Uuid: 49acc9c2-4809-4da5-a6f0-6a3d48314070
>> >
>> > State: Peer in Cluster (Connected)
>> >
>> > This fixes the connection between the peers, and the volume status shows:
>> >
>> >
>> > Status of volume: gfsvolume
>> > Gluster process                              Port    Online  Pid
>> > ------------------------------------------------------------------------------
>> > Brick gfs1:/export/sda/brick                 49153   Y       10852
>> > Brick gfs2:/export/sda/brick                 49152   Y       17024
>> > NFS Server on localhost                      N/A     N       N/A
>> > Self-heal Daemon on localhost                N/A     N       N/A
>> > NFS Server on gfs2                           N/A     N       N/A
>> > Self-heal Daemon on gfs2                     N/A     N       N/A
>> >
>> > Task Status of Volume gfsvolume
>> > ------------------------------------------------------------------------------
>> > There are no active volume tasks
>> >
>> >
>> > Which brings us to problem 2.
>> >
>> > 2) My self-heal daemon is not alive.
>> >
>> > I fixed the self-heal on gfs1 by running
>> >
>> > # find <gluster-mount> -noleaf -print0 | xargs --null stat >/dev/null \
>> >     2>/var/log/gluster/<gluster-mount>-selfheal.log
>> >
>> > and running a volume status command gives me:
>> >
>> > # gluster volume status
>> >
>> > Status of volume: gfsvolume
>> > Gluster process                              Port    Online  Pid
>> > ------------------------------------------------------------------------------
>> > Brick gfs1:/export/sda/brick                 49152   Y       16660
>> > Brick gfs2:/export/sda/brick                 49152   Y       21582
>> > NFS Server on localhost                      2049    Y       16674
>> > Self-heal Daemon on localhost                N/A     Y       16679
>> > NFS Server on gfs2                           N/A     N       21596
>> > Self-heal Daemon on gfs2                     N/A     N       21600
>> >
>> > Task Status of Volume gfsvolume
>> > ------------------------------------------------------------------------------
>> > There are no active volume tasks
>> >
>> >
>> >
>> > However, running this on gfs2 does not fix the daemon.
>> >
>> > Restarting the gfs2 server brings me back to problem 1, and the cycle
>> > continues.
>> >
>> > Can anyone assist me with these issues? Thank you.
>> >
>> > Thank You Kindly,
>> > Kaamesh
>> >
>> >
>> >
>> >
>>
>> --
>> ~Atin
>>
>
>