[Gluster-users] Gluster Startup Issue

Sat Jun 25 15:17:33 UTC 2016

Notice that the error actually tells you to check the log file on server-ip-2, but you did not include any logs from that server.
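
To narrow that down, something like the following could pull just the error and warning lines from the glusterd log on server-ip-2 around the time of the failed heal. This is only a sketch: errors_near is a made-up helper name, the timestamp prefix is taken from the cmd_history.log entry quoted below, and the log directory in the usage comment is an assumption (it may differ on your install).

```shell
#!/usr/bin/env bash
# errors_near (hypothetical helper): read a gluster log on stdin and print
# only the E (error) and W (warning) lines whose timestamp starts with the
# given prefix, e.g. '2016-06-21 14:10'.
errors_near() {
    local prefix="$1"
    # First keep lines containing "[<prefix>", then keep those whose level
    # field (right after the closing bracket of the timestamp) is E or W.
    grep -F "[$prefix" | grep -E '^\[[^]]*\] [EW] '
}

# Usage sketch (log path is an assumption, adjust to your install):
#   ssh <server-ip-2> cat /var/log/glusterfs/etc-glusterfs-glusterd.vol.log \
#       | errors_near '2016-06-21 14:10'
```

The brick logs and the self-heal daemon log on that server may be worth filtering the same way.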

On June 21, 2016 10:22:14 AM PDT, Danny Lee <dannyl at vt.edu> wrote:
>Hello,
>
>We are currently figuring out how to add GlusterFS to our system to make
>our systems highly available using scripts.  We are using Gluster 3.7.11.
>
>Problem:
>Trying to migrate from a non-clustered system to a 3-node replicated
>GlusterFS cluster using scripts.  We have tried various things to make this
>work, but it sometimes leaves us in an undesirable state where, if you call
>"gluster volume heal <volname> full", we get the error message, "Launching
>heal operation to perform full self heal on volume <volname> has been
>unsuccessful on bricks that are down. Please check if all brick processes
>are running."  All the brick processes are running, according to the
>command "gluster volume status volname".
>
>Things we have tried:
>(in order of preference)
>1. Create a volume with 3 filesystems that all hold the same data
>2. Create a volume with 2 empty filesystems and one with the data
>3. Create a volume with only one filesystem with data, then use the
>"add-brick" command to add the other two empty filesystems
>4. Create a volume with one empty filesystem, mount it, and copy the data
>over to it; then use the "add-brick" command to add the other two empty
>filesystems
>5. Create a volume with 3 empty filesystems, mount it, and then copy the
>data over
>
>Other things to note:
>A few minutes after the volume is created and started successfully, our
>application server starts up against it, so reads and writes may happen
>pretty quickly after the volume has started.  But there is only about 50MB
>of data.
>
>Steps to reproduce (all in a script):
># This is run by the primary node, with the IP address <server-ip-1>, that has the data
>systemctl restart glusterd
>gluster peer probe <server-ip-2>
>gluster peer probe <server-ip-3>
># Wait for all peers in "gluster peer status" to be in the "Peer in Cluster" state
>gluster volume create <volname> replica 3 transport tcp ${BRICKS[0]} ${BRICKS[1]} ${BRICKS[2]} force
>gluster volume set <volname> nfs.disable true
>gluster volume start <volname>
>mkdir -p $MOUNT_POINT
>mount -t glusterfs <server-ip-1>:/volname $MOUNT_POINT
>find $MOUNT_POINT | xargs stat
>
>Note that when we added sleeps around the gluster commands, there was a
>higher probability of success, but not 100%.
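
Since fixed sleeps only improved the odds, one alternative is to poll for the condition each step depends on. A rough sketch follows, with both function names (wait_for, peers_ready) invented for illustration, and assuming that "gluster peer status" prints one "State: Peer in Cluster (Connected)" line per healthy peer. It only replaces the "wait for peer status" step; whether it also avoids the heal failure is untested.

```shell
#!/usr/bin/env bash
# wait_for (hypothetical helper): retry a command about once per second
# until it succeeds; give up after roughly <timeout> seconds of sleeping.
wait_for() {
    local timeout="$1"; shift
    local waited=0
    until "$@"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# peers_ready (hypothetical helper): read "gluster peer status" output on
# stdin and succeed only if exactly <expected> peers are reported as
# "Peer in Cluster (Connected)".
peers_ready() {
    local expected="$1" ok
    ok=$(grep -c '^State: Peer in Cluster (Connected)' || true)
    [ "$ok" -eq "$expected" ]
}

# Usage sketch for a 3-node cluster (2 peers as seen from the primary):
#   check_peers() { gluster peer status | peers_ready 2; }
#   wait_for 60 check_peers || { echo "peers not ready" >&2; exit 1; }
```

The same wait_for wrapper could be pointed at other checks in the script, for example "mountpoint -q $MOUNT_POINT" on the clients.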
>
># Once the volume is started, all the clients/servers will mount the gluster
># filesystem by polling "mountpoint -q $MOUNT_POINT":
>mkdir -p $MOUNT_POINT
>mount -t glusterfs <server-ip-1>:/volname $MOUNT_POINT
>
>Logs:
>*etc-glusterfs-glusterd.vol.log* in *server-ip-1*
>
>[2016-06-21 14:10:38.285234] I [MSGID: 106533] [glusterd-volume-ops.c:857:__glusterd_handle_cli_heal_volume] 0-management: Received heal vol req for volume volname
>[2016-06-21 14:10:38.296801] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Commit failed on <server-ip-2>. Please check log file for details.
>
>
>*usr-local-volname-data-mirrored-data.log* in *server-ip-1*
>
>[2016-06-21 14:14:39.233366] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-volname-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
>*I think this is caused by the self heal daemon*
>
>*cmd_history.log* in *server-ip-1*
>
>[2016-06-21 14:10:38.298800]  : volume heal volname full : FAILED : Commit failed on <server-ip-2>. Please check log file for details.
>
>
