[Gluster-users] How-to start gluster when only one node is up ?
Steve Dainard
sdainard at spd1.com
Mon Nov 2 22:25:30 UTC 2015
Personally I'd be much more interested in development/testing resources
going into large scale glusterfs clusters, rather than small office setups
or home use. Keep in mind this is a PB scale filesystem clustering
technology.
For home use I don't really see what advantage replica 2 would provide. I'd
probably do two single nodes, and have the primary node geo-replicate to
the secondary node so my data would stay intact if the primary node failed. In a
small office I could switch the DNS record to the 2nd node for failover. In
fact I probably wouldn't (and don't) use gluster at home at all; there are
other volume managers with snapshots and send/receive capabilities that
suit a small environment.
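For what it's worth, that kind of one-way sync can be done with gluster's
built-in geo-replication; roughly the commands below, where "homevol" and
"backup-host" are placeholder names and passwordless root SSH between the
nodes is assumed:

# create, start and check a one-way geo-replication session from the
# primary volume to a same-named volume on the standby box (names are examples)
gluster volume geo-replication homevol backup-host::homevol create push-pem
gluster volume geo-replication homevol backup-host::homevol start
gluster volume geo-replication homevol backup-host::homevol status

That gives you an async copy on the second box without the primary ever
depending on it being up.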
Really, if your data is important at such a small scale, I'd be looking at a
single file server and cloud replication. S3 is about $3/month for 100GB of data
and $60/month for 2TB, can store multiple versions, and can move
old versions into Glacier storage. Any individual or small business should be
able to work out what its data is worth and how much of it they
want to pay to back up. Over 3 years it might even be cheaper than a 2nd
node plus dealing with maintenance and split-brains.
BTW I agree with your issues regarding releases. I've found the best
method is to stick to a branch marked as stable. I tested 3.7.3 and it was
a bit of a disaster, but 3.6.6 hasn't given me any grief yet.
Steve
On Fri, Oct 30, 2015 at 6:40 AM, Mauro M. <gluster at ezplanet.net> wrote:
> Atin,
>
> Sorry, I should have said that the design does not suit the needs of an
> ON/STANDBY cluster configuration, and I would like it to be changed to
> cater for this popular use case in home and small office applications.
>
> Up to release 3.5 it was perfect, and besides, I had never experienced
> split-brain situations; in fact, until I was on 3.5 I did not even realize
> there could be split brains (I am a "use without reading the manuals" guy;
> if I had to add the time necessary to read the manuals of everything I use
> I would turn 190 before I was done with it). I skipped 3.6 altogether
> because 3.6.1 did not even start my bricks. Later I upgraded to 3.7 and
> that is when the troubles started: split brains that periodically pop up
> even though I never have a case where files are accessed at the same time
> from two nodes (I am the only user of my systems and the second node is
> only there to replicate), and issues getting the cluster to work on a
> single node.
>
> Mauro
>
> On Fri, October 30, 2015 12:14, Atin Mukherjee wrote:
> > -Atin
> > Sent from one plus one
> > On Oct 30, 2015 5:28 PM, "Mauro Mozzarelli" <mauro at ezplanet.net> wrote:
> >>
> >> Hi,
> >>
> >> Atin keeps giving the same answer: "it is by design"
> >>
> >> I keep saying "the design is wrong and it should be changed to cater for
> >> standby servers"
> > Every design has got its own set of limitations, and I would call this a
> > limitation rather than saying the overall design itself is wrong. I would
> > again stand by my point, as correctness is always a priority in a
> > distributed system. This behavioural change was introduced in 3.5, and if
> > it was not included in the release notes I apologize on behalf of the
> > release management.
> > As communicated earlier, we will definitely resolve this issue in
> > GlusterD2.
> >>
> >> In the meantime this is the workaround I am using:
> >> When the single node starts, I stop and start the volume, and then it
> >> becomes mountable. On CentOS 6 and CentOS 7 this works with releases up
> >> to 3.7.4. Release 3.7.5 is broken, so I reverted to 3.7.4.
> > This is where I am not convinced. An explicit volume start should start
> > the bricks; can you raise a BZ with all the relevant details?
> >>
> >> In my experience glusterfs releases are a bit hit and miss. Often
> >> something stops working with a newer release, then after a few more
> >> releases it works again or there is a workaround ... Not quite the
> >> stability one would want for commercial use, so at the moment I can
> >> risk using it only for my home servers, hence the cluster with a node
> >> always ON and the second as STANDBY.
> >>
> >> MOUNT=/home
> >> LABEL="GlusterFS:"
> >> # Already mounted: just make sure the volume is started (no-op if it is).
> >> if grep -qs $MOUNT /proc/mounts; then
> >>     echo "$LABEL $MOUNT is mounted"
> >>     gluster volume start gv_home 2>/dev/null
> >> else
> >>     echo "$LABEL $MOUNT is NOT mounted"
> >>     echo "$LABEL Restarting gluster volume ..."
> >>     # Stop/start cycle so glusterd spawns the brick even with only this node up.
> >>     yes | gluster volume stop gv_home > /dev/null
> >>     gluster volume start gv_home
> >>     mount -t glusterfs sirius-ib:/gv_home $MOUNT
> >>     if grep -qs $MOUNT /proc/mounts; then
> >>         echo "$LABEL $MOUNT is mounted"
> >>         gluster volume start gv_home 2>/dev/null
> >>     else
> >>         echo "$LABEL failure to mount $MOUNT"
> >>     fi
> >> fi
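> >>
> >> (To run this automatically at boot, one option is to save the check
> >> above as a script and call it from rc.local; the path below is just an
> >> example:
> >>
> >>     # hypothetical location for the script above
> >>     chmod +x /usr/local/sbin/check_gv_home.sh
> >>     echo '/usr/local/sbin/check_gv_home.sh' >> /etc/rc.d/rc.local
> >>
> >> so the volume is restarted and mounted after a single-node boot.)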
> >>
> >> I hope this helps.
> >> Mauro
> >>
> >> On Fri, October 30, 2015 11:48, Atin Mukherjee wrote:
> >> > -Atin
> >> > Sent from one plus one
> >> > On Oct 30, 2015 4:35 PM, "Remi Serrano" <rserrano at pros.com> wrote:
> >> >>
> >> >> Hello,
> >> >>
> >> >>
> >> >>
> >> >> I set up a gluster file cluster with 2 nodes. It works fine.
> >> >>
> >> >> But when I shut down both nodes and start up only one node, I cannot
> >> >> mount the share:
> >> >>
> >> >>
> >> >>
> >> >> [root at xxx ~]# mount -t glusterfs 10.32.0.11:/gv0 /glusterLocalShare
> >> >>
> >> >> Mount failed. Please check the log file for more details.
> >> >>
> >> >>
> >> >>
> >> >> Log says :
> >> >>
> >> >> [2015-10-30 10:33:26.147003] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.5 (args: /usr/sbin/glusterfs --volfile-server=127.0.0.1 --volfile-id=/gv0 /glusterLocalShare)
> >> >>
> >> >> [2015-10-30 10:33:26.171964] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
> >> >>
> >> >> [2015-10-30 10:33:26.185685] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
> >> >>
> >> >> [2015-10-30 10:33:26.186972] I [MSGID: 114020] [client.c:2118:notify] 0-gv0-client-0: parent translators are ready, attempting connect on transport
> >> >>
> >> >> [2015-10-30 10:33:26.191823] I [MSGID: 114020] [client.c:2118:notify] 0-gv0-client-1: parent translators are ready, attempting connect on transport
> >> >>
> >> >> [2015-10-30 10:33:26.192209] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-gv0-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
> >> >>
> >> >> [2015-10-30 10:33:26.192339] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-gv0-client-0: disconnected from gv0-client-0. Client process will keep trying to connect to glusterd until brick's port is available
> >> >>
> >> >>
> >> >>
> >> >> And when I check the volumes I get:
> >> >>
> >> >> [root at xxx ~]# gluster volume status
> >> >>
> >> >> Status of volume: gv0
> >> >>
> >> >> Gluster process                             TCP Port  RDMA Port  Online  Pid
> >> >> ------------------------------------------------------------------------------
> >> >> Brick 10.32.0.11:/glusterBrick1/gv0         N/A       N/A        N       N/A
> >> >> NFS Server on localhost                     N/A       N/A        N       N/A
> >> >> NFS Server on localhost                     N/A       N/A        N       N/A
> >> >>
> >> >> Task Status of Volume gv0
> >> >> ------------------------------------------------------------------------------
> >> >> There are no active volume tasks
> >> >>
> >> >>
> >> >>
> >> >> If I start the second node, all is OK.
> >> >>
> >> >>
> >> >>
> >> >> Is this normal?
> >> > This behaviour is by design. In a multi-node cluster, when GlusterD comes
> >> > up it doesn't start the bricks until it receives the configuration from
> >> > one of its peers, to ensure that stale information is not being referred
> >> > to. In your case, since the other node is down, the bricks are not
> >> > started and hence the mount fails.
> >> > As a workaround, we recommend adding a dummy node to the cluster to
> >> > avoid this issue.
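> >> > For example, assuming the spare machine is reachable as "dummy-host" (a
> >> > placeholder name), adding it to the trusted pool from the existing node
> >> > is just:
> >> >
> >> > # probe the dummy peer once from a node that is already in the pool
> >> > gluster peer probe dummy-host
> >> > gluster peer status
> >> >
> >> > With that extra peer up, a restarted node can fetch the volume
> >> > configuration from it and start its bricks even while the second data
> >> > node is still down.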
> >> >>
> >> >>
> >> >>
> >> >> Regards,
> >> >>
> >> >>
> >> >>
> >> >> Rémi
> >> >>
> >> >>
> >> >>
> >> >>
> >>
> >>
> >> --
> >> Mauro Mozzarelli
> >> Phone: +44 7941 727378
> >> eMail: mauro at ezplanet.net
> >>
>
>
> --
> Mauro Mozzarelli
> Phone: +44 7941 727378
> eMail: mauro at ezplanet.net
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>