[Gluster-users] Issues with replicated gluster volume

Karthik Subrahmanya ksubrahm at redhat.com
Wed Jun 17 06:32:38 UTC 2020


Hi Ahemad,

Sorry for all the back and forth on this, but we need a few more
details to find the actual cause.
Which version of gluster are you running on the server and client nodes?
Please also provide statedumps [1] of the bricks and of the client
process, taken while the hang is seen.

[1] https://docs.gluster.org/en/latest/Troubleshooting/statedump/
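
Assuming the volume name (glustervol) and fuse mount point (/mnt) from
your earlier mails, something along these lines should collect all of
that (paths and process matching may need adjusting on your systems):

# version, on every server and client
gluster --version      # servers
glusterfs --version    # clients

# brick statedumps, written to /var/run/gluster by default
gluster volume statedump glustervol

# client statedump: send SIGUSR1 to the fuse client process
kill -USR1 $(pgrep -f 'glusterfs.*/mnt')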

Regards,
Karthik

On Wed, Jun 17, 2020 at 9:25 AM ahemad_shaik at yahoo.com <
ahemad_shaik at yahoo.com> wrote:

> I have a 3-replica gluster volume created across 3 nodes. When one node
> went down due to some issue, the clients were not able to access the
> volume; that was the problem. I have fixed the server and it is back, but
> there was downtime at the client. I just want to avoid that downtime,
> since it is a 3-replica volume.
>
> I am testing high availability now by manually rebooting or shutting
> down one of the brick servers. I just want the volume to always remain
> accessible to the client; that is the reason we went for a replica volume.
>
> So I would just like to know how to keep the volume highly available to
> clients even when a VM or node hosting a gluster brick goes down
> unexpectedly. We had downtime of 10 hours.
>
>
>
> The glusterfsd service (which is only used for stopping) is disabled in
> my cluster, and I see one more service running, glusterd.
>
> Will starting the glusterfsd service on all 3 replica nodes help in
> achieving what I am trying to do?
>
> Hope I am clear.
>
> Thanks,
> Ahemad
>
>
> On Tue, Jun 16, 2020 at 23:12, Strahil Nikolov
> <hunter86_bg at yahoo.com> wrote:
> In my cluster, the service is enabled and running.
>
> What actually is your problem?
> When a gluster brick process dies unexpectedly, all fuse clients will be
> waiting for the timeout.
> The glusterfsd service ensures that during system shutdown the brick
> processes are shut down in such a way that all native clients won't
> 'hang' waiting for the timeout, but will directly choose another brick.
>
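> For reference, the timeout in question is the volume's
> network.ping-timeout option, which defaults to 42 seconds. You can check
> it with:
>
> gluster volume get glustervol network.ping-timeout
>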
> The same happens when you manually run the kill script: all gluster
> processes shut down and all clients are redirected to another brick.
>
> Keep in mind that fuse mounts will also be killed, both by the script and
> the glusterfsd service.
>
> Best Regards,
> Strahil Nikolov
>
> On 16 June 2020 19:48:32 GMT+03:00, ahemad shaik <ahemad_shaik at yahoo.com>
> wrote:
> >Hi Strahil,
> >
> >I have the gluster setup on a CentOS 7 cluster. I see the glusterfsd
> >service and it is in the inactive state:
> >
> >systemctl status glusterfsd.service
> >● glusterfsd.service - GlusterFS brick processes (stopping only)
> >   Loaded: loaded (/usr/lib/systemd/system/glusterfsd.service; disabled; vendor preset: disabled)
> >   Active: inactive (dead)
> >
> >So you mean that starting this service on all the nodes where gluster
> >volumes are created will solve the issue?
> >
> >Thanks,
> >Ahemad
> >
> >
> >On Tuesday, 16 June, 2020, 10:12:22 pm IST, Strahil Nikolov
> ><hunter86_bg at yahoo.com> wrote:
> >
> > Hi ahemad,
> >
> >The script kills all gluster processes, so the clients won't wait
> >for the timeout before switching to another node in the TSP.
> >
> >In CentOS/RHEL, there is a systemd service called 'glusterfsd.service'
> >that takes care of killing all brick processes on shutdown, so clients
> >won't hang.
> >
> >systemctl cat glusterfsd.service --no-pager
> ># /usr/lib/systemd/system/glusterfsd.service
> >[Unit]
> >Description=GlusterFS brick processes (stopping only)
> >After=network.target glusterd.service
> >
> >[Service]
> >Type=oneshot
> ># glusterd starts the glusterfsd processes on-demand
> ># /bin/true will mark this service as started, RemainAfterExit keeps it
> ># active
> >ExecStart=/bin/true
> >RemainAfterExit=yes
> ># if there are no glusterfsd processes, a stop/reload should not give
> >an error
> >ExecStop=/bin/sh -c "/bin/killall --wait glusterfsd || /bin/true"
> >ExecReload=/bin/sh -c "/bin/killall -HUP glusterfsd || /bin/true"
> >
> >[Install]
> >WantedBy=multi-user.target
> >
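> >For this to take effect at shutdown, the unit has to be enabled and
> >active on each brick node (ExecStart is just /bin/true; the real work
> >happens in ExecStop when the node goes down):
> >
> >systemctl enable glusterfsd.service
> >systemctl start glusterfsd.service
> >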
> >Best Regards,
> >Strahil  Nikolov
> >
> >On 16 June 2020 18:41:59 GMT+03:00, ahemad shaik
> ><ahemad_shaik at yahoo.com> wrote:
> >>Hi,
> >>
> >>I see there is a script at the below path on all the nodes from which
> >>the gluster volume was created:
> >>
> >>/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh
> >>
> >>Do I need to create a systemd service that calls this script whenever
> >>some server goes down, or does it need to be running all the time, so
> >>that when some node is down the client will not have any issues
> >>accessing the mount point?
> >>
> >>Can you please share any documentation on how to use this? That would
> >>be a great help.
> >>
> >>Thanks,
> >>Ahemad
> >>
> >>
> >>
> >>
> >>On Tuesday, 16 June, 2020, 08:59:31 pm IST, Strahil Nikolov
> >><hunter86_bg at yahoo.com> wrote:
> >>
> >> Hi Ahemad,
> >>
> >>You can simplify it by creating a systemd service that will call
> >>the script.
> >>
> >>It was already mentioned in a previous thread (with an example), so
> >>you can just use it.
> >>
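> >>A minimal sketch of such a unit, assuming the script path above (the
> >>unit name is just an example; it mirrors glusterfsd.service but runs
> >>the stop-all script on the way down):
> >>
> >># /etc/systemd/system/gluster-stop-on-shutdown.service
> >>[Unit]
> >>Description=Stop all gluster processes cleanly at shutdown
> >>After=network.target glusterd.service
> >>
> >>[Service]
> >>Type=oneshot
> >>RemainAfterExit=yes
> >>ExecStart=/bin/true
> >>ExecStop=/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh
> >>
> >>[Install]
> >>WantedBy=multi-user.target
> >>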
> >>Best Regards,
> >>Strahil  Nikolov
> >>
> >>On 16 June 2020 16:02:07 GMT+03:00, Hu Bert <revirii at googlemail.com>
> >>wrote:
> >>>Hi,
> >>>
> >>>If you simply reboot or shut down one of the gluster nodes, there might
> >>>be a (short or medium) unavailability of the volume for the clients. To
> >>>avoid this there's a script:
> >>>
> >>>/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh (path may
> >>>be different depending on the distribution)
> >>>
> >>>If I remember correctly, this notifies the clients that this node is
> >>>going to be unavailable (please correct me if the details are wrong).
> >>>When I reboot one gluster node, I always call this script first and
> >>>have never seen unavailability issues on the clients.
> >>>
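> >>>In practice that means running, on the node that is about to go down,
> >>>something like:
> >>>
> >>>/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh
> >>>reboot
> >>>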
> >>>
> >>>Regards,
> >>>Hubert
> >>>
> >>>On Mon, 15 June 2020 at 19:36, ahemad shaik
> >>><ahemad_shaik at yahoo.com> wrote:
> >>>>
> >>>> Hi There,
> >>>>
> >>>> I have created a replica-3 gluster volume with 3 bricks from 3 nodes.
> >>>>
> >>>> "gluster volume create glustervol replica 3 transport tcp
> >>node1:/data
> >>>node2:/data node3:/data force"
> >>>>
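> >>>> To verify the layout after creation, "gluster volume info glustervol"
> >>>> should report "Type: Replicate" and "Number of Bricks: 1 x 3 = 3".
> >>>>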
> >>>> I mounted it on the client node using the command below:
> >>>>
> >>>> "mount -t glusterfs node4:/glustervol    /mnt/"
> >>>>
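> >>>> Side note on that mount: node4 is only contacted to fetch the volume
> >>>> definition at mount time. You can list fallback volfile servers so the
> >>>> mount itself survives that server being down, e.g., assuming the same
> >>>> node names:
> >>>>
> >>>> "mount -t glusterfs -o backup-volfile-servers=node2:node3 node1:/glustervol /mnt/"
> >>>>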
> >>>> When any of the nodes (node1, node2 or node3) goes down, the gluster
> >>>> mount/volume (/mnt) is not accessible at the client (node4).
> >>>>
> >>>> The purpose of a replicated volume is high availability, but I am not
> >>>> able to achieve it.
> >>>>
> >>>> Is this a bug, or am I missing something?
> >>>>
> >>>>
> >>>> Any suggestions would be a great help! Kindly suggest.
> >>>>
> >>>> Thanks,
> >>>> Ahemad