[Bugs] [Bug 1387364] New: glusterfs-client box dies when trying to write to gluster volume

Thu Oct 20 17:23:56 UTC 2016

https://bugzilla.redhat.com/show_bug.cgi?id=1387364

            Bug ID: 1387364
           Summary: glusterfs-client box dies when trying to write to
                    gluster volume
           Product: GlusterFS
           Version: 3.8
         Component: glusterd
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: julioguevara150 at gmail.com
                CC: bugs at gluster.org

Description of problem:
I have a two-node replica glusterfs setup to keep a git repository available.
The gluster servers (lainf-git01p/10.10.66.123 and kcinf-git01p/10.10.64.123)
both have a /dev/sdb1 with xfs mounted on /data/brick1. A gluster volume has
been created in path /data/brick1/vg0 and exported with name gv0. These same
servers mount the exported gluster volume with the command 'mount -t glusterfs
lainf-git01p:/gv0 /gitlab-data', and both are able to mount it with no issues.
Both machines can list the files without problems. Problems emerge when I try
to start writing files to the gluster volume.

Whenever I execute 'dd if=/dev/urandom of=/gitlab-data/1 count=1 bs=100M' from
node kcinf-git01p, everything seems to work fine: I see the same file
replicated to the bricks on lainf-git01p and kcinf-git01p and listed under the
/gitlab-data mountpoint.

But when I execute the same command from lainf-git01p, the dd command never
finishes its execution. I can see the file replicated over to kcinf-git01p and
cat its content, but lainf-git01p starts melting down. The system becomes
unresponsive: no commands can be run, no new ssh sessions can be created, and
the system seems to be waiting for an event. Very quickly the system becomes
unusable, and a hard reset is needed to get it back up.

Version-Release number of selected component (if applicable):
glusterfs.x86_64 3.8.4-1.el6
glusterfs-api.x86_64 3.8.4-1.el6
glusterfs-cli.x86_64 3.8.4-1.el6
glusterfs-client-xlators.x86_64 3.8.4-1.el6
glusterfs-fuse.x86_64 3.8.4-1.el6
glusterfs-libs.x86_64 3.8.4-1.el6
glusterfs-server.x86_64 3.8.4-1.el6

Packages from the CentOS Storage SIG
uname -r: 2.6.32-431.el6.x86_64
distro: CentOS release 6.5 (Final)

How reproducible:
Whenever I execute 'dd if=/dev/urandom of=/gitlab-data/1 count=1 bs=100M' from
lainf-git01p, the whole box comes to a grinding halt. Commands like kill and
top won't respond, new ssh connections cannot be created, and the box needs to
be hard rebooted in order to work again.

Steps to Reproduce:
1. mount -t glusterfs lainf-git01p:gv0 /gitlab-data
2. dd if=/dev/urandom of=/gitlab-data/1 count=1 bs=100M
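
Anyone retrying step 2 may prefer a bounded variant of the write over a bare
dd. The sketch below is not part of the original report: write_probe, its
arguments, and the 120-second default are hypothetical names and values chosen
for illustration, and it assumes timeout from GNU coreutils is installed. It
runs the same dd under a deadline, so a hang shows up as a non-zero exit
status. A process stuck in uninterruptible sleep may not react to the signal
that timeout sends, so this may not prevent the lockup described in this bug.

```shell
#!/bin/sh
# Hypothetical helper, not from the original report.
# Usage: write_probe TARGET_DIR [BLOCK_SIZE] [COUNT] [TIMEOUT_SECONDS]
# Defaults mirror step 2 (bs=100M, count=1); the 120 s limit is arbitrary.
write_probe() {
    target_dir=$1
    block_size=${2:-100M}
    count=${3:-1}
    limit=${4:-120}
    out="$target_dir/write_probe.$$"

    # timeout(1) sends SIGTERM to dd once the limit expires and exits non-zero.
    if timeout "$limit" dd if=/dev/urandom of="$out" \
            bs="$block_size" count="$count" 2>/dev/null; then
        echo "write completed: $out"
        rm -f "$out"
        return 0
    fi

    echo "write failed or did not finish within ${limit}s: $out" >&2
    return 1
}

# Example (commented out so that sourcing this file has no side effects):
# write_probe /gitlab-data
```

Running write_probe against a local directory such as /tmp first is a cheap
way to check the helper itself before pointing it at the gluster mount.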

Actual results:
When file 1 is listed from the other client (kcinf-git01p), it has the correct
size, but the client (lainf-git01p) never finishes executing the dd command; it
hangs waiting even though the file has already been created. After this point
the box starts melting down until you are forced to do a hard reboot of the
system.

Expected results:
The dd command reports that it completed successfully, and no issues arise
afterwards.

Additional info:
Latency between the machines is 35ms.
