[Gluster-devel] glusterfs(3.2.7) hang when making the same dir at the same time

Anand Avati anand.avati at gmail.com
Thu Jan 31 20:19:00 UTC 2013


Can you also give the output of "getfattr -d -m . -e hex /backend/dir"
from each of the bricks? It would be interesting to know whether there was
a gfid mismatch somehow.
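
(For reference, a sketch of that check; the brick directory path is
illustrative and the gfid value is a placeholder:)

    # Run on each brick server against the directory's backend path:
    getfattr -d -m . -e hex /xmail/disk2/gfs28/songcl/b83

    # A gfid mismatch would show a different trusted.gfid value on
    # each brick, e.g.:
    #   # file: xmail/disk2/gfs28/songcl/b83
    #   trusted.gfid=0x6eae61df1f5c4bbf8b4c8a1a43b767a6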

Avati

On Thu, Jan 31, 2013 at 1:47 AM, Song <gluster at 163.com> wrote:

> Joe,
>
> I tested it again, dumped the related glusterfs state, and created a bug
> report on bugzilla:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=906238
>
> I used "kill -USR1 <hung glusterfs client process ID>" to dump the state
> and found that "gfs28-replicate-5" may be hung. Then I dumped the
> glusterfsd state of "Brick16: 10.1.10.188:/xmail/disk2/gfs28" and found
> that "/xmail/disk2/gfs28/songcl/b83/003.txt" is opened twice, according
> to the "ls -asl /proc/<pid>/fd" command.
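>
> (For reference, a sketch of the dump-and-inspect steps above; the PIDs
> are the ones from this report, and the statedump path is an assumption
> for 3.2.x:)
>
>     # Ask the hung client and the brick daemon to write a statedump
>     # (on 3.2.x this lands in /tmp/glusterdump.<pid> by default):
>     kill -USR1 6988    # hung glusterfs client
>     kill -USR1 31100   # glusterfsd of Brick16
>
>     # List the brick daemon's open file descriptors to spot a file
>     # that is held open twice:
>     ls -asl /proc/31100/fd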
>
>
> This file may be deadlocked, according to the corresponding glusterfsd
> log:
>
> [2013-01-31 13:42:20.927077] T [rpcsvc.c:187:rpcsvc_program_actor]
> 0-rpc-service: Actor found: GlusterFS 3.2.7 - INODELK
>
> [2013-01-31 13:42:20.927090] T [server-resolve.c:127:resolve_loc_touchup]
> 0-gfs28-server: return value inode_path 11
>
> [2013-01-31 13:42:20.927104] T [common.c:103:get_domain] 0-posix-locks:
> Domain gfs28-replicate-5 found
>
> [2013-01-31 13:42:20.927113] T [inodelk.c:218:__lock_inodelk]
> 0-gfs28-locks: Lock (pid=1059928640) lk-owner:140197382404672
> 9223372036854775806 - 0 => Blocked
>
> [2013-01-31 13:42:20.927123] T [inodelk.c:486:pl_inode_setlk]
> 0-gfs28-locks: Lock (pid=1059928640) (lk-owner=140197382404672)
> 9223372036854775806 - 0 => NOK
>
> [2013-01-31 13:42:20.927132] T [inodelk.c:218:__lock_inodelk]
> 0-gfs28-locks: Lock (pid=1059928640) lk-owner:140197382404672
> 9223372036854775806 - 0 => Blocked
>
> [2013-01-31 13:42:20.933429] T [rpcsvc.c:443:rpcsvc_handle_rpc_call]
> 0-rpcsvc: Client port: 987
>
>
> For more information, please refer to the attachments:
>
> 1. PID 6988 is the state dump of the hung glusterfs client.
>
> 2. PID 31100 is the state dump of the glusterfsd process for "Brick16".
>
> 3. 188-xmail-disk2-gfs28.log.splitab is the glusterfsd log of "Brick16".
>
> If you need any other debug information, please tell me.
>
> Thanks very much!
>
> From: Joe Julian [mailto:joe at julianfamily.org]
> Sent: Friday, January 25, 2013 12:15 AM
> To: Song; gluster-devel at nongnu.org
> Subject: Re: [Gluster-devel] glusterfs(3.2.7) hang when making the same
> dir at the same time
>
> This looks like a support question to me. If you are asking a development
> question, you might want to use strace or gdb to figure out where the hang
> is, file a bug report on bugzilla, and submit your patch(es) to gerrit.
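>
> (A minimal sketch of that suggestion, assuming the hung client's PID is
> 6988 as in the dumps above:)
>
>     # Show which syscall the hung process is blocked in:
>     strace -f -p 6988
>
>     # Or capture a backtrace of every thread without staying attached:
>     gdb -p 6988 -batch -ex "thread apply all bt"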
>
> Song <gluster at 163.com> wrote:
>
> Hi,
>
> Recently, glusterfs hangs when we do stress testing. To find the reason,
> we wrote a test shell script.
>
> We run the test script on 5 servers at the same time. After a while, all
> test programs hang. Executing the command "cd /xmail/gfs1/scl_test/001"
> also hangs.
>
> The test shell script:
>
> for ((i = 1; i <= 100; i++)); do
>     rmdir /xmail/gfs1/scl_test/001
>     if [ "$?" == "0" ]; then
>         echo "delete dir success"
>     fi
>
>     mkdir /xmail/gfs1/scl_test/001
>     if [ "$?" == "0" ]; then
>         echo "create dir success"
>     fi
>
>     echo "1111" >> /xmail/gfs1/scl_test/001/001.txt
>     echo "2222" >> /xmail/gfs1/scl_test/001/002.txt
>     echo "3333" >> /xmail/gfs1/scl_test/001/003.txt
>
>     rm -rf /xmail/gfs1/scl_test/001/001.txt
>     rm -rf /xmail/gfs1/scl_test/001/002.txt
>     rm -rf /xmail/gfs1/scl_test/001/003.txt
> done
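>
> (For illustration, the 5-server concurrent run could be driven like
> this; the hostnames and script path are assumptions:)
>
>     # Launch the loop above simultaneously on five client nodes:
>     for host in d181 d182 d183 d184 d185; do
>         ssh "$host" 'bash /root/scl_test.sh' &
>     done
>     wait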
>
>
> "/xmail/gfs1" is the native (FUSE) mount point of the gluster volume gfs1.
>
> The gluster volume info is as follows:
>
> [root at d181 glusterfs]# gluster volume info
>
> Volume Name: gfs1
> Type: Distributed-Replicate
> Status: Started
> Number of Bricks: 30 x 3 = 90
> Transport-type: tcp
>
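> (For context, a distributed-replicate volume like this is built from
> replica sets of 3 bricks listed in order; a smaller 2 x 3 = 6 analog of
> the create command, with illustrative server and brick names:)
>
>     gluster volume create gfs1 replica 3 transport tcp \
>         srv1:/xmail/disk1/gfs1 srv2:/xmail/disk1/gfs1 srv3:/xmail/disk1/gfs1 \
>         srv1:/xmail/disk2/gfs1 srv2:/xmail/disk2/gfs1 srv3:/xmail/disk2/gfs1
>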
>
> Please help me. Thanks!
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> https://lists.nongnu.org/mailman/listinfo/gluster-devel

