[Bugs] [Bug 1142411] New: quota: bricks coredump while creating data inside a subdir and lookup going on in parallel

bugzilla at redhat.com bugzilla at redhat.com
Tue Sep 16 17:37:21 UTC 2014


https://bugzilla.redhat.com/show_bug.cgi?id=1142411

            Bug ID: 1142411
           Summary: quota: bricks coredump while creating data inside a
                    subdir and lookup going on in parallel
           Product: GlusterFS
           Version: 3.6.0
         Component: quota
          Severity: urgent
          Assignee: gluster-bugs at redhat.com
          Reporter: srangana at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com,
                    lmohanty at redhat.com, nbalacha at redhat.com,
                    nsathyan at redhat.com, saujain at redhat.com,
                    spalai at redhat.com, ssamanta at redhat.com
        Depends On: 1139473, 1140084



+++ This bug was initially created as a clone of Bug #1140084 +++

+++ This bug was initially created as a clone of Bug #1139473 +++

Description of problem:
For a volume, I set a quota of 100GB on a subdir and was just trying to
create data inside that directory.
Meanwhile I started lookups (`find . | xargs stat`) in a loop from another
mount point.

While this data creation was running, I also renamed files in some other
directory and triggered a rebalance. In fact, both the renames in the other
directory and the rebalance finished while the data creation was still going
on. So after some time only the data creation and the lookups were running in
parallel, each from a different mount point.

Version-Release number of selected component (if applicable):
glusterfs-3.6.0.28-1.el6rhs.x86_64

How reproducible:
Happened to be seen during the present exercise.

Steps to Reproduce:
1. create a 6x2 distribute-replicate volume and start it
2. enable quota
3. mount the volume over NFS on clients c1 and c2
4. create a directory, say "qa1dir/dir1", and give a non-root user ownership
of both qa1dir and dir1
5. set a limit of 100GB on qa1dir/dir1
6. create some other directory in the mount point and create some files in it
7. start creating data inside qa1dir/dir1 until the quota limit is reached
8. start renaming files in the directory created in step 6
9. start rebalance
10. start "find . | xargs stat" in a for loop from client c2


Actual results:
Several bricks dumped core.

Brick log:
pending frames:
frame : type(0) op(8)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2014-09-08 01:32:12
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.0.28
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7fd76c2ddf06]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x7fd76c2f859f]
/lib64/libc.so.6[0x3c6be326b0]
/usr/lib64/libglusterfs.so.0(uuid_unpack+0x0)[0x7fd76c319190]
/usr/lib64/libglusterfs.so.0(+0x5ae66)[0x7fd76c318e66]
/usr/lib64/libglusterfs.so.0(uuid_utoa+0x37)[0x7fd76c2f5cb7]
/usr/lib64/glusterfs/3.6.0.28/xlator/features/quota.so(quota_rename_cbk+0x28e)[0x7fd760bb36be]
/usr/lib64/libglusterfs.so.0(default_rename_cbk+0xfd)[0x7fd76c2ec1fd]
/usr/lib64/libglusterfs.so.0(+0x4007a)[0x7fd76c2fe07a]
/usr/lib64/libglusterfs.so.0(call_resume+0xa0)[0x7fd76c2ff830]
/usr/lib64/glusterfs/3.6.0.28/xlator/features/marker.so(marker_rename_done+0x7a)[0x7fd760dd1e4a]
/usr/lib64/glusterfs/3.6.0.28/xlator/features/marker.so(marker_rename_release_newp_lock+0x2b4)[0x7fd760dd2414]
/usr/lib64/libglusterfs.so.0(default_inodelk_cbk+0xb9)[0x7fd76c2ea769]
/usr/lib64/glusterfs/3.6.0.28/xlator/features/locks.so(pl_common_inodelk+0x29f)[0x7fd761613caf]
/usr/lib64/glusterfs/3.6.0.28/xlator/features/locks.so(pl_inodelk+0x27)[0x7fd761614537]
/usr/lib64/libglusterfs.so.0(default_inodelk_resume+0x14d)[0x7fd76c2e640d]
/usr/lib64/libglusterfs.so.0(call_resume+0x2aa)[0x7fd76c2ffa3a]
/usr/lib64/glusterfs/3.6.0.28/xlator/performance/io-threads.so(iot_worker+0x158)[0x7fd7613fe348]
/lib64/libpthread.so.0[0x3c6c6079d1]
/lib64/libc.so.6(clone+0x6d)[0x3c6bee886d]
---------



bt of the coredump.

#0  uuid_unpack (in=0x8 <Address 0x8 out of bounds>, uu=0x7fd6fb6f54c0) at
../../contrib/uuid/unpack.c:44
#1  0x00007fd76c318e66 in uuid_unparse_x (uu=<value optimized out>,
out=0x7fd7181529e0 "1a71b01e-22bc-4b60-a0fc-3beac78cb403", fmt=0x7fd76c340d20
"%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x")
    at ../../contrib/uuid/unparse.c:55
#2  0x00007fd76c2f5cb7 in uuid_utoa (uuid=0x8 <Address 0x8 out of bounds>) at
common-utils.c:2138
#3  0x00007fd760bb36be in quota_rename_cbk (frame=0x7fd76b14cd64, cookie=<value
optimized out>, this=0x26563c0, op_ret=0, op_errno=61, buf=0x7fd76abb8518,
preoldparent=0x7fd76abb8668, 
    postoldparent=0x7fd76abb86d8, prenewparent=0x7fd76abb8748,
postnewparent=0x7fd76abb87b8, xdata=0x0) at quota.c:1931
#4  0x00007fd76c2ec1fd in default_rename_cbk (frame=0x7fd76b12de38,
cookie=<value optimized out>, this=<value optimized out>, op_ret=0,
op_errno=61, buf=<value optimized out>, preoldparent=0x7fd76abb8668, 
    postoldparent=0x7fd76abb86d8, prenewparent=0x7fd76abb8748,
postnewparent=0x7fd76abb87b8, xdata=0x0) at defaults.c:961
#5  0x00007fd76c2fe07a in call_resume_unwind (stub=0x7fd76abb7f18) at
call-stub.c:2604
#6  0x00007fd76c2ff830 in call_resume (stub=0x7fd76abb7f18) at call-stub.c:2843
#7  0x00007fd760dd1e4a in marker_rename_done (frame=0x7fd76b12de38,
cookie=<value optimized out>, this=0x2654e50, op_ret=<value optimized out>,
op_errno=<value optimized out>, xdata=<value optimized out>)
    at marker.c:1035
#8  0x00007fd760dd2414 in marker_rename_release_newp_lock
(frame=0x7fd76b12de38, cookie=<value optimized out>, this=0x2654e50,
op_ret=<value optimized out>, op_errno=<value optimized out>, 
    xdata=<value optimized out>) at marker.c:1101
#9  0x00007fd76c2ea769 in default_inodelk_cbk (frame=0x7fd76b16215c,
cookie=<value optimized out>, this=<value optimized out>, op_ret=0, op_errno=0,
xdata=<value optimized out>) at defaults.c:1175
#10 0x00007fd761613caf in pl_common_inodelk (frame=0x7fd76b12abd4, this=<value
optimized out>, volume=<value optimized out>, inode=<value optimized out>,
cmd=7, flock=<value optimized out>, 
    loc=0x7fd76abd0b7c, fd=0x0, xdata=0x0) at inodelk.c:792
#11 0x00007fd761614537 in pl_inodelk (frame=<value optimized out>, this=<value
optimized out>, volume=<value optimized out>, loc=<value optimized out>,
cmd=<value optimized out>, 
    flock=<value optimized out>, xdata=0x0) at inodelk.c:804
#12 0x00007fd76c2e640d in default_inodelk_resume (frame=0x7fd76b16215c,
this=0x2651390, volume=0x7fd71801c530 "dist-rep-marker", loc=0x7fd76abd0b7c,
cmd=7, lock=0x7fd76abd0c7c, xdata=0x0) at defaults.c:1577
#13 0x00007fd76c2ffa3a in call_resume_wind (stub=0x7fd76abd0b3c) at
call-stub.c:2460
#14 call_resume (stub=0x7fd76abd0b3c) at call-stub.c:2841
#15 0x00007fd7613fe348 in iot_worker (data=0x26830c0) at io-threads.c:214
#16 0x0000003c6c6079d1 in start_thread () from /lib64/libpthread.so.0
#17 0x0000003c6bee886d in clone () from /lib64/libc.so.6
(gdb) quit


Expected results:
No core dump expected.

Additional info:


--- Additional comment from RHEL Product and Program Management on 2014-09-09
08:52:21 MVT ---

Since this issue was entered in bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from Saurabh on 2014-09-09 09:54:39 MVT ---

Sosreports can be collected from here:
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1139473/

--- Additional comment from Nithya Balachandran on 2014-09-09 16:22:48 MVT ---

(gdb) bt
#0  uuid_unpack (in=0x8 <Address 0x8 out of bounds>, uu=0x7fd6fb6f54c0) at
../../contrib/uuid/unpack.c:44
#1  0x00007fd76c318e66 in uuid_unparse_x (uu=<value optimized out>,
out=0x7fd7181529e0 "1a71b01e-22bc-4b60-a0fc-3beac78cb403", fmt=0x7fd76c340d20
"%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x")
    at ../../contrib/uuid/unparse.c:55
#2  0x00007fd76c2f5cb7 in uuid_utoa (uuid=0x8 <Address 0x8 out of bounds>) at
common-utils.c:2138
#3  0x00007fd760bb36be in quota_rename_cbk (frame=0x7fd76b14cd64, cookie=<value
optimized out>, this=0x26563c0, op_ret=0, op_errno=61, buf=0x7fd76abb8518,
preoldparent=0x7fd76abb8668, 
    postoldparent=0x7fd76abb86d8, prenewparent=0x7fd76abb8748,
postnewparent=0x7fd76abb87b8, xdata=0x0) at quota.c:1931
#4  0x00007fd76c2ec1fd in default_rename_cbk (frame=0x7fd76b12de38,
cookie=<value optimized out>, this=<value optimized out>, op_ret=0,
op_errno=61, buf=<value optimized out>, preoldparent=0x7fd76abb8668, 
    postoldparent=0x7fd76abb86d8, prenewparent=0x7fd76abb8748,
postnewparent=0x7fd76abb87b8, xdata=0x0) at defaults.c:961

(gdb) f 3

#3  0x00007fd760bb36be in quota_rename_cbk (frame=0x7fd76b14cd64, cookie=<value
optimized out>, this=0x26563c0, op_ret=0, op_errno=61, buf=0x7fd76abb8518, 
    preoldparent=0x7fd76abb8668, postoldparent=0x7fd76abb86d8,
prenewparent=0x7fd76abb8748, postnewparent=0x7fd76abb87b8, xdata=0x0) at
quota.c:1931

The crash happens at:


                                gf_log (this->name, GF_LOG_WARNING,
                                        "new entry being linked (name:%s) for "
                                        "inode (gfid:%s) is already present "
                                        "in inode-dentry-list", dentry->name,
                                        uuid_utoa (local->newloc.inode->gfid));
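
The warning dereferences local->newloc.inode->gfid without checking the inode
first. Below is a minimal sketch of the kind of guard that review 8686
("quota: Avoid quota crash for inode being NULL") aims for; the exact merged
code may differ:

        /* Hedged sketch: log the gfid only when the inode is actually
         * present; the path in local->newloc would still identify the
         * entry otherwise. */
        if (local->newloc.inode != NULL) {
                gf_log (this->name, GF_LOG_WARNING,
                        "new entry being linked (name:%s) for "
                        "inode (gfid:%s) is already present "
                        "in inode-dentry-list", dentry->name,
                        uuid_utoa (local->newloc.inode->gfid));
        }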





(gdb) p local->newloc
$5 = {
  path = 0x7fd71c037d90
"/dir1//dir2//dir3//dir4//dir5//dir6//dir7//dir8//dir9//dir10//dir11//dir12//dir13//dir14//dir15//dir16//dir17//dir18//dir19//dir20//newdir5//file499",
name = 0x7fd71c037e1d "file499", inode = 0x0, parent = 0x7fd759edcc04, gfid =
'\000' <repeats 15 times>, 
  pargfid = "\020V8/H\351K\371\263\\\223\214:\323\016", <incomplete sequence
\344>}


(gdb) p local->newloc.inode
$47 = (inode_t *) 0x0



(gdb) set disassembly-flavor intel
(gdb) disassemble quota_rename_cbk
Dump of assembler code for function quota_rename_cbk:
   0x00007fd760bb3430 <+0>:    push   r15
   0x00007fd760bb3432 <+2>:    push   r14
...
   0x00007fd760bb369d <+621>:    add    rdi,0x8
   0x00007fd760bb36a1 <+625>:    call   0x7fd760bab408 <uuid_compare at plt>
   0x00007fd760bb36a6 <+630>:    test   eax,eax
   0x00007fd760bb36a8 <+632>:    jne    0x7fd760bb361b <quota_rename_cbk+491>
   0x00007fd760bb36ae <+638>:    mov    rdi,QWORD PTR [rbp+0xa0]
   0x00007fd760bb36b5 <+645>:    add    rdi,0x8
   0x00007fd760bb36b9 <+649>:    call   0x7fd760baaf38 <uuid_utoa at plt>
=> 0x00007fd760bb36be <+654>:    mov    rdx,QWORD PTR [rsp+0x38]
   0x00007fd760bb36c3 <+659>:    mov    QWORD PTR [rsp+0x8],rax
   0x00007fd760bb36c8 <+664>:    lea    r9,[rip+0xa569]        # 0x7fd760bbdc38
   0x00007fd760bb36cf <+671>:    lea    rsi,[rip+0x97e1]        #
0x7fd760bbceb7
   0x00007fd760bb36d6 <+678>:    mov    r8d,0x5
   0x00007fd760bb36dc <+684>:    mov    ecx,0x78f


(gdb) info reg
rax            0x0    0
rbx            0x7fd76b14cd64    140563191221604
rcx            0x3c6c8182a0    259518464672
rdx            0x7fd76c340d20    140563210046752
rsi            0x7fd6fb6f54c0    140561318106304
rdi            0x8    8      <--- this is the arg being passed to uuid_utoa()
rbp            0x266ee60    0x266ee60
rsp            0x7fd6fb6f5520    0x7fd6fb6f5520
r8             0x18    24
r9             0x2f524e8    49620200
r10            0x30    48
r11            0xc    12
r12            0x7fd76abb8518    140563185370392
r13            0x0    0
r14            0x26563c0    40199104
r15            0x0    0
rip            0x7fd760bb36be    0x7fd760bb36be <quota_rename_cbk+654>
eflags         0x246    [ PF ZF IF ]



(gdb) p $rbp+0xa0
$48 = (void *) 0x266ef00


(gdb) p &local->newloc->inode
$51 = (inode_t **) 0x266ef00

(gdb) p local->newloc->inode
$53 = (inode_t *) 0x0


Thus, the process crashed because it tried to access memory address 0x8:
local->newloc.inode is NULL, gfid sits at offset 0x8 inside inode_t (hence the
"add rdi,0x8" in the disassembly above), and uuid_utoa() was therefore handed
NULL + 0x8.
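
To make that 0x8 concrete, here is a small, self-contained C illustration.
(Hedged: demo_inode_t is a made-up stand-in, not the real inode_t, which has
many more members; what matters is that in this build gfid sits right after
the leading table pointer, i.e. at offset 8 on x86_64.)

    #include <stddef.h>
    #include <stdio.h>

    /* Made-up stand-in for the leading layout of inode_t; only the
     * offset of the gfid member matters here. */
    typedef struct {
            void          *table;    /* first member: a pointer   */
            unsigned char  gfid[16]; /* hence offset 8 on x86_64  */
    } demo_inode_t;

    int
    main (void)
    {
            demo_inode_t *inode = NULL; /* local->newloc.inode was NULL */

            /* Evaluating inode->gfid is address arithmetic only --
             * formally undefined on NULL, but exactly the
             * "mov rdi,[rbp+0xa0]; add rdi,0x8" sequence the compiler
             * emitted: NULL + offsetof (gfid) == (void *) 0x8. The
             * SIGSEGV fires later, when uuid_unpack() reads from it. */
            printf ("offsetof (gfid) = %zu\n",
                    offsetof (demo_inode_t, gfid));
            printf ("&inode->gfid    = %p\n", (void *) inode->gfid);
            return 0;
    }

On a typical x86_64 build this prints 8 and 0x8, matching the rdi value in the
register dump above.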

--- Additional comment from Anand Avati on 2014-09-10 05:52:02 EDT ---

REVIEW: http://review.gluster.org/8686 (quota: Avoid quota crash for inode
being NULL) posted (#1) for review on master by susant palai
(spalai at redhat.com)

--- Additional comment from Anand Avati on 2014-09-10 06:11:26 EDT ---

REVIEW: http://review.gluster.org/8686 (quota: Avoid quota crash for inode
being NULL) posted (#2) for review on master by susant palai
(spalai at redhat.com)

--- Additional comment from Anand Avati on 2014-09-10 07:25:13 EDT ---

REVIEW: http://review.gluster.org/8687 (features/quota: fixes to dentry
management code in rename.) posted (#1) for review on master by Raghavendra G
(rgowdapp at redhat.com)

--- Additional comment from Anand Avati on 2014-09-10 07:26:02 EDT ---

REVIEW: http://review.gluster.org/8687 (features/quota: fixes to dentry
management code in rename.) posted (#2) for review on master by Raghavendra G
(rgowdapp at redhat.com)

--- Additional comment from Anand Avati on 2014-09-15 03:22:08 EDT ---

REVIEW: http://review.gluster.org/8687 (features/quota: fixes to dentry
management code in rename.) posted (#3) for review on master by Raghavendra G
(rgowdapp at redhat.com)

--- Additional comment from Anand Avati on 2014-09-15 14:24:12 EDT ---

COMMIT: http://review.gluster.org/8687 committed in master by Vijay Bellur
(vbellur at redhat.com) 
------
commit 3e1935c8141c4f0ff3ee5af30c62a02da772666b
Author: Raghavendra G <rgowdapp at redhat.com>
Date:   Wed Sep 10 16:51:19 2014 +0530

    features/quota: fixes to dentry management code in rename.

    1. After a successful rename (src, dst), the dentry
     <dst-parent, dst-basename> would be associated with src-inode.

    2. It is the src inode that survives if both src and dst are present.

    The fixes are done based on the above two observations.

    Change-Id: I7492a512e3732b1455c243b02fae12d489532bfb
    BUG: 1140084
    Signed-off-by: Raghavendra G <rgowdapp at redhat.com>
    Reviewed-on: http://review.gluster.org/8687
    Reviewed-by: susant palai <spalai at redhat.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana at redhat.com>
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur at redhat.com>
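
As a reading aid for the patch, here is a rough sketch of observation 1 in
terms of the generic libglusterfs dentry primitives inode_unlink() and
inode_link(). This is an assumption-laden illustration of the idea, not the
committed quota.c change:

    /* Hedged sketch of observation 1: after rename (src, dst) succeeds,
     * the name under the new parent must resolve to the *src* inode, so
     * any stale dentry for an overwritten dst is dropped first and the
     * dst name is then linked to the src inode. */
    inode_t *linked_inode = NULL;

    if (local->newloc.inode)
            inode_unlink (local->newloc.inode, local->newloc.parent,
                          local->newloc.name);

    linked_inode = inode_link (local->oldloc.inode, local->newloc.parent,
                               local->newloc.name, buf);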


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1139473
[Bug 1139473] quota: bricks coredump while creating data inside a subdir
and lookup going on in parallel
https://bugzilla.redhat.com/show_bug.cgi?id=1140084
[Bug 1140084] quota: bricks coredump while creating data inside a subdir
and lookup going on in parallel