[Bugs] [Bug 1394131] [md-cache]: All bricks crashed while performing symlink and rename from client at the same time

bugzilla at redhat.com
Mon Nov 14 10:13:52 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1394131



--- Comment #2 from Poornima G <pgurusid at redhat.com> ---

All 6 bricks of the volume (3x2) crashed with the same backtrace in the upcall xlator: 

[root@dhcp37-58 ~]# file core.5895.1476956627.dump.1
core.5895.1476956627.dump.1: ELF 64-bit LSB core file x86-64, version 1 (SYSV),
SVR4-style, from '/usr/sbin/glusterfsd -s 10.70.37.58 --volfile-id
master.10.70.37.58.rhs-brick1-', real uid: 0, effective uid: 0, real gid: 0,
effective gid: 0, execfn: '/usr/sbin/glusterfsd', platform: 'x86_64'
[root@dhcp37-58 ~]# 

(gdb) bt
#0  0x00007f9530adc210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00007f951de3b129 in upcall_inode_ctx_get () from
/usr/lib64/glusterfs/3.8.4/xlator/features/upcall.so
#2  0x00007f951de3055f in upcall_local_init () from
/usr/lib64/glusterfs/3.8.4/xlator/features/upcall.so
#3  0x00007f951de3431a in up_setxattr () from
/usr/lib64/glusterfs/3.8.4/xlator/features/upcall.so
#4  0x00007f9531d072a4 in default_setxattr_resume () from
/lib64/libglusterfs.so.0
#5  0x00007f9531c9947d in call_resume () from /lib64/libglusterfs.so.0
#6  0x00007f951dc20743 in iot_worker () from
/usr/lib64/glusterfs/3.8.4/xlator/performance/io-threads.so
#7  0x00007f9530ad7dc5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f953041c73d in clone () from /lib64/libc.so.6
(gdb)
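
The crash is inside pthread_spin_lock() called from upcall_inode_ctx_get(), which points at the upcall inode ctx (or its embedded spinlock) being NULL or already freed, plausibly a race between the parallel symlink and rename operations. The frames above are unresolved; a minimal sketch of how to get a symbol-resolved backtrace and check the lock pointer from this core, assuming a glusterfs-debuginfo package matching 3.8.4 is installable and an x86_64 ABI (first argument in rdi):

# install debug symbols matching the running glusterfs build
debuginfo-install glusterfs

# open the core against the brick binary
gdb /usr/sbin/glusterfsd core.5895.1476956627.dump.1

(gdb) bt full               # symbol-resolved backtrace with locals
(gdb) frame 0               # the faulting pthread_spin_lock frame
(gdb) info registers rdi    # lock pointer passed to pthread_spin_lock on x86_64
(gdb) x/1wx $rdi            # "Cannot access memory" here confirms a bad pointer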




Steps carried out:
==================

This happened in a geo-rep setup, but since all the master bricks crashed, it
looks like a more generic issue. However, I will list all the steps:

1. Create Master and Slave volumes (3x2), each from a 3-node cluster
2. Enable md-cache on Master and Slave (see the example commands after the steps)
3. Create a geo-rep session between Master and Slave
4. Fuse-mount the Master volume three times on the same client, at different locations
5. Create data on the Master volume from one client and keep running stat from another client's mount:
   crefi -T 10 -n 10 --multi -d 10 -b 10 --random --max=5K --min=1K --fop=create /mnt/master/
   find . | xargs stat
6. Let the data sync to the Slave; confirm via arequal checksum
7. Run chmod on the Master volume from one client and keep running stat from another client's mount:
   crefi -T 10 -n 10 --multi -d 10 -b 10 --random --max=5K --min=1K --fop=chmod /mnt/master/
   find . | xargs stat
8. Let the data sync to the Slave; confirm via arequal checksum
9. Run chown on the Master volume from one client and keep running stat from another client's mount:
   crefi -T 10 -n 10 --multi -d 10 -b 10 --random --max=5K --min=1K --fop=chown /mnt/master/
   find . | xargs stat
10. Let the data sync to the Slave; confirm via arequal checksum
11. Run chgrp on the Master volume from one client and keep running stat from another client's mount:
    crefi -T 10 -n 10 --multi -d 10 -b 10 --random --max=5K --min=1K --fop=chgrp /mnt/master/
    find . | xargs stat
12. Let the data sync to the Slave; confirm via arequal checksum
13. Run symlink on the Master volume from one client and rename from another client's mount:
    crefi -T 10 -n 10 --multi -d 10 -b 10 --random --max=5K --min=1K --fop=symlink /mnt/master/
    crefi -T 10 -n 10 --multi -d 10 -b 10 --random --max=5K --min=1K --fop=rename /mnt/new_1
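
For reference, a sketch of the supporting commands for steps 2, 3, 4 and the
arequal verification. The volume names (master/slave), the brick host and the
/mnt/master and /mnt/new_1 mount points are from this setup; <slave-host>,
<slave-mount> and /mnt/new_2 are placeholders, and the md-cache option set is
the usual cache-invalidation combination, which may differ from the exact
options used here:

# step 2: md-cache with upcall-based invalidation (repeat for the slave volume)
gluster volume set master features.cache-invalidation on
gluster volume set master features.cache-invalidation-timeout 600
gluster volume set master performance.stat-prefetch on
gluster volume set master performance.cache-invalidation on
gluster volume set master performance.md-cache-timeout 600

# step 3: create and start the geo-rep session
gluster volume geo-replication master <slave-host>::slave create push-pem
gluster volume geo-replication master <slave-host>::slave start

# step 4: three fuse mounts of the master volume on the same client
mount -t glusterfs 10.70.37.58:/master /mnt/master
mount -t glusterfs 10.70.37.58:/master /mnt/new_1
mount -t glusterfs 10.70.37.58:/master /mnt/new_2

# steps 6/8/10/12: compare checksums between master and slave mounts
arequal-checksum -p /mnt/master
arequal-checksum -p <slave-mount>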
