[Bugs] [Bug 1522808] New: Gluster client crashes while using both tiering and sharding

bugzilla at redhat.com
Wed Dec 6 13:45:16 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1522808

            Bug ID: 1522808
           Summary: Gluster client crashes while using both tiering and
                    sharding
           Product: GlusterFS
           Version: 3.12
         Component: glusterd
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: olivier.lambert at vates.fr
                CC: bugs at gluster.org



Description of problem:

Gluster client crashes while using both tiering and sharding

Version-Release number of selected component (if applicable):

Gluster client: 3.12.1
Gluster server: same version

How reproducible: trivial

Steps to Reproduce:
1. Create a replicated (or distributed, it doesn't matter) volume with sharding enabled.
2. Everything works fine.
3. Add tiering on top of it and verify that some files get promoted. No
modifications are made to the tiering configuration.
4. Everything still works fine **EXCEPT** file removal operations (UNLINK).
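The steps above can be sketched as a command sequence (hostnames, the volume name, and brick paths are illustrative, not the reporter's exact setup; this assumes the `gluster volume tier ... attach` syntax available in the 3.x series):

```shell
# 1. Create a replicated volume and enable sharding (hosts/paths are examples)
gluster volume create xosan replica 2 \
    host1:/bricks/hdd/xosandir host2:/bricks/hdd/xosandir
gluster volume set xosan features.shard on
gluster volume set xosan features.shard-block-size 512MB
gluster volume start xosan

# 2. Mount it and confirm normal operation; a file larger than the
#    shard-block-size will span multiple shards
mount -t glusterfs host1:/xosan /mnt/xosan
dd if=/dev/zero of=/mnt/xosan/test.img bs=1M count=1024

# 3. Attach a hot tier, leaving the tiering configuration at its defaults
gluster volume tier xosan attach replica 2 \
    host1:/bricks/ssd host2:/bricks/ssd

# 4. Once test.img has been promoted to the hot tier, removing it
#    crashes the client (SIGSEGV in the shard xlator, per the backtrace below)
rm /mnt/xosan/test.img
```

This requires a live Gluster cluster, so it cannot run standalone; it only documents the reproduction path.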

Actual results:

Gluster client crashes with:

```
[2017-12-05 15:43:04.517921] E [MSGID: 133010]
[shard.c:1724:shard_common_lookup_shards_cbk] 0-xosan-shard: Lookup on shard 1
failed. Base file gfid = cfb9c865-4f8e-4b01-a0bb-520a9baa4635 [Invalid
argument]
pending frames:
frame : type(1) op(UNLINK)
frame : type(1) op(UNLINK)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2017-12-05 15:43:04
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.1
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f66a1863460]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f66a186d394]
/lib64/libc.so.6(+0x35670)[0x7f669ff48670]
/lib64/libpthread.so.0(pthread_mutex_lock+0x0)[0x7f66a06c4bd0]
/lib64/libglusterfs.so.0(__gf_free+0x136)[0x7f66a188b6b6]
/lib64/libglusterfs.so.0(data_destroy+0x5d)[0x7f66a185a5fd]
/lib64/libglusterfs.so.0(dict_destroy+0x60)[0x7f66a185b040]
/usr/lib64/glusterfs/3.12.1/xlator/features/shard.so(+0x13d43)[0x7f66998e1d43]
/usr/lib64/glusterfs/3.12.1/xlator/features/shard.so(+0x16f76)[0x7f66998e4f76]
/usr/lib64/glusterfs/3.12.1/xlator/features/shard.so(+0xe35f)[0x7f66998dc35f]
/usr/lib64/glusterfs/3.12.1/xlator/features/shard.so(+0xfd31)[0x7f66998ddd31]
/usr/lib64/glusterfs/3.12.1/xlator/features/shard.so(+0x105a5)[0x7f66998de5a5]
/usr/lib64/glusterfs/3.12.1/xlator/cluster/tier.so(+0x80ca3)[0x7f6699b71ca3]
/usr/lib64/glusterfs/3.12.1/xlator/cluster/distribute.so(+0x3703c)[0x7f6699dce03c]
/usr/lib64/glusterfs/3.12.1/xlator/cluster/replicate.so(+0xa1dc)[0x7f669a0371dc]
/usr/lib64/glusterfs/3.12.1/xlator/cluster/replicate.so(+0xbadb)[0x7f669a038adb]
/usr/lib64/glusterfs/3.12.1/xlator/cluster/replicate.so(+0xc566)[0x7f669a039566]
/usr/lib64/glusterfs/3.12.1/xlator/protocol/client.so(+0x17d4c)[0x7f669a2c5d4c]
/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90)[0x7f66a162be60]
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1e7)[0x7f66a162c147]
/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f66a1627f73]
/usr/lib64/glusterfs/3.12.1/rpc-transport/socket.so(+0x7536)[0x7f669c7ce536]
/usr/lib64/glusterfs/3.12.1/rpc-transport/socket.so(+0x9adc)[0x7f669c7d0adc]
/lib64/libglusterfs.so.0(+0x883b4)[0x7f66a18c03b4]
/lib64/libpthread.so.0(+0x7dc5)[0x7f66a06c2dc5]
/lib64/libc.so.6(clone+0x6d)[0x7f66a000921d]
```


Expected results: the file should be removed correctly


Additional info:

```
# gluster volume info

Volume Name: myvolume
Type: Tier
Volume ID: c9014830-2e51-401a-a967-2a0dd225d16a
Status: Started
Snapshot Count: 0
Number of Bricks: 9
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 172.31.100.104:/bricks/ssd
Brick2: 172.31.100.103:/bricks/ssd
Brick3: 172.31.100.102:/bricks/ssd
Brick4: 172.31.100.101:/bricks/ssd
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 1) = 5
Brick5: 172.31.100.101:/bricks/xosan1/xosandir
Brick6: 172.31.100.102:/bricks/xosan1/xosandir
Brick7: 172.31.100.103:/bricks/xosan1/xosandir
Brick8: 172.31.100.104:/bricks/xosan1/xosandir
Brick9: 172.31.100.105:/bricks/xosan1/xosandir
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
nfs.disable: on
transport.address-family: inet
network.remote-dio: enable
cluster.eager-lock: enable
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.strict-write-ordering: off
client.event-threads: 8
server.event-threads: 8
performance.io-thread-count: 64
performance.stat-prefetch: on
features.shard: on
features.shard-block-size: 512MB
```
