[Gluster-devel] Assertion failed: lru_inode_ctx->block_num > 0

qingwei wei tchengwee at gmail.com
Thu Dec 8 05:37:33 UTC 2016


Hi,

With the help of my colleague, we made some changes to the code: we
reduced SHARD_MAX_INODES from 16384 to 16 and added a print of blk_num
inside __shard_update_shards_inode_list. We then ran fio to do a
sequential write of a 300MB file. After that run completed, we used fio
to generate 8k random writes. During this random-write run, we found
that there are situations where blk_num is a negative number, which
triggers the following assertion.

GF_ASSERT (lru_inode_ctx->block_num > 0);
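
In essence the instrumentation is the hunk below (the full context from
shard.c is quoted further down in this thread); the block number of a
linked shard is expected to be >= 1, which is what the assertion checks,
so a zero or negative value here means the ctx sitting at the head of
the LRU list was never filled in:

        /* inside __shard_update_shards_inode_list (), in the branch taken
         * once priv->inode_count has reached SHARD_MAX_INODES */
        lru_inode_ctx = list_first_entry (&priv->ilist_head,
                                          shard_inode_ctx_t, ilist);
        /* debug aid: log the block number of the ctx about to be evicted */
        gf_msg (THIS->name, GF_LOG_WARNING, 0, SHARD_MSG_INVALID_FOP,
                "block number = %d", lru_inode_ctx->block_num);
        GF_ASSERT (lru_inode_ctx->block_num > 0);

With the limit reduced to 16 (assuming the same 16MB shard size as the
earlier testSF volume, a 300MB file already spans about 19 shards), the
eviction branch is taken almost immediately during the random-write run,
and the assertion fires: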

[2016-12-08 03:16:34.217582] E
[shard.c:468:__shard_update_shards_inode_list]
(-->/usr/local/lib/glusterfs/3.7.17/xlator/features/shard.so(shard_common_lookup_shards_cbk+0x2d)
[0x7f7300930b6d]
-->/usr/local/lib/glusterfs/3.7.17/xlator/features/shard.so(shard_link_block_inode+0xce)
[0x7f7300930b1e]
-->/usr/local/lib/glusterfs/3.7.17/xlator/features/shard.so(__shard_update_shards_inode_list+0x36b)
[0x7f730092bf5b] ) 0-: Assertion failed: lru_inode_ctx->block_num > 0

Shortly after this assertion there is also a segmentation fault, after
which fio exits with an error.

frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2016-12-08 03:16:34
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.17
/usr/local/lib/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x92)[0x7f730e900332]
/usr/local/lib/libglusterfs.so.0(gf_print_trace+0x2d5)[0x7f730e9250b5]
/lib64/libc.so.6(+0x35670)[0x7f730d1f1670]
/usr/local/lib/glusterfs/3.7.17/xlator/features/shard.so(__shard_update_shards_inode_list+0x1d4)[0x7f730092bdc4]
/usr/local/lib/glusterfs/3.7.17/xlator/features/shard.so(shard_link_block_inode+0xce)[0x7f7300930b1e]
/usr/local/lib/glusterfs/3.7.17/xlator/features/shard.so(shard_common_lookup_shards_cbk+0x2d)[0x7f7300930b6d]
/usr/local/lib/glusterfs/3.7.17/xlator/cluster/distribute.so(dht_lookup_cbk+0x380)[0x7f7300b8e240]
/usr/local/lib/glusterfs/3.7.17/xlator/protocol/client.so(client3_3_lookup_cbk+0x769)[0x7f7300df4989]
/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0x90)[0x7f730e6ce010]
/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x1df)[0x7f730e6ce2ef]
/usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f730e6ca483]
/usr/local/lib/glusterfs/3.7.17/rpc-transport/socket.so(+0x6344)[0x7f73034dc344]
/usr/local/lib/glusterfs/3.7.17/rpc-transport/socket.so(+0x8f44)[0x7f73034def44]
/usr/local/lib/libglusterfs.so.0(+0x925aa)[0x7f730e96c5aa]
/lib64/libpthread.so.0(+0x7dc5)[0x7f730d96ddc5]

Core dump:

Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/local/sbin/glusterfs
--volfile-server=10.217.242.32 --volfile-id=/testSF1'.
Program terminated with signal 11, Segmentation fault.
#0  list_del_init (old=0x7f72f4003de0) at ../../../../libglusterfs/src/list.h:87
87        old->prev->next = old->next;

bt

#0  list_del_init (old=0x7f72f4003de0) at ../../../../libglusterfs/src/list.h:87
#1  __shard_update_shards_inode_list (linked_inode=linked_inode@entry=0x7f72fa7a6e48,
    this=this@entry=0x7f72fc0090c0, base_inode=0x7f72fa7a5108,
    block_num=block_num@entry=10) at shard.c:469
#2  0x00007f7300930b1e in shard_link_block_inode (local=local@entry=0x7f730ec4ed00,
    block_num=10, inode=<optimized out>,
    buf=buf@entry=0x7f730180c990) at shard.c:1559
#3  0x00007f7300930b6d in shard_common_lookup_shards_cbk
(frame=0x7f730c611204, cookie=<optimized out>, this=0x7f72fc0090c0,
op_ret=0,
    op_errno=<optimized out>, inode=<optimized out>,
buf=0x7f730180c990, xdata=0x7f730c029cdc, postparent=0x7f730180ca00)
at shard.c:1596
#4  0x00007f7300b8e240 in dht_lookup_cbk (frame=0x7f730c61dc40,
cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=22,
    inode=0x7f72fa7a6e48, stbuf=0x7f730180c990, xattr=0x7f730c029cdc,
postparent=0x7f730180ca00) at dht-common.c:2362
#5  0x00007f7300df4989 in client3_3_lookup_cbk (req=<optimized out>,
iov=<optimized out>, count=<optimized out>, myframe=0x7f730c616ab4)
    at client-rpc-fops.c:2988
#6  0x00007f730e6ce010 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f72fc04c040,
    pollin=pollin@entry=0x7f72fc079560) at rpc-clnt.c:796
#7  0x00007f730e6ce2ef in rpc_clnt_notify (trans=<optimized out>,
mydata=0x7f72fc04c070, event=<optimized out>, data=0x7f72fc079560)
    at rpc-clnt.c:967
#8  0x00007f730e6ca483 in rpc_transport_notify (this=this@entry=0x7f72fc05bd30,
    event=event@entry=RPC_TRANSPORT_MSG_RECEIVED,
    data=data@entry=0x7f72fc079560) at rpc-transport.c:546
#9  0x00007f73034dc344 in socket_event_poll_in (this=this@entry=0x7f72fc05bd30)
    at socket.c:2250
#10 0x00007f73034def44 in socket_event_handler (fd=fd@entry=10, idx=idx@entry=2,
    data=0x7f72fc05bd30, poll_in=1, poll_out=0, poll_err=0) at socket.c:2363
#11 0x00007f730e96c5aa in event_dispatch_epoll_handler
(event=0x7f730180ced0, event_pool=0xf42ee0) at event-epoll.c:575
#12 event_dispatch_epoll_worker (data=0xf8d650) at event-epoll.c:678
#13 0x00007f730d96ddc5 in start_thread () from /lib64/libpthread.so.0
#14 0x00007f730d2b2ced in clone () from /lib64/libc.so.6
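
For reference, list_del_init () at list.h:87 is roughly the following
(paraphrased from libglusterfs/src/list.h, shown only to explain the
fault, so the exact lines may differ slightly):

        struct list_head {
                struct list_head *next;
                struct list_head *prev;
        };

        static inline void
        list_del_init (struct list_head *old)
        {
                old->prev->next = old->next;   /* <-- line 87, where it crashes */
                old->next->prev = old->prev;

                old->next = old;
                old->prev = old;
        }

So the SIGSEGV in frame #0 means old->prev is NULL or garbage; since old
here is presumably the ilist member of the ctx being evicted, it looks
like that list head was never initialized or never actually linked into
priv->ilist_head, which fits the suspicion below.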

It seems like there is some situation where the structure is not
initialized properly? I would appreciate it if anyone can advise. Thanks.

Cw




On Wed, Dec 7, 2016 at 9:42 AM, qingwei wei <tchengwee at gmail.com> wrote:
> Hi,
>
> I did another test and this time FIO fails with
>
> fio: io_u error on file /mnt/testSF-HDD1/test: Invalid argument: write
> offset=114423242752, buflen=8192
> fio: pid=10052, err=22/file:io_u.c:1582, func=io_u error, error=Invalid argument
>
> test: (groupid=0, jobs=1): err=22 (file:io_u.c:1582, func=io_u error,
> error=Invalid argument): pid=10052: Tue Dec  6 15:18:47 2016
>
>
> Below is the client log:
>
> [2016-12-06 05:19:31.261289] I [fuse-bridge.c:5171:fuse_graph_setup]
> 0-fuse: switched to graph 0
> [2016-12-06 05:19:31.261355] I [MSGID: 114035]
> [client-handshake.c:193:client_set_lk_version_cbk]
> 0-testSF-HDD-client-5: Server lk version = 1
> [2016-12-06 05:19:31.261404] I [fuse-bridge.c:4083:fuse_init]
> 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22
> kernel 7.22
> [2016-12-06 05:19:31.262901] I [MSGID: 108031]
> [afr-common.c:2071:afr_local_discovery_cbk] 0-testSF-HDD-replicate-0:
> selecting local read_child testSF-HDD-client-1
> [2016-12-06 05:19:31.262930] I [MSGID: 108031]
> [afr-common.c:2071:afr_local_discovery_cbk] 0-testSF-HDD-replicate-0:
> selecting local read_child testSF-HDD-client-0
> [2016-12-06 05:19:31.262948] I [MSGID: 108031]
> [afr-common.c:2071:afr_local_discovery_cbk] 0-testSF-HDD-replicate-0:
> selecting local read_child testSF-HDD-client-2
> [2016-12-06 05:19:31.269592] I [MSGID: 108031]
> [afr-common.c:2071:afr_local_discovery_cbk] 0-testSF-HDD-replicate-1:
> selecting local read_child testSF-HDD-client-3
> [2016-12-06 05:19:31.269795] I [MSGID: 108031]
> [afr-common.c:2071:afr_local_discovery_cbk] 0-testSF-HDD-replicate-1:
> selecting local read_child testSF-HDD-client-4
> [2016-12-06 05:19:31.277763] I [MSGID: 108031]
> [afr-common.c:2071:afr_local_discovery_cbk] 0-testSF-HDD-replicate-1:
> selecting local read_child testSF-HDD-client-5
> [2016-12-06 06:58:05.399244] W [MSGID: 101159]
> [inode.c:1219:__inode_unlink] 0-inode:
> be318638-e8a0-4c6d-977d-7a937aa84806/864c9ea1-3a7e-4d41-899b-f30604a7584e.16284:
> dentry not found in 63af10b7-9dac-4a53-aab1-3cc17fff3255
> [2016-12-06 15:17:43.311400] E
> [shard.c:460:__shard_update_shards_inode_list]
> (-->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(shard_common_lookup_shards_cbk+0x2d)
> [0x7f5575680fdd]
> -->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(shard_link_block_inode+0xdf)
> [0x7f5575680f6f]
> -->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(__shard_update_shards_inode_list+0x22e)
> [0x7f557567c1ce] ) 0-: Assertion failed: lru_inode_ctx->block_num > 0
> [2016-12-06 15:17:43.311472] W [inode.c:1232:inode_unlink]
> (-->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(shard_link_block_inode+0xdf)
> [0x7f5575680f6f]
> -->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(__shard_update_shards_inode_list+0x14a)
> [0x7f557567c0ea] -->/lib64/libglusterfs.so.0(inode_unlink+0x9c)
> [0x7f558386ba0c] ) 0-testSF-HDD-shard: inode not found
> [2016-12-06 15:17:43.333456] W [inode.c:1133:inode_forget]
> (-->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(shard_link_block_inode+0xdf)
> [0x7f5575680f6f]
> -->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(__shard_update_shards_inode_list+0x154)
> [0x7f557567c0f4] -->/lib64/libglusterfs.so.0(inode_forget+0x90)
> [0x7f558386b800] ) 0-testSF-HDD-shard: inode not found
> [2016-12-06 15:18:47.129794] W [fuse-bridge.c:2311:fuse_writev_cbk]
> 0-glusterfs-fuse: 12555429: WRITE => -1
> gfid=864c9ea1-3a7e-4d41-899b-f30604a7584e fd=0x7f557016ae6c (Invalid
> argument)
>
> Below is the code; it takes the else block once inode_count reaches
> SHARD_MAX_INODES, which is 16384. My dataset of 400GB with a 16MB shard
> size has enough shard files (400GB/16MB) to reach that limit. When I do
> the test with a smaller dataset, there is no such error.
>
> shard.c
>
>                 if (priv->inode_count + 1 <= SHARD_MAX_INODES) {
>                 /* If this inode was linked here for the first time (indicated
>                  * by empty list), and if there is still space in the priv list,
>                  * add this ctx to the tail of the list.
>                  */
>                         gf_uuid_copy (ctx->base_gfid, base_inode->gfid);
>                         ctx->block_num = block_num;
>                         list_add_tail (&ctx->ilist, &priv->ilist_head);
>                         priv->inode_count++;
>                 } else {
>                 /*If on the other hand there is no available slot for this inode
>                  * in the list, delete the lru inode from the head of the list,
>                  * unlink it. And in its place add this new inode into the list.
>                  */
>                         lru_inode_ctx = list_first_entry (&priv->ilist_head,
>                                                           shard_inode_ctx_t,
>                                                           ilist);
>                         /* add in message for debug */
>                         gf_msg (THIS->name, GF_LOG_WARNING, 0, SHARD_MSG_INVALID_FOP,
>                                 "block number = %d", lru_inode_ctx->block_num);
>
>                         GF_ASSERT (lru_inode_ctx->block_num > 0);
>
> Hoping to get some advice from you guys on this. Thanks.
>
> Cw
>
> On Tue, Dec 6, 2016 at 9:07 AM, qingwei wei <tchengwee at gmail.com> wrote:
>> Hi,
>>
>> This is a repost of my email to the gluster-users mailing list.
>> I would appreciate it if anyone has any ideas on the issue I am hitting. Thanks.
>>
>> I encountered this while running FIO random writes on a FUSE-mounted
>> gluster volume. After this assertion happens, the client log fills up
>> with "pending frames" messages and FIO just shows zero IO in its
>> progress status. Having left the test running overnight, the client
>> log file filled up with those pending-frames messages and reached 28GB
>> in around 12 hours.
>>
>> The client log:
>>
>> [2016-12-04 15:48:35.274208] W [MSGID: 109072]
>> [dht-linkfile.c:50:dht_linkfile_lookup_cbk] 0-testSF-dht: got
>> non-linkfile testSF-replicate-0:/.shard/21da7b64-45e5-4c6a-9244-53d0284bf7ed.7038,
>> gfid = 00000000-0000-0000-0000-000000000000
>> [2016-12-04 15:48:35.277208] W [MSGID: 109072]
>> [dht-linkfile.c:50:dht_linkfile_lookup_cbk] 0-testSF-dht: got
>> non-linkfile testSF-replicate-0:/.shard/21da7b64-45e5-4c6a-9244-53d0284bf7ed.8957,
>> gfid = 00000000-0000-0000-0000-000000000000
>> [2016-12-04 15:48:35.277588] W [MSGID: 109072]
>> [dht-linkfile.c:50:dht_linkfile_lookup_cbk] 0-testSF-dht: got
>> non-linkfile testSF-replicate-0:/.shard/21da7b64-45e5-4c6a-9244-53d0284bf7ed.11912,
>> gfid = 00000000-0000-0000-0000-000000000000
>> [2016-12-04 15:48:35.312751] E
>> [shard.c:460:__shard_update_shards_inode_list]
>> (-->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(shard_common_lookup_shards_cbk+0x2d)
>> [0x7f86cc42efdd]
>> -->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(shard_link_block_inode+0xdf)
>> [0x7f86cc42ef6f]
>> -->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(__shard_update_shards_inode_list+0x22e)
>> [0x7f86cc42a1ce] ) 0-: Assertion failed: lru_inode_ctx->block_num > 0
>> pending frames:
>> frame : type(0) op(0)
>> frame : type(0) op(0)
>> frame : type(0) op(0)
>> frame : type(0) op(0)
>> frame : type(0) op(0)
>> frame : type(0) op(0)
>>
>> Gluster info (I am testing this on one server, with each disk serving
>> as one brick; the gluster volume is then mounted locally via FUSE):
>>
>> Volume Name: testSF
>> Type: Distributed-Replicate
>> Volume ID: 3f205363-5029-40d7-b1b5-216f9639b454
>> Status: Started
>> Number of Bricks: 2 x 3 = 6
>> Transport-type: tcp
>> Bricks:
>> Brick1: 192.168.123.4:/mnt/sdb_mssd/testSF
>> Brick2: 192.168.123.4:/mnt/sdc_mssd/testSF
>> Brick3: 192.168.123.4:/mnt/sdd_mssd/testSF
>> Brick4: 192.168.123.4:/mnt/sde_mssd/testSF
>> Brick5: 192.168.123.4:/mnt/sdf_mssd/testSF
>> Brick6: 192.168.123.4:/mnt/sdg_mssd/testSF
>> Options Reconfigured:
>> features.shard-block-size: 16MB
>> features.shard: on
>> performance.readdir-ahead: on
>>
>> Gluster version: 3.7.17
>>
>> The actual disk usage (about 91% full):
>>
>> /dev/sdb1                235G  202G   22G  91% /mnt/sdb_mssd
>> /dev/sdc1                235G  202G   22G  91% /mnt/sdc_mssd
>> /dev/sdd1                235G  202G   22G  91% /mnt/sdd_mssd
>> /dev/sde1                235G  200G   23G  90% /mnt/sde_mssd
>> /dev/sdf1                235G  200G   23G  90% /mnt/sdf_mssd
>> /dev/sdg1                235G  200G   23G  90% /mnt/sdg_mssd
>>
>> Has anyone encountered this issue before?
>>
>> Cw

