[Bugs] [Bug 1801623] New: spurious self-heald.t failure
bugzilla at redhat.com
Tue Feb 11 11:37:47 UTC 2020
https://bugzilla.redhat.com/show_bug.cgi?id=1801623
Bug ID: 1801623
Summary: spurious self-heald.t failure
Product: GlusterFS
Version: mainline
Status: NEW
Component: replicate
Assignee: bugs at gluster.org
Reporter: pkarampu at redhat.com
CC: bugs at gluster.org
Target Milestone: ---
Classification: Community
20:10:58 ./tests/basic/afr/self-heald.t ..
20:10:58 1..84
20:10:58 ok 1 [ 160/ 1061] < 46> 'glusterd'
20:10:58 ok 2 [ 9/ 17] < 47> 'pidof glusterd'
20:10:58 ok 3 [ 9/ 97] < 48> 'gluster --mode=script --wignore
volume create patchy replica 2
builder203.int.aws.gluster.org:/d/backends/patchy0
builder203.int.aws.gluster.org:/d/backends/patchy1
builder203.int.aws.gluster.org:/d/backends/patchy2
builder203.int.aws.gluster.org:/d/backends/patchy3
builder203.int.aws.gluster.org:/d/backends/patchy4
builder203.int.aws.gluster.org:/d/backends/patchy5'
20:10:58 ok 4 [ 12/ 147] < 49> 'gluster --mode=script --wignore
volume set patchy cluster.background-self-heal-count 0'
20:10:58 ok 5 [ 12/ 154] < 50> 'gluster --mode=script --wignore
volume set patchy cluster.eager-lock off'
20:10:58 ok 6 [ 15/ 159] < 51> 'gluster --mode=script --wignore
volume set patchy performance.flush-behind off'
20:10:58 ok 7 [ 14/ 1407] < 52> 'gluster --mode=script --wignore
volume start patchy'
20:10:58 ok 8 [ 10/ 217] < 53> '_GFS --attribute-timeout=0
--entry-timeout=0 --volfile-id=/patchy
--volfile-server=builder203.int.aws.gluster.org /mnt/glusterfs/0'
20:10:58 [2020-02-10 14:40:07.086270] E [rpc-clnt.c:346:saved_frames_unwind]
(-->
/build/install/lib/libglusterfs.so.0(_gf_log_callingfn+0x1bc)[0x7fb36875b6ee]
(--> /build/install/lib/libgfrpc.so.0(+0x1176f)[0x7fb3680c076f] (-->
/build/install/lib/libgfrpc.so.0(+0x11856)[0x7fb3680c0856] (-->
/build/install/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x16c)[0x7fb3680c0de4]
(--> /build/install/lib/libgfrpc.so.0(+0x12731)[0x7fb3680c1731] )))))
0-gf-attach-rpc: forced unwinding frame type(brick operations) op(--(1)) called
at 2020-02-10 14:40:07.077738 (xid=0x2)
20:10:58 got error -1 on RPC
20:10:58 ok 9 [ 14/ 2096] < 21> 'kill_brick patchy
builder203.int.aws.gluster.org /d/backends/patchy1'
20:10:58 ok 10 [ 15/ 2101] < 21> 'kill_brick patchy
builder203.int.aws.gluster.org /d/backends/patchy3'
20:10:58 [2020-02-10 14:40:11.315789] E [rpc-clnt.c:346:saved_frames_unwind]
(-->
/build/install/lib/libglusterfs.so.0(_gf_log_callingfn+0x1bc)[0x7f75e7b8a6ee]
(--> /build/install/lib/libgfrpc.so.0(+0x1176f)[0x7f75e74ef76f] (-->
/build/install/lib/libgfrpc.so.0(+0x11856)[0x7f75e74ef856] (-->
/build/install/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x16c)[0x7f75e74efde4]
(--> /build/install/lib/libgfrpc.so.0(+0x12731)[0x7f75e74f0731] )))))
0-gf-attach-rpc: forced unwinding frame type(brick operations) op(--(1)) called
at 2020-02-10 14:40:11.302544 (xid=0x2)
20:10:58 got error -1 on RPC
20:10:58 ok 11 [ 11/ 2103] < 21> 'kill_brick patchy
builder203.int.aws.gluster.org /d/backends/patchy5'
20:10:58 ok 12 [ 2386/ 194] < 72> '43 get_pending_heal_count patchy'
20:10:58 ok 13 [ 16/ 183] < 75> '3 disconnected_brick_count patchy'
20:10:58 ok 14 [ 112/ 175] < 82> '53 get_pending_heal_count patchy'
20:10:58 ok 15 [ 81/ 208] < 87> '43 get_pending_heal_count patchy'
20:10:58 ok 16 [ 16/ 77] < 90> '! gluster --mode=script --wignore
volume heal patchy'
20:10:58 ok 17 [ 11/ 155] < 91> 'gluster --mode=script --wignore
volume set patchy cluster.self-heal-daemon off'
20:10:58 ok 18 [ 11/ 71] < 92> '! gluster --mode=script --wignore
volume heal patchy'
20:10:58 ok 19 [ 24/ 72] < 93> '! gluster --mode=script --wignore
volume heal patchy full'
20:10:58 ok 20 [ 11/ 163] < 94> 'gluster --mode=script --wignore
volume start patchy force'
20:10:58 ok 21 [ 14/ 158] < 95> 'gluster --mode=script --wignore
volume set patchy cluster.self-heal-daemon on'
20:10:58 ok 22 [ 10/ 79] < 96> 'Y glustershd_up_status'
20:10:58 ok 23 [ 10/ 3342] < 34> '1 afr_child_up_status_in_shd patchy
1'
20:10:58 ok 24 [ 11/ 329] < 34> '1 afr_child_up_status_in_shd patchy
3'
20:10:58 ok 25 [ 10/ 93] < 34> '1 afr_child_up_status_in_shd patchy
5'
20:10:58 ok 26 [ 10/ 268] < 100> 'gluster --mode=script --wignore
volume heal patchy'
20:10:58 ok 27 [ 5164/ 1] < 103> '[ 43 -gt 0 ]'
20:10:58 ok 28 [ 9/ 57] < 105> 'gluster --mode=script --wignore
volume heal patchy full'
20:10:58 ok 29 [ 68/ 154] < 106> '^0$ get_pending_heal_count patchy'
20:10:58 not ok 30 [ 25/ 511] < 119> '0 get_pending_heal_count patchy'
-> 'Got "1" instead of "0"'
20:10:58 ok 31 [ 20/ 5] < 129> 'touch /mnt/glusterfs/0/f'
20:10:58 ok 32 [ 8/ 14] < 130> 'mkdir /mnt/glusterfs/0/d'
20:10:58 ok 33 [ 9/ 113] < 132> 'gluster --mode=script --wignore
volume set patchy cluster.data-self-heal off'
20:10:58 ok 34 [ 15/ 54] < 133> 'off volume_option patchy
cluster.data-self-heal'
20:10:58 ok 35 [ 9/ 1069] < 21> 'kill_brick patchy
builder203.int.aws.gluster.org /d/backends/patchy1'
20:10:58 ok 36 [ 9/ 1072] < 21> 'kill_brick patchy
builder203.int.aws.gluster.org /d/backends/patchy3'
20:10:58 ok 37 [ 9/ 1083] < 21> 'kill_brick patchy
builder203.int.aws.gluster.org /d/backends/patchy5'
20:10:58 ok 38 [ 18/ 116] < 136> '1 get_pending_heal_count patchy'
20:10:58 ok 39 [ 9/ 146] < 137> 'gluster --mode=script --wignore
volume start patchy force'
20:10:58 ok 40 [ 16/ 62] < 138> 'Y glustershd_up_status'
20:10:58 ok 41 [ 9/ 2572] < 34> '1 afr_child_up_status_in_shd patchy
1'
20:10:58 ok 42 [ 9/ 62] < 34> '1 afr_child_up_status_in_shd patchy
3'
20:10:58 ok 43 [ 8/ 62] < 34> '1 afr_child_up_status_in_shd patchy
5'
20:10:58 ok 44 [ 9/ 63] < 141> 'gluster --mode=script --wignore
volume heal patchy'
20:10:58 ok 45 [ 9/ 119] < 142> '^0$ get_pending_heal_count patchy'
20:10:58 ok 46 [ 9/ 109] < 143> 'gluster --mode=script --wignore
volume set patchy cluster.data-self-heal on'
20:10:58 ok 47 [ 18/ 114] < 146> 'gluster --mode=script --wignore
volume set patchy cluster.metadata-self-heal off'
20:10:58 ok 48 [ 10/ 53] < 147> 'off volume_option patchy
cluster.metadata-self-heal'
20:10:58 ok 49 [ 9/ 1068] < 21> 'kill_brick patchy
builder203.int.aws.gluster.org /d/backends/patchy1'
20:10:58 ok 50 [ 9/ 1072] < 21> 'kill_brick patchy
builder203.int.aws.gluster.org /d/backends/patchy3'
20:10:58 ok 51 [ 9/ 1076] < 21> 'kill_brick patchy
builder203.int.aws.gluster.org /d/backends/patchy5'
20:10:58 ok 52 [ 10/ 8] < 150> 'chmod 777 /mnt/glusterfs/0/f'
20:10:58 ok 53 [ 9/ 108] < 151> '1 get_pending_heal_count patchy'
20:10:58 ok 54 [ 9/ 153] < 152> 'gluster --mode=script --wignore
volume start patchy force'
20:10:58 ok 55 [ 13/ 63] < 153> 'Y glustershd_up_status'
20:10:58 ok 56 [ 9/ 2571] < 34> '1 afr_child_up_status_in_shd patchy
1'
20:10:58 ok 57 [ 9/ 64] < 34> '1 afr_child_up_status_in_shd patchy
3'
20:10:58 ok 58 [ 9/ 62] < 34> '1 afr_child_up_status_in_shd patchy
5'
20:10:58 ok 59 [ 8/ 62] < 156> 'gluster --mode=script --wignore
volume heal patchy'
20:10:58 ok 60 [ 10/ 121] < 157> '^0$ get_pending_heal_count patchy'
20:10:58 ok 61 [ 9/ 116] < 158> 'gluster --mode=script --wignore
volume set patchy cluster.metadata-self-heal on'
20:10:58 ok 62 [ 12/ 111] < 161> 'gluster --mode=script --wignore
volume set patchy cluster.entry-self-heal off'
20:10:58 ok 63 [ 11/ 55] < 162> 'off volume_option patchy
cluster.entry-self-heal'
20:10:58 ok 64 [ 9/ 1067] < 21> 'kill_brick patchy
builder203.int.aws.gluster.org /d/backends/patchy1'
20:10:58 ok 65 [ 9/ 1072] < 21> 'kill_brick patchy
builder203.int.aws.gluster.org /d/backends/patchy3'
20:10:58 ok 66 [ 9/ 1073] < 21> 'kill_brick patchy
builder203.int.aws.gluster.org /d/backends/patchy5'
20:10:58 ok 67 [ 10/ 10] < 164> 'touch /mnt/glusterfs/0/d/a'
20:10:58 ok 68 [ 129/ 1] < 168> 'test 2 -eq 2 -o 2 -eq 4'
20:10:58 ok 69 [ 9/ 124] < 169> 'gluster --mode=script --wignore
volume start patchy force'
20:10:58 ok 70 [ 11/ 64] < 170> 'Y glustershd_up_status'
20:10:58 ok 71 [ 9/ 2580] < 34> '1 afr_child_up_status_in_shd patchy
1'
20:10:58 ok 72 [ 9/ 64] < 34> '1 afr_child_up_status_in_shd patchy
3'
20:10:58 ok 73 [ 9/ 73] < 34> '1 afr_child_up_status_in_shd patchy
5'
20:10:58 ok 74 [ 9/ 60] < 172> 'gluster --mode=script --wignore
volume heal patchy'
20:10:58 ok 75 [ 11/ 120] < 173> '^0$ get_pending_heal_count patchy'
20:10:58 ok 76 [ 9/ 116] < 174> 'gluster --mode=script --wignore
volume set patchy cluster.entry-self-heal on'
20:10:58 ok 77 [ 11/ 57] < 178> '! gluster --mode=script --wignore
volume heal fail info'
20:10:58 ok 78 [ 9/ 7088] < 181> 'gluster --mode=script --wignore
volume stop patchy'
20:10:58 ok 79 [ 9/ 83] < 182> '! gluster --mode=script --wignore
volume heal patchy info'
20:10:58 ok 80 [ 9/ 3089] < 185> 'gluster --mode=script --wignore
volume delete patchy'
20:10:58 ok 81 [ 12/ 112] < 186> 'gluster --mode=script --wignore
volume create patchy builder203.int.aws.gluster.org:/d/backends/patchy{6}'
20:10:58 ok 82 [ 11/ 110] < 187> 'gluster --mode=script --wignore
volume start patchy'
20:10:58 ok 83 [ 11/ 95] < 188> '! gluster --mode=script --wignore
volume heal patchy info'
20:10:58 ok 84 [ 11/ 202] < 191> '! log_newer 1581345602 offset reused
from another DIR'
20:10:58 Failed 1/84 subtests
Problem:
The heal-info code assumes that every entry in the xattrop index directory
definitely needs heal. There is one corner case: the very first xattrop on a
file adds its gfid to the 'xattrop' index in the fop path, and the entry is
removed again in the _cbk path because, in the success case, the fop is a
zero-xattr xattrop. In the window between the two, heal-info can read these
gfids and report them as needing heal.
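Here is a minimal, self-contained sketch of that window (hypothetical types
and names, not the actual afr/index xlator code); the comment in main() marks
where the heal-info crawl can observe the transient entry:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    const char *gfid;
    bool in_xattrop_index;  /* linked under indices/xattrop on the brick */
    int pending[3];         /* per-brick pending counts; all-zero = clean */
} entry_t;

/* fop path: the first xattrop on the file links its gfid into the index
 * before the operation itself completes */
static void xattrop_fop(entry_t *e) {
    e->in_xattrop_index = true;
}

/* _cbk path: a zero-xattr xattrop that succeeds leaves nothing pending,
 * so the entry is unlinked again */
static void xattrop_cbk(entry_t *e, int op_ret) {
    bool clean = true;
    for (size_t i = 0; i < 3; i++)
        if (e->pending[i] != 0)
            clean = false;
    if (op_ret == 0 && clean)
        e->in_xattrop_index = false;
}

int main(void) {
    entry_t e = { "aaaa-0001", false, { 0, 0, 0 } };
    xattrop_fop(&e);
    /* a heal-info crawl running here sees e in the index and, with the
     * old presence-based check, wrongly reports it as needing heal */
    printf("mid-window, in index: %d\n", e.in_xattrop_index);
    xattrop_cbk(&e, 0);
    printf("after cbk,  in index: %d\n", e.in_xattrop_index);
    return 0;
}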
Fix:
Check the pending flag to decide whether the file definitely needs heal,
instead of relying on which index is being crawled at the moment.
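A sketch of the changed decision, again with hypothetical names rather than
the real afr functions:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* old check: being in the crawled xattrop index alone decides */
static bool needs_heal_old(bool in_xattrop_index) {
    return in_xattrop_index; /* also counts the transient first-xattrop entry */
}

/* new check: only a non-zero pending count means the file needs heal */
static bool needs_heal_new(const int *pending, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (pending[i] != 0)
            return true;
    return false;
}

int main(void) {
    int transient[3] = { 0, 0, 0 };  /* first xattrop, nothing pending */
    int dirty[3]     = { 0, 2, 0 };  /* real pending heal against brick 1 */

    printf("transient: old=%d new=%d\n",
           needs_heal_old(true), needs_heal_new(transient, 3));
    printf("dirty:     old=%d new=%d\n",
           needs_heal_old(true), needs_heal_new(dirty, 3));
    return 0;
}

With the old check both entries are reported; with the new one only the
entry that actually has a non-zero pending count is, so the transient
first-xattrop gfid no longer shows up in heal-info.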