[Bugs] [Bug 1390840] New: write-behind: flush stuck by former failed write

bugzilla at redhat.com bugzilla at redhat.com
Wed Nov 2 04:32:02 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1390840

            Bug ID: 1390840
           Summary: write-behind: flush stuck by former failed write
           Product: GlusterFS
           Version: 3.7.15
         Component: write-behind
          Keywords: Triaged
          Assignee: bugs at gluster.org
          Reporter: rgowdapp at redhat.com
                CC: bugs at gluster.org, oleksandr at natalenko.name,
                    ryan.ding at open-fs.com, sarumuga at redhat.com
        Depends On: 1372211
            Blocks: 1390837, 1390838



+++ This bug was initially created as a clone of Bug #1372211 +++

Description of problem:
While running the LTP test suite, if a WRITE op fails, the process gets stuck
on the subsequent flush. The kernel stack of the stuck process is:
[root@dynode1 ~]# cat /proc/16449/stack
[<ffffffffa0436241>] wait_answer_interruptible+0x91/0xe0 [fuse]
[<ffffffffa0436653>] __fuse_request_send+0x253/0x2c0 [fuse]
[<ffffffffa04366d2>] fuse_request_send+0x12/0x20 [fuse]
[<ffffffffa043f77f>] fuse_flush+0xff/0x150 [fuse]
[<ffffffff811dc8a4>] filp_close+0x34/0x80
[<ffffffff811fce68>] put_files_struct+0x88/0xe0
[<ffffffff811fcf67>] exit_files+0x47/0x50
[<ffffffff810815d6>] do_exit+0x2b6/0xa60
[<ffffffff81081dff>] do_group_exit+0x3f/0xa0
[<ffffffff81081e74>] SyS_exit_group+0x14/0x20
[<ffffffff81646889>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
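
For reference, the hang needs nothing more exotic than buffered writes followed
by a close()/exit. A minimal reproducer sketch along these lines (hypothetical:
it assumes a FUSE-mounted gluster volume at /mnt/gluster with
performance.write-behind enabled and a backend that starts failing writes, e.g.
with ESTALE):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical path on a FUSE-mounted gluster volume with
     * performance.write-behind enabled. */
    int fd = open("/mnt/gluster/rw-buffered-test", O_CREAT | O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    char buf[4096];
    memset(buf, 'x', sizeof(buf));

    /* Each write() returns success immediately because write-behind caches
     * it; the real WRITE FOP is wound to the bricks later and may fail there
     * (op_errno=116 in the statedump below). */
    for (int i = 0; i < 512; i++)
        if (write(fd, buf, sizeof(buf)) < 0)
            perror("write");

    /* close() (or process exit, as in the stack above) issues a FLUSH; with
     * this bug the FLUSH queues behind the failed cached writes and never
     * returns, so the process hangs here. */
    if (close(fd) < 0)
        perror("close");

    return EXIT_SUCCESS;
}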

glusterdump (statedump) output; note that both cached WRITE requests failed
with op_errno=116 (ESTALE):
[xlator.performance.write-behind.wb_inode]
path=/ltp-F9eG0ZSOME/rw-buffered-16436
inode=0x7fdbe8039b9c
window_conf=1048576
window_current=249856
transit-size=0
dontsync=0

[.WRITE]
request-ptr=0x7fdbe8020200
refcount=1
wound=no
generation-number=4
req->op_ret=-1
req->op_errno=116
sync-attempts=3
sync-in-progress=no
size=131072
offset=1220608
lied=-1
append=0
fulfilled=0
go=0

[.WRITE]
request-ptr=0x7fdbe8068c30
refcount=1
wound=no
generation-number=5
req->op_ret=-1
req->op_errno=116
sync-attempts=2
sync-in-progress=no
size=118784
offset=1351680
lied=-1
append=0
fulfilled=0
go=0

[.FLUSH]
request-ptr=0x7fdbe8021cd0
refcount=1
wound=no
generation-number=6
req->op_ret=0
req->op_errno=0
sync-attempts=0
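
Reading the dump: both cached WRITEs were already acknowledged to the
application (lied=-1) but never completed on the bricks (fulfilled=0, go=0),
and the FLUSH at generation 6 is queued after them, which appears to be why it
never makes progress. A self-contained toy model of that ordering rule follows
(illustrative only, not the actual write-behind translator code; the field
names simply mirror the statedump):

#include <stdbool.h>
#include <stdio.h>

struct wb_req {
    const char *fop;
    int gen;         /* generation-number in the statedump */
    bool lied;       /* unwound to the application as success before syncing */
    bool fulfilled;  /* sync to the bricks actually completed */
};

/* A FLUSH may proceed only if no earlier-generation request is still an
 * unfulfilled "lie"; a failed write that is never fulfilled therefore
 * blocks the FLUSH forever. */
static bool flush_can_proceed(const struct wb_req *todo, size_t n, int flush_gen)
{
    for (size_t i = 0; i < n; i++)
        if (todo[i].gen < flush_gen && todo[i].lied && !todo[i].fulfilled)
            return false;
    return true;
}

int main(void)
{
    /* Mirrors the dump: two failed WRITEs (gen 4 and 5) ahead of the FLUSH
     * at gen 6. */
    struct wb_req todo[] = {
        { "WRITE", 4, true,  false },
        { "WRITE", 5, true,  false },
        { "FLUSH", 6, false, false },
    };

    printf("FLUSH can proceed: %s\n",
           flush_can_proceed(todo, sizeof(todo) / sizeof(todo[0]), 6)
               ? "yes" : "no (stuck)");
    return 0;
}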

gdb details of the three requests above (the all/todo list pointers show the
two failed WRITEs queued ahead of the FLUSH on the same wb_inode):
(gdb) print *((wb_request_t *)0x7fdbe8021cd0)
$2 = {all = {next = 0x7fdbe803a608, prev = 0x7fdbe8068c30},
      todo = {next = 0x7fdbe803a618, prev = 0x7fdbe8068c40},
      lie = {next = 0x7fdbe8021cf0, prev = 0x7fdbe8021cf0},
      winds = {next = 0x7fdbe8021d00, prev = 0x7fdbe8021d00},
      unwinds = {next = 0x7fdbe8021d10, prev = 0x7fdbe8021d10},
      wip = {next = 0x7fdbe8021d20, prev = 0x7fdbe8021d20},
      stub = 0x7fdbe80224dc, write_size = 0, orig_size = 0, total_size = 0,
      op_ret = 0, op_errno = 0, refcount = 1, wb_inode = 0x7fdbe803a5f0,
      fop = GF_FOP_FLUSH,
      lk_owner = {len = 8, data = "W\322T\f\271\367y$", '\000' <repeats 1015 times>},
      iobref = 0x0, gen = 6, fd = 0x7fdbe800f0dc, wind_count = 0,
      ordering = {size = 0, off = 0, append = 0, tempted = 0, lied = 0,
                  fulfilled = 0, go = 0}}
(gdb) print *((wb_request_t *)0x7fdbe8020200)
$3 = {all = {next = 0x7fdbe8068c30, prev = 0x7fdbe803a608},
      todo = {next = 0x7fdbe8068c40, prev = 0x7fdbe803a618},
      lie = {next = 0x7fdbe8068c50, prev = 0x7fdbe803a628},
      winds = {next = 0x7fdbe8020230, prev = 0x7fdbe8020230},
      unwinds = {next = 0x7fdbe8020240, prev = 0x7fdbe8020240},
      wip = {next = 0x7fdbe8020250, prev = 0x7fdbe8020250},
      stub = 0x7fdbe8062c3c, write_size = 131072, orig_size = 4096,
      total_size = 0, op_ret = -1, op_errno = 116, refcount = 1,
      wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE,
      lk_owner = {len = 8, data = '\000' <repeats 1023 times>},
      iobref = 0x7fdbe80311a0, gen = 4, fd = 0x7fdbe805c89c, wind_count = 3,
      ordering = {size = 131072, off = 1220608, append = 0, tempted = -1,
                  lied = -1, fulfilled = 0, go = 0}}
(gdb) print *((wb_request_t *)0x7fdbe8068c30)
$4 = {all = {next = 0x7fdbe8021cd0, prev = 0x7fdbe8020200},
      todo = {next = 0x7fdbe8021ce0, prev = 0x7fdbe8020210},
      lie = {next = 0x7fdbe803a628, prev = 0x7fdbe8020220},
      winds = {next = 0x7fdbe8068c60, prev = 0x7fdbe8068c60},
      unwinds = {next = 0x7fdbe8068c70, prev = 0x7fdbe8068c70},
      wip = {next = 0x7fdbe8068c80, prev = 0x7fdbe8068c80},
      stub = 0x7fdbe806746c, write_size = 118784, orig_size = 4096,
      total_size = 0, op_ret = -1, op_errno = 116, refcount = 1,
      wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE,
      lk_owner = {len = 8, data = '\000' <repeats 1023 times>},
      iobref = 0x7fdbe8052b10, gen = 5, fd = 0x7fdbe805c89c, wind_count = 2,
      ordering = {size = 118784, off = 1351680, append = 0, tempted = -1,
                  lied = -1, fulfilled = 0, go = 0}}

--- Additional comment from Worker Ant on 2016-09-06 02:29:21 EDT ---

REVIEW: http://review.gluster.org/15380 (performance/write-behind: fix flush
stuck by former failed writes) posted (#2) for review on master by Oleksandr
Natalenko (oleksandr at natalenko.name)

--- Additional comment from Worker Ant on 2016-11-01 06:06:10 EDT ---

REVIEW: http://review.gluster.org/15380 (performance/write-behind: fix flush
stuck by former failed writes) posted (#3) for review on master by Ryan Ding
(ryan.ding at open-fs.com)


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1372211
[Bug 1372211] write-behind: flush stuck by former failed write
https://bugzilla.redhat.com/show_bug.cgi?id=1390837
[Bug 1390837] write-behind: flush stuck by former failed write
https://bugzilla.redhat.com/show_bug.cgi?id=1390838
[Bug 1390838] write-behind: flush stuck by former failed write