[Bugs] [Bug 1797882] New: Segmentation fault occurs while truncate file
bugzilla@redhat.com
Tue Feb 4 06:27:23 UTC 2020
https://bugzilla.redhat.com/show_bug.cgi?id=1797882
Bug ID: 1797882
Summary: Segmentation fault occurs while truncate file
Product: GlusterFS
Version: mainline
Hardware: All
OS: All
Status: NEW
Component: write-behind
Severity: urgent
Assignee: bugs@gluster.org
Reporter: pkarampu@redhat.com
CC: bugs@gluster.org, nbalacha@redhat.com, pasik@iki.fi,
rgowdapp@redhat.com, vpolakis@gmail.com
Depends On: 1753592
Blocks: 1755678, 1755679
Target Milestone: ---
Classification: Community
+++ This bug was initially created as a clone of Bug #1753592 +++
+++ This bug was initially created as a clone of Bug #1749369 +++
Description of problem:
A segmentation fault occurs when the bricks are nearly 100% full and, in
parallel, a truncate of a file is attempted (writes fail with "No space left on
device"). The prerequisite is that the performance xlators (read-ahead,
write-behind, etc.) are enabled, as in the following volume:
[root@ glusterfs]# gluster --glusterd-sock=/d/backends/1/glusterd/gd.sock
volume info
Volume Name: patchy
Type: Replicate
Volume ID: 5b79a54d-8207-4d38-a0e2-001adbdd11d8
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 127.1.1.1:/d/backends/1/brick
Brick2: 127.1.1.2:/d/backends/2/brick
Brick3: 127.1.1.3:/d/backends/3/brick
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
The backtrace is like the following:
#0 mem_put (ptr=0x7fa8e006ec40) at mem-pool.c:847
#1 0x00007fa8e94d3fea in FRAME_DESTROY (frame=0x7fa8d0015698) at
../../../../libglusterfs/src/stack.h:173
#2 STACK_DESTROY (stack=0x7fa8d000b238) at
../../../../libglusterfs/src/stack.h:193
#3 fuse_truncate_cbk (frame=<optimized out>, cookie=<optimized out>,
this=<optimized out>, op_ret=<optimized out>, op_errno=28, prebuf=<optimized
out>, postbuf=0x7fa8e6637530, xdata=0x0) at fuse-bridge.c:813
#4 0x00007fa8e45b5f7d in io_stats_ftruncate_cbk (frame=0x7fa8d0009508,
cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28,
prebuf=0x7fa8d001cac0, postbuf=0x7fa8e6637530, xdata=0x0) at io-stats.c:2425
#5 0x00007fa8e47d6feb in mdc_ftruncate_cbk (frame=0x7fa8d0015e58,
cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28,
prebuf=0x7fa8d001cac0, postbuf=0x7fa8e6637530, xdata=0x0) at md-cache.c:1477
#6 0x00007fa8ec29e668 in default_ftruncate_cbk (frame=0x7fa8d000a558,
cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28,
prebuf=0x7fa8d001cac0, postbuf=0x7fa8e6637530, xdata=0x0) at defaults.c:1375
#7 0x00007fa8e4bfc539 in qr_ftruncate_cbk (frame=0x7fa8d000d288,
cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28,
prebuf=0x7fa8d001cac0, postbuf=0x7fa8e6637530, xdata=0x0) at quick-read.c:890
#8 0x00007fa8e4e06c48 in ioc_ftruncate_cbk (frame=frame@entry=0x7fa8d000f1e8,
cookie=<optimized out>, this=<optimized out>, op_ret=op_ret@entry=-1,
op_errno=op_errno@entry=28, prebuf=prebuf@entry=0x7fa8d001cac0,
postbuf=postbuf@entry=0x7fa8e6637530, xdata=xdata@entry=0x0) at io-cache.c:1277
#9 0x00007fa8e5022f2a in rda_ftruncate_cbk (frame=0x7fa8d000f088,
cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28,
prebuf=0x7fa8d001cac0, postbuf=0x7fa8d001cb58, xdata=0x0) at
readdir-ahead.c:826
#10 0x00007fa8e522df58 in ra_truncate_cbk (frame=0x7fa8d0015588,
cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28,
prebuf=0x7fa8d001cac0, postbuf=0x7fa8d001cb58, xdata=0x0) at read-ahead.c:668
#11 0x00007fa8ec22bd43 in call_resume_unwind (stub=stub@entry=0x7fa8d001bf88)
at call-stub.c:2348
#12 0x00007fa8ec23a989 in call_unwind_error_keep_stub (stub=0x7fa8d001bf88,
op_ret=op_ret@entry=-1, op_errno=28) at call-stub.c:2541
#13 0x00007fa8e544482c in wb_do_winds (wb_inode=wb_inode@entry=0x7fa8e006ec40,
tasks=tasks@entry=0x7fa8e6637750) at write-behind.c:1735
#14 0x00007fa8e54448db in wb_process_queue
(wb_inode=wb_inode@entry=0x7fa8e006ec40) at write-behind.c:1778
#15 0x00007fa8e5444a5f in wb_fulfill_cbk (frame=frame@entry=0x7fa8e005e808,
cookie=<optimized out>, this=<optimized out>, op_ret=op_ret@entry=-1,
op_errno=op_errno@entry=28, prebuf=prebuf@entry=0x7fa8e00731e0,
postbuf=postbuf@entry=0x7fa8e0073278, xdata=xdata@entry=0x0) at
write-behind.c:1105
#16 0x00007fa8e56c5f60 in dht_writev_cbk (frame=frame@entry=0x7fa8e006c0a8,
cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28,
prebuf=prebuf@entry=0x7fa8e00731e0, postbuf=postbuf@entry=0x7fa8e0073278,
xdata=0x0) at dht-inode-write.c:140
#17 0x00007fa8e592f31e in afr_writev_unwind (frame=frame@entry=0x7fa8e0000b18,
this=this@entry=0x7fa8e000f820) at afr-inode-write.c:234
#18 0x00007fa8e592f4fd in afr_transaction_writev_unwind (frame=0x7fa8e005ed78,
this=0x7fa8e000f820) at afr-inode-write.c:248
#19 0x00007fa8e5941be2 in afr_transaction_done (frame=0x7fa8e005ed78,
this=0x7fa8e000f820) at afr-transaction.c:367
#20 0x00007fa8e59453d3 in afr_unlock_common_cbk
(frame=frame@entry=0x7fa8e005ed78, this=this@entry=0x7fa8e000f820,
op_ret=op_ret@entry=0, xdata=<optimized out>, op_errno=<optimized out>,
cookie=0x2) at afr-lk-common.c:243
#21 0x00007fa8e59461e8 in afr_unlock_inodelk_cbk (frame=0x7fa8e005ed78,
cookie=0x2, this=0x7fa8e000f820, op_ret=0, op_errno=<optimized out>,
xdata=<optimized out>) at afr-lk-common.c:281
#22 0x00007fa8e5c18f3f in client4_0_finodelk_cbk (req=<optimized out>,
iov=<optimized out>, count=<optimized out>, myframe=<optimized out>) at
client-rpc-fops_v2.c:1398
#23 0x00007fa8ebfe0980 in rpc_clnt_handle_reply
(clnt=clnt@entry=0x7fa8e0053800, pollin=pollin@entry=0x7fa8e006e270) at
rpc-clnt.c:755
#24 0x00007fa8ebfe0d53 in rpc_clnt_notify (trans=0x7fa8e0053b20,
mydata=0x7fa8e0053830, event=<optimized out>, data=0x7fa8e006e270) at
rpc-clnt.c:923
#25 0x00007fa8ebfdcd83 in rpc_transport_notify (this=this@entry=0x7fa8e0053b20,
event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7fa8e006e270)
at rpc-transport.c:525
#26 0x00007fa8e6aae2cb in socket_event_poll_in (notify_handled=true,
this=0x7fa8e0053b20) at socket.c:2504
#27 socket_event_handler (fd=<optimized out>, idx=<optimized out>,
gen=<optimized out>, data=0x7fa8e0053b20, poll_in=<optimized out>,
poll_out=<optimized out>, poll_err=0) at socket.c:2905
#28 0x00007fa8ec2762d9 in event_dispatch_epoll_handler (event=0x7fa8e6637e90,
event_pool=0x18420b0) at event-epoll.c:591
#29 event_dispatch_epoll_worker (data=0x1865220) at event-epoll.c:668
#30 0x00007fa8eb99b182 in start_thread (arg=0x7fa8e6638700) at
pthread_create.c:312
#31 0x00007fa8eb6c830d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Program received signal SIGSEGV, Segmentation fault.
mem_put (ptr=0x7fa8e006ec40) at mem-pool.c:869
869 GF_ATOMIC_DEC(hdr->pool->active);
(gdb)
So it appears that during the stack unwind of the frames following an error
response from the brick (No space left on device), frame->local points to a
memory location that was not allocated via mem_get but via calloc:
Breakpoint 1, __gf_calloc (nmemb=nmemb@entry=1, size=size@entry=208,
type=type@entry=163, typestr=typestr@entry=0x7fa8e544b892
"gf_wb_mt_wb_inode_t") at mem-pool.c:119
119 return (void *)ptr;
$2034 = (void *) 0x7fa8e006ec40
#0 __gf_calloc (nmemb=nmemb@entry=1, size=size@entry=208, type=type@entry=163,
typestr=typestr@entry=0x7fa8e544b892 "gf_wb_mt_wb_inode_t") at mem-pool.c:119
#1 0x00007fa8e5441e7b in __wb_inode_create (this=this@entry=0x7fa8e0013470,
inode=inode@entry=0x7fa8e00603a8) at write-behind.c:667
#2 0x00007fa8e5442063 in wb_inode_create (this=0x7fa8e0013470,
inode=0x7fa8e00603a8) at write-behind.c:708
#3 0x00007fa8e5444fc2 in wb_writev (frame=frame@entry=0x7fa8e005ed78,
this=<optimized out>, fd=fd@entry=0x7fa8d000cb88,
vector=vector@entry=0x7fa8d000d0b0, count=count@entry=1, offset=offset@entry=0,
flags=flags@entry=32769, iobref=iobref@entry=0x7fa8d00160f0,
xdata=xdata@entry=0x0) at write-behind.c:1851
#4 0x00007fa8e5231c78 in ra_writev (frame=frame@entry=0x7fa8e005d2d8,
this=0x7fa8e0014f50, fd=fd@entry=0x7fa8d000cb88,
vector=vector@entry=0x7fa8d000d0b0, count=count@entry=1, offset=offset@entry=0,
flags=flags@entry=32769, iobref=iobref@entry=0x7fa8d00160f0,
xdata=xdata@entry=0x0) at read-ahead.c:650
#5 0x00007fa8e501f10a in rda_writev (frame=frame@entry=0x7fa8e0000b18,
this=0x7fa8e0016940, fd=fd@entry=0x7fa8d000cb88,
vector=vector@entry=0x7fa8d000d0b0, count=count@entry=1, off=off@entry=0,
flags=flags@entry=32769, iobref=iobref@entry=0x7fa8d00160f0,
xdata=xdata@entry=0x0) at readdir-ahead.c:696
#6 0x00007fa8e4e0b2e6 in ioc_writev (frame=frame@entry=0x7fa8e006c0a8,
this=0x7fa8e00188d0, fd=fd@entry=0x7fa8d000cb88,
vector=vector@entry=0x7fa8d000d0b0, count=count@entry=1, offset=offset@entry=0,
flags=flags@entry=32769, iobref=iobref@entry=0x7fa8d00160f0,
xdata=xdata@entry=0x0) at io-cache.c:1233
#7 0x00007fa8e4bfabce in qr_writev (frame=frame@entry=0x7fa8e005e808,
this=0x7fa8e001a370, fd=fd@entry=0x7fa8d000cb88, iov=iov@entry=0x7fa8d000d0b0,
count=count@entry=1, offset=offset@entry=0, flags=flags@entry=32769,
iobref=iobref@entry=0x7fa8d00160f0, xdata=xdata@entry=0x0) at quick-read.c:845
#8 0x00007fa8ec2c2240 in default_writev_resume (frame=0x7fa8d000f088,
this=0x7fa8e001bde0, fd=0x7fa8d000cb88, vector=0x7fa8d000d0b0, count=1, off=0,
flags=32769, iobref=0x7fa8d00160f0, xdata=0x0) at defaults.c:2274
#9 0x00007fa8ec23a66a in call_resume_wind (stub=<optimized out>) at
call-stub.c:2040
and contains the inode entry. The magic number is present in the header, which
fools mem_put into treating the block as pool memory, and the core dump occurs
while trying to dereference the pool pointer.
The destroyed frame is always the one for ra_truncate_cbk, wound from
ra_ftruncate, e.g.:
(gdb) p *frame
$9 = {root = 0x7fa1a0000ec8, parent = 0x7fa1a000c098, frames = {next =
0x7fa1a000a338, prev = 0x7fa1a000a338}, local = 0x0, this = 0x7fa1ac013470, ret
= 0x7fa1b2baedf0 <ra_truncate_cbk>, ref_count = 0, lock = {spinlock = 0, mutex
= {
__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind =
-1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size
= '\000' <repeats 16 times>, "\377\377\377\377", '\000' <repeats 19 times>,
__align = 0}}, cookie = 0x7fa1a000a328, complete = true, op =
GF_FOP_FTRUNCATE, begin = {tv_sec = 164325, tv_nsec = 696128864}, end = {tv_sec
= 164325, tv_nsec = 698002711},
wind_from = 0x7fa1b2bb86ff <__FUNCTION__.14980> "ra_ftruncate", wind_to =
0x7fa1b2bb8470 "(this->children->xlator)->fops->ftruncate", unwind_from =
0x7fa1b9c48e50 <__FUNCTION__.18620> "call_resume_unwind",
unwind_to = 0x7fa1b2bb7fd7 "ra_truncate_cbk"}
(gdb) q
and the inode pointer is copied to frame->local in wb_ftruncate:
2237 wb_ftruncate(call_frame_t *frame, xlator_t *this, fd_t *fd, off_t offset,
2238 dict_t *xdata)
2239 {
2240 wb_inode_t *wb_inode = NULL;
2241 call_stub_t *stub = NULL;
2242 int32_t op_errno = 0;
2243
2244 wb_inode = wb_inode_create(this, fd->inode);
2245 if (!wb_inode) {
2246 op_errno = ENOMEM;
2247 goto unwind;
2248 }
2249
2250 frame->local = wb_inode;
2251
2252 stub = fop_ftruncate_stub(frame, wb_ftruncate_helper, fd, offset,
xdata);
as seen in gdb:
Breakpoint 1, wb_ftruncate (frame=frame@entry=0x7f605c00fc18,
this=0x7f6060013470, fd=fd@entry=0x7f605c00cbd8, offset=offset@entry=1,
xdata=xdata@entry=0x0) at write-behind.c:2239
2239 {
(gdb) p frame
$1 = (call_frame_t *) 0x7f605c00fc18
(gdb) p frame->local
$2 = (void *) 0x0
(gdb) watch frame->local
Watchpoint 2: frame->local
(gdb) c
Continuing.
Watchpoint 2: frame->local
Old value = (void *) 0x0
New value = (void *) 0x7f606006c540
wb_ftruncate (frame=frame@entry=0x7f605c00fc18, this=<optimized out>,
fd=fd@entry=0x7f605c00cbd8, offset=offset@entry=1,
xdata=xdata@entry=0x0) at write-behind.c:2252
2252 stub = fop_ftruncate_stub(frame, wb_ftruncate_helper, fd, offset,
xdata);
(gdb) bt
#0 wb_ftruncate (frame=frame@entry=0x7f605c00fc18, this=<optimized out>,
fd=fd@entry=0x7f605c00cbd8, offset=offset@entry=1,
xdata=xdata@entry=0x0) at write-behind.c:2252
#1 0x00007f6066c7dfc6 in ra_ftruncate (frame=frame@entry=0x7f605c00a308,
this=0x7f6060014f50, fd=fd@entry=0x7f605c00cbd8,
offset=offset@entry=1, xdata=xdata@entry=0x0) at read-ahead.c:903
#2 0x00007f6066a69627 in rda_ftruncate (frame=frame@entry=0x7f605c00c148,
this=0x7f6060016940, fd=fd@entry=0x7f605c00cbd8,
offset=offset@entry=1, xdata=xdata@entry=0x0) at readdir-ahead.c:835
#3 0x00007f6066856b78 in ioc_ftruncate (frame=frame@entry=0x7f605c00c9b8,
this=0x7f60600188d0, fd=fd@entry=0x7f605c00cbd8,
offset=offset@entry=1, xdata=xdata@entry=0x0) at io-cache.c:1327
#4 0x00007f60666461d1 in qr_ftruncate (frame=frame@entry=0x7f605c00fe98,
this=0x7f606001a370, fd=fd@entry=0x7f605c00cbd8,
offset=offset@entry=1, xdata=xdata@entry=0x0) at quick-read.c:904
#5 0x00007f606dd0986a in default_ftruncate_resume (frame=0x7f605c0016d8,
this=0x7f606001bde0, fd=0x7f605c00cbd8, offset=1, xdata=0x0)
at defaults.c:2023
How reproducible:
always
Steps to Reproduce:
- set up a test cluster:
Volume Name: patchy
Type: Replicate
Volume ID: 5b79a54d-8207-4d38-a0e2-001adbdd11d8
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 127.1.1.1:/d/backends/1/brick
Brick2: 127.1.1.2:/d/backends/2/brick
Brick3: 127.1.1.3:/d/backends/3/brick
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
- fill the volume with dummy files, e.g.:
for i in {1..10}; do dd if=/home/epolakis/prom.img
of=/mnt/glusterfs/0/smallfile_IB__N$i bs=4096 count=10000 ; done
- use a test program that truncates a file, like:
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define STR_LEN 1000

int main(void) {
    char *m_string;
    int fd;
    int ret;
    char fn[] = "/mnt/glusterfs/0/smallfile_N1";
    struct stat st;

    if ((m_string = malloc(STR_LEN)) == NULL)
        perror("malloc() error");
    else if ((fd = open(fn, O_CREAT | O_WRONLY, S_IRUSR | S_IWUSR)) < 0)
        perror("open() error");
    else {
        memset(m_string, '0', STR_LEN);
        if ((ret = write(fd, m_string, STR_LEN)) == -1)
            perror("write() error");
        else {
            if (ftruncate(fd, 1) != 0)
                perror("ftruncate() error");
            else {
                fstat(fd, &st);
                printf("the file has %ld bytes\n", (long) st.st_size);
            }
        }
        close(fd);
    }
    free(m_string);
    return 0;
}
Actual results:
Expected results:
Additional info:
If my understanding is correct, the calloc'd inode context should not be
released until the forget.
--- Additional comment from evangelos on 2019-09-10 10:18:30 UTC ---
Would something like the below be enough?
diff --git a/libglusterfs/src/mem-pool.c b/libglusterfs/src/mem-pool.c
index cff9632..fad6a2c 100644
--- a/libglusterfs/src/mem-pool.c
+++ b/libglusterfs/src/mem-pool.c
@@ -863,6 +863,13 @@ mem_put(void *ptr)
         /* Not one of ours; don't touch it. */
         return;
     }
+
+    if (!hdr->pool_list) {
+        gf_msg_callingfn("mem-pool", GF_LOG_ERROR, EINVAL, LG_MSG_INVALID_ARG,
+                         "invalid argument hdr->pool_list NULL");
+        return;
+    }
+
     pool_list = hdr->pool_list;
     pt_pool = &pool_list->pools[hdr->power_of_two - POOL_SMALLEST];
This allocation (as far as I understood, but please correct me) is for the
inode, which shouldn't be released unless the kernel sends the forget
(rm/unlink, etc.).
--- Additional comment from Worker Ant on 2019-09-19 11:34:42 UTC ---
REVIEW: https://review.gluster.org/23445 (Segmentation fault occurs during
truncate) posted (#2) for review on master by Vangelis Polakis
--- Additional comment from Nithya Balachandran on 2019-09-24 05:37:47 UTC ---
Hi,
What version of gluster are you using?
--- Additional comment from evangelos on 2019-09-24 07:54:23 UTC ---
I have seen this issue in 5.6.
--- Additional comment from Nithya Balachandran on 2019-09-25 04:30:33 UTC ---
Thanks. We will take a look and get back to you.
--- Additional comment from Nithya Balachandran on 2019-09-25 13:50:34 UTC ---
Hi,
Thank you for an excellent test. Would you mind if I added it to the regression
framework?
I'm afraid the patch provided will not fix the issue here. In most xlators, the
local structure is allocated from a mempool which is why the mem_put will
usually work.
However, the wb xlator does not have a defined local struct of its own.
Instead, for the truncate and ftruncate fops, it saves the wb_inode (which is
allocated using GF_CALLOC) in frame->local. In the normal code path, this is
cleared in the wb_truncate_cbk and wb_ftruncate_cbk functions and frame->local
is set to NULL before unwinding the fop.
The test code, however, does a write followed by a conflicting truncate. As the
write fails with ENOSPC, __wb_pick_winds determines that there is a conflict,
updates the request op_ret and op_errno for the ftruncate request. The
wb_ftruncate_cbk is not called and the req is instead unwound in wb_do_winds ->
call_unwind_error_keep_stub code path. This does not clear frame->local which
causes the mem_put to crash.
--- Additional comment from Nithya Balachandran on 2019-09-25 14:27:30 UTC ---
(In reply to evangelos from comment #3)
> I have seen in this issue in 5.6
The issue also exists in the master branch.
--- Additional comment from Worker Ant on 2019-09-25 14:29:24 UTC ---
REVIEW: https://review.gluster.org/23485 (perf/write-behind: Clear frame->local
on conflict error) posted (#1) for review on master by N Balachandran
--- Additional comment from Worker Ant on 2019-09-26 02:28:52 UTC ---
REVIEW: https://review.gluster.org/23485 (perf/write-behind: Clear frame->local
on conflict error) merged (#1) on master by N Balachandran
--- Additional comment from evangelos on 2019-09-26 22:06:01 UTC ---
Thanks a lot for the explanation/complete solution! Yes, of course you can add
it to the regression framework.
One comment though: wouldn't it be good to also add a check on the ptr in
mem_put before dereferencing? I mean, in case something similar happens in the
future, an error would appear instead of a core dump.
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1753592
[Bug 1753592] Segmentation fault occurs while truncate file
https://bugzilla.redhat.com/show_bug.cgi?id=1755678
[Bug 1755678] Segmentation fault occurs while truncate file
https://bugzilla.redhat.com/show_bug.cgi?id=1755679
[Bug 1755679] Segmentation fault occurs while truncate file