[Gluster-devel] Segfault at client3_3_create_cbk() when calling fops->create

Gustavo Bervian Brand gugabrand at gmail.com
Fri Oct 12 15:25:28 UTC 2012


> So you are using the frame pointer in readv_cbk to do the STACK_WIND of
> create or open. But are you sure you are not unwinding the same frame right
> after the STACK_WIND?
>

  I was unwinding by mistake at the end of readv_cbk because I left in some
code there that should have been moved to another point. It was proper code
with the syncops, but not with the wind/unwinds... my mistake, sorry.
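
  Just to write down the rule I tripped over, a rough sketch (local is my
own hypothetical local struct and the create() argument list is from memory,
so take it as approximate):

wrong -- the same frame gets answered twice:

    STACK_WIND (frame, _read__create_cbk, FIRST_CHILD (this),
                FIRST_CHILD (this)->fops->create,
                &local->loc, local->flags, local->mode, local->fd, NULL);
    STACK_UNWIND_STRICT (readv, frame, op_ret, op_errno,
                         vector, count, stbuf, iobref, xdata);  /* leftover
                                                from the syncop version */

right -- wind only; the unwind happens once, in the last callback of the chain:

    STACK_WIND (frame, _read__create_cbk, FIRST_CHILD (this),
                FIRST_CHILD (this)->fops->create,
                &local->loc, local->flags, local->mode, local->fd, NULL);
    return 0;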


  Anyway, I was able to execute the code after some modifications, but
there was a problem intrinsic to the async behaviour of the wind/unwind
calls that I found a bit tricky to deal with, also thinking about other
situations I will still have to implement. Let's see:

  My normal code flow with syncop was, at each readv_cbk, to create and
write the file locally, truncating it if it already existed because I'd be
overwriting it. For now a simple global variable in "private" tracks whether
the local write is just beginning or already in progress. Something like:

--> Syncop flow:
if (priv->write_in_progress == 0)
    if (syncop_create)
        syncop_open (RW | TRUNC)
        (syncop_fsetattr)

if (priv->write_in_progress == 1)
    syncop_open (RW | APPEND)
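
  (In C this was roughly the following; the syncop_* argument lists are from
memory and may not match the headers exactly, and mode/stbuf/valid are just
placeholders here:)

    if (priv->write_in_progress == 0) {
            /* first block: create the file, open it truncating any previous
             * content, and set its attributes before writing */
            ret = syncop_create (subvol, &loc, O_RDWR | O_TRUNC, mode, fd, NULL);
            if (ret == 0) {
                    ret = syncop_open (subvol, &loc, O_RDWR | O_TRUNC, fd);
                    if (ret == 0)
                            syncop_fsetattr (subvol, fd, &stbuf, valid);
            }
            priv->write_in_progress = 1;   /* mark the local write as started */
    } else {
            /* following blocks: the file is already being written,
             * just open it for append */
            ret = syncop_open (subvol, &loc, O_RDWR | O_APPEND, fd);
    }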

  So, in wind/unwind code this translated to the logic below, also with 2
paths to follow. But when executing this code I seem to hit a race condition
where the 2nd path reaches its end before the 1st one (which has more steps
to complete). That is, the 2nd data block from readv_cbk reaches
*_read__open_cbk__writev_cbk* first, while the 1st readv_cbk data block is
still executing at *_read__create_cbk__open_cbk* and hasn't yet written the
content that should have been written first.

  Is this kind of behaviour expected and normal, or am I missing something
here?

  My first idea was to put a semaphore on the 2nd path, waiting for the
conclusion of the 1st one, but it didn't work, so I ended up creating one
path only, always calling create() and open() in sequence, but changing the
flags at _read__create_cbk() to APPEND in case the error from create() was
"file already exists" (a rough sketch of that merged path follows the flow
listing below).

----> Wind/unwind code:
read ()
    if (priv->write_in_progress == 0)
        wind _create ()   -----------------------> 1st flow
    else if (priv->write_in_progress == 1)
        wind _open ()     -----------------------> 2nd flow

> 1st flow (to be executed only once, at the beginning of the write):
_read__create_cbk ()
    if (error == file exists)  wind _open  -----> if it exists, open with
                                                  APPEND flags; otherwise
                                                  keep the same flags
    else unwind ()  -> error, finish everything here
_read__create_cbk__open_cbk ()
    wind _fsetattr
_read__create_cbk__open_cbk__fsetattr_cbk ()  --> joins the operations from
                                                  create and open
    wind _writev
_read__create_cbk__open_cbk__fsetattr_cbk__writev_cbk ()
    unwind ()  -> success, chain of fops finished.

> 2nd flow:
_read__open_cbk ()
    wind _writev
_read__open_cbk__writev_cbk ()  -------> CONCLUDES FIRST, before
                                         _read__create_cbk__open_cbk
    unwind ()  -> success, chain of fops finished.
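
  The merged single path ended up looking roughly like this at
_read__create_cbk() (the fop argument lists are approximate, local is my own
hypothetical local struct, and it assumes the chain was started from a readv
fop):

    int32_t
    _read__create_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                       int32_t op_ret, int32_t op_errno, fd_t *fd,
                       inode_t *inode, struct iatt *buf,
                       struct iatt *preparent, struct iatt *postparent,
                       dict_t *xdata)
    {
            my_local_t *local = frame->local;
            int32_t     flags = local->flags;

            if (op_ret < 0 && op_errno != EEXIST) {
                    /* real error: finish the whole chain here */
                    STACK_UNWIND_STRICT (readv, frame, -1, op_errno,
                                         NULL, 0, NULL, NULL, NULL);
                    return 0;
            }

            if (op_ret < 0 && op_errno == EEXIST)
                    flags = O_RDWR | O_APPEND;  /* file already there: append */

            /* continue the chain re-using the same frame */
            STACK_WIND (frame, _read__create_cbk__open_cbk,
                        FIRST_CHILD (this), FIRST_CHILD (this)->fops->open,
                        &local->loc, flags, local->fd, NULL);
            return 0;
    }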


----> Stack trace (stbuf is NULL at fuse_readv_cbk; accessing stbuf->ia_size
generates the fault):
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff5605e29 in fuse_readv_cbk (frame=0x7ffff5bc1fc0,
cookie=0x7ffff5dcb184, this=0x651250, op_ret=131072, op_errno=0,
vector=0x7fffffffdec0, count=-8624, stbuf=0x0, iobref=0x7ffff6edac07,
    xdata=0x7fff00000000) at fuse-bridge.c:2037
2037                    gf_log ("glusterfs-fuse", GF_LOG_TRACE,
(gdb)
(gdb) bt
#0  0x00007ffff5605e29 in fuse_readv_cbk (frame=0x7ffff5bc1fc0,
cookie=0x7ffff5dcb184, this=0x651250, op_ret=131072, op_errno=0,
vector=0x7fffffffdec0, count=-8624, stbuf=0x0, iobref=0x7ffff6edac07,
    xdata=0x7fff00000000) at fuse-bridge.c:2037
#1  0x00007ffff398c610 in _read__open_cbk__writev_cbk
(frame=0x7ffff5dcb184, cookie=0x7ffff5dcb4e0, this=0x664bd0, op_ret=131072,
op_errno=0, prebuf=0x7fffffffdec0, postbuf=0x7fffffffde50, xdata=0x0)
    at gbfs_t.c:213
#2  0x00007ffff3bc3369 in client3_3_writev_cbk (req=0x7ffff367402c,
iov=0x7ffff367406c, count=1, myframe=0x7ffff5dcb4e0) at
client-rpc-fops.c:867
#3  0x00007ffff7944e8b in rpc_clnt_handle_reply (clnt=0x693890,
pollin=0x6e7d70) at rpc-clnt.c:784
#4  0x00007ffff79451fc in rpc_clnt_notify (trans=0x6a32c0, mydata=0x6938c0,
event=RPC_TRANSPORT_MSG_RECEIVED, data=0x6e7d70) at rpc-clnt.c:903
#5  0x00007ffff79416bb in rpc_transport_notify (this=0x6a32c0,
event=RPC_TRANSPORT_MSG_RECEIVED, data=0x6e7d70) at rpc-transport.c:495
#6  0x00007ffff3466e20 in socket_event_poll_in (this=0x6a32c0) at
socket.c:1986
#7  0x00007ffff34672bd in socket_event_handler (fd=14, idx=1,
data=0x6a32c0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2097
#8  0x00007ffff7b98fce in event_dispatch_epoll_handler
(event_pool=0x6505e0, events=0x6c9cc0, i=0) at event.c:784
#9  0x00007ffff7b991ad in event_dispatch_epoll (event_pool=0x6505e0) at
event.c:845
#10 0x00007ffff7b99494 in event_dispatch (event_pool=0x6505e0) at
event.c:945
#11 0x0000000000408ae0 in main (argc=7, argv=0x7fffffffe568) at
glusterfsd.c:1814


Best,
Gustavo.