[Gluster-devel] crash in afr

Raghavendra G raghavendra.hg at gmail.com
Thu Jun 18 11:30:33 UTC 2009


Shehjar,
Sorry, it's not a double free either. I was wrong.

Mihai,

We are still looking into this bug. We'll get back to you once we fix this.
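
For reference, the buffer-lifetime concern from the quoted discussion below
(storing only a pointer to the caller's buffer versus copying it into the
queued entry) can be sketched roughly as follows. This is only an
illustration under assumptions: struct ioq_entry_sketch, entry_new_copy and
entry_free are hypothetical names, not the actual code in socket.c, and it
is not a confirmed fix for this crash.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a queued write; NOT the real struct ioq. */
struct ioq_entry_sketch {
    char  *buf;   /* private copy, owned by the entry */
    size_t len;   /* number of valid bytes in buf */
};

/*
 * Copy the caller's bytes into the entry, so the entry never points at
 * memory the caller may free before the entry is written out.
 */
struct ioq_entry_sketch *entry_new_copy(const char *buf, size_t len)
{
    struct ioq_entry_sketch *entry;

    assert(buf != NULL);

    entry = calloc(1, sizeof(*entry));
    if (entry == NULL)
        return NULL;

    /* Allocate at least one byte so a zero-length copy is still valid. */
    entry->buf = malloc(len ? len : 1);
    if (entry->buf == NULL) {
        free(entry);
        return NULL;
    }

    memcpy(entry->buf, buf, len);
    entry->len = len;
    return entry;
}

/* Free only what the entry itself owns. */
void entry_free(struct ioq_entry_sketch *entry)
{
    if (entry == NULL)
        return;
    free(entry->buf);
    free(entry);
}
```

With this ownership rule the caller can free its header right after
queueing, and the entry is released exactly once, after it has been
written or dropped.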

regards,

On Thu, Jun 18, 2009 at 3:01 PM, Shehjar Tikoo <shehjart at gluster.com> wrote:

> Raghavendra G wrote:
>
>> While this fixes the double free, the actual fix has to copy the buffer
>> into an ioq_entry instead of just storing the buffer pointer. Otherwise,
>> the buffer might already have been freed by the time the ioq_entry is
>> written to the socket.
>>
>
> Yup. I hadn't seen your reply to the bug report when I sent this patch.
>
> Thanks
> Shehjar
>
>
>> On Thu, Jun 18, 2009 at 2:36 PM, Shehjar Tikoo <shehjart at gluster.com> wrote:
>>
>>    I think I understand why you see the crash.
>>    Could you please apply the following patch and tell
>>    us whether the crash still occurs?
>>
>>    Thanks
>>    Shehjar
>>
>>
>>
>>
>>    Mihai wrote:
>>
>>        Hello,
>>        I'm using server-side replication on 6 servers. Glusterfsd
>>        crashes every few hours:
>>        gdb -se /usr/sbin/glusterfsd -c /core.26947
>>        GNU gdb Fedora (6.8-27.el5)
>>        Copyright (C) 2008 Free Software Foundation, Inc.
>>        License GPLv3+: GNU GPL version 3 or later
>>        <http://gnu.org/licenses/gpl.html>
>>        This is free software: you are free to change and redistribute it.
>>        There is NO WARRANTY, to the extent permitted by law.
>>        Type "show copying" and "show warranty" for details.
>>        This GDB was configured as "x86_64-redhat-linux-gnu"...
>>        (no debugging symbols found)
>>
>>        warning: .dynamic section for "/usr/lib64/libglusterfs.so.0" is
>>        not at the expected address
>>
>>        warning: difference appears to be caused by prelink, adjusting
>>        expectations
>>        Reading symbols from /usr/lib64/libglusterfs.so.0...done.
>>        Loaded symbols for /usr/lib64/libglusterfs.so.0
>>        Reading symbols from /lib64/libdl.so.2...done.
>>        Loaded symbols for /lib64/libdl.so.2
>>        Reading symbols from /lib64/libpthread.so.0...done.
>>        Loaded symbols for /lib64/libpthread.so.0
>>        Reading symbols from /lib64/libc.so.6...done.
>>        Loaded symbols for /lib64/libc.so.6
>>        Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
>>        Loaded symbols for /lib64/ld-linux-x86-64.so.2
>>        Reading symbols from
>>        /usr/lib64/glusterfs/2.0.2/xlator/storage/posix.so...done.
>>        Loaded symbols for
>>        /usr/lib64/glusterfs/2.0.2/xlator/storage/posix.so
>>        Reading symbols from
>>        /usr/lib64/glusterfs/2.0.2/xlator/features/locks.so...done.
>>        Loaded symbols for
>>        /usr/lib64/glusterfs/2.0.2/xlator/features/locks.so
>>        Reading symbols from
>>        /usr/lib64/glusterfs/2.0.2/xlator/performance/io-threads.so...done.
>>        Loaded symbols for
>>        /usr/lib64/glusterfs/2.0.2/xlator/performance/io-threads.so
>>        Reading symbols from
>>        /usr/lib64/glusterfs/2.0.2/xlator/protocol/client.so...done.
>>        Loaded symbols for
>>        /usr/lib64/glusterfs/2.0.2/xlator/protocol/client.so
>>        Reading symbols from
>>        /usr/lib64/glusterfs/2.0.2/xlator/cluster/replicate.so...done.
>>        Loaded symbols for
>>        /usr/lib64/glusterfs/2.0.2/xlator/cluster/replicate.so
>>        Reading symbols from
>>        /usr/lib64/glusterfs/2.0.2/xlator/protocol/server.so...done.
>>        Loaded symbols for
>>        /usr/lib64/glusterfs/2.0.2/xlator/protocol/server.so
>>        Reading symbols from
>>        /usr/lib64/glusterfs/2.0.2/transport/socket.so...done.
>>        Loaded symbols for /usr/lib64/glusterfs/2.0.2/transport/socket.so
>>        Reading symbols from
>>        /usr/lib64/glusterfs/2.0.2/auth/addr.so...done.
>>        Loaded symbols for /usr/lib64/glusterfs/2.0.2/auth/addr.so
>>        Reading symbols from /lib64/libnss_files.so.2...done.
>>        Loaded symbols for /lib64/libnss_files.so.2
>>        Reading symbols from /lib64/libgcc_s.so.1...done.
>>        Loaded symbols for /lib64/libgcc_s.so.1
>>        Core was generated by `/usr/sbin/glusterfsd -f
>>        /etc/glusterfs/glusterfsd.vol'.
>>        Program terminated with signal 6, Aborted.
>>        [New process 26947]
>>        [New process 26956]
>>        [New process 26955]
>>        [New process 26954]
>>        [New process 26953]
>>        [New process 26952]
>>        [New process 26951]
>>        [New process 26950]
>>        [New process 26949]
>>        [New process 26948]
>>        #0  0x0000003040030215 in raise () from /lib64/libc.so.6
>>        (gdb) bt
>>        #0  0x0000003040030215 in raise () from /lib64/libc.so.6
>>        #1  0x0000003040031cc0 in abort () from /lib64/libc.so.6
>>        #2  0x000000304006a7fb in __libc_message () from /lib64/libc.so.6
>>        #3  0x0000003040071ce2 in _int_free () from /lib64/libc.so.6
>>        #4  0x000000304007590c in free () from /lib64/libc.so.6
>>        #5  0x00002aaaaaaadcc9 in __socket_ioq_entry_free
>>        (entry=0x2aaab001da30) at socket.c:331
>>        #6  0x00002aaaaaaaf1c9 in __socket_ioq_churn_entry (this=<value
>>        optimized out>, entry=0x2aaab001da30) at socket.c:368
>>        #7  0x00002aaaaaaaf8ec in socket_submit (this=0xae11a70,
>>        buf=0x2aaab00378c0 "", len=340, vector=0x0, count=<value
>>        optimized out>,
>>           iobref=<value optimized out>) at socket.c:1281
>>        #8  0x00002b2e7c775bd3 in protocol_client_xfer
>>        (frame=0x2aaab0030ab0, this=0xae0ab00, trans=0xae11a70, type=1,
>>        op=40, hdr=0x2aaab00378c0, hdrlen=340,
>>           vector=0x0, count=0, iobref=0x0) at client-protocol.c:636
>>        #9  0x00002b2e7c77bc1a in client_xattrop (frame=0x2aaab0030ab0,
>>        this=0xae0ab00, loc=0x2aaab4004238, flags=GF_XATTROP_ADD_ARRAY,
>>        dict=0x2aaab4031fc0)
>>           at client-protocol.c:1922
>>        #10 0x00002b2e7c9a2cda in afr_changelog_pre_op
>>        (frame=0x2aaab401ea70, this=0xae0b280) at afr-transaction.c:782
>>        #11 0x00002b2e7c9a2f31 in afr_lock_rec (frame=0x2aaab401ea70,
>>        this=0xae0b280, child_index=1) at afr-transaction.c:979
>>        #12 0x00002b2e7c9a36a8 in afr_lock_cbk (frame=0x2aaab401ea70,
>>        cookie=<value optimized out>, this=0xae0b280, op_ret=0,
>>        op_errno=0) at afr-transaction.c:906
>>        #13 0x00002b2e7b6a75f0 in default_inodelk_cbk (frame=<value
>>        optimized out>, cookie=<value optimized out>, this=<value
>>        optimized out>, op_ret=-1,
>>           op_errno=128) at defaults.c:1199
>>        #14 0x00002b2e7c358182 in pl_inodelk (frame=0x2aaab4034c10,
>>        this=0xae08870, volume=<value optimized out>,
>>        loc=0x2aaab4032170, cmd=7, flock=0x0)
>>           at internal.c:194
>>        #15 0x00002b2e7b6a815c in default_inodelk (frame=0x2aaab4017660,
>>        this=0xae09080, volume=0xae0b260 "replicate",
>>        loc=0x2aaab4004238, cmd=7,
>>           lock=0x7fff2f422f80) at defaults.c:1209
>>        #16 0x00002b2e7c9a33ba in afr_lock_rec (frame=0x2aaab401ea70,
>>        this=0xae0b280, child_index=0) at afr-transaction.c:1006
>>        #17 0x00002b2e7c9a35c2 in afr_transaction (frame=0x2aaab401ea70,
>>        this=0xae0b280, type=AFR_DATA_TRANSACTION) at
>>        afr-transaction.c:1170
>>        #18 0x00002b2e7c9a07cd in afr_truncate (frame=0x2aaab403ac30,
>>        this=0xae0b280, loc=0x2aaab40174a0, offset=0) at
>>        afr-inode-write.c:1224
>>        #19 0x00002b2e7cbc0969 in server_truncate_resume
>>        (frame=0x2aaab403acc0, this=<value optimized out>,
>>        loc=0x2aaab40174a0, offset=0) at server-protocol.c:4243
>>        #20 0x00002b2e7b6b06f7 in call_resume (stub=0x2aaab4017470) at
>>        call-stub.c:2384
>>        #21 0x00002b2e7cbc4125 in server_truncate (frame=0x2aaab403acc0,
>>        bound_xl=<value optimized out>, hdr=<value optimized out>,
>>        hdrlen=<value optimized out>,
>>           iobuf=<value optimized out>) at server-protocol.c:4291
>>        #22 0x00002b2e7cbbfb20 in protocol_server_pollin
>>        (this=0xae0bdf0, trans=0xae17960) at server-protocol.c:7735
>>        #23 0x00002b2e7cbbfbfb in notify (this=0xae0bdf0, event=<value
>>        optimized out>, data=0x6) at server-protocol.c:7791
>>        #24 0x00002aaaaaaafb43 in socket_event_handler (fd=<value
>>        optimized out>, idx=11, data=0xae17960, poll_in=1, poll_out=0,
>>        poll_err=0) at socket.c:813
>>        #25 0x00002b2e7b6ba2a5 in event_dispatch_epoll
>>        (event_pool=0xae02300) at event.c:804
>>        #26 0x0000000000403899 in main ()
>>        (gdb) bt full
>>        #0  0x0000003040030215 in raise () from /lib64/libc.so.6
>>        No symbol table info available.
>>        #1  0x0000003040031cc0 in abort () from /lib64/libc.so.6
>>        No symbol table info available.
>>        #2  0x000000304006a7fb in __libc_message () from /lib64/libc.so.6
>>        No symbol table info available.
>>        #3  0x0000003040071ce2 in _int_free () from /lib64/libc.so.6
>>        No symbol table info available.
>>        #4  0x000000304007590c in free () from /lib64/libc.so.6
>>        No symbol table info available.
>>        #5  0x00002aaaaaaadcc9 in __socket_ioq_entry_free
>>        (entry=0x2aaab001da30) at socket.c:331
>>        No locals.
>>        #6  0x00002aaaaaaaf1c9 in __socket_ioq_churn_entry (this=<value
>>        optimized out>, entry=0x2aaab001da30) at socket.c:368
>>               ret = 0
>>               __PRETTY_FUNCTION__ = "__socket_ioq_churn_entry"
>>        #7  0x00002aaaaaaaf8ec in socket_submit (this=0xae11a70,
>>        buf=0x2aaab00378c0 "", len=340, vector=0x0, count=<value
>>        optimized out>,
>>           iobref=<value optimized out>) at socket.c:1281
>>               priv = (socket_private_t *) 0xae11ec0
>>               ret = <value optimized out>
>>               need_poll_out = <value optimized out>
>>               entry = (struct ioq *) 0x2aaab001da30
>>               ctx = (glusterfs_ctx_t *) 0xae02010
>>               __FUNCTION__ = "socket_submit"
>>        #8  0x00002b2e7c775bd3 in protocol_client_xfer
>>        (frame=0x2aaab0030ab0, this=0xae0ab00, trans=0xae11a70, type=1,
>>        op=40, hdr=0x2aaab00378c0, hdrlen=340,
>>           vector=0x0, count=0, iobref=0x0) at client-protocol.c:636
>>               conf = (client_conf_t *) 0xae113c0
>>               conn = (client_connection_t *) 0xae11f90
>>               callid = 309893
>>               ret = <value optimized out>
>>               rsphdr = {callid = 0, type = 0, op = 0, size = 0, {req =
>>        {pid = 0, uid = 0, gid = 0}, rsp = {op_ret = 0, op_errno = 0}}}
>>               forget = {hdr = 0x0, hdrlen = 0, frame = 0x0}
>>        #9  0x00002b2e7c77bc1a in client_xattrop (frame=0x2aaab0030ab0,
>>        this=0xae0ab00, loc=0x2aaab4004238, flags=GF_XATTROP_ADD_ARRAY,
>>        dict=0x2aaab4031fc0)
>>           at client-protocol.c:1922
>>               hdr = (gf_hdr_common_t *) 0x101010101010101
>>               req = <value optimized out>
>>               dict_len = 242
>>               ret = <value optimized out>
>>               pathlen = <value optimized out>
>>               ino = 13893685
>>               __FUNCTION__ = "client_xattrop"
>>        #10 0x00002b2e7c9a2cda in afr_changelog_pre_op
>>        (frame=0x2aaab401ea70, this=0xae0b280) at afr-transaction.c:782
>>               _new = (call_frame_t *) 0x6943
>>               priv = (afr_private_t *) 0xae13740
>>               ret = <value optimized out>
>>               call_count = 1
>>               xattr = (dict_t *) 0x2aaab4031fc0
>>               local = (afr_local_t *) 0x2aaab4004200
>>               __FUNCTION__ = "afr_changelog_pre_op"
>>        #11 0x00002b2e7c9a2f31 in afr_lock_rec (frame=0x2aaab401ea70,
>>        this=0xae0b280, child_index=1) at afr-transaction.c:979
>>               local = (afr_local_t *) 0x2aaab4004200
>>               priv = (afr_private_t *) 0xae13740
>>               flock = {l_type = 1, l_whence = 12098, l_start = 0, l_len
>>        = 0, l_pid = 792866320}
>>               lower = <value optimized out>
>>               higher = <value optimized out>
>>               lower_name = <value optimized out>
>>               higher_name = <value optimized out>
>>               __FUNCTION__ = "afr_lock_rec"
>>        #12 0x00002b2e7c9a36a8 in afr_lock_cbk (frame=0x2aaab401ea70,
>>        cookie=<value optimized out>, this=0xae0b280, op_ret=0,
>>        op_errno=0) at afr-transaction.c:906
>>               local = (afr_local_t *) 0x2aaab4004200
>>               child_index = 0
>>               call_count = 0
>>               __FUNCTION__ = "afr_lock_cbk"
>>        #13 0x00002b2e7b6a75f0 in default_inodelk_cbk (frame=<value
>>        optimized out>, cookie=<value optimized out>, this=<value
>>        optimized out>, op_ret=-1,
>>           op_errno=128) at defaults.c:1199
>>               fn = (ret_fn_t) 0x101010101010101
>>               _parent = (call_frame_t *) 0x6943
>>        #14 0x00002b2e7c358182 in pl_inodelk (frame=0x2aaab4034c10,
>>        this=0xae08870, volume=<value optimized out>,
>>        loc=0x2aaab4032170, cmd=7, flock=0x0)
>>           at internal.c:194
>>               fn = (ret_fn_t) 0x101010101010101
>>               _parent = (call_frame_t *) 0x6943
>>               op_ret = -1
>>               op_errno = 128
>>               ret = 0
>>               can_block = 1
>>               transport = <value optimized out>
>>               client_pid = 1
>>               pinode = (pl_inode_t *) 0x2aaab0032810
>>               reqlock = (posix_lock_t *) 0x2aaab4032170
>>               __FUNCTION__ = "pl_inodelk"
>>        #15 0x00002b2e7b6a815c in default_inodelk (frame=0x2aaab4017660,
>>        this=0xae09080, volume=0xae0b260 "replicate",
>>        loc=0x2aaab4004238, cmd=7,
>>           lock=0x7fff2f422f80) at defaults.c:1209
>>               _new = (call_frame_t *) 0x6943
>>        #16 0x00002b2e7c9a33ba in afr_lock_rec (frame=0x2aaab401ea70,
>>        this=0xae0b280, child_index=0) at afr-transaction.c:1006
>>               _new = (call_frame_t *) 0x6943
>>               local = (afr_local_t *) 0x2aaab4004200
>>               priv = (afr_private_t *) 0xae13740
>>               flock = {l_type = 1, l_whence = 1, l_start = 0, l_len =
>>        0, l_pid = 1074216160}
>>               lower = <value optimized out>
>>               higher = <value optimized out>
>>               lower_name = <value optimized out>
>>               higher_name = <value optimized out>
>>               __FUNCTION__ = "afr_lock_rec"
>>        #17 0x00002b2e7c9a35c2 in afr_transaction (frame=0x2aaab401ea70,
>>        this=0xae0b280, type=AFR_DATA_TRANSACTION) at
>>        afr-transaction.c:1170
>>               local = (afr_local_t *) 0x2aaab4004200
>>               priv = (afr_private_t *) 0xae13740
>>        #18 0x00002b2e7c9a07cd in afr_truncate (frame=0x2aaab403ac30,
>>        this=0xae0b280, loc=0x2aaab40174a0, offset=0) at
>>        afr-inode-write.c:1224
>>               transaction_frame = (call_frame_t *) 0x2aaab401ea70
>>               op_errno = 107
>>               __FUNCTION__ = "afr_truncate"
>>        #19 0x00002b2e7cbc0969 in server_truncate_resume
>>        (frame=0x2aaab403acc0, this=<value optimized out>,
>>        loc=0x2aaab40174a0, offset=0) at server-protocol.c:4243
>>               _new = (call_frame_t *) 0x6943
>>               __FUNCTION__ = "server_truncate_resume"
>>        #20 0x00002b2e7b6b06f7 in call_resume (stub=0x2aaab4017470) at
>>        call-stub.c:2384
>>               __FUNCTION__ = "call_resume"
>>        #21 0x00002b2e7cbc4125 in server_truncate (frame=0x2aaab403acc0,
>>        bound_xl=<value optimized out>, hdr=<value optimized out>,
>>        hdrlen=<value optimized out>,
>>           iobuf=<value optimized out>) at server-protocol.c:4291
>>               truncate_stub = (call_stub_t *) 0x0
>>               state = (server_state_t *) 0x2aaab4032000
>>        #22 0x00002b2e7cbbfb20 in protocol_server_pollin
>>        (this=0xae0bdf0, trans=0xae17960) at server-protocol.c:7735
>>               hdr = 0x2aaab4017330 ""
>>               hdrlen = 98
>>               ret = 0
>>               iobuf = (struct iobuf *) 0x0
>>        #23 0x00002b2e7cbbfbfb in notify (this=0xae0bdf0, event=<value
>>        optimized out>, data=0x6) at server-protocol.c:7791
>>               ret = <value optimized out>
>>               trans = (transport_t *) 0x6943
>>               peerinfo = (peer_info_t *) 0xae179d0
>>               myinfo = (peer_info_t *) 0xae17ac0
>>               __FUNCTION__ = "notify"
>>        #24 0x00002aaaaaaafb43 in socket_event_handler (fd=<value
>>        optimized out>, idx=11, data=0xae17960, poll_in=1, poll_out=0,
>>        poll_err=0) at socket.c:813
>>               this = (transport_t *) 0x6943
>>               priv = (socket_private_t *) 0xae16ce0
>>               ret = 0
>>        #25 0x00002b2e7b6ba2a5 in event_dispatch_epoll
>>        (event_pool=0xae02300) at event.c:804
>>               events = (struct epoll_event *) 0xae15990
>>               i = 0
>>               ret = 1
>>               __FUNCTION__ = "event_dispatch_epoll"
>>        #26 0x0000000000403899 in main ()
>>        No symbol table info available.
>>
>>
>>
>>
>>
>>        _______________________________________________
>>        Gluster-devel mailing list
>>        Gluster-devel at nongnu.org
>>        http://lists.nongnu.org/mailman/listinfo/gluster-devel
>>
>>
>>
>>
>>
>>
>> --
>> Raghavendra G
>>
>>
>


-- 
Raghavendra G