[Gluster-devel] 32bit client and 64bit server.

Raghavendra G raghavendra at gluster.com
Mon Mar 1 04:24:12 UTC 2010


Hi Mike,

Do you observe the same issue with vanilla 3.0.2 (without patch 2659)? Also
please find the comments inlined.

On Sat, Feb 27, 2010 at 4:40 AM, Mike Terzo <mterzo at gmail.com> wrote:

> I had this on the gluster-users list, but it really seems to be more
> of a development topic.
>
> On Fri, Feb 26, 2010 at 5:22 PM, Mike Terzo <mterzo at gmail.com> wrote:
> > I found this in the archives:
> > http://gluster.org/pipermail/gluster-users/20081027/000523.html
> >
> > I don't see any follow-ups with any sort of solution. I'm running
> > gluster 3.0.2 with patch 2659 with a 64bit client and 64bit server
> > without any issues.  I brought up another box running a 32bit client
> > pointing to 2 different 64bit servers and hit the same issues listed
> > above. Both client boxes run kernel 2.6.30; neither has any patches
> > applied.
> >
> > Here's my client output
> >
> > pending frames:
> > frame : type(1) op(LOOKUP)
> > frame : type(1) op(STAT)
> > frame : type(1) op(STAT)
> >
> > patchset: v3.0.2
> > signal received: 11
> > time of crash: 2010-02-26 16:01:08
> > configuration details:
> > argp 1
> > backtrace 1
> > dlfcn 1
> > fdatasync 1
> > libpthread 1
> > llistxattr 1
> > setfsid 1
> > spinlock 1
> > epoll.h 1
> > xattr.h 1
> > st_atim.tv_nsec 1
> > package-string: glusterfs 3.0.2
> > [0xffffe400]
> > /usr/lib/libglusterfs.so.0[0xb7ffa8a4]
> > /usr/lib/libglusterfs.so.0(inode_unref+0x39)[0xb7ffb571]
> > /usr/lib/libglusterfs.so.0(loc_wipe+0x25)[0xb7feec52]
> > /usr/lib/libglusterfs.so.0(call_stub_destroy+0x786)[0xb800208e]
> > /usr/lib/libglusterfs.so.0(call_resume+0x73)[0xb80022a8]
> >
> /usr/lib/glusterfs/3.0.2/xlator/performance/io-threads.so(iot_worker_unordered+0x20)[0xb75e0ae3]
> > /lib/tls/libpthread.so.0[0xb7fcd1ce]
> > /lib/tls/libc.so.6(__clone+0x5e)[0xb7f6290e]
> > ---------
> > Segmentation fault (core dumped)
> >
> > I've turned off io-threads and the process hasn't cored again.  Is
> > it general practice not to mix 32bit clients and 64bit servers?
> >
> > Thanks,
> > --mike terzo
> >
>
> With io-threads not loaded, I got a much nicer-looking stack trace.
>
>
> pending frames:
> frame : type(1) op(LOOKUP)
> frame : type(1) op(STAT)
> frame : type(1) op(STAT)
>
> patchset: v3.0.2
> signal received: 11
> time of crash: 2010-02-26 18:28:58
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.0.2
> [0xffffe400]
> /usr/lib/libglusterfs.so.0[0xb7f0b8a4]
> /usr/lib/libglusterfs.so.0(inode_unref+0x39)[0xb7f0c571]
> /usr/lib/libglusterfs.so.0(loc_wipe+0x25)[0xb7effc52]
> /usr/lib/glusterfs/3.0.2/xlator/mount/fuse.so[0xb74db3b0]
> /usr/lib/glusterfs/3.0.2/xlator/mount/fuse.so[0xb74e4c19]
> /usr/lib/libglusterfs.so.0[0xb7f0261e]
>
> /usr/lib/glusterfs/3.0.2/xlator/performance/write-behind.so(wb_stat_cbk+0x179)[0xb74fe035]
> /usr/lib/libglusterfs.so.0[0xb7f0261e]
>
> /usr/lib/glusterfs/3.0.2/xlator/cluster/replicate.so(afr_stat_cbk+0xb9)[0xb751f82a]
>
> /usr/lib/glusterfs/3.0.2/xlator/protocol/client.so(client_stat_cbk+0x1bd)[0xb755b525]
>
> /usr/lib/glusterfs/3.0.2/xlator/protocol/client.so(protocol_client_interpret+0x1e1)[0xb754b85d]
>
> /usr/lib/glusterfs/3.0.2/xlator/protocol/client.so(protocol_client_pollin+0xbe)[0xb754c0af]
>
> /usr/lib/glusterfs/3.0.2/xlator/protocol/client.so(notify+0x204)[0xb754f918]
> /usr/lib/libglusterfs.so.0(xlator_notify+0x39)[0xb7effae4]
>
> /usr/lib/glusterfs/3.0.2/transport/socket.so(socket_event_poll_in+0x39)[0xb6cd1f96]
>
> /usr/lib/glusterfs/3.0.2/transport/socket.so(socket_event_handler+0x52)[0xb6cd3c6e]
> /usr/lib/libglusterfs.so.0[0xb7f1962e]
> /usr/lib/libglusterfs.so.0(event_dispatch+0x21)[0xb7f199ae]
> glusterfs(main+0xc92)[0x804bcc7]
> /lib/tls/libc.so.6(__libc_start_main+0xd4)[0xb7dbeea4]
> glusterfs[0x8049e21]
>
> here's the backtrace of the core:
> (gdb) bt
> #0  __inode_invalidate (inode=0x805dd14) at inode.c:993
> #1  0xb7f0b8a4 in inode_table_prune (table=0x805dd18) at inode.c:1022
> #2  0xb7f0c571 in inode_unref (inode=0x805d858) at inode.c:399
> #3  0xb7effc52 in loc_wipe (loc=0xb4889aa4) at xlator.c:995
> #4  0xb74db3b0 in free_state (state=0xb4889a98) at fuse-bridge.c:182
> #5  0xb74e4c19 in fuse_attr_cbk (frame=0xb4892b7c, cookie=0xb04162e0,
> this=0x8052248, op_ret=0, op_errno=0,
>    buf=0xbfd15c9c) at fuse-bridge.c:731
> #6  0xb7f0261e in default_stat_cbk (frame=0xb04162e0,
> cookie=0xb0667c48, this=0x8058310, op_ret=0, op_errno=0, buf=0x0)
>    at defaults.c:88
> #7  0xb74fe035 in wb_stat_cbk (frame=0xb0667c48, cookie=0xb046bf00,
> this=0x8057dd8, op_ret=0, op_errno=0, buf=0x0)
>    at write-behind.c:543
> #8  0xb7f0261e in default_stat_cbk (frame=0xb046bf00,
> cookie=0xb06bdd28, this=0x8057828, op_ret=0, op_errno=0, buf=0x0)
>    at defaults.c:88
> #9  0xb751f82a in afr_stat_cbk (frame=0xb06bdd28, cookie=0x0,
> this=0x0, op_ret=0, op_errno=0, buf=0xbfd15c9c)
>    at afr-inode-read.c:227
> #10 0xb755b525 in client_stat_cbk (frame=0xb04b0b58, hdr=0xb04c5f00,
> hdrlen=188, iobuf=0x0) at client-protocol.c:4105
> #11 0xb754b85d in protocol_client_interpret (this=0x0,
> trans=0x805a1d8, hdr_p=0xb04c5f00 "", hdrlen=188, iobuf=0x0)
>    at client-protocol.c:6511
> #12 0xb754c0af in protocol_client_pollin (this=0x0, trans=0x805a1d8)
> at client-protocol.c:6809
> #13 0xb754f918 in notify (this=0x8056d18, event=2, data=0x805a1d8) at
> client-protocol.c:6928
> #14 0xb7effae4 in xlator_notify (xl=0x8056d18, event=0, data=0x0) at
> xlator.c:928
> #15 0xb6cd1f96 in socket_event_poll_in (this=0x805a1d8) at socket.c:729
> #16 0xb6cd3c6e in socket_event_handler (fd=8, idx=0, data=0x805a1d8,
> poll_in=1, poll_out=0, poll_err=0) at socket.c:829
> #17 0xb7f1962e in event_dispatch_epoll (event_pool=0x8051e48) at
> event.c:804
> #18 0xb7f199ae in event_dispatch (event_pool=0x1ece) at event.c:975
> #19 0x0804bcc7 in main (argc=5, argv=0xbfd16b54) at glusterfsd.c:1413
>
> and the inode's table pointer is way wrong:
> (gdb) p * inode
> $4 = {table = 0x78, lock = 1, nlookup = 33870112096256, generation =
> 4294967296, in_attic = 0, ref = 14057,
>  ino = 578108920368060400, st_mode = 134554184, fd_list = {next =
> 0x539, prev = 0x805deb8}, dentry_list = {
>    next = 0x8079608, prev = 0xb489c36c}, hash = {next = 0x805db44,
> prev = 0x24bd9}, list = {next = 0x805dd58,
>    prev = 0x805dd58}, _ctx = 0xfffffac4}
>
> (gdb) up
> #1  0xb7f0b8a4 in inode_table_prune (table=0x805dd18) at inode.c:1022
> 1022    in inode.c
> (gdb) p entry
> $5 = (inode_t *) 0x1ece
> (gdb) p *entry
> Cannot access memory at address 0x1ece
> (gdb) p table
> $6 = (inode_table_t *) 0x805dd18
> (gdb) p *table
> $7 = {lock = {__m_reserved = 1, __m_count = 0, __m_owner = 0x1ece,
> __m_kind = 0, __m_lock = {__status = 1,
>      __spinlock = 0}}, hashsize = 14057, name = 0x805d7f0
> "fuse/inode", root = 0x805db00, xl = 0x8052248,
>  lru_limit = 1337, inode_hash = 0x805deb8, name_hash = 0x8079608,
> active = {next = 0xb489c36c, prev = 0x805db44},
>  active_size = 150489, lru = {next = 0x805dd58, prev = 0x805dd58},
> lru_size = 4294965956,
>  lru_callback = 0xb7f0c855 <__inode_invalidate>, purge = {next =
> 0x805dd68, prev = 0x805dd68}, purge_size = 0, inval = {
>    next = 0xb489c864, prev = 0xb04c7234}, inval_size = 301000, attic
> = {next = 0xb633bf44, prev = 0x8095a64},
>  attic_size = 16}
>
> Looking at inode.c, there's a macro that gets called right before this:
>
> #define list_entry(ptr, type, member)                  \
>    ((type *)((char *)(ptr)-(unsigned long)(&((type *)0)->member)))
>
> (unsigned long) gets used here, which is bad.
>
> (gdb) p *((inode_t *) 0x805dd58)
> $20 = {table = 0x805dd58, lock = 134602072, nlookup =
> 13254313975044111044, generation = 578111566067916136,
>  in_attic = 0, ref = 3028928612, ino = 1292788113895988, st_mode =
> 3056844612, fd_list = {next = 0x8095a64,
>    prev = 0x10}, dentry_list = {next = 0x21, prev = 0xb42dff0c}, hash
> = {next = 0xb42dff0c, prev = 0xb41bcdf8}, list = {
>    next = 0xb5b7f850, prev = 0xb42dfed8}, _ctx = 0x805ddb0}
>
> printf("size of: %d\n", sizeof(unsigned long));
> on my 32bit system:
> size of: 4
> 64bit system:
> size of: 8
>
> looks like a type conversion is breaking everything.
>

It's not an issue with the type cast. Pointers and unsigned long have the
same size on any given system. Here we are casting a pointer on a 32 bit
system to an unsigned long on the same 32 bit system, so both are the same
size and the cast cannot truncate the offset.


>
> --mike terzo
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>


regards,
-- 
Raghavendra G