[Gluster-devel] glusterfsd 1.3.0pre2.2 segfault

Mon Mar 5 10:08:06 UTC 2007

Hi,

I've just downloaded the 1.3.0pre2.2 tarball to play with, and
glusterfsd segfaults when I try to mount it from a client.

The test setup consists of a 64-bit Debian sarge machine with a clean
install of glusterfs (server only); the client is a 32-bit Ubuntu 6.10
machine with both the client and server installed.

Here's my server.vol file:

volume brick
        type storage/posix
        option directory /scratch/glusterfs/data
        option debug on
end-volume

volume server
        type protocol/server
        option transport-type tcp/server
        option listen-port 6996
        option bind-address 134.226.114.115
        subvolumes brick
        option auth.ip.brick.allow 134.226.*
        option debug on
end-volume

And my client.vol file:

volume client0
  type protocol/client
  option transport-type tcp/client
  option remote-host 134.226.114.115
  option remote-port 6996
  option remote-subvolume brick
  option debug on
end-volume

volume bricks
  type cluster/unify
  subvolumes client0
  option debug on
  option scheduler rr
end-volume

### Add writebehind feature
volume writebehind
  type performance/write-behind
  option aggregate-size 131072 # unit in bytes
  subvolumes bricks
end-volume

### Add readahead feature
volume readahead
  type performance/read-ahead
  option page-size 65536     # unit in bytes
  option page-count 16       # cache per file  = (page-count x page-size)
  subvolumes writebehind
end-volume
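
For anyone who wants to sanity-check volfiles like the two above before starting the daemons, here is a minimal sketch. It is not part of glusterfs: the helper names `parse_volfile` and `check_volfile` are hypothetical, and it assumes that a `subvolumes` line may only name volumes defined earlier in the same file. It does not check cross-file references such as `remote-subvolume`.

```python
def parse_volfile(text):
    """Return a list of volume dicts (name, type, subvolumes) in file order."""
    volumes = []
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        words = line.split()
        if words[0] == "volume" and len(words) == 2:
            current = {"name": words[1], "type": None, "subvolumes": []}
        elif words[0] == "end-volume" and current is not None:
            volumes.append(current)
            current = None
        elif current is not None:
            if words[0] == "type" and len(words) == 2:
                current["type"] = words[1]
            elif words[0] == "subvolumes":
                current["subvolumes"] = words[1:]
    return volumes


def check_volfile(text):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    seen = set()
    for vol in parse_volfile(text):
        if vol["type"] is None:
            problems.append(f"volume {vol['name']}: no type line")
        for sub in vol["subvolumes"]:
            # Assumption: subvolumes must already be defined above this volume.
            if sub not in seen:
                problems.append(
                    f"volume {vol['name']}: subvolume {sub} is not defined earlier"
                )
        seen.add(vol["name"])
    return problems


# Shortened excerpt of the client volfile, used only as sample input.
SAMPLE = """
volume client0
  type protocol/client
  option remote-subvolume brick
end-volume

volume bricks
  type cluster/unify
  subvolumes client0
end-volume
"""

if __name__ == "__main__":
    print(check_volfile(SAMPLE) or "no problems found")
```

Both volfiles in this report pass this particular check, so it does not explain the crash; it only rules out one class of configuration mistake.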

The client was launched with:

glusterfs -f /opt/glusterfs/client.vol -N -l /dev/stdout -L DEBUG /mnt/

There were no errors on the client at all, it just hung. The server,
however, segfaults. It was launched with:

gdb --core=core  --args glusterfsd -f /scratch/glusterfs/server.vol -N -l /dev/stdout -L DEBUG

And here's the backtrace from gdb:

[Mar 05 09:54:10] [DEBUG/posix.c:1156/init()] posix:Directory: /scratch/glusterfs/data
[Mar 05 09:55:35] [DEBUG/tcp-server.c:148/tcp_server_notify()] tcp/server:Registering socket (7) for new transport object of 134.226.112.163

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 182900830352 (LWP 5461)]
0x0000007fbfffee70 in ?? ()
(gdb) backtrace
#0  0x0000007fbfffee70 in ?? ()
#1  0x0000002a95673720 in xlator_foreach (this=0x507860, fn=0x7fbfffee70, data=0x7fbfffee90) at xlator.c:179
#2  0x0000002a95ce332e in get_xlator_by_name (some_xl=0x507860, name=0x7fbfffee90 "|\233P") at proto-srv.c:2366
#3  0x0000002a95ce33b1 in mop_setvolume (frame=0x509e10, bound_xl=0x7fbfffee90, params=0x509bf0) at proto-srv.c:2395
#4  0x0000002a95ce3da3 in proto_srv_interpret (trans=0x502320, blk=0x509b00) at proto-srv.c:2771
#5  0x0000002a95ce4129 in proto_srv_notify (this=0x507860, trans=0x502320, event=1) at proto-srv.c:2898
#6  0x0000002a95676953 in transport_notify (this=0x0, event=-1073746288) at transport.c:146
#7  0x0000002a95676d0c in epoll_notify (eevent=5273696, data=0x502320) at epoll.c:43
#8  0x0000002a95676eaa in epoll_iteration () at epoll.c:123
#9  0x0000002a956769fb in transport_poll () at transport.c:230
#10 0x000000000040107f in main (argc=8, argv=0x7fbffff348) at glusterfsd.c:217
(gdb) 
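
One detail in the backtrace may be worth noting: the faulting address in frame #0 (0x7fbfffee70) is the same value as the `fn` argument of `xlator_foreach` in frame #1, and it lies within about 1.2 KB of the `argv` pointer in frame #10. That suggests the callback pointer refers to stack memory rather than to code, although the cause is not established here. The sketch below shows one way to spot this pattern automatically. The helper names are hypothetical, and the regular expressions assume gdb's default frame format as it appears above.

```python
import re

# "#N  0xADDR in FUNC (ARGS)" as printed by gdb's backtrace command.
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-fA-F]+) in (\S+) \((.*?)\)")
# Only hexadecimal (pointer-like) arguments are collected.
ARG_RE = re.compile(r"(\w+)=(0x[0-9a-fA-F]+)")


def parse_frames(backtrace):
    """Extract frame number, PC, function name and hex arguments per frame."""
    frames = []
    for line in backtrace.splitlines():
        m = FRAME_RE.match(line.strip())
        if not m:
            continue
        args = {name: int(val, 16) for name, val in ARG_RE.findall(m.group(4))}
        frames.append(
            {
                "num": int(m.group(1)),
                "pc": int(m.group(2), 16),
                "func": m.group(3),
                "args": args,
            }
        )
    return frames


def pointer_args_matching_pc(frames):
    """List (function, argument) pairs whose value equals the first frame's PC."""
    if not frames:
        return []
    pc = frames[0]["pc"]
    hits = []
    for frame in frames[1:]:
        for name, value in frame["args"].items():
            if value == pc:
                hits.append((frame["func"], name))
    return hits


# Three frames copied from the backtrace above.
SAMPLE_BACKTRACE = """
#0  0x0000007fbfffee70 in ?? ()
#1  0x0000002a95673720 in xlator_foreach (this=0x507860, fn=0x7fbfffee70, data=0x7fbfffee90) at xlator.c:179
#10 0x000000000040107f in main (argc=8, argv=0x7fbffff348) at glusterfsd.c:217
"""

if __name__ == "__main__":
    frames = parse_frames(SAMPLE_BACKTRACE)
    print(pointer_args_matching_pc(frames))
    # Distance in bytes between the faulting PC and argv, which lives on the stack.
    print(frames[-1]["args"]["argv"] - frames[0]["pc"])
```

A match like this only says that the program jumped to an address it was handed as a callback; finding out why that address points at the stack would need a look at the caller in proto-srv.c.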

Pretty much the same thing happened to me with the previous pre1 and pre2
releases when I tried them a few days ago and last week.

I hope I am not doing anything silly and not wasting anyone's time.

Thanks,
Jimmy.

-- 
Jimmy Tang
Trinity Centre for High Performance Computing,
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
http://www.tchpc.tcd.ie/ | http://www.tchpc.tcd.ie/~jtang