[Gluster-users] CRASH on: ib-verbs.c:2020:ib_verbs_event_handler

Christian Marnitz christian.marnitz at icesmedia.de
Fri Aug 21 08:03:23 UTC 2009


Hi,

I had TRACE logging enabled on one client because of the other problem with hanging mountpoints. Here is the log:

[2009-08-21 00:22:44] T [fuse-bridge.c:1567:fuse_write] glusterfs-fuse: 87346: WRITE (0x7ffc7c215e10, size=358, offset=4068266)
[2009-08-21 00:22:44] T [fuse-bridge.c:1528:fuse_writev_cbk] glusterfs-fuse: 87346: WRITE => 358/358,4068266/4068624
[2009-08-21 00:22:44] T [fuse-bridge.c:1592:fuse_flush] glusterfs-fuse: 87347: FLUSH 0x7ffc7c215e10
[2009-08-21 00:22:44] T [fuse-bridge.c:855:fuse_err_cbk] glusterfs-fuse: 87347: FLUSH() ERR => 0
[2009-08-21 00:22:44] T [fuse-bridge.c:1610:fuse_release] glusterfs-fuse: 87348: RELEASE 0x7ffc7c215e10
[2009-08-21 00:22:54] E [ib-verbs.c:2020:ib_verbs_event_handler] transport/ib-verbs: remote3: pollin received on tcp socket (peer: 10.10.10.12:6997) after handshake is complete
[2009-08-21 00:22:54] D [ib-verbs.c:1889:ib_verbs_handshake_pollerr] transport/ib-verbs: remote3: peer disconnected, cleaning up
[2009-08-21 00:22:54] T [client-protocol.c:5694:protocol_client_cleanup] remote3: cleaning up state in transport object 0x623230
[2009-08-21 00:22:54] T [client-protocol.c:5636:client_protocol_reconnect] remote3: attempting reconnect
[2009-08-21 00:22:54] D [name.c:146:client_fill_address_family] remote3: address-family not specified, guessing it to be inet/inet6
[2009-08-21 00:22:54] D [name.c:218:af_inet_client_get_remote_sockaddr] remote3: option remote-port missing in volume remote3. Defaulting to 6997
[2009-08-21 00:22:54] T [common-utils.c:85:gf_resolve_ip6] resolver: flushing DNS cache
[2009-08-21 00:22:54] T [common-utils.c:92:gf_resolve_ip6] resolver: DNS cache not present, freshly probing hostname: 10.10.10.12
[2009-08-21 00:22:54] T [common-utils.c:129:gf_resolve_ip6] resolver: returning ip-10.10.10.12 (port-6997) for hostname: 10.10.10.12 and port: 6997
[2009-08-21 00:22:54] T [ib-verbs.c:2102:ib_verbs_connect] remote3: socket fd = 15
[2009-08-21 00:22:54] E [ib-verbs.c:2020:ib_verbs_event_handler] transport/ib-verbs: remote3: pollin received on tcp socket (peer: 10.10.10.12:6997) after handshake is complete
[2009-08-21 00:22:54] D [ib-verbs.c:1889:ib_verbs_handshake_pollerr] transport/ib-verbs: remote3: peer disconnected, cleaning up
[2009-08-21 00:22:54] T [client-protocol.c:5694:protocol_client_cleanup] remote3: cleaning up state in transport object 0x623840
[2009-08-21 00:22:54] T [client-protocol.c:5636:client_protocol_reconnect] remote3: attempting reconnect
[2009-08-21 00:22:54] D [name.c:146:client_fill_address_family] remote3: address-family not specified, guessing it to be inet/inet6
[2009-08-21 00:22:54] D [name.c:218:af_inet_client_get_remote_sockaddr] remote3: option remote-port missing in volume remote3. Defaulting to 6997
[2009-08-21 00:22:54] T [common-utils.c:85:gf_resolve_ip6] resolver: flushing DNS cache
[2009-08-21 00:22:54] T [common-utils.c:92:gf_resolve_ip6] resolver: DNS cache not present, freshly probing hostname: 10.10.10.12
[2009-08-21 00:22:54] T [common-utils.c:129:gf_resolve_ip6] resolver: returning ip-10.10.10.12 (port-6997) for hostname: 10.10.10.12 and port: 6997
[2009-08-21 00:22:54] T [ib-verbs.c:2102:ib_verbs_connect] remote3: socket fd = 16
[2009-08-21 00:22:54] N [client-protocol.c:6246:notify] remote3: disconnected
[2009-08-21 00:22:54] E [ib-verbs.c:1969:tcp_connect_finish] remote3: tcp connect to 10.10.10.12:6997 failed (Connection refused)
[2009-08-21 00:22:54] D [ib-verbs.c:1889:ib_verbs_handshake_pollerr] transport/ib-verbs: remote3: peer disconnected, cleaning up
[2009-08-21 00:22:54] T [client-protocol.c:5694:protocol_client_cleanup] remote3: cleaning up state in transport object 0x623230
[2009-08-21 00:22:56] T [fuse-bridge.c:1872:fuse_statfs] glusterfs-fuse: 87349: STATFS
[2009-08-21 00:22:57] E [ib-verbs.c:1969:tcp_connect_finish] remote3: tcp connect to 10.10.10.12:6997 failed (Connection refused)
[2009-08-21 00:22:57] D [ib-verbs.c:1889:ib_verbs_handshake_pollerr] transport/ib-verbs: remote3: peer disconnected, cleaning up
[2009-08-21 00:22:57] T [client-protocol.c:5694:protocol_client_cleanup] remote3: cleaning up state in transport object 0x623840
[2009-08-21 00:23:05] T [client-protocol.c:5636:client_protocol_reconnect] remote3: attempting reconnect
[2009-08-21 00:23:05] D [name.c:146:client_fill_address_family] remote3: address-family not specified, guessing it to be inet/inet6
[2009-08-21 00:23:05] D [name.c:218:af_inet_client_get_remote_sockaddr] remote3: option remote-port missing in volume remote3. Defaulting to 6997
[2009-08-21 00:23:05] T [common-utils.c:85:gf_resolve_ip6] resolver: flushing DNS cache
[2009-08-21 00:23:05] T [common-utils.c:92:gf_resolve_ip6] resolver: DNS cache not present, freshly probing hostname: 10.10.10.12
[2009-08-21 00:23:05] T [common-utils.c:129:gf_resolve_ip6] resolver: returning ip-10.10.10.12 (port-6997) for hostname: 10.10.10.12 and port: 6997
[2009-08-21 00:23:05] T [ib-verbs.c:2102:ib_verbs_connect] remote3: socket fd = 15
[2009-08-21 00:23:05] T [client-protocol.c:5636:client_protocol_reconnect] remote3: attempting reconnect
[2009-08-21 00:23:05] D [name.c:146:client_fill_address_family] remote3: address-family not specified, guessing it to be inet/inet6

.....

[2009-08-21 09:12:07] T [client-protocol.c:5636:client_protocol_reconnect] remote3: attempting reconnect
[2009-08-21 09:12:07] D [name.c:146:client_fill_address_family] remote3: address-family not specified, guessing it to be inet/inet6
[2009-08-21 09:12:07] D [name.c:218:af_inet_client_get_remote_sockaddr] remote3: option remote-port missing in volume remote3. Defaulting to 6997
[2009-08-21 09:12:07] T [common-utils.c:85:gf_resolve_ip6] resolver: flushing DNS cache
[2009-08-21 09:12:07] T [common-utils.c:92:gf_resolve_ip6] resolver: DNS cache not present, freshly probing hostname: 10.10.10.12
[2009-08-21 09:12:07] T [common-utils.c:129:gf_resolve_ip6] resolver: returning ip-10.10.10.12 (port-6997) for hostname: 10.10.10.12 and port: 6997
[2009-08-21 09:12:07] T [ib-verbs.c:2102:ib_verbs_connect] remote3: socket fd = 15
[2009-08-21 09:12:07] T [client-protocol.c:5636:client_protocol_reconnect] remote3: attempting reconnect
[2009-08-21 09:12:07] D [name.c:146:client_fill_address_family] remote3: address-family not specified, guessing it to be inet/inet6
[2009-08-21 09:12:07] D [name.c:218:af_inet_client_get_remote_sockaddr] remote3: option remote-port missing in volume remote3. Defaulting to 6997
[2009-08-21 09:12:07] T [common-utils.c:85:gf_resolve_ip6] resolver: flushing DNS cache
[2009-08-21 09:12:07] T [common-utils.c:92:gf_resolve_ip6] resolver: DNS cache not present, freshly probing hostname: 10.10.10.12
[2009-08-21 09:12:07] T [common-utils.c:129:gf_resolve_ip6] resolver: returning ip-10.10.10.12 (port-6997) for hostname: 10.10.10.12 and port: 6997
[2009-08-21 09:12:07] T [ib-verbs.c:2102:ib_verbs_connect] remote3: socket fd = 16
[2009-08-21 09:12:07] T [ib-verbs.c:1714:ib_verbs_handshake_pollin] transport/ib-verbs: remote3: transacted recv_size=524288 send_size=524288
[2009-08-21 09:12:07] D [client-protocol.c:6294:notify] remote3: got GF_EVENT_CHILD_UP
[2009-08-21 09:12:07] T [ib-verbs.c:1714:ib_verbs_handshake_pollin] transport/ib-verbs: remote3: transacted recv_size=524288 send_size=524288
[2009-08-21 09:12:07] D [client-protocol.c:6294:notify] remote3: got GF_EVENT_CHILD_UP
[2009-08-21 09:12:07] N [client-protocol.c:5559:client_setvolume_cbk] remote3: Connected to 10.10.10.12:6997, attached to remote volume 'brick'.
[2009-08-21 09:12:07] N [client-protocol.c:5559:client_setvolume_cbk] remote3: Connected to 10.10.10.12:6997, attached to remote volume 'brick'.
[2009-08-21 09:12:17] T [client-protocol.c:5645:client_protocol_reconnect] remote3: breaking reconnect chain
[2009-08-21 09:12:17] T [client-protocol.c:5645:client_protocol_reconnect] remote3: breaking reconnect chain


-> ran df -h -> the mountpoint hangs

[2009-08-21 09:54:58] T [fuse-bridge.c:1872:fuse_statfs] glusterfs-fuse: 113926: STATFS


-> killed the process

[2009-08-21 09:55:18] W [glusterfsd.c:842:cleanup_and_exit] glusterfs: shutting down
[2009-08-21 09:55:18] N [fuse-bridge.c:2833:fini] fuse: Unmounting '/mnt/glusterfs/fast/st00008/'.
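
To get an overview of a long TRACE log like this one (for example, how often the client tried to reconnect), something along the following lines might help. It is only a minimal sketch: the line layout is inferred from the excerpt above, and parse_line, summarize and SAMPLE are names made up for this illustration, not part of GlusterFS.

```python
import re
from collections import Counter

# Line layout inferred from the excerpt above (an assumption, not a spec):
# [YYYY-MM-DD HH:MM:SS] LEVEL [file:line:function] message
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_LINE.match(line.strip())
    return m.groupdict() if m else None


def summarize(lines):
    """Count entries per (level, function) pair, skipping unparsable lines."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry:
            counts[(entry["level"], entry["func"])] += 1
    return counts


# A few lines copied from the log above, plus one non-log line.
SAMPLE = [
    "[2009-08-21 00:22:54] E [ib-verbs.c:2020:ib_verbs_event_handler] "
    "transport/ib-verbs: remote3: pollin received on tcp socket "
    "(peer: 10.10.10.12:6997) after handshake is complete",
    "[2009-08-21 00:22:54] T [client-protocol.c:5636:client_protocol_reconnect] "
    "remote3: attempting reconnect",
    "[2009-08-21 00:23:05] T [client-protocol.c:5636:client_protocol_reconnect] "
    "remote3: attempting reconnect",
    "-> df -h -> hanging mountpoint",
]

if __name__ == "__main__":
    for (level, func), n in summarize(SAMPLE).most_common():
        print(level, func, n)
```

Feeding the lines of the real log file to summarize() instead of SAMPLE would give the same per-function counts for the whole run.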


Greetings,
Christian





-----Original Message-----
From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Christian Marnitz
Sent: Friday, 21 August 2009 09:12
To: gluster-users at gluster.org
Subject: [Gluster-users] CRASH on: ib-verbs.c:2020:ib_verbs_event_handler

Hi,

We got a crash on:

[2009-08-20 12:28:16] E [ib-verbs.c:2020:ib_verbs_event_handler] transport/ib-verbs: server: pollin received on tcp socket (peer: 10.10.10.121:1018) after handshake is complete
pending frames:
frame : type(1) op(LINK)
frame : type(1) op(LINK)

patchset: v2.0.5-25-g8dfdde5
signal received: 11
time of crash: 2009-08-21 00:22:54
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.6rc4
/lib/libc.so.6[0x7fc03c934100]
/usr/local/lib/libglusterfs.so.0(inode_ref+0xe)[0x7fc03d0a0aae]
/usr/local/lib/glusterfs/2.0.6rc4/xlator/protocol/server.so(server_link_resume+0x1a1)[0x7fc03b8bc411]
/usr/local/lib/libglusterfs.so.0(call_resume+0x1bc)[0x7fc03d0a209c]
/usr/local/lib/glusterfs/2.0.6rc4/xlator/protocol/server.so(server_stub_resume+0x32)[0x7fc03b8c9b52]
/usr/local/lib/glusterfs/2.0.6rc4/xlator/protocol/server.so[0x7fc03b8cb9d7]
/usr/local/lib/glusterfs/2.0.6rc4/xlator/performance/io-threads.so(iot_lookup_cbk+0x34)[0x7fc03badace4]
/usr/local/lib/libglusterfs.so.0[0x7fc03d098a64]
/usr/local/lib/glusterfs/2.0.6rc4/xlator/storage/posix.so(posix_lookup+0x2e3)[0x7fc03bef79a3]
/usr/local/lib/libglusterfs.so.0(default_lookup+0xb5)[0x7fc03d09bfa5]
/usr/local/lib/glusterfs/2.0.6rc4/xlator/performance/io-threads.so(iot_lookup_wrapper+0xb5)[0x7fc03bade115]
/usr/local/lib/libglusterfs.so.0(call_resume+0x423)[0x7fc03d0a2303]
/usr/local/lib/glusterfs/2.0.6rc4/xlator/performance/io-threads.so(iot_worker_unordered+0xe)[0x7fc03badbdbe]
/lib/libpthread.so.0[0x7fc03cc6a3f7]
/lib/libc.so.6(clone+0x6d)[0x7fc03c9d9b3d]
---------

That's all there is in the log file. The OS is Ubuntu Server 8.04 LTS x64 with OFED 1.4. If I can provide more information, please let me know what and how.
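
For anyone who wants to pull the symbol names and offsets out of a backtrace like the one above, here is a minimal Python sketch. The frame layout (object path, optional symbol+offset in parentheses, address in brackets) is inferred only from the lines in this mail; parse_frame, named_frames and BACKTRACE are names made up for this illustration, not part of GlusterFS.

```python
import re

# Frame layout assumed from the backtrace above:
#   /path/to/object(symbol+0xOFFSET)[0xADDRESS]
# The "(symbol+0xOFFSET)" part is missing on some frames.
FRAME = re.compile(
    r"^(?P<obj>[^(\[]+)"
    r"(?:\((?P<sym>[^+)]+)\+(?P<off>0x[0-9a-fA-F]+)\))?"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Split one backtrace line into its parts, or return None if it does not match."""
    m = FRAME.match(line.strip())
    if not m:
        return None
    off = m.group("off")
    return {
        "object": m.group("obj"),
        "symbol": m.group("sym"),  # None when the frame has no symbol
        "offset": int(off, 16) if off else None,
        "address": int(m.group("addr"), 16),
    }


def named_frames(lines):
    """Return only the frames that carry a symbol name, in original order."""
    frames = (parse_frame(line) for line in lines)
    return [f for f in frames if f and f["symbol"]]


# The first three frames copied from the backtrace above.
BACKTRACE = [
    "/lib/libc.so.6[0x7fc03c934100]",
    "/usr/local/lib/libglusterfs.so.0(inode_ref+0xe)[0x7fc03d0a0aae]",
    "/usr/local/lib/glusterfs/2.0.6rc4/xlator/protocol/server.so"
    "(server_link_resume+0x1a1)[0x7fc03b8bc411]",
]

if __name__ == "__main__":
    for f in named_frames(BACKTRACE):
        print(f"{f['symbol']}+{f['offset']:#x} in {f['object']}")
```

The symbol+offset pairs are what a debugger would need to map a frame back to a source line, provided the libraries were built with debug symbols.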

Many thanks in advance and best regards,
Christian Marnitz








