[Gluster-users] Client crash with 3.0.5 testing/cluster/ha
Amon Ott
ao at m-privacy.de
Tue Aug 10 11:01:32 UTC 2010
Hello all,
I can reliably crash the glusterfs client with a cluster/ha config by pulling
a plug on one connection. Also, the option preferred-subvolume is not yet
implemented, but is important for us. I would appreciate some help with both
issues.
We have a cluster of currently four nodes, each of which is both Gluster
server and client. We distribute over two mirrors of two volumes each, so any
single node is allowed to fail. However, if the dedicated network
192.168.111.0/24 fails (switch blows up, network port broken, etc.),
everything breaks with split brain. We would like the Gluster connections to
fall back to the external network 192.168.4.0/24 in case of such errors.
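To make the intended behaviour concrete, here is a minimal sketch in Python of
the reachability we are after. It is only a toy model, not GlusterFS code: the
names NODES, MIRRORS, brick_reachable and volume_available are made up for
illustration, and it assumes a brick counts as usable as long as its node is up
and at least one of the two networks works. It does not model split brain or
self-heal.

```python
# Toy model of the desired failover, NOT GlusterFS code.
# Addresses and mirror layout are taken from the volfile in the log;
# all function and variable names are hypothetical.

NODES = {
    "tgpro1": {"internal": "192.168.111.1", "external": "192.168.4.101"},
    "tgpro2": {"internal": "192.168.111.2", "external": "192.168.4.102"},
    "tgpro3": {"internal": "192.168.111.3", "external": "192.168.4.103"},
    "tgpro4": {"internal": "192.168.111.4", "external": "192.168.4.104"},
}

# distribute -> two replicate sets of two bricks each
MIRRORS = {
    "mirror-0": ["tgpro1", "tgpro2"],
    "mirror-1": ["tgpro3", "tgpro4"],
}


def brick_reachable(node, down_nodes=(), down_networks=()):
    """A brick is reachable if its node is up and one network path works."""
    if node in down_nodes:
        return False
    return any(net not in down_networks for net in NODES[node])


def volume_available(down_nodes=(), down_networks=()):
    """All files stay accessible only if every mirror has a reachable brick."""
    return all(
        any(brick_reachable(n, down_nodes, down_networks) for n in members)
        for members in MIRRORS.values()
    )


if __name__ == "__main__":
    print(volume_available(down_networks={"internal"}))        # dedicated net fails
    print(volume_available(down_nodes={"tgpro1"}))             # one node fails
    print(volume_available(down_nodes={"tgpro1", "tgpro2"}))   # a whole mirror fails
```

In this model, losing the dedicated network or any single node keeps the
volume available, while losing both nodes of one mirror does not. That is the
behaviour the ha translators in the volfile are meant to provide.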
Here comes the log:
Version : glusterfs 3.0.5 built on Jul 13 2010 10:19:07
git: v3.0.5
Starting Time: 2010-08-09 16:35:17
Command line : /usr/sbin/glusterfs --log-level=NORMAL --volfile=/gluster/user.vol /home/user
PID : 27837
System name : Linux
Nodename : tgpro1
Kernel Release : 2.6.32.16-rsbac
Hardware Identifier: i686
Given volfile:
+------------------------------------------------------------------------------+
1: ## file auto generated by /usr/bin/glusterfs-volgen (mount.vol)
2: # Cmd line:
3: # $ /usr/bin/glusterfs-volgen -n userhome -r 1 192.168.111.1:/glusterfs/userhome 192.168.111.2:/glusterfs/userhome 192.168.111.3:/glusterfs/userhome 192.168.111.4:/glusterfs/userhome
4:
5: # RAID 1
6: # TRANSPORT-TYPE tcp
7: volume tgpro4-1
8: type protocol/client
9: option transport-type tcp
10: option remote-host 192.168.111.4
11: option transport.socket.nodelay on
12: option remote-port 6996
13: option remote-subvolume brick1
14: end-volume
15:
16: volume tgpro4-2
17: type protocol/client
18: option transport-type tcp
19: option remote-host 192.168.4.104
20: option transport.socket.nodelay on
21: option remote-port 6996
22: option remote-subvolume brick1
23: end-volume
24:
25: volume tgpro3-1
26: type protocol/client
27: option transport-type tcp
28: option remote-host 192.168.111.3
29: option transport.socket.nodelay on
30: option remote-port 6996
31: option remote-subvolume brick1
32: end-volume
33:
34: volume tgpro3-2
35: type protocol/client
36: option transport-type tcp
37: option remote-host 192.168.4.103
38: option transport.socket.nodelay on
39: option remote-port 6996
40: option remote-subvolume brick1
41: end-volume
42:
43: volume tgpro2-1
44: type protocol/client
45: option transport-type tcp
46: option remote-host 192.168.111.2
47: option transport.socket.nodelay on
48: option remote-port 6996
49: option remote-subvolume brick1
50: end-volume
51:
52: volume tgpro2-2
53: type protocol/client
54: option transport-type tcp
55: option remote-host 192.168.4.102
56: option transport.socket.nodelay on
57: option remote-port 6996
58: option remote-subvolume brick1
59: end-volume
60:
61: volume tgpro1-1
62: type protocol/client
63: option transport-type tcp
64: option remote-host 192.168.111.1
65: option transport.socket.nodelay on
66: option remote-port 6996
67: option remote-subvolume brick1
68: end-volume
69:
70: volume tgpro1-2
71: type protocol/client
72: option transport-type tcp
73: option remote-host 192.168.4.101
74: option transport.socket.nodelay on
75: option remote-port 6996
76: option remote-subvolume brick1
77: end-volume
78:
79: volume ha-4
80: type testing/cluster/ha
81: subvolumes tgpro4-1 tgpro4-2
82: option preferred-subvolume tgpro4-1
83: end-volume
84:
85: volume ha-3
86: type testing/cluster/ha
87: subvolumes tgpro3-1 tgpro3-2
88: option preferred-subvolume tgpro3-1
89: end-volume
90:
91: volume ha-2
92: type testing/cluster/ha
93: subvolumes tgpro2-1 tgpro2-2
94: option preferred-subvolume tgpro2-1
95: end-volume
96:
97: volume ha-1
98: type testing/cluster/ha
99: subvolumes tgpro1-1 tgpro1-2
100: option preferred-subvolume tgpro1-1
101: end-volume
102:
103: volume mirror-0
104: type cluster/replicate
105: subvolumes ha-1 ha-2
106: # Preferred read system (this system, if one of them)
107: # option read-subvolume ha-1
108: # Brick to prefer in case of split brain
109: option favorite-child ha-1
110: end-volume
111:
112: volume mirror-1
113: type cluster/replicate
114: subvolumes ha-3 ha-4
115: # Preferred read system (this system, if one of them)
116: # option read-subvolume ha-3
117: # Brick to prefer in case of split brain
118: option favorite-child ha-3
119: end-volume
120:
121: volume distribute
122: type cluster/distribute
123: option lookup-unhashed yes # lookup everywhere, if not found
124: subvolumes mirror-0 mirror-1
125: end-volume
126:
127: volume writebehind
128: type performance/write-behind
129: option cache-size 4MB
130: subvolumes distribute
131: end-volume
132:
133: volume quickread
134: type performance/quick-read
135: option cache-timeout 1
136: option max-file-size 64kB
137: subvolumes writebehind
138: end-volume
139:
140: volume statprefetch
141: type performance/stat-prefetch
142: subvolumes quickread
143: end-volume
144:
+------------------------------------------------------------------------------+
[2010-08-09 16:35:18] W [afr.c:2947:init] mirror-1: You have specified
subvolume 'ha-3' as the 'favorite child'. This means that if a discrepancy in
the content or attributes (ownership, permission, etc.) of a file is detected
among the subvolumes, the file on 'ha-3' will be considered the definitive
version and its contents will OVERWRITE the contents of the file on other
subvolumes. All versions of the file except that on 'ha-3' WILL BE LOST.
[2010-08-09 16:35:18] W [afr.c:2947:init] mirror-0: You have specified
subvolume 'ha-1' as the 'favorite child'. This means that if a discrepancy in
the content or attributes (ownership, permission, etc.) of a file is detected
among the subvolumes, the file on 'ha-1' will be considered the definitive
version and its contents will OVERWRITE the contents of the file on other
subvolumes. All versions of the file except that on 'ha-1' WILL BE LOST.
[2010-08-09 16:35:18] W [glusterfsd.c:548:_log_if_option_is_invalid] ha-1:
option 'preferred-subvolume' is not recognized
[2010-08-09 16:35:18] W [glusterfsd.c:548:_log_if_option_is_invalid] ha-2:
option 'preferred-subvolume' is not recognized
[2010-08-09 16:35:18] W [glusterfsd.c:548:_log_if_option_is_invalid] ha-3:
option 'preferred-subvolume' is not recognized
[2010-08-09 16:35:18] W [glusterfsd.c:548:_log_if_option_is_invalid] ha-4:
option 'preferred-subvolume' is not recognized
[2010-08-09 16:35:18] N [glusterfsd.c:1409:main] glusterfs: Successfully
started
[2010-08-09 16:35:18] N [client-protocol.c:6288:client_setvolume_cbk]
tgpro1-1: Connected to 192.168.111.1:6996, attached to remote
volume 'brick1'.
[2010-08-09 16:35:18] N [afr.c:2648:notify] mirror-0: Subvolume 'ha-1' came
back up; going online.
[2010-08-09 16:35:18] N [client-protocol.c:6288:client_setvolume_cbk]
tgpro1-1: Connected to 192.168.111.1:6996, attached to remote
volume 'brick1'.
[2010-08-09 16:35:18] N [afr.c:2648:notify] mirror-0: Subvolume 'ha-1' came
back up; going online.
[2010-08-09 16:35:18] N [client-protocol.c:6288:client_setvolume_cbk]
tgpro1-2: Connected to 192.168.4.101:6996, attached to remote
volume 'brick1'.
[2010-08-09 16:35:18] N [fuse-bridge.c:2953:fuse_init] glusterfs-fuse: FUSE
inited with protocol versions: glusterfs 7.13 kernel 7.13
[2010-08-09 16:35:18] N [client-protocol.c:6288:client_setvolume_cbk]
tgpro1-2: Connected to 192.168.4.101:6996, attached to remote
volume 'brick1'.
[2010-08-09 16:35:18] N [client-protocol.c:6288:client_setvolume_cbk]
tgpro2-2: Connected to 192.168.4.102:6996, attached to remote
volume 'brick1'.
[2010-08-09 16:35:18] N [client-protocol.c:6288:client_setvolume_cbk]
tgpro2-2: Connected to 192.168.4.102:6996, attached to remote
volume 'brick1'.
[2010-08-09 16:35:18] N [client-protocol.c:6288:client_setvolume_cbk]
tgpro3-2: Connected to 192.168.4.103:6996, attached to remote
volume 'brick1'.
[2010-08-09 16:35:18] N [afr.c:2648:notify] mirror-1: Subvolume 'ha-3' came
back up; going online.
pending frames:
patchset: v3.0.5
signal received: 11
time of crash: 2010-08-09 16:35:18
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.0.5
[0x5137f400]
/usr/lib/glusterfs/3.0.5/xlator/protocol/client.so(client_statfs_cbk+0x1c8)[0x50964718]
/usr/lib/glusterfs/3.0.5/xlator/protocol/client.so(protocol_client_xfer+0x389)[0x5096bca9]
/usr/lib/glusterfs/3.0.5/xlator/protocol/client.so(client_statfs+0x159)[0x50972849]
/usr/lib/glusterfs/3.0.5/xlator/testing/cluster/ha.so(ha_statfs+0x11c)[0x51371c4c]
/usr/lib/glusterfs/3.0.5/xlator/cluster/replicate.so(afr_statfs+0x2da)[0x5093043a]
/usr/lib/glusterfs/3.0.5/xlator/cluster/distribute.so(dht_get_du_info_for_subvol+0x1ea)[0x5090bdaa]
/usr/lib/glusterfs/3.0.5/xlator/cluster/distribute.so(dht_notify+0x172)[0x5090d312]
/usr/lib/glusterfs/3.0.5/xlator/cluster/distribute.so(notify+0x2b)[0x5090d38b]
/usr/lib/libglusterfs.so.0(xlator_notify+0x3f)[0x5133fcef]
/usr/lib/libglusterfs.so.0(default_notify+0x4d)[0x5134773d]
/usr/lib/glusterfs/3.0.5/xlator/cluster/replicate.so(notify+0x232)[0x509297b2]
/usr/lib/libglusterfs.so.0(xlator_notify+0x3f)[0x5133fcef]
/usr/lib/libglusterfs.so.0(default_notify+0x4d)[0x5134773d]
/usr/lib/glusterfs/3.0.5/xlator/testing/cluster/ha.so(notify+0x189)[0x5136f239]
/usr/lib/libglusterfs.so.0(xlator_notify+0x3f)[0x5133fcef]
/usr/lib/glusterfs/3.0.5/xlator/protocol/client.so(protocol_client_post_handshake+0x139)[0x50977409]
/usr/lib/glusterfs/3.0.5/xlator/protocol/client.so(client_setvolume_cbk+0x2f1)[0x50977711]
/usr/lib/glusterfs/3.0.5/xlator/protocol/client.so(protocol_client_interpret+0x2a5)[0x509665f5]
/usr/lib/glusterfs/3.0.5/xlator/protocol/client.so(protocol_client_pollin+0xcf)[0x5096673f]
/usr/lib/glusterfs/3.0.5/xlator/protocol/client.so(notify+0xd2)[0x50976962]
/usr/lib/libglusterfs.so.0(xlator_notify+0x3f)[0x5133fcef]
/usr/lib/glusterfs/3.0.5/transport/socket.so(socket_event_poll_in+0x3d)[0x500bd25d]
/usr/lib/glusterfs/3.0.5/transport/socket.so(socket_event_handler+0xab)[0x500bd31b]
/usr/lib/libglusterfs.so.0[0x5135b02a]
/usr/lib/libglusterfs.so.0(event_dispatch+0x21)[0x51359e21]
/usr/sbin/glusterfs(main+0xb5d)[0x15bbc13d]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0x511ddb25]
/usr/sbin/glusterfs[0x15bba1f1]
---------
Amon Ott
--
Dr. Amon Ott - m-privacy GmbH
Am Köllnischen Park 1, 10179 Berlin
Tel: +49 30 24342334
Fax: +49 30 24342336
Web: http://www.m-privacy.de
Handelsregister:
Amtsgericht Charlottenburg HRB 84946
Geschäftsführer:
Dipl.-Kfm. Holger Maczkowsky,
Roman Maczkowsky
GnuPG-Key-ID: EA898571