[Gluster-users] Strange errors with cluster/distribute or cluster/replicate

Konstantin A. Lepikhov lakostis at unsafe.ru
Fri Mar 20 13:55:19 UTC 2009


Greetings!

The latest 2.0.0git has problems with afr:

I have 2 nodes with shared storage; both operate as client and server. A
webapp synced from git is located on the shared storage. When I try to pull
upstream changes into that webapp, glusterfs gets signal 11. A simple git
status gives the same result. I did not have this problem with version 1.3
of glusterfs, but it was terribly slow.

This error is reproducible on both nodes, and also with cluster/replicate.

glusterfs.log:

Version      : glusterfs 2.0.0git built on Mar 19 2009 20:01:13
TLA Revision : git://git.sv.gnu.org/gluster.git
Starting Time: 2009-03-20 03:28:54
Command line : glusterfs -f /etc/glusterfs/afr-client-new.vol --volume-name forum-cache /exports/forum
PID          : 19819
System name  : Linux
Nodename     : aabb
Kernel Release : 2.6.26-ovz-smp-alt0.3
Hardware Identifier: i686
                                                                                                                                                              
Given volfile:
+------------------------------------------------------------------------------+
1: volume aabb
2: type protocol/client
3: option transport-type tcp
4: option transport-timeout 10
5: option remote-host 127.0.0.1
6: option remote-subvolume iot
7: end-volume
8:
9: volume xxyy
10: type protocol/client
11: option transport-type tcp
12: option transport-timeout 30
13: option remote-host <some remote ip>
14: option remote-subvolume iot
15: end-volume
16:
17: volume forum-afr
18: #type cluster/replicate
19: type cluster/distribute
20: subvolumes aabb xxyy
21: end-volume
22:
23: volume writebehind
24: type performance/write-behind
25: option aggregate-size 128KB
26: option window-size 1MB
27: subvolumes forum-afr
28: end-volume
29:
30: volume forum-cache
31: type performance/io-cache
32: option cache-size 128MB
33: subvolumes writebehind
34: end-volume

+------------------------------------------------------------------------------+
2009-03-20 03:28:54 W [xlator.c:430:validate_xlator_volume_options] writebehind: option 'window-size' is deprecated, preferred is 'cache-size', continuing with correction
2009-03-20 03:28:54 W [xlator.c:430:validate_xlator_volume_options] writebehind: option 'aggregate-size' is deprecated, preferred is 'block-size', continuing with correction
2009-03-20 03:28:54 N [glusterfsd.c:1134:main] glusterfs: Successfully started
2009-03-20 03:28:54 N [client-protocol.c:6159:client_setvolume_cbk] aabb: connection and handshake succeeded
2009-03-20 03:28:54 N [client-protocol.c:6159:client_setvolume_cbk] aabb: connection and handshake succeeded
2009-03-20 03:28:54 W [dht-common.c:114:dht_lookup_dir_cbk] forum-afr: lookup of / on xxyy returned error (Transport endpoint is not connected)
2009-03-20 03:28:54 E [dht-layout.c:486:dht_layout_normalize] forum-afr: found anomalies in /. holes=1 overlaps=0
2009-03-20 03:28:54 W [dht-common.c:152:dht_lookup_dir_cbk] forum-afr: fixing assignment on /
2009-03-20 03:28:54 E [dht-selfheal.c:424:dht_selfheal_directory] forum-afr: 1 subvolumes down -- not fixing
2009-03-20 03:28:54 N [client-protocol.c:6159:client_setvolume_cbk] xxyy: connection and handshake succeeded
2009-03-20 03:28:54 N [client-protocol.c:6159:client_setvolume_cbk] xxyy: connection and handshake succeeded
2009-03-20 03:29:01 W [dht-common.c:670:dht_lookup] forum-afr: incomplete layout failure for path=/
2009-03-20 03:29:01 W [fuse-bridge.c:301:need_fresh_lookup] fuse-bridge: revalidate of / failed (Resource temporarily unavailable)
pending frames:
frame : type(1) op(FLUSH)
frame : type(1) op(FLUSH)

patchset: git://git.sv.gnu.org/gluster.git
signal received: 11
configuration details:argp 1
backtrace 1
bdb->cursor->get 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.0git
/lib/libc.so.6[0xb7e1c708]
/usr/lib/glusterfs/2.0.0git/xlator/performance/write-behind.so(wb_do_ops+0x3e)[0xb7d7345e]
/usr/lib/glusterfs/2.0.0git/xlator/performance/write-behind.so(wb_process_queue+0xcb)[0xb7d7354b]
/usr/lib/glusterfs/2.0.0git/xlator/performance/write-behind.so(wb_flush+0x160)[0xb7d73980]
/usr/lib/libglusterfs.so.0(default_flush+0x9b)[0xb7f712fb]
/usr/lib/glusterfs/2.0.0git/xlator/mount/fuse.so[0xb7d59a51]
/usr/lib/libfuse.so.2[0xb7d47901]
/usr/lib/libfuse.so.2[0xb7d463e9]
/usr/lib/libfuse.so.2(fuse_session_process+0x26)[0xb7d48f46]
/usr/lib/glusterfs/2.0.0git/xlator/mount/fuse.so[0xb7d5fcfe]
/lib/libpthread.so.0[0xb7f4adaa]
/lib/libc.so.6(clone+0x5e)[0xb7ebef4e]
---------
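
A side note on the two deprecation warnings at the top of this log: the
log itself names the preferred options. A minimal sketch of the
write-behind volume using those names, assuming the values carry over
unchanged. This should only silence the warnings; it is not claimed to
have any effect on the crash in wb_do_ops:

```
volume writebehind
  type performance/write-behind
  # 'block-size' is the preferred name for the deprecated 'aggregate-size'
  option block-size 128KB
  # 'cache-size' is the preferred name for the deprecated 'window-size'
  option cache-size 1MB
  subvolumes forum-afr
end-volume
```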

glusterfsd.log:

================================================================================
Version      : glusterfs 2.0.0git built on Mar 19 2009 20:01:13
TLA Revision : git://git.sv.gnu.org/gluster.git
Starting Time: 2009-03-20 03:20:31
Command line : /usr/sbin/glusterfsd -p /var/run/glusterfsd.pid -f /etc/glusterfs/afr-server-new.vol
PID          : 18819
System name  : Linux
Nodename     : aabb
Kernel Release : 2.6.26-ovz-smp-alt0.3
Hardware Identifier: i686

Given volfile:
+------------------------------------------------------------------------------+
1: volume posix
2: type storage/posix
3: option directory /var/lib/vz/storage/forum
4: end-volume
5:
6: volume forum-files
7: type features/locks
8: subvolumes posix
9: end-volume
10:
11: volume iot
12: type performance/io-threads
13: option thread-count 4
14: subvolumes forum-files
15: end-volume
16:
17: volume server
18: type protocol/server
19: option transport-type tcp
20: option transport.socket.listen-port 6996
21: subvolumes iot
22: option auth.addr.iot.allow *
23: end-volume

+------------------------------------------------------------------------------+
2009-03-20 03:20:31 N [glusterfsd.c:1134:main] glusterfs: Successfully started
2009-03-20 03:20:32 N [server-protocol.c:7513:mop_setvolume] server: accepted client from xxyy:1023 
2009-03-20 03:20:35 N [server-protocol.c:7513:mop_setvolume] server: accepted client from xxyy:1022
2009-03-20 03:20:35 N [server-protocol.c:7513:mop_setvolume] server: accepted client from 127.0.0.1:1023
2009-03-20 03:20:35 N [server-protocol.c:7513:mop_setvolume] server: accepted client from 127.0.0.1:1022
2009-03-20 03:24:51 E [posix.c:758:posix_mknod] posix: mknod on /html/upload/config.php: File exists
2009-03-20 03:28:35 N [server-protocol.c:8268:notify] server: 127.0.0.1:1023 disconnected
2009-03-20 03:28:35 N [server-protocol.c:8268:notify] server: 127.0.0.1:1022 disconnected
2009-03-20 03:28:35 N [server-helpers.c:530:server_connection_destroy] server: destroyed connection of aabb-18718-2009/03/20-03:16:25:432103-altlinux
2009-03-20 03:28:54 N [server-protocol.c:7513:mop_setvolume] server: accepted client from 127.0.0.1:1021
2009-03-20 03:28:54 N [server-protocol.c:7513:mop_setvolume] server: accepted client from 127.0.0.1:1020

-- 
WBR et al.