[Gluster-users] Recovery from network failure

Georgecooldude georgecooldude at gmail.com
Sat Sep 26 21:55:23 UTC 2009


Does anyone have any ideas? This is a real Gluster showstopper for me at the
moment :(

On Wed, Sep 23, 2009 at 9:41 PM, Georgecooldude <georgecooldude at gmail.com> wrote:

> It does seem to detect it in the log.
>
> This is what I did, along with the attached log file:
>
> ###
> SRV01 - 192.168.1.1
> SRV02 - 192.168.1.2
> ##
>
>
> ------------------------
> Step 1: Copy large file to the gluster mount on server02
> admin@srv02:/mnt/glusterfs$ ls -lh
> total 1.2G
> -rw-r--r-- 1 root root 584M 2009-09-23 21:29 test03
> ------------------------
>
> ------------------------
> Step 2: Pull the cable from srv02
> ------------------------
>
> ------------------------
> Step 3: ls on srv01 - See I have a partial file
> admin@srv01:/mnt/glusterfs$ ls -lh
> total 775M
> -rw-r--r-- 1 root root 191M 2009-09-23 21:28 test03
> ------------------------
>
> ------------------------
> Server02 Log file looks like this:
> Version      : glusterfs 2.0.6 built on Sep 19 2009 18:00:37
> TLA Revision : v2.0.6
> Starting Time: 2009-09-23 21:26:49
> Command line : glusterfsd -f /etc/glusterfs/glusterfs-server.vol -l
> /var/log/gluster/gluster-log.txt -L DEBUG --volfile-check
> PID          : 5085
> System name  : Linux
> Nodename     : srv02
> Kernel Release : 2.6.24-24-server
> Hardware Identifier: x86_64
> Given volfile:
>
> +------------------------------------------------------------------------------+
>   1: # file: /etc/glusterfs/glusterfs-server.vol
>   2:
>   3: volume posix
>   4:   type storage/posix
>   5:   option directory /data/export
>   6: end-volume
>   7:
>   8: volume locks
>   9:   type features/locks
>  10:   subvolumes posix
>  11: end-volume
>  12:
>  13: volume brick
>  14:   type performance/io-threads
>  15:   option thread-count 8
>  16:   subvolumes locks
>  17: end-volume
>  18:
>  19: volume posix-ns
>  20:   type storage/posix
>  21:   option directory /data/export-ns
>  22: end-volume
>  23:
>  24: volume locks-ns
>  25:   type features/locks
>  26:   subvolumes posix-ns
>  27: end-volume
>  28:
>  29: volume brick-ns
>  30:   type performance/io-threads
>  31:   option thread-count 8
>  32:   subvolumes locks-ns
>  33: end-volume
>  34:
>  35: volume server
>  36:   type protocol/server
>  37:   option transport-type tcp
>  38:   option auth.addr.brick.allow *
>  39:   option auth.addr.brick-ns.allow *
>  40:   subvolumes brick brick-ns
>  41: end-volume
>
> +------------------------------------------------------------------------------+
> [2009-09-23 21:26:49] D [glusterfsd.c:1205:main] glusterfs: running in pid
> 5085
> [2009-09-23 21:26:49] D [io-threads.c:2280:init] brick: io-threads:
> Autoscaling: off, min_threads: 8, max_threads: 8
> [2009-09-23 21:26:49] D [io-threads.c:2280:init] brick-ns: io-threads:
> Autoscaling: off, min_threads: 8, max_threads: 8
> [2009-09-23 21:26:49] D [transport.c:141:transport_load] transport: attempt
> to load file /usr/local/lib/glusterfs/2.0.6/transport/socket.so
> [2009-09-23 21:26:49] N [glusterfsd.c:1224:main] glusterfs: Successfully
> started
> [2009-09-23 21:26:56] D [addr.c:174:gf_auth] brick-ns: allowed = "*",
> received addr = "192.168.1.2"
> [2009-09-23 21:26:56] N [server-protocol.c:7056:mop_setvolume] server:
> accepted client from 192.168.1.2:1021
> [2009-09-23 21:26:56] D [addr.c:174:gf_auth] brick-ns: allowed = "*",
> received addr = "192.168.1.2"
> [2009-09-23 21:26:56] N [server-protocol.c:7056:mop_setvolume] server:
> accepted client from 192.168.1.2:1020
> [2009-09-23 21:26:56] D [addr.c:174:gf_auth] brick: allowed = "*", received
> addr = "192.168.1.2"
> [2009-09-23 21:26:56] N [server-protocol.c:7056:mop_setvolume] server:
> accepted client from 192.168.1.2:1017
> [2009-09-23 21:26:56] D [addr.c:174:gf_auth] brick: allowed = "*", received
> addr = "192.168.1.2"
> [2009-09-23 21:26:56] N [server-protocol.c:7056:mop_setvolume] server:
> accepted client from 192.168.1.2:1016
> [2009-09-23 21:27:16] D [addr.c:174:gf_auth] brick: allowed = "*", received
> addr = "192.168.1.1"
> [2009-09-23 21:27:16] N [server-protocol.c:7056:mop_setvolume] server:
> accepted client from 192.168.1.1:1021
> [2009-09-23 21:27:17] D [addr.c:174:gf_auth] brick-ns: allowed = "*",
> received addr = "192.168.1.1"
> [2009-09-23 21:27:17] N [server-protocol.c:7056:mop_setvolume] server:
> accepted client from 192.168.1.1:1020
> [2009-09-23 21:27:17] D [addr.c:174:gf_auth] brick: allowed = "*", received
> addr = "192.168.1.1"
> [2009-09-23 21:27:17] N [server-protocol.c:7056:mop_setvolume] server:
> accepted client from 192.168.1.1:1017
> [2009-09-23 21:27:17] D [addr.c:174:gf_auth] brick-ns: allowed = "*",
> received addr = "192.168.1.1"
> [2009-09-23 21:27:17] N [server-protocol.c:7056:mop_setvolume] server:
> accepted client from 192.168.1.1:1016
> [2009-09-23 21:29:21] N [server-protocol.c:7816:notify] server:
> 192.168.1.1:1021 disconnected
> [2009-09-23 21:29:21] N [server-protocol.c:7816:notify] server:
> 192.168.1.1:1020 disconnected
> [2009-09-23 21:29:37] N [server-protocol.c:7816:notify] server:
> 192.168.1.1:1017 disconnected
> [2009-09-23 21:29:37] D [socket.c:1298:socket_submit] server: not connected
> (priv->connected = 255)
> [2009-09-23 21:29:37] N [server-helpers.c:779:server_connection_destroy]
> server: destroyed connection of srv01-5127-2009/09/23-20:52:02:522004-brick2
> [2009-09-23 21:29:37] N [server-protocol.c:7816:notify] server:
> 192.168.1.1:1016 disconnected
> [2009-09-23 21:29:37] N [server-helpers.c:779:server_connection_destroy]
> server: destroyed connection of
> srv01-5127-2009/09/23-20:52:02:522004-brick2-ns
> [2009-09-23 21:29:40] D [addr.c:174:gf_auth] brick: allowed = "*", received
> addr = "192.168.1.1"
> [2009-09-23 21:29:40] N [server-protocol.c:7056:mop_setvolume] server:
> accepted client from 192.168.1.1:1015
> [2009-09-23 21:29:40] D [addr.c:174:gf_auth] brick: allowed = "*", received
> addr = "192.168.1.1"
> [2009-09-23 21:29:40] N [server-protocol.c:7056:mop_setvolume] server:
> accepted client from 192.168.1.1:1014
> [2009-09-23 21:29:40] D [addr.c:174:gf_auth] brick-ns: allowed = "*",
> received addr = "192.168.1.1"
> [2009-09-23 21:29:40] N [server-protocol.c:7056:mop_setvolume] server:
> accepted client from 192.168.1.1:1013
> [2009-09-23 21:29:40] D [addr.c:174:gf_auth] brick-ns: allowed = "*",
> received addr = "192.168.1.1"
> [2009-09-23 21:29:40] N [server-protocol.c:7056:mop_setvolume] server:
> accepted client from 192.168.1.1:1012
> ------------------------
> No matter how many times I run ls -l on the directory or the file, I cannot
> get it to sync.
>
> I can rename the files and the name changes sync, but the file contents
> themselves do not.
>
> admin@srv02:/mnt/glusterfs$ ls -lh
> -rw-r--r-- 1 root root 584M 2009-09-23 21:29 test03
> admin@srv02:/mnt/glusterfs$ mv test03 test03a
>
> admin@srv01:/mnt/glusterfs$ ls -lh (now checking from srv01)
> -rw-r--r-- 1 root root 191M 2009-09-23 21:28 test03a
>
>
> Any ideas what I might be doing wrong?
>
>
>
>
> On Wed, Sep 23, 2009 at 5:55 AM, Anand Avati <avati at gluster.com> wrote:
>
>> On 9/23/09, Georgecooldude <georgecooldude at gmail.com> wrote:
>> > Anyone have any ideas on the below? Thanks.
>> >
>>
>> Does the logfile of the server whose cable you pulled out recognize
>> the disconnection from the client?
>>
>> Avati
>>
>
>
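
To answer the question above at a glance, the connect and disconnect
notices in a glusterfsd log can be tallied per client. This is only a
minimal sketch: it assumes each log entry sits on a single line (or has
had the "> " quote prefixes removed), it matches only the message wording
seen in the 2.0.6 log quoted above, and the names EVENT_RE and
summarize_events are made up for illustration.

```python
import re
from collections import Counter

# Matches the two notice ("N") messages seen in the quoted log:
#   ... [server-protocol.c:NNNN:mop_setvolume] server: accepted client from IP:PORT
#   ... [server-protocol.c:NNNN:notify] server: IP:PORT disconnected
# The \s+ after "server:" tolerates a plain line wrap at that point.
EVENT_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] N "
    r"\[server-protocol\.c:\d+:(?:mop_setvolume|notify)\] server:\s+"
    r"(?:accepted client from (?P<acc>\d+(?:\.\d+){3}):\d+"
    r"|(?P<dis>\d+(?:\.\d+){3}):\d+ disconnected)"
)


def summarize_events(log_text):
    """Return a Counter keyed by (client_ip, event) for the matched notices."""
    counts = Counter()
    for match in EVENT_RE.finditer(log_text):
        if match.group("acc"):
            counts[(match.group("acc"), "accepted")] += 1
        else:
            counts[(match.group("dis"), "disconnected")] += 1
    return counts


# Two entries copied from the quoted log, unwrapped onto single lines.
SAMPLE = (
    "[2009-09-23 21:29:21] N [server-protocol.c:7816:notify] server: "
    "192.168.1.1:1021 disconnected\n"
    "[2009-09-23 21:29:40] N [server-protocol.c:7056:mop_setvolume] server: "
    "accepted client from 192.168.1.1:1015\n"
)

if __name__ == "__main__":
    for (ip, event), n in sorted(summarize_events(SAMPLE).items()):
        print(ip, event, n)
```

In the srv02 log quoted above, the disconnect side amounts to four notices
for 192.168.1.1 (21:29:21 to 21:29:37), followed by four fresh connections
from that address at 21:29:40.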

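The mismatch between the two listings quoted above (584M on srv02, 191M on
srv01) can be put into numbers with a small helper. This is a rough
sketch: parse_human_size and copied_fraction are hypothetical names, the
suffixes are assumed to be the 1024-based ones that ls -h prints, and
because ls -h rounds its output the result is only approximate.

```python
# Multipliers for the 1024-based suffixes printed by `ls -h`.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_human_size(text):
    """Convert a size such as '584M' or '1.2G' to an approximate byte count."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1].upper()])
    return int(text)  # no suffix: the value is already in bytes


def copied_fraction(partial, full):
    """Fraction of the full file that is present in the partial copy."""
    return parse_human_size(partial) / parse_human_size(full)


if __name__ == "__main__":
    # Sizes taken from the listings above: srv01 has 191M, srv02 has 584M.
    print(f"{copied_fraction('191M', '584M'):.1%} of test03 reached srv01")
```

With the sizes above, roughly a third of the file made it to srv01 before
the cable was pulled. Re-running the check after each ls would show whether
the copy on srv01 is growing at all.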
