[Gluster-users] Recovery from network failure

Georgecooldude georgecooldude at gmail.com
Mon Oct 5 12:42:41 UTC 2009


Hi Guys,

Any ideas what might be causing this? I'm looking to deploy my servers soon,
and if I can resolve this issue I will be using Gluster. If not, I'll have to
go with an alternative.

Any help would be much appreciated.



On Sat, Sep 26, 2009 at 10:55 PM, Georgecooldude
<georgecooldude at gmail.com> wrote:

> Does anyone have any ideas? This is a real Gluster showstopper for me at the
> moment :(
>
>
> On Wed, Sep 23, 2009 at 9:41 PM, Georgecooldude <georgecooldude at gmail.com> wrote:
>
>> The log does seem to detect the disconnection.
>>
>> Here is what I did, followed by the log file:
>>
>> ###
>> SRV01 - 192.168.1.1
>> SRV02 - 192.168.1.2
>> ###
>>
>>
>> ------------------------
>> Step 1: Copy a large file to the Gluster mount on srv02
>> admin at srv02:/mnt/glusterfs$ ls -lh
>> total 1.2G
>> -rw-r--r-- 1 root root 584M 2009-09-23 21:29 test03
>> ------------------------
>>
>> ------------------------
>> Step 2: Pull the cable from srv02
>> ------------------------
>>
>> ------------------------
>> Step 3: Run ls on srv01; it shows only a partial file
>> admin at srv01:/mnt/glusterfs$ ls -lh
>> total 775M
>> -rw-r--r-- 1 root root 191M 2009-09-23 21:28 test03
>> ------------------------
>>
>> ------------------------
>> The srv02 log file looks like this:
>> Version      : glusterfs 2.0.6 built on Sep 19 2009 18:00:37
>> TLA Revision : v2.0.6
>> Starting Time: 2009-09-23 21:26:49
>> Command line : glusterfsd -f /etc/glusterfs/glusterfs-server.vol -l
>> /var/log/gluster/gluster-log.txt -L DEBUG --volfile-check
>> PID          : 5085
>> System name  : Linux
>> Nodename     : srv02
>> Kernel Release : 2.6.24-24-server
>> Hardware Identifier: x86_64
>> Given volfile:
>>
>> +------------------------------------------------------------------------------+
>>   1: # file: /etc/glusterfs/glusterfs-server.vol
>>   2:
>>   3: volume posix
>>   4:   type storage/posix
>>   5:   option directory /data/export
>>   6: end-volume
>>   7:
>>   8: volume locks
>>   9:   type features/locks
>>  10:   subvolumes posix
>>  11: end-volume
>>  12:
>>  13: volume brick
>>  14:   type performance/io-threads
>>  15:   option thread-count 8
>>  16:   subvolumes locks
>>  17: end-volume
>>  18:
>>  19: volume posix-ns
>>  20:   type storage/posix
>>  21:   option directory /data/export-ns
>>  22: end-volume
>>  23:
>>  24: volume locks-ns
>>  25:   type features/locks
>>  26:   subvolumes posix-ns
>>  27: end-volume
>>  28:
>>  29: volume brick-ns
>>  30:   type performance/io-threads
>>  31:   option thread-count 8
>>  32:   subvolumes locks-ns
>>  33: end-volume
>>  34:
>>  35: volume server
>>  36:   type protocol/server
>>  37:   option transport-type tcp
>>  38:   option auth.addr.brick.allow *
>>  39:   option auth.addr.brick-ns.allow *
>>  40:   subvolumes brick brick-ns
>>  41: end-volume
>>
>> +------------------------------------------------------------------------------+
>> [2009-09-23 21:26:49] D [glusterfsd.c:1205:main] glusterfs: running in pid
>> 5085
>> [2009-09-23 21:26:49] D [io-threads.c:2280:init] brick: io-threads:
>> Autoscaling: off, min_threads: 8, max_threads: 8
>> [2009-09-23 21:26:49] D [io-threads.c:2280:init] brick-ns: io-threads:
>> Autoscaling: off, min_threads: 8, max_threads: 8
>> [2009-09-23 21:26:49] D [transport.c:141:transport_load] transport:
>> attempt to load file /usr/local/lib/glusterfs/2.0.6/transport/socket.so
>> [2009-09-23 21:26:49] N [glusterfsd.c:1224:main] glusterfs: Successfully
>> started
>> [2009-09-23 21:26:56] D [addr.c:174:gf_auth] brick-ns: allowed = "*",
>> received addr = "192.168.1.2"
>> [2009-09-23 21:26:56] N [server-protocol.c:7056:mop_setvolume] server:
>> accepted client from 192.168.1.2:1021
>> [2009-09-23 21:26:56] D [addr.c:174:gf_auth] brick-ns: allowed = "*",
>> received addr = "192.168.1.2"
>> [2009-09-23 21:26:56] N [server-protocol.c:7056:mop_setvolume] server:
>> accepted client from 192.168.1.2:1020
>> [2009-09-23 21:26:56] D [addr.c:174:gf_auth] brick: allowed = "*",
>> received addr = "192.168.1.2"
>> [2009-09-23 21:26:56] N [server-protocol.c:7056:mop_setvolume] server:
>> accepted client from 192.168.1.2:1017
>> [2009-09-23 21:26:56] D [addr.c:174:gf_auth] brick: allowed = "*",
>> received addr = "192.168.1.2"
>> [2009-09-23 21:26:56] N [server-protocol.c:7056:mop_setvolume] server:
>> accepted client from 192.168.1.2:1016
>> [2009-09-23 21:27:16] D [addr.c:174:gf_auth] brick: allowed = "*",
>> received addr = "192.168.1.1"
>> [2009-09-23 21:27:16] N [server-protocol.c:7056:mop_setvolume] server:
>> accepted client from 192.168.1.1:1021
>> [2009-09-23 21:27:17] D [addr.c:174:gf_auth] brick-ns: allowed = "*",
>> received addr = "192.168.1.1"
>> [2009-09-23 21:27:17] N [server-protocol.c:7056:mop_setvolume] server:
>> accepted client from 192.168.1.1:1020
>> [2009-09-23 21:27:17] D [addr.c:174:gf_auth] brick: allowed = "*",
>> received addr = "192.168.1.1"
>> [2009-09-23 21:27:17] N [server-protocol.c:7056:mop_setvolume] server:
>> accepted client from 192.168.1.1:1017
>> [2009-09-23 21:27:17] D [addr.c:174:gf_auth] brick-ns: allowed = "*",
>> received addr = "192.168.1.1"
>> [2009-09-23 21:27:17] N [server-protocol.c:7056:mop_setvolume] server:
>> accepted client from 192.168.1.1:1016
>> [2009-09-23 21:29:21] N [server-protocol.c:7816:notify] server:
>> 192.168.1.1:1021 disconnected
>> [2009-09-23 21:29:21] N [server-protocol.c:7816:notify] server:
>> 192.168.1.1:1020 disconnected
>> [2009-09-23 21:29:37] N [server-protocol.c:7816:notify] server:
>> 192.168.1.1:1017 disconnected
>> [2009-09-23 21:29:37] D [socket.c:1298:socket_submit] server: not
>> connected (priv->connected = 255)
>> [2009-09-23 21:29:37] N [server-helpers.c:779:server_connection_destroy]
>> server: destroyed connection of srv01-5127-2009/09/23-20:52:02:522004-brick2
>> [2009-09-23 21:29:37] N [server-protocol.c:7816:notify] server:
>> 192.168.1.1:1016 disconnected
>> [2009-09-23 21:29:37] N [server-helpers.c:779:server_connection_destroy]
>> server: destroyed connection of
>> srv01-5127-2009/09/23-20:52:02:522004-brick2-ns
>> [2009-09-23 21:29:40] D [addr.c:174:gf_auth] brick: allowed = "*",
>> received addr = "192.168.1.1"
>> [2009-09-23 21:29:40] N [server-protocol.c:7056:mop_setvolume] server:
>> accepted client from 192.168.1.1:1015
>> [2009-09-23 21:29:40] D [addr.c:174:gf_auth] brick: allowed = "*",
>> received addr = "192.168.1.1"
>> [2009-09-23 21:29:40] N [server-protocol.c:7056:mop_setvolume] server:
>> accepted client from 192.168.1.1:1014
>> [2009-09-23 21:29:40] D [addr.c:174:gf_auth] brick-ns: allowed = "*",
>> received addr = "192.168.1.1"
>> [2009-09-23 21:29:40] N [server-protocol.c:7056:mop_setvolume] server:
>> accepted client from 192.168.1.1:1013
>> [2009-09-23 21:29:40] D [addr.c:174:gf_auth] brick-ns: allowed = "*",
>> received addr = "192.168.1.1"
>> [2009-09-23 21:29:40] N [server-protocol.c:7056:mop_setvolume] server:
>> accepted client from 192.168.1.1:1012
>> ------------------------
>> No matter how many times I run ls on the directory or the file, I cannot
>> get it to sync.
>>
>> I can rename files and the name changes sync, but the file contents do
>> not.
>>
>> admin at srv02:/mnt/glusterfs$ ls -lh
>> -rw-r--r-- 1 root root 584M 2009-09-23 21:29 test03
>> admin at srv02:/mnt/glusterfs$ mv test03 test03a
>>
>> admin at srv01:/mnt/glusterfs$ ls -lh (the rename made on srv02 shows up here)
>> -rw-r--r-- 1 root root 191M 2009-09-23 21:28 test03a
>>
>>
>> Any ideas what I might be doing wrong?
>>
>>
>>
>>
>> On Wed, Sep 23, 2009 at 5:55 AM, Anand Avati <avati at gluster.com> wrote:
>>
>>> On 9/23/09, Georgecooldude <georgecooldude at gmail.com> wrote:
>>> > Anyone have any ideas on the below? Thanks.
>>> >
>>>
>>> Does the log file of the server whose cable you pulled out recognize
>>> the disconnection from the client?
>>>
>>> Avati
>>>
>>
>>
>
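
One way to check Avati's question against the srv02 log quoted above is to filter it for connection events. The following is only a rough sketch in Python, under a few assumptions: the log on disk has one entry per line (the entries quoted above are wrapped by the mail client), the entry layout matches the glusterfs 2.0.6 lines shown in this thread, and the `connection_events` helper name is hypothetical, not part of GlusterFS. The sample lines are copied from the quoted log with the wrapping undone.

```python
import re

# Entry layout as it appears in the quoted log (one entry per line assumed):
# [timestamp] LEVEL [source-file:line:function] translator-name: message
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<src>[^\]]+)\] "
    r"(?P<name>[^:]+): "
    r"(?P<msg>.*)"
)

ACCEPT_PREFIX = "accepted client from "
DISCONNECT_SUFFIX = " disconnected"


def connection_events(lines):
    """Return (timestamp, kind, peer) tuples for connect/disconnect entries.

    Hypothetical helper for illustration; it only recognizes the two
    message shapes that appear in the log quoted in this thread.
    """
    events = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if match is None:
            continue  # banner, volfile dump, or a wrapped continuation line
        msg = match.group("msg")
        if msg.startswith(ACCEPT_PREFIX):
            events.append((match.group("ts"), "connect", msg[len(ACCEPT_PREFIX):]))
        elif msg.endswith(DISCONNECT_SUFFIX):
            events.append((match.group("ts"), "disconnect", msg[:-len(DISCONNECT_SUFFIX)]))
    return events


# Entries taken from the srv02 log above, re-joined onto single lines.
sample = [
    "[2009-09-23 21:27:16] N [server-protocol.c:7056:mop_setvolume] server: accepted client from 192.168.1.1:1021",
    "[2009-09-23 21:29:21] N [server-protocol.c:7816:notify] server: 192.168.1.1:1021 disconnected",
    "[2009-09-23 21:29:37] D [socket.c:1298:socket_submit] server: not connected (priv->connected = 255)",
    "[2009-09-23 21:29:40] N [server-protocol.c:7056:mop_setvolume] server: accepted client from 192.168.1.1:1015",
]

for ts, kind, peer in connection_events(sample):
    print(ts, kind, peer)
```

To run it against the real file, pass an open file handle instead of `sample`, for example `connection_events(open("/var/log/gluster/gluster-log.txt"))`, using the log path from the glusterfsd command line shown in the log header.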

