[Gluster-users] [Gluster-devel] GlusterFS drops mount point

Marcus Herou marcus.herou at tailsweep.com
Mon Sep 15 07:41:41 UTC 2008


Hi, posting to the users list as well since I think there are two different
issues.

1. The mount point gets dropped after 12-24 hours. The connection seems to go
stale, since an ls or similar then hangs "forever".

2. A bad client config which spits out a lot of EIOs. We are onto this and
will fix the config ASAP.

However, it is no. 1 that I'm really concerned about. We ran heavy load tests
with IOzone against the bad config (it actually works, but unify does not like
it) and got no errors from IOzone; on the contrary, we got quite nice
throughput!

What would happen to GlusterFS if the network between the client and the server
went away for just a second once a day? I suspect that this could be the issue.
It would be nice if GlusterFS could "auto remount" like NFS.

Kindly

//Marcus






---------- Forwarded message ----------
From: Marcus Herou <marcus.herou at tailsweep.com>
Date: Sun, Sep 14, 2008 at 1:43 PM
Subject: Re: {Disarmed} Re: [Gluster-devel] GlusterFS drops mount point
To: "Amar S. Tumballi" <amar at zresearch.com>
Cc: Brian Taber <btaber at diversecg.com>, Gluster-devel at nongnu.org


Thanks a bunch!

So this would lead to the mount point "losing" its connection?

Kindly

//Marcus



On Sat, Sep 13, 2008 at 7:49 PM, Amar S. Tumballi <amar at zresearch.com> wrote:

> this will always lead to EIO, as it fails to meet the criteria for unify's
> functioning.
>
> Unify wants a file to be present on only one of its subvolumes. In this
> case you have done AFR of (v1 v2), (v2 v3), (v3 v1), which means that if a
> file is present on the (v1 v2) pair, it will also be seen by the other two
> AFRs (v2 in the second pair and v1 in the third pair). So unify sees the
> file as present on all of its subvolumes, gets confused about which file to
> open, and returns EIO.
>
> The fix is: you need to export two volumes (instead of the currently
> present one) per server, and make pairs of (v1-1 v2-2), (v2-1 v3-2),
> (v3-1 v1-2). Hope I am clear.
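
(For reference, a rough sketch of the client-side pairing described above,
assuming each server now exports two bricks. The v*-* volume names follow the
pairs above; the remote-subvolume names home-1 and home-2 are placeholders for
whatever the two new exports end up being called.)

volume v1-1
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.10.30
  option remote-subvolume home-1
end-volume

volume v2-2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.10.31
  option remote-subvolume home-2
end-volume

(v2-1, v3-2, v3-1 and v1-2 would be defined the same way, each pointing at the
matching server and export.)

volume afr-1
  type cluster/afr
  subvolumes v1-1 v2-2
end-volume

volume afr-2
  type cluster/afr
  subvolumes v2-1 v3-2
end-volume

volume afr-3
  type cluster/afr
  subvolumes v3-1 v1-2
end-volume

With that layout a file scheduled onto afr-1 lives only on the first brick of
server 1 and the second brick of server 2, so neither afr-2 nor afr-3 sees it
and unify finds it on exactly one subvolume.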
>
> Regards,
>
>
>
>> Client:
>> volume v1
>>   type protocol/client
>>   option transport-type tcp/client
>>   option remote-host 192.168.10.30
>>   option remote-subvolume home
>> end-volume
>>
>> volume v2
>>   type protocol/client
>>   option transport-type tcp/client
>>   option remote-host 192.168.10.31
>>   option remote-subvolume home
>> end-volume
>>
>> volume v3
>>   type protocol/client
>>   option transport-type tcp/client
>>   option remote-host 192.168.10.32
>>   option remote-subvolume home
>> end-volume
>>
>> volume afr-1
>>   type cluster/afr
>>   subvolumes v1 v2
>> end-volume
>>
>> volume afr-2
>>   type cluster/afr
>>   subvolumes v2 v3
>> end-volume
>>
>> volume afr-3
>>   type cluster/afr
>>   subvolumes v3 v1
>> end-volume
>>
>> volume ns1
>>   type protocol/client
>>   option transport-type tcp/client
>>   option remote-host 192.168.10.30
>>   option remote-subvolume home-namespace
>> end-volume
>>
>> volume ns2
>>   type protocol/client
>>   option transport-type tcp/client
>>   option remote-host 192.168.10.31
>>   option remote-subvolume home-namespace
>> end-volume
>>
>> volume ns3
>>   type protocol/client
>>   option transport-type tcp/client
>>   option remote-host 192.168.10.32
>>   option remote-subvolume home-namespace
>> end-volume
>>
>> volume namespace
>>   type cluster/afr
>>   subvolumes ns1 ns2 ns3
>> end-volume
>>
>> volume v
>>   type cluster/unify
>>   option scheduler rr
>>   option namespace namespace
>>   subvolumes afr-1 afr-2 afr-3
>> end-volume
>>
>> I really hope we have misconfigured something since that is the easiest
>> fix :)
>>
>> Kindly
>>
>> //Marcus
>>
>>
>>
>>
>> On Sat, Sep 13, 2008 at 12:50 AM, Amar S. Tumballi <amar at zresearch.com> wrote:
>>
>>> Also, which version of GlusterFS?
>>>
>>> Brian Taber <btaber at diversecg.com> wrote:
>>>
>>>> Maybe a configuration issue... let's start with the config. What does your
>>>> config look like on the client and server?
>>>>
>>>> Marcus Herou wrote:
>>>>
>>>>> Lots of these on the server:
>>>>> 2008-09-12 20:48:14 E [protocol.c:271:gf_block_unserialize_transport]
>>>>> server: EOF from peer (192.168.10.4:1007)
>>>>> ...
>>>>> 2008-09-12 20:50:12 E [server-protocol.c:4153:server_closedir] server:
>>>>> not getting enough data, returning EINVAL
>>>>> ...
>>>>> 2008-09-12 20:50:12 E [server-protocol.c:4148:server_closedir] server:
>>>>> unresolved fd 6
>>>>> ...
>>>>> 2008-09-12 20:51:47 E [protocol.c:271:gf_block_unserialize_transport]
>>>>> server: EOF from peer (192.168.10.10:1015)
>>>>>
>>>>> ...
>>>>>
>>>>> And lots of these on the client:
>>>>>
>>>>> 2008-09-12 19:54:45 E [afr.c:2201:afr_open] home-namespace: self heal
>>>>> failed, returning EIO
>>>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 3954: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 3956: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 3958: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 3987: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 3989: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 3991: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 3993: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:54 C [client-protocol.c:212:call_bail] home3: bailing
>>>>> transport
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup]
>>>>> home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3:
>>>>> no proper reply from server, returning ENOTCONN
>>>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3:
>>>>> (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal
>>>>> failed, returning EIO
>>>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 3970: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup]
>>>>> home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3:
>>>>> no proper reply from server, returning ENOTCONN
>>>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3:
>>>>> (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal
>>>>> failed, returning EIO
>>>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 3971: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup]
>>>>> home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3:
>>>>> no proper reply from server, returning ENOTCONN
>>>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3:
>>>>> (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal
>>>>> failed, returning EIO
>>>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 3972: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup]
>>>>> home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3:
>>>>> no proper reply from server, returning ENOTCONN
>>>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3:
>>>>> (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal
>>>>> failed, returning EIO
>>>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 3974: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup]
>>>>> home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3:
>>>>> no proper reply from server, returning ENOTCONN
>>>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3:
>>>>> (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal
>>>>> failed, returning EIO
>>>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 4001: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup]
>>>>> home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3:
>>>>> no proper reply from server, returning ENOTCONN
>>>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3:
>>>>> (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal
>>>>> failed, returning EIO
>>>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 4002: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup]
>>>>> home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3:
>>>>> no proper reply from server, returning ENOTCONN
>>>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3:
>>>>> (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal
>>>>> failed, returning EIO
>>>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse:
>>>>> 4004: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>>>> 2008-09-12 19:55:01 E [unify.c:335:unify_lookup] home: returning ESTALE
>>>>> for /rsyncer/.ssh/authorized_keys2: file count is 4
>>>>> 2008-09-12 19:55:01 E [unify.c:339:unify_lookup] home:
>>>>> /rsyncer/.ssh/authorized_keys2: found on home-namespace
>>>>> 2008-09-12 19:55:01 E [unify.c:339:unify_lookup] home:
>>>>> /rsyncer/.ssh/authorized_keys2: found on home-afr-2
>>>>> 2008-09-12 19:55:01 E [unify.c:339:unify_lookup] home:
>>>>> /rsyncer/.ssh/authorized_keys2: found on home-afr-1
>>>>> 2008-09-12 19:55:01 E [unify.c:339:unify_lookup] home:
>>>>> /rsyncer/.ssh/authorized_keys2: found on home-afr-3
>>>>>
>>>>>
>>>>> Both server and client are spitting out tons of these. I thought "E" was
>>>>> the Error level, but it seems more like DEBUG?
>>>>>
>>>>> Kindly
>>>>>
>>>>> //Marcus
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Sep 12, 2008 at 8:01 PM, Brian Taber <btaber at diversecg.com> wrote:
>>>>>
>>>>>    What do you see in your server and client logs for gluster?
>>>>>
>>>>>    -------------------------
>>>>>    Brian Taber
>>>>>    Owner/IT Specialist
>>>>>    Diverse Computer Group
>>>>>    Office: 774-206-5592
>>>>>    Cell: 508-496-9221
>>>>>    btaber at diversecg.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>    Marcus Herou wrote:
>>>>>    > Hi.
>>>>>    >
>>>>>    > We have just recently installed a 3-node cluster with 16 SATA disks
>>>>>    > each.
>>>>>    >
>>>>>    > We are using Hardy and the glusterfs-3.10 Ubuntu package on both
>>>>>    > client(s) and server.
>>>>>    >
>>>>>    > We have only created one export (/home) so far, since we want to
>>>>>    > test it a while before putting it into a live high-performance
>>>>>    > environment.
>>>>>    >
>>>>>    > The problem currently is that the client loses /home once a day or
>>>>>    > so. This is really bad, since this is the machine all the others
>>>>>    > connect to with SSH keys, thus making them unable to log in.
>>>>>    >
>>>>>    > Anyone seen something similar?
>>>>>    >
>>>>>    > Kindly
>>>>>    >
>>>>>    > //Marcus
>>>>>    >
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Marcus Herou CTO and co-founder Tailsweep AB
>>>>> +46702561312
>>>>> marcus.herou at tailsweep.com
>>>>> http://www.tailsweep.com/
>>>>> http://blogg.tailsweep.com/
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Amar Tumballi
>>> Gluster/GlusterFS Hacker
>>> [bulde on #gluster/irc.gnu.org]
>>> http://www.zresearch.com - Commoditizing Super Storage!
>>>
>>
>>
>>
>> --
>> Marcus Herou CTO and co-founder Tailsweep AB
>> +46702561312
>> marcus.herou at tailsweep.com
>> http://www.tailsweep.com/
>> http://blogg.tailsweep.com/
>>
>
>
>
> --
> Amar Tumballi
> Gluster/GlusterFS Hacker
> [bulde on #gluster/irc.gnu.org]
> http://www.zresearch.com - Commoditizing Super Storage!
>



-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.herou at tailsweep.com
http://www.tailsweep.com/
http://blogg.tailsweep.com/


