[Gluster-devel] Problems with self-heal

E-Comm Factory sistemas at e-commfactory.com
Tue Feb 26 08:15:08 UTC 2008


Glad to hear it. Thank you for your time!

Congratulations on the good work.

On Tue, Feb 26, 2008 at 5:33 AM, Raghavendra G <raghavendra.hg at gmail.com>
wrote:

> Hi,
>
> Thanks for the access. We have found the issue and will soon commit a fix
> to the tla repository.
>
> regards,
>
>
> On Thu, Feb 21, 2008 at 10:20 PM, E-Comm Factory <
> sistemas at e-commfactory.com> wrote:
>
> > Hello Raghavendra,
> >
> > The client box is down right now because it is outside our working hours.
> > You can check it from 9:00 to 14:00 and from 14:00 to 19:00 GMT+1.
> >
> > Thanks in advance,
> >
> >
> > On Thu, Feb 21, 2008 at 12:00 PM, Raghavendra G <
> > raghavendra.hg at gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I tried to reproduce the issue on my local system with the
> > > configuration provided by you without any success. Self-heal is working fine
> > > on my system. Is it possible to provide us access to your system, so that we
> > > can look at the issue first hand?
> > >
> > > regards,
> > >
> > >
> > > On Thu, Feb 21, 2008 at 1:34 PM, E-Comm Factory <
> > > sistemas at e-commfactory.com> wrote:
> > >
> > > > This is my desired configuration:
> > > >
> > > > - Server Box 1: 4 HDs, with 4 storage partitions and 4 namespace
> > > > partitions
> > > >
> > > >  AFR with the 4 ns partitions
> > > >  UNIFY with 4 ds partitions and previous AFR volume
> > > >
> > > > - Server Box 2: 4 HDs, with 4 storage partitions and 4 namespace
> > > > partitions
> > > >
> > > >  AFR with the 4 ns partitions
> > > >  UNIFY with 4 ds partitions and previous AFR volume
> > > >
> > > > - Client Box:
> > > >
> > > >  AFR with the two UNIFY volumes
> > > >
> > > >
> > > > It seems that I CAN'T use an afr volume under my two unify volumes if
> > > > the client then uses afr again on top of these 2 unified volumes. So I
> > > > have deleted the namespace afr and now I use a single posix volume as ns.
> > > >
> > > > Is this the way it should be done, or have I missed something?
> > > >
> > > > I will gratefully accept any help.
> > > >
> > > > On Tue, Feb 19, 2008 at 7:12 PM, E-Comm Factory <
> > > > sistemas at e-commfactory.com>
> > > > wrote:
> > > >
> > > > >
> > > > > More info about this issue:
> > > > >
> > > > > 2008-02-19 19:06:09 D [inode.c:356:__active_inode] disk-fs11/inode: activating inode(1064961), lru=6/1024
> > > > > 2008-02-19 19:06:09 D [lock.c:128:mop_lock_impl] lock: Lock request to /file.img queued
> > > > > 2008-02-19 19:09:10 E [protocol.c:259:gf_block_unserialize_transport] server: EOF from peer (192.168.1.103:1023)
> > > > > 2008-02-19 19:09:10 C [tcp.c:87:tcp_disconnect] server: connection disconnected
> > > > > 2008-02-19 19:09:10 D [tcp-server.c:145:tcp_server_notify] server: Registering socket (4) for new transport object of 192.168.1.103
> > > > > 2008-02-19 19:09:10 D [ip.c:98:gf_auth] disk-fs11: allowed = "*", received ip addr = "192.168.1.103"
> > > > > 2008-02-19 19:09:10 D [server-protocol.c:5487:mop_setvolume] server: accepted client from 192.168.1.103:1021
> > > > > 2008-02-19 19:09:10 E [server-protocol.c:183:generic_reply] server: transport_writev failed
> > > > > 2008-02-19 19:09:10 D [server-protocol.c:6067:server_protocol_cleanup] server: cleaned up transport state for client 192.168.1.103:1023
> > > > > 2008-02-19 19:09:10 D [tcp-server.c:248:gf_transport_fini] server: destroying transport object for 192.168.1.103:1023 (fd=4)
> > > > > 2008-02-19 19:09:10 D [inode.c:386:__passive_inode] disk-fs11/inode: passivating inode(1064961), lru=7/1024
> > > > > 2008-02-19 19:09:10 D [inode.c:356:__active_inode] disk-fs11/inode: activating inode(1064961), lru=6/1024
> > > > > 2008-02-19 19:09:12 D [inode.c:386:__passive_inode] disk-fs11/inode: passivating inode(1064961), lru=7/1024
> > > > >
> > > > > I don't know if this helps. Any help will be appreciated!
> > > > >
> > > > > Finally, thanks guys for this fantastic project. GlusterFS is amazing.
> > > > >
> > > > >
> > > > > On Tue, Feb 19, 2008 at 5:26 PM, E-Comm Factory <
> > > > > sistemas at e-commfactory.com> wrote:
> > > > >
> > > > > >
> > > > > > I also tested glusterfs-mainline-2.5 PATCH 674 with the same results.
> > > > > >
> > > > > >
> > > > > > On Tue, Feb 19, 2008 at 5:23 PM, Toni Valverde <
> > > > > > tvalverde at e-commfactory.com> wrote:
> > > > > >
> > > > > > > I also tested glusterfs-mainline-2.5 PATCH 674 with the same
> > > > > > > results.
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Feb 19, 2008 at 4:05 PM, E-Comm Factory <
> > > > > > > sistemas at e-commfactory.com> wrote:
> > > > > > >
> > > > > > > > Thanks, Amar.
> > > > > > > >
> > > > > > > > - glusterfs--mainline--2.5 PATCH 665
> > > > > > > > - fuse-2.7.2glfs8
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Feb 18, 2008 at 8:48 PM, Amar S. Tumballi <
> > > > > > > > amar at zresearch.com> wrote:
> > > > > > > >
> > > > > > > > > > Can you please let us know which versions of fuse and glusterfs
> > > > > > > > > > you are running these tests with?
> > > > > > > > >
> > > > > > > > > -amar
> > > > > > > > >
> > > > > > > > > On Feb 18, 2008 11:03 PM, E-Comm Factory <
> > > > > > > > > sistemas at e-commfactory.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hello,
> > > > > > > > > >
> > > > > > > > > > > I have 2 boxes, each with 4 unified disks (so I have 2 volumes).
> > > > > > > > > > > Then, on the client side, I have set up afr over these 2 virtual
> > > > > > > > > > > volumes.
> > > > > > > > > > >
> > > > > > > > > > > For testing purposes I deleted one file on the second afr volume
> > > > > > > > > > > and then tried to self-heal the global afr, but it fails with
> > > > > > > > > > > this error:
> > > > > > > > > >
> > > > > > > > > > > [afr.c:2754:afr_open] disk: self heal failed, returning EIO
> > > > > > > > > > > [fuse-bridge.c:675:fuse_fd_cbk] glusterfs-fuse: 98: /fichero4.img => -1 (5)
> > > > > > > > > >
> > > > > > > > > > > An strace attached to the PID of the glusterfs server running on
> > > > > > > > > > > the first afr volume shows that it crashes too during self-heal:
> > > > > > > > > > >
> > > > > > > > > > > epoll_wait(6, {{EPOLLIN, {u32=6304624, u64=6304624}}}, 2, 4294967295) = 1
> > > > > > > > > > read(4, out of memory
> > > > > > > > > > 0x7fff9a3b1d90, 113)            = 113
> > > > > > > > > > read(4, Segmentation fault
> > > > > > > > > >
> > > > > > > > > > My server config file (same on both server boxes):
> > > > > > > > > >
> > > > > > > > > > # datastores
> > > > > > > > > > volume disk1
> > > > > > > > > >  type storage/posix
> > > > > > > > > >  option directory /mnt/disk1
> > > > > > > > > > end-volume
> > > > > > > > > > volume disk2
> > > > > > > > > >  type storage/posix
> > > > > > > > > >  option directory /mnt/disk2
> > > > > > > > > > end-volume
> > > > > > > > > > volume disk3
> > > > > > > > > >  type storage/posix
> > > > > > > > > >  option directory /mnt/disk3
> > > > > > > > > > end-volume
> > > > > > > > > > volume disk4
> > > > > > > > > >  type storage/posix
> > > > > > > > > >  option directory /mnt/disk4
> > > > > > > > > > end-volume
> > > > > > > > > >
> > > > > > > > > > # namespaces
> > > > > > > > > > volume disk1-ns
> > > > > > > > > >  type storage/posix
> > > > > > > > > >  option directory /mnt/disk1-ns
> > > > > > > > > > end-volume
> > > > > > > > > > volume disk2-ns
> > > > > > > > > >  type storage/posix
> > > > > > > > > >  option directory /mnt/disk2-ns
> > > > > > > > > > end-volume
> > > > > > > > > > #volume disk3-ns
> > > > > > > > > > #  type storage/posix
> > > > > > > > > > #  option directory /mnt/disk3-ns
> > > > > > > > > > #end-volume
> > > > > > > > > > #volume disk4-ns
> > > > > > > > > > #  type storage/posix
> > > > > > > > > > #  option directory /mnt/disk4-ns
> > > > > > > > > > #end-volume
> > > > > > > > > >
> > > > > > > > > > > # afr of the namespaces
> > > > > > > > > > volume disk-ns-afr
> > > > > > > > > >  type cluster/afr
> > > > > > > > > >  subvolumes disk1-ns disk2-ns
> > > > > > > > > >  option scheduler random
> > > > > > > > > > end-volume
> > > > > > > > > >
> > > > > > > > > > > # unify of the datastores
> > > > > > > > > > volume disk-unify
> > > > > > > > > >  type cluster/unify
> > > > > > > > > >  subvolumes disk1 disk2 disk3 disk4
> > > > > > > > > >  option namespace disk-ns-afr
> > > > > > > > > >  option scheduler rr
> > > > > > > > > > end-volume
> > > > > > > > > >
> > > > > > > > > > > # performance translator for the disk
> > > > > > > > > > volume disk-fs11
> > > > > > > > > >  type performance/io-threads
> > > > > > > > > >  option thread-count 8
> > > > > > > > > >  option cache-size 64MB
> > > > > > > > > >  subvolumes disk-unify
> > > > > > > > > > end-volume
> > > > > > > > > >
> > > > > > > > > > > # allow access from any client
> > > > > > > > > > volume server
> > > > > > > > > >  type protocol/server
> > > > > > > > > >  option transport-type tcp/server
> > > > > > > > > >  subvolumes disk-fs11
> > > > > > > > > >  option auth.ip.disk-fs11.allow *
> > > > > > > > > > end-volume
> > > > > > > > > >
> > > > > > > > > > My client config file:
> > > > > > > > > >
> > > > > > > > > > volume disk-fs11
> > > > > > > > > >  type protocol/client
> > > > > > > > > >  option transport-type tcp/client
> > > > > > > > > >  option remote-host 192.168.1.34
> > > > > > > > > >  option remote-subvolume disk-fs11
> > > > > > > > > > end-volume
> > > > > > > > > >
> > > > > > > > > > volume disk-fs12
> > > > > > > > > >  type protocol/client
> > > > > > > > > >  option transport-type tcp/client
> > > > > > > > > >  option remote-host 192.168.1.35
> > > > > > > > > >  option remote-subvolume disk-fs12
> > > > > > > > > > end-volume
> > > > > > > > > >
> > > > > > > > > > volume disk
> > > > > > > > > >  type cluster/afr
> > > > > > > > > >  subvolumes disk-fs11 disk-fs12
> > > > > > > > > > end-volume
> > > > > > > > > >
> > > > > > > > > > volume trace
> > > > > > > > > >  type debug/trace
> > > > > > > > > >  subvolumes disk
> > > > > > > > > > > #  option includes open,close,create,readdir,opendir,closedir
> > > > > > > > > > #  option excludes lookup,read,write
> > > > > > > > > > end-volume
> > > > > > > > > >
> > > > > > > > > > > Could anyone help me?
> > > > > > > > > >
> > > > > > > > > > Thanks in advance.
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > ecomm
> > > > > > > > > > sistemas at e-commfactory.com
> > > > > > > > > >  _______________________________________________
> > > > > > > > > > Gluster-devel mailing list
> > > > > > > > > > Gluster-devel at nongnu.org
> > > > > > > > > > http://lists.nongnu.org/mailman/listinfo/gluster-devel
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Amar Tumballi
> > > > > > > > > Gluster/GlusterFS Hacker
> > > > > > > > > [bulde on #gluster/irc.gnu.org]
> > > > > > > > > > http://www.zresearch.com - Commoditizing Supercomputing and Superstorage!
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Toni Valverde
> > > > > > > > tvalverde at e-commfactory.com
> > > > > > > >
> > > > > > > > Electronic Commerce Factory S.L.
> > > > > > > > C/Martin de los Heros, 59bis - 1º nº 8
> > > > > > > > 28008 - Madrid
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Toni Valverde
> > > > > > > tvalverde at e-commfactory.com
> > > > > > >
> > > > > > > Electronic Commerce Factory S.L.
> > > > > > > C/Martin de los Heros, 59bis - 1º nº 8
> > > > > > > 28008 - Madrid
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Toni Valverde
> > > > > > tvalverde at e-commfactory.com
> > > > > >
> > > > > > Electronic Commerce Factory S.L.
> > > > > > C/Martin de los Heros, 59bis - 1º nº 8
> > > > > > 28008 - Madrid
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Toni Valverde
> > > > > tvalverde at e-commfactory.com
> > > > >
> > > > > Electronic Commerce Factory S.L.
> > > > > C/Martin de los Heros, 59bis - 1º nº 8
> > > > > 28008 - Madrid
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Toni Valverde
> > > > tvalverde at e-commfactory.com
> > > >
> > > > Electronic Commerce Factory S.L.
> > > > C/Martin de los Heros, 59bis - 1º nº 8
> > > > 28008 - Madrid
> > > > _______________________________________________
> > > > Gluster-devel mailing list
> > > > Gluster-devel at nongnu.org
> > > > http://lists.nongnu.org/mailman/listinfo/gluster-devel
> > > >
> > >
> > >
> > >
> > > --
> > > Raghavendra G
> > >
> > > A centipede was happy quite, until a toad in fun,
> > > Said, "Pray, which leg comes after which?",
> > > This raised his doubts to such a pitch,
> > > He fell flat into the ditch,
> > > Not knowing how to run.
> > > -Anonymous
>
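
For anyone retracing the quoted volume files: a quick way to sanity-check a
spec file is to confirm that every name listed under "subvolumes" or
"option namespace" is actually defined as a volume. The Python below is only
a rough sketch of that check. The parser is a simplification written for this
note (it is not GlusterFS code), the function names parse_volfile and
undefined_references are made up, and SAMPLE is a shortened, deliberately
incomplete excerpt of the server config above. It does not diagnose the
self-heal crash itself.

```python
def parse_volfile(text):
    """Parse 'volume NAME ... end-volume' blocks into a dict.

    Simplified, assumed grammar: one directive per line, '#' starts a comment.
    """
    volumes = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        words = line.split()
        if words[0] == "volume" and len(words) == 2:
            current = {"type": None, "options": {}, "subvolumes": []}
            volumes[words[1]] = current
        elif words[0] == "end-volume":
            current = None
        elif current is not None:
            if words[0] == "type" and len(words) == 2:
                current["type"] = words[1]
            elif words[0] == "subvolumes":
                current["subvolumes"] = words[1:]
            elif words[0] == "option" and len(words) >= 3:
                current["options"][words[1]] = " ".join(words[2:])
    return volumes


def undefined_references(volumes):
    """Return (volume, referenced_name) pairs whose target is not defined."""
    missing = []
    for name, vol in volumes.items():
        refs = list(vol["subvolumes"])
        if "namespace" in vol["options"]:
            refs.append(vol["options"]["namespace"])
        for ref in refs:
            if ref not in volumes:
                missing.append((name, ref))
    return missing


# Shortened excerpt: disk2 and disk-ns-afr are left out on purpose.
SAMPLE = """
# datastores
volume disk1
 type storage/posix
 option directory /mnt/disk1
end-volume
#volume disk3-ns
#  type storage/posix
#end-volume
volume disk-unify
 type cluster/unify
 subvolumes disk1 disk2
 option namespace disk-ns-afr
 option scheduler rr
end-volume
"""

if __name__ == "__main__":
    for volume, ref in undefined_references(parse_volfile(SAMPLE)):
        print(f"{volume}: '{ref}' is referenced but not defined")
```

Run against a complete spec file, an empty result only means the names are
consistent within that file. It says nothing about whether the translators
accept the options given, or whether a remote-subvolume named in the client
file is really exported by the server it points at.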



-- 
Toni Valverde
tvalverde at e-commfactory.com

Electronic Commerce Factory S.L.
C/Martin de los Heros, 59bis - 1º nº 8
28008 - Madrid


