[Gluster-users] Some problems

Atıf CEYLAN mehmet at atifceylan.com
Sat Dec 1 11:29:20 UTC 2012


On Fri, 2012-11-30 at 12:51 -0500, Jeff Darcy wrote:

> On 11/28/2012 10:27 AM, Atıf CEYLAN wrote:
> > My first question: if GlusterFS starts before the imap/pop3 server, the
> > imap/pop3 server cannot bind ports 993 and 995 because GlusterFS is already
> > using them. Why does GlusterFS use these ports?
> 
> Like many other system programs, GlusterFS tries to use ports below 1024 which 
> are supposed to be privileged, hunting downward until it finds one that's 
> available.  If this is a problem for you, I suggest looking into the 
> "portreserve" command.
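For reference, portreserve works by holding the service ports early in boot and releasing them just before the real daemon binds. A minimal sketch, assuming a Dovecot setup and a distro that ships portreserve (file name under /etc/portreserve/ is arbitrary):

```shell
# Reserve the IMAPS/POP3S ports so GlusterFS's downward port hunt skips them.
# portreserve reads one service name (or port number) per line.
printf 'imaps\npop3s\n' > /etc/portreserve/dovecot   # imaps=993, pop3s=995

# Start portreserve early in the boot sequence, before glusterd.
portreserve

# In the imap/pop3 init script, release the ports right before the daemon starts:
portrelease dovecot
```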
> 
> > Second, one of the two Debian servers crashed and booted up again. When it
> > started, the GlusterFS heal process began, but a few minutes later the records
> > below were written to the log and the GlusterFS native client (FUSE) crashed.
> >
> > [2012-11-28 12:11:33.763486] E
> > [afr-self-heal-data.c:763:afr_sh_data_fxattrop_fstat_done] 0-m3-replicate-0:
> > Unable to self-heal contents of
> > '/domains/1/abc.com/info/Maildir/dovecot.index.log' (possible split-brain).
> > Please delete the file from all but the preferred subvolume.
> > [2012-11-28 12:11:33.763659] E
> > [afr-self-heal-common.c:2160:afr_self_heal_completion_cbk] 0-m3-replicate-0:
> > background  meta-data data self-heal failed on
> > /domains/1/O/abc.com/info/Maildir/dovecot.index.log
> > [2012-11-28 12:11:33.763927] W [afr-open.c:213:afr_open] 0-m3-replicate-0:
> > failed to open as split brain seen, returning EIO
> > [2012-11-28 12:11:33.763958] W [fuse-bridge.c:1948:fuse_readv_cbk]
> > 0-glusterfs-fuse: 432877: READ =-1 (Input/output error)
> > [2012-11-28 12:11:33.764039] W [afr-open.c:213:afr_open] 0-m3-replicate-0:
> > failed to open as split brain seen, returning EIO
> > [2012-11-28 12:11:33.764062] W [fuse-bridge.c:1948:fuse_readv_cbk]
> > 0-glusterfs-fuse: 432878: READ =-1 (Input/output error)
> > [2012-11-28 12:11:36.274580] E
> > [afr-self-heal-data.c:763:afr_sh_data_fxattrop_fstat_done] 0-m3-replicate-0:
> > Unable to self-heal contents of
> > '/domains/xxx.com/info/Maildir/dovecot.index.log' (possible split-brain).
> > Please delete the file from all but the preferred subvolume.
> > [2012-11-28 12:11:36.274781] E
> > [afr-self-heal-common.c:2160:afr_self_heal_completion_cbk] 0-m3-replicate-0:
> > background  meta-data data self-heal failed on
> > /domains/xxx.com/info/Maildir/dovecot.index.log
> 
> The phrase "split brain" means that we detected changes to both replicas, and 
> it would be unsafe to let one override the other (i.e. we might lose data), so we 
> keep our hands off until the user has a chance to intervene.  This can happen 
> in two distinct ways:
> 
> * Network partition: client A can only reach replica X, client B can only reach 
> replica Y, both make changes which end up causing split brain.
> 
> * Multiple failures over time.  X goes down, changes occur only on Y, then Y 
> goes down and X comes up (or X comes up and Y goes down before self-heal is 
> finished) so changes only occur at X.
> 
> The quorum feature should address both of these, at the expense of returning 
> errors if an insufficient number of replicas are available (so it works best 
> with replica count >= 3).
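The quorum feature mentioned above is enabled per volume; a sketch, with the volume name "m3" assumed from the "0-m3-replicate-0" log prefix:

```shell
# Require a majority of replicas ("auto") to be reachable before allowing
# writes; with replica 2 this effectively means both bricks must be up,
# which is why replica count >= 3 is more practical.
gluster volume set m3 cluster.quorum-type auto

# Alternatively, require an explicit number of live replicas for writes:
gluster volume set m3 cluster.quorum-count 2
```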
> 
> It's also usually worth figuring out why such problems happened in the first 
> place.  Do you have a lot of network problems or server failures?  Are these 
> servers widely separated?  Either is likely to cause problems not only with 
> GlusterFS but with any distributed filesystem, so it's a good idea to address 
> such issues or at least mention them when reporting problems.

One of my servers is currently disabled, because when I start the glusterfs
service the client (FUSE) gives a "/mnt/s2/mail/mail1-2: Transport
endpoint is not connected" error. I want to run both servers without
shutting down my http and other services, so I can't start the glusterfs
service on the crashed server. How can I debug and fix these errors
without stopping the running services?

There are lots of file consistency errors in gluster.log that look like
the ones below.

Shall I move these files outside the cluster directories, run the heal
command, and then move the files back to their old directories through
the client?

[2012-11-28 12:11:33.763486] E
[afr-self-heal-data.c:763:afr_sh_data_fxattrop_fstat_done]
0-m3-replicate-0: Unable to self-heal contents of
'/domains/1/abc.com/info/Maildir/dovecot.index.log' (possible
split-brain). Please delete the file from all but the preferred
subvolume.
[2012-11-28 12:11:33.763659] E
[afr-self-heal-common.c:2160:afr_self_heal_completion_cbk]
0-m3-replicate-0: background  meta-data data self-heal failed
on /domains/1/O/abc.com/info/Maildir/dovecot.index.log
[2012-11-28 12:11:33.763927] W [afr-open.c:213:afr_open]
0-m3-replicate-0: failed to open as split brain seen, returning EIO
[2012-11-28 12:11:33.763958] W [fuse-bridge.c:1948:fuse_readv_cbk]
0-glusterfs-fuse: 432877: READ =-1 (Input/output error)
[2012-11-28 12:11:33.764039] W [afr-open.c:213:afr_open]
0-m3-replicate-0: failed to open as split brain seen, returning EIO
[2012-11-28 12:11:33.764062] W [fuse-bridge.c:1948:fuse_readv_cbk]
0-glusterfs-fuse: 432878: READ =-1 (Input/output error)
[2012-11-28 12:11:36.274580] E
[afr-self-heal-data.c:763:afr_sh_data_fxattrop_fstat_done]
0-m3-replicate-0: Unable to self-heal contents of
'/domains/xxx.com/info/Maildir/dovecot.index.log' (possible
split-brain). Please delete the file from all but the preferred
subvolume.
[2012-11-28 12:11:36.274781] E
[afr-self-heal-common.c:2160:afr_self_heal_completion_cbk]
0-m3-replicate-0: background  meta-data data self-heal failed
on /domains/xxx.com/info/Maildir/dovecot.index.log
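Following the log's "delete the file from all but the preferred subvolume" hint, the usual GlusterFS 3.3-era manual fix is done on the brick holding the bad copy, not through the FUSE mount. A sketch, with the brick path assumed and the volume name "m3" taken from the log prefix; note the GFID hardlink under .glusterfs must be removed too:

```shell
# On the server whose copy you want to discard (operate on the brick
# directory, NOT the client mount):
BRICK=/export/brick1                                   # assumed brick path
F=domains/1/abc.com/info/Maildir/dovecot.index.log

# Read the file's GFID from its trusted.gfid xattr before deleting it,
# then derive the hardlink path .glusterfs/aa/bb/<uuid>:
GFID=$(getfattr -n trusted.gfid -e hex --only-values "$BRICK/$F" | cut -c3-)
UUID="${GFID:0:8}-${GFID:8:4}-${GFID:12:4}-${GFID:16:4}-${GFID:20:12}"

rm -f "$BRICK/$F"
rm -f "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$UUID"

# Let self-heal copy the surviving good replica back:
gluster volume heal m3
```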

-- 
M.Atıf CEYLAN
Yurdum Yazılım