[Gluster-users] rsync causing gluster to crash

Benjamin Hudgens bhudgens at photodex.com
Mon Mar 22 17:35:13 UTC 2010


Hello,

Is there any additional information we need to provide to get help
diagnosing our issue with keeping GlusterFS mounted?

Kind Regards,
Benjamin

-----Original Message-----
From: gluster-users-bounces at gluster.org
[mailto:gluster-users-bounces at gluster.org] On Behalf Of Benjamin Hudgens
Sent: Friday, March 19, 2010 2:54 PM
Cc: Gluster Users
Subject: Re: [Gluster-users] rsync causing gluster to crash

Hello Vikas,

Thank you for your help.  Gluster does not seem to core dump when this
happens, so no dump file is created.  Hopefully I can provide enough
information here for you to diagnose the issue.
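
If it would help, here is roughly what we plan to try before the next
run so there is something more concrete to send.  This is only a
sketch; we have not confirmed where a core would land on this box, and
since the glusterfs client keeps running, attaching gdb to the live
process may be the more useful option:

# ulimit -c unlimited
# umount /mnt/backups && mount /mnt/backups
(reproduce with the rsync command below, then)
# gdb -p `pidof glusterfs`
(gdb) thread apply all bt full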

As mentioned in the original email, we get an error indicating a stale
NFS file handle.  We are not using NFS anywhere in our setup, so the
error is misleading.  Once this condition occurs we cannot access the
Gluster mount point without re-mounting it (a standard umount/mount).
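
For what it is worth, this is how we convince ourselves the mount
really is the FUSE/GlusterFS client rather than NFS (just a sketch of
the checks we run; the type column should show a fuse filesystem, not
nfs), and recovery is simply the remount mentioned above:

# grep /mnt/backups /proc/mounts
# mount | grep backups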

Below are the exact steps to reproduce our particular problem:

---------------------------

/mnt/control <- contains sample (random) files on a LOCAL drive
/mnt/backups <- our glusterfs mount point

---------------------------

--NO gluster
backup0:/mnt# ps auxw | grep glus | grep -v grep

--Mounting Gluster (see bottom for fstab)
backup0:/mnt# mount /mnt/backups/

--Gluster is running
backup0:/mnt# ps auxw | grep glus | grep -v grep
root      4341  0.0  0.0  27844  1456 ?        Ssl  14:29   0:00
/usr/local/sbin/glusterfs --log-level=NORMAL
--volfile=/etc/glusterfs/glusterfs.vol /mnt/backups

--Can browse the dir
backup0:/mnt# cd backups/
backup0:/mnt/backups# ls
10001   100401  100801  101301  10201   102501  103101  103801  104301
105201  105701  106101  106701  107201  107601  10801   108601  10901
109301  1101    110501  11101   111401  11501
1001    100601  101     101401  102201  103001  103401  10401   104401
105301  106001  106401  106801  107301  107701  108101  108701  109101
109401  110101  110801  111101  111701  115101
100101  100701  10101   102001  102401  10301   103501  104201  10501
105601  10601   106501  10701   107501  107901  108401  108901  109201
11001   110201  111001  111301  111901  115701

--Run Rsync
backup0:/mnt# rsync -rav -X /mnt/control/ /mnt/backups/ 2> output.txt 1> output.txt

--Nothing on this machine is NFS... weird error
backup0:/mnt# cd backups
  bash: cd: backups: Stale NFS file handle

--Gluster is still running
backup0:/mnt# ps auxw |grep glu | grep -v grep | grep -v tail
root      4780  0.4  0.0  29900  2480 ?        Ssl  14:42   0:01
/usr/local/sbin/glusterfs --log-level=NORMAL
--volfile=/etc/glusterfs/glusterfs.vol /mnt/backups

--Our version of Gluster
backup0:/usr/local/src# ls -al /usr/local/src/glusterfs-3.0.3.tar.gz
-rw-r--r-- 1 root staff 1684933 2010-03-14 00:04
/usr/local/src/glusterfs-3.0.3.tar.gz

---------------------------

backup0:/mnt# cat /etc/fstab
# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    defaults        0       0
/dev/sda1       /               ext3    errors=remount-ro 0       1
/dev/sdb1       /mnt/sdb1       ext3    errors=remount-ro 0       1
/dev/sdc1       /mnt/sdc1       ext3    errors=remount-ro 0       1
/dev/sdd1       /mnt/sdd1       ext3    errors=remount-ro 0       1
/dev/sda5       none            swap    sw              0       0
/etc/glusterfs/glusterfs.vol  /mnt/backups  glusterfs  defaults  0  0
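
For reference, the mount helper turns that last fstab line into the
direct glusterfs invocation visible in the ps output above; as far as
we can tell the equivalent manual command is:

# /usr/local/sbin/glusterfs --log-level=NORMAL \
      --volfile=/etc/glusterfs/glusterfs.vol /mnt/backups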

---------------------------

backup0:/mnt# cat /etc/glusterfs/glusterfs.vol
### file: client-volume.vol.sample

#####################################
###  GlusterFS Client Volume File  ##
#####################################

#### CONFIG FILE RULES:
### "#" is comment character.
### - Config file is case sensitive
### - Options within a volume block can be in any order.
### - Spaces or tabs are used as delimitter within a line.
### - Each option should end within a line.
### - Missing or commented fields will assume default values.
### - Blank/commented lines are allowed.
### - Sub-volumes should already be defined above before referring.

# drives
volume backup1-sda1
  type protocol/client
  option transport-type tcp
  option remote-host 10.10.0.71
  option remote-subvolume sda1
end-volume

volume backup2-sda1
  type protocol/client
  option transport-type tcp
  option remote-host 10.10.0.72
  option remote-subvolume sda1
end-volume

volume backup3-sda1
  type protocol/client
  option transport-type tcp
  option remote-host 10.10.0.73
  option remote-subvolume sda1
end-volume

volume backup4-sda1
  type protocol/client
  option transport-type tcp
  option remote-host 10.10.0.74
  option remote-subvolume sda1
end-volume

# replication
volume b-1-2-sda1
  type cluster/replicate
  subvolumes backup1-sda1 backup2-sda1
end-volume

volume b-3-4-sda1
  type cluster/replicate
  subvolumes backup3-sda1 backup4-sda1
end-volume

# striping
volume distribute
  type cluster/distribute
  subvolumes b-1-2-sda1 b-3-4-sda1
end-volume

-------------------------------------

Gluster Log

================================================================================
Version      : glusterfs 3.0.3 built on Mar 19 2010 11:55:08
git: v3.0.2-41-g029062c
Starting Time: 2010-03-19 14:50:34
Command line : /usr/local/sbin/glusterfs --log-level=NORMAL
--volfile=/etc/glusterfs/glusterfs.vol /mnt/backups
PID          : 4941
System name  : Linux
Nodename     : backup0
Kernel Release : 2.6.30-2-686
Hardware Identifier: i686

Given volfile:
+------------------------------------------------------------------------------+
  1: ### file: client-volume.vol.sample
  2:
  3: #####################################
  4: ###  GlusterFS Client Volume File  ##
  5: #####################################
  6:
  7: #### CONFIG FILE RULES:
  8: ### "#" is comment character.
  9: ### - Config file is case sensitive
 10: ### - Options within a volume block can be in any order.
 11: ### - Spaces or tabs are used as delimitter within a line.
 12: ### - Each option should end within a line.
 13: ### - Missing or commented fields will assume default values.
 14: ### - Blank/commented lines are allowed.
 15: ### - Sub-volumes should already be defined above before referring.
 16:
 17: # drives
 18: volume backup1-sda1
 19:   type protocol/client
 20:   option transport-type tcp
 21:   option remote-host 10.10.0.71
 22:   option remote-subvolume sda1
 23: end-volume
 24:
 25: volume backup2-sda1
 26:   type protocol/client
 27:   option transport-type tcp
 28:   option remote-host 10.10.0.72
 29:   option remote-subvolume sda1
 30: end-volume
 31:
 32: volume backup3-sda1
 33:   type protocol/client
 34:   option transport-type tcp
 35:   option remote-host 10.10.0.73
 36:   option remote-subvolume sda1
 37: end-volume
 38:
 39: volume backup4-sda1
 40:   type protocol/client
 41:   option transport-type tcp
 42:   option remote-host 10.10.0.74
 43:   option remote-subvolume sda1
 44: end-volume
 45:
 46: # replication
 47: volume b-1-2-sda1
 48:   type cluster/replicate
 49:   subvolumes backup1-sda1 backup2-sda1
 50: end-volume
 51:
 52: volume b-3-4-sda1
 53:   type cluster/replicate
 54:   subvolumes backup3-sda1 backup4-sda1
 55: end-volume
 56:
 57: # striping
 58: volume distribute
 59:   type cluster/distribute
 60:   subvolumes b-1-2-sda1 b-3-4-sda1
 61: end-volume
62:
 63: #### Add readahead feature
 64: #volume readahead
 65: #  type performance/read-ahead
 66: #  option page-size 1MB     # unit in bytes
 67: #  option page-count 2      # cache per file  = (page-count x
page-size)
 68: #  subvolumes distribute
 69: #end-volume
 70: #
 71: #### Add IO-Cache feature
 72: #volume iocache
 73: #    type performance/io-cache
 74: #    option cache-size `grep 'MemTotal' /proc/meminfo  | awk
'{print $2 * 0.2 / 1024}' | cut -f1 -d.`MB
 75: #    option cache-timeout 1
 76: #    subvolumes readahead
 77: #end-volume
 78: #
 79: #### Add quick-read feature
 80: #volume quickread
 81: #    type performance/quick-read
 82: #    option cache-timeout 1
 83: #    option max-file-size 64kB
 84: #    subvolumes iocache
 85: #end-volume
 86: #
 87: #### Add writeback feature
 88: #volume writeback
 89: #  type performance/write-behind
 90: #  option aggregate-size 1MB
 91: #  option window-size 2MB
 92: #  option flush-behind off
 93: #  subvolumes quickread
 94: #end-volume
 95: #

+------------------------------------------------------------------------------+
[2010-03-19 14:50:34] N [glusterfsd.c:1396:main] glusterfs: Successfully
started
[2010-03-19 14:50:34] N [client-protocol.c:6246:client_setvolume_cbk]
backup4-sda1: Connected to 10.10.0.74:6996, attached to remote volume
'sda1'.
[2010-03-19 14:50:34] N [afr.c:2627:notify] b-3-4-sda1: Subvolume
'backup4-sda1' came back up; going online.
[2010-03-19 14:50:34] N [client-protocol.c:6246:client_setvolume_cbk]
backup4-sda1: Connected to 10.10.0.74:6996, attached to remote volume
'sda1'.
[2010-03-19 14:50:34] N [afr.c:2627:notify] b-3-4-sda1: Subvolume
'backup4-sda1' came back up; going online.
[2010-03-19 14:50:34] N [client-protocol.c:6246:client_setvolume_cbk]
backup2-sda1: Connected to 10.10.0.72:6996, attached to remote volume
'sda1'.
[2010-03-19 14:50:34] N [afr.c:2627:notify] b-1-2-sda1: Subvolume
'backup2-sda1' came back up; going online.
[2010-03-19 14:50:34] N [client-protocol.c:6246:client_setvolume_cbk]
backup2-sda1: Connected to 10.10.0.72:6996, attached to remote volume
'sda1'.
[2010-03-19 14:50:34] N [afr.c:2627:notify] b-1-2-sda1: Subvolume
'backup2-sda1' came back up; going online.
[2010-03-19 14:50:34] N [client-protocol.c:6246:client_setvolume_cbk]
backup1-sda1: Connected to 10.10.0.71:6996, attached to remote volume
'sda1'.
[2010-03-19 14:50:34] N [client-protocol.c:6246:client_setvolume_cbk]
backup1-sda1: Connected to 10.10.0.71:6996, attached to remote volume
'sda1'.
[2010-03-19 14:50:34] N [client-protocol.c:6246:client_setvolume_cbk]
backup3-sda1: Connected to 10.10.0.73:6996, attached to remote volume
'sda1'.
[2010-03-19 14:50:34] N [client-protocol.c:6246:client_setvolume_cbk]
backup3-sda1: Connected to 10.10.0.73:6996, attached to remote volume
'sda1'.
[2010-03-19 14:50:34] N [fuse-bridge.c:2942:fuse_init] glusterfs-fuse:
FUSE inited with protocol versions: glusterfs 7.13 kernel 7.11
[2010-03-19 14:50:49] W [fuse-bridge.c:491:fuse_entry_cbk]
glusterfs-fuse: LOOKUP(/100101) inode (ptr=0x9a57280, ino=373293060,
gen=5450392427238002552) found conflict (ptr=0x9a509a8, ino=373293060,
gen=5450392427238002552)
[2010-03-19 14:50:49] W [fuse-bridge.c:722:fuse_attr_cbk]
glusterfs-fuse: 1349: LOOKUP() / => -1 (Stale NFS file handle)
[2010-03-19 14:50:49] W [fuse-bridge.c:722:fuse_attr_cbk]
glusterfs-fuse: 1350: LOOKUP() / => -1 (Stale NFS file handle)
[2010-03-19 14:50:49] W [fuse-bridge.c:722:fuse_attr_cbk]
glusterfs-fuse: 1351: LOOKUP() / => -1 (Stale NFS file handle)
[2010-03-19 14:50:49] W [fuse-bridge.c:722:fuse_attr_cbk]
glusterfs-fuse: 1352: LOOKUP() / => -1 (Stale NFS file handle)
[2010-03-19 14:50:49] W [fuse-bridge.c:722:fuse_attr_cbk]
glusterfs-fuse: 1353: LOOKUP() / => -1 (Stale NFS file handle)
[2010-03-19 14:50:49] W [fuse-bridge.c:722:fuse_attr_cbk]
glusterfs-fuse: 1354: LOOKUP() / => -1 (Stale NFS file handle)
[2010-03-19 14:50:49] W [fuse-bridge.c:722:fuse_attr_cbk]
glusterfs-fuse: 1355: LOOKUP() / => -1 (Stale NFS file handle)
[Repeated for hundreds of lines]
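
If a more detailed log would help, we can remount with debug logging
for the next reproduction.  Something like the following is what we
have in mind (a sketch; we are assuming this build accepts --log-file,
and the path reflects our from-source /usr/local prefix):

# umount /mnt/backups
# /usr/local/sbin/glusterfs --log-level=DEBUG \
      --log-file=/usr/local/var/log/glusterfs/backups-debug.log \
      --volfile=/etc/glusterfs/glusterfs.vol /mnt/backups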



----------------------------------
Benjamin Hudgens
Director of Information Technology
Photodex Corporation

www.photodex.com
bhudgens at photodex.com
(512) 674-9920




-----Original Message-----
From: gluster-users-bounces at gluster.org
[mailto:gluster-users-bounces at gluster.org] On Behalf Of Vikas Gorur
Sent: Friday, March 19, 2010 2:02 PM
To: Joe Grace
Cc: Gluster Users
Subject: Re: [Gluster-users] rsync causing gluster to crash


On Mar 19, 2010, at 9:47 AM, Joe Grace wrote:

> Thank you for the reply.
> 
> Version 3.0.2 from source on Debian sqeeze.
> 
> Here is the client log:


That is the client volume file. What I meant was the client log file you
can usually find in /usr/local/var/log/glusterfs/glusterfs.log.

If you have a core file from glusterfs (look for /core.*), you can get a
backtrace by running:

# gdb /path/to/glusterfs /path/to/core/file
(gdb) thread apply all bt full

Please send us the output of that command.

------------------------------
Vikas Gorur
Engineer - Gluster, Inc.
+1 (408) 770 1894
------------------------------







_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


