[Gluster-users] [Gluster-devel] missing files

David F. Robinson david.robinson at corvidtec.com
Fri Feb 6 04:36:46 UTC 2015


Not repeatable.  Once it shows up, it stays there.  Earlier this evening I 
sent Pranith a description of some other strange behavior I am seeing.  It 
is attached below...

David

Another issue I am having that might be related is that I cannot delete 
some directories. rm complains that the directories are not empty, but 
when I list them, nothing is there.
However, if I know the name of a directory, I can cd into it and 
see the files.
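
A minimal sketch of how this symptom could be checked from a script, assuming the candidate names are already known (for example, from a brick listing). The function name hidden_entries is hypothetical, not part of any Gluster tooling. It reports names that resolve when looked up by path but do not appear in the directory listing of the parent.

```python
import os


def hidden_entries(parent, candidate_names):
    """Return candidate names that resolve by path but are not listed in parent.

    Hypothetical helper for illustration only. candidate_names must come
    from somewhere else (e.g. a brick listing), because the listing itself
    is exactly what is suspected to be incomplete.
    """
    listed = set(os.listdir(parent))  # what a plain 'ls' would show
    hidden = []
    for name in candidate_names:
        path = os.path.join(parent, name)
        # lexists() looks the entry up by name, which can succeed even
        # when the directory listing does not return that entry.
        if name not in listed and os.path.lexists(path):
            hidden.append(name)
    return hidden
```

Calling hidden_entries() on the References directory of the mount with ["USSOCOM_OPAQUE_ARMOR"] would be expected to return that name if the directory behaves as described above, and an empty list on a healthy directory.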

[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# pwd
/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor

[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# ls -al
total 0
drwxrws--x 7 root root 449 Feb 4 18:12 .
drwxrwx--- 3 root root 200 Feb 4 18:19 ..
drwxrws--- 3 root root 41 Feb 4 18:12 References
drwxrws--x 4 root root 54 Feb 4 18:12 Testing
drwxrws--- 4 root root 51 Feb 4 18:12 Velodyne
drwxrws--x 4 root root 38 Feb 4 18:12 progress_reports

[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# rm -rf *
rm: cannot remove `References': Directory not empty
rm: cannot remove `Testing': Directory not empty
rm: cannot remove `Velodyne': Directory not empty
rm: cannot remove `progress_reports/pr2': Directory not empty
rm: cannot remove `progress_reports/pr3': Directory not empty

[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# ls -alR
total 0
drwxrws--x 6 root root 449 Feb 4 18:12 .
drwxrwx--- 3 root root 200 Feb 4 18:19 ..
drwxrws--- 3 root root 41 Feb 4 18:12 References    *** Note that there is nothing in this References directory.
drwxrws--x 4 root root 54 Feb 4 18:12 Testing
drwxrws--- 4 root root 51 Feb 4 18:12 Velodyne
drwxrws--x 4 root root 38 Feb 4 18:12 progress_reports


However, from the bricks (see listings below), there are other 
directories that are not shown. For example, the References directory 
contains the USSOCOM_OPAQUE_ARMOR directory on the brick, but it doesn't 
show up on the volume.

[root@gfs01a USSOCOM_OPAQUE_ARMOR]# pwd
/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor

[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# cd References/
[root@gfs01a References]# ls -al    *** There is nothing shown in the References directory
total 0
drwxrws--- 3 root root 133 Feb 4 18:12 .
drwxrws--x 7 root root 449 Feb 4 18:12 ..

[root@gfs01a References]# cd USSOCOM_OPAQUE_ARMOR    *** From the brick listing, I knew the directory name. Even though it isn't shown, I can cd to it and see the files.
[root@gfs01a USSOCOM_OPAQUE_ARMOR]# ls -al
total 6787
drwxrws--- 2 streadway sbir 244 Feb 5 21:28 .
drwxrws--- 3 root root 164 Feb 5 21:28 ..
-rwxrw---- 1 streadway sbir 42440 Jun 19 2014 ARMOR PACKAGES.one
-rwxrw---- 1 streadway sbir 17248 Jun 19 2014 COMPARISON OF SOLUTIONS.one
-rwxrw---- 1 streadway sbir 38184 Jun 19 2014 CURRENT STANDARD ARMORING.one
-rwxrw---- 1 sgilbert sbir 2974120 Jan 22 09:15 FEASABILITY STUDY.docx
-rwxrw---- 1 streadway sbir 3826704 Jan 21 14:57 FEASABILITY STUDY.one
-rwxrw---- 1 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one



The recursive file listing (ls -alR) from each of the bricks shows that 
there are files/directories that do not show up on the /homegfs volume.

[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# ls -alR /data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References
/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 6648
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw---- 2 sgilbert sbir 2974120 Jan 22 09:15 FEASABILITY STUDY.docx
-rwxrw---- 2 streadway sbir 3826704 Jan 21 14:57 FEASABILITY STUDY.one

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 2 root root 10 Feb 4 18:12 .
drwxrws--x 6 root root 95 Feb 4 18:12 ..

[root@gfs01b ~]# ls -alR /data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References
/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 6648
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw---- 2 sgilbert sbir 2974120 Jan 22 09:15 FEASABILITY STUDY.docx
-rwxrw---- 2 streadway sbir 3826704 Jan 21 14:57 FEASABILITY STUDY.one

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 2 root root 10 Feb 4 18:12 .
drwxrws--x 6 root root 95 Feb 4 18:12 ..

[root@gfs02a ~]# ls -alR /data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References
/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 72
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw---- 2 streadway sbir 17248 Jun 19 2014 COMPARISON OF SOLUTIONS.one
-rwxrw---- 2 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 84
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw---- 2 streadway sbir 42440 Jun 19 2014 ARMOR PACKAGES.one
-rwxrw---- 2 streadway sbir 38184 Jun 19 2014 CURRENT STANDARD ARMORING.one

[root@gfs02b ~]# ls -alR /data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References
/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 72
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw---- 2 streadway sbir 17248 Jun 19 2014 COMPARISON OF SOLUTIONS.one
-rwxrw---- 2 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 84
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw---- 2 streadway sbir 42440 Jun 19 2014 ARMOR PACKAGES.one
-rwxrw---- 2 streadway sbir 38184 Jun 19 2014 CURRENT STANDARD ARMORING.one
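
The manual comparison above (brick listings versus the volume listing) could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name missing_from_mount is hypothetical, it must be run on a server that hosts the bricks, and it only sees the bricks local to that server, so it would have to be repeated on each server.

```python
import os


def missing_from_mount(brick_dirs, mount_dir):
    """Map each name present on a brick but not listed on the mount
    to the list of brick directories that contain it.

    Hypothetical helper for illustration only; it compares one directory
    level and does not recurse.
    """
    on_mount = set(os.listdir(mount_dir))  # what the volume listing shows
    missing = {}
    for brick in brick_dirs:
        if not os.path.isdir(brick):
            continue  # this brick may not hold the directory at all
        for name in os.listdir(brick):
            if name == ".glusterfs":
                continue  # brick-internal metadata directory at the brick root
            if name not in on_mount:
                missing.setdefault(name, []).append(brick)
    return missing
```

For the case above, brick_dirs would be the References directories under /data/brick01a/homegfs and /data/brick02a/homegfs, and mount_dir would be the same References directory under /homegfs.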





------ Original Message ------
From: "Xavier Hernandez" <xhernandez at datalab.es>
To: "David F. Robinson" <david.robinson at corvidtec.com>; "Benjamin 
Turner" <bennyturns at gmail.com>; "Pranith Kumar Karampuri" 
<pkarampu at redhat.com>
Cc: "gluster-users at gluster.org" <gluster-users at gluster.org>; "Gluster 
Devel" <gluster-devel at gluster.org>
Sent: 2/5/2015 5:14:22 AM
Subject: Re: [Gluster-devel] missing files

>Is the failure repeatable? With the same directories?
>
>It's very weird that the directories appear on the volume when you do 
>an 'ls' on the bricks. Could it be that you only did a single 'ls' on 
>the fuse mount, which did not show the directory? Is it possible that this 
>'ls' triggered a self-heal that repaired the problem, whatever it was, 
>and when you did another 'ls' on the fuse mount after the 'ls' on the 
>bricks, the directories were there?
>
>The first 'ls' could have healed the files, so that the following 
>'ls' on the bricks showed the files as if nothing were damaged. If 
>that's the case, it's possible that there were some disconnections 
>during the copy.
>
>Added Pranith because he knows the replication and self-heal details 
>better.
>
>Xavi
>
>On 02/04/2015 07:23 PM, David F. Robinson wrote:
>>Distributed/replicated
>>
>>Volume Name: homegfs
>>Type: Distributed-Replicate
>>Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
>>Status: Started
>>Number of Bricks: 4 x 2 = 8
>>Transport-type: tcp
>>Bricks:
>>Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
>>Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
>>Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
>>Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
>>Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
>>Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
>>Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
>>Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
>>Options Reconfigured:
>>performance.io-thread-count: 32
>>performance.cache-size: 128MB
>>performance.write-behind-window-size: 128MB
>>server.allow-insecure: on
>>network.ping-timeout: 10
>>storage.owner-gid: 100
>>geo-replication.indexing: off
>>geo-replication.ignore-pid-check: on
>>changelog.changelog: on
>>changelog.fsync-interval: 3
>>changelog.rollover-time: 15
>>server.manage-gids: on
>>
>>
>>------ Original Message ------
>>From: "Xavier Hernandez" <xhernandez at datalab.es>
>>To: "David F. Robinson" <david.robinson at corvidtec.com>; "Benjamin
>>Turner" <bennyturns at gmail.com>
>>Cc: "gluster-users at gluster.org" <gluster-users at gluster.org>; "Gluster
>>Devel" <gluster-devel at gluster.org>
>>Sent: 2/4/2015 6:03:45 AM
>>Subject: Re: [Gluster-devel] missing files
>>
>>>On 02/04/2015 01:30 AM, David F. Robinson wrote:
>>>>Sorry. Thought about this a little more. I should have been clearer.
>>>>The files were on both bricks of the replica, not just one side. So,
>>>>both bricks had to have been up... The files/directories just don't
>>>>show up on the mount.
>>>>I was reading and saw a related bug
>>>>(https://bugzilla.redhat.com/show_bug.cgi?id=1159484). I saw it
>>>>suggested to run:
>>>>          find <mount> -d -exec getfattr -h -n trusted.ec.heal {} \;
>>>
>>>This command is specific for a dispersed volume. It won't do anything
>>>(aside from the error you are seeing) on a replicated volume.
>>>
>>>I think you are using a replicated volume, right ?
>>>
>>>In this case I'm not sure what can be happening. Is your volume a pure
>>>replicated one or a distributed-replicated one? On a pure replicated
>>>volume it doesn't make sense that some entries do not show in an 'ls'
>>>when the file is in both replicas (at least without any error message
>>>in the logs). On a distributed-replicated volume it could be caused by
>>>some problem while combining the contents of each replica set.
>>>
>>>What's the configuration of your volume ?
>>>
>>>Xavi
>>>
>>>>
>>>>I get a bunch of errors for operation not supported:
>>>>[root@gfs02a homegfs]# find wks_backup -d -exec getfattr -h -n trusted.ec.heal {} \;
>>>>find: warning: the -d option is deprecated; please use -depth instead, because the latter is a POSIX-compliant feature.
>>>>wks_backup/homer_backup/backup: trusted.ec.heal: Operation not supported
>>>>wks_backup/homer_backup/logs/2014_05_20.log: trusted.ec.heal: Operation not supported
>>>>wks_backup/homer_backup/logs/2014_05_21.log: trusted.ec.heal: Operation not supported
>>>>wks_backup/homer_backup/logs/2014_05_18.log: trusted.ec.heal: Operation not supported
>>>>wks_backup/homer_backup/logs/2014_05_19.log: trusted.ec.heal: Operation not supported
>>>>wks_backup/homer_backup/logs/2014_05_22.log: trusted.ec.heal: Operation not supported
>>>>wks_backup/homer_backup/logs: trusted.ec.heal: Operation not supported
>>>>wks_backup/homer_backup: trusted.ec.heal: Operation not supported
>>>>------ Original Message ------
>>>>From: "Benjamin Turner" <bennyturns at gmail.com>
>>>>To: "David F. Robinson" <david.robinson at corvidtec.com>
>>>>Cc: "Gluster Devel" <gluster-devel at gluster.org>; "gluster-users at gluster.org"
>>>><gluster-users at gluster.org>
>>>>Sent: 2/3/2015 7:12:34 PM
>>>>Subject: Re: [Gluster-devel] missing files
>>>>>It sounds to me like the files were only copied to one replica, weren't
>>>>>there for the initial ls, which triggered a self-heal, and were there
>>>>>for the last ls because they were healed. Is there any chance that one
>>>>>of the replicas was down during the rsync? It could be that you lost a
>>>>>brick during the copy or something like that. To confirm, I would look
>>>>>for disconnects in the brick logs as well as checking glustershd.log
>>>>>to verify the missing files were actually healed.
>>>>>
>>>>>-b
>>>>>
>>>>>On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson
>>>>><david.robinson at corvidtec.com> wrote:
>>>>>
>>>>>     I rsync'd 20-TB over to my gluster system and noticed that I had
>>>>>     some directories missing even though the rsync completed normally.
>>>>>     The rsync logs showed that the missing files were transferred.
>>>>>     I went to the bricks and did an 'ls -al
>>>>>     /data/brick*/homegfs/dir/*'; the files were on the bricks. After I
>>>>>     did this 'ls', the files then showed up on the FUSE mounts.
>>>>>     1) Why are the files hidden on the fuse mount?
>>>>>     2) Why does the ls make them show up on the FUSE mount?
>>>>>     3) How can I prevent this from happening again?
>>>>>     Note, I also mounted the gluster volume using NFS and saw the same
>>>>>     behavior. The files/directories were not shown until I did the
>>>>>     "ls" on the bricks.
>>>>>     David
>>>>>     ===============================
>>>>>     David F. Robinson, Ph.D.
>>>>>     President - Corvid Technologies
>>>>>     704.799.6944 x101 [office]
>>>>>     704.252.1310 [cell]
>>>>>     704.799.7974 [fax]
>>>>>     David.Robinson at corvidtec.com
>>>>>     http://www.corvidtechnologies.com
>>>>>
>>>>>     _______________________________________________
>>>>>     Gluster-devel mailing list
>>>>>     Gluster-devel at gluster.org
>>>>>     http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>


