[Gluster-users] Debian, 3.1.1, duplicate files

Jacob Shucart jacob at gluster.com
Thu Jan 13 21:53:48 UTC 2011


Phil,

This sounds to me like a known issue that affects directories created under 
older versions of Gluster and is related to the extended attributes that 
were set on those directories.  I believe this issue is supposed to be fixed 
in 3.1.2.  I don't know how large your dataset is, but one way to fix it 
would be to:

1. Delete the Gluster volume.
2. In the back-end directories on your nodes, scrub the offending extended 
attribute with the command:
	find /back/end/dir -exec setfattr -x trusted.gfid {} \;
3. Create the Gluster volume again.
4. Mount the volume somewhere as GlusterFS (mount -t glusterfs ...) and 
run:
	find /mnt/gluster -print0 | xargs --null stat
5. Enjoy.
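
Before running step 2 for real, it may be worth previewing what it would 
touch.  The following is only a sketch: preview_gfid_scrub is a hypothetical 
helper name, not a Gluster tool, and /back/end/dir stands in for your 
back-end directory as above.  It prints the setfattr command for every entry 
under the directory instead of executing it.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not part of Gluster): print the setfattr
# command that step 2 would run for each entry, without changing anything.
preview_gfid_scrub() {
    backend_dir=$1
    find "$backend_dir" -exec echo setfattr -x trusted.gfid {} \;
}

# Example: preview_gfid_scrub /back/end/dir | less
```

Once the output looks right, the real command from step 2 can be run as 
written.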

Please let me know if that helps.  Thank you.

-Jacob

-----Original Message-----
From: gluster-users-bounces at gluster.org 
[mailto:gluster-users-bounces at gluster.org] On Behalf Of phil cryer
Sent: Thursday, January 13, 2011 9:07 AM
To: gluster-users at gluster.org
Subject: Re: [Gluster-users] Debian, 3.1.1, duplicate files

I haven't heard anything back, so I wanted to update this in case
anyone else comes across it. This is an old store that we created in
3.0.4, and it kept getting duplicate files. We ran an update script
that used wget to download any files that were on the remote but not
present on the local box. If wget came across a file we already had, it
should have either 1) ignored it and not downloaded it, 2) overwritten
it with the new version (clobber), or 3) written the new copy as file.1
so as not to touch the original (no-clobber). In fact it did none of
these, and we ended up with the bizarre feature of multiple identical
files in the same directory. We're also using far more space than we
should (~70TB instead of ~40TB or so) thanks to directories like this:

# ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/
total 536436
drwxr-xr-x    2 www-data www-data    294912 Jan 13 10:05 .
drwx------ 1016 www-data www-data   3846144 Dec 12 11:10 ..
-rwxr-xr-x    1 www-data www-data   1151282 Jul 12  2010 tijdschriftvoore1951nede_djvu.txt
-rwxr-xr-x    1 www-data www-data   1151282 Jul 12  2010 tijdschriftvoore1951nede_djvu.txt
-rwxr-xr-x    1 www-data www-data  12078834 Jul 12  2010 tijdschriftvoore1951nede_djvu.xml
-rwxr-xr-x    1 www-data www-data  12078834 Jul 12  2010 tijdschriftvoore1951nede_djvu.xml
-rwxr-xr-x    1 www-data www-data    271733 Jul 12  2010 tijdschriftvoore1951nede.gif
-rwxr-xr-x    1 www-data www-data    271733 Jul 12  2010 tijdschriftvoore1951nede.gif
-rwxr-xr-x    1 www-data www-data 257779301 Jul 12  2010 tijdschriftvoore1951nede_jp2.zip
-rwxr-xr-x    1 www-data www-data 257779301 Jul 12  2010 tijdschriftvoore1951nede_jp2.zip
-rwxr-xr-x    1 www-data www-data      2278 Jul 12  2010 tijdschriftvoore1951nede_marc.xml
-rwxr-xr-x    1 www-data www-data      2278 Jul 12  2010 tijdschriftvoore1951nede_marc.xml
-rwxr-xr-x    1 www-data www-data       720 Jul 12  2010 tijdschriftvoore1951nede_meta.mrc
-rwxr-xr-x    1 www-data www-data       720 Jul 12  2010 tijdschriftvoore1951nede_meta.mrc
-rwxr-xr-x    1 www-data www-data    546411 Jul 12  2010 tijdschriftvoore1951nede_names.xml
-rwxr-xr-x    1 www-data www-data    546411 Jul 12  2010 tijdschriftvoore1951nede_names.xml
-rwxr-xr-x    1 www-data www-data       256 Jul 12  2010 tijdschriftvoore1951nede_names.xml_meta.txt
-rwxr-xr-x    1 www-data www-data       256 Jul 12  2010 tijdschriftvoore1951nede_names.xml_meta.txt
-rwxr-xr-x    1 www-data www-data    257556 Jul 13  2010 tijdschriftvoore1951nede_scandata.xml
-rwxr-xr-x    1 www-data www-data    257556 Jul 13  2010 tijdschriftvoore1951nede_scandata.xml
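
For anyone who wants to find other affected directories, a listing like
this one can be checked mechanically. This is a rough sketch, not
something that ships with Gluster: dup_names is a made-up helper that
reads one file name per line and prints each name that occurs more than
once. It assumes file names contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: read file names (one per line) on stdin and print
# every name that appears more than once.
dup_names() {
    sort | uniq -d
}

# Example, using the directory from this thread:
#   ls -1 /mnt/glusterfs/www/t/tijdschriftvoore1951nede/ | dup_names
```

An empty result would mean the directory has no repeated names.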

Ouch, right? So I installed 3.1.1, and that went well: I got it on all
the drives and servers we had before, and we have a total capacity of
96TB again. All seemed to be working, so I mounted the old directories,
saw the same issue with the duplicate files, and let it sit overnight
to see if it would notice and try to fix things. Then we started seeing
gluster log entries like:

==> glusterfs/mnt-glusterfs.log <==
[2011-01-13 11:46:23.2762] I [afr-common.c:662:afr_lookup_done] bhl-volume-replicate-55: entries are missing in lookup of /www/t/tijdschriftvoore1951nede.
[2011-01-13 11:46:23.2817] I [afr-common.c:716:afr_lookup_done] bhl-volume-replicate-55: background  meta-data data entry self-heal triggered. path: /www/t/tijdschriftvoore1951nede
[2011-01-13 11:46:23.5342] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] bhl-volume-replicate-55: background  meta-data data entry self-heal completed on /www/t/tijdschriftvoore1951nede
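
To see which paths the log claims were healed, the "completed" messages
can be pulled out of the log. A small sketch, where healed_paths is a
hypothetical helper name and the assumption is that each entry sits on a
single line in the real log file:

```shell
#!/bin/sh
# Hypothetical helper: read a client log on stdin and print each distinct
# path for which a "self-heal completed on" message was logged.
# Assumes every log entry sits on a single line in the real log file.
healed_paths() {
    sed -n 's/.*self-heal completed on //p' | sort -u
}

# Example: healed_paths < glusterfs/mnt-glusterfs.log
```

The paths it prints can then be listed by hand to check whether the
duplicates are really gone.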

...so we think, hey, maybe we're all set here, it's fixing itself and
removing those duplicate files, but no such luck:

# ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/
total 536436
drwxr-xr-x    2 www-data www-data    294912 Jan 13 10:05 .
drwx------ 1016 www-data www-data   3846144 Dec 12 11:10 ..
-rwxr-xr-x    1 www-data www-data   1151282 Jul 12  2010 tijdschriftvoore1951nede_djvu.txt
-rwxr-xr-x    1 www-data www-data   1151282 Jul 12  2010 tijdschriftvoore1951nede_djvu.txt
-rwxr-xr-x    1 www-data www-data  12078834 Jul 12  2010 tijdschriftvoore1951nede_djvu.xml
-rwxr-xr-x    1 www-data www-data  12078834 Jul 12  2010 tijdschriftvoore1951nede_djvu.xml
-rwxr-xr-x    1 www-data www-data    271733 Jul 12  2010 tijdschriftvoore1951nede.gif
-rwxr-xr-x    1 www-data www-data    271733 Jul 12  2010 tijdschriftvoore1951nede.gif
-rwxr-xr-x    1 www-data www-data 257779301 Jul 12  2010 tijdschriftvoore1951nede_jp2.zip
-rwxr-xr-x    1 www-data www-data 257779301 Jul 12  2010 tijdschriftvoore1951nede_jp2.zip
-rwxr-xr-x    1 www-data www-data      2278 Jul 12  2010 tijdschriftvoore1951nede_marc.xml
-rwxr-xr-x    1 www-data www-data      2278 Jul 12  2010 tijdschriftvoore1951nede_marc.xml
-rwxr-xr-x    1 www-data www-data       720 Jul 12  2010 tijdschriftvoore1951nede_meta.mrc
-rwxr-xr-x    1 www-data www-data       720 Jul 12  2010 tijdschriftvoore1951nede_meta.mrc
-rwxr-xr-x    1 www-data www-data    546411 Jul 12  2010 tijdschriftvoore1951nede_names.xml
-rwxr-xr-x    1 www-data www-data    546411 Jul 12  2010 tijdschriftvoore1951nede_names.xml
-rwxr-xr-x    1 www-data www-data       256 Jul 12  2010 tijdschriftvoore1951nede_names.xml_meta.txt
-rwxr-xr-x    1 www-data www-data       256 Jul 12  2010 tijdschriftvoore1951nede_names.xml_meta.txt
-rwxr-xr-x    1 www-data www-data    257556 Jul 13  2010 tijdschriftvoore1951nede_scandata.xml
-rwxr-xr-x    1 www-data www-data    257556 Jul 13  2010 tijdschriftvoore1951nede_scandata.xml

but, this allows us to do (in my opinion) scary things like this:

# ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/*_names.xml
-rwxr-xr-x 1 www-data www-data 546411 Jul 12  2010 /mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml
-rwxr-xr-x 1 www-data www-data 546411 Jul 12  2010 /mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml

# rm /mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml

# ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/*_names.xml
-rwxr-xr-x 1 www-data www-data 546411 Jul 12  2010 /mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml

eek! so it only removed one of the files, even though they both had
the same name. At this point we're going to wipe all 70TB and
re-transfer, hoping it stops when it gets all the files and doesn't
start writing the files with the same names as before. Anyone with
advice or insight into this issue? Would love to learn why it did
this, and REALLY hope it doesn't do it again.

Thanks

P



On Wed, Jan 12, 2011 at 2:37 PM, phil cryer <phil at cryer.us> wrote:
> I'm now running gluster 3.1.1 on Debian. A directory that was running
> under 3.0.4 had duplicate files, but I've remounted things now that
> we're running 3.1.1 in hopes it would fix things, but so far it has
> not:
>
> # ls -l /mnt/glusterfs/www/0/0descriptionofta581unit
> total 37992
> -rwxr-xr-x 1 www-data www-data   796343 Jun 23  2010 0descriptionofta581unit_bw.pdf
> -rwxr-xr-x 1 www-data www-data   796343 Jun 23  2010 0descriptionofta581unit_bw.pdf
> ---------T 1 root     root         1497 Jun 24  2010 0descriptionofta581unit_dc.xml
> ---------T 1 root     root         1497 Jun 24  2010 0descriptionofta581unit_dc.xml
> ---------T 1 www-data www-data   577050 Jun 24  2010 0descriptionofta581unit.djvu
> ---------T 1 www-data www-data   577050 Jun 24  2010 0descriptionofta581unit.djvu
> -rwxr-xr-x 1 www-data www-data    33272 Jun 22  2010 0descriptionofta581unit_djvu.txt
> -rwxr-xr-x 1 www-data www-data    33272 Jun 22  2010 0descriptionofta581unit_djvu.txt
> -rwxr-xr-x 1 www-data www-data     4445 Jun 23  2010 0descriptionofta581unit_files.xml
> -rwxr-xr-x 1 www-data www-data     4445 Jun 23  2010 0descriptionofta581unit_files.xml
> -rwxr-xr-x 1 www-data www-data     5011 Jun 22  2010 0descriptionofta581unit_marc.xml
> -rwxr-xr-x 1 www-data www-data     5011 Jun 22  2010 0descriptionofta581unit_marc.xml
> -rwxr-xr-x 1 www-data www-data      360 Jun 23  2010 0descriptionofta581unit_metasource.xml
> -rwxr-xr-x 1 www-data www-data      360 Jun 23  2010 0descriptionofta581unit_metasource.xml
> -rwxr-xr-x 1 www-data www-data     2848 Jun 22  2010 0descriptionofta581unit_meta.xml
> -rwxr-xr-x 1 www-data www-data     2848 Jun 22  2010 0descriptionofta581unit_meta.xml
> -rwxr-xr-x 1 www-data www-data 16916480 Jun 22  2010 0descriptionofta581unit_orig_jp2.tar
> -rwxr-xr-x 1 www-data www-data 16916480 Jun 22  2010 0descriptionofta581unit_orig_jp2.tar
> -rwxr-xr-x 1 www-data www-data  1051810 Jun 22  2010 0descriptionofta581unit.pdf
> -rwxr-xr-x 1 www-data www-data  1051810 Jun 22  2010 0descriptionofta581unit.pdf
>
> While running the latest, 3.1.1, I noticed some log entries that said:
>
> [..]
> [2011-01-12 15:24:33.325546] I [afr-common.c:613:afr_lookup_self_heal_check] bhl-volume-replicate-69: size differs for /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
> [2011-01-12 15:24:33.325558] I [afr-common.c:716:afr_lookup_done] bhl-volume-replicate-69: background  meta-data data self-heal triggered. path: /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
> [2011-01-12 15:24:33.364501] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] bhl-volume-replicate-66: background  meta-data data self-heal completed on /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
> [2011-01-12 15:24:33.364881] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] bhl-volume-replicate-69: background  meta-data data self-heal completed on /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
>
> I assumed it was fixing that, but it didn't. Here are the full logs,
> which include all the gluster.log work it did in this directory:
> http://pastebin.com/8X52Em7Y
>
> Question: how can I 'fix' this, or is the best bet to remove
> everything and start over? It's going to set us back, but I'd rather
> do it now than keep banging on this without any resolution.
>
> Thanks for the help, really like the new gluster command, very nice!
>
> P
> --
> http://philcryer.com
>



-- 
http://philcryer.com
_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


