[Gluster-users] Debian, 3.1.1, duplicate files
Jacob Shucart
jacob at gluster.com
Thu Jan 13 21:53:48 UTC 2011
Phil,
This sounds to me like a known issue affecting Gluster directories that
were created under older versions, caused by extended attributes that were
set on those directories. I believe it is supposed to be fixed in 3.1.2.
I don't know how large your dataset is, but one way to fix it would be to:
1. Delete the Gluster volume.
2. In the backend directories on your nodes, scrub the offending extended
attribute with the command:
find /back/end/dir -exec setfattr -x trusted.gfid {} \;
3. Create the Gluster volume again.
4. Mount the volume somewhere as GlusterFS (mount -t glusterfs ....) and
run:
find /mnt/gluster -print0 | xargs --null stat
5. Enjoy.
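
As a rough sketch of steps 2 and 4, assuming a POSIX shell and the attr
tools (setfattr) on each node: the function names scrub_gfid and walk_mount
and the DRY_RUN variable are made up for illustration and are not part of
Gluster. By default the scrub only prints the commands it would run, so the
list can be reviewed before anything is changed.

```shell
#!/bin/sh
# Illustrative sketch only; scrub_gfid, walk_mount and DRY_RUN are
# hypothetical names, not Gluster commands.

# Step 2: remove the trusted.gfid attribute from everything under a
# backend directory. With DRY_RUN unset or set to 1, only print the commands.
scrub_gfid() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        find "$1" -exec echo setfattr -x trusted.gfid {} \;
    else
        # Needs root; entries without the attribute just report an error.
        find "$1" -exec setfattr -x trusted.gfid {} \;
    fi
}

# Step 4: stat every entry under the mount point so each one is looked up
# through the client. The stat output itself is discarded here.
walk_mount() {
    find "$1" -print0 | xargs -0 stat > /dev/null
}
```

For example, "scrub_gfid /back/end/dir" previews the commands,
"DRY_RUN=0 scrub_gfid /back/end/dir" (as root, on each node) applies them,
and "walk_mount /mnt/gluster" can be run once the volume has been recreated
and mounted.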
Please let me know if that helps. Thank you.
-Jacob
-----Original Message-----
From: gluster-users-bounces at gluster.org
[mailto:gluster-users-bounces at gluster.org] On Behalf Of phil cryer
Sent: Thursday, January 13, 2011 9:07 AM
To: gluster-users at gluster.org
Subject: Re: [Gluster-users] Debian, 3.1.1, duplicate files
I haven't heard anything back, so I wanted to update this in case
anyone else comes across it. This was an old store that we created in
3.0.4 and that kept getting duplicate files. We ran an update script
that used wget to download any files that were on the remote but not
present on the local box. If wget came across a file we already had,
it should have either 1) ignored it and not downloaded it, because it
would see that we already have it, 2) overwritten that file (clobber)
with a new version, or 3) written the file as file.1 so as not to mess
with the original one (no-clobber). In fact it did none of these, so
we ended up with the bizarre feature of having multiple identical
files in the same directory. Meanwhile we're also using far more space
than we should (~70TB instead of ~40TB or so) thanks to having
directories like this:
# ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/
total 536436
drwxr-xr-x 2 www-data www-data 294912 Jan 13 10:05 .
drwx------ 1016 www-data www-data 3846144 Dec 12 11:10 ..
-rwxr-xr-x 1 www-data www-data 1151282 Jul 12 2010 tijdschriftvoore1951nede_djvu.txt
-rwxr-xr-x 1 www-data www-data 1151282 Jul 12 2010 tijdschriftvoore1951nede_djvu.txt
-rwxr-xr-x 1 www-data www-data 12078834 Jul 12 2010 tijdschriftvoore1951nede_djvu.xml
-rwxr-xr-x 1 www-data www-data 12078834 Jul 12 2010 tijdschriftvoore1951nede_djvu.xml
-rwxr-xr-x 1 www-data www-data 271733 Jul 12 2010 tijdschriftvoore1951nede.gif
-rwxr-xr-x 1 www-data www-data 271733 Jul 12 2010 tijdschriftvoore1951nede.gif
-rwxr-xr-x 1 www-data www-data 257779301 Jul 12 2010 tijdschriftvoore1951nede_jp2.zip
-rwxr-xr-x 1 www-data www-data 257779301 Jul 12 2010 tijdschriftvoore1951nede_jp2.zip
-rwxr-xr-x 1 www-data www-data 2278 Jul 12 2010 tijdschriftvoore1951nede_marc.xml
-rwxr-xr-x 1 www-data www-data 2278 Jul 12 2010 tijdschriftvoore1951nede_marc.xml
-rwxr-xr-x 1 www-data www-data 720 Jul 12 2010 tijdschriftvoore1951nede_meta.mrc
-rwxr-xr-x 1 www-data www-data 720 Jul 12 2010 tijdschriftvoore1951nede_meta.mrc
-rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 tijdschriftvoore1951nede_names.xml
-rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 tijdschriftvoore1951nede_names.xml
-rwxr-xr-x 1 www-data www-data 256 Jul 12 2010 tijdschriftvoore1951nede_names.xml_meta.txt
-rwxr-xr-x 1 www-data www-data 256 Jul 12 2010 tijdschriftvoore1951nede_names.xml_meta.txt
-rwxr-xr-x 1 www-data www-data 257556 Jul 13 2010 tijdschriftvoore1951nede_scandata.xml
-rwxr-xr-x 1 www-data www-data 257556 Jul 13 2010 tijdschriftvoore1951nede_scandata.xml
Ouch, right? So I installed 3.1.1, and that went well. I got it on all
the drives and servers we had before, and we have a total capacity of
96TB again. All seemed to be working, so I mounted the old directories,
saw the same issue with the duplicate files, and let it sit overnight
to see if it would notice this and try to fix things. Then we started
seeing gluster logs saying things like:
==> glusterfs/mnt-glusterfs.log <==
[2011-01-13 11:46:23.2762] I [afr-common.c:662:afr_lookup_done] bhl-volume-replicate-55: entries are missing in lookup of /www/t/tijdschriftvoore1951nede.
[2011-01-13 11:46:23.2817] I [afr-common.c:716:afr_lookup_done] bhl-volume-replicate-55: background meta-data data entry self-heal triggered. path: /www/t/tijdschriftvoore1951nede
[2011-01-13 11:46:23.5342] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] bhl-volume-replicate-55: background meta-data data entry self-heal completed on /www/t/tijdschriftvoore1951nede
...so we think, hey, maybe we're all set here, it's fixing itself and
removing those duplicate files, but no such luck:
# ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/
total 536436
drwxr-xr-x 2 www-data www-data 294912 Jan 13 10:05 .
drwx------ 1016 www-data www-data 3846144 Dec 12 11:10 ..
-rwxr-xr-x 1 www-data www-data 1151282 Jul 12 2010 tijdschriftvoore1951nede_djvu.txt
-rwxr-xr-x 1 www-data www-data 1151282 Jul 12 2010 tijdschriftvoore1951nede_djvu.txt
-rwxr-xr-x 1 www-data www-data 12078834 Jul 12 2010 tijdschriftvoore1951nede_djvu.xml
-rwxr-xr-x 1 www-data www-data 12078834 Jul 12 2010 tijdschriftvoore1951nede_djvu.xml
-rwxr-xr-x 1 www-data www-data 271733 Jul 12 2010 tijdschriftvoore1951nede.gif
-rwxr-xr-x 1 www-data www-data 271733 Jul 12 2010 tijdschriftvoore1951nede.gif
-rwxr-xr-x 1 www-data www-data 257779301 Jul 12 2010 tijdschriftvoore1951nede_jp2.zip
-rwxr-xr-x 1 www-data www-data 257779301 Jul 12 2010 tijdschriftvoore1951nede_jp2.zip
-rwxr-xr-x 1 www-data www-data 2278 Jul 12 2010 tijdschriftvoore1951nede_marc.xml
-rwxr-xr-x 1 www-data www-data 2278 Jul 12 2010 tijdschriftvoore1951nede_marc.xml
-rwxr-xr-x 1 www-data www-data 720 Jul 12 2010 tijdschriftvoore1951nede_meta.mrc
-rwxr-xr-x 1 www-data www-data 720 Jul 12 2010 tijdschriftvoore1951nede_meta.mrc
-rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 tijdschriftvoore1951nede_names.xml
-rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 tijdschriftvoore1951nede_names.xml
-rwxr-xr-x 1 www-data www-data 256 Jul 12 2010 tijdschriftvoore1951nede_names.xml_meta.txt
-rwxr-xr-x 1 www-data www-data 256 Jul 12 2010 tijdschriftvoore1951nede_names.xml_meta.txt
-rwxr-xr-x 1 www-data www-data 257556 Jul 13 2010 tijdschriftvoore1951nede_scandata.xml
-rwxr-xr-x 1 www-data www-data 257556 Jul 13 2010 tijdschriftvoore1951nede_scandata.xml
But this allows us to do (in my opinion) scary things like this:
# ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/*_names.xml
-rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 /mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml
-rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 /mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml
# rm /mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml
# ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/*_names.xml
-rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 /mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml
Eek! So it only removed one of the files, even though they both had
the same name. At this point we're going to wipe all 70TB and
re-transfer, hoping it stops when it has all the files and doesn't
start writing files with the same names as before. Does anyone have
advice or insight into this issue? I would love to learn why it did
this, and REALLY hope it doesn't do it again.
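
In case it helps anyone checking how widespread this is, here is a small
sketch that lists names appearing more than once in a directory listing.
The helper name dup_names is made up for illustration; it relies only on
sort and uniq.

```shell
# dup_names: read one file name per line on stdin and print every name
# that occurs more than once. (Hypothetical helper, not a Gluster tool.)
dup_names() {
    sort | uniq -d
}
```

For example: ls -1A /mnt/glusterfs//www/t/tijdschriftvoore1951nede/ | dup_names
An empty result would mean the listing contains no repeated names.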
Thanks
P
On Wed, Jan 12, 2011 at 2:37 PM, phil cryer <phil at cryer.us> wrote:
> I'm now running gluster 3.1.1 on Debian. A directory that was running
> under 3.0.4 had duplicate files, but I've remounted things now that
> we're running 3.1.1 in hopes it would fix things, but so far it has
> not:
>
> # ls -l /mnt/glusterfs/www/0/0descriptionofta581unit
> total 37992
> -rwxr-xr-x 1 www-data www-data 796343 Jun 23 2010 0descriptionofta581unit_bw.pdf
> -rwxr-xr-x 1 www-data www-data 796343 Jun 23 2010 0descriptionofta581unit_bw.pdf
> ---------T 1 root root 1497 Jun 24 2010 0descriptionofta581unit_dc.xml
> ---------T 1 root root 1497 Jun 24 2010 0descriptionofta581unit_dc.xml
> ---------T 1 www-data www-data 577050 Jun 24 2010 0descriptionofta581unit.djvu
> ---------T 1 www-data www-data 577050 Jun 24 2010 0descriptionofta581unit.djvu
> -rwxr-xr-x 1 www-data www-data 33272 Jun 22 2010 0descriptionofta581unit_djvu.txt
> -rwxr-xr-x 1 www-data www-data 33272 Jun 22 2010 0descriptionofta581unit_djvu.txt
> -rwxr-xr-x 1 www-data www-data 4445 Jun 23 2010 0descriptionofta581unit_files.xml
> -rwxr-xr-x 1 www-data www-data 4445 Jun 23 2010 0descriptionofta581unit_files.xml
> -rwxr-xr-x 1 www-data www-data 5011 Jun 22 2010 0descriptionofta581unit_marc.xml
> -rwxr-xr-x 1 www-data www-data 5011 Jun 22 2010 0descriptionofta581unit_marc.xml
> -rwxr-xr-x 1 www-data www-data 360 Jun 23 2010 0descriptionofta581unit_metasource.xml
> -rwxr-xr-x 1 www-data www-data 360 Jun 23 2010 0descriptionofta581unit_metasource.xml
> -rwxr-xr-x 1 www-data www-data 2848 Jun 22 2010 0descriptionofta581unit_meta.xml
> -rwxr-xr-x 1 www-data www-data 2848 Jun 22 2010 0descriptionofta581unit_meta.xml
> -rwxr-xr-x 1 www-data www-data 16916480 Jun 22 2010 0descriptionofta581unit_orig_jp2.tar
> -rwxr-xr-x 1 www-data www-data 16916480 Jun 22 2010 0descriptionofta581unit_orig_jp2.tar
> -rwxr-xr-x 1 www-data www-data 1051810 Jun 22 2010 0descriptionofta581unit.pdf
> -rwxr-xr-x 1 www-data www-data 1051810 Jun 22 2010 0descriptionofta581unit.pdf
>
> While running the latest, 3.1.1, I noticed some log files that said:
>
> [..]
> [2011-01-12 15:24:33.325546] I [afr-common.c:613:afr_lookup_self_heal_check] bhl-volume-replicate-69: size differs for /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
> [2011-01-12 15:24:33.325558] I [afr-common.c:716:afr_lookup_done] bhl-volume-replicate-69: background meta-data data self-heal triggered. path: /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
> [2011-01-12 15:24:33.364501] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] bhl-volume-replicate-66: background meta-data data self-heal completed on /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
> [2011-01-12 15:24:33.364881] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] bhl-volume-replicate-69: background meta-data data self-heal completed on /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
>
> I assumed it was fixing that, but it didn't. Here are the full logs that
> include all the gluster.log work it did in this directory:
> http://pastebin.com/8X52Em7Y
>
> Question: how can I 'fix' this, or is the best bet to remove
> everything and start over? It's going to set us back, but I'd rather
> do it now than keep banging on this without any resolution.
>
> Thanks for the help, really like the new gluster command, very nice!
>
> P
> --
> http://philcryer.com
>
--
http://philcryer.com
_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users