[Gluster-users] Debian, 3.1.1, duplicate files
phil cryer
phil at cryer.us
Fri Feb 11 21:29:22 UTC 2011
On Thu, Jan 13, 2011 at 3:53 PM, Jacob Shucart <jacob at gluster.com> wrote:
> Phil,
>
> This sounds to me like an issue identified that affects Gluster directories
> that were part of older versions related to extended attributes that were
> set on the directories. I believe this issue is supposed to be fixed in
> 3.1.2. I don't know how large your dataset is, but a way to fix it would be
> to:
>
> 1. Delete the Gluster volume.
> 2. On the back end directories on your nodes, scrub the offending extended
> attribute with the command:
> find /back/end/dir -exec setfattr -x trusted.gfid {} \;
> 3. Create the Gluster volume again.
> 4. Mount the volume somewhere as a GlusterFS(mount -t glusterfs....) and
> run:
> find /mnt/gluster -print0 | xargs --null stat
> 5. Enjoy.
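For anyone who would rather script step 2 than run the find one-liner, here is a minimal sketch in Python. It is only an illustration of the same idea (walk the brick's back-end directory and drop trusted.gfid from every entry), not an official Gluster tool. The function name scrub_xattr, its parameters, and the dry-run default are assumptions made for this example. Removing trusted.* attributes needs root, and os.removexattr is Linux-only.

```python
import errno
import os


def scrub_xattr(root, attr="trusted.gfid", remove=None, dry_run=True):
    """Walk `root` and strip `attr` from the directory and everything under it.

    Returns the list of paths that were (or, in a dry run, would be) processed.
    `remove` is a hypothetical hook for testing; it defaults to os.removexattr.
    """
    processed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Like `find`, include the starting directory itself exactly once.
        entries = [dirpath] if dirpath == root else []
        entries += [os.path.join(dirpath, name) for name in dirnames + filenames]
        for path in entries:
            processed.append(path)
            if dry_run:
                continue
            if remove is None:
                remove = os.removexattr  # resolved lazily; Linux only
            try:
                remove(path, attr)
            except OSError as exc:
                # ENODATA means the attribute was not set on this entry.
                if exc.errno != errno.ENODATA:
                    raise
    return processed
```

Calling scrub_xattr("/back/end/dir") only lists what would be touched; passing dry_run=False performs the removal. As with the find command, this should be run on the back-end directories and only while the volume is deleted.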
Jacob,
Thanks for your reply. To solve this I installed 3.1.2, then
re-formatted all of my drives (bricks). It might have been overkill,
but I wanted to start completely fresh with 3.1.2. So far we've had
no issues with the setup. I'll be careful from now on when I
update versions; hopefully there will be a path that avoids gotchas
like this!
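
One way to be careful after a future upgrade is to scan the mounted volume for directories that list the same name more than once, which is the symptom shown in the listings quoted below. The following is a rough Python sketch, not something that was run on this cluster. The function names duplicate_names and scan_tree and the listdir parameter are made up for illustration, and it assumes the mount reports the duplicate entries to a directory read the same way ls showed them.

```python
import os
from collections import Counter


def duplicate_names(names):
    """Return the names that occur more than once, sorted."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


def scan_tree(root, listdir=os.listdir):
    """Map each directory under `root` to its duplicated entry names, if any.

    `listdir` is a hypothetical hook so the scan can be tested off-cluster.
    """
    report = {}
    stack = [root]
    while stack:
        current = stack.pop()
        names = listdir(current)
        dupes = duplicate_names(names)
        if dupes:
            report[current] = dupes
        # Descend into each real subdirectory once, even if it is listed twice.
        for name in set(names):
            path = os.path.join(current, name)
            if os.path.isdir(path) and not os.path.islink(path):
                stack.append(path)
    return report
```

On a healthy mount scan_tree("/mnt/glusterfs") should return an empty dict; any directory that shows up in the result lists the same file name more than once.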
Thanks
P
>
> Please let me know if that helps. Thank you.
>
> -Jacob
>
> -----Original Message-----
> From: gluster-users-bounces at gluster.org
> [mailto:gluster-users-bounces at gluster.org] On Behalf Of phil cryer
> Sent: Thursday, January 13, 2011 9:07 AM
> To: gluster-users at gluster.org
> Subject: Re: [Gluster-users] Debian, 3.1.1, duplicate files
>
> I haven't heard anything back, so I wanted to update this in case
> anyone else comes across it. This was an old store that we created in
> 3.0.4 and that kept getting duplicate files. We ran an update script
> that used wget to download any files that were on the remote but not
> present on the local box. If wget fetched a file we already had, I
> would expect it to either 1) skip the download because the file
> already exists (no-clobber), 2) overwrite the file with the new
> version (clobber), or 3) save the new copy as file.1 so as not to
> touch the original. In fact it did none of these, and we ended up
> with the bizarre result of multiple identical files with the same
> name in the same directory. We're also using far more space than we
> should (~70TB instead of ~40TB or so) thanks to directories like
> this:
>
> # ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/
> total 536436
> drwxr-xr-x 2 www-data www-data 294912 Jan 13 10:05 .
> drwx------ 1016 www-data www-data 3846144 Dec 12 11:10 ..
> -rwxr-xr-x 1 www-data www-data 1151282 Jul 12 2010 tijdschriftvoore1951nede_djvu.txt
> -rwxr-xr-x 1 www-data www-data 1151282 Jul 12 2010 tijdschriftvoore1951nede_djvu.txt
> -rwxr-xr-x 1 www-data www-data 12078834 Jul 12 2010 tijdschriftvoore1951nede_djvu.xml
> -rwxr-xr-x 1 www-data www-data 12078834 Jul 12 2010 tijdschriftvoore1951nede_djvu.xml
> -rwxr-xr-x 1 www-data www-data 271733 Jul 12 2010 tijdschriftvoore1951nede.gif
> -rwxr-xr-x 1 www-data www-data 271733 Jul 12 2010 tijdschriftvoore1951nede.gif
> -rwxr-xr-x 1 www-data www-data 257779301 Jul 12 2010 tijdschriftvoore1951nede_jp2.zip
> -rwxr-xr-x 1 www-data www-data 257779301 Jul 12 2010 tijdschriftvoore1951nede_jp2.zip
> -rwxr-xr-x 1 www-data www-data 2278 Jul 12 2010 tijdschriftvoore1951nede_marc.xml
> -rwxr-xr-x 1 www-data www-data 2278 Jul 12 2010 tijdschriftvoore1951nede_marc.xml
> -rwxr-xr-x 1 www-data www-data 720 Jul 12 2010 tijdschriftvoore1951nede_meta.mrc
> -rwxr-xr-x 1 www-data www-data 720 Jul 12 2010 tijdschriftvoore1951nede_meta.mrc
> -rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 tijdschriftvoore1951nede_names.xml
> -rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 tijdschriftvoore1951nede_names.xml
> -rwxr-xr-x 1 www-data www-data 256 Jul 12 2010 tijdschriftvoore1951nede_names.xml_meta.txt
> -rwxr-xr-x 1 www-data www-data 256 Jul 12 2010 tijdschriftvoore1951nede_names.xml_meta.txt
> -rwxr-xr-x 1 www-data www-data 257556 Jul 13 2010 tijdschriftvoore1951nede_scandata.xml
> -rwxr-xr-x 1 www-data www-data 257556 Jul 13 2010 tijdschriftvoore1951nede_scandata.xml
>
> Ouch, right? So I installed 3.1.1, which went well. I got it on all
> the drives and servers we had before, and we have a total capacity of
> 96TB again. All seemed to be working, so I mounted the old
> directories, saw the same issue with the duplicate files, and let it
> sit overnight to see if it would notice this and try to fix things.
> Now we're seeing gluster logs saying things like:
>
> ==> glusterfs/mnt-glusterfs.log <==
> [2011-01-13 11:46:23.2762] I [afr-common.c:662:afr_lookup_done]
> bhl-volume-replicate-55: entries are missing in lookup of
> /www/t/tijdschriftvoore1951nede.
> [2011-01-13 11:46:23.2817] I [afr-common.c:716:afr_lookup_done]
> bhl-volume-replicate-55: background meta-data data entry self-heal
> triggered. path: /www/t/tijdschriftvoore1951nede
> [2011-01-13 11:46:23.5342] I
> [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk]
> bhl-volume-replicate-55: background meta-data data entry self-heal
> completed on /www/t/tijdschriftvoore1951nede
>
> ...so we think, hey, maybe we're all set here, it's fixing itself and
> removing those duplicate files, but no such luck:
>
> # ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/
> total 536436
> drwxr-xr-x 2 www-data www-data 294912 Jan 13 10:05 .
> drwx------ 1016 www-data www-data 3846144 Dec 12 11:10 ..
> -rwxr-xr-x 1 www-data www-data 1151282 Jul 12 2010 tijdschriftvoore1951nede_djvu.txt
> -rwxr-xr-x 1 www-data www-data 1151282 Jul 12 2010 tijdschriftvoore1951nede_djvu.txt
> -rwxr-xr-x 1 www-data www-data 12078834 Jul 12 2010 tijdschriftvoore1951nede_djvu.xml
> -rwxr-xr-x 1 www-data www-data 12078834 Jul 12 2010 tijdschriftvoore1951nede_djvu.xml
> -rwxr-xr-x 1 www-data www-data 271733 Jul 12 2010 tijdschriftvoore1951nede.gif
> -rwxr-xr-x 1 www-data www-data 271733 Jul 12 2010 tijdschriftvoore1951nede.gif
> -rwxr-xr-x 1 www-data www-data 257779301 Jul 12 2010 tijdschriftvoore1951nede_jp2.zip
> -rwxr-xr-x 1 www-data www-data 257779301 Jul 12 2010 tijdschriftvoore1951nede_jp2.zip
> -rwxr-xr-x 1 www-data www-data 2278 Jul 12 2010 tijdschriftvoore1951nede_marc.xml
> -rwxr-xr-x 1 www-data www-data 2278 Jul 12 2010 tijdschriftvoore1951nede_marc.xml
> -rwxr-xr-x 1 www-data www-data 720 Jul 12 2010 tijdschriftvoore1951nede_meta.mrc
> -rwxr-xr-x 1 www-data www-data 720 Jul 12 2010 tijdschriftvoore1951nede_meta.mrc
> -rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 tijdschriftvoore1951nede_names.xml
> -rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 tijdschriftvoore1951nede_names.xml
> -rwxr-xr-x 1 www-data www-data 256 Jul 12 2010 tijdschriftvoore1951nede_names.xml_meta.txt
> -rwxr-xr-x 1 www-data www-data 256 Jul 12 2010 tijdschriftvoore1951nede_names.xml_meta.txt
> -rwxr-xr-x 1 www-data www-data 257556 Jul 13 2010 tijdschriftvoore1951nede_scandata.xml
> -rwxr-xr-x 1 www-data www-data 257556 Jul 13 2010 tijdschriftvoore1951nede_scandata.xml
>
> but, this allows us to do (in my opinion) scary things like this:
>
> # ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/*_names.xml
> -rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 /mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml
> -rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 /mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml
>
> # rm /mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml
>
> # ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/*_names.xml
> -rwxr-xr-x 1 www-data www-data 546411 Jul 12 2010 /mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml
>
> eek! so it only removed one of the files, even though they both had
> the same name. At this point we're going to wipe all 70TB and
> re-transfer, hoping it stops when it gets all the files and doesn't
> start writing the files with the same names as before. Anyone with
> advice or insight into this issue? Would love to learn why it did
> this, and REALLY hope it doesn't do it again.
>
> Thanks
>
> P
>
>
>
> On Wed, Jan 12, 2011 at 2:37 PM, phil cryer <phil at cryer.us> wrote:
>> I'm now running gluster 3.1.1 on Debian. A directory that was running
>> under 3.0.4 had duplicate files, but I've remounted things now that
>> we're running 3.1.1 in hopes it would fix things, but so far it has
>> not:
>>
>> # ls -l /mnt/glusterfs/www/0/0descriptionofta581unit
>> total 37992
>> -rwxr-xr-x 1 www-data www-data 796343 Jun 23 2010 0descriptionofta581unit_bw.pdf
>> -rwxr-xr-x 1 www-data www-data 796343 Jun 23 2010 0descriptionofta581unit_bw.pdf
>> ---------T 1 root root 1497 Jun 24 2010 0descriptionofta581unit_dc.xml
>> ---------T 1 root root 1497 Jun 24 2010 0descriptionofta581unit_dc.xml
>> ---------T 1 www-data www-data 577050 Jun 24 2010 0descriptionofta581unit.djvu
>> ---------T 1 www-data www-data 577050 Jun 24 2010 0descriptionofta581unit.djvu
>> -rwxr-xr-x 1 www-data www-data 33272 Jun 22 2010 0descriptionofta581unit_djvu.txt
>> -rwxr-xr-x 1 www-data www-data 33272 Jun 22 2010 0descriptionofta581unit_djvu.txt
>> -rwxr-xr-x 1 www-data www-data 4445 Jun 23 2010 0descriptionofta581unit_files.xml
>> -rwxr-xr-x 1 www-data www-data 4445 Jun 23 2010 0descriptionofta581unit_files.xml
>> -rwxr-xr-x 1 www-data www-data 5011 Jun 22 2010 0descriptionofta581unit_marc.xml
>> -rwxr-xr-x 1 www-data www-data 5011 Jun 22 2010 0descriptionofta581unit_marc.xml
>> -rwxr-xr-x 1 www-data www-data 360 Jun 23 2010 0descriptionofta581unit_metasource.xml
>> -rwxr-xr-x 1 www-data www-data 360 Jun 23 2010 0descriptionofta581unit_metasource.xml
>> -rwxr-xr-x 1 www-data www-data 2848 Jun 22 2010 0descriptionofta581unit_meta.xml
>> -rwxr-xr-x 1 www-data www-data 2848 Jun 22 2010 0descriptionofta581unit_meta.xml
>> -rwxr-xr-x 1 www-data www-data 16916480 Jun 22 2010 0descriptionofta581unit_orig_jp2.tar
>> -rwxr-xr-x 1 www-data www-data 16916480 Jun 22 2010 0descriptionofta581unit_orig_jp2.tar
>> -rwxr-xr-x 1 www-data www-data 1051810 Jun 22 2010 0descriptionofta581unit.pdf
>> -rwxr-xr-x 1 www-data www-data 1051810 Jun 22 2010 0descriptionofta581unit.pdf
>>
>> While running the latest, 3.1.1, I noticed some log files that said:
>>
>> [..]
>> [2011-01-12 15:24:33.325546] I
>> [afr-common.c:613:afr_lookup_self_heal_check] bhl-volume-replicate-69:
>> size differs for
>> /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
>> [2011-01-12 15:24:33.325558] I [afr-common.c:716:afr_lookup_done]
>> bhl-volume-replicate-69: background meta-data data self-heal
>> triggered. path:
>> /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
>> [2011-01-12 15:24:33.364501] I
>> [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk]
>> bhl-volume-replicate-66: background meta-data data self-heal
>> completed on /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
>> [2011-01-12 15:24:33.364881] I
>> [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk]
>> bhl-volume-replicate-69: background meta-data data self-heal
>> completed on /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
>>
>> I assumed it was fixing that, but it didn't. Here are the full logs,
>> which include all the gluster.log work it did in this directory:
>> http://pastebin.com/8X52Em7Y
>>
>> Question: how can I 'fix' this, or is the best bet to remove
>> everything and start over? It's going to set us back, but I'd rather
>> do it now than keep banging on this without any resolution.
>>
>> Thanks for the help, really like the new gluster command, very nice!
>>
>> P
>> --
>> http://philcryer.com
>>
>
>
>
> --
> http://philcryer.com
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>
--
http://philcryer.com