[Gluster-users] Seeing duplicate files, with duplicate names, inode number, etc

phil cryer phil at cryer.us
Thu Oct 28 03:28:40 UTC 2010


In addition to this, I'm copying some of the directories to external
drives to transfer them to outside repositories, and the output makes
it clear that gluster has re-downloaded and doubled up files in many
directories. cp copies a file once to the external drive, then gets
the same file name again, and that second copy is rejected with
"File exists". Again, if we run md5sum against the files they match,
the inodes are the same, everything... and these are not hard links.
What is happening?
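
One way to check this from the client side is to read a directory exactly as the mount lists it and count the names that come back more than once. What follows is only a minimal sketch, not a GlusterFS tool: the helper names `find_duplicate_entries` and `scan_directory` are made up for illustration, and it assumes the duplicates show up in a plain directory listing on the mount.

```python
import os
from collections import defaultdict


def find_duplicate_entries(names_and_inodes):
    """Return the names that are listed more than once.

    Takes an iterable of (name, inode) pairs and returns a dict mapping
    each repeated name to the inode numbers reported for it, in listing order.
    """
    seen = defaultdict(list)
    for name, inode in names_and_inodes:
        seen[name].append(inode)
    return {name: inodes for name, inodes in seen.items() if len(inodes) > 1}


def scan_directory(path):
    """Yield (name, inode) for every entry as the filesystem lists it."""
    with os.scandir(path) as entries:
        for entry in entries:
            yield entry.name, entry.inode()
```

Called as `find_duplicate_entries(scan_directory("/mnt/glusterfs/www/n/naturalistslibra30jardrich"))`, a non-empty result would list each doubled name together with the inode numbers seen for it. Identical inode numbers for one name would match what `ls -i` shows here.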

Here's the output:

---------------------------------------------------------------------------------------
[...]
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_raw_jp2.zip':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_pure_jp2.zip'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_pure_jp2.zip'
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_meta.mrc'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_meta.mrc'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_meta.mrc':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_marc.xml'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_marc.xml'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_marc.xml':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_abbyy.gz'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_abbyy.gz'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_abbyy.gz':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_lib_jp2.zip'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_lib_jp2.zip'
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich.djvu'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich.djvu'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich.djvu':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_jp2.zip'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_jp2.zip'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_jp2.zip':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_metasource.xml'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_metasource.xml'
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich.gif'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich.gif'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich.gif':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_flippy.zip'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_flippy.zip'
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_djvu.xml'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_djvu.xml'
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_pure_jp2.zip'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_pure_jp2.zip'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_pure_jp2.zip':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_dc.xml'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_dc.xml'
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich.pdf'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich.pdf'
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_metasource.xml'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_metasource.xml'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_metasource.xml':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_djvu.xml'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_djvu.xml'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_djvu.xml':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich.pdf'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich.pdf'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich.pdf':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_dc.xml'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_dc.xml'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_dc.xml':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_flippy.zip'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_flippy.zip'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_flippy.zip':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_djvu.txt'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_djvu.txt'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_djvu.txt':
File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/scandata.zip' ->
`/mnt/external/n/naturalistslibra30jardrich/scandata.zip'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/scandata.zip': File exists
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_meta.xml'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_meta.xml'
`/mnt/glusterfs/www/n/naturalistslibra30jardrich/naturalistslibra30jardrich_lib_jp2.zip'
-> `/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_lib_jp2.zip'
cp: cannot create regular file
`/mnt/external/n/naturalistslibra30jardrich/naturalistslibra30jardrich_lib_jp2.zip':
File exists
`/mnt/glusterfs/www/n/newconquestofcen00andr' ->
`/mnt/external/n/newconquestofcen00andr'
`/mnt/glusterfs/www/n/nederlandschtijd02arnh' ->
`/mnt/external/n/nederlandschtijd02arnh'
`/mnt/glusterfs/www/n/newconceptionsin00snyduoft' ->
`/mnt/external/n/newconceptionsin00snyduoft'
`/mnt/glusterfs/www/n/nestseggsofnorth00daviuoft' ->
`/mnt/external/n/nestseggsofnorth00daviuoft'
`/mnt/glusterfs/www/n/nederlandschtijd02arnh' ->
`/mnt/external/n/nederlandschtijd02arnh'
cp: cannot create regular file
`/mnt/external/n/nederlandschtijd02arnh': File exists
`/mnt/glusterfs/www/n/naturalsciencemo02lond' ->
`/mnt/external/n/naturalsciencemo02lond'
`/mnt/glusterfs/www/n/naturwissenschaf10brau' ->
`/mnt/external/n/naturwissenschaf10brau'
`/mnt/glusterfs/www/n/notizenausdemgeb79weim' ->
`/mnt/external/n/notizenausdemgeb79weim'
`/mnt/glusterfs/www/n/nomenclatureofco00britsm' ->
`/mnt/external/n/nomenclatureofco00britsm'
`/mnt/glusterfs/www/n/noaatechnicalrep649unit' ->
`/mnt/external/n/noaatechnicalrep649unit'
[16:25:57] [root@clustr-02 /mnt/external]#
---------------------------------------------------------------------------------------

From this you can pick one of the first directories that had so many
issues and see the duplicate files:
http://cluster.biodiversitylibrary.org/n/naturalistslibra30jardrich/
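
To see how widespread this is across a long copy run, the "File exists" errors in saved cp output like the above can be tallied per destination directory. This is a rough sketch under two assumptions: that the output was captured to a string, and that it uses the same quoting as shown above (a backtick before each path and an apostrophe after it). The function names are hypothetical.

```python
import re
from collections import Counter

# Matches a rejected destination even when the mail client or terminal
# wrapped the message across lines (\s+ also matches newlines).
REJECTED = re.compile(
    r"cp: cannot create regular file\s+`([^']+)':\s+File exists"
)


def rejected_destinations(cp_output):
    """Return the destination paths that cp refused to create, in order."""
    return REJECTED.findall(cp_output)


def rejections_per_directory(cp_output):
    """Count rejected destinations grouped by their parent directory."""
    return Counter(
        path.rsplit("/", 1)[0] for path in rejected_destinations(cp_output)
    )
```

Directories with a high count are the ones where entries were listed twice during the copy, so they are good candidates for a closer look on the bricks.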

P



On Wed, Oct 27, 2010 at 12:23 PM, phil cryer <phil at cryer.us> wrote:
> We're building our cluster of data, downloading book data from
> Internet Archive. I've come across one that looks like this:
> http://cluster.biodiversitylibrary.org/n/naturwissenschaft19deut/
>
> Almost all the files appear to be there twice, but have the same name,
> timestamp and inode! What could be causing this, and how can we fix
> it? At issue is space; it appears that we're using far more space than
> we should: `du -h` and `ls -lsh` both say this directory takes
> 3.9G when it should really be about half that. If it has done this in
> many of the directories, it could explain how we're using 78T of 97T
> of space already.
>
> P
> --
> http://philcryer.com
>
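
On the space question, one rough check is to compare the bytes summed over every listed entry with the bytes summed once per unique (device, inode) pair. This is a sketch only, with hypothetical helper names. It works from what the client mount reports, so it cannot say what is actually stored on the bricks.

```python
import os


def listed_vs_unique_bytes(entries):
    """Compare per-entry and per-inode size totals.

    entries is an iterable of (device, inode, size) tuples, one per listed
    name. Returns (listed_total, unique_total), where unique_total counts
    each (device, inode) pair only once.
    """
    listed_total = 0
    size_by_inode = {}
    for dev, ino, size in entries:
        listed_total += size
        size_by_inode[(dev, ino)] = size
    return listed_total, sum(size_by_inode.values())


def stat_entries(path):
    """Yield (device, inode, size) for each regular file listed in path."""
    with os.scandir(path) as it:
        for entry in it:
            if entry.is_file(follow_symlinks=False):
                st = entry.stat(follow_symlinks=False)
                yield st.st_dev, st.st_ino, st.st_size
```

If `listed_vs_unique_bytes(stat_entries(some_dir))` returns a first number about twice the second, the inflated figure for that directory may come from entries being listed twice, not from data being stored twice. That would still need to be confirmed against the backend storage.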



-- 
http://philcryer.com


