[Gluster-users] missing files
Jeremy Enos
jenos at ncsa.uiuc.edu
Tue Nov 24 01:39:09 UTC 2009
I have another clue to report:
So I have my export directory as:
/export
Mounted as:
/scratch
If I do "ls -lR /scratch", it's supposed to synchronize all files and
metadata, right? Well, it doesn't seem to be doing that.
I have approx 100 files in one problematic folder. Only 50 show up to
ls. That is, until I list it specifically. They also don't show up in
the export directory until ls'd by name in /scratch.
ls /scratch/file* # results in files 1-49 being listed
ls /export/file* # same result as above
ls /export/file50.dat # no such file or directory
ls /scratch/file50.dat # lists file as if nothing was ever wrong
ls /export/file50.dat # shows up now after specific ls call in /scratch
ls /scratch/file* # results in files 1-50 being listed now (magic?)
ls /export/file* # also results in files 1-50 being listed now
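A minimal sketch of how the hidden files could be enumerated without naming them individually, since naming one on the mount appears to make it reappear. It assumes the files follow the file<N>.dat pattern seen above (file50.dat). The function name missing_from_listing is made up for illustration. It relies only on a plain directory read and compares the result against the expected names:

```shell
#!/bin/sh
# Hypothetical helper: print the expected file<N>.dat names that do NOT
# appear in a plain directory listing of $1, for N from $2 to $3.
# Only the directory itself is read; the missing names are never
# accessed by path, so they should stay in their current state.
missing_from_listing() {
    dir=$1
    first=$2
    last=$3
    listing=$(ls -1 "$dir")
    for n in $(seq "$first" "$last"); do
        name="file$n.dat"
        # -F fixed string, -x whole-line match, -q quiet
        if ! printf '%s\n' "$listing" | grep -Fxq "$name"; then
            echo "$name"
        fi
    done
}
```

Example use: missing_from_listing /scratch 1 100 > hidden.txt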
I'm considering doing a:
for n in `seq 51 100` ; do ls /scratch/file$n.dat ; done
just to recover the files. However, I'm delaying that so I can keep
some in the problematic state should someone give me some additional
debugging steps here. Don't get me wrong- I appreciate any help I can
get w/ a free product like this. But I'm actually surprised that a
report like this just seems to be hitting a dead end on this list in
terms of responses. Isn't this alarming behavior? Somehow the
filesystem got into a state where files still were recorded, but weren't
represented until specifically listed. That should tell us something,
but I'm no expert here.
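For reference, a rough sketch of how that recovery loop could skip a few files so they remain in the problematic state for debugging. The function name recover_by_lookup and the keep_from cutoff are hypothetical, and it assumes (as observed above, not confirmed) that an explicit lookup on the mount is what brings a file back:

```shell
#!/bin/sh
# Hypothetical helper: look up file<N>.dat by name under mount point $1
# for N from $2 to $3, but leave every N >= $4 untouched so those files
# stay in the broken state for later debugging.
recover_by_lookup() {
    mnt=$1
    first=$2
    last=$3
    keep_from=$4
    for n in $(seq "$first" "$last"); do
        if [ "$n" -ge "$keep_from" ]; then
            continue
        fi
        f="$mnt/file$n.dat"
        # The explicit lookup by name is the step that seemed to fix file50.dat.
        if ls "$f" >/dev/null 2>&1; then
            echo "ok $f"
        else
            echo "missing $f"
        fi
    done
}
```

Example use: recover_by_lookup /scratch 51 100 96 (leaves files 96-100 alone).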
thx-
Jeremy
Jeremy Enos wrote:
> Can anyone tell me if there's hope of recovering data here? Steps to
> take? Anything? Is something wrong with my configuration? (raid1 over
> raid0) If I don't have a clue what went wrong or why, or how to
> recover, then even formatting and starting fresh doesn't lend much hope
> in future reliability.
> thx-
>
> Jeremy
>
> Jeremy Enos wrote:
>
>> plain text send...
>>
>> Jeremy Enos wrote:
>>
>>> What kind of tweaking and tampering was necessary to recover the lost
>>> data?
>>>
>>> Jeremy
>>>
>>> My configuration:
>>> Oh yes- of course- don't know why I left this out. Version and
>>> config files follow.
>>>
>>> [jenos@ac glusterfs]$ rpm -qa | grep gluster
>>> glusterfs-common-2.0.7-1.fc10.x86_64
>>> glusterfs-client-2.0.7-1.fc10.x86_64
>>>
>>>
>>> [jenos@ac glusterfs]$ cat glusterfs.vol
>>> #-----------IB remotes------------------
>>> volume remote1
>>> type protocol/client
>>> option transport-type ib-verbs/client
>>> option remote-host ac11
>>> option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote2
>>> type protocol/client
>>> option transport-type ib-verbs/client
>>> option remote-host ac12
>>> option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote3
>>> type protocol/client
>>> option transport-type ib-verbs/client
>>> option remote-host ac13
>>> option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote4
>>> type protocol/client
>>> option transport-type ib-verbs/client
>>> option remote-host ac14
>>> option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote5
>>> type protocol/client
>>> option transport-type ib-verbs/client
>>> option remote-host ac15
>>> option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote6
>>> type protocol/client
>>> option transport-type ib-verbs/client
>>> option remote-host ac16
>>> option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote7
>>> type protocol/client
>>> option transport-type ib-verbs/client
>>> option remote-host ac17
>>> option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote8
>>> type protocol/client
>>> option transport-type ib-verbs/client
>>> option remote-host ac18
>>> option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote9
>>> type protocol/client
>>> option transport-type ib-verbs/client
>>> option remote-host ac19
>>> option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote10
>>> type protocol/client
>>> option transport-type ib-verbs/client
>>> option remote-host ac20
>>> option remote-subvolume ibstripe
>>> end-volume
>>>
>>> #----------Stripe and Replicate------------------
>>>
>>> volume stripe1
>>> type cluster/stripe
>>> option block-size 1MB
>>> subvolumes remote1 remote2 remote3 remote4 remote5
>>> end-volume
>>>
>>> volume stripe2
>>> type cluster/stripe
>>> option block-size 1MB
>>> subvolumes remote6 remote7 remote8 remote9 remote10
>>> end-volume
>>>
>>> volume replicate
>>> type cluster/replicate
>>> option metadata-self-heal on
>>> subvolumes stripe1 stripe2
>>> end-volume
>>>
>>> #------------Performance Options-------------------
>>>
>>> volume readahead
>>> type performance/read-ahead
>>> option page-count 4 # 2 is default option
>>> option force-atime-update off # default is off
>>> subvolumes replicate
>>> end-volume
>>>
>>> volume writebehind
>>> type performance/write-behind
>>> option cache-size 1MB
>>> subvolumes readahead
>>> end-volume
>>>
>>> volume cache
>>> type performance/io-cache
>>> option cache-size 1GB
>>> subvolumes writebehind
>>> end-volume
>>>
>>> [jenos@ac glusterfs]$ cat glusterfsd.vol
>>> volume posix
>>> type storage/posix
>>> option directory /export
>>> end-volume
>>>
>>> volume locks
>>> type features/locks
>>> subvolumes posix
>>> end-volume
>>>
>>> volume ibstripe
>>> type performance/io-threads
>>> option thread-count 4
>>> subvolumes locks
>>> end-volume
>>>
>>> volume server-ib
>>> type protocol/server
>>> option transport-type ib-verbs/server
>>> option auth.addr.ibstripe.allow *
>>> subvolumes ibstripe
>>> end-volume
>>>
>>> volume server-tcp
>>> type protocol/server
>>> option transport-type tcp/server
>>> option auth.addr.ibstripe.allow *
>>> subvolumes ibstripe
>>> end-volume
>>>
>>> [jenos@ac glusterfs]$
>>>
>>>
>>>
>>> Krzysztof Strasburger wrote:
>>>
>>>> On Wed, Nov 04, 2009 at 01:31:30AM -0600, Jeremy Enos wrote:
>>>>
>>>>
>>>>> Hi-
>>>>> I've got a problem where certain batches of files written out to
>>>>> gluster have disappeared. Also, newly created files sometimes
>>>>> don't show up to ls unless they are explicitly specified to ls and
>>>>> other tools.
>>>>>
>>>>> In my export folder, everything appears fine.
>>>>> I have found that when I touch the missing file in gluster, it
>>>>> comes back, shows a file size, but appears empty. I've tried
>>>>> umounting, restarting all glusterfsds, remounting, and it stayed
>>>>> the same. Also, this problem did not show up immediately after
>>>>> setting up the filesystem, at least during basic tests. Any ideas?
>>>>>
>>>>>
>>>> What is your configuration? I experienced similar problems with unify
>>>> after a disk crash. The namespace (replicated) was not rebuilt correctly
>>>> after replacing the failing unit and I had to add some files manually
>>>> (OK, using a script, but an intervention was needed). No data loss,
>>>> only a bit of tweaking and tampering ;).
>>>> Krzysztof
>>>>