[Gluster-users] rm -rf errors

Vijay Bellur vijay at zresearch.com
Mon May 11 17:05:08 UTC 2009


Hello Federico,

Can you please try with IP addresses instead of hostnames in the volume file?
There was a problem with long hostnames in 2.0.0 which has since been
fixed.
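
For example, the client volumes would point at the address directly, something
like this (the address below is only a placeholder; substitute the real IP of
drdan0191):

volume drdan0191
  type protocol/client
  option transport-type tcp
  # placeholder IP address in place of the long FQDN
  option remote-host 10.232.0.191
  option remote-subvolume brick
end-volume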

Thanks,
Vijay

On Mon, May 11, 2009 at 10:19 PM, Sacerdoti, Federico <
Federico.Sacerdoti at deshawresearch.com> wrote:

>  Thanks. I updated to 2.0.0, but the daemons will not start; they give a very
> generic error that does not help:
>
> 2009-05-11 12:41:56 E [glusterfsd.c:483:_xlator_graph_init] drdan0199:
> validating translator failed
> 2009-05-11 12:41:56 E [glusterfsd.c:1145:main] glusterfs: translator
> initialization failed.  exiting
>
> Can you see something wrong in the volume file? This works fine with
> 2.0.0rc4.
>
> --START--
> volume storage
>   type storage/posix
>   option directory /scratch/glusterfs/export
> end-volume
>
> # Required for AFR (file replication) module
> volume locks
>   type features/locks
>   subvolumes storage
> end-volume
>
> volume brick
>   type performance/io-threads
> #option thread-count 1
>   option thread-count 8
>   subvolumes locks
> end-volume
>
> volume server
>   type protocol/server
>   subvolumes brick
>   option transport-type tcp
>   option auth.addr.brick.allow 10.232.*
> end-volume
>
> volume drdan0191
>   type protocol/client
>   option transport-type tcp
>   option remote-host drdan0191.en.desres.deshaw.com
>   option remote-subvolume brick
> end-volume
>
> volume drdan0192
>   type protocol/client
>   option transport-type tcp
>   option remote-host drdan0192.en.desres.deshaw.com
>   option remote-subvolume brick
> end-volume
> [...]
>
> volume nufa
>   type cluster/nufa
>   option local-volume-name `hostname -s`
>   #subvolumes replicate1 replicate2 replicate3 replicate4 replicate5
>   subvolumes drdan0191 drdan0192 drdan0193 drdan0194 drdan0195 drdan0196
> drdan0197 drdan0198 drdan0199 drdan0200
> end-volume
>
> # This, from https://savannah.nongnu.org/bugs/?24972, does the
> # filesystem mounting at server start time. Like an /etc/fstab entry
> volume fuse
>   type mount/fuse
>   option direct-io-mode 1
>   option entry-timeout 1
>   #option attr-timeout 1 (not recognized in 2.0)
>   option mountpoint /mnt/glusterfs
>   subvolumes nufa
> end-volume
> --END--
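>
> (For reference: with that fuse volume in place, starting glusterfsd also
> mounts the client; without it I believe one would mount by hand with
> something like "glusterfs -f /path/to/nufa.vol /mnt/glusterfs", the path
> being a placeholder for wherever the volume file lives.)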
>
> Thanks,
> fds
>
>
>  ------------------------------
> *From:* Liam Slusser [mailto:lslusser at gmail.com]
> *Sent:* Thursday, May 07, 2009 1:51 PM
> *To:* Sacerdoti, Federico
> *Cc:* gluster-users at gluster.org
> *Subject:* Re: [Gluster-users] rm -rf errors
>
> You should upgrade to the 2.0.0 release and try again.  They fixed
> all sorts of bugs.
> liam
>
> On Thu, May 7, 2009 at 8:21 AM, Sacerdoti, Federico <
> Federico.Sacerdoti at deshawresearch.com> wrote:
>
>> Hello,
>>
>> I am evaluating glusterfs and have seen some strange behavior with
>> remove. I have gluster/2.0.0rc4 set up on 10 Linux nodes connected with
>> GigE. The config is Nufa/fuse with one storage brick per server, as shown
>> in the attached nufa.vol config file, which I use for both clients and
>> servers.
>>
>> My experiment is to launch 10 parallel writers, each of which writes
>> 32 GiB of data in small (2 MB) files to a shared gluster-fuse
>> mounted filesystem. The files are named uniquely per client, so each
>> file is written only once. This worked well, and I am seeing performance
>> close to that of native disk, even with 8 writers per node.
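>>
>> Roughly, each node runs something like the following (an illustrative
>> sketch, not the actual harness; paths and counts are placeholders):
>>
>>   # 8 writers per node, each filling its own directory with 2 MB files
>>   for p in $(seq 1 8); do
>>     dir=/mnt/glusterfs/writedir/write.2MB.$(hostname -s).p$p
>>     mkdir -p "$dir"
>>     ( for i in $(seq 1 16384); do   # 16384 x 2 MB = 32 GiB per writer
>>         dd if=/dev/zero of="$dir/$i" bs=2M count=1 2>/dev/null
>>       done ) &
>>   done
>>   wait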
>>
>> However, when I do a parallel "rm -rf writedir/" on the 10 nodes, where
>> writedir is the directory written to by the parallel writers described
>> above, I see strange effects. There are 69,000 UNLINK errors in the
>> glusterfsd.log of one server, of the form shown below. This alone is not
>> surprising, since the operation is occurring in parallel. However, the
>> remove took much longer than expected (92 minutes), and more surprisingly
>> the rm command exited 0 but files remained in the writedir!
>>
>> I ran rm -rf writedir from a single client, and it too exited 0 but left
>> the writedir non-empty. Is this expected?
>>
>> Thanks,
>> Federico
>>
>> --From glusterfsd.log--
>> 2009-05-04 11:35:15 E [fuse-bridge.c:964:fuse_unlink_cbk]
>> glusterfs-fuse: 5764889: UNLINK() /write.2MB.runid1.p1/5 => -1 (No such
>> file or directory)
>> 2009-05-04 11:35:15 E [dht-common.c:1294:dht_err_cbk] nufa: subvolume
>> drdan0192 returned -1 (No such file or directory)
>> 2009-05-04 11:35:15 E [fuse-bridge.c:964:fuse_unlink_cbk]
>> glusterfs-fuse: 5764894: UNLINK() /write.2MB.runid1.p1/51 => -1 (No such
>> file or directory)
>> --end--
>>  <<nufa.vol>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users
>>
>>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users
>
>