[Gluster-users] 3.1.2 with "No such file" and "Invalid argument" errors

Steve Wilson stevew at purdue.edu
Tue Feb 8 19:32:46 UTC 2011


On 02/07/2011 11:49 PM, Raghavendra G wrote:
> Hi Steve,
>
> Are the back-end file systems working correctly? I am seeing lots of errors in the server log files from accesses to the back-end filesystem.
>
> gluster-01-brick.log.1:[2011-01-26 03:43:07.353445] E [posix.c:2193:posix_open] post-posix: open on /gluster/01/brick/home/lev/deltah/aadimers/serd/converge/0..75000/serd_phi-psi_hist.4deg.0..75000_map.cmd: Read-only file system
> gluster-01-brick.log.1:[2011-01-26 03:43:07.353857] E [posix.c:678:posix_setattr] post-posix: setattr (utimes) on /gluster/01/brick/home/lev/deltah/aadimers/serd/converge/0..75000/serd_phi-psi_hist.4deg.0..75000_map.cmd failed: Read-only file system
> gluster-01-brick.log.1:[2011-01-26 03:43:07.354827] E [posix.c:2318:posix_readv] post-posix: read failed on fd=0x7f28e50dc1c8: Input/output error
> gluster-01-brick.log.1:[2011-01-26 03:43:07.357396] E [posix.c:2193:posix_open] post-posix: open on /gluster/01/brick/home/lev/deltah/aadimers/serd/converge/0..75000/serd_phi-psi_hist.4deg.0..75000_map.ps: Read-only file system
> gluster-01-brick.log.1:[2011-01-26 03:43:07.357794] E [posix.c:678:posix_setattr] post-posix: setattr (utimes) on /gluster/01/brick/home/lev/deltah/aadimers/serd/converge/0..75000/serd_phi-psi_hist.4deg.0..75000_map.ps failed: Read-only file system
> gluster-01-brick.log.1:[2011-01-26 03:43:07.358865] E [posix.c:2318:posix_readv] post-posix: read failed on fd=0x7f28e50dc1c8: Input/output error
> gluster-01-brick.log.1:[2011-01-26 03:43:07.359264] E [posix.c:2318:posix_readv] post-posix: read failed on fd=0x7f28e50dc1c8: Input/output error
> gluster-01-brick.log.1:[2011-01-26 03:43:07.359548] E [posix.c:2318:posix_readv] post-posix: read failed on fd=0x7f28e50dc1c8: Input/output error
> gluster-01-brick.log.1:[2011-01-26 03:43:07.367163] E [posix.c:2318:posix_readv] post-posix: read failed on fd=0x7f
>
> I am seeing other errors, which indicate that the backend is a read-only filesystem. Because of this, distribute and replicate are not able to store their metadata (using xattrs), which in turn results in lots of split-brain and layout NULL errors. Can you please check the backend file system?
>
> regards,


Yes, the filesystem was read-only for a time when a disk failed.  We 
then rebuilt the brick on that disk from the corresponding brick in the 
second server (with the volume stopped, of course) using:
     rsync -aXv brick/ stanley:/gluster/06/brick/

Following some instructions we found on the mailing list, we then:
     1)  deleted the volume
     2)  ran "find /gluster -exec setfattr -x trusted.gfid \{\} \;" on the bricks
     3)  created the volume again
     4)  mounted the volume
     5)  ran "find . -print0 | xargs --null stat > /dev/null" on the mounted volume
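
The walk in step 5 throws away all output, so a path that still fails is easy to miss. A minimal sketch of a variant that keeps only the error messages is below. The function name walk_and_stat is a hypothetical helper, not something from the mailing-list instructions, and -0 is used as the short form of GNU xargs' --null.

```shell
# Hypothetical helper: walk a mounted volume and stat every entry (as in
# step 5), but print the error messages from find and stat instead of
# discarding everything. Prints nothing when every entry can be stat'ed.
walk_and_stat() {
    mount_point=$1
    {
        # stat's normal output is dropped; only stderr is kept.
        find "$mount_point" -print0 | xargs -0 stat > /dev/null
    } 2>&1
}
```

Running it as "walk_and_stat ." from the top of the mounted volume would list only the entries that cannot be stat'ed, which can then be compared between clients.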

This returned us to what seemed to be a stable state (i.e., no errors 
from running "ls -alR" from the top of the volume).  Then after putting 
the volume back into service, these errors started occurring again.  I 
have noticed that turning off "performance.stat-prefetch" has brought 
about a great improvement.  We continue to see some errors like this on 
one of the servers:

    [2011-02-08 14:22:08.360799] I [dht-common.c:369:dht_revalidate_cbk] post-dht: subvolume post-replicate-1 returned -1 (Invalid argument)
    [2011-02-08 14:22:08.836672] I [dht-common.c:369:dht_revalidate_cbk] post-dht: subvolume post-replicate-4 returned -1 (Invalid argument)
    [2011-02-08 14:22:39.468388] I [dht-common.c:369:dht_revalidate_cbk] post-dht: subvolume post-replicate-0 returned -1 (Invalid argument)
    [2011-02-08 14:22:39.468436] W [fuse-bridge.c:184:fuse_entry_cbk] glusterfs-fuse: 22465136: LOOKUP() /home/lev/.Xauthority => -1 (Invalid argument)
    [2011-02-08 14:22:40.462910] I [dht-common.c:369:dht_revalidate_cbk] post-dht: subvolume post-replicate-5 returned -1 (Invalid argument)
    [2011-02-08 14:22:40.462958] W [fuse-bridge.c:184:fuse_entry_cbk] glusterfs-fuse: 22466110: LOOKUP() /home/lev/.viminfo => -1 (Invalid argument)
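
To see whether the dht_revalidate_cbk messages above cluster on particular replicate subvolumes, they can be tallied from the client log. This is only a sketch: count_invalid_by_subvol is a hypothetical helper name, the log file path is whatever the client actually writes to, and it assumes each entry occupies a single line in the log file.

```shell
# Hypothetical helper: count "returned -1 (Invalid argument)" messages per
# subvolume in a GlusterFS client log, most frequent subvolume first.
# Assumes one log entry per line in the dht_revalidate_cbk format shown above.
count_invalid_by_subvol() {
    log_file=$1
    grep -F 'returned -1 (Invalid argument)' "$log_file" |
        sed -n 's/.*subvolume \([^ ]*\) returned -1.*/\1/p' |
        sort | uniq -c | sort -rn
}
```

Called with the path of the client log as its only argument, it prints one line per subvolume with the number of matching messages in front.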

And the user sees:

    root@stanley:/net/post/lev# ls -al .viminfo .Xauthority
    ls: cannot access .viminfo: Invalid argument
    ls: cannot access .Xauthority: Invalid argument

This happens on only one client (which also happens to be the server
giving the errors above).  Another client (the other server) shows these
same files without any problem:

    root@pablo:/net/post/lev# ls -al .viminfo .Xauthority
    -rw------- 1 lev post 9400 2011-02-07 22:52 .viminfo
    -rw------- 1 lev post 7401 2011-02-08 00:27 .Xauthority


Steve

> ----- Original Message -----
>> From: "Steve Wilson"<stevew at purdue.edu>
>> To: "Lakshmipathi"<lakshmipathi at gluster.com>
>> Cc: "Raghavendra G"<raghavendra at gluster.com>
>> Sent: Thursday, February 3, 2011 7:21:36 PM
>> Subject: Re: [Gluster-users] 3.1.2 with "No such file" and "Invalid argument" errors
>> Hi,
>>
>> Thanks for looking into this. Any ideas so far? Or anything you'd like
>> me to try?
>>
>> Here's some other perhaps relevant information:
>> * all bricks are formatted ext4 and mounted with the noatime option
>> in addition to default options
>> * servers and clients are running Ubuntu 10.04
>> * I did try mounting the GlusterFS volume with direct-io-mode
>> disabled but that didn't fix the problem
>>
>> Thanks!
>>
>> Steve
>>
>> On 02/01/2011 07:35 AM, Lakshmipathi wrote:
>>> Hi,
>>> Could you please send us the client and server log files?
>>>
>>>
>> --
>> Steven M. Wilson, Systems and Network Manager
>> Markey Center for Structural Biology
>> Purdue University
>> (765) 496-1946

-- 
Steven M. Wilson, Systems and Network Manager
Markey Center for Structural Biology
Purdue University
(765) 496-1946


