[Gluster-users] Self-heal Problems with gluster and nfs

Norman Mähler n.maehler at uni-assist.de
Tue Jul 8 10:53:07 UTC 2014


Of course:

The configuration is:

Volume Name: gluster_dateisystem
Type: Replicate
Volume ID: 2766695c-b8aa-46fd-b84d-4793b7ce847a
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: filecluster1:/mnt/raid
Brick2: filecluster2:/mnt/raid
Options Reconfigured:
nfs.enable-ino32: on
performance.cache-size: 512MB
diagnostics.brick-log-level: WARNING
diagnostics.client-log-level: WARNING
nfs.addr-namelookup: off
performance.cache-refresh-timeout: 60
performance.cache-max-file-size: 100MB
performance.write-behind-window-size: 10MB
performance.io-thread-count: 18
performance.stat-prefetch: off


The file counts in xattrop are:

Brick 1: 2706
Brick 2: 2687
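The counts above come from the command Pranith asked for; wrapped as a small shell helper for running on each server (the function name is my own, and /mnt/raid is the brick path from the volume info):

```shell
# Count pending self-heal index entries on a brick.
# Each entry under .glusterfs/indices/xattrop marks a file/gfid
# that still needs (or recently needed) self-heal.
count_xattrop() {
    ls "$1/.glusterfs/indices/xattrop" | wc -l
}

# Example invocation on each server:
# count_xattrop /mnt/raid
```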

Norman

On 08.07.2014 12:28, Pranith Kumar Karampuri wrote:
> It seems like entry self-heal is happening. What is the volume
> configuration? Could you give the output of
> `ls <brick-path>/.glusterfs/indices/xattrop | wc -l` for all the
> bricks?
> 
> Pranith
> 
> On 07/08/2014 03:36 PM, Norman Mähler wrote:
>> Hello Pranith,
>> 
>> here are the logs. I am only giving you the last 3000 lines,
>> because today's nfs.log is already 550 MB.
>> 
>> These are the standard files from a user home on the Gluster
>> system -- everything you normally find in a user home: config
>> files, Firefox and Thunderbird files, etc.
>> 
>> Thanks in advance,
>> Norman
>> 
>> On 08.07.2014 11:46, Pranith Kumar Karampuri wrote:
>>> On 07/08/2014 02:46 PM, Norman Mähler wrote:
>>> Hello again,
>>> 
>>> I could resolve the self-heal problems with the missing gfid
>>> files on one of the servers by deleting the gfid files on the
>>> other server.
>>> 
>>> They had a link count of 1, which means that the file the gfid
>>> pointed to had already been deleted.
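A gfid file under .glusterfs/xx/yy/ is normally a hard link to the real file on the same brick, so a link count of 1 marks it as an orphan. The check described above can be sketched as a small helper (the function name and the pruning of the indices directory are my assumptions, not something from the thread):

```shell
# List gfid files whose link count is 1, i.e. whose real file is gone.
# $1 is the brick root, e.g. /mnt/raid.
find_orphan_gfids() {
    local brick="$1"
    # Skip the indices directory; its entries are internal index
    # files, not gfid links to user files. Directory gfids appear
    # as symlinks and are excluded by -type f.
    find "$brick/.glusterfs" \
         -path "$brick/.glusterfs/indices" -prune -o \
         -type f -links 1 -print
}
```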
>>> 
>>> 
>>> We still have these errors:
>>> 
>>> [2014-07-08 09:09:43.564488] W 
>>> [client-rpc-fops.c:2469:client3_3_link_cbk] 
>>> 0-gluster_dateisystem-client-0: remote operation failed: File
>>> exists (00000000-0000-0000-0000-000000000000 -> 
>>> <gfid:b338b09e-2577-45b3-82bd-032f954dd083>/lock)
>>> 
>>> which appear in glusterfshd.log, and these:
>>> 
>>> [2014-07-08 09:13:31.198462] E
>>> [client-rpc-fops.c:5179:client3_3_inodelk]
>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/cluster/replicate.so(+0x466b8) [0x7f5d29d4e6b8]
>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/cluster/replicate.so(afr_lock_blocking+0x844) [0x7f5d29d4e2e4]
>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/protocol/client.so(client_inodelk+0x99) [0x7f5d29f8b3c9]))) 0-: Assertion failed: 0
>>> 
>>> from the nfs.log.
>>>> Could you attach the mount (nfs.log) and brick logs, please?
>>>> Do you have files with lots of hard links?
>>>> Pranith
>>> I think the error messages belong together but I don't have any
>>> idea how to solve them.
>>> 
>>> We still have a very bad performance issue. The system load
>>> on the servers is above 20, and hardly anyone is able to work
>>> on a client...
>>> 
>>> Hoping for help,
>>> Norman
>>> 
>>> 
>>> On 07.07.2014 15:39, Pranith Kumar Karampuri wrote:
>>>>>> On 07/07/2014 06:58 PM, Norman Mähler wrote:
>>>>>> Dear community,
>>>>>> 
>>>>>> we have got some serious problems with our Gluster
>>>>>> installation.
>>>>>> 
>>>>>> Here is the setting:
>>>>>> 
>>>>>> We have got 2 bricks (version 3.4.4) on Debian 7.5, one
>>>>>> of them with an NFS export. About 120 clients connect to
>>>>>> the exported NFS share. They are thin clients, reading and
>>>>>> writing their Linux home directories from the NFS export.
>>>>>> 
>>>>>> We want to switch these clients one by one to access via
>>>>>> the Gluster client.
>>>>>>> I did not understand what you meant by this. Are you
>>>>>>> moving to glusterfs-fuse based mounts?
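If the plan is glusterfs-fuse based mounts, a client-side mount would look roughly like the following sketch; the mountpoint and the backupvolfile-server option are my assumptions, not details from the thread:

```shell
# FUSE mount of the replicated volume (run on a client, as root).
# filecluster2 serves the volfile if filecluster1 is unreachable.
mount -t glusterfs -o backupvolfile-server=filecluster2 \
    filecluster1:/gluster_dateisystem /home/users

# Or persistently via /etc/fstab:
# filecluster1:/gluster_dateisystem /home/users glusterfs defaults,backupvolfile-server=filecluster2 0 0
```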
>>>>>> 
>>>>>> Here are our problems:
>>>>>> 
>>>>>> At the moment we have two types of error messages,
>>>>>> which come in bursts to our glusterfshd.log:
>>>>>> 
>>>>>> [2014-07-07 13:10:21.572487] W 
>>>>>> [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 
>>>>>> 0-gluster_dateisystem-client-1: remote operation failed:
>>>>>> No such file or directory [2014-07-07 13:10:21.573448] W 
>>>>>> [client-rpc-fops.c:471:client3_3_open_cbk] 
>>>>>> 0-gluster_dateisystem-client-1: remote operation failed:
>>>>>> No such file or directory. Path: 
>>>>>> <gfid:b0c4f78a-249f-4db7-9d5b-0902c7d8f6cc> 
>>>>>> (00000000-0000-0000-0000-000000000000) [2014-07-07
>>>>>> 13:10:21.573468] E
>>>>>> [afr-self-heal-data.c:1270:afr_sh_data_open_cbk] 
>>>>>> 0-gluster_dateisystem-replicate-0: open of 
>>>>>> <gfid:b0c4f78a-249f-4db7-9d5b-0902c7d8f6cc> failed on
>>>>>> child gluster_dateisystem-client-1 (No such file or
>>>>>> directory)
>>>>>> 
>>>>>> 
>>>>>> This looks like a missing gfid file on one of the bricks.
>>>>>> I looked it up, and yes, the file is missing on the second
>>>>>> brick.
>>>>>> 
>>>>>> We got these messages the other way round, too (missing
>>>>>> on client-0 and the first brick).
>>>>>> 
>>>>>> Is it possible to repair this by copying the gfid file to
>>>>>> the brick where it is missing? Or is there another way to
>>>>>> repair it?
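Since the gfid file is normally just a hard link to the real file on the same brick, one possible repair, assuming you can locate the real file on the brick where the gfid entry is missing, is to recreate the link by hand. The helper below is only a sketch (its name and the example paths are hypothetical); letting the self-heal daemon rebuild it via `gluster volume heal gluster_dateisystem full` is the safer route:

```shell
# Recreate a missing gfid hard link on a brick.
# $1 = brick root, $2 = gfid (lowercase, no braces), $3 = real file path.
relink_gfid() {
    local brick="$1" gfid="$2" realfile="$3"
    # gfid files live under .glusterfs/<first 2 hex>/<next 2 hex>/<gfid>
    local dir="$brick/.glusterfs/${gfid:0:2}/${gfid:2:2}"
    mkdir -p "$dir"
    ln "$realfile" "$dir/$gfid"
}

# Hypothetical usage on the brick that lacks the entry:
# relink_gfid /mnt/raid b0c4f78a-249f-4db7-9d5b-0902c7d8f6cc /mnt/raid/path/to/file
```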
>>>>>> 
>>>>>> 
>>>>>> The second message is
>>>>>> 
>>>>>> [2014-07-07 13:06:35.948738] W 
>>>>>> [client-rpc-fops.c:2469:client3_3_link_cbk] 
>>>>>> 0-gluster_dateisystem-client-1: remote operation failed:
>>>>>> File exists (00000000-0000-0000-0000-000000000000 -> 
>>>>>> <gfid:aae47250-8f69-480c-ac75-2da2f4d21d7a>/lock)
>>>>>> 
>>>>>> and I really do not know what to do with this one...
>>>>>>> Did any of the bricks go offline and come back
>>>>>>> online?
>>>>>>> Pranith
>>>>>> I am really hoping for your help, because this is a live
>>>>>> system and the system load on the NFS brick is about
>>>>>> 25 (!!).
>>>>>> 
>>>>>> Thanks in advance!
>>>>>> Norman Maehler
>>>>>> 
>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> Gluster-users mailing list
>>>>>>> Gluster-users at gluster.org
>>>>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
-- 
Kind regards,

Norman Mähler

Head of IT University Services
uni-assist e. V.
Geneststr. 5
Entrance H, 3rd floor
10829 Berlin

Tel.: 030-66644382
n.maehler at uni-assist.de


