[Gluster-users] I/O error for one folder within the mountpoint

Fri Jul 7 09:54:15 UTC 2017

What does the mount log say when you get the EIO error on snooper? Check
whether there is a gfid mismatch on the snooper directory, or on the
files under it, across all 3 bricks. In any case, searching the mount log
or the glustershd.log on the 3 nodes for the gfids you listed below
should give you some idea of why the files aren't being healed.
Thanks.
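
In case it helps, here is a rough sketch of those checks as shell
helpers. It assumes the failing directory sits at services/snooper
relative to the brick root (inferred from the mount path, not
confirmed), that getfattr is installed on the nodes, and that the logs
are in the default /var/log/glusterfs location. The function names are
made up for illustration.

```shell
#!/bin/sh
# Sketch only: helpers for the gfid and log checks described above.
# Assumptions: brick-relative path and log directory are guesses for this
# setup; all function names here are hypothetical.

# Print the trusted.gfid xattr of BRICK/RELPATH as hex (run as root).
# Usage: brick_gfid /mnt/gluster-applicatif/brick services/snooper
brick_gfid() {
    getfattr -n trusted.gfid -e hex "$1/$2" 2>/dev/null |
        sed -n 's/^trusted\.gfid=//p'
}

# Succeed only if every gfid passed as an argument is identical.
gfids_match() {
    first=$1
    for g in "$@"; do
        [ "$g" = "$first" ] || return 1
    done
    return 0
}

# Convert the hex xattr form (0x + 32 hex digits) to the dashed form
# printed by "heal info".
hex_to_gfid() {
    printf '%s\n' "$1" | sed -e 's/^0x//' \
        -e 's/^\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)$/\1-\2-\3-\4-\5/'
}

# Map a dashed gfid to its entry under the brick's .glusterfs directory
# (first two hex characters, next two, then the full gfid).
gfid_backend_path() {
    printf '%s/.glusterfs/%s/%s/%s\n' "$1" \
        "$(printf '%s' "$2" | cut -c1-2)" \
        "$(printf '%s' "$2" | cut -c3-4)" "$2"
}

# Read "heal info" output on stdin and print each bare gfid once.
heal_info_gfids() {
    sed -n 's/^<gfid:\([0-9a-f-]*\)>$/\1/p' | sort -u
}
```

For example, as root on each node:

    brick_gfid /mnt/gluster-applicatif/brick services/snooper

then compare the three values with gfids_match. To look for the pending
gfids in the self-heal daemon log:

    gluster volume heal applicatif info | heal_info_gfids > /tmp/gfids.txt
    grep -F -f /tmp/gfids.txt /var/log/glusterfs/glustershd.log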

On 07/07/2017 03:10 PM, Florian Leleu wrote:
>
> Hi Ravi,
>
> Thanks for your answer. Sure, here you go:
>
> # gluster volume heal applicatif info
> Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
> <gfid:e3b5ef36-a635-4e0e-bd97-d204a1f8e7ed>
> <gfid:f8030467-b7a3-4744-a945-ff0b532e9401>
> <gfid:def47b0b-b77e-4f0e-a402-b83c0f2d354b>
> <gfid:46f76502-b1d5-43af-8c42-3d833e86eb44>
> <gfid:d27a71d2-6d53-413d-b88c-33edea202cc2>
> <gfid:7e7f02b2-3f2d-41ff-9cad-cd3b5a1e506a>
> Status: Connected
> Number of entries: 6
>
> Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
> <gfid:47ddf66f-a5e9-4490-8cd7-88e8b812cdbd>
> <gfid:8057d06e-5323-47ff-8168-d983c4a82475>
> <gfid:5b2ea4e4-ce84-4f07-bd66-5a0e17edb2b0>
> <gfid:baedf8a2-1a3f-4219-86a1-c19f51f08f4e>
> <gfid:8261c22c-e85a-4d0e-b057-196b744f3558>
> <gfid:842b30c1-6016-45bd-9685-6be76911bd98>
> <gfid:1fcaef0f-c97d-41e6-87cd-cd02f197bf38>
> <gfid:9d041c80-b7e4-4012-a097-3db5b09fe471>
> <gfid:ff48a14a-c1d5-45c6-a52a-b3e2402d0316>
> <gfid:01409b23-eff2-4bda-966e-ab6133784001>
> <gfid:c723e484-63fc-4267-b3f0-4090194370a0>
> <gfid:fb1339a8-803f-4e29-b0dc-244e6c4427ed>
> <gfid:056f3bba-6324-4cd8-b08d-bdf0fca44104>
> <gfid:a8f6d7e5-0ff2-4747-89f3-87592597adda>
> <gfid:3f6438a0-2712-4a09-9bff-d5a3027362b4>
> <gfid:392c8e2f-9da4-4af8-a387-bfdfea2f404e>
> <gfid:37e1edfd-9f58-4da3-8abe-819670c70906>
> <gfid:15b7cdb3-aae8-4ca5-b28c-e87a3e599c9b>
> <gfid:1d087e51-fb40-4606-8bb5-58936fb11a4c>
> <gfid:bb0352b9-4a5e-4075-9179-05c3a5766cf4>
> <gfid:40133fcf-a1fb-4d60-b169-e2355b66fb53>
> <gfid:00f75963-1b4a-4d75-9558-36b7d85bd30b>
> <gfid:2c0babdf-c828-475e-b2f5-0f44441fffdc>
> <gfid:bbeff672-43ef-48c9-a3a2-96264aa46152>
> <gfid:6c0969dd-bd30-4ba0-a7e5-ba4b3a972b9f>
> <gfid:4c81ea14-56f4-4b30-8fff-c088fe4b3dff>
> <gfid:1072cda3-53c9-4b95-992d-f102f6f87209>
> <gfid:2e8f9f29-78f9-4402-bc0c-e63af8cf77d6>
> <gfid:eeaa2765-44f4-4891-8502-5787b1310de2>
> Status: Connected
> Number of entries: 29
>
> Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
> <gfid:e3b5ef36-a635-4e0e-bd97-d204a1f8e7ed>
> <gfid:f8030467-b7a3-4744-a945-ff0b532e9401>
> <gfid:def47b0b-b77e-4f0e-a402-b83c0f2d354b>
> <gfid:46f76502-b1d5-43af-8c42-3d833e86eb44>
> <gfid:d27a71d2-6d53-413d-b88c-33edea202cc2>
> <gfid:7e7f02b2-3f2d-41ff-9cad-cd3b5a1e506a>
> Status: Connected
> Number of entries: 6
>
>
> # gluster volume heal applicatif info split-brain
> Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
> Status: Connected
> Number of entries in split-brain: 0
>
> Doesn't it seem odd that the first command gives different output?
>
> On 07/07/2017 at 11:31, Ravishankar N wrote:
>> On 07/07/2017 01:23 PM, Florian Leleu wrote:
>>>
>>> Hello everyone,
>>>
>>> This is my first time on the ML, so excuse me if I'm not following
>>> the rules well; I'll improve if I get comments.
>>>
>>> We have one volume, "applicatif", on three nodes (2 data bricks and
>>> 1 arbiter). Each of the following commands was run on node ipvr8.xxx:
>>>
>>> # gluster volume info applicatif
>>>
>>> Volume Name: applicatif
>>> Type: Replicate
>>> Volume ID: ac222863-9210-4354-9636-2c822b332504
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (2 + 1) = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: ipvr7.xxx:/mnt/gluster-applicatif/brick
>>> Brick2: ipvr8.xxx:/mnt/gluster-applicatif/brick
>>> Brick3: ipvr9.xxx:/mnt/gluster-applicatif/brick (arbiter)
>>> Options Reconfigured:
>>> performance.read-ahead: on
>>> performance.cache-size: 1024MB
>>> performance.quick-read: off
>>> performance.stat-prefetch: on
>>> performance.io-cache: off
>>> transport.address-family: inet
>>> performance.readdir-ahead: on
>>> nfs.disable: off
>>>
>>> # gluster volume status applicatif
>>> Status of volume: applicatif
>>> Gluster process                                TCP Port  RDMA Port  Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick ipvr7.xxx:/mnt/gluster-applicatif/brick  49154     0          Y       2814
>>> Brick ipvr8.xxx:/mnt/gluster-applicatif/brick  49154     0          Y       2672
>>> Brick ipvr9.xxx:/mnt/gluster-applicatif/brick  49154     0          Y       3424
>>> NFS Server on localhost                        2049      0          Y       26530
>>> Self-heal Daemon on localhost                  N/A       N/A        Y       26538
>>> NFS Server on ipvr9.xxx                        2049      0          Y       12238
>>> Self-heal Daemon on ipvr9.xxx                  N/A       N/A        Y       12246
>>> NFS Server on ipvr7.xxx                        2049      0          Y       2234
>>> Self-heal Daemon on ipvr7.xxx                  N/A       N/A        Y       2243
>>>
>>> Task Status of Volume applicatif
>>> ------------------------------------------------------------------------------
>>> There are no active volume tasks
>>>
>>> The volume is mounted with autofs (nfs) in /home/applicatif and one 
>>> folder is "broken":
>>>
>>> l /home/applicatif/services/
>>> ls: cannot access /home/applicatif/services/snooper: Input/output error
>>> total 16
>>> lrwxrwxrwx  1 applicatif applicatif    9 Apr  6 15:53 config -> ../config
>>> lrwxrwxrwx  1 applicatif applicatif    7 Apr  6 15:54 .pwd -> ../.pwd
>>> drwxr-xr-x  3 applicatif applicatif 4096 Apr 12 10:24 querybuilder
>>> d?????????  ? ?          ?             ?            ? snooper
>>> drwxr-xr-x  3 applicatif applicatif 4096 Jul  6 02:57 snooper_new
>>> drwxr-xr-x 16 applicatif applicatif 4096 Jul  6 02:58 snooper_old
>>> drwxr-xr-x  4 applicatif applicatif 4096 Jul  4 23:45 ssnooper
>>>
>>> I checked whether a heal was pending, and it seems so:
>>>
>>> # gluster volume heal applicatif statistics heal-count
>>> Gathering count of entries to be healed on volume applicatif has been successful
>>>
>>> Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
>>> Number of entries: 8
>>>
>>> Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
>>> Number of entries: 29
>>>
>>> Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
>>> Number of entries: 8
>>>
>>> But on each server, the "snooper" folder looks fine when I check it
>>> directly in the brick.
>>>
>>> I tried rebooting the servers and restarting gluster after killing
>>> every process using it, but that didn't help.
>>>
>>> Has anyone experienced this before? Any help would be appreciated.
>>>
>>
>> Can you share the output of `gluster volume heal <volname> info` and 
>> `gluster volume heal <volname> info split-brain`? If the second 
>> command shows entries, please also share the getfattr output from the 
>> bricks for these files (getfattr -d -m . -e hex /brick/path/to/file).
>> -Ravi
>>>
>>> Thanks a lot!
>>>
>>> -- 
>>>
>>> Regards,
>>>
>>> Florian LELEU
>>> Hosting Manager, Cognix Systems
>>>
>>> Rennes | Brest | Saint-Malo | Paris
>>> florian.leleu at cognix-systems.com
>>>
>>> Tel.: 02 99 27 75 92
>>>
>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>
