[Gluster-users] healing never ends (or never starts?) on replicated volume with virtual block device

Roman romeo.r at gmail.com
Thu Nov 6 13:00:33 UTC 2014


[2014-11-06 12:21:01.575385] I
[afr-self-heal-common.c:2868:afr_log_self_heal_completion_status]
0-HA-WIN-TT-1T-replicate-0:  foreground data self heal  is successfully
completed,  data self heal from HA-WIN-TT-1T-client-1  to sinks
 HA-WIN-TT-1T-client-0, with 966367641600 bytes on HA-WIN-TT-1T-client-0,
966367641600 bytes on HA-WIN-TT-1T-client-1,  data - Pending matrix:  [ [ 1
1 ] [ 14192 0 ] ]  on <gfid:a2d3dc2d-6572-4e14-b1bf-4f1027433117>
[2014-11-06 12:21:01.576454] W
[client-rpc-fops.c:2774:client3_3_lookup_cbk] 0-HA-WIN-TT-1T-client-1:
remote operation failed: No such file or directory. Path:
<gfid:b8026077-696e-45a8-b5a5-9c55e3813d38>
(b8026077-696e-45a8-b5a5-9c55e3813d38)
[2014-11-06 12:21:01.576517] W
[client-rpc-fops.c:2774:client3_3_lookup_cbk] 0-HA-WIN-TT-1T-client-0:
remote operation failed: No such file or directory. Path:
<gfid:b8026077-696e-45a8-b5a5-9c55e3813d38>
(b8026077-696e-45a8-b5a5-9c55e3813d38)
[2014-11-06 12:21:01.577284] W
[client-rpc-fops.c:2774:client3_3_lookup_cbk] 0-HA-WIN-TT-1T-client-1:
remote operation failed: No such file or directory. Path:
<gfid:2f1ad6a8-11aa-483c-a561-75ce45e5f245>
(2f1ad6a8-11aa-483c-a561-75ce45e5f245)
[2014-11-06 12:21:01.577349] W
[client-rpc-fops.c:2774:client3_3_lookup_cbk] 0-HA-WIN-TT-1T-client-0:
remote operation failed: No such file or directory. Path:
<gfid:2f1ad6a8-11aa-483c-a561-75ce45e5f245>
(2f1ad6a8-11aa-483c-a561-75ce45e5f245)
[2014-11-06 12:21:01.578105] W
[client-rpc-fops.c:2774:client3_3_lookup_cbk] 0-HA-WIN-TT-1T-client-1:
remote operation failed: No such file or directory. Path:
<gfid:66b3e4de-aea7-4fa2-9c0d-93fc1846dc35>
(66b3e4de-aea7-4fa2-9c0d-93fc1846dc35)
[2014-11-06 12:21:01.578170] W
[client-rpc-fops.c:2774:client3_3_lookup_cbk] 0-HA-WIN-TT-1T-client-0:
remote operation failed: No such file or directory. Path:
<gfid:66b3e4de-aea7-4fa2-9c0d-93fc1846dc35>
(66b3e4de-aea7-4fa2-9c0d-93fc1846dc35)
[2014-11-06 12:21:01.902786] I
[afr-self-heal-common.c:2868:afr_log_self_heal_completion_status]
0-HA-WIN-TT-1T-replicate-0:  metadata self heal  is successfully completed,
foreground data self heal  is successfully completed,  data self heal from
HA-WIN-TT-1T-client-1  to sinks  HA-WIN-TT-1T-client-0, with 0 bytes on
HA-WIN-TT-1T-client-0, 4 bytes on HA-WIN-TT-1T-client-1,  data - Pending
matrix:  [ [ 0 0 ] [ 3 0 ] ]  metadata self heal from source
HA-WIN-TT-1T-client-1 to HA-WIN-TT-1T-client-0,  metadata - Pending matrix:
 [ [ 0 0 ] [ 3 0 ] ], on <gfid:c7b624e4-1c66-4de7-aa89-95460ff098aa>
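
For anyone puzzling over the "Pending matrix" in these messages, here is a rough sketch of how it could be read. It rests on an assumption recalled from the 3.x AFR self-heal code, not confirmed anywhere in this thread: row i holds the counters stored on brick i, and column j counts operations that brick i considers still pending on brick j. A brick that accuses itself is treated as a "fool", one that accuses only others as "wise", and one with no accusations as "innocent". The function names below are hypothetical helpers, not part of any Gluster tool.

```python
import re


def parse_pending_matrix(log_text):
    """Return the first 'Pending matrix' found in log_text as a list of integer rows.

    Whitespace is matched loosely because mail clients wrap the log lines.
    """
    match = re.search(r"Pending\s+matrix:\s*\[((?:\s*\[[\d\s]*\])+)\s*\]", log_text)
    if match is None:
        raise ValueError("no pending matrix found")
    rows = re.findall(r"\[([\d\s]*)\]", match.group(1))
    return [[int(n) for n in row.split()] for row in rows]


def classify_bricks(matrix):
    """Label each brick (row) as 'fool', 'wise' or 'innocent'.

    ASSUMPTION: this mirrors the terminology of the old AFR self-heal code
    as I remember it; treat it as an illustration, not a specification.
    """
    labels = []
    for i, row in enumerate(matrix):
        if row[i] != 0:
            labels.append("fool")      # has pending operations against itself
        elif any(count != 0 for j, count in enumerate(row) if j != i):
            labels.append("wise")      # accuses only other bricks
        else:
            labels.append("innocent")  # accuses nobody
    return labels


line = ("data - Pending matrix:  [ [ 1\n1 ] [ 14192 0 ] ]  "
        "on <gfid:a2d3dc2d-6572-4e14-b1bf-4f1027433117>")
matrix = parse_pending_matrix(line)
print(matrix, classify_bricks(matrix))
```

Under that reading, the second row (client-1) is the only "wise" brick in both matrices above, which is consistent with the log naming HA-WIN-TT-1T-client-1 as the heal source and HA-WIN-TT-1T-client-0 as the sink.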

Ehm, I just found this in the log. What could it mean? Is there any way to
translate a gfid back to a file name?
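
One possible way to do the gfid-to-file lookup, sketched under the assumption that you can run it directly on a brick (for example /exports/NFS-WIN/1T on stor1): each brick keeps an entry for every gfid at .glusterfs/<first two hex digits>/<next two>/<full gfid>, and for regular files that entry is a hard link to the real file. The helper names are made up for illustration. Walking a large brick this way can be slow.

```python
import os


def gfid_backend_path(brick_root, gfid):
    """Build the path of the .glusterfs entry for a gfid on a brick."""
    gfid = gfid.lower()
    return os.path.join(brick_root, ".glusterfs", gfid[:2], gfid[2:4], gfid)


def paths_for_gfid(brick_root, gfid):
    """Return brick paths that share an inode with the gfid entry.

    Works for regular files only, where the .glusterfs entry is a hard link.
    """
    target = os.stat(gfid_backend_path(brick_root, gfid))
    found = []
    for dirpath, dirnames, filenames in os.walk(brick_root):
        if ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")  # do not descend into the gfid store
        for name in filenames:
            candidate = os.path.join(dirpath, name)
            st = os.lstat(candidate)
            if (st.st_ino, st.st_dev) == (target.st_ino, target.st_dev):
                found.append(candidate)
    return found
```

For a directory the .glusterfs entry is a symlink instead of a hard link, so reading the link target is the quicker route there. If the entry does not exist on the brick at all, which the "No such file or directory" lookup warnings above may indicate for some gfids, `os.stat` raises `FileNotFoundError`.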

2014-11-06 14:40 GMT+02:00 Roman <romeo.r at gmail.com>:

> Oh, never mind. It is synced now. It took a LOT of time :)
>
>
> 2014-11-06 13:12 GMT+02:00 Roman <romeo.r at gmail.com>:
>
>> Hi,
>>
>> another stupid/interesting situation:
>>
>> root@stor1:~# gluster volume heal HA-WIN-TT-1T info
>> Brick stor1:/exports/NFS-WIN/1T/
>> /disk - Possibly undergoing heal
>> Number of entries: 1
>>
>> Brick stor2:/exports/NFS-WIN/1T/
>> /test
>> /disk - Possibly undergoing heal
>> Number of entries: 2
>>
>> For testing, I brought down the stor1 port on the switch and then brought
>> it up again.
>> One of the volumes (the one with virtual machines) was then restored and
>> healed successfully,
>> while the other still reports a heal in progress (for about 2 hours now),
>> even though there is no traffic between the servers or between client and
>> servers.
>>
>> /test is a simple new file that I created while stor1 was down.
>> /disk is a simple virtual block device made from /dev/null; it is
>> 900GB and is mounted on a Windows server via iscsitarget :). It seems the
>> heal will never finish, as if it can't decide which copy is right?
>>
>> Logs from the gluster client machine, where the volume for the iSCSI
>> target is mounted:
>> [2014-11-06 08:19:36.949092] W
>> [client-rpc-fops.c:1812:client3_3_fxattrop_cbk] 0-HA-WIN-TT-1T-client-0:
>> remote operation failed: Transport endpoint is not connected
>> [2014-11-06 08:19:36.949148] W
>> [client-rpc-fops.c:1812:client3_3_fxattrop_cbk] 0-HA-WIN-TT-1T-client-0:
>> remote operation failed: Transport endpoint is not connected
>> [2014-11-06 08:19:36.951202] W
>> [client-rpc-fops.c:1580:client3_3_finodelk_cbk] 0-HA-WIN-TT-1T-client-0:
>> remote operation failed: Transport endpoint is not connected
>> [2014-11-06 08:19:57.682937] W [socket.c:522:__socket_rwv] 0-glusterfs:
>> readv on 10.250.0.1:24007 failed (Connection timed out)
>> [2014-11-06 08:20:17.950981] E [socket.c:2161:socket_connect_finish]
>> 0-glusterfs: connection to 10.250.0.1:24007 failed (No route to host)
>> [2014-11-06 08:20:40.062928] E [socket.c:2161:socket_connect_finish]
>> 0-HA-WIN-TT-1T-client-0: connection to 10.250.0.1:24007 failed
>> (Connection timed out)
>> [2014-11-06 08:30:15.638197] W [dht-diskusage.c:232:dht_is_subvol_filled]
>> 0-HA-WIN-TT-1T-dht: disk space on subvolume 'HA-WIN-TT-1T-replicate-0' is
>> getting full (95.00 %), consider adding more nodes
>> [2014-11-06 08:36:18.385659] I [glusterfsd-mgmt.c:1307:mgmt_getspec_cbk]
>> 0-glusterfs: No change in volfile, continuing
>> [2014-11-06 08:36:18.386573] I [rpc-clnt.c:1729:rpc_clnt_reconfig]
>> 0-HA-WIN-TT-1T-client-0: changing port to 49160 (from 0)
>> [2014-11-06 08:36:18.387182] I
>> [client-handshake.c:1677:select_server_supported_programs]
>> 0-HA-WIN-TT-1T-client-0: Using Program GlusterFS 3.3, Num (1298437),
>> Version (330)
>> [2014-11-06 08:36:18.387414] I
>> [client-handshake.c:1462:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-0:
>> Connected to 10.250.0.1:49160, attached to remote volume
>> '/exports/NFS-WIN/1T'.
>> [2014-11-06 08:36:18.387433] I
>> [client-handshake.c:1474:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-0:
>> Server and Client lk-version numbers are not same, reopening the fds
>> [2014-11-06 08:36:18.387446] I
>> [client-handshake.c:1314:client_post_handshake] 0-HA-WIN-TT-1T-client-0: 1
>> fds open - Delaying child_up until they are re-opened
>> [2014-11-06 08:36:18.387730] I
>> [client-handshake.c:936:client_child_up_reopen_done]
>> 0-HA-WIN-TT-1T-client-0: last fd open'd/lock-self-heal'd - notifying
>> CHILD-UP
>> [2014-11-06 08:36:18.387862] I
>> [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-WIN-TT-1T-client-0:
>> Server lk version = 1
>>
>> brick log on stor1:
>>
>> [2014-11-06 08:38:04.269503] I
>> [client-handshake.c:1677:select_server_supported_programs]
>> 0-HA-WIN-TT-1T-client-1: Using Program GlusterFS 3.3, Num (1298437),
>> Version (330)
>> [2014-11-06 08:38:04.269908] I
>> [client-handshake.c:1462:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-1:
>> Connected to 10.250.0.2:49160, attached to remote volume
>> '/exports/NFS-WIN/1T'.
>> [2014-11-06 08:38:04.269962] I
>> [client-handshake.c:1474:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-1:
>> Server and Client lk-version numbers are not same, reopening the fds
>> [2014-11-06 08:38:04.270560] I
>> [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-WIN-TT-1T-client-1:
>> Server lk version = 1
>> [2014-11-06 08:39:33.277219] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 08:49:33.327786] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 08:59:33.375835] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 09:09:33.430726] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 09:19:33.486488] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 09:29:33.541596] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 09:39:33.595242] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 09:49:33.648526] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 09:59:33.702368] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 10:09:33.756633] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 10:19:33.810984] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 10:29:33.865172] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 10:39:33.918765] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 10:49:33.973283] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>> [2014-11-06 10:59:34.028836] I
>> [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0:
>> Another crawl is in progress for HA-WIN-TT-1T-client-0
>>
>> same on stor2
>>
>> --
>> Best regards,
>> Roman.
>>
>
>
>
> --
> Best regards,
> Roman.
>



-- 
Best regards,
Roman.