[Gluster-users] Self heal doesn't seem to work when file is updated

Mohit Anchlia mohitanchlia at gmail.com
Tue Mar 8 18:10:28 UTC 2011


These are the logs I see:

client mount (FUSE) logs

[2011-03-08 10:02:29.1920] I [afr-common.c:716:afr_lookup_done]
test-volume-replicate-0: background  entry self-heal triggered. path:
/
[2011-03-08 10:02:29.6111] I
[afr-self-heal-common.c:1526:afr_self_heal_completion_cbk]
test-volume-replicate-0: background  entry self-heal completed on /
[2011-03-08 10:02:37.374866] I [afr-common.c:716:afr_lookup_done]
test-volume-replicate-0: background  meta-data data self-heal
triggered. path: /a.tx
[2011-03-08 10:02:37.376012] I
[afr-self-heal-common.c:1526:afr_self_heal_completion_cbk]
test-volume-replicate-0: background  meta-data data self-heal
completed on /a.tx
[2011-03-08 10:02:45.554547] I [afr-common.c:716:afr_lookup_done]
test-volume-replicate-0: background  meta-data data self-heal
triggered. path: /a.tx
[2011-03-08 10:02:45.555553] I
[afr-self-heal-common.c:1526:afr_self_heal_completion_cbk]
test-volume-replicate-0: background  meta-data data self-heal
completed on /a.tx
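
The client log above says the background meta-data/data self-heal on
/a.tx completed, so one way to see whether the copies really converged
is to compare the AFR changelog xattrs on the backend files themselves.
A minimal sketch, assuming the bricks are exported from /data/brick
(substitute your actual export paths and run it on both servers):

# dump the trusted.* xattrs AFR keeps on the backend copy of the file
getfattr -d -m . -e hex /data/brick/a.tx

# compare size and mtime of the two backend copies as well
stat /data/brick/a.tx

If the trusted.afr.test-volume-client-* values are all zeros on both
bricks, AFR believes the replicas are clean; a non-zero value means
writes are still recorded as pending against the other brick and a heal
should be re-triggered on the next lookup. (The exact xattr names depend
on the client translator names in your generated volfile.)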

glusterd log

[2011-03-08 10:01:54.316161] W [dict.c:1205:data_to_str] dict: @data=(nil)
[2011-03-08 10:01:54.316179] W [dict.c:1205:data_to_str] dict: @data=(nil)
[2011-03-08 10:01:54.317337] W [dict.c:1205:data_to_str] dict: @data=(nil)
[2011-03-08 10:01:54.317417] W [dict.c:1205:data_to_str] dict: @data=(nil)
[2011-03-08 10:01:54.318279] I
[glusterd3_1-mops.c:1233:glusterd3_1_stage_op] glusterd: Sent op req
to 0 peers
[2011-03-08 10:01:54.323578] I
[glusterd-utils.c:858:glusterd_service_stop] : Stopping gluster nfsd
running in pid: 11310
[2011-03-08 10:01:55.342835] I
[glusterd3_1-mops.c:1323:glusterd3_1_commit_op] glusterd: Sent op req
to 0 peers
[2011-03-08 10:01:55.342964] I
[glusterd3_1-mops.c:1145:glusterd3_1_cluster_unlock] glusterd: Sent
unlock req to 0 peers
[2011-03-08 10:01:55.342985] I
[glusterd-op-sm.c:4845:glusterd_op_txn_complete] glusterd: Cleared
local lock
[2011-03-08 10:02:04.886217] I
[glusterd-handler.c:715:glusterd_handle_cli_get_volume] glusterd:
Received get vol req
[2011-03-08 10:02:18.828788] I
[glusterd3_1-mops.c:172:glusterd3_1_friend_add_cbk] glusterd: Received
RJT from uuid: 34f59271-21b3-4533-9a89-2fd06523c729, host:
testefitarc01, port: 0
[2011-03-08 10:02:18.828818] I
[glusterd-utils.c:2066:glusterd_friend_find_by_uuid] glusterd: Friend
found.. state: Peer in Cluster
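
One thing that stands out in the glusterd log is the "Received RJT"
(reject) from the other peer's friend-add callback. It may be unrelated
to the heal itself, but it is worth confirming that both nodes agree on
the peer and volume state. A quick check, run on both servers:

# both nodes should see each other as 'Peer in Cluster (Connected)'
gluster peer status

# the volume definition (brick list, options) should match on both nodes
gluster volume info test-volume

If the peer shows up as rejected on either side, the volume
configurations on the two nodes have probably drifted, which can also
keep replication from behaving as expected.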


brick logs

[2011-03-08 10:02:29.3068] D [posix.c:265:posix_lstat_with_gfid]
test-volume-posix: failed to get gfid
[2011-03-08 10:02:29.4383] D [io-threads.c:2072:__iot_workers_scale]
test-volume-io-threads: scaled threads to 2 (queue_size=4/2)
[2011-03-08 10:02:29.4641] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0x78)
[0x2aaaab5e9f48]))) dict: @this=(nil) key=glusterfs.entrylk-count
[2011-03-08 10:02:29.4663] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0x78)
[0x2aaaab5e9f48]))) dict: @this=(nil) key=glusterfs.entrylk-count
[2011-03-08 10:02:29.4698] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0x92)
[0x2aaaab5e9f62]))) dict: @this=(nil) key=glusterfs.inodelk-count
[2011-03-08 10:02:29.4723] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0x92)
[0x2aaaab5e9f62]))) dict: @this=(nil) key=glusterfs.inodelk-count
[2011-03-08 10:02:29.4746] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0xad)
[0x2aaaab5e9f7d]))) dict: @this=(nil) key=glusterfs.posixlk-count
[2011-03-08 10:02:29.4770] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0xad)
[0x2aaaab5e9f7d]))) dict: @this=(nil) key=glusterfs.posixlk-count
[2011-03-08 10:02:29.4838] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0x78)
[0x2aaaab5e9f48]))) dict: @this=(nil) key=glusterfs.entrylk-count
[2011-03-08 10:02:29.4884] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0x92)
[0x2aaaab5e9f62]))) dict: @this=(nil) key=glusterfs.inodelk-count
[2011-03-08 10:02:29.4915] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0x78)
[0x2aaaab5e9f48]))) dict: @this=(nil) key=glusterfs.entrylk-count
[2011-03-08 10:02:29.4938] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0xad)
[0x2aaaab5e9f7d]))) dict: @this=(nil) key=glusterfs.posixlk-count
[2011-03-08 10:02:29.4992] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0x92)
[0x2aaaab5e9f62]))) dict: @this=(nil) key=glusterfs.inodelk-count
[2011-03-08 10:02:29.5049] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0x78)
[0x2aaaab5e9f48]))) dict: @this=(nil) key=glusterfs.entrylk-count
[2011-03-08 10:02:29.5079] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0xad)
[0x2aaaab5e9f7d]))) dict: @this=(nil) key=glusterfs.posixlk-count
[2011-03-08 10:02:29.5103] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0x92)
[0x2aaaab5e9f62]))) dict: @this=(nil) key=glusterfs.inodelk-count
[2011-03-08 10:02:29.5149] D [dict.c:331:dict_get]
(-->/usr/lib64/libglusterfs.so.0(call_resume+0xc5e) [0x2b46d2e1008e]
(-->/usr/lib64/glusterfs/3.1.2/xlator/performance/io-threads.so(iot_lookup_wrapper+0xda)
[0x2aaaab80069a]
(-->/usr/lib64/glusterfs/3.1.2/xlator/features/locks.so(pl_lookup+0xad)
[0x2aaaab5e9f7d]))) dict: @this=(nil) key=glusterfs.posixlk-count
[2011-03-08 10:02:29.6752] D [posix.c:265:posix_lstat_with_gfid]
test-volume-posix: failed to get gfid
[2011-03-08 10:02:33.854018] D [posix.c:265:posix_lstat_with_gfid]
test-volume-posix: failed to get gfid
[2011-03-08 10:02:37.375341] D [inodelk.c:293:__inode_unlock_lock]
test-volume-locks:  Matching lock found for unlock
[2011-03-08 10:02:37.375722] D [inodelk.c:288:__inode_unlock_lock]
test-volume-locks:  Matching lock not found for unlock
[2011-03-08 10:02:37.375739] D [inodelk.c:490:pl_inode_setlk]
test-volume-locks: Bad Unlock issued on Inode lock
[2011-03-08 10:02:39.555699] D [posix.c:265:posix_lstat_with_gfid]
test-volume-posix: failed to get gfid
[2011-03-08 10:02:45.554927] D [inodelk.c:293:__inode_unlock_lock]
test-volume-locks:  Matching lock found for unlock
[2011-03-08 10:02:45.555351] D [inodelk.c:288:__inode_unlock_lock]
test-volume-locks:  Matching lock not found for unlock
[2011-03-08 10:02:45.555368] D [inodelk.c:490:pl_inode_setlk]
test-volume-locks: Bad Unlock issued on Inode lock
[2011-03-08 10:02:45.556476] D [inodelk.c:293:__inode_unlock_lock]
test-volume-locks:  Matching lock found for unlock
[2011-03-08 10:02:45.557164] D [inodelk.c:293:__inode_unlock_lock]
test-volume-locks:  Matching lock found for unlock
[2011-03-08 10:02:45.557403] D [inodelk.c:293:__inode_unlock_lock]
test-volume-locks:  Matching lock found for unlock
[2011-03-08 10:02:46.966643] D [posix.c:265:posix_lstat_with_gfid]
test-volume-posix: failed to get gfid
[2011-03-08 10:06:22.81683] D [io-threads.c:107:iot_worker]
test-volume-io-threads: timeout, terminated. conf->curr_count=1
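
On the repeated "failed to get gfid" lines: posix_lstat_with_gfid reads
the trusted.gfid xattr off the backend file, so that message usually
just means the xattr is not set (or not readable) on that brick. A quick
way to check, again assuming /data/brick as the export path:

# the gfid should exist and be identical on both bricks' copies of the file
getfattr -n trusted.gfid -e hex /data/brick/a.tx

If one brick has no trusted.gfid, or the two bricks disagree, that would
go some way towards explaining why every lookup of /a.tx keeps
re-triggering a heal.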


On Tue, Mar 8, 2011 at 9:54 AM, Joe Landman
<landman at scalableinformatics.com> wrote:
> On 03/08/2011 12:51 PM, Mohit Anchlia wrote:
>>
>> They are all synced from the same NTP server, so they are exactly the
>> same. I don't really think it has anything to do with it, since stat,
>> touch, and ls -alR are all quite different commands. Also, this seems
>> to be a problem specifically when updating existing files while a
>> server is down.
>>
>
> We've run into problems with a number of things when timing drifted by more
> than a few seconds.
>
> Turn up debugging on the server.  Then try your test.  See if it still
> fails.
>
> --
> Joseph Landman, Ph.D
> Founder and CEO
> Scalable Informatics Inc.
> email: landman at scalableinformatics.com
> web  : http://scalableinformatics.com
>       http://scalableinformatics.com/sicluster
> phone: +1 734 786 8423 x121
> fax  : +1 866 888 3112
> cell : +1 734 612 4615
>
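
For completeness: the brick log above is already at DEBUG. Raising the
client side as well, and double-checking the clocks Joe mentions, would
look roughly like this (server1 and /mnt/test-volume are placeholders,
and the diagnostics option name is an assumption that may not exist in
3.1.2):

# the offset column is in milliseconds; it should be near zero on every node
ntpq -p

# remount the client with debug logging
umount /mnt/test-volume
mount -t glusterfs -o log-level=DEBUG server1:/test-volume /mnt/test-volume

# or, if the option exists in this release, set it per volume
gluster volume set test-volume diagnostics.client-log-level DEBUG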


