[Gluster-devel] question on self-heal
Emmanuel Dreyfus
manu at netbsd.org
Mon Jul 30 12:33:35 UTC 2012
Hi
A question on self-heal: as I understand it, when a lookup occurs, the client
checks whether self-heal must be done, heals if required, then proceeds with
the lookup.
I encounter a rare situation where self-heal is done but I still get the
non-healed result. For instance, I read a file and get no data, as if it
were empty, then read it again and get the correct file content.
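The symptom can be checked from a script. The following is only a minimal
sketch: read_twice, looks_unhealed and the delay parameter are hypothetical
names chosen for illustration, and the one-second pause is an arbitrary
assumption, not a value taken from glusterfs.

```python
import time


def read_twice(path, delay=1.0):
    """Read `path` twice and return both results as bytes.

    `delay` (hypothetical, in seconds) leaves some time between the two
    reads so that a background self-heal has a chance to finish.
    """
    with open(path, "rb") as f:
        first = f.read()
    time.sleep(delay)
    with open(path, "rb") as f:
        second = f.read()
    return first, second


def looks_unhealed(first, second):
    """True when the first read was empty but the second returned data."""
    return len(first) == 0 and len(second) > 0
```

Calling read_twice() on a file inside the mounted volume and passing both
results to looks_unhealed() would flag the case described above. A file that
really is empty on every replica returns False.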
Here is an example. I am building on a release-3.3 glusterfs volume,
and the build fails because of an empty Makefile. The client log
shows that this is a replication problem:
includes ===> external/intel-fw-eula/ipw2100
nbmake: don't know how to make includes. Stop
client log:
[2012-07-30 10:09:54.756766] E
[afr-self-heal-common.c:1087:afr_sh_common_lookup_resp_handler]
0-pfs-replicate-0: path /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100
on subvolume pfs-client-1 => -1 (No such file or directory)
[2012-07-30 10:09:55.056577] I [afr-common.c:1340:afr_launch_self_heal]
0-pfs-replicate-0: entry self-heal triggered.
path: /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100,
reason: checksums of directory differ
[2012-07-30 10:09:55.062865] E
[afr-self-heal-common.c:1087:afr_sh_common_lookup_resp_handler]
0-pfs-replicate-0: path
/manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/CVS on
subvolume pfs-client-1 => -1 (No such file or directory)
[2012-07-30 10:09:55.063069] E
[afr-self-heal-common.c:1087:afr_sh_common_lookup_resp_handler]
0-pfs-replicate-0: path
/manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/Makefile on
subvolume pfs-client-1 => -1 (No such file or directory)
[2012-07-30 10:09:55.063268] E
[afr-self-heal-common.c:1087:afr_sh_common_lookup_resp_handler]
0-pfs-replicate-0: path
/manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/dist on
subvolume pfs-client-1 => -1 (No such file or directory)
[2012-07-30 10:09:55.480500] I
[afr-self-heal-common.c:2159:afr_self_heal_completion_cbk]
0-pfs-replicate-0: background entry self-heal completed on
/manu/netbsd/usr/src/external/intel-fw-eula/ipw2100
If I then run ls -l, the file is finally healed:
$ ls -l external/intel-fw-eula/ipw2100/Makefile
-rw-r--r-- 1 manu manu 224 Oct 30 2008 external/intel-fw-eula/ipw2100/Makefile
client log:
[2012-07-30 14:30:05.058560] I [afr-common.c:1340:afr_launch_self_heal]
0-pfs-replicate-0: background meta-data self-heal triggered. path:
/manu/netbsd/usr/src/external/intel-fw-eula/ipw2100, reason: lookup
detected pending operations
[2012-07-30 14:30:05.086289] I
[afr-self-heal-common.c:2159:afr_self_heal_completion_cbk]
0-pfs-replicate-0: background meta-data self-heal completed on
/manu/netbsd/usr/src/external/intel-fw-eula/ipw2100
[2012-07-30 14:30:05.527602] I
[afr-common.c:1189:afr_detect_self_heal_by_iatt] 0-pfs-replicate-0:
size differs for
/manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/Makefile
[2012-07-30 14:30:05.527655] I [afr-common.c:1340:afr_launch_self_heal]
0-pfs-replicate-0: background meta-data data self-heal triggered.
path: /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/Makefile,
reason: lookup detected pending operations
[2012-07-30 14:30:05.580709] I
[afr-self-heal-algorithm.c:116:sh_loop_driver_done] 0-pfs-replicate-0:
full self-heal completed on
/manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/Makefile
[2012-07-30 14:30:05.615283] I
[afr-self-heal-common.c:2159:afr_self_heal_completion_cbk]
0-pfs-replicate-0: background meta-data data self-heal completed
on /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/Makefile
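To see how long each heal took after it was triggered, the "triggered" and
"completed" entries can be paired up. This is a rough sketch that assumes the
log format is exactly what is pasted above (entries from afr_launch_self_heal
and afr_self_heal_completion_cbk) and that paths contain no spaces or commas.
heal_durations and EVENT_RE are hypothetical names, not part of glusterfs.

```python
import re
from datetime import datetime

# Matches the "self-heal triggered" / "self-heal completed" entries only.
# The optional "background " prefix is dropped so both entries share a kind.
EVENT_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] \w "
    r"\[[^\]]*:(?:afr_launch_self_heal|afr_self_heal_completion_cbk)\] "
    r"\S+: (?:background )?(?P<kind>[a-z\- ]+?) self-heal "
    r"(?P<event>triggered|completed)(?:\. path:| on) (?P<path>[^\s,]+)"
)


def heal_durations(log_text):
    """Return a list of (kind, path, seconds) for each completed heal."""
    # Collapse whitespace so entries wrapped over several lines still match.
    text = " ".join(log_text.split())
    pending = {}
    durations = []
    for m in EVENT_RE.finditer(text):
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        key = (m.group("kind"), m.group("path"))
        if m.group("event") == "triggered":
            pending[key] = ts
        elif key in pending:
            start = pending.pop(key)
            durations.append((key[0], key[1], (ts - start).total_seconds()))
    return durations
```

Feeding the client log text to heal_durations() gives one tuple per heal that
has both a trigger and a completion entry. Completions without a matching
trigger are ignored.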
This is a bug, right?
--
Emmanuel Dreyfus
manu at netbsd.org