[Gluster-devel] cache problem?

Emmanuel Dreyfus manu at netbsd.org
Wed Jul 13 17:58:09 UTC 2011


Hi

Another weird bug. If I download a file through FTP to my glusterfs
volume, the file sometimes ends up out of sync between the client and
the backend. I get this:

client# md5 gnusrc.tgz 
MD5 (gnusrc.tgz) = 471a73c374ec2b5733571c01647a69d5

server# md5 /export/wd3a/tmp/gnusrc.tgz
MD5 (/export/wd3a/tmp/gnusrc.tgz) = cf03446a7f31713002ef3b74020b173f

If the client duplicates the file, it gets even weirder:

client# cp gnusrc.tgz gnusrc-cp.tgz
client# md5 gnusrc.tgz gnusrc-cp.tgz
MD5 (gnusrc-cp.tgz) = 471a73c374ec2b5733571c01647a69d5
MD5 (gnusrc.tgz) = 471a73c374ec2b5733571c01647a69d5

server# md5 /export/*/tmp/*.tgz     
MD5 (/export/wd3a/tmp/gnusrc-cp.tgz) = 471a73c374ec2b5733571c01647a69d5
MD5 (/export/wd3a/tmp/gnusrc.tgz) = cf03446a7f31713002ef3b74020b173f

At least now I can compare the two files on the server:
server# mkdir -p /root/parts/gnusrc /root/parts/gnusrc-cp
server# cd /root/parts/gnusrc && split -b 1m /export/wd3a/tmp/gnusrc.tgz
server# cd /root/parts/gnusrc-cp && split -b 1m /export/wd3a/tmp/gnusrc-cp.tgz
server# diff -r /root/parts/gnusrc /root/parts/gnusrc-cp
Binary files /root/parts/gnusrc/xco and /root/parts/gnusrc-cp/xco differ
Binary files /root/parts/gnusrc/xcp and /root/parts/gnusrc-cp/xcp differ
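
The split/diff step above can also be done in one pass. The following is
only a minimal sketch in Python, not something that was run in this
session; the function name differing_ranges and the 64 KiB block size are
illustrative choices. It reports the byte ranges where two files differ,
as absolute offsets in the file rather than per-chunk offsets.

```python
def differing_ranges(path_a, path_b, block_size=1 << 16):
    """Return a list of (start, end) byte ranges, end exclusive,
    where the two files differ. A length mismatch counts as a difference."""
    ranges = []
    run_start = None  # start offset of the differing run in progress
    offset = 0        # absolute offset of the current block
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(block_size)
            b = fb.read(block_size)
            if not a and not b:
                break
            if a == b:
                # Identical block: close any run that was in progress.
                if run_start is not None:
                    ranges.append((run_start, offset))
                    run_start = None
            else:
                # Block differs somewhere: walk it byte by byte.
                for i in range(max(len(a), len(b))):
                    same = i < len(a) and i < len(b) and a[i] == b[i]
                    if same:
                        if run_start is not None:
                            ranges.append((run_start, offset + i))
                            run_start = None
                    elif run_start is None:
                        run_start = offset + i
            offset += max(len(a), len(b))
    if run_start is not None:
        ranges.append((run_start, offset))
    return ranges

# Example call with the paths from this report:
# differing_ranges("/export/wd3a/tmp/gnusrc.tgz",
#                  "/export/wd3a/tmp/gnusrc-cp.tgz")
```

Note that a zeroed region can show up as several adjacent ranges, because
any byte that was already zero in the good copy compares as equal.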

Note that there are 80 chunks. Using hexdump -C, the picture becomes
clearer. In xco, the data was replaced by zeros from offset 0x00045750
to offset 0x000fffff. In xcp, the data was replaced by zeros from offset
0x00000000 to offset 0x00032fff. The two chunks are neighbours, which
means a single hunk of data was zeroed. Unfortunately, neither its
size nor its alignment looks like a page boundary.
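
To make the size and alignment concrete, the per-chunk offsets can be
converted to absolute file offsets. This is a sketch under a few
assumptions: split(1) used its default two-letter suffixes (xaa, xab, ...),
the hexdump offsets quoted above are exact byte positions rather than
16-byte row addresses, and the page size is 4 KiB. None of these is
confirmed by the session above, and the helper names are illustrative.

```python
CHUNK = 1 << 20   # chunk size used with "split -b 1m"
PAGE = 4096       # assumed page size, not confirmed for this machine

def chunk_index(name):
    """Index of a default split(1) output name: xaa -> 0, xab -> 1, ..."""
    first, second = name[1], name[2]
    return (ord(first) - ord("a")) * 26 + (ord(second) - ord("a"))

def absolute(name, offset):
    """Absolute file offset of a byte at `offset` inside chunk `name`."""
    return chunk_index(name) * CHUNK + offset

start = absolute("xco", 0x00045750)      # first zeroed byte
end = absolute("xcp", 0x00032fff) + 1    # one past the last zeroed byte
length = end - start

print("start  %#x (mod page: %#x)" % (start, start % PAGE))
print("end    %#x (mod page: %#x)" % (end, end % PAGE))
print("length %#x = %d bytes (mod page: %#x)" % (length, length, length % PAGE))
```

Under these assumptions the zeroed hunk is 0xed8b0 (972976) bytes long and
starts at 0x4245750, which is not a multiple of 4 KiB. The offset just past
its end, 0x4333000, would be a multiple of 4 KiB, which may or may not be a
coincidence.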

Since the glusterfs client has cached the right data, the bug must be
somewhere within glusterfs, or at least in its NetBSD port. I think the
NetBSD FUSE emulation can be ruled out. Opinions?

-- 
Emmanuel Dreyfus
manu at netbsd.org



