[Gluster-devel] a bug when read files in a symbol-link directory
Vijay Bellur
vijay at gluster.com
Mon Sep 7 05:10:26 UTC 2009
Hi He,
Can you please re-create the problem with -L DEBUG and post both the
client and server side logs?
Thanks,
Vijay
He Xiaobin wrote:
>
> I use glusterfs in a cluster system (configured as:
> dht->afr->client->server->iothreads->locks->posix). After days of
> running it is stable, but performance is poor (slower than NFS
> exported from a single server), and, most importantly, I have run
> into a bug in the last few days. This is really an emergency, so I
> need your help!
>
> What is the bug? In this system I use mvapich+blcr for task
> checkpoint and restore. I don't know how mvapich works internally,
> but I am sure it uses glusterfs in my case. When checkpointing a
> task on glusterfs, it creates one ckpt file for each process of the
> task. All the ckpt files are placed in a directory called 1, and a
> symbolic link called 0 is created pointing to directory 1. Here is
> an example: fortest is the username, .ckpt is the ckpt file
> directory for this user, 1972 is the task id, 0 is the symbolic
> link, and bt.C.64-19.ckpt is the ckpt file of the task's 19th
> process.
> [fortest@gfsclient02 1972]$ pwd
> /mnt/glusterfs/.ckpt/1972
> [fortest@gfsclient02 1972]$ ll
> total 132
> lrwxrwxrwx 1 fortest fortest 31 Sep 4 17:09 0 -> /mnt/glusterfs/fortest/.ckpt/1972/1
> drwx------ 2 fortest fortest 65536 Sep 4 20:06 1
> [fortest@gfsclient02 1972]$ ls 1/
> bt.C.64-0.ckpt   bt.C.64-21.ckpt  bt.C.64-33.ckpt  bt.C.64-45.ckpt  bt.C.64-57.ckpt
> bt.C.64-10.ckpt  bt.C.64-22.ckpt  bt.C.64-34.ckpt  bt.C.64-46.ckpt  bt.C.64-58.ckpt
> bt.C.64-11.ckpt  bt.C.64-23.ckpt  bt.C.64-35.ckpt  bt.C.64-47.ckpt  bt.C.64-59.ckpt
> bt.C.64-12.ckpt  bt.C.64-24.ckpt  bt.C.64-36.ckpt  bt.C.64-48.ckpt  bt.C.64-5.ckpt
> bt.C.64-13.ckpt  bt.C.64-25.ckpt  bt.C.64-37.ckpt  bt.C.64-49.ckpt  bt.C.64-60.ckpt
> bt.C.64-14.ckpt  bt.C.64-26.ckpt  bt.C.64-38.ckpt  bt.C.64-4.ckpt   bt.C.64-61.ckpt
> bt.C.64-15.ckpt  bt.C.64-27.ckpt  bt.C.64-39.ckpt  bt.C.64-50.ckpt  bt.C.64-62.ckpt
> bt.C.64-16.ckpt  bt.C.64-28.ckpt  bt.C.64-3.ckpt   bt.C.64-51.ckpt  bt.C.64-63.ckpt
> bt.C.64-17.ckpt  bt.C.64-29.ckpt  bt.C.64-40.ckpt  bt.C.64-52.ckpt  bt.C.64-6.ckpt
> bt.C.64-18.ckpt  bt.C.64-2.ckpt   bt.C.64-41.ckpt  bt.C.64-53.ckpt  bt.C.64-7.ckpt
> bt.C.64-19.ckpt  bt.C.64-30.ckpt  bt.C.64-42.ckpt  bt.C.64-54.ckpt  bt.C.64-8.ckpt
> bt.C.64-1.ckpt   bt.C.64-31.ckpt  bt.C.64-43.ckpt  bt.C.64-55.ckpt  bt.C.64-9.ckpt
> bt.C.64-20.ckpt  bt.C.64-32.ckpt  bt.C.64-44.ckpt  bt.C.64-56.ckpt
>
> When the task needs to be restored, mvapich reads the ckpt files
> through 0 (the symbolic link) and restores the task. All of this
> works smoothly on NFS, but on glusterfs it outputs the following
> messages. Sometimes the restore still finishes in the end, while
> other times it fails with almost the same messages. I have verified
> that the files mvapich reports as missing are indeed there. Another
> useful hint: the fewer gluster clients run the task, the less often
> the bug appears during restore. Starting glusterfs without direct-io
> did not help either.
>
> OUTPUT OF THE TASK WHEN RESTORE:
>
> 19: Restart: path /mnt/glusterfs/fortest/.ckpt/1972/0/bt.C.64-19.ckpt:
> No such file or directory20: Restart: path
> /mnt/glusterfs/fortest/.ckpt/1972/0/bt.C.64-20.ckpt: No such file or
> directorysrun: error: gfsclient10: task[19-20]: Exited with exit code 1
> 21: Restart: path /mnt/glusterfs/fortest/.ckpt/1972/0/bt.C.64-21.ckpt:
> No such file or directory18: Restart: path
> /mnt/glusterfs/fortest/.ckpt/1972/0/bt.C.64-18.ckpt: No such file or
> directorysrun: error: gfsclient10: task21: Exited with exit code 1
> srun: error: cn010: task18: Exited with exit code 1
> 17: Restart: path /mnt/glusterfs/fortest/.ckpt/1972/0/bt.C.64-17.ckpt:
> No such file or directorysrun: error: gfsclient10: task17: Exited with
> exit code 1
> 23: Restart: path /mnt/glusterfs/fortest/.ckpt/1972/0/bt.C.64-23.ckpt:
> No such file or directory22: Restart: path
> /mnt/glusterfs/fortest/.ckpt/1972/0/bt.C.64-22.ckpt: No such file or
> directorysrun: error: gfsclient10: task23: Exited with exit code 1
> srun: error: cn010: task[16,22]: Exited with exit code 1
> 16: Restart: path /mnt/glusterfs/fortest/.ckpt/1972/0/bt.C.64-16.ckpt:
> No such file or directory
>
>
> I used "debug/trace" and started glusterfs with "-L DEBUG", and got
> the following logs when the ckpt files could not be found:
>
> [2009-09-04 17:12:35] N [trace.c:1290:trace_readlink] tr0: 174536: (loc {path=/fortest/.ckpt/1972/0, ino=1380450540}, size=4096)
> [2009-09-04 17:12:35] N [trace.c:484:trace_readlink_cbk] tr0: 174536: (op_ret=31, op_errno=0, buf=/mnt/glusterfs/fortest/.ckpt/1972/1)
> [2009-09-04 17:12:35] E [fuse-bridge.c:987:fuse_readlink_cbk] glusterfs-fuse: 174536: /fortest/.ckpt/1972/0 => /mnt/glusterfs/fortest/.ckpt/1972/1 @ 1252055555
> [2009-09-04 17:12:35] N [trace.c:1245:trace_lookup] tr0: 174537: (loc {path=/fortest/.ckpt/1972/1, ino=0})
> [2009-09-04 17:12:35] N [trace.c:513:trace_lookup_cbk] tr0: 174508: (op_ret=0, ino=0, *buf {st_dev=2065, st_ino=7068450884, st_mode=40700, st_nlink=2, st_uid=1001, st_gid=1001, st_rdev=0, st_size=65536, st_blksize=4096, st_blocks=256})
> [2009-09-04 17:12:35] E [fuse-bridge.c:255:fuse_loc_fill] glusterfs-fuse: inode_path failed for 8003256399/bt.C.64-22.ckpt @ 1252055555
> [2009-09-04 17:12:35] W [fuse-bridge.c:436:fuse_lookup] glusterfs-fuse: 174539: LOOKUP 8003256399/bt.C.64-22.ckpt (fuse_loc_fill() failed)
> [2009-09-04 17:12:35] N [trace.c:513:trace_lookup_cbk] tr0: 174522: (op_ret=0, ino=0, *buf {st_dev=2065, st_ino=7068450884, st_mode=40700, st_nlink=2, st_uid=1001, st_gid=1001, st_rdev=0, st_size=65536, st_blksize=4096, st_blocks=256})
> [2009-09-04 17:12:35] E [fuse-bridge.c:255:fuse_loc_fill] glusterfs-fuse: inode_path failed for 8003256399/bt.C.64-16.ckpt @ 1252055555
>
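For collecting the requested logs without a full mvapich+blcr run, the access pattern described in the report (many readers opening files through a symbolic link whose target is an absolute path to a sibling directory) could perhaps be approximated with a small script. This is only a rough sketch and not the reporter's workload: the function names are hypothetical, a thread pool in one process stands in for the MPI ranks spread over several clients, and the 4 KiB file size is arbitrary. It may or may not trigger the same failure.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_checkpoint_tree(root, nfiles=64):
    """Create root/1 with dummy ckpt files and a symlink root/0 -> absolute path of root/1."""
    target = os.path.join(root, "1")
    os.makedirs(target, mode=0o700)
    for rank in range(nfiles):
        # File naming copied from the report; the contents are placeholder bytes.
        with open(os.path.join(target, f"bt.C.64-{rank}.ckpt"), "wb") as fh:
            fh.write(b"x" * 4096)
    link = os.path.join(root, "0")
    os.symlink(os.path.abspath(target), link)
    return link


def read_through_link(link, rank):
    """Read one ckpt file via the symlink; return None on success or (path, errno) on failure."""
    path = os.path.join(link, f"bt.C.64-{rank}.ckpt")
    try:
        with open(path, "rb") as fh:
            fh.read()
        return None
    except OSError as exc:
        return (path, exc.errno)


def run_once(root, nfiles=64, workers=16):
    """Build the tree under root, read every file concurrently, and return the list of failures."""
    link = make_checkpoint_tree(root, nfiles)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda rank: read_through_link(link, rank), range(nfiles)))
    return [r for r in results if r is not None]


if __name__ == "__main__":
    # Assumption: replace the temporary directory with an empty directory on the
    # glusterfs mount to exercise the client; a local temp dir is only a sanity check.
    with tempfile.TemporaryDirectory() as tmp:
        failures = run_once(tmp)
        print(f"{len(failures)} failed reads")
        for path, err in failures:
            print(path, os.strerror(err))
```

On a local filesystem the script is expected to report no failed reads. Any ENOENT entries seen when it runs on the glusterfs mount would be the moments worth matching against the client and server logs.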