[Gluster-users] Deadlock with NFS client and glusterfs server on the same host

Artur Zaprzała artur.zaprzala at talex.pl
Thu Jan 20 13:36:50 UTC 2011


Glusterfs deadlocks when a volume is mounted over NFS on the same host 
where glusterfsd is running. I would like to know whether this 
configuration is supported, because some network filesystems are known 
to deadlock when the client and the server share a host.
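
For reference, the configuration can be reproduced along these lines 
(the volume name, brick path and server name below are placeholders, 
not taken from my actual setup; the mount point and the dd command 
match what I ran):

# gluster volume create home server1:/export/home
# gluster volume start home
# mount -t nfs -o vers=3,proto=tcp localhost:/home /home/gluster
# dd if=/dev/zero of=/home/gluster/bigfile bs=1M count=100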

The deadlock happens when writing a file big enough to fill the page 
cache: the kernel tries to flush dirty pages to free memory for 
glusterfsd, but glusterfsd needs memory to commit those filesystem 
blocks, so the kernel tries to flush dirty pages to free memory for 
glusterfsd...
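
In theory, making the kernel start writeback earlier should narrow the 
window, although it cannot remove the circular dependency itself. 
Something like the following could be tried (the values are only an 
illustration, and I have not verified that this avoids the deadlock):

# sysctl -w vm.dirty_background_ratio=1
# sysctl -w vm.dirty_ratio=5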

I'm testing glusterfs-3.1.1 on Fedora 14 with kernel 
2.6.35.10-74.fc14.x86_64.


# ps xaww -Owchan:22
   PID WCHAN                  S TTY          TIME COMMAND
  1603 nfs_wait_bit_killable  D ?        00:00:07 /usr/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log
  1655 nfs_wait_bit_killable  D pts/1    00:00:00 dd if=/dev/zero of=/home/gluster/bigfile bs=1M count=100

# cat /proc/1603/stack
[<ffffffffa0100a6c>] nfs_wait_bit_killable+0x34/0x38 [nfs]
[<ffffffffa010cea7>] nfs_commit_inode+0x71/0x1d6 [nfs]
[<ffffffffa00ff128>] nfs_release_page+0x66/0x83 [nfs]
[<ffffffff810d2d47>] try_to_release_page+0x32/0x3b
[<ffffffff810de9ec>] shrink_page_list+0x2cf/0x446
[<ffffffff810deeb0>] shrink_inactive_list.clone.35+0x34d/0x5c6
[<ffffffff810df732>] shrink_zone+0x355/0x3e2
[<ffffffff810dfbaa>] do_try_to_free_pages+0x160/0x363
[<ffffffff810dff46>] try_to_free_pages+0x67/0x69
[<ffffffff810da1d3>] __alloc_pages_nodemask+0x525/0x776
[<ffffffff81100015>] alloc_pages_current+0xa9/0xc3
[<ffffffff813fc9cc>] tcp_sendmsg+0x3c5/0x809
[<ffffffff813aefe3>] __sock_sendmsg+0x6b/0x77
[<ffffffff813afd99>] sock_aio_write+0xc2/0xd6
[<ffffffff811176aa>] do_sync_readv_writev+0xc1/0x100
[<ffffffff81117900>] do_readv_writev+0xa7/0x127
[<ffffffff811179c5>] vfs_writev+0x45/0x47
[<ffffffff81117ae8>] sys_writev+0x4a/0x93
[<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

# cat /proc/1655/stack
[<ffffffffa0100a6c>] nfs_wait_bit_killable+0x34/0x38 [nfs]
[<ffffffffa010cfd6>] nfs_commit_inode+0x1a0/0x1d6 [nfs]
[<ffffffffa00ff128>] nfs_release_page+0x66/0x83 [nfs]
[<ffffffff810d2d47>] try_to_release_page+0x32/0x3b
[<ffffffff810de9ec>] shrink_page_list+0x2cf/0x446
[<ffffffff810deeb0>] shrink_inactive_list.clone.35+0x34d/0x5c6
[<ffffffff810df732>] shrink_zone+0x355/0x3e2
[<ffffffff810dfbaa>] do_try_to_free_pages+0x160/0x363
[<ffffffff810dff46>] try_to_free_pages+0x67/0x69
[<ffffffff810da1d3>] __alloc_pages_nodemask+0x525/0x776
[<ffffffff81100015>] alloc_pages_current+0xa9/0xc3
[<ffffffff810d3aa3>] __page_cache_alloc+0x77/0x7e
[<ffffffff810d3c6c>] grab_cache_page_write_begin+0x5c/0xa3
[<ffffffffa00ff41e>] nfs_write_begin+0xd4/0x187 [nfs]
[<ffffffff810d2f27>] generic_file_buffered_write+0xfa/0x23d
[<ffffffff810d498c>] __generic_file_aio_write+0x24f/0x27f
[<ffffffff810d4a17>] generic_file_aio_write+0x5b/0xab
[<ffffffffa010001f>] nfs_file_write+0xe0/0x172 [nfs]
[<ffffffff81116bf6>] do_sync_write+0xcb/0x108
[<ffffffff811172d0>] vfs_write+0xac/0x100
[<ffffffff811174d9>] sys_write+0x4a/0x6e
[<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
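
For completeness, the stacks of all blocked (state D) tasks can also be 
dumped in one shot via SysRq, assuming the kernel was built with 
CONFIG_MAGIC_SYSRQ and the feature is enabled:

# echo 1 > /proc/sys/kernel/sysrq
# echo w > /proc/sysrq-trigger
# dmesg | tail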


