[Gluster-users] A problem when mount glusterfs via NFS

Pippo pippo0805 at 163.com
Tue Mar 26 02:05:17 UTC 2013


Hi Pranith:

    Thanks for your reply. I am running GlusterFS 3.3.
    There is an important detail I forgot to mention:
    When the NFS server and the client are both running CentOS 5.5, they work fine. The problem appears only when the NFS server runs CentOS 5.5 and the client runs CentOS 6.3.
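For reference, Gluster's built-in NFS server speaks NFSv3 over TCP only, and a CentOS 6.3 client may negotiate different mount defaults than a CentOS 5.5 one, so it is worth pinning the protocol explicitly. A sketch of the mount command (server address taken from the volfile below; /mnt/tcfstest is a placeholder mount point):

```shell
# Gluster's NFS server supports NFSv3 over TCP only, so force those
# options on the CentOS 6.3 client instead of relying on its defaults
# (server address from the volfile; mount point is a placeholder):
mount -t nfs -o vers=3,proto=tcp,nolock 125.210.140.17:/tcfstest /mnt/tcfstest
```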
    
From /var/log/messages:
Mar 12 18:04:41 localhost kernel: glusterfs invoked oom-killer: gfp_mask=0x280d2, order=0, oomkilladj=0
Mar 12 18:04:41 localhost kernel: 
Mar 12 18:04:41 localhost kernel: Call Trace:
Mar 12 18:04:41 localhost kernel:  [<ffffffff800c723e>] out_of_memory+0x8e/0x2f3
Mar 12 18:04:41 localhost kernel:  [<ffffffff8002e22d>] __wake_up+0x38/0x4f
Mar 12 18:04:41 localhost kernel:  [<ffffffff8000f53f>] __alloc_pages+0x27f/0x308
Mar 12 18:04:41 localhost kernel:  [<ffffffff80008e9f>] __handle_mm_fault+0x73c/0x1039
Mar 12 18:04:41 localhost kernel:  [<ffffffff80066b55>] do_page_fault+0x4cb/0x874
Mar 12 18:04:41 localhost kernel:  [<ffffffff800f8935>] sys_epoll_wait+0x3b8/0x3f9
Mar 12 18:04:41 localhost kernel:  [<ffffffff8005dde9>] error_exit+0x0/0x84
Mar 12 18:04:41 localhost kernel: 
Mar 12 18:04:41 localhost kernel: Mem-info:
Mar 12 18:04:41 localhost kernel: Node 0 DMA per-cpu:
Mar 12 18:04:41 localhost kernel: cpu 0 hot: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 0 cold: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 1 hot: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 1 cold: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 2 hot: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 2 cold: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 3 hot: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 3 cold: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 4 hot: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 4 cold: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 5 hot: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 5 cold: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 6 hot: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 6 cold: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 7 hot: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: cpu 7 cold: high 0, batch 1 used:0
Mar 12 18:04:41 localhost kernel: Node 0 DMA32 per-cpu:
Mar 12 18:04:41 localhost kernel: cpu 0 hot: high 186, batch 31 used:131
Mar 12 18:04:41 localhost kernel: cpu 0 cold: high 62, batch 15 used:49
Mar 12 18:04:41 localhost kernel: cpu 1 hot: high 186, batch 31 used:13
Mar 12 18:04:41 localhost kernel: cpu 1 cold: high 62, batch 15 used:54
Mar 12 18:04:41 localhost kernel: cpu 2 hot: high 186, batch 31 used:30
Mar 12 18:04:41 localhost kernel: cpu 2 cold: high 62, batch 15 used:23
Mar 12 18:04:41 localhost kernel: cpu 3 hot: high 186, batch 31 used:106
Mar 12 18:04:41 localhost kernel: cpu 3 cold: high 62, batch 15 used:40
Mar 12 18:04:41 localhost kernel: cpu 4 hot: high 186, batch 31 used:19
Mar 12 18:04:41 localhost kernel: cpu 4 cold: high 62, batch 15 used:52
Mar 12 18:04:41 localhost kernel: cpu 5 hot: high 186, batch 31 used:19
Mar 12 18:04:41 localhost kernel: cpu 5 cold: high 62, batch 15 used:51
Mar 12 18:04:41 localhost kernel: cpu 6 hot: high 186, batch 31 used:38
Mar 12 18:04:41 localhost kernel: cpu 6 cold: high 62, batch 15 used:49
Mar 12 18:04:41 localhost kernel: cpu 7 hot: high 186, batch 31 used:27
Mar 12 18:04:41 localhost kernel: cpu 7 cold: high 62, batch 15 used:48
Mar 12 18:04:41 localhost kernel: Node 0 Normal per-cpu:
Mar 12 18:04:41 localhost kernel: cpu 0 hot: high 186, batch 31 used:59
Mar 12 18:04:41 localhost kernel: cpu 0 cold: high 62, batch 15 used:43
Mar 12 18:04:41 localhost kernel: cpu 1 hot: high 186, batch 31 used:36
Mar 12 18:04:41 localhost kernel: cpu 1 cold: high 62, batch 15 used:56
Mar 12 18:04:41 localhost kernel: cpu 2 hot: high 186, batch 31 used:25
Mar 12 18:04:41 localhost kernel: cpu 2 cold: high 62, batch 15 used:42
Mar 12 18:04:41 localhost kernel: cpu 3 hot: high 186, batch 31 used:22
Mar 12 18:04:41 localhost kernel: cpu 3 cold: high 62, batch 15 used:43
Mar 12 18:04:41 localhost kernel: cpu 4 hot: high 186, batch 31 used:140
Mar 12 18:04:41 localhost kernel: cpu 4 cold: high 62, batch 15 used:51
Mar 12 18:04:41 localhost kernel: cpu 5 hot: high 186, batch 31 used:2
Mar 12 18:04:41 localhost kernel: cpu 5 cold: high 62, batch 15 used:51
Mar 12 18:04:41 localhost kernel: cpu 6 hot: high 186, batch 31 used:39
Mar 12 18:04:41 localhost kernel: cpu 6 cold: high 62, batch 15 used:55
Mar 12 18:04:41 localhost kernel: cpu 7 hot: high 186, batch 31 used:28
Mar 12 18:04:41 localhost kernel: cpu 7 cold: high 62, batch 15 used:57
Mar 12 18:04:41 localhost kernel: Node 0 HighMem per-cpu: empty
Mar 12 18:04:41 localhost kernel: Free pages:       47116kB (0kB HighMem)
Mar 12 18:04:41 localhost kernel: Active:941872 inactive:355 dirty:0 writeback:0 unstable:0 free:11779 slab:40203 mapped-file:1 mapped-anon:938241 pagetables:3947
Mar 12 18:04:41 localhost kernel: Node 0 DMA free:10876kB min:80kB low:100kB high:120kB active:0kB inactive:0kB present:10476kB pages_scanned:0 all_unreclaimable? yes
Mar 12 18:04:41 localhost kernel: lowmem_reserve[]: 0 2978 3988 3988
Mar 12 18:04:41 localhost kernel: Node 0 DMA32 free:28400kB min:24404kB low:30504kB high:36604kB active:2964256kB inactive:0kB present:3049956kB pages_scanned:15320246 all_unreclaimable? yes
Mar 12 18:04:41 localhost kernel: lowmem_reserve[]: 0 0 1010 1010
Mar 12 18:04:41 localhost kernel: Node 0 Normal free:7840kB min:8276kB low:10344kB high:12412kB active:803516kB inactive:1292kB present:1034240kB pages_scanned:7795789 all_unreclaimable? yes
Mar 12 18:04:41 localhost kernel: lowmem_reserve[]: 0 0 0 0
Mar 12 18:04:41 localhost kernel: Node 0 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 12 18:04:41 localhost kernel: lowmem_reserve[]: 0 0 0 0
Mar 12 18:04:41 localhost kernel: Node 0 DMA: 3*4kB 0*8kB 5*16kB 3*32kB 5*64kB 3*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 2*4096kB = 10876kB
Mar 12 18:04:41 localhost kernel: Node 0 DMA32: 16*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 6*4096kB = 28400kB
Mar 12 18:04:41 localhost kernel: Node 0 Normal: 10*4kB 3*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 1*4096kB = 7840kB
Mar 12 18:04:41 localhost kernel: Node 0 HighMem: empty
Mar 12 18:04:41 localhost kernel: 4366 pagecache pages
Mar 12 18:04:41 localhost kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0
Mar 12 18:04:41 localhost kernel: Free swap  = 0kB
Mar 12 18:04:41 localhost kernel: Total swap = 0kB
Mar 12 18:04:41 localhost kernel: Free swap:            0kB
Mar 12 18:04:42 localhost kernel: 1310720 pages of RAM
Mar 12 18:04:42 localhost kernel: 305192 reserved pages
Mar 12 18:04:42 localhost kernel: 12126 pages shared
Mar 12 18:04:42 localhost kernel: 0 pages swap cached
Mar 12 18:04:42 localhost kernel: Out of memory: Killed process 4738, UID 0, (glusterfs).
Mar 12 18:04:42 localhost kernel: irqbalance invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
Mar 12 18:04:42 localhost kernel: 
Mar 12 18:04:42 localhost kernel: Call Trace:
Mar 12 18:04:42 localhost kernel:  [<ffffffff800c723e>] out_of_memory+0x8e/0x2f3
Mar 12 18:04:42 localhost kernel:  [<ffffffff8002e22d>] __wake_up+0x38/0x4f
Mar 12 18:04:42 localhost kernel:  [<ffffffff8000f53f>] __alloc_pages+0x27f/0x308
Mar 12 18:04:42 localhost kernel:  [<ffffffff80012eea>] __do_page_cache_readahead+0x96/0x179
Mar 12 18:04:42 localhost kernel:  [<ffffffff800138a2>] filemap_nopage+0x14c/0x360
Mar 12 18:04:42 localhost kernel:  [<ffffffff8000895e>] __handle_mm_fault+0x1fb/0x1039
Mar 12 18:04:42 localhost kernel:  [<ffffffff80062ff8>] thread_return+0x62/0xfe
Mar 12 18:04:42 localhost kernel:  [<ffffffff80066b55>] do_page_fault+0x4cb/0x874
Mar 12 18:04:42 localhost kernel:  [<ffffffff8005a4bc>] hrtimer_cancel+0xc/0x16
Mar 12 18:04:42 localhost kernel:  [<ffffffff80063d05>] do_nanosleep+0x47/0x70
Mar 12 18:04:42 localhost kernel:  [<ffffffff8005a3a9>] hrtimer_nanosleep+0x58/0x118
Mar 12 18:04:42 localhost kernel:  [<ffffffff8005dde9>] error_exit+0x0/0x84
Mar 12 18:04:42 localhost kernel: 
Mar 12 18:04:42 localhost kernel: Mem-info:
Mar 12 18:04:42 localhost kernel: Node 0 DMA per-cpu:
Mar 12 18:04:42 localhost kernel: cpu 0 hot: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 0 cold: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 1 hot: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 1 cold: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 2 hot: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 2 cold: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 3 hot: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 3 cold: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 4 hot: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 4 cold: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 5 hot: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 5 cold: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 6 hot: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 6 cold: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 7 hot: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: cpu 7 cold: high 0, batch 1 used:0
Mar 12 18:04:42 localhost kernel: Node 0 DMA32 per-cpu:
Mar 12 18:04:42 localhost kernel: cpu 0 hot: high 186, batch 31 used:183
Mar 12 18:04:42 localhost kernel: cpu 0 cold: high 62, batch 15 used:49
Mar 12 18:04:42 localhost kernel: cpu 1 hot: high 186, batch 31 used:20
Mar 12 18:04:42 localhost kernel: cpu 1 cold: high 62, batch 15 used:54
Mar 12 18:04:42 localhost kernel: cpu 2 hot: high 186, batch 31 used:36
Mar 12 18:04:42 localhost kernel: cpu 2 cold: high 62, batch 15 used:23
Mar 12 18:04:42 localhost kernel: cpu 3 hot: high 186, batch 31 used:113
Mar 12 18:04:42 localhost kernel: cpu 3 cold: high 62, batch 15 used:40
Mar 12 18:04:42 localhost kernel: cpu 4 hot: high 186, batch 31 used:21
Mar 12 18:04:42 localhost kernel: cpu 4 cold: high 62, batch 15 used:53
Mar 12 18:04:42 localhost kernel: cpu 5 hot: high 186, batch 31 used:24
Mar 12 18:04:42 localhost kernel: cpu 5 cold: high 62, batch 15 used:51
Mar 12 18:04:42 localhost kernel: cpu 6 hot: high 186, batch 31 used:44
Mar 12 18:04:42 localhost kernel: cpu 6 cold: high 62, batch 15 used:49
Mar 12 18:04:42 localhost kernel: cpu 7 hot: high 186, batch 31 used:32
Mar 12 18:04:42 localhost kernel: cpu 7 cold: high 62, batch 15 used:52
Mar 12 18:04:42 localhost kernel: Node 0 Normal per-cpu:
Mar 12 18:04:42 localhost kernel: cpu 0 hot: high 186, batch 31 used:127
Mar 12 18:04:42 localhost kernel: cpu 0 cold: high 62, batch 15 used:43
Mar 12 18:04:42 localhost kernel: cpu 1 hot: high 186, batch 31 used:42
Mar 12 18:04:42 localhost kernel: cpu 1 cold: high 62, batch 15 used:56
Mar 12 18:04:42 localhost kernel: cpu 2 hot: high 186, batch 31 used:38
Mar 12 18:04:42 localhost kernel: cpu 2 cold: high 62, batch 15 used:42
Mar 12 18:04:42 localhost kernel: cpu 3 hot: high 186, batch 31 used:31
Mar 12 18:04:42 localhost kernel: cpu 3 cold: high 62, batch 15 used:43
Mar 12 18:04:42 localhost kernel: cpu 4 hot: high 186, batch 31 used:142
Mar 12 18:04:42 localhost kernel: cpu 4 cold: high 62, batch 15 used:51
Mar 12 18:04:42 localhost kernel: cpu 5 hot: high 186, batch 31 used:9
Mar 12 18:04:42 localhost kernel: cpu 5 cold: high 62, batch 15 used:51
Mar 12 18:04:42 localhost kernel: cpu 6 hot: high 186, batch 31 used:50
Mar 12 18:04:42 localhost kernel: cpu 6 cold: high 62, batch 15 used:55
Mar 12 18:04:42 localhost kernel: cpu 7 hot: high 186, batch 31 used:49
Mar 12 18:04:42 localhost kernel: cpu 7 cold: high 62, batch 15 used:57
Mar 12 18:04:42 localhost kernel: Node 0 HighMem per-cpu: empty
Mar 12 18:04:42 localhost kernel: Free pages:       47324kB (0kB HighMem)
Mar 12 18:04:42 localhost kernel: Active:941925 inactive:356 dirty:0 writeback:0 unstable:0 free:11831 slab:39893 mapped-file:1 mapped-anon:938241 pagetables:3947
Mar 12 18:04:42 localhost kernel: Node 0 DMA free:10876kB min:80kB low:100kB high:120kB active:0kB inactive:0kB present:10476kB pages_scanned:0 all_unreclaimable? yes
Mar 12 18:04:42 localhost kernel: lowmem_reserve[]: 0 2978 3988 3988
Mar 12 18:04:42 localhost kernel: Node 0 DMA32 free:28568kB min:24404kB low:30504kB high:36604kB active:2963940kB inactive:48kB present:3049956kB pages_scanned:494718 all_unreclaimable? no
Mar 12 18:04:42 localhost kernel: lowmem_reserve[]: 0 0 1010 1010
Mar 12 18:04:42 localhost kernel: Node 0 Normal free:7880kB min:8276kB low:10344kB high:12412kB active:803444kB inactive:1420kB present:1034240kB pages_scanned:6454043 all_unreclaimable? yes
Mar 12 18:04:42 localhost kernel: lowmem_reserve[]: 0 0 0 0
Mar 12 18:04:42 localhost kernel: Node 0 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 12 18:04:42 localhost kernel: lowmem_reserve[]: 0 0 0 0
Mar 12 18:04:42 localhost kernel: Node 0 DMA: 3*4kB 0*8kB 5*16kB 3*32kB 5*64kB 3*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 2*4096kB = 10876kB
Mar 12 18:04:42 localhost kernel: Node 0 DMA32: 1*4kB 9*8kB 3*16kB 1*32kB 1*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 6*4096kB = 28508kB
Mar 12 18:04:42 localhost kernel: Node 0 Normal: 10*4kB 6*8kB 1*16kB 1*32kB 1*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 1*4096kB = 7880kB
Mar 12 18:04:42 localhost kernel: Node 0 HighMem: empty
Mar 12 18:04:42 localhost kernel: 4366 pagecache pages
Mar 12 18:04:42 localhost kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0
Mar 12 18:04:42 localhost kernel: Free swap  = 0kB
Mar 12 18:04:42 localhost kernel: Total swap = 0kB
Mar 12 18:04:42 localhost kernel: Free swap:            0kB
Mar 12 18:04:42 localhost kernel: 1310720 pages of RAM
Mar 12 18:04:42 localhost kernel: 305192 reserved pages
Mar 12 18:04:42 localhost kernel: 12225 pages shared
Mar 12 18:04:42 localhost kernel: 0 pages swap cached
Mar 12 18:04:42 localhost kernel: Out of memory: Killed process 4741, UID 0, (glusterfs).
Mar 12 19:26:52 localhost GlusterFS[4743]: [2013-03-12 19:26:52.844286] C [rpc-clnt.c:476:rpc_clnt_fill_request_info] 0-tcfstest-client-0: cannot lookup the saved frame corresponding to xid (74) 
Mar 12 23:50:01 localhost syslogd 1.4.1: restart (remote reception).
Mar 13 09:18:01 localhost auditd[2895]: Audit daemon rotating log files
Mar 13 09:53:49 localhost rpc.statd[6821]: Version 1.0.9 Starting
Mar 13 09:53:49 localhost rpc.statd[6821]: statd running as root. chown /var/lib/nfs/statd/sm to choose different user 
Mar 13 09:56:40 localhost kernel: fuse init (API version 7.10)


Log from the NFS server in /var/log/glusterfs/nfs.log:
[2013-03-13 09:53:48.901714] I [glusterfsd.c:1666:main] 0-/sbin/glusterfs: Started running /sbin/glusterfs version 3.3.0
[2013-03-13 09:53:49.031612] I [nfs.c:821:init] 0-nfs: NFS service started
[2013-03-13 09:53:49.039195] W [graph.c:316:_log_if_unknown_option] 0-nfs-server: option 'rpc-auth.auth-glusterfs' is not recognized
[2013-03-13 09:53:49.039238] W [graph.c:316:_log_if_unknown_option] 0-nfs-server: option 'rpc-auth-allow-insecure' is not recognized
[2013-03-13 09:53:49.039258] W [graph.c:316:_log_if_unknown_option] 0-nfs-server: option 'transport-type' is not recognized
[2013-03-13 09:53:49.039304] I [client.c:2142:notify] 0-tcfstest-client-0: parent translators are ready, attempting connect on transport
[2013-03-13 09:53:49.044451] I [client.c:2142:notify] 0-tcfstest-client-1: parent translators are ready, attempting connect on transport
[2013-03-13 09:53:49.047895] I [client.c:2142:notify] 0-tcfstest-client-2: parent translators are ready, attempting connect on transport
[2013-03-13 09:53:49.051169] I [client.c:2142:notify] 0-tcfstest-client-3: parent translators are ready, attempting connect on transport
[2013-03-13 09:53:49.054372] I [client.c:2142:notify] 0-tcfstest-client-4: parent translators are ready, attempting connect on transport
[2013-03-13 09:53:49.057610] I [client.c:2142:notify] 0-tcfstest-client-5: parent translators are ready, attempting connect on transport
Given volfile:
+------------------------------------------------------------------------------+
  1: volume tcfstest-client-0
  2:     type protocol/client
  3:     option remote-host 125.210.140.17
  4:     option remote-subvolume /mnt/p1/exp
  5:     option transport-type tcp
  6:     option username 51b818d5-9b20-402a-9087-b03c5a91b01f
  7:     option password fc720fbe-23b7-4696-971f-09d57c822e82
  8: end-volume
  9: 
 10: volume tcfstest-client-1
 11:     type protocol/client
 12:     option remote-host 125.210.140.18
 13:     option remote-subvolume /mnt/p1/exp
 14:     option transport-type tcp
 15:     option username 51b818d5-9b20-402a-9087-b03c5a91b01f
 16:     option password fc720fbe-23b7-4696-971f-09d57c822e82
 17: end-volume
 18: 
 19: volume tcfstest-client-2
 20:     type protocol/client
 21:     option remote-host 125.210.140.17
 22:     option remote-subvolume /mnt/p2/exp
 23:     option transport-type tcp
 24:     option username 51b818d5-9b20-402a-9087-b03c5a91b01f
 25:     option password fc720fbe-23b7-4696-971f-09d57c822e82
 26: end-volume
 27: 
 28: volume tcfstest-client-3
 29:     type protocol/client
 30:     option remote-host 125.210.140.18
 31:     option remote-subvolume /mnt/p2/exp
 32:     option transport-type tcp
 33:     option username 51b818d5-9b20-402a-9087-b03c5a91b01f
 34:     option password fc720fbe-23b7-4696-971f-09d57c822e82
 35: end-volume
 36: 
 37: volume tcfstest-client-4
 38:     type protocol/client
 39:     option remote-host 125.210.140.19
 40:     option remote-subvolume /mnt/p1/exp
 41:     option transport-type tcp
 42:     option username 51b818d5-9b20-402a-9087-b03c5a91b01f
 43:     option password fc720fbe-23b7-4696-971f-09d57c822e82
 44: end-volume
 45: 
 46: volume tcfstest-client-5
 47:     type protocol/client
 48:     option remote-host 125.210.140.20
 49:     option remote-subvolume /mnt/p1/exp
 50:     option transport-type tcp
 51:     option username 51b818d5-9b20-402a-9087-b03c5a91b01f
 52:     option password fc720fbe-23b7-4696-971f-09d57c822e82
 53: end-volume
 54: 
 55: volume tcfstest-replicate-0
 56:     type cluster/replicate
 57:     subvolumes tcfstest-client-0 tcfstest-client-1
 58: end-volume
 59: 
 60: volume tcfstest-replicate-1
 61:     type cluster/replicate
 62:     subvolumes tcfstest-client-2 tcfstest-client-3
 63: end-volume
 64: 
 65: volume tcfstest-replicate-2
 66:     type cluster/replicate
 67:     subvolumes tcfstest-client-4 tcfstest-client-5
 68: end-volume
 69: 
 70: volume tcfstest-dht
 71:     type cluster/distribute
 72:     subvolumes tcfstest-replicate-0 tcfstest-replicate-1 tcfstest-replicate-2
 73: end-volume
 74: 
 75: volume tcfstest
 76:     type debug/io-stats
 77:     option latency-measurement off
 78:     option count-fop-hits off
 79:     subvolumes tcfstest-dht
 80: end-volume
 81: 
 82: volume nfs-server
 83:     type nfs/server
 84:     option nfs.dynamic-volumes on
 85:     option nfs.nlm on
 86:     option rpc-auth.addr.tcfstest.allow *
 87:     option nfs3.tcfstest.volume-id a880c23b-b02b-4bb6-94b4-80829a893a20
 88:     subvolumes tcfstest
 89: end-volume

+------------------------------------------------------------------------------+
[2013-03-13 09:53:49.061446] I [rpc-clnt.c:1660:rpc_clnt_reconfig] 0-tcfstest-client-2: changing port to 24014 (from 0)
[2013-03-13 09:53:49.061502] I [rpc-clnt.c:1660:rpc_clnt_reconfig] 0-tcfstest-client-0: changing port to 24013 (from 0)
[2013-03-13 09:53:49.061558] I [rpc-clnt.c:1660:rpc_clnt_reconfig] 0-tcfstest-client-3: changing port to 24014 (from 0)
[2013-03-13 09:53:49.061615] I [rpc-clnt.c:1660:rpc_clnt_reconfig] 0-tcfstest-client-1: changing port to 24013 (from 0)
[2013-03-13 09:53:49.061671] I [rpc-clnt.c:1660:rpc_clnt_reconfig] 0-tcfstest-client-5: changing port to 24011 (from 0)
[2013-03-13 09:53:49.061701] I [rpc-clnt.c:1660:rpc_clnt_reconfig] 0-tcfstest-client-4: changing port to 24011 (from 0)
[2013-03-13 09:53:51.603169] W [socket.c:410:__socket_keepalive] 0-socket: failed to set keep idle on socket 8
[2013-03-13 09:53:51.603224] W [socket.c:1876:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2013-03-13 09:53:52.922575] I [client-handshake.c:1636:select_server_supported_programs] 0-tcfstest-client-2: Using Program GlusterFS 3.3.0, Num (1298437), Version (330)
[2013-03-13 09:53:52.925650] I [client-handshake.c:1433:client_setvolume_cbk] 0-tcfstest-client-2: Connected to 125.210.140.17:24014, attached to remote volume '/mnt/p2/exp'.
[2013-03-13 09:53:52.925689] I [client-handshake.c:1445:client_setvolume_cbk] 0-tcfstest-client-2: Server and Client lk-version numbers are not same, reopening the fds
[2013-03-13 09:53:52.925824] I [afr-common.c:3627:afr_notify] 0-tcfstest-replicate-1: Subvolume 'tcfstest-client-2' came back up; going online.
[2013-03-13 09:53:52.926002] I [client-handshake.c:453:client_set_lk_version_cbk] 0-tcfstest-client-2: Server lk version = 1
[2013-03-13 09:53:52.926088] I [client-handshake.c:1636:select_server_supported_programs] 0-tcfstest-client-0: Using Program GlusterFS 3.3.0, Num (1298437), Version (330)
[2013-03-13 09:53:52.929945] I [client-handshake.c:1433:client_setvolume_cbk] 0-tcfstest-client-0: Connected to 125.210.140.17:24013, attached to remote volume '/mnt/p1/exp'.
[2013-03-13 09:53:52.929983] I [client-handshake.c:1445:client_setvolume_cbk] 0-tcfstest-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2013-03-13 09:53:52.930042] I [afr-common.c:3627:afr_notify] 0-tcfstest-replicate-0: Subvolume 'tcfstest-client-0' came back up; going online.
[2013-03-13 09:53:52.930073] I [client-handshake.c:453:client_set_lk_version_cbk] 0-tcfstest-client-0: Server lk version = 1
[2013-03-13 09:53:52.930979] I [client-handshake.c:1636:select_server_supported_programs] 0-tcfstest-client-3: Using Program GlusterFS 3.3.0, Num (1298437), Version (330)
[2013-03-13 09:53:52.934061] I [client-handshake.c:1433:client_setvolume_cbk] 0-tcfstest-client-3: Connected to 125.210.140.18:24014, attached to remote volume '/mnt/p2/exp'.
[2013-03-13 09:53:52.934084] I [client-handshake.c:1445:client_setvolume_cbk] 0-tcfstest-client-3: Server and Client lk-version numbers are not same, reopening the fds
[2013-03-13 09:53:52.934241] I [client-handshake.c:1636:select_server_supported_programs] 0-tcfstest-client-1: Using Program GlusterFS 3.3.0, Num (1298437), Version (330)
[2013-03-13 09:53:52.934339] I [client-handshake.c:453:client_set_lk_version_cbk] 0-tcfstest-client-3: Server lk version = 1
[2013-03-13 09:53:52.937572] I [client-handshake.c:1433:client_setvolume_cbk] 0-tcfstest-client-1: Connected to 125.210.140.18:24013, attached to remote volume '/mnt/p1/exp'.
[2013-03-13 09:53:52.937605] I [client-handshake.c:1445:client_setvolume_cbk] 0-tcfstest-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2013-03-13 09:53:52.937813] I [client-handshake.c:1636:select_server_supported_programs] 0-tcfstest-client-5: Using Program GlusterFS 3.3.0, Num (1298437), Version (330)
[2013-03-13 09:53:52.937918] I [client-handshake.c:453:client_set_lk_version_cbk] 0-tcfstest-client-1: Server lk version = 1
[2013-03-13 09:53:52.940899] I [client-handshake.c:1433:client_setvolume_cbk] 0-tcfstest-client-5: Connected to 125.210.140.20:24011, attached to remote volume '/mnt/p1/exp'.
[2013-03-13 09:53:52.940925] I [client-handshake.c:1445:client_setvolume_cbk] 0-tcfstest-client-5: Server and Client lk-version numbers are not same, reopening the fds
[2013-03-13 09:53:52.940972] I [afr-common.c:3627:afr_notify] 0-tcfstest-replicate-2: Subvolume 'tcfstest-client-5' came back up; going online.
[2013-03-13 09:53:52.941097] I [client-handshake.c:453:client_set_lk_version_cbk] 0-tcfstest-client-5: Server lk version = 1
[2013-03-13 09:53:52.941334] I [client-handshake.c:1636:select_server_supported_programs] 0-tcfstest-client-4: Using Program GlusterFS 3.3.0, Num (1298437), Version (330)
[2013-03-13 09:53:52.944357] I [client-handshake.c:1433:client_setvolume_cbk] 0-tcfstest-client-4: Connected to 125.210.140.19:24011, attached to remote volume '/mnt/p1/exp'.
[2013-03-13 09:53:52.944381] I [client-handshake.c:1445:client_setvolume_cbk] 0-tcfstest-client-4: Server and Client lk-version numbers are not same, reopening the fds
[2013-03-13 09:53:52.951156] I [client-handshake.c:453:client_set_lk_version_cbk] 0-tcfstest-client-4: Server lk version = 1
[2013-03-13 09:53:52.951332] I [afr-common.c:1964:afr_set_root_inode_on_first_lookup] 0-tcfstest-replicate-0: added root inode
[2013-03-13 09:53:52.951889] I [afr-common.c:1964:afr_set_root_inode_on_first_lookup] 0-tcfstest-replicate-1: added root inode
[2013-03-13 09:53:52.952169] I [afr-common.c:1964:afr_set_root_inode_on_first_lookup] 0-tcfstest-replicate-2: added root inode
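For completeness, the volfile above describes a distribute-over-replicate (3x2) layout. It could have been produced by a create command along the following lines (a reconstruction from the volfile's brick list, not necessarily the command actually used):

```shell
# Reconstructed from the volfile: three replica-2 pairs distributed
# into one volume, using the hosts and brick paths listed above.
gluster volume create tcfstest replica 2 \
    125.210.140.17:/mnt/p1/exp 125.210.140.18:/mnt/p1/exp \
    125.210.140.17:/mnt/p2/exp 125.210.140.18:/mnt/p2/exp \
    125.210.140.19:/mnt/p1/exp 125.210.140.20:/mnt/p1/exp
```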



Pippo

From: Pranith Kumar K
Date: 2013-03-26 01:31
To: pippo0805
CC: gluster-users
Subject: Re: [Gluster-users] A problem when mount glusterfs via NFS
On 03/25/2013 08:23 AM, Pippo wrote:

Hi:

I run GlusterFS on four nodes in a 2x2 Distributed-Replicate layout.
I mounted it via FUSE and ran some tests; everything was fine.
However, when I mounted it via NFS, I hit a problem:
    
    When I copied 200 GB of files to the GlusterFS volume, the glusterfs process on the server node (the one the client mounts) was killed by the OOM killer,
and all terminals on the client hung. I repeated the test many times with the same result. The heavier the load I pushed from the
client, the faster the glusterfs process was killed. Running "top" on the server, I found that the glusterfs process consumed memory very quickly and
never released it until it was killed. I think this is a bug in the glusterfs process: it leaks memory.
   
I googled "glusterfs OOM" but could not find any solution. Does anyone know about this problem and can give me some tips? Many thanks!
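A simple way to record that memory growth over time, rather than eyeballing "top", is to sample VmRSS from /proc. A sketch (here it watches its own shell as a stand-in; the pgrep pattern in the comment is an assumption about how the NFS glusterfs process appears in ps):

```shell
# Log the resident memory of a process every second so the growth can
# be charted later. We watch our own shell ($$) as a stand-in; on the
# NFS server you would substitute the glusterfs NFS process, e.g.
#   PID=$(pgrep -f 'glusterfs.*nfs')
# (the pgrep pattern is an assumption about the process's command line).
PID=$$
for i in 1 2 3; do
    # VmRSS is the resident set size in kB as reported by the kernel
    RSS=$(awk '/^VmRSS/ {print $2}' "/proc/$PID/status")
    echo "$(date +%T) pid=$PID rss=${RSS}kB"
    sleep 1
done
```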




Pippo

 

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Hi,
        Could you let us know which version of glusterfs you were using? The NFS server logs from that run would also help us, if you could attach them to this mail.

Pranith