[Gluster-users] Errors during dbench run (rename failed)

Marc Seeger marc.seeger at acquia.com
Sun Mar 17 13:25:53 UTC 2013


Hi,

We just ran into dbench dying during one of our test runs.
We run one dbench instance on each of 2 machines.
We use the following parameters: dbench 6 -t 60 -D $DIRECTORY (the directory is host-specific, so each host writes to its own)
The directories are on a mountpoint connected using glusterfs 3.3.1 (3.3.1-ubuntu1~lucid8 from https://launchpad.net/~semiosis/+archive/ubuntu-glusterfs-3.3)

This is how dbench died:

I, [2013-03-16T05:34:03.176890 #13121]  INFO -- : [710] rename /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP failed (No such file or directory) - expected NT_STATUS_OK


These are the logs from around that time. They are a bit noisy; the matching message is emphasised using *****:

[2013-03-16 05:34:03.082813] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f274
[2013-03-16 05:34:03.082813] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.082813] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP (b49d6051-93f6-4eca-b161-865a5bea964b)
[2013-03-16 05:34:03.082813] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f4cc
[2013-03-16 05:34:03.082813] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f6c0
[2013-03-16 05:34:03.082813] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f468
[2013-03-16 05:34:03.082813] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f33c
[2013-03-16 05:34:03.082813] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.092814] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/ZD16.BMP (73e3b099-48cd-4e76-8049-c64bf8f63500)
[2013-03-16 05:34:03.092814] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT (ba53fb9f-0648-4794-aaa9-bba9331b52cb)
[2013-03-16 05:34:03.092814] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PCBENCHM.PPT (a0c96e9a-4d4a-4984-9892-ff0b2ecbb7e3)
[2013-03-16 05:34:03.092814] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client4/~dmtmp/PWRPNT/PPTB1E4.TMP (2b8f1677-6376-4286-a381-8f4897bc9f4a)
[2013-03-16 05:34:03.092814] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f594
[2013-03-16 05:34:03.092814] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f3a0
[2013-03-16 05:34:03.092814] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f2d8
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client4/~dmtmp/PWRPNT/ZD16.BMP (eafa5f6a-fe12-4b9c-a5b9-386f2ff2123f)
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client4/~dmtmp/PWRPNT/NEWPCB.PPT (8c99ede1-3782-49f0-b544-00f4ec3beb9b)
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client4/~dmtmp/PWRPNT/PCBENCHM.PPT (a725ede8-bc10-42a1-9622-55afad13f9f7)
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.132819] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.132819] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.132819] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.132819] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.142820] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT (ba53fb9f-0648-4794-aaa9-bba9331b52cb)
[2013-03-16 05:34:03.142820] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f788
[2013-03-16 05:34:03.142820] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT (ba53fb9f-0648-4794-aaa9-bba9331b52cb)
[2013-03-16 05:34:03.142820] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.142820] W [client3_1-fops.c:2546:client3_1_opendir_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT (6512393c-65b8-4d86-ae78-8a12eb2be395)
[2013-03-16 05:34:03.172824] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory

***********************
[2013-03-16 05:34:03.172824] W [fuse-bridge.c:1516:fuse_rename_cbk] 0-glusterfs-fuse: 11218: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT -> /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP => -1 (No such file or directory)
***********************

[2013-03-16 05:34:03.232831] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f404
[2013-03-16 05:34:03.242832] I [afr-open.c:318:afr_openfd_fix_open_cbk] 0-replicate0: fd for /something.example.com_1363412031/clients/client0/~dmtmp/PWRPNT/PCBENCHM.PPT opened successfully on subvolume remote9
[2013-03-16 05:34:03.252834] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f5f8
[2013-03-16 05:34:03.262835] I [afr-open.c:318:afr_openfd_fix_open_cbk] 0-replicate0: fd for /something.example.com_1363412031/clients/client3/~dmtmp/PWRPNT/PCBENCHM.PPT opened successfully on subvolume remote9
[2013-03-16 05:36:21.547011] C [client-handshake.c:126:rpc_client_ping_timer_expired] 0-remote8: server 10.245.15.65:24007 has not responded in the last 42 seconds, disconnecting.
[2013-03-16 05:36:21.547011] C [client-handshake.c:126:rpc_client_ping_timer_expired] 0-remote9: server 10.196.239.242:24007 has not responded in the last 42 seconds, disconnecting.
[2013-03-16 05:36:21.547011] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f1adab0a048] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f1adab09d00] (-->/usr/lib/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f1adab0976e]))) 0-remote8: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2013-03-16 05:35:10.750385 (xid=0x18942x)
[2013-03-16 05:36:21.547011] W [client3_1-fops.c:2630:client3_1_lookup_cbk] 0-remote8: remote operation failed: Transport endpoint is not connected. Path: / (00000000-0000-0000-0000-000000000001)
[2013-03-16 05:36:21.547011] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f1adab0a048] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f1adab09d00] (-->/usr/lib/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f1adab0976e]))) 0-remote8: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2013-03-16 05:35:18.191110 (xid=0x18943x)
[2013-03-16 05:36:21.547011] W [client3_1-fops.c:2630:client3_1_lookup_cbk] 0-remote8: remote operation failed: Transport endpoint is not connected. Path: / (00000000-0000-0000-0000-000000000001)
[2013-03-16 05:36:21.547011] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f1adab0a048] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f1adab09d00] (-->/usr/lib/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f1adab0976e]))) 0-remote8: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2013-03-16 05:35:39.543151 (xid=0x18944x)
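
Since the log is noisy, here is a minimal sketch of how it could be narrowed down to the warnings, errors and critical messages that mention one file. The line layout assumed by the regular expression is inferred only from the excerpt above, and the names LOG_LINE, parse_line and filter_log are made up for this example; they are not part of GlusterFS.

```python
import re

# Assumed layout, inferred from the excerpt above (not from GlusterFS docs):
#   [timestamp] LEVEL [source.c:line:function] rest-of-message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<lineno>\d+):(?P<func>[^\]]+)\] "
    r"(?P<rest>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def filter_log(lines, needle, levels=("W", "E", "C")):
    """Keep parsed lines whose level is in `levels` and whose message contains `needle`."""
    hits = []
    for line in lines:
        record = parse_line(line)
        if record and record["level"] in levels and needle in record["rest"]:
            hits.append(record)
    return hits


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python filter_log.py NEWPCB.PPT < client.log
    for record in filter_log(sys.stdin, sys.argv[1]):
        print(record["ts"], record["level"], record["func"], record["rest"])
```

Filtering on NEWPCB.PPT or PPTB1E4.TMP, for example, leaves only the open_cbk warnings and the fuse_rename_cbk failure for those paths.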


Anybody have an idea what could cause such errors?
The rpc_client_ping_timer_expired timeouts seem a bit strange. They appear about two minutes after the failure, and we do test networking problems in an earlier test, so they might just be left over from that.
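
As a sanity check on the ordering, the timestamps quoted in the log can be subtracted directly. This is only a sketch; the variable names are made up for the example, and it shows the order of events in this one client log, not what caused them.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"


def ts(text):
    """Parse a timestamp in the format used by the log excerpt."""
    return datetime.strptime(text, FMT)


# All four values are copied from the log lines quoted above.
rename_failed = ts("2013-03-16 05:34:03.172824")  # fuse_rename_cbk warning
oldest_lookup = ts("2013-03-16 05:35:10.750385")  # first unwound LOOKUP, "called at"
ping_sent = ts("2013-03-16 05:35:39.543151")      # unwound PING, "called at"
ping_expired = ts("2013-03-16 05:36:21.547011")   # rpc_client_ping_timer_expired

fail_to_lookup = (oldest_lookup - rename_failed).total_seconds()
fail_to_ping = (ping_sent - rename_failed).total_seconds()
ping_wait = (ping_expired - ping_sent).total_seconds()

print(f"rename failure -> oldest stuck LOOKUP: {fail_to_lookup:.3f} s")
print(f"rename failure -> unanswered PING:     {fail_to_ping:.3f} s")
print(f"unanswered PING -> timer expiry:       {ping_wait:.3f} s")
```

The last difference comes out at roughly 42 seconds, matching the "has not responded in the last 42 seconds" message, and both the oldest stuck LOOKUP and the unanswered PING were issued more than a minute after the rename had already failed.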


Cheers,
Marc

