[Gluster-devel] QEMU (and other libgfapi clients?) crashes on add-brick / replace-brick

Guido De Rosa guido.derosa at vemarsas.it
Mon Jul 29 08:51:35 UTC 2013


Apparently the problem isn't fixed... even when qemu doesn't crash, the
guest reports many I/O errors and becomes unusable, just as a real machine
would if you physically removed the hard drive, I guess...
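
(For what it's worth, QEMU's drive error policy might at least keep the guest
paused instead of surfacing the errors -- something along these lines, assuming
werror/rerror behave the same with the gluster:// backend; I haven't verified
that it copes any better with add-brick:

  -drive file=gluster://localhost/gvtest/QEMU/deb-on-gluster/disk0.qcow2,if=virtio,media=disk,cache=unsafe,serial=QME06DC28000,index=0,werror=stop,rerror=stop

With werror=stop,rerror=stop the VM should simply be paused on the first failed
request and can be resumed with "cont" from the monitor once the volume is
healthy again.)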

I'm doing more tests anyway and will post a much more detailed report as
soon as I can. Thanks for now.
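
(In case anyone else wants to try the two changes, they should be fetchable
straight from Gerrit, roughly like this -- the "/1" patchset suffix is only a
guess, use whatever revision the review.gluster.org "Download" box shows:

  cd glusterfs
  git fetch http://review.gluster.org/glusterfs refs/changes/78/5378/1 && git cherry-pick FETCH_HEAD
  git fetch http://review.gluster.org/glusterfs refs/changes/07/5407/1 && git cherry-pick FETCH_HEAD

then rebuild/reinstall glusterfs and libgfapi as usual.)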

Guido


2013/7/28 Anand Avati <anand.avati at gmail.com>

> Guido,
> You need a couple of fixes:
>
> http://review.gluster.org/5378 - to fix the root cause of the failure
> (portmap failure)
> http://review.gluster.org/5407 - to prevent a crash in case of such
> failures.
>
> Can you please apply the patches and confirm if they fix your issue?
>
> Thanks,
> Avati
>
>
> On Fri, Jul 26, 2013 at 4:02 PM, Guido De Rosa <guido.derosa at vemarsas.it> wrote:
>
>> 2013/7/26 Anand Avati <anand.avati at gmail.com>:
>> > Can you please post the backtrace and logs from the crash?
>>
>> # gdb
>> ...
>> (gdb) file /usr/local/bin/qemu-system-x86_64
>> Reading symbols from /usr/local/bin/qemu-system-x86_64...done.
>> (gdb) run -uuid "e06dc280-d74a-0130-49e3-003018a4d17c" -name
>> "deb-on-gluster" -m "2048" -vnc ":1" -k "it" -pidfile
>> "/var/run/onboard/qemu-e06dc280.pid" -monitor
>> unix:"/var/run/onboard/qemu-e06dc280.sock",server,nowait -smp 2
>> -device piix3-usb-uhci,id=piix3-uhci -device usb-ehci,id=ehci -device
>> usb-tablet,bus=piix3-uhci.0 -drive
>> serial="QME06DC28000",if="virtio",media="disk",cache="unsafe",file="gluster://localhost/gvtest/QEMU/deb-on-gluster/disk0.qcow2",index=0
>> -drive serial="QME06DC28001",if="ide",media="cdrom",bus=1,unit=0 -net
>> nic,vlan=0,model=virtio,macaddr=DE:AD:BE:6C:CF:30 -net
>> tap,vlan=0,ifname=vDebOnGluster_0,script=no,downscript=no -boot
>> menu=on,order=dc -cpu host -enable-kvm -runas onboard
>>
>> ...
>>
>> On another terminal:
>>
>> # gluster volume add-brick gvtest replica 2
>> 192.168.232.101:/var/export/gluster/gvtest
>> volume add-brick: success
>>
>> And here is the crash:
>>
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>> [New Thread 0x7fffe5dad700 (LWP 8611)]
>> [New Thread 0x7fffe55ac700 (LWP 8618)]
>> [New Thread 0x7fffe4548700 (LWP 8621)]
>> [New Thread 0x7fffe3b33700 (LWP 8622)]
>> [New Thread 0x7fffdbdfd700 (LWP 8623)]
>> [New Thread 0x7fffdb5fc700 (LWP 8624)]
>> [New Thread 0x7fffd9dff700 (LWP 8626)]
>> [New Thread 0x7fffc7fff700 (LWP 8627)]
>> [New Thread 0x7fffc63fe700 (LWP 8628)]
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 0x7fffe5dad700 (LWP 8611)]
>> glfs_subvol_done (fs=0x5555566cfc20, subvol=subvol@entry=0x0) at
>> glfs-resolve.c:802
>> 802 glfs-resolve.c: No such file or directory.
>> (gdb)
>> (gdb) bt
>> #0  glfs_subvol_done (fs=0x5555566cfc20, subvol=subvol@entry=0x0) at
>> glfs-resolve.c:802
>> #1  0x00007ffff70aff50 in glfs_pwritev (glfd=0x5555566d79b0,
>> iovec=<optimized out>,
>>     iovcnt=<optimized out>, offset=8368975872, flags=0) at glfs-fops.c:761
>> #2  0x00007ffff70b04e7 in glfs_io_async_task (data=<optimized out>) at
>> glfs-fops.c:584
>> #3  0x00007ffff0080b22 in synctask_wrap (old_task=<optimized out>) at
>> syncop.c:131
>> #4  0x00007ffff0541710 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
>> #5  0x0000000000000000 in ?? ()
>>
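>> (If it's useful I can grab a fuller trace with all threads, e.g.:
>>
>>   (gdb) set pagination off
>>   (gdb) thread apply all bt full
>>
>> and attach the output.)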
>>
>> Log: /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
>>
>> [2013-07-26 22:27:45.877325] I [glusterd-brick-ops.c:370:__glusterd_handle_add_brick] 0-management: Received add brick req
>> [2013-07-26 22:27:45.877398] I [glusterd-brick-ops.c:417:__glusterd_handle_add_brick] 0-management: replica-count is 2
>> [2013-07-26 22:27:45.877415] I [glusterd-brick-ops.c:193:gd_addbr_validate_replica_count] 0-management: Changing the type of volume gvtest from 'distribute' to 'replica'
>> [2013-07-26 22:27:45.879877] I [glusterd-brick-ops.c:894:glusterd_op_perform_add_bricks] 0-management: replica-count is set 2
>> [2013-07-26 22:27:45.879907] I [glusterd-brick-ops.c:898:glusterd_op_perform_add_bricks] 0-management: type is set 2, need to change it
>> [2013-07-26 22:27:45.882314] I [glusterd-utils.c:954:glusterd_volume_brickinfo_get] 0-management: Found brick
>> [2013-07-26 22:27:45.882349] I [glusterd-utils.c:954:glusterd_volume_brickinfo_get] 0-management: Found brick
>> [2013-07-26 22:27:45.883952] I [glusterd-utils.c:954:glusterd_volume_brickinfo_get] 0-management: Found brick
>> [2013-07-26 22:27:45.883978] I [glusterd-utils.c:954:glusterd_volume_brickinfo_get] 0-management: Found brick
>> [2013-07-26 22:27:45.884704] I [glusterd-utils.c:954:glusterd_volume_brickinfo_get] 0-management: Found brick
>> [2013-07-26 22:27:46.617898] I [glusterd-utils.c:954:glusterd_volume_brickinfo_get] 0-management: Found brick
>> [2013-07-26 22:27:46.617945] I [glusterd-utils.c:954:glusterd_volume_brickinfo_get] 0-management: Found brick
>> [2013-07-26 22:27:47.618736] E [glusterd-utils.c:3627:glusterd_nodesvc_unlink_socket_file] 0-management:
>> [2013-07-26 22:27:47.619092] I [glusterd-utils.c:3661:glusterd_nfs_pmap_deregister] 0-: De-registered MOUNTV3 successfully
>> [2013-07-26 22:27:47.619386] I [glusterd-utils.c:3666:glusterd_nfs_pmap_deregister] 0-: De-registered MOUNTV1 successfully
>> [2013-07-26 22:27:47.619682] I [glusterd-utils.c:3671:glusterd_nfs_pmap_deregister] 0-: De-registered NFSV3 successfully
>> [2013-07-26 22:27:47.620004] I [glusterd-utils.c:3676:glusterd_nfs_pmap_deregister] 0-: De-registered NLM v4 successfully
>> [2013-07-26 22:27:47.620343] I [glusterd-utils.c:3681:glusterd_nfs_pmap_deregister] 0-: De-registered NLM v1 successfully
>> [2013-07-26 22:27:47.622905] I [rpc-clnt.c:962:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>> [2013-07-26 22:27:47.622988] I [socket.c:3480:socket_init] 0-management: SSL support is NOT enabled
>> [2013-07-26 22:27:47.623003] I [socket.c:3495:socket_init] 0-management: using system polling thread
>> [2013-07-26 22:27:48.650339] E [glusterd-utils.c:3627:glusterd_nodesvc_unlink_socket_file] 0-management: Failed to remove /var/run/52b6ed075a07af2e9235e49dd9d214af.socket error: No such file or directory
>> [2013-07-26 22:27:48.652919] I [rpc-clnt.c:962:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>> [2013-07-26 22:27:48.653002] I [socket.c:3480:socket_init] 0-management: SSL support is NOT enabled
>> [2013-07-26 22:27:48.653017] I [socket.c:3495:socket_init] 0-management: using system polling thread
>> [2013-07-26 22:27:48.653215] W [socket.c:514:__socket_rwv] 0-management: readv failed (No data available)
>> [2013-07-26 22:27:48.653333] I [mem-pool.c:541:mem_pool_destroy] 0-management: size=2236 max=0 total=0
>> [2013-07-26 22:27:48.653354] I [mem-pool.c:541:mem_pool_destroy] 0-management: size=124 max=0 total=0
>> [2013-07-26 22:27:48.653398] I [socket.c:2236:socket_event_handler] 0-transport: disconnecting now
>> [2013-07-26 22:27:48.653453] W [socket.c:514:__socket_rwv] 0-management: readv failed (No data available)
>> [2013-07-26 22:27:48.653478] I [mem-pool.c:541:mem_pool_destroy] 0-management: size=2236 max=0 total=0
>> [2013-07-26 22:27:48.653493] I [mem-pool.c:541:mem_pool_destroy] 0-management: size=124 max=0 total=0
>> [2013-07-26 22:27:48.653537] I [socket.c:2236:socket_event_handler] 0-transport: disconnecting now
>> [2013-07-26 22:27:48.680240] E [rpcsvc.c:519:rpcsvc_handle_rpc_call] 0-glusterd: Request received from non-privileged port. Failing request
>> [2013-07-26 22:27:52.093567] E [rpcsvc.c:519:rpcsvc_handle_rpc_call] 0-glusterd: Request received from non-privileged port. Failing request
>> [2013-07-26 22:27:55.094469] E [rpcsvc.c:519:rpcsvc_handle_rpc_call] 0-glusterd: Request received from non-privileged port. Failing request
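>>
>> (The "Request received from non-privileged port. Failing request" entries are
>> probably just because qemu runs unprivileged here (-runas onboard), so gfapi
>> connects from a port above 1024. If so, I guess the usual knobs would silence
>> them, though I doubt they explain the crash itself:
>>
>>   gluster volume set gvtest server.allow-insecure on
>>
>> plus "option rpc-auth-allow-insecure on" in glusterd.vol, followed by a
>> glusterd restart -- both untested guesses on my part.)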
>>
>> --
>>
>> G.
>>
>
>