[Gluster-users] libgfapi failover problem on replica bricks
Roman
romeo.r at gmail.com
Wed Aug 6 06:00:58 UTC 2014
Also, this time the files are not the same!
root@stor1:~# md5sum /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
32411360c53116b96a059f17306caeda  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
root@stor2:~# md5sum /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
65b8a6031bcb6f5fb3a11cb1e8b1c9c9  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
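
The comparison above was done by eye. A minimal sketch of doing the same check programmatically, assuming the `md5sum` output from each storage host has already been collected as text (the function names and the host-to-output mapping are hypothetical, not part of any GlusterFS tool):

```python
def parse_md5sum(output):
    """Map each path to its digest from `md5sum` output text."""
    digests = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # md5sum prints "<digest>  <path>" (or "<digest> *<path>" in binary mode)
        digest, path = line.split(None, 1)
        digests[path.lstrip("*")] = digest
    return digests


def mismatched_paths(outputs_by_host):
    """Return {path: {host: digest}} for paths whose digests differ between hosts."""
    per_path = {}
    for host, output in outputs_by_host.items():
        for path, digest in parse_md5sum(output).items():
            per_path.setdefault(path, {})[host] = digest
    return {
        path: hosts
        for path, hosts in per_path.items()
        if len(set(hosts.values())) > 1
    }


if __name__ == "__main__":
    image = "/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2"
    outputs = {
        "stor1": "32411360c53116b96a059f17306caeda  " + image,
        "stor2": "65b8a6031bcb6f5fb3a11cb1e8b1c9c9  " + image,
    }
    for path, hosts in mismatched_paths(outputs).items():
        print("replicas differ:", path, hosts)
```

Note that this only flags paths reported by more than one host with different digests; a file that is missing on one brick would need a separate check.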
2014-08-05 16:33 GMT+03:00 Roman <romeo.r at gmail.com>:
> Nope, it is not working. But this time it went a bit differently.
>
> root@gluster-client:~# dmesg
> Segmentation fault
>
>
> I was not even able to start the VM after I had run the tests:
>
> Could not read qcow2 header: Operation not permitted
>
> And it seems it never starts to sync files after the first disconnect. The VM
> survives the first disconnect, but not the second (I waited around 30 minutes).
> Also, I've got network.ping-timeout: 2 in the volume settings, but the logs react
> to the first disconnect only after around 30 seconds. The second was faster, 2 seconds.
>
> Reaction was different also:
>
> slower one:
> [2014-08-05 13:26:19.558435] W [socket.c:514:__socket_rwv] 0-glusterfs:
> readv failed (Connection timed out)
> [2014-08-05 13:26:19.558485] W
> [socket.c:1962:__socket_proto_state_machine] 0-glusterfs: reading from
> socket failed. Error (Connection timed out), peer (10.250.0.1:24007)
> [2014-08-05 13:26:21.281426] W [socket.c:514:__socket_rwv]
> 0-HA-fast-150G-PVE1-client-0: readv failed (Connection timed out)
> [2014-08-05 13:26:21.281474] W
> [socket.c:1962:__socket_proto_state_machine] 0-HA-fast-150G-PVE1-client-0:
> reading from socket failed. Error (Connection timed out), peer (
> 10.250.0.1:49153)
> [2014-08-05 13:26:21.281507] I [client.c:2098:client_rpc_notify]
> 0-HA-fast-150G-PVE1-client-0: disconnected
>
> the fast one:
> [2014-08-05 12:52:44.607389] C
> [client-handshake.c:127:rpc_client_ping_timer_expired]
> 0-HA-fast-150G-PVE1-client-1: server 10.250.0.2:49153 has not responded
> in the last 2 seconds, disconnecting.
> [2014-08-05 12:52:44.607491] W [socket.c:514:__socket_rwv]
> 0-HA-fast-150G-PVE1-client-1: readv failed (No data available)
> [2014-08-05 12:52:44.607585] E [rpc-clnt.c:368:saved_frames_unwind]
> (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)
> [0x7fcb1b4b0558]
> (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)
> [0x7fcb1b4aea63]
> (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)
> [0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced unwinding frame
> type(GlusterFS 3.3) op(LOOKUP(27)) called at 2014-08-05 12:52:42.463881
> (xid=0x381883x)
> [2014-08-05 12:52:44.607604] W
> [client-rpc-fops.c:2624:client3_3_lookup_cbk] 0-HA-fast-150G-PVE1-client-1:
> remote operation failed: Transport endpoint is not connected. Path: /
> (00000000-0000-0000-0000-000000000001)
> [2014-08-05 12:52:44.607736] E [rpc-clnt.c:368:saved_frames_unwind]
> (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)
> [0x7fcb1b4b0558]
> (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)
> [0x7fcb1b4aea63]
> (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)
> [0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced unwinding frame
> type(GlusterFS Handshake) op(PING(3)) called at 2014-08-05 12:52:42.463891
> (xid=0x381884x)
> [2014-08-05 12:52:44.607753] W [client-handshake.c:276:client_ping_cbk]
> 0-HA-fast-150G-PVE1-client-1: timer must have expired
> [2014-08-05 12:52:44.607776] I [client.c:2098:client_rpc_notify]
> 0-HA-fast-150G-PVE1-client-1: disconnected
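
The "fast" case can be checked against the configured network.ping-timeout directly from the timestamps quoted above: the PING frame was "called at" 12:52:42.463891 and the ping timer expired at 12:52:44.607389. A small sketch of that arithmetic (the helper name is hypothetical; the timestamp format is assumed from the log lines in this mail):

```python
from datetime import datetime

# Timestamp layout as it appears in the quoted GlusterFS log lines.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def seconds_between(start, end):
    """Elapsed seconds between two log timestamps given as strings."""
    began = datetime.strptime(start, LOG_TIME_FORMAT)
    ended = datetime.strptime(end, LOG_TIME_FORMAT)
    return (ended - began).total_seconds()


if __name__ == "__main__":
    ping_sent = "2014-08-05 12:52:42.463891"      # "called at" value of the PING frame
    timer_expired = "2014-08-05 12:52:44.607389"  # rpc_client_ping_timer_expired line
    print("ping timer fired after %.3f s" % seconds_between(ping_sent, timer_expired))
```

This comes out at a little over 2 seconds, which is consistent with network.ping-timeout: 2. The "slower" excerpt shows socket-level "Connection timed out" errors and contains no matching "called at" value, so the same calculation cannot be done for it from the lines quoted here.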
>
>
>
> I've got SSD disks (just for info).
> Should I go and give 3.5.2 a try?
>
>
>
> 2014-08-05 13:06 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>
>> Reply along with gluster-users please :-). Maybe you are hitting 'reply'
>> instead of 'reply all'?
>>
>> Pranith
>>
>> On 08/05/2014 03:35 PM, Roman wrote:
>>
>> To be sure and keep things clean, I've created another VM with raw format and am going
>> to repeat those steps. So now I've got two VMs, one with qcow2 format and
>> the other with raw format. I will send another e-mail shortly.
>>
>>
>> 2014-08-05 13:01 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>
>>>
>>> On 08/05/2014 03:07 PM, Roman wrote:
>>>
>>> Really, it seems like the same file:
>>>
>>> stor1:
>>> a951641c5230472929836f9fcede6b04  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>>
>>> stor2:
>>> a951641c5230472929836f9fcede6b04  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>>
>>>
>>> One thing I've seen in the logs: is Proxmox VE somehow connecting
>>> to the servers with the wrong version?
>>> [2014-08-05 09:23:45.218550] I
>>> [client-handshake.c:1659:select_server_supported_programs]
>>> 0-HA-fast-150G-PVE1-client-0: Using Program GlusterFS 3.3, Num (1298437),
>>> Version (330)
>>>
>>> It is the RPC (over-the-network data structures) version, which has not
>>> changed at all since 3.3, so that's not a problem. So what is the conclusion?
>>> Is your test case working now or not?
>>>
>>> Pranith
>>>
>>> But if I issue:
>>> root@pve1:~# glusterfs -V
>>> glusterfs 3.4.4 built on Jun 28 2014 03:44:57
>>> it seems OK.
>>>
>>> The servers use 3.4.4 meanwhile:
>>> [2014-08-05 09:23:45.117875] I [server-handshake.c:567:server_setvolume]
>>> 0-HA-fast-150G-PVE1-server: accepted client from
>>> stor1-9004-2014/08/05-09:23:45:93538-HA-fast-150G-PVE1-client-1-0 (version:
>>> 3.4.4)
>>> [2014-08-05 09:23:49.103035] I
>>> [server-handshake.c:567:server_setvolume] 0-HA-fast-150G-PVE1-server:
>>> accepted client from
>>> stor1-8998-2014/08/05-09:23:45:89883-HA-fast-150G-PVE1-client-0-0 (version:
>>> 3.4.4)
>>>
>>> That is, if this could be the reason, of course.
>>> I did restart Proxmox VE yesterday (just for information).
>>>
>>>
>>>
>>>
>>>
>>> 2014-08-05 12:30 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>
>>> :
>>>
>>>>
>>>> On 08/05/2014 02:33 PM, Roman wrote:
>>>>
>>>> I've waited long enough for now; still different sizes and no logs about
>>>> healing :(
>>>>
>>>> stor1
>>>> # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>>> trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
>>>> trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
>>>> trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
>>>>
>>>> root@stor1:~# du -sh /exports/fast-test/150G/images/127/
>>>> 1.2G /exports/fast-test/150G/images/127/
>>>>
>>>>
>>>> stor2
>>>> # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>>> trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
>>>> trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
>>>> trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
>>>>
>>>>
>>>> root@stor2:~# du -sh /exports/fast-test/150G/images/127/
>>>> 1.4G /exports/fast-test/150G/images/127/
>>>>
>>>> According to the changelogs, the file doesn't need any healing. Could
>>>> you stop the operations on the VMs and take md5sum on both these machines?
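
The all-zero trusted.afr.* values above are what "doesn't need any healing" refers to. A hedged sketch of reading them, assuming the commonly described layout of these extended attributes (12 bytes holding three 32-bit big-endian pending counters for data, metadata and entry operations); the helper names are hypothetical and the input is the hex text printed by 'getfattr -d -m. -e hex':

```python
import struct


def decode_afr_changelog(hex_value):
    """Split a trusted.afr.* hex value into data/metadata/entry counters.

    Assumes 12 bytes: three unsigned 32-bit big-endian integers.
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def needs_heal(xattrs):
    """True if any trusted.afr.* counter in the mapping is non-zero."""
    return any(
        any(decode_afr_changelog(value).values())
        for name, value in xattrs.items()
        if name.startswith("trusted.afr.")
    )


if __name__ == "__main__":
    brick_xattrs = {
        "trusted.afr.HA-fast-150G-PVE1-client-0": "0x000000000000000000000000",
        "trusted.afr.HA-fast-150G-PVE1-client-1": "0x000000000000000000000000",
        "trusted.gfid": "0xf10ad81b58484bcd9b385a36a207f921",
    }
    print("pending heal:", needs_heal(brick_xattrs))
```

With the values from both bricks every counter is zero, so by this reading nothing is marked as pending, even though the on-disk sizes differ.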
>>>>
>>>> Pranith
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 2014-08-05 11:49 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com
>>>> >:
>>>>
>>>>>
>>>>> On 08/05/2014 02:06 PM, Roman wrote:
>>>>>
>>>>> Well, it seems like it doesn't see that changes were made to the volume?
>>>>> I created two files, 200 MB and 100 MB (from /dev/zero), after I disconnected
>>>>> the first brick. Then I connected it back and got these logs:
>>>>>
>>>>> [2014-08-05 08:30:37.830150] I
>>>>> [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk] 0-glusterfs: No change in
>>>>> volfile, continuing
>>>>> [2014-08-05 08:30:37.830207] I [rpc-clnt.c:1676:rpc_clnt_reconfig]
>>>>> 0-HA-fast-150G-PVE1-client-0: changing port to 49153 (from 0)
>>>>> [2014-08-05 08:30:37.830239] W [socket.c:514:__socket_rwv]
>>>>> 0-HA-fast-150G-PVE1-client-0: readv failed (No data available)
>>>>> [2014-08-05 08:30:37.831024] I
>>>>> [client-handshake.c:1659:select_server_supported_programs]
>>>>> 0-HA-fast-150G-PVE1-client-0: Using Program GlusterFS 3.3, Num (1298437),
>>>>> Version (330)
>>>>> [2014-08-05 08:30:37.831375] I
>>>>> [client-handshake.c:1456:client_setvolume_cbk]
>>>>> 0-HA-fast-150G-PVE1-client-0: Connected to 10.250.0.1:49153, attached
>>>>> to remote volume '/exports/fast-test/150G'.
>>>>> [2014-08-05 08:30:37.831394] I
>>>>> [client-handshake.c:1468:client_setvolume_cbk]
>>>>> 0-HA-fast-150G-PVE1-client-0: Server and Client lk-version numbers are not
>>>>> same, reopening the fds
>>>>> [2014-08-05 08:30:37.831566] I
>>>>> [client-handshake.c:450:client_set_lk_version_cbk]
>>>>> 0-HA-fast-150G-PVE1-client-0: Server lk version = 1
>>>>>
>>>>>
>>>>> [2014-08-05 08:30:37.830150] I
>>>>> [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk] 0-glusterfs: No change in
>>>>> volfile, continuing
>>>>> This line seems weird to me, to be honest.
>>>>> I do not see any traffic on the switch interfaces between the gluster servers,
>>>>> which means there is no syncing between them.
>>>>> I tried to ls -l the files on the client and servers to trigger the
>>>>> healing, but with no success. Should I wait more?
>>>>>
>>>>> Yes, it should take around 10-15 minutes. Could you provide the output of 'getfattr
>>>>> -d -m. -e hex <file-on-brick>' from both bricks?
>>>>>
>>>>> Pranith
>>>>>
>>>>>
>>>>>
>>>>> 2014-08-05 11:25 GMT+03:00 Pranith Kumar Karampuri <
>>>>> pkarampu at redhat.com>:
>>>>>
>>>>>>
>>>>>> On 08/05/2014 01:10 PM, Roman wrote:
>>>>>>
>>>>>> Aha! For some reason I was not able to start the VM anymore; Proxmox
>>>>>> VE told me that it was not able to read the qcow2 header because permission
>>>>>> was denied. So I just deleted that file and created a new
>>>>>> VM. And the next message I got was this:
>>>>>>
>>>>>> Seems like these are the messages from when you took down the bricks
>>>>>> before self-heal had completed. Could you restart the run, waiting for self-heals to
>>>>>> complete before taking down the next brick?
>>>>>>
>>>>>> Pranith
>>>>>>
>>>>>>
>>>>>>
>>>>>> [2014-08-05 07:31:25.663412] E
>>>>>> [afr-self-heal-common.c:197:afr_sh_print_split_brain_log]
>>>>>> 0-HA-fast-150G-PVE1-replicate-0: Unable to self-heal contents of
>>>>>> '/images/124/vm-124-disk-1.qcow2' (possible split-brain). Please delete the
>>>>>> file from all but the preferred subvolume.- Pending matrix: [ [ 0 60 ] [
>>>>>> 11 0 ] ]
>>>>>> [2014-08-05 07:31:25.663955] E
>>>>>> [afr-self-heal-common.c:2262:afr_self_heal_completion_cbk]
>>>>>> 0-HA-fast-150G-PVE1-replicate-0: background data self-heal failed on
>>>>>> /images/124/vm-124-disk-1.qcow2
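
The "Pending matrix: [ [ 0 60 ] [ 11 0 ] ]" in the message above is usually read as follows: row i holds the pending-operation counts that brick i has recorded against each brick. Here brick 0 blames brick 1 (60) and brick 1 blames brick 0 (11), so neither copy can be picked as the heal source. A minimal sketch of that reading (hypothetical helper names; the input is assumed to be the log line already joined onto one line, without the mail quote markers):

```python
import re


def parse_pending_matrix(text):
    """Extract the rows of the 'Pending matrix' from an AFR log message."""
    _, found, tail = text.partition("Pending matrix:")
    if not found:
        raise ValueError("no 'Pending matrix:' marker in text")
    # Each inner "[ n n ... ]" group is one row of counters.
    rows = re.findall(r"\[\s*((?:\d+\s*)+)\]", tail)
    return [[int(cell) for cell in row.split()] for row in rows]


def mutual_blame_pairs(matrix):
    """Return (i, j) brick index pairs where each brick blames the other."""
    pairs = []
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            if matrix[i][j] > 0 and matrix[j][i] > 0:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    message = (
        "Unable to self-heal contents of '/images/124/vm-124-disk-1.qcow2' "
        "(possible split-brain). Please delete the file from all but the "
        "preferred subvolume.- Pending matrix: [ [ 0 60 ] [ 11 0 ] ]"
    )
    matrix = parse_pending_matrix(message)
    print(matrix, "mutual blame:", mutual_blame_pairs(matrix))
```

A non-empty result corresponds to the "possible split-brain" wording in the log; a matrix where only one side has non-zero counts would instead point to a normal one-directional heal.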
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2014-08-05 10:13 GMT+03:00 Pranith Kumar Karampuri <
>>>>>> pkarampu at redhat.com>:
>>>>>>
>>>>>>> I just responded to your earlier mail about how the log looks. The
>>>>>>> log appears in the mount's logfile.
>>>>>>>
>>>>>>> Pranith
>>>>>>>
>>>>>>> On 08/05/2014 12:41 PM, Roman wrote:
>>>>>>>
>>>>>>> OK, so I've waited long enough, I think. There was no traffic on the switch
>>>>>>> ports between the servers. I could not find any suitable log message about a
>>>>>>> completed self-heal (waited about 30 minutes). I unplugged the other
>>>>>>> server's UTP cable this time and ended up in the same situation:
>>>>>>> root@gluster-test1:~# cat /var/log/dmesg
>>>>>>> -bash: /bin/cat: Input/output error
>>>>>>>
>>>>>>> brick logs:
>>>>>>> [2014-08-05 07:09:03.005474] I [server.c:762:server_rpc_notify]
>>>>>>> 0-HA-fast-150G-PVE1-server: disconnecting connectionfrom
>>>>>>> pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
>>>>>>> [2014-08-05 07:09:03.005530] I
>>>>>>> [server-helpers.c:729:server_connection_put] 0-HA-fast-150G-PVE1-server:
>>>>>>> Shutting down connection
>>>>>>> pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
>>>>>>> [2014-08-05 07:09:03.005560] I [server-helpers.c:463:do_fd_cleanup]
>>>>>>> 0-HA-fast-150G-PVE1-server: fd cleanup on /images/124/vm-124-disk-1.qcow2
>>>>>>> [2014-08-05 07:09:03.005797] I
>>>>>>> [server-helpers.c:617:server_connection_destroy]
>>>>>>> 0-HA-fast-150G-PVE1-server: destroyed connection of
>>>>>>> pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2014-08-05 9:53 GMT+03:00 Pranith Kumar Karampuri <
>>>>>>> pkarampu at redhat.com>:
>>>>>>>
>>>>>>>> Do you think it is possible for you to do these tests on the
>>>>>>>> latest version, 3.5.2? 'gluster volume heal <volname> info' would give you
>>>>>>>> that information in versions > 3.5.1.
>>>>>>>> Otherwise you will have to check it either from the logs (there
>>>>>>>> will be a self-heal completed message in the mount logs) or by observing
>>>>>>>> 'getfattr -d -m. -e hex <image-file-on-bricks>'.
>>>>>>>>
>>>>>>>> Pranith
>>>>>>>>
>>>>>>>>
>>>>>>>> On 08/05/2014 12:09 PM, Roman wrote:
>>>>>>>>
>>>>>>>> OK, I understand. I will try this shortly.
>>>>>>>> How can I be sure that the healing process is done if I am not able
>>>>>>>> to see its status?
>>>>>>>>
>>>>>>>>
>>>>>>>> 2014-08-05 9:30 GMT+03:00 Pranith Kumar Karampuri <
>>>>>>>> pkarampu at redhat.com>:
>>>>>>>>
>>>>>>>>> Mounts will do the healing, not the self-heal-daemon. The point,
>>>>>>>>> I feel, is that whichever process does the healing must have the latest
>>>>>>>>> information about the good bricks in this use case. Since for the VM use case
>>>>>>>>> the mounts should have the latest information, we should let the mounts do the
>>>>>>>>> healing. If the mount accesses the VM image, either through someone doing
>>>>>>>>> operations inside the VM or through an explicit stat on the file, it should do the
>>>>>>>>> healing.
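
The "explicit stat on the file" mentioned above can be scripted. A minimal sketch, assuming the volume is mounted at some path on the client (the path below is a placeholder, and the helper name is hypothetical); whether a stat from the mount actually starts a heal depends on the GlusterFS version and volume settings, so this only illustrates the access pattern:

```python
import os


def stat_files_under(mount_root):
    """stat() every regular directory entry under mount_root.

    Returns (visited_count, failed_paths). Failures such as
    'Input/output error' are collected instead of aborting the walk.
    """
    visited = 0
    failed = []
    for directory, _subdirs, filenames in os.walk(mount_root):
        for name in filenames:
            path = os.path.join(directory, name)
            try:
                os.stat(path)
                visited += 1
            except OSError as error:
                failed.append((path, error.strerror))
    return visited, failed


if __name__ == "__main__":
    # Placeholder path: replace with the actual client-side mount point.
    count, errors = stat_files_under("/mnt/gluster-volume")
    print("stat ok on %d files, %d failures" % (count, len(errors)))
    for path, reason in errors:
        print("  %s: %s" % (path, reason))
```

Collecting the failures is deliberate: in the runs described in this thread, reads on affected files returned "Input/output error", and those are exactly the paths worth inspecting with getfattr on the bricks.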
>>>>>>>>>
>>>>>>>>> Pranith.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 08/05/2014 10:39 AM, Roman wrote:
>>>>>>>>>
>>>>>>>>> Hmmm, you told me to turn it off. Did I understand something
>>>>>>>>> wrong? After I issued the command you sent me, I was not able to watch
>>>>>>>>> the healing process; it said it won't be healed, because it's turned off.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2014-08-05 5:39 GMT+03:00 Pranith Kumar Karampuri <
>>>>>>>>> pkarampu at redhat.com>:
>>>>>>>>>
>>>>>>>>>> You didn't mention anything about self-healing. Did you wait
>>>>>>>>>> until the self-heal is complete?
>>>>>>>>>>
>>>>>>>>>> Pranith
>>>>>>>>>>
>>>>>>>>>> On 08/04/2014 05:49 PM, Roman wrote:
>>>>>>>>>>
>>>>>>>>>> Hi!
>>>>>>>>>> The result is pretty much the same. I set the switch port down for the 1st server;
>>>>>>>>>> it was OK. Then I set it back up and set the other server's port down, and it
>>>>>>>>>> triggered I/O errors on two virtual machines: one with a local root FS but
>>>>>>>>>> network-mounted storage, and the other with a network root FS. The 1st gave an error
>>>>>>>>>> on copying to or from the mounted network disk; the other gave me an error
>>>>>>>>>> even when reading log files.
>>>>>>>>>>
>>>>>>>>>> cat: /var/log/alternatives.log: Input/output error
>>>>>>>>>> Then I reset the KVM VM and it told me there is no boot
>>>>>>>>>> device. Next I virtually powered it off and then back on, and it booted.
>>>>>>>>>>
>>>>>>>>>> By the way, did I have to stop and start the volume?
>>>>>>>>>>
>>>>>>>>>> >> Could you do the following and test it again?
>>>>>>>>>> >> gluster volume set <volname> cluster.self-heal-daemon off
>>>>>>>>>>
>>>>>>>>>> >>Pranith
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2014-08-04 14:10 GMT+03:00 Pranith Kumar Karampuri <
>>>>>>>>>> pkarampu at redhat.com>:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 08/04/2014 03:33 PM, Roman wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello!
>>>>>>>>>>>
>>>>>>>>>>> Facing the same problem as mentioned here:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html
>>>>>>>>>>>
>>>>>>>>>>> My setup is up and running, so I'm ready to help you
>>>>>>>>>>> with feedback.
>>>>>>>>>>>
>>>>>>>>>>> setup:
>>>>>>>>>>> proxmox server as client
>>>>>>>>>>> 2 gluster physical servers
>>>>>>>>>>>
>>>>>>>>>>> Server side and client side are both currently running glusterfs 3.4.4
>>>>>>>>>>> from the gluster repo.
>>>>>>>>>>>
>>>>>>>>>>> the problem is:
>>>>>>>>>>>
>>>>>>>>>>> 1. created replica bricks.
>>>>>>>>>>> 2. mounted in Proxmox (tried both Proxmox ways: via GUI and fstab
>>>>>>>>>>> (with backup volume line); btw, while mounting via fstab I'm unable to
>>>>>>>>>>> launch a VM without cache, although direct-io-mode is enabled in the fstab
>>>>>>>>>>> line)
>>>>>>>>>>> 3. installed VM
>>>>>>>>>>> 4. bring one volume down - ok
>>>>>>>>>>> 5. bringing it back up, waiting until sync is done.
>>>>>>>>>>> 6. bring the other volume down - getting I/O errors in the VM guest and
>>>>>>>>>>> not able to restore the VM after I reset the VM via the host. It says (no
>>>>>>>>>>> bootable media). After I shut it down (forced) and bring it back up, it boots.
>>>>>>>>>>>
>>>>>>>>>>> Could you do the following and test it again?
>>>>>>>>>>> gluster volume set <volname> cluster.self-heal-daemon off
>>>>>>>>>>>
>>>>>>>>>>> Pranith
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Need help. Tried 3.4.3 and 3.4.4.
>>>>>>>>>>> Packages for 3.4.5 and 3.5.2 are still missing for Debian (3.5.1 always
>>>>>>>>>>> gives a healing error for some reason).
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Roman.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
--
Best regards,
Roman.