[Gluster-users] libgfapi failover problem on replica bricks
Pranith Kumar Karampuri
pkarampu at redhat.com
Wed Aug 6 06:39:57 UTC 2014
Roman,
The file went into split-brain. I think we should do these tests
with 3.5.2, where monitoring the heals is easier. Let me also come up
with a document about how to do the testing you are trying to do.

Humble/Niels,
Do we have debs available for 3.5.2? In 3.5.1 there was a packaging
issue where /usr/bin/glfsheal was not packaged along with the deb. I
think that should be fixed now as well?
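
A quick way to verify the packaging fix on a Debian box, assuming the
binary ships in the glusterfs-client package (the package name is a
guess, not confirmed here):

dpkg -L glusterfs-client | grep glfsheal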
Pranith
On 08/06/2014 11:52 AM, Roman wrote:
> good morning,
>
> root at stor1:~# getfattr -d -m. -e hex
> /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
> getfattr: Removing leading '/' from absolute path names
> # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
> trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
> trusted.afr.HA-fast-150G-PVE1-client-1=0x000001320000000000000000
> trusted.gfid=0x23c79523075a4158bea38078da570449
>
> getfattr: Removing leading '/' from absolute path names
> # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
> trusted.afr.HA-fast-150G-PVE1-client-0=0x000000040000000000000000
> trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
> trusted.gfid=0x23c79523075a4158bea38078da570449
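>
> Reading the two blocks together (the second block is presumably from
> stor2): each brick holds a non-zero pending counter against the other
> in its trusted.afr.*-client-* xattrs, i.e. each copy accuses the other
> of missing writes; that is the split-brain. The usual fix on this
> version is to keep the good copy and remove the bad one directly on
> the chosen brick, including its gfid hard link under .glusterfs (path
> derived from the trusted.gfid value); a sketch, with stor2 picked
> purely for illustration:
>
> root at stor2:~# rm /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
> root at stor2:~# rm /exports/fast-test/150G/.glusterfs/23/c7/23c79523-075a-4158-bea3-8078da570449
> root at stor2:~# # then access the file from the mount to trigger the heal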
>
>
>
> 2014-08-06 9:20 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>
>
> On 08/06/2014 11:30 AM, Roman wrote:
>> Also, this time files are not the same!
>>
>> root at stor1:~# md5sum
>> /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>> 32411360c53116b96a059f17306caeda
>> /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>
>> root at stor2:~# md5sum
>> /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>> 65b8a6031bcb6f5fb3a11cb1e8b1c9c9
>> /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
> What is the getfattr output?
>
> Pranith
>
>>
>>
>> 2014-08-05 16:33 GMT+03:00 Roman <romeo.r at gmail.com>:
>>
>> Nope, it is not working. But this time it went a bit differently.
>>
>> root at gluster-client:~# dmesg
>> Segmentation fault
>>
>>
>> I was not even able to start the VM after I had done the tests:
>>
>> Could not read qcow2 header: Operation not permitted
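>>
>> One way to check whether the image itself got damaged would be to
>> inspect the header directly on a brick (assuming qemu-img is
>> installed on the storage node; reading the brick file bypasses
>> gluster):
>>
>> root at stor1:~# qemu-img info /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2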
>>
>> And it seems it never starts to sync the files after the first
>> disconnect. The VM survives the first disconnect, but not the second
>> (I waited around 30 minutes). Also, I've got network.ping-timeout: 2
>> in the volume settings, but the logs reacted to the first disconnect
>> in around 30 seconds; the second was faster, 2 seconds.
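>>
>> A possible explanation, not confirmed here: network.ping-timeout only
>> governs the ping timer on the brick connections, and the faster log
>> below shows exactly that timer firing after 2 seconds, while the
>> slower case is a TCP-level "Connection timed out" on the socket,
>> which that option does not control. The setting itself can be checked
>> with (volume name taken from the logs):
>>
>> gluster volume info HA-fast-150G-PVE1 | grep ping-timeout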
>>
>> The reaction was also different:
>>
>> slower one:
>> [2014-08-05 13:26:19.558435] W [socket.c:514:__socket_rwv]
>> 0-glusterfs: readv failed (Connection timed out)
>> [2014-08-05 13:26:19.558485] W
>> [socket.c:1962:__socket_proto_state_machine] 0-glusterfs:
>> reading from socket failed. Error (Connection timed out),
>> peer (10.250.0.1:24007)
>> [2014-08-05 13:26:21.281426] W [socket.c:514:__socket_rwv]
>> 0-HA-fast-150G-PVE1-client-0: readv failed (Connection timed out)
>> [2014-08-05 13:26:21.281474] W
>> [socket.c:1962:__socket_proto_state_machine]
>> 0-HA-fast-150G-PVE1-client-0: reading from socket failed.
>> Error (Connection timed out), peer (10.250.0.1:49153)
>> [2014-08-05 13:26:21.281507] I
>> [client.c:2098:client_rpc_notify]
>> 0-HA-fast-150G-PVE1-client-0: disconnected
>>
>> the fast one:
>> [2014-08-05 12:52:44.607389] C
>> [client-handshake.c:127:rpc_client_ping_timer_expired]
>> 0-HA-fast-150G-PVE1-client-1: server 10.250.0.2:49153
>> has not responded in the last 2 seconds, disconnecting.
>> [2014-08-05 12:52:44.607491] W [socket.c:514:__socket_rwv]
>> 0-HA-fast-150G-PVE1-client-1: readv failed (No data available)
>> [2014-08-05 12:52:44.607585] E
>> [rpc-clnt.c:368:saved_frames_unwind]
>> (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)
>> [0x7fcb1b4b0558]
>> (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)
>> [0x7fcb1b4aea63]
>> (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)
>> [0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced
>> unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at
>> 2014-08-05 12:52:42.463881 (xid=0x381883x)
>> [2014-08-05 12:52:44.607604] W
>> [client-rpc-fops.c:2624:client3_3_lookup_cbk]
>> 0-HA-fast-150G-PVE1-client-1: remote operation failed:
>> Transport endpoint is not connected. Path: /
>> (00000000-0000-0000-0000-000000000001)
>> [2014-08-05 12:52:44.607736] E
>> [rpc-clnt.c:368:saved_frames_unwind]
>> (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)
>> [0x7fcb1b4b0558]
>> (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)
>> [0x7fcb1b4aea63]
>> (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)
>> [0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced
>> unwinding frame type(GlusterFS Handshake) op(PING(3)) called
>> at 2014-08-05 12:52:42.463891 (xid=0x381884x)
>> [2014-08-05 12:52:44.607753] W
>> [client-handshake.c:276:client_ping_cbk]
>> 0-HA-fast-150G-PVE1-client-1: timer must have expired
>> [2014-08-05 12:52:44.607776] I
>> [client.c:2098:client_rpc_notify]
>> 0-HA-fast-150G-PVE1-client-1: disconnected
>>
>>
>>
>> I've got SSD disks (just for info).
>> Should I go and give 3.5.2 a try?
>>
>>
>>
>> 2014-08-05 13:06 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>
>> Please reply along with gluster-users :-). Maybe you are hitting
>> 'reply' instead of 'reply all'?
>>
>> Pranith
>>
>> On 08/05/2014 03:35 PM, Roman wrote:
>>> To make sure and start clean, I've created another VM with raw
>>> format and am going to repeat those steps. So now I've got two VMs:
>>> one with qcow2 format and the other with raw format. I will send
>>> another e-mail shortly.
>>>
>>>
>>> 2014-08-05 13:01 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>>
>>>
>>> On 08/05/2014 03:07 PM, Roman wrote:
>>>> really, seems like the same file
>>>>
>>>> stor1:
>>>> a951641c5230472929836f9fcede6b04
>>>> /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>>>
>>>> stor2:
>>>> a951641c5230472929836f9fcede6b04
>>>> /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>>>
>>>>
>>>> One thing I've seen from the logs: somehow Proxmox
>>>> VE is connecting to the servers with the wrong version?
>>>> [2014-08-05 09:23:45.218550] I
>>>> [client-handshake.c:1659:select_server_supported_programs]
>>>> 0-HA-fast-150G-PVE1-client-0: Using Program
>>>> GlusterFS 3.3, Num (1298437), Version (330)
>>> It is the RPC (over-the-network data structures) version,
>>> which has not changed at all since 3.3, so that's not a
>>> problem. So what is the conclusion? Is your test case
>>> working now or not?
>>>
>>> Pranith
>>>
>>>> but if I issue:
>>>> root at pve1:~# glusterfs -V
>>>> glusterfs 3.4.4 built on Jun 28 2014 03:44:57
>>>> seems ok.
>>>>
>>>> The servers use 3.4.4 meanwhile:
>>>> [2014-08-05 09:23:45.117875] I
>>>> [server-handshake.c:567:server_setvolume]
>>>> 0-HA-fast-150G-PVE1-server: accepted client from
>>>> stor1-9004-2014/08/05-09:23:45:93538-HA-fast-150G-PVE1-client-1-0
>>>> (version: 3.4.4)
>>>> [2014-08-05 09:23:49.103035] I
>>>> [server-handshake.c:567:server_setvolume]
>>>> 0-HA-fast-150G-PVE1-server: accepted client from
>>>> stor1-8998-2014/08/05-09:23:45:89883-HA-fast-150G-PVE1-client-0-0
>>>> (version: 3.4.4)
>>>>
>>>> if this could be the reason, of course.
>>>> I did restart the Proxmox VE yesterday (just for information).
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 2014-08-05 12:30 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>>>
>>>>
>>>> On 08/05/2014 02:33 PM, Roman wrote:
>>>>> Waited long enough for now, still different
>>>>> sizes and no logs about healing :(
>>>>>
>>>>> stor1
>>>>> # file:
>>>>> exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>>>> trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
>>>>> trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
>>>>> trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
>>>>>
>>>>> root at stor1:~# du -sh
>>>>> /exports/fast-test/150G/images/127/
>>>>> 1.2G /exports/fast-test/150G/images/127/
>>>>>
>>>>>
>>>>> stor2
>>>>> # file:
>>>>> exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>>>> trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
>>>>> trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
>>>>> trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
>>>>>
>>>>>
>>>>> root at stor2:~# du -sh
>>>>> /exports/fast-test/150G/images/127/
>>>>> 1.4G /exports/fast-test/150G/images/127/
>>>> According to the changelogs, the file doesn't need any
>>>> healing. Could you stop the operations on the VMs and
>>>> take md5sums on both these machines?
>>>>
>>>> Pranith
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2014-08-05 11:49 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>>>>
>>>>>
>>>>> On 08/05/2014 02:06 PM, Roman wrote:
>>>>>> Well, it seems like it doesn't see the
>>>>>> changes were made to the volume ? I
>>>>>> created two files 200 and 100 MB (from
>>>>>> /dev/zero) after I disconnected the first
>>>>>> brick. Then connected it back and got
>>>>>> these logs:
>>>>>>
>>>>>> [2014-08-05 08:30:37.830150] I
>>>>>> [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]
>>>>>> 0-glusterfs: No change in volfile, continuing
>>>>>> [2014-08-05 08:30:37.830207] I
>>>>>> [rpc-clnt.c:1676:rpc_clnt_reconfig]
>>>>>> 0-HA-fast-150G-PVE1-client-0: changing
>>>>>> port to 49153 (from 0)
>>>>>> [2014-08-05 08:30:37.830239] W
>>>>>> [socket.c:514:__socket_rwv]
>>>>>> 0-HA-fast-150G-PVE1-client-0: readv
>>>>>> failed (No data available)
>>>>>> [2014-08-05 08:30:37.831024] I
>>>>>> [client-handshake.c:1659:select_server_supported_programs]
>>>>>> 0-HA-fast-150G-PVE1-client-0: Using
>>>>>> Program GlusterFS 3.3, Num (1298437),
>>>>>> Version (330)
>>>>>> [2014-08-05 08:30:37.831375] I
>>>>>> [client-handshake.c:1456:client_setvolume_cbk]
>>>>>> 0-HA-fast-150G-PVE1-client-0: Connected
>>>>>> to 10.250.0.1:49153, attached to
>>>>>> remote volume '/exports/fast-test/150G'.
>>>>>> [2014-08-05 08:30:37.831394] I
>>>>>> [client-handshake.c:1468:client_setvolume_cbk]
>>>>>> 0-HA-fast-150G-PVE1-client-0: Server and
>>>>>> Client lk-version numbers are not same,
>>>>>> reopening the fds
>>>>>> [2014-08-05 08:30:37.831566] I
>>>>>> [client-handshake.c:450:client_set_lk_version_cbk]
>>>>>> 0-HA-fast-150G-PVE1-client-0: Server lk
>>>>>> version = 1
>>>>>>
>>>>>>
>>>>>> [2014-08-05 08:30:37.830150] I
>>>>>> [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]
>>>>>> 0-glusterfs: No change in volfile, continuing
>>>>>> this line seems weird to me, to be honest.
>>>>>> I do not see any traffic on the switch interfaces
>>>>>> between the gluster servers, which means there is no
>>>>>> syncing between them. I tried to ls -l the files on
>>>>>> the client and the servers to trigger the healing,
>>>>>> but seemingly with no success. Should I wait more?
>>>>> Yes, it should take around 10-15 minutes. Could you
>>>>> provide 'getfattr -d -m. -e hex <file-on-brick>'
>>>>> output from both the bricks?
>>>>>
>>>>> Pranith
>>>>>
>>>>>>
>>>>>>
>>>>>> 2014-08-05 11:25 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>>>>>
>>>>>>
>>>>>> On 08/05/2014 01:10 PM, Roman wrote:
>>>>>>> Ahha! For some reason I was not able
>>>>>>> to start the VM anymore, Proxmox VE
>>>>>>> told me, that it is not able to read
>>>>>>> the qcow2 header due to permission
>>>>>>> is denied for some reason. So I just
>>>>>>> deleted that file and created a new
>>>>>>> VM. And the nex message I've got was
>>>>>>> this:
>>>>>> Seems like these are the messages from when you took
>>>>>> down the bricks before self-heal. Could you restart
>>>>>> the run, waiting for self-heals to complete before
>>>>>> taking down the next brick?
>>>>>>
>>>>>> Pranith
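>>>>>>
>>>>>> A rule of thumb for telling when a heal is done (a sketch, not an
>>>>>> official procedure): the file is clean once the
>>>>>> trusted.afr.*-client-* values are all zeroes on both bricks,
>>>>>> which can be watched with the getfattr command used earlier, e.g.:
>>>>>>
>>>>>> getfattr -d -m. -e hex /exports/fast-test/150G/images/124/vm-124-disk-1.qcow2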
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> [2014-08-05 07:31:25.663412] E
>>>>>>> [afr-self-heal-common.c:197:afr_sh_print_split_brain_log]
>>>>>>> 0-HA-fast-150G-PVE1-replicate-0: Unable to self-heal contents of
>>>>>>> '/images/124/vm-124-disk-1.qcow2' (possible split-brain). Please
>>>>>>> delete the file from all but the preferred subvolume. - Pending
>>>>>>> matrix: [ [ 0 60 ] [ 11 0 ] ]
>>>>>>> [2014-08-05 07:31:25.663955] E
>>>>>>> [afr-self-heal-common.c:2262:afr_self_heal_completion_cbk]
>>>>>>> 0-HA-fast-150G-PVE1-replicate-0: background data self-heal failed
>>>>>>> on /images/124/vm-124-disk-1.qcow2
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2014-08-05 10:13 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>>>>>>
>>>>>>> I just responded to your earlier mail about how the
>>>>>>> log looks. The log appears in the mount's logfile.
>>>>>>>
>>>>>>> Pranith
>>>>>>>
>>>>>>> On 08/05/2014 12:41 PM, Roman wrote:
>>>>>>>> Ok, so I've waited enough, I think. There was no traffic on
>>>>>>>> the switch ports between the servers, and I could not find
>>>>>>>> any suitable log message about a completed self-heal (waited
>>>>>>>> about 30 minutes). I plugged out the other server's UTP cable
>>>>>>>> this time and got into the same situation:
>>>>>>>>
>>>>>>>> root at gluster-test1:~# cat /var/log/dmesg
>>>>>>>> -bash: /bin/cat: Input/output error
>>>>>>>>
>>>>>>>> brick logs:
>>>>>>>> [2014-08-05 07:09:03.005474] I [server.c:762:server_rpc_notify]
>>>>>>>> 0-HA-fast-150G-PVE1-server: disconnecting connection from
>>>>>>>> pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
>>>>>>>> [2014-08-05 07:09:03.005530] I
>>>>>>>> [server-helpers.c:729:server_connection_put]
>>>>>>>> 0-HA-fast-150G-PVE1-server: Shutting down connection
>>>>>>>> pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
>>>>>>>> [2014-08-05 07:09:03.005560] I
>>>>>>>> [server-helpers.c:463:do_fd_cleanup]
>>>>>>>> 0-HA-fast-150G-PVE1-server: fd cleanup on
>>>>>>>> /images/124/vm-124-disk-1.qcow2
>>>>>>>> [2014-08-05 07:09:03.005797] I
>>>>>>>> [server-helpers.c:617:server_connection_destroy]
>>>>>>>> 0-HA-fast-150G-PVE1-server: destroyed connection of
>>>>>>>> pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2014-08-05 9:53 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>>>>>>>
>>>>>>>> Do you think it is possible for you to do these tests on the
>>>>>>>> latest version, 3.5.2? 'gluster volume heal <volname> info'
>>>>>>>> would give you that information in versions > 3.5.1.
>>>>>>>> Otherwise you will have to check it either from the logs
>>>>>>>> (there will be a self-heal completed message in the mount
>>>>>>>> logs) or by observing 'getfattr -d -m. -e hex
>>>>>>>> <image-file-on-bricks>'.
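>>>>>>>>
>>>>>>>> A rough sketch of the 3.5.2 check (the volume name is taken
>>>>>>>> from the logs in this thread; the mount logfile path is a
>>>>>>>> guess based on a /mnt/pve mount point):
>>>>>>>>
>>>>>>>> gluster volume heal HA-fast-150G-PVE1 info
>>>>>>>> grep -i self-heal /var/log/glusterfs/mnt-pve-HA-fast-150G-PVE1.log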
>>>>>>>>
>>>>>>>> Pranith
>>>>>>>>
>>>>>>>>
>>>>>>>> On 08/05/2014 12:09 PM, Roman wrote:
>>>>>>>>> Ok, I understand. I will try this shortly. How can
>>>>>>>>> I be sure that the healing process is done if I am
>>>>>>>>> not able to see its status?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2014-08-05 9:30 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>>>>>>>>
>>>>>>>>> Mounts will do the healing, not the self-heal daemon.
>>>>>>>>> The point, I feel, is that whichever process does the
>>>>>>>>> healing must have the latest information about the
>>>>>>>>> good bricks in this usecase. Since for the VM usecase
>>>>>>>>> the mounts have the latest information, we should let
>>>>>>>>> the mounts do the healing. If the mount accesses the
>>>>>>>>> VM image, either through someone doing operations
>>>>>>>>> inside the VM or through an explicit stat on the file,
>>>>>>>>> it should do the healing.
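>>>>>>>>>
>>>>>>>>> For example, the heal can be nudged from the client with an
>>>>>>>>> explicit stat; the mount point below is a placeholder, not
>>>>>>>>> taken from this thread:
>>>>>>>>>
>>>>>>>>> stat /mnt/pve/<storage>/images/<vmid>/vm-<vmid>-disk-1.qcow2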
>>>>>>>>>
>>>>>>>>> Pranith.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 08/05/2014 10:39 AM, Roman wrote:
>>>>>>>>>> Hmmm, you told me to turn it off. Did I understand
>>>>>>>>>> something wrong? After I issued the command you sent
>>>>>>>>>> me, I was not able to watch the healing process; it
>>>>>>>>>> said it won't be healed, because it's turned off.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2014-08-05 5:39 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>>>>>>>>>
>>>>>>>>>> You didn't mention anything about self-healing. Did
>>>>>>>>>> you wait until the self-heal was complete?
>>>>>>>>>>
>>>>>>>>>> Pranith
>>>>>>>>>>
>>>>>>>>>> On 08/04/2014 05:49 PM, Roman wrote:
>>>>>>>>>>> Hi!
>>>>>>>>>>> The result is pretty much the same. I set the switch
>>>>>>>>>>> port down for the 1st server; it was ok. Then I set
>>>>>>>>>>> it back up and set the other server's port off, and
>>>>>>>>>>> it triggered an IO error on two virtual machines:
>>>>>>>>>>> one with a local root FS but network-mounted
>>>>>>>>>>> storage, and the other with a network root FS. The
>>>>>>>>>>> 1st gave an error on copying to or from the mounted
>>>>>>>>>>> network disk; the other gave me an error for even
>>>>>>>>>>> reading log files:
>>>>>>>>>>>
>>>>>>>>>>> cat: /var/log/alternatives.log: Input/output error
>>>>>>>>>>>
>>>>>>>>>>> Then I reset the KVM VM and it told me there is no
>>>>>>>>>>> boot device. Next I virtually powered it off and
>>>>>>>>>>> then back on, and it booted.
>>>>>>>>>>>
>>>>>>>>>>> By the way, did I have to start/stop the volume?
>>>>>>>>>>>
>>>>>>>>>>> >> Could you do the following and test it again?
>>>>>>>>>>> >> gluster volume set <volname> cluster.self-heal-daemon off
>>>>>>>>>>>
>>>>>>>>>>> >> Pranith
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2014-08-04 14:10 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 08/04/2014 03:33 PM, Roman wrote:
>>>>>>>>>>>> Hello!
>>>>>>>>>>>>
>>>>>>>>>>>> Facing the same problem as mentioned here:
>>>>>>>>>>>>
>>>>>>>>>>>> http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html
>>>>>>>>>>>>
>>>>>>>>>>>> My setup is up and running, so I'm ready to help you
>>>>>>>>>>>> back with feedback.
>>>>>>>>>>>>
>>>>>>>>>>>> setup:
>>>>>>>>>>>> proxmox server as client
>>>>>>>>>>>> 2 gluster physical servers
>>>>>>>>>>>>
>>>>>>>>>>>> Server side and client side are both running glusterfs
>>>>>>>>>>>> 3.4.4 from the gluster repo at the moment.
>>>>>>>>>>>>
>>>>>>>>>>>> the problem is:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. created replica bricks.
>>>>>>>>>>>> 2. mounted in proxmox (tried both Proxmox ways: via the
>>>>>>>>>>>> GUI and via fstab (with a backup volume line); btw,
>>>>>>>>>>>> while mounting via fstab I'm unable to launch a VM
>>>>>>>>>>>> without cache, even though direct-io-mode is enabled in
>>>>>>>>>>>> the fstab line; see the example fstab line after this
>>>>>>>>>>>> list).
>>>>>>>>>>>> 3. installed a VM.
>>>>>>>>>>>> 4. brought one volume down - ok.
>>>>>>>>>>>> 5. brought it back up, waited for the sync to finish.
>>>>>>>>>>>> 6. brought the other volume down - got IO errors on the
>>>>>>>>>>>> VM guest and was not able to restore the VM after I
>>>>>>>>>>>> reset the VM via the host. It says (no bootable media).
>>>>>>>>>>>> After I shut it down (forced) and brought it back up,
>>>>>>>>>>>> it boots.
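>>>>>>>>>>>>
>>>>>>>>>>>> For reference, an fstab line of the kind described in
>>>>>>>>>>>> step 2 might look like this (hostnames and volume name
>>>>>>>>>>>> are from this thread; the mount point and exact options
>>>>>>>>>>>> are an assumption):
>>>>>>>>>>>>
>>>>>>>>>>>> stor1:/HA-fast-150G-PVE1 /mnt/pve/gluster glusterfs defaults,_netdev,backupvolfile-server=stor2,direct-io-mode=enable 0 0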
>>>>>>>>>>> Could you do the following and test it again?
>>>>>>>>>>>
>>>>>>>>>>> gluster volume set <volname> cluster.self-heal-daemon off
>>>>>>>>>>>
>>>>>>>>>>> Pranith
>>>>>>>>>>>>
>>>>>>>>>>>> Need help. Tried 3.4.3 and 3.4.4. Still missing
>>>>>>>>>>>> packages for 3.4.5 for Debian and for 3.5.2 (3.5.1
>>>>>>>>>>>> always gives a healing error for some reason).
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Roman.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Roman.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Best regards,
>>>>>>>>>> Roman.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>> Roman.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best regards,
>>>>>>>> Roman.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Roman.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Roman.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Roman.
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Roman.
>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Roman.
>>
>>
>>
>>
>> --
>> Best regards,
>> Roman.
>>
>>
>>
>>
>> --
>> Best regards,
>> Roman.
>
>
>
>
> --
> Best regards,
> Roman.