[Gluster-users] libgfapi failover problem on replica bricks

Pranith Kumar Karampuri pkarampu at redhat.com
Wed Aug 6 06:20:05 UTC 2014


On 08/06/2014 11:30 AM, Roman wrote:
> Also, this time files are not the same!
>
> root at stor1:~# md5sum 
> /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
> 32411360c53116b96a059f17306caeda 
>  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>
> root at stor2:~# md5sum 
> /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
> 65b8a6031bcb6f5fb3a11cb1e8b1c9c9 
>  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
What is the getfattr output?

Pranith
>
>
> 2014-08-05 16:33 GMT+03:00 Roman <romeo.r at gmail.com>:
>
>     Nope, it is not working. But this time it went a bit differently.
>
>     root at gluster-client:~# dmesg
>     Segmentation fault
>
>
>     I was not even able to start the VM after I had done the tests
>
>     Could not read qcow2 header: Operation not permitted
>
>     And it seems it never starts to sync files after the first
>     disconnect. The VM survives the first disconnect, but not the second
>     (I waited around 30 minutes). Also, I've got network.ping-timeout: 2
>     in the volume settings, but the logs reacted to the first disconnect
>     only after around 30 seconds. The second was faster, 2 seconds.
>
>     The reaction was different as well:
>
>     slower one:
>     [2014-08-05 13:26:19.558435] W [socket.c:514:__socket_rwv]
>     0-glusterfs: readv failed (Connection timed out)
>     [2014-08-05 13:26:19.558485] W
>     [socket.c:1962:__socket_proto_state_machine] 0-glusterfs: reading
>     from socket failed. Error (Connection timed out), peer
>     (10.250.0.1:24007)
>     [2014-08-05 13:26:21.281426] W [socket.c:514:__socket_rwv]
>     0-HA-fast-150G-PVE1-client-0: readv failed (Connection timed out)
>     [2014-08-05 13:26:21.281474] W
>     [socket.c:1962:__socket_proto_state_machine]
>     0-HA-fast-150G-PVE1-client-0: reading from socket failed. Error
>     (Connection timed out), peer (10.250.0.1:49153)
>     [2014-08-05 13:26:21.281507] I [client.c:2098:client_rpc_notify]
>     0-HA-fast-150G-PVE1-client-0: disconnected
>
>     the fast one:
>     [2014-08-05 12:52:44.607389] C
>     [client-handshake.c:127:rpc_client_ping_timer_expired]
>     0-HA-fast-150G-PVE1-client-1: server 10.250.0.2:49153 has not
>     responded in the last 2 seconds, disconnecting.
>     [2014-08-05 12:52:44.607491] W [socket.c:514:__socket_rwv]
>     0-HA-fast-150G-PVE1-client-1: readv failed (No data available)
>     [2014-08-05 12:52:44.607585] E
>     [rpc-clnt.c:368:saved_frames_unwind]
>     (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)
>     [0x7fcb1b4b0558]
>     (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)
>     [0x7fcb1b4aea63]
>     (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)
>     [0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced unwinding
>     frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2014-08-05
>     12:52:42.463881 (xid=0x381883x)
>     [2014-08-05 12:52:44.607604] W
>     [client-rpc-fops.c:2624:client3_3_lookup_cbk]
>     0-HA-fast-150G-PVE1-client-1: remote operation failed: Transport
>     endpoint is not connected. Path: /
>     (00000000-0000-0000-0000-000000000001)
>     [2014-08-05 12:52:44.607736] E
>     [rpc-clnt.c:368:saved_frames_unwind]
>     (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)
>     [0x7fcb1b4b0558]
>     (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)
>     [0x7fcb1b4aea63]
>     (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)
>     [0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced unwinding
>     frame type(GlusterFS Handshake) op(PING(3)) called at 2014-08-05
>     12:52:42.463891 (xid=0x381884x)
>     [2014-08-05 12:52:44.607753] W
>     [client-handshake.c:276:client_ping_cbk]
>     0-HA-fast-150G-PVE1-client-1: timer must have expired
>     [2014-08-05 12:52:44.607776] I [client.c:2098:client_rpc_notify]
>     0-HA-fast-150G-PVE1-client-1: disconnected
>
>
>
>     I've got SSD disks (just for info).
>     Should I go and give 3.5.2 a try?
>
>
>
>     2014-08-05 13:06 GMT+03:00 Pranith Kumar Karampuri
>     <pkarampu at redhat.com>:
>
>         Please reply along with gluster-users :-). Maybe you are
>         hitting 'reply' instead of 'reply all'?
>
>         Pranith
>
>         On 08/05/2014 03:35 PM, Roman wrote:
>>         To make the test sure and clean, I've created another VM with
>>         raw format and am going to repeat those steps. So now I've got
>>         two VMs: one with qcow2 format and the other with raw format.
>>         I will send another e-mail shortly.
>>
>>
>>         2014-08-05 13:01 GMT+03:00 Pranith Kumar Karampuri
>>         <pkarampu at redhat.com>:
>>
>>
>>             On 08/05/2014 03:07 PM, Roman wrote:
>>>             really, seems like the same file
>>>
>>>             stor1:
>>>             a951641c5230472929836f9fcede6b04
>>>              /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>>
>>>             stor2:
>>>             a951641c5230472929836f9fcede6b04
>>>              /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>>
>>>
>>>             one thing I've seen from the logs: somehow Proxmox VE
>>>             seems to be connecting to the servers with the wrong version?
>>>             [2014-08-05 09:23:45.218550] I
>>>             [client-handshake.c:1659:select_server_supported_programs]
>>>             0-HA-fast-150G-PVE1-client-0: Using Program GlusterFS
>>>             3.3, Num (1298437), Version (330)
>>             That is the RPC (on-the-wire data structures) version,
>>             which has not changed at all since 3.3, so that's not a
>>             problem. So what is the conclusion? Is your test case
>>             working now or not?
>>
>>             Pranith
>>
>>>             but if I issue:
>>>             root at pve1:~# glusterfs -V
>>>             glusterfs 3.4.4 built on Jun 28 2014 03:44:57
>>>             seems ok.
>>>
>>>             the servers use 3.4.4 meanwhile
>>>             [2014-08-05 09:23:45.117875] I
>>>             [server-handshake.c:567:server_setvolume]
>>>             0-HA-fast-150G-PVE1-server: accepted client from
>>>             stor1-9004-2014/08/05-09:23:45:93538-HA-fast-150G-PVE1-client-1-0
>>>             (version: 3.4.4)
>>>             [2014-08-05 09:23:49.103035] I
>>>             [server-handshake.c:567:server_setvolume]
>>>             0-HA-fast-150G-PVE1-server: accepted client from
>>>             stor1-8998-2014/08/05-09:23:45:89883-HA-fast-150G-PVE1-client-0-0
>>>             (version: 3.4.4)
>>>
>>>             if this could be the reason, of course.
>>>             I did restart the Proxmox VE yesterday (just for
>>>             information).
>>>
>>>
>>>
>>>
>>>
>>>             2014-08-05 12:30 GMT+03:00 Pranith Kumar Karampuri
>>>             <pkarampu at redhat.com>:
>>>
>>>
>>>                 On 08/05/2014 02:33 PM, Roman wrote:
>>>>                 I've waited long enough for now; still different
>>>>                 sizes and no logs about healing :(
>>>>
>>>>                 stor1
>>>>                 # file:
>>>>                 exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>>>                 trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
>>>>                 trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
>>>>                 trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
>>>>
>>>>                 root at stor1:~# du -sh
>>>>                 /exports/fast-test/150G/images/127/
>>>>                 1.2G  /exports/fast-test/150G/images/127/
>>>>
>>>>
>>>>                 stor2
>>>>                 # file:
>>>>                 exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>>>>                 trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
>>>>                 trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
>>>>                 trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
>>>>
>>>>
>>>>                 root at stor2:~# du -sh
>>>>                 /exports/fast-test/150G/images/127/
>>>>                 1.4G  /exports/fast-test/150G/images/127/
>>>                 According to the changelogs, the file doesn't need
>>>                 any healing. Could you stop the operations on the
>>>                 VMs and take md5sum on both these machines?
>>>
>>>                 Pranith
>>>
>>>>
>>>>
>>>>
>>>>
>>>>                 2014-08-05 11:49 GMT+03:00 Pranith Kumar Karampuri
>>>>                 <pkarampu at redhat.com>:
>>>>
>>>>
>>>>                     On 08/05/2014 02:06 PM, Roman wrote:
>>>>>                     Well, it seems like it doesn't see that changes
>>>>>                     were made to the volume? I created two files,
>>>>>                     200 MB and 100 MB (from /dev/zero), after I
>>>>>                     disconnected the first brick. Then I connected
>>>>>                     it back and got these logs:
>>>>>
>>>>>                     [2014-08-05 08:30:37.830150] I
>>>>>                     [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]
>>>>>                     0-glusterfs: No change in volfile, continuing
>>>>>                     [2014-08-05 08:30:37.830207] I
>>>>>                     [rpc-clnt.c:1676:rpc_clnt_reconfig]
>>>>>                     0-HA-fast-150G-PVE1-client-0: changing port to
>>>>>                     49153 (from 0)
>>>>>                     [2014-08-05 08:30:37.830239] W
>>>>>                     [socket.c:514:__socket_rwv]
>>>>>                     0-HA-fast-150G-PVE1-client-0: readv failed (No
>>>>>                     data available)
>>>>>                     [2014-08-05 08:30:37.831024] I
>>>>>                     [client-handshake.c:1659:select_server_supported_programs]
>>>>>                     0-HA-fast-150G-PVE1-client-0: Using Program
>>>>>                     GlusterFS 3.3, Num (1298437), Version (330)
>>>>>                     [2014-08-05 08:30:37.831375] I
>>>>>                     [client-handshake.c:1456:client_setvolume_cbk]
>>>>>                     0-HA-fast-150G-PVE1-client-0: Connected to
>>>>>                     10.250.0.1:49153, attached to remote volume
>>>>>                     '/exports/fast-test/150G'.
>>>>>                     [2014-08-05 08:30:37.831394] I
>>>>>                     [client-handshake.c:1468:client_setvolume_cbk]
>>>>>                     0-HA-fast-150G-PVE1-client-0: Server and
>>>>>                     Client lk-version numbers are not same,
>>>>>                     reopening the fds
>>>>>                     [2014-08-05 08:30:37.831566] I
>>>>>                     [client-handshake.c:450:client_set_lk_version_cbk]
>>>>>                     0-HA-fast-150G-PVE1-client-0: Server lk
>>>>>                     version = 1
>>>>>
>>>>>
>>>>>                     [2014-08-05 08:30:37.830150] I
>>>>>                     [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]
>>>>>                     0-glusterfs: No change in volfile, continuing
>>>>>                     this line seems weird to me, to be honest.
>>>>>                     I do not see any traffic on the switch interfaces
>>>>>                     between the gluster servers, which means there is
>>>>>                     no syncing between them.
>>>>>                     I tried ls -l on the files on the client and the
>>>>>                     servers to trigger the healing, but seemingly
>>>>>                     with no success. Should I wait longer?
>>>>                     Yes, it should take around 10-15 minutes. Could
>>>>                     you provide 'getfattr -d -m. -e hex
>>>>                     <file-on-brick>' output from both of the bricks?
>>>>
>>>>                     Pranith
>>>>
>>>>>
>>>>>
>>>>>                     2014-08-05 11:25 GMT+03:00 Pranith Kumar
>>>>>                     Karampuri <pkarampu at redhat.com>:
>>>>>
>>>>>
>>>>>                         On 08/05/2014 01:10 PM, Roman wrote:
>>>>>>                         Aha! For some reason I was not able to
>>>>>>                         start the VM anymore; Proxmox VE told me
>>>>>>                         that it was not able to read the qcow2
>>>>>>                         header because permission was denied for
>>>>>>                         some reason. So I just deleted that file
>>>>>>                         and created a new VM. And the next message
>>>>>>                         I got was this:
>>>>>                         These seem to be the messages from where
>>>>>                         you took down the bricks before self-heal.
>>>>>                         Could you restart the run, waiting for
>>>>>                         self-heals to complete before taking down
>>>>>                         the next brick?
>>>>>
>>>>>                         Pranith
>>>>>
>>>>>>
>>>>>>
>>>>>>                         [2014-08-05 07:31:25.663412] E
>>>>>>                         [afr-self-heal-common.c:197:afr_sh_print_split_brain_log]
>>>>>>                         0-HA-fast-150G-PVE1-replicate-0: Unable
>>>>>>                         to self-heal contents of
>>>>>>                         '/images/124/vm-124-disk-1.qcow2'
>>>>>>                         (possible split-brain). Please delete the
>>>>>>                         file from all but the preferred
>>>>>>                         subvolume.- Pending matrix:  [ [ 0 60 ] [
>>>>>>                         11 0 ] ]
>>>>>>                         [2014-08-05 07:31:25.663955] E
>>>>>>                         [afr-self-heal-common.c:2262:afr_self_heal_completion_cbk]
>>>>>>                         0-HA-fast-150G-PVE1-replicate-0:
>>>>>>                         background  data self-heal failed on
>>>>>>                         /images/124/vm-124-disk-1.qcow2
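[Editor's note: on 3.4 there are no `gluster volume heal ... split-brain` resolution commands yet, so "delete the file from all but the preferred subvolume" in the log above means removing the untrusted copy directly on a brick, including its .glusterfs gfid hard link, or the brick's link count stays wrong. A hedged sketch; which copy to discard, and the brick path, are your call. The gfid-to-path mapping below follows the two-level fan-out layout Gluster uses on disk:]

```shell
# Map a gfid (as shown by `getfattr -n trusted.gfid -e hex`) to the
# .glusterfs hard-link path inside a brick: first two hex chars, then
# the next two, then the gfid formatted as a UUID.
gfid_path() {
    local G=${1#0x}
    echo ".glusterfs/${G:0:2}/${G:2:2}/${G:0:8}-${G:8:4}-${G:12:4}-${G:16:4}-${G:20:12}"
}

gfid_path 0xf10ad81b58484bcd9b385a36a207f921
# -> .glusterfs/f1/0a/f10ad81b-5848-4bcd-9b38-5a36a207f921

# On the brick whose copy you do NOT trust (run on the server, not the
# mount) -- illustrative only, paths taken from this thread:
#   BRICK=/exports/fast-test/150G
#   rm "$BRICK/images/124/vm-124-disk-1.qcow2"
#   rm "$BRICK/$(gfid_path <gfid-of-that-file>)"
# then stat the file through the mount to trigger a fresh heal.
```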
>>>>>>
>>>>>>
>>>>>>
>>>>>>                         2014-08-05 10:13 GMT+03:00 Pranith Kumar
>>>>>>                         Karampuri <pkarampu at redhat.com>:
>>>>>>
>>>>>>                             I just responded to your earlier mail
>>>>>>                             about what the log looks like. The log
>>>>>>                             appears in the mount's logfile.
>>>>>>
>>>>>>                             Pranith
>>>>>>
>>>>>>                             On 08/05/2014 12:41 PM, Roman wrote:
>>>>>>>                             OK, so I've waited enough, I think.
>>>>>>>                             There was no traffic on the switch
>>>>>>>                             ports between the servers. I could not
>>>>>>>                             find any suitable log message about a
>>>>>>>                             completed self-heal (waited about 30
>>>>>>>                             minutes). I pulled out the other
>>>>>>>                             server's UTP cable this time and got
>>>>>>>                             into the same situation:
>>>>>>>                             root at gluster-test1:~# cat /var/log/dmesg
>>>>>>>                             -bash: /bin/cat: Input/output error
>>>>>>>
>>>>>>>                             brick logs:
>>>>>>>                             [2014-08-05 07:09:03.005474] I
>>>>>>>                             [server.c:762:server_rpc_notify]
>>>>>>>                             0-HA-fast-150G-PVE1-server:
>>>>>>>                             disconnecting connectionfrom
>>>>>>>                             pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
>>>>>>>                             [2014-08-05 07:09:03.005530] I
>>>>>>>                             [server-helpers.c:729:server_connection_put]
>>>>>>>                             0-HA-fast-150G-PVE1-server: Shutting
>>>>>>>                             down connection
>>>>>>>                             pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
>>>>>>>                             [2014-08-05 07:09:03.005560] I
>>>>>>>                             [server-helpers.c:463:do_fd_cleanup]
>>>>>>>                             0-HA-fast-150G-PVE1-server: fd
>>>>>>>                             cleanup on
>>>>>>>                             /images/124/vm-124-disk-1.qcow2
>>>>>>>                             [2014-08-05 07:09:03.005797] I
>>>>>>>                             [server-helpers.c:617:server_connection_destroy]
>>>>>>>                             0-HA-fast-150G-PVE1-server:
>>>>>>>                             destroyed connection of
>>>>>>>                             pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>                             2014-08-05 9:53 GMT+03:00 Pranith
>>>>>>>                             Kumar Karampuri <pkarampu at redhat.com>:
>>>>>>>
>>>>>>>                                 Do you think it is possible for
>>>>>>>                                 you to do these tests on the
>>>>>>>                                 latest version, 3.5.2? 'gluster
>>>>>>>                                 volume heal <volname> info'
>>>>>>>                                 would give you that information
>>>>>>>                                 in versions > 3.5.1.
>>>>>>>                                 Otherwise you will have to check
>>>>>>>                                 it either from the logs (there
>>>>>>>                                 will be a self-heal completed
>>>>>>>                                 message in the mount logs) or
>>>>>>>                                 by observing 'getfattr -d -m. -e
>>>>>>>                                 hex <image-file-on-bricks>'.
>>>>>>>                                 Pranith
>>>>>>>
>>>>>>>
>>>>>>>                                 On 08/05/2014 12:09 PM, Roman wrote:
>>>>>>>>                                 OK, I understand. I will try
>>>>>>>>                                 this shortly.
>>>>>>>>                                 How can I be sure that the
>>>>>>>>                                 healing process is done, if I am
>>>>>>>>                                 not able to see its status?
>>>>>>>>
>>>>>>>>
>>>>>>>>                                 2014-08-05 9:30 GMT+03:00
>>>>>>>>                                 Pranith Kumar Karampuri
>>>>>>>>                                 <pkarampu at redhat.com>:
>>>>>>>>
>>>>>>>>                                     Mounts will do the healing,
>>>>>>>>                                     not the self-heal-daemon.
>>>>>>>>                                     The point, I feel, is that
>>>>>>>>                                     whichever process does the
>>>>>>>>                                     healing must have the latest
>>>>>>>>                                     information about the good
>>>>>>>>                                     bricks in this use case.
>>>>>>>>                                     Since for the VM use case
>>>>>>>>                                     the mounts should have the
>>>>>>>>                                     latest information, we
>>>>>>>>                                     should let the mounts do
>>>>>>>>                                     the healing. If the mount
>>>>>>>>                                     accesses the VM image,
>>>>>>>>                                     either through someone doing
>>>>>>>>                                     operations inside the VM or
>>>>>>>>                                     through an explicit stat on
>>>>>>>>                                     the file, it should do the
>>>>>>>>                                     healing.
>>>>>>>>
>>>>>>>>                                     Pranith.
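[Editor's note: if it is the mount that heals on access, a crude way to force a full pass, rather than waiting for the VM to touch every file, is to stat everything through the glusterfs mount point (not the brick), so the client does a lookup on each file. A sketch; the Proxmox mount path below is an assumption:]

```shell
# Walk a glusterfs mount and stat every file so the client performs a
# lookup on each one, which is what kicks off self-heal on access.
trigger_heal_scan() {
    find "$1" -type f -print0 | xargs -0 -r stat >/dev/null
}

# e.g. (hypothetical Proxmox mount path):
#   trigger_heal_scan /mnt/pve/HA-fast-150G-PVE1
```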
>>>>>>>>
>>>>>>>>
>>>>>>>>                                     On 08/05/2014 10:39 AM,
>>>>>>>>                                     Roman wrote:
>>>>>>>>>                                     Hmmm, you told me to turn
>>>>>>>>>                                     it off. Did I understand
>>>>>>>>>                                     something wrong? After I
>>>>>>>>>                                     issued the command you
>>>>>>>>>                                     sent me, I was not able to
>>>>>>>>>                                     watch the healing process;
>>>>>>>>>                                     it said it won't be
>>>>>>>>>                                     healed, because it's
>>>>>>>>>                                     turned off.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                                     2014-08-05 5:39 GMT+03:00
>>>>>>>>>                                     Pranith Kumar Karampuri
>>>>>>>>>                                     <pkarampu at redhat.com>:
>>>>>>>>>
>>>>>>>>>                                         You didn't mention
>>>>>>>>>                                         anything about
>>>>>>>>>                                         self-healing. Did you
>>>>>>>>>                                         wait until the
>>>>>>>>>                                         self-heal was complete?
>>>>>>>>>
>>>>>>>>>                                         Pranith
>>>>>>>>>
>>>>>>>>>                                         On 08/04/2014 05:49
>>>>>>>>>                                         PM, Roman wrote:
>>>>>>>>>>                                         Hi!
>>>>>>>>>>                                         The result is pretty
>>>>>>>>>>                                         much the same. I set
>>>>>>>>>>                                         the switch port down
>>>>>>>>>>                                         for the 1st server; it
>>>>>>>>>>                                         was OK. Then I set it
>>>>>>>>>>                                         back up and set the
>>>>>>>>>>                                         other server's port
>>>>>>>>>>                                         off, and it triggered
>>>>>>>>>>                                         an IO error on two
>>>>>>>>>>                                         virtual machines: one
>>>>>>>>>>                                         with a local root FS
>>>>>>>>>>                                         but network-mounted
>>>>>>>>>>                                         storage, and the other
>>>>>>>>>>                                         with a network root
>>>>>>>>>>                                         FS. The 1st gave an
>>>>>>>>>>                                         error on copying to or
>>>>>>>>>>                                         from the mounted
>>>>>>>>>>                                         network disk; the
>>>>>>>>>>                                         other just gave me an
>>>>>>>>>>                                         error for even reading
>>>>>>>>>>                                         log files.
>>>>>>>>>>
>>>>>>>>>>                                         cat:
>>>>>>>>>>                                         /var/log/alternatives.log:
>>>>>>>>>>                                         Input/output error
>>>>>>>>>>                                         Then I reset the KVM
>>>>>>>>>>                                         VM and it told me
>>>>>>>>>>                                         there was no boot
>>>>>>>>>>                                         device. Next I
>>>>>>>>>>                                         virtually powered it
>>>>>>>>>>                                         off and then back on,
>>>>>>>>>>                                         and it booted.
>>>>>>>>>>
>>>>>>>>>>                                         By the way, did I
>>>>>>>>>>                                         have to start/stop
>>>>>>>>>>                                         the volume?
>>>>>>>>>>
>>>>>>>>>>                                         >> Could you do the
>>>>>>>>>>                                         following and test it
>>>>>>>>>>                                         again?
>>>>>>>>>>                                         >> gluster volume set
>>>>>>>>>>                                         <volname>
>>>>>>>>>>                                         cluster.self-heal-daemon
>>>>>>>>>>                                         off
>>>>>>>>>>
>>>>>>>>>>                                         >>Pranith
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                                         2014-08-04 14:10
>>>>>>>>>>                                         GMT+03:00 Pranith
>>>>>>>>>>                                         Kumar Karampuri
>>>>>>>>>>                                         <pkarampu at redhat.com>:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                                             On 08/04/2014
>>>>>>>>>>                                             03:33 PM, Roman
>>>>>>>>>>                                             wrote:
>>>>>>>>>>>                                             Hello!
>>>>>>>>>>>
>>>>>>>>>>>                                             Facing the same
>>>>>>>>>>>                                             problem as
>>>>>>>>>>>                                             mentioned here:
>>>>>>>>>>>
>>>>>>>>>>>                                             http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html
>>>>>>>>>>>
>>>>>>>>>>>                                             my setup is up
>>>>>>>>>>>                                             and running, so
>>>>>>>>>>>                                             I'm ready to
>>>>>>>>>>>                                             help you back
>>>>>>>>>>>                                             with feedback.
>>>>>>>>>>>
>>>>>>>>>>>                                             setup:
>>>>>>>>>>>                                             a Proxmox server
>>>>>>>>>>>                                             as the client
>>>>>>>>>>>                                             2 physical
>>>>>>>>>>>                                             gluster servers
>>>>>>>>>>>
>>>>>>>>>>>                                             server side and
>>>>>>>>>>>                                             client side are
>>>>>>>>>>>                                             both running
>>>>>>>>>>>                                             glusterfs 3.4.4
>>>>>>>>>>>                                             from the gluster
>>>>>>>>>>>                                             repo atm.
>>>>>>>>>>>
>>>>>>>>>>>                                             the problem is:
>>>>>>>>>>>
>>>>>>>>>>>                                             1. created the
>>>>>>>>>>>                                             replica bricks.
>>>>>>>>>>>                                             2. mounted in
>>>>>>>>>>>                                             Proxmox (tried
>>>>>>>>>>>                                             both Proxmox
>>>>>>>>>>>                                             ways: via the
>>>>>>>>>>>                                             GUI and via
>>>>>>>>>>>                                             fstab (with a
>>>>>>>>>>>                                             backup volume
>>>>>>>>>>>                                             line); btw, while
>>>>>>>>>>>                                             mounting via
>>>>>>>>>>>                                             fstab I'm unable
>>>>>>>>>>>                                             to launch a VM
>>>>>>>>>>>                                             without cache,
>>>>>>>>>>>                                             even though
>>>>>>>>>>>                                             direct-io-mode
>>>>>>>>>>>                                             is enabled in
>>>>>>>>>>>                                             the fstab line)
>>>>>>>>>>>                                             3. installed a VM
>>>>>>>>>>>                                             4. brought one
>>>>>>>>>>>                                             volume down - OK
>>>>>>>>>>>                                             5. brought it
>>>>>>>>>>>                                             back up, waited
>>>>>>>>>>>                                             until the sync
>>>>>>>>>>>                                             was done.
>>>>>>>>>>>                                             6. brought the
>>>>>>>>>>>                                             other volume down
>>>>>>>>>>>                                             - got IO errors
>>>>>>>>>>>                                             on the VM guest
>>>>>>>>>>>                                             and was not able
>>>>>>>>>>>                                             to restore the VM
>>>>>>>>>>>                                             after I reset it
>>>>>>>>>>>                                             via the host. It
>>>>>>>>>>>                                             said (no bootable
>>>>>>>>>>>                                             media). After I
>>>>>>>>>>>                                             shut it down
>>>>>>>>>>>                                             (forced) and
>>>>>>>>>>>                                             brought it back
>>>>>>>>>>>                                             up, it booted.
>>>>>>>>>>                                             Could you do the
>>>>>>>>>>                                             following and
>>>>>>>>>>                                             test it again?
>>>>>>>>>>                                             gluster volume
>>>>>>>>>>                                             set <volname>
>>>>>>>>>>                                             cluster.self-heal-daemon
>>>>>>>>>>                                             off
>>>>>>>>>>
>>>>>>>>>>                                             Pranith
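[Editor's note: for readers following this fix, the counterpart commands may help, since turning the daemon off is what later made Roman unable to watch heal status. Turning the option back on is a `volume reset`, and the reconfigured value shows up in `volume info`. A config sketch only, using the volume name from this thread; not executed here:]

```shell
# gluster CLI config sketch only -- not run in this document.
: <<'CMDS'
gluster volume set HA-fast-150G-PVE1 cluster.self-heal-daemon off
gluster volume info HA-fast-150G-PVE1     # lists the reconfigured option
gluster volume reset HA-fast-150G-PVE1 cluster.self-heal-daemon
CMDS
```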
>>>>>>>>>>>
>>>>>>>>>>>                                             Need help. Tried
>>>>>>>>>>>                                             3.4.3 and 3.4.4.
>>>>>>>>>>>                                             Packages are
>>>>>>>>>>>                                             still missing for
>>>>>>>>>>>                                             3.4.5 for Debian,
>>>>>>>>>>>                                             and for 3.5.2
>>>>>>>>>>>                                             (3.5.1 always
>>>>>>>>>>>                                             gives a healing
>>>>>>>>>>>                                             error for some
>>>>>>>>>>>                                             reason).
>>>>>>>>>>>
>>>>>>>>>>>                                             -- 
>>>>>>>>>>>                                             Best regards,
>>>>>>>>>>>                                             Roman.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>                                             _______________________________________________
>>>>>>>>>>>                                             Gluster-users mailing list
>>>>>>>>>>>                                             Gluster-users at gluster.org
>>>>>>>>>>>                                             http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                                         -- 
>>>>>>>>>>                                         Best regards,
>>>>>>>>>>                                         Roman.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                                     -- 
>>>>>>>>>                                     Best regards,
>>>>>>>>>                                     Roman.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                                 -- 
>>>>>>>>                                 Best regards,
>>>>>>>>                                 Roman.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>                             -- 
>>>>>>>                             Best regards,
>>>>>>>                             Roman.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>                         -- 
>>>>>>                         Best regards,
>>>>>>                         Roman.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>                     -- 
>>>>>                     Best regards,
>>>>>                     Roman.
>>>>
>>>>
>>>>
>>>>
>>>>                 -- 
>>>>                 Best regards,
>>>>                 Roman.
>>>
>>>
>>>
>>>
>>>             -- 
>>>             Best regards,
>>>             Roman.
>>
>>
>>
>>
>>         -- 
>>         Best regards,
>>         Roman.
>
>
>
>
>     -- 
>     Best regards,
>     Roman.
>
>
>
>
> -- 
> Best regards,
> Roman.
