[Gluster-users] glusterfs under high load failing?

Roman romeo.r at gmail.com
Mon Oct 13 19:50:36 UTC 2014


OK, done.
This time there were no disconnects, at least all of the VMs are working, but
I got some mails from a VM about IO wait warnings again.

WARNINGs: Read IO Wait time is 1.45 (outside range [0:1]).

Here is the output:

root@stor1:~# gluster volume profile HA-WIN-TT-1T info
Brick: stor1:/exports/NFS-WIN/1T
--------------------------------
Cumulative Stats:
   Block Size:             131072b+              262144b+
 No. of Reads:                    0                     0
No. of Writes:              7372798                     1
 %-latency   Avg-latency   Min-Latency    Max-Latency   No. of calls         Fop
 ---------   -----------   -----------    -----------   ------------        ----
      0.00       0.00 us       0.00 us        0.00 us             25     RELEASE
      0.00       0.00 us       0.00 us        0.00 us             16     RELEASEDIR
      0.00      64.00 us      52.00 us       76.00 us              2     ENTRYLK
      0.00      73.50 us      51.00 us       96.00 us              2     FLUSH
      0.00      68.43 us      30.00 us      135.00 us              7     STATFS
      0.00      54.31 us      44.00 us      109.00 us             16     OPENDIR
      0.00      50.75 us      16.00 us       74.00 us             24     FSTAT
      0.00      47.77 us      19.00 us      119.00 us             26     GETXATTR
      0.00      59.21 us      21.00 us       89.00 us             24     OPEN
      0.00      59.39 us      22.00 us      296.00 us             28     READDIR
      0.00    4972.00 us    4972.00 us     4972.00 us              1     CREATE
      0.00      97.42 us      19.00 us      184.00 us             62     LOOKUP
      0.00      89.49 us      20.00 us      656.00 us            324     FXATTROP
      3.91 1255944.81 us     127.00 us 23397532.00 us            189     FSYNC
      7.40 3406275.50 us      17.00 us 23398013.00 us            132     INODELK
     34.96   94598.02 us       8.00 us 23398705.00 us          22445     FINODELK
     53.73     442.66 us      79.00 us  3116494.00 us        7372799     WRITE

    Duration: 7813 seconds
   Data Read: 0 bytes
Data Written: 966367641600 bytes

Interval 0 Stats:
   Block Size:             131072b+              262144b+
 No. of Reads:                    0                     0
No. of Writes:              7372798                     1
 %-latency   Avg-latency   Min-Latency    Max-Latency   No. of calls         Fop
 ---------   -----------   -----------    -----------   ------------        ----
      0.00       0.00 us       0.00 us        0.00 us             25     RELEASE
      0.00       0.00 us       0.00 us        0.00 us             16     RELEASEDIR
      0.00      64.00 us      52.00 us       76.00 us              2     ENTRYLK
      0.00      73.50 us      51.00 us       96.00 us              2     FLUSH
      0.00      68.43 us      30.00 us      135.00 us              7     STATFS
      0.00      54.31 us      44.00 us      109.00 us             16     OPENDIR
      0.00      50.75 us      16.00 us       74.00 us             24     FSTAT
      0.00      47.77 us      19.00 us      119.00 us             26     GETXATTR
      0.00      59.21 us      21.00 us       89.00 us             24     OPEN
      0.00      59.39 us      22.00 us      296.00 us             28     READDIR
      0.00    4972.00 us    4972.00 us     4972.00 us              1     CREATE
      0.00      97.42 us      19.00 us      184.00 us             62     LOOKUP
      0.00      89.49 us      20.00 us      656.00 us            324     FXATTROP
      3.91 1255944.81 us     127.00 us 23397532.00 us            189     FSYNC
      7.40 3406275.50 us      17.00 us 23398013.00 us            132     INODELK
     34.96   94598.02 us       8.00 us 23398705.00 us          22445     FINODELK
     53.73     442.66 us      79.00 us  3116494.00 us        7372799     WRITE

    Duration: 7813 seconds
   Data Read: 0 bytes
Data Written: 966367641600 bytes

Brick: stor2:/exports/NFS-WIN/1T
--------------------------------
Cumulative Stats:
   Block Size:             131072b+              262144b+
 No. of Reads:                    0                     0
No. of Writes:              7372798                     1
 %-latency   Avg-latency   Min-Latency    Max-Latency   No. of calls         Fop
 ---------   -----------   -----------    -----------   ------------        ----
      0.00       0.00 us       0.00 us        0.00 us             25     RELEASE
      0.00       0.00 us       0.00 us        0.00 us             16     RELEASEDIR
      0.00      61.50 us      46.00 us       77.00 us              2     ENTRYLK
      0.00      82.00 us      67.00 us       97.00 us              2     FLUSH
      0.00     265.00 us     265.00 us      265.00 us              1     CREATE
      0.00      57.43 us      30.00 us       85.00 us              7     STATFS
      0.00      61.12 us      37.00 us      107.00 us             16     OPENDIR
      0.00      44.04 us      12.00 us       86.00 us             24     FSTAT
      0.00      41.42 us      24.00 us       96.00 us             26     GETXATTR
      0.00      45.93 us      24.00 us      133.00 us             28     READDIR
      0.00      57.17 us      25.00 us      147.00 us             24     OPEN
      0.00     145.28 us      31.00 us      288.00 us             32     READDIRP
      0.00      39.50 us      10.00 us      152.00 us            132     INODELK
      0.00     330.97 us      20.00 us    14280.00 us             62     LOOKUP
      0.00      79.06 us      19.00 us      851.00 us            430     FXATTROP
      0.02      29.32 us       7.00 us    28154.00 us          22568     FINODELK
      7.80 1313096.68 us     125.00 us 23281862.00 us            189     FSYNC
     92.18     397.92 us      76.00 us  1838343.00 us        7372799     WRITE

    Duration: 7811 seconds
   Data Read: 0 bytes
Data Written: 966367641600 bytes

Interval 0 Stats:
   Block Size:             131072b+              262144b+
 No. of Reads:                    0                     0
No. of Writes:              7372798                     1
 %-latency   Avg-latency   Min-Latency    Max-Latency   No. of calls         Fop
 ---------   -----------   -----------    -----------   ------------        ----
      0.00       0.00 us       0.00 us        0.00 us             25     RELEASE
      0.00       0.00 us       0.00 us        0.00 us             16     RELEASEDIR
      0.00      61.50 us      46.00 us       77.00 us              2     ENTRYLK
      0.00      82.00 us      67.00 us       97.00 us              2     FLUSH
      0.00     265.00 us     265.00 us      265.00 us              1     CREATE
      0.00      57.43 us      30.00 us       85.00 us              7     STATFS
      0.00      61.12 us      37.00 us      107.00 us             16     OPENDIR
      0.00      44.04 us      12.00 us       86.00 us             24     FSTAT
      0.00      41.42 us      24.00 us       96.00 us             26     GETXATTR
      0.00      45.93 us      24.00 us      133.00 us             28     READDIR
      0.00      57.17 us      25.00 us      147.00 us             24     OPEN
      0.00     145.28 us      31.00 us      288.00 us             32     READDIRP
      0.00      39.50 us      10.00 us      152.00 us            132     INODELK
      0.00     330.97 us      20.00 us    14280.00 us             62     LOOKUP
      0.00      79.06 us      19.00 us      851.00 us            430     FXATTROP
      0.02      29.32 us       7.00 us    28154.00 us          22568     FINODELK
      7.80 1313096.68 us     125.00 us 23281862.00 us            189     FSYNC
     92.18     397.92 us      76.00 us  1838343.00 us        7372799     WRITE

    Duration: 7811 seconds
   Data Read: 0 bytes
Data Written: 966367641600 bytes
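
A quick sanity check on those numbers (plain bc arithmetic, values copied from
the profile output above):

echo '7372798 * 131072 + 262144' | bc
# = 966367641600 bytes, i.e. exactly the 900G file from dd (900 x 1 GiB blocks)
echo 'scale=1; 966367641600 / 7813 / 1000000' | bc
# = ~123 MB/s average write rate over the 7813 s run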

Does this make anything clearer?

2014-10-13 20:40 GMT+03:00 Roman <romeo.r at gmail.com>:

> I think I may know what the issue was. There was an iscsitarget service
> running that was exporting this generated block device, so maybe my
> colleague's Windows server picked it up and mounted it :) I'll see if it
> happens again.
>
> 2014-10-13 20:27 GMT+03:00 Roman <romeo.r at gmail.com>:
>
>> So may I restart the volume and start the test, or do you need something
>> else from this issue?
>>
>> 2014-10-13 19:49 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>
>>>
>>> On 10/13/2014 10:03 PM, Roman wrote:
>>>
>>> Hmm,
>>> this seems like another strange issue? I've seen this before and had to
>>> restart the volume to get my empty space back.
>>>  root@glstor-cli:/srv/nfs/HA-WIN-TT-1T# ls -l
>>> total 943718400
>>> -rw-r--r-- 1 root root 966367641600 Oct 13 16:55 disk
>>> root@glstor-cli:/srv/nfs/HA-WIN-TT-1T# rm disk
>>> root@glstor-cli:/srv/nfs/HA-WIN-TT-1T# df -h
>>> Filesystem                                              Size  Used Avail Use% Mounted on
>>> rootfs                                                  282G  1.1G  266G   1% /
>>> udev                                                     10M     0   10M   0% /dev
>>> tmpfs                                                   1.4G  228K  1.4G   1% /run
>>> /dev/disk/by-uuid/c62ee3c0-c0e5-44af-b0cd-7cb3fbcc0fba  282G  1.1G  266G   1% /
>>> tmpfs                                                   5.0M     0  5.0M   0% /run/lock
>>> tmpfs                                                   5.2G     0  5.2G   0% /run/shm
>>> stor1:HA-WIN-TT-1T                                     1008G  901G   57G  95% /srv/nfs/HA-WIN-TT-1T
>>>
>>>  No file, but df still shows 901G used.
>>> Both servers show the same.
>>> Do I really have to restart the volume to fix that?
>>>
>>> IMO this can happen if there is an fd leak. open-fd is the only variable
>>> that can change with volume restart. How do you re-create the bug?
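>>> A quick way to look for leaked fds on the bricks would be something along
>>> these lines (standard gluster CLI subcommands, volume name from above):
>>>
>>> gluster volume status HA-WIN-TT-1T fd     # open fd tables per brick
>>> gluster volume top HA-WIN-TT-1T open      # current/max open fd counts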
>>>
>>> Pranith
>>>
>>>
>>> 2014-10-13 19:30 GMT+03:00 Roman <romeo.r at gmail.com>:
>>>
>>>> Sure.
>>>> I'll let it run for the night.
>>>>
>>>> 2014-10-13 19:19 GMT+03:00 Pranith Kumar Karampuri <pkarampu at redhat.com
>>>> >:
>>>>
>>>>>  Hi Roman,
>>>>>      Do you think we can run this test again? This time, could you
>>>>> enable profiling with 'gluster volume profile <volname> start' and do the
>>>>> same test? Then provide the output of 'gluster volume profile <volname>
>>>>> info' and the logs after the test.
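>>>>>
>>>>> i.e. something along these lines (volume name taken from your setup):
>>>>>
>>>>> gluster volume profile HA-WIN-TT-1T start
>>>>> # ... repeat the dd test ...
>>>>> gluster volume profile HA-WIN-TT-1T info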
>>>>>
>>>>> Pranith
>>>>>
>>>>> On 10/13/2014 09:45 PM, Roman wrote:
>>>>>
>>>>> Sure !
>>>>>
>>>>>  root@stor1:~# gluster volume info
>>>>>
>>>>>  Volume Name: HA-2TB-TT-Proxmox-cluster
>>>>> Type: Replicate
>>>>> Volume ID: 66e38bde-c5fa-4ce2-be6e-6b2adeaa16c2
>>>>> Status: Started
>>>>> Number of Bricks: 1 x 2 = 2
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: stor1:/exports/HA-2TB-TT-Proxmox-cluster/2TB
>>>>> Brick2: stor2:/exports/HA-2TB-TT-Proxmox-cluster/2TB
>>>>> Options Reconfigured:
>>>>> nfs.disable: 0
>>>>> network.ping-timeout: 10
>>>>>
>>>>>  Volume Name: HA-WIN-TT-1T
>>>>> Type: Replicate
>>>>> Volume ID: 2937ac01-4cba-44a8-8ff8-0161b67f8ee4
>>>>> Status: Started
>>>>> Number of Bricks: 1 x 2 = 2
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: stor1:/exports/NFS-WIN/1T
>>>>> Brick2: stor2:/exports/NFS-WIN/1T
>>>>> Options Reconfigured:
>>>>> nfs.disable: 1
>>>>> network.ping-timeout: 10
>>>>>
>>>>>
>>>>>
>>>>> 2014-10-13 19:09 GMT+03:00 Pranith Kumar Karampuri <
>>>>> pkarampu at redhat.com>:
>>>>>
>>>>>>  Could you give your 'gluster volume info' output?
>>>>>>
>>>>>> Pranith
>>>>>>
>>>>>> On 10/13/2014 09:36 PM, Roman wrote:
>>>>>>
>>>>>>  Hi,
>>>>>>
>>>>>>  I've got this kind of setup (the servers run a replica volume):
>>>>>>
>>>>>>
>>>>>>  @ 10G backend
>>>>>> gluster storage1
>>>>>> gluster storage2
>>>>>> gluster client1
>>>>>>
>>>>>>  @ 1G backend
>>>>>> other gluster clients
>>>>>>
>>>>>>  The servers have HW RAID5 with SAS disks.
>>>>>>
>>>>>>  So today I decided to create a 900GB file for an iSCSI target that
>>>>>> will be located on a separate glusterfs volume, using dd (just a dummy file
>>>>>> filled with zeros, bs=1G count=900).
>>>>>> First of all, the process took quite a long time; the
>>>>>> writing speed was 130 MB/sec (the client port was 2 Gbps, the server ports
>>>>>> were running at 1 Gbps).
>>>>>> Then it reported something like "endpoint is not connected" and all
>>>>>> of my VMs on the other volume started to give me IO errors.
>>>>>> Server load was around 4.6 (12 cores total).
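>>>>>>
>>>>>> The dd invocation was roughly this (a sketch; path as mounted on the
>>>>>> client, the exact command line may have differed):
>>>>>>
>>>>>> dd if=/dev/zero of=/srv/nfs/HA-WIN-TT-1T/disk bs=1G count=900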
>>>>>>
>>>>>>  Maybe it was due to the ping timeout of 2 seconds, so I've made it a bit
>>>>>> higher, 10 seconds.
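>>>>>>
>>>>>> i.e. via the standard volume option (both volumes now show
>>>>>> network.ping-timeout: 10 in 'gluster volume info'):
>>>>>>
>>>>>> gluster volume set HA-2TB-TT-Proxmox-cluster network.ping-timeout 10
>>>>>> gluster volume set HA-WIN-TT-1T network.ping-timeout 10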
>>>>>>
>>>>>>  Also, during the dd image creation, the VMs very often reported
>>>>>> that their disks were slow, like:
>>>>>>
>>>>>> WARNINGs: Read IO Wait time is -0.02 (outside range [0:1]).
>>>>>>
>>>>>> Is 130 MB/sec the maximum bandwidth for all of the volumes in
>>>>>> total? Is that why we would need 10G backends?
>>>>>>
>>>>>> The HW RAID's local speed is 300 MB/sec, so it should not be an issue.
>>>>>> Any ideas or maybe any advice?
>>>>>>
>>>>>>
>>>>>>  Maybe someone has an optimized sysctl.conf for a 10G backend?
>>>>>>
>>>>>> Mine is pretty simple, the kind that can be found by googling.
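>>>>>>
>>>>>> Roughly the usual googled 10GbE starting values, something like:
>>>>>>
>>>>>> # sysctl.conf fragment (generic 10GbE TCP tuning, nothing gluster-specific)
>>>>>> net.core.rmem_max = 16777216
>>>>>> net.core.wmem_max = 16777216
>>>>>> net.ipv4.tcp_rmem = 4096 87380 16777216
>>>>>> net.ipv4.tcp_wmem = 4096 65536 16777216
>>>>>> net.core.netdev_max_backlog = 30000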
>>>>>>
>>>>>>
>>>>>>  Just to mention: those VMs were connected using a separate 1 Gbps
>>>>>> interface, which means they should not be affected by the client with the
>>>>>> 10G backend.
>>>>>>
>>>>>>
>>>>>>  The logs are pretty useless; they just say this during the outage:
>>>>>>
>>>>>>
>>>>>>  [2014-10-13 12:09:18.392910] W
>>>>>> [client-handshake.c:276:client_ping_cbk]
>>>>>> 0-HA-2TB-TT-Proxmox-cluster-client-0: timer must have expired
>>>>>>
>>>>>> [2014-10-13 12:10:08.389708] C
>>>>>> [client-handshake.c:127:rpc_client_ping_timer_expired]
>>>>>> 0-HA-2TB-TT-Proxmox-cluster-client-0: server 10.250.0.1:49159 has
>>>>>> not responded in the last 2 seconds, disconnecting.
>>>>>>
>>>>>> [2014-10-13 12:10:08.390312] W
>>>>>> [client-handshake.c:276:client_ping_cbk]
>>>>>> 0-HA-2TB-TT-Proxmox-cluster-client-0: timer must have expired
>>>>>>  So I decided to set the timeout a bit higher.
>>>>>>
>>>>>>  So it seems to me that under high load GlusterFS is not usable?
>>>>>> 130 MB/s is not that much to be getting timeouts, or to make the
>>>>>> system so slow that the VMs suffer.
>>>>>>
>>>>>>  Of course, after the disconnection the healing process was started,
>>>>>> but as the VMs lost connection to both servers it was pretty useless; they
>>>>>> could not run anymore. And BTW, when you load the server with such a huge
>>>>>> job (a dd of 900GB), the healing process goes really slowly :)
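>>>>>>
>>>>>> Heal progress can at least be watched with the standard command, e.g.:
>>>>>>
>>>>>> gluster volume heal HA-2TB-TT-Proxmox-cluster info    # entries still pending heal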
>>>>>>
>>>>>>
>>>>>>
>>>>>>  --
>>>>>> Best regards,
>>>>>> Roman.
>>>>>>
>>>>>>
>>>>>>  _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>  --
>>>>> Best regards,
>>>>> Roman.
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>   --
>>>> Best regards,
>>>> Roman.
>>>>
>>>
>>>
>>>
>>>  --
>>> Best regards,
>>> Roman.
>>>
>>>
>>>
>>
>>
>> --
>> Best regards,
>> Roman.
>>
>
>
>
> --
> Best regards,
> Roman.
>



-- 
Best regards,
Roman.