[Gluster-users] How to find out what GlusterFS is doing
Yaniv Kaul
ykaul at redhat.com
Thu Nov 5 14:28:24 UTC 2020
On Thu, Nov 5, 2020 at 4:18 PM mabi <mabi at protonmail.ch> wrote:
> Below is the output of running "top -bHd d" on one of the nodes; maybe
> that can help to show what that glusterfsd process is doing?
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 4375 root 20 0 2856784 120492 8360 D 61.1 0.4 117:09.29 glfs_iotwr001
>
They are waiting for I/O, just like the rest of the threads in the D state.
You may have a slow storage subsystem. How many cores do you have, btw?
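To double-check, something along these lines would show per-device latency
and utilization (iostat is part of the sysstat package; the 5-second
interval is just an example), and nproc prints the core count:

# extended per-device statistics: watch the await and %util columns
iostat -dx 5
# number of online cores
nproc

Also note that the glfs_iotwr* threads are the io-threads worker pool; its
size is capped by the performance.io-thread-count volume option (16 by
default), which matches the iotwr000-00f threads of the first brick above.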
Y.
> 4385 root 20 0 2856784 120492 8360 R 61.1 0.4 117:12.92 glfs_iotwr003
> 4387 root 20 0 2856784 120492 8360 R 61.1 0.4 117:32.19 glfs_iotwr005
> 4388 root 20 0 2856784 120492 8360 R 61.1 0.4 117:28.87 glfs_iotwr006
> 4391 root 20 0 2856784 120492 8360 D 61.1 0.4 117:20.71 glfs_iotwr008
> 4395 root 20 0 2856784 120492 8360 D 61.1 0.4 117:17.22 glfs_iotwr009
> 4405 root 20 0 2856784 120492 8360 R 61.1 0.4 117:19.52 glfs_iotwr00d
> 4406 root 20 0 2856784 120492 8360 R 61.1 0.4 117:29.51 glfs_iotwr00e
> 4366 root 20 0 2856784 120492 8360 D 55.6 0.4 117:27.58 glfs_iotwr000
> 4386 root 20 0 2856784 120492 8360 D 55.6 0.4 117:22.77 glfs_iotwr004
> 4390 root 20 0 2856784 120492 8360 D 55.6 0.4 117:26.49 glfs_iotwr007
> 4396 root 20 0 2856784 120492 8360 R 55.6 0.4 117:23.68 glfs_iotwr00a
> 4376 root 20 0 2856784 120492 8360 D 50.0 0.4 117:36.17 glfs_iotwr002
> 4397 root 20 0 2856784 120492 8360 D 50.0 0.4 117:11.09 glfs_iotwr00b
> 4403 root 20 0 2856784 120492 8360 R 50.0 0.4 117:26.34 glfs_iotwr00c
> 4408 root 20 0 2856784 120492 8360 D 50.0 0.4 117:27.47 glfs_iotwr00f
> 9814 root 20 0 2043684 75208 8424 D 22.2 0.2 50:15.20 glfs_iotwr003
> 28131 root 20 0 2043684 75208 8424 R 22.2 0.2 50:07.46 glfs_iotwr004
> 2208 root 20 0 2043684 75208 8424 R 22.2 0.2 49:32.70 glfs_iotwr008
> 2372 root 20 0 2043684 75208 8424 R 22.2 0.2 49:52.60 glfs_iotwr009
> 2375 root 20 0 2043684 75208 8424 D 22.2 0.2 49:54.08 glfs_iotwr00c
> 767 root 39 19 0 0 0 R 16.7 0.0 67:50.83 dbuf_evict
> 4132 onadmin 20 0 45292 4184 3176 R 16.7 0.0 0:00.04 top
> 28484 root 20 0 2043684 75208 8424 R 11.1 0.2 49:41.34 glfs_iotwr005
> 2376 root 20 0 2043684 75208 8424 R 11.1 0.2 49:49.49 glfs_iotwr00d
> 2719 root 20 0 2043684 75208 8424 R 11.1 0.2 49:58.61 glfs_iotwr00e
> 4384 root 20 0 2856784 120492 8360 S 5.6 0.4 4:01.27 glfs_rpcrqhnd
> 3842 root 20 0 2043684 75208 8424 S 5.6 0.2 0:30.12 glfs_epoll001
> 1 root 20 0 57696 7340 5248 S 0.0 0.0 0:03.59 systemd
> 2 root 20 0 0 0 0 S 0.0 0.0 0:09.57 kthreadd
> 3 root 20 0 0 0 0 S 0.0 0.0 0:00.16 ksoftirqd/0
> 5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
> 7 root 20 0 0 0 0 S 0.0 0.0 0:07.36 rcu_sched
> 8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
> 9 root rt 0 0 0 0 S 0.0 0.0 0:00.03 migration/0
> 10 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 lru-add-drain
> 11 root rt 0 0 0 0 S 0.0 0.0 0:00.01 watchdog/0
> 12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/0
> 13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/1
>
> Any clues anyone?
>
> The load is really high now, around 20, on both nodes...
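>
> In the meantime I will probably try Gluster's built-in profiling to see
> which file operations the bricks are busy with, something like this (with
> <VOLNAME> being the suspect volume):
>
> gluster volume profile <VOLNAME> start
> # ...let it collect statistics for a while, then:
> gluster volume profile <VOLNAME> info
> gluster volume profile <VOLNAME> stop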
>
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Thursday, November 5, 2020 11:50 AM, mabi <mabi at protonmail.ch> wrote:
>
> > Hello,
> >
> > I have a 3-node GlusterFS 7.8 replica (including an arbiter) with 3
> > volumes, and the two data nodes (not the arbiter) are under high load
> > because the glusterfsd brick process is taking all CPU resources (12 cores).
> >
> > Checking these two servers with the iostat command shows that the disks
> > are not very busy and are mostly doing write activity. There is not much
> > activity on the FUSE clients either, so I was wondering how to find out why
> > GlusterFS is currently generating such a high load on these two servers
> > (the arbiter does not show any high load). No files are currently healing
> > either. The busy volume is the only one which has the quota enabled, if
> > that might be a hint. So does anyone know how to see why GlusterFS is so
> > busy on a specific volume?
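> >
> > The only other idea I have so far is "gluster volume top", which I believe
> > can list the files with the most open/read/write calls, e.g. (with
> > <VOLNAME> being the busy volume):
> >
> > gluster volume top <VOLNAME> read
> > gluster volume top <VOLNAME> write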
> >
> > Here is a sample "vmstat 60" output from one of the nodes:
> >
> > onadmin at gfs1b:~$ vmstat 60
> > procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
> > r b swpd free buff cache si so bi bo in cs us sy id wa st
> > 9 2 0 22296776 32004 260284 0 0 33 301 153 39 2 60 36 2 0
> > 13 0 0 22244540 32048 260456 0 0 343 2798 10898 367652 2 80 16 1 0
> > 18 0 0 22215740 32056 260672 0 0 308 2524 9892 334537 2 83 14 1 0
> > 18 0 0 22179348 32084 260828 0 0 169 2038 8703 250351 1 88 10 0 0
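> >
> > What strikes me is the cs (context switches) column above at over 300000
> > per second. To see which threads generate them I would probably try
> > pidstat (from the sysstat package) against the brick process, something
> > like:
> >
> > pidstat -wt -p <glusterfsd-pid> 5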
> >
> > I already tried rebooting, but that did not help, and there is nothing
> > special in the log files either.
> >
> > Best regards,
> > Mabi
>
>