[Gluster-users] Gluster rebalance taking many years

Nithya Balachandran nbalacha at redhat.com
Wed May 2 08:15:12 UTC 2018


Hi,

There is not much information in this log file. Which server is it from?
I will need to see the rebalance logs from both nodes.

It sounds like there are a lot of files on the volume, which is why the
rebalance will take time. What is the current rebalance status for the
volume?
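
For reference, the status and the logs can be gathered as follows (a quick
sketch; the command and the log path are the ones visible in the output
quoted further down):

    gluster volume rebalance web status

The rebalance log on each node is the file passed via -l when the rebalance
process is started, i.e. /var/log/glusterfs/web-rebalance.log on both
gluster1 and gluster2.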

Rebalance should not affect volume operations, so is there a particular
reason why the estimated time is a cause for concern?
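
As a rough sanity check on where that figure comes from (a back-of-envelope
calculation only, using the numbers from the status output quoted further
down):

    entries scanned so far    ≈ 2232 + 4393 = 6625 in 0:36:49 (~2209 s)
    scan rate                 ≈ 6625 / 2209 ≈ 3 files per second
    files on the volume       ≈ 63694442 (from df -i on the client mount)
    time at that rate         ≈ 63694442 / 3 ≈ 21 million s ≈ 5900 hours

That is the same order of magnitude as the reported 9919:44:34, so with
roughly 64 million small files an estimate of this size is not surprising.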


Regards,
Nithya




On 30 April 2018 at 13:10, shadowsocks飞飞 <kiwizhang618 at gmail.com> wrote:

> I cannot count the number of files by normal means.
>
> From df -i, the approximate number of files is 63694442:
>
> [root@CentOS-73-64-minimal ~]# df -i
> Filesystem       Inodes       IUsed      IFree        IUse%  Mounted on
> /dev/md2         131981312    30901030   101080282    24%    /
> devtmpfs         8192893      435        8192458      1%     /dev
> tmpfs            8199799      8029       8191770      1%     /dev/shm
> tmpfs            8199799      1415       8198384      1%     /run
> tmpfs            8199799      16         8199783      1%     /sys/fs/cgroup
> /dev/md3         110067712    29199861   80867851     27%    /home
> /dev/md1         131072       363        130709       1%     /boot
> gluster1:/web    2559860992   63694442   2496166550   3%     /web
> tmpfs            8199799      1          8199798      1%     /run/user/0
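>
> As a cross-check, the files can also be counted directly on each brick
> (a rough sketch; the brick paths are the ones shown in the volume status
> below, and the .glusterfs metadata directory is pruned so its entries are
> not counted twice):
>
>    find /home/export/md3/brick -path '*/.glusterfs' -prune -o -type f -print | wc -l
>    find /export/md2/brick -path '*/.glusterfs' -prune -o -type f -print | wc -l
>
> on gluster1, and the same for /home/export/md3/brick on gluster2.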
>
>
> The rebalance log is in the attachment.
>
> The cluster information:
>
> gluster volume status web detail
> Status of volume: web
> ------------------------------------------------------------------------------
> Brick                : Brick gluster1:/home/export/md3/brick
> TCP Port             : 49154
> RDMA Port            : 0
> Online               : Y
> Pid                  : 16730
> File System          : ext4
> Device               : /dev/md3
> Mount Options        : rw,noatime,nodiratime,nobarrier,data=ordered
> Inode Size           : 256
> Disk Space Free      : 239.4GB
> Total Disk Space     : 1.6TB
> Inode Count          : 110067712
> Free Inodes          : 80867992
> ------------------------------------------------------------------------------
> Brick                : Brick gluster1:/export/md2/brick
> TCP Port             : 49155
> RDMA Port            : 0
> Online               : Y
> Pid                  : 16758
> File System          : ext4
> Device               : /dev/md2
> Mount Options        : rw,noatime,nodiratime,nobarrier,data=ordered
> Inode Size           : 256
> Disk Space Free      : 589.4GB
> Total Disk Space     : 1.9TB
> Inode Count          : 131981312
> Free Inodes          : 101080484
> ------------------------------------------------------------------------------
> Brick                : Brick gluster2:/home/export/md3/brick
> TCP Port             : 49152
> RDMA Port            : 0
> Online               : Y
> Pid                  : 12556
> File System          : xfs
> Device               : /dev/md3
> Mount Options        : rw,noatime,nodiratime,attr2,inode64,sunit=1024,swidth=3072,noquota
> Inode Size           : 256
> Disk Space Free      : 10.7TB
> Total Disk Space     : 10.8TB
> Inode Count          : 2317811968
> Free Inodes          : 2314218207
>
> Most of the files in the cluster are images smaller than 1 MB.
>
>
> 2018-04-30 15:16 GMT+08:00 Nithya Balachandran <nbalacha at redhat.com>:
>
>> Hi,
>>
>>
>> This value is an ongoing rough estimate based on the amount of data
>> rebalance has migrated since it started. The value will change as the
>> rebalance progresses.
>> A few questions:
>>
>>    1. How many files/dirs do you have on this volume?
>>    2. What is the average size of the files?
>>    3. What is the total size of the data on the volume? (a quick way to
>>    estimate 2 and 3 is sketched below)
>>
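>> For 2 and 3, a rough estimate can be read straight off the client mount
>> (a sketch, assuming the volume is mounted at /web):
>>
>>    df -h /web    # used space  = total size of the data on the volume
>>    df -i /web    # used inodes = approximate number of files and directories
>>
>> Dividing the used bytes by the used inodes then gives a rough average
>> file size.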
>>
>> Can you send us the rebalance log?
>>
>>
>> Thanks,
>> Nithya
>>
>> On 30 April 2018 at 10:33, kiwizhang618 <kiwizhang618 at gmail.com> wrote:
>>
>>> I have run into a big problem: the cluster rebalance is taking a very
>>> long time after adding a new node.
>>>
>>> gluster volume rebalance web status
>>> Node        Rebalanced-files   size     scanned   failures   skipped   status        run time in h:m:s
>>> ---------   ----------------   ------   -------   --------   -------   -----------   -----------------
>>> localhost   900                43.5MB   2232      0          69        in progress   0:36:49
>>> gluster2    1052               39.3MB   4393      0          1052      in progress   0:36:49
>>> Estimated time left for rebalance to complete :     9919:44:34
>>> volume rebalance: web: success
>>>
>>> The rebalance log:
>>> [glusterfsd.c:2511:main] 0-/usr/sbin/glusterfs: Started running
>>> /usr/sbin/glusterfs version 3.12.8 (args: /usr/sbin/glusterfs -s localhost
>>> --volfile-id rebalance/web --xlator-option *dht.use-readdirp=yes
>>> --xlator-option *dht.lookup-unhashed=yes --xlator-option
>>> *dht.assert-no-child-down=yes --xlator-option
>>> *replicate*.data-self-heal=off --xlator-option
>>> *replicate*.metadata-self-heal=off --xlator-option
>>> *replicate*.entry-self-heal=off --xlator-option
>>> *dht.readdir-optimize=on --xlator-option *dht.rebalance-cmd=1
>>> --xlator-option *dht.node-uuid=d47ad89d-7979-4ede-9aba-e04f020bb4f0
>>> --xlator-option *dht.commit-hash=3610561770 --socket-file
>>> /var/run/gluster/gluster-rebalance-bdef10eb-1c83-410c-8ad3-fe286450004b.sock
>>> --pid-file /var/lib/glusterd/vols/web/rebalance/d47ad89d-7979-4ede-9aba-e04f020bb4f0.pid
>>> -l /var/log/glusterfs/web-rebalance.log)
>>> [2018-04-30 04:20:45.100902] W [MSGID: 101002]
>>> [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is
>>> deprecated, preferred is 'transport.address-family', continuing with
>>> correction
>>> [2018-04-30 04:20:45.103927] I [MSGID: 101190]
>>> [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
>>> with index 1
>>> [2018-04-30 04:20:55.191261] E [MSGID: 109039]
>>> [dht-common.c:3113:dht_find_local_subvol_cbk] 0-web-dht: getxattr err
>>> for dir [No data available]
>>> [2018-04-30 04:21:19.783469] E [MSGID: 109023]
>>> [dht-rebalance.c:2669:gf_defrag_migrate_single_file] 0-web-dht: Migrate
>>> file failed: /2018/02/x187f6596-36ac-45e6-bd7a-019804dfe427.jpg, lookup
>>> failed [Stale file handle]
>>> The message "E [MSGID: 109039] [dht-common.c:3113:dht_find_local_subvol_cbk]
>>> 0-web-dht: getxattr err for dir [No data available]" repeated 2 times
>>> between [2018-04-30 04:20:55.191261] and [2018-04-30 04:20:55.193615]
>>>
>>> The gluster volume info:
>>> Volume Name: web
>>> Type: Distribute
>>> Volume ID: bdef10eb-1c83-410c-8ad3-fe286450004b
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gluster1:/home/export/md3/brick
>>> Brick2: gluster1:/export/md2/brick
>>> Brick3: gluster2:/home/export/md3/brick
>>> Options Reconfigured:
>>> nfs.trusted-sync: on
>>> nfs.trusted-write: on
>>> cluster.rebal-throttle: aggressive
>>> features.inode-quota: off
>>> features.quota: off
>>> cluster.shd-wait-qlength: 1024
>>> transport.address-family: inet
>>> cluster.lookup-unhashed: auto
>>> performance.cache-size: 1GB
>>> performance.client-io-threads: on
>>> performance.write-behind-window-size: 4MB
>>> performance.io-thread-count: 8
>>> performance.force-readdirp: on
>>> performance.readdir-ahead: on
>>> cluster.readdir-optimize: on
>>> performance.high-prio-threads: 8
>>> performance.flush-behind: on
>>> performance.write-behind: on
>>> performance.quick-read: off
>>> performance.io-cache: on
>>> performance.read-ahead: off
>>> server.event-threads: 8
>>> cluster.lookup-optimize: on
>>> features.cache-invalidation: on
>>> features.cache-invalidation-timeout: 600
>>> performance.stat-prefetch: off
>>> performance.md-cache-timeout: 60
>>> network.inode-lru-limit: 90000
>>> diagnostics.brick-log-level: ERROR
>>> diagnostics.brick-sys-log-level: ERROR
>>> diagnostics.client-log-level: ERROR
>>> diagnostics.client-sys-log-level: ERROR
>>> cluster.min-free-disk: 20%
>>> cluster.self-heal-window-size: 16
>>> cluster.self-heal-readdir-size: 1024
>>> cluster.background-self-heal-count: 4
>>> cluster.heal-wait-queue-length: 128
>>> client.event-threads: 8
>>> performance.cache-invalidation: on
>>> nfs.disable: off
>>> nfs.acl: off
>>> cluster.brick-multiplex: disable
>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>
>>
>