[Gluster-users] Gluster rebalance taking many years
shadowsocks飞飞
kiwizhang618 at gmail.com
Mon Apr 30 07:40:19 UTC 2018
I cannot count the number of files by normal means. From df -i, the approximate number of files is 63694442:
[root@CentOS-73-64-minimal ~]# df -i
Filesystem        Inodes     IUsed      IFree IUse% Mounted on
/dev/md2       131981312  30901030  101080282   24% /
devtmpfs         8192893       435    8192458    1% /dev
tmpfs            8199799      8029    8191770    1% /dev/shm
tmpfs            8199799      1415    8198384    1% /run
tmpfs            8199799        16    8199783    1% /sys/fs/cgroup
/dev/md3       110067712  29199861   80867851   27% /home
/dev/md1          131072       363     130709    1% /boot
gluster1:/web 2559860992  63694442 2496166550    3% /web
tmpfs            8199799         1    8199798    1% /run/user/0
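df -i on the mount only gives an aggregate, so if an exact per-brick count is ever needed, something like the following could be run directly on each brick of gluster1 (a slow walk over tens of millions of inodes; the .glusterfs directory is excluded so its hard links are not double-counted):

find /home/export/md3/brick -type f -not -path '*/.glusterfs/*' | wc -l
find /export/md2/brick -type f -not -path '*/.glusterfs/*' | wc -l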
The rebalance log is attached.
The cluster information:
gluster volume status web detail
Status of volume: web
------------------------------------------------------------------------------
Brick : Brick gluster1:/home/export/md3/brick
TCP Port : 49154
RDMA Port : 0
Online : Y
Pid : 16730
File System : ext4
Device : /dev/md3
Mount Options : rw,noatime,nodiratime,nobarrier,data=ordered
Inode Size : 256
Disk Space Free : 239.4GB
Total Disk Space : 1.6TB
Inode Count : 110067712
Free Inodes : 80867992
------------------------------------------------------------------------------
Brick : Brick gluster1:/export/md2/brick
TCP Port : 49155
RDMA Port : 0
Online : Y
Pid : 16758
File System : ext4
Device : /dev/md2
Mount Options : rw,noatime,nodiratime,nobarrier,data=ordered
Inode Size : 256
Disk Space Free : 589.4GB
Total Disk Space : 1.9TB
Inode Count : 131981312
Free Inodes : 101080484
------------------------------------------------------------------------------
Brick : Brick gluster2:/home/export/md3/brick
TCP Port : 49152
RDMA Port : 0
Online : Y
Pid : 12556
File System : xfs
Device : /dev/md3
Mount Options : rw,noatime,nodiratime,attr2,inode64,sunit=1024,swidth=3072,noquota
Inode Size : 256
Disk Space Free : 10.7TB
Total Disk Space : 10.8TB
Inode Count : 2317811968
Free Inodes : 2314218207
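As a rough cross-check, the used inodes per brick (Inode Count minus Free Inodes) sum to almost exactly the df -i figure, which suggests the 63694442 above is simply the aggregate inode usage of the brick filesystems rather than an exact file count:

echo $(( (110067712 - 80867992) + (131981312 - 101080484) + (2317811968 - 2314218207) ))
# 29199720 + 30900828 + 3593761 = 63694309, versus 63694442 reported for gluster1:/web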
Most of the files in the cluster are images smaller than 1 MB.
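For what it is worth, a back-of-the-envelope extrapolation (assuming the estimate simply projects the current per-node scan rate over all files, which is only a guess at how it is computed) lands in the same order of magnitude as the 9919:44:34 shown in the status output quoted below:

echo $(( 4393 * 60 / 37 ))     # ~7123 entries scanned per hour on gluster2 (4393 in ~37 minutes)
echo $(( 63694442 / 7123 ))    # ~8942 hours, i.e. roughly a year at this rate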
2018-04-30 15:16 GMT+08:00 Nithya Balachandran <nbalacha at redhat.com>:
> Hi,
>
>
> This value is an ongoing rough estimate based on the amount of data
> rebalance has migrated since it started. The values will change as the
> rebalance progresses.
> A few questions:
>
> 1. How many files/dirs do you have on this volume?
> 2. What is the average size of the files?
> 3. What is the total size of the data on the volume?
>
>
> Can you send us the rebalance log?
>
>
> Thanks,
> Nithya
>
> On 30 April 2018 at 10:33, kiwizhang618 <kiwizhang618 at gmail.com> wrote:
>
>> I ran into a big problem: the cluster rebalance takes a very long time
>> after adding a new node.
>>
>> gluster volume rebalance web status
>>       Node   Rebalanced-files     size   scanned   failures   skipped        status   run time in h:m:s
>>  ---------   ----------------   ------   -------   --------   -------   -----------   -----------------
>>  localhost                900   43.5MB      2232          0        69   in progress             0:36:49
>>   gluster2               1052   39.3MB      4393          0      1052   in progress             0:36:49
>> Estimated time left for rebalance to complete : 9919:44:34
>> volume rebalance: web: success
>>
>> The rebalance log:
>> [glusterfsd.c:2511:main] 0-/usr/sbin/glusterfs: Started running
>> /usr/sbin/glusterfs version 3.12.8 (args: /usr/sbin/glusterfs -s localhost
>> --volfile-id rebalance/web --xlator-option *dht.use-readdirp=yes
>> --xlator-option *dht.lookup-unhashed=yes --xlator-option
>> *dht.assert-no-child-down=yes --xlator-option
>> *replicate*.data-self-heal=off --xlator-option
>> *replicate*.metadata-self-heal=off --xlator-option
>> *replicate*.entry-self-heal=off --xlator-option *dht.readdir-optimize=on
>> --xlator-option *dht.rebalance-cmd=1 --xlator-option
>> *dht.node-uuid=d47ad89d-7979-4ede-9aba-e04f020bb4f0 --xlator-option
>> *dht.commit-hash=3610561770 --socket-file /var/run/gluster/gluster-rebal
>> ance-bdef10eb-1c83-410c-8ad3-fe286450004b.sock --pid-file
>> /var/lib/glusterd/vols/web/rebalance/d47ad89d-7979-4ede-9aba-e04f020bb4f0.pid
>> -l /var/log/glusterfs/web-rebalance.log)
>> [2018-04-30 04:20:45.100902] W [MSGID: 101002]
>> [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is
>> deprecated, preferred is 'transport.address-family', continuing with
>> correction
>> [2018-04-30 04:20:45.103927] I [MSGID: 101190]
>> [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
>> with index 1
>> [2018-04-30 04:20:55.191261] E [MSGID: 109039]
>> [dht-common.c:3113:dht_find_local_subvol_cbk] 0-web-dht: getxattr err
>> for dir [No data available]
>> [2018-04-30 04:21:19.783469] E [MSGID: 109023]
>> [dht-rebalance.c:2669:gf_defrag_migrate_single_file] 0-web-dht: Migrate
>> file failed: /2018/02/x187f6596-36ac-45e6-bd7a-019804dfe427.jpg, lookup
>> failed [Stale file handle]
>> The message "E [MSGID: 109039] [dht-common.c:3113:dht_find_local_subvol_cbk]
>> 0-web-dht: getxattr err for dir [No data available]" repeated 2 times
>> between [2018-04-30 04:20:55.191261] and [2018-04-30 04:20:55.193615]
>>
>> The gluster volume info:
>> Volume Name: web
>> Type: Distribute
>> Volume ID: bdef10eb-1c83-410c-8ad3-fe286450004b
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: gluster1:/home/export/md3/brick
>> Brick2: gluster1:/export/md2/brick
>> Brick3: gluster2:/home/export/md3/brick
>> Options Reconfigured:
>> nfs.trusted-sync: on
>> nfs.trusted-write: on
>> cluster.rebal-throttle: aggressive
>> features.inode-quota: off
>> features.quota: off
>> cluster.shd-wait-qlength: 1024
>> transport.address-family: inet
>> cluster.lookup-unhashed: auto
>> performance.cache-size: 1GB
>> performance.client-io-threads: on
>> performance.write-behind-window-size: 4MB
>> performance.io-thread-count: 8
>> performance.force-readdirp: on
>> performance.readdir-ahead: on
>> cluster.readdir-optimize: on
>> performance.high-prio-threads: 8
>> performance.flush-behind: on
>> performance.write-behind: on
>> performance.quick-read: off
>> performance.io-cache: on
>> performance.read-ahead: off
>> server.event-threads: 8
>> cluster.lookup-optimize: on
>> features.cache-invalidation: on
>> features.cache-invalidation-timeout: 600
>> performance.stat-prefetch: off
>> performance.md-cache-timeout: 60
>> network.inode-lru-limit: 90000
>> diagnostics.brick-log-level: ERROR
>> diagnostics.brick-sys-log-level: ERROR
>> diagnostics.client-log-level: ERROR
>> diagnostics.client-sys-log-level: ERROR
>> cluster.min-free-disk: 20%
>> cluster.self-heal-window-size: 16
>> cluster.self-heal-readdir-size: 1024
>> cluster.background-self-heal-count: 4
>> cluster.heal-wait-queue-length: 128
>> client.event-threads: 8
>> performance.cache-invalidation: on
>> nfs.disable: off
>> nfs.acl: off
>> cluster.brick-multiplex: disable
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: web-rebalance.log
Type: application/octet-stream
Size: 8306 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20180430/8ba1729c/attachment.obj>