[Gluster-users] question on rebalance errors gluster 7.2 (adding to distributed/replicated)
Erik Jacobson
erik.jacobson at hpe.com
Tue Feb 11 00:46:13 UTC 2020
My question: are the errors and anomalies below something I need to
investigate, or should I not be worried?
I installed gluster 7.2 on a test cluster to run some tests, preparing
to see if we gain enough confidence to put it on the 5,120-node
supercomputer in place of gluster 4.1.6.
I started with a 3x2 volume (6 servers, distributed/replicated) with
heavy optimizations for writes and NFS.
I booted my NFS-root clients and kept them online.
I then performed an add-brick operation to make it a 3x3 instead of a
3x2 (so 9 servers instead of 6).
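For what it's worth, something like

  gluster volume info cm_shared | grep "Number of Bricks"

should confirm the new layout -- I'd expect it to report "Number of Bricks: 3 x 3 = 9"
after the add-brick, though the exact wording may differ by version.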
The rebalance went much better for me than it did with gluster 4.1.6.
However, I saw some errors. We noted them first here -- 14 on leader8
and a few on the others. The nodes with failures are the NEW nodes, so
the data flow was from the old nodes to these three, each of which has
at least one error:
[root at leader8 glusterfs]# gluster volume rebalance cm_shared status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
leader1.head.cm.eag.rdlabs.hpecorp.net 18933 596.4MB 181780 0 3760 completed 0:41:39
172.23.0.4 18960 1.2GB 181831 0 3766 completed 0:41:39
172.23.0.5 18691 1.2GB 181826 0 3716 completed 0:41:39
172.23.0.6 14917 618.8MB 175758 0 3869 completed 0:35:40
172.23.0.7 15114 573.5MB 175728 0 3853 completed 0:35:41
172.23.0.8 14864 459.2MB 175742 0 3951 completed 0:35:40
172.23.0.9 0 0Bytes 11 3 0 completed 0:08:26
172.23.0.11 0 0Bytes 242 1 0 completed 0:08:25
localhost 0 0Bytes 5 14 0 completed 0:08:26
volume rebalance: cm_shared: success
My rebalance log is around 32 MB, and I find it's hard for people to help
me when I post that much data, so I've tried to filter it down here into
two classes -- anomalies and errors.
Errors (14 reported on this node):
[root at leader8 glusterfs]# grep -i "error from gf_defrag_get_entry" cm_shared-rebalance.log
[2020-02-10 23:23:55.286830] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:12.903496] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:15.226948] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:15.259480] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:15.398784] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:16.633033] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:16.645847] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:21.783528] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:22.307464] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:25:23.391256] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:26:34.203129] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:26:39.669243] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:27:42.615081] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:28:53.942158] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
Brick log errors around 23:23:55 (to match the first error above):
[2020-02-10 23:23:54.605681] W [MSGID: 113096] [posix-handle.c:834:posix_handle_soft] 0-cm_shared-posix: symlink ../../a4/3e/a43ef7fd-08eb-434c-8168-96a92059d186/LC_MESSAGES -> /data/brick_cm_shared/.glusterfs/10/d9/10d97106-49b1-4c5e-a86f-b8e70c9ef838 failed [File exists]
[2020-02-10 23:23:54.883387] W [MSGID: 113096] [posix-handle.c:834:posix_handle_soft] 0-cm_shared-posix: symlink ../../7d/66/7d66930c-3bd0-40c8-9473-897fcd2f8c11/LC_MESSAGES -> /data/brick_cm_shared/.glusterfs/7c/41/7c412877-2443-43a8-9c7a-67ada4d96a13 failed [File exists]
[2020-02-10 23:23:55.284155] W [MSGID: 113096] [posix-handle.c:834:posix_handle_soft] 0-cm_shared-posix: symlink ../../a0/2c/a02c8b2d-f587-4c58-9de9-7928828e37e5/LC_MESSAGES -> /data/brick_cm_shared/.glusterfs/eb/79/eb79298d-a65e-41f3-a9a8-da4634879e88 failed [File exists]
[2020-02-10 23:23:55.284178] E [MSGID: 113020] [posix-entry-ops.c:835:posix_mkdir] 0-cm_shared-posix: setting gfid on /data/brick_cm_shared/image/images_ro_nfs/rhel8.0/usr/share/vim/vim80/lang/zh_CN.UTF-8/LC_MESSAGES failed [File exists]
[2020-02-10 23:23:55.284913] W [MSGID: 113103] [posix-entry-ops.c:247:posix_lookup] 0-cm_shared-posix: Found stale gfid handle /data/brick_cm_shared/.glusterfs/eb/79/eb79298d-a65e-41f3-a9a8-da4634879e88, removing it. [No such file or directory]
[2020-02-10 23:23:57.218664] W [MSGID: 113096] [posix-handle.c:834:posix_handle_soft] 0-cm_shared-posix: symlink ../../86/c2/86c2e694-d00b-4dcf-8383-60ce0cb07275/html -> /data/brick_cm_shared/.glusterfs/5c/f0/5cf0cc7d-86fe-4ba2-bea5-1d8ad3616274 failed [File exists]
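If it helps, a cross-check I can run on a new brick for the stale-gfid case above
(path and gfid taken from the log entries) is to compare the directory's gfid xattr
with the handle being complained about, roughly:

  getfattr -n trusted.gfid -e hex /data/brick_cm_shared/image/images_ro_nfs/rhel8.0/usr/share/vim/vim80/lang/zh_CN.UTF-8/LC_MESSAGES
  ls -l /data/brick_cm_shared/.glusterfs/eb/79/eb79298d-a65e-41f3-a9a8-da4634879e88

My guess is the "File exists" warnings are just the rebalance creating the same
directory handle from more than one place at nearly the same time, but I'd
appreciate confirmation from someone who knows this code path.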
Example anomalies - normal root files:
[2020-02-10 23:28:18.816012] I [MSGID: 109063] [dht-layout.c:647:dht_layout_normalize] 0-cm_shared-dht: Found anomalies in /image/images_dist/rhel8.0/usr/lib64/python3.6/email (gfid = 4194dca6-dcc9-409b-a162-58e90b8db63d). Holes=1 overlaps=0
[2020-02-10 23:28:18.822869] I [MSGID: 109063] [dht-layout.c:647:dht_layout_normalize] 0-cm_shared-dht: Found anomalies in /image/images_dist/rhel8.0/usr/lib64/python3.6/email/__pycache__ (gfid = 07e4e462-de25-4840-99dc-f4235b4b45bf). Holes=1 overlaps=0
[2020-02-10 23:28:18.834924] I [MSGID: 109063] [dht-layout.c:647:dht_layout_normalize] 0-cm_shared-dht: Found anomalies in /image/images_dist/rhel8.0/usr/lib64/python3.6/email/mime (gfid = f882e53c-43c6-48ea-9230-c0bc7eee901f). Holes=1 overlaps=0
...
Example anomalies - sparse files with XFS images used as node-writable space
(these are just the directories that hold the sparse files, not the sparse
files themselves):
[2020-02-10 23:26:07.231529] I [MSGID: 109063] [dht-layout.c:647:dht_layout_normalize] 0-cm_shared-dht: Found anomalies in /image/images_rw_nfs/n2521 (gfid = 3b65777c-5fc5-4213-9525-294e74a560ca). Holes=1 overlaps=0
[2020-02-10 23:26:07.237923] I [MSGID: 109063] [dht-layout.c:647:dht_layout_normalize] 0-cm_shared-dht: Found anomalies in /image/images_rw_nfs/n2521/rhel8.0-aarch64 (gfid = f822683d-7136-4d5c-8df5-94f1b84afc03). Holes=1 overlaps=0
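If the layouts themselves are interesting, I can dump the DHT layout xattr for one
of the flagged directories on every brick, along the lines of:

  getfattr -n trusted.glusterfs.dht -e hex /data/brick_cm_shared/image/images_rw_nfs/n2521

My understanding is that Holes=1 just means the newly added bricks had no hash range
assigned for that directory yet and the layout gets filled in as part of the
rebalance, but please correct me if these INFO messages point at a real problem.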
Volume status:
[root at leader8 glusterfs]# gluster volume status cm_shared
Status of volume: cm_shared
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 172.23.0.3:/data/brick_cm_shared 49152 0 Y 36543
Brick 172.23.0.4:/data/brick_cm_shared 49152 0 Y 34371
Brick 172.23.0.5:/data/brick_cm_shared 49152 0 Y 34451
Brick 172.23.0.6:/data/brick_cm_shared 49152 0 Y 35685
Brick 172.23.0.7:/data/brick_cm_shared 49152 0 Y 34068
Brick 172.23.0.8:/data/brick_cm_shared 49152 0 Y 35093
Brick 172.23.0.9:/data/brick_cm_shared 49154 0 Y 31940
Brick 172.23.0.10:/data/brick_cm_shared 49154 0 Y 32420
Brick 172.23.0.11:/data/brick_cm_shared 49154 0 Y 32906
Self-heal Daemon on localhost N/A N/A Y 32063
NFS Server on localhost 2049 0 Y 32493
Self-heal Daemon on 172.23.0.4 N/A N/A Y 34435
NFS Server on 172.23.0.4 2049 0 Y 9636
Self-heal Daemon on 172.23.0.5 N/A N/A Y 34514
NFS Server on 172.23.0.5 2049 0 Y 11483
Self-heal Daemon on 172.23.0.7 N/A N/A Y 34131
NFS Server on 172.23.0.7 2049 0 Y 12294
Self-heal Daemon on 172.23.0.6 N/A N/A Y 35752
NFS Server on 172.23.0.6 2049 0 Y 4699
Self-heal Daemon on leader1.head.cm.eag.rdlabs.hpecorp.net N/A N/A Y 36626
NFS Server on leader1.head.cm.eag.rdlabs.hpecorp.net 2049 0 Y 8736
Self-heal Daemon on 172.23.0.9 N/A N/A Y 31583
NFS Server on 172.23.0.9 2049 0 Y 31996
Self-heal Daemon on 172.23.0.11 N/A N/A Y 32550
NFS Server on 172.23.0.11 2049 0 Y 32962
Self-heal Daemon on 172.23.0.8 N/A N/A Y 35160
NFS Server on 172.23.0.8 2049 0 Y 2250
Task Status of Volume cm_shared
------------------------------------------------------------------------------
Task : Rebalance
ID : f42c98ad-801a-4376-94ea-7dff698f8241
Status : completed
Commands used to grow:
ssh leader1 gluster volume add-brick cm_shared 172.23.0.9://data/brick_cm_shared 172.23.0.10://data/brick_cm_shared 172.23.0.11://data/brick_cm_shared
volume add-brick: success
ssh leader1 gluster volume rebalance cm_shared start
volume rebalance: cm_shared: success: Rebalance on cm_shared has been started successfully. Use rebalance status command to check status of the rebalance process.
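In case per-node counts are useful, a rough way to tally the failures on the new
servers (assuming the default log location /var/log/glusterfs on each one) would be:

  for h in 172.23.0.9 172.23.0.10 172.23.0.11; do
      echo -n "$h: "
      ssh $h grep -c gf_defrag_get_entry /var/log/glusterfs/cm_shared-rebalance.log
  done

I'd expect those counts to roughly line up with the failures column in the
rebalance status output above.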
All volume data/settings:
[root at leader8 glusterfs]# gluster volume get cm_shared all
Option Value
------ -----
cluster.lookup-unhashed auto
cluster.lookup-optimize on
cluster.min-free-disk 10%
cluster.min-free-inodes 5%
cluster.rebalance-stats off
cluster.subvols-per-directory (null)
cluster.readdir-optimize off
cluster.rsync-hash-regex (null)
cluster.extra-hash-regex (null)
cluster.dht-xattr-name trusted.glusterfs.dht
cluster.randomize-hash-range-by-gfid off
cluster.rebal-throttle normal
cluster.lock-migration off
cluster.force-migration off
cluster.local-volume-name (null)
cluster.weighted-rebalance on
cluster.switch-pattern (null)
cluster.entry-change-log on
cluster.read-subvolume (null)
cluster.read-subvolume-index -1
cluster.read-hash-mode 1
cluster.background-self-heal-count 8
cluster.metadata-self-heal off
cluster.data-self-heal off
cluster.entry-self-heal off
cluster.self-heal-daemon on
cluster.heal-timeout 600
cluster.self-heal-window-size 1
cluster.data-change-log on
cluster.metadata-change-log on
cluster.data-self-heal-algorithm (null)
cluster.eager-lock on
disperse.eager-lock on
disperse.other-eager-lock on
disperse.eager-lock-timeout 1
disperse.other-eager-lock-timeout 1
cluster.quorum-type auto
cluster.quorum-count (null)
cluster.choose-local true
cluster.self-heal-readdir-size 1KB
cluster.post-op-delay-secs 1
cluster.ensure-durability on
cluster.consistent-metadata no
cluster.heal-wait-queue-length 128
cluster.favorite-child-policy none
cluster.full-lock yes
cluster.optimistic-change-log on
diagnostics.latency-measurement off
diagnostics.dump-fd-stats off
diagnostics.count-fop-hits off
diagnostics.brick-log-level INFO
diagnostics.client-log-level INFO
diagnostics.brick-sys-log-level CRITICAL
diagnostics.client-sys-log-level CRITICAL
diagnostics.brick-logger (null)
diagnostics.client-logger (null)
diagnostics.brick-log-format (null)
diagnostics.client-log-format (null)
diagnostics.brick-log-buf-size 5
diagnostics.client-log-buf-size 5
diagnostics.brick-log-flush-timeout 120
diagnostics.client-log-flush-timeout 120
diagnostics.stats-dump-interval 0
diagnostics.fop-sample-interval 0
diagnostics.stats-dump-format json
diagnostics.fop-sample-buf-size 65535
diagnostics.stats-dnscache-ttl-sec 86400
performance.cache-max-file-size 0
performance.cache-min-file-size 0
performance.cache-refresh-timeout 60
performance.cache-priority
performance.cache-size 8GB
performance.io-thread-count 32
performance.high-prio-threads 16
performance.normal-prio-threads 16
performance.low-prio-threads 16
performance.least-prio-threads 1
performance.enable-least-priority on
performance.iot-watchdog-secs (null)
performance.iot-cleanup-disconnected-reqs off
performance.iot-pass-through false
performance.io-cache-pass-through false
performance.cache-size 8GB
performance.qr-cache-timeout 1
performance.cache-invalidation on
performance.ctime-invalidation false
performance.flush-behind on
performance.nfs.flush-behind on
performance.write-behind-window-size 1024MB
performance.resync-failed-syncs-after-fsync off
performance.nfs.write-behind-window-size 1MB
performance.strict-o-direct off
performance.nfs.strict-o-direct off
performance.strict-write-ordering off
performance.nfs.strict-write-ordering off
performance.write-behind-trickling-writes off
performance.aggregate-size 2048KB
performance.nfs.write-behind-trickling-writes on
performance.lazy-open yes
performance.read-after-open yes
performance.open-behind-pass-through false
performance.read-ahead-page-count 4
performance.read-ahead-pass-through false
performance.readdir-ahead-pass-through false
performance.md-cache-pass-through false
performance.md-cache-timeout 600
performance.cache-swift-metadata true
performance.cache-samba-metadata false
performance.cache-capability-xattrs true
performance.cache-ima-xattrs true
performance.md-cache-statfs off
performance.xattr-cache-list
performance.nl-cache-pass-through false
network.frame-timeout 1800
network.ping-timeout 42
network.tcp-window-size (null)
client.ssl off
network.remote-dio disable
client.event-threads 32
client.tcp-user-timeout 0
client.keepalive-time 20
client.keepalive-interval 2
client.keepalive-count 9
network.tcp-window-size (null)
network.inode-lru-limit 1000000
auth.allow *
auth.reject (null)
transport.keepalive 1
server.allow-insecure on
server.root-squash off
server.all-squash off
server.anonuid 65534
server.anongid 65534
server.statedump-path /var/run/gluster
server.outstanding-rpc-limit 1024
server.ssl off
auth.ssl-allow *
server.manage-gids off
server.dynamic-auth on
client.send-gids on
server.gid-timeout 300
server.own-thread (null)
server.event-threads 32
server.tcp-user-timeout 42
server.keepalive-time 20
server.keepalive-interval 2
server.keepalive-count 9
transport.listen-backlog 16384
transport.address-family inet
performance.write-behind on
performance.read-ahead on
performance.readdir-ahead on
performance.io-cache on
performance.open-behind on
performance.quick-read on
performance.nl-cache off
performance.stat-prefetch on
performance.client-io-threads on
performance.nfs.write-behind on
performance.nfs.read-ahead off
performance.nfs.io-cache on
performance.nfs.quick-read off
performance.nfs.stat-prefetch off
performance.nfs.io-threads off
performance.force-readdirp true
performance.cache-invalidation on
performance.global-cache-invalidation true
features.uss off
features.snapshot-directory .snaps
features.show-snapshot-directory off
features.tag-namespaces off
network.compression off
network.compression.window-size -15
network.compression.mem-level 8
network.compression.min-size 0
network.compression.compression-level -1
network.compression.debug false
features.default-soft-limit 80%
features.soft-timeout 60
features.hard-timeout 5
features.alert-time 86400
features.quota-deem-statfs off
geo-replication.indexing off
geo-replication.indexing off
geo-replication.ignore-pid-check off
geo-replication.ignore-pid-check off
features.quota off
features.inode-quota off
features.bitrot disable
debug.trace off
debug.log-history no
debug.log-file no
debug.exclude-ops (null)
debug.include-ops (null)
debug.error-gen off
debug.error-failure (null)
debug.error-number (null)
debug.random-failure off
debug.error-fops (null)
nfs.enable-ino32 no
nfs.mem-factor 15
nfs.export-dirs on
nfs.export-volumes on
nfs.addr-namelookup off
nfs.dynamic-volumes off
nfs.register-with-portmap on
nfs.outstanding-rpc-limit 1024
nfs.port 2049
nfs.rpc-auth-unix on
nfs.rpc-auth-null on
nfs.rpc-auth-allow all
nfs.rpc-auth-reject none
nfs.ports-insecure off
nfs.trusted-sync off
nfs.trusted-write off
nfs.volume-access read-write
nfs.export-dir
nfs.disable off
nfs.nlm off
nfs.acl on
nfs.mount-udp off
nfs.mount-rmtab /-
nfs.rpc-statd /sbin/rpc.statd
nfs.server-aux-gids off
nfs.drc off
nfs.drc-size 0x20000
nfs.read-size (1 * 1048576ULL)
nfs.write-size (1 * 1048576ULL)
nfs.readdir-size (1 * 1048576ULL)
nfs.rdirplus on
nfs.event-threads 2
nfs.exports-auth-enable on
nfs.auth-refresh-interval-sec 360
nfs.auth-cache-ttl-sec 360
features.read-only off
features.worm off
features.worm-file-level off
features.worm-files-deletable on
features.default-retention-period 120
features.retention-mode relax
features.auto-commit-period 180
storage.linux-aio off
storage.batch-fsync-mode reverse-fsync
storage.batch-fsync-delay-usec 0
storage.owner-uid -1
storage.owner-gid -1
storage.node-uuid-pathinfo off
storage.health-check-interval 30
storage.build-pgfid off
storage.gfid2path on
storage.gfid2path-separator :
storage.reserve 1
storage.reserve-size 0
storage.health-check-timeout 10
storage.fips-mode-rchecksum on
storage.force-create-mode 0000
storage.force-directory-mode 0000
storage.create-mask 0777
storage.create-directory-mask 0777
storage.max-hardlinks 0
features.ctime on
config.gfproxyd off
cluster.server-quorum-type off
cluster.server-quorum-ratio 51
changelog.changelog off
changelog.changelog-dir {{ brick.path }}/.glusterfs/changelogs
changelog.encoding ascii
changelog.rollover-time 15
changelog.fsync-interval 5
changelog.changelog-barrier-timeout 120
changelog.capture-del-path off
features.barrier disable
features.barrier-timeout 120
features.trash off
features.trash-dir .trashcan
features.trash-eliminate-path (null)
features.trash-max-filesize 5MB
features.trash-internal-op off
cluster.enable-shared-storage disable
locks.trace off
locks.mandatory-locking off
cluster.disperse-self-heal-daemon enable
cluster.quorum-reads no
client.bind-insecure (null)
features.shard off
features.shard-block-size 64MB
features.shard-lru-limit 16384
features.shard-deletion-rate 100
features.scrub-throttle lazy
features.scrub-freq biweekly
features.scrub false
features.expiry-time 120
features.cache-invalidation on
features.cache-invalidation-timeout 600
features.leases off
features.lease-lock-recall-timeout 60
disperse.background-heals 8
disperse.heal-wait-qlength 128
cluster.heal-timeout 600
dht.force-readdirp on
disperse.read-policy gfid-hash
cluster.shd-max-threads 1
cluster.shd-wait-qlength 1024
cluster.locking-scheme full
cluster.granular-entry-heal no
features.locks-revocation-secs 0
features.locks-revocation-clear-all false
features.locks-revocation-max-blocked 0
features.locks-monkey-unlocking false
features.locks-notify-contention no
features.locks-notify-contention-delay 5
disperse.shd-max-threads 1
disperse.shd-wait-qlength 1024
disperse.cpu-extensions auto
disperse.self-heal-window-size 1
cluster.use-compound-fops off
performance.parallel-readdir on
performance.rda-request-size 131072
performance.rda-low-wmark 4096
performance.rda-high-wmark 128KB
performance.rda-cache-limit 10MB
performance.nl-cache-positive-entry false
performance.nl-cache-limit 10MB
performance.nl-cache-timeout 60
cluster.brick-multiplex disable
glusterd.vol_count_per_thread 100
cluster.max-bricks-per-process 250
disperse.optimistic-change-log on
disperse.stripe-cache 4
cluster.halo-enabled False
cluster.halo-shd-max-latency 99999
cluster.halo-nfsd-max-latency 5
cluster.halo-max-latency 5
cluster.halo-max-replicas 99999
cluster.halo-min-replicas 2
features.selinux on
cluster.daemon-log-level INFO
debug.delay-gen off
delay-gen.delay-percentage 10%
delay-gen.delay-duration 100000
delay-gen.enable
disperse.parallel-writes on
features.sdfs off
features.cloudsync off
features.ctime on
ctime.noatime on
features.cloudsync-storetype (null)
features.enforce-mandatory-lock off
config.global-threading off
config.client-threads 16
config.brick-threads 16
features.cloudsync-remote-read off
features.cloudsync-store-id (null)
features.cloudsync-product-id (null)