[Bugs] [Bug 1276943] New: Data Tiering:heat counters not getting reset and also internal ops seem to be heating the files
bugzilla at redhat.com
Sun Nov 1 15:32:31 UTC 2015
https://bugzilla.redhat.com/show_bug.cgi?id=1276943
Bug ID: 1276943
Summary: Data Tiering:heat counters not getting reset and also
internal ops seem to be heating the files
Product: GlusterFS
Version: mainline
Component: tiering
Severity: urgent
Priority: urgent
Assignee: bugs at gluster.org
Reporter: dlambrig at redhat.com
QA Contact: bugs at gluster.org
CC: bugs at gluster.org, nchilaka at redhat.com,
vagarwal at redhat.com
Depends On: 1272450
Blocks: 1272452, 1275483, 1275524
+++ This bug was initially created as a clone of Bug #1272450 +++
Description of problem:
========================
I observed that the heat of a file is sometimes not reset in the next cycle.
Internal operations such as xattr changes also appear to heat files, which is
not acceptable.
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
==================
1. Created a 2x2 volume and started it.
2. Attached a pure-distribute tier of 4 bricks, each disk only 1 GB in size
(also tried with a 2x2 hot layer).
3. Enabled CTR.
4. Created a 700 MB file f1, which hashed to brick1 of the hot tier.
5. When idle, the file got demoted.
6. Created a 700 MB file f2 such that it hashed to the same brick as f1, i.e.
brick1 of the hot tier.
7. Waited for it to get demoted.
8. Touched both f1 and f2 to heat them. Since the space in the hot tier would
be insufficient for both, I wanted to see the behavior.
9. f1 got promoted, but f2 failed with the tier log reporting insufficient
disk space, which is perfectly valid.
10. However, the heat measure was still showing up in the sqldb query.
11. Waited for f1 to get demoted in the next cycle. While f1 got demoted, f2
got promoted, because the heat counters had not been reset.
A newly set read heat counter was also seen.
Expected results:
===============
>Heat counters should get reset after each cycle
>Internal metadata reads/writes and other internal operations should not heat
files
>Read counters should not get set in this case
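
The expected results above can be illustrated with a small toy model. This is
only a sketch and not GlusterFS code: HeatTracker, record_access, end_cycle and
run_cycle are hypothetical names, sizes are in MB, and the model assumes a
single promotion pass per cycle against one hot brick.

```python
from collections import defaultdict


class HeatTracker:
    """Toy model of per-cycle heat counters (hypothetical, not the CTR code)."""

    def __init__(self):
        self.write_hits = defaultdict(int)
        self.read_hits = defaultdict(int)

    def record_access(self, path, kind, internal=False):
        # Internal operations (for example xattr updates done by migration)
        # are ignored so that they never heat a file.
        if internal:
            return
        if kind == "write":
            self.write_hits[path] += 1
        elif kind == "read":
            self.read_hits[path] += 1
        else:
            raise ValueError(f"unknown access kind: {kind!r}")

    def hot_files(self):
        """Files with at least one client access in the current cycle."""
        return sorted(set(self.write_hits) | set(self.read_hits))

    def end_cycle(self):
        # Counters are cleared every cycle, whether or not promotion worked.
        self.write_hits.clear()
        self.read_hits.clear()


def run_cycle(tracker, free_mb, sizes_mb):
    """Promote hot files that fit, then reset heat. Returns (promoted, failed)."""
    promoted, failed = [], []
    for path in tracker.hot_files():
        if sizes_mb[path] <= free_mb:
            free_mb -= sizes_mb[path]
            promoted.append(path)
        else:
            failed.append(path)  # analogous to the "no free space" failure
    tracker.end_cycle()
    return promoted, failed
```

In this model, a file whose promotion failed for lack of space is not a
candidate in the following cycle unless a client touches it again, which is the
behavior requested above.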
CLI LOGS:
=======
[root at zod glusterfs]# tail -f portugal-tier.log
[2015-10-16 11:50:00.715894] E [MSGID: 109023]
[dht-rebalance.c:699:__dht_check_free_space] 0-portugal-tier-dht: data movement
attempted from node (portugal-cold-dht) to node (portugal-hot-dht) which does
not have required free space for (/lisbon.2)
[2015-10-16 11:50:00.716661] E [MSGID: 109037]
[tier.c:492:tier_migrate_using_query_file] 0-portugal-tier-dht: ERROR -28 in
current migration lisbon.2 /lisbon.2
[2015-10-16 11:50:00.716820] E [MSGID: 109037] [tier.c:1454:tier_start]
0-portugal-tier-dht: Promotion failed
[2015-10-16 11:52:00.728929] I [MSGID: 109038]
[tier.c:1008:tier_build_migration_qfile] 0-portugal-tier-dht: Failed to remove
/var/run/gluster/portugal-tier-dht/promotequeryfile-portugal-tier-dht
[2015-10-16 11:52:00.734278] I [MSGID: 109038]
[tier.c:476:tier_migrate_using_query_file] 0-portugal-tier-dht: Tier 1
src_subvol portugal-cold-dht file lisbon.2
[2015-10-16 11:52:00.736125] I [dht-rebalance.c:1103:dht_migrate_file]
0-portugal-tier-dht: /lisbon.2: attempting to move from portugal-cold-dht to
portugal-hot-dht
[2015-10-16 11:52:22.250989] I [MSGID: 109022]
[dht-rebalance.c:1430:dht_migrate_file] 0-portugal-tier-dht: completed
migration of /lisbon.2 from subvolume portugal-cold-dht to portugal-hot-dht
[2015-10-16 12:17:16.194105] I [MSGID: 109028]
[dht-rebalance.c:3327:gf_defrag_status_get] 0-glusterfs: Rebalance is in
progress. Time taken is 83897.00 secs
[2015-10-16 12:17:16.194140] I [MSGID: 109028]
[dht-rebalance.c:3331:gf_defrag_status_get] 0-glusterfs: Files migrated: 15,
size: 0, lookups: 21, failures: 6, skipped: 0
Status of volume: portugal
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick yarrow:/dummy/brick108/portugal_hot   49240     0          Y       30546
Brick zod:/dummy/brick108/portugal_hot      49240     0          Y       4923
Brick yarrow:/dummy/brick107/portugal_hot   49239     0          Y       30524
Brick zod:/dummy/brick107/portugal_hot      49239     0          Y       4903
Cold Bricks:
Brick zod:/rhs/brick1/portugal              49237     0          Y       2557
Brick yarrow:/rhs/brick1/portugal           49237     0          Y       28413
Brick zod:/rhs/brick2/portugal              49238     0          Y       2575
Brick yarrow:/rhs/brick2/portugal           49238     0          Y       28433
NFS Server on localhost                     2049      0          Y       11729
Self-heal Daemon on localhost               N/A       N/A        Y       11852
Quota Daemon on localhost                   N/A       N/A        Y       11748
NFS Server on yarrow                        2049      0          Y       32441
Self-heal Daemon on yarrow                  N/A       N/A        Y       32646
Quota Daemon on yarrow                      N/A       N/A        Y       32537
Task Status of Volume portugal
------------------------------------------------------------------------------
Task : Tier migration
ID : 931de257-0dcd-4125-87a8-0cce35caca38
Status : in progress
[root at zod ~]# gluster v tier portugal status
Node                 Promoted files       Demoted files        Status
---------            ---------            ---------            ---------
localhost            7                    8                    in progress
yarrow               0                    19                   in progress
volume rebalance: portugal: success:
[root at zod ~]#
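
To make the tier log above easier to scan, the entries can be split into
fields. The following is a minimal sketch that assumes the layout seen in this
report (timestamp, level, optional MSGID, file:line:function, domain, message)
and that the number in "ERROR -28" is a negated errno value. parse_entries and
errno_name are illustrative helper names, not part of GlusterFS.

```python
import errno
import re

# One log entry, after wrapped lines have been joined. The MSGID block is
# optional because some entries in the log above do not carry one.
LOG_RE = re.compile(
    r"\[(?P<ts>[\d-]+ [\d:.]+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?:\[MSGID: (?P<msgid>\d+)\]\s+)?"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<domain>\S+): (?P<msg>.*)"
)


def parse_entries(text):
    """Join wrapped lines and return one dict per recognised log entry."""
    flat = " ".join(text.split())
    entries = []
    # Each entry starts with a "[YYYY-MM-DD " timestamp.
    for chunk in re.split(r"(?=\[\d{4}-\d{2}-\d{2} )", flat):
        m = LOG_RE.match(chunk.strip())
        if m:
            entries.append(m.groupdict())
    return entries


def errno_name(message):
    """Return the symbolic errno for 'ERROR -N' in a message, if present."""
    m = re.search(r"ERROR -(\d+)", message)
    if not m:
        return None
    return errno.errorcode.get(int(m.group(1)))


SAMPLE = """
[2015-10-16 11:50:00.716661] E [MSGID: 109037]
[tier.c:492:tier_migrate_using_query_file] 0-portugal-tier-dht: ERROR -28 in
current migration lisbon.2 /lisbon.2
[2015-10-16 11:52:00.736125] I [dht-rebalance.c:1103:dht_migrate_file]
0-portugal-tier-dht: /lisbon.2: attempting to move from portugal-cold-dht to
portugal-hot-dht
"""

for entry in parse_entries(SAMPLE):
    print(entry["ts"], entry["level"], entry["func"], errno_name(entry["msg"]))
```

On Linux, errno 28 is ENOSPC, which matches the "does not have required free
space" message logged just before the failed migration.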
--- Additional comment from nchilaka on 2015-10-16 08:29:02 EDT ---
#######before start of f1 or f2 promote
[root at zod ~]# gluster v tier portugal status
Node                 Promoted files       Demoted files        Status
---------            ---------            ---------            ---------
localhost            7                    8                    in progress
yarrow               0                    19                   in progress
volume rebalance: portugal: success:
[root at zod ~]#
[root at zod ~]#
[root at zod ~]#
[root at zod ~]# #######after f1 got promoted#############
[root at zod ~]# gluster v tier portugal status
Node                 Promoted files       Demoted files        Status
---------            ---------            ---------            ---------
localhost            8                    8                    in progress
yarrow               0                    20                   in progress
volume rebalance: portugal: success:
[root at zod ~]#
[root at zod ~]#
[root at zod ~]# #######after f1 got demoted and f2 promoted#############
[root at zod ~]# gluster v tier portugal status
Node                 Promoted files       Demoted files        Status
---------            ---------            ---------            ---------
localhost            9                    8                    in progress
yarrow               0                    21                   in progress
volume rebalance: portugal: success:
[root at zod ~]#
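
The three status snapshots above are easier to compare as per-node deltas. The
sketch below parses the "gluster v tier portugal status" output as it appears
in this comment and subtracts consecutive snapshots. parse_tier_status and
delta are hypothetical helper names, and the parser assumes the four-column
layout shown here.

```python
import re

# A data row: node name, promoted count, demoted count, status text.
ROW_RE = re.compile(
    r"^(?P<node>\S+)\s+(?P<promoted>\d+)\s+(?P<demoted>\d+)\s+(?P<status>.+?)\s*$"
)


def parse_tier_status(output):
    """Map node name -> (promoted, demoted) for every data row in the output."""
    rows = {}
    for line in output.splitlines():
        m = ROW_RE.match(line.strip())
        if m:  # header, separator and summary lines do not match
            rows[m["node"]] = (int(m["promoted"]), int(m["demoted"]))
    return rows


def delta(before, after):
    """Per-node change in (promoted, demoted) between two snapshots."""
    return {
        node: (after[node][0] - before[node][0], after[node][1] - before[node][1])
        for node in after
        if node in before
    }


BEFORE = """
Node Promoted files Demoted files Status
--------- --------- --------- ---------
localhost 7 8 in progress
yarrow 0 19 in progress
volume rebalance: portugal: success:
"""

AFTER_F1_PROMOTED = """
localhost 8 8 in progress
yarrow 0 20 in progress
"""

AFTER_F2_PROMOTED = """
localhost 9 8 in progress
yarrow 0 21 in progress
"""

s0 = parse_tier_status(BEFORE)
s1 = parse_tier_status(AFTER_F1_PROMOTED)
s2 = parse_tier_status(AFTER_F2_PROMOTED)
print(delta(s0, s1))
print(delta(s1, s2))
```

With the numbers from this comment, each step shows one additional promotion on
localhost and one additional demotion on yarrow.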
--- Additional comment from nchilaka on 2015-10-16 08:29 EDT ---
--- Additional comment from nchilaka on 2015-10-16 08:33:58 EDT ---
sosreport at rhsqe-repo.lab.eng.blr.redhat.com:/home/repo/sosreports/nchilaka/bug.1272450
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1272450
[Bug 1272450] Data Tiering:heat counters not getting reset and also
internal ops seem to be heating the files
https://bugzilla.redhat.com/show_bug.cgi?id=1272452
[Bug 1272452] Data Tiering:heat counters not getting reset and also
internal ops seem to be heating the files
https://bugzilla.redhat.com/show_bug.cgi?id=1275483
[Bug 1275483] Data Tiering:heat counters not getting reset and also
internal ops seem to be heating the files
https://bugzilla.redhat.com/show_bug.cgi?id=1275524
[Bug 1275524] Data Tiering:heat counters not getting reset and also
internal ops seem to be heating the files