[Gluster-devel] counters in tiering / request for comments
Dan Lambright
dlambrig at redhat.com
Mon Aug 29 14:01:41 UTC 2016
Below is a write-up on tiering counters (bz 1275917). I give three options. I think options (1) and (3) are doable; (2) is harder and would need more discussion.
Currently the counters give limited information on tiering behavior. They are just a raw count of the number of files moved in each direction, which makes the overall feature much less usable.
Generally, the counters should also work with future tiering use cases, e.g. tiering according to location or some other policy.
$ gluster volume tier vol1 status
Node           Promoted files   Demoted files   Status
---------      ---------        ---------       ---------
localhost      20               30              in progress
172.17.60.18   0                0               in progress
172.17.60.19   0                0               in progress
172.17.60.20   0                0               in progress
(1)
Customers want to know the total number of files / MB on a tier at any one time. I propose we query the database on the bricks for each tier, to get a count of the number of files.
$ gluster volume tier vol1 status
Node           Promoted files / hot count   Demoted files / cold count   Status
---------      ---------                    ---------                    ---------
localhost      20 / 500                     30 / 2000                    in progress
172.17.60.18   0                            0                            in progress
172.17.60.19   0                            0                            in progress
172.17.60.20   0                            0                            in progress
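A minimal sketch of the per-tier query in (1), written in Python for brevity. The table name `files`, the `gfid` column, and the helper names `demo_brick` and `count_files` are placeholders for illustration, not the real brick database schema or any existing Gluster function. It also assumes each file is recorded on exactly one brick of a tier; replicated bricks would need de-duplication.

```python
import sqlite3


def demo_brick(n_files):
    """Build an in-memory stand-in for one brick's database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (gfid TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO files VALUES (?)",
        [(f"gfid-{i}",) for i in range(n_files)],
    )
    return conn


def count_files(connections):
    """Sum the file records across all brick databases of one tier."""
    total = 0
    for conn in connections:
        # 'files' is a placeholder table name, not the actual schema.
        (n,) = conn.execute("SELECT COUNT(*) FROM files").fetchone()
        total += n
    return total


# Two hot bricks and one cold brick, sized to match the example output above.
hot = [demo_brick(200), demo_brick(300)]
cold = [demo_brick(2000)]

print(f"localhost  20 / {count_files(hot)}  30 / {count_files(cold)}  in progress")
```

The count is computed on demand when status is requested, so nothing has to be maintained on the I/O path.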
(2)
People need to know the ratio of I/Os served by the hot tier to those served by the cold tier. For an administrator, if 90% of I/Os go to the hot tier, this is good. If only 20% are served by the hot tier, this is bad and suggests a misconfiguration.
Something like this is what we want:
$ gluster volume tier vol1 status
Node        Promoted files   Demoted files   Read hit rate   Write hit rate   Status
---------   ---------        ---------       ---------       ---------        ---------
localhost   0                0               80%             75%              in progress
The difficulty is how to capture that. When we read a large file, it is broken up into multiple individual reads. Each piece is a single read FOP. Should we consider each FOP individually? Or does only the first "hit" to the hot tier count?
Also, when an FOP comes in, it will first look on one tier, and then the other tier. The callback to the FOP checks success or failure. It is only when the file is found on none of the subvolumes that the FOP returns an error. New code needs to deal with this complexity. If there is failure on the cold tier but success on the hot tier, the "hit count" should be bumped.
We probably do not want to update the "hit rate" on all FOPs.
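One way the hit accounting could look, as a sketch only (in Python for brevity; the real change would live in the tier translator's FOP callbacks). The class `HitCounter` and its methods are hypothetical names. It takes one possible answer to the open question above and counts every FOP individually; counting only the first hit per file would need extra per-file state.

```python
from dataclasses import dataclass


@dataclass
class HitCounter:
    """Tracks how many completed FOPs were served by the hot tier."""
    hot_hits: int = 0
    total: int = 0

    def record(self, hot_ok: bool, cold_ok: bool) -> None:
        """Record one FOP, given the success/failure seen in each tier's callback."""
        if not (hot_ok or cold_ok):
            # File found on neither tier: the FOP fails, count nothing.
            return
        self.total += 1
        if hot_ok:
            # Failure on the cold tier but success on the hot tier is a hit.
            self.hot_hits += 1

    def rate(self) -> float:
        """Hit rate as a percentage; 0.0 when nothing has been recorded."""
        return 100.0 * self.hot_hits / self.total if self.total else 0.0


reads = HitCounter()
for _ in range(8):
    reads.record(hot_ok=True, cold_ok=False)   # served by the hot tier
for _ in range(2):
    reads.record(hot_ok=False, cold_ok=True)   # served by the cold tier
reads.record(hot_ok=False, cold_ok=False)      # error, not counted

print(f"Read hit rate: {reads.rate():.0f}%")
```

To avoid updating the rate on every FOP, `record` could be called only for a sample of FOPs, or only for reads and writes; that choice is left open here.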
(3)
A simpler new counter to implement is the #MB promoted or demoted. I think that could be satisfied in a separate patch and could be done quicker.
This is the output with (2) and (3):
$ gluster volume tier vol1 status
Node        Promoted files/MB   Demoted files/MB   Read hit rate   Write hit rate   Status
---------   ---------           ---------          ---------       ---------        ---------
localhost   120/2033MB          50/1044MB          80%             75%              in progress
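For (3), the bookkeeping could be as small as the sketch below (Python for brevity). `MigrationCounter` is a hypothetical name, and treating 1 MB as 1024 * 1024 bytes with truncation is an assumption; the real counter would be bumped wherever a file migration completes.

```python
class MigrationCounter:
    """Counts files and bytes moved in one direction (promotion or demotion)."""

    BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes

    def __init__(self):
        self.files = 0
        self.bytes = 0

    def record(self, size_bytes):
        """Call once per successfully migrated file."""
        self.files += 1
        self.bytes += size_bytes

    def __str__(self):
        # Same "files/MB" layout as the status column above, truncated to whole MB.
        return f"{self.files}/{self.bytes // self.BYTES_PER_MB}MB"


promoted = MigrationCounter()
for _ in range(3):
    promoted.record(10 * 1024 * 1024)  # three 10 MB files

print(promoted)
```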