[Gluster-devel] tiering: emergency demotions
Milind Changire
mchangir at redhat.com
Fri Aug 12 13:57:35 UTC 2016
On 08/10/2016 12:06 PM, Milind Changire wrote:
> Emergency demotions will be required whenever writes breach the
> hi-watermark. Emergency demotions are required to avoid ENOSPC in case
> of continuous writes that originate on the hot tier.
>
> There are two concerns in this area:
>
> 1. enforcing max-cycle-time during emergency demotions
> max-cycle-time is the time the tiering daemon spends in promotions or
> demotions
> I tend to think that the tiering daemon should skip this check in the
> emergency situation and continue demotions until the watermark drops
> below the hi-watermark.
Update:
To keep matters simple and manageable, it has been decided to *enforce*
max-cycle-time to yield the worker threads to attend to impending tier
management tasks if the need arises.
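The decision above can be sketched as a demotion pass that honors the cycle-time budget even during an emergency. This is an illustrative Python sketch, not the tier daemon's actual code; the function and parameter names (demotion_pass, watermark_pct, max_cycle_time) are assumptions for the example.

```python
import time

def demotion_pass(candidates, demote, watermark_pct, hi_watermark,
                  max_cycle_time):
    """Demote files until the watermark drops below hi-watermark or the
    max-cycle-time budget is exhausted; return files still pending."""
    start = time.monotonic()
    remaining = list(candidates)
    while remaining and watermark_pct() >= hi_watermark:
        # Enforce max-cycle-time: yield the worker thread so it can
        # attend to other tier-management tasks if the need arises.
        if time.monotonic() - start >= max_cycle_time:
            break
        demote(remaining.pop(0))
    return remaining  # picked up again in the next cycle
```

The key point is that the budget check sits inside the loop, so an emergency demotion run still yields after max-cycle-time and the remaining candidates carry over to the next cycle.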
>
> 2. file demotion policy
> I tend to think that the largest file with the most recent *write*
> should be chosen for eviction when write-freq-threshold is NON-ZERO.
> Choosing the least recently written file would only delay migration
> of an active file, which might keep consuming hot tier disk space
> and, in the worst case, result in ENOSPC.
> In cases where write-freq-threshold is ZERO, the most recently
> *written* file can be chosen for eviction.
> In the case of choosing the largest file within the
> write-freq-threshold, a stat() on the files would be required to
> calculate the number of files that need to be demoted to take the
> watermark below the hi-watermark. Finding the number of most recently
> written files to demote could also enable demoting files in parallel
> rather than sequentially, as is currently done.
Update:
The idea of choosing files with respect to file size has been dropped.
On a hi-watermark breach, the most recently written file will be chosen
iteratively for eviction from the hot tier until the watermark drops
below the hi-watermark.
The idea of parallelizing multiple promotions/demotions has been
deferred.
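The chosen policy can be illustrated with a short Python sketch: iteratively pick the file with the most recent write time until usage drops below the hi-watermark. The tuple layout and names here are assumptions for illustration, not gluster internals.

```python
def pick_emergency_demotions(files, used, capacity, hi_watermark):
    """files: iterable of (name, size_bytes, last_write_time) tuples
    describing files on the hot tier. Returns the names to evict, most
    recently written first, until usage falls below hi_watermark (%)."""
    # Most recently written files first.
    by_recency = sorted(files, key=lambda f: f[2], reverse=True)
    evict = []
    for name, size, _ in by_recency:
        if used / capacity * 100 < hi_watermark:
            break  # watermark is back below hi-watermark; stop evicting
        evict.append(name)
        used -= size
    return evict
```

Note this needs only the write timestamps, which is why the size-based variant (requiring a stat() sweep) could be dropped.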
-----
Sustained writes creating large files in the hot tier, which
cumulatively breach the hi-watermark, do NOT seem to be a good
workload for tiering. The assumption is that, to make the most of the
hot tier, the hi-watermark would be set close to 100.
In that case a sustained large-file copy might easily breach the
hi-watermark and may even consume the entire hot tier space, resulting
in ENOSPC.
e.g. a sustained write:
# cp file1 /mnt/glustervol/dir
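The arithmetic behind this concern can be sketched as follows; the function and the numbers in the usage note are illustrative assumptions, not measurements.

```python
def copy_outcome(free_bytes, file_bytes, capacity_bytes, hi_watermark_pct):
    """Predict what a single sustained copy does to the hot tier:
    fills it entirely (ENOSPC), breaches the hi-watermark, or fits."""
    used_after = capacity_bytes - free_bytes + file_bytes
    if used_after > capacity_bytes:
        return "ENOSPC"
    if used_after / capacity_bytes * 100 > hi_watermark_pct:
        return "hi-watermark breached"
    return "ok"
```

For example, with a 100 GB hot tier at hi-watermark 90, a 15 GB copy into 10 GB of free space runs out of space entirely, while an 8 GB copy still breaches the hi-watermark.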
Workloads that would seem to make the most of tiering are:
1. Many smaller files, which are created in small bursts of write
activity and then closed
2. Few large files where updates are in-place and the file size
   does not grow beyond the hi-watermark, e.g. a database with a
   frequent in-line compaction/de-fragmentation policy enabled
3. Frequent reads of few large files, mostly static in size, which
cumulatively don't breach the hi-watermark. Frequently reading
a large number of smaller, mostly static, files would be good
tiering workload candidates as well.
>
> Comments are requested.
>