[Gluster-devel] GlusterFS User and Group Quotas

Joseph Fernandes josferna at redhat.com
Tue Dec 8 09:50:45 UTC 2015


Answers inline.

----- Original Message -----
From: "Vijaikumar Mallikarjuna" <vmallika at redhat.com>
To: "Gluster Devel" <gluster-devel at gluster.org>
Sent: Tuesday, December 8, 2015 3:02:55 PM
Subject: [Gluster-devel] GlusterFS User and Group Quotas

Hi All, 

Below is the design for 'GlusterFS User and Group Quotas'. Please provide your feedback.


Developers: 
Vijaikumar.M and Manikandan.S 


Introduction: 
User and group quotas limit the amount of disk space that a
specified user/group ID can consume.
This document provides some details about how the accounting (marker xlator) can be done
for user and group quotas.


Design: 
We have three different approaches, each with pros and cons.

Approach-1) 
T1 - For each file/dir 'file_x', create a contribution extended attribute, say 'trusted.glusterfs.quota.<uid>-contri'
T2 - In a lookup/write operation, read the actual size from the stat-buf and add the delta size to the contribution xattr
T3 - Create a file .glusterfs/quota/users/<uid>.
Update its size extended attribute, say 'trusted.glusterfs.quota.size', by adding the delta size calculated in T2

The same applies to group quotas: a size xattr is updated under .glusterfs/quota/groups/<gid>.

cons: 
If the brick crashes after executing T2 and before T3, the accounting information is incorrect.
To recover and correct the accounting information, the entire file system needs to be crawled to fix the trusted.glusterfs.quota.size
value by summing up the contributions of all files owned by the UID. But this is a slow process.
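
For illustration, the T2/T3 crash window and the full-crawl recovery can be modelled with a minimal Python sketch. Plain dictionaries stand in for the xattrs, and the names account() and recover() are hypothetical; this is not the marker xlator code.

```python
# In-memory stand-ins for xattrs; every name here is hypothetical.
files = {}      # path -> {"uid": int, "contri": int}  (per-file contribution xattr)
user_size = {}  # uid -> accounted bytes               (per-user size xattr)


def account(path, uid, actual_size, crash_before_t3=False):
    """Apply T2 and then T3 for one lookup/write on `path`."""
    entry = files.setdefault(path, {"uid": uid, "contri": 0})
    delta = actual_size - entry["contri"]
    entry["contri"] += delta                        # T2: update the contribution
    if crash_before_t3:
        return                                      # simulated brick crash
    user_size[uid] = user_size.get(uid, 0) + delta  # T3: update the per-user size


def recover(uid):
    """Full crawl: rebuild the per-user size from every file's contribution."""
    user_size[uid] = sum(e["contri"] for e in files.values() if e["uid"] == uid)


account("/a", 1000, 4096)
account("/b", 1000, 1024, crash_before_t3=True)  # T3 is lost for /b
stale = user_size[1000]                          # does not include /b
recover(1000)
fixed = user_size[1000]                          # includes /b after the crawl
```

Note that recover() has to visit every file on the brick, which is the slow step described above.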


Approach-2) 
T1 - For each file/dir 'file_x', create a contribution extended attribute, say 'trusted.glusterfs.quota.<uid>-contri'
T2 - Create a directory '.glusterfs/quota/users/<uid>'
Create a hard link for file_x under this directory
T3 - In a lookup/write operation, set the dirty flag 'trusted.glusterfs.quota.dirty' for directory '.glusterfs/quota/users/<uid>'
T4 - Read the actual size of the file from the stat-buf and add the delta size to the contribution xattr
T5 - Update the size extended attribute for directory '.glusterfs/quota/users/<uid>'
T6 - Unset the dirty flag

The same applies to group quotas: a size xattr is updated under .glusterfs/quota/groups/<gid>.

The problem in approach 1 of crawling the entire brick is solved by crawling only the directory that is marked dirty.

cons: 
We need to make sure that the hard link for a file stays consistent when the file has other hard links
under .glusterfs/quota/users/<uid> and .glusterfs/quota/groups/<gid>
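
As a rough sketch of how the dirty flag narrows recovery, the following Python model keeps one in-memory object per UID directory. The class UserDir and the functions account() and heal() are hypothetical names for illustration; real hard links and xattrs are replaced by dictionaries.

```python
# Hypothetical in-memory model: one "directory" per UID holding the hard links.
class UserDir:
    def __init__(self):
        self.links = {}     # file name -> contribution in bytes
        self.size = 0       # stand-in for the directory's size xattr
        self.dirty = False  # stand-in for trusted.glusterfs.quota.dirty


user_dirs = {}  # uid -> UserDir


def account(uid, name, actual_size, crash_after_t4=False):
    d = user_dirs.setdefault(uid, UserDir())
    d.dirty = True                              # T3: set the dirty flag
    delta = actual_size - d.links.get(name, 0)
    d.links[name] = actual_size                 # T4: update the contribution
    if crash_after_t4:
        return                                  # crash: the dirty flag stays set
    d.size += delta                             # T5: update the size
    d.dirty = False                             # T6: unset the dirty flag


def heal():
    """Crawl only the dirty directories and return the UIDs that were repaired."""
    healed = []
    for uid, d in user_dirs.items():
        if d.dirty:
            d.size = sum(d.links.values())
            d.dirty = False
            healed.append(uid)
    return healed


account(1000, "a", 4096)
account(1001, "b", 100)
account(1000, "c", 1024, crash_after_t4=True)
healed = heal()  # only UID 1000's directory is crawled
```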


Approach-3) 
T1 - For each file/dir 'file_x', update a contribution entry in the SQLite DB (create a DB file under .glusterfs/quota/)
T2 - In a lookup/write operation, read the actual size from the stat-buf and update the size in the USER-QUOTA schema in the DB
T3 - In a lookup/write operation, set dirty flag 'trusted.glusterfs.quota.dirty' for directory '.glusterfs/quota/users/<uid>' 

The atomicity problem found in approaches 1 and 2 is solved by using DB transactions.

Note: we need to test the consistency of the SQLite DB.
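
A minimal sketch of the transactional update, using Python's standard sqlite3 module. The table names contribution and user_quota, their columns, and the account() helper are assumptions made for this example; the proposal does not define a schema.

```python
import sqlite3

# In-memory DB for the example; the proposal would use a file under .glusterfs/quota/.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contribution (path TEXT PRIMARY KEY, uid INTEGER, size INTEGER);
    CREATE TABLE user_quota (uid INTEGER PRIMARY KEY, size INTEGER NOT NULL);
""")


def account(conn, path, uid, actual_size, fail_midway=False):
    """Update the file contribution and the per-user size in one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        row = conn.execute(
            "SELECT size FROM contribution WHERE path = ?", (path,)
        ).fetchone()
        delta = actual_size - (row[0] if row else 0)
        conn.execute(
            "INSERT OR REPLACE INTO contribution (path, uid, size) VALUES (?, ?, ?)",
            (path, uid, actual_size),
        )
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "INSERT OR IGNORE INTO user_quota (uid, size) VALUES (?, 0)", (uid,)
        )
        conn.execute(
            "UPDATE user_quota SET size = size + ? WHERE uid = ?", (delta, uid)
        )


account(conn, "/a", 1000, 4096)
try:
    account(conn, "/b", 1000, 1024, fail_midway=True)
except RuntimeError:
    pass  # the partial update for /b was rolled back

contri_total = conn.execute(
    "SELECT COALESCE(SUM(size), 0) FROM contribution WHERE uid = 1000"
).fetchone()[0]
usage = conn.execute("SELECT size FROM user_quota WHERE uid = 1000").fetchone()[0]
```

Because both writes sit in one transaction, the sum of contributions and the per-user size cannot drift apart the way they can between T2 and T3 in approach 1.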

We feel approach-3 is the simpler and more efficient way of implementing user/group quotas.



JOE: 
6 points if you are planning to use sqlite:
1) Use Libgfdb to access (write/read) the DB, as it gives you the flexibility to change the type of database/datastore.
2) Place the updates in a SQL transaction: BEGIN; update; update; END;
3) As you are not looking at durability but are more concerned with atomicity, use a big sqlite cache for performance; refer to CTR/Libgfdb for settings.
4) Use the default journaling mode in sqlite, as you are doing async writes and need good read performance.
5) Use a separate db file, and don't use the tiering db file, as it's configured for write performance.
6) Use "INSERT IF NOT EXISTS ELSE UPDATE" in the write path for crash protection, as you will be using a huge sqlite cache,
   but beware: though this is convenient, it's not a performant way to achieve eventual consistency. A CTR lookup heal kind of
   approach will be good.
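
To make points 2, 3 and 6 concrete, here is a rough sketch with Python's standard sqlite3 module rather than Libgfdb. The user_quota table, the add_usage() helper, and the PRAGMA values are assumptions chosen for illustration; they are not the settings CTR/Libgfdb actually uses.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN/END statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA cache_size = -65536")  # about 64 MiB page cache (illustrative value)
conn.execute("PRAGMA synchronous = OFF")    # assumption: durability is traded for speed
conn.execute("CREATE TABLE user_quota (uid INTEGER PRIMARY KEY, size INTEGER NOT NULL)")


def add_usage(conn, updates):
    """Apply a batch of (uid, delta) pairs inside one transaction."""
    conn.execute("BEGIN")
    try:
        for uid, delta in updates:
            # Insert the row if it is missing, then update it, so the write
            # path still works when an earlier insert was lost in a crash.
            conn.execute(
                "INSERT OR IGNORE INTO user_quota (uid, size) VALUES (?, 0)", (uid,)
            )
            conn.execute(
                "UPDATE user_quota SET size = size + ? WHERE uid = ?", (delta, uid)
            )
        conn.execute("END")  # END is SQLite's alias for COMMIT
    except Exception:
        conn.execute("ROLLBACK")
        raise


add_usage(conn, [(1000, 4096), (1001, 512), (1000, 1024)])
rows = conn.execute("SELECT uid, size FROM user_quota ORDER BY uid").fetchall()
```

The journaling mode is left at sqlite's default, as suggested in point 4.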

If you are planning on a POC I can help :)


Thanks, 
Vijay 
