[Gluster-devel] Snapshot and Data Tiering

Joseph Fernandes josferna at redhat.com
Fri Dec 19 07:16:49 UTC 2014


Hi All, 

These are the minutes of the snapshot and data tiering interop meeting (apologies for the late update)

1) USS should not have problems with the changes made in DHT (DHT over DHT), as the USS xlator sits above DHT.
2) With the introduction of the heat-capturing DB we have a few things to take care of when a snapshot of the brick is taken:
   a. Location of the sqlite3 files: Today the sqlite3 files reside inside the brick by default (brick_path/.glusterfs/).
      This makes taking a snapshot of the DB easy, as it is done via LVM along with the brick. If the location is outside the brick (which is configurable,
      e.g. keeping all the DB files on an SSD for better performance), then glusterd needs to take a manual backup of these files while taking the snapshot,
      which would take some time and cause the gluster CLI to time out. So for the first cut we will keep the DB files in the brick itself, until we have a
      solution for the CLI timeout.
   b. Type of the database: For the first cut we are considering only sqlite3, which works well with LVM snapshots. If a new DB type like leveldb is
      introduced in the future, we need to investigate its compatibility with LVM snapshots, and that might be a deciding factor in adopting such a DB type in gluster.
   c. Checkpointing the sqlite3 DB: Before taking a snapshot, glusterd should issue a checkpoint command to the sqlite3 DB to flush all of the DB cache onto
      the disk (see the sketch after this list).
      Action items on the data tiering team:
                     1) Measure the time taken to do so, i.e. the checkpointing time
                     2) Provide a generic API in libgfdb to do so, OR handle the CTR xlator notification from glusterd to do the checkpointing
      Action items on the snapshot team:
                     1) Provide hooks to call the generic API, OR do the brick-ops to notify the CTR xlator
   d. Snapshot-aware bricks: For a brick belonging to a snapshot, the CTR xlator should not record reads (which come from USS). Possible solutions:
                     1) Send the CTR xlator a notification after the snapshot brick is started, to turn off recording
                     2) OR, while the snapshot brick is started by glusterd, pass an option marking the brick as part of a snapshot. This is the more generic solution.
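
For (c) above, here is a minimal sketch of what a generic checkpoint API in
libgfdb could look like, written against the public sqlite3 C API. It assumes
the CTR DB runs in WAL journal mode; the function name gfdb_checkpoint_db is
hypothetical, for illustration only, not the actual libgfdb interface.
SQLITE_CHECKPOINT_FULL blocks until every WAL frame has been copied back into
the database file, which is the on-disk state we want before LVM snapshots
the brick.

    /* Build with: gcc -o ckpt ckpt.c -lsqlite3 */
    #include <stdio.h>
    #include <sqlite3.h>

    /* Hypothetical libgfdb helper: flush the WAL of one brick's CTR DB
     * to disk so an LVM snapshot of the brick captures a consistent DB. */
    static int
    gfdb_checkpoint_db (const char *db_path)
    {
            sqlite3 *db = NULL;
            int      log_frames = 0, ckpt_frames = 0;
            int      ret;

            ret = sqlite3_open (db_path, &db);
            if (ret != SQLITE_OK) {
                    fprintf (stderr, "open %s failed: %s\n",
                             db_path, sqlite3_errmsg (db));
                    goto out;
            }

            /* FULL checkpoint: wait for writers, then copy the whole
             * WAL into the main database file and sync it. */
            ret = sqlite3_wal_checkpoint_v2 (db, NULL,
                                             SQLITE_CHECKPOINT_FULL,
                                             &log_frames, &ckpt_frames);
            if (ret != SQLITE_OK)
                    fprintf (stderr, "checkpoint %s failed: %s\n",
                             db_path, sqlite3_errmsg (db));
    out:
            if (db)
                    sqlite3_close (db);
            return ret;
    }

Glusterd (or a snapshot hook) would call something like this per brick DB just
before creating the LVM snapshot; timing the call also gives the checkpointing
time asked for in action item (1).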
3) The snapshot restore problem : When a snapshot is restored,
                    1)  it will bring the volume back to its point-in-time state. For example, if
                        the current state of the volume is HOT tier 50% of data and COLD tier 50% of data, while the snapshot has the volume in the state
                        HOT tier 20% of data and COLD tier 80% of data, a restore will bring the volume to HOT:20% COLD:80%, i.e. it will undo all the
                        promotions and demotions. This should be mentioned in the documentation.
                    2) In addition, since the restored DB has times recorded in the past, files that were considered HOT in the past are now COLD. This will move
                       all the data to the COLD tier if a data tiering scanner runs after the restore of the snapshot. The documentation should therefore
                       recommend not running the data tiering scanner immediately after a snapshot restore; the system should be given
                       time to learn the new heat patterns, and the learning time depends on the nature of the workload (see the sketch below).
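
To make (2) concrete, here is a hedged sketch of the kind of query a tiering
scanner might run against the CTR DB to pick demotion candidates. The table
and column names (gf_file_tb, w_sec) are assumptions for illustration, not
the actual libgfdb schema. After a restore every recorded write time lies in
the past, so every file falls outside any reasonable hot window and would be
demoted.

    #include <time.h>
    #include <sqlite3.h>

    /* Count files whose last write is older than hot_window_sec.
     * Table and column names are illustrative only. */
    static int
    count_cold_files (sqlite3 *db, time_t hot_window_sec)
    {
            const char   *sql = "SELECT COUNT(*) FROM gf_file_tb "
                                "WHERE w_sec < ?";
            sqlite3_stmt *stmt = NULL;
            int           cold = -1;

            if (sqlite3_prepare_v2 (db, sql, -1, &stmt, NULL) != SQLITE_OK)
                    return -1;
            sqlite3_bind_int64 (stmt, 1,
                                (sqlite3_int64) (time (NULL) - hot_window_sec));
            if (sqlite3_step (stmt) == SQLITE_ROW)
                    cold = sqlite3_column_int (stmt, 0);
            sqlite3_finalize (stmt);
            return cold;
    }

Run immediately after a restore, this count would cover essentially the whole
DB, which is why the scanner should be held off until fresh heat data has
accumulated.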
4) During a data tiering activity, snapshot operations like create/restore should be disabled, just as is done during the adding and removing of bricks, which leads to a rebalance.

Let me know if anything else is missing or any corrections are required.

Regards,
Joe

