[Gluster-devel] Improving Geo-replication Status and Checkpoints

Wed Apr 1 09:28:41 UTC 2015

On 04/01/2015 02:50 PM, Sahina Bose wrote:
>
> On 04/01/2015 02:30 PM, Aravinda wrote:
>> Hi,
>>
>> Each node of the Master Cluster runs one Monitor process and one or 
>> more worker processes, one for each brick in that node.
>> The Monitor has a status file, which is updated by glusterd. 
>> Possible status values in the monitor_status file are Created, Started, 
>> Paused, Stopped.
>>
>> Geo-rep cannot be paused if the monitor status is not "Started".
>>
>> Based on monitor_status, we need to hide some fields of the brick 
>> status file from the user. For example, if the monitor status 
>> is "Stopped", it does not make sense to show "Crawl Status" in the 
>> Geo-rep Status output. Below is a matrix of the possible status values 
>> for each Monitor status. VALUE represents the actual, unchanged value 
>> from the brick status file.
>>
>> Monitor Status --->        Created   Started   Paused    Stopped
>> ----------------------------------------------------------------
>> session                    VALUE     VALUE     VALUE     VALUE
>> brick                      VALUE     VALUE     VALUE     VALUE
>> node                       VALUE     VALUE     VALUE     VALUE
>> node_uuid                  VALUE     VALUE     VALUE     VALUE
>> volume                     VALUE     VALUE     VALUE     VALUE
>> slave_user                 VALUE     VALUE     VALUE     VALUE
>> slave_node                 N/A       VALUE     VALUE     N/A
>> status                     Created   VALUE     Paused    Stopped
>> last_synced                N/A       VALUE     VALUE     VALUE
>> crawl_status               N/A       VALUE     N/A       N/A
>> entry                      N/A       VALUE     N/A       N/A
>> data                       N/A       VALUE     N/A       N/A
>> meta                       N/A       VALUE     N/A       N/A
>> failures                   N/A       VALUE     VALUE     VALUE
>> checkpoint_completed       N/A       VALUE     VALUE     VALUE
>> checkpoint_time            N/A       VALUE     VALUE     VALUE
>> checkpoint_completed_time  N/A       VALUE     VALUE     VALUE
>>
>> Where:
>> session - only in XML output, Complete session URL which is used in 
>> Create command
>> brick - Master Brick Node
>> node - Master Node
>> node_uuid - Master Node UUID, Only in XML output
>> volume - Master Volume
>> slave_user - Slave User
>> slave_node - Slave node to which respective master worker is connected.
>> status - Created/Initializing../Active/Passive/Faulty/Paused/Stopped
>> last_synced - Last synced Time
>> crawl_status - Hybrid/History/Changelog
>> entry - Number of entry ops pending (per session; the counter resets 
>> if the worker restarts)
>> data - Number of data ops pending (per session; the counter resets 
>> if the worker restarts)
>> meta - Number of meta ops pending (per session; the counter resets 
>> if the worker restarts)
>> failures - Number of failures. (If the count is more than 0, it is an 
>> action item for the admin to look in the log files)
>> checkpoint_completed - Checkpoint Status Yes/No/ N/A
>> checkpoint_time - Checkpoint Set time or N/A
>> checkpoint_completed_time - Checkpoint Completed Time or N/A
>>
>> In addition to the monitor_status rules, if the brick status is Faulty, 
>> the following fields will be displayed as N/A:
>> active, paused, slave_node, crawl_status, entry, data, metadata
>
> Some questions -
>
> * Would monitor status also have "Initializing" state?
Monitor status is internal to Geo-replication only; the Status output 
will not include monitor_status.

> * What's the difference between brick and master node above?
Brick: brick path as shown in volume info. Node: hostname as shown in 
volume info.

> * Is the last_synced time returned in UTC?
TBD.
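
If UTC is chosen (as was suggested earlier in this thread), rendering could look like the minimal sketch below. It assumes the last synced time is available as epoch seconds; the function name and the output format are only illustrations, not what gsyncd does today.

```python
from datetime import datetime, timezone


def format_synced_time(stime_sec):
    """Render an epoch-seconds value as an explicit UTC string.

    Hypothetical helper: assumes the last synced time is available
    as seconds since the epoch, or None if nothing is synced yet.
    """
    if stime_sec is None:
        return "N/A"
    synced = datetime.fromtimestamp(stime_sec, tz=timezone.utc)
    # Append the zone name so the value is unambiguous for consumers.
    return synced.strftime("%Y-%m-%d %H:%M:%S UTC")


print(format_synced_time(0))
print(format_synced_time(None))
```

Carrying the zone in the string itself means callers such as a management UI do not have to guess the timezone of the node that produced the output.
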
> * active and paused - are these fields or status values?
Status will be Paused irrespective of Active/Passive.
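
To make the masking rules concrete, here is a minimal sketch of the matrix quoted above. All names (mask_status, HIDDEN, FAULTY_HIDDEN) are made up for illustration and this is not the actual gsyncd code. For the Faulty case it covers only the fields that appear in the matrix, and it assumes "metadata" refers to the "meta" field.

```python
NA = "N/A"

# Fields replaced with N/A for each monitor status (taken from the matrix).
HIDDEN = {
    "Created": {
        "slave_node", "last_synced", "crawl_status", "entry", "data",
        "meta", "failures", "checkpoint_completed", "checkpoint_time",
        "checkpoint_completed_time",
    },
    "Started": set(),
    "Paused": {"crawl_status", "entry", "data", "meta"},
    "Stopped": {"slave_node", "crawl_status", "entry", "data", "meta"},
}

# Fields hidden when the brick status itself is Faulty.
# Assumption: "metadata" in the proposal means the "meta" field.
FAULTY_HIDDEN = {"slave_node", "crawl_status", "entry", "data", "meta"}


def mask_status(monitor_status, brick_status):
    """Return a copy of brick_status with hidden fields set to N/A."""
    shown = dict(brick_status)
    # Only "Started" exposes the brick's own status value; otherwise the
    # monitor status wins (e.g. Paused irrespective of Active/Passive).
    if monitor_status != "Started":
        shown["status"] = monitor_status
    hidden = set(HIDDEN[monitor_status])
    if shown.get("status") == "Faulty":
        hidden |= FAULTY_HIDDEN
    for field in hidden:
        if field in shown:
            shown[field] = NA
    return shown


brick = {"brick": "/gfs/b1", "status": "Active", "slave_node": "fedoravm3",
         "crawl_status": "Changelog", "entry": 3, "last_synced": "TBD"}
print(mask_status("Paused", brick))
```

Keeping the rules in a table rather than in nested conditions makes it easy to compare the code against the matrix when a column or status is added.
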

>
>>
>> Let me know your thoughts.
>>
>> -- 
>> regards
>> Aravinda
>>
>>
>> On 02/03/2015 11:00 PM, Aravinda wrote:
>>> Today we discussed the Geo-rep Status design. Summary of the 
>>> discussion:
>>>
>>> - No usecase for "Deletes pending" column, should we retain it?
>>> - No separate column for Active/Passive. A worker can be 
>>> Active/Passive only when it is Stable (it can't be Faulty and 
>>> Active).
>>> - Rename "Not Started" status as "Created"
>>> - Checkpoint columns will be retained in the Status output until we 
>>> support multiple checkpoints, as three columns instead of a single 
>>> column (Completed, Checkpoint time and Completion time).
>>> - We are still unsure about "Files Pending" and "Files Synced": 
>>> which numbers should they show? Geo-rep can't map these numbers to 
>>> an exact count on disk.
>>>   Venky suggested showing Entry, Data and Metadata pending as three 
>>> columns (and removing "Files Pending" and "Files Synced").
>>> - Rename "Files Skipped" to "Failures"
>>>
>>> Status output proposed:
>>> -----------------------
>>> MASTER NODE - Master node hostname/IP
>>> MASTER VOL - Master volume name
>>> MASTER BRICK - Master brick path
>>> SLAVE USER - Slave user to which geo-rep is established.
>>> SLAVE - Slave host and Volume name(HOST::VOL format)
>>> STATUS - Created/Initializing../Started/Active/Passive/Stopped/Faulty
>>> LAST SYNCED - Last synced time(Based on stime xattr)
>>> CRAWL STATUS - Hybrid/History/Changelog
>>> CHECKPOINT STATUS - Yes/No/ N/A
>>> CHECKPOINT TIME - Checkpoint Set Time
>>> CHECKPOINT COMPLETED - Checkpoint Completion Time
>>>
>>> Not yet decided
>>> ---------------
>>> FILES SYNCD - Number of Files Synced
>>> FILES PENDING - Number of Files Pending
>>> DELETES PENDING- Number of Deletes Pending
>>> FILES SKIPPED - Number of Files skipped
>>> ENTRIES - Create/Delete/MKDIR/RENAME etc
>>> DATA - Data operations
>>> METADATA - SETATTR, SETXATTR etc
>>>
>>> Let me know your suggestions.
>>>
>>> -- 
>>> regards
>>> Aravinda
>>>
>>>
>>> On 02/02/2015 04:51 PM, Aravinda wrote:
>>>> Thanks Sahina, replied inline.
>>>>
>>>> -- 
>>>> regards
>>>> Aravinda
>>>>
>>>> On 02/02/2015 12:55 PM, Sahina Bose wrote:
>>>>>
>>>>> On 01/28/2015 04:07 PM, Aravinda wrote:
>>>>>> Background
>>>>>> ----------
>>>>>> We have `status` and `status detail` commands for GlusterFS 
>>>>>> geo-replication. This mail proposes fixes for the existing issues 
>>>>>> in these command outputs. Let us know if any other columns are 
>>>>>> needed to help users get a meaningful status.
>>>>>>
>>>>>> Existing output
>>>>>> ---------------
>>>>>> Status command output
>>>>>>     MASTER NODE - Master node hostname/IP
>>>>>>     MASTER VOL - Master volume name
>>>>>>     MASTER BRICK - Master brick path
>>>>>>     SLAVE - Slave host and Volume name(HOST::VOL format)
>>>>>>     STATUS - Stable/Faulty/Active/Passive/Stopped/Not Started
>>>>>>     CHECKPOINT STATUS - Details about Checkpoint completion
>>>>>>     CRAWL STATUS - Hybrid/History/Changelog
>>>>>>
>>>>>> Status detail -
>>>>>>     MASTER NODE - Master node hostname/IP
>>>>>>     MASTER VOL - Master volume name
>>>>>>     MASTER BRICK - Master brick path
>>>>>>     SLAVE - Slave host and Volume name(HOST::VOL format)
>>>>>>     STATUS - Stable/Faulty/Active/Passive/Stopped/Not Started
>>>>>>     CHECKPOINT STATUS - Details about Checkpoint completion
>>>>>>     CRAWL STATUS - Hybrid/History/Changelog
>>>>>>     FILES SYNCD - Number of Files Synced
>>>>>>     FILES PENDING - Number of Files Pending
>>>>>>     BYTES PENDING - Bytes pending
>>>>>>     DELETES PENDING - Number of Deletes Pending
>>>>>>     FILES SKIPPED - Number of Files skipped
>>>>>>
>>>>>>
>>>>>> Issues with existing status and status detail:
>>>>>> ----------------------------------------------
>>>>>>
>>>>>> 1. Active/Passive and Stable/Faulty statuses are mixed up. The same 
>>>>>> column is used to show both Active/Passive status and 
>>>>>> Stable/Faulty status. If an Active node goes Faulty, it is 
>>>>>> difficult to tell from the status whether the faulty node is the 
>>>>>> Active or the Passive one.
>>>>>> 2. No info about last synced time. Unless we set a checkpoint, it 
>>>>>> is difficult to know up to what time data has been synced to the 
>>>>>> slave. For example, if an admin wants to know whether all files 
>>>>>> created 15 minutes ago have been synced, that is not possible 
>>>>>> without setting a checkpoint.
>>>>>> 3. Wrong values in metrics.
>>>>>> 4. When multiple bricks are present on the same node, Status shows 
>>>>>> Faulty when any one of the workers on that node is faulty.
>>>>>>
>>>>>> Changes:
>>>>>> --------
>>>>>> 1. Active nodes will be prefixed with * to identify them as 
>>>>>> active nodes. (In XML output, an active tag will be introduced 
>>>>>> with values 0 or 1.)
>>>>>> 2. A new column will show the last synced time, which minimizes 
>>>>>> the need for the checkpoint feature. Checkpoint status will be 
>>>>>> shown only in status detail.
>>>>>> 3. Checkpoint Status is removed. A separate Checkpoint command 
>>>>>> will be added to the gluster CLI. (We can introduce a multiple 
>>>>>> checkpoint feature with this change.)
>>>>>> 4. Status values will be "Not 
>>>>>> Started/Initializing/Started/Faulty/Stopped". Stable is changed 
>>>>>> to "Started".
>>>>>> 5. A Slave User column will be introduced to show the user with 
>>>>>> which the geo-rep session is established. (Useful in non-root 
>>>>>> geo-rep.)
>>>>>> 6. The Bytes pending column will be removed. It is not possible to 
>>>>>> identify the delta without simulating the sync. For example, we 
>>>>>> use rsync to sync data from master to slave; to know how much 
>>>>>> data is to be transferred, we have to run the rsync command with 
>>>>>> the --dry-run flag before running the actual command. With 
>>>>>> tar-ssh we have to stat all the files identified for sync to 
>>>>>> calculate the total bytes to be synced. Both are costly 
>>>>>> operations that degrade geo-rep performance. (In future we can 
>>>>>> include these columns.)
>>>>>> 7. Files pending, synced and deletes pending are only session 
>>>>>> information of the worker; these numbers will not match the 
>>>>>> number of files present in the filesystem. If the worker 
>>>>>> restarts, the counters reset to zero. When the worker restarts, 
>>>>>> it logs the previous session stats before resetting them.
>>>>>> 8. Files Skipped is a persistent status across sessions. It shows 
>>>>>> the exact count of files skipped. (The list of skipped GFIDs can 
>>>>>> be obtained from the log file.)
>>>>>> 9. "Deletes Pending" column can be removed?
>>>>>
>>>>> Is there any way to know if there are errors syncing any of the 
>>>>> files? Which column would that reflect in?
>>>> "Skipped" Column shows number of files failed to sync to Slave.
>>>>
>>>>> Is the last synced time - the least of the synced time across the 
>>>>> nodes?
>>>> Status output will have one entry for each brick, so we are 
>>>> planning to display last synced time from that brick.
>>>>>
>>>>>
>>>>>>
>>>>>> Example output
>>>>>>
>>>>>>     MASTER NODE  MASTER VOL  MASTER BRICK  SLAVE USER  SLAVE           STATUS   LAST SYNCED          CRAWL
>>>>>> ----------------------------------------------------------------------------------------------------------------
>>>>>>     * fedoravm1  gvm         /gfs/b1       root        fedoravm3::gvs  Started  2014-05-10 03:07 pm  Changelog
>>>>>>       fedoravm2  gvm         /gfs/b2       root        fedoravm4::gvs  Started  2014-05-10 03:07 pm  Changelog
>>>>>>
>>>>>> New Status columns
>>>>>>
>>>>>>     ACTIVE_PASSIVE - * if Active else none.
>>>>>>     MASTER NODE - Master node hostname/IP
>>>>>>     MASTER VOL - Master volume name
>>>>>>     MASTER BRICK - Master brick path
>>>>>>     SLAVE USER - Slave user to which geo-rep is established.
>>>>>>     SLAVE - Slave host and Volume name(HOST::VOL format)
>>>>>>     STATUS - Stable/Faulty/Active/Passive/Stopped/Not Started
>>>>>>     LAST SYNCED - Last synced time(Based on stime xattr)
>>>>>>     CHECKPOINT STATUS - Details about Checkpoint completion
>>>>>>     CRAWL STATUS - Hybrid/History/Changelog
>>>>>>     FILES SYNCD - Number of Files Synced
>>>>>>     FILES PENDING - Number of Files Pending
>>>>>>     DELETES PENDING- Number of Deletes Pending
>>>>>>     FILES SKIPPED - Number of Files skipped
>>>>>>
>>>>>>
>>>>>> XML output
>>>>>>     active
>>>>>>     master_node
>>>>>>     master_node_uuid
>>>>>>     master_brick
>>>>>>     slave_user
>>>>>>     slave
>>>>>>     status
>>>>>>     last_synced
>>>>>>     crawl_status
>>>>>>     files_syncd
>>>>>>     files_pending
>>>>>>     deletes_pending
>>>>>>     files_skipped
>>>>>>
>>>>>>
>>>>>> Checkpoints
>>>>>> ===========
>>>>>> New set of Gluster CLI commands will be introduced for Checkpoints.
>>>>>>
>>>>>>     gluster volume geo-replication <VOLNAME> <SLAVEHOST>::<SLAVEVOL> checkpoint create <NAME> <DATE>
>>>>>>     gluster volume geo-replication <VOLNAME> <SLAVEHOST>::<SLAVEVOL> checkpoint delete <NAME>
>>>>>>     gluster volume geo-replication <VOLNAME> <SLAVEHOST>::<SLAVEVOL> checkpoint delete all
>>>>>>     gluster volume geo-replication <VOLNAME> <SLAVEHOST>::<SLAVEVOL> checkpoint status [<NAME>]
>>>>>>     gluster volume geo-replication <VOLNAME> checkpoint status # For all geo-rep sessions for that volume
>>>>>>     gluster volume geo-replication checkpoint status # For all geo-rep sessions for all volumes
>>>>>>
>>>>>>
>>>>>> Checkpoint Status:
>>>>>>
>>>>>>     SESSION                   NAME  Completed  Checkpoint Time      Completion Time
>>>>>> -----------------------------------------------------------------------------------------
>>>>>>     gvm->root@fedoravm3::gvs  Chk1  Yes        2014-11-30 11:30 pm  2014-12-01 02:30 pm
>>>>>>     gvm->root@fedoravm3::gvs  Chk2  No         2014-12-01 10:00 pm  N/A
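
One way to read the Completed column is that a checkpoint counts as complete once the last synced time has reached the checkpoint time. The sketch below assumes exactly that rule, which this thread does not spell out; the function name and arguments are hypothetical.

```python
from datetime import datetime


def checkpoint_completed(checkpoint_time, last_synced):
    """Return "Yes", "No" or "N/A" for the Completed column.

    Assumption (not confirmed in this thread): a checkpoint is complete
    when the last synced time is at or after the checkpoint time.
    """
    if checkpoint_time is None:
        return "N/A"  # no checkpoint has been set
    if last_synced is not None and last_synced >= checkpoint_time:
        return "Yes"
    return "No"


chk = datetime(2014, 11, 30, 23, 30)
print(checkpoint_completed(chk, datetime(2014, 12, 1, 14, 30)))
print(checkpoint_completed(chk, datetime(2014, 11, 30, 22, 0)))
```
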
>>>>>
>>>>> Can the time information have the timezone information as well? Or 
>>>>> is this UTC time?
>>>>> (Same comment for last synced time)
>>>> Sure. Will have UTC time in Status output.
>>>>>
>>>>>>
>>>>>> XML output:
>>>>>>     session
>>>>>>     master_uuid
>>>>>>     name
>>>>>>     completed
>>>>>>     checkpoint_time
>>>>>>     completion_time
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> regards
>>>>>> Aravinda
>>>>>> _______________________________________________
>>>>>> Gluster-devel mailing list
>>>>>> Gluster-devel at gluster.org
>>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel