[Gluster-devel] gluster source code help

jayakrishnan mm jayakrishnan.mm at gmail.com
Mon Feb 13 09:25:08 UTC 2017


Hi Ravi,
Thanks .  I have created a simple 2 node replica volume.

root@dhcp-192-168-36-220:/home/user/gluster/rep-brick1# gluster v info rep-vol

Volume Name: rep-vol
Type: Replicate
Volume ID: c9c9ef39-27e5-44d5-be69-82423c743304
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.36.220:/home/user/gluster/rep-brick1
Brick2: 192.168.36.220:/home/user/gluster/rep-brick2
Options Reconfigured:
features.inode-quota: off
features.quota: off
performance.readdir-ahead: on

I then killed the brick1 process.

root@dhcp-192-168-36-220:/home/user/gluster/rep-brick1# gluster v status rep-vol
Status of volume: rep-vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.36.220:/home/user/gluster/rep
-brick1                                     N/A       N/A        N       N/A
Brick 192.168.36.220:/home/user/gluster/rep
-brick2                                     49211     0          Y       20157
NFS Server on localhost                     N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       20186

Task Status of Volume rep-vol
------------------------------------------------------------------------------
There are no active volume tasks

Then I copied wish.txt to the mount directory.

From brick2:

root@dhcp-192-168-36-220:/home/user/gluster/rep-brick2/.glusterfs# getfattr -d -e hex -m . ../wish.txt
# file: ../wish.txt
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.rep-vol-client-0=0x000000020000000100000000
trusted.bit-rot.version=0x0200000000000000589ab1410003e910
trusted.gfid=0xe9f3aafb3f844bca8922a00d48abc643
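
(If I am reading the afr-v1.md spec referenced below correctly, the trusted.afr.* value is three big-endian 32-bit counters: pending data, metadata and entry operations. A minimal bash sketch to decode the value seen above; the variable name is only for illustration:

v=000000020000000100000000    # the hex value with the leading 0x stripped
echo "data=$((16#${v:0:8})) metadata=$((16#${v:8:8})) entry=$((16#${v:16:8}))"
# prints: data=2 metadata=1 entry=0

So, if my reading is right, brick2 is recording 2 pending data operations and 1 pending metadata operation against brick1.)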


root@dhcp-192-168-36-220:/home/user/gluster/rep-brick2/.glusterfs/indices/xattrop# ll
total 8
drw------- 2 root root 4096 Feb  8 13:50 ./
drw------- 4 root root 4096 Feb  8 13:48 ../
---------- 4 root root    0 Feb  8 13:50 00000000-0000-0000-0000-000000000001
---------- 4 root root    0 Feb  8 13:50 00000000-0000-0000-0000-000000000005
---------- 4 root root    0 Feb  8 13:50 e9f3aafb-3f84-4bca-8922-a00d48abc643
---------- 4 root root    0 Feb  8 13:50 xattrop-b3beb437-cea4-46eb-9eb4-8d83bfa7baa1


In the above, I can see the gfid of wish.txt (e9f3aafb-3f84-4bca-8922-a00d48abc643), which needs to be healed.
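
(As a sanity check, the entry name can be matched against the file's trusted.gfid from the getfattr output above. A small sketch, run from the brick's .glusterfs directory as in the earlier commands, that reformats the hex gfid into the same UUID form as the index entry name:

getfattr -n trusted.gfid -e hex ../wish.txt | awk -F= '/trusted.gfid/ {
    h = substr($2, 3);    # drop the leading "0x"
    printf "%s-%s-%s-%s-%s\n", substr(h,1,8), substr(h,9,4), substr(h,13,4), substr(h,17,4), substr(h,21,12)
}'
# prints e9f3aafb-3f84-4bca-8922-a00d48abc643, matching the index entry above
)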
1. What are "00000000-0000-0000-0000-000000000001" and "00000000-0000-0000-0000-000000000005"?
(I understand trusted.afr.rep-vol-client-0 as the changelog of brick1 as seen by brick2, per
https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-v1.md)

2. I know xattrop-* is a base file. How is this related to the files which require healing? (Assuming there is more than one file to be healed.)
    What does the suffix of xattrop-* (xattrop-b3beb437-cea4-46eb-9eb4-8d83bfa7baa1) signify?

3. After brick1 is brought back online, the file is healed. Now only xattrop-* remains under .glusterfs/indices/xattrop.
   But there is still a gfid entry in the .glusterfs/e9/f3 directory. Is this expected behavior?
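
(For context on question 3: my understanding is that, for a regular file, the .glusterfs/<xx>/<yy>/<gfid> entry on a brick is a hard link to the file itself. A quick sketch, run from the relevant brick's .glusterfs directory, to check that by comparing inode numbers:

stat -c '%i %n' ../wish.txt e9/f3/e9f3aafb-3f84-4bca-8922-a00d48abc643
# the same inode number for both paths means the gfid entry is just another
# hard link to wish.txt, not a stale leftover
)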






On Tue, Feb 7, 2017 at 8:21 PM, Ravishankar N <ravishankar at redhat.com>
wrote:

> On 02/07/2017 01:32 PM, jayakrishnan mm wrote:
>
>
>
> On Mon, Feb 6, 2017 at 6:05 PM, Ravishankar N <ravishankar at redhat.com>
> wrote:
>
>> On 02/06/2017 03:15 PM, jayakrishnan mm wrote:
>>
>>
>>
>> On Mon, Feb 6, 2017 at 2:36 PM, jayakrishnan mm <
>> jayakrishnan.mm at gmail.com> wrote:
>>
>>>
>>>
>>> On Fri, Feb 3, 2017 at 7:58 PM, Ravishankar N <ravishankar at redhat.com>
>>> wrote:
>>>
>>>> On 02/03/2017 09:14 AM, jayakrishnan mm wrote:
>>>>
>>>>
>>>>
>>>> On Thu, Feb 2, 2017 at 8:17 PM, Ravishankar N <ravishankar at redhat.com>
>>>> wrote:
>>>>
>>>>> On 02/02/2017 10:46 AM, jayakrishnan mm wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> How  do I determine, which part of the  code is run on the client, and
>>>>> which part of the code is run on the server nodes by merely looking at the
>>>>> the glusterfs  source code ?
>>>>> I knew  there are client side  and server side translators which will
>>>>> run on respective platforms. I am looking at part of self heal daemon
>>>>> source  (ec/afr) which will run on the server nodes  and  the part which
>>>>> run on the clients.
>>>>>
>>>>>
>>>>> The self-heal daemon that runs on the server is also a client process
>>>>> in the sense that it has client side xlators like ec or afr and
>>>>> protocol/client (see the shd volfile 'glustershd-server.vol') loaded and
>>>>> talks to the bricks like a normal client does.
>>>>> The difference is that only self-heal related 'logic' gets executed on
>>>>> the shd, while both self-heal and I/O related logic gets executed from the
>>>>> mount. The self-heal logic resides mostly in afr-self-heal*.[ch] while I/O
>>>>> related logic is there in the other files.
>>>>> HTH,
>>>>> Ravi
>>>>>
>>>>
>>>> Hi JK,
>>>>
>>>> Dear Ravi,
>>>> Thanks for your kind explanation.
>>>> So, each server node will have a separate self-heal daemon (shd) up and
>>>> running, every time a child_up event occurs, and this will be an index
>>>> healer.
>>>> And each daemon will spawn "priv->child_count" number of threads on
>>>> each server node. Correct?
>>>>
>>>> shd is always running, and yes, that many threads are spawned for index
>>>> heal when the process starts.
>>>>
>>>> 1. When exactly does a full healer spawn threads?
>>>>
>>>> Whenever you run `gluster volume heal volname full`. See afr_xl_op().
>>>> There are some bugs in launching full heal though.
>>>>
>>>> 2. When can GF_EVENT_TRANSLATOR_OP & GF_SHD_OP_HEAL_INDEX happen
>>>> together (so that the index healer spawns threads)?
>>>>     Similarly, when can GF_EVENT_TRANSLATOR_OP & GF_SHD_OP_HEAL_FULL
>>>> happen? During replace-brick?
>>>> Is it possible that the index healer and full healer spawn threads
>>>> together (so that the total number of threads is 2*priv->child_count)?
>>>>
>>>> Index heal threads wake up and run once every 10 minutes, or whatever
>>>> the cluster.heal-timeout is. They are also run when a brick comes up, like
>>>> you said, via afr_notify(). Index heal is also run when you manually launch
>>>> `gluster volume heal volname`. Again, see afr_xl_op().
>>>>
>>>> 3. In /var/lib/glusterd/glustershd/glustershd-server.vol, why is
>>>> debug/io-stats chosen as the top xlator?
>>>>
>>>> io-stats is generally loaded as the top most xlator in all graphs at
>>>> the appropriate place for gathering profile-info, but for shd, I'm not sure
>>>> if it has any specific use other than acting as a placeholder as a parent
>>>> to all replica xlators.
>>>>
>>>
>>
>>
>> Hi Ravi,
>>
>> The self-heal daemon searches the .glusterfs/indices/xattrop directory
>> for the files/dirs to be healed. Who is updating this information, and
>> on what basis?
>>
>>
>> Please see
>> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-v1.md;
>> it is a bit dated (relevant to AFR v1, which is in glusterfs 3.5 and older,
>> I think) but the concepts are similar. The entries are added/removed by the
>> index translator during the pre-op/post-op phases of the AFR transaction.
>>
>
> Hi Ravi,
>
>   I went through the document & source code. I see there are options to
> enable/disable entry/data/metadata change logs. If "data-change-log" is
> 1 (by default, it is 1), the data change log is enabled, which makes
> __changelog_enabled() return 1 and thereafter
> call afr_changelog_pre_op(). Similar logic applies for post-op, which occurs
> just before unlock.
> Is this responsible for creating/deleting entries inside
> .glusterfs/indices/xattrop?
>
> Yes, index_xattrop() adds the entry during pre-op and removes it during
> post-op if it was successful.
>
> Currently I can't verify this, since the mount point for the rep volume hangs
> when data-change-log is set to 0 (using glusterfs v3.7.15). Ideally, the
> entries should not appear (in the case of a brick failure and a write
> thereafter) if this option is set to '0'. Am I correct?
>
> Best Regards
> JK
>
>
>>
>>
>> Thanks Ravi, for the explanation.
>>> Regards
>>> JK
>>>
>>>>
>>>> Regards,
>>>> Ravi
>>>>
>>>> Thanks
>>>> Best regards
>>>>
>>>>>
>>>>> Best regards
>>>>> JK
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-devel mailing list
>>>>> Gluster-devel at gluster.org
>>>>> http://lists.gluster.org/mailman/listinfo/gluster-devel
>>>>>
>>>>>