[Gluster-devel] gluster source code help
jayakrishnan mm
jayakrishnan.mm at gmail.com
Mon Feb 13 09:25:08 UTC 2017
Hi Ravi,
Thanks. I have created a simple 2-node replica volume.
root at dhcp-192-168-36-220:/home/user/gluster/rep-brick1# gluster v info rep-vol

Volume Name: rep-vol
Type: Replicate
Volume ID: c9c9ef39-27e5-44d5-be69-82423c743304
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.36.220:/home/user/gluster/rep-brick1
Brick2: 192.168.36.220:/home/user/gluster/rep-brick2
Options Reconfigured:
features.inode-quota: off
features.quota: off
performance.readdir-ahead: on
Then I killed the brick1 process.
root at dhcp-192-168-36-220:/home/user/gluster/rep-brick1# gluster v status rep-vol
Status of volume: rep-vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.36.220:/home/user/gluster/rep
-brick1                                     N/A       N/A        N       N/A
Brick 192.168.36.220:/home/user/gluster/rep
-brick2                                     49211     0          Y       20157
NFS Server on localhost                     N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       20186

Task Status of Volume rep-vol
------------------------------------------------------------------------------
There are no active volume tasks
Then I copied wish.txt to the mount directory.
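(For completeness, this is roughly the sequence I used; the mount path
/mnt/rep-vol below is only illustrative, and the brick pid is the one shown
by 'gluster v status' before the kill:

  # kill the glusterfsd process serving brick1
  kill -15 <brick1-pid>
  # write through the fuse mount so that only brick2 receives the data
  mount -t glusterfs 192.168.36.220:/rep-vol /mnt/rep-vol
  cp wish.txt /mnt/rep-vol/
)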
From brick2:
root at dhcp-192-168-36-220:/home/user/gluster/rep-brick2/.glusterfs# getfattr -d -e hex -m . ../wish.txt
# file: ../wish.txt
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.rep-vol-client-0=0x000000020000000100000000
trusted.bit-rot.version=0x0200000000000000589ab1410003e910
trusted.gfid=0xe9f3aafb3f844bca8922a00d48abc643
root at dhcp-192-168-36-220:/home/user/gluster/rep-brick2/.glusterfs/indices/xattrop# ll
total 8
drw------- 2 root root 4096 Feb  8 13:50 ./
drw------- 4 root root 4096 Feb  8 13:48 ../
---------- 4 root root    0 Feb  8 13:50 00000000-0000-0000-0000-000000000001
---------- 4 root root    0 Feb  8 13:50 00000000-0000-0000-0000-000000000005
---------- 4 root root    0 Feb  8 13:50 e9f3aafb-3f84-4bca-8922-a00d48abc643
---------- 4 root root    0 Feb  8 13:50 xattrop-b3beb437-cea4-46eb-9eb4-8d83bfa7baa1
In the above, I can see the gfid of wish.txt
(e9f3aafb-3f84-4bca-8922-a00d48abc643), which needs to be healed.
1. What are "00000000-0000-0000-0000-000000000001" and
"00000000-0000-0000-0000-000000000005"?
(I can understand trusted.afr.rep-vol-client-0 as the changelog of brick1 as
seen by brick2, from
https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-v1.md)
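(If my reading of afr-v1.md is right, the 12-byte trusted.afr value is three
big-endian 32-bit counters, in the order data/metadata/entry pending counts.
For wish.txt on brick2 that would be:

  trusted.afr.rep-vol-client-0 = 0x 00000002 00000001 00000000
                                    data=2   metadata=1 entry=0

i.e. brick2 is accusing brick1 (client-0) of 2 pending data operations and 1
pending metadata operation on this file.)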
2. I know xattrop-* is a base file. How is this related to the files which
require healing? (Assuming there is more than one file to be healed.)
What does the numeric part of xattrop-*
(xattrop-b3beb437-cea4-46eb-9eb4-8d83bfa7baa1) signify?
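(My guess, from the link count of 4 in the listing above, is that the
gfid-named entries are just hard links to the xattrop-* base file; that
should be easy to confirm by comparing inode numbers, e.g.:

  ls -li /home/user/gluster/rep-brick2/.glusterfs/indices/xattrop/

If they share the same inode, then adding/removing an index entry is simply a
link()/unlink() against that base file.)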
3. After brick1 is brought back online, the file is healed. Now only the
xattrop-* base file remains under .glusterfs/indices/xattrop.
But there is still a gfid entry in the .glusterfs/e9/f3 directory. Is this
expected behavior?
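(My assumption here is that the entry under .glusterfs/e9/f3 is the brick's
gfid-to-file hard link for wish.txt, not a heal-index entry, so it would stay
even after the heal completes. Comparing inode numbers should confirm whether
it points at the same file as the brick copy:

  ls -li /home/user/gluster/rep-brick2/wish.txt \
    /home/user/gluster/rep-brick2/.glusterfs/e9/f3/e9f3aafb-3f84-4bca-8922-a00d48abc643
)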
On Tue, Feb 7, 2017 at 8:21 PM, Ravishankar N <ravishankar at redhat.com>
wrote:
> On 02/07/2017 01:32 PM, jayakrishnan mm wrote:
>
>
>
> On Mon, Feb 6, 2017 at 6:05 PM, Ravishankar N <ravishankar at redhat.com>
> wrote:
>
>> On 02/06/2017 03:15 PM, jayakrishnan mm wrote:
>>
>>
>>
>> On Mon, Feb 6, 2017 at 2:36 PM, jayakrishnan mm <
>> jayakrishnan.mm at gmail.com> wrote:
>>
>>>
>>>
>>> On Fri, Feb 3, 2017 at 7:58 PM, Ravishankar N <ravishankar at redhat.com>
>>> wrote:
>>>
>>>> On 02/03/2017 09:14 AM, jayakrishnan mm wrote:
>>>>
>>>>
>>>>
>>>> On Thu, Feb 2, 2017 at 8:17 PM, Ravishankar N <ravishankar at redhat.com>
>>>> wrote:
>>>>
>>>>> On 02/02/2017 10:46 AM, jayakrishnan mm wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> How do I determine which part of the code runs on the client, and which
>>>>> part runs on the server nodes, by merely looking at the glusterfs source
>>>>> code?
>>>>> I know there are client-side and server-side translators which run on
>>>>> their respective sides. I am looking at the part of the self-heal daemon
>>>>> source (ec/afr) which runs on the server nodes and the part which runs
>>>>> on the clients.
>>>>>
>>>>>
>>>>> The self-heal daemon that runs on the server is also a client process
>>>>> in the sense that it has client side xlators like ec or afr and
>>>>> protocol/client (see the shd volfile 'glustershd-server.vol') loaded and
>>>>> talks to the bricks like a normal client does.
>>>>> The difference is that only self-heal related 'logic' gets executed on
>>>>> the shd while both self-heal and I/O related logic get executed from the
>>>>> mount. The self-heal logic resides mostly in afr-self-heal*.[ch] while I/O
>>>>> related logic is there in the other files.
>>>>> HTH,
>>>>> Ravi
>>>>>
>>>>
>>>> Hi JK,
>>>>
>>>> Dear Ravi,
>>>> Thanks for your kind explanation.
>>>> So each server node will have a separate self-heal daemon (shd) up and
>>>> running every time a child_up event occurs, and this will be an index
>>>> healer.
>>>> And each daemon will spawn priv->child_count threads on each server
>>>> node. Correct?
>>>>
>>>> shd is always running, and yes, that many threads are spawned for index
>>>> heal when the process starts.
>>>>
>>>> 1. When exactly does a full healer spawn threads?
>>>>
>>>> Whenever you run `gluster volume heal volname full`. See afr_xl_op().
>>>> There are some bugs in launching full heal though.
>>>>
>>>> 2. When can GF_EVENT_TRANSLATOR_OP & GF_SHD_OP_HEAL_INDEX happen
>>>> together (so that the index healer spawns threads)?
>>>> Similarly, when can GF_EVENT_TRANSLATOR_OP & GF_SHD_OP_HEAL_FULL
>>>> happen? During replace-brick?
>>>> Is it possible that the index healer and full healer spawn threads
>>>> together (so that the total number of threads is 2*priv->child_count)?
>>>>
>>>> Index heal threads wake up and run once every 10 minutes, or whatever
>>>> cluster.heal-timeout is set to. They are also run when a brick comes up,
>>>> like you said, via afr_notify(), and when you manually launch
>>>> `gluster volume heal volname`. Again, see afr_xl_op().
>>>>
>>>> 3. In /var/lib/glusterd/glustershd/glustershd-server.vol, why is
>>>> debug/io-stats chosen as the top xlator?
>>>>
>>>> io-stats is generally loaded as the topmost xlator in all graphs at
>>>> the appropriate place for gathering profile-info, but for shd, I'm not sure
>>>> if it has any specific use other than acting as a placeholder as a parent
>>>> to all replica xlators.
>>>>
>>>
>>
>>
>> Hi Ravi,
>>
>> The self-heal daemon searches the .glusterfs/indices/xattrop directory
>> for the files/dirs to be healed. Who updates this information, and on
>> what basis?
>>
>>
>> Please see
>> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-v1.md.
>> It is a bit dated (relevant to AFR v1, which is in glusterfs 3.5 and older,
>> I think), but the concepts are similar. The entries are added/removed by
>> the index translator during the pre-op/post-op phases of the AFR
>> transaction.
>>
>
> Hi Ravi,
>
> Went through the document and source code. I see there are options to
> enable/disable the entry/data/metadata changelogs. If "data-change-log" is
> 1 (by default it is 1), this enables the data changelog, which results in
> __changelog_enabled() returning 1 and thereafter calling
> afr_changelog_pre_op(). Similar logic applies to the post-op, which occurs
> just before the unlock.
> Is this what is responsible for creating/deleting the entries inside
> .glusterfs/indices/xattrop?
>
> Yes, index_xattrop() adds the entry during pre-op and removes it during
> post-op if it was successful.
>
> Currently I can't verify this, since the mount point for the rep volume
> hangs when data-change-log is set to 0 (using glusterfs v3.7.15). Ideally,
> the entries should not appear (in the case of a brick failure and a write
> thereafter) if this option is set to '0'. Am I correct?
>
> Best Regards
> JK
>
>
>>
>>
>> Thanks, Ravi, for the explanation.
>>> Regards
>>> JK
>>>
>>>>
>>>> Regards,
>>>> Ravi
>>>>
>>>> Thanks
>>>> Best regards
>>>>
>>>>>
>>>>> Best regards
>>>>> JK
>>>>>
>>>>>
>>>>>