[Gluster-users] question about info and info.tmp

Atin Mukherjee amukherj at redhat.com
Thu Nov 24 11:12:07 UTC 2016


Xin - I appreciate your patience. I'd need some more time to pick this item
up from my backlog. I believe we have a workaround applicable here too.

On Thu, 24 Nov 2016 at 14:24, songxin <songxin_1980 at 126.com> wrote:

>
>
>
> Hi Atin,
> Actually, glusterfs is used in my project, and our test team found this
> issue. So I want to confirm whether you plan to fix it.
> If you do, I will wait for your fix, because your method should be better
> than mine.
>
> Thanks,
> Xin
>
>
> On 2016-11-21 10:00:36, "Atin Mukherjee" <atin.mukherjee83 at gmail.com> wrote:
>
> Hi Xin,
>
> I've not got a chance to look into it yet. The delete stale volume function
> is in place to take care of wiping off volume configuration data that has
> been deleted from the cluster. However, we need to revisit this code to see
> whether this function is still needed, given that we recently added a
> validation to fail a delete request if one of the glusterd instances is
> down. I'll get back to you on this.
>
> On Mon, 21 Nov 2016 at 07:24, songxin <songxin_1980 at 126.com> wrote:
>
> Hi Atin,
> Thank you for your support.
>
> Are there any conclusions about this issue?
>
> Thanks,
> Xin
>
>
>
>
>
> On 2016-11-16 20:59:05, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
>
>
> On Tue, Nov 15, 2016 at 1:53 PM, songxin <songxin_1980 at 126.com> wrote:
>
> ok, thank you.
>
>
>
>
> On 2016-11-15 16:12:34, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
>
>
> On Tue, Nov 15, 2016 at 12:47 PM, songxin <songxin_1980 at 126.com> wrote:
>
>
> Hi Atin,
>
> I think the root cause is in the function glusterd_import_friend_volume,
> shown below.
>
> int32_t
> glusterd_import_friend_volume (dict_t *peer_data, size_t count)
> {
> ...
>         ret = glusterd_volinfo_find (new_volinfo->volname, &old_volinfo);
>         if (0 == ret) {
>                 (void) gd_check_and_update_rebalance_info (old_volinfo,
>                                                            new_volinfo);
>                 (void) glusterd_delete_stale_volume (old_volinfo,
>                                                      new_volinfo);
>         }
> ...
>         ret = glusterd_store_volinfo (new_volinfo,
>                                       GLUSTERD_VOLINFO_VER_AC_NONE);
>         if (ret) {
>                 gf_msg (this->name, GF_LOG_ERROR, 0,
>                         GD_MSG_VOLINFO_STORE_FAIL, "Failed to store "
>                         "volinfo for volume %s", new_volinfo->volname);
>                 goto out;
>         }
> ...
> }
>
> glusterd_delete_stale_volume will remove info and bricks/*, and
> glusterd_store_volinfo will create the new ones.
> But if glusterd is killed before the rename, the info file will be empty.
>
> And glusterd will then fail to start, because info is empty the next time
> you start glusterd.
>
> Any idea, Atin?
>
>
> Give me some time and I will check it out, but from this analysis it looks
> very possible: a volume is changed while glusterd is down on node A, and
> when that node comes up we update the volinfo during the peer handshake;
> if glusterd goes down once again during that window, we hit this. I'll
> confirm it by tomorrow.
>
>
> I checked the code and it does look like you have got the right RCA for
> the issue, which you simulated through those two scripts. However, this
> can happen even when you create a fresh volume: if glusterd writes the
> content into the store and goes down before renaming the info.tmp file,
> you get into the same situation.
>
> I'd really need to think through if this can be fixed. Suggestions are
> always appreciated.
>
>
>
>
> BTW, excellent work Xin!
>
>
> Thanks,
> Xin
>
>
> On 2016-11-15 12:07:05, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
>
>
> On Tue, Nov 15, 2016 at 8:58 AM, songxin <songxin_1980 at 126.com> wrote:
>
> Hi Atin,
> I have some clues about this issue.
> I could reproduce this issue using the script mentioned in
> https://bugzilla.redhat.com/show_bug.cgi?id=1308487 .
>
>
> I really appreciate your help in trying to nail down this issue. While I
> was going through your email and the code to figure out the possible
> cause, unfortunately I didn't see any script in the attachments of the
> bug. Could you please cross-check?
>
>
>
> After I added some debug prints, shown below, in glusterd-store.c, I
> found that /var/lib/glusterd/vols/xxx/info and
> /var/lib/glusterd/vols/xxx/bricks/* are removed,
> but the other files in /var/lib/glusterd/vols/xxx/ are not removed.
>
> int32_t
> glusterd_store_volinfo (glusterd_volinfo_t *volinfo,
>                         glusterd_volinfo_ver_ac_t ac)
> {
>         int32_t     ret = -1;
>         struct stat buf = {0,};
>
>         GF_ASSERT (volinfo);
>
>         ret = access ("/var/lib/glusterd/vols/gv0/info", F_OK);
>         if (ret < 0) {
>                 gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
>                         "info does not exist (%d)", errno);
>         } else {
>                 ret = stat ("/var/lib/glusterd/vols/gv0/info", &buf);
>                 if (ret < 0) {
>                         gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
>                                 "stat info error");
>                 } else {
>                         gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
>                                 "info size is %lu, inode num is %lu",
>                                 buf.st_size, buf.st_ino);
>                 }
>         }
>
>         glusterd_perform_volinfo_version_action (volinfo, ac);
>         ret = glusterd_store_create_volume_dir (volinfo);
>         if (ret)
>                 goto out;
>
> ...
> }
>
> So it is easy to understand why info or
> 10.32.1.144.-opt-lvmdir-c2-brick is sometimes empty.
> It is because the info file does not exist, so it is created by "fd =
> open (path, O_RDWR | O_CREAT | O_APPEND, 0600);" in the function
> gf_store_handle_new,
> and the info file is empty before the rename.
> So the info file is empty if glusterd shuts down before the rename.
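The open() flags quoted above are enough to show the effect in isolation: O_CREAT materializes a zero-length file the moment the handle is created, so nothing needs to be written for an empty info to appear on disk. This is a minimal standalone sketch, not glusterd code; `size_after_create` is a made-up helper name.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns the size of `path` right after a gf_store_handle_new-style
 * open, creating the file if it does not exist; -1 on error. */
static long size_after_create(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return (long) st.st_size;  /* 0 for a file that did not exist before */
}
```

If the process dies between this open and the later rename, the zero-length file is what the next glusterd start finds.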
>
>
>
> My questions are the following.
> 1. I could not find the point where info is removed. Could you tell me
> the point where info and bricks/* are removed?
> 2. Why are info and bricks/* removed, while the other files in
> /var/lib/glusterd/vols/xxx/ are not?
>
>
> AFAIK, we never delete the info file, and hence this file is opened with
> the O_APPEND flag. As I said, I will go back and cross-check the code once
> again.
>
>
>
>
> Thanks,
> Xin
>
>
> On 2016-11-11 20:34:05, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
>
>
> On Fri, Nov 11, 2016 at 4:00 PM, songxin <songxin_1980 at 126.com> wrote:
>
> Hi Atin,
>
> Thank you for your support.
> Sincerely wait for your reply.
>
> By the way, could you confirm that the issue, the info file being empty,
> is caused by the rename being interrupted in the kernel?
>
>
> As per my RCA on that bug, it looked to be.
>
>
>
> Thanks,
> Xin
>
> On 2016-11-11 15:49:02, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
>
>
> On Fri, Nov 11, 2016 at 1:15 PM, songxin <songxin_1980 at 126.com> wrote:
>
> Hi Atin,
> Thank you for your reply.
> Actually, it is very difficult to reproduce because I don't know when
> there was an ongoing commit happening. It was just a coincidence.
> But I want to be sure of the root cause.
>
>
> I'll give it another try and see if this situation can be
> simulated/reproduced, and will keep you posted.
>
>
>
> So I would be grateful if you could answer my questions below.
>
> You said that "This issue is hit at part of the negative testing where
> while gluster volume set was executed at the same point of time glusterd in
> another instance was brought down. In the faulty node we could see
> /var/lib/glusterd/vols/<volname>info file been empty whereas the info.tmp
> file has the correct contents." in comment.
>
> I have two questions for you.
>
> 1. Could you reproduce this issue by running gluster volume set while
> glusterd was brought down?
> 2. Could you be certain that this issue is caused by the rename being
> interrupted in the kernel?
>
> In my case there are two files, info and 10.32.1.144.-opt-lvmdir-c2-brick,
> that are both empty.
> But in my view only one rename can run at a time because of the big lock.
> Why are both files empty?
>
> Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick")
> be running in two threads?
>
> Thanks,
> Xin
>
>
> On 2016-11-11 15:27:03, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
>
>
> On Fri, Nov 11, 2016 at 12:38 PM, songxin <songxin_1980 at 126.com> wrote:
>
>
> Hi Atin,
> Thank you for your reply.
>
> As you said, the info file can only be changed sequentially in
> glusterd_store_volinfo(), because of the big lock.
>
> I have found the similar issue you mentioned, linked below.
> https://bugzilla.redhat.com/show_bug.cgi?id=1308487
>
>
> Great, so this is what I was actually trying to refer to in my first
> email when I said I had seen a similar issue. Have you got a chance to
> look at https://bugzilla.redhat.com/show_bug.cgi?id=1308487#c4 ? But in
> your case, did you try to bring down glusterd while there was an ongoing
> commit happening?
>
>
>
> You said that "This issue is hit at part of the negative testing where
> while gluster volume set was executed at the same point of time glusterd in
> another instance was brought down. In the faulty node we could see
> /var/lib/glusterd/vols/<volname>info file been empty whereas the info.tmp
> file has the correct contents." in comment.
>
> I have two questions for you.
>
> 1. Could you reproduce this issue by running gluster volume set while
> glusterd was brought down?
> 2. Could you be certain that this issue is caused by the rename being
> interrupted in the kernel?
>
> In my case there are two files, info and 10.32.1.144.-opt-lvmdir-c2-brick,
> that are both empty.
> But in my view only one rename can run at a time because of the big lock.
> Why are both files empty?
>
> Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick")
> be running in two threads?
>
> Thanks,
> Xin
>
>
>
>
> On 2016-11-11 14:36:40, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
>
>
> On Fri, Nov 11, 2016 at 8:33 AM, songxin <songxin_1980 at 126.com> wrote:
>
> Hi Atin,
>
> Thank you for your reply.
> I have two questions for you.
>
> 1. Are the two files info and info.tmp only created or changed in the
> function glusterd_store_volinfo()? I did not find any other point at
> which the two files are changed.
>
>
> If we are talking about info file volume then yes, the mentioned function
> actually takes care of it.
>
>
> 2. I found that glusterd_store_volinfo() is called at many points by
> glusterd. Is there a problem of thread synchronization? If so, one thread
> may open the same file info.tmp using the O_TRUNC flag while another
> thread is writing info.tmp. Could this case happen?
>
>
> In glusterd, threads are protected by the big lock, and I don't see a
> possibility (theoretically) of having two glusterd_store_volinfo () calls
> at a given point of time.
>
>
>
> Thanks,
> Xin
>
>
> At 2016-11-10 21:41:06, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
> Did you run out of disk space by any chance? AFAIK, the code is such that
> we write the new contents to a .tmp file and rename it over the original
> file. In case of a disk space issue I would expect both files to be of
> non-zero size. Having said that, I vaguely remember a similar issue (in
> the form of a bug or an email) landing up once, but we couldn't reproduce
> it, so my guess is that something is wrong with the atomic update here.
> I'll be glad if you have a reproducer for it, and then we can dig into it
> further.
>
> On Thu, Nov 10, 2016 at 1:32 PM, songxin <songxin_1980 at 126.com> wrote:
>
> Hi,
> When I start the glusterd some error happened.
> And the log is following.
>
> [2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main]
> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6
> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
> [2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init]
> 0-management: Maximum allowed open file descriptors set to 65536
> [2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init]
> 0-management: Using /system/glusterd as working directory
> [2016-11-08 07:58:35.024508] I [MSGID: 106514]
> [glusterd-store.c:2075:glusterd_restore_op_version] 0-management: Upgrade
> detected. Setting op-version to minimum : 1
> *[2016-11-08 07:58:35.025356] E [MSGID: 106206]
> [glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: Failed
> to get next store iter *
> *[2016-11-08 07:58:35.025401] E [MSGID: 106207]
> [glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management: Failed
> to update volinfo for c_glusterfs volume *
> *[2016-11-08 07:58:35.025463] E [MSGID: 106201]
> [glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management:
> Unable to restore volume: c_glusterfs *
> *[2016-11-08 07:58:35.025544] E [MSGID: 101019] [xlator.c:428:xlator_init]
> 0-management: Initialization of volume 'management' failed, review your
> volfile again *
> *[2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init]
> 0-management: initializing translator failed *
> *[2016-11-08 07:58:35.025629] E [graph.c:661:glusterfs_graph_activate]
> 0-graph: init failed *
> [2016-11-08 07:58:35.026109] W [glusterfsd.c:1236:cleanup_and_exit]
> (-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b260) [0x1000a718]
> -->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b3b8) [0x1000a5a8]
> -->/usr/sbin/glusterd(cleanup_and_exit-0x1c02c) [0x100098bc] ) 0-: received
> signum (0), shutting down
>
>
> And then I found that the size of vols/volume_name/info is 0, which
> causes glusterd to shut down.
> But I found that vols/volume_name/info.tmp is not 0.
> And I found that there is a brick file vols/volume_name/bricks/xxxx.brick
> whose size is 0, but vols/volume_name/bricks/xxxx.brick.tmp is not 0.
>
> I read the code of the function glusterd_store_volinfo () in
> glusterd-store.c.
> I know that info.tmp is renamed to info in the function
> glusterd_store_volume_atomic_update().
>
> But my question is why the info file is 0 while info.tmp is not 0.
>
>
> Thanks,
> Xin
>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
> --
>
> ~ Atin (atinm)
> --
- Atin (atinm)