[Gluster-devel] Fwd: Re: [RFC] Block Device Xlator Design

Amar Tumballi amarts at redhat.com
Wed Jul 11 10:56:05 UTC 2012


Wrong mail-id was used earlier; please refer below.


-------- Original Message --------
Subject: Re: [Gluster-devel] [RFC] Block Device Xlator Design
Date: Wed, 11 Jul 2012 16:24:24 +0530
From: Amar Tumballi <atumball at redhat.com>
To: M. Mohan Kumar <mohan at in.ibm.com>
CC: Shishir Gowda <sgowda at redhat.com>, gluster-devel at nongnu.org



>
> I posted GlusterFS server xlator patches to enable exporting Block
> Devices (currently only Logical Volumes) as regular files at the
> client side a couple of weeks ago. Here is the link for the patches:
>         http://review.gluster.com/3551
>
> I would like to discuss the design of this xlator.
>
> The current code uses the lvm2-devel library to find the list of logical
> volumes for the given volume group (in the BD xlator each volume file
> exports one volume group; in the future we may extend this to export
> multiple volume groups if needed). The init routine of the BD xlator
> constructs an internal data structure holding the list of all logical
> volumes in the VG.
>

Went through the patchset, and it looks fine. One major thing to take
care of: the build should not fail, nor assume that the lvm2-devel
library is always present. Hence it should have corresponding checks in
configure.ac to handle the situation. (For reference, you can look into
how the libibverbs-devel dependency is handled.)
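A rough sketch of the kind of configure.ac check meant here; the library,
function, and conditional names below are illustrative assumptions, not the
actual patch (the real check should mirror the existing libibverbs one):

```
dnl Sketch: detect liblvm2app and build the BD xlator only when present.
dnl Names here are assumptions for illustration, not the final patch.
AC_CHECK_LIB([lvm2app], [lvm_init], [HAVE_BD_LIB="yes"], [HAVE_BD_LIB="no"])
AC_CHECK_HEADERS([lvm2app.h], [], [HAVE_BD_LIB="no"])
AM_CONDITIONAL([ENABLE_BD_XLATOR], [test "x$HAVE_BD_LIB" = "xyes"])
```

With this, a missing lvm2-devel simply disables the BD xlator instead of
breaking the whole build.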


> When an open request comes, the corresponding open interface in the BD
> xlator opens the intended LV using the path /dev/<vg-name>/<lv-name>.
> This path is actually a symbolic link to /dev/dm-<x>. Is my assumption
> about /dev/<vg-name>/<lv-name> correct? Will it always work?

This should be fine. One concern here is how we keep track of
gfid-to-path mappings; with proper resolution there, we can guarantee
the behavior.

>
> Also, if there is a request to create a file (which in turn has to
> create an LV at the server side), the lvm2 API is used to create a
> logical volume in the given VG, but with a pre-determined size of one
> logical extent, because the create interface does not take size as a
> parameter while size is required to create a logical volume.
>
> In a typical VM disk image scenario qemu-img first creates a file and
> then uses truncate command to set the required file size. So this should
> not be an issue with this kind of usage.
>

I think creat() followed by ftruncate() should just work fine too.


> But there are other issues in the BD xlator code as of now. The lvm2
> API does not support resizing an LV or creating a snapshot of an LV,
> but there are tools available to do the same. So the BD xlator code
> forks and executes the required binary to achieve the functionality;
> e.g. when truncate is called on a BD xlator volume, it results in
> running the lvresize binary with the required parameters. I checked
> with the lvm2-devel mailing list about their plans to support LV
> resizing and snapshot creation and am waiting for responses.
>
> Is it okay to rely on external binaries to create a snapshot of a LV and
> resize it?
>

It is ok to call external binaries (security issues are present, but
that is a different topic of discussion). Two things to take care of
here:

1. As Avati rightly mentioned, utilize the runner
('libglusterfs/src/run.h') interface.

2. If you are expecting/waiting on the return value of these binaries,
then we have to make sure we have a mechanism to handle hang situations.

> Also, when an LV is created out-of-band, for example using the gluster
> CLI to create an LV (I am working on the gluster CLI patches to create
> and copy/snapshot LVs), the BD xlator will not be aware of these
> changes. I am looking at whether the 'notify' feature of xlators can be
> used to notify the BD xlator to create an LV or snapshot instead of
> doing it from the gluster management xlators. I have sent a mail to
> gluster-devel asking for more information about this.
>

Refer to Shishir's response for this.

Hope this serves as an initial review comment.

Regards,
Amar
