[Gluster-users] [Gluster-devel]glusterfs crashed lead by liblvm2app.so with BD xlator

Tue Nov 11 07:45:18 UTC 2014

On Monday, November 10, 2014, Vijay Bellur <vbellur at redhat.com> wrote:

> On 11/08/2014 03:50 PM, Jaden Liang wrote:
>
>>
>> Hi all,
>>
>> We are testing the BD xlator to verify KVM running on gluster. After some
>> simple tests, we encountered a coredump of glusterfs caused by
>> liblvm2app.so. We hope someone here can give some advice on this issue.
>>
>> We have been debugging for some time and found that this coredump is
>> triggered by a thread-safety issue. In the core file, the top function is
>> _update_mda(), with an invalid pointer that comes from
>> lvmcache_foreach_mda(). As we know, glusterfsd has several io threads to
>> simulate async io, so more than one thread can run into bd_statfs_cbk()
>> at the same time. In liblvm2app.so, _text_read() looks up an info item in
>> a hash table named _pvid_hash. If no info item exists, it allocates a new
>> one. However, there isn't any lock to protect these operations!
>> liblvm2app.so will crash under multiple threads with a sequence like
>> this:
>>
>> Thread A and thread B go into bd_statfs_cbk() at the same time:
>> 1. A allocates a new info node, puts it into _pvid_hash, and calls
>> lvmcache_foreach_mda().
>> 2. B looks up the info node created by A in _pvid_hash and passes it to
>> lvmcache_del_mdas(), which frees the info node.
>> 3. A keeps using the info node that B has freed.
>> 4. Memory corruption and crash...
>>
>>
> Thanks for the report and the steps to recreate the problem.
>
>> #9  0x00007f83b599753f in _lvm_vg_open (mode=0x7f83b5c8971e "r",
>> vgname=0x11d2c50 "bd-vg", libh=0x11d3c40,
>>      flags=<optimized out>) at lvm_vg.c:221
>> #10 lvm_vg_open (libh=0x11d3c40, vgname=0x11d2c50 "bd-vg",
>> mode=mode@entry=0x7f83b5c8971e "r", flags=flags@entry=0)
>>      at lvm_vg.c:238
>> #11 0x00007f83b5c7ee36 in bd_statfs_cbk (frame=0x7f83b95416e4,
>> cookie=<optimized out>, this=0x119eb90, op_ret=0, op_errno=0,
>>      buff=0x7f83b1d0ac70, xdata=0x0) at bd.c:353
>>
>
>
> One quick fix would be to serialize calls to lvm_vg_open() by holding a
> lock in bd xlator. Have you tried attempting that?

> -Vijay
>

Thanks for reviewing this issue.

Yes, we also noticed this spot and added a lock around lvm_vg_open() to
test. It looks fine so far. We are going to run some more tests with a
real KVM to verify this modification, and then submit a patch.
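
For illustration, here is a minimal sketch of the serialization idea discussed above, assuming a single process-wide pthread mutex. It is not the actual patch: unsafe_vg_open_stub() is a hypothetical stand-in for lvm_vg_open() (the real liblvm2app call is not used here), and serialized_vg_open(), worker() and run_workers() are illustrative names, not taken from the bd xlator source.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS      16
#define CALLS_PER_THREAD 10000

/* Bookkeeping touched by the stub. All of it is unsynchronized on purpose,
 * the way shared state in a non-thread-safe library would be. */
static int  callers_inside     = 0;
static int  max_callers_inside = 0;
static long total_calls        = 0;

/* Hypothetical stand-in for a non-thread-safe call such as lvm_vg_open().
 * It records how many callers are inside it at the same time. */
static void unsafe_vg_open_stub(void)
{
    callers_inside++;
    if (callers_inside > max_callers_inside)
        max_callers_inside = callers_inside;
    total_calls++;
    callers_inside--;
}

/* Assumption: one process-wide lock guarding every call into the library. */
static pthread_mutex_t vg_open_lock = PTHREAD_MUTEX_INITIALIZER;

/* Wrapper that lets only one thread at a time enter the stub. */
static void serialized_vg_open(void)
{
    pthread_mutex_lock(&vg_open_lock);
    unsafe_vg_open_stub();
    pthread_mutex_unlock(&vg_open_lock);
}

/* Each worker plays the role of an io thread calling into the library. */
static void *worker(void *arg)
{
    int i;

    (void)arg;
    for (i = 0; i < CALLS_PER_THREAD; i++)
        serialized_vg_open();
    return NULL;
}

/* Start nthreads workers, wait for them, and return the largest number of
 * callers that were ever inside the stub simultaneously. */
int run_workers(int nthreads)
{
    pthread_t tids[MAX_THREADS];
    int i;

    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    for (i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return max_callers_inside;
}
```

With the lock in place, run_workers(4) returns 1, because no two threads are ever inside the stub together. A single coarse lock like this trades concurrency for safety, which seems acceptable when the guarded call is short and infrequent.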