[Gluster-devel] libgfapi usage issues: overwrites 'THIS' and use after free

Soumya Koduri skoduri at redhat.com
Sat Apr 18 10:09:31 UTC 2015



On 04/17/2015 04:51 PM, Raghavendra Talur wrote:
> On Friday 17 April 2015 03:53 PM, Poornima Gurusiddaiah wrote:
>> Hi,
>>
>> There are two concerns in the usage of libgfapi that have been
>> present from day one, but with new users of libgfapi it is now
>> necessary to fix them:
>>
>> 1. When libgfapi is used by the GlusterFS internal xlators, 'THIS'
>> gets overwritten:
>> E.g.: when snapview-server creates a new fs instance for every snap
>> that is created.
>> Currently, any libgfapi call inside the xlator overwrites the THIS
>> value to contain glfs-master (the gfapi xlator).
>> Hence, after the API exits, any further code in the parent xlator
>> referring to THIS will actually refer
>> to glfs-master (the gfapi xlator).
>>
>> Proposed solutions:
>> - Store and restore THIS in every API exposed by libgfapi; the patch
>> for the same can be found at http://review.gluster.org/#/c/9797/
>> - The other solution, suggested by Niels, is to not have internal
>> xlators call libgfapi and
>> instead move the core functionality to libglusterfs. But even with
>> this, the nested mount/ctx issue can still exist.
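The first proposed solution above amounts to a save/restore wrapper around every exported API entry point. A minimal, self-contained sketch of that pattern follows; the thread-local `this_xlator` stands in for glusterfs' real THIS macro, and `api_xlator`/`glfs_api_call_sketch` are illustrative names, not actual gfapi symbols:

```c
#include <stddef.h>

typedef struct xlator { const char *name; } xlator_t;

/* Plays the role of glusterfs' per-thread THIS. */
static __thread xlator_t *this_xlator;

/* Stand-in for glfs-master, the gfapi xlator. */
static xlator_t api_xlator = { "gfapi" };

/* A libgfapi entry point wrapped with save/restore of THIS. */
static int glfs_api_call_sketch(void)
{
    xlator_t *saved = this_xlator;   /* save the caller's THIS */
    this_xlator = &api_xlator;       /* switch to gfapi's xlator */

    /* ... the real API work would run here under gfapi's THIS ... */

    this_xlator = saved;             /* restore before returning */
    return 0;
}
```

With this wrapper in every glfs_* entry point, the caller's xlator context survives the call, so code in the parent xlator that runs after the API returns still sees its own THIS.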
>
>
> Restating the problem to make sure I understand it correctly:
> When a "glusterfs" process wants to access a volume using libgfapi, we
> have a problem where the THIS->ctx of the original glusterfs process
> and that of libgfapi conflict.
> Examples could be
> 1. Snapd which is a glusterfs brick process wants to become a glusterfs
> client process for a different volume using libgfapi.
> 2. Someday glusterd may choose to become client to a management volume
> using libgfapi.
> 3. Someone writes an application in libgfapi which gives unified view of
> multiple glusterfs volumes using libgfapi.
> 4. This can *possibly* happen without libgfapi if one glusterfs
> process does more than one glusterfs_ctx_new in a nested fashion. (I am
> not sure it is technically possible with the current infra, or if it is
> already done somewhere.)
>
> So the question boils down to: "Can we have two ctx in a nested fashion
> in a single process and switch as and when required? If so, how?"
>
> With Poornima's patch here, we have taken care of all cases where
> libgfapi is encapsulated in another (glusterfs|libgfapi) process.
>
> What we have not solved is the case where a glusterfsd process creates
> another nested ctx as mentioned in case 4 above.
>
> If we think we will never encounter case 4 above then it may just be
> enough to go ahead with this patch and solve this for libgfapi.
>
> I am not sure if I understand Niels' suggestion completely. How can a
> gluster xlator access a different volume than the one it is part of, if
> not through libgfapi?
>
>
>>
>> 2. When libgfapi APIs are called by the application on an fs object
>> that is already closed (glfs_fini()'ed):
>> Ideally, it is the application's responsibility to take care not to do
>> such things. But it is also good
>> not to crash libgfapi when such ops are performed by the application.
>> We have already seen these issues in snapview-server.
>>
>> Proposed solutions/workarounds:
>> - Do not free the fs object (leaks a few bytes); have a state bit that
>> says valid or invalid. In every API, check
>> for the fs's validity before proceeding. Patch for the same
>> at http://review.gluster.org/#/c/9797/
>> - As suggested by Niels, have a global fs pool which tracks
>> allocated/freed fs objects.
>> - Have the applications fix it, so that they do not call fops on a
>> closed fs object (unmounted fs).
>> This mandates that multithreaded/asynchronous applications have some
>> synchronization mechanism.
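The first workaround above (mark the fs invalid instead of freeing it, and check the flag on every API entry) can be sketched as follows. The struct, flag, and function names are illustrative stand-ins, not the actual libgfapi definitions:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool valid;   /* cleared by fini instead of freeing the object */
    /* ... rest of the fs state ... */
} glfs_sketch_t;

static void glfs_fini_sketch(glfs_sketch_t *fs)
{
    /* Tear down internals, but keep the object around (leaking a few
     * bytes) so later, erroneous calls can detect that it is dead. */
    fs->valid = false;
}

static int glfs_fop_sketch(glfs_sketch_t *fs)
{
    /* Validity check at the entry of every fop. */
    if (fs == NULL || !fs->valid) {
        errno = EINVAL;   /* reject ops on a fini()'ed fs */
        return -1;
    }
    /* ... perform the actual fop ... */
    return 0;
}
```

This keeps a buggy application from crashing the library, at the cost of never reclaiming the fs object itself.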
>>
>
> IMHO, allocation and freeing of the memory for fs objects should
> never be libgfapi's responsibility. They must be allocated by the
> application as per the struct defined in glfs.h.
>
> Once fini() is called on an fs, it is like a close() on an fd.
> As Poornima suggests, just set the invalid flag which is part of the
> struct. It should be the application's responsibility to free the fs
> memory after fini returns.
>
In that case, gfapi needs to expose the "struct glfs" internal structure 
to the applications so they can allocate and free the objects. Even 
then, IMO it does not solve the issue.

Instead of just "glfs_fini(fs)", applications would now make two calls:

"glfs_fini (fs)";
"FREE (fs)";

Now, what prevents other threads (in a multi-threaded application) from 
referencing the 'fs' object in parallel, unless the application assigns 
'fs' to NULL, which the application can do even now:

glfs_fini (fs);
fs = NULL;

Or am I missing something here? IMHO, applications/consumers of libgfapi 
should call "glfs_fini(fs)" only after processing on-going 
I/Os (the way we free the 'ctx' structure in glfs_fini after stopping 
all the other threads accessing it) and prevent further I/Os on that 
'fs' object by setting that object's value to NULL.
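The application-side discipline described above can be sketched with one lock shared by the I/O paths and the teardown path: detach the pointer under the lock before fini, and have I/O re-check under the same lock. `glfs_t` here is only a placeholder struct, and the spots where the real glfs_* calls would go are marked in comments:

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

typedef struct { int dummy; } glfs_t;   /* placeholder for struct glfs */

static pthread_mutex_t fs_lock = PTHREAD_MUTEX_INITIALIZER;
static glfs_t *shared_fs;               /* shared across app threads */

/* I/O path: take the lock, bail out if the fs is already gone. */
static int app_do_io(void)
{
    int ret = -EBADF;

    pthread_mutex_lock(&fs_lock);
    if (shared_fs != NULL) {
        /* ... glfs_read()/glfs_write() on shared_fs ... */
        ret = 0;
    }
    pthread_mutex_unlock(&fs_lock);
    return ret;
}

/* Teardown path: detach the pointer under the lock, then fini. */
static void app_teardown(void)
{
    glfs_t *fs;

    pthread_mutex_lock(&fs_lock);
    fs = shared_fs;
    shared_fs = NULL;                   /* no new I/O can start */
    pthread_mutex_unlock(&fs_lock);

    if (fs != NULL) {
        /* glfs_fini(fs) would go here, after in-flight I/O drains */
    }
}
```

With this scheme the synchronization lives entirely in the application, so libgfapi itself does not need to track freed fs objects.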

Thanks,
Soumya

> The above patch checks for the validity of the fs at the entry of
> every fop and just returns EINVAL whenever the invalid flag is set.
>
> The only downside to this is that it breaks API backward compatibility.
> glfs_new currently takes only the volname as its input.
> The new glfs_new would take a preallocated fs and a volname as its input.
>
>
>> Please let me know your comments on the same.
>>
>> Regards,
>> Poornima
>>
>>
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>
>
