[Gluster-users] libgfapi access

Pranith Kumar Karampuri pkarampu at redhat.com
Wed Dec 16 12:14:59 UTC 2015



On 12/16/2015 02:24 PM, Poornima Gurusiddaiah wrote:
> Answers inline
>
> ----- Original Message -----
>> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>> To: "Ankireddypalle Reddy" <areddy at commvault.com>, "Vijay Bellur" <vbellur at redhat.com>, gluster-users at gluster.org,
>> "Shyam" <srangana at redhat.com>, "Niels de Vos" <ndevos at redhat.com>
>> Sent: Wednesday, December 16, 2015 1:14:35 PM
>> Subject: Re: [Gluster-users] libgfapi access
>>
>>
>>
>> On 12/16/2015 01:51 AM, Ankireddypalle Reddy wrote:
>>> Thanks for the explanation. Valgrind profiling shows multiple memcpy calls
>>> being invoked for each write through libgfapi. Is there a way to avoid
>>> these memcpys? Also, is there a limit on the number of glfs_t* instances
> For every buffer passed by the application to libgfapi, libgfapi makes a copy,
> for various reasons. As of yet there is no way to avoid this; there have been some
> discussions on adding the capability for libgfapi to provide the buffer so that these copies can be avoided.
>
>>> that can be allocated at a given point in time. I've encountered cases
>>> where, if more than 8 glfs_t* instances are allocated, glfs_init
>>> fails.
> Currently there is no way to limit the number of glfs_t instances per process,
> but it is quite easy to implement this in the application itself. What application
> are you using libgfapi for?
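A minimal sketch of one way an application could cap the number of live glfs_t
instances itself, since gfapi does not enforce such a limit. The cap value,
transport, and include path below are illustrative and may need adjusting for a
particular install:

/*
 * Sketch: application-side cap on live glfs_t instances.
 * Build (typical): gcc -o gfapi_cap gfapi_cap.c -lgfapi -lpthread
 */
#include <pthread.h>
#include <glusterfs/api/glfs.h>

#define MAX_GLFS_INSTANCES 8   /* illustrative application-side cap */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int live_instances;

static glfs_t *
acquire_glfs(const char *volname, const char *host, int port)
{
        glfs_t *fs = NULL;

        pthread_mutex_lock(&lock);
        if (live_instances >= MAX_GLFS_INSTANCES) {
                pthread_mutex_unlock(&lock);
                return NULL;            /* over the cap: caller retries later */
        }
        live_instances++;
        pthread_mutex_unlock(&lock);

        fs = glfs_new(volname);
        if (!fs)
                goto err;
        if (glfs_set_volfile_server(fs, "tcp", host, port) || glfs_init(fs)) {
                glfs_fini(fs);
                goto err;
        }
        return fs;

err:
        pthread_mutex_lock(&lock);
        live_instances--;
        pthread_mutex_unlock(&lock);
        return NULL;
}

static void
release_glfs(glfs_t *fs)
{
        glfs_fini(fs);
        pthread_mutex_lock(&lock);
        live_instances--;
        pthread_mutex_unlock(&lock);
}

Turning the hard glfs_init failure into an explicit application-side limit also
makes it easier to retry or queue work instead of failing outright.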
Do you think it is possible for you to share the C program that leads to this
problem? It would be easier to find the problem that way.

Pranith
>
>> Including maintainers of gfapi.
>>
>> Pranith
>>
>>>    
>>>
>>> Thanks and Regards,
>>> Ram
>>>
>>> -----Original Message-----
>>> From: Pranith Kumar Karampuri [mailto:pkarampu at redhat.com]
>>> Sent: Monday, December 14, 2015 11:13 PM
>>> To: Ankireddypalle Reddy; Vijay Bellur; gluster-users at gluster.org
>>> Subject: Re: [Gluster-users] libgfapi access
>>>
>>>
>>>
>>> On 12/11/2015 08:58 PM, Ankireddypalle Reddy wrote:
>>>> Pranith,
>>>>                    Thanks for checking this. Though the time taken to run
>>>>                    was 18 seconds in both cases, if you look at the time
>>>>                    consumed in user land as well as kernel land for
>>>>                    executing the command, it is evident that fuse took
>>>>                    almost half the time of libgfapi. Also, from the
>>>>                    collected profiles it is evident that the average
>>>>                    latency for the write command is lower for fuse than
>>>>                    for libgfapi. Are there any recommendations for I/O
>>>>                    through libgfapi on disperse volumes? Is there any
>>>>                    way to avoid the extra memcpys that are made when
>>>>                    performing I/O through libgfapi?
>>> Hi Ankireddy,
>>>            Oh, this is not a problem. If we use fuse, the system call
>>>            'write' from ./GlusterFuseTest goes through the fuse kernel
>>>            module, which sends the write operation to the glusterfs
>>>            mount process, a user-space process. From then on, the time
>>>            taken to complete that call is accounted against the
>>>            glusterfs mount process until it responds to the fuse kernel
>>>            module, not against the ./GlusterFuseTest process. If we use
>>>            gfapi, there is no system-call overhead; instead the test
>>>            process (./GlusterLibGFApiTest) talks to the bricks directly
>>>            through the gfapi library. So all the time that the process
>>>            spends communicating with the bricks and getting the
>>>            responses is counted against it. That is the reason you see
>>>            more 'user' time.
>>>
>>> So again, there are quite a few workloads where gfapi has proven to give
>>> better response times than fuse mounts, because we avoid the context-switch
>>> costs of ./GlusterFuseTest -> fuse-kernel -> glusterfs-mount ->
>>> fuse-kernel (for the response) -> ./GlusterFuseTest (for the response to 'write').
>>>
>>> Hope that helps. Sorry for the delay in responding; I was in too many
>>> meetings yesterday.
>>>
>>> Pranith
>>>> Thanks and Regards,
>>>> Ram
>>>>
>>>> -----Original Message-----
>>>> From: Pranith Kumar Karampuri [mailto:pkarampu at redhat.com]
>>>> Sent: Thursday, December 10, 2015 10:57 PM
>>>> To: Ankireddypalle Reddy; Vijay Bellur; gluster-users at gluster.org
>>>> Subject: Re: [Gluster-users] libgfapi access
>>>>
>>>>
>>>>
>>>> On 12/10/2015 07:15 PM, Ankireddypalle Reddy wrote:
>>>>> Hi,
>>>>>          Please let me know in case you need any more details. Even for
>>>>>          write-only operations fuse seems to outperform libgfapi. Is it
>>>>>          because of disperse volumes? Also, I noticed a lot of data loss
>>>>>          when I use libgfapi async I/O for disperse volumes.
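One thing worth checking with async gfapi I/O is whether the application waits
for every queued write to complete before closing the fd; writes still in
flight at close time can look like data loss. A rough sketch of that pattern,
assuming the 3.x-era glfs_io_cbk callback signature (later releases added stat
arguments); the helper names are illustrative:

#include <stdio.h>
#include <pthread.h>
#include <glusterfs/api/glfs.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int pending;

/* Completion callback: one fires per successfully submitted async write. */
static void
write_done(glfs_fd_t *fd, ssize_t ret, void *data)
{
        if (ret < 0)
                fprintf(stderr, "async write failed: %zd\n", ret);
        pthread_mutex_lock(&lock);
        pending--;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
}

/* Queue one async write and account for it. */
static int
queue_write(glfs_fd_t *fd, const char *buf, size_t len)
{
        int ret;

        pthread_mutex_lock(&lock);
        pending++;
        pthread_mutex_unlock(&lock);

        ret = glfs_write_async(fd, buf, len, 0, write_done, NULL);
        if (ret < 0) {
                /* Submission failed: no callback will fire for this write. */
                pthread_mutex_lock(&lock);
                pending--;
                pthread_mutex_unlock(&lock);
        }
        return ret;
}

/* Wait until every queued write has completed. */
static void
drain_writes(void)
{
        pthread_mutex_lock(&lock);
        while (pending)
                pthread_cond_wait(&cond, &lock);
        pthread_mutex_unlock(&lock);
}

Calling drain_writes() before glfs_close() and glfs_fini() ensures nothing is
still in flight when the file is closed.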
>>>> Fuse and gfapi seem to take the same amount of time to complete the run,
>>>> i.e. 18 seconds. Could you let me know what you mean by fuse outperforming
>>>> gfapi?
>>>>
>>>> Pranith
>>>>> Thanks and Regards,
>>>>> Ram
>>>>>
>>>>> -----Original Message-----
>>>>> From: Ankireddypalle Reddy
>>>>> Sent: Wednesday, December 09, 2015 5:01 PM
>>>>> To: 'Pranith Kumar Karampuri'; Vijay Bellur;
>>>>> gluster-users at gluster.org
>>>>> Subject: RE: [Gluster-users] libgfapi access
>>>>>
>>>>> Hi,
>>>>>              I upgraded my setup to gluster 3.7.3 and tested writes by
>>>>>              performing them through fuse and through libgfapi. Attached
>>>>>              are the profiles generated from fuse and libgfapi. The test
>>>>>              programs essentially write 10000 blocks of 128K each.
>>>>>
>>>>> [root at santest2 Base]# time ./GlusterFuseTest /ws/glus 131072 10000
>>>>> Mount path: /ws/glus
>>>>> Block size: 131072
>>>>> Num of blocks: 10000
>>>>> Will perform write test on mount path : /ws/glus
>>>>> Succesfully created file /ws/glus/1449697583.glfs
>>>>> Successfully filled file /ws/glus/1449697583.glfs
>>>>> Write test succeeded
>>>>> Write test succeeded.
>>>>>
>>>>> real    0m18.722s
>>>>> user    0m3.913s
>>>>> sys     0m1.126s
>>>>>
>>>>> [root at santest2 Base]# time ./GlusterLibGFApiTest dispersevol santest2 24007 131072 10000
>>>>> Host name: santest2
>>>>> Volume: dispersevol
>>>>> Port: 24007
>>>>> Block size: 131072
>>>>> Num of blocks: 10000
>>>>> Will perform write test on volume: dispersevol
>>>>> Successfully filled file 1449697651.glfs
>>>>> Write test succeeded
>>>>> Write test succeeded.
>>>>>
>>>>> real    0m18.630s
>>>>> user    0m8.804s
>>>>> sys     0m1.870s
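For reference, a gfapi write test of this shape boils down to roughly the
following (a sketch only, since the GlusterLibGFApiTest source was not posted;
the output file name and error handling here are illustrative):

/*
 * Sketch: write <numblocks> blocks of <blocksize> bytes to a new file on
 * the volume through gfapi.
 * Build (typical): gcc -o gfapi_write_test gfapi_write_test.c -lgfapi
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <glusterfs/api/glfs.h>

int
main(int argc, char *argv[])
{
        if (argc != 5) {
                fprintf(stderr, "usage: %s <volume> <host> <blocksize> <numblocks>\n", argv[0]);
                return 1;
        }

        const char *volname = argv[1];
        const char *host    = argv[2];
        size_t blksz        = (size_t)atol(argv[3]);
        long   nblocks      = atol(argv[4]);

        glfs_t *fs = glfs_new(volname);
        if (!fs || glfs_set_volfile_server(fs, "tcp", host, 24007) ||
            glfs_init(fs)) {
                fprintf(stderr, "glfs_init failed\n");
                return 1;
        }

        glfs_fd_t *fd = glfs_creat(fs, "gfapi-write-test.dat", O_WRONLY, 0644);
        if (!fd) {
                fprintf(stderr, "glfs_creat failed\n");
                glfs_fini(fs);
                return 1;
        }

        char *buf = malloc(blksz);
        if (!buf) {
                glfs_close(fd);
                glfs_fini(fs);
                return 1;
        }
        memset(buf, 'a', blksz);

        for (long i = 0; i < nblocks; i++) {
                /* Each glfs_write() is copied internally by gfapi, which is
                 * where the per-buffer memcpy discussed in this thread
                 * comes from. */
                if (glfs_write(fd, buf, blksz, 0) != (ssize_t)blksz) {
                        fprintf(stderr, "short write at block %ld\n", i);
                        break;
                }
        }

        free(buf);
        glfs_close(fd);
        glfs_fini(fs);
        return 0;
}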
>>>>>
>>>>> Thanks and Regards,
>>>>> Ram
>>>>>
>>>>>       
>>>>>
>>>>> -----Original Message-----
>>>>> From: Pranith Kumar Karampuri [mailto:pkarampu at redhat.com]
>>>>> Sent: Wednesday, December 09, 2015 1:39 AM
>>>>> To: Ankireddypalle Reddy; Vijay Bellur; gluster-users at gluster.org
>>>>> Subject: Re: [Gluster-users] libgfapi access
>>>>>
>>>>>
>>>>>
>>>>> On 12/08/2015 08:28 PM, Ankireddypalle Reddy wrote:
>>>>>> Vijay,
>>>>>>                   We are trying to write data backed up by Commvault
>>>>>>                   Simpana to a glusterfs volume. The data being written
>>>>>>                   is around 30 GB. Two kinds of write requests happen:
>>>>>> 	1) 1MB requests
>>>>>> 	2) Small write requests of 128 bytes. With libgfapi access
>>>>>> 	these are cached and a single 128KB write request is made, whereas in
>>>>>> 	the case of FUSE the 128-byte write requests are handed to FUSE directly.
>>>>>>
>>>>>> 	glusterfs 3.6.5 built on Aug 24 2015 10:02:43
>>>>>>
>>>>>> 	Volume Name: dispersevol
>>>>>> 	Type: Disperse
>>>>>> 	Volume ID: c5d6ccf8-6fec-4912-ab2e-6a7701e4c4c0
>>>>>> 	Status: Started
>>>>>> 	Number of Bricks: 1 x (2 + 1) = 3
>>>>>> 	Transport-type: tcp
>>>>>> 	Bricks:
>>>>>> 	Brick1: ssdtest:/mnt/ssdfs1/brick3
>>>>>> 	Brick2: sanserver2:/data/brick3
>>>>>> 	Brick3: santest2:/home/brick3
>>>>>> 	Options Reconfigured:
>>>>>> 	performance.cache-size: 512MB
>>>>>> 	performance.write-behind-window-size: 8MB
>>>>>> 	performance.io-thread-count: 32
>>>>>> 	performance.flush-behind: on
>>>>> Hi,
>>>>>           Things look okay. Maybe we can find something using the profile
>>>>>           info.
>>>>>
>>>>> Could you post the results of the following operations:
>>>>> 1) gluster volume profile <volname> start
>>>>> 2) Run the fuse workload
>>>>> 3) gluster volume profile <volname> info > /path/to/file-1/to/send/us
>>>>> 4) Run the libgfapi workload
>>>>> 5) gluster volume profile <volname> info > /path/to/file-2/to/send/us
>>>>>
>>>>> Send both these files to us so we can check whether any extra fops are
>>>>> sent over the network that may be causing the delay.
>>>>>
>>>>> I see that you are using a disperse volume. If you are going to use
>>>>> disperse volumes for production use cases, I suggest you use 3.7.x,
>>>>> preferably 3.7.3. There is a bug in the releases from 3.7.4 to 3.7.6;
>>>>> the fix will be released in 3.7.7.
>>>>>
>>>>> Pranith
>>>>>> Thanks and Regards,
>>>>>> Ram
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Vijay Bellur [mailto:vbellur at redhat.com]
>>>>>> Sent: Monday, December 07, 2015 6:13 PM
>>>>>> To: Ankireddypalle Reddy; gluster-users at gluster.org
>>>>>> Subject: Re: [Gluster-users] libgfapi access
>>>>>>
>>>>>> On 12/07/2015 10:29 AM, Ankireddypalle Reddy wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>              I am trying to use the libgfapi interface to access a
>>>>>>> gluster volume. What I noticed is that reads/writes to the gluster
>>>>>>> volume through the libgfapi interface are slower than through FUSE. I
>>>>>>> was expecting the contrary. Are there any recommended settings to be
>>>>>>> used with the libgfapi interface?
>>>>>>>
>>>>>> Can you please provide more details about your tests? Providing
>>>>>> information like I/O block size, file size, throughput would be
>>>>>> helpful.
>>>>>>
>>>>>> Thanks,
>>>>>> Vijay
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>


