[Gluster-users] libgfapi access
Pranith Kumar Karampuri
pkarampu at redhat.com
Wed Dec 16 12:14:59 UTC 2015
On 12/16/2015 02:24 PM, Poornima Gurusiddaiah wrote:
> Answers inline
>
> ----- Original Message -----
>> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>> To: "Ankireddypalle Reddy" <areddy at commvault.com>, "Vijay Bellur" <vbellur at redhat.com>, gluster-users at gluster.org,
>> "Shyam" <srangana at redhat.com>, "Niels de Vos" <ndevos at redhat.com>
>> Sent: Wednesday, December 16, 2015 1:14:35 PM
>> Subject: Re: [Gluster-users] libgfapi access
>>
>>
>>
>> On 12/16/2015 01:51 AM, Ankireddypalle Reddy wrote:
>>> Thanks for the explanation. Valgrind profiling shows multiple memcpy's
>>> being invoked for each write through libgfapi. Is there a way to avoid
>>> these memcpy's? Also, is there a limit on the number of glfs_t* instances
> For every buffer passed by the application to libgfapi, libgfapi does a memcopy,
> for various reasons. As of now there is no way to avoid this; there have been some
> discussions about adding capabilities for libgfapi to provide the buffer so that
> these memcopies can be avoided.
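
For illustration, a minimal write-path sketch using the public gfapi calls
(glfs_new, glfs_set_volfile_server, glfs_init, glfs_creat, glfs_write). The
volume, host, port and file name are taken or assumed from the test output
quoted later in this thread; this is not the poster's actual program. Each
buffer handed to glfs_write() here is one of the buffers libgfapi copies
internally:

    #include <glusterfs/api/glfs.h>   /* header path may vary by install */
    #include <fcntl.h>
    #include <string.h>

    int main(void)
    {
        /* volume/host/port as in the gfapi test run quoted below */
        glfs_t *fs = glfs_new("dispersevol");
        if (!fs)
            return 1;
        glfs_set_volfile_server(fs, "tcp", "santest2", 24007);
        if (glfs_init(fs) != 0) {               /* connect to the volume */
            glfs_fini(fs);
            return 1;
        }

        static char buf[131072];                /* one 128K application buffer */
        memset(buf, 'x', sizeof(buf));

        /* file name is made up for this sketch */
        glfs_fd_t *fd = glfs_creat(fs, "/testfile.glfs", O_WRONLY, 0644);
        if (fd) {
            /* each call hands this caller-owned buffer to libgfapi, which
             * copies it internally before the data is sent to the bricks */
            for (int i = 0; i < 10000; i++)
                glfs_write(fd, buf, sizeof(buf), 0);
            glfs_close(fd);
        }
        glfs_fini(fs);
        return 0;
    }
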
>
>>> that can be allocated at a given point in time? I've encountered cases
>>> where, if more than 8 glfs_t* instances are allocated, glfs_init
>>> fails.
> Currently there is no way to limit the number of glfs_t instances for a process,
> but it is quite easy to implement this in the application itself. What application
> are you using libgfapi for?
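
One possible shape for such an application-side cap, sketched with a POSIX
counting semaphore around glfs_new()/glfs_init() and glfs_fini(); the limit
of 8 and the helper names are illustrative, not part of gfapi:

    #include <glusterfs/api/glfs.h>
    #include <semaphore.h>
    #include <stddef.h>

    #define MAX_GLFS_INSTANCES 8      /* illustrative application-chosen cap */

    static sem_t glfs_slots;

    void instances_setup(void)
    {
        sem_init(&glfs_slots, 0, MAX_GLFS_INSTANCES);
    }

    /* Blocks until a slot is free, then creates and initializes an instance. */
    glfs_t *instance_get(const char *volname, const char *host, int port)
    {
        sem_wait(&glfs_slots);
        glfs_t *fs = glfs_new(volname);
        if (fs) {
            glfs_set_volfile_server(fs, "tcp", host, port);
            if (glfs_init(fs) != 0) {
                glfs_fini(fs);
                fs = NULL;
            }
        }
        if (!fs)
            sem_post(&glfs_slots);    /* give the slot back on failure */
        return fs;
    }

    /* Tears down an instance and frees its slot. */
    void instance_put(glfs_t *fs)
    {
        glfs_fini(fs);
        sem_post(&glfs_slots);
    }
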
Do you think it is possible for you to share the C program that is
leading to this problem? It would be easier to find the problem that way.
Pranith
>
>> Including maintainers of gfapi.
>>
>> Pranith
>>
>>>
>>>
>>> Thanks and Regards,
>>> Ram
>>>
>>> -----Original Message-----
>>> From: Pranith Kumar Karampuri [mailto:pkarampu at redhat.com]
>>> Sent: Monday, December 14, 2015 11:13 PM
>>> To: Ankireddypalle Reddy; Vijay Bellur; gluster-users at gluster.org
>>> Subject: Re: [Gluster-users] libgfapi access
>>>
>>>
>>>
>>> On 12/11/2015 08:58 PM, Ankireddypalle Reddy wrote:
>>>> Pranith,
>>>> Thanks for checking this. Though the time taken to
>>>> run was 18 seconds in both cases, if you look at the
>>>> time consumed in user space as well as kernel space
>>>> while executing the command, it is evident that fuse
>>>> took almost half the time of libgfapi. Also, from
>>>> the collected profiles it is evident that the average
>>>> latency for the write command is lower for fuse than
>>>> for libgfapi. Are there any recommendations for I/O
>>>> through libgfapi for disperse volumes? Is there any
>>>> way to avoid the extra memcpy's that are made when
>>>> performing I/O through libgfapi?
>>> hi Ankireddy,
>>> This is not a problem. If we use fuse, the system call
>>> 'write' from ./GlusterFuseTest goes through the fuse
>>> kernel module, which sends the write operation to the
>>> glusterfs mount process, a user-space process. The time
>>> taken to complete that call from then on is accounted
>>> against the glusterfs mount process until it responds to
>>> the fuse kernel module, not against the ./GlusterFuseTest
>>> process. If we use gfapi, there is no system call
>>> overhead; instead the ./GlusterFuseTest process
>>> communicates directly with the bricks through the gfapi
>>> library. So all the time the process spends talking to
>>> the bricks and receiving responses is counted against
>>> ./GlusterFuseTest. That is why you see more 'user' time.
>>>
>>> So again, there are quite a few workloads where gfapi has proven to give
>>> better response times than fuse mounts, because we avoid the context-switch
>>> costs of ./GlusterFuseTest -> fuse-kernel -> glusterfs-mount ->
>>> fuse-kernel (for response) -> ./GlusterFuseTest (for response to 'write').
>>>
>>> Hope that helps. Sorry for the delay in responding; I was in too many
>>> meetings yesterday.
>>>
>>> Pranith
>>>> Thanks and Regards,
>>>> Ram
>>>>
>>>> -----Original Message-----
>>>> From: Pranith Kumar Karampuri [mailto:pkarampu at redhat.com]
>>>> Sent: Thursday, December 10, 2015 10:57 PM
>>>> To: Ankireddypalle Reddy; Vijay Bellur; gluster-users at gluster.org
>>>> Subject: Re: [Gluster-users] libgfapi access
>>>>
>>>>
>>>>
>>>> On 12/10/2015 07:15 PM, Ankireddypalle Reddy wrote:
>>>>> Hi,
>>>>> Please let me know in case you need any more details. Even for
>>>>> write-only operations fuse seems to outperform libgfapi. Is it
>>>>> because of disperse volumes? Also, I noticed a lot of data loss
>>>>> when I use libgfapi async I/O for disperse volumes.
>>>> Fuse and gfapi seem to take the same amount of time to complete the run, i.e.
>>>> 18 seconds. Could you let me know what you mean by fuse outperforming
>>>> gfapi?
>>>>
>>>> Pranith
>>>>> Thanks and Regards,
>>>>> Ram
>>>>>
>>>>> -----Original Message-----
>>>>> From: Ankireddypalle Reddy
>>>>> Sent: Wednesday, December 09, 2015 5:01 PM
>>>>> To: 'Pranith Kumar Karampuri'; Vijay Bellur;
>>>>> gluster-users at gluster.org
>>>>> Subject: RE: [Gluster-users] libgfapi access
>>>>>
>>>>> Hi,
>>>>> I upgraded my setup to gluster 3.7.3. I tested writes by
>>>>> performing them through fuse and through libgfapi.
>>>>> Attached are the profiles generated from fuse and libgfapi.
>>>>> Each test program essentially writes 10000 blocks of
>>>>> 128K.
>>>>>
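
The GlusterFuseTest source itself is not included in the thread; a
hypothetical equivalent of the loop it describes (10000 writes of 128K each
through the FUSE mount, file name made up here) would be roughly:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        static char buf[131072];                  /* 128K block, as in the test */
        memset(buf, 'x', sizeof(buf));

        /* path under the FUSE mount used in this run; file name is illustrative */
        int fd = open("/ws/glus/testfile.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        for (int i = 0; i < 10000; i++) {         /* 10000 blocks */
            if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
                perror("write");
                close(fd);
                return 1;
            }
        }
        close(fd);
        return 0;
    }
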
>>>>> [root at santest2 Base]# time ./GlusterFuseTest /ws/glus 131072 10000
>>>>> Mount path: /ws/glus
>>>>> Block size: 131072
>>>>> Num of blocks: 10000
>>>>> Will perform write test on mount path : /ws/glus
>>>>> Succesfully created file /ws/glus/1449697583.glfs
>>>>> Successfully filled file /ws/glus/1449697583.glfs
>>>>> Write test succeeded
>>>>> Write test succeeded.
>>>>>
>>>>> real 0m18.722s
>>>>> user 0m3.913s
>>>>> sys 0m1.126s
>>>>>
>>>>> [root at santest2 Base]# time ./GlusterLibGFApiTest dispersevol santest2 24007 131072 10000
>>>>> Host name: santest2
>>>>> Volume: dispersevol
>>>>> Port: 24007
>>>>> Block size: 131072
>>>>> Num of blocks: 10000
>>>>> Will perform write test on volume: dispersevol
>>>>> Successfully filled file 1449697651.glfs
>>>>> Write test succeeded
>>>>> Write test succeeded.
>>>>>
>>>>> real 0m18.630s
>>>>> user 0m8.804s
>>>>> sys 0m1.870s
>>>>>
>>>>> Thanks and Regards,
>>>>> Ram
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Pranith Kumar Karampuri [mailto:pkarampu at redhat.com]
>>>>> Sent: Wednesday, December 09, 2015 1:39 AM
>>>>> To: Ankireddypalle Reddy; Vijay Bellur; gluster-users at gluster.org
>>>>> Subject: Re: [Gluster-users] libgfapi access
>>>>>
>>>>>
>>>>>
>>>>> On 12/08/2015 08:28 PM, Ankireddypalle Reddy wrote:
>>>>>> Vijay,
>>>>>> We are trying to write data backed up by Commvault
>>>>>> Simpana to a glusterfs volume. The data being written
>>>>>> is around 30 GB. Two kinds of write requests happen:
>>>>>> 1) 1MB requests
>>>>>> 2) Small write requests of 128 bytes. In the libgfapi case
>>>>>> these are cached and a single 128KB write request is made, whereas in
>>>>>> the FUSE case each 128 byte write request is handed to FUSE directly.
>>>>>>
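
A hedged sketch of that application-side coalescing, assuming a plain 128KB
staging buffer that is flushed with glfs_write() once it cannot absorb the
next record; the struct and helper names are made up for illustration:

    #include <glusterfs/api/glfs.h>
    #include <string.h>

    #define STAGE_SIZE 131072                 /* 128KB flush unit described above */

    struct stage {
        char   buf[STAGE_SIZE];
        size_t used;
    };

    /* Append one small record (e.g. 128 bytes, len <= STAGE_SIZE);
     * flush the staged data as one large write when the buffer is full. */
    int stage_write(struct stage *s, glfs_fd_t *fd, const void *rec, size_t len)
    {
        if (s->used + len > STAGE_SIZE) {
            if (glfs_write(fd, s->buf, s->used, 0) < 0)
                return -1;
            s->used = 0;
        }
        memcpy(s->buf + s->used, rec, len);
        s->used += len;
        return 0;
    }

    /* Push out whatever is left at the end of the stream. */
    int stage_flush(struct stage *s, glfs_fd_t *fd)
    {
        if (s->used && glfs_write(fd, s->buf, s->used, 0) < 0)
            return -1;
        s->used = 0;
        return 0;
    }
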
>>>>>> glusterfs 3.6.5 built on Aug 24 2015 10:02:43
>>>>>>
>>>>>> Volume Name: dispersevol
>>>>>> Type: Disperse
>>>>>> Volume ID: c5d6ccf8-6fec-4912-ab2e-6a7701e4c4c0
>>>>>> Status: Started
>>>>>> Number of Bricks: 1 x (2 + 1) = 3
>>>>>> Transport-type: tcp
>>>>>> Bricks:
>>>>>> Brick1: ssdtest:/mnt/ssdfs1/brick3
>>>>>> Brick2: sanserver2:/data/brick3
>>>>>> Brick3: santest2:/home/brick3
>>>>>> Options Reconfigured:
>>>>>> performance.cache-size: 512MB
>>>>>> performance.write-behind-window-size: 8MB
>>>>>> performance.io-thread-count: 32
>>>>>> performance.flush-behind: on
>>>>> hi,
>>>>> Things look okay. Maybe we can find something using the profile
>>>>> info.
>>>>>
>>>>> Could you post the results of the following operations:
>>>>> 1) gluster volume profile <volname> start
>>>>> 2) Run the fuse workload
>>>>> 3) gluster volume profile <volname> info > /path/to/file-1/to/send/us
>>>>> 4) Run the libgfapi workload
>>>>> 5) gluster volume profile <volname> info > /path/to/file-2/to/send/us
>>>>>
>>>>> Send both of these files to us so we can check what extra fops, if any,
>>>>> are sent over the network that may be causing the delay.
>>>>>
>>>>> I see that you are using a disperse volume. If you are going to use
>>>>> disperse volumes for production use cases, I suggest you use 3.7.x,
>>>>> preferably 3.7.3. We fixed a bug present in releases 3.7.4 through
>>>>> 3.7.6; the fix will be released in 3.7.7.
>>>>>
>>>>> Pranith
>>>>>> Thanks and Regards,
>>>>>> Ram
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Vijay Bellur [mailto:vbellur at redhat.com]
>>>>>> Sent: Monday, December 07, 2015 6:13 PM
>>>>>> To: Ankireddypalle Reddy; gluster-users at gluster.org
>>>>>> Subject: Re: [Gluster-users] libgfapi access
>>>>>>
>>>>>> On 12/07/2015 10:29 AM, Ankireddypalle Reddy wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am trying to use the libgfapi interface to access a
>>>>>>> gluster volume. What I noticed is that reads/writes to the gluster
>>>>>>> volume through the libgfapi interface are slower than through FUSE. I was
>>>>>>> expecting the contrary. Are there any recommendations/settings
>>>>>>> suggested for using the libgfapi interface?
>>>>>>>
>>>>>> Can you please provide more details about your tests? Providing
>>>>>> information like I/O block size, file size, and throughput would be
>>>>>> helpful.
>>>>>>
>>>>>> Thanks,
>>>>>> Vijay
>>>>>>