[Gluster-devel] missing files
Pranith Kumar Karampuri
pkarampu at redhat.com
Thu Feb 12 11:32:20 UTC 2015
On 02/12/2015 04:52 PM, Pranith Kumar Karampuri wrote:
>
> On 02/12/2015 03:05 PM, Pranith Kumar Karampuri wrote:
>>
>> On 02/12/2015 09:14 AM, Justin Clift wrote:
>>> On 12 Feb 2015, at 03:02, Shyam <srangana at redhat.com> wrote:
>>>> On 02/11/2015 08:28 AM, David F. Robinson wrote:
>>>>> My base filesystem has 40 TB and the tar takes 19 minutes. After I
>>>>> copied over 10 TB, the tar extraction time went from 1 minute to
>>>>> 7 minutes.
>>>>>
>>>>> My suspicion is that it is related to the number of files and not
>>>>> necessarily the file size. Shyam is looking into reproducing this
>>>>> behavior on a Red Hat system.
>>>> I am able to reproduce the issue on a similar setup internally (at
>>>> least on the surface it seems similar to what David is facing).
>>>>
>>>> I will continue the investigation for the root cause.
>> Here is the initial analysis from my investigation. (Thanks for
>> providing me with the setup, Shyam; please keep the setup, as we may
>> need it for further analysis.)
>>
>> On bad volume:
>> %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
>> ---------   -----------   -----------   -----------   ------------   ----
>> 0.00 0.00 us 0.00 us 0.00 us 937104 FORGET
>> 0.00 0.00 us 0.00 us 0.00 us 872478 RELEASE
>> 0.00 0.00 us 0.00 us 0.00 us 23668 RELEASEDIR
>> 0.00 41.86 us 23.00 us 86.00 us 92 STAT
>> 0.01 39.40 us 24.00 us 104.00 us 218 STATFS
>> 0.28 55.99 us 43.00 us 1152.00 us 4065 SETXATTR
>> 0.58 56.89 us 25.00 us 4505.00 us 8236 OPENDIR
>> 0.73 26.80 us 11.00 us 257.00 us 22238 FLUSH
>> 0.77 152.83 us 92.00 us 8819.00 us 4065 RMDIR
>> 2.57 62.00 us 21.00 us 409.00 us 33643 WRITE
>> 5.46 199.16 us 108.00 us 469938.00 us 22238 UNLINK
>> 6.70 69.83 us 43.00 us 7777.00 us 77809 LOOKUP
>> 6.97 447.60 us 21.00 us 54875.00 us 12631 READDIRP
>> 7.73 79.42 us 33.00 us 1535.00 us 78909 SETATTR
>> 14.11 2815.00 us 176.00 us 2106305.00 us 4065 MKDIR
>> 54.09 1972.62 us 138.00 us 1520773.00 us 22238 CREATE
>>
>> On good volume:
>> %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
>> ---------   -----------   -----------   -----------   ------------   ----
>> 0.00 0.00 us 0.00 us 0.00 us 58870 FORGET
>> 0.00 0.00 us 0.00 us 0.00 us 66016 RELEASE
>> 0.00 0.00 us 0.00 us 0.00 us 16480 RELEASEDIR
>> 0.00 61.50 us 58.00 us 65.00 us 2 OPEN
>> 0.01 39.56 us 16.00 us 112.00 us 71 STAT
>> 0.02 41.29 us 27.00 us 79.00 us 163 STATFS
>> 0.03 36.06 us 17.00 us 98.00 us 301 FSTAT
>> 0.79 62.38 us 39.00 us 269.00 us 4065 SETXATTR
>> 1.14 242.99 us 25.00 us 28636.00 us 1497 READ
>> 1.54 59.76 us 25.00 us 6325.00 us 8236 OPENDIR
>> 1.70 133.75 us 89.00 us 374.00 us 4065 RMDIR
>> 2.25 32.65 us 15.00 us 265.00 us 22006 FLUSH
>> 3.37 265.05 us 172.00 us 2349.00 us 4065 MKDIR
>> 7.14 68.34 us 21.00 us 21902.00 us 33357 WRITE
>> 11.00 159.68 us 107.00 us 2567.00 us 22003 UNLINK
>> 13.82 200.54 us 133.00 us 21762.00 us 22003 CREATE
>> 17.85 448.85 us 22.00 us 54046.00 us 12697 READDIRP
>> 18.37 76.12 us 45.00 us 294.00 us 77044 LOOKUP
>> 20.95 85.54 us 35.00 us 1404.00 us 78204 SETATTR
>>
>> As we can see here, the FORGET/RELEASE counts are far higher on the
>> brick of the full volume than on the brick of the empty volume. This
>> suggests that the inode table on the volume with lots of data is
>> carrying too many passive (unused) inodes, which have to be displaced
>> before new ones can be created. I need to check whether these forgets
>> happen in the fop path. I will continue the investigation and let you
>> know.
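
To make the "too many passive inodes" point concrete, here is a minimal,
hypothetical C sketch of an LRU-capped table. This is not the real
libglusterfs inode table; the struct names, the cap and the workload sizes
are invented for illustration. The more entries a table has already
churned through, the more old entries have to be displaced, which is
roughly what the large FORGET/RELEASE counts above reflect.

/* Simplified illustration only -- NOT the GlusterFS inode table. */
#include <stdio.h>
#include <stdlib.h>

struct entry {
    struct entry *next;        /* next (older -> newer) on the LRU list */
};

struct table {
    struct entry *lru_head;    /* oldest passive entry */
    struct entry *lru_tail;    /* newest passive entry */
    long lru_count;
    long lru_limit;            /* cap on passive (unused) entries */
    long forgets;              /* entries we had to displace */
};

/* Add a passive entry; displace the oldest ones if the cap is exceeded. */
static void table_add(struct table *t)
{
    struct entry *e = calloc(1, sizeof(*e));

    if (t->lru_tail)
        t->lru_tail->next = e;
    else
        t->lru_head = e;
    t->lru_tail = e;
    t->lru_count++;

    while (t->lru_count > t->lru_limit) {
        struct entry *old = t->lru_head;
        t->lru_head = old->next;
        if (!t->lru_head)
            t->lru_tail = NULL;
        free(old);             /* analogous to sending a FORGET */
        t->lru_count--;
        t->forgets++;
    }
}

int main(void)
{
    struct table full  = { .lru_limit = 1000 };
    struct table empty = { .lru_limit = 1000 };

    /* The long-running volume has already touched a lot of inodes... */
    for (long i = 0; i < 1000000; i++)
        table_add(&full);
    /* ...the freshly created one has touched only a few. */
    for (long i = 0; i < 1000; i++)
        table_add(&empty);

    printf("displacements on busy table:  %ld\n", full.forgets);
    printf("displacements on fresh table: %ld\n", empty.forgets);
    return 0;
}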
> Just to increase confidence, I performed one more test: I stopped the
> volumes and restarted them. Now the numbers on both volumes are almost
> the same:
>
> [root@gqac031 gluster-mount]# time rm -rf boost_1_57_0 ; time tar xf
> boost_1_57_0.tar.gz
>
> real 1m15.074s
> user 0m0.550s
> sys 0m4.656s
>
> real 2m46.866s
> user 0m5.347s
> sys 0m16.047s
>
> [root@gqac031 gluster-mount]# cd /gluster-emptyvol/
> [root@gqac031 gluster-emptyvol]# ls
> boost_1_57_0.tar.gz
> [root@gqac031 gluster-emptyvol]# time tar xf boost_1_57_0.tar.gz
>
> real 2m31.467s
> user 0m5.475s
> sys 0m15.471s
>
> gqas015.sbu.lab.eng.bos.redhat.com:testvol on /gluster-mount type
> fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
> gqas015.sbu.lab.eng.bos.redhat.com:emotyvol on /gluster-emptyvol type
> fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
I just checked: inode_link() links the inode and then calls
inode_table_prune(), which triggers these inode forgets as a synchronous
operation in the fop path.
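
As a rough illustration of why a synchronous prune shows up in fop
latency (compare the CREATE/MKDIR max latencies on the two bricks above),
here is a small, hypothetical C timing sketch. The structures, the
backlog sizes and the "prune everything at once" behaviour are invented
for illustration and are not the libglusterfs implementation.

/* Hypothetical timing sketch -- not libglusterfs code. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

struct node { struct node *next; };

struct table {
    struct node *lru;          /* passive entries waiting to be forgotten */
};

/* Stand-in for a prune step: drops the whole backlog inline. */
static void table_prune(struct table *t)
{
    while (t->lru) {
        struct node *n = t->lru;
        t->lru = n->next;
        free(n);               /* the per-entry "forget" work */
    }
}

/* Stand-in for the link step a CREATE goes through: because the prune
 * runs synchronously here, its whole cost lands inside this call. */
static void table_link(struct table *t)
{
    table_prune(t);
}

static void add_backlog(struct table *t, long count)
{
    for (long i = 0; i < count; i++) {
        struct node *n = malloc(sizeof(*n));
        n->next = t->lru;
        t->lru = n;
    }
}

int main(void)
{
    struct table busy  = { 0 };
    struct table fresh = { 0 };

    add_backlog(&busy, 1000000);   /* table on the long-running volume */
    add_backlog(&fresh, 1000);     /* table on the freshly started volume */

    double t0 = now_us();
    table_link(&busy);
    double t1 = now_us();
    table_link(&fresh);
    double t2 = now_us();

    printf("link with large backlog: %.0f us\n", t1 - t0);
    printf("link with small backlog: %.0f us\n", t2 - t1);
    return 0;
}

This also lines up with the restart observation above: once the tables
start out empty again, there is no backlog to prune and both volumes
behave almost the same.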
Pranith
>
> Pranith
>>
>> Pranith
>>> Thanks Shyam. :)
>>>
>>> + Justin
>>>
>>> --
>>> GlusterFS - http://www.gluster.org
>>>
>>> An open source, distributed file system scaling to several
>>> petabytes, and handling thousands of clients.
>>>
>>> My personal twitter: twitter.com/realjustinclift