[Gluster-devel] missing files

Pranith Kumar Karampuri pkarampu at redhat.com
Thu Feb 12 11:32:20 UTC 2015


On 02/12/2015 04:52 PM, Pranith Kumar Karampuri wrote:
>
> On 02/12/2015 03:05 PM, Pranith Kumar Karampuri wrote:
>>
>> On 02/12/2015 09:14 AM, Justin Clift wrote:
>>> On 12 Feb 2015, at 03:02, Shyam <srangana at redhat.com> wrote:
>>>> On 02/11/2015 08:28 AM, David F. Robinson wrote:
>>>>> My base filesystem has 40 TB and the tar extraction takes 19
>>>>> minutes. Copying over 10 TB increased the tar extraction time from
>>>>> 1 minute to 7 minutes.
>>>>>
>>>>> My suspicion is that this is related to the number of files and not
>>>>> necessarily file size. Shyam is looking into reproducing this
>>>>> behavior on a Red Hat system.
>>>> I am able to reproduce the issue on a similar setup internally (at
>>>> least on the surface it seems similar to what David is facing).
>>>>
>>>> I will continue the investigation to find the root cause.
>> Here is the initial analysis from my investigation. (Thanks for
>> providing me with the setup, Shyam; please keep it, as we may need it
>> for further analysis.)
>>
>> On bad volume:
>>  %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
>>  ---------   -----------   -----------   -----------   ------------        ----
>>       0.00       0.00 us       0.00 us       0.00 us         937104      FORGET
>>       0.00       0.00 us       0.00 us       0.00 us         872478     RELEASE
>>       0.00       0.00 us       0.00 us       0.00 us          23668  RELEASEDIR
>>       0.00      41.86 us      23.00 us      86.00 us             92        STAT
>>       0.01      39.40 us      24.00 us     104.00 us            218      STATFS
>>       0.28      55.99 us      43.00 us    1152.00 us           4065    SETXATTR
>>       0.58      56.89 us      25.00 us    4505.00 us           8236     OPENDIR
>>       0.73      26.80 us      11.00 us     257.00 us          22238       FLUSH
>>       0.77     152.83 us      92.00 us    8819.00 us           4065       RMDIR
>>       2.57      62.00 us      21.00 us     409.00 us          33643       WRITE
>>       5.46     199.16 us     108.00 us  469938.00 us          22238      UNLINK
>>       6.70      69.83 us      43.00 us    7777.00 us          77809      LOOKUP
>>       6.97     447.60 us      21.00 us   54875.00 us          12631    READDIRP
>>       7.73      79.42 us      33.00 us    1535.00 us          78909     SETATTR
>>      14.11    2815.00 us     176.00 us 2106305.00 us           4065       MKDIR
>>      54.09    1972.62 us     138.00 us 1520773.00 us          22238      CREATE
>>
>> On good volume:
>>  %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
>>  ---------   -----------   -----------   -----------   ------------        ----
>>       0.00       0.00 us       0.00 us       0.00 us          58870      FORGET
>>       0.00       0.00 us       0.00 us       0.00 us          66016     RELEASE
>>       0.00       0.00 us       0.00 us       0.00 us          16480  RELEASEDIR
>>       0.00      61.50 us      58.00 us      65.00 us              2        OPEN
>>       0.01      39.56 us      16.00 us     112.00 us             71        STAT
>>       0.02      41.29 us      27.00 us      79.00 us            163      STATFS
>>       0.03      36.06 us      17.00 us      98.00 us            301       FSTAT
>>       0.79      62.38 us      39.00 us     269.00 us           4065    SETXATTR
>>       1.14     242.99 us      25.00 us   28636.00 us           1497        READ
>>       1.54      59.76 us      25.00 us    6325.00 us           8236     OPENDIR
>>       1.70     133.75 us      89.00 us     374.00 us           4065       RMDIR
>>       2.25      32.65 us      15.00 us     265.00 us          22006       FLUSH
>>       3.37     265.05 us     172.00 us    2349.00 us           4065       MKDIR
>>       7.14      68.34 us      21.00 us   21902.00 us          33357       WRITE
>>      11.00     159.68 us     107.00 us    2567.00 us          22003      UNLINK
>>      13.82     200.54 us     133.00 us   21762.00 us          22003      CREATE
>>      17.85     448.85 us      22.00 us   54046.00 us          12697    READDIRP
>>      18.37      76.12 us      45.00 us     294.00 us          77044      LOOKUP
>>      20.95      85.54 us      35.00 us    1404.00 us          78204     SETATTR
>>
>> As we can see here, FORGET/RELEASE counts are far higher on the brick
>> of the full volume compared to the brick of the empty volume. This
>> seems to suggest that the inode table on the volume with lots of data
>> is carrying too many passive inodes, which need to be displaced before
>> new inodes can be created. I need to check whether these forgets happen
>> in the fop path. I will continue my investigation and let you know.
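>>
>> To make that concrete, here is a rough standalone model of an
>> lru-bounded inode table (a simplified sketch only, with made-up names,
>> not the actual libglusterfs code). Once the table already holds
>> lru_limit passive inodes, every new link first has to evict and FORGET
>> an old one:
>>
>> #include <stdio.h>
>> #include <stdlib.h>
>>
>> /* A passive (unreferenced) inode sitting on the lru list. */
>> struct model_inode {
>>     int                 ino;
>>     struct model_inode *next;
>> };
>>
>> /* Inode table with a bounded lru list: oldest at head, newest at tail. */
>> struct model_table {
>>     struct model_inode *lru_head;
>>     struct model_inode *lru_tail;
>>     size_t              lru_size;
>>     size_t              lru_limit;
>> };
>>
>> /* Stands in for sending a FORGET for a displaced passive inode. */
>> static void model_forget(struct model_inode *inode)
>> {
>>     printf("FORGET inode %d\n", inode->ino);
>>     free(inode);
>> }
>>
>> /* Evict the oldest passive inodes until we are back under lru_limit. */
>> static void model_table_prune(struct model_table *t)
>> {
>>     while (t->lru_size > t->lru_limit && t->lru_head) {
>>         struct model_inode *victim = t->lru_head;
>>         t->lru_head = victim->next;
>>         if (!t->lru_head)
>>             t->lru_tail = NULL;
>>         t->lru_size--;
>>         model_forget(victim);   /* runs synchronously, in the caller */
>>     }
>> }
>>
>> /* Linking a freshly looked-up/created inode may prune immediately,
>>  * so the FORGET work is paid for by the fop that did the link. */
>> static void model_inode_link(struct model_table *t, int ino)
>> {
>>     struct model_inode *inode = calloc(1, sizeof(*inode));
>>     inode->ino = ino;
>>     if (t->lru_tail)
>>         t->lru_tail->next = inode;
>>     else
>>         t->lru_head = inode;
>>     t->lru_tail = inode;
>>     t->lru_size++;
>>     model_table_prune(t);
>> }
>>
>> int main(void)
>> {
>>     struct model_table t = { .lru_limit = 3 };
>>     for (int i = 0; i < 6; i++)
>>         model_inode_link(&t, i);   /* forgets inodes 0, 1 and 2 */
>>     return 0;
>> }
>>
>> In this model a table that is already at its limit pays an eviction on
>> almost every new CREATE/MKDIR, while a table far below the limit never
>> does.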
> Just to increase confidence, I performed one more test: I stopped the
> volumes and restarted them. Now the numbers on both volumes are almost
> the same:
>
> [root at gqac031 gluster-mount]# time rm -rf boost_1_57_0 ; time tar xf boost_1_57_0.tar.gz
>
> real    1m15.074s
> user    0m0.550s
> sys     0m4.656s
>
> real    2m46.866s
> user    0m5.347s
> sys     0m16.047s
>
> [root at gqac031 gluster-mount]# cd /gluster-emptyvol/
> [root at gqac031 gluster-emptyvol]# ls
> boost_1_57_0.tar.gz
> [root at gqac031 gluster-emptyvol]# time tar xf boost_1_57_0.tar.gz
>
> real    2m31.467s
> user    0m5.475s
> sys     0m15.471s
>
> gqas015.sbu.lab.eng.bos.redhat.com:testvol on /gluster-mount type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
> gqas015.sbu.lab.eng.bos.redhat.com:emotyvol on /gluster-emptyvol type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
I just checked: inode_link() links the inode and calls
inode_table_prune(), which triggers these inode_forgets as a synchronous
operation in the fop path.
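
To see where that cost lands, here is a rough sketch (illustrative
pretend_* names only, not the actual glusterfs call chain): the fop
callback cannot unwind its reply until the link, and therefore the
synchronous prune inside it, has returned, so the prune time is charged
directly to that fop's latency:

#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Stand-in for evicting passive inodes and issuing their FORGETs. */
static void pretend_table_prune(void)
{
    usleep(2000);                        /* pretend it costs ~2 ms */
}

/* Stand-in for linking the new inode; prunes on the same thread. */
static void pretend_inode_link(void)
{
    pretend_table_prune();
}

/* Stand-in for a CREATE callback: the reply is unwound only after the
 * link (and the prune inside it) has finished. */
static void pretend_create_cbk(void)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    pretend_inode_link();
    clock_gettime(CLOCK_MONOTONIC, &t1);

    long us = (t1.tv_sec - t0.tv_sec) * 1000000L +
              (t1.tv_nsec - t0.tv_nsec) / 1000L;
    printf("link+prune took %ld us of this CREATE\n", us);
}

int main(void)
{
    pretend_create_cbk();
    return 0;
}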

Pranith
>
> Pranith
>>
>> Pranith
>>> Thanks Shyam. :)
>>>
>>> + Justin
>>>
>>> -- 
>>> GlusterFS - http://www.gluster.org
>>>
>>> An open source, distributed file system scaling to several
>>> petabytes, and handling thousands of clients.
>>>
>>> My personal twitter: twitter.com/realjustinclift
>>>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel


