[Gluster-users] Glusterfs3.2.5 scaling

alfred de sollize desollize at gmail.com
Tue May 22 01:18:39 UTC 2012


On Tue, May 22, 2012 at 3:35 AM, alfred de sollize <desollize at gmail.com> wrote:

> Hi All,
>            We tried enabling client io-threads as Amar suggested, but I
> still get the following errors in the log files:
>
>
> =================================================================================
> [2012-05-21 17:54:56.491110] D [posix.c:316:posix_lstat_with_gfid]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/features/access-control.so(posix_acl_readdirp+0x208)
> [0x7f16165edda8]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_readdirp+0xf)
> [0x7f16168025df]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_do_readdir+0x655)
> [0x7f1616802505]))) 0-home-export-posix: failed to get gfid
> [2012-05-21 17:56:30.328936] D [posix.c:316:posix_lstat_with_gfid]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/features/access-control.so(posix_acl_readdirp+0x208)
> [0x7f16165edda8]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_readdirp+0xf)
> [0x7f16168025df]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_do_readdir+0x655)
> [0x7f1616802505]))) 0-home-export-posix: failed to get gfid
> [2012-05-21 18:02:16.150848] D [server-resolve.c:256:resolve_path_deep]
> 0-/home9: RESOLVE RENAME() seeking deep resolution of
> /tia/mom4p1/work/noriver/ascii/19610401.fms.out
> [2012-05-21 18:02:16.151193] D [server-resolve.c:201:resolve_deep_cbk]
> 0-home-export-server: /tia/mom4p1/work/noriver/ascii/19610401.fms.out:
> failed to resolve (No such file or directory)
> [2012-05-21 18:02:16.151231] D
> [server-resolve.c:170:resolve_deep_continue] 0-home-export-server: return
> value of resolve_*_simple 1
> [2012-05-21 18:04:16.468] D [io-threads.c:118:iot_worker]
> 0-home-export-io-threads: timeout, terminated. conf->curr_count=4
> [2012-05-21 18:04:16.533] D [io-threads.c:118:iot_worker]
> 0-home-export-io-threads: timeout, terminated. conf->curr_count=3
> [2012-05-21 18:04:16.602] D [io-threads.c:118:iot_worker]
> 0-home-export-io-threads: timeout, terminated. conf->curr_count=2
> [2012-05-21 18:04:31.348] D [io-threads.c:118:iot_worker]
> 0-home-export-io-threads: timeout, terminated. conf->curr_count=1
> [2012-05-21 18:22:12.295387] D [posix.c:316:posix_lstat_with_gfid]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/features/access-control.so(posix_acl_readdirp+0x208)
> [0x7f16165edda8]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_readdirp+0xf)
> [0x7f16168025df]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_do_readdir+0x655)
> [0x7f1616802505]))) 0-home-export-posix: failed to get gfid
> [2012-05-21 18:24:03.353140] D [posix.c:316:posix_lstat_with_gfid]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/features/access-control.so(posix_acl_readdirp+0x208)
> [0x7f16165edda8]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_readdirp+0xf)
> [0x7f16168025df]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_do_readdir+0x655)
> [0x7f1616802505]))) 0-home-export-posix: failed to get gfid
> [2012-05-21 18:24:05.764889] D [posix.c:316:posix_lstat_with_gfid]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/features/access-control.so(posix_acl_readdirp+0x208)
> [0x7f16165edda8]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_readdirp+0xf)
> [0x7f16168025df]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_do_readdir+0x655)
> [0x7f1616802505]))) 0-home-export-posix: failed to get gfid
> [2012-05-21 18:26:27.378861] D [posix.c:316:posix_lstat_with_gfid]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/features/access-control.so(posix_acl_readdirp+0x208)
> [0x7f16165edda8]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_readdirp+0xf)
> [0x7f16168025df]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_do_readdir+0x655)
> [0x7f1616802505]))) 0-home-export-posix: failed to get gfid
> [2012-05-21 19:03:37.89056] D [posix.c:769:posix_do_chmod]
> 0-home-export-posix:
> /home9/tia/mom4p1/work/noriver/ascii/.19610401.fms.out.swp (Function not
> implemented)
> [2012-05-21 19:04:23.410529] D [io-threads.c:2017:__iot_workers_scale]
> 0-home-export-io-threads: scaled threads to 2 (queue_size=4/2)
> [2012-05-21 19:05:09.355671] D [io-threads.c:2017:__iot_workers_scale]
> 0-home-export-io-threads: scaled threads to 3 (queue_size=8/3)
> [2012-05-21 19:05:18.109547] D [io-threads.c:2017:__iot_workers_scale]
> 0-home-export-io-threads: scaled threads to 4 (queue_size=16/4)
> [2012-05-21 19:05:38.403356] D [io-threads.c:2017:__iot_workers_scale]
> 0-home-export-io-threads: scaled threads to 5 (queue_size=32/5)
> [2012-05-21 19:10:37.850018] D [posix.c:769:posix_do_chmod]
> 0-home-export-posix: /home9/gfskurt/kurt/CCSM/new/scripts/i128/.hf.swp
> (Function not implemented)
> [2012-05-21 19:37:29.997392] D [posix.c:769:posix_do_chmod]
> 0-home-export-posix: /home9/tia/mom4p1/work/noriver/.ibhosts.swp (Function
> not implemented)
> [2012-05-21 19:55:02.254952] D [server-resolve.c:256:resolve_path_deep]
> 0-/home9: RESOLVE OPEN() seeking deep resolution of
> /tia/mom4p1/work/noriver/diag_table
> [2012-05-21 19:55:02.387242] D [server-resolve.c:256:resolve_path_deep]
> 0-/home9: RESOLVE RENAME() seeking deep resolution of
> /tia/mom4p1/work/noriver/ascii/19610401.fms.out
> [2012-05-21 19:55:02.387592] D [server-resolve.c:201:resolve_deep_cbk]
> 0-home-export-server: /tia/mom4p1/work/noriver/ascii/19610401.fms.out:
> failed to resolve (No such file or directory)
> [2012-05-21 19:55:02.387624] D
> [server-resolve.c:170:resolve_deep_continue] 0-home-export-server: return
> value of resolve_*_simple 1
> [2012-05-21 19:55:26.576525] D [posix.c:769:posix_do_chmod]
> 0-home-export-posix:
> /home9/tia/mom4p1/work/noriver/ascii/.19610401.fms.out.swp (Function not
> implemented)
> [2012-05-21 20:00:01.264940] D [server-resolve.c:256:resolve_path_deep]
> 0-/home9: RESOLVE UNLINK() seeking deep resolution of
> /tia/mom4p1/work/noriver/history/19610401.ice_diag.nc
> [2012-05-21 20:00:01.306673] D [server-resolve.c:256:resolve_path_deep]
> 0-/home9: RESOLVE UNLINK() seeking deep resolution of
> /tia/mom4p1/work/noriver/history/19610401.ocean_diag.nc
> [2012-05-21 20:00:01.969034] D [server-resolve.c:256:resolve_path_deep]
> 0-/home9: RESOLVE UNLINK() seeking deep resolution of
> /tia/mom4p1/work/noriver/history/19610401.ocean_temp.nc
> [2012-05-21 20:01:14.414520] D [posix.c:769:posix_do_chmod]
> 0-home-export-posix:
> /home9/tia/mom4p1/work/noriver/ascii/.19610401.fms.out.swp (Function not
> implemented)
> [2012-05-21 20:08:20.756821] D [server-resolve.c:256:resolve_path_deep]
> 0-/home9: RESOLVE RENAME() seeking deep resolution of
> /tia/mom4p1/work/noriver/ascii/no_time_stamp.logfile.000000.out
> [2012-05-21 20:08:20.757176] D [server-resolve.c:201:resolve_deep_cbk]
> 0-home-export-server:
> /tia/mom4p1/work/noriver/ascii/no_time_stamp.logfile.000000.out: failed to
> resolve (No such file or directory)
> [2012-05-21 20:08:20.757214] D
> [server-resolve.c:170:resolve_deep_continue] 0-home-export-server: return
> value of resolve_*_simple 1
> [2012-05-21 20:17:50.736328] D [posix.c:769:posix_do_chmod]
> 0-home-export-posix: /home9/tia/mom4p1/work/noriver/.ibhosts.swp (Function
> not implemented)
> [2012-05-21 20:26:34.648108] D [posix.c:769:posix_do_chmod]
> 0-home-export-posix: /home9/tia/mom4p1/work/noriver/.ibhosts.swp (Function
> not implemented)
> [2012-05-21 20:33:56.293882] D [posix.c:769:posix_do_chmod]
> 0-home-export-posix:
> /home9/gfskurt/kurt/CCSM/new/scratch/i512/ocn/input/gx1v6_tavg_contents
> (Function not implemented)
> [2012-05-21 20:34:14.798261] D [posix.c:769:posix_do_chmod]
> 0-home-export-posix:
> /home9/gfskurt/kurt/CCSM/new/scripts/i512/CaseDocs/ice_in (Function not
> implemented)
> [2012-05-21 20:34:14.840494] D [posix.c:769:posix_do_chmod]
> 0-home-export-posix:
> /home9/gfskurt/kurt/CCSM/new/scripts/i512/CaseDocs/ice_in (Function not
> implemented)
> [2012-05-21 20:34:14.933004] D [posix.c:769:posix_do_chmod]
> 0-home-export-posix:
> /home9/gfskurt/kurt/CCSM/new/scripts/i512/CaseDocs/ice_in (Function not
> implemented)
> [2012-05-21 20:36:42.411] D [io-threads.c:118:iot_worker]
> 0-home-export-io-threads: timeout, terminated. conf->curr_count=4
> [2012-05-21 20:36:42.533] D [io-threads.c:118:iot_worker]
> 0-home-export-io-threads: timeout, terminated. conf->curr_count=3
> [2012-05-21 20:36:42.571] D [io-threads.c:118:iot_worker]
> 0-home-export-io-threads: timeout, terminated. conf->curr_count=2
> [2012-05-21 20:39:29.233] D [io-threads.c:118:iot_worker]
> 0-home-export-io-threads: timeout, terminated. conf->curr_count=1
> [2012-05-21 22:24:07.309485] D [posix.c:316:posix_lstat_with_gfid]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/features/access-control.so(posix_acl_readdirp+0x208)
> [0x7f16165edda8]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_readdirp+0xf)
> [0x7f16168025df]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_do_readdir+0x655)
> [0x7f1616802505]))) 0-home-export-posix: failed to get gfid
> [2012-05-21 22:24:07.466879] D [posix.c:316:posix_lstat_with_gfid]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/features/access-control.so(posix_acl_readdirp+0x208)
> [0x7f16165edda8]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_readdirp+0xf)
> [0x7f16168025df]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_do_readdir+0x655)
> [0x7f1616802505]))) 0-home-export-posix: failed to get gfid
> [2012-05-22 02:48:07.317096] D [posix.c:316:posix_lstat_with_gfid]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/features/access-control.so(posix_acl_readdirp+0x208)
> [0x7f16165edda8]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_readdirp+0xf)
> [0x7f16168025df]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_do_readdir+0x655)
> [0x7f1616802505]))) 0-home-export-posix: failed to get gfid
> [2012-05-22 02:48:09.155900] D [posix.c:316:posix_lstat_with_gfid]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/features/access-control.so(posix_acl_readdirp+0x208)
> [0x7f16165edda8]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_readdirp+0xf)
> [0x7f16168025df]
> (-->/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/storage/posix.so(posix_do_readdir+0x655)
> [0x7f1616802505]))) 0-home-export-posix: failed to get gfid
>
>
> =============================================================================================================
> after which this brick gets disconnected, and we see missing files and
> "Transport endpoint is not connected" (TENC) errors.
> Is there a limit on the amount of simultaneous I/O that glusterfs can
> handle?
> Regards,
> Al
>
>
> On Tue, May 15, 2012 at 11:24 AM, Amar Tumballi <amarts at redhat.com> wrote:
>
>> On 05/15/2012 09:02 AM, alfred de sollize wrote:
>>
>>> Has anybody worked with CCSM or WRF on Glusterfs? These spawn a huge
>>> number of threads for I/O.
>>> Which xlators should be disabled, and what other tune-ups would stop the
>>> TENC errors?
>>> Al
>>>
>>>
>>>
>> Considering you have a lot of threads doing I/O, can you try 'gluster
>> volume set <VOLNAME> client-io-threads enable' and see if it makes any
>> difference?
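>>
>> A minimal sketch of that sequence (the volume name 'home-export' is an
>> assumption based on the log prefixes earlier in this thread; on newer
>> GlusterFS releases the option key is performance.client-io-threads):
>>
>> ```shell
>> # Enable client-side io-threads (option name as given in this thread).
>> # 'home-export' is an assumed volume name -- substitute your own.
>> gluster volume set home-export client-io-threads enable
>>
>> # Confirm the option took effect; clients may need a remount to pick up
>> # the new volume graph.
>> gluster volume info home-export
>> ```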
>>
>> Regards,
>> Amar
>>
>>
>>>
>>>
>>> On Sun, May 13, 2012 at 10:54 PM, alfred de sollize <desollize at gmail.com
>>> <mailto:desollize at gmail.com>> wrote:
>>>
>>>    Hi Amar,
>>>                     The version is in the subject line - Glusterfs-3.2.5.
>>>
>>>    TENC shows up on clients for commands like [cd /home] or [df -h /home]:
>>>        /home - Transport endpoint is not connected
>>>    It also appears in the server logs. I am attaching part of the client
>>>    and server logs for one of the failing bricks.
>>>    Could this also be a problem for application integrations? We use
>>>    CentOS 6.1.
>>>    Al
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>    On Fri, May 11, 2012 at 11:32 AM, Amar Tumballi <amarts at redhat.com
>>>    <mailto:amarts at redhat.com>> wrote:
>>>
>>>        On 05/10/2012 09:39 PM, alfred de sollize wrote:
>>>
>>>            We are setting up a 180-node cluster for weather modeling: 2
>>>            storage servers with 32 GB RAM each and a QDR InfiniBand
>>>            interconnect.
>>>            When we run iozone with 1 GB per thread (128 KB block size)
>>>            from 32 clients (2 iozone threads per client), the run
>>>            succeeds; however, the run fails for 64 clients and we start
>>>            getting "Transport endpoint is not connected" errors.
>>>
>>>            There are 10 bricks (5 from each server) of ~4.2 TB each,
>>>            making a 42 TB export volume that is fuse-mounted on the
>>>            clients.
>>>
>>>            There is no other error in the log files except for "TENC".
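>>>
>>>            One plausible iozone invocation matching the parameters above
>>>            (the exact command was not given in the thread, so the flags
>>>            and the clients.txt client-list file are assumptions):
>>>
>>> ```shell
>>> # 64 total threads (32 clients x 2 threads each), 1 GB file per thread,
>>> # 128 KB record size, sequential write (-i 0) and read (-i 1) tests,
>>> # distributed across hosts via iozone's network mode (-+m takes a file
>>> # listing "hostname workdir path-to-iozone" per line).
>>> iozone -+m clients.txt -t 64 -s 1g -r 128k -i 0 -i 1
>>> ```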
>>>
>>>
>>>        When you say log file, which files are you looking at? Also,
>>>        which version of glusterfs are you running?
>>>
>>>        -Amar
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>>
>>
>>
>

