[Gluster-users] rebalance fix layout necessary

Amudhan P amudhan83 at gmail.com
Fri Apr 7 14:05:40 UTC 2017


Volume type:
Disperse Volume  8+2  = 1080 bricks

The first time, I added 3 sets of 8+2 bricks and folder listing started
having issues. I remounted the mount point and it worked fine again.

The second time, I added 13 sets of 8+2 bricks and hit the same issue.
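
To make the brick counts behind these numbers explicit, here is a minimal sketch of the arithmetic. It assumes each 8+2 disperse set uses 10 bricks (8 data plus 2 redundancy) and uses only the figures quoted in this mail; the function names are hypothetical helpers, not part of any Gluster tool.

```python
# Brick arithmetic for an 8+2 disperse layout (illustrative only).
DATA_BRICKS = 8
REDUNDANCY_BRICKS = 2
BRICKS_PER_SET = DATA_BRICKS + REDUNDANCY_BRICKS  # 10 bricks per disperse set


def bricks_for_sets(sets: int) -> int:
    """Return the number of bricks contained in `sets` disperse sets."""
    return sets * BRICKS_PER_SET


def sets_for_bricks(bricks: int) -> int:
    """Return how many complete disperse sets `bricks` bricks form."""
    if bricks % BRICKS_PER_SET != 0:
        raise ValueError("brick count must be a multiple of the set size")
    return bricks // BRICKS_PER_SET


if __name__ == "__main__":
    print(sets_for_bricks(1080))  # 108 sets in a 1080-brick volume
    print(bricks_for_sets(3))     # 30 bricks in the first expansion
    print(bricks_for_sets(13))    # 130 bricks in the second expansion
```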

When listing a folder, it returned an empty folder or did not show all the
folders.

When an ongoing write was interrupted, it threw an error saying the
destination folder was not available.

Adding a few more lines from the log. Let me know if you need the full log file.

[2017-04-05 13:40:03.702624] I [glusterfsd-mgmt.c:52:mgmt_cbk_spec] 0-mgmt:
Volume file changed
[2017-04-05 13:40:04.970055] I [MSGID: 122067]
[ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-123: Using 'sse' CPU
extensions
[2017-04-05 13:40:04.971194] I [MSGID: 122067]
[ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-122: Using 'sse' CPU
extensions
[2017-04-05 13:40:04.972144] I [MSGID: 122067]
[ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-121: Using 'sse' CPU
extensions
[2017-04-05 13:40:04.973131] I [MSGID: 122067]
[ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-120: Using 'sse' CPU
extensions
[2017-04-05 13:40:04.974072] I [MSGID: 122067]
[ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-119: Using 'sse' CPU
extensions
[2017-04-05 13:40:04.975005] I [MSGID: 122067]
[ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-118: Using 'sse' CPU
extensions
[2017-04-05 13:40:04.975936] I [MSGID: 122067]
[ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-117: Using 'sse' CPU
extensions
[2017-04-05 13:40:04.976905] I [MSGID: 122067]
[ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-116: Using 'sse' CPU
extensions
[2017-04-05 13:40:04.977825] I [MSGID: 122067]
[ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-115: Using 'sse' CPU
extensions
[2017-04-05 13:40:04.978755] I [MSGID: 122067]
[ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-114: Using 'sse' CPU
extensions
[2017-04-05 13:40:04.979689] I [MSGID: 122067]
[ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-113: Using 'sse' CPU
extensions
[2017-04-05 13:40:04.980626] I [MSGID: 122067]
[ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-112: Using 'sse' CPU
extensions
[2017-04-05 13:40:07.270412] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-736: changing port to 49153 (from 0)
[2017-04-05 13:40:07.271902] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-746: changing port to 49154 (from 0)
[2017-04-05 13:40:07.272076] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-756: changing port to 49155 (from 0)
[2017-04-05 13:40:07.273154] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-766: changing port to 49156 (from 0)
[2017-04-05 13:40:07.273193] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-776: changing port to 49157 (from 0)
[2017-04-05 13:40:07.273371] I [MSGID: 114046]
[client-handshake.c:1216:client_setvolume_cbk] 2-gfs-vol-client-579:
Connected to gfs-vol-client-579, attached to remote volume
'/media/disk22/brick22'.
[2017-04-05 13:40:07.273388] I [MSGID: 114047]
[client-handshake.c:1227:client_setvolume_cbk] 2-gfs-vol-client-579: Server
and Client lk-version numbers are not same, reopening the fds
[2017-04-05 13:40:07.273435] I [MSGID: 114035]
[client-handshake.c:202:client_set_lk_version_cbk] 2-gfs-vol-client-433:
Server lk version = 1
[2017-04-05 13:40:07.275632] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-786: changing port to 49158 (from 0)
[2017-04-05 13:40:07.275685] I [MSGID: 114046]
[client-handshake.c:1216:client_setvolume_cbk] 2-gfs-vol-client-589:
Connected to gfs-vol-client-589, attached to remote volume
'/media/disk23/brick23'.
[2017-04-05 13:40:07.275707] I [MSGID: 114047]
[client-handshake.c:1227:client_setvolume_cbk] 2-gfs-vol-client-589: Server
and Client lk-version numbers are not same, reopening the fds
[2017-04-05 13:40:07.087011] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-811: changing port to 49161 (from 0)
[2017-04-05 13:40:07.087031] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-420: changing port to 49158 (from 0)
[2017-04-05 13:40:07.087045] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-521: changing port to 49168 (from 0)
[2017-04-05 13:40:07.087060] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-430: changing port to 49159 (from 0)
[2017-04-05 13:40:07.087074] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-531: changing port to 49169 (from 0)
[2017-04-05 13:40:07.087098] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-440: changing port to 49160 (from 0)
[2017-04-05 13:40:07.087105] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-821: changing port to 49162 (from 0)
[2017-04-05 13:40:07.087117] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-450: changing port to 49161 (from 0)
[2017-04-05 13:40:07.087131] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-831: changing port to 49163 (from 0)
[2017-04-05 13:40:07.087134] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-460: changing port to 49162 (from 0)
[2017-04-05 13:40:07.087157] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-841: changing port to 49164 (from 0)
[2017-04-05 13:40:07.087181] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-541: changing port to 49170 (from 0)
[2017-04-05 13:40:07.087185] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-470: changing port to 49163 (from 0)
[2017-04-05 13:40:07.087202] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-851: changing port to 49165 (from 0)
[2017-04-05 13:40:07.087241] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-480: changing port to 49164 (from 0)
[2017-04-05 13:40:07.087240] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-551: changing port to 49171 (from 0)
[2017-04-05 13:40:07.087263] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-861: changing port to 49166 (from 0)
[2017-04-05 13:40:07.087281] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-571: changing port to 49173 (from 0)
[2017-04-05 13:40:07.087284] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-561: changing port to 49172 (from 0)
[2017-04-05 13:40:07.087318] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-581: changing port to 49174 (from 0)
[2017-04-05 13:40:07.087318] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-490: changing port to 49165 (from 0)
[2017-04-05 13:40:07.087344] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-500: changing port to 49166 (from 0)
[2017-04-05 13:40:07.087352] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-871: changing port to 49167 (from 0)
[2017-04-05 13:40:07.087372] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
2-gfs-vol-client-591: changing port to 49175 (from 0)

[2017-04-05 13:40:07.681293] I [MSGID: 114046]
[client-handshake.c:1216:client_setvolume_cbk] 2-gfs-vol-client-755:
Connected to gfs-vol-client-755, attached to remote volume
'/media/disk4/brick4'.
[2017-04-05 13:40:07.681312] I [MSGID: 114047]
[client-handshake.c:1227:client_setvolume_cbk] 2-gfs-vol-client-755: Server
and Client lk-version numbers are not same, reopening the fds
[2017-04-05 13:40:07.681317] I [MSGID: 122061] [ec.c:340:ec_up]
2-gfs-vol-disperse-74: Going UP
[2017-04-05 13:40:07.681428] I [MSGID: 122061] [ec.c:340:ec_up]
2-gfs-vol-disperse-75: Going UP
[2017-04-05 13:40:07.681454] I [MSGID: 114046]
[client-handshake.c:1216:client_setvolume_cbk] 2-gfs-vol-client-1049:
Connected to gfs-vol-client-1049, attached to remote volume
'/media/disk33/brick33'.
[2017-04-05 13:45:10.689344] I [MSGID: 114018]
[client.c:2276:client_rpc_notify] 0-gfs-vol-client-71: disconnected from
gfs-vol-client-71. Client process will keep trying to connect to glusterd
until brick's port is available
[2017-04-05 13:45:10.689376] I [MSGID: 114021] [client.c:2361:notify]
0-gfs-vol-client-73: current graph is no longer active, destroying
rpc_client
[2017-04-05 13:45:10.689380] I [MSGID: 114018]
[client.c:2276:client_rpc_notify] 0-gfs-vol-client-72: disconnected from
gfs-vol-client-72. Client process will keep trying to connect to glusterd
until brick's port is available
[2017-04-05 13:45:10.689389] I [MSGID: 114021] [client.c:2361:notify]
0-gfs-vol-client-74: current graph is no longer active, destroying
rpc_client
[2017-04-05 13:45:10.689394] I [MSGID: 114018]
[client.c:2276:client_rpc_notify] 0-gfs-vol-client-73: disconnected from
gfs-vol-client-73. Client process will keep trying to connect to glusterd
until brick's port is available
[2017-04-05 13:45:10.689390] I [MSGID: 122062] [ec.c:354:ec_down]
0-gfs-vol-disperse-7: Going DOWN
[2017-04-05 13:45:10.689428] I [MSGID: 114021] [client.c:2361:notify]
0-gfs-vol-client-75: current graph is no longer active, destroying
rpc_client
[2017-04-05 13:45:10.689443] I [MSGID: 114018]
[client.c:2276:client_rpc_notify] 0-gfs-vol-client-74: disconnected from
gfs-vol-client-74. Client process will keep trying to connect to glusterd
until brick's port is available
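
In case it helps when going through the full log file, below is a rough sketch of how the excerpt above could be summarized per graph. It assumes the number in front of the translator name (the "0-" or "2-" in 0-gfs-vol-client-71 and 2-gfs-vol-client-736) identifies the client graph, which is what the "current graph is no longer active" messages suggest; the helper names are made up for illustration, and the volume name gfs-vol is hard-coded from this mail.

```python
import re
from collections import Counter

# A record starts with a bracketed timestamp, e.g. "[2017-04-05 13:40:03.702624]".
RECORD_START = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\]")
# Translator name as it appears in this mail, e.g. "2-gfs-vol-client-736".
# Assumption: the leading number is the client graph id.
XLATOR = re.compile(r"\b(\d+)-gfs-vol-(client|disperse)-(\d+)\b")
MSGID = re.compile(r"MSGID: (\d+)")


def join_records(lines):
    """Re-join log records that the mail client wrapped over several lines."""
    records, current = [], []
    for raw in lines:
        # Drop mail quote markers (">", ">> ", ...) and surrounding whitespace.
        line = re.sub(r"^(?:>\s?)+", "", raw.rstrip("\n")).strip()
        if not line:
            continue
        if RECORD_START.match(line) and current:
            records.append(" ".join(current))
            current = []
        current.append(line)
    if current:
        records.append(" ".join(current))
    return records


def summarize(records):
    """Count records per (graph id, translator type, MSGID)."""
    counts = Counter()
    for record in records:
        xl = XLATOR.search(record)
        if xl is None:
            continue  # e.g. the "Volume file changed" line logged by 0-mgmt
        msgid = MSGID.search(record)
        key = (int(xl.group(1)), xl.group(2), msgid.group(1) if msgid else "-")
        counts[key] += 1
    return counts


# A few lines copied from the excerpt above, still wrapped as in the mail.
sample = """\
[2017-04-05 13:40:07.681317] I [MSGID: 122061] [ec.c:340:ec_up]
2-gfs-vol-disperse-74: Going UP
[2017-04-05 13:45:10.689390] I [MSGID: 122062] [ec.c:354:ec_down]
0-gfs-vol-disperse-7: Going DOWN
[2017-04-05 13:45:10.689376] I [MSGID: 114021] [client.c:2361:notify]
0-gfs-vol-client-73: current graph is no longer active, destroying
rpc_client
""".splitlines()

if __name__ == "__main__":
    for key, n in sorted(summarize(join_records(sample)).items()):
        print(key, n)
```

Grouping by the leading number separates messages from the old graph (disconnects, "Going DOWN") from those of the new graph (connects, "Going UP"). The input is assumed to contain only log lines; ordinary prose mixed in would be appended to the preceding record.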

On Fri, Apr 7, 2017 at 11:05 AM, Nithya Balachandran <nbalacha at redhat.com>
wrote:

>
>
> On 6 April 2017 at 14:56, Amudhan P <amudhan83 at gmail.com> wrote:
>
>> Hi,
>>
>> I was able to add bricks to the volume successfully.
>> The client was reading, writing and listing data from the mount point.
>> But after adding bricks I had issues with folder listing (not all folders
>> listed, or an empty folder list returned) and writes were interrupted.
>>
>
> This is strange. The issue with listing folders you referred to earlier was
> because of the rebalance, but this seems new.
>
> How many bricks did you add and what is your volume config? What errors
> did you see while writing or listing folders?
>
>> Remounting the volume solved the issue and it is now working fine.
>>
>> I was under the impression that running rebalance would cause the folder
>> listing issue, but now adding bricks itself created a problem.
>> Whether the client is busy or idle makes no difference; a remount is
>> needed to solve the issue.
>>
>> Also, I would like to know whether using a brick in a volume without
>> fix-layout causes slow folder listing.
>>
>>
>> Below is a snippet of the client log from when this happened. Let me know
>> if you need any additional info.
>>
>> Client and servers are 3.10.1; the volume is mounted through FUSE.
>>
>> Machine busy downloading & uploading
>>
>> [2017-04-05 13:39:33.487176] I [MSGID: 114021] [client.c:2361:notify]
>> 0-gfs-vol-client-1107: current graph is no longer active, destroying
>> rpc_client
>> [2017-04-05 13:39:33.487196] I [MSGID: 114021] [client.c:2361:notify]
>> 0-gfs-vol-client-1108: current graph is no longer active, destroying
>> rpc_client
>> [2017-04-05 13:39:33.487201] I [MSGID: 114018]
>> [client.c:2276:client_rpc_notify] 0-gfs-vol-client-1107: disconnected
>> from gfs-vol-client-1107. Client process will keep trying to connect to
>> glusterd until brick's port is available
>> [2017-04-05 13:39:33.487212] I [MSGID: 114021] [client.c:2361:notify]
>> 0-gfs-vol-client-1109: current graph is no longer active, destroying
>> rpc_client
>> [2017-04-05 13:39:33.487217] I [MSGID: 114018]
>> [client.c:2276:client_rpc_notify] 0-gfs-vol-client-1108: disconnected
>> from gfs-vol-client-1108. Client process will keep trying to connect to
>> glusterd until brick's port is available
>> [2017-04-05 13:39:33.487232] I [MSGID: 114018]
>> [client.c:2276:client_rpc_notify] 0-gfs-vol-client-1109: disconnected
>> from gfs-vol-client-1109. Client process will keep trying to connect to
>> glusterd until brick's port is available
>>
>>
>> Idle system
>>
>> [2017-04-05 13:40:07.692336] I [MSGID: 114035]
>> [client-handshake.c:202:client_set_lk_version_cbk]
>> 2-gfs-vol-client-1065: Server lk version = 1
>> [2017-04-05 13:40:07.692383] I [MSGID: 114035]
>> [client-handshake.c:202:client_set_lk_version_cbk] 2-gfs-vol-client-995:
>> Server lk version = 1
>> [2017-04-05 13:40:07.692430] I [MSGID: 114035]
>> [client-handshake.c:202:client_set_lk_version_cbk] 2-gfs-vol-client-965:
>> Server lk version = 1
>> [2017-04-05 13:40:07.692485] I [MSGID: 114035]
>> [client-handshake.c:202:client_set_lk_version_cbk]
>> 2-gfs-vol-client-1075: Server lk version = 1
>> [2017-04-05 13:40:07.692532] I [MSGID: 114035]
>> [client-handshake.c:202:client_set_lk_version_cbk]
>> 2-gfs-vol-client-1025: Server lk version = 1
>> [2017-04-05 13:40:07.692569] I [MSGID: 114035]
>> [client-handshake.c:202:client_set_lk_version_cbk]
>> 2-gfs-vol-client-1055: Server lk version = 1
>> [2017-04-05 13:40:07.692620] I [MSGID: 114035]
>> [client-handshake.c:202:client_set_lk_version_cbk] 2-gfs-vol-client-955:
>> Server lk version = 1
>> [2017-04-05 13:40:07.692681] I [MSGID: 114035]
>> [client-handshake.c:202:client_set_lk_version_cbk]
>> 2-gfs-vol-client-1035: Server lk version = 1
>> [2017-04-05 13:40:07.692870] I [MSGID: 114035]
>> [client-handshake.c:202:client_set_lk_version_cbk]
>> 2-gfs-vol-client-1045: Server lk version = 1
>>
>>
>> Regards,
>> Amudhan
>>
>> On Tue, Apr 4, 2017 at 4:31 PM, Amudhan P <amudhan83 at gmail.com> wrote:
>>
>>> I mean, will listing folders and files take longer because "rebalance
>>> fix layout" was not done?
>>>
>>>
>>> On Tue, Apr 4, 2017 at 1:51 PM, Amudhan P <amudhan83 at gmail.com> wrote:
>>>
>>>> Ok, good to hear.
>>>>
>>>> Will there be any impact on listing folders and files?
>>>>
>>>>
>>>> On Tue, Apr 4, 2017 at 1:43 PM, Nithya Balachandran <
>>>> nbalacha at redhat.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On 4 April 2017 at 12:33, Amudhan P <amudhan83 at gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have a query on rebalancing.
>>>>>>
>>>>>> Let's consider the following as my folder hierarchy.
>>>>>>
>>>>>> parent1-fol (parent folder)
>>>>>>               |_
>>>>>>                  class-fol-1 (1st level subfolder)
>>>>>>                                |_
>>>>>>                                   A (2nd level subfolder)
>>>>>>                                    |_
>>>>>>                                       childfol-1 (child folder created
>>>>>>                                       every time before writing files)
>>>>>>
>>>>>>
>>>>>> Now, I have a running 3.10.1 cluster with a disperse volume and I
>>>>>> am planning to expand the cluster by adding bricks.
>>>>>>
>>>>>> Will there be a problem using newly added bricks without doing a
>>>>>> "rebalance fix layout", other than that existing files cannot be
>>>>>> rebalanced to the new bricks and files created under existing folders
>>>>>> will not go to the new bricks?
>>>>>>
>>>>>> I tested the above case in my test setup and observed that files created
>>>>>> under a new folder go to the new bricks, and I don't see any issue
>>>>>> listing files and folders.
>>>>>>
>>>>>> So, in my case, we create a child folder every time before creating
>>>>>> files.
>>>>>>
>>>>>> The reason to avoid rebalance is that I have more than 10000 folders
>>>>>> across 1080 bricks, so triggering a rebalance will take a long time. In
>>>>>> my previous expansion on 3.7, I was not able to access some folders
>>>>>> randomly until fix layout completed.
>>>>>>
>>>>>>
>>>>> It sounds like you will not need to run a rebalance or fix-layout for
>>>>> this. It should work fine.
>>>>>
>>>>> Regards,
>>>>> Nithya
>>>>>
>>>>>>
>>>>>> regards
>>>>>> Amudhan