[Gluster-users] Replica brick not working
Atin Mukherjee
amukherj at redhat.com
Thu Dec 8 05:13:22 UTC 2016
From the log snippet:
[2016-12-07 09:15:35.677645] I [MSGID: 106482]
[glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management:
Received add brick req
[2016-12-07 09:15:35.677708] I [MSGID: 106062]
[glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management:
replica-count is 2
[2016-12-07 09:15:35.677735] E [MSGID: 106291]
[glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:
The last log entry indicates that we hit the code path in
gd_addbr_validate_replica_count ()
        if (replica_count == volinfo->replica_count) {
                if (!(total_bricks % volinfo->dist_leaf_count)) {
                        ret = 1;
                        goto out;
                }
        }
@Pranith, Ravi - Milos was trying to convert a distribute (1 x 1) volume to a
replicate (1 x 2) volume using add-brick and hit this issue where add-brick
failed. The cluster is running 3.7.6. Could you help identify in what
scenario this code path can be hit? One straightforward issue I see here
is the missing err_str in this path.
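For reference, the sequence Milos ran boils down to the commands below (taken
from the thread; a sketch of the intended conversion, not a verified
reproduction on 3.7.6):

    # on storage2, which already hosts the single-brick volume "storage"
    sudo gluster peer probe storage
    sudo gluster volume add-brick storage replica 2 storage:/data/data-cluster

    # afterwards, both nodes should report the volume as 1 x 2 replicate
    sudo gluster volume info storage
    sudo gluster volume status storage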
On Wed, Dec 7, 2016 at 7:56 PM, Miloš Čučulović - MDPI <cuculovic at mdpi.com>
wrote:
> Sure Atin, logs are attached.
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
> Email: cuculovic at mdpi.com
> Skype: milos.cuculovic.mdpi
>
> On 07.12.2016 11:32, Atin Mukherjee wrote:
>
>> Milos,
>>
>> Giving snippets wouldn't help much, could you get me all the log files
>> (/var/log/glusterfs/*) from both the nodes?
>>
>> On Wed, Dec 7, 2016 at 3:54 PM, Miloš Čučulović - MDPI
>> <cuculovic at mdpi.com> wrote:
>>
>> Thanks, here is the log after volume force:
>>
>> [2016-12-07 10:23:39.157234] I [MSGID: 115036]
>> [server.c:552:server_rpc_notify] 0-storage-server: disconnecting
>> connection from
>> storage2-23175-2016/12/07-10:14:56:951307-storage-client-0-0-0
>> [2016-12-07 10:23:39.157301] I [MSGID: 101055]
>> [client_t.c:419:gf_client_unref] 0-storage-server: Shutting down
>> connection
>> storage2-23175-2016/12/07-10:14:56:951307-storage-client-0-0-0
>> [2016-12-07 10:23:40.187805] I [login.c:81:gf_auth] 0-auth/login:
>> allowed user names: ef4e608d-487b-49a3-85dd-0b36b3554312
>> [2016-12-07 10:23:40.187848] I [MSGID: 115029]
>> [server-handshake.c:612:server_setvolume] 0-storage-server: accepted
>> client from
>> storage2-23679-2016/12/07-10:23:40:160327-storage-client-0-0-0
>> (version: 3.7.6)
>> [2016-12-07 10:23:52.817529] E [MSGID: 113001]
>> [posix-helpers.c:1177:posix_handle_pair] 0-storage-posix:
>> /data/data-cluster/dms/submissions/User - 226485:
>> key:glusterfs.preop.parent.keyflags: 1 length:22 [Operation not
>> supported]
>> [2016-12-07 10:23:52.817598] E [MSGID: 113001]
>> [posix.c:1384:posix_mkdir] 0-storage-posix: setting xattrs on
>> /data/data-cluster/dms/submissions/User - 226485 failed [Operation
>> not supported]
>> [2016-12-07 10:23:52.821388] E [MSGID: 113001]
>> [posix-helpers.c:1177:posix_handle_pair] 0-storage-posix:
>> /data/data-cluster/dms/submissions/User -
>> 226485/815a39ccc2cb41dadba45fe7c1e226d4:
>> key:glusterfs.preop.parent.keyflags: 1 length:22 [Operation not
>> supported]
>> [2016-12-07 10:23:52.821434] E [MSGID: 113001]
>> [posix.c:1384:posix_mkdir] 0-storage-posix: setting xattrs on
>> /data/data-cluster/dms/submissions/User -
>> 226485/815a39ccc2cb41dadba45fe7c1e226d4 failed [Operation not
>> supported]
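The "Operation not supported" messages above are the brick's posix layer
failing to set an extended attribute. As a quick sanity check that the brick's
backend filesystem accepts xattrs at all, one could run the following on the
brick host (a diagnostic sketch; the attribute name is made up for the test):

    sudo setfattr -n user.xattr-test -v working /data/data-cluster
    sudo getfattr -n user.xattr-test /data/data-cluster
    sudo setfattr -x user.xattr-test /data/data-cluster

If these fail, the backend filesystem or its mount options are the problem; if
they succeed, the errors are specific to the key named in the log.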
>>
>> - Kindest regards,
>>
>> Milos Cuculovic
>> IT Manager
>>
>> ---
>> MDPI AG
>> Postfach, CH-4020 Basel, Switzerland
>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>> Tel. +41 61 683 77 35
>> Fax +41 61 302 89 18
>> Email: cuculovic at mdpi.com
>> Skype: milos.cuculovic.mdpi
>>
>> On 07.12.2016 11:19, Atin Mukherjee wrote:
>>
>> You are referring to the wrong log file, which is for the self-heal
>> daemon. You'd need to get back with the brick log file.
>>
>> On Wed, Dec 7, 2016 at 3:45 PM, Miloš Čučulović - MDPI
>> <cuculovic at mdpi.com> wrote:
>>
>> This is the log file after force command:
>>
>>
>> [2016-12-07 10:14:55.945937] W
>> [glusterfsd.c:1236:cleanup_and_exit]
>> (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x770a)
>> [0x7fe9d905570a]
>> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x40810d]
>> -->/usr/sbin/glusterfs(cleanup_and_exit+0x4d) [0x407f8d] )
>> 0-:
>> received signum (15), shutting down
>> [2016-12-07 10:14:56.960573] I [MSGID: 100030]
>> [glusterfsd.c:2318:main] 0-/usr/sbin/glusterfs: Started
>> running
>> /usr/sbin/glusterfs version 3.7.6 (args: /usr/sbin/glusterfs
>> -s
>> localhost --volfile-id gluster/glustershd -p
>> /var/lib/glusterd/glustershd/run/glustershd.pid -l
>> /var/log/glusterfs/glustershd.log -S
>> /var/run/gluster/2599dc977214c2895ef1b090a26c1518.socket
>> --xlator-option
>> *replicate*.node-uuid=7c988af2-9f76-4843-8e6f-d94866d57bb0)
>> [2016-12-07 10:14:56.968437] I [MSGID: 101190]
>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll:
>> Started
>> thread with index 1
>> [2016-12-07 10:14:56.969774] I
>> [graph.c:269:gf_add_cmdline_options]
>> 0-storage-replicate-0: adding option 'node-uuid' for volume
>> 'storage-replicate-0' with value
>> '7c988af2-9f76-4843-8e6f-d94866d57bb0'
>> [2016-12-07 10:14:56.985257] I [MSGID: 101190]
>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll:
>> Started
>> thread with index 2
>> [2016-12-07 10:14:56.986105] I [MSGID: 114020]
>> [client.c:2118:notify] 0-storage-client-0: parent
>> translators are
>> ready, attempting connect on transport
>> [2016-12-07 10:14:56.986668] I [MSGID: 114020]
>> [client.c:2118:notify] 0-storage-client-1: parent
>> translators are
>> ready, attempting connect on transport
>> Final graph:
>>
>> +------------------------------------------------------------------------------+
>> 1: volume storage-client-0
>> 2: type protocol/client
>> 3: option ping-timeout 42
>> 4: option remote-host storage2
>> 5: option remote-subvolume /data/data-cluster
>> 6: option transport-type socket
>> 7: option username ef4e608d-487b-49a3-85dd-0b36b3554312
>> 8: option password dda0bdbf-95c1-4206-a57d-686756210170
>> 9: end-volume
>> 10:
>> 11: volume storage-client-1
>> 12: type protocol/client
>> 13: option ping-timeout 42
>> 14: option remote-host storage
>> 15: option remote-subvolume /data/data-cluster
>> 16: option transport-type socket
>> 17: option username ef4e608d-487b-49a3-85dd-0b36b3554312
>> 18: option password dda0bdbf-95c1-4206-a57d-686756210170
>> 19: end-volume
>> 20:
>> 21: volume storage-replicate-0
>> 22: type cluster/replicate
>>      23:     option node-uuid 7c988af2-9f76-4843-8e6f-d94866d57bb0
>> 24: option background-self-heal-count 0
>> 25: option metadata-self-heal on
>> 26: option data-self-heal on
>> 27: option entry-self-heal on
>> 28: option self-heal-daemon enable
>> 29: option iam-self-heal-daemon yes
>> [2016-12-07 10:14:56.987096] I
>> [rpc-clnt.c:1847:rpc_clnt_reconfig]
>> 0-storage-client-0: changing port to 49152 (from 0)
>> 30: subvolumes storage-client-0 storage-client-1
>> 31: end-volume
>> 32:
>> 33: volume glustershd
>> 34: type debug/io-stats
>> 35: subvolumes storage-replicate-0
>> 36: end-volume
>> 37:
>>
>> +------------------------------------------------------------------------------+
>> [2016-12-07 10:14:56.987685] E [MSGID: 114058]
>> [client-handshake.c:1524:client_query_portmap_cbk]
>> 0-storage-client-1: failed to get the port number for remote
>> subvolume. Please run 'gluster volume status' on server to
>> see if
>> brick process is running.
>> [2016-12-07 10:14:56.987766] I [MSGID: 114018]
>> [client.c:2042:client_rpc_notify] 0-storage-client-1:
>> disconnected
>> from storage-client-1. Client process will keep trying to
>> connect to
>> glusterd until brick's port is available
>> [2016-12-07 10:14:56.988065] I [MSGID: 114057]
>> [client-handshake.c:1437:select_server_supported_programs]
>> 0-storage-client-0: Using Program GlusterFS 3.3, Num
>> (1298437),
>> Version (330)
>> [2016-12-07 10:14:56.988387] I [MSGID: 114046]
>> [client-handshake.c:1213:client_setvolume_cbk]
>> 0-storage-client-0:
>> Connected to storage-client-0, attached to remote volume
>> '/data/data-cluster'.
>> [2016-12-07 10:14:56.988409] I [MSGID: 114047]
>> [client-handshake.c:1224:client_setvolume_cbk]
>> 0-storage-client-0:
>> Server and Client lk-version numbers are not same, reopening
>> the fds
>> [2016-12-07 10:14:56.988476] I [MSGID: 108005]
>> [afr-common.c:3841:afr_notify] 0-storage-replicate-0:
>> Subvolume
>> 'storage-client-0' came back up; going online.
>> [2016-12-07 10:14:56.988581] I [MSGID: 114035]
>> [client-handshake.c:193:client_set_lk_version_cbk]
>> 0-storage-client-0: Server lk version = 1
>>
>>
>> - Kindest regards,
>>
>> Milos Cuculovic
>> IT Manager
>>
>> ---
>> MDPI AG
>> Postfach, CH-4020 Basel, Switzerland
>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>> Tel. +41 61 683 77 35
>> Fax +41 61 302 89 18
>> Email: cuculovic at mdpi.com
>> Skype: milos.cuculovic.mdpi
>>
>> On 07.12.2016 11:09, Atin Mukherjee wrote:
>>
>>
>>
>> On Wed, Dec 7, 2016 at 3:37 PM, Miloš Čučulović - MDPI
>> <cuculovic at mdpi.com> wrote:
>>
>> Hi Atin,
>>
>> thanks for your reply.
>>
>> I have been trying to debug it since yesterday, and today I
>> completely purged glusterfs-server from the storage server.
>>
>> I installed it again, checked the firewall and the
>> current
>> status is
>> as follows now:
>>
>> On storage2, I am running:
>> sudo gluster volume add-brick storage replica 2
>> storage:/data/data-cluster
>> Answer => volume add-brick: failed: Operation failed
>> cmd_history says:
>> [2016-12-07 09:57:28.471009] : volume add-brick
>> storage
>> replica 2
>> storage:/data/data-cluster : FAILED : Operation failed
>>
>> glustershd.log => no new entry when running the
>> add-brick command.
>>
>> etc-glusterfs-glusterd.vol.log =>
>> [2016-12-07 10:01:56.567564] I [MSGID: 106482]
>> [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management:
>> Received add brick req
>> [2016-12-07 10:01:56.567626] I [MSGID: 106062]
>> [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management:
>> replica-count is 2
>> [2016-12-07 10:01:56.567655] E [MSGID: 106291]
>> [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:
>>
>>
>> Logs from storage (the new server): there is no relevant
>> log entry when I run the add-brick command on storage2.
>>
>>
>> Now, after reinstalling glusterfs-server on storage,
>> I can
>> see on
>> storage2:
>>
>> Status of volume: storage
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick storage2:/data/data-cluster           49152     0          Y       2160
>> Self-heal Daemon on localhost               N/A       N/A        Y       7906
>>
>> Task Status of Volume storage
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>>
>> By running "gluster volume start storage force", do I
>> risk breaking storage2? This is a production server and
>> needs to stay live.
>>
>>
>> No, it's going to bring up the brick process(es) if they are
>> not already up.
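In practice that looks like the following on either peer (a usage sketch; the
brick log file name is the usual mapping of the brick path and may differ):

    sudo gluster volume start storage force
    sudo gluster volume status storage

    # if storage:/data/data-cluster still shows Online = N, check its brick log
    sudo tail -n 50 /var/log/glusterfs/bricks/data-data-cluster.log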
>>
>>
>> - Kindest regards,
>>
>> Milos Cuculovic
>> IT Manager
>>
>> ---
>> MDPI AG
>> Postfach, CH-4020 Basel, Switzerland
>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>> Tel. +41 61 683 77 35
>> Fax +41 61 302 89 18
>> Email: cuculovic at mdpi.com
>> Skype: milos.cuculovic.mdpi
>>
>> On 07.12.2016 10:44, Atin Mukherjee wrote:
>>
>>
>>
>> On Tue, Dec 6, 2016 at 10:08 PM, Miloš Čučulović
>> - MDPI
>> <cuculovic at mdpi.com> wrote:
>>
>> Dear All,
>>
>> I have two servers, storage and storage2.
>> storage2 had a volume called storage.
>> I then decided to add a replica brick
>> (storage).
>>
>> I did this in the following way:
>>
>> 1. sudo gluster peer probe storage (on
>> server storage2)
>> 2. sudo gluster volume add-brick storage
>> replica 2
>> storage:/data/data-cluster
>>
>> Then I was getting the following error:
>> volume add-brick: failed: Operation failed
>>
>> But, it seems the brick was somehow added,
>> as when
>> checking
>> on storage2:
>> sudo gluster volume info storage
>> I am getting:
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: storage2:/data/data-cluster
>> Brick2: storage:/data/data-cluster
>>
>>
>> So, it seems OK here. However, when doing:
>> sudo gluster volume heal storage info
>> I am getting:
>> Volume storage is not of type
>> replicate/disperse
>> Volume heal failed.
>>
>>
>> Also, when doing
>> sudo gluster volume status all
>>
>> I am getting:
>> Status of volume: storage
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick storage2:/data/data-cluster           49152     0          Y       2160
>> Brick storage:/data/data-cluster            N/A       N/A        N       N/A
>> Self-heal Daemon on localhost               N/A       N/A        Y       7906
>> Self-heal Daemon on storage                 N/A       N/A        N       N/A
>>
>> Task Status of Volume storage
>> ------------------------------------------------------------------------------
>>
>> Any idea please?
>>
>>
>> It looks like the brick didn't come up during the
>> add-brick. Could you share cmd_history, glusterd and the
>> new brick's log file from both nodes? As a
>> workaround, could you try 'gluster volume start storage
>> force' and see if the issue persists?
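The files being asked for normally live under /var/log/glusterfs/ on each
node; a collection sketch (paths are the typical 3.7.x locations and may
differ with the packaging):

    sudo tar czf gluster-logs-$(hostname).tar.gz \
        /var/log/glusterfs/cmd_history.log \
        /var/log/glusterfs/etc-glusterfs-glusterd.vol.log \
        /var/log/glusterfs/bricks/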
>>
>>
>>
>> --
>> - Kindest regards,
>>
>> Milos Cuculovic
>> IT Manager
>>
>> ---
>> MDPI AG
>> Postfach, CH-4020 Basel, Switzerland
>> Office: St. Alban-Anlage 66, 4052 Basel,
>> Switzerland
>> Tel. +41 61 683 77 35
>> Fax +41 61 302 89 18
>> Email: cuculovic at mdpi.com
>> Skype: milos.cuculovic.mdpi
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>>
>> --
>>
>> ~ Atin (atinm)
>>
>
--
~ Atin (atinm)