[Gluster-users] Error after crash of Virtual Machine during migration

Mariusz Sobisiak MSobisiak at ydp.pl
Tue Dec 10 10:59:44 UTC 2013


Greetings,

Legend:
storage-gfs-3-prd - the first Gluster server.
storage-1-saas - the new Gluster server to which the first one was being
migrated.
storage-gfs-4-prd - the second Gluster server (to be migrated later).

I started the replace-brick command:
'gluster volume replace-brick sa_bookshelf storage-gfs-3-prd:/ydp/shared
storage-1-saas:/ydp/shared start'

While it was running, the Virtual Machine (Xen) crashed. Now I can
neither abort the migration nor resume it.

When I try:
'# gluster volume replace-brick sa_bookshelf
storage-gfs-3-prd:/ydp/shared storage-1-saas:/ydp/shared abort'
the command hangs for about 5 minutes and then finishes with no output.
After that command Gluster also starts to behave very strangely: for
example, '# gluster volume heal sa_bookshelf info' hangs for about 5
minutes as well and returns a blank screen (just like abort).

After I restart the Gluster server, everything returns to normal except
the replace-brick commands. When I run:
'# gluster volume replace-brick sa_bookshelf
storage-gfs-3-prd:/ydp/shared storage-1-saas:/ydp/shared status'
I get:
Number of files migrated = 0       Current file=
The 'volume heal info' commands etc. work again, until I run:
'# gluster volume replace-brick sa_bookshelf
storage-gfs-3-prd:/ydp/shared storage-1-saas:/ydp/shared abort'.
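Until the abort works, the migration can at least be watched from a
script. Below is a minimal sketch (my own, not from any how-to) that
polls the status and pulls out the migrated-file count; it assumes the
3.3-style status line shown above ('Number of files migrated = N'):

```shell
#!/bin/sh
# Poll replace-brick status and extract the migrated-file count.
VOL=sa_bookshelf
SRC=storage-gfs-3-prd:/ydp/shared
DST=storage-1-saas:/ydp/shared

parse_migrated() {
    # Pull the number that follows "Number of files migrated =".
    sed -n 's/.*Number of files migrated = *\([0-9][0-9]*\).*/\1/p'
}

status=$(gluster volume replace-brick "$VOL" "$SRC" "$DST" status)
echo "files migrated so far: $(printf '%s' "$status" | parse_migrated)"
```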



# gluster --version
glusterfs 3.3.1 built on Oct 22 2012 07:54:24 Repository revision:
git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com> GlusterFS
comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU
General Public License.

Brick (/ydp/shared) log (the same three entries repeat every 3 seconds):
[2013-12-06 11:29:44.790299] W [dict.c:995:data_to_str] (-->/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_connect+0xab) [0x7ff4a5d35fcb] (-->/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x15d) [0x7ff4a5d3d64d] (-->/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(client_fill_address_family+0x2bb) [0x7ff4a5d3d4ab]))) 0-dict: data is NULL
[2013-12-06 11:29:44.790402] W [dict.c:995:data_to_str] (-->/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_connect+0xab) [0x7ff4a5d35fcb] (-->/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x15d) [0x7ff4a5d3d64d] (-->/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(client_fill_address_family+0x2c6) [0x7ff4a5d3d4b6]))) 0-dict: data is NULL
[2013-12-06 11:29:44.790465] E [name.c:141:client_fill_address_family] 0-sa_bookshelf-replace-brick: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options
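The repeated error says the replace-brick client is starting without a
transport.address-family option and cannot guess one, because remote-host
is also (null). For illustration only, a hypothetical sketch of where
that option would sit in a protocol/client section of a volfile (the
exact file involved, and whether editing it is safe in this state, are
assumptions on my part, not a verified fix):

```
volume sa_bookshelf-replace-brick
    type protocol/client
    option transport-type tcp
    # assumed fix: set the family explicitly instead of letting the client guess
    option transport.address-family inet
    # remote-host was (null) in the log; it would normally name the target server
    option remote-host storage-1-saas
end-volume
```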


# gluster volume info

Volume Name: sa_bookshelf
Type: Distributed-Replicate
Volume ID: 74512f52-72ec-4538-9a54-4e50c4691722
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: storage-gfs-3-prd:/ydp/shared
Brick2: storage-gfs-4-prd:/ydp/shared
Brick3: storage-gfs-3-prd:/ydp/shared2
Brick4: storage-gfs-4-prd:/ydp/shared2


# gluster volume status
Status of volume: sa_bookshelf
Gluster process                                         Port    Online  Pid
---------------------------------------------------------------------------
Brick storage-gfs-3-prd:/ydp/shared                     24009   Y       758
Brick storage-gfs-4-prd:/ydp/shared                     24009   Y       730
Brick storage-gfs-3-prd:/ydp/shared2                    24010   Y       764
Brick storage-gfs-4-prd:/ydp/shared2                    24010   Y       4578
NFS Server on localhost                                 38467   Y       770
Self-heal Daemon on localhost                           N/A     Y       776
NFS Server on storage-1-saas                            38467   Y       840
Self-heal Daemon on storage-1-saas                      N/A     Y       846
NFS Server on storage-gfs-4-prd                         38467   Y       4584
Self-heal Daemon on storage-gfs-4-prd                   N/A     Y       4590

storage-gfs-3-prd:~# gluster peer status
Number of Peers: 2

Hostname: storage-1-saas
Uuid: 37b9d881-ce24-4550-b9de-6b304d7e9d07
State: Peer in Cluster (Connected)

Hostname: storage-gfs-4-prd
Uuid: 4c384f45-873b-4c12-9683-903059132c56
State: Peer in Cluster (Connected)


(from storage-1-saas)# gluster peer status
Number of Peers: 2

Hostname: 172.16.3.60
Uuid: 1441a7b0-09d2-4a40-a3ac-0d0e546f6884
State: Peer in Cluster (Connected)

Hostname: storage-gfs-4-prd
Uuid: 4c384f45-873b-4c12-9683-903059132c56
State: Peer in Cluster (Connected)



The clients work properly.
I googled this and found a similar bug report, but it was for version
3.3.0. How can I fix this and continue my migration? Thank you for any
help.

BTW: I moved the Gluster server by following the "Gluster 3.4: Brick
Restoration - Replace Crashed Server" how-to.

Regards,
Mariusz


