[Bugs] [Bug 1564071] directories are invisible on client side
bugzilla at redhat.com
Mon Apr 30 15:37:26 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1564071
--- Comment #13 from g.amedick at uni-luebeck.de ---
We restarted the rebalance. It will take a while, though (estimated time: 50
hours). We'll report the outcome.
The bricks are actually virtual disks provided by a large storage system. The
storage reports no errors (no loss of connectivity and no hard drive
failures).
We didn't touch the brick process at all (actually, we weren't even present; it
was late in the evening). It recovered on its own.
Port 49159 on gluster02 belongs to brick DATA208. The port was open when we
came to work the next day. The brick was up and running. The glusterd-log
showed nothing about having lost a brick, just the failed rebalance:
[2018-04-24 18:59:19.256333] I [MSGID: 106172]
[glusterd-handshake.c:1014:__server_event_notify] 0-glusterd: received defrag
status updated
[2018-04-24 18:59:19.263291] W [socket.c:593:__socket_rwv] 0-management: readv
on /var/run/gluster/gluster-rebalance-0d210c70-e44f-46f1-862c-ef260514c9f1.sock
failed (No data available)
[2018-04-24 18:59:19.266258] I [MSGID: 106007]
[glusterd-rebalance.c:158:__glusterd_defrag_notify] 0-management: Rebalance
process for volume $vol1 has disconnected.
That's the complete log of that day.
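For anyone who wants to pull the warnings and errors of a single day out of a larger log, here is a minimal sketch in Python. It assumes the raw log file holds one entry per line (the excerpts above are wrapped by the mail client) and that entries follow the layout seen in the excerpts. The names parse_entry and entries_on are made up for this example and are not part of GlusterFS.

```python
import re
from datetime import datetime

# Layout assumed from the excerpts in this comment:
# [timestamp] LEVEL [MSGID: n] [file:line:function] domain: message
# The MSGID part is optional.
LOG_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"
    r"\[(?P<src>[^\]]+)\]\s+"
    r"(?P<domain>\S+):\s*(?P<msg>.*)$"
)


def parse_entry(line):
    """Return a dict for one log line, or None if the line does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    entry["ts"] = datetime.strptime(entry["ts"], "%Y-%m-%d %H:%M:%S.%f")
    return entry


def entries_on(lines, day, levels=("W", "E")):
    """Yield parsed entries logged on `day` (a datetime.date) at the given levels."""
    for line in lines:
        entry = parse_entry(line.rstrip("\n"))
        if entry and entry["ts"].date() == day and entry["level"] in levels:
            yield entry
```

Passing an open log file as `lines` and `datetime(2018, 4, 24).date()` as `day` would list only the W and E entries of that day.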
For some reason, DATA208 tried to connect to port 49057:
[2018-04-24 18:56:02.744587] W [socket.c:593:__socket_rwv] 0-tcp.$vol1-server:
writev on $IP_gluster02:49057 failed (Broken pipe)
We are unsure why. There's nothing listening:
$ netstat -tulpen | grep 49057
$ netstat -tulpen | grep gluster
tcp 0 0 0.0.0.0:49152 0.0.0.0:* LISTEN 0 24130 4064/glusterfsd
tcp 0 0 0.0.0.0:49153 0.0.0.0:* LISTEN 0 18881 4072/glusterfsd
tcp 0 0 0.0.0.0:49154 0.0.0.0:* LISTEN 0 19775 4080/glusterfsd
tcp 0 0 0.0.0.0:49155 0.0.0.0:* LISTEN 0 26969 4090/glusterfsd
tcp 0 0 0.0.0.0:49156 0.0.0.0:* LISTEN 0 45238 4098/glusterfsd
tcp 0 0 0.0.0.0:49157 0.0.0.0:* LISTEN 0 46649 4107/glusterfsd
tcp 0 0 0.0.0.0:49158 0.0.0.0:* LISTEN 0 1440 4116/glusterfsd
tcp 0 0 0.0.0.0:49159 0.0.0.0:* LISTEN 0 18417 4125/glusterfsd
tcp 0 0 0.0.0.0:24007 0.0.0.0:* LISTEN 0 15592 3873/glusterd
tcp 0 0 0.0.0.0:49160 0.0.0.0:* LISTEN 0 19785 4134/glusterfsd
tcp 0 0 0.0.0.0:49161 0.0.0.0:* LISTEN 0 36104 4143/glusterfsd
tcp 0 0 0.0.0.0:49162 0.0.0.0:* LISTEN 0 72783 4152/glusterfsd
tcp 0 0 0.0.0.0:49163 0.0.0.0:* LISTEN 0 38236 4161/glusterfsd
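The same question ("is anything listening on that port?") can also be asked from another machine by attempting a TCP connection. The following is only a sketch: is_listening is a name invented for this example, and a firewall between the two hosts can make an open port look closed, so a False result is weaker evidence than the netstat output above.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (refused, timeout, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_listening("gluster02", 49159)` would probe the DATA208 brick port and `is_listening("gluster02", 49057)` the port from the brick log, assuming gluster02 resolves from the host running the check.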
We don't know why the rebalance failed. This is the first time something like
this has happened, and we don't understand the brick log.
We need to discuss uploading the pcap file with our supervisor, since it
contains our IPs. Is there a way to give it to you without making it public?
Something else happened today:
A user reported that she wanted to create a symlink with an absolute path to
some file. There was no error message (in fact, the mount log reported
Success), but the symlink led nowhere. The volume is usually mounted at /data.
On all compute nodes with the /data mount, creating a symlink to this file
didn't work. A new mount I created at /mnt, however, could create the symlink.
The systemd mount unit is a literal copy except for the mount point. A server
with both mount points (/data and /mnt) could create the symlink on the /mnt
mount point but not on /data. Relative paths, however, work fine. It looks like
this:
$ ls -lah
lrwxrwxrwx 1 root itsc_test_proj2 120 Apr 30 15:25 test1.gz ->
/mnt/$PATH/$file.gz
lrwxrwxrwx 1 root itsc_test_proj2 121 Apr 30 15:47 test2.gz ->
lrwxrwxrwx 1 root itsc_test_proj2 120 Apr 30 15:48 test3.gz ->
/mnt/$PATH/$file.gz
lrwxrwxrwx 1 root itsc_test_proj2 118 Apr 30 16:05 test4.gz ->
../$PATH/$file.gz
lrwxrwxrwx 1 root itsc_test_proj2 119 Apr 30 16:06 test5.gz ->
lrwxrwxrwx 1 root itsc_test_proj2 121 Apr 30 16:08 test6.gz ->
lrwxrwxrwx 1 root itsc_test_proj2 121 Apr 30 16:08 test7.gz ->
lrwxrwxrwx 1 root itsc_test_proj2 120 Apr 30 15:48 test8.gz ->
/mnt/$PATH/$file.gz
Creation of the symlinks:
test1.gz & test3.gz via "cd /mnt; ln -s /mnt/$PATH/$file.gz test$x.gz"
test2.gz, test5.gz & test6.gz via "cd /data; ln -s /data/$PATH/$file.gz
test$x.gz"
test4.gz via "cd /data; ln -s ../$PATH/$file.gz test$x.gz"
test7.gz via "cd /mnt; ln -s /data/$PATH/$file.gz test$x.gz"
test8.gz via "cd /data; ln -s /mnt/$PATH/$file.gz test$x.gz"
This was reproducible.
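To repeat the test without reading ls output by eye, a short Python sketch along these lines could create a link and compare the stored target with the requested one. check_symlink is a helper invented for this example. We have not run it on the affected mounts, so how os.readlink behaves for the links with an empty target is not known.

```python
import os
import tempfile


def check_symlink(link_path, target):
    """Create link_path -> target and report what the filesystem stored."""
    os.symlink(target, link_path)
    stored = os.readlink(link_path)
    return {
        "requested": target,
        "stored": stored,
        "matches": stored == target,               # target string kept intact?
        "resolves": os.path.exists(link_path),     # does the link lead somewhere?
    }


if __name__ == "__main__":
    # Demo on a local temporary directory; on a healthy filesystem both
    # checks are expected to succeed.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "file.gz")
        open(target, "w").close()
        print(check_symlink(os.path.join(tmp, "test.gz"), target))
```

Running check_symlink once from a directory under /data and once under /mnt, with the same absolute target, would show directly whether the two mounts store different link contents.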
We know that the /mnt mount point is not completely fine either, since the
hidden files we used to create the logs were hidden there, too. Still, the
mounts behave differently. Symlinks with an absolute path pointing to /data
aren't created correctly. Following the strange symlinks with zcat produces an
error:
$ zcat test7.gz | head
gzip: test7.gz is a directory -- ignored
All other links, including the one with a relative path pointing to /data, can
be used as usual.