[Gluster-users] Writing to distributed (non-replicated) volume with failed nodes

Leonid Isaev leonid.isaev at jila.colorado.edu
Thu Oct 8 02:24:07 UTC 2015


Hi,

	I have an 8-node trusted pool with a distributed, non-replicated volume.
The bricks are located on only 2 machines (2 bricks per node), so there are 6
"dummy" nodes. Everything works great until one of the brick-carrying nodes
experiences a power outage.
	In this case, I can still mount the volume after a timeout (there are
plenty of servers to ask for metadata, after all) and read files from it.
However, whenever I try to create a randomly named file (e.g. by running touch
/mnt/.lock-${RANDOM}${RANDOM}), this succeeds only sometimes and often fails
with "no such file or directory". I would understand that error if I were
touching files that already exist on the offline node (but are invisible on
the degraded volume), but these are new random files that never existed before.
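	To put a number on "only sometimes", the touch command above can be
wrapped in a loop that counts successes and failures. This is only a rough
sketch: the function name try_touch, the iteration count, and the cleanup of
the test files are assumptions for illustration, and /mnt is the mount point
from the command above. ${RANDOM} requires bash.

```shell
#!/bin/bash
# Sketch: create randomly named files in a directory and count how many
# creations succeed and how many fail.
# try_touch is a hypothetical helper name, not part of any gluster tooling.
try_touch() {
    local dir=$1 count=$2
    local ok=0 fail=0 i=0 f
    while [ "$i" -lt "$count" ]; do
        # Same naming scheme as the touch command in the message.
        f="${dir}/.lock-${RANDOM}${RANDOM}"
        if touch "$f" 2>/dev/null; then
            ok=$((ok + 1))
            rm -f "$f"    # assumption: test files are not needed afterwards
        else
            fail=$((fail + 1))
        fi
        i=$((i + 1))
    done
    echo "ok=${ok} fail=${fail}"
}

# Example invocation against the degraded volume (assumed mounted at /mnt):
#   try_touch /mnt 100
```

Running it once with all bricks online and once with one brick node powered
off would show whether the failures appear only in the degraded state.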
	So, why does writing to the online bricks fail, and what can I do to
enable it? The machines run fully up-to-date Fedora 22 and Arch Linux with
gluster 3.7.4. I tried to look for similar problems on this ML but haven't
found anything related; sorry if I missed something.

Thanks!
L.

-- 
Leonid Isaev
GPG fingerprints: DA92 034D B4A8 EC51 7EA6  20DF 9291 EE8A 043C B8C4
                  C0DF 20D0 C075 C3F1 E1BE  775A A7AE F6CB 164B 5A6D

