[Gluster-users] No space left on device (when there is actually lots of free space)
Kali Hernandez
kali at thenetcircle.com
Tue Apr 6 05:56:50 UTC 2010
Hi,
In this same environment, when I try to create a new directory on the
mount point (client side), I get this error:
profile3:/mnt # mkdir gluster_new/newdir
mkdir: cannot create directory `gluster_new/newdir': Software caused
connection abort
profile3:/mnt # mkdir gluster_new/newdir
mkdir: cannot create directory `gluster_new/newdir': Transport endpoint
is not connected
profile3:/mnt # mount
If I check the log file, I can see:
[2010-04-06 07:58:26] W [fuse-bridge.c:477:fuse_entry_cbk]
glusterfs-fuse: 4373613: MKDIR() /newdir returning inode 0
pending frames:
frame : type(1) op(MKDIR)
frame : type(1) op(MKDIR)
patchset: v3.0.2-41-g029062c
signal received: 11
time of crash: 2010-04-06 07:58:26
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.0.3
/lib64/libc.so.6[0x7f49e1c7f6e0]
/usr/lib64/libglusterfs.so.0(inode_link+0x23)[0x7f49e23e73b3]
/usr/lib64/glusterfs/3.0.3/xlator/mount/fuse.so[0x7f49e07b8a43]
/usr/lib64/glusterfs/3.0.3/xlator/mount/fuse.so[0x7f49e07b8f92]
/usr/lib64/libglusterfs.so.0[0x7f49e23e0cd5]
/usr/lib64/libglusterfs.so.0[0x7f49e23e0cd5]
/usr/lib64/glusterfs/3.0.3/xlator/cluster/stripe.so(stripe_stack_unwind_inode_cbk+0x1aa)[0x7f49e0de19ba]
/usr/lib64/glusterfs/3.0.3/xlator/cluster/replicate.so(afr_mkdir_unwind+0x113)[0x7f49e0ffa4c3]
/usr/lib64/glusterfs/3.0.3/xlator/cluster/replicate.so(afr_mkdir_wind_cbk+0xbe)[0x7f49e0ffb1de]
/usr/lib64/glusterfs/3.0.3/xlator/protocol/client.so(client_mkdir_cbk+0x405)[0x7f49e1242d35]
/usr/lib64/glusterfs/3.0.3/xlator/protocol/client.so(protocol_client_pollin+0xca)[0x7f49e123024a]
/usr/lib64/glusterfs/3.0.3/xlator/protocol/client.so(notify+0x212)[0x7f49e12376c2]
/usr/lib64/libglusterfs.so.0(xlator_notify+0x43)[0x7f49e23d93e3]
/usr/lib64/glusterfs/3.0.3/transport/socket.so(socket_event_handler+0xd3)[0x7f49dfda6173]
/usr/lib64/libglusterfs.so.0[0x7f49e23f3045]
/usr/sbin/glusterfs(main+0xa28)[0x404268]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x7f49e1c6b586]
/usr/sbin/glusterfs[0x402749]
---------
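After this crash the mount point stays dead until I restart the client,
roughly like this (the volfile path is the one from the df output quoted
below; exact invocation from memory):

profile3:/mnt # umount /mnt/gluster_new
profile3:/mnt # glusterfs -f /etc/glusterfs/glusterfs.vol.new /mnt/gluster_new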
Again, I am totally clueless...
On 04/06/2010 12:07 PM, Kali Hernandez wrote:
>
> Hi all,
>
> We are running glusterfs 3.0.3, installed from RHEL RPMs, over 30
> nodes (physical machines, not VMs). Our config pairs the machines two
> by two under the replicate translator as mirrors, and then aggregates
> the 15 resulting mirrors under the stripe translator. We previously
> used distribute instead of stripe, but hit the same problem.
>
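> To illustrate, one mirror pair is defined roughly like this (a
> simplified sketch, not our exact volfile; the hostnames and the
> "brick" remote-subvolume name are just for illustration):
>
>   volume node35
>     type protocol/client
>     option transport-type tcp
>     option remote-host node35
>     option remote-subvolume brick
>   end-volume
>
>   volume node45
>     type protocol/client
>     option transport-type tcp
>     option remote-host node45
>     option remote-subvolume brick
>   end-volume
>
>   volume mirror-35-45
>     type cluster/replicate
>     subvolumes node35 node45
>   end-volume
>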
> We are copying (using cp) a lot of files that reside in the same
> directory, and I have been monitoring the whole copy process to see
> where the failure starts.
>
> In the middle of the copy process we get this error:
>
> cp: cannot create regular file
> `/mnt/gluster_new/videos/1251512-3CA86758640A31E7770EBC7629AEC10F.mpg': No
> space left on device
> cp: cannot create regular file
> `/mnt/gluster_new/videos/1758650-3AF69C6B7FDAC0A40D85EABA8C85490D.mswmm':
> No space left on device
> cp: cannot create regular file
> `/mnt/gluster_new/videos/179183-A018B5FBE6DCCF04A3BB99C814CD9EAB.wmv':
> No space left on device
> cp: cannot create regular file
> `/mnt/gluster_new/videos/2448602-568B1ACF53675DC762485F2B26539E0D.wmv': No
> space left on device
> cp: cannot create regular file
> `/mnt/gluster_new/videos/626249-7B7FFFE0B9C56E9BE5733409CB73BCDF_300.jpg':
> No space left on device
> cp: cannot create regular file
> `/mnt/gluster_new/videos/1962299-B7CDFF12FB1AD41DF3660BF0C7045CBC.avi': No
> space left on device
>
> (hundreds of times)
>
> When I look at the storage distribution, I can see this:
>
> node 10 37G 14G 23G 38% /glusterfs_storage
> node 11 37G 14G 23G 37% /glusterfs_storage
> node 12 37G 14G 23G 37% /glusterfs_storage
> node 13 37G 14G 23G 37% /glusterfs_storage
> node 14 37G 13G 24G 36% /glusterfs_storage
> node 15 37G 13G 24G 36% /glusterfs_storage
> node 16 37G 13G 24G 35% /glusterfs_storage
> node 17 49G 12G 36G 26% /glusterfs_storage
> node 18 37G 12G 25G 33% /glusterfs_storage
> node 19 37G 12G 25G 33% /glusterfs_storage
> node 20 37G 14G 23G 38% /glusterfs_storage
> node 21 37G 14G 23G 37% /glusterfs_storage
> node 22 37G 14G 23G 37% /glusterfs_storage
> node 23 37G 14G 23G 37% /glusterfs_storage
> node 24 37G 13G 24G 36% /glusterfs_storage
> node 25 37G 13G 24G 36% /glusterfs_storage
> node 26 37G 13G 24G 35% /glusterfs_storage
> node 27 49G 12G 36G 26% /glusterfs_storage
> node 28 37G 12G 25G 33% /glusterfs_storage
> node 29 37G 12G 25G 33% /glusterfs_storage
> node 35 40G 40G 0 100% /glusterfs_storage
> node 36 40G 22G 18G 56% /glusterfs_storage
> node 37 40G 18G 22G 45% /glusterfs_storage
> node 38 40G 16G 24G 40% /glusterfs_storage
> node 39 40G 15G 25G 37% /glusterfs_storage
> node 45 40G 40G 0 100% /glusterfs_storage
> node 46 40G 22G 18G 56% /glusterfs_storage
> node 47 40G 18G 22G 45% /glusterfs_storage
> node 48 40G 16G 24G 40% /glusterfs_storage
> node 49 40G 15G 25G 37% /glusterfs_storage
>
> (node mirror pairings are 10-19 paired to 20-29, and 35-39 to 45-49)
>
>
> As you can see, the distribution of space over the cluster is more or
> less even across most of the nodes, except for the node pair 35/45,
> which has run out of space. As a result, every attempt to copy more
> data onto the cluster runs into the "no space left on device" error
> above.
>
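> For the record, I collect those per-node numbers with a loop along
> these lines (assuming the backends resolve as nodeNN and passwordless
> ssh is set up):
>
>   for n in $(seq 10 29) $(seq 35 39) $(seq 45 49); do
>       echo -n "node $n: "
>       ssh "node$n" 'df -h /glusterfs_storage | tail -1'
>   done
>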
> Seen from the mount point, the gluster free space looks like this:
>
> Filesystem                       1M-blocks   Used Available Use% Mounted on
> [...]
> /etc/glusterfs/glusterfs.vol.new    586617 240197    340871  42% /mnt/gluster_new
>
>
> So basically, I get out-of-space errors when there are around 340 GB
> free on the cluster. (That figure is roughly the sum of the free
> space of each mirror pair above, including the 35/45 pair, which has
> 0 bytes free.)
>
>
> I tried the distribute translator instead of stripe (in fact that was
> our first setup), but we suspected the failures happened when we
> started copying a big file (we usually store really big .tar.gz
> backups here) and ran out of space on one node partway through. So we
> switched to stripe, because in theory glusterfs would then place the
> next block of the file on another node. But with both translators
> (distribute and stripe) we run into the same problem.
>
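> For reference, the stripe section of the volfile is essentially this
> (simplified; the block-size shown is the default as far as I know, I
> would have to double-check our actual value):
>
>   volume stripe0
>     type cluster/stripe
>     option block-size 128KB
>     subvolumes mirror-10-20 mirror-11-21 [...] mirror-39-49
>   end-volume
>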
> So I am wondering: is this a limit on the number of files in a single
> directory or filesystem, or something else?
>
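> One thing I still want to rule out is inode exhaustion on the
> backends; something like this on each storage node should show it:
>
>   df -i /glusterfs_storage
>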
>
> Any ideas on this issue?
>
>
>
> Our config is as follows:
>
> <snip>