[Gluster-users] No space left on device (when there is actually lots of free space)

Tue Apr 6 06:36:36 UTC 2010

On Tue, Apr 06, 2010 at 12:07:53PM +0800, Kali Hernandez wrote:
> In the middle of the copy process we get this error:
> 
> cp: cannot create regular file 
> `/mnt/gluster_new/videos/1251512-3CA86758640A31E7770EBC7629AEC10F.mpg': 
> No space left on device
[...]
> 
> When I look at the storage distribution, I can see this:
> 
> node 10    37G   14G   23G  38% /glusterfs_storage
> node 11    37G   14G   23G  37% /glusterfs_storage
> node 12    37G   14G   23G  37% /glusterfs_storage
> node 13    37G   14G   23G  37% /glusterfs_storage
> node 14    37G   13G   24G  36% /glusterfs_storage
> node 15    37G   13G   24G  36% /glusterfs_storage
> node 16    37G   13G   24G  35% /glusterfs_storage
> node 17    49G   12G   36G  26% /glusterfs_storage
> node 18    37G   12G   25G  33% /glusterfs_storage
> node 19    37G   12G   25G  33% /glusterfs_storage
> node 20    37G   14G   23G  38% /glusterfs_storage
> node 21    37G   14G   23G  37% /glusterfs_storage
> node 22    37G   14G   23G  37% /glusterfs_storage
> node 23    37G   14G   23G  37% /glusterfs_storage
> node 24    37G   13G   24G  36% /glusterfs_storage
> node 25    37G   13G   24G  36% /glusterfs_storage
> node 26    37G   13G   24G  35% /glusterfs_storage
> node 27    49G   12G   36G  26% /glusterfs_storage
> node 28    37G   12G   25G  33% /glusterfs_storage
> node 29    37G   12G   25G  33% /glusterfs_storage
> node 35    40G   40G     0 100% /glusterfs_storage
> node 36    40G   22G   18G  56% /glusterfs_storage
> node 37    40G   18G   22G  45% /glusterfs_storage
> node 38    40G   16G   24G  40% /glusterfs_storage
> node 39    40G   15G   25G  37% /glusterfs_storage
> node 45    40G   40G     0 100% /glusterfs_storage
> node 46    40G   22G   18G  56% /glusterfs_storage
> node 47    40G   18G   22G  45% /glusterfs_storage
> node 48    40G   16G   24G  40% /glusterfs_storage
> node 49    40G   15G   25G  37% /glusterfs_storage
> 
> (node mirror pairings are 10-19 paired to 20-29, and 35-39 to 45-49)
[...]
> So basically, I get out of space messages when there is around 340 Gb 
> free on the cluster.
> 
> 
> I tried using distribute translator instead of stripe, in fact that was 
> our first setup, but we thought maybe we are starting to copy a big file 
> (usually we store really big .tar.gz backups here) and it runs out of 
> space in the meanwhile, so we thought about using stripe, because 
> theoretically glusterfs would in that case move and copy the next block 
> of the file into another node. But in both cases (distribute and stripe) 
> we run into the same problems.
> 
> So I am wondering if this is a problem of a maximum number of files in a 
> same directory or filesystem or what?
> 
> Any ideas on this issue?

As you can see, nodes 35 and 45 (a mirror pair) are full. Go back to 2.0.9
and use the unify translator with load balancing.

Stripe needs free space on every subvolume. DHT (distribute) has a weak
point: it picks the subvolume from the hash of the file name, so it may
decide to put a file on a subvolume that is already full. Unify handled
such situations much better, but unfortunately it is no longer supported
in 3.x. You can still find it under the "legacy" directory tree, and it
even compiles, but it does not work.

Krzysztof
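
To make the difference concrete, here is a rough Python sketch. It is not
GlusterFS code. The free-space figures are copied from the df listing
quoted above, one entry per mirror pair. Everything else is an assumption
made for illustration: crc32 stands in for the real DHT hash (which is a
different function and works on per-directory hash ranges), and
pick_by_free_space is a hypothetical stand-in for a load-balancing
scheduler of the kind unify offered.

```python
import zlib

# Free space (GB) per mirror pair, copied from the df listing quoted above.
FREE_GB = {
    "10/20": 23, "11/21": 23, "12/22": 23, "13/23": 23, "14/24": 24,
    "15/25": 24, "16/26": 24, "17/27": 36, "18/28": 25, "19/29": 25,
    "35/45": 0,  "36/46": 18, "37/47": 22, "38/48": 24, "39/49": 25,
}


def pick_by_hash(filename, subvols):
    """Hash-style placement: the file name alone decides the target,
    free space is never consulted. crc32 is only a stand-in here."""
    names = sorted(subvols)
    return names[zlib.crc32(filename.encode()) % len(names)]


def pick_by_free_space(subvols):
    """Load-balancing placement (hypothetical): choose the subvolume
    that currently has the most free space."""
    return max(subvols, key=subvols.get)


def stripe_can_write(subvols, chunk_gb):
    """A striped file places a chunk on every subvolume, so each one
    must have room for its chunk."""
    return all(free >= chunk_gb for free in subvols.values())


if __name__ == "__main__":
    files = [f"video-{i}.mpg" for i in range(1500)]
    on_full = [f for f in files if FREE_GB[pick_by_hash(f, FREE_GB)] == 0]
    print(len(on_full), "of", len(files), "names hash to the full pair")
    print("free-space pick:", pick_by_free_space(FREE_GB))
    print("stripe can write 1 GB chunks:", stripe_can_write(FREE_GB, 1))
```

In this toy model any name that hashes to the full pair fails with "No
space left on device" regardless of how much space the other pairs have,
and a striped write fails as soon as a single pair has no room, which
matches the behaviour reported for both distribute and stripe.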