[Gluster-users] remote operation failed: No space left on device
Gerald Brandt
gbr at majentis.com
Fri Apr 27 11:51:53 UTC 2012
Hi,
1. What version of GlusterFS are you running?
2. Do an lsof | grep users98. Do you see a lot of files held open in the (deleted) state?
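The reason point 2 matters: on Linux, an unlinked file that a process still holds open keeps its blocks allocated until the last descriptor closes, which makes df report usage that du cannot see. A minimal sketch of the effect (the temp file here is illustrative, not one of your bricks):

```shell
# Hold a file open, unlink it, and observe that only closing the
# descriptor releases the space. Paths here are hypothetical.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1M count=8 2>/dev/null  # give the file some size
exec 3<"$f"                # keep a read descriptor open on it
rm "$f"                    # unlink: du stops counting it, df does not
ls -l /proc/$$/fd | grep '(deleted)'   # lsof reports the same state
exec 3<&-                  # close the fd: now the blocks are freed
```

On the bricks, lsof +D /users3 (or lsof | grep users3) against the glusterfsd processes should show any such (deleted) entries still pinning space.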
Gerald
----- Original Message -----
> From: "anthony garnier" <sokar6012 at hotmail.com>
> To: gluster-users at gluster.org
> Sent: Friday, April 27, 2012 6:41:43 AM
> Subject: [Gluster-users] remote operation failed: No space left on device
>
> Hi all,
>
>
> I've got an issue: it seems that the size reported by df -h grows
> indefinitely. Any help would be appreciated.
>
> Some details:
> On the client :
>
> yval9000:/users98 # df -h .
> Filesystem Size Used Avail Use% Mounted on
> ylal3510:/poolsave/yval9000
> 1.7T 1.7T 25G 99% /users98
>
> yval9000:/users98 # du -ch .
> 5.1G /users98
>
>
> My logs are full of :
> [2012-04-27 12:14:32.402972] I
> [client3_1-fops.c:683:client3_1_writev_cbk] 0-poolsave-client-1:
> remote operation failed: No space left on device
> [2012-04-27 12:14:32.426964] I
> [client3_1-fops.c:683:client3_1_writev_cbk] 0-poolsave-client-1:
> remote operation failed: No space left on device
> [2012-04-27 12:14:32.439424] I
> [client3_1-fops.c:683:client3_1_writev_cbk] 0-poolsave-client-1:
> remote operation failed: No space left on device
> [2012-04-27 12:14:32.441505] I
> [client3_1-fops.c:683:client3_1_writev_cbk] 0-poolsave-client-0:
> remote operation failed: No space left on device
>
>
>
> This is my volume config :
>
> Volume Name: poolsave
> Type: Distributed-Replicate
> Status: Started
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp
> Bricks:
> Brick1: ylal3510:/users3/poolsave
> Brick2: ylal3530:/users3/poolsave
> Brick3: ylal3520:/users3/poolsave
> Brick4: ylal3540:/users3/poolsave
> Options Reconfigured:
> nfs.enable-ino32: off
> features.quota-timeout: 30
> features.quota: off
> performance.cache-size: 6GB
> network.ping-timeout: 60
> performance.cache-min-file-size: 1KB
> performance.cache-max-file-size: 4GB
> performance.cache-refresh-timeout: 2
> nfs.port: 2049
> performance.io-thread-count: 64
> diagnostics.latency-measurement: on
> diagnostics.count-fop-hits: on
>
>
> Space left on servers :
>
> ylal3510:/users3 # df -h .
> Filesystem Size Used Avail Use% Mounted on
> /dev/mapper/users-users3vol
> 858G 857G 1.1G 100% /users3
> ylal3510:/users3 # du -ch /users3 | grep total
> 129G total
> ---
>
> ylal3530:/users3 # df -h .
> Filesystem Size Used Avail Use% Mounted on
> /dev/mapper/users-users3vol
> 858G 857G 1.1G 100% /users3
> ylal3530:/users3 # du -ch /users3 | grep total
> 129G total
> ---
>
> ylal3520:/users3 # df -h .
> Filesystem Size Used Avail Use% Mounted on
> /dev/mapper/users-users3vol
> 858G 835G 24G 98% /users3
> ylal3520:/users3 # du -ch /users3 | grep total
> 182G total
> ---
>
> ylal3540:/users3 # df -h .
> Filesystem Size Used Avail Use% Mounted on
> /dev/mapper/users-users3vol
> 858G 833G 25G 98% /users3
> ylal3540:/users3 # du -ch /users3 | grep total
> 181G total
>
>
> This issue appeared after the following two scripts had been running
> for two weeks.
>
> test_save.sh runs every hour: it compresses a batch of data into
> REP_SAVE_TEMP, then moves the archive into a folder (REP_SAVE) that
> the netback.sh script scans every 30 minutes.
>
> #!/usr/bin/ksh
> # ________________________________________________________________________
> #             |
> # Name        | test_save.sh
> # ____________|___________________________________________________________
> #             |
> # Description | GlusterFS test
> # ____________|___________________________________________________________
>
> UNIXSAVE=/users98/test
> REP_SAVE_TEMP=${UNIXSAVE}/tmp
> REP_SAVE=${UNIXSAVE}/gluster
> LOG=/users/glusterfs_test
>
>
> f_tar_mv()
> {
> echo "\n"
> ARCHNAME=${REP_SAVE_TEMP}/`date +%d-%m-%H-%M`_${SUBNAME}.tar
>
> tar -cpvf ${ARCHNAME} ${REPERTOIRE}
>
> echo "creation of ${ARCHNAME}"
>
>
> # mv ${REP_SAVE_TEMP}/*_${SUBNAME}.tar ${REP_SAVE}
> mv ${REP_SAVE_TEMP}/* ${REP_SAVE}
> RC=$?   # capture mv's status before the echos overwrite $?
> echo "Moving archive into ${REP_SAVE}"
> echo "\n"
>
> return $RC
> }
>
> REPERTOIRE="/users2/"
> SUBNAME="test_glusterfs_save"
> f_tar_mv >$LOG/save_`date +%d-%m-%Y-%H-%M`.log 2>&1
>
>
> #!/usr/bin/ksh
> # ________________________________________________________________________
> #             |
> # Name        | netback.sh
> # ____________|___________________________________________________________
> #             |
> # Description | GlusterFS backup test
> # ____________|___________________________________________________________
>
> UNIXSAVE=/users98/test
> REP_SAVE_TEMP=${UNIXSAVE}/tmp
> REP_SAVE=${UNIXSAVE}/gluster
> LOG=/users/glusterfs_test
>
> f_net_back()
> {
> if [[ `find ${REP_SAVE} -type f | wc -l` -eq 0 ]]
> then
> echo "nothing to save";
> else
> echo "Simulation netbackup, tar in /dev/null"
> tar -cpvf /dev/null ${REP_SAVE}/*
> echo "deletion archive"
> rm ${REP_SAVE}/*
>
> fi
> return $?
> }
>
> f_net_back >${LOG}/netback_`date +%d-%m-%H-%M`.log 2>&1
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>