[Gluster-users] Erroneous "No space left on device." messages
Pat Haley
phaley at mit.edu
Wed Mar 11 14:27:58 UTC 2020
Hi,
I was able to reset cluster.min-free-disk successfully. That only made
the "No space left on device" problem intermittent instead of constant.
I then looked at the brick log files again and noticed "No space ..."
errors recorded for files that I knew nobody was accessing. gluster
volume status was also reporting an ongoing rebalance (but not with the
same ID as the one I started on Monday). I stopped the rebalance, and I
no longer seem to be getting the "No space left on device" messages.
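For reference, the threshold arithmetic from the quoted discussion below can be sketched in a few lines of shell. This is only a rough illustration: below_min_free is a made-up helper name, the sizes are the rounded "df -h" figures quoted below converted at 1T = 1000G (the same rounding used in the quoted reply), and it assumes the percentage is compared against each brick's own free space.

```shell
# Hypothetical helper (not part of Gluster): compare a brick's free space
# with a min-free-disk style percentage.
# Arguments: total size in G, available space in G, threshold in percent.
below_min_free() {
    awk -v total="$1" -v avail="$2" -v pct="$3" 'BEGIN {
        need = total * pct / 100
        printf "need %.0fG free, have %.0fG: %s\n", need, avail,
               (avail < need ? "below threshold" : "ok")
    }'
}

# Rounded figures from the df -h output quoted below (assumed 1T = 1000G).
below_min_free 164000 324 10   # brick1 at the reported 10% setting
below_min_free 164000 324 1    # brick1 at the 1% value mentioned in the thread
below_min_free 91000 3400 1    # brick4 at 1%
```

With these rounded figures, brick1, brick2 and brick3 would still be below a 1% threshold, while brick4 would be above it.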
However, I now have a new, curious issue. At least one file that I
created after resetting cluster.min-free-disk but before stopping the
rebalance does not show up in a plain "ls" of its directory, but does
show up if I explicitly ls that file (example below; the file in
question is PeManJob). This semi-missing file is located on brick1 (one
of the 2 that were giving the "No space left on device" messages). How
do I fix this new issue?
Thanks
Pat
mseas(DSMccfzR75deg_001b)% ls
at_pe_job pe_nrg.nc
check_times_job pe_out.nc
HoldJob pe_PBI.in
oi_3hr.dat PePbiJob
PE_Data_Comparison_glider_all_smalldom.m pe_PB.in
PE_Data_Comparison_glider_sp011_smalldom.m pe_PB.log
PE_Data_Comparison_glider_sp064_smalldom.m pe_PB_short.in
PeManJob.log PlotJob
mseas(DSMccfzR75deg_001b)% ls PeManJob
PeManJob
mseas(DSMccfzR75deg_001b)% ls PeManJob*
PeManJob.log
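One possible line of investigation, offered only as an assumption and not confirmed anywhere in this thread: a distributed volume normally hides its internal link files from directory listings, and those are recognisable on the brick by having only the sticky bit set (shown by "ls -l" as ---------T) together with a trusted.glusterfs.dht.linkto extended attribute. If an interrupted rebalance left that mode or attribute on the file, it might be hidden from "ls" in the same way. A minimal sketch of a mode check, where looks_like_dht_linkfile is a made-up helper name and GNU stat is assumed:

```shell
# Hypothetical helper (not a Gluster tool): succeed if a file on the brick's
# backend filesystem has only the sticky bit set, which is the mode used for
# DHT link files.
looks_like_dht_linkfile() {
    # GNU stat: %a prints the permission bits in octal, e.g. 1000 or 644
    [ "$(stat -c '%a' -- "$1" 2>/dev/null)" = "1000" ]
}

# Intended use, on the server and against the brick path rather than the
# client mount. <dir> stands for the directory's path under the brick, which
# is not given in this message:
#   looks_like_dht_linkfile /mnt/brick1/<dir>/PeManJob && echo "link-file mode"
#   getfattr -d -m . -e hex /mnt/brick1/<dir>/PeManJob
```

A genuine link file is normally empty, so a file with real data that matches this check would be worth reporting to the list along with its getfattr output.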
On 3/10/20 8:18 PM, Strahil Nikolov wrote:
> On March 10, 2020 9:47:49 PM GMT+02:00, Pat Haley <phaley at mit.edu> wrote:
>> Hi,
>>
>> If I understand this, to remove the "No space left on device" error I
>> either have to clear up 10% space on each brick, or clean-up a lesser
>> amount and reset cluster.min-free. Is this correct?
>>
>> I have found the following command for resetting the cluster.min-free
>>
>> gluster volume set <volume> cluster.min-free-disk <value>
>>
>> Can this be done while the volume is live? Does the <value> need to be
>> an integer?
>>
>> Thanks
>>
>> Pat
>>
>>
>> On 3/10/20 2:45 PM, Pat Haley wrote:
>>> Hi,
>>>
>>> I get the following
>>>
>>> [root at mseas-data2 bricks]# gluster volume get data-volume all | grep cluster.min-free
>>> cluster.min-free-disk 10%
>>> cluster.min-free-inodes 5%
>>>
>>>
>>> On 3/10/20 2:34 PM, Strahil Nikolov wrote:
>>>> On March 10, 2020 8:14:41 PM GMT+02:00, Pat Haley <phaley at mit.edu>
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> After some more poking around in the logs (specifically the brick
>>>>> logs):
>>>>> * brick1 & brick2 have both been recording "No space left on
>>>>> device" messages today (as recently as 15 minutes ago)
>>>>> * brick3 last recorded a "No space left on device" message last
>>>>> night around 10:30pm
>>>>> * brick4 has no such messages in its log file
>>>>>
>>>>> Note brick1 & brick2 are on one server, brick3 and brick4 are on
>>>>> the second server.
>>>>>
>>>>> Pat
>>>>>
>>>>>
>>>>> On 3/10/20 11:51 AM, Pat Haley wrote:
>>>>>> Hi,
>>>>>>
>>>>>> We have developed a problem with Gluster reporting "No space left
>>>>>> on device." even though "df" of both the gluster filesystem and the
>>>>>> underlying bricks shows space available (details below). Our inode
>>>>>> usage is between 1-3%. We are running gluster 3.7.11 in a
>>>>>> distributed volume across 2 servers (2 bricks each). We have
>>>>>> followed the thread
>>>>>> https://lists.gluster.org/pipermail/gluster-users/2020-March/037821.html
>>>>>> but haven't found a solution yet.
>>>>>>
>>>>>> Last night we ran a rebalance which appeared successful (and have
>>>>>> since cleared up some more space which seems to have mainly been
>>>>>> on one brick). There were intermittent erroneous "No space..."
>>>>>> messages last night, but they have become much more frequent today.
>>>>>>
>>>>>> Any help would be greatly appreciated.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> ---------------------------
>>>>>> [root at mseas-data2 ~]# df -h
>>>>>> ---------------------------
>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>> /dev/sdb 164T 164T 324G 100% /mnt/brick2
>>>>>> /dev/sda 164T 164T 323G 100% /mnt/brick1
>>>>>> ---------------------------
>>>>>> [root at mseas-data2 ~]# df -i
>>>>>> ---------------------------
>>>>>> Filesystem Inodes IUsed IFree IUse% Mounted on
>>>>>> /dev/sdb 1375470800 31207165 1344263635 3% /mnt/brick2
>>>>>> /dev/sda 1384781520 28706614 1356074906 3% /mnt/brick1
>>>>>>
>>>>>> ---------------------------
>>>>>> [root at mseas-data3 ~]# df -h
>>>>>> ---------------------------
>>>>>> /dev/sda 91T 91T 323G 100% /export/sda/brick3
>>>>>> /dev/mapper/vg_Data4-lv_Data4 91T 88T 3.4T 97% /export/sdc/brick4
>>>>>> ---------------------------
>>>>>> [root at mseas-data3 ~]# df -i
>>>>>> ---------------------------
>>>>>> /dev/sda 679323496 9822199 669501297 2% /export/sda/brick3
>>>>>> /dev/mapper/vg_Data4-lv_Data4 3906272768 11467484 3894805284 1% /export/sdc/brick4
>>>>>>
>>>>>>
>>>>>>
>>>>>> ---------------------------------------
>>>>>> [root at mseas-data2 ~]# gluster --version
>>>>>> ---------------------------------------
>>>>>> glusterfs 3.7.11 built on Apr 27 2016 14:09:22
>>>>>> Repository revision: git://git.gluster.com/glusterfs.git
>>>>>> Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
>>>>>> GlusterFS comes with ABSOLUTELY NO WARRANTY.
>>>>>> You may redistribute copies of GlusterFS under the terms of the
>>>>>> GNU General Public License.
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----------------------------------------
>>>>>> [root at mseas-data2 ~]# gluster volume info
>>>>>> -----------------------------------------
>>>>>> Volume Name: data-volume
>>>>>> Type: Distribute
>>>>>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>>>>> Status: Started
>>>>>> Number of Bricks: 4
>>>>>> Transport-type: tcp
>>>>>> Bricks:
>>>>>> Brick1: mseas-data2:/mnt/brick1
>>>>>> Brick2: mseas-data2:/mnt/brick2
>>>>>> Brick3: mseas-data3:/export/sda/brick3
>>>>>> Brick4: mseas-data3:/export/sdc/brick4
>>>>>> Options Reconfigured:
>>>>>> nfs.export-volumes: off
>>>>>> nfs.disable: on
>>>>>> performance.readdir-ahead: on
>>>>>> diagnostics.brick-sys-log-level: WARNING
>>>>>> nfs.exports-auth-enable: on
>>>>>> server.allow-insecure: on
>>>>>> auth.allow: *
>>>>>> disperse.eager-lock: off
>>>>>> performance.open-behind: off
>>>>>> performance.md-cache-timeout: 60
>>>>>> network.inode-lru-limit: 50000
>>>>>> diagnostics.client-log-level: ERROR
>>>>>>
>>>>>>
>>>>>>
>>>>>> --------------------------------------------------------------
>>>>>> [root at mseas-data2 ~]# gluster volume status data-volume detail
>>>>>> --------------------------------------------------------------
>>>>>> Status of volume: data-volume
>>>>>> ------------------------------------------------------------------------------
>>>>>> Brick : Brick mseas-data2:/mnt/brick1
>>>>>> TCP Port : 49154
>>>>>> RDMA Port : 0
>>>>>> Online : Y
>>>>>> Pid : 4601
>>>>>> File System : xfs
>>>>>> Device : /dev/sda
>>>>>> Mount Options : rw
>>>>>> Inode Size : 256
>>>>>> Disk Space Free : 318.8GB
>>>>>> Total Disk Space : 163.7TB
>>>>>> Inode Count : 1365878288
>>>>>> Free Inodes : 1337173596
>>>>>> ------------------------------------------------------------------------------
>>>>>> Brick : Brick mseas-data2:/mnt/brick2
>>>>>> TCP Port : 49155
>>>>>> RDMA Port : 0
>>>>>> Online : Y
>>>>>> Pid : 7949
>>>>>> File System : xfs
>>>>>> Device : /dev/sdb
>>>>>> Mount Options : rw
>>>>>> Inode Size : 256
>>>>>> Disk Space Free : 319.8GB
>>>>>> Total Disk Space : 163.7TB
>>>>>> Inode Count : 1372421408
>>>>>> Free Inodes : 1341219039
>>>>>> ------------------------------------------------------------------------------
>>>>>> Brick : Brick mseas-data3:/export/sda/brick3
>>>>>> TCP Port : 49153
>>>>>> RDMA Port : 0
>>>>>> Online : Y
>>>>>> Pid : 4650
>>>>>> File System : xfs
>>>>>> Device : /dev/sda
>>>>>> Mount Options : rw
>>>>>> Inode Size : 512
>>>>>> Disk Space Free : 325.3GB
>>>>>> Total Disk Space : 91.0TB
>>>>>> Inode Count : 692001992
>>>>>> Free Inodes : 682188893
>>>>>> ------------------------------------------------------------------------------
>>>>>> Brick : Brick mseas-data3:/export/sdc/brick4
>>>>>> TCP Port : 49154
>>>>>> RDMA Port : 0
>>>>>> Online : Y
>>>>>> Pid : 23772
>>>>>> File System : xfs
>>>>>> Device : /dev/mapper/vg_Data4-lv_Data4
>>>>>> Mount Options : rw
>>>>>> Inode Size : 256
>>>>>> Disk Space Free : 3.4TB
>>>>>> Total Disk Space : 90.9TB
>>>>>> Inode Count : 3906272768
>>>>>> Free Inodes : 3894809903
>>>>>>
>>>> Hi Pat,
>>>>
>>>> What is the output of:
>>>> gluster volume get data-volume all | grep cluster.min-free
>>>>
>>>> 1% of 164T is 1640G, but in your case you have only 324G, which is
>>>> way lower.
>>>>
>>>> Best Regards,
>>>> Strahil Nikolov
> Hey Pat,
>
> Some users have reported that they are using a value of 1% and it seems to be working.
>
> Most probably you will be able to do it live, but I have never had to change that. You can give it a try on a test cluster.
>
> Best Regards,
> Strahil Nikolov
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley Email: phaley at mit.edu
Center for Ocean Engineering Phone: (617) 253-6824
Dept. of Mechanical Engineering Fax: (617) 253-8125
MIT, Room 5-213 http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA 02139-4301