[Gluster-users] Gluster not recognizing available space
Pat Haley
phaley at MIT.EDU
Tue Jan 21 16:54:56 UTC 2014
Hi Joe,
The peer status on all 3 showed the
proper connections. Doing the killall
and restart on all three bricks fixed
the N in the Online column. I then
did have to remount the gluster
filesystem on the client.
Unfortunately my original problem remains.
I'm still getting "no space left on device"
when I try to write even a small file to
the gluster filesystem.
What should I look at next?
Thanks.
Pat
> All three.
>
> On 01/21/2014 08:38 AM, Pat Haley wrote:
>>
>> Hi Joe,
>>
>> They do appear as connected from the first
>> brick, checking on the next 2. If they
>> all show the same, is the "killall glusterfsd"
>> command simply run from the first brick, or
>> will I need to try it on all 3 bricks, one
>> at a time?
>>
>> Thanks
>>
>> # gluster peer status
>> Number of Peers: 2
>>
>> Hostname: gluster-0-1
>> Uuid: 978e0f76-6474-4203-8617-ed5ad7d29239
>> State: Peer in Cluster (Connected)
>>
>> Hostname: gluster-0-0
>> Uuid: 3f73f5cc-39d8-4d9a-b442-033cb074b247
>> State: Peer in Cluster (Connected)
>>
>>> You got lucky. That process could have deleted your volume entirely.
> The volume configuration and state are stored in that directory path.
>>>
>>> Check gluster peer status on two servers and make sure they're all
>>> "Peer in Cluster (Connected)". If not, peer probe to make them so.
>>>
>>> If they are, "killall glusterfsd". Then restart glusterd. Since your
>>> volume isn't working anyway, this shouldn't really be a problem for
>>> your users.
>>>
>>> Check volume status. I expect them to all be "Y".
>>>
>>> If there are still problems at the client, try unmounting and
>>> mounting again.
>>>
>>> On 01/21/2014 07:42 AM, Pat Haley wrote:
>>>>
>>>> Hi,
>>>>
>>>> To try to clean things out more, I took
>>>> the following steps
>>>> 1) On gluster-0-0:
>>>> `gluster peer detach gluster-data`, if that fails, `gluster peer
>>>> detach gluster-data force`
>>>> 2) On gluster-data:
>>>> `rm -rf /var/lib/glusterd`
>>>> `service glusterd restart`
>>>> 3) Again on gluster-0-0:
>>>> `gluster peer probe gluster-data`
>>>> 4) `service glusterd restart` on each brick
>>>>
>>>> and repeated them for all 3 bricks (i.e. removing the
>>>> /var/lib/glusterd from all 3 bricks). I did this
>>>> one at a time and restarted the glusterd daemon each
>>>> time. Now all 3 bricks appear as N in the Online column
>>>> and I get a "no space left on device" error for even
>>>> a small file.
>>>>
>>>> Does any of this suggest what I should try/test
>>>> next?
>>>>
>>>> Thanks
>>>>
>>>> Status of volume: gdata
>>>> Gluster process                        Port   Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick gluster-0-0:/mseas-data-0-0      24010  N       15360
>>>> Brick gluster-0-1:/mseas-data-0-1      24010  N       21450
>>>> Brick gluster-data:/data               24010  N       16723
>>>> NFS Server on localhost                38467  Y       16728
>>>> NFS Server on gluster-0-1              38467  Y       21455
>>>> NFS Server on gluster-0-0              38467  Y       15365
>>>>
>>>>>
>>>>> Also, going back to an earlier Email,
>>>>> should I be concerned that in the output
>>>>> from "gluster volume status" the
>>>>> brick "gluster-data:/data" has an "N"
>>>>> in the "Online" column? Does this suggest
>>>>> an additional debugging route?
>>>>>
>>>>> gluster volume status
>>>>> Status of volume: gdata
>>>>> Gluster process                        Port   Online  Pid
>>>>> ------------------------------------------------------------------------------
>>>>> Brick gluster-0-0:/mseas-data-0-0      24009  Y       27006
>>>>> Brick gluster-0-1:/mseas-data-0-1      24009  Y       7063
>>>>> Brick gluster-data:/data               24010  N       15772
>>>>> NFS Server on localhost                38467  Y       14936
>>>>> NFS Server on gluster-data             38467  Y       15778
>>>>> NFS Server on gluster-0-1              38467  Y       21083
>>>>>
>>>>>>
>>>>>> First, another update on my test of writing
>>>>>> a directory with 480 6Mb files. Not only do
>>>>>> over 3/4 of the files appear, but they are
>>>>>> written on all 3 bricks. Again, it is random
>>>>>> which files are not written but what I seem
>>>>>> to see is that files are written to each brick
>>>>>> even after the failures. Does this suggest
>>>>>> anything else I should be looking at?
>>>>>>
>>>>>> As to Brian's suggestion, how exactly do I perform
>>>>>> a "quick inode allocation test"?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Pat
>>>>>>
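The write test described above (copy a directory of 480 6Mb files, note which fail) can be captured in a small script so the failure pattern is logged explicitly. A minimal sketch, assuming POSIX sh; DEST and N are placeholders (the real test wrote to the gluster mount), and the file size is shrunk from 6 MB to 6 KB so the sketch runs quickly:

```shell
#!/bin/sh
# Write N small files and log exactly which ones fail, so the failure
# pattern (random vs. the tail end of the run) is recorded.
# DEST and N are placeholders; point DEST at the gluster mount in practice.
DEST=${DEST:-$(mktemp -d)}
N=${N:-480}
fails=0
i=1
while [ "$i" -le "$N" ]; do
    # 6 MB in the real test; 6 KB here keeps the sketch fast
    if ! dd if=/dev/zero of="$DEST/file_$i.dat" bs=1024 count=6 2>/dev/null; then
        echo "write failed: file_$i.dat"
        fails=$((fails + 1))
    fi
    i=$((i + 1))
done
echo "$fails of $N writes failed"
```

On a healthy filesystem every write succeeds; on the affected volume the per-file log shows which fraction and which names fail across runs.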
>>>>>>> On 01/17/2014 07:48 PM, Pat Haley wrote:
>>>>>>>> Hi Franco,
>>>>>>>>
>>>>>>>> I checked using df -i on all 3 bricks. No brick is over
>>>>>>>> 1% inode usage.
>>>>>>>>
>>>>>>>
>>>>>>> It might be worth a quick inode allocation test on the fs for each
>>>>>>> brick, regardless. There are other non-obvious scenarios that can
>>>>>>> cause
>>>>>>> inode allocation to fail, at least on xfs (i.e., contiguous block
>>>>>>> allocation). Ideally, you'll have the ability to do this in a
>>>>>>> subdirectory outside the actual glusterfs brick.
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> Pat
>>>>>>>>
>>>>>>>>> Have you run out of inodes on the underlying filesystems?
>>>>>>>>>
>>>>>>>>> On 18 Jan 2014 05:41, Pat Haley <phaley at MIT.EDU> wrote:
>>>>>>>>>
>>>>>>>>> Latest updates:
>>>>>>>>>
>>>>>>>>> no error messages were found on the log files of the bricks.
>>>>>>>>>
>>>>>>>>> The error messages appear on the client log files. Writing
>>>>>>>>> from a second client also has the same errors.
>>>>>>>>>
>>>>>>>>> Note that if I try to write a directory with 480 6Mb files
>>>>>>>>> to /projects, over 3/4 of the files are written. It is
>>>>>>>>> random which files are not written (i.e. it is not the
>>>>>>>>> last 1/4 of the files which fail)
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Some additional data
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [root at mseas-data save]# gluster volume info
>>>>>>>>>>
>>>>>>>>>> Volume Name: gdata
>>>>>>>>>> Type: Distribute
>>>>>>>>>> Volume ID: eccc3a90-212d-4563-ae8d-10a77758738d
>>>>>>>>>> Status: Started
>>>>>>>>>> Number of Bricks: 3
>>>>>>>>>> Transport-type: tcp
>>>>>>>>>> Bricks:
>>>>>>>>>> Brick1: gluster-0-0:/mseas-data-0-0
>>>>>>>>>> Brick2: gluster-0-1:/mseas-data-0-1
>>>>>>>>>> Brick3: gluster-data:/data
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [root at mseas-data save]# gluster volume status
>>>>>>>>>> Status of volume: gdata
>>>>>>>>>> Gluster process                        Port   Online  Pid
>>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>>> Brick gluster-0-0:/mseas-data-0-0      24009  Y       27006
>>>>>>>>>> Brick gluster-0-1:/mseas-data-0-1      24009  Y       7063
>>>>>>>>>> Brick gluster-data:/data               24010  N       8007
>>>>>>>>>> NFS Server on localhost                38467  Y       8013
>>>>>>>>>> NFS Server on gluster-0-1              38467  Y       10228
>>>>>>>>>> NFS Server on 10.1.1.10                38467  Y       3867
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Noticing that the brick gluster-data:/data was appearing as N
>>>>>>>>>> in the "online" column, I tried (1) detaching gluster-data
>>>>>>>>>> (using
>>>>>>>>>> gluster peer detach gluster-data issued from gluster-0-0),
>>>>>>>>>> (2) removing
>>>>>>>>>> /var/lib/glusterd, (3) restarting glusterd on gluster-data,
>>>>>>>>>> (4) reattaching gluster-data (using gluster peer probe
>>>>>>>>>> gluster-data
>>>>>>>>>> issued from gluster-0-0), then (5) restarting glusterd one more
>>>>>>>>>> time on all
>>>>>>>>>> 3 bricks. The brick gluster-data:/data still appears as N in
>>>>>>>>>> the Online column.
>>>>>>>>>>
>>>>>>>>>> [root at mseas-data save]# gluster peer status
>>>>>>>>>> Number of Peers: 2
>>>>>>>>>>
>>>>>>>>>> Hostname: gluster-0-1
>>>>>>>>>> Uuid: 393fc4a6-1573-4564-971e-1b1aec434167
>>>>>>>>>> State: Peer in Cluster (Connected)
>>>>>>>>>>
>>>>>>>>>> Hostname: 10.1.1.10
>>>>>>>>>> Uuid: 3619440a-4ca3-4151-b62e-d4d6bf2e0c03
>>>>>>>>>> State: Peer in Cluster (Connected)
>>>>>>>>>>
>>>>>>>>>> (similarly from the other bricks)
>>>>>>>>>>
>>>>>>>>>> Ping works between all bricks too.
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> We are using gluster to present 3 bricks as a single name space.
>>>>>>>>>>> We appear to have a situation in which gluster thinks there
>>>>>>>>>>> is no disk space when there is actually plenty. I have restarted
>>>>>>>>>>> the glusterd daemons on all three bricks and I still get the
>>>>>>>>>>> following message
>>>>>>>>>>>
>>>>>>>>>>> /bin/cp: cannot create regular file
>>>>>>>>>>> `./Bottom_Gravity_Current_25/344.mat': No space left on device
>>>>>>>>>>>
>>>>>>>>>>> This is a 6Mbyte file. The total space available on
>>>>>>>>>>> gluster is 3.6T
>>>>>>>>>>>
>>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>>> mseas-data:/gdata 55T 51T 3.6T 94% /gdata
>>>>>>>>>>>
>>>>>>>>>>> Also, no single brick is full:
>>>>>>>>>>>
>>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>>> /dev/mapper/the_raid-lv_data
>>>>>>>>>>> 15T 14T 804G 95% /data
>>>>>>>>>>>
>>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>>> /dev/sdb1 21T 18T 2.1T 90% /mseas-data-0-0
>>>>>>>>>>>
>>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>>> /dev/sdb1 21T 20T 784G 97% /mseas-data-0-1
>>>>>>>>>>>
>>>>>>>>>>> What should we do to fix this problem or look at to diagnose
>>>>>>>>>>> this problem?
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>>>>>>>>>
>>>>>>>>>>> Pat Haley Email: phaley at mit.edu
>>>>>>>>>>> Center for Ocean Engineering Phone: (617) 253-6824
>>>>>>>>>>> Dept. of Mechanical Engineering Fax: (617) 253-8125
>>>>>>>>>>> MIT, Room 5-213 http://web.mit.edu/phaley/www/
>>>>>>>>>>> 77 Massachusetts Avenue
>>>>>>>>>>> Cambridge, MA 02139-4301
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley Email: phaley at mit.edu
Center for Ocean Engineering Phone: (617) 253-6824
Dept. of Mechanical Engineering Fax: (617) 253-8125
MIT, Room 5-213 http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA 02139-4301