[Gluster-users] Gluster not recognizing available space
Pat Haley
phaley at MIT.EDU
Tue Jan 21 16:54:56 UTC 2014
Hi Joe,
The peer status on all 3 showed the
proper connections. Doing the killall
and restart on all three bricks fixed
the N in the Online column. I then
did have to remount the gluster
filesystem on the client.
Unfortunately my original problem remains.
I'm still getting "no space left on device"
when I try to write even a small file to
the gluster filesystem.
What should I look at next?
Thanks.
Pat
> All three.
>
> On 01/21/2014 08:38 AM, Pat Haley wrote:
>>
>> Hi Joe,
>>
>> They do appear as connected from the first
>> brick, checking on the next 2. If they
>> all show the same, is the "killall glusterfsd"
>> command simply run from the first brick, or
>> will I need to try it on all 3 bricks, one
>> at a time?
>>
>> Thanks
>>
>> # gluster peer status
>> Number of Peers: 2
>>
>> Hostname: gluster-0-1
>> Uuid: 978e0f76-6474-4203-8617-ed5ad7d29239
>> State: Peer in Cluster (Connected)
>>
>> Hostname: gluster-0-0
>> Uuid: 3f73f5cc-39d8-4d9a-b442-033cb074b247
>> State: Peer in Cluster (Connected)
>>
>>> You got lucky. That process could have deleted your volume entirely.
> The volume configuration and state are stored in that directory path.
>>>
>>> Check gluster peer status on two servers and make sure they're all
>>> "Peer in Cluster (Connected)". If not, peer probe to make them so.
>>>
>>> If they are, "killall glusterfsd". Then restart glusterd. Since your
>>> volume isn't working anyway, this shouldn't really be a problem for
>>> your users.
>>>
>>> Check volume status. I expect them to all be "Y".
>>>
>>> If there are still problems at the client, try unmounting and
>>> mounting again.
>>>
>>> On 01/21/2014 07:42 AM, Pat Haley wrote:
>>>>
>>>> Hi,
>>>>
>>>> To try to clean things out more, I took
>>>> the following steps
>>>> 1) On gluster-0-0:
>>>> `gluster peer detach gluster-data`, if that fails, `gluster peer
>>>> detach gluster-data force`
>>>> 2) On gluster-data:
>>>> `rm -rf /var/lib/glusterd`
>>>> `service glusterd restart`
>>>> 3) Again on gluster-0-0:
>>>> `gluster peer probe gluster-data`
>>>> 4) `service glusterd restart` on each brick
>>>>
>>>> and repeated them for all 3 bricks (i.e. removing the
>>>> /var/lib/glusterd from all 3 bricks). I did this
>>>> one at a time and restarted the glusterd daemon each
>>>> time. Now all 3 bricks appear as N in the Online column
>>>> and I get a "no space left on device" error for even
>>>> a small file.
>>>>
>>>> Does any of this suggest what I should try/test
>>>> next?
>>>>
>>>> Thanks
>>>>
>>>> Status of volume: gdata
>>>> Gluster process                        Port   Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick gluster-0-0:/mseas-data-0-0      24010  N       15360
>>>> Brick gluster-0-1:/mseas-data-0-1      24010  N       21450
>>>> Brick gluster-data:/data               24010  N       16723
>>>> NFS Server on localhost                38467  Y       16728
>>>> NFS Server on gluster-0-1              38467  Y       21455
>>>> NFS Server on gluster-0-0              38467  Y       15365
>>>>
>>>>>
>>>>> Also, going back to an earlier Email,
>>>>> should I be concerned that in the output
>>>>> from "gluster volume status" the
>>>>> brick "gluster-data:/data" has an "N"
>>>>> in the "Online" column? Does this suggest
>>>>> an additional debugging route?
>>>>>
>>>>> gluster volume status
>>>>> Status of volume: gdata
>>>>> Gluster process                        Port   Online  Pid
>>>>> ------------------------------------------------------------------------------
>>>>> Brick gluster-0-0:/mseas-data-0-0      24009  Y       27006
>>>>> Brick gluster-0-1:/mseas-data-0-1      24009  Y       7063
>>>>> Brick gluster-data:/data               24010  N       15772
>>>>> NFS Server on localhost                38467  Y       14936
>>>>> NFS Server on gluster-data             38467  Y       15778
>>>>> NFS Server on gluster-0-1              38467  Y       21083
>>>>>
>>>>>>
>>>>>> First, another update on my test of writing
>>>>>> a directory with 480 6Mb files. Not only do
>>>>>> over 3/4 of the files appear, but they are
>>>>>> written on all 3 bricks. Again, it is random
>>>>>> which files are not written but what I seem
>>>>>> to see is that files are written to each brick
>>>>>> even after the failures. Does this suggest
>>>>>> anything else I should be looking at?
>>>>>>
>>>>>> As to Brian's suggestion, how exactly do I perform
>>>>>> a "quick inode allocation test"?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Pat
>>>>>>
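The write test described above (copy a directory of 480 6Mb files, note which fail) can be captured in a small script so the failure pattern is logged explicitly. A minimal sketch, assuming POSIX sh; DEST and N are placeholders (the real test wrote to the gluster mount), and the file size is shrunk from 6 MB to 6 KB so the sketch runs quickly:

```shell
#!/bin/sh
# Write N small files and log exactly which ones fail, so the failure
# pattern (random vs. the tail end of the run) is recorded.
# DEST and N are placeholders; point DEST at the gluster mount in practice.
DEST=${DEST:-$(mktemp -d)}
N=${N:-480}
fails=0
i=1
while [ "$i" -le "$N" ]; do
    # 6 MB in the real test; 6 KB here keeps the sketch fast
    if ! dd if=/dev/zero of="$DEST/file_$i.dat" bs=1024 count=6 2>/dev/null; then
        echo "write failed: file_$i.dat"
        fails=$((fails + 1))
    fi
    i=$((i + 1))
done
echo "$fails of $N writes failed"
```

On a healthy filesystem every write succeeds; on the affected volume the per-file log shows which fraction and which names fail across runs.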
>>>>>>> On 01/17/2014 07:48 PM, Pat Haley wrote:
>>>>>>>> Hi Franco,
>>>>>>>>
>>>>>>>> I checked using df -i on all 3 bricks. No brick is over
>>>>>>>> 1% inode usage.
>>>>>>>>
>>>>>>>
>>>>>>> It might be worth a quick inode allocation test on the fs for each
>>>>>>> brick, regardless. There are other non-obvious scenarios that can
>>>>>>> cause
>>>>>>> inode allocation to fail, at least on xfs (i.e., contiguous block
>>>>>>> allocation). Ideally, you'll have the ability to do this in a
>>>>>>> subdirectory outside the actual glusterfs brick.
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> Pat
>>>>>>>>
>>>>>>>>> Have you run out of inodes on the underlying filesystems?
>>>>>>>>>
>>>>>>>>> On 18 Jan 2014 05:41, Pat Haley <phaley at MIT.EDU> wrote:
>>>>>>>>>
>>>>>>>>> Latest updates:
>>>>>>>>>
>>>>>>>>> no error messages were found on the log files of the bricks.
>>>>>>>>>
>>>>>>>>> The error messages appear on the client log files. Writing
>>>>>>>>> from a second client also has the same errors.
>>>>>>>>>
>>>>>>>>> Note that if I try to write a directory with 480 6Mb files
>>>>>>>>> to /projects, over 3/4 of the files are written. It is
>>>>>>>>> random which files are not written (i.e. it is not the
>>>>>>>>> last 1/4 of the files which fail)
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Some additional data
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [root at mseas-data save]# gluster volume info
>>>>>>>>>>
>>>>>>>>>> Volume Name: gdata
>>>>>>>>>> Type: Distribute
>>>>>>>>>> Volume ID: eccc3a90-212d-4563-ae8d-10a77758738d
>>>>>>>>>> Status: Started
>>>>>>>>>> Number of Bricks: 3
>>>>>>>>>> Transport-type: tcp
>>>>>>>>>> Bricks:
>>>>>>>>>> Brick1: gluster-0-0:/mseas-data-0-0
>>>>>>>>>> Brick2: gluster-0-1:/mseas-data-0-1
>>>>>>>>>> Brick3: gluster-data:/data
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [root at mseas-data save]# gluster volume status
>>>>>>>>>> Status of volume: gdata
>>>>>>>>>> Gluster process                        Port   Online  Pid
>>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>>> Brick gluster-0-0:/mseas-data-0-0      24009  Y       27006
>>>>>>>>>> Brick gluster-0-1:/mseas-data-0-1      24009  Y       7063
>>>>>>>>>> Brick gluster-data:/data               24010  N       8007
>>>>>>>>>> NFS Server on localhost                38467  Y       8013
>>>>>>>>>> NFS Server on gluster-0-1              38467  Y       10228
>>>>>>>>>> NFS Server on 10.1.1.10                38467  Y       3867
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Noticing that the brick gluster-data:/data was appearing as N
>>>>>>>>>> in the "online" column, I tried (1) detaching gluster-data
>>>>>>>>>> (using
>>>>>>>>>> gluster peer detach gluster-data issued from gluster-0-0),
>>>>>>>>>> (2) removing
>>>>>>>>>> /var/lib/glusterd, (3) restarting glusterd on gluster-data,
>>>>>>>>>> (4) reattaching gluster-data (using gluster peer probe
>>>>>>>>>> gluster-data
>>>>>>>>>> issued from gluster-0-0), then (5) restarting glusterd one more
>>>>>>>>>> time on all
>>>>>>>>>> 3 bricks. The brick gluster-data:/data still appears as N in
>>>>>>>>>> the Online column.
>>>>>>>>>>
>>>>>>>>>> [root at mseas-data save]# gluster peer status
>>>>>>>>>> Number of Peers: 2
>>>>>>>>>>
>>>>>>>>>> Hostname: gluster-0-1
>>>>>>>>>> Uuid: 393fc4a6-1573-4564-971e-1b1aec434167
>>>>>>>>>> State: Peer in Cluster (Connected)
>>>>>>>>>>
>>>>>>>>>> Hostname: 10.1.1.10
>>>>>>>>>> Uuid: 3619440a-4ca3-4151-b62e-d4d6bf2e0c03
>>>>>>>>>> State: Peer in Cluster (Connected)
>>>>>>>>>>
>>>>>>>>>> (similarly from the other bricks)
>>>>>>>>>>
>>>>>>>>>> Ping works between all bricks too.
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> We are using gluster to present 3 bricks as a single name space.
>>>>>>>>>>> We appear to have a situation in which gluster thinks there
>>>>>>>>>>> is no disk space when there is actually plenty. I have restarted
>>>>>>>>>>> the glusterd daemons on all three bricks and I still get the
>>>>>>>>>>> following message
>>>>>>>>>>>
>>>>>>>>>>> /bin/cp: cannot create regular file
>>>>>>>>>>> `./Bottom_Gravity_Current_25/344.mat': No space left on device
>>>>>>>>>>>
>>>>>>>>>>> This is a 6Mbyte file. The total space available on
>>>>>>>>>>> gluster is 3.6T
>>>>>>>>>>>
>>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>>> mseas-data:/gdata 55T 51T 3.6T 94% /gdata
>>>>>>>>>>>
>>>>>>>>>>> Also, no single brick is full:
>>>>>>>>>>>
>>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>>> /dev/mapper/the_raid-lv_data
>>>>>>>>>>> 15T 14T 804G 95% /data
>>>>>>>>>>>
>>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>>> /dev/sdb1 21T 18T 2.1T 90% /mseas-data-0-0
>>>>>>>>>>>
>>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>>> /dev/sdb1 21T 20T 784G 97% /mseas-data-0-1
>>>>>>>>>>>
>>>>>>>>>>> What should we do to fix this problem or look at to diagnose
>>>>>>>>>>> this problem?
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>>>>>>>>>
>>>>>>>>>>> Pat Haley Email: phaley at mit.edu
>>>>>>>>>>> Center for Ocean Engineering Phone: (617) 253-6824
>>>>>>>>>>> Dept. of Mechanical Engineering Fax: (617) 253-8125
>>>>>>>>>>> MIT, Room 5-213 http://web.mit.edu/phaley/www/
>>>>>>>>>>> 77 Massachusetts Avenue
>>>>>>>>>>> Cambridge, MA 02139-4301
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley Email: phaley at mit.edu
Center for Ocean Engineering Phone: (617) 253-6824
Dept. of Mechanical Engineering Fax: (617) 253-8125
MIT, Room 5-213 http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA 02139-4301