[Gluster-users] Gluster not recognizing available space
Joe Julian
joe at julianfamily.org
Tue Jan 21 16:46:55 UTC 2014
All three.
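
For example, on each of the three servers in turn, something like this
should do it (just a sketch; the service and volume names are the ones
already used in this thread, so adjust for your init system if needed):

    killall glusterfsd
    service glusterd restart
    gluster volume status

killall stops the brick processes and restarting glusterd respawns them;
after that the bricks should show "Y" in the Online column again.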
On 01/21/2014 08:38 AM, Pat Haley wrote:
>
> Hi Joe,
>
> They do appear as connected from the first
> brick, checking on the next 2. If they
> all show the same, is the "killall glusterfsd"
> command simply run from the first brick, or
> will I need to try it on all 3 bricks, one
> at a time?
>
> Thanks
>
> # gluster peer status
> Number of Peers: 2
>
> Hostname: gluster-0-1
> Uuid: 978e0f76-6474-4203-8617-ed5ad7d29239
> State: Peer in Cluster (Connected)
>
> Hostname: gluster-0-0
> Uuid: 3f73f5cc-39d8-4d9a-b442-033cb074b247
> State: Peer in Cluster (Connected)
>
>> You got lucky. That process could have deleted your volume entirely.
>> The volume configuration and state are stored in that directory path.
>>
>> Check gluster peer status on two servers and make sure they're all
>> "Peer in Cluster (Connected)". If not, peer probe to make them so.
>>
>> If they are, "killall glusterfsd". Then restart glusterd. Since your
>> volume isn't working anyway, this shouldn't really be a problem for
>> your users.
>>
>> Check volume status. I expect them all to show "Y" in the Online column.
>>
>> If there are still problems at the client, try unmounting and
>> mounting again.
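>>
>> For example (a sketch, assuming the native glusterfs client mount shown
>> in your df output; adjust the mount point if yours differs):
>>
>>     umount /gdata
>>     mount -t glusterfs mseas-data:/gdata /gdata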
>>
>> On 01/21/2014 07:42 AM, Pat Haley wrote:
>>>
>>> Hi,
>>>
>>> To try to clean things out more, I took
>>> the following steps:
>>> 1) On gluster-0-0:
>>> `gluster peer detach gluster-data`, if that fails, `gluster peer
>>> detach gluster-data force`
>>> 2) On gluster-data:
>>> `rm -rf /var/lib/glusterd`
>>> `service glusterd restart`
>>> 3) Again on gluster-0-0:
>>> 'gluster peer probe gluster-data'
>>> 4) `service glusterd restart` on each brick
>>>
>>> and repeated them for all 3 bricks (i.e. removing the
>>> /var/lib/glusterd from all 3 bricks). I did this
>>> one at a time and restarted the glusterd daemon each
>>> time. Now all 3 bricks appear as N in the Online column
>>> and I get a "no space left on device" error for even
>>> a small file.
>>>
>>> Does any of this suggest what I should try/test
>>> next?
>>>
>>> Thanks
>>>
>>> Status of volume: gdata
>>> Gluster process                          Port   Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick gluster-0-0:/mseas-data-0-0        24010  N       15360
>>> Brick gluster-0-1:/mseas-data-0-1        24010  N       21450
>>> Brick gluster-data:/data                 24010  N       16723
>>> NFS Server on localhost                  38467  Y       16728
>>> NFS Server on gluster-0-1                38467  Y       21455
>>> NFS Server on gluster-0-0                38467  Y       15365
>>>
>>>>
>>>> Also, going back to an earlier Email,
>>>> should I be concerned that in the output
>>>> from "gluster volume status" the
>>>> brick "gluster-data:/data" has an "N"
>>>> in the "Online" column? Does this suggest
>>>> an additional debugging route?
>>>>
>>>> gluster volume status
>>>> Status of volume: gdata
>>>> Gluster process                          Port   Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick gluster-0-0:/mseas-data-0-0        24009  Y       27006
>>>> Brick gluster-0-1:/mseas-data-0-1        24009  Y       7063
>>>> Brick gluster-data:/data                 24010  N       15772
>>>> NFS Server on localhost                  38467  Y       14936
>>>> NFS Server on gluster-data               38467  Y       15778
>>>> NFS Server on gluster-0-1                38467  Y       21083
>>>>
>>>>>
>>>>> First, another update on my test of writing
>>>>> a directory with 480 6Mb files. Not only do
>>>>> over 3/4 of the files appear, but they are
>>>>> written on all 3 bricks. Again, it is random
>>>>> which files are not written but what I seem
>>>>> to see is that files are written to each brick
>>>>> even after the failures. Does this suggest
>>>>> anything else I should be looking at?
>>>>>
>>>>> As to Brian's suggestion, how exactly do I perform
>>>>> a "quick inode allocation test"?
>>>>>
>>>>> Thanks
>>>>>
>>>>> Pat
>>>>>
>>>>>> On 01/17/2014 07:48 PM, Pat Haley wrote:
>>>>>>> Hi Franco,
>>>>>>>
>>>>>>> I checked using df -i on all 3 bricks. No brick is over
>>>>>>> 1% inode usage.
>>>>>>>
>>>>>>
>>>>>> It might be worth a quick inode allocation test on the fs for each
>>>>>> brick, regardless. There are other non-obvious scenarios that can
>>>>>> cause
>>>>>> inode allocation to fail, at least on xfs (e.g., contiguous block
>>>>>> allocation). Ideally, you'll have the ability to do this in a
>>>>>> subdirectory outside the actual glusterfs brick.
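>>>>>>
>>>>>> For example, something like this makes a quick check (a rough sketch;
>>>>>> the test path is illustrative, and ideally it would sit on the same
>>>>>> filesystem but outside the brick directory, as noted above):
>>>>>>
>>>>>>     mkdir /data/inode-test
>>>>>>     for i in $(seq 1 100000); do touch /data/inode-test/f$i; done
>>>>>>     df -i /data
>>>>>>     rm -rf /data/inode-test
>>>>>>
>>>>>> If the touches start failing with "No space left on device" while
>>>>>> df -i still shows free inodes, the brick filesystem itself is the
>>>>>> likely culprit.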
>>>>>>
>>>>>> Brian
>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Pat
>>>>>>>
>>>>>>>> Have you run out of inodes on the underlying filesystems?
>>>>>>>>
>>>>>>>> On 18 Jan 2014 05:41, Pat Haley <phaley at MIT.EDU> wrote:
>>>>>>>>
>>>>>>>> Latest updates:
>>>>>>>>
>>>>>>>> no error messages were found in the log files of the bricks.
>>>>>>>>
>>>>>>>> The error messages appear in the client log files. Writing
>>>>>>>> from a second client also has the same errors.
>>>>>>>>
>>>>>>>> Note that if I try to write a directory with 480 6Mb files
>>>>>>>> to /projects, over 3/4 of the files are written. It is
>>>>>>>> random which files are not written (i.e. it is not the
>>>>>>>> last 1/4 of the files which fail)
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Some additional data
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [root@mseas-data save]# gluster volume info
>>>>>>>>>
>>>>>>>>> Volume Name: gdata
>>>>>>>>> Type: Distribute
>>>>>>>>> Volume ID: eccc3a90-212d-4563-ae8d-10a77758738d
>>>>>>>>> Status: Started
>>>>>>>>> Number of Bricks: 3
>>>>>>>>> Transport-type: tcp
>>>>>>>>> Bricks:
>>>>>>>>> Brick1: gluster-0-0:/mseas-data-0-0
>>>>>>>>> Brick2: gluster-0-1:/mseas-data-0-1
>>>>>>>>> Brick3: gluster-data:/data
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [root@mseas-data save]# gluster volume status
>>>>>>>>> Status of volume: gdata
>>>>>>>>> Gluster process                          Port   Online  Pid
>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>> Brick gluster-0-0:/mseas-data-0-0        24009  Y       27006
>>>>>>>>> Brick gluster-0-1:/mseas-data-0-1        24009  Y       7063
>>>>>>>>> Brick gluster-data:/data                 24010  N       8007
>>>>>>>>> NFS Server on localhost                  38467  Y       8013
>>>>>>>>> NFS Server on gluster-0-1                38467  Y       10228
>>>>>>>>> NFS Server on 10.1.1.10                  38467  Y       3867
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Noticing that the brick gluster-data:/data was appearing as N
>>>>>>>>> in the "online" column, I tried (1) detaching gluster-data
>>>>>>>>> (using
>>>>>>>>> gluster peer detach gluster-data issued from gluster-0-0),
>>>>>>>>> (2) removing
>>>>>>>>> /var/lib/glusterd, (3) restarting glusterd on gluster-data,
>>>>>>>>> (4) reattaching gluster-data (using gluster peer probe gluster-data
>>>>>>>>> issued from gluster-0-0), then (5) restarting glusterd one more
>>>>>>>>> time on all
>>>>>>>>> 3 bricks. The brick gluster-data:/data still appears as N in
>>>>>>>>> the
>>>>>>>>> Online
>>>>>>>>> column.
>>>>>>>>>
>>>>>>>>> [root@mseas-data save]# gluster peer status
>>>>>>>>> Number of Peers: 2
>>>>>>>>>
>>>>>>>>> Hostname: gluster-0-1
>>>>>>>>> Uuid: 393fc4a6-1573-4564-971e-1b1aec434167
>>>>>>>>> State: Peer in Cluster (Connected)
>>>>>>>>>
>>>>>>>>> Hostname: 10.1.1.10
>>>>>>>>> Uuid: 3619440a-4ca3-4151-b62e-d4d6bf2e0c03
>>>>>>>>> State: Peer in Cluster (Connected)
>>>>>>>>>
>>>>>>>>> (similarly from the other bricks)
>>>>>>>>>
>>>>>>>>> Ping works between all bricks too.
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> We are using gluster to present 3 bricks as a single name space.
>>>>>>>>>> We appear to have a situation in which gluster thinks there
>>>>>>>>>> is no disk space when there is actually plenty. I have restarted
>>>>>>>>>> the glusterd daemons on all three bricks and I still get the
>>>>>>>>>> following message
>>>>>>>>>>
>>>>>>>>>> /bin/cp: cannot create regular file
>>>>>>>>>> `./Bottom_Gravity_Current_25/344.mat': No space left on device
>>>>>>>>>>
>>>>>>>>>> This is a 6Mbyte file. The total space available on
>>>>>>>>>> gluster is 3.6T
>>>>>>>>>>
>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>> mseas-data:/gdata 55T 51T 3.6T 94% /gdata
>>>>>>>>>>
>>>>>>>>>> Also, no single brick is full:
>>>>>>>>>>
>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>> /dev/mapper/the_raid-lv_data
>>>>>>>>>> 15T 14T 804G 95% /data
>>>>>>>>>>
>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>> /dev/sdb1 21T 18T 2.1T 90% /mseas-data-0-0
>>>>>>>>>>
>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>> /dev/sdb1 21T 20T 784G 97% /mseas-data-0-1
>>>>>>>>>>
>>>>>>>>>> What should we do to fix this problem or look at to diagnose
>>>>>>>>>> this problem?
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>>>>>>>>
>>>>>>>>>> Pat Haley Email: phaley at mit.edu
>>>>>>>>>> Center for Ocean Engineering Phone: (617) 253-6824
>>>>>>>>>> Dept. of Mechanical Engineering Fax: (617) 253-8125
>>>>>>>>>> MIT, Room 5-213 http://web.mit.edu/phaley/www/
>>>>>>>>>> 77 Massachusetts Avenue
>>>>>>>>>> Cambridge, MA 02139-4301
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>
>