[Gluster-users] Gluster not recognizing available space
Joe Julian
joe at julianfamily.org
Tue Jan 21 16:46:55 UTC 2014
All three.
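
For example, on each of the three servers in turn, something like this
should do it (just a sketch; the service and volume names are the ones
already used in this thread, so adjust for your init system if needed):

    killall glusterfsd
    service glusterd restart
    gluster volume status

killall stops the brick processes and restarting glusterd respawns them;
after that the bricks should show "Y" in the Online column again.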
On 01/21/2014 08:38 AM, Pat Haley wrote:
>
> Hi Joe,
>
> They do appear as connected from the first
> brick, checking on the next 2. If they
> all show the same, is the "killall glusterfsd"
> command simply run from the first brick, or
> will I need to try it on all 3 bricks, one
> at a time?
>
> Thanks
>
> # gluster peer status
> Number of Peers: 2
>
> Hostname: gluster-0-1
> Uuid: 978e0f76-6474-4203-8617-ed5ad7d29239
> State: Peer in Cluster (Connected)
>
> Hostname: gluster-0-0
> Uuid: 3f73f5cc-39d8-4d9a-b442-033cb074b247
> State: Peer in Cluster (Connected)
>
>> You got lucky. That process could have deleted your volume entirely.
>> The volume configuration and state are stored in that directory path.
>>
>> Check gluster peer status on two servers and make sure they're all
>> "Peer in Cluster (Connected)". If not, peer probe to make them so.
>>
>> If they are, "killall glusterfsd". Then restart glusterd. Since your
>> volume isn't working anyway, this shouldn't really be a problem for
>> your users.
>>
>> Check volume status. I expect them all to show "Y" in the Online column.
>>
>> If there are still problems at the client, try unmounting and
>> mounting again.
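>>
>> For example (a sketch, assuming the native glusterfs client mount shown
>> in your df output; adjust the mount point if yours differs):
>>
>>     umount /gdata
>>     mount -t glusterfs mseas-data:/gdata /gdata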
>>
>> On 01/21/2014 07:42 AM, Pat Haley wrote:
>>>
>>> Hi,
>>>
>>> To try to clean things out more, I took
>>> the following steps:
>>> 1) On gluster-0-0:
>>> `gluster peer detach gluster-data`, if that fails, `gluster peer
>>> detach gluster-data force`
>>> 2) On gluster-data:
>>> `rm -rf /var/lib/glusterd`
>>> `service glusterd restart`
>>> 3) Again on gluster-0-0:
>>> 'gluster peer probe gluster-data'
>>> 4) `service glusterd restart` on each brick
>>>
>>> and repeated them for all 3 bricks (i.e. removing the
>>> /var/lib/glusterd from all 3 bricks). I did this
>>> one at a time and restarted the glusterd daemon each
>>> time. Now all 3 bricks appear as N in the Online column
>>> and I get a "no space left on device" error for even
>>> a small file.
>>>
>>> Does any of this suggest what I should try/test
>>> next?
>>>
>>> Thanks
>>>
>>> Status of volume: gdata
>>> Gluster process                          Port   Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick gluster-0-0:/mseas-data-0-0        24010  N       15360
>>> Brick gluster-0-1:/mseas-data-0-1        24010  N       21450
>>> Brick gluster-data:/data                 24010  N       16723
>>> NFS Server on localhost                  38467  Y       16728
>>> NFS Server on gluster-0-1                38467  Y       21455
>>> NFS Server on gluster-0-0                38467  Y       15365
>>>
>>>>
>>>> Also, going back to an earlier Email,
>>>> should I be concerned that in the output
>>>> from "gluster volume status" the
>>>> brick "gluster-data:/data" has an "N"
>>>> in the "Online" column? Does this suggest
>>>> an additional debugging route?
>>>>
>>>> gluster volume status
>>>> Status of volume: gdata
>>>> Gluster process                          Port   Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick gluster-0-0:/mseas-data-0-0        24009  Y       27006
>>>> Brick gluster-0-1:/mseas-data-0-1        24009  Y       7063
>>>> Brick gluster-data:/data                 24010  N       15772
>>>> NFS Server on localhost                  38467  Y       14936
>>>> NFS Server on gluster-data               38467  Y       15778
>>>> NFS Server on gluster-0-1                38467  Y       21083
>>>>
>>>>>
>>>>> First, another update on my test of writing
>>>>> a directory with 480 6Mb files. Not only do
>>>>> over 3/4 of the files appear, but they are
>>>>> written on all 3 bricks. Again, it is random
>>>>> which files are not written but what I seem
>>>>> to see is that files are written to each brick
>>>>> even after the failures. Does this suggest
>>>>> anything else I should be looking at?
>>>>>
>>>>> As to Brian's suggestion, how exactly do I perform
>>>>> a "quick inode allocation test"?
>>>>>
>>>>> Thanks
>>>>>
>>>>> Pat
>>>>>
>>>>>> On 01/17/2014 07:48 PM, Pat Haley wrote:
>>>>>>> Hi Franco,
>>>>>>>
>>>>>>> I checked using df -i on all 3 bricks. No brick is over
>>>>>>> 1% inode usage.
>>>>>>>
>>>>>>
>>>>>> It might be worth a quick inode allocation test on the fs for each
>>>>>> brick, regardless. There are other non-obvious scenarios that can
>>>>>> cause
>>>>>> inode allocation to fail, at least on xfs (e.g., contiguous block
>>>>>> allocation). Ideally, you'll have the ability to do this in a
>>>>>> subdirectory outside the actual glusterfs brick.
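>>>>>>
>>>>>> For example, something like this makes a quick check (a rough sketch;
>>>>>> the test path is illustrative, and ideally it would sit on the same
>>>>>> filesystem but outside the brick directory, as noted above):
>>>>>>
>>>>>>     mkdir /data/inode-test
>>>>>>     for i in $(seq 1 100000); do touch /data/inode-test/f$i; done
>>>>>>     df -i /data
>>>>>>     rm -rf /data/inode-test
>>>>>>
>>>>>> If the touches start failing with "No space left on device" while
>>>>>> df -i still shows free inodes, the brick filesystem itself is the
>>>>>> likely culprit.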
>>>>>>
>>>>>> Brian
>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Pat
>>>>>>>
>>>>>>>> Have you run out of inodes on the underlying filesystems?
>>>>>>>>
>>>>>>>> On 18 Jan 2014 05:41, Pat Haley <phaley at MIT.EDU> wrote:
>>>>>>>>
>>>>>>>> Latest updates:
>>>>>>>>
>>>>>>>> no error messages were found in the log files of the bricks.
>>>>>>>>
>>>>>>>> The error messages appear in the client log files. Writing
>>>>>>>> from a second client also has the same errors.
>>>>>>>>
>>>>>>>> Note that if I try to write a directory with 480 6Mb files
>>>>>>>> to /projects, over 3/4 of the files are written. It is
>>>>>>>> random which files are not written (i.e. it is not the
>>>>>>>> last 1/4 of the files which fail)
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Some additional data
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [root@mseas-data save]# gluster volume info
>>>>>>>>>
>>>>>>>>> Volume Name: gdata
>>>>>>>>> Type: Distribute
>>>>>>>>> Volume ID: eccc3a90-212d-4563-ae8d-10a77758738d
>>>>>>>>> Status: Started
>>>>>>>>> Number of Bricks: 3
>>>>>>>>> Transport-type: tcp
>>>>>>>>> Bricks:
>>>>>>>>> Brick1: gluster-0-0:/mseas-data-0-0
>>>>>>>>> Brick2: gluster-0-1:/mseas-data-0-1
>>>>>>>>> Brick3: gluster-data:/data
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [root@mseas-data save]# gluster volume status
>>>>>>>>> Status of volume: gdata
>>>>>>>>> Gluster process                          Port   Online  Pid
>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>> Brick gluster-0-0:/mseas-data-0-0        24009  Y       27006
>>>>>>>>> Brick gluster-0-1:/mseas-data-0-1        24009  Y       7063
>>>>>>>>> Brick gluster-data:/data                 24010  N       8007
>>>>>>>>> NFS Server on localhost                  38467  Y       8013
>>>>>>>>> NFS Server on gluster-0-1                38467  Y       10228
>>>>>>>>> NFS Server on 10.1.1.10                  38467  Y       3867
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Noticing that the brick gluster-data:/data was appearing as N
>>>>>>>>> in the "online" column, I tried (1) detaching gluster-data
>>>>>>>>> (using
>>>>>>>>> gluster peer detach gluster-data issued from gluster-0-0),
>>>>>>>>> (2) removing
>>>>>>>>> /var/lib/glusterd, (3) restarting glusterd on gluster-data,
>>>>>>>>> (4) reattaching gluster-data (using gluster peer probe gluster-data
>>>>>>>>> issued from gluster-0-0), then (5) restarting glusterd one more
>>>>>>>>> time on all
>>>>>>>>> 3 bricks. The brick gluster-data:/data still appears as N in
>>>>>>>>> the
>>>>>>>>> Online
>>>>>>>>> column.
>>>>>>>>>
>>>>>>>>> [root@mseas-data save]# gluster peer status
>>>>>>>>> Number of Peers: 2
>>>>>>>>>
>>>>>>>>> Hostname: gluster-0-1
>>>>>>>>> Uuid: 393fc4a6-1573-4564-971e-1b1aec434167
>>>>>>>>> State: Peer in Cluster (Connected)
>>>>>>>>>
>>>>>>>>> Hostname: 10.1.1.10
>>>>>>>>> Uuid: 3619440a-4ca3-4151-b62e-d4d6bf2e0c03
>>>>>>>>> State: Peer in Cluster (Connected)
>>>>>>>>>
>>>>>>>>> (similarly from the other bricks)
>>>>>>>>>
>>>>>>>>> Ping works between all bricks too.
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> We are using gluster to present 3 bricks as a single name space.
>>>>>>>>>> We appear to have a situation in which gluster thinks there
>>>>>>>>>> is no disk space when there is actually plenty. I have restarted
>>>>>>>>>> the glusterd daemons on all three bricks and I still get the
>>>>>>>>>> following message
>>>>>>>>>>
>>>>>>>>>> /bin/cp: cannot create regular file
>>>>>>>>>> `./Bottom_Gravity_Current_25/344.mat': No space left on device
>>>>>>>>>>
>>>>>>>>>> This is a 6Mbyte file. The total space available on
>>>>>>>>>> gluster is 3.6T
>>>>>>>>>>
>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>> mseas-data:/gdata 55T 51T 3.6T 94% /gdata
>>>>>>>>>>
>>>>>>>>>> Also, no single brick is full:
>>>>>>>>>>
>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>> /dev/mapper/the_raid-lv_data
>>>>>>>>>> 15T 14T 804G 95% /data
>>>>>>>>>>
>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>> /dev/sdb1 21T 18T 2.1T 90% /mseas-data-0-0
>>>>>>>>>>
>>>>>>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>>>>>>> /dev/sdb1 21T 20T 784G 97% /mseas-data-0-1
>>>>>>>>>>
>>>>>>>>>> What should we do to fix this problem or look at to diagnose
>>>>>>>>>> this problem?
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>>>>>>>>
>>>>>>>>>> Pat Haley Email: phaley at mit.edu
>>>>>>>>>> Center for Ocean Engineering Phone: (617) 253-6824
>>>>>>>>>> Dept. of Mechanical Engineering Fax: (617) 253-8125
>>>>>>>>>> MIT, Room 5-213 http://web.mit.edu/phaley/www/
>>>>>>>>>> 77 Massachusetts Avenue
>>>>>>>>>> Cambridge, MA 02139-4301
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>
>