[Gluster-users] remove_me files building up
Strahil Nikolov
hunter86_bg at yahoo.com
Thu Jul 13 23:02:34 UTC 2023
Hi Liam,
Have you tried to convert the volume into replica 2, wipe the arbiter brick, recreate it, and then convert back to 'replica 3 arbiter 1' mode?
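Something along those lines (a rough sketch only, using the brick paths that appear later in this thread; verify the syntax against your gluster version, make sure there are no pending heals first, and test on a non-prod volume):

# Drop the three arbiter bricks, converting to plain replica 2:
gluster volume remove-brick gv1 replica 2 \
    uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick1/brick \
    uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick3/brick \
    uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick2/brick force

# Wipe the old brick directories on the arbiter node (including the
# hidden .glusterfs tree), then re-add them as arbiters:
gluster volume add-brick gv1 replica 3 arbiter 1 \
    uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick1/brick \
    uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick3/brick \
    uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick2/brick

# Let self-heal repopulate the arbiter metadata:
gluster volume heal gv1 full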
Best Regards,
Strahil Nikolov
On Wednesday, July 12, 2023, 1:31 PM, Liam Smith <liam.smith at ek.co> wrote:
Hi,
We're still seeing this issue. In case it helps, we can see entries like the below in the gluster client logs on the servers that connect to gluster:
[2023-07-06 22:59:47.029036 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:2561:client4_0_lookup_cbk] 0-gv1-client-7: remote operation failed. [{path=<gfid:55bca2d2-ae47-457c-82f2-bcea5947f558>}, {gfid=55bca2d2-ae47-457c-82f2-bcea5947f558}, {errno=2}, {error=No such file or directory}]
[2023-07-06 22:59:47.029136 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:2561:client4_0_lookup_cbk] 0-gv1-client-5: remote operation failed. [{path=<gfid:55bca2d2-ae47-457c-82f2-bcea5947f558>}, {gfid=55bca2d2-ae47-457c-82f2-bcea5947f558}, {errno=2}, {error=No such file or directory}]
[2023-07-06 22:59:47.038752 +0000] E [MSGID: 133021] [shard.c:3822:shard_delete_shards] 0-gv1-shard: Failed to clean up shards of gfid 55bca2d2-ae47-457c-82f2-bcea5947f558 [No such file or directory]
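In case it's useful to anyone digging into this, the gfid from those log lines can be checked against the bricks directly. A rough sketch, assuming the standard .glusterfs layout (a file's gfid link lives at .glusterfs/<first two hex chars>/<next two>/<full gfid>):

# Does the base file's gfid link still exist on any brick?
ls -l /data/glusterfs/gv1/brick*/brick/.glusterfs/55/bc/55bca2d2-ae47-457c-82f2-bcea5947f558 2>/dev/null

# Are there leftover shards (named <gfid>.<n>) or a .remove_me marker for that gfid?
ls /data/glusterfs/gv1/brick*/brick/.shard/ 2>/dev/null | grep 55bca2d2
ls /data/glusterfs/gv1/brick*/brick/.shard/.remove_me/ 2>/dev/null | grep 55bca2d2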
Thanks,
Liam Smith
Linux Systems Support Engineer, Scholar
From: Strahil Nikolov <hunter86_bg at yahoo.com>
Sent: 05 July 2023 11:59
To: Liam Smith <liam.smith at ek.co>; gluster-users at gluster.org <gluster-users at gluster.org>; Gluster Devel <gluster-devel at gluster.org>
Subject: Re: [Gluster-users] remove_me files building up
Adding Gluster Devel list.
Best Regards,
Strahil Nikolov
On Wednesday, July 5, 2023, 12:41 PM, Liam Smith <liam.smith at ek.co> wrote:
Hi Strahil,
This is the output from the commands:
root@uk3-prod-gfs-arb-01:~# du -h -x -d 1 /data/glusterfs/gv1/brick1/brick
2.2G    /data/glusterfs/gv1/brick1/brick/.glusterfs
24M     /data/glusterfs/gv1/brick1/brick/scalelite-recordings
16K     /data/glusterfs/gv1/brick1/brick/mytute
18M     /data/glusterfs/gv1/brick1/brick/.shard
0       /data/glusterfs/gv1/brick1/brick/.glusterfs-anonymous-inode-d3d1fdec-7df9-4f71-b9fc-660d12c2a046
2.3G    /data/glusterfs/gv1/brick1/brick

root@uk3-prod-gfs-arb-01:~# du -h -x -d 1 /data/glusterfs/gv1/brick3/brick
11G     /data/glusterfs/gv1/brick3/brick/.glusterfs
15M     /data/glusterfs/gv1/brick3/brick/scalelite-recordings
460K    /data/glusterfs/gv1/brick3/brick/mytute
151M    /data/glusterfs/gv1/brick3/brick/.shard
0       /data/glusterfs/gv1/brick3/brick/.glusterfs-anonymous-inode-d3d1fdec-7df9-4f71-b9fc-660d12c2a046
11G     /data/glusterfs/gv1/brick3/brick

root@uk3-prod-gfs-arb-01:~# du -h -x -d 1 /data/glusterfs/gv1/brick2/brick
12G     /data/glusterfs/gv1/brick2/brick/.glusterfs
110M    /data/glusterfs/gv1/brick2/brick/scalelite-recordings
3.1M    /data/glusterfs/gv1/brick2/brick/mytute
169M    /data/glusterfs/gv1/brick2/brick/.shard
0       /data/glusterfs/gv1/brick2/brick/.glusterfs-anonymous-inode-d3d1fdec-7df9-4f71-b9fc-660d12c2a046
12G     /data/glusterfs/gv1/brick2/brick
Also, this is the du -sh output for the specific directory that appears to be taking up space on brick 3:
root@uk3-prod-gfs-arb-01:~# du -sh /data/glusterfs/gv1/brick3/brick/.shard/.remove_me/
10G     /data/glusterfs/gv1/brick3/brick/.shard/.remove_me/
The gluster package version on all the servers is 11.0, and I believe they were upgraded 7.2 -> 8.6 -> 11.0; the op-version was only changed after the 11.0 upgrade.
The archival job does delete the files; files shouldn't be overwritten at any point, as it's new, unique files being generated every day.
Thanks,
Liam Smith
Linux Systems Support Engineer, Scholar
From: Strahil Nikolov <hunter86_bg at yahoo.com>
Sent: 04 July 2023 17:47
To: Liam Smith <liam.smith at ek.co>; gluster-users at gluster.org <gluster-users at gluster.org>
Subject: Re: [Gluster-users] remove_me files building up
Thanks for the clarification.
That behaviour is quite weird as arbiter bricks should hold only metadata.
What does the following show on host uk3-prod-gfs-arb-01:
du -h -x -d 1 /data/glusterfs/gv1/brick1/brick
du -h -x -d 1 /data/glusterfs/gv1/brick3/brick
du -h -x -d 1 /data/glusterfs/gv1/brick2/brick
If the shards are indeed taking up space, that is a really strange situation. From which version did you upgrade, and which one are you on now? I assume all gluster TSP members (the servers) have the same version, but it's nice to double-check.
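For example, something like this from any node (a quick sketch; hostnames taken from this thread, assuming SSH access between the servers):

for h in uk1-prod-gfs-01 uk2-prod-gfs-01 uk3-prod-gfs-arb-01; do
    echo "== $h =="
    ssh "$h" 'gluster --version | head -n1'
done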
Does the archival job actually delete the original files after they are processed, or does the workload keep overwriting the existing files?
Best Regards,
Strahil Nikolov
On Tuesday, July 4, 2023, 6:50 PM, Liam Smith <liam.smith at ek.co> wrote:
Hi Strahil,
We're using gluster as a share for an application to temporarily process and store files before they're archived off overnight.
The issue we're seeing isn't the bricks running out of inodes, but the actual disk space on the arb server running low.
This is the df -h output for the bricks on the arb server:

/dev/sdd1        15G   12G  3.3G  79% /data/glusterfs/gv1/brick3
/dev/sdc1        15G  2.8G   13G  19% /data/glusterfs/gv1/brick1
/dev/sde1        15G   14G  1.6G  90% /data/glusterfs/gv1/brick2

And this is the df -hi output for the bricks on the arb server:

/dev/sdd1       7.5M  2.7M  4.9M  35% /data/glusterfs/gv1/brick3
/dev/sdc1       7.5M  643K  6.9M   9% /data/glusterfs/gv1/brick1
/dev/sde1       6.1M  3.0M  3.1M  49% /data/glusterfs/gv1/brick2
So the inode usage appears to be fine, but we're seeing that the actual disk usage keeps increasing on the bricks despite it being the arbiter.
The actual issue appears to be that files under /data/glusterfs/gv1/brick3/brick/.shard/.remove_me/ and /data/glusterfs/gv1/brick2/brick/.shard/.remove_me/ are being retained, even when the original files are deleted from the data nodes.
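For background on what those entries are (our understanding of the shard translator, so treat this as a sketch rather than gospel): when a sharded file is deleted, gluster leaves a marker named after the base file's gfid under .shard/.remove_me and removes the shards in the background, and the marker should disappear once that cleanup completes. The pending markers can be inspected like this:

# How many deletions are still pending on the arbiter brick?
ls /data/glusterfs/gv1/brick3/brick/.shard/.remove_me/ | wc -l

# Pick one marker (named after a base-file gfid; <gfid> is a placeholder)
# and see whether shards for it still exist on any brick:
ls /data/glusterfs/gv1/brick*/brick/.shard/ 2>/dev/null | grep '<gfid>'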
For reference, I've attached disk usage graphs for brick 3 over the past two weeks; one is a graph from a data node, the other from the arb.
As you can see, the disk usage on the data node builds throughout the day and then an archival job clears space down. On the arb, the disk usage follows the same sort of trend, but it's never cleared down like it is on the data node.
Hopefully this clarifies the issue. We're a bit confused as to why this is occurring and whether it's actually intended behaviour or potentially a bug, so any advice is greatly appreciated.
Thanks,
Liam Smith
Linux Systems Support Engineer, Scholar
From: Strahil Nikolov <hunter86_bg at yahoo.com>
Sent: 04 July 2023 15:51
To: Liam Smith <liam.smith at ek.co>; gluster-users at gluster.org <gluster-users at gluster.org>
Subject: Re: [Gluster-users] remove_me files building up
Hi Liam,
I saw that your XFS uses 'imaxpct=25', which for an arbiter brick is a little bit low.
If you have free space on the bricks, increase the maxpct to a bigger value, like:

xfs_growfs -m 80 /path/to/brick

That allows up to 80% of the filesystem to be used for inodes, which you can verify with df -i /brick/path (compare before and after). This way you won't run out of inodes in the future.
Of course, always test that on non-prod first.
Are you using the volume as a VM disk storage domain? What is your main workload?
Best Regards,
Strahil Nikolov
On Tuesday, July 4, 2023, 2:12 PM, Liam Smith <liam.smith at ek.co> wrote:
Hi,
Thanks for your response, please find the xfs_info for each brick on the arbiter below:
root@uk3-prod-gfs-arb-01:~# xfs_info /data/glusterfs/gv1/brick1
meta-data=/dev/sdc1              isize=512    agcount=31, agsize=131007 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=3931899, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

root@uk3-prod-gfs-arb-01:~# xfs_info /data/glusterfs/gv1/brick2
meta-data=/dev/sde1              isize=512    agcount=13, agsize=327616 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=3931899, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

root@uk3-prod-gfs-arb-01:~# xfs_info /data/glusterfs/gv1/brick3
meta-data=/dev/sdd1              isize=512    agcount=13, agsize=327616 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=3931899, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
I've also copied below some df output from the arb server:
root@uk3-prod-gfs-arb-01:~# df -hi
Filesystem           Inodes IUsed IFree IUse% Mounted on
udev                   992K   473  991K    1% /dev
tmpfs                  995K   788  994K    1% /run
/dev/sda1              768K  105K  664K   14% /
tmpfs                  995K     3  995K    1% /dev/shm
tmpfs                  995K     4  995K    1% /run/lock
tmpfs                  995K    18  995K    1% /sys/fs/cgroup
/dev/sdb1              128K   113  128K    1% /var/lib/glusterd
/dev/sdd1              7.5M  2.6M  5.0M   35% /data/glusterfs/gv1/brick3
/dev/sdc1              7.5M  600K  7.0M    8% /data/glusterfs/gv1/brick1
/dev/sde1              6.4M  2.9M  3.5M   46% /data/glusterfs/gv1/brick2
uk1-prod-gfs-01:/gv1   150M  6.5M  144M    5% /mnt/gfs
tmpfs                  995K    21  995K    1% /run/user/1004

root@uk3-prod-gfs-arb-01:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
udev                  3.9G     0  3.9G   0% /dev
tmpfs                 796M  916K  795M   1% /run
/dev/sda1              12G  3.9G  7.3G  35% /
tmpfs                 3.9G  8.0K  3.9G   1% /dev/shm
tmpfs                 5.0M     0  5.0M   0% /run/lock
tmpfs                 3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sdb1             2.0G  456K  1.9G   1% /var/lib/glusterd
/dev/sdd1              15G   12G  3.5G  78% /data/glusterfs/gv1/brick3
/dev/sdc1              15G  2.6G   13G  18% /data/glusterfs/gv1/brick1
/dev/sde1              15G   14G  1.8G  89% /data/glusterfs/gv1/brick2
uk1-prod-gfs-01:/gv1  300G  139G  162G  47% /mnt/gfs
tmpfs                 796M     0  796M   0% /run/user/1004
Something I forgot to mention in my initial message is that the op-version was upgraded from 70200 to 100000, which seems as though it could have been a trigger for the issue as well.
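For reference, the op-version can be checked and raised with the standard gluster CLI (shown as a sketch; the 100000 value is the one from this thread):

# Current cluster op-version:
gluster volume get all cluster.op-version

# Highest op-version the installed binaries support:
gluster volume get all cluster.max-op-version

# Raise the cluster op-version (only once every node is upgraded):
gluster volume set all cluster.op-version 100000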
Thanks,
Liam Smith
Linux Systems Support Engineer, Scholar
From: Strahil Nikolov <hunter86_bg at yahoo.com>
Sent: 03 July 2023 18:28
To: Liam Smith <liam.smith at ek.co>; gluster-users at gluster.org <gluster-users at gluster.org>
Subject: Re: [Gluster-users] remove_me files building up
Hi,
You mentioned that the arbiter bricks run out of inodes. Are you using XFS? Can you provide the xfs_info of each brick?
Best Regards,
Strahil Nikolov
On Sat, Jul 1, 2023 at 19:41, Liam Smith <liam.smith at ek.co> wrote:
Hi,
We're running a cluster with two data nodes and one arbiter, and have sharding enabled.
We had an issue a while back where one of the servers crashed. We got the server back up and running and ensured that all healing entries cleared, and also increased the server spec (CPU/Mem), as this seemed to be the potential cause.
Since then, however, we've seen some strange behaviour whereby a lot of 'remove_me' files are building up under `/data/glusterfs/gv1/brick2/brick/.shard/.remove_me/` and `/data/glusterfs/gv1/brick3/brick/.shard/.remove_me/`. This is causing the arbiter to run out of space on brick2 and brick3, as the remove_me files are constantly increasing.
brick1 appears to be fine: the disk usage increases throughout the day and drops down in line with the trend of the brick on the data nodes. We see the disk usage increase and drop throughout the day on the data nodes for brick2 and brick3 as well, but while the arbiter follows the same increasing trend, it doesn't drop at any point.
This is the output of some gluster commands; occasional heal entries come and go:
root@uk3-prod-gfs-arb-01:~# gluster volume info gv1
Volume Name: gv1
Type: Distributed-Replicate
Volume ID: d3d1fdec-7df9-4f71-b9fc-660d12c2a046
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: uk1-prod-gfs-01:/data/glusterfs/gv1/brick1/brick
Brick2: uk2-prod-gfs-01:/data/glusterfs/gv1/brick1/brick
Brick3: uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick1/brick (arbiter)
Brick4: uk1-prod-gfs-01:/data/glusterfs/gv1/brick3/brick
Brick5: uk2-prod-gfs-01:/data/glusterfs/gv1/brick3/brick
Brick6: uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick3/brick (arbiter)
Brick7: uk1-prod-gfs-01:/data/glusterfs/gv1/brick2/brick
Brick8: uk2-prod-gfs-01:/data/glusterfs/gv1/brick2/brick
Brick9: uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick2/brick (arbiter)
Options Reconfigured:
cluster.entry-self-heal: on
cluster.metadata-self-heal: on
cluster.data-self-heal: on
performance.client-io-threads: off
storage.fips-mode-rchecksum: on
transport.address-family: inet
cluster.lookup-optimize: off
performance.readdir-ahead: off
cluster.readdir-optimize: off
cluster.self-heal-daemon: enable
features.shard: enable
features.shard-block-size: 512MB
cluster.min-free-disk: 10%
cluster.use-anonymous-inode: yes
root@uk3-prod-gfs-arb-01:~# gluster peer status
Number of Peers: 2
Hostname: uk2-prod-gfs-01
Uuid: 2fdfa4a2-195d-4cc5-937c-f48466e76149
State: Peer in Cluster (Connected)

Hostname: uk1-prod-gfs-01
Uuid: 43ec93d1-ad83-4103-aea3-80ded0903d88
State: Peer in Cluster (Connected)
root@uk3-prod-gfs-arb-01:~# gluster volume heal gv1 info
Brick uk1-prod-gfs-01:/data/glusterfs/gv1/brick1/brick
<gfid:5b57e1f6-3e3d-4334-a0db-b2560adae6d1>
Status: Connected
Number of entries: 1

Brick uk2-prod-gfs-01:/data/glusterfs/gv1/brick1/brick
Status: Connected
Number of entries: 0

Brick uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick1/brick
Status: Connected
Number of entries: 0

Brick uk1-prod-gfs-01:/data/glusterfs/gv1/brick3/brick
Status: Connected
Number of entries: 0

Brick uk2-prod-gfs-01:/data/glusterfs/gv1/brick3/brick
Status: Connected
Number of entries: 0

Brick uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick3/brick
Status: Connected
Number of entries: 0

Brick uk1-prod-gfs-01:/data/glusterfs/gv1/brick2/brick
Status: Connected
Number of entries: 0

Brick uk2-prod-gfs-01:/data/glusterfs/gv1/brick2/brick
<gfid:6ba9c472-9232-4b45-b12f-a1232d6f4627>
/.shard/.remove_me
<gfid:0f042518-248d-426a-93f4-cfaa92b6ef3e>
Status: Connected
Number of entries: 3

Brick uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick2/brick
<gfid:6ba9c472-9232-4b45-b12f-a1232d6f4627>
/.shard/.remove_me
<gfid:0f042518-248d-426a-93f4-cfaa92b6ef3e>
Status: Connected
Number of entries: 3
root@uk3-prod-gfs-arb-01:~# gluster volume get all cluster.op-version
Option                                   Value
------                                   -----
cluster.op-version                       100000
We're not sure if this is a potential bug or if something's corrupted that we don't have visibility of, so any pointers/suggestions about how to approach this would be appreciated.
Thanks,
Liam
________
Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users at gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users