[Gluster-users] transport endpoint not connected on just 2 files

Strahil Nikolov hunter86_bg at yahoo.com
Wed Jun 8 05:27:05 UTC 2022


Volume stop was not necessary. Every time you access the file, Gluster will check the permissions, ACLs and extended file attributes, and then allow you access or not.

I'm really surprised that this situation ever happened, and it is most probably worth a GitHub issue if you are using the latest version of Gluster.

Best Regards,
Strahil Nikolov


On Tuesday, 7 June 2022 at 14:50:23 GMT+3, Kingsley Tart <gluster at gluster.dogwind.com> wrote:
 
 Hi,
Thanks - sorry for the late reply - I was suddenly swamped with other work, and then it was a UK holiday.
I've tried rsync -A -X with the volume stopped, then restarted it. Will see whether it heals.
Cheers,
Kingsley.
On Mon, 2022-05-30 at 18:41 +0000, Strahil Nikolov wrote:
Make a backup of all bricks. Based on the info, two of the bricks have the same copy while brick C has a different copy (gfid mismatch).
I would use mtime to identify the latest version and use that, but I have no clue what kind of application you have.
Usually it's not recommended to manipulate bricks directly, but in this case it might be necessary. The simplest way is to move the file on brick C (the only one that differs) away, but if you need exactly that one, you can rsync/scp it to the other 2 bricks.
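
The mtime comparison suggested above could be sketched in Python as follows. This is only an illustration under assumptions: `newest_copy` is a hypothetical helper, and it expects the paths of backup copies of the same file, one per brick, taken in a way that preserves modification times.

```python
import os


def newest_copy(paths):
    """Return the path whose modification time is the most recent.

    `paths` is assumed to hold one backup copy of the same file per brick.
    """
    # st_mtime_ns avoids the rounding of the floating-point st_mtime value.
    return max(paths, key=lambda path: os.stat(path).st_mtime_ns)
```

Whether the newest copy is actually the right one to keep still depends on the application that writes the files, so the result is a hint rather than a decision.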

Best Regards,
Strahil Nikolov

On Fri, May 27, 2022 at 11:45, Kingsley Tart <gluster at gluster.dogwind.com> wrote:
Hi, thanks.
OK that's interesting. Picking one of the files, on bricks A and B I see this (and all of the values are identical between bricks A and B):
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gw-runqueues-client-2=0x000000010000000200000000
trusted.gfid=0xa40bb83ff3784ae09c997d272296a7a9
trusted.gfid2path.06eddbe9be9c7c75=0x30323665396561652d613661662d346365642d623863632d6261353037333339646364372f677733
trusted.glusterfs.mdata=0x01000000000000000000000000628ec57700000000007168bb00000000628ec576000000000000000000000000628ec5760000000000000000
and on brick C I see this:
trusted.gfid=0xd73992aee03e4021824b1baced973df3
trusted.gfid2path.06eddbe9be9c7c75=0x30323665396561652d613661662d346365642d623863632d6261353037333339646364372f677733
trusted.glusterfs.mdata=0x01000000000000000000000000628ec5230000000030136ca000000000628ec523000000000000000000000000628ec5230000000000000000
So brick C is missing the trusted.afr attributes and the trusted.gfid and mdata differ.
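
To make the hex values above easier to read, here is a minimal Python sketch that decodes three of them. The helper names are hypothetical and not part of Gluster. Reading the `trusted.afr.*` value as three big-endian 32-bit counters (pending data, metadata and entry operations, in that order) is an assumption based on how the AFR changelog is usually described, so verify it against the Gluster documentation for your version.

```python
import struct
import uuid


def _raw(hex_value):
    """Turn a getfattr hex dump such as '0x0001' into bytes."""
    if hex_value.startswith("0x"):
        hex_value = hex_value[2:]
    return bytes.fromhex(hex_value)


def gfid_to_uuid(hex_value):
    """Format a trusted.gfid value (16 bytes) as a UUID string."""
    return str(uuid.UUID(bytes=_raw(hex_value)))


def decode_gfid2path(hex_value):
    """Decode a trusted.gfid2path.* value into (parent gfid, file name)."""
    parent, _, name = _raw(hex_value).decode("utf-8").partition("/")
    return parent, name


def decode_afr(hex_value):
    """Split a trusted.afr.* value into three 32-bit counters.

    Assumption: the order is data, metadata, entry.
    """
    data, metadata, entry = struct.unpack(">III", _raw(hex_value))
    return {"data": data, "metadata": metadata, "entry": entry}


if __name__ == "__main__":
    # Values copied from the getfattr output quoted in this thread.
    print(gfid_to_uuid("0xa40bb83ff3784ae09c997d272296a7a9"))  # bricks A and B
    print(gfid_to_uuid("0xd73992aee03e4021824b1baced973df3"))  # brick C
    print(decode_afr("0x000000010000000200000000"))
    print(decode_gfid2path(
        "0x30323665396561652d613661662d346365642d623863632d"
        "6261353037333339646364372f677733"
    ))
```

The `gfid2path` value is the same on all three bricks and decodes to a parent directory gfid followed by the file name `gw3`, so the bricks agree on where the file lives but not on its gfid. Under the assumption stated above, the non-zero counters in `trusted.afr.gw-runqueues-client-2` on bricks A and B would mean they record pending operations against the third brick.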
What do I need to do to fix this?
Cheers,
Kingsley.
On Fri, 2022-05-27 at 03:59 +0000, Strahil Nikolov wrote:
Check the file attributes on all bricks:
getfattr -d -e hex -m. /data/brick/gw-runqueues/<path to file>
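
If the output of that command is saved per brick, the comparison can be done mechanically. The following Python sketch assumes the usual `name=0xvalue` lines plus a `# file:` header line; `parse_getfattr` and `diff_xattrs` are hypothetical helper names, not Gluster tools.

```python
def parse_getfattr(text):
    """Parse `getfattr -d -e hex` output into {attribute name: hex value}."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# file: ..." header line.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            attrs[name] = value
    return attrs


def diff_xattrs(bricks):
    """Return the attributes whose values are not identical on every brick.

    `bricks` maps a brick label to the dict returned by parse_getfattr.
    The result maps each differing attribute to {brick label: value},
    with None where the attribute is missing on that brick.
    """
    names = set()
    for attrs in bricks.values():
        names.update(attrs)
    diffs = {}
    for name in sorted(names):
        values = {label: attrs.get(name) for label, attrs in bricks.items()}
        if len(set(values.values())) > 1:
            diffs[name] = values
    return diffs
```

Calling `diff_xattrs({"A": ..., "B": ..., "C": ...})` with the parsed output of each brick lists only the attributes that are missing or different somewhere, which is usually the part worth posting to the list.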

Best Regards,
Strahil Nikolov

On Thu, May 26, 2022 at 16:05, Kingsley Tart <gluster at gluster.dogwind.com> wrote:
Hi,
I've got a strange issue where on all clients I've tested on (tested on 4) I have "transport endpoint is not connected" on two files in a directory, whereas other files can be read fine.
Any ideas?
On one of the servers (all same version):
# gluster --version
glusterfs 9.1
On one of the clients (same thing with all of them) - problem with files "gw3" and "gw11":
[root at gw6 btl]# cd /mnt/runqueues/runners/
[root at gw6 runners]# ls -la
ls: cannot access gw11: Transport endpoint is not connected
ls: cannot access gw3: Transport endpoint is not connected
total 8
drwxr-xr-x  2 root root 4096 May 26 09:48 .
drwxr-xr-x 13 root root 4096 Apr 12  2021 ..
-rw-r--r--  1 root root    0 May 26 09:49 gw1
-rw-r--r--  1 root root    0 May 26 09:49 gw10
-?????????  ? ?    ?      ?            ? gw11
-rw-r--r--  1 root root    0 May 26 09:49 gw2
-?????????  ? ?    ?      ?            ? gw3
-rw-r--r--  1 root root    0 May 26 09:49 gw4
-rw-r--r--  1 root root    0 May 26 09:49 gw6
-rw-r--r--  1 root root    0 May 26 09:49 gw7
[root at gw6 runners]# cat *
cat: gw11: Transport endpoint is not connected
cat: gw3: Transport endpoint is not connected
[root at gw6 runners]#

Querying on a server shows those two problematic files:
# gluster volume heal gw-runqueues info
Brick gluster9a:/data/brick/gw-runqueues
/runners
/runners/gw11
/runners/gw3
Status: Connected
Number of entries: 3

Brick gluster9b:/data/brick/gw-runqueues
/runners
/runners/gw11
/runners/gw3
Status: Connected
Number of entries: 3

Brick gluster9c:/data/brick/gw-runqueues
Status: Connected
Number of entries: 0

However, several hours later there's no obvious change. The servers have hardly any load and the volume is tiny. From a client:
# find /mnt/runqueues | wc -l
35

glfsheal-gw-runqueues.log from server gluster9a: https://pastebin.com/7mPszBBM
glfsheal-gw-runqueues.log from server gluster9b: https://pastebin.com/rxXs5Tcv

Any pointers would be much appreciated!
Cheers,
Kingsley.
________


Community Meeting Calendar:
Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users at gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users



  