[Gluster-users] posix set mdata failed, No ctime

Pedro Costa pedro at pmc.digital
Sat Sep 22 08:33:15 UTC 2018


Hi Kotresh,

Thank you so much for the update, that’s great to know.

I spent all day yesterday tuning the volume and I seem to have hit a sweet spot by applying the following options (roughly as shown in the snippet after the list):

cluster.readdir-optimize: on
performance.read-ahead: on
performance.cache-size: 1GB
server.event-threads: 4
client.event-threads: 4
cluster.lookup-optimize: on
network.inode-lru-limit: 90000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.cache-samba-metadata: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
storage.fips-mode-rchecksum: on
features.utime: on
storage.ctime: on
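
For reference, this is roughly how I applied them, re-typed from my shell history, so treat it as a sketch rather than the exact commands (volume name is gvol1 as above):

```
# apply each option with "gluster volume set <volume> <key> <value>";
# same pattern for the rest of the list above
for opt in \
    "cluster.readdir-optimize on" \
    "performance.cache-size 1GB" \
    "server.event-threads 4" \
    "client.event-threads 4" \
    "performance.md-cache-timeout 600" \
    "features.cache-invalidation on"
do
    gluster volume set gvol1 $opt
done

# double-check what actually took effect
gluster volume get gvol1 all | grep -E 'cache|event-threads|ctime|utime'
```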

The warning is still there, but now I'm also getting the following error every few hours:

[2018-09-22 03:43:25.884474] E [MSGID: 101046] [dht-common.c:1905:dht_revalidate_cbk] 0-gvol1-dht: dict is null
[2018-09-22 07:53:26.958173] E [MSGID: 101046] [dht-common.c:1905:dht_revalidate_cbk] 0-gvol1-dht: dict is null

The app does build and create/delete temporary directories, so could it be related to that? I also can't find anything in the other logs that tells me what that dict might actually be.
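
In case it's useful, this is how I've been trying to line those entries up with the app's activity (the log glob below is just the default /var/log/glusterfs location, adjust if yours differs):

```
# count how often dht_revalidate_cbk logs "dict is null", bucketed per minute,
# to see whether it coincides with the app's build/temp-directory churn
grep -h 'dht_revalidate_cbk' /var/log/glusterfs/*.log \
    | grep 'dict is null' \
    | awk '{gsub(/\[/, ""); print $1, substr($2, 1, 5)}' \
    | sort | uniq -c
```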

If you could shed some light on this one too, and any other performance/optimization tips that would be great.

Cheers,
P.

From: Kotresh Hiremath Ravishankar <khiremat at redhat.com>
Sent: 22 September 2018 05:45
To: Pedro Costa <pedro at pmc.digital>
Cc: Gluster Users <gluster-users at gluster.org>
Subject: Re: [Gluster-users] posix set mdata failed, No ctime

You can ignore this error. It is fixed and should be available in the next 4.1.x release.



On Sat, 22 Sep 2018, 07:07 Pedro Costa <pedro at pmc.digital> wrote:
Forgot to mention, I'm running all VMs on Ubuntu 16.04.1, kernel 4.15.0-1023-azure #24.


From: Pedro Costa
Sent: 21 September 2018 10:16
To: gluster-users at gluster.org
Subject: posix set mdata failed, No ctime

Hi,

I have a replicate x3 volume with the following config:

```
Volume Name: gvol1
Type: Replicate
Volume ID: 384acec2-5b5f-40da-bf0e-5c53d12b3ae2
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: vm0:/srv/brick1/gvol1
Brick2: vm1:/srv/brick1/gvol1
Brick3: vm2:/srv/brick1/gvol1
Options Reconfigured:
storage.ctime: on
features.utime: on
storage.fips-mode-rchecksum: on
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
```

This volume was actually created on v3.8, but has since been upgraded (version by version) to v4.1.4, and it's working fine for the most part:

```
Client connections for volume gvol1
----------------------------------------------
Brick : vm0:/srv/brick1/gvol1
Clients connected : 6
Hostname                                               BytesRead    BytesWritten       OpVersion
--------                                               ---------    ------------       ---------
10.X.0.5:49143                                          2096520         2480212           40100
10.X.0.6:49141                                            14000           12812           40100
10.X.0.4:49134                                           258324          333456           40100
10.X.0.4:49141                                        565372566      1643447105           40100
10.X.0.5:49145                                        491262003       291782440           40100
10.X.0.6:49139                                        482629418       328228888           40100
----------------------------------------------
Brick : vm1:/srv/brick1/gvol1
Clients connected : 6
Hostname                                               BytesRead    BytesWritten       OpVersion
--------                                               ---------    ------------       ---------
10.X.0.6:49146                                           658516          508904           40100
10.X.0.5:49133                                          4142848         7139858           40100
10.X.0.4:49138                                             4088            3696           40100
10.X.0.4:49140                                        471405874       284488736           40100
10.X.0.5:49140                                        585193563      1670630439           40100
10.X.0.6:49138                                        482407454       330274812           40100
----------------------------------------------
Brick : vm2:/srv/brick1/gvol1
Clients connected : 6
Hostname                                               BytesRead    BytesWritten       OpVersion
--------                                               ---------    ------------       ---------
10.X.0.6:49133                                          1789624         4340938           40100
10.X.0.5:49137                                          3010064         3005184           40100
10.X.0.4:49143                                             4268            3744           40100
10.X.0.4:49139                                        471328402       283798376           40100
10.X.0.5:49139                                        491404443       293342568           40100
10.X.0.6:49140                                        561683906       830511730           40100
----------------------------------------------
```

I'm now getting a lot of warnings like the following in the brick log file:

`The message "W [MSGID: 113117] [posix-metadata.c:627:posix_set_ctime] 0-gvol1-posix: posix set mdata failed, No ctime : /srv/brick1/gvol1/.glusterfs/18/d0/18d04927-1ec0-4779-8c5b-7ebb82e4a614 gfid:18d04927-1ec0-4779-8c5b-7ebb82e4a614 [Function not implemented]" repeated 2 times between [2018-09-21 08:21:52.480797] and [2018-09-21 08:22:07.529625]`

It happens for different files, but the most common one is a file that the Node.js application (which runs on top of the gluster volume via a FUSE client, glusterfs) stats every 5s for changes using fs.stat: https://nodejs.org/api/fs.html#fs_fs_stat_path_options_callback
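
To compare timestamps with the brick log, I've been mimicking that poll from the shell against the FUSE mount; the mount point and file path below are placeholders, not the real ones:

```
# rough shell equivalent of the app's 5-second fs.stat poll,
# run against the FUSE mount to line its timestamps up with the brick log
while true; do
    date -u '+%Y-%m-%d %H:%M:%S'
    stat --format='%n size=%s mtime=%y ctime=%z' /mnt/gvol1/applications.json
    sleep 5
done
```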

I think this is also related to another issue: reading the file sometimes (not always) returns an empty result, as the app reports:

`2018-09-21 08:22:00 | [vm0] [nobody] sync hosts: invalid applications.json, response was empty.`

Doing `gluster volume heal gvol1 info` yields 0 for all bricks.
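
In case it helps, I've also been checking the same file straight on each brick, to see whether the empty read happens at the brick level or only through the FUSE mount. The relative path to applications.json below is a placeholder, and the trusted.glusterfs.mdata xattr check (which I believe is where the ctime feature keeps its timestamps) has to run as root:

```
# compare size/checksum of the same file on all three bricks,
# and dump the ctime metadata xattr while at it (trusted.* xattrs need root)
for host in vm0 vm1 vm2; do
    echo "== $host =="
    ssh "$host" "stat -c '%s bytes, mtime %y' /srv/brick1/gvol1/applications.json; \
                 md5sum /srv/brick1/gvol1/applications.json; \
                 getfattr -n trusted.glusterfs.mdata -e hex /srv/brick1/gvol1/applications.json 2>/dev/null"
done
```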

Should I be concerned about the warning? Is this a known issue? If not, what could be causing the occasional empty file reads, and could the two be related?

The application running on top of the cluster builds and spawns other Node.js applications with mostly small files; do you have any optimization tips for that kind of workload?

FYI it is a slightly modified version of https://github.com/totaljs/superadmin to run as a web farm.

Thank you so much for any help in advance,

Kind Regards,
Pedro Maia Costa






_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users