[Gluster-users] does your samba work with 4.1.x (centos 7.5)

Diego Remolina dijuremo at gmail.com
Thu Nov 15 14:34:10 UTC 2018


I will try to do the wireshark captures within a week from now.
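
In case it helps, this is roughly how I plan to take the capture on the
samba/gluster server side (just a sketch; the interface name and client
IP below are placeholders, not our real values):

  # Capture SMB traffic between the Windows client and this server
  tcpdump -i eth0 -s 0 -w /tmp/smb-revit.pcap host 10.0.1.100 and port 445

I can also run Wireshark directly on the Windows client while repeating
the save-as-central operation, if that is more useful.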

Up until recently the problems were only with Revit central files, but
then we lost a server and ran on one node (out of two) for a few days
until the replacement motherboard arrived, and during that time we
started seeing problems with other files while writing them. At that
point I decided to switch everything to FUSE mounts to get rid of vfs
objects = glusterfs in all shares (see the sketch below). I also
downgraded samba by one version (now running 4.7.1-6.el7_5), since
shortly before the new errors appeared I had upgraded to the latest
version CentOS had published at the time (4.7.1-9.el7_5).
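
In practice, switching a share to FUSE just meant pointing its path at
the FUSE mountpoint and dropping the glusterfs-specific options, roughly
like this (a sketch only; the [projects] share name and subdirectory are
placeholders, the real shares follow the full smb.conf I sent earlier):

  [projects]
     path = /export/projects
     browseable = yes
     create mask = 660
     directory mask = 770
     write list = @Staff

with the volume FUSE-mounted at /export on both samba/ctdb nodes.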

We are also working on moving oVirt and the self-hosted engine off
these servers, so once that happens I can even upgrade to the latest
glusterfs 4.1.5 and run tests on a more current glusterfs version.

Someone at the office is in contact with an IT person who worked at a
large architectural firm; that person also mentioned that gluster did
not work with Revit when they tested it and it failed. I have no other
details than this.

Could you also try to reproduce it yourself? That may make it easier
for you to capture all the information you need. A free trial of Revit
is available at:

https://www.autodesk.com/products/revit/free-trial

You can just take one of the demo files that Revit comes with and save
it as a central file while it is stored on a gluster volume shared
through samba with vfs objects = glusterfs; that should trigger the
problems. A sketch of such a test share follows.
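
Something along these lines should work as a test share (just a sketch
based on my own setup; the share name, the /testdir subdirectory and the
volume name "export" are examples, adjust them to your environment):

  [vfsglustertest]
     path = /testdir
     read only = no
     kernel share modes = no
     vfs objects = glusterfs
     glusterfs:volume = export
     glusterfs:logfile = /var/log/samba/glusterfs-vfsglustertest.log
     glusterfs:loglevel = 7

With vfs objects = glusterfs the path is a subdirectory relative to the
root of the gluster volume, as with my [vfsgluster] share. Map the share
from Windows, open one of the Revit sample projects and save it as a
central model into the share.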

Diego
On Thu, Nov 15, 2018 at 8:04 AM Anoop C S <anoopcs at cryptolab.net> wrote:
>
> On Wed, 2018-11-14 at 22:19 -0500, Diego Remolina wrote:
> > Hi,
> >
> > Please download the logs from:
> >
> > https://www.dropbox.com/s/4k0zvmn4izhjtg7/samba-logs.tar.bz2?dl=0
>
> [2018/11/14 22:01:31.974084, 10, pid=7577, effective(1009, 513), real(1009, 0)]
> ../source3/smbd/smb2_server.c:2279(smbd_smb2_request_dispatch)
>   smbd_smb2_request_dispatch: opcode[SMB2_OP_FLUSH] mid = 9303
> [2018/11/14 22:01:31.974123,  4, pid=7577, effective(1009, 513), real(1009, 0)]
> ../source3/smbd/uid.c:384(change_to_user)
>   Skipping user change - already user
> [2018/11/14 22:01:31.974163, 10, pid=7577, effective(1009, 513), real(1009, 0)]
> ../source3/smbd/smb2_flush.c:134(smbd_smb2_flush_send)
>   smbd_smb2_flush: Test-Project_backup/preview.1957.dat - fnum 2399596398
>
> I see the above flush request (SMB2_OP_FLUSH) logged without a response, unlike the other
> request-response pairs for SMB2_OP_QUERY_DIRECTORY, SMB2_OP_IOCTL, SMB2_OP_WRITE and SMB2_OP_READ,
> which brings me to the following bug:
>
> https://bugzilla.samba.org/show_bug.cgi?id=13297
>
> Is it also possible for you to collect network traces using wireshark from the Windows client to the
> samba server, along with the corresponding samba logs?
>
> > These options had to be set in the [global] section:
> > kernel change notify = no
> > kernel oplocks = no
>
> Well, 'kernel oplocks' and 'posix locking' are listed as (S) in smb.conf(5), but share-level parameters
> can also be used in the [global] section, in which case they affect all services. Anyway, this should be
> fine.
>
> > I also set log level = 10
>
> Good one.
>
> > I renamed the file to Test-Project.rvt for simplicity and opened it
> > from within Revit. I then started attempting to save it as a central
> > at around 22:01:30. The save got stuck, and at around 22:05 it
> > finally failed, saying that two .dat files did not exist.
>
> Do you experience problems with other applications and/or with other file types? Is this specific to
> Revit?
>
> > Diego
> > On Tue, Nov 13, 2018 at 8:46 AM Anoop C S <anoopcs at autistici.org> wrote:
> > >
> > > On Tue, 2018-11-13 at 07:50 -0500, Diego Remolina wrote:
> > > > >
> > > > > Thanks for explaining the issue.
> > > > >
> > > > > I understand that you are experiencing a hang while doing some operations on files/directories
> > > > > in a GlusterFS volume share from a Windows client. For simplicity, can you attach the output of
> > > > > the following commands:
> > > > >
> > > > > # gluster volume info <volume>
> > > > > # testparm -s --section-name global
> > > >
> > > > gluster v status export
> > > > Status of volume: export
> > > > Gluster process                             TCP Port  RDMA Port  Online  Pid
> > > > ------------------------------------------------------------------------------
> > > > Brick 10.0.1.7:/bricks/hdds/brick           49153     0          Y       2540
> > > > Brick 10.0.1.6:/bricks/hdds/brick           49153     0          Y       2800
> > > > Self-heal Daemon on localhost               N/A       N/A        Y       2912
> > > > Self-heal Daemon on 10.0.1.6                N/A       N/A        Y       3107
> > > > Self-heal Daemon on 10.0.1.5                N/A       N/A        Y       5877
> > > >
> > > > Task Status of Volume export
> > > > ------------------------------------------------------------------------------
> > > > There are no active volume tasks
> > > >
> > > > # gluster volume info export
> > > >
> > > > Volume Name: export
> > > > Type: Replicate
> > > > Volume ID: b4353b3f-6ef6-4813-819a-8e85e5a95cff
> > > > Status: Started
> > > > Snapshot Count: 0
> > > > Number of Bricks: 1 x 2 = 2
> > > > Transport-type: tcp
> > > > Bricks:
> > > > Brick1: 10.0.1.7:/bricks/hdds/brick
> > > > Brick2: 10.0.1.6:/bricks/hdds/brick
> > > > Options Reconfigured:
> > > > diagnostics.brick-log-level: INFO
> > > > diagnostics.client-log-level: INFO
> > > > performance.cache-max-file-size: 256MB
> > > > client.event-threads: 5
> > > > server.event-threads: 5
> > > > cluster.readdir-optimize: on
> > > > cluster.lookup-optimize: on
> > > > performance.io-cache: on
> > > > performance.io-thread-count: 64
> > > > nfs.disable: on
> > > > cluster.server-quorum-type: server
> > > > performance.cache-size: 10GB
> > > > server.allow-insecure: on
> > > > transport.address-family: inet
> > > > performance.cache-samba-metadata: on
> > > > features.cache-invalidation-timeout: 600
> > > > performance.md-cache-timeout: 600
> > > > features.cache-invalidation: on
> > > > performance.cache-invalidation: on
> > > > network.inode-lru-limit: 65536
> > > > performance.cache-min-file-size: 0
> > > > performance.stat-prefetch: on
> > > > cluster.server-quorum-ratio: 51%
> > > >
> > > > I had already sent you the full smb.conf, so there is no need to run
> > > > testparm -s --section-name global; please see:
> > > > http://termbin.com/y4j0
> > >
> > > Fine.
> > >
> > > > >
> > > > > > This is the test samba share exported using vfs objects = glusterfs:
> > > > > >
> > > > > > [vfsgluster]
> > > > > >    path = /vfsgluster
> > > > > >    browseable = yes
> > > > > >    create mask = 660
> > > > > >    directory mask = 770
> > > > > >    write list = @Staff
> > > > > >    kernel share modes = No
> > > > > >    vfs objects = glusterfs
> > > > > >    glusterfs:loglevel = 7
> > > > > >    glusterfs:logfile = /var/log/samba/glusterfs-vfsgluster.log
> > > > > >    glusterfs:volume = export
> > > > >
> > > > > Since you have mentioned the path as /vfsgluster, I hope you are sharing a subdirectory under
> > > > > the root of the volume.
> > > >
> > > > Yes, vfsgluster is a directory at the root of the export volume.
> > >
> > > Thanks for the confirmation.
> > >
> > > > It is also currently mounted at /export so that the rest of the files can be
> > > > exported via samba from FUSE mounts:
> > > >
> > > > # mount | grep export
> > > > 10.0.1.7:/export on /export type fuse.glusterfs
> > > > (rw,relatime,user_id=0,group_id=0,allow_other,max_read=131072)
> > > >
> > > > # ls -ld /export/vfsgluster
> > > > drwxrws---. 4 dijuremo Staff 4096 Nov 12 20:24 /export/vfsgluster
> > > >
> > > > >
> > > > > > Full smb.conf
> > > > > > http://termbin.com/y4j0
> > > > >
> > > > > I see the "clustering" parameter set to 'yes'. How many nodes are there in the cluster? Out of
> > > > > those how many are running as samba and/or gluster nodes?
> > > > >
> > > >
> > > > There are a total of 3 gluster peers, but only two have bricks. The
> > > > other is just present, but not even configured as an arbiter. Two of
> > > > the nodes with bricks run ctdb and samba.
> > >
> > > OK. So basically a two node Samba-CTDB cluster.
> > >
> > > > > > /var/log/samba/glusterfs-vfsgluster.log
> > > > > > http://termbin.com/5hdr
> > > > > >
> > > > > > Please let me know if there is any other information I can provide.
> > > > >
> > > > > Are there any errors in /var/log/samba/log.<IP/hostname>? IP/hostname = Windows client machine
> > > > >
> > > >
> > > > I do not currently have the log file directive enabled in smb.conf; I
> > > > would have to enable it. Do you need me to repeat the process with it
> > > > enabled?
> > >
> > > Yes, preferably after adding the following parameters to the [vfsgluster] share section (and of
> > > course a restart):
> > >
> > > kernel change notify = no
> > > kernel oplocks = no
> > > posix locking = no
> > >
>

