[Gluster-users] nfs-ganesha locking problems
Bernhard Dübi
1linuxengineer at gmail.com
Fri Sep 29 15:39:59 UTC 2017
Hi,
I have a problem with nfs-ganesha serving Gluster volumes.
I can read and write files, but when one of the DBAs tried to dump an
Oracle DB onto the NFS share, he got the following errors:
Export: Release 11.2.0.4.0 - Production on Wed Sep 27 23:27:48 2017
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release
11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
ORA-39001: invalid argument value
ORA-39000: bad dump file specification
ORA-31641: unable to create dump file
"/u00/app/oracle/DB_BACKUPS/FPESSP11/riskdw_prod_tabs_28092017_01.dmp"
ORA-27086: unable to lock file - already in use
Linux-x86_64 Error: 37: No locks available
Additional information: 10
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
The file exists and is accessible.
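To take Oracle out of the picture, the kind of fcntl() byte-range lock that
the export job appears to take (ENOLCK comes from a failed fcntl lock) can be
reproduced by hand from one of the DB hosts. This is only a test sketch: the
directory is taken from the error above, locktest.tmp is a throwaway file,
and it assumes Python is available on the client:

# on the NFS client (Oracle host), inside the mounted share
cd /u00/app/oracle/DB_BACKUPS/FPESSP11
touch locktest.tmp
# one way to issue a POSIX (fcntl) lock from the shell
python -c 'import fcntl; f = open("locktest.tmp", "w"); fcntl.lockf(f, fcntl.LOCK_EX); print("lock acquired")'

If the one-liner also fails with "No locks available" (errno 37, ENOLCK),
the problem is in the NFS locking path and not in the Oracle tooling.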
Details:
There are 2 Gluster clusters involved:
The first cluster hosts a number of "replica 3 arbiter 1" volumes.
The second cluster only hosts the cluster.enable-shared-storage volume
across 3 nodes; it also runs nfs-ganesha in a cluster configuration
(pacemaker, corosync). nfs-ganesha serves the volumes from the first
cluster.
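If the clients mount with NFSv3 (an assumption here, I have not confirmed the
mount options), byte-range locks go through the lock manager (NLM) and
rpc.statd rather than in-protocol as with NFSv4. A quick sanity check is to
confirm from a client that nlockmgr and status are registered on the ganesha
nodes; 10.30.201.39 is one of the ganesha VIPs from ganesha-ha.conf below,
and the other two should behave the same:

# from one of the Oracle client hosts
rpcinfo -p 10.30.201.39 | egrep 'nlockmgr|status'
rpcinfo -t 10.30.201.39 nlockmgr

If nlockmgr is not registered or not reachable, the client kernel fails every
fcntl() lock with ENOLCK (errno 37), which matches the ORA-27086 above.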
Any idea what's wrong?
Kind Regards
Bernhard
CLUSTER 1 info
==============
root at chglbcvtprd04:/etc# cat os-release
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
root at chglbcvtprd04:/etc# cat lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.3 LTS"
root at chglbcvtprd04:/etc# dpkg -l | grep gluster | sort
ii  glusterfs-client  3.8.15-ubuntu1~xenial1  amd64  clustered file-system (client package)
ii  glusterfs-common  3.8.15-ubuntu1~xenial1  amd64  GlusterFS common libraries and translator modules
ii  glusterfs-server  3.8.15-ubuntu1~xenial1  amd64  clustered file-system (server package)
root at chglbcvtprd04:~# gluster volume status ora_dump
Status of volume: ora_dump
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick chastcvtprd04:/data/glusterfs/ora_dum
p/2I-1-39/brick 49772 0 Y 11048
Brick chglbcvtprd04:/data/glusterfs/ora_dum
p/2I-1-39/brick 50108 0 Y 9990
Brick chealglaprd01:/data/glusterfs/arbiter
/vol01/ora_dump.2I-1-39 49200 0 Y 3114
Brick chastcvtprd04:/data/glusterfs/ora_dum
p/1I-1-18/brick 49773 0 Y 11085
Brick chglbcvtprd04:/data/glusterfs/ora_dum
p/1I-1-18/brick 50109 0 Y 10000
Brick chealglaprd01:/data/glusterfs/arbiter
/vol02/ora_dump.1I-1-18 49201 0 Y 3080
Brick chastcvtprd04:/data/glusterfs/ora_dum
p/2I-1-48/brick 49774 0 Y 11091
Brick chglbcvtprd04:/data/glusterfs/ora_dum
p/2I-1-48/brick 50110 0 Y 10007
Brick chealglaprd01:/data/glusterfs/arbiter
/vol03/ora_dump.2I-1-48 49202 0 Y 3070
Brick chastcvtprd04:/data/glusterfs/ora_dum
p/1I-1-25/brick 49775 0 Y 11152
Brick chglbcvtprd04:/data/glusterfs/ora_dum
p/1I-1-25/brick 50111 0 Y 10012
Brick chealglaprd01:/data/glusterfs/arbiter
/vol04/ora_dump.1I-1-25 49203 0 Y 3090
Self-heal Daemon on localhost N/A N/A Y 27438
Self-heal Daemon on chealglaprd01 N/A N/A Y 32209
Self-heal Daemon on chastcvtprd04.fpprod.co
rp N/A N/A Y 27378
root at chglbcvtprd04:~# gluster volume info ora_dump
Volume Name: ora_dump
Type: Distributed-Replicate
Volume ID: b26e649d-d1fe-4ebc-aa03-b196c8925466
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (2 + 1) = 12
Transport-type: tcp
Bricks:
Brick1: chastcvtprd04:/data/glusterfs/ora_dump/2I-1-39/brick
Brick2: chglbcvtprd04:/data/glusterfs/ora_dump/2I-1-39/brick
Brick3: chealglaprd01:/data/glusterfs/arbiter/vol01/ora_dump.2I-1-39 (arbiter)
Brick4: chastcvtprd04:/data/glusterfs/ora_dump/1I-1-18/brick
Brick5: chglbcvtprd04:/data/glusterfs/ora_dump/1I-1-18/brick
Brick6: chealglaprd01:/data/glusterfs/arbiter/vol02/ora_dump.1I-1-18 (arbiter)
Brick7: chastcvtprd04:/data/glusterfs/ora_dump/2I-1-48/brick
Brick8: chglbcvtprd04:/data/glusterfs/ora_dump/2I-1-48/brick
Brick9: chealglaprd01:/data/glusterfs/arbiter/vol03/ora_dump.2I-1-48 (arbiter)
Brick10: chastcvtprd04:/data/glusterfs/ora_dump/1I-1-25/brick
Brick11: chglbcvtprd04:/data/glusterfs/ora_dump/1I-1-25/brick
Brick12: chealglaprd01:/data/glusterfs/arbiter/vol04/ora_dump.1I-1-25 (arbiter)
Options Reconfigured:
auth.allow: 127.0.0.1,10.30.28.43,10.30.28.44,10.8.13.132,10.30.28.36,10.30.28.37,10.30.201.30,10.30.201.31,10.30.201.32,10.30.201.39,10.30.201.43,10.30.201.44
nfs.rpc-auth-allow: all
performance.readdir-ahead: on
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on
features.bitrot: off
features.scrub: Inactive
nfs.disable: on
features.cache-invalidation: on
root at chglbcvtprd04:~# gluster volume get ora_dump all
Option Value
------ -----
cluster.lookup-unhashed on
cluster.lookup-optimize off
cluster.min-free-disk 10%
cluster.min-free-inodes 5%
cluster.rebalance-stats off
cluster.subvols-per-directory (null)
cluster.readdir-optimize off
cluster.rsync-hash-regex (null)
cluster.extra-hash-regex (null)
cluster.dht-xattr-name trusted.glusterfs.dht
cluster.randomize-hash-range-by-gfid off
cluster.rebal-throttle normal
cluster.lock-migration off
cluster.local-volume-name (null)
cluster.weighted-rebalance on
cluster.switch-pattern (null)
cluster.entry-change-log on
cluster.read-subvolume (null)
cluster.read-subvolume-index -1
cluster.read-hash-mode 1
cluster.background-self-heal-count 8
cluster.metadata-self-heal on
cluster.data-self-heal on
cluster.entry-self-heal on
cluster.self-heal-daemon on
cluster.heal-timeout 600
cluster.self-heal-window-size 1
cluster.data-change-log on
cluster.metadata-change-log on
cluster.data-self-heal-algorithm (null)
cluster.eager-lock on
disperse.eager-lock on
cluster.quorum-type none
cluster.quorum-count (null)
cluster.choose-local true
cluster.self-heal-readdir-size 1KB
cluster.post-op-delay-secs 1
cluster.ensure-durability on
cluster.consistent-metadata no
cluster.heal-wait-queue-length 128
cluster.favorite-child-policy none
cluster.stripe-block-size 128KB
cluster.stripe-coalesce true
diagnostics.latency-measurement on
diagnostics.dump-fd-stats off
diagnostics.count-fop-hits on
diagnostics.brick-log-level INFO
diagnostics.client-log-level INFO
diagnostics.brick-sys-log-level CRITICAL
diagnostics.client-sys-log-level CRITICAL
diagnostics.brick-logger (null)
diagnostics.client-logger (null)
diagnostics.brick-log-format (null)
diagnostics.client-log-format (null)
diagnostics.brick-log-buf-size 5
diagnostics.client-log-buf-size 5
diagnostics.brick-log-flush-timeout 120
diagnostics.client-log-flush-timeout 120
diagnostics.stats-dump-interval 0
diagnostics.fop-sample-interval 0
diagnostics.fop-sample-buf-size 65535
diagnostics.stats-dnscache-ttl-sec 86400
performance.cache-max-file-size 0
performance.cache-min-file-size 0
performance.cache-refresh-timeout 1
performance.cache-priority
performance.cache-size 32MB
performance.io-thread-count 16
performance.high-prio-threads 16
performance.normal-prio-threads 16
performance.low-prio-threads 16
performance.least-prio-threads 1
performance.enable-least-priority on
performance.least-rate-limit 0
performance.cache-size 128MB
performance.flush-behind on
performance.nfs.flush-behind on
performance.write-behind-window-size 1MB
performance.resync-failed-syncs-after-fsync off
performance.nfs.write-behind-window-size 1MB
performance.strict-o-direct off
performance.nfs.strict-o-direct off
performance.strict-write-ordering off
performance.nfs.strict-write-ordering off
performance.lazy-open yes
performance.read-after-open no
performance.read-ahead-page-count 4
performance.md-cache-timeout 1
performance.cache-swift-metadata true
features.encryption off
encryption.master-key (null)
encryption.data-key-size 256
encryption.block-size 4096
network.frame-timeout 1800
network.ping-timeout 42
network.tcp-window-size (null)
features.lock-heal off
features.grace-timeout 10
network.remote-dio disable
client.event-threads 2
network.ping-timeout 42
network.tcp-window-size (null)
network.inode-lru-limit 16384
auth.allow                              127.0.0.1,10.30.28.43,10.30.28.44,10.8.13.132,10.30.28.36,10.30.28.37,10.30.201.30,10.30.201.31,10.30.201.32,10.30.201.39,10.30.201.43,10.30.201.44
auth.reject (null)
transport.keepalive (null)
server.allow-insecure (null)
server.root-squash off
server.anonuid 65534
server.anongid 65534
server.statedump-path /var/run/gluster
server.outstanding-rpc-limit 64
features.lock-heal off
features.grace-timeout 10
server.ssl (null)
auth.ssl-allow *
server.manage-gids off
server.dynamic-auth on
client.send-gids on
server.gid-timeout 300
server.own-thread (null)
server.event-threads 2
ssl.own-cert (null)
ssl.private-key (null)
ssl.ca-list (null)
ssl.crl-path (null)
ssl.certificate-depth (null)
ssl.cipher-list (null)
ssl.dh-param (null)
ssl.ec-curve (null)
performance.write-behind on
performance.read-ahead on
performance.readdir-ahead on
performance.io-cache on
performance.quick-read on
performance.open-behind on
performance.stat-prefetch on
performance.client-io-threads off
performance.nfs.write-behind on
performance.nfs.read-ahead off
performance.nfs.io-cache off
performance.nfs.quick-read off
performance.nfs.stat-prefetch off
performance.nfs.io-threads off
performance.force-readdirp true
features.uss off
features.snapshot-directory .snaps
features.show-snapshot-directory off
network.compression off
network.compression.window-size -15
network.compression.mem-level 8
network.compression.min-size 0
network.compression.compression-level -1
network.compression.debug false
features.limit-usage (null)
features.quota-timeout 0
features.default-soft-limit 80%
features.soft-timeout 60
features.hard-timeout 5
features.alert-time 86400
features.quota-deem-statfs off
geo-replication.indexing off
geo-replication.indexing off
geo-replication.ignore-pid-check off
geo-replication.ignore-pid-check off
features.quota off
features.inode-quota off
features.bitrot off
debug.trace off
debug.log-history no
debug.log-file no
debug.exclude-ops (null)
debug.include-ops (null)
debug.error-gen off
debug.error-failure (null)
debug.error-number (null)
debug.random-failure off
debug.error-fops (null)
nfs.enable-ino32 no
nfs.mem-factor 15
nfs.export-dirs on
nfs.export-volumes on
nfs.addr-namelookup off
nfs.dynamic-volumes off
nfs.register-with-portmap on
nfs.outstanding-rpc-limit 16
nfs.port 2049
nfs.rpc-auth-unix on
nfs.rpc-auth-null on
nfs.rpc-auth-allow all
nfs.rpc-auth-reject none
nfs.ports-insecure off
nfs.trusted-sync off
nfs.trusted-write off
nfs.volume-access read-write
nfs.export-dir
nfs.disable on
nfs.nlm on
nfs.acl on
nfs.mount-udp off
nfs.mount-rmtab /var/lib/glusterd/nfs/rmtab
nfs.rpc-statd /sbin/rpc.statd
nfs.server-aux-gids off
nfs.drc off
nfs.drc-size 0x20000
nfs.read-size (1 * 1048576ULL)
nfs.write-size (1 * 1048576ULL)
nfs.readdir-size (1 * 1048576ULL)
nfs.rdirplus on
nfs.exports-auth-enable (null)
nfs.auth-refresh-interval-sec (null)
nfs.auth-cache-ttl-sec (null)
features.read-only off
features.worm off
features.worm-file-level off
features.default-retention-period 120
features.retention-mode relax
features.auto-commit-period 180
storage.linux-aio off
storage.batch-fsync-mode reverse-fsync
storage.batch-fsync-delay-usec 0
storage.owner-uid -1
storage.owner-gid -1
storage.node-uuid-pathinfo off
storage.health-check-interval 30
storage.build-pgfid off
storage.bd-aio off
cluster.server-quorum-type off
cluster.server-quorum-ratio 0
changelog.changelog off
changelog.changelog-dir (null)
changelog.encoding ascii
changelog.rollover-time 15
changelog.fsync-interval 5
changelog.changelog-barrier-timeout 120
changelog.capture-del-path off
features.barrier disable
features.barrier-timeout 120
features.trash off
features.trash-dir .trashcan
features.trash-eliminate-path (null)
features.trash-max-filesize 5MB
features.trash-internal-op off
cluster.enable-shared-storage disable
cluster.write-freq-threshold 0
cluster.read-freq-threshold 0
cluster.tier-pause off
cluster.tier-promote-frequency 120
cluster.tier-demote-frequency 3600
cluster.watermark-hi 90
cluster.watermark-low 75
cluster.tier-mode cache
cluster.tier-max-promote-file-size 0
cluster.tier-max-mb 4000
cluster.tier-max-files 10000
features.ctr-enabled off
features.record-counters off
features.ctr-record-metadata-heat off
features.ctr_link_consistency off
features.ctr_lookupheal_link_timeout 300
features.ctr_lookupheal_inode_timeout 300
features.ctr-sql-db-cachesize 1000
features.ctr-sql-db-wal-autocheckpoint 1000
locks.trace off
locks.mandatory-locking off
cluster.disperse-self-heal-daemon enable
cluster.quorum-reads no
client.bind-insecure (null)
ganesha.enable off
features.shard off
features.shard-block-size 4MB
features.scrub-throttle lazy
features.scrub-freq biweekly
features.scrub Inactive
features.expiry-time 120
features.cache-invalidation on
features.cache-invalidation-timeout 60
features.leases off
features.lease-lock-recall-timeout 60
disperse.background-heals 8
disperse.heal-wait-qlength 128
cluster.heal-timeout 600
dht.force-readdirp on
disperse.read-policy round-robin
cluster.shd-max-threads 1
cluster.shd-wait-qlength 1024
cluster.locking-scheme full
cluster.granular-entry-heal no
CLUSTER 2 info
==============
[root at chvirnfsprd10 etc]# cat os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
[root at chvirnfsprd10 etc]# cat centos-release
CentOS Linux release 7.3.1611 (Core)
[root at chvirnfsprd10 ~]# rpm -qa | grep gluster | sort
centos-release-gluster38-1.0-1.el7.centos.noarch
glusterfs-3.8.15-2.el7.x86_64
glusterfs-api-3.8.15-2.el7.x86_64
glusterfs-cli-3.8.15-2.el7.x86_64
glusterfs-client-xlators-3.8.15-2.el7.x86_64
glusterfs-fuse-3.8.15-2.el7.x86_64
glusterfs-ganesha-3.8.15-2.el7.x86_64
glusterfs-libs-3.8.15-2.el7.x86_64
glusterfs-resource-agents-3.8.15-2.el7.noarch
glusterfs-server-3.8.15-2.el7.x86_64
nfs-ganesha-gluster-2.3.3-1.el7.x86_64
[root at chvirnfsprd10 sssd]# rpm -qa | grep ganesha | sort
glusterfs-ganesha-3.8.15-2.el7.x86_64
nfs-ganesha-2.3.3-1.el7.x86_64
nfs-ganesha-gluster-2.3.3-1.el7.x86_64
[root at chvirnfsprd10 ~]# gluster volume status
Status of volume: gluster_shared_storage
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick chvirnfsprd11:/var/lib/glusterd/ss_br
ick 49155 0 Y 1054
Brick chvirnfsprd12:/var/lib/glusterd/ss_br
ick 49155 0 Y 1434
Brick chvirnfsprd10.fpprod.corp:/var/lib/gl
usterd/ss_brick 49155 0 Y 1474
Self-heal Daemon on localhost N/A N/A Y 12196
Self-heal Daemon on chvirnfsprd11 N/A N/A Y 32110
Self-heal Daemon on chvirnfsprd12 N/A N/A Y 2877
Task Status of Volume gluster_shared_storage
------------------------------------------------------------------------------
There are no active volume tasks
[root at chvirnfsprd10 ~]# cat /etc/ganesha/ganesha.conf
NFS_Core_Param {
    # Use supplied name other than IP in NSM operations
    NSM_Use_Caller_Name = true;
    # Copy lock states into "/var/lib/nfs/ganesha" dir
    Clustered = true;
    # Use a non-privileged port for RQuota
    Rquota_Port = 875;
}
%include /etc/ganesha/exports/ora_dump.conf
%include /etc/ganesha/exports/chzrhcvtprd04.conf
[root at chvirnfsprd10 ~]# cat /etc/ganesha/exports/ora_dump.conf
EXPORT
{
    # Export Id (mandatory, each EXPORT must have a unique Export_Id)
    Export_Id = 77;
    # Exported path (mandatory)
    Path = /ora_dump;
    # Pseudo Path (required for NFS v4)
    Pseudo = /ora_dump;
    # Exporting FSAL
    FSAL {
        Name = GLUSTER;
        Hostname = 10.30.28.43;
        Volume = ora_dump;
    }
    CLIENT {
        # Oracle Servers
        Clients = 10.30.29.125,10.30.28.25,10.30.28.64,10.30.29.123,10.30.28.21,10.30.28.81,10.30.29.124,10.30.28.82,10.30.29.111;
        Access_Type = RW;
    }
}
[root at chvirnfsprd10 ~]# cat /etc/ganesha/ganesha-ha.conf
HA_NAME="ltq-prd-nfs"
HA_VOL_SERVER="chvirnfsprd10"
HA_CLUSTER_NODES="chvirnfsprd10,chvirnfsprd11,chvirnfsprd12"
VIP_chvirnfsprd10="10.30.201.39"
VIP_chvirnfsprd11="10.30.201.43"
VIP_chvirnfsprd12="10.30.201.44"
[root at chvirnfsprd10 ~]# pcs status
Cluster name: ltq-prd-nfs
Stack: corosync
Current DC: chvirnfsprd11 (version 1.1.15-11.el7_3.5-e174ec8) - partition with quorum
Last updated: Fri Sep 29 15:01:26 2017          Last change: Mon Sep 18 11:40:45 2017 by root via crm_attribute on chvirnfsprd12
3 nodes and 12 resources configured
Online: [ chvirnfsprd10 chvirnfsprd11 chvirnfsprd12 ]
Full list of resources:
Clone Set: nfs_setup-clone [nfs_setup]
Started: [ chvirnfsprd10 chvirnfsprd11 chvirnfsprd12 ]
Clone Set: nfs-mon-clone [nfs-mon]
Started: [ chvirnfsprd10 chvirnfsprd11 chvirnfsprd12 ]
Clone Set: nfs-grace-clone [nfs-grace]
Started: [ chvirnfsprd10 chvirnfsprd11 chvirnfsprd12 ]
chvirnfsprd10-cluster_ip-1  (ocf::heartbeat:IPaddr):  Started chvirnfsprd10
chvirnfsprd11-cluster_ip-1  (ocf::heartbeat:IPaddr):  Started chvirnfsprd11
chvirnfsprd12-cluster_ip-1  (ocf::heartbeat:IPaddr):  Started chvirnfsprd12
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled