From amukherj at redhat.com Sat Jun 1 11:25:12 2019 From: amukherj at redhat.com (Atin Mukherjee) Date: Sat, 1 Jun 2019 16:55:12 +0530 Subject: [Gluster-devel] Fwd: [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #1359 In-Reply-To: <24208463.92.1559325814227.JavaMail.jenkins@jenkins-el7.rht.gluster.org> References: <24208463.92.1559325814227.JavaMail.jenkins@jenkins-el7.rht.gluster.org> Message-ID: subdir-mount.t has started failing in brick mux regression nightly. This needs to be fixed. Raghavendra - did we manage to get any further clue on uss.t failure? ---------- Forwarded message --------- From: Date: Fri, 31 May 2019 at 23:34 Subject: [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #1359 To: , , , < amukherj at redhat.com>, See < https://build.gluster.org/job/regression-test-with-multiplex/1359/display/redirect?page=changes > Changes: [atin] glusterd: add an op-version check [atin] glusterd/svc: glusterd_svcs_stop should call individual wrapper function [atin] glusterd/svc: Stop stale process using the glusterd_proc_stop [Amar Tumballi] lcov: more coverage to shard, old-protocol, sdfs [Kotresh H R] tests/geo-rep: Add EC volume test case [Amar Tumballi] glusterfsd/cleanup: Protect graph object under a lock [Mohammed Rafi KC] glusterd/shd: Optimize the glustershd manager to send reconfigure [Kotresh H R] tests/geo-rep: Add tests to cover glusterd geo-rep [atin] glusterd: Optimize code to copy dictionary in handshake code path ------------------------------------------ [...truncated 3.18 MB...] ./tests/basic/afr/stale-file-lookup.t - 9 second ./tests/basic/afr/granular-esh/replace-brick.t - 9 second ./tests/basic/afr/granular-esh/add-brick.t - 9 second ./tests/basic/afr/gfid-mismatch.t - 9 second ./tests/performance/open-behind.t - 8 second ./tests/features/ssl-authz.t - 8 second ./tests/features/readdir-ahead.t - 8 second ./tests/bugs/upcall/bug-1458127.t - 8 second ./tests/bugs/transport/bug-873367.t - 8 second ./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t - 8 second ./tests/bugs/replicate/bug-1132102.t - 8 second ./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t - 8 second ./tests/bugs/quota/bug-1104692.t - 8 second ./tests/bugs/posix/bug-1360679.t - 8 second ./tests/bugs/posix/bug-1122028.t - 8 second ./tests/bugs/nfs/bug-1157223-symlink-mounting.t - 8 second ./tests/bugs/glusterfs/bug-861015-log.t - 8 second ./tests/bugs/glusterd/sync-post-glusterd-restart.t - 8 second ./tests/bugs/glusterd/bug-1696046.t - 8 second ./tests/bugs/fuse/bug-983477.t - 8 second ./tests/bugs/ec/bug-1227869.t - 8 second ./tests/bugs/distribute/bug-1088231.t - 8 second ./tests/bugs/distribute/bug-1086228.t - 8 second ./tests/bugs/cli/bug-1087487.t - 8 second ./tests/bugs/cli/bug-1022905.t - 8 second ./tests/bugs/bug-1258069.t - 8 second ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t - 8 second ./tests/basic/xlator-pass-through-sanity.t - 8 second ./tests/basic/quota-nfs.t - 8 second ./tests/basic/glusterd/arbiter-volume.t - 8 second ./tests/basic/ctime/ctime-noatime.t - 8 second ./tests/line-coverage/cli-peer-and-volume-operations.t - 7 second ./tests/gfid2path/get-gfid-to-path.t - 7 second ./tests/bugs/upcall/bug-1369430.t - 7 second ./tests/bugs/snapshot/bug-1260848.t - 7 second ./tests/bugs/shard/shard-inode-refcount-test.t - 7 second ./tests/bugs/shard/bug-1258334.t - 7 second ./tests/bugs/replicate/bug-767585-gfid.t - 7 second 
./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t - 7 second ./tests/bugs/replicate/bug-1250170-fsync.t - 7 second ./tests/bugs/posix/bug-1175711.t - 7 second ./tests/bugs/nfs/bug-915280.t - 7 second ./tests/bugs/md-cache/setxattr-prepoststat.t - 7 second ./tests/bugs/md-cache/bug-1211863_unlink.t - 7 second ./tests/bugs/glusterfs/bug-848251.t - 7 second ./tests/bugs/distribute/bug-1122443.t - 7 second ./tests/bugs/changelog/bug-1208470.t - 7 second ./tests/bugs/bug-1702299.t - 7 second ./tests/bugs/bug-1371806_2.t - 7 second ./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t - 7 second ./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t - 7 second ./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t - 7 second ./tests/bitrot/br-stub.t - 7 second ./tests/basic/glusterd/arbiter-volume-probe.t - 7 second ./tests/basic/gfapi/libgfapi-fini-hang.t - 7 second ./tests/basic/fencing/fencing-crash-conistency.t - 7 second ./tests/basic/distribute/file-create.t - 7 second ./tests/basic/afr/tarissue.t - 7 second ./tests/basic/afr/gfid-heal.t - 7 second ./tests/bugs/snapshot/bug-1178079.t - 6 second ./tests/bugs/snapshot/bug-1064768.t - 6 second ./tests/bugs/shard/bug-1342298.t - 6 second ./tests/bugs/shard/bug-1259651.t - 6 second ./tests/bugs/replicate/bug-1686568-send-truncate-on-arbiter-from-shd.t - 6 second ./tests/bugs/replicate/bug-1626994-info-split-brain.t - 6 second ./tests/bugs/replicate/bug-1325792.t - 6 second ./tests/bugs/replicate/bug-1101647.t - 6 second ./tests/bugs/quota/bug-1243798.t - 6 second ./tests/bugs/protocol/bug-1321578.t - 6 second ./tests/bugs/nfs/bug-877885.t - 6 second ./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t - 6 second ./tests/bugs/md-cache/bug-1476324.t - 6 second ./tests/bugs/md-cache/afr-stale-read.t - 6 second ./tests/bugs/io-cache/bug-858242.t - 6 second ./tests/bugs/glusterfs/bug-893378.t - 6 second ./tests/bugs/glusterfs/bug-856455.t - 6 second ./tests/bugs/glusterd/quorum-value-check.t - 6 second ./tests/bugs/ec/bug-1179050.t - 6 second ./tests/bugs/distribute/bug-912564.t - 6 second ./tests/bugs/distribute/bug-884597.t - 6 second ./tests/bugs/distribute/bug-1368012.t - 6 second ./tests/bugs/core/bug-986429.t - 6 second ./tests/bugs/core/bug-1699025-brick-mux-detach-brick-fd-issue.t - 6 second ./tests/bugs/core/bug-1168803-snapd-option-validation-fix.t - 6 second ./tests/bugs/bug-1371806_1.t - 6 second ./tests/bugs/bitrot/bug-1229134-bitd-not-support-vol-set.t - 6 second ./tests/bugs/bitrot/bug-1210684-scrub-pause-resume-error-handling.t - 6 second ./tests/bitrot/bug-1221914.t - 6 second ./tests/basic/trace.t - 6 second ./tests/basic/playground/template-xlator-sanity.t - 6 second ./tests/basic/ec/nfs.t - 6 second ./tests/basic/ec/ec-read-policy.t - 6 second ./tests/basic/ec/ec-anonymous-fd.t - 6 second ./tests/basic/distribute/non-root-unlink-stale-linkto.t - 6 second ./tests/basic/changelog/changelog-rename.t - 6 second ./tests/basic/afr/heal-info.t - 6 second ./tests/basic/afr/afr-read-hash-mode.t - 6 second ./tests/gfid2path/gfid2path_nfs.t - 5 second ./tests/bugs/upcall/bug-1422776.t - 5 second ./tests/bugs/replicate/bug-886998.t - 5 second ./tests/bugs/replicate/bug-1365455.t - 5 second ./tests/bugs/readdir-ahead/bug-1670253-consistent-metadata.t - 5 second ./tests/bugs/posix/bug-gfid-path.t - 5 second ./tests/bugs/posix/bug-765380.t - 5 second ./tests/bugs/nfs/bug-847622.t - 5 second ./tests/bugs/nfs/bug-1116503.t - 5 second ./tests/bugs/io-stats/bug-1598548.t - 5 second 
./tests/bugs/glusterfs-server/bug-877992.t - 5 second ./tests/bugs/glusterfs-server/bug-873549.t - 5 second ./tests/bugs/glusterfs/bug-895235.t - 5 second ./tests/bugs/fuse/bug-1126048.t - 5 second ./tests/bugs/distribute/bug-907072.t - 5 second ./tests/bugs/core/bug-913544.t - 5 second ./tests/bugs/core/bug-908146.t - 5 second ./tests/bugs/access-control/bug-1051896.t - 5 second ./tests/basic/ec/ec-internal-xattrs.t - 5 second ./tests/basic/ec/ec-fallocate.t - 5 second ./tests/basic/distribute/bug-1265677-use-readdirp.t - 5 second ./tests/basic/afr/arbiter-remove-brick.t - 5 second ./tests/performance/quick-read.t - 4 second ./tests/gfid2path/block-mount-access.t - 4 second ./tests/features/delay-gen.t - 4 second ./tests/bugs/upcall/bug-upcall-stat.t - 4 second ./tests/bugs/upcall/bug-1394131.t - 4 second ./tests/bugs/unclassified/bug-1034085.t - 4 second ./tests/bugs/snapshot/bug-1111041.t - 4 second ./tests/bugs/shard/bug-1272986.t - 4 second ./tests/bugs/shard/bug-1256580.t - 4 second ./tests/bugs/shard/bug-1250855.t - 4 second ./tests/bugs/shard/bug-1245547.t - 4 second ./tests/bugs/rpc/bug-954057.t - 4 second ./tests/bugs/replicate/bug-976800.t - 4 second ./tests/bugs/replicate/bug-880898.t - 4 second ./tests/bugs/replicate/bug-1480525.t - 4 second ./tests/bugs/read-only/bug-1134822-read-only-default-in-graph.t - 4 second ./tests/bugs/readdir-ahead/bug-1446516.t - 4 second ./tests/bugs/readdir-ahead/bug-1439640.t - 4 second ./tests/bugs/readdir-ahead/bug-1390050.t - 4 second ./tests/bugs/quota/bug-1287996.t - 4 second ./tests/bugs/quick-read/bug-846240.t - 4 second ./tests/bugs/posix/disallow-gfid-volumeid-removexattr.t - 4 second ./tests/bugs/posix/bug-1619720.t - 4 second ./tests/bugs/nl-cache/bug-1451588.t - 4 second ./tests/bugs/nfs/zero-atime.t - 4 second ./tests/bugs/nfs/subdir-trailing-slash.t - 4 second ./tests/bugs/nfs/socket-as-fifo.t - 4 second ./tests/bugs/nfs/showmount-many-clients.t - 4 second ./tests/bugs/nfs/bug-1210338.t - 4 second ./tests/bugs/nfs/bug-1166862.t - 4 second ./tests/bugs/nfs/bug-1161092-nfs-acls.t - 4 second ./tests/bugs/md-cache/bug-1632503.t - 4 second ./tests/bugs/glusterfs-server/bug-864222.t - 4 second ./tests/bugs/glusterfs/bug-1482528.t - 4 second ./tests/bugs/glusterd/bug-948729/bug-948729-mode-script.t - 4 second ./tests/bugs/glusterd/bug-948729/bug-948729-force.t - 4 second ./tests/bugs/glusterd/bug-1482906-peer-file-blank-line.t - 4 second ./tests/bugs/glusterd/bug-1091935-brick-order-check-from-cli-to-glusterd.t - 4 second ./tests/bugs/geo-replication/bug-1296496.t - 4 second ./tests/bugs/fuse/bug-1336818.t - 4 second ./tests/bugs/fuse/bug-1283103.t - 4 second ./tests/bugs/core/io-stats-1322825.t - 4 second ./tests/bugs/core/bug-834465.t - 4 second ./tests/bugs/core/bug-1135514-allow-setxattr-with-null-value.t - 4 second ./tests/bugs/core/949327.t - 4 second ./tests/bugs/cli/bug-977246.t - 4 second ./tests/bugs/cli/bug-961307.t - 4 second ./tests/bugs/cli/bug-1004218.t - 4 second ./tests/bugs/bug-1138841.t - 4 second ./tests/bugs/access-control/bug-1387241.t - 4 second ./tests/bitrot/bug-internal-xattrs-check-1243391.t - 4 second ./tests/basic/quota-rename.t - 4 second ./tests/basic/hardlink-limit.t - 4 second ./tests/basic/ec/dht-rename.t - 4 second ./tests/basic/distribute/lookup.t - 4 second ./tests/line-coverage/meta-max-coverage.t - 3 second ./tests/gfid2path/gfid2path_fuse.t - 3 second ./tests/bugs/unclassified/bug-991622.t - 3 second ./tests/bugs/trace/bug-797171.t - 3 second ./tests/bugs/glusterfs-server/bug-861542.t - 3 second 
./tests/bugs/glusterfs/bug-869724.t - 3 second ./tests/bugs/glusterfs/bug-860297.t - 3 second ./tests/bugs/glusterfs/bug-844688.t - 3 second ./tests/bugs/glusterd/bug-948729/bug-948729.t - 3 second ./tests/bugs/distribute/bug-1204140.t - 3 second ./tests/bugs/core/bug-924075.t - 3 second ./tests/bugs/core/bug-845213.t - 3 second ./tests/bugs/core/bug-1421721-mpx-toggle.t - 3 second ./tests/bugs/core/bug-1119582.t - 3 second ./tests/bugs/core/bug-1117951.t - 3 second ./tests/bugs/cli/bug-983317-volume-get.t - 3 second ./tests/bugs/cli/bug-867252.t - 3 second ./tests/basic/glusterd/check-cloudsync-ancestry.t - 3 second ./tests/basic/fops-sanity.t - 3 second ./tests/basic/fencing/test-fence-option.t - 3 second ./tests/basic/distribute/debug-xattrs.t - 3 second ./tests/basic/afr/ta-check-locks.t - 3 second ./tests/line-coverage/volfile-with-all-graph-syntax.t - 2 second ./tests/line-coverage/some-features-in-libglusterfs.t - 2 second ./tests/bugs/shard/bug-1261773.t - 2 second ./tests/bugs/replicate/bug-884328.t - 2 second ./tests/bugs/readdir-ahead/bug-1512437.t - 2 second ./tests/bugs/nfs/bug-970070.t - 2 second ./tests/bugs/nfs/bug-1302948.t - 2 second ./tests/bugs/logging/bug-823081.t - 2 second ./tests/bugs/glusterfs-server/bug-889996.t - 2 second ./tests/bugs/glusterfs/bug-892730.t - 2 second ./tests/bugs/glusterfs/bug-811493.t - 2 second ./tests/bugs/glusterd/bug-1085330-and-bug-916549.t - 2 second ./tests/bugs/distribute/bug-924265.t - 2 second ./tests/bugs/core/log-bug-1362520.t - 2 second ./tests/bugs/core/bug-903336.t - 2 second ./tests/bugs/core/bug-1111557.t - 2 second ./tests/bugs/cli/bug-969193.t - 2 second ./tests/bugs/cli/bug-949298.t - 2 second ./tests/bugs/cli/bug-921215.t - 2 second ./tests/bugs/cli/bug-1378842-volume-get-all.t - 2 second ./tests/basic/peer-parsing.t - 2 second ./tests/basic/md-cache/bug-1418249.t - 2 second ./tests/basic/afr/arbiter-cli.t - 2 second ./tests/bugs/replicate/ta-inode-refresh-read.t - 1 second ./tests/bugs/glusterfs/bug-853690.t - 1 second ./tests/bugs/cli/bug-764638.t - 1 second ./tests/bugs/cli/bug-1047378.t - 1 second ./tests/basic/netgroup_parsing.t - 1 second ./tests/basic/gfapi/sink.t - 1 second ./tests/basic/exports_parsing.t - 1 second ./tests/basic/posixonly.t - 0 second ./tests/basic/glusterfsd-args.t - 0 second 2 test(s) failed ./tests/basic/uss.t ./tests/features/subdir-mount.t 0 test(s) generated core 5 test(s) needed retry ./tests/basic/afr/split-brain-favorite-child-policy.t ./tests/basic/ec/self-heal.t ./tests/basic/uss.t ./tests/basic/volfile-sanity.t ./tests/features/subdir-mount.t Result is 1 tar: Removing leading `/' from member names kernel.core_pattern = /%e-%p.core Build step 'Execute shell' marked build as failure _______________________________________________ maintainers mailing list maintainers at gluster.org https://lists.gluster.org/mailman/listinfo/maintainers -- - Atin (atinm) -------------- next part -------------- An HTML attachment was scrubbed... 
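A practical way to chase spurious failures like uss.t and subdir-mount.t above is to loop the single test locally until it trips, instead of waiting for the next nightly run. The helper below is only a rough sketch and not part of the tree: the test path is just an example, and it assumes the perl 'prove' TAP harness used to drive .t files is available along with the usual regression prerequisites (a built tree, root access).

#!/usr/bin/env python
# loop-test.py: rerun one regression .t file until it fails, to help
# reproduce a spurious failure locally. Rough sketch, not part of the tree.
import subprocess
import sys

def loop_test(test_path, max_runs=50):
    for run in range(1, max_runs + 1):
        # -v keeps the per-TEST output so the failing step can be inspected.
        ret = subprocess.call(['prove', '-v', test_path])
        print('run %d of %s: exit code %d' % (run, test_path, ret))
        if ret != 0:
            return run
    return 0

if __name__ == '__main__':
    # e.g. python loop-test.py ./tests/features/subdir-mount.t
    sys.exit(1 if loop_test(sys.argv[1]) else 0)

Capturing /var/log/glusterfs from the first failing iteration usually gives more to work with than the Jenkins console output alone.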
URL: From amukherj at redhat.com Sat Jun 1 11:27:20 2019 From: amukherj at redhat.com (Atin Mukherjee) Date: Sat, 1 Jun 2019 16:57:20 +0530 Subject: [Gluster-devel] Fwd: [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #1357 In-Reply-To: <727602310.89.1559238721974.JavaMail.jenkins@jenkins-el7.rht.gluster.org> References: <1056764480.87.1559168297540.JavaMail.jenkins@jenkins-el7.rht.gluster.org> <727602310.89.1559238721974.JavaMail.jenkins@jenkins-el7.rht.gluster.org> Message-ID: Rafi - tests/bugs/glusterd/serializ e-shd-manager-glusterd- restart.t seems to be failing often. Can you please investigate the reason of this spurious failure? ---------- Forwarded message --------- From: Date: Thu, 30 May 2019 at 23:22 Subject: [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #1357 To: , See < https://build.gluster.org/job/regression-test-with-multiplex/1357/display/redirect?page=changes > Changes: [Xavi Hernandez] tests: add tests for different signal handling [Xavi Hernandez] marker: remove some unused functions [Xavi Hernandez] glusterd: coverity fix ------------------------------------------ [...truncated 2.92 MB...] ./tests/basic/ec/ec-root-heal.t - 9 second ./tests/basic/afr/ta-write-on-bad-brick.t - 9 second ./tests/basic/afr/ta.t - 9 second ./tests/basic/afr/gfid-mismatch.t - 9 second ./tests/performance/open-behind.t - 8 second ./tests/features/ssl-authz.t - 8 second ./tests/features/readdir-ahead.t - 8 second ./tests/features/lock-migration/lkmigration-set-option.t - 8 second ./tests/bugs/replicate/bug-921231.t - 8 second ./tests/bugs/replicate/bug-1686568-send-truncate-on-arbiter-from-shd.t - 8 second ./tests/bugs/replicate/bug-1132102.t - 8 second ./tests/bugs/posix/bug-990028.t - 8 second ./tests/bugs/posix/bug-1360679.t - 8 second ./tests/bugs/nfs/bug-915280.t - 8 second ./tests/bugs/nfs/bug-1157223-symlink-mounting.t - 8 second ./tests/bugs/glusterfs/bug-872923.t - 8 second ./tests/bugs/glusterfs/bug-861015-log.t - 8 second ./tests/bugs/glusterd/sync-post-glusterd-restart.t - 8 second ./tests/bugs/glusterd/bug-1696046.t - 8 second ./tests/bugs/distribute/bug-1088231.t - 8 second ./tests/bugs/distribute/bug-1086228.t - 8 second ./tests/bugs/cli/bug-1087487.t - 8 second ./tests/bugs/bug-1258069.t - 8 second ./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t - 8 second ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t - 8 second ./tests/basic/quota-nfs.t - 8 second ./tests/basic/ec/statedump.t - 8 second ./tests/basic/ctime/ctime-noatime.t - 8 second ./tests/basic/afr/ta-shd.t - 8 second ./tests/basic/afr/arbiter-remove-brick.t - 8 second ./tests/line-coverage/cli-peer-and-volume-operations.t - 7 second ./tests/gfid2path/get-gfid-to-path.t - 7 second ./tests/gfid2path/block-mount-access.t - 7 second ./tests/bugs/upcall/bug-1369430.t - 7 second ./tests/bugs/transport/bug-873367.t - 7 second ./tests/bugs/snapshot/bug-1260848.t - 7 second ./tests/bugs/snapshot/bug-1064768.t - 7 second ./tests/bugs/shard/shard-inode-refcount-test.t - 7 second ./tests/bugs/shard/bug-1258334.t - 7 second ./tests/bugs/replicate/bug-1626994-info-split-brain.t - 7 second ./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t - 7 second ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t - 7 second ./tests/bugs/replicate/bug-1250170-fsync.t - 7 second ./tests/bugs/replicate/bug-1101647.t - 7 second ./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t - 7 second 
./tests/bugs/quota/bug-1104692.t - 7 second ./tests/bugs/posix/bug-1175711.t - 7 second ./tests/bugs/posix/bug-1122028.t - 7 second ./tests/bugs/md-cache/setxattr-prepoststat.t - 7 second ./tests/bugs/glusterfs/bug-848251.t - 7 second ./tests/bugs/ec/bug-1227869.t - 7 second ./tests/bugs/distribute/bug-884597.t - 7 second ./tests/bugs/distribute/bug-1122443.t - 7 second ./tests/bugs/changelog/bug-1208470.t - 7 second ./tests/bugs/bug-1371806_2.t - 7 second ./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t - 7 second ./tests/bitrot/bug-1221914.t - 7 second ./tests/bitrot/br-stub.t - 7 second ./tests/basic/xlator-pass-through-sanity.t - 7 second ./tests/basic/trace.t - 7 second ./tests/basic/glusterd/arbiter-volume-probe.t - 7 second ./tests/basic/gfapi/libgfapi-fini-hang.t - 7 second ./tests/basic/distribute/file-create.t - 7 second ./tests/basic/afr/tarissue.t - 7 second ./tests/basic/afr/gfid-heal.t - 7 second ./tests/bugs/shard/bug-1342298.t - 6 second ./tests/bugs/shard/bug-1272986.t - 6 second ./tests/bugs/shard/bug-1259651.t - 6 second ./tests/bugs/replicate/bug-767585-gfid.t - 6 second ./tests/bugs/replicate/bug-1325792.t - 6 second ./tests/bugs/readdir-ahead/bug-1670253-consistent-metadata.t - 6 second ./tests/bugs/quota/bug-1243798.t - 6 second ./tests/bugs/protocol/bug-1321578.t - 6 second ./tests/bugs/posix/bug-765380.t - 6 second ./tests/bugs/nfs/bug-877885.t - 6 second ./tests/bugs/nfs/bug-847622.t - 6 second ./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t - 6 second ./tests/bugs/md-cache/bug-1211863_unlink.t - 6 second ./tests/bugs/io-stats/bug-1598548.t - 6 second ./tests/bugs/io-cache/bug-858242.t - 6 second ./tests/bugs/glusterfs/bug-893378.t - 6 second ./tests/bugs/glusterfs/bug-856455.t - 6 second ./tests/bugs/glusterd/quorum-value-check.t - 6 second ./tests/bugs/fuse/bug-1126048.t - 6 second ./tests/bugs/ec/bug-1179050.t - 6 second ./tests/bugs/distribute/bug-912564.t - 6 second ./tests/bugs/distribute/bug-1368012.t - 6 second ./tests/bugs/core/bug-986429.t - 6 second ./tests/bugs/core/bug-1699025-brick-mux-detach-brick-fd-issue.t - 6 second ./tests/bugs/core/bug-1168803-snapd-option-validation-fix.t - 6 second ./tests/bugs/bug-1702299.t - 6 second ./tests/bugs/bug-1371806_1.t - 6 second ./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t - 6 second ./tests/basic/playground/template-xlator-sanity.t - 6 second ./tests/basic/fencing/fencing-crash-conistency.t - 6 second ./tests/basic/ec/nfs.t - 6 second ./tests/basic/ec/ec-read-policy.t - 6 second ./tests/basic/ec/ec-anonymous-fd.t - 6 second ./tests/basic/afr/afr-read-hash-mode.t - 6 second ./tests/gfid2path/gfid2path_nfs.t - 5 second ./tests/features/delay-gen.t - 5 second ./tests/bugs/unclassified/bug-1034085.t - 5 second ./tests/bugs/snapshot/bug-1178079.t - 5 second ./tests/bugs/snapshot/bug-1111041.t - 5 second ./tests/bugs/shard/bug-1256580.t - 5 second ./tests/bugs/replicate/bug-1365455.t - 5 second ./tests/bugs/posix/bug-gfid-path.t - 5 second ./tests/bugs/nfs/bug-1166862.t - 5 second ./tests/bugs/md-cache/bug-1632503.t - 5 second ./tests/bugs/md-cache/afr-stale-read.t - 5 second ./tests/bugs/glusterfs-server/bug-877992.t - 5 second ./tests/bugs/glusterfs-server/bug-873549.t - 5 second ./tests/bugs/glusterfs-server/bug-864222.t - 5 second ./tests/bugs/glusterfs/bug-895235.t - 5 second ./tests/bugs/glusterfs/bug-1482528.t - 5 second ./tests/bugs/glusterd/bug-948729/bug-948729-force.t - 5 second ./tests/bugs/glusterd/bug-1091935-brick-order-check-from-cli-to-glusterd.t - 5 second 
./tests/bugs/geo-replication/bug-1296496.t - 5 second ./tests/bugs/distribute/bug-907072.t - 5 second ./tests/bugs/core/bug-913544.t - 5 second ./tests/bugs/core/bug-834465.t - 5 second ./tests/bugs/bitrot/bug-1229134-bitd-not-support-vol-set.t - 5 second ./tests/bugs/bitrot/bug-1210684-scrub-pause-resume-error-handling.t - 5 second ./tests/bugs/access-control/bug-1051896.t - 5 second ./tests/basic/hardlink-limit.t - 5 second ./tests/basic/ec/ec-fallocate.t - 5 second ./tests/basic/ec/dht-rename.t - 5 second ./tests/basic/distribute/non-root-unlink-stale-linkto.t - 5 second ./tests/basic/changelog/changelog-rename.t - 5 second ./tests/basic/afr/heal-info.t - 5 second ./tests/performance/quick-read.t - 4 second ./tests/gfid2path/gfid2path_fuse.t - 4 second ./tests/bugs/upcall/bug-upcall-stat.t - 4 second ./tests/bugs/upcall/bug-1422776.t - 4 second ./tests/bugs/upcall/bug-1394131.t - 4 second ./tests/bugs/trace/bug-797171.t - 4 second ./tests/bugs/shard/bug-1250855.t - 4 second ./tests/bugs/rpc/bug-954057.t - 4 second ./tests/bugs/replicate/bug-976800.t - 4 second ./tests/bugs/replicate/bug-886998.t - 4 second ./tests/bugs/replicate/bug-880898.t - 4 second ./tests/bugs/replicate/bug-1480525.t - 4 second ./tests/bugs/read-only/bug-1134822-read-only-default-in-graph.t - 4 second ./tests/bugs/readdir-ahead/bug-1446516.t - 4 second ./tests/bugs/readdir-ahead/bug-1439640.t - 4 second ./tests/bugs/readdir-ahead/bug-1390050.t - 4 second ./tests/bugs/quota/bug-1287996.t - 4 second ./tests/bugs/quick-read/bug-846240.t - 4 second ./tests/bugs/nl-cache/bug-1451588.t - 4 second ./tests/bugs/nfs/subdir-trailing-slash.t - 4 second ./tests/bugs/nfs/socket-as-fifo.t - 4 second ./tests/bugs/nfs/showmount-many-clients.t - 4 second ./tests/bugs/nfs/bug-1210338.t - 4 second ./tests/bugs/nfs/bug-1161092-nfs-acls.t - 4 second ./tests/bugs/nfs/bug-1116503.t - 4 second ./tests/bugs/md-cache/bug-1476324.t - 4 second ./tests/bugs/glusterfs/bug-869724.t - 4 second ./tests/bugs/glusterd/bug-948729/bug-948729.t - 4 second ./tests/bugs/fuse/bug-1283103.t - 4 second ./tests/bugs/core/io-stats-1322825.t - 4 second ./tests/bugs/core/bug-924075.t - 4 second ./tests/bugs/core/bug-908146.t - 4 second ./tests/bugs/core/949327.t - 4 second ./tests/bugs/cli/bug-983317-volume-get.t - 4 second ./tests/bugs/cli/bug-977246.t - 4 second ./tests/bugs/cli/bug-961307.t - 4 second ./tests/bugs/cli/bug-1004218.t - 4 second ./tests/bugs/bug-1138841.t - 4 second ./tests/bugs/access-control/bug-1387241.t - 4 second ./tests/basic/quota-rename.t - 4 second ./tests/basic/fencing/test-fence-option.t - 4 second ./tests/basic/ec/ec-internal-xattrs.t - 4 second ./tests/basic/distribute/lookup.t - 4 second ./tests/basic/distribute/bug-1265677-use-readdirp.t - 4 second ./tests/line-coverage/volfile-with-all-graph-syntax.t - 3 second ./tests/line-coverage/some-features-in-libglusterfs.t - 3 second ./tests/bugs/unclassified/bug-991622.t - 3 second ./tests/bugs/readdir-ahead/bug-1512437.t - 3 second ./tests/bugs/posix/disallow-gfid-volumeid-removexattr.t - 3 second ./tests/bugs/posix/bug-1619720.t - 3 second ./tests/bugs/nfs/zero-atime.t - 3 second ./tests/bugs/glusterfs/bug-844688.t - 3 second ./tests/bugs/glusterd/bug-948729/bug-948729-mode-script.t - 3 second ./tests/bugs/glusterd/bug-1482906-peer-file-blank-line.t - 3 second ./tests/bugs/fuse/bug-1336818.t - 3 second ./tests/bugs/core/log-bug-1362520.t - 3 second ./tests/bugs/core/bug-1421721-mpx-toggle.t - 3 second ./tests/bugs/core/bug-1135514-allow-setxattr-with-null-value.t - 3 second 
./tests/bugs/core/bug-1119582.t - 3 second ./tests/bugs/core/bug-1117951.t - 3 second ./tests/bugs/cli/bug-867252.t - 3 second ./tests/bitrot/bug-internal-xattrs-check-1243391.t - 3 second ./tests/basic/md-cache/bug-1418249.t - 3 second ./tests/basic/glusterd/check-cloudsync-ancestry.t - 3 second ./tests/basic/fops-sanity.t - 3 second ./tests/basic/distribute/debug-xattrs.t - 3 second ./tests/line-coverage/meta-max-coverage.t - 2 second ./tests/bugs/shard/bug-1261773.t - 2 second ./tests/bugs/shard/bug-1245547.t - 2 second ./tests/bugs/replicate/bug-884328.t - 2 second ./tests/bugs/nfs/bug-970070.t - 2 second ./tests/bugs/nfs/bug-1302948.t - 2 second ./tests/bugs/logging/bug-823081.t - 2 second ./tests/bugs/glusterfs-server/bug-889996.t - 2 second ./tests/bugs/glusterfs-server/bug-861542.t - 2 second ./tests/bugs/glusterfs/bug-892730.t - 2 second ./tests/bugs/glusterfs/bug-860297.t - 2 second ./tests/bugs/glusterfs/bug-853690.t - 2 second ./tests/bugs/glusterfs/bug-811493.t - 2 second ./tests/bugs/glusterd/bug-1085330-and-bug-916549.t - 2 second ./tests/bugs/distribute/bug-924265.t - 2 second ./tests/bugs/distribute/bug-1204140.t - 2 second ./tests/bugs/core/bug-903336.t - 2 second ./tests/bugs/core/bug-845213.t - 2 second ./tests/bugs/core/bug-1111557.t - 2 second ./tests/bugs/cli/bug-969193.t - 2 second ./tests/bugs/cli/bug-764638.t - 2 second ./tests/bugs/cli/bug-1378842-volume-get-all.t - 2 second ./tests/basic/peer-parsing.t - 2 second ./tests/basic/gfapi/sink.t - 2 second ./tests/basic/afr/ta-check-locks.t - 2 second ./tests/basic/afr/arbiter-cli.t - 2 second ./tests/bugs/replicate/ta-inode-refresh-read.t - 1 second ./tests/bugs/cli/bug-949298.t - 1 second ./tests/bugs/cli/bug-921215.t - 1 second ./tests/bugs/cli/bug-1047378.t - 1 second ./tests/basic/posixonly.t - 1 second ./tests/basic/netgroup_parsing.t - 1 second ./tests/basic/exports_parsing.t - 1 second ./tests/basic/glusterfsd-args.t - 0 second 2 test(s) failed ./tests/basic/uss.t ./tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t 0 test(s) generated core 3 test(s) needed retry ./tests/basic/uss.t ./tests/basic/volfile-sanity.t ./tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t Result is 124 tar: Removing leading `/' from member names kernel.core_pattern = /%e-%p.core Build step 'Execute shell' marked build as failure _______________________________________________ maintainers mailing list maintainers at gluster.org https://lists.gluster.org/mailman/listinfo/maintainers -- - Atin (atinm) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jenkins at build.gluster.org Mon Jun 3 01:45:02 2019 From: jenkins at build.gluster.org (jenkins at build.gluster.org) Date: Mon, 3 Jun 2019 01:45:02 +0000 (UTC) Subject: [Gluster-devel] Weekly Untriaged Bugs Message-ID: <506588597.104.1559526303096.JavaMail.jenkins@jenkins-el7.rht.gluster.org> [...truncated 6 lines...] https://bugzilla.redhat.com/1714851 / core: issues with 'list.h' elements in clang-scan https://bugzilla.redhat.com/1714895 / libglusterfsclient: Glusterfs(fuse) client crash https://bugzilla.redhat.com/1716097 / project-infrastructure: infra: create suse-packing at lists.nfs-ganesha.org alias [...truncated 2 lines...] -------------- next part -------------- A non-text attachment was scrubbed... 
Name: build.log Type: application/octet-stream Size: 670 bytes Desc: not available URL: From hgowtham at redhat.com Mon Jun 3 08:18:46 2019 From: hgowtham at redhat.com (Hari Gowtham) Date: Mon, 3 Jun 2019 13:48:46 +0530 Subject: [Gluster-devel] Release 6.3: Expected tagging on June 7th Message-ID: Hi, Expected tagging date for release-6.3 is on June, 7th, 2019. Please ensure required patches are back-ported and also are passing regressions and are appropriately reviewed for easy merging and tagging on the date. -- Regards, Hari Gowtham. From rabhat at redhat.com Mon Jun 3 14:20:30 2019 From: rabhat at redhat.com (FNU Raghavendra Manjunath) Date: Mon, 3 Jun 2019 10:20:30 -0400 Subject: [Gluster-devel] [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #1359 In-Reply-To: References: <24208463.92.1559325814227.JavaMail.jenkins@jenkins-el7.rht.gluster.org> Message-ID: Yes. I have sent this patch [1] for review. It is now not failing in regression tests. (i.e. uss.t is not failing) [1] https://review.gluster.org/#/c/glusterfs/+/22728/ Regards, Raghavendra On Sat, Jun 1, 2019 at 7:25 AM Atin Mukherjee wrote: > subdir-mount.t has started failing in brick mux regression nightly. This > needs to be fixed. > > Raghavendra - did we manage to get any further clue on uss.t failure? > > ---------- Forwarded message --------- > From: > Date: Fri, 31 May 2019 at 23:34 > Subject: [Gluster-Maintainers] Build failed in Jenkins: > regression-test-with-multiplex #1359 > To: , , , > , > > > See < > https://build.gluster.org/job/regression-test-with-multiplex/1359/display/redirect?page=changes > > > > Changes: > > [atin] glusterd: add an op-version check > > [atin] glusterd/svc: glusterd_svcs_stop should call individual wrapper > function > > [atin] glusterd/svc: Stop stale process using the glusterd_proc_stop > > [Amar Tumballi] lcov: more coverage to shard, old-protocol, sdfs > > [Kotresh H R] tests/geo-rep: Add EC volume test case > > [Amar Tumballi] glusterfsd/cleanup: Protect graph object under a lock > > [Mohammed Rafi KC] glusterd/shd: Optimize the glustershd manager to send > reconfigure > > [Kotresh H R] tests/geo-rep: Add tests to cover glusterd geo-rep > > [atin] glusterd: Optimize code to copy dictionary in handshake code path > > ------------------------------------------ > [...truncated 3.18 MB...] 
> ./tests/basic/afr/stale-file-lookup.t - 9 second > ./tests/basic/afr/granular-esh/replace-brick.t - 9 second > ./tests/basic/afr/granular-esh/add-brick.t - 9 second > ./tests/basic/afr/gfid-mismatch.t - 9 second > ./tests/performance/open-behind.t - 8 second > ./tests/features/ssl-authz.t - 8 second > ./tests/features/readdir-ahead.t - 8 second > ./tests/bugs/upcall/bug-1458127.t - 8 second > ./tests/bugs/transport/bug-873367.t - 8 second > ./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t - 8 second > ./tests/bugs/replicate/bug-1132102.t - 8 second > ./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t > - 8 second > ./tests/bugs/quota/bug-1104692.t - 8 second > ./tests/bugs/posix/bug-1360679.t - 8 second > ./tests/bugs/posix/bug-1122028.t - 8 second > ./tests/bugs/nfs/bug-1157223-symlink-mounting.t - 8 second > ./tests/bugs/glusterfs/bug-861015-log.t - 8 second > ./tests/bugs/glusterd/sync-post-glusterd-restart.t - 8 second > ./tests/bugs/glusterd/bug-1696046.t - 8 second > ./tests/bugs/fuse/bug-983477.t - 8 second > ./tests/bugs/ec/bug-1227869.t - 8 second > ./tests/bugs/distribute/bug-1088231.t - 8 second > ./tests/bugs/distribute/bug-1086228.t - 8 second > ./tests/bugs/cli/bug-1087487.t - 8 second > ./tests/bugs/cli/bug-1022905.t - 8 second > ./tests/bugs/bug-1258069.t - 8 second > ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t > - 8 second > ./tests/basic/xlator-pass-through-sanity.t - 8 second > ./tests/basic/quota-nfs.t - 8 second > ./tests/basic/glusterd/arbiter-volume.t - 8 second > ./tests/basic/ctime/ctime-noatime.t - 8 second > ./tests/line-coverage/cli-peer-and-volume-operations.t - 7 second > ./tests/gfid2path/get-gfid-to-path.t - 7 second > ./tests/bugs/upcall/bug-1369430.t - 7 second > ./tests/bugs/snapshot/bug-1260848.t - 7 second > ./tests/bugs/shard/shard-inode-refcount-test.t - 7 second > ./tests/bugs/shard/bug-1258334.t - 7 second > ./tests/bugs/replicate/bug-767585-gfid.t - 7 second > ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t - 7 second > ./tests/bugs/replicate/bug-1250170-fsync.t - 7 second > ./tests/bugs/posix/bug-1175711.t - 7 second > ./tests/bugs/nfs/bug-915280.t - 7 second > ./tests/bugs/md-cache/setxattr-prepoststat.t - 7 second > ./tests/bugs/md-cache/bug-1211863_unlink.t - 7 second > ./tests/bugs/glusterfs/bug-848251.t - 7 second > ./tests/bugs/distribute/bug-1122443.t - 7 second > ./tests/bugs/changelog/bug-1208470.t - 7 second > ./tests/bugs/bug-1702299.t - 7 second > ./tests/bugs/bug-1371806_2.t - 7 second > ./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t - 7 > second > ./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t - 7 second > ./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t - > 7 second > ./tests/bitrot/br-stub.t - 7 second > ./tests/basic/glusterd/arbiter-volume-probe.t - 7 second > ./tests/basic/gfapi/libgfapi-fini-hang.t - 7 second > ./tests/basic/fencing/fencing-crash-conistency.t - 7 second > ./tests/basic/distribute/file-create.t - 7 second > ./tests/basic/afr/tarissue.t - 7 second > ./tests/basic/afr/gfid-heal.t - 7 second > ./tests/bugs/snapshot/bug-1178079.t - 6 second > ./tests/bugs/snapshot/bug-1064768.t - 6 second > ./tests/bugs/shard/bug-1342298.t - 6 second > ./tests/bugs/shard/bug-1259651.t - 6 second > ./tests/bugs/replicate/bug-1686568-send-truncate-on-arbiter-from-shd.t - > 6 second > ./tests/bugs/replicate/bug-1626994-info-split-brain.t - 6 second > ./tests/bugs/replicate/bug-1325792.t - 6 
second > ./tests/bugs/replicate/bug-1101647.t - 6 second > ./tests/bugs/quota/bug-1243798.t - 6 second > ./tests/bugs/protocol/bug-1321578.t - 6 second > ./tests/bugs/nfs/bug-877885.t - 6 second > ./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t - 6 second > ./tests/bugs/md-cache/bug-1476324.t - 6 second > ./tests/bugs/md-cache/afr-stale-read.t - 6 second > ./tests/bugs/io-cache/bug-858242.t - 6 second > ./tests/bugs/glusterfs/bug-893378.t - 6 second > ./tests/bugs/glusterfs/bug-856455.t - 6 second > ./tests/bugs/glusterd/quorum-value-check.t - 6 second > ./tests/bugs/ec/bug-1179050.t - 6 second > ./tests/bugs/distribute/bug-912564.t - 6 second > ./tests/bugs/distribute/bug-884597.t - 6 second > ./tests/bugs/distribute/bug-1368012.t - 6 second > ./tests/bugs/core/bug-986429.t - 6 second > ./tests/bugs/core/bug-1699025-brick-mux-detach-brick-fd-issue.t - 6 > second > ./tests/bugs/core/bug-1168803-snapd-option-validation-fix.t - 6 second > ./tests/bugs/bug-1371806_1.t - 6 second > ./tests/bugs/bitrot/bug-1229134-bitd-not-support-vol-set.t - 6 second > ./tests/bugs/bitrot/bug-1210684-scrub-pause-resume-error-handling.t - 6 > second > ./tests/bitrot/bug-1221914.t - 6 second > ./tests/basic/trace.t - 6 second > ./tests/basic/playground/template-xlator-sanity.t - 6 second > ./tests/basic/ec/nfs.t - 6 second > ./tests/basic/ec/ec-read-policy.t - 6 second > ./tests/basic/ec/ec-anonymous-fd.t - 6 second > ./tests/basic/distribute/non-root-unlink-stale-linkto.t - 6 second > ./tests/basic/changelog/changelog-rename.t - 6 second > ./tests/basic/afr/heal-info.t - 6 second > ./tests/basic/afr/afr-read-hash-mode.t - 6 second > ./tests/gfid2path/gfid2path_nfs.t - 5 second > ./tests/bugs/upcall/bug-1422776.t - 5 second > ./tests/bugs/replicate/bug-886998.t - 5 second > ./tests/bugs/replicate/bug-1365455.t - 5 second > ./tests/bugs/readdir-ahead/bug-1670253-consistent-metadata.t - 5 second > ./tests/bugs/posix/bug-gfid-path.t - 5 second > ./tests/bugs/posix/bug-765380.t - 5 second > ./tests/bugs/nfs/bug-847622.t - 5 second > ./tests/bugs/nfs/bug-1116503.t - 5 second > ./tests/bugs/io-stats/bug-1598548.t - 5 second > ./tests/bugs/glusterfs-server/bug-877992.t - 5 second > ./tests/bugs/glusterfs-server/bug-873549.t - 5 second > ./tests/bugs/glusterfs/bug-895235.t - 5 second > ./tests/bugs/fuse/bug-1126048.t - 5 second > ./tests/bugs/distribute/bug-907072.t - 5 second > ./tests/bugs/core/bug-913544.t - 5 second > ./tests/bugs/core/bug-908146.t - 5 second > ./tests/bugs/access-control/bug-1051896.t - 5 second > ./tests/basic/ec/ec-internal-xattrs.t - 5 second > ./tests/basic/ec/ec-fallocate.t - 5 second > ./tests/basic/distribute/bug-1265677-use-readdirp.t - 5 second > ./tests/basic/afr/arbiter-remove-brick.t - 5 second > ./tests/performance/quick-read.t - 4 second > ./tests/gfid2path/block-mount-access.t - 4 second > ./tests/features/delay-gen.t - 4 second > ./tests/bugs/upcall/bug-upcall-stat.t - 4 second > ./tests/bugs/upcall/bug-1394131.t - 4 second > ./tests/bugs/unclassified/bug-1034085.t - 4 second > ./tests/bugs/snapshot/bug-1111041.t - 4 second > ./tests/bugs/shard/bug-1272986.t - 4 second > ./tests/bugs/shard/bug-1256580.t - 4 second > ./tests/bugs/shard/bug-1250855.t - 4 second > ./tests/bugs/shard/bug-1245547.t - 4 second > ./tests/bugs/rpc/bug-954057.t - 4 second > ./tests/bugs/replicate/bug-976800.t - 4 second > ./tests/bugs/replicate/bug-880898.t - 4 second > ./tests/bugs/replicate/bug-1480525.t - 4 second > ./tests/bugs/read-only/bug-1134822-read-only-default-in-graph.t - 4 > second > 
./tests/bugs/readdir-ahead/bug-1446516.t - 4 second > ./tests/bugs/readdir-ahead/bug-1439640.t - 4 second > ./tests/bugs/readdir-ahead/bug-1390050.t - 4 second > ./tests/bugs/quota/bug-1287996.t - 4 second > ./tests/bugs/quick-read/bug-846240.t - 4 second > ./tests/bugs/posix/disallow-gfid-volumeid-removexattr.t - 4 second > ./tests/bugs/posix/bug-1619720.t - 4 second > ./tests/bugs/nl-cache/bug-1451588.t - 4 second > ./tests/bugs/nfs/zero-atime.t - 4 second > ./tests/bugs/nfs/subdir-trailing-slash.t - 4 second > ./tests/bugs/nfs/socket-as-fifo.t - 4 second > ./tests/bugs/nfs/showmount-many-clients.t - 4 second > ./tests/bugs/nfs/bug-1210338.t - 4 second > ./tests/bugs/nfs/bug-1166862.t - 4 second > ./tests/bugs/nfs/bug-1161092-nfs-acls.t - 4 second > ./tests/bugs/md-cache/bug-1632503.t - 4 second > ./tests/bugs/glusterfs-server/bug-864222.t - 4 second > ./tests/bugs/glusterfs/bug-1482528.t - 4 second > ./tests/bugs/glusterd/bug-948729/bug-948729-mode-script.t - 4 second > ./tests/bugs/glusterd/bug-948729/bug-948729-force.t - 4 second > ./tests/bugs/glusterd/bug-1482906-peer-file-blank-line.t - 4 second > ./tests/bugs/glusterd/bug-1091935-brick-order-check-from-cli-to-glusterd.t > - 4 second > ./tests/bugs/geo-replication/bug-1296496.t - 4 second > ./tests/bugs/fuse/bug-1336818.t - 4 second > ./tests/bugs/fuse/bug-1283103.t - 4 second > ./tests/bugs/core/io-stats-1322825.t - 4 second > ./tests/bugs/core/bug-834465.t - 4 second > ./tests/bugs/core/bug-1135514-allow-setxattr-with-null-value.t - 4 second > ./tests/bugs/core/949327.t - 4 second > ./tests/bugs/cli/bug-977246.t - 4 second > ./tests/bugs/cli/bug-961307.t - 4 second > ./tests/bugs/cli/bug-1004218.t - 4 second > ./tests/bugs/bug-1138841.t - 4 second > ./tests/bugs/access-control/bug-1387241.t - 4 second > ./tests/bitrot/bug-internal-xattrs-check-1243391.t - 4 second > ./tests/basic/quota-rename.t - 4 second > ./tests/basic/hardlink-limit.t - 4 second > ./tests/basic/ec/dht-rename.t - 4 second > ./tests/basic/distribute/lookup.t - 4 second > ./tests/line-coverage/meta-max-coverage.t - 3 second > ./tests/gfid2path/gfid2path_fuse.t - 3 second > ./tests/bugs/unclassified/bug-991622.t - 3 second > ./tests/bugs/trace/bug-797171.t - 3 second > ./tests/bugs/glusterfs-server/bug-861542.t - 3 second > ./tests/bugs/glusterfs/bug-869724.t - 3 second > ./tests/bugs/glusterfs/bug-860297.t - 3 second > ./tests/bugs/glusterfs/bug-844688.t - 3 second > ./tests/bugs/glusterd/bug-948729/bug-948729.t - 3 second > ./tests/bugs/distribute/bug-1204140.t - 3 second > ./tests/bugs/core/bug-924075.t - 3 second > ./tests/bugs/core/bug-845213.t - 3 second > ./tests/bugs/core/bug-1421721-mpx-toggle.t - 3 second > ./tests/bugs/core/bug-1119582.t - 3 second > ./tests/bugs/core/bug-1117951.t - 3 second > ./tests/bugs/cli/bug-983317-volume-get.t - 3 second > ./tests/bugs/cli/bug-867252.t - 3 second > ./tests/basic/glusterd/check-cloudsync-ancestry.t - 3 second > ./tests/basic/fops-sanity.t - 3 second > ./tests/basic/fencing/test-fence-option.t - 3 second > ./tests/basic/distribute/debug-xattrs.t - 3 second > ./tests/basic/afr/ta-check-locks.t - 3 second > ./tests/line-coverage/volfile-with-all-graph-syntax.t - 2 second > ./tests/line-coverage/some-features-in-libglusterfs.t - 2 second > ./tests/bugs/shard/bug-1261773.t - 2 second > ./tests/bugs/replicate/bug-884328.t - 2 second > ./tests/bugs/readdir-ahead/bug-1512437.t - 2 second > ./tests/bugs/nfs/bug-970070.t - 2 second > ./tests/bugs/nfs/bug-1302948.t - 2 second > ./tests/bugs/logging/bug-823081.t - 2 second > 
./tests/bugs/glusterfs-server/bug-889996.t - 2 second > ./tests/bugs/glusterfs/bug-892730.t - 2 second > ./tests/bugs/glusterfs/bug-811493.t - 2 second > ./tests/bugs/glusterd/bug-1085330-and-bug-916549.t - 2 second > ./tests/bugs/distribute/bug-924265.t - 2 second > ./tests/bugs/core/log-bug-1362520.t - 2 second > ./tests/bugs/core/bug-903336.t - 2 second > ./tests/bugs/core/bug-1111557.t - 2 second > ./tests/bugs/cli/bug-969193.t - 2 second > ./tests/bugs/cli/bug-949298.t - 2 second > ./tests/bugs/cli/bug-921215.t - 2 second > ./tests/bugs/cli/bug-1378842-volume-get-all.t - 2 second > ./tests/basic/peer-parsing.t - 2 second > ./tests/basic/md-cache/bug-1418249.t - 2 second > ./tests/basic/afr/arbiter-cli.t - 2 second > ./tests/bugs/replicate/ta-inode-refresh-read.t - 1 second > ./tests/bugs/glusterfs/bug-853690.t - 1 second > ./tests/bugs/cli/bug-764638.t - 1 second > ./tests/bugs/cli/bug-1047378.t - 1 second > ./tests/basic/netgroup_parsing.t - 1 second > ./tests/basic/gfapi/sink.t - 1 second > ./tests/basic/exports_parsing.t - 1 second > ./tests/basic/posixonly.t - 0 second > ./tests/basic/glusterfsd-args.t - 0 second > > > 2 test(s) failed > ./tests/basic/uss.t > ./tests/features/subdir-mount.t > > 0 test(s) generated core > > > 5 test(s) needed retry > ./tests/basic/afr/split-brain-favorite-child-policy.t > ./tests/basic/ec/self-heal.t > ./tests/basic/uss.t > ./tests/basic/volfile-sanity.t > ./tests/features/subdir-mount.t > > Result is 1 > > tar: Removing leading `/' from member names > kernel.core_pattern = /%e-%p.core > Build step 'Execute shell' marked build as failure > _______________________________________________ > maintainers mailing list > maintainers at gluster.org > https://lists.gluster.org/mailman/listinfo/maintainers > > > -- > - Atin (atinm) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhishpaliwal at gmail.com Tue Jun 4 10:09:59 2019 From: abhishpaliwal at gmail.com (ABHISHEK PALIWAL) Date: Tue, 4 Jun 2019 15:39:59 +0530 Subject: [Gluster-devel] Memory leak in glusterfs In-Reply-To: References: Message-ID: Hi Team, Please respond on the issue which I raised. Regards, Abhishek On Fri, May 17, 2019 at 2:46 PM ABHISHEK PALIWAL wrote: > Anyone please reply.... > > On Thu, May 16, 2019, 10:49 ABHISHEK PALIWAL > wrote: > >> Hi Team, >> >> I upload some valgrind logs from my gluster 5.4 setup. This is writing to >> the volume every 15 minutes. I stopped glusterd and then copy away the >> logs. The test was running for some simulated days. They are zipped in >> valgrind-54.zip. >> >> Lots of info in valgrind-2730.log. Lots of possibly lost bytes in >> glusterfs and even some definitely lost bytes. >> >> ==2737== 1,572,880 bytes in 1 blocks are possibly lost in loss record 391 >> of 391 >> ==2737== at 0x4C29C25: calloc (in >> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) >> ==2737== by 0xA22485E: ??? (in >> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >> ==2737== by 0xA217C94: ??? (in >> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >> ==2737== by 0xA21D9F8: ??? (in >> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >> ==2737== by 0xA21DED9: ??? (in >> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >> ==2737== by 0xA21E685: ??? (in >> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >> ==2737== by 0xA1B9D8C: init (in >> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >> ==2737== by 0x4E511CE: xlator_init (in /usr/lib64/libglusterfs.so.0.0.1) >> ==2737== by 0x4E8A2B8: ??? 
(in /usr/lib64/libglusterfs.so.0.0.1) >> ==2737== by 0x4E8AAB3: glusterfs_graph_activate (in >> /usr/lib64/libglusterfs.so.0.0.1) >> ==2737== by 0x409C35: glusterfs_process_volfp (in /usr/sbin/glusterfsd) >> ==2737== by 0x409D99: glusterfs_volumes_init (in /usr/sbin/glusterfsd) >> ==2737== >> ==2737== LEAK SUMMARY: >> ==2737== definitely lost: 1,053 bytes in 10 blocks >> ==2737== indirectly lost: 317 bytes in 3 blocks >> ==2737== possibly lost: 2,374,971 bytes in 524 blocks >> ==2737== still reachable: 53,277 bytes in 201 blocks >> ==2737== suppressed: 0 bytes in 0 blocks >> >> -- >> >> >> >> >> Regards >> Abhishek Paliwal >> > -- Regards Abhishek Paliwal -------------- next part -------------- An HTML attachment was scrubbed... URL: From zgrep at 139.com Tue Jun 4 11:33:54 2019 From: zgrep at 139.com (=?utf-8?B?WGllIENoYW5nbG9uZw==?=) Date: 04 Jun 2019 19:33:54 +0800 Subject: [Gluster-devel] GETXATTR op pending on index xlator for more than 10 hours Message-ID: 2019060419335438074695@139.com> Hi all, Today, i found gnfs GETXATTR bailing out on gluster release 3.12.0. I have a simple 4*2 Distributed-Rep volume. [2019-06-03 19:58:33.085880] E [rpc-clnt.c:185:Call_bail] 0-cl25vol01-client-4: bailing out frame type(GlusterFS 3.3) op(GETXATTR(18)) xid=0x21de4275 sent = 2019-06-03 19:28:30.552356. timeout = 1800 for 10.3.133.57:49153 xid= 0x21de4275 = 568214133 Then i try to dump brick 10.3.133.57:49153, and find the GETXATTR op pending on index xlator for more than 10 hours! 1111MicrosoftInternetExplorer402DocumentNotSpecified7.8 ?Normal0 [root at node0001 gluster]# grep -rn 568214133 gluster-brick-1-cl25vol01.6078.dump.15596* gluster-brick-1-cl25vol01.6078.dump.1559617125:5093:unique=568214133 gluster-brick-1-cl25vol01.6078.dump.1559618121:5230:unique=568214133 gluster-brick-1-cl25vol01.6078.dump.1559618912:5434:unique=568214133 gluster-brick-1-cl25vol01.6078.dump.1559628467:6921:unique=568214133 [root at node0001 gluster]# date -d @1559617125 Tue Jun 4 10:58:45 CST 2019 [root at node0001 gluster]# date -d @1559628467 Tue Jun 4 14:07:47 CST 2019 1111MicrosoftInternetExplorer402DocumentNotSpecified7.8 ?Normal0 [root at node0001 gluster]# [global.callpool.stack.115] stack=0x7f8b342623c0 uid=500 gid=500 pid=-6 unique=568214133 lk-owner=faffffff op=stack type=0 cnt=4 [global.callpool.stack.115.frame.1] frame=0x7f8b1d6fb540 ref_count=0 translator=cl25vol01-index complete=0 parent=cl25vol01-quota wind_from=quota_getxattr wind_to=(this->children->xlator)->fops->getxattr unwind_to=default_getxattr_cbk [global.callpool.stack.115.frame.2] frame=0x7f8b30a14da0 ref_count=1 translator=cl25vol01-quota complete=0 parent=cl25vol01-io-stats wind_from=io_stats_getxattr wind_to=(this->children->xlator)->fops->getxattr unwind_to=io_stats_getxattr_cbk [global.callpool.stack.115.frame.3] frame=0x7f8b6debada0 ref_count=1 translator=cl25vol01-io-stats complete=0 parent=cl25vol01-server wind_from=server_getxattr_resume wind_to=FIRST_CHILD(this)->fops->getxattr unwind_to=server_getxattr_cbk [global.callpool.stack.115.frame.4] frame=0x7f8b21962a60 ref_count=1 translator=cl25vol01-server complete=0 I've checked the code logic and got nothing, any advice? I still have the scene on my side, so we can dig more. Thanks -------------- next part -------------- An HTML attachment was scrubbed... 
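When a fop is parked on one translator like the GETXATTR above, it can help to pull every call frame that has not been unwound out of the statedump instead of grepping for a single xid. The script below is only a rough sketch built around the key=value layout shown in the dump excerpt (section headers such as [global.callpool.stack.115.frame.1], with complete=0 marking a frame that is still pending); it is not an official tool, and the dump filename in the example is taken from the mail above.

#!/usr/bin/env python
# stuck-frames.py: list call frames in a glusterfs statedump that are
# still pending (complete=0), with the translator they are parked on.
import re
import sys

SECTION_RE = re.compile(r'\[(global\.callpool\.stack\.\d+(?:\.frame\.\d+)?)\]')

def stuck_frames(path):
    section, keys = None, {}
    with open(path) as dump:
        for raw in dump:
            line = raw.strip()
            if line.startswith('['):
                # flush the previous section before starting a new one
                if keys.get('complete') == '0':
                    yield section, keys
                match = SECTION_RE.match(line)
                section = match.group(1) if match else None
                keys = {}
            elif section and '=' in line:
                key, _, value = line.partition('=')
                keys[key] = value
    if keys.get('complete') == '0':
        yield section, keys

if __name__ == '__main__':
    # e.g. python stuck-frames.py gluster-brick-1-cl25vol01.6078.dump.1559617125
    for name, frame in stuck_frames(sys.argv[1]):
        print('%s: translator=%s wind_from=%s' % (
            name, frame.get('translator'), frame.get('wind_from')))

Comparing two dumps taken a few minutes apart shows which of these frames are genuinely stuck rather than just in flight.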
URL: From ykaul at redhat.com Tue Jun 4 13:27:04 2019 From: ykaul at redhat.com (Yaniv Kaul) Date: Tue, 4 Jun 2019 16:27:04 +0300 Subject: [Gluster-devel] [Gluster-infra] rebal-all-nodes-migrate.t always fails now In-Reply-To: <090785225412c2b5b269454f8812d0a165aea62d.camel@redhat.com> References: <94bd8147c5035da76c3ac3ae90a8a02ed000106a.camel@redhat.com> <0ca34e42063ad77f323155c85a7bb3ba7a79931b.camel@redhat.com> <090785225412c2b5b269454f8812d0a165aea62d.camel@redhat.com> Message-ID: What was the result of this investigation? I suspect seeing the same issue on builder209[1]. Y. [1] https://build.gluster.org/job/centos7-regression/6302/consoleFull On Fri, Apr 5, 2019 at 5:40 PM Michael Scherer wrote: > Le vendredi 05 avril 2019 ? 16:55 +0530, Nithya Balachandran a ?crit : > > On Fri, 5 Apr 2019 at 12:16, Michael Scherer > > wrote: > > > > > Le jeudi 04 avril 2019 ? 18:24 +0200, Michael Scherer a ?crit : > > > > Le jeudi 04 avril 2019 ? 19:10 +0300, Yaniv Kaul a ?crit : > > > > > I'm not convinced this is solved. Just had what I believe is a > > > > > similar > > > > > failure: > > > > > > > > > > *00:12:02.532* A dependency job for rpc-statd.service failed. > > > > > See > > > > > 'journalctl -xe' for details.*00:12:02.532* mount.nfs: > > > > > rpc.statd is > > > > > not running but is required for remote locking.*00:12:02.532* > > > > > mount.nfs: Either use '-o nolock' to keep locks local, or start > > > > > statd.*00:12:02.532* mount.nfs: an incorrect mount option was > > > > > specified > > > > > > > > > > (of course, it can always be my patch!) > > > > > > > > > > https://build.gluster.org/job/centos7-regression/5384/console > > > > > > > > same issue, different builder (206). I will check them all, as > > > > the > > > > issue is more widespread than I expected (or it did popup since > > > > last > > > > time I checked). > > > > > > Deepshika did notice that the issue came back on one server > > > (builder202) after a reboot, so the rpcbind issue is not related to > > > the > > > network initscript one, so the RCA continue. > > > > > > We are looking for another workaround involving fiddling with the > > > socket (until we find why it do use ipv6 at boot, but not after, > > > when > > > ipv6 is disabled). > > > > > > > Could this be relevant? > > https://access.redhat.com/solutions/2798411 > > Good catch. > > So, we already do that, Nigel took care of that (after 2 days of > research). But I didn't knew the exact symptoms, and decided to double > check just in case. > > And... there is no sysctl.conf in the initrd. Running dracut -v -f do > not change anything. > > Running "dracut -v -f -H" take care of that (and this fix the problem), > but: > - our ansible script already run that > - -H is hostonly, which is already the default on EL7 according to the > doc. > > However, if dracut-config-generic is installed, it doesn't build a > hostonly initrd, and so do not include the sysctl.conf file (who break > rpcbnd, who break the test suite). > > And for some reason, it is installed the image in ec2 (likely default), > but not by default on the builders. > > So what happen is that after a kernel upgrade, dracut rebuild a generic > initrd instead of a hostonly one, who break things. And kernel was > likely upgraded recently (and upgrade happen nightly (for some value of > "night"), so we didn't see that earlier, nor with a fresh system. 
> > > So now, we have several solution: > - be explicit on using hostonly in dracut, so this doesn't happen again > (or not for this reason) > > - disable ipv6 in rpcbind in a cleaner way (to be tested) > > - get the test suite work with ip v6 > > In the long term, I also want to monitor the processes, but for that, I > need a VPN between the nagios server and ec2, and that project got > blocked by several issues (like EC2 not support ecdsa keys, and we use > that for ansible, so we have to come back to RSA for full automated > deployment, and openvon requires to use certificates, so I need a newer > python openssl for doing what I want, and RHEL 7 is too old, etc, etc). > > As the weekend approach for me, I just rebuilt the initrd for the time > being. I guess forcing hostonly is the safest fix for now, but this > will be for monday. > -- > Michael Scherer > Sysadmin, Community Infrastructure and Platform, OSAS > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkhandel at redhat.com Wed Jun 5 06:57:21 2019 From: dkhandel at redhat.com (Deepshikha Khandelwal) Date: Wed, 5 Jun 2019 12:27:21 +0530 Subject: [Gluster-devel] [Gluster-infra] rebal-all-nodes-migrate.t always fails now In-Reply-To: References: <94bd8147c5035da76c3ac3ae90a8a02ed000106a.camel@redhat.com> <0ca34e42063ad77f323155c85a7bb3ba7a79931b.camel@redhat.com> <090785225412c2b5b269454f8812d0a165aea62d.camel@redhat.com> Message-ID: I recently added 3 builders builder208, builder209, builder210 to the regression pool. Network to these new builders did not come up because it was looking for non-existing ethernet card eth0 on reboot and hence failing. I'll reconnect them back and update here once I fix the issue today. Sorry for the inconvenience. On Tue, Jun 4, 2019 at 7:07 PM Yaniv Kaul wrote: > What was the result of this investigation? I suspect seeing the same issue > on builder209[1]. > Y. > > [1] https://build.gluster.org/job/centos7-regression/6302/consoleFull > > On Fri, Apr 5, 2019 at 5:40 PM Michael Scherer > wrote: > >> Le vendredi 05 avril 2019 ? 16:55 +0530, Nithya Balachandran a ?crit : >> > On Fri, 5 Apr 2019 at 12:16, Michael Scherer >> > wrote: >> > >> > > Le jeudi 04 avril 2019 ? 18:24 +0200, Michael Scherer a ?crit : >> > > > Le jeudi 04 avril 2019 ? 19:10 +0300, Yaniv Kaul a ?crit : >> > > > > I'm not convinced this is solved. Just had what I believe is a >> > > > > similar >> > > > > failure: >> > > > > >> > > > > *00:12:02.532* A dependency job for rpc-statd.service failed. >> > > > > See >> > > > > 'journalctl -xe' for details.*00:12:02.532* mount.nfs: >> > > > > rpc.statd is >> > > > > not running but is required for remote locking.*00:12:02.532* >> > > > > mount.nfs: Either use '-o nolock' to keep locks local, or start >> > > > > statd.*00:12:02.532* mount.nfs: an incorrect mount option was >> > > > > specified >> > > > > >> > > > > (of course, it can always be my patch!) >> > > > > >> > > > > https://build.gluster.org/job/centos7-regression/5384/console >> > > > >> > > > same issue, different builder (206). I will check them all, as >> > > > the >> > > > issue is more widespread than I expected (or it did popup since >> > > > last >> > > > time I checked). >> > > >> > > Deepshika did notice that the issue came back on one server >> > > (builder202) after a reboot, so the rpcbind issue is not related to >> > > the >> > > network initscript one, so the RCA continue. 
>> > > >> > > We are looking for another workaround involving fiddling with the >> > > socket (until we find why it do use ipv6 at boot, but not after, >> > > when >> > > ipv6 is disabled). >> > > >> > >> > Could this be relevant? >> > https://access.redhat.com/solutions/2798411 >> >> Good catch. >> >> So, we already do that, Nigel took care of that (after 2 days of >> research). But I didn't knew the exact symptoms, and decided to double >> check just in case. >> >> And... there is no sysctl.conf in the initrd. Running dracut -v -f do >> not change anything. >> >> Running "dracut -v -f -H" take care of that (and this fix the problem), >> but: >> - our ansible script already run that >> - -H is hostonly, which is already the default on EL7 according to the >> doc. >> >> However, if dracut-config-generic is installed, it doesn't build a >> hostonly initrd, and so do not include the sysctl.conf file (who break >> rpcbnd, who break the test suite). >> >> And for some reason, it is installed the image in ec2 (likely default), >> but not by default on the builders. >> >> So what happen is that after a kernel upgrade, dracut rebuild a generic >> initrd instead of a hostonly one, who break things. And kernel was >> likely upgraded recently (and upgrade happen nightly (for some value of >> "night"), so we didn't see that earlier, nor with a fresh system. >> >> >> So now, we have several solution: >> - be explicit on using hostonly in dracut, so this doesn't happen again >> (or not for this reason) >> >> - disable ipv6 in rpcbind in a cleaner way (to be tested) >> >> - get the test suite work with ip v6 >> >> In the long term, I also want to monitor the processes, but for that, I >> need a VPN between the nagios server and ec2, and that project got >> blocked by several issues (like EC2 not support ecdsa keys, and we use >> that for ansible, so we have to come back to RSA for full automated >> deployment, and openvon requires to use certificates, so I need a newer >> python openssl for doing what I want, and RHEL 7 is too old, etc, etc). >> >> As the weekend approach for me, I just rebuilt the initrd for the time >> being. I guess forcing hostonly is the safest fix for now, but this >> will be for monday. >> -- >> Michael Scherer >> Sysadmin, Community Infrastructure and Platform, OSAS >> >> >> _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Wed Jun 5 07:00:16 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Wed, 5 Jun 2019 12:30:16 +0530 Subject: [Gluster-devel] Update: GlusterFS code coverage Message-ID: All, I just wanted to update everyone about one of the initiatives we have undertaken, ie, increasing the overall code coverage of GlusterFS above 70%. 
You can have a look at current code coverage here: https://build.gluster.org/job/line-coverage/lastCompletedBuild/Line_20Coverage_20Report/ (This shows the latest all the time) The daily job, and its details are captured @ https://build.gluster.org/job/line-coverage/ When we started focus on code coverage 3 months back, our code coverage was around 60% overall. We kept the ambitious goal of increasing the code coverage by 10% before glusterfs-7.0 release, and I am happy to announce that we met this goal, before the branching. Before talking about next goals, I want to thank and call out few developers who made this happen. * Xavier Hernandez - Made EC cross 90% from < 70%. * Glusterd Team (Sanju, Rishub, Mohit, Atin) - Increased CLI/glusterd coverage * Geo-Rep Team (Kotresh, Sunny, Shwetha, Aravinda). * Sheetal (help to increase glfs-api test cases, which indirectly helped cover more code across). Also note that, Some components like AFR/replicate was already at 80%+ before we started the efforts. Now, our next goal is to make sure we have above 80% functions coverage in all of the top level components shown. Once that is done, we will focus on 75% code coverage across all components. (ie, no 'Red' in top level page). While it was possible to meet our goal of increasing the overall code coverage from 60% - 70%, increasing it above 70% is not going to be easy, mainly because it involves adding more tests for negative test cases, and adding tests with different options (currently >300 of them across). We also need to look at details from code coverage tests, and reverse engineer to see how to write a test to hit the particular line in the code. I personally invite everyone who is interested to contribute to gluster project to get involved in this effort. Help us write test cases, suggest how to improve it. Help by assigning interns write them for us (if your team has some of them). This is a good way to understand glusterfs code too. We are happy to organize sessions on how to walk through the code etc if required. Happy to hear feedback and see more contribution in this area. Regards, Amar -------------- next part -------------- An HTML attachment was scrubbed... URL: From nbalacha at redhat.com Wed Jun 5 13:52:56 2019 From: nbalacha at redhat.com (Nithya Balachandran) Date: Wed, 5 Jun 2019 19:22:56 +0530 Subject: [Gluster-devel] [Gluster-users] Memory leak in glusterfs In-Reply-To: References: Message-ID: Hi, Writing to a volume should not affect glusterd. The stack you have shown in the valgrind looks like the memory used to initialise the structures glusterd uses and will free only when it is stopped. Can you provide more details to what it is you are trying to test? Regards, Nithya On Tue, 4 Jun 2019 at 15:41, ABHISHEK PALIWAL wrote: > Hi Team, > > Please respond on the issue which I raised. > > Regards, > Abhishek > > On Fri, May 17, 2019 at 2:46 PM ABHISHEK PALIWAL > wrote: > >> Anyone please reply.... >> >> On Thu, May 16, 2019, 10:49 ABHISHEK PALIWAL >> wrote: >> >>> Hi Team, >>> >>> I upload some valgrind logs from my gluster 5.4 setup. This is writing >>> to the volume every 15 minutes. I stopped glusterd and then copy away the >>> logs. The test was running for some simulated days. They are zipped in >>> valgrind-54.zip. >>> >>> Lots of info in valgrind-2730.log. Lots of possibly lost bytes in >>> glusterfs and even some definitely lost bytes. 
>>> >>> ==2737== 1,572,880 bytes in 1 blocks are possibly lost in loss record >>> 391 of 391 >>> ==2737== at 0x4C29C25: calloc (in >>> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) >>> ==2737== by 0xA22485E: ??? (in >>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>> ==2737== by 0xA217C94: ??? (in >>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>> ==2737== by 0xA21D9F8: ??? (in >>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>> ==2737== by 0xA21DED9: ??? (in >>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>> ==2737== by 0xA21E685: ??? (in >>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>> ==2737== by 0xA1B9D8C: init (in >>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>> ==2737== by 0x4E511CE: xlator_init (in /usr/lib64/libglusterfs.so.0.0.1) >>> ==2737== by 0x4E8A2B8: ??? (in /usr/lib64/libglusterfs.so.0.0.1) >>> ==2737== by 0x4E8AAB3: glusterfs_graph_activate (in >>> /usr/lib64/libglusterfs.so.0.0.1) >>> ==2737== by 0x409C35: glusterfs_process_volfp (in /usr/sbin/glusterfsd) >>> ==2737== by 0x409D99: glusterfs_volumes_init (in /usr/sbin/glusterfsd) >>> ==2737== >>> ==2737== LEAK SUMMARY: >>> ==2737== definitely lost: 1,053 bytes in 10 blocks >>> ==2737== indirectly lost: 317 bytes in 3 blocks >>> ==2737== possibly lost: 2,374,971 bytes in 524 blocks >>> ==2737== still reachable: 53,277 bytes in 201 blocks >>> ==2737== suppressed: 0 bytes in 0 blocks >>> >>> -- >>> >>> >>> >>> >>> Regards >>> Abhishek Paliwal >>> >> > > -- > > > > > Regards > Abhishek Paliwal > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From sankarshan.mukhopadhyay at gmail.com Thu Jun 6 03:42:30 2019 From: sankarshan.mukhopadhyay at gmail.com (Sankarshan Mukhopadhyay) Date: Thu, 6 Jun 2019 09:12:30 +0530 Subject: [Gluster-devel] More intelligent file distribution across subvols of DHT when file size is known In-Reply-To: References: Message-ID: On Wed, May 22, 2019 at 6:53 PM Krutika Dhananjay wrote: > > Hi, > > I've proposed a solution to the problem of space running out in some children of DHT even when its other children have free space available, here - https://github.com/gluster/glusterfs/issues/675. > > The proposal aims to solve a very specific instance of this generic class of problems where fortunately the size of the file that is getting created is known beforehand. > > Requesting feedback on the proposal or even alternate solutions, if you have any. There has not been much commentary on this issue in the last 10 odd days. What is the next step? From ykaul at redhat.com Thu Jun 6 06:17:25 2019 From: ykaul at redhat.com (Yaniv Kaul) Date: Thu, 6 Jun 2019 09:17:25 +0300 Subject: [Gluster-devel] CI failure - NameError: name 'unicode' is not defined (related to changelogparser.py) Message-ID: >From [1]. I think it's a Python2/3 thing, so perhaps a CI issue additionally (though if our code is not Python 3 ready, let's ensure we use Python 2 explicitly until we fix this). 
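That guess is easy to confirm on a builder; a minimal sketch, assuming both interpreters are installed under their usual names:

python2 -c 'print(unicode("ok"))'   # works: unicode() is a builtin on python2
python3 -c 'print(unicode("ok"))'   # NameError: name 'unicode' is not defined
command -v python python2 python3   # see what the regression node actually resolves

The failing line in changelogparser.py hits exactly this builtin, as the log below shows.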
*00:47:05.207* ok 14 [ 13/ 386] < 34> 'gluster --mode=script --wignore volume start patchy'*00:47:05.207* ok 15 [ 13/ 70] < 36> '_GFS --attribute-timeout=0 --entry-timeout=0 --volfile-id=patchy --volfile-server=builder208.int.aws.gluster.org /mnt/glusterfs/0'*00:47:05.207* Traceback (most recent call last):*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 233, in *00:47:05.207* parse(sys.argv[1])*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 221, in parse*00:47:05.207* process_record(data, tokens, changelog_ts, callback)*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 178, in process_record*00:47:05.207* callback(record)*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 182, in default_callback*00:47:05.207* sys.stdout.write(u"{0}\n".format(record))*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 128, in __str__*00:47:05.207* return unicode(self).encode('utf-8')*00:47:05.207* NameError: name 'unicode' is not defined*00:47:05.207* not ok 16 [ 53/ 39] < 42> '2 check_changelog_op /d/backends/patchy0/.glusterfs/changelogs RENAME' -> 'Got "0" instead of "2"' Y. [1] https://build.gluster.org/job/centos7-regression/6318/console -------------- next part -------------- An HTML attachment was scrubbed... URL: From xhernandez at redhat.com Thu Jun 6 06:32:59 2019 From: xhernandez at redhat.com (Xavi Hernandez) Date: Thu, 6 Jun 2019 08:32:59 +0200 Subject: [Gluster-devel] Should we enable contention notification by default ? In-Reply-To: References: <2044282595.16006319.1556799471980.JavaMail.zimbra@redhat.com> Message-ID: On Thu, May 2, 2019 at 5:45 PM Atin Mukherjee wrote: > > > On Thu, 2 May 2019 at 20:38, Xavi Hernandez wrote: > >> On Thu, May 2, 2019 at 4:06 PM Atin Mukherjee >> wrote: >> >>> >>> >>> On Thu, 2 May 2019 at 19:14, Xavi Hernandez >>> wrote: >>> >>>> On Thu, 2 May 2019, 15:37 Milind Changire, wrote: >>>> >>>>> On Thu, May 2, 2019 at 6:44 PM Xavi Hernandez >>>>> wrote: >>>>> >>>>>> Hi Ashish, >>>>>> >>>>>> On Thu, May 2, 2019 at 2:17 PM Ashish Pandey >>>>>> wrote: >>>>>> >>>>>>> Xavi, >>>>>>> >>>>>>> I would like to keep this option (features.lock-notify-contention) >>>>>>> enabled by default. >>>>>>> However, I can see that there is one more option which will impact >>>>>>> the working of this option which is "notify-contention-delay" >>>>>>> >>>>>> >>>>> Just a nit. I wish the option was called "notify-contention-interval" >>>>> The "delay" part doesn't really emphasize where the delay would be put >>>>> in. >>>>> >>>> >>>> It makes sense. Maybe we can also rename it or add a second name >>>> (alias). If there are no objections, I will send a patch with the change. >>>> >>>> Xavi >>>> >>>> >>>>> >>>>>> .description = "This value determines the minimum amount of time >>>>>>> " >>>>>>> "(in seconds) between upcall contention >>>>>>> notifications " >>>>>>> "on the same inode. If multiple lock requests >>>>>>> are " >>>>>>> "received during this period, only one upcall >>>>>>> will " >>>>>>> "be sent."}, >>>>>>> >>>>>>> I am not sure what should be the best value for this option if we >>>>>>> want to keep features.lock-notify-contention ON by default? >>>>>>> It looks like if we keep the value of notify-contention-delay more, >>>>>>> say 5 sec, it will wait for this much time to send up call >>>>>>> notification which does not look good. >>>>>>> >>>>>> >>>>>> No, the first notification is sent immediately. 
What this option does >>>>>> is to define the minimum interval between notifications. This interval is >>>>>> per lock. This is done to avoid storms of notifications if many requests >>>>>> come referencing the same lock. >>>>>> >>>>>> Is my understanding correct? >>>>>>> What will be impact of this value and what should be the default >>>>>>> value of this option? >>>>>>> >>>>>> >>>>>> I think the current default value of 5 seconds seems good enough. If >>>>>> there are many bricks, each brick could send a notification per lock. 1000 >>>>>> bricks would mean a client would receive 1000 notifications every 5 >>>>>> seconds. It doesn't seem too much, but in those cases 10, and considering >>>>>> we could have other locks, maybe a higher value could be better. >>>>>> >>>>>> Xavi >>>>>> >>>>>> >>>>>>> >>>>>>> --- >>>>>>> Ashish >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> ------------------------------ >>>>>>> *From: *"Xavi Hernandez" >>>>>>> *To: *"gluster-devel" >>>>>>> *Cc: *"Pranith Kumar Karampuri" , "Ashish >>>>>>> Pandey" , "Amar Tumballi" >>>>>>> *Sent: *Thursday, May 2, 2019 4:15:38 PM >>>>>>> *Subject: *Should we enable contention notification by default ? >>>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> there's a feature in the locks xlator that sends a notification to >>>>>>> current owner of a lock when another client tries to acquire the same lock. >>>>>>> This way the current owner is made aware of the contention and can release >>>>>>> the lock as soon as possible to allow the other client to proceed. >>>>>>> >>>>>>> This is specially useful when eager-locking is used and multiple >>>>>>> clients access the same files and directories. Currently both replicated >>>>>>> and dispersed volumes use eager-locking and can use contention notification >>>>>>> to force an early release of the lock. >>>>>>> >>>>>>> Eager-locking reduces the number of network requests required for >>>>>>> each operation, improving performance, but could add delays to other >>>>>>> clients while it keeps the inode or entry locked. With the contention >>>>>>> notification feature we avoid this delay, so we get the best performance >>>>>>> with minimal issues in multiclient environments. >>>>>>> >>>>>>> Currently the contention notification feature is controlled by the >>>>>>> 'features.lock-notify-contention' option and it's disabled by default. >>>>>>> Should we enable it by default ? >>>>>>> >>>>>>> I don't see any reason to keep it disabled by default. Does anyone >>>>>>> foresee any problem ? >>>>>>> >>>>>> >>> Is it a server only option? Otherwise it will break backward >>> compatibility if we rename the key, If alias can get this fixed, that?s a >>> better choice but I?m not sure if it solves all the problems. >>> >> >> It's a server side option. I though that an alias didn't have any other >> implication than accept two names for the same option. Is there anything >> else I need to consider ? >> > > If it?s a server side option then there?s no challenge in alias. If you do > rename then in heterogeneous server versions volume set wouldn?t work > though. > I created a patch to change this and set notify-contention to 'yes' by default. I'll test upgrade paths to make sure that nothing breaks. 
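Until that default lands, the behaviour can be enabled per volume from the CLI; a minimal sketch, where <volname> is a placeholder and the exact key of the throttle option may differ between versions:

gluster volume set <volname> features.lock-notify-contention on
gluster volume get <volname> features.lock-notify-contention
gluster volume get <volname> all | grep -i contention   # lists the related notify-contention-delay key your version exposes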
Xavi > >> >>> >>>>>>> Regards, >>>>>>> >>>>>>> Xavi >>>>>>> >>>>>>> _______________________________________________ >>>>>> Gluster-devel mailing list >>>>>> Gluster-devel at gluster.org >>>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>>>> >>>>> >>>>> >>>>> -- >>>>> Milind >>>>> >>>>> _______________________________________________ >>>> Gluster-devel mailing list >>>> Gluster-devel at gluster.org >>>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>> >>> -- >>> --Atin >>> >> -- > --Atin > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xhernandez at redhat.com Thu Jun 6 06:38:45 2019 From: xhernandez at redhat.com (Xavi Hernandez) Date: Thu, 6 Jun 2019 08:38:45 +0200 Subject: [Gluster-devel] Should we enable contention notification by default ? In-Reply-To: References: <2044282595.16006319.1556799471980.JavaMail.zimbra@redhat.com> Message-ID: Missed the patch link: https://review.gluster.org/c/glusterfs/+/22828 On Thu, Jun 6, 2019 at 8:32 AM Xavi Hernandez wrote: > On Thu, May 2, 2019 at 5:45 PM Atin Mukherjee > wrote: > >> >> >> On Thu, 2 May 2019 at 20:38, Xavi Hernandez >> wrote: >> >>> On Thu, May 2, 2019 at 4:06 PM Atin Mukherjee < >>> atin.mukherjee83 at gmail.com> wrote: >>> >>>> >>>> >>>> On Thu, 2 May 2019 at 19:14, Xavi Hernandez >>>> wrote: >>>> >>>>> On Thu, 2 May 2019, 15:37 Milind Changire, >>>>> wrote: >>>>> >>>>>> On Thu, May 2, 2019 at 6:44 PM Xavi Hernandez >>>>>> wrote: >>>>>> >>>>>>> Hi Ashish, >>>>>>> >>>>>>> On Thu, May 2, 2019 at 2:17 PM Ashish Pandey >>>>>>> wrote: >>>>>>> >>>>>>>> Xavi, >>>>>>>> >>>>>>>> I would like to keep this option (features.lock-notify-contention) >>>>>>>> enabled by default. >>>>>>>> However, I can see that there is one more option which will impact >>>>>>>> the working of this option which is "notify-contention-delay" >>>>>>>> >>>>>>> >>>>>> Just a nit. I wish the option was called "notify-contention-interval" >>>>>> The "delay" part doesn't really emphasize where the delay would be >>>>>> put in. >>>>>> >>>>> >>>>> It makes sense. Maybe we can also rename it or add a second name >>>>> (alias). If there are no objections, I will send a patch with the change. >>>>> >>>>> Xavi >>>>> >>>>> >>>>>> >>>>>>> .description = "This value determines the minimum amount of >>>>>>>> time " >>>>>>>> "(in seconds) between upcall contention >>>>>>>> notifications " >>>>>>>> "on the same inode. If multiple lock requests >>>>>>>> are " >>>>>>>> "received during this period, only one upcall >>>>>>>> will " >>>>>>>> "be sent."}, >>>>>>>> >>>>>>>> I am not sure what should be the best value for this option if we >>>>>>>> want to keep features.lock-notify-contention ON by default? >>>>>>>> It looks like if we keep the value of notify-contention-delay more, >>>>>>>> say 5 sec, it will wait for this much time to send up call >>>>>>>> notification which does not look good. >>>>>>>> >>>>>>> >>>>>>> No, the first notification is sent immediately. What this option >>>>>>> does is to define the minimum interval between notifications. This interval >>>>>>> is per lock. This is done to avoid storms of notifications if many requests >>>>>>> come referencing the same lock. >>>>>>> >>>>>>> Is my understanding correct? >>>>>>>> What will be impact of this value and what should be the default >>>>>>>> value of this option? >>>>>>>> >>>>>>> >>>>>>> I think the current default value of 5 seconds seems good enough. If >>>>>>> there are many bricks, each brick could send a notification per lock. 
1000 >>>>>>> bricks would mean a client would receive 1000 notifications every 5 >>>>>>> seconds. It doesn't seem too much, but in those cases 10, and considering >>>>>>> we could have other locks, maybe a higher value could be better. >>>>>>> >>>>>>> Xavi >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> Ashish >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> ------------------------------ >>>>>>>> *From: *"Xavi Hernandez" >>>>>>>> *To: *"gluster-devel" >>>>>>>> *Cc: *"Pranith Kumar Karampuri" , "Ashish >>>>>>>> Pandey" , "Amar Tumballi" >>>>>>> > >>>>>>>> *Sent: *Thursday, May 2, 2019 4:15:38 PM >>>>>>>> *Subject: *Should we enable contention notification by default ? >>>>>>>> >>>>>>>> Hi all, >>>>>>>> >>>>>>>> there's a feature in the locks xlator that sends a notification to >>>>>>>> current owner of a lock when another client tries to acquire the same lock. >>>>>>>> This way the current owner is made aware of the contention and can release >>>>>>>> the lock as soon as possible to allow the other client to proceed. >>>>>>>> >>>>>>>> This is specially useful when eager-locking is used and multiple >>>>>>>> clients access the same files and directories. Currently both replicated >>>>>>>> and dispersed volumes use eager-locking and can use contention notification >>>>>>>> to force an early release of the lock. >>>>>>>> >>>>>>>> Eager-locking reduces the number of network requests required for >>>>>>>> each operation, improving performance, but could add delays to other >>>>>>>> clients while it keeps the inode or entry locked. With the contention >>>>>>>> notification feature we avoid this delay, so we get the best performance >>>>>>>> with minimal issues in multiclient environments. >>>>>>>> >>>>>>>> Currently the contention notification feature is controlled by the >>>>>>>> 'features.lock-notify-contention' option and it's disabled by default. >>>>>>>> Should we enable it by default ? >>>>>>>> >>>>>>>> I don't see any reason to keep it disabled by default. Does anyone >>>>>>>> foresee any problem ? >>>>>>>> >>>>>>> >>>> Is it a server only option? Otherwise it will break backward >>>> compatibility if we rename the key, If alias can get this fixed, that?s a >>>> better choice but I?m not sure if it solves all the problems. >>>> >>> >>> It's a server side option. I though that an alias didn't have any other >>> implication than accept two names for the same option. Is there anything >>> else I need to consider ? >>> >> >> If it?s a server side option then there?s no challenge in alias. If you >> do rename then in heterogeneous server versions volume set wouldn?t work >> though. >> > > I created a patch to change this and set notify-contention to 'yes' by > default. I'll test upgrade paths to make sure that nothing breaks. > > Xavi > > >> >>> >>>> >>>>>>>> Regards, >>>>>>>> >>>>>>>> Xavi >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>> Gluster-devel mailing list >>>>>>> Gluster-devel at gluster.org >>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Milind >>>>>> >>>>>> _______________________________________________ >>>>> Gluster-devel mailing list >>>>> Gluster-devel at gluster.org >>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>>> >>>> -- >>>> --Atin >>>> >>> -- >> --Atin >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From abhishpaliwal at gmail.com Thu Jun 6 06:38:20 2019 From: abhishpaliwal at gmail.com (ABHISHEK PALIWAL) Date: Thu, 6 Jun 2019 12:08:20 +0530 Subject: [Gluster-devel] [Gluster-users] Memory leak in glusterfs In-Reply-To: References: Message-ID: Hi Nithya, Here is the Setup details and test which we are doing as below: One client, two gluster Server. The client is writing and deleting one file each 15 minutes by script test_v4.15.sh. IP Server side: 128.224.98.157 /gluster/gv0/ 128.224.98.159 /gluster/gv0/ Client side: 128.224.98.160 /gluster_mount/ Server side: gluster volume create gv0 replica 2 128.224.98.157:/gluster/gv0/ 128.224.98.159:/gluster/gv0/ force gluster volume start gv0 root at 128:/tmp/brick/gv0# gluster volume info Volume Name: gv0 Type: Replicate Volume ID: 7105a475-5929-4d60-ba23-be57445d97b5 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: 128.224.98.157:/gluster/gv0 Brick2: 128.224.98.159:/gluster/gv0 Options Reconfigured: transport.address-family: inet nfs.disable: on performance.client-io-threads: off exec script: ./ps_mem.py -p 605 -w 61 > log root at 128:/# ./ps_mem.py -p 605 Private + Shared = RAM used Program 23668.0 KiB + 1188.0 KiB = 24856.0 KiB glusterfsd --------------------------------- 24856.0 KiB ================================= Client side: mount -t glusterfs -o acl -o resolve-gids 128.224.98.157:gv0 /gluster_mount We are using the below script write and delete the file. *test_v4.15.sh * Also the below script to see the memory increase whihle the script is above script is running in background. *ps_mem.py* I am attaching the script files as well as the result got after testing the scenario. On Wed, Jun 5, 2019 at 7:23 PM Nithya Balachandran wrote: > Hi, > > Writing to a volume should not affect glusterd. The stack you have shown > in the valgrind looks like the memory used to initialise the structures > glusterd uses and will free only when it is stopped. > > Can you provide more details to what it is you are trying to test? > > Regards, > Nithya > > > On Tue, 4 Jun 2019 at 15:41, ABHISHEK PALIWAL > wrote: > >> Hi Team, >> >> Please respond on the issue which I raised. >> >> Regards, >> Abhishek >> >> On Fri, May 17, 2019 at 2:46 PM ABHISHEK PALIWAL >> wrote: >> >>> Anyone please reply.... >>> >>> On Thu, May 16, 2019, 10:49 ABHISHEK PALIWAL >>> wrote: >>> >>>> Hi Team, >>>> >>>> I upload some valgrind logs from my gluster 5.4 setup. This is writing >>>> to the volume every 15 minutes. I stopped glusterd and then copy away the >>>> logs. The test was running for some simulated days. They are zipped in >>>> valgrind-54.zip. >>>> >>>> Lots of info in valgrind-2730.log. Lots of possibly lost bytes in >>>> glusterfs and even some definitely lost bytes. >>>> >>>> ==2737== 1,572,880 bytes in 1 blocks are possibly lost in loss record >>>> 391 of 391 >>>> ==2737== at 0x4C29C25: calloc (in >>>> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) >>>> ==2737== by 0xA22485E: ??? (in >>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>> ==2737== by 0xA217C94: ??? (in >>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>> ==2737== by 0xA21D9F8: ??? (in >>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>> ==2737== by 0xA21DED9: ??? (in >>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>> ==2737== by 0xA21E685: ??? 
(in >>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>> ==2737== by 0xA1B9D8C: init (in >>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>> ==2737== by 0x4E511CE: xlator_init (in /usr/lib64/libglusterfs.so.0.0.1) >>>> ==2737== by 0x4E8A2B8: ??? (in /usr/lib64/libglusterfs.so.0.0.1) >>>> ==2737== by 0x4E8AAB3: glusterfs_graph_activate (in >>>> /usr/lib64/libglusterfs.so.0.0.1) >>>> ==2737== by 0x409C35: glusterfs_process_volfp (in /usr/sbin/glusterfsd) >>>> ==2737== by 0x409D99: glusterfs_volumes_init (in /usr/sbin/glusterfsd) >>>> ==2737== >>>> ==2737== LEAK SUMMARY: >>>> ==2737== definitely lost: 1,053 bytes in 10 blocks >>>> ==2737== indirectly lost: 317 bytes in 3 blocks >>>> ==2737== possibly lost: 2,374,971 bytes in 524 blocks >>>> ==2737== still reachable: 53,277 bytes in 201 blocks >>>> ==2737== suppressed: 0 bytes in 0 blocks >>>> >>>> -- >>>> >>>> >>>> >>>> >>>> Regards >>>> Abhishek Paliwal >>>> >>> >> >> -- >> >> >> >> >> Regards >> Abhishek Paliwal >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > -- Regards Abhishek Paliwal -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ps_mem.py Type: text/x-python Size: 18465 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test_v4.15.sh Type: application/x-shellscript Size: 660 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ps_mem_server1.log Type: text/x-log Size: 135168 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ps_mem_server2.log Type: text/x-log Size: 135168 bytes Desc: not available URL: From spisla80 at gmail.com Thu Jun 6 06:46:52 2019 From: spisla80 at gmail.com (David Spisla) Date: Thu, 6 Jun 2019 08:46:52 +0200 Subject: [Gluster-devel] Bitrot: Segmentation fault found in bitrot stub Message-ID: Dear Gluster Devel, all informations are here: https://bugzilla.redhat.com/show_bug.cgi?id=1717757 Also a full backtrace is provided. The place of of the seg fault is located Regards David Spisla -------------- next part -------------- An HTML attachment was scrubbed... URL: From nbalacha at redhat.com Thu Jun 6 10:38:17 2019 From: nbalacha at redhat.com (Nithya Balachandran) Date: Thu, 6 Jun 2019 16:08:17 +0530 Subject: [Gluster-devel] [Gluster-users] Memory leak in glusterfs In-Reply-To: References: Message-ID: Hi Abhishek, I am still not clear as to the purpose of the tests. Can you clarify why you are using valgrind and why you think there is a memory leak? Regards, Nithya On Thu, 6 Jun 2019 at 12:09, ABHISHEK PALIWAL wrote: > Hi Nithya, > > Here is the Setup details and test which we are doing as below: > > > One client, two gluster Server. > The client is writing and deleting one file each 15 minutes by script > test_v4.15.sh. 
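The attached test_v4.15.sh itself is not reproduced in the archive; a hypothetical loop of roughly this shape matches the description, with /gluster_mount taken from the client details that follow:

# hypothetical stand-in for the attached test_v4.15.sh, not the actual script
while true; do
    dd if=/dev/urandom of=/gluster_mount/testfile bs=1M count=1
    rm -f /gluster_mount/testfile
    sleep 900    # one write/delete cycle every 15 minutes
done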
> > IP > Server side: > 128.224.98.157 /gluster/gv0/ > 128.224.98.159 /gluster/gv0/ > > Client side: > 128.224.98.160 /gluster_mount/ > > Server side: > gluster volume create gv0 replica 2 128.224.98.157:/gluster/gv0/ > 128.224.98.159:/gluster/gv0/ force > gluster volume start gv0 > > root at 128:/tmp/brick/gv0# gluster volume info > > Volume Name: gv0 > Type: Replicate > Volume ID: 7105a475-5929-4d60-ba23-be57445d97b5 > Status: Started > Snapshot Count: 0 > Number of Bricks: 1 x 2 = 2 > Transport-type: tcp > Bricks: > Brick1: 128.224.98.157:/gluster/gv0 > Brick2: 128.224.98.159:/gluster/gv0 > Options Reconfigured: > transport.address-family: inet > nfs.disable: on > performance.client-io-threads: off > > exec script: ./ps_mem.py -p 605 -w 61 > log > root at 128:/# ./ps_mem.py -p 605 > Private + Shared = RAM used Program > 23668.0 KiB + 1188.0 KiB = 24856.0 KiB glusterfsd > --------------------------------- > 24856.0 KiB > ================================= > > > Client side: > mount -t glusterfs -o acl -o resolve-gids 128.224.98.157:gv0 > /gluster_mount > > > We are using the below script write and delete the file. > > *test_v4.15.sh * > > Also the below script to see the memory increase whihle the script is > above script is running in background. > > *ps_mem.py* > > I am attaching the script files as well as the result got after testing > the scenario. > > On Wed, Jun 5, 2019 at 7:23 PM Nithya Balachandran > wrote: > >> Hi, >> >> Writing to a volume should not affect glusterd. The stack you have shown >> in the valgrind looks like the memory used to initialise the structures >> glusterd uses and will free only when it is stopped. >> >> Can you provide more details to what it is you are trying to test? >> >> Regards, >> Nithya >> >> >> On Tue, 4 Jun 2019 at 15:41, ABHISHEK PALIWAL >> wrote: >> >>> Hi Team, >>> >>> Please respond on the issue which I raised. >>> >>> Regards, >>> Abhishek >>> >>> On Fri, May 17, 2019 at 2:46 PM ABHISHEK PALIWAL < >>> abhishpaliwal at gmail.com> wrote: >>> >>>> Anyone please reply.... >>>> >>>> On Thu, May 16, 2019, 10:49 ABHISHEK PALIWAL >>>> wrote: >>>> >>>>> Hi Team, >>>>> >>>>> I upload some valgrind logs from my gluster 5.4 setup. This is writing >>>>> to the volume every 15 minutes. I stopped glusterd and then copy away the >>>>> logs. The test was running for some simulated days. They are zipped in >>>>> valgrind-54.zip. >>>>> >>>>> Lots of info in valgrind-2730.log. Lots of possibly lost bytes in >>>>> glusterfs and even some definitely lost bytes. >>>>> >>>>> ==2737== 1,572,880 bytes in 1 blocks are possibly lost in loss record >>>>> 391 of 391 >>>>> ==2737== at 0x4C29C25: calloc (in >>>>> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) >>>>> ==2737== by 0xA22485E: ??? (in >>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>> ==2737== by 0xA217C94: ??? (in >>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>> ==2737== by 0xA21D9F8: ??? (in >>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>> ==2737== by 0xA21DED9: ??? (in >>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>> ==2737== by 0xA21E685: ??? (in >>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>> ==2737== by 0xA1B9D8C: init (in >>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>> ==2737== by 0x4E511CE: xlator_init (in >>>>> /usr/lib64/libglusterfs.so.0.0.1) >>>>> ==2737== by 0x4E8A2B8: ??? 
(in /usr/lib64/libglusterfs.so.0.0.1) >>>>> ==2737== by 0x4E8AAB3: glusterfs_graph_activate (in >>>>> /usr/lib64/libglusterfs.so.0.0.1) >>>>> ==2737== by 0x409C35: glusterfs_process_volfp (in /usr/sbin/glusterfsd) >>>>> ==2737== by 0x409D99: glusterfs_volumes_init (in /usr/sbin/glusterfsd) >>>>> ==2737== >>>>> ==2737== LEAK SUMMARY: >>>>> ==2737== definitely lost: 1,053 bytes in 10 blocks >>>>> ==2737== indirectly lost: 317 bytes in 3 blocks >>>>> ==2737== possibly lost: 2,374,971 bytes in 524 blocks >>>>> ==2737== still reachable: 53,277 bytes in 201 blocks >>>>> ==2737== suppressed: 0 bytes in 0 blocks >>>>> >>>>> -- >>>>> >>>>> >>>>> >>>>> >>>>> Regards >>>>> Abhishek Paliwal >>>>> >>>> >>> >>> -- >>> >>> >>> >>> >>> Regards >>> Abhishek Paliwal >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> > > -- > > > > > Regards > Abhishek Paliwal > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhishpaliwal at gmail.com Fri Jun 7 02:43:03 2019 From: abhishpaliwal at gmail.com (ABHISHEK PALIWAL) Date: Fri, 7 Jun 2019 08:13:03 +0530 Subject: [Gluster-devel] [Gluster-users] Memory leak in glusterfs In-Reply-To: References: Message-ID: Hi Nithya, We are having the setup where copying the file to and deleting it from gluster mount point to update the latest file. We noticed due to this having some memory increase in glusterfsd process. To find the memory leak we are using valgrind but didn't get any help. That's why contacted to glusterfs community. Regards, Abhishek On Thu, Jun 6, 2019, 16:08 Nithya Balachandran wrote: > Hi Abhishek, > > I am still not clear as to the purpose of the tests. Can you clarify why > you are using valgrind and why you think there is a memory leak? > > Regards, > Nithya > > On Thu, 6 Jun 2019 at 12:09, ABHISHEK PALIWAL > wrote: > >> Hi Nithya, >> >> Here is the Setup details and test which we are doing as below: >> >> >> One client, two gluster Server. >> The client is writing and deleting one file each 15 minutes by script >> test_v4.15.sh. >> >> IP >> Server side: >> 128.224.98.157 /gluster/gv0/ >> 128.224.98.159 /gluster/gv0/ >> >> Client side: >> 128.224.98.160 /gluster_mount/ >> >> Server side: >> gluster volume create gv0 replica 2 128.224.98.157:/gluster/gv0/ >> 128.224.98.159:/gluster/gv0/ force >> gluster volume start gv0 >> >> root at 128:/tmp/brick/gv0# gluster volume info >> >> Volume Name: gv0 >> Type: Replicate >> Volume ID: 7105a475-5929-4d60-ba23-be57445d97b5 >> Status: Started >> Snapshot Count: 0 >> Number of Bricks: 1 x 2 = 2 >> Transport-type: tcp >> Bricks: >> Brick1: 128.224.98.157:/gluster/gv0 >> Brick2: 128.224.98.159:/gluster/gv0 >> Options Reconfigured: >> transport.address-family: inet >> nfs.disable: on >> performance.client-io-threads: off >> >> exec script: ./ps_mem.py -p 605 -w 61 > log >> root at 128:/# ./ps_mem.py -p 605 >> Private + Shared = RAM used Program >> 23668.0 KiB + 1188.0 KiB = 24856.0 KiB glusterfsd >> --------------------------------- >> 24856.0 KiB >> ================================= >> >> >> Client side: >> mount -t glusterfs -o acl -o resolve-gids 128.224.98.157:gv0 >> /gluster_mount >> >> >> We are using the below script write and delete the file. >> >> *test_v4.15.sh * >> >> Also the below script to see the memory increase whihle the script is >> above script is running in background. 
>> >> *ps_mem.py* >> >> I am attaching the script files as well as the result got after testing >> the scenario. >> >> On Wed, Jun 5, 2019 at 7:23 PM Nithya Balachandran >> wrote: >> >>> Hi, >>> >>> Writing to a volume should not affect glusterd. The stack you have shown >>> in the valgrind looks like the memory used to initialise the structures >>> glusterd uses and will free only when it is stopped. >>> >>> Can you provide more details to what it is you are trying to test? >>> >>> Regards, >>> Nithya >>> >>> >>> On Tue, 4 Jun 2019 at 15:41, ABHISHEK PALIWAL >>> wrote: >>> >>>> Hi Team, >>>> >>>> Please respond on the issue which I raised. >>>> >>>> Regards, >>>> Abhishek >>>> >>>> On Fri, May 17, 2019 at 2:46 PM ABHISHEK PALIWAL < >>>> abhishpaliwal at gmail.com> wrote: >>>> >>>>> Anyone please reply.... >>>>> >>>>> On Thu, May 16, 2019, 10:49 ABHISHEK PALIWAL >>>>> wrote: >>>>> >>>>>> Hi Team, >>>>>> >>>>>> I upload some valgrind logs from my gluster 5.4 setup. This is >>>>>> writing to the volume every 15 minutes. I stopped glusterd and then copy >>>>>> away the logs. The test was running for some simulated days. They are >>>>>> zipped in valgrind-54.zip. >>>>>> >>>>>> Lots of info in valgrind-2730.log. Lots of possibly lost bytes in >>>>>> glusterfs and even some definitely lost bytes. >>>>>> >>>>>> ==2737== 1,572,880 bytes in 1 blocks are possibly lost in loss record >>>>>> 391 of 391 >>>>>> ==2737== at 0x4C29C25: calloc (in >>>>>> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) >>>>>> ==2737== by 0xA22485E: ??? (in >>>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>>> ==2737== by 0xA217C94: ??? (in >>>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>>> ==2737== by 0xA21D9F8: ??? (in >>>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>>> ==2737== by 0xA21DED9: ??? (in >>>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>>> ==2737== by 0xA21E685: ??? (in >>>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>>> ==2737== by 0xA1B9D8C: init (in >>>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>>> ==2737== by 0x4E511CE: xlator_init (in >>>>>> /usr/lib64/libglusterfs.so.0.0.1) >>>>>> ==2737== by 0x4E8A2B8: ??? (in /usr/lib64/libglusterfs.so.0.0.1) >>>>>> ==2737== by 0x4E8AAB3: glusterfs_graph_activate (in >>>>>> /usr/lib64/libglusterfs.so.0.0.1) >>>>>> ==2737== by 0x409C35: glusterfs_process_volfp (in >>>>>> /usr/sbin/glusterfsd) >>>>>> ==2737== by 0x409D99: glusterfs_volumes_init (in /usr/sbin/glusterfsd) >>>>>> ==2737== >>>>>> ==2737== LEAK SUMMARY: >>>>>> ==2737== definitely lost: 1,053 bytes in 10 blocks >>>>>> ==2737== indirectly lost: 317 bytes in 3 blocks >>>>>> ==2737== possibly lost: 2,374,971 bytes in 524 blocks >>>>>> ==2737== still reachable: 53,277 bytes in 201 blocks >>>>>> ==2737== suppressed: 0 bytes in 0 blocks >>>>>> >>>>>> -- >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Regards >>>>>> Abhishek Paliwal >>>>>> >>>>> >>>> >>>> -- >>>> >>>> >>>> >>>> >>>> Regards >>>> Abhishek Paliwal >>>> _______________________________________________ >>>> Gluster-users mailing list >>>> Gluster-users at gluster.org >>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>> >>> >> >> -- >> >> >> >> >> Regards >> Abhishek Paliwal >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nbalacha at redhat.com Fri Jun 7 03:09:03 2019 From: nbalacha at redhat.com (Nithya Balachandran) Date: Fri, 7 Jun 2019 08:39:03 +0530 Subject: [Gluster-devel] [Gluster-users] Memory leak in glusterfs In-Reply-To: References: Message-ID: Hi Abhishek, Please use statedumps taken at intervals to determine where the memory is increasing. See [1] for details. Regards, Nithya [1] https://docs.gluster.org/en/latest/Troubleshooting/statedump/ On Fri, 7 Jun 2019 at 08:13, ABHISHEK PALIWAL wrote: > Hi Nithya, > > We are having the setup where copying the file to and deleting it from > gluster mount point to update the latest file. We noticed due to this > having some memory increase in glusterfsd process. > > To find the memory leak we are using valgrind but didn't get any help. > > That's why contacted to glusterfs community. > > Regards, > Abhishek > > On Thu, Jun 6, 2019, 16:08 Nithya Balachandran > wrote: > >> Hi Abhishek, >> >> I am still not clear as to the purpose of the tests. Can you clarify why >> you are using valgrind and why you think there is a memory leak? >> >> Regards, >> Nithya >> >> On Thu, 6 Jun 2019 at 12:09, ABHISHEK PALIWAL >> wrote: >> >>> Hi Nithya, >>> >>> Here is the Setup details and test which we are doing as below: >>> >>> >>> One client, two gluster Server. >>> The client is writing and deleting one file each 15 minutes by script >>> test_v4.15.sh. >>> >>> IP >>> Server side: >>> 128.224.98.157 /gluster/gv0/ >>> 128.224.98.159 /gluster/gv0/ >>> >>> Client side: >>> 128.224.98.160 /gluster_mount/ >>> >>> Server side: >>> gluster volume create gv0 replica 2 128.224.98.157:/gluster/gv0/ >>> 128.224.98.159:/gluster/gv0/ force >>> gluster volume start gv0 >>> >>> root at 128:/tmp/brick/gv0# gluster volume info >>> >>> Volume Name: gv0 >>> Type: Replicate >>> Volume ID: 7105a475-5929-4d60-ba23-be57445d97b5 >>> Status: Started >>> Snapshot Count: 0 >>> Number of Bricks: 1 x 2 = 2 >>> Transport-type: tcp >>> Bricks: >>> Brick1: 128.224.98.157:/gluster/gv0 >>> Brick2: 128.224.98.159:/gluster/gv0 >>> Options Reconfigured: >>> transport.address-family: inet >>> nfs.disable: on >>> performance.client-io-threads: off >>> >>> exec script: ./ps_mem.py -p 605 -w 61 > log >>> root at 128:/# ./ps_mem.py -p 605 >>> Private + Shared = RAM used Program >>> 23668.0 KiB + 1188.0 KiB = 24856.0 KiB glusterfsd >>> --------------------------------- >>> 24856.0 KiB >>> ================================= >>> >>> >>> Client side: >>> mount -t glusterfs -o acl -o resolve-gids 128.224.98.157:gv0 >>> /gluster_mount >>> >>> >>> We are using the below script write and delete the file. >>> >>> *test_v4.15.sh * >>> >>> Also the below script to see the memory increase whihle the script is >>> above script is running in background. >>> >>> *ps_mem.py* >>> >>> I am attaching the script files as well as the result got after testing >>> the scenario. >>> >>> On Wed, Jun 5, 2019 at 7:23 PM Nithya Balachandran >>> wrote: >>> >>>> Hi, >>>> >>>> Writing to a volume should not affect glusterd. The stack you have >>>> shown in the valgrind looks like the memory used to initialise the >>>> structures glusterd uses and will free only when it is stopped. >>>> >>>> Can you provide more details to what it is you are trying to test? >>>> >>>> Regards, >>>> Nithya >>>> >>>> >>>> On Tue, 4 Jun 2019 at 15:41, ABHISHEK PALIWAL >>>> wrote: >>>> >>>>> Hi Team, >>>>> >>>>> Please respond on the issue which I raised. 
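A minimal sketch of that statedump approach, using the gv0 volume from this thread; the pid and the dump location are whatever your setup reports (server.statedump-path defaults to /var/run/gluster):

gluster volume statedump gv0       # dump the state of all brick processes of the volume
kill -USR1 <pid-of-glusterfsd>     # or signal one specific gluster process directly
ls /var/run/gluster/*.dump.*       # one timestamped file per process
# take a second set after the write/delete loop has run for a while and compare the
# per-xlator memory accounting (num_allocs / total_allocs / size) between the dumps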
>>>>> >>>>> Regards, >>>>> Abhishek >>>>> >>>>> On Fri, May 17, 2019 at 2:46 PM ABHISHEK PALIWAL < >>>>> abhishpaliwal at gmail.com> wrote: >>>>> >>>>>> Anyone please reply.... >>>>>> >>>>>> On Thu, May 16, 2019, 10:49 ABHISHEK PALIWAL >>>>>> wrote: >>>>>> >>>>>>> Hi Team, >>>>>>> >>>>>>> I upload some valgrind logs from my gluster 5.4 setup. This is >>>>>>> writing to the volume every 15 minutes. I stopped glusterd and then copy >>>>>>> away the logs. The test was running for some simulated days. They are >>>>>>> zipped in valgrind-54.zip. >>>>>>> >>>>>>> Lots of info in valgrind-2730.log. Lots of possibly lost bytes in >>>>>>> glusterfs and even some definitely lost bytes. >>>>>>> >>>>>>> ==2737== 1,572,880 bytes in 1 blocks are possibly lost in loss >>>>>>> record 391 of 391 >>>>>>> ==2737== at 0x4C29C25: calloc (in >>>>>>> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) >>>>>>> ==2737== by 0xA22485E: ??? (in >>>>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>>>> ==2737== by 0xA217C94: ??? (in >>>>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>>>> ==2737== by 0xA21D9F8: ??? (in >>>>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>>>> ==2737== by 0xA21DED9: ??? (in >>>>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>>>> ==2737== by 0xA21E685: ??? (in >>>>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>>>> ==2737== by 0xA1B9D8C: init (in >>>>>>> /usr/lib64/glusterfs/5.4/xlator/mgmt/glusterd.so) >>>>>>> ==2737== by 0x4E511CE: xlator_init (in >>>>>>> /usr/lib64/libglusterfs.so.0.0.1) >>>>>>> ==2737== by 0x4E8A2B8: ??? (in /usr/lib64/libglusterfs.so.0.0.1) >>>>>>> ==2737== by 0x4E8AAB3: glusterfs_graph_activate (in >>>>>>> /usr/lib64/libglusterfs.so.0.0.1) >>>>>>> ==2737== by 0x409C35: glusterfs_process_volfp (in >>>>>>> /usr/sbin/glusterfsd) >>>>>>> ==2737== by 0x409D99: glusterfs_volumes_init (in >>>>>>> /usr/sbin/glusterfsd) >>>>>>> ==2737== >>>>>>> ==2737== LEAK SUMMARY: >>>>>>> ==2737== definitely lost: 1,053 bytes in 10 blocks >>>>>>> ==2737== indirectly lost: 317 bytes in 3 blocks >>>>>>> ==2737== possibly lost: 2,374,971 bytes in 524 blocks >>>>>>> ==2737== still reachable: 53,277 bytes in 201 blocks >>>>>>> ==2737== suppressed: 0 bytes in 0 blocks >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Regards >>>>>>> Abhishek Paliwal >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> >>>>> >>>>> >>>>> >>>>> Regards >>>>> Abhishek Paliwal >>>>> _______________________________________________ >>>>> Gluster-users mailing list >>>>> Gluster-users at gluster.org >>>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>>> >>>> >>> >>> -- >>> >>> >>> >>> >>> Regards >>> Abhishek Paliwal >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Fri Jun 7 04:36:25 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Fri, 7 Jun 2019 10:06:25 +0530 Subject: [Gluster-devel] [Gluster-Maintainers] Fwd: Build failed in Jenkins: regression-test-with-multiplex #1359 In-Reply-To: References: <24208463.92.1559325814227.JavaMail.jenkins@jenkins-el7.rht.gluster.org> Message-ID: Got time to test subdir-mount.t failing in brick-mux scenario. I noticed some issues, where I need further help from glusterd team. subdir-mount.t expects 'hook' script to run after add-brick to make sure the required subdirectories are healed and are present in new bricks. This is important as subdir mount expects the subdirs to exist for successful mount. 
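For context, a minimal sketch of the dependency being described; host names, brick paths and mount points are placeholders:

gluster volume create patchy host1:/bricks/patchy1 host2:/bricks/patchy2
gluster volume start patchy
mkdir -p /mnt/patchy /mnt/subdir1
mount -t glusterfs host1:/patchy /mnt/patchy            # mount of the volume root
mkdir /mnt/patchy/subdir1                               # create the directory inside the volume
mount -t glusterfs host1:/patchy/subdir1 /mnt/subdir1   # subdir mount; fails if subdir1 is missing
# after add-brick, the S13-create-subdir-mount.sh post hook is expected to make sure such
# directories are healed onto the newly added bricks before the next subdir mount is attempted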
But in case of brick-mux setup, I see that in some cases (6/10), hook script (add-brick/post-hook/S13-create-subdir-mount.sh) started getting executed after 20second of finishing the add-brick command. Due to this, the mount which we execute after add-brick failed. My question is, what is making post hook script to run so late ?? I can recreate the issues locally on my laptop too. On Sat, Jun 1, 2019 at 4:55 PM Atin Mukherjee wrote: > subdir-mount.t has started failing in brick mux regression nightly. This > needs to be fixed. > > Raghavendra - did we manage to get any further clue on uss.t failure? > > ---------- Forwarded message --------- > From: > Date: Fri, 31 May 2019 at 23:34 > Subject: [Gluster-Maintainers] Build failed in Jenkins: > regression-test-with-multiplex #1359 > To: , , , > , > > > See < > https://build.gluster.org/job/regression-test-with-multiplex/1359/display/redirect?page=changes > > > > Changes: > > [atin] glusterd: add an op-version check > > [atin] glusterd/svc: glusterd_svcs_stop should call individual wrapper > function > > [atin] glusterd/svc: Stop stale process using the glusterd_proc_stop > > [Amar Tumballi] lcov: more coverage to shard, old-protocol, sdfs > > [Kotresh H R] tests/geo-rep: Add EC volume test case > > [Amar Tumballi] glusterfsd/cleanup: Protect graph object under a lock > > [Mohammed Rafi KC] glusterd/shd: Optimize the glustershd manager to send > reconfigure > > [Kotresh H R] tests/geo-rep: Add tests to cover glusterd geo-rep > > [atin] glusterd: Optimize code to copy dictionary in handshake code path > > ------------------------------------------ > [...truncated 3.18 MB...] > ./tests/basic/afr/stale-file-lookup.t - 9 second > ./tests/basic/afr/granular-esh/replace-brick.t - 9 second > ./tests/basic/afr/granular-esh/add-brick.t - 9 second > ./tests/basic/afr/gfid-mismatch.t - 9 second > ./tests/performance/open-behind.t - 8 second > ./tests/features/ssl-authz.t - 8 second > ./tests/features/readdir-ahead.t - 8 second > ./tests/bugs/upcall/bug-1458127.t - 8 second > ./tests/bugs/transport/bug-873367.t - 8 second > ./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t - 8 second > ./tests/bugs/replicate/bug-1132102.t - 8 second > ./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t > - 8 second > ./tests/bugs/quota/bug-1104692.t - 8 second > ./tests/bugs/posix/bug-1360679.t - 8 second > ./tests/bugs/posix/bug-1122028.t - 8 second > ./tests/bugs/nfs/bug-1157223-symlink-mounting.t - 8 second > ./tests/bugs/glusterfs/bug-861015-log.t - 8 second > ./tests/bugs/glusterd/sync-post-glusterd-restart.t - 8 second > ./tests/bugs/glusterd/bug-1696046.t - 8 second > ./tests/bugs/fuse/bug-983477.t - 8 second > ./tests/bugs/ec/bug-1227869.t - 8 second > ./tests/bugs/distribute/bug-1088231.t - 8 second > ./tests/bugs/distribute/bug-1086228.t - 8 second > ./tests/bugs/cli/bug-1087487.t - 8 second > ./tests/bugs/cli/bug-1022905.t - 8 second > ./tests/bugs/bug-1258069.t - 8 second > ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t > - 8 second > ./tests/basic/xlator-pass-through-sanity.t - 8 second > ./tests/basic/quota-nfs.t - 8 second > ./tests/basic/glusterd/arbiter-volume.t - 8 second > ./tests/basic/ctime/ctime-noatime.t - 8 second > ./tests/line-coverage/cli-peer-and-volume-operations.t - 7 second > ./tests/gfid2path/get-gfid-to-path.t - 7 second > ./tests/bugs/upcall/bug-1369430.t - 7 second > ./tests/bugs/snapshot/bug-1260848.t - 7 second > ./tests/bugs/shard/shard-inode-refcount-test.t - 7 
second > ./tests/bugs/shard/bug-1258334.t - 7 second > ./tests/bugs/replicate/bug-767585-gfid.t - 7 second > ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t - 7 second > ./tests/bugs/replicate/bug-1250170-fsync.t - 7 second > ./tests/bugs/posix/bug-1175711.t - 7 second > ./tests/bugs/nfs/bug-915280.t - 7 second > ./tests/bugs/md-cache/setxattr-prepoststat.t - 7 second > ./tests/bugs/md-cache/bug-1211863_unlink.t - 7 second > ./tests/bugs/glusterfs/bug-848251.t - 7 second > ./tests/bugs/distribute/bug-1122443.t - 7 second > ./tests/bugs/changelog/bug-1208470.t - 7 second > ./tests/bugs/bug-1702299.t - 7 second > ./tests/bugs/bug-1371806_2.t - 7 second > ./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t - 7 > second > ./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t - 7 second > ./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t - > 7 second > ./tests/bitrot/br-stub.t - 7 second > ./tests/basic/glusterd/arbiter-volume-probe.t - 7 second > ./tests/basic/gfapi/libgfapi-fini-hang.t - 7 second > ./tests/basic/fencing/fencing-crash-conistency.t - 7 second > ./tests/basic/distribute/file-create.t - 7 second > ./tests/basic/afr/tarissue.t - 7 second > ./tests/basic/afr/gfid-heal.t - 7 second > ./tests/bugs/snapshot/bug-1178079.t - 6 second > ./tests/bugs/snapshot/bug-1064768.t - 6 second > ./tests/bugs/shard/bug-1342298.t - 6 second > ./tests/bugs/shard/bug-1259651.t - 6 second > ./tests/bugs/replicate/bug-1686568-send-truncate-on-arbiter-from-shd.t - > 6 second > ./tests/bugs/replicate/bug-1626994-info-split-brain.t - 6 second > ./tests/bugs/replicate/bug-1325792.t - 6 second > ./tests/bugs/replicate/bug-1101647.t - 6 second > ./tests/bugs/quota/bug-1243798.t - 6 second > ./tests/bugs/protocol/bug-1321578.t - 6 second > ./tests/bugs/nfs/bug-877885.t - 6 second > ./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t - 6 second > ./tests/bugs/md-cache/bug-1476324.t - 6 second > ./tests/bugs/md-cache/afr-stale-read.t - 6 second > ./tests/bugs/io-cache/bug-858242.t - 6 second > ./tests/bugs/glusterfs/bug-893378.t - 6 second > ./tests/bugs/glusterfs/bug-856455.t - 6 second > ./tests/bugs/glusterd/quorum-value-check.t - 6 second > ./tests/bugs/ec/bug-1179050.t - 6 second > ./tests/bugs/distribute/bug-912564.t - 6 second > ./tests/bugs/distribute/bug-884597.t - 6 second > ./tests/bugs/distribute/bug-1368012.t - 6 second > ./tests/bugs/core/bug-986429.t - 6 second > ./tests/bugs/core/bug-1699025-brick-mux-detach-brick-fd-issue.t - 6 > second > ./tests/bugs/core/bug-1168803-snapd-option-validation-fix.t - 6 second > ./tests/bugs/bug-1371806_1.t - 6 second > ./tests/bugs/bitrot/bug-1229134-bitd-not-support-vol-set.t - 6 second > ./tests/bugs/bitrot/bug-1210684-scrub-pause-resume-error-handling.t - 6 > second > ./tests/bitrot/bug-1221914.t - 6 second > ./tests/basic/trace.t - 6 second > ./tests/basic/playground/template-xlator-sanity.t - 6 second > ./tests/basic/ec/nfs.t - 6 second > ./tests/basic/ec/ec-read-policy.t - 6 second > ./tests/basic/ec/ec-anonymous-fd.t - 6 second > ./tests/basic/distribute/non-root-unlink-stale-linkto.t - 6 second > ./tests/basic/changelog/changelog-rename.t - 6 second > ./tests/basic/afr/heal-info.t - 6 second > ./tests/basic/afr/afr-read-hash-mode.t - 6 second > ./tests/gfid2path/gfid2path_nfs.t - 5 second > ./tests/bugs/upcall/bug-1422776.t - 5 second > ./tests/bugs/replicate/bug-886998.t - 5 second > ./tests/bugs/replicate/bug-1365455.t - 5 second > ./tests/bugs/readdir-ahead/bug-1670253-consistent-metadata.t - 5 second 
> ./tests/bugs/posix/bug-gfid-path.t - 5 second > ./tests/bugs/posix/bug-765380.t - 5 second > ./tests/bugs/nfs/bug-847622.t - 5 second > ./tests/bugs/nfs/bug-1116503.t - 5 second > ./tests/bugs/io-stats/bug-1598548.t - 5 second > ./tests/bugs/glusterfs-server/bug-877992.t - 5 second > ./tests/bugs/glusterfs-server/bug-873549.t - 5 second > ./tests/bugs/glusterfs/bug-895235.t - 5 second > ./tests/bugs/fuse/bug-1126048.t - 5 second > ./tests/bugs/distribute/bug-907072.t - 5 second > ./tests/bugs/core/bug-913544.t - 5 second > ./tests/bugs/core/bug-908146.t - 5 second > ./tests/bugs/access-control/bug-1051896.t - 5 second > ./tests/basic/ec/ec-internal-xattrs.t - 5 second > ./tests/basic/ec/ec-fallocate.t - 5 second > ./tests/basic/distribute/bug-1265677-use-readdirp.t - 5 second > ./tests/basic/afr/arbiter-remove-brick.t - 5 second > ./tests/performance/quick-read.t - 4 second > ./tests/gfid2path/block-mount-access.t - 4 second > ./tests/features/delay-gen.t - 4 second > ./tests/bugs/upcall/bug-upcall-stat.t - 4 second > ./tests/bugs/upcall/bug-1394131.t - 4 second > ./tests/bugs/unclassified/bug-1034085.t - 4 second > ./tests/bugs/snapshot/bug-1111041.t - 4 second > ./tests/bugs/shard/bug-1272986.t - 4 second > ./tests/bugs/shard/bug-1256580.t - 4 second > ./tests/bugs/shard/bug-1250855.t - 4 second > ./tests/bugs/shard/bug-1245547.t - 4 second > ./tests/bugs/rpc/bug-954057.t - 4 second > ./tests/bugs/replicate/bug-976800.t - 4 second > ./tests/bugs/replicate/bug-880898.t - 4 second > ./tests/bugs/replicate/bug-1480525.t - 4 second > ./tests/bugs/read-only/bug-1134822-read-only-default-in-graph.t - 4 > second > ./tests/bugs/readdir-ahead/bug-1446516.t - 4 second > ./tests/bugs/readdir-ahead/bug-1439640.t - 4 second > ./tests/bugs/readdir-ahead/bug-1390050.t - 4 second > ./tests/bugs/quota/bug-1287996.t - 4 second > ./tests/bugs/quick-read/bug-846240.t - 4 second > ./tests/bugs/posix/disallow-gfid-volumeid-removexattr.t - 4 second > ./tests/bugs/posix/bug-1619720.t - 4 second > ./tests/bugs/nl-cache/bug-1451588.t - 4 second > ./tests/bugs/nfs/zero-atime.t - 4 second > ./tests/bugs/nfs/subdir-trailing-slash.t - 4 second > ./tests/bugs/nfs/socket-as-fifo.t - 4 second > ./tests/bugs/nfs/showmount-many-clients.t - 4 second > ./tests/bugs/nfs/bug-1210338.t - 4 second > ./tests/bugs/nfs/bug-1166862.t - 4 second > ./tests/bugs/nfs/bug-1161092-nfs-acls.t - 4 second > ./tests/bugs/md-cache/bug-1632503.t - 4 second > ./tests/bugs/glusterfs-server/bug-864222.t - 4 second > ./tests/bugs/glusterfs/bug-1482528.t - 4 second > ./tests/bugs/glusterd/bug-948729/bug-948729-mode-script.t - 4 second > ./tests/bugs/glusterd/bug-948729/bug-948729-force.t - 4 second > ./tests/bugs/glusterd/bug-1482906-peer-file-blank-line.t - 4 second > ./tests/bugs/glusterd/bug-1091935-brick-order-check-from-cli-to-glusterd.t > - 4 second > ./tests/bugs/geo-replication/bug-1296496.t - 4 second > ./tests/bugs/fuse/bug-1336818.t - 4 second > ./tests/bugs/fuse/bug-1283103.t - 4 second > ./tests/bugs/core/io-stats-1322825.t - 4 second > ./tests/bugs/core/bug-834465.t - 4 second > ./tests/bugs/core/bug-1135514-allow-setxattr-with-null-value.t - 4 second > ./tests/bugs/core/949327.t - 4 second > ./tests/bugs/cli/bug-977246.t - 4 second > ./tests/bugs/cli/bug-961307.t - 4 second > ./tests/bugs/cli/bug-1004218.t - 4 second > ./tests/bugs/bug-1138841.t - 4 second > ./tests/bugs/access-control/bug-1387241.t - 4 second > ./tests/bitrot/bug-internal-xattrs-check-1243391.t - 4 second > ./tests/basic/quota-rename.t - 4 second > 
./tests/basic/hardlink-limit.t - 4 second > ./tests/basic/ec/dht-rename.t - 4 second > ./tests/basic/distribute/lookup.t - 4 second > ./tests/line-coverage/meta-max-coverage.t - 3 second > ./tests/gfid2path/gfid2path_fuse.t - 3 second > ./tests/bugs/unclassified/bug-991622.t - 3 second > ./tests/bugs/trace/bug-797171.t - 3 second > ./tests/bugs/glusterfs-server/bug-861542.t - 3 second > ./tests/bugs/glusterfs/bug-869724.t - 3 second > ./tests/bugs/glusterfs/bug-860297.t - 3 second > ./tests/bugs/glusterfs/bug-844688.t - 3 second > ./tests/bugs/glusterd/bug-948729/bug-948729.t - 3 second > ./tests/bugs/distribute/bug-1204140.t - 3 second > ./tests/bugs/core/bug-924075.t - 3 second > ./tests/bugs/core/bug-845213.t - 3 second > ./tests/bugs/core/bug-1421721-mpx-toggle.t - 3 second > ./tests/bugs/core/bug-1119582.t - 3 second > ./tests/bugs/core/bug-1117951.t - 3 second > ./tests/bugs/cli/bug-983317-volume-get.t - 3 second > ./tests/bugs/cli/bug-867252.t - 3 second > ./tests/basic/glusterd/check-cloudsync-ancestry.t - 3 second > ./tests/basic/fops-sanity.t - 3 second > ./tests/basic/fencing/test-fence-option.t - 3 second > ./tests/basic/distribute/debug-xattrs.t - 3 second > ./tests/basic/afr/ta-check-locks.t - 3 second > ./tests/line-coverage/volfile-with-all-graph-syntax.t - 2 second > ./tests/line-coverage/some-features-in-libglusterfs.t - 2 second > ./tests/bugs/shard/bug-1261773.t - 2 second > ./tests/bugs/replicate/bug-884328.t - 2 second > ./tests/bugs/readdir-ahead/bug-1512437.t - 2 second > ./tests/bugs/nfs/bug-970070.t - 2 second > ./tests/bugs/nfs/bug-1302948.t - 2 second > ./tests/bugs/logging/bug-823081.t - 2 second > ./tests/bugs/glusterfs-server/bug-889996.t - 2 second > ./tests/bugs/glusterfs/bug-892730.t - 2 second > ./tests/bugs/glusterfs/bug-811493.t - 2 second > ./tests/bugs/glusterd/bug-1085330-and-bug-916549.t - 2 second > ./tests/bugs/distribute/bug-924265.t - 2 second > ./tests/bugs/core/log-bug-1362520.t - 2 second > ./tests/bugs/core/bug-903336.t - 2 second > ./tests/bugs/core/bug-1111557.t - 2 second > ./tests/bugs/cli/bug-969193.t - 2 second > ./tests/bugs/cli/bug-949298.t - 2 second > ./tests/bugs/cli/bug-921215.t - 2 second > ./tests/bugs/cli/bug-1378842-volume-get-all.t - 2 second > ./tests/basic/peer-parsing.t - 2 second > ./tests/basic/md-cache/bug-1418249.t - 2 second > ./tests/basic/afr/arbiter-cli.t - 2 second > ./tests/bugs/replicate/ta-inode-refresh-read.t - 1 second > ./tests/bugs/glusterfs/bug-853690.t - 1 second > ./tests/bugs/cli/bug-764638.t - 1 second > ./tests/bugs/cli/bug-1047378.t - 1 second > ./tests/basic/netgroup_parsing.t - 1 second > ./tests/basic/gfapi/sink.t - 1 second > ./tests/basic/exports_parsing.t - 1 second > ./tests/basic/posixonly.t - 0 second > ./tests/basic/glusterfsd-args.t - 0 second > > > 2 test(s) failed > ./tests/basic/uss.t > ./tests/features/subdir-mount.t > > 0 test(s) generated core > > > 5 test(s) needed retry > ./tests/basic/afr/split-brain-favorite-child-policy.t > ./tests/basic/ec/self-heal.t > ./tests/basic/uss.t > ./tests/basic/volfile-sanity.t > ./tests/features/subdir-mount.t > > Result is 1 > > tar: Removing leading `/' from member names > kernel.core_pattern = /%e-%p.core > Build step 'Execute shell' marked build as failure > _______________________________________________ > maintainers mailing list > maintainers at gluster.org > https://lists.gluster.org/mailman/listinfo/maintainers > > > -- > - Atin (atinm) > _______________________________________________ > maintainers mailing list > maintainers at 
gluster.org > https://lists.gluster.org/mailman/listinfo/maintainers > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkhandel at redhat.com Fri Jun 7 04:54:53 2019 From: dkhandel at redhat.com (Deepshikha Khandelwal) Date: Fri, 7 Jun 2019 10:24:53 +0530 Subject: [Gluster-devel] CI failure - NameError: name 'unicode' is not defined (related to changelogparser.py) In-Reply-To: References: Message-ID: Hi Yaniv, We are working on this. The builders are picking up python3.6 which is leading to modules missing and such undefined errors. Kotresh has sent a patch https://review.gluster.org/#/c/glusterfs/+/22829/ to fix the issue. On Thu, Jun 6, 2019 at 11:49 AM Yaniv Kaul wrote: > From [1]. > > I think it's a Python2/3 thing, so perhaps a CI issue additionally (though > if our code is not Python 3 ready, let's ensure we use Python 2 explicitly > until we fix this). > > *00:47:05.207* ok 14 [ 13/ 386] < 34> 'gluster --mode=script --wignore volume start patchy'*00:47:05.207* ok 15 [ 13/ 70] < 36> '_GFS --attribute-timeout=0 --entry-timeout=0 --volfile-id=patchy --volfile-server=builder208.int.aws.gluster.org /mnt/glusterfs/0'*00:47:05.207* Traceback (most recent call last):*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 233, in *00:47:05.207* parse(sys.argv[1])*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 221, in parse*00:47:05.207* process_record(data, tokens, changelog_ts, callback)*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 178, in process_record*00:47:05.207* callback(record)*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 182, in default_callback*00:47:05.207* sys.stdout.write(u"{0}\n".format(record))*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 128, in __str__*00:47:05.207* return unicode(self).encode('utf-8')*00:47:05.207* NameError: name 'unicode' is not defined*00:47:05.207* not ok 16 [ 53/ 39] < 42> '2 check_changelog_op /d/backends/patchy0/.glusterfs/changelogs RENAME' -> 'Got "0" instead of "2"' > > > Y. > > [1] https://build.gluster.org/job/centos7-regression/6318/console > > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Sun Jun 9 04:48:48 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Sun, 9 Jun 2019 10:18:48 +0530 Subject: [Gluster-devel] CI failure - NameError: name 'unicode' is not defined (related to changelogparser.py) In-Reply-To: References: Message-ID: Update: The issue happened because python3 got installed on centos7.x series of builders due to other package dependencies. And considering GlusterFS picks python3 as priority even if python2 is default, the tests started to fail. We had completed the work of migrating the code to work smoothly with python3 by glusterfs-6.0 release, but had not noticed issues with regression framework as it was running only on centos7 (python2) earlier. 
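For context, the failure boils down to Python 3 having dropped the `unicode` builtin that changelogparser.py's __str__ relies on. A python2/python3-compatible version of that method would look roughly like the sketch below; this is an illustration only (the Record fields shown are made up), not necessarily the exact approach taken in the patch under review:

    import sys

    class Record(object):
        """Minimal stand-in for changelogparser's Record; real field names differ."""

        def __init__(self, ts, fop):
            self.ts = ts
            self.fop = fop

        def __unicode__(self):
            return u"{0} {1}".format(self.ts, self.fop)

        def __str__(self):
            if sys.version_info[0] >= 3:
                # On Python 3, str already is unicode; returning bytes here would be wrong
                return self.__unicode__()
            # On Python 2, __str__ must return bytes, so encode the unicode text
            return self.__unicode__().encode('utf-8')

    print(str(Record("1559912825", "RENAME")))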
With this event, our regression tests are also now compatible with python3 (thanks to the below mentioned patch of Kotresh). We were able to mark a few spurious failures as BAD_TEST, and fix all the python3 related issues in regression by EOD Friday, and after watching regression tests for 1 more day, I can say that the issues are now resolved. Please resubmit (or rebase in the gerrit web) before triggering the 'recheck centos' in the submitted patch(es). Thanks everyone who responded quickly once the issue was noticed, and we are back to GREEN again. Regards, Amar On Fri, Jun 7, 2019 at 10:26 AM Deepshikha Khandelwal wrote: > Hi Yaniv, > > We are working on this. The builders are picking up python3.6 which is > leading to modules missing and such undefined errors. > > Kotresh has sent a patch https://review.gluster.org/#/c/glusterfs/+/22829/ > to fix the issue. > > > > On Thu, Jun 6, 2019 at 11:49 AM Yaniv Kaul wrote: > >> From [1]. >> >> I think it's a Python2/3 thing, so perhaps a CI issue additionally >> (though if our code is not Python 3 ready, let's ensure we use Python 2 >> explicitly until we fix this). >> >> *00:47:05.207* ok 14 [ 13/ 386] < 34> 'gluster --mode=script --wignore volume start patchy'*00:47:05.207* ok 15 [ 13/ 70] < 36> '_GFS --attribute-timeout=0 --entry-timeout=0 --volfile-id=patchy --volfile-server=builder208.int.aws.gluster.org /mnt/glusterfs/0'*00:47:05.207* Traceback (most recent call last):*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 233, in *00:47:05.207* parse(sys.argv[1])*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 221, in parse*00:47:05.207* process_record(data, tokens, changelog_ts, callback)*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 178, in process_record*00:47:05.207* callback(record)*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 182, in default_callback*00:47:05.207* sys.stdout.write(u"{0}\n".format(record))*00:47:05.207* File "./tests/basic/changelog/../../utils/changelogparser.py", line 128, in __str__*00:47:05.207* return unicode(self).encode('utf-8')*00:47:05.207* NameError: name 'unicode' is not defined*00:47:05.207* not ok 16 [ 53/ 39] < 42> '2 check_changelog_op /d/backends/patchy0/.glusterfs/changelogs RENAME' -> 'Got "0" instead of "2"' >> >> >> Y. >> >> [1] https://build.gluster.org/job/centos7-regression/6318/console >> >> _______________________________________________ >> >> Community Meeting Calendar: >> >> APAC Schedule - >> Every 2nd and 4th Tuesday at 11:30 AM IST >> Bridge: https://bluejeans.com/836554017 >> >> NA/EMEA Schedule - >> Every 1st and 3rd Tuesday at 01:00 PM EDT >> Bridge: https://bluejeans.com/486278655 >> >> Gluster-devel mailing list >> Gluster-devel at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-devel >> >> _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jenkins at build.gluster.org Mon Jun 10 01:45:02 2019 From: jenkins at build.gluster.org (jenkins at build.gluster.org) Date: Mon, 10 Jun 2019 01:45:02 +0000 (UTC) Subject: [Gluster-devel] Weekly Untriaged Bugs Message-ID: <1648858451.122.1560131103113.JavaMail.jenkins@jenkins-el7.rht.gluster.org> [...truncated 6 lines...] https://bugzilla.redhat.com/1714851 / core: issues with 'list.h' elements in clang-scan https://bugzilla.redhat.com/1716790 / geo-replication: geo-rep: Rename with same destination name test case occasionally fails on EC Volume https://bugzilla.redhat.com/1716812 / glusterd: Failed to create volume which transport_type is "tcp,rdma" https://bugzilla.redhat.com/1716875 / gluster-smb: Inode Unref Assertion failed: inode->ref https://bugzilla.redhat.com/1716455 / gluster-smb: OS X error -50 when creating sub-folder on Samba share when using Gluster VFS https://bugzilla.redhat.com/1716440 / gluster-smb: SMBD thread panics when connected to from OS X machine https://bugzilla.redhat.com/1714895 / libglusterfsclient: Glusterfs(fuse) client crash https://bugzilla.redhat.com/1717824 / locks: Fencing: Added the tcmu-runner ALUA feature support but after one of node is rebooted the glfs_file_lock() get stucked https://bugzilla.redhat.com/1718562 / locks: flock failure (regression) https://bugzilla.redhat.com/1718227 / scripts: SELinux context labels are missing for newly added bricks using add-brick command [...truncated 2 lines...] -------------- next part -------------- A non-text attachment was scrubbed... Name: build.log Type: application/octet-stream Size: 1492 bytes Desc: not available URL: From ykaul at redhat.com Mon Jun 10 06:40:27 2019 From: ykaul at redhat.com (Yaniv Kaul) Date: Mon, 10 Jun 2019 09:40:27 +0300 Subject: [Gluster-devel] Test failed - due to out of memory on builder201? Message-ID: >From [1], we can see that non-root-unlink-stale-linkto.t failed on: useradd: /etc/passwd.30380: Cannot allocate memory useradd: cannot lock /etc/passwd; try again later. My patch[2] only removed include statements that were not needed. I'm not sure how it can cause a memory issue. So it's either we have some regression, or the slaves do not have enough memory. It was running on builder201.aws.gluster.org . Checking it, I see other jobs[3] failing on the same issue. Perhaps it is the slave? Any ideas? TIA, Y. [1] https://build.gluster.org/job/centos7-regression/6382/consoleFull [2] https://review.gluster.org/#/c/glusterfs/+/22844/ [3] https://build.gluster.org/job/centos7-regression/6385/console BTW, we REALLY need to fix this message - it's clueless: E [MSGID: 106176] [glusterd-handshake.c:1038:__server_getspec] 0-management: Failed to mount the volume -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkhandel at redhat.com Mon Jun 10 07:05:45 2019 From: dkhandel at redhat.com (Deepshikha Khandelwal) Date: Mon, 10 Jun 2019 12:35:45 +0530 Subject: [Gluster-devel] Test failed - due to out of memory on builder201? In-Reply-To: References: Message-ID: Hi Yaniv, I'm working on it. Looking at the logs and further health checkups, I did not find any memory issue tracebacks on builder201. On Mon, Jun 10, 2019 at 12:12 PM Yaniv Kaul wrote: > From [1], we can see that non-root-unlink-stale-linkto.t failed on: > useradd: /etc/passwd.30380: Cannot allocate memory > useradd: cannot lock /etc/passwd; try again later. > > My patch[2] only removed include statements that were not needed. > I'm not sure how it can cause a memory issue. 
> So it's either we have some regression, or the slaves do not have enough > memory. > It was running on builder201.aws.gluster.org . > > Checking it, I see other jobs[3] failing on the same issue. Perhaps it is > the slave? > > Any ideas? > TIA, > Y. > > [1] https://build.gluster.org/job/centos7-regression/6382/consoleFull > [2] https://review.gluster.org/#/c/glusterfs/+/22844/ > [3] https://build.gluster.org/job/centos7-regression/6385/console > > BTW, we REALLY need to fix this message - it's clueless: > E [MSGID: 106176] [glusterd-handshake.c:1038:__server_getspec] > 0-management: Failed to mount the volume > > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cynthia.zhou at nokia-sbell.com Mon Jun 10 07:42:35 2019 From: cynthia.zhou at nokia-sbell.com (Zhou, Cynthia (NSB - CN/Hangzhou)) Date: Mon, 10 Jun 2019 07:42:35 +0000 Subject: [Gluster-devel] glusterfsd memory leak issue found after enable ssl In-Reply-To: References: <07cb1c3aa08b414dbe37442955ddad36@nokia-sbell.com> <6ce04fb69243465295a71b6953eafa19@nokia-sbell.com> <3cd91d1ce39541e7ad30c60ef15000aa@nokia-sbell.com> <5d0c2ed30e884b86ba29bff5a47c960e@nokia-sbell.com> <6d3f68f73e6d440dab19028526745171@nokia-sbell.com> <0d7934cac01f4a43b4581a2f74928dbc@nokia-sbell.com> <9ea2678487544232bfe66e0e7c6d3091@nokia-sbell.com> Message-ID: <217c6a2dbe704777bd8c3662683e75ad@nokia-sbell.com> Hi, How about this patch? I see there is a failed test, is that related to my change? cynthia From: Raghavendra Gowdappa Sent: Thursday, May 09, 2019 12:13 PM To: Zhou, Cynthia (NSB - CN/Hangzhou) Cc: Amar Tumballi Suryanarayan ; gluster-devel at gluster.org Subject: Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl Thanks!! On Thu, May 9, 2019 at 8:34 AM Zhou, Cynthia (NSB - CN/Hangzhou) > wrote: Hi, Ok, It is posted to https://review.gluster.org/#/c/glusterfs/+/22687/ From: Raghavendra Gowdappa > Sent: Wednesday, May 08, 2019 7:35 PM To: Zhou, Cynthia (NSB - CN/Hangzhou) > Cc: Amar Tumballi Suryanarayan >; gluster-devel at gluster.org Subject: Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl On Wed, May 8, 2019 at 1:29 PM Zhou, Cynthia (NSB - CN/Hangzhou) > wrote: Hi 'Milind Changire', The leak is getting more and more clear to me now. The unsolved memory leak is because, in glusterfs version 3.12.15 (in my env), the SSL context is a shared one; when we do ssl_accept, SSL allocates read/write buffers for the SSL object, but on ssl_free in socket_reset or the fini function of socket.c those buffers are returned to the SSL context's free list instead of being completely freed. Thanks Cynthia for your efforts in identifying and fixing the leak. If you post a patch to gerrit, I'll be happy to merge it and get the fix into the codebase.
So following patch is able to fix the memory leak issue completely.(created for gluster master branch) --- a/rpc/rpc-transport/socket/src/socket.c +++ b/rpc/rpc-transport/socket/src/socket.c @@ -446,6 +446,7 @@ ssl_setup_connection_postfix(rpc_transport_t *this) gf_log(this->name, GF_LOG_DEBUG, "SSL verification succeeded (client: %s) (server: %s)", this->peerinfo.identifier, this->myinfo.identifier); + X509_free(peer); return gf_strdup(peer_CN); /* Error paths. */ @@ -1157,7 +1158,21 @@ __socket_reset(rpc_transport_t *this) memset(&priv->incoming, 0, sizeof(priv->incoming)); event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx); - + if(priv->use_ssl&& priv->ssl_ssl) + { + gf_log(this->name, GF_LOG_TRACE, + "clear and reset for socket(%d), free ssl ", + priv->sock); + if(priv->ssl_ctx) + { + SSL_CTX_free(priv->ssl_ctx); + priv->ssl_ctx = NULL; + } + SSL_shutdown(priv->ssl_ssl); + SSL_clear(priv->ssl_ssl); + SSL_free(priv->ssl_ssl); + priv->ssl_ssl = NULL; + } priv->sock = -1; priv->idx = -1; priv->connected = -1; @@ -4675,6 +4690,21 @@ fini(rpc_transport_t *this) pthread_mutex_destroy(&priv->out_lock); pthread_mutex_destroy(&priv->cond_lock); pthread_cond_destroy(&priv->cond); + if(priv->use_ssl&& priv->ssl_ssl) + { + gf_log(this->name, GF_LOG_TRACE, + "clear and reset for socket(%d), free ssl ", + priv->sock); + if(priv->ssl_ctx) + { + SSL_CTX_free(priv->ssl_ctx); + priv->ssl_ctx = NULL; + } + SSL_shutdown(priv->ssl_ssl); + SSL_clear(priv->ssl_ssl); + SSL_free(priv->ssl_ssl); From: Zhou, Cynthia (NSB - CN/Hangzhou) Sent: Monday, May 06, 2019 2:12 PM To: 'Amar Tumballi Suryanarayan' > Cc: 'Milind Changire' >; 'gluster-devel at gluster.org' > Subject: RE: [Gluster-devel] glusterfsd memory leak issue found after enable ssl Hi, From our test valgrind and libleak all blame ssl3_accept ///////////////////////////from valgrind attached to glusterfds/////////////////////////////////////////// ==16673== 198,720 bytes in 12 blocks are definitely lost in loss record 1,114 of 1,123 ==16673== at 0x4C2EB7B: malloc (vg_replace_malloc.c:299) ==16673== by 0x63E1977: CRYPTO_malloc (in /usr/lib64/libcrypto.so.1.0.2p) ==16673== by 0xA855E0C: ssl3_setup_write_buffer (in /usr/lib64/libssl.so.1.0.2p) ==16673== by 0xA855E77: ssl3_setup_buffers (in /usr/lib64/libssl.so.1.0.2p) ==16673== by 0xA8485D9: ssl3_accept (in /usr/lib64/libssl.so.1.0.2p) ==16673== by 0xA610DDF: ssl_complete_connection (socket.c:400) ==16673== by 0xA617F38: ssl_handle_server_connection_attempt (socket.c:2409) ==16673== by 0xA618420: socket_complete_connection (socket.c:2554) ==16673== by 0xA618788: socket_event_handler (socket.c:2613) ==16673== by 0x4ED6983: event_dispatch_epoll_handler (event-epoll.c:587) ==16673== by 0x4ED6C5A: event_dispatch_epoll_worker (event-epoll.c:663) ==16673== by 0x615C5D9: start_thread (in /usr/lib64/libpthread-2.27.so) ==16673== ==16673== 200,544 bytes in 12 blocks are definitely lost in loss record 1,115 of 1,123 ==16673== at 0x4C2EB7B: malloc (vg_replace_malloc.c:299) ==16673== by 0x63E1977: CRYPTO_malloc (in /usr/lib64/libcrypto.so.1.0.2p) ==16673== by 0xA855D12: ssl3_setup_read_buffer (in /usr/lib64/libssl.so.1.0.2p) ==16673== by 0xA855E68: ssl3_setup_buffers (in /usr/lib64/libssl.so.1.0.2p) ==16673== by 0xA8485D9: ssl3_accept (in /usr/lib64/libssl.so.1.0.2p) ==16673== by 0xA610DDF: ssl_complete_connection (socket.c:400) ==16673== by 0xA617F38: ssl_handle_server_connection_attempt (socket.c:2409) ==16673== by 0xA618420: socket_complete_connection (socket.c:2554) ==16673== by 0xA618788: 
socket_event_handler (socket.c:2613) ==16673== by 0x4ED6983: event_dispatch_epoll_handler (event-epoll.c:587) ==16673== by 0x4ED6C5A: event_dispatch_epoll_worker (event-epoll.c:663) ==16673== by 0x615C5D9: start_thread (in /usr/lib64/libpthread-2.27.so) ==16673== valgrind --leak-check=f ////////////////////////////////////with libleak attached to glusterfsd///////////////////////////////////////// callstack[2419] expires. count=1 size=224/224 alloc=362 free=350 /home/robot/libleak/libleak.so(malloc+0x25) [0x7f1460604065] /lib64/libcrypto.so.10(CRYPTO_malloc+0x58) [0x7f145ecd9978] /lib64/libcrypto.so.10(EVP_DigestInit_ex+0x2a9) [0x7f145ed95749] /lib64/libssl.so.10(ssl3_digest_cached_records+0x11d) [0x7f145abb6ced] /lib64/libssl.so.10(ssl3_accept+0xc8f) [0x7f145abadc4f] /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(ssl_complete_connection+0x5e) [0x7f145ae00f3a] /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc16d) [0x7f145ae0816d] /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc68a) [0x7f145ae0868a] /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc9f2) [0x7f145ae089f2] /lib64/libglusterfs.so.0(+0x9b96f) [0x7f146038596f] /lib64/libglusterfs.so.0(+0x9bc46) [0x7f1460385c46] /lib64/libpthread.so.0(+0x75da) [0x7f145f0d15da] /lib64/libc.so.6(clone+0x3f) [0x7f145e9a7eaf] callstack[2432] expires. count=1 size=104/104 alloc=362 free=0 /home/robot/libleak/libleak.so(malloc+0x25) [0x7f1460604065] /lib64/libcrypto.so.10(CRYPTO_malloc+0x58) [0x7f145ecd9978] /lib64/libcrypto.so.10(BN_MONT_CTX_new+0x17) [0x7f145ed48627] /lib64/libcrypto.so.10(BN_MONT_CTX_set_locked+0x6d) [0x7f145ed489fd] /lib64/libcrypto.so.10(+0xff4d9) [0x7f145ed6a4d9] /lib64/libcrypto.so.10(int_rsa_verify+0x1cd) [0x7f145ed6d41d] /lib64/libcrypto.so.10(RSA_verify+0x32) [0x7f145ed6d972] /lib64/libcrypto.so.10(+0x107ff5) [0x7f145ed72ff5] /lib64/libcrypto.so.10(EVP_VerifyFinal+0x211) [0x7f145ed9dd51] /lib64/libssl.so.10(ssl3_get_cert_verify+0x5bb) [0x7f145abac06b] /lib64/libssl.so.10(ssl3_accept+0x988) [0x7f145abad948] /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(ssl_complete_connection+0x5e) [0x7f145ae00f3a] /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc16d) [0x7f145ae0816d] /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc68a) [0x7f145ae0868a] /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc9f2) [0x7f145ae089f2] /lib64/libglusterfs.so.0(+0x9b96f) [0x7f146038596f] /lib64/libglusterfs.so.0(+0x9bc46) [0x7f1460385c46] /lib64/libpthread.so.0(+0x75da) [0x7f145f0d15da] /lib64/libc.so.6(clone+0x3f) [0x7f145e9a7eaf] one interesting thing is that the memory goes up to about 300m then it stopped increasing !!! I am wondering if this is caused by open-ssl library? But when I search from openssl community, there is no such issue reported before. Is glusterfs using ssl_accept correctly? cynthia From: Zhou, Cynthia (NSB - CN/Hangzhou) Sent: Monday, May 06, 2019 10:34 AM To: 'Amar Tumballi Suryanarayan' > Cc: Milind Changire >; gluster-devel at gluster.org Subject: RE: [Gluster-devel] glusterfsd memory leak issue found after enable ssl Hi, Sorry, I am so busy with other issues these days, could you help me to submit my patch for review? It is based on glusterfs3.12.15 code. But even with this patch , memory leak still exists, from memory leak tool it should be related with ssl_accept, not sure if it is because of openssl library or because improper use of ssl interfaces. 
--- a/rpc/rpc-transport/socket/src/socket.c +++ b/rpc/rpc-transport/socket/src/socket.c @@ -1019,7 +1019,16 @@ static void __socket_reset(rpc_transport_t *this) { memset(&priv->incoming, 0, sizeof(priv->incoming)); event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx); - + if(priv->use_ssl&& priv->ssl_ssl) + { + gf_log(this->name, GF_LOG_INFO, + "clear and reset for socket(%d), free ssl ", + priv->sock); + SSL_shutdown(priv->ssl_ssl); + SSL_clear(priv->ssl_ssl); + SSL_free(priv->ssl_ssl); + priv->ssl_ssl = NULL; + } priv->sock = -1; priv->idx = -1; priv->connected = -1; @@ -4238,6 +4250,16 @@ void fini(rpc_transport_t *this) { pthread_mutex_destroy(&priv->out_lock); pthread_mutex_destroy(&priv->cond_lock); pthread_cond_destroy(&priv->cond); + if(priv->use_ssl&& priv->ssl_ssl) + { + gf_log(this->name, GF_LOG_INFO, + "clear and reset for socket(%d), free ssl ", + priv->sock); + SSL_shutdown(priv->ssl_ssl); + SSL_clear(priv->ssl_ssl); + SSL_free(priv->ssl_ssl); + priv->ssl_ssl = NULL; + } if (priv->ssl_private_key) { GF_FREE(priv->ssl_private_key); } From: Amar Tumballi Suryanarayan > Sent: Wednesday, May 01, 2019 8:43 PM To: Zhou, Cynthia (NSB - CN/Hangzhou) > Cc: Milind Changire >; gluster-devel at gluster.org Subject: Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl Hi Cynthia Zhou, Can you post the patch which fixes the issue of missing free? We will continue to investigate the leak further, but would really appreciate getting the patch which is already worked on land into upstream master. -Amar On Mon, Apr 22, 2019 at 1:38 PM Zhou, Cynthia (NSB - CN/Hangzhou) > wrote: Ok, I am clear now. I?ve added ssl_free in socket reset and socket finish function, though glusterfsd memory leak is not that much, still it is leaking, from source code I can not find anything else, Could you help to check if this issue exists in your env? If not I may have a try to merge your patch . Step 1> while true;do gluster v heal info, 2> check the vol-name glusterfsd memory usage, it is obviously increasing. cynthia From: Milind Changire > Sent: Monday, April 22, 2019 2:36 PM To: Zhou, Cynthia (NSB - CN/Hangzhou) > Cc: Atin Mukherjee >; gluster-devel at gluster.org Subject: Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl According to BIO_new_socket() man page ... If the close flag is set then the socket is shut down and closed when the BIO is freed. For Gluster to have more control over the socket shutdown, the BIO_NOCLOSE flag is set. Otherwise, SSL takes control of socket shutdown whenever BIO is freed. _______________________________________________ Gluster-devel mailing list Gluster-devel at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-devel -- Amar Tumballi (amarts) _______________________________________________ Community Meeting Calendar: APAC Schedule - Every 2nd and 4th Tuesday at 11:30 AM IST Bridge: https://bluejeans.com/836554017 NA/EMEA Schedule - Every 1st and 3rd Tuesday at 01:00 PM EDT Bridge: https://bluejeans.com/486278655 Gluster-devel mailing list Gluster-devel at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-devel -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ykaul at redhat.com Mon Jun 10 08:31:18 2019 From: ykaul at redhat.com (Yaniv Kaul) Date: Mon, 10 Jun 2019 11:31:18 +0300 Subject: [Gluster-devel] glusterfsd memory leak issue found after enable ssl In-Reply-To: <217c6a2dbe704777bd8c3662683e75ad@nokia-sbell.com> References: <07cb1c3aa08b414dbe37442955ddad36@nokia-sbell.com> <6ce04fb69243465295a71b6953eafa19@nokia-sbell.com> <3cd91d1ce39541e7ad30c60ef15000aa@nokia-sbell.com> <5d0c2ed30e884b86ba29bff5a47c960e@nokia-sbell.com> <6d3f68f73e6d440dab19028526745171@nokia-sbell.com> <0d7934cac01f4a43b4581a2f74928dbc@nokia-sbell.com> <9ea2678487544232bfe66e0e7c6d3091@nokia-sbell.com> <217c6a2dbe704777bd8c3662683e75ad@nokia-sbell.com> Message-ID: On Mon, Jun 10, 2019 at 10:43 AM Zhou, Cynthia (NSB - CN/Hangzhou) < cynthia.zhou at nokia-sbell.com> wrote: > Hi, > > How about this patch? I see there is a failed test, is that related to my > change? > Quite likely. Have you looked at the failure? It produces a stack which looks close to where your patch is: 01:02:58.118 Thread 1 (Thread 0x7efe40930700 (LWP 17150)): 01:02:58.118 #0 0x00007efe4dfd359c in free () from /lib64/libc.so.6 01:02:58.118 No symbol table info available. 01:02:58.118 #1 0x00007efe4e38970d in CRYPTO_free () from /lib64/libcrypto.so.10 01:02:58.118 No symbol table info available. 01:02:58.118 #2 0x00007efe4e4400e7 in sk_free () from /lib64/libcrypto.so.10 01:02:58.118 No symbol table info available. 01:02:58.118 #3 0x00007efe4e4863de in x509_verify_param_zero () from /lib64/libcrypto.so.10 01:02:58.118 No symbol table info available. 01:02:58.118 #4 0x00007efe4e48644e in X509_VERIFY_PARAM_free () from /lib64/libcrypto.so.10 01:02:58.118 No symbol table info available. 01:02:58.118 #5 0x00007efe42a107d9 in SSL_CTX_free () from /lib64/libssl.so.10 01:02:58.120 No symbol table info available. 01:02:58.120 #6 0x00007efe42a12cc0 in SSL_free () from /lib64/libssl.so.10 01:02:58.122 No symbol table info available. 01:02:58.122 #7 0x00007efe42c463eb in __socket_reset (this=0x7efe34001240) at /home/jenkins/root/workspace/centos7-regression/rpc/rpc-transport/socket/src/socket.c:1170 01:02:58.123 priv = 0x7efe340017a0 01:02:58.123 __FUNCTION__ = "__socket_reset" 01:02:58.123 #8 0x00007efe42c46e43 in socket_event_poll_err (this=0x7efe34001240, gen=4, idx=2) at /home/jenkins/root/workspace/centos7-regression/rpc/rpc-transport/socket/src/socket.c:1383 01:02:58.123 priv = 0x7efe340017a0 01:02:58.123 socket_closed = false 01:02:58.123 __FUNCTION__ = "socket_event_poll_err" 01:02:58.123 #9 0x00007efe42c4d056 in socket_event_handler (fd=6, idx=2, gen=4, data=0x7efe34001240, poll_in=1, poll_out=0, poll_err=16, event_thread_died=0 '\000') at /home/jenkins/root/workspace/centos7-regression/rpc/rpc-transport/socket/src/socket.c:3037 > > cynthia > > > > *From:* Raghavendra Gowdappa > *Sent:* Thursday, May 09, 2019 12:13 PM > *To:* Zhou, Cynthia (NSB - CN/Hangzhou) > *Cc:* Amar Tumballi Suryanarayan ; > gluster-devel at gluster.org > *Subject:* Re: [Gluster-devel] glusterfsd memory leak issue found after > enable ssl > > > > Thanks!! 
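One plausible reading of that stack: in OpenSSL, SSL_new() takes its own reference on the SSL_CTX and SSL_free() drops it again, which is exactly why SSL_CTX_free shows up underneath SSL_free above. If the transport additionally calls SSL_CTX_free() per connection on a context that is shared across connections, the context can end up being torn down while other connections still use it. The sketch below only illustrates that OpenSSL ownership split; the names are made up and this is not the actual socket.c fix:

    /* Hypothetical sketch: per-connection SSL object vs. shared SSL_CTX. */
    #include <stddef.h>
    #include <openssl/ssl.h>

    typedef struct conn {
        SSL     *ssl;   /* per-connection object, owned by this connection    */
        SSL_CTX *ctx;   /* borrowed pointer to a context shared by many conns */
    } conn_t;

    static void conn_teardown(conn_t *c)
    {
        if (c->ssl != NULL) {
            SSL_shutdown(c->ssl);
            SSL_free(c->ssl);   /* releases the per-connection state and drops the
                                 * reference SSL_new() took on c->ctx; no explicit
                                 * SSL_CTX_free() here for a shared context        */
            c->ssl = NULL;
        }
        c->ctx = NULL;          /* the shared SSL_CTX is freed exactly once, by
                                 * whoever created it (e.g. at transport fini)     */
    }

With that split there is only ever one owner calling SSL_CTX_free(), which avoids the kind of premature context teardown the backtrace above suggests.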
> > > > On Thu, May 9, 2019 at 8:34 AM Zhou, Cynthia (NSB - CN/Hangzhou) < > cynthia.zhou at nokia-sbell.com> wrote: > > Hi, > > Ok, It is posted to https://review.gluster.org/#/c/glusterfs/+/22687/ > > > > > > > > *From:* Raghavendra Gowdappa > *Sent:* Wednesday, May 08, 2019 7:35 PM > *To:* Zhou, Cynthia (NSB - CN/Hangzhou) > *Cc:* Amar Tumballi Suryanarayan ; > gluster-devel at gluster.org > *Subject:* Re: [Gluster-devel] glusterfsd memory leak issue found after > enable ssl > > > > > > > > On Wed, May 8, 2019 at 1:29 PM Zhou, Cynthia (NSB - CN/Hangzhou) < > cynthia.zhou at nokia-sbell.com> wrote: > > Hi 'Milind Changire' , > > The leak is getting more and more clear to me now. the unsolved memory > leak is because of in gluterfs version 3.12.15 (in my env)the ssl context > is a shared one, while we do ssl_acept, ssl will allocate some read/write > buffer to ssl object, however, ssl_free in socket_reset or fini function of > socket.c, the buffer is returened back to ssl context free list instead of > completely freed. > > > > Thanks Cynthia for your efforts in identifying and fixing the leak. If you > post a patch to gerrit, I'll be happy to merge it and get the fix into the > codebase. > > > > > > So following patch is able to fix the memory leak issue > completely.(created for gluster master branch) > > > > --- a/rpc/rpc-transport/socket/src/socket.c > +++ b/rpc/rpc-transport/socket/src/socket.c > @@ -446,6 +446,7 @@ ssl_setup_connection_postfix(rpc_transport_t *this) > gf_log(this->name, GF_LOG_DEBUG, > "SSL verification succeeded (client: %s) (server: %s)", > this->peerinfo.identifier, this->myinfo.identifier); > + X509_free(peer); > return gf_strdup(peer_CN); > > /* Error paths. */ > @@ -1157,7 +1158,21 @@ __socket_reset(rpc_transport_t *this) > memset(&priv->incoming, 0, sizeof(priv->incoming)); > > event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx); > - > + if(priv->use_ssl&& priv->ssl_ssl) > + { > + gf_log(this->name, GF_LOG_TRACE, > + "clear and reset for socket(%d), free ssl ", > + priv->sock); > + if(priv->ssl_ctx) > + { > + SSL_CTX_free(priv->ssl_ctx); > + priv->ssl_ctx = NULL; > + } > + SSL_shutdown(priv->ssl_ssl); > + SSL_clear(priv->ssl_ssl); > + SSL_free(priv->ssl_ssl); > + priv->ssl_ssl = NULL; > + } > priv->sock = -1; > priv->idx = -1; > priv->connected = -1; > @@ -4675,6 +4690,21 @@ fini(rpc_transport_t *this) > pthread_mutex_destroy(&priv->out_lock); > pthread_mutex_destroy(&priv->cond_lock); > pthread_cond_destroy(&priv->cond); > + if(priv->use_ssl&& priv->ssl_ssl) > + { > + gf_log(this->name, GF_LOG_TRACE, > + "clear and reset for socket(%d), free ssl > ", > + priv->sock); > + if(priv->ssl_ctx) > + { > + SSL_CTX_free(priv->ssl_ctx); > + priv->ssl_ctx = NULL; > + } > + SSL_shutdown(priv->ssl_ssl); > + SSL_clear(priv->ssl_ssl); > + SSL_free(priv->ssl_ssl); > > *From:* Zhou, Cynthia (NSB - CN/Hangzhou) > *Sent:* Monday, May 06, 2019 2:12 PM > *To:* 'Amar Tumballi Suryanarayan' > *Cc:* 'Milind Changire' ; 'gluster-devel at gluster.org' > > *Subject:* RE: [Gluster-devel] glusterfsd memory leak issue found after > enable ssl > > > > Hi, > > From our test valgrind and libleak all blame ssl3_accept > > ///////////////////////////from valgrind attached to > glusterfds/////////////////////////////////////////// > > ==16673== 198,720 bytes in 12 blocks are definitely lost in loss record > 1,114 of 1,123 > ==16673== at 0x4C2EB7B: malloc (vg_replace_malloc.c:299) > ==16673== by 0x63E1977: CRYPTO_malloc (in /usr/lib64/ > *libcrypto.so.1.0.2p*) > ==16673== by 
0xA855E0C: ssl3_setup_write_buffer (in /usr/lib64/ > *libssl.so.1.0.2p*) > ==16673== by 0xA855E77: ssl3_setup_buffers (in /usr/lib64/ > *libssl.so.1.0.2p*) > ==16673== by 0xA8485D9: ssl3_accept (in /usr/lib64/*libssl.so.1.0.2p*) > ==16673== by 0xA610DDF: ssl_complete_connection (socket.c:400) > ==16673== by 0xA617F38: ssl_handle_server_connection_attempt > (socket.c:2409) > ==16673== by 0xA618420: socket_complete_connection (socket.c:2554) > ==16673== by 0xA618788: socket_event_handler (socket.c:2613) > ==16673== by 0x4ED6983: event_dispatch_epoll_handler (event-epoll.c:587) > ==16673== by 0x4ED6C5A: event_dispatch_epoll_worker (event-epoll.c:663) > ==16673== by 0x615C5D9: start_thread (in /usr/lib64/*libpthread-2.27.so > *) > ==16673== > ==16673== 200,544 bytes in 12 blocks are definitely lost in loss record > 1,115 of 1,123 > ==16673== at 0x4C2EB7B: malloc (vg_replace_malloc.c:299) > ==16673== by 0x63E1977: CRYPTO_malloc (in /usr/lib64/ > *libcrypto.so.1.0.2p*) > ==16673== by 0xA855D12: ssl3_setup_read_buffer (in /usr/lib64/ > *libssl.so.1.0.2p*) > ==16673== by 0xA855E68: ssl3_setup_buffers (in /usr/lib64/ > *libssl.so.1.0.2p*) > ==16673== by 0xA8485D9: ssl3_accept (in /usr/lib64/*libssl.so.1.0.2p*) > ==16673== by 0xA610DDF: ssl_complete_connection (socket.c:400) > ==16673== by 0xA617F38: ssl_handle_server_connection_attempt > (socket.c:2409) > ==16673== by 0xA618420: socket_complete_connection (socket.c:2554) > ==16673== by 0xA618788: socket_event_handler (socket.c:2613) > ==16673== by 0x4ED6983: event_dispatch_epoll_handler (event-epoll.c:587) > ==16673== by 0x4ED6C5A: event_dispatch_epoll_worker (event-epoll.c:663) > ==16673== by 0x615C5D9: start_thread (in /usr/lib64/*libpthread-2.27.so > *) > ==16673== > valgrind --leak-check=f > > > > > > ////////////////////////////////////with libleak attached to > glusterfsd///////////////////////////////////////// > > callstack[2419] expires. count=1 size=224/224 alloc=362 free=350 > /home/robot/libleak/*libleak.so(malloc+0x25*) [0x7f1460604065] > /lib64/*libcrypto.so.10(CRYPTO_malloc+0x58*) [0x7f145ecd9978] > /lib64/*libcrypto.so.10(EVP_DigestInit_ex+0x2a9*) [0x7f145ed95749] > /lib64/*libssl.so.10(ssl3_digest_cached_records+0x11d*) > [0x7f145abb6ced] > /lib64/*libssl.so.10(**ssl3_accept**+0xc8f*) [0x7f145abadc4f] > /usr/lib64/glusterfs/3.12.15/rpc-transport/ > *socket.so(ssl_complete_connection+0x5e*) [0x7f145ae00f3a] > /usr/lib64/glusterfs/3.12.15/rpc-transport/*socket.so(+0xc16d*) > [0x7f145ae0816d] > /usr/lib64/glusterfs/3.12.15/rpc-transport/*socket.so(+0xc68a*) > [0x7f145ae0868a] > /usr/lib64/glusterfs/3.12.15/rpc-transport/*socket.so(+0xc9f2*) > [0x7f145ae089f2] > /lib64/*libglusterfs.so.0(+0x9b96f*) [0x7f146038596f] > /lib64/*libglusterfs.so.0(+0x9bc46*) [0x7f1460385c46] > /lib64/*libpthread.so.0(+0x75da*) [0x7f145f0d15da] > /lib64/*libc.so.6(clone+0x3f*) [0x7f145e9a7eaf] > > callstack[2432] expires. 
count=1 size=104/104 alloc=362 free=0 > /home/robot/libleak/*libleak.so(malloc+0x25*) [0x7f1460604065] > /lib64/*libcrypto.so.10(CRYPTO_malloc+0x58*) [0x7f145ecd9978] > /lib64/*libcrypto.so.10(BN_MONT_CTX_new+0x17*) [0x7f145ed48627] > /lib64/*libcrypto.so.10(BN_MONT_CTX_set_locked+0x6d*) [0x7f145ed489fd] > /lib64/*libcrypto.so.10(+0xff4d9*) [0x7f145ed6a4d9] > /lib64/*libcrypto.so.10(int_rsa_verify+0x1cd*) [0x7f145ed6d41d] > /lib64/*libcrypto.so.10(RSA_verify+0x32*) [0x7f145ed6d972] > /lib64/*libcrypto.so.10(+0x107ff5*) [0x7f145ed72ff5] > /lib64/*libcrypto.so.10(EVP_VerifyFinal+0x211*) [0x7f145ed9dd51] > /lib64/*libssl.so.10(ssl3_get_cert_verify+0x5bb*) [0x7f145abac06b] > /lib64/*libssl.so.10(**ssl3_accept**+0x988*) [0x7f145abad948] > /usr/lib64/glusterfs/3.12.15/rpc-transport/ > *socket.so(ssl_complete_connection+0x5e*) [0x7f145ae00f3a] > /usr/lib64/glusterfs/3.12.15/rpc-transport/*socket.so(+0xc16d*) > [0x7f145ae0816d] > /usr/lib64/glusterfs/3.12.15/rpc-transport/*socket.so(+0xc68a*) > [0x7f145ae0868a] > /usr/lib64/glusterfs/3.12.15/rpc-transport/*socket.so(+0xc9f2*) > [0x7f145ae089f2] > /lib64/*libglusterfs.so.0(+0x9b96f*) [0x7f146038596f] > /lib64/*libglusterfs.so.0(+0x9bc46*) [0x7f1460385c46] > /lib64/*libpthread.so.0(+0x75da*) [0x7f145f0d15da] > /lib64/*libc.so.6(clone+0x3f*) [0x7f145e9a7eaf] > > > > one interesting thing is that the memory goes up to about 300m then it > stopped increasing !!! > > I am wondering if this is caused by open-ssl library? But when I search > from openssl community, there is no such issue reported before. > > Is glusterfs using ssl_accept correctly? > > > > cynthia > > *From:* Zhou, Cynthia (NSB - CN/Hangzhou) > *Sent:* Monday, May 06, 2019 10:34 AM > *To:* 'Amar Tumballi Suryanarayan' > *Cc:* Milind Changire ; gluster-devel at gluster.org > *Subject:* RE: [Gluster-devel] glusterfsd memory leak issue found after > enable ssl > > > > Hi, > > Sorry, I am so busy with other issues these days, could you help me to > submit my patch for review? It is based on glusterfs3.12.15 code. But even > with this patch , memory leak still exists, from memory leak tool it should > be related with ssl_accept, not sure if it is because of openssl library or > because improper use of ssl interfaces. 
> > --- a/rpc/rpc-transport/socket/src/socket.c > > +++ b/rpc/rpc-transport/socket/src/socket.c > > @@ -1019,7 +1019,16 @@ static void __socket_reset(rpc_transport_t *this) { > > memset(&priv->incoming, 0, sizeof(priv->incoming)); > > > > event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx); > > - > > + if(priv->use_ssl&& priv->ssl_ssl) > > + { > > + gf_log(this->name, GF_LOG_INFO, > > + "clear and reset for socket(%d), free ssl ", > > + priv->sock); > > + SSL_shutdown(priv->ssl_ssl); > > + SSL_clear(priv->ssl_ssl); > > + SSL_free(priv->ssl_ssl); > > + priv->ssl_ssl = NULL; > > + } > > priv->sock = -1; > > priv->idx = -1; > > priv->connected = -1; > > @@ -4238,6 +4250,16 @@ void fini(rpc_transport_t *this) { > > pthread_mutex_destroy(&priv->out_lock); > > pthread_mutex_destroy(&priv->cond_lock); > > pthread_cond_destroy(&priv->cond); > > + if(priv->use_ssl&& priv->ssl_ssl) > > + { > > + gf_log(this->name, GF_LOG_INFO, > > + "clear and reset for socket(%d), free ssl ", > > + priv->sock); > > + SSL_shutdown(priv->ssl_ssl); > > + SSL_clear(priv->ssl_ssl); > > + SSL_free(priv->ssl_ssl); > > + priv->ssl_ssl = NULL; > > + } > > if (priv->ssl_private_key) { > > GF_FREE(priv->ssl_private_key); > > } > > > > > > *From:* Amar Tumballi Suryanarayan > *Sent:* Wednesday, May 01, 2019 8:43 PM > *To:* Zhou, Cynthia (NSB - CN/Hangzhou) > *Cc:* Milind Changire ; gluster-devel at gluster.org > *Subject:* Re: [Gluster-devel] glusterfsd memory leak issue found after > enable ssl > > > > Hi Cynthia Zhou, > > > > Can you post the patch which fixes the issue of missing free? We will > continue to investigate the leak further, but would really appreciate > getting the patch which is already worked on land into upstream master. > > > > -Amar > > > > On Mon, Apr 22, 2019 at 1:38 PM Zhou, Cynthia (NSB - CN/Hangzhou) < > cynthia.zhou at nokia-sbell.com> wrote: > > Ok, I am clear now. > > I?ve added ssl_free in socket reset and socket finish function, though > glusterfsd memory leak is not that much, still it is leaking, from source > code I can not find anything else, > > Could you help to check if this issue exists in your env? If not I may > have a try to merge your patch . > > Step > > 1> while true;do gluster v heal info, > > 2> check the vol-name glusterfsd memory usage, it is obviously > increasing. > > cynthia > > > > *From:* Milind Changire > *Sent:* Monday, April 22, 2019 2:36 PM > *To:* Zhou, Cynthia (NSB - CN/Hangzhou) > *Cc:* Atin Mukherjee ; gluster-devel at gluster.org > *Subject:* Re: [Gluster-devel] glusterfsd memory leak issue found after > enable ssl > > > > According to BIO_new_socket() man page ... > > > > *If the close flag is set then the socket is shut down and closed when the > BIO is freed.* > > > > For Gluster to have more control over the socket shutdown, the BIO_NOCLOSE > flag is set. Otherwise, SSL takes control of socket shutdown whenever BIO > is freed. 
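The close-flag behaviour quoted above is easiest to see side by side; the snippet below is a plain-OpenSSL illustration (not Gluster code) of the two choices:

    /* Illustration of BIO_new_socket()'s close flag; error handling omitted. */
    #include <openssl/bio.h>
    #include <openssl/ssl.h>

    void attach_bio(SSL *ssl, int sock)
    {
        /* BIO_NOCLOSE: freeing the BIO (e.g. indirectly via SSL_free) leaves the
         * fd open, so the caller keeps ownership of the socket and must close it
         * itself - the behaviour the transport wants for controlled shutdown.    */
        BIO *bio = BIO_new_socket(sock, BIO_NOCLOSE);

        /* With BIO_CLOSE instead, BIO_free() would also shut down and close the
         * fd, i.e. the SSL/BIO layer would own the socket's lifetime.            */
        SSL_set_bio(ssl, bio, bio);   /* ssl takes ownership of bio */
    }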
> > > > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > > -- > > Amar Tumballi (amarts) > > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amukherj at redhat.com Mon Jun 10 13:12:21 2019 From: amukherj at redhat.com (Atin Mukherjee) Date: Mon, 10 Jun 2019 18:42:21 +0530 Subject: [Gluster-devel] [Gluster-Maintainers] Fwd: Build failed in Jenkins: regression-test-with-multiplex #1359 In-Reply-To: References: <24208463.92.1559325814227.JavaMail.jenkins@jenkins-el7.rht.gluster.org> Message-ID: On Fri, Jun 7, 2019 at 10:07 AM Amar Tumballi Suryanarayan < atumball at redhat.com> wrote: > Got time to test subdir-mount.t failing in brick-mux scenario. > > I noticed some issues, where I need further help from glusterd team. > > subdir-mount.t expects 'hook' script to run after add-brick to make sure > the required subdirectories are healed and are present in new bricks. This > is important as subdir mount expects the subdirs to exist for successful > mount. > > But in case of brick-mux setup, I see that in some cases (6/10), hook > script (add-brick/post-hook/S13-create-subdir-mount.sh) started getting > executed after 20second of finishing the add-brick command. Due to this, > the mount which we execute after add-brick failed. > > My question is, what is making post hook script to run so late ?? > It's not only the add-brick in the post hook. Given post hook scripts are async in nature, I see the respective hook scripts of create/start/set volume operation have executed quite a late which is very surprising until and unless some thread has been stuck for quite a while. Unfortunately for both Mohit and I, the issue isn't reproducible locally. Mohit would give it a try in softserve infra but at this point of time, there's no conclusive evidence, the analysis continues. Amar - would it be possible for you to do a git blame given you can reproduce this? May 31 nightly ( https://build.gluster.org/job/regression-test-with-multiplex/1359/) is when this test started failing. > I can recreate the issues locally on my laptop too. > > > On Sat, Jun 1, 2019 at 4:55 PM Atin Mukherjee wrote: > >> subdir-mount.t has started failing in brick mux regression nightly. This >> needs to be fixed. >> >> Raghavendra - did we manage to get any further clue on uss.t failure? 
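To make the race concrete: the mount in the test is issued as soon as add-brick returns, while S13-create-subdir-mount.sh runs asynchronously, so any delay in the post hook means the subdir is not yet present on the new brick when the mount happens. A test-side sketch of waiting for the hook's effect before mounting is below; the brick path and subdir name are assumptions for illustration, not taken verbatim from subdir-mount.t:

    # add the brick; the post hook that heals the subdirs onto it runs async
    gluster --mode=script volume add-brick patchy $HOSTNAME:/d/backends/patchy3

    # poll (up to ~30s) for the subdir to appear on the new brick before mounting
    for i in $(seq 1 30); do
        test -d /d/backends/patchy3/subdir1 && break
        sleep 1
    done

    mount -t glusterfs $HOSTNAME:/patchy/subdir1 /mnt/glusterfs/1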
>> >> ---------- Forwarded message --------- >> From: >> Date: Fri, 31 May 2019 at 23:34 >> Subject: [Gluster-Maintainers] Build failed in Jenkins: >> regression-test-with-multiplex #1359 >> To: , , , >> , >> >> >> See < >> https://build.gluster.org/job/regression-test-with-multiplex/1359/display/redirect?page=changes >> > >> >> Changes: >> >> [atin] glusterd: add an op-version check >> >> [atin] glusterd/svc: glusterd_svcs_stop should call individual wrapper >> function >> >> [atin] glusterd/svc: Stop stale process using the glusterd_proc_stop >> >> [Amar Tumballi] lcov: more coverage to shard, old-protocol, sdfs >> >> [Kotresh H R] tests/geo-rep: Add EC volume test case >> >> [Amar Tumballi] glusterfsd/cleanup: Protect graph object under a lock >> >> [Mohammed Rafi KC] glusterd/shd: Optimize the glustershd manager to send >> reconfigure >> >> [Kotresh H R] tests/geo-rep: Add tests to cover glusterd geo-rep >> >> [atin] glusterd: Optimize code to copy dictionary in handshake code path >> >> ------------------------------------------ >> [...truncated 3.18 MB...] >> ./tests/basic/afr/stale-file-lookup.t - 9 second >> ./tests/basic/afr/granular-esh/replace-brick.t - 9 second >> ./tests/basic/afr/granular-esh/add-brick.t - 9 second >> ./tests/basic/afr/gfid-mismatch.t - 9 second >> ./tests/performance/open-behind.t - 8 second >> ./tests/features/ssl-authz.t - 8 second >> ./tests/features/readdir-ahead.t - 8 second >> ./tests/bugs/upcall/bug-1458127.t - 8 second >> ./tests/bugs/transport/bug-873367.t - 8 second >> ./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t - 8 second >> ./tests/bugs/replicate/bug-1132102.t - 8 second >> ./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t >> - 8 second >> ./tests/bugs/quota/bug-1104692.t - 8 second >> ./tests/bugs/posix/bug-1360679.t - 8 second >> ./tests/bugs/posix/bug-1122028.t - 8 second >> ./tests/bugs/nfs/bug-1157223-symlink-mounting.t - 8 second >> ./tests/bugs/glusterfs/bug-861015-log.t - 8 second >> ./tests/bugs/glusterd/sync-post-glusterd-restart.t - 8 second >> ./tests/bugs/glusterd/bug-1696046.t - 8 second >> ./tests/bugs/fuse/bug-983477.t - 8 second >> ./tests/bugs/ec/bug-1227869.t - 8 second >> ./tests/bugs/distribute/bug-1088231.t - 8 second >> ./tests/bugs/distribute/bug-1086228.t - 8 second >> ./tests/bugs/cli/bug-1087487.t - 8 second >> ./tests/bugs/cli/bug-1022905.t - 8 second >> ./tests/bugs/bug-1258069.t - 8 second >> ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t >> - 8 second >> ./tests/basic/xlator-pass-through-sanity.t - 8 second >> ./tests/basic/quota-nfs.t - 8 second >> ./tests/basic/glusterd/arbiter-volume.t - 8 second >> ./tests/basic/ctime/ctime-noatime.t - 8 second >> ./tests/line-coverage/cli-peer-and-volume-operations.t - 7 second >> ./tests/gfid2path/get-gfid-to-path.t - 7 second >> ./tests/bugs/upcall/bug-1369430.t - 7 second >> ./tests/bugs/snapshot/bug-1260848.t - 7 second >> ./tests/bugs/shard/shard-inode-refcount-test.t - 7 second >> ./tests/bugs/shard/bug-1258334.t - 7 second >> ./tests/bugs/replicate/bug-767585-gfid.t - 7 second >> ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t - 7 second >> ./tests/bugs/replicate/bug-1250170-fsync.t - 7 second >> ./tests/bugs/posix/bug-1175711.t - 7 second >> ./tests/bugs/nfs/bug-915280.t - 7 second >> ./tests/bugs/md-cache/setxattr-prepoststat.t - 7 second >> ./tests/bugs/md-cache/bug-1211863_unlink.t - 7 second >> ./tests/bugs/glusterfs/bug-848251.t - 7 second >> 
[... per-test timing listing trimmed; identical to the build report quoted earlier in this thread ...]
./tests/bugs/glusterfs-server/bug-861542.t - 3 second >> ./tests/bugs/glusterfs/bug-869724.t - 3 second >> ./tests/bugs/glusterfs/bug-860297.t - 3 second >> ./tests/bugs/glusterfs/bug-844688.t - 3 second >> ./tests/bugs/glusterd/bug-948729/bug-948729.t - 3 second >> ./tests/bugs/distribute/bug-1204140.t - 3 second >> ./tests/bugs/core/bug-924075.t - 3 second >> ./tests/bugs/core/bug-845213.t - 3 second >> ./tests/bugs/core/bug-1421721-mpx-toggle.t - 3 second >> ./tests/bugs/core/bug-1119582.t - 3 second >> ./tests/bugs/core/bug-1117951.t - 3 second >> ./tests/bugs/cli/bug-983317-volume-get.t - 3 second >> ./tests/bugs/cli/bug-867252.t - 3 second >> ./tests/basic/glusterd/check-cloudsync-ancestry.t - 3 second >> ./tests/basic/fops-sanity.t - 3 second >> ./tests/basic/fencing/test-fence-option.t - 3 second >> ./tests/basic/distribute/debug-xattrs.t - 3 second >> ./tests/basic/afr/ta-check-locks.t - 3 second >> ./tests/line-coverage/volfile-with-all-graph-syntax.t - 2 second >> ./tests/line-coverage/some-features-in-libglusterfs.t - 2 second >> ./tests/bugs/shard/bug-1261773.t - 2 second >> ./tests/bugs/replicate/bug-884328.t - 2 second >> ./tests/bugs/readdir-ahead/bug-1512437.t - 2 second >> ./tests/bugs/nfs/bug-970070.t - 2 second >> ./tests/bugs/nfs/bug-1302948.t - 2 second >> ./tests/bugs/logging/bug-823081.t - 2 second >> ./tests/bugs/glusterfs-server/bug-889996.t - 2 second >> ./tests/bugs/glusterfs/bug-892730.t - 2 second >> ./tests/bugs/glusterfs/bug-811493.t - 2 second >> ./tests/bugs/glusterd/bug-1085330-and-bug-916549.t - 2 second >> ./tests/bugs/distribute/bug-924265.t - 2 second >> ./tests/bugs/core/log-bug-1362520.t - 2 second >> ./tests/bugs/core/bug-903336.t - 2 second >> ./tests/bugs/core/bug-1111557.t - 2 second >> ./tests/bugs/cli/bug-969193.t - 2 second >> ./tests/bugs/cli/bug-949298.t - 2 second >> ./tests/bugs/cli/bug-921215.t - 2 second >> ./tests/bugs/cli/bug-1378842-volume-get-all.t - 2 second >> ./tests/basic/peer-parsing.t - 2 second >> ./tests/basic/md-cache/bug-1418249.t - 2 second >> ./tests/basic/afr/arbiter-cli.t - 2 second >> ./tests/bugs/replicate/ta-inode-refresh-read.t - 1 second >> ./tests/bugs/glusterfs/bug-853690.t - 1 second >> ./tests/bugs/cli/bug-764638.t - 1 second >> ./tests/bugs/cli/bug-1047378.t - 1 second >> ./tests/basic/netgroup_parsing.t - 1 second >> ./tests/basic/gfapi/sink.t - 1 second >> ./tests/basic/exports_parsing.t - 1 second >> ./tests/basic/posixonly.t - 0 second >> ./tests/basic/glusterfsd-args.t - 0 second >> >> >> 2 test(s) failed >> ./tests/basic/uss.t >> ./tests/features/subdir-mount.t >> >> 0 test(s) generated core >> >> >> 5 test(s) needed retry >> ./tests/basic/afr/split-brain-favorite-child-policy.t >> ./tests/basic/ec/self-heal.t >> ./tests/basic/uss.t >> ./tests/basic/volfile-sanity.t >> ./tests/features/subdir-mount.t >> >> Result is 1 >> >> tar: Removing leading `/' from member names >> kernel.core_pattern = /%e-%p.core >> Build step 'Execute shell' marked build as failure >> _______________________________________________ >> maintainers mailing list >> maintainers at gluster.org >> https://lists.gluster.org/mailman/listinfo/maintainers >> >> >> -- >> - Atin (atinm) >> _______________________________________________ >> maintainers mailing list >> maintainers at gluster.org >> https://lists.gluster.org/mailman/listinfo/maintainers >> > > > -- > Amar Tumballi (amarts) > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hgowtham at redhat.com Mon Jun 10 13:28:10 2019 From: hgowtham at redhat.com (hgowtham at redhat.com) Date: Mon, 10 Jun 2019 13:28:10 +0000 Subject: [Gluster-devel] Invitation: Gluster Community Meeting (APAC friendly hours) @ Tue Jun 11, 2019 11:30am - 12:30pm (IST) (gluster-devel@gluster.org) Message-ID: <000000000000c3110d058af82636@google.com> You have been invited to the following event. Title: Gluster Community Meeting (APAC friendly hours) Bridge: https://bluejeans.com/836554017 Meeting minutes: https://hackmd.io/A07qMrezSOyeUUGxPhBHqQ?both Previous Meeting notes: http://github.com/gluster/community When: Tue Jun 11, 2019 11:30am ? 12:30pm India Standard Time - Kolkata Where: https://bluejeans.com/836554017 Calendar: gluster-devel at gluster.org Who: * hgowtham at redhat.com - organizer * gluster-users at gluster.org * gluster-devel at gluster.org Event details: https://www.google.com/calendar/event?action=VIEW&eid=MmFrY3BnZ3I4MG5kdmcxbmJnYzlkcDBycmwgZ2x1c3Rlci1kZXZlbEBnbHVzdGVyLm9yZw&tok=MTkjaGdvd3RoYW1AcmVkaGF0LmNvbWI3NDk5MGVkZDkyOWZjNzhjNTVmNmU0YWY2NWQzNjk4NzYzODdiM2Q&ctz=Asia%2FKolkata&hl=en&es=0 Invitation from Google Calendar: https://www.google.com/calendar/ You are receiving this courtesy email at the account gluster-devel at gluster.org because you are an attendee of this event. To stop receiving future updates for this event, decline this event. Alternatively you can sign up for a Google account at https://www.google.com/calendar/ and control your notification settings for your entire calendar. Forwarding this invitation could allow any recipient to send a response to the organizer and be added to the guest list, or invite others regardless of their own invitation status, or to modify your RSVP. Learn more at https://support.google.com/calendar/answer/37135#forwarding -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/calendar Size: 1721 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: invite.ics Type: application/ics Size: 1759 bytes Desc: not available URL: From hgowtham at redhat.com Mon Jun 10 14:07:24 2019 From: hgowtham at redhat.com (Hari Gowtham) Date: Mon, 10 Jun 2019 19:37:24 +0530 Subject: [Gluster-devel] [Gluster-users] No healing on peer disconnect - is it correct? In-Reply-To: <3B1EE351-5F82-4D05-947A-4960BBAC885A@gmail.com> References: <10D708D0-E523-46A0-91BF-FFC41886E316@gmail.com> <3B1EE351-5F82-4D05-947A-4960BBAC885A@gmail.com> Message-ID: On Mon, Jun 10, 2019 at 7:21 PM snowmailer wrote: > > Can someone advice on this, please? > > BR! > > D?a 3. 6. 2019 o 18:58 u??vate? Martin nap?sal: > > > Hi all, > > > > I need someone to explain if my gluster behaviour is correct. I am not sure if my gluster works as it should. I have simple Replica 3 - Number of Bricks: 1 x 3 = 3. > > > > When one of my hypervisor is disconnected as peer, i.e. gluster process is down but bricks running, other two healthy nodes start signalling that they lost one peer. This is correct. > > Next, I restart gluster process on node where gluster process failed and I thought It should trigger healing of files on failed node but nothing is happening. > > > > I run VMs disks on this gluster volume. No healing is triggered after gluster restart, remaining two nodes get peer back after restart of gluster and everything is running without down time. 
> > Even VMs that are running on 'failed' node where gluster process was down (bricks were up) are running without down time. I assume your VMs use gluster as the storage. In that case, the gluster volume might be mounted on all the hypervisors. The mount/client is smart enough to give the correct data from the other two machines which were always up. This is the reason things are working fine. Gluster should heal the brick. Adding people who can help you better with the heal part. @Karthik Subrahmanya @Ravishankar N do take a look and answer this part. > > > > Is this behaviour correct? I mean No healing is triggered after peer is reconnected back and VMs. > > > > Thanks for explanation. > > > > BR! > > Martin > > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- Regards, Hari Gowtham. From atumball at redhat.com Mon Jun 10 17:15:27 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Mon, 10 Jun 2019 22:45:27 +0530 Subject: [Gluster-devel] Regression failure continues: 'tests/basic/afr/split-brain-favorite-child-policy.t` Message-ID: Fails with: *20:56:58* ok 132 [ 8/ 82] < 194> 'gluster --mode=script --wignore volume heal patchy'*20:56:58* not ok 133 [ 8/ 80260] < 195> '^0$ get_pending_heal_count patchy' -> 'Got "2" instead of "^0$"'*20:56:58* ok 134 [ 18/ 2] < 197> '0 echo 0' Looks like when the error occurred, it took 80 seconds. I see 2 different patches fail on this, would be good to analyze it further. Regards, Amar -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksubrahm at redhat.com Mon Jun 10 17:17:57 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Mon, 10 Jun 2019 22:47:57 +0530 Subject: [Gluster-devel] Regression failure continues: 'tests/basic/afr/split-brain-favorite-child-policy.t` In-Reply-To: References: Message-ID: Hi Amar, I found the issue, will be sending a patch in a while. Regards, Karthik On Mon, Jun 10, 2019 at 10:46 PM Amar Tumballi Suryanarayan < atumball at redhat.com> wrote: > Fails with: > > *20:56:58* ok 132 [ 8/ 82] < 194> 'gluster --mode=script --wignore volume heal patchy'*20:56:58* not ok 133 [ 8/ 80260] < 195> '^0$ get_pending_heal_count patchy' -> 'Got "2" instead of "^0$"'*20:56:58* ok 134 [ 18/ 2] < 197> '0 echo 0' > > > Looks like when the error occurred, it took 80 seconds. > > > I see 2 different patches fail on this, would be good to analyze it further. > > > Regards, > > Amar > > > -- > Amar Tumballi (amarts) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksubrahm at redhat.com Mon Jun 10 18:16:11 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Mon, 10 Jun 2019 23:46:11 +0530 Subject: [Gluster-devel] Regression failure continues: 'tests/basic/afr/split-brain-favorite-child-policy.t` In-Reply-To: References: Message-ID: Patch posted: https://review.gluster.org/#/c/glusterfs/+/22850/ -Karthik On Mon, Jun 10, 2019 at 10:47 PM Karthik Subrahmanya wrote: > Hi Amar, > > I found the issue, will be sending a patch in a while.
> > Regards, > Karthik > > On Mon, Jun 10, 2019 at 10:46 PM Amar Tumballi Suryanarayan < > atumball at redhat.com> wrote: > >> Fails with: >> >> *20:56:58* ok 132 [ 8/ 82] < 194> 'gluster --mode=script --wignore volume heal patchy'*20:56:58* not ok 133 [ 8/ 80260] < 195> '^0$ get_pending_heal_count patchy' -> 'Got "2" instead of "^0$"'*20:56:58* ok 134 [ 18/ 2] < 197> '0 echo 0' >> >> >> Looks like when the error occurred, it took 80seconds. >> >> >> I see 2 different patches fail on this, would be good to analyze it further. >> >> >> Regards, >> >> Amar >> >> >> -- >> Amar Tumballi (amarts) >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ravishankar at redhat.com Tue Jun 11 04:50:10 2019 From: ravishankar at redhat.com (Ravishankar N) Date: Tue, 11 Jun 2019 10:20:10 +0530 Subject: [Gluster-devel] [Gluster-users] No healing on peer disconnect - is it correct? In-Reply-To: References: <10D708D0-E523-46A0-91BF-FFC41886E316@gmail.com> <3B1EE351-5F82-4D05-947A-4960BBAC885A@gmail.com> Message-ID: <28417fb7-5081-cc8e-7ffc-625f9905f9c2@redhat.com> There will be pending heals only when the brick process goes down or there is a disconnect between the client and that brick. When you say " gluster process is down but bricks running", I'm guessing you killed only glusterd and not the glusterfsd brick process. That won't cause any pending heals. If there is something to be healed, `gluster volume heal $volname info` will display the list of files. Hope that helps, Ravi On 10/06/19 7:53 PM, Martin wrote: > My VMs using Gluster as storage through libgfapi support in Qemu. But > I dont see any healing of reconnected brick. > > Thanks Karthik /?Ravishankar in advance! > >> On 10 Jun 2019, at 16:07, Hari Gowtham > > wrote: >> >> On Mon, Jun 10, 2019 at 7:21 PM snowmailer > > wrote: >>> >>> Can someone advice on this, please? >>> >>> BR! >>> >>> D?a 3. 6. 2019 o 18:58 u??vate? Martin >> > nap?sal: >>> >>>> Hi all, >>>> >>>> I need someone to explain if my gluster behaviour is correct. I am >>>> not sure if my gluster works as it should. I have simple Replica 3 >>>> - Number of Bricks: 1 x 3 = 3. >>>> >>>> When one of my hypervisor is disconnected as peer, i.e. gluster >>>> process is down but bricks running, other two healthy nodes start >>>> signalling that they lost one peer. This is correct. >>>> Next, I restart gluster process on node where gluster process >>>> failed and I thought It should trigger healing of files on failed >>>> node but nothing is happening. >>>> >>>> I run VMs disks on this gluster volume. No healing is triggered >>>> after gluster restart, remaining two nodes get peer back after >>>> restart of gluster and everything is running without down time. >>>> Even VMs that are running on ?failed? node where gluster process >>>> was down (bricks were up) are running without down time. >> >> I assume your VMs use gluster as the storage. In that case, the >> gluster volume might be mounted on all the hypervisors. >> The mount/ client is smart enough to give the correct data from the >> other two machines which were always up. >> This is the reason things are working fine. >> >> Gluster should heal the brick. >> Adding people how can help you better with the heal part. >> @Karthik Subrahmanya ?@Ravishankar N do take a look and answer this part. >> >>>> >>>> Is this behaviour correct? I mean No healing is triggered after >>>> peer is reconnected back and VMs. >>>> >>>> Thanks for explanation. >>>> >>>> BR! 
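Following up on the glusterd vs. glusterfsd distinction above, here is a rough sketch of how one might verify the state from any of the three nodes; the volume name 'myvol' is assumed and should be replaced with the real one:

    # glusterd (management daemon) and glusterfsd (brick processes) are separate
    pgrep -a glusterd        # management daemon
    pgrep -a glusterfsd      # one process per brick; these serve the data

    # confirm each brick is shown as online with a PID
    gluster volume status myvol

    # list entries still pending heal; a fully caught-up replica reports
    # "Number of entries: 0" for every brick
    gluster volume heal myvol info

If only glusterd was restarted while the glusterfsd brick processes stayed up, the heal info output stays empty, which is consistent with the behaviour described above.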
>>>> Martin >>>> >>>> >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> >> >> -- >> Regards, >> Hari Gowtham. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anoopcs at cryptolab.net Tue Jun 11 05:03:43 2019 From: anoopcs at cryptolab.net (Anoop C S) Date: Tue, 11 Jun 2019 10:33:43 +0530 Subject: [Gluster-devel] [Gluster-Maintainers] Fwd: Build failed in Jenkins: regression-test-with-multiplex #1359 In-Reply-To: References: <24208463.92.1559325814227.JavaMail.jenkins@jenkins-el7.rht.gluster.org> Message-ID: <5f05e6372e856402bb344107920cc1693a47be13.camel@cryptolab.net> On Fri, 2019-06-07 at 10:06 +0530, Amar Tumballi Suryanarayan wrote: > Got time to test subdir-mount.t failing in brick-mux scenario. > > I noticed some issues, where I need further help from glusterd team. > > subdir-mount.t expects 'hook' script to run after add-brick to make > sure the required subdirectories are healed and are present in new > bricks. This is important as subdir mount expects the subdirs to > exist for successful mount. > > But in case of brick-mux setup, I see that in some cases (6/10), hook > script (add-brick/post-hook/S13-create-subdir-mount.sh) started > getting executed after 20second of finishing the add-brick command. > Due to this, the mount which we execute after add-brick failed. > > My question is, what is making post hook script to run so late ?? Note:- I would like to add that I have a patch[1] under review adding another post add-brick hook script which might further result in delayed execution of S13create-subdir-mounts.sh. Because S10selinux-label- brick.sh from my change comes before the existing hook script. [1] https://review.gluster.org/c/glusterfs/+/22834 > I can recreate the issues locally on my laptop too. > > > On Sat, Jun 1, 2019 at 4:55 PM Atin Mukherjee > wrote: > > subdir-mount.t has started failing in brick mux regression nightly. > > This needs to be fixed. > > > > Raghavendra - did we manage to get any further clue on uss.t > > failure? > > > > ---------- Forwarded message --------- > > From: > > Date: Fri, 31 May 2019 at 23:34 > > Subject: [Gluster-Maintainers] Build failed in Jenkins: regression- > > test-with-multiplex #1359 > > To: , , < > > amarts at redhat.com>, , > > > > > > See < > > https://build.gluster.org/job/regression-test-with-multiplex/1359/display/redirect?page=changes > > > > > > > Changes: > > > > [atin] glusterd: add an op-version check > > > > [atin] glusterd/svc: glusterd_svcs_stop should call individual > > wrapper function > > > > [atin] glusterd/svc: Stop stale process using the > > glusterd_proc_stop > > > > [Amar Tumballi] lcov: more coverage to shard, old-protocol, sdfs > > > > [Kotresh H R] tests/geo-rep: Add EC volume test case > > > > [Amar Tumballi] glusterfsd/cleanup: Protect graph object under a > > lock > > > > [Mohammed Rafi KC] glusterd/shd: Optimize the glustershd manager to > > send reconfigure > > > > [Kotresh H R] tests/geo-rep: Add tests to cover glusterd geo-rep > > > > [atin] glusterd: Optimize code to copy dictionary in handshake code > > path > > > > ------------------------------------------ > > [...truncated 3.18 MB...] 
> > ./tests/basic/afr/stale-file-lookup.t - 9 second > > ./tests/basic/afr/granular-esh/replace-brick.t - 9 second > > ./tests/basic/afr/granular-esh/add-brick.t - 9 second > > ./tests/basic/afr/gfid-mismatch.t - 9 second > > ./tests/performance/open-behind.t - 8 second > > ./tests/features/ssl-authz.t - 8 second > > ./tests/features/readdir-ahead.t - 8 second > > ./tests/bugs/upcall/bug-1458127.t - 8 second > > ./tests/bugs/transport/bug-873367.t - 8 second > > ./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t - 8 > > second > > ./tests/bugs/replicate/bug-1132102.t - 8 second > > ./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove- > > quota-quota-deem-statfs.t - 8 second > > ./tests/bugs/quota/bug-1104692.t - 8 second > > ./tests/bugs/posix/bug-1360679.t - 8 second > > ./tests/bugs/posix/bug-1122028.t - 8 second > > ./tests/bugs/nfs/bug-1157223-symlink-mounting.t - 8 second > > ./tests/bugs/glusterfs/bug-861015-log.t - 8 second > > ./tests/bugs/glusterd/sync-post-glusterd-restart.t - 8 second > > ./tests/bugs/glusterd/bug-1696046.t - 8 second > > ./tests/bugs/fuse/bug-983477.t - 8 second > > ./tests/bugs/ec/bug-1227869.t - 8 second > > ./tests/bugs/distribute/bug-1088231.t - 8 second > > ./tests/bugs/distribute/bug-1086228.t - 8 second > > ./tests/bugs/cli/bug-1087487.t - 8 second > > ./tests/bugs/cli/bug-1022905.t - 8 second > > ./tests/bugs/bug-1258069.t - 8 second > > ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub- > > info.t - 8 second > > ./tests/basic/xlator-pass-through-sanity.t - 8 second > > ./tests/basic/quota-nfs.t - 8 second > > ./tests/basic/glusterd/arbiter-volume.t - 8 second > > ./tests/basic/ctime/ctime-noatime.t - 8 second > > ./tests/line-coverage/cli-peer-and-volume-operations.t - 7 second > > ./tests/gfid2path/get-gfid-to-path.t - 7 second > > ./tests/bugs/upcall/bug-1369430.t - 7 second > > ./tests/bugs/snapshot/bug-1260848.t - 7 second > > ./tests/bugs/shard/shard-inode-refcount-test.t - 7 second > > ./tests/bugs/shard/bug-1258334.t - 7 second > > ./tests/bugs/replicate/bug-767585-gfid.t - 7 second > > ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t - 7 > > second > > ./tests/bugs/replicate/bug-1250170-fsync.t - 7 second > > ./tests/bugs/posix/bug-1175711.t - 7 second > > ./tests/bugs/nfs/bug-915280.t - 7 second > > ./tests/bugs/md-cache/setxattr-prepoststat.t - 7 second > > ./tests/bugs/md-cache/bug-1211863_unlink.t - 7 second > > ./tests/bugs/glusterfs/bug-848251.t - 7 second > > ./tests/bugs/distribute/bug-1122443.t - 7 second > > ./tests/bugs/changelog/bug-1208470.t - 7 second > > ./tests/bugs/bug-1702299.t - 7 second > > ./tests/bugs/bug-1371806_2.t - 7 second > > ./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t > > - 7 second > > ./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t - 7 > > second > > ./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid- > > node.t - 7 second > > ./tests/bitrot/br-stub.t - 7 second > > ./tests/basic/glusterd/arbiter-volume-probe.t - 7 second > > ./tests/basic/gfapi/libgfapi-fini-hang.t - 7 second > > ./tests/basic/fencing/fencing-crash-conistency.t - 7 second > > ./tests/basic/distribute/file-create.t - 7 second > > ./tests/basic/afr/tarissue.t - 7 second > > ./tests/basic/afr/gfid-heal.t - 7 second > > ./tests/bugs/snapshot/bug-1178079.t - 6 second > > ./tests/bugs/snapshot/bug-1064768.t - 6 second > > ./tests/bugs/shard/bug-1342298.t - 6 second > > ./tests/bugs/shard/bug-1259651.t - 6 second > > 
./tests/bugs/replicate/bug-1686568-send-truncate-on-arbiter-from- > > shd.t - 6 second > > ./tests/bugs/replicate/bug-1626994-info-split-brain.t - 6 second > > ./tests/bugs/replicate/bug-1325792.t - 6 second > > ./tests/bugs/replicate/bug-1101647.t - 6 second > > ./tests/bugs/quota/bug-1243798.t - 6 second > > ./tests/bugs/protocol/bug-1321578.t - 6 second > > ./tests/bugs/nfs/bug-877885.t - 6 second > > ./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t - 6 second > > ./tests/bugs/md-cache/bug-1476324.t - 6 second > > ./tests/bugs/md-cache/afr-stale-read.t - 6 second > > ./tests/bugs/io-cache/bug-858242.t - 6 second > > ./tests/bugs/glusterfs/bug-893378.t - 6 second > > ./tests/bugs/glusterfs/bug-856455.t - 6 second > > ./tests/bugs/glusterd/quorum-value-check.t - 6 second > > ./tests/bugs/ec/bug-1179050.t - 6 second > > ./tests/bugs/distribute/bug-912564.t - 6 second > > ./tests/bugs/distribute/bug-884597.t - 6 second > > ./tests/bugs/distribute/bug-1368012.t - 6 second > > ./tests/bugs/core/bug-986429.t - 6 second > > ./tests/bugs/core/bug-1699025-brick-mux-detach-brick-fd-issue.t - > > 6 second > > ./tests/bugs/core/bug-1168803-snapd-option-validation-fix.t - 6 > > second > > ./tests/bugs/bug-1371806_1.t - 6 second > > ./tests/bugs/bitrot/bug-1229134-bitd-not-support-vol-set.t - 6 > > second > > ./tests/bugs/bitrot/bug-1210684-scrub-pause-resume-error- > > handling.t - 6 second > > ./tests/bitrot/bug-1221914.t - 6 second > > ./tests/basic/trace.t - 6 second > > ./tests/basic/playground/template-xlator-sanity.t - 6 second > > ./tests/basic/ec/nfs.t - 6 second > > ./tests/basic/ec/ec-read-policy.t - 6 second > > ./tests/basic/ec/ec-anonymous-fd.t - 6 second > > ./tests/basic/distribute/non-root-unlink-stale-linkto.t - 6 > > second > > ./tests/basic/changelog/changelog-rename.t - 6 second > > ./tests/basic/afr/heal-info.t - 6 second > > ./tests/basic/afr/afr-read-hash-mode.t - 6 second > > ./tests/gfid2path/gfid2path_nfs.t - 5 second > > ./tests/bugs/upcall/bug-1422776.t - 5 second > > ./tests/bugs/replicate/bug-886998.t - 5 second > > ./tests/bugs/replicate/bug-1365455.t - 5 second > > ./tests/bugs/readdir-ahead/bug-1670253-consistent-metadata.t - 5 > > second > > ./tests/bugs/posix/bug-gfid-path.t - 5 second > > ./tests/bugs/posix/bug-765380.t - 5 second > > ./tests/bugs/nfs/bug-847622.t - 5 second > > ./tests/bugs/nfs/bug-1116503.t - 5 second > > ./tests/bugs/io-stats/bug-1598548.t - 5 second > > ./tests/bugs/glusterfs-server/bug-877992.t - 5 second > > ./tests/bugs/glusterfs-server/bug-873549.t - 5 second > > ./tests/bugs/glusterfs/bug-895235.t - 5 second > > ./tests/bugs/fuse/bug-1126048.t - 5 second > > ./tests/bugs/distribute/bug-907072.t - 5 second > > ./tests/bugs/core/bug-913544.t - 5 second > > ./tests/bugs/core/bug-908146.t - 5 second > > ./tests/bugs/access-control/bug-1051896.t - 5 second > > ./tests/basic/ec/ec-internal-xattrs.t - 5 second > > ./tests/basic/ec/ec-fallocate.t - 5 second > > ./tests/basic/distribute/bug-1265677-use-readdirp.t - 5 second > > ./tests/basic/afr/arbiter-remove-brick.t - 5 second > > ./tests/performance/quick-read.t - 4 second > > ./tests/gfid2path/block-mount-access.t - 4 second > > ./tests/features/delay-gen.t - 4 second > > ./tests/bugs/upcall/bug-upcall-stat.t - 4 second > > ./tests/bugs/upcall/bug-1394131.t - 4 second > > ./tests/bugs/unclassified/bug-1034085.t - 4 second > > ./tests/bugs/snapshot/bug-1111041.t - 4 second > > ./tests/bugs/shard/bug-1272986.t - 4 second > > ./tests/bugs/shard/bug-1256580.t - 4 second > > 
./tests/bugs/shard/bug-1250855.t - 4 second > > ./tests/bugs/shard/bug-1245547.t - 4 second > > ./tests/bugs/rpc/bug-954057.t - 4 second > > ./tests/bugs/replicate/bug-976800.t - 4 second > > ./tests/bugs/replicate/bug-880898.t - 4 second > > ./tests/bugs/replicate/bug-1480525.t - 4 second > > ./tests/bugs/read-only/bug-1134822-read-only-default-in-graph.t - > > 4 second > > ./tests/bugs/readdir-ahead/bug-1446516.t - 4 second > > ./tests/bugs/readdir-ahead/bug-1439640.t - 4 second > > ./tests/bugs/readdir-ahead/bug-1390050.t - 4 second > > ./tests/bugs/quota/bug-1287996.t - 4 second > > ./tests/bugs/quick-read/bug-846240.t - 4 second > > ./tests/bugs/posix/disallow-gfid-volumeid-removexattr.t - 4 > > second > > ./tests/bugs/posix/bug-1619720.t - 4 second > > ./tests/bugs/nl-cache/bug-1451588.t - 4 second > > ./tests/bugs/nfs/zero-atime.t - 4 second > > ./tests/bugs/nfs/subdir-trailing-slash.t - 4 second > > ./tests/bugs/nfs/socket-as-fifo.t - 4 second > > ./tests/bugs/nfs/showmount-many-clients.t - 4 second > > ./tests/bugs/nfs/bug-1210338.t - 4 second > > ./tests/bugs/nfs/bug-1166862.t - 4 second > > ./tests/bugs/nfs/bug-1161092-nfs-acls.t - 4 second > > ./tests/bugs/md-cache/bug-1632503.t - 4 second > > ./tests/bugs/glusterfs-server/bug-864222.t - 4 second > > ./tests/bugs/glusterfs/bug-1482528.t - 4 second > > ./tests/bugs/glusterd/bug-948729/bug-948729-mode-script.t - 4 > > second > > ./tests/bugs/glusterd/bug-948729/bug-948729-force.t - 4 second > > ./tests/bugs/glusterd/bug-1482906-peer-file-blank-line.t - 4 > > second > > ./tests/bugs/glusterd/bug-1091935-brick-order-check-from-cli-to- > > glusterd.t - 4 second > > ./tests/bugs/geo-replication/bug-1296496.t - 4 second > > ./tests/bugs/fuse/bug-1336818.t - 4 second > > ./tests/bugs/fuse/bug-1283103.t - 4 second > > ./tests/bugs/core/io-stats-1322825.t - 4 second > > ./tests/bugs/core/bug-834465.t - 4 second > > ./tests/bugs/core/bug-1135514-allow-setxattr-with-null-value.t - > > 4 second > > ./tests/bugs/core/949327.t - 4 second > > ./tests/bugs/cli/bug-977246.t - 4 second > > ./tests/bugs/cli/bug-961307.t - 4 second > > ./tests/bugs/cli/bug-1004218.t - 4 second > > ./tests/bugs/bug-1138841.t - 4 second > > ./tests/bugs/access-control/bug-1387241.t - 4 second > > ./tests/bitrot/bug-internal-xattrs-check-1243391.t - 4 second > > ./tests/basic/quota-rename.t - 4 second > > ./tests/basic/hardlink-limit.t - 4 second > > ./tests/basic/ec/dht-rename.t - 4 second > > ./tests/basic/distribute/lookup.t - 4 second > > ./tests/line-coverage/meta-max-coverage.t - 3 second > > ./tests/gfid2path/gfid2path_fuse.t - 3 second > > ./tests/bugs/unclassified/bug-991622.t - 3 second > > ./tests/bugs/trace/bug-797171.t - 3 second > > ./tests/bugs/glusterfs-server/bug-861542.t - 3 second > > ./tests/bugs/glusterfs/bug-869724.t - 3 second > > ./tests/bugs/glusterfs/bug-860297.t - 3 second > > ./tests/bugs/glusterfs/bug-844688.t - 3 second > > ./tests/bugs/glusterd/bug-948729/bug-948729.t - 3 second > > ./tests/bugs/distribute/bug-1204140.t - 3 second > > ./tests/bugs/core/bug-924075.t - 3 second > > ./tests/bugs/core/bug-845213.t - 3 second > > ./tests/bugs/core/bug-1421721-mpx-toggle.t - 3 second > > ./tests/bugs/core/bug-1119582.t - 3 second > > ./tests/bugs/core/bug-1117951.t - 3 second > > ./tests/bugs/cli/bug-983317-volume-get.t - 3 second > > ./tests/bugs/cli/bug-867252.t - 3 second > > ./tests/basic/glusterd/check-cloudsync-ancestry.t - 3 second > > ./tests/basic/fops-sanity.t - 3 second > > ./tests/basic/fencing/test-fence-option.t - 3 second > > 
./tests/basic/distribute/debug-xattrs.t - 3 second > > ./tests/basic/afr/ta-check-locks.t - 3 second > > ./tests/line-coverage/volfile-with-all-graph-syntax.t - 2 second > > ./tests/line-coverage/some-features-in-libglusterfs.t - 2 second > > ./tests/bugs/shard/bug-1261773.t - 2 second > > ./tests/bugs/replicate/bug-884328.t - 2 second > > ./tests/bugs/readdir-ahead/bug-1512437.t - 2 second > > ./tests/bugs/nfs/bug-970070.t - 2 second > > ./tests/bugs/nfs/bug-1302948.t - 2 second > > ./tests/bugs/logging/bug-823081.t - 2 second > > ./tests/bugs/glusterfs-server/bug-889996.t - 2 second > > ./tests/bugs/glusterfs/bug-892730.t - 2 second > > ./tests/bugs/glusterfs/bug-811493.t - 2 second > > ./tests/bugs/glusterd/bug-1085330-and-bug-916549.t - 2 second > > ./tests/bugs/distribute/bug-924265.t - 2 second > > ./tests/bugs/core/log-bug-1362520.t - 2 second > > ./tests/bugs/core/bug-903336.t - 2 second > > ./tests/bugs/core/bug-1111557.t - 2 second > > ./tests/bugs/cli/bug-969193.t - 2 second > > ./tests/bugs/cli/bug-949298.t - 2 second > > ./tests/bugs/cli/bug-921215.t - 2 second > > ./tests/bugs/cli/bug-1378842-volume-get-all.t - 2 second > > ./tests/basic/peer-parsing.t - 2 second > > ./tests/basic/md-cache/bug-1418249.t - 2 second > > ./tests/basic/afr/arbiter-cli.t - 2 second > > ./tests/bugs/replicate/ta-inode-refresh-read.t - 1 second > > ./tests/bugs/glusterfs/bug-853690.t - 1 second > > ./tests/bugs/cli/bug-764638.t - 1 second > > ./tests/bugs/cli/bug-1047378.t - 1 second > > ./tests/basic/netgroup_parsing.t - 1 second > > ./tests/basic/gfapi/sink.t - 1 second > > ./tests/basic/exports_parsing.t - 1 second > > ./tests/basic/posixonly.t - 0 second > > ./tests/basic/glusterfsd-args.t - 0 second > > > > > > 2 test(s) failed > > ./tests/basic/uss.t > > ./tests/features/subdir-mount.t > > > > 0 test(s) generated core > > > > > > 5 test(s) needed retry > > ./tests/basic/afr/split-brain-favorite-child-policy.t > > ./tests/basic/ec/self-heal.t > > ./tests/basic/uss.t > > ./tests/basic/volfile-sanity.t > > ./tests/features/subdir-mount.t > > > > Result is 1 > > > > tar: Removing leading `/' from member names > > kernel.core_pattern = /%e-%p.core > > Build step 'Execute shell' marked build as failure > > _______________________________________________ > > maintainers mailing list > > maintainers at gluster.org > > https://lists.gluster.org/mailman/listinfo/maintainers > > > > > > -- > > - Atin (atinm) > > _______________________________________________ > > maintainers mailing list > > maintainers at gluster.org > > https://lists.gluster.org/mailman/listinfo/maintainers > > > _______________________________________________ > maintainers mailing list > maintainers at gluster.org > https://lists.gluster.org/mailman/listinfo/maintainers From amukherj at redhat.com Tue Jun 11 08:51:35 2019 From: amukherj at redhat.com (Atin Mukherjee) Date: Tue, 11 Jun 2019 14:21:35 +0530 Subject: [Gluster-devel] https://build.gluster.org/job/centos7-regression/6404/consoleFull - Problem accessing //job/centos7-regression/6404/consoleFull. Reason: Not found Message-ID: https://bugzilla.redhat.com/show_bug.cgi?id=1719174 The patch which failed the regression is https://review.gluster.org/22851 . -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From linux at eikelenboom.it Tue Jun 11 10:46:38 2019 From: linux at eikelenboom.it (Sander Eikelenboom) Date: Tue, 11 Jun 2019 12:46:38 +0200 Subject: [Gluster-devel] Linux 5.2-RC regression bisected, mounting glusterfs volumes fails after commit: fuse: require /dev/fuse reads to have enough buffer capacity Message-ID: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> L.S., While testing a linux 5.2 kernel I noticed it fails to mount my glusterfs volumes. It repeatedly fails with: [2019-06-11 09:15:27.106946] W [fuse-bridge.c:4993:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) [2019-06-11 09:15:27.106955] W [fuse-bridge.c:4993:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) [2019-06-11 09:15:27.106963] W [fuse-bridge.c:4993:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) [2019-06-11 09:15:27.106971] W [fuse-bridge.c:4993:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) etc. etc. Bisecting turned up as culprit: commit d4b13963f217dd947da5c0cabd1569e914d21699: fuse: require /dev/fuse reads to have enough buffer capacity The glusterfs version i'm using is from Debian stable: ii glusterfs-client 3.8.8-1 amd64 clustered file-system (client package) ii glusterfs-common 3.8.8-1 amd64 GlusterFS common libraries and translator modules A 5.1.* kernel works fine, as does a 5.2-rc4 kernel with said commit reverted. -- Sander From atumball at redhat.com Tue Jun 11 11:40:40 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Tue, 11 Jun 2019 17:10:40 +0530 Subject: [Gluster-devel] Linux 5.2-RC regression bisected, mounting glusterfs volumes fails after commit: fuse: require /dev/fuse reads to have enough buffer capacity In-Reply-To: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> References: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> Message-ID: Thanks for the heads up! We will see how to revert / fix the issue properly for 5.2 kernel. -Amar On Tue, Jun 11, 2019 at 4:34 PM Sander Eikelenboom wrote: > L.S., > > While testing a linux 5.2 kernel I noticed it fails to mount my glusterfs > volumes. > > It repeatedly fails with: > [2019-06-11 09:15:27.106946] W [fuse-bridge.c:4993:fuse_thread_proc] > 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) > [2019-06-11 09:15:27.106955] W [fuse-bridge.c:4993:fuse_thread_proc] > 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) > [2019-06-11 09:15:27.106963] W [fuse-bridge.c:4993:fuse_thread_proc] > 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) > [2019-06-11 09:15:27.106971] W [fuse-bridge.c:4993:fuse_thread_proc] > 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) > etc. > etc. > > Bisecting turned up as culprit: > commit d4b13963f217dd947da5c0cabd1569e914d21699: fuse: require > /dev/fuse reads to have enough buffer capacity > > The glusterfs version i'm using is from Debian stable: > ii glusterfs-client 3.8.8-1 > amd64 clustered file-system (client package) > ii glusterfs-common 3.8.8-1 > amd64 GlusterFS common libraries and translator modules > > > A 5.1.* kernel works fine, as does a 5.2-rc4 kernel with said commit > reverted. 
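For reference, a minimal sketch of double-checking the bisect result on a test machine by reverting the suspect commit on top of 5.2-rc4; the build and install steps are only indicative and depend on the local setup:

    cd linux
    git checkout v5.2-rc4
    git revert d4b13963f217dd947da5c0cabd1569e914d21699
    make -j"$(nproc)"
    sudo make modules_install
    sudo make install

Booting the resulting kernel and mounting a glusterfs volume over FUSE should succeed again, matching the report above.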
> > -- > Sander > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgowtham at redhat.com Wed Jun 12 09:14:04 2019 From: hgowtham at redhat.com (Hari Gowtham) Date: Wed, 12 Jun 2019 14:44:04 +0530 Subject: [Gluster-devel] Removing glupy from release 5.7 Message-ID: Hi, Due to the recent changes we made. we have a build issue because of glupy. As glupy is already removed from master, we are thinking of removing it in 5.7 as well rather than fixing the issue. The release of 5.7 will be delayed as we have send a patch to fix this issue. And if anyone has any concerns, do let us know. -- Regards, Hari Gowtham. From kirr at nexedi.com Wed Jun 12 11:25:51 2019 From: kirr at nexedi.com (Kirill Smelkov) Date: Wed, 12 Jun 2019 11:25:51 +0000 Subject: [Gluster-devel] [PATCH] fuse: require /dev/fuse reads to have enough buffer capacity (take 2) In-Reply-To: References: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> <20190611202738.GA22556@deco.navytux.spb.ru> Message-ID: <20190612112544.GA21465@deco.navytux.spb.ru> On Wed, Jun 12, 2019 at 09:44:49AM +0200, Miklos Szeredi wrote: > On Tue, Jun 11, 2019 at 10:28 PM Kirill Smelkov wrote: > > > Miklos, would 4K -> `sizeof(fuse_in_header) + sizeof(fuse_write_in)` for > > header room change be accepted? > > Yes, next cycle. For 4.2 I'll just push the revert. Thanks Miklos. Please consider queuing the following patch for 5.3. Sander, could you please confirm that glusterfs is not broken with this version of the check? Thanks beforehand, Kirill ---- 8< ---- >From 24a04e8be9bbf6e67de9e1908dcbe95d426d2521 Mon Sep 17 00:00:00 2001 From: Kirill Smelkov Date: Wed, 27 Mar 2019 10:15:15 +0000 Subject: [PATCH] fuse: require /dev/fuse reads to have enough buffer capacity (take 2) [ This retries commit d4b13963f217 which was reverted in 766741fcaa1f. In this version we require only `sizeof(fuse_in_header) + sizeof(fuse_write_in)` instead of 4K for FUSE request header room, because, contrary to libfuse and kernel client behaviour, GlusterFS actually provides only so much room for request header. ] A FUSE filesystem server queues /dev/fuse sys_read calls to get filesystem requests to handle. It does not know in advance what would be that request as it can be anything that client issues - LOOKUP, READ, WRITE, ... Many requests are short and retrieve data from the filesystem. However WRITE and NOTIFY_REPLY write data into filesystem. Before getting into operation phase, FUSE filesystem server and kernel client negotiate what should be the maximum write size the client will ever issue. After negotiation the contract in between server/client is that the filesystem server then should queue /dev/fuse sys_read calls with enough buffer capacity to receive any client request - WRITE in particular, while FUSE client should not, in particular, send WRITE requests with > negotiated max_write payload. FUSE client in kernel and libfuse historically reserve 4K for request header. 
However an existing filesystem server - GlusterFS - was found which reserves only 80 bytes for header room (= `sizeof(fuse_in_header) + sizeof(fuse_write_in)`). https://lore.kernel.org/linux-fsdevel/20190611202738.GA22556 at deco.navytux.spb.ru/ https://github.com/gluster/glusterfs/blob/v3.8.15-0-gd174f021a/xlators/mount/fuse/src/fuse-bridge.c#L4894 Since `sizeof(fuse_in_header) + sizeof(fuse_write_in)` == `sizeof(fuse_in_header) + sizeof(fuse_read_in)` == `sizeof(fuse_in_header) + sizeof(fuse_notify_retrieve_in)` is the absolute minimum any sane filesystem should be using for header room, the contract is that filesystem server should queue sys_reads with `sizeof(fuse_in_header) + sizeof(fuse_write_in)` + max_write buffer. If the filesystem server does not follow this contract, what can happen is that fuse_dev_do_read will see that request size is > buffer size, and then it will return EIO to client who issued the request but won't indicate in any way that there is a problem to filesystem server. This can be hard to diagnose because for some requests, e.g. for NOTIFY_REPLY which mimics WRITE, there is no client thread that is waiting for request completion and that EIO goes nowhere, while on filesystem server side things look like the kernel is not replying back after successful NOTIFY_RETRIEVE request made by the server. We can make the problem easy to diagnose if we indicate via error return to filesystem server when it is violating the contract. This should not practically cause problems because if a filesystem server is using shorter buffer, writes to it were already very likely to cause EIO, and if the filesystem is read-only it should be too following FUSE_MIN_READ_BUFFER minimum buffer size. Please see [1] for context where the problem of stuck filesystem was hit for real (because kernel client was incorrectly sending more than max_write data with NOTIFY_REPLY; see also previous patch), how the situation was traced and for more involving patch that did not make it into the tree. [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2 Signed-off-by: Kirill Smelkov Cc: Han-Wen Nienhuys Cc: Jakob Unterwurzacher Signed-off-by: Miklos Szeredi --- fs/fuse/dev.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index ea8237513dfa..15531ba560b5 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -1317,6 +1317,25 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, unsigned reqsize; unsigned int hash; + /* + * Require sane minimum read buffer - that has capacity for fixed part + * of any request header + negotiated max_write room for data. If the + * requirement is not satisfied return EINVAL to the filesystem server + * to indicate that it is not following FUSE server/client contract. + * Don't dequeue / abort any request. + * + * Historically libfuse reserves 4K for fixed header room, but e.g. + * GlusterFS reserves only 80 bytes + * + * = `sizeof(fuse_in_header) + sizeof(fuse_write_in)` + * + * which is the absolute minimum any sane filesystem should be using + * for header room. 
+ */ + if (nbytes < max_t(size_t, FUSE_MIN_READ_BUFFER, + sizeof(fuse_in_header) + sizeof(fuse_write_in) + fc->max_write)) + return -EINVAL; + restart: spin_lock(&fiq->waitq.lock); err = -EAGAIN; -- 2.20.1 From linux at eikelenboom.it Wed Jun 12 12:11:37 2019 From: linux at eikelenboom.it (Sander Eikelenboom) Date: Wed, 12 Jun 2019 14:11:37 +0200 Subject: [Gluster-devel] [PATCH] fuse: require /dev/fuse reads to have enough buffer capacity (take 2) In-Reply-To: <20190612112544.GA21465@deco.navytux.spb.ru> References: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> <20190611202738.GA22556@deco.navytux.spb.ru> <20190612112544.GA21465@deco.navytux.spb.ru> Message-ID: <97c87eb3-5b95-c848-8c50-ed7b535220b0@eikelenboom.it> On 12/06/2019 13:25, Kirill Smelkov wrote: > On Wed, Jun 12, 2019 at 09:44:49AM +0200, Miklos Szeredi wrote: >> On Tue, Jun 11, 2019 at 10:28 PM Kirill Smelkov wrote: >> >>> Miklos, would 4K -> `sizeof(fuse_in_header) + sizeof(fuse_write_in)` for >>> header room change be accepted? >> >> Yes, next cycle. For 4.2 I'll just push the revert. > > Thanks Miklos. Please consider queuing the following patch for 5.3. > Sander, could you please confirm that glusterfs is not broken with this > version of the check? > > Thanks beforehand, > Kirill Sure will give it a spin this evening and report back. -- Sander From linux at eikelenboom.it Wed Jun 12 13:03:49 2019 From: linux at eikelenboom.it (Sander Eikelenboom) Date: Wed, 12 Jun 2019 15:03:49 +0200 Subject: [Gluster-devel] [PATCH] fuse: require /dev/fuse reads to have enough buffer capacity (take 2) In-Reply-To: <20190612112544.GA21465@deco.navytux.spb.ru> References: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> <20190611202738.GA22556@deco.navytux.spb.ru> <20190612112544.GA21465@deco.navytux.spb.ru> Message-ID: On 12/06/2019 13:25, Kirill Smelkov wrote: > On Wed, Jun 12, 2019 at 09:44:49AM +0200, Miklos Szeredi wrote: >> On Tue, Jun 11, 2019 at 10:28 PM Kirill Smelkov wrote: >> >>> Miklos, would 4K -> `sizeof(fuse_in_header) + sizeof(fuse_write_in)` for >>> header room change be accepted? >> >> Yes, next cycle. For 4.2 I'll just push the revert. > > Thanks Miklos. Please consider queuing the following patch for 5.3. > Sander, could you please confirm that glusterfs is not broken with this > version of the check? > > Thanks beforehand, > Kirill Hmm unfortunately it doesn't build, see below. -- Sander In file included from ./include/linux/list.h:9:0, from ./include/linux/wait.h:7, from ./include/linux/wait_bit.h:8, from ./include/linux/fs.h:6, from fs/fuse/fuse_i.h:17, from fs/fuse/dev.c:9: fs/fuse/dev.c: In function ?fuse_dev_do_read?: fs/fuse/dev.c:1336:14: error: ?fuse_in_header? undeclared (first use in this function) sizeof(fuse_in_header) + sizeof(fuse_write_in) + fc->max_write)) ^ ./include/linux/kernel.h:818:40: note: in definition of macro ?__typecheck? (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1))) ^ ./include/linux/kernel.h:842:24: note: in expansion of macro ?__safe_cmp? __builtin_choose_expr(__safe_cmp(x, y), \ ^~~~~~~~~~ ./include/linux/kernel.h:918:27: note: in expansion of macro ?__careful_cmp? #define max_t(type, x, y) __careful_cmp((type)(x), (type)(y), >) ^~~~~~~~~~~~~ fs/fuse/dev.c:1335:15: note: in expansion of macro ?max_t? 
if (nbytes < max_t(size_t, FUSE_MIN_READ_BUFFER, ^~~~~ fs/fuse/dev.c:1336:14: note: each undeclared identifier is reported only once for each function it appears in sizeof(fuse_in_header) + sizeof(fuse_write_in) + fc->max_write)) ^ ./include/linux/kernel.h:818:40: note: in definition of macro ?__typecheck? (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1))) ^ ./include/linux/kernel.h:842:24: note: in expansion of macro ?__safe_cmp? __builtin_choose_expr(__safe_cmp(x, y), \ ^~~~~~~~~~ ./include/linux/kernel.h:918:27: note: in expansion of macro ?__careful_cmp? #define max_t(type, x, y) __careful_cmp((type)(x), (type)(y), >) ^~~~~~~~~~~~~ fs/fuse/dev.c:1335:15: note: in expansion of macro ?max_t? if (nbytes < max_t(size_t, FUSE_MIN_READ_BUFFER, ^~~~~ fs/fuse/dev.c:1336:39: error: ?fuse_write_in? undeclared (first use in this function) sizeof(fuse_in_header) + sizeof(fuse_write_in) + fc->max_write)) ^ ./include/linux/kernel.h:818:40: note: in definition of macro ?__typecheck? (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1))) ^ ./include/linux/kernel.h:842:24: note: in expansion of macro ?__safe_cmp? __builtin_choose_expr(__safe_cmp(x, y), \ ^~~~~~~~~~ ./include/linux/kernel.h:918:27: note: in expansion of macro ?__careful_cmp? #define max_t(type, x, y) __careful_cmp((type)(x), (type)(y), >) ^~~~~~~~~~~~~ fs/fuse/dev.c:1335:15: note: in expansion of macro ?max_t? if (nbytes < max_t(size_t, FUSE_MIN_READ_BUFFER, ^~~~~ ./include/linux/kernel.h:842:2: error: first argument to ?__builtin_choose_expr? not a constant __builtin_choose_expr(__safe_cmp(x, y), \ ^ ./include/linux/kernel.h:918:27: note: in expansion of macro ?__careful_cmp? #define max_t(type, x, y) __careful_cmp((type)(x), (type)(y), >) ^~~~~~~~~~~~~ fs/fuse/dev.c:1335:15: note: in expansion of macro ?max_t? if (nbytes < max_t(size_t, FUSE_MIN_READ_BUFFER, ^~~~~ scripts/Makefile.build:278: recipe for target 'fs/fuse/dev.o' failed make[3]: *** [fs/fuse/dev.o] Error 1 scripts/Makefile.build:489: recipe for target 'fs/fuse' failed make[2]: *** [fs/fuse] Error 2 From ndevos at redhat.com Wed Jun 12 13:34:37 2019 From: ndevos at redhat.com (Niels de Vos) Date: Wed, 12 Jun 2019 15:34:37 +0200 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: Message-ID: <20190612133437.GK8725@ndevos-x270> On Wed, Jun 12, 2019 at 02:44:04PM +0530, Hari Gowtham wrote: > Hi, > > Due to the recent changes we made. we have a build issue because of glupy. > As glupy is already removed from master, we are thinking of removing > it in 5.7 as well rather than fixing the issue. > > The release of 5.7 will be delayed as we have send a patch to fix this issue. > And if anyone has any concerns, do let us know. Could you link to the BZ with the build error and patches that attempt fixing it? We normally do not remove features with minor updates. Fixing the build error would be the preferred approach. Thanks, Niels From hgowtham at redhat.com Wed Jun 12 14:24:17 2019 From: hgowtham at redhat.com (Hari Gowtham) Date: Wed, 12 Jun 2019 19:54:17 +0530 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: <20190612133437.GK8725@ndevos-x270> References: <20190612133437.GK8725@ndevos-x270> Message-ID: We haven't sent any patch to fix it. Waiting for the decision to be made. 
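One quick way to narrow down where glupy goes missing is to compare the 'make dist' tarball with the source tree; a rough sketch, with an assumed tarball name:

    # if glupy's Makefile.in is absent from the tarball, config.status cannot
    # create xlators/features/glupy/Makefile and the rpm build fails later
    tar tzf glusterfs-5.7.tar.gz | grep -i glupy

    # what the source tree itself ships
    git ls-files | grep -i glupy

An empty first listing would point at the tarball generation step rather than at the spec file.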
The bz: https://bugzilla.redhat.com/show_bug.cgi?id=1719778 The link to the build log: https://build.gluster.org/job/strfmt_errors/18888/artifact/RPMS/el6/i686/build.log The last few messages in the log: config.status: creating xlators/features/changelog/lib/src/Makefile config.status: creating xlators/features/changetimerecorder/Makefile config.status: creating xlators/features/changetimerecorder/src/Makefile BUILDSTDERR: config.status: error: cannot find input file: xlators/features/glupy/Makefile.in RPM build errors: BUILDSTDERR: error: Bad exit status from /var/tmp/rpm-tmp.kGZI5V (%build) BUILDSTDERR: Bad exit status from /var/tmp/rpm-tmp.kGZI5V (%build) Child return code was: 1 EXCEPTION: [Error()] Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/mockbuild/trace_decorator.py", line 96, in trace result = func(*args, **kw) File "/usr/lib/python3.6/site-packages/mockbuild/util.py", line 736, in do_with_status raise exception.Error("Command failed: \n # %s\n%s" % (command, output), child.returncode) mockbuild.exception.Error: Command failed: # bash --login -c /usr/bin/rpmbuild -bb --target i686 --nodeps /builddir/build/SPECS/glusterfs.spec On Wed, Jun 12, 2019 at 7:04 PM Niels de Vos wrote: > > On Wed, Jun 12, 2019 at 02:44:04PM +0530, Hari Gowtham wrote: > > Hi, > > > > Due to the recent changes we made. we have a build issue because of glupy. > > As glupy is already removed from master, we are thinking of removing > > it in 5.7 as well rather than fixing the issue. > > > > The release of 5.7 will be delayed as we have send a patch to fix this issue. > > And if anyone has any concerns, do let us know. > > Could you link to the BZ with the build error and patches that attempt > fixing it? > > We normally do not remove features with minor updates. Fixing the build > error would be the preferred approach. > > Thanks, > Niels -- Regards, Hari Gowtham. From kirr at nexedi.com Wed Jun 12 14:12:26 2019 From: kirr at nexedi.com (Kirill Smelkov) Date: Wed, 12 Jun 2019 14:12:26 +0000 Subject: [Gluster-devel] [PATCH] fuse: require /dev/fuse reads to have enough buffer capacity (take 2) In-Reply-To: References: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> <20190611202738.GA22556@deco.navytux.spb.ru> <20190612112544.GA21465@deco.navytux.spb.ru> Message-ID: <20190612141220.GA25389@deco.navytux.spb.ru> On Wed, Jun 12, 2019 at 03:03:49PM +0200, Sander Eikelenboom wrote: > On 12/06/2019 13:25, Kirill Smelkov wrote: > > On Wed, Jun 12, 2019 at 09:44:49AM +0200, Miklos Szeredi wrote: > >> On Tue, Jun 11, 2019 at 10:28 PM Kirill Smelkov wrote: > >> > >>> Miklos, would 4K -> `sizeof(fuse_in_header) + sizeof(fuse_write_in)` for > >>> header room change be accepted? > >> > >> Yes, next cycle. For 4.2 I'll just push the revert. > > > > Thanks Miklos. Please consider queuing the following patch for 5.3. > > Sander, could you please confirm that glusterfs is not broken with this > > version of the check? > > > > Thanks beforehand, > > Kirill > > > Hmm unfortunately it doesn't build, see below. > [...] > fs/fuse/dev.c:1336:14: error: ?fuse_in_header? undeclared (first use in this function) > sizeof(fuse_in_header) + sizeof(fuse_write_in) + fc->max_write)) Sorry, my bad, it was missing "struct" before fuse_in_header. I originally compile-tested the patch with `make -j4`, was distracted onto other topic and did not see the error after returning due to long tail of successful CC lines. Apologize for the inconvenience. 
Below is a fixed patch that was both compile-tested and runtime-tested with my FUSE workloads (non-glusterfs). Kirill ---- 8< ---- >From 98fd29bb6789d5f6c346274b99d47008ad856607 Mon Sep 17 00:00:00 2001 From: Kirill Smelkov Date: Wed, 12 Jun 2019 17:06:18 +0300 Subject: [PATCH v2] fuse: require /dev/fuse reads to have enough buffer capacity (take 2) [ This retries commit d4b13963f217 which was reverted in 766741fcaa1f. In this version we require only `sizeof(fuse_in_header) + sizeof(fuse_write_in)` instead of 4K for FUSE request header room, because, contrary to libfuse and kernel client behaviour, GlusterFS actually provides only so much room for request header. ] A FUSE filesystem server queues /dev/fuse sys_read calls to get filesystem requests to handle. It does not know in advance what would be that request as it can be anything that client issues - LOOKUP, READ, WRITE, ... Many requests are short and retrieve data from the filesystem. However WRITE and NOTIFY_REPLY write data into filesystem. Before getting into operation phase, FUSE filesystem server and kernel client negotiate what should be the maximum write size the client will ever issue. After negotiation the contract in between server/client is that the filesystem server then should queue /dev/fuse sys_read calls with enough buffer capacity to receive any client request - WRITE in particular, while FUSE client should not, in particular, send WRITE requests with > negotiated max_write payload. FUSE client in kernel and libfuse historically reserve 4K for request header. However an existing filesystem server - GlusterFS - was found which reserves only 80 bytes for header room (= `sizeof(fuse_in_header) + sizeof(fuse_write_in)`). https://lore.kernel.org/linux-fsdevel/20190611202738.GA22556 at deco.navytux.spb.ru/ https://github.com/gluster/glusterfs/blob/v3.8.15-0-gd174f021a/xlators/mount/fuse/src/fuse-bridge.c#L4894 Since `sizeof(fuse_in_header) + sizeof(fuse_write_in)` == `sizeof(fuse_in_header) + sizeof(fuse_read_in)` == `sizeof(fuse_in_header) + sizeof(fuse_notify_retrieve_in)` is the absolute minimum any sane filesystem should be using for header room, the contract is that filesystem server should queue sys_reads with `sizeof(fuse_in_header) + sizeof(fuse_write_in)` + max_write buffer. If the filesystem server does not follow this contract, what can happen is that fuse_dev_do_read will see that request size is > buffer size, and then it will return EIO to client who issued the request but won't indicate in any way that there is a problem to filesystem server. This can be hard to diagnose because for some requests, e.g. for NOTIFY_REPLY which mimics WRITE, there is no client thread that is waiting for request completion and that EIO goes nowhere, while on filesystem server side things look like the kernel is not replying back after successful NOTIFY_RETRIEVE request made by the server. We can make the problem easy to diagnose if we indicate via error return to filesystem server when it is violating the contract. This should not practically cause problems because if a filesystem server is using shorter buffer, writes to it were already very likely to cause EIO, and if the filesystem is read-only it should be too following FUSE_MIN_READ_BUFFER minimum buffer size. 
Please see [1] for context where the problem of stuck filesystem was hit for real (because kernel client was incorrectly sending more than max_write data with NOTIFY_REPLY; see also previous patch), how the situation was traced and for more involving patch that did not make it into the tree. [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2 Signed-off-by: Kirill Smelkov Cc: Han-Wen Nienhuys Cc: Jakob Unterwurzacher --- fs/fuse/dev.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index ea8237513dfa..b2b2344eadcf 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -1317,6 +1317,26 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, unsigned reqsize; unsigned int hash; + /* + * Require sane minimum read buffer - that has capacity for fixed part + * of any request header + negotiated max_write room for data. If the + * requirement is not satisfied return EINVAL to the filesystem server + * to indicate that it is not following FUSE server/client contract. + * Don't dequeue / abort any request. + * + * Historically libfuse reserves 4K for fixed header room, but e.g. + * GlusterFS reserves only 80 bytes + * + * = `sizeof(fuse_in_header) + sizeof(fuse_write_in)` + * + * which is the absolute minimum any sane filesystem should be using + * for header room. + */ + if (nbytes < max_t(size_t, FUSE_MIN_READ_BUFFER, + sizeof(struct fuse_in_header) + sizeof(struct fuse_write_in) + + fc->max_write)) + return -EINVAL; + restart: spin_lock(&fiq->waitq.lock); err = -EAGAIN; -- 2.20.1 From ndevos at redhat.com Wed Jun 12 15:11:42 2019 From: ndevos at redhat.com (Niels de Vos) Date: Wed, 12 Jun 2019 17:11:42 +0200 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612133437.GK8725@ndevos-x270> Message-ID: <20190612151142.GL8725@ndevos-x270> On Wed, Jun 12, 2019 at 07:54:17PM +0530, Hari Gowtham wrote: > We haven't sent any patch to fix it. > Waiting for the decision to be made. > The bz: https://bugzilla.redhat.com/show_bug.cgi?id=1719778 > The link to the build log: > https://build.gluster.org/job/strfmt_errors/18888/artifact/RPMS/el6/i686/build.log > > The last few messages in the log: > > config.status: creating xlators/features/changelog/lib/src/Makefile > config.status: creating xlators/features/changetimerecorder/Makefile > config.status: creating xlators/features/changetimerecorder/src/Makefile > BUILDSTDERR: config.status: error: cannot find input file: > xlators/features/glupy/Makefile.in > RPM build errors: > BUILDSTDERR: error: Bad exit status from /var/tmp/rpm-tmp.kGZI5V (%build) > BUILDSTDERR: Bad exit status from /var/tmp/rpm-tmp.kGZI5V (%build) > Child return code was: 1 > EXCEPTION: [Error()] > Traceback (most recent call last): > File "/usr/lib/python3.6/site-packages/mockbuild/trace_decorator.py", > line 96, in trace > result = func(*args, **kw) > File "/usr/lib/python3.6/site-packages/mockbuild/util.py", line 736, > in do_with_status > raise exception.Error("Command failed: \n # %s\n%s" % (command, > output), child.returncode) > mockbuild.exception.Error: Command failed: > # bash --login -c /usr/bin/rpmbuild -bb --target i686 --nodeps > /builddir/build/SPECS/glusterfs.spec Those messages are caused by missing files. The 'make dist' that generates the tarball in the previous step did not included the glupy files. 
https://build.gluster.org/job/strfmt_errors/18888/console contains the following message: configure: WARNING: --------------------------------------------------------------------------------- cannot build glupy. python 3.6 and python-devel/python-dev package are required. --------------------------------------------------------------------------------- I am not sure if there have been any recent backports to release-5 that introduced this behaviour. Maybe it is related to the builder where the tarball is generated. The job seems to detect python-3.6.8, which is not included in CentOS-7 for all I know? Maybe someone else understands how this can happen? HTH, Niels > > On Wed, Jun 12, 2019 at 7:04 PM Niels de Vos wrote: > > > > On Wed, Jun 12, 2019 at 02:44:04PM +0530, Hari Gowtham wrote: > > > Hi, > > > > > > Due to the recent changes we made. we have a build issue because of glupy. > > > As glupy is already removed from master, we are thinking of removing > > > it in 5.7 as well rather than fixing the issue. > > > > > > The release of 5.7 will be delayed as we have send a patch to fix this issue. > > > And if anyone has any concerns, do let us know. > > > > Could you link to the BZ with the build error and patches that attempt > > fixing it? > > > > We normally do not remove features with minor updates. Fixing the build > > error would be the preferred approach. > > > > Thanks, > > Niels > > > > -- > Regards, > Hari Gowtham. From linux at eikelenboom.it Wed Jun 12 16:28:17 2019 From: linux at eikelenboom.it (Sander Eikelenboom) Date: Wed, 12 Jun 2019 18:28:17 +0200 Subject: [Gluster-devel] [PATCH] fuse: require /dev/fuse reads to have enough buffer capacity (take 2) In-Reply-To: <20190612141220.GA25389@deco.navytux.spb.ru> References: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> <20190611202738.GA22556@deco.navytux.spb.ru> <20190612112544.GA21465@deco.navytux.spb.ru> <20190612141220.GA25389@deco.navytux.spb.ru> Message-ID: On 12/06/2019 16:12, Kirill Smelkov wrote: > On Wed, Jun 12, 2019 at 03:03:49PM +0200, Sander Eikelenboom wrote: >> On 12/06/2019 13:25, Kirill Smelkov wrote: >>> On Wed, Jun 12, 2019 at 09:44:49AM +0200, Miklos Szeredi wrote: >>>> On Tue, Jun 11, 2019 at 10:28 PM Kirill Smelkov wrote: >>>> >>>>> Miklos, would 4K -> `sizeof(fuse_in_header) + sizeof(fuse_write_in)` for >>>>> header room change be accepted? >>>> >>>> Yes, next cycle. For 4.2 I'll just push the revert. >>> >>> Thanks Miklos. Please consider queuing the following patch for 5.3. >>> Sander, could you please confirm that glusterfs is not broken with this >>> version of the check? >>> >>> Thanks beforehand, >>> Kirill >> >> >> Hmm unfortunately it doesn't build, see below. >> [...] >> fs/fuse/dev.c:1336:14: error: ?fuse_in_header? undeclared (first use in this function) >> sizeof(fuse_in_header) + sizeof(fuse_write_in) + fc->max_write)) > > Sorry, my bad, it was missing "struct" before fuse_in_header. I > originally compile-tested the patch with `make -j4`, was distracted onto > other topic and did not see the error after returning due to long tail > of successful CC lines. Apologize for the inconvenience. Below is a > fixed patch that was both compile-tested and runtime-tested with my FUSE > workloads (non-glusterfs). > > Kirill > Just tested and it works for me, thanks ! 
-- Sander From kirr at nexedi.com Wed Jun 12 17:03:04 2019 From: kirr at nexedi.com (Kirill Smelkov) Date: Wed, 12 Jun 2019 17:03:04 +0000 Subject: [Gluster-devel] [PATCH] fuse: require /dev/fuse reads to have enough buffer capacity (take 2) In-Reply-To: References: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> <20190611202738.GA22556@deco.navytux.spb.ru> <20190612112544.GA21465@deco.navytux.spb.ru> <20190612141220.GA25389@deco.navytux.spb.ru> Message-ID: <20190612170259.GA27637@deco.navytux.spb.ru> On Wed, Jun 12, 2019 at 06:28:17PM +0200, Sander Eikelenboom wrote: > On 12/06/2019 16:12, Kirill Smelkov wrote: > > On Wed, Jun 12, 2019 at 03:03:49PM +0200, Sander Eikelenboom wrote: > >> On 12/06/2019 13:25, Kirill Smelkov wrote: > >>> On Wed, Jun 12, 2019 at 09:44:49AM +0200, Miklos Szeredi wrote: > >>>> On Tue, Jun 11, 2019 at 10:28 PM Kirill Smelkov wrote: > >>>> > >>>>> Miklos, would 4K -> `sizeof(fuse_in_header) + sizeof(fuse_write_in)` for > >>>>> header room change be accepted? > >>>> > >>>> Yes, next cycle. For 4.2 I'll just push the revert. > >>> > >>> Thanks Miklos. Please consider queuing the following patch for 5.3. > >>> Sander, could you please confirm that glusterfs is not broken with this > >>> version of the check? > >>> > >>> Thanks beforehand, > >>> Kirill > >> > >> > >> Hmm unfortunately it doesn't build, see below. > >> [...] > >> fs/fuse/dev.c:1336:14: error: ?fuse_in_header? undeclared (first use in this function) > >> sizeof(fuse_in_header) + sizeof(fuse_write_in) + fc->max_write)) > > > > Sorry, my bad, it was missing "struct" before fuse_in_header. I > > originally compile-tested the patch with `make -j4`, was distracted onto > > other topic and did not see the error after returning due to long tail > > of successful CC lines. Apologize for the inconvenience. Below is a > > fixed patch that was both compile-tested and runtime-tested with my FUSE > > workloads (non-glusterfs). > > > > Kirill > > > > Just tested and it works for me, thanks ! Thanks for feedback. Kirill From atumball at redhat.com Wed Jun 12 17:41:09 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Wed, 12 Jun 2019 23:11:09 +0530 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: <20190612151142.GL8725@ndevos-x270> References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> Message-ID: On Wed, Jun 12, 2019 at 8:42 PM Niels de Vos wrote: > On Wed, Jun 12, 2019 at 07:54:17PM +0530, Hari Gowtham wrote: > > We haven't sent any patch to fix it. > > Waiting for the decision to be made. 
> > The bz: https://bugzilla.redhat.com/show_bug.cgi?id=1719778 > > The link to the build log: > > > https://build.gluster.org/job/strfmt_errors/18888/artifact/RPMS/el6/i686/build.log > > > > The last few messages in the log: > > > > config.status: creating xlators/features/changelog/lib/src/Makefile > > config.status: creating xlators/features/changetimerecorder/Makefile > > config.status: creating xlators/features/changetimerecorder/src/Makefile > > BUILDSTDERR: config.status: error: cannot find input file: > > xlators/features/glupy/Makefile.in > > RPM build errors: > > BUILDSTDERR: error: Bad exit status from /var/tmp/rpm-tmp.kGZI5V (%build) > > BUILDSTDERR: Bad exit status from /var/tmp/rpm-tmp.kGZI5V (%build) > > Child return code was: 1 > > EXCEPTION: [Error()] > > Traceback (most recent call last): > > File "/usr/lib/python3.6/site-packages/mockbuild/trace_decorator.py", > > line 96, in trace > > result = func(*args, **kw) > > File "/usr/lib/python3.6/site-packages/mockbuild/util.py", line 736, > > in do_with_status > > raise exception.Error("Command failed: \n # %s\n%s" % (command, > > output), child.returncode) > > mockbuild.exception.Error: Command failed: > > # bash --login -c /usr/bin/rpmbuild -bb --target i686 --nodeps > > /builddir/build/SPECS/glusterfs.spec > > Those messages are caused by missing files. The 'make dist' that > generates the tarball in the previous step did not included the glupy > files. > > https://build.gluster.org/job/strfmt_errors/18888/console contains the > following message: > > configure: WARNING: > > --------------------------------------------------------------------------------- > cannot build glupy. python 3.6 and python-devel/python-dev > package are required. > > --------------------------------------------------------------------------------- > > I am not sure if there have been any recent backports to release-5 that > introduced this behaviour. Maybe it is related to the builder where the > tarball is generated. The job seems to detect python-3.6.8, which is not > included in CentOS-7 for all I know? > > We recently noticed that in one of the package update on builder (ie, centos7.x machines), python3.6 got installed as a dependency. So, yes, it is possible to have python3 in centos7 now. -Amar > Maybe someone else understands how this can happen? > > HTH, > Niels > > > > > > On Wed, Jun 12, 2019 at 7:04 PM Niels de Vos wrote: > > > > > > On Wed, Jun 12, 2019 at 02:44:04PM +0530, Hari Gowtham wrote: > > > > Hi, > > > > > > > > Due to the recent changes we made. we have a build issue because of > glupy. > > > > As glupy is already removed from master, we are thinking of removing > > > > it in 5.7 as well rather than fixing the issue. > > > > > > > > The release of 5.7 will be delayed as we have send a patch to fix > this issue. > > > > And if anyone has any concerns, do let us know. > > > > > > Could you link to the BZ with the build error and patches that attempt > > > fixing it? > > > > > > We normally do not remove features with minor updates. Fixing the build > > > error would be the preferred approach. > > > > > > Thanks, > > > Niels > > > > > > > > -- > > Regards, > > Hari Gowtham. 
> _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkeithle at redhat.com Wed Jun 12 18:36:17 2019 From: kkeithle at redhat.com (Kaleb Keithley) Date: Wed, 12 Jun 2019 11:36:17 -0700 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> Message-ID: On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < atumball at redhat.com> wrote: > > We recently noticed that in one of the package update on builder (ie, > centos7.x machines), python3.6 got installed as a dependency. So, yes, it > is possible to have python3 in centos7 now. > EPEL updated from python34 to python36 recently, but C7 doesn't have python3 in the base. I don't think we've ever used EPEL packages for building. And GlusterFS-5 isn't python3 ready. -- Kaleb -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkeithle at redhat.com Wed Jun 12 23:09:55 2019 From: kkeithle at redhat.com (Kaleb Keithley) Date: Wed, 12 Jun 2019 16:09:55 -0700 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> Message-ID: On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley wrote: > > On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < > atumball at redhat.com> wrote: > >> >> We recently noticed that in one of the package update on builder (ie, >> centos7.x machines), python3.6 got installed as a dependency. So, yes, it >> is possible to have python3 in centos7 now. >> > > EPEL updated from python34 to python36 recently, but C7 doesn't have > python3 in the base. I don't think we've ever used EPEL packages for > building. > > And GlusterFS-5 isn't python3 ready. > Correction: GlusterFS-5 is mostly or completely python3 ready. FWIW, python33 is available on both RHEL7 and CentOS7 from the Software Collection Library (SCL), and python34 and now python36 are available from EPEL. But packages built for the CentOS Storage SIG have never used the SCL or EPEL (EPEL not allowed) and the shebangs in the .py files are converted from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild %prep stage. All the python dependencies for the packages remain the python2 flavors. AFAIK the centos-regression machines ought to be building the same way. -- Kaleb -------------- next part -------------- An HTML attachment was scrubbed... URL: From atin.mukherjee83 at gmail.com Thu Jun 13 02:30:17 2019 From: atin.mukherjee83 at gmail.com (Atin Mukherjee) Date: Thu, 13 Jun 2019 08:00:17 +0530 Subject: [Gluster-devel] Fwd: Details on the Scan.coverity.com June 2019 Upgrade In-Reply-To: <656ea81285804a36a32a2e5aface86fc@1061282284> References: <656ea81285804a36a32a2e5aface86fc@1061282284> Message-ID: Fyi..no scan for 3-4 days starting from June 17th for the upgrade. Post that we may have to do some changes to use the new build tool? 
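For reference, a typical Coverity Scan submission with a refreshed build tool looks roughly like this (a sketch; the install path and archive name are assumptions, only the cov-build wrapper usage and the tarball upload step are standard):

    $ export PATH=/opt/cov-analysis-linux64/bin:$PATH   # wherever the new tool is unpacked
    $ cov-build --dir cov-int make -j4
    $ tar czf glusterfs-cov.tgz cov-int
    # then upload the archive through the project page on scan.coverity.com
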
---------- Forwarded message --------- From: Peter Degen-Portnoy Date: Thu, 13 Jun 2019 at 00:48 Subject: Details on the Scan.coverity.com June 2019 Upgrade June 17, 9 a.m. MDT View this email in a browser [image: Synopsys] Coverity Scan 2019 Upgrade Dear Atin Mukherjee, Thank you for being an active user of scan.coverity.com . We have some important news to share with you. As you know, the version of Coverity used by the Scan website is somewhat out of date. So we?re pleased to announce that we?re upgrading to the latest stable production version. We?re currently verifying the upgrade. Here?s what you can expect: We plan to start the upgrade *Monday, June 17, around 9 a.m. MDT*. We expect the process to last 3?4 days. During this time, scan.coverity.com may be offline and unavailable. If possible, we?ll provide access to scan.coverity.com in read-only mode. After the upgrade, you should use the new Build tool that matches the upgraded version of Coverity. Specifically, the build tool from Coverity 8.7 will no longer be supported. You can find details about the upgrade and the new build tool on the Scan Status Community page. You can also subscribe to scan.coverity.com status updates on this page by clicking the ?Follow? button and selecting ?Every Post.? Please take a look at the information on the Scan Status Community page. If you have any questions about the upgrade, post them on the Synopsys Software Integrity Community . We?ll answer as soon as we can. Sincerely yours, The Scan Administrators scan-admin at coverity.com Follow ? 2019 Synopsys, Inc. All Rights Reserved 690 E Middlefield Rd, Mountain View, CA 94043 About Privacy Unsubscribe -- --Atin -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkhandel at redhat.com Thu Jun 13 03:13:25 2019 From: dkhandel at redhat.com (Deepshikha Khandelwal) Date: Thu, 13 Jun 2019 08:43:25 +0530 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> Message-ID: On Thu, Jun 13, 2019 at 4:41 AM Kaleb Keithley wrote: > > > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley > wrote: > >> >> On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < >> atumball at redhat.com> wrote: >> >>> >>> We recently noticed that in one of the package update on builder (ie, >>> centos7.x machines), python3.6 got installed as a dependency. So, yes, it >>> is possible to have python3 in centos7 now. >>> >> >> EPEL updated from python34 to python36 recently, but C7 doesn't have >> python3 in the base. I don't think we've ever used EPEL packages for >> building. >> >> And GlusterFS-5 isn't python3 ready. >> > > Correction: GlusterFS-5 is mostly or completely python3 ready. FWIW, > python33 is available on both RHEL7 and CentOS7 from the Software > Collection Library (SCL), and python34 and now python36 are available from > EPEL. > > But packages built for the CentOS Storage SIG have never used the SCL or > EPEL (EPEL not allowed) and the shebangs in the .py files are converted > from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild %prep stage. > All the python dependencies for the packages remain the python2 flavors. > AFAIK the centos-regression machines ought to be building the same way. > centos-regression machines have 'CentOS Linux release 7.6.1810 (Core)' and using python3.6. Looking at the tracebacks when compiling we confirmed that it is picking up python3.6 somehow. 
To resolve this issue either we can remove glupy from the release(which is dead anyways) or install glupy on the instances. > > -- > > Kaleb > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkhandel at redhat.com Thu Jun 13 03:24:52 2019 From: dkhandel at redhat.com (Deepshikha Khandelwal) Date: Thu, 13 Jun 2019 08:54:52 +0530 Subject: [Gluster-devel] Fwd: Details on the Scan.coverity.com June 2019 Upgrade In-Reply-To: References: <656ea81285804a36a32a2e5aface86fc@1061282284> Message-ID: Thanks Atin for pointing this out. Yes, I'll upgrade the matching build tool on builders once coverity is upgraded on scan.coverity.com On Thu, Jun 13, 2019 at 8:01 AM Atin Mukherjee wrote: > Fyi..no scan for 3-4 days starting from June 17th for the upgrade. Post > that we may have to do some changes to use the new build tool? > > ---------- Forwarded message --------- > From: Peter Degen-Portnoy > Date: Thu, 13 Jun 2019 at 00:48 > Subject: Details on the Scan.coverity.com June 2019 Upgrade > > > June 17, 9 a.m. MDT > View this email in a browser > > > [image: Synopsys] > > Coverity Scan 2019 Upgrade > > > Dear Atin Mukherjee, > Thank you for being an active user of scan.coverity.com > . > We have some important news to share with you. > > As you know, the version of Coverity used by the Scan website is somewhat > out of date. So we?re pleased to announce that we?re upgrading to the > latest stable production version. > > We?re currently verifying the upgrade. Here?s what you can expect: > > We plan to start the upgrade *Monday, June 17, around 9 a.m. MDT*. We > expect the process to last 3?4 days. > > During this time, scan.coverity.com > > may be offline and unavailable. If possible, we?ll provide access to > scan.coverity.com > > in read-only mode. > > After the upgrade, you should use the new Build tool that matches the > upgraded version of Coverity. Specifically, the build tool from Coverity > 8.7 will no longer be supported. > > You can find details about the upgrade and the new build tool on the Scan > Status Community > > page. You can also subscribe to scan.coverity.com > > status updates on this page by clicking the ?Follow? button and selecting > ?Every Post.? > > Please take a look at the information on the Scan Status Community page. > If you have any questions about the upgrade, post them on the Synopsys > Software Integrity Community > . > We?ll answer as soon as we can. > > Sincerely yours, > The Scan Administrators > > scan-admin at coverity.com > > > Follow > > > > > > > > > > > ? 2019 Synopsys, Inc. 
All Rights Reserved > 690 E Middlefield Rd, Mountain View, CA 94043 > > > About > > Privacy > > Unsubscribe > > > > > -- > --Atin > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkeithle at redhat.com Thu Jun 13 04:35:56 2019 From: kkeithle at redhat.com (Kaleb Keithley) Date: Wed, 12 Jun 2019 21:35:56 -0700 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> Message-ID: On Wed, Jun 12, 2019 at 8:13 PM Deepshikha Khandelwal wrote: > > On Thu, Jun 13, 2019 at 4:41 AM Kaleb Keithley > wrote: > >> >> >> On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley >> wrote: >> >>> >>> On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < >>> atumball at redhat.com> wrote: >>> >>>> >>>> We recently noticed that in one of the package update on builder (ie, >>>> centos7.x machines), python3.6 got installed as a dependency. So, yes, it >>>> is possible to have python3 in centos7 now. >>>> >>> >>> EPEL updated from python34 to python36 recently, but C7 doesn't have >>> python3 in the base. I don't think we've ever used EPEL packages for >>> building. >>> >>> And GlusterFS-5 isn't python3 ready. >>> >> >> Correction: GlusterFS-5 is mostly or completely python3 ready. FWIW, >> python33 is available on both RHEL7 and CentOS7 from the Software >> Collection Library (SCL), and python34 and now python36 are available from >> EPEL. >> >> But packages built for the CentOS Storage SIG have never used the SCL or >> EPEL (EPEL not allowed) and the shebangs in the .py files are converted >> from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild %prep stage. >> All the python dependencies for the packages remain the python2 flavors. >> AFAIK the centos-regression machines ought to be building the same way. >> > > centos-regression machines have 'CentOS Linux release 7.6.1810 (Core)' and > using python3.6. Looking at the tracebacks when compiling we confirmed that > it is picking up python3.6 somehow. > We need to figure out why? BTW, my CentOS 7 box is up to date and does not have any version of python3. I would have to use the SCL or EPEL to get it. What changed on June 5th? Between https://build.gluster.org/job/centos7-regression/6309/consoleFull and https://build.gluster.org/job/centos7-regression/6310/consoleFull? 6309 was the last centos-regression with python2.7. 6310 and all subsequent centos-regressions have been built with python3.6. Somebody added EPEL! Do we not have a record of the changes made and who made them? And BTW, this affects more than just glusterfs-5, it's affecting all versions: glusterfs-4.1, glusterfs-5, glusterfs-6, and master. > To resolve this issue either we can remove glupy from the release(which is > dead anyways) or install glupy on the instances. > Or you can resolve where python36 came from and undo the change that introduced it. At the risk of being repetitious ? reiterating what Niels said ? it's highly unusual to remove features in a bug fix update. It's also unusual to have switched to python3 on rhel7 like this. 
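One quick way to compare the two regression runs mentioned above (6309 vs 6310) and spot where the interpreter changed (a sketch only; the grep pattern is just an example):

    $ curl -s https://build.gluster.org/job/centos7-regression/6309/consoleFull | grep -i python > 6309.txt
    $ curl -s https://build.gluster.org/job/centos7-regression/6310/consoleFull | grep -i python > 6310.txt
    $ diff -u 6309.txt 6310.txt
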
Was there any discussion of such a change? If there was I seem to have missed it. I suggest figuring out where python3.6 on rhel7 came from. Fix that first. Removing glupy is a bandaid over a unrelated problem. Once the real problem is fixed then there can be a separate discussion about removing the glupy feature in glusterfs-5. -- Kaleb -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndevos at redhat.com Thu Jun 13 09:08:25 2019 From: ndevos at redhat.com (Niels de Vos) Date: Thu, 13 Jun 2019 11:08:25 +0200 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> Message-ID: <20190613090825.GN8725@ndevos-x270> On Wed, Jun 12, 2019 at 04:09:55PM -0700, Kaleb Keithley wrote: > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley wrote: > > > > > On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < > > atumball at redhat.com> wrote: > > > >> > >> We recently noticed that in one of the package update on builder (ie, > >> centos7.x machines), python3.6 got installed as a dependency. So, yes, it > >> is possible to have python3 in centos7 now. > >> > > > > EPEL updated from python34 to python36 recently, but C7 doesn't have > > python3 in the base. I don't think we've ever used EPEL packages for > > building. > > > > And GlusterFS-5 isn't python3 ready. > > > > Correction: GlusterFS-5 is mostly or completely python3 ready. FWIW, > python33 is available on both RHEL7 and CentOS7 from the Software > Collection Library (SCL), and python34 and now python36 are available from > EPEL. > > But packages built for the CentOS Storage SIG have never used the SCL or > EPEL (EPEL not allowed) and the shebangs in the .py files are converted > from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild %prep stage. > All the python dependencies for the packages remain the python2 flavors. > AFAIK the centos-regression machines ought to be building the same way. Indeed, there should not be a requirement on having EPEL enabled on the CentOS-7 builders. At least not for the building of the glusterfs tarball. We still need to do releases of glusterfs-4.1 and glusterfs-5, until then it is expected to have python2 as the (only?) version for the system. Is it possible to remove python3 from the CentOS-7 builders and run the jobs that require python3 on the Fedora builders instead? I guess we could force the release-4.1 and release-5 branches to use python2 only. This might be done by exporting PYTHON=/usr/bin/python2 in the environment where './configure' is run. That would likely require changes to multiple Jenkins jobs... Niels From dkhandel at redhat.com Thu Jun 13 09:22:06 2019 From: dkhandel at redhat.com (Deepshikha Khandelwal) Date: Thu, 13 Jun 2019 14:52:06 +0530 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> Message-ID: On Thu, Jun 13, 2019 at 10:06 AM Kaleb Keithley wrote: > > > On Wed, Jun 12, 2019 at 8:13 PM Deepshikha Khandelwal > wrote: > >> >> On Thu, Jun 13, 2019 at 4:41 AM Kaleb Keithley >> wrote: >> >>> >>> >>> On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley >>> wrote: >>> >>>> >>>> On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < >>>> atumball at redhat.com> wrote: >>>> >>>>> >>>>> We recently noticed that in one of the package update on builder (ie, >>>>> centos7.x machines), python3.6 got installed as a dependency. 
So, yes, it >>>>> is possible to have python3 in centos7 now. >>>>> >>>> >>>> EPEL updated from python34 to python36 recently, but C7 doesn't have >>>> python3 in the base. I don't think we've ever used EPEL packages for >>>> building. >>>> >>>> And GlusterFS-5 isn't python3 ready. >>>> >>> >>> Correction: GlusterFS-5 is mostly or completely python3 ready. FWIW, >>> python33 is available on both RHEL7 and CentOS7 from the Software >>> Collection Library (SCL), and python34 and now python36 are available from >>> EPEL. >>> >>> But packages built for the CentOS Storage SIG have never used the SCL or >>> EPEL (EPEL not allowed) and the shebangs in the .py files are converted >>> from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild %prep stage. >>> All the python dependencies for the packages remain the python2 flavors. >>> AFAIK the centos-regression machines ought to be building the same way. >>> >> >> centos-regression machines have 'CentOS Linux release 7.6.1810 (Core)' >> and using python3.6. Looking at the tracebacks when compiling we confirmed >> that it is picking up python3.6 somehow. >> > > We need to figure out why? BTW, my CentOS 7 box is up to date and does not > have any version of python3. I would have to use the SCL or EPEL to get it. > > What changed on June 5th? Between > https://build.gluster.org/job/centos7-regression/6309/consoleFull and > https://build.gluster.org/job/centos7-regression/6310/consoleFull? > Yes, you are right. OS got upgrade to newer version on 5th June. It has EPEL repo enabled for various other things. Can we make changes in configure.ac file (python version specific to the branches) rather than falling back to python2 for other branches too? > > 6309 was the last centos-regression with python2.7. 6310 and all > subsequent centos-regressions have been built with python3.6. > > Somebody added EPEL! Do we not have a record of the changes made and who > made them? > > And BTW, this affects more than just glusterfs-5, it's affecting all > versions: glusterfs-4.1, glusterfs-5, glusterfs-6, and master. > > >> To resolve this issue either we can remove glupy from the release(which >> is dead anyways) or install glupy on the instances. >> > > Or you can resolve where python36 came from and undo the change that > introduced it. > > At the risk of being repetitious ? reiterating what Niels said ? it's > highly unusual to remove features in a bug fix update. > > It's also unusual to have switched to python3 on rhel7 like this. Was > there any discussion of such a change? If there was I seem to have missed > it. > > I suggest figuring out where python3.6 on rhel7 came from. Fix that > first. Removing glupy is a bandaid over a unrelated problem. Once the real > problem is fixed then there can be a separate discussion about removing the > glupy feature in glusterfs-5. > > -- > > Kaleb > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ndevos at redhat.com Thu Jun 13 12:28:37 2019 From: ndevos at redhat.com (Niels de Vos) Date: Thu, 13 Jun 2019 14:28:37 +0200 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: <20190613090825.GN8725@ndevos-x270> References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> <20190613090825.GN8725@ndevos-x270> Message-ID: <20190613122837.GS8725@ndevos-x270> On Thu, Jun 13, 2019 at 11:08:25AM +0200, Niels de Vos wrote: > On Wed, Jun 12, 2019 at 04:09:55PM -0700, Kaleb Keithley wrote: > > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley wrote: > > > > > > > > On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < > > > atumball at redhat.com> wrote: > > > > > >> > > >> We recently noticed that in one of the package update on builder (ie, > > >> centos7.x machines), python3.6 got installed as a dependency. So, yes, it > > >> is possible to have python3 in centos7 now. > > >> > > > > > > EPEL updated from python34 to python36 recently, but C7 doesn't have > > > python3 in the base. I don't think we've ever used EPEL packages for > > > building. > > > > > > And GlusterFS-5 isn't python3 ready. > > > > > > > Correction: GlusterFS-5 is mostly or completely python3 ready. FWIW, > > python33 is available on both RHEL7 and CentOS7 from the Software > > Collection Library (SCL), and python34 and now python36 are available from > > EPEL. > > > > But packages built for the CentOS Storage SIG have never used the SCL or > > EPEL (EPEL not allowed) and the shebangs in the .py files are converted > > from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild %prep stage. > > All the python dependencies for the packages remain the python2 flavors. > > AFAIK the centos-regression machines ought to be building the same way. > > Indeed, there should not be a requirement on having EPEL enabled on the > CentOS-7 builders. At least not for the building of the glusterfs > tarball. We still need to do releases of glusterfs-4.1 and glusterfs-5, > until then it is expected to have python2 as the (only?) version for the > system. Is it possible to remove python3 from the CentOS-7 builders and > run the jobs that require python3 on the Fedora builders instead? Actually, if the python-devel package for python3 is installed on the CentOS-7 builders, things may work too. It still feels like some sort of Frankenstein deployment, and we don't expect to this see in production environments. But maybe this is a workaround in case something really, really, REALLY depends on python3 on the builders. Niels From kkeithle at redhat.com Thu Jun 13 12:55:22 2019 From: kkeithle at redhat.com (Kaleb Keithley) Date: Thu, 13 Jun 2019 05:55:22 -0700 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> Message-ID: On Thu, Jun 13, 2019 at 2:22 AM Deepshikha Khandelwal wrote: > On Thu, Jun 13, 2019 at 10:06 AM Kaleb Keithley > wrote: > >> On Wed, Jun 12, 2019 at 8:13 PM Deepshikha Khandelwal < >> dkhandel at redhat.com> wrote: >> >>> On Thu, Jun 13, 2019 at 4:41 AM Kaleb Keithley >>> wrote: >>> >>>> On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley >>>> wrote: >>>> >>>>> >>>>> >>>>> >> We need to figure out why? BTW, my CentOS 7 box is up to date and does >> not have any version of python3. I would have to use the SCL or EPEL to get >> it. >> >> What changed on June 5th? 
Between >> https://build.gluster.org/job/centos7-regression/6309/consoleFull and >> https://build.gluster.org/job/centos7-regression/6310/consoleFull? >> > Yes, you are right. OS got upgrade to newer version on 5th June. It has > EPEL repo enabled for various other things. > What other things? Are there not python2 versions of these things? That work just as well as the pyton3 versions? Adding EPEL and installing python3 on the centos boxes seems like a mistake to me, if only because it has broken the builds there. Was there any discussion of adding EPEL and python3 ? I don't recall seeing any. But since EPEL was added, one possible work-around would be to do the build in mock. -- Kaleb -------------- next part -------------- An HTML attachment was scrubbed... URL: From mscherer at redhat.com Thu Jun 13 13:36:03 2019 From: mscherer at redhat.com (Michael Scherer) Date: Thu, 13 Jun 2019 15:36:03 +0200 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> Message-ID: Le jeudi 13 juin 2019 ? 05:55 -0700, Kaleb Keithley a ?crit : > On Thu, Jun 13, 2019 at 2:22 AM Deepshikha Khandelwal < > dkhandel at redhat.com> > wrote: > > > On Thu, Jun 13, 2019 at 10:06 AM Kaleb Keithley < > > kkeithle at redhat.com> > > wrote: > > > > > On Wed, Jun 12, 2019 at 8:13 PM Deepshikha Khandelwal < > > > dkhandel at redhat.com> wrote: > > > > > > > On Thu, Jun 13, 2019 at 4:41 AM Kaleb Keithley < > > > > kkeithle at redhat.com> > > > > wrote: > > > > > > > > > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley < > > > > > kkeithle at redhat.com> > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We need to figure out why? BTW, my CentOS 7 box is up to date and > > > does > > > not have any version of python3. I would have to use the SCL or > > > EPEL to get > > > it. > > > > > > What changed on June 5th? Between > > > https://build.gluster.org/job/centos7-regression/6309/consoleFull > > > and > > > https://build.gluster.org/job/centos7-regression/6310/consoleFull > > > ? > > > > > > > Yes, you are right. OS got upgrade to newer version on 5th June. It > > has > > EPEL repo enabled for various other things. > > > > What other things? Are there not python2 versions of these things? > That work just as well as the pyton3 versions? Mock pull python3: [root at builder11 ~]# LC_ALL=C rpm -e --test python36-rpm error: Failed dependencies: python36-rpm is needed by (installed) mock-1.4.16-1.el7.noarch And I think it would be ill advised to backport a EOL version of mock with python 2 on the builders. > Adding EPEL and installing python3 on the centos boxes seems like a > mistake to me, if only because it has broken the builds there. Was > there any discussion of adding EPEL and python3 ? I don't recall > seeing any. We have EPEL for: - munin - nagios - golang - clang, cppcheck - mock - nginx nginx could be removed now (that's kinda legacy). The rest look like very much stuff we use, so I think EPEL is here to stay. > But since EPEL was added, one possible work-around would be to do the > build > in mock. The detection logic is hitting a corner case (granted, that's one that changed under people feet). There is a 2 step approach: - detect the most recent version of python - verify that there is headers for that python version But having python 3 do not mean we want to use that one for building. 
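A rough shell sketch of such a header-aware pick (illustrative only; the candidate list is an assumption and this is not the project's actual configure logic):

    # walk candidate interpreters from newest to oldest and keep the first one
    # whose development headers (Python.h) are actually installed
    PYTHON=
    for candidate in python3.6 python3 python2.7 python2; do
        command -v "$candidate" >/dev/null 2>&1 || continue
        inc=$("$candidate" -c 'import sysconfig; print(sysconfig.get_paths()["include"])' 2>/dev/null)
        if [ -n "$inc" ] && [ -f "$inc/Python.h" ]; then
            PYTHON=$(command -v "$candidate")
            break
        fi
    done
    echo "building against: ${PYTHON:-no python with headers found}"
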
So I think the right autodetection would be to - list all version of python - take the most recent one with -devel installed (eg, 1 loop that check 2 things, instead of 1 loop for version, and a check after). Or, as a work around, we should be explicit on the python version with a configure switch, so we can be sure we test and build the right one, since the autodetection hit a corner case. -- Michael Scherer Sysadmin, Community Infrastructure -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part URL: From mscherer at redhat.com Thu Jun 13 13:55:21 2019 From: mscherer at redhat.com (Michael Scherer) Date: Thu, 13 Jun 2019 15:55:21 +0200 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: <20190613122837.GS8725@ndevos-x270> References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> Message-ID: <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> Le jeudi 13 juin 2019 ? 14:28 +0200, Niels de Vos a ?crit : > On Thu, Jun 13, 2019 at 11:08:25AM +0200, Niels de Vos wrote: > > On Wed, Jun 12, 2019 at 04:09:55PM -0700, Kaleb Keithley wrote: > > > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley < > > > kkeithle at redhat.com> wrote: > > > > > > > > > > > On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < > > > > atumball at redhat.com> wrote: > > > > > > > > > > > > > > We recently noticed that in one of the package update on > > > > > builder (ie, > > > > > centos7.x machines), python3.6 got installed as a dependency. > > > > > So, yes, it > > > > > is possible to have python3 in centos7 now. > > > > > > > > > > > > > EPEL updated from python34 to python36 recently, but C7 doesn't > > > > have > > > > python3 in the base. I don't think we've ever used EPEL > > > > packages for > > > > building. > > > > > > > > And GlusterFS-5 isn't python3 ready. > > > > > > > > > > Correction: GlusterFS-5 is mostly or completely python3 > > > ready. FWIW, > > > python33 is available on both RHEL7 and CentOS7 from the Software > > > Collection Library (SCL), and python34 and now python36 are > > > available from > > > EPEL. > > > > > > But packages built for the CentOS Storage SIG have never used the > > > SCL or > > > EPEL (EPEL not allowed) and the shebangs in the .py files are > > > converted > > > from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild > > > %prep stage. > > > All the python dependencies for the packages remain the python2 > > > flavors. > > > AFAIK the centos-regression machines ought to be building the > > > same way. > > > > Indeed, there should not be a requirement on having EPEL enabled on > > the > > CentOS-7 builders. At least not for the building of the glusterfs > > tarball. We still need to do releases of glusterfs-4.1 and > > glusterfs-5, > > until then it is expected to have python2 as the (only?) version > > for the > > system. Is it possible to remove python3 from the CentOS-7 builders > > and > > run the jobs that require python3 on the Fedora builders instead? > > Actually, if the python-devel package for python3 is installed on the > CentOS-7 builders, things may work too. It still feels like some sort > of > Frankenstein deployment, and we don't expect to this see in > production > environments. 
But maybe this is a workaround in case something really, really, REALLY depends on python3 on the builders.

To be honest, people would be surprised by what happens in production out there (sysadmins tend to talk among themselves; we all have horror stories of stuff that was supposed to be cleaned up and wasn't, etc.). After all, "frankenstein deployment now" is better than "perfect later", especially since lots of IT departments are under constant pressure (so it ends up being more "perfect never"). I can understand that we want clean and simple code (who doesn't), but real life is much messier than we want to admit, so we need something robust. -- Michael Scherer Sysadmin, Community Infrastructure
-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part URL:
From dkhandel at redhat.com Fri Jun 14 06:45:40 2019 From: dkhandel at redhat.com (Deepshikha Khandelwal) Date: Fri, 14 Jun 2019 12:15:40 +0530 Subject: [Gluster-devel] Gerrit is down Message-ID: Hello, review.gluster.org has been down since morning. We are looking into the issue and will update once it is back.
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From mscherer at redhat.com Fri Jun 14 08:10:42 2019 From: mscherer at redhat.com (Michael Scherer) Date: Fri, 14 Jun 2019 10:10:42 +0200 Subject: [Gluster-devel] DNS issue on review.gluster.org, causing outage Message-ID: <6177518854b017c8e7389f402bb61c11695677ee.camel@redhat.com> Hi, there is an ongoing issue with review.gluster.org: some people are being directed to the wrong server. A quick fix is to add: 8.43.85.171 review.gluster.org to /etc/hosts (on Linux). Adding an MX record yesterday (due to a RH IT request) resulted in the domain name having 2 IP addresses, one pointing to supercolony (the MX) and one to the gerrit server. That is neither the intention nor what is supposed to happen, so I kinda suspect that's a bug somewhere (or a corner case, or me misreading the RFC). Investigation is ongoing.
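A couple of generic checks for the symptom described above (standard tools only; nothing project-specific is assumed):

    $ dig +short review.gluster.org A     # 8.43.85.171 is the gerrit server mentioned above
    $ getent hosts review.gluster.org     # what the local resolver actually returns
    # temporary client-side pin, as described above:
    $ echo '8.43.85.171 review.gluster.org' | sudo tee -a /etc/hosts
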
> > Review.gluster.org may still work for some people (like, it work for > me > and still work for me), hence why it wasn't noticed while I tested, > apology for that. Ok so the issue should now be fixed. See https://bugzilla.redhat.com/show_bug.cgi?id=1720453 (sorry forgot to send the email about the fix, too focused on the post mortem) -- Michael Scherer Sysadmin, Community Infrastructure -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part URL: From atumball at redhat.com Fri Jun 14 11:46:00 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Fri, 14 Jun 2019 17:16:00 +0530 Subject: [Gluster-devel] Seems like Smoke job is not voting Message-ID: I see patches starting from 10:45 AM IST (7hrs before) are not getting smoke votes. For one of my patch, the smoke job is not triggered at all IMO. https://review.gluster.org/#/c/22863/ Would be good to check it. Regards, Amar -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Fri Jun 14 11:50:46 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Fri, 14 Jun 2019 17:20:46 +0530 Subject: [Gluster-devel] Seems like Smoke job is not voting In-Reply-To: References: Message-ID: Ok, guessed the possible cause. The same possible DNS issue with review.gluster.org could have prevented the patch fetching in smoke, and hence would have not triggered the job. Those of you who have patches not getting a smoke, please run 'recheck smoke' through comment. -Amar On Fri, Jun 14, 2019 at 5:16 PM Amar Tumballi Suryanarayan < atumball at redhat.com> wrote: > I see patches starting from 10:45 AM IST (7hrs before) are not getting > smoke votes. > > For one of my patch, the smoke job is not triggered at all IMO. > > https://review.gluster.org/#/c/22863/ > > Would be good to check it. > > Regards, > Amar > > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From avishwan at redhat.com Fri Jun 14 14:13:25 2019 From: avishwan at redhat.com (Aravinda) Date: Fri, 14 Jun 2019 19:43:25 +0530 Subject: [Gluster-devel] Project Update: Containers-based distributed tests runner Message-ID: <81e8277f799057f62b11b0cb353e83d408ed6028.camel@redhat.com> **gluster-tester** is a framework to run existing "*.t" test files in parallel using containers. Install and usage instructions are available in the following repository. https://github.com/aravindavk/gluster-tester ## Completed: - Create a base container image with all the dependencies installed. - Create a tester container image with requested refspec(or latest master) compiled and installed. - SSH setup in containers required to test Geo-replication - Take `--num-parallel` option and spawn the containers with ready infra for running tests - Split the tests based on the number of parallel jobs specified. - Execute the tests in parallel in each container and watch for the status. - Archive only failed tests(Optionally enable logs for successful tests using `--preserve-success-logs`) ## Pending: - NFS related tests are not running since the required changes are pending while creating the container image. 
(To know the failures run gluster-tester with `--include-nfs-tests` option) - Filter support while running the tests(To enable/disable tests on the run time) - Some Loop based tests are failing(I think due to shared `/dev/loop*`) - A few tests are timing out(Due to this overall test duration is more) - Once tests are started, showing real-time status is pending(Now status is checked in `/regression-.log` for example `/var/log/gluster-tester/regression-3.log` - If the base image is not built before running tests, it gives an error. Need to re-trigger the base container image step if not built. (Issue: https://github.com/aravindavk/gluster-tester/issues/11) - Creating an archive of core files - Creating a single archive from all jobs/containers - Testing `--ignore-from` feature to ignore the tests - Improvements to the status output - Cleanup(Stop test containers, and delete) I opened an issue to collect the details of failed tests. I will continue to update that issue as and when I capture failed tests in my setup. https://github.com/aravindavk/gluster-tester/issues/9 Feel free to suggest any feature improvements. Contributions are welcome. https://github.com/aravindavk/gluster-tester/issues -- Regards Aravinda http://aravindavk.in From rkothiya at redhat.com Fri Jun 14 15:38:10 2019 From: rkothiya at redhat.com (Rinku Kothiya) Date: Fri, 14 Jun 2019 21:08:10 +0530 Subject: [Gluster-devel] [Gluster-Maintainers] Release 7: Regression health for release-6.next and release-7 Message-ID: Hi Team, As part of branching preparation next week for release-7, please find test failures and respective test links here. The top tests that are failing are as below and need attention, ./tests/bugs/gfapi/bug-1319374-THIS-crash.t ./tests/basic/uss.t ./tests/basic/volfile-sanity.t ./tests/basic/quick-read-with-upcall.t ./tests/basic/afr/tarissue.t ./tests/features/subdir-mount.t ./tests/basic/ec/self-heal.t ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t ./tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t ./tests/basic/afr/split-brain-favorite-child-policy.t ./tests/basic/distribute/non-root-unlink-stale-linkto.t ./tests/bugs/protocol/bug-1433815-auth-allow.t ./tests/basic/afr/arbiter-mount.t ./tests/basic/all_squash.t ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t ./tests/basic/volume-snapshot-clone.t ./tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t ./tests/basic/gfapi/upcall-register-api.t Nightly build for this month : https://build.gluster.org/job/nightly-master/ Gluster test failure tracker : https://fstat.gluster.org/summary?start_date=2019-05-15&end_date=2019-06-14 Please file a bug if needed against the test case and report the same here, in case a problem is already addressed, then do send back the patch details that addresses this issue as a response to this mail. Regards Rinku -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Fri Jun 14 17:15:29 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Fri, 14 Jun 2019 22:45:29 +0530 Subject: [Gluster-devel] Quick Update: a cleanup patch and subsequent build failures Message-ID: I just merged a larger cleanup patch, which I felt is good to get in, but due to the order of its parents when it passed the regression and smoke, and the other patches which got merged in same time, we hit a compile issue for 'undefined functions'. 
Below patch fixes it: ---- glfs: add syscall.h after header cleanup in one of the recent patches, we cleaned-up the unneccesary header file includes. In the order of merging the patches, there cropped up an compile error. updates: bz#1193929 Change-Id: I2ad52aa918f9c698d5273bb293838de6dd50ac31 Signed-off-by: Amar Tumballi diff --git a/api/src/glfs.c b/api/src/glfs.c index b0db866441..0771e074d6 100644 --- a/api/src/glfs.c +++ b/api/src/glfs.c @@ -45,6 +45,7 @@ #include #include "rpc-clnt.h" #include +#include #include "gfapi-messages.h" #include "glfs.h" ----- The patch has been pushed to repository, as it is causing critical compile error right now. and if you have a build error, please fetch the latest master to fix the the issue. Regards, Amar -------------- next part -------------- An HTML attachment was scrubbed... URL: From jenkins at build.gluster.org Mon Jun 17 01:45:02 2019 From: jenkins at build.gluster.org (jenkins at build.gluster.org) Date: Mon, 17 Jun 2019 01:45:02 +0000 (UTC) Subject: [Gluster-devel] Weekly Untriaged Bugs Message-ID: <690551110.11.1560735902548.JavaMail.jenkins@jenkins-el7.rht.gluster.org> [...truncated 7 lines...] https://bugzilla.redhat.com/1719778 / core: build fails for every patch on release 5 https://bugzilla.redhat.com/1714851 / core: issues with 'list.h' elements in clang-scan https://bugzilla.redhat.com/1718734 / core: Memory leak in glusterfsd process https://bugzilla.redhat.com/1719290 / glusterd: Glusterfs mount helper script not working with IPv6 because of regular expression or man is wrong https://bugzilla.redhat.com/1718741 / glusterfind: GlusterFS having high CPU https://bugzilla.redhat.com/1716875 / gluster-smb: Inode Unref Assertion failed: inode->ref https://bugzilla.redhat.com/1716455 / gluster-smb: OS X error -50 when creating sub-folder on Samba share when using Gluster VFS https://bugzilla.redhat.com/1716440 / gluster-smb: SMBD thread panics when connected to from OS X machine https://bugzilla.redhat.com/1720733 / libglusterfsclient: glusterfs 4.1.7 client crash https://bugzilla.redhat.com/1714895 / libglusterfsclient: Glusterfs(fuse) client crash https://bugzilla.redhat.com/1717824 / locks: Fencing: Added the tcmu-runner ALUA feature support but after one of node is rebooted the glfs_file_lock() get stucked https://bugzilla.redhat.com/1718562 / locks: flock failure (regression) https://bugzilla.redhat.com/1719174 / project-infrastructure: broken regression link? https://bugzilla.redhat.com/1719388 / project-infrastructure: infra: download.gluster.org /var/www/html/... is out of free space https://bugzilla.redhat.com/1720453 / project-infrastructure: Unable to access review.gluster.org https://bugzilla.redhat.com/1718227 / scripts: SELinux context labels are missing for newly added bricks using add-brick command [...truncated 2 lines...] -------------- next part -------------- A non-text attachment was scrubbed... Name: build.log Type: application/octet-stream Size: 2050 bytes Desc: not available URL: From atumball at redhat.com Mon Jun 17 06:27:42 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Mon, 17 Jun 2019 11:57:42 +0530 Subject: [Gluster-devel] Project Update: Containers-based distributed tests runner In-Reply-To: <81e8277f799057f62b11b0cb353e83d408ed6028.camel@redhat.com> References: <81e8277f799057f62b11b0cb353e83d408ed6028.camel@redhat.com> Message-ID: This is a nice way to validate the patch for us. Question I have is did we measure the time benefit of running them in parallel with containers? 
Would be great to see the result in getting this tested in a cloud env, with 5 parallel threads and 10 parallel threads. -Amar On Fri, Jun 14, 2019 at 7:44 PM Aravinda wrote: > **gluster-tester** is a framework to run existing "*.t" test files in > parallel using containers. > > Install and usage instructions are available in the following > repository. > > https://github.com/aravindavk/gluster-tester > > ## Completed: > - Create a base container image with all the dependencies installed. > - Create a tester container image with requested refspec(or latest > master) compiled and installed. > - SSH setup in containers required to test Geo-replication > - Take `--num-parallel` option and spawn the containers with ready > infra for running tests > - Split the tests based on the number of parallel jobs specified. > - Execute the tests in parallel in each container and watch for the > status. > - Archive only failed tests(Optionally enable logs for successful tests > using `--preserve-success-logs`) > > ## Pending: > - NFS related tests are not running since the required changes are > pending while creating the container image. (To know the failures run > gluster-tester with `--include-nfs-tests` option) > - Filter support while running the tests(To enable/disable tests on the > run time) > - Some Loop based tests are failing(I think due to shared `/dev/loop*`) > - A few tests are timing out(Due to this overall test duration is more) > - Once tests are started, showing real-time status is pending(Now > status is checked in `/regression-.log` for example > `/var/log/gluster-tester/regression-3.log` > - If the base image is not built before running tests, it gives an > error. Need to re-trigger the base container image step if not built. > (Issue: https://github.com/aravindavk/gluster-tester/issues/11) > - Creating an archive of core files > - Creating a single archive from all jobs/containers > - Testing `--ignore-from` feature to ignore the tests > - Improvements to the status output > - Cleanup(Stop test containers, and delete) > > I opened an issue to collect the details of failed tests. I will > continue to update that issue as and when I capture failed tests in my > setup. > https://github.com/aravindavk/gluster-tester/issues/9 > > Feel free to suggest any feature improvements. Contributions are > welcome. > https://github.com/aravindavk/gluster-tester/issues > > -- > Regards > Aravinda > http://aravindavk.in > > > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From khiremat at redhat.com Mon Jun 17 11:50:09 2019 From: khiremat at redhat.com (Kotresh Hiremath Ravishankar) Date: Mon, 17 Jun 2019 17:20:09 +0530 Subject: [Gluster-devel] Solving Ctime Issue with legacy files [BUG 1593542] Message-ID: Hi All, The ctime feature is enabled by default from release gluster-6. But as explained in bug [1] there is a known issue with legacy files i.e., the files which are created before ctime feature is enabled. These files would not have "trusted.glusterfs.mdata" xattr which maintain time attributes. 
So on, accessing those files, it gets created with latest time attributes. This is not correct because all the time attributes (atime, mtime, ctime) get updated instead of required time attributes. There are couple of approaches to solve this. 1. On accessing the files, let the posix update the time attributes from the back end file on respective replicas. This obviously results in inconsistent "trusted.glusterfs.mdata" xattr values with in replica set. AFR/EC should heal this xattr as part of metadata heal upon accessing this file. It can chose to replicate from any subvolume. Ideally we should consider the highest time from the replica and treat it as source but I think that should be fine as replica time attributes are mostly in sync with max difference in order of few seconds if am not wrong. But client side self heal is disabled by default because of performance reasons [2]. If we chose to go by this approach, we need to consider enabling at least client side metadata self heal by default. Please share your thoughts on enabling the same by default. 2. Don't let posix update the legacy files from the backend. On lookup cbk, let the utime xlator update the time attributes from statbuf received synchronously. Both approaches are similar as both results in updating the xattr during lookup. Please share your inputs on which approach is better. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593542 [2] https://github.com/gluster/glusterfs/issues/473 -- Thanks and Regards, Kotresh H R -------------- next part -------------- An HTML attachment was scrubbed... URL: From jahernan at redhat.com Mon Jun 17 12:08:20 2019 From: jahernan at redhat.com (Xavi Hernandez) Date: Mon, 17 Jun 2019 14:08:20 +0200 Subject: [Gluster-devel] Solving Ctime Issue with legacy files [BUG 1593542] In-Reply-To: References: Message-ID: Hi Kotresh, On Mon, Jun 17, 2019 at 1:50 PM Kotresh Hiremath Ravishankar < khiremat at redhat.com> wrote: > Hi All, > > The ctime feature is enabled by default from release gluster-6. But as > explained in bug [1] there is a known issue with legacy files i.e., the > files which are created before ctime feature is enabled. These files would > not have "trusted.glusterfs.mdata" xattr which maintain time attributes. So > on, accessing those files, it gets created with latest time attributes. > This is not correct because all the time attributes (atime, mtime, ctime) > get updated instead of required time attributes. > > There are couple of approaches to solve this. > > 1. On accessing the files, let the posix update the time attributes from > the back end file on respective replicas. This obviously results in > inconsistent "trusted.glusterfs.mdata" xattr values with in replica set. > AFR/EC should heal this xattr as part of metadata heal upon accessing this > file. It can chose to replicate from any subvolume. Ideally we should > consider the highest time from the replica and treat it as source but I > think that should be fine as replica time attributes are mostly in sync > with max difference in order of few seconds if am not wrong. > > But client side self heal is disabled by default because of performance > reasons [2]. If we chose to go by this approach, we need to consider > enabling at least client side metadata self heal by default. Please share > your thoughts on enabling the same by default. > > 2. Don't let posix update the legacy files from the backend. On lookup > cbk, let the utime xlator update the time attributes from statbuf received > synchronously. 
> > Both approaches are similar as both results in updating the xattr during > lookup. Please share your inputs on which approach is better. > I prefer second approach. First approach is not feasible for EC volumes because self-heal requires that k bricks (on a k+r configuration) agree on the value of this xattr, otherwise it considers the metadata damaged and needs manual intervention to fix it. During upgrade, first r bricks with be upgraded without problems, but trusted.glusterfs.mdata won't be healed because r < k. In fact this xattr will be removed from new bricks because the majority of bricks agree on xattr not being present. Once the r+1 brick is upgraded, it's possible that posix sets different values for trusted.glusterfs.mdata, which will cause self-heal to fail. Second approach seems better to me if guarded by a new option that enables this behavior. utime xlator should only update the mdata xattr if that option is set, and that option should only be settable once all nodes have been upgraded (controlled by op-version). In this situation the first lookup on a file where utime detects that mdata is not set, will require a synchronous update. I think this is good enough because it will only happen once per file. We'll need to consider cases where different clients do lookups at the same time, but I think this can be easily solved by ignoring the request if mdata is already present. Xavi > > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593542 > [2] https://github.com/gluster/glusterfs/issues/473 > > -- > Thanks and Regards, > Kotresh H R > -------------- next part -------------- An HTML attachment was scrubbed... URL: From khiremat at redhat.com Tue Jun 18 06:33:43 2019 From: khiremat at redhat.com (Kotresh Hiremath Ravishankar) Date: Tue, 18 Jun 2019 12:03:43 +0530 Subject: [Gluster-devel] Solving Ctime Issue with legacy files [BUG 1593542] In-Reply-To: References: Message-ID: Hi Xavi, Reply inline. On Mon, Jun 17, 2019 at 5:38 PM Xavi Hernandez wrote: > Hi Kotresh, > > On Mon, Jun 17, 2019 at 1:50 PM Kotresh Hiremath Ravishankar < > khiremat at redhat.com> wrote: > >> Hi All, >> >> The ctime feature is enabled by default from release gluster-6. But as >> explained in bug [1] there is a known issue with legacy files i.e., the >> files which are created before ctime feature is enabled. These files would >> not have "trusted.glusterfs.mdata" xattr which maintain time attributes. So >> on, accessing those files, it gets created with latest time attributes. >> This is not correct because all the time attributes (atime, mtime, ctime) >> get updated instead of required time attributes. >> >> There are couple of approaches to solve this. >> >> 1. On accessing the files, let the posix update the time attributes from >> the back end file on respective replicas. This obviously results in >> inconsistent "trusted.glusterfs.mdata" xattr values with in replica set. >> AFR/EC should heal this xattr as part of metadata heal upon accessing this >> file. It can chose to replicate from any subvolume. Ideally we should >> consider the highest time from the replica and treat it as source but I >> think that should be fine as replica time attributes are mostly in sync >> with max difference in order of few seconds if am not wrong. >> >> But client side self heal is disabled by default because of >> performance reasons [2]. If we chose to go by this approach, we need to >> consider enabling at least client side metadata self heal by default. 
>> Please share your thoughts on enabling the same by default. >> >> 2. Don't let posix update the legacy files from the backend. On lookup >> cbk, let the utime xlator update the time attributes from statbuf received >> synchronously. >> >> Both approaches are similar as both results in updating the xattr during >> lookup. Please share your inputs on which approach is better. >> > > I prefer second approach. First approach is not feasible for EC volumes > because self-heal requires that k bricks (on a k+r configuration) agree on > the value of this xattr, otherwise it considers the metadata damaged and > needs manual intervention to fix it. During upgrade, first r bricks with be > upgraded without problems, but trusted.glusterfs.mdata won't be healed > because r < k. In fact this xattr will be removed from new bricks because > the majority of bricks agree on xattr not being present. Once the r+1 brick > is upgraded, it's possible that posix sets different values for > trusted.glusterfs.mdata, which will cause self-heal to fail. > > Second approach seems better to me if guarded by a new option that enables > this behavior. utime xlator should only update the mdata xattr if that > option is set, and that option should only be settable once all nodes have > been upgraded (controlled by op-version). In this situation the first > lookup on a file where utime detects that mdata is not set, will require a > synchronous update. I think this is good enough because it will only happen > once per file. We'll need to consider cases where different clients do > lookups at the same time, but I think this can be easily solved by ignoring > the request if mdata is already present. > Initially there were two issues. 1. Upgrade Issue with EC Volume as described by you. This is solved with the patch [1]. There was a bug in ctime posix where it was creating xattr even when ctime is not set on client (during utimes system call). With patch [1], the behavior is that utimes system call will only update the "trusted.glusterfs.mdata" xattr if present else it won't create. The new xattr creation should only happen during entry operations (i.e create, mknod and others). So there won't be any problems with upgrade. I think we don't need new option dependent on op version if I am not wrong. 2. After upgrade, how do we update "trusted.glusterfs.mdata" xattr. This mail thread was for this. Here which approach is better? I understand from EC point of view the second approach is the best one. The question I had was, Can't EC treat 'trusted.glusterfs.mdata' as special xattr and add the logic to heal it from one subvolume (i.e. to remove the requirement of having to have consistent data on k subvolumes in k+r configuration). Second approach is independent of AFR and EC. So if we chose this, do we need new option to guard? If the upgrade steps is to upgrade server first and then client, we don't need to guard I think? > > Xavi > > >> >> >> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593542 >> [2] https://github.com/gluster/glusterfs/issues/473 >> >> -- >> Thanks and Regards, >> Kotresh H R >> > -- Thanks and Regards, Kotresh H R -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rkothiya at redhat.com Tue Jun 18 06:37:11 2019 From: rkothiya at redhat.com (Rinku Kothiya) Date: Tue, 18 Jun 2019 12:07:11 +0530 Subject: [Gluster-devel] [Gluster-Maintainers] Release 7: Gentle Reminder, Regression health for release-6.next and release-7 In-Reply-To: References: Message-ID: Hi Team, We need to branch for release-7, but nightly builds failures are blocking this activity. Please find test failures and respective test links below : The top tests that are failing are as below and need attention, ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t ./tests/bugs/gfapi/bug-1319374-THIS-crash.t ./tests/basic/distribute/non-root-unlink-stale-linkto.t ./tests/bugs/posix/bug-1040275-brick-uid-reset-on-volume-restart.t ./tests/features/subdir-mount.t ./tests/basic/ec/self-heal.t ./tests/basic/afr/tarissue.t ./tests/basic/all_squash.t ./tests/basic/ec/nfs.t ./tests/00-geo-rep/00-georep-verify-setup.t ./tests/basic/quota-rename.t ./tests/basic/volume-snapshot-clone.t Nightly build for this month : https://build.gluster.org/job/nightly-master/ Gluster test failure tracker : https://fstat.gluster.org/summary?start_date=2019-06-15&end_date=2019-06-18 Please file a bug if needed against the test case and report the same here, in case a problem is already addressed, then do send back the patch details that addresses this issue as a response to this mail. Regards Rinku On Fri, Jun 14, 2019 at 9:08 PM Rinku Kothiya wrote: > Hi Team, > > As part of branching preparation next week for release-7, please find > test failures and respective test links here. > > The top tests that are failing are as below and need attention, > > ./tests/bugs/gfapi/bug-1319374-THIS-crash.t > ./tests/basic/uss.t > ./tests/basic/volfile-sanity.t > ./tests/basic/quick-read-with-upcall.t > ./tests/basic/afr/tarissue.t > ./tests/features/subdir-mount.t > ./tests/basic/ec/self-heal.t > > ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t > ./tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t > ./tests/basic/afr/split-brain-favorite-child-policy.t > ./tests/basic/distribute/non-root-unlink-stale-linkto.t > ./tests/bugs/protocol/bug-1433815-auth-allow.t > ./tests/basic/afr/arbiter-mount.t > ./tests/basic/all_squash.t > > ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t > ./tests/basic/volume-snapshot-clone.t > ./tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t > ./tests/basic/gfapi/upcall-register-api.t > > > Nightly build for this month : > https://build.gluster.org/job/nightly-master/ > > Gluster test failure tracker : > https://fstat.gluster.org/summary?start_date=2019-05-15&end_date=2019-06-14 > > Please file a bug if needed against the test case and report the same > here, in case a problem is already addressed, then do send back the > patch details that addresses this issue as a response to this mail. > > Regards > Rinku > -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Tue Jun 18 06:42:07 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Tue, 18 Jun 2019 12:12:07 +0530 Subject: [Gluster-devel] [Gluster-Maintainers] Release 7: Gentle Reminder, Regression health for release-6.next and release-7 In-Reply-To: References: Message-ID: On Tue, Jun 18, 2019 at 12:07 PM Rinku Kothiya wrote: > Hi Team, > > We need to branch for release-7, but nightly builds failures are blocking > this activity. 
Please find test failures and respective test links below : > > The top tests that are failing are as below and need attention, > > > ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t > ./tests/bugs/gfapi/bug-1319374-THIS-crash.t > Still an issue with many tests. > ./tests/basic/distribute/non-root-unlink-stale-linkto.t > Looks like this got fixed after https://review.gluster.org/22847 > ./tests/bugs/posix/bug-1040275-brick-uid-reset-on-volume-restart.t > ./tests/features/subdir-mount.t > Got fixed with https://review.gluster.org/22877 > ./tests/basic/ec/self-heal.t > ./tests/basic/afr/tarissue.t > I see random failures on this, not yet sure if this is setup issue, or a actual regression issue. > ./tests/basic/all_squash.t > ./tests/basic/ec/nfs.t > ./tests/00-geo-rep/00-georep-verify-setup.t > Most of the times, it fails if 'setup' is not complete to run geo-rep. > ./tests/basic/quota-rename.t > ./tests/basic/volume-snapshot-clone.t > > Nightly build for this month : > https://build.gluster.org/job/nightly-master/ > > Gluster test failure tracker : > https://fstat.gluster.org/summary?start_date=2019-06-15&end_date=2019-06-18 > > Please file a bug if needed against the test case and report the same > here, in case a problem is already addressed, then do send back the > patch details that addresses this issue as a response to this mail. > > Thanks! > Regards > Rinku > > > On Fri, Jun 14, 2019 at 9:08 PM Rinku Kothiya wrote: > >> Hi Team, >> >> As part of branching preparation next week for release-7, please find >> test failures and respective test links here. >> >> The top tests that are failing are as below and need attention, >> >> ./tests/bugs/gfapi/bug-1319374-THIS-crash.t >> ./tests/basic/uss.t >> ./tests/basic/volfile-sanity.t >> ./tests/basic/quick-read-with-upcall.t >> ./tests/basic/afr/tarissue.t >> ./tests/features/subdir-mount.t >> ./tests/basic/ec/self-heal.t >> >> ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t >> ./tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t >> ./tests/basic/afr/split-brain-favorite-child-policy.t >> ./tests/basic/distribute/non-root-unlink-stale-linkto.t >> ./tests/bugs/protocol/bug-1433815-auth-allow.t >> ./tests/basic/afr/arbiter-mount.t >> ./tests/basic/all_squash.t >> >> ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t >> ./tests/basic/volume-snapshot-clone.t >> ./tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t >> ./tests/basic/gfapi/upcall-register-api.t >> >> >> Nightly build for this month : >> https://build.gluster.org/job/nightly-master/ >> >> Gluster test failure tracker : >> >> https://fstat.gluster.org/summary?start_date=2019-05-15&end_date=2019-06-14 >> >> Please file a bug if needed against the test case and report the same >> here, in case a problem is already addressed, then do send back the >> patch details that addresses this issue as a response to this mail. >> >> Regards >> Rinku >> > _______________________________________________ > maintainers mailing list > maintainers at gluster.org > https://lists.gluster.org/mailman/listinfo/maintainers > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... 
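For anyone following the failure-tracker links in this thread, the short sketch below just assembles the fstat summary URL for a recent date window. The start_date/end_date query format is taken from the links quoted above; no other fstat parameters are assumed.

#!/usr/bin/env python3
# Convenience sketch: build the fstat.gluster.org summary URL for the last
# N days, using the start_date/end_date query format seen in this thread.
from datetime import date, timedelta

def fstat_summary_url(days_back=7, today=None):
    today = today or date.today()
    start = today - timedelta(days=days_back)
    return ("https://fstat.gluster.org/summary?start_date=%s&end_date=%s"
            % (start.isoformat(), today.isoformat()))

if __name__ == "__main__":
    print(fstat_summary_url(days_back=3))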
URL: From jahernan at redhat.com Tue Jun 18 06:58:15 2019 From: jahernan at redhat.com (Xavi Hernandez) Date: Tue, 18 Jun 2019 08:58:15 +0200 Subject: [Gluster-devel] Solving Ctime Issue with legacy files [BUG 1593542] In-Reply-To: References: Message-ID: Hi Kotresh, On Tue, Jun 18, 2019 at 8:33 AM Kotresh Hiremath Ravishankar < khiremat at redhat.com> wrote: > Hi Xavi, > > Reply inline. > > On Mon, Jun 17, 2019 at 5:38 PM Xavi Hernandez > wrote: > >> Hi Kotresh, >> >> On Mon, Jun 17, 2019 at 1:50 PM Kotresh Hiremath Ravishankar < >> khiremat at redhat.com> wrote: >> >>> Hi All, >>> >>> The ctime feature is enabled by default from release gluster-6. But as >>> explained in bug [1] there is a known issue with legacy files i.e., the >>> files which are created before ctime feature is enabled. These files would >>> not have "trusted.glusterfs.mdata" xattr which maintain time attributes. So >>> on, accessing those files, it gets created with latest time attributes. >>> This is not correct because all the time attributes (atime, mtime, ctime) >>> get updated instead of required time attributes. >>> >>> There are couple of approaches to solve this. >>> >>> 1. On accessing the files, let the posix update the time attributes >>> from the back end file on respective replicas. This obviously results in >>> inconsistent "trusted.glusterfs.mdata" xattr values with in replica set. >>> AFR/EC should heal this xattr as part of metadata heal upon accessing this >>> file. It can chose to replicate from any subvolume. Ideally we should >>> consider the highest time from the replica and treat it as source but I >>> think that should be fine as replica time attributes are mostly in sync >>> with max difference in order of few seconds if am not wrong. >>> >>> But client side self heal is disabled by default because of >>> performance reasons [2]. If we chose to go by this approach, we need to >>> consider enabling at least client side metadata self heal by default. >>> Please share your thoughts on enabling the same by default. >>> >>> 2. Don't let posix update the legacy files from the backend. On lookup >>> cbk, let the utime xlator update the time attributes from statbuf received >>> synchronously. >>> >>> Both approaches are similar as both results in updating the xattr during >>> lookup. Please share your inputs on which approach is better. >>> >> >> I prefer second approach. First approach is not feasible for EC volumes >> because self-heal requires that k bricks (on a k+r configuration) agree on >> the value of this xattr, otherwise it considers the metadata damaged and >> needs manual intervention to fix it. During upgrade, first r bricks with be >> upgraded without problems, but trusted.glusterfs.mdata won't be healed >> because r < k. In fact this xattr will be removed from new bricks because >> the majority of bricks agree on xattr not being present. Once the r+1 brick >> is upgraded, it's possible that posix sets different values for >> trusted.glusterfs.mdata, which will cause self-heal to fail. >> >> Second approach seems better to me if guarded by a new option that >> enables this behavior. utime xlator should only update the mdata xattr if >> that option is set, and that option should only be settable once all nodes >> have been upgraded (controlled by op-version). In this situation the first >> lookup on a file where utime detects that mdata is not set, will require a >> synchronous update. I think this is good enough because it will only happen >> once per file. 
We'll need to consider cases where different clients do >> lookups at the same time, but I think this can be easily solved by ignoring >> the request if mdata is already present. >> > > Initially there were two issues. > 1. Upgrade Issue with EC Volume as described by you. > This is solved with the patch [1]. There was a bug in ctime posix > where it was creating xattr even when ctime is not set on client (during > utimes system call). With patch [1], the behavior > is that utimes system call will only update the > "trusted.glusterfs.mdata" xattr if present else it won't create. The new > xattr creation should only happen during entry operations (i.e create, > mknod and others). > So there won't be any problems with upgrade. I think we don't need new > option dependent on op version if I am not wrong. > If I'm not missing something, we cannot allow creation of mdata xattr even for create/mknod/setattr fops. Doing so could cause the same problem if some of the bricks are not upgraded and do not support mdata yet (or they have ctime disabled by default). > 2. After upgrade, how do we update "trusted.glusterfs.mdata" xattr. > This mail thread was for this. Here which approach is better? I > understand from EC point of view the second approach is the best one. The > question I had was, Can't EC treat 'trusted.glusterfs.mdata' > as special xattr and add the logic to heal it from one subvolume > (i.e. to remove the requirement of having to have consistent data on k > subvolumes in k+r configuration). > Yes, we can do that. But this would require a newer client with support for this new xattr, which won't be possible during an upgrade, where bricks are upgraded before the clients. So, even if we add this intelligence to the client, the upgrade process is still broken. Only consideration here is if we can rely on self-heal daemon being on the server side (and thus upgraded at the same time than the server) to ensure that files can really be healed even if other bricks/shd daemons are not yet updated. Not sure if it could work, but anyway I don't like it very much. > > Second approach is independent of AFR and EC. So if we chose this, > do we need new option to guard? If the upgrade steps is to upgrade server > first and then client, we don't need to guard I think? > I think you are right for regular clients. Is there any server-side daemon that acts as a client that could use utime xlator ? if not, I think we don't need an additional option here. >> Xavi >> >> >>> >>> >>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593542 >>> [2] https://github.com/gluster/glusterfs/issues/473 >>> >>> -- >>> Thanks and Regards, >>> Kotresh H R >>> >> > > -- > Thanks and Regards, > Kotresh H R > -------------- next part -------------- An HTML attachment was scrubbed... URL: From khiremat at redhat.com Tue Jun 18 07:25:44 2019 From: khiremat at redhat.com (Kotresh Hiremath Ravishankar) Date: Tue, 18 Jun 2019 12:55:44 +0530 Subject: [Gluster-devel] Solving Ctime Issue with legacy files [BUG 1593542] In-Reply-To: References: Message-ID: Hi Xavi, On Tue, Jun 18, 2019 at 12:28 PM Xavi Hernandez wrote: > Hi Kotresh, > > On Tue, Jun 18, 2019 at 8:33 AM Kotresh Hiremath Ravishankar < > khiremat at redhat.com> wrote: > >> Hi Xavi, >> >> Reply inline. 
>> >> On Mon, Jun 17, 2019 at 5:38 PM Xavi Hernandez >> wrote: >> >>> Hi Kotresh, >>> >>> On Mon, Jun 17, 2019 at 1:50 PM Kotresh Hiremath Ravishankar < >>> khiremat at redhat.com> wrote: >>> >>>> Hi All, >>>> >>>> The ctime feature is enabled by default from release gluster-6. But as >>>> explained in bug [1] there is a known issue with legacy files i.e., the >>>> files which are created before ctime feature is enabled. These files would >>>> not have "trusted.glusterfs.mdata" xattr which maintain time attributes. So >>>> on, accessing those files, it gets created with latest time attributes. >>>> This is not correct because all the time attributes (atime, mtime, ctime) >>>> get updated instead of required time attributes. >>>> >>>> There are couple of approaches to solve this. >>>> >>>> 1. On accessing the files, let the posix update the time attributes >>>> from the back end file on respective replicas. This obviously results in >>>> inconsistent "trusted.glusterfs.mdata" xattr values with in replica set. >>>> AFR/EC should heal this xattr as part of metadata heal upon accessing this >>>> file. It can chose to replicate from any subvolume. Ideally we should >>>> consider the highest time from the replica and treat it as source but I >>>> think that should be fine as replica time attributes are mostly in sync >>>> with max difference in order of few seconds if am not wrong. >>>> >>>> But client side self heal is disabled by default because of >>>> performance reasons [2]. If we chose to go by this approach, we need to >>>> consider enabling at least client side metadata self heal by default. >>>> Please share your thoughts on enabling the same by default. >>>> >>>> 2. Don't let posix update the legacy files from the backend. On lookup >>>> cbk, let the utime xlator update the time attributes from statbuf received >>>> synchronously. >>>> >>>> Both approaches are similar as both results in updating the xattr >>>> during lookup. Please share your inputs on which approach is better. >>>> >>> >>> I prefer second approach. First approach is not feasible for EC volumes >>> because self-heal requires that k bricks (on a k+r configuration) agree on >>> the value of this xattr, otherwise it considers the metadata damaged and >>> needs manual intervention to fix it. During upgrade, first r bricks with be >>> upgraded without problems, but trusted.glusterfs.mdata won't be healed >>> because r < k. In fact this xattr will be removed from new bricks because >>> the majority of bricks agree on xattr not being present. Once the r+1 brick >>> is upgraded, it's possible that posix sets different values for >>> trusted.glusterfs.mdata, which will cause self-heal to fail. >>> >>> Second approach seems better to me if guarded by a new option that >>> enables this behavior. utime xlator should only update the mdata xattr if >>> that option is set, and that option should only be settable once all nodes >>> have been upgraded (controlled by op-version). In this situation the first >>> lookup on a file where utime detects that mdata is not set, will require a >>> synchronous update. I think this is good enough because it will only happen >>> once per file. We'll need to consider cases where different clients do >>> lookups at the same time, but I think this can be easily solved by ignoring >>> the request if mdata is already present. >>> >> >> Initially there were two issues. >> 1. Upgrade Issue with EC Volume as described by you. >> This is solved with the patch [1]. 
There was a bug in ctime >> posix where it was creating xattr even when ctime is not set on client >> (during utimes system call). With patch [1], the behavior >> is that utimes system call will only update the >> "trusted.glusterfs.mdata" xattr if present else it won't create. The new >> xattr creation should only happen during entry operations (i.e create, >> mknod and others). >> So there won't be any problems with upgrade. I think we don't need new >> option dependent on op version if I am not wrong. >> > > If I'm not missing something, we cannot allow creation of mdata xattr even > for create/mknod/setattr fops. Doing so could cause the same problem if > some of the bricks are not upgraded and do not support mdata yet (or they > have ctime disabled by default). > Yes, that's right, even create/mknod and other fops won't create xattr if client doesn't set ctime (holds good for older clients). I have commented in the patch [1]. All other fops where xattr gets created as the check that if ctime is not set, don't create. It was missed only in utime syscall. And hence caused upgrade issues. > > >> 2. After upgrade, how do we update "trusted.glusterfs.mdata" xattr. >> This mail thread was for this. Here which approach is better? I >> understand from EC point of view the second approach is the best one. The >> question I had was, Can't EC treat 'trusted.glusterfs.mdata' >> as special xattr and add the logic to heal it from one subvolume >> (i.e. to remove the requirement of having to have consistent data on k >> subvolumes in k+r configuration). >> > > Yes, we can do that. But this would require a newer client with support > for this new xattr, which won't be possible during an upgrade, where bricks > are upgraded before the clients. So, even if we add this intelligence to > the client, the upgrade process is still broken. Only consideration here is > if we can rely on self-heal daemon being on the server side (and thus > upgraded at the same time than the server) to ensure that files can really > be healed even if other bricks/shd daemons are not yet updated. Not sure if > it could work, but anyway I don't like it very much. > > >> >> Second approach is independent of AFR and EC. So if we chose >> this, do we need new option to guard? If the upgrade steps is to upgrade >> server first and then client, we don't need to guard I think? >> > > I think you are right for regular clients. Is there any server-side daemon > that acts as a client that could use utime xlator ? if not, I think we > don't need an additional option here. > No, no other server side daemon has utime xlator loaded. [1] https://review.gluster.org/#/c/glusterfs/+/22858/ > >>> Xavi >>> >>> >>>> >>>> >>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593542 >>>> [2] https://github.com/gluster/glusterfs/issues/473 >>>> >>>> -- >>>> Thanks and Regards, >>>> Kotresh H R >>>> >>> >> >> -- >> Thanks and Regards, >> Kotresh H R >> > -- Thanks and Regards, Kotresh H R -------------- next part -------------- An HTML attachment was scrubbed... URL: From mscherer at redhat.com Tue Jun 18 07:51:53 2019 From: mscherer at redhat.com (Michael Scherer) Date: Tue, 18 Jun 2019 09:51:53 +0200 Subject: [Gluster-devel] Upgrade of some builder nodes to F30 Message-ID: Hi, as per request (and since F28 is EOL or soon to be), I will give a try at adding a Fedora 30 node to the build cluster. So if you see anything suspicious on the builder49 node once it is back, please tell. 
The plan is to add the node, then replay some job on it to see if they are ok, then add a few more nodes, and switch the job for good, rince, repeat. For now, the system is blocked at step 1: "fix our playbook broken yet again by a ansible upgrade". -- Michael Scherer Sysadmin, Community Infrastructure -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part URL: From atumball at redhat.com Thu Jun 20 06:06:46 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Thu, 20 Jun 2019 11:36:46 +0530 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> Message-ID: Considering python3 is anyways the future, I vote for taking the patch we did in master for fixing regression tests with python3 into the release-6 and release-5 branch and getting over this deadlock. Patch in discussion here is https://review.gluster.org/#/c/glusterfs/+/22829/ and if anyone notices, it changes only the files inside 'tests/' directory, which is not packaged in a release anyways. Hari, can we get the backport of this patch to both the release branches? Regards, Amar On Thu, Jun 13, 2019 at 7:26 PM Michael Scherer wrote: > Le jeudi 13 juin 2019 ? 14:28 +0200, Niels de Vos a ?crit : > > On Thu, Jun 13, 2019 at 11:08:25AM +0200, Niels de Vos wrote: > > > On Wed, Jun 12, 2019 at 04:09:55PM -0700, Kaleb Keithley wrote: > > > > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley < > > > > kkeithle at redhat.com> wrote: > > > > > > > > > > > > > > On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < > > > > > atumball at redhat.com> wrote: > > > > > > > > > > > > > > > > > We recently noticed that in one of the package update on > > > > > > builder (ie, > > > > > > centos7.x machines), python3.6 got installed as a dependency. > > > > > > So, yes, it > > > > > > is possible to have python3 in centos7 now. > > > > > > > > > > > > > > > > EPEL updated from python34 to python36 recently, but C7 doesn't > > > > > have > > > > > python3 in the base. I don't think we've ever used EPEL > > > > > packages for > > > > > building. > > > > > > > > > > And GlusterFS-5 isn't python3 ready. > > > > > > > > > > > > > Correction: GlusterFS-5 is mostly or completely python3 > > > > ready. FWIW, > > > > python33 is available on both RHEL7 and CentOS7 from the Software > > > > Collection Library (SCL), and python34 and now python36 are > > > > available from > > > > EPEL. > > > > > > > > But packages built for the CentOS Storage SIG have never used the > > > > SCL or > > > > EPEL (EPEL not allowed) and the shebangs in the .py files are > > > > converted > > > > from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild > > > > %prep stage. > > > > All the python dependencies for the packages remain the python2 > > > > flavors. > > > > AFAIK the centos-regression machines ought to be building the > > > > same way. > > > > > > Indeed, there should not be a requirement on having EPEL enabled on > > > the > > > CentOS-7 builders. At least not for the building of the glusterfs > > > tarball. 
We still need to do releases of glusterfs-4.1 and > > > glusterfs-5, > > > until then it is expected to have python2 as the (only?) version > > > for the > > > system. Is it possible to remove python3 from the CentOS-7 builders > > > and > > > run the jobs that require python3 on the Fedora builders instead? > > > > Actually, if the python-devel package for python3 is installed on the > > CentOS-7 builders, things may work too. It still feels like some sort > > of > > Frankenstein deployment, and we don't expect to this see in > > production > > environments. But maybe this is a workaround in case something > > really, > > really, REALLY depends on python3 on the builders. > > To be honest, people would be surprised what happen in production > around (sysadmins tend to discuss around, we all have horrors stories, > stuff that were supposed to be cleaned and wasn't, etc) > > After all, "frankenstein deployment now" is better than "perfect > later", especially since lots of IT departements are under constant > pressure (so that's more "perfect never"). > > I can understand that we want clean and simple code (who doesn't), but > real life is much messier than we want to admit, so we need something > robust. > > -- > Michael Scherer > Sysadmin, Community Infrastructure > > > > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgowtham at redhat.com Thu Jun 20 07:07:31 2019 From: hgowtham at redhat.com (Hari Gowtham) Date: Thu, 20 Jun 2019 12:37:31 +0530 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> Message-ID: Hi Amar, I have done the above request earlier with release 5 and still it fails. Patch: https://review.gluster.org/#/c/glusterfs/+/22855/ build log for failure: https://build.gluster.org/job/strfmt_errors/18889/artifact/RPMS/el6/i686/build.log The failure is related to building. So we need to fix the python 3 compatibility issues with release 5 as well. The build uses python3 and gluster relies on python2. I'm not sure if the patches to make gluster python3 compatible have made its way to release 5 and 6. If not then we have to work on that and make the build changes necessary to start consuming python3 for release branches. Or we have to make the build script smarter to use python 2 for release branches and python 3 for master. On Thu, Jun 20, 2019 at 11:38 AM Amar Tumballi Suryanarayan wrote: > > > Considering python3 is anyways the future, I vote for taking the patch we did in master for fixing regression tests with python3 into the release-6 and release-5 branch and getting over this deadlock. > > Patch in discussion here is https://review.gluster.org/#/c/glusterfs/+/22829/ and if anyone notices, it changes only the files inside 'tests/' directory, which is not packaged in a release anyways. > > Hari, can we get the backport of this patch to both the release branches? 
> > Regards, > Amar > > On Thu, Jun 13, 2019 at 7:26 PM Michael Scherer wrote: >> >> Le jeudi 13 juin 2019 ? 14:28 +0200, Niels de Vos a ?crit : >> > On Thu, Jun 13, 2019 at 11:08:25AM +0200, Niels de Vos wrote: >> > > On Wed, Jun 12, 2019 at 04:09:55PM -0700, Kaleb Keithley wrote: >> > > > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley < >> > > > kkeithle at redhat.com> wrote: >> > > > >> > > > > >> > > > > On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < >> > > > > atumball at redhat.com> wrote: >> > > > > >> > > > > > >> > > > > > We recently noticed that in one of the package update on >> > > > > > builder (ie, >> > > > > > centos7.x machines), python3.6 got installed as a dependency. >> > > > > > So, yes, it >> > > > > > is possible to have python3 in centos7 now. >> > > > > > >> > > > > >> > > > > EPEL updated from python34 to python36 recently, but C7 doesn't >> > > > > have >> > > > > python3 in the base. I don't think we've ever used EPEL >> > > > > packages for >> > > > > building. >> > > > > >> > > > > And GlusterFS-5 isn't python3 ready. >> > > > > >> > > > >> > > > Correction: GlusterFS-5 is mostly or completely python3 >> > > > ready. FWIW, >> > > > python33 is available on both RHEL7 and CentOS7 from the Software >> > > > Collection Library (SCL), and python34 and now python36 are >> > > > available from >> > > > EPEL. >> > > > >> > > > But packages built for the CentOS Storage SIG have never used the >> > > > SCL or >> > > > EPEL (EPEL not allowed) and the shebangs in the .py files are >> > > > converted >> > > > from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild >> > > > %prep stage. >> > > > All the python dependencies for the packages remain the python2 >> > > > flavors. >> > > > AFAIK the centos-regression machines ought to be building the >> > > > same way. >> > > >> > > Indeed, there should not be a requirement on having EPEL enabled on >> > > the >> > > CentOS-7 builders. At least not for the building of the glusterfs >> > > tarball. We still need to do releases of glusterfs-4.1 and >> > > glusterfs-5, >> > > until then it is expected to have python2 as the (only?) version >> > > for the >> > > system. Is it possible to remove python3 from the CentOS-7 builders >> > > and >> > > run the jobs that require python3 on the Fedora builders instead? >> > >> > Actually, if the python-devel package for python3 is installed on the >> > CentOS-7 builders, things may work too. It still feels like some sort >> > of >> > Frankenstein deployment, and we don't expect to this see in >> > production >> > environments. But maybe this is a workaround in case something >> > really, >> > really, REALLY depends on python3 on the builders. >> >> To be honest, people would be surprised what happen in production >> around (sysadmins tend to discuss around, we all have horrors stories, >> stuff that were supposed to be cleaned and wasn't, etc) >> >> After all, "frankenstein deployment now" is better than "perfect >> later", especially since lots of IT departements are under constant >> pressure (so that's more "perfect never"). >> >> I can understand that we want clean and simple code (who doesn't), but >> real life is much messier than we want to admit, so we need something >> robust. 
>> >> -- >> Michael Scherer >> Sysadmin, Community Infrastructure >> >> >> >> _______________________________________________ >> >> Community Meeting Calendar: >> >> APAC Schedule - >> Every 2nd and 4th Tuesday at 11:30 AM IST >> Bridge: https://bluejeans.com/836554017 >> >> NA/EMEA Schedule - >> Every 1st and 3rd Tuesday at 01:00 PM EDT >> Bridge: https://bluejeans.com/486278655 >> >> Gluster-devel mailing list >> Gluster-devel at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-devel >> > > > -- > Amar Tumballi (amarts) > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > -- Regards, Hari Gowtham. From ndevos at redhat.com Thu Jun 20 07:43:35 2019 From: ndevos at redhat.com (Niels de Vos) Date: Thu, 20 Jun 2019 09:43:35 +0200 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> Message-ID: <20190620074335.GA12566@ndevos-x270> On Thu, Jun 20, 2019 at 11:36:46AM +0530, Amar Tumballi Suryanarayan wrote: > Considering python3 is anyways the future, I vote for taking the patch we > did in master for fixing regression tests with python3 into the release-6 > and release-5 branch and getting over this deadlock. > > Patch in discussion here is > https://review.gluster.org/#/c/glusterfs/+/22829/ and if anyone notices, it > changes only the files inside 'tests/' directory, which is not packaged in > a release anyways. > > Hari, can we get the backport of this patch to both the release branches? When going this route, you still need to make sure that the python3-devel package is available on the CentOS-7 builders. And I don't know if installing that package is already sufficient, maybe the backport is not even needed in that case. Niels > > Regards, > Amar > > On Thu, Jun 13, 2019 at 7:26 PM Michael Scherer wrote: > > > Le jeudi 13 juin 2019 ? 14:28 +0200, Niels de Vos a ?crit : > > > On Thu, Jun 13, 2019 at 11:08:25AM +0200, Niels de Vos wrote: > > > > On Wed, Jun 12, 2019 at 04:09:55PM -0700, Kaleb Keithley wrote: > > > > > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley < > > > > > kkeithle at redhat.com> wrote: > > > > > > > > > > > > > > > > > On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < > > > > > > atumball at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > We recently noticed that in one of the package update on > > > > > > > builder (ie, > > > > > > > centos7.x machines), python3.6 got installed as a dependency. > > > > > > > So, yes, it > > > > > > > is possible to have python3 in centos7 now. > > > > > > > > > > > > > > > > > > > EPEL updated from python34 to python36 recently, but C7 doesn't > > > > > > have > > > > > > python3 in the base. I don't think we've ever used EPEL > > > > > > packages for > > > > > > building. > > > > > > > > > > > > And GlusterFS-5 isn't python3 ready. > > > > > > > > > > > > > > > > Correction: GlusterFS-5 is mostly or completely python3 > > > > > ready. 
FWIW, > > > > > python33 is available on both RHEL7 and CentOS7 from the Software > > > > > Collection Library (SCL), and python34 and now python36 are > > > > > available from > > > > > EPEL. > > > > > > > > > > But packages built for the CentOS Storage SIG have never used the > > > > > SCL or > > > > > EPEL (EPEL not allowed) and the shebangs in the .py files are > > > > > converted > > > > > from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild > > > > > %prep stage. > > > > > All the python dependencies for the packages remain the python2 > > > > > flavors. > > > > > AFAIK the centos-regression machines ought to be building the > > > > > same way. > > > > > > > > Indeed, there should not be a requirement on having EPEL enabled on > > > > the > > > > CentOS-7 builders. At least not for the building of the glusterfs > > > > tarball. We still need to do releases of glusterfs-4.1 and > > > > glusterfs-5, > > > > until then it is expected to have python2 as the (only?) version > > > > for the > > > > system. Is it possible to remove python3 from the CentOS-7 builders > > > > and > > > > run the jobs that require python3 on the Fedora builders instead? > > > > > > Actually, if the python-devel package for python3 is installed on the > > > CentOS-7 builders, things may work too. It still feels like some sort > > > of > > > Frankenstein deployment, and we don't expect to this see in > > > production > > > environments. But maybe this is a workaround in case something > > > really, > > > really, REALLY depends on python3 on the builders. > > > > To be honest, people would be surprised what happen in production > > around (sysadmins tend to discuss around, we all have horrors stories, > > stuff that were supposed to be cleaned and wasn't, etc) > > > > After all, "frankenstein deployment now" is better than "perfect > > later", especially since lots of IT departements are under constant > > pressure (so that's more "perfect never"). > > > > I can understand that we want clean and simple code (who doesn't), but > > real life is much messier than we want to admit, so we need something > > robust. 
> > > > -- > > Michael Scherer > > Sysadmin, Community Infrastructure > > > > > > > > _______________________________________________ > > > > Community Meeting Calendar: > > > > APAC Schedule - > > Every 2nd and 4th Tuesday at 11:30 AM IST > > Bridge: https://bluejeans.com/836554017 > > > > NA/EMEA Schedule - > > Every 1st and 3rd Tuesday at 01:00 PM EDT > > Bridge: https://bluejeans.com/486278655 > > > > Gluster-devel mailing list > > Gluster-devel at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > > > -- > Amar Tumballi (amarts) From atumball at redhat.com Thu Jun 20 08:41:21 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Thu, 20 Jun 2019 14:11:21 +0530 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: <20190620074335.GA12566@ndevos-x270> References: <20190612133437.GK8725@ndevos-x270> <20190612151142.GL8725@ndevos-x270> <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> <20190620074335.GA12566@ndevos-x270> Message-ID: On Thu, Jun 20, 2019 at 1:13 PM Niels de Vos wrote: > On Thu, Jun 20, 2019 at 11:36:46AM +0530, Amar Tumballi Suryanarayan wrote: > > Considering python3 is anyways the future, I vote for taking the patch we > > did in master for fixing regression tests with python3 into the release-6 > > and release-5 branch and getting over this deadlock. > > > > Patch in discussion here is > > https://review.gluster.org/#/c/glusterfs/+/22829/ and if anyone > notices, it > > changes only the files inside 'tests/' directory, which is not packaged > in > > a release anyways. > > > > Hari, can we get the backport of this patch to both the release branches? > > When going this route, you still need to make sure that the > python3-devel package is available on the CentOS-7 builders. And I > don't know if installing that package is already sufficient, maybe the > backport is not even needed in that case. > > I was thinking, having this patch makes it compatible with both python2 and python3, so technically, it allows us to move to Fedora30 if we need to run regression there. (and CentOS7 with only python2). The above patch made it compatible, not mandatory to have python3. So, treating it as a bug fix. > Niels > > > > > > Regards, > > Amar > > > > On Thu, Jun 13, 2019 at 7:26 PM Michael Scherer > wrote: > > > > > Le jeudi 13 juin 2019 ? 14:28 +0200, Niels de Vos a ?crit : > > > > On Thu, Jun 13, 2019 at 11:08:25AM +0200, Niels de Vos wrote: > > > > > On Wed, Jun 12, 2019 at 04:09:55PM -0700, Kaleb Keithley wrote: > > > > > > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley < > > > > > > kkeithle at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < > > > > > > > atumball at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > We recently noticed that in one of the package update on > > > > > > > > builder (ie, > > > > > > > > centos7.x machines), python3.6 got installed as a dependency. > > > > > > > > So, yes, it > > > > > > > > is possible to have python3 in centos7 now. > > > > > > > > > > > > > > > > > > > > > > EPEL updated from python34 to python36 recently, but C7 doesn't > > > > > > > have > > > > > > > python3 in the base. I don't think we've ever used EPEL > > > > > > > packages for > > > > > > > building. > > > > > > > > > > > > > > And GlusterFS-5 isn't python3 ready. 
> > > > > > > > > > > > > > > > > > > Correction: GlusterFS-5 is mostly or completely python3 > > > > > > ready. FWIW, > > > > > > python33 is available on both RHEL7 and CentOS7 from the Software > > > > > > Collection Library (SCL), and python34 and now python36 are > > > > > > available from > > > > > > EPEL. > > > > > > > > > > > > But packages built for the CentOS Storage SIG have never used the > > > > > > SCL or > > > > > > EPEL (EPEL not allowed) and the shebangs in the .py files are > > > > > > converted > > > > > > from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild > > > > > > %prep stage. > > > > > > All the python dependencies for the packages remain the python2 > > > > > > flavors. > > > > > > AFAIK the centos-regression machines ought to be building the > > > > > > same way. > > > > > > > > > > Indeed, there should not be a requirement on having EPEL enabled on > > > > > the > > > > > CentOS-7 builders. At least not for the building of the glusterfs > > > > > tarball. We still need to do releases of glusterfs-4.1 and > > > > > glusterfs-5, > > > > > until then it is expected to have python2 as the (only?) version > > > > > for the > > > > > system. Is it possible to remove python3 from the CentOS-7 builders > > > > > and > > > > > run the jobs that require python3 on the Fedora builders instead? > > > > > > > > Actually, if the python-devel package for python3 is installed on the > > > > CentOS-7 builders, things may work too. It still feels like some sort > > > > of > > > > Frankenstein deployment, and we don't expect to this see in > > > > production > > > > environments. But maybe this is a workaround in case something > > > > really, > > > > really, REALLY depends on python3 on the builders. > > > > > > To be honest, people would be surprised what happen in production > > > around (sysadmins tend to discuss around, we all have horrors stories, > > > stuff that were supposed to be cleaned and wasn't, etc) > > > > > > After all, "frankenstein deployment now" is better than "perfect > > > later", especially since lots of IT departements are under constant > > > pressure (so that's more "perfect never"). > > > > > > I can understand that we want clean and simple code (who doesn't), but > > > real life is much messier than we want to admit, so we need something > > > robust. > > > > > > -- > > > Michael Scherer > > > Sysadmin, Community Infrastructure > > > > > > > > > > > > _______________________________________________ > > > > > > Community Meeting Calendar: > > > > > > APAC Schedule - > > > Every 2nd and 4th Tuesday at 11:30 AM IST > > > Bridge: https://bluejeans.com/836554017 > > > > > > NA/EMEA Schedule - > > > Every 1st and 3rd Tuesday at 01:00 PM EDT > > > Bridge: https://bluejeans.com/486278655 > > > > > > Gluster-devel mailing list > > > Gluster-devel at gluster.org > > > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > > > > > > > -- > > Amar Tumballi (amarts) > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... 
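For readers wondering what "compatible with both python2 and python3" looks like in practice for scripts under tests/, the fragment below is a generic illustration of that style. It is not an excerpt of the patch itself, only an example of code that runs unchanged under either interpreter.

#!/usr/bin/env python
# Generic python2/python3-neutral style (illustration only, not the actual
# tests/ change): print as a function, plus version-agnostic introspection.
from __future__ import print_function

import sys

def main():
    print("running under python %d.%d" % sys.version_info[:2])
    return 0

if __name__ == "__main__":
    sys.exit(main())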
URL: From ndevos at redhat.com Thu Jun 20 09:05:08 2019 From: ndevos at redhat.com (Niels de Vos) Date: Thu, 20 Jun 2019 11:05:08 +0200 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612151142.GL8725@ndevos-x270> <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> <20190620074335.GA12566@ndevos-x270> Message-ID: <20190620090508.GA13895@ndevos-x270> On Thu, Jun 20, 2019 at 02:11:21PM +0530, Amar Tumballi Suryanarayan wrote: > On Thu, Jun 20, 2019 at 1:13 PM Niels de Vos wrote: > > > On Thu, Jun 20, 2019 at 11:36:46AM +0530, Amar Tumballi Suryanarayan wrote: > > > Considering python3 is anyways the future, I vote for taking the patch we > > > did in master for fixing regression tests with python3 into the release-6 > > > and release-5 branch and getting over this deadlock. > > > > > > Patch in discussion here is > > > https://review.gluster.org/#/c/glusterfs/+/22829/ and if anyone > > notices, it > > > changes only the files inside 'tests/' directory, which is not packaged > > in > > > a release anyways. > > > > > > Hari, can we get the backport of this patch to both the release branches? > > > > When going this route, you still need to make sure that the > > python3-devel package is available on the CentOS-7 builders. And I > > don't know if installing that package is already sufficient, maybe the > > backport is not even needed in that case. > > > > > I was thinking, having this patch makes it compatible with both python2 and > python3, so technically, it allows us to move to Fedora30 if we need to run > regression there. (and CentOS7 with only python2). > > The above patch made it compatible, not mandatory to have python3. So, > treating it as a bug fix. Well, whatever Python is detected (python3 has preference over python2), needs to have the -devel package available too. Detection is done by probing the python executable. The Matching header files from -devel need to be present in order to be able to build glupy (and others?). I do not think compatibility for python3/2 is the problem while building the tarball. The backport might become relevant while running tests on environments where there is no python2. Niels > > > > Niels > > > > > > > > > > Regards, > > > Amar > > > > > > On Thu, Jun 13, 2019 at 7:26 PM Michael Scherer > > wrote: > > > > > > > Le jeudi 13 juin 2019 ? 14:28 +0200, Niels de Vos a ?crit : > > > > > On Thu, Jun 13, 2019 at 11:08:25AM +0200, Niels de Vos wrote: > > > > > > On Wed, Jun 12, 2019 at 04:09:55PM -0700, Kaleb Keithley wrote: > > > > > > > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley < > > > > > > > kkeithle at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi Suryanarayan < > > > > > > > > atumball at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > We recently noticed that in one of the package update on > > > > > > > > > builder (ie, > > > > > > > > > centos7.x machines), python3.6 got installed as a dependency. > > > > > > > > > So, yes, it > > > > > > > > > is possible to have python3 in centos7 now. > > > > > > > > > > > > > > > > > > > > > > > > > EPEL updated from python34 to python36 recently, but C7 doesn't > > > > > > > > have > > > > > > > > python3 in the base. I don't think we've ever used EPEL > > > > > > > > packages for > > > > > > > > building. > > > > > > > > > > > > > > > > And GlusterFS-5 isn't python3 ready. 
> > > > > > > > > > > > > > > > > > > > > > Correction: GlusterFS-5 is mostly or completely python3 > > > > > > > ready. FWIW, > > > > > > > python33 is available on both RHEL7 and CentOS7 from the Software > > > > > > > Collection Library (SCL), and python34 and now python36 are > > > > > > > available from > > > > > > > EPEL. > > > > > > > > > > > > > > But packages built for the CentOS Storage SIG have never used the > > > > > > > SCL or > > > > > > > EPEL (EPEL not allowed) and the shebangs in the .py files are > > > > > > > converted > > > > > > > from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild > > > > > > > %prep stage. > > > > > > > All the python dependencies for the packages remain the python2 > > > > > > > flavors. > > > > > > > AFAIK the centos-regression machines ought to be building the > > > > > > > same way. > > > > > > > > > > > > Indeed, there should not be a requirement on having EPEL enabled on > > > > > > the > > > > > > CentOS-7 builders. At least not for the building of the glusterfs > > > > > > tarball. We still need to do releases of glusterfs-4.1 and > > > > > > glusterfs-5, > > > > > > until then it is expected to have python2 as the (only?) version > > > > > > for the > > > > > > system. Is it possible to remove python3 from the CentOS-7 builders > > > > > > and > > > > > > run the jobs that require python3 on the Fedora builders instead? > > > > > > > > > > Actually, if the python-devel package for python3 is installed on the > > > > > CentOS-7 builders, things may work too. It still feels like some sort > > > > > of > > > > > Frankenstein deployment, and we don't expect to this see in > > > > > production > > > > > environments. But maybe this is a workaround in case something > > > > > really, > > > > > really, REALLY depends on python3 on the builders. > > > > > > > > To be honest, people would be surprised what happen in production > > > > around (sysadmins tend to discuss around, we all have horrors stories, > > > > stuff that were supposed to be cleaned and wasn't, etc) > > > > > > > > After all, "frankenstein deployment now" is better than "perfect > > > > later", especially since lots of IT departements are under constant > > > > pressure (so that's more "perfect never"). > > > > > > > > I can understand that we want clean and simple code (who doesn't), but > > > > real life is much messier than we want to admit, so we need something > > > > robust. 
> > > > > > > > -- > > > > Michael Scherer > > > > Sysadmin, Community Infrastructure > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > Community Meeting Calendar: > > > > > > > > APAC Schedule - > > > > Every 2nd and 4th Tuesday at 11:30 AM IST > > > > Bridge: https://bluejeans.com/836554017 > > > > > > > > NA/EMEA Schedule - > > > > Every 1st and 3rd Tuesday at 01:00 PM EDT > > > > Bridge: https://bluejeans.com/486278655 > > > > > > > > Gluster-devel mailing list > > > > Gluster-devel at gluster.org > > > > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > > > > > > > > > > > -- > > > Amar Tumballi (amarts) > > > > > -- > Amar Tumballi (amarts) From atumball at redhat.com Thu Jun 20 09:26:51 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Thu, 20 Jun 2019 14:56:51 +0530 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: <20190620090508.GA13895@ndevos-x270> References: <20190612151142.GL8725@ndevos-x270> <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> <20190620074335.GA12566@ndevos-x270> <20190620090508.GA13895@ndevos-x270> Message-ID: On Thu, Jun 20, 2019 at 2:35 PM Niels de Vos wrote: > On Thu, Jun 20, 2019 at 02:11:21PM +0530, Amar Tumballi Suryanarayan wrote: > > On Thu, Jun 20, 2019 at 1:13 PM Niels de Vos wrote: > > > > > On Thu, Jun 20, 2019 at 11:36:46AM +0530, Amar Tumballi Suryanarayan > wrote: > > > > Considering python3 is anyways the future, I vote for taking the > patch we > > > > did in master for fixing regression tests with python3 into the > release-6 > > > > and release-5 branch and getting over this deadlock. > > > > > > > > Patch in discussion here is > > > > https://review.gluster.org/#/c/glusterfs/+/22829/ and if anyone > > > notices, it > > > > changes only the files inside 'tests/' directory, which is not > packaged > > > in > > > > a release anyways. > > > > > > > > Hari, can we get the backport of this patch to both the release > branches? > > > > > > When going this route, you still need to make sure that the > > > python3-devel package is available on the CentOS-7 builders. And I > > > don't know if installing that package is already sufficient, maybe the > > > backport is not even needed in that case. > > > > > > > > I was thinking, having this patch makes it compatible with both python2 > and > > python3, so technically, it allows us to move to Fedora30 if we need to > run > > regression there. (and CentOS7 with only python2). > > > > The above patch made it compatible, not mandatory to have python3. So, > > treating it as a bug fix. > > Well, whatever Python is detected (python3 has preference over python2), > needs to have the -devel package available too. Detection is done by > probing the python executable. The Matching header files from -devel > need to be present in order to be able to build glupy (and others?). > > I do not think compatibility for python3/2 is the problem while > building the tarball. Got it! True. Compatibility is not the problem to build the tarball. I noticed the issue of smoke is coming only from strfmt-errors job, which checks for 'epel-6-i386' mock, and fails right now. The backport might become relevant while running > tests on environments where there is no python2. > > Backport is very important if we are running in a system where we have only python3. Hence my proposal to include it in releases. 
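To make the detection order quoted above concrete ("python3 has preference over python2", probed via the executable), here is a rough stand-in written for python3. The real probing is done by the autoconf checks in configure, not by a script like this; the point is simply that whichever interpreter wins, its matching -devel headers must also be present.

#!/usr/bin/env python3
# Illustration only: mimic the "python3 preferred, python2 fallback" probe
# order described in the quoted reply. Real detection lives in the build
# system (configure), not here.
import shutil

def detect_python():
    for candidate in ("python3", "python2", "python"):
        path = shutil.which(candidate)
        if path:
            return candidate, path
    raise RuntimeError("no python interpreter found in PATH")

if __name__ == "__main__":
    name, path = detect_python()
    print("would build against %s at %s (install its -devel headers too)"
          % (name, path))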
But we are stuck with strfmt-errors job right now, and looking at what it was intended to catch in first place, mostly our https://build.gluster.org/job/32-bit-build-smoke/ would be doing same. If that is the case, we can remove the job altogether. Also note, this job is known to fail many smokes with 'Build root is locked by another process' errors. Would be great if disabling strfmt-errors is an option. Regards, > Niels > > > > > > > > > Niels > > > > > > > > > > > > > > Regards, > > > > Amar > > > > > > > > On Thu, Jun 13, 2019 at 7:26 PM Michael Scherer > > > > wrote: > > > > > > > > > Le jeudi 13 juin 2019 ? 14:28 +0200, Niels de Vos a ?crit : > > > > > > On Thu, Jun 13, 2019 at 11:08:25AM +0200, Niels de Vos wrote: > > > > > > > On Wed, Jun 12, 2019 at 04:09:55PM -0700, Kaleb Keithley wrote: > > > > > > > > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley < > > > > > > > > kkeithle at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi > Suryanarayan < > > > > > > > > > atumball at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We recently noticed that in one of the package update on > > > > > > > > > > builder (ie, > > > > > > > > > > centos7.x machines), python3.6 got installed as a > dependency. > > > > > > > > > > So, yes, it > > > > > > > > > > is possible to have python3 in centos7 now. > > > > > > > > > > > > > > > > > > > > > > > > > > > > EPEL updated from python34 to python36 recently, but C7 > doesn't > > > > > > > > > have > > > > > > > > > python3 in the base. I don't think we've ever used EPEL > > > > > > > > > packages for > > > > > > > > > building. > > > > > > > > > > > > > > > > > > And GlusterFS-5 isn't python3 ready. > > > > > > > > > > > > > > > > > > > > > > > > > Correction: GlusterFS-5 is mostly or completely python3 > > > > > > > > ready. FWIW, > > > > > > > > python33 is available on both RHEL7 and CentOS7 from the > Software > > > > > > > > Collection Library (SCL), and python34 and now python36 are > > > > > > > > available from > > > > > > > > EPEL. > > > > > > > > > > > > > > > > But packages built for the CentOS Storage SIG have never > used the > > > > > > > > SCL or > > > > > > > > EPEL (EPEL not allowed) and the shebangs in the .py files are > > > > > > > > converted > > > > > > > > from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild > > > > > > > > %prep stage. > > > > > > > > All the python dependencies for the packages remain the > python2 > > > > > > > > flavors. > > > > > > > > AFAIK the centos-regression machines ought to be building the > > > > > > > > same way. > > > > > > > > > > > > > > Indeed, there should not be a requirement on having EPEL > enabled on > > > > > > > the > > > > > > > CentOS-7 builders. At least not for the building of the > glusterfs > > > > > > > tarball. We still need to do releases of glusterfs-4.1 and > > > > > > > glusterfs-5, > > > > > > > until then it is expected to have python2 as the (only?) > version > > > > > > > for the > > > > > > > system. Is it possible to remove python3 from the CentOS-7 > builders > > > > > > > and > > > > > > > run the jobs that require python3 on the Fedora builders > instead? > > > > > > > > > > > > Actually, if the python-devel package for python3 is installed > on the > > > > > > CentOS-7 builders, things may work too. 
It still feels like some > sort > > > > > > of > > > > > > Frankenstein deployment, and we don't expect to this see in > > > > > > production > > > > > > environments. But maybe this is a workaround in case something > > > > > > really, > > > > > > really, REALLY depends on python3 on the builders. > > > > > > > > > > To be honest, people would be surprised what happen in production > > > > > around (sysadmins tend to discuss around, we all have horrors > stories, > > > > > stuff that were supposed to be cleaned and wasn't, etc) > > > > > > > > > > After all, "frankenstein deployment now" is better than "perfect > > > > > later", especially since lots of IT departements are under constant > > > > > pressure (so that's more "perfect never"). > > > > > > > > > > I can understand that we want clean and simple code (who doesn't), > but > > > > > real life is much messier than we want to admit, so we need > something > > > > > robust. > > > > > > > > > > -- > > > > > Michael Scherer > > > > > Sysadmin, Community Infrastructure > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > Community Meeting Calendar: > > > > > > > > > > APAC Schedule - > > > > > Every 2nd and 4th Tuesday at 11:30 AM IST > > > > > Bridge: https://bluejeans.com/836554017 > > > > > > > > > > NA/EMEA Schedule - > > > > > Every 1st and 3rd Tuesday at 01:00 PM EDT > > > > > Bridge: https://bluejeans.com/486278655 > > > > > > > > > > Gluster-devel mailing list > > > > > Gluster-devel at gluster.org > > > > > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > > > > > > > > > > > > > > > -- > > > > Amar Tumballi (amarts) > > > > > > > > > -- > > Amar Tumballi (amarts) > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndevos at redhat.com Thu Jun 20 09:49:01 2019 From: ndevos at redhat.com (Niels de Vos) Date: Thu, 20 Jun 2019 11:49:01 +0200 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> <20190620074335.GA12566@ndevos-x270> <20190620090508.GA13895@ndevos-x270> Message-ID: <20190620094901.GB13895@ndevos-x270> On Thu, Jun 20, 2019 at 02:56:51PM +0530, Amar Tumballi Suryanarayan wrote: > On Thu, Jun 20, 2019 at 2:35 PM Niels de Vos wrote: > > > On Thu, Jun 20, 2019 at 02:11:21PM +0530, Amar Tumballi Suryanarayan wrote: > > > On Thu, Jun 20, 2019 at 1:13 PM Niels de Vos wrote: > > > > > > > On Thu, Jun 20, 2019 at 11:36:46AM +0530, Amar Tumballi Suryanarayan > > wrote: > > > > > Considering python3 is anyways the future, I vote for taking the > > patch we > > > > > did in master for fixing regression tests with python3 into the > > release-6 > > > > > and release-5 branch and getting over this deadlock. > > > > > > > > > > Patch in discussion here is > > > > > https://review.gluster.org/#/c/glusterfs/+/22829/ and if anyone > > > > notices, it > > > > > changes only the files inside 'tests/' directory, which is not > > packaged > > > > in > > > > > a release anyways. > > > > > > > > > > Hari, can we get the backport of this patch to both the release > > branches? > > > > > > > > When going this route, you still need to make sure that the > > > > python3-devel package is available on the CentOS-7 builders. And I > > > > don't know if installing that package is already sufficient, maybe the > > > > backport is not even needed in that case. 
> > > > > > > > > > > I was thinking, having this patch makes it compatible with both python2 > > and > > > python3, so technically, it allows us to move to Fedora30 if we need to > > run > > > regression there. (and CentOS7 with only python2). > > > > > > The above patch made it compatible, not mandatory to have python3. So, > > > treating it as a bug fix. > > > > Well, whatever Python is detected (python3 has preference over python2), > > needs to have the -devel package available too. Detection is done by > > probing the python executable. The Matching header files from -devel > > need to be present in order to be able to build glupy (and others?). > > > > I do not think compatibility for python3/2 is the problem while > > building the tarball. > > > Got it! True. Compatibility is not the problem to build the tarball. > > I noticed the issue of smoke is coming only from strfmt-errors job, which > checks for 'epel-6-i386' mock, and fails right now. > > The backport might become relevant while running > > tests on environments where there is no python2. > > > > > Backport is very important if we are running in a system where we have only > python3. Hence my proposal to include it in releases. I am sure CentOS-7 still has python2. The newer python3 only gets pulled in by some additional packages that get installed from EPEL. > But we are stuck with strfmt-errors job right now, and looking at what it > was intended to catch in first place, mostly our > https://build.gluster.org/job/32-bit-build-smoke/ would be doing same. If > that is the case, we can remove the job altogether. Also note, this job is > known to fail many smokes with 'Build root is locked by another process' > errors. This error means that there are multiple concurrent jobs running 'mock' with this buildroot. That should not happen and is a configuration error in one or more Jenkins jobs. > Would be great if disabling strfmt-errors is an option. I think both jobs do different things. The smoke is functional, where as strfmt-errors catches incorrect string formatting (some maintainers assume always 64-bit, everywhere) that has been missed in reviews. Niels > > Regards, > > > Niels > > > > > > > > > > > > > > Niels > > > > > > > > > > > > > > > > > > Regards, > > > > > Amar > > > > > > > > > > On Thu, Jun 13, 2019 at 7:26 PM Michael Scherer > > > > > > wrote: > > > > > > > > > > > Le jeudi 13 juin 2019 ? 14:28 +0200, Niels de Vos a ?crit : > > > > > > > On Thu, Jun 13, 2019 at 11:08:25AM +0200, Niels de Vos wrote: > > > > > > > > On Wed, Jun 12, 2019 at 04:09:55PM -0700, Kaleb Keithley wrote: > > > > > > > > > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley < > > > > > > > > > kkeithle at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi > > Suryanarayan < > > > > > > > > > > atumball at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We recently noticed that in one of the package update on > > > > > > > > > > > builder (ie, > > > > > > > > > > > centos7.x machines), python3.6 got installed as a > > dependency. > > > > > > > > > > > So, yes, it > > > > > > > > > > > is possible to have python3 in centos7 now. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > EPEL updated from python34 to python36 recently, but C7 > > doesn't > > > > > > > > > > have > > > > > > > > > > python3 in the base. I don't think we've ever used EPEL > > > > > > > > > > packages for > > > > > > > > > > building. 
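The string-formatting problem Niels describes above (format strings that are only correct when 'long' is 64 bits wide) is the kind of thing a 32-bit build trips over. A short, hypothetical C snippet, not taken from the gluster tree, shows the failure mode and the portable fix:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    uint64_t written = 5368709120ULL;  /* 5 GiB, does not fit in 32 bits */
    char msg[64];

    /* Wrong on 32-bit: "%lu" expects 'unsigned long', which is only 32 bits
     * wide on i686/armv7hl, so gcc -Wformat flags the mismatch there:
     *
     *   snprintf(msg, sizeof(msg), "wrote %lu bytes", written);
     */

    /* Portable: PRIu64 expands to the right conversion on every arch. */
    snprintf(msg, sizeof(msg), "wrote %" PRIu64 " bytes", written);
    puts(msg);
    return 0;
}

Compiling such a file with 'gcc -m32 -Wall -Werror=format' is a rough local approximation of what the 32-bit jobs catch.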
> > > > > > > > > > > > > > > > > > > > And GlusterFS-5 isn't python3 ready. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Correction: GlusterFS-5 is mostly or completely python3 > > > > > > > > > ready. FWIW, > > > > > > > > > python33 is available on both RHEL7 and CentOS7 from the > > Software > > > > > > > > > Collection Library (SCL), and python34 and now python36 are > > > > > > > > > available from > > > > > > > > > EPEL. > > > > > > > > > > > > > > > > > > But packages built for the CentOS Storage SIG have never > > used the > > > > > > > > > SCL or > > > > > > > > > EPEL (EPEL not allowed) and the shebangs in the .py files are > > > > > > > > > converted > > > > > > > > > from /usr/bin/python3 to /usr/bin/python2 during the rpmbuild > > > > > > > > > %prep stage. > > > > > > > > > All the python dependencies for the packages remain the > > python2 > > > > > > > > > flavors. > > > > > > > > > AFAIK the centos-regression machines ought to be building the > > > > > > > > > same way. > > > > > > > > > > > > > > > > Indeed, there should not be a requirement on having EPEL > > enabled on > > > > > > > > the > > > > > > > > CentOS-7 builders. At least not for the building of the > > glusterfs > > > > > > > > tarball. We still need to do releases of glusterfs-4.1 and > > > > > > > > glusterfs-5, > > > > > > > > until then it is expected to have python2 as the (only?) > > version > > > > > > > > for the > > > > > > > > system. Is it possible to remove python3 from the CentOS-7 > > builders > > > > > > > > and > > > > > > > > run the jobs that require python3 on the Fedora builders > > instead? > > > > > > > > > > > > > > Actually, if the python-devel package for python3 is installed > > on the > > > > > > > CentOS-7 builders, things may work too. It still feels like some > > sort > > > > > > > of > > > > > > > Frankenstein deployment, and we don't expect to this see in > > > > > > > production > > > > > > > environments. But maybe this is a workaround in case something > > > > > > > really, > > > > > > > really, REALLY depends on python3 on the builders. > > > > > > > > > > > > To be honest, people would be surprised what happen in production > > > > > > around (sysadmins tend to discuss around, we all have horrors > > stories, > > > > > > stuff that were supposed to be cleaned and wasn't, etc) > > > > > > > > > > > > After all, "frankenstein deployment now" is better than "perfect > > > > > > later", especially since lots of IT departements are under constant > > > > > > pressure (so that's more "perfect never"). > > > > > > > > > > > > I can understand that we want clean and simple code (who doesn't), > > but > > > > > > real life is much messier than we want to admit, so we need > > something > > > > > > robust. 
> > > > > > > > > > > > -- > > > > > > Michael Scherer > > > > > > Sysadmin, Community Infrastructure > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > > Community Meeting Calendar: > > > > > > > > > > > > APAC Schedule - > > > > > > Every 2nd and 4th Tuesday at 11:30 AM IST > > > > > > Bridge: https://bluejeans.com/836554017 > > > > > > > > > > > > NA/EMEA Schedule - > > > > > > Every 1st and 3rd Tuesday at 01:00 PM EDT > > > > > > Bridge: https://bluejeans.com/486278655 > > > > > > > > > > > > Gluster-devel mailing list > > > > > > Gluster-devel at gluster.org > > > > > > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Amar Tumballi (amarts) > > > > > > > > > > > > > -- > > > Amar Tumballi (amarts) > > > > > -- > Amar Tumballi (amarts) From dkhandel at redhat.com Thu Jun 20 10:12:06 2019 From: dkhandel at redhat.com (Deepshikha Khandelwal) Date: Thu, 20 Jun 2019 15:42:06 +0530 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: <20190620094901.GB13895@ndevos-x270> References: <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> <20190620074335.GA12566@ndevos-x270> <20190620090508.GA13895@ndevos-x270> <20190620094901.GB13895@ndevos-x270> Message-ID: On Thu, Jun 20, 2019 at 3:20 PM Niels de Vos wrote: > On Thu, Jun 20, 2019 at 02:56:51PM +0530, Amar Tumballi Suryanarayan wrote: > > On Thu, Jun 20, 2019 at 2:35 PM Niels de Vos wrote: > > > > > On Thu, Jun 20, 2019 at 02:11:21PM +0530, Amar Tumballi Suryanarayan > wrote: > > > > On Thu, Jun 20, 2019 at 1:13 PM Niels de Vos > wrote: > > > > > > > > > On Thu, Jun 20, 2019 at 11:36:46AM +0530, Amar Tumballi > Suryanarayan > > > wrote: > > > > > > Considering python3 is anyways the future, I vote for taking the > > > patch we > > > > > > did in master for fixing regression tests with python3 into the > > > release-6 > > > > > > and release-5 branch and getting over this deadlock. > > > > > > > > > > > > Patch in discussion here is > > > > > > https://review.gluster.org/#/c/glusterfs/+/22829/ and if anyone > > > > > notices, it > > > > > > changes only the files inside 'tests/' directory, which is not > > > packaged > > > > > in > > > > > > a release anyways. > > > > > > > > > > > > Hari, can we get the backport of this patch to both the release > > > branches? > > > > > > > > > > When going this route, you still need to make sure that the > > > > > python3-devel package is available on the CentOS-7 builders. And I > > > > > don't know if installing that package is already sufficient, maybe > the > > > > > backport is not even needed in that case. > > > > > > > > > > > > > > I was thinking, having this patch makes it compatible with both > python2 > > > and > > > > python3, so technically, it allows us to move to Fedora30 if we need > to > > > run > > > > regression there. (and CentOS7 with only python2). > > > > > > > > The above patch made it compatible, not mandatory to have python3. > So, > > > > treating it as a bug fix. > > > > > > Well, whatever Python is detected (python3 has preference over > python2), > > > needs to have the -devel package available too. Detection is done by > > > probing the python executable. The Matching header files from -devel > > > need to be present in order to be able to build glupy (and others?). 
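To make the -devel requirement just mentioned concrete: glupy (and anything else that embeds the interpreter) is C code compiled against Python.h, and that header ships only with python2-devel/python3-devel. A minimal, hypothetical embedding sketch, not the actual glupy source, which stops at the very first line when the headers are missing:

/* embed_check.c - hypothetical Python-embedding sketch.
 * Build: gcc embed_check.c $(python3-config --cflags) $(python3-config --ldflags) -o embed_check
 * (python 3.8 and later additionally need --embed for the ldflags)
 * Without the -devel package installed, Python.h is not found and the build fails here. */
#include <Python.h>

int main(void)
{
    Py_Initialize();                     /* start the embedded interpreter */
    PyRun_SimpleString("import sys; print(sys.version)");
    Py_Finalize();                       /* tear it down again */
    return 0;
}

Whichever interpreter the probe picks only decides whether python2-devel or python3-devel has to be present on the builder; the build breaks the same way in either case if the headers are absent.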
> > > > > > I do not think compatibility for python3/2 is the problem while > > > building the tarball. > > > > > > Got it! True. Compatibility is not the problem to build the tarball. > > > > I noticed the issue of smoke is coming only from strfmt-errors job, which > > checks for 'epel-6-i386' mock, and fails right now. > > > > The backport might become relevant while running > > > tests on environments where there is no python2. > > > > > > > > Backport is very important if we are running in a system where we have > only > > python3. Hence my proposal to include it in releases. > > I am sure CentOS-7 still has python2. The newer python3 only gets pulled > in by some additional packages that get installed from EPEL. > > > But we are stuck with strfmt-errors job right now, and looking at what it > > was intended to catch in first place, mostly our > > https://build.gluster.org/job/32-bit-build-smoke/ would be doing same. > If > > that is the case, we can remove the job altogether. Also note, this job > is > > known to fail many smokes with 'Build root is locked by another process' > > errors. > > This error means that there are multiple concurrent jobs running 'mock' > with this buildroot. That should not happen and is a configuration error > in one or more Jenkins jobs. Adding to this, this error occurs when the last running job using mock has been aborted and no proper cleaning/killing in the build root has happened. I'm planning to call up a cleanup function on abort. > > Would be great if disabling strfmt-errors is an option. > > I think both jobs do different things. The smoke is functional, where as > strfmt-errors catches incorrect string formatting (some maintainers > assume always 64-bit, everywhere) that has been missed in reviews. > Is there any specific reason to use 64-bit for strfmt-errors? Also I have a doubt here, if it needs python3-devel package to build glupy it should have failed for basic smoke testing where we do source build install? > > Niels > > > > > > Regards, > > > > > Niels > > > > > > > > > > > > > > > > > > > Niels > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > Amar > > > > > > > > > > > > On Thu, Jun 13, 2019 at 7:26 PM Michael Scherer < > mscherer at redhat.com > > > > > > > > > wrote: > > > > > > > > > > > > > Le jeudi 13 juin 2019 ? 14:28 +0200, Niels de Vos a ?crit : > > > > > > > > On Thu, Jun 13, 2019 at 11:08:25AM +0200, Niels de Vos wrote: > > > > > > > > > On Wed, Jun 12, 2019 at 04:09:55PM -0700, Kaleb Keithley > wrote: > > > > > > > > > > On Wed, Jun 12, 2019 at 11:36 AM Kaleb Keithley < > > > > > > > > > > kkeithle at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 12, 2019 at 10:43 AM Amar Tumballi > > > Suryanarayan < > > > > > > > > > > > atumball at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We recently noticed that in one of the package > update on > > > > > > > > > > > > builder (ie, > > > > > > > > > > > > centos7.x machines), python3.6 got installed as a > > > dependency. > > > > > > > > > > > > So, yes, it > > > > > > > > > > > > is possible to have python3 in centos7 now. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > EPEL updated from python34 to python36 recently, but C7 > > > doesn't > > > > > > > > > > > have > > > > > > > > > > > python3 in the base. I don't think we've ever used EPEL > > > > > > > > > > > packages for > > > > > > > > > > > building. 
> > > > > > > > > > > > > > > > > > > > > > And GlusterFS-5 isn't python3 ready. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Correction: GlusterFS-5 is mostly or completely python3 > > > > > > > > > > ready. FWIW, > > > > > > > > > > python33 is available on both RHEL7 and CentOS7 from the > > > Software > > > > > > > > > > Collection Library (SCL), and python34 and now python36 > are > > > > > > > > > > available from > > > > > > > > > > EPEL. > > > > > > > > > > > > > > > > > > > > But packages built for the CentOS Storage SIG have never > > > used the > > > > > > > > > > SCL or > > > > > > > > > > EPEL (EPEL not allowed) and the shebangs in the .py > files are > > > > > > > > > > converted > > > > > > > > > > from /usr/bin/python3 to /usr/bin/python2 during the > rpmbuild > > > > > > > > > > %prep stage. > > > > > > > > > > All the python dependencies for the packages remain the > > > python2 > > > > > > > > > > flavors. > > > > > > > > > > AFAIK the centos-regression machines ought to be > building the > > > > > > > > > > same way. > > > > > > > > > > > > > > > > > > Indeed, there should not be a requirement on having EPEL > > > enabled on > > > > > > > > > the > > > > > > > > > CentOS-7 builders. At least not for the building of the > > > glusterfs > > > > > > > > > tarball. We still need to do releases of glusterfs-4.1 and > > > > > > > > > glusterfs-5, > > > > > > > > > until then it is expected to have python2 as the (only?) > > > version > > > > > > > > > for the > > > > > > > > > system. Is it possible to remove python3 from the CentOS-7 > > > builders > > > > > > > > > and > > > > > > > > > run the jobs that require python3 on the Fedora builders > > > instead? > > > > > > > > > > > > > > > > Actually, if the python-devel package for python3 is > installed > > > on the > > > > > > > > CentOS-7 builders, things may work too. It still feels like > some > > > sort > > > > > > > > of > > > > > > > > Frankenstein deployment, and we don't expect to this see in > > > > > > > > production > > > > > > > > environments. But maybe this is a workaround in case > something > > > > > > > > really, > > > > > > > > really, REALLY depends on python3 on the builders. > > > > > > > > > > > > > > To be honest, people would be surprised what happen in > production > > > > > > > around (sysadmins tend to discuss around, we all have horrors > > > stories, > > > > > > > stuff that were supposed to be cleaned and wasn't, etc) > > > > > > > > > > > > > > After all, "frankenstein deployment now" is better than > "perfect > > > > > > > later", especially since lots of IT departements are under > constant > > > > > > > pressure (so that's more "perfect never"). > > > > > > > > > > > > > > I can understand that we want clean and simple code (who > doesn't), > > > but > > > > > > > real life is much messier than we want to admit, so we need > > > something > > > > > > > robust. 
> > > > > > > > > > > > > > -- > > > > > > > Michael Scherer > > > > > > > Sysadmin, Community Infrastructure > > > > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > > > > Community Meeting Calendar: > > > > > > > > > > > > > > APAC Schedule - > > > > > > > Every 2nd and 4th Tuesday at 11:30 AM IST > > > > > > > Bridge: https://bluejeans.com/836554017 > > > > > > > > > > > > > > NA/EMEA Schedule - > > > > > > > Every 1st and 3rd Tuesday at 01:00 PM EDT > > > > > > > Bridge: https://bluejeans.com/486278655 > > > > > > > > > > > > > > Gluster-devel mailing list > > > > > > > Gluster-devel at gluster.org > > > > > > > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Amar Tumballi (amarts) > > > > > > > > > > > > > > > > > -- > > > > Amar Tumballi (amarts) > > > > > > > > > -- > > Amar Tumballi (amarts) > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkeithle at redhat.com Thu Jun 20 10:57:28 2019 From: kkeithle at redhat.com (Kaleb Keithley) Date: Thu, 20 Jun 2019 06:57:28 -0400 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612151142.GL8725@ndevos-x270> <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> <20190620074335.GA12566@ndevos-x270> <20190620090508.GA13895@ndevos-x270> Message-ID: On Thu, Jun 20, 2019 at 5:27 AM Amar Tumballi Suryanarayan < atumball at redhat.com> wrote: > On Thu, Jun 20, 2019 at 2:35 PM Niels de Vos wrote: > >> On Thu, Jun 20, 2019 at 02:11:21PM +0530, Amar Tumballi Suryanarayan >> wrote: >> > On Thu, Jun 20, 2019 at 1:13 PM Niels de Vos wrote: > > > I noticed the issue of smoke is coming only from strfmt-errors job, which > checks for 'epel-6-i386' mock, and fails right now. > ... > But we are stuck with strfmt-errors job right now, and looking at what it > was intended to catch in first place, > ... > Would be great if disabling strfmt-errors is an option. > strfmt-errors checks that snprintf format stings are correct on both 32- and 64-bit platforms. Are you ready to drop support on 32-bit platforms? Some distributions are dropping 32-bit, but Fedora still supports i686 and armv7hl by default; it is possible to drop 32-bit on Fedora but there would be strong resistance to doing it I suspect. I also think it would be strange to drop it in the middle of a release stream. If you want to drop it for, say, release-7 that'd be a good time to do it. strfmt-errors isn't failing generally, AFAICT. The last failure is on a release-5 branch build. Since the strfmt-errors runs on a CentOS machine I suspect that it's the same issue we have with centos-regression, which really goes back to the (misguided IMO) decision to put EPEL and python3 on the centos builders. misc sent me a list of all the things that "need" python3. Some/Many/All of them are for things that run on fedora, e.g. clang-format. 
Everything was, AFAICT, working fine right up to when EPEL and python3 were installed on the centos builders. If it was my decision, I'd undo that change. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndevos at redhat.com Thu Jun 20 11:05:42 2019 From: ndevos at redhat.com (Niels de Vos) Date: Thu, 20 Jun 2019 13:05:42 +0200 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> <20190620074335.GA12566@ndevos-x270> <20190620090508.GA13895@ndevos-x270> <20190620094901.GB13895@ndevos-x270> Message-ID: <20190620110542.GC13895@ndevos-x270> On Thu, Jun 20, 2019 at 03:42:06PM +0530, Deepshikha Khandelwal wrote: > On Thu, Jun 20, 2019 at 3:20 PM Niels de Vos wrote: > > > On Thu, Jun 20, 2019 at 02:56:51PM +0530, Amar Tumballi Suryanarayan wrote: > > > On Thu, Jun 20, 2019 at 2:35 PM Niels de Vos wrote: > > > > > > > On Thu, Jun 20, 2019 at 02:11:21PM +0530, Amar Tumballi Suryanarayan > > wrote: > > > > > On Thu, Jun 20, 2019 at 1:13 PM Niels de Vos > > wrote: > > > > > > > > > > > On Thu, Jun 20, 2019 at 11:36:46AM +0530, Amar Tumballi > > Suryanarayan > > > > wrote: > > > > > > > Considering python3 is anyways the future, I vote for taking the > > > > patch we > > > > > > > did in master for fixing regression tests with python3 into the > > > > release-6 > > > > > > > and release-5 branch and getting over this deadlock. > > > > > > > > > > > > > > Patch in discussion here is > > > > > > > https://review.gluster.org/#/c/glusterfs/+/22829/ and if anyone > > > > > > notices, it > > > > > > > changes only the files inside 'tests/' directory, which is not > > > > packaged > > > > > > in > > > > > > > a release anyways. > > > > > > > > > > > > > > Hari, can we get the backport of this patch to both the release > > > > branches? > > > > > > > > > > > > When going this route, you still need to make sure that the > > > > > > python3-devel package is available on the CentOS-7 builders. And I > > > > > > don't know if installing that package is already sufficient, maybe > > the > > > > > > backport is not even needed in that case. > > > > > > > > > > > > > > > > > I was thinking, having this patch makes it compatible with both > > python2 > > > > and > > > > > python3, so technically, it allows us to move to Fedora30 if we need > > to > > > > run > > > > > regression there. (and CentOS7 with only python2). > > > > > > > > > > The above patch made it compatible, not mandatory to have python3. > > So, > > > > > treating it as a bug fix. > > > > > > > > Well, whatever Python is detected (python3 has preference over > > python2), > > > > needs to have the -devel package available too. Detection is done by > > > > probing the python executable. The Matching header files from -devel > > > > need to be present in order to be able to build glupy (and others?). > > > > > > > > I do not think compatibility for python3/2 is the problem while > > > > building the tarball. > > > > > > > > > Got it! True. Compatibility is not the problem to build the tarball. > > > > > > I noticed the issue of smoke is coming only from strfmt-errors job, which > > > checks for 'epel-6-i386' mock, and fails right now. > > > > > > The backport might become relevant while running > > > > tests on environments where there is no python2. 
> > > > > > > > > > > Backport is very important if we are running in a system where we have > > only > > > python3. Hence my proposal to include it in releases. > > > > I am sure CentOS-7 still has python2. The newer python3 only gets pulled > > in by some additional packages that get installed from EPEL. > > > > > But we are stuck with strfmt-errors job right now, and looking at what it > > > was intended to catch in first place, mostly our > > > https://build.gluster.org/job/32-bit-build-smoke/ would be doing same. > > If > > > that is the case, we can remove the job altogether. Also note, this job > > is > > > known to fail many smokes with 'Build root is locked by another process' > > > errors. > > > > This error means that there are multiple concurrent jobs running 'mock' > > with this buildroot. That should not happen and is a configuration error > > in one or more Jenkins jobs. > > Adding to this, this error occurs when the last running job using mock has > been aborted and no proper cleaning/killing in the build root has happened. > I'm planning to call up a cleanup function on abort. Ah, right, that is a possibility too. Jobs should cleanup after themselves and if that is not happening, it is a bug in the job (or missing cleanup on boot). > > > Would be great if disabling strfmt-errors is an option. > > > > I think both jobs do different things. The smoke is functional, where as > > strfmt-errors catches incorrect string formatting (some maintainers > > assume always 64-bit, everywhere) that has been missed in reviews. > > > Is there any specific reason to use 64-bit for strfmt-errors? No, we still support several 32-bit architectures. > Also I have a doubt here, if it needs python3-devel package to build glupy > it should have failed for basic smoke testing where we do source build > install? I do not know if smoke enables/disables building of glupy. Niels From mscherer at redhat.com Thu Jun 20 11:39:06 2019 From: mscherer at redhat.com (Michael Scherer) Date: Thu, 20 Jun 2019 13:39:06 +0200 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612151142.GL8725@ndevos-x270> <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> <20190620074335.GA12566@ndevos-x270> <20190620090508.GA13895@ndevos-x270> Message-ID: Le jeudi 20 juin 2019 ? 06:57 -0400, Kaleb Keithley a ?crit : > On Thu, Jun 20, 2019 at 5:27 AM Amar Tumballi Suryanarayan < > atumball at redhat.com> wrote: > > > On Thu, Jun 20, 2019 at 2:35 PM Niels de Vos > > wrote: > > > > > On Thu, Jun 20, 2019 at 02:11:21PM +0530, Amar Tumballi > > > Suryanarayan > > > wrote: > > > > On Thu, Jun 20, 2019 at 1:13 PM Niels de Vos > > > > wrote: > > > > > > I noticed the issue of smoke is coming only from strfmt-errors job, > > which > > checks for 'epel-6-i386' mock, and fails right now. > > ... > > But we are stuck with strfmt-errors job right now, and looking at > > what it > > was intended to catch in first place, > > ... > > Would be great if disabling strfmt-errors is an option. > > > > strfmt-errors checks that snprintf format stings are correct on both > 32- > and 64-bit platforms. > > Are you ready to drop support on 32-bit platforms? Some distributions > are > dropping 32-bit, but Fedora still supports i686 and armv7hl by > default; it > is possible to drop 32-bit on Fedora but there would be strong > resistance > to doing it I suspect. 
I also think it would be strange to drop it > in the > middle of a release stream. If you want to drop it for, say, release- > 7 > that'd be a good time to do it. > > strfmt-errors isn't failing generally, AFAICT. The last failure is on > a > release-5 branch build. Since the strfmt-errors runs on a CentOS > machine I > suspect that it's the same issue we have with centos-regression, > which > really goes back to the (misguided IMO) decision to put EPEL and > python3 on > the centos builders. > > misc sent me a list of all the things that "need" python3. > Some/Many/All of > them are for things that run on fedora, e.g. clang- > format. Everything was, > AFAICT, working fine right up to when EPEL and python3 were installed > on > the centos builders. If it was my decision, I'd undo that change. The biggest problem is that mock do pull python3. -- Michael Scherer Sysadmin, Community Infrastructure -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part URL: From kkeithle at redhat.com Thu Jun 20 12:38:27 2019 From: kkeithle at redhat.com (Kaleb Keithley) Date: Thu, 20 Jun 2019 08:38:27 -0400 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612151142.GL8725@ndevos-x270> <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> <20190620074335.GA12566@ndevos-x270> <20190620090508.GA13895@ndevos-x270> Message-ID: On Thu, Jun 20, 2019 at 7:39 AM Michael Scherer wrote: > Le jeudi 20 juin 2019 ? 06:57 -0400, Kaleb Keithley a ?crit : > > AFAICT, working fine right up to when EPEL and python3 were installed > > on > > the centos builders. If it was my decision, I'd undo that change. > > The biggest problem is that mock do pull python3. > > That's mock on Fedora ? to run a build in a centos-i386 chroot. Fedora already has python3. I don't see how that can affect what's running in the mock chroot. Is the build inside mock also installing EPEL and python3 somehow? Now? If so, why? And maybe the solution for centos regressions is to run those in mock, with a centos-x86_64 chroot. Without EPEL or python3. -- Kaleb -------------- next part -------------- An HTML attachment was scrubbed... URL: From mscherer at redhat.com Thu Jun 20 13:05:56 2019 From: mscherer at redhat.com (Michael Scherer) Date: Thu, 20 Jun 2019 15:05:56 +0200 Subject: [Gluster-devel] Removing glupy from release 5.7 In-Reply-To: References: <20190612151142.GL8725@ndevos-x270> <20190613090825.GN8725@ndevos-x270> <20190613122837.GS8725@ndevos-x270> <61c99ac170cc004a7f90897ff9f47cf7facdbc12.camel@redhat.com> <20190620074335.GA12566@ndevos-x270> <20190620090508.GA13895@ndevos-x270> Message-ID: Le jeudi 20 juin 2019 ? 08:38 -0400, Kaleb Keithley a ?crit : > On Thu, Jun 20, 2019 at 7:39 AM Michael Scherer > wrote: > > > Le jeudi 20 juin 2019 ? 06:57 -0400, Kaleb Keithley a ?crit : > > > AFAICT, working fine right up to when EPEL and python3 were > > > installed > > > on > > > the centos builders. If it was my decision, I'd undo that > > > change. > > > > The biggest problem is that mock do pull python3. > > > > > > That's mock on Fedora ? to run a build in a centos-i386 chroot. > Fedora > already has python3. I don't see how that can affect what's running > in the > mock chroot. 
I am not sure we are talking about the same thing, but mock, the rpm package from EPEL 7, do pull python 3: $ cat /etc/redhat-release; rpm -q --requires mock |grep 'python(abi' Red Hat Enterprise Linux Server release 7.6 (Maipo) python(abi) = 3.6 So we do have python3 installed on the Centos 7 builders (and was after a upgrade), and we are not going to remove it, because we use mock for a lot of stuff. And again, if the configure script is detecting the wrong version of python, the fix is not to remove the version of python for the builders, the fix is to detect the right version of python, or at least, permit to people to bypass the detection. > Is the build inside mock also installing EPEL and python3 somehow? > Now? If so, why? No, I doubt but then, if we are using a chroot, the package installed on the builders shouldn't matter, since that's a chroot. So I am kinda being lost. > And maybe the solution for centos regressions is to run those in > mock, with a centos-x86_64 chroot. Without EPEL or python3. That would likely requires a big refactor of the setup, since we have to get the data out of specific place, etc. We would also need to reinstall the builders to set partitions in a different way, with a bigger / and/or give more space for /var/lib/mock. I do not see that happening fast, and if my hypothesis of a issue in configure is right, then fixing seems the faster way to avoid the issue. -- Michael Scherer Sysadmin, Community Infrastructure -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part URL: From jthottan at redhat.com Fri Jun 21 03:58:52 2019 From: jthottan at redhat.com (Jiffin Thottan) Date: Thu, 20 Jun 2019 23:58:52 -0400 (EDT) Subject: [Gluster-devel] Quick question about the latest glusterfs and client side selinux support In-Reply-To: <9D9C2802-539E-4153-81FF-4A0F8E934E27@gtri.gatech.edu> References: <6D60E9DD-C4E1-4B95-9277-C4F746DB228C@gtri.gatech.edu> <62539ffb-4b65-1733-6151-fc9b2c604254@redhat.com> <9D9C2802-539E-4153-81FF-4A0F8E934E27@gtri.gatech.edu> Message-ID: <1430982045.13733997.1561089532953.JavaMail.zimbra@redhat.com> Hi Janak, Currently, it is supported in glusterfs(from 2.8 onwards) and cephfs(already there in 2.7) for nfs-ganesha. -- Jiffin ----- Original Message ----- From: "Janak Desai" To: "Jiffin Tony Thottan" Sent: Thursday, June 20, 2019 9:29:09 PM Subject: Re: Quick question about the latest glusterfs and client side selinux support Hi Jiffin, I came across your presentation ?NFS-Ganesha Weather Report? that you gave at the FOSDEM?19 in early Feb this year. In that you mentioned that ongoing developments in v2.8 include ?labelled NFS? support. I see that v2.8 is now out.? Do you know if labelled NFS support made it in?? If it did, is it only supported in CEPHFS FSAL or any other FSALs also include the support for it? I took a cursory look at the release documents and didn?t see Labelled NFS in it, so thought I would bug you directly. Thanks. -Janak From: Jiffin Tony Thottan Date: Tuesday, August 28, 2018 at 12:50 AM To: Janak Desai , "ndevos at redhat.com" , "mselvaga at redhat.com" Cc: "paul at paul-moore.com" Subject: Re: Quick question about the latest glusterfs and client side selinux support Hi Janak, Thanks for the interest. Basic selinux xlator is present at gluster server stack. It stores selinux context at the backend as a xattr. 
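SELinux labels travel through the VFS as the security.selinux extended attribute, so 'stores the context as a xattr' really is just ordinary xattr traffic. The stand-alone C sketch below is a hypothetical illustration of reading such a label through the xattr interface; it is not gluster code, and the on-disk key the xlator uses on the brick may differ:

/* getcon.c - print the SELinux label stored in the security.selinux xattr.
 * Hypothetical illustration; build with: gcc getcon.c -o getcon */
#include <stdio.h>
#include <sys/types.h>
#include <sys/xattr.h>

int main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : ".";
    char label[256];

    /* lgetxattr() does not follow symlinks; it returns the raw label bytes,
     * e.g. "system_u:object_r:etc_t:s0" (termination is not guaranteed). */
    ssize_t len = lgetxattr(path, "security.selinux", label, sizeof(label) - 1);
    if (len < 0) {
        perror("lgetxattr(security.selinux)");
        return 1;
    }
    label[len] = '\0';
    printf("%s: %s\n", path, label);
    return 0;
}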
When we developed that xlator, at that point they were no client to test the functionality. Don't know whether required change in fuse got merged or not. As you mentioned ,here first we need to figure out whether issue is related to server. Can collect the packet trace using tcpdump from client and sent with mail during setting/getting selinux context. Regards, Jiffin On Tuesday 28 August 2018 04:14 AM, Desai, Janak wrote: Hi Niels, Manikandan, Jiffin, I work for Georgia Tech Research Institute?s CIPHER Lab and am investigating suitability of glusterfs for a couple of large upcoming projects. My ?google research? is yielding confusing and inconclusive results, so I thought I would try and reach out to some of the core developers to get some clarity. We use SELinux extensively in our software solution. I am trying to find out if, with the latest version 4.1 of glusterfs running on the latest version of rhel, I should be able to associate and enforce selinux contexts from glusterfs clients. I see in the 3.11 release notes that the selinux feature was implemented but then I also see references to kernel work that is not done yet. I also could not find any documentation/examples on how to add/integrate this selinux translator to setup and enforce selinux labels from the client side. In my simple test setup, which I mounted using the ?selinux? option (which gluster does seem to recognize), I am getting the ?operation not supported? error. I guess either I am not pulling in the selinux translator or I am running up against other missing functionality in the kernel. I would really appreciate if you could clear this up for me. If I am not configuring my mount correctly, I would appreciate if you could point me to a document or an example. Our other option is lustre filesystem since it does have a working client side association and enforcement of selinux contexts. However, lustre appears to be lot difficult to setup and maintain and I would rather use glusterfs. We need a distributed (or parallel) filesystem that can work with Hadoop. If glusterfs doesn?t pan out then I will look at labelled nfs 4.2 that is now available in rhel7. However, my google research shows much more Hadoop affinity for glusterfs than nfs v4. I am also copying Paul Moore, with whom I collaborated a few years ago as part of the team that took Linux through its common criteria evaluation, and who I haven?t bugged lately ?, to see if he can shed some light any missing kernel dependencies. I am currently testing with rhel7.5, but would be willing to try upstream kernel if have to get this proof of concept going. I know the underlying problem in the kernel is supporting extended attrs on FUSE file systems, but was wondering (and hoping) that at least setup/enforcement of selinux contexts from client side for glusterfs is possible. Thanks. -Janak From kirr at nexedi.com Sun Jun 23 07:26:37 2019 From: kirr at nexedi.com (Kirill Smelkov) Date: Sun, 23 Jun 2019 07:26:37 +0000 Subject: [Gluster-devel] [PATCH, RESEND] fuse: require /dev/fuse reads to have enough buffer capacity (take 2) In-Reply-To: Message-ID: <20190623072619.31037-1-kirr@nexedi.com> [ This retries commit d4b13963f217 which was reverted in 766741fcaa1f. In this version we require only `sizeof(fuse_in_header) + sizeof(fuse_write_in)` instead of 4K for FUSE request header room, because, contrary to libfuse and kernel client behaviour, GlusterFS actually provides only so much room for request header. 
] A FUSE filesystem server queues /dev/fuse sys_read calls to get filesystem requests to handle. It does not know in advance what would be that request as it can be anything that client issues - LOOKUP, READ, WRITE, ... Many requests are short and retrieve data from the filesystem. However WRITE and NOTIFY_REPLY write data into filesystem. Before getting into operation phase, FUSE filesystem server and kernel client negotiate what should be the maximum write size the client will ever issue. After negotiation the contract in between server/client is that the filesystem server then should queue /dev/fuse sys_read calls with enough buffer capacity to receive any client request - WRITE in particular, while FUSE client should not, in particular, send WRITE requests with > negotiated max_write payload. FUSE client in kernel and libfuse historically reserve 4K for request header. However an existing filesystem server - GlusterFS - was found which reserves only 80 bytes for header room (= `sizeof(fuse_in_header) + sizeof(fuse_write_in)`). https://lore.kernel.org/linux-fsdevel/20190611202738.GA22556 at deco.navytux.spb.ru/ https://github.com/gluster/glusterfs/blob/v3.8.15-0-gd174f021a/xlators/mount/fuse/src/fuse-bridge.c#L4894 Since `sizeof(fuse_in_header) + sizeof(fuse_write_in)` == `sizeof(fuse_in_header) + sizeof(fuse_read_in)` == `sizeof(fuse_in_header) + sizeof(fuse_notify_retrieve_in)` is the absolute minimum any sane filesystem should be using for header room, the contract is that filesystem server should queue sys_reads with `sizeof(fuse_in_header) + sizeof(fuse_write_in)` + max_write buffer. If the filesystem server does not follow this contract, what can happen is that fuse_dev_do_read will see that request size is > buffer size, and then it will return EIO to client who issued the request but won't indicate in any way that there is a problem to filesystem server. This can be hard to diagnose because for some requests, e.g. for NOTIFY_REPLY which mimics WRITE, there is no client thread that is waiting for request completion and that EIO goes nowhere, while on filesystem server side things look like the kernel is not replying back after successful NOTIFY_RETRIEVE request made by the server. We can make the problem easy to diagnose if we indicate via error return to filesystem server when it is violating the contract. This should not practically cause problems because if a filesystem server is using shorter buffer, writes to it were already very likely to cause EIO, and if the filesystem is read-only it should be too following FUSE_MIN_READ_BUFFER minimum buffer size. Please see [1] for context where the problem of stuck filesystem was hit for real (because kernel client was incorrectly sending more than max_write data with NOTIFY_REPLY; see also previous patch), how the situation was traced and for more involving patch that did not make it into the tree. [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2 Signed-off-by: Kirill Smelkov Tested-by: Sander Eikelenboom Cc: Han-Wen Nienhuys Cc: Jakob Unterwurzacher --- fs/fuse/dev.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index ea8237513dfa..b2b2344eadcf 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -1317,6 +1317,26 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, unsigned reqsize; unsigned int hash; + /* + * Require sane minimum read buffer - that has capacity for fixed part + * of any request header + negotiated max_write room for data. 
If the + * requirement is not satisfied return EINVAL to the filesystem server + * to indicate that it is not following FUSE server/client contract. + * Don't dequeue / abort any request. + * + * Historically libfuse reserves 4K for fixed header room, but e.g. + * GlusterFS reserves only 80 bytes + * + * = `sizeof(fuse_in_header) + sizeof(fuse_write_in)` + * + * which is the absolute minimum any sane filesystem should be using + * for header room. + */ + if (nbytes < max_t(size_t, FUSE_MIN_READ_BUFFER, + sizeof(struct fuse_in_header) + sizeof(struct fuse_write_in) + + fc->max_write)) + return -EINVAL; + restart: spin_lock(&fiq->waitq.lock); err = -EAGAIN; -- 2.20.1 From jenkins at build.gluster.org Mon Jun 24 01:45:02 2019 From: jenkins at build.gluster.org (jenkins at build.gluster.org) Date: Mon, 24 Jun 2019 01:45:02 +0000 (UTC) Subject: [Gluster-devel] Weekly Untriaged Bugs Message-ID: <1022785431.22.1561340702849.JavaMail.jenkins@jenkins-el7.rht.gluster.org> [...truncated 7 lines...] https://bugzilla.redhat.com/1722708 / bitrot: WORM: Segmentation Fault if bitrot stub do signature https://bugzilla.redhat.com/1722709 / bitrot: WORM: Segmentation Fault if bitrot stub do signature https://bugzilla.redhat.com/1719778 / core: build fails for every patch on release 5 https://bugzilla.redhat.com/1714851 / core: issues with 'list.h' elements in clang-scan https://bugzilla.redhat.com/1721842 / core: Spelling errors in 6.3 https://bugzilla.redhat.com/1722390 / glusterd: "All subvolumes are down" when all bricks are online https://bugzilla.redhat.com/1722187 / glusterd: Glusterd Seg faults (sig 11) when RDMA used with MLNX_OFED https://bugzilla.redhat.com/1718741 / glusterfind: GlusterFS having high CPU https://bugzilla.redhat.com/1716875 / gluster-smb: Inode Unref Assertion failed: inode->ref https://bugzilla.redhat.com/1716455 / gluster-smb: OS X error -50 when creating sub-folder on Samba share when using Gluster VFS https://bugzilla.redhat.com/1716440 / gluster-smb: SMBD thread panics when connected to from OS X machine https://bugzilla.redhat.com/1720733 / libglusterfsclient: glusterfs 4.1.7 client crash https://bugzilla.redhat.com/1714895 / libglusterfsclient: Glusterfs(fuse) client crash https://bugzilla.redhat.com/1717824 / locks: Fencing: Added the tcmu-runner ALUA feature support but after one of node is rebooted the glfs_file_lock() get stucked https://bugzilla.redhat.com/1718562 / locks: flock failure (regression) https://bugzilla.redhat.com/1719174 / project-infrastructure: broken regression link? https://bugzilla.redhat.com/1719388 / project-infrastructure: infra: download.gluster.org /var/www/html/... is out of free space https://bugzilla.redhat.com/1721353 / project-infrastructure: Run 'line-coverage' regression runs on a latest fedora machine (say fedora30). https://bugzilla.redhat.com/1720453 / project-infrastructure: Unable to access review.gluster.org https://bugzilla.redhat.com/1721462 / quota: Quota limits not honored writes allowed past quota limit. [...truncated 2 lines...] -------------- next part -------------- A non-text attachment was scrubbed... 
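Stepping back to the FUSE patch above: the contract it enforces is easiest to see from the server side of /dev/fuse. Below is a hypothetical, minimal sketch (not libfuse and not the GlusterFS fuse bridge) of a request loop sized the way the commit message describes, i.e. fixed header room plus the negotiated max_write:

/* fuse_bufsize.c - hypothetical illustration of the read-buffer contract.
 * Build: gcc fuse_bufsize.c -o fuse_bufsize */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <linux/fuse.h>

/* Minimum buffer a server must pass to read(2) on /dev/fuse:
 * fixed header room + the max_write negotiated during FUSE_INIT. */
static size_t min_request_buf(size_t max_write)
{
    return sizeof(struct fuse_in_header) +
           sizeof(struct fuse_write_in) + max_write;
}

/* Sketch of a request loop; 'fd' would be an already-mounted /dev/fuse. */
static void serve(int fd, size_t max_write)
{
    size_t bufsize = min_request_buf(max_write);
    char *buf = malloc(bufsize);

    while (buf) {
        ssize_t len = read(fd, buf, bufsize);  /* one complete request per read */
        if (len <= 0)
            break;                             /* e.g. ENODEV after unmount */
        struct fuse_in_header *in = (struct fuse_in_header *)buf;
        (void)in;  /* dispatch on in->opcode: FUSE_LOOKUP, FUSE_WRITE, ... */
    }
    free(buf);
}

int main(void)
{
    /* 40 + 40 = 80 bytes with current uapi headers, the figure quoted above. */
    printf("fixed header room: %zu bytes\n",
           sizeof(struct fuse_in_header) + sizeof(struct fuse_write_in));
    printf("with 128 KiB max_write: %zu byte read buffer\n",
           min_request_buf(128 * 1024));
    (void)serve;  /* the loop above is shown for illustration only */
    return 0;
}

A server that hands read(2) anything smaller is exactly the case the patch now answers with an explicit EINVAL instead of the old silent EIO towards the client.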
Name: build.log Type: application/octet-stream Size: 2413 bytes Desc: not available URL: From hgowtham at redhat.com Mon Jun 24 09:34:28 2019 From: hgowtham at redhat.com (hgowtham at redhat.com) Date: Mon, 24 Jun 2019 09:34:28 +0000 Subject: [Gluster-devel] Invitation: Gluster Community Meeting (APAC friendly hours) @ Tue Jun 25, 2019 11:30am - 12:30pm (IST) (gluster-devel@gluster.org) Message-ID: <000000000000c553f0058c0e848b@google.com> You have been invited to the following event. Title: Gluster Community Meeting (APAC friendly hours) Hi all, This is the biweekly Gluster community meeting that is hosted to collaborate and make the community better. Please do join the discussion. Bridge: https://bluejeans.com/836554017 Minutes meeting: https://hackmd.io/PEnYhQziQsyBwhMksbRWUw Previous Meeting notes: https://github.com/gluster/community Regards, Hari. When: Tue Jun 25, 2019 11:30am ? 12:30pm India Standard Time - Kolkata Calendar: gluster-devel at gluster.org Who: * hgowtham at redhat.com - organizer * gluster-users * gluster-devel Event details: https://www.google.com/calendar/event?action=VIEW&eid=N3IzZ3FtanYyaHIwNWhqaDhuaW5nN3ZuMHEgZ2x1c3Rlci1kZXZlbEBnbHVzdGVyLm9yZw&tok=MTkjaGdvd3RoYW1AcmVkaGF0LmNvbWU1ZWM5ZDUzZjBlOWMwMDA3NDkyMWMzN2YxZjY2ZmY1MDU2NmRjNGU&ctz=Asia%2FKolkata&hl=en&es=0 Invitation from Google Calendar: https://www.google.com/calendar/ You are receiving this courtesy email at the account gluster-devel at gluster.org because you are an attendee of this event. To stop receiving future updates for this event, decline this event. Alternatively you can sign up for a Google account at https://www.google.com/calendar/ and control your notification settings for your entire calendar. Forwarding this invitation could allow any recipient to send a response to the organizer and be added to the guest list, or invite others regardless of their own invitation status, or to modify your RSVP. Learn more at https://support.google.com/calendar/answer/37135#forwarding -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/calendar Size: 1842 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: invite.ics Type: application/ics Size: 1880 bytes Desc: not available URL: From anoopcs at cryptolab.net Mon Jun 24 12:34:40 2019 From: anoopcs at cryptolab.net (Anoop C S) Date: Mon, 24 Jun 2019 18:04:40 +0530 Subject: [Gluster-devel] CI failure - NameError: name 'unicode' is not defined (related to changelogparser.py) In-Reply-To: References: Message-ID: On Fri, 2019-06-07 at 10:24 +0530, Deepshikha Khandelwal wrote: > Hi Yaniv, > > We are working on this. The builders are picking up python3.6 which > is leading to modules missing and such undefined errors. > > Kotresh has sent a patch > https://review.gluster.org/#/c/glusterfs/+/22829/ to fix the issue. Can we have this backported to release-6 branch? As of now all patches posted against release-6 branch are failing[1] on tests/basic/changelog/changelog-rename.t [1] https://review.gluster.org/q/project:glusterfs+branch:release-6+status:open+label:%2522CentOS-regression-1%2522 > On Thu, Jun 6, 2019 at 11:49 AM Yaniv Kaul wrote: > > From [1]. > > > > I think it's a Python2/3 thing, so perhaps a CI issue additionally > > (though if our code is not Python 3 ready, let's ensure we use > > Python 2 explicitly until we fix this). 
> > > > 00:47:05.207 ok 14 [ 13/ 386] < 34> 'gluster --mode=script > > --wignore volume start patchy' > > 00:47:05.207 ok 15 [ 13/ 70] < 36> '_GFS --attribute- > > timeout=0 --entry-timeout=0 --volfile-id=patchy --volfile- > > server=builder208.int.aws.gluster.org /mnt/glusterfs/0' > > 00:47:05.207 Traceback (most recent call last): > > 00:47:05.207 File > > "./tests/basic/changelog/../../utils/changelogparser.py", line 233, > > in > > 00:47:05.207 parse(sys.argv[1]) > > 00:47:05.207 File > > "./tests/basic/changelog/../../utils/changelogparser.py", line 221, > > in parse > > 00:47:05.207 process_record(data, tokens, changelog_ts, > > callback) > > 00:47:05.207 File > > "./tests/basic/changelog/../../utils/changelogparser.py", line 178, > > in process_record > > 00:47:05.207 callback(record) > > 00:47:05.207 File > > "./tests/basic/changelog/../../utils/changelogparser.py", line 182, > > in default_callback > > 00:47:05.207 sys.stdout.write(u"{0}\n".format(record)) > > 00:47:05.207 File > > "./tests/basic/changelog/../../utils/changelogparser.py", line 128, > > in __str__ > > 00:47:05.207 return unicode(self).encode('utf-8') > > 00:47:05.207 NameError: name 'unicode' is not defined > > 00:47:05.207 not ok 16 [ 53/ 39] < 42> '2 > > check_changelog_op /d/backends/patchy0/.glusterfs/changelogs > > RENAME' -> 'Got "0" instead of "2"' > > > > > > Y. > > > > [1] https://build.gluster.org/job/centos7-regression/6318/console > > _______________________________________________ > > > > Community Meeting Calendar: > > > > APAC Schedule - > > Every 2nd and 4th Tuesday at 11:30 AM IST > > Bridge: https://bluejeans.com/836554017 > > > > NA/EMEA Schedule - > > Every 1st and 3rd Tuesday at 01:00 PM EDT > > Bridge: https://bluejeans.com/486278655 > > > > Gluster-devel mailing list > > Gluster-devel at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > From dkhandel at redhat.com Tue Jun 25 05:10:57 2019 From: dkhandel at redhat.com (Deepshikha Khandelwal) Date: Tue, 25 Jun 2019 10:40:57 +0530 Subject: [Gluster-devel] CI failure - NameError: name 'unicode' is not defined (related to changelogparser.py) In-Reply-To: References: Message-ID: Regression on release-5 branch is also failing because of this. Can we have backport Kotresh's patch https://review.gluster.org/#/c/glusterfs/+/22829/ to these branches. On Mon, Jun 24, 2019 at 6:06 PM Anoop C S wrote: > On Fri, 2019-06-07 at 10:24 +0530, Deepshikha Khandelwal wrote: > > Hi Yaniv, > > > > We are working on this. The builders are picking up python3.6 which > > is leading to modules missing and such undefined errors. > > > > Kotresh has sent a patch > > https://review.gluster.org/#/c/glusterfs/+/22829/ to fix the issue. > > Can we have this backported to release-6 branch? As of now all patches > posted against release-6 branch are failing[1] on > tests/basic/changelog/changelog-rename.t > [1] > > https://review.gluster.org/q/project:glusterfs+branch:release-6+status:open+label:%2522CentOS-regression-1%2522 > > > On Thu, Jun 6, 2019 at 11:49 AM Yaniv Kaul wrote: > > > From [1]. 
> > > > > > I think it's a Python2/3 thing, so perhaps a CI issue additionally > > > (though if our code is not Python 3 ready, let's ensure we use > > > Python 2 explicitly until we fix this). > > > > > > 00:47:05.207 ok 14 [ 13/ 386] < 34> 'gluster --mode=script > > > --wignore volume start patchy' > > > 00:47:05.207 ok 15 [ 13/ 70] < 36> '_GFS --attribute- > > > timeout=0 --entry-timeout=0 --volfile-id=patchy --volfile- > > > server=builder208.int.aws.gluster.org /mnt/glusterfs/0' > > > 00:47:05.207 Traceback (most recent call last): > > > 00:47:05.207 File > > > "./tests/basic/changelog/../../utils/changelogparser.py", line 233, > > > in > > > 00:47:05.207 parse(sys.argv[1]) > > > 00:47:05.207 File > > > "./tests/basic/changelog/../../utils/changelogparser.py", line 221, > > > in parse > > > 00:47:05.207 process_record(data, tokens, changelog_ts, > > > callback) > > > 00:47:05.207 File > > > "./tests/basic/changelog/../../utils/changelogparser.py", line 178, > > > in process_record > > > 00:47:05.207 callback(record) > > > 00:47:05.207 File > > > "./tests/basic/changelog/../../utils/changelogparser.py", line 182, > > > in default_callback > > > 00:47:05.207 sys.stdout.write(u"{0}\n".format(record)) > > > 00:47:05.207 File > > > "./tests/basic/changelog/../../utils/changelogparser.py", line 128, > > > in __str__ > > > 00:47:05.207 return unicode(self).encode('utf-8') > > > 00:47:05.207 NameError: name 'unicode' is not defined > > > 00:47:05.207 not ok 16 [ 53/ 39] < 42> '2 > > > check_changelog_op /d/backends/patchy0/.glusterfs/changelogs > > > RENAME' -> 'Got "0" instead of "2"' > > > > > > > > > Y. > > > > > > [1] https://build.gluster.org/job/centos7-regression/6318/console > > > _______________________________________________ > > > > > > Community Meeting Calendar: > > > > > > APAC Schedule - > > > Every 2nd and 4th Tuesday at 11:30 AM IST > > > Bridge: https://bluejeans.com/836554017 > > > > > > NA/EMEA Schedule - > > > Every 1st and 3rd Tuesday at 01:00 PM EDT > > > Bridge: https://bluejeans.com/486278655 > > > > > > Gluster-devel mailing list > > > Gluster-devel at gluster.org > > > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > > > > _______________________________________________ > > > > Community Meeting Calendar: > > > > APAC Schedule - > > Every 2nd and 4th Tuesday at 11:30 AM IST > > Bridge: https://bluejeans.com/836554017 > > > > NA/EMEA Schedule - > > Every 1st and 3rd Tuesday at 01:00 PM EDT > > Bridge: https://bluejeans.com/486278655 > > > > Gluster-devel mailing list > > Gluster-devel at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > _______________________________________________ > > Community Meeting Calendar: > > APAC Schedule - > Every 2nd and 4th Tuesday at 11:30 AM IST > Bridge: https://bluejeans.com/836554017 > > NA/EMEA Schedule - > Every 1st and 3rd Tuesday at 01:00 PM EDT > Bridge: https://bluejeans.com/486278655 > > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Tue Jun 25 05:49:54 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Tue, 25 Jun 2019 11:19:54 +0530 Subject: [Gluster-devel] [Gluster-infra] New workflow proposal for glusterfs repo In-Reply-To: References: Message-ID: Adding gluster-devel ML. 
Only concern to my earlier proposal was not making regression runs wait for reviews, but to be triggered automatically after successful smoke. The ask was to put burden on machines than on developers, which I agree to start with. Lets watch the expenses due to this change for a month once it gets implemented, and then take stock of the situation. For now, lets reduce one more extra work for developers, ie, marking Verified flag. On Tue, Jun 25, 2019 at 11:01 AM Sankarshan Mukhopadhyay < sankarshan.mukhopadhyay at gmail.com> wrote: > Amar, can you bring about an agreement/decision on this so that we can > make progress? > > So, My take is: Lets make serialized smoke + regression a reality. It may add to overall time, but if there are failures, this has potential to reduce overall machine usage... for a successful patch, the extra few minutes at present doesn't harm as many of our review avg time is around a week. > On Tue, Jun 25, 2019 at 10:55 AM Deepshikha Khandelwal > wrote: > > > > > > > > On Mon, Jun 24, 2019 at 5:30 PM Sankarshan Mukhopadhyay < > sankarshan.mukhopadhyay at gmail.com> wrote: > >> > >> Checking back on this - do we need more voices or, amendments to > >> Amar's original proposal before we scope the implementation? > >> > >> I read Amar's proposal as desiring an outcome where the journey of a > >> valid/good patch through the test flows is fast and efficient. > Absolutely! This is critical for us to be inclusive community. > >> > >> On Wed, Jun 12, 2019 at 11:58 PM Raghavendra Talur > wrote: > >> > > >> > > >> > > >> > On Wed, Jun 12, 2019, 1:56 PM Atin Mukherjee > wrote: > >> >> > >> >> > >> >> > >> >> On Wed, 12 Jun 2019 at 18:04, Amar Tumballi Suryanarayan < > atumball at redhat.com> wrote: > >> >>> > >> >>> > >> >>> Few bullet points: > >> >>> > >> >>> * Let smoke job sequentially for below, and if successful, in > parallel for others. > >> >>> - Sequential: > >> >>> -- clang-format check > >> >>> -- compare-bugzilla-version-git-branch > >> >>> -- bugzilla-post > >> >>> -- comment-on-issue > >> >>> -- fedora-smoke (mainly don't want warning). > >> >> > >> >> > >> >> +1 > >> >> > >> >>> - Parallel > >> >>> -- all devrpm jobs > >> >>> -- 32bit smoke > >> >>> -- freebsd-smoke > >> >>> -- smoke > >> >>> -- strfmt_errors > >> >>> -- python-lint, and shellcheck. > >> >> > >> >> > >> >> I?m sure there must be a reason but would like to know that why do > they need to be parallel? Can?t we have them sequentially to have similar > benefits of the resource utilisation like above? Or are all these > individual jobs are time consuming such that having them sequentially will > lead the overall smoke job to consume much longer? > Most of these are doing the same thing, make dist, make install, make rpms. but on different arch and with different flags. To start with, we can do these also sequentially. That way, infra team needn't worry about some parallel, some sequential jobs. > >> >> > >> >>> > >> >>> * Remove Verified flag. No point in one more extra button which > users need to click, anyways CentOS regression is considered as > 'Verification'. > >> > > >> > > >> > The requirement of verified flag by patch owner for regression to run > was added because the number of Jenkins machines we had were few and > patches being uploaded were many. > >> > >> However, do we consider that at present time the situation has > >> improved to consider the change Amar asks for? 
> >> > >> > > >> >>> > >> >>> * In a normal flow, let the CentOS regression, which currently runs after the > 'Verified' vote, be triggered on the first 'successful' +1 review vote. > >> >> > >> >> I believe some reviewers/maintainers (including me) would like to > see the regression vote before putting a +1/+2 on most of the patches until and > unless they are straightforward ones. So while with this you're > reducing the burden of one extra click for the patch owner, on the > other hand you're introducing the same burden on the reviewers who would > like to check the regression vote. IMHO, I don't see much benefit in > implementing this. > >> > > >> > > >> > Agree with Atin here. The burden should be on machines before people. > Reviewers prefer to look at patches that have passed regression. > >> > > >> > In github heketi, we have configured regression to run on all patches > that are submitted by the heketi developer group. If such configuration is > possible in gerrit+Jenkins, we should definitely do it that way. > >> > > >> > For patches that are submitted by someone outside of the developer > group, a maintainer should verify that the patch doesn't do anything > harmful and mark the regression to run. > >> > > >> > >> Deepshikha, is the above change feasible in the summation of Amar's > proposal? > > > > Yes, I'm planning to implement the regression & flag related changes > initially if everyone agrees. > >> > >> > I would say, let's get started on these changes. Regards, Amar > >> >>> > >> >>> * For those patches which got pushed to the system just to 'validate' > behavior, to run sample tests, or as WIP patches, continue to support the 'recheck > centos' comment message, so we can run without any vote. Let it not be the > norm. > >> >>> > >> >>> > >> >>> With this, I see that we can reduce smoke failures and utilize 90% less > resources for a patch which would fail smoke anyway (i.e., 95% of the smoke > failures would be caught in the first 10% of the resources, and time). > >> >>> > >> >>> Also we can reduce the number of regression runs, as review is > mandatory to run regression. > >> >>> > >> >>> These are just suggestions, happy to discuss more on these. > >> _______________________________________________ > >> Gluster-infra mailing list > >> Gluster-infra at gluster.org > >> https://lists.gluster.org/mailman/listinfo/gluster-infra > > > > -- > sankarshan mukhopadhyay > > _______________________________________________ > Gluster-infra mailing list > Gluster-infra at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-infra -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From snowmailer at gmail.com Mon Jun 3 16:58:06 2019 From: snowmailer at gmail.com (Martin) Date: Mon, 03 Jun 2019 16:58:06 -0000 Subject: [Gluster-devel] No healing on peer disconnect - is it correct? Message-ID: <10D708D0-E523-46A0-91BF-FFC41886E316@gmail.com> Hi all, I need someone to explain if my gluster behaviour is correct. I am not sure if my gluster works as it should. I have simple Replica 3 - Number of Bricks: 1 x 3 = 3. When one of my hypervisors is disconnected as peer, i.e. gluster process is down but bricks running, other two healthy nodes start signalling that they lost one peer. This is correct. Next, I restart gluster process on node where gluster process failed and I thought it should trigger healing of files on failed node but nothing is happening. I run VMs disks on this gluster volume.
No healing is triggered after gluster restart, remaining two nodes get peer back after restart of gluster and everything is running without down time. Even VMs that are running on 'failed' node where gluster process was down (bricks were up) are running without down time. Is this behaviour correct? I mean No healing is triggered after peer is reconnected back and VMs. Thanks for explanation. BR! Martin From hunter86_bg at yahoo.com Mon Jun 3 17:39:53 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Mon, 03 Jun 2019 17:39:53 -0000 Subject: [Gluster-devel] [Gluster-users] No healing on peer disconnect - is it correct? Message-ID: Hi Martin, By default gluster will proactively start to heal every 10 min - so this is not OK. Usually, I do not wait for that to get triggered and I run gluster volume heal full (using replica 3 with sharding of 4 MB -> oVirt default). Best Regards, Strahil Nikolov On Jun 3, 2019 19:58, Martin wrote: > > Hi all, > > I need someone to explain if my gluster behaviour is correct. I am not sure if my gluster works as it should. I have simple Replica 3 - Number of Bricks: 1 x 3 = 3. > > When one of my hypervisors is disconnected as peer, i.e. gluster process is down but bricks running, other two healthy nodes start signalling that they lost one peer. This is correct. > Next, I restart gluster process on node where gluster process failed and I thought it should trigger healing of files on failed node but nothing is happening. > > I run VMs disks on this gluster volume. No healing is triggered after gluster restart, remaining two nodes get peer back after restart of gluster and everything is running without down time. > Even VMs that are running on 'failed' node where gluster process was down (bricks were up) are running without down time. > > Is this behaviour correct? I mean No healing is triggered after peer is reconnected back and VMs. > > Thanks for explanation. > > BR! > Martin > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From snowmailer at gmail.com Mon Jun 10 13:50:11 2019 From: snowmailer at gmail.com (snowmailer) Date: Mon, 10 Jun 2019 13:50:11 -0000 Subject: [Gluster-devel] No healing on peer disconnect - is it correct? In-Reply-To: <10D708D0-E523-46A0-91BF-FFC41886E316@gmail.com> References: <10D708D0-E523-46A0-91BF-FFC41886E316@gmail.com> Message-ID: <3B1EE351-5F82-4D05-947A-4960BBAC885A@gmail.com> Can someone advise on this, please? BR! On 3. 6. 2019 at 18:58, Martin wrote: > Hi all, > > I need someone to explain if my gluster behaviour is correct. I am not sure if my gluster works as it should. I have simple Replica 3 - Number of Bricks: 1 x 3 = 3. > > When one of my hypervisors is disconnected as peer, i.e. gluster process is down but bricks running, other two healthy nodes start signalling that they lost one peer. This is correct. > Next, I restart gluster process on node where gluster process failed and I thought it should trigger healing of files on failed node but nothing is happening. > > I run VMs disks on this gluster volume. No healing is triggered after gluster restart, remaining two nodes get peer back after restart of gluster and everything is running without down time.
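As a rough diagnostic for the behaviour described in this thread (peer back, bricks up, but no visible heal activity), the standard heal CLI can be queried and, if nothing is progressing, a full heal requested explicitly, which is what is suggested above. The volume name below is a placeholder.

import subprocess

VOLUME = "myvol"   # placeholder volume name


def gluster(*args):
    # Run a gluster CLI command and return its output as text.
    return subprocess.run(
        ["gluster", *args], check=True, capture_output=True, text=True
    ).stdout


# Entries listed here are files the self-heal daemon still has to process.
print(gluster("volume", "heal", VOLUME, "info"))

# If the list stays non-empty and no heal activity shows up, a full heal
# can be requested explicitly.
print(gluster("volume", "heal", VOLUME, "full"))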
> > Thanks for explanation. > > BR! > Martin > > From snowmailer at gmail.com Mon Jun 10 14:23:50 2019 From: snowmailer at gmail.com (Martin) Date: Mon, 10 Jun 2019 14:23:50 -0000 Subject: [Gluster-devel] [Gluster-users] No healing on peer disconnect - is it correct? In-Reply-To: References: <10D708D0-E523-46A0-91BF-FFC41886E316@gmail.com> <3B1EE351-5F82-4D05-947A-4960BBAC885A@gmail.com> Message-ID: My VMs use Gluster as storage through libgfapi support in Qemu. But I don't see any healing of the reconnected brick. Thanks Karthik / Ravishankar in advance! > On 10 Jun 2019, at 16:07, Hari Gowtham wrote: > > On Mon, Jun 10, 2019 at 7:21 PM snowmailer > wrote: >> >> Can someone advise on this, please? >> >> BR! >> >> On 3. 6. 2019 at 18:58, Martin wrote: >> >>> Hi all, >>> >>> I need someone to explain if my gluster behaviour is correct. I am not sure if my gluster works as it should. I have simple Replica 3 - Number of Bricks: 1 x 3 = 3. >>> >>> When one of my hypervisors is disconnected as peer, i.e. gluster process is down but bricks running, other two healthy nodes start signalling that they lost one peer. This is correct. >>> Next, I restart gluster process on node where gluster process failed and I thought it should trigger healing of files on failed node but nothing is happening. >>> >>> I run VMs disks on this gluster volume. No healing is triggered after gluster restart, remaining two nodes get peer back after restart of gluster and everything is running without down time. >>> Even VMs that are running on 'failed' node where gluster process was down (bricks were up) are running without down time. > > I assume your VMs use gluster as the storage. In that case, the > gluster volume might be mounted on all the hypervisors. > The mount/client is smart enough to give the correct data from the > other two machines which were always up. > This is the reason things are working fine. > > Gluster should heal the brick. > Adding people who can help you better with the heal part. > @Karthik Subrahmanya @Ravishankar N do take a look and answer this part. > >>> >>> Is this behaviour correct? I mean No healing is triggered after peer is reconnected back and VMs. >>> >>> Thanks for explanation. >>> >>> BR! >>> Martin >>> >>> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > > > -- > Regards, > Hari Gowtham. -------------- next part -------------- An HTML attachment was scrubbed... URL: From miklos at szeredi.hu Tue Jun 11 11:52:26 2019 From: miklos at szeredi.hu (Miklos Szeredi) Date: Tue, 11 Jun 2019 11:52:26 -0000 Subject: [Gluster-devel] Linux 5.2-RC regression bisected, mounting glusterfs volumes fails after commit: fuse: require /dev/fuse reads to have enough buffer capacity In-Reply-To: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> References: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> Message-ID: On Tue, Jun 11, 2019 at 1:03 PM Sander Eikelenboom wrote: > > L.S., > > While testing a linux 5.2 kernel I noticed it fails to mount my glusterfs volumes.
> > It repeatedly fails with: > [2019-06-11 09:15:27.106946] W [fuse-bridge.c:4993:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) > [2019-06-11 09:15:27.106955] W [fuse-bridge.c:4993:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) > [2019-06-11 09:15:27.106963] W [fuse-bridge.c:4993:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) > [2019-06-11 09:15:27.106971] W [fuse-bridge.c:4993:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) > etc. > etc. > > Bisecting turned up as culprit: > commit d4b13963f217dd947da5c0cabd1569e914d21699: fuse: require /dev/fuse reads to have enough buffer capacity > > The glusterfs version i'm using is from Debian stable: > ii glusterfs-client 3.8.8-1 amd64 clustered file-system (client package) > ii glusterfs-common 3.8.8-1 amd64 GlusterFS common libraries and translator modules > > > A 5.1.* kernel works fine, as does a 5.2-rc4 kernel with said commit reverted. Thanks for the report, reverted the bad commit. Thanks, Miklos From kirr at nexedi.com Tue Jun 11 20:42:46 2019 From: kirr at nexedi.com (Kirill Smelkov) Date: Tue, 11 Jun 2019 20:42:46 -0000 Subject: [Gluster-devel] Linux 5.2-RC regression bisected, mounting glusterfs volumes fails after commit: fuse: require /dev/fuse reads to have enough buffer capacity In-Reply-To: References: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> Message-ID: <20190611202738.GA22556@deco.navytux.spb.ru> On Tue, Jun 11, 2019 at 01:52:14PM +0200, Miklos Szeredi wrote: > On Tue, Jun 11, 2019 at 1:03 PM Sander Eikelenboom wrote: > > > > L.S., > > > > While testing a linux 5.2 kernel I noticed it fails to mount my glusterfs volumes. > > > > It repeatedly fails with: > > [2019-06-11 09:15:27.106946] W [fuse-bridge.c:4993:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) > > [2019-06-11 09:15:27.106955] W [fuse-bridge.c:4993:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) > > [2019-06-11 09:15:27.106963] W [fuse-bridge.c:4993:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) > > [2019-06-11 09:15:27.106971] W [fuse-bridge.c:4993:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse returned -1 (Invalid argument) > > etc. > > etc. > > > > Bisecting turned up as culprit: > > commit d4b13963f217dd947da5c0cabd1569e914d21699: fuse: require /dev/fuse reads to have enough buffer capacity > > > > The glusterfs version i'm using is from Debian stable: > > ii glusterfs-client 3.8.8-1 amd64 clustered file-system (client package) > > ii glusterfs-common 3.8.8-1 amd64 GlusterFS common libraries and translator modules > > > > > > A 5.1.* kernel works fine, as does a 5.2-rc4 kernel with said commit reverted. > > Thanks for the report, reverted the bad commit. First of all I'm sorry for breaking things here. The diff of the guilty commit is --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -1317,6 +1317,16 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, unsigned reqsize; unsigned int hash; + /* + * Require sane minimum read buffer - that has capacity for fixed part + * of any request header + negotated max_write room for data. If the + * requirement is not satisfied return EINVAL to the filesystem server + * to indicate that it is not following FUSE server/client contract. + * Don't dequeue / abort any request. 
+ */ + if (nbytes < max_t(size_t, FUSE_MIN_READ_BUFFER, 4096 + fc->max_write)) + return -EINVAL; + restart: spin_lock(&fiq->waitq.lock); err = -EAGAIN; and it was essentially requesting that the filesystem server provide 4K+ buffer for reads from /dev/fuse. That 4K was meant as space for FUSE request header, citing commit: Before getting into operation phase, FUSE filesystem server and kernel client negotiate what should be the maximum write size the client will ever issue. After negotiation the contract in between server/client is that the filesystem server then should queue /dev/fuse sys_read calls with enough buffer capacity to receive any client request - WRITE in particular, while FUSE client should not, in particular, send WRITE requests with > negotiated max_write payload. FUSE client in kernel and libfuse historically reserve 4K for request header. This way the contract is that filesystem server should queue sys_reads with 4K+max_write buffer. I could reproduce the problem and as it turns out what broke here is that glusterfs is using not 4K but a smaller room for header - 80 bytes for gluster-3.8 being `sizeof(fuse_in_header) + sizeof(fuse_write_in)`: https://github.com/gluster/glusterfs/blob/v3.8.15-0-gd174f021a/xlators/mount/fuse/src/fuse-bridge.c#L4894 Since `sizeof(fuse_in_header) + sizeof(fuse_write_in)` == `sizeof(fuse_in_header) + sizeof(fuse_read_in)` is the absolute minimum any sane filesystem should be using for header room, can we please restore the patch with that value instead of 4K? That patch was there in the first place to help diagnose stuck fuse servers much more easier, citing commit: If the filesystem server does not follow this contract, what can happen is that fuse_dev_do_read will see that request size is > buffer size, and then it will return EIO to client who issued the request but won't indicate in any way that there is a problem to filesystem server. This can be hard to diagnose because for some requests, e.g. for NOTIFY_REPLY which mimics WRITE, there is no client thread that is waiting for request completion and that EIO goes nowhere, while on filesystem server side things look like the kernel is not replying back after successful NOTIFY_RETRIEVE request made by the server. We can make the problem easy to diagnose if we indicate via error return to filesystem server when it is violating the contract. This should not practically cause problems because if a filesystem server is using shorter buffer, writes to it were already very likely to cause EIO, and if the filesystem is read-only it should be too following FUSE_MIN_READ_BUFFER minimum buffer size. Please see [1] for context where the problem of stuck filesystem was hit for real (because kernel client was incorrectly sending more than max_write data with NOTIFY_REPLY; see also previous patch), how the situation was traced and for more involving patch that did not make it into the tree. [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2 so it would be a pity to loose that property. Miklos, would 4K -> `sizeof(fuse_in_header) + sizeof(fuse_write_in)` for header room change be accepted? 
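To put numbers on the header-room question, here is an illustrative recalculation of the sizes involved; the struct layouts are assumed to match include/uapi/linux/fuse.h of this era, and 128 KiB is only a typical negotiated max_write, not a given.

import ctypes


class fuse_in_header(ctypes.Structure):
    _fields_ = [("len", ctypes.c_uint32), ("opcode", ctypes.c_uint32),
                ("unique", ctypes.c_uint64), ("nodeid", ctypes.c_uint64),
                ("uid", ctypes.c_uint32), ("gid", ctypes.c_uint32),
                ("pid", ctypes.c_uint32), ("padding", ctypes.c_uint32)]


class fuse_write_in(ctypes.Structure):
    _fields_ = [("fh", ctypes.c_uint64), ("offset", ctypes.c_uint64),
                ("size", ctypes.c_uint32), ("write_flags", ctypes.c_uint32),
                ("lock_owner", ctypes.c_uint64), ("flags", ctypes.c_uint32),
                ("padding", ctypes.c_uint32)]


FUSE_MIN_READ_BUFFER = 8192
max_write = 128 * 1024                       # assumed negotiated value
header_room = ctypes.sizeof(fuse_in_header) + ctypes.sizeof(fuse_write_in)

# glusterfs 3.8 queues /dev/fuse reads with only header_room + max_write.
gluster_buffer = header_room + max_write

# Minimum demanded by the reverted commit (4K header assumption) versus
# the relaxed minimum proposed here (exact header size instead of 4K).
reverted_minimum = max(FUSE_MIN_READ_BUFFER, 4096 + max_write)
proposed_minimum = max(FUSE_MIN_READ_BUFFER, header_room + max_write)

print("header room         :", header_room)        # 80 bytes
print("gluster read buffer :", gluster_buffer)      # 131152
print("reverted commit min :", reverted_minimum)    # 135168 -> EINVAL
print("proposed minimum    :", proposed_minimum)    # 131152 -> accepted

Under those assumptions the buffer glusterfs queues is 4016 bytes short of what the reverted check demanded, but would satisfy a check based on the exact header size, which is the change being asked about here.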
Kirill From mszeredi at redhat.com Wed Jun 12 07:45:01 2019 From: mszeredi at redhat.com (Miklos Szeredi) Date: Wed, 12 Jun 2019 07:45:01 -0000 Subject: [Gluster-devel] Linux 5.2-RC regression bisected, mounting glusterfs volumes fails after commit: fuse: require /dev/fuse reads to have enough buffer capacity In-Reply-To: <20190611202738.GA22556@deco.navytux.spb.ru> References: <876aefd0-808a-bb4b-0897-191f0a8d9e12@eikelenboom.it> <20190611202738.GA22556@deco.navytux.spb.ru> Message-ID: On Tue, Jun 11, 2019 at 10:28 PM Kirill Smelkov wrote: > Miklos, would 4K -> `sizeof(fuse_in_header) + sizeof(fuse_write_in)` for > header room change be accepted? Yes, next cycle. For 5.2 I'll just push the revert. Thanks, Miklos From Janak.Desai at gtri.gatech.edu Fri Jun 21 12:04:58 2019 From: Janak.Desai at gtri.gatech.edu (Desai, Janak) Date: Fri, 21 Jun 2019 12:04:58 -0000 Subject: [Gluster-devel] Quick question about the latest glusterfs and client side selinux support In-Reply-To: <1430982045.13733997.1561089532953.JavaMail.zimbra@redhat.com> References: <6D60E9DD-C4E1-4B95-9277-C4F746DB228C@gtri.gatech.edu> <62539ffb-4b65-1733-6151-fc9b2c604254@redhat.com> <9D9C2802-539E-4153-81FF-4A0F8E934E27@gtri.gatech.edu>, <1430982045.13733997.1561089532953.JavaMail.zimbra@redhat.com> Message-ID: <38719e2da1254b4092b1b72133425fe0@gtri.gatech.edu> Thank you so much Jiffin for the quick response! -Janak ________________________________ From: Jiffin Thottan Sent: Thursday, June 20, 2019 11:58:52 PM To: Desai, Janak Cc: Gluster Devel; nfs-ganesha-devel Subject: Re: Quick question about the latest glusterfs and client side selinux support Hi Janak, Currently, it is supported in glusterfs (from 2.8 onwards) and cephfs (already there in 2.7) for nfs-ganesha. -- Jiffin ----- Original Message ----- From: "Janak Desai" To: "Jiffin Tony Thottan" Sent: Thursday, June 20, 2019 9:29:09 PM Subject: Re: Quick question about the latest glusterfs and client side selinux support Hi Jiffin, I came across your presentation "NFS-Ganesha Weather Report" that you gave at FOSDEM '19 in early Feb this year. In that you mentioned that ongoing developments in v2.8 include 'labelled NFS' support. I see that v2.8 is now out. Do you know if labelled NFS support made it in? If it did, is it only supported in the CEPHFS FSAL, or do other FSALs also include support for it? I took a cursory look at the release documents and didn't see Labelled NFS in them, so I thought I would bug you directly. Thanks. -Janak From: Jiffin Tony Thottan Date: Tuesday, August 28, 2018 at 12:50 AM To: Janak Desai , "ndevos at redhat.com" , "mselvaga at redhat.com" Cc: "paul at paul-moore.com" Subject: Re: Quick question about the latest glusterfs and client side selinux support Hi Janak, Thanks for the interest. A basic selinux xlator is present in the gluster server stack. It stores the selinux context on the backend as an xattr. When we developed that xlator, at that point there was no client to test the functionality. Don't know whether the required change in fuse got merged or not. As you mentioned, here first we need to figure out whether the issue is related to the server. You can collect a packet trace using tcpdump from the client and send it with the mail while setting/getting the selinux context. Regards, Jiffin On Tuesday 28 August 2018 04:14 AM, Desai, Janak wrote: Hi Niels, Manikandan, Jiffin, I work for Georgia Tech Research Institute's CIPHER Lab and am investigating suitability of glusterfs for a couple of large upcoming projects. My "google research" is yielding confusing and inconclusive results, so I thought I would try and reach out to some of the core developers to get some clarity. We use SELinux extensively in our software solution. I am trying to find out if, with the latest version 4.1 of glusterfs running on the latest version of rhel, I should be able to associate and enforce selinux contexts from glusterfs clients. I see in the 3.11 release notes that the selinux feature was implemented, but then I also see references to kernel work that is not done yet. I also could not find any documentation/examples on how to add/integrate this selinux translator to set up and enforce selinux labels from the client side. In my simple test setup, which I mounted using the 'selinux' option (which gluster does seem to recognize), I am getting the 'operation not supported' error. I guess either I am not pulling in the selinux translator or I am running up against other missing functionality in the kernel. I would really appreciate it if you could clear this up for me. If I am not configuring my mount correctly, I would appreciate it if you could point me to a document or an example. Our other option is the lustre filesystem, since it does have working client-side association and enforcement of selinux contexts. However, lustre appears to be a lot more difficult to set up and maintain, and I would rather use glusterfs. We need a distributed (or parallel) filesystem that can work with Hadoop. If glusterfs doesn't pan out then I will look at labelled nfs 4.2 that is now available in rhel7. However, my google research shows much more Hadoop affinity for glusterfs than nfs v4. I am also copying Paul Moore, with whom I collaborated a few years ago as part of the team that took Linux through its common criteria evaluation, and whom I haven't bugged lately, to see if he can shed some light on any missing kernel dependencies. I am currently testing with rhel7.5, but would be willing to try an upstream kernel if I have to in order to get this proof of concept going. I know the underlying problem in the kernel is supporting extended attrs on FUSE file systems, but was wondering (and hoping) that at least setup/enforcement of selinux contexts from the client side for glusterfs is possible. Thanks. -Janak -------------- next part -------------- An HTML attachment was scrubbed... URL:
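One quick way to probe whether a given mount honours client-side SELinux labels at all is to read and set the security.selinux extended attribute directly. The sketch below is only a diagnostic aid (the mount path and context value are placeholders) and says nothing about what any particular glusterfs or NFS version actually supports.

import errno
import os

path = "/mnt/glusterfs/testfile"            # placeholder path on the mount under test
label = b"system_u:object_r:fusefs_t:s0"    # example context only

try:
    print("current label:", os.getxattr(path, "security.selinux"))
except OSError as e:
    print("getxattr failed:", errno.errorcode.get(e.errno, e.errno))

try:
    os.setxattr(path, "security.selinux", label)
    print("setxattr succeeded; client-side labelling appears to work")
except OSError as e:
    # EOPNOTSUPP here corresponds to the 'operation not supported' error
    # mentioned in the thread above.
    print("setxattr failed:", errno.errorcode.get(e.errno, e.errno))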