[Gluster-devel] Help needed in understanding GlusterFS logs and debugging elasticsearch failures

Sachidananda URS surs at redhat.com
Fri Dec 11 15:26:04 UTC 2015


Hi,

I was trying to use GlusterFS as a backend filesystem for storing the
elasticsearch indices on GlusterFS mount.

The filesystem operations, as far as I can understand, are: the Lucene
engine does a lot of renames on the index files, and multiple threads
read from the same file concurrently.
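For reference, the access pattern can be reproduced in isolation with a
small sketch (placeholder file names, not the real index layout; this is
roughly how Lucene-style commits behave, not Lucene itself):

```shell
# Sketch of the Lucene-style access pattern: write under a temporary
# name, then rename into place, while other processes read the final
# path concurrently. Paths are placeholders, not the real index layout.
dir=$(mktemp -d)

# "writer": create the segment under a temp name, then rename it.
printf 'segment-data' > "$dir/_a7.cfs.tmp"
mv "$dir/_a7.cfs.tmp" "$dir/_a7.cfs"

# "readers": several concurrent reads of the same file.
for i in 1 2 3; do
    cat "$dir/_a7.cfs" > "$dir/read.$i" &
done
wait

# Every reader should have seen the complete, renamed file.
reads_ok=yes
for i in 1 2 3; do
    cmp -s "$dir/read.$i" "$dir/_a7.cfs" || reads_ok=no
done
echo "reads consistent: $reads_ok"
rm -rf "$dir"
```

On a local filesystem the rename is atomic, so readers never see a
partial file; the question is whether the same holds across the DHT
rename path on the Gluster mount.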

While writing the index, elasticsearch/lucene complains of index
corruption, the health of the cluster goes red, and all operations on
the index fail thereafter.

===================

[2015-12-10 02:43:45,614][WARN ][index.engine             ] [client-2]
[logstash-2015.12.09][3] failed engine [merge failed]
org.apache.lucene.index.MergePolicy$MergeException:
org.apache.lucene.index.CorruptIndexException: checksum failed
(hardware problem?) : expected=0 actual=6d811d06
(resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/mnt/gluster2/rhs/nodes/0/indices/logstash-2015.12.09/3/index/_a7.cfs")
[slice=_a7_Lucene50_0.doc]))
        at org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$1.doRun(InternalEngine.java:1233)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed
(hardware problem?) : expected=0 actual=6d811d06
(resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/mnt/gluster2/rhs/nodes/0/indices/logstash-2015.12.09/3/index/_a7.cfs")
[slice=_a7_Lucene50_0.doc]))

=====================


The server logs do not have anything. The client logs are full of messages like:



[2015-12-03 18:44:17.882032] I [MSGID: 109066]
[dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming
/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-61881676454442626.tlog
(hash=esearch-replicate-0/cache=esearch-replicate-0) =>
/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-311.ckp
(hash=esearch-replicate-1/cache=<nul>)
[2015-12-03 18:45:31.276316] I [MSGID: 109066]
[dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming
/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-2384654015514619399.tlog
(hash=esearch-replicate-0/cache=esearch-replicate-0) =>
/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-312.ckp
(hash=esearch-replicate-0/cache=<nul>)
[2015-12-03 18:45:31.587660] I [MSGID: 109066]
[dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming
/rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-4957943728738197940.tlog
(hash=esearch-replicate-0/cache=esearch-replicate-0) =>
/rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-312.ckp
(hash=esearch-replicate-0/cache=<nul>)
[2015-12-03 18:46:48.424605] I [MSGID: 109066]
[dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming
/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-1731620600607498012.tlog
(hash=esearch-replicate-1/cache=esearch-replicate-1) =>
/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-313.ckp
(hash=esearch-replicate-1/cache=<nul>)
[2015-12-03 18:46:48.466558] I [MSGID: 109066]
[dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming
/rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-5214949393126318982.tlog
(hash=esearch-replicate-1/cache=esearch-replicate-1) =>
/rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-313.ckp
(hash=esearch-replicate-1/cache=<nul>)
[2015-12-03 18:48:06.314138] I [MSGID: 109066]
[dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming
/rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-9110755229226773921.tlog
(hash=esearch-replicate-0/cache=esearch-replicate-0) =>
/rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-314.ckp
(hash=esearch-replicate-1/cache=<nul>)
[2015-12-03 18:48:06.332919] I [MSGID: 109066]
[dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming
/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-5193443717817038271.tlog
(hash=esearch-replicate-1/cache=esearch-replicate-1) =>
/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-314.ckp
(hash=esearch-replicate-1/cache=<nul>)
[2015-12-03 18:49:24.694263] I [MSGID: 109066]
[dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming
/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-2750483795035758522.tlog
(hash=esearch-replicate-1/cache=esearch-replicate-1) =>
/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-315.ckp
(hash=esearch-replicate-0/cache=<nul>)

==============================================================
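The rename messages above are all severity I (informational). A quick
way to look for actual failures is to filter the client log for W/E
entries; the sketch below runs against inlined sample lines (the W/E
lines and their MSGIDs are hypothetical), but on the real system the
input would be the attached mnt-gluster.log:

```shell
# Filter a gluster client log for warning/error entries (the severity
# letter follows the timestamp). Sample lines are inlined; the W and E
# lines below are hypothetical examples of the format.
log=$(mktemp)
cat > "$log" <<'EOF'
[2015-12-03 18:44:17.882032] I [MSGID: 109066] [dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming ...
[2015-12-03 18:44:18.000000] W [MSGID: 999901] [client-rpc-fops.c:0:fn] 0-esearch-client-0: hypothetical warning
[2015-12-03 18:44:19.000000] E [MSGID: 999902] [afr-transaction.c:0:fn] 0-esearch-replicate-0: hypothetical error
EOF

# Keep only warning/error entries, dropping the informational noise.
matches=$(grep -E '^\[[^]]+\] [WE] ' "$log")
echo "$matches"
count=$(printf '%s\n' "$matches" | wc -l)
rm -f "$log"
```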

The same setup works well on local disk filesystems.
This is a 2 x 2 distributed-replicate setup:

# gluster vol info

Volume Name: esearch
Type: Distributed-Replicate
Volume ID: 4e4b205e-28ed-4f9e-9fa4-0d020428dede
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp,rdma
Bricks:
Brick1: 10.70.47.171:/gluster/brick1
Brick2: 10.70.47.187:/gluster/brick1
Brick3: 10.70.47.121:/gluster/brick1
Brick4: 10.70.47.172:/gluster/brick1
Options Reconfigured:
performance.read-ahead: off
performance.write-behind: off
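Since the CorruptIndexException suggests a hardware problem, one way to
narrow it down is to compare the checksum of the failing segment file as
seen through the FUSE mount against the copies on each replica brick. A
minimal sketch, using temp files as stand-ins for the real mount and
brick paths (e.g. /mnt/gluster2/... and /gluster/brick1/...):

```shell
# Compare checksums of a file as seen through the mount vs. on a brick.
# The temp files below are placeholders; on the real setup you would
# point these at the FUSE mount path and each brick's copy of the file.
mount_copy=$(mktemp)   # stands in for the file on /mnt/gluster2/...
brick_copy=$(mktemp)   # stands in for the file on /gluster/brick1/...
printf 'segment-data' > "$mount_copy"
cp "$mount_copy" "$brick_copy"

sum_mount=$(md5sum "$mount_copy" | cut -d' ' -f1)
sum_brick=$(md5sum "$brick_copy" | cut -d' ' -f1)

if [ "$sum_mount" = "$sum_brick" ]; then
    echo "checksums match"
else
    echo "checksums differ: mount=$sum_mount brick=$sum_brick"
fi
rm -f "$mount_copy" "$brick_copy"
```

If the brick copies agree with each other but the mount reads back
something different, that would point at the client-side path rather
than the disks.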


I need a little help in understanding these failures. Let me know if you
need further information on the setup, or access to the system to debug
further. I've attached the debug logs for further investigation.

-sac
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mnt-gluster.log.bz2
Type: application/x-bzip2
Size: 618811 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-devel/attachments/20151211/dda3b001/attachment-0001.bz2>

