[Gluster-devel] Help needed in understanding GlusterFS logs and debugging elasticsearch failures
Sachidananda URS
surs at redhat.com
Thu Dec 17 10:09:21 UTC 2015
Hi,
I tried the same use case with pure DHT (1 & 2 nodes) and don't see any
problems.
However, if I run the same tests with distributed-replicate, the indices go
red.
If any additional details are needed beyond the logs attached in the earlier
mails, please let me know.
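For what it's worth, the rename-plus-concurrent-read pattern described in the
earlier mail can be approximated with a small standalone script. This is a
hypothetical reproducer sketch, not the actual Lucene I/O: one thread keeps
atomically replacing a target file while reader threads re-open it and verify
its checksum, mimicking the translog rollover renames seen in the client logs.
To exercise the volume, point `root` at the gluster mount instead of a temp
directory.

```python
# Hypothetical reproducer (assumption: not the real Lucene workload).
# One writer renames freshly written files onto a fixed target name while
# several readers repeatedly open the target and verify its checksum.
import hashlib
import os
import tempfile
import threading

def run_rename_read_test(root, iterations=200, readers=4):
    target = os.path.join(root, "current.tlog")
    payload = b"x" * 4096
    digest = hashlib.md5(payload).hexdigest()
    stop = threading.Event()
    errors = []

    def writer():
        for i in range(iterations):
            tmp = os.path.join(root, "tmp-%d.tlog" % i)
            with open(tmp, "wb") as f:
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, target)  # atomic replace, like translog rollover
        stop.set()

    def reader():
        while not stop.is_set():
            try:
                with open(target, "rb") as f:
                    data = f.read()
            except FileNotFoundError:
                continue  # target not renamed into place yet
            if data and hashlib.md5(data).hexdigest() != digest:
                errors.append("checksum mismatch")
                return

    threads = [threading.Thread(target=writer)]
    threads += [threading.Thread(target=reader) for _ in range(readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors  # empty list means no corruption was observed

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(run_rename_read_test(d))  # [] on a healthy local filesystem
```

On a local disk filesystem this should report no mismatches; on the gluster
mount a non-empty result would correspond to the checksum failures Lucene
reports.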
-sac
On Mon, Dec 14, 2015 at 4:13 PM, Sachidananda URS <surs at redhat.com> wrote:
> Hi,
>
>
> On Sat, Dec 12, 2015 at 2:35 AM, Vijay Bellur <vbellur at redhat.com> wrote:
>
>>
>>
>> ----- Original Message -----
>> > From: "Sachidananda URS" <surs at redhat.com>
>> > To: "Gluster Devel" <gluster-devel at gluster.org>
>> > Sent: Friday, December 11, 2015 10:26:04 AM
>> > Subject: [Gluster-devel] Help needed in understanding GlusterFS logs
>> and debugging elasticsearch failures
>> >
>> > Hi,
>> >
>> > I was trying to use GlusterFS as a backend filesystem for storing
>> > Elasticsearch indices on a GlusterFS mount.
>> >
>> > As far as I can understand, the filesystem operations are: the Lucene
>> > engine does a lot of renames on the index files, and multiple threads
>> > read from the same file concurrently.
>> >
>> > While writing the index, Elasticsearch/Lucene complains of index
>> > corruption, the health of the cluster goes red, and all operations on
>> > the index fail thereafter.
>> >
>> > ===================
>> >
>> > [2015-12-10 02:43:45,614][WARN ][index.engine ] [client-2]
>> > [logstash-2015.12.09][3] failed engine [merge failed]
>> > org.apache.lucene.index.MergePolicy$MergeException:
>> > org.apache.lucene.index.CorruptIndexException: checksum failed (hardware
>> > problem?) : expected=0 actual=6d811d06
>> >
>> (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/mnt/gluster2/rhs/nodes/0/indices/logstash-2015.12.09/3/index/_a7.cfs")
>> > [slice=_a7_Lucene50_0.doc]))
>> > at
>> >
>> org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$1.doRun(InternalEngine.java:1233)
>> > at
>> >
>> org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
>> > at
>> >
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> > at
>> >
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> > at java.lang.Thread.run(Thread.java:745)
>> > Caused by: org.apache.lucene.index.CorruptIndexException: checksum
>> failed
>> > (hardware problem?) : expected=0 actual=6d811d06
>> >
>> (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/mnt/gluster2/rhs/nodes/0/indices/logstash-2015.12.09/3/index/_a7.cfs")
>> > [slice=_a7_Lucene50_0.doc]))
>> >
>> > =====================
>> >
>> >
>> > The server logs do not have anything. The client logs are full of
>> > messages like:
>> >
>> >
>> >
>> > [2015-12-03 18:44:17.882032] I [MSGID: 109066]
>> [dht-rename.c:1410:dht_rename]
>> > 0-esearch-dht: renaming
>> >
>> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-61881676454442626.tlog
>> > (hash=esearch-replicate-0/cache=esearch-replicate-0) =>
>> > /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-311.ckp
>> > (hash=esearch-replicate-1/cache=<nul>)
>> > [2015-12-03 18:45:31.276316] I [MSGID: 109066]
>> [dht-rename.c:1410:dht_rename]
>> > 0-esearch-dht: renaming
>> >
>> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-2384654015514619399.tlog
>> > (hash=esearch-replicate-0/cache=esearch-replicate-0) =>
>> > /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-312.ckp
>> > (hash=esearch-replicate-0/cache=<nul>)
>> > [2015-12-03 18:45:31.587660] I [MSGID: 109066]
>> [dht-rename.c:1410:dht_rename]
>> > 0-esearch-dht: renaming
>> >
>> /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-4957943728738197940.tlog
>> > (hash=esearch-replicate-0/cache=esearch-replicate-0) =>
>> > /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-312.ckp
>> > (hash=esearch-replicate-0/cache=<nul>)
>> > [2015-12-03 18:46:48.424605] I [MSGID: 109066]
>> [dht-rename.c:1410:dht_rename]
>> > 0-esearch-dht: renaming
>> >
>> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-1731620600607498012.tlog
>> > (hash=esearch-replicate-1/cache=esearch-replicate-1) =>
>> > /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-313.ckp
>> > (hash=esearch-replicate-1/cache=<nul>)
>> > [2015-12-03 18:46:48.466558] I [MSGID: 109066]
>> [dht-rename.c:1410:dht_rename]
>> > 0-esearch-dht: renaming
>> >
>> /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-5214949393126318982.tlog
>> > (hash=esearch-replicate-1/cache=esearch-replicate-1) =>
>> > /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-313.ckp
>> > (hash=esearch-replicate-1/cache=<nul>)
>> > [2015-12-03 18:48:06.314138] I [MSGID: 109066]
>> [dht-rename.c:1410:dht_rename]
>> > 0-esearch-dht: renaming
>> >
>> /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-9110755229226773921.tlog
>> > (hash=esearch-replicate-0/cache=esearch-replicate-0) =>
>> > /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-314.ckp
>> > (hash=esearch-replicate-1/cache=<nul>)
>> > [2015-12-03 18:48:06.332919] I [MSGID: 109066]
>> [dht-rename.c:1410:dht_rename]
>> > 0-esearch-dht: renaming
>> >
>> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-5193443717817038271.tlog
>> > (hash=esearch-replicate-1/cache=esearch-replicate-1) =>
>> > /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-314.ckp
>> > (hash=esearch-replicate-1/cache=<nul>)
>> > [2015-12-03 18:49:24.694263] I [MSGID: 109066]
>> [dht-rename.c:1410:dht_rename]
>> > 0-esearch-dht: renaming
>> >
>> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-2750483795035758522.tlog
>> > (hash=esearch-replicate-1/cache=esearch-replicate-1) =>
>> > /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-315.ckp
>> > (hash=esearch-replicate-0/cache=<nul>)
>> >
>> > ==============================================================
>> >
>> > The same setup works well on any of the local disk filesystems.
>> > This is a 2 x 2 distributed-replicate setup:
>> >
>> > # gluster vol info
>> >
>> > Volume Name: esearch
>> > Type: Distributed-Replicate
>> > Volume ID: 4e4b205e-28ed-4f9e-9fa4-0d020428dede
>> > Status: Started
>> > Number of Bricks: 2 x 2 = 4
>> > Transport-type: tcp,rdma
>> > Bricks:
>> > Brick1: 10.70.47.171:/gluster/brick1
>> > Brick2: 10.70.47.187:/gluster/brick1
>> > Brick3: 10.70.47.121:/gluster/brick1
>> > Brick4: 10.70.47.172:/gluster/brick1
>> > Options Reconfigured:
>> > performance.read-ahead: off
>> > performance.write-behind: off
>> >
>> >
>> > I need a little help in understanding the failures. Let me know if you
>> > need further information on the setup or access to the system to debug
>> > further. I've attached the debug logs for investigation.
>> >
>>
>>
>> Would it be possible to turn off all the performance translators
>> (md-cache, quick-read, io-cache, etc.) and check if the same problem
>> persists? Collecting an strace of the elasticsearch process that does I/O
>> on gluster can also help.
>>
>
> I turned off all the performance xlators.
>
>
> gluster vol info
>
> Volume Name: esearch
> Type: Distributed-Replicate
> Volume ID: 4e4b205e-28ed-4f9e-9fa4-0d020428dede
> Status: Started
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp,rdma
> Bricks:
> Brick1: 10.70.47.171:/gluster/brick1
> Brick2: 10.70.47.187:/gluster/brick1
> Brick3: 10.70.47.121:/gluster/brick1
> Brick4: 10.70.47.172:/gluster/brick1
> Options Reconfigured:
> performance.stat-prefetch: off
> performance.md-cache-timeout: 0
> performance.quick-read: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.write-behind: off
>
> The problem still persists. Attaching strace logs.
>
>
> -sac
>