Solr Indexer Crashed

The bug is described at https://jira.xwiki.org/browse/SCSEARCH-1

The Solr indexer crashes when indexing an article like this:

== 1. Header 1 ==
Text by header 1
== 2.1. Header 2 ==
Text by header 2.1

When we modify it (changing 2.1. to 2.), indexing starts to work correctly:

== 1. Header 1 ==
Text by header 1
== 2. Header 2 ==
Text by header 2

Setup: XWiki 11.4 Tomcat Docker + MySQL 5.7 Docker

2019-06-24 17:20:24,843 [XWiki Solr index thread] ERROR o.a.s.h.RequestHandlerBase - org.apache.solr.common.SolrException: Exception writing document id xwiki:Sandbox.ArticleTest.WebHome_ru to the index; possible analysis error: startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards startOffset=40,endOffset=41,lastStartOffset=41 for field 'doccontentraw_ru'
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:243)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:1001)
at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1222)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:693)
at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:110)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:327)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readIterator(JavaBinUpdateRequestCodec.java:280)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:333)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:278)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readNamedList(JavaBinUpdateRequestCodec.java:235)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:298)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:278)
at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:191)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:126)
at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:123)
at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:70)
at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2551)
at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:191)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:177)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:138)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:156)
at org.xwiki.search.solr.internal.AbstractSolrInstance.add(AbstractSolrInstance.java:61)
at org.xwiki.search.solr.internal.DefaultSolrIndexer.processBatch(DefaultSolrIndexer.java:413)
at org.xwiki.search.solr.internal.DefaultSolrIndexer.run(DefaultSolrIndexer.java:377)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards startOffset=40,endOffset=41,lastStartOffset=41 for field 'doccontentraw_ru'
at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:824)
at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:430)
at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:394)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:251)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:494)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1616)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1608)
at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:969)
at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:341)
at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:288)
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:235)
... 30 more

2019-06-24 17:20:24,843 [XWiki Solr index thread] ERROR o.x.s.s.i.DefaultSolrIndexer - Failed to process entry [INDEX xwiki:Sandbox.ArticleTest.WebHome]
org.apache.solr.common.SolrException: Exception writing document id xwiki:Sandbox.ArticleTest.WebHome_ru to the index; possible analysis error: startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards startOffset=40,endOffset=41,lastStartOffset=41 for field 'doccontentraw_ru'
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:243)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:1001)
at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1222)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:693)
at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:110)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:327)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readIterator(JavaBinUpdateRequestCodec.java:280)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:333)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:278)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readNamedList(JavaBinUpdateRequestCodec.java:235)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:298)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:278)
at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:191)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:126)
at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:123)
at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:70)
at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2551)
at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:191)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:177)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:138)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:156)
at org.xwiki.search.solr.internal.AbstractSolrInstance.add(AbstractSolrInstance.java:61)
at org.xwiki.search.solr.internal.DefaultSolrIndexer.processBatch(DefaultSolrIndexer.java:413)
at org.xwiki.search.solr.internal.DefaultSolrIndexer.run(DefaultSolrIndexer.java:377)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards startOffset=40,endOffset=41,lastStartOffset=41 for field 'doccontentraw_ru'
at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:824)
at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:430)
at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:394)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:251)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:494)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1616)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1608)
at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:969)
at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:341)
at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:288)
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:235)
... 30 common frames omitted

Word Delimiter Filter

https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html

Word Delimiter Filter has been deprecated in favor of Word Delimiter Graph Filter, which is required to produce a correct token graph so that e.g. phrase queries can work correctly.

Solution: use Word Delimiter Graph Filter instead of Word Delimiter Filter in your config file.
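
If you are not sure whether your schema still uses the old filter, a small script can list the affected field types. This is only a rough sketch: the schema excerpt, the field type names (text_old, text_new) and the helper find_deprecated_filters are made up for illustration, and it assumes the filters are declared with the short solr. class prefix. The Solr guide also says that the graph filter should be followed by a Flatten Graph Filter when it is used at index time, which the second field type in the sample shows.

```python
import xml.etree.ElementTree as ET

DEPRECATED = "solr.WordDelimiterFilterFactory"
REPLACEMENT = "solr.WordDelimiterGraphFilterFactory"

# Hypothetical schema excerpt for illustration only; a real managed-schema is
# much larger and its field type names will differ.
SAMPLE_SCHEMA = """<schema name="example" version="1.6">
  <fieldType name="text_old" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"/>
    </analyzer>
  </fieldType>
  <fieldType name="text_new" class="solr.TextField">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.WordDelimiterGraphFilterFactory" generateWordParts="1"/>
      <filter class="solr.FlattenGraphFilterFactory"/>
    </analyzer>
  </fieldType>
</schema>"""


def find_deprecated_filters(schema_xml):
    """Return the names of field types that still declare the deprecated filter."""
    root = ET.fromstring(schema_xml)
    hits = []
    for field_type in root.iter("fieldType"):
        # Look at every <filter> nested under this field type's analyzers.
        for flt in field_type.iter("filter"):
            if flt.get("class") == DEPRECATED:
                hits.append(field_type.get("name"))
                break  # one report per field type is enough
    return hits


if __name__ == "__main__":
    for name in find_deprecated_filters(SAMPLE_SCHEMA):
        print(f"{name}: replace {DEPRECATED} with {REPLACEMENT}")
```

To check a real file, read its contents and pass them to find_deprecated_filters instead of SAMPLE_SCHEMA. After changing the schema, the affected documents need to be reindexed.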

This has been fixed in XWiki 11.4RC1 by the upgrade to Solr 7.7.1. Did you upgrade to XWiki 11.4, or is it a clean install? I ask because the default Solr index schema in XWiki 11.4 doesn't use Word Delimiter Filter, see https://github.com/xwiki/xwiki-platform/blob/xwiki-platform-11.4/xwiki-platform-core/xwiki-platform-search/xwiki-platform-search-solr/xwiki-platform-search-solr-server/xwiki-platform-search-solr-server-data/src/main/resources/xwiki/conf/managed-schema . Only Word Delimiter Graph Filter is used.

Thanks!
I had the old schema file with Word Delimiter Filter, and I edited it manually to fix the problem.