LDAP Stopped Working (11.1)

Not sure that is an 11.1 issue, as it worked fine after upgrade (2 weeks ago), but we rarely have any issues with the LDAP->AD connection, but this morning this stopped working. Basically, the login will hang for about a minute and then return “Error: Invalid credentials”.

image

Couldn’t find much in the logs and the config looks correct, so restarted Apache Tomcat and now it’s working fine.

Maybe this is related? Probably not, as it happened on restart of the service, but LDAP is working.

12-Mar-2019 10:40:23.446 SEVERE [localhost-startStop-2] org.apache.catalina.core.StandardContext.listenerStop Exception sending context destroyed event to listener instance of class [org.xwiki.container.servlet.XWikiServletContextListener]
org.infinispan.IllegalLifecycleStateException: ISPN000323: Cache ‘user.membership.members’ is in ‘TERMINATED’ state and so it does not accept new invocations. Either restart it or recreate the cache container.
at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:97)
at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:79)
at org.infinispan.commands.AbstractVisitor.visitClearCommand(AbstractVisitor.java:58)
at org.infinispan.commands.write.ClearCommand.acceptVisitor(ClearCommand.java:44)
at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:335)
at org.infinispan.cache.impl.CacheImpl.clear(CacheImpl.java:591)
at org.infinispan.cache.impl.CacheImpl.clear(CacheImpl.java:582)
at org.xwiki.cache.infinispan.internal.InfinispanCache.removeAll(InfinispanCache.java:107)
at org.xwiki.user.internal.group.AbstractGroupCache.removeAll(AbstractGroupCache.java:302)
at org.xwiki.user.internal.group.AbstractGroupCache.dispose(AbstractGroupCache.java:323)
at org.xwiki.component.embed.EmbeddableComponentManager.dispose(EmbeddableComponentManager.java:685)
at org.xwiki.container.servlet.XWikiServletContextListener.contextDestroyed(XWikiServletContextListener.java:161)
at org.apache.catalina.core.StandardContext.listenerStop(StandardContext.java:4792)
at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5429)
at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:226)
at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1435)
at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1424)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

For some reason the infinispan cache ended up in the bad state which very rarely happen but I did saw it in the past. I don’t think it has anything to do with 11.1 since Infinispan hasn’t been upgraded since 10.5.

Usually you don’t see it anymore after restarting XWiki.

Happening again, we can’t login with LDAP, but I can’t find that error.

I’m not restarting Apache Tomcat without doing some diagnostics first. I was able to login with local admin account:

image

Did find this one error, but it’s worked since this time:

Mar 12, 2019 11:10:14 AM org.restlet.engine.application.StatusFilter doHandle
WARNING: Exception or error caught by status service
java.lang.StackOverflowError
at org.hibernate.hql.antlr.HqlBaseParser.logicalAndExpression(HqlBaseParser.java:2476)
at org.hibernate.hql.antlr.HqlBaseParser.logicalOrExpression(HqlBaseParser.java:2443)
at org.hibernate.hql.antlr.HqlBaseParser.expression(HqlBaseParser.java:2204)
at org.hibernate.hql.antlr.HqlBaseParser.expressionOrVector(HqlBaseParser.java:4625)
at org.hibernate.hql.antlr.HqlBaseParser.primaryExpression(HqlBaseParser.java:1032)
at org.hibernate.hql.antlr.HqlBaseParser.atom(HqlBaseParser.java:3679)
at org.hibernate.hql.antlr.HqlBaseParser.unaryExpression(HqlBaseParser.java:3442)
at org.hibernate.hql.antlr.HqlBaseParser.multiplyExpression(HqlBaseParser.java:3309)
at org.hibernate.hql.antlr.HqlBaseParser.additiveExpression(HqlBaseParser.java:2994)
at org.hibernate.hql.antlr.HqlBaseParser.concatenation(HqlBaseParser.java:632)
at org.hibernate.hql.antlr.HqlBaseParser.relationalExpression(HqlBaseParser.java:2757)
at org.hibernate.hql.antlr.HqlBaseParser.equalityExpression(HqlBaseParser.java:2613)
at org.hibernate.hql.antlr.HqlBaseParser.negatedExpression(HqlBaseParser.java:2572)
at org.hibernate.hql.antlr.HqlBaseParser.logicalAndExpression(HqlBaseParser.java:2483)
at org.hibernate.hql.antlr.HqlBaseParser.logicalOrExpression(HqlBaseParser.java:2443)
at org.hibernate.hql.antlr.HqlBaseParser.expression(HqlBaseParser.java:2204)
at org.hibernate.hql.antlr.HqlBaseParser.expressionOrVector(HqlBaseParser.java:4625)
at org.hibernate.hql.antlr.HqlBaseParser.primaryExpression(HqlBaseParser.java:1032)
at org.hibernate.hql.antlr.HqlBaseParser.atom(HqlBaseParser.java:3679)

image

at org.xwiki.rendering.internal.macro.velocity.VelocityMacro.evaluateString(VelocityMacro.java:139)
at org.xwiki.rendering.internal.macro.velocity.VelocityMacro.evaluateString(VelocityMacro.java:52)
at org.xwiki.rendering.macro.script.AbstractScriptMacro.evaluateBlock(AbstractScriptMacro.java:286)
at org.xwiki.rendering.macro.script.AbstractScriptMacro.execute(AbstractScriptMacro.java:182)
at org.xwiki.rendering.macro.script.AbstractScriptMacro.execute(AbstractScriptMacro.java:58)
at org.xwiki.rendering.internal.transformation.macro.MacroTransformation.transform(MacroTransformation.java:297)

Not sure if this is related, but for a few versions now, after restarting Tomcat, xwiki takes a very long time to load. Times out the first 2 or 3 times I try to load it in the browser.

Restarting Tomcat service fixed it…but for how long?

Ok, as part of the restart:

14-Mar-2019 17:01:53.415 INFO [Thread-417] org.apache.coyote.AbstractProtocol.stop Stopping ProtocolHandler [“ajp-nio-8009”]
Java HotSpot™ 64-Bit Server VM warning: ignoring option MaxPermSize=196m; support was removed in 8.0
.hibernate.hql.antlr.HqlBaseParser.primaryExpression(HqlBaseParser.java:1032)
at org.hibernate.hql.antlr.HqlBaseParser.atom(HqlBaseParser.java:3679)
at org.hibernate.hql.antlr.HqlBaseParser.unaryExpression(HqlBaseParser.java:3442)
at org.hibernate.hql.antlr.HqlBaseParser.multiplyExpression(HqlBaseParser.java:3309)
at org.hibernate.hql.antlr.HqlBaseParser.additiveExpression(HqlBaseParser.java:2994)
at org.hibernate.hql.antlr.HqlBaseParser.concatenation(HqlBaseParser.java:632)
at org.hibernate.hql.antlr.HqlBaseParser.relationalExpression(HqlBaseParser.java:2757)
at org.hibernate.hql.antlr.HqlBaseParser.equalityExpression(HqlBaseParser.java:2613)
at org.hibernate.hql.antlr.HqlBaseParser.negatedExpression(HqlBaseParser.java:2572)
at org.hibernate.hql.antlr.HqlBaseParser.logicalAndExpression(HqlBaseParser.java:2483)
at org.hibernate.hql.antlr.HqlBaseParser.logicalOrExpression(HqlBaseParser.java:2443)
at org.hibernate.hql.antlr.HqlBaseParser.expression(HqlBaseParser.java:2204)
at org.hibernate.hql.antlr.HqlBaseParser.expressionOrVector(HqlBaseParser.java:4625)
at org.hibernate.hql.antlr.HqlBaseParser.primaryExpression(HqlBaseParser.java:1032)

More:

2019-03-14 17:00:30,016 [DefaultQuartzScheduler_Worker-4] ERROR n.i.e.NotificationUserIterator - Fail to get next user.
org.xwiki.query.QueryException: Exception while executing query. Query statement = [ select distinct doc.fullName from XWikiDocument as doc , BaseObject as objUser , com.xpn.xwiki.objects.StringProperty as objUser_email1 where ( length ( objUser_email1.value ) > 0 ) and doc.fullName=objUser.name and objUser.className=‘XWiki.XWikiUsers’ and objUser_email1.id.id=objUser.id and objUser_email1.id.name=‘email’ order by doc.fullName ]
at com.xpn.xwiki.store.hibernate.query.HqlQueryExecutor.execute(HqlQueryExecutor.java:195)
at org.xwiki.query.internal.DefaultQueryExecutorManager.execute(DefaultQueryExecutorManager.java:72)
at org.xwiki.query.internal.DefaultQuery.execute(DefaultQuery.java:296)
at org.xwiki.query.xwql.internal.XWQLQueryExecutor.execute(XWQLQueryExecutor.java:124)
at org.xwiki.query.internal.DefaultQueryExecutorManager.execute(DefaultQueryExecutorManager.java:72)
at org.xwiki.query.internal.DefaultQuery.execute(DefaultQuery.java:296)
at org.xwiki.notifications.notifiers.internal.email.NotificationUserIterator.doQuery(NotificationUserIterator.java:141)
at org.xwiki.notifications.notifiers.internal.email.NotificationUserIterator.getNext(NotificationUserIterator.java:116)
at org.xwiki.notifications.notifiers.internal.email.NotificationUserIterator.initialize(NotificationUserIterator.java:104)
at org.xwiki.notifications.notifiers.internal.email.NotificationEmailJob.executeJob(NotificationEmailJob.java:67)
at com.xpn.xwiki.plugin.scheduler.AbstractJob.execute(AbstractJob.java:76)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
Caused by: com.xpn.xwiki.XWikiException: Error number 0 in 3: Exception while hibernate execute
at com.xpn.xwiki.store.XWikiHibernateBaseStore.execute(XWikiHibernateBaseStore.java:927)

2019-03-14 17:07:46,671 [https://wiki.pridedallas.com:8443/xwiki/bin/download/XWiki/XWikiUserSheet/noavatar.png?width=180] WARN c.x.x.p.i.ImagePlugin - Failed to transform image attachment.
java.lang.NullPointerException: null
at com.xpn.xwiki.plugin.image.ImagePlugin.shrinkImage(ImagePlugin.java:318)
at com.xpn.xwiki.plugin.image.ImagePlugin.downloadImageFromCache(ImagePlugin.java:289)
at com.xpn.xwiki.plugin.image.ImagePlugin.downloadImage(ImagePlugin.java:254)
at com.xpn.xwiki.plugin.image.ImagePlugin.downloadAttachment(ImagePlugin.java:225)
at com.xpn.xwiki.plugin.XWikiPluginManager.downloadAttachment(XWikiPluginManager.java:284)
at com.xpn.xwiki.web.DownloadAction.render(DownloadAction.java:149)
at com.xpn.xwiki.web.XWikiAction.execute(XWikiAction.java:473)
at com.xpn.xwiki.web.XWikiAction.execute(XWikiAction.java:210)
at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:425)
at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:228)
at org.apache.struts.action.ActionServlet.process(ActionServlet.java:1913)
at org.apache.struts.action.ActionServlet.doGet(ActionServlet.java:449)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:635)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)

The long restart duration could be caused by a depleted entroy pool: https://forum.xwiki.org/t/errors-and-warnings-in-logfile/3764/8

Apparently, Xwiki should use /dev/urandom, but something in it somehow seems to read from /dev/random, causing the starting server to block for a long time in case of insufficient entropy.

A “find / > /dev/null” on the server does wonders in this case (and can be interrupted / killed once the server started successfully).

15-Mar-2019 19:46:55.782 INFORMATION [localhost-startStop-1] org.apache.catalina.util.SessionIdGeneratorBase.createSecureRandom Creation of SecureRandom instance for session ID generation using [SHA1PRNG] took [305,590] milliseconds.

(Yes, that’s more than 5 minutes.)

15-Mar-2019 19:46:58.108 INFORMATION [main] org.apache.catalina.startup.Catalina.start Server startup in 350718 ms

(So server startup basically took only 45 seconds, but was blocked for more than 5 minutes to collect sufficient entropy.)

Note that the XWiki Docker image does use urandom: https://github.com/xwiki-contrib/docker-xwiki/blob/master/template/tomcat/setenv.sh#L35

So indeed if the XWiki debian distribution doesn’t, it should, definitely. Same thing for tomcat’s config for slash and backslash if it’s not set up by default in that distribution. I quickly searched XWiki’s source and couldn’t find any use of that… A jira issue for that would be great.

cc @tmortagne

Anyway, the long startup issue is not new (a few months old) but the LDAP issue is (started in 11.1) - are they related? What do we do about the LDAP issue?

Sorry, didn’t want to hijack this thread, I just wanted to indicate where the delays came from.

As requested by vmassol I created a new JIRA issue: https://jira.xwiki.org/browse/XWIKI-16235

I’m also using LDAP but only Xwiki 10.11.3, so I cannot provide any further insight into the original issue here.

Happened again today. Appears to happen every 3-5 days. We may have to downgrade, which means loss of changes (I guess), unless anyone has an idea why this is happening, even if just to prove the issue is 11.1.

This occurred during service restart:

19-Mar-2019 09:01:02.965 SEVERE [https://xwiki01.pridedallas.com:8443/xwiki/bin/loginsubmit/XWiki/XWikiLogin] org.apache.catalina.core.StandardWrapperValve.invoke Servlet.service() for servlet [action] in context with path [/xwiki] threw exception
java.lang.NullPointerException
at org.apache.struts.action.RequestProcessor.getInternal(RequestProcessor.java:1123)
at org.apache.struts.action.RequestProcessor.processException(RequestProcessor.java:512)
at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:427)
at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:228)
at org.apache.struts.action.ActionServlet.process(ActionServlet.java:1913)
at org.apache.struts.action.ActionServlet.doPost(ActionServlet.java:462)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:661)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
at

These are from yesterday:

18-Mar-2019 09:04:28.275 SEVERE [https-openssl-nio-8443-exec-2] org.apache.catalina.core.StandardWrapperValve.invoke Servlet.service() for servlet [default] in context with path [/xwiki] threw exception [Failed to handle Resource Reference [[uix, xwiki:Panels.MyRecentModifications, author, xwiki:XWiki.keithdavis, locale, en, request.wiki, xwiki, secureDocument, xwiki:Panels.MyRecentModifications, user, xwiki:XWiki.billyraines, wiki, xwiki, 11]]] with root cause
org.xwiki.resource.ResourceReferenceHandlerException: Cannot find any status for id [[uix, xwiki:Panels.MyRecentModifications, author, xwiki:XWiki.keithdavis, locale, en, request.wiki, xwiki, secureDocument, xwiki:Panels.MyRecentModifications, user, xwiki:XWiki.billyraines, wiki, xwiki, 11]]
at org.xwiki.rendering.async.internal.service.AsyncRendererResourceReferenceHandler.handle(AsyncRendererResourceReferenceHandler.java:115)
at org.xwiki.resource.internal.DefaultResourceReferenceHandlerChain.handleNext(DefaultResourceReferenceHandlerChain.java:79)
at org.xwiki.resource.internal.AbstractResourceReferenceHandlerManager.handle(AbstractResourceReferenceHandlerManager.java:82)
at org.xwiki.resource.servlet.ResourceReferenceHandlerServlet.handleResourceReference(ResourceReferenceHandlerServlet.java:159)
at org.xwiki.resource.servlet.ResourceReferenceHandlerServlet.service(ResourceReferenceHandlerServlet.java:87)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.xwiki.container.servlet.filters.internal.SetHTTPHeaderFilter.doFilter(SetHTTPHeaderFilter.java:63)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:728)
at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:467)
at org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:351)
at org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:311)
at org.xwiki.resource.servlet.RoutingFilter.doFilter(RoutingFilter.java:145)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:595)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:80)
at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:650)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:799)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1457)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)

Happened again today. Again, we get about 3-5 days out of it before it fails. How do we fix this issue?