Project

General

Profile

Actions

Task #1237

closed

Quartz bug filling log files

Added by Peter Štrunc over 5 years ago. Updated over 5 years ago.

Status:
Closed
Priority:
Urgent
Assignee:
Petr Fišer
Category:
Scheduler
Target version:
Start date:
09/03/2018
Due date:
% Done:

100%

Estimated time:
Owner:

Description

When connection to database fails, instead of reconnecting, quartz continues to print stacktrace to logfile.

More info in https://jira.terracotta.org/jira/browse/QTZ-470

Actions #1

Updated by Radek Tomiška over 5 years ago

  • Priority changed from Normal to Low
  • Target version deleted (Moonstone (9.1.0))
Actions #2

Updated by Petr Fišer over 5 years ago

Live thread about issue here: https://github.com/quartz-scheduler/quartz/issues/71
Tarracotta's JIRA is "archived", project was moved to github.

Actions #3

Updated by Petr Fišer over 5 years ago

Petr Fišer wrote:

Live thread about issue here: https://github.com/quartz-scheduler/quartz/issues/71
Tarracotta's JIRA is "archived", project was moved to github.

Also update on this: The Quartz guys were able to work-around the issue a bit (by creating a retry delay setting https://github.com/quartz-scheduler/quartz/pull/116 ) but it is not clear if this got into the 2.2.x branch of Quartz or if it is planned for 2.3 release. The only sure thing is that it is already in Quartz master branch. We chould investigate a bit if it is possible to run worked-around version of quartz.

Also please consider raising the priority. This pains us BIG TIME on project deployments.

Actions #4

Updated by Radek Tomiška over 5 years ago

So when you database is wrong configured (or not available - offline), quartz logs this repetitively ... isn't better to configure database properly (or monitor not running database, which should be turned on anyway)?

Actions #5

Updated by Peter Štrunc over 5 years ago

  • Priority changed from Low to High

This happens even when db is temporarily unavailable. For example when a few packets drop. It happens on almost all instances, where db is on another server. IdM works properly and is able to recover connection, but quartz just keeps on logging.

Actions #6

Updated by Radek Tomiška over 5 years ago

Ok, so if db is temporarily unavailable, then one (or two..) logs with correct exception are logged ... this cost really BIG TIME on project "deployments" for you? Then ok ... we can look at it.

Actions #7

Updated by Peter Štrunc over 5 years ago

Db is temporarily unavailable -> quartz logs 100GB of data into log -> this costs "big time"

The thing is that quartz does not stop logging error. Never. You must shutdown tomcat in order to stop it from logging.

Actions #8

Updated by Radek Tomiška over 5 years ago

Ok, i understand now, where is the main issue and i agree, this is really not good.

Actions #9

Updated by Marcel Poul over 5 years ago

  • Priority changed from High to Urgent
Actions #12

Updated by Marek Klement over 5 years ago

Tady to jsou ty chyby:

ERROR 10769 --- [SchedulerThread] o.s.s.quartz.LocalDataSourceJobStore     : Couldn't rollback jdbc connection. This connection has been closed.

org.postgresql.util.PSQLException: This connection has been closed.
        at org.postgresql.jdbc2.AbstractJdbc2Connection.checkClosed(AbstractJdbc2Connection.java:822)
        at org.postgresql.jdbc2.AbstractJdbc2Connection.rollback(AbstractJdbc2Connection.java:839)
        at sun.reflect.GeneratedMethodAccessor580.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.tomcat.jdbc.pool.ProxyConnection.invoke(ProxyConnection.java:126)
        at org.apache.tomcat.jdbc.pool.JdbcInterceptor.invoke(JdbcInterceptor.java:108)
        at org.apache.tomcat.jdbc.pool.DisposableConnectionFacade.invoke(DisposableConnectionFacade.java:81)
        at com.sun.proxy.$Proxy106.rollback(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor580.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.quartz.impl.jdbcjobstore.AttributeRestoringConnectionInvocationHandler.invoke(AttributeRestoringConnectionInvocationHandler.java:73)
        at com.sun.proxy.$Proxy282.rollback(Unknown Source)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.rollbackConnection(JobStoreSupport.java:3666)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3825)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTriggers(JobStoreSupport.java:2756)
        at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:272)

2018-08-15 16:35:39.091  WARN 10769 --- [SchedulerThread] o.s.s.quartz.LocalDataSourceJobStore     : Failed to override connection auto commit/transaction isolation.

org.postgresql.util.PSQLException: This connection has been closed.
        at org.postgresql.jdbc2.AbstractJdbc2Connection.checkClosed(AbstractJdbc2Connection.java:822)
        at org.postgresql.jdbc2.AbstractJdbc2Connection.getAutoCommit(AbstractJdbc2Connection.java:788)
        at sun.reflect.GeneratedMethodAccessor136.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.tomcat.jdbc.pool.ProxyConnection.invoke(ProxyConnection.java:126)
        at org.apache.tomcat.jdbc.pool.JdbcInterceptor.invoke(JdbcInterceptor.java:108)
        at org.apache.tomcat.jdbc.pool.DisposableConnectionFacade.invoke(DisposableConnectionFacade.java:81)
        at com.sun.proxy.$Proxy106.getAutoCommit(Unknown Source)
        at org.quartz.impl.jdbcjobstore.AttributeRestoringConnectionInvocationHandler.setAutoCommit(AttributeRestoringConnectionInvocationHandler.java:90)
        at org.quartz.impl.jdbcjobstore.AttributeRestoringConnectionInvocationHandler.invoke(AttributeRestoringConnectionInvocationHandler.java:66)
        at com.sun.proxy.$Proxy282.setAutoCommit(Unknown Source)
        at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:191)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3800)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTriggers(JobStoreSupport.java:2756)
        at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:272)
Actions #13

Updated by Radek Tomiška over 5 years ago

  • Assignee changed from Radek Tomiška to Peter Štrunc

I tried to reproduce issue on postgresql and mssql. I was not able to reproduce the main issue - quartz stops logging correctly, when database is online again.

Peter, could you please check, if your issue description is correct or if I'm missing something (e.g. are you using some custom quartz setting with retry mechanism)? You was maybe confused, because quartz is used a lot - with LRT and event mechanism, which depends by setting - https://wiki.czechidm.com/devel/documentation/application_configuration/dev/backend#scheduler - but its configured to fire every second by default.

I'll try to find, if I'm able to "pause" executing quartz tasks somewhere in spring layer (using some validation query maybe), which we used for this.

Actions #14

Updated by Petr Fišer over 5 years ago

  • Assignee changed from Peter Štrunc to Petr Fišer
Actions #16

Updated by Petr Fišer over 5 years ago

  • Status changed from New to Resolved

Okay, solved...

Whole problem was a combination of two bugs:
  1. Misconfiguration in db reconnect.
  2. Bug in quartz 2.2.3 we were discussing in this ticket. (Drop-in replacement to 2.3.0 actually helped with frantic logging. IdM was able to work with 2.3.0.)

The main cause was the misconfiguration which, in turn, allowed the quartz bug to trigger itself.
In our config file, there was:

spring.datasource.tomcat.validationQuery=SELECT 1
spring.datasource.tomcat.test-on-borrow=true

which does not make sense, because correct properties are without the "tomcat" part like this:
spring.datasource.validationQuery=SELECT 1
spring.datasource.test-on-borrow=true

This caused the IdM to not check if db connection is alive and therefore IdM would fire quartz even if the db was unreachable. Quartz would then bug out, start spewing errors and fill up the disk.
Correcting reconnect solves everything for us. I was able to trace it after setting up a virtual lab. Previously I was thinking that "HTTP 500 Server Internal Error" comes as a result of the full disk - which turned out not to be true.

Misconfig of properites has been in the sources forever and was corrected on 2/1/2018 in commit 3c66159f4 (i took application-dev.properties as a reference). However, with the install guide on the wiki being much older and not up-to-date with this change, we deployed the bad configuration on the project test environment.

I corrected both CzechIdM install wiki pages for Linux and Windows. Also checked the sources and corrected application-dev-mysql.properties. This config is old and probably unused but we should not risk that someone will take it as an example config. Change is in commit 9f225e355 and part of the develop. It is not necessary to create any hotfix for this.

Actions #17

Updated by Radek Tomiška over 5 years ago

  • Status changed from Resolved to Closed
  • Target version set to Morganite (9.2.0)
  • % Done changed from 0 to 100
Actions #18

Updated by Radek Tomiška over 5 years ago

It's awesome, thx!
this makes sense, why i cannot reproduce the issue on on product. It's nice to know, the newer library works with our devstack and can be used.

Commit
https://github.com/bcvsolutions/CzechIdMng/commit/3c66159f4ade76871cd6ebbf3b06953510f5c62c#diff-41eb81f010b1f6f4291a60645c56bc55
what you referenced was about the reconnect feature implementation (see #365), it didn't work before. Could you please check install guide, if other obsolete configuration is used (i don't know which install guide you talking about and probably you are the author) - see https://wiki.czechidm.com/devel/documentation/application_configuration/dev/backend#jpa?

Mysql is not officially supported, but thx for fix it, i did review, it's ok.

Actions

Also available in: Atom PDF