Feature #1690
Event queue performance enhancement
Description
Currently we use an event queue as a way to make some operations asynchronous.
A user or an LRT modifies some data in IdM. The data is persisted in the IdM repository and an event is created. This event is enqueued into the event queue.
One of the event queue processing threads (event-task-executor-X) takes the event out of the queue and synchronously performs all necessary operations.
In the current state of the implementation, we are moving away from asynchronous provisioning. In effect, the event-task-executor does all the work, including provisioning operations. In recent IdM versions, those operations are coupled in such a way that the event-task-executor provisions all necessary systems synchronously, one after another.
By default we can have at most 2*CPU count event-task-executors. On a typical 4-core machine this means 2*4 = 8, i.e. 8 threads performing operations which may each take many seconds to complete.
We can raise the number of event-task-executor threads to some high number, but this is only a workaround.
We should reevaluate the event queue philosophy and adjust the IdM internal architecture so that we remove the possible event-queue bottleneck.
- Bring back some sort of asynchronous provisioning.
- Make provisioning operations run in parallel, so the whole provisioning operation is capped only by the slowest system (see the sketch after this list).
- Think up some "intelligent event queue scheduling".
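To illustrate the second point, here is a minimal sketch of what per-system parallel provisioning could look like. All names here (ParallelProvisioningSketch, TargetSystem, provision) are hypothetical, not the real IdM API; the point is only that the calling thread waits as long as the slowest system instead of the sum of all systems.

import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelProvisioningSketch {

    // Hypothetical stand-in for one provisioning target (end system).
    public interface TargetSystem {
        void provision(); // synchronous connector call to one end system
    }

    // Pool size mirrors the current 2*CPU default only for illustration.
    private final ExecutorService provisioningPool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors() * 2);

    public void provision(List<TargetSystem> systems) {
        CompletableFuture<?>[] futures = systems.stream()
                .map(system -> CompletableFuture.runAsync(system::provision, provisioningPool))
                .toArray(CompletableFuture[]::new);
        // Fan out to all systems at once; block only until the slowest one finishes.
        CompletableFuture.allOf(futures).join();
    }
}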
Updated by Petr Fišer over 5 years ago
- Assignee set to Petr Fišer
Taking the ticket. I'm going to do some testing with a high number of processing threads compared to an ordinary number of threads (e.g. 4, 8, ...).
Updated by Petr Fišer over 5 years ago
So far, I have tested provisioning of 1000 accounts to an end system (via the SSH connector) with various speeds of system response (0.5 s, 1 s, 5.5 s), with both synchronous and asynchronous operations.
Raising the number of event executors (from 3 to 30) does not have any big effect on system performance. Importantly, it seems to have a very small positive effect, so we probably do not need to fear a performance hit.
The next step is to create a scenario with intensive attribute calculation in IdM to see how it behaves then.
Something like this transformation script should probably be enough:
// Busy-loop slowdown: repeated string concatenation keeps the CPU churning.
// approx. 9 s on my workstation
// approx. 6 s on the testing server
String megastr = "";
for (int i = 0; i < 10e6; i++) {
    megastr = megastr + i;
    // trim so the string (and memory use) stays bounded while the CPU burns
    if (megastr.length() > 1000) {
        megastr = megastr.substring(0, 999);
    }
}
// pass the original attribute value through unchanged
return attributeValue;
Updated by Petr Fišer over 5 years ago
This may have two causes (I don't know which one yet):
- During the tests, I mainly used a non-poolable connector. (This is my guess. But it is a problem, because right now most of our connectors are not poolable.)
- The JVM maps event executor threads to only a small number of native (in effect, kernel) threads. This would cause event executors to share CPU time no matter how many of them we create. (This may not be true, because the CPU time needed for transformation scripts does not affect processing times when the end system's response is very slow. However, the JVM could be making some clever decisions when rescheduling Java threads onto native threads... I guess this would need a bit more testing.)
I have planned the results presentation for Friday 6/9/2019.
Updated by Petr Fišer over 5 years ago
- Related to Task #1688: Diagnose suspected interlocking of event and task scheduler executor added
Updated by Petr Fišer over 5 years ago
The results are funny:
- When using the LDAP connector, pooling seemed to have (I don't know how, read on...) a positive effect. AccountMgmt and reprovisioning of 1000 users took 8-10 minutes without pooling and a stable 5 minutes or so with pooling.
- The number of event executor threads (3, 15, or 30) did not matter.
- When I added a transformation script which slows the provisioning computation by about 5 s, I got the same behavior as with the SSH connector. Whether pooling was enabled did not matter, and neither did the number of event executors.
To look at the threads competing over monitor objects, I took a series of thread dumps. In this case IdM had LDAP pooling enabled (min. 1 connection, max. 30 connections), 3 event executors, and no slowdown transformation script. The whole reprovisioning took about 5 minutes.
Case 1 (threads waiting on DefaultProvisioningExecutor; it looks like they compete for database access):
"event-task-executor-1" - Thread t@298 java.lang.Thread.State: BLOCKED at eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor.executeInternal(DefaultProvisioningExecutor.java:151) - waiting to lock <1bb9e35c> (a eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor) owned by "event-task-executor-2" t@299 at eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor$$FastClassBySpringCGLIB$$68607120.invoke(<generated>) at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:720) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99) at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:281) at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:655) at eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor$$EnhancerBySpringCGLIB$$e87aff4c.executeInternal(<generated>) "event-task-executor-2" - Thread t@299 java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:140) "event-task-executor-3" - Thread t@300 java.lang.Thread.State: BLOCKED at eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor.execute(DefaultProvisioningExecutor.java:108) - waiting to lock <1bb9e35c> (a eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor) owned by "event-task-executor-2" t@299 at eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor$$FastClassBySpringCGLIB$$68607120.invoke(<generated>) at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:720) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99) at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:281) at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:655) at eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor$$EnhancerBySpringCGLIB$$e87aff4c.execute(<generated>) at eu.bcvsolutions.idm.acc.service.impl.AbstractProvisioningExecutor.doProvisioning(AbstractProvisioningExecutor.java:741)
Case 2 (threads waiting on DefaultProvisioningExecutor, but because LDAP was being contacted - this probably should not happen, since we should be pooling):
"event-task-executor-3" - Thread t@300 java.lang.Thread.State: WAITING at java.lang.Object.wait(Native Method) - waiting on <48b541cc> (a com.sun.jndi.ldap.LdapRequest) at java.lang.Object.wait(Object.java:502) at com.sun.jndi.ldap.Connection.readReply(Connection.java:493) at com.sun.jndi.ldap.LdapClient.processReply(LdapClient.java:881) at com.sun.jndi.ldap.LdapClient.modify(LdapClient.java:955) at com.sun.jndi.ldap.LdapCtx.c_modifyAttributes(LdapCtx.java:1471) at com.sun.jndi.toolkit.ctx.ComponentDirContext.p_modifyAttributes(ComponentDirContext.java:277) at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.modifyAttributes(PartialCompositeDirContext.java:192) at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.modifyAttributes(PartialCompositeDirContext.java:181) at javax.naming.directory.InitialDirContext.modifyAttributes(InitialDirContext.java:167) at net.tirasa.connid.bundles.ldap.modify.LdapUpdate.modifyAttributes(LdapUpdate.java:327) at net.tirasa.connid.bundles.ldap.modify.LdapUpdate.modifyAttributes(LdapUpdate.java:321) at net.tirasa.connid.bundles.ldap.modify.LdapUpdate.update(LdapUpdate.java:131) at net.tirasa.connid.bundles.ldap.LdapConnector.update(LdapConnector.java:161) at org.identityconnectors.framework.impl.api.local.operations.UpdateImpl.update(UpdateImpl.java:101) at sun.reflect.GeneratedMethodAccessor1132.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.identityconnectors.framework.impl.api.local.operations.ConnectorAPIOperationRunnerProxy.invoke(ConnectorAPIOperationRunnerProxy.java:98) at com.sun.proxy.$Proxy365.update(Unknown Source) at sun.reflect.GeneratedMethodAccessor1132.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.identityconnectors.framework.impl.api.local.operations.ThreadClassLoaderManagerProxy.invoke(ThreadClassLoaderManagerProxy.java:96) at com.sun.proxy.$Proxy365.update(Unknown Source) at sun.reflect.GeneratedMethodAccessor1132.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.identityconnectors.framework.impl.api.DelegatingTimeoutProxy.invoke(DelegatingTimeoutProxy.java:99) at com.sun.proxy.$Proxy365.update(Unknown Source) at org.identityconnectors.framework.impl.api.AbstractConnectorFacade.update(AbstractConnectorFacade.java:190) at eu.bcvsolutions.idm.ic.connid.service.impl.ConnIdIcConnectorService.updateObject(ConnIdIcConnectorService.java:123) "event-task-executor-2" - Thread t@299 java.lang.Thread.State: BLOCKED at eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor.executeInternal(DefaultProvisioningExecutor.java:151) - waiting to lock <1bb9e35c> (a eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor) owned by "event-task-executor-3" t@300 at eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor$$FastClassBySpringCGLIB$$68607120.invoke(<generated>) "event-task-executor-1" - Thread t@298 java.lang.Thread.State: BLOCKED at eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor.execute(DefaultProvisioningExecutor.java:108) - waiting to lock <1bb9e35c> (a eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor) owned by "event-task-executor-3" t@300 at 
eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor$$FastClassBySpringCGLIB$$68607120.invoke(<generated>)
Case 3 (A-OK):
"event-task-executor-3" - Thread t@300 java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:140) "event-task-executor-2" - Thread t@299 java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:140) "event-task-executor-1" - Thread t@298 java.lang.Thread.State: RUNNABLE at java.io.DataOutputStream.writeByte(DataOutputStream.java:153) at org.identityconnectors.framework.impl.serializer.binary.BinaryObjectEncoder$InternalEncoder.writeByte(BinaryObjectEncoder.java:233) at org.identityconnectors.framework.impl.serializer.binary.BinaryObjectEncoder$InternalEncoder.startField(BinaryObjectEncoder.java:180) at org.identityconnectors.framework.impl.serializer.binary.BinaryObjectEncoder.writeStringField(BinaryObjectEncoder.java:445) at org.identityconnectors.framework.impl.serializer.APIConfigurationHandlers$3.serialize(APIConfigurationHandlers.java:138) at org.identityconnectors.framework.impl.serializer.binary.BinaryObjectEncoder$InternalEncoder.writeObject(BinaryObjectEncoder.java:125) at org.identityconnectors.framework.impl.serializer.binary.BinaryObjectEncoder.writeObjectContents(BinaryObjectEncoder.java:424) at org.identityconnectors.framework.impl.serializer.APIConfigurationHandlers$4.serialize(APIConfigurationHandlers.java:165) at org.identityconnectors.framework.impl.serializer.binary.BinaryObjectEncoder$InternalEncoder.writeObject(BinaryObjectEncoder.java:125) at org.identityconnectors.framework.impl.serializer.binary.BinaryObjectEncoder.writeObjectField(BinaryObjectEncoder.java:431) at org.identityconnectors.framework.impl.serializer.APIConfigurationHandlers$5.serialize(APIConfigurationHandlers.java:205) at org.identityconnectors.framework.impl.serializer.binary.BinaryObjectEncoder$InternalEncoder.writeObject(BinaryObjectEncoder.java:125) at org.identityconnectors.framework.impl.serializer.binary.BinaryObjectEncoder.writeObject(BinaryObjectEncoder.java:316) at org.identityconnectors.framework.common.serializer.SerializerUtil.serializeBinaryObject(SerializerUtil.java:57) at org.identityconnectors.framework.common.serializer.SerializerUtil.cloneObject(SerializerUtil.java:159) at org.identityconnectors.framework.impl.api.AbstractConnectorInfo.createDefaultAPIConfiguration(AbstractConnectorInfo.java:93) at eu.bcvsolutions.idm.ic.connid.service.impl.ConnIdIcConnectorService.getConnectorFacade(ConnIdIcConnectorService.java:291)
Case 4 (another wait for an LDAP reply, with another thread blocked in DefaultProvisioningExecutor):
"event-task-executor-3" - Thread t@300 java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:140) "event-task-executor-2" - Thread t@299 java.lang.Thread.State: WAITING at java.lang.Object.wait(Native Method) - waiting on <7e27dd00> (a com.sun.jndi.ldap.LdapRequest) at java.lang.Object.wait(Object.java:502) at com.sun.jndi.ldap.Connection.readReply(Connection.java:493) "event-task-executor-1" - Thread t@298 java.lang.Thread.State: BLOCKED at eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor.executeInternal(DefaultProvisioningExecutor.java:151) - waiting to lock <1bb9e35c> (a eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor) owned by "event-task-executor-2" t@299 at eu.bcvsolutions.idm.acc.service.impl.DefaultProvisioningExecutor$$FastClassBySpringCGLIB$$68607120.invoke(<generated>)
Next step: configure the slowdown transformation script and take thread dumps.
Updated by Petr Fišer over 5 years ago
When using the slowdown transformation script, we hit the synchronization bottleneck in the transformations. I took 6 thread dumps and they all look the same, like this:
[root@czechidm-pf ~]# grep -A1 -B1 "event-task-executor" threaddump-1568287310196.tdump
"event-task-executor-3" - Thread t@300
   java.lang.Thread.State: RUNNABLE
--
"event-task-executor-2" - Thread t@299
   java.lang.Thread.State: BLOCKED
    at eu.bcvsolutions.idm.core.model.service.impl.DefaultGroovyScriptService.evaluate(DefaultGroovyScriptService.java:77)
    - waiting to lock <703809c5> (a Script1) owned by "event-task-executor-3" t@300
    at eu.bcvsolutions.idm.acc.service.impl.DefaultSysSystemAttributeMappingService.transformValueToResource(DefaultSysSystemAttributeMappingService.java:200)
--
"event-task-executor-1" - Thread t@298
   java.lang.Thread.State: BLOCKED
    at eu.bcvsolutions.idm.core.model.service.impl.DefaultGroovyScriptService.evaluate(DefaultGroovyScriptService.java:77)
    - waiting to lock <703809c5> (a Script1) owned by "event-task-executor-3" t@300
    at eu.bcvsolutions.idm.acc.service.impl.DefaultSysSystemAttributeMappingService.transformValueToResource(DefaultSysSystemAttributeMappingService.java:200)
The full thread dump is attached. So far, I guess we have identified three possible bottlenecks:
- Database access to the repository. Probable entry point: DefaultProvisioningExecutor.java:151, the synchronized method executeInternal(...).
- Connector object access or I/O on the end system. Probable entry point: DefaultProvisioningExecutor.java:151, the synchronized method executeInternal(...). (See the locking sketch after this list.)
- Accessing Groovy scripts when performing data transformations. Probable entry point: DefaultGroovyScriptService.java:77, synchronization on the script object scriptObj.
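The first two bottlenecks share the same shape. As a minimal sketch (hypothetical class and method bodies, not the real DefaultProvisioningExecutor), the pattern visible in the dumps is one shared instance whose synchronized method holds the intrinsic lock across both the repository round trip and the end-system I/O:

public class ProvisioningExecutorSketch {

    // All event-task-executor threads call into one shared instance.
    public void execute(Runnable connectorCall) {
        executeInternal(connectorCall);
    }

    // The intrinsic lock is held for the whole duration of the slow I/O,
    // so only one provisioning operation runs at a time; every other
    // caller parks as BLOCKED, exactly as in the dumps above.
    private synchronized void executeInternal(Runnable connectorCall) {
        persistOperation();   // database access under the lock (bottleneck 1)
        connectorCall.run();  // connector / end-system I/O under the lock (bottleneck 2)
    }

    private void persistOperation() {
        // repository write, stubbed out for the sketch
    }
}

As long as the lock covers the slow I/O, adding more event executors cannot help, which matches the measurements above.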
The Groovy script bottleneck is probably the most dire one, because we use scripts everywhere and data transformations in IdM can sometimes be very slow. But this is opinion-based. A sketch of one possible mitigation follows.
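Assuming the contention really is the one shared compiled Script instance (the <703809c5> monitor in the dump), one possible mitigation would be to keep caching the compiled class but create a fresh Script instance per evaluation, since a Script instance carries its own Binding and is not thread-safe. This is an illustration only, not the actual DefaultGroovyScriptService fix:

import groovy.lang.Binding;
import groovy.lang.GroovyShell;
import groovy.lang.Script;
import org.codehaus.groovy.runtime.InvokerHelper;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PerCallGroovyEvaluator {

    // Compiled script classes, keyed by script text; compilation happens once.
    private final ConcurrentHashMap<String, Class<? extends Script>> cache =
            new ConcurrentHashMap<>();

    public Object evaluate(String scriptText, Map<String, Object> variables) {
        Class<? extends Script> scriptClass = cache.computeIfAbsent(
                scriptText, text -> new GroovyShell().parse(text).getClass());
        // A fresh Script instance per call: threads no longer contend on
        // one shared Script object's monitor.
        Script script = InvokerHelper.createScript(scriptClass, new Binding(variables));
        return script.run();
    }
}

Whether this is directly applicable to DefaultGroovyScriptService needs checking; the current code may synchronize for other reasons.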
I guess we should at least have a conference call about this.
Updated by Petr Fišer about 5 years ago
As we briefly discussed in person, here is the conclusion. I still have to consult it with Zdenda and the product team once more.
The proposed values are my guess at what would make sense; any feedback is appreciated.
- Implement a rejection policy for the LRT queue (#1837).
- Default the minimum number of base-task-executor threads to #CPU*2.
  - This means having up to 16 LRT threads by default (on an 8-core system).
- Set queueCapacity to #CPU*2.
  - This causes the LRT thread pool to grow once there is enough work.
- Default the maximum number of base-task-executor threads to 50.
  - When there is so much work (on, say, a 4-core or 8-core system) that we need 50 LRT threads, the CPU time one thread gets is way under 20%, so things will be really slow by that point. Other LRTs can simply wait.
- Resolve the suspected thread interlocking as per #1690#note-5 and #1690#note-6.
- Default the minimum number of event-task-executor threads to #CPU*2.
  - So that, on average, one LRT maps to one event thread. Some LRTs do not create events, so this gives us a relatively safe margin.
- Default the maximum number of event-task-executor threads to #CPU*4.
- Set scheduler.event.executor.queueCapacity = 25.
  - Currently we have this set to 50, but idm.sec.core.event.asynchronous.batchSize is set to 15. Setting both values to 25 seems like a good middle ground.
  - This is a blind guess. If you have any data to support this decision, I'll be glad.
- Set scheduler.event.executor.queueCapacity = idm.sec.core.event.asynchronous.batchSize, which means that once there is enough work, the pool starts to grow (see the sketch below).
  - We have to do this (or make some implementation change), otherwise growing the executor pool does not work.
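For context on the last two points, this is how the proposed values would map onto a Spring ThreadPoolTaskExecutor (a sketch with assumed wiring; the actual property binding in IdM may differ). The key semantics: the pool only adds threads beyond the core size once its queue is full, which is why queueCapacity has to stay at or below the asynchronous batch size for the pool to enlarge at all.

import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

public class EventExecutorConfigSketch {

    public static ThreadPoolTaskExecutor eventExecutor() {
        int cpus = Runtime.getRuntime().availableProcessors();
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(cpus * 2);  // proposed minimum: #CPU*2
        executor.setMaxPoolSize(cpus * 4);   // proposed maximum: #CPU*4
        // The pool grows from core to max size only after this queue fills
        // up, hence keeping queueCapacity at (or below) the batch size.
        executor.setQueueCapacity(25);
        executor.setThreadNamePrefix("event-task-executor-");
        executor.initialize();
        return executor;
    }
}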