Defect #2686
closed
Synchronization of identities and contracts sometimes left Waiting tasks (-> next synchronization failed to start HR processes)
100%
Description
Tested on 10.7.2; it also happened on 10.7.0.
After a synchronization of identities and contracts that created 1 new identity with some automatic roles, some of the tasks (HrEnableContractProcess, HrEndContractProcess, or ProcessAllAutomaticRoleByAttributeTaskExecutor) stayed "Waiting" in the Scheduler. The next synchronization was not able to start them and therefore failed.
My test data are the same as in Squash (or #2636), except that the synchronizations are set to "Reconciliation". The synchronization of contracts is scheduled as dependent on the synchronization of identities. I start this synchronization task manually.
The problematic situation happened 4x in a row, with slightly different combinations of Waiting tasks - see screenshots. Then I tested it 3x with 2 identities in the source data and everything was correct. Then I removed one of them and tested again 4x in a row - now everything was correct.
Details:
- The 1st synchronization created a new identity. The synchronization, HrEnableContractProcess and HrEndContractProcess are Waiting.
- Cancel tasks, remove the identity, run synchronizations again. Now ProcessAllAutomaticRoleByAttributeTaskExecutor is Waiting and the synchronization of contracts fails, because it cannot start that task again.
- Cancel tasks, remove the identity, run synchronizations again. (ProcessSkippedAutomaticRoleByTreeForContract processes more flags, probably because it didn't start in the previous run.)
- Cancel tasks, remove the identity, run synchronizations again. Now the result was the same as in the 1st case.
- Make some more tests with 2 identities: no problem.
- Return to the original data, run the synchronization: now all is green.
Updated by Alena Peterová almost 4 years ago
I have a snapshot of the virtual server made after the 1st run, if needed.
Updated by Radek Tomiška almost 4 years ago
- Status changed from New to Needs feedback
- Assignee changed from Vít Švanda to Alena Peterová
I think the source of the issue is related to ProcessAllAutomaticRoleByAttributeTaskExecutor, which runs in each synchronization; it should run in the second synchronization only.
Could you please test it without this task in the synchronization of identities?
If that is the case, I can look at what can be improved in dependent task execution, because if the synchronization of contracts is scheduled as dependent on the synchronization of identities, the first ProcessAllAutomaticRoleByAttributeTaskExecutor is executed "in the middle" (~ the synchronization of contracts runs at the same time as ProcessAllAutomaticRoleByAttributeTaskExecutor).
Updated by Alena Peterová almost 4 years ago
- File without_automatic_roles_after_identities.png added
- File events.png added
- Assignee changed from Alena Peterová to Radek Tomiška
Unfortunately, it didn't help. I got a similar situation on the first attempt:
Here are events on the identity.
Updated by Radek Tomiška almost 4 years ago
- Status changed from Needs feedback to In Progress
Updated by Alena Peterová almost 4 years ago
Note: I realized that I did all testing when the server had only 1 CPU.
Task executor is initialized: corePoolSize [1], maxPoolSize [2], queueCapacity [20]
Event executor is initialized: corePoolSize [2], maxPoolSize [4], queueCapacity [50]
So I tried adding a 2nd CPU, but the issue still happens sometimes.
Also, I got one more variation of the issue - the only "Waiting" task was the SynchronizationSchedulableTaskExecutor of contracts, all the others were Executed.
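For readers unfamiliar with the three numbers in the "Task executor is initialized" line, here is a minimal sketch of how a plain JDK ThreadPoolExecutor treats corePoolSize 1, maxPoolSize 2 and queueCapacity 20. It is an assumption that the CzechIdM task executor follows the same rules; the class and method names (PoolSizingSketch, observe) are hypothetical and only illustrate the sizing, not the cause of this defect.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizingSketch {

    /** Submits blocked tasks and returns {threads in the pool, tasks waiting in the queue}. */
    public static int[] observe(int submitted) {
        // Sizes copied from the log line above; a plain JDK pool is an assumption.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 2, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(20));
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < submitted; i++) {
                executor.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until we have measured
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return new int[] { executor.getPoolSize(), executor.getQueue().size() };
        } finally {
            release.countDown(); // let all tasks finish
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] few = observe(3);
        System.out.println("3 tasks:  threads=" + few[0] + ", queued=" + few[1]);
        int[] many = observe(22);
        System.out.println("22 tasks: threads=" + many[0] + ", queued=" + many[1]);
    }
}
```

With 3 tasks, one runs on the single core thread and 2 wait in the queue; the second thread is only created once the queue of 20 is full (the 22nd task here).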
Updated by Radek Tomiška almost 4 years ago
Nice, thx! I have a clue now; until now I wasn't able to reproduce it in my environment.
Updated by Radek Tomiška almost 4 years ago
- Related to Task #2444: Implement waiting for the completion of the LRT after all asynchronous events added
Updated by Radek Tomiška almost 4 years ago
- Status changed from In Progress to Needs feedback
- Assignee changed from Radek Tomiška to Vít Švanda
- Target version set to 10.8.0
- % Done changed from 0 to 90
- Affected versions 10.6.0, 10.6.1, 10.6.2, 10.6.3, 10.6.4, 10.7.1, 10.6.5, 10.6.6 added
The issue is related to event processing - when events from the synchronization are processed too quickly (~ before all long running tasks are saved into the queue), tasks are left in the waiting state. The number of tasks left in the waiting state depends on how quickly the events were processed => tasks already saved into the queue are left in the waiting state, the new ones are marked as executed correctly.
This issue occurs mainly when the synchronization "does almost nothing" => it creates only a few events in the queue + the HR processes take a long time to process.
Commit:
https://github.com/bcvsolutions/CzechIdMng/commit/ea1de47f0e5a4a954a6813d8e39b4e5aab8cae5e
Could you give me feedback, please?
Updated by Vít Švanda almost 4 years ago
- Status changed from Needs feedback to Resolved
- Assignee changed from Vít Švanda to Radek Tomiška
- % Done changed from 90 to 100
I reviewed and tested it. It was a great brain exercise for me. I am no longer able to reproduce the issue.
To clarify: The solution is the new event IdmLongRunningTask.START, which is now created at the start of an LRT (it was not there before). Thus, each LRT now creates at least one event that does not end before the LRT itself.
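The idea of a start event that stays open for the whole life of the LRT can be sketched as a simple open-event counter. This is only a simplified model under stated assumptions, not the code from the commit: the Tracker class, the event names "event-1"/"event-2" and the rule "the task is marked finished when no event is open" are hypothetical; only the START event name comes from this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class StartEventSketch {

    /** Hypothetical tracker: the task is marked finished whenever no event is open. */
    static final class Tracker {
        private int openEvents = 0;
        final List<String> log = new ArrayList<>();

        void open(String event) {
            openEvents++;
            log.add("open " + event);
        }

        void close(String event) {
            openEvents--;
            log.add("close " + event);
            if (openEvents == 0) {
                log.add("FINISHED");
            }
        }
    }

    /**
     * Replays one task that creates two events, where the first event is
     * processed before the second one is created. Returns how many times
     * the task was marked finished.
     */
    public static int finishedSignals(boolean withStartEvent) {
        Tracker tracker = new Tracker();
        if (withStartEvent) {
            tracker.open("START");   // created at the start of the task
        }
        tracker.open("event-1");
        tracker.close("event-1");    // processed "too quickly"
        tracker.open("event-2");     // the task is still producing work
        tracker.close("event-2");
        if (withStartEvent) {
            tracker.close("START");  // ends only when the task itself ends
        }
        int signals = 0;
        for (String entry : tracker.log) {
            if (entry.equals("FINISHED")) {
                signals++;
            }
        }
        return signals;
    }

    public static void main(String[] args) {
        System.out.println("without START: " + finishedSignals(false)); // 2, the first one is premature
        System.out.println("with START:    " + finishedSignals(true));  // 1, only at the very end
    }
}
```

Without the start event the counter drops to zero as soon as the first event is processed, so the model reports completion while the task is still running. With the start event the counter cannot reach zero before the task ends.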
Updated by Radek Tomiška almost 4 years ago
- Status changed from Resolved to Closed
Updated by Radek Tomiška over 3 years ago
- Related to Defect #2743: Event: Start event remains in running state, when long running task ends with exception. added