Defect #1303

closed

Race condition in provisioning queue/executor

Added by Alena Peterová over 5 years ago. Updated over 5 years ago.

Status: Closed
Priority: Urgent
Assignee: Radek Tomiška
Category: Provisioning
Target version:
Start date: 10/09/2018
Due date:
% Done: 100%
Estimated time:
Affected versions:
Owner:

Description

Version 8.1.7

If several asynchronous provisioning operations are created in quick succession, one of them sometimes stays in the provisioning queue as "Not executed" and is never executed. This operation also blocks all future provisioning operations. It must be retried manually; no automatic process handles it.

Situation:
  • The system "IDB export" has asynchronous provisioning.
  • The task ProvisioningQueueTaskExecutor is scheduled every 10 seconds.
  • I change the value of an EAV and then quickly change it back.
  • Two provisioning operations for the same identity are created after the EAV is saved.
  • One of the provisioning operations is executed correctly; the second stays in the queue as "Not executed - Account has some unfinished provisioning tasks in queue".

The problem is probably a race condition: ProvisioningQueueTaskExecutor runs at the very moment the provisioning operations are created. They are in the same batch, but the executor processes only the first operation. Please see the screenshots and note the identical timestamps on ProvisioningQueueTaskExecutor and the provisioning operations.

In our (slow) environment, this race condition occurs in approximately 1 out of 3 attempts.
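
To make the reported symptom easier to reason about, here is a minimal Python sketch that replays one unlucky interleaving deterministically. It is not the product's code: the `Operation` class, the `submit` and `executor_run` functions, and the state names are hypothetical, and the rule that a scheduled run only picks up operations still in the created state is an assumption inferred from the report ("no automatic process handles it").

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Operation:
    name: str
    account: str
    state: str = "CREATED"  # hypothetical states: CREATED, EXECUTED, NOT_EXECUTED


def submit(queue: List[Operation], op: Operation) -> None:
    """Enqueue op; park it as NOT_EXECUTED if the account has unfinished work."""
    if any(o.account == op.account and o.state != "EXECUTED" for o in queue):
        # Mirrors the message "Account has some unfinished provisioning tasks in queue".
        op.state = "NOT_EXECUTED"
    queue.append(op)


def executor_run(queue: List[Operation]) -> None:
    """One scheduled run; assumed to pick up only operations in CREATED state."""
    for op in queue:
        if op.state == "CREATED":
            op.state = "EXECUTED"


def replay_stuck() -> List[str]:
    queue: List[Operation] = []
    submit(queue, Operation("change EAV", "identity-1"))
    submit(queue, Operation("revert EAV", "identity-1"))  # arrives before the first finishes
    for _ in range(3):  # three scheduled runs, e.g. 10 seconds apart
        executor_run(queue)
    submit(queue, Operation("later change", "identity-1"))  # blocked by the stuck one
    executor_run(queue)
    return [op.state for op in queue]


print(replay_stuck())
```

In this model the second operation never leaves the parked state, and every later operation for the same account is parked behind it, which matches the blocking behaviour described above.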


Files

EntityEvents.png (20.3 KB), Alena Peterová, 10/09/2018 08:05 AM
ProvisioningArchive.png (16.5 KB), Alena Peterová, 10/09/2018 08:05 AM
Provisioning_ActiveOperations.png (19.1 KB), Alena Peterová, 10/09/2018 08:05 AM
ProvisioningQueueTaskExecutor.png (38 KB), Alena Peterová, 10/09/2018 08:05 AM
ProvisioningQueueTaskExecutor_items.png (41.2 KB), Alena Peterová, 10/09/2018 08:05 AM
prov.png (90.2 KB), Ondřej Kopr, 10/15/2018 11:11 AM
Actions #1

Updated by Alena Peterová over 5 years ago

  • Description updated (diff)
Actions #2

Updated by Radek Tomiška over 5 years ago

  • Assignee changed from Vít Švanda to Radek Tomiška

Yes, this situation can occur, and the solution will not be easy. I'll try to do some analysis.

Actions #4

Updated by Marcel Poul over 5 years ago

  • Priority changed from Normal to Urgent
Actions #6

Updated by Alena Peterová over 5 years ago

We can switch to synchronous provisioning as a temporary solution; with that it works. Thank you.

Actions #7

Updated by Radek Tomiška over 5 years ago

  • Target version set to Morganite (9.2.1)
  • % Done changed from 0 to 70

I found a way to fix it ("elegantly and quickly"). It's based on the retry mechanism: when an operation is already in the queue and running, the next operation is put into the queue with the time for the retry mechanism filled in, so the operations get executed.

I'll add the fix to the newly created 9.2.1; I only need to add tests.

Note: the workaround with asynchronous events and synchronous provisioning (=> the logged-in user is not blocked by a slow operation) can be used on older versions.
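
The retry-based idea from this comment can be pictured with a small Python sketch. It only illustrates the principle and is not the actual change: `Operation`, `submit`, `executor_run`, the `next_attempt` field and the `RETRY_DELAY` value are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

RETRY_DELAY = 10  # seconds; hypothetical, the ticket does not state the real delay


@dataclass
class Operation:
    name: str
    account: str
    state: str = "CREATED"              # CREATED, RUNNING, EXECUTED or NOT_EXECUTED
    next_attempt: Optional[int] = None  # time at which the retry mechanism tries again


def submit(queue: List[Operation], op: Operation, now: int) -> None:
    """Enqueue op; if the account has unfinished work, schedule a retry for it."""
    if any(o.account == op.account and o.state != "EXECUTED" for o in queue):
        op.state = "NOT_EXECUTED"
        op.next_attempt = now + RETRY_DELAY  # the filled-in retry time
    queue.append(op)


def executor_run(queue: List[Operation], now: int) -> None:
    """One scheduled run: finish pending work, then retry operations that are due."""
    for op in queue:  # queue order, so older operations go first
        retry_due = op.next_attempt is not None and op.next_attempt <= now
        if op.state in ("CREATED", "RUNNING") or (op.state == "NOT_EXECUTED" and retry_due):
            op.state = "EXECUTED"


def replay_with_retry() -> List[List[str]]:
    first = Operation("change EAV", "identity-1", state="RUNNING")  # already picked up
    queue = [first]
    submit(queue, Operation("revert EAV", "identity-1"), now=0)
    history = []
    for now in (5, 15):  # two later scheduled runs
        executor_run(queue, now)
        history.append([op.state for op in queue])
    return history


print(replay_with_retry())
```

In this sketch the second operation is still waiting after the run at 5 seconds, because its retry time has not been reached, and it is executed by the run at 15 seconds without any manual intervention.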

Actions #8

Updated by Radek Tomiška over 5 years ago

  • Status changed from New to Needs feedback
  • % Done changed from 70 to 90
Actions #9

Updated by Radek Tomiška over 5 years ago

  • Assignee changed from Radek Tomiška to Ondřej Kopr
Actions #10

Updated by Ondřej Kopr over 5 years ago

  • File prov.png added
  • Status changed from Needs feedback to Resolved
  • Assignee changed from Ondřej Kopr to Radek Tomiška
  • % Done changed from 90 to 100

Works as I expect: the newly created provisioning operation has its next execution time set (the first provisioning operation is in the running state). Retrying provisioning didn't break the operation and works perfectly.

Actions #11

Updated by Radek Tomiška over 5 years ago

  • Status changed from Resolved to Closed