Task #900
closed
Very slow synchronization of organizations structure
100%
Description
Tested on 7.5.3. The organization structure has ~16 000 elements and only 1 root (defined by a null parent).
The synchronization of the organization structure is very slow: more than 1 minute per organization.
When the "parent" is not defined, so that all organizations are roots, it takes less than 1 second per organization.
Could you please look at it?
Files
Updated by Alena Peterová almost 7 years ago
Sorry, only now in Redmine can I see my mistake: the Boolean values are interchanged. The root has a null parent; non-root elements have a non-null parent.
Strangely, the computing of parents works well :-)
Updated by Radek Tomiška almost 7 years ago
- Category changed from Tree structures to Synchronization
- Assignee changed from Radek Tomiška to Vít Švanda
Updated by Alena Peterová almost 7 years ago
- Subject changed from Very slow synchronization of organizations without groovy script to Very slow synchronization of organizations structure
- Description updated (diff)
OK, so the difference was really caused by my mistake (marking each element as a root). After I corrected the script, the synchronization is as slow as without the script.
So the aim of this ticket should be to check why the default sync of the structure of 16k organizations is so slow.
Updated by Vít Švanda almost 7 years ago
Do you have a more complex structure, or are all 16000 items directly under one root?
Can you attach the source export with the organizations?
Updated by Alena Peterová almost 7 years ago
- File vazby.csv.zip vazby.csv.zip added
The structure is complex, with several levels of organizations.
The compressed CSV is attached. I connected it with the CSV connector; the attribute CISLOSSMSPM is mapped to the code of the organization and the attribute MANAGER_CISLOSSMSPM to the parent code. Other attributes are not important.
The names of the organizations were taken from a different source, so the organizations already exist in IdM. I only need to synchronize the structure from this CSV. I linked the accounts first without setting the parent (because Update entity is not an option yet - #878) and then ran the synchronization for LINKED -> Update Entity with computing the parents.
Updated by Vít Švanda almost 7 years ago
- I simulated the problem, and the sync of a tree with 16000 accounts is really slow.
- The problem is in the transformation of the attribute value. This transformation is called for every account. In the tree sync, this search is evaluated again for every account (16000 * 16000 calls).
- I implemented a cache for the method "AbstractSynchronizationExecutor.getValueByMappedAttribute(AttributeMapping attribute, List<IcAttribute> icAttributes)".
- This cache does not support eviction by key.
- This cache is cleared at the start and at the end of every sync.
- Correct behavior assumes that every transformation on the attributes returns a static result. This means a transformation on an attribute cannot generate "random" values that do not depend on the input values. This assumption needs further discussion.
Commit in the develop branch: https://github.com/bcvsolutions/CzechIdMng/commit/b8d3f4b5bba80dd473b1f4a2c3df90e6e92aad06
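The caching idea described in this comment can be illustrated with a minimal sketch. It is not the code from the commit: the class `TransformationCacheSketch`, its `Key` type, and the method `getValue` are hypothetical names chosen for illustration, and the real method works with `AttributeMapping` and `IcAttribute` objects rather than plain strings. The sketch assumes that a transformation is deterministic for a given attribute and input values, and it only supports clearing the whole cache, not eviction by key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical sketch; none of these names come from CzechIdM. */
class TransformationCacheSketch {

    /** Cache key: which mapped attribute, applied to which input values. */
    static final class Key {
        private final String attributeName;
        private final List<String> inputValues;

        Key(String attributeName, List<String> inputValues) {
            this.attributeName = attributeName;
            this.inputValues = new ArrayList<>(inputValues); // defensive copy
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return attributeName.equals(other.attributeName)
                    && inputValues.equals(other.inputValues);
        }

        @Override
        public int hashCode() {
            return Objects.hash(attributeName, inputValues);
        }
    }

    private final Map<Key, Object> cache = new HashMap<>();
    private int transformationCalls = 0;

    /** Drops all entries; meant to be called at the start and end of a sync. */
    void clear() {
        cache.clear();
    }

    /** Returns the cached result, or runs the transformation once and stores it. */
    Object getValue(String attributeName, List<String> inputValues,
                    Function<List<String>, Object> transformation) {
        Key key = new Key(attributeName, inputValues);
        if (cache.containsKey(key)) { // containsKey also covers cached null results
            return cache.get(key);
        }
        transformationCalls++;
        Object value = transformation.apply(inputValues);
        cache.put(key, value);
        return value;
    }

    int getTransformationCalls() {
        return transformationCalls;
    }

    public static void main(String[] args) {
        TransformationCacheSketch cache = new TransformationCacheSketch();
        cache.clear(); // start of sync
        Function<List<String>, Object> toUpper = values -> values.get(0).toUpperCase();
        // 1000 accounts that refer to only 10 distinct parent codes
        for (int account = 0; account < 1000; account++) {
            String parentCode = "org" + (account % 10);
            cache.getValue("parent", Arrays.asList(parentCode), toUpper);
        }
        System.out.println(cache.getTransformationCalls()); // prints 10
        cache.clear(); // end of sync
    }
}
```

With the cache, the number of transformation calls grows with the number of distinct inputs rather than with the number of lookups, which is the point of the fix for the repeated evaluation described above.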
Updated by Alena Peterová almost 7 years ago
Vít Švanda wrote:
Correct behavior assumes that every transformation on the attributes returns a static result. This means a transformation on an attribute cannot generate "random" values that do not depend on the input values. This assumption needs further discussion.
Thanks.
This solution makes sense to me. I think that the only use case for "random" values could be generating some "ids" for later use in provisioning or so. But in such a case, the value of the attribute for one organization should be the same during the whole synchronization, so the cache is in fact correct.
Updated by Vít Švanda almost 7 years ago
- Status changed from In Progress to Needs feedback
- Assignee changed from Vít Švanda to Ondřej Kopr
- % Done changed from 0 to 80
- Caching is now used only in the sync.
- All existing attributes will be set to cached = false (for backward compatibility).
- Newly created attributes will be cached by default.
- Documentation is here: https://wiki.czechidm.com/devel/dev/system/system-mapping#attribute_cache.
Ondra, could you please do a review and create a test for this feature? You are the best for this job.
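The rules in the list above can be sketched as follows. This is only an illustration under assumptions: `AttributeCacheFlagSketch`, `MappedAttribute`, `migrated`, and the `insideSync` parameter are hypothetical names, not the CzechIdM API. The sketch assumes that the cache is consulted only when the call happens inside a sync and the attribute has `cached = true`, that newly created attributes default to `cached = true`, and that pre-existing attributes are migrated with `cached = false`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of the per-attribute "cached" flag; not the CzechIdM API. */
class AttributeCacheFlagSketch {

    static final class MappedAttribute {
        final String name;
        final boolean cached;

        /** Newly created attributes are cached by default. */
        MappedAttribute(String name) {
            this(name, true);
        }

        MappedAttribute(String name, boolean cached) {
            this.name = name;
            this.cached = cached;
        }

        /** Attributes that existed before the feature keep the old behavior. */
        static MappedAttribute migrated(String name) {
            return new MappedAttribute(name, false);
        }
    }

    private final Map<String, String> cache = new HashMap<>();

    String transform(MappedAttribute attribute, String input, boolean insideSync,
                     UnaryOperator<String> transformation) {
        // Outside of a sync, or for non-cached attributes, always recompute.
        if (!insideSync || !attribute.cached) {
            return transformation.apply(input);
        }
        String key = attribute.name + '\u0000' + input;
        if (!cache.containsKey(key)) {
            cache.put(key, transformation.apply(input));
        }
        return cache.get(key);
    }

    public static void main(String[] args) {
        AttributeCacheFlagSketch sketch = new AttributeCacheFlagSketch();
        int[] calls = {0};
        // A non-static transformation: the result changes with every call.
        UnaryOperator<String> counting = in -> in + "-" + (++calls[0]);

        MappedAttribute created = new MappedAttribute("parent");
        MappedAttribute existing = MappedAttribute.migrated("parent");

        System.out.println(sketch.transform(created, "A", true, counting));  // prints A-1
        System.out.println(sketch.transform(created, "A", true, counting));  // prints A-1
        System.out.println(sketch.transform(existing, "A", true, counting)); // prints A-2
        System.out.println(sketch.transform(existing, "A", true, counting)); // prints A-3
        System.out.println(sketch.transform(created, "A", false, counting)); // prints A-4
    }
}
```

In this sketch, turning the flag off is the escape hatch for a transformation that does not return a static result, at the cost of evaluating it again for every account.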
Updated by Ondřej Kopr almost 7 years ago
- Status changed from Needs feedback to Closed
- % Done changed from 80 to 100
Thank you for your review. There is a test: https://github.com/bcvsolutions/CzechIdMng/commit/c14c0b4847216c93d972ca2083c3b0268fbd2a4f Thank you for your help.