Defect #1765
closedDeleteExecutedEventTaskExecutor causes OutOfMemory
100%
Description
Version 9.6.3, 9.7.0-SNAPSHOT
When the DeleteExecutedEventTaskExecutor starts, it runs some minutes and then causes OutOfMemory on java heap. The application doesn't respond. The only solution is to restart Tomcat.
There are:
352739 Events
339623 Executed events
303578 Executed events older than 3 days
- First there is OutOfMemory after night task (the start of the task was at 2:00 and it is not in the logfile, because log is rotated).
- Then I restarted Tomcat and run DeleteExecutedEventTaskExecutor directly after restart of IdM.
2019-07-30 08:03:10.229 WARN 20536 --- [task-executor-3] .c.s.t.i.DeleteExecutedEventTaskExecutor : Start deleting executed events older than [3] days.
Java heap (6 GB) was soon filled, only full garbage collector run with no success:[root@idmtest logs]# /usr/lib/jvm/java-openjdk/bin/jstat -gc 20536 2000 S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT 698880.0 698880.0 0.0 0.0 699392.0 690725.8 4194304.0 4194060.3 150040.0 143199.9 16424.0 15112.1 30 4.500 21 25.198 29.698 698880.0 698880.0 0.0 0.0 699392.0 699392.0 4194304.0 4194071.4 150040.0 143199.9 16424.0 15112.1 30 4.500 22 25.198 29.698 698880.0 698880.0 0.0 0.0 699392.0 699392.0 4194304.0 4194065.0 150040.0 143219.4 16424.0 15114.8 30 4.500 23 27.648 32.148 698880.0 698880.0 0.0 0.0 699392.0 699392.0 4194304.0 4194056.0 150040.0 143219.4 16424.0 15114.8 30 4.500 25 30.411 34.911 698880.0 698880.0 0.0 0.0 699392.0 699392.0 4194304.0 4194090.1 150040.0 143219.4 16424.0 15114.8 30 4.500 27 32.609 37.109
I killed tomcat after cca 5 minutes.
Files
Related issues
Updated by Radek Tomiška over 5 years ago
- Related to Task #1668: Entity event queue - create LRT for remove executed events. added
Updated by Radek Tomiška over 5 years ago
- Status changed from New to In Progress
- Priority changed from Urgent to Normal
- % Done changed from 0 to 10
- Estimated time set to 4.00 h
Thx for feedback from develop, the issue is connected with the stateful task mechanism (black magic), when whole history is loaded at the start of task (the first issue) and all processed item ids are stored in list (the second issue).
Workaround one: if LRT was scheduled before, then try to reschedule it (drop history - fix the first issue)
Workaround two : red button on entity events.
Note: How long do you have delete processed events turned off :) ?
Updated by Alena Peterová over 5 years ago
- Description updated (diff)
Radek Tomiška wrote:
Thx for feedback from develop, the issue is connected with the stateful task mechanism (black magic), when whole history is loaded at the start of task (the first issue) and all processed item ids are stored in list (the second issue).
Statefullness of this task is maybe useless - I don't see any reason to know, when was some event deleted...
BTW there are 131709 events in the queue of this task. This number only increases, if I understand it correctly? So probably the same problem will happen sooner or later on all projects running on 9.6.3...
Workaround one: if LRT was scheduled before, then try to reschedule it (drop history - fix the first issue)
Workaround two : red button on entity events.
Note: How long do you have delete processed events turned off :) ?
It wasn't turned off :-( Almost all of the 300 000 events were generated during one day. It was caused by reconcilation of time slices + clear dirty flags (it run several times).
Updated by Radek Tomiška over 5 years ago
- Status changed from In Progress to Needs feedback
- Assignee changed from Radek Tomiška to Vít Švanda
- % Done changed from 10 to 90
I've added new property to schedulable statefull task to ignore queue.
Commit:
https://github.com/bcvsolutions/CzechIdMng/commit/ec45c14f7483b6ae3d35cb2bb76744383e4a5307
Tested on 300k events.
Could you do a feedback, please?
Note: I wanted to have log, which events were deleted (=>event id is in app log), but if it's useless (just count is enough) then could be redesigned in next version (e.g. to use repository with hql).
Note: I'm not sure about the history only can lead to out of memory issue, but if this issue occurs on 9.6.3 version, then we can release hotfix (or simple workaround one can be used).
Updated by Vít Švanda over 5 years ago
- Status changed from Needs feedback to Resolved
- Assignee changed from Vít Švanda to Radek Tomiška
- % Done changed from 90 to 100
I did review and test. Works correctly.
Updated by Radek Tomiška over 5 years ago
- Status changed from Resolved to Closed