I have a job that runs every 5 minutes. It generally runs pretty quickly, but will fail on occasion. I have found that the retries conflict with subsequent runs of the job, which increases the likelihood of further failures (most failures are due to deadlocks).
We enabled [DisableConcurrentExecution(1)] to stop this from happening, but I’m a little confused about how it works (I’ve dug through the documentation and forum and am still unsure). I’ll document my understanding of the flow, with Instance A being the first scheduled run of the job, B the next run 5 minutes after A, and C the run 10 minutes after A, assuming the job never takes more than a minute to run.
So, assuming the retry interval starts at one minute and doubles with each retry, we’d get something like this:
00:00 - Instance A - Attempt 1 - Fail
00:01 - Instance A - Attempt 2 - Fail
00:03 - Instance A - Attempt 3 - Fail
00:05 - Instance B - Attempt 1 - Fail
00:06 - Instance B - Attempt 2 - Fail
00:07 - Instance A - Attempt 4 - Fail
00:08 - Instance B - Attempt 3 - Fail
00:10 - Instance C - Attempt 1 - Fail
00:11 - Instance C - Attempt 2 - Fail
00:12 - Instance B - Attempt 4 - Fail
00:13 - Instance C - Attempt 3 - Fail
00:15 - Instance A - Attempt 5 - Fail
00:17 - Instance C - Attempt 4 - Fail
00:20 - Instance B - Attempt 5 - Fail
00:25 - Instance C - Attempt 5 - Fail
00:31 - Instance A - Attempt 6 - Fail
And so forth… I assume DisableConcurrentExecution wouldn’t achieve anything here, because none of these attempts were ever running at the same time. (Instance D would start at 00:15 and be forced to wait for Instance A, Attempt 5.)
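A quick way to sanity-check a schedule like this is to compute it. The sketch below is only a simulation of the assumption stated above (every attempt fails, the first retry comes after 1 minute, and the delay doubles each time). It is not Hangfire's actual retry schedule, and `retry_times` and `timeline` are names made up for the illustration.

```python
def retry_times(start_min, attempts=6, first_delay=1):
    """Minute offsets of each attempt for one instance,
    assuming the delay doubles after every failed attempt."""
    times = [start_min]
    delay = first_delay
    for _ in range(attempts - 1):
        times.append(times[-1] + delay)
        delay *= 2
    return times


def timeline(instances, horizon_min):
    """Merge the attempts of all instances into one schedule sorted by time.

    `instances` maps an instance name to its scheduled start minute.
    Attempts after `horizon_min` are left out.
    """
    events = []
    for name, start in instances.items():
        for attempt, minute in enumerate(retry_times(start), start=1):
            if minute <= horizon_min:
                events.append((minute, name, attempt))
    return sorted(events)


if __name__ == "__main__":
    # Instances A, B and C start 5 minutes apart, as in the question.
    for minute, name, attempt in timeline({"A": 0, "B": 5, "C": 10}, 31):
        print(f"00:{minute:02d} - Instance {name} - Attempt {attempt} - Fail")
```

Because each attempt is assumed to finish well within a minute, no two events in the merged schedule overlap in time, which is the reason a concurrency lock alone would not change the outcome.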
Am I correct in my understanding of this? What I think I’m looking for is: if Instance A has a pending retry, don’t bother queuing Instance B, C, etc.
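To make the desired behaviour concrete, here is a rough simulation of that policy: a scheduled run is skipped while an earlier instance still has retries outstanding. This is plain Python, not Hangfire code. It assumes every attempt fails, a 1-minute first delay that doubles each time, and 6 attempts in total. `schedule_with_skip` and its parameters are hypothetical names.

```python
def last_attempt_minute(start, attempts=6, first_delay=1):
    """Minute of the final attempt when the delay doubles after each failure.

    The delays sum to first_delay * (2**(attempts - 1) - 1),
    e.g. 1 + 2 + 4 + 8 + 16 = 31 for 6 attempts.
    """
    return start + first_delay * (2 ** (attempts - 1) - 1)


def schedule_with_skip(interval=5, horizon=40, attempts=6):
    """Walk the recurring schedule and skip any run that would start
    while the previously accepted instance still has a retry pending."""
    accepted, skipped = [], []
    busy_until = -1  # minute of the active instance's final attempt
    for start in range(0, horizon + 1, interval):
        if start <= busy_until:
            # A retry is still pending (or due this very minute): skip.
            skipped.append(start)
        else:
            accepted.append(start)
            busy_until = last_attempt_minute(start, attempts)
    return accepted, skipped


if __name__ == "__main__":
    accepted, skipped = schedule_with_skip()
    print("accepted runs:", accepted)
    print("skipped runs: ", skipped)
```

Under these assumptions, only the runs at minute 0 and minute 35 are accepted in the first 40 minutes, and everything in between is dropped rather than queued behind Instance A.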
The only way we got around this was to disable automatic retries ([AutomaticRetry(Attempts = 0)]). However, we use log4net to send notifications when there is an exception. This works well with automatic retry because we only receive a notification on the last failed attempt, but with Attempts = 0 every failure results in a notification, so I’d like to keep using the retry attribute.
Please let me know the best way to approach this problem (or whether I’ve completely misunderstood how DisableConcurrentExecution works).