How to prevent a recurring job from running if the previous instance is pending a retry

I have a job that runs every 5 minutes. It generally runs quickly, but fails on occasion. I have found that the retries conflict with subsequent runs of the job, which increases the likelihood of further failures (most failures are due to deadlock).

We enabled [DisableConcurrentExecution(1)] to stop this from happening, but I'm a little confused about how it works (I've dug through the documentation and the forum and am still a bit confused). I'll document my understanding of the flow below, with Instance A being the first scheduled run of the job, B the next run 5 minutes after A, C the run 10 minutes after A, and with the assumption that the job itself never takes more than a minute to run.
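
For context, here's a simplified sketch of how this is wired up (the class, method, and job names below are placeholders, not our actual code):

    using Hangfire;

    public class SyncService
    {
        // Intended to stop overlapping runs; the argument is the lock
        // timeout in seconds.
        [DisableConcurrentExecution(timeoutInSeconds: 1)]
        public void RunJob(string runParam, IJobCancellationToken cancellationToken)
        {
            // ... work that occasionally hits a deadlock ...
        }
    }

    public static class JobSetup
    {
        public static void Register()
        {
            // Runs every 5 minutes.
            RecurringJob.AddOrUpdate<SyncService>(
                "sync-job",
                s => s.RunJob("param", JobCancellationToken.Null),
                "*/5 * * * *");
        }
    }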

So, with the assumption that the retry interval doubles with each retry, we'd get something like this:
00:00 - Instance A - Attempt 1 - Fail
00:01 - Instance A - Attempt 2 - Fail
00:03 - Instance A - Attempt 3 - Fail
00:05 - Instance B - Attempt 1 - Fail
00:06 - Instance B - Attempt 2 - Fail
00:07 - Instance A - Attempt 4 - Fail
00:08 - Instance B - Attempt 3 - Fail
00:10 - Instance C - Attempt 1 - Fail
00:11 - Instance C - Attempt 2 - Fail
00:12 - Instance B - Attempt 4 - Fail
00:13 - Instance C - Attempt 3 - Fail
00:15 - Instance A - Attempt 5 - Fail
00:17 - Instance C - Attempt 4 - Fail
00:20 - Instance B - Attempt 5 - Fail
00:31 - Instance A - Attempt 6 - Fail

And so forth. I assume DisableConcurrentExecution wouldn't achieve anything here, because none of these attempts were ever running at the same time (Instance D would start at 00:15, and that one would be forced to wait for Instance A's Attempt 5).
Am I correct in my understanding of this? What I think I'm looking for is: if Instance A has a pending retry, don't bother queuing Instance B, C, etc.

The only way we got around this was to stop the auto retry ([AutomaticRetry(Attempts = 0)]). However, we leverage log4net for sending notifications when there is an exception. That works well with automatic retry because we only receive a notification on the last failed attempt, but with retries set to 0 any failure results in a notification, so I'd like to keep using the retry attribute.
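
For reference, the workaround currently in place is just this attribute on the job method (class and method names are placeholders again):

    using Hangfire;

    public class SyncService
    {
        // Attempts = 0 disables retries entirely, so every single failure
        // goes straight to the Failed state and triggers a notification.
        [AutomaticRetry(Attempts = 0)]
        public void RunJob(string runParam, IJobCancellationToken cancellationToken)
        {
            // ... work ...
        }
    }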

Please let me know the best way to approach this problem (or whether I've completely misunderstood how DisableConcurrentExecution works).

Here's evidence of what I suspected: even with DisableConcurrentExecution, a recurring job will turn into many instances very quickly.

I think there are more elegant ways to do this with an attribute, but you could wrap your task in something like this:

    using System.Linq;
    using Hangfire;
    using Hangfire.Storage; // for the GetRecurringJobs() extension method

    public void RunJobSkipIfRunning(string recurringName, string runParam, IJobCancellationToken cancellationToken)
    {
        using (var connection = JobStorage.Current.GetConnection())
        {
            var job = connection.GetRecurringJobs().FirstOrDefault(j => j.Id == recurringName);

            // Skip this run entirely if the last run of the recurring job is
            // still waiting in the queue or currently executing.
            if (job != null && (job.LastJobState == "Enqueued" || job.LastJobState == "Processing"))
            {
                return;
            }
        }

        RunJob(runParam, cancellationToken);
    }

This way no additional work piles up if the previous run hasn't completed: the extra runs still get queued, but they complete immediately without doing anything.
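
For example (assuming the wrapper lives on the same class as RunJob, here called SyncService as a placeholder), you'd register the wrapper itself as the recurring job and hand it its own recurring job id:

    // The wrapper looks itself up by id, so pass it the same id used to register it.
    RecurringJob.AddOrUpdate<SyncService>(
        "sync-job",
        s => s.RunJobSkipIfRunning("sync-job", "param", JobCancellationToken.Null),
        "*/5 * * * *");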

It sounds like what you may want to happen is for your retries to not continue once the next scheduled run occurs. You could accomplish that with a similar wrapper that requires you to pass DateTime.Now, and then skips running the retry if the time delta is greater than a timeout you set.
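
A minimal sketch of that second wrapper, assuming a parameter (here called enqueuedAt) that carries the time this particular run was created and therefore stays the same across retry attempts; the method name and the 5 minute cutoff are placeholders:

    public void RunJobSkipIfStale(DateTime enqueuedAt, string runParam, IJobCancellationToken cancellationToken)
    {
        // Placeholder cutoff matching the 5 minute recurrence interval.
        var maxAge = TimeSpan.FromMinutes(5);

        // If this attempt is running after the next scheduled run should
        // already have started, skip the retry instead of competing with it.
        if (DateTime.Now - enqueuedAt > maxAge)
        {
            return;
        }

        RunJob(runParam, cancellationToken);
    }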