Long running job re-enqueues while processing

I’m using Hangfire 1.6.3 and Hangfire.Pro.Redis 2.0. From what I’ve read, the InvisibilityTimeout seems to be causing my long-running job to re-enqueue. Unfortunately, the job I’m dealing with may take up to 24 hours to complete. Raising the timeout that far would probably introduce other issues if the server is reset.

Is there any way I can introduce a heartbeat so the running job can let Hangfire know it’s still alive, thereby preventing the InvisibilityTimeout from triggering?

Other info: this job can create hundreds of thousands of PDFs, each taking between 1 and 5 minutes. The job itself is multithreaded and processes up to 32 reports concurrently. I can’t just have one job per report because a large customer (with 100k reports) would drown out a smaller customer (with only a few dozen). The goal is for it to remain fairly responsive to all customers even if that means leaving the servers involved slightly underutilized.

After a bit of digging around I figured out that the invisibility timeout is based on the job’s fetched time. The code below uses the PerformContext to update the fetched time of the running job to now. As long as this gets called more frequently than the invisibility timeout, the job won’t re-enqueue. It’s a bit of a hack and I’m hoping to make a better solution eventually.

using System;
using System.Collections.Generic;
using Hangfire.Common;
using Hangfire.Server;

public static class HangfirePerformContextExtension {
		// Resets the job's "Fetched" timestamp to now so the invisibility timeout starts counting again.
		public static void SendProcessingJobHeartBeat( this PerformContext cxt ) {
			cxt.Connection.SetRangeInHash( $"job:{cxt.BackgroundJob.Id}", new[ ] {
				new KeyValuePair<string, string>( "Fetched", JobHelper.SerializeDateTime( DateTime.UtcNow ) )
			} );
		}
}
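
For anyone wondering how the heartbeat gets called, here is a rough sketch of using it from inside the job's own loop. ReportJob, GetBatches, ProcessBatch, and the five-minute interval are all made up for illustration; the only requirement from the post above is that the interval stays shorter than the InvisibilityTimeout. It assumes the SendProcessingJobHeartBeat extension above is in scope.

```csharp
using System;
using System.Collections.Generic;
using Hangfire.Server;

public class ReportJob {
	// Hypothetical interval; it just has to be shorter than the InvisibilityTimeout.
	private static readonly TimeSpan HeartbeatInterval = TimeSpan.FromMinutes( 5 );

	// Hangfire supplies the PerformContext at run time when the job is
	// enqueued with null in its place: BackgroundJob.Enqueue<ReportJob>( j => j.Run( null ) );
	public void Run( PerformContext context ) {
		var lastBeat = DateTime.UtcNow;

		foreach ( var batch in GetBatches( ) ) {
			ProcessBatch( batch );

			// Refresh the "Fetched" timestamp once the interval has elapsed.
			if ( DateTime.UtcNow - lastBeat >= HeartbeatInterval ) {
				context.SendProcessingJobHeartBeat( );
				lastBeat = DateTime.UtcNow;
			}
		}
	}

	// Hypothetical placeholders standing in for the real report work.
	private IEnumerable<int> GetBatches( ) { yield break; }
	private void ProcessBatch( int batch ) { }
}
```

Calling the heartbeat from the job's own thread avoids having to assume that the storage connection can safely be shared with a timer thread. The catch is that a single batch must finish well within the timeout.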

As I said in our email conversation, you can use higher InvisibilityTimeout values without any penalty. In Hangfire.Pro.Redis, FetchedJobsWatcher also checks whether the background job server is still active. If it is not, the background job is requeued regardless of the InvisibilityTimeout setting.

In the near future I’m considering removing this option entirely, but for now you can set it to one week or so.
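
A minimal configuration sketch of the "one week" suggestion, assuming the timeout is set through the InvisibilityTimeout property of RedisStorageOptions and passed to UseRedisStorage. The connection string is a placeholder and the namespaces are from memory, so check both against your Hangfire.Pro.Redis version.

```csharp
using System;
using Hangfire;
using Hangfire.Pro.Redis;

public static class HangfireStorageSetup {
	public static void Configure( ) {
		var options = new RedisStorageOptions {
			// One week, as suggested above; adjust to taste.
			InvisibilityTimeout = TimeSpan.FromDays( 7 )
		};

		// "localhost:6379" is a placeholder connection string.
		GlobalConfiguration.Configuration.UseRedisStorage( "localhost:6379", options );
	}
}
```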


What was your solution here?