Job retry while job is running with small delay

j33v3s · December 7, 2018, 4:17pm

Today we had a job retry - even though the job was still running.
The second attempt started only 3 minutes after the initial start, so the same job ended up running twice, with both runs overlapping.

On the dashboard and in the database, this looked like any other “job was interrupted, so start it again” case - with only one “Enqueued” entry, several “Processing” entries and only one “Succeeded” entry.
However, this was not due to an application pool or server crash. In fact, the server id recorded for the job was identical on both attempts and only the worker id was different, indicating the hangfire server itself was not interrupted.

All hangfire settings are default - except that each IIS Site instance has its own queue and the AutomaticRetryAttribute is set to 0.
I should also add - we are just using the enqueue call to execute our jobs immediately - we aren’t doing anything fancy like scheduling jobs to execute later or to execute after another job at this point.

Anyone else experienced this and/or fixed it?

As far as we’re aware this hasn’t happened before. In fact, the only reason we noticed it this time is an admittedly poor legacy design choice, which caused incorrect data output when the two runs overlapped.

j33v3s · December 19, 2018, 3:52pm

Ok - I’m going to reply to this in case someone else comes across it.

The key bit of information I had overlooked was the fact that the server id was the same, but the worker ids were separate. Many thanks to odinserj for getting in contact regarding this.

As it turns out, we have been having some minor network blips between the front end server (where hangfire executes) and the backend server (where the SQL database resides). A blip results in hangfire thinking that the job has failed and allocating a free worker to it. Meanwhile, the “hung” process reconnects and keeps going, resulting in overlapping jobs.

So, as far as I’m concerned, not a result of hangfire - it’s due to our server.
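
For anyone who can't rule out blips like this, one possible mitigation is to make the job body refuse to start if the same job id is already running. Below is a rough sketch of that idea in Python, not hangfire code: `OverlapGuard` and `run_exclusive` are made-up names, and the guard only works inside a single process. That matches our case (same server id, different worker id), but it would not help across several servers.

```python
import threading


class OverlapGuard:
    """Hypothetical in-process guard: remembers which job ids are running."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = set()

    def try_start(self, job_id):
        # Atomically claim the job id; False means another worker holds it.
        with self._lock:
            if job_id in self._running:
                return False
            self._running.add(job_id)
            return True

    def finish(self, job_id):
        # Release the claim so a later, legitimate run can start.
        with self._lock:
            self._running.discard(job_id)


def run_exclusive(guard, job_id, work):
    """Run work() unless the same job id is already executing."""
    if not guard.try_start(job_id):
        return "skipped"  # a second attempt arrived while the first is alive
    try:
        work()
        return "ran"
    finally:
        guard.finish(job_id)
```

With this in place, a second worker that picks up the same job id while the first attempt is still going gets "skipped" back instead of running the body again. Whether skipping is the right reaction (as opposed to waiting or raising an error) depends on the job.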
