Firstly, as everyone keeps saying, Hangfire is awesome! I’m just running into an issue configuring my set-up.
We have potentially long-running tasks, say 30 minutes to 1.5 hours. We throw these onto Hangfire, and all works well for the most part. However, imagine this scenario:
Invisibility timeout set to 90 mins, as tasks can take up to 90 minutes.
Hangfire server running in a console application.
A long-running task is created and launched.
The console application is restarted, so we call:
Then we re-launch the server:
However, the original job is still running on the old server. Hangfire shows:
Looks like the job was aborted – it is processed by server ip-0ac4cbb1:11540, which reported its heartbeat more than 1 minute ago. It will be retried automatically after invisibility timeout, but you can also re-queue or delete it manually.
It then shows, after further delay:
The job was aborted – it is processed by server ip-0ac4cbb1:11540 which is not in the active servers list for now. It will be retried automatically after invisibility timeout, but you can also re-queue or delete it manually.
The invisibility timeout is set to 90 minutes, as some jobs might take this long. However, shouldn’t the server dying cause the job to be re-queued? Is there a way to get this behaviour?
Alternatively, if I reduce the invisibility timeout, the job could be launched multiple times, which isn’t ideal. If I add a global lock and just return success on subsequent attempts, this could cause an issue if the console app is relaunched after the invisibility timeout (the failed job will never be captured).
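To make the lock idea concrete, here is a rough sketch of what I have in mind. The names `LongTaskGuard` and `Run` are made up for illustration, and the in-memory dictionary only stands in for a real shared lock (a database row or similar) that would survive a process restart. None of this is Hangfire API.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper, not part of Hangfire.
// The dictionary is a stand-in for a lock held in shared storage.
public static class LongTaskGuard
{
    private static readonly ConcurrentDictionary<string, bool> Running =
        new ConcurrentDictionary<string, bool>();

    public static void Run(string taskId, Action work)
    {
        // If another attempt already holds the lock, return normally.
        // The job method completes without throwing, so the duplicate
        // attempt is recorded as succeeded.
        if (!Running.TryAdd(taskId, true))
        {
            return;
        }

        try
        {
            work();
        }
        finally
        {
            // Release the lock whether the work succeeded or threw.
            Running.TryRemove(taskId, out _);
        }
    }
}
```

A retried attempt that hits the lock returns normally, so it would be marked as succeeded even if the original run later dies, which is exactly the case I’m worried about.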
A little unsure how I’m meant to handle this - help appreciated!
Thanks again for all your great work!