I am running Hangfire Server in-process in a single ASP.NET Core 2.1 application and am trying to come up with a strategy to handle updates and restarts.
Imagine my app is chugging along nicely and I’ve written some new features and fixed some bugs so naturally it is time for a new release.
If the process is simply stopped then job progress could be lost or jobs may be processed multiple times (am I correct on that?) due to the database record not being marked as done.
This is not desired and I’d like to avoid this if at all possible.
Solutions I’ve thought of, but not tested, are as follows:
1. Run Hangfire Server out-of-process (but then how do I update that process?)
2. Spin up a second web app process and only process future jobs on this one (don’t run future jobs on the previous process) and stop the running process once all of its jobs have completed (sounds complex and error-prone)
3. Throw my hands up and accept downtime (yikes!), and try to pick the least-damaging time to take things offline
How are you all handling this? I can’t imagine this is a new problem.
Hi. As for your jobs being re-executed, the best recommendation is to follow the best practices (http://docs.hangfire.io/en/latest/best-practices.html) and keep your jobs reentrant. Also, whenever you’re doing something that could take some time to finish, try to call ThrowIfCancellationRequested() on the job’s IJobCancellationToken, and of course store some kind of job state so you know what to do on the next execution.
As for the rest… You choose the best approach for updating your application …
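To make the advice above concrete, here is a minimal sketch of a reentrant job that checks for cancellation and stores its progress. IProgressStore, ImportJob and Process are hypothetical names invented for illustration; only IJobCancellationToken and ThrowIfCancellationRequested() come from Hangfire. It also assumes Hangfire can resolve the job's constructor dependencies through your DI container.

```csharp
using Hangfire;

// Hypothetical checkpoint store (not part of Hangfire); back it with your own table.
public interface IProgressStore
{
    // Returns -1 when nothing has been processed yet for this job key.
    int GetLastProcessedIndex(string jobKey);
    void SaveLastProcessedIndex(string jobKey, int index);
}

public class ImportJob
{
    private readonly IProgressStore _progress;

    public ImportJob(IProgressStore progress)
    {
        _progress = progress;
    }

    // Hangfire replaces the IJobCancellationToken argument with a real token
    // when a worker runs the job.
    public void Run(string jobKey, string[] items, IJobCancellationToken cancellationToken)
    {
        // Resume after the last fully processed item, so a re-run skips finished work.
        int start = _progress.GetLastProcessedIndex(jobKey) + 1;

        for (int i = start; i < items.Length; i++)
        {
            // Throws if cancellation was requested (for example during server
            // shutdown), so the loop stops between items rather than mid-item.
            cancellationToken.ThrowIfCancellationRequested();

            Process(items[i]);
            _progress.SaveLastProcessedIndex(jobKey, i);
        }
    }

    private void Process(string item)
    {
        // Hypothetical unit of work. It should be safe to repeat, because a crash
        // between Process and SaveLastProcessedIndex replays this one item.
    }
}
```

A job like this could be enqueued with BackgroundJob.Enqueue<ImportJob>(job => job.Run("import-1", items, JobCancellationToken.Null)), where JobCancellationToken.Null is only a placeholder that Hangfire swaps out at execution time.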
Hello Phrohdoh, this is going to be long, but I hope this helps.
There are two things I would like to mention, based on my experience:
1. A job’s progress should never stop unless it finishes its task, exhausts its retry attempts, is stopped manually, your app’s application pool is stopped, or the ASP.NET server shuts down. I would add “your app’s application pool is recycled, or the ASP.NET server reboots”, but that depends on the type of job you’re running. Recurring jobs are fine: Hangfire will re-launch them after the pool recycle or server reboot completes. One-time jobs should be “transferred” by Hangfire into the retry list, but I’m not 100% sure on that. I didn’t test the “one-time job, pool recycle/server reboot” scenario, because I wrapped my target method (the method passed to BackgroundJob.Enqueue()) in a try/catch that emails and/or logs exceptions, so if Hangfire fails to run that one-time job for any reason, I will know about it. Also, whenever one of my one-time jobs hit a bug, Hangfire automatically “transferred” it into the retry list. Hangfire’s default number of retry attempts is 10.
2. When you publish your new release (your new features and bug fixes), Hangfire will retry/re-launch those jobs as soon as the update is published to the server. I should also mention that the ASP.NET server should give Hangfire time to finish running jobs before pool/server shutdowns and restarts/recycles.
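The two points above could look roughly like this in code. This is only a sketch: ReportJob, GenerateReport and the two-minute timeout are assumptions made up for illustration, and the exception is rethrown after logging because, as far as I know, Hangfire only retries a job whose method actually throws. AutomaticRetry, BackgroundJobServerOptions.ShutdownTimeout and UseHangfireServer are the real Hangfire pieces.

```csharp
using System;
using Hangfire;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.Logging;

public class ReportJob
{
    private readonly ILogger<ReportJob> _logger;

    public ReportJob(ILogger<ReportJob> logger)
    {
        _logger = logger;
    }

    // 10 attempts is Hangfire's default; it is spelled out here only to make it visible.
    [AutomaticRetry(Attempts = 10)]
    public void Run(int reportId)
    {
        try
        {
            GenerateReport(reportId);
        }
        catch (Exception ex)
        {
            // Log (or email) the failure so you hear about it...
            _logger.LogError(ex, "Report job {ReportId} failed", reportId);
            // ...then rethrow so Hangfire still sees the failure and schedules a retry.
            throw;
        }
    }

    private void GenerateReport(int reportId)
    {
        // Hypothetical work.
    }
}

public static class HangfireServerSetup
{
    // Call this from Startup.Configure instead of app.UseHangfireServer().
    public static void UseHangfireServerWithShutdownGrace(this IApplicationBuilder app)
    {
        var options = new BackgroundJobServerOptions
        {
            // Assumed value: how long the server waits for its workers to stop
            // when the application shuts down.
            ShutdownTimeout = TimeSpan.FromMinutes(2)
        };

        app.UseHangfireServer(options);
    }
}
```

Keep in mind that the web host (IIS or Kestrel) has its own shutdown limit, so a long ShutdownTimeout may still be cut short unless the host is configured to wait as well.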
Next, regarding the three options you listed:
1. Running Hangfire Server out-of-process: although this solution may not be needed, you can update “that” process by creating a recurring job. For recurring jobs this is not needed, but it may be needed for one-time jobs if my “retry list” theory is incorrect.
2. Spinning up a second web app process: don’t do this to your server, please. One web app can handle all present and future jobs. You’d be better off going with my suggestion above and creating a recurring job to update “that” process.
3. Accepting downtime: you should never have to take anything offline unless it’s for maintenance or something super serious, for example switching servers. Technically, you should use failover/failsafe systems, plus good PR, for handling maintenance and prolonged downtime. Also, you’re going to throw your hands up regardless, on several occasions and for many different reasons, which is why people in our field have strong arms.