Firstly, let me say I’m not sure this is specifically a problem with Hangfire.
Last night was the first time I left my application running overnight in our staging environment. After our server backup process ran, Hangfire stopped processing jobs. They just sat in the processing queue for several hours without actually doing anything (no CPU activity).
I tried restarting IIS, which had no effect. Then I restarted the server, which also had no effect. Finally, on the SQL Server (a separate server), I dropped the whole Hangfire schema and reinstalled it. My application then started processing jobs again.
I have the Hangfire schema in a separate database from the main application database, and one of the server backup tasks is to kick off SQL Server maintenance (reindexing etc.) of the main application database.
My Hangfire jobs write the results of what they process to the main application database, so while the maintenance is in progress that database would be unavailable to them. That is what I think caused the problem.
What I don’t understand is why dropping the Hangfire schema and reinstalling it allowed the jobs to start processing again.
Any advice on how to investigate and prevent this from happening in the future would be appreciated, especially as I’m not sure if it was Hangfire’s fault or my own.
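
In case it helps frame the investigation, this is the sort of query I could run the next time it happens, before dropping anything. It is only a sketch: it assumes the default `HangFire` schema name and the standard SQL Server storage tables (`Job`, `JobQueue`, `Server`), which I have not checked against my installed version.

```sql
-- Assumption: default [HangFire] schema and standard storage tables.

-- Jobs currently marked as Processing, with their queue entry if one exists.
SELECT j.Id, j.StateName, j.CreatedAt, q.Queue, q.FetchedAt
FROM [HangFire].[Job] AS j
LEFT JOIN [HangFire].[JobQueue] AS q ON q.JobId = j.Id
WHERE j.StateName = N'Processing'
ORDER BY j.CreatedAt;

-- Registered Hangfire servers and when each last sent a heartbeat.
SELECT Id, LastHeartbeat
FROM [HangFire].[Server]
ORDER BY LastHeartbeat DESC;

-- Sessions on the SQL Server instance that are blocked by another session,
-- e.g. by the maintenance task.
SELECT session_id, blocking_session_id, wait_type, wait_time, command
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;
```

If the first query shows jobs stuck in Processing while the second shows no server with a recent heartbeat, that would at least point towards the Hangfire side rather than my job code.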