Hangfire seems deadlocked?



For the past few days we have been running into issues with Hangfire: the processing of background jobs seems to stop from one second to the next. Even restarting the worker does not help; it starts up and does nothing, although according to the Dashboard a job has been picked up and is being processed. But when I hit pause in the debugger and look at the threads, I see nothing running in our code. We use MySQL. Does anyone have any pointers as to what could cause this, or how I can analyze the issue?

Here’s what the log shows when restarting the server: https://pastebin.com/p4GtK8Lh

(outdated: To resolve the situation I had to drop the hangfire db. :T)

UPDATE: I noticed that this has something to do with the distributed locks. The deadlock situation occurred again just now, and I tried messing with the DB to isolate the source of the problem. It seems the problem goes away when I delete the row(s) in the ‘DistributedLock’ table in the hangfire db. As soon as I execute the delete command, operations return to normal.

I suspect the problem arises when I kill the Hangfire background job server process without shutting it down properly. The job that was being processed at that moment then blocks all subsequent jobs. Can that be? We only ever use one background worker process, so maybe we could clean up the DB state when starting up the server?
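
To make the cleanup idea concrete, here is a minimal sketch of what I have in mind. It is not Hangfire's API and not what the MySQL storage does internally. It uses an in-memory SQLite database as a stand-in for the hangfire db, and the column names `Resource` and `CreatedAt`, the helper `clear_stale_locks`, and the 10-minute threshold are all assumptions for illustration. The idea is only: before the single worker starts, remove lock rows old enough that they can only have been left behind by a killed process.

```python
import sqlite3
import time


def clear_stale_locks(conn, max_age_seconds, now=None):
    """Delete lock rows older than max_age_seconds; return how many were removed.

    Assumes a table DistributedLock(Resource, CreatedAt) where CreatedAt is a
    Unix timestamp. The real hangfire schema may differ (assumption).
    """
    if now is None:
        now = time.time()
    cutoff = now - max_age_seconds
    cur = conn.execute(
        "DELETE FROM DistributedLock WHERE CreatedAt < ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount


# Stand-in database with one stale lock and one recent lock.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DistributedLock (Resource TEXT, CreatedAt REAL)")
now = 1_000_000.0  # fixed "current time" so the example is deterministic
conn.execute("INSERT INTO DistributedLock VALUES (?, ?)", ("job:stale", now - 3600))
conn.execute("INSERT INTO DistributedLock VALUES (?, ?)", ("job:fresh", now - 5))

# Run the cleanup once, before the worker would start processing.
removed = clear_stale_locks(conn, max_age_seconds=600, now=now)
remaining = [row[0] for row in conn.execute("SELECT Resource FROM DistributedLock")]
print(removed, remaining)
```

With a single worker process, deleting by age (or even deleting all rows) at startup should be safe, because no other process can legitimately hold a lock at that point. With several workers this would need more care.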