I have a Hangfire server that runs background jobs at random times throughout the day for an on-prem report rendering service. When a user requests a document, a background job is enqueued into Hangfire; the worker then renders the document and prints, faxes, or emails it.
At random, Hangfire seems to deadlock: all 20 workers on my server get “stuck” in the Processing state until I restart the Hangfire server. It runs as a Windows service, and the same thing happened when the server was hosted in IIS as an always-running app.
Nothing in the logs indicates why it stopped processing or deadlocked. The server is still running, the heartbeat is active, and the Windows event logs show no errors. Has anyone experienced this, or can anyone help me track down the cause?
Here is my Hangfire server configuration
Edit: Forgot to mention that this is a .NET Framework 4.7.2 application running Hangfire.Core 1.7.24 and Hangfire.SqlServer 1.7.24.
Edit2: I can reproduce this issue on the server by throwing an arbitrarily large number of jobs at it at once.
These jobs should each take a second or two to process. Instead, they sit in the “Processing” state doing nothing until the worker is restarted, which makes me think there is a deadlock somewhere.
My workers' heartbeats show up just fine, so it seems like they should still be processing.