I have multiple machines receiving requests, each running Hangfire with SQL Server/MSMQ. If a machine is suddenly removed from the farm, the jobs that were enqueued on that server are orphaned: the other servers never pick them up.
I believe this is because Hangfire assumes that when a server goes offline, the machine that was hosting it will eventually come back online. The MSMQ messages would still be on that machine, so the jobs would effectively be enqueued again at that point.
In this case (using Elastic Beanstalk), however, once a machine goes offline, it never comes back.
The solution to this could be as simple as a background job that looks for orphaned jobs based on the list of servers and re-queues them. In fact, that’s precisely what I did for Hangfire 1.4.7.
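For illustration, the re-queue idea can be sketched roughly as follows. This is a logic-only Python sketch, not the actual Hangfire 1.4.7 code: `EnqueuedJob`, `find_orphaned_jobs`, `requeue_orphans`, and the `requeue` callback are hypothetical names, and it assumes each enqueued job records the server whose local queue it was placed on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnqueuedJob:
    # Hypothetical record: a job id plus the server whose local queue holds it.
    job_id: str
    server_id: str


def find_orphaned_jobs(jobs, active_server_ids):
    """Return the jobs whose owning server is not in the active server list."""
    active = set(active_server_ids)
    return [job for job in jobs if job.server_id not in active]


def requeue_orphans(jobs, active_server_ids, requeue):
    """Call requeue(job_id) for each orphaned job; return the re-queued ids."""
    orphans = find_orphaned_jobs(jobs, active_server_ids)
    for job in orphans:
        # Assumption: requeue places the job on a queue a live server can read.
        requeue(job.job_id)
    return [job.job_id for job in orphans]
```

Run on a schedule, something like this would compare the enqueued jobs against the current server list and hand any job owned by a missing server back to the queue.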
My question is: Has anyone else run across this or similar issues, and did I miss something built into Hangfire that is supposed to gracefully handle this situation?