@odinserj I believe you still need at least one major feature. Without the ability to tell all servers to finish their work and then pause, it is very difficult to deploy without fearing that you might kill a process in the middle of a job.
For example, I have jobs that I queue, which, in turn, queue thousands of smaller jobs.
If I were to deploy while one of these larger jobs is running, that job might be 50% done, but since I killed it, it would be restarted and the initial 50% would be queued again.
There is a way to stop background processing: the blocking BackgroundJobServer.Dispose method and the non-blocking BackgroundJobServer.SendStop method (added in 1.6.0). When they are called, background processing is stopped gracefully and no new background jobs are processed. However, this requires you to manage the lifecycle of the BackgroundJobServer class manually.
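To illustrate the manual lifecycle management mentioned above, here is a minimal sketch for a console host. It assumes job storage has already been configured (that setup is omitted), and the Ctrl+C trigger is only an illustration of "something asks the process to stop":

```csharp
using System;
using System.Threading;
using Hangfire;

public static class Program
{
    public static void Main()
    {
        // Assumption: job storage is configured before this point through
        // GlobalConfiguration.Configuration (storage-specific, omitted here).

        var stopRequested = new ManualResetEventSlim();
        Console.CancelKeyPress += (sender, e) =>
        {
            e.Cancel = true;      // keep the process alive so it can shut down cleanly
            stopRequested.Set();
        };

        var server = new BackgroundJobServer();
        try
        {
            stopRequested.Wait(); // background processing runs until Ctrl+C

            // Non-blocking: signals the server to stop picking up new jobs.
            server.SendStop();
        }
        finally
        {
            // Blocking: waits for the server's processes to stop.
            server.Dispose();
        }
    }
}
```

Calling SendStop before Dispose matters most when a process hosts several servers, since all of them can then wind down in parallel.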
The main problem is delivering the stop command to all of your instances. You can use any distributed pub/sub implementation available, including Redis-based ones. It's not rocket science.
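As a rough sketch of the pub/sub idea, assuming the StackExchange.Redis client and a manually created BackgroundJobServer on each instance. The StopSignal class and the "hangfire:stop" channel name are hypothetical, made up for this example:

```csharp
using Hangfire;
using StackExchange.Redis;

public static class StopSignal
{
    // Hypothetical channel name, chosen for this example only.
    private static readonly RedisChannel Channel =
        new RedisChannel("hangfire:stop", RedisChannel.PatternMode.Literal);

    // Call once on every instance, after creating its BackgroundJobServer.
    public static void Listen(IConnectionMultiplexer redis, BackgroundJobServer server)
    {
        redis.GetSubscriber().Subscribe(Channel, (channel, message) =>
        {
            // Non-blocking; the instance should still call Dispose before exiting.
            server.SendStop();
        });
    }

    // Call from a deployment script or an admin endpoint.
    public static void Broadcast(IConnectionMultiplexer redis)
    {
        redis.GetSubscriber().Publish(Channel, "stop");
    }
}
```

Keep in mind that Redis pub/sub is fire-and-forget: an instance that is not connected when the message is published will not receive it, so a deployment script may still want to confirm that each instance has actually stopped.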
If you have long-running jobs, make sure your background job method takes an IJobCancellationToken parameter, or at least a CancellationToken parameter, and checks it from time to time. This lets your method finish gracefully at the points you’ve defined. Nevertheless, with or without graceful shutdown, your background job identifier will be returned to the queue, and the background job will be run again after startup. So you need to ensure that your background jobs are idempotent, or at least can resume their execution from some checkpoint.
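A minimal sketch of a job that checks the token and resumes from a checkpoint. IJobCancellationToken and its ThrowIfCancellationRequested method are Hangfire's; the IProgressStore interface and the BatchJob class are hypothetical names invented for this example:

```csharp
using Hangfire;

// Hypothetical progress store; a real one would be backed by a database.
public interface IProgressStore
{
    int GetNextIndex(string batchId);
    void SetNextIndex(string batchId, int nextIndex);
}

public class BatchJob
{
    private readonly IProgressStore _progress;

    public BatchJob(IProgressStore progress)
    {
        _progress = progress;
    }

    // Enqueue with:
    //   BackgroundJob.Enqueue<BatchJob>(
    //       job => job.Run("batch-1", 1000, JobCancellationToken.Null));
    // Hangfire substitutes a real token when the job is performed.
    public void Run(string batchId, int totalItems, IJobCancellationToken token)
    {
        // Start from the last saved checkpoint instead of from zero.
        for (var i = _progress.GetNextIndex(batchId); i < totalItems; i++)
        {
            // Throws when shutdown has been requested, ending the method
            // at a point where no item is half-processed.
            token.ThrowIfCancellationRequested();

            ProcessItem(i);
            _progress.SetNextIndex(batchId, i + 1);
        }
    }

    private void ProcessItem(int index)
    {
        // Placeholder for the real work on a single item.
    }
}
```

If the process dies between ProcessItem and SetNextIndex, that one item is processed again on the next run, so the per-item work still needs to be idempotent.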
Idempotence is a required property for any background job in any framework that has at-least-once processing semantics (at-most-once leads to background job loss, and exactly-once doesn’t exist at all due to the async nature of the processing).
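To make the idempotence requirement concrete, here is a small self-contained sketch that does not depend on Hangfire. OrderCharger and its members are hypothetical, and the in-memory HashSet stands in for a durable store with a unique key:

```csharp
using System.Collections.Generic;

public class OrderCharger
{
    // Stand-in for a durable table with a unique constraint on the order id.
    private readonly HashSet<string> _chargedOrders = new HashSet<string>();

    // Counts how many times the side effect actually ran.
    public int ChargesMade { get; private set; }

    public void Charge(string orderId)
    {
        // HashSet.Add returns false when the id is already present,
        // so a repeated run of the same job becomes a no-op.
        if (!_chargedOrders.Add(orderId))
        {
            return;
        }

        // In a real system, recording the id and performing the charge
        // would need to happen in the same transaction.
        ChargesMade++;
    }
}
```

With at-least-once delivery, Charge("order-42") may run twice after a restart, but the order is only charged once.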
How does one get the BackgroundJobServer if the UseHangfireServer method is used?
GetMonitoringApi().Servers() returns a list of ServerDto objects, but they can’t be cast to BackgroundJobServer.
In another thread you also mention the static server instance. How does one get that instance?
Unfortunately, there is no way to stop them without hacks that use reflection. I’ll consider how to implement this feature in 1.7.0; for now, you can use the following method to perform the task.
// Requires: using System; using System.Collections.Concurrent;
//           using System.Reflection; using Hangfire;
internal static bool DisposeServers()
{
    try
    {
        // Locate the non-public static field that holds the servers
        // created by UseHangfireServer.
        var type = Type.GetType("Hangfire.AppBuilderExtensions, Hangfire.Core", throwOnError: false);
        if (type == null) return false;

        var field = type.GetField("Servers", BindingFlags.Static | BindingFlags.NonPublic);
        if (field == null) return false;

        var value = field.GetValue(null) as ConcurrentBag<BackgroundJobServer>;
        if (value == null) return false;

        var servers = value.ToArray();

        foreach (var server in servers)
        {
            // Dispose method is a blocking one. It's better to send stop
            // signals first, to let them stop at once, instead of one by one.
            server.SendStop();
        }

        foreach (var server in servers)
        {
            server.Dispose();
        }

        return true;
    }
    catch (Exception)
    {
        return false;
    }
}
Reflection didn’t work for me; however, Requeue did the job:
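// Requires: using System; using System.Collections.Generic; using System.Linq; using Hangfire;
var monitoring = JobStorage.Current.GetMonitoringApi();
var pageCount = (int)Math.Ceiling(monitoring.ProcessingCount() / 1000d);

// Collect the identifiers first: requeuing removes jobs from the processing
// list, so paging through it while requeuing could skip some of them.
var jobIds = new List<string>();
for (var i = 0; i < pageCount; i++)
{
    jobIds.AddRange(monitoring.ProcessingJobs(1000 * i, 1000).Select(job => job.Key));
}
foreach (var jobId in jobIds)
{
    BackgroundJob.Requeue(jobId);
}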