Hangfire Discussion

Incomplete batches stuck in the queue when jobs are expired

hangfire-pro
sql-server

#1

Hangfire Version: 1.6.7
Hangfire.Pro Version: 1.4.7

I have a number of batches in Hangfire Pro that contain expired jobs. How can I safely remove these batches from the Batches screen? I have tried deleting the individual jobs, but the page just refreshes and nothing gets removed.

Many thanks,
Kevin


#2

Kevin, sorry for the delay. I’ve just released Hangfire.Pro 1.4.8. There was a problem with batches that contain continuations for “external” jobs. If the antecedent job had already finished before the batch containing its continuation was created, the batch was still created, but the continuation (which is in the Deleted state, because its antecedent job had already finished) stayed in the Pending tab forever. The same happened with any other background job that was already in a finished state before the batch was created.


#3

Great - thank you.

I will install the new version for our next release. Will it also clear the batches that are already in this state? Or should I clear those from the database manually (the Hangfire.Set and Hangfire.List tables)?


#4

Here is a much better method:

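// Requires: using System; using System.Collections.Generic;
//           using System.Linq; using Hangfire.Storage;
internal static void ExpireBatch(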
    JobStorageConnection connection,
    JobStorageTransaction transaction,
    string batchId,
    TimeSpan expireIn)
{
    // Expiring the batch itself
    transaction.ExpireHash($"batch:{batchId}", expireIn);
    transaction.ExpireHash($"batch:{batchId}:state", expireIn);
    transaction.ExpireList($"batch:{batchId}:states", expireIn);

    // Unregister batch from all state sets
    transaction.RemoveFromSet("batches:awaiting", batchId);
    transaction.RemoveFromSet("batches:awaiting-job", batchId);
    transaction.RemoveFromList("batches:completed", batchId);
    transaction.RemoveFromList("batches:deleted", batchId);
    transaction.RemoveFromSet("batches:started", batchId);
    transaction.RemoveFromList("batches:succeeded", batchId);

    // Expire non-initialized jobs
    var jobIds = new List<string>();
    jobIds.AddRange(GetJobIdsFromSet(connection, $"batch:{batchId}:created"));
    jobIds.AddRange(GetJobIdsFromSet(connection, $"batch:{batchId}:pending"));
    jobIds.AddRange(GetJobIdsFromSet(connection, $"batch:{batchId}:processing"));
    jobIds.AddRange(GetJobIdsFromSet(connection, $"batch:{batchId}:succeeded"));
    jobIds.AddRange(GetJobIdsFromSet(connection, $"batch:{batchId}:finished"));

    foreach (var jobId in jobIds)
    {
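        // Jobs that already have a state are skipped; only jobs that were
        // never initialized (no state data) are expired here.
        var state = connection.GetStateData(jobId);
        if (state != null) continue;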

        transaction.ExpireJob(jobId, expireIn);

        // Expire job continuations
        transaction.ExpireSet($"job:{jobId}:continuations:succeeded", expireIn);
        transaction.ExpireSet($"job:{jobId}:continuations:finished", expireIn);

        // Expire batch continuations (upcoming, for future)
        transaction.ExpireSet($"job:{jobId}:continuations:batch:succeeded", expireIn);
        transaction.ExpireSet($"job:{jobId}:continuations:batch:finished", expireIn);
    }

    // Expire data for non-started batches
    transaction.ExpireSet($"batch:{batchId}:created", expireIn);
    transaction.ExpireHash($"batch:{batchId}:next-states", expireIn);

    // Expire batch sets
    transaction.ExpireSet($"batch:{batchId}:pending", expireIn);
    transaction.ExpireSet($"batch:{batchId}:processing", expireIn);
    transaction.ExpireSet($"batch:{batchId}:succeeded", expireIn);
    transaction.ExpireSet($"batch:{batchId}:finished", expireIn);

    // Expire batch continuation data
    transaction.ExpireSet($"batch:{batchId}:continuations:succeeded", expireIn);
    transaction.ExpireSet($"batch:{batchId}:continuations:finished", expireIn);

    // Expire nested batches (upcoming feature)
    var batchIds = new List<string>();
    batchIds.AddRange(GetBatchIdsFromSet(connection, $"batch:{batchId}:created:batches"));
    batchIds.AddRange(GetBatchIdsFromSet(connection, $"batch:{batchId}:pending:batches"));
    batchIds.AddRange(GetBatchIdsFromSet(connection, $"batch:{batchId}:processing:batches"));
    batchIds.AddRange(GetBatchIdsFromSet(connection, $"batch:{batchId}:succeeded:batches"));
    batchIds.AddRange(GetBatchIdsFromSet(connection, $"batch:{batchId}:finished:batches"));

    foreach (var id in batchIds)
    {
        ExpireBatch(connection, transaction, id, expireIn);
    }

    // Expire data for nested non-started batches (upcoming, for future)
    transaction.ExpireSet($"batch:{batchId}:created:batches", expireIn);
    transaction.ExpireHash($"batch:{batchId}:next-states:batches", expireIn);

    // Expire nested batch sets (upcoming, for future too)
    transaction.ExpireSet($"batch:{batchId}:pending:batches", expireIn);
    transaction.ExpireSet($"batch:{batchId}:processing:batches", expireIn);
    transaction.ExpireSet($"batch:{batchId}:succeeded:batches", expireIn);
    transaction.ExpireSet($"batch:{batchId}:finished:batches", expireIn);

    // Expire job continuations for batches (upcoming, for future)
    transaction.ExpireHash($"batch:{batchId}:continunations:succeeded:jobs", expireIn);
    transaction.ExpireHash($"batch:{batchId}:continunations:finished:jobs", expireIn);
}

private static List<string> GetJobIdsFromSet(JobStorageConnection connection, string set)
{
    return connection.GetAllItemsFromSet(set).ToList();
}

private static List<string> GetBatchIdsFromSet(JobStorageConnection connection, string set)
{
    return connection.GetAllItemsFromSet(set).ToList();
}

And an example of how to use it:

using (var connection = JobStorage.Current.GetConnection().AsJobStorageConnection())
using (var transaction = connection.CreateWriteTransaction().AsJobStorageTransaction())
{
    // antecedent and child hold the IDs (strings) of the batches to expire
    ExpireBatch(connection, transaction, antecedent, TimeSpan.FromMinutes(1));
    ExpireBatch(connection, transaction, child, TimeSpan.FromMinutes(1));

    transaction.Commit();
}
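
If you want to check what is in storage before expiring a batch, it may help to list the keys that ExpireBatch touches for a given batch ID. The following is only a minimal sketch derived from the method body above: BatchKeys is a hypothetical helper (not part of Hangfire or Hangfire.Pro), it skips the nested-batch and "upcoming" keys, and it only builds strings without reading or changing any storage.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of Hangfire or Hangfire.Pro.
public static class BatchKeys
{
    // Returns (structure, key) pairs for the per-batch keys that ExpireBatch
    // passes to ExpireHash / ExpireList / ExpireSet. The key names are copied
    // from the method above; nested-batch and "upcoming" keys are left out.
    public static List<KeyValuePair<string, string>> For(string batchId)
    {
        var prefix = "batch:" + batchId;

        return new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("hash", prefix),
            new KeyValuePair<string, string>("hash", prefix + ":state"),
            new KeyValuePair<string, string>("hash", prefix + ":next-states"),
            new KeyValuePair<string, string>("list", prefix + ":states"),
            new KeyValuePair<string, string>("set", prefix + ":created"),
            new KeyValuePair<string, string>("set", prefix + ":pending"),
            new KeyValuePair<string, string>("set", prefix + ":processing"),
            new KeyValuePair<string, string>("set", prefix + ":succeeded"),
            new KeyValuePair<string, string>("set", prefix + ":finished"),
            new KeyValuePair<string, string>("set", prefix + ":continuations:succeeded"),
            new KeyValuePair<string, string>("set", prefix + ":continuations:finished"),
        };
    }
}

public static class Program
{
    public static void Main()
    {
        // "my-batch-id" is a placeholder; substitute a real batch ID.
        foreach (var entry in BatchKeys.For("my-batch-id"))
        {
            Console.WriteLine(entry.Key + "\t" + entry.Value);
        }
    }
}
```

The printed keys can then be compared with what is actually stored for that batch; the global sets and lists such as batches:started are not included because they are shared by all batches.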

#5

Perfect - thank you for that, that’s cleaned everything up. Much appreciated!


#6

Glad to hear that, Kevin! Thank you for reporting this issue; you helped make batches more reliable!


#7

I’ve noticed the same thing. Currently on Hangfire 1.6.17 and Hangfire.Pro.

The code sample you posted above, is that meant to go into my host application, or is that something I run in a console application?

It would also be great to have a SQL script we could run that does the same thing.

Thanks.