Migrating from .NET Core 3.1 to .NET 7 makes Hangfire hiccup

I migrated my solution from .NET Core 3.1 to .NET 7. Everything seems to work perfectly, but after a few hours (there is no exact time) Hangfire stops processing and jobs pile up to the available worker limit (see image 3). As soon as it reaches the limit, it starts placing jobs in the queued state.

Interestingly, from time to time it recovers by itself and processes about five times more than normal. It typically processes between 30 and 80 jobs per second, but when this happens it reaches peaks of 500 jobs. Unfortunately, there are times when it doesn't recover, and the only solution is to restart the application (Docker container) for it to return to normal.

Unfortunately, I had to revert the software to .NET Core 3.1.

[image: top_500]

My code:

#region Hangfire
if (!deployConfig.Legacy)
{
    services.AddHangfireServer();
    services.AddHangfire((provider, config) =>
    {
        provider.GetRequiredService<HangfireConfigure>().Configure(config);
    });
    services.AddHangfireServer((provider, options) =>
    {
        var config = provider.GetRequiredService<IHangfire>();
        options.ServerName = "Shared";
        options.WorkerCount = config.WorkerCountModifierShared * config.WorkerCountMultiplierShared;
        options.Queues = new string[] { "sh_operation_service", "sh_couchdb_service", "sh_couchdb_expurgo_service", "sh_couchdb_service_upload", "default" };
    });

    services.AddHangfireServer((provider, options) =>
    {
        var config = provider.GetRequiredService<IHangfire>();
        options.ServerName = "Cecom";
        options.WorkerCount = config.WorkerCountModifierCecom * config.WorkerCountMultiplierCecom;
        options.Queues = new string[] { "cecom_service", "talao_service", "default" };
    });

    services.AddHangfireServer((provider, options) =>
    {
        var config = provider.GetRequiredService<IHangfire>();
        options.ServerName = "HighPriority";
        options.WorkerCount = config.WorkerCountModifierHighPriority * config.WorkerCountMultiplierHighPriority;
        options.Queues = new string[] { "hp_couchdb_service", "hp_couchdb_expurgo_service", "default" };
    });

    services.AddHangfireServer((provider, options) =>
    {
        var config = provider.GetRequiredService<IHangfire>();
        options.ServerName = "LowPriority";
        options.WorkerCount = config.WorkerCountModifierLowPriority * config.WorkerCountMultiplierLowPriority;
        options.Queues = new string[] { "lp_couchdb_service", "lp_couchdb_expurgo_service", "default" };
    });

    services.AddHangfireServer((provider, options) =>
    {
        var config = provider.GetRequiredService<IHangfire>();
        options.ServerName = "HighPriorityNoParallel";
        options.WorkerCount = 1;
        options.Queues = new string[] { "hp_couchdb_service_dwp" };
    });

    services.AddHangfireServer((provider, options) =>
    {
        var config = provider.GetRequiredService<IHangfire>();
        options.ServerName = "BackgroudOffice";
        options.WorkerCount = config.WorkerCountModifierBackground * config.WorkerCountMultiplierBackground;
        options.Queues = new string[] { "bg_couchdb_service" };
    });
}
// Dashboard and recurring job manager setup
var bgJob = app.ApplicationServices.GetRequiredService<Common.JobClients.BackgroundJob>();
var storage = bgJob.Storage;

app.UseHangfireDashboard("/hangfire", new DashboardOptions
{
    DisplayStorageConnectionString = false,
    Authorization = Enumerable.Empty<IDashboardAuthorizationFilter>(),
}, storage: storage);

ccRecurrent = new RecurringJobManager(storage);

The logging system (Seq + Serilog) shows no error or warning that explains the cause; nothing looks different from normal.
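Since the logs show nothing unusual, one way to get numbers during a stall is to log Hangfire's own counters periodically. The following is only a sketch, assuming the `storage` instance from the code above is available; the `HangfireSnapshot` class and its `IsSaturated` rule are hypothetical names introduced for illustration, while `GetMonitoringApi()`, `GetStatistics()` and `Servers()` come from Hangfire's monitoring API.

```csharp
using System;
using System.Linq;
using Hangfire;
using Hangfire.Storage;
using Hangfire.Storage.Monitoring;

// Hypothetical helper (not part of Hangfire): builds one log line from the storage counters.
public static class HangfireSnapshot
{
    // Pure check: every registered worker is busy and jobs are still waiting in queues.
    public static bool IsSaturated(long processing, long enqueued, int totalWorkers) =>
        totalWorkers > 0 && processing >= totalWorkers && enqueued > 0;

    // Reads the counters through Hangfire's monitoring API and formats them as text.
    public static string Capture(JobStorage storage)
    {
        IMonitoringApi monitoring = storage.GetMonitoringApi();
        StatisticsDto stats = monitoring.GetStatistics();

        // Sum of WorkerCount over all servers currently registered in storage.
        int totalWorkers = monitoring.Servers().Sum(s => s.WorkersCount);

        bool saturated = IsSaturated(stats.Processing, stats.Enqueued, totalWorkers);

        return $"servers={stats.Servers} workers={totalWorkers} " +
               $"processing={stats.Processing} enqueued={stats.Enqueued} " +
               $"saturated={saturated}";
    }
}
```

Calling `HangfireSnapshot.Capture(storage)` from a timer and writing the result with Serilog would put the processing and enqueued counts next to the application logs in Seq, so the moment the stall begins can be lined up with whatever else was happening.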

SQL Server 2016
.NET 7.0.14
Hangfire 1.8.6
Docker image: 7.0-jammy (Ubuntu)
VM OS: Ubuntu 20.04.1 LTS
<PackageReference Include="Hangfire" Version="1.8.6" /> (only this package)

[image: crash_]


The server is large: an Azure F16 (Intel(R) Xeon(R) Platinum 8272CL, 16 vCPUs, 32 GB RAM), and according to htop it doesn't even use 20% of that.