Hangfire exhausts connection pool resources and hangs

Greetings,

I’ve been having a problem with Hangfire lately: after a while it seems to exhaust the connections in the pool. I have 900+ recurring jobs that run periodically, with an average of 20-50 jobs running simultaneously. I rely on the DisableConcurrentExecutionWithParameters attribute so that the same job never runs more than once at a time. Once the pool is exhausted, Hangfire hangs for hours trying to reestablish its connection to SQL Server.

I’m using .NET Core 3.1 and Hangfire 1.7.18 with the worker template, running in a Docker container.

My pool size is currently set to 9999.
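
For reference, the pool size is set through the Max Pool Size keyword of the connection string. A minimal sketch of that setting, built with SqlConnectionStringBuilder; the server name, database name and authentication mode are placeholders for illustration, not my real values:

```csharp
using System;
using System.Data.SqlClient;

class PoolSizeSketch
{
    static void Main()
    {
        // Placeholder server, database and authentication settings (assumptions).
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = "sql.example.local",
            InitialCatalog = "HangfireDb",
            IntegratedSecurity = true,
            // Upper bound on pooled connections; 9999 is the value mentioned above.
            MaxPoolSize = 9999
        };

        // Prints the resulting connection string, including the Max Pool Size keyword.
        Console.WriteLine(builder.ConnectionString);
    }
}
```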
Here’s my Hangfire configuration:

    services.AddHangfire(x => x
        .UseSqlServerStorage(
            Configuration.GetSection("ConnectionStrings").GetSection("Hangfire").GetValue<string>(EnvironmentName),
            new SqlServerStorageOptions
            {
                SchemaName = settings.Schema,
                SlidingInvisibilityTimeout = TimeSpan.FromMinutes(30),
                QueuePollInterval = TimeSpan.FromSeconds(5),
            })
        .UseColouredConsoleLogProvider());

    services.AddHangfireServer();

Stack trace:

> 2021-03-20 08:10:28 [ERROR] (Hangfire.Processing.BackgroundExecution) Execution BackgroundServerProcess is still in the Failed state for 04:58:00.5171834 due to an exception, will be retried no more than in 00:00:15
> carrier_daemon_1   | System.InvalidOperationException
> carrier_daemon_1   | Timeout expired.  The timeout period elapsed prior to obtaining a connection from the pool.  This may have occurred because all pooled connections were in use and max pool size was reached.
> carrier_daemon_1   |    at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
> carrier_daemon_1   |    at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
> carrier_daemon_1   |    at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry)
> carrier_daemon_1   |    at System.Data.SqlClient.SqlConnection.Open()
> carrier_daemon_1   |    at Hangfire.SqlServer.SqlServerStorage.CreateAndOpenConnection()
> carrier_daemon_1   |    at Hangfire.SqlServer.SqlServerStorage.UseConnection[T](DbConnection dedicatedConnection, Func`2 func)
> carrier_daemon_1   |    at Hangfire.SqlServer.SqlServerStorage.UseConnection(DbConnection dedicatedConnection, Action`1 action)
> carrier_daemon_1   |    at Hangfire.Server.BackgroundServerProcess.CreateServer(BackgroundServerContext context)
> carrier_daemon_1   |    at Hangfire.Server.BackgroundServerProcess.Execute(Guid executionId, BackgroundExecution execution, CancellationToken stoppingToken, CancellationToken stoppedToken, CancellationToken shutdownToken)
> carrier_daemon_1   |    at Hangfire.Server.BackgroundProcessingServer.RunServer(Guid executionId, Object state)
> carrier_daemon_1   |    at Hangfire.Processing.BackgroundExecution.Run(Action`2 callback, Object state)

Hi,
We are seeing the same kind of error message; however, we are only running a few recurring jobs. Our issue seems to be related to our database restarting occasionally.

For our DEV environments we have a database running in Kubernetes. When it hits its memory limit it restarts, and we believe this coincides with Hangfire hanging and locking up one CPU while trying to reconnect.

We are running .NET 5 and Hangfire 1.7.9. We are NOT using the DisableConcurrentExecutionWithParameters attribute.