.NET Core - Exception while scheduling a background job

We recently migrated to .NET Core on Linux and have started getting the following exception intermittently while attempting to schedule a background job:

Hangfire.BackgroundJobClientException: Background job creation failed. See inner exception for details.
 ---> System.InvalidOperationException: Internal connection fatal error.
at System.Data.SqlClient.TdsParser.TryRun(System.Data.SqlClient.RunBehavior runBehavior, System.Data.SqlClient.SqlCommand cmdHandler, System.Data.SqlClient.SqlDataReader dataStream, System.Data.SqlClient.BulkCopySimpleResultSet bulkCopyHandler, System.Data.SqlClient.TdsParserStateObject stateObj, System.Boolean& dataReady)
at System.Data.SqlClient.TdsParser.Run(System.Data.SqlClient.RunBehavior runBehavior, System.Data.SqlClient.SqlCommand cmdHandler, System.Data.SqlClient.SqlDataReader dataStream, System.Data.SqlClient.BulkCopySimpleResultSet bulkCopyHandler, System.Data.SqlClient.TdsParserStateObject stateObj) at offset 32
at System.Data.SqlClient.TdsParser.ProcessAttention(System.Data.SqlClient.TdsParserStateObject stateObj) at offset 68
at System.Data.SqlClient.TdsParserStateObject.ResetCancelAndProcessAttention() at offset 41
at System.Data.SqlClient.TdsParserStateObject.CloseSession()
at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(System.Data.CommandBehavior cmdBehavior, System.Data.SqlClient.RunBehavior runBehavior, System.Boolean returnStream, System.Boolean async, System.Int32 timeout, System.Threading.Tasks.Task& task, System.Boolean asyncWrite, System.Data.SqlClient.SqlDataReader ds) at offset 1245
at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(System.Threading.Tasks.TaskCompletionSource`1 completion, System.Boolean sendToPipe, System.Int32 timeout, System.Boolean asyncWrite, System.String methodName) at offset 121
at System.Data.SqlClient.SqlCommand.ExecuteNonQuery() at offset 77
at Dapper.SqlMapper.ExecuteImpl(System.Data.IDbConnection cnn, Dapper.CommandDefinition& command) at offset 208
at Dapper.SqlMapper.Execute(System.Data.IDbConnection cnn, System.String sql, System.Object param, System.Data.IDbTransaction transaction, System.Nullable`1 commandTimeout, System.Nullable`1 commandType) at offset 24
at Hangfire.SqlServer.SqlServerWriteOnlyTransaction.<Commit>b__5_0(System.Data.Common.DbConnection connection, System.Data.Common.DbTransaction transaction) at offset 89
at Hangfire.SqlServer.SqlServerStorage.<>c__DisplayClass21_0.<UseTransaction>b__0(System.Data.Common.DbConnection connection, System.Data.Common.DbTransaction transaction) at offset 13
at Hangfire.SqlServer.SqlServerStorage.<>c__DisplayClass22_0`1.<UseTransaction>b__0(System.Data.Common.DbConnection connection) at offset 37
at Hangfire.SqlServer.SqlServerStorage.UseConnection[T](System.Func`2 func) at offset 9
at Hangfire.SqlServer.SqlServerStorage.UseTransaction[T](System.Func`3 func, System.Nullable`1 isolationLevel) at offset 33
at Hangfire.SqlServer.SqlServerStorage.UseTransaction(System.Action`2 action) at offset 41
at Hangfire.SqlServer.SqlServerWriteOnlyTransaction.Commit() at offset 23
at Hangfire.Client.CoreBackgroundJobFactory.Create(Hangfire.Client.CreateContext context) at offset 188
at Hangfire.Client.BackgroundJobFactory.<>c__DisplayClass7_0.<CreateWithFilters>b__0()
at Hangfire.Client.BackgroundJobFactory.InvokeClientFilter(Hangfire.Client.IClientFilter filter, Hangfire.Client.CreatingContext preContext, System.Func`1 continuation) at offset 64
at Hangfire.Client.BackgroundJobFactory.Create(Hangfire.Client.CreateContext context) at offset 77
at Hangfire.BackgroundJobClient.Create(Hangfire.Common.Job job, Hangfire.States.IState state) at offset 56
--- End of inner exception stack trace ---
at Hangfire.BackgroundJobClient.Create(Hangfire.Common.Job job, Hangfire.States.IState state) at offset 105
at Hangfire.BackgroundJobClientExtensions.Schedule(Hangfire.IBackgroundJobClient client, System.Linq.Expressions.Expression`1 methodCall, System.TimeSpan delay) at offset 14
at Hangfire.BackgroundJob.Schedule(System.Linq.Expressions.Expression`1 methodCall, System.TimeSpan delay)
at altPUG.Workers.Jobs.ProvisioningJobs.<ScheduleFollowupForCreate>d__21.MoveNext() at offset 630 in /home/altpug/agent/_work/1/s/altPUG/src/altPUG.Workers/Jobs/ProvisioningJobs.cs:line 333:col 13
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at offset 12
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task task) at offset 46
at System.Runtime.CompilerServices.TaskAwaiter.GetResult() at offset 11
at altPUG.Workers.Jobs.<>c__DisplayClass16_0.<<StartCreateServer>b__0>d.MoveNext() at offset 1919 in /home/altpug/agent/_work/1/s/altPUG/src/altPUG.Workers/Jobs/ProvisioningJobs.cs:line 169:col 29
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at offset 12
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task task) at offset 46
at System.Runtime.CompilerServices.TaskAwaiter.GetResult() at offset 11
at altPUG.<>c__DisplayClass0_0.<<RunSync>b__0>d.MoveNext() at offset 173 in /home/altpug/agent/_work/1/s/altPUG/src/altPUG/AsyncHelper.cs:line 28:col 21

This occurs relatively infrequently, but it is destabilizing our platform. Is there any other information we can provide to help diagnose this issue?
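
For reference, the failing call is a plain delayed-job scheduling call. The sketch below approximates what the stack trace shows (BackgroundJob.Schedule with a call expression and a TimeSpan delay); the class and method names are placeholders, not our actual code.

```csharp
using System;
using Hangfire;

// Placeholder job class; the real method lives in altPUG.Workers.Jobs.ProvisioningJobs.
public class ProvisioningFollowups
{
    public void CheckServerCreated(int serverId)
    {
        // Follow-up verification work would go here.
    }
}

public static class FollowupScheduler
{
    public static string ScheduleFollowup(int serverId)
    {
        // BackgroundJob.Schedule serializes the call expression and synchronously
        // writes the job and its Scheduled state to SQL Server. The
        // BackgroundJobClientException above is thrown from this call when the
        // underlying SqlClient connection fails mid-commit.
        return BackgroundJob.Schedule<ProvisioningFollowups>(
            x => x.CheckServerCreated(serverId),
            TimeSpan.FromMinutes(5));
    }
}
```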

This doesn’t look like a Hangfire issue, but rather a SqlClient-on-Linux issue. You may try another version of the System.Data.SqlClient package (if one is available).

Similar issue: https://github.com/dotnet/corefx/issues/4676

That issue is about 18 months old and has been closed. We’re using Azure SQL, and you typically see severed connections when the database has exhausted its available resources (number of connections, concurrently executing queries, DTUs). For reference, there is no indication that we are hitting the CPU, memory, or DTU limits for this database.
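
If it helps, this is roughly how that can be verified on Azure SQL: sys.dm_db_resource_stats keeps about an hour of utilization history at 15-second granularity. A quick sketch using Dapper (already present in the stack trace above); the connection string and type names are placeholders.

```csharp
using System;
using System.Data.SqlClient;
using System.Linq;
using Dapper;

public static class ResourceStatsCheck
{
    private class ResourceStat
    {
        public DateTime end_time { get; set; }
        public decimal avg_cpu_percent { get; set; }
        public decimal avg_data_io_percent { get; set; }
        public decimal avg_log_write_percent { get; set; }
        public decimal avg_memory_usage_percent { get; set; }
    }

    public static void PrintRecentPeaks(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            // sys.dm_db_resource_stats returns one row per 15 seconds for roughly
            // the last hour, expressed as percentages of the database's limits.
            var stats = connection.Query<ResourceStat>(
                @"SELECT end_time, avg_cpu_percent, avg_data_io_percent,
                         avg_log_write_percent, avg_memory_usage_percent
                  FROM sys.dm_db_resource_stats
                  ORDER BY end_time DESC;").ToList();

            Console.WriteLine($"Peak CPU:    {stats.Max(s => s.avg_cpu_percent)}%");
            Console.WriteLine($"Peak IO:     {stats.Max(s => s.avg_data_io_percent)}%");
            Console.WriteLine($"Peak log:    {stats.Max(s => s.avg_log_write_percent)}%");
            Console.WriteLine($"Peak memory: {stats.Max(s => s.avg_memory_usage_percent)}%");
        }
    }
}
```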

Hangfire is the only application hitting the database in question. Usually this exception indicates that the connection pool is oversaturated or that resource governors are being hit; since we have no control over how Hangfire is implemented internally, we’re just checking whether there is a known resource leak.
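
On the pool-saturation theory: SqlClient defaults to 100 pooled connections per connection string, and that cap can be set explicitly. A minimal sketch of the relevant knob, assuming a standard GlobalConfiguration-based Hangfire setup; the server name, credentials, and pool size below are placeholders.

```csharp
using Hangfire;

public static class HangfireStorageSetup
{
    public static void Configure()
    {
        // "Max Pool Size" bounds SqlClient's connection pool (default 100). If the
        // Hangfire server, dashboard, and job clients together exceed it, requests
        // queue up waiting for a connection and can fail under load.
        const string connectionString =
            "Server=tcp:example.database.windows.net,1433;Initial Catalog=hangfire;" +
            "User ID=hangfire;Password=<placeholder>;Encrypt=True;" +
            "Max Pool Size=200;Connection Timeout=30;";

        GlobalConfiguration.Configuration.UseSqlServerStorage(connectionString);
    }
}
```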