Deploying multiple instances with mode toggle

dashboard

#1

Hi

So I’ve been running Hangfire to process overnight jobs, but I find that when a lot of jobs are running the UI tends to lock up. I know you can run multiple instances, as long as you’re careful not to run into SQL locks from competing jobs that interact with your SQL database.

Anyway, I thought I would try deploying two instances of my code as two sites in IIS: add a “mode” toggle to the config, put some feature flags in the startup, and deploy one instance running as Dashboard only and the other as Server only (I also added an AllInOne option).
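For readers who want to try the same split, here is a minimal sketch of what such a mode toggle could look like. The OperationMode values come from the post; the ModeParser class, its method names, and the fallback to AllInOne are assumptions made for illustration, not the original implementation.

```csharp
using System;

// The three modes named in the post.
public enum OperationMode { AllInOne, Dashboard, Server }

// Hypothetical helper: turns the raw config value into a mode.
public static class ModeParser
{
    // Falls back to AllInOne when the value is missing or not a known mode.
    public static OperationMode Parse(string value)
    {
        return Enum.TryParse<OperationMode>(value, ignoreCase: true, out var mode)
               && Enum.IsDefined(typeof(OperationMode), mode)
            ? mode
            : OperationMode.AllInOne;
    }

    // True when this instance should start a Hangfire server.
    public static bool RunsServer(OperationMode mode) =>
        mode == OperationMode.AllInOne || mode == OperationMode.Server;

    // True when this instance should expose the dashboard.
    public static bool RunsDashboard(OperationMode mode) =>
        mode == OperationMode.AllInOne || mode == OperationMode.Dashboard;
}
```

Each IIS site would then carry its own config value, for example "Dashboard" on one site and "Server" on the other.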

It seems to be working OK: I can only access the dashboard on the dashboard site, I can only see one server in the Servers tab, and jobs are being processed. I’ve also got health check endpoints, so I know both parts of the app are working as expected.

The one thing I have noticed is that occasionally the server count seems to drop to 0, and then a minute later it’s back again. Is this something to be concerned about? I’m also seeing a couple of errors in the logs that I don’t think I saw before:

{"@t":"2020-06-23T16:34:26.6485052Z","@m":"An error was encountered writing imputed price ProductID: 0d072f35-ec3e-4120-8f46-cf4d30980608, EffectiveTime: 1/26/2020 12:00:00 AM, Price: 3.029, Error: Transaction (Process ID 136) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.","@i":"42bb0837","@l":"Error","@x":"System.Data.SqlClient.SqlException (0x80131904): Transaction (Process ID 136) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.\r\n   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)\r\n   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)\r\n   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)\r\n   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)\r\n   at System.Data.SqlClient.SqlCommand.CompleteAsyncExecuteReader()\r\n   at System.Data.SqlClient.SqlCommand.EndExecuteNonQueryInternal(IAsyncResult asyncResult)\r\n   at System.Data.SqlClient.SqlCommand.EndExecuteNonQuery(IAsyncResult asyncResult)\r\n   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at Dapper.SqlMapper.ExecuteImplAsync(IDbConnection cnn, CommandDefinition command, Object param) in C:\\projects\\dapper\\Dapper\\SqlMapper.Async.cs:line 678\r\n   at Pricing.Scheduler.ImputedPrices.Data.Store(ImputedPrice imputedPrice, SqlConnection connection)\r\n   at 
Pricing.Scheduler.ImputedPrices.Data.Store(IEnumerable`1 imputedPrices)\r\n   at Pricing.Scheduler.ImputedPrices.ImputedPricesSync.WritePrice(ImputedPrice price)\r\nClientConnectionId:65e64d50-4ec7-4555-a29a-6128ba640e7c\r\nError Number:1205,State:51,Class:13","ExceptionDetail":{"Data":{"HelpLink.ProdName":"Microsoft SQL Server","HelpLink.ProdVer":"13.00.4604","HelpLink.EvtSrc":"MSSQLServer","HelpLink.EvtID":"1205","HelpLink.BaseHelpUrl":"https://go.microsoft.com/fwlink","HelpLink.LinkId":"20476"},"HResult":-2146232060,"Message":"Transaction (Process ID 136) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.","Source":"Core .Net SqlClient Data Provider","Errors":[{"Source":"Core .Net SqlClient Data Provider","Number":1205,"State":51,"Class":13,"Server":"qa-vm-sql01","Message":"Transaction (Process ID 136) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.","Procedure":"UpdateSummaryDataOwnProduct","LineNumber":102}],"ClientConnectionId":"65e64d50-4ec7-4555-a29a-6128ba640e7c","Class":13,"LineNumber":102,"Number":1205,"Procedure":"UpdateSummaryDataOwnProduct","Server":"qa-vm-sql01","State":51,"ErrorCode":-2146232060,"Type":"System.Data.SqlClient.SqlException"},"MemoryUsage":118611064,"OperationId":"a21c147a-e748-4886-bcf6-749cf9401d06","Method":"Process","Class":"ImputedPricesSync","TenantId":"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx","MachineName":"QA-VM-APP01","ThreadId":5,"ProcessId":17432,"ProcessName":"w3wp","AssemblyName":"pricing.scheduler","AssemblyVersion":"1.0.0.0"}
{"@t":"2020-06-23T16:34:26.6516074Z","@m":"error encountered during imputed prices load","@i":"2d5d4151","@l":"Error","@x":"System.Data.SqlClient.SqlException (0x80131904): Transaction (Process ID 136) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.\r\n   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)\r\n   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)\r\n   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)\r\n   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)\r\n   at System.Data.SqlClient.SqlCommand.CompleteAsyncExecuteReader()\r\n   at System.Data.SqlClient.SqlCommand.EndExecuteNonQueryInternal(IAsyncResult asyncResult)\r\n   at System.Data.SqlClient.SqlCommand.EndExecuteNonQuery(IAsyncResult asyncResult)\r\n   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at Dapper.SqlMapper.ExecuteImplAsync(IDbConnection cnn, CommandDefinition command, Object param) in C:\\projects\\dapper\\Dapper\\SqlMapper.Async.cs:line 678\r\n   at Pricing.Scheduler.ImputedPrices.Data.Store(ImputedPrice imputedPrice, SqlConnection connection)\r\n   at Pricing.Scheduler.ImputedPrices.Data.Store(IEnumerable`1 imputedPrices)\r\n   at Pricing.Scheduler.ImputedPrices.ImputedPricesSync.WritePrice(ImputedPrice price)\r\n   at 
Pricing.Scheduler.ImputedPrices.ImputedPricesSync.<>c.<<ReceiveMessages>b__10_0>d.MoveNext()\r\nClientConnectionId:65e64d50-4ec7-4555-a29a-6128ba640e7c\r\nError Number:1205,State:51,Class:13","ExceptionDetail":{"Data":{"HelpLink.ProdName":"Microsoft SQL Server","HelpLink.ProdVer":"13.00.4604","HelpLink.EvtSrc":"MSSQLServer","HelpLink.EvtID":"1205","HelpLink.BaseHelpUrl":"https://go.microsoft.com/fwlink","HelpLink.LinkId":"20476"},"HResult":-2146232060,"Message":"Transaction (Process ID 136) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.","Source":"Core .Net SqlClient Data Provider","Errors":[{"Source":"Core .Net SqlClient Data Provider","Number":1205,"State":51,"Class":13,"Server":"qa-vm-sql01","Message":"Transaction (Process ID 136) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.","Procedure":"UpdateSummaryDataOwnProduct","LineNumber":102}],"ClientConnectionId":"65e64d50-4ec7-4555-a29a-6128ba640e7c","Class":13,"LineNumber":102,"Number":1205,"Procedure":"UpdateSummaryDataOwnProduct","Server":"qa-vm-sql01","State":51,"ErrorCode":-2146232060,"Type":"System.Data.SqlClient.SqlException"},"MemoryUsage":118856144,"OperationId":"a21c147a-e748-4886-bcf6-749cf9401d06","Method":"Process","Class":"ImputedPricesSync","TenantId":"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx","MachineName":"QA-VM-APP01","ThreadId":5,"ProcessId":17432,"ProcessName":"w3wp","AssemblyName":"pricing.scheduler","AssemblyVersion":"1.0.0.0"}

So it seems like I’m getting two things trying to process the same job?
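The log entries name SQL Server error 1205 raised from the procedure UpdateSummaryDataOwnProduct, and the message itself says to rerun the transaction. A deadlock means two transactions competed for the same resources; on its own it does not show that one job was picked up twice. Below is a minimal sketch of a retry wrapper, assuming the write is safe to repeat. DeadlockRetry, the attempt count, and the delay are hypothetical names and values, not part of Hangfire or Dapper.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: reruns an operation when the caller says the failure is retryable.
public static class DeadlockRetry
{
    public static async Task<T> ExecuteAsync<T>(
        Func<Task<T>> operation,
        Func<Exception, bool> isRetryable,
        int maxAttempts = 3,
        int baseDelayMs = 200)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex) when (attempt < maxAttempts && isRetryable(ex))
            {
                // Wait a little longer after each failed attempt before retrying.
                await Task.Delay(baseDelayMs * attempt);
            }
            // On the last attempt, or for other errors, the exception propagates.
        }
    }
}
```

With System.Data.SqlClient, the predicate could be `ex => ex is SqlException sql && sql.Number == 1205`, so that only deadlock victims are retried.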

This is the code change I made: basically just putting if blocks around the dashboard and server setup in the startup and using my mode flag to toggle them for each site.

    if (HostConfiguration.Mode == OperationMode.AllInOne || HostConfiguration.Mode == OperationMode.Server)
    {
        app.UseHangfireServer(new BackgroundJobServerOptions
        {
            ServerCheckInterval = TimeSpan.FromSeconds(10),
            HeartbeatInterval = TimeSpan.FromSeconds(10),
            ServerTimeout = TimeSpan.FromSeconds(15),
            WorkerCount = Environment.ProcessorCount * HostConfiguration.WorkerCountPerCore,
            ServerName = "Pricing Scheduler",
            Queues = new[] {"default"}
        });
    }

    if (HostConfiguration.Mode == OperationMode.AllInOne || HostConfiguration.Mode == OperationMode.Dashboard)
    {
        app.UseHangfireDashboard("/hangfire", new DashboardOptions
        {
            AppPath = HostConfiguration.WebAppUrl.GetRelativePath(),
            Authorization = new List<IDashboardAuthorizationFilter>
            {
                app.ApplicationServices.GetRequiredService<IAuthenticationProvider>().GetAuthorizationFilter()
            },
            StatsPollingInterval = 1000
        });
    }
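One thing worth checking in the options above: ServerTimeout (15 seconds) is only slightly longer than HeartbeatInterval (10 seconds), so a single heartbeat delayed by database load could be enough for the watchdog to treat the server as dead and remove it until it registers again. That would match a server count that drops to 0 and then comes back, although this is a guess and not something confirmed in this thread. A sketch with a wider margin follows; the specific values and the ServerOptionsSketch class are illustrative assumptions.

```csharp
using System;
using Hangfire;

// Hypothetical factory for server options; requires the Hangfire.Core package.
public static class ServerOptionsSketch
{
    public static BackgroundJobServerOptions Create(int workerCount)
    {
        return new BackgroundJobServerOptions
        {
            // Heartbeat stays frequent so the dashboard updates quickly.
            HeartbeatInterval = TimeSpan.FromSeconds(10),
            // The watchdog checks for dead servers less often than they heartbeat.
            ServerCheckInterval = TimeSpan.FromSeconds(30),
            // The timeout spans several heartbeats, so one late heartbeat is tolerated.
            ServerTimeout = TimeSpan.FromMinutes(1),
            WorkerCount = workerCount,
            Queues = new[] { "default" }
        };
    }
}
```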

Any help would be much appreciated.

Andy


#2

I think the main question is how you are running the server. Is it still running in a website?
When you run the Hangfire server in IIS as a website, you will see it being recycled and such all the time.

Personally I would prefer to host the Hangfire server in a Windows service (Topshelf makes it easy), since I cannot for the life of me figure out how to set up IIS to run the Hangfire server reliably in a website project.

The only advantage I know of to running it in a website project is the easy deployment with Visual Studio and IIS.
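As a rough illustration of the Windows service route, here is a minimal sketch using Topshelf and Hangfire with SQL Server storage. It assumes the Hangfire.Core, Hangfire.SqlServer, and Topshelf packages are installed; the HangfireService class, the service names, and the connection string value are placeholders, not taken from this thread.

```csharp
using System;
using Hangfire;
using Topshelf;

// Hypothetical service wrapper around the Hangfire server.
public class HangfireService
{
    private BackgroundJobServer _server;

    public void Start()
    {
        // Creating the server starts processing jobs from the configured storage.
        _server = new BackgroundJobServer();
    }

    public void Stop()
    {
        // Disposing signals the workers to stop and waits for them to shut down.
        _server?.Dispose();
    }
}

public static class Program
{
    public static void Main()
    {
        // Placeholder value: use the same storage the dashboard site points at.
        GlobalConfiguration.Configuration.UseSqlServerStorage("HangfireConnection");

        HostFactory.Run(x =>
        {
            x.Service<HangfireService>(s =>
            {
                s.ConstructUsing(name => new HangfireService());
                s.WhenStarted(svc => svc.Start());
                s.WhenStopped(svc => svc.Stop());
            });
            x.RunAsLocalSystem();
            x.SetServiceName("PricingSchedulerHangfire");
            x.SetDisplayName("Pricing Scheduler Hangfire Server");
        });
    }
}
```

The dashboard can stay in the IIS site; only the job processing moves into the service, so an app pool recycle no longer interrupts running jobs.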


#3

I run it in IIS as a website, but I have things in place to ping a health endpoint which keeps it alive.
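A keep-alive ping like this can be as simple as a small loop that requests the health endpoint on a fixed interval. The sketch below is one possible shape, run from a separate process or scheduler; the KeepAlive class, the 10 second request timeout, and the URL and interval passed by the caller are all assumptions for illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical pinger: requests a health URL until the token is cancelled.
public static class KeepAlive
{
    public static async Task RunAsync(Uri healthUrl, TimeSpan interval, CancellationToken token)
    {
        using (var client = new HttpClient { Timeout = TimeSpan.FromSeconds(10) })
        {
            while (!token.IsCancellationRequested)
            {
                try
                {
                    using (var response = await client.GetAsync(healthUrl, token))
                    {
                        Console.WriteLine($"{DateTime.UtcNow:o} status {(int)response.StatusCode}");
                    }
                    await Task.Delay(interval, token);
                }
                catch (HttpRequestException ex)
                {
                    // The site may be mid-recycle; log it and try again shortly.
                    Console.WriteLine($"{DateTime.UtcNow:o} ping failed: {ex.Message}");
                    await Task.Delay(TimeSpan.FromSeconds(1));
                }
                catch (OperationCanceledException)
                {
                    // Either the caller cancelled (stop) or the request timed out (keep going).
                    if (token.IsCancellationRequested) break;
                    Console.WriteLine($"{DateTime.UtcNow:o} ping timed out");
                }
            }
        }
    }
}
```

Calling `KeepAlive.RunAsync(new Uri("https://example.invalid/health"), TimeSpan.FromMinutes(1), token)` would ping once a minute; the URL here is a placeholder. Pinging only helps against idle shutdown, so scheduled app pool recycles would still need to be handled in the IIS settings.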