We use hangfire (v1.4.6) to perform recurring job every day. It fetches some data from web, parses data and stores data in sql database. So basically simple sinhronization (check if this records exists, select … if exists update record else insert new one). We are experiencing sql Timeout:
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
So we added logging to synchronization (Nlog) and found next lines:
2015-11-12 16:04:59.5020 TRACE SyncWithDatabase started.
2015-11-12 16:05:00.6582 TRACE Modified block took: 1,1562594
2015-11-12 16:05:03.1270 TRACE Updating just RecSyncDate took 2,4374999 seconds
2015-11-12 16:05:03.1270 TRACE Adding new records to list took: 0 seconds
2015-11-12 16:05:03.1270 TRACE Adding 0 records to db took: 0
2015-11-12 16:05:03.1425 TRACE SyncWithDatabase finished in 3,6405649 seconds
So far so good, everything finished in our desired time. And now this:
2015-11-12 16:07:26.6710 TRACE SyncWithDatabase started.
2015-11-12 16:07:26.9923 TRACE Modified block took: 0,3213537
2015-11-12 16:08:02.2591 DEBUG Error occurred during execution of 'Server Heartbeat' component. Execution will be retried (attempt 1 of 2147483647) in 00:00:01 seconds.
2015-11-12 16:08:02.2591 DEBUG Error occurred during execution of 'Worker #10' component. Execution will be retried (attempt 1 of 2147483647) in 00:00:00 seconds.
2015-11-12 16:08:02.0079 ERROR Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
2015-11-12 16:08:02.5991 DEBUG Error occurred during execution of 'Worker #5' component. Execution will be retried (attempt 1 of 2147483647) in 00:00:00 seconds.
2015-11-12 16:08:12.5891 TRACE Updating just RecSyncDate took 45,5967749 seconds
2015-11-12 16:08:12.5891 TRACE Adding new records to list took: 0 seconds
2015-11-12 16:08:12.5891 TRACE Adding 0 records to db took: 0
2015-11-12 16:08:12.6031 TRACE SyncWithDatabase finished in 45,9330875 seconds
I suspect that hangfire does some other work (as shown above, ERROR) while executing our task, which exceedes ms sql default sqlCommandTimeout - 30 seconds . Is this normal behaviour/by design or is there some setting or something that we can adjust, so this behaviour won’t happen anymore. As it seems from the log hangfire loses server heartbeat from the server it executes the task (we only tried to run on one server).
Yes I know: increase the default sql command timeout and everything should be fine. But why this happens? Even if I increase default timeout what would be the right value. You can’t just increase it forever.