Hi,
Currently no jobs are running on the production server, but the CPU load is high.
I've investigated this a bit, and it seems the high load is caused by the HangFire process calling this query:
(@key nvarchar(4000))select Value from HangFire.[Set] where [Key] = @key
HangFire is using an MSSQL database for storage.
Any suggestions, please?
As I said, there are no jobs running.
Thanks.
Hi, this query comes from the RecurringJobScheduler. How many recurring jobs do you have? How many of these queries do you see? Does the logging tell you anything strange (warning/error/fatal levels)? And what version are you using?
I have 20 recurring jobs. HangFire had been running since the morning, and it seems the problems started after 500 jobs had completed in about 1.5 hours. All jobs completed, with no exceptions or failures.
I can't say the exact number of such queries, but here is a screenshot from the server:
Which process causes the high CPU load: SQL Server or the application (I'm assuming SQL Server, but I want to be sure)?
How many instances of the application that hosts Hangfire Server do you have? And what is the name of the highlighted column? Can you give me the entire screenshot?
Hi,
It seems that the high CPU load is caused by MSSQL blocking. On the HangFire dashboard I see from 10 to 40 different instances with 40 workers per instance. Physically there are 2 servers.
One more note: after I modified web.config (I have a flag there to enable or disable the HangFire worker), the dashboard is inaccessible, but I see that HangFire is still sending SQL requests.
The data you’ve sent tells us that many locks are being taken, and this is how the recurring job scheduler synchronizes its work (I can explain it). The CPU column is fairly low, especially compared to other processes. But you’ve said earlier that
Yes, you are right, the CPU column is pretty low. I think I need to gather a bit more information. Should I turn on any additional logging that could be useful?
I was switching from a previous scheduler to HangFire, and the code base for the jobs is just the same. The previous scheduler worked fine, though it was sometimes unstable.
After a couple of hours of running, when all jobs were actually done, the whole system started behaving unstably (page load time increased dramatically, among other things). There were no other changes except starting to use HangFire. This is why I came to you with the question.
But maybe I was wrong, so I will continue to investigate.
Thanks
24 * 40 = 960 workers make 960 queries every N seconds (15 by default) to poll for background jobs. This is a very high number unless you are prepared for such a high load. You can decrease the worker count to, for example, 5 on each instance, or offload background processing to a single instance or even a Windows service.
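To put rough numbers on that, here is a small back-of-the-envelope sketch in Python. It is not Hangfire code; the function name is made up for illustration, and it assumes the simple model described above, where every idle worker issues one polling query per interval.

```python
def polls_per_second(instances: int, workers_per_instance: int,
                     poll_interval_s: float) -> float:
    """Estimate steady-state polling queries per second against the job storage.

    Assumption: each idle worker issues exactly one polling query
    per poll interval, as described in the post above.
    """
    total_workers = instances * workers_per_instance
    return total_workers / poll_interval_s


# Figures from this thread: 24 instances, 40 workers each, 15 s interval.
current = polls_per_second(24, 40, 15)

# Same instances with the suggested 5 workers each.
reduced = polls_per_second(24, 5, 15)

print(f"current: {current:.0f} queries/s, reduced: {reduced:.0f} queries/s")
```

Under these assumptions, dropping to 5 workers per instance cuts the idle polling traffic by a factor of 8, before even considering moving the processing to a single instance.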