High CPU load on production server

Hi,
No jobs are currently running on the production server, but there is high CPU load.
I’ve investigated this a bit, and it seems the high load is caused by the HangFire process calling this query:

(@key nvarchar(4000))select Value from HangFire.[Set] where [Key] = @key

HangFire is running on an MSSQL database.

Any suggestions, please?
As I said, there are no jobs running.
Thanks.

Hi, this query comes from the RecurringJobScheduler. How many recurring jobs do you have? How many of these queries do you see? Does the logging show anything strange (warning/error/fatal levels)? And what version are you using?

Hi, thanks for your reply.

I’m using HangFire version 1.4.5

I have 20 recurring jobs. HangFire had been running since the morning, and it seems the problems started after about 500 jobs were completed in roughly 1.5 hours. All jobs completed, with no exceptions or failures.

I can’t say the exact number of such queries, but here is a screenshot from the server:

It’s really urgent.
Thanks.

Which process causes the high CPU load: SQL Server or the application? (I’m assuming SQL Server, but I want to be sure.)
How many instances of the application that hosts Hangfire Server do you have? And what is the name of the highlighted column? Can you give me the entire screenshot?

Hi,
It seems that the high CPU load is caused by MSSQL blocking. On the HangFire dashboard I see from 10 to 40 different instances, with 40 workers per instance. Physically there are 2 servers.

The name of the highlighted column is “queue”.

I’ve attached a screenshot with more information.

One more note: after I modified web.config (I have a flag there that turns the HangFire worker on or off), the dashboard is inaccessible, but I see that HangFire is still sending SQL requests.

The data you’ve sent tells us that many locks are being taken, which is how the recurring job scheduler synchronizes its work (I can explain it). The CPU column is fairly low, especially compared to other processes. But you said earlier:

but there is high CPU load.

Are you sure that it is a CPU load problem?

P.S. Very nice dashboard, what is it?

Yes, you are right, the CPU column is pretty low. I think I need to gather a bit more information. Should I turn on any additional logging that could be useful?

About the dashboard: it’s the PerfExpert application.

Does your Task Manager say that CPU is high? I’m trying to understand what the problem is :)

I was switching from the previous scheduler to HangFire, and the code base for the jobs is just the same. The previous scheduler worked fine, though it was sometimes unstable.

After a couple of hours of running, when all jobs were actually done, the whole system started behaving unstably (page load time increased dramatically, among other things). There were no other changes except starting to use HangFire. This is why I came to you with the question.

But maybe I was wrong, so I will continue to investigate.
Thanks.

Do you mean your application or the dashboard? So let’s debug this.

How many instances of your application do you usually have?

Usually there are about 24 logical instances and 2 physical servers for different web sites with the same code base.

What do Task Manager and Resource Monitor show? Is there any high CPU/memory/disk usage?

24 * 40 = 960 workers make 960 queries every N seconds (15 by default) to poll for background jobs. This is a very high number unless you have prepared for such a load. You can decrease the number of workers to, for example, 5 on each instance, or offload background processing to a single instance or even a Windows service.
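
To put rough numbers on the arithmetic above, here is a small back-of-the-envelope sketch in Python. It is not Hangfire code: the function name `polling_load` is made up for illustration, and it assumes each worker issues exactly one polling query per interval, which is a simplification of what the server really does.

```python
def polling_load(instances, workers_per_instance, poll_interval_s=15.0):
    """Estimate polling pressure on the database.

    Assumption (simplified): every worker sends one polling query
    per poll interval. Returns (total_workers, queries_per_second).
    """
    total_workers = instances * workers_per_instance
    return total_workers, total_workers / poll_interval_s


# Figures taken from this thread: 24 instances, 40 workers each, 15 s interval.
scenarios = {
    "current (24 x 40)": polling_load(24, 40),
    "reduced (24 x 5)": polling_load(24, 5),
    "single instance (1 x 40)": polling_load(1, 40),
}

for name, (workers, qps) in scenarios.items():
    print(f"{name}: {workers} workers, ~{qps:.1f} polling queries/s")
```

Under these assumptions, dropping to 5 workers per instance cuts the polling queries by a factor of 8, and moving processing to a single instance cuts them by a factor of 24.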

Thanks, I will try that.

Hi, just wanted to say that I created a Windows service and everything seems stable now. Thanks for the help!
