High CPU load on production server

Igor · September 4, 2015, 12:17pm

Hi,
Currently no jobs are running on the production server, but there is high CPU load.
I’ve investigated this a bit, and it seems the high load is caused by the HangFire process calling this query:

(@key nvarchar(4000))select Value from HangFire.[Set] where [Key] = @key

HangFire is running on an MSSQL database.

Any suggestions, please?
As I said, there are no jobs running.
Thanks.

odinserj · September 4, 2015, 1:20pm

Hi, this query comes from the RecurringJobScheduler. How many recurring jobs do you have? How many of these queries do you see? Does logging show anything strange (warning/error/fatal levels)? And what version are you using?

Igor · September 4, 2015, 1:40pm

Hi, thanks for your reply.

I’m using HangFire version 1.4.5

I have 20 recurring jobs. HangFire had been working since morning, and it seems the problems started after 500 jobs were done in about 1.5 hours. All jobs completed, with no exceptions or failures.

I can’t say the exact number of such queries, but here is a screenshot from the server:

It’s really urgent.
Thanks.

odinserj · September 4, 2015, 2:02pm

Which process causes the high CPU load: SQL Server or the application (I’m assuming SQL Server, but want to be sure)?
How many instances of the application that hosts Hangfire Server do you have? And what is the name of the highlighted column? Can you give me the entire screenshot?

Igor · September 4, 2015, 2:14pm

Hi,
It seems that the high CPU load is caused by MSSQL blocking. On the HangFire dashboard I see from 10 to 40 different instances with 40 workers per instance. Physically there are 2 servers.

The name of the highlighted column is “queue”.

Screenshot attached with more information.

Igor · September 4, 2015, 2:26pm

One more note: after I modified web.config (I have a flag there to enable or disable the HangFire worker), the dashboard is inaccessible, but I see that HangFire is still sending SQL requests.

odinserj · September 4, 2015, 2:28pm

The data you’ve sent tells us that many locks are being taken, which is how the recurring job scheduler synchronizes its work (I can explain it). The CPU column is fairly low, especially compared to other processes. But you said earlier that

but there is high CPU load.

Are you sure that it is a CPU load problem?

P.S. Very nice dashboard, what is it?

Igor · September 4, 2015, 2:39pm

Yes, you are right, the CPU column is pretty low. I think I need to gather a bit more information. Should I turn on any additional logging that could be useful?

Igor · September 4, 2015, 2:43pm

About the dashboard: it’s the PerfExpert application.

odinserj · September 4, 2015, 2:44pm

Does your task manager say that CPU is high? I’m trying to understand what the problem is.

Igor · September 4, 2015, 2:57pm

I was switching from a previous scheduler to HangFire, and the code base for the jobs is exactly the same. The previous scheduler worked fine, though it was sometimes unstable.

After a couple of hours of running, when all jobs were actually done, the whole system started behaving unstably (page load time increased dramatically, among other things). There were no other changes except starting to use HangFire. This is why I came to you with the question.

But maybe I was wrong, so I will continue to investigate.
Thanks

odinserj · September 4, 2015, 3:01pm

Do you mean your application or the dashboard? So let’s debug this.

odinserj · September 4, 2015, 3:03pm

How many instances of your application do you usually have?

Igor · September 4, 2015, 3:13pm

Usually there are about 24 logical instances and 2 physical servers for different web sites with the same code base.

odinserj · September 4, 2015, 3:15pm

What do task manager and resource monitor show? Is there any high CPU/memory/disk usage?

odinserj · September 4, 2015, 3:18pm

24 * 40 = 960 workers make 960 queries every N seconds (15 by default) to poll for background jobs. This is a very high number unless you have prepared for such a load. You can decrease the worker count to, for example, 5 on each instance, or offload background processing to a single instance or even a Windows service.
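
To make the arithmetic above concrete, here is a rough sketch in Python. It assumes, as described in this post, that each worker issues one polling query per interval; the function name and the three scenarios are illustrative only and are not part of Hangfire.

```python
def polling_queries_per_second(instances, workers_per_instance, poll_interval_seconds):
    """Estimate the steady-state polling rate against the job storage.

    Assumption: every worker sends exactly one polling query per interval.
    """
    total_workers = instances * workers_per_instance
    return total_workers / poll_interval_seconds


# Setup from this thread: 24 instances, 40 workers each, 15 s poll interval.
current = polling_queries_per_second(24, 40, 15)

# Suggested alternative 1: reduce to 5 workers on each instance.
reduced = polling_queries_per_second(24, 5, 15)

# Suggested alternative 2: one dedicated instance (e.g. a Windows service),
# here hypothetically keeping 40 workers.
single = polling_queries_per_second(1, 40, 15)

print(f"current: {current:.1f} queries/s")
print(f"5 workers per instance: {reduced:.1f} queries/s")
print(f"single instance: {single:.1f} queries/s")
```

Under these assumptions the current setup polls about 64 times per second even when no jobs are queued, versus about 8 per second with 5 workers per instance.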

Igor · September 4, 2015, 3:20pm

Thanks, I will try that.

Igor · September 15, 2015, 8:32am

Hi, just wanted to say that I created a Windows service and everything seems stable now. Thanks for the help!
