Windows Server 2019, IIS 10, MS SQL 2017
We use Hangfire to schedule operations such as sending notifications or reloading data. Each job runs either once or, for recurring tasks, every 15 minutes.
Recently we noticed memory leaks and used the LeanSentry tool to analyze them.
Almost one third of our Gen0 memory was taken by Hangfire SqlMapping (see the attached picture).
Wow, that’s really bad. Please tell me everything needed to reproduce the behavior: what exact version of .NET Framework or .NET Core are you using? What versions of Hangfire.Core and Hangfire.SqlServer do you have? Could you show me the configuration logic related to Hangfire, as well as the background job method signatures?
I see you are starting a new background job server each time a user session starts (since this is a Session_Start method), and I’m curious how many background processing servers you have. This can quickly grow the number of servers without bound and cause a lot of problems. You should start one or more servers in the Application_Start method instead.
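To check how many servers are currently registered, something like the following could help. It is only a sketch: the HangfireDiagnostics class and CountRegisteredServers method are hypothetical names, and it assumes JobStorage.Current has already been configured (it throws otherwise).

```csharp
using System.Collections.Generic;
using Hangfire;
using Hangfire.Storage.Monitoring;

// Hypothetical helper, not part of Hangfire itself.
public static class HangfireDiagnostics
{
    // Returns the number of background job servers currently
    // registered in the configured job storage.
    public static int CountRegisteredServers()
    {
        IList<ServerDto> servers = JobStorage.Current.GetMonitoringApi().Servers();
        return servers.Count;
    }
}
```

The same list is shown on the Servers page of the Hangfire Dashboard, so the helper is only needed if you want the number in logs or diagnostics.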
Also, ensure your background job server is started only after all the configuration logic has completed. In the code I see that it’s started first, and then other configuration logic is called (modifying filters, adding a logger, etc.). In this case some background jobs may be executed without the correct filters, and some background processes may be started without any logger at all.
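A minimal sketch of that ordering in Global.asax.cs might look like the code below. The Global class name, the "connection_string" placeholder, and the AutomaticRetryAttribute filter are assumptions for illustration; substitute your own storage settings, filters, and logger.

```csharp
using System;
using System.Web;
using Hangfire;

public class Global : HttpApplication
{
    // One server for the whole application, not one per user session.
    private static BackgroundJobServer _server;

    protected void Application_Start(object sender, EventArgs e)
    {
        // 1. Configure storage first ("connection_string" is a placeholder).
        GlobalConfiguration.Configuration
            .UseSqlServerStorage("connection_string");

        // 2. Register filters (and a logger, if any) before any job can run.
        //    The retry filter here is only an example.
        GlobalJobFilters.Filters.Add(new AutomaticRetryAttribute { Attempts = 3 });

        // 3. Start the server last, once configuration is complete.
        _server = new BackgroundJobServer();
    }

    protected void Application_End(object sender, EventArgs e)
    {
        // Stop the server when the application shuts down.
        _server?.Dispose();
    }
}
```

Session_Start should then contain no Hangfire server or configuration code at all.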
Also, don’t modify the JobStorage.Current property once application initialization is complete. All the classes that use it can instead be initialized with a specific job storage:
var storage = new SqlServerStorage("connection_string");
var backgroundJobs = new BackgroundJobClient(storage);
var recurringJobs = new RecurringJobManager(storage);
var server = new BackgroundJobServer(storage);
app.UseHangfireDashboard(storage);
// and so on
If you still see the problem after fixing the configuration, consider sending a memory dump to support[at]hangfire.io for further analysis. I’m afraid the only way to see what’s going on is to use WinDbg, because I haven’t seen any other evidence of this problem.