Hello,
We have a paid, licensed version of Hangfire and are facing a scalability issue when using multiple background job servers to split the work (jobs) efficiently among them. We use MS SQL Server storage and 3 priority queues (Low, Normal, High) per job type.
For example, we have indexing, mass-delete, etc. agents (an agent is a console application hosting one BackgroundJobServer instance that processes a particular kind of job and its queues). So the Hangfire queues for our job types look like this:
Low_Indexing, Normal_Indexing, High_Indexing
Low_MassDelete, Normal_MassDelete, High_MassDelete
Each agent console application is configured to process the queues of a particular job type (e.g. IndexingAgent). So far so good. The problem starts when we run multiple instances of this application on different machines (pods): it seems that only one agent console app takes over most of the workload. So even with 5 Indexing Agents running in parallel against the same x_Indexing queues, only one of them is really utilized, processing several hundred jobs of the same type.
My question: does Hangfire support some proportional load balancing of jobs among the currently running background job server instances, or does it have to be implemented by us? I don’t know what is happening under the hood. I assume the server fetches a batch of jobs and then starts to process them; the question is why the other agent consoles cannot fetch the next batch and work on it in parallel on another machine. Creating a dedicated queue for each console app (machine or pod) seems like a poor solution and doesn’t scale well either.
The only explicit settings we use are Queues and WorkerCount on BackgroundJobServerOptions: the former is the 3 priority queues mentioned above, the latter is currently Environment.ProcessorCount * 5. Roughly, each agent is hosted like the sketch below.
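For reference, here is a minimal sketch of how one of our agents (the indexing one) hosts its server; the queue names, connection string, and console scaffolding are illustrative rather than our exact production code:

```csharp
using System;
using Hangfire;
using Hangfire.SqlServer;

internal static class Program
{
    private static void Main()
    {
        // Illustrative connection string; all agents point at the same Hangfire SQL Server storage.
        GlobalConfiguration.Configuration
            .UseSqlServerStorage("Server=...;Database=HangfireDb;Integrated Security=True");

        var options = new BackgroundJobServerOptions
        {
            // Queues for this job type only, listed from highest to lowest priority.
            Queues = new[] { "high_indexing", "normal_indexing", "low_indexing" },
            WorkerCount = Environment.ProcessorCount * 5
        };

        // Several copies of this executable run on different pods against the same queues.
        using (var server = new BackgroundJobServer(options))
        {
            Console.WriteLine("Indexing agent started. Press ENTER to exit.");
            Console.ReadLine();
        }
    }
}
```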