How to reduce hangfire db table size for a running live system

sql-server
Tags: #<Tag:0x00007f64b6a7d178>

#1

We are running Hangfire on an Azure SQL DB. The size of the database grows dramatically. Especially the tables Set, State, Job and Hash consume 30 GB. Question: Ist there any script or strategy to delete data from old jobs? I saw some scripts to truncate the tables. But I guess truncating data in a live prod system might not be a good strategy.
BTW: Is there any db reference guide explaining the tables in details?


#2

Successful job state data should be cleaned out of the database after 24h, but default strategy for error data is to keep it around forever (I believe).

Are you passing a large quantity of data to each BackgroundJob or failing hundreds of thousands of Jobs?


#3

Well I was able to clean alot of old Jobs that were obsolete. Most tables now are much smaller and everything is still working. But I see still 20 GB data in the table set.

From my understanding, when I now select them by…

SELECT CAST(SUBSTRING([Key], 19, LEN([Key]) - 19) AS int) as SetJobId, *
FROM [HangFire].[Set]
WHERE [Key] LIKE ‘console:%’

I would get the referening jobids. As a result I find 3801 distinct jobids. When I now join them with the Job table I see that only 2 jobids do exists. BTW: These jobsids were also not part of the Jobs I deleted before.

This would mean that 8.7 millions records in the table Set do refer to a Job that doesnt exist anymore.
So my question is: Is the query above really selecting the jobids in the set table correctly?
And if yes: Then I am able to delete these entries. Right?

And: Is there any db reference guide for this db? I looked into the source but didnt find any code that would add console records for this. From my understanding the table Set includes all the console Output that can be seen in the webapp. We dont need them since we are saving console output of our jobs/scripts in an own table.


#4

The Set table gets used for a couple things and as far as I can tell it’s basically a key/value look up and not every row in Set will join to the Job Id. It should be pretty easy to tell from the prefix of the Key whether or not the Key is related to a Job.

If you can thin down the entries in the Set table to ones you are sure are dead Jobs then you can safely delete them. I’d definitely take a backup ahead of time.

I don’t know of any database documentation, but it’s just Entity Framework on top of 11 tables.


#5

I don’t know of any database documentation, but it’s just Entity Framework on top of 11 tables.
<<<
Rofl. Yes it is not that big. But there is no clear description for example how all these Key like ‘console:%’ records are getting into the table.

The description of the table structure is not the problem. It is the sense and the process with/for the tables. Yes, I can scan the source. But even in the source no code like "INSERT INTO SET … Key= ‘console…’ can be found.

I find AddToSet, RemoveFromSet, GetRangeFromSet, Get…FromSet. But all calling Set with KEY = awaiting, retries, schedule, recurring-jobs. None with ‘console:…’.

On the other side I found source here in the forum that

CAST(SUBSTRING([Key], 19, LEN([Key]) - 19) AS int)

means the Jobid in case of

WHERE [Key] LIKE ‘console:%’

The funny thing at the end: Selecting this on the db side, most jobid of this result are not existing in the Job table.

It might not be an error. If I would get an explanation for the meaning and process for Set.

Why? Well Set is blowing up and getting larger and larger. 9 mio console records of of mean ca 11 GB of data. Our System is running alot of Jobs per day. So I now killed 33% of them. Fortunately the system stays stabil.

Maybe my point of view on Hangfire is more a kind of minority. So let me ask a more simple Question:

Is it possible to set a configuration in Hangfire so that hangfire will not add any “console…” record into the table Set? To be honest: I dont need it. I only need it when ps1 files are executed by hangfire. But here it is not hangfire who is doing it. For this I have a small app connecting to hangfire, looking for ps1 files and execute them. And this app adds the console into an own table.


#6

The Hangfire.Console package will add the “console:%” entries to the set table. https://github.com/pieceofsummer/Hangfire.Console/blob/master/src/Hangfire.Console/Storage/ConsoleIdExtensions.cs

I don’t have any experience with this package and you have very possibly encountered a bug with regards to it’s cleanup, or perhaps are missing some setup information. Is it possible that your Hangfire Jobs are running some code that is dumping massive amounts of information to the console?