Why should job arguments be small and simple?

Hi,
I found this article in the documentation: http://docs.hangfire.io/en/latest/best-practices.html#make-job-arguments-small-and-simple

It states that job arguments are serialized and should be small and simple. Looking into the DB, the Job.Arguments field is NVARCHAR(MAX), so there seems to be no hard limit there.

I understand that the arguments are converted into a string, but why would a large string be an issue? Are there technical limits, or could there be a performance problem?

Saving the data in a separate table and referencing it only by ID seems to me to have the same performance impact, as the data has to travel to the DB anyway…

Thanks in advance,
Regards,
Lukas

Small: different storages have different limits, so your code may work with one storage but unexpectedly fail with another. Keeping parameters small ensures they work with any storage type.

Simple: because they’re saved to the database, all parameters’ types must be JSON-serializable. This is typically not an issue for simple types, but may be a problem for complex ones. Read-only properties, non-default constructors, etc. may require special handling (or even custom formatters) for the types to be successfully serialized and deserialized.

Both “small” and “simple” here are actually quite abstract. They don’t mean the job must have a few primitive args at most: you can easily have a few-megabyte array of structures as a parameter (given it is JSON-serializable), and it would be totally fine.
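To make the “simple” part a bit more concrete, here is a minimal sketch, assuming the default Newtonsoft.Json-based argument serialization; the DTO types below are made up for illustration and are not part of Hangfire:

```csharp
// A rough sketch of what "simple" (JSON-serializable) means in practice,
// assuming the default Newtonsoft.Json-based argument serialization.
// The type names here are illustrative, not part of Hangfire.
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

// Plain DTO: settable properties and a parameterless constructor,
// so it round-trips through JSON with no extra configuration.
public class EmailBatchDto
{
    public List<string> Addresses { get; set; } = new List<string>();
    public string TemplateName { get; set; } = "";
}

// "Complex" type: read-only properties plus a non-default constructor.
// Newtonsoft.Json can match constructor parameters to property names here,
// but once you add a second constructor or rename a parameter you may
// need [JsonConstructor] or a custom converter for it to deserialize.
public class ReportRequest
{
    public ReportRequest(Guid reportId, DateTime requestedAt)
    {
        ReportId = reportId;
        RequestedAt = requestedAt;
    }

    public Guid ReportId { get; }
    public DateTime RequestedAt { get; }
}

public static class SerializationDemo
{
    public static void Main()
    {
        var dto = new EmailBatchDto
        {
            Addresses = { "a@example.com", "b@example.com" },
            TemplateName = "welcome"
        };

        // Roughly what ends up in the Job.Arguments column.
        string json = JsonConvert.SerializeObject(dto);
        var roundTripped = JsonConvert.DeserializeObject<EmailBatchDto>(json);

        Console.WriteLine(json);
        Console.WriteLine(roundTripped.Addresses.Count); // 2
    }
}
```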

Another reason to keep parameters small and simple is the Dashboard. Large and complex arguments might not affect job execution, but they will definitely make the job’s page heavy and unreadable.


Great, thanks a lot for your fast reply. This makes perfect sense to me now!

I forgot about the dashboard UI aspect; that could indeed be an issue.

I am using only simple DTO classes that are easy to serialize and deserialize, so no problem there for me.

About the size: good point. For now I am using MS SQL, where it works, but it is not future-proof if another developer later swaps out the storage technology.

I need to think a bit about whether it is worth creating an extra table for storing the parameters now, or whether I should just do it the simple way :slight_smile:

The pattern we seem to implement consistently is: any time you’re going to have more than one BackgroundJob working on the same “object”, you’re better off storing that object in a table and passing only the primary key into the BackgroundJob.

You have to keep in mind that BackgroundJobs may fail and be executed again. On the retry execution(s), it’s better for the BackgroundJob to look up the data and get a “real-time” understanding of the state rather than relying on the potentially stale state passed in as an input.

We also generally add auditing timestamps to the saved state, to understand how much work has been completed, which really aids in troubleshooting.
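In rough C# terms the pattern looks something like this; only `BackgroundJob.Enqueue` is Hangfire’s API, while `EmailBatch`, `IEmailBatchStore` and `SendBatchJob` are illustrative placeholders for your own code, so treat it as a sketch rather than a drop-in implementation:

```csharp
// Sketch of the "store the object, pass only the primary key" pattern.
// Only BackgroundJob.Enqueue is Hangfire's API; EmailBatch, IEmailBatchStore
// and SendBatchJob are made-up names for your own code.
using System;
using Hangfire;

public class EmailBatch
{
    public int Id { get; set; }
    public string[] Addresses { get; set; }
    public string Status { get; set; }           // e.g. "Pending", "Sending", "Done"
    public DateTime? StartedAtUtc { get; set; }   // audit timestamps for troubleshooting
    public DateTime? CompletedAtUtc { get; set; }
}

public interface IEmailBatchStore
{
    int Save(EmailBatch batch);   // returns the primary key
    EmailBatch Load(int id);
    void Update(EmailBatch batch);
}

public class SendBatchJob
{
    private readonly IEmailBatchStore _store;
    public SendBatchJob(IEmailBatchStore store) => _store = store;

    // Only the primary key travels through Job.Arguments.
    public void Run(int batchId)
    {
        // On every execution (including retries) the job re-reads the
        // current state instead of trusting a possibly stale snapshot.
        var batch = _store.Load(batchId);
        if (batch.Status == "Done")
            return; // a retry can skip work that already finished

        batch.Status = "Sending";
        batch.StartedAtUtc = batch.StartedAtUtc ?? DateTime.UtcNow;
        _store.Update(batch);

        foreach (var address in batch.Addresses)
        {
            // ... send the email to this address ...
        }

        batch.Status = "Done";
        batch.CompletedAtUtc = DateTime.UtcNow;
        _store.Update(batch);
    }
}

public static class EnqueueExample
{
    public static void EnqueueBatch(IEmailBatchStore store, string[] addresses)
    {
        // 1. Persist the payload yourself and keep the returned key.
        int id = store.Save(new EmailBatch { Addresses = addresses, Status = "Pending" });

        // 2. Enqueue the job with just that key as its argument.
        BackgroundJob.Enqueue<SendBatchJob>(job => job.Run(id));
    }
}
```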

If you’re doing fire-and-forget, no-retry processing, maybe this doesn’t matter, but at least for us, that’s maybe 2% of the code we write for Hangfire.


You mentioned an interesting point to consider: data can become stale over time. It depends on what the job should do. In my case I pass the job a list of email addresses to which something should be sent, so there is nothing that could change over time. But there are certainly a lot of scenarios where the data should be as fresh as possible.
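For reference, the “simple way” in my case is just passing the list straight through the job arguments; a minimal sketch, where `IEmailSender` and `SendNewsletter` are made-up names and not Hangfire API:

```csharp
// The addresses won't change over time, so they can be passed directly
// as a job argument instead of going through an extra table.
// IEmailSender and SendNewsletter are illustrative names, not Hangfire API.
using System.Collections.Generic;
using Hangfire;

public interface IEmailSender
{
    void SendNewsletter(List<string> addresses);
}

public static class DirectEnqueueExample
{
    public static void Enqueue(List<string> addresses)
    {
        // The whole list is JSON-serialized into Job.Arguments.
        BackgroundJob.Enqueue<IEmailSender>(s => s.SendNewsletter(addresses));
    }
}
```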

There was an interesting thread here regarding job serialization size when you have 260k jobs queued. When you have that many jobs, every optimization counts :slight_smile:

https://github.com/HangfireIO/Hangfire/issues/896