Redis Sentinel Support

redis

#1

We are trying to use Redis Sentinel for automated failover. I’ve done some testing, and while there seems to be some confusion as to whether Sentinel is supported in StackExchange.Redis (see the following posts; apparently I can only put 3 links in a single post because I’m a new user…), it appears to work well enough if you simply provide all nodes and sentinels in the connection string. Performing SENTINEL FAILOVER <mymaster> or manually killing one of the nodes automatically reconfigures the connection multiplexer.
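For reference, this is roughly the kind of test setup I mean. It is only a sketch: the host names and ports are placeholders rather than our real topology, and listing the sentinels next to the data nodes is simply what I tried, not a documented Sentinel mode of StackExchange.Redis.

```csharp
using System;
using StackExchange.Redis;

class SentinelConnectionSketch
{
    static void Main()
    {
        // Placeholder hosts: three Redis nodes (6379) and three sentinels (26379).
        var options = ConfigurationOptions.Parse(
            "redis-1:6379,redis-2:6379,redis-3:6379," +
            "redis-1:26379,redis-2:26379,redis-3:26379");

        // Keep retrying in the background instead of throwing
        // when no endpoint is reachable at connect time.
        options.AbortOnConnectFail = false;

        using (var muxer = ConnectionMultiplexer.Connect(options))
        {
            // Log topology and connection events to see what the
            // multiplexer does while a failover is in progress.
            muxer.ConfigurationChanged += (s, e) =>
                Console.WriteLine("Configuration changed: " + e.EndPoint);
            muxer.ConnectionFailed += (s, e) =>
                Console.WriteLine("Connection failed: " + e.EndPoint + " (" + e.FailureType + ")");
            muxer.ConnectionRestored += (s, e) =>
                Console.WriteLine("Connection restored: " + e.EndPoint);

            Console.WriteLine("Trigger a failover, then press Enter to exit.");
            Console.ReadLine();
        }
    }
}
```

Running this and then issuing the failover command from redis-cli shows which events fire and in what order.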

In Hangfire.Pro.Redis 2.0.3, you have to specify AllowMultipleEndPointsWithoutRedLock. With this option set, Hangfire seems to mirror the behavior of StackExchange.Redis: an exception might be thrown while the failover is in progress, but it eventually recovers:

Background job creation failed. See inner exception for details.
   at Hangfire.BackgroundJobClient.Create(Job job, IState state)
   at Hangfire.BackgroundJobClientExtensions.Create(IBackgroundJobClient client, Expression`1 methodCall, IState state)
   at Hangfire.BackgroundJobClientExtensions.Schedule(IBackgroundJobClient client, Expression`1 methodCall, TimeSpan delay)
   at Hangfire.BackgroundJob.Schedule(Expression`1 methodCall, TimeSpan delay)
   at HangfireTest.Program.Main(String[] args) in C:\Users\nlowe\Projects\se-redis-test\HangfireTest\Program.cs:line 51
-----
ProtocolFailure on EXEC
   at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
   at StackExchange.Redis.RedisTransaction.Execute(CommandFlags flags)
   at StackExchange.Redis.KeyspaceIsolation.TransactionWrapper.Execute(CommandFlags flags)
   at Hangfire.Pro.Redis.RedisConnection.CreateExpiredJob(Job job, IDictionary`2 parameters, DateTime createdAt, TimeSpan expireIn)
   at Hangfire.Client.CoreBackgroundJobFactory.Create(CreateContext context)
   at Hangfire.Client.BackgroundJobFactory.<>c__DisplayClass3.<CreateWithFilters>b__0()
   at Hangfire.Client.BackgroundJobFactory.InvokeClientFilter(IClientFilter filter, CreatingContext preContext, Func`1 continuation)
   at Hangfire.Client.BackgroundJobFactory.<>c__DisplayClass3.<>c__DisplayClass5.<CreateWithFilters>b__2()
   at Hangfire.Client.BackgroundJobFactory.CreateWithFilters(CreateContext context, IEnumerable`1 filters)
   at Hangfire.Client.BackgroundJobFactory.Create(CreateContext context)
   at Hangfire.BackgroundJobClient.Create(Job job, IState state)
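
Since job creation only fails while the failover is in progress and recovers afterwards, one way to cope on the client side is to retry the call. This is a minimal sketch, not anything Hangfire provides: FailoverRetry is a hypothetical helper, and the attempt count and delay are arbitrary.

```csharp
using System;
using System.Threading;

static class FailoverRetry
{
    // Runs 'action' and retries it when it throws, waiting a little longer
    // before each new attempt. The exception from the last attempt propagates.
    public static T Execute<T>(Func<T> action, int maxAttempts, TimeSpan baseDelay)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Linear backoff: baseDelay, 2 * baseDelay, 3 * baseDelay, ...
                Thread.Sleep(TimeSpan.FromMilliseconds(baseDelay.TotalMilliseconds * attempt));
            }
        }
    }
}
```

It would wrap the scheduling call, for example FailoverRetry.Execute(() => BackgroundJob.Schedule(() => Console.WriteLine("hello"), TimeSpan.FromMinutes(1)), 5, TimeSpan.FromSeconds(2)). One caveat: if the transaction was actually committed before the connection dropped, a retry creates a duplicate job, so this only suits jobs that are safe to run twice.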

However, in Hangfire.Pro.Redis 2.1.1, AllowMultipleEndPointsWithoutRedLock is deprecated as “Redis Cluster is fully supported”. If I kill one of the nodes with this version, I get Object Disposed exceptions:

Exception from Hangfire: Background job creation failed. See inner exception for details.
   at Hangfire.BackgroundJobClient.Create(Job job, IState state)
   at Hangfire.BackgroundJobClientExtensions.Create(IBackgroundJobClient client, Expression`1 methodCall, IState state)
   at Hangfire.BackgroundJobClientExtensions.Schedule(IBackgroundJobClient client, Expression`1 methodCall, TimeSpan delay)
   at Hangfire.BackgroundJob.Schedule(Expression`1 methodCall, TimeSpan delay)
   at HangfireTest.Program.Main(String[] args) in C:\Users\nlowe\Projects\se-redis-test\HangfireTest\Program.cs:line 51
-----
Cannot access a disposed object.
Object name: 'Hangfire@MY-HOSTNAME'.
   at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
   at StackExchange.Redis.RedisTransaction.Execute(CommandFlags flags)
   at StackExchange.Redis.KeyspaceIsolation.TransactionWrapper.Execute(CommandFlags flags)
   at Hangfire.Pro.Redis.RedisConnection.CreateExpiredJob(Job job, IDictionary`2 parameters, DateTime createdAt, TimeSpan expireIn)
   at Hangfire.Client.CoreBackgroundJobFactory.Create(CreateContext context)
   at Hangfire.Client.BackgroundJobFactory.<>c__DisplayClass3.<CreateWithFilters>b__0()
   at Hangfire.Client.BackgroundJobFactory.InvokeClientFilter(IClientFilter filter, CreatingContext preContext, Func`1 continuation)
   at Hangfire.Client.BackgroundJobFactory.<>c__DisplayClass3.<>c__DisplayClass5.<CreateWithFilters>b__2()
   at Hangfire.Client.BackgroundJobFactory.CreateWithFilters(CreateContext context, IEnumerable`1 filters)
   at Hangfire.Client.BackgroundJobFactory.Create(CreateContext context)
   at Hangfire.BackgroundJobClient.Create(Job job, IState state)

There’s also an extreme delay in job scheduling. Sometimes scheduling a job throws this exception after 30-60 seconds:

Exception from Hangfire: Background job creation failed. See inner exception for details.
   at Hangfire.BackgroundJobClient.Create(Job job, IState state)
   at Hangfire.BackgroundJobClientExtensions.Create(IBackgroundJobClient client, Expression`1 methodCall, IState state)
   at Hangfire.BackgroundJobClientExtensions.Schedule(IBackgroundJobClient client, Expression`1 methodCall, TimeSpan delay)
   at Hangfire.BackgroundJob.Schedule(Expression`1 methodCall, TimeSpan delay)
   at HangfireTest.Program.Main(String[] args) in C:\Users\nlowe\Projects\se-redis-test\HangfireTest\Program.cs:line 51
-----
Connection to Redis isn't available yet, reconnect is in progress. Please try again later.
   at Hangfire.Pro.Redis.RedisStorage.GetDatabase()
   at Hangfire.Pro.Redis.RedisConnection.CreateExpiredJob(Job job, IDictionary`2 parameters, DateTime createdAt, TimeSpan expireIn)
   at Hangfire.Client.CoreBackgroundJobFactory.Create(CreateContext context)
   at Hangfire.Client.BackgroundJobFactory.<>c__DisplayClass3.<CreateWithFilters>b__0()
   at Hangfire.Client.BackgroundJobFactory.InvokeClientFilter(IClientFilter filter, CreatingContext preContext, Func`1 continuation)
   at Hangfire.Client.BackgroundJobFactory.<>c__DisplayClass3.<>c__DisplayClass5.<CreateWithFilters>b__2()
   at Hangfire.Client.BackgroundJobFactory.CreateWithFilters(CreateContext context, IEnumerable`1 filters)
   at Hangfire.Client.BackgroundJobFactory.Create(CreateContext context)
   at Hangfire.BackgroundJobClient.Create(Job job, IState state)

Other times, no exception is thrown. Either way, the worker never picks up these jobs. Is Sentinel support planned, or do we have to move to Redis Cluster for high availability with Hangfire?


#2

Additional Resources:


#3

@odinserj, is Sentinel support on the roadmap? Is there a better support channel we should be using as Hangfire.Pro.* customers?


#4

Redis Sentinel support in StackExchange.Redis has a very long history, and it looks like nobody, including the SE.Redis library authors, knows when it will be merged upstream. I know there are some pull requests that add this feature, but I don’t know whether they are robust enough to be used by Hangfire.Pro.Redis. StackExchange.Redis itself contains some bugs related to connection resilience during network blips, especially in cloud environments. So I’m afraid that if I add support for Sentinel, I’ll drown in support tickets.

We could add experimental support for Redis Sentinel, but I currently have no plans to implement it and make it well tested. A lot of things need to be considered. One of them is the requirement to maintain a private fork of SE.Redis to support Sentinel. This is fine for the .NET Framework version of Hangfire.Pro.Redis, because the SE.Redis library is internalized there. But what do we do with .NET Core? Publish a separate package on NuGet? So there are a lot of open questions on this topic.

Regarding the delay on failover: that’s a pain point in Redis Cluster. There may be a delay in Redis itself before a new master is promoted. Another point is that, unlike with Sentinel, we can’t get any notification about the failover event, so the only way to know that the master has changed is to watch for hash slot moved errors. The StackExchange.Redis library may also add some delay.
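
To illustrate what those errors look like: when a key’s hash slot is served by another node, a cluster node replies with a redirection of the form MOVED <slot> <host:port>. Below is a minimal sketch of recognizing one. MovedError is a hypothetical helper written for this post; it is not how Hangfire.Pro.Redis or StackExchange.Redis handle redirections internally.

```csharp
using System;
using System.Globalization;

static class MovedError
{
    // Tries to parse a Redis Cluster redirection such as
    // "MOVED 3999 127.0.0.1:6381" (with or without the leading '-'
    // that marks an error reply on the wire).
    public static bool TryParse(string message, out int slot, out string endpoint)
    {
        slot = -1;
        endpoint = null;
        if (message == null) return false;

        string[] parts = message.TrimStart('-')
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length != 3 || parts[0] != "MOVED") return false;

        // The slot must consist of digits only.
        if (!int.TryParse(parts[1], NumberStyles.None, CultureInfo.InvariantCulture, out slot))
        {
            slot = -1;
            return false;
        }

        endpoint = parts[2];
        return true;
    }
}
```

A client that sees such a reply knows the slot has a new owner, which after a failover is the promoted master, and can refresh its view of the cluster before retrying.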

P.S. It’s much better to email support at hangfire.io; you’ll get faster response times. I’ve added that email to every purchase notification, but it looks like that’s not enough. I’m considering adding a Support page to the website with a detailed description of how to get support.