I’m using Hangfire.Core 1.7.9, Hangfire.Pro 2.2.1 and Hangfire.Pro.Redis 2.5.5, and the following exception sometimes occurs when my Windows Service starts. The log first shows this message:
2020-11-07 20:06:24,786  INFO Hangfire.BackgroundJobServer [(null)] - Starting Hangfire Server using job storage: 'redis://no master available/360'
After that message, I see the following exception:
2020-11-07 20:06:39,849 [BackgroundServerProcess #1] DEBUG Hangfire.Processing.BackgroundExecution [(null)] - Execution loop BackgroundServerProcess:9280dbcb caught an exception and will be retried in 00:00:15
Hangfire.Pro.Redis.RedisStorageException: Connection to Redis isn't available yet, reconnect is in progress: please try again later.
   at Hangfire.Pro.Redis.RedisStorage.ThrowConnectionUnavailableException()
   at Hangfire.Pro.Redis.RedisStorage.GetDatabase()
   at Hangfire.Pro.Redis.RedisConnection.TryGetServerTime(DateTime& now, String& reason)
   at Hangfire.Pro.Redis.RedisConnection.AnnounceServer(String serverId, ServerContext context)
   at Hangfire.Server.BackgroundServerProcess.CreateServer(BackgroundServerContext context)
   at Hangfire.Server.BackgroundServerProcess.Execute(Guid executionId, BackgroundExecution execution, CancellationToken stoppingToken, CancellationToken stoppedToken, CancellationToken shutdownToken)
   at Hangfire.Server.BackgroundProcessingServer.RunServer(Guid executionId, Object state)
   at Hangfire.Processing.BackgroundExecution.Run(Action`2 callback, Object state)
What confuses me is both the “no master available” message and the subsequent exception. This is a single-instance Redis 5.0 server (running on CentOS 7), and it was available at the time, with no errors on the server side and no network issues.
While the service was trying to reconnect, I was able to connect to the Redis server (through Redis Desktop Manager) without any issues from the same machine the service was running on.
This state usually lasts for about 2-5 minutes, and then Hangfire recovers and things start working properly.
Although it is good that Hangfire manages to recover, I’d really like to figure out why this is happening and resolve it, especially because I currently need to delay execution and processing until the connection is established.
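To illustrate the kind of startup delay I mean, here is a minimal sketch rather than my exact code. `StartupGate`, `WaitUntilReady` and the `probe` delegate are hypothetical names and not part of Hangfire or Hangfire.Pro.Redis; the probe stands in for whatever storage call is used to check that the connection is up.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper (not a Hangfire API): blocks until a readiness probe
// succeeds or a timeout elapses.
public static class StartupGate
{
    // Calls `probe` repeatedly, sleeping `pollInterval` between attempts.
    // Returns true as soon as the probe returns true, false once `timeout`
    // has elapsed. An exception thrown by the probe counts as "not ready yet".
    public static bool WaitUntilReady(Func<bool> probe, TimeSpan timeout, TimeSpan pollInterval)
    {
        var elapsed = Stopwatch.StartNew();
        while (true)
        {
            try
            {
                if (probe())
                {
                    return true;
                }
            }
            catch (Exception)
            {
                // Assumption: any probe failure (for example a storage
                // exception during reconnect) just means "try again later".
            }

            if (elapsed.Elapsed >= timeout)
            {
                return false;
            }

            Thread.Sleep(pollInterval);
        }
    }
}
```

The service would call something like `StartupGate.WaitUntilReady(probe, TimeSpan.FromMinutes(5), TimeSpan.FromSeconds(5))` before enqueueing or processing anything. This only hides the symptom, which is why I would prefer to understand the root cause.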
Unfortunately, this happens only intermittently, so it is hard to reproduce on demand.
Can you please assist?