asyncio redis cluster: fixed reconnection when whole cluster goes down #3111

Closed
adaamz wants to merge 1 commit from asyncio_redis_cluster_reconnect

Conversation

adaamz
@adaamz adaamz commented Jan 14, 2024

Pull Request check-list

  • Do tests and lints pass with this change?
  • Do the CI tests pass with this change (enable it first in your forked repo and wait for the github action build to finish)?
  • Is the new or changed code fully tested?
  • Is a documentation update included (if this change modifies existing APIs, or introduces new ones)?
  • Is there an example added to the examples folder (if applicable)?
  • Was the change added to CHANGES file?

Description of change

When a Redis cluster goes down completely (all nodes are offline), the redis-py library is unable to reconnect to it, even once at least the startup nodes are back online.
This workaround worked for me, but I'm not sure whether there is a more efficient way to achieve the same fix.

I tested it in isolation by spawning a minimal Redis cluster in Docker, then taking it down and bringing it back up after a few seconds to see how the app reacts.
My Python test script looks like this:

import asyncio
import time
import traceback

from redis.asyncio.cluster import ClusterNode, RedisCluster

redis_host = "172.26.0.2"
redis_port = 6379
key = "some_key"


def prepare_redis_async():
    return RedisCluster(
        ssl=False,
        startup_nodes=[
            ClusterNode(host=redis_host, port=redis_port)
        ],
        socket_connect_timeout=3,
        socket_timeout=60,
        require_full_coverage=True,
        max_connections=5,
        cluster_error_retry_attempts=20
    )


redis_client_async = prepare_redis_async()


async def async_get_value():
    x = await redis_client_async.get(key)
    print(x)
    await redis_client_async.set(key, "async_test")

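if __name__ == '__main__':
    # Create the loop explicitly; asyncio.get_event_loop() is deprecated
    # when no event loop is running.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)

    while True:
        try:
            loop.run_until_complete(async_get_value())
        except Exception:
            # Catch Exception instead of a bare except so Ctrl-C still stops the script.
            print(traceback.format_exc())

        time.sleep(0.1)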

@adaamz adaamz force-pushed the asyncio_redis_cluster_reconnect branch from 1c9a19f to 201b345 Compare February 8, 2024 13:05
@adaamz adaamz marked this pull request as ready for review February 8, 2024 13:06
@adaamz adaamz force-pushed the asyncio_redis_cluster_reconnect branch from 201b345 to 715cfcf Compare February 26, 2024 11:24
@adaamz
Author

adaamz commented Feb 26, 2024

@chayim Hello, is there anything I can do to get this PR reviewed? Thanks

@adaamz adaamz force-pushed the asyncio_redis_cluster_reconnect branch from 715cfcf to 975b666 Compare March 4, 2024 10:29
@petyaslavova
Collaborator

Hi @adaamz, thank you for the time and effort you put into this PR!
I'm closing it as the issue has already been addressed in PR #3646.

@adaamz
Author

adaamz commented May 29, 2025

@petyaslavova but this doesn't solve my issue, or does it?

When all nodes go down, they are all removed from the list. When they come back up, the list is still empty, so we are unable to use those nodes again.

@petyaslavova
Collaborator

@adaamz, you should instantiate your cluster with dynamic_startup_nodes=False. This ensures that the initial list of startup_nodes won't be overwritten by nodes discovered from the cluster, allowing the original addresses to remain available after a cluster recovery.
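A minimal sketch of the suggested configuration, assuming an installed redis-py release whose async RedisCluster accepts dynamic_startup_nodes (the parameter named in the comment above). The helper names cluster_kwargs and make_client are hypothetical, and the host and port are the ones from the test script in the PR description.

```python
from typing import Any, Dict


def cluster_kwargs(host: str, port: int) -> Dict[str, Any]:
    """Build keyword arguments for redis.asyncio.cluster.RedisCluster.

    dynamic_startup_nodes=False asks the client to keep the originally
    configured startup address instead of replacing it with addresses
    discovered from the cluster.
    """
    return {
        "host": host,
        "port": port,
        "dynamic_startup_nodes": False,
        "require_full_coverage": True,
        "socket_connect_timeout": 3,
    }


def make_client(host: str, port: int):
    """Create the async cluster client (hypothetical helper).

    The import is done lazily so this module can be loaded even where
    redis-py is not installed. Assumption: the installed redis-py
    version supports dynamic_startup_nodes on the async client.
    """
    from redis.asyncio.cluster import RedisCluster

    return RedisCluster(**cluster_kwargs(host, port))


# Usage (requires redis-py and a reachable cluster):
#     client = make_client("172.26.0.2", 6379)
```

Keeping the arguments in a plain dict makes it easy to check the setting without a running cluster.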

@adaamz
Author

adaamz commented May 29, 2025

@petyaslavova Thanks. What about the case where we configure a single DNS name as the startup node and expect the library to discover the rest of the nodes in the cluster? Will it discover the whole cluster when connecting only to the configured node, or will it communicate only with that one node?

@petyaslavova
Collaborator

Will it discover the whole cluster when connecting only to the configured node, or will it communicate only with that one node?

@adaamz Yes, it will discover the whole cluster, and the discovered nodes will handle almost all of the communication. The only request sent to the initial startup node(s) is the one that retrieves the cluster topology.
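One way to check this behaviour yourself is to list the nodes the client knows about after it has initialized. The sketch below assumes a client object exposing an awaitable initialize() and a get_nodes() method returning objects with host, port and server_type attributes, as the async RedisCluster does. The function name describe_topology is hypothetical.

```python
from typing import List, Tuple


async def describe_topology(client) -> List[Tuple[str, int, str]]:
    """Return (host, port, role) for every node known to the client.

    `client` is expected to behave like redis.asyncio.cluster.RedisCluster:
    initialize() fetches the cluster topology through a startup node, and
    get_nodes() then returns all discovered nodes.
    """
    await client.initialize()
    return sorted(
        (node.host, node.port, node.server_type or "unknown")
        for node in client.get_nodes()
    )


# Usage against a real cluster (assumes redis-py and a reachable node):
#     import asyncio
#     from redis.asyncio.cluster import RedisCluster
#
#     async def main():
#         client = RedisCluster(host="172.26.0.2", port=6379)
#         try:
#             print(await describe_topology(client))
#         finally:
#             await client.aclose()
#
#     asyncio.run(main())
```

If discovery works as described, the output should list more nodes than the single configured startup address.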

@adaamz adaamz deleted the asyncio_redis_cluster_reconnect branch May 30, 2025 11:49