Fix: bootstrap: Detect cluster service on init node before saving the canonical hostname (bsc#1222714) #1386

Merged 2 commits on Apr 16, 2024

liangxin1300 (Collaborator) commented Apr 12, 2024

# When cluster service is down on init node
# crm cluster join -c 15sp5-1 -y
WARNING: chronyd.service is not configured to start at system boot.
ERROR: cluster.join:
See crmsh.log:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/crmsh/ui_context.py", line 86, in run
    rv = self.execute_command() is not False
  File "/usr/lib/python3.6/site-packages/crmsh/ui_context.py", line 267, in execute_command
    rv = self.command_info.function(*arglist)
  File "/usr/lib/python3.6/site-packages/crmsh/ui_cluster.py", line 543, in do_join
    bootstrap.bootstrap_join(join_context)
  File "/usr/lib/python3.6/site-packages/crmsh/bootstrap.py", line 2246, in bootstrap_join
    join_ssh(cluster_node, remote_user)
  File "/usr/lib/python3.6/site-packages/crmsh/bootstrap.py", line 1596, in join_ssh
    return join_ssh_impl(local_user, seed_host, seed_user, keys)
  File "/usr/lib/python3.6/site-packages/crmsh/bootstrap.py", line 1628, in join_ssh_impl
    user_by_host.add(seed_user, get_node_canonical_hostname(seed_host))
  File "/usr/lib/python3.6/site-packages/crmsh/bootstrap.py", line 468, in get_node_canonical_hostname
    utils.fatal(err)
  File "/usr/lib/python3.6/site-packages/crmsh/utils.py", line 2711, in fatal
    raise ValueError(error_msg)
ValueError

With this fix, when the cluster service is down on the init node:

# crm cluster join -c 15sp5-1 -y
WARNING: chronyd.service is not configured to start at system boot.
INFO: A new ssh keypair is generated for user hacluster.
WARNING: Cluster is inactive on 15sp5-1. Retry in 10 seconds
WARNING: Cluster is inactive on 15sp5-1. Retry in 10 seconds
...
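
The retry behaviour in the output above can be illustrated with a small polling loop. This is only a sketch, not the crmsh implementation: `wait_until_active`, its parameters, and the bounded `max_attempts` are assumptions made for illustration, and the status probe is passed in as a callable instead of calling any real crmsh function.

```python
import time
from typing import Callable


def wait_until_active(
    is_active: Callable[[], bool],
    node: str,
    interval: float = 10,
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    warn: Callable[[str], None] = print,
) -> bool:
    """Poll a (hypothetical) cluster status probe until it reports active.

    Returns True as soon as the probe succeeds, or False after
    max_attempts failed probes. The attempt limit is an assumption of
    this sketch; the PR output does not show when the retries stop.
    """
    for _ in range(max_attempts):
        if is_active():
            return True
        # Same wording as the warning shown in the PR description.
        warn(f"WARNING: Cluster is inactive on {node}. Retry in {interval:g} seconds")
        sleep(interval)
    return False
```

Passing `sleep` and `warn` as parameters keeps the loop testable without real delays; a caller would supply a probe that checks the cluster service on the init node over SSH.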

@@ -1625,8 +1625,6 @@ def join_ssh_impl(local_user, seed_host, seed_user, ssh_public_keys: typing.List
user_by_host.add(local_user, utils.this_node())
user_by_host.set_no_generating_ssh_key(bool(ssh_public_keys))
user_by_host.save_local()
user_by_host.add(seed_user, get_node_canonical_hostname(seed_host))
Collaborator

This is required by line 1633 when a hostname alias is used.

Collaborator Author

Then we can check the init node's cluster service status right after line 1620.

Collaborator Author

Then the order for this part will be:

  1. call ping_node to check that the init node is up
  2. call join_ssh to set up passwordless SSH with the init node
  3. inside join_ssh, after swapping the keys for the first user ('hacluster' is the second) and before using utils.HostUserConfig to save the user info, check the status of the cluster service on the init node
  4. then get the canonical hostname from the init node
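
The ordering inside join_ssh (steps 3 and 4) can be sketched as follows. This is not the code from the PR: `join_ssh_order` and all of its parameters are hypothetical stand-ins, with the key swap, the cluster status check, and the canonical hostname lookup injected as callables, and a plain dict standing in for utils.HostUserConfig.

```python
from typing import Callable, Dict


def join_ssh_order(
    seed_host: str,
    seed_user: str,
    swap_keys: Callable[[str, str], None],
    wait_for_cluster: Callable[[str], None],
    canonical_hostname: Callable[[str], str],
    user_by_host: Dict[str, str],
) -> None:
    """Illustrate the proposed call order; every callable is a stand-in."""
    # Step 3, first half: exchange SSH keys for the first user.
    swap_keys(seed_user, seed_host)
    # Step 3, second half: check the cluster service on the init node
    # before any user info is saved. Assumed to block or raise until
    # the service is active.
    wait_for_cluster(seed_host)
    # Step 4: only now resolve the canonical hostname and record the user,
    # so a stopped cluster service cannot trigger the traceback above.
    user_by_host[canonical_hostname(seed_host)] = seed_user
```

Because the status check runs before the hostname lookup, a failure there surfaces as a retry or a clear error instead of the ValueError raised from get_node_canonical_hostname.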

@liangxin1300 liangxin1300 force-pushed the 20240412_minor_changes branch 2 times, most recently from d68e99f to 4b0dd66 Compare April 15, 2024 13:13
@liangxin1300 liangxin1300 changed the title Fix: bootstrap: Save the canonical hostname of init node after cluste… Fix: bootstrap: Detect cluster service on init node before saving the canonical hostname (bsc#1222714) Apr 15, 2024
codecov bot commented Apr 15, 2024

Codecov Report

Attention: Patch coverage is 70.00%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 53.27%. Comparing base (09f3ed8) to head (5036bc6).

| Files | Patch % | Lines |
|---|---|---|
| crmsh/bootstrap.py | 70.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1386      +/-   ##
==========================================
- Coverage   53.32%   53.27%   -0.05%     
==========================================
  Files          80       80              
  Lines       23941    23942       +1     
==========================================
- Hits        12766    12755      -11     
- Misses      11175    11187      +12     

@liangxin1300 liangxin1300 force-pushed the 20240412_minor_changes branch from c3800b2 to 5036bc6 Compare April 16, 2024 06:25
@liangxin1300 liangxin1300 merged commit 1b89f20 into ClusterLabs:master Apr 16, 2024
30 checks passed