mgmtd primary not found #184

Open
Icedroid opened this issue Mar 17, 2025 · 0 comments

Icedroid commented Mar 17, 2025

I got an error after step 4 when running admin_cli list-nodes:

# /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.3.25:8000"]' "list-nodes"
Encounter error: 6004(MgmtdClient::RoutingInfoNotReady)
# tail -f cli.log -n 1000
[2025-03-17T06:28:38.756472517+00:00 CliConn1: 2092 IBConnect.cc:290 DEBUG] IBSocket [connect RDMA://192.168.3.25:8000] use devices {local mlx5_2:1, remote mlx5_2:1, zone UNKNOWN}
[2025-03-17T06:28:38.756474969+00:00 CliConn1: 2092 IBConnect.cc:386 INFO] IBSocket [connect RDMA://192.168.3.25:8000] connect with dev {local mlx5_2:1, remote mlx5_2:1, zone UNKNOWN}
[2025-03-17T06:28:38.756696933+00:00 CliConn1: 2092 IBDevice.cc:112 DEBUG] mlx5_2:1 skip GID 0, ip fe80::ba3f:d2ff:fec8:fe8e
[2025-03-17T06:28:38.756701670+00:00 CliConn1: 2092 IBDevice.cc:112 DEBUG] mlx5_2:1 skip GID 1, ip fe80::ba3f:d2ff:fec8:fe8e
[2025-03-17T06:28:38.756732933+00:00 CliConn1: 2092 IBDevice.cc:129 DEBUG] mlx5_2:1 skip GID 2, type IB/RoCE v1 != RoCE v2
[2025-03-17T06:28:38.756742849+00:00 CliConn1: 2092 IBDevice.cc:133 DEBUG] mlx5_2:1 found RoCE v2 GID, index 3, ip ::ffff:192.168.3.25
[2025-03-17T06:28:38.756758943+00:00 CliConn1: 2092 IBConnect.cc:545 DEBUG] IBSocket [connect RDMA://192.168.3.25:8000] create QP.
[2025-03-17T06:28:38.757876165+00:00 CliConn1: 2092 IBConnect.cc:727 DEBUG] IBSocket [connect RDMA://192.168.3.25:8000] init socket buffers.
[2025-03-17T06:28:38.758423065+00:00 CliConn1: 2092 IBDevice.cc:520 DEBUG] IBDevice mlx5_2 reg_mr, addr 0x556b78032000, length 1196032, access 1048577, mr 0x556b76a779c0
[2025-03-17T06:28:38.758455178+00:00 CliConn1: 2092 IBConnect.cc:606 DEBUG] IBSocket [connect RDMA://192.168.3.25:8000] init QP.
[2025-03-17T06:28:38.761324337+00:00 CliProc1: 2086 Processor.h:153 DEBUG] receive response 11:2
[2025-03-17T06:28:38.761357114+00:00 CliConn0: 2091 IBConnect.cc:625 DEBUG] IBSocket RDMA://localhost.localdomain/mlx5_2:1/1025 modify QP to RTR.
[2025-03-17T06:28:38.761369815+00:00 CliConn0: 2091 IBConnect.cc:653 DEBUG] IBSocket RDMA://localhost.localdomain/mlx5_2:1/1025 modify QP to RTR, linklayer ETHERNET, dgid 0:0:0:0:0:0:0:0:0:0:ff:ff:c0:a8:3:19, tc 0
[2025-03-17T06:28:38.761691893+00:00 CliConn0: 2091 IBConnect.cc:697 DEBUG] IBSocket RDMA://localhost.localdomain/mlx5_2:1/1025 modify QP to RTS.
[2025-03-17T06:28:38.761862855+00:00 CliConn0: 2091 IBConnect.cc:446 INFO] IBSocket RDMA://localhost.localdomain/mlx5_2:1/1025 connected
[2025-03-17T06:28:38.761893006+00:00 CliEL0: 2093 IBSocket.cc:509 INFO] IBSocket RDMA://localhost.localdomain/mlx5_2:1/1025 turn to READY from CONNECTING.
[2025-03-17T06:28:38.828395564+00:00 IBManager: 2084 IBSocket.cc:1219 DEBUG] IBSocketManager handleEvents running
[2025-03-17T06:28:38.859374605+00:00 CliBG0: 2089 IBSocket.cc:729 DEBUG] IBSocket RDMA://localhost.localdomain/mlx5_2:1/1023 modify QP to ERROR
[2025-03-17T06:28:38.859817155+00:00 CliBG0: 2089 IBSocket.cc:317 DEBUG] IBSocket destructor RDMA://localhost.localdomain/mlx5_2:1/1023
[2025-03-17T06:28:38.860280560+00:00 CliBG0: 2089 IBDevice.cc:537 DEBUG] IBDevice mlx5_1 dereg_mr mr 0x556b76a77680
[2025-03-17T06:28:38.862270903+00:00 CliBG0: 2089 MgmtdClient.cc:509 ERROR] MgmtdClient: probePrimary NodeId(0) failed: RPC::Timeout(2005)
[2025-03-17T06:28:38.862339969+00:00 CliBG0: 2089 MgmtdClient.cc:162 INFO] MgmtdClient: mark node as Skipped NodeId(0) ["RDMA://192.168.3.25:8000"]
[2025-03-17T06:28:38.862355569+00:00 CliBG0: 2089 MgmtdClient.cc:434 WARNING] MgmtdClient: primary not found
[2025-03-17T06:28:38.862366583+00:00 CliBG0: 2089 MgmtdClient.cc:390 ERROR] [MgmtdClientOp RefreshRoutingInfo force=false No.2] failed: MgmtdClient::PrimaryMgmtdNotFound(6000) ./src/client/mgmtd/MgmtdClient.cc:173, '_result' has error. latency: 0ns
[2025-03-17T06:28:38.862420283+00:00 CliBG0: 2089 MgmtdClient.cc:381 INFO] MgmtdClient: receive ExitItem
[2025-03-17T06:28:38.862429671+00:00 CliBG1: 2090 MgmtdClient.cc:286 ERROR] MgmtdClient: AutoRefresh failed: MgmtdClient::PrimaryMgmtdNotFound(6000) ./src/client/mgmtd/MgmtdClient.cc:173, '_result' has error
[2025-03-17T06:28:38.862437676+00:00 admin_cli: 2028 BackgroundRunner.cc:63 DEBUG] BackgroundRunner: start to execute stopAll
[2025-03-17T06:28:38.862468275+00:00 CliBG1: 2090 BackgroundRunner.cc:85 DEBUG] BackgroundRunner: AutoRefresh round 2 start
[2025-03-17T06:28:38.862495441+00:00 CliBG1: 2090 MgmtdClient.cc:286 ERROR] MgmtdClient: AutoRefresh failed: MgmtdClient::Exit(6003) ./src/client/mgmtd/MgmtdClient.cc:455, '_result' has error
[2025-03-17T06:28:38.862527652+00:00 CliBG1: 2090 BackgroundRunner.cc:102 DEBUG] BackgroundRunner: AutoRefresh wait 999946us for next round
[2025-03-17T06:28:38.862550071+00:00 CliBG1: 2090 BackgroundRunner.cc:104 DEBUG] BackgroundRunner: AutoRefresh is cancelled.
[2025-03-17T06:28:38.862553868+00:00 CliBG1: 2090 BackgroundRunner.cc:111 INFO] BackgroundRunner: AutoRefresh stopped
[2025-03-17T06:28:38.862564910+00:00 admin_cli: 2028 BackgroundRunner.cc:70 DEBUG] BackgroundRunner: finish to execute stopAll
[2025-03-17T06:28:38.862570274+00:00 admin_cli: 2028 BackgroundRunner.cc:50 DEBUG] BackgroundRunner: destructed
[2025-03-17T06:28:38.862571436+00:00 admin_cli: 2028 MgmtdClient.cc:277 INFO] MgmtdClient: stopped
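Most of the log above is DEBUG/INFO output from the RDMA connection setup. To pull out only the failing steps, here is a minimal sketch that keeps WARNING and ERROR entries. The line layout in the regular expression is an assumption inferred from the excerpt above, not a documented 3FS log format, and `problem_lines` is a hypothetical helper name.

```python
import re

# Assumed layout, inferred from the excerpt above:
# [<timestamp> <thread>: <tid> <file>:<line> <LEVEL>] <message>
LINE_RE = re.compile(
    r"^\[(?P<ts>\S+) (?P<thread>[^:]+): (?P<tid>\d+) "
    r"(?P<src>\S+):(?P<line>\d+) (?P<level>[A-Z]+)\] (?P<msg>.*)$"
)


def problem_lines(lines, levels=("WARNING", "ERROR")):
    """Return (level, source file, message) for entries at the given levels."""
    found = []
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match and match.group("level") in levels:
            found.append(
                (match.group("level"), match.group("src"), match.group("msg"))
            )
    return found


# Three lines copied from the log above.
sample = [
    "[2025-03-17T06:28:38.761862855+00:00 CliConn0: 2091 IBConnect.cc:446 INFO] "
    "IBSocket RDMA://localhost.localdomain/mlx5_2:1/1025 connected",
    "[2025-03-17T06:28:38.862270903+00:00 CliBG0: 2089 MgmtdClient.cc:509 ERROR] "
    "MgmtdClient: probePrimary NodeId(0) failed: RPC::Timeout(2005)",
    "[2025-03-17T06:28:38.862355569+00:00 CliBG0: 2089 MgmtdClient.cc:434 WARNING] "
    "MgmtdClient: primary not found",
]

for level, src, msg in problem_lines(sample):
    print(f"{level:7} {src}: {msg}")
```

Run against the full cli.log (for example `problem_lines(open("cli.log"))`), this should leave the probePrimary timeout, the "primary not found" warning, and the RefreshRoutingInfo/AutoRefresh failures that follow from it.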

