-
Notifications
You must be signed in to change notification settings - Fork 16
notifications: Exit protocols on handle drop to save up CPU of minimal-relay-chains
#376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
+33
−5
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
2 tasks
bkchr
approved these changes
Apr 17, 2025
Generally could use some test. Make a test that directly drops this stream and then |
Co-authored-by: Bastian Köcher <git@kchr.de>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
This was referenced Apr 29, 2025
lexnv
added a commit
that referenced
this pull request
May 1, 2025
## [0.9.4] - 2025-04-29 This release brings several improvements and fixes to litep2p, advancing its stability and readiness for production use. ### Performance Improvements This release addresses an issue where notification protocols failed to exit on handle drop, lowering CPU usage in scenarios like minimal-relay-chains from 7% to 0.1%. ### Robustness Improvements - Kademlia: - Optimized address store by sorting addresses based on dialing score, bounding memory consumption and improving efficiency. - Limited `FIND_NODE` responses to the replication factor, reducing data stored in the routing table. - Address store improvements enhance robustness against routing table alterations. - Identify Codec: - Enhanced message decoding to manage malformed or unexpected messages gracefully. - Bitswap: - Introduced a write timeout for sending frames, preventing protocol hangs or delays. ### Testing and Reliability - Fuzzing Harness: Added a fuzzing harness by SRLabs to uncover and resolve potential issues, improving code robustness. Thanks to @R9295 for the contribution! - Testing Enhancements: Improved notification state machine testing. Thanks to Dominique (@Imod7) for the contribution! ### Dependency Management - Updated all dependencies for stable feature flags (default and "websocket") to their latest versions. - Reorganized dependencies under specific feature flags, shrinking the default feature set and avoiding exposure of outdated dependencies from experimental features. ### Fixed - notifications: Exit protocols on handle drop to save up CPU of `minimal-relay-chains` ([#376](#376)) - identify: Improve identify message decoding ([#379](#379)) - crypto/noise: Set timeout limits for the noise handshake ([#373](#373)) - kad: Improve robustness of addresses from the routing table ([#369](#369)) - kad: Bound kademlia messages to the replication factor ([#371](#371)) - codec: Decode smaller payloads for identity to None ([#362](#362)) ### Added - bitswap: Add write timeout for sending frames ([#361](#361)) - notif/tests: check test state ([#360](#360)) - SRLabs: Introduce simple fuzzing harness ([#367](#367)) - SRLabs: Introduce Fuzzing Harness ([#365](#365)) ### Changed - features: Move quic related dependencies under feature flag ([#359](#359)) - tests/substrate: Remove outdated substrate specific conformace testing ([#370](#370)) - ci: Update stable dependencies ([#375](#375)) - build(deps): bump hex-literal from 0.4.1 to 1.0.0 ([#381](#381)) - build(deps): bump tokio from 1.44.1 to 1.44.2 in /fuzz/structure-aware ([#378](#378)) - build(deps): bump Swatinem/rust-cache from 2.7.7 to 2.7.8 ([#363](#363)) - build(deps): bump tokio from 1.43.0 to 1.43.1 ([#368](#368)) - build(deps): bump openssl from 0.10.70 to 0.10.72 ([#366](#366)) --------- Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
github-merge-queue bot
pushed a commit
to paritytech/polkadot-sdk
that referenced
this pull request
May 1, 2025
# Litep2p Release This release brings several improvements and fixes to litep2p, advancing its stability and readiness for production use. ### Performance Improvements This release addresses an issue where notification protocols failed to exit on handle drop, lowering CPU usage in scenarios like minimal-relay-chains from 7% to 0.1%. ### Robustness Improvements - Kademlia: - Optimized address store by sorting addresses based on dialing score, bounding memory consumption and improving efficiency. - Limited `FIND_NODE` responses to the replication factor, reducing data stored in the routing table. - Address store improvements enhance robustness against routing table alterations. - Identify Codec: - Enhanced message decoding to manage malformed or unexpected messages gracefully. - Bitswap: - Introduced a write timeout for sending frames, preventing protocol hangs or delays. ### Testing and Reliability - Fuzzing Harness: Added a fuzzing harness by SRLabs to uncover and resolve potential issues, improving code robustness. Thanks to @R9295 for the contribution! - Testing Enhancements: Improved notification state machine testing. Thanks to Dominique (@Imod7) for the contribution! ### Dependency Management - Updated all dependencies for stable feature flags (default and "websocket") to their latest versions. - Reorganized dependencies under specific feature flags, shrinking the default feature set and avoiding exposure of outdated dependencies from experimental features. ### Fixed - notifications: Exit protocols on handle drop to save up CPU of `minimal-relay-chains` ([#376](paritytech/litep2p#376)) - identify: Improve identify message decoding ([#379](paritytech/litep2p#379)) - crypto/noise: Set timeout limits for the noise handshake ([#373](paritytech/litep2p#373)) - kad: Improve robustness of addresses from the routing table ([#369](paritytech/litep2p#369)) - kad: Bound kademlia messages to the replication factor ([#371](paritytech/litep2p#371)) - codec: Decode smaller payloads for identity to None ([#362](paritytech/litep2p#362)) ### Added - bitswap: Add write timeout for sending frames ([#361](paritytech/litep2p#361)) - notif/tests: check test state ([#360](paritytech/litep2p#360)) - SRLabs: Introduce simple fuzzing harness ([#367](paritytech/litep2p#367)) - SRLabs: Introduce Fuzzing Harness ([#365](paritytech/litep2p#365)) ### Changed - features: Move quic related dependencies under feature flag ([#359](paritytech/litep2p#359)) - tests/substrate: Remove outdated substrate specific conformace testing ([#370](paritytech/litep2p#370)) - ci: Update stable dependencies ([#375](paritytech/litep2p#375)) - build(deps): bump hex-literal from 0.4.1 to 1.0.0 ([#381](paritytech/litep2p#381)) - build(deps): bump tokio from 1.44.1 to 1.44.2 in /fuzz/structure-aware ([#378](paritytech/litep2p#378)) - build(deps): bump Swatinem/rust-cache from 2.7.7 to 2.7.8 ([#363](paritytech/litep2p#363)) - build(deps): bump tokio from 1.43.0 to 1.43.1 ([#368](paritytech/litep2p#368)) - build(deps): bump openssl from 0.10.70 to 0.10.72 ([#366](paritytech/litep2p#366)) --------- Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io> Co-authored-by: Dmitry Markin <dmitry@markin.tech>
castillax
pushed a commit
to paritytech/polkadot-sdk
that referenced
this pull request
May 12, 2025
# Litep2p Release This release brings several improvements and fixes to litep2p, advancing its stability and readiness for production use. ### Performance Improvements This release addresses an issue where notification protocols failed to exit on handle drop, lowering CPU usage in scenarios like minimal-relay-chains from 7% to 0.1%. ### Robustness Improvements - Kademlia: - Optimized address store by sorting addresses based on dialing score, bounding memory consumption and improving efficiency. - Limited `FIND_NODE` responses to the replication factor, reducing data stored in the routing table. - Address store improvements enhance robustness against routing table alterations. - Identify Codec: - Enhanced message decoding to manage malformed or unexpected messages gracefully. - Bitswap: - Introduced a write timeout for sending frames, preventing protocol hangs or delays. ### Testing and Reliability - Fuzzing Harness: Added a fuzzing harness by SRLabs to uncover and resolve potential issues, improving code robustness. Thanks to @R9295 for the contribution! - Testing Enhancements: Improved notification state machine testing. Thanks to Dominique (@Imod7) for the contribution! ### Dependency Management - Updated all dependencies for stable feature flags (default and "websocket") to their latest versions. - Reorganized dependencies under specific feature flags, shrinking the default feature set and avoiding exposure of outdated dependencies from experimental features. ### Fixed - notifications: Exit protocols on handle drop to save up CPU of `minimal-relay-chains` ([#376](paritytech/litep2p#376)) - identify: Improve identify message decoding ([#379](paritytech/litep2p#379)) - crypto/noise: Set timeout limits for the noise handshake ([#373](paritytech/litep2p#373)) - kad: Improve robustness of addresses from the routing table ([#369](paritytech/litep2p#369)) - kad: Bound kademlia messages to the replication factor ([#371](paritytech/litep2p#371)) - codec: Decode smaller payloads for identity to None ([#362](paritytech/litep2p#362)) ### Added - bitswap: Add write timeout for sending frames ([#361](paritytech/litep2p#361)) - notif/tests: check test state ([#360](paritytech/litep2p#360)) - SRLabs: Introduce simple fuzzing harness ([#367](paritytech/litep2p#367)) - SRLabs: Introduce Fuzzing Harness ([#365](paritytech/litep2p#365)) ### Changed - features: Move quic related dependencies under feature flag ([#359](paritytech/litep2p#359)) - tests/substrate: Remove outdated substrate specific conformace testing ([#370](paritytech/litep2p#370)) - ci: Update stable dependencies ([#375](paritytech/litep2p#375)) - build(deps): bump hex-literal from 0.4.1 to 1.0.0 ([#381](paritytech/litep2p#381)) - build(deps): bump tokio from 1.44.1 to 1.44.2 in /fuzz/structure-aware ([#378](paritytech/litep2p#378)) - build(deps): bump Swatinem/rust-cache from 2.7.7 to 2.7.8 ([#363](paritytech/litep2p#363)) - build(deps): bump tokio from 1.43.0 to 1.43.1 ([#368](paritytech/litep2p#368)) - build(deps): bump openssl from 0.10.70 to 0.10.72 ([#366](paritytech/litep2p#366)) --------- Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io> Co-authored-by: Dmitry Markin <dmitry@markin.tech>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR ensures that notification tasks are fully terminated when the associated handle is dropped, addressing an issue specific to
minimal-relay-chains
. We have operated from the assumption that the handle is never supposed to get dropped, since protocols cannot be dynamically injected after the backend is started.This PR effectively reduces the CPU usage of minimal-relay-chains from 7% to 0.1%.
We have noticed that the CPU consumption of minimal-relay-chains has increased compared to libp2p backends. However, our metrics for CPU consumption of full relay chains have drastically improved.
The issue is the following:
CPU before and after:

Fixes: paritytech/polkadot-sdk#7998