Change transport.netty.receive_predictor_size default value to 32KB #113085

Closed
wants to merge 1 commit

Conversation

jiangyunpeng

Currently transport.netty.receive_predictor_size defaults to 64kb. Buffers of that size can't be cached in Netty's PoolThreadCache, because its normalHeapCaches only cache buffers of at most 32kb (see https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/PoolThreadCache.java#L291).
Changing the default to 32kb lets the PoolThreadCache serve these normal-sized byte buffers from its cache and improves performance.
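
For context, this value can already be overridden per node today; a minimal elasticsearch.yml sketch of the proposed value (shown for illustration only, assuming the standard byte-size setting syntax):

# elasticsearch.yml: override the Netty receive predictor buffer size for this node
transport.netty.receive_predictor_size: 32kb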

@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.0.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Sep 18, 2024
@kingherc kingherc added the :Distributed Coordination/Network Http and internode communication implementations label Sep 18, 2024
@elasticsearchmachine elasticsearchmachine added Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. and removed needs:triage Requires assignment of a team area label labels Sep 18, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)


cla-checker-service bot commented Sep 19, 2024

💚 CLA has been signed

@mhl-b
Contributor

mhl-b commented Oct 7, 2024

Interesting, do you have a reference for where the 32kb limit comes from? I ran a few tests with the debugger and can see that the cache is populated. There was an issue in 2017 related to 32kb #23185 (comment); it might no longer be the case, but this needs more extensive testing.

@jiangyunpeng
Author

jiangyunpeng commented Oct 12, 2024

Thanks for the reply, @mhl-b.

I couldn't find any reference to '32kb' because the Netty project lacks documentation on this. I just read Netty's code, specifically the PoolThreadCache.cacheForNormal() method, which indicates that requests for memory sizes greater than 32kb can't be cached.

Here is a simple test case:

import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;
import org.junit.Test;

@Test
public void test() {
    PooledByteBufAllocator byteBufAllocator = PooledByteBufAllocator.DEFAULT;

    ByteBuf byteBuf = byteBufAllocator.buffer(64 * 1024);
    // For a 32kb buffer, release() adds it to the thread-local PoolThreadCache;
    // a 64kb buffer is too large to be cached.
    byteBuf.release();

    // Request the same size again: with 32kb the second allocation is served
    // from the PoolThreadCache, with 64kb it goes back to a PoolChunk.
    byteBuf = byteBufAllocator.buffer(64 * 1024);
    byteBuf.release();
}

When requesting 64kb, both calls to byteBufAllocator.buffer() are allocated from a PoolChunk, but if you change the size to 32kb, the second call to byteBufAllocator.buffer() is served from the PoolThreadCache. I tested this with Netty versions 4.1.113 and 4.1.49.

In addition, we added Netty allocator metrics to compare 32kb and 64kb on an Elasticsearch node, and we observed that the 32kb size hits the cache more frequently than the 64kb size.

[Allocator metrics screenshots]
32kb: cacheTiny + cacheSmall + cacheNormal = 76.88%
64kb: cacheTiny + cacheSmall + cacheNormal = 29.52%
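
A minimal sketch of how this comparison could be reproduced outside Elasticsearch, assuming that allocations served by the thread-local PoolThreadCache do not show up in the arena allocation counters (the class name and iteration count are illustrative, and whether the calling thread gets a thread-local cache depends on the Netty version and io.netty.allocator.useCacheForAllThreads):

import io.netty.buffer.ByteBuf;
import io.netty.buffer.PoolArenaMetric;
import io.netty.buffer.PooledByteBufAllocator;

public class ReceivePredictorCacheProbe {

    public static void main(String[] args) {
        printArenaAllocations(32 * 1024);
        printArenaAllocations(64 * 1024);
    }

    // Allocate and release the same size repeatedly, then report how many
    // allocations reached the heap arenas. Allocations served by the
    // thread-local PoolThreadCache are assumed not to increment these counters.
    private static void printArenaAllocations(int size) {
        PooledByteBufAllocator allocator = new PooledByteBufAllocator(false); // heap buffers
        for (int i = 0; i < 1_000; i++) {
            ByteBuf buf = allocator.heapBuffer(size);
            buf.release();
        }
        long arenaAllocations = 0;
        for (PoolArenaMetric arena : allocator.metric().heapArenas()) {
            arenaAllocations += arena.numAllocations();
        }
        System.out.println((size / 1024) + "kb -> arena allocations: " + arenaAllocations);
    }
}

If the 32kb cap holds and the calling thread has a cache, the 64kb run should report roughly one arena allocation per iteration, while the 32kb run should report far fewer.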

@jiangyunpeng
Author

I learned that @danielmitterdorfer has investigated Netty's allocator. Could you give some advice?

@mhl-b
Contributor

mhl-b commented Nov 26, 2024

To move forward with this PR, I suggest gathering more conclusive numbers about the performance improvement. A difference in cache usage does not by itself show an improvement in the metrics that matter: throughput, GC pressure, OOMs.

You can run a Rally benchmark and look at the metrics we used before in the previous comment. Feel free to browse the git history for allocator improvements, for example NettyAllocator. Also consider different heap sizes for the tests; a sketch of such a run follows below.
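
A hedged sketch of what such a run could look like (the track, host, and heap size are arbitrary placeholders, and exact esrally flags may differ between Rally versions):

# In one shell: start the candidate node with a fixed heap size.
ES_JAVA_OPTS="-Xms8g -Xmx8g" ./bin/elasticsearch

# In another shell: benchmark the running node with Rally.
esrally race --track=geonames --pipeline=benchmark-only --target-hosts=127.0.0.1:9200

Repeating the run with different heap sizes and with the 32kb/64kb setting values would give the throughput and GC numbers asked for above.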

@mhl-b
Contributor

mhl-b commented Jan 13, 2025

I'm going to close this PR in a few days, unless you still want to work on it.

@elasticsearchmachine elasticsearchmachine added v9.1.0 Team:Distributed Coordination Meta label for Distributed Coordination team and removed v9.0.0 labels Jan 30, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-obsolete (Team:Distributed (Obsolete))

@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@mhl-b
Contributor

mhl-b commented Apr 15, 2025

Closing for now until there is stronger evidence that smaller buffers improve system performance. Previous analysis indicated that smaller chunks put more pressure on the GC, and the percentage of cache usage does not immediately indicate better performance.

@mhl-b mhl-b closed this Apr 15, 2025