Change transport.netty.receive_predictor_size default value to 32KB #113085

Closed
wants to merge 1 commit

Conversation

jiangyunpeng

Currently transport.netty.receive_predictor_size defaults to 64kb. Buffers of that size can't be cached in Netty's PoolThreadCache, because its normalHeapCaches only cache buffers of at most 32kb (see https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/PoolThreadCache.java#L291).
Changing the default to 32kb lets the PoolThreadCache serve these normal-sized byte buffers from its cache and improves performance.
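
For context, this value can already be overridden per node today; a minimal elasticsearch.yml sketch of the proposed value (shown for illustration only, assuming the standard byte-size setting syntax):

# elasticsearch.yml: override the Netty receive predictor buffer size for this node
transport.netty.receive_predictor_size: 32kb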

@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.0.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Sep 18, 2024
@kingherc kingherc added the :Distributed Coordination/Network Http and internode communication implementations label Sep 18, 2024
@elasticsearchmachine elasticsearchmachine added Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. and removed needs:triage Requires assignment of a team area label labels Sep 18, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)


cla-checker-service bot commented Sep 19, 2024

💚 CLA has been signed

@mhl-b
Contributor

mhl-b commented Oct 7, 2024

Interesting, do you have a reference for where the 32kb limit comes from? I ran a few tests with the debugger and can see that the cache is populated. There was an issue in 2017 related to 32kb #23185 (comment); it might no longer be the case, but this needs more extensive testing.

@jiangyunpeng
Author

jiangyunpeng commented Oct 12, 2024

Thanks for the reply, @mhl-b.

I couldn't find any reference to '32kb' because the Netty project lacks documentation on this. I just read Netty's code, specifically the PoolThreadCache.cacheForNormal() method, which indicates that requests for memory sizes greater than 32kb can't be cached.

Here is a simple test case:

import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;
import org.junit.Test;

@Test
public void test() {
    PooledByteBufAllocator byteBufAllocator = PooledByteBufAllocator.DEFAULT;

    ByteBuf byteBuf = byteBufAllocator.buffer(64 * 1024);
    // For a 32kb buffer, release() adds it to the thread-local PoolThreadCache;
    // a 64kb buffer is too large to be cached.
    byteBuf.release();

    // Request the same size again: with 32kb the second allocation is served
    // from the PoolThreadCache, with 64kb it goes back to a PoolChunk.
    byteBuf = byteBufAllocator.buffer(64 * 1024);
    byteBuf.release();
}

When requesting 64kb, both calls to byteBufAllocator.buffer() are allocated from a PoolChunk, but if you change the size to 32kb, the second call to byteBufAllocator.buffer() is served from the PoolThreadCache. I tested this with Netty versions 4.1.113 and 4.1.49.

In addition, we added Netty allocator metrics to compare 32kb and 64kb on an Elasticsearch node, and we observed that the 32kb size hits the cache more frequently than the 64kb size.

[Allocator metrics screenshots]
32kb: cacheTiny + cacheSmall + cacheNormal = 76.88%
64kb: cacheTiny + cacheSmall + cacheNormal = 29.52%
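
A minimal sketch of how this comparison could be reproduced outside Elasticsearch, assuming that allocations served by the thread-local PoolThreadCache do not show up in the arena allocation counters (the class name and iteration count are illustrative, and whether the calling thread gets a thread-local cache depends on the Netty version and io.netty.allocator.useCacheForAllThreads):

import io.netty.buffer.ByteBuf;
import io.netty.buffer.PoolArenaMetric;
import io.netty.buffer.PooledByteBufAllocator;

public class ReceivePredictorCacheProbe {

    public static void main(String[] args) {
        printArenaAllocations(32 * 1024);
        printArenaAllocations(64 * 1024);
    }

    // Allocate and release the same size repeatedly, then report how many
    // allocations reached the heap arenas. Allocations served by the
    // thread-local PoolThreadCache are assumed not to increment these counters.
    private static void printArenaAllocations(int size) {
        PooledByteBufAllocator allocator = new PooledByteBufAllocator(false); // heap buffers
        for (int i = 0; i < 1_000; i++) {
            ByteBuf buf = allocator.heapBuffer(size);
            buf.release();
        }
        long arenaAllocations = 0;
        for (PoolArenaMetric arena : allocator.metric().heapArenas()) {
            arenaAllocations += arena.numAllocations();
        }
        System.out.println((size / 1024) + "kb -> arena allocations: " + arenaAllocations);
    }
}

If the 32kb cap holds and the calling thread has a cache, the 64kb run should report roughly one arena allocation per iteration, while the 32kb run should report far fewer.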

@jiangyunpeng
Author

I learned that @danielmitterdorfer has investigated Netty's allocator. Could you give some advice?

@mhl-b
Contributor

mhl-b commented Nov 26, 2024

To move forward with this PR, I suggest gathering more conclusive numbers about the performance improvement. A difference in cache usage does not by itself show an improvement in the metrics that matter: throughput, GC pressure, OOMs.

You can run a Rally benchmark and look at the metrics we used before in the previous comment. Feel free to browse the git history for allocator improvements, for example NettyAllocator. Also consider different heap sizes for the tests; a sketch of such a run follows below.
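
A hedged sketch of what such a run could look like (the track, host, and heap size are arbitrary placeholders, and exact esrally flags may differ between Rally versions):

# In one shell: start the candidate node with a fixed heap size.
ES_JAVA_OPTS="-Xms8g -Xmx8g" ./bin/elasticsearch

# In another shell: benchmark the running node with Rally.
esrally race --track=geonames --pipeline=benchmark-only --target-hosts=127.0.0.1:9200

Repeating the run with different heap sizes and with the 32kb/64kb setting values would give the throughput and GC numbers asked for above.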

@mhl-b
Contributor

mhl-b commented Jan 13, 2025

I'm going to close this PR in a few days, unless you still want to work on it.

@elasticsearchmachine elasticsearchmachine added v9.1.0 Team:Distributed Coordination Meta label for Distributed Coordination team and removed v9.0.0 labels Jan 30, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-obsolete (Team:Distributed (Obsolete))

@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@mhl-b
Contributor

mhl-b commented Apr 15, 2025

Closing for now until there is stronger evidence that smaller buffers improve system performance. Previous analysis indicated that smaller chunks put more pressure on the GC, and the percentage of cache usage does not immediately indicate better performance.

@mhl-b mhl-b closed this Apr 15, 2025