Skip to content

[GR-57064] Lazy virtual thread JFR registration and epoch generation caching #9570

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 24, 2025

Conversation

roberttoyonaga
Copy link
Collaborator

@roberttoyonaga roberttoyonaga commented Aug 26, 2024

Summary

In Hotspot, JFR registers virtual threads lazily when they emit an event. Currently, in SubstrateVM, we register virtual threads eagerly whenever they are mounted. This can be problematic if many virtual threads mount/unmount while not emitting many events. This is especially bad since SubstrateVM uses a global mutex to protect the "thread constant pool repository". This mutex must be acquired every time we register or even just check thread registration status. In Hotspot, there is no such global "thread constant pool repository", and no global synchronization -- instead a thread "checkpoint event" is written to thread local JFR buffers lazily when needed.

This PR does 2 things:

  1. Make virtual thread registration lazy instead of eager. Register virtual threads only when they need to emit an event, not upon mounting.
  2. Avoid locking global "thread constant pool repo" mutex every time we need to check virtual thread registration status. Do this by caching the "epoch generation" in the Target_java_lang_VirtualThread.java object (similar to Target_java_lang_Thread.jfrExcluded. Before locking the thread repo mutex, first compare "epoch generation" in case the virtual thread is already registered. A very similar techniqie is used in Hotspot to avoid writing thread checkpoint events unless necessary.

Results

Scenario 1: Many virtual thread mounts/unmounts while not emitting JFR events.
Used test app 1. The new lazy method takes ~1105ms to complete, while the old eager approach takes ~1140ms.

Scenario 2: Many JFR events emitted concurrently by many virtual threads.
Used test app 2. The registration with cached epoch generation checking takes ~42.600s to complete, while registration with global locking takes ~73.650s.

Related issue: #9536

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Aug 26, 2024
@roberttoyonaga roberttoyonaga marked this pull request as ready for review August 26, 2024 19:29
@christianhaeubl christianhaeubl changed the title Lazy virtual thread JFR registration and epoch generation caching [GR-57064] Lazy virtual thread JFR registration and epoch generation caching Aug 27, 2024
Copy link
Member

@christianhaeubl christianhaeubl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, I added a few comments.

* eagerly when started and at chunk rotations.
*/
@Uninterruptible(reason = "Epoch should not change while checking generation.")
private static void maybeRegisterVirtualThread(Thread thread) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this break once we leave uninterruptible code?

  • we start in epoch 0
  • thread A calls getThreadId(...), which registers a vthread for epoch 0 in maybeRegisterVirtualThread
  • thread A leaves the uninterruptible code and gets blocked at a safepoint
  • thread B changes the epoch from 0 to 1
  • after the safepoint, thread A continues execution and uses the thread id that was returned by getThreadId(...) even though the vthread was only registered for epoch 0

One example where I think that this might happen: JavaMonitorQueuedSynchronizer.ConditionNode.notifierJfrTid

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the situation you describe is actually ok. Once the epoch changes, new event emissions from "thread A" will fail the epoch generation check, and registration will be re-done. Since getThreadID() is called for every event emission, the important thing is that gathering event info with getThreadID() and the writing of the event data to the JFR buffer are done within the same block of uninterruptible code, which does indeed happen (ex. all emit0 methods).

Calls to getThreadID() outside of the event emission path (ex. with JavaMonitorQueuedSynchronizer.ConditionNode.notifierJfrTid) result in unnecessary registration checks, but shouldn't break anything.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so you essentially assume that getThreadID() will be called again for thread A once the thread ID is actually used in a JFR event. However, I don't think that this assumption is valid. Here is one example:

  • Thread B blocks at ConditionNode.
  • Thread A notifies thread B and sets ConditionNode.notifierJfrTid.
  • The JFR epoch changes.
  • Once thread B wakes up, it uses the value of ConditionNode.notifierJfrTid (i.e., the ID of thread A) when emitting a JavaMonitorWaitEvent. However, thread A is not registered for the current epoch.

Copy link
Collaborator Author

@roberttoyonaga roberttoyonaga Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see. I mistakenly thought we always used JfrNativeEventWriter.putThread(JfrNativeEventWriterData , Thread ) when dealing with threads. But in that case we use a the plain TID directly.

I think we'll have to switch to always using putThread whenever dealing with the JFR Thread type. What do you think of this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside is that we'll have to remember to give careful consideration to emitting JFR Thread data.
I'm not sure there's a better place to put the registration check though. And we'll need to check at some point in order to copy Hotspot and do lazy registration.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just pushed a commit to update all the places we write thread IDs as JFR event data to use either putThread(Thread) or putThread(long). This should ensure that threads are always registered if they get referenced from event data. The downside is that we need to remember to use putThread.

Another unfortunate thing is that we can only do checks with putThread(Thread) because with only the TID we have no way of knowing whether the thread is virtual and cannot access Target_java_lang_VirtualThread.jfrGeneration.

@@ -43,6 +43,7 @@ public class JfrTraceIdEpoch {
private static final long EPOCH_1_BIT = 0b10;

private boolean epoch;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

epoch can probably be removed (i.e., can be computed from the least significant bit of epochGeneration).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, good idea. I'll remove that.

@roberttoyonaga
Copy link
Collaborator Author

Truffle gate error seems to be unrelated

@christianhaeubl
Copy link
Member

Thanks!

I did one pass over the PR and opened #11070. For now, I think it is better/safer if we still register all virtual threads eagerly. At the moment, we have a few JFR events where we only store the thread id. We would need to change all that code so that we keep more information around, which in my opinion is not worth the effort at the moment.

@roberttoyonaga
Copy link
Collaborator Author

I did one pass over the PR and opened #11070. For now, I think it is better/safer if we still register all virtual threads eagerly. At the moment, we have a few JFR events where we only store the thread id. We would need to change all that code so that we keep more information around, which in my opinion is not worth the effort at the moment.

Yes that sounds fine to me as well. Most of improvement I measured was gained through avoiding global locking, not the lazy registration anyway.

@graalvmbot graalvmbot merged commit 8cd35ad into oracle:master Apr 24, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
native-image native-image-jfr OCA Verified All contributors have signed the Oracle Contributor Agreement. redhat-interest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants