-
Notifications
You must be signed in to change notification settings - Fork 1.7k
[GR-57064] Lazy virtual thread JFR registration and epoch generation caching #9570
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GR-57064] Lazy virtual thread JFR registration and epoch generation caching #9570
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, I added a few comments.
* eagerly when started and at chunk rotations. | ||
*/ | ||
@Uninterruptible(reason = "Epoch should not change while checking generation.") | ||
private static void maybeRegisterVirtualThread(Thread thread) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this break once we leave uninterruptible code?
- we start in epoch 0
- thread A calls
getThreadId(...)
, which registers avthread
for epoch 0 inmaybeRegisterVirtualThread
- thread A leaves the uninterruptible code and gets blocked at a safepoint
- thread B changes the epoch from 0 to 1
- after the safepoint, thread A continues execution and uses the thread id that was returned by
getThreadId(...)
even though the vthread was only registered for epoch 0
One example where I think that this might happen: JavaMonitorQueuedSynchronizer.ConditionNode.notifierJfrTid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the situation you describe is actually ok. Once the epoch changes, new event emissions from "thread A" will fail the epoch generation check, and registration will be re-done. Since getThreadID()
is called for every event emission, the important thing is that gathering event info with getThreadID()
and the writing of the event data to the JFR buffer are done within the same block of uninterruptible code, which does indeed happen (ex. all emit0
methods).
Calls to getThreadID()
outside of the event emission path (ex. with JavaMonitorQueuedSynchronizer.ConditionNode.notifierJfrTid
) result in unnecessary registration checks, but shouldn't break anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so you essentially assume that getThreadID()
will be called again for thread A once the thread ID is actually used in a JFR event. However, I don't think that this assumption is valid. Here is one example:
- Thread B blocks at
ConditionNode
. - Thread A notifies thread B and sets
ConditionNode.notifierJfrTid
. - The JFR epoch changes.
- Once thread B wakes up, it uses the value of
ConditionNode.notifierJfrTid
(i.e., the ID of thread A) when emitting aJavaMonitorWaitEvent
. However, thread A is not registered for the current epoch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I see. I mistakenly thought we always used JfrNativeEventWriter.putThread(JfrNativeEventWriterData , Thread )
when dealing with threads. But in that case we use a the plain TID directly.
I think we'll have to switch to always using putThread
whenever dealing with the JFR Thread
type. What do you think of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside is that we'll have to remember to give careful consideration to emitting JFR Thread
data.
I'm not sure there's a better place to put the registration check though. And we'll need to check at some point in order to copy Hotspot and do lazy registration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just pushed a commit to update all the places we write thread IDs as JFR event data to use either putThread(Thread)
or putThread(long)
. This should ensure that threads are always registered if they get referenced from event data. The downside is that we need to remember to use putThread
.
Another unfortunate thing is that we can only do checks with putThread(Thread)
because with only the TID we have no way of knowing whether the thread is virtual and cannot access Target_java_lang_VirtualThread.jfrGeneration
.
@@ -43,6 +43,7 @@ public class JfrTraceIdEpoch { | |||
private static final long EPOCH_1_BIT = 0b10; | |||
|
|||
private boolean epoch; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
epoch
can probably be removed (i.e., can be computed from the least significant bit of epochGeneration
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, good idea. I'll remove that.
Truffle gate error seems to be unrelated |
Thanks! I did one pass over the PR and opened #11070. For now, I think it is better/safer if we still register all virtual threads eagerly. At the moment, we have a few JFR events where we only store the thread id. We would need to change all that code so that we keep more information around, which in my opinion is not worth the effort at the moment. |
Yes that sounds fine to me as well. Most of improvement I measured was gained through avoiding global locking, not the lazy registration anyway. |
Summary
In Hotspot, JFR registers virtual threads lazily when they emit an event. Currently, in SubstrateVM, we register virtual threads eagerly whenever they are mounted. This can be problematic if many virtual threads mount/unmount while not emitting many events. This is especially bad since SubstrateVM uses a global mutex to protect the "thread constant pool repository". This mutex must be acquired every time we register or even just check thread registration status. In Hotspot, there is no such global "thread constant pool repository", and no global synchronization -- instead a thread "checkpoint event" is written to thread local JFR buffers lazily when needed.
This PR does 2 things:
Target_java_lang_VirtualThread.java
object (similar toTarget_java_lang_Thread.jfrExcluded
. Before locking the thread repo mutex, first compare "epoch generation" in case the virtual thread is already registered. A very similar techniqie is used in Hotspot to avoid writing thread checkpoint events unless necessary.Results
Scenario 1: Many virtual thread mounts/unmounts while not emitting JFR events.
Used test app 1. The new lazy method takes ~1105ms to complete, while the old eager approach takes ~1140ms.
Scenario 2: Many JFR events emitted concurrently by many virtual threads.
Used test app 2. The registration with cached epoch generation checking takes ~42.600s to complete, while registration with global locking takes ~73.650s.
Related issue: #9536