Skip to content

Allow splitting of Truffle::CExt.set_mark_list_on_object. #3541

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

nirvdrum
Copy link
Collaborator

While looking at performance of a production Rails application, I saw a lot of splitting attributed to the wrappers that TruffleRuby creates for rb_define_method. Ultimately, I think this boils down to WriteObjectFieldNode, which is used by set_mark_list_on_object, reporting polymorphism. I had traced the problematic node back to a usage of NIO::Selector within Puma, where an object is apparently shared across threads causing the WriteObjectFieldNode instance to become polymorphic.

I do not have a simple reproduction. I tried to use the nio4r example code to induce the behavior, but to adequately handle the multi-threaded part I found I was basically rewriting Puma's reactors. Instead, I deployed this change out to my production environment and observed that the repeated splitting of simple methods like Integer#== ceased.

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Apr 17, 2024
@nirvdrum nirvdrum added performance shopify Pull requests from Shopify labels Apr 17, 2024
@nirvdrum nirvdrum force-pushed the split-set_mark_list_on_object branch from 7a3b93f to 5760ce8 Compare April 17, 2024 05:29
@@ -1887,6 +1887,7 @@ Object addToMarkList(Object guardedObject,
}

@CoreMethod(names = "set_mark_list_on_object", onSingleton = true, required = 1)
@ReportPolymorphism
Copy link
Member

@eregon eregon Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't have any effect because there is a single specialization instance possible for this node, so no way to trigger multiple and Node#reportPolymorphicSpecialize().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirmed that there is no code generated related to the annotation with this change (I used graal master, but I don't think that makes a difference).

Copy link
Collaborator Author

@nirvdrum nirvdrum Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you make WriteObjectFieldNode go polymorphic? I didn't see the extra splits until then. I may very well have it wrong, but I saw a bunch of Integer splits from the argc checks in cext_ruby go away after I made this change. But, I'm misattributing the changes, we can close this out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you make WriteObjectFieldNode go polymorphic?

I didn't try, no, I just looked at whether this annotation changed anything to the generated code and it looked like no, so should have no effect. It's a case Truffle should warn for, I filed an internal issue for that.

The Integer splits sounds strange, maybe something is redefining these Integer methods, even if it installs back the original ones after it will still invalidate the Assumptions permanently.
Could also be some some_integer == some_nonumeric_object somewhere, and that would end up calling some_nonumeric_object == some_integer which could cause splitting.
But for these Integer#{==,>=} calls they should always use Inlined*Node, unless these methods are redefined at some point, and these inline nodes like InlinedEqualNode do not even use IntegerNode.EqualNode and both sides should be clearly int in that code.

@eregon
Copy link
Member

eregon commented Apr 17, 2024

The call chain is

  • rb_define_method's block to define a Ruby method for a C function
  • Primitive.call_with_c_mutex_and_frame_and_unwrap, which uses RunMarkOnExitNode
  • run_marker
  • set_mark_list_on_object

It feels suboptimal to store the mark_list as a DynamicObject property, because I guess we can see many objects with different classes there, but maybe they have few different Shapes? (I need to check)
We used to have DataHolder which could have been a good candidate to store that list, but that's gone since d8d3daf and replaced by storing that state directly in a calloc'd struct RTypedData* to improve performance and simplify.

Maybe we could store it on the ValueWrapper? I think that should be fine since this a RubyDynamicObject so there is a guaranteed 1-1 mapping between both. OTOH it would make all wrappers bigger when only rdata/rtypedata and NativeArrayStorage arrays need this.

Or maybe we could use some special RubyDynamicObject subclass for RData/RTypedData and then we could store it there as a plain Java field. But we need to handle RArray too somehow.
From looking at rb_data_object_wrap/rb_data_typed_object_wrap in cext.rb they use __layout_allocate__ which is not affected by rb_define_alloc_func, so IOW we should have total control over the representation there and be able to use a specific RubyDynamicObject subclass.

One other thing we could do is use a variant of WriteObjectFieldNode which doesn't report polymorphism, but it's kind of hiding the problem we have polymorphism and ideally we wouldn't. Or stop reporting polymorphism in WriteObjectFieldNode in general, but IIRC that hurts performance on some benchmarks at least.

@nirvdrum nirvdrum force-pushed the split-set_mark_list_on_object branch from 5760ce8 to 0cc549e Compare April 17, 2024 13:49
@nirvdrum nirvdrum marked this pull request as draft April 17, 2024 15:03
@nirvdrum
Copy link
Collaborator Author

I took a look at the generated CExtNodesFactory file before and after this PR and the file contents do not change. The difference I was seeing must be related to something else. The only deployment difference was the addition of @ReportPolymorphism, so I'll have to try to figure out what the difference was. Fortunately, I can roll back to previous container images -- it's just a slow process.

@nirvdrum nirvdrum closed this Apr 17, 2024
@nirvdrum nirvdrum deleted the split-set_mark_list_on_object branch April 17, 2024 15:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCA Verified All contributors have signed the Oracle Contributor Agreement. performance shopify Pull requests from Shopify
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants