Add support for max batch size for connectors #7274
Conversation
i think this warrants an integration test next to the existing batch test (which will require a hand-edited supergraph until we release composition changes). nice work tho!
@@ -8,7 +8,7 @@ pub mod expand;
mod header;
mod id;
mod json_selection;
mod models;
pub mod models;
let's selectively expose types as pub instead of marking the whole module as pub if possible
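For illustration, the selective-export shape could look something like this (BatchSettings is just a placeholder name, not necessarily the actual type in models):

// One possible shape: keep the module private and re-export only what callers need.
mod models;

pub use models::BatchSettings;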
Done!
"supplied 'max_size' field in `@connect` directive's `batch` field is not a positive integer" | ||
))?); | ||
// Convert the int to a usize since it is used for chunking an array later. | ||
// Much better to fail here than at run time. |
this code is part of the runtime!
Bad wording on my part 😅
I think what I meant here is during the request lifecycle? Aka... check this as early as we can (startup?) instead of it failing in the middle of a request.
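For illustration, the kind of up-front check I have in mind, roughly (the helper and error type here are stand-ins, not the router's actual API):

/// Hypothetical helper: reject a non-positive max_size while the connector
/// configuration is being expanded, so a bad value fails at startup instead of
/// in the middle of a request.
fn validate_max_size(raw: i64) -> Result<usize, String> {
    usize::try_from(raw)
        .ok()
        .filter(|size| *size > 0)
        .ok_or_else(|| {
            "supplied 'max_size' field in `@connect` directive's `batch` field is not a positive integer"
                .to_string()
        })
}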
Updated the comment... let me know if this makes sense 😅
@@ -55,6 +57,15 @@ pub(crate) struct SourceHTTPArguments {
pub(crate) headers: IndexMap<HeaderName, HeaderSource>,
}

/// Settings for the connector when it is doing a $batch entity resolver
#[cfg_attr(test, derive(Debug))]
pub(crate) struct SourceBatchArguments {
why is it called SourceBatchArguments?
🤦 Good call lol
// Because we may have multiple batch entities requests, we should add to ENTITIES as the requests come in so it is additive
let entities = data
    .entry(ENTITIES)
    .or_insert(Value::Array(Vec::with_capacity(count)));
this count is for the current chunk. i dunno if there's a way to get the original count of entity references here?
So... it seems like it's the count of....
let count = responses.len();
Which is calculated prior to looping through the responses. 😕 Weird.
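If we do want the original total, one option might be to capture it before chunking and use that instead (rough sketch; variable names are assumptions):

// Total number of entity references, taken before the batch is split into chunks.
let total_count = batch.len();

// ... chunk the batch and issue the requests ...

// Pre-allocate ENTITIES from the original total rather than the per-chunk
// responses.len(), so the capacity reflects every entity we expect to insert.
let entities = data
    .entry(ENTITIES)
    .or_insert(Value::Array(Vec::with_capacity(total_count)));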
// If we've got a max_size set, chunk the batch into smaller batches. Otherwise, we'll default to just a single batch.
let max_size = connector.batch_settings.as_ref().and_then(|bs| bs.max_size);
let batches = max_size.map_or(vec![batch.clone()], |size| {
vec![batch.clone()] seems like an unnecessary clone, but i can't solve it without rust-analyzer 😁
It seems unnecessary, but it complains that it can't be owned by both that line and the closure right below it... making it borrowed causes all kinds of other errors 😅 I couldn't figure out any alternatives but I'm open to suggestions!
(I tried asking ChatGPT and it said "If you must return an owned Vec, then cloning or moving is necessary." 😅)
let batches = if let Some(size) = n {
    v.chunks(size).map(|v| v.to_vec()).collect()
} else {
    vec![v]
};

i guess the borrow checker can't know that the closures passed to map_or are mutually exclusive. but an if/else definitely is!
Well, I suppose I could do it that way! That's actually what I originally started with, and it didn't feel rust-y, so I refactored it to what I have now 😆
ResponseKey::BatchEntity {
    selection: selection.clone(),
    inputs,
    keys: keys.clone(),
not for this PR, but we should rename this to key_field_set or something
In just this function, or on request and BatchEntity too?
yeah, everywhere. it's a very confusing name
Add support for max batch size for connectors
Checklist
Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.
Exceptions
Note any exceptions here
Notes
Footnotes
It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.
Configuration is an important part of many changes. Where applicable please try to document configuration examples.
Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.