Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement ->join method #7165

Open
wants to merge 2 commits into
base: dev
Choose a base branch
from
Open

implement ->join method #7165

wants to merge 2 commits into from

Conversation

lennyburdette
Copy link
Contributor


Checklist

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

  • Changes are compatible1
  • Documentation2 completed
  • Performance impact assessed and acceptable
  • Tests added and passing3
    • Unit Tests
    • Integration Tests
    • Manual Tests

Exceptions

Note any exceptions here

Notes

Footnotes

  1. It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.

  2. Configuration is an important part of many changes. Where applicable please try to document configuration examples.

  3. Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

@lennyburdette lennyburdette requested a review from a team as a code owner April 2, 2025 19:14
@svc-apollo-docs
Copy link
Collaborator

svc-apollo-docs commented Apr 2, 2025

✅ Docs preview has no changes

The preview was not built because there were no changes.

Build ID: 3fe95e1dcde29bb128129c8a

Copy link
Contributor

github-actions bot commented Apr 2, 2025

@lennyburdette, please consider creating a changeset entry in /.changesets/. These instructions describe the process and tooling.

@router-perf
Copy link

router-perf bot commented Apr 2, 2025

CI performance tests

  • connectors-const - Connectors stress test that runs with a constant number of users
  • const - Basic stress test that runs with a constant number of users
  • demand-control-instrumented - A copy of the step test, but with demand control monitoring and metrics enabled
  • demand-control-uninstrumented - A copy of the step test, but with demand control monitoring enabled
  • enhanced-signature - Enhanced signature enabled
  • events - Stress test for events with a lot of users and deduplication ENABLED
  • events_big_cap_high_rate - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity
  • events_big_cap_high_rate_callback - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity using callback mode
  • events_callback - Stress test for events with a lot of users and deduplication ENABLED in callback mode
  • events_without_dedup - Stress test for events with a lot of users and deduplication DISABLED
  • events_without_dedup_callback - Stress test for events with a lot of users and deduplication DISABLED using callback mode
  • extended-reference-mode - Extended reference mode enabled
  • large-request - Stress test with a 1 MB request payload
  • no-tracing - Basic stress test, no tracing
  • reload - Reload test over a long period of time at a constant rate of users
  • step-jemalloc-tuning - Clone of the basic stress test for jemalloc tuning
  • step-local-metrics - Field stats that are generated from the router rather than FTV1
  • step-with-prometheus - A copy of the step test with the Prometheus metrics exporter enabled
  • step - Basic stress test that steps up the number of users over time
  • xlarge-request - Stress test with 10 MB request payload
  • xxlarge-request - Stress test with 100 MB request payload

JSON::Bool(b) => b.then_some("true").unwrap_or("false").to_string(),
JSON::Number(number) => number.to_string(),
JSON::String(byte_string) => byte_string.as_str().to_string(),
JSON::Null | JSON::Array(_) | JSON::Object(_) => "".to_string(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should append to warnings, so it's clear why this didn't work. Otherwise, users get an empty string with no indication of what happened.

We may also want to return None like other methods so processing stops?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The null literal should stringify to null. Remember that null is often data, not always an indication of missing data.

Regarding arrays and objects, I thought we discussed stringifying them in the only useful way (as JSON) but also generating an error? Then again, what problem would that error be flagging, exactly? What is the problem with stringifying arrays and objects as JSON, given that you can always stringify them differently by hand before calling ->join(...)?

Copy link
Contributor Author

@lennyburdette lennyburdette Apr 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so one reason i haven't pushed on review or merge for this is that i really want to get validation working for input and argument shapes. then we'll have these scenarios for the input:

  1. it's a scalar or list of scalars, passing validation and the runtime works as expected
  2. it's an object or list of lists/objects, so it doesn't pass validation and we can't get to the runtime
  3. it's an unknown or named input, so it passes validation and the runtime emits empty strings. i agree with Dylan that we should emit warnings — i'll add that.

I don't think we should have an implicit array->toString or object->toString at this point. we can add it later, but i'd rather emit nothing (with warnings) than something unexpected. Dylan's original design doc (that I had forgotten about when we chatted in slack) also specifies this behavior. Part of my motivation is that if we do introduce backticked string {templates}, I don't think we should be jsonifying array and objects there either (instead requiring an explicit stringification like ->join).

For null being "null", I'm trying to decide which of these is better:

  1. null results "", so if you want null, you'll need something like:
    $->map(@->match([null, "null"], [@, @]))->join(",")
    
  2. null results in null, so if you want "", you'll use
    $->map(@->match([null, ""], [@, @]))->join(",")`
    

I'm leaning towards #1 — it's still data in that something is emitted (and the separator is used), but it's less likely to be unwanted junk. I don't expect users to be reading comma-delimited lists as json-encoded values, so "null" isn't useful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like mapping null to "null" too. If you see an unwanted null in your string, there shouldn't be much mystery about why it's there, and you have a number of tools to deal with it.

Once we have multi-arg $(...) and optional chaining with ?, an even shorter way to replace null with "" would be:

$->map($(@?, ""))->join(",")

input_path.to_vec(),
method_name.range(),
));
&EMPTY_STRING
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this return None at this point instead of continuing with an empty string? That's what happens when most methods fail, I think, so processing doesn't continue down the line with an unexpected value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to cite other programming languages as precedent for join defaulting to , for the separator string, but it turns out that's only JavaScript. Ruby and Perl default to an empty string (which makes more sense than , IMO), and many other languages require a separator.

Given that we probably don't want people struggling with the distinction between ->join and ->join(), I'd vote we require the argument and return None here to avoid leading anyone to think "Ah, I got a String back, so it must be the string I wanted."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I think any default here might be confusing, though documentation obviously helps that. I think I'd also lean on the side of requiring the separator as an argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validation will require the separator. you won't be able to compose if you call ->join. this runtime code is just for the case when the argument shape is unknown/named. i'm cool with returning None

#[case(json!([1, 2, 3]), "|", json!("1|2|3"))]
#[case(json!([1.00000000000001, 2.9999999999999, 0.3]), "|", json!("1.00000000000001|2.9999999999999|0.3"))]
#[case(json!([true, false]), " and ", json!("true and false"))]
#[case(json!([null, "a", 1]), ", ", json!(", a, 1"))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems logical for people to want to omit nulls—as I think is the case with the RFC 6570 stuff (gotta double check). Probably makes sense for this to require an explicit ->filter, though, rather than make it an implicit behavior here.

Copy link
Member

@benjamn benjamn Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree that null should be omitted by default, and strongly feel it should stringify to null.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think filtering out nulls implicitly is a confusing behaviour. I'd much rather require them use filter if they want to remove them.

It would appear that most languages, including javascript, output an empty string for null values.

[null, 'hello', null, 'hello2'].join(',')
',hello,,hello2'

Copy link
Member

@benjamn benjamn Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, you're right about JS (see docs). That behavior seems so arbitrary to me, but maybe there's a reason?

If the argument is that you sometimes need to do some conditional logic within a single array element expression that may result in an empty string, I would much rather developers actually use the empty string for that than null or None.

Since JS supports "sparse" arrays where not all elements are occupied, and JSON does not, I wonder if another reason is that the holes in a sparse array get filled in with null when you JSON-encode the array, so it the designers of Array.prototype.join were trying to make sure those instances of null still behaved like undefined (which turns into an empty string). I don't think that applies to us though, since we're starting with JSON, so none of our arrays can be sparse (nor can they contain None elements, I believe). However, if you construct or map an array where any of the elements evaluates to None (due to an error, say), we fill that position with null to make sure the rest of the indices are not shifted. So maybe we do have a version of this problem.

Not sure where this leaves me. For now I've crossed out the strongly in my comment above.

#[case(json!(1), ", ", json!("1"))]
#[case(json!("a"), ", ", json!("a"))]
#[case(json!(true), ", ", json!("true"))]
#[case(json!(null), ", ", json!(""))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need tests for objects, arrays of objects, and multi-dimensional arrays as well. Probably should wait until we decide on whether they produce errors and/or None, though.

Comment on lines +104 to +136
if !(input_shape.is_unknown() || matches!(input_shape.case(), ShapeCase::Name(_, _))) {
let mismatches = input_shape_contract.validate(&input_shape);
if !mismatches.is_empty() {
return Shape::error(
format!(
"Method ->{} requires an array of scalars values as input",
method_name.as_ref()
),
[],
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we give any idea of what the mismatches were with this error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question! i'm not sure what to do with this. i'm pretty sure it's giving me one mismatch per shape in the One<>, so i get:

expected: String, received: List<List<String>>
expected: Float, received: List<List<String>>
expected: Bool, received: List<List<String>>
expected: null, received: List<List<String>>
expected: String, received: List<String>
expected: Float, received: List<String>
expected: Bool, received: List<String>
expected: null, received: List<String>
expected: One<String, Float, Bool, null>, received: List<String>
expected: One<String, Float, Bool, null, List<One<String, Float, Bool, null>>>, received: List<List<String>>

that's a lot, what should we do with this?

Copy link
Member

@benjamn benjamn Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An idea I've been considering is to allow mismatches to group/capture other mismatches, forming a kind of hierarchy/tree. In this case the very last error is the one you probably want, so I'm imagining it could encapsulate a set of contributing mismatches that you could dig into to find out more, but they wouldn't have to be flattened into the top-level list of mismatches as they are above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants