Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

openapi: Add masks field to /v1/compile results. #121

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

philipaconrad
Copy link
Member

@philipaconrad philipaconrad commented Feb 7, 2025

What Changed?

This PR adds an object property to each /v1/compile return type, allowing us to include masking rules in the output for later consumption by downstream tooling.

What I have in mind for masking rules looks roughly like:

{
  "masks": {
    "tickets.assignee": { "replace": ["***"] },
    "tickets.customer.id": { "md5hash": [] },
    "tickets.customer.phone": { "lastN": [4] }
  }
}

Where the key is the column name, and the value is an object of the form {"mask_function": [args...]}

Open Questions

  • Do we want to allow providing masks as an HTTP request argument? (Ex: add a new key like options.masks)
  • Do we want to allow shorthand syntaxes? (Ex: "tickets.assignee": "***" would indicate a replacement with the constant string ***)
  • This design makes a number of assumptions. Are there any obvious edge cases that this design would be poor for handling?

Signed-off-by: Philip Conrad <philip@chariot-chaser.net>
@philipaconrad philipaconrad added the enhancement New feature or request label Feb 7, 2025
@philipaconrad philipaconrad self-assigned this Feb 7, 2025
@srenatus
Copy link
Member

Do we want to allow shorthand syntaxes? (Ex: "tickets.assignee": "***" would indicate a replacement with the constant string ***)

I had been thinking about something related reading your PR message -- I'm on the fence about encoding masking as function+args: it suggests an openness to the system that we perhaps don't need. Unless of course, want to support exactly that, but I think that's a product decision; we could make both work. I think the simpler, more static approach would be to declare certain well-known mask options, and encode them more directly:

{
  "masks": {
    "tickets.assignee": { "replace": "***" },
    "tickets.customer.id": { "md5hash": {} },
    "tickets.customer.phone": { "lastN": 4 }
  }
}

(💭 Unrelated: Do we support three-part refs for masks? What column of which table does tickets.customer.id refer to?)

@philipaconrad
Copy link
Member Author

philipaconrad commented Feb 11, 2025

Do we support three-part refs for masks? What column of which table does tickets.customer.id refer to?

I hadn't considered this problem because up until just recently, we were not making hard assumptions about what a.b refs were implying. 😅

In the LINQ/EF Core use case, 3-part refs usually imply a JOIN being used to look up a field, like jumping from a Ticket, to the associated Customer object. In an ORM context, this makes sense, elsewhere, less so.

For the masking use case, I'd assume that the LINQ+UCAST library would have to do one of the following:

  • Just ignore fields that require JOINs. (Easy, but leaves pitfalls for users)
  • Warn on JOIN fields? (Still doesn't save users from pitfalls)
  • Actually try to hide the JOIN'd field result, by either decorator, or some kind of overlay "MaskedX" class? (I'm struggling with this actively in my prototypes right now; not sure if it's entirely worth the hassle. It's trivial to mangle fields in-place when it's a plain old datatype struct, much more tricky when it's a "smarter" object.)

What makes this tricky/hard is that in the tickets.customer.id case, the field-to-be-masked actually does not live in the Tickets table at all-- the mask would need to apply for the Customer object that is fetched to display the ID.

@philipaconrad
Copy link
Member Author

philipaconrad commented Feb 11, 2025

I had been thinking about something related reading your PR message -- I'm on the fence about encoding masking as function+args: it suggests an openness to the system that we perhaps don't need.

I agree with this opinion-- I'd been mulling over the initial sketch, and realized pretty quickly that unconstrained lists of args are going to be a pain in strongly-typed languages, like C# and Go. Having a more fixed structure as you suggested, with known key-value pairs, should be much easier to scale across languages.

(EDIT: I've gone and updated the spec to follow a stronger, more locked-down format, as discussed.)

Signed-off-by: Philip Conrad <philip@chariot-chaser.net>
@srenatus
Copy link
Member

I hadn't considered this problem because up until just recently, we were not making hard assumptions about what a.b refs were implying. 😅

In the LINQ/EF Core use case, 3-part refs usually imply a JOIN being used to look up a field, like jumping from a Ticket, to the associated Customer object. In an ORM context, this makes sense, elsewhere, less so.

I suppose I hadn't thought about anything not "a.b" until kurt wanted "a" 😅 So "a.b.c" hadn't crossed my mind, and I don't know if, as a consequence, the compile machinery will let that pass. I think an action item here would be to put some LINQ-specific test cases into the EOPA sources. Perhaps we even need to make pkg/compile/check.go adjust its checks (for unknown ref length, specifically) based on the supported features. I.e. if you only ask for ucast.linq, that'll be OK, if you ask for sql.postgresql and ucast.linq, it won't be -- none of the other targets support "a.b.c" refs.

Signed-off-by: Philip Conrad <philip@chariot-chaser.net>
@philipaconrad philipaconrad force-pushed the philip/add-masking-support branch from 378262a to b79cba1 Compare February 13, 2025 21:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants