Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Experiment with allowing standard schemas to validate endpoint values #4864

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

EskiMojo14
Copy link
Collaborator

@EskiMojo14 EskiMojo14 commented Feb 26, 2025

Type-wise, schemas are required to match the type of the value they're validating - they can't change a string to a number for example. Transformations that preserve the correct type are allowed, however.

Schemas can also be used as a source of inference, for example

build.query({
  query: ({ id }) => `posts/${id}`,
  argSchema: v.object({ id: v.number() }),
  responseSchema: postSchema
})
// or more likely in a TS app, since arg will be checked statically
build.query({
  query: ({ id }: { id: number }) => `posts/${id}`,
  responseSchema: postSchema
})

Schema parse failures are treated as unhandled errors, i.e. they're thrown and serialized rather than being returned. This is mainly because we have no idea what the base query error shape should look like, so we can't match it.

For transformable values, raw<value>Schema checks the value before transformation, and the <value>Schema checks the possibly transformed value.

build.query({
  query: (_arg: void) => `posts`,
  rawResponseSchema: v.array(postSchema),
  // raw response is inferred from schema
  transformResponse: (posts) =>
    postAdapter.getInitialState(undefined, posts),
  responseSchema: v.object({
    ids: v.array(v.number()),
    entities: v.record(
      v.pipe(v.string(), v.transform(Number), v.number()),
      postSchema,
    ),
  }),
})

@EskiMojo14 EskiMojo14 added the enhancement New feature or request label Feb 26, 2025
Copy link

codesandbox bot commented Feb 26, 2025

Review or Edit in CodeSandbox

Open the branch in Web EditorVS CodeInsiders

Open Preview

Copy link

codesandbox-ci bot commented Feb 26, 2025

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

Latest deployment of this branch, based on commit b53b271:

Sandbox Source
@examples-query-react/basic Configuration
@examples-query-react/advanced Configuration
@examples-action-listener/counter Configuration
rtk-esm-cra Configuration

Copy link

github-actions bot commented Feb 26, 2025

size-limit report 📦

Path Size
1. entry point: @reduxjs/toolkit/query (modern.mjs) 3.68 KB (+0.08% 🔺)
1. entry point: @reduxjs/toolkit/query/react (modern.mjs) 14.71 KB (+2.02% 🔺)
2. entry point: @reduxjs/toolkit/query (without dependencies) (modern.mjs) 110 B (+2.81% 🔺)
1. entry point: @reduxjs/toolkit/query (cjs, production.min.cjs) 23.65 KB (+1.68% 🔺)
1. entry point: @reduxjs/toolkit/query/react (cjs, production.min.cjs) 26.03 KB (+1.52% 🔺)
2. entry point: @reduxjs/toolkit/query (without dependencies) (cjs, production.min.cjs) 10.53 KB (+4.82% 🔺)
3. createApi (.modern.mjs) 15.11 KB (+2.16% 🔺)
3. createApi (react) (.modern.mjs) 17.14 KB (+1.81% 🔺)
3. fetchBaseQuery (.modern.mjs) 4.63 KB (+0.03% 🔺)

Copy link

netlify bot commented Feb 26, 2025

Deploy Preview for redux-starter-kit-docs ready!

Name Link
🔨 Latest commit b53b271
🔍 Latest deploy log https://app.netlify.com/sites/redux-starter-kit-docs/deploys/67bf7be247a7f20008623586
😎 Deploy Preview https://deploy-preview-4864--redux-starter-kit-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@agusterodin
Copy link

agusterodin commented Feb 26, 2025

FWIW, I really like where your head is at in the second example you proposed.

build.query({
  query: ({ id }: { id: number }) => `posts/${id}`,
  resultSchema: postSchema
})

Input/request validation is probably out of scope for a data fetching solution anyway. If inaccurate types are an issue in someone's application, it should probably be addressed at the source of the problem, not as a last ditch effort right before request is placed. The CPU overhead of validating all inputs of all requests probably isn’t great either, especially if this is occurring on main thread.

I definitely prefer the terminology responseSchema btw, as it feels more intuitive, but I understand that RTKQ can be applied to async state management use-cases beyond just simple HTTP requests.

@EskiMojo14
Copy link
Collaborator Author

true - responseSchema would be more consistent with transformResponse. I'll switch it 🙂

@agusterodin
Copy link

Here is link to my initial idea pitch just so there is a paper trail:
https://x.com/agusterodin/status/1894623542471528888

Will respond directly to this PR for now on for brainstorming.

@agusterodin
Copy link

agusterodin commented Feb 26, 2025

It might be overkill, but i'm really enticed by the idea of performing schema validation in a web worker (off main thread). I have seen brief main thread lag when validating massive payloads from server (10MB+). One of the projects I work on professionally is a Tableau-esque data viewer where user can view and interact with up to 500k geospatial results.

One of my transformResponse functions was bogging the main thread down, so I experimented moving that transformation into a web worker. I ultimately decided not to go through with this, particularly because the type-safety situation was less than ideal.

Attached below is my rough attempt. The web worker was usable via promise.

Usage inside API slice

transformResponse: async (response: DashboardDataTableResult[]) => {
  return await transformDataTableResponseOffMainThread(response)
}

workerPromise.ts

import { DashboardDataTableResult } from 'datamodels'

export type TableRowByIdDictionary = Record<string, DashboardDataTableResult>

export function transformDataTableResponseOffMainThread(tableResults: DashboardDataTableResult[]): Promise<TableRowByIdDictionary> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(new URL('./transformWorker.ts', import.meta.url))

    worker.onmessage = (event: MessageEvent<TableRowByIdDictionary>) => {
      resolve(event.data)
      worker.terminate()
    }

    worker.onerror = error => {
      reject(error)
      worker.terminate()
    }

    worker.postMessage(tableResults)
  })
}

workerLogic.ts

import { keyBy } from 'lodash'
import { DashboardDataTableResult } from 'datamodels'

self.onmessage = function (event: MessageEvent<DashboardDataTableResult[]>) {
  const data = event.data
  const transformedData = transformData(data)
  self.postMessage(transformedData)
}

function transformData(data: DashboardDataTableResult[]) {
  const chan: Partial<DashboardDataTableResult>[] = new Array(5000000).fill(0).map((_, i) => ({
    id: i,
    ipv4: 'zcxv',
    ipv6: '342342'
  }))
  return keyBy(chan, 'id')
}

Getting web worker to work with a variety of build tooling could be a pain though, particularly this kind of thing.
const worker = new Worker(new URL('./transformWorker.ts', import.meta.url)). It works well in Next.js, but I have no idea if other bundlers handle it the same way.

My example above is obviously for a different use case (transforming a response instead of validating a payload), but I figure is worth sharing.

@agusterodin
Copy link

agusterodin commented Feb 26, 2025

As for question about whether validation would occur before transformResponse or after:

"I would imagine responseSchema validating what data came back from the server pre-transformResponse.

If a transformResponse function is provided, perhaps the RTKQ endpoint will use the return type of their provided transformResponse function.

If no transformResponse function is provided, the RTKQ endpoint return type will be whatever was provided as responseSchema."

If rawReponseSchema for pre-tranformResponse is the route you end up going, it would be cool if responseSchema would be optional in if rawResponseSchema is provided.

@EskiMojo14
Copy link
Collaborator Author

EskiMojo14 commented Feb 26, 2025

It might be overkill
<snip>
My example above is obviously for a different use case (transforming a response instead of validating a payload), but I figure is worth sharing.

Definitely not something that makes sense for us to look into, but we would just accept any standard schema compliant value, so you're welcome to make your own:

declare function validateInWorker(input: unknown): Promise<StandardSchemaV1.Result<string>>

const workerSchema: StandardSchemaV1<string> = {
  '~standard': {
    version: 1,
    vendor: "custom",
    validate: validateInWorker
  }
}

😄

@EskiMojo14
Copy link
Collaborator Author

If rawReponseSchema for pre-tranformResponse is the route you end up going, it would be cool if responseSchema would be optional in if rawResponseSchema is provided.

All schemas are optional - the difficult thing with responseSchema vs rawResponseSchema is which one should be used for inference/checked against the provided type parameter 🤔

Currently I have it as rawResponseSchema is always StandardSchemaV1<BaseQueryResult<BaseQuery>>, and responseSchema is always StandardSchemaV1<ResultType>, which matches transformResponse's (res: BaseQueryResult<BaseQuery) => ResultType.

The difficulty with that is that rawResponseSchema cannot be used as a source of inference, so the below would not infer properly:

build.query({
  query: ({ id }: { id: number }) => `posts/${id}`,
  rawResponseSchema: postSchema
})

however, this doesn't seem like a huge issue to me, as responseSchema will be used even if transformResponse isn't provided (it still happens, it just defaults to an identity function).

an ideal state of affairs would be

build.query({
  query: ({ id }: { id: number }) => `posts/${id}`,
  rawResponseSchema: postSchema,
  transformResponse: (res /* inferred from rawResponseSchema */) => transformed // provides final data type
})
// or no transformResponse
build.query({
  query: ({ id }: { id: number }) => `posts/${id}`,
  responseSchema: postSchema,
})

but this isn't currently possible (the inference part at least). The closest you'd get is manually annotating

build.query({
  query: ({ id }: { id: number }) => `posts/${id}`,
  rawResponseSchema: postSchema,
  transformResponse: (res: Infer<typeof postSchema>) => transformed // provides final data type
})

@EskiMojo14
Copy link
Collaborator Author

cool, that wasn't actually too bad 😄
so basically:

// without transformResponse
build.query({
  query,
  responseSchema
})
// with transformResponse
build.query({
  query,
  rawResponseSchema,
  transformResponse
})

We could use a union to enforce this, but honestly the types are complicated enough and i think some people may still want to use both.

@agusterodin
Copy link

If rawReponseSchema for pre-tranformResponse is the route you end up going, it would be cool if responseSchema would be optional in if rawResponseSchema is provided.

All schemas are optional - the difficult thing with responseSchema vs rawResponseSchema is which one should be used for inference/checked against the provided type parameter 🤔

Currently I have it as rawResponseSchema is always StandardSchemaV1<BaseQueryResult<BaseQuery>>, and responseSchema is always StandardSchemaV1<ResultType>, which matches transformResponse's (res: BaseQueryResult<BaseQuery) => ResultType.

The difficulty with that is that rawResponseSchema cannot be used as a source of inference, so the below would not infer properly:

build.query({
  query: ({ id }: { id: number }) => `posts/${id}`,
  rawResponseSchema: postSchema
})

however, this doesn't seem like a huge issue to me, as responseSchema will be used even if transformResponse isn't provided (it still happens, it just defaults to an identity function).

an ideal state of affairs would be

build.query({
  query: ({ id }: { id: number }) => `posts/${id}`,
  rawResponseSchema: postSchema,
  transformResponse: (res /* inferred from rawResponseSchema */) => transformed // provides final data type
})
// or no transformResponse
build.query({
  query: ({ id }: { id: number }) => `posts/${id}`,
  responseSchema: postSchema,
})

but this isn't currently possible (the inference part at least). The closest you'd get is manually annotating

build.query({
  query: ({ id }: { id: number }) => `posts/${id}`,
  rawResponseSchema: postSchema,
  transformResponse: (res: Infer<typeof postSchema>) => transformed // provides final data type
})

Is it not possible due to limitations in standard schema specification, the way RTKQ is currently structured, or due to limitation in TypeScript language itself?

@EskiMojo14
Copy link
Collaborator Author

EskiMojo14 commented Feb 26, 2025

Is it not possible due to limitations in standard schema specification, the way RTKQ is currently structured, or due to limitation in TypeScript language itself?

Due to how it's currently structured - I've just pushed a change adding the inference :)

@agusterodin
Copy link

Have’t thought too deeply about it, but agreed about not using union.

Adding the ability to provide both rawResponseSchema and regular responseSchema may not a be common use-case, but probably wouldn't hurt to have.

@agusterodin
Copy link

agusterodin commented Feb 26, 2025

Ability to provide a global callback for situation where schema validation fails would be super valuable btw. Would allow for warning toast, logging to external service, etc.

Callback would contain information about the request such as endpoint name, request payload, etc.


Ability to enable/disable schema validation on global and per-endpoint basis may be nice too. Similar to how RTKQ has a global keepUnusedDataFor value, but have ability to override at the endpoint level.

@EskiMojo14
Copy link
Collaborator Author

EskiMojo14 commented Feb 26, 2025

Ability to provide a global callback for situation where schema validation fails would be super valuable btw. Would allow for warning toast, logging to external service, etc.

Callback would contain information about the request such as endpoint name, request payload, etc.

Added - callback parameters are onSchemaFailure(error, info) where error is an extended SchemaError with .value (the original value before parsing) and .schemaName (e.g. "argSchema") and info is an object like { endpoint: 'getPost', arg: 1, type: "query", queryCacheKey: "getPost(1)" }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants