
JSON input out of order causing InvalidInput for large json when using http.getStream() #2152


Open
xsorifc28 opened this issue Jan 25, 2025 · 12 comments
Labels
question v7 ArduinoJson 7

Comments

@xsorifc28

xsorifc28 commented Jan 25, 2025

I do not think this is a bug within ArduinoJson; I am seeking help from anyone with similar experience.

Note - I am able to parse the JSON by reading the stream char-by-char into an array (and deleting some code/functionality to have enough memory).

Describe the bug
When passing http.getStream() to deserializeJson() for a large payload, I get an InvalidInput error.

The JSON that I am parsing is ~88 KB, which I understand exceeds the maximum String size on the ESP32 (65 KB); I am using a XIAO ESP32C3.
Therefore, I tried http.getStream() so that the JSON isn't duplicated in memory.

I tried Serial.println(http.getStream().readString()); and the output fails validation: the complete JSON is there, but some characters are out of order (the same API called with cURL passes JSON validation).

Given that getStream().readString() is returning a bad string, this doesn't seem to be an ArduinoJson issue, but I wanted to ask here to confirm and see if there is a workaround.

Troubleshooter report
Here is the report generated by the ArduinoJson Troubleshooter:

  1. The program uses ArduinoJson 7
  2. The issue happens at run time
  3. The issue concerns deserialization
  4. deserializeJson() returns InvalidInput
  5. Input comes from a stream
  6. jsonlint says the document is valid (when using cURL; the data returned by http.getStream() is invalid)
  7. Adding a buffer doesn't solve the issue
  8. Input's first byte doesn't suggest a BOM

Environment
Here is the environment that I used:

  • Microcontroller: XIAO ESP32C3
  • Core/runtime: Arduino
  • IDE: Arduino IDE 2.3.4

Reproduction
Here is a small snippet that reproduces the issue.

  HTTPClient http;
  client.setInsecure();
  client.setTimeout(20000);
  http.useHTTP10(true);
  http.begin(client, url);
  http.GET();

  // Parse response
  JsonDocument filter;
  JsonObject filter_fixtures_0 = filter["fixtures"].add<JsonObject>();
  filter_fixtures_0["id"] = true;
  filter_fixtures_0["time"] = true;
  filter_fixtures_0["status"] = true;
  filter_fixtures_0["date"] = true;

  DynamicJsonDocument doc(32768);
  ReadBufferingStream bufferedStream(http.getStream(), 64);
  DeserializationError error = deserializeJson(doc, bufferedStream, DeserializationOption::Filter(filter));

  http.end();

  if (error) {
    Serial.println("JSON parse failed: " + String(error.c_str()));
    return false;
  }

Compiler output
n/a

Program output
JSON parse failed: IncompleteInput

@xsorifc28 xsorifc28 added the bug label Jan 25, 2025
@bblanchon
Owner

Hi @xsorifc28,

I suspect the server returns a chunk-encoded response, even if you asked for HTTP/1.0.
Please print the response headers.

Best regards,
Benoit
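
To illustrate the suspicion above: with chunked transfer encoding, the body is split into pieces, and each piece is preceded by a line giving its length in hexadecimal. If that framing reaches the JSON parser unchanged, the size lines land in the middle of the document and the input no longer validates. The following is a minimal sketch of the decoding step in plain C++; the function name `decodeChunked` is hypothetical and is not part of ArduinoJson, HTTPClient, or StreamUtils.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a library API): removes HTTP/1.1 chunked framing
// from `raw` and stores the payload in `out`.
// Returns false if the framing is malformed or truncated.
// Chunk extensions (";name=value") are ignored and trailers are discarded.
bool decodeChunked(const std::string& raw, std::string& out) {
  std::size_t pos = 0;
  out.clear();
  while (true) {
    // Each chunk starts with a size line terminated by CRLF.
    std::size_t eol = raw.find("\r\n", pos);
    if (eol == std::string::npos) return false;

    // Parse the hexadecimal chunk size.
    std::size_t size = 0;
    std::size_t i = pos;
    bool anyDigit = false;
    for (; i < eol; ++i) {
      char c = raw[i];
      int v;
      if (c >= '0' && c <= '9') v = c - '0';
      else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
      else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
      else break;
      size = size * 16 + static_cast<std::size_t>(v);
      anyDigit = true;
    }
    if (!anyDigit) return false;
    if (i < eol && raw[i] != ';') return false;  // only extensions may follow

    pos = eol + 2;
    if (size == 0) return true;  // a zero-size chunk ends the body

    // The chunk data must be followed by CRLF.
    if (raw.size() - pos < size + 2) return false;
    out.append(raw, pos, size);
    pos += size;
    if (raw.compare(pos, 2, "\r\n") != 0) return false;
    pos += 2;
  }
}
```

For example, the raw body `10\r\n{"fixtures":[{"i\r\n7\r\nd":1}]}\r\n0\r\n\r\n` carries the document `{"fixtures":[{"id":1}]}`; without decoding, the parser would see `10` and `7` mixed into the JSON text.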

@bblanchon bblanchon added question v7 ArduinoJson 7 and removed bug labels Jan 27, 2025
@xsorifc28
Author

xsorifc28 commented Jan 29, 2025

I don't think so; here are the response headers:

> GET <redacted> HTTP/1.0
> Host: <redacted>
> User-Agent: curl/8.7.1
> Accept: */*
> 
* Request completely sent off
< HTTP/1.1 200 OK
< Server: nginx/1.14.2
< Date: Wed, 29 Jan 2025 02:01:02 GMT
< Content-Type: application/json; charset=utf-8
< Content-Length: 88575
< Connection: close
< Vary: Accept-Encoding
< X-Powered-By: Express
< Cache-Control: public, max-age=30
< Vary: Accept-Encoding
< Strict-Transport-Security: max-age=63072000; includeSubdomains; preload
< X-Frame-Options: DENY
< X-Content-Type-Options: nosniff

@bblanchon
Owner

User-Agent: curl/8.7.1

You got this header list with Curl.
Could you print the headers using the Arduino code?

Headers might change from one response to the other; only the ones where deserializeJson() returns InvalidInput matter.

See also: How to use ChunkDecodingStream with HTTPClient

bblanchon added a commit to bblanchon/ArduinoJsonTroubleshooter that referenced this issue Jan 31, 2025
@xsorifc28
Author

User-Agent: curl/8.7.1

You got this header list with Curl. Could you print the headers using the Arduino code?

Headers might change from one response to the other; only the ones where deserializeJson() returns InvalidInput matter.

See also: How to use ChunkDecodingStream with HTTPClient

I have already tried the ChunkDecodingStream solution; even with this change, the same error occurs, regardless of whether there is enough free memory to fit the entire response.

I will get the headers from the Arduino code and post them in a follow-up comment; I'm curious to see if there is a difference.

@bblanchon
Owner

Any progress on this issue?

@xsorifc28
Author

xsorifc28 commented Feb 28, 2025

Edit: I also printed the Transfer-Encoding header, which is empty:

21:08:28.909 -> Transfer-Encoding: 

Here are the headers from the ESP32-C3:

21:02:55.953 -> Response Headers:
21:02:55.953 -> Server: nginx/1.14.2
21:02:55.953 -> Date: Fri, 28 Feb 2025 20:02:55 GMT
21:02:55.953 -> Content-Type: application/json; charset=utf-8
21:02:55.953 -> Content-Length: 91126
21:02:55.953 -> Connection: keep-alive
21:02:55.953 -> Vary: Accept-Encoding
21:02:55.953 -> X-Powered-By: Express
21:02:55.953 -> Cache-Control: public, max-age=30
21:02:55.953 -> Strict-Transport-Security: max-age=63072000; includeSubdomains; preload
21:02:55.953 -> X-Frame-Options: DENY
21:02:55.953 -> X-Content-Type-Options: nosniff

HTTP client code:

  HTTPClient http;

  client.setInsecure();

  String url = "redacted";

  const char* headerKeys[] = {
      "Server", 
      "Date", 
      "Content-Type", 
      "Content-Length", 
      "Connection", 
      "Vary", 
      "X-Powered-By", 
      "Cache-Control", 
      "Strict-Transport-Security", 
      "X-Frame-Options", 
      "X-Content-Type-Options"
  };
 
 const size_t numberOfHeaders = 11; // Match the number of headers in the array

  http.collectHeaders(headerKeys, numberOfHeaders);

  http.begin(client, url);

  int httpCode = http.GET();
  if (httpCode != HTTP_CODE_OK) {
      Serial.printf("Failed to make http request: %d\n", httpCode);
      http.end();
      client.stop();
      return false;
  }

  int headerCount = http.headers();
  Serial.println("Response Headers:");
  for (int i = 0; i < headerCount; i++) {
      String headerName = http.headerName(i);
      String headerValue = http.header(i);
      Serial.print(headerName + ": ");
      Serial.println(headerValue);
  }

@bblanchon
Owner

I'm surprised Transfer-Encoding is not in headerKeys.
Are you sure it was there when you printed the value?

The server is running an ancient version of Nginx.
Any chance you could update it?

@xsorifc28
Author

xsorifc28 commented Mar 1, 2025

I manually added Transfer-Encoding to headerKeys to ensure it prints, but it prints a null value. Both cURL and http client do not print a value for this header.

The server is not mine, just an API that I'm using, so unfortunately I can't update it.

My issue is that if I enable Wi-Fi + BLE on the ESP32-C3 and try to parse this ~90 KB JSON, I run out of memory. I'm able to get it to work using json-streaming-parser and thought that it should work with ArduinoJson in the same (streaming) way.

I'm not sure how much further we can debug it, but it's an interesting problem to solve.

@bblanchon
Owner

thought that it should work with ArduinoJson in the same (streaming) way

You mean you intended to use the deserialize in chunks technique?

@xsorifc28
Author

thought that it should work with ArduinoJson in the same (streaming) way

You mean you intended to use the deserialize in chunks technique?

No, I have not tried that 🤔

@bblanchon
Owner

I'm having a very hard time understanding your issue.

The title says

JSON input out of order causing IncompleteInput for large json when using http.getStream()

But then the original message says

When passing http.getStream() to deserialize a large payload, I am getting InvalidInput error.

And now you talk about memory issues.

So, are you getting IncompleteInput, InvalidInput, or NoMemory?

@xsorifc28 xsorifc28 changed the title JSON input out of order causing IncompleteInput for large json when using http.getStream() JSON input out of order causing InvalidInput for large json when using http.getStream() Mar 10, 2025
@xsorifc28
Author

I will attempt to reproduce this error on the device and post with more clarifying details.
