Commit a5bf6b6

Cleaner/Faster Partial Reading (#1502)
* Cleaner/Faster Partial Reading
* Updating documentation
* non-null buffer support for partial reading with missing fields
* Update partial-read.md
* nested array partial read test
1 parent d901d56 commit a5bf6b6

9 files changed: +132 -129 lines changed

docs/partial-read.md

+64 -70
@@ -2,18 +2,17 @@

 At times it is not necessary to read the entire JSON document, but rather just a header or some of the initial fields. Glaze provides multiple solutions:

-- `partial_read` in `glz::opts`
 - `partial_read` in `glz::meta`
-- Partial reading via [JSON Pointer syntax](./json-pointer-syntax.md)
-
-> [!NOTE]
->
-> If you wish partial reading to work on nested objects, you must also turn on `.partial_read_nested = true` in `glz::opts`.
+- Partial reading via [JSON Pointer syntax](./json-pointer-syntax.md)

 # Partial reading with glz::opts

 `partial_read` is a compile time flag in `glz::opts` that indicates only existing array and object elements should be read into, and once the memory has been read, parsing returns without iterating through the rest of the document.

+> [!NOTE]
+>
+> Once any sub-object is read, the parsing will finish. This ensures high performance with short circuiting.
+
 > A [wrapper](./wrappers.md) by the same name also exists.

 Example: read only the first two elements into a `std::tuple`
@@ -47,7 +46,7 @@ expect(obj.size() == 1);
 expect(obj.at("2") = 2);
 ```

-Example: read only the fields present in a struct
+Example: read only the fields present in a struct and then short circuit the parse

 ```c++
 struct partial_struct
@@ -65,51 +64,6 @@ expect(obj.string == "ha!");
 expect(obj.integer == 400);
 ```

-# Partial reading with glz::meta
-
-Glaze allows a `partial_read` flag that can be set to `true` within the glaze metadata.
-
-```json
-{
-   "id": "187d3cb2-942d-484c-8271-4e2141bbadb1",
-   "type": "message_type"
-   .....
-   // the rest of the large JSON
-}
-```
-
-When `partial_read` is `true`, parsing will end once all the keys defined in the struct have been parsed.
-
-## partial_read_nested
-
-If your object that you wish to only read part of is nested within other objects, set `partial_read_nested = true` so that Glaze will properly parse the parent objects by skipping to the end of the partially read object.
-
-## Example
-
-```c++
-struct Header
-{
-   std::string id{};
-   std::string type{};
-};
-
-template <>
-struct glz::meta<Header>
-{
-   static constexpr auto partial_read = true;
-};
-```
-
-```c++
-Header h{};
-std::string buf = R"({"id":"51e2affb","type":"message_type","unknown key":"value"})";
-
-expect(glz::read_json(h, buf) == glz::error_code::none);
-// Glaze will stop reading after "type" is parsed
-expect(h.id == "51e2affb");
-expect(h.type == "message_type");
-```
-
 ## Unit Test Examples

 ```c++
@@ -119,10 +73,10 @@ struct Header
    std::string type{};
 };

-template <>
-struct glz::meta<Header>
+struct HeaderFlipped
 {
-   static constexpr auto partial_read = true;
+   std::string type{};
+   std::string id{};
 };

 struct NestedPartialRead
@@ -133,23 +87,25 @@ struct NestedPartialRead
 };

 suite partial_read_tests = [] {
-   using namespace boost::ut;
+   using namespace ut;
+
+   static constexpr glz::opts partial_read{.partial_read = true};

    "partial read"_test = [] {
       Header h{};
       std::string buf = R"({"id":"51e2affb","type":"message_type","unknown key":"value"})";

-      expect(glz::read_json(h, buf) == glz::error_code::none);
+      expect(!glz::read<partial_read>(h, buf));
       expect(h.id == "51e2affb");
       expect(h.type == "message_type");
    };

    "partial read 2"_test = [] {
       Header h{};
       // closing curly bracket is missing
-      std::string buf = R"({"id":"51e2affb","type":"message_type","unknown key":"value"})";
+      std::string buf = R"({"id":"51e2affb","type":"message_type","unknown key":"value")";

-      expect(glz::read_json(h, buf) == glz::error_code::none);
+      expect(!glz::read<partial_read>(h, buf));
       expect(h.id == "51e2affb");
       expect(h.type == "message_type");
    };
@@ -158,7 +114,7 @@ suite partial_read_tests = [] {
       Header h{};
       std::string buf = R"({"id":"51e2affb","unknown key":"value","type":"message_type"})";

-      expect(glz::read_json(h, buf) == glz::error_code::unknown_key);
+      expect(glz::read<partial_read>(h, buf) == glz::error_code::unknown_key);
       expect(h.id == "51e2affb");
       expect(h.type.empty());
    };
@@ -167,7 +123,16 @@ suite partial_read_tests = [] {
       Header h{};
       std::string buf = R"({"id":"51e2affb","unknown key":"value","type":"message_type"})";

-      expect(glz::read<glz::opts{.error_on_unknown_keys = false}>(h, buf) == glz::error_code::none);
+      expect(!glz::read<glz::opts{.error_on_unknown_keys = false, .partial_read = true}>(h, buf));
+      expect(h.id == "51e2affb");
+      expect(h.type == "message_type");
+   };
+
+   "partial read don't read garbage"_test = [] {
+      Header h{};
+      std::string buf = R"({"id":"51e2affb","unknown key":"value","type":"message_type"garbage})";
+
+      expect(!glz::read<glz::opts{.error_on_unknown_keys = false, .partial_read = true}>(h, buf));
       expect(h.id == "51e2affb");
       expect(h.type == "message_type");
    };
@@ -176,46 +141,75 @@ suite partial_read_tests = [] {
       Header h{};
       std::string buf = R"({"id":"51e2affb","unknown key":"value"})";

-      expect(glz::read<glz::opts{.error_on_unknown_keys = false}>(h, buf) != glz::error_code::missing_key);
+      expect(glz::read<glz::opts{.error_on_unknown_keys = false, .partial_read = true}>(h, buf) != glz::error_code::missing_key);
       expect(h.id == "51e2affb");
       expect(h.type.empty());
    };

    "partial read missing key 2"_test = [] {
       Header h{};
-      std::string buf = R"({"id":"51e2affb",""unknown key":"value"})";
+      std::string buf = R"({"id":"51e2affb","unknown key":"value"})";

-      expect(glz::read<glz::opts{.error_on_unknown_keys = false, .error_on_missing_keys = false}>(h, buf) !=
-             glz::error_code::none);
+      expect(!glz::read<glz::opts{.error_on_unknown_keys = false, .partial_read = true}>(h, buf));
       expect(h.id == "51e2affb");
       expect(h.type.empty());
    };
+
+   "partial read HeaderFlipped"_test = [] {
+      HeaderFlipped h{};
+      std::string buf = R"({"id":"51e2affb","type":"message_type","unknown key":"value"})";
+
+      expect(not glz::read<partial_read>(h, buf));
+      expect(h.id == "51e2affb");
+      expect(h.type == "message_type");
+   };
+
+   "partial read HeaderFlipped unknown key"_test = [] {
+      HeaderFlipped h{};
+      std::string buf = R"({"id":"51e2affb","unknown key":"value","type":"message_type"})";
+
+      expect(glz::read<partial_read>(h, buf) == glz::error_code::unknown_key);
+      expect(h.id == "51e2affb");
+      expect(h.type.empty());
+   };
+
+   "partial read unknown key 2 HeaderFlipped"_test = [] {
+      HeaderFlipped h{};
+      std::string buf = R"({"id":"51e2affb","unknown key":"value","type":"message_type","another_field":409845})";
+
+      expect(glz::read<glz::opts{.error_on_unknown_keys = false, .partial_read = true}>(h, buf) == glz::error_code::none);
+      expect(h.id == "51e2affb");
+      expect(h.type == "message_type");
+   };
 };

 suite nested_partial_read_tests = [] {
-   using namespace boost::ut;
+   using namespace ut;
+
+   static constexpr glz::opts partial_read{.partial_read = true};

    "nested object partial read"_test = [] {
       NestedPartialRead n{};
       std::string buf =
          R"({"method":"m1","header":{"id":"51e2affb","type":"message_type","unknown key":"value"},"number":51})";

-      expect(glz::read_json(n, buf) == glz::error_code::unknown_key);
+      expect(not glz::read<partial_read>(n, buf));
       expect(n.method == "m1");
       expect(n.header.id == "51e2affb");
       expect(n.header.type == "message_type");
       expect(n.number == 0);
    };

-   "nested object partial read 2"_test = [] {
+   "nested object partial read, don't read garbage"_test = [] {
       NestedPartialRead n{};
       std::string buf =
-         R"({"method":"m1","header":{"id":"51e2affb","type":"message_type","unknown key":"value"},"number":51})";
+         R"({"method":"m1","header":{"id":"51e2affb","type":"message_type","unknown key":"value",garbage},"number":51})";

-      expect(glz::read<glz::opts{.partial_read_nested = true}>(n, buf) == glz::error_code::none);
+      expect(not glz::read<partial_read>(n, buf));
       expect(n.method == "m1");
       expect(n.header.id == "51e2affb");
       expect(n.header.type == "message_type");
+      expect(n.number == 0);
    };
 };
 ```

include/glaze/beve/read.hpp

+3 -3
@@ -1290,7 +1290,7 @@ namespace glz
 }();

 decltype(auto) fields = [&]() -> decltype(auto) {
-   if constexpr (is_partial_read<T> || Opts.partial_read) {
+   if constexpr (Opts.partial_read) {
       return bit_array<N>{};
    }
    else {
@@ -1304,7 +1304,7 @@ namespace glz
 }

 for (size_t i = 0; i < n_keys; ++i) {
-   if constexpr (is_partial_read<T> || Opts.partial_read) {
+   if constexpr (Opts.partial_read) {
       if ((all_fields & fields) == all_fields) {
          return;
       }
@@ -1325,7 +1325,7 @@ namespace glz
 const auto index = decode_hash_with_size<BEVE, T, HashInfo, HashInfo.type>::op(it, end, n);

 if (index < N) [[likely]] {
-   if constexpr (is_partial_read<T> || Opts.partial_read) {
+   if constexpr (Opts.partial_read) {
       fields[index] = true;
    }
13311331

include/glaze/core/common.hpp

+2
@@ -587,6 +587,7 @@ struct glz::meta<glz::error_code>
 "send_error",
 "connection_failure",
 "end_reached",
+"partial_read_complete",
 "no_read_input",
 "data_must_be_null_terminated",
 "parse_number_failure",
@@ -650,6 +651,7 @@ struct glz::meta<glz::error_code>
 send_error, //
 connection_failure, //
 end_reached, // A non-error code for non-null terminated input buffers
+partial_read_complete,
 no_read_input, //
 data_must_be_null_terminated, //
 parse_number_failure, //

include/glaze/core/context.hpp

+1
@@ -26,6 +26,7 @@ namespace glz
 connection_failure,
 // Other errors
 end_reached, // A non-error code for non-null terminated input buffers
+partial_read_complete, // A non-error code for short circuiting partial reads
 no_read_input, //
 data_must_be_null_terminated, //
 parse_number_failure, //

include/glaze/core/meta.hpp

-3
@@ -317,7 +317,4 @@ namespace glz

    template <class T>
    concept custom_write = requires { requires meta<T>::custom_write == true; };
-
-   template <class T>
-   concept is_partial_read = requires { requires meta<T>::partial_read == true; };
 }

include/glaze/core/opts.hpp

-1
@@ -93,7 +93,6 @@ namespace glz
       false; // Reads into only existing fields and elements and then exits without parsing the rest of the input

    // glaze_object_t concepts
-   bool_t partial_read_nested = false; // Advance the partially read struct to the end of the struct
    bool_t concatenate = true; // Concatenates ranges of std::pair into single objects when writing

    bool_t hide_non_invocable =

include/glaze/core/read.hpp

+6 -4
@@ -83,10 +83,12 @@ namespace glz
    }

 finish:
-   if constexpr (Opts.partial_read_nested || Opts.partial_read) {
-      // We don't do depth validation for partial reading
-      // This end_reached condition is set for valid end points in parsing
-      if (ctx.error == error_code::end_reached) {
+   // We don't do depth validation for partial reading
+   if constexpr (Opts.partial_read) {
+      if (ctx.error == error_code::partial_read_complete) [[likely]] {
+         ctx.error = error_code::none;
+      }
+      else if (ctx.error == error_code::end_reached && ctx.indentation_level == 0) {
          ctx.error = error_code::none;
       }
    }
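
The `finish:` change above can be read as a small mapping from the internal parse state to the error the caller sees. Here is a minimal standalone sketch of that mapping, assuming C++17 or later. The `error_code` enum is a trimmed, hypothetical stand-in and `finish` is an invented free function, so this models only the branch shown in the diff and not the rest of Glaze's read path.

```cpp
#include <cstdint>

// Hypothetical, trimmed stand-in for glz::error_code; illustration only.
enum class error_code { none, end_reached, partial_read_complete, unknown_key };

// Models the `finish:` step: given the error recorded during parsing and the
// nesting depth at which parsing stopped, return the error the caller sees.
template <bool PartialRead>
constexpr error_code finish(error_code error, std::uint32_t indentation_level)
{
   if constexpr (PartialRead) {
      if (error == error_code::partial_read_complete) {
         return error_code::none; // the short circuit counts as success
      }
      if (error == error_code::end_reached && indentation_level == 0) {
         return error_code::none; // input ended with no open nesting
      }
   }
   return error; // everything else is passed through unchanged
}

// A short-circuited nested read is reported as success...
static_assert(finish<true>(error_code::partial_read_complete, 2) == error_code::none);
// ...while a genuine error such as an unknown key still surfaces.
static_assert(finish<true>(error_code::unknown_key, 0) == error_code::unknown_key);
```

Using a dedicated `partial_read_complete` code, rather than reusing `end_reached`, lets the sketch (and, per the diff, the library) distinguish an intentional early exit from running out of input inside a nested value.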
