Skip to content

Commit 092c1af

Browse files
authored
[Java/C++/C#] Fix a bug in overzealous DTO validation (#1073)
* [Java] Cover null values in PBTs. We had a recent bug report where the generated Java DTOs were not correctly handling null values for optional fields. It turns out the property-based tests were not encoding these values. Now, they do, and they fail with messages like this one: ``` java.lang.IllegalArgumentException: member1: value is out of allowed range: 255 at uk.co.real_logic.sbe.properties.TestMessageDto.validateMember1(TestMessageDto.java:33) at uk.co.real_logic.sbe.properties.TestMessageDto.member1(TestMessageDto.java:55) at uk.co.real_logic.sbe.properties.TestMessageDto.decodeWith(TestMessageDto.java:112) at uk.co.real_logic.sbe.properties.TestMessageDto.decodeFrom(TestMessageDto.java:135) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at uk.co.real_logic.sbe.properties.DtosPropertyTest.javaDtoEncodeShouldBeTheInverseOfDtoDecode(DtosPropertyTest.java:114) at java.base/java.lang.reflect.Method.invoke(Method.java:568) ``` * [Java] Support arbitrary char arrays in PBTs without data after null. I spotted a failure in the DTO round-trip test locally where the DTO's (and flyweight codec's) String-typed accessor would return `""` if the first character in an array was the `null` character. Therefore, it would not round-trip the "hidden bytes" after the `null`. In this commit, I've added a generation mode that disables the generation of strings with this format. Given the representation of char[N] in DTOs it would not be sensible to support round-tripping these "hidden bytes". * [Java,C++,C#] Improve DTO value validation. Previously, in Java, the generated validation methods would not permit the null value, even for fields with optional presence. Now, we do allow these correctly. In this commit, I've also attempted to clean up, centralise, and test the logic that decides when validation for a particular side of a range needs to be included. We omit the validation when the native type cannot represent values outside of the range. There are still some gaps around validation, e.g., we do not validate array values. * [Java] Fix JSON escaping of string characters, delimeters, etc. Previously, there were several problems: - (delimeters) single `char` values were output with single quotes, but this is not valid JSON. - (escaping) escaping was not implemented as per the specification in Section 7 of RFC 8259. For example, special-cased control codes, e.g., `\b` were encoded as `\\` followed by `\b` rather than `\\` followed by `b`. Also, the non-special-cased control characters were not encoded using JSON's mechanism for doing so, e.g., `\u0020`. - (numbers) Section 6 of the specification says "Numeric values that cannot be represented in the grammar below (such as Infinity and NaN) are not permitted." However, these were encoded as expressions of numbers with multiple terms, e.g., `-1/0` for positive infinity. While these are quite logical encodings of such "numbers", it is not valid JSON. Therefore, I have kept the expressions, but enclosed them within quotes. Also, in this commit: - Replaced custom compilation logic with Agrona's CompilerUtil. Note that `CompilerUtil#compileOnDisk` is broken. I attempted to use it to see if I could see classes in IntelliJ rather than just stack frames, but it fails with an NPE.
1 parent 13f9636 commit 092c1af

File tree

13 files changed

+959
-299
lines changed

13 files changed

+959
-299
lines changed
Lines changed: 224 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,224 @@
1+
/*
2+
* Copyright 2013-2025 Real Logic Limited.
3+
*
4+
* Licensed under the Apache License, Version 2.0 (the "License");
5+
* you may not use this file except in compliance with the License.
6+
* You may obtain a copy of the License at
7+
*
8+
* https://www.apache.org/licenses/LICENSE-2.0
9+
*
10+
* Unless required by applicable law or agreed to in writing, software
11+
* distributed under the License is distributed on an "AS IS" BASIS,
12+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
* See the License for the specific language governing permissions and
14+
* limitations under the License.
15+
*/
16+
package uk.co.real_logic.sbe.generation.common;
17+
18+
import uk.co.real_logic.sbe.PrimitiveType;
19+
import uk.co.real_logic.sbe.PrimitiveValue;
20+
import uk.co.real_logic.sbe.ir.Encoding;
21+
import uk.co.real_logic.sbe.ir.Token;
22+
23+
/**
24+
* Helpers for generating value validation code.
25+
*/
26+
public final class DtoValidationUtil
27+
{
28+
private DtoValidationUtil()
29+
{
30+
}
31+
32+
/**
33+
* What support the target language has for native integer types.
34+
*/
35+
public enum NativeIntegerSupport
36+
{
37+
/**
38+
* The target language supports both signed and unsigned integers natively and the generated code uses
39+
* these to represent the SBE types.
40+
*/
41+
SIGNED_AND_UNSIGNED,
42+
43+
/**
44+
* The target language only supports signed integers natively and the generated code uses the next biggest
45+
* signed integer type to represent the unsigned SBE types, except for UINT64 which is always represented
46+
* as a signed long.
47+
*/
48+
SIGNED_ONLY
49+
}
50+
51+
/**
52+
* Checks if the native type can represent values less than the valid range of the SBE type.
53+
*
54+
* @param fieldToken the field token to check if it is optional.
55+
* @param encoding the encoding of the field to check the applicable minimum and null values.
56+
* @param integerSupport the support for native integer types in the target language.
57+
* @return true if the native type can represent values less than the valid range of the SBE type,
58+
* false otherwise.
59+
*/
60+
public static boolean nativeTypeRepresentsValuesLessThanValidRange(
61+
final Token fieldToken,
62+
final Encoding encoding,
63+
final NativeIntegerSupport integerSupport)
64+
{
65+
final PrimitiveType primitiveType = encoding.primitiveType();
66+
final PrimitiveValue minValue = encoding.applicableMinValue();
67+
68+
switch (minValue.representation())
69+
{
70+
case LONG:
71+
final long nativeMinValue = nativeTypeMinValue(primitiveType, integerSupport);
72+
final PrimitiveValue nullValue = encoding.applicableNullValue();
73+
final boolean gapBefore = minValue.longValue() > nativeMinValue;
74+
final boolean nullFillsGap = fieldToken.isOptionalEncoding() &&
75+
nullValue.longValue() == nativeMinValue &&
76+
minValue.longValue() == nativeMinValue + 1L;
77+
return gapBefore && !nullFillsGap;
78+
79+
case DOUBLE:
80+
switch (primitiveType)
81+
{
82+
case FLOAT:
83+
return minValue.doubleValue() > -Float.MAX_VALUE;
84+
case DOUBLE:
85+
return minValue.doubleValue() > -Double.MAX_VALUE;
86+
default:
87+
throw new IllegalArgumentException(
88+
"Type did not have a double representation: " + primitiveType);
89+
}
90+
91+
default:
92+
throw new IllegalArgumentException(
93+
"Cannot understand the range of a type with representation: " + minValue.representation());
94+
}
95+
}
96+
97+
/**
98+
* Checks if the native type can represent values greater than the valid range of the SBE type.
99+
*
100+
* @param fieldToken the field token to check if it is optional.
101+
* @param encoding the encoding of the field to check the applicable maximum and null values.
102+
* @param integerSupport the support for native integer types in the target language.
103+
* @return true if the native type can represent values greater than the valid range of the SBE type,
104+
* false otherwise.
105+
*/
106+
public static boolean nativeTypeRepresentsValuesGreaterThanValidRange(
107+
final Token fieldToken,
108+
final Encoding encoding,
109+
final NativeIntegerSupport integerSupport)
110+
{
111+
final PrimitiveType primitiveType = encoding.primitiveType();
112+
final PrimitiveValue maxValue = encoding.applicableMaxValue();
113+
114+
switch (maxValue.representation())
115+
{
116+
case LONG:
117+
final long nativeMaxValue = nativeTypeMaxValue(primitiveType, integerSupport);
118+
final PrimitiveValue nullValue = encoding.applicableNullValue();
119+
final boolean gapAfter = maxValue.longValue() < nativeMaxValue;
120+
final boolean nullFillsGap = fieldToken.isOptionalEncoding() &&
121+
nullValue.longValue() == nativeMaxValue &&
122+
maxValue.longValue() + 1L == nativeMaxValue;
123+
return gapAfter && !nullFillsGap;
124+
125+
case DOUBLE:
126+
switch (primitiveType)
127+
{
128+
case FLOAT:
129+
return maxValue.doubleValue() < Float.MAX_VALUE;
130+
case DOUBLE:
131+
return maxValue.doubleValue() < Double.MAX_VALUE;
132+
default:
133+
throw new IllegalArgumentException(
134+
"Type did not have a double representation: " + primitiveType);
135+
}
136+
137+
default:
138+
throw new IllegalArgumentException(
139+
"Cannot understand the range of a type with representation: " + maxValue.representation());
140+
}
141+
}
142+
143+
private static long nativeTypeMinValue(
144+
final PrimitiveType primitiveType,
145+
final NativeIntegerSupport integerSupport)
146+
{
147+
switch (primitiveType)
148+
{
149+
case CHAR:
150+
return Character.MIN_VALUE;
151+
case INT8:
152+
return Byte.MIN_VALUE;
153+
case INT16:
154+
return Short.MIN_VALUE;
155+
case INT32:
156+
return Integer.MIN_VALUE;
157+
case INT64:
158+
return Long.MIN_VALUE;
159+
case UINT8:
160+
if (integerSupport == NativeIntegerSupport.SIGNED_ONLY)
161+
{
162+
return Short.MIN_VALUE;
163+
}
164+
return 0L;
165+
case UINT16:
166+
if (integerSupport == NativeIntegerSupport.SIGNED_ONLY)
167+
{
168+
return Integer.MIN_VALUE;
169+
}
170+
return 0L;
171+
case UINT32:
172+
if (integerSupport == NativeIntegerSupport.SIGNED_ONLY)
173+
{
174+
return Long.MIN_VALUE;
175+
}
176+
return 0L;
177+
case UINT64:
178+
return 0L;
179+
default:
180+
throw new IllegalArgumentException("Type did not have a long representation: " + primitiveType);
181+
}
182+
}
183+
184+
private static long nativeTypeMaxValue(
185+
final PrimitiveType primitiveType,
186+
final NativeIntegerSupport integerSupport)
187+
{
188+
switch (primitiveType)
189+
{
190+
case CHAR:
191+
return Character.MAX_VALUE;
192+
case INT8:
193+
return Byte.MAX_VALUE;
194+
case INT16:
195+
return Short.MAX_VALUE;
196+
case INT32:
197+
return Integer.MAX_VALUE;
198+
case INT64:
199+
return Long.MAX_VALUE;
200+
case UINT8:
201+
if (integerSupport == NativeIntegerSupport.SIGNED_ONLY)
202+
{
203+
return Short.MAX_VALUE;
204+
}
205+
return 0xFFL;
206+
case UINT16:
207+
if (integerSupport == NativeIntegerSupport.SIGNED_ONLY)
208+
{
209+
return Integer.MAX_VALUE;
210+
}
211+
return 0xFFFFL;
212+
case UINT32:
213+
if (integerSupport == NativeIntegerSupport.SIGNED_ONLY)
214+
{
215+
return Long.MAX_VALUE;
216+
}
217+
return 0xFFFFFFFFL;
218+
case UINT64:
219+
return 0xFFFFFFFFFFFFFFFFL;
220+
default:
221+
throw new IllegalArgumentException("Type did not have a long representation: " + primitiveType);
222+
}
223+
}
224+
}

sbe-tool/src/main/java/uk/co/real_logic/sbe/generation/cpp/CppDtoGenerator.java

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,9 @@
3535

3636
import static uk.co.real_logic.sbe.generation.Generators.toLowerFirstChar;
3737
import static uk.co.real_logic.sbe.generation.Generators.toUpperFirstChar;
38+
import static uk.co.real_logic.sbe.generation.common.DtoValidationUtil.NativeIntegerSupport.SIGNED_AND_UNSIGNED;
39+
import static uk.co.real_logic.sbe.generation.common.DtoValidationUtil.nativeTypeRepresentsValuesGreaterThanValidRange;
40+
import static uk.co.real_logic.sbe.generation.common.DtoValidationUtil.nativeTypeRepresentsValuesLessThanValidRange;
3841
import static uk.co.real_logic.sbe.generation.cpp.CppUtil.*;
3942
import static uk.co.real_logic.sbe.ir.GenerationUtil.collectFields;
4043
import static uk.co.real_logic.sbe.ir.GenerationUtil.collectGroups;
@@ -1591,8 +1594,11 @@ private static void generateSingleValuePropertyValidateMethod(
15911594

15921595
String value = "value";
15931596

1594-
final boolean mustPreventLesser = !encoding.applicableMinValue().equals(encoding.primitiveType().minValue());
1595-
final boolean mustPreventGreater = !encoding.applicableMaxValue().equals(encoding.primitiveType().maxValue());
1597+
final boolean mustPreventLesser =
1598+
nativeTypeRepresentsValuesLessThanValidRange(fieldToken, encoding, SIGNED_AND_UNSIGNED);
1599+
1600+
final boolean mustPreventGreater =
1601+
nativeTypeRepresentsValuesGreaterThanValidRange(fieldToken, encoding, SIGNED_AND_UNSIGNED);
15961602

15971603
if (fieldToken.isOptionalEncoding())
15981604
{

sbe-tool/src/main/java/uk/co/real_logic/sbe/generation/csharp/CSharpDtoGenerator.java

Lines changed: 9 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,9 @@
3535
import java.util.Set;
3636
import java.util.function.Predicate;
3737

38+
import static uk.co.real_logic.sbe.generation.common.DtoValidationUtil.NativeIntegerSupport.SIGNED_AND_UNSIGNED;
39+
import static uk.co.real_logic.sbe.generation.common.DtoValidationUtil.nativeTypeRepresentsValuesGreaterThanValidRange;
40+
import static uk.co.real_logic.sbe.generation.common.DtoValidationUtil.nativeTypeRepresentsValuesLessThanValidRange;
3841
import static uk.co.real_logic.sbe.generation.csharp.CSharpUtil.*;
3942
import static uk.co.real_logic.sbe.ir.GenerationUtil.collectFields;
4043
import static uk.co.real_logic.sbe.ir.GenerationUtil.collectGroups;
@@ -1285,7 +1288,9 @@ private void generateSingleValueProperty(
12851288
.append("}\n");
12861289
}
12871290

1288-
final boolean mustPreventLesser = !encoding.applicableMinValue().equals(encoding.primitiveType().minValue());
1291+
final boolean mustPreventLesser =
1292+
nativeTypeRepresentsValuesLessThanValidRange(fieldToken, typeToken.encoding(), SIGNED_AND_UNSIGNED);
1293+
12891294
if (mustPreventLesser)
12901295
{
12911296
sb.append(indent).append(INDENT)
@@ -1299,7 +1304,9 @@ private void generateSingleValueProperty(
12991304
.append("}\n");
13001305
}
13011306

1302-
final boolean mustPreventGreater = !encoding.applicableMaxValue().equals(encoding.primitiveType().maxValue());
1307+
final boolean mustPreventGreater =
1308+
nativeTypeRepresentsValuesGreaterThanValidRange(fieldToken, typeToken.encoding(), SIGNED_AND_UNSIGNED);
1309+
13031310
if (mustPreventGreater)
13041311
{
13051312
sb.append(indent).append(INDENT)

0 commit comments

Comments
 (0)