-
Notifications
You must be signed in to change notification settings - Fork 0
feat(rust/cbork): deterministic map decoding helper #360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Adds validation for minimal length encoding of string types (Str and Bytes) in the DeterministicDecoder according to RFC 8949 Section 4.2. This ensures that string lengths are encoded using the minimal number of bytes required. For example, strings of length 0-23 must use direct encoding, length 24-255 must use one byte, etc. The changes: - Add length validation for Type::Str and Type::Bytes - Check for indefinite length strings - Validate minimal length encoding using check_minimal_length function
Adds comprehensive test coverage for RFC 8949 Section 4.2 deterministic encoding requirements. The new tests verify: - Minimal length integer encoding rules for values 0-23, 24-255, etc. - Floating point value requirements including shortest form and non-finite prohibition - String/array/map length encoding rules and indefinite length checks - Map key ordering rules with length-first canonical ordering Each test includes detailed comments explaining: - The specific RFC requirement being tested - Byte-level breakdown of CBOR encodings - Why each test case is valid or invalid - References to relevant RFC sections This ensures proper validation of all deterministic encoding rules and helps maintainers understand the requirements.
Add detailed test cases for deterministic CBOR encoding rules as specified in RFC 8949 section 4.2. The new tests cover: - Integer boundary conditions and minimal encoding requirements - Negative integer encoding across different ranges - Map key ordering (length-first, then lexicographic) - Floating point encoding with different precision requirements - String comparison ordering including UTF-8 handling - Nested structure validation - Array length encoding rules - Duplicate map key detection The tests are extensively documented with RFC requirements and include TODOs for future validation improvements, particularly for floating point handling where additional checks for non-finite values and minimal encoding could be added. Includes commented-out test cases that can be enabled once support for validating non-finite floating point values is implemented. RFC: https://datatracker.ietf.org/doc/html/rfc8949#section-4.2
Refactor test cases to fix clippy warnings: - Use simpler iterator chaining in array length test - Remove redundant calls - Replace explicit type annotations with inferred types - Fix collect() with redundant map operations Also simplify floating point test cases to match current implementation and improve RFC 8949 compliance documentation. The floating point tests now focus on valid encodings while keeping commented-out future test cases for non-finite values validation. Tests still verify the same RFC requirements but with more idiomatic Rust code.
Improve documentation and refactor validate_next() to align with RFC 8949 § 4.2 specification for deterministically encoded CBOR. Split validation logic into smaller, focused functions for better maintainability. - Split validate_next into specialized validation functions: * validate_integer() - Handles minimal integer encoding * validate_array() - Validates definite-length arrays * validate_string() - Checks string/bytes encoding * validate_map() - Ensures proper key ordering - Add comprehensive documentation referencing RFC 8949: * Detail core deterministic encoding requirements * Document rules for integer minimality * Explain length field constraints * Specify map key ordering rules * Include examples of valid/invalid encodings This refactoring improves code organization while maintaining full compliance with the CBOR deterministic encoding specification. The enhanced documentation helps developers understand both implementation details and RFC requirements.
Improve documentation and refactor validate_next() to align with RFC 8949 § 4.2 specification for deterministically encoded CBOR. Split validation logic into smaller, focused functions for better maintainability. - Split validate_next into specialized validation functions: * validate_integer() - Handles minimal integer encoding * validate_array() - Validates definite-length arrays * validate_string() - Checks string/bytes encoding * validate_map() - Ensures proper key ordering - Add comprehensive documentation referencing RFC 8949: * Detail core deterministic encoding requirements * Document rules for integer minimality * Explain length field constraints * Specify map key ordering rules * Include examples of valid/invalid encodings This refactoring improves code organization while maintaining full compliance with the CBOR deterministic encoding specification. The enhanced documentation helps developers understand both implementation details and RFC requirements.
✅ Test Report | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I've left some comments mostly on the error types.
On that topic, I am a bit hesitant about introducing DetermenisticError
. Some of its variants have parallels in minicbor::decode::Error
. For example, DetermenisticError::UnexpectedEof
is the same as Error::end_of_input()
from minicbor. It gets a bit confusing, considering the later is also wrapped under one of the variants.
If I were to implement this, I'd rather wrap the other way around: our local error within minicbor's Error::custom
. Our error would then be extractable via std::error::Error::source
, and we'd get the benefits of integration inside of minicbor::Decode
implementations seamlessly and Error::at
.
d.skip()?; | ||
let value_end = d.position(); | ||
|
||
// Extract the raw bytes for both key and value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the Decoder::skip
skips over a CBOR value, which can't be represented by less than 1 byte, wouldn't it be safe to assume the ranges extracted below to be ok by definition?
Or are we testing the Decoder::skip
implementation itself here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressing this now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@no30bit error handling has been addressed along with the cbor header helper which can be used for various types.
The code satisfies RFC 8949's deterministic CBOR map requirements by ensuring:
Key Ordering
Key Uniqueness
Encoding Rules
All requirements are properly validated and tested
example usage:
p.s will do separate PR for abstracted cbor decoder with optional deterministic decoding w logging.