Require UUIDs for feature IDs #207

Open
j-d-b opened this issue Sep 17, 2021 · 7 comments · Fixed by #278

@j-d-b
Collaborator

j-d-b commented Sep 17, 2021

A Universally Unique IDentifier (UUID), also known as a GUID and defined in RFC 4122, is a 128-bit ID that can guarantee uniqueness across space and time.

Here's an example of a UUID: 6ba7b810-9dad-11d1-80b4-00c04fd430c8

Currently, the WZDx RoadEventFeature id is just a string. Road event IDs from different producers are likely to collide, which makes it difficult to aggregate feeds and to resolve objects that reference a road event ID (such as a Relationship, e.g. connecting detours to work zones) to the road event they refer to. Within a single data source there is no issue, but across feeds from multiple producers the road event ID alone is not sufficient to reference a road event. It would be much easier if it were, and a UUID would enable that.
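To make the collision problem concrete, here is a minimal sketch of what happens when two feeds are indexed by bare ID. The feeds, the `producer` field, and the `merge_by_id` helper are all hypothetical and are not part of WZDx.

```python
# Hypothetical, heavily simplified feeds from two producers (illustration only).
feed_a = [{"id": "1", "producer": "A"}, {"id": "2", "producer": "A"}]
feed_b = [{"id": "1", "producer": "B"}]


def merge_by_id(*feeds):
    """Index features by bare ID; return the index and the set of IDs that collided."""
    merged = {}
    collisions = set()
    for feed in feeds:
        for feature in feed:
            if feature["id"] in merged:
                collisions.add(feature["id"])
            # A later feature with the same ID silently replaces the earlier one.
            merged[feature["id"]] = feature
    return merged, collisions


merged, collisions = merge_by_id(feed_a, feed_b)
```

Here the ID "1" is reported as a collision and producer A's feature is overwritten, so the aggregator would need a composite key such as (data source, ID) to stay correct. With UUIDs the bare ID would be enough.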

From my experience working with MassDOT and Ver-Mac, this change would not be burdensome as the content of a RoadEventFeature ID does not have meaning and by its description in the WZDx specification is not intended to. Thus the requirement of a UUID would just require a change in how IDs are generated. There are many libraries for generating UUIDs, so this is trivial.
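As a rough illustration of how small the change could be, Python's standard `uuid` module generates a random (version 4) UUID in one call. The `new_feature_id` helper name is made up for this sketch.

```python
import uuid


def new_feature_id() -> str:
    """Return a random (version 4) UUID in canonical 8-4-4-4-12 string form."""
    return str(uuid.uuid4())


feature_id = new_feature_id()
```

A random UUID is only stable if the producer stores it alongside the internal record; it cannot be re-derived later.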

Currently, the only feature in WZDx with an identifier is the RoadEventFeature. However, active PRs seek to add a FieldDeviceFeature (see #195, #208), and there has been discussion about a RoadFeature. I think all feature IDs should be UUIDs which would greatly facilitate referencing features without the context of the original feed or data source they came from, historical reporting and data warehousing, and relating features.

@DeraldDudley
Collaborator

Brilliant!

@j-d-b j-d-b added the Data-content and Simple labels Sep 17, 2021
@j-d-b j-d-b self-assigned this Jan 6, 2022
@j-d-b j-d-b added this to the v4.1 milestone Jan 25, 2022
@j-d-b
Collaborator Author

j-d-b commented Jan 25, 2022

For the v4.1 release, this can be implemented as a recommendation through the description of the identifier properties and shown in all examples. Specifically, the descriptions of the following properties would recommend using a UUID:

| Property | Object |
| --- | --- |
| id | RoadEventFeature |
| id | FieldDeviceFeature |
| data_source_id | FeedDataSource |

The use of a UUID could eventually be required via a future major WZDx release. Having it recommended would be a good stepping stone towards this goal.

@j-d-b
Collaborator Author

j-d-b commented Feb 24, 2022

Based on discussion in the 2022-02-23 spec update subgroup meeting, members are on board with the transition to UUID. However, I gathered it would be helpful to add a name property to the RoadEventCoreDetails for providing a human-readable "name" for the road event, which is what some current WZDx users were using the ID for. The FieldDeviceCoreDetails already has the name property, defined as:

A human-readable name for the field device.

With the addition of name there is no lost functionality when ID is required to be a UUID.
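A minimal sketch of how the two properties could sit side by side: the UUID carries identity and name carries the human-readable label. The object below is heavily abbreviated and is not a valid WZDx feature; the `properties.core_details` nesting, the `make_road_event` helper, and the sample name are assumptions made for illustration.

```python
import json
import uuid


def make_road_event(name: str) -> dict:
    """Build an abbreviated, illustrative road event feature (not schema-valid)."""
    return {
        "id": str(uuid.uuid4()),  # opaque identifier with no embedded meaning
        "type": "Feature",
        "properties": {
            # Assumed location of the RoadEventCoreDetails object.
            "core_details": {
                "name": name,  # human-readable label, previously squeezed into the ID
            }
        },
    }


event = make_road_event("I-90 EB bridge repair")  # hypothetical name
print(json.dumps(event, indent=2))
```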

@j-d-b
Collaborator Author

j-d-b commented Aug 11, 2022

UUIDs are recommended in v4.1 from #278.

This issue will stay open for requiring UUIDs in v5.0.
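Until the requirement lands, a consumer or producer could check IDs on their own. The sketch below is one possible strictness level, not the check WZDx itself performs: the hypothetical `is_uuid` helper accepts only the canonical hyphenated form and ignores case.

```python
import uuid


def is_uuid(value) -> bool:
    """Return True if value is a UUID string in canonical 8-4-4-4-12 form."""
    try:
        # uuid.UUID also accepts braces, "urn:uuid:" prefixes and unhyphenated hex,
        # so compare against the canonical rendering to reject those variants.
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False
```

For instance, `is_uuid("6ba7b810-9dad-11d1-80b4-00c04fd430c8")` is true, while an ID such as `"OpenTMS-Event2702170538"` is rejected.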

@j-d-b j-d-b modified the milestones: v4.1, v5.0 Aug 11, 2022
@jacob6838

At CDOT, we are currently generating WZDx messages by translating a different message to WZDx. That higher-level message only has a string identifier, which is unique among all CDOT messages (of the form OpenTMS-Event2702170538, OpenTMS-Event2843552682, ...). I am now working on updating this translator to 4.1 and need to determine how to handle the identifier, preferably as a UUID. I believe a goal should be to keep the same identifier for each consecutive message update, meaning that this UUID should persist across updates. Currently, the only simple way to do that (for us) would be to use seeded UUIDs, seeded from that CDOT identifier (OpenTMS-Event2702170538). This solves our problem, but may not be a good long-term strategy, as seeding removes the "guarantee" of uniqueness of the UUID. I am fairly confident that these identifiers will be globally unique, at least in the short term. Is seeding UUIDs with our own unique IDs a valid solution? We have some other options, but this is the simplest by far.

@AdamICone

@jacob6838, I think there are two key points you bring up:

  1. Yes, IDs (UUID or otherwise) should be maintained for the same feature (both "RoadEvent" and "Device") for as long as it's the same functional feature. Which attributes define a functional feature will vary based on specifics, but if your internal system would continue to use the same ID (i.e. the feature-defining attributes haven't changed), that feature in WZDx should also keep using the same ID. (Note: the internal system ID doesn't have to be the same as the WZDx ID.)
  2. There are 4 practical versions of UUIDs (ignoring version 2), and the pseudo-random one is only one of them. As long as the UUID is generated correctly, there's no functional difference between the different versions: a UUID is a UUID.

Specifically for this case, I would use a Version 5 UUID. See RFC 4122, section 4.3 for more details about name-seeded UUIDs (https://www.rfc-editor.org/rfc/rfc4122#section-4.3). In short, the process is to convert a SHA1 hash of a string to a UUID: genUUID_v5(SHA1(namespaceUUID + name)). There are a number of online generators, e.g. https://www.uuidtools.com/v5. So, I would start by generating a UUID and saving it as the organization/namespace UUID (it really doesn't matter how this is generated, just store it so you always use the same one). Then, you can append your internally unique name (i.e. "OpenTMS-Event2702170538") to generate the globally unique UUID for the feature, which you can always regenerate using the same organization/namespace UUID and the same internally unique name.

Also see the following stackoverflow question – the top answer includes some good pseudo-code for an implementation: https://stackoverflow.com/questions/10867405/generating-v5-uuid-what-is-name-and-namespace
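The approach described above can be sketched with Python's standard `uuid.uuid5`, which hashes the namespace UUID's bytes followed by the name with SHA-1 and sets the version bits to 5. The `ORG_NAMESPACE` value below is a placeholder and `feature_uuid` is a hypothetical helper; a real producer would generate its own namespace UUID once and store it.

```python
import uuid

# Placeholder organization namespace for illustration only. Generate your own once
# (e.g. with uuid.uuid4()), store it, and never change it afterwards.
ORG_NAMESPACE = uuid.UUID("0f0e0d0c-0b0a-4908-8706-050403020100")


def feature_uuid(internal_id: str) -> str:
    """Derive a stable version 5 UUID from an internally unique identifier."""
    return str(uuid.uuid5(ORG_NAMESPACE, internal_id))


first = feature_uuid("OpenTMS-Event2702170538")
again = feature_uuid("OpenTMS-Event2702170538")  # same input, same UUID
other = feature_uuid("OpenTMS-Event2843552682")  # different input, different UUID
```

Because the result depends only on the namespace and the name, the translator does not need to persist a mapping table: the same internal ID always yields the same feature ID across message updates.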

@jacob6838

@AdamICone That makes perfect sense, UUID version 5 supports this use case perfectly. Thanks!

@j-d-b j-d-b removed the Simple label Oct 20, 2022
@j-d-b j-d-b changed the title Use UUIDs for feature (RoadEventFeature) IDs Require UUIDs for feature (RoadEventFeature) IDs Oct 20, 2022
@j-d-b j-d-b changed the title Require UUIDs for feature (RoadEventFeature) IDs Require UUIDs for feature IDs Oct 20, 2022