Invalid character in XML #3
Comments
The changeset is https://www.openstreetmap.org/changeset/36447235. I'm downloading the dump to check how it was handled too.
Whoops - the changeset is in the 160107 dump, but the discussion is still too new.
So, same problem with the dumps.
Fixed the issue with the planet dump in this commit, will do something similar for the changeset replication soon. While testing, I noticed that it's also bad from the API:
I can understand why these characters aren't allowed raw in XML, but it's really quite annoying that they're not even allowed escaped... if they were, then it's likely that libxml would simply escape them rather than just letting them pass.
Oh - not allowed even as an entity? I guess my XML is rusty.
Sadly, yes:
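One way to check this independently is a minimal sketch using Python's standard-library parser (expat-based, so this is not the libxml behaviour discussed in the thread). The helper name `parses` and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parses(doc: str) -> bool:
    """Return True if the parser accepts doc as well-formed XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# U+0001 as a raw character, as a character reference, and inside CDATA.
raw_ok = parses("<tag v='a\x01b'/>")
entity_ok = parses("<tag v='a&#x01;b'/>")
cdata_ok = parses("<t><![CDATA[a\x01b]]></t>")

# A tab (U+0009) is an allowed character, so its reference is accepted.
tab_ok = parses("<tag v='a&#x09;b'/>")

print(raw_ok, entity_ok, cdata_ok, tab_ok)
```

With this parser, all three U+0001 variants should be rejected while the tab reference is accepted.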
As far as I can tell from the character set section of the spec it's simply impossible to represent these characters in XML, even with CDATA:
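For reference, the XML 1.0 `Char` production allows only #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD and #x10000-#x10FFFF. Below is a rough sketch of filtering text against that set before serialising it. It is not the fix from the commit mentioned above, and `strip_invalid_xml_chars` is a hypothetical helper name.

```python
import re

# Matches anything outside the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str, replacement: str = "") -> str:
    """Replace every character that XML 1.0 cannot represent."""
    return _INVALID_XML_CHARS.sub(replacement, text)


cleaned = strip_invalid_xml_chars("bad\x01comment")
marked = strip_invalid_xml_chars("bad\x01comment", replacement="\ufffd")
```

Whether to drop such characters silently or substitute U+FFFD is a design choice. Substituting keeps a visible trace that the original text was altered.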
via irc: xmllint doesn't support XML 1.1, and I bet support elsewhere is spotty. Can the 001/659/607 replication file be manually fixed? I opened openstreetmap/openstreetmap-website#1135 for the API.
http://planet.osm.org/replication/changesets/001/659/607.osm.gz contains an invalid character value
Viewing it with less shows that line 11 contains a ^A, which is highlighted as a control code. The relevant part of the hexdump confirms there's a raw 0x01 byte in the document rather than an entity.
Cross-ref ToeBee/ChangesetMD#20
There is also probably a bug somewhere if this character got into the database
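To locate raw control bytes like the one described above without paging through the file by hand, something along these lines might work. It is a sketch only: `find_control_bytes` and `scan_gzip` are hypothetical names, and the file is assumed to have been downloaded locally first.

```python
import gzip

# Control bytes below 0x20 that XML 1.0 does allow: tab, LF, CR.
_ALLOWED_CONTROL = {0x09, 0x0A, 0x0D}


def find_control_bytes(lines):
    """Yield (line_number, column, byte) for each disallowed control byte.

    `lines` is any iterable of bytes objects, one per line, such as a
    file opened in binary mode. Line numbers and columns are 1-based.
    """
    for line_number, line in enumerate(lines, start=1):
        for column, byte in enumerate(line, start=1):
            if byte < 0x20 and byte not in _ALLOWED_CONTROL:
                yield line_number, column, byte


def scan_gzip(path):
    """Scan a gzip-compressed file and return all hits as a list."""
    with gzip.open(path, "rb") as stream:
        return list(find_control_bytes(stream))


# Example with in-memory lines instead of a real replication file.
sample = [b"<osm>\n", b"<tag v='a\x01b'/>\n", b"</osm>\n"]
hits = list(find_control_bytes(sample))
```

Columns here count bytes rather than characters, so they can differ from what a text editor shows on lines containing multi-byte UTF-8 sequences.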