Trimmed and final(?) version #8

Closed
wants to merge 2 commits into from

iherman
Member

@iherman iherman commented Apr 18, 2020

I have made the changes as discussed on 2020-04-17.

There are some things we have not discussed that we may want to take care of in this PR:

  • I have kept the extra terms of the sort license, comment, label, seeAlso, etc. I think they are useful, mainly the comment that can then be used as, well, a comment in a JSON-LD file.
  • I have looked at the spec for other aliases that we may want to include as a matter of consistency. If we have @none, we may want to have @list, @set, @nest, etc. (actually, we could include almost all keywords), which is, imho, overkill. My approach would be to alias only terms that are typically used in the data, not terms mostly used within context files. This would mean:
    • add @graph and @included
    • remove @json, @none
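
To make the aliasing proposal concrete, here is a minimal Python sketch of a context fragment that aliases only data-facing keywords, plus a toy helper that renames aliased keys back to their `@` forms. The alias list follows the proposal above; the helper name `unalias_keys` and the sample document are assumptions for illustration. The helper only renames keys and is not the JSON-LD expansion algorithm.

```python
# Hypothetical context fragment: alias keywords that typically appear in data.
ALIAS_CONTEXT = {
    "id": "@id",
    "type": "@type",
    "graph": "@graph",
    "included": "@included",
}


def unalias_keys(node, context=ALIAS_CONTEXT):
    """Recursively rename aliased keys to JSON-LD keywords (keys only, toy)."""
    if isinstance(node, list):
        return [unalias_keys(item, context) for item in node]
    if isinstance(node, dict):
        return {
            context.get(key, key): unalias_keys(value, context)
            for key, value in node.items()
        }
    return node


# Illustrative document using the aliases instead of the @-keywords.
doc = {"id": "http://example.org/a", "graph": [{"type": "schema:Book"}]}
print(unalias_keys(doc))
```

Keys that are not in the alias map (for example a plain `name` property) pass through unchanged, which is why the choice of which keywords to alias matters for existing JSON.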

Fix #1
Fix #5
Fix #6

@iherman
Member Author

iherman commented Apr 18, 2020

Actually... for historical reasons the mapping of 'license' is to http://www.w3.org/1999/xhtml/vocab#license. I wonder whether, these days, http://schema.org/license is not more appropriate.

@iherman
Member Author

iherman commented Apr 18, 2020

Another minor thing that I forgot to add above: these days http://schema.org is automatically redirected to https://schema.org. Should we use https instead of http?

@BigBlueHat
Member

@iherman here's the guidance from Schema.org https://schema.org/docs/faq.html#19

So...sure. 😃 Would've been nice if the world had come at securing HTTP a different way than changing protocols... https://www.w3.org/DesignIssues/Security-NotTheS.html (we needed the security, not the extra letter).

But that 🚢 has definitely sailed. 😉

@BigBlueHat
Member

Generally I'm beginning to feel like all the aliasing (which goes well beyond just JSON-LD core terms) is creating a sub-language/style of JSON-LD which is increasingly idiosyncratic. It goes well beyond the basic prefixing that the RDFa initial context does--which only includes the aliasing of 3 "bare" terms: describedby, license, and role.

This feels very much like its own animal now... 🐱

@iherman
Member Author

iherman commented Apr 19, 2020

@iherman here's the guidance from Schema.org https://schema.org/docs/faq.html#19

That FAQ item says "This is a lengthy way of saying that both 'https://schema.org' and 'http://schema.org' are fine". I am not sure it gives us a clear answer to my original question. Maybe the answer is "if it ain't broken, don't fix it", i.e., leave it as http. WDYT?

@BigBlueHat
Member

I'd think at this point we should encourage the use of https://schema.org for safer context dereferencing. However...that does rename stuff and could cause issues with equivalency in people's datasets... Maybe we can sameAs these universally somewhere? 😜

Verifiable Credentials uses https://schema.org/ fwiw: https://www.w3.org/TR/vc-data-model/#example-41-a-credential-uniquely-identifying-a-subject
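
The renaming concern can be sketched in a few lines of Python: the same compact IRI expands to two different strings depending on which scheme the `schema` prefix maps to, and IRI comparison is plain string comparison. The helper `expand_curie` is a hypothetical simplification of prefix expansion, not a JSON-LD processor.

```python
def expand_curie(curie, prefixes):
    """Expand 'prefix:suffix' using a prefix map; return the input if unknown."""
    prefix, sep, suffix = curie.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + suffix
    return curie


# The two candidate mappings discussed in this thread.
HTTP_PREFIXES = {"schema": "http://schema.org/"}
HTTPS_PREFIXES = {"schema": "https://schema.org/"}

http_iri = expand_curie("schema:license", HTTP_PREFIXES)
https_iri = expand_curie("schema:license", HTTPS_PREFIXES)

# The two expansions are different identifiers as far as RDF is concerned.
print(http_iri, https_iri, http_iri == https_iri)
```

A dataset written against one mapping would therefore not match queries or data written against the other unless something explicitly equates the two.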

@iherman
Member Author

iherman commented Apr 20, 2020

Verifiable Credentials uses https://schema.org/ fwiw: https://www.w3.org/TR/vc-data-model/#example-41-a-credential-uniquely-identifying-a-subject

Yep, and so does the publication manifest.

However... if we switch to https, then this should probably be done in the JSON-LD specs as well, for consistency. (And probably in the test files...)

@azaroth42
Contributor

azaroth42 commented Apr 20, 2020

Strongest of 👎s to https. All of the examples in the schema.org site use in the context document a @vocab of "http://schema.org/" (e.g. https://schema.org/Action), the faq entry notwithstanding about their intent (but no action?) to move to https. Thus in their markup examples, the predicates are all http, not https. [Edit to reflect vocab, not context location]

It's not at all helpful to swim against the current here.

@azaroth42
Contributor

Agree with inclusion of graph and included. But I disagree with the removal of none and json. All of these are used in compacted documents, which is the main utility of having them aliased.

For example: https://iiif.io/api/presentation/3.0/#44-language-of-property-values has none aliased to @none. Having type: json, value: "{...}" is similarly much more readable.

Conversely, @list, @set, @nest are not used in compacted documents and thus don't need to be aliased in a context.
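
A sketch of how these aliases read in a compacted document, assuming a context that maps `none` to `@none`, `type` to `@type`, and `json` to `@json`. The property names (`label`, `settings`) and the helper `pick_label` are hypothetical, and `@value` is left unaliased here. The helper is ordinary client-side Python, not a JSON-LD processor.

```python
import json

# Hypothetical compacted document using the `none` and `json` aliases.
compacted = {
    "label": {"en": ["Front cover"], "none": ["f. 1r"]},
    "settings": {"type": "json", "@value": {"zoom": 2}},
}


def pick_label(language_map, preferred):
    """Return values for the preferred language, else those under 'none'."""
    return language_map.get(preferred) or language_map.get("none", [])


print(pick_label(compacted["label"], "en"))
print(pick_label(compacted["label"], "fr"))
print(json.dumps(compacted["settings"]))
```

With the aliases in place a consumer can treat the document as ordinary JSON, which is the readability argument made above.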

Contributor

@azaroth42 azaroth42 left a comment

Agree that it should have included and graph aliases.
schema should stay http:
should not remove none and json

@iherman
Member Author

iherman commented Apr 20, 2020

If we keep json (which is fine with me) do we create a possible problem by having both json (aliased to @json) and JSON (aliased to rdf:JSON)?

Just wondering...

@azaroth42
Contributor

I wondered about that too. Technically no, as json and JSON are different values ... but they have the same usage. I would probably remove JSON as our examples all use the @json form rather than the CURIE form, but no strong feelings either way.

@iherman
Member Author

iherman commented Apr 23, 2020

I have made some tiny changes on where I feel we have consensus; maybe we can get all this finalized on our meeting tomorrow. Here is where we are now:

  1. the set of prefixes are dc11, dcterms, dctype, rdf, rdfs, schema, and xsd, with schema mapping on http.
  2. the set of aliases are direction, graph, id, included, json, language, none, and type.
  3. the two specific types are HTML and JSON, mapping on rdf:HTML and rdf:JSON, respectively.
  4. the set of specific, pre-defined terms are: comment, label, license, isDefinedBy, seeAlso, with license mapping on schema:license.

Here is my assessment on the four items, based on the discussions so far:

  • I believe we have a consensus on (1)
  • I believe we have a consensus on (2) (with @BigBlueHat and myself grudgingly accepting the fact of having aliases:-)
  • The usage of JSON (as opposed to json) may be problematic, and @azaroth42 (in agreement with @gkellogg) proposed to remove it altogether. I am fine with that and, furthermore, I would actually propose to remove HTML, too; I would think this term would be very rarely used in JSON-LD anyway, and the rdf prefix makes it easy to write as a CURIE.
  • I believe having license is very useful for obvious reasons; label and comment are important, due to the missing annotation facilities in JSON-LD. isDefinedBy and seeAlso may also be useful for the same reasons as label and comment, but I am actually less convinced. But I do not think it creates any harm keeping them.
  • @BigBlueHat referred to describedby and role. The former could be seen as an alternative to isDefinedBy but less formal (which may be a good thing), i.e., it may be a good replacement or addition. role, I believe, was added as a target for the mapping of the role attribute in HTML; I am not sure we should have it here.

@azaroth42 azaroth42 self-requested a review April 23, 2020 18:18
@BigBlueHat
Member

I'm still not OK with the aliases...and would still like to get a vote had on whether we pull those out as a separate context file.

Can we clarify who this "recommended context" is for?

It can't be for existing JSON APIs--as those already have existing semantics with which these will conflict. It isn't really helpful for context authors either, unless they really do want all those things and are confident they won't have to rename them. And (which is probably the worst), if they do rename them, then there was really no point in our encouraging a shared set of aliases to begin with--because you can't depend on them "ecosystem-wide" because folks will (likely) rename them. 😢

Here's some "found JSON" (from someone else's API I'm working with):

{
  "data": [
    {
      "id": "...uuid1...",
      "type": [
        {
          "id": "...uuid2...",
          "value": "Application"
        }
      ]
    }
  ]
}

Adding the recommended context immediately throws this error:

jsonld.SyntaxError: Invalid JSON-LD syntax; "@type" value must a string, an array of strings, an empty object, or a default object.

And there's no way around that without reshaping the data (afaict).
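
The failure can be reproduced in miniature. Once `type` is aliased to `@type`, the value under `type` has to look like a type value, and an array of objects does not. The check below is a simplified sketch: the function name `is_valid_type_value` is an assumption, and it deliberately omits the "empty object" and "default object" cases that the error message also allows.

```python
def is_valid_type_value(value):
    """Simplified check: accept a string or a list of strings only."""
    if isinstance(value, str):
        return True
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


# One entry from the found JSON above; `type` holds a list of objects.
found = {
    "id": "...uuid1...",
    "type": [{"id": "...uuid2...", "value": "Application"}],
}

print(is_valid_type_value(found["type"]))
print(is_valid_type_value(["Application"]))
```

The second call shows the shape the data would have to be changed to, which is the reshaping mentioned above.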

Additionally, none of those id values are currently URLs nor do they share the same base URL, so using @base would be wrong--or at least alter the shape of the API.

Additionally, if an @base were added as a second context, all the type values (if they were changed to work) would now be in that namespace...so the developer (who's quickly becoming a context file author) would need to add @vocab to separate them.

In the end, the developer will need to learn JSON-LD itself--presumably from examples and/or the Syntax doc. They will have to write some amount of context file or object, and when they attempt to do that (and learn from examples) they won't see any of these aliases. If those aliases conflict with their intended use (as http://schema.org/language conflicts with @language currently), they'll have to make an informed decision about which to pick and understand those ramifications. But if they've just grabbed the recommended context off the shelf...they'll have a much harder time learning and making the necessary changes to undo the recommendations. Which ultimately means the recommended context becomes a cost to them, not a value add.

So...if there's someone else this is for, we should define that, so we can hit that target. 😃

Regardless, I'd like to vote on at least moving the aliases into their own context file.

(I'm also happy...but confused...that we didn't also alias value to @value...but let's not do that regardless). 😉

@azaroth42
Contributor

It can't be for existing JSON APIs--as those already have existing semantics with which these will conflict. It isn't really helpful for context authors either, unless they really do want all those things and are confident they won't have to rename them.

As a context author, I disagree here. This is very useful for several reasons:

  1. It prevents me having to define all these things by hand in my contexts.
  2. It sets expectations in the community when they see the use of the base context that the current one has been thought through and designed with the environment in mind, not just automatically or haphazardly thrown together.
  3. It demonstrates that as a context author, I have taken concrete action to understand and follow best practice guidelines of the W3C.
  4. If there are new entries in the base context, which we trust the W3C process to maintain in a backwards-compatible fashion, then they will become available to users of my context without any action taken by me. I can delegate the need to keep up to date with foundational ontologies and patterns up the chain.

They will have to write some amount of context file or object, and when they attempt to do that (and learn from examples) they won't see any of these aliases.

Yes. If you're writing a context and include the base context, then you hopefully have looked at what that context defines!

I don't believe that this context is useful for people who don't understand what they're doing, but I don't believe that any context would be. Nor would people who don't understand what contexts are even know or want to look for a base context... the set of people who would successfully add and use a context and not understand what it is is an empty set.

Thus we are providing a set of common, best practice definitions as a machine-usable document rather than an HTML file and requiring those aliases to be cut and pasted everywhere. Given context caching, the improvement for doing this is actually non-zero in a distributed system, as demonstrated by Gregg's recent work in PyLD where context caching generally made a 50x speed improvement!

@BigBlueHat
Member

The core ask--which I think we should narrow this down to--is pulling the top-level names into their own aliases context file.

JSON-LD uses the @ prefix for its stuff on purpose--to avoid conflicts with existing JSON and to make it easy to mix in as needed/wanted.

In communities of practice (like IIIF or similar) aliasing things is totally fine--and indeed, probably recommended! But it would be naive to think that we can (or should!) determine those "meanings" for the world at large. And...again...this is why JSON-LD uses the @ prefix in the first place...

So, the practical ask again is to vote on pulling aliases into their own context file--for use by whoever wants them. No more, no less. 😄

@iherman
Member Author

iherman commented May 1, 2020

This issue was discussed in a meeting.

  • RESOLVED: the set of prefixes are dc11, dcterms, dctype, rdf, rdfs, schema, and xsd, with schema mapping on http
  • RESOLVED: Remove the datatypes HTML and JSON from the context
  • RESOLVED: JSON-LD Maintenance Group to take over the RDFa Initial Context work and maintain it and its context file via GitHub
  • ACTION: move rdfa initial context (Ivan Herman)
Transcript: Base Context Discussion
Rob Sanderson: should we talk about the base context if there’s no other items?
Benjamin Young: #8
Ivan Herman: can we try to focus on the 4 categories?
… so we can talk through what we should or should not do around those?
Rob Sanderson: #8 (comment)
Ivan Herman: can we take those one by one?
Proposed resolution: the set of prefixes are dc11, dcterms, dctype, rdf, rdfs, schema, and xsd, with schema mapping on http. (Rob Sanderson)
Rob Sanderson: +1
Gregg Kellogg: +1
Ivan Herman: +1
Harold Solbrig: what does “schema mapping on http” mean?
Rob Sanderson: it means not “https://”
… but the schema.org community uses http:// so we should too
… otherwise we differ from them…though they do say others can
Gregg Kellogg: yeah. they essentially have their own processing that differs from RDF’s
Harold Solbrig: are the aliases and prefixes in the same context?
… was SKOS discussed?
Harold Solbrig: +1
Resolution #2: the set of prefixes are dc11, dcterms, dctype, rdf, rdfs, schema, and xsd, with schema mapping on http
Ivan Herman: it’s not that widely used, so we limited the list
Harold Solbrig: SKOS is widely used
Rob Sanderson: we use it
Benjamin Young: so does Wiley
Proposed resolution: the set of aliases are direction, graph, id, included, json, language, none, and type. (Rob Sanderson)
Rob Sanderson: +1
Rob Sanderson: but, whatever, we took it out
Benjamin Young: -1
Gregg Kellogg: I’m sympathetic with bigbluehat’s concerns about the aliases
… we did use @ on purpose
… and not using it does conflict with existing JSON
… and it doesn’t make JSON-LD obvious along side other JSON
… I’m not a +1 or -1…probably a +/-0
Rob Sanderson: we agree about the prefixes
… the prefixes, terms, and datatypes we’re all ok with, correct?
Gregg Kellogg: I don’t think there’s any place we use “json” as a data type in JSON-LD
… so we’re just left with “html”
… and is it really worth creating a separate data type for that?
… and if so, why not add XML?
Rob Sanderson: so, I think we’re all OK removing data types
Proposed resolution: Remove the datatypes HTML and JSON from the context (Rob Sanderson)
Rob Sanderson: +1
Ivan Herman: +1
Gregg Kellogg: +1
Harold Solbrig: +1
Harold Solbrig: can I pick and choose these? or is this a single file?
Benjamin Young: +1
Resolution #3: Remove the datatypes HTML and JSON from the context
Rob Sanderson: we agree about 1 and 3, but disagree on 2 and 4
… one suggestion was to pull aliases into a separate context file
… so, aliases.context and prefix.context and aggregate.context which covers both
Rob Sanderson: top-context includes prefix-context, alias-context
Gregg Kellogg: but then just define your own context
Harold Solbrig: except when it comes to prefixes, if we don’t come up with a handy collection–even pared down as much as this one is
… then each community is going to come up with their own
… and I think as far as best practices go, pulling the prefixes out more clearly in the JSON-LD context file definitions themselves would be helpful
… and most of these aliases conflict with existing terms, so we wouldn’t use this as is
Ivan Herman: I’m fine if we do a context file for prefixes
… and another for aliases–some people like them; I don’t, but whatever
… I would not do a third one, though
… we haven’t yet talked about comment, label, etc.
… but separate the aliases
… having the common terms–maybe less than what’s there–and the prefixes should be fine
Benjamin Young: Where we’re headed now, taking out the top-level context, is good. Most of my concern is that taking the top level will run afoul of communities
… that they should pick these, and there’s conflicts with other widely used ones, such as language in schema.org
… in dedicated communities, make all the aliases you want, and if there are shared patterns, then share them in the community
… but if we do it, then we’re defining them for /everyone/
… thus a few prefixes, and maybe a few terms
… would prefer a longer list of prefixes – the w3c specs and surrounding communities
Rob Sanderson: +1
Rob Sanderson: to answer ivan about pre-defined terms
… there’s an issue, not about the mapping, but around the value space of the JSON
… so there’s a question about label and how we map it to a string or a language map
… so when we put label into the predefined terms we’ve made a decision
… and in IIIF we’ve moved from a string to a language map
… but conversely in Linked Art, we’ve used rdfs:label as a string for developers for non-I18N labels
… and use something else for lang string stuff
… so that’s the danger for the terms because they make determinations about the value space
Ivan Herman: I get that, so simply taking out the value space would fix that, no?
Rob Sanderson: sadly not, you’d still have to redefine it
… it would default to being a literal, I guess
Rob Sanderson: > “label”: “rdfs:label”
Rob Sanderson: but you’d have to still define it if you wanted it to be a language map
… “label”: “rdfs:label” would be an xsd:string
Rob Sanderson: “label”: {“en”: “english label”}
Ivan Herman: no, it defaults to nothing
Rob Sanderson: well, if you put a language map in there, it wouldn’t be treated as a language map
Gregg Kellogg: if we did make it a language map, you can still use string values
… if you used a string, then it’d be xsd:string
Rob Sanderson: > “label”: {“@language”: “en”, “@value”: “english label”}
Gregg Kellogg: and if you used an object, it would be a language object
Rob Sanderson: well, you could also use a value object
… I’ve no problem putting the predefined terms in a separate file
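
The value-space point in this exchange comes down to two different term definitions for the same term. The Python sketch below writes both out and checks which one declares a language-map container. The constant names and the helper `uses_language_map` are hypothetical, and the helper only inspects the definition rather than processing any data.

```python
# Plain mapping, as quoted in the transcript: "label": "rdfs:label".
PLAIN_LABEL = {"label": "rdfs:label"}

# Expanded term definition that declares a language-map container.
LANGUAGE_MAP_LABEL = {
    "label": {"@id": "rdfs:label", "@container": "@language"},
}


def uses_language_map(context, term):
    """Return True if the term definition declares an @language container."""
    definition = context.get(term)
    if not isinstance(definition, dict):
        return False  # a plain string mapping carries no container
    container = definition.get("@container", [])
    if isinstance(container, str):
        container = [container]
    return "@language" in container


print(uses_language_map(PLAIN_LABEL, "label"))
print(uses_language_map(LANGUAGE_MAP_LABEL, "label"))
```

Whichever form a shared context picks, communities that want the other one would need to redefine the term in their own context.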
Benjamin Young: yeah…it all runs into conflict with somebody somewhere
… and seems best avoided
… other than the prefixes
Rob Sanderson: so, how about 2 files
… prefixes
… aliases & terms
… you can then import them into your context
… you’d still get the benefit of context caching
Benjamin Young: This is for context authors as a starter kit
… as newbies will just mess things up, and they need to define their own terms in a context anyway
… can already just pull in schema.org.
Harold Solbrig: yeah, I wonder when you were talking about heavily cached
… many of these are already added into libraries anyhow
… and so these URLs would just be ignored and ultimately just mapped to what’s hard coded–assuming the mappings are the same
Benjamin Young: exactly. that’s “heavy” caching :)
Ivan Herman: the reason I’m a bit hesitant
… is that if I’m going to use JSON-LD in a heavy way, then I’d avoid the aliases
… but things like describedby or label, I’d use those
… but if they’re mixed into the aliases, I wouldn’t use them
… and if we make 3 files…we’d look ridiculous…
… but the prefixes stuff I’d use often
… and for me, I’d expect that it’s there and I could use that more easily
… but then…I am not the target here
Rob Sanderson: one of the benefits of the existing prefixes in the list is that most of them are not terms
… but schema is another potential conflict
… which is why some communities have used sdo
… so we could defer the terms and aliases discussion–pulling them out from the prefixes
… and then we can focus on the prefix names and list
Gregg Kellogg: ivan will remember how familiar this all sounds
Ivan Herman: yes. and at some point we’ll conflict with someone
… so it starts to feel fruitless
… so my feeling is, we tried…let’s move on
Harold Solbrig: the question on why bother
… many of these are so common, that if we don’t bother, it’s going to get done anyway…and be done multiple different ways
Ivan Herman: but maybe that’s a good thing
… your community will make a bunch of prefixes for the medical domain
… that the publishing community won’t care about
… and vice versa
Harold Solbrig: but then there’s the common set of prefixes that we all care about
Ivan Herman: but that’s like 5 things
… xsd, rdfs, and that’s about it
Gregg Kellogg: and JSON people won’t use any of those
Ivan Herman: right…so I propose to stop the exercise
Benjamin Young: There is a context file which is the rdfa initial context. We reference in Wiley
… the prefix list already exists and already a context in use
… perhaps we just need to take that list
Ivan Herman: a practical problem
Benjamin Young: Oh?
Ivan Herman: If I’m not around, no one will maintain it
Benjamin Young: We should fix that
Ivan Herman: We can pull into GH and make a redirect from the old URL, then people can maintain it in the future
Harold Solbrig: prefix.cc ?
Benjamin Young: No that’s community, and more about the right hand side because communities move things around
Harold Solbrig: The list?
Benjamin Young: https://www.w3.org/2011/rdfa-context/rdfa-1.1
Benjamin Young: http://www.w3.org/2013/json-ld-context/rdfa11
Benjamin Young: A lot of graph related tools just use this
Ivan Herman: Some unnecessary things
Benjamin Young: Yes but heavily cached so no cost
… maintenance group would take over the maintenance of the rdfa core initial context
Ivan Herman: So it goes into a repo, with a redirection in w3c space into this one
Proposed resolution: JSON-LD Maintenance Group to take over the RDFa Initial Context work and maintain it and its context file via GitHub (Benjamin Young)
Ivan Herman: +1
Gregg Kellogg: +1
Benjamin Young: +1
Rob Sanderson: +1
Harold Solbrig: +1
Gregg Kellogg: if we take over this work, and point it to the github repo, then we should be in a better place
… something we can maintain without a staff contact
… and potentially use some automation to automate the maintenance of this
Ivan Herman: true for the JSON-LD context, but maybe not for the RDFa stuff
Benjamin Young: I’d like to talk about the RDFa stuff separately
Resolution #4: JSON-LD Maintenance Group to take over the RDFa Initial Context work and maintain it and its context file via GitHub
Ivan Herman: maybe what we can do is also bring into this repo the RDFa stuff, and it’s maintained in the same place
… ok?
Action #2: move rdfa initial context (Ivan Herman)

@iherman
Member Author

iherman commented May 1, 2020

In view of the resolutions of 2020-05-01, closing this PR without any further action.

@iherman iherman closed this May 1, 2020