put together the provenance info needed for each package/objects #7

Open
isanti opened this issue Jul 15, 2022 · 5 comments

@isanti

isanti commented Jul 15, 2022

No description provided.

isanti added this to the 2022-10 milestone Jul 15, 2022
@kmexter

kmexter commented Jul 28, 2022

As far as I can remember, the common provenance model (CPM) of EOSC Life WP6 recommends that the following be provided for each file, treated as a digital object:

  • who/what provided it (ORCID, project name and URL, and which software process executed the "get", "create" or "update")
  • whether this was a "get", a "create", an "update", etc.
  • where it was obtained from (ideally URL(s)) or how it was created (e.g. by a software process)
  • its provenance at the place it was obtained from (e.g. a URL pointing to the metadata record if the dataset was taken from one; internally, if we e.g. merge files, also URLs pointing to the provenance files in GitHub that belong to the merged files)
  • licence and more general access rights
  • who controls it (i.e. who to contact about it)
  • modification remarks if this is an update (ideally with some of the remarks taken from a vocabulary, so that it is clear to machines as well as humans whether this is an "original copy", "updated data", or whatever)
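To make the list above concrete, here is a minimal Python sketch of one provenance record per file. All field names, the `ACTIONS` vocabulary and the example values are assumptions made for illustration; none of them come from the CPM itself.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional

# Hypothetical vocabulary: the list above only names "get", "create" and
# "update" (followed by "etc."), so this set is an assumption.
ACTIONS = {"get", "create", "update"}


@dataclass
class FileProvenance:
    """One provenance record per file (digital object); names are illustrative."""

    path: str
    action: str                     # "get", "create" or "update"
    agent_orcid: Optional[str]      # who provided it (None if a process did)
    project_name: str
    project_url: str
    software_process: str           # what executed the action
    source_urls: list[str] = field(default_factory=list)   # where it came from
    upstream_provenance_urls: list[str] = field(default_factory=list)
    licence: str = ""
    access_rights: str = ""
    contact: str = ""               # who controls it
    modification_remarks: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject actions outside the (assumed) vocabulary early.
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")


# Placeholder values only; none of these identifiers are real.
record = FileProvenance(
    path="data/merged.csv",
    action="update",
    agent_orcid="0000-0000-0000-0000",
    project_name="example-project",
    project_url="https://example.org/project",
    software_process="merge_files.py",
    source_urls=["https://example.org/records/1"],
    licence="CC-BY-4.0",
    contact="maintainer@example.org",
    modification_remarks=["updated data"],
)
print(asdict(record))
```

A flat record like this could later be serialised into whichever format we settle on.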

This provenance information can be packaged in a provenance RO-Crate that we create, and also written in PROV-O following the CPM of WP6 (my notes about this can be found on Confluence: https://confluence.vliz.be/display/VMDCOS/2022-07-08+Vienna+ISO+pt+3+meeting and https://confluence.vliz.be/display/VMDCOS/Reading+on+provenance+in+marine+biology, plus 2 papers that I am not allowed to share digitally but which I have printed out and on my desk :-})
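As a rough illustration of what "written in PROV-O" could look like, the sketch below builds a small JSON-LD document using only core PROV-O terms (`prov:Entity`, `prov:Activity`, `prov:wasGeneratedBy`, and so on) plus `dcterms:license`. It does not attempt to follow the CPM of WP6; the function name `to_prov_jsonld`, its parameters and all URLs are placeholders invented for the example.

```python
import json


def to_prov_jsonld(file_url, action, agent_url, software_url, source_urls, licence_url):
    """Describe one file as a prov:Entity generated by one prov:Activity."""
    activity_id = f"{file_url}#{action}"  # assumed naming scheme for the activity
    sources = [{"@id": u} for u in source_urls]
    return {
        "@context": {
            "prov": "http://www.w3.org/ns/prov#",
            "dcterms": "http://purl.org/dc/terms/",
        },
        "@graph": [
            {
                "@id": file_url,
                "@type": "prov:Entity",
                "prov:wasGeneratedBy": {"@id": activity_id},
                "prov:wasAttributedTo": {"@id": agent_url},
                "prov:wasDerivedFrom": sources,
                "dcterms:license": {"@id": licence_url},
            },
            {
                "@id": activity_id,
                "@type": "prov:Activity",
                "prov:wasAssociatedWith": [{"@id": agent_url}, {"@id": software_url}],
                "prov:used": sources,
            },
            {"@id": agent_url, "@type": "prov:Person"},
            {"@id": software_url, "@type": "prov:SoftwareAgent"},
        ],
    }


# Placeholder identifiers only.
doc = to_prov_jsonld(
    file_url="https://example.org/repo/data/merged.csv",
    action="update",
    agent_url="https://orcid.org/0000-0000-0000-0000",
    software_url="https://example.org/repo/merge_files.py",
    source_urls=["https://example.org/records/1"],
    licence_url="https://creativecommons.org/licenses/by/4.0/",
)
print(json.dumps(doc, indent=2))
```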

@kmexter

kmexter commented Jul 28, 2022

Additionally, for biological material and its digital "derivatives", we will need provenance information following the EMBRC "provenance model" that we are building in WP6. This will cover the metadata necessary for each spreadsheet from a single station/sampling event, the digital files (e.g. the sequences, the ARMS images), and also the biobanked material (especially if the stations don't do this properly!).
Since Laurian and I will not have time to put this model together until Oct/Nov, this part of the provenance will have to wait until then. What we can perhaps do before then is decide how we will store these metadata. Ideally not as CSV files (data.csv and metadata.csv), because that is just too clunky for the amount of digital data that will need to be managed. We will need to create a template that can (ideally) be filled automatically, which Laurian and I can do as part of our EMBRC prov model work.

@cedricdcc

This can be made into an action that can be applied to any GitHub repo, regardless of whether the repo is an RO-Crate.
@marc-portier thoughts?

  • Responsibility can be assigned to the original author of the file; the contact info is their GitHub account.
  • The license is the overarching repo license.
  • The output format should be selected via input params.
  • We should first search for existing actions that already generate provenance from a given repo.
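The points above could be sketched roughly as follows in Python, as the core logic such an action might wrap. Everything here is an assumption for illustration: the function names, the record keys and the two output formats are hypothetical, and the commit author's email stands in for the GitHub account, which plain git does not record.

```python
import json
import subprocess


def parse_author(git_log_output: str) -> dict:
    """Pick the oldest listed commit; git log prints newest first."""
    lines = [line for line in git_log_output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no commit found that added this file")
    name, email = lines[-1].split("\t", 1)
    return {"name": name, "email": email}


def original_author(repo_dir: str, file_path: str) -> dict:
    """Ask git which commit(s) added the file and return the earliest author."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", "--diff-filter=A", "--follow",
         "--format=%an%x09%ae", "--", file_path],  # %x09 is a tab separator
        check=True, capture_output=True, text=True,
    )
    return parse_author(result.stdout)


def build_record(file_path: str, author: dict, repo_license: str) -> dict:
    """Responsibility goes to the original author; license is repo-wide."""
    return {
        "file": file_path,
        "responsible": author["name"],
        "contact": author["email"],
        "license": repo_license,
    }


def render(record: dict, output_format: str = "json") -> str:
    """Select the output format from an input parameter."""
    if output_format == "json":
        return json.dumps(record, indent=2)
    if output_format == "text":
        return "\n".join(f"{key}: {value}" for key, value in record.items())
    raise ValueError(f"unsupported output format: {output_format!r}")


# Example with canned git output, so no repository is needed to try it.
author = parse_author("Bob\tbob@example.org\nAlice\talice@example.org\n")
record = build_record("data/merged.csv", author, "CC-BY-4.0")
print(render(record, "text"))
```

In a real action, `original_author` would be called once per file in the checked-out repo and the format would come from the action's inputs.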

cedricdcc added the enhancement (New feature or request) and help wanted (Extra attention is needed) labels Aug 2, 2022
@laurianvm

prov-o link: https://www.w3.org/TR/prov-o/

@kmexter

kmexter commented Oct 16, 2024

Still on my to-do list, for the end of November.
