Commit a01b056

Merge pull request #376 from NFDI4Chem/localisation
Localisation
2 parents 6055556 + 6c72574 commit a01b056

12 files changed, with 1010 additions and 103 deletions

docs/00_intro/10_fair.mdx

+22 -22
Original file line number | Diff line number | Diff line change
@@ -2,6 +2,7 @@
22
title: "FAIR Data Principles"
33
slug: "/fair"
44
---
5+
56
# FAIR Data Principles
67

78
![FAIR Data](/img/topics/FAIR_data_principles.png)
@@ -23,7 +24,6 @@ In chemistry, the deposition of crystallographic data in a standardized file for
2324

2425
In the following, we answer the questions: What makes data FAIR? What do researchers and those who provide data preservation services need to consider?
2526

26-
2727
## Findable
2828

2929
Researchers — and the computers working on their behalf — must be able to find datasets to be able to reuse them. Therefore, the first guideline of the FAIR Data Principles outlines methods to ensure a dataset’s discovery.
@@ -38,11 +38,11 @@ A common example of a citable PID is the Digital Object Identifier, or [DOI](htt
3838

3939
Data need to be sufficiently described in order to make them both findable and reusable. Hence, the specific focus here lies on making the (meta)data findable by using rich discovery [metadata](/docs/metadata) in a standardized format and allowing computers and humans to quickly understand the dataset’s contents. This is an essential component in the plurality of metadata described by [R1](#r1-metadata-are-richly-described-with-a-plurality-of-accurate-and-relevant-attributes) below. This information may include, but is not limited to:
4040

41-
- the context on what the dataset is, how it was generated, and how it can be interpreted,
42-
- the data quality,
43-
- licensing and (re)use agreements,
44-
- what other data may be related (linked via its PID), and
45-
- associated journal publications and their DOI.
41+
- the context on what the dataset is, how it was generated, and how it can be interpreted,
42+
- the data quality,
43+
- licensing and (re)use agreements,
44+
- what other data may be related (linked via its PID), and
45+
- associated journal publications and their DOI.
4646

4747
Repositories should provide researchers with a fillable [application profile](https://en.wikipedia.org/wiki/Application_profile) that allows researchers to give extensive and precise information on their deposited datasets. For example, the Chemotion Repository uses, among others, the [Datacite Metadata Schema](http://doi.org/10.5438/0012) to build its application profile, a schema specifically created for the publication and citation of research data. [RADAR](https://radar.products.fiz-karlsruhe.de/en), including the variant [RADAR4Chem](https://www.nfdi4chem.de/index.php/2650-2/), has also built [its metadata schema](https://radar.products.fiz-karlsruhe.de/en/radarfeatures/radar-metadatenschema) on Datacite. These include an assortment of mandatory, recommended, and optional metadata properties, allowing for a rich description of the deposited dataset. For those publishing data, always keep in mind: the more information provided, the better.
4848

@@ -106,17 +106,17 @@ Many of the previous points lead to one key aspect of data sharing: data reusabi
106106

107107
Related to [F2](#f2-data-are-described-with-rich-metadata-defined-by-r1-below) above, the focus here lies on whether the data, once found, is useable to the person or computer searching. It also stresses giving the data as many attributes as possible. Researchers should not assume the person—or that person’s computer—looking to re(use) their data is completely familiar with the discipline. Examples of information to assign here include (non-exhaustive list):
108108

109-
- What the dataset contains, including whether the data is raw and/or processed
110-
- How the data was processed
111-
- How the data can be reused
112-
- Who created the data
113-
- Date of creation
114-
- Variable names
115-
- Standard methods used
116-
- Scope of the data and project
117-
- Lab conditions
118-
- Any limitations to the data
119-
- Software and versions used for acquisition and processing.
109+
- What the dataset contains, including whether the data is raw and/or processed
110+
- How the data was processed
111+
- How the data can be reused
112+
- Who created the data
113+
- Date of creation
114+
- Variable names
115+
- Standard methods used
116+
- Scope of the data and project
117+
- Lab conditions
118+
- Any limitations to the data
119+
- Software and versions used for acquisition and processing.
120120

121121
An important piece of information for chemical data are [machine-readable chemical structures](/docs/machine-readable_chemical_structures). This should be included within the dataset and/or metadata and aids computers in finding the correct data in their queries.
122122

@@ -141,8 +141,8 @@ Where required, format converters should be linked in the dataset’s metadata.
141141

142142
## Sources and further information
143143

144-
- [FORCE 11: FAIR Data Principles](https://www.force11.org/group/fairgroup/fairprinciples)
145-
- [Go-FAIR initiative: FAIR Principles](https://www.go-fair.org/fair-principles/)
146-
- [TIB Blog: The FAIR Data Principles for Research Data](https://blogs.tib.eu/wp/tib/2017/09/12/the-fair-data-principles-for-research-data/)
147-
- [FAIRsFAIR: How to be FAIR with your data. A teaching and training handbook for higher education institutions](https://doi.org/10.5281/zenodo.6674301) & [Engelhardt et al. (book version)](10.17875/gup2022-1915) & [Gitbook version](https://fairsfair.gitbook.io/fair-teaching-handbook)
148-
- [Checklist: How FAIR are your data?](https://doi.org/10.5281/zenodo.1065991) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1065991.svg)](https://doi.org/10.5281/zenodo.1065991)
144+
- [FORCE 11: FAIR Data Principles](https://www.force11.org/group/fairgroup/fairprinciples)
145+
- [Go-FAIR initiative: FAIR Principles](https://www.go-fair.org/fair-principles/)
146+
- [TIB Blog: The FAIR Data Principles for Research Data](https://blogs.tib.eu/wp/tib/2017/09/12/the-fair-data-principles-for-research-data/)
147+
- [FAIRsFAIR: How to be FAIR with your data. A teaching and training handbook for higher education institutions](https://doi.org/10.5281/zenodo.6674301) & [Engelhardt et al. (book version)](https://doi.org/10.17875/gup2022-1915) & [Gitbook version](https://fairsfair.gitbook.io/fair-teaching-handbook)
148+
- [Checklist: How FAIR are your data?](https://doi.org/10.5281/zenodo.1065991) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1065991.svg)](https://doi.org/10.5281/zenodo.1065991)

docs/20_role/50_core_facility_manager.mdx

+33 -23
@@ -3,15 +3,20 @@ title: "Core facility manager"
33
slug: "/core_facility_manager"
44
---
55

6-
import useBaseUrl from '@docusaurus/useBaseUrl';
6+
import useBaseUrl from "@docusaurus/useBaseUrl";
77

88
:::info Applies to:
99
This article applies to core facility managers and heads of analytical service units.
1010
:::
1111

1212
## Motivation
1313

14-
<img alt="Data LifeCycle" src={useBaseUrl('/img/Intro/DataLifeCycle_KB.svg')} width="500" align="right" />
14+
<img
15+
alt="Data LifeCycle"
16+
src={useBaseUrl("/img/Intro/DataLifeCycle_KB.svg")}
17+
width="500"
18+
align="right"
19+
/>
1520

1621
In the chemistry data lifecycle, core facilities play an important role as major producers of chemical data. For modern analytical techniques such as mass spectrometry or NMR spectroscopy, data are usually recorded digitally and the challenges lie less in digitalisation but management issues.
1722

@@ -23,7 +28,7 @@ When thinking about how to store data and make them available, an important star
2328

2429
### The Situation in Germany
2530

26-
The German Research Council ([DFG](https://www.dfg.de)) summarizes the consensus on the *fundamental principles and standards of good practice* in science in their Code fo Conduct *Guidelines for Safeguarding Good Research Practice* [\[1\]](#dfg_code). In guideline 17, a storage of all research data for the period of ten years is demanded, starting from the date of publication. Data storage strategies should therefore contain longterm storage for at least that time.
31+
The German Research Council ([DFG](https://www.dfg.de)) summarizes the consensus on the _fundamental principles and standards of good practice_ in science in their Code fo Conduct _Guidelines for Safeguarding Good Research Practice_ [\[1\]](#dfg_code). In guideline 17, a storage of all research data for the period of ten years is demanded, starting from the date of publication. Data storage strategies should therefore contain longterm storage for at least that time.
2732

2833
## How to start
2934

@@ -41,24 +46,24 @@ In addition, [backup strategies](/docs/data_storage/) for all instrument worksta
4146

4247
While most of the scientific work still lies ahead, there are already valuable metadata to be harvested and digested at the early stage of sample submission. These can include, among many others:
4348

44-
- Date
45-
- Creator (person, group)
46-
- Project
47-
- Sample identifier
48-
- Molecular structure(s), and derived properties:
49-
- Molecular formula
50-
- Molecular weight
51-
- Elemental composition
52-
- Physicochemical properties
53-
- Solvent or solubility
54-
- Purity
55-
- Experiment information of interest, such as:
56-
- Retation time
57-
- Polarity
58-
- Ionisation method
59-
- NMR nuclei and experiments
60-
- Chiroptical data
61-
- Biological properties
49+
- Date
50+
- Creator (person, group)
51+
- Project
52+
- Sample identifier
53+
- Molecular structure(s), and derived properties:
54+
- Molecular formula
55+
- Molecular weight
56+
- Elemental composition
57+
- Physicochemical properties
58+
- Solvent or solubility
59+
- Purity
60+
- Experiment information of interest, such as:
61+
- Retation time
62+
- Polarity
63+
- Ionisation method
64+
- NMR nuclei and experiments
65+
- Chiroptical data
66+
- Biological properties
6267

6368
The challenge of digesting those metadata according to [FAIR guiding principles](/docs/fair/) can be a challenge for core facilities and essentially come down to two possible strategies:
6469

@@ -67,5 +72,10 @@ The challenge of digesting those metadata according to [FAIR guiding principles]
6772

6873
## Sources
6974

70-
1. <a name="dfg_code"></a> Deutsche Forschungsgemeinschaft (DFG), <em>Guidelines for Safeguarding Good Research Practice. Code of Conduct</em>, September 2019, <a href="https://doi.org/10.5281/zenodo.3923602"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.3923602.svg" alt="DOI" /></a>
71-
2. <a name="sync"></a>See <a href="https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/robocopy" target="_blank">Microsoft documentation</a> for <code>robocopy</code> or <a href="https://linux.die.net/man/1/rsync" target="_blank">manpage</a> for <code>rsync</code>.
75+
1. <span id="dfg_code" />
76+
Deutsche Forschungsgemeinschaft (DFG), *Guidelines for Safeguarding Good Research
77+
Practice. Code of Conduct*, September 2019, [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3923602.svg)](https://doi.org/10.5281/zenodo.3923602)
78+
79+
2. <span id="sync" />
80+
See [Microsoft documentation](https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/robocopy)
81+
for `robocopy` or [manpage](https://linux.die.net/man/1/rsync) for `rsync`.

docs/40_smartlab/00_smartlab.mdx

+9 -5
@@ -4,19 +4,23 @@ slug: "/smartlab"
44
id: "smartlab"
55
---
66

7-
import IntroButton from '@site/src/components/IntroButton.js';
8-
import useBaseUrl from '@docusaurus/useBaseUrl';
7+
import IntroButton from "@site/src/components/IntroButton.js";
8+
import useBaseUrl from "@docusaurus/useBaseUrl";
99

1010
# Smart Laboratory (Smart Lab)
1111

1212
![smartlab_flow](/img/smartlab/smartlab_flow2.png)
1313

14-
A smart lab represents a holistic approach to [data management](/docs/data_guide) in chemistry with seamless data flows. What does this mean? It means that all steps within a researcher's [workflow](/docs/domain_guide) across the [research data lifecycle](/docs/data_life_cycle) are interconnected in a digital way. The key difference to a [Laboratory Management System (LIMS)](https://en.wikipedia.org/wiki/Laboratory_information_management_system) is that the Smart Lab's main focus is the realisation of the [FAIR data principles](/docs/fair). For example, a researcher plans and [documents](/docs/data_documentation) their experiment in an [electronic lab notebook (ELN)](/docs/eln). Any experimental data from devices such as spectrometers are then directly ingested by the ELN via [Application Programming Interfaces (APIs)](https://en.wikipedia.org/wiki/API).
14+
A smart lab represents a holistic approach to [data management](/docs/data_guide) in chemistry with seamless data flows. What does this mean? It means that all steps within a researcher's [workflow](/docs/domain_guide) across the [research data lifecycle](/docs/data_life_cycle) are interconnected in a digital way. The key difference to a [Laboratory Management System (LIMS)](https://en.wikipedia.org/wiki/Laboratory_information_management_system) is that the Smart Lab's main focus is the realisation of the [FAIR data principles](/docs/fair). For example, a researcher plans and [documents](/docs/data_documentation) their experiment in an [electronic lab notebook (ELN)](/docs/eln). Any experimental data from devices such as spectrometers are then directly ingested by the ELN via [Application Programming Interfaces (APIs)](https://en.wikipedia.org/wiki/API).
1515

16-
The ELN then ideally assigns all the necessary [metadata](/docs/metadata) automatically and appropriately for a corresponding workflow and converts proprietary [data formats](/docs/format_standards) to open data formats. The ELN structures the (meta)data and experimental descriptions in a meaningful and sustainable way which is both human- and machine-readable (e.g., via the use of [machine-readable chemical structures](/docs/machine-readable_chemical_structures). When the researcher chooses to [publish](/docs/data_publishing) or [archive](/docs/data_storage) their data, it is then ingested seamlessly by a data [repository](/docs/repositories) or archive without much further work as the ELN has already appropriately prepared the dataset to meet a repository’s or archive’s [requirements](/docs//choose_repository).
16+
The ELN then ideally assigns all the necessary [metadata](/docs/metadata) automatically and appropriately for a corresponding workflow and converts proprietary [data formats](/docs/format_standards) to open data formats. The ELN structures the (meta)data and experimental descriptions in a meaningful and sustainable way which is both human- and machine-readable (e.g., via the use of [machine-readable chemical structures](/docs/machine-readable_chemical_structures). When the researcher chooses to [publish](/docs/data_publishing) or [archive](/docs/data_storage) their data, it is then ingested seamlessly by a data [repository](/docs/repositories) or archive without much further work as the ELN has already appropriately prepared the dataset to meet a repository’s or archive’s [requirements](/docs/choose_repository).
1717

1818
In this section, key components of the smart lab will be introduced to you.
1919

2020
## Get started:
2121

22-
<IntroButton url={"/docs/eln"} imgUrl={"/img/nfdi4chem_SmartLab.svg"} text={"Electronic Lab Notebooks"} />
22+
<IntroButton
23+
url={"/docs/eln"}
24+
imgUrl={"/img/nfdi4chem_SmartLab.svg"}
25+
text={"Electronic Lab Notebooks"}
26+
/>

docs/50_data_publication/10_repositories.mdx

+1 -1
@@ -25,7 +25,7 @@ Some, but not all, repositories curate and review the data before **ingestion**
2525

2626
In order to allow data reuse by other researchers, [metadata](/docs/metadata), including [provenance information](/docs/provenance/), are required beside the actual data. Metadata describe the research data and provide information about its creation, the methods or software used as well as legal aspects. Metadata can be either added manually via a metadata editor or can be provided through other applications. The process to manually add metadata via a metadata editor can be compared to the process of submitting a manuscript to a publisher via the publishers submission system.
2727

28-
One main function of repositories is to provide a search function, with which users and machines can find, view, and download data. In order to ensure that data are permanently referenced and can be [linked and cited](/docs/best_practice/#how-to-use-dataset-pids-in-scientific-articles), repositories assign unique [persistent identifiers](/docs/pid) (PIDs). This also enhances the findability and accessibility of research data.
28+
One main function of repositories is to provide a search function, with which users and machines can find, view, and download data. In order to ensure that data are permanently referenced and can be [linked and cited](/docs/best_practice/), repositories assign unique [persistent identifiers](/docs/pid) (PIDs). This also enhances the findability and accessibility of research data.
2929

3030
Repositories can also be certified (e.g. CoreTrustSeal). Such certification ensures that the data is citable, preserved in the long run, and may also cover aspects of data curation and data quality.
3131
