In the category of data publication you will find all the important information on the topic of data publication. This includes the motivation to publish research data, paths to publish data, recommendations for research data repositories to be used, best practices and aspects of machine actionability.
65
+
In this category on data publishing you will find all the important information on the topic of data publishing. This includes the motivation to publish research data, paths to publish data, recommendations for research data repositories to be used, best practices and aspects of machine actionability.
docs/00_intro/10_fair.mdx
+1-1
Original file line number
Diff line number
Diff line change
@@ -132,7 +132,7 @@ In simple terms: metadata include any relevant history. If the dataset is relate
132
132
133
133
#### R1.3. (meta)data meet domain-relevant community standards
134
134
135
-
As research data management and, as such, [data publication](/docs/data_publication) becomes more and more prevalent across research areas, [best practices](/docs/best_practice) in the individual communities will arise. This should encompass metadata templates for proper documentation of datasets, how the data should be [organized](/docs/data_organisation), which vocabularies or [ontologies](/docs/ontology) to use, and [file formats](/docs/format_standards). NFDI4Chem is working to establish [metadata and data standards](https://www.nfdi4chem.de/index.php/task-areas/) for the various communities in chemistry.
135
+
As research data management and, as such, [data publishing](/docs/data_publishing) becomes more and more prevalent across research areas, [best practices](/docs/best_practice) in the individual communities will arise. This should encompass metadata templates for proper documentation of datasets, how the data should be [organized](/docs/data_organisation), which vocabularies or [ontologies](/docs/ontology) to use, and [file formats](/docs/format_standards). NFDI4Chem is working to establish [metadata and data standards](https://www.nfdi4chem.de/index.php/task-areas/) for the various communities in chemistry.
136
136
137
137
Where available, community standards and best practices should be followed when those publishing prepare their datasets and relevant metadata for publication. [Repositories](/docs/repositories), especially domain-specific service providers, should adhere to the standards set forth by the community by requiring files and metadata to follow format specifications.
138
138
As noted in [I1](#i1-metadata-use-a-formal-accessible-shared-and-broadly-applicable-language-for-knowledge-representation) above, the CIF format represents a community-specific standard associated with the chemical community. Furthermore, [NMReDATA](https://doi.org/10.1002/mrc.4737) represents a possible [standard](/docs/format_standards) for publishing and archiving (meta)data of Nuclear Magnetic Resonance (NMR) experiments.
docs/00_intro/20_data_life_cycle.mdx
+2-2
@@ -43,8 +43,8 @@ Before sharing data, you should check whether the data is subject to **copyright
43
43
During this exchange and the associated reflections on the data, you should think about archiving and using the data in scientific publications. If you are not aware of any **criteria for archiving** and no criteria are specified in your working group or institute, decision-making guides such as the [“5 steps to decide what data to keep”](https://www.dcc.ac.uk/guidance/how-guides/five-steps-decide-what-data-keep) outlined by the DCC can help. Based on the established criteria, it is determined which of the collected raw data should be archived and which should be deliberately deleted.
44
44
45
45
In addition to the criteria, the migration of the data into **suitable [formats](/docs/format_standards) and onto suitable media** is important for archiving the data. In this step, the data should again be enriched with metadata so that it can be understood in the future without further knowledge about the data.
46
-
In addition to archiving, the [publication](/docs/data_publication) of the data plays a special role. Many research funders expect the data to be published if there are no special reasons not to do so, such as a non-disclosure agreement or the inclusion of personal data. A **chemistry-specific or chemistry-related [repository](/docs/repositories)** such as the [Chemotion Repository](https://www.chemotion-repository.net/), [NOMAD](https://nomad-lab.eu/services/repo-arch), or [MassBank](https://massbank.eu/MassBank/) is recommended for the publication of data. An overview of repositories can be found, for example, at [re3data.org](https://www.re3data.org/) or [fairsharing.org](https://fairsharing.org/). re3data.org allows you to filter repositories according to certain criteria such as the assignment of a persistent identifier or access.
47
-
Data publication often takes place at certain milestones, for example, in combination with a text publication or at the end of a project. The **final version of the data management plan** is also required at the end of a project.
46
+
In addition to archiving, the [publication](/docs/data_publishing) of the data plays a special role. Many research funders expect the data to be published if there are no special reasons not to do so, such as a non-disclosure agreement or the inclusion of personal data. A **chemistry-specific or chemistry-related [repository](/docs/repositories)** such as the [Chemotion Repository](https://www.chemotion-repository.net/), [NOMAD](https://nomad-lab.eu/services/repo-arch), or [MassBank](https://massbank.eu/MassBank/) is recommended for the publication of data. An overview of repositories can be found, for example, at [re3data.org](https://www.re3data.org/) or [fairsharing.org](https://fairsharing.org/). re3data.org allows you to filter repositories according to certain criteria such as the assignment of a persistent identifier or access.
47
+
Data publishing often takes place at certain milestones, for example, in combination with a text publication or at the end of a project. The **final version of the data management plan** is also required at the end of a project.
docs/10_domains/10_analytical_chemistry.mdx
+4-4
@@ -79,9 +79,9 @@ A typical workflow begins with the conceptualisation of the research question, a
79
79
- A detailed view, evaluation and interpretation of results is carried out with the Chemotion ELN features.
80
80
81
81
82
-
## Publication of research data
82
+
## Publishing research data
83
83
84
-
- In addition to a research article in a scientific journal, the underlying research data are [published](/docs/data_publication) in a [repository](/docs/repositories) and linked to the article to realise research data management according to the [FAIR data principles](/docs/fair) ([Best practice examples](/docs/best_practice)).
85
-
- Data publication in a repository includes raw and processed data for reuse.
84
+
- In addition to a research article in a scientific journal, the underlying research data are [published](/docs/data_publishing) in a [repository](/docs/repositories) and linked to the article to realise research data management according to the [FAIR data principles](/docs/fair) ([Best practice examples](/docs/best_practice)).
85
+
- Data publications in repositories include raw and processed data for reuse.
86
86
- The use of the [Chemotion ELN](https://www.chemotion.net/chemotionsaurus/index.html) enables a direct transfer of research data and the respective metadata to the [Chemotion Repository](https://www.chemotion-repository.net/welcome). Subsequently, these data are automatically shared with other repositories, e.g. [PubChem](https://pubchem.ncbi.nlm.nih.gov/). For the publication of research data in other discipline-specific repositories, such as the [MassBank](https://massbank.eu/MassBank/) for reference mass spectra, data have to be exported from the Chemotion ELN and submitted to the respective database.
87
-
- A [persistent identifier](/docs/pid) (e.g., DOI) is generated for a dataset by a repository (e.g., [DataCite](https://datacite.org/) for the Chemotion Repository), which is given in the journal publication or corresponding supporting information to link the data publication with the manuscript.
87
+
- A [persistent identifier](/docs/pid) (e.g., DOI) is generated for a dataset by a repository (e.g., [DataCite](https://datacite.org/) for the Chemotion Repository), which is given in the journal article or corresponding supporting information to link the data publication with the manuscript.
docs/10_domains/20_physical_chemistry.mdx
+3-3
@@ -39,7 +39,7 @@ Physical chemistry is an interdisciplinary science at the frontier between chemi
39
39
- obtained unprocessed raw files from measurements are uploaded to [ELN](/docs/eln) in open file formats and attached directly to the respective [ELN](/docs/eln) experiment entry, including metadata with data on the instrument (e.g. manufacturer, type, etc.), measurement conditions & parameters
40
40
- [metadata](/docs/metadata) related to the obtained data, such as temperature or solvent of measurement, follow common [metadata standards](/docs/metadata)
41
41
- research data are processed, analysed and compared with open non-proprietary software tools
42
-
- simultaneously with [publication](/docs/data_publication) as a research article in a scientific journal, the underlying research data is published in an open data [repository](/docs/repositories) and linked to the article (incl. semantically richly annotated raw and processed data in open data formats for reuse)
42
+
- simultaneously with [publication](/docs/data_publishing) as a research article in a scientific journal, the underlying research data is published in an open data [repository](/docs/repositories) and linked to the article (incl. semantically richly annotated raw and processed data in open data formats for reuse)
43
43
- a unique [persistent identifier](/docs/pid) (e.g. DOI) is generated for each dataset as well as for the journal publication
44
44
45
45
## Quantum Mechanical (QM) calculations
@@ -62,7 +62,7 @@ Physical chemistry is an interdisciplinary science at the frontier between chemi
62
62
- reproducibility of calculations to within numerical accuracy can be ensured by storing the input files and adding the program and its version (ideally even the compiler version and any compiler flags) as metadata. Numerical thresholds are well defined but reproducibility of calculations across different programs and versions is not guaranteed. This warrants the safekeeping of version specific source files for the same time period as the stored data
63
63
- data analysis scripts should be uploaded to the repository in open file formats, attached directly to the corresponding data entry and accompanied with appropriate documentation
64
64
- if possible, analysis and evaluation of calculations should be conducted with open, non-proprietary software tools
65
-
- simultaneously with [publication](/docs/data_publication) as a research article in a scientific journal, the data in the [repository](/docs/repositories) is linked to the article (incl. semantically richly annotated raw and processed data, if possible in open data formats for reuse)
65
+
- simultaneously with [publication](/docs/data_publishing) as a research article in a scientific journal, the data in the [repository](/docs/repositories) is linked to the article (incl. semantically richly annotated raw and processed data, if possible in open data formats for reuse)
66
66
- a unique [persistent identifier](/docs/pid) (e.g. DOI) is generated for the dataset as well as for the journal publication
67
67
- XML and CML (Chemical Markup Language) are used by a few software packages, but this is not common practice
68
68
@@ -89,7 +89,7 @@ Physical chemistry is an interdisciplinary science at the frontier between chemi
89
89
- [documentation of all research data](/docs/data_documentation) and [metadata](/docs/metadata) is carried out digitally using a suitable repository to store the data
90
90
- reproducibility of calculations can be ensured by storing the input file and adding the program and its version (ideally including the compiler and any compiler flags) as metadata
91
91
- if possible, analysis and evaluation of calculations should be conducted with open non-proprietary software tools
92
-
- simultaneously with [publication](/docs/data_publication) as a research article in a scientific journal, the data in the [repository](/docs/repositories) is linked to the article (incl. semantically richly annotated raw and processed data, if possible in open data formats for reuse)
92
+
- simultaneously with [publication](/docs/data_publishing) as a research article in a scientific journal, the data in the [repository](/docs/repositories) is linked to the article (incl. semantically richly annotated raw and processed data, if possible in open data formats for reuse)
93
93
- a unique [persistent identifier](/docs/pid) (e.g. DOI) is generated for each dataset as well as for the journal publication
docs/10_domains/40_synthetic_chemistry.mdx
+3-3
@@ -60,9 +60,9 @@ The main goal of a synthetic organic or inorganic chemist is to synthesise desir
60
60
- Optionally, preprocessing of digital data with software of analytical device before data are transferred to the Chemotion ELN (cf. data producing methods).
61
61
- A detailed view, evaluation and interpretation of results is carried out with the Chemotion ELN features.
62
62
63
-
## Publication of research data
63
+
## Publishing research data
64
64
65
65
- In addition to a research article in a scientific journal, the underlying research data are published in a repository and linked to the article to realise research data management according to the [FAIR data principles](/docs/fair) ([Best practice examples](/docs/best_practice)).
66
-
- Data publication in a repository includes raw and processed data for reuse.
66
+
- Data publications in repositories include raw and processed data for reuse.
67
67
- The use of the Chemotion ELN enables a direct transfer of research data and the respective metadata into the Chemotion Repository. Subsequently, these data are automatically shared with other repositories, e.g. [PubChem](https://pubchem.ncbi.nlm.nih.gov/). For the publication of research data in other discipline-specific repositories, such as the [CCDC](https://www.ccdc.cam.ac.uk/) for crystallographic data, data have to be exported from the Chemotion ELN and uploaded into the respective database.
68
-
- A [persistent identifier](/docs/pid) (e.g. DOI) is generated for a dataset by a repository (via [DataCite](https://datacite.org/) for the Chemotion Repository), which is given in the journal publication or corresponding supporting information to link the data publication with the manuscript.
68
+
- A [persistent identifier](/docs/pid) (e.g. DOI) is generated for a dataset by a repository (via [DataCite](https://datacite.org/) for the Chemotion Repository), which is given in the journal article or corresponding supporting information to link the data publication with the manuscript.
As research group leader, you are responsible for the [research data organisation](/docs/data_organisation) of your group. Many research institutions and also most funding institutions require or give internal RDM guidelines (e.g. [DFG checklist](https://www.dfg.de/download/pdf/foerderung/grundlagen_dfg_foerderung/forschungsdaten/forschungsdaten_checkliste_de.pdf), BMBF, EU guidelines) and recommend the set-up of [data management plans](/docs/dmp) in order to ensure that the data are archived in a [FAIR](/docs/fair) (**F**indable, **A**ccessible, **I**nteroperable, **R**e-usable) manner. Many funding institutions encourage or even enforce the [publication](/docs/data_publication) of FAIR data.
16
+
As research group leader, you are responsible for the [research data organisation](/docs/data_organisation) of your group. Many research institutions and also most funding institutions require or give internal RDM guidelines (e.g. [DFG checklist](https://www.dfg.de/download/pdf/foerderung/grundlagen_dfg_foerderung/forschungsdaten/forschungsdaten_checkliste_de.pdf), BMBF, EU guidelines) and recommend the set-up of [data management plans](/docs/dmp) in order to ensure that the data are archived in a [FAIR](/docs/fair) (**F**indable, **A**ccessible, **I**nteroperable, **R**e-usable) manner. Many funding institutions encourage or even enforce the [publication](/docs/data_publishing) of FAIR data.
17
17
18
18
In recent years, many new digital tools have been developed to support researchers in their RDM needs. The technical possibilities are briefly outlined below. For more details, please refer to the related chapters, many of which are directly linked.
19
19
20
20
:::danger Consider
21
21
Digitisation of research data only **after** the end of the production process is most tedious and time-consuming.
22
22
:::
23
23
24
-
Therefore, it is more efficient to capture the data and their corresponding [metadata](/docs/metadata) as early as from the planning of the experiment. Here, [electronic lab notebooks (ELN)](/docs/eln) facilitate everyday work considerably: the planning of the experiment, the documentation of experimental procedures, the analysis of the obtained spectroscopic data as well as the peak assignment can all be completed in one digital environment. And even better: complete experiment reports with analytical data (e.g. for the supporting information for [publications](/docs/data_publication)) can be generated automatically by the ELN.
24
+
Therefore, it is more efficient to capture the data and their corresponding [metadata](/docs/metadata) as early as from the planning of the experiment. Here, [electronic lab notebooks (ELN)](/docs/eln) facilitate everyday work considerably: the planning of the experiment, the documentation of experimental procedures, the analysis of the obtained spectroscopic data as well as the peak assignment can all be completed in one digital environment. And even better: complete experiment reports with analytical data (e.g. for the supporting information for [publications](/docs/data_publishing)) can be generated automatically by the ELN.
25
25
26
26
The time invested to set up the ELN and to organise the experiments thus pays off in numerous ways: In addition to facilitating [documentation](/docs/data_documentation), the storage of the produced data in a FAIR format in a [repository](/docs/repositories) is simplified. Also, the [data organisation](/docs/data_organisation) for the research group can be improved by setting up an internal database. This is invaluable for growing working groups.