This repository was archived by the owner on May 27, 2024. It is now read-only.

Define syntax and format of REUSE.yaml #81

Open
@mxmehl

As discussed in spdx/spdx-spec#502, the SPDX project plans to support a "metadata, pre-document file" that contains specific information about files relative to its position. This follows a request to implement something called REUSE.yaml, first discussed here. This issue is to discuss the exact format and syntax of the file.

Proposed YAML options

In the original discussion, we proposed four different syntaxes. One of them (also disliked by the REUSE team) was turned down in an SPDX call. I removed two others as they are rather unintuitive and clumsy. I also changed the format a bit to comply with the YAML syntax (using * as a key name is invalid), and added another option.

Option 1: list

Each list item is an SPDX tag as used in file headers. This is easy to read thanks to the -, but every item must be wrapped in " to escape the :, which would otherwise separate a key from a value, and we cannot have multiple keys!

- files: "src/*"
  info:
    - "SPDX-FileCopyrightText: 2020 Me"
    - "SPDX-FileCopyrightText: © 2017 You"
    - "SPDX-License-Identifier: MIT"

Option 2: multi-line string

SPDX tags are simply separated by newlines. Neither - nor escaping of : is required. However, the indentation must be preserved for all lines!

- files: "src/*"
  info: |
    SPDX-FileCopyrightText: 2020 Me
    SPDX-FileCopyrightText: © 2017 You
    SPDX-License-Identifier: MIT

Option 3: license and copyright as separate keys

We could also separate the two information items. Downside: the keys must be wrapped in " to escape the - in the key name.

- files: "src/*"
  "SPDX-FileCopyrightText":
    - "2020 Me"
    - "© 2017 You"
  "SPDX-License-Identifier": MIT

Background on the YAML keys

Unlike the SPDX YAML format, we would like to avoid copyrightText and licenseDeclared as key names. In REUSE, the SPDX-License-Identifier and SPDX-FileCopyrightText tags (or, alternatively, traditional copyright statements in varying forms) are common and understood by users.

This was also accepted in the SPDX call.
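To compare the three options, here is a minimal sketch of how a tool might reduce each of them to the same list of SPDX tag lines. It assumes the YAML has already been parsed into plain Python structures (the dictionaries below are what a typical YAML parser would be expected to return for the examples above). The function name normalize is hypothetical and not part of any existing REUSE tooling.

```python
def normalize(entry):
    """Return (files, tag_lines) for an entry in any of the three proposed shapes."""
    files = entry["files"]
    if "info" in entry:
        info = entry["info"]
        if isinstance(info, str):
            # Option 2: one multi-line string, one tag per line.
            tags = [line.strip() for line in info.splitlines() if line.strip()]
        else:
            # Option 1: a list of already complete tag strings.
            tags = list(info)
    else:
        # Option 3: tag names are keys; values are a string or a list of strings.
        tags = []
        for key, value in entry.items():
            if key == "files":
                continue
            values = value if isinstance(value, list) else [value]
            tags.extend(f"{key}: {item}" for item in values)
    return files, tags


# Assumed parser output for the three examples in this issue.
OPTION_1 = {
    "files": "src/*",
    "info": [
        "SPDX-FileCopyrightText: 2020 Me",
        "SPDX-FileCopyrightText: © 2017 You",
        "SPDX-License-Identifier: MIT",
    ],
}
OPTION_2 = {
    "files": "src/*",
    "info": (
        "SPDX-FileCopyrightText: 2020 Me\n"
        "SPDX-FileCopyrightText: © 2017 You\n"
        "SPDX-License-Identifier: MIT\n"
    ),
}
OPTION_3 = {
    "files": "src/*",
    "SPDX-FileCopyrightText": ["2020 Me", "© 2017 You"],
    "SPDX-License-Identifier": "MIT",
}
```

Under these assumptions, all three entries normalize to the same result, so the choice between the options is mainly about readability for the person writing the file.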

Possible targets

REUSE.yaml is intended to target files relative to its own location, and only those that are "below" it.

Statements like files: "../../src/*" should not be possible.
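A rough sketch of how such a restriction could be checked, assuming patterns always use / as separator. The helper name is_allowed_target is hypothetical; it only inspects the pattern text and does not touch the file system.

```python
from pathlib import PurePosixPath


def is_allowed_target(pattern: str) -> bool:
    """Accept only relative patterns that never step up with '..'."""
    path = PurePosixPath(pattern)
    # PurePosixPath keeps '..' components as they are, so they show up in .parts.
    return not path.is_absolute() and ".." not in path.parts
```

With this check, "src/*" would be accepted, while "../../src/*", "src/../../x" and "/etc/*" would be rejected.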

Supporting traditional copyright statements?

A related question is whether we should only support SPDX-FileCopyrightText as the indicator of a file's copyright, or also "traditional" statements like "Copyright © 2021 Jane Doe".

REUSE recommends the SPDX tag, but also supports the traditional statements. My suggestion would be to do the same in REUSE.yaml to reduce friction, but in SPDX this could lead to conflicts. Happy to collect opinions here!

Globbing

DEP-5 uses a simple glob syntax in which * also matches /. In this syntax, */Makefile would match a Makefile at any depth below. I am not sure whether this globbing is available in any native Python module. The benefit of sticking with the DEP-5 glob is that we could more easily convert existing DEP-5 files to REUSE.yaml.

Another possibility would be to use Python's native glob. There, */Makefile would only match a Makefile exactly one level below, while **/Makefile would match Makefiles at every level.

We could also use pathspec, supporting the same globbing as gitignore.
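The difference between the first two behaviours can be sketched with the standard library alone. This is only an approximation: fnmatch lets * cross / in a way similar to DEP-5, but it is not a DEP-5 implementation (for example, it also understands [seq] character classes). The file list and both helper names are made up for illustration.

```python
import fnmatch
import tempfile
from pathlib import Path

# Hypothetical project layout, as paths relative to REUSE.yaml.
FILES = ["Makefile", "src/Makefile", "src/lib/Makefile", "src/main.c"]


def dep5_like(pattern, paths):
    """Match with fnmatch, where '*' also matches '/' (similar to DEP-5)."""
    return sorted(p for p in paths if fnmatch.fnmatchcase(p, pattern))


def python_glob(pattern, paths):
    """Match with pathlib's glob, where '*' stays within one path level."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Create empty files so that glob has something to walk over.
        for p in paths:
            target = root / p
            target.parent.mkdir(parents=True, exist_ok=True)
            target.touch()
        return sorted(
            match.relative_to(root).as_posix()
            for match in root.glob(pattern)
            if match.is_file()
        )
```

In this sketch, dep5_like("*/Makefile", FILES) finds src/Makefile and src/lib/Makefile, python_glob("*/Makefile", FILES) finds only src/Makefile, and python_glob("**/Makefile", FILES) additionally finds the top-level Makefile.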

Conflict resolution

As in DEP-5, I would suggest that the last match for a file wins. So if the file foo.txt is first matched by * and then by *.txt, the *.txt statement would apply.
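A minimal sketch of "last match wins", assuming entries are processed in file order and using fnmatch as a stand-in for whichever glob syntax is chosen. The function name resolve and the license values are hypothetical.

```python
import fnmatch


def resolve(path, entries):
    """Return the info of the last entry whose pattern matches path, else None."""
    winner = None
    for pattern, info in entries:
        if fnmatch.fnmatchcase(path, pattern):
            # A later match overrides any earlier one.
            winner = info
    return winner


# Entries in the order they would appear in REUSE.yaml (illustrative values).
ENTRIES = [
    ("*", "SPDX-License-Identifier: MIT"),
    ("*.txt", "SPDX-License-Identifier: CC0-1.0"),
]
```

Here resolve("foo.txt", ENTRIES) picks the *.txt entry, while resolve("foo.py", ENTRIES) falls back to the * entry. Swapping the two entries would make * win for every file, which shows that the order of entries matters under this rule.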

The dependency resolution within REUSE and its different options, including REUSE.yaml, is discussed in #70.

Labels: blocked (Blocked by another issue)
