Skip to content

build: Include tests in source distributions #2285

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mgorny
Copy link
Contributor

@mgorny mgorny commented Mar 25, 2025

What type of PR is this? (check all applicable)

  • πŸ’Ύ Refactor
  • ✨ Feature
  • πŸ› Bug Fix
  • πŸ”§ Optimization
  • πŸ“ Documentation
  • βœ… Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

Include test files in the source distribution, to make it possible to run them while repackaging from sources. This permits Linux distributions to use signed PyPI release artficats rather than having to use the git repository directly.

Include test files in the source distribution, to make it possible
to run them while repackaging from sources.  This permits Linux
distributions to use signed PyPI release artficats rather than having
to use the git repository directly.

Fixes narwhals-dev#2284
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for your pr but i'd rather not increase the package size

rather than having to use the git repository directly

just for my understanding, what's the downside of using the git repository?

@mgorny
Copy link
Contributor Author

mgorny commented Mar 25, 2025

thanks for your pr but i'd rather not increase the package size

Does it matter? Regular users install from wheels anyway. Almost all users of source distributions are downstream repackagers who actually want tests.

rather than having to use the git repository directly

just for my understanding, what's the downside of using the git repository?

git archive has no stability guarantees, and the stability of produced artifacts relies entirely on keeping the same compressor version. GitHub already broke that once in the past, and the immediate result is that all pipelines relying on GitHub archives broke over hash changes.

On top of that, PyPI supports signed attestations which provide an additional layer of authentication.

@MarcoGorelli
Copy link
Member

So this wouldn't affect the wheel, but only the .tar.gz file that gets uploaded?

@mgorny
Copy link
Contributor Author

mgorny commented Mar 25, 2025

Yes. I suppose the diff can be confusing, but I've verified that Hatchling includes only the narwhals package itself in the wheel.

@MarcoGorelli
Copy link
Member

Thanks - would it work if we just included a subset of tests? I'm just wondering if people will see a large tar gz file on github and draw the wrong conclusion

@mgorny
Copy link
Contributor Author

mgorny commented Mar 30, 2025

That wouldn't work for us, since we want to run as many tests as possible (and they are fast enough not to justify skipping some). Not sure how much that can actually help you either, given that without tests, the tarball is already 253 KiB (with tests, it's 432 KiB β€” I'm not really convinced people can really infer anything from these two numbers).

Another option would be to generate a separate tarball and upload it somewhere (e.g. attach to GitHub release) β€” but that's more work. An extra advantage of that approach is that you would be able to use a stronger compression.

@MarcoGorelli
Copy link
Member

thanks for measuring + explaining!

i don't know much about packaging (and virtually nothing about linux distributions) but i'll look into this shortly, we can probably get this into the next release

thanks for your interest here!

Copy link
Collaborator

@EdAbati EdAbati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this PR!
I'm also definitely not an expert in packaging/linux distributions but I took this opportunity to learn more about it πŸ˜„

It looks like there is a bit of debate whether it is a requirement to include tests in sdist or not and it seems that the answer is "it depends but there is no harm in doing so": see "Should sdists include docs and tests?""

The Python Packaging documentation says at some point:

Wheels are meant to contain exactly what is to be installed, and nothing more. In particular, wheels should never include tests and documentation, while sdists commonly do.

I checked what some of the most common library do. pandas and scikit-learn have tests in both wheel and sdist because their tests are close to the source code not in a separate folder. Our friend polars has docs and tests (and other stuff) in sdists but not in wheel. See tar.gz:

Screenshot 2025-04-01 at 08 56 17

In our case we could use the tool.hatch.build.target.<target> functionality to include what we want. I tried locally and this seems to do the trick:

[tool.hatch.build.targets.sdist]
include = [
  "narwhals/*",
  "tests/*",
  # or docs too?
  # "docs/*",
]

[tool.hatch.build.targets.wheel]
include = [
  "narwhals/*",
]

I think include is still better than exclude so that we don't have to update these lines each time we add a new file in root (e.g. just noticed that we ship the Makefile in the .tar.gz)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Enh]: Include tests in source distribution
3 participants