Cache pyapp build on GHA #106

Open
sandangel opened this issue Apr 5, 2024 · 4 comments

Comments

@sandangel

Hi, I think pyapp is compiled every time CI runs, even with no code changes, although I have enabled cache-to and cache-from in docker/build-push-action. Is there a way for us to cache the pyapp build?

@blakeNaccarato

blakeNaccarato commented Feb 4, 2025

Because a pyapp.exe build customized with your environment variables happens in the final stage of building or installing with Cargo, you could probably leverage an approach like the Rust Cache GitHub Action, which caches e.g. .cargo and target in the build directory.

I've played with the cargo install pyapp --force --root . approach, which is convenient for one-off, drive-by builds, but it doesn't properly cache the upstream dependency artifacts, so building takes a while. I've since figured out a project structure that works pretty well. I'll probably try to slim this down, isolate it, and release it as a reusable GitHub Action with caching and such at some point:

  • Get sources for PyApp as recommended in the docs
  • Add .gitignore entries as below to check in only the things we're going to modify from the sources to unzip
  • Unzip the sources into pyapp in your project directory
  • Build your project wheel, set up environment variables, then in the pyapp dir run cargo build --release
  • After cargo build --release, do git restore 'Cargo.lock' (we unignored it so that we can "discard" pyapp being added to it when cargo build --release forcibly re-locks it). As far as I know, there's no other way to accomplish this behavior?
  • Do your post-build stuff (in my case, changing binary icon, signing...)
  • If you're in CI using e.g. the Rust Cache GitHub Action, you will probably need to discard/avoid caching other bits from ./pyapp/target like the .exe, and ensure Cargo.lock doesn't have pyapp in it, as we did above

This means that 382/383 stages are built the first time only, and just the pyapp build stage is repeated, saving a bit of time. Though in practice the first 382 build stages don't take that long. If you then cache pyapp/target, you'll get your desired behavior, which may be implementable in your Docker workflow slightly differently. See the justfile implementation of this build approach below.
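
The steps above could be sketched as a small Python driver. This is only a sketch under assumptions: `build_env`, `build_commands`, and `run_build` are hypothetical helper names, the wheel is assumed to be built with a plain `uv build`, and only a few of the `PYAPP_*` options are set. The post-build steps (icon, signing) are left out.

```python
import os
import subprocess
from pathlib import Path


def build_env(wheel, project_name, python_version="3.12"):
    """Return a copy of the current environment plus a few PyApp build options."""
    env = dict(os.environ)
    env.update(
        {
            "PYAPP_PROJECT_NAME": project_name,
            # PyApp is pointed at the locally built wheel via an absolute path.
            "PYAPP_PROJECT_PATH": os.path.abspath(wheel),
            "PYAPP_PYTHON_VERSION": python_version,
        }
    )
    return env


def build_commands(pyapp_dir, project_dir=Path(".")):
    """List (command, working directory) pairs in the order described above."""
    return [
        # Build the project wheel (uv writes it to ./dist by default).
        (["uv", "build"], project_dir),
        # Compile PyApp from the unzipped local sources.
        (["cargo", "build", "--release"], pyapp_dir),
        # Discard the change cargo made to the checked-in Cargo.lock.
        (["git", "restore", "Cargo.lock"], pyapp_dir),
    ]


def run_build(wheel, project_name, pyapp_dir):
    """Run every step, stopping at the first failure."""
    env = build_env(wheel, project_name)
    for command, cwd in build_commands(pyapp_dir):
        subprocess.run(command, cwd=cwd, env=env, check=True)
```

Something like `run_build(Path("dist/hello-0.0.0-py3-none-any.whl"), "hello", Path("pyapp"))` would run the three commands in order. In CI, restoring and saving pyapp/target would happen around this call.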


EDIT

Proof-of-concept Justfile to help along with properly caching builds...

Here's a minimal Justfile derived from my work-in-progress on easing the interface for building binaries from PyApp over at blakeNaccarato/buildsign. This kind of workflow benefits from build caching, and it's conceivable that a GitHub Actions workflow configured in tandem with the Justfile approach would speed up builds in CI.

You can see a more involved (and specific) version over at the repo, but this should work for a typical project and, with some work on the TODO section, should download the correct pyapp sources cross-platform as well.

Notice that _remove_stale also avoids a side effect: PyApp reuses the installed environment even when the built Python code changes, because the version number hasn't changed.

.gitignore

pyapp/*
!pyapp/Cargo.lock
!pyapp/.cargo

justfile

# Settings
set windows-shell := ['powershell.exe', '-NonInteractive', '-NoProfile', '-Command']

# Project details
proj_name := 'hello'
proj_version := '0.0.0'
pyapp_version := '0.26.0'

# Artifacts
pyapp := absolute_path( 'pyapp' )
pyapp_bin := absolute_path( pyapp/'target/release/pyapp.exe' )
bin := absolute_path( 'dist'/(proj_name+'.exe') )

# Compile binary
build \
  $PYAPP_EXPOSE_ALL_COMMANDS = '1' \
  $PYAPP_PYTHON_VERSION = '3.12' \
  $PYAPP_PROJECT_NAME = proj_name \
  $PYAPP_UV_ENABLED = '1' \
  $PYAPP_UV_VERSION = '0.5.29' \
  $PYAPP_PROJECT_PATH = absolute_path( 'dist'/proj_name+'-'+proj_version+'-py3-none-any.whl' ) \
: _get_pyapp_sources && _remove_stale
  uv --preview build --package '{{proj_name}}'
  cd '{{pyapp}}'; cargo build --release
  {{ if path_exists(bin) == "true" { "rm " + bin } else {""} }}
  mv '{{pyapp_bin}}' '{{bin}}'

# Remove possibly stale PyApp installation
_remove_stale:
  {{bin}} self remove

# TODO: Implement bash version as well
# Get PyApp sources
_get_pyapp_sources:
  {{ if path_exists(pyapp) == 'true' { "" } else { __get_pyapp_sources } }}
__get_pyapp_sources := \
  "Invoke-WebRequest" \
  + " 'https://github.com/ofek/pyapp/releases/download/v" + pyapp_version + "/source.zip'" \
  + " -OutFile 'source.zip'; " \
  + zip + " 'x' 'source.zip'; " \
  + "mv 'pyapp-v*' 'pyapp'; " \
  + "rm 'source.zip'"
zip := if os_family()=='windows' {require( '7z.exe' )} else {require( 'tar' )}
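
For the cross-platform TODO, one option is to do the download and unzip in Python instead of PowerShell. The following is a minimal sketch, not part of the Justfile above: `source_url`, `unpack_sources`, and `get_pyapp_sources` are hypothetical names, and it assumes (as the `mv 'pyapp-v*' 'pyapp'` step does) that the archive holds a single top-level `pyapp-v*` folder.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path


def source_url(version):
    """Release URL for the PyApp source archive, as used in the Justfile."""
    return f"https://github.com/ofek/pyapp/releases/download/v{version}/source.zip"


def unpack_sources(archive, dest):
    """Extract the archive and rename its top-level `pyapp-v*` folder to `dest`."""
    if dest.exists():
        return dest  # keep existing sources so the Cargo target dir stays warm
    staging = dest.parent / (dest.name + "-staging")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(staging)
    # Assumption: exactly one top-level folder; anything else raises ValueError.
    (extracted,) = staging.glob("pyapp-v*")
    shutil.move(str(extracted), str(dest))
    shutil.rmtree(staging)
    return dest


def get_pyapp_sources(version, dest=Path("pyapp")):
    """Download and unpack the sources unless `dest` already exists."""
    if dest.exists():
        return dest
    archive = dest.parent / "source.zip"
    urllib.request.urlretrieve(source_url(version), archive)
    try:
        return unpack_sources(archive, dest)
    finally:
        archive.unlink()
```

Because it only uses the standard library, this would avoid the `7z.exe`/`tar` split entirely, at the cost of needing a Python interpreter available before the build starts.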

@sandangel
Author

Thanks. I don’t use pyapp anymore because it is super slow to build. I've now switched to just ‘uv’, which cut the build time for the Docker image a lot. Going to close this issue.

@ofek ofek reopened this Feb 4, 2025
@ofek
Owner

ofek commented Feb 4, 2025

Thanks Blake, I am interested in improvements here!

@blakeNaccarato

blakeNaccarato commented Feb 4, 2025

@ofek you're welcome, and thank you for a relatively accessible way to build Python binaries! It's really quite a feat.

I don't know Rust, but your docs and some fiddling got me on the right track (hopefully). It seems the crux of it is that all the magic is done in that cargo build --release ... step, which as far as I can tell needs local pyapp sources, i.e. you can't specify the pyapp crate.

You're able to cargo install --force pyapp --root ... but this doesn't seem to benefit from intermediate build caching, and if you omit --force you sometimes get a stale pyapp.exe. If only you could say something like --force-last-stage-only.
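
One workaround for the stale-binary half of this could be to decide outside of Cargo when `--force` is needed. The sketch below is an assumption-laden illustration, not something PyApp or Cargo provides: `build_fingerprint`, `install_command`, and the stamp file are hypothetical. It does not give the last-stage-only behavior; it only avoids a forced rebuild when neither the wheel nor any `PYAPP_*` variable has changed.

```python
import hashlib
import os
from pathlib import Path


def build_fingerprint(wheel, env=None):
    """Hash the wheel contents together with every PYAPP_* variable."""
    env = os.environ if env is None else env
    digest = hashlib.sha256(wheel.read_bytes())
    for name in sorted(k for k in env if k.startswith("PYAPP_")):
        digest.update(f"{name}={env[name]}\n".encode())
    return digest.hexdigest()


def install_command(root, fingerprint, stamp):
    """Add --force only when the inputs differ from the last recorded build."""
    previous = stamp.read_text() if stamp.exists() else None
    command = ["cargo", "install", "pyapp", "--root", str(root)]
    if previous != fingerprint:
        command.append("--force")
    return command
```

After a successful install, the caller would write the new fingerprint to the stamp file so that the next run can skip `--force`.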

What paths are you interested in here?

  1. Pack all the details into the Hatch wrapper, including supporting faster builds via caching and all that (is that already supported?).
  2. A runnable utility, versioned alongside PyApp, that hides all the gunk in a tidy bundle (perhaps itself compiled by PyApp, lol...), taking in command line options and spitting out binaries, leveraging caching internally.
  3. A GitHub Action with inputs reflecting all the environment variables and nuances, or perhaps one that claims to handle only common cases.
  4. Documentation, tutorials, cookiecutters, etc. that help the user understand it all "under the hood" for arbitrary mashups. This could teach things like cargo-binstall and friends which I think could be composed to reduce build times? But I don't really have the lay of the land for the Rust ecosystem so I don't quite know for sure.

I like that (1) exists, and greatly appreciate the effort put in to make pyapp approachable without Rust knowledge as a prerequisite. My personal experience here is that I found myself bouncing off the nuances of Hatch environment orchestration: are the env vars being set like I think they are, how can I open the escape hatch when it goes wrong, etc. For me, attempting to pivot to Hatch for the pyapp benefit alone was not super approachable. But it's awesome for projects already all-in on Hatch.

A solution like (2) may be more general, and could even be used to simplify implementations like (3), but could be similarly opaque to (1). The philosophy of cibuildwheel is representative of this approach. Again this is just my experience and opinion after a week or so trying to grok it all.

But (3) alone would be an easier lift, though it only enables GHA caching specifically; see e.g. Hynek's Build and Inspect Python Package – baipp GitHub Action.

Option (4) aims to arm the reader with knowledge and flexibility of building manually, which would probably facilitate projects that already have some Rust code or are polyglot.

It seems the Hatch helper you've already got is the most mature "easier" implementation so far, and any caching niceties could probably fit there. Now that I've gone through the manual build gauntlet I could probably go back to the Hatch implementation and get it to work. 😅 It would be interesting if that ease of use could be isolated into its own utility package, distinct from Hatch!


3 participants