
Add a page on parallel and async tools. #323

Merged · 11 commits · Mar 25, 2024
Conversation

samcunliffe
Member

Solves #178.

No async yet, but there are a couple of projects using async so I'll ask around for opinions.

@samcunliffe added the `documentation` (Improvements or additions to documentation) and `enhancement` (New feature or request) labels Mar 23, 2024
@samcunliffe force-pushed the sc/178-parallel-async branch from 10d0fa4 to 8d623a9 March 23, 2024 12:07
@samcunliffe linked an issue Mar 23, 2024 that may be closed by this pull request
@samcunliffe added the `website` (Related to https://github-pages.arc.ucl.ac.uk/python-tooling) label Mar 24, 2024
Member

@paddyroddy paddyroddy left a comment


I understand the philosophy behind the traffic light system, but one can easily use e.g. `numba` and `multiprocess` in conjunction, so I'm not sure it makes too much sense.
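(A minimal sketch of what "in conjunction" could look like, with a hypothetical `slow_sum` kernel; the `numba` import is guarded because it may not be installed, in which case the kernel just runs as plain Python:)

```python
from multiprocessing import Pool

try:
    from numba import njit  # optional JIT compiler
except ImportError:
    def njit(fn):  # no-op fallback so the sketch still runs without numba
        return fn

@njit
def slow_sum(n):
    # CPU-bound kernel; numba compiles the loop to machine code.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    # Process-based parallelism over independent inputs; each worker
    # process calls the (possibly JIT-compiled) kernel.
    with Pool(processes=2) as pool:
        results = pool.map(slow_sum, [10_000, 20_000])
    print(results)
```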

Collaborator

@matt-graham matt-graham left a comment


Thanks @samcunliffe, this looks good. I've made some suggestions about adding a bit more detail on the different parallelism models in Python, and briefly mentioning the global interpreter lock to explain why we can't just use the threading module. I don't know if it's worth specifically adding a separate section on libraries with support for running code on accelerator devices like GPUs? Of the tools already mentioned, JAX and Numba I think have built-in support for running code on GPUs. Other options similar to JAX would be CuPy, TensorFlow and PyTorch, and there are also options like pycuda, cuda-python and pyopencl.

Member

@dstansby dstansby left a comment


Maybe this could be split into thread/process-based parallelism (multiprocess, multiprocessing, dask, mpi4py I think?) and (JIT) compiler-based parallelism (numba, jax, Cython)?

@samcunliffe

This comment was marked as resolved.

@matt-graham

This comment was marked as resolved.

Co-authored-by: David Stansby <dstansby@gmail.com>
@paddyroddy

This comment was marked as resolved.

@paddyroddy
Member

Worth mentioning https://docs.python.org/3/library/concurrent.futures.html too
@samcunliffe
Member Author

@paddyroddy

Worth mentioning https://docs.python.org/3/library/concurrent.futures.html too

I'm gonna need a traffic light colour for that, old bean.

@paddyroddy
Member

I'm gonna need a traffic light colour for that, old bean.

I'm a fan! 🎐

@paddyroddy
Member

@matt-graham GIL may go in the future https://peps.python.org/pep-0703

@matt-graham
Collaborator

matt-graham commented Mar 25, 2024

@matt-graham GIL may go in the future https://peps.python.org/pep-0703

Yep I linked to that PEP in my suggested change above 😉

EDIT: I may have misunderstood your comment; was the point that it's not certain, and so not worth linking to?
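(For reference, a small sketch of how a user could check whether they are on a free-threaded PEP 703 build; `sys._is_gil_enabled()` only exists on Python 3.13+, so it's guarded with `getattr`:)

```python
import sys
import sysconfig

# Py_GIL_DISABLED is set for free-threaded (PEP 703) CPython builds;
# on a standard build it is 0 or absent.
free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

# sys._is_gil_enabled() was added in Python 3.13; fall back to True
# (the GIL is always on) for older interpreters.
gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()

print(f"free-threaded build: {free_threaded_build}, GIL enabled: {gil_enabled}")
```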

Co-authored-by: David Stansby <dstansby@gmail.com>
@paddyroddy
Member

Yep I linked to that PEP in my suggested change above 😉

EDIT: I may have misunderstood your comment; was the point that it's not certain, and so not worth linking to?

To be honest, it is so many words that my brain skipped some of it... I feel like it goes into a lot of detail that won't be relevant for most users. However, perhaps it could be useful in a `<details>` block?

But yes, it may also fall out of date with the Python ecosystem, so it may just be worth linking to the PEP?

@samcunliffe
Member Author

I guess we might then have to recommend a specific tool for compiler-based parallelism, which might require some thought / discussion though.

Left them as all-amber for now. I'd be inclined to factor out the discussion of our favorites to either a new ticket, or to an in-person meetup.

samcunliffe and others added 3 commits March 25, 2024 11:15
Lines added to the blurb, hacked from Matt's suggestions in #323. More
details about the GIL and PEP 703.

Co-authored-by: Matt Graham <matthew.m.graham@gmail.com>
@paddyroddy
Member

Worth mentioning docs.python.org/3/library/concurrent.futures.html too

Part of standard library too
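(A minimal sketch of what recommending `concurrent.futures` could look like, with a hypothetical I/O-bound stand-in task:)

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_length(word):
    # Stand-in for an I/O-bound task (e.g. a web request); threads work
    # well here because the GIL is released while waiting on I/O.
    return len(word)

# Executor.map keeps input order, regardless of which thread finishes first.
with ThreadPoolExecutor(max_workers=4) as pool:
    lengths = list(pool.map(fetch_length, ["parallel", "async", "tools"]))

print(lengths)  # [8, 5, 5]
```

Swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` gives process-based parallelism with the same interface, which is part of why this module is worth mentioning.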

Co-authored-by: David Stansby <dstansby@gmail.com>
@matt-graham
Collaborator

To be honest, it is so many words that my brain skipped some of it...

I wouldn't say ~150 words is all that many, and irrespective of that, I feel it's not unreasonable to expect you to read things you comment on before commenting on them, even if you find them overly wordy 🙁. I'm not trying to make a big deal of this, but comments like the above do make it less appealing to contribute, and I would say we should keep comments kind and constructive.

I feel like it goes into a lot of detail that won't be relevant for most users. However, perhaps it could be useful in a `<details>` block?

But yes, it may also fall out of date with the Python ecosystem, so it may just be worth linking to the PEP?

Even if/when PEP 703 is fully implemented, the GIL will be optional but not removed, so a lot of the mentioned details will still be relevant. I think having some explanatory comments is useful: while we want to keep things simple for users, wanting to exploit parallelism / asynchronicity is a relatively advanced use case and an inherently complex topic, and I think trying to hide away the complexity doesn't necessarily help. Putting some extra detail in a `<details>` block would, I think, be a good approach.

@paddyroddy
Member

I wouldn't say ~150 words is all that many, and irrespective of that, I feel it's not unreasonable to expect you to read things you comment on before commenting on them, even if you find them overly wordy 🙁. I'm not trying to make a big deal of this, but comments like the above do make it less appealing to contribute, and I would say we should keep comments kind and constructive.

I did read them... What I'm saying is that that's why I missed that you'd already linked to the issue. We all have different reading and interpreting styles.

@samcunliffe
Member Author

To stop this PR snowballing, I'm factoring out the asynchronous stuff to

@samcunliffe merged commit 036c1e9 into main Mar 25, 2024 (15 checks passed)
@samcunliffe deleted the sc/178-parallel-async branch March 25, 2024 15:47
samcunliffe added a commit that referenced this pull request Mar 25, 2024
The first couple of async tools. I don't have tonnes of experience with
async stuff so I'm probably not the best person to decide the 🚦 s.

We use `asyncio` for [INFORMus](https://github.com/inform-us/INFORMus/)
(and probably a fair few other web APIs??) and @paddyroddy likes
`futures`.
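(For example, a minimal `asyncio` sketch with hypothetical coroutine names, where two I/O-bound calls run concurrently:)

```python
import asyncio

async def fetch(name, delay):
    # Stand-in for an I/O-bound call (e.g. an HTTP request to a web API).
    await asyncio.sleep(delay)
    return name

async def main():
    # gather runs both coroutines concurrently on one event loop, so the
    # total wait is roughly the slowest delay, not the sum of the two.
    return await asyncio.gather(fetch("a", 0.01), fetch("b", 0.02))

results = asyncio.run(main())
print(results)  # ['a', 'b']
```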

## Continues

- #323 
- #178

---------

Co-authored-by: Matt Graham <matthew.m.graham@gmail.com>
Linked issue: Add a page on tools for parallel programming

4 participants