File Limit Request: dashinfer - 300 MiB #5662

Open
laiwenzh opened this issue Feb 11, 2025 · 0 comments

Project URL

https://pypi.org/project/dashinfer/

Does this project already exist?

  • Yes

New Limit

300

Update issue title

  • I have updated the title.

Which indexes

PyPI

About the project

DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including CUDA, x86 and ARMv9.
We open-sourced DashInfer in April 2024 and recently updated it to v2.1.0, which ships prebuilt GPU code and therefore exceeds the default upload limit.

Project GitHub repository:
https://github.com/modelscope/dash-infer

Reasons for the request

Our project contains many CUDA kernels, and these kernels (not data) need to be compiled for multiple SM architectures. This results in a large shared library (.so) in our package; the wheels are currently about 289 MB.
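
For illustration only (a hypothetical build command, not taken from the project's actual build scripts): each additional `-gencode` target makes nvcc embed another compiled copy of every kernel into the resulting fat binary, so the shared library grows roughly in proportion to the number of supported SM versions.

```
# Hypothetical multi-architecture build: every -gencode entry adds
# another cubin for all kernels to the fat binary, inflating the .so
nvcc -shared -Xcompiler -fPIC kernels.cu -o libkernels.so \
     -gencode arch=compute_70,code=sm_70 \
     -gencode arch=compute_80,code=sm_80 \
     -gencode arch=compute_86,code=sm_86 \
     -gencode arch=compute_90,code=sm_90
```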

Code of Conduct

  • I agree to follow the PSF Code of Conduct