Project URL
https://pypi.org/project/dashinfer/
Does this project already exist?
New Limit
300
Which indexes
PyPI
About the project
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including CUDA, x86 and ARMv9.
We open-sourced DashInfer in April 2024 and recently updated it to v2.1.0, which ships prebuilt GPU code and exceeds the current upload limit.
Project github repository:
https://github.com/modelscope/dash-infer
Reasons for the request
Our project contains many CUDA kernels, and these kernels (not data) must be compiled for multiple SM architectures. This results in a large shared library (.so) inside our package; the wheels are currently about 289 MB.
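To illustrate why multi-SM builds inflate the binary: each target architecture adds another embedded cubin to the fat binary, so the shared library grows roughly linearly with the number of SMs. A minimal sketch (the file name and architecture list are hypothetical, not DashInfer's actual build flags):

```shell
# Hypothetical example: compile one kernel file for several SM architectures.
# Every -gencode pair embeds an additional cubin into the fat binary,
# so the resulting .so grows with each architecture added.
nvcc -shared -Xcompiler -fPIC \
     -gencode arch=compute_70,code=sm_70 \
     -gencode arch=compute_80,code=sm_80 \
     -gencode arch=compute_86,code=sm_86 \
     -gencode arch=compute_90,code=sm_90 \
     kernels.cu -o libkernels.so
```

The architectures embedded in the resulting library can be listed with `cuobjdump --list-elf libkernels.so`.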
Code of Conduct
I agree to follow the PSF Code of Conduct