GPU not supported on Nvidia Jetson AGX with JetPack 5.1 #301
I think 8.7 was added to the Torch whitelist fairly late last year (NixOS/nixpkgs#249250) so I'm not sure what the status is for Torch 2.1.2. You could try exporting TORCH_CUDA_ARCH_LIST="8.7+PTX" to see if that makes a difference. Otherwise Torch 2.2 just released, so that might behave differently, though I haven't had a chance to test it yet.
Could you clarify what you mean by the requirements install removing the NVIDIA CUDA libs? It shouldn't affect those, and if you already have torch>=2.1.0 installed (which should match against 2.1.2+cuxxx too) in your (v)env, it shouldn't affect that install.
|
Hello,
First, thank you for the response and suggestions, I will certainly try the
export you suggest.
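For reference, the suggested export is a one-liner run before rebuilding; this is an untested sketch, and the assumption that 8.7 is the right SM version applies to the Orin-generation Jetson AGX:

```shell
# Tell torch's cpp_extension builder which compute capability to target
# instead of letting it auto-detect; "+PTX" also embeds forward-compatible
# PTX code. 8.7 is the Orin-generation Jetson GPU's SM version (assumption).
export TORCH_CUDA_ARCH_LIST="8.7+PTX"
echo "$TORCH_CUDA_ARCH_LIST"
# then rebuild from the repo root, e.g.: pip install .
```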
Second, I apologize for not being more precise in my original explanation.
It appears that the pip installation command I am using is failing on a
torch dependency issue on my Nvidia Jetson.
More detail:
1. I'm using a conda environment specifically for exllamav2
2. pip list command shows the Nvidia JetPack version of torch installed
prior to exllamav2 install:
tokenizers 0.15.1
torch 2.0.0a0+8aa34602.nv23.3
tqdm 4.66.1
transformers 4.37.1
typing_extensions 4.9.0
tzdata 2023.4
urllib3 2.1.0
websockets 12.0
wheel 0.41.2
(exllamav2) ***@***.***:~$
3. Here is the installation command I am using, which seems to most closely
match Nvidia JetPack 5.1 and the arch:
export TORCH_INSTALL=https://developer.download.nvidia.com/compute/redist/jp/v51/pytorch/torch-2.0.0a0+8aa34602.nv23.03-cp38-cp38-linux_aarch64.whl
python3 -m pip install --no-cache $TORCH_INSTALL
4. Here is the installation output and dependency error:
python3 -m pip install --no-cache $TORCH_INSTALL
Collecting torch==2.0.0a0+8aa34602.nv23.03
Downloading
https://developer.download.nvidia.com/compute/redist/jp/v51/pytorch/torch-2.0.0a0+8aa34602.nv23.03-cp38-cp38-linux_aarch64.whl
(167.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 167.2/167.2 MB 66.9 MB/s eta
0:00:00
Requirement already satisfied: filelock in
/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages (from
torch==2.0.0a0+8aa34602.nv23.03) (3.13.1)
Requirement already satisfied: networkx in
/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages (from
torch==2.0.0a0+8aa34602.nv23.03) (3.1)
Requirement already satisfied: sympy in
/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages (from
torch==2.0.0a0+8aa34602.nv23.03) (1.12)
Requirement already satisfied: typing-extensions in
/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages (from
torch==2.0.0a0+8aa34602.nv23.03) (4.9.0)
Requirement already satisfied: mpmath>=0.19 in
/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages (from
sympy->torch==2.0.0a0+8aa34602.nv23.03) (1.3.0)
Installing collected packages: torch
Attempting uninstall: torch
Found existing installation: torch 2.1.2
Uninstalling torch-2.1.2:
Successfully uninstalled torch-2.1.2
ERROR: pip's dependency resolver does not currently take into account all
the packages that are installed. This behaviour is the source of the
following dependency conflicts.
exllamav2 0.0.12 requires torch>=2.0.1, but you have torch
2.0.0a0+8aa34602.nv23.3 which is incompatible.
Successfully installed torch-2.0.0a0+8aa34602.nv23.3
Brief comments: First, my environment doesn't have torch 2.1.2 installed, so
I'm not sure where that check is coming from. But it does look like Nvidia's
latest version is shy of what exllamav2 requires?
Thank you for any commentary on this.
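For what it's worth, the conflict message is a version-ordering issue rather than a missing check: `2.0.0a0+8aa34602.nv23.3` is an alpha pre-release of 2.0.0 with a local build segment, and PEP 440 ordering puts it below the `torch>=2.0.1` floor that exllamav2 0.0.12 declares. A minimal sketch using the `packaging` library (which ships alongside pip/setuptools; the version strings are taken from the log above):

```python
from packaging.version import Version

# The Jetson wheel's version is an alpha pre-release of 2.0.0 with a local
# build segment; PEP 440 sorts it below even plain 2.0.0, so it can never
# satisfy a "torch>=2.0.1" requirement.
jetson = Version("2.0.0a0+8aa34602.nv23.3")
print(jetson < Version("2.0.0"))  # pre-releases sort before the final release
print(jetson < Version("2.0.1"))  # hence the resolver complaint
```

Both comparisons print True, which is why pip flags the Jetson wheel as incompatible even though it is functionally a 2.0-series build.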
|
Oh. If that's a special version of Torch you have to use with the Jetson, you'll probably want to remove it from the requirements (or just install ninja, sentencepiece, safetensors etc. manually, it's not that many packages). Otherwise it will try to "upgrade" you to >= 2.1.0 which might default to the non-CUDA package.
I haven't tested on 2.0.0, and especially not that particular version of 2.0.0, but in theory it should still have all the features exllama would need.
So what I'd try is:
git clone https://github.com/turboderp/exllamav2
cd exllamav2
pip install pandas ninja fastparquet safetensors sentencepiece pygments websockets regex numpy
pip install .
|
Appreciate the info. After removing the distro and the conda environment
and starting from scratch (including manually reinstalling and testing the
Jetson torch), I ran `pip install .` and received lots of diagnostics, with
this as the primary output:
Building wheels for collected packages: exllamav2
Building wheel for exllamav2 (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [385 lines of output]
Version: 0.0.12
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching 'dni_*' found anywhere in distribution
/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/command/build_py.py:204: _Warning: Package 'exllamav2.exllamav2_ext' is absent from the `packages` configuration.
... more, more, more ... and then:
Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "<pip-setuptools-caller>", line 34, in <module>
  File "/ssd/exllamav2/setup.py", line 76, in <module>
    setup(
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/__init__.py", line 103, in setup
    return distutils.core.setup(**attrs)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
    self.run_command(cmd)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
    super().run_command(command)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 364, in run
    self.run_command("build")
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
    super().run_command(command)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
    self.run_command(cmd_name)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
    super().run_command(command)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 88, in run
    _build_ext.run(self)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
    self.build_extensions()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
    build_ext.build_extensions(self)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
    self._build_extensions_serial()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 249, in build_extension
    _build_ext.build_extension(self, ext)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
    objects = self.compiler.compile(
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 649, in unix_wrap_ninja_compile
    cuda_post_cflags = unix_cuda_flags(cuda_post_cflags)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 548, in unix_cuda_flags
    cflags + _get_cuda_arch_flags(cflags))
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1786, in _get_cuda_arch_flags
    raise ValueError(f"Unknown CUDA arch ({arch}) or GPU not supported")
ValueError: Unknown CUDA arch (8.7+PTX) or GPU not supported
[end of output]
note: This error originates from a subprocess, and is likely not a
problem with pip.
ERROR: Failed building wheel for exllamav2
Running setup.py clean for exllamav2
Failed to build exllamav2
ERROR: Could not build wheels for exllamav2, which is required to install
pyproject.toml-based projects
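The ValueError at the bottom of that traceback comes from a hard whitelist: torch's `_get_cuda_arch_flags` rejects any arch string it doesn't recognize, and this Jetson torch 2.0.0 build evidently predates 8.7 being added to the list. Here is a simplified, hypothetical sketch of that check, not PyTorch's actual code (the real list is much longer and varies by torch version; the two entries below are illustrative only):

```python
# Hypothetical shortened whitelist; the real one lives in
# torch/utils/cpp_extension.py and depends on the torch build.
SUPPORTED_ARCHES = ["8.0", "8.6"]

def get_cuda_arch_flags(arch_list):
    """Turn a TORCH_CUDA_ARCH_LIST-style string into nvcc -gencode flags,
    rejecting arches absent from the whitelist (as the traceback shows)."""
    flags = []
    for arch in arch_list.split(";"):
        base = arch.replace("+PTX", "")
        if base not in SUPPORTED_ARCHES:
            raise ValueError(f"Unknown CUDA arch ({arch}) or GPU not supported")
        num = base.replace(".", "")
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if arch.endswith("+PTX"):  # also embed forward-compatible PTX
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags

print(get_cuda_arch_flags("8.6"))      # known arch -> gencode flags
try:
    get_cuda_arch_flags("8.7+PTX")     # unlisted arch -> the error in the log
except ValueError as e:
    print(e)
```

So exporting TORCH_CUDA_ARCH_LIST only helps if the installed torch build actually knows the arch; with a build whose whitelist lacks 8.7, the same error fires regardless.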
|
Hello,
I'm wondering if anyone has been able to get exllamav2 to work with the Jetson AGX? The requirements install removes the Nvidia CUDA libs and installs a base torch-2.1.2. Unfortunately, that won't work with the Jetson AGX. Here is my run output:
`python test_inference.py -m /ssd/llama-2/llama-2-7b-chat-hf -p "Once upon a time,"
Traceback (most recent call last):
  File "test_inference.py", line 2, in <module>
    from exllamav2 import(
  File "/ssd/exllamav2/exllamav2/__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "/ssd/exllamav2/exllamav2/model.py", line 16, in <module>
    from exllamav2.config import ExLlamaV2Config
  File "/ssd/exllamav2/exllamav2/config.py", line 2, in <module>
    from exllamav2.fasttensors import STFile
  File "/ssd/exllamav2/exllamav2/fasttensors.py", line 5, in <module>
    from exllamav2.ext import exllamav2_ext as ext_c
  File "/ssd/exllamav2/exllamav2/ext.py", line 142, in <module>
    exllamav2_ext = load
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1611, in _write_ninja_file_and_build_library
    _write_ninja_file_to_build_library(
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 2007, in _write_ninja_file_to_build_library
    cuda_flags = common_cflags + COMMON_NVCC_FLAGS + _get_cuda_arch_flags()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1786, in _get_cuda_arch_flags
    raise ValueError(f"Unknown CUDA arch ({arch}) or GPU not supported")
ValueError: Unknown CUDA arch (8.7+PTX) or GPU not supported`
FWIW here is the latest (as of this post) Nvidia torch build for the AGX: 2.0.0a0+8aa34602.nv23.03.
Hoping someone has a workaround. Thank you.