
RuntimeError: cuDNN error: CUDNN_STATUS_SUBLIBRARY_VERSION_MISMATCH on WSL2 Ubuntu24.04 #282

Open
cospotato opened this issue Dec 7, 2024 · 10 comments

Comments

@cospotato

Hi, I am new to deep learning. The project works on Windows with CUDA 12.5 and cuDNN 9.3.0. Then I tried to run it on WSL2 (Ubuntu 24.04) with the configuration below and got the error RuntimeError: cuDNN error: CUDNN_STATUS_SUBLIBRARY_VERSION_MISMATCH. What am I missing?

OS: WSL2 Ubuntu 24.04
Kernel: Linux cospotato 5.15.167.4-microsoft-standard-WSL2 #1 SMP Tue Nov 5 00:21:55 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
PyTorch version: 2.5.1
CUDA version: 12.6
cuDNN version: 9.3.0

Traceback:

Traceback (most recent call last):
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/diarize.py", line 199, in <module>
    msdd_model = NeuralDiarizer(cfg=create_config(temp_path)).to(args.device)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/nemo/collections/asr/models/msdd_models.py", line 994, in __init__
    self._init_msdd_model(cfg)
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/nemo/collections/asr/models/msdd_models.py", line 1096, in _init_msdd_model
    self.msdd_model = EncDecDiarLabelModel.from_pretrained(model_name=model_path, map_location=cfg.device)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/nemo/core/classes/common.py", line 754, in from_pretrained
    instance = class_.restore_from(
               ^^^^^^^^^^^^^^^^^^^^
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/nemo/core/classes/modelPT.py", line 464, in restore_from
    instance = cls._save_restore_connector.restore_from(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/nemo/core/connectors/save_restore_connector.py", line 255, in restore_from
    loaded_params = self.load_config_and_state_dict(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/nemo/core/connectors/save_restore_connector.py", line 179, in load_config_and_state_dict
    instance = instance.to(map_location)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/lightning_fabric/utilities/device_dtype_mixin.py", line 55, in to
    return super().to(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1340, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/torch/nn/modules/rnn.py", line 288, in _apply
    self._init_flat_weights()
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/torch/nn/modules/rnn.py", line 215, in _init_flat_weights
    self.flatten_parameters()
  File "/home/cospotato/repo/github.com/MahmoudAshraf97/whisper-diarization/.venv/lib/python3.12/site-packages/torch/nn/modules/rnn.py", line 269, in flatten_parameters
    torch._cudnn_rnn_flatten_weight(
RuntimeError: cuDNN error: CUDNN_STATUS_SUBLIBRARY_VERSION_MISMATCH
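
One way to narrow this down (a minimal diagnostic sketch, not part of the original report; it assumes PyTorch was installed from the standard pip wheels) is to print the CUDA and cuDNN versions that torch itself reports and compare them against the system-wide installation:

    import torch

    # Versions the installed torch build was compiled against / loads at import time.
    print("torch:", torch.__version__)
    print("CUDA runtime torch was built with:", torch.version.cuda)
    print("cuDNN version torch reports:", torch.backends.cudnn.version())
    print("CUDA available:", torch.cuda.is_available())

If the cuDNN version reported here differs from the system cuDNN (9.3.0 above), two different cuDNN copies are being mixed at runtime.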
@cospotato changed the title from "RuntimeError: cuDNN error: CUDNN_STATUS_SUBLIBRARY_VERSION_MISMATCH in WSL2 Ubuntu24.04" to "RuntimeError: cuDNN error: CUDNN_STATUS_SUBLIBRARY_VERSION_MISMATCH on WSL2 Ubuntu24.04" on Dec 7, 2024
@cospotato
Author

cospotato commented Dec 7, 2024

Additional: if I run the NeMo MSDD diarization model section alone, it works. Maybe NeMo conflicts with Whisper?

@DrJPK

DrJPK commented Dec 12, 2024

@cospotato did you manage to work this out? I am having exactly the same issue, RuntimeError: cuDNN error: CUDNN_STATUS_SUBLIBRARY_VERSION_MISMATCH, on a bare-metal RHEL 9 server.

System
OS: Red Hat Enterprise Linux release 9.4 (Plow)
Kernel: 5.14.0-427.35.1.el9_4.x86_64
GPU: Nvidia A30 24GB
CUDA: 12.4.r12.4/compiler.34097967_0
cuDNN: 9.6.0.74
Python: Python 3.12.1 running in venv
torch: 2.5.1

Traceback:

Traceback (most recent call last):
  File "/srv/whisperAI/whisper-diarization/diarize.py", line 202, in <module>
    msdd_model = NeuralDiarizer(cfg=create_config(temp_path)).to(args.device)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/nemo/collections/asr/models/msdd_models.py", line 994, in __init__
    self._init_msdd_model(cfg)
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/nemo/collections/asr/models/msdd_models.py", line 1096, in _init_msdd_model
    self.msdd_model = EncDecDiarLabelModel.from_pretrained(model_name=model_path, map_location=cfg.device)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/nemo/core/classes/common.py", line 754, in from_pretrained
    instance = class_.restore_from(
               ^^^^^^^^^^^^^^^^^^^^
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/nemo/core/classes/modelPT.py", line 464, in restore_from
    instance = cls._save_restore_connector.restore_from(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/nemo/core/connectors/save_restore_connector.py", line 255, in restore_from
    loaded_params = self.load_config_and_state_dict(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/nemo/core/connectors/save_restore_connector.py", line 179, in load_config_and_state_dict
    instance = instance.to(map_location)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/lightning_fabric/utilities/device_dtype_mixin.py", line 55, in to
    return super().to(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/torch/nn/modules/module.py", line 1340, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/torch/nn/modules/rnn.py", line 288, in _apply
    self._init_flat_weights()
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/torch/nn/modules/rnn.py", line 215, in _init_flat_weights
    self.flatten_parameters()
  File "/srv/whisperAI/whisper-diarization/.venv/lib64/python3.12/site-packages/torch/nn/modules/rnn.py", line 269, in flatten_parameters
    torch._cudnn_rnn_flatten_weight(
RuntimeError: cuDNN error: CUDNN_STATUS_SUBLIBRARY_VERSION_MISMATCH

Clearly this is a CUDA issue, but I cannot work out what is going on. I assume it is a PyTorch thing.
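
One rough check, sketched here under the assumption that torch came from pip wheels rather than a system-wide install: list the NVIDIA runtime wheels present in the virtual environment and compare their cuDNN version against the system cuDNN 9.6.0.74 noted above.

    from importlib.metadata import distributions

    # List the NVIDIA runtime wheels installed alongside torch. A
    # nvidia-cudnn-cu12 wheel whose version differs from the system cuDNN is
    # a common trigger for CUDNN_STATUS_SUBLIBRARY_VERSION_MISMATCH.
    for dist in distributions():
        name = (dist.metadata["Name"] or "").lower()
        if name.startswith("nvidia-"):
            print(f"{name}=={dist.version}")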

@DrJPK

DrJPK commented Dec 12, 2024

OK, quick update: diarize.py -a audio.MP3 is still causing the issue above. HOWEVER, diarize_parallel.py -a audio.MP3 runs and transcribes the audio to text and SRT with a good level of activity, BUT the speaker identification does not work. I don't know if that helps or confuses things, but I thought I would share it.

@DrJPK

DrJPK commented Dec 12, 2024

EDIT: I think the post below is actually just a set of warnings and is unrelated to diarize.py not running on Linux.

@cospotato just out of interest, did you get a warning directly before this error when calling diarize.py, about tarfile.py:2252 no longer allowing absolute paths?

[NeMo W 2024-12-12 15:17:10 nemo_logging:393] /usr/lib64/python3.12/tarfile.py:2252: RuntimeWarning: The default behavior of tarfile extraction has been changed to disallow common exploits (including CVE-2007-4559). By default, absolute/parent paths are disallowed and some mode bits are cleared. See https://access.redhat.com/articles/7004769 for more details.
      warnings.warn(

[NeMo W 2024-12-12 15:17:11 nemo_logging:393] If you intend to do training or fine-tuning, please call the ModelPT.setup_training_data() method and provide a valid configuration file to setup the train data loader.
    Train config :
    manifest_filepath: null
    emb_dir: null
    sample_rate: 16000
    num_spks: 2
    soft_label_thres: 0.5
    labels: null
    batch_size: 15
    emb_batch_size: 0
    shuffle: true

[NeMo W 2024-12-12 15:17:11 nemo_logging:393] If you intend to do validation, please call the ModelPT.setup_validation_data() or ModelPT.setup_multiple_validation_data() method and provide a valid configuration file to setup the validation data loader(s).
    Validation config :
    manifest_filepath: null
    emb_dir: null
    sample_rate: 16000
    num_spks: 2
    soft_label_thres: 0.5
    labels: null
    batch_size: 15
    emb_batch_size: 0
    shuffle: false

[NeMo W 2024-12-12 15:17:11 nemo_logging:393] Please call the ModelPT.setup_test_data() or ModelPT.setup_multiple_test_data() method and provide a valid configuration file to setup the test data loader(s).
    Test config :
    manifest_filepath: null
    emb_dir: null
    sample_rate: 16000
    num_spks: 2
    soft_label_thres: 0.5
    labels: null
    batch_size: 15
    emb_batch_size: 0
    shuffle: false
    seq_eval_mode: false

@sadathknorket

Same issue here, is this resolved? @DrJPK @cospotato

@DrJPK

DrJPK commented Dec 12, 2024

@sadathknorket not resolved, but for some reason that I can't quite explain, the diarize_parallel.py script runs without this error for me. Unfortunately, that parallel script seems to label everything as speaker 0, so it is not working perfectly, but it is transcribing and completing. I'm thinking something upstream in NeMo has changed and is causing this issue.

@juntatalor

juntatalor commented Jan 20, 2025

Hi there. I faced the same issue (not WSL, but standalone Ubuntu 24.04). Inside a conda environment:

pip install -U nvidia-cuda-runtime-cu12 nvidia-cudnn-cu12

pip throws a dependency error for PyTorch:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. torch 2.5.1 requires nvidia-cuda-runtime-cu12==12.4.127; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-cuda-runtime-cu12 12.6.77 which is incompatible. torch 2.5.1 requires nvidia-cudnn-cu12==9.1.0.70; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-cudnn-cu12 9.6.0.74 which is incompatible.

... but the packages install successfully, and diarize.py no longer throws the CUDNN_STATUS_SUBLIBRARY_VERSION_MISMATCH exception.
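
After reinstalling the wheels, a quick way to confirm the mismatch is actually gone is to exercise the same cuDNN RNN path that the traceback fails in; this is a minimal sketch, not part of the project:

    import torch

    # The original error is raised from torch._cudnn_rnn_flatten_weight, which
    # cuDNN-backed RNNs call when they are moved to the GPU. If this runs
    # cleanly, the cuDNN sub-library mismatch is resolved.
    rnn = torch.nn.LSTM(input_size=16, hidden_size=16).cuda()  # triggers flatten_parameters()
    out, _ = rnn(torch.randn(8, 2, 16, device="cuda"))
    print("cuDNN RNN OK, cuDNN version:", torch.backends.cudnn.version())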

@eq-gdekhayser

I can verify I had the SAME issue, applied the "fix" above (reinstalling the NVIDIA wheels), got the SAME pip resolver error but the same successful install, and the cuDNN mismatch was resolved. Very weird, but all's well that ends well.

@Investroj

Is there no solution for this yet?

@MahmoudAshraf97
Owner

This is not an issue that will be solved in this project; you just need to configure all your CUDA libraries correctly, which can be hard.
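
For anyone attempting that configuration check, one hedged sketch is to compare the cuDNN build torch loads with whatever libcudnn the system's dynamic linker would pick up (cudnnGetVersion is part of the public cuDNN API; the rest is plain ctypes plumbing and assumes a Linux system with ldconfig available):

    import ctypes
    import ctypes.util
    import torch

    # cuDNN that the installed torch build reports:
    print("torch reports cuDNN:", torch.backends.cudnn.version())

    # cuDNN the system's dynamic linker would find (possibly a different copy):
    path = ctypes.util.find_library("cudnn")
    if path:
        libcudnn = ctypes.CDLL(path)
        libcudnn.cudnnGetVersion.restype = ctypes.c_size_t
        print("system libcudnn:", path, "->", libcudnn.cudnnGetVersion())
    else:
        print("no system-wide libcudnn found on the default search path")

If the two numbers disagree, installing the nvidia-cudnn-cu12 wheel version that torch declares in its requirements, or removing the stray system copy from the library path, is one way to resolve the mismatch.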
