@neuralmagic

Neural Magic

Neural Magic (acquired by Red Hat) empowers developers to optimize and deploy LLMs at scale. Our model compression and acceleration enable top performance with vLLM.

Pinned

  1. nm-vllm-certs Public

    General information, model certifications, and benchmarks for nm-vllm enterprise distributions

    11 1

  2. deepsparse Public archive

    Sparsity-aware deep learning inference runtime for CPUs

    Python 3.2k 187

  3. sparseml Public archive

    Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

    Python 2.1k 157

  4. docs Public archive

    Top-level directory for documentation and general content

    MDX 121 7

  5. sparsezoo Public archive

    Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes

    Python 388 28

  6. guidellm Public

    Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

    Python 330 41

Repositories

Showing 10 of 72 repositories
  • yolov5 Public archive Forked from ultralytics/yolov5

    YOLOv5 in PyTorch > ONNX > CoreML > TFLite

    Python 19 GPL-3.0 17,151 0 0 Updated Jun 4, 2025
  • yolov3 Public archive Forked from ultralytics/yolov3

    YOLOv3 in PyTorch > ONNX > CoreML > TFLite

    Python 3 GPL-3.0 3,484 0 0 Updated Jun 4, 2025
  • pytest-nm-releng Public

    Pytest plugin used by the Release Engineering team

    Python 0 Apache-2.0 0 0 0 Updated Jun 4, 2025
  • compressed-tensors Public

    A safetensors extension to efficiently store sparse quantized tensors on disk

    Python 119 Apache-2.0 11 6 15 Updated Jun 4, 2025
  • vllm Public Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 13 Apache-2.0 7,871 0 9 Updated Jun 4, 2025
  • transformers Public archive Forked from huggingface/transformers

    🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.

    Python 9 Apache-2.0 29,691 0 0 Updated Jun 4, 2025
  • speculators Public
    Python 3 Apache-2.0 0 19 9 Updated Jun 4, 2025
  • guidellm Public

    Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

    Python 330 Apache-2.0 41 39 (3 issues need help) 13 Updated Jun 4, 2025
  • deepsparse Public archive

    Sparsity-aware deep learning inference runtime for CPUs

    Python 3,150 187 1 0 Updated Jun 2, 2025
  • sparsify Public archive

    ML model optimization product to accelerate inference.

    Python 325 Apache-2.0 30 1 0 Updated Jun 2, 2025