-
text-generation-webui Public
A Gradio web UI for Large Language Models with support for multiple inference backends.
-
exllamav2 Public
Forked from turboderp-org/exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs
-
llama-cpp-python-cuBLAS-wheels Public
Forked from jllllll/llama-cpp-python-cuBLAS-wheels
Wheels for llama-cpp-python compiled with cuBLAS support
-
flash-attention Public
Forked from jllllll/flash-attention
Fast and memory-efficient exact attention - Windows wheels
-
llama-cpp-python-basic Public
Forked from abetlen/llama-cpp-python
Python bindings for llama.cpp
-
AutoAWQ_kernels Public
Forked from casper-hansen/AutoAWQ_kernels
-
AutoAWQ Public
Forked from casper-hansen/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
-
GPTQ-for-LLaMa-CUDA Public
Forked from jllllll/GPTQ-for-LLaMa-CUDA
A combination of Oobabooga's fork and the main cuda branch of GPTQ-for-LLaMa in a package format.
-
AutoGPTQ Public
Forked from jllllll/AutoGPTQ
An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.
-
bitsandbytes Public
Forked from jllllll/bitsandbytes
8-bit CUDA functions for PyTorch
-
bitsandbytes-windows-webui Public
Forked from jllllll/bitsandbytes-windows-webui
Windows compile of bitsandbytes for use in text-generation-webui.
-
SillyTavern Public
Forked from SillyTavern/SillyTavern
LLM Frontend for Power Users.
-
quip-sharp Public
Forked from Cornell-RelaxML/quip-sharp
-
chatbot-ui Public
Forked from mckaywrigley/chatbot-ui
An open source ChatGPT UI.
-
BlockMerge_Gradient Public
Forked from Gryphe/BlockMerge_Gradient
Merge Transformers language models by use of gradient parameters.
-
whisper Public
Forked from openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
-
one-click-installers Public archive
Simplified installers for oobabooga/text-generation-webui.
-
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
-
llama.cpp Public
Forked from ggml-org/llama.cpp
Port of Facebook's LLaMA model in C/C++
-
audiocraft-infinity-webui Public
Forked from 1aienthusiast/audiocraft-infinity-webui
-
GPTQ-for-LLaMa Public
Forked from qwopqwop200/GPTQ-for-LLaMa
4 bits quantization of LLaMa using GPTQ
-
roop Public
Forked from s0md3v/roop
one-click deepfake (face swap)
-
GPTQ-for-LLaMa-Wheels Public
Forked from jllllll/GPTQ-for-LLaMa-Wheels
Precompiled Wheels for GPTQ-for-LLaMa