-
KAUST (King Abdullah University of Science and Technology)
- in/liangyu-wang-in
- @liangyuwang10
Pinned
-
Tiny-DeepSpeed
Public. Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library
-
Flash-Attention-Implementation
Public. Implementation of Flash-Attention (both forward and backward) with PyTorch, CUDA, and Triton
Python 2
-
Tiny-Megatron
Public. Tiny-Megatron, a minimalistic re-implementation of the Megatron library
Python 4
-
MetaProfiler
Public. MetaProfiler is a lightweight, structure-agnostic operator-level profiler for PyTorch models that leverages MetaTensor execution to simulate and benchmark individual ops without loading the full mo…
Python 1
-
simple_cuda_kernel
Public. A collection of ultra-simple yet high-performance CUDA kernels.
Python 1