Add NeuronxDistributedInference support, Speculative Decoding, Dynamic on-device sampling #16357

Merged
simon-mo merged 44 commits into vllm-project:main from aws-neuron:upstream-neuron-vllm-04-08
May 7, 2025

Commits

Commits on Apr 9, 2025

Commits on Apr 22, 2025

Commits on Apr 28, 2025