0.19.4

@r4victor r4victor released this 17 Apr 10:23
· 35 commits to master since this release
fb57f55

Services

Rate limits

You can now configure rate limits for your services running behind a gateway.

type: service
image: my-app:latest
port: 80

rate_limits:
# For /api/auth/* - 1 request per second, no bursts
- prefix: /api/auth/
  rps: 1
# For other URLs - 4 requests per second + bursts of up to 9 requests
- rps: 4
  burst: 9
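
To build intuition for how `rps` and `burst` interact, here is a minimal Python sketch of a token-bucket limiter applied to the configuration above. It is an illustration only, not the gateway's actual implementation. The `RateLimit` class and `pick_limit` helper are hypothetical names, and the sketch assumes that a rule without `prefix` applies to all paths, that the longest matching prefix wins, and that `burst` is the number of requests allowed in excess of the steady rate.

```python
class RateLimit:
    """Hypothetical token-bucket model of one rate_limits entry."""

    def __init__(self, rps, burst=0, prefix="/"):
        self.rps = rps
        self.burst = burst
        self.prefix = prefix  # assumption: a missing prefix matches every path
        self.capacity = 1 + burst  # assumption: burst is allowed on top of one request
        self.tokens = float(self.capacity)
        self.last = 0.0

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) is accepted."""
        # Refill tokens for the elapsed time, never exceeding the bucket capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rps)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def pick_limit(limits, path):
    """Pick the rule with the longest matching prefix (an assumed matching policy)."""
    matches = [limit for limit in limits if path.startswith(limit.prefix)]
    return max(matches, key=lambda limit: len(limit.prefix), default=None)


# Mirrors the two rules from the service configuration above.
limits = [
    RateLimit(rps=1, prefix="/api/auth/"),
    RateLimit(rps=4, burst=9),
]
```

Under these assumptions, a second request to `/api/auth/login` that arrives 100 ms after the first is rejected, while other URLs can absorb a short spike of requests before the 4 requests per second rate applies.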

Examples

TensorRT-LLM

We added a new TensorRT-LLM example that shows how to deploy both DeepSeek R1 and its distilled version using TensorRT-LLM and dstack.

Llama 4

The Llama example was updated to demonstrate the deployment of Llama 4 Scout using dstack.

Contributing

We continue to make contributing to dstack easier and to improve the developer experience. Since the last release, we moved from pip to uv in the CI and dev pipelines, which cut dependency installation time from ~70 seconds to less than 10 seconds. The Development guide was updated to show how to set up a dstack development environment with uv. The CI Build pipeline triggered on pull requests was optimized from 9 minutes to 4 minutes.

We also documented uv as one of the recommended installation options for dstack.

What's changed

New contributors

Full changelog: 0.19.3...0.19.4