From 663275503e0d6cc176d10fde1dd63dba7a7aa37d Mon Sep 17 00:00:00 2001
From: arjkesh <33526713+arjkesh@users.noreply.github.com>
Date: Mon, 24 Mar 2025 11:59:26 -0700
Subject: [PATCH 1/3] Update sagemaker docs structure

---
 docs/sagemaker/_toctree.yml    |  8 +++-
 docs/sagemaker/reference.md    | 87 +---------------------------------
 docs/sagemaker/tgi.md          |  5 ++
 docs/sagemaker/transformers.md |  6 +++
 4 files changed, 20 insertions(+), 86 deletions(-)
 create mode 100644 docs/sagemaker/tgi.md
 create mode 100644 docs/sagemaker/transformers.md

diff --git a/docs/sagemaker/_toctree.yml b/docs/sagemaker/_toctree.yml
index d5464737d..c66be617c 100644
--- a/docs/sagemaker/_toctree.yml
+++ b/docs/sagemaker/_toctree.yml
@@ -7,4 +7,10 @@
 - local: inference
   title: Deploy models to Amazon SageMaker
 - local: reference
-  title: Reference
\ No newline at end of file
+  title: AWS Deep Learning Containers (DLCs)
+  isExpanded: true
+  sections:
+  - local: tgi
+    title: Text Generation Inference (TGI)
+  - local: transformers
+    title: Transformers
diff --git a/docs/sagemaker/reference.md b/docs/sagemaker/reference.md
index 42c5e5146..e2085721e 100644
--- a/docs/sagemaker/reference.md
+++ b/docs/sagemaker/reference.md
@@ -1,91 +1,8 @@
-# Reference
+# AWS Deep Learning Containers (DLCs)
 
 ## Deep Learning Container
 
-Below you can find a version table of currently available Hugging Face DLCs. The table doesn't include the full `image_uri` here are two examples on how to construct those if needed.
-
-**Manually construction the `image_uri`**
-
-`{dlc-aws-account-id}.dkr.ecr.{region}.amazonaws.com/huggingface-{framework}-{(training | inference)}:{framework-version}-transformers{transformers-version}-{device}-{python-version}-{device-tag}`
-
-- `dlc-aws-account-id`: The AWS account ID of the account that owns the ECR repository. You can find them in the [here](https://github.com/aws/sagemaker-python-sdk/blob/e0b9d38e1e3b48647a02af23c4be54980e53dc61/src/sagemaker/image_uri_config/huggingface.json#L21)
-- `region`: The AWS region where you want to use it.
-- `framework`: The framework you want to use, either `pytorch` or `tensorflow`.
-- `(training | inference)`: The training or inference mode.
-- `framework-version`: The version of the framework you want to use.
-- `transformers-version`: The version of the transformers library you want to use.
-- `device`: The device you want to use, either `cpu` or `gpu`.
-- `python-version`: The version of the python of the DLC.
-- `device-tag`: The device tag you want to use. The device tag can include os version and cuda version
-
-**Example 1: PyTorch Training:**
-`763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-training:1.6.0-transformers4.4.2-gpu-py36-cu110-ubuntu18.04`
-**Example 2: Tensorflow Inference:**
-`763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-tensorflow-inference:2.4.1-transformers4.6.1-cpu-py37-ubuntu18.04`
-
-## Training DLC Overview
-
-The Training DLC overview includes all released and available Hugging Face Training DLCs. It includes PyTorch and TensorFlow flavored
-versions for GPU.
-
-| 🤗 Transformers version | 🤗 Datasets version | PyTorch/TensorFlow version | type | device | Python Version |
-| ----------------------- | ------------------- | -------------------------- | -------- | ------ | -------------- |
-| 4.4.2 | 1.5.0 | PyTorch 1.6.0 | training | GPU | 3.6 |
-| 4.4.2 | 1.5.0 | TensorFlow 2.4.1 | training | GPU | 3.7 |
-| 4.5.0 | 1.5.0 | PyTorch 1.6.0 | training | GPU | 3.6 |
-| 4.5.0 | 1.5.0 | TensorFlow 2.4.1 | training | GPU | 3.7 |
-| 4.6.1 | 1.6.2 | PyTorch 1.6.0 | training | GPU | 3.6 |
-| 4.6.1 | 1.6.2 | PyTorch 1.7.1 | training | GPU | 3.6 |
-| 4.6.1 | 1.6.2 | TensorFlow 2.4.1 | training | GPU | 3.7 |
-| 4.10.2 | 1.11.0 | PyTorch 1.8.1 | training | GPU | 3.6 |
-| 4.10.2 | 1.11.0 | PyTorch 1.9.0 | training | GPU | 3.8 |
-| 4.10.2 | 1.11.0 | TensorFlow 2.4.1 | training | GPU | 3.7 |
-| 4.10.2 | 1.11.0 | TensorFlow 2.5.1 | training | GPU | 3.7 |
-| 4.11.0 | 1.12.1 | PyTorch 1.9.0 | training | GPU | 3.8 |
-| 4.11.0 | 1.12.1 | TensorFlow 2.5.1 | training | GPU | 3.7 |
-| 4.12.3 | 1.15.1 | PyTorch 1.9.1 | training | GPU | 3.8 |
-| 4.12.3 | 1.15.1 | TensorFlow 2.5.1 | training | GPU | 3.7 |
-| 4.17.0 | 1.18.4 | PyTorch 1.10.2 | training | GPU | 3.8 |
-| 4.17.0 | 1.18.4 | TensorFlow 2.6.3 | training | GPU | 3.8 |
-| 4.26.0 | 2.9.0 | PyTorch 1.13.1 | training | GPU | 3.9 |
-
-## Inference DLC Overview
-
-The Inference DLC overview includes all released and available Hugging Face Inference DLCs. It includes PyTorch and TensorFlow flavored
-versions for CPU, GPU & AWS Inferentia.
-
-
-| 🤗 Transformers version | PyTorch/TensorFlow version | type | device | Python Version |
-| ----------------------- | -------------------------- | --------- | ------ | -------------- |
-| 4.6.1 | PyTorch 1.7.1 | inference | CPU | 3.6 |
-| 4.6.1 | PyTorch 1.7.1 | inference | GPU | 3.6 |
-| 4.6.1 | TensorFlow 2.4.1 | inference | CPU | 3.7 |
-| 4.6.1 | TensorFlow 2.4.1 | inference | GPU | 3.7 |
-| 4.10.2 | PyTorch 1.8.1 | inference | GPU | 3.6 |
-| 4.10.2 | PyTorch 1.9.0 | inference | GPU | 3.8 |
-| 4.10.2 | TensorFlow 2.4.1 | inference | GPU | 3.7 |
-| 4.10.2 | TensorFlow 2.5.1 | inference | GPU | 3.7 |
-| 4.10.2 | PyTorch 1.8.1 | inference | CPU | 3.6 |
-| 4.10.2 | PyTorch 1.9.0 | inference | CPU | 3.8 |
-| 4.10.2 | TensorFlow 2.4.1 | inference | CPU | 3.7 |
-| 4.10.2 | TensorFlow 2.5.1 | inference | CPU | 3.7 |
-| 4.11.0 | PyTorch 1.9.0 | inference | GPU | 3.8 |
-| 4.11.0 | TensorFlow 2.5.1 | inference | GPU | 3.7 |
-| 4.11.0 | PyTorch 1.9.0 | inference | CPU | 3.8 |
-| 4.11.0 | TensorFlow 2.5.1 | inference | CPU | 3.7 |
-| 4.12.3 | PyTorch 1.9.1 | inference | GPU | 3.8 |
-| 4.12.3 | TensorFlow 2.5.1 | inference | GPU | 3.7 |
-| 4.12.3 | PyTorch 1.9.1 | inference | CPU | 3.8 |
-| 4.12.3 | TensorFlow 2.5.1 | inference | CPU | 3.7 |
-| 4.12.3 | PyTorch 1.9.1 | inference | Inferentia | 3.7 |
-| 4.17.0 | PyTorch 1.10.2 | inference | GPU | 3.8 |
-| 4.17.0 | TensorFlow 2.6.3 | inference | GPU | 3.8 |
-| 4.17.0 | PyTorch 1.10.2 | inference | CPU | 3.8 |
-| 4.17.0 | TensorFlow 2.6.3 | inference | CPU | 3.8 |
-| 4.26.0 | PyTorch 1.13.1 | inference | CPU | 3.9 |
-| 4.26.0 | PyTorch 1.13.1 | inference | GPU | 3.9 |
-
-
+There are several different types of AWS Deep Learning Containers. Feel free to explore further in the subheadings!
 
 ## Hugging Face Transformers Amazon SageMaker Examples
 
diff --git a/docs/sagemaker/tgi.md b/docs/sagemaker/tgi.md
new file mode 100644
index 000000000..0fa6ce8bb
--- /dev/null
+++ b/docs/sagemaker/tgi.md
@@ -0,0 +1,5 @@
+# Text Generation Inference (TGI) Images
+
+[TGI](https://huggingface.co/docs/text-generation-inference/en/index) is a toolkit for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and T5.
+
+Below, you can find a list of the latest available images for TGI for use on AWS SageMaker.
diff --git a/docs/sagemaker/transformers.md b/docs/sagemaker/transformers.md
new file mode 100644
index 000000000..1159ff706
--- /dev/null
+++ b/docs/sagemaker/transformers.md
@@ -0,0 +1,6 @@
+# Transformers Images
+
+[Transformers](https://huggingface.co/docs/transformers/en/index) provides APIs and tools to easily download and fine-tune state-of-the-art pretrained models, for use across NLP, computer vision, audio, and more.
+
+Below, we include a list of the latest images available on AWS, which come pre-packaged with transformers and [datasets](https://huggingface.co/docs/datasets/en/index) libraries for your convenience. Check out some of the tutorials in the reference section for more information!
+ 
\ No newline at end of file

From aab1ed9a237de709892852f42eb8f611dd9bdd8a Mon Sep 17 00:00:00 2001
From: arjkesh <33526713+arjkesh@users.noreply.github.com>
Date: Mon, 24 Mar 2025 12:05:04 -0700
Subject: [PATCH 2/3] Update tgi.md

---
 docs/sagemaker/tgi.md | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/docs/sagemaker/tgi.md b/docs/sagemaker/tgi.md
index 0fa6ce8bb..d594d3f16 100644
--- a/docs/sagemaker/tgi.md
+++ b/docs/sagemaker/tgi.md
@@ -3,3 +3,34 @@
 [TGI](https://huggingface.co/docs/text-generation-inference/en/index) is a toolkit for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and T5.
 
 Below, you can find a list of the latest available images for TGI for use on AWS SageMaker.
+
+To find the latest supported versions of the HF DLCs, check out https://aws.amazon.com/releasenotes/dlc-support-policy/
+
+
+## huggingface-pytorch-tgi-inference
+
+| Framework Version | Image Type | Image URI | Size (GB) | Pushed At | Details |
+| --- | --- | --- | --- | --- | --- |
+| 2.6 | gpu | `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.6.0-tgi3.1.1-gpu-py311-cu124-ubuntu22.04-v2.0` | 8.1 | 2025-03-17 16:47:39 | [Details](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-text-generation-inference-tgi-containers) |
+| 2.4 | gpu | `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.4.0-tgi3.0.1-gpu-py311-cu124-ubuntu22.04-v2.2` | 6.5 | 2025-03-06 18:28:24 | [Details](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-text-generation-inference-tgi-containers) |
+| 2.3 | gpu | `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0-gpu-py310-cu121-ubuntu22.04-v2.1` | 4.92 | 2024-10-04 21:59:12 | [Details](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-text-generation-inference-tgi-containers) |
+
+
+### SM Example
+```
+# create Hugging Face Model Class
+huggingface_model = HuggingFaceModel(
+    image_uri=get_huggingface_llm_image_uri("huggingface",version="2.6"),
+    env=,
+    role=,
+)
+
+# deploy model to SageMaker Inference
+predictor = huggingface_model.deploy(
+    initial_instance_count=1,
+    instance_type="ml.g6.48xlarge",
+    container_startup_health_check_timeout=2400,
+)
+```
+
+

From c08025ef8d8d316cd6669eb2e411ff78288c99a3 Mon Sep 17 00:00:00 2001
From: arjkesh <33526713+arjkesh@users.noreply.github.com>
Date: Mon, 24 Mar 2025 12:05:34 -0700
Subject: [PATCH 3/3] Update transformers.md

---
 docs/sagemaker/transformers.md | 60 +++++++++++++++++++++++++++++++++-
 1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/docs/sagemaker/transformers.md b/docs/sagemaker/transformers.md
index 1159ff706..1aba457bd 100644
--- a/docs/sagemaker/transformers.md
+++ b/docs/sagemaker/transformers.md
@@ -3,4 +3,62 @@
 [Transformers](https://huggingface.co/docs/transformers/en/index) provides APIs and tools to easily download and fine-tune state-of-the-art pretrained models, for use across NLP, computer vision, audio, and more.
 
 Below, we include a list of the latest images available on AWS, which come pre-packaged with transformers and [datasets](https://huggingface.co/docs/datasets/en/index) libraries for your convenience. Check out some of the tutorials in the reference section for more information!
- 
\ No newline at end of file
+
+ To find the latest supported versions of the HF DLCs, check out https://aws.amazon.com/releasenotes/dlc-support-policy/
+
+
+## huggingface-pytorch-training
+
+| Framework Version | Image Type | Image URI | Size (GB) | Pushed At | Details |
+| --- | --- | --- | --- | --- | --- |
+| 2.3 | gpu | `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-training:2.3.0-transformers4.48.0-gpu-py311-cu121-ubuntu20.04-v2.1` | 8.75 | 2025-03-14 13:15:19 | [Details](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-training-containers) |
+
+
+### SM Example
+```
+# create Hugging Face Model Class
+huggingface_model = HuggingFaceModel(
+    image_uri=get_huggingface_llm_image_uri("huggingface",version="2.3"),
+    env=,
+    role=,
+)
+
+# deploy model to SageMaker Inference
+predictor = huggingface_model.deploy(
+    initial_instance_count=1,
+    instance_type="ml.g6.48xlarge",
+    container_startup_health_check_timeout=2400,
+)
+```
+
+
+## huggingface-pytorch-inference
+
+| Framework Version | Image Type | Image URI | Size (GB) | Pushed At | Details |
+| --- | --- | --- | --- | --- | --- |
+| 2.3 | gpu | `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference:2.3.0-transformers4.48.0-gpu-py311-cu121-ubuntu22.04-v2.1` | 9.12 | 2025-03-03 18:16:45 | [Details](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-inference-containers) |
+| 2.3 | cpu | `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference:2.3.0-transformers4.48.0-cpu-py311-ubuntu22.04-v2.1` | 1.39 | 2025-03-03 18:04:16 | [Details](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-inference-containers) |
+
+
+### SM Example
+```
+# create Hugging Face Model Class
+huggingface_model = HuggingFaceModel(
+    image_uri=get_huggingface_llm_image_uri("huggingface",version="2.3"),
+    env=,
+    role=,
+)
+
+# deploy model to SageMaker Inference
+predictor = huggingface_model.deploy(
+    initial_instance_count=1,
+    instance_type="ml.g6.48xlarge",
+    container_startup_health_check_timeout=2400,
+)
+```
+
+
+
+
+
+
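
Review note on the image tables in patches 2 and 3: every listed URI is pinned to `us-west-2`, and the examples resolve the image through a helper rather than the listed URI. Below is a minimal sketch of splitting one of the listed URIs into its parts and re-targeting it at another region. The names `DlcImage`, `parse_image_uri`, and `with_region` are hypothetical helpers written for this note, not part of the SageMaker SDK, and the sketch assumes the same registry account ID serves the target region, which may not hold for every AWS region.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DlcImage:
    """Parts of an ECR image URI as listed in the tables of this patch series."""

    account_id: str
    region: str
    repository: str
    tag: str

    @property
    def uri(self) -> str:
        # Reassemble <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
        return (
            f"{self.account_id}.dkr.ecr.{self.region}.amazonaws.com/"
            f"{self.repository}:{self.tag}"
        )


def parse_image_uri(uri: str) -> DlcImage:
    """Split an ECR image URI into account, region, repository, and tag."""
    registry, _, rest = uri.partition("/")
    repository, _, tag = rest.partition(":")
    parts = registry.split(".")
    # Expected registry shape: <account>.dkr.ecr.<region>.amazonaws.com
    # (other DNS suffixes are deliberately rejected by this sketch).
    if len(parts) != 6 or parts[1:3] != ["dkr", "ecr"] or not repository or not tag:
        raise ValueError(f"not a recognised ECR image URI: {uri!r}")
    return DlcImage(account_id=parts[0], region=parts[3], repository=repository, tag=tag)


def with_region(uri: str, region: str) -> str:
    """Return the same image URI pointed at another region (assumption: same account ID)."""
    return replace(parse_image_uri(uri), region=region).uri


if __name__ == "__main__":
    listed = (
        "763104351884.dkr.ecr.us-west-2.amazonaws.com/"
        "huggingface-pytorch-tgi-inference:2.6.0-tgi3.1.1-gpu-py311-cu124-ubuntu22.04-v2.0"
    )
    print(with_region(listed, "us-east-1"))
```

A string built this way could be passed directly as `image_uri=` to `HuggingFaceModel`. That may be preferable in the Transformers sections, since `get_huggingface_llm_image_uri` appears to resolve the LLM (TGI) serving images rather than the training or Transformers inference images listed there. Separately, the `env=` and `role=` arguments in all three "SM Example" blocks are left empty, so the snippets are not valid Python as written; they need a dict of container environment variables and an IAM role ARN respectively.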