huggingface · arjkesh · Mar 24, 2025
diff --git a/docs/sagemaker/_toctree.yml b/docs/sagemaker/_toctree.yml
@@ -7,4 +7,10 @@
 - local: inference
   title: Deploy models to Amazon SageMaker
 - local: reference
-  title: Reference
+  title: AWS Deep Learning Containers (DLCs)
+  isExpanded: true
+  sections:
+  - local: tgi
+    title: Text Generation Inference (TGI)
+  - local: transformers
+    title: Transformers
diff --git a/docs/sagemaker/reference.md b/docs/sagemaker/reference.md
@@ -1,91 +1,8 @@
-# Reference
+# AWS Deep Learning Containers (DLCs)
 
 ## Deep Learning Container
 
-Below you can find a version table of currently available Hugging Face DLCs. The table doesn't include the full `image_uri` here are two examples on how to construct those if needed.
-
-**Manually construction the `image_uri`**
-
-`{dlc-aws-account-id}.dkr.ecr.{region}.amazonaws.com/huggingface-{framework}-{(training | inference)}:{framework-version}-transformers{transformers-version}-{device}-{python-version}-{device-tag}`
-
-- `dlc-aws-account-id`: The AWS account ID of the account that owns the ECR repository. You can find them in the [here](https://github.com/aws/sagemaker-python-sdk/blob/e0b9d38e1e3b48647a02af23c4be54980e53dc61/src/sagemaker/image_uri_config/huggingface.json#L21)
-- `region`: The AWS region where you want to use it.
-- `framework`: The framework you want to use, either `pytorch` or `tensorflow`.
-- `(training | inference)`: The training or inference mode.
-- `framework-version`: The version of the framework you want to use.
-- `transformers-version`: The version of the transformers library you want to use.
-- `device`: The device you want to use, either `cpu` or `gpu`.
-- `python-version`: The version of the python of the DLC.
-- `device-tag`: The device tag you want to use. The device tag can include os version and cuda version
-
-**Example 1: PyTorch Training:**
-`763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-training:1.6.0-transformers4.4.2-gpu-py36-cu110-ubuntu18.04`
-**Example 2: Tensorflow Inference:**
-`763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-tensorflow-inference:2.4.1-transformers4.6.1-cpu-py37-ubuntu18.04`
-
-## Training DLC Overview
-
-The Training DLC overview includes all released and available Hugging Face Training DLCs. It includes PyTorch and TensorFlow flavored 
-versions for GPU.
-
-| 🤗 Transformers version | 🤗 Datasets version | PyTorch/TensorFlow version | type     | device | Python Version |
-| ----------------------- | ------------------- | -------------------------- | -------- | ------ | -------------- |
-| 4.4.2                   | 1.5.0               | PyTorch 1.6.0              | training | GPU    | 3.6            |
-| 4.4.2                   | 1.5.0               | TensorFlow 2.4.1           | training | GPU    | 3.7            |
-| 4.5.0                   | 1.5.0               | PyTorch 1.6.0              | training | GPU    | 3.6            |
-| 4.5.0                   | 1.5.0               | TensorFlow 2.4.1           | training | GPU    | 3.7            |
-| 4.6.1                   | 1.6.2               | PyTorch 1.6.0              | training | GPU    | 3.6            |
-| 4.6.1                   | 1.6.2               | PyTorch 1.7.1              | training | GPU    | 3.6            |
-| 4.6.1                   | 1.6.2               | TensorFlow 2.4.1           | training | GPU    | 3.7            |
-| 4.10.2                  | 1.11.0              | PyTorch 1.8.1              | training | GPU    | 3.6            |
-| 4.10.2                  | 1.11.0              | PyTorch 1.9.0              | training | GPU    | 3.8            |
-| 4.10.2                  | 1.11.0              | TensorFlow 2.4.1           | training | GPU    | 3.7            |
-| 4.10.2                  | 1.11.0              | TensorFlow 2.5.1           | training | GPU    | 3.7            |
-| 4.11.0                  | 1.12.1              | PyTorch 1.9.0              | training | GPU    | 3.8            |
-| 4.11.0                  | 1.12.1              | TensorFlow 2.5.1           | training | GPU    | 3.7            |
-| 4.12.3                  | 1.15.1              | PyTorch 1.9.1              | training | GPU    | 3.8            |
-| 4.12.3                  | 1.15.1              | TensorFlow 2.5.1           | training | GPU    | 3.7            |
-| 4.17.0                  | 1.18.4              | PyTorch 1.10.2             | training | GPU    | 3.8            |
-| 4.17.0                  | 1.18.4              | TensorFlow 2.6.3           | training | GPU    | 3.8            |
-| 4.26.0                  |  2.9.0              | PyTorch 1.13.1             | training | GPU    | 3.9            |
-
-## Inference DLC Overview
-
-The Inference DLC overview includes all released and available Hugging Face Inference DLCs. It includes PyTorch and TensorFlow flavored 
-versions for CPU, GPU & AWS Inferentia.
-
-
-| 🤗 Transformers version | PyTorch/TensorFlow version | type      | device | Python Version |
-| ----------------------- | -------------------------- | --------- | ------ | -------------- |
-| 4.6.1                   | PyTorch 1.7.1              | inference | CPU    | 3.6            |
-| 4.6.1                   | PyTorch 1.7.1              | inference | GPU    | 3.6            |
-| 4.6.1                   | TensorFlow 2.4.1           | inference | CPU    | 3.7            |
-| 4.6.1                   | TensorFlow 2.4.1           | inference | GPU    | 3.7            |
-| 4.10.2                  | PyTorch 1.8.1              | inference | GPU    | 3.6            |
-| 4.10.2                  | PyTorch 1.9.0              | inference | GPU    | 3.8            |
-| 4.10.2                  | TensorFlow 2.4.1           | inference | GPU    | 3.7            |
-| 4.10.2                  | TensorFlow 2.5.1           | inference | GPU    | 3.7            |
-| 4.10.2                  | PyTorch 1.8.1              | inference | CPU    | 3.6            |
-| 4.10.2                  | PyTorch 1.9.0              | inference | CPU    | 3.8            |
-| 4.10.2                  | TensorFlow 2.4.1           | inference | CPU    | 3.7            |
-| 4.10.2                  | TensorFlow 2.5.1           | inference | CPU    | 3.7            |
-| 4.11.0                  | PyTorch 1.9.0              | inference | GPU    | 3.8            |
-| 4.11.0                  | TensorFlow 2.5.1           | inference | GPU    | 3.7            |
-| 4.11.0                  | PyTorch 1.9.0              | inference | CPU    | 3.8            |
-| 4.11.0                  | TensorFlow 2.5.1           | inference | CPU    | 3.7            |
-| 4.12.3                  | PyTorch 1.9.1              | inference | GPU    | 3.8            |
-| 4.12.3                  | TensorFlow 2.5.1           | inference | GPU    | 3.7            |
-| 4.12.3                  | PyTorch 1.9.1              | inference | CPU    | 3.8            |
-| 4.12.3                  | TensorFlow 2.5.1           | inference | CPU    | 3.7            |
-| 4.12.3                  | PyTorch 1.9.1              | inference | Inferentia    | 3.7            |
-| 4.17.0                  | PyTorch 1.10.2              | inference | GPU    | 3.8            |
-| 4.17.0                  | TensorFlow 2.6.3           | inference | GPU    | 3.8            |
-| 4.17.0                  | PyTorch 1.10.2              | inference | CPU    | 3.8            |
-| 4.17.0                  | TensorFlow 2.6.3           | inference | CPU    | 3.8            |
-| 4.26.0                  | PyTorch 1.13.1              | inference | CPU    | 3.9            |
-| 4.26.0                  | PyTorch 1.13.1              | inference | GPU    | 3.9            |
-
-
+AWS Deep Learning Containers (DLCs) come in several types. The Text Generation Inference (TGI) and Transformers subpages list the images available for each.
 
 ## Hugging Face Transformers Amazon SageMaker Examples
 

diff --git a/docs/sagemaker/tgi.md b/docs/sagemaker/tgi.md
@@ -0,0 +1,36 @@
+# Text Generation Inference (TGI) Images
+
+[TGI](https://huggingface.co/docs/text-generation-inference/en/index) is a toolkit for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and T5.
+
+Below is a list of the latest TGI images available for use on Amazon SageMaker.
+
+To find the latest supported versions of the HF DLCs, see the [AWS DLC support policy](https://aws.amazon.com/releasenotes/dlc-support-policy/).
+
+<!-- START AUTOGEN TABLE -->
+## huggingface-pytorch-tgi-inference
+
+| Framework Version | Image Type | Image URI | Size (GB) | Pushed At | Details |
+| --- | --- | --- | --- | --- | --- |
+| 2.6 | gpu | `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.6.0-tgi3.1.1-gpu-py311-cu124-ubuntu22.04-v2.0` | 8.1 | 2025-03-17 16:47:39 | [Details](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-text-generation-inference-tgi-containers) |
+| 2.4 | gpu | `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.4.0-tgi3.0.1-gpu-py311-cu124-ubuntu22.04-v2.2` | 6.5 | 2025-03-06 18:28:24 | [Details](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-text-generation-inference-tgi-containers) |
+| 2.3 | gpu | `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0-gpu-py310-cu121-ubuntu22.04-v2.1` | 4.92 | 2024-10-04 21:59:12 | [Details](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-text-generation-inference-tgi-containers) |
+
+
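+### SageMaker Example
+```python
+from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri
+
+huggingface_model = HuggingFaceModel(
+    image_uri=get_huggingface_llm_image_uri("huggingface", version="3.1.1"),  # TGI version (not the PyTorch version); must be known to your installed SageMaker SDK
+    env=<insert_hub_obj>,  # placeholder: dict of container environment variables, e.g. {"HF_MODEL_ID": "<model-id>"}
+    role=<insert_role>,  # placeholder: IAM role ARN with SageMaker permissions
+)
+
+predictor = huggingface_model.deploy(  # deploy the model to a SageMaker endpoint
+    initial_instance_count=1,
+    instance_type="ml.g6.48xlarge",
+    container_startup_health_check_timeout=2400,  # seconds; allows time for large model downloads
+)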
+```
+
+<!-- END AUTOGEN TABLE -->
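
Review note: the image tags in the table above pack several versions into one string. As a minimal sketch (not part of this PR or of the SageMaker SDK), the hypothetical helper `parse_dlc_uri` below splits a URI into its components. The field names and the regular expression are assumptions inferred only from the tags listed in these tables, so other DLC families may not match.

```python
import re
from typing import NamedTuple, Optional


class DlcImage(NamedTuple):
    """Components of a Hugging Face DLC image URI (field names are illustrative)."""
    account: str
    region: str
    repository: str
    framework_version: str
    library: str
    library_version: str
    device: str
    python: str
    cuda: Optional[str]
    os: str


# Pattern inferred from the tags listed in the tables; an assumption, not an official spec.
_URI = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[a-z0-9-]+):"
    r"(?P<framework_version>\d+\.\d+\.\d+)-"
    r"(?P<library>tgi|transformers)(?P<library_version>\d+\.\d+\.\d+)-"
    r"(?P<device>cpu|gpu)-"
    r"(?P<python>py\d+)-"
    r"(?:(?P<cuda>cu\d+)-)?"          # CPU images carry no CUDA tag
    r"(?P<os>ubuntu\d+\.\d+)"
    r"(?:-v[\d.]+)?$"                 # optional image revision suffix
)


def parse_dlc_uri(uri: str) -> DlcImage:
    """Split a DLC image URI into named parts, or raise ValueError if it does not match."""
    match = _URI.match(uri)
    if match is None:
        raise ValueError(f"unrecognized DLC image URI: {uri}")
    return DlcImage(**match.groupdict())


image = parse_dlc_uri(
    "763104351884.dkr.ecr.us-west-2.amazonaws.com/"
    "huggingface-pytorch-tgi-inference:2.6.0-tgi3.1.1-gpu-py311-cu124-ubuntu22.04-v2.0"
)
print(image.library, image.library_version, image.cuda)  # tgi 3.1.1 cu124
```

This makes it easy to see that the "Framework Version" column (2.6) is the PyTorch version, while the TGI version (3.1.1) sits in the middle of the tag.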
diff --git a/docs/sagemaker/transformers.md b/docs/sagemaker/transformers.md
@@ -0,0 +1,64 @@
+# Transformers Images
+
+[Transformers](https://huggingface.co/docs/transformers/en/index) provides APIs and tools to easily download and fine-tune state-of-the-art pretrained models, for use across NLP, computer vision, audio, and more.
+
+Below is a list of the latest images available on AWS, which come pre-packaged with the `transformers` and [datasets](https://huggingface.co/docs/datasets/en/index) libraries. Check out the tutorials in the reference section for more information.
+
+To find the latest supported versions of the HF DLCs, see the [AWS DLC support policy](https://aws.amazon.com/releasenotes/dlc-support-policy/).
+
+<!-- START AUTOGEN TABLE -->
+## huggingface-pytorch-training
+
+| Framework Version | Image Type | Image URI | Size (GB) | Pushed At | Details |
+| --- | --- | --- | --- | --- | --- |
+| 2.3 | gpu | `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-training:2.3.0-transformers4.48.0-gpu-py311-cu121-ubuntu20.04-v2.1` | 8.75 | 2025-03-14 13:15:19 | [Details](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-training-containers) |
+
+
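+### SageMaker Example
+```python
+from sagemaker.huggingface import HuggingFace
+
+# Training images are used through the HuggingFace estimator rather than HuggingFaceModel
+huggingface_estimator = HuggingFace(
+    image_uri="763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-training:2.3.0-transformers4.48.0-gpu-py311-cu121-ubuntu20.04-v2.1",
+    entry_point="train.py",  # placeholder: path to your own training script
+    role=<insert_role>,  # placeholder: IAM role ARN with SageMaker permissions
+    instance_count=1,
+    instance_type="ml.g6.48xlarge",
+)
+
+# Start the training job; pass `inputs` here if the script reads data channels
+huggingface_estimator.fit()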
+```
+
+
+## huggingface-pytorch-inference
+
+| Framework Version | Image Type | Image URI | Size (GB) | Pushed At | Details |
+| --- | --- | --- | --- | --- | --- |
+| 2.3 | gpu | `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference:2.3.0-transformers4.48.0-gpu-py311-cu121-ubuntu22.04-v2.1` | 9.12 | 2025-03-03 18:16:45 | [Details](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-inference-containers) |
+| 2.3 | cpu | `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference:2.3.0-transformers4.48.0-cpu-py311-ubuntu22.04-v2.1` | 1.39 | 2025-03-03 18:04:16 | [Details](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-inference-containers) |
+
+
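+### SageMaker Example
+```python
+from sagemaker.huggingface import HuggingFaceModel
+
+huggingface_model = HuggingFaceModel(
+    image_uri="763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference:2.3.0-transformers4.48.0-gpu-py311-cu121-ubuntu22.04-v2.1",  # GPU image from the table above
+    env=<insert_hub_obj>,  # placeholder: dict of container environment variables, e.g. {"HF_MODEL_ID": "<model-id>", "HF_TASK": "<task>"}
+    role=<insert_role>,  # placeholder: IAM role ARN with SageMaker permissions
+)
+
+predictor = huggingface_model.deploy(  # deploy the model to a SageMaker endpoint
+    initial_instance_count=1,
+    instance_type="ml.g6.48xlarge",
+    container_startup_health_check_timeout=2400,  # seconds
+)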
+```
+
+
+
+
+
+<!-- END AUTOGEN TABLE -->
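
Review note: the tables list `us-west-2` URIs only, while readers will often deploy elsewhere. A small sketch of rewriting the region in a listed URI follows. The helper name `with_region` is hypothetical, and the sketch assumes the same registry account ID serves the target region, which does not hold for every AWS region (some use a different account ID or domain), so treat the result as a starting point to verify.

```python
def with_region(image_uri: str, region: str) -> str:
    """Return image_uri with its ECR region replaced by `region`.

    Assumes the standard `<account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>` layout.
    """
    registry, sep, rest = image_uri.partition("/")
    parts = registry.split(".")
    # Expected registry parts: [account, "dkr", "ecr", region, "amazonaws", "com"]
    if not sep or len(parts) != 6 or parts[1:3] != ["dkr", "ecr"]:
        raise ValueError(f"not a standard ECR image URI: {image_uri}")
    parts[3] = region
    return ".".join(parts) + "/" + rest


uri = (
    "763104351884.dkr.ecr.us-west-2.amazonaws.com/"
    "huggingface-pytorch-inference:2.3.0-transformers4.48.0-cpu-py311-ubuntu22.04-v2.1"
)
print(with_region(uri, "eu-west-1"))
```

The returned string can be passed as `image_uri` in the examples above once the image is confirmed to exist in that region.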