Avoid running neuron integration tests twice #3054
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Force-pushed from 7f94375 to 9b213b4.
LGTM, just two questions
@@ -224,3 +229,7 @@ def neuron_model_config(request):
 @pytest.fixture(scope="module")
 def neuron_model_path(neuron_model_config):
     yield neuron_model_config["neuron_model_path"]
+
+
+if __name__ == "__main__":
Why do you need a main here? Aren't you going to run it as a test only?
OK, I got my answer: you run it in the workflow to export models.
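For context, here is a minimal sketch of the pattern under discussion: a pytest fixture module that can also be executed as a standalone script to export every test model up front. All names below (`MODEL_CONFIGURATIONS`, `maybe_export_model`) are illustrative placeholders, not the actual TGI fixture code.

```python
import pytest

# Hypothetical registry of the neuron test models (placeholder values).
MODEL_CONFIGURATIONS = {
    "example-model": {
        "model_id": "example/model",
        "neuron_model_path": "/tmp/neuron/example-model",
    },
}


def maybe_export_model(name, config):
    # Placeholder: the real helper would compile the model with the Neuron
    # SDK and cache the artifacts under config["neuron_model_path"].
    print(f"exporting {name} -> {config['neuron_model_path']}")


@pytest.fixture(scope="module", params=MODEL_CONFIGURATIONS.keys())
def neuron_model_config(request):
    # When running under pytest, each model is exported lazily, per module.
    config = MODEL_CONFIGURATIONS[request.param]
    maybe_export_model(request.param, config)
    yield config


@pytest.fixture(scope="module")
def neuron_model_path(neuron_model_config):
    yield neuron_model_config["neuron_model_path"]


if __name__ == "__main__":
    # When invoked directly (e.g. from a precompile CI step), export every
    # test model in one pass instead of lazily during the test session.
    for name, config in MODEL_CONFIGURATIONS.items():
        maybe_export_model(name, config)
```

The `__main__` guard is what lets the same file serve both roles: pytest imports it for the fixtures, while the CI step runs it as a script to warm the model cache before any test starts.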
@@ -13,3 +13,4 @@ TGI remains consistent across backends, allowing you to switch between them seam
 However, it requires a model-specific compilation step for each GPU architecture.
 * **[TGI Llamacpp backend](./backends/llamacpp)**: This backend facilitates the deployment of large language models
 (LLMs) by integrating [llama.cpp][llama.cpp], an advanced inference engine optimized for both CPU and GPU computation.
+* **[TGI Neuron backend](./backends/neuron)**: This backend leverages the [AWS Neuron SDK](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/) to allow the deployment of large language models (LLMs) on [AWS Trainium and Inferentia chips](https://aws.amazon.com/ai/machine-learning/trainium/).
Do they also run on Trainium?
Yes.
Also rename fixture file for clarity.
CARGO_REGISTRIES_CRATES_IO_PROTOCOL=sparse is not required anymore.
Force-pushed from 92d2e6f to b370902.
What does this PR do?
This modifies the neuron `integration-tests` fixture that compiles and exports models before running tests, so that it can also be run as a script to export all neuron test models at once. The script is now called during the `precompile_neuron_models` CI step instead of running the full integration tests. This also updates the documentation to fix some obsolete links.
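As a rough illustration of what the precompile step amounts to, the CI job effectively runs the fixture module once as a script before the test session. The file path below is illustrative, not the actual repository layout:

```python
import subprocess
import sys

# Run the fixture module as a script so every test model is exported once,
# up front; check=True fails the CI step if any export fails.
subprocess.run(
    [sys.executable, "integration-tests/fixtures/model.py"],
    check=True,
)
```

Exporting everything in a dedicated step keeps the expensive compilation out of the test run itself, which is what avoids running the neuron integration tests twice.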