An ML workflow runs many steps in sequence, and some steps involve conditional logic, like deploying a new model only when it is more accurate than the currently deployed model. This is a pipeline. Pipelines are essential for turning ML processes into MLOps, which goes the next mile with automation, monitoring, and governance of the workflow.
There are frameworks for specifying these steps like Kubeflow Pipelines (KFP) and TensorFlow Extended (TFX). Vertex AI Pipelines is a managed service that can execute both of these.
- Kubeflow began as a simplified way to run TensorFlow Extended jobs on Kubernetes.
TL;DR
This is a series of notebook based workflows that teach all the ways to use pipelines within Vertex AI. The suggested order, with a description and reason for each, is:
Link To Section | Notebook Workflow | Description |
---|---|---|
Link To Section | Vertex AI Pipelines - Start Here | What are pipelines? Start here to go from code to pipeline and see it in action. |
Link To Section | Vertex AI Pipelines - Introduction | Introduction to pipelines with the console and Vertex AI SDK |
Link To Section | Vertex AI Pipelines - Components | An introduction to all the ways to create pipeline components from your code |
Link To Section | Vertex AI Pipelines - IO | An overview of all the types of inputs and outputs for pipeline components |
Link To Section | Vertex AI Pipelines - Control | An overview of controlling the flow of execution for pipelines |
Link To Section | Vertex AI Pipelines - Secret Manager | How to pass sensitive information to pipelines and components |
Link To Section | Vertex AI Pipelines - GCS Read and Write | How to read/write to GCS from components, including container components. |
Link To Section | Vertex AI Pipelines - Scheduling | How to schedule pipeline execution |
Link To Section | Vertex AI Pipelines - Notifications | How to send email notifications of pipeline status. |
Link To Section | Vertex AI Pipelines - Management | Managing, Reusing, and Storing pipelines and components |
Link To Section | Vertex AI Pipelines - Testing | Strategies for testing components and pipelines locally and remotely to aid development. |
Link To Section | Vertex AI Pipelines - Managing Pipeline Jobs | Manage runs of pipelines in an environment: list, check status, filtered list, cancel and delete jobs. |
To discover these notebooks as part of an introduction to MLOps, read on below!
What are pipelines?
- They help you automate, manage, and scale your ML workflows
- They offer reproducibility, collaboration, and efficiency
Before getting into the details let's go from code to pipeline and see this in action!
Notebook Workflow: In this quick start, we'll take a simple code example and run it both in a notebook and as a pipeline on Vertex AI Pipelines. This will likely spark many questions, and that's great! The rest of this series will dive deeper into each aspect of pipelines, providing comprehensive answers by example.
Pipelines are constructed by the following flow (sketched in code below):
- Creating components from code
- Constructing pipelines where steps, or tasks, are made from components
- Running pipelines on Vertex AI Pipelines
- Reviewing pipeline runs and task results
- Reviewing task execution: each task runs as a Vertex AI Training custom job
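Below is a minimal sketch of that flow using the KFP and Vertex AI SDKs; the component logic, pipeline name, and the `PROJECT_ID`, `REGION`, and `BUCKET` values are illustrative placeholders, not taken from the notebooks above.

```python
from kfp import dsl, compiler
from google.cloud import aiplatform

@dsl.component(base_image='python:3.10')
def add(a: float, b: float) -> float:
    # a component: one step of the workflow packaged to run in a container
    return a + b

@dsl.pipeline(name='quickstart-pipeline')
def pipeline(a: float = 1.0, b: float = 2.0):
    # tasks: steps of the pipeline made from components
    first = add(a=a, b=b)
    second = add(a=first.output, b=b)

# compile the pipeline to a YAML specification
compiler.Compiler().compile(pipeline, 'pipeline.yaml')

# run the compiled pipeline on Vertex AI Pipelines
aiplatform.init(project='PROJECT_ID', location='REGION', staging_bucket='gs://BUCKET')
job = aiplatform.PipelineJob(display_name='quickstart', template_path='pipeline.yaml')
job.run()  # each task executes as a Vertex AI Training custom job
```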
An overview:
Notebook Workflow: Get a quick start with pipelines by reviewing this workflow for an example using both the Vertex AI Console and SDK.
The steps of the workflow, each an ML task, are run with components. Getting logic and code into components can consist of using pre-built components or constructing custom components (the KFP custom options are sketched after this list):
- KFP
- Pre-Built:
- Custom:
- Lightweight Python Components - create a component from a Python function
- Containerized Python Components - for complex dependencies
- Container Component - a component from a container
- TFX
- Pre-Built:
- Custom:
- Python function-based components - create a component from a Python function
- Container-based components - a component from a container
- Fully custom components - reuse and extend standard components.
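As a sketch of the KFP custom options, the following shows a Lightweight Python Component and a Container Component side by side; the function bodies, packages, and images are only examples.

```python
from kfp import dsl

# Lightweight Python Component: built from a Python function;
# listed packages are installed when the task runs
@dsl.component(base_image='python:3.10', packages_to_install=['pandas'])
def summarize(rows: int) -> int:
    import pandas as pd
    df = pd.DataFrame({'x': range(rows)})
    return int(df['x'].sum())

# Container Component: wraps any container image, command, and arguments
@dsl.container_component
def say_hello(name: str):
    return dsl.ContainerSpec(image='alpine', command=['echo'], args=[name])
```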
Notebook Workflow: For an overview of components from custom to pre-built, check out this notebook:
Compute Resources For Components:
Running pipelines on Vertex AI Pipelines runs each component as a Vertex AI Training `CustomJob`. This defaults to a VM based on `e2-standard-4` (4 vCPUs, 16 GB memory). This can be modified at the task level of a pipeline to choose different computing resources, including GPUs. For example:
```python
@kfp.dsl.pipeline()
def pipeline():
    task = component().set_cpu_limit(C).set_memory_limit(M).add_node_selector_constraint(A).set_accelerator_limit(G)
```
Where the inputs define the machine configuration for the step:
- C = a string representing the number of CPUs (up to 96).
- M = a string representing the memory limit: an integer followed by K, M, or G (up to 624GB).
- A = a string representing the desired GPU or TPU type.
- G = an integer representing the number of accelerators of type A desired.
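Here is the same pattern with example values filled in; the placeholder component, machine shape, and accelerator type are assumptions for illustration only:

```python
from kfp import dsl

@dsl.component(base_image='python:3.10')
def trainer():
    pass  # placeholder training step

@dsl.pipeline(name='resource-example')
def pipeline():
    task = (
        trainer()
        .set_cpu_limit('8')                                # C: CPUs as a string
        .set_memory_limit('32G')                           # M: integer followed by K, M, or G
        .add_node_selector_constraint('NVIDIA_TESLA_T4')   # A: accelerator type
        .set_accelerator_limit(1)                          # G: number of accelerators
    )
```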
Getting information into code and results out is the IO part of components. These inputs and outputs are particularly important in MLOps as they are the artifacts that define an ML system: datasets, models, metrics, and more. Pipeline tools like TFX and KFP go a step further and automatically track the inputs and outputs and even provide lineage information for them. Component inputs and outputs can take two forms: parameters and artifacts.
Parameters are Python objects like `str`, `int`, `float`, `bool`, `list`, and `dict` that are defined as inputs to pipelines and components. Components can also return parameters for input into subsequent components. Parameters are excellent for changing the behavior of a pipeline/component through inputs rather than rewriting code.
Artifacts are multi-parameter objects that represent machine learning artifacts and have defined schemas and are stored as metadata with lineage. The artifact schemas follow the ML Metadata (MLMD) client library. This helps with understanding and analyzing a pipeline.
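As a sketch of the two forms together, the hypothetical component below takes a parameter and a `Dataset` artifact as inputs and produces a `Metrics` artifact and a boolean parameter as outputs; the column names are assumptions.

```python
from kfp import dsl
from kfp.dsl import Dataset, Metrics, Input, Output

@dsl.component(base_image='python:3.10', packages_to_install=['pandas'])
def evaluate(
    threshold: float,          # parameter input
    data: Input[Dataset],      # artifact input, tracked with lineage
    metrics: Output[Metrics],  # artifact output, tracked with lineage
) -> bool:                     # parameter output for downstream conditions
    import pandas as pd
    df = pd.read_csv(data.path)  # artifacts expose a local file path
    accuracy = float((df['label'] == df['prediction']).mean())
    metrics.log_metric('accuracy', accuracy)
    return accuracy >= threshold
```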
Notebook Workflow: See all the types of parameters and artifacts in action with the following notebook based workflow:
Secure Parameters: Passing credentials for an API or service can expose them. If these credentials are hardcoded, they can be discovered from the source code and are harder to update. A great solution is using Secret Manager to host credentials and then passing the name of the credential as a parameter. The only modification needed in a component is using a Python client to retrieve the credential at run time.
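A minimal sketch of that pattern, assuming a secret already exists in Secret Manager and the pipeline's service account can access it; the secret resource name is a placeholder:

```python
from kfp import dsl

@dsl.component(
    base_image='python:3.10',
    packages_to_install=['google-cloud-secret-manager'],
)
def use_api(secret_name: str) -> str:
    # secret_name like: projects/PROJECT_NUMBER/secrets/SECRET_ID/versions/latest
    from google.cloud import secretmanager
    client = secretmanager.SecretManagerServiceClient()
    response = client.access_secret_version(name=secret_name)
    api_key = response.payload.data.decode('utf-8')
    # ... use api_key with the external service; never log or return the value itself
    return 'credential retrieved'
```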
Notebook Workflow: Check out how easy Secret Manager is to implement with the following notebook based example workflow:
GCS Read/Write: Methods for reading and writing data in GCS within a component. Components run as Vertex AI Training jobs, which include GCS as a FUSE mount. That means components can utilize GCS at the `/gcs` mount during runs. This includes container components, and the notebook workflow below even shows how to pass code directly to a container for execution.
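A small sketch of reading and writing through the `/gcs` mount from inside a component; the bucket name and file path are placeholders:

```python
from kfp import dsl

@dsl.component(base_image='python:3.10')
def write_then_read(bucket: str, text: str) -> str:
    import os
    # the bucket gs://BUCKET appears at /gcs/BUCKET inside the running task
    out_path = f'/gcs/{bucket}/reports/report.txt'
    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    with open(out_path, 'w') as f:
        f.write(text)
    with open(out_path) as f:
        return f.read()
```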
Notebook Workflow: See GCS reads and writes from components in action with the following notebook based workflow:
As the tasks of an ML pipeline run, they form a graph: the outputs of upstream components become the inputs of downstream components. Both TFX and KFP automatically use these connections to create a DAG of execution. When logic needs to be specified in the pipeline's flow of execution, control structures are necessary.
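For example, a conditional deployment can be expressed with KFP control structures; the components below are placeholders and the threshold is arbitrary:

```python
from kfp import dsl

@dsl.component(base_image='python:3.10')
def evaluate() -> float:
    return 0.92  # placeholder metric

@dsl.component(base_image='python:3.10')
def deploy():
    print('deploying model')

@dsl.pipeline(name='conditional-deploy')
def pipeline(threshold: float = 0.9):
    metric = evaluate()
    # dsl.If in recent KFP releases; dsl.Condition in earlier ones
    with dsl.If(metric.output >= threshold, name='accurate-enough'):
        deploy()
```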
Notebook Workflow: The following notebook shows many examples of implementing controls in KFP while running on Vertex AI Pipelines:
Pipelines can be run on a schedule directly in Vertex AI without the need to set up a separate scheduler and trigger (like Pub/Sub).
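A sketch of scheduling a compiled pipeline with the Vertex AI SDK; the cron expression, run limits, and project values are placeholders:

```python
from google.cloud import aiplatform

aiplatform.init(project='PROJECT_ID', location='REGION', staging_bucket='gs://BUCKET')

job = aiplatform.PipelineJob(
    display_name='scheduled-pipeline',
    template_path='pipeline.yaml',
)

schedule = job.create_schedule(
    display_name='weekly-run',
    cron='0 9 * * 1',           # 9:00 every Monday
    max_run_count=10,           # stop after this many iterations
    max_concurrent_run_count=1,
)
```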
Notebook Workflow: Here is an example of a pipeline run followed by a schedule that repeats the pipeline at a specified interval until the maximum number of iterations set on the schedule is reached:
This can have many helpful applications, including:
- Running batch predictions, evaluations, and monitoring each day or week
- Retraining a model, running evaluations, comparing the new model to the currently deployed model, and then conditionally updating the deployed model
- Checking for new training records and commencing retraining if conditions are met - like records that increase a class by 10%, at least 1000 new records, ...
As the number of pipelines grows and scheduling and triggering are implemented, it becomes necessary to know which pipelines need to be reviewed. Getting notifications about the completion of pipelines is a good first step. Then, being able to limit notifications to failures, or particular failures, becomes important.
Notebook Workflow: This notebook workflow covers pre-built components for email notification and building a custom notification system for sending emails (or performing other tasks) conditional on the pipeline's status.
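For the pre-built path, here is a sketch using the notification email component inside an exit handler so it runs regardless of how the pipeline finishes; the training step and recipient address are placeholders:

```python
from kfp import dsl
from google_cloud_pipeline_components.v1.vertex_notification_email import VertexNotificationEmailOp

@dsl.component(base_image='python:3.10')
def train():
    print('training...')  # placeholder work

@dsl.pipeline(name='notify-on-completion')
def pipeline():
    notify = VertexNotificationEmailOp(recipients=['you@example.com'])
    with dsl.ExitHandler(notify):
        train()
```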
As seen above, pipelines are made up of steps which are executions of components. These components are made up of code, container, and instructions (inputs and outputs).
Components:
For each type of component, `kfp` compiles the component into YAML as part of the pipeline. You can also directly compile individual components. This makes the YAML for a component a source that can be managed, and using it in additional pipelines is made possible with `kfp.components.load_component_from_*()`, which has versions for files, URLs, and text (strings).
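A sketch of compiling a single component to YAML and loading it back for reuse; the file name and component are arbitrary examples:

```python
from kfp import compiler, components, dsl

@dsl.component(base_image='python:3.10')
def add(a: float, b: float) -> float:
    return a + b

# compile just this component to a YAML specification
compiler.Compiler().compile(add, 'add_component.yaml')

# later, or in another pipeline project, load it back for reuse
add_reused = components.load_component_from_file('add_component.yaml')
# also available: load_component_from_url(...), load_component_from_text(...)
```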
Pipelines:
Pipelines are compiled into YAML files that include component specifications. Managing these pipeline files as artifacts is made easy with the combination of the following (sketched in code after the list):
- The Kubeflow Pipelines SDK and the included `kfp.registry.RegistryClient`
- Google Cloud Artifact Registry with a native format for Kubeflow pipeline templates
- Integration with Vertex AI for creating, uploading and using pipeline templates
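A sketch of those pieces together, assuming a Kubeflow-format Artifact Registry repository already exists; the host URL, repository name, and tags are placeholders:

```python
from kfp.registry import RegistryClient
from google.cloud import aiplatform

client = RegistryClient(host='https://REGION-kfp.pkg.dev/PROJECT_ID/REPO_NAME')

# upload the compiled pipeline as a versioned, tagged template
template_name, version_name = client.upload_pipeline(
    file_name='pipeline.yaml',
    tags=['latest', 'v1'],
)

# run directly from the registry template without a local download
job = aiplatform.PipelineJob(
    display_name='from-registry',
    template_path=f'https://REGION-kfp.pkg.dev/PROJECT_ID/REPO_NAME/{template_name}/latest',
)
job.run()
```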
Notebook Workflow: Work directly with these concepts in the following notebook based workflow:
When creating pipeline components and pipelines, the process of testing can be aided by local testing and several strategies for remote (on Vertex AI Pipelines) testing. This section covers local and remote strategies to aid the development process.
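One local strategy is the KFP local runner (available in recent KFP releases), which executes a component immediately in a subprocess so its logic can be checked before any remote run; the component here is a trivial example:

```python
from kfp import dsl, local

# initialize local execution; SubprocessRunner runs components in the current environment
local.init(runner=local.SubprocessRunner())

@dsl.component(base_image='python:3.10')
def add(a: float, b: float) -> float:
    return a + b

task = add(a=1.0, b=2.0)   # executes immediately and locally
assert task.output == 3.0
```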
Notebook Workflow: Work directly with these concepts in the following notebook based workflow:
Vertex AI Pipeline Jobs are runs of a pipeline. These can be run directly by a user, started by API, or scheduled. Within Vertex AI, a project can have many jobs running at any time and a history of all past jobs. This workflow shows how to review and manage the jobs in an environment using the Python SDK. For custom metrics in Cloud Logging, check out this helpful page.
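A sketch of listing and managing jobs with the Vertex AI SDK; the filter string and project values are illustrative:

```python
from google.cloud import aiplatform

aiplatform.init(project='PROJECT_ID', location='REGION')

# list jobs, optionally filtered and ordered (filter fields such as display_name and state)
jobs = aiplatform.PipelineJob.list(
    filter='display_name="my-pipeline"',
    order_by='create_time desc',
)

for job in jobs:
    print(job.display_name, job.state, job.create_time)

# cancel a running job or delete a finished one
# jobs[0].cancel()
# jobs[0].delete()
```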
Notebook Workflow: Work directly with these concepts in the following notebook based workflow:
A series of notebook based workflows that show how to put all the concepts from the material above into common workflows:
- Vertex AI Pipelines - Pattern - Modular and Reusable
- Example 1: Store a pipeline in artifact registry and directly run it on Vertex AI Pipelines without a local download.
- Example 2: Store and retrieve components for reusability: as files (at a URL, in a file directory, or as a text string) and as artifacts in Artifact Registry
- Example 3: Store pipelines in artifact registry and retrieve (download, and import) to use as components in new pipelines
- Run R on Vertex AI Pipelines
- Use a prebuilt container to easily run an R script with inputs for the required libraries and command line arguments