Run `train.ipynb` for training.
For the demo, we use mock Salesforce data with client opportunities.
In this pipeline, we use TensorFlow and TFX to build a deep-learning model that predicts the opportunity window in days.
Prepared models and parameters tailored to our data are located in `./pretrained`.
To use a custom Python transform, you have two options:
Use `PythonExternalTransform` to stage PyPI package dependencies
If your custom Python transform is available on PyPI, you can use the `withExtraPackages` method of the `PythonExternalTransform` class and specify the dependencies required by the RunInference model handler in the arguments. For more details, see the Apache Beam documentation: Creating cross-language Python transforms.
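As an illustration, applying a Python transform from Java while staging its PyPI dependencies might look like the following sketch; the transform's fully qualified name (`my_transforms.OpportunityInference`) and the package list are assumptions, not this project's actual values.

```java
import java.util.Arrays;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.extensions.python.PythonExternalTransform;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;

public class PyTransformExample {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    PCollection<String> input = p.apply(Create.of("opportunity-1", "opportunity-2"));

    // Apply a Python PTransform cross-language. withExtraPackages stages the
    // listed PyPI packages into the Python SDK environment at expansion time,
    // so the RunInference model handler's dependencies are available.
    PCollection<String> output =
        input.apply(
            PythonExternalTransform
                .<PCollection<String>, PCollection<String>>from(
                    "my_transforms.OpportunityInference") // hypothetical FQN
                .withExtraPackages(Arrays.asList("tensorflow", "my_transforms")));

    p.run().waitUntilFinish();
  }
}
```

With this option no custom expansion service is needed; Beam starts a default one and installs the listed packages into it.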
Use a Custom Expansion Service
NOTE: Only use a custom expansion service when you need an environment that includes packages not available in the Beam SDK or not published on PyPI. Otherwise, use the `withExtraPackages` method to pass Python PTransform dependencies.
To make Python inference available in a multi-language pipeline, all model inference and data preprocessing are packed into a custom PTransform hosted on an expansion service.
The expansion service image must be supplied with all the dependencies used in the Python part of the pipeline. If these imports are available on PyPI, we can include them directly in the Dockerfile; otherwise, we can copy the package source into the image and install it with `setup.py`.
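A Dockerfile for the expansion service image might look roughly like this; the base image tag, package names, and paths are assumptions, not this repo's actual values.

```dockerfile
# Base on the Beam Python SDK image matching the pipeline's Beam version
# (version tag is an assumption).
FROM apache/beam_python3.9_sdk:2.48.0

# Dependencies published on PyPI can be installed directly.
RUN pip install --no-cache-dir tensorflow tfx

# A package not published on PyPI is copied in and installed from source
# (directory and package name are hypothetical).
COPY ./my_transforms /opt/my_transforms
RUN pip install /opt/my_transforms
```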
To start an expansion service that loads a custom PTransform with RunInference, follow the steps in the Java multi-language pipelines quickstart.
You'll also need to pass the URIs of the saved model weights from `./pretrained` and specify the public address of the hosted expansion service in the Java pipeline.
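On the Java side, pointing at the hosted expansion service and passing a model-weight URI could look roughly like this fragment; the transform name, kwarg name, service address, and bucket path are all assumptions, and in practice these values would come from pipeline options rather than being hard-coded.

```java
// Fragment: applies the Python transform via an externally hosted
// expansion service instead of the default one.
PCollection<String> predictions =
    input.apply(
        PythonExternalTransform
            .<PCollection<String>, PCollection<String>>from(
                "my_transforms.OpportunityInference", // hypothetical FQN
                "1.2.3.4:8088")                       // public expansion service address
            // Hypothetical kwarg: URI of the saved weights from ./pretrained,
            // uploaded to a bucket reachable by the runner.
            .withKwarg("model_uri", "gs://my-bucket/pretrained/saved_model"));
```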