Introduction
This document provides instructions for creating a TensorFlow Extended (TFX) pipeline
using templates that are provided with the TFX Python package.
Many of the instructions are Linux shell commands, which will run on an AI Platform Notebooks instance. Corresponding Jupyter Notebook code cells, which invoke those commands using !, are provided.
You will build a pipeline using the Taxi Trips dataset released by the City of Chicago. We strongly encourage you to try building your own pipeline with your own dataset, using this pipeline as a baseline.
Step 1. Set up your environment.
AI Platform Pipelines will prepare a development environment to build a pipeline, and a Kubeflow Pipeline cluster to run the newly built pipeline.
Install the tfx Python package with the kfp extra requirement.
import sys
# Use the latest version of pip.
!pip install --upgrade pip
# Install tfx and kfp Python packages.
!pip install --upgrade "tfx[kfp]<2"
Let's check the version of TFX.
!python3 -c "from tfx import version ; print('TFX version: {}'.format(version.__version__))"
TFX version: 1.12.0
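The install cell pins the package to "tfx[kfp]<2", so one quick sanity check is that the reported major version is below 2. The following is a minimal sketch of that check on a version string like the one printed above; is_supported_tfx_version is a hypothetical helper name used only for illustration and is not part of TFX.

```python
def is_supported_tfx_version(version_string: str, max_major: int = 2) -> bool:
    """Return True if the major version is below max_major (mirrors the "<2" pin)."""
    # Hypothetical helper: only the leading major component is inspected.
    major = int(version_string.split(".")[0])
    return major < max_major


print(is_supported_tfx_version("1.12.0"))
```

In a notebook you could pass version.__version__ from the tfx package to this helper instead of a literal string.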
In AI Platform Pipelines, TFX is running in a hosted Kubernetes environment using Kubeflow Pipelines.
Let's set some environment variables to use Kubeflow Pipelines.
First, get your GCP project ID.
# Read GCP project id from env.
shell_output=!gcloud config list --format 'value(core.project)' 2>/dev/null
GOOGLE_CLOUD_PROJECT=shell_output[0]
%env GOOGLE_CLOUD_PROJECT={GOOGLE_CLOUD_PROJECT}
print("GCP project ID:" + GOOGLE_CLOUD_PROJECT)
env: GOOGLE_CLOUD_PROJECT=tensorflow-testing
GCP project ID:tensorflow-testing
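The cell above takes shell_output[0] directly, which assumes gcloud printed a project ID. As a hedged sketch, the helper below makes that assumption explicit by returning the first non-empty line and failing with a readable message otherwise. The function name project_id_from_output is hypothetical and only for illustration.

```python
from typing import List


def project_id_from_output(lines: List[str]) -> str:
    """Return the first non-empty line of the captured gcloud output."""
    for line in lines:
        if line.strip():
            return line.strip()
    # Nothing usable was printed, so no project appears to be configured.
    raise ValueError(
        "No GCP project ID found; try `gcloud config set project <id>` first."
    )


print(project_id_from_output(["tensorflow-testing"]))
```

In the notebook, the list passed in would be the shell_output variable captured from the gcloud command.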
We also need to access your KFP cluster. You can access it in your Google Cloud Console under the "AI Platform > Pipeline" menu. The "endpoint" of the KFP cluster can be found in the URL of the Pipelines dashboard, or you can get it from the URL of the Getting Started page where you launched this notebook. Let's create an ENDPOINT environment variable and set it to the KFP cluster endpoint. ENDPOINT should contain only the hostname part of the URL. For example, if the URL of the KFP dashboard is https://1e9deb537390ca22-dot-asia-east1.pipelines.googleusercontent.com/#/start, the ENDPOINT value becomes 1e9deb537390ca22-dot-asia-east1.pipelines.googleusercontent.com.
# This refers to the KFP cluster endpoint
ENDPOINT='' # Enter your ENDPOINT here.
if not ENDPOINT:
    from absl import logging
    logging.error('Set your ENDPOINT in this cell.')
ERROR:absl:Set your ENDPOINT in this cell.
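Since ENDPOINT must hold only the hostname part of the dashboard URL, it can be convenient to derive it from a pasted URL rather than trim it by hand. The sketch below uses the standard library's urllib.parse for this. The helper name endpoint_from_dashboard_url is hypothetical, and the URL is the example one from the text above.

```python
from urllib.parse import urlparse


def endpoint_from_dashboard_url(url: str) -> str:
    """Return only the hostname part of a KFP dashboard URL."""
    hostname = urlparse(url).hostname
    if not hostname:
        # For example, the scheme was omitted, so no network location was parsed.
        raise ValueError(f"Could not find a hostname in {url!r}")
    return hostname


print(endpoint_from_dashboard_url(
    "https://1e9deb537390ca22-dot-asia-east1.pipelines.googleusercontent.com/#/start"
))
```

The returned string is what you would assign to ENDPOINT in the cell above.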
Set the image name to tfx-pipeline under the current GCP project.
# Docker image name for the pipeline image.
CUSTOM_TFX_IMAGE='gcr.io/' + GOOGLE_CLOUD_PROJECT + '/tfx-pipeline'
And, it's done. We are ready to create a pipeline.
Step 2. Copy the predefined template to your project directory.
In this step, we will create a working pipeline project directory and files by copying additional files from a predefined template.
You may give your pipeline a different name by changing the PIPELINE_NAME below. This will also become the name of the project directory where your files will be put.
PIPELINE_NAME="my_pipeline"
import os
PROJECT_DIR=os.path.join(os.path.expanduser("~"),"imported",PIPELINE_NAME)
TFX includes the taxi template with the TFX Python package. If you are planning to solve a point-wise prediction problem, including classification and regression, this template could be used as a starting point.
The tfx template copy CLI command copies predefined template files into your project directory.
!tfx template copy \
--pipeline-name={PIPELINE_NAME} \
--destination-path={PROJECT_DIR} \
--model=taxi
2023-03-15 09:29:51.273036: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-03-15 09:29:51.273139: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-03-15 09:29:51.273157: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
CLI
Copying taxi pipeline template
model_analysis.ipynb -> /home/kbuilder/imported/my_pipeline/model_analysis.ipynb
__init__.py -> /home/kbuilder/imported/my_pipeline/__init__.py
data_validation.ipynb -> /home/kbuilder/imported/my_pipeline/data_validation.ipynb
__init__.py -> /home/kbuilder/imported/my_pipeline/models/__init__.py
preprocessing.py -> /home/kbuilder/imported/my_pipeline/models/preprocessing.py
model.py -> /home/kbuilder/imported/my_pipeline/models/keras_model/model.py
__init__.py -> /home/kbuilder/imported/my_pipeline/models/keras_model/__init__.py
model_test.py -> /home/kbuilder/imported/my_pipeline/models/keras_model/model_test.py
constants.py -> /home/kbuilder/imported/my_pipeline/models/keras_model/constants.py
preprocessing_test.py -> /home/kbuilder/imported/my_pipeline/models/preprocessing_test.py
model.py -> /home/kbuilder/imported/my_pipeline/models/estimator_model/model.py
__init__.py -> /home/kbuilder/imported/my_pipeline/models/estimator_model/__init__.py
model_test.py -> /home/kbuilder/imported/my_pipeline/models/estimator_model/model_test.py
constants.py -> /home/kbuilder/imported/my_pipeline/models/estimator_model/constants.py
features.py -> /home/kbuilder/imported/my_pipeline/models/features.py
features_test.py -> /home/kbuilder/imported/my_pipeline/models/features_test.py
local_runner.py -> /home/kbuilder/imported/my_pipeline/local_runner.py
.gitignore -> /home/kbuilder/imported/my_pipeline/.gitignore
__init__.py -> /home/kbuilder/imported/my_pipeline/pipeline/__init__.py
pipeline.py -> /home/kbuilder/imported/my_pipeline/pipeline/pipeline.py
configs.py -> /home/kbuilder/imported/my_pipeline/pipeline/configs.py
kubeflow_runner.py -> /home/kbuilder/imported/my_pipeline/kubeflow_runner.py
kubeflow_v2_runner.py -> /home/kbuilder/imported/my_pipeline/kubeflow_v2_runner.py
Change the working directory context in this notebook to the project directory.
%cd {PROJECT_DIR}
/home/kbuilder/imported/my_pipeline
Step 3. Browse your copied source files
The TFX template provides basic scaffold files to build a pipeline, including Python source code, sample data, and Jupyter Notebooks to analyse the output of the pipeline. The taxi template uses the same Chicago Taxi dataset and ML model as the Airflow Tutorial.
Here is a brief introduction to each of the Python files.
- pipeline - This directory contains the definition of the pipeline.
  - configs.py: defines common constants for pipeline runners
  - pipeline.py: defines TFX components and a pipeline
- models - This directory contains ML model definitions.
  - features.py, features_test.py: defines features for the model
  - preprocessing.py, preprocessing_test.py: defines preprocessing jobs using tf::Transform
  - estimator_model - This directory contains an Estimator based model.
    - constants.py: defines constants of the model
    - model.py, model_test.py: defines DNN model using TF estimator
  - keras_model - This directory contains a Keras based model.
    - constants.py: defines constants of the model
    - model.py, model_test.py: defines DNN model using Keras
- local_runner.py, kubeflow_runner.py: define runners for each orchestration engine
You might notice that there are some files with _test.py in their name. These are unit tests of the pipeline, and it is recommended to add more unit tests as you implement your own pipelines. You can run unit tests by supplying the module name of the test file with the -m flag. You can usually get a module name by deleting the .py extension and replacing / with . (a period). For example:
!{sys.executable} -m models.features_test
!{sys.executable} -m models.keras_model.model_test
Running tests under Python 3.9.16: /tmpfs/src/tf_docs_env/bin/python
[ RUN ] FeaturesTest.testNumberOfBucketFeatureBucketCount
INFO:tensorflow:time(__main__.FeaturesTest.testNumberOfBucketFeatureBucketCount): 0.0s
I0315 09:29:55.530076 139660746958656 test_util.py:2457] time(__main__.FeaturesTest.testNumberOfBucketFeatureBucketCount): 0.0s
[ OK ] FeaturesTest.testNumberOfBucketFeatureBucketCount
[ RUN ] FeaturesTest.testTransformedNames
INFO:tensorflow:time(__main__.FeaturesTest.testTransformedNames): 0.0s
I0315 09:29:55.530458 139660746958656 test_util.py:2457] time(__main__.FeaturesTest.testTransformedNames): 0.0s
[ OK ] FeaturesTest.testTransformedNames
[ RUN ] FeaturesTest.test_session
[ SKIPPED ] FeaturesTest.test_session
----------------------------------------------------------------------
Ran 3 tests in 0.001s
OK (skipped=1)
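The rule for turning a test file path into a module name (drop the .py extension, replace / with .) can be sketched in a few lines. The helper name module_name_from_path is hypothetical. The example paths are taken from the files copied by the template, where the Keras model lives under models/keras_model.

```python
from pathlib import PurePosixPath


def module_name_from_path(path: str) -> str:
    """Convert a relative path like 'models/features_test.py' to a module name."""
    # with_suffix("") drops ".py"; joining the parts replaces "/" with ".".
    return ".".join(PurePosixPath(path).with_suffix("").parts)


print(module_name_from_path("models/features_test.py"))
print(module_name_from_path("models/keras_model/model_test.py"))
```

The resulting names are what you would pass after the -m flag when running the tests from the project directory.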
Step 4. Run your first TFX pipeline
Components in the TFX pipeline will generate outputs for each run as ML Metadata Artifacts, and they need to be stored somewhere. You can use any storage which the KFP cluster can access, and for this example we will use Google Cloud Storage (GCS). A default GCS bucket should have been created automatically. Its name will be <your-project-id>-kubeflowpipelines-default.
Let's upload our sample data to the GCS bucket so that we can use it in our pipeline later.
!gsutil cp data/data.csv gs://{GOOGLE_CLOUD_PROJECT}-kubeflowpipelines-default/tfx-template/data/taxi/data.csv
BucketNotFoundException: 404 gs://tensorflow-testing-kubeflowpipelines-default bucket does not exist.
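If the upload reports that the bucket does not exist, it may help to print the exact bucket name and destination that the command expands to and compare them with the buckets in your project. The sketch below only builds those strings from the naming pattern given in the text. The function names are hypothetical, and no GCS call is made.

```python
def default_bucket_name(project_id: str) -> str:
    """Name of the default bucket, following <your-project-id>-kubeflowpipelines-default."""
    return f"{project_id}-kubeflowpipelines-default"


def sample_data_uri(project_id: str) -> str:
    """Destination URI used by the gsutil cp command for the sample data."""
    return f"gs://{default_bucket_name(project_id)}/tfx-template/data/taxi/data.csv"


# "my-project" is a placeholder project ID for illustration.
print(default_bucket_name("my-project"))
print(sample_data_uri("my-project"))
```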
Let's create a TFX pipeline using the tfx pipeline create command.
!tfx pipeline create --pipeline-path=kubeflow_runner.py --endpoint={ENDPOINT} \
--build-image
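An unset ENDPOINT is an easy mistake to make at this step. As a hedged sketch, the helper below assembles the same tfx pipeline create command as an argument list and refuses to proceed when the endpoint is empty. The function name build_pipeline_create_command is hypothetical. The flags are the ones used in the cell above.

```python
from typing import List


def build_pipeline_create_command(
    endpoint: str, pipeline_path: str = "kubeflow_runner.py"
) -> List[str]:
    """Assemble the `tfx pipeline create` invocation as an argv list."""
    if not endpoint:
        # Fail early instead of letting the CLI run without a cluster address.
        raise ValueError("ENDPOINT is empty; set it to the KFP cluster hostname first.")
    return [
        "tfx", "pipeline", "create",
        f"--pipeline-path={pipeline_path}",
        f"--endpoint={endpoint}",
        "--build-image",
    ]


# "example-host.example.com" is a placeholder, not a real KFP endpoint.
print(build_pipeline_create_command("example-host.example.com"))
```

The returned list could be handed to subprocess.run(cmd, check=True) from the project directory, which is roughly equivalent to running the shell command in the notebook cell.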
While creating a pipeline, a Dockerfile will be generated to build a Docker image. Don't forget to add it to the source control system (for example, git) along with other source files.
Now start an execution run with the newly created pipeline using the tfx run create command.
!tfx run create --pipeline-name={PIPELINE_NAME} --endpoint={ENDPOINT}
You can also run the pipeline from the KFP Dashboard. The new execution run will be listed under Experiments in the KFP Dashboard. Clicking into the experiment will allow you to monitor progress and visualize the artifacts created during the execution run.
We recommend visiting the KFP Dashboard, which you can access from the Cloud AI Platform Pipelines menu in Google Cloud Console. There you can find the pipeline and detailed information about it. For example, your runs are listed under the Experiments menu, and when you open an execution run you can find all of its artifacts under the Artifacts menu.
One of the major sources of failure is permission-related problems. Please make sure your KFP cluster has permissions to access Google Cloud APIs. This can be configured when you create a KFP cluster in GCP, or see the Troubleshooting document in GCP.
Step 5. Add components for data validation.
In this step, you will add components for data validation including StatisticsGen, SchemaGen, and ExampleValidator. If you are interested in data validation, please see Get started with TensorFlow Data Validation.
Double-click to change directory to pipeline and double-click again to open pipeline.py. Find and uncomment the 3 lines which add StatisticsGen, SchemaGen, and ExampleValidator to the pipeline. (Tip: search for comments containing TODO(step 5):). Make sure to save pipeline.py after you edit it.
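If you prefer to locate the lines programmatically rather than by searching in the editor, the tip above can be sketched as a small search for the TODO(step N) marker. The helper name find_step_todos is hypothetical. The sample text is a made-up stand-in for illustration, not the actual contents of pipeline.py.

```python
from typing import List, Tuple


def find_step_todos(source: str, step: int) -> List[Tuple[int, str]]:
    """Return (line_number, text) pairs for lines mentioning TODO(step N)."""
    marker = f"TODO(step {step})"
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if marker in line
    ]


# Made-up sample text standing in for a source file with step markers.
sample = """components = []
# TODO(step 5): placeholder comment for a step 5 line
# TODO(step 6): placeholder comment for a step 6 line
"""
print(find_step_todos(sample, 5))
```

To apply it to the real file, you could pass the text read from pipeline/pipeline.py and then uncomment the reported lines by hand.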
You now need to update the existing pipeline with the modified pipeline definition. Use the tfx pipeline update command to update your pipeline, followed by the tfx run create command to create a new execution run of your updated pipeline.
# Update the pipeline
!tfx pipeline update \
--pipeline-path=kubeflow_runner.py \
--endpoint={ENDPOINT}
# You can run the pipeline the same way.
!tfx run create --pipeline-name {PIPELINE_NAME} --endpoint={ENDPOINT}
Check pipeline outputs
Visit the KFP dashboard to find pipeline outputs on the page for your pipeline run. Click the Experiments tab on the left, then All runs on the Experiments page. You should be able to find the latest run under the name of your pipeline.
Step 6. Add components for training.
In this step, you will add components for training and model validation, including Transform, Trainer, Resolver, Evaluator, and Pusher.
Double-click to open pipeline.py. Find and uncomment the 5 lines which add Transform, Trainer, Resolver, Evaluator and Pusher to the pipeline. (Tip: search for TODO(step 6):)
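The tip above can also be followed programmatically. The following is a minimal sketch, not part of the template: the helper name find_todo_lines and the sample text are hypothetical, and the exact comment wording in your copy of pipeline.py may differ.

```python
from typing import List, Tuple


def find_todo_lines(source: str, marker: str = "TODO(step 6):") -> List[Tuple[int, str]]:
    """Return (line_number, text) pairs for every line containing the marker."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if marker in line:
            hits.append((number, line.strip()))
    return hits


# Hypothetical excerpt standing in for pipeline.py; the real file may differ.
SAMPLE = """\
components = []
# TODO(step 6): Uncomment here to add Transform to the pipeline.
# components.append(transform)
# TODO(step 6): Uncomment here to add Trainer to the pipeline.
# components.append(trainer)
"""

# Print each marker line with its 1-based line number.
for number, text in find_todo_lines(SAMPLE):
    print(f"{number}: {text}")
```

To scan the real file, pass the contents of your own pipeline.py instead of SAMPLE. Changing the marker argument lets the same helper locate the lines for other steps.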
As you did before, you now need to update the existing pipeline with the modified pipeline definition. The instructions are the same as Step 5. Update the pipeline using tfx pipeline update, and create an execution run using tfx run create.
!tfx pipeline update \
--pipeline-path=kubeflow_runner.py \
--endpoint={ENDPOINT}
!tfx run create --pipeline-name {PIPELINE_NAME} --endpoint={ENDPOINT}
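Because the same two commands recur in several steps, it can help to build them in one place. The sketch below is one way to do that, under stated assumptions: the helper names update_command and run_command are hypothetical, the endpoint value is a placeholder, and only the flags already used in the cell above appear. It constructs the argument lists without executing anything.

```python
import shlex
from typing import List


def update_command(endpoint: str, pipeline_path: str = "kubeflow_runner.py") -> List[str]:
    """Build the argument list for `tfx pipeline update`."""
    if not endpoint:
        # Reject an empty endpoint up front; the CLI needs a cluster to talk to.
        raise ValueError("ENDPOINT is empty; set it to the KFP cluster hostname.")
    return [
        "tfx", "pipeline", "update",
        f"--pipeline-path={pipeline_path}",
        f"--endpoint={endpoint}",
    ]


def run_command(endpoint: str, pipeline_name: str) -> List[str]:
    """Build the argument list for `tfx run create`."""
    if not endpoint:
        raise ValueError("ENDPOINT is empty; set it to the KFP cluster hostname.")
    return [
        "tfx", "run", "create",
        "--pipeline-name", pipeline_name,
        f"--endpoint={endpoint}",
    ]


# Placeholder values for illustration only.
ENDPOINT = "my-kfp-host.example.com"
PIPELINE_NAME = "my_pipeline"

for cmd in (update_command(ENDPOINT), run_command(ENDPOINT, PIPELINE_NAME)):
    # shlex.join renders the list as a copy-pasteable shell command.
    print(shlex.join(cmd))
```

To execute a command from Python rather than from a notebook cell, each list could be passed to subprocess.run(cmd, check=True).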
When this execution run finishes successfully, you will have created and run your first TFX pipeline in AI Platform Pipelines!
Step 7. (Optional) Try BigQueryExampleGen
BigQuery is a serverless, highly scalable, and cost-effective cloud data warehouse. BigQuery can be used as a source for training examples in TFX. In this step, we will add BigQueryExampleGen to the pipeline.
Double-click to open pipeline.py. Comment out CsvExampleGen and uncomment the line which creates an instance of BigQueryExampleGen. You also need to uncomment the query argument of the create_pipeline function.
We need to specify which GCP project to use for BigQuery, and this is done by setting --project in beam_pipeline_args when creating a pipeline.
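As a rough illustration of the paragraph above, the sketch below assembles such an argument list. The --project flag comes from the text; --temp_location is a standard Apache Beam option added here as an assumption, and the helper name, project ID, and bucket are placeholders rather than values from the template.

```python
from typing import List, Optional


def make_beam_pipeline_args(project: str, temp_location: Optional[str] = None) -> List[str]:
    """Build a list of Beam flags that names the GCP project for BigQuery."""
    if not project:
        raise ValueError("A GCP project ID is required for BigQuery.")
    args = [f"--project={project}"]
    if temp_location:
        # Optional: where Beam may stage temporary files (assumed, not from the text).
        args.append(f"--temp_location={temp_location}")
    return args


# Placeholder values for illustration only.
beam_pipeline_args = make_beam_pipeline_args(
    project="my-gcp-project",
    temp_location="gs://my-bucket/tmp",
)
print(beam_pipeline_args)
```

In the notebook, the project argument would typically be the GOOGLE_CLOUD_PROJECT value read in Step 1.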
Double-click to open configs.py. Uncomment the definitions of GOOGLE_CLOUD_REGION, BIG_QUERY_WITH_DIRECT_RUNNER_BEAM_PIPELINE_ARGS and BIG_QUERY_QUERY. You should replace the region value in this file with the correct value for your GCP project.
Change directory one level up. Click the name of the directory above the file list. The name of the directory is the name of the pipeline, which is my_pipeline if you didn't change it.
Double-click to open kubeflow_runner.py. Uncomment two arguments, query and beam_pipeline_args, for the create_pipeline function.
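The wiring described above can be pictured as follows. This is a sketch only: create_pipeline here is a stand-in stub that merely records its arguments, the configs values are placeholders, and the real function in the template takes more parameters than shown.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Optional

# Placeholder stand-in for configs.py; the values are illustrative, not the template's.
configs = SimpleNamespace(
    PIPELINE_NAME="my_pipeline",
    BIG_QUERY_QUERY="SELECT 1 AS placeholder",
    BIG_QUERY_WITH_DIRECT_RUNNER_BEAM_PIPELINE_ARGS=["--project=my-gcp-project"],
)


def create_pipeline(pipeline_name: str,
                    query: Optional[str] = None,
                    beam_pipeline_args: Optional[List[str]] = None) -> Dict[str, Any]:
    """Stub: returns the arguments it received instead of building a TFX pipeline."""
    return {
        "pipeline_name": pipeline_name,
        "query": query,
        "beam_pipeline_args": beam_pipeline_args,
    }


pipeline_spec = create_pipeline(
    pipeline_name=configs.PIPELINE_NAME,
    # The two arguments that this step asks you to uncomment:
    query=configs.BIG_QUERY_QUERY,
    beam_pipeline_args=configs.BIG_QUERY_WITH_DIRECT_RUNNER_BEAM_PIPELINE_ARGS,
)
print(pipeline_spec)
```

Leaving both arguments commented out corresponds to calling the stub with only pipeline_name, in which case query and beam_pipeline_args stay None.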
Now the pipeline is ready to use BigQuery as an example source. Update the pipeline and create a new execution run, as we did in Steps 5 and 6.
!tfx pipeline update \
--pipeline-path=kubeflow_runner.py \
--endpoint={ENDPOINT}
!tfx run create --pipeline-name {PIPELINE_NAME} --endpoint={ENDPOINT}
line 389, in request return self.rest_client.GET(url, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/rest.py", line 230, in GET return self.request("GET", url, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/rest.py", line 208, in request r = self.pool_manager.request(method, url, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/request.py", line 74, in request return self.request_encode_url( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/request.py", line 96, in request_encode_url return self.urlopen(method, url, **extra_kw) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/poolmanager.py", line 376, in urlopen response = conn.urlopen(method, u.request_uri, **kw) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen return self.urlopen( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen return self.urlopen( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen return self.urlopen( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 787, in urlopen retries = retries.increment( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment raise MaxRetryError(_pool, url, error or ResponseError(cause)) urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /apis/v1beta1/healthz (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f325f5c57f0>: Failed to establish a new connection: [Errno 111] Connection refused')) 2023-03-15 09:30:33.362864: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory 
2023-03-15 09:30:33.362950: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory 2023-03-15 09:30:33.362971: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly. CLI Creating a run for pipeline: my_pipeline Detected Kubeflow. Use --engine flag if you intend to use a different orchestrator. Failed to load kube config. Traceback (most recent call last): File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn conn = connection.create_connection( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/util/connection.py", line 95, in create_connection raise err File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection sock.connect(sa) ConnectionRefusedError: [Errno 111] Connection refused During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen httplib_response = self._make_request( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 398, in _make_request conn.request(method, url, **httplib_request_kw) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connection.py", line 244, in request super(HTTPConnection, self).request(method, url, body=body, headers=headers) File "/usr/lib/python3.9/http/client.py", line 1285, in request self._send_request(method, url, body, headers, encode_chunked) File "/usr/lib/python3.9/http/client.py", line 1331, in _send_request self.endheaders(body, 
encode_chunked=encode_chunked) File "/usr/lib/python3.9/http/client.py", line 1280, in endheaders self._send_output(message_body, encode_chunked=encode_chunked) File "/usr/lib/python3.9/http/client.py", line 1040, in _send_output self.send(msg) File "/usr/lib/python3.9/http/client.py", line 980, in send self.connect() File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connection.py", line 205, in connect conn = self._new_conn() File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn raise NewConnectionError( urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fb8b42ba280>: Failed to establish a new connection: [Errno 111] Connection refused During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/tmpfs/src/tf_docs_env/bin/tfx", line 8, in <module> sys.exit(cli_group()) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 829, in __call__ return self.main(*args, **kwargs) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 782, in main rv = self.invoke(ctx) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 610, in invoke return callback(*args, **kwargs) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/decorators.py", line 73, in new_func return ctx.invoke(f, obj, *args, **kwargs) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 610, in invoke return callback(*args, 
**kwargs) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tfx/tools/cli/commands/run.py", line 94, in create_run handler = handler_factory.create_handler(ctx.flags_dict) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tfx/tools/cli/handler/handler_factory.py", line 106, in create_handler return detect_handler(flags_dict) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tfx/tools/cli/handler/handler_factory.py", line 60, in detect_handler return kubeflow_handler.KubeflowHandler(flags_dict) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tfx/tools/cli/handler/kubeflow_handler.py", line 59, in __init__ self._client = kfp.Client( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp/_client.py", line 196, in __init__ if not self._context_setting['namespace'] and self.get_kfp_healthz( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp/_client.py", line 410, in get_kfp_healthz response = self._healthz_api.get_healthz() File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/api/healthz_service_api.py", line 63, in get_healthz return self.get_healthz_with_http_info(**kwargs) # noqa: E501 File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/api/healthz_service_api.py", line 134, in get_healthz_with_http_info return self.api_client.call_api( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 364, in call_api return self.__call_api(resource_path, method, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 181, in __call_api response_data = self.request( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 389, in request return self.rest_client.GET(url, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/rest.py", line 230, in GET return self.request("GET", url, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/rest.py", line 
208, in request r = self.pool_manager.request(method, url, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/request.py", line 74, in request return self.request_encode_url( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/request.py", line 96, in request_encode_url return self.urlopen(method, url, **extra_kw) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/poolmanager.py", line 376, in urlopen response = conn.urlopen(method, u.request_uri, **kw) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen return self.urlopen( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen return self.urlopen( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen return self.urlopen( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 787, in urlopen retries = retries.increment( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment raise MaxRetryError(_pool, url, error or ResponseError(cause)) urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /apis/v1beta1/healthz (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb8b42ba280>: Failed to establish a new connection: [Errno 111] Connection refused'))
Step 8. (Optional) Try Dataflow with KFP
Several TFX components use Apache Beam to implement data-parallel pipelines, which means you can distribute data processing workloads using Google Cloud Dataflow. In this step, we will set the Kubeflow orchestrator to use Dataflow as the data processing back-end for Apache Beam.
Double-click pipeline to change directory, and double-click to open configs.py. Uncomment the definitions of GOOGLE_CLOUD_REGION and DATAFLOW_BEAM_PIPELINE_ARGS.
Change directory one level up. Click the name of the directory above the file list. The name of the directory is the name of the pipeline, which is my_pipeline if you didn't change it.
Double-click to open kubeflow_runner.py. Uncomment beam_pipeline_args. (Also make sure to comment out the current beam_pipeline_args that you added in Step 7.)
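As a rough illustration of what the uncommented DATAFLOW_BEAM_PIPELINE_ARGS setting carries, the sketch below assembles a list of Apache Beam pipeline flags that select the Dataflow runner. The helper name build_dataflow_beam_args and the placeholder project, region, and bucket values are hypothetical and used only for illustration; the definition in your generated configs.py is authoritative and may include additional flags.

```python
def build_dataflow_beam_args(project, region, bucket):
    """Returns Apache Beam flags that send Beam jobs to Cloud Dataflow.

    Hypothetical helper for illustration; the template defines the list
    directly in configs.py.
    """
    return [
        "--project=" + project,
        # DataflowRunner makes Beam submit the job to Cloud Dataflow.
        "--runner=DataflowRunner",
        # Dataflow needs a Cloud Storage location for temporary files.
        "--temp_location=gs://{}/tmp".format(bucket),
        "--region=" + region,
    ]


# Placeholder values, assumed for this sketch only.
DATAFLOW_BEAM_PIPELINE_ARGS = build_dataflow_beam_args(
    project="my-gcp-project", region="us-central1", bucket="my-bucket")

for flag in DATAFLOW_BEAM_PIPELINE_ARGS:
    print(flag)
```

The region must be one where Dataflow is available, which is why GOOGLE_CLOUD_REGION has to be uncommented together with the Beam arguments.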
Now the pipeline is ready to use Dataflow. Update the pipeline and create an execution run as we did in steps 5 and 6.
!tfx pipeline update \
--pipeline-path=kubeflow_runner.py \
--endpoint={ENDPOINT}
!tfx run create --pipeline-name {PIPELINE_NAME} --endpoint={ENDPOINT}
You can find your Dataflow jobs under Dataflow in the Cloud Console.
Step 9. (Optional) Try Cloud AI Platform Training and Prediction with KFP
TFX interoperates with several managed GCP services, such as Cloud AI Platform for Training and Prediction. You can set your Trainer component to use Cloud AI Platform Training, a managed service for training ML models. Moreover, when your model is built and ready to be served, you can push your model to Cloud AI Platform Prediction for serving. In this step, we will set our Trainer and Pusher components to use Cloud AI Platform services.
Before editing files, you might first have to enable the AI Platform Training & Prediction API.
Double-click pipeline to change directory, and double-click to open configs.py. Uncomment the definitions of GOOGLE_CLOUD_REGION, GCP_AI_PLATFORM_TRAINING_ARGS and GCP_AI_PLATFORM_SERVING_ARGS. We will use our custom-built container image to train a model in Cloud AI Platform Training, so we should set masterConfig.imageUri in GCP_AI_PLATFORM_TRAINING_ARGS to the same value as CUSTOM_TFX_IMAGE above.
Change directory one level up, and double-click to open kubeflow_runner.py. Uncomment ai_platform_training_args and ai_platform_serving_args.
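To make the masterConfig.imageUri setting more concrete, the sketch below shows one plausible shape for the two dictionaries. Only masterConfig.imageUri is named in the text above; the other keys (project, region, model_name, project_id, regions), the helper name build_ai_platform_args, and all placeholder values are assumptions for illustration, so treat the definitions in your generated configs.py as authoritative.

```python
def build_ai_platform_args(project, region, image_uri, pipeline_name):
    """Builds illustrative training and serving settings.

    Hypothetical helper; key names other than masterConfig.imageUri are
    assumptions and should be checked against configs.py.
    """
    training_args = {
        "project": project,
        "region": region,
        # Train with the same custom container image the pipeline uses.
        "masterConfig": {"imageUri": image_uri},
    }
    serving_args = {
        # Assumption: hyphens are replaced because model names may not
        # accept them.
        "model_name": pipeline_name.replace("-", "_"),
        "project_id": project,
        "regions": [region],
    }
    return training_args, serving_args


# Placeholder values, assumed for this sketch only.
CUSTOM_TFX_IMAGE = "gcr.io/my-gcp-project/tfx-pipeline"
GCP_AI_PLATFORM_TRAINING_ARGS, GCP_AI_PLATFORM_SERVING_ARGS = (
    build_ai_platform_args(
        project="my-gcp-project",
        region="us-central1",
        image_uri=CUSTOM_TFX_IMAGE,
        pipeline_name="my-pipeline",
    )
)

print(GCP_AI_PLATFORM_TRAINING_ARGS["masterConfig"]["imageUri"])
print(GCP_AI_PLATFORM_SERVING_ARGS["model_name"])
```

Reusing CUSTOM_TFX_IMAGE for training keeps the code that runs on Cloud AI Platform Training identical to the code that runs in the rest of the pipeline.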
Update the pipeline and create an execution run as we did in steps 5 and 6.
!tfx pipeline update \
--pipeline-path=kubeflow_runner.py \
--endpoint={ENDPOINT}
!tfx run create --pipeline-name {PIPELINE_NAME} --endpoint={ENDPOINT}
line 389, in request return self.rest_client.GET(url, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/rest.py", line 230, in GET return self.request("GET", url, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/rest.py", line 208, in request r = self.pool_manager.request(method, url, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/request.py", line 74, in request return self.request_encode_url( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/request.py", line 96, in request_encode_url return self.urlopen(method, url, **extra_kw) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/poolmanager.py", line 376, in urlopen response = conn.urlopen(method, u.request_uri, **kw) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen return self.urlopen( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen return self.urlopen( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen return self.urlopen( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 787, in urlopen retries = retries.increment( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment raise MaxRetryError(_pool, url, error or ResponseError(cause)) urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /apis/v1beta1/healthz (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5a098f4250>: Failed to establish a new connection: [Errno 111] Connection refused')) 2023-03-15 09:30:52.894719: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory 
2023-03-15 09:30:52.894810: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory 2023-03-15 09:30:52.894821: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly. CLI Creating a run for pipeline: my_pipeline Detected Kubeflow. Use --engine flag if you intend to use a different orchestrator. Failed to load kube config. Traceback (most recent call last): File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn conn = connection.create_connection( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/util/connection.py", line 95, in create_connection raise err File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection sock.connect(sa) ConnectionRefusedError: [Errno 111] Connection refused During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen httplib_response = self._make_request( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 398, in _make_request conn.request(method, url, **httplib_request_kw) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connection.py", line 244, in request super(HTTPConnection, self).request(method, url, body=body, headers=headers) File "/usr/lib/python3.9/http/client.py", line 1285, in request self._send_request(method, url, body, headers, encode_chunked) File "/usr/lib/python3.9/http/client.py", line 1331, in _send_request self.endheaders(body, 
encode_chunked=encode_chunked) File "/usr/lib/python3.9/http/client.py", line 1280, in endheaders self._send_output(message_body, encode_chunked=encode_chunked) File "/usr/lib/python3.9/http/client.py", line 1040, in _send_output self.send(msg) File "/usr/lib/python3.9/http/client.py", line 980, in send self.connect() File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connection.py", line 205, in connect conn = self._new_conn() File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn raise NewConnectionError( urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f6f90781190>: Failed to establish a new connection: [Errno 111] Connection refused During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/tmpfs/src/tf_docs_env/bin/tfx", line 8, in <module> sys.exit(cli_group()) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 829, in __call__ return self.main(*args, **kwargs) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 782, in main rv = self.invoke(ctx) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 610, in invoke return callback(*args, **kwargs) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/decorators.py", line 73, in new_func return ctx.invoke(f, obj, *args, **kwargs) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/click/core.py", line 610, in invoke return callback(*args, 
**kwargs) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tfx/tools/cli/commands/run.py", line 94, in create_run handler = handler_factory.create_handler(ctx.flags_dict) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tfx/tools/cli/handler/handler_factory.py", line 106, in create_handler return detect_handler(flags_dict) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tfx/tools/cli/handler/handler_factory.py", line 60, in detect_handler return kubeflow_handler.KubeflowHandler(flags_dict) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tfx/tools/cli/handler/kubeflow_handler.py", line 59, in __init__ self._client = kfp.Client( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp/_client.py", line 196, in __init__ if not self._context_setting['namespace'] and self.get_kfp_healthz( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp/_client.py", line 410, in get_kfp_healthz response = self._healthz_api.get_healthz() File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/api/healthz_service_api.py", line 63, in get_healthz return self.get_healthz_with_http_info(**kwargs) # noqa: E501 File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/api/healthz_service_api.py", line 134, in get_healthz_with_http_info return self.api_client.call_api( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 364, in call_api return self.__call_api(resource_path, method, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 181, in __call_api response_data = self.request( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 389, in request return self.rest_client.GET(url, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/rest.py", line 230, in GET return self.request("GET", url, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/kfp_server_api/rest.py", line 
208, in request r = self.pool_manager.request(method, url, File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/request.py", line 74, in request return self.request_encode_url( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/request.py", line 96, in request_encode_url return self.urlopen(method, url, **extra_kw) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/poolmanager.py", line 376, in urlopen response = conn.urlopen(method, u.request_uri, **kw) File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen return self.urlopen( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen return self.urlopen( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen return self.urlopen( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/connectionpool.py", line 787, in urlopen retries = retries.increment( File "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment raise MaxRetryError(_pool, url, error or ResponseError(cause)) urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /apis/v1beta1/healthz (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f6f90781190>: Failed to establish a new connection: [Errno 111] Connection refused'))
You can find your training jobs in Cloud AI Platform Jobs. If your pipeline completed successfully, you can find your model in Cloud AI Platform Models.
Step 10. Ingest YOUR data to the pipeline
We made a pipeline for a model using the Chicago Taxi dataset. Now it's time to put your data into the pipeline.
Your data can be stored anywhere your pipeline can access, including Google Cloud Storage (GCS) or BigQuery. You will need to modify the pipeline definition to access your data.
- If your data is stored in files, modify DATA_PATH in kubeflow_runner.py or local_runner.py and set it to the location of your files. If your data is stored in BigQuery, modify BIG_QUERY_QUERY in pipeline/configs.py to correctly query for your data.
- Add features in models/features.py.
- Modify models/preprocessing.py to transform input data for training.
- Modify models/keras/model.py and models/keras/constants.py to describe your ML model.
  - You can use an estimator-based model, too. Change the RUN_FN constant to models.estimator.model.run_fn in pipeline/configs.py.
See the Trainer component guide for more details.
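As a rough illustration of the configuration changes listed in this step, the sketch below shows what the edited constants might look like. The bucket, table, and column names are hypothetical placeholders, not values from the template; only the constant names DATA_PATH, BIG_QUERY_QUERY, and RUN_FN and the models.estimator.model.run_fn path come from the steps above. In the real template these constants live in different files, as noted in the comments.

```python
# Sketch only: every name marked "hypothetical" is a placeholder to replace.

# Hypothetical GCS bucket that your pipeline can read from.
GCS_BUCKET_NAME = "my-project-kubeflowpipelines-default"

# File-based input (kubeflow_runner.py or local_runner.py):
# point DATA_PATH at the directory that holds your data files.
DATA_PATH = "gs://{}/my-dataset/data/".format(GCS_BUCKET_NAME)

# BigQuery input (pipeline/configs.py):
# select the columns you need from your own table.
_MY_TABLE = "my-project.my_dataset.my_table"       # hypothetical table
_MY_COLUMNS = ["feature_a", "feature_b", "label"]  # hypothetical columns

BIG_QUERY_QUERY = """
    SELECT
      {columns}
    FROM `{table}`""".format(
    columns=",\n      ".join(_MY_COLUMNS), table=_MY_TABLE)

# Estimator-based model (pipeline/configs.py):
# switch the training entry point to the estimator run_fn.
RUN_FN = "models.estimator.model.run_fn"

print(DATA_PATH)
print(BIG_QUERY_QUERY)
print(RUN_FN)
```

The columns selected by the query should match the feature keys you declare in models/features.py, so that preprocessing and training see the fields they expect.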
Cleaning up
To clean up all Google Cloud resources used in this project, you can delete the Google Cloud project you used for the tutorial.
Alternatively, you can clean up individual resources by visiting each console: