Tuner executor that launches a parallel tuning flock on Cloud AI Platform.
tfx.extensions.google_cloud_ai_platform.tuner.executor.Executor(
    context: Optional[tfx.dsl.components.base.base_executor.BaseExecutor.Context] = None
)
This executor starts a Cloud AI Platform (CAIP) Training job with a flock of workers, where each worker independently executes the Tuner's search loop on a single machine.
Per KerasTuner's design, a distributed tuner's identity is controlled by an environment variable (KERASTUNER_TUNER_ID) set for each worker in the CAIP training job. These environment variables are configured on each worker of the CAIP training job's worker flock.
In addition, some KerasTuner implementations require a separate process that centrally manages the state of tuning (called the 'chief oracle'), which all workers consult via another set of environment variables (KERASTUNER_ORACLE_IP and KERASTUNER_ORACLE_PORT).
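From a worker's point of view, these environment variables fully determine its role in the flock. The following sketch shows how a worker process could read them; `resolve_tuner_identity` is a hypothetical helper for illustration (the real executor sets these variables on each CAIP worker rather than exposing such a function), and the 'chief' identity value is an assumption based on KerasTuner's convention.

```python
import os


def resolve_tuner_identity():
    """Reads the KerasTuner distribution environment variables.

    Hypothetical helper for illustration only: returns the tuner ID, the
    chief oracle's address, and whether this process is the chief oracle.
    """
    tuner_id = os.environ.get('KERASTUNER_TUNER_ID')        # e.g. 'chief' or 'tuner0'
    oracle_ip = os.environ.get('KERASTUNER_ORACLE_IP')      # address of the chief oracle
    oracle_port = os.environ.get('KERASTUNER_ORACLE_PORT')  # port the chief oracle listens on
    is_chief = tuner_id == 'chief'
    return tuner_id, oracle_ip, oracle_port, is_chief
```

Each worker consults the same three variables, so giving every worker a distinct KERASTUNER_TUNER_ID while pointing them all at one oracle address is what makes the flock cooperate on a single search.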
In summary, the distributed tuning flock launched by a Cloud AI Platform job is structured as follows:
Executor.Do() -> launch _Executor.Do() on a possibly multi-worker CAIP job
  -+> master -> _search() (-> create a subprocess -> run the chief oracle.)
   |                       +> trigger a single tuner.search()
   +> worker -> _search() -> trigger a single tuner.search()
   +> worker -> _search() -> trigger a single tuner.search()
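The asymmetry in the diagram is that only the master additionally hosts the chief oracle subprocess; every worker, master included, then runs a single tuner.search(). A minimal sketch of that per-worker step assignment (illustrative only, not the executor's actual code; `flock_plan` is a hypothetical name):

```python
def flock_plan(num_workers):
    """Returns the ordered steps each worker in the flock performs.

    Worker 0 plays the master role: it first starts the chief oracle in a
    subprocess, then enters the same search loop as every other worker.
    """
    plan = {}
    for i in range(num_workers):
        steps = []
        if i == 0:
            # Master only: spawn the process that centrally manages tuning state.
            steps.append('start chief oracle subprocess')
        # Every worker (master included) runs exactly one search loop.
        steps.append('tuner.search()')
        plan['worker%d' % i] = steps
    return plan
```

This mirrors the diagram: one chief oracle for the whole flock, one search loop per worker.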
Do(
    input_dict: Dict[Text, List[types.Artifact]],
    output_dict: Dict[Text, List[types.Artifact]],
    exec_properties: Dict[Text, Any]
) -> None
Starts a Tuner component as a job on Google Cloud AI Platform.
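One plausible way to wire this executor into a pipeline is via the Tuner component's custom executor spec, as sketched below. This is a hedged sketch, not a definitive recipe: the import paths, the `custom_config` key name, and the project/region values are assumptions to verify against your installed TFX release, and newer releases may instead provide the extension's own Tuner component.

```python
from tfx.components import Tuner
from tfx.dsl.components.base import executor_spec
from tfx.extensions.google_cloud_ai_platform.tuner import executor as ai_platform_tuner_executor
from tfx.proto import trainer_pb2

tuner = Tuner(
    # 'path/to/tuner_module.py' is a placeholder for a user module defining tuner_fn.
    module_file='path/to/tuner_module.py',
    examples=transform.outputs['transformed_examples'],
    transform_graph=transform.outputs['transform_graph'],
    train_args=trainer_pb2.TrainArgs(num_steps=20),
    eval_args=trainer_pb2.EvalArgs(num_steps=5),
    # Swap in the CAIP executor so Do() launches the tuning flock as a CAIP job.
    custom_executor_spec=executor_spec.ExecutorClassSpec(
        ai_platform_tuner_executor.Executor),
    custom_config={
        # The key name below is an assumption; check the extension's constants
        # for the exact key in your TFX version.
        'ai_platform_tuning_args': {
            'project': 'my-gcp-project',   # placeholder GCP project
            'region': 'us-central1',       # placeholder CAIP region
        },
    })
```

With this wiring, Do() receives the component's artifacts and exec properties and submits the multi-worker training job described above instead of tuning locally.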