The Vertex AI Custom Job job type submits a containerised training
job to Google Vertex AI via
POST projects/{project}/locations/{location}/customJobs. The
template Custom Job (used to seed compute spec) is identified by its
resource name — Polysync stores
projects/{project}/locations/{location}/customJobs/{jobId} in the
Job's External Id.
This job type is supported on the Google Vertex AI platform.
Job type: Vertex AI Custom Job (set automatically on import).

Compute & container spec (stored as job attributes from the template):
Machine Type, Container Image URI, Replica Count, optional
accelerator settings.
On import, Polysync lists existing Custom Jobs via
GET projects/{project}/locations/{location}/customJobs?pageSize=100
(paginated via pageToken). Each Custom Job's compute spec
(machineType, containerUri, replicaCount, accelerators) is
captured as job attributes for re-use.
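
The listing step above can be sketched as a token-following loop. This is a minimal illustration, not Polysync's actual importer: `fetch_page` is a hypothetical callable standing in for the authenticated GET, and reading only the first worker pool is an assumption. The `customJobs` and `nextPageToken` field names follow the Vertex AI list response.

```python
from typing import Callable, Iterator, Optional


def iter_custom_jobs(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every Custom Job, following nextPageToken until it is absent.

    fetch_page(token) is a hypothetical helper that performs the GET with
    pageSize=100 and the given pageToken (None for the first page).
    """
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get("customJobs", [])
        token = page.get("nextPageToken")
        if not token:
            break


def compute_spec(job: dict) -> dict:
    """Extract the compute attributes from a Custom Job's first worker pool.

    Assumption: only the first entry of workerPoolSpecs is used as the template.
    """
    pools = job.get("jobSpec", {}).get("workerPoolSpecs", [])
    pool = pools[0] if pools else {}
    machine = pool.get("machineSpec", {})
    container = pool.get("containerSpec", {})
    return {
        "machineType": machine.get("machineType"),
        "acceleratorType": machine.get("acceleratorType"),
        "acceleratorCount": machine.get("acceleratorCount"),
        "containerUri": container.get("imageUri"),
        # The REST API may return replicaCount as a string; it is kept as-is here.
        "replicaCount": pool.get("replicaCount"),
    }
```

Passing the page fetcher in as a callable keeps the pagination logic testable without network access or credentials.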
Input + Input&Output parameters are sent as container environment
variables on the worker pool's containerSpec:

    {
      "displayName": "<polysync-name>_<timestamp>",
      "jobSpec": {
        "workerPoolSpecs": [
          {
            "replicaCount": <int>,
            "machineSpec": {
              "machineType": "<from-attribute>",
              "acceleratorType": "<optional>",
              "acceleratorCount": <optional>
            },
            "containerSpec": {
              "imageUri": "<from-attribute>",
              "env": [
                { "name": "<param>", "value": "<typed-value-as-string>" }
              ]
            }
          }
        ]
      }
    }

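
As a rough illustration of how such a body could be assembled from the stored attributes and the parameter values, here is a minimal sketch. The function names, the attribute keys, the timestamp format, and the lowercase `true`/`false` rendering of booleans are all assumptions made for the example; the source does not specify them.

```python
import json
from datetime import datetime, timezone
from typing import Any, Mapping, Optional


def env_value(value: Any) -> str:
    """Render a typed parameter value as a string, since env vars carry only strings."""
    if isinstance(value, bool):  # checked first: bool is a subclass of int
        return "true" if value else "false"  # assumed encoding
    if isinstance(value, (dict, list)):
        return json.dumps(value)
    return str(value)


def build_custom_job_body(
    name: str,
    attrs: Mapping[str, Any],
    params: Mapping[str, Any],
    now: Optional[datetime] = None,
) -> dict:
    """Build a request body shaped like the template above (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    machine_spec = {"machineType": attrs["machineType"]}
    # Accelerator fields are optional and only included when a type is set.
    if attrs.get("acceleratorType"):
        machine_spec["acceleratorType"] = attrs["acceleratorType"]
        machine_spec["acceleratorCount"] = int(attrs.get("acceleratorCount", 1))
    return {
        # Timestamp format is an assumption; the source only says "<timestamp>".
        "displayName": f"{name}_{now.strftime('%Y%m%d%H%M%S')}",
        "jobSpec": {
            "workerPoolSpecs": [
                {
                    "replicaCount": int(attrs.get("replicaCount", 1)),
                    "machineSpec": machine_spec,
                    "containerSpec": {
                        "imageUri": attrs["containerUri"],
                        "env": [
                            {"name": key, "value": env_value(val)}
                            for key, val in params.items()
                        ],
                    },
                }
            ]
        },
    }
```

Only Input and Input&Output parameters would be passed in `params`; Output parameters are left out because they are never sent.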
| Direction | Sent as container env | Updated from response |
|---|---|---|
| Input | ✅ | ❌ |
| Output | ❌ | ❌ (not supported) |
| Input&Output | ✅ | ❌ (not supported) |
Output parameters are not supported. Persist training outputs to GCS or the Vertex AI Model Registry.
Polysync POSTs the body above; the new job's full resource name becomes the Polysync RunId.
Status is polled via
GET projects/{project}/locations/{location}/customJobs/{jobId}
and decoded from state:
| Vertex AI state | Polysync status |
|---|---|
| JOB_STATE_SUCCEEDED | Success |
| JOB_STATE_FAILED / JOB_STATE_EXPIRED | Failed |
| JOB_STATE_RUNNING / JOB_STATE_QUEUED / JOB_STATE_PENDING / JOB_STATE_PAUSED | Running |
| JOB_STATE_CANCELLING | Running |
| JOB_STATE_CANCELLED | Cancelled |
| (other) | Unknown |
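
The mapping in the table can be expressed as a simple lookup with a fallback. This is a sketch of the decoding rule only; the names `STATE_TO_STATUS`, `decode_status`, and `is_finished` are hypothetical, and treating Success, Failed, and Cancelled as the states at which polling stops is an assumption.

```python
# Vertex AI job state -> Polysync status, as listed in the table.
STATE_TO_STATUS = {
    "JOB_STATE_SUCCEEDED": "Success",
    "JOB_STATE_FAILED": "Failed",
    "JOB_STATE_EXPIRED": "Failed",
    "JOB_STATE_RUNNING": "Running",
    "JOB_STATE_QUEUED": "Running",
    "JOB_STATE_PENDING": "Running",
    "JOB_STATE_PAUSED": "Running",
    "JOB_STATE_CANCELLING": "Running",
    "JOB_STATE_CANCELLED": "Cancelled",
}


def decode_status(job: dict) -> str:
    """Map the `state` field of a Custom Job response to a Polysync status."""
    return STATE_TO_STATUS.get(job.get("state"), "Unknown")


def is_finished(status: str) -> bool:
    """Assumed stop condition for a polling loop: the job reached an end state."""
    return status in {"Success", "Failed", "Cancelled"}
```

Note that JOB_STATE_CANCELLING still maps to Running, so a poller built on this rule keeps going until the job actually reports JOB_STATE_CANCELLED.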
Cancel is supported via POST {resourceName}:cancel.
Console link: https://console.cloud.google.com/vertex-ai/locations/{location}/training/{jobId}?project={projectId}
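
Both the cancel endpoint and the console URL can be derived from the stored resource name. The helpers below are a hypothetical sketch: they assume the resource name has exactly the projects/{project}/locations/{location}/customJobs/{jobId} shape, and that the project segment can be used as {projectId} unless a separate ID is supplied.

```python
import re
from typing import Optional

_RESOURCE_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/customJobs/(?P<job_id>[^/]+)$"
)


def cancel_path(resource_name: str) -> str:
    """Relative path for the cancel call: POST {resourceName}:cancel."""
    return f"{resource_name}:cancel"


def console_url(resource_name: str, project_id: Optional[str] = None) -> str:
    """Build the console link for a Custom Job from its resource name.

    Assumption: when project_id is omitted, the project segment of the
    resource name is used for the ?project= query parameter.
    """
    match = _RESOURCE_NAME.match(resource_name)
    if match is None:
        raise ValueError(f"not a Custom Job resource name: {resource_name!r}")
    project = project_id or match["project"]
    return (
        "https://console.cloud.google.com/vertex-ai/locations/"
        f"{match['location']}/training/{match['job_id']}?project={project}"
    )
```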
- Read parameters inside the container from os.environ and type-cast as needed (env vars are always strings).
- Set scheduling.timeout on the template to bound runtime.
- PERMISSION_DENIED on submit: the runtime identity lacks Vertex AI User, or the worker SA isn't permitted to pull the container image.
- JOB_STATE_QUEUED for a long time: accelerator quota in the region is exhausted; switch region or request quota.
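
For the note on reading parameters from os.environ, the following sketch shows one way a training script might cast values back to their intended types. The helper names, the parameter names in the test data, and the accepted boolean spellings are assumptions for illustration only.

```python
import os
from typing import Mapping, Optional

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def env_str(name: str, default: Optional[str] = None,
            environ: Mapping[str, str] = os.environ) -> str:
    """Return the raw string value, or raise if it is missing with no default."""
    value = environ.get(name, default)
    if value is None:
        raise KeyError(f"missing required environment variable {name}")
    return value


def env_int(name: str, default: Optional[int] = None,
            environ: Mapping[str, str] = os.environ) -> int:
    """Read an integer parameter; env vars arrive as strings, so cast explicitly."""
    return int(env_str(name, None if default is None else str(default), environ))


def env_float(name: str, default: Optional[float] = None,
              environ: Mapping[str, str] = os.environ) -> float:
    """Read a floating-point parameter."""
    return float(env_str(name, None if default is None else str(default), environ))


def env_bool(name: str, default: Optional[bool] = None,
             environ: Mapping[str, str] = os.environ) -> bool:
    """Read a boolean parameter, accepting a few common spellings (assumed set)."""
    raw = env_str(name, None if default is None else str(default), environ)
    raw = raw.strip().lower()
    if raw in _TRUE:
        return True
    if raw in _FALSE:
        return False
    raise ValueError(f"{name}={raw!r} is not a recognised boolean")
```

In a training script this might be used as `epochs = env_int("EPOCHS", default=10)`, where `EPOCHS` is a hypothetical parameter name. A plain `bool(os.environ["FLAG"])` would be a bug, because any non-empty string, including "false", is truthy.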